Geneformer / geneformer

Commit History

Update geneformer/tokenizer.py
b8b87fd
verified

hchen725 committed on

Filter gene mapping dict for items that exist in gene_token_dict
1e8d481
verified

hchen725 committed on

Update geneformer/tokenizer.py
5197a60
verified

hchen725 committed on

Upload ensembl_mapping_dict_gc95M.pickle
47341ab
verified

hchen725 committed on

Add init for ensembl mapping dict
43b4290
verified

hchen725 committed on

Add function for summing of Ensembl IDs
fb901a0
verified

hchen725 committed on

add typing list import
42053dc

ctheodoris committed on

move dicts to init
ea428cb

ctheodoris committed on

add random state to umap
eb2a04b

ctheodoris committed on

update get_embs with token_gene_dict arg
ace12e9
verified

ctheodoris committed on

update refs to get_model_emb_dims
3fe35ba
verified

ctheodoris committed on

clone embs_i to resolve memory leak in cls embs
57f02a4

ctheodoris committed on

update perturber stats to reflect cos sim and emb_extractor to suppress warnings for non-cls
25dd1da

ctheodoris committed on

update to account for set of perturbed genes with aggregate_data
eb038a6

ctheodoris committed on

update to enable cls emb
b2bbd7c

ctheodoris committed on

update tokenizer to include eos token
ead0550

Christina Theodoris committed on

fix cell state gene embeddings bug (#345)
c0e7b19
verified

ctheodoris committed on

patch datasets save_to_disk
75c67a1

Christina Theodoris committed on

update kwargs for pretrainer
fb130e6

Christina Theodoris committed on

refer to token dictionary in self
86fe0dd

Christina Theodoris committed on

Update with gene classifier, custom token dict, and str validate options (#329)
0568479
verified

ctheodoris, hchen725 committed on

add option for hyperparameter tuning to cc.validate
4bddd45

Christina Theodoris committed on

correct typo
5a43832
verified

ctheodoris committed on

update examples for predict_eval and handle roc for 2 cell classes
eeba323

Christina Theodoris committed on

Update readthedocs for classifier
f75f5ac

Christina Theodoris committed on

Get the gene keys and gene list keys from the token dictionary instead of medians (#304)
b294421
verified

ctheodoris, hchen725 committed on

Prevent ruff/isort on init
941390d

Christina Theodoris committed on

Add classifier module and examples
9e9cca9

Christina Theodoris committed on

Add option for variable input_size and to add CLS/SEP Tokens (#299)
aa25cd2
verified

ctheodoris, hchen725 committed on

add load model for train and fix validate anchor gene error
0d675a3

Christina Theodoris committed on

Handle case of single gene del for isp modeling of gene embs
316d817

Christina Theodoris committed on

edit docstring format to highlight options
e3330a6

Christina Theodoris committed on

edit docstring codeblock highlighting
d1931b1

Christina Theodoris committed on

update type of null_dict_list in docstring
79788b6

Christina Theodoris committed on

change doc formatting
17f036a

Christina Theodoris committed on

add sphinx docs
2a0dcbe

Christina Theodoris committed on

update dependencies, reinstate compatibility with python<3.9 with typing for List
10d3f10

Christina Theodoris committed on

Add option for modified batch size for loom tokenizer
0960cf6

Christina Theodoris committed on

Add functions for extracting gene embeddings, move state_embs_dict outside isp, fix bugs in isp
2f25aea

Christina Theodoris committed on

Add option for modifying chunk size for anndata tokenizer
fd93ebf

Christina Theodoris committed on