epitome.models.PeakModel
class epitome.models.PeakModel(dataset, test_celltypes=[], single_cell=False, debug=False, batch_size=64, shuffle_size=10, prefetch_size=10, l1=0.0, l2=0.0, lr=0.001, radii=[1, 3, 10, 30], checkpoint=None, max_valid_batches=None)
Model for learning from ChIP-seq peaks.
__init__(dataset, test_celltypes=[], single_cell=False, debug=False, batch_size=64, shuffle_size=10, prefetch_size=10, l1=0.0, l2=0.0, lr=0.001, radii=[1, 3, 10, 30], checkpoint=None, max_valid_batches=None)
Initializes the Peak Model.
Parameters
dataset (EpitomeDataset) – the EpitomeDataset to learn from
test_celltypes (list) – list of cell types to hold out for testing. Each should be in cellmap (default is [])
single_cell (bool) – whether the model will predict using scATAC-seq posteriors (default is False)
debug (bool) – if True, prints intermediate validation values (default is False)
batch_size (int) – batch size (default is 64)
shuffle_size (int) – data shuffle size (default is 10)
prefetch_size (int) – data prefetch size (default is 10)
l1 (float) – l1 regularization (default is 0)
l2 (float) – l2 regularization (default is 0)
lr (float) – learning rate (default is 1e-3)
radii (list) – radii of DNase-seq to consider around a peak of interest (default is [1, 3, 10, 30])
checkpoint (str) – path to load model from.
max_valid_batches (int) – size of the train-validation dataset (default is None, in which case no train-validation dataset is created and training does not stop early)
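The parameter list above can be summarized as a set of keyword defaults. The following is a minimal sketch, not part of the epitome API: `PEAK_MODEL_DEFAULTS`, `peak_model_kwargs`, and `build_peak_model` are hypothetical names introduced here for illustration, and the default values are copied from the documented signature. It assumes an `EpitomeDataset` has already been constructed elsewhere.

```python
import copy

# Defaults copied from the PeakModel signature documented above.
PEAK_MODEL_DEFAULTS = {
    "test_celltypes": [],
    "single_cell": False,
    "debug": False,
    "batch_size": 64,
    "shuffle_size": 10,
    "prefetch_size": 10,
    "l1": 0.0,
    "l2": 0.0,
    "lr": 0.001,
    "radii": [1, 3, 10, 30],
    "checkpoint": None,
    "max_valid_batches": None,
}


def peak_model_kwargs(**overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(PEAK_MODEL_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown PeakModel arguments: {sorted(unknown)}")
    # Deep copy so that list defaults are never shared between calls.
    kwargs = copy.deepcopy(PEAK_MODEL_DEFAULTS)
    kwargs.update(overrides)
    return kwargs


def build_peak_model(dataset, **overrides):
    """Hypothetical helper: construct a PeakModel from an existing EpitomeDataset."""
    # Imported lazily so this sketch can be loaded without epitome installed.
    from epitome.models import PeakModel
    return PeakModel(dataset, **peak_model_kwargs(**overrides))
```

A call might look like `build_peak_model(dataset, test_celltypes=["K562"], max_valid_batches=1000)`, where `"K562"` is only an illustrative cell type name and must be present in cellmap for the dataset in use.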
Methods
__init__(dataset[, test_celltypes, …]) Initializes the Peak Model.
body_fn()
eval_vector(matrix, indices) Evaluates a new cell type based on its chromatin (DNase or ATAC-seq) vector, as well as any other similarity targets (acetylation, methylation, etc.).
g(p[, a, B, y]) Normalization function.
loss_fn(y_true, y_pred, weights) Loss function for Epitome.
run_predictions(num_samples, iter_[, …]) Runs predictions on num_samples records.
save(checkpoint_path) Saves the model.
score_matrix(accessilibility_peak_matrix, …) Runs predictions on a matrix of accessibility peaks, where columns are samples and rows are regions from regions_peak_file.
score_peak_file(similarity_peak_files, …) Runs predictions on a set of peaks defined in a bed or narrowPeak file.
score_whole_genome(similarity_peak_files, …) Runs a whole-genome scan over all available genomic regions in the dataset (about 3.2 million regions). Takes about 1 hour on the entire genome.
test(num_samples[, mode, calculate_metrics]) Tests the model on the validation and test dataset handlers.
test_from_generator(num_samples, ds[, …]) Runs a test given a specified data generator.
train(max_train_batches[, patience, min_delta]) Trains an Epitome model.
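The methods listed above suggest a typical train, test, save sequence. The following is a minimal sketch of that sequence, using only the required arguments shown in the method summaries. `train_test_save` is a hypothetical helper name, and the structure of the value returned by `test` is not described in this section, so it is passed through unchanged.

```python
def train_test_save(model, max_train_batches, num_samples, checkpoint_path):
    """Hypothetical helper: train, evaluate, then save a PeakModel-like object.

    `model` is assumed to expose train(), test(), and save() with the
    signatures listed in the method summary above.
    """
    # Optional arguments per the summary: patience, min_delta.
    model.train(max_train_batches)
    # Optional arguments per the summary: mode, calculate_metrics.
    results = model.test(num_samples)
    # Write the trained model to checkpoint_path.
    model.save(checkpoint_path)
    return results
```

Saving after testing is a design choice in this sketch rather than a requirement; the order of `test` and `save` can be swapped if the checkpoint should be written before evaluation.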
Attributes
predict_step_generator
predict_step_matrix