epitome.models.PeakModel

class epitome.models.PeakModel(dataset, test_celltypes=[], single_cell=False, debug=False, batch_size=64, shuffle_size=10, prefetch_size=10, l1=0.0, l2=0.0, lr=0.001, radii=[1, 3, 10, 30], checkpoint=None, max_valid_batches=None)

Model for learning from ChIP-seq peaks.

__init__(dataset, test_celltypes=[], single_cell=False, debug=False, batch_size=64, shuffle_size=10, prefetch_size=10, l1=0.0, l2=0.0, lr=0.001, radii=[1, 3, 10, 30], checkpoint=None, max_valid_batches=None)

Initializes Peak Model

Parameters

dataset (EpitomeDataset) – the EpitomeDataset the model learns from
test_celltypes (list) – list of cell types to hold out for testing. Each should be in the cellmap.
single_cell (bool) – whether you are building a model to predict using scATAC-seq posteriors. Defaults to False.
debug (bool) – if True, prints out intermediate validation values. Defaults to False.
batch_size (int) – batch size (default is 64)
shuffle_size (int) – data shuffle size (default is 10)
prefetch_size (int) – data prefetch size (default is 10)
l1 (float) – l1 regularization (default is 0)
l2 (float) – l2 regularization (default is 0)
lr (float) – learning rate (default is 1e-3)
radii (list) – radii of DNase-seq to consider around a peak of interest (default is [1, 3, 10, 30])
checkpoint (str) – path to load model from.
max_valid_batches (int) – the size of the train-validation dataset (default is None, meaning that the model neither creates a train-validation dataset nor stops early while training)
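
As a rough illustration of how the parameters above fit together, the sketch below collects the documented defaults into a dict, applies overrides, and passes the result to a constructor. The names `PEAK_MODEL_DEFAULTS`, `peak_model_kwargs`, and `build_model` are hypothetical and not part of epitome. `model_cls` is assumed to be PeakModel or a subclass that shares this constructor signature, and `dataset` is assumed to be an already-constructed EpitomeDataset.

```python
import copy

# Defaults copied from the PeakModel signature documented above.
PEAK_MODEL_DEFAULTS = {
    "test_celltypes": [],
    "single_cell": False,
    "debug": False,
    "batch_size": 64,
    "shuffle_size": 10,
    "prefetch_size": 10,
    "l1": 0.0,
    "l2": 0.0,
    "lr": 0.001,
    "radii": [1, 3, 10, 30],
    "checkpoint": None,
    "max_valid_batches": None,
}


def peak_model_kwargs(**overrides):
    """Hypothetical helper: documented defaults with overrides applied."""
    unknown = sorted(set(overrides) - set(PEAK_MODEL_DEFAULTS))
    if unknown:
        raise TypeError(f"unknown PeakModel parameter(s): {unknown}")
    # Deep-copy so the list defaults (test_celltypes, radii) are never shared.
    kwargs = copy.deepcopy(PEAK_MODEL_DEFAULTS)
    kwargs.update(overrides)
    return kwargs


def build_model(model_cls, dataset, **overrides):
    """Hypothetical helper: construct model_cls(dataset, ...) from the defaults.

    model_cls is assumed to accept the constructor arguments documented above.
    """
    return model_cls(dataset, **peak_model_kwargs(**overrides))
```

For example, `build_model(PeakModel, dataset, test_celltypes=["K562"], max_valid_batches=100)` would hold out one cell type and request a train-validation dataset, assuming `"K562"` (a placeholder here) is present in the dataset's cellmap.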

Methods

`__init__`(dataset[, test_celltypes, …])	Initializes Peak Model
`body_fn`()
`eval_vector`(matrix, indices)	Evaluates a new cell type based on its chromatin (DNase or ATAC-seq) vector, as well as any other similarity targets (acetylation, methylation, etc.).
`g`(p[, a, B, y])	Normalization Function.
`loss_fn`(y_true, y_pred, weights)	Loss function for Epitome.
`run_predictions`(num_samples, iter_[, …])	Runs predictions on num_samples records
`save`(checkpoint_path)	Saves model.
`score_matrix`(accessilibility_peak_matrix, …)	Runs predictions on a matrix of accessibility peaks, where columns are samples and rows are regions from regions_peak_file.
`score_peak_file`(similarity_peak_files, …)	Runs predictions on a set of peaks defined in a bed or narrowPeak file.
`score_whole_genome`(similarity_peak_files, …)	Runs a whole genome scan for all available genomic regions in the dataset (about 3.2 million regions). Takes about 1 hour on the entire genome.
`test`(num_samples[, mode, calculate_metrics])	Tests model on valid and test dataset handlers.
`test_from_generator`(num_samples, ds[, …])	Runs test given a specified data generator.
`train`(max_train_batches[, patience, min_delta])	Trains an Epitome model.
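
The methods in the table suggest a train, test, save sequence. The sketch below strings those three calls together using only the required positional arguments shown in the table. The function name `train_test_save` is hypothetical, and `model` is assumed to be an already-constructed PeakModel or a compatible object. Because this page does not document the return values of `train` and `test`, the sketch passes them through untouched.

```python
def train_test_save(model, max_train_batches, num_test_samples, checkpoint_path):
    """Hypothetical helper: train a model, evaluate it, then save it.

    Only the required positional arguments from the methods table are used;
    the optional ones (patience, min_delta, mode, calculate_metrics) are
    left at whatever defaults the model defines.
    """
    train_result = model.train(max_train_batches)
    test_result = model.test(num_test_samples)
    model.save(checkpoint_path)
    # Return values are passed through as-is; their structure is not
    # documented on this page.
    return train_result, test_result
```

Going by the `checkpoint` parameter description, a model saved this way could presumably be restored later by passing the same path as `checkpoint` to the constructor.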

Attributes

`predict_step_generator`
`predict_step_matrix`