DeepMEL

Authors: Liesbeth Minnoye, Ibrahim Ihsan Taskiran, David Mauduit, Maurizio Fazio, Linde Van Aerschot, Gert Hulsemans, Valerie Christiaens, Samira Makhzami, Monika Seltenhammer, Panagiotis Karras, Aline Primot, Edouard Cadieu, Ellen van Rooijen, Jean-Christophe Marine, Giorgia Egidy Maskos, Ghanem-Elias Ghanem, Leonard Zon, Jasper Wouters, Stein Aerts

License: MIT

Contributed by: Ibrahim Ihsan Taskiran, Liesbeth Minnoye, Stein Aerts

Cite as: https://doi.org/10.1101/2019.12.21.885715

Type: None

Postprocessing: None

Trained on: Accessible genomic sites. Held-out chromosome chr2.

Model predicting melanoma-specific accessible regions

Command line

Create a new conda environment with all dependencies installed
kipoi env create DeepMEL
source activate kipoi-DeepMEL
Install model dependencies into current environment
kipoi env install DeepMEL
Test the model
kipoi test DeepMEL --source=kipoi
Make a prediction
kipoi get-example DeepMEL -o example
kipoi predict DeepMEL \
  --dataloader_args='{"intervals_file": "example/intervals_file", "fasta_file": "example/fasta_file"}' \
  -o '/tmp/DeepMEL.example_pred.tsv'
# check the results
head '/tmp/DeepMEL.example_pred.tsv'

Python

Get the model
import kipoi
model = kipoi.get_model('DeepMEL')
Make a prediction for example files
pred = model.pipeline.predict_example()
Use dataloader and model separately
# Download example dataloader kwargs
dl_kwargs = model.default_dataloader.download_example('example')
# Get the dataloader and instantiate it
dl = model.default_dataloader(**dl_kwargs)
# get a batch iterator
it = dl.batch_iter(batch_size=4)
# predict for a batch
batch = next(it)
model.predict_on_batch(batch['inputs'])
Make predictions for custom files directly
pred = model.pipeline.predict(dl_kwargs, batch_size=4)

R

Get the model
library(reticulate)
kipoi <- import('kipoi')
model <- kipoi$get_model('DeepMEL')
Make a prediction for example files
predictions <- model$pipeline$predict_example()
Use dataloader and model separately
# Download example dataloader kwargs
dl_kwargs <- model$default_dataloader$download_example('example')
# Get the dataloader, passing the list entries as keyword arguments
dl <- do.call(model$default_dataloader, dl_kwargs)
# get a batch iterator
it <- dl$batch_iter(batch_size=4L)
# predict for a batch
batch <- iter_next(it)
model$predict_on_batch(batch$inputs)
Make predictions for custom files directly
pred <- model$pipeline$predict(dl_kwargs, batch_size=4L)

Schema

Inputs

List of numpy arrays

Name: None

    Shape: (500, 4) 

    Doc: DNA sequence

Name: None

    Shape: (500, 4) 

    Doc: Reverse-complemented DNA sequence
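
As a rough illustration of the two inputs described above, the sketch below one-hot encodes a 500 bp sequence and its reverse complement with NumPy. The A, C, G, T column order and the helper names `one_hot` and `reverse_complement` are assumptions made for this example; this page does not state the channel order, and the default dataloader is expected to handle this step in normal use.

```python
import numpy as np

# Assumed channel order (not stated on this page): columns are A, C, G, T.
BASES = "ACGT"


def one_hot(seq):
    """One-hot encode a DNA string into an array of shape (len(seq), 4)."""
    arr = np.zeros((len(seq), len(BASES)), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:  # bases outside ACGT (for example N) stay all-zero
            arr[i, j] = 1.0
    return arr


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


seq = "ACGTN" * 100  # toy 500 bp sequence, not real data
forward = one_hot(seq)
reverse = one_hot(reverse_complement(seq))

# Two arrays, each with a leading batch dimension of one region.
inputs = [forward[np.newaxis], reverse[np.newaxis]]
```

With this column order, the reverse-complement array is the forward array flipped along both axes, which gives a quick consistency check on a hand-built input.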


Targets

Single numpy array

Name: topic

    Shape: (24,) 

    Doc: Topic predictions for 24 topics (topic 4: MEL, topic 7: MES)
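
A minimal sketch of reading the two named topics out of a prediction array of shape (N, 24). It assumes that topics are numbered from 1 in column order, so topic 4 sits in column 3 and topic 7 in column 6; this page does not confirm that mapping. The helper `mel_mes_scores` is hypothetical, and the array below is a placeholder, not model output.

```python
import numpy as np

# Assumption: topic k is stored in column k - 1 of the 24-column output.
MEL_COL = 4 - 1  # topic 4 (MEL)
MES_COL = 7 - 1  # topic 7 (MES)


def mel_mes_scores(pred):
    """Return the MEL and MES columns of an (N, 24) prediction array."""
    pred = np.asarray(pred)
    if pred.ndim != 2 or pred.shape[1] != 24:
        raise ValueError("expected an array of shape (N, 24)")
    return pred[:, MEL_COL], pred[:, MES_COL]


# Placeholder standing in for model predictions on two regions.
pred = np.zeros((2, 24), dtype=np.float32)
pred[0, MEL_COL] = 1.0
pred[1, MES_COL] = 1.0

mel, mes = mel_mes_scores(pred)
```

In practice `pred` would come from `model.pipeline.predict(...)`, provided its output has the shape listed above.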


Dataloader

Defined as: .

Doc: Data-loader returning one-hot encoded sequences given genome intervals

Authors: Ibrahim Ihsan Taskiran

Type: None

License: MIT


Arguments

intervals_file : Path to the intervals file in BED3 format (chrom, start, end).

fasta_file : Reference genome FASTA file path.

ignore_targets (optional): If True, do not return any target variables.
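
The arguments above can be prepared with a short script such as the following sketch, which writes a BED3 intervals file of 500 bp regions. The 500 bp width mirrors the input shape in the schema; whether the dataloader requires exactly that width or resizes intervals itself is not stated here, so treat it as an assumption. The coordinates and the helper names `centered_interval` and `write_bed3` are hypothetical.

```python
import os
import tempfile

WIDTH = 500  # assumed to match the (500, 4) input shape in the schema


def centered_interval(chrom, center, width=WIDTH):
    """Return a (chrom, start, end) BED3 record of the given width around center."""
    start = center - width // 2
    if start < 0:
        raise ValueError("interval would start before the chromosome begins")
    return chrom, start, start + width


def write_bed3(records, path):
    """Write records as tab-separated BED3 lines (0-based start, exclusive end)."""
    with open(path, "w") as handle:
        for chrom, start, end in records:
            handle.write(f"{chrom}\t{start}\t{end}\n")


# Made-up coordinates for illustration only.
records = [
    centered_interval("chr2", 1000000),
    centered_interval("chr2", 2500250),
]
bed_path = os.path.join(tempfile.mkdtemp(), "intervals.bed")
write_bed3(records, bed_path)
```

The resulting path could then be passed as `intervals_file`, together with a reference genome as `fasta_file`, in the dataloader arguments.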


Model dependencies
conda:
  • python=3.6
  • h5py

pip:
  • keras>=2.2.4
  • tensorflow>=1.14.0

Dataloader dependencies
conda:
  • python=3.6
  • bioconda::pybedtools
  • bioconda::pysam
  • bioconda::pyfaidx
  • numpy
  • pandas

pip:
  • kipoiseq