DeepCpG_DNA/Smallwood2014_serum_dna
Authors: Christof Angermueller
License: MIT
Contributed by: Roman Kreuzhuber
Cite as:
https://doi.org/10.1186/s13059-017-1189-z
https://doi.org/10.5281/zenodo.1094823
Type: keras
Postprocessing: None
Trained on: scBS-seq and scRRBS-seq datasets, https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1189-z#Sec7
This model is the extracted DNA module of a pretrained DeepCpG model by Christof Angermueller. The DeepCpG models were trained on the following data. The scBS-seq-profiled cells comprised 18 serum and 12 2i mESCs, which were pre-processed as described in Smallwood et al. (2014), with reads mapped to the GRCm38 mouse genome. Two serum cells (RSC27_4, RSC27_7) were excluded because their methylation patterns deviated strongly from those of the remaining serum cells. The scRRBS-seq-profiled cells were downloaded from the Gene Expression Omnibus (GEO; GSE65364) and comprised 25 human HCCs, six human hepatoblastoma-derived cells (HepG2) and six mESCs. Following Hou et al. (2013), one HCC (Ca26) was excluded and the analysis was restricted to CpG sites covered by at least four reads. CpG site positions were lifted from GRCh37 to GRCh38 for HCCs and HepG2 cells, and from NCBIM37 to GRCm38 for mESCs, using the liftOver tool from the UCSC Genome Browser.
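The coverage restriction described above can be sketched as a simple filter. This is a minimal illustration and not the authors' pipeline: the record layout `(chrom, pos, n_methylated, n_total)`, the helper name `filter_by_coverage` and the toy values are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (chromosome, position, methylated reads, total reads).
CpGRecord = Tuple[str, int, int, int]


def filter_by_coverage(records: Iterable[CpGRecord], min_reads: int = 4) -> List[CpGRecord]:
    """Keep only CpG sites covered by at least `min_reads` reads."""
    return [rec for rec in records if rec[3] >= min_reads]


# Toy data, invented for illustration only.
sites = [
    ("chr1", 10468, 3, 5),
    ("chr1", 10470, 1, 2),
    ("chr2", 20001, 0, 4),
]
kept = filter_by_coverage(sites)
```

With the default threshold of four reads, the second toy site (two reads) is dropped and the other two are kept.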
kipoi env create DeepCpG_DNA/Smallwood2014_serum_dna
source activate kipoi-DeepCpG_DNA__Smallwood2014_serum_dna
kipoi test DeepCpG_DNA/Smallwood2014_serum_dna --source=kipoi
kipoi get-example DeepCpG_DNA/Smallwood2014_serum_dna -o example
kipoi predict DeepCpG_DNA/Smallwood2014_serum_dna \
--dataloader_args='{"fasta_file": "example/fasta_file", "intervals_file": "example/intervals_file"}' \
-o '/tmp/DeepCpG_DNA|Smallwood2014_serum_dna.example_pred.tsv'
# check the results
head '/tmp/DeepCpG_DNA|Smallwood2014_serum_dna.example_pred.tsv'
kipoi env create DeepCpG_DNA/Smallwood2014_serum_dna
source activate kipoi-DeepCpG_DNA__Smallwood2014_serum_dna
import kipoi
model = kipoi.get_model('DeepCpG_DNA/Smallwood2014_serum_dna')
pred = model.pipeline.predict_example(batch_size=4)
# Download example dataloader kwargs
dl_kwargs = model.default_dataloader.download_example('example')
# Get the dataloader and instantiate it
dl = model.default_dataloader(**dl_kwargs)
# get a batch iterator
batch_iterator = dl.batch_iter(batch_size=4)
for batch in batch_iterator:
    # predict for a batch
    batch_pred = model.predict_on_batch(batch['inputs'])
pred = model.pipeline.predict(dl_kwargs, batch_size=4)
library(reticulate)
kipoi <- import('kipoi')
model <- kipoi$get_model('DeepCpG_DNA/Smallwood2014_serum_dna')
predictions <- model$pipeline$predict_example()
# Download example dataloader kwargs
dl_kwargs <- model$default_dataloader$download_example('example')
# Get the dataloader
dl <- do.call(model$default_dataloader, dl_kwargs)
# get a batch iterator
it <- dl$batch_iter(batch_size=4)
# predict for a batch
batch <- iter_next(it)
model$predict_on_batch(batch$inputs)
pred <- model$pipeline$predict(dl_kwargs, batch_size=4)
docker pull kipoi/kipoi-docker:sharedpy3keras1.2-slim
docker pull kipoi/kipoi-docker:sharedpy3keras1.2
docker run -it kipoi/kipoi-docker:sharedpy3keras1.2-slim
docker run kipoi/kipoi-docker:sharedpy3keras1.2-slim kipoi test DeepCpG_DNA/Smallwood2014_serum_dna --source=kipoi
# Create an example directory containing the data
mkdir -p $PWD/kipoi-example
# You can replace $PWD/kipoi-example with a different absolute path containing the data
docker run -v $PWD/kipoi-example:/app/ kipoi/kipoi-docker:sharedpy3keras1.2-slim \
kipoi get-example DeepCpG_DNA/Smallwood2014_serum_dna -o /app/example
docker run -v $PWD/kipoi-example:/app/ kipoi/kipoi-docker:sharedpy3keras1.2-slim \
kipoi predict DeepCpG_DNA/Smallwood2014_serum_dna \
--dataloader_args='{"fasta_file": "/app/example/fasta_file", "intervals_file": "/app/example/intervals_file"}' \
-o '/app/DeepCpG_DNA_Smallwood2014_serum_dna.example_pred.tsv'
# check the results
head $PWD/kipoi-example/DeepCpG_DNA_Smallwood2014_serum_dna.example_pred.tsv
https://apptainer.org/docs/user/main/quick_start.html#quick-installation-steps
kipoi get-example DeepCpG_DNA/Smallwood2014_serum_dna -o example
kipoi predict DeepCpG_DNA/Smallwood2014_serum_dna \
--dataloader_args='{"fasta_file": "example/fasta_file", "intervals_file": "example/intervals_file"}' \
-o 'DeepCpG_DNA_Smallwood2014_serum_dna.example_pred.tsv' \
--singularity
# check the results
head DeepCpG_DNA_Smallwood2014_serum_dna.example_pred.tsv
Targets
List of numpy arrays
Name: cpg/BS27_1_SER
Doc: Methylation probability for cpg/BS27_1_SER
Name: cpg/BS27_3_SER
Doc: Methylation probability for cpg/BS27_3_SER
Name: cpg/BS27_5_SER
Doc: Methylation probability for cpg/BS27_5_SER
Name: cpg/BS27_6_SER
Doc: Methylation probability for cpg/BS27_6_SER
Name: cpg/BS27_8_SER
Doc: Methylation probability for cpg/BS27_8_SER
Name: cpg/BS28_10_SER
Doc: Methylation probability for cpg/BS28_10_SER
Name: cpg/BS28_1_SER
Doc: Methylation probability for cpg/BS28_1_SER
Name: cpg/BS28_2_SER
Doc: Methylation probability for cpg/BS28_2_SER
Name: cpg/BS28_3_SER
Doc: Methylation probability for cpg/BS28_3_SER
Name: cpg/BS28_4_SER
Doc: Methylation probability for cpg/BS28_4_SER
Name: cpg/BS28_6_SER
Doc: Methylation probability for cpg/BS28_6_SER
Name: cpg/BS29_1_SER
Doc: Methylation probability for cpg/BS29_1_SER
Name: cpg/BS29_4_SER
Doc: Methylation probability for cpg/BS29_4_SER
Name: cpg/BS29_5_SER
Doc: Methylation probability for cpg/BS29_5_SER
Name: cpg/BS29_6_SER
Doc: Methylation probability for cpg/BS29_6_SER
Name: cpg/BS29_7_SER
Doc: Methylation probability for cpg/BS29_7_SER
Name: cpg/BS29_8_SER
Doc: Methylation probability for cpg/BS29_8_SER
Name: cpg/BS29_9_SER
Doc: Methylation probability for cpg/BS29_9_SER
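Because the prediction is a list with one array per target, it can help to attach the target names listed above to the outputs. The following is a minimal sketch under one explicit assumption: that the model returns its outputs in the same order as this listing. The helper `label_predictions` and the placeholder values are hypothetical and not part of the Kipoi API.

```python
from typing import Dict, List, Sequence

# Target names in the order listed above.
TARGET_NAMES: List[str] = [
    "cpg/BS27_1_SER", "cpg/BS27_3_SER", "cpg/BS27_5_SER", "cpg/BS27_6_SER",
    "cpg/BS27_8_SER", "cpg/BS28_10_SER", "cpg/BS28_1_SER", "cpg/BS28_2_SER",
    "cpg/BS28_3_SER", "cpg/BS28_4_SER", "cpg/BS28_6_SER", "cpg/BS29_1_SER",
    "cpg/BS29_4_SER", "cpg/BS29_5_SER", "cpg/BS29_6_SER", "cpg/BS29_7_SER",
    "cpg/BS29_8_SER", "cpg/BS29_9_SER",
]


def label_predictions(pred: Sequence[Sequence[float]]) -> Dict[str, Sequence[float]]:
    """Pair each output array with its target name.

    Assumes (not verified here) that outputs arrive in the listed order.
    """
    if len(pred) != len(TARGET_NAMES):
        raise ValueError(f"expected {len(TARGET_NAMES)} outputs, got {len(pred)}")
    return dict(zip(TARGET_NAMES, pred))


# Placeholder values standing in for model output (18 outputs, batch of 2).
fake_pred = [[0.5, 0.5] for _ in TARGET_NAMES]
labeled = label_predictions(fake_pred)
```

In practice, `fake_pred` would be replaced by the list returned from `model.predict_on_batch(...)`. The length check guards against silently mislabeling outputs if the number of arrays differs from the 18 targets.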
- python=3.7
- h5py=2.10.0
- pip=20.2.4
- tensorflow==1.13.1
- keras==1.2.2
- protobuf==3.20
- bioconda::genomelake=0.1.4
- bioconda::pybedtools=0.8.1
- python=3.7
- numpy=1.19.2
- pandas=1.1.3