MMSplice/modularPredictions
Type: custom
Postprocessing: variant_effects
Trained on: MPRA (Rosenberg 2015), GENCODE annotation 24. Chromosomes 1 to 8 were used as training data; chromosomes 9 to 22 and chromosome X were held out.
Predict splicing variant effect from VCF
Returns the raw predictions of the five modules for the reference sequence and the alternative sequence, as a vector of length 10 for each variant-exon pair.
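As a minimal sketch of how one length-10 vector might be unpacked, the block below splits it into a reference half and an alternative half. The ordering (first five values for the reference sequence, last five for the alternative sequence) and the order of the modules are assumptions, not something this page states, so check them against the model source before relying on them. The name `split_modular` is hypothetical.

```python
# Module order is an ASSUMPTION (it follows the order the inputs are listed in).
MODULES = ("acceptor_intron", "acceptor", "exon", "donor", "donor_intron")


def split_modular(row, ref_first=True):
    """Split one length-10 prediction vector into per-module ref and alt dicts.

    ref_first=True ASSUMES the first five values belong to the reference
    sequence; pass ref_first=False if the model emits the alternative first.
    """
    values = list(row)
    if len(values) != 2 * len(MODULES):
        raise ValueError(f"expected {2 * len(MODULES)} values, got {len(values)}")
    first, second = values[:len(MODULES)], values[len(MODULES):]
    ref, alt = (first, second) if ref_first else (second, first)
    return dict(zip(MODULES, ref)), dict(zip(MODULES, alt))


if __name__ == "__main__":
    ref, alt = split_modular(range(10))
    print(ref["exon"], alt["exon"])
```

Keeping the ordering behind a flag makes the assumption explicit and easy to flip once the actual layout has been confirmed.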
kipoi env create MMSplice/modularPredictions
source activate kipoi-MMSplice__modularPredictions
kipoi test MMSplice/modularPredictions --source=kipoi
kipoi get-example MMSplice/modularPredictions -o example
kipoi predict MMSplice/modularPredictions \
--dataloader_args='{"gtf": "example/gtf", "fasta_file": "example/fasta_file", "vcf_file": "example/vcf_file", "exon_cut_l": 0, "exon_cut_r": 0, "acceptor_intron_cut": 6, "donor_intron_cut": 6, "acceptor_intron_len": 50, "acceptor_exon_len": 3, "donor_exon_len": 5, "donor_intron_len": 13}' \
-o '/tmp/MMSplice|modularPredictions.example_pred.tsv'
# check the results
head '/tmp/MMSplice|modularPredictions.example_pred.tsv'
kipoi env create MMSplice/modularPredictions
source activate kipoi-MMSplice__modularPredictions
import kipoi
model = kipoi.get_model('MMSplice/modularPredictions')
pred = model.pipeline.predict_example(batch_size=4)
# Download example dataloader kwargs
dl_kwargs = model.default_dataloader.download_example('example')
# Get the dataloader and instantiate it
dl = model.default_dataloader(**dl_kwargs)
# get a batch iterator
batch_iterator = dl.batch_iter(batch_size=4)
for batch in batch_iterator:
    # predict for a batch
    batch_pred = model.predict_on_batch(batch['inputs'])
pred = model.pipeline.predict(dl_kwargs, batch_size=4)
library(reticulate)
kipoi <- import('kipoi')
model <- kipoi$get_model('MMSplice/modularPredictions')
predictions <- model$pipeline$predict_example()
# Download example dataloader kwargs
dl_kwargs <- model$default_dataloader$download_example('example')
# Get the dataloader
dl <- do.call(model$default_dataloader, dl_kwargs)
# get a batch iterator
it <- dl$batch_iter(batch_size=4)
# predict for a batch
batch <- iter_next(it)
model$predict_on_batch(batch$inputs)
pred <- model$pipeline$predict(dl_kwargs, batch_size=4)
docker pull kipoi/kipoi-docker:mmsplice-slim
docker pull kipoi/kipoi-docker:mmsplice
docker run -it kipoi/kipoi-docker:mmsplice-slim
docker run kipoi/kipoi-docker:mmsplice-slim kipoi test MMSplice/modularPredictions --source=kipoi
# Create an example directory containing the data
mkdir -p $PWD/kipoi-example
# You can replace $PWD/kipoi-example with a different absolute path containing the data
docker run -v $PWD/kipoi-example:/app/ kipoi/kipoi-docker:mmsplice-slim \
kipoi get-example MMSplice/modularPredictions -o /app/example
docker run -v $PWD/kipoi-example:/app/ kipoi/kipoi-docker:mmsplice-slim \
kipoi predict MMSplice/modularPredictions \
--dataloader_args='{"gtf": "/app/example/gtf", "fasta_file": "/app/example/fasta_file", "vcf_file": "/app/example/vcf_file", "exon_cut_l": 0, "exon_cut_r": 0, "acceptor_intron_cut": 6, "donor_intron_cut": 6, "acceptor_intron_len": 50, "acceptor_exon_len": 3, "donor_exon_len": 5, "donor_intron_len": 13}' \
-o '/app/MMSplice_modularPredictions.example_pred.tsv'
# check the results
head $PWD/kipoi-example/MMSplice_modularPredictions.example_pred.tsv
Install Apptainer: https://apptainer.org/docs/user/main/quick_start.html#quick-installation-steps
kipoi get-example MMSplice/modularPredictions -o example
kipoi predict MMSplice/modularPredictions \
--dataloader_args='{"gtf": "example/gtf", "fasta_file": "example/fasta_file", "vcf_file": "example/vcf_file", "exon_cut_l": 0, "exon_cut_r": 0, "acceptor_intron_cut": 6, "donor_intron_cut": 6, "acceptor_intron_len": 50, "acceptor_exon_len": 3, "donor_exon_len": 5, "donor_intron_len": 13}' \
-o 'MMSplice_modularPredictions.example_pred.tsv' \
--singularity
# check the results
head MMSplice_modularPredictions.example_pred.tsv
Inputs
Dictionary of numpy arrays
Name: seq/acceptor_intron
Doc: reference sequence of acceptor intron
Name: seq/acceptor
Doc: reference sequence of acceptor
Name: seq/exon
Doc: reference sequence of exon
Name: seq/donor
Doc: reference sequence of donor
Name: seq/donor_intron
Doc: reference sequence of donor intron
Name: mut_seq/acceptor_intron
Doc: alternative sequence of acceptor intron
Name: mut_seq/acceptor
Doc: alternative sequence of acceptor
Name: mut_seq/exon
Doc: alternative sequence of exon
Name: mut_seq/donor
Doc: alternative sequence of donor
Name: mut_seq/donor_intron
Doc: alternative sequence of donor intron
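The ten named inputs above can be checked before a batch is passed to the model. The following is a sketch only: it assumes the batch inputs arrive either as a nested dictionary (`seq` and `mut_seq`, each holding the five regions) or as a flat dictionary keyed by the slash-separated names, and the helper names `flatten_keys` and `missing_inputs` are hypothetical.

```python
REGIONS = ("acceptor_intron", "acceptor", "exon", "donor", "donor_intron")
EXPECTED_KEYS = [f"{group}/{region}"
                 for group in ("seq", "mut_seq")
                 for region in REGIONS]


def flatten_keys(inputs, prefix=""):
    """Return slash-joined key paths for a (possibly nested) dict of inputs."""
    keys = []
    for name, value in inputs.items():
        path = f"{prefix}/{name}" if prefix else name
        if isinstance(value, dict):
            keys.extend(flatten_keys(value, path))
        else:
            keys.append(path)
    return keys


def missing_inputs(inputs):
    """List the expected input names that are absent from `inputs`."""
    present = set(flatten_keys(inputs))
    return [key for key in EXPECTED_KEYS if key not in present]


if __name__ == "__main__":
    # Toy batch with placeholder strings instead of real sequence arrays.
    batch = {"seq": {r: "ACGT" for r in REGIONS},
             "mut_seq": {r: "ACGA" for r in REGIONS if r != "exon"}}
    print(missing_inputs(batch))
```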
Defined as: ..
Doc: This model first predicts the effect of a variant with five sub-modules (acceptor intron module, acceptor module, exon module, donor module, donor intron module) and then integrates those predictions using linear regression. The model was trained to predict the change in PSI (delta PSI) caused by a variant.
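The two-step design described above (five module scores, then a linear combination) can be illustrated with a toy sketch. The weights and intercept below are placeholders chosen for the example, not the trained coefficients, and the real model may use additional terms; the function names are hypothetical.

```python
def module_deltas(ref_scores, alt_scores):
    """Per-module difference: alternative score minus reference score."""
    if len(ref_scores) != len(alt_scores):
        raise ValueError("ref and alt must have the same number of module scores")
    return [alt - ref for ref, alt in zip(ref_scores, alt_scores)]


def integrate(deltas, weights, intercept=0.0):
    """Linear combination of module deltas (a stand-in for the regression step)."""
    if len(deltas) != len(weights):
        raise ValueError("one weight per module is required")
    return intercept + sum(w * d for w, d in zip(weights, deltas))


if __name__ == "__main__":
    ref = [1, 2, 3, 4, 5]              # toy module scores, reference sequence
    alt = [2, 2, 3, 4, 7]              # toy module scores, alternative sequence
    weights = [0.5, 1, 1, 1, 0.25]     # PLACEHOLDER weights, not fitted values
    print(integrate(module_deltas(ref, alt), weights))
```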
Type: SampleIterator
License: MIT
Arguments
gtf : path to the GTF annotation file required by the model (Ensembl)
fasta_file : path to the reference genome FASTA file
vcf_file : path to the input VCF file
split_seq (optional): whether to split the sequence in the dataloader
encode (optional): if the sequence is split, whether to one-hot encode it
exon_cut_l (optional): when extracting the exon feature, number of base pairs to cut from the beginning of the exon
exon_cut_r (optional): when extracting the exon feature, number of base pairs to cut from the end of the exon
acceptor_intron_cut (optional): number of bp to cut from the end of the acceptor intron; these bases are considered part of the acceptor site
donor_intron_cut (optional): number of bp to cut from the end of the donor intron; these bases are considered part of the donor site
acceptor_intron_len (optional): length of acceptor intron sequence to consider for the acceptor site model
acceptor_exon_len (optional): length of acceptor exon sequence to consider for the acceptor site model
donor_exon_len (optional): length of donor exon sequence to consider for the donor site model
donor_intron_len (optional): length of donor intron sequence to consider for the donor site model
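Because `--dataloader_args` takes a JSON string, quoting mistakes on the shell are easy to make. One way to avoid them, sketched here, is to build the arguments as a Python dictionary and serialize them with `json.dumps`. The paths and values are the example ones used on this page; `build_predict_command` is a hypothetical helper, and the block only prints the command rather than running it.

```python
import json
import shlex

# Example values taken from the commands shown on this page.
dataloader_args = {
    "gtf": "example/gtf",
    "fasta_file": "example/fasta_file",
    "vcf_file": "example/vcf_file",
    "exon_cut_l": 0,
    "exon_cut_r": 0,
    "acceptor_intron_cut": 6,
    "donor_intron_cut": 6,
    "acceptor_intron_len": 50,
    "acceptor_exon_len": 3,
    "donor_exon_len": 5,
    "donor_intron_len": 13,
}


def build_predict_command(model, dl_args, output):
    """Return the `kipoi predict` argument list with JSON-encoded dataloader args."""
    return [
        "kipoi", "predict", model,
        "--dataloader_args=" + json.dumps(dl_args),
        "-o", output,
    ]


cmd = build_predict_command(
    "MMSplice/modularPredictions",
    dataloader_args,
    "MMSplice_modularPredictions.example_pred.tsv",
)

if __name__ == "__main__":
    # shlex.join (Python 3.8+) renders a copy-pasteable, correctly quoted line.
    print(shlex.join(cmd))
```

The argument list can also be handed to `subprocess.run(cmd, check=True)` directly, which bypasses shell quoting altogether.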
- python=3.7
- pip=21.0.1
- h5py==2.10.0
- mmsplice==1.0.3
- protobuf==3.20
- bioconda::cyvcf2=0.11.5
- bioconda::pyranges=0.0.66
- bioconda::pysam=0.15.3
- python=3.7
- mmsplice==1.0.3
- protobuf==3.20