APARENT/site_probabilities
Authors: Nicholas Bogard, Johannes Linder
License: MIT
Contributed by: Shabnam Sadegharmaki, Ziga Avsec, Muhammed Hasan Çelik, Florian R. Hölzlwimmer
Cite as: https://doi.org/10.1101/300061
Type: None
Postprocessing: None
Trained on: isoform expression data from over 3 million APA reporters, built by inserting random sequence into 12 distinct 3'UTR contexts.
Predicting the Impact of cis-Regulatory Variation on Alternative Polyadenylation
Abstract: Alternative polyadenylation (APA) is a major driver of transcriptome diversity in human cells. Here, we use deep learning to predict APA from DNA sequence alone. We trained our model (APARENT, APA REgression NeT) on isoform expression data from over three million APA reporters, built by inserting random sequence into twelve distinct 3′UTR contexts. Predictions are highly accurate across both synthetic and genomic contexts; when tasked with inferring APA in human 3′UTRs, APARENT outperforms models trained exclusively on endogenous data. Visualizing features learned across all network layers reveals that APARENT recognizes sequence motifs known to recruit APA regulators, discovers previously unknown sequence determinants of cleavage site selection, and integrates these features into a comprehensive, interpretable cis-regulatory code. Finally, we use APARENT to quantify the impact of genetic variants on APA. Our approach detects pathogenic variants in a wide range of disease contexts, expanding our understanding of the genetic origins of disease.
kipoi env create APARENT/site_probabilities
source activate kipoi-APARENT__site_probabilities
kipoi test APARENT/site_probabilities --source=kipoi
kipoi get-example APARENT/site_probabilities -o example
kipoi predict APARENT/site_probabilities \
--dataloader_args='{"fasta_file": "example/chr22.fa", "gtf_file": "example/chr22.gtf.gz"}' \
-o '/tmp/APARENT|site_probabilities.example_pred.tsv'
# check the results
head '/tmp/APARENT|site_probabilities.example_pred.tsv'
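The `head` check above can also be done from Python, which may be convenient when the predictions feed into further analysis. The following is a minimal sketch using only the standard library; `head_tsv` is a hypothetical helper name, and the sketch assumes only that the output file is tab-separated with a header row. It makes no assumption about the column names.

```python
import csv
import os
from itertools import islice


def head_tsv(path, n=5):
    """Return (header, first n data rows) of a tab-separated file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])      # first line is assumed to be the header
        rows = list(islice(reader, n))  # at most n data rows
    return header, rows


if __name__ == "__main__":
    # Path taken from the kipoi predict command above
    path = "/tmp/APARENT|site_probabilities.example_pred.tsv"
    if os.path.exists(path):
        header, rows = head_tsv(path)
        print(header)
        for row in rows:
            print(row)
```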
import kipoi
model = kipoi.get_model('APARENT/site_probabilities')
pred = model.pipeline.predict_example(batch_size=4)
# Download example dataloader kwargs
dl_kwargs = model.default_dataloader.download_example('example')
# Get the dataloader and instantiate it
dl = model.default_dataloader(**dl_kwargs)
# get a batch iterator
batch_iterator = dl.batch_iter(batch_size=4)
for batch in batch_iterator:
    # predict for a batch
    batch_pred = model.predict_on_batch(batch['inputs'])
pred = model.pipeline.predict(dl_kwargs, batch_size=4)
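The batch loop above overwrites `batch_pred` on every iteration. If the per-batch results are needed afterwards, one option is to gather them per target name. The sketch below assumes that `predict_on_batch` returns a dictionary keyed by target name with one entry per input sequence, as the Targets description ("Dictionary of numpy arrays") suggests; `collect_predictions` is a hypothetical helper and not part of the kipoi API.

```python
from collections import defaultdict


def collect_predictions(batch_iterator, predict_on_batch):
    """Run predict_on_batch on every batch and gather the outputs per target name."""
    collected = defaultdict(list)
    for batch in batch_iterator:
        batch_pred = predict_on_batch(batch["inputs"])
        for name, values in batch_pred.items():
            # extend() iterates over the first axis, so each input
            # sequence contributes one entry per target
            collected[name].extend(values)
    return dict(collected)
```

With the objects defined above, this would presumably be called as `collect_predictions(dl.batch_iter(batch_size=4), model.predict_on_batch)`. For most uses, `model.pipeline.predict` already does the equivalent work in one call.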
library(reticulate)
kipoi <- import('kipoi')
model <- kipoi$get_model('APARENT/site_probabilities')
predictions <- model$pipeline$predict_example()
# Download example dataloader kwargs
dl_kwargs <- model$default_dataloader$download_example('example')
# Get the dataloader, passing dl_kwargs as keyword arguments
dl <- do.call(model$default_dataloader, dl_kwargs)
# get a batch iterator
it <- dl$batch_iter(batch_size=4)
# predict for a batch
batch <- iter_next(it)
model$predict_on_batch(batch$inputs)
pred <- model$pipeline$predict(dl_kwargs, batch_size=4)
docker pull kipoi/kipoi-docker:aparent-site_probabilities-slim
docker pull kipoi/kipoi-docker:aparent-site_probabilities
docker run -it kipoi/kipoi-docker:aparent-site_probabilities-slim
docker run kipoi/kipoi-docker:aparent-site_probabilities-slim kipoi test APARENT/site_probabilities --source=kipoi
# Create an example directory containing the data
mkdir -p $PWD/kipoi-example
# You can replace $PWD/kipoi-example with a different absolute path containing the data
docker run -v $PWD/kipoi-example:/app/ kipoi/kipoi-docker:aparent-site_probabilities-slim \
kipoi get-example APARENT/site_probabilities -o /app/example
docker run -v $PWD/kipoi-example:/app/ kipoi/kipoi-docker:aparent-site_probabilities-slim \
kipoi predict APARENT/site_probabilities \
--dataloader_args='{"fasta_file": "/app/example/chr22.fa", "gtf_file": "/app/example/chr22.gtf.gz"}' \
-o '/app/APARENT_site_probabilities.example_pred.tsv'
# check the results
head $PWD/kipoi-example/APARENT_site_probabilities.example_pred.tsv
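The `--dataloader_args` value in these commands is a JSON-style object wrapped in shell quotes, which is easy to mis-quote by hand. A minimal sketch of generating the option programmatically with the standard library follows; `dataloader_args_option` is a hypothetical helper, and the sketch assumes the option accepts a JSON object string as in the other examples on this page.

```python
import json
import shlex


def dataloader_args_option(**paths):
    """Build a shell-safe --dataloader_args option from keyword arguments."""
    payload = json.dumps(paths)  # JSON uses double quotes for keys and strings
    return "--dataloader_args=" + shlex.quote(payload)  # single-quote for the shell


print(dataloader_args_option(
    fasta_file="/app/example/chr22.fa",
    gtf_file="/app/example/chr22.gtf.gz",
))
# prints: --dataloader_args='{"fasta_file": "/app/example/chr22.fa", "gtf_file": "/app/example/chr22.gtf.gz"}'
```

Because `shlex.quote` handles the outer quoting, the same helper should also work for paths that contain spaces.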
Install Apptainer: https://apptainer.org/docs/user/main/quick_start.html#quick-installation-steps
kipoi get-example APARENT/site_probabilities -o example
kipoi predict APARENT/site_probabilities \
--dataloader_args='{"fasta_file": "example/chr22.fa", "gtf_file": "example/chr22.gtf.gz"}' \
-o 'APARENT_site_probabilities.example_pred.tsv' \
--singularity
# check the results
head APARENT_site_probabilities.example_pred.tsv
Targets
Dictionary of numpy arrays
Name: distal_prop
Doc: Predicts the proportion of cleavage occurring outside of the specified DNA range
Name: site_props
Doc: Predicts the proportion of cleavage occurring at each position in the specified DNA range. The sum of all site_props values plus distal_prop equals 1.
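The relationship between the two targets can be used as a sanity check on predictions. The sketch below is illustrative only: it assumes `site_props` holds one row of per-position proportions per input sequence and `distal_prop` holds one value per sequence (the exact array shapes are not stated here), and the numbers are made-up toy values, not model output. The helper names are hypothetical.

```python
def total_cleavage(site_props, distal_prop):
    """Per sequence: sum of site proportions plus the distal proportion."""
    return [
        float(sum(row)) + float(distal)
        for row, distal in zip(site_props, distal_prop)
    ]


def sums_to_one(site_props, distal_prop, tol=1e-3):
    """True if every sequence's proportions add up to 1 within tol."""
    return all(abs(t - 1.0) <= tol for t in total_cleavage(site_props, distal_prop))


# Toy values for two sequences with three positions each (not real predictions)
toy_pred = {
    "site_props": [[0.1, 0.2, 0.3], [0.25, 0.25, 0.25]],
    "distal_prop": [0.4, 0.25],
}
print(sums_to_one(toy_pred["site_props"], toy_pred["distal_prop"]))
```

The functions iterate row by row, so they should accept plain lists as well as numpy arrays under the shape assumption above.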
- python=3.9
- tensorflow
- keras>=2.0.4,<3
- python=3.9
- bioconda::kipoi
- bioconda::kipoiseq>=0.7.1
- bioconda::cyvcf2
- bioconda::pyranges