OptMRL

Authors: Frederick Korbel, Ekaterina Eroshok, Uwe Ohler

License: GPL-3

Contributed by: Frederick Korbel, Ekaterina Eroshok, Uwe Ohler

Cite as: https://doi.org/10.1101/2023.06.02.543405

Type: None

Postprocessing: None

Trained on: 260,000 random 5'UTR reporters (pre-training) + 20,000 human 5'UTR reporters (fine-tuning)


Predicts the mean ribosome load (MRL) of an mRNA from the 50 nucleotides of 5'UTR immediately upstream of the coding sequence.

Create a new conda environment with all dependencies installed
kipoi env create OptMRL
source activate kipoi-OptMRL
Test the model
kipoi test OptMRL --source=kipoi
Make a prediction
kipoi get-example OptMRL -o example
kipoi predict OptMRL \
  --dataloader_args='{"gtf_file": "example/gtf_file", "fasta_file": "example/fasta_file"}' \
  -o '/tmp/OptMRL.example_pred.tsv'
# check the results
head '/tmp/OptMRL.example_pred.tsv'
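For a closer look than `head` gives, the prediction file can be loaded in Python. The following is a minimal sketch, assuming the output is a tab-separated table whose first row is a header; the exact column names written by `kipoi predict` are not listed here, so the demo uses a made-up table with hypothetical columns, and `read_predictions` is an illustrative helper, not part of Kipoi.

```python
import csv
import os
import tempfile


def read_predictions(path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Demo on a made-up two-row table; column names and values are hypothetical
demo_path = os.path.join(tempfile.mkdtemp(), "demo_pred.tsv")
with open(demo_path, "w", newline="") as handle:
    handle.write("id\tprediction\nutr_1\t5.1\nutr_2\t6.3\n")

rows = read_predictions(demo_path)
print(rows[0])
```

With a real run, the same helper would be pointed at '/tmp/OptMRL.example_pred.tsv'.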
Create a new conda environment with all dependencies installed
kipoi env create OptMRL
source activate kipoi-OptMRL
Get the model
import kipoi
model = kipoi.get_model('OptMRL')
Make a prediction for example files
pred = model.pipeline.predict_example(batch_size=4)
Use dataloader and model separately
# Download example dataloader kwargs
dl_kwargs = model.default_dataloader.download_example('example')
# Get the dataloader and instantiate it
dl = model.default_dataloader(**dl_kwargs)
# get a batch iterator
batch_iterator = dl.batch_iter(batch_size=4)
for batch in batch_iterator:
    # predict for a batch
    batch_pred = model.predict_on_batch(batch['inputs'])
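The loop above overwrites `batch_pred` on every iteration, so only the last batch survives. A minimal sketch of collecting all batches is shown below; `predict_all` is a hypothetical helper, and the stand-in iterator and predictor exist only so the sketch runs without Kipoi installed.

```python
def predict_all(batch_iterator, predict_on_batch):
    """Run predict_on_batch on every batch and return a flat list of per-sample predictions."""
    predictions = []
    for batch in batch_iterator:
        batch_pred = predict_on_batch(batch["inputs"])
        # Iterating over a batch output yields one prediction per sample
        predictions.extend(batch_pred)
    return predictions


# Stand-ins for dl.batch_iter(...) and model.predict_on_batch (hypothetical data)
fake_batches = [{"inputs": [1, 2, 3, 4]}, {"inputs": [5, 6]}]


def fake_predict(inputs):
    return [[x * 0.5] for x in inputs]


all_preds = predict_all(fake_batches, fake_predict)
print(len(all_preds))
```

With the real objects, the call would presumably be `predict_all(dl.batch_iter(batch_size=4), model.predict_on_batch)`.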
Make predictions for custom files directly
pred = model.pipeline.predict(dl_kwargs, batch_size=4)
Get the model
library(reticulate)
kipoi <- import('kipoi')
model <- kipoi$get_model('OptMRL')
Make a prediction for example files
predictions <- model$pipeline$predict_example()
Use dataloader and model separately
# Download example dataloader kwargs
dl_kwargs <- model$default_dataloader$download_example('example')
# Get the dataloader
# dl_kwargs is a named list, so pass its elements as keyword arguments
dl <- do.call(model$default_dataloader, dl_kwargs)
# get a batch iterator
it <- dl$batch_iter(batch_size=4L)
# predict for a batch
batch <- iter_next(it)
model$predict_on_batch(batch$inputs)
Make predictions for custom files directly
pred <- model$pipeline$predict(dl_kwargs, batch_size=4L)
Docker image (standard and full-sized), including testing and prediction inside the container
Not available yet
Install apptainer
https://apptainer.org/docs/user/main/quick_start.html#quick-installation-steps
Make prediction for custom files with apptainer
Not available yet

Schema

Inputs

Single numpy array

Name: None

    Shape: (50, 4) 

    Doc: 50 nucleotide 5'UTR sequence
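To illustrate what a (50, 4) input looks like, here is a minimal one-hot encoding sketch in plain Python. The channel order A, C, G, T and the all-zero row for unknown bases are assumptions, not something the schema states, and `one_hot_utr` is a hypothetical helper; the default dataloader performs its own encoding, which may differ.

```python
ALPHABET = "ACGT"  # assumed channel order; not confirmed by the schema
UTR_LEN = 50


def one_hot_utr(utr, length=UTR_LEN):
    """Encode the last `length` nucleotides of a 5'UTR as a length x 4 one-hot matrix."""
    seq = utr.upper().replace("U", "T")
    if len(seq) < length:
        raise ValueError(f"need at least {length} nt, got {len(seq)}")
    window = seq[-length:]  # the nucleotides directly upstream of the coding sequence
    # Bases outside ALPHABET (e.g. N) become all-zero rows (an assumption)
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in window]


# Made-up 60 nt sequence; only its last 50 nt are encoded
encoded = one_hot_utr("G" * 10 + "ACGT" * 12 + "AN")
print(len(encoded), len(encoded[0]))
```

The nested list can be turned into an array with `numpy.asarray(encoded, dtype="float32")` before adding a batch dimension.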


Targets

Single numpy array

Name: None

    Shape: (1,) 

    Doc: mean ribosome load (MRL) of the mRNA, as measured by polysome profiling


Dataloader

Defined as: .

Doc: Dataloader for 5-prime UTR

Authors: Ziga Avsec

Type: Dataset

License: MIT


Arguments

gtf_file : file path; genome annotation GTF file

fasta_file : file path; reference genome sequence in FASTA format

disable_infer_transcripts : option to disable inferring transcripts. Can be set to True if the GTF file already has transcripts annotated.

disable_infer_genes : option to disable inferring genes. Can be set to True if the GTF file already has genes annotated.
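The arguments above can be assembled once and reused for both the Python API and the command line. The sketch below is illustrative only: `build_dataloader_kwargs` and `cli_dataloader_args` are hypothetical helpers, and because the defaults of the two optional flags are not documented here, they are included only when set explicitly.

```python
import json
import shlex


def build_dataloader_kwargs(gtf_file, fasta_file,
                            disable_infer_transcripts=None,
                            disable_infer_genes=None):
    """Collect dataloader arguments, leaving out optional flags that were not set."""
    kwargs = {"gtf_file": gtf_file, "fasta_file": fasta_file}
    if disable_infer_transcripts is not None:
        kwargs["disable_infer_transcripts"] = disable_infer_transcripts
    if disable_infer_genes is not None:
        kwargs["disable_infer_genes"] = disable_infer_genes
    return kwargs


def cli_dataloader_args(kwargs):
    """Render the kwargs as a shell-quoted JSON string for --dataloader_args."""
    return "--dataloader_args=" + shlex.quote(json.dumps(kwargs))


dl_kwargs = build_dataloader_kwargs(
    "example/gtf_file",
    "example/fasta_file",
    disable_infer_transcripts=True,
    disable_infer_genes=True,
)
cli_flag = cli_dataloader_args(dl_kwargs)
print(cli_flag)
```

The same `dl_kwargs` dictionary could then be passed to `model.pipeline.predict(dl_kwargs, batch_size=4)`.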


Model dependencies
conda:
  • python=3.9
  • keras
  • tensorflow
  • h5py
  • pip

pip:
  • kipoi

Dataloader dependencies
conda:
  • python=3.9
  • pip=21.0.0
  • bioconda::pybedtools

pip:
  • kipoi
  • kipoiseq
  • gffutils==0.10.1