Kipoi: Model zoo for genomics

Kipoi is a repository for predictive models in genomics. It provides a unified framework to archive, share, access, use and build on models developed by the community. Kipoi models come with code to preprocess and load input data in major file formats, which facilitates easy application to new datasets and creation of new derived models.

What can I do with Kipoi?

  • Explore hundreds of trained models for genomics and apply them to new data in a few lines of code.
  • Fine-tune an existing model on a new dataset. See our notebook and the Keras fine-tuning example.
  • Compose a new model using existing Kipoi models as building blocks. See how.
  • Contribute models and make them accessible for others. See how.
  • Use and develop routines for model post-processing and interpretation. Available routines:
    • Scoring the impact of genetic variants on molecular phenotypes. See docs and tutorial.
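
To make the variant-scoring routine concrete, here is a minimal sketch of the underlying idea: predict on the reference sequence, predict again with the alternative allele substituted, and report the difference. It does not use Kipoi's API. The names `one_hot`, `toy_model`, `apply_variant` and `variant_effect` are hypothetical, and `toy_model` (GC fraction) merely stands in for a trained model.

```python
# Toy illustration of variant effect scoring. All names here are
# hypothetical; this is not Kipoi's API or its scoring implementation.

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows over (A, C, G, T).

    Bases outside ACGT (for example N) become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def toy_model(encoded):
    """Stand-in for a trained model: fraction of G/C positions."""
    gc = sum(row[1] + row[2] for row in encoded)
    return gc / len(encoded)


def apply_variant(seq, pos, ref, alt):
    """Return seq with the single-nucleotide variant ref>alt at 0-based pos."""
    if seq[pos].upper() != ref.upper():
        raise ValueError(
            f"reference allele {ref!r} does not match sequence base {seq[pos]!r}"
        )
    return seq[:pos] + alt + seq[pos + 1:]


def variant_effect(model, seq, pos, ref, alt):
    """Score a variant as prediction(alt) - prediction(ref)."""
    ref_pred = model(one_hot(seq))
    alt_pred = model(one_hot(apply_variant(seq, pos, ref, alt)))
    return alt_pred - ref_pred


if __name__ == "__main__":
    effect = variant_effect(toy_model, "ACGTACGT", pos=0, ref="A", alt="G")
    print(f"effect: {effect:+.3f}")  # prints "effect: +0.125"
```

Because `variant_effect` takes the model as an argument, any callable that maps an encoded sequence to a number could be swapped in for `toy_model`. The difference score is only one possible choice of effect measure.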

Popular machine learning frameworks are easily integrated into Kipoi. So far we have integrated:

Models developed using other frameworks or custom Python code can also be submitted to Kipoi. See how.

Great! How do I get started?

Why do we need a model zoo for genomics?

Genomics uses experimental and computational methods to understand the structure, function and evolution of biomolecules and processes in health and disease through the lens of the genome. Revolutions in DNA sequencing and associated high-throughput assays have led to an explosion of genomic and molecular profiling datasets. Computational models are now routinely used to assimilate and derive predictions from these datasets.

While public databases have been developed for easy storage and access to genomic data, there is a lack of analogous repositories for computational models in genomics. Models are implemented in various programming languages and machine learning frameworks, stored in diverse formats and made available through different channels, such as code repositories and supplementary material of articles. Even with the availability of reliable code, replicating large models trained on large datasets can be challenging. The lack of a unified model repository makes it difficult to reproduce results, apply models to new data, systematically compare models and efficiently build on existing ones. Borrowing ideas from model zoos introduced in other application domains, we developed Kipoi with the primary goal of lowering the entry barrier to modeling in genomics.

About the team

Kipoi was first released on 13th March 2018. The founding team:

  • Žiga Avsec, Jun Cheng and Julien Gagneur, Technical University of Munich - Lab website
  • Roman Kreuzhuber, Lara Urban and Oliver Stegle, European Bioinformatics Institute - Lab website
  • Johnny Israeli, Avanti Shrikumar, Chuan Foo and Anshul Kundaje, Stanford University - Lab website