Kipoi - Seminar
The monthly virtual seminar series is designed as a platform for interested Kipoi users and developers and will host talks on the applications of deep learning to biological data. The seminar is held on the first Wednesday of every month from 5:30 p.m. to 6:30 p.m. CET. We are also happy to share the recordings of the seminar on YouTube.
How to take part
The Virtual Seminar Series takes place via Zoom. To take part in the seminar, you can register for the online Zoom conference here. Your personal join link will be valid for all upcoming lectures of the series.
How to apply as a speaker
The seminar is a great opportunity to present your recent work to a large international audience. If you want to apply as a speaker, please use the contact in the registration confirmation email.
SAILER and UFold: Deep Learning for scATAC-seq and RNA Secondary Structure Prediction
Date: 7 April 2021, 5:30 p.m. - 6:30 p.m. Central European Summer Time
Speaker: Yingxin Cao, PhD Student at UC Irvine
Abstract:
SAILER: Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) provides new opportunities to dissect epigenomic heterogeneity and elucidate transcriptional regulatory mechanisms. However, computational modelling of scATAC-seq data is challenging due to its high dimension, extreme sparsity, complex dependencies, and high sensitivity to confounding factors from various sources. In this work we propose a new deep generative model framework, named SAILER, for analyzing scATAC-seq data.
SAILER aims to learn a low-dimensional nonlinear latent representation of each cell that defines its intrinsic chromatin state, invariant to extrinsic confounding factors like read depth and batch effects. SAILER adopts the conventional encoder-decoder framework to learn the latent representation but imposes additional constraints to ensure the independence of the learned representations from the confounding factors. Experimental results on both simulated and real scATAC-seq datasets demonstrate that SAILER learns better and biologically more meaningful representations of cells than other methods. Its noise-free cell embeddings bring significant benefits to downstream analyses: both clustering and imputation based on SAILER improve significantly over existing methods. Moreover, because no matrix factorization is involved, SAILER can easily scale to process millions of cells. We implemented SAILER as a software package, freely available to all for large-scale scATAC-seq data analysis.
UFold: For many RNA molecules, the secondary structure is essential for the correct function of the RNA. Predicting RNA secondary structure from nucleotide sequences is a long-standing problem in genomics, but the prediction performance has reached a plateau over time. Traditional RNA secondary structure prediction algorithms are primarily based on thermodynamic models through free energy minimization, which imposes strong prior assumptions and is slow to run. Here we propose a deep learning-based data-driven method, called UFold, for RNA secondary structure prediction, trained directly on annotated data without any thermodynamic assumptions. UFold improves substantially upon previous models. UFold is also fast, with an inference time of about 160 ms per sequence for sequences up to 1600 bp in length. We provide a freely available online web server that implements UFold for RNA structure prediction.
- 3 March 2021 - Avanti Shrikumar, Stanford University, Stanford
- 3 February 2021 - Uwe Ohler, Max-Delbrück-Center for Molecular Medicine, Berlin
- 2 December 2020 - Ron Schwessinger, Radcliffe Department of Medicine, Oxford
- 4 November 2020 - David Kelley, Calico, San Francisco
- 7 October 2020 - Vikram Agarwal, Calico, San Francisco