SeqVec/embedding2structure

Authors: Michael Heinzinger

License: MIT

Contributed by: Michael Heinzinger

Type: None

Postprocessing: None

Trained on: NetSurfP-2.0 data set

Predicts 3-state secondary structure, 8-state secondary structure, and disorder from SeqVec embeddings

Schema

Single numpy array

Name: None

Shape: (1,)

Doc: embeddings derived from SeqVec

List of numpy arrays

Name: d3_Yhat

Shape: (None, 3)

Doc: 3-state secondary structure prediction

Name: d8_Yhat

Shape: (None, 8)

Doc: 8-state secondary structure prediction

Name: diso

Shape: (None, 2)

Doc: disorder prediction
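
The output arrays above have one row per position and one column per class, so a common way to read them is to take the highest-scoring column in each row. The following is a minimal sketch of that step, not the model's own postprocessing (the entry lists Postprocessing as None). The helper name `predicted_classes` and the example values are hypothetical. The source does not say which column corresponds to which class label, nor whether the values are probabilities or raw scores, so the sketch returns column indices only.

```python
import numpy as np


def predicted_classes(scores):
    """Return the index of the highest-scoring column for each row.

    `scores` is expected to have shape (L, C), for example (L, 3) for
    d3_Yhat, (L, 8) for d8_Yhat, or (L, 2) for diso.
    The meaning of each column index is NOT defined here (assumption-free):
    only the index is returned.
    """
    scores = np.asarray(scores)
    if scores.ndim != 2:
        raise ValueError("expected a 2-D array of shape (length, classes)")
    return scores.argmax(axis=1)


# Hypothetical d3_Yhat output for a protein of length 3 (made-up values).
d3_Yhat = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
])
d3_classes = predicted_classes(d3_Yhat)
```

Because argmax is unaffected by monotonic rescaling within a row, this works whether the arrays hold probabilities or unnormalized scores. Mapping the indices to structure labels requires the class order used during training, which this entry does not state.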

Dataloader

Defined as: ../embedding

Doc: Data loader returning protein sequences in the form required by ELMo

Authors: Michael Heinzinger

Type: Dataset

License: MIT

fasta_file: FASTA file containing one or more protein sequences

split_char (optional): character used to split the FASTA header (used together with id_field to extract the protein identifier)

id_field (optional): index of the field that holds the protein identifier after the FASTA header is split on split_char
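
To illustrate how split_char and id_field interact, here is a minimal sketch of extracting identifiers from FASTA headers. It is not the data loader's actual code: the function names `extract_protein_id` and `read_fasta`, the default values `"|"` and `1`, and the example headers are all assumptions chosen for illustration, since the source does not state the defaults.

```python
def extract_protein_id(header, split_char="|", id_field=1):
    """Split a FASTA header on `split_char` and return field `id_field`.

    The defaults here are hypothetical; the real data loader may differ.
    """
    fields = header.lstrip(">").strip().split(split_char)
    return fields[id_field]


def read_fasta(lines, split_char="|", id_field=1):
    """Collect sequences from an iterable of FASTA lines, keyed by identifier."""
    sequences = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            current_id = extract_protein_id(line, split_char, id_field)
            sequences[current_id] = ""
        elif current_id is not None:
            # Sequences may span several lines; concatenate them.
            sequences[current_id] += line
    return sequences


# Made-up example records in a UniProt-like header style.
example = [
    ">sp|P12345|EXAMPLE_ONE",
    "MALLHS",
    "AGRV",
    ">sp|Q99999|EXAMPLE_TWO",
    "MKV",
]
records = read_fasta(example)
```

With these assumed settings, the header `>sp|P12345|EXAMPLE_ONE` is split into three fields and the second one, `P12345`, becomes the identifier. For headers that use a different layout, such as space-separated fields with the identifier first, one would pass `split_char=" "` and `id_field=0` instead.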

Model dependencies

conda:

pip:

Dataloader dependencies

conda:

pip: