Class-based

Compose

Compose(self, transforms)

Composes several transforms together.

Arguments

transforms (list of Transform objects): list of transforms to compose.
Example:
>>> transforms.Compose([
>>>     transforms.CenterCrop(10),
>>>     transforms.ToTensor(),
>>> ])
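To illustrate what composing transforms means, here is a minimal sketch. ComposeSketch and the two toy transforms are hypothetical names used only for illustration; this is not the library's implementation, just the general pattern of applying callables in order.

```python
class ComposeSketch:
    """Hypothetical stand-in for Compose: applies callables in order."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, x):
        # Feed the output of each transform into the next one.
        for t in self.transforms:
            x = t(x)
        return x


# Two toy transforms (plain callables, not library classes).
def to_upper(s):
    return s.upper()


def reverse(s):
    return s[::-1]


pipeline = ComposeSketch([to_upper, reverse])
result = pipeline("acgt")  # to_upper runs first, then reverse
```

The order of the list matters: the first transform sees the raw input, and each later transform sees the previous output.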

DummyAxis

DummyAxis(self, axis=None)

Wrapper around np.expand_dims: inserts a dummy axis at the position given by axis.
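The underlying numpy call can be illustrated directly. The array shape below is an arbitrary example and the snippet calls np.expand_dims itself, not the DummyAxis class.

```python
import numpy as np

x = np.zeros((10, 4))           # e.g. (sequence length, alphabet size)
y = np.expand_dims(x, axis=1)   # insert a length-1 axis at position 1
new_shape = y.shape             # the data is unchanged, only the shape differs
```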

SwapAxes

SwapAxes(self, axis1=None, axis2=None)

np.swapaxes wrapper

If either axis is None, do nothing.
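A minimal sketch of this behaviour, assuming the input is a numpy array. swap_axes_sketch is a hypothetical function written for illustration and may differ from the actual class.

```python
import numpy as np


def swap_axes_sketch(x, axis1=None, axis2=None):
    # Follow the documented rule: a None axis means the input is returned as-is.
    if axis1 is None or axis2 is None:
        return x
    return np.swapaxes(x, axis1, axis2)


x = np.zeros((10, 4))
swapped = swap_axes_sketch(x, 0, 1)       # axes 0 and 1 exchanged
unchanged = swap_axes_sketch(x, axis1=0)  # axis2 is None, so nothing happens
```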

ResizeInterval

ResizeInterval(self, width, anchor='center')

Resize the interval to the given width, relative to the anchor point (default: 'center').
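As a rough sketch of anchored resizing, the function below works on plain integer (start, end) pairs. Everything here is an assumption made for illustration: the function name, the half-open coordinates, the rounding for odd widths, and the 'start' and 'end' anchor values (only 'center' appears in the signature above). The actual class operates on interval objects and may treat strand and rounding differently.

```python
def resize_interval_sketch(start, end, width, anchor="center"):
    """Hypothetical helper: return a (start, end) pair resized to `width`."""
    if anchor == "start":
        return start, start + width
    if anchor == "end":
        return end - width, end
    if anchor == "center":
        center = (start + end) // 2       # integer midpoint (assumed rounding)
        new_start = center - width // 2
        return new_start, new_start + width
    raise ValueError(f"unknown anchor: {anchor!r}")


resized = resize_interval_sketch(100, 200, 20)  # shrink around the midpoint 150
```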

OneHot

OneHot(self, alphabet=('A', 'C', 'G', 'T'), neutral_alphabet='N', neutral_value=0.25, dtype=None)

One-hot encode the sequence

Arguments

alphabet: alphabet to use for the one-hot encoding. This defines the order of the one-hot encoding.
Can either be a list or a string: 'ACGT' or ['A', 'C', 'G', 'T']
neutral_alphabet: which character to treat as the neutral element (e.g. 'N')
neutral_value: value of the neutral element
dtype: defines the numpy dtype of the returned array.
alphabet_axis: axis along which the alphabet runs (e.g. A,C,G,T for DNA)
dummy_axis: defines in which dimension a dummy axis should be added. None if no dummy axis is required.
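The arguments above can be illustrated with a small numpy sketch. one_hot_sketch is a hypothetical function, not the library's implementation. It assumes an output of shape (sequence length, alphabet size), case-sensitive matching, and that any letter outside the alphabet other than the neutral one raises a KeyError.

```python
import numpy as np


def one_hot_sketch(seq, alphabet="ACGT", neutral_alphabet="N",
                   neutral_value=0.25, dtype=None):
    """Hypothetical one-hot encoder following the documented arguments."""
    # The order of `alphabet` fixes the column order of the encoding.
    index = {letter: i for i, letter in enumerate(alphabet)}
    out = np.zeros((len(seq), len(alphabet)), dtype=dtype)
    for pos, letter in enumerate(seq):
        if letter == neutral_alphabet:
            out[pos, :] = neutral_value   # spread the neutral value over all columns
        else:
            out[pos, index[letter]] = 1
    return out


encoded = one_hot_sketch("ACN")
```

With the default neutral_value of 0.25 and a four-letter alphabet, a neutral position sums to 1, the same as any other position.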

ReorderedOneHot

ReorderedOneHot(self, alphabet=('A', 'C', 'G', 'T'), neutral_alphabet='N', neutral_value=0.25, dtype=None, alphabet_axis=1, dummy_axis=None)

Flexible one-hot encoding class that can account for many different one-hot encoding formats.

Arguments

alphabet: alphabet to use for the one-hot encoding. This defines the order of the one-hot encoding.
Can either be a list or a string: 'ACGT' or ['A', 'C', 'G', 'T']
neutral_alphabet: (single-character string) the neutral element
neutral_value: value of the neutral element
dtype: defines the numpy dtype of the returned array.
alphabet_axis: axis along which the alphabet runs (e.g. A,C,G,T for DNA)
dummy_axis: defines in which dimension a dummy axis should be added. None if no dummy axis is required.
Examples (None = sequence axis):
- (None, 4): default
- (4, None): alphabet_axis=0
- (4, 1, None): alphabet_axis=0, dummy_axis=1
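The three layouts listed above can be reproduced with plain numpy calls. reorder_sketch is a hypothetical helper for illustration only. It assumes the input starts in the default (sequence, alphabet) layout, handles only alphabet_axis values of 0 or 1, and applies the dummy axis last; the real class may order these steps differently.

```python
import numpy as np


def reorder_sketch(onehot, alphabet_axis=1, dummy_axis=None):
    """Hypothetical helper: rearrange a (seq_len, alphabet) array."""
    out = onehot
    if alphabet_axis == 0:
        out = np.swapaxes(out, 0, 1)           # (alphabet, seq_len)
    if dummy_axis is not None:
        out = np.expand_dims(out, dummy_axis)  # add a length-1 axis
    return out


base = np.zeros((10, 4))                        # sequence of length 10
default_shape = reorder_sketch(base).shape
alphabet_first_shape = reorder_sketch(base, alphabet_axis=0).shape
with_dummy_shape = reorder_sketch(base, alphabet_axis=0, dummy_axis=1).shape
```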

SplitSplicingSeq

SplitSplicingSeq(self, exon_cut_l=0, exon_cut_r=0, intron5prime_cut=6, intron3prime_cut=6, acceptor_intron_len=50, acceptor_exon_len=3, donor_exon_len=5, donor_intron_len=13)

Split the returned splice sequence (exon with flanking introns) into the required format. It splits into ['intron5prime', 'acceptor', 'exon', 'donor', 'intron3prime']. 'intron5prime' is the intron 5' of the exon, while 'intron3prime' is the intron 3' of the exon.

Arguments

exon_cut_l: when extracting the exon feature, how many base pairs to cut from the beginning of the exon
exon_cut_r: when extracting the exon feature, how many base pairs to cut from the end of the exon (this removes the part considered to belong to the acceptor or donor site)
intron5prime_cut: how many bp to cut from the end of the acceptor intron; these bases are considered part of the acceptor site
intron3prime_cut: how many bp to cut from the end of the donor intron; these bases are considered part of the donor site
acceptor_intron_len: length of the acceptor intron to consider for the acceptor site model
acceptor_exon_len: length of the acceptor exon to consider for the acceptor site model
donor_intron_len: length of the donor intron to consider for the donor site model
donor_exon_len: length of the donor exon to consider for the donor site model
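One way to read these arguments is as slice offsets around the two exon boundaries. The sketch below is an interpretation, not the library's code. split_splicing_seq_sketch and its inputs seq, intron5prime_len and intron3prime_len are hypothetical: it assumes the caller knows how long each flanking intron is and that both introns are long enough, so it does no padding.

```python
def split_splicing_seq_sketch(seq, intron5prime_len, intron3prime_len,
                              exon_cut_l=0, exon_cut_r=0,
                              intron5prime_cut=6, intron3prime_cut=6,
                              acceptor_intron_len=50, acceptor_exon_len=3,
                              donor_exon_len=5, donor_intron_len=13):
    """Hypothetical splitter: seq = 5' intron + exon + 3' intron."""
    # Assumption: flanks are long enough, so every slice index stays non-negative.
    if intron5prime_len < max(acceptor_intron_len, intron5prime_cut):
        raise ValueError("5' intron is too short for this sketch")
    if intron3prime_len < max(donor_intron_len, intron3prime_cut):
        raise ValueError("3' intron is too short for this sketch")

    exon_start = intron5prime_len            # index of the first exon base
    exon_end = len(seq) - intron3prime_len   # one past the last exon base
    return {
        "intron5prime": seq[:exon_start - intron5prime_cut],
        "acceptor": seq[exon_start - acceptor_intron_len:
                        exon_start + acceptor_exon_len],
        "exon": seq[exon_start + exon_cut_l:exon_end - exon_cut_r],
        "donor": seq[exon_end - donor_exon_len:exon_end + donor_intron_len],
        "intron3prime": seq[exon_end + intron3prime_cut:],
    }


# Toy input: 60 bp of 5' intron, a 20 bp exon, 30 bp of 3' intron.
toy = "a" * 60 + "E" * 20 + "b" * 30
parts = split_splicing_seq_sketch(toy, intron5prime_len=60, intron3prime_len=30)
```

With the default arguments the acceptor window spans 50 intronic plus 3 exonic bases and the donor window spans 5 exonic plus 13 intronic bases, so the two site windows overlap the exon and the introns by design.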