Compose

Compose(self, transforms)

Composes several transforms together.

Arguments

  • transforms (list of Transform objects): list of transforms to compose.

  • Example:

        >>> transforms.Compose([
        ...     transforms.CenterCrop(10),
        ...     transforms.ToTensor(),
        ... ])
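
To make the idea of composition concrete, here is a minimal sketch, assuming each transform is a callable that takes one input and returns one output. `SimpleCompose` and the two toy transforms are hypothetical stand-ins for illustration, not the library's implementation.

```python
class SimpleCompose:
    """Hypothetical stand-in: apply a list of callables from left to right."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, x):
        # Feed the output of each transform into the next one.
        for transform in self.transforms:
            x = transform(x)
        return x


# Two toy transforms: upper-case a sequence, then keep the first 4 characters.
pipeline = SimpleCompose([str.upper, lambda s: s[:4]])
result = pipeline("acgtacgt")
print(result)  # ACGT
```

The order of the list matters: the first transform sees the raw input and the last one produces the final output.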

DummyAxis

DummyAxis(self, axis=None)

Wrapper around np.expand_dims: inserts a dummy (length-1) axis at the position given by axis.
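
The underlying NumPy call can be illustrated directly. This sketch uses only `np.expand_dims`; the array shape is an arbitrary example.

```python
import numpy as np

x = np.zeros((10, 4))          # e.g. (sequence length, alphabet size)
y = np.expand_dims(x, axis=1)  # insert a length-1 axis at position 1
print(y.shape)  # (10, 1, 4)
```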

SwapAxes

SwapAxes(self, axis1=None, axis2=None)

np.swapaxes wrapper

If either axis is None, do nothing.
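
The behaviour described above can be sketched as follows. `swap_axes` is a hypothetical helper written for illustration; only `np.swapaxes` is a real NumPy function.

```python
import numpy as np


def swap_axes(x, axis1=None, axis2=None):
    """Hypothetical helper: swap two axes, or return x unchanged if either is None."""
    if axis1 is None or axis2 is None:
        return x
    return np.swapaxes(x, axis1, axis2)


x = np.zeros((10, 4))
swapped = swap_axes(x, 0, 1)       # shape becomes (4, 10)
unchanged = swap_axes(x, axis1=0)  # axis2 is None, so x is returned as-is
```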

ResizeInterval

ResizeInterval(self, width, anchor='center')

Resize the interval to the given width, relative to the anchor point (anchor='center' by default).
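
One way to picture anchored resizing is the sketch below. It is an illustration under stated assumptions, not the library's code: `resize` is a hypothetical function, intervals are plain `(start, end)` integer pairs, the anchor values 'start' and 'end' are assumed alongside the documented default 'center', and the rounding of the center is an arbitrary choice.

```python
def resize(start, end, width, anchor="center"):
    """Hypothetical helper: return a (start, end) pair of the requested width."""
    if anchor == "start":
        return start, start + width
    if anchor == "end":
        return end - width, end
    if anchor == "center":
        center = (start + end) // 2      # assumed rounding: floor division
        new_start = center - width // 2
        return new_start, new_start + width
    raise ValueError(f"unknown anchor: {anchor!r}")


centered = resize(100, 200, 50)             # (125, 175)
from_start = resize(100, 200, 50, "start")  # (100, 150)
from_end = resize(100, 200, 50, "end")      # (150, 200)
```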

OneHot

OneHot(self, alphabet=('A', 'C', 'G', 'T'), neutral_alphabet='N', neutral_value=0.25, dtype=None)

One-hot encode the sequence

Arguments

  • alphabet: alphabet to use for the one-hot encoding. This defines the order of the one-hot encoding. Can be either a list or a string: 'ACGT' or ['A', 'C', 'G', 'T'].
  • neutral_alphabet: which element to treat as the neutral element ('N' by default)
  • neutral_value: value of the neutral element
  • dtype: defines the numpy dtype of the returned array.
  • alphabet_axis: axis along which the alphabet runs (e.g. A,C,G,T for DNA)
  • dummy_axis: defines in which dimension a dummy axis should be added. None if no dummy axis is required.
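
The roles of alphabet, neutral_alphabet, neutral_value and dtype can be sketched with plain NumPy. `one_hot` below is a hypothetical function written for illustration; it assumes an output of shape (sequence length, alphabet size), a float64 default dtype, and does no case normalisation or handling of characters outside the alphabet.

```python
import numpy as np


def one_hot(seq, alphabet="ACGT", neutral_alphabet="N",
            neutral_value=0.25, dtype=None):
    """Hypothetical sketch: encode seq as a (len(seq), len(alphabet)) array."""
    alphabet = list(alphabet)  # accepts 'ACGT' or ['A', 'C', 'G', 'T']
    column = {letter: i for i, letter in enumerate(alphabet)}
    out = np.zeros((len(seq), len(alphabet)),
                   dtype=np.float64 if dtype is None else dtype)
    for pos, letter in enumerate(seq):
        if letter == neutral_alphabet:
            out[pos, :] = neutral_value  # neutral element: same value in every column
        else:
            out[pos, column[letter]] = 1
    return out


encoded = one_hot("ACNT")
print(encoded.shape)  # (4, 4)
```

Here the row for 'A' is `[1, 0, 0, 0]` and the row for 'N' is `[0.25, 0.25, 0.25, 0.25]`. Changing the order of the alphabet changes which column each letter maps to.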

ReorderedOneHot

ReorderedOneHot(self, alphabet=('A', 'C', 'G', 'T'), neutral_alphabet='N', neutral_value=0.25, dtype=None, alphabet_axis=1, dummy_axis=None)

Flexible one-hot encoding class that can account for many different one-hot encoding formats.

Arguments

  • alphabet: alphabet to use for the one-hot encoding. This defines the order of the one-hot encoding. Can be either a list or a string: 'ACGT' or ['A', 'C', 'G', 'T'].
  • neutral_alphabet: (single-character string) the neutral element, 'N' by default
  • neutral_value: value of the neutral element
  • dtype: defines the numpy dtype of the returned array.
  • alphabet_axis: axis along which the alphabet runs (e.g. A,C,G,T for DNA)
  • dummy_axis: defines in which dimension a dummy axis should be added. None if no dummy axis is required.

  • Examples (None = sequence axis):
      - (None, 4): default
      - (4, None): alphabet_axis=0
      - (4, 1, None): alphabet_axis=0, dummy_axis=1
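
The three layouts listed above can be reproduced with plain NumPy operations. This is only a shape illustration starting from an array of zeros with sequence length 10; it does not call the class itself.

```python
import numpy as np

seq_len = 10
default = np.zeros((seq_len, 4))                # (None, 4): default
alphabet_first = np.swapaxes(default, 0, 1)     # (4, None): alphabet_axis=0
with_dummy = np.expand_dims(alphabet_first, 1)  # (4, 1, None): alphabet_axis=0, dummy_axis=1

print(default.shape, alphabet_first.shape, with_dummy.shape)
# (10, 4) (4, 10) (4, 1, 10)
```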

SplitSplicingSeq

SplitSplicingSeq(self, exon_cut_l=0, exon_cut_r=0, intron5prime_cut=6, intron3prime_cut=6, acceptor_intron_len=50, acceptor_exon_len=3, donor_exon_len=5, donor_intron_len=13)

Split the returned splice sequence (an exon with flanking introns) into the required format. The sequence is split into ['intron5prime', 'acceptor', 'exon', 'donor', 'intron3prime'], where 'intron5prime' is the intron 5' of the exon and 'intron3prime' is the intron 3' of the exon.

Arguments

  • exon_cut_l: when extracting the exon feature, how many base pairs to cut from the beginning of the exon
  • exon_cut_r: when extracting the exon feature, how many base pairs to cut from the end of the exon (i.e. the part that is considered to belong to the acceptor or donor site)
  • intron5prime_cut: how many bp to cut from the end of the acceptor intron, i.e. the part considered to be the acceptor site
  • intron3prime_cut: how many bp to cut from the end of the donor intron, i.e. the part considered to be the donor site
  • acceptor_intron_len: length of the acceptor intron to include for the acceptor site model
  • acceptor_exon_len: length of the acceptor exon to include for the acceptor site model
  • donor_intron_len: length of the donor intron to include for the donor site model
  • donor_exon_len: length of the donor exon to include for the donor site model
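
The following sketch shows one plausible reading of how these parameters carve up a sequence. It is an assumption-laden illustration, not the library's implementation: `split_splicing_seq`, `intron5prime_len` and `intron3prime_len` are hypothetical names, the flanking intron lengths are assumed to be known, and sequences shorter than the requested windows are not handled.

```python
def split_splicing_seq(seq, intron5prime_len, intron3prime_len,
                       exon_cut_l=0, exon_cut_r=0,
                       intron5prime_cut=6, intron3prime_cut=6,
                       acceptor_intron_len=50, acceptor_exon_len=3,
                       donor_exon_len=5, donor_intron_len=13):
    """Hypothetical sketch: split intron + exon + intron into five pieces."""
    exon_start = intron5prime_len          # first exon position
    exon_end = len(seq) - intron3prime_len  # one past the last exon position
    return {
        # 5' intron without the bases next to the exon (assumed acceptor site)
        "intron5prime": seq[:exon_start - intron5prime_cut],
        # window spanning the intron/exon boundary
        "acceptor": seq[exon_start - acceptor_intron_len:exon_start + acceptor_exon_len],
        # exon, optionally trimmed on both sides
        "exon": seq[exon_start + exon_cut_l:exon_end - exon_cut_r],
        # window spanning the exon/intron boundary
        "donor": seq[exon_end - donor_exon_len:exon_end + donor_intron_len],
        # 3' intron without the bases next to the exon (assumed donor site)
        "intron3prime": seq[exon_end + intron3prime_cut:],
    }


# Toy input: 60 bp intron ('a'), 20 bp exon ('E'), 30 bp intron ('b').
toy = "a" * 60 + "E" * 20 + "b" * 30
parts = split_splicing_seq(toy, intron5prime_len=60, intron3prime_len=30)
lengths = {name: len(piece) for name, piece in parts.items()}
print(lengths)
# {'intron5prime': 54, 'acceptor': 53, 'exon': 20, 'donor': 18, 'intron3prime': 24}
```

With the default arguments the acceptor window is 50 intronic plus 3 exonic bases, and the donor window is 5 exonic plus 13 intronic bases, matching the parameter descriptions above.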