Convolutional LSTM Networks for Subcellular Localization of Proteins

Convolutional LSTM Networks for SubcellularLocalization of Proteins

Søren Kaae Sønderby, Casper Kaae Sønderby, Henrik Nielsen*, and Ole Winther

*Center for Biological Sequence AnalysisDepartment of Systems Biology

Technical University of Denmark

Protein sorting in eukaryotes

Feed-forward Neural Networks

Problems for sequence analysis:• No builtin concept of

sequence• No natural way of

handling sequences of varying length

• No mechanism for handling long range correlations (beyond input window size)

LSTM networks

xt: input at time tht-1: previous outputi : input gate, f : forget gate, o: output gate, g: input modulation gate, c: memory cell.

An LSTM (Long Short Term Memory) cell

The blue arrow head refers to ct−1.

LSTM networks• are easier to train than

other types of recurrent neural networks

• can process very long time lags of unknown size between important events

• are used in speech recognition, handwriting recognition, and machine translation

“Unrolled” LSTM network

Each square represents a layer of LSTM cells at a particular time (1, 2, ... t).

The target y is presented at the final timestep.

Regular LSTM networks

Bidirectional: one target per position

Double unidirectional: one target per sequence

Attention LSTM networks

Bidirectional, but with one target per sequence.

Align weights determine where in the sequence the network directs its attention.

Convolutional Neural Networks

A convolutional layer in a neural network consists of small neuron collections which look at small portions of the input image, called receptive fields.

Often used in image processing, where they can handle translation invariance.

First layer convolutional filters learned in an image processing network, note that many filters are edge detectors or color detectors

Our basic model

……t t+1 T

Target prediction at t=T

Softmax

xt xt+1 xT

Note that conv. weights are shared across sequence steps for the convolutional filters

1D convolution(variable width)

Y K P WAxtxt-1xt-2 xt+1 xt+2

Conv. weights

LSTM……t T

xt xt+1 xT

Encoder

……ht ht+1 hT

Vectors containing the activations in each LSTM unit at each time step

Attention

Att. Weighting over sequence positions𝛼t 𝛼t+1 𝛼T

Decoder

Weighted hidden average

Softmax

Target prediction

Our model, with attention

Our model, specifications

– Input encoding: Sparse, BLOSUM80, HSDM and profile (R1×80)

– Conv. filter sizes: 1, 3, 5, 9, 15, 21 (10 of each)– LSTM layer: 1×200 units– Fully connected FFN layer: 1×200 units– Attention model: Wa (R200×400), va (R1×200)

MultiLoc architectureMultiLoc is an SVM-based based predictor using only sequence as input

MultiLoc2 architecture

MultiLoc2 corresponds to MultiLoc + PhyloLoc + GOLoc.

Thus, its input is not only sequence, but also metadata derived from homology searches.

SherLoc2 architecture

SherLoc2 corresponds to MultiLoc2 + EpiLoc

EpiLoc = a prediction system based on features derived from PubMed abstracts found through homology searches

Results: performance

Learned Convolutional Filters

Learned Attention Weights𝛼1 . . . . . . . . . 𝛼t . . . . . . . . . . 𝛼T

t-SNE plot of LSTM representation

Contributions1. We show that LSTM networks combined with convolutions

are efficient for predicting subcellular localization of proteins from sequence.

2. We show that convolutional filters can be used for amino acid sequence analysis and introduce a visualization technique.

3. We investigate an attention mechanism that lets us visualize where the LSTM network focuses.

4. We show that the LSTM network effectively extracts a fixed length representation of variable length proteins.

AcknowledgmentsThanks to:• Søren & Casper Kaae Sønderby,

for doing the actual implementation and training

• Ole Wintherfor supervising Søren & Casper

• Søren Brunakfor introducing me to the world of neural networks

• The organizersfor accepting our paper

• Youfor listening!

Convolutional LSTM Networks for Subcellular Localization of Proteins

Documents

Learning Statistical Scripts with LSTM Recurrent Neural .......

Attention in Convolutional LSTM for Gesture...

lstm neural network

RNN & LSTM

1 LSTM Fully Convolutional Networks for Time Series ... ·....

[PR12] PR-050: Convolutional LSTM Network: A Machine...

Deep Multi-Kernel Convolutional LSTM Networks and an ... ·...

Recurrent Neural Networks - GitHub PagesRNN: LSTM: Source:.....

Convolutional Neural Networks - e-learning•Convolutional.....

lmmunochemical Identification and Subcellular Distribution.....

Convolutional Neural Network and Bidirectional LSTM Based...

Future Semantic Segmentation with Convolutional LSTM {Sutton...

DEEP CONVOLUTIONAL AND LSTM NEURAL NETWORKS IN AUTOMATIC...

Neural Networks Convolutional and...

Convolutional and LSTM Neural Networks -...

Deep convolutional and LSTM recurrent neural networks for...