Page 1
Using Network Processors in Genomics
Herbert Bos* †
Kaiming Huang*
{herbertb,khuang}@liacs.nl
*Leiden Universiteit, Netherlands
† Vrije Universiteit, Netherlands
http://www.liacs.nl/~herbertb/projects/biocomp/
H. Bos – Leiden University 13/02/2004 1
Page 2
Case study: BLAST
● search nucleotide/protein database for query
● BLAST discovers similarity rather than exact match
● two main phases:
1. scoring (registering where query and DNADB match)
2. alignment (dynamic programming)
● only the first phase on NPUs
Page 3
Window matching
Page 7
Window matching
● naïve approach: roughly W*N*M comparisons
● does not scale
● string search algorithms: Aho-Corasick
– all windows matched at the same time
– shifting genome one nucleotide at a time
– matching algorithm transformed into a DFA
● DFA may be quite large
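
The naïve approach can be sketched as below. This is an illustration only: the slide does not define W, N and M, so the sketch assumes W is the window size, N the number of windows and M the genome length. The function name `naive_match` is hypothetical.

```python
def naive_match(genome, wins):
    """Try every window at every genome position.

    Returns (hits, comparisons): hits is a list of (position, window),
    comparisons counts single-character comparisons, which is bounded
    by W * N * M under the assumed meaning of the symbols.
    """
    hits, comparisons = [], 0
    for pos in range(len(genome)):
        for win in wins:
            if pos + len(win) > len(genome):
                continue  # window would run past the end of the genome
            matched = True
            for k, ch in enumerate(win):
                comparisons += 1
                if genome[pos + k] != ch:
                    matched = False
                    break
            if matched:
                hits.append((pos, win))
    return hits, comparisons


hits, comparisons = naive_match("tacgcga", ["acg", "cgc", "gcc", "ccg", "cga"])
print(hits, comparisons)
```

Every genome position is revisited once per window, which is the cost that matching all windows at the same time avoids.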
Page 8
Aho-Corasick
● Alphabet: acgt
● Window size: 3
● Query: acgccga
● Windows: {acg,cgc,gcc,ccg,cga}
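
A minimal sketch of how the window set follows from the query: slide a window of the given size over the query one nucleotide at a time. The helper name `windows` is hypothetical, and dropping repeated windows is an assumption (the example query has none).

```python
def windows(query, w):
    """Return the distinct length-w windows of query, in order of first occurrence."""
    seen = []
    for i in range(len(query) - w + 1):
        win = query[i:i + w]
        if win not in seen:
            seen.append(win)
    return seen


print(windows("acgccga", 3))  # ['acg', 'cgc', 'gcc', 'ccg', 'cga']
```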
Page 11
Aho-Corasick
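● Alphabet: acgt
● Window size: 3
● Query: acgccga
● Windows: {acg,cgc,gcc,ccg,cga}

[Figure: Aho-Corasick trie, states 0 to 12]
goto transitions:
  0 -a-> 1 -c-> 2 -g-> 3
  0 -c-> 4 -g-> 5 -c-> 6
  0 -g-> 7 -c-> 8 -c-> 9
  4 -c-> 10 -g-> 11
  5 -a-> 12
  0 -t-> 0

failure function:
  s     1   2   3   4   5   6   7   8   9   10  11  12
  f(s)  0   4   5   0   7   8   0   4   10  4   5   1

output states:
  s       3    6    9    11   12
  window  acg  cgc  gcc  ccg  cga

● Example input: tacgcga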
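
The example can be reproduced with a textbook Aho-Corasick construction. The sketch below is not the IXPBlast code; function names are hypothetical. It numbers states in insertion order, which for this window list happens to give the same state numbers and failure function as the slide.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure and output tables for the given patterns."""
    goto = [{}]    # goto[s][ch] -> next state
    out = [set()]  # out[s] -> patterns that end in state s
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)

    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        r = queue.popleft()
        for ch, s in goto[r].items():
            queue.append(s)
            f = fail[r]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[s] = goto[f].get(ch, 0)
            out[s] |= out[fail[s]]  # inherit matches ending at the suffix state
    return goto, fail, out


def search(text, goto, fail, out):
    """Scan text one character at a time; return (start position, pattern) hits."""
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)  # characters such as 't' leave us in state 0
        for p in sorted(out[s]):
            hits.append((i - len(p) + 1, p))
    return hits


goto, fail, out = build_automaton(["acg", "cgc", "gcc", "ccg", "cga"])
print(fail[1:])                              # failure function f(1)..f(12)
print(search("tacgcga", goto, fail, out))    # [(1, 'acg'), (2, 'cgc'), (4, 'cga')]
```

On the input tacgcga the automaton visits states 0, 1, 2, 3, 6, 5, 12 and reports a match in states 3, 6 and 12.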
Page 12
IXPBlast Architecture
[Figure: IXPBlast architecture. Labels: Control Processor, Pentium, PCI bus, NPU (IXP1200), StrongARM, Microengines (6 × ME), scratch, SRAM, DRAM, Gbps ports]
Page 15
IXPBlast Architecture
[Figure: IXPBlast architecture, shown together with the Aho-Corasick trie (states 0 to 12) for the windows {acg,cgc,gcc,ccg,cga}. Labels: Control Processor, Pentium, PCI bus, NPU (IXP1200), StrongARM, Microengines (6 × ME), scratch, SRAM, DRAM, Gbps ports]
Page 18
IXPBlast: packet handling
● packets read and processed in batches of 100,000
● “spilling” must be taken into account
● currently no feedback
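
One reading of “spilling”, assumed here rather than stated on the slide, is that a window can straddle two consecutive packets, so a scanner that treats each packet in isolation misses it. The sketch below carries the last w-1 nucleotides over to the next packet; an automaton-based scanner could instead keep its current state across packets. The function name `scan_stream` and the tiny packet sizes are hypothetical.

```python
def scan_stream(packets, wins, w):
    """Scan a genome delivered as a sequence of packet payloads.

    The last w-1 characters of each packet are carried into the next one,
    so windows that cross a packet boundary are still found exactly once.
    """
    wanted = set(wins)
    hits = []
    carry = ""   # tail of the data seen so far (at most w-1 characters)
    offset = 0   # genome position of the first character of carry
    for payload in packets:
        data = carry + payload
        for i in range(len(data) - w + 1):
            if data[i:i + w] in wanted:
                hits.append((offset + i, data[i:i + w]))
        keep = min(w - 1, len(data))
        carry = data[len(data) - keep:]
        offset += len(data) - keep
    return hits


wins = ["acg", "cgc", "gcc", "ccg", "cga"]
print(scan_stream(["tac", "gcg", "a"], wins, 3))  # same hits as scanning "tacgcga" whole
```

With these three toy packets every match crosses a boundary, so scanning each packet on its own would report nothing.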
Page 19
Results
● 232 MHz IXP1200 ~ 1.8 GHz Pentium-4
● 1611 nucleotide query (MyD88)
● 1.4 GB genome (Zebrafish)
– IXP1200: 90 sec with DFA
– IXP1200: 129 sec with “trie”
– P4: 132 sec with “trie”
● number of matches: 524856
Page 20
Results
Query size   DNADB size   Impl.         Performance
1611         1.4 GB       P4            132 sec
1611         1.4 GB       IXP1200       129 sec
1611         1.4 GB       IXP1200 DFA   90 sec
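
As a rough reading of these timings, the sketch below converts them into scan throughput and speedup relative to the P4 run. It assumes 1 GB = 1000 MB and that the whole 1.4 GB database is streamed once per run; neither is stated on the slides, and the helper names are hypothetical.

```python
GENOME_MB = 1400  # 1.4 GB, assuming 1 GB = 1000 MB

# Run times in seconds, as reported in the table above.
runs = {"P4": 132, "IXP1200": 129, "IXP1200 DFA": 90}


def throughput_mb_per_s(seconds, size_mb=GENOME_MB):
    """Average scan rate in MB/s."""
    return size_mb / seconds


def speedup(baseline_s, seconds):
    """How many times faster a run is than the baseline."""
    return baseline_s / seconds


for name, secs in runs.items():
    print(f"{name}: {throughput_mb_per_s(secs):.1f} MB/s, "
          f"{speedup(runs['P4'], secs):.2f}x vs P4")
```

Under these assumptions the DFA variant scans at about 15.6 MB/s, roughly 1.47 times the P4 rate.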
Page 21
Conclusions
● NPUs are useful in other application domains
● Newer hardware is expected to perform much better
● “Throughput processors”
● Adapting our current approach to use BLAST tricks/heuristics
Page 22
Network processors
● geared for high throughput
● used exclusively in network systems
● example: intrusion detection
● similar to looking for genes in genomes
● differences
[Figure: Radisys IXP1200 board]
Page 23
Application domain: “Genomics”
● example: search genome for occurrence of “patterns”
● similar problems as IDS, poor performance on GPP: cannot exploit parallelism
– throughput-driven
– how about FPGAs?
– how about clusters?
● NPU
– easier to program than FPGAs
– cheaper than cluster computing
– “on the desktop”: IP never leaves the room