Page 1: U Florida / Gainesville  talk, apr 13 2011

Divide and conquer applied to metagenomic DNA

C. Titus [email protected]

CSE / MMG, Michigan State University

Page 2: U Florida / Gainesville  talk, apr 13 2011

A brief intro to shotgun assembly

Overlapping fragments:

It was the best of times, it was the wor
, it was the worst of times, it was the
isdom, it was the age of foolishness
mes, it was the age of wisdom, it was th

Assembled:

It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness

…but for 2 bn+ fragments. Not subdivisible; not easy to distribute; memory intensive.

Page 3: U Florida / Gainesville  talk, apr 13 2011

Assemble based on word overlaps:

the quick brown fox jumped
jumped over the lazy dog
-> the quick brown fox jumped over the lazy dog

Repeats do cause problems:

na na na, batman!
my chemical romance: na na na
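To make the word-overlap idea concrete, here is a toy sketch in Python (my own illustration, not anything from the talk or from khmer): greedily merge two fragments at their longest shared suffix/prefix of words.

# Toy overlap-based "assembly" on words (illustration only).
def merge(a, b):
    """Merge word lists a and b at the longest suffix(a)/prefix(b) overlap."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a[-n:] == b[:n]:
            return a + b[n:]
    return a + b  # no overlap found

frag1 = "the quick brown fox jumped".split()
frag2 = "jumped over the lazy dog".split()
print(" ".join(merge(frag1, frag2)))
# -> the quick brown fox jumped over the lazy dog

The repeat example on the slide is exactly what breaks this greedy scheme: "na na na" overlaps itself in many ways, so there is no unique reconstruction.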

Page 4: U Florida / Gainesville  talk, apr 13 2011

Whole genome shotgun sequencing & assembly

Randomly fragment & sequence DNA; reassemble computationally.

UMD assembly primer (cbcb.umd.edu)

Page 5: U Florida / Gainesville  talk, apr 13 2011

How does assembly scale?

• Our assembly approach scales with the amount of genomic novelty present in the sample.

• For “sane” problems (microbes, human genome, etc.) this isn’t too bad, although challenging.

• For metagenomes, with millions of different species at different abundances, this is an intractable problem (so far)…

Page 6: U Florida / Gainesville  talk, apr 13 2011

Great Plains Grand Challenge – sampling sites

• Wisconsin
  – Native prairie (Goose Pond, Audubon)
  – Long-term cultivation (corn)
  – Switchgrass rotation (previously corn)
  – Restored prairie (from 1998)

• Iowa
  – Native prairie (Morris prairie)
  – Long-term cultivation (corn)

• Kansas
  – Native prairie (Konza prairie)
  – Long-term cultivation (corn)

Iowa Native Prairie

Switchgrass (Wisconsin)

Iowa >100 yr tilled

Page 7: U Florida / Gainesville  talk, apr 13 2011

Sampling strategy per site

Reference soil

Soil cores: 1 inch diameter, 4 inches deep

Total: 8 reference metagenomes + 64 spatially separated cores (pyrotag sequencing)

[Sampling diagram: nested soil cores at 10 m, 1 m, and 1 cm spacings]

Page 8: U Florida / Gainesville  talk, apr 13 2011

[Diagram: soil metagenome characterized by 454 Titanium shotgun sequencing and Illumina shotgun sequencing; community composition by 454 Titanium pyrotag sequencing]

Page 9: U Florida / Gainesville  talk, apr 13 2011

What kinds of questions?

• What genes are present?
• What species are present?
• What are those species doing, physiologically speaking?
• How does "function" change with cultivation, CO2, fertilizer types, crop cycles, etc.?

We are at a “pre-question” stage, unfortunately…

Page 10: U Florida / Gainesville  talk, apr 13 2011

[Bar chart: Great Prairie Sequencing Summary – Illumina whole metagenome shotgun. Base pairs of sequencing (Gbp), 0–350, by platform (GAII vs HiSeq), for Iowa continuous corn, Iowa native prairie, Kansas cultivated corn, Kansas native prairie, Wisconsin continuous corn, Wisconsin native prairie, Wisconsin restored prairie, and Wisconsin switchgrass.]

Page 11: U Florida / Gainesville  talk, apr 13 2011

The basic problem.

• Lots of metagenomic sequence data (200 GB Illumina for < $20k?)

• Assembly, especially metagenome assembly, scales poorly (due to high diversity).

• Standard assembly techniques don't work well with sequences drawn from genomes at widely differing abundances.

• Many people don't have the computational resources (~1 TB of RAM or more) needed to assemble at all.

Page 12: U Florida / Gainesville  talk, apr 13 2011

We can’t just throw more hardware at the problem…

Lincoln Stein

Page 13: U Florida / Gainesville  talk, apr 13 2011

Hat tip to Narayan Desai / ANL

We don’t have enough resources or people to analyze data.

Page 14: U Florida / Gainesville  talk, apr 13 2011

Data generation vs data analysis

It now costs about $10,000 to generate a 200 GB sequencing data set (DNA) in about a week.

(Think: resequencing human; sequencing expressed genes; sequencing metagenomes, etc.)

…x1000 sequencers

Many useful analyses do not scale linearly in RAM or CPU with the amount of data.

Page 15: U Florida / Gainesville  talk, apr 13 2011

The challenge:

Massive (and increasing) data generation capacity, operating at a boutique level, with algorithms that are wholly incapable of scaling to the data volume.

Note: cloud computing isn’t a solution to a sustained scaling problem!! (See: Moore’s Law slide)

Page 16: U Florida / Gainesville  talk, apr 13 2011

Awesomeness

Easy stuff like Google Search

Life’s too short to tackle the easy problems – come to academia!

Page 17: U Florida / Gainesville  talk, apr 13 2011

Assembly of shotgun sequence

Overlapping fragments:

It was the best of times, it was the wor
, it was the worst of times, it was the
isdom, it was the age of foolishness
mes, it was the age of wisdom, it was th

Assembled:

It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness

…but for 2 bn+ fragments. Not subdivisible; not easy to distribute; memory intensive.

Page 18: U Florida / Gainesville  talk, apr 13 2011

Assemble based on word overlaps:

the quick brown fox jumped
jumped over the lazy dog
-> the quick brown fox jumped over the lazy dog

Repeats do cause problems:

na na na, batman!
my chemical romance: na na na

Page 19: U Florida / Gainesville  talk, apr 13 2011

Whole genome shotgun sequencing & assembly

Randomly fragment & sequence DNA; reassemble computationally.

UMD assembly primer (cbcb.umd.edu)

Page 20: U Florida / Gainesville  talk, apr 13 2011

K-mer graphs - overlaps

J.R. Miller et al. / Genomics (2010)

Page 21: U Florida / Gainesville  talk, apr 13 2011

K-mer graphs - branching

For decisions about which paths to follow, etc., biology-based heuristics come into play as well.
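As a concrete illustration of the k-mer graph idea (a toy Python sketch of my own, not khmer's implementation): nodes are k-mers, an edge joins two k-mers that overlap by k-1 bases, and any node with more than one possible extension is a branch point the assembler must resolve.

# Toy k-mer graph: nodes are k-mers; edges join k-mers overlapping by k-1 bases.
from collections import defaultdict

K = 5

def kmers(seq, k=K):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_graph(reads, k=K):
    graph = defaultdict(set)              # k-mer -> set of following k-mers
    for read in reads:
        kms = kmers(read, k)
        for a, b in zip(kms, kms[1:]):
            graph[a].add(b)
    return graph

reads = ["ATGGCGTGCA", "GCGTGCATTT", "GCGTGGATCC"]    # made-up example reads
g = build_graph(reads)
branches = {km: nxt for km, nxt in g.items() if len(nxt) > 1}
print(branches)    # e.g. {'GCGTG': {'CGTGC', 'CGTGG'}} -- a branch to resolve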

Page 22: U Florida / Gainesville  talk, apr 13 2011

[Bar chart: Great Prairie Sequencing Summary – Illumina whole metagenome shotgun. Base pairs of sequencing (Gbp), 0–350, by platform (GAII vs HiSeq), for Iowa continuous corn, Iowa native prairie, Kansas cultivated corn, Kansas native prairie, Wisconsin continuous corn, Wisconsin native prairie, Wisconsin restored prairie, and Wisconsin switchgrass.]

Page 23: U Florida / Gainesville  talk, apr 13 2011

Billions and billions of …

>850:2:1:1943:15232/1 0
CCTGCCTGTGGAGCAGCCCACGCAGTTCGAGCTGATCATCAACCTCAAGACGGCCCAAGCCCTTGGCATCACGATT
>850:2:1:1943:15232/2 0
ACACCATTTAATCTTAGCCATAAAAGTTGTATAAGCATCAACGTTTTGTTTGTCTCAAAAAACGATTTTTTTTTTG
>850:2:1:1943:19543/1 0
ACTGTAGGTTTCTGGCTGCGTCCGACGATAGCAGCCCGCTCTGCCGACATTGTCA
>850:2:1:1945:16822/2 0
AGTCGACAGATCGACCTGAAGGAGGTGCCGGGAATTGAAGTCATCCAGGGCGCCGAGGAGAACTGATCGG
>850:2:1:1946:10202/2 0
AGCTTTTTCGCGCGCGTGAAAAAGCTTTGTCGATTTCTGGGTTTCGGCCTTCTCACAGTCACCGCCGAGGGCCGGG
>850:2:1:1947:6533/2 0
GGTCTCCGGACACACGAAGGCACGGCTCTCCGAGAAGCGGAGGATGTACTCGACCTCACGGCTGC
>850:2:1:1948:15431/1 0
ACCGCTTACTCGATGATGGAGCAAGGCAGAATCGACATGATTCTGAGCTCGCGTCCCGAAGATCGACGCGCGG
>850:2:1:1949:19998/1 0
AATTCAAAGTAGGCATTTTTGTTTTTGTAGGGTTGGCGATGTTAGGCGCGCTGGTCGTGCAATTC
>850:2:1:1950:4213/2 0
CCAACCGGGCCCTGGTCCTGCACGCCAACCTGTCCCCGCTGGTGG
>850:2:1:1950:1388/1 0
CAGCCGCAATGTTGGCATTCTTCAGCAGTTCGAGCGCCACAAAGCGGTCATTGTCTGAGGCTTCTGGG

Page 24: U Florida / Gainesville  talk, apr 13 2011

Too much data – what can we do?

• Reduce the size of the data (either with an approximate or an exact approach)

• Divide & conquer: subdivide the problem.

• For exact data reduction or subdivision, need to grok the entire assembly graph structure.

• …but that is why assembly scales poorly in the first place.

Page 25: U Florida / Gainesville  talk, apr 13 2011
Page 26: U Florida / Gainesville  talk, apr 13 2011
Page 27: U Florida / Gainesville  talk, apr 13 2011
Page 28: U Florida / Gainesville  talk, apr 13 2011

Two exact data reduction techniques:

• Eliminate reads that do not connect to many other reads.

• Group reads by connectivity into different partitions of the entire graph.

For k-mer graph assemblers like Velvet and ABYSS, these are exact solutions.

Page 29: U Florida / Gainesville  talk, apr 13 2011

Eliminating unconnected reads

“Graphsize filtering”

Page 30: U Florida / Gainesville  talk, apr 13 2011

Subdividing reads by connection

“Partitioning”

Page 31: U Florida / Gainesville  talk, apr 13 2011

Two exact data reduction techniques:

• Eliminate reads that do not connect to many other reads (“graphsize filtering”).

• Group reads by connectivity into different partitions of the entire graph (“partitioning”).

For k-mer graph assemblers like Velvet and ABYSS, these are exact solutions.
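A rough sketch of both reductions in Python (a simplification of my own; the real implementation works on the Bloom-filter graph described on the next slides, not on in-memory dictionaries): reads that share any k-mer end up in the same partition, and partitions below a size cutoff are dropped, which is the graphsize filtering step.

# Sketch: partition reads by shared k-mers (union-find), then drop tiny
# partitions ("graphsize filtering"). Simplified illustration only.
from collections import defaultdict

def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]      # path halving
        x = parent[x]
    return x

def partition_reads(reads, k=21, min_size=10):
    parent = list(range(len(reads)))
    first_seen = {}                        # k-mer -> index of first read containing it
    for i, read in enumerate(reads):
        for j in range(len(read) - k + 1):
            km = read[j:j + k]
            if km in first_seen:
                parent[find(parent, i)] = find(parent, first_seen[km])
            else:
                first_seen[km] = i
    groups = defaultdict(list)
    for i, read in enumerate(reads):
        groups[find(parent, i)].append(read)
    # graphsize filtering: keep only partitions with at least min_size reads
    return [g for g in groups.values() if len(g) >= min_size]

Each surviving partition can then be assembled independently (and in parallel) with Velvet or ABYSS.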

Page 32: U Florida / Gainesville  talk, apr 13 2011

Engineering overview

• Built a k-mer graph representation based on Bloom filters, a simple probabilistic data structure;

• With this, we can store graphs efficiently in memory, ~1-2 bytes/(unique) k-mer for arbitrary k.

• Also implemented efficient global traversal of extremely large graphs (5-20 bn nodes).

For details see source code (github.com/ctb/khmer), or online webinar: http://oreillynet.com/pub/e/1784
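For illustration, a minimal Bloom-filter sketch for k-mer presence/absence (my own toy Python version under assumed names; khmer's real structure is C++ and is what achieves the ~1-2 bytes per unique k-mer quoted above): a bit array plus several hash functions, with possible false positives but no false negatives.

# Toy Bloom filter for k-mer presence/absence (illustration only).
import hashlib

class KmerBloomFilter:
    def __init__(self, n_bits=8 * 10**6, n_hashes=4):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8 + 1)

    def _positions(self, kmer):
        # derive n_hashes bit positions from the k-mer
        for i in range(self.n_hashes):
            digest = hashlib.sha1(f"{i}:{kmer}".encode()).hexdigest()
            yield int(digest, 16) % self.n_bits

    def add(self, kmer):
        for p in self._positions(kmer):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, kmer):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(kmer))

bf = KmerBloomFilter()
bf.add("GATTACAGATTACAGATTACAGATTACAGATT")        # a 32-mer
print("GATTACAGATTACAGATTACAGATTACAGATT" in bf)  # True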

Page 33: U Florida / Gainesville  talk, apr 13 2011

Store graph nodes in Bloom filter

Graph traversal is done in full k-mer space;

Presence/absence of individual nodes is kept in a Bloom filter data structure (hash tables without collision tracking).
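The traversal idea in one sketch (again a toy version, reusing the hypothetical KmerBloomFilter above): the graph is never stored explicitly; a node's neighbors are generated by shifting the current k-mer one base left or right and asking the Bloom filter whether that k-mer was ever seen.

# Implicit graph traversal in full k-mer space (toy sketch; ignores
# reverse complements for simplicity).
def neighbors(kmer, bloom):
    for base in "ACGT":
        right = kmer[1:] + base      # extend one base to the right
        left = base + kmer[:-1]      # extend one base to the left
        if right in bloom:
            yield right
        if left in bloom:
            yield left

def connected_kmers(start, bloom):
    """All k-mers reachable from `start` by repeated one-base extension."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nb in neighbors(node, bloom):
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen

Because a Bloom filter can report false positives, a traversal like this has to tolerate occasional spurious nodes; that is the price of the small per-k-mer memory footprint.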

Page 34: U Florida / Gainesville  talk, apr 13 2011

Practical application

• Enables:
  – graph trimming (exact removal)
  – partitioning (exact subdivision)
  – abundance filtering

• … all for K <= 64, for 200+ Gb sequence collections.

• All results (except for comparison) obtained using a single Amazon EC2 4xlarge node, 68 GB of RAM / 8 cores.

• Similar running times to using Velvet alone.

Page 35: U Florida / Gainesville  talk, apr 13 2011

We pre-filter data for assembly:

Page 36: U Florida / Gainesville  talk, apr 13 2011

Does removing small graphs work?

Small data set (35m reads / 3.4 Gb rhizosphere soil sample)

Filtered at k=32, assembled at k=33 with ABYSS

                         N contigs   Total bp   Largest contig
Unfiltered (35m reads)      130       223,341       61,766
Filtered (2m reads)         130       223,341       61,766

YES.

Page 37: U Florida / Gainesville  talk, apr 13 2011

Does partitioning into disconnected graphs work?

Partitioned same data set (35m reads / 3.5 Gb) into 45k partitions containing > 10 reads; assembled partitions separately (k0=32, k=33).

                         N contigs   Total bp   Largest contig
Unfiltered (35m reads)      130       223,341       61,766
Sum of partitions           130       223,341       61,766

YES.

Page 38: U Florida / Gainesville  talk, apr 13 2011

Data reduction for assembly / practical details

Reduction performed on a machine with 16 GB of RAM.

Removing poorly connected reads: 35m -> 2m reads.
  - Memory required reduced from 40 GB to 2 GB;
  - Time reduced from 4 hrs to 20 minutes.

Partitioning reads into disconnected groups:
  - Biggest group is 300k reads;
  - Memory required reduced from 40 GB to 500 MB;
  - Time reduced from 4 hrs to < 5 minutes/group.

Page 39: U Florida / Gainesville  talk, apr 13 2011

Does it work on bigger data sets?

35 m read data set partition sizes:

P1: 277,043 reads
P2: 5776 reads
P3: 4444 reads
P4: 3513 reads
P5: 2528 reads
P6: 2397 reads
…

Iowa continuous corn GA2 partitions (218.5 m reads):

P1: 204,582,365 reads
P2: 3583 reads
P3: 2917 reads
P4: 2463 reads
P5: 2435 reads
P6: 2316 reads
…

Page 40: U Florida / Gainesville  talk, apr 13 2011

Problem: big data sets have one big partition!?

• Too big to handle on EC2.

• Assembles with low coverage.

• Contains 2.5 bn unique k-mers (~500 microbial genomes), at ~3-5x coverage

• As we sequence more deeply, the "lump" becomes a bigger percentage of reads => trouble!
  – Both for our approach,
  – And possibly for assembly in general (because it assembles more poorly than it should, for the given coverage/size).

Page 41: U Florida / Gainesville  talk, apr 13 2011

Why this lump?

1. Real biological connectivity (rRNA, conserved genes, etc.)

2. Bug in our software

3. Sequencing artifact or error

Page 42: U Florida / Gainesville  talk, apr 13 2011

Why this lump?

1. Real biological connectivity? Probably not.
   - Increasing K from 32 to ~64 didn't break up the lump: not biological.

2. Bug in our software? Probably not.
   - We have a second, completely separate approach & implementation that confirmed the lump (bleu, by Rosangela Canino-Koning).

3. Sequencing artifact or error? YES.
   - (Note: we already filter & quality-trim all sequences.)

Page 43: U Florida / Gainesville  talk, apr 13 2011

“Good” vs “bad” assembly graph

Low density

High density

Page 44: U Florida / Gainesville  talk, apr 13 2011

Non-biological levels of local graph connectivity:

Page 45: U Florida / Gainesville  talk, apr 13 2011

Higher local graph density correlates with position in read

Page 46: U Florida / Gainesville  talk, apr 13 2011

Higher local graph density correlates with position in read

ARTIFACT

Page 47: U Florida / Gainesville  talk, apr 13 2011

Trimming reads

• Trim at high "soddd", sum of degree–degree distribution:
  – From each k-mer in each read, walk two k-mers in all directions in the graph;
  – If more than 3 k-mers can be found at exactly two steps, trim the remainder of the sequence.

Overly stringent; actually trimming the (k-1) connectivity graph by degree.
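A rough Python rendering of that rule (my paraphrase into code, reusing the toy neighbors() sketch above; the actual cutoffs and walk live in khmer): for each k-mer in a read, collect the k-mers reachable in exactly two graph steps, and truncate the read at the first position where more than 3 are found.

# Sketch of high-density ("soddd") read trimming (illustration only).
def trim_high_density(read, bloom, k=32, max_two_step=3):
    for i in range(len(read) - k + 1):
        km = read[i:i + k]
        one_step = set(neighbors(km, bloom))
        two_step = set()
        for nb in one_step:
            two_step.update(neighbors(nb, bloom))
        two_step.discard(km)                 # don't count the starting k-mer
        if len(two_step) > max_two_step:
            return read[:i + k - 1]          # trim remainder of the sequence
    return read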

Page 48: U Florida / Gainesville  talk, apr 13 2011

Trimmed read examples

>895:5:1:1986:16019/2
TGAGCACTACCTGCGGGCCGGGGACCGGGTCAGCCTGCTCGACCTGGGCCAACCGATGCGCC
>895:5:1:1995:6913/1
TTGCGCGCCATGAAGCGGTTAACGCGCTCGGTCCATAGCGCGATG
>895:5:1:1995:6913/2
GTTCATCGCGCTATGGACCGAGCGCGTTAACCGCTTCATGGCGCGCAAAGATCGGAAGAGCGTCGTGTAG

Page 49: U Florida / Gainesville  talk, apr 13 2011

Preferential attachment due to bias

• Any sufficiently large collection of connected reads will have one or more reads containing an artifact;

• These artifacts will then connect that group of reads to all other groups possessing artifacts;

• …and all high-coverage contigs will amalgamate into a single graph.

Page 50: U Florida / Gainesville  talk, apr 13 2011

Artifacts from sequencing falsely connect graphs

Page 51: U Florida / Gainesville  talk, apr 13 2011

Preferential attachment due to bias

• Any sufficiently large collection of connected reads will have one or more reads containing an artifact;

• These artifacts will then connect that group of reads to all other groups possessing artifacts;

• …and all high-coverage contigs will amalgamate into a single graph.

Page 52: U Florida / Gainesville  talk, apr 13 2011

Groxel view of knot-like region / Arend Hintze

Page 53: U Florida / Gainesville  talk, apr 13 2011

Density trimming breaks up the lump:

Old P1, soddd-trimmed (204.6 m reads -> 179 m):

P1: 23,444,332 reads
P2: 60,703 reads
P3: 48,818 reads
P4: 39,755 reads
P5: 34,902 reads
P6: 33,284 reads
…

Untrimmed partitioning (218.5 m reads):

P1: 204,582,365 reads
P2: 3583 reads
P3: 2917 reads
P4: 2463 reads
P5: 2435 reads
P6: 2316 reads
…

Page 54: U Florida / Gainesville  talk, apr 13 2011

What does density trimming do to assembly?

204 m reads in lump: assembles into 52,610 contigs; total 73.5 MB

180 m reads in trimmed lump: assembles into 57,135 contigs; total 83.6 MB

(all contigs > 1kb)

Filtered/partitioned @k=32, assembled @ k=33, expcov=auto, cov_cutoff=0

Page 55: U Florida / Gainesville  talk, apr 13 2011

Wait, what?

• Yes, trimming these “knot-like” sequences improves the overall assembly!

• We remove 25.6 m reads and gain 10.1 MB!?

• The trend is the same for ABySS, another k-mer graph assembler.

Page 56: U Florida / Gainesville  talk, apr 13 2011

So what’s going on?

• Current assemblers are bad at dealing with certain graph structures (“knots”).

• If we can untangle knots for them, that’s good, maybe?

• Or, by eliminating locations where reads from differently abundant contigs connect, repeat resolution improves?

• Happens with other k-mer graph assemblers (ABYSS), and with at least one other (non-metagenomic) data set.

Page 57: U Florida / Gainesville  talk, apr 13 2011

OK, let’s assemble!

Iowa corn (HiSeq + GA2): 219.11 Gb of sequence assembles to:

148,053 contigs, in 220 MB;
max length 20,322;
max coverage ~10x

Filtered/partitioned @k=32, assembled @ k=33, expcov=auto, cov_cutoff=0

Page 58: U Florida / Gainesville  talk, apr 13 2011

Full Iowa corn / mapping stats

• 1,806,800,000 QC/trimmed reads (1.8 bn)

• 204,900,000 reads map to some contig (11%)

• 37,244,000 reads map to contigs > 1kb (2.1%)

> 1 kb contig is a stringent criterion!

Compare: 80% of MetaHIT reads map to contigs > 500 bp;
65%+ of rumen reads map to contigs > 1 kb.

Page 59: U Florida / Gainesville  talk, apr 13 2011

Success, tentatively.

We are still evaluating assembly and assembly parameters; it should be possible to improve in every way.

(~10 hrs to redo entire assembly, once partitioned.)

The main engineering point is that we can actually run this entire pipeline on a relatively small machine (8 cores / 68 GB RAM).

We can do dozens of these in parallel on Amazon rental hardware.

And, from our preliminary results, we get roughly the same assembly results as we would by scaling up our hardware.

Page 60: U Florida / Gainesville  talk, apr 13 2011

Conclusions

• Engineering: can assemble large data sets.

• Scaling: can assemble on rented machines.

• Science: can optimize assembly for individual partitions.

• Science: retain low-abundance sequence.

Page 62: U Florida / Gainesville  talk, apr 13 2011

Caveats

Quality of assembly??

• Illumina sequencing bias/error issue needs to be explored.

• Scaffolding with Velvet causes systematic problems

• Regardless of Illumina-specific issue, it’s good to have tools/approaches to look at structure of large graphs.

Page 63: U Florida / Gainesville  talk, apr 13 2011

Future thoughts

• Our pre-filtering technique always has lower memory requirements than Velvet or other assemblers. So it is a good first step to try, even if it doesn’t reduce the problem significantly.

• Divide & conquer approach should allow more sophisticated (compute intensive) graph analysis approaches in the future.

• This approach enables (in theory) assembly of arbitrarily large amounts of metagenomic DNA sequence.

• Can k-mer filtering work for non-de Bruijn graph assemblers? (SGA, ALLPATHS-LG, …)

• mRNAseq and genome artifact filtering?

Page 64: U Florida / Gainesville  talk, apr 13 2011

Kmer -> GTCGTAGTTCAGTTGGTTAGAACGCCGGCCTG

747:3:13:7042:16004/1  GATATCTGCAATATCCCGTTCGAATGGGGTCGTAGTTCAGTTGGTTAGAACGCCGGCCTGTCACGCCGGAGGCC
747:3:14:10559:9771/1  GAAATTCCGGTTTGATGCGGAGTCGTAGTTCAGTTGGTTAGAACGCCGGCCTGTCACGTCGGAGGTCGCGGGTTCG
747:3:14:17232:4498/1  CAAATTTGAGATCTGAGATCCCAGGGGTTTGCGGAGTCGTAGTTCAGTTGGTTAGAACGCCGGCCTGTCACGTCGG
747:3:15:7871:10206/1  TTTGCGGAGTCGTAGTTCAGTTGGTTAGAACGCCGGCCTGTCACGTCGGAGGTCGCGGGTTCGAGTCCCGTCGG
747:3:16:17865:15895/2 TCAGGAGACGCCAGGGCGGTCTGAGTTCTTCAGGGGTCGTAGTTCAGTTGGTTAGAACGCCGGCCTGTCACGCCGG
747:3:27:9549:13966/1  GGAGTCGTAGTTCAGTTGGTTAGAACGCCGGCCTGTCACGTCGGAGGTCGCGGGTTCGAGTCCCGTCGGCTCCGCC
747:3:30:10672:3136/1  GCGGGGTCGTAGTTCAGTTGGTTAGAACGCCGGCCTGTCACGCCGGAGGTCGCGAGTTCGAGTCTCGTCGGCCC

Better artifact filtering?
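One simple form of it, sketched below (a hypothetical illustration, not an existing khmer command): screen reads against a set of known artifact k-mers, such as the one shown above, and discard any read containing one.

# Sketch: drop reads containing a known artifact k-mer (illustration only;
# the artifact 32-mer is the one shown on the slide above).
ARTIFACT_KMERS = {"GTCGTAGTTCAGTTGGTTAGAACGCCGGCCTG"}

def contains_artifact(read, artifact_kmers=ARTIFACT_KMERS):
    k = len(next(iter(artifact_kmers)))
    return any(read[i:i + k] in artifact_kmers
               for i in range(len(read) - k + 1))

def filter_artifacts(reads):
    return [r for r in reads if not contains_artifact(r)]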

Page 65: U Florida / Gainesville  talk, apr 13 2011

All paths lead to the same k-mers

[Histogram of k-mer traversal counts: number of times each k-mer is traversed (x-axis 0–25, y-axis 0–90,000).]

Page 66: U Florida / Gainesville  talk, apr 13 2011

Estimating sequencing return on investment

• To reach ~rumen depth of sampling of the top-abundance organisms, we would need ~1-2 TB of sequence.

[Figure annotations: 5x sequencing coverage (931 GB); 10x sequencing coverage (1,900 GB); <1% novel sequence]

Page 67: U Florida / Gainesville  talk, apr 13 2011

Argonne National Laboratory Institute for Genomic and Systems Biology

Page 68: U Florida / Gainesville  talk, apr 13 2011

Argonne National Laboratory Institute for Genomic and Systems Biology

Earth Microbiome Project – www.earthmicrobiome.org

• Goal – to systematically approach the problem of characterizing microbial life on earth

• Paradigm shift to analyzing communities from the microbes' perspective.

• Strategy:
  – Explore microbes in environmental parameter space
  – Design an 'ideal' strategy to interrogate these biomes
  – Acquire samples and sequence broadly and deeply: DNA, mRNA and rRNA
  – Define microbial community structure and the protein universe

• Gilbert et al., 2010a,b, Standards in Genomic Sciences (open access)

Page 69: U Florida / Gainesville  talk, apr 13 2011

Argonne National Laboratory Institute for Genomic and Systems Biology

• Challenges
  – 2.4 Quadrillion Base Pairs (2.4 Petabases) = 8,000 HiSeq 2000 runs.

– Global Environmental Sample Database (GESI): identification and selection of 200,000 environmental samples, soil, air, marine and freshwater, host-associated, etc.

– The standardization of sampling, sample prep and sample processing, cataloging and sample metadata – Genomic Standards Consortium can help!

– The coordination of thousands of “volunteer” scientists for site characterization, sample collecting and processing

Earth Microbiome Project – www.earthmicrobiome.org

Page 70: U Florida / Gainesville  talk, apr 13 2011

Acknowledgements

The k-mer gang:

• Adina Howe

• Jason Pell
• Rosangela Canino-Koning
• Qingpeng Zhang
• Arend Hintze

Collaborators:

• Jim Tiedje (Il padrino)

• Janet Jansson, Rachel Mackelprang, Regina Lamendella, Susannah Tringe, and many others (JGI)

• Charles Ofria (MSU)

Funding: USDA NIFA; MSU, startup and iCER; DOE; BEACON/NSF STC; Amazon Education.

Page 71: U Florida / Gainesville  talk, apr 13 2011