GENOMICS APPLICATIONS PLATFORM
Friday, 27 April 12
GENESTACK PLATFORM
OBJECTIVE: universal genomics applications platform
REASON: provide full set of building blocks
APPROACH: existing tool integration & new tool development
GENESTACK PLATFORM
Sharing
Private data
Public data
Applications
Security
HPC
GENESTACK PLATFORM
DATA
private & secure sharing
free public data
format-independent
custom data types
GENESTACK PLATFORM
APPLICATIONS
efficient computation
unified user interface
scriptable
trusted
application SDK
GENESTACK PLATFORM
FOR DEVELOPERS
SDK & tooling
Data and applications store
Security audit/testing
GENESTACK SERVICE
PUBLIC DATA
Meta-curation
Quality control
NGS tools
Free access
GENESTACK SERVICE
END-TO-END SEQUENCING
NGS service partners
Data direct to cloud
Curation and apps
GENESTACK SERVICE
CLOUD DEPLOYMENT
Cost-effective, secure
Offsite backup
Long term archival
Genestack Limited, Salisbury House, Station Road, Cambridge, CB1 2LA, United Kingdom
Telephone +447990705531, Email: [email protected], Twitter: @genestackltd
Registered in England and Wales Company No. 7778793
GENESTACK www.genestack.com
GENOMICS OPERATING SYSTEM
Solutions to Six Problems With Genomic Data and Applications in the Enterprise
1. Managing Genomic Data Storage Costs
Problem: Sequencing gets cheaper per genome, producing more gigabases per dollar, but data storage and processing costs are in fact growing. In-house storage and cluster solutions require large capital expenditure and carry high operating costs.
Solution: We offer a scalable way to manage your data storage and processing costs on our cloud-based platform. For a fixed monthly subscription fee, you get storage space and computational capacity, keeping costs in line with your needs. We host the world’s biggest public genomics datasets, and through economies of scale can pass our lower storage costs on to you.
Interesting: For every gigabyte of raw sequence, researchers use at least seven gigabytes of operational disk space for processing, and usually store these intermediate files long after the processing run, adding to the storage costs. Our platform is designed to optimize these intermediate data overheads.
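To make the overhead figure above concrete, here is a minimal arithmetic sketch. It assumes the "at least seven gigabytes per gigabyte of raw sequence" ratio holds as a lower bound; the function names and the 100 GB run size are hypothetical, chosen only for illustration, and are not part of the Genestack platform.

```python
# Assumption: 7 GB of operational disk space per 1 GB of raw sequence,
# taken from the text as a lower bound. Real overheads vary by pipeline.
OVERHEAD_GB_PER_RAW_GB = 7


def min_working_space_gb(raw_gb: float) -> float:
    """Lower-bound estimate of operational disk space used during processing."""
    return raw_gb * OVERHEAD_GB_PER_RAW_GB


def min_total_footprint_gb(raw_gb: float, keep_intermediates: bool = True) -> float:
    """Raw data plus intermediate files, if those are kept after the run."""
    extra = min_working_space_gb(raw_gb) if keep_intermediates else 0
    return raw_gb + extra


if __name__ == "__main__":
    raw = 100  # hypothetical sequencing run, in GB
    print("working space (GB):", min_working_space_gb(raw))
    print("footprint if intermediates are kept (GB):", min_total_footprint_gb(raw))
    print("footprint if intermediates are deleted (GB):", min_total_footprint_gb(raw, False))
```

The gap between the last two figures is the cost of keeping intermediate files long after the processing run has finished.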
2. Safe Data Sharing Within & Across Organizations
Problem: When your teams work with external collaborators, you may need to give individuals access to valuable data. Copying, downloading and sending data by mail is expensive, inefficient and difficult to control.
Solution: Our platform supports collaborative groups within and across organizations, with fine-grained access control. Data can be encrypted at rest and in transit. We are ready for stringent security tests, can fulfill the technical and insurance requirements of pharma IT and legal departments, and can integrate with internal authentication/authorization mechanisms.
Interesting: NGS produces files hundreds of gigabytes in size; encrypting/decrypting them is slow and CPU-intensive, while bioinformatics tools can take hours or days to run. We have thought of ways to maintain security even for such cases.
3. Using Public Data with Proprietary Data Cost-effectively
Problem: To use data from 1000 Genomes, GEO, Ensembl references or other public data for in-house R&D, your IT keeps local snapshots of these resources. Keeping them up to date is a heavy burden, but today there is no choice if, say, you need to see ten public RNA-Seq tracks alongside ten proprietary ones in a genome browser while keeping control of your data.
Solution: We host a huge collection of public data, selected and annotated by our curators, and make it available to you for free. Together with secure hosting for your proprietary data and a flexible mechanism to select and create virtual meta-experiments, our platform offers the most cost-effective way to work with public and private datasets. You can access our genome browser from any laptop to view tens of different tracks, public and private, simultaneously and securely.
Interesting: The 1000 Genomes project is about 200 TB of data. It’s on Amazon’s cloud, but you need to be an expert to use it: tutorials are many pages long. SRA, the public repository for NGS data, is about ten times that. By subscribing to the Genestack platform, you will have free, easy access to these and other datasets, kept up to date and annotated by our curators.
Universal genomics data platform. Secure hosting and team sharing of Big
Data genomics experiments. Bioinformatics applications ecosystem in the
cloud. Free access to curated genomic data from public repositories. Data
curation and application development. End-to-end sequencing service.
Applications SDK & marketplace. Fixed monthly subscription.
Misha Kapushesky, [email protected] @genestackltd
Launch fall 2012. Want to take part in our early access programme?