Enabling Cloud and Grid Powered Image Phenotyping Nirav Merchant iPlant Collaborative [email protected]
Dec 29, 2015
Topic Coverage
• Motivation
• Key Components
• Overview of BISQUE
• Roadmap and future direction
Motivation
• High throughput imaging is essential for enabling genome scale phenotyping efforts
• Affordable automation for image acquisition (e.g. robotic high throughput systems) is creating vast amounts of imaging data (rapidly)
• Many laboratories have custom or commercial setups for high throughput image acquisition (but lack a comparable analysis platform)
• Super resolution microscopy and multi-channel images are pushing the boundaries of storage and computational capabilities
Image acquisition
Robotic image acquisition of root tips (Spalding Lab.)
Image Acquisition
Multiple setups recording movies for root growth (Spalding Lab.)
Motivation II
• New and improved algorithms and analysis routines are constantly being published
• Applying these algorithms to existing data is challenging for biologists
• Sharing and collaborating with large image data sets is challenging
• There is no common platform to try multiple methods/algorithms on a collection of images
• Data management is challenging for high throughput methods (metadata is key)
• Establishing a consistent protocol for image analysis is challenging when using multiple applications/platforms
• ONE SIZE FITS ALL APPROACH DOES NOT WORK
Key iPlant infrastructure
• iPlant Data Store (iDS)
• Computational Grid (HPC, HTC)
• Atmosphere (Cloud Infrastructure)*
• BISQUE*
iPlant Data Store
Connecting people with data and computation:
Lifecycle of Data
Transfer Storage Analysis Visualization Metadata Mark-up Search and Discover Share/Collaborate Publish
Why cloud?
• Standalone interactive GUI-based applications are frequently required for analysis
• GUI apps are not easy to transform into web apps (or to run on grid/command line etc.)
• Need to handle complex software dependencies (e.g. a specific version of software/library)
• Users need full control of their software stack (occasional sudo/superuser access)
• Need to share desktop/applications for collaborative analysis (remote collaborators)
So how does it work?
Configured VM (all required s/w)
High Bandwidth Transfer
iPlant Data Store
How does it look?
Why Bisque
• Allows algorithm developers to publish new analysis methods and easily make them completely web accessible
• Biologists can choose from multiple analysis options for their images, overlay results to validate findings without altering original image content
• Produce interactive plots and visualizations using the built-in API
• Share results, images, and annotations with collaborators via a secure link
• Integrated with iPlant storage and computation infrastructure
Bisque features
• Rich internet application (completely web based)
• Draws upon features from popular large scale photo sharing sites and high resolution aerial imagery (Google Maps)
• Ability to import and export over 100 image formats and movies
• Ability to import extremely large image sets using iPlant storage infrastructure
• Can display a 20Kx20K image using a standard web browser
• Utilizes distributed computing (connected to XSEDE) and workflow engines (Pegasus, Condor) to scale analysis
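The slides do not say how Bisque serves a 20Kx20K image to a browser. Map-style viewers commonly use a tile pyramid, so the sketch below only illustrates that general idea. The 256-pixel tile size, the reading of "20K" as 20,000 pixels, and the function names `tile_pyramid` and `tiles_for_viewport` are all assumptions made for illustration, not Bisque's API.

```python
def tile_pyramid(width, height, tile=256):
    """List (level, width, height, cols, rows), full resolution first.

    Each level halves the previous one (rounding up) until the whole
    image fits in a single tile. Tile size is an assumed value.
    """
    levels = []
    level, w, h = 0, width, height
    while True:
        cols = (w + tile - 1) // tile  # integer ceiling division
        rows = (h + tile - 1) // tile
        levels.append((level, w, h, cols, rows))
        if cols == 1 and rows == 1:
            break
        w = max(1, (w + 1) // 2)
        h = max(1, (h + 1) // 2)
        level += 1
    return levels


def tiles_for_viewport(x, y, view_w, view_h, tile=256):
    """Return the (col, row) tiles that cover a viewport at one level."""
    first_col, first_row = x // tile, y // tile
    last_col = (x + view_w - 1) // tile
    last_row = (y + view_h - 1) // tile
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


pyramid = tile_pyramid(20000, 20000)
visible = tiles_for_viewport(0, 0, 1920, 1080)
```

Under these assumptions the browser only fetches the handful of tiles under the current viewport at the current zoom level, rather than the full image.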
Whole seedling-size analysis
High resolution flat bed scanner image of seeds
Edge detection and analysis by PhytoBisque
Source: Edgar Spalding
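The slides do not state which edge detector PhytoBisque applies. As a rough illustration of what edge detection on a scanned image involves, here is a minimal Sobel gradient sketch in pure Python. The Sobel operator, the threshold value, and the toy 8x8 image are assumptions chosen for the example, not the PhytoBisque method.

```python
def sobel_edges(img, threshold):
    """Return a binary edge map for a grayscale image (list of rows).

    Border pixels are left as 0 because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges


# Toy stand-in for a scan: a bright 4x4 "seed" on a dark background.
image = [[0] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        image[y][x] = 255

edge_map = sobel_edges(image, threshold=200)
```

In this toy case the interior of the bright square and the far background stay unmarked, while pixels along the square's boundary are flagged as edges.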
Simple Steps for Using it
• Concept of Mini-Apps
• Browse and select image (or video)
• Run analysis
• Overlay results and verify
• Export data
PhytoBisque interface
Searching, browsing
PhytoBisque Interface
Viewing a large (18Kx17K pixel) image and performing analysis on a selected section
Participants
• Bisque (Univ. of California, Santa Barbara)
– B. S. Manjunath
– Kris Kvilekval
– Dmitry Fedorov
• Phytomorph (Univ. of Wisconsin, Madison)
– Edgar Spalding
– Nathan Miller
– Logan Johnson
Users
• Currently we have 5+ groups actively using this infrastructure
• 3 graduate courses
• 2 summer courses/workshops
• 1 Pollen Network RCN
• NSF ADBC Thematic Collections Network (Yale University led)
• Main application:
– http://bovary.iplantcollaborative.org
• Support:
– http://ask.iplantcollaborative.org
• Project Website:
– http://www.iplantcollaborative.org