Screen Mining with KNIME
A user-friendly framework for high throughput / content data analysis
Martin Stöter, HT - Technology Development Studio (TDS), the HC-Screening Unit at the MPI-CBG, [email protected]
KNIME workshop, February 27th 2016, Berlin
Outline
Martin Stöter, MPI-CBG, Dresden, Germany 2
- Introduction into High-Content Screening (HCS) data and the HCS Tools nodes
- Hands-on session HCS Tools
- Introduction into Scripting Integration nodes
- Hands-on session Scripting Integration
Technology Development Studio (TDS)
MPI-CBG, Dresden, Germany
- Screening facility for academic laboratories
- Provide full service for automation and cell-based screens, RNAi and chemical screens
- Equipment: liquid handling robots, drop dispensers, plate washers, plate readers, High Content Screening platforms
Data Analysis is a Bottleneck in HCS!
- Complex experiments
- Lots of data (too much for Excel)
- Fancy data analysis / mining
- Many scientists, but few data analysts
- Sometimes different languages
- Data analysis is often a bottleneck!
[Figure labels: Scientists, Data analyst, HCS Tools + …]
High-Content Screening (HCS) data
Data generation
- Cells (RNAi, compounds)
- Microscopy -> images
- Image analysis
- Cell features / parameters -> well data

Tasks / problems
- Read data from various sources: SQL database, XML, Excel, various .csv …
- Screening specific statistics
- Screening specific utilities
- Data mining, visualization
[Figure: 384-well plate layout, columns 1 to 24 and rows A to P, showing a concentration series (10, 3, 1, 0.3, 0.1 and 0.001), DMSO controls and "no AB" wells]
HCS Tools for KNIME
Data Import
- Image Analysis Readers (Opera, Operetta, MotionTracking)
- Plate Readers (Envision, GeniusPro, MSD Sector Imager)
- Other (Example Data, Generic XML)

Normalization
- Percent-of-control (POC), Normalized percent inhibition (NPI)
- Z-score, B-score
- Vector Length Normalization (clustering)
- Optional: robust statistics (Median + MAD)
- Select wells to normalize (controls, samples)

Quality Control
- Z-prime factor (Z'), Multivariate Z', SSMD
- CV (coefficient of variation)
- Optional: robust statistics (Median + MAD)
- Select wells to normalize (controls, samples)
HCS Tools for KNIME
Utilities
- Handle barcodes, wells and row letters
- Join Layout from Excel (well annotation, metadata)
- Create Well Position (NEW)

Visualization
- Plate Heatmap Viewer
- Dose Response (dependent on R!)

Advanced Statistics
- Binning Analysis

Data Manipulation / Pre-Processing
- Split / Combine Columns (by header)
- Number Formatter (NEW)
- Range Filter, Splitter
- Outlier Removal
HCS Tools: Standardized Data Format
- Enforce standardization of data format
- Different reader nodes to shape a common data structure
- Lower the knowledge entry barrier for new users

“barcode”, “plateRow”, “plateColumn”, param1, param2, …

-> Eases the usage of other HCS Tools nodes
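As a rough illustration of the common table layout named above, the following Python sketch checks that rows carry the three standard columns. Only the column names "barcode", "plateRow" and "plateColumn" come from the slide. The function name, the example rows, the 1-based integer convention and the readout name `param1` are assumptions for illustration, and this is not how the HCS Tools reader nodes are implemented.

```python
REQUIRED_COLUMNS = ("barcode", "plateRow", "plateColumn")


def find_format_problems(rows):
    """Return a list of problems; an empty list means the rows fit the layout."""
    problems = []
    for index, row in enumerate(rows):
        for name in REQUIRED_COLUMNS:
            if name not in row:
                problems.append(f"row {index}: missing column '{name}'")
        for name in ("plateRow", "plateColumn"):
            # Assumption: well coordinates are stored as integers.
            if name in row and not isinstance(row[name], int):
                problems.append(f"row {index}: '{name}' is not an integer")
    return problems


# Hypothetical example rows (the barcode value is made up).
good = [{"barcode": "001AB160227A", "plateRow": 1, "plateColumn": 1, "param1": 0.42}]
bad = [{"barcode": "001AB160227A", "plateRow": "A", "param1": 0.42}]
print(find_format_problems(good))
print(find_format_problems(bad))
```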
HCS Tools: Expand well
Standardization of the well coordinates:
- “plateRow” and “plateColumn” as integer values resemble the well position matrix (instead of a well string)
- Some nodes select these columns as default (Join Layout, Plate Heatmap Viewer)
- Compatible with 96, 384 and 1536 well format
- Plate Row Converter (letter ↔ integer)
- Create Well Position (sortable well string), NEW NODE
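The letter ↔ integer conversion and the sortable well string can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code of the Plate Row Converter or Create Well Position nodes: rows are assumed to be 1-based, rows beyond Z are assumed to continue as AA, AB, … (a 1536-well plate has 32 rows), and the zero-padded output format is a guess at what "sortable" means. All function names are hypothetical.

```python
def row_letter_to_int(letters):
    """Convert a row label such as 'A', 'P' or 'AF' to a 1-based row index."""
    if not letters:
        raise ValueError("empty row label")
    value = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid row label: {letters!r}")
        value = value * 26 + (ord(ch) - ord("A") + 1)
    return value


def row_int_to_letter(number):
    """Convert a 1-based row index back to its letter label."""
    if number < 1:
        raise ValueError("row index must be >= 1")
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def well_position(plate_row, plate_column):
    """Build a well string with a zero-padded column, e.g. (2, 3) -> 'B03'."""
    return f"{row_int_to_letter(plate_row)}{plate_column:02d}"


print(row_letter_to_int("P"), row_int_to_letter(32), well_position(2, 3))
```

Zero-padding the column keeps "B03" ahead of "B10" in a plain text sort. Two-letter row labels would still need extra care to sort after "Z".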
HCS Tools: Barcode Standard
Regular expression for interpretation of barcode:
- Standardized table structure -> connection to our TDS compound database
- (?<libplatenumber>[0-9]{3})(?<projectcode>[A-z]{2})(?<date>[0-9]{6})(?<replicate>[A-z]{1})
- Configurable in Preferences -> KNIME -> HCA Tools
- Multiple barcodes / regular expressions possible
- Final release recently
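To show what the barcode pattern above extracts, here is a small Python sketch. Python writes named groups as `(?P<name>…)` rather than `(?<name>…)`, and the sketch uses `[A-Za-z]` because the range `[A-z]` also matches a few punctuation characters that sit between `Z` and `a` in ASCII. The example barcode and the function name are made up, and matching the whole string is an assumption about how the pattern is applied.

```python
import re

# Same four named groups as on the slide, in Python syntax.
BARCODE_PATTERN = re.compile(
    r"(?P<libplatenumber>[0-9]{3})"
    r"(?P<projectcode>[A-Za-z]{2})"
    r"(?P<date>[0-9]{6})"
    r"(?P<replicate>[A-Za-z]{1})"
)


def parse_barcode(barcode):
    """Return the barcode fields as a dict, or None if the barcode does not fit."""
    match = BARCODE_PATTERN.fullmatch(barcode)
    return match.groupdict() if match else None


# Hypothetical barcode: library plate 001, project AB, date 160227, replicate A.
print(parse_barcode("001AB160227A"))
print(parse_barcode("not-a-barcode"))
```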
HCS Tools: Annotate Experiment
- Excel is the tool for experiment documentation and assay development
- Join Layout node is an Excel reader for a defined spreadsheet
- Plate format with multiple well attributes (1 plate layout -> 1 column in KNIME)
  - Title of layout starts in cell C5
  - Two empty rows between the layouts
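The idea "1 plate layout -> 1 column" can be sketched in Python as turning a plate-shaped grid into a lookup keyed by row and column, then attaching it to the well table. This is only an illustration under assumptions: it does not read Excel, it does not model the C5 title position or the empty rows between layouts, it handles single-letter rows only, and the grid shape, function names and attribute name `treatment` are hypothetical.

```python
def layout_to_lookup(grid):
    """Map (plateRow, plateColumn) to the layout value.

    Assumed grid shape: the first line holds the column numbers (first cell
    empty), every following line starts with a single row letter.
    """
    header = grid[0][1:]
    lookup = {}
    for line in grid[1:]:
        row_index = ord(line[0].upper()) - ord("A") + 1
        for column, value in zip(header, line[1:]):
            if value not in ("", None):  # skip empty wells
                lookup[(row_index, int(column))] = value
    return lookup


def join_layout(wells, lookup, attribute):
    """Return copies of the well rows with one new column taken from the layout."""
    return [
        {**well, attribute: lookup.get((well["plateRow"], well["plateColumn"]))}
        for well in wells
    ]


grid = [
    ["", "1", "2", "3"],
    ["A", "DMSO", "10", "3"],
    ["B", "DMSO", "", "1"],
]
wells = [
    {"plateRow": 1, "plateColumn": 1, "param1": 0.9},
    {"plateRow": 2, "plateColumn": 2, "param1": 0.4},
]
annotated = join_layout(wells, layout_to_lookup(grid), "treatment")
print(annotated)
```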
HCS Tools: Normalization
To compare data from different plates, days or runs, data must be normalized per plate.
- Selectable reference well population per plate
- Percent-of-control (POC), normalized percent inhibition (NPI), Z-score
- Robust statistics (median & MAD instead of mean & SD)
- Statistics table as second output
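The normalization methods listed above follow commonly used textbook definitions, sketched here in Python for a single plate. These are assumptions rather than the exact formulas of the HCS Tools nodes. In particular, the NPI sign convention (negative controls map to 0 %, positive controls to 100 %) and the scaling of the MAD by 1.4826 are choices made for this sketch, and all function names are hypothetical.

```python
import statistics

# Assumption: scale the MAD so that it estimates the SD for normal data.
MAD_SCALE = 1.4826


def poc(value, controls):
    """Percent-of-control: value relative to the control mean, in percent."""
    return value / statistics.mean(controls) * 100.0


def npi(value, neg_controls, pos_controls):
    """Normalized percent inhibition: 0 at the negative, 100 at the positive controls."""
    neg_mean = statistics.mean(neg_controls)
    pos_mean = statistics.mean(pos_controls)
    return (neg_mean - value) / (neg_mean - pos_mean) * 100.0


def z_score(value, reference, robust=False):
    """Z-score against a reference population; robust uses median and MAD."""
    if robust:
        center = statistics.median(reference)
        spread = MAD_SCALE * statistics.median([abs(x - center) for x in reference])
    else:
        center = statistics.mean(reference)
        spread = statistics.stdev(reference)
    return (value - center) / spread


# Hypothetical control wells of one plate.
neg = [100.0, 110.0, 90.0]
pos = [10.0, 20.0, 0.0]
print(poc(50.0, neg), npi(55.0, neg, pos), z_score(120.0, neg))
```

Each function takes the reference wells of one plate, so applying it plate by plate gives the per-plate normalization described above.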
HCS Tools: Quality Control (QC)
Quality control statistics measure the assay performance.
- Selectable (multiple) reference well populations per plate
- Z-prime factor (Z’), multivariate Z’, strictly standardized mean difference (SSMD), coefficient of variation (CV)
- Robust statistics (median & MAD instead of mean & SD)
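For reference, the usual definitions of three of these measures can be written down in a short Python sketch. It uses mean and SD only; the robust variant and the multivariate Z’ are not covered. The SSMD form assumes independent control populations, the function names and control values are hypothetical, and the HCS Tools nodes may differ in detail.

```python
import math
import statistics


def z_prime(pos_controls, neg_controls):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    spread = statistics.stdev(pos_controls) + statistics.stdev(neg_controls)
    window = abs(statistics.mean(pos_controls) - statistics.mean(neg_controls))
    return 1.0 - 3.0 * spread / window


def ssmd(pos_controls, neg_controls):
    """SSMD = (mean_pos - mean_neg) / sqrt(var_pos + var_neg)."""
    difference = statistics.mean(pos_controls) - statistics.mean(neg_controls)
    pooled = statistics.variance(pos_controls) + statistics.variance(neg_controls)
    return difference / math.sqrt(pooled)


def cv_percent(values):
    """Coefficient of variation in percent: sd / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical control wells of one plate.
pos = [10.0, 20.0, 0.0]
neg = [100.0, 110.0, 90.0]
print(z_prime(pos, neg), ssmd(pos, neg), cv_percent(neg))
```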
HCS Tools: Binning Analysis
- Binning analysis describes changes in distributions
- Great tool for moving from cell to well data (instead of just taking the mean per well)
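One common way to implement the binning idea above is to derive bin edges from the per-cell values of a reference population and then report, for each well, the fraction of cells falling into each bin. The Python sketch below assumes exactly that. Percentile-based bins, the number of bins and the function names are assumptions, not a description of the Binning Analysis node.

```python
import bisect
import statistics


def bin_edges_from_reference(reference_cells, n_bins=4):
    """Cut points that split the reference cells into n_bins equally filled bins."""
    return statistics.quantiles(reference_cells, n=n_bins)


def bin_fractions(cells, edges):
    """Fraction of a well's cells falling into each bin defined by the edges."""
    if not cells:
        raise ValueError("well contains no cells")
    counts = [0] * (len(edges) + 1)
    for value in cells:
        counts[bisect.bisect_right(edges, value)] += 1
    return [count / len(cells) for count in counts]


# Hypothetical per-cell feature values.
reference = list(range(1, 101))   # cells from control wells
treated = [80, 90, 95, 99]        # cells from one treated well
edges = bin_edges_from_reference(reference)
print(bin_fractions(reference, edges))
print(bin_fractions(treated, edges))
```

A shift of the treated well toward the highest bin shows up in the fractions even when the change in the well mean is modest, which is the motivation given above for going beyond the mean per well.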
"CellProfiler and KNIME: open source tools for high content screening.". Methods in molecular biology (Clifton, N.J.) 2013 986, S. 105-22
HCS Tools: Plate Viewer (discontinued)
179 plates x 384 wells = ~70,000 data points, times x parameters
HCS Tools: Plate Heatmap Viewer
Visualization of screening campaigns with metadata
Easy to visually find patterns, drifts, errors …
New features:
- KNIME Colors
- HiLite support
- Representation of images
- Many different configurations, e.g. color scale …
Example:
- 10 x 384 well plates
- 3 replicates
- ~10,000 data points
- Raw data
- Metadata from barcode
- Normalized data
- Different readout
- Metadata from layout
- Browsing single plate
- Viewing the well data
- Display of images
- … more
HCS Tools: what was / is cooking?
New nodes
- Create Well Position
- Number Formatter: transforms numbers to defined string

Enhancements
- Dose Response (R)
  - Image output (instead of view)
  - More statistics in table output (e.g. Hill coefficient)
  - More plot options (SEM)
  - New model port

Plate Viewer was discontinued

Binning Analysis work in progress
- Binning Calculate
- Binning Apply
- Binning QC & Model Modifier
HCS Tools: the demo
Ok … now let’s go to the workflow and see the nodes …
The dataset: CellProfiler image data (pre-cleaned up as a .table due to technical reasons)
- 10 x 384 well plates in 3 replicates with 3 images per well
Acknowledgements
Software Development
- Antje Janosch
- Tim Nicolaisen
- Magdalena Rucinsk
- Felix Meyerhofer (past)
- Holger Brandl (past)

TDS team (MPI-CBG)

KNIME
- Michael Berthold and the KNIME team