© 2015 IHS. ALL RIGHTS RESERVED. KNOWLEDGE ARCHITECTURE AND BIG DATA How to Apply Knowledge Architecture to Big Data David Meza Chief Knowledge Architect NASA Johnson Space Center Federal Reserve June 15, 2016
AGENDA
• Knowledge Architecture
• NASA Data Strategy
• Cognitive Computing
“The most important contribution management needs to make in the 21st Century is to increase the productivity of knowledge work and the knowledge worker.” PETER F. DRUCKER, 1999
To convert data to knowledge, a convergence of Knowledge Management, Information Architecture, and Data Science is necessary.
Knowledge Management
Data Science
Information Architecture
Knowledge Architecture
• The people, processes, and technology of designing, implementing, and applying the intellectual infrastructure of organizations.
• What is an intellectual infrastructure?
• The set of activities to create, capture, organize, analyze, visualize, present, and utilize the information part of the information age.
• Information + Contexts = Knowledge
• Information Architecture + Knowledge Management + Data Science = Knowledge Architecture
• KM without applications is empty (Strategy Only)
• Applications without KA are blind (IT based KM)
• Data Science transforms your data into knowledge
Knowledge Management
"Knowledge management is the process of capturing, distributing, and effectively using knowledge."
This definition has the virtue of being simple, stark, and to the point. A few years later, the Gartner Group created a second definition of KM, which is perhaps the most frequently cited one (Duhon, 1998):
"Knowledge management is a discipline that promotes an integrated approach to identifying, capturing, evaluating, retrieving, and sharing all of an enterprise's information assets. These assets may include databases, documents, policies, procedures, and previously un-captured expertise and experience in individual workers."
Information Architecture
The intent is to achieve a variety of capabilities to enable the Agency to efficiently acquire or generate, find and access, use and reuse, share and exchange, manage and govern, and store and retire our data.
Data Science
Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, similar to Knowledge Discovery in Databases (KDD).
The Knowledge Discovery in Databases (KDD) process is commonly defined with the stages:
(1) Selection
(2) Pre-processing
(3) Transformation
(4) Data Mining
(5) Interpretation/Evaluation
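The five KDD stages can be read as a chain of small functions, each feeding the next. The following is a minimal sketch, not a method taken from the talk: the sample records, the stopword list, and every function name are hypothetical, and "data mining" is reduced to counting how many documents each term appears in.

```python
from collections import Counter

# Hypothetical toy records standing in for a document repository.
RECORDS = [
    {"id": 1, "type": "lesson", "text": "Valve seal failed during cold soak test"},
    {"id": 2, "type": "memo", "text": "Parking lot closed on Friday"},
    {"id": 3, "type": "lesson", "text": "Seal failed again; cold soak test repeated"},
    {"id": 4, "type": "lesson", "text": "  "},
]

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"during", "again", "on", "the", "a"}


def select(records):
    """(1) Selection: keep only the records relevant to the question."""
    return [r for r in records if r["type"] == "lesson"]


def preprocess(records):
    """(2) Pre-processing: normalise case and punctuation, drop empty text."""
    cleaned = []
    for r in records:
        text = r["text"].strip().lower().replace(";", " ")
        if text:
            cleaned.append({**r, "text": text})
    return cleaned


def transform(records):
    """(3) Transformation: turn each text into a list of non-stopword tokens."""
    return [[t for t in r["text"].split() if t not in STOPWORDS] for r in records]


def mine(token_lists):
    """(4) Data mining: count in how many documents each term appears."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(set(tokens))
    return counts


def evaluate(counts, min_docs=2):
    """(5) Interpretation/Evaluation: keep terms that recur across documents."""
    return sorted(term for term, n in counts.items() if n >= min_docs)


def kdd(records):
    """Run the five stages in order."""
    return evaluate(mine(transform(preprocess(select(records)))))


if __name__ == "__main__":
    print(kdd(RECORDS))
```

Keeping each stage as its own function makes it easy to swap in a real technique at one step, for example a clustering or topic model at stage (4), without touching the others.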
Data Strategy
Key Recommendations:
• Data Management
• Unified Data Lifecycle
• Data Governance
• Data Analytics Lab
• Data Fellows Program
• Data Stewards
Data Strategy Framework
Challenge: Lack of an explicit data management framework, fragmented data lifecycle and lack of data integration
Example: No Agency-wide architecture and standards for information interoperability. Much of the data NASA produces is inaccessible or human-readable only, with no method to draw-in, parse, organize, or make use of this data.
Opportunity: Improved architecture, standards and accessibility permitting quicker and more effective collection, digitization and discovery; increased focus on mission-specific data needs and type-specific approaches
Recommendation: 1. Data Management 2. Unified Data Lifecycle 3. Data Governance Program

Challenge: Need for new emerging data analytics technologies and capabilities to address mission specific challenges
Example: Many of NASA’s current data systems are significantly outdated and cannot scale to meet demand.
Opportunity: Experimenting with new algorithms, applications, and techniques
Recommendation: 4. Data Analytics Lab

Challenge: Data expertise gap
Example: Data scientists are in low supply and high demand, and NASA will need to compete with industry to attract the best & brightest.
Opportunity: Collaborative partnerships to build internal capacity and expertise and utilize external talent, tools, and information
Recommendation: 5. Data Fellows Program

Challenge: Need to effectively address culture and policy issues alongside technology
Example: In many cases, individuals are not motivated to share data for collaborative use with others.
Opportunity: Increased cross-agency and cross-stakeholder ownership and approach to data management and data analytics challenges
Recommendation: 6. Data Stewards
KNOWLEDGE ARCHITECTURE – ANALYTICS FRAMEWORK
IT & Intellectual Infrastructure
Security, Data Quality, Workflow Management, Data Management, Resource Management
Data Products:
• Predictions
• Models
• Visualizations
• Decision Analysis
• Wiki
Sources:
• Sensor
• Experimental
• Computed (modeling & simulation)
Forms:
• Digital
• Text
• Visual
Organization:
• Structured
• Semi-Structured
• Unstructured
Functions:
• Governance
• Taxonomy
• Ontology
• Comm. Plan
• Operations Management
• Security
• Master Data Management
• Content Management
• Metadata
• Data Quality
Tools & Environments:
• Large scale storage
• RDBMS
• Parallel RDBMS
• NoSQL
• Hadoop
Organization:
• Structured
• Semi-Structured
• Unstructured
Tools & Environments:
• Computation & data access
• Data Mining
• Text Mining
• Optimization
• Net Algorithm
• New Algorithm
• Visualization
Access Pattern:
• Structured
• Semi-Structured
• Unstructured
• Predictable
• Unpredictable
Stages, from Source to User:
• Data Acquisition & Creation
• Data Management
• Data Warehousing
• Data Analytics, BI (Knowledge Extraction)
• Knowledge Presentation and Visualization
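One way to reason about the framework is to treat it as an ordered list of stages that data passes through on its way from source to user. The sketch below is illustrative only: the stage names come from the slide, but the `Stage` class, the helper functions, and above all the assignment of each list (Sources, Functions, Tools & Environments, and so on) to a particular stage are assumptions, since the extracted slide does not preserve that mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step between Source and User (hypothetical helper type)."""
    name: str
    concerns: tuple = ()


# Stage names are from the slide. Which concerns belong to which stage is an
# ASSUMPTION made for illustration, not something the slide text confirms.
FRAMEWORK = (
    Stage("Data Acquisition & Creation", ("Sources", "Forms", "Organization")),
    Stage("Data Management", ("Functions",)),
    Stage("Data Warehousing", ("Tools & Environments", "Organization")),
    Stage("Data Analytics, BI (Knowledge Extraction)",
          ("Tools & Environments", "Access Pattern")),
    Stage("Knowledge Presentation and Visualization", ("Data Products",)),
)


def path(framework, start, end):
    """Return the stage names data passes through from start to end, inclusive."""
    names = [s.name for s in framework]
    i, j = names.index(start), names.index(end)
    if i > j:
        raise ValueError("stages only flow from source to user")
    return names[i:j + 1]


def stages_concerned_with(framework, concern):
    """Return the names of the stages that list the given concern."""
    return [s.name for s in framework if concern in s.concerns]


if __name__ == "__main__":
    for name in path(FRAMEWORK, "Data Management",
                     "Knowledge Presentation and Visualization"):
        print(name)
```

Modelling the stages as plain data keeps the sketch neutral about tooling: any of the listed environments (RDBMS, NoSQL, Hadoop) could sit behind a stage without changing the structure.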
“We have an opportunity for everyone in the world to have access to all the world’s information. This has never before been possible. Why is ubiquitous information so profound? It is a tremendous equalizer. Information is power.” ERIC SCHMIDT (FORMER CEO OF GOOGLE)
30% of total R&D spend is wasted duplicating research and work previously done. Source: National Board of Patents and Registration (PRH), WIPO, IFA
54% of decisions are made with incomplete, inconsistent and inadequate information. Source: InfoCentric Research
46%: Workers can’t find the information they need almost half the time. Source: IDC
Knowledge Architecture: The Next Phase
Push versus Pull
WHAT COULD YOU ACCOMPLISH IF YOU COULD:
• Empower faster and more informed decision-making
• Leverage lessons of the past to minimize waste, rework, re-invention and redundancy
• Reduce the learning curve for new employees
• Enhance and extend existing content and document management systems
JSC Knowledge Architecture Services:
• Analytics
• Web Platform for Analysis and Visualization
• NoSQL - Neo4j and MongoDB
• Visualization Services - Business Intelligence
• Repository Specific Search
• Wiki Farm
• Code Sharing and Project Collaboration
• Training
Contact Information
David Meza – [email protected]
Twitter - @davidmeza1
LinkedIn - https://www.linkedin.com/pub/david-meza/16/543/50b
Github – davidmeza1
Blog – davidmeza1.github.io
QUESTIONS?