Creating Modern Metadata System with New Relic CREATING MODERN METADATA SYSTEMS Bill Sammons, Head of Content Enrichment, DowJones
Apr 16, 2017
November 17, 2016
CREATING MODERN METADATA SYSTEMS
Technology Stack Transformation
New Relic Insights for Classification Engine Transition
Content Pipeline Rebuild
• Ingestion & Enrichment Pipeline
• Legacy Architecture
  – Mainly Centralized Functionality
  – Monolithic
  – Impossible to scale efficiently
• Goals
  – Easy to scale – Expectations for Significant Growth
  – Reduce Data Center Footprint
  – Update technologies that had gone stale for a long period
Legacy Technology Stack
• Coding Language – C++
• Software Development – manual build, test; CVS
• System Resource Monitoring – Cacti
• Interface – HTTP
• Infrastructure – Physical Servers & Load Balancers in Corporate Data Centers
• Server Acquisition – Purchase
• OS – RHEL 4!
• Server Deployment – Sys Admins
• Log Collection – Splunk
• Escalations – Operations staff monitoring Splunk output
• Content Classification – SAP SDX
• Communications – emails/meetings
• Project Management – MS Project/Project Manager
Impossible Tasks now Possible!
• New Annotators – How do we apply new Metadata to an Archive of 1.5B Articles?
• Refresh annually – Even more challenging
• Reusability of full Content Pipeline for Consumer Business Purposes
Early Days with New Relic
• APM on Legacy Systems – modest value – C++ code base
• Alerts Integration with OpsGenie
• Built Plug-in to extract custom data from legacy code
• APM on Rearchitected Systems – increased value – Java code base
• Insights for Technology Purposes primarily
APM & OpsGenie
PRODUCTION RELEASE PERFORMANCE YTD
WEEKLY CHANGE IN PERFORMANCE - CLASSIFIER
Classification Engine Update
• Classification of documents
  – 3 Taxonomies – News Subjects, Industries, Regions
  – 1000s of Nodes
  – 7 Languages
• Key Component to Discovery and Organization in Products
• Very Different Technologies – Different Results Expected
Insights to the Rescue
• Business Partner ask:
  – Scores of spreadsheets
  – Static data
  – Compare old vs new
• New Relic Insights
  – A few dashboards
  – Dynamic Data
  – Drill through capabilities
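As a rough illustration of the "compare old vs new" ask, the sketch below diffs the codes that two classifiers assign to the same document. The class name, method names, and sample codes are all hypothetical; the slides do not show how the comparison was actually implemented.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: compares the codes two classifiers assigned to one document. */
final class ClassificationDiff {

    /** Codes the new classifier assigned that the old one did not. */
    static Set<String> added(Set<String> oldCodes, Set<String> newCodes) {
        Set<String> result = new TreeSet<>(newCodes);
        result.removeAll(oldCodes);
        return result;
    }

    /** Codes the old classifier assigned that the new one dropped. */
    static Set<String> lost(Set<String> oldCodes, Set<String> newCodes) {
        Set<String> result = new TreeSet<>(oldCodes);
        result.removeAll(newCodes);
        return result;
    }

    public static void main(String[] args) {
        // Made-up region codes, purely for illustration.
        Set<String> oldCodes = new TreeSet<>(Arrays.asList("usa", "uk"));
        Set<String> newCodes = new TreeSet<>(Arrays.asList("usa", "fra"));
        System.out.println("added=" + added(oldCodes, newCodes));
        System.out.println("lost=" + lost(oldCodes, newCodes));
    }
}
```

Per-document results like these could then be attached to a custom event as name/value pairs, so that a dashboard aggregates them instead of a spreadsheet.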
Code Simple!
// Declare map.
NewRelicInsightsParams = new ConcurrentHashMap<String, Object>();

// Populate it.
long mdp_queue_time = start.getTimeInMillis()
    - auditTrail.get(auditTrail.size() - 1).getAuditEntryCreatedTime()
        .toGregorianCalendar().getTimeInMillis();
long time_since_creation = start.getTimeInMillis()
    - auditTrail.get(0).getAuditEntryCreatedTime().toGregorianCalendar()
        .getTimeInMillis();
NewRelicInsightsParams.put("queue_time", mdp_queue_time);
NewRelicInsightsParams.put("time_since_creation", time_since_creation);
…

// Record custom event.
NewRelic.getAgent().getInsights().recordCustomEvent("MetadataPipelineComponent", NewRelicInsightsParams);
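The slide's snippet depends on pipeline types (`start`, `auditTrail`) that are not shown. A minimal self-contained sketch of the same idea follows, assuming the audit-trail timestamps are already available as epoch milliseconds, oldest first. The class name `PipelineEventAttributes` and its `build` method are hypothetical; only the attribute names and the `recordCustomEvent` call come from the slide.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper that builds the name/value pairs for one custom event. */
final class PipelineEventAttributes {

    /**
     * @param startMillis      when this component started processing the document
     * @param auditTimesMillis creation times of the audit-trail entries, oldest first
     */
    static Map<String, Object> build(long startMillis, List<Long> auditTimesMillis) {
        if (auditTimesMillis.isEmpty()) {
            throw new IllegalArgumentException("audit trail must not be empty");
        }
        long firstEntry = auditTimesMillis.get(0);
        long lastEntry = auditTimesMillis.get(auditTimesMillis.size() - 1);

        Map<String, Object> attrs = new ConcurrentHashMap<>();
        // Time spent waiting since the previous pipeline step recorded its entry.
        attrs.put("queue_time", startMillis - lastEntry);
        // Time elapsed since the document first entered the pipeline.
        attrs.put("time_since_creation", startMillis - firstEntry);
        return attrs;
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = build(10_000L, Arrays.asList(1_000L, 4_000L, 9_500L));
        System.out.println(attrs);
        // With the New Relic Java agent API on the classpath, the map would be sent as on the slide:
        // NewRelic.getAgent().getInsights().recordCustomEvent("MetadataPipelineComponent", attrs);
    }
}
```

Keeping the map construction separate from the agent call makes the computed values easy to unit test without a running agent.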
What Makes it Magic?
• Simple to code as we have seen – just Name/Value pairs in Map & Send
• Iterations of dashboards/NRQL incredibly fast
• NRQL – “SQL for Managers”
• Refresh rates on large datasets during drill downs very fast even on complex NRQL
• Ready to answer questions not yet asked
NRQL
• Looks a bit complex but tools and prediction make it easy

  SELECT
    filter(uniquecount(mpc_doc_hash),WHERE essex_product_effect!='None') AS '# Doc',
    percentage(uniquecount(mpc_doc_hash),WHERE essex_product_effect!='None') AS '% Doc',
    uniquecount(mpc_doc_hash) AS 'Total Doc',
    filter(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Added to Search') AS '# Add Search',
    percentage(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Added to Search') AS '% Add Search',
    filter(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Added to Nav') AS '# Add Nav',
    percentage(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Added to Nav') AS '% Add Nav',
    filter(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Added to Nav & Search') AS '# Add N&S',
    percentage(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Added to Nav & Search') AS '% Add N&S',
    filter(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Lost from Search') AS '# Loss Search',
    percentage(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Lost from Search') AS '% Loss Search',
    filter(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Lost from Nav') AS '# Loss Nav',
    percentage(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Lost from Nav') AS '% Loss Nav',
    filter(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Lost from Nav & Search') AS '# Loss N&S',
    percentage(uniquecount(mpc_doc_hash),WHERE essex_product_effect='Lost from Nav & Search') AS '% Loss N&S'
  from MetadataRegionCodes
  FACET code
  where environment='INT' and nr_ver=1
  since 1 week ago
  limit 1000
  where language in ('en', 'fr', 'de', 'ru', 'es', 'pt', 'it')
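The SELECT list above repeats one `filter(...)` / `percentage(...)` pair per effect value, so it could plausibly be generated rather than typed by hand. The sketch below shows one way to do that. The `EffectQueryBuilder` class and its methods are hypothetical and not from the slides; only the attribute names, effect values, and column labels are copied from the query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical generator for the repetitive SELECT list of the slide's query. */
final class EffectQueryBuilder {

    /** Builds one "# label" / "% label" column pair for a given condition. */
    static List<String> columnsFor(String condition, String label) {
        String where = "WHERE essex_product_effect" + condition;
        List<String> cols = new ArrayList<>();
        cols.add("filter(uniquecount(mpc_doc_hash)," + where + ") AS '# " + label + "'");
        cols.add("percentage(uniquecount(mpc_doc_hash)," + where + ") AS '% " + label + "'");
        return cols;
    }

    /** Returns "SELECT ... FROM eventType"; FACET, WHERE, SINCE and LIMIT are left to the caller. */
    static String build(Map<String, String> effectToLabel, String eventType) {
        List<String> cols = new ArrayList<>(columnsFor("!='None'", "Doc"));
        cols.add("uniquecount(mpc_doc_hash) AS 'Total Doc'");
        for (Map.Entry<String, String> e : effectToLabel.entrySet()) {
            cols.addAll(columnsFor("='" + e.getKey() + "'", e.getValue()));
        }
        return "SELECT " + String.join(", ", cols) + " FROM " + eventType;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the columns in insertion order.
        Map<String, String> effects = new LinkedHashMap<>();
        effects.put("Added to Search", "Add Search");
        effects.put("Lost from Search", "Loss Search");
        System.out.println(build(effects, "MetadataRegionCodes"));
    }
}
```

The remaining clauses from the slide (`FACET code`, the environment and language filters, `since 1 week ago`, `limit 1000`) would be appended to the returned string unchanged.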
Business Partner Feedback
• “New Relic Insights gives us the big picture – in near real-time!”
• “Instant Statistics! We’ve moved from a few static analyses of 100s of stories to 10s of thousands of stories every day with drill down capability”
• “We can now prioritize our work and it has become integral to our daily workflow”
• “New Relic Insights gives us vision into code competition that would have been nearly impossible in the past”
• “Insights gives us high confidence that we are delivering a quality solution to our customer in a highly complex problem space”
WHAT’S ON DECK?
• New Relic Insights Alerts
• New Relic Insights Dashboards w/Time Picker
• New Relic Infrastructure
• Docker
• Bringing New Relic to our Product Platform
Thank You!