In its OSDI 2006 Bigtable paper, Google states that "Bigtable depends on a cluster management system for scheduling jobs, managing resources on shared machines, dealing with machine failures, and monitoring machine status." Until recently, no such system existed for Apache Accumulo to rely upon. Apache Hadoop 2 introduced the YARN resource management system to the Hadoop ecosystem. This talk describes the benefits YARN can provide for Accumulo installations and how the Slider project (proposed for the Apache Incubator) makes it easier to deploy long-running applications on YARN. It covers the details of the Accumulo App Package for Slider, how to use Slider to deploy an Accumulo instance, and how instances can be actively managed by other applications such as Apache Ambari.
• Servers run YARN Node Managers (NMs)
• NMs heartbeat to the Resource Manager (RM)
• RM schedules work over the cluster
• RM allocates containers to apps
• NMs start containers
• NMs report container health
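The cycle above can be illustrated with a toy model. This is only a sketch: the `NodeManager` and `ResourceManager` classes below are hypothetical stand-ins written for this example, not the YARN API, and the "first node with enough free memory" placement rule is an assumption that is far simpler than a real YARN scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Hypothetical stand-in for a YARN Node Manager (not the real API)."""
    name: str
    free_mb: int
    containers: list = field(default_factory=list)

    def heartbeat(self):
        # Report free memory and running containers to the RM.
        return {"node": self.name, "free_mb": self.free_mb,
                "containers": list(self.containers)}

    def start_container(self, container_id, mb):
        # Launch the container and account for the memory it uses.
        self.free_mb -= mb
        self.containers.append(container_id)


class ResourceManager:
    """Hypothetical stand-in for the YARN Resource Manager."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.next_id = 0

    def allocate(self, app, mb):
        # Assumed policy: pick the first node whose heartbeat shows room.
        for node in self.nodes.values():
            if node.heartbeat()["free_mb"] >= mb:
                self.next_id += 1
                container_id = f"{app}_container_{self.next_id}"
                node.start_container(container_id, mb)
                return node.name, container_id
        return None  # no node has enough free memory


nodes = [NodeManager("nm1", 2048), NodeManager("nm2", 4096)]
rm = ResourceManager(nodes)
placements = [rm.allocate("accumulo", 1536) for _ in range(4)]
```

With these numbers the first three requests are placed and the fourth returns `None`, because neither node has 1536 MB left.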
• Do I need to re-write parts of my application?
• How do I package my application for YARN?
• How do I configure my application?
• How do I debug my application?
• Can I still manage my application?
• Can I monitor my application?
• Can I manage inter-/intra-application dependencies?
• How will the external clients communicate?
• What does it take to secure the application?
Apache Slider is a project in incubation at the Apache Software Foundation with the goal of making it possible and easy to deploy existing applications onto a YARN cluster.
• History
  – HBase on YARN (HOYA)
  – AccumuloProvider/HBaseProvider on YARN
  – Agent Provider + App Packages for Accumulo/HBase/Storm/…
Similar to any YARN application:
1. CLI starts an instance of the AM
2. AM requests containers
3. Containers activate with an Agent
4. Agent gets application definition
5. Agent registers with AM
6. AM issues commands
7. Agent reports back status
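The AM/Agent exchange in steps 2 to 7 can be sketched as a small simulation. Everything here is an assumption made for illustration: `AppMaster` and `Agent` are hypothetical classes, not Slider's code, and the role names, command names, and state names are placeholders rather than Slider's actual protocol. Step 4 (fetching the application definition) is omitted for brevity.

```python
class Agent:
    """Hypothetical agent running inside one container."""

    def __init__(self, container_id, role):
        self.container_id = container_id
        self.role = role
        self.state = "INIT"

    def execute(self, command):
        # Step 7: apply a command from the AM and report the resulting state.
        transitions = {"INSTALL": "INSTALLED", "START": "STARTED"}
        self.state = transitions[command]
        return {"container": self.container_id, "role": self.role,
                "state": self.state}


class AppMaster:
    """Hypothetical Application Master tracking registered agents."""

    def __init__(self, roles):
        self.roles = roles      # role name -> desired instance count
        self.agents = []
        self.status = {}        # container id -> last reported state

    def request_containers(self):
        # Step 2: ask for one container per desired role instance.
        requests = []
        for role, count in self.roles.items():
            for _ in range(count):
                requests.append((f"container_{len(requests) + 1}", role))
        return requests

    def register(self, agent):
        # Step 5: an agent announces itself to the AM.
        self.agents.append(agent)

    def run(self):
        # Step 6: issue each command to every agent and record the reports.
        for command in ("INSTALL", "START"):
            for agent in self.agents:
                report = agent.execute(command)
                self.status[report["container"]] = report["state"]


am = AppMaster({"master": 1, "tserver": 2})
for container_id, role in am.request_containers():
    am.register(Agent(container_id, role))  # step 3: container activates
am.run()
```

After `am.run()`, `am.status` maps each of the three container ids to `"STARTED"`.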
Goal is to have Slider integrate with any application management framework, e.g. Ambari
Apache Ambari is an open source framework for provisioning, managing and monitoring Apache Hadoop clusters
• Ambari Views allows development of custom user interfaces
• Slider App View will deploy, monitor, manage YARN apps using Slider
• A common problem (not specific to Slider): https://issues.apache.org/jira/browse/YARN-913
• Current
  – Apache Curator based
  – Register URLs pointing to actual data
  – AM doubles up as a webserver for published data
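The indirection in the current design (the registry holds only a URL, while the AM serves the data) can be sketched in memory. This is an illustration only: `Registry` and `AppMasterWeb` are hypothetical classes, no Curator or ZooKeeper calls are made, and the registry path layout, the URL scheme, and the sample host and port are all assumptions.

```python
class Registry:
    """Hypothetical in-memory stand-in for the Curator-based registry."""

    def __init__(self):
        self._entries = {}

    def register(self, path, url):
        # Only a pointer (URL) is stored, never the data itself.
        self._entries[path] = url

    def lookup(self, path):
        return self._entries.get(path)


class AppMasterWeb:
    """Hypothetical AM that also serves its own published data."""

    def __init__(self, base_url):
        self.base_url = base_url
        self._published = {}

    def publish(self, name, data, registry, instance):
        # Keep the data locally and register a URL that points back here.
        self._published[name] = data
        url = f"{self.base_url}/published/{name}"
        registry.register(f"/services/{instance}/{name}", url)
        return url

    def get(self, url):
        # Stand-in for an HTTP GET against the AM's web server.
        return self._published[url.rsplit("/", 1)[-1]]


registry = Registry()
am_web = AppMasterWeb("http://am.example.com:8080")  # assumed host and port
am_web.publish("site-config", {"instance.name": "demo"}, registry, "accumulo1")

# A client resolves the URL through the registry, then reads from the AM.
url = registry.lookup("/services/accumulo1/site-config")
config = am_web.get(url)
```

One consequence of this layout is that published data is only reachable while the AM is up, which is one way to read the motivation for a stand-alone registry.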
• Future
  – Registry should be stand-alone
  – Slider is a consumer as well as publisher
  – Slider focuses on declarative solution for Applications to publish data
  – Allows integration of Applications independent of how they are hosted