Taming the Elephant: Efficient and Effective Apache Hadoop Management
1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Paul Codding
2016 Hadoop Summit, Dublin, Ireland
Presenters
Paul Codding
Senior Product Manager, Cloud & Operations
Apache Ambari, SmartSense
Agenda
Introduction
Observations & Recommendations
– Observations from analyzing ~1000 customer bundles
– Common operational mistakes
SmartSense Architecture
Diagram: Ambari Agents on each Worker Node collect diagnostic data into a Bundle, which is shipped through the SmartSense Gateway Server to the Landing Zone for SmartSense Analytics.
Agenda
Introduction
Obligatory Poll
Observations & Recommendations
EVERY node counts… Common, difficult-to-diagnose issues
Operating System Configuration: Locale
/etc/localtime – dictates which timezone your machine (and the JDK) thinks it is in
Hive functions affected:
– unix_timestamp(…)
– current_date()
SELECT sum(amount) FROM sales WHERE sale_date > unix_timestamp('2016-03-01 00:00:00')
“default timezone and the default locale”
Inconsistent Locale Configuration
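A quick way to spot inconsistent timezone configuration is to check what /etc/localtime resolves to on each node. A minimal sketch for one host; the fleet-wide loop and host names are hypothetical:

```shell
# Resolve the timezone the OS (and therefore the JDK and Hive) will use.
tz_file=$(readlink -f /etc/localtime)   # e.g. /usr/share/zoneinfo/Etc/UTC
echo "localtime -> ${tz_file}"
# Fleet-wide spot check (hypothetical host list):
# for h in cp10005 cp10006 cp10007; do ssh "$h" readlink -f /etc/localtime; done
```

If the resolved zoneinfo paths differ between nodes, Hive's time functions will silently return different results depending on where the task ran.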
Operating System Configuration: Transparent Huge Pages (THP)
THP is an abstraction layer that automates creating, managing, and using huge pages
Pages == memory managed in blocks by the Linux kernel
Huge pages are pages that come in larger sizes, 2 MB–1 GB
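To see whether THP is active on a node, read its sysfs switch; the path below is the common RHEL/CentOS location and can differ by distro:

```shell
# The bracketed value is the active THP mode, e.g. [always] madvise never.
thp=/sys/kernel/mm/transparent_hugepage/enabled
state=$(cat "$thp" 2>/dev/null || echo "THP interface not present")
echo "$state"
# Common Hadoop guidance is to disable THP (requires root):
# echo never > "$thp"
```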
Operating System Configuration: NSCD/SSSD
Name Service Cache Daemon – caches results of:
– getpwnam
– getpwuid
– getgrnam
– getgrgid
– gethostbyname
cp10005.xxxxxx.com:1
cp10006.xxxxxx.com:5
cp10007.xxxxxx.com:1
cp10008.xxxxxx.com:0
cp10009.xxxxxx.com:1
cp10010.xxxxxx.com:3
cp10011.xxxxxx.com:0
cp10012.xxxxxx.com:1
cp10013.xxxxxx.com:0
cp10014.xxxxxx.com:2
cp10015.xxxxxx.com:0

cp10005.xxxxxx.com:0
cp10006.xxxxxx.com:0
cp10007.xxxxxx.com:0
cp10008.xxxxxx.com:0
cp10009.xxxxxx.com:0
cp10010.xxxxxx.com:0
cp10011.xxxxxx.com:0
cp10012.xxxxxx.com:0
cp10013.xxxxxx.com:0
cp10014.xxxxxx.com:0
cp10015.xxxxxx.com:0
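The effect of caching is easy to demonstrate with getent, which exercises the same getpw*/getgr* lookups the daemons cache. A sketch:

```shell
# The first lookup may go to LDAP/AD; with nscd or sssd, repeats are served locally.
getent passwd root >/dev/null          # warm the cache
time getent passwd root                # should be near-instant when cached
# Confirm the cache daemon is actually running (distro-dependent):
# systemctl status nscd    # or: systemctl status sssd
```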
Operating System Configuration: NTPD
Network Time Protocol daemon
2016-03-31 18:40:28,585 FATAL [regionserver/ip-10-0-x-x.ec2.internal/10.0.x.x:16020] regionserver.HRegionServer: Master rejected startup because clock is out of sync
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server ip-10-0-x-x.ec2.internal,16020,1459449626477 has been rejected; Reported time is too far out of sync with master. Time difference of 74097ms > max allowed of 30000ms
$ kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-HDP1@HORTONWORKS.LOCAL
kinit: Clock skew too great while getting initial credentials
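Clock skew can be caught before HBase or Kerberos reject a node. This sketch tries the classic ntpq and falls back to chrony, since either daemon may be in use:

```shell
# Show peer offsets so drift is visible before services complain.
if command -v ntpq >/dev/null 2>&1; then
  ntpq -p
elif command -v chronyc >/dev/null 2>&1; then
  chronyc tracking
else
  echo "no NTP client installed; clocks may silently drift"
fi
```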
Operating System: Legacy Kernel Issues
Specific NICs & kernel versions:
– Broadcom bnx2x module prior to RHEL 5.7 (kernel earlier than 2.6.18-274.el5)
– QLogic NetXen netxen_nic module prior to RHEL 5.9 (kernel earlier than 2.6.18-348.el5)
– Intel 10Gbps ixgbe module prior to RHEL 6.4 (kernel earlier than 2.6.32-358.el6)
– Intel 10Gbps ixgbe module from RHEL 5.6 (kernel version 2.6.18-238.el5 and later)
Symptoms:
– NFS transfers over 10Gbps links only transfer at ~100MiB/sec (i.e. 1Gbps)
– TCP connections never reach anywhere near wire speed
– TCP window size reduced to 720 bytes
Workaround – disable the NIC receive offloads:
– large-receive-offload (LRO)
– generic-receive-offload (GRO)
RHEL Knowledgebase Solution: 20278
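Whether a node is exposed can be checked with uname and ethtool. The ethtool lines are commented because the interface name (eth0) is a placeholder and toggling offloads needs root:

```shell
# Affected kernels are identifiable from the running version string.
uname -r
# Inspect receive-offload state on the NIC:
# ethtool -k eth0 | grep -E 'generic-receive-offload|large-receive-offload'
# Workaround per the RHEL knowledgebase: disable both offloads.
# ethtool -K eth0 gro off lro off
```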
HDFS: NameNode Group Mapping Lookup Implementations
org.apache.hadoop.security.ShellBasedUnixGroupsMapping
org.apache.hadoop.security.LdapGroupsMapping
org.apache.hadoop.security.CompositeGroupsMapping
org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback
hadoop.security.group.mapping
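ShellBasedUnixGroupsMapping effectively shells out to `id` on the NameNode host, so you can reproduce what it would resolve for a local user; the commented `hdfs groups` call (which needs a configured HDFS client) shows the NameNode's actual view:

```shell
# What shell-based group mapping resolves for a user on this host:
id -Gn root
# What the NameNode itself resolves for the same user:
# hdfs groups root
```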
HDFS: NameNode Metadata Directories
Multiple entries – each directory gets a replica of the fsimage data
Very common “second directory” is an NFS mount
soft mount vs hard mount
dfs.namenode.name.dir
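If the second dfs.namenode.name.dir entry is on NFS, a soft mount with short timeouts keeps a hung NFS server from blocking the NameNode. A sketch; the server path, mount point, and option values are illustrative:

```shell
# Check how current NFS mounts are configured:
grep nfs /proc/mounts || echo "no NFS mounts on this host"
# fstab-style soft-mount entry for the NameNode metadata copy:
# nfsserver:/nn_meta  /mnt/nn_meta  nfs  soft,timeo=10,retrans=2  0 0
```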
HDFS: NameNode Handler Count
Math.log(${currentDataNodeCount}) * 20
10 node cluster – 46
100 node cluster – 92
1000 node cluster – 138
dfs.namenode.handler.count
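The slide's figures fall out of the formula directly. A sketch that computes the recommendation; python3 is used only for the natural log:

```shell
# dfs.namenode.handler.count ≈ ln(number of DataNodes) * 20
nn_handlers() {
  python3 -c "import math, sys; print(round(math.log(int(sys.argv[1])) * 20))" "$1"
}
nn_handlers 10     # -> 46
nn_handlers 100    # -> 92
nn_handlers 1000   # -> 138
```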
HDFS: HA Retry Policy
When the primary NameNode is killed, clients can retry it for up to 10 minutes instead of failing over
dfs.client.retry.policy.enabled = true
HDFS: DataNode Failed Volumes
dmesg smartctl
dfs.datanode.failed.volumes.tolerated
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed: read failure  20%        717
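Both signals on the slide can be pulled on demand. The smartctl call is commented because it needs smartmontools and root, and /dev/sda is a placeholder device:

```shell
# Kernel-level disk errors (the ata1.00 IDENTIFY failure shows up here):
dmesg 2>/dev/null | grep -i "i/o error" || echo "no I/O errors in kernel log"
# SMART self-test history for a suspect disk:
# smartctl -l selftest /dev/sda
```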
HDFS: DataNode
Default: 4096
Increase depends on other services deployed in the cluster and workload type
dfs.datanode.max.transfer.threads
YARN: ResourceManager Min/Max Container Size Allocation
yarn.scheduler.minimum-allocation-mb & yarn.scheduler.maximum-allocation-mb
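YARN normalizes every container request up to a multiple of the minimum allocation, so a poorly chosen minimum wastes memory on every container. A sketch of the rounding with illustrative sizes:

```shell
# A 5000 MB request with yarn.scheduler.minimum-allocation-mb=1024 is granted
# the next multiple of the minimum: ceil(5000/1024)*1024 = 5120 MB.
req=5000; min=1024
granted=$(( (req + min - 1) / min * min ))
echo "granted ${granted} MB"   # -> granted 5120 MB
```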
YARN: NodeManager Memory
yarn.nodemanager.resource.memory-mb
Diagram: node RAM divided among the Operating System, DataNode, Region Server, and NodeManager.
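Sizing yarn.nodemanager.resource.memory-mb is simple subtraction: what remains of node RAM after co-located services take their share. All figures here are illustrative, not recommendations:

```shell
# Budget NodeManager memory as what's left after co-located services.
total_mb=131072        # 128 GB worker node
os_mb=8192             # OS + daemon headroom
datanode_mb=4096       # DataNode heap
regionserver_mb=16384  # HBase RegionServer heap
nm_mb=$(( total_mb - os_mb - datanode_mb - regionserver_mb ))
echo "yarn.nodemanager.resource.memory-mb=${nm_mb}"   # -> 102400
```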
YARN: NodeManager Local Directories
yarn.nodemanager.local-dirs
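A healthy setting lists one directory per physical data disk so container spill and shuffle I/O spreads across all spindles. A sketch with hypothetical mount points:

```shell
# Build a comma-separated list, one entry per data disk (/grid/0 .. /grid/3).
dirs=$(printf '/grid/%d/hadoop/yarn/local,' 0 1 2 3)
echo "yarn.nodemanager.local-dirs=${dirs%,}"
```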
YARN ATS: Rolling LevelDB Timeline store
org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore
org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore
yarn.timeline-service.store-class
YARN ATS: TTL
yarn.timeline-service.ttl-enable & yarn.timeline-service.ttl-ms
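Since yarn.timeline-service.ttl-ms is in milliseconds, it is worth computing rather than guessing. For example, a 7-day retention (the retention window itself is illustrative):

```shell
# Convert a retention window in days to the millisecond value ATS expects.
days=7
ttl_ms=$(( days * 24 * 60 * 60 * 1000 ))
echo "yarn.timeline-service.ttl-ms=${ttl_ms}"   # -> 604800000
```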
Agenda
Introduction
Obligatory Poll
Observations & Recommendations
Summary
SmartSense Recommendations
We’ve covered 16 of ~250 rules
Built into Support Case close/Sev1 postmortem process
Onramp into core products and Apache Ambari
– Stack Advisor
– New Defaults
– New Alerts
hbase_tcp_nodelay
hdfs_check_point_period
hdfs_dn_suboptimal_mounts
hdfs_dn_volume_tolerance
hdfs_enable_security_check
hdfs_mount_options
hdfs_nn_checkpoint_txns
hdfs_nn_handler_count
hdfs_nn_protect_imp_dirs
hdfs_nn_soft_mount
hdfs_nn_super_user_group
hdfs_short_circuit
hive_enable_cbo
hive_vectorized_exec
jvm_opts
mr_min_split_size
mr_reduce_parallel_copies
mr_slow_start
os_cpu_scaling
os_ssd_tuning
tez_enable_reuse
tez_session_release_delay
tez_shuffle_buffer
yarn_ats_security
yarn_nm_black_listed_mount_logdir
Bundle Security
All Bundles are:
• Encrypted and anonymized by default
Configurable options to:
• Exclude properties within specific Hadoop configuration files
• Global REGEX replacements across all configuration, metrics, and logs
By default:
• Ambari clear text passwords are not collected
• Hive and Oozie database properties are not collected
• All IP addresses and host names are anonymized
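The global regex replacement works like an ordinary stream substitution. This sed sketch mimics the idea of masking IPv4 addresses; the pattern is illustrative, not SmartSense's actual rule:

```shell
# Anonymize anything that looks like an IPv4 address in a log line.
echo "namenode nn1.prod.example.com at 10.0.1.15" \
  | sed -E 's/[0-9]{1,3}(\.[0-9]{1,3}){3}/x.x.x.x/g'
# -> namenode nn1.prod.example.com at x.x.x.x
```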
SmartSense Stack Support
SmartSense 1.x supports HDP 2.0 – HDP 2.4:
– Ambari 2.2 – Built-In!
– Ambari 2.1 – Plug-In
– Ambari 2.0 – Plug-In
– Ambari 1.7 / Ambari 1.6 – SmartSense 1.x