Technical Report
NetApp E-Series E2800 and Splunk with SANtricity System Manager 11.30 Stephen Carl, NetApp
September 2016 | TR-4555
Abstract
This technical report describes the integrated architecture of the NetApp® E-Series 2800 all-
flash or hybrid storage system and Splunk design. Optimized for node storage balance,
reliability, performance, storage capacity, and density, this design employs the Splunk
clustered index node model, with higher scalability and lower TCO. This document
summarizes the performance test results obtained from a Splunk machine log event simulation tool.
Contents
2 Splunk Use Cases
2.1 Use Cases
4 NetApp and E-Series Testing
4.1 Overview of Splunk Cluster Testing Used for E-Series Compared to Commodity Server DAS
4.2 Eventgen Data
4.3 Cluster Replication and Searchable Copies Factor
4.4 Commodity Server with Internal DAS Baseline Test Setup
4.5 E-Series with DDP Baseline Test Setup
4.6 Baseline Test Results for E-Series Compared to Commodity Servers with Internal DAS
4.7 Search Results for Baseline Tests
Splunk Apps for NetApp
Splunk Cluster Indexer Rate from Distributed Management Console
Splunk Cluster Server Information
Splunk Cluster Indexer Bucket Information from Distributed Management Console
Version History
List of Figures
Figure 3) Distribution of data in a five-node Splunk cluster.
Figure 6) Managing a mixed-array environment with SANtricity Storage Manager and System Manager.
Figure 7) System Manager home page.
Figure 8) Dynamic Disk Pools components.
Figure 9) Dynamic Disk Pools drive failure.
Figure 10) Performance of the E2800.
Figure 11) Commodity server Splunk cluster with DAS.
Figure 12) Splunk cluster with E-Series DDP.
Figure 13) Index peer node ingest rates.
Figure 20) indexes.conf to limit maximum warm buckets.
4-port optical HIC (SFP+), which can be configured as either 16Gb Fibre Channel or 10Gb iSCSI
2-port optical HIC (SFP+), which can be configured as either 16Gb Fibre Channel or 10Gb iSCSI
Note: A software feature pack can be applied in the field to change the host protocol of the optical baseboard ports and the optical HIC ports from FC to iSCSI or from iSCSI to FC.
2-port 10Gb iSCSI (Cat6e/Cat7 RJ45)
Note: If the base ports on the controller are configured as 10Gb iSCSI RJ-45, then the only HIC option supported is the 2-port 10Gb iSCSI (Cat6e/Cat7 RJ45).
For optical connections, the appropriate SFPs must be ordered for the specific implementation. Consult the Hardware Universe for a full listing of available host interface equipment.
For detailed instructions on changing the host protocol, go to the Upgrading > Hardware Upgrade section
at https://mysupport.netapp.com/eseries.
Table 2) Supported drive types in SAS 3 enclosures.
Shelf     NL-SAS                SAS                   SSD
DE212C    4TB, 6TB, 8TB, 10TB   -                     800GB, 1.6TB, 3.2TB
DE224C    -                     900GB, 1.2TB, 1.8TB   800GB, 1.6TB, 3.2TB
The E2800 controller shelf supports 12 or 24 drives, depending on the shelf model (DE212C or DE224C, respectively), but the system capacity can be further expanded by attaching expansion drive shelves to the controller shelf. The E2800 supports up to 4 total shelves (the controller shelf plus 3 expansion drive shelves) for a maximum of 180 HDDs or 120 SSDs. Drive shelf options are shown in Table 3.
Table 3) Drive shelf options for E2800.
Property          DE212C                  DE224C     DE1600        DE5600     DE6600
Form Factor       2U                      2U         2U            2U         4U
Drive Size        3.5”, 2.5” (w/bracket)  2.5”       3.5”          2.5”       3.5”, 2.5” (w/bracket)
Drive Types       NL-SAS, SSD             SAS, SSD   NL-SAS, SSD   SAS, SSD   NL-SAS, SSD
Total Drives      12                      24         12            24         60
Drive Interface   12Gb SAS                12Gb SAS   6Gb SAS       6Gb SAS    6Gb SAS
Note: DE1600, DE5600, and DE6600 are supported only as part of in-place data migration from E2700/E5400/E5500/E5600 to E2800. For information on the hardware used in previous NetApp E-Series and Splunk testing, see TR-4460.
3.2 SANtricity
E2800 systems are managed by the SANtricity System Manager browser-based application. The E2800
controller and SANtricity 11.30 form a milestone release for E-Series because together they put into production a new architecture for both controller firmware and management software. In particular, SANtricity System Manager 11.30 is embedded on the controller.
The major components of the SANtricity storage management software are still used with E2800-based storage arrays, so the installation flow is similar. The only component that is no longer used with E2800-based storage arrays is the Array Management Window, which has been replaced by the embedded browser-based System Manager. For more details and information on the E-Series E2800 storage system and SANtricity 11.30, see TR-4538.
3.3 Overview
SANtricity System Manager provides embedded management software, web services, event monitoring,
and AutoSupport for the E2800 controller. Previous controllers such as the E2700, E5600, and EF560 do
not have this embedded functionality. Because you might have a mixed environment with both the new
E2800 storage array and older storage arrays, there are a variety of management options. Figure 6 shows a graphical representation of the new landscape and where the different management functions reside.
Figure 6) Managing a mixed-array environment with SANtricity Storage Manager and System Manager.
For a detailed description of installing and configuring the components you choose, refer to the
appropriate Power Guides for deployment.
3.4 System Manager Navigation
After you log in to System Manager, the home page is displayed, as shown in Figure 7.
The icons on the left of the home page are used to navigate through the System Manager pages and are available on all pages. The icon text labels can be toggled on and off.
The items on the top right of the page (Preferences, Help, Log Out) are also available at any location in System Manager.
Highlighted on the bottom right corner is the drop-down-style menu used extensively in System Manager.
Dynamic RAID Migration allows the RAID level of a particular volume group to be modified online (for example, from RAID 10 to RAID 5) if new requirements dictate a change.
Flexible cache block and segment sizes allow optimized performance tuning based on a particular workload. Both items can also be modified online.
There is built-in performance monitoring of all major storage components, including controllers, volumes, volume groups, pools, and individual disk drives.
Automated remote connection to the NetApp AutoSupport function provides “phone home” capabilities and automated parts dispatch if a component fails.
The E2800 has path failover and load-balancing (if applicable) between the host and the redundant storage controllers.
You gain the ability to manage and monitor multiple E-Series storage systems from the same management interface.
Dynamic Disk Pools
With seven patents pending, the DDP feature dynamically distributes data, spare capacity, and protection information across a pool of disk drives. A pool can range in size from a minimum of 11 drives to all the drives in an E2800 or other E-Series storage system. In addition to creating a single DDP, storage administrators can opt to create traditional volume groups in conjunction with a single DDP, or even multiple DDPs, which offers an unprecedented level of flexibility.
Dynamic Disk Pools are composed of several lower-level elements. The first of these is known as a D-piece. A D-piece consists of a contiguous 512MB section from a physical disk that contains 4,096 128KB segments. Within a pool, an intelligent optimization algorithm selects 10 D-pieces from drives in the pool. Together, the 10 associated D-pieces are considered a D-stripe, which provides 4GB of usable capacity. Within the D-stripe, the contents are laid out like a RAID 6 8+2 scenario: 8 of the underlying segments potentially contain user data, 1 segment contains parity (P) information calculated from the user data segments, and the final segment contains the Q value as defined by RAID 6.
Volumes are then created from an aggregation of multiple 4GB D-stripes as required to satisfy the
defined volume size up to the maximum allowable volume size within a DDP. Figure 8 shows the
relationship between these data structures.
Figure 8) Dynamic Disk Pools components.
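The capacity arithmetic behind these structures can be made explicit using the figures above:

    D-piece capacity         = 4,096 segments x 128KB   = 512MB
    D-stripe raw capacity    = 10 D-pieces x 512MB      = 5GB
    D-stripe usable capacity = 8 data D-pieces x 512MB  = 4GB (the remaining 2 D-pieces hold P and Q)

In other words, each D-stripe dedicates 20% of its raw capacity to protection information, matching the RAID 6 8+2 layout described above.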
Another major benefit of a DDP is that, rather than using dedicated, stranded hot spares, the pool contains integrated preservation capacity to provide rebuild locations for potential drive failures. This benefit simplifies management, because you no longer need to plan or manage individual hot spares. The capability also greatly reduces rebuild times, when rebuilds are required, and enhances the performance of the volumes during a rebuild, compared with rebuilds onto traditional hot spares.
When a drive in a DDP fails, the D-pieces from the failed drive are reconstructed to potentially all other
drives in the pool using the same mechanism normally used by RAID 6. During this process, an algorithm
internal to the controller framework verifies that no single drive contains two D-pieces from the same D-
stripe. The individual D-pieces are reconstructed at the lowest available LBA range on the selected disk
drive.
Figure 9) Dynamic Disk Pools drive failure.
In Figure 9, above, disk drive 6 (D6) has failed. The D-pieces that previously resided on that disk are then recreated simultaneously across several other drives in the pool. Because multiple disks participate in the effort, the overall performance impact of this situation is lessened, and the length of time needed to complete the operation is dramatically reduced.
In the event of multiple disk failures within a DDP, priority reconstruction is given to any D-stripes that are
missing two D-pieces to minimize any data availability risk. After those critically affected D-stripes are
reconstructed, the remainder of the necessary data continues to be reconstructed.
From a controller resource allocation perspective, there are two reconstruction priorities within a DDP that
the user can modify:
The degraded reconstruction priority is assigned for instances in which only a single D-piece must be rebuilt for the affected D-stripes; the default for this is high.
The critical reconstruction priority is assigned for instances in which a D-stripe has two missing D-pieces that need to be rebuilt; the default for this is highest.
For very large disk pools with two simultaneous disk failures, only a relatively small number of D-stripes
are likely to encounter the critical situation in which two D-pieces must be reconstructed. As discussed
previously, these critical D-pieces are identified and reconstructed initially at the highest priority. This
process returns the DDP to a degraded state very quickly so that further drive failures can be tolerated.
In addition to the improvement in rebuild times and superior data protection, DDP can also greatly
improve the performance of the base volume when under a failure condition compared with the
performance of traditional volume groups.
3.5 Performance
An E2800 configured with all SSDs, all HDDs, or a mixture of both drive types is capable of performing at very high levels, both in input/output operations per second (IOPS) and in throughput, while still providing extremely low latency.
The ingested machine log data was created by using the Splunk workload tool eventgen. The cluster had eight index peer nodes, each ingesting ~125GB of simulated machine syslog data per day, for a total of ~1TB per day for the entire cluster.
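Where Splunk stores its buckets, and how many warm buckets it keeps before rolling them to cold, is controlled in indexes.conf on each index peer node. The following stanza is a minimal illustrative sketch only: the index name and mount points are placeholders rather than the paths used in this testing, and the warm-bucket cap shown is the kind of tuning referenced in Figure 20.

[eventgen_index]
# Hot/warm, cold, and thawed buckets on an E-Series DDP volume (paths are placeholders)
homePath = /mnt/eseries_ddp/splunk/eventgen_index/db
coldPath = /mnt/eseries_ddp/splunk/eventgen_index/colddb
thawedPath = /mnt/eseries_ddp/splunk/eventgen_index/thaweddb
# Limit the number of warm buckets kept in homePath before they roll to cold
maxWarmDBCount = 300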
4.1 Overview of Splunk Cluster Testing Used for E-Series Compared to Commodity Server DAS
The Splunk cluster configuration components consist of the following (a configuration sketch follows the list):
Forwarders—Ingest 125GB of machine log data files into the cluster of index node peers.
Index peer nodes—Index the ingested machine syslog data and replicate data copies in the cluster.
Search head—Execute custom searches for dense, very dense, rare, and very rare data from the cluster of index peer nodes.
Master—Monitor and push configuration management changes for the cluster; also serves as the license master for the 1TB-per-day ingest of the eight index peer node cluster.
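These roles are set in each instance's server.conf. The stanzas below are a minimal sketch of how such a cluster is typically wired together; the hostnames, shared secret, and the replication and search factors shown are placeholders, not the exact values used in this testing.

# Master node server.conf
[clustering]
mode = master
replication_factor = 2
search_factor = 2
pass4SymmKey = <cluster_secret>

# Index peer node server.conf
[replication_port://9887]

[clustering]
mode = slave
master_uri = https://<master_host>:8089
pass4SymmKey = <cluster_secret>

# Search head server.conf
[clustering]
mode = searchhead
master_uri = https://<master_host>:8089
pass4SymmKey = <cluster_secret>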
4.2 Eventgen Data
The machine log dataset was created with Splunk's event generator, eventgen. The Splunk event generator is a downloadable Splunk app available from the Splunk website. Splunk eventgen enables users to load samples of log files or exported .csv files as an event template. The templates can then be used to create artificial log events with simulated timestamps. A user can modify the field values and configure the random variance while preserving the structure of the events. The data templates can be looped to provide a continuous stream of real-time data. For more information, see the Splunk eventgen app on the Splunk website.
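As an illustration only, an eventgen.conf stanza takes roughly the following form; the sample name, output file, and timing values here are assumptions for the sketch, not the configuration used in this report.

# Sample stanza: emit batches of events from a syslog template with fresh timestamps
[sample.syslog]
mode = sample
interval = 3
count = 1000
outputMode = file
fileName = /opt/splunk/var/log/eventgen_syslog.log

# Rewrite the leading timestamp of each sampled line with a simulated one
token.0.token = \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}
token.0.replacementType = timestamp
token.0.replacement = %Y-%m-%d %H:%M:%S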
For our testing, eventgen was loaded into the cluster and configured to produce a 125GB simulated syslog-type file for each Splunk forwarder instance. The file was then split into smaller syslog files on each of eight individual Splunk heavy forwarder instances, with each forwarder ingesting data over a one-to-one data path to one of the eight index peer nodes. The total ingested data was ~1TB per day loaded into the cluster for each simulated daily index.
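The one-to-one data path from each heavy forwarder to its index peer node can be pinned in the forwarder's outputs.conf. A minimal sketch with placeholder hostnames follows; the actual hosts and ports from the test rig are not shown in this report.

# outputs.conf on heavy forwarder N, sending only to index peer N
[tcpout]
defaultGroup = peer_n

[tcpout:peer_n]
server = <index_peer_n>:9997
useACK = true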
Following are the numbers of rare and dense search terms per 10,000,000 lines (example search forms follow the list):
Very Dense Search—1 out of 100 lines; 100,000 occurrences
Dense Search—1 out of 1,000 lines; 10,000 occurrences
Rare Search—1 out of 1,000,000 lines; 10 occurrences
Very Rare Search—1 out of 10,000,000 lines; 1 occurrence
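For example, these categories translate into searches of the following form, run from the search head; the index name and search terms are placeholders, not the literal terms used in the tests.

Very dense search (~1 match per 100 lines):
index=eventgen_index sourcetype=syslog "<very_dense_term>" | stats count

Very rare search (~1 match per 10,000,000 lines):
index=eventgen_index sourcetype=syslog "<very_rare_term>" | stats count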
Refer to the Interoperability Matrix Tool (IMT) on the NetApp Support site to validate that the exact product and feature versions described in this document are supported for your specific environment. The NetApp IMT defines the product components and versions that can be used to construct configurations that are supported by NetApp. Specific results depend on each customer's installation in accordance with published specifications.
Trademark Information
NetApp, the NetApp logo, Go Further, Faster, AltaVault, ASUP, AutoSupport, Campaign Express, Cloud ONTAP, Clustered Data ONTAP, Customer Fitness, Data ONTAP, DataMotion, Fitness, Flash Accel, and the other marks listed at http://www.netapp.com/us/legal/netapptmlist.aspx are trademarks or registered trademarks of NetApp, Inc., in the United States and/or other countries.
Software derived from copyrighted NetApp material is subject to the following license and disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice. NetApp assumes no responsibility or liability arising from the use of products described herein, except as expressly agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp.
The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.
RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).