Reference Architecture: Splunk Enterprise with ThinkSystem Servers Describes reference architecture for Splunk Enterprise Contains sizing recommendations Includes four different deployment models from department to large enterprise Contains detailed bill of materials for Lenovo servers and networking Mike Perks Kenny Bain Last update: 30 July 2018 Version 1.0
1 Reference Architecture: Splunk Enterprise with ThinkSystem Servers version 1.0
1 Introduction This document describes the reference architecture for Splunk Enterprise using Lenovo® ThinkSystem servers and networking. The intended audience of this document is IT professionals, technical architects, sales engineers, and consultants to assist in planning, designing, and implementing Splunk Enterprise 7.1.1.
This document provides an overview of the business problem and business value that is addressed by Splunk Enterprise. A description of customer requirements is followed by an architectural overview of the solution and a description of the logical components. The operational model describes the recommended operational architecture of Splunk Enterprise and four different deployment scenarios using Lenovo ThinkSystem servers and network switches. The appendix features detailed Bill of Materials configurations that are used in the solution.
2 Business problem and business value The following section provides a summary of the business problems that this reference architecture is intended to help address, and the value that this solution can provide.
2.1 Business problem The advent of mobile data, social streams, clouds and interconnected everything signifies the "Transformation of Information" with a huge shift in data usage. It delivers on the promise of analysis of big data to identify patterns in statistical populations vs. traditional reliance on data modeling tools, queries, spreadsheet dashboards and charts.
Global enterprises are under competitive pressure to expand into new markets, to find clients and build customer loyalty. To yield real-time insights, they now leverage technology to sift through their data instantaneously rather than relying on after-the-fact data processing on a monthly, quarterly, or yearly basis, which typically results in a potential loss of competitive advantage. Agility, security, cost-effectiveness, flexibility and efficiency are key priorities for their IT. Picture a bank sifting through its enormous data to recognize fraud with a response time of a few microseconds during an ATM transaction, or an auto insurer receiving real-time updates on driving habits from sensors installed in clients’ vehicles.
While customers are faced with many business challenges, this solution highlights two specific Big Data challenges that represent significant opportunities. The first challenge focuses on real-time identification and mitigation of advanced organizational security threats to the Enterprise by leveraging vigilant analysis and response capabilities. The second challenge is highlighted by the complexity of managing the abundance of systems prevalent in a data center, and ensuring high performance and availability of these systems, daily.
2.1.1 Vigilant enterprise security intelligence Organizational security threats are no longer just a story line for spy thrillers. Global newsfeeds abound daily with reports of compromised websites, stolen credit card data, abnormal HTTP traffic, financial fraud, and malware. Detecting advanced enterprise security threats requires a new approach, enabled by a smart and scalable security intelligence platform (SIP). A SIP makes any data security-relevant, scales to tens of terabytes of data per day, and provides real-time analysis and response capabilities.
2.1.2 Operations analysis of machine data in data centers Efficiently managing the abundance of systems deployed in a typical data center is an extremely complex effort. On a daily basis, several systems experience outages, performance issues, or missed SLAs. To ensure high performance and availability, enterprise IT administration teams spend valuable resources accessing several management consoles and running home-grown scripts to serially trace the data they need from failed systems. This is machine data, a form of big data.
2.2 Business value Splunk Enterprise provides an end-to-end, real-time solution for both of these business problems by delivering the following core capabilities:
• Universal collection and indexing of machine data and security data, from virtually any source
• Powerful search processing language (SPL) to search and analyze real-time and historical data
• Real-time monitoring for patterns and thresholds; real-time alerts when specific conditions arise
• Powerful reporting and analysis
• Custom dashboards and views for different roles
• Resilience and horizontal scalability
• Granular role-based security and access controls
• Support for multi-tenancy and flexible, distributed deployments on-premises or in the cloud
• Robust, flexible platform for big data apps
In addition, the Lenovo XClarity Administrator App for Splunk enables collection, visual representation, and analysis of Lenovo hardware events from the Splunk platform. Here are some examples of the critical insights that can be gained from the XClarity Administrator App for Splunk:
• The volume and types of events generated over time from all monitored hardware. This will help administrators quickly identify problem hardware and take actions.
• Percentage of total events being surfaced by each end point type such as the chassis management module (CMM), switch module, server, etc.
• Number of times a power threshold has been exceeded for any XClarity-managed resource, over time. This can help identify environmental issues in the data center. If exceeding a power threshold caused power capping, this could also explain performance slowdowns.
• Number of user accounts that were created on XClarity instances over time. Spikes in the number of new accounts could help identify uncommon security activities for audit purposes.
• User IDs that attempted to authenticate to XClarity, but failed. Seeing which unauthorized user IDs were used to attempt access would be useful in system audits.
• Number of login attempts made outside of normal business hours. This may help identify uncommon user account activity, like a large number of login attempts in the middle of the night or on a weekend.
3 Requirements This section describes the functional and non-functional requirements for this reference architecture.
3.1 Functional requirements The key functional requirements for the Splunk Enterprise solution include:
• Support for collecting, indexing and searching data
• Support for real-time processing of data
• Support for a variety of data and data types, including security data and machine data
• Support for large volumes of data
In addressing the functional requirements, the reference architecture and sizing for the Splunk Enterprise solution must consider the following data requirements:
• The amount of incoming data.
• The amount of indexed data in the datastore.
• Data placement in relevant storage tiers (in accordance with Splunk Indexer Data Retirement & Archiving Policies).
• Data indexing performance is influenced by the choices of searches, and number of concurrent users.
• Deployment and execution of Splunk ecosystem applications such as Lenovo XClarity App for Splunk and Splunk App for Enterprise Security.
• Required storage IO capabilities of high performance, scalability, and availability to support the creation of extremely large, compressed data indexes, and offer the ability to run Storage IO-intensive sparse searches against this data.
3.2 Non-functional requirements The key non-functional requirement is to provide superior performance with both indexing data and searching data. The following shows the minimum performance requirements for Splunk Enterprise:
• Minimum performance for each Indexing Server
  o Up to 5.8 megabytes per second (or 500 GB per day) of raw indexing performance, provided no other Splunk activity is occurring.
• Minimum performance for each Search Server
  o Up to 50,000 events per second for dense searches
  o Up to 5,000 events per second for sparse searches
  o Up to 2 seconds per index bucket for super-sparse searches
  o From 10 to 50 buckets per second for rare searches with bloom filters
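The indexing and search figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming decimal units (1 GB = 1000 MB) and a full 86,400-second day; the function names are illustrative and are not part of any Splunk tooling.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def mb_per_sec_to_gb_per_day(mb_per_sec):
    """Convert a sustained rate in MB/s to a daily volume in GB (decimal units assumed)."""
    return mb_per_sec * SECONDS_PER_DAY / 1000


def gb_per_day_to_mb_per_sec(gb_per_day):
    """Convert a daily volume in GB to the equivalent sustained rate in MB/s."""
    return gb_per_day * 1000 / SECONDS_PER_DAY


def scan_seconds(event_count, events_per_sec):
    """Rough time to scan a number of events at a given search rate."""
    return event_count / events_per_sec


# 5.8 MB/s sustained for a day is roughly 500 GB
print(f"{mb_per_sec_to_gb_per_day(5.8):.1f} GB/day")
# One million events at the dense and sparse search rates listed above
print(f"dense:  {scan_seconds(1_000_000, 50_000):.0f} s")
print(f"sparse: {scan_seconds(1_000_000, 5_000):.0f} s")
```

Under these assumptions, 5.8 MB/s works out to about 501 GB per day, which is consistent with the rounded 500 GB per day quoted for each indexing server.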
In addition, the Splunk infrastructure needs to support both scale up and scale out as well as high availability and resilience to a single point of failure.
4 Architectural overview Splunk Enterprise provides an application platform for real-time operational intelligence. It facilitates easy, fast and secure collection, analysis, and search of data from massive data streams generated by devices, applications, transactions, timed events, systems and technologies.
Figure 1 below shows the architectural overview of Splunk Enterprise. Users can access one or more search head servers through a load balancer. The search head(s) provide access to information that is collected by forwarders from a variety of data sources possibly across multiple data centers.
Figure 1: Architectural Overview of Splunk Enterprise
[Figure 1 labels: Clients; 3rd Party Load Balancer; Search Head Cluster; Indexers; Forwarders; Deployment and License Server; data sources: Applications, Web Servers, Hypervisors/OS, Databases, App Servers, Storage, Servers, Networks, Cloud Services]
5 Component Model This section describes the component model for Splunk Enterprise. Figure 2 shows an overview of the major components.
Figure 2: Component Model of Splunk Enterprise
5.1.1 Forwarders Forwarders collect data and send it to a Splunk deployment for indexing and searching. A particular environment could have thousands of forwarders executing on all different types of hardware. A forwarder represents a more robust solution than raw network feeds, with capabilities to:
• Tag metadata
• Buffer, compress, and secure data
• Run local scripts to collect or massage the data
• Use any available network ports on the remote device
5.1.2 Indexers The indexer is the Splunk Enterprise component that creates and manages indexes. The primary functions of an indexer are:
• Indexing incoming data.
• Searching the indexed data.
[Figure 2 labels: Forwarder; Data Routing, Cloning and Load Balancing; Indexer; Deployment Server; License Server; Search Head; REST Protocol; Splunk CLI; Splunk Web Server; HTTP Protocol; Web Browser; Lenovo XClarity App; Splunk Deployment Monitor App; Other Apps]
5.1.3 Search heads For large amounts of indexed data and numerous users concurrently searching on the data, it can make sense to distribute the indexing load across several indexers, while offloading the search query function to a separate machine. In this type of scenario, known as distributed search, one or more Splunk Enterprise components called search heads distribute search requests across multiple indexers.
5.1.4 Deployment server The Splunk Enterprise deployment server is used to update a distributed deployment. The deployment server pushes out configurations and content to sets of Splunk Enterprise instances (referred to, in this context, as deployment clients), grouped according to any useful criteria, such as OS, machine type, application area, location, and so on. The deployment clients are usually forwarders or indexers. For example, all of the Linux forwarders can be refreshed after an updated configuration has been tested on a local Linux forwarder.
For small deployments, the deployment server can cohabit a Splunk Enterprise instance with another Splunk Enterprise component, either a search head or an indexer. For larger deployments it should run on its own Splunk Enterprise instance.
5.1.5 License server The license server manages Splunk Enterprise licenses. It often runs in the same Splunk Enterprise instance as the Deployment server.
5.1.6 Splunk Webserver Splunk provides a web user interface using a Python-based application server. It allows users to search and navigate data stored by Splunk servers and to manage the Splunk deployment.
5.1.7 Deployment monitor Although it's actually an app, not a Splunk Enterprise component, the deployment monitor has an important role to play in distributed environments. Distributed deployments can scale to forwarders numbering into the thousands, sending data to many indexers, which feed multiple search heads. The deployment monitor can be used to view and troubleshoot these distributed deployments and it provides numerous views into the state of the forwarders and indexers.
5.1.8 Lenovo XClarity app The Lenovo XClarity app for Splunk allows events to be forwarded from XClarity to the Splunk server listener. History and trends for different events can be viewed using the built-in user interface.
5.1.9 Other apps Because Splunk provides a rich RESTful interface into its data and functionality, there are a large number of Splunk and third party provided applications and add-ons.
6 Operational model This section describes the options for mapping the logical components of Splunk Enterprise onto Lenovo ThinkSystem servers and Lenovo network switches. The “Operational model scenarios” section gives an overview of the examples and has pointers into the other sections for the related hardware. The BOM configurations are described in the appendix on page 25.
6.1 Operational model scenarios The following scenarios are considered in this chapter:
• Departmental server
• Small enterprise (1/4 rack)
• Medium enterprise (1/2 rack)
• Large enterprise (full rack)
Below is a list of items that can have a significant impact on Splunk Enterprise performance.
• Amount of incoming data: increases processing time
• Amount of indexed data: increases the I/O bandwidth needed to store and search data
• Number of concurrent users performing searches, creating reports, or viewing dashboards
• Number and types of searches
• Number of Splunk apps and the unique performance, deployment, and configuration considerations for each
Table 1 below gives sizing information for Splunk Enterprise and shows how many search heads and indexers are needed for different combinations of incoming data size and number of concurrent users. This table is taken from the Splunk Capacity Planning website: docs.splunk.com/Documentation/Splunk/7.1.1/Capacity/Summaryofperformancerecommendations.
6.2 Hardware components The following section describes the hardware components that can be used for Splunk Enterprise.
6.2.1 Rack servers You can use various rack-based Lenovo ThinkSystem server platforms to run Splunk Enterprise.
Lenovo ThinkSystem SR630
Lenovo ThinkSystem SR630 (as shown in Figure 3) is an ideal 2-socket 1U rack server for small businesses up to large enterprises that need industry-leading reliability, management, and security, as well as maximum performance and flexibility for future growth. The SR630 server is designed to handle a wide range of workloads, such as databases, virtualization and cloud computing, virtual desktop infrastructure (VDI), infrastructure security, systems management, enterprise applications, collaboration/email, streaming media, web, and HPC. The ThinkSystem SR630 offers up to twelve 2.5-inch or four 3.5-inch hot-swappable SAS/SATA HDDs or SSDs together with up to 10 on-board NVMe PCIe ports that allow direct connections to the U.2 NVMe PCIe SSDs.
Figure 3: Lenovo ThinkSystem SR630
For more information, see this website: lenovopress.com/lp0643
Lenovo ThinkSystem SR650
Lenovo ThinkSystem SR650 (as shown in Figure 4) is similar to the SR630 but in a 2U form factor.
Figure 4: Lenovo ThinkSystem SR650
The key differences compared to the SR630 server are more expansion slots and chassis to support up to twenty-four 2.5-inch or fourteen 3.5-inch hot-swappable SAS/SATA HDDs or SSDs together with up to 8 on-board NVMe PCIe ports that allow direct connections to the U.2 NVMe PCIe SSDs. The ThinkSystem SR650 server also supports up to two NVIDIA GRID cards for graphics acceleration.
For more information, see this website: lenovopress.com/lp0644
6.2.2 10 GbE networking The standard network for Splunk Enterprise is 10 GbE. The following Lenovo 10GbE ToR switches are recommended:
The Lenovo ThinkSystem NE1032 RackSwitch (as shown in Figure 5) is a 1U rack-mount 10 Gb Ethernet switch that delivers lossless, low-latency performance with feature-rich design that supports virtualization, Converged Enhanced Ethernet (CEE), high availability, and enterprise class Layer 2 and Layer 3 functionality. The switch delivers line-rate, high-bandwidth switching, filtering, and traffic queuing without delaying data.
The NE1032 RackSwitch has 32x SFP+ ports that support 1 GbE and 10 GbE optical transceivers, active optical cables (AOCs), and direct attach copper (DAC) cables. The switch helps consolidate server and storage networks into a single fabric, and it is an ideal choice for virtualization, cloud, and enterprise workload solutions.
Figure 5: Lenovo ThinkSystem NE1032 RackSwitch
For more information, see this website: lenovopress.com/lp0605
Lenovo RackSwitch G8272
The Lenovo RackSwitch G8272 uses 10Gb SFP+ and 40Gb QSFP+ Ethernet technology and is specifically designed for the data center. It is an enterprise class Layer 2 and Layer 3 full featured switch that delivers line-rate, high-bandwidth switching, filtering, and traffic queuing without delaying data. Large data center-grade buffers help keep traffic moving, while the hot-swap redundant power supplies and fans (along with numerous high-availability features) help provide high availability for business sensitive traffic.
The RackSwitch G8272 (shown in Figure 6) is ideal for latency-sensitive applications, such as high-performance computing clusters and financial applications. In addition to the 10 Gb Ethernet (GbE) and 40 GbE connections, the G8272 can use 1 GbE connections.
Figure 6: Lenovo RackSwitch G8272
For more information, see this website: lenovopress.com/tips1267
6.2.3 1 GbE networking The following Lenovo 1GbE ToR switches are recommended for use with Splunk Enterprise:
The Lenovo RackSwitch G7028 (as shown in Figure 7) is a 1 Gb top-of-rack switch that delivers line-rate Layer 2 performance at an attractive price. G7028 has 24 10/100/1000BASE-T RJ45 ports and four 10 Gb Ethernet SFP+ ports. It typically uses only 45 W of power, which helps improve energy efficiency.
Figure 7: Lenovo RackSwitch G7028
For more information, see this website: lenovopress.com/tips1268.
Lenovo RackSwitch G8052
The Lenovo System Networking RackSwitch G8052 (as shown in Figure 8) is an Ethernet switch that is designed for the data center and provides a virtualized, cooler, and simpler network solution. The Lenovo RackSwitch G8052 offers up to 48 1 GbE ports and up to four 10 GbE ports in a 1U footprint. The G8052 switch is always available for business-sensitive traffic by using redundant power supplies, fans, and numerous high-availability features.
Figure 8: Lenovo RackSwitch G8052
For more information, see this website: lenovopress.com/tips1270.
6.3 Servers Splunk Enterprise runs best on bare-metal servers, as compared to virtual hardware. If Splunk is run in a virtual machine (VM) on any platform, performance does degrade. This is because virtualization abstracts the physical system hardware into resource pools from which defined virtual machines draw as needed. Splunk needs sustained access to a number of resources, particularly disk I/O, for indexing operations. Running Splunk in a VM or alongside other VMs can cause reduced performance.
There are three kinds of servers for Splunk:
• Indexer • Search head • Deployment server
For very small deployments the search head can be combined into the indexer. For medium to large deployments a separate deployment server is needed which can also support license management for the Splunk system. Each section below explores the Lenovo recommended configuration for the three kinds of compute servers.
See “Server BOM” on page 25 for the server bill of materials.
6.3.1 Indexer An indexer needs to store a large amount of local data and each indexer can roughly handle 300GB of data per day. The Lenovo ThinkSystem SR650 is recommended with up to fourteen 3.5” drives. The hot and warm data should be stored on solid state drives (SSD) that have a high endurance and the cold data can be stored on 3.5” large capacity hard disk drives (HDD). NVMe drives are not used in the configuration.
The enterprise performance “HUSMM32” SSDs have 800GB and 1.6TB capacities and a 3.5” form factor. The sweet spot for HDD price/performance is 8TB. Lenovo also recommends 4TB drives for smaller storage capacities. Larger storage capacities will usually require more indexers and therefore it may not be necessary to use 10TB or larger HDDs.
The processor and memory depend on the customer environment and Lenovo recommends the following:
The operating system is stored on two mirrored M.2 480GB boot drives. Two mirrored hot swap 3.5” SSDs could be used but that would reduce the total number of drives available for indexing.
Table 3 lists the recommended Indexer SSD configurations for each of the 4 deployment scenarios to store hot and warm data.
Table 3: Indexer SSD configurations
| Attribute | Departmental | Small Enterprise | Medium Enterprise | Large Enterprise |
|---|---|---|---|---|
| Indexers | combined | 3 | 6 | 14 |
| Required storage | 1TB | 2.1TB | 7.9TB | 15.7TB |
| Required storage +20% | 1.2TB | 2.52TB | 9.84TB | 18.84TB |
| Storage per indexer | 1.2TB | 804GB | 1.64TB | 1.34TB |
| SSD raw capacity | 3 x 800GB | 3 x 800GB | 4 x 800GB | 3 x 800GB |
| RAID configuration | RAID 5 | RAID 5 | RAID 5 | RAID 5 |
| SSD actual capacity | 1.47TB | 1.47TB | 2.18TB | 1.47TB |
For those cases that use only 3 SSDs, an extra SSD could be added as a hot spare.
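The sizing arithmetic behind the SSD configurations can be sketched as follows. This is an illustrative calculation only, using the Large Enterprise figures (15.7TB required storage, 14 indexers) as input and assuming that RAID 5 gives up one drive's worth of capacity to parity. The function names are hypothetical, and the sketch does not attempt to model the lower "SSD actual capacity" values listed in the table.

```python
def storage_with_headroom(required_tb, headroom=0.20):
    """Add a fractional headroom (20% by default) to the required storage."""
    return required_tb * (1 + headroom)


def storage_per_indexer(total_tb, indexer_count):
    """Spread the total storage evenly across the indexers."""
    return total_tb / indexer_count


def raid5_usable_gb(drive_count, drive_gb):
    """Raw usable capacity of a RAID 5 set: one drive's capacity goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_gb


# Large Enterprise inputs taken from Table 3
total_tb = storage_with_headroom(15.7)
per_indexer_tb = storage_per_indexer(total_tb, 14)
print(f"total with 20% headroom: {total_tb:.2f} TB")
print(f"per indexer:             {per_indexer_tb:.2f} TB")
print(f"3 x 800GB in RAID 5:     {raid5_usable_gb(3, 800)} GB before formatting")
```

With these inputs the per-indexer requirement comes to roughly 1.35TB, which fits within a three-drive RAID 5 set of 800GB SSDs.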
Table 4 lists the recommended Indexer HDD configurations for each of the 4 deployment scenarios to store cold and archived data.
Table 4: Indexer HDD configurations
| Attribute | Departmental | Small Enterprise | Medium Enterprise | Large Enterprise |
|---|---|---|---|---|
| Indexers | combined | 3 | 6 | 14 |
| Required cold storage | 4.4TB | 8.7TB | 33.7TB | 67.4TB |
| Archived storage | 14.5TB | 29.1TB | 145TB | 290TB |
| Total storage +20% | 23.9TB | 45.4TB | 214TB | 429TB |
| Storage per indexer | 23.9TB | 15.1TB | 35.7TB | 30.6TB |
| HDD raw capacity | 10 x 6TB | 10 x 4TB | 10 x 8TB | 10 x 8TB |
For optimum performance, disk availability, bandwidth and space should be maintained on the indexers. Ensure that the HDD volumes have 20% or more free space at all times as HDD performance decreases
proportionally to available space because disk seek times increase. This affects how fast Splunk indexes data, and can also determine how quickly search results, reports and alerts are returned. In a default Splunk installation, the drive(s) that contain your indexes must have at least 5GB of free disk space, or indexing will pause.
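The two free-space thresholds described above (20% recommended free space, 5GB hard floor) can be monitored with a small script. The following is a minimal sketch using Python's standard `shutil.disk_usage`; the function names and status labels are illustrative, the 5GB floor is treated as a decimal value, and the path checked is a placeholder for wherever the indexes actually reside.

```python
import shutil

GB = 1000 ** 3  # decimal gigabyte; an assumption for this sketch


def classify_free_space(total_bytes, free_bytes,
                        min_free_fraction=0.20, hard_floor_bytes=5 * GB):
    """Classify a volume against the two thresholds described in the text."""
    if free_bytes < hard_floor_bytes:
        return "critical"  # below the 5GB floor, where indexing would pause
    if free_bytes / total_bytes < min_free_fraction:
        return "low"       # below the recommended 20% free space
    return "ok"


def check_path(path):
    """Look up the volume that holds `path` and classify its free space."""
    usage = shutil.disk_usage(path)
    return classify_free_space(usage.total, usage.free)


# "." is a placeholder; point this at the index volume in a real deployment
print(check_path("."))
```

A check like this could be run periodically on each indexer so that a volume approaching the 20% mark is noticed well before the hard floor is reached.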
6.3.2 Search head Because there is no local storage, a search head can use a 1U SR630. The recommended configuration is:
The operating system is stored on two mirrored M.2 480GB boot drives. As an alternative two mirrored hot swap SSDs could be used.
6.3.3 Deployment server The deployment and license server can use low performance processors. In order to provide redundancy for search heads, it is recommended to simply use the same configuration as a search head.
6.4 Systems management Lenovo XClarity is used to manage Lenovo hardware. This section describes both Lenovo XClarity and the Lenovo XClarity Administrator App for Splunk. The combination provides scalable systems management and monitoring, and integrated analytics on top of the monitored data.
6.4.1 Lenovo XClarity Administrator Lenovo XClarity™ Administrator is a centralized resource management solution that reduces complexity, speeds up response, and enhances the availability of Lenovo® server systems and solutions.
The Lenovo XClarity Administrator provides agent-free hardware management for Lenovo’s ThinkSystem® rack servers, System x® rack servers, and Flex System™ compute nodes and components, including the Chassis Management Module (CMM) and Flex System I/O modules. Figure 9 shows the Lenovo XClarity administrator interface, in which Flex System components and rack servers are managed and are seen on the dashboard. Lenovo XClarity Administrator is a virtual appliance that is quickly imported into a virtualized environment server configuration.
Figure 9: XClarity Administrator interface
6.4.2 Lenovo XClarity Administrator App for Splunk XClarity continuously listens for events from all the resources it manages. Most of these are received via standard protocols such as CIM (common information model) or SNMP (simple network management protocol). Users can either view a log of all these events in the XClarity GUI console, or configure “event forwarders”, which enable them to forward events to another external visualization or management tool.
Lenovo provides integration with Splunk using the XClarity Administrator App for Splunk. This app enables collection, visual representation, and analysis of Lenovo hardware events from the Splunk platform. Pre-built dashboard panels included with the app help Splunk administrators identify changes made or needing to be made to security and configuration related settings. This helps administrators understand how much change is occurring to system configurations, and if those changes have been authorized.
The app provides the following functions:
• Monitoring of hardware events in a Lenovo XClarity Administrator-managed environment to quickly identify trends based on hardware events received, including hardware failures, power/thermal thresholds that have been exceeded, and PFAs (predicted failure alerts). These events are also categorized by source, type of hardware surfacing the events, and whether service is required.
• Auditing for security changes occurring within the Lenovo XClarity Administrator. Security events surfaced by Lenovo XClarity Administrator can help identify if unauthorized personnel are trying to access computing resources. This might include events showing that new users have been added/deleted, what IP addresses users are using to access the Lenovo XClarity Administrator, the time and dates when they are accessing resources, and any changes to the security settings of the Lenovo XClarity Administrator (or user IDs on the Lenovo XClarity Administrator). Visual representations can show changes in these activities, which could identify if an attack is occurring.
• Lenovo XClarity Administrator specializes in helping system administrators make desired changes on their computing resources. This includes updating the firmware of Lenovo XClarity Administrator managed resources, deploying configuration changes to groups of systems, and deploying operating systems to bare-metal systems. Auditing of these provisioning activities can help identify how much change is occurring to the configuration of servers, and if the changes have been authorized.
6.4.3 XClarity Administrator App Dashboards Figure 10 lists the dashboards that are available for this Splunk app.
Figure 10: XClarity Administrator App for Splunk dashboards
Security Changes
The “Security Changes” dashboard shows any security changes made to the Lenovo XClarity Administrator, such as security policy changes, or changes for individual Lenovo XClarity Administrator users.
Figure 11: Security Changes dashboard
Security Logins
The “Security Logins” dashboard provides statistics on any security related events.
Figure 12: Security Logins dashboard
Events Recommending Service
The “Events Recommending Service” dashboard displays events for resources that require attention by the System Administrator or the Support Center (or events predicting that these types of failures are imminent).
Figure 13: Events Recommending Service dashboard
General Events
The “General Events” dashboard provides a consolidated listing for all messages coming from Lenovo XClarity Administrator servers (including events from Lenovo XClarity Administrator-managed resources).
Figure 14: General Events dashboard
Power and Thermal events
The “Power and Thermal events” dashboard graphically depicts power/thermal thresholds. Any time a power or thermal threshold is exceeded, the events associated with that situation are reflected in the graphs.
Figure 15: Power and Thermal events dashboard
Provisioning
The “Provisioning” dashboard shows events related to the provisioning of managed resources. Lenovo XClarity Administrator can provision changes to managed resources, including updating firmware, pushing configuration changes, and deploying operating system images.
Figure 16: Provisioning dashboard
6.4.4 Installing the XClarity Administrator App for Splunk There are three stages to install and use the XClarity Administrator App for Splunk:
1. Install the app into Splunk
2. Configure the data input configuration
3. Configure XClarity Administrator to forward events to Splunk
Installing the App into Splunk
To install the app into Splunk, follow these steps:
1. Download the Lenovo XClarity Administrator App from the Splunkbase website.
2. Click on “Install app from file”. In the next screen, click on “Choose file” button and point it to the .spl application file that was previously downloaded and extracted.
3. Once the App has been successfully imported, go back to “Manage Apps”. You will see the list of the installed Apps. Verify that the “Lenovo XClarity Administrator” App is listed. If not, then make sure you have the correct privileges to install Apps and try the import again.
4. Click on “Lenovo XClarity Administrator” and it will open the main page of the App. From the top level menus, click on “Dashboards”. You should see the dashboards as shown in Figure 17.
Figure 17: Dashboard configuration
Configure the data input configuration
The XClarity Administrator App for Splunk comes pre-configured for receiving log data events from XClarity. The default input ports are TCP and UDP port 10514. The app may not work if a firewall blocks the port or if the port conflicts with another service. If so, the port can be changed as follows. Click on “Data inputs” on the “Settings” drop-down menu and select “TCP” to display the dialog shown in Figure 18. The TCP configuration shows that the “lenovo_lxca” source type is mapped to the default port of 10514.
20 Reference Architecture: Splunk Enterprise with ThinkSystem Servers version 1.0
Figure 18: TCP port configuration
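The port-to-source-type mapping shown in the dialog corresponds to a network input stanza in Splunk's inputs.conf format. The sketch below renders such a stanza for illustration only; the stanza that the app actually ships may contain additional settings, and the function name is hypothetical.

```python
def render_network_input(protocol="tcp", port=10514, sourcetype="lenovo_lxca"):
    """Render an inputs.conf-style stanza for a TCP or UDP network input.

    Illustrative only: the app's own input definition may differ.
    """
    if protocol not in ("tcp", "udp"):
        raise ValueError("protocol must be 'tcp' or 'udp'")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"[{protocol}://{port}]\nsourcetype = {sourcetype}\n"


# Example usage:
#   print(render_network_input("tcp", 10514))
#   print(render_network_input("udp", 10514))
```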
The mapping can be deleted, changed, or cloned. Change the port number to the desired port and ensure that the source type is still specified as “lenovo_lxca”. Then click “save” and “enable”. Splunk may need to be restarted for the new input port to become active. Also, ensure that the forwarding port configured in XClarity Administrator matches this port so that events are routed correctly.
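After changing the port, it can be useful to confirm that the TCP input is reachable before configuring XClarity Administrator. The following sketch opens a TCP connection and sends one generic RFC 5424 style test line. The host name, application name, and message text are hypothetical, and the line is not in the format that XClarity Administrator itself emits, so the app's dashboards are not expected to parse it; it only shows that the port accepts data.

```python
import socket
from datetime import datetime, timezone


def build_syslog_line(message, hostname="lxca-test", app="porttest", pri=14):
    """Build a generic RFC 5424 style line (PRI 14 = facility user, severity info)."""
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Fields: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
    return f"<{pri}>1 {timestamp} {hostname} {app} - - - {message}\n"


def send_test_event(host, port=10514, message="input port test", timeout=5.0):
    """Send one test line over TCP; raises OSError if the port is unreachable."""
    line = build_syslog_line(message)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("utf-8"))
    return line


# Example usage (hypothetical host name):
#   send_test_event("splunk.example.com", 10514)
```

A connection error from `send_test_event` suggests a firewall rule, a disabled input, or a port mismatch.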
Configure XClarity Administrator to forward events to Splunk
The syslog forwarding capability of Lenovo XClarity Administrator must be configured to correctly forward events from XClarity Administrator to the Splunk app. The steps are as follows:
1. After signing into the Lenovo XClarity Administrator, mouse over “Monitoring” on the banner near the top of the screen. Select “Event Forwarding” as shown in Figure 19.
Figure 19: XClarity “event forwarding”
2. From the “Event Forwarding” panel, select the “New” icon.
3. Select “Syslog” as the event recipient type, and fill in the appropriate information in the dialog, including the TCP/IP address of the Splunk server as shown in Figure 20.
Figure 20: XClarity “change event recipient” general tab
4. Then click “Next” to show the Devices tab.
5. Select the Lenovo XClarity Administrator-managed systems (and potentially the Lenovo XClarity Administrator management server itself) to forward events as shown in Figure 21.
9. Finally, click “Create”. The selected event types are then forwarded to the Splunk server.
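Once forwarding is configured, one way to confirm that events are arriving is to search for the “lenovo_lxca” source type. The sketch below prepares a request for the Splunk REST search export endpoint on the management port (8089 by default) using HTTP basic authentication. The host name and credentials in the usage comment are hypothetical, and the sketch only builds the request; sending it against a server with a self-signed certificate would need additional TLS handling that is not shown.

```python
import base64
import urllib.parse
import urllib.request


def build_export_request(base_url, username, password,
                         search="search sourcetype=lenovo_lxca | head 5",
                         earliest="-15m"):
    """Prepare a POST request for /services/search/jobs/export."""
    url = base_url.rstrip("/") + "/services/search/jobs/export"
    body = urllib.parse.urlencode({
        "search": search,            # must begin with the "search" command
        "earliest_time": earliest,   # only look at recent events
        "output_mode": "json",
    }).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    request.add_header("Authorization", "Basic " + credentials.decode("ascii"))
    return request


# Example usage (hypothetical host and credentials):
#   req = build_export_request("https://splunk.example.com:8089", "admin", "secret")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```

If the search returns no results, check the event forwarder settings in XClarity Administrator and the input port on the Splunk side.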
6.5 Networking
The 1GbE hardware management network is used for out-of-band access to the servers via the optional Lenovo XClarity Administrator. The dedicated Integrated Management Module (IMM) port on all of the servers needs to be connected to a 1GbE ToR switch such as the Lenovo RackSwitch G8052.
It is recommended that two top-of-rack (ToR) switches be used for redundancy. To support the logical pairing of the network adapter ports and to provide automatic failover of the switches, the Lenovo ThinkSystem NE1032 RackSwitch and the Lenovo RackSwitch G8272 support virtual link aggregation groups (VLAGs). When VLAG is enabled over the inter-switch link (ISL) trunk, the two switches are logically grouped. If one of the switches is lost, or the uplink from the host to a switch is lost, connectivity is automatically maintained over the other switch. In addition, the Lenovo Cloud Network Operating System (CNOS) should be used on the G8272 switches.
Figure 23 shows the scenario of two dual-port or one quad-port NIC connectivity into two ToR Lenovo RackSwitch G8272 switches with VLAG.
Figure 23: Redundancy with 10GbE ToR switches
See “Networking BOM” on page 27 for the network switch bill of materials.
6.6 Racks
The Lenovo 9363 rack is 42U in height and supports up to six power distribution units (PDUs). The required switches and servers for the three enterprise scenarios can be installed into this rack. Figure 24 shows the details for the small, medium, and large enterprise scenarios.
See “Rack BOM” on page 27 for the rack bill of materials.
Figure 24: Enterprise deployment scenarios in a rack
6.7 Operating Systems
Splunk Enterprise was verified with the following operating systems:
• Red Hat Enterprise Linux 7.5
• Microsoft Windows Server 2016
Lenovo XClarity Administrator can be used to deploy operating systems. See the following website: sysmgt.lenovofiles.com/help/index.jsp?topic=%2Fcom.lenovo.lxca.doc%2Fosdeploy_considerations.html.
7 Appendix: Bill of Materials
This appendix provides the bills of materials (BOMs) for different hardware configurations for Splunk Enterprise deployments. There are sections for servers and networking switches that are orderable from Lenovo.
For connections between ToR switches and devices (servers, storage, and chassis), the connector cables are configured with the device. The ToR switch configuration includes only transceivers or other cabling that is needed for failover or redundancy.
7.1 Server BOM
This section lists the BOMs for the servers. See Table 2 on page 9 for the number of servers to configure.
References in this document to Lenovo products or services do not imply that Lenovo intends to make them available in every country.
Lenovo, the Lenovo logo, ThinkSystem, ThinkAgile, ThinkCentre, ThinkVision, ThinkVantage, ThinkPlus and Rescue and Recovery are trademarks of Lenovo.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Intel, Intel Inside (logos), MMX, and Pentium are trademarks of Intel Corporation in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
Information is provided "AS IS" without warranty of any kind.
All customer examples described are presented as illustrations of how those customers have used Lenovo products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.
Information concerning non-Lenovo products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of such products by Lenovo. Sources for non-Lenovo list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. Lenovo has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims related to non-Lenovo products. Questions on the capability of non-Lenovo products should be addressed to the supplier of those products.
All statements regarding Lenovo future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Contact your local Lenovo office or Lenovo authorized reseller for the full text of the specific Statement of Direction.
Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in Lenovo product announcements. The information is presented here to communicate Lenovo’s current investment and development activities as a good faith effort to help with our customers' future planning.
Performance is based on measurements and projections using standard Lenovo benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here.
Photographs shown are of engineering prototypes. Changes may be incorporated in production models.
Any references in this information to non-Lenovo websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this Lenovo product and use of those websites is at your own risk.