LINUX CLUSTERING WORKSHOP 4
Dr. Zahid Anwar
Simplified Architecture of Linux Cluster
Simplified Architecture of a Single Computer
Simplified architecture of an enterprise cluster
Load Balancer
Cluster Nodes
Shared Storage
Print Server
No Single Point of Failure
An enterprise cluster should always have the following characteristic:
“Any computer within the cluster, or any computer the cluster depends upon for normal operation, can be rebooted without rebooting the entire cluster.”
e.g. by building high-availability server pairs
Clustering Terminology
• Process: a program that is running.
• Daemon: a process that runs in the background on a Linux system.
• Service: a daemon and the effects it produces.
• Resource: a service combined with its operating environment (config files, data, network).
• Fail-Over: when a resource moves from one computer to another.
• High Availability: a proper failover configuration has no single point of failure.
Types of Clusters
Originally, "clusters" and "high-performance computing" were synonymous.
Today, the meaning of the word "cluster" has expanded beyond high-performance computing to include
• high-availability (HA) clusters and
• load-balancing (LB) clusters
Types of Clusters
High-availability clusters, also called failover clusters, are used in mission-critical applications. The key to high availability is redundancy.
Load-balancing clusters provide better performance by dividing the work among the nodes. This might be accomplished using a simple round-robin algorithm, for example Round-Robin DNS.
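A minimal sketch of round-robin distribution, assuming three nodes with hypothetical names. It only illustrates the rotation idea; it is not how any particular DNS server implements Round-Robin DNS.

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that hands out servers in a fixed rotating order."""
    return cycle(servers)


# Hypothetical node names, for illustration only.
nodes = ["node1", "node2", "node3"]
picker = round_robin(nodes)

# Five incoming requests are assigned to nodes in turn.
assignments = [next(picker) for _ in range(5)]
print(assignments)
```

Round robin ignores how busy each node is, which is why every node receives the same share of requests regardless of load.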
Cluster Computing?
Parallel Computing
Distributed Computing
Grid Computing
Cloud Computing
Terminology
• Parallel computing: tightly coupled sets of computation, e.g. several pieces of data being processed simultaneously in the same CPU. Homogeneous collection of computers.
• Distributed computing: computing that spans multiple machines or multiple locations. Heterogeneous collection.
• Cluster computing: a form of distributed computing, generally restricted to computers on the same subnetwork or LAN.
• Grid computing: frequently describes computers working together across a WAN or the Internet. Grids are much larger in scale, tend to be used more asynchronously, and have much greater access, authorization, accounting, and security concerns.
• Peer-to-peer: data or file sharing (Napster, Gnutella, or Kazaa), SETI@Home.
Building a HA Cluster using Heartbeat
Heartbeat: the ability to fail over a resource from one computer to another
Functioning
• Tell Heartbeat which computer owns a particular resource (define the primary and backup servers).
• The Heartbeat daemon on the backup server listens to the "heartbeats" coming from the primary server.
• If the backup server does not hear the primary's heartbeat, it initiates a failover and takes ownership of the resource.
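The failover decision described above can be sketched as a small simulation. This is not Heartbeat's code: the class name `BackupMonitor`, the `deadtime` value of 10 seconds, and the use of plain numbers as timestamps are all assumptions made for illustration.

```python
class BackupMonitor:
    """Backup-side view: tracks the last heartbeat heard from the primary."""

    def __init__(self, deadtime):
        self.deadtime = deadtime      # seconds of silence tolerated (assumed value)
        self.last_heard = 0.0         # timestamp of the most recent heartbeat
        self.owns_resource = False    # the primary owns the resource at the start

    def on_heartbeat(self, now):
        """Record that a heartbeat from the primary arrived at time `now`."""
        self.last_heard = now

    def check(self, now):
        """Take ownership once the primary has been silent longer than deadtime."""
        if not self.owns_resource and now - self.last_heard > self.deadtime:
            self.owns_resource = True   # failover: the backup now owns the resource
        return self.owns_resource


monitor = BackupMonitor(deadtime=10.0)
for t in (0.0, 2.0, 4.0):               # primary is alive and sending heartbeats
    monitor.on_heartbeat(t)

print(monitor.check(now=6.0))           # only 2 s of silence: no failover
print(monitor.check(now=15.0))          # 11 s of silence: backup takes over
```

Timestamps are passed in explicitly so the logic can be followed without waiting on a real clock.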
The Physical Paths of Heartbeat
• Normally Heartbeat is configured to work over a separate physical connection between the two servers.
• The separate physical connection can be either a serial cable or another Ethernet network connection (via a crossover cable or mini hub).
• Heartbeat can also run over the existing network, but that adds extra traffic to your network.
Heartbeat Control Messages
3 basic kinds:
• Heartbeats (status messages)
  Typically 150 bytes; sent as broadcast, unicast, or multicast.
• Cluster transition messages
  Relatively rare; they contain the conversation between daemons to move resources.
  ip-request: asks the other server to release ownership of the resource.
  ip-request-resp: sent by the server that has shut off the service and no longer owns the resource.
  On receiving ip-request-resp, the requesting server starts up the service and offers it to clients.
• Retransmission requests
  rexmit: a request for a retransmission of a heartbeat control message, sent when one of the servers notices that it is receiving heartbeat control messages out of sequence.
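To illustrate how out-of-sequence messages can trigger a retransmission request, here is a minimal sketch. It assumes each control message carries an increasing integer sequence number; the function name `find_missing` is hypothetical and the real Heartbeat message format is not modeled.

```python
def find_missing(received_seqs):
    """Return the sequence numbers that would need a retransmission request.

    `received_seqs` lists sequence numbers in the order they arrived.
    """
    missing = set()
    expected = None                     # next sequence number we expect to see
    for seq in received_seqs:
        if expected is None or seq >= expected:
            if expected is not None:
                # Every number skipped between expected and seq is a gap.
                missing.update(range(expected, seq))
            expected = seq + 1
        else:
            # A late arrival fills a gap noticed earlier.
            missing.discard(seq)
    return sorted(missing)


print(find_missing([1, 2, 5, 6]))       # messages 3 and 4 never arrived
print(find_missing([1, 2, 5, 3, 6]))    # message 3 arrived late, 4 is still missing
```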
Secondary IP Addresses (Virtual IPs)
• A method for adding multiple IP addresses to the same physical network card.
• When you use Heartbeat to offer services, it is done using secondary IP addresses.
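As a rough illustration of what a secondary address looks like, the sketch below builds (but does not run) an iproute2 `ip addr add` command. The address `192.0.2.50/24`, the device `eth0`, and the helper name `add_secondary_ip_cmd` are assumptions; Heartbeat manages the address itself, so this only shows the underlying idea.

```python
def add_secondary_ip_cmd(address, prefix_len, device, alias):
    """Build the argument list for adding a labelled secondary address."""
    return [
        "ip", "addr", "add", f"{address}/{prefix_len}",
        "dev", device,
        "label", f"{device}:{alias}",   # e.g. eth0:0, the alias name for the address
    ]


# Example values only; substitute the address and interface of your own cluster.
cmd = add_secondary_ip_cmd("192.0.2.50", 24, "eth0", 0)
print(" ".join(cmd))
```

Because the address is not tied to the card's primary address, it can be removed from one server and added on another during a failover.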
Lab Exercise
• Set up a 2-node cluster
• Configure a highly-available web server
Load Balancing using Ultra Monkey (LVS)
• Linux Virtual Server (LVS) enables TCP/UDP connections to be load balanced.
• The mechanism of connection control is referred to as Layer 4 Switching. IP address (Layer 3) and port (Layer 4) information is used.
• The host that LVS runs on is referred to as the Linux Director (a specialized router).
• Packets received by the Linux Director for a virtual service are routed to a real server chosen by a scheduling algorithm.
• Subsequent packets for the same connection are sent to the same real server.
Advantages of a load balancer over round-robin DNS
• Directs requests to less loaded nodes.
• Accounts for sessions (e.g. forum software, shopping carts).
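A minimal sketch of the two ideas above: new connections go to the real server with the fewest open connections, and later packets of a known connection go to the server already chosen. The class name `Director`, the server names, and the client addresses are hypothetical; this models the behaviour only and is not the LVS implementation.

```python
class Director:
    """Toy Layer 4 director: least-connection scheduling plus a connection table."""

    def __init__(self, real_servers):
        self.active = {server: 0 for server in real_servers}  # open connections per server
        self.table = {}   # (client_ip, client_port) -> chosen real server

    def route(self, client_ip, client_port):
        """Return the real server for this packet's connection."""
        key = (client_ip, client_port)
        if key not in self.table:
            # New connection: pick the server with the fewest open connections.
            server = min(self.active, key=self.active.get)
            self.active[server] += 1
            self.table[key] = server
        return self.table[key]

    def close(self, client_ip, client_port):
        """Forget a finished connection and free its slot on the real server."""
        server = self.table.pop((client_ip, client_port), None)
        if server is not None:
            self.active[server] -= 1


director = Director(["rs1", "rs2"])
first = director.route("10.0.0.1", 5000)    # new connection
second = director.route("10.0.0.2", 5001)   # new connection, goes to the idle server
again = director.route("10.0.0.1", 5000)    # same connection, same real server
print(first, second, again)
```

Round-robin DNS cannot do either of these things, because it hands out addresses without knowing how many connections each node is serving.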
Overall mission
General Architecture
OS and Cluster Software
Cluster Hardware
Cluster Planning