Chapter 1 – How the ServerIron Works
Methods and Modes
Foundry Networks provides load balancing of TCP/IP applications through its ServerIron product. Many different topologies are supported, each with its own characteristics and feature set.
The current release of the ServerIron (SI) code accomplishes load balancing via NAT. A soon-to-be-released software (and possibly hardware) upgrade will allow for both proxy and NAT. In many cases proxy load balancing is not needed, so it is not discussed here.
With NAT, the load-balancing device is transparent: a connection request is passed to the device, information inside the packet is translated, and the packet is passed on to a real server. When the real server responds, the information is translated back and sent to the requester. What this type of device allows for is speed and simplicity, and most sites’ load-balancing requirements are met with a NAT-based load balancer.
As you will read in this document, Foundry Networks has even developed a mode that speeds up responses from the real servers while remaining NAT-based.
Methods and Modes
The figures throughout this document give examples of the topologies that can be formed
by two of the configuration methods of the Foundry Networks ServerIron product. These
methods are:
• Single SI – a single ServerIron that allows for load balancing with no backup mechanisms
• Redundant SIs – ServerIrons that can back each other up while providing load balancing. The ServerIron provides two modes of redundancy:
  - Single Active SI with switch backup – a single active SI that allows for switch-level redundancy
  - Dual Active SI with VIP backup – dual active ServerIrons that allow for VIP-level redundancy
For the redundant configuration, a ServerIron backs up another ServerIron either on a full
SI level or on a VIP level. This backup ServerIron virtually mirrors the configuration of
its active partner.
In the Single-Active SI configuration, one ServerIron is servicing all requests and the
other is dormant. In the event of a failure in the active SI, the dormant SI becomes
active.
In the Dual-Active SI configuration, two ServerIrons are active, each servicing requests for its own unique VIP(s). In the event of a failure, each ServerIron can back up the other’s VIP(s).
Each of the above methods can implement either of two modes of operation:
• DSR – Direct Server Return, which allows the return path of the data flow to run directly between the real server and the requester; the SI is not involved in the return path.
• Classic SLB – which forces the ServerIron to be the intermediate device between the real server and the requester on both the forward and return paths.
All of the above methods accomplish the same goal: resiliency. The purpose of the topologies is to provide fault-resilient load balancing for inbound requests to services residing on the servers.
Concepts
During installation, the ServerIron is configured with three things:
• the VIP(s) – the IP address and TCP/UDP port numbers for all the ports (applications) that will be supported
• the real servers – the IP addresses and TCP/UDP port numbers of the “real” physical servers, those devices for which the ServerIron is load balancing, including the individual TCP/UDP port numbers that each server supports
• the bindings – the VIP port numbers bound to the real server port numbers (a configuration sketch follows this list)
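To make these three items concrete, the following is a minimal illustrative sketch in the style of the ServerIron CLI. The server names, IP addresses, and exact command syntax here are assumptions for illustration only; consult the ServerIron configuration guide for your software release.

    server real web1 10.1.1.101      ! first real server and its IP address
     port http                       ! application (port) this server supports
    server real web2 10.1.1.102      ! second real server
     port http
    server virtual www 192.168.2.1   ! the VIP, the publicly known (DNS) address
     port http                       ! application the VIP will answer for
     bind http web1 http web2 http   ! bind the VIP port to the real server ports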
The ServerIron is assigned one or more “virtual” IP addresses (VIPs), and these are the addresses to which outside connections are routed. The IP address assigned to the VIP is the publicly known address (it is the one in DNS). A user requesting www.bicycles.com is given an IP address by DNS, and the connection attempt is then routed to the IP address of the ServerIron (the VIP). Based on inbound connection requests (TCP SYN packets), a session table is built.
A connection attempt to the VIP is translated to the IP address of a real server. The real server is selected by the ServerIron using any one of three balancing metrics. The inbound packet is translated and sent to the real server, which processes it and sends a response. The ServerIron intercepts this response, translates it, and sends it back to the requester. This is known as the classic-SLB mode of operation.
Another mode of operation is Direct Server Return (DSR). The processing of the user request is the same: the packet is received by the ServerIron and translated to the real server. In this mode, however, the real server is allowed to respond directly to the requester, without the ServerIron as the intermediate device.
Both modes of operation are fully detailed in later chapters. This is simply a brief
introduction.
ServerIron Hardware
Fully redundant, these are actually Backbone-class L2 switches with L4 switching functions; they are not routers. The ones used in the diagrams are 16-port switches. The ServerIron is available with 8, 16, or 24 ports of 10/100 and up to 8 Gigabit ports (using the TurboIron/8 product); optionally, the ServerIron software can run on the TurboIron/8 platform as well. Soon, the ServerIron will run on the BigIron platform. The switches pictured contain 16 10/100 ports and 2 Gigabit ports, with each port capable of full-duplex operation. The connectivity between all the switches can be any combination of Gigabit and 10/100. You can use the Gigabit ports on the ServerIron to downlink to the BI4000s or as uplinks to the NetIrons. You can also provide high-speed connectivity by trunking between any of the switches. The NetIrons and ServerIrons support trunking of up to four 10/100 ports or two Gigabit ports. For connectivity between the ServerIrons (active and standby), trunking is recommended; this is for redundancy, not necessarily speed. Trunking is the process of combining multiple physical ports into one logical port. For example, you can combine Ethernet interfaces 1 through 4 into one logical port; all of the parameters for the link are then written to Ethernet interface 1.
Startup without redundancy
In non-backup operation (an SI with backup is covered under the heading Startup with redundancy), a single SI reads its configuration file and from it determines who the real servers are. The SI ARPs for these servers, and when it receives a response, it records the MAC address and the port on which the response was received. In this way it knows exactly where its servers are, and those servers are not bound to any particular physical port. Since the SI can handle 1,000,000 sessions, all sessions can be dynamically bound to a single interface; the SI is not limited to a fixed number of sessions per interface.
Other health checks are performed before the servers are placed on the active rotation list. By default, these include PINGing the IP address of the real server and, for well-known TCP-bound ports, a TCP ACTIVE OPEN request as a single health check. Starting with release 5.0, the network administrator can assign which ports are TCP and which are UDP, allowing TCP Active Open health checking of “not-so-well-known” ports; this allows for UDP health checking as well. Once all services have been positively acknowledged, the server is placed on the active rotation list. The ServerIron then waits for TCP SYN connection packets and assigns each connection to a particular server using one of three configurable load-balance metrics (detailed later, and sketched after this list):
• Round Robin
• Least connections
• Weighted
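As a rough illustration of how these metrics differ, the Python sketch below selects a real server under each policy. The server names, connection counts, and weights are hypothetical, and the exact algorithms the ServerIron implements (particularly for the weighted metric) may differ.

    import itertools

    servers = ["web1", "web2", "web3"]                 # rotation list
    active_conns = {"web1": 12, "web2": 7, "web3": 9}  # open sessions per server
    weights = {"web1": 3, "web2": 1, "web3": 1}        # higher weight -> more traffic

    rotation = itertools.cycle(servers)

    def round_robin():
        """Each new connection goes to the next server in the rotation."""
        return next(rotation)

    def least_connections():
        """Pick the server currently holding the fewest open sessions."""
        return min(servers, key=lambda s: active_conns[s])

    def weighted():
        """Pick the server with the lowest connections-per-weight ratio."""
        return min(servers, key=lambda s: active_conns[s] / weights[s])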
For sessions that are already built in the table, an inbound TCP/UDP packet is examined for its port number. Upon extracting this information, the ServerIron makes a load-balance decision: the datagram is forwarded to a particular server supporting the requested inbound TCP/UDP port number. The server can be selected by round robin, least connections, or weights. The packet is forwarded to the selected server supporting that port number, and after this selection an IP address translation takes place.
Redundancy
Depending on the method of implementation, there are two backup scenarios:
• ServerIron Redundancy
• Real Server Redundancy
Let’s discuss ServerIron redundancy first. There are always two physical ServerIrons involved, usually in the same broadcast domain, but they operate differently depending on the configuration, as detailed below.
Single-Active SI with Backup
In this backup configuration, one SI is the active SI and the other becomes the standby (backup) SI. The active SI and the standby SI have the exact same configuration files. The backup command in each SI contains the MAC address that is to be shared between the two SIs. A backup link runs between the two SIs; it not only provides for heartbeat signals but also allows information about sessions to be transferred between the two SIs (classic-SLB mode only; this mode of operation is detailed later). ARP, MAC, session-table, and session-statistics information is transferred between the two SIs. The standby SI is dormant: it does not answer ARPs on behalf of the VIPs and does not pass traffic (there are a few exceptions, but those packets do not interfere with the operation). Heartbeat signals are sent between the two SIs every 1/10th of a second. If these signals are missed over a period of a second, the standby SI determines whether the active SI is still alive by checking its data-path interfaces (all the interfaces except the backup-link interface). If the standby SI can “hear” the active SI, then no convergence takes place (if you have a sniffer on line, these packets are identified by the special MAC address 00E0.5200.0000). They serve no purpose other than to indicate that the active SI is alive. The standby SI does not send these packets out, and if no backup command exists in the config file, these packets are not sent at all.
If the standby SI does not hear from the active SI by either method, the standby SI becomes the active SI. It immediately ARPs itself to allow all L2 forwarding devices to update their forwarding tables. ARP tables need not be updated, for the standby SI uses the same MAC address as the active SI did for all of its VIPs; as far as ARP tables are concerned, the same switch is operating. A sketch of this failover decision follows.
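The Python sketch below models the standby SI’s decision logic described above. The timing constants come from the text (heartbeats every 1/10th of a second, reaction after roughly a second of silence); the three callback functions are hypothetical stand-ins for the backup-link listener, the data-path check for 00E0.5200.0000 frames, and the takeover action.

    import time

    HEARTBEAT_INTERVAL = 0.1   # active SI sends a heartbeat every 1/10th of a second
    HEARTBEAT_TIMEOUT = 1.0    # standby reacts after about a second of silence

    def standby_loop(heard_on_backup_link, heard_on_data_path, take_over):
        """Converge only when the active SI is silent on both the backup
        link and the data-path interfaces."""
        last_heartbeat = time.monotonic()
        while True:
            if heard_on_backup_link():
                last_heartbeat = time.monotonic()
            elif time.monotonic() - last_heartbeat > HEARTBEAT_TIMEOUT:
                # Backup link is silent; look for the active SI's "alive"
                # frames (MAC 00E0.5200.0000) on the data-path interfaces.
                if not heard_on_data_path():
                    take_over()  # become active; ARP the shared MAC at once
                    return
            time.sleep(HEARTBEAT_INTERVAL)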
Dual-Active SIs with Backup
For Dual-Active SIs with a backup configuration, both SIs are active at L2, each serving the configured VIPs for which it has priority. Each SI is configured with one or more VIPs, which may or may not be backed up by the other SI, and each VIP is assigned a priority: the higher the priority, the more likely that SI will be the master for that VIP. Each SI then monitors the other to ensure that the active SI for a VIP remains active. If for any reason that SI is not heard from, the SI that is standby for the VIP will make that VIP active on itself.
In a Dual-Active SI configuration (current software release 5.x), there is no backup command associated with the SI. Furthermore, there is no private link between the two SIs; the SIs find each other through the data path and must be in the same broadcast domain. A backup link may be required in a later release.
Real Server Redundancy
This type of redundancy is provided through the ServerIron. Redundancy can of course be provided for the real servers by placing multiple NICs in them, but the ServerIron can provide redundancy as well.
There is a special command known as remote-name. When a ServerIron has determined that all of its real servers have failed, it can use the servers listed in the remote-name command: requests for services of that ServerIron’s VIP will be redirected to the IP address indicated by the remote-name. The remote-name can be another VIP on another SI, or it can be that of a real server.
For HTTP requests only, the ServerIron can also provide redundancy when all of its real servers fail by issuing an HTTP redirect to the requester. This is an HTTP 1.0/1.1 mechanism that allows for the redirection of a request to another server.
Data-flow with Classic-SLB
In classic-SLB mode, the ServerIron must be in the direct path of the data flow. An inbound connection is attempted to the VIP on the ServerIron. This request is translated and passed through to the real server (no connections are terminated on the ServerIron). This means that both the inbound and return paths must physically pass through the SI.
The destination IP address of the inbound packet is rewritten (from the VIP address) to the real server’s IP address, the IP checksum is recalculated, the destination MAC address of the real server is placed in the data-link header, and the packet is forwarded to the real server. This is known as partial NATing of the address. Running the ServerIron in this non-DSR operating environment forces the return path of a forwarded packet back through the SI for translation: datagrams received from the real servers are translated again, placing the VIP’s IP address in the source IP address field of the IP header; the checksum is recalculated and the packet is forwarded to the appropriate next hop. In this way, the Foundry Networks ServerIron products are extremely fast and provide complete transparency to both users and servers. A sketch of the forward-path rewrite follows.
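The Python sketch below illustrates the forward-path rewrite on a raw IPv4 header. It is a teaching aid only (the ServerIron performs this translation in its switching hardware, and the address used is hypothetical); the return path performs the mirror-image rewrite on the source-address field (bytes 12-16).

    import socket
    import struct

    def ip_checksum(header: bytes) -> int:
        """Standard one's-complement sum over the 16-bit words of an IP header."""
        if len(header) % 2:
            header += b"\x00"
        total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
        while total >> 16:
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    def rewrite_inbound(ip_header: bytes, real_server_ip: str) -> bytes:
        """Replace the destination IP (the VIP) with the real server's IP
        and recompute the header checksum, as on classic-SLB's forward path."""
        hdr = bytearray(ip_header)
        hdr[16:20] = socket.inet_aton(real_server_ip)  # destination address field
        hdr[10:12] = b"\x00\x00"                       # zero the checksum field
        struct.pack_into("!H", hdr, 10, ip_checksum(bytes(hdr)))
        return bytes(hdr)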
Data-flow with Direct Server Return (DSR) mode
When designing the network, you should allow the real server to forward packets directly back to the requester. The return packet can traverse the active SI, but its speed of return is greatly reduced compared to not traversing the originating SI. This means that the return path should not include the ServerIron: it can be there, but it does not have to be. Having the SI as an intermediate device on the return path forces a received packet to be sent to the processor for path determination, which reduces the speed to the non-DSR effective throughput. This is demonstrated later in the topology design chapter of this paper. Therefore, if designing a DSR topology, you should use either the single SI with no backup or the dual-active SI with backup.
The DSR function of the ServerIron allows the real server to communicate directly with the requester on the return path of the data flow; the return path bypasses the ServerIron, and no translation takes place on it. This allows for tremendous speed improvements, up to the wire speed of the topology. The improvement is governed by the available bandwidth of the path back to the requester.
The details of how DSR works are given in Chapter 4, but a brief description is given here. DSR operates on the idea that there will be two or more identical IP addresses in the same broadcast domain, serving two functions. One is the IP address of a VIP on the ServerIron (the logical ports associated with this VIP are marked as DSR mode). The same IP address is also placed on the real server or servers. The law of IP uniqueness has been stretched here, but no other station on the broadcast domain will know it, because of a little-known property of the loopback interface.
The real server is now assigned multiple IP addresses: one for the NIC and two for the loopback interface. The NIC is assigned a unique IP address; the loopback interface is assigned the same address as the VIP on the ServerIron. Each real server in the SI’s rotation list for a port number will have the same IP address. The uniqueness is preserved by the inability of a loopback interface to respond to ARP requests: when the IP address is ARPed from any inbound interface, it is the ServerIron that responds, and the inbound packet is sent to the ServerIron. In turn, the ServerIron sends this packet to the next real server in the rotation list, translating only the destination MAC address, and the packet is then forwarded to that real server. An example of the server-side configuration follows.
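As an example of the server-side arrangement, on a Unix-style real server the VIP can be added as a loopback alias, which accepts traffic for the VIP but never answers ARP for it. The address is hypothetical, and the exact commands vary by operating system:

    # add the VIP (here 192.168.2.1) as a loopback alias on the real server
    ifconfig lo:1 192.168.2.1 netmask 255.255.255.255 up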
As far as the real server is concerned, the packet was forwarded directly to it by the requester, not by the ServerIron. The real server processes the packet, and when it responds, it responds directly to the requester. No translation of any of the headers occurs in the return path of this data flow.
As you learned with classic-SLB, translation occurs on both the forward and return paths of the data flow: setting aside Source-NAT (the ability of the ServerIron to translate the source IP address before sending the packet to the real server), the SI translates the destination MAC, the destination IP, and the IP checksum. With DSR enabled, translation occurs only on the inbound path, and only the destination MAC is translated. This improves the speed of forwarding the packet to the real server. On the return, there is no translation, for the packet is forwarded directly back to the requester at whatever speed the bandwidth will allow. In many environments, speeds up to 2 Gbps are attainable.
Foundry Networks holds the speed record for load balancing!
Health Checks
Load balancing is just one feature that Foundry Networks ServerIrons can provide. But without some capability of health checking the services being load balanced, you may as well provide only round-robin DNS.
Foundry Networks provides protocol-level health checks for Layers 2, 3, 4, and 5-7. These are defined as:
• Layer 2 (ARP)
• Layer 3 (PING for UDP and server connectivity)
• Layer 4 (TCP Active Open, on a per-TCP-port basis)
• Layer 4 for not-so-well-known ports – assign “unknown” ports as TCP ports and health check them
Along with the L4 health checks, the ServerIron allows for the assignment of timers and retry counters to the health-check intervals (Layer 3 and higher). With this, we can health check at any timer period.
But proving the existence of an application does not prove it can respond to requests. Therefore Foundry Networks provides application-level (Layer 7) health checks for the following protocols:
• URL (HTTP)
• TELNET
• FTP – (port 21)
• POP3
• SMTP
• IMAP4
• RADIUS-old (port 1645)
• RADIUS (port 1812)
• DNS
• LDAP
• NNTP
User-defined Health Checking (port profiles)
Besides these application-layer checks, the ServerIron offers the ability to define ports as TCP or UDP and have those ports application checked as well. For example, let’s say you have defined multiple instances of a Web server to run on a single machine. To accomplish this, you must assign a unique port number to each instance of the Web server. With the ServerIron’s port-profile feature, you can tell the ServerIron which port has been defined, assign it a TCP attribute, and even assign it to perform Layer 7 URL health checks. Another example would be running an application that has been built in house: the application runs on a TCP stack and is assigned port number 30000. The ServerIron can be told that this port is a TCP port, and it will then provide Layer 4 health checks on this port, as sketched below.
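A minimal sketch of such a port profile, in the style of the ServerIron CLI, might look like the following for the in-house application on port 30000. The exact syntax is an assumption and varies by software release:

    server port 30000   ! define a port profile for the custom application
     tcp                ! mark it as TCP so L4 (TCP Active Open) checks apply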
Congestion Avoidance
There are two features the ServerIron provides for congestion avoidance:
• QoS using access policies
• Server reassignment
The ServerIron provides multiple methods to achieve QoS. QoS policies can be set by TCP/UDP port number, MAC address, or VLAN. The network administrator can write policies in the ServerIron to indicate which packets have priority over others and are therefore placed into a higher-priority queue for forwarding.
An advanced yet easy-to-use feature that the ServerIron provides for load-balanced connections is server reassignment. This feature allows for the checking of server response time. If a server does not respond within three connection requests (possibly indicating a congested server), the ServerIron moves the connection to the next server in the rotation (according to the port bindings; that is, it moves the connection to the next server in the rotation supporting that logical port). Even though that particular connection request has been moved to the next server in the rotation, the ServerIron does not yet mark the service as failed. After a settable number of requests to a single service have gone unanswered (known as the reassign threshold), the ServerIron takes that service (the logical port, not the server) out of the rotation list and health checks the service. If the service responds, the ServerIron places the service back in the rotation list and again forwards outside connection requests to it. To allow for outside notification of this event, the ServerIron transmits an SNMP TRAP to the specified trap receiver and writes the event to the log; for sites supporting SyslogD, the event is also written to the SyslogD server. Combined with the least-connections load-balancing algorithm, this feature allows the ServerIron to determine which real server is most available and redirect requests to that server. It continues this dynamically until other servers can respond in an adequate amount of time. This time is discerned by TCP and the user applications, not by special software that must be implemented on each real server. A rough sketch of the reassignment logic follows.
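The Python sketch below is a rough model of the reassignment behavior described above. The threshold values and the helper callbacks are hypothetical; on a real ServerIron the reassign threshold is configurable.

    REASSIGN_LIMIT = 3       # unanswered requests before a connection is moved
    REASSIGN_THRESHOLD = 6   # unanswered requests before the service is pulled

    unanswered = {}          # (server, logical_port) -> unanswered request count

    def on_unanswered_request(server, port, next_server, pull_and_health_check):
        """Called each time a connection request to (server, port) goes unanswered."""
        key = (server, port)
        unanswered[key] = unanswered.get(key, 0) + 1
        if unanswered[key] >= REASSIGN_THRESHOLD:
            # Pull only this service (the logical port, not the whole server)
            # out of the rotation; health check it and restore it if it responds.
            pull_and_health_check(server, port)
        if unanswered[key] >= REASSIGN_LIMIT:
            # Move this connection to the next server bound to the same port.
            return next_server(port)
        return server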
The advantage of this feature is that no special software is required on the real servers. The ServerIron makes the determination of possible congestion through a naturally occurring event in the TCP/IP protocol. Again, this allows for ease of use, transparency, and greater throughput, and it ensures that you don’t spend your time “tweaking” software in ways that may produce unnecessary and unpredictable events.
Multiple NICs in the Servers
Redundancy comes in many forms, and the ServerIron, in the right topology, can provide for it. The real servers can provide redundancy as well. Running multiple NICs in a real server is operating-system dependent and in no way depends on the ServerIron. Multiple NICs in a real server can provide redundancy, and they can also provide real-time load sharing. There are usually two modes of operation: all NICs active, or one active while another waits for it to fail. In either case, redundancy can be provided by the real servers.
Futures
Foundry Networks will be providing alternative topologies to suit different customer environments; however, it is not the purpose of this paper to review these features or the new topologies used to enable them. These features, currently in beta testing, include:
• Global server load balancing – available by summer of 1999, the ability to provide SLB no matter where the servers exist on the Internet. This feature will allow the ServerIron to provide health-checked DNS services based on checks such as delay, reachability, route policies, or BGP AS_PATH cost.
• Proxy Server Load Balancing – SSL ID and URL switching
The next migration and alternative would be to integrate the back-end networks into a single pair of switching routers using VLANs and Layer 3 routing. This topology would integrate the ServerIron technology and the bottom two BI4000 switches. The servers would also be dual connected, and VLANs or applications would be set up for the highest availability by striping them across different slots and ports on the BigIron. Integrating these networks will simplify the current configuration, reduce the amount of network hardware required to support the different networks, and provide the highest availability. There are numerous options for integrating networks and optimizing the overall design. The BigIron can also support Gigabit server connections as well as 10/100, allowing scalable growth not found in any other product. All the software features in the NetIron are also available in the BigIron product. If we use GateD (routing) on the servers (Sun), we can also dual-home every server and use all links – very efficient.
Chapter 2 – Single Active SI with Backup
The first topology to be looked at for the Single Active SI with redundancy is the six-pack topology. This topology is used in many installations and has been a trusted method of implementing SLB for two years. The topology as shown here allows for 100% redundancy in either a switched or routed configuration. However, network design is really based on acceptable risk: this topology can be slimmed down to require fewer components, but in doing so the risk factor rises. There are two methods of implementation: L2 switching and L3 switching. In order to effect the L3 topology (discussed first; L2 is discussed next), several pieces of hardware