Maipu Confidential & Proprietary Information Page 1 of 628
MyPower Switch Technical Manual
Maipu Communication Technology Co., Ltd No. 16, Jiuxing Avenue Hi-tech Park Chengdu, Sichuan Province People’s Republic of China - 610041 Tel: (86) 28-85148850, 85148041 Fax: (86) 28-85148948, 85148139 URL: http:// www.maipu.com Email: [email protected]
All rights reserved. Printed in the People’s Republic of China. No part of this document may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into any language or computer language, in any form or by any means, electronic, mechanical, magnetic, optical, chemical, manual or otherwise without the prior written consent of Maipu Communication Technology Co., Ltd. Maipu makes no representations or warranties with respect to this document contents and specifically disclaims any implied warranties of merchantability or fitness for any specific purpose. Further, Maipu reserves the right to revise this document and to make changes from time to time in its content without being obligated to notify any person of such revisions or changes. Maipu values and appreciates comments you may have concerning our products or this document. Please address comments to: Maipu Communication Technology Co., Ltd No. 16, Jiuxing Avenue Hi-tech Park Chengdu, Sichuan Province People’s Republic of China - 610041 Tel: (86) 28-85148850, 85148041 Fax: (86) 28-85148948, 85148139 URL: http:// www.maipu.com Email: [email protected] All other products or services mentioned herein may be registered trademarks, trademarks, or service marks of their respective manufacturers, companies, or organizations.
Task name
Entry address of the task
Task ID
Task priority
Task status in the system
Program counter, the instruction address of the current task
The stack address of the task
The error code of the task
Task delay time
The task is delayed
The task is suspended
Delayed and suspended
Pended and suspended
Pended with a timeout value
Pended with a timeout value and suspended
The state has an inherited priority
Major functions of each task (common or configured)
tExcTask Exception task; supports the VxWorks exception-handling package and performs the functions that cannot be performed at interrupt level. The task must have the highest priority. Do not suspend, delete, or change the priority of this task.
tLogTask Log task, for the VxWorks to record the system information.
tExcTrace Display the system kernel information.
tSysWdog The watchdog task; when the switch encounters major faults, automatic restart can be performed.
tShell1 Shell task.
tSysLog Print the output information and write the specific information into the logging file.
tFwdTask System core forwarding task
tNetTask Task-level processing in the VxWorks network.
tSysTimer System timer
tActive Switching status detection
tSysTask Background system task; process the non-realtime system functions.
tTnd00 Forwarding task of the telnet
tSh00 Shell task of the telnet
tTffsPTask File system management task
tTelnetd The receiving task of Telnet; detect the connection request of the client.
Semaphore types include MUTEX, BINARY, and COUNTING.
Task queuing (PRIORITY or FIFO)
The show semaphore command takes different parameters to display
different information:
show semaphore _STRING_: display the information about a specific
semaphore
show semaphore list: display the list of the current semaphores
show semaphore binary | counting | mutex any | pended | unpended
detail | summary: display the information about semaphores of the
specified type. pended selects the semaphores on which tasks are
blocked; unpended selects the semaphores on which no task is blocked;
detail displays the detailed information; summary displays the
summary information.
show memory Display the memory usage in the system:
switch#show memory Displayed Content
Memory management mechanism, types, and usage.
SUMMARY
-------
Type  Used bytes  Free bytes  Total bytes  Used percent
----  ----------  ----------  -----------  ------------
HEAP    21291496    28001744     49293240        43.19%
CODE    17810592           /     17810592             /
SLAB      539040      349792       888832        60.65%
MBUF      755936    16081824     16837760         4.49%
Note
The space of all such memory types exclude CODE is part of the HEAP's used memory,for example:MBUF,SLAB,and FPSS if exists.
(The memory of all memory management mechanisms (such as MBUF, SLAB, and FPSS, if they exist) except the CODE segment are part of the used memory of HEAP.)
STATISTICS
----------
Used bytes  Free bytes  Total bytes  Used percent
----------  ----------  -----------  ------------
  22670472    44433360     67103832        33.78%
Note
Meaning of each item:
HEAP Heap memory, the most basic memory area in the system. The other memory re-allocation management mechanisms take their memory from this area.
CODE Code segment memory, the area that stores the code segment.
SLAB A memory re-allocation management mechanism.
MBUF A memory re-allocation management mechanism.
FPSS A memory re-allocation management mechanism; exists in MP3700, MP7200, and MP7500.
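The relationship between the columns of the show memory output can be checked with a few lines of Python. This is only an illustrative sketch: the function name used_percent and the summary dictionary are hypothetical, and the figures are copied from the sample SUMMARY output above. It assumes that Total bytes is the sum of Used bytes and Free bytes and that Used percent is used/total rounded to two decimals, which matches the sample values.

```python
def used_percent(used_bytes: int, total_bytes: int) -> float:
    """Return used memory as a percentage of the total, rounded to two decimals."""
    return round(100.0 * used_bytes / total_bytes, 2)


# (used bytes, free bytes) per memory type, copied from the sample output.
summary = {
    "HEAP": (21291496, 28001744),
    "SLAB": (539040, 349792),
    "MBUF": (755936, 16081824),
}

for name, (used, free) in summary.items():
    total = used + free  # assumed: "Total bytes" = used + free
    print(f"{name}: total={total} used={used_percent(used, total)}%")
```

The CODE row is left out because its Free bytes and Used percent columns are shown as "/" in the output.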
The show memory command takes different parameters to display
different information:
show memory FPSS|HEAP|MBUF|SLAB: display the memory usage of the
specified memory management mechanism
show memory FPSS|MBUF|SLAB _POOLNAME_: display the usage of a
memory pool in the specified memory management mechanism
show memory detail: display the usage details of system memory
show memory detail FPSS|HEAP|MBUF|SLAB: display the detailed memory
usage of the specified memory management mechanism
show memory detail FPSS|HEAP|MBUF|SLAB _POOLNAME_: display the
detailed usage of a memory pool in the specified memory management
mechanism
show arp Display the ARP cache of the system.
switch#show arp Displayed Content
Protocol Address Age (min) Hardware Addr Type Interface
Internet 128.255.41.40 2 0022.153b.55e4 ARPA vlan1
Internet 128.255.41.47 - 0001.7a5c.004a ARPA vlan1
Internet 128.255.43.254 0 0001.7a58.19ba ARPA vlan1
Note
An age of - indicates a static ARP entry.
show ip socket Display the information about the sockets in the active status:
switch#show ip socket Displayed Content
Active Internet connections (including servers)
PCB      Proto Recv-Q Send-Q Local Address          Foreign Address        vrf     (state)
-------- ----- ------ ------ ---------------------- ---------------------- ------- -------
PCB The address of the socket protocol control block (PCB)
Proto The protocol type of the socket
Recv-Q The quantity of data in the receiving cache of the socket
Send-Q The quantity of data in the sending cache of the socket
Local Address The local IP address and port number bound to the socket (0.0.0.0.23 indicates any local IP address and port number 23).
Foreign Address The foreign IP address and the port number corresponding to the socket.
vrf VPN routing and forwarding instance
state The status of the socket (applies to TCP only)
show pool Display the current buffer pools. The command has three forms:
show pool (show the summary of the pools)
show pool detail (show the details of the pools)
show pool information (show the actual state of the buffer chains)
fact free number: the actual number of mBlks counted by traversing the
mBlk chain
free clBlk number: the number of free clBlks
In CLUSTER POOL TABLE, fact indicates the number of clusters
counted by traversing the cluster chain
switch#show pool detail Displayed Content
fastethernet pool
Statistics for the network stack
mbuf type number
--------- ------
FREE : 1022
DATA : 2
HEADER : 0
SOCKET : 0
PCB : 0
RTABLE : 0
HTABLE : 0
ATABLE : 0
SONAME : 0
ZOMBIE : 0
SOOPTS : 0
FTABLE : 0
RIGHTS : 0
IFADDR : 0
CONTROL : 0
OOBDATA : 0
IPMOPTS : 0
IPMADDR : 0
IFMADDR : 0
MRTABLE : 0
DRVSCC : 0
DRV8SA : 0
DRV8S : 0
DRV16A : 0
DRV4M336: 0
DRVEXTSCC: 0
TOTAL : 1024
number of mbufs: 1024
number of times failed to find space: 0
number of times waited for space: 0
number of times drained protocols for space: 0
__________________
CLUSTER POOL TABLE
_______________________________________________________________________________
size     clusters  free      usage
-------------------------------------------------------------------------------
1556     512       256       599
-------------------------------------------------------------------------------
Link pool
Statistics for the network stack
mbuf type number
--------- ------
FREE : 1640
DATA : 0
HEADER : 0
SOCKET : 0
PCB : 0
RTABLE : 0
HTABLE : 0
ATABLE : 0
SONAME : 0
ZOMBIE : 0
SOOPTS : 0
FTABLE : 0
RIGHTS : 0
IFADDR : 0
CONTROL : 0
OOBDATA : 0
IPMOPTS : 0
IPMADDR : 0
IFMADDR : 0
MRTABLE : 0
DRVSCC : 0
DRV8SA : 0
DRV8S : 0
DRV16A : 0
DRV4M336: 0
DRVEXTSCC: 0
TOTAL : 1732
number of mbufs: 1732
number of times failed to find space: 0
number of times waited for space: 0
number of times drained protocols for space: 0
__________________
CLUSTER POOL TABLE
_______________________________________________________________________________
size     clusters  free      usage
-------------------------------------------------------------------------------
64       1600      1600      0
128      10        10        0
256      10        10        0
512      10        10        0
1024     10        10        0
2048     100       100       0
-------------------------------------------------------------------------------
Size: 461120 bytes
sys pool
Statistics for the network stack
mbuf type number
--------- ------
FREE : 11560
DATA : 1
HEADER : 0
SOCKET : 2
PCB : 3
RTABLE : 22
HTABLE : 0
ATABLE : 0
SONAME : 0
ZOMBIE : 0
SOOPTS : 0
FTABLE : 0
RIGHTS : 0
IFADDR : 8
CONTROL : 0
OOBDATA : 0
IPMOPTS : 0
IPMADDR : 4
IFMADDR : 0
MRTABLE : 0
DRVSCC : 0
DRV8SA : 0
DRV8S : 0
DRV16A : 0
DRV4M336: 0
DRVEXTSCC: 0
TOTAL : 38400
number of mbufs:38400
number of times failed to find space: 0
number of times waited for space: 0
number of times drained protocols for space: 0
__________________
CLUSTER POOL TABLE
_______________________________________________________________________________
size     clusters  free      usage
-------------------------------------------------------------------------------
64       8000      7973      28
128      16000     15959     59
256      3200      3199      1
512      3200      3192      26
-------------------------------------------------------------------------------
Size: 7801600 bytes
Data pool
Statistics for the network stack
mbuf type number
--------- ------
FREE : 7999
DATA : 0
HEADER : 0
MRTABLE : 0
DRVSCC : 0
DRV8SA : 0
DRV8S : 0
DRV16A : 0
DRV4M336: 0
DRVEXTSCC: 0
TOTAL : 8000
number of mbufs: 8000
number of times failed to find space: 0
number of times waited for space: 0
number of times drained protocols for space: 0
__________________
CLUSTER POOL TABLE
_______________________________________________________________________________
size     clusters  free      usage
-------------------------------------------------------------------------------
64       800       800       4
128      200       199       27520
256      200       200       0
512      100       100       0
1024     80        80        0
2048     50        50        0
-------------------------------------------------------------------------------
Size: 767000 bytes
Driver pool
Statistics for the network stack
mbuf type number
--------- ------
FREE : 1388
DATA : 112
OOBDATA : 0
IPMOPTS : 0
IPMADDR : 0
IFMADDR : 0
MRTABLE : 0
DRVSCC : 56
DRV8SA : 0
DRV8S : 0
DRV16A : 0
DRV4M336: 4
DRVEXTSCC: 4
TOTAL : 6000
number of mbufs: 6000
number of times failed to find space: 0
number of times waited for space: 0
number of times drained protocols for space: 0
badsum The number of packets with an incorrect checksum.
tooshort The received packet is shorter than the length stated in the length field of the IP header.
toosmall The received packet is shorter than the IP header length (20 bytes).
badhlen The header length field of the IP header is smaller than the minimum IP header length (20 bytes).
badlen The total length field of the IP header is smaller than the IP header length.
infragments The number of received fragments
fragdropped The number of dropped fragments
fragtimeout The number of fragments dropped because of timeout
forward The number of forwarded packets
cantforward The number of packets that cannot be forwarded
redirectsent The number of redirect packets sent
unknownprotocol The number of packets with an unknown protocol
toupper The number of packets delivered to the upper layer
nobuffers The number of times no buffer was available
reassembled The number of reassembled packets
outfragments The number of sent fragments
noroute The number of times no route was found
rawsockout The number of raw IP packets sent
badaddress The number of packets with an illegal address
fastforwardtotal The total number of fast forwarded packets
fastforward The number of fast forwarded packets
cannotfastforward The number of packets that cannot be fast forwarded
show ip icmpstate Display the statistics of the ICMP packets:
switch#show ip icmpstate Displayed Content
Statistics for ICMP protocol
6929 calls to icmp_error
0 error not generated because old message was icmp
Output histogram:
echo reply: 5
destination unreachable: 24
0 message with bad code fields
0 message < minimum length
0 bad checksum
0 message with bad length
Input histogram:
echo: 5
#10: 2
5 message responses generated
Note
calls to icmp_error The number of times icmp_error was called to send an ICMP error packet
error not generated because old message was icmp The number of errors not generated because the offending packet was itself an ICMP packet
Output histogram The histogram of the sent ICMP packets
echo reply The number of ICMP echo reply packets
destination unreachable The number of ICMP destination unreachable packets
message with bad code fields The number of ICMP packets discarded for an invalid code
message < minimum length The number of packets discarded because the ICMP header is too short
bad checksum The number of ICMP packets discarded for a bad checksum
message with bad length The number of ICMP packets discarded for an invalid ICMP body length
Input histogram The histogram of the received ICMP packets
echo The number of ICMP echo packets
#10: 2 Two ICMP packets of type 10 (router solicitation) were received
message responses generated The number of generated response packets
Switch Principles
This chapter describes the principles of switching to help users understand
the later chapters.
Main contents:
The development of the switching technology
The basic working principle of the switch
Multiple layer switching technology
Comparison between the switch and other network communication
products
Development of the Switching Technology
The following describes the development of the LAN.
The combination of computer technology and communication technology
boosted the rapid development of the LAN. From the 1960s to the 1990s,
the LAN developed from ALOHA to 1000 Mbps switched Ethernet. In those
thirty years, the technology leapt from simplex to duplex, from
sharing to switching, from low speed to high speed, from simple to
complex, and from expensive to popular.
In the late 1980s, the rapid increase in traffic volume boosted the
development of the technology. As a result, LAN performance improved
steadily, and the 1 Mbps rate was replaced by 100BASE-T and
100VG-AnyLAN. However, with the traditional media access method,
CSMA/CD, many stations still share a common transmission medium.
In the early 1990s, with the improvement of computer performance
and the increase in traffic volume, the traditional LAN became overloaded.
Switched Ethernet technology emerged and significantly improved the
performance of the LAN. Compared with a shared-media LAN topology
based on bridges and routers, the network switch provides more
bandwidth. With the switching technology, a distributed
network can be constructed in which the ports of the LAN switch
transmit information in parallel, safely, and simultaneously. Therefore,
the LAN can be expanded substantially.
The LAN switching technology goes back to the two-port bridge. The bridge
is a store-and-forward device for connecting similar LANs. In terms of
internetwork structure, the bridge is a DCE-class point-to-point
connection. In terms of protocol layers, the bridge stores and forwards
data frames at the data link layer, just as a repeater works at L1 and a
router at L3. The two-port bridge developed at the same time as Ethernet.
The Ethernet switching technology was developed in the 1990s based on the
multiple-port bridge. It implements the lower two protocol layers and is
closely related to the bridge; professionals even call it "many
connected bridges". Therefore, the current switching technology is not a
new standard; it is a new application of existing technology, an
improved LAN bridge. Compared with the traditional bridge, the switching
technology provides more ports, better performance, more powerful
management functions, and a lower price.
Basic Working Principle of the Switch
The LAN switching technology works at L2 (the data link layer) of the OSI
model. "Switching" means forwarding frames. In data
communication, all switching devices (namely the switches) perform
two basic tasks:
Frame forwarding: forward the frames received from the input media to
the corresponding output media;
Address learning: construct and maintain the switching address
table that the switch operation depends on.
The following describes the details of the two basic operations.
Frame Forwarding
The switch forwards frames according to the MAC address. When the
switch forwards frames, the following rules are observed:
1. If the destination MAC address of the frame is a broadcast or
multicast address, the frame is forwarded to all ports of the switch
(except the source port of the frame).
2. If the destination address of the frame is a unicast address, but the
address is not in the address table of the switch, the frame is
forwarded to all ports (except the source port of the frame).
3. If the destination address of the frame is in the address table of the
switch, the frame is forwarded to the corresponding port according to
the address table.
4. If the destination address and the source address of the frame are in
the same network segment (the address table maps the destination to
the port on which the frame arrived), the frame is discarded and
switching is not performed.
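The four rules can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used by the switch: the names forward_ports, is_group_address, table, and ports are hypothetical, the address table is modeled as a plain dictionary, and the addresses and port names are the ones used in the host D example of Figure 2-1 in this section. Note that the check for rule 4 has to run before rule 3.

```python
def is_group_address(mac: str) -> bool:
    """True for broadcast and multicast addresses (lowest bit of the first octet set)."""
    first_octet = int(mac.replace(".", "")[:2], 16)
    return bool(first_octet & 0x01)


def forward_ports(dst_mac, in_port, mac_table, all_ports):
    """Return the list of ports a received frame is sent to."""
    flood = [p for p in all_ports if p != in_port]
    if is_group_address(dst_mac):   # rule 1: broadcast or multicast
        return flood
    out_port = mac_table.get(dst_mac)
    if out_port is None:            # rule 2: unknown unicast
        return flood
    if out_port == in_port:         # rule 4: same segment, discard
        return []
    return [out_port]               # rule 3: known unicast


ports = ["E0", "E1", "E2", "E3", "E4"]
table = {"0260.8c01.1111": "E0", "0260.8c01.6666": "E3"}  # host A and host F

print(forward_ports("ffff.ffff.ffff", "E3", table, ports))  # broadcast from host D
print(forward_ports("0260.8c01.1111", "E3", table, ports))  # host D to host A
```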
The following figure illustrates the frame switching.
Figure 2-1 Frame forwarding
When host D sends a broadcast frame, the switch receives a frame with
the destination address ffff.ffff.ffff on port E3 and forwards it
to ports E0, E1, E2, and E4.
When host D communicates with host E, the switch receives a frame with
the destination address 0260.8c01.5555 on port E3. The switch searches
the address table and finds that 0260.8c01.5555 is not in the table.
Therefore, the switch forwards the frame to ports E0, E1, E2, and E4.
When host D communicates with host F, the switch receives a frame with
the destination address 0260.8c01.6666 on port E3. The switch searches
the address table and finds that 0260.8c01.6666 is at port E3, that is,
the destination address and the source address are in the same network
segment. Therefore, the switch does not forward the frame and drops it
directly.
When host D communicates with host A, the switch receives a frame
with the destination address 0260.8c01.1111 on port E3. The switch
searches the address table and finds that 0260.8c01.1111 is at port E0.
Therefore, the switch forwards the frame to port E0, and host A receives
the frame.
If host B is sending data to host C while host D communicates with host
A, the switch also forwards the frames from host B to port E2, which
connects host C. In this case, the hardware switching circuit in the
switch creates two links, one between E1 and E2 and one between E3 and
E0. The data communication on one link does not affect the other,
so no network conflicts occur. The communication between host D and
host A occupies a link exclusively, and the communication between
host B and host C also occupies a link exclusively.
Such a link is created only when the two parties need to communicate.
After the data has been transmitted, the corresponding link is removed.
This is the major feature of the switch.
According to the switching process described previously, we can find that
the forwarding of frames is based on the MAC address table in the switch.
The following describes the creation and maintenance of the address table.
Address Learning Process
In the address table of the switch, one entry is composed of one MAC
address and the number of the switch port where the address resides. The
whole address table is generated through dynamic self-learning: when the
switch receives a frame, the source address and the input port are
recorded in the switching address table. Figure 2-2 illustrates the
forwarding and learning of the received frames.
When a frame arrives on a specific port (port X), the switch concludes
that the workstation specified by the source address field of the frame
can be reached through port X. Therefore, the switch can
update the forwarding database for the MAC address. To allow for changes
in the network topology, each entry of the database has a life
timer. When a new entry is added to the database, the timer is started.
The default value of the timer is 30 seconds. When the timer expires,
the entry is removed. For each received frame, the switch searches the
database for an entry whose address field matches the source address of
the frame. If such an entry exists in the database, its content is
updated and its timer is reset. If no such entry exists, a new entry is
added to the database: the address in the new entry is the source MAC
address of the received frame, the port number is the port on which the
frame was received, and the timer is set to its initial value.
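The learning and aging behavior can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the switch's implementation: the class name ForwardingDatabase and its methods are hypothetical, the current time is passed in explicitly so that the example is deterministic, and the 30-second default is the timer value given in this section.

```python
AGE_LIMIT = 30.0  # seconds; default life timer value from the text


class ForwardingDatabase:
    """MAC address table with a per-entry life timer (illustrative only)."""

    def __init__(self, age_limit: float = AGE_LIMIT):
        self.age_limit = age_limit
        self.entries = {}  # MAC address -> (port, time the timer was last reset)

    def learn(self, src_mac: str, in_port: str, now: float) -> None:
        """Add a new entry or update an existing one, and reset its timer."""
        self.entries[src_mac] = (in_port, now)

    def expire(self, now: float) -> None:
        """Remove the entries whose life timer has run out."""
        self.entries = {
            mac: (port, stamp)
            for mac, (port, stamp) in self.entries.items()
            if now - stamp < self.age_limit
        }

    def lookup(self, mac: str, now: float):
        """Return the port for a MAC address, or None if unknown or aged out."""
        self.expire(now)
        entry = self.entries.get(mac)
        return entry[0] if entry else None


fdb = ForwardingDatabase()
fdb.learn("0260.8c01.1111", "E0", now=0.0)   # frame from host A seen on E0
print(fdb.lookup("0260.8c01.1111", now=10.0))
print(fdb.lookup("0260.8c01.1111", now=31.0))
```

In this sketch the first lookup finds port E0, while the second one happens after the 30-second timer has expired and finds nothing, so a frame to that address would be flooded again until host A sends another frame.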
Figure 2-2 Bridge forwarding and address learning
Multiple Layer Switching Technology
The LAN switching technology is implemented in hardware. In the LAN
frame format, the position of the destination MAC address is fixed, and
checking the header information is simple, which facilitates hardware
switching. Therefore, traditional LAN switching refers to L2 switching,
that is, switching based on L2 information: the destination MAC address.
Before switching a frame, the switch needs to receive a certain amount
of data to make the forwarding decision. If the length of the
inspected data is increased, the L2 switching technology can be extended
to L3, or even L4, switching.
In the L3 switching technology, the inspected data is extended to the IP
packet header, and switching is performed by checking the IP address.
In effect, it is hardware-based routing. The L4 switching technology
also checks the protocol type in the IP packet header and the port
number in the transport-layer header. It can be regarded as
application-based switching.
The widely used multiple layer switching technology combines L2, L3, and
L4 switching technologies to implement the "one route, multiple
switching" function.
Comparison Between the Switch and Other Network Communication Products
Switch and the Switch Hub
The switch hub can provide terminals with exclusive bandwidth,
automatically create and maintain the station table, and create switching
path between the output and input ports according to the station table.
The switch is developed based on the switch hub. It provides the
preceding functions, and also provides the functions required by the
current network: information flow priority, service category, virtual
Length: the data length, that is, the length of the Packet Body. If
it is 0, there is no data.
Packet Body: the data contents, varying with the type.
2. EAP Message Format
When the Type of the EAPOL message is EAP-Packet, Packet Body is the
EAP packet structure, as follows:
Figure 9-3 EAP encapsulation format
Code: the EAP packet type, including Request, Response, Success, and
Failure. Success and Failure packets do not have a Data field, so the
value of Length is 4.
The Data field format of Request and Response is as follows. Type is the
EAP authentication type, and the contents of Type data depend on Type.
Figure 9-4 The Data field format of Request and Response
Identifier: used to match Response messages with Request messages.
Length: the length of the EAP packet, including the Code, Identifier,
Length, and Data fields.
Data: the contents of the EAP packet, depending on the Code type.
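The field layout described above can be illustrated with a short Python parser. This is a sketch under stated assumptions: the function name parse_eap and the dictionary keys are hypothetical, and the numeric Code values (1 to 4) and the field widths (1-byte Code, 1-byte Identifier, 2-byte Length, 1-byte Type) are taken from the EAP specification (RFC 3748), not from this manual.

```python
import struct

# Code values as assigned in RFC 3748 (assumption: not listed in this manual).
EAP_CODES = {1: "Request", 2: "Response", 3: "Success", 4: "Failure"}


def parse_eap(packet: bytes) -> dict:
    """Split an EAP packet into Code, Identifier, Length, and Type/Type data."""
    if len(packet) < 4:
        raise ValueError("EAP packet shorter than the 4-byte header")
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    if length < 4 or length > len(packet):
        raise ValueError("invalid EAP Length field")
    result = {
        "code": EAP_CODES.get(code, code),
        "identifier": identifier,
        "length": length,
    }
    # Only Request and Response carry a Data field (Type + Type data).
    if code in (1, 2) and length > 4:
        result["type"] = packet[4]
        result["type_data"] = packet[5:length]
    return result


# A Response/Identity packet (Type 1 = Identity) carrying the user name "bob".
response = struct.pack("!BBHB", 2, 1, 8, 1) + b"bob"
print(parse_eap(response))
# A Success packet has no Data field, so its Length is 4.
print(parse_eap(bytes([3, 1, 0, 4])))
```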
3. Encapsulation of EAP Attribute
To support the EAP authentication, RADIUS adds two attributes, that is,
EAP-Message and Message-Authenticator.
EAP-Message
Figure 9-5 EAP-Message encapsulation
As shown in figure 9-5, the attribute is used to encapsulate the EAP packet.
The type code is 79 and the String field is 253 bytes at most. If the
EAP packet is longer than 253 bytes, it is fragmented and
encapsulated in multiple EAP-Message attributes.
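The fragmentation rule can be sketched in Python as follows. This is an illustrative sketch, not the device's code: the function names are hypothetical, and it assumes the standard RADIUS attribute layout of a 1-byte Type, a 1-byte Length that counts the Type and Length bytes as well, and the String value.

```python
EAP_MESSAGE_TYPE = 79   # type code of the EAP-Message attribute
MAX_STRING_LEN = 253    # maximum size of the String field


def encode_eap_message(eap_packet: bytes) -> bytes:
    """Encapsulate an EAP packet in one or more EAP-Message attributes."""
    attrs = bytearray()
    for start in range(0, len(eap_packet), MAX_STRING_LEN):
        chunk = eap_packet[start:start + MAX_STRING_LEN]
        # Length covers the Type byte, the Length byte, and the String.
        attrs += bytes([EAP_MESSAGE_TYPE, len(chunk) + 2]) + chunk
    return bytes(attrs)


def decode_eap_message(attrs: bytes) -> bytes:
    """Concatenate the String fields of all EAP-Message attributes in order."""
    packet = bytearray()
    pos = 0
    while pos + 2 <= len(attrs):
        attr_type, attr_len = attrs[pos], attrs[pos + 1]
        if attr_len < 2:
            raise ValueError("invalid attribute length")
        if attr_type == EAP_MESSAGE_TYPE:
            packet += attrs[pos + 2:pos + attr_len]
        pos += attr_len
    return bytes(packet)


eap_packet = bytes(600)                 # a 600-byte EAP packet (dummy content)
encoded = encode_eap_message(eap_packet)
print(len(encoded))                     # 600 bytes of data + 3 two-byte headers
```

A 600-byte packet is split into three attributes carrying 253, 253, and 94 bytes, and the receiver rebuilds the packet by joining the String fields in the order in which the attributes appear.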
Message-Authenticator
Figure 9-6 EAP-Authenticator attribute
As shown in figure 9-6, the attribute is used to prevent the access request
packet from being monitored when using the EAP and CHAP authentication.
The packet with the EAP-Message attribute must contain Message-
Authenticator at the same time. Otherwise, the packet is regarded as
invalid and discarded.
802.1X Authentication
The authentication can be initiated by the Supplicant system or the
Authenticator system. The Authenticator system can actively send the
EAP-Request/Identity packet to the Supplicant system to initiate the
authentication; alternatively, the Supplicant system can send the
EAPOL-Start packet to the Authenticator system via the client software
to initiate the authentication. The following takes authentication
initiated by the Supplicant system as an example. The EAP protocol
supports multiple authentication methods. The following takes EAP-MD5
as an example to describe the basic service flow.
Figure 9-7 Service flow of 802.1X authentication system
The authentication process is as follows:
1. When the user needs to access the network, the user starts the
802.1x client program, enters the registered user name and password,
and initiates the connection request (EAPOL-Start packet). The
client program thus sends the authentication request packet to the
device side and starts an authentication.
2. After receiving the authentication request data frame, the device side
sends a request frame (EAP-Request/Identity packet) to ask the
client program to send the user name that was entered.
3. The client program answers the request of the device side and sends
the user name to the device side in a data frame (EAP-Response/Identity
packet). The device side encapsulates the data frame sent by the
client in a RADIUS Access-Request packet and sends it to the
authentication server for processing.
4. After receiving the user name forwarded by the device side, the
RADIUS server looks it up in the user name table in the database,
finds the corresponding password, encrypts the password with a
randomly generated encryption word, and sends the encryption word to
the device side in a RADIUS Access-Challenge packet. The device side
forwards it to the client program.
5. After receiving the encryption word (EAP-Request/MD5 Challenge
packet) sent by the device side, the client program uses the
encryption word to encrypt the password (the encryption algorithm is
irreversible), generates the EAP-Response/MD5 Challenge packet, and
sends it to the authentication server via the device side.
6. The RADIUS server compares the received encrypted password
(RADIUS Access-Request packet) with the password that it encrypted
locally. If they are the same, the server regards the user as a legal
user and returns the message of passing the authentication (RADIUS
Access-Accept packet and EAP-Success packet).
7. After receiving the message of passing the authentication, the device
changes the port to the authorized state and permits the user to
access the network via the port.
8. The client can also send the EAPOL-Logoff packet to the device side to
log out actively. The device side then changes the port from
the authorized state to the unauthorized state and sends the EAP-Failure
packet to the client.
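The challenge/response calculation in steps 4 to 6 can be sketched in Python as follows. The manual does not give the exact formula; this sketch assumes the calculation defined for CHAP in RFC 1994, which EAP-MD5 reuses: an MD5 hash over the Identifier byte, the password, and the challenge (the "encryption word" in the steps above). The function names and the sample password are hypothetical.

```python
import hashlib
import hmac
import os


def md5_challenge_response(identifier: int, password: bytes, challenge: bytes) -> bytes:
    """Client side: hash the Identifier, the password, and the challenge."""
    return hashlib.md5(bytes([identifier]) + password + challenge).digest()


def verify_response(identifier: int, stored_password: bytes,
                    challenge: bytes, response: bytes) -> bool:
    """Server side: repeat the calculation locally and compare the results."""
    expected = md5_challenge_response(identifier, stored_password, challenge)
    return hmac.compare_digest(expected, response)


challenge = os.urandom(16)   # random challenge generated by the server
response = md5_challenge_response(5, b"user-password", challenge)
print(verify_response(5, b"user-password", challenge, response))
```

Because only the hash travels over the network and the challenge changes for every authentication, the password itself is never sent and a captured response cannot simply be replayed.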
Technologies Cooperating with 802.1X
Auto Vlan:
Auto Vlan in the port-based access control mode is valid only on the
ACCESS port. Auto Vlan in the MAC-based access control mode is valid
only on the HYBRID port. In other access control modes, Auto Vlan is
invalid.
Auto Vlan is also called Assigned Vlan. When the 802.1x user passes the
authentication on the server, the server delivers the authorized VLAN
information to the device side. If the delivered VLAN is invalid (the
VLAN ID is wrong or the VLAN does not exist), the authentication fails.
Otherwise, the authentication port is added to the delivered VLAN. After
the user logs out, the port returns to the unauthorized state and is
removed from the Auto Vlan, and the default VLAN of the port reverts to
the previously configured VLAN.
The delivered Auto Vlan does not change or affect the port
configuration, but it takes priority over the Vlan configured by the
user (the Config Vlan): after the user passes the authentication, the
effective Vlan is the delivered Auto Vlan, and the Config Vlan takes
effect again after the user logs out.
The three associated Radius attributes:
– [64] Tunnel-Type = Vlan
– [65] Tunnel-Medium-Type = 802
– [81] Tunnel-Private-Group-ID = Vlan name or Vlan Id
Guest Vlan:
Guest Vlan in the port-based access control mode takes effect only on the
ACCESS port. Guest Vlan in the MAC-based access control mode takes
effect only on the HYBRID port. It does not take effect in other access
control modes.
The Guest Vlan function permits un-authenticated users to access some
specified resources. Before passing the 802.1X authentication, the
authentication port of the user belongs to one default VLAN (the Guest
Vlan). The user can access the resources in the Guest Vlan without
authentication, but cannot access other network resources. After the user
passes the authentication, the port leaves the Guest Vlan and the user can
access other network resources.
A user in the Guest Vlan can get the 802.1X client software, upgrade the
client, or run other upgrade programs (such as antivirus software updates
and operating system patches).
After enabling the 802.1X and configuring Guest Vlan, the port is added to
the Guest Vlan in untagged mode. Here, the users of the ports in the
Guest Vlan initiate authentication. If the authentication fails, the port is
still in Guest Vlan; if the authentication succeeds, there are two cases as
follows:
1. If the authentication server delivers one Vlan, the port leaves Guest
Vlan and is added to the delivered Vlan. After the user logs out, the
port returns to Guest Vlan.
2. If the authentication server does not deliver Vlan, the port leaves
Guest Vlan and is added to Config Vlan. After the user logs out, the
port returns to Guest Vlan.
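The interaction of Config Vlan, Guest Vlan and Auto Vlan described above can be summarized in a minimal Python sketch. The function name effective_vlan and its parameters are hypothetical and are not part of the switch software; the fallback to Config Vlan when no Guest Vlan is configured is an assumption based on the Auto Vlan description.

```python
from typing import Optional


def effective_vlan(authenticated: bool,
                   config_vlan: int,
                   guest_vlan: Optional[int] = None,
                   auto_vlan: Optional[int] = None) -> int:
    """Return the VLAN the port belongs to (illustrative model only)."""
    if authenticated:
        # A server-delivered Auto Vlan has priority over the Config Vlan.
        return auto_vlan if auto_vlan is not None else config_vlan
    # Before authentication, or after logout, the port stays in the
    # Guest Vlan if one is configured (assumption: else Config Vlan).
    return guest_vlan if guest_vlan is not None else config_vlan
```

For example, with Config Vlan 1 and Guest Vlan 10, the port is in Vlan 10 before authentication, in the delivered Vlan (say Vlan 5) after authentication, and back in Vlan 10 after logout.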
802.1X Expansion
User-based authentication:
The standard 802.1X protocol is port-based: as long as one user on the
port passes authentication, the other users on the port can use the
network resources without authentication, but after that user logs out,
the other users are also denied access to the network. Maipu switch
supports user-based authentication (based on MAC address). When the port
is configured for user-based authentication, each user on the port must
be authenticated separately, and only the users that pass authentication
can use the network resources. After one user logs out, only that user
loses network access; the other authenticated users can still use the
network.
EAP termination mode:
The standard 802.1X protocol defines that the client and the server
interact with each other via EAP packets. During the interaction, the
device plays the role of "EAP relay": it encapsulates the EAP data sent
from the authentication server in EAPOL packets and sends them to the
client. This interaction mode is called EAP relay. EAP relay requires
that the authentication server supports the EAP protocol; otherwise, the
authentication server cannot interact with the client by using EAP. In
actual deployments, a previously deployed authentication server may not
support the EAP protocol, so Maipu switch extends the standard and
supports the EAP termination mode. In this mode, the EAP data of the
client is not sent directly to the authentication server. Instead, the
device completes the EAP interaction with the client, gets the
authentication information of the user from the EAP data, and sends it to
the authentication server for authentication. In EAP termination mode,
only MD5-based EAP authentication is supported.
When adopting the EAP termination mode, the service interaction flow is
as follows:
Figure 9-8 The service flow of the EAP termination mode of the 802.1X
authentication system
Comparing Figure 9-8 with Figure 9-7, we can see that when the EAP
termination mode is adopted, the EAP protocol packet is not sent to the
authentication server, but terminates at the device side. The device gets
the required information from the EAP protocol packet and then sends it to
the authentication server for authentication.
EAP over UDP mode:
In the standard 802.1X function, the client and the authentication device
exchange information via EAPOL (EAP over LAN) packets. In actual
networks, because of the network complexity, the packets between the user
to be authenticated and the authentication device may need to traverse
intermediate switches. If the intermediate switches do not transmit
the EAPOL packets transparently, user authentication cannot be
performed normally. In such an environment, you can use the EAPOU mode so
that the authentication packets (EAP packets) can traverse the
intermediate switches. The EAPOU function encapsulates the original EAP
packet in a UDP packet for forwarding. Compared with the EAPOL mode, the
packet header changes from the original Ethernet header to Ethernet
header + IP header + UDP header, but the EAP contents are the same. The
EAPOU packet is not limited by the intermediate switches, so the EAPOU
mode can realize 802.1X authentication across switches.
Non-client user authentication:
In the actual network, besides many PC terminal users, there are some
network terminals (such as network printers) that do not carry, or cannot
be installed with, an 802.1X client program. Authentication for this kind
of user is called non-client user authentication, also known as MAC
address authentication. This authentication method does not require the
user to install any client software. When the device first detects the
MAC address of the user, it immediately starts authentication for that
user. The authentication process does not require the user to input a
user name and password. After passing the authentication, the user can
access the network. This method is suitable for terminals without client
software, and for PC terminal users who do not want to install the client
software or to input a user name and password.
When performing the MAC address authentication, you can select the user
name type. Usually, there are the following two modes:
– MAC address user name: The MAC address of the user is used as both the
user name and the password for authentication.
– Fixed user name: Regardless of the user MAC address, all users
authenticate with the local user name and password pre-configured on the
device.
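The two user name modes can be illustrated with a short Python sketch. The function mac_auth_credentials, the mode names "mac" and "fixed", and the idea that the MAC string is passed through unchanged are assumptions made for illustration; the actual credential format used by the device is not specified here.

```python
from typing import Tuple


def mac_auth_credentials(mac: str, mode: str,
                         fixed_user: str = "",
                         fixed_password: str = "") -> Tuple[str, str]:
    """Return the (user name, password) pair submitted for a MAC user."""
    if mode == "mac":
        # MAC address user name: the MAC is both user name and password.
        return mac, mac
    if mode == "fixed":
        # Fixed user name: every user shares the pre-configured pair.
        return fixed_user, fixed_password
    raise ValueError("unknown user name mode: " + mode)
```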
Dynamically deliver ACL:
In an 802.1X authentication environment that uses a Radius server, you
can configure the corresponding ACL name on the Radius server. When the
user passes authentication, the server delivers the ACL name to the
authentication device, which binds the user to the ACL so that the
subsequent traffic of the user is controlled by the ACL. The ACL must be
pre-configured on the device; at authentication time, the device only
looks up the ACL by name and binds it. If the lookup or the binding
fails, the user cannot go online.
Typical Application
802.1x Client Authentication
The Supplicant is connected to the network via 802.1X authentication. The
authentication server is the Radius server. The port 0/0/1 connected to
the Supplicant is in Vlan 1; the authentication server is in Vlan 2; Update
Server is the server used to download and upgrade the client software and
is in Vlan 10; the port 0/0/2 of the switch connected to Internet is in Vlan
5.
[Network diagram: Switch connected to Supplicant (Port 0/1, Vlan 1),
Internet (Port 0/2, Vlan 5), Radius Server (Port 0/3, Vlan 2) and Update
Server (Port 0/4, Vlan 10)]
Figure 9-9
Enable the 802.1X authentication function on Port 0/1; set the
authentication mode as the port-based authentication; set Vlan 10 as the
Guest Vlan of the port.
Port 0/1 is added to Guest Vlan. Here, Supplicant and Update Server are in
Vlan 10; Supplicant can access Update Server and download the 802.1X
client.
[Network diagram: Switch connected to Supplicant (Port 0/1, Vlan 1, with
an additional Vlan 10 label), Internet (Port 0/2, Vlan 5), Radius Server
(Port 0/3, Vlan 2) and Update Server (Port 0/4, Vlan 10)]
Figure 9-10
When the user goes online after passing the authentication, the
authentication server delivers Vlan5. Here, Supplicant and Port 0/2 are in
Vlan 5; Supplicant can access Internet.
[Network diagram: Switch connected to Supplicant (Port 0/1, Vlan 1, with
an additional Vlan 5 label), Internet (Port 0/2, Vlan 5), Radius Server
(Port 0/3, Vlan 2) and Update Server (Port 0/4, Vlan 10)]
Figure 9-11
Non-client MAC Address Authentication
As shown in the following figure, one user (Client) is connected to Port 0/1
of the device. The device administrator wants to perform MAC address
authentication for users accessing the port, so as to control access to
Internet. After the device detects the MAC address of Client,
0001.7a11.2233, it starts the corresponding authentication. If the
authentication is passed, Client can access Internet. Otherwise, Client
cannot access Internet.
Figure 9-12
DHCP Snooping and Its Application
This section describes the DHCP Snooping theory and how to realize it, as
well as its application.
Main contents:
Related terms
Introduction
Typical application
Related Terms
Trust Port: DHCP Snooping divides the ports into trust ports and un-trust
ports and restricts the DHCP packets on the un-trust ports, so as to
realize the security policy.
Option 82: Option 82 is one DHCP option. The option is used to record the
location information of the DHCP client. The administrator can locate the
DHCP client according to the option, so as to perform some security
control.
Dynamic binding table: A table built by snooping the interaction of the
DHCP packets; it contains the binding relation of the IP address and MAC
address and the related information.
Introduction
DHCP Snooping is one security feature of DHCP. It ensures that the client
gets the IP address from the legal server, preventing spoofing attacks.
It also records the correspondence between the IP address and the MAC
address of the DHCP client for the administrator to view and for other
security modules to use.
Record Corresponding Relation of IP Address and MAC Address
For security reasons, the network administrator may need to record the IP
addresses used by the users to access Internet and to confirm the
correspondence between the IP address that the user gets from the DHCP
server and the MAC address of the user.
DHCP Snooping records the MAC address of the DHCP client and the IP
address it gets by snooping the DHCP-REQUEST and DHCP-ACK broadcast
packets received by the trust ports. The administrator can use the show
dhcp-snooping command to view the information about the IP address
got by the DHCP client.
Ensure that Client Gets IP Address from Legal Server
If there is a privately deployed DHCP server in the network, the user may
get a wrong IP address. To make the user get the IP address from the legal
DHCP server, DHCP Snooping allows each port to be set as a trust port or
an un-trust port.
The trust port is the port directly or indirectly connected to the legal DHCP
server. The trust port forwards the received DHCP packets normally, so as
to ensure that the DHCP client gets the correct IP address.
The un-trust port is the port not connected to the legal DHCP server. If
DHCP-ACK or DHCP-OFFER packets returned by a DHCP server are received on
the un-trust port, the device discards them, so as to prevent the DHCP
client from getting a wrong IP address.
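The trust and un-trust port behavior can be sketched as a simple forwarding decision. The following Python fragment is only an illustration; the name snoop_forward and the string message types are hypothetical, and only the two server reply types named above are filtered.

```python
# Server replies that must not arrive on an un-trust port.
SERVER_REPLIES = {"DHCP-OFFER", "DHCP-ACK"}


def snoop_forward(msg_type: str, port_trusted: bool) -> bool:
    """Return True if the DHCP packet is forwarded, False if discarded."""
    if not port_trusted and msg_type in SERVER_REPLIES:
        # A server reply on an un-trust port indicates an illegal server.
        return False
    return True
```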
Support Option 82
Option 82 is one DHCP option. The option is used to record the location
information of the DHCP client. The administrator can locate the DHCP
client according to the option, so as to perform some security control, such
as restricting the number of IP addresses distributed to one port or VLAN.
Option 82 can contain at most 255 sub-options. SM4100 series switch
supports only two sub-options: sub-option 1 (Circuit ID) and sub-option
2 (Remote ID).
SM4100 series switch supports two filling formats: the default format and
the user-configured format.
The contents of the two sub-options in the default format are as follows:
Figure 9-2-1 option82 default format
The contents of the two sub options of the user-configured format are as
follows:
Figure 9-2-2 Sub option 1 of option82 user-configured format
Figure 9-2-3 Sub option 2 of option82 user-configured format
DHCP Snooping supports Option 82 as follows:
1. After receiving a DHCP request packet, the device processes it
according to whether the packet contains Option 82, the processing
policy configured by the user, and the filling format, and then
forwards the processed packet to the DHCP server.
Received DHCP        Processing  Filling          The Processing of DHCP Snooping
Request Packet       Policy      Format           for Packets
-------------------  ----------  ---------------  ----------------------------------
The received packet  Drop                         Discard the packet
carries Option 82.   Keep                         Keep the Option 82 in the packet
                                                  and forward it
                     Replace     Default          Adopt the default format to fill
                                                  in Option 82; replace the original
                                                  Option 82 in the packet and
                                                  forward it
                     Replace     User-configured  Adopt the user-configured format
                                                  to fill in Option 82; replace the
                                                  original Option 82 in the packet
                                                  and forward it
The received packet              Default          Adopt the default format to fill
does not carry                                    in Option 82 and forward it
Option 82.                       User-configured  Adopt the user-configured format
                                                  to fill in Option 82 and forward
                                                  it
Figure 9-2-4 DHCP Snooping processing of packets
2. When the device receives a response packet from the DHCP server, if
the packet contains Option 82, the device deletes Option 82 and
forwards the packet to the DHCP client; if the packet does not contain
Option 82, the device forwards the packet to the DHCP client directly.
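The Option 82 handling in steps 1 and 2 can be sketched as follows. This is a minimal Python model, not the device implementation: a packet is represented as a dictionary from option code to value, the policy names "drop", "keep" and "replace" mirror the table, and fill stands for the Option 82 content already built in the default or user-configured format. All function and parameter names are hypothetical.

```python
from typing import Dict, Optional

OPTION_82 = 82


def process_request(options: Dict[int, bytes], policy: str,
                    fill: bytes) -> Optional[Dict[int, bytes]]:
    """Return the options to forward to the server, or None if dropped."""
    if OPTION_82 in options:
        if policy == "drop":
            return None                      # discard the packet
        if policy == "keep":
            return dict(options)             # forward unchanged
        if policy == "replace":
            return {**options, OPTION_82: fill}
        raise ValueError("unknown policy: " + policy)
    # No Option 82 in the request: fill it in regardless of the policy.
    return {**options, OPTION_82: fill}


def process_reply(options: Dict[int, bytes]) -> Dict[int, bytes]:
    """Strip Option 82 (if present) before forwarding to the client."""
    return {code: val for code, val in options.items() if code != OPTION_82}
```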
Packet Rate Limitation
After the DHCP Snooping function is enabled on the device, all DHCP
packets are sent to the CPU. If a user uses a tool to fabricate many DHCP
packets and initiates a DHCP Flooding attack, the device may run under
high load or even break down. To avoid this, you can set a threshold for
the DHCP packets received every second on the port.
The device measures the number of the DHCP packets received by the
port each second. If the number of the packets received in one second
exceeds the set threshold, the excess packets are dropped directly by the
CPU. If the number of the received DHCP packets exceeds the threshold for
20 successive seconds, the port is shut down directly; whether it
recovers automatically depends on the port management configuration. You
can also recover the port manually.
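The rate limitation logic can be modeled with a small per-port counter. The Python sketch below assumes that the count is evaluated once per second and that the port is shut down at the end of the 20th successive second over the threshold; the class name PortRateLimiter and its members are hypothetical.

```python
class PortRateLimiter:
    """Illustrative model of the per-port DHCP packet rate limitation."""

    def __init__(self, threshold: int, shutdown_after: int = 20):
        self.threshold = threshold            # packets allowed per second
        self.shutdown_after = shutdown_after  # successive seconds allowed over
        self.over_seconds = 0
        self.shut_down = False

    def end_of_second(self, received: int) -> int:
        """Account for one second; return how many packets were accepted."""
        if self.shut_down:
            return 0
        if received > self.threshold:
            self.over_seconds += 1
            if self.over_seconds >= self.shutdown_after:
                self.shut_down = True         # recovery is handled elsewhere
            return self.threshold             # the excess is dropped
        self.over_seconds = 0                 # the streak is broken
        return received
```

The same model applies to any per-second threshold with a shutdown after a run of successive violations; only the packet type counted changes.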
Typical Application
The typical application of the DHCP Snooping function in the network is
shown on Switch A in the following figure. The port connected to the
client network is set as the un-trust port and the port connected to the
relay or server is set as the trust port. This ensures that the client
gets the IP address via the trust port (that is, from the legal server).
Figure 9-2-5 DHCP networking
IP Source Guard and Its Application
This section describes the IP Source Guard theory and how to realize it.
Main contents:
Related terms
Introduction
Typical application
Related Terms
IP Source Guard: Filter IP packets via IP or IP+MAC.
Introduction
With the IP Source Guard binding function, you can filter the packets
forwarded by the port, so as to prevent packets with an invalid IP address
or MAC address from passing the port and to improve the port security.
After receiving a packet, the port searches the IP Source Guard binding
entries and processes the packet according to the filter mode specified
on the port:
When the filter mode of the port is IP: If the source IP address of the
packet is the same as the IP address recorded in the binding entries,
the port forwards the packet. Otherwise, the port drops the packet.
When the filter mode of the port is IP+MAC: If the source MAC
address and source IP address of the packet are the same as the MAC
address and IP address recorded in the binding entries, the port
forwards the packet. Otherwise, the port drops the packet.
The IP Source Guard binding entries have two sources. One is the static
binding entries configured manually by IP Source Guard; the other is the
entries maintained by DHCP Snooping.
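The two filter modes can be expressed as a lookup against the binding entries of the port. The Python sketch below is illustrative only: the function name permit, the mode strings "ip" and "ip+mac", and the representation of a binding as an (IP, MAC) tuple are assumptions, and the bindings are taken to be the combined static and DHCP Snooping entries for the port.

```python
from typing import Iterable, Tuple


def permit(src_ip: str, src_mac: str, mode: str,
           bindings: Iterable[Tuple[str, str]]) -> bool:
    """Return True if the packet is forwarded, False if it is dropped."""
    entries = list(bindings)
    if mode == "ip":
        # IP mode: only the source IP address must match a binding entry.
        return any(ip == src_ip for ip, _mac in entries)
    if mode == "ip+mac":
        # IP+MAC mode: both addresses must match the same binding entry.
        return (src_ip, src_mac) in entries
    raise ValueError("unknown filter mode: " + mode)
```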
Key Points for Realization
1. When the IP Source Guard function is enabled, the device polls the IP
Source Guard static binding table and the DHCP Snooping dynamic binding
table to get the corresponding port entries and writes them into the
hardware entries.
2. When the IP Source Guard function is disabled, the device polls the IP
Source Guard static binding table and the DHCP Snooping dynamic
binding table and deletes the corresponding port entries from the
hardware entries.
3. When an IP Source Guard static entry is added, the hardware entries
are updated automatically; when the entry is deleted, the hardware
entry is deleted as well. If
setting the hardware entries fails, the Writed-Flag of the entry in the
static table is set to non-write.
4. When a DHCP Snooping dynamic entry is added, the hardware entries are
updated automatically; when the entry is deleted, the hardware entry
is deleted as well. If setting the hardware entries fails, the
Writed-Flag of the entry is set to non-write.
5. The software table (IP Source Guard static entries and DHCP Snooping
dynamic entries) and the hardware table are synchronized every minute.
Because of the ACL resource limitation, it is possible that not all
software entries can be written into the hardware entries, so the
availability of resources needs to be checked regularly. If there are
available resources, for example because some entries were deleted or
the ACL resources were enlarged, the legal entries in the software
table are written into the hardware entries. The default ACL resources
are two slices, that is, 256. Enabling the function on one port
occupies two of them, and the rest are used to set the filter entries.
6. When the IP Source Guard function is enabled on the port, the
configured binding table is written into the switch chip hardware, so as
to realize the filtering of the IP packets. The number of entries
written into the switch chip hardware depends on the amount of
resources that the switch chip hardware allocates to IP Source Guard.
If the switch chip hardware resources allocated to IP Source Guard are
used up and you need to add binding entries or enable the IP Source
Guard binding function on another port, you need to increase the
switch chip hardware resources or delete some binding entries. You
can continue to allocate the resources after restarting the device. If
you only delete some entries after the switch chip hardware resources
are used up, you still cannot enable the IP Source Guard function on
another port. This is because enabling the function on a port requires
resources to be pre-allocated for the port, but when the switch chip
hardware resources are insufficient, the binding entries occupy the
pre-allocated resources in order to maximize resource utilization.
Similarly, after the IP Source Guard function is disabled on a port,
the pre-allocated resources of the port are released, but the binding
entries may still not be written into those resources.
Typical Application
Application in non-DHCP Snooping environment
Figure 9-3-1 IP Source Guard configuration instance 1
The switch can be applied in a LAN and be connected to Internet. Configure
IP Source Guard on the port of the switch connected to the LAN and bind
the IP addresses and MAC addresses of the LAN users in the static binding
table. Only the bound addresses can be connected to Internet via the
switch. An IP packet sent from an un-bound address is regarded as an
illegal packet and is filtered.
Dynamic ARP Detection and Application
This section describes Dynamic ARP Inspection theory and how to realize it.
Main contents:
Related terms
Introduction
Typical application
Related Terms
Dynamic ARP Inspection: It is one security measure for discovering and
preventing ARP spoofing attacks by checking the validity of the ARP
packets.
Introduction
The dynamic ARP detection function can be used to discover and prevent
ARP spoofing attacks.
The dynamic ARP detection function re-directs all ARP packets (broadcast
ARP and unicast ARP) of the port on which the function is enabled to the
CPU for judging, comparing, software forwarding, log recording and so on,
so when there are many ARP packets, CPU resources are consumed.
Therefore, it is not recommended to enable the function in the normal
state. When you suspect that there is an ARP spoofing attack in the
network, you can enable the function to confirm and locate it.
The device does not check any ARP packets from a port on which the
dynamic arp inspection function is not enabled, but forwards the packets
directly. Usually, the port on which the dynamic arp inspection is not
enabled is the upstream port of the device. The device checks the ARP
packets from a port on which the dynamic arp inspection function is
enabled against the DHCP Snooping table or the IP static binding
table configured manually by IP Source Guard.
When global arp-security is enabled, it controls whether the device
processes the ARP packets of the IP/MAC specified by the global IP/MAC of
ACL. When the source IP of an ARP packet sent to the device matches the
IP specified by the global IP/MAC of ACL, but the source MAC does not
match, the ARP packet is dropped so that the device does not set up a
wrong ARP entry. The device sets up the entry only when the source
IP/MAC matches the global IP/MAC of ACL. When the source IP does
not match the IP specified by the global IP/MAC, the ARP entry can
also be set up.
ARP Detection Policy
1. When the binding relation of the source IP address and source MAC
address in the ARP packet matches with the DHCP Snooping entries
or the manual-configured IP static binding entries, and the ingress
port of the ARP packet and its VLAN are consistent with the DHCP
Snooping entries or the IP static binding entries manually
configured by IP Source Guard, the ARP packet is valid and is
forwarded.
2. When the binding relation of the source IP address and source MAC
address in the ARP packet does not match the DHCP Snooping
entries or the manually configured IP static binding entries, or the
ingress port of the ARP packet and its VLAN are inconsistent with
the DHCP Snooping entries or the IP static binding entries manually
configured by IP Source Guard, the ARP packet is invalid and is
dropped. In addition, the log information is printed.
3. The matching order: First match IP Source Guard static binding
table and then match DHCP snooping dynamic binding table.
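The detection policy can be sketched as a lookup of the packet's (IP, MAC, port, VLAN) tuple, first in the static binding table and then in the DHCP Snooping dynamic binding table. The Python below is a simplified illustration; the Binding type and the function arp_is_valid are hypothetical names, and a packet is treated as valid only when all four fields match one entry.

```python
from typing import Iterable, NamedTuple


class Binding(NamedTuple):
    ip: str
    mac: str
    port: str
    vlan: int


def arp_is_valid(ip: str, mac: str, port: str, vlan: int,
                 static_table: Iterable[Binding],
                 dhcp_table: Iterable[Binding]) -> bool:
    """Return True if the ARP packet is valid and may be forwarded."""
    packet = Binding(ip, mac, port, vlan)
    # Matching order: static binding table first, then the dynamic table.
    for table in (static_table, dhcp_table):
        if packet in list(table):
            return True
    return False  # invalid: the caller drops the packet and logs it
```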
Packet Forwarding Policy
After receiving an ARP packet, the device first judges whether the dynamic
arp inspection function is enabled on the port. If not, the ARP packet
continues to the protocol stack for processing and no software forwarding
is performed; if yes, the device checks the validity according to the
previous method. If the packet is invalid, it is dropped directly and
recorded in the log. If the packet is valid, it is processed according to
the destination address.
1. If the destination MAC address of the ARP packet is the local device,
forward the packet to the ARP protocol stack for processing and update
the ARP cache of the local device.
2. If the destination MAC address of the ARP packet is the broadcast
address, copy the packet, forward the original packet to the ARP
protocol stack for processing, update the ARP cache of the local
device, and forward the copied packet from all ports of the same
VLAN.
3. If the destination MAC address of the ARP packet is other unicast
address, first search the hardware MAC table to get the forwarding
port. If the forwarding port is found, forward the packet from the
port; if the forwarding port is not found, forward the packet from
all ports of the same VLAN.
Figure 9-4-1 Processing flow for valid ARP packet
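The three forwarding cases for a valid ARP packet can be sketched in Python as follows. The function forward_valid_arp and its parameters are hypothetical; MAC addresses are plain strings, mac_table maps a MAC address to a port, and excluding the ingress port when flooding is an assumption that the text does not state.

```python
from typing import Dict, List, Tuple

BROADCAST = "ff:ff:ff:ff:ff:ff"


def forward_valid_arp(dst_mac: str, local_mac: str, ingress_port: str,
                      mac_table: Dict[str, str],
                      vlan_ports: List[str]) -> Tuple[bool, List[str]]:
    """Return (deliver_to_protocol_stack, egress_ports)."""
    # Assumption: flooding skips the port the packet arrived on.
    flood = [p for p in vlan_ports if p != ingress_port]
    if dst_mac == local_mac:
        return True, []          # case 1: for the local device only
    if dst_mac == BROADCAST:
        return True, flood       # case 2: to the stack plus a flooded copy
    port = mac_table.get(dst_mac)
    if port is not None:
        return False, [port]     # case 3: known unicast destination
    return False, flood          # case 3: unknown unicast destination
```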
Packet Rate Limitation
After the dynamic ARP detection function is enabled on the device, all ARP
packets are trapped to the CPU. If a user uses a tool to fabricate many
ARP packets and initiates an ARP Flooding attack, the device may run
under high load or even break down. To avoid this, you can set a
threshold for the ARP packets received every second on the port. The
device measures the number of the ARP packets received by the port each
second. If the number exceeds the set threshold, the excess ARP packets
are dropped directly by the CPU. If the number of the received ARP
packets exceeds the threshold for 20 successive seconds, the port is shut
down directly; whether it recovers automatically depends on the port
management configuration. You can also recover the port manually.
Log Recording
Each invalid ARP packet is recorded in the log before it is dropped. Each
invalid ARP log entry includes the following contents:
1. Receiving VLAN
2. Receiving port
3. The IP address of the sender and the destination IP address
4. The MAC address of the sender and the destination MAC address
5. The number of the dropped packets
The log information is not output in real time, but output periodically. The
user can perform further processing according to the output log
information, such as locating the host that initiates the ARP attack.
Typical Application
Figure 9-4-2 Application instance of Dynamic ARP Inspection
The above figure shows the application in the DHCP environment. In a
non-DHCP environment, that is, when the DHCP Snooping function is not
enabled on switch A, you need to configure the IP Source Guard static
binding table. Otherwise, the ARP packets of all ports on which the
Dynamic ARP Inspection function is enabled are filtered. In the DHCP
environment, the Dynamic ARP Inspection function uses the dynamic binding
table generated by the DHCP Snooping function to filter the ARP packets:
it forwards the valid packets, drops the invalid packets and records them
in the log.
Port Security
This section describes the basic theory of the port security and its
application.
Main contents:
Introduction
Typical application
Introduction
The port security is applied at the access layer. It limits the hosts that
access the network via the device, permitting some specified hosts to
access the network while denying other hosts.
The port security function flexibly binds the MAC address, IP address,
VLAN ID and port of the user to prevent invalid users from being
connected to the network, so as to ensure the security of the network
data and to ensure that valid users get enough bandwidth.
The user can limit the hosts that can access the network via three kinds of
rules: MAC rule, IP rule and MAX rule. The MAC rule has three binding
modes: MAC binding, MAC+IP binding, and MAC+VID binding. The IP rule can
be for one IP address or a range of IP addresses. The MAX rule limits the
maximum number of MAC addresses that the port can learn (in order). This
maximum number does not include the valid MAC addresses generated by the
MAC rule and IP rule.
The MAC rule and IP rule can specify whether packets that match the rule
are permitted to communicate. With the MAC rule, you can flexibly bind a
MAC address with a VLAN or a MAC address with an IP address. The port
security is realized in software, so the number of rules is not limited
by the hardware resources, which makes the configuration more flexible.
The rules of the port security are triggered by the ARP packets of the
terminal device. When the device receives an ARP packet, the port
security extracts the packet information and matches it against the three
kinds of configured rules. The matching order is: first the MAC rule,
then the IP rule, and finally the MAX rule. The L2 forwarding table of
the port is controlled according to the matching result, so as to
control how the port forwards the packets.
When the port security regards the packet as the illegal packet, it
performs the corresponding process. Currently, there are three kinds of
processing modes, that is, protect, restrict, and shutdown. The protect
mode drops packets; the restrict mode drops packets and sends a trap alarm
(an alarm is raised within two minutes of receiving an illegal packet);
the shutdown mode performs the actions of the restrict mode and also
shuts down the port.
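The rule matching order and the three processing modes can be illustrated with the following Python sketch. It is a simplified model under stated assumptions: MAC and IP rules are dictionaries mapping an address to True (permit) or False (deny), the MAC+IP and MAC+VID binding variants are omitted, and all names (is_permitted, VIOLATION_ACTIONS, the action strings) are hypothetical.

```python
from typing import Dict, Set


def is_permitted(mac: str, ip: str,
                 mac_rules: Dict[str, bool], ip_rules: Dict[str, bool],
                 learned: Set[str], max_macs: int) -> bool:
    """Apply the MAC rule, then the IP rule, then the MAX rule."""
    if mac in mac_rules:               # 1. MAC rule
        return mac_rules[mac]
    if ip in ip_rules:                 # 2. IP rule
        return ip_rules[ip]
    if mac in learned:                 # 3. MAX rule: already learned
        return True
    if len(learned) < max_macs:        # 3. MAX rule: room to learn
        learned.add(mac)
        return True
    return False                       # illegal packet


# Actions taken for an illegal packet in each processing mode.
VIOLATION_ACTIONS = {
    "protect": ("drop",),
    "restrict": ("drop", "trap"),
    "shutdown": ("drop", "trap", "shutdown-port"),
}
```

Addresses permitted by a MAC rule or IP rule are not added to the learned set, which reflects the statement that the MAX rule count excludes them.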
Typical Application
Refer to the related chapter of the configuration manual.
Port Monitoring
This section describes the basic theory of the port monitoring and its
application.
Main contents:
Introduction
Typical application
Introduction
The port monitoring function monitors the packets sent to the switch CPU,
filters the excessive packets at the bottom layer and protects the switch
from being attacked by large numbers of invalid packets.
The monitoring includes the port monitoring and host monitoring. When
the switch is attacked, the user first enables the port monitoring. The
monitoring program counts the packets to the CPU by port. The user
identifies the attacked port from the statistics, then enables the host
monitoring on that port and sets the upper threshold of the packets to
the CPU in a sampling period. Packets from the attacking host that exceed
the threshold in the sampling period are filtered at the bottom layer;
they do not go to the IP layer for routing and are not written into the
hardware route table, which saves CPU resources and hardware table
resources. While the packets of the attacking host are being filtered,
the other hosts can still communicate normally. The monitoring program
writes each host whose packets to the CPU exceed the upper threshold in
the sampling period into the blacklist. In the next sampling period, for
the hosts in the blacklist, only half of the upper threshold of packets
can go to the CPU and the other packets to the CPU are dropped. The port
monitoring program performs the counting and dropping operations
according to the packet classification.
The port monitoring program calculates the sampling result at the end of
each sampling period and updates the blacklist information.
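The host monitoring behavior over one sampling period can be sketched as follows. This Python model makes two assumptions that the text does not state: the blacklist is rebuilt from scratch at the end of each period, and a blacklisted host is still compared against the full threshold when deciding whether it stays blacklisted. The name sample_period is hypothetical.

```python
from typing import Dict, Set, Tuple


def sample_period(counts: Dict[str, int], threshold: int,
                  blacklist: Set[str]) -> Tuple[Dict[str, int], Set[str]]:
    """Process one sampling period.

    counts maps each host to the packets it sent to the CPU.
    Returns (accepted packets per host, blacklist for the next period).
    """
    accepted = {}
    for host, sent in counts.items():
        # Blacklisted hosts get only half of the upper threshold.
        limit = threshold // 2 if host in blacklist else threshold
        accepted[host] = min(sent, limit)
    # Hosts that exceeded the threshold enter the next blacklist.
    new_blacklist = {h for h, sent in counts.items() if sent > threshold}
    return accepted, new_blacklist
```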
The port monitoring divides the packets into six types:
1. broadcast-packet: The destination MAC address is all 1s;
2. multicast-packet: The lowest bit of the first byte of the
destination MAC address is 1;
3. admin-packet: The destination IP address is the IP address of the
switch VLAN interface;
4. forward-packet: The destination IP address is not the IP address of
the switch VLAN interface. It is a packet that needs to be
forwarded after being routed;
5. other-packet: All packets other than the previous four kinds
of packets;
6. total-packet: All of the previous packets together.
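The classification can be sketched as a small Python function. It is an illustration only: the name classify is hypothetical, the destination MAC is given as 6 raw bytes, and treating a packet without a destination IP address (a non-IP packet) as other-packet is an assumption. The broadcast check comes first because the broadcast address also has the multicast bit set.

```python
from typing import Optional, Set


def classify(dst_mac: bytes, dst_ip: Optional[str],
             vlan_if_ips: Set[str]) -> str:
    """Return the packet type used by the port monitoring statistics."""
    if dst_mac == b"\xff" * 6:
        return "broadcast-packet"
    if dst_mac[0] & 0x01:
        # Lowest bit of the first byte set: multicast destination.
        return "multicast-packet"
    if dst_ip is None:
        return "other-packet"        # assumption: non-IP unicast traffic
    if dst_ip in vlan_if_ips:
        return "admin-packet"        # addressed to a switch VLAN interface
    return "forward-packet"          # must be routed and forwarded
```

Every classified packet also counts toward total-packet.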
Typical Application
Refer to the related chapters of the configuration manual.
Port Isolation
This section describes the basic theory of the port isolation and its
application.
Main contents:
Related terms
Introduction
Typical application
Related Terms
Port isolation: A port security function that prevents packets from being
forwarded between one port and other specified ports of the switch.
Introduction
Port isolation is a port-based security feature. The user can specify the
isolated ports of a port as desired to isolate L2 and L3 data between the
port and its isolated ports, which improves network security and gives
the user a flexible networking scheme.
By default, packets can be forwarded between any two ports in one VLAN
of the switch. To stop a port from communicating with certain other
ports in the VLAN, configure those ports as its isolated ports in the
port configuration mode. The port can then no longer send packets to the
specified isolated ports.
Port isolation is independent of the VLAN of the port. Currently, the
switch supports configuring isolated ports in the common port mode and
the aggregation port mode. An isolated port can be a common port or an
aggregation port. Port isolation drops packets in one direction only.
Suppose that ports B, C, and D are set as the isolated ports of port A.
A packet that enters from port A and whose destination port is B, C, or
D is dropped directly. However, a packet that enters from port B, C, or
D and whose destination port is A is forwarded normally.
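The uni-directional behavior can be illustrated with a small Python sketch. The table layout and the function name may_forward are assumptions made for illustration; they do not describe how the switch stores the configuration.

```python
from typing import Dict, Set

# isolation[ingress] = the isolated ports configured on that ingress port
isolation: Dict[str, Set[str]] = {"A": {"B", "C", "D"}}


def may_forward(ingress: str, egress: str, table: Dict[str, Set[str]]) -> bool:
    """Return True if a packet entering on `ingress` may leave on `egress`."""
    # Only the configuration of the ingress port is consulted, which is
    # why the isolation works in one direction only.
    return egress not in table.get(ingress, set())
```

Here may_forward("A", "B", isolation) is False, while may_forward("B", "A", isolation) is True. To isolate two ports in both directions, each port must list the other as an isolated port.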
Typical Application
Figure 9-6-1 Typical application of port isolation
Illustration
Three ports of switch A are connected to three terminal devices: port
0/1, port 0/2, and port 0/3 are connected to PC1, PC2, and PC3
respectively. Port 0/27 is connected to the public network. Port 0/1,
port 0/2, port 0/3, and port 0/27 belong to one VLAN.
The requirement is that PC1, PC2, and PC3 cannot communicate with each
other but can communicate with the public network normally. Normally,
the ports in one VLAN can communicate with each other, so use the port
isolation function to meet the requirement: isolate port 0/2 and port
0/3 on port 0/1; isolate port 0/1 and port 0/3 on port 0/2; isolate port
0/1 and port 0/2 on port 0/3. After the configuration, port 0/1, port
0/2, and port 0/3 cannot communicate with each other, but can still
communicate with port 0/27.
SPAN Technology
This chapter describes the port mirroring SPAN technology and application.
Main contents:
SPAN technology
Typical application
SPAN Technology
Switched Port Analyzer (SPAN) is used to monitor the data flow of a
switch port. SPAN copies the frames on a monitoring port (source port)
to a destination port on the switch. A network analysis device connected
to the destination port analyzes the received packets for network
monitoring and troubleshooting. SPAN does not affect the normal packet
switching of the switch; all frames received and sent by the source port
are copied to the destination port. However, if the mirrored traffic
exceeds the capacity of the destination port, for example, when one
100Mbps destination port monitors one 1000Mbps port, frames may be
dropped.
Related Terms of SPAN Technology
SPAN Session
A SPAN session is the data flow between one group of monitoring ports
and one destination port. The data of multiple monitoring ports can be
mirrored to the destination port. The mirrored data flow can be the
input data flow, the output data flow, or both. You can configure SPAN
on a port that is shut down; the SPAN session stays inactive until the
port is enabled, at which point it becomes active.
Each line card supports four rx SPAN sessions and one tx SPAN session.
Local SPAN
Local SPAN supports the port mirroring on one switch and all monitoring
ports and destination ports are on one switch. Local SPAN mirrors the data
of one or multiple monitoring ports to the destination port.
Remote SPAN
RSPAN allows the monitoring port and the destination port to be on
different switches, which enables remote monitoring across the network.
Each RSPAN session carries the monitored traffic on a specified RSPAN
VLAN. RSPAN consists of the RSPAN Source Session, the RSPAN VLAN, and
the RSPAN Destination Session. Configure the RSPAN Source Session and
the RSPAN Destination Session on different switches. For the RSPAN
Source Session, specify one or multiple monitoring ports and one RSPAN
VLAN; the monitored data is sent to the RSPAN VLAN. For the RSPAN
Destination Session on the other switch, specify the destination port
and the RSPAN VLAN; the session sends the RSPAN VLAN data to the
destination port.
The switches that take part in remote port mirroring fall into three
kinds:
1. Source switch: It is the switch of the monitored port, which
transmits the mirroring traffic to the intermediate switch or
destination switch via RSPAN VLAN.
2. Intermediate switch: It is the switch between the source switch and
destination switch in the network, which transmits the mirroring
traffic to the next intermediate switch or destination switch. If the
source switch is connected to the destination switch directly, there
is no intermediate switch.
3. Destination switch: It is the switch of the remote mirroring
destination port, which transmits the mirroring traffic received from
RSPAN VLAN to the monitoring device via the mirroring destination
port.
Traffic Types
There are three types of monitored traffic:
1. Receive (Rx): The traffic received by the monitoring port;
2. Transmit (Tx): The traffic sent by the monitoring port;
3. Both: The received and sent traffic of the monitoring port.
Monitoring port (source port)
The data of the monitoring port (source port) is monitored for network
analysis. The monitored data flow can be input, output or bi-directional
and can be in different VLANs.
The monitoring port has the following features:
It can be common port or aggregation port;
It cannot be destination port;
One source port can only belong to one SPAN session;
It may or may not be in the same VLAN as the destination port.
Destination port
The destination port can only be one separate physical port or one
aggregation group. One destination port can only be used in one SPAN
session.
The destination port has the following features:
The destination port is a common port or a link aggregation;
The destination port cannot be a monitoring port;
The destination port type of an RSPAN Destination Session should be
hybrid;
The destination port does not take part in the STP calculation. Local
SPAN includes the BPDUs of the monitored traffic, so any BPDU seen on
the destination port comes from the source port;
The destination port should not be connected to another switch, which
may result in a network loop;
The bandwidth of the destination port should be greater than or equal
to the bandwidth of the monitoring port. Otherwise, packets may be lost;
Do not enable LACP or 802.1X on the destination port, so that the
mirrored data is not affected;
The destination port of an RSPAN source session can only be a common
port, not an aggregation port;
The destination port can serve as a common forwarding port, but to
prevent the monitored data from being mixed with other data flows, it
is recommended to remove the destination port from all VLANs.
RSPAN VLAN
The RSPAN VLAN should be an idle VLAN dedicated to RSPAN, and its VLAN
number can be 2-4096. You can select any idle VLAN during
configuration, but you need to ensure that the other devices on all
paths to the analysis device are configured with the VLAN and that the
corresponding ports are added to the VLAN.
RSPAN VLAN has the following features:
To prevent the monitored data from being mixed with other data flows,
the RSPAN VLAN can only carry RSPAN traffic;
Do not add any port to the RSPAN VLAN except the ports that carry the
RSPAN traffic;
MAC address learning is disabled in the RSPAN VLAN;
RSPAN does not support monitoring L2 protocols unless the L2 protocol
function is disabled on the RSPAN destination session device.
Limitations
1. SPAN and flow mirroring use the same chip resource. When port
mirroring is enabled, avoid enabling flow mirroring. Otherwise, the
hardware resource may be exhausted.
2. In the MPLS environment, if MPLS learns the destination MAC
address of the packet, the mirrored MPLS packet carries the MPLS
header; if MPLS does not learn the destination MAC address of the
packet, the mirrored MPLS packet does not carry the MPLS header.
Typical Application
Local SPAN Application
The following is one simple local SPAN environment.
The application diagram of the local SPAN
Illustration
In the above figure, all packets of port 0/1 are mirrored to port 0/2. The
network analyzer connected to port 0/2 is not connected to port 0/1
directly, but port 0/2 can receive the packets of port 0/1 via the mirroring.
Remote SPAN Application
The application diagram of remote SPAN
Illustration
In the above figure, the packets mirrored from port 0/8 on the source
device switch 1 are transmitted via RSPAN VLAN 100 to destination port
0/1 of the destination device switch 2, so that the packets sent and
received by the source switch port can be monitored on the destination
switch.
IPv4 Unicast Routing
This chapter describes the principles of the mainstream routing protocols.
Main contents:
Introduction to the IPv4 unicast routing
Static routing protocol
M-VRF
Load balance
RIP dynamic routing protocol
OSPF dynamic routing protocol
IS-IS dynamic routing protocol
BGP dynamic routing protocol
Introduction to the IPv4 Unicast Routing
For packets to travel from one host to another in the network, the
transmission path of the packets must be known. This path is called a
route.
A network is composed of many forwarding devices (such as switches). To
forward packets from one host to another host, each forwarding device
should know the path to the destination host, that is, each forwarding
device should have a route to the destination.
Routes come from three sources: when the forwarding device is directly
connected to a network, a directly-connected route is generated; when
the network administrator adds routes manually, static routes are
generated; when the forwarding device runs a dynamic routing protocol,
dynamic routes are learned automatically.
There may be many paths for packets sent from one host to another.
Therefore, the best path should be selected to forward the packets.
The best path is determined from the following aspects:
Path length: the path length can be measured in hops or cost. In a
distance vector routing protocol, the path length is the number of
forwarding devices between the source host and the destination host.
In a link state routing protocol, the path length is the sum of the
costs of the links on the path.
Reliability: measured by the error rate between the source host and
the destination host. In most routing protocols, the reliability of a
link is designated by the network engineer.
Delay: the sum of the time spent traveling through all network
devices, links, and switching devices on the path. The delay also
depends on network congestion and on the distance between the source
end and the destination end. Because it takes so many variables into
account, delay is an important measure in calculating the best path.
Bandwidth: selecting the best path by bandwidth alone can be
misleading. A link with 1.544Mbps bandwidth is nominally better than a
link with 56Kbps bandwidth, but if the utilization of the 1.544Mbps
link is high, or the load of the receiving device at the far end is
heavy, it may not be the best path.
Load: a value assigned to a network resource according to its
utilization. The value is determined by the CPU utilization, the
packets passed per second, and the disassembly and reassembly of
packets. However, monitoring device resources is itself a heavy load.
Communication cost: in some cases, a public network link is charged by
usage or by monthly fee. For example, an ISDN link is charged by the
connection time and the amount of data transferred in the period. In
such cases, the communication cost is a very important factor in
determining the best path.
Static Routing Protocol
Main contents:
Introduction to the static route
Typical application of the static route
Troubleshooting of the static route
Introduction to the Static Route
A static route is defined by the user. With a static route, the packets
between the source and the destination follow the path specified by the
administrator.
To understand what information the routing table must hold, it is
useful to examine what happens when a frame reaches an interface of the
switch. The switch checks the data link identifier in the destination
address field of the frame. If the field contains the identifier of the
switch interface or the broadcast identifier, the switch strips the
header and trailer of the frame and passes the enclosed packet to the
network layer. The network layer checks the destination address of the
packet. If the destination address is the IP address of the switch, a
multicast address that the switch is listening to, the broadcast address
of the subnet or a directed broadcast address, or the global broadcast
address (255.255.255.255), the protocol field of the packet is checked
and the enclosed data is passed to the corresponding internal process.
Any other destination address requires routing. The destination may be a
host on a network directly connected to the switch or a host on a
network that is not directly connected. In both cases, the switch
determines the next-hop address and resolves its link layer address.
To route the packets, the switch searches the routing table for the
correct route. Each route in the table should contain the following two
items:
1. Destination address: The network address that the switch can reach.
The switch may have more than one route to the same address, and may
hold several subnets under the same major network address.
2. Destination pointer: The pointer specifies either that the network
is directly connected to the switch or the address of the next switch on
the path, namely, the next-hop switch.
The switch tries to match the most specific address. In descending order
of specificity, the matched address may be one of the following:
Host address (host route)
Subnet
A group of subnets (summary route)
Major network ID
A group of major network IDs (supernet)
Default address
If the destination address of a packet does not match any entry in the
routing table, the packet is discarded and an ICMP destination
unreachable message is sent to the source address.
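Matching the most specific address is commonly implemented as a longest prefix match. The following Python sketch shows the idea with the standard ipaddress module. The sample prefixes and labels are made up for illustration, and a real switch performs this lookup in a hardware table or a tree rather than a linear scan.

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[ipaddress.IPv4Network, str]  # (destination prefix, label)


def lookup(table: List[Route], dst: str) -> Optional[Route]:
    """Return the most specific route that contains dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [route for route in table if addr in route[0]]
    if not matches:
        # The switch would discard the packet and send an ICMP
        # destination unreachable message to the source.
        return None
    return max(matches, key=lambda route: route[0].prefixlen)


# Hypothetical table with one entry per level of specificity.
table: List[Route] = [
    (ipaddress.ip_network("0.0.0.0/0"), "default"),
    (ipaddress.ip_network("10.0.0.0/8"), "major network"),
    (ipaddress.ip_network("10.1.0.0/16"), "summary route"),
    (ipaddress.ip_network("10.1.1.0/24"), "subnet"),
    (ipaddress.ip_network("10.1.1.7/32"), "host route"),
]
```

For example, 10.1.1.7 matches all five entries and the host route wins, 10.1.2.1 falls back to the summary route, and an address outside 10.0.0.0/8 is forwarded by the default route if one exists.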
Typical Application of the Static Route
The following is a simple environment illustrating the static route.
Figure 11-1 Typical application of the static route
Illustration
Two Maipu switches (switch-a and switch-b) act as the forwarding
equipment and connect the two networks 10.1.1.0/24 and 10.1.3.0/24. The
default gateway of PC-1 is 10.1.3.1 and the default gateway of PC-2 is
10.1.1.2.
Configure a static route on each switch to interconnect 10.1.1.0/24 and
10.1.3.0/24. On switch-a, configure a static route with the destination
address 10.1.1.0/24 and the next hop 10.1.2.1. On switch-b, configure a
static route with the destination address 10.1.3.0/24 and the next hop
10.1.2.2. The two networks are then interconnected.
The data flow sent from PC-1 to PC-2 reaches the default gateway
switch-a. Switch-a finds that the destination address 10.1.1.1 of the
data flow is not a local address and searches the routing table. Because
the static route 10.1.1.0/24 exists, switch-a forwards the data flow to
the next hop 10.1.2.1 (namely switch-b). On switch-b, the destination
address of the data flow hits the directly connected route, and the data
flow is successfully transmitted to PC-2.
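The forwarding process of this example can be traced with a short Python sketch. The /24 mask of the 10.1.2.0 link between the two switches and the address 10.1.3.2 used for PC-1 in the usage note are assumptions, because the text does not state them; the names TABLES, OWNER, and trace are hypothetical.

```python
import ipaddress
from typing import List


def net(prefix: str) -> ipaddress.IPv4Network:
    return ipaddress.ip_network(prefix)


# (prefix, next hop); a next hop of None marks a directly connected route.
TABLES = {
    "switch-a": [(net("10.1.3.0/24"), None),
                 (net("10.1.2.0/24"), None),        # assumed link mask
                 (net("10.1.1.0/24"), "10.1.2.1")],  # static route
    "switch-b": [(net("10.1.1.0/24"), None),
                 (net("10.1.2.0/24"), None),        # assumed link mask
                 (net("10.1.3.0/24"), "10.1.2.2")],  # static route
}
# Which switch owns each next-hop address.
OWNER = {"10.1.2.1": "switch-b", "10.1.2.2": "switch-a"}


def trace(first_hop: str, dst: str, max_hops: int = 16) -> List[str]:
    """Follow a packet hop by hop, starting at the host's default gateway."""
    addr = ipaddress.ip_address(dst)
    path, node = [], first_hop
    for _ in range(max_hops):
        path.append(node)
        matches = [r for r in TABLES[node] if addr in r[0]]
        if not matches:
            return path + ["unreachable"]
        _prefix, next_hop = max(matches, key=lambda r: r[0].prefixlen)
        if next_hop is None:
            return path + ["delivered"]  # hit a directly connected route
        node = OWNER[next_hop]
    return path + ["hop limit exceeded"]
```

Calling trace("switch-a", "10.1.1.1") follows the flow from PC-1 to PC-2, and trace("switch-b", "10.1.3.2") follows the return direction. Without the static route on either switch, the lookup on that switch finds no match and the trace ends with "unreachable".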
Troubleshooting of the Static Route
Load Balancing of the Switching Device
On switching devices that support hardware routing (such as L3
switches), after a static route is configured, a small number of packets
are forwarded through software so that the next hop can be resolved. For
example:
S 128.255.0.0/16 [1/10] via 1.1.1.2, 00:40:10, vlan1
When the static route takes effect, the ARP table entry for 1.1.1.2 may
not exist yet. When real data flow needs to be forwarded through the
route, the ARP entry for 1.1.1.2 is resolved: the data is sent to the
CPU for software forwarding, which triggers ARP resolution. After the
ARP entry is resolved successfully, the data is switched in hardware and
is no longer sent to the CPU.
When the static route is a load balancing route, the data may be sent to
the CPU continuously because the software and the hardware select
different next hops.
S 128.255.0.0/16 [1/10] via 1.1.1.2, 00:40:10, vlan1
via 1.1.1.3, 00:40:10, vlan1
The load balancing route is written into the hardware, and ARP is not
yet resolved for next hops 1.1.1.2 and 1.1.1.3. A data flow with the
destination address 128.255.1.1 hits the route. For the load balancing
route, the hardware selects the next hop in per-flow load balancing
mode, for example, 1.1.1.3. Because ARP is not resolved for 1.1.1.3, the
packets are sent to the CPU for software forwarding.
After the packets reach the CPU, if the software also selects the next
hop in per-flow load balancing mode, it may select 1.1.1.2 because the
software and hardware algorithms differ. As a result, ARP is resolved
for 1.1.1.2 but not for 1.1.1.3.
The hardware keeps selecting 1.1.1.3 as the next hop while the software
keeps selecting 1.1.1.2. Consequently, the data flow is sent to the CPU
continuously and hardware forwarding never takes over.
Therefore, on hardware routing switching devices, when static route load
balancing is used, we recommend setting the software load balancing mode
to per-packet mode, so that the software resolves ARP for every next
hop.
Use the ip load-sharing per-packet command to set the software load
balancing mode to per-packet mode.
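The effect can be reproduced with a small Python simulation. The two pick functions are stand-ins that simply return fixed next hops to model the case where the hardware and software per-flow algorithms disagree; they are assumptions and not the real hash functions of the device.

```python
NEXT_HOPS = ["1.1.1.2", "1.1.1.3"]
FLOW = ("10.0.0.1", "128.255.1.1")  # hypothetical source, destination from the text


def hw_pick(flow) -> str:
    """Stand-in for the hardware per-flow choice: always 1.1.1.3 for this flow."""
    return NEXT_HOPS[1]


def sw_pick_per_flow(flow) -> str:
    """Stand-in for a different software per-flow choice: always 1.1.1.2."""
    return NEXT_HOPS[0]


def simulate(packets: int, per_packet: bool) -> int:
    """Return how many packets of the flow are sent to the CPU."""
    arp_resolved = set()
    punted = 0
    round_robin = 0
    for _ in range(packets):
        if hw_pick(FLOW) in arp_resolved:
            continue  # forwarded in hardware
        punted += 1  # ARP missing for the hardware choice: send to CPU
        if per_packet:
            next_hop = NEXT_HOPS[round_robin % len(NEXT_HOPS)]
            round_robin += 1
        else:
            next_hop = sw_pick_per_flow(FLOW)
        arp_resolved.add(next_hop)  # software forwarding resolves ARP
    return punted
```

With per-flow software balancing, simulate(100, per_packet=False) sends all 100 packets to the CPU, because 1.1.1.3 is never resolved. With per-packet software balancing, simulate(100, per_packet=True) sends only the first 2 packets to the CPU; after that both next hops are resolved and the hardware takes over.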
M-VRF
Main contents:
Terms
Introduction to M-VRF
Terms of M-VRF
VPN: Virtual Private Network. Through VPN technology, two or more
network sites can be connected through the Internet. Within the VPN, the
sites operate as if they were all in a single private network.
M-VRF: Multi-VPN Routing and Forwarding. In the switch, each VPN has
its own routing and forwarding table. The customers and sites of a VPN
can only access the routes in that table.
Introduction to M-VRF
M-VRF supports VPNs. Multiple VRFs may exist in one switch. Each
resource (interface, IP address, routing table) belongs to one VRF, and
the resources in different VRFs cannot access each other. Through the
Multi-VRF function, users can isolate networks, and overlapping address
spaces are supported.
M-VRF does not modify the packet format. It enhances security only by
partitioning resources among VRFs. Each resource in the system belongs
to exactly one VRF. After an interface is configured with a VRF, the
packets sent or received through the interface can only access the
resources of that VRF.
Take packet forwarding as an example. When an interface receives a
packet, the switch takes the VRF attribute of the interface. To decide
whether the packet is destined for the switch itself, it checks not only
whether the destination address is a local address, but also whether the
VRF attribute of the interface that owns the address is the same as the
VRF attribute of the receiving interface. To forward the packet, the
switch looks up the routing table that corresponds to the VRF
attribute.
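The per-VRF lookup can be sketched in Python as follows. The class VrfSwitch, its methods, and the VRF and interface names in the usage note are hypothetical; the sketch only shows that the receiving interface selects the routing table, which is what allows overlapping address spaces.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple


class VrfSwitch:
    """Toy model: every interface and every route belongs to exactly one VRF."""

    def __init__(self) -> None:
        self.if_vrf: Dict[str, str] = {}
        self.tables: Dict[str, List[Tuple[ipaddress.IPv4Network, str]]] = {}

    def bind(self, ifname: str, vrf: str) -> None:
        self.if_vrf[ifname] = vrf
        self.tables.setdefault(vrf, [])

    def add_route(self, vrf: str, prefix: str, out_if: str) -> None:
        # A route may only point at an interface of its own VRF.
        if self.if_vrf.get(out_if) != vrf:
            raise ValueError("outgoing interface belongs to another VRF")
        self.tables[vrf].append((ipaddress.ip_network(prefix), out_if))

    def forward(self, in_if: str, dst: str) -> Optional[str]:
        """Look up dst only in the table of the receiving interface's VRF."""
        vrf = self.if_vrf[in_if]
        addr = ipaddress.ip_address(dst)
        matches = [r for r in self.tables[vrf] if addr in r[0]]
        if not matches:
            return None
        return max(matches, key=lambda r: r[0].prefixlen)[1]
```

If the interfaces vlan10 and vlan11 are bound to a VRF named red and vlan20 and vlan21 to a VRF named blue, both VRFs can hold a route to 10.0.0.0/24: a packet received on vlan11 is forwarded to vlan10, while a packet to the same address received on vlan21 is forwarded to vlan20.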
Load Balancing
Main contents:
Types of load balancing
Modes of load balancing
Switching types and load balancing
Types of Load Balancing
Equal-cost load balancing: distributes the traffic evenly across the
links (1:1).
Unequal-cost load balancing: distributes the traffic according to the
cost ratio (1:n).
Modes of Load Balancing
Per-packet load balancing: the first packet takes one link and the
second packet takes another link. The packets are distributed across
the links in turn, regardless of whether their destination addresses are
the same.
Per-session (or per-destination) load balancing: packets to the same
host use the same link.
Both modes have their own features.
1. Switching per packet: a good option when the parallel links are
slower than 64K. Packets may arrive out of order, so it is unsuitable
for applications that depend on the arrival order of packets, such
as voice traffic.
2. Switching per session: when the link used by one session carries
heavy traffic while the other links are lightly loaded, the load of
the links may be unbalanced.
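The difference between the two modes can be shown with a minimal Python sketch. The link names are made up, and the CRC32 hash of the destination address is only an arbitrary stand-in for whatever per-session selection the device actually uses.

```python
import zlib
from itertools import cycle
from typing import List, Tuple

LINKS = ["link-1", "link-2"]
Packet = Tuple[str, str]  # (source address, destination address)


def per_packet(packets: List[Packet]) -> List[str]:
    """Assign links in turn, ignoring the addresses of the packets."""
    links = cycle(LINKS)
    return [next(links) for _ in packets]


def per_destination(packets: List[Packet]) -> List[str]:
    """Assign the same link to all packets that go to the same host."""
    return [LINKS[zlib.crc32(dst.encode()) % len(LINKS)]
            for _src, dst in packets]
```

For four packets to the same destination, per_packet returns link-1, link-2, link-1, link-2, so the load is even but the packets may arrive out of order. per_destination returns the same link four times, so the order is kept but one heavy session can overload one link while the other stays idle.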
Switching Types and Load Balancing
Different switching types use different load balancing modes. Generally
there are the following two types:
Process switching: balances the load based on the arrival sequence of
the packets. The per-packet balancing mode is adopted.
Fast switching: balances the load based on the source/destination
address of the packets. The per-session balancing mode is adopted.
Note
The content described in this chapter applies only to software
forwarding. Packets forwarded by the switching chip are not restricted
by the description in this chapter.
RIP Dynamic Routing Protocol
Main contents:
Terms of RIP protocol
Introduction to the RIP protocol
Terms of RIP Protocol
UDP: User Datagram Protocol. It is a simple, datagram-oriented,
unreliable transport layer protocol of the IP network.
D-V algorithm: distance vector algorithm. It is a route calculation
method for computer networks. It is also called the Bellman-Ford
algorithm.
IGP: Interior Gateway Protocol.
Request packets: The packets that request RIP routing information from
other routing devices.
Response packets: The packets that advertise the local routing
information to the RIP of the adjacent routing devices.
Split horizon: A measure adopted by the RIP protocol to prevent routing
loops.
Poisoned reverse: A measure adopted by the RIP protocol to prevent
routing loops. It is more active than split horizon.
Triggered updates: A measure of the RIP protocol for speeding up
convergence. When a route changes, an update is triggered and the
changed routes are advertised.
Regular updates: The RIP protocol sends updates of all routing
information at an interval of 30 seconds by default.
Introduction to the RIP Protocol
Routing Information Protocol (RIP) is an interior gateway routing
protocol based on the distance vector algorithm. It is used for dynamic
IPv4 routing. The RIP protocol has become one of the standards for
exchanging routing information between routing devices and hosts.
The RIP protocol includes RIPv1 and RIPv2. RIPv1 does not support
classless routes but RIPv2 does. Usually RIPv2 is used.
The RIP protocol is simple and so is its configuration. The amount of
routing information advertised by the RIP protocol is directly
proportional to the number of routes in the routing table, so a large
number of routes consumes a lot of network resources. In addition, the
RIP protocol limits the maximum hop count to 15. Therefore, the RIP
protocol is only applicable to simple small-to-medium networks.
The RIP protocol is applicable to most campus networks and simple
regional networks. It is not used in more complex environments.
RIP in the TCP/IP Protocol
Figure 11-2 RIP in the TCP/IP protocol stack
As shown in the preceding figure, the RIP protocol is based on the UDP
protocol. The protocol packets sent by the RIP protocol are encapsulated
in UDP packets. The RIP protocol listens on port 520 for the protocol
packets sent by remote routing devices and updates the local routing
table according to the routing information in the received packets. It
then adds one to the metric of each learned route and advertises the
route to the other adjacent routing devices. In this way, all routing
devices in the routing domain can learn all routes.
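The update step can be sketched in Python as follows. Adding one to the received metric comes from the description above; the rules for when an existing route is replaced follow common distance vector practice and are an assumption here, as are the function name and the table layout.

```python
from typing import Dict, Iterable, List, Tuple

INFINITY = 16  # metric 16 means unreachable; the largest usable hop count is 15

Table = Dict[str, Tuple[int, str]]  # prefix -> (metric, next hop)


def process_response(table: Table, sender: str,
                     entries: Iterable[Tuple[str, int]]) -> List[str]:
    """Apply one received response and return the prefixes that changed."""
    changed: List[str] = []
    for prefix, metric in entries:
        new_metric = min(metric + 1, INFINITY)  # one more hop via the sender
        current = table.get(prefix)
        if current is None:
            # New destination: install it unless it is unreachable.
            if new_metric < INFINITY:
                table[prefix] = (new_metric, sender)
                changed.append(prefix)
        elif current[1] == sender:
            # News from the current next hop always replaces the old metric.
            if current[0] != new_metric:
                table[prefix] = (new_metric, sender)
                changed.append(prefix)
        elif new_metric < current[0]:
            # Another neighbor offers a strictly shorter path.
            table[prefix] = (new_metric, sender)
            changed.append(prefix)
    return changed
```

The returned list is what a triggered update would advertise to the other adjacent routing devices. A route received with metric 15 becomes metric 16 locally, which is why a destination more than 15 hops away cannot be reached through RIP.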
The RIP protocol sends packets in the following three modes: broadcast,
multicast, and unicast. The usage of each mode is shown in the following
table.
Table 1-1 Modes of sending packets
Mode | Address | Version | Port | Purpose
Broadcast | 255.255.255.255 | RIPv1 | 520 | RIPv1 sends protocol packets to all adjacent routing devices.
Multicast | 224.0.0.9 | RIPv2 | 520 | RIPv2 sends protocol packets to all adjacent routing devices.
Unicast | Unicast IP address | RIPv1/2 | 520 | The response packets responding to request packets; protocol packets sent to Neighbor.
RIP Packet Types and Structure
RIP Packet Types
There are two types of packets: Request packets and Response packets.
The RIP packet types and the functions are as follows.
Table 1-2 RIP packet types
Packet Type | Function | Sending Status
Request packets | Request the routing information from the RIP of the adjacent routing device. You can request specified routing information or all routing information (in that case there is only one route entry, with address family identifier 0 and metric 16). | When RIP starts running on an interface, request all routing information from the RIP of the adjacent routing devices.
Response packets | Advertise the routing information to the RIP of the adjacent routing device. | A) Respond to request packets. B) When a route changes, an update of the routing information is triggered. C) Advertise all routing information to the RIP of the adjacent devices periodically (regular updates).
RIP Packet Structure
Figure 11-3 RIP packets structure
As shown in the preceding figure, the RIP packets are encapsulated in the
UDP packets. In the IP header of the RIP packets, TTL is set to 1 to
prevent RIP packets from being forwarded by other routing devices.
The RIP header has two fields: Command field identifies the request
packets (value is 1) or response packets (value is 2); Version field
identifies the RIPv1 (value is 1) or RIPv2 (value is 2).
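The header fields can be packed and parsed with the Python struct module, as in the following sketch. The two-byte zero field after the version and the 20-byte entry layout are taken from the RIP specification (RFC 2453) rather than from this manual, and the function names are hypothetical.

```python
import struct

# Header: command (1 byte), version (1 byte), must-be-zero (2 bytes).
HEADER = struct.Struct("!BBH")
# RIPv2 entry: address family identifier, route tag, IP address,
# subnet mask, next hop, metric (layout per RFC 2453).
ENTRY = struct.Struct("!HHIIII")

COMMANDS = {1: "request", 2: "response"}


def build_full_table_request(version: int = 2) -> bytes:
    """Build a request for all routes: one entry with AFI 0 and metric 16."""
    return HEADER.pack(1, version, 0) + ENTRY.pack(0, 0, 0, 0, 0, 16)


def parse_header(data: bytes):
    """Return (packet type, RIP version) from the first four bytes."""
    command, version, _zero = HEADER.unpack_from(data)
    return COMMANDS.get(command, "unknown"), version
```

The request built here is 24 bytes long (a 4-byte header plus one 20-byte entry), and parse_header applied to it returns ("request", 2). In a real packet this payload would be carried in a UDP datagram to port 520.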
RIP Entry includes three types: RIPv1 routing entry, RIPv2 routing entry,
and authentication information entry. RIP Entry types and description are
as follows.
Table 1-3 RIP entry types and description
RIP information entry | Version | Format | Description
RIPv1 routing entry | RIPv1 | The format is shown in the following figure. | In RIPv1, advertise the routing information to the RIP of the adjacent routing device.
RIPv2 routing entry | RIPv2 | The format is shown in the following figure. | In RIPv2, advertise the routing information to the RIP of the adjacent routing device.
Authentication information entry (plain text) | RIPv2 | The format is shown in the following figure. | Add plain text authentication information to the packet in the RIPv2 protocol. The information follows the RIP packet header.
Authentication information entry (MD5) | RIPv2 | The format is shown in the following figure. | Add MD5 authentication information to the packet in the RIPv2 protocol. The information follows the RIP packet header. The corresponding authentication content is required at the end of the packet.
Figure 11-4 Format of the RIP routing information entry
Figure 11-5 Packet format of the RIPv2 authentication information
Working Principle of RIP
Figure 11-6 Working flow of the RIP protocol
The working flow of the RIP protocol is shown in the preceding figure. It
can be divided into two parts: one is the RIP protocol starting flow, and
the other is the processing flow of RIP receiving packets.
Starting the Protocol
When an interface starts to run the RIP protocol, request packets are
sent out of the interface in broadcast (RIPv1) or multicast (RIPv2)
mode to request all routing information from all adjacent routing
devices. This enables fast convergence.
After the response packets to the request are received, the routes in
the route database are updated according to the routing information
contained in the packets. Then, the changed routes are advertised to the
RIP of the other adjacent routing devices (triggered updates).
At the same time, the Updates timer is started. Every 30 seconds by
default, all routing information is advertised through response packets
to the RIP of the adjacent routing devices. This keeps the databases of
the RIP routing devices synchronized and refreshes the advertised
routes, so that previously advertised routes do not time out or become
invalid on other routing devices.
Route Database
The route database records all routing information of the RIP protocol.
Each routing entry is composed of the following elements:
1. Destination address: the destination host or subnet of the route.
2. Metric: the metric of reaching the destination.
3. Next hop interface: the interface through which packets to the
destination are forwarded, namely, the interface on which the route
was learned.
4. Next hop IP address: the interface IP address of the adjacent
routing device through which the destination is reached. Generally,
it is the source IP address of the response packet from which the
route was learned.
5. Source IP address: the source IP address of the response packet
from which the route was learned.
6. Route tag: defined by the user to mark a category of routes. For
example, a tag can mark that a route was obtained by redistributing
BGP routes.
Source of the Routing Entries in the Route Database
In the RIP route database, the sources of the routing entries are as follows:
1. Directly connected routes of the interfaces covered by RIP.
2. Routes redistributed into RIP from other protocols.
3. Routes generated by protocol configuration commands, for
example, the command for generating and advertising the default route
0.0.0.0 (default-information originate).
4. Routes learned from the RIP of adjacent routing devices.
Retrieval of Next-Hop Route
In RIPv1, the next-hop interface of a route is the interface on which the
route was learned. The next-hop IP address is the source IP address of the
response packet from which the route was learned.
In RIPv2, the routing information in a response packet can carry a
next-hop IP address. The next-hop interface of a route is still the
interface on which the route was learned, but the next-hop IP address can
be either the source IP address of the response packet or the next-hop IP
address carried in the routing information. If the next-hop IP address in
the routing information is in the same subnet as the interface that
received the routing information, the next-hop IP address of the route is
the one carried in the routing information. Otherwise, the next-hop IP
address of the route is the source IP address of the response packet. The
purpose is to implement route redirection.
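The RIPv2 next-hop rule can be sketched in a few lines of Python. This is only an illustration, not the switch implementation: the function name, its parameters, and the /24 prefix length in the usage note are assumptions. Treating a next-hop field of 0.0.0.0 as "use the sender" follows the RIPv2 specification (RFC 2453).

```python
import ipaddress

def select_next_hop(rx_interface, source_ip, advertised_next_hop=None):
    """Return the next-hop address to install for a route learned via RIPv2.

    rx_interface        -- address/prefix of the receiving interface, e.g. "10.1.1.1/24"
    source_ip           -- source IP address of the response packet
    advertised_next_hop -- next-hop field carried in the route entry, if any
    """
    iface = ipaddress.ip_interface(rx_interface)
    source = ipaddress.ip_address(source_ip)
    if advertised_next_hop is None:
        return source
    candidate = ipaddress.ip_address(advertised_next_hop)
    # 0.0.0.0 in the next-hop field means "route via the sender".
    if candidate.is_unspecified:
        return source
    # Use the advertised next hop only if it is directly reachable,
    # that is, in the same subnet as the receiving interface.
    if candidate in iface.network:
        return candidate
    return source
```

For example, with a hypothetical receiving interface 10.1.1.1/24, `select_next_hop("10.1.1.1/24", "10.1.1.2", "10.1.1.3")` returns 10.1.1.3, while an advertised next hop outside 10.1.1.0/24 falls back to the sender 10.1.1.2.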
The following example illustrates the application of the next-hop address
of the routing entry in RIPv2.
Figure 11-7 RIP route redirection
As shown in the preceding figure, switch-A runs RIP, switch-B runs RIP
and OSPF, and switch-C runs OSPF. On switch-B, RIP redistributes the
learned OSPF route 11.0.0.0/8, so switch-A can learn a route to the
subnet 11.0.0.0/8. When switch-A learns the route, by default the next
hop is switch-B, namely 10.1.1.2. The packets that switch-A forwards to
the destination subnet 11.0.0.0/8 then reach switch-C through switch-B,
which adds an extra hop.
To solve the problem, when switch-B advertises route 11.0.0.0/8 to
switch-A, it specifies switch-C, namely 10.1.1.3, as the next hop of the
route. When switch-A learns the route, it sets the next hop of route
11.0.0.0/8 to switch-C, namely 10.1.1.3. The packets that switch-A
forwards to the destination subnet 11.0.0.0/8 are then sent directly to
switch-C and do not pass through switch-B.
Route Updates
When a route is learned from an adjacent RIP routing device, it is used
to update the route database in the following cases:
1. The route does not exist in the route database and the metric of the
route is less than 16 hops.
2. The route exists in the database, and the source IP address of the
existing entry is the same as that of the learned route.
3. The route exists in the database, but the metric of the existing entry
is equal to or greater than the metric of the learned route.
To keep the hop count accurate, the metric is increased by 1 when a route
in the route database is advertised. The maximum valid metric is 15. When
the metric is greater than 15, the route is considered to be unreachable.
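The three update cases and the metric increment can be written as a short Python sketch. The data layout (a dictionary mapping a destination to a `(metric, source_ip)` pair) and the function names are assumptions made for illustration only.

```python
INFINITY = 16  # a metric of 16 means "unreachable"

def should_update(database, destination, learned_metric, learned_source):
    """Decide whether a learned route replaces the entry in the route database.

    database maps destination -> (metric, source_ip); this layout is hypothetical.
    """
    existing = database.get(destination)
    if existing is None:
        # Case 1: new route, accepted only if it is reachable.
        return learned_metric < INFINITY
    existing_metric, existing_source = existing
    if existing_source == learned_source:
        # Case 2: the same source always refreshes its own route.
        return True
    # Case 3: another source is accepted if the existing metric is
    # equal to or greater than the learned metric.
    return existing_metric >= learned_metric

def advertised_metric(metric):
    """Metric placed in a response packet: one more hop, capped at 16."""
    return min(metric + 1, INFINITY)
```

With a database holding `{"11.0.0.0/8": (3, "10.1.1.2")}`, an update for 11.0.0.0/8 with metric 4 from a different source is rejected, whereas any update from 10.1.1.2 is accepted.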
RIP Timer
[Figure: state diagram of a RIP route entry. States: Valid (invalid timer running on the next hops of routes); Invalid + Holddown (holddown timer and flush timer running on routes); Invalid (flush timer running on routes); Flush (route deleted from the database). Transition labels: "Invalid timer timeout or metric updated to 16 (unreachable)", "Holddown timer timeout", "Flush timer timeout", "Route update".]
Figure 11-8 Status change of RIP route entry
The RIP protocol uses four timers: the update timer, invalid timer,
holddown timer, and flush timer. Each timer is described as follows.
Table 1-4 RIP protocol timers
Timer | Operation Object | Default Value | Startup Condition | Function
Update Timer | Route database | 30 seconds | Started when RIP is started, and restarted repeatedly. | Periodically advertises all routing information to the adjacent RIP routing devices through response packets. 1. Ensures database synchronization between RIP routing devices. 2. Refreshes the previously advertised routes, so that they do not time out on other routing devices.
Invalid Timer | Next hop of route entry | 180 seconds | Started when a route entry is learned. | A route entry becomes invalid if it is not updated within a certain time. The change of status is shown in the preceding figure. The timer is refreshed by response packets. When the route entry becomes invalid, the timer is shut down.
Holddown Timer | Route entry | 180 seconds | Started when the route entry becomes invalid. | For a certain time after the route entry becomes invalid, the route entry cannot be updated by response packets, which prevents routing loops. The change of status is shown in the preceding figure. When the route entry leaves the holddown status, the timer is shut down.
Flush Timer | Route entry | 240 seconds | Started when the route entry becomes invalid. | A route entry is deleted from the database after it has been invalid for a certain time. The change of status is shown in the preceding figure. When the route entry is deleted, the timer is shut down.
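As a rough illustration of how the default timer values in Table 1-4 interact, the following Python sketch derives the status of a route entry from the time elapsed since its last update. It assumes that no further response packet arrives and that the route is not explicitly advertised with metric 16; the function and state names are hypothetical and simply follow the states in Figure 11-8.

```python
INVALID_AFTER = 180  # invalid timer, seconds since the last update
HOLDDOWN = 180       # holddown timer, started when the entry becomes invalid
FLUSH = 240          # flush timer, started when the entry becomes invalid

def route_state(seconds_since_update):
    """Status of a route entry that has received no update for the given time."""
    if seconds_since_update < INVALID_AFTER:
        return "valid"
    since_invalid = seconds_since_update - INVALID_AFTER
    if since_invalid >= FLUSH:
        return "flushed"            # deleted from the route database
    if since_invalid < HOLDDOWN:
        return "invalid+holddown"   # updates are refused during holddown
    return "invalid"                # holddown is over, waiting for flush
```

Under these assumptions an entry is valid for the first 180 seconds, held down until 360 seconds, and deleted 420 seconds after its last update.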
Prevention of RIP Routing Loops
RIP is a dynamic routing protocol based on the distance vector
algorithm. A routing device does not know the topology of the entire
network. When the network changes, the routes of the entire network
take some time to converge, so the route databases of the routing
devices are out of sync for a while. Because no device knows the
entire topology, routing loops may form during this time. The RIP
protocol uses the following mechanisms to reduce the possibility of
routing loops caused by such inconsistency in the network:
Counting to Infinity
The RIP protocol allows a maximum of 15 hops. A destination more than
15 hops away is considered to be unreachable. This limit restricts the
network size and prevents routing information from circulating
indefinitely. As routing information travels from one routing device to
another, the number of hops increases by 1 at each transfer. When the
number of hops exceeds 15, the route is deleted from the routing table.
Split Horizon
Split horizon prohibits a router from advertising a route back out of the
interface from which it was learned. If a route learned from an interface
were advertised back out of the same interface, a routing loop could
occur.
The rule of RIP split horizon is as follows: if the RIP routing device
learns routing information A from an interface, the response packets sent
out of that interface must not contain routing information A.
There is one special case for split horizon: when an interface receives a
route request packet, split horizon is not applied to the response to
that request.
Poisoned Reverse
Poisoned reverse has the same purpose as split horizon, but operates
differently.
The rule of RIP poisoned reverse is as follows: if the RIP routing device
learns routing information A from an interface, the response packets
sent out of that interface still contain routing information A, but with
the metric set to 16 (namely, unreachable).
Compared with split horizon, poisoned reverse has the following
advantage: because the route is advertised back to the source routing
device as unreachable, an existing routing loop can be broken
immediately, whereas with split horizon the loop persists until the
wrong route entry times out and is deleted.
The disadvantage is that poisoned reverse increases the size of the
response packets, and therefore the bandwidth consumed by the
protocol.
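The difference between the two rules can be seen in a small Python sketch that builds the list of entries for a response packet. The route tuple layout, the mode names, and the interface names are assumptions for illustration; the per-hop metric increment is left out to keep the example short.

```python
INFINITY = 16  # metric meaning "unreachable"

def build_response(routes, out_interface, mode="split-horizon"):
    """Return the (destination, metric) entries to send out of out_interface.

    routes is a list of (destination, metric, learned_on_interface) tuples.
    mode is "split-horizon", "poisoned-reverse", or "none".
    """
    entries = []
    for destination, metric, learned_on in routes:
        if learned_on == out_interface:
            if mode == "split-horizon":
                continue  # never advertise a route back where it was learned
            if mode == "poisoned-reverse":
                entries.append((destination, INFINITY))  # advertise as unreachable
                continue
        entries.append((destination, metric))
    return entries
```

For routes `[("11.0.0.0/8", 2, "vlan1"), ("12.0.0.0/8", 3, "vlan2")]` sent out of vlan1, split horizon omits 11.0.0.0/8, while poisoned reverse includes it with metric 16, which is why the poisoned reverse packet is larger.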
Holddown Timer
The purpose of the holddown timer is to prevent a route entry from being
updated by response packets for a certain time after it becomes
unreachable. With the holddown timer, until the other routing devices
have received the message that the route is unreachable, the unreachable
route is not updated by received response packets, because the route
information in those packets may be information that the device itself
advertised earlier.
Triggered updates
When a route changes, the change is advertised to the adjacent RIP
routing devices through response packets without waiting for the next
periodic update.
Poisoned reverse and split horizon break routing loops formed between
any two routing devices. A routing loop formed by three or more routing
devices may still occur and lasts until the route metric counts up to
infinity (16). Triggered updates speed up route convergence and thus
shorten the time needed to break the routing loop.
RIPv1 and RIPv2
RIPv2 is an extension of RIPv1 and represents the direction of
development of the protocol. RIPv2 overcomes some disadvantages of
RIPv1. The main mechanism of RIPv2 is the same as that of RIPv1, which
it improves and extends. The differences between the two protocols are
as follows:
Table 1-5 Difference between RIPv1 and RIPv2
Attribute | RIPv1 | RIPv2
Route mask | Cannot advertise the route mask. The mask is derived from the address class, and classless routes are not supported. | Can advertise the route mask; classless routes are supported.
Packet sending | Sends in broadcast (255.255.255.255) mode; consumes lots of network resources. | Sends in multicast (224.0.0.9) mode; consumes fewer network resources.
Authentication | Does not support authentication. | The authentication information field is added; plain text and MD5 authentication are supported.
Route tag | Does not support advertisement and learning of the route tag. | Supports advertisement and learning of the route tag.
Next hop advertisement | Does not support advertisement of the next hop. | Supports advertisement of the next hop to implement route redirection.
IRMP Dynamic Routing Protocol
Main contents:
Related terms
Introduction to IRMP protocol
Related Terms of IRMP Protocol
downstream router: for a subnet, the router that is nearer to the
destination subnet;
successor: the next router on the path from the current router to the
destination router;
reported distance: the distance to the destination reported by a neighbor
to the current router;
feasible successor: a router that is nearer to the destination router
than the current router.
Introduction to IRMP Protocol
The technology used by IRMP (Internal Routing Message Protocol,
compatible with EIGRP), DUAL (Diffusing Update Algorithm), is similar to
a distance vector protocol:
The router uses only the information provided by its directly connected
neighbors to make routing decisions. The received information can first
be filtered for security or traffic-planning reasons.
The router provides only the routes that it uses itself to its directly
connected neighbors. The information sent to a neighbor can also be
filtered before it is sent.
However, there are some differences between IRMP and distance vector
protocols, which make IRMP better than a traditional distance vector
protocol:
1. IRMP saves all the routes sent by all neighbors in the topology table,
not just the best route received so far (the topology table is the data
structure in which IRMP saves all the route information received from
its neighbors);
2. When IRMP cannot reach a destination and has no alternative route, it
can query its neighbors.
IRMP Types
Opcode | Type
1 | Update
3 | Query
4 | Reply
5 | Hello
6 | IPX SAP (not supported for the moment)
Different TLV Defined in IRMP
No. | TLV Type
Common TLV types:
0x0001 | IRMP Parameters
0x0003 | Sequence
0x0004 | Software Version 12
0x0005 | Next Multicast Sequence
The TLV types of IP:
0x0102 | IP Internal Routes
0x0103 | IP External Routes
Other types are not supported for the moment.
IRMP Unicast and Multicast Sending (Multicast Address 224.0.0.10)
Type/Reliability | Unreliable | Reliable
Unicast | ACK | Reply
Multicast | Hello | Update, Query
IRMP uses unicast in the following cases:
When transmitting packets on a transmission medium that does not support
hardware multicast (such as X.25 and frame relay);
When retransmitting a packet to a neighbor that does not reply within
the multicast timeout interval.
IRMP Packet Format (Take One IP Packet with IRMP Data as an Example)
IP header:
Version (4) | Header length (5) | Service type (00) | Total length (0045)
ID (05f7) | 00 00
Life time (02) | Protocol (58) (IRMP) | Header checksum (c75d)
Source IP address (0a010102)
Destination IP address (E000000a)
IRMP header:
IRMP version (02) | Operation code (01) | Checksum (e655)
Flag (00000000)
Sequence (00000003)
Response (00000000) (non-zero when the packet is an ACK packet)
AS number (00000001)
TLV:
TLV type (0102) | Length (001d)
Next hop (00000000)
Delay (0001f400)
Bandwidth (00000100)
MTU (008000) | Hop count (00)
Reliability (ff) | Load (01) | Reserved (0000)
Prefix length (20) | Destination
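As an illustration of the layout, the following Python sketch unpacks the IRMP header fields of the sample packet above. The field widths are inferred from the number of hex digits shown in the sample (1-byte version and operation code, 2-byte checksum, 4-byte flag, sequence, response, and AS number), and the function name is hypothetical; it is not taken from the switch software.

```python
import struct

# Operation codes from the "IRMP Types" table.
OPCODES = {1: "Update", 3: "Query", 4: "Reply", 5: "Hello"}

def parse_irmp_header(data):
    """Unpack the 20-byte IRMP header (network byte order) into a dict."""
    version, opcode, checksum, flags, sequence, response, as_number = struct.unpack(
        "!BBHIIII", data[:20]
    )
    return {
        "version": version,
        "opcode": OPCODES.get(opcode, "unknown"),
        "checksum": checksum,
        "flags": flags,
        "sequence": sequence,
        "response": response,  # non-zero only in an ACK packet
        "as_number": as_number,
    }

# Header bytes of the sample packet: version 02, opcode 01, checksum e655, ...
sample = bytes.fromhex("0201e655" "00000000" "00000003" "00000000" "00000001")
header = parse_irmp_header(sample)
```

For the sample bytes, the sketch reports an Update packet with sequence number 3 in AS 1.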
By default, hello packets are sent at an interval of 5s and the hold
timer is 15s (for NBMA interfaces with a bandwidth lower than T1, the two
values are 60s and 180s respectively).
OSPF Dynamic Routing Protocol
Main contents:
Terms of OSPF Protocol
Introduction to the OSPF protocol
OSPF features
Terms of OSPF Protocol
AS (Autonomous System): a group of routing devices that exchange
information through the same routing protocol.
Area: a collection of routing devices that share the same topology
database. OSPF divides one AS into multiple areas; the topology of one
area is invisible to other areas, which reduces the amount of routing
information in the AS. Areas are used to contain link state updates and
enable the administrator to create a hierarchical network.
Area ID: the 32-bit ID of an area in the AS.
IGP (Interior Gateway Protocol): a routing protocol running on the
routing devices within an AS. Each AS has an independent IGP; different
ASs may run different IGPs. OSPF is one kind of IGP.
Router ID: a 32-bit number assigned to each routing device running OSPF,
which identifies the routing device in the AS.
Point-to-point network: a network composed of a pair of routing
devices, such as a 56 kbps serial connection.
Broadcast network: a network that supports multiple (more than two)
routing devices, in which a routing device can exchange information with
all the routing devices on the network (broadcast). Neighbor routing
devices are detected dynamically through OSPF hello packets. If the
network has multicast capability, OSPF also uses multicast. Every pair of
routing devices on the network is assumed to be directly connected.
Ethernet is an example of a broadcast network.
and simple and stable protocol. Therefore, the IS-IS protocol is applicable
to large-scale core backbone network.
In this chapter, the IS-IS protocol for IPv4 and IPv6 is described. OSI
routing is not widely used, so it is not described in this document.
IS-IS Protocol Stack Structure and the Position in the Network Protocol Stack
Figure 11-26 Structure of the IS-IS protocol stack
As shown in the preceding figure, the IS-IS protocol can be divided into
a basic part and an application part. The basic part maintains the
topology of the entire network and uses the SPF algorithm to calculate
the shortest path to each IS in the network. After the shortest path to
each IS is obtained, routes are generated according to the reachable
subnets (IPv4, IPv6, or OSI, such as 10.0.0.0/8) advertised by each IS.
For example, the path to the subnet 10.0.0.0/8 is the shortest path to
the IS that publishes the subnet.
Figure 11-27 Position of IS-IS protocol in the network protocol stack
As shown in the preceding figure, the IS-IS protocol runs directly over
the link layer and is independent of the network layer of the IPv4, IPv6,
and OSI protocol stacks. On a broadcast network, packets are sent in
multicast mode. On Ethernet, IS-IS uses the following MAC addresses.
Table 1-6 Multicast address used by IS-IS
Address Name | Multicast MAC Address | Description
AllL1ISs | 01-80-C2-00-00-14 | The multicast MAC address of layer 1 IS-IS packets
AllL2ISs | 01-80-C2-00-00-15 | The multicast MAC address of layer 2 IS-IS packets
AllIntermediateSystems | 09-00-2B-00-00-05 | The multicast MAC address of all IS systems
AllEndSystems | 09-00-2B-00-00-04 | The multicast MAC address of all ES systems
IS-IS Packet Structure
Figure 11-28 IS-IS packet structure
As shown in the preceding figure, the IS-IS protocol sits directly on the
link layer in the network protocol stack, so IS-IS packets are
encapsulated directly in link layer packets. The routing information
carried in an IS-IS packet is organized in TLV format, which can be
organized and extended flexibly. A TLV consists of a data type (1 byte),
a data length (1 byte), and a data value (0-255 bytes). In addition,
according to the IS-IS protocol, a TLV that cannot be identified should
be ignored, rather than causing the packet to be dropped.
Because IS-IS is based on the link layer and independent of the network
layer, routing information is organized flexibly in TLV format, and
unidentified TLVs can be ignored, the protocol is easy to extend and can
be upgraded smoothly.
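The TLV layout and the rule of ignoring unidentified TLVs can be sketched as follows. This is a minimal Python illustration with a hypothetical function name; the type numbers in the sample data are arbitrary values chosen for the example.

```python
def parse_tlvs(data, known_types):
    """Walk a TLV area: 1-byte type, 1-byte length, then `length` value bytes.

    Returns (known, skipped): the (type, value) pairs that were understood
    and the list of type codes that were ignored.
    """
    known, skipped = [], []
    offset = 0
    while offset + 2 <= len(data):
        tlv_type = data[offset]
        length = data[offset + 1]
        value = data[offset + 2 : offset + 2 + length]
        if len(value) < length:
            raise ValueError("truncated TLV")
        if tlv_type in known_types:
            known.append((tlv_type, value))
        else:
            # An unidentified TLV is skipped; the rest of the packet is still used.
            skipped.append(tlv_type)
        offset += 2 + length
    if offset != len(data):
        raise ValueError("trailing bytes after the last TLV")
    return known, skipped

# Three TLVs: type 1 (3 bytes), type 250 (2 bytes, unknown here), type 129 (1 byte).
sample = bytes([1, 3, 0x47, 0x00, 0x00, 250, 2, 0xAA, 0xBB, 129, 1, 0xCC])
known, skipped = parse_tlvs(sample, known_types={1, 129})
```

Because the length byte tells the parser how far to jump, a device running older software can step over TLVs introduced later, which is what makes the smooth upgrade possible.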
The IS-IS protocol packets are listed in the following table.
Table 1-7 IS-IS protocol packets
Category | IS-IS PDU Packet Type | Type | Function
IIH | Level 1 LAN IS to IS Hello PDU | 15 | Discover and keep alive layer 1 neighbors on the broadcast network
IIH | Level 2 LAN IS to IS Hello PDU | 16 | Discover and keep alive layer 2 neighbors on the broadcast network
IIH | Point-to-Point IS to IS Hello PDU | 17 | Discover and keep alive layer 1 and layer 2 neighbors on the point-to-point network
LSP | Level 1 Link State PDU | 18 | Publish routing information in the layer 1 area
LSP | Level 2 Link State PDU | 20 | Publish routing information in the layer 2 area
CSNP | Level 1 Complete Sequence Numbers PDU | 24 | Advertise the abbreviated description of the database to the layer 1 neighbors
CSNP | Level 2 Complete Sequence Numbers PDU | 25 | Advertise the abbreviated description of the database to the layer 2 neighbors
PSNP | Level 1 Partial Sequence Numbers PDU | 26 | Request or confirm LSP packets from layer 1 neighbors
PSNP | Level 2 Partial Sequence Numbers PDU | 27 | Request or confirm LSP packets from layer 2 neighbors
NET of IS
Figure 11-29 IS-IS NET
When the IS-IS protocol is used for routing the TCP/IP protocol, it is
still an ISO CLNP protocol. The OSPF protocol uses the router ID to
identify a routing device, whereas the IS-IS protocol uses an ISO network
address to identify a routing device (IS). This ISO network address is
the NET (Network Entity Title). The structure of the NET is shown in the
preceding figure. The example in the figure is NET
47.0000.0000.0000.0011.00.
Area ID identifies a level-1 area. The level-2 area is the backbone of
the network, and only one level-2 area is allowed, so it does not require
an ID.
System ID identifies an IS in an area. It must be unique in the IS-IS
AS.
SEL (NSAP Selector, also N-SEL) is similar to the protocol ID in IP.
Different transport protocols correspond to different SELs. In IS-IS,
the SEL is always 00.
Note that this description of the NET applies to the use of IS-IS for
routing the TCP/IP protocol. NET is defined in ISO 8348.
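The following Python sketch splits a NET into its three parts. It assumes the commonly used layout of a 1-byte SEL at the end, preceded by a 6-byte system ID, with everything before that being the area ID; the figure itself is not reproduced here, so these field widths and the function name are assumptions.

```python
def parse_net(net):
    """Split a dotted-hex NET into area ID, system ID, and SEL."""
    digits = net.replace(".", "")
    # At least 1 byte of area ID + 6 bytes of system ID + 1 byte of SEL.
    if len(digits) % 2 or len(digits) < 16:
        raise ValueError("NET is too short or not a whole number of bytes")
    int(digits, 16)  # raises ValueError if the NET contains non-hex characters
    sel = digits[-2:]
    system_id = digits[-14:-2]
    area_id = digits[:-14]
    return {
        "area_id": area_id,
        "system_id": ".".join(system_id[i : i + 4] for i in range(0, 12, 4)),
        "sel": sel,
    }

net = parse_net("47.0000.0000.0000.0011.00")
```

Under this assumption, the example NET yields area ID 47.0000 (returned as the hex string 470000), system ID 0000.0000.0011, and SEL 00.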
Hierarchical Topology of IS-IS
Figure 11-30 Hierarchical topology of IS-IS
Area Division of IS-IS Routing Domain
The preceding figure illustrates the two-level network topology of the
IS-IS protocol. A typical IS-IS network is composed of a level-2 area
serving as the core backbone network and multiple level-1 areas serving
as access networks. Each level-1 area uses one or more level-2 switches
to access the level-2 area, and the level-1 areas are connected to each
other through the level-2 area, which forms a two-level network
topology. An IS-IS network can also consist of only one level-1 area or
only one level-2 area; a more detailed area division is not required.
Route Learning in the IS-IS Area
The LSDB of each area is independent, and so is its SPF route
calculation. The purpose of dividing areas is to divide the entire
network into many small routing domains, which reduces the size of the
LSDB and consequently the memory consumption and the SPF calculation
overhead. But a new problem arises: the SPF calculation can only learn
routes within an area. How should routes be learned between areas?
Route Learning Between the IS-IS Areas
According to the preceding topology, the level-1 areas are connected
through the level-2 area. If the problem of routing between a level-1
area and the level-2 area is solved, the entire network can be
interconnected.
A level-1 area and the level-2 area are connected through a level-2
switch, which runs the level-1 protocol and the level-2 protocol of IS-IS
at the same time. The problem of routing between a level-1 area and the
level-2 area is therefore solved on the level-2 switch: it advertises the
routes learned from the level-1 area to the level-2 area, and advertises
the attach tag to the level-1 area to show that it is connected to the
level-2 core network.
Learning Routes of Level-2 Area Reaching Level-1 Area
On the level-2 switch, the routing information of the level-1 area
calculated by the level-1 SPF is redistributed into the level-2 routing
information and published. As a result, all switches in the level-2 area
can learn routes to all the subnets in the level-1 area.
Learning Routes of Level-1 Area Reaching Level-2 Area
The attach tag is set in the level-1 routing information published by the
level-2 switch. It indicates that the switch is connected to the level-2
core network. As a result, all switches in the level-1 area generate a
default route to the level-2 switch, and thus have a default route to the
level-2 area.
Creation of Neighbor and Generation of Adjacency Information in IS-IS Protocol
For the IS-IS protocol, interface networks are classified into
point-to-point networks and broadcast networks. Neighbor creation and the
generation of adjacency information differ between the two interface
network types.
Designated IS
The designated IS (DIS) exists only on a broadcast network. It is
elected by all the IS systems on the same broadcast network. The
election is based on the priority of the interface that connects each IS
to the broadcast network and on the SNPA address (on Ethernet, it is the
MAC address; on other networks, it is the IS system ID). The IS with the
highest priority is elected as the DIS. When the priorities are the same,
the IS with the greater SNPA address is elected.
The functions of the DIS are as follows: 1. Create the pseudo-node, and
generate and publish the adjacency information about the pseudo-node; 2.
Send CSNP packets periodically to ensure the synchronization of the LSDB
in all IS systems on the broadcast network.
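The election rule (highest priority first, then the greater SNPA address) can be expressed as a comparison on a tuple. The sketch below is a Python illustration for Ethernet, where the SNPA is the MAC address; the candidate names, priorities, and MAC addresses are made-up sample values.

```python
def elect_dis(candidates):
    """Return the name of the IS elected as DIS.

    candidates is a list of (name, interface_priority, mac_address) tuples.
    """
    def election_key(candidate):
        _name, priority, mac = candidate
        # Compare the priority first, then the MAC address as raw bytes.
        return (priority, bytes.fromhex(mac.replace("-", "")))

    return max(candidates, key=election_key)[0]

# Hypothetical IS systems on one Ethernet segment.
lan = [
    ("A", 64, "00-01-7A-00-00-01"),
    ("B", 64, "00-01-7A-00-00-02"),
    ("C", 32, "00-01-7A-00-00-09"),
]
dis = elect_dis(lan)
```

Here A and B tie on priority 64, so B wins on the greater MAC address, while C loses on priority despite having the greatest MAC address.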
Pseudo-node
The pseudo-node exists only on a broadcast network. Its purpose is to
simplify the adjacency topology used for route calculation. It is
generated by the DIS. The pseudo-node is adjacent to all the IS systems
on the broadcast network, but it has no neighbors. The adjacency
information, including that of the pseudo-node, forms the adjacency
network topology, as shown in the preceding figure.
Neighbor ID
Figure 11-31 IS-IS neighbor ID
A network node in the adjacency topology is identified in the LSDB by
its neighbor ID, as shown in the preceding figure. There are two types of
nodes in the adjacency topology: 1. An IS: in its neighbor ID, the system
ID is its own system ID and the Circ ID is always 0x00; 2. A pseudo-node,
created by the DIS: in its neighbor ID, the system ID is the system ID of
the DIS and the Circ ID is the ID of the DIS interface that generates
the pseudo-node; it must be non-zero to distinguish it from the neighbor
ID of an IS.
Concepts of Neighbor and Adjacency
Figure 11-32 Relationship between neighbor and adjacency in IS-IS
broadcast network
Key Words | Description
Neighbor | Discovered and kept alive through hello packets (IIH). It represents the physical connection between IS systems.
Adjacency | The topology around the local IS that is advertised to the entire IS-IS routing domain. It describes the reachable network nodes (IS or pseudo-node) and is used to organize the LSP packets. The LSP packets of all IS systems form the LSDB, which describes the entire network topology for SPF route calculation.
Relationship between neighbor and adjacency | Adjacencies are generated from neighbors. On a point-to-point network, the adjacency topology is equivalent to the neighbor topology. On a broadcast network, as shown in the preceding figure, a pseudo-node is added as a bridge in the adjacency topology, whereas the neighbors are fully connected to each other.
Difference between neighbor and adjacency | The difference between neighbor and adjacency lies in the broadcast network. The topology composed of neighbors is the physical topology: the direct neighbor relations of all IS systems on the same broadcast network form a full mesh, and the neighbor topology does not contain the pseudo-node generated by the DIS. The topology composed of adjacencies is the topology used for SPF route calculation: on the same broadcast network, all IS systems are shown as adjacent to the pseudo-node of the broadcast network, and the adjacency topology contains the pseudo-node.
Creation of Neighbors
In the IS-IS protocol, neighbors are discovered and kept alive by sending
and receiving hello packets (IIH). When an interface runs the IS-IS
protocol, it sends hello packets (IIH) periodically. Neighbor creation
differs between point-to-point networks and broadcast networks. After a
neighbor is created, hello packets (IIH) continue to be sent periodically
to keep the neighbor alive.
On a point-to-point network, the point-to-point neighbor relation is
created through a three-way handshake (RFC 3373).
On a broadcast network, the LAN neighbor relation is also created through
a three-way handshake. After the neighbors are created, all IS systems on
the broadcast network elect a DIS.
Generation of Adjacency Information
The adjacency information describes the IS systems that the local IS can
reach directly. The generated adjacency information is always described
in point-to-point form.
For a point-to-point network, the adjacency information is generated
directly from the neighbor relationship.
For a broadcast network, to simplify the adjacency topology, the DIS
creates a virtual pseudo-node for the broadcast network. All IS systems
on the broadcast network generate adjacency information to the
pseudo-node. The adjacency information of the pseudo-node lists the IS
systems attached to the broadcast network; it is generated and published
by the DIS.
Publishing IS-IS Routing Information
Content of the Routing Information
The routing information of the IS-IS protocol is organized in the Type
Length Value (TLV) format. It is carried in LSP packets, through which it
is published. The routing information published by the IS-IS protocol
includes two types: adjacency information, used to form the entire
network topology; and reachable subnet information, used to describe the
subnets of the local IS (such as 10.0.0.0/8).
The adjacency information is obtained from the neighbor relationships, as
described previously.
The reachable subnet information comes from: 1. the directly connected
routes of the interfaces covered by IS-IS; 2. routing information
redistributed from other protocols; 3. route leaking between levels.
Publishing the Routing Information
The IS-IS routing information is carried in LSP packets and is published
to all the IS systems in the entire area by flooding. Flooding: when an
IS system receives an LSP packet, it saves a copy in the LSDB and then
sends the LSP packet out of all interfaces except the one on which it
was received.
Why the LSDBs between IS Systems Should Be Synchronized
If the LSDBs of the IS systems are not synchronized, the calculated SPF
trees are inconsistent and routing loops may occur. Therefore, when the
network is stable, the LSDBs of all the IS systems in the entire area
must be synchronized.
Why the LSDBs between IS Systems May Not Be Synchronized
The LSDB is composed of LSP packets. IS-IS packets are transmitted
directly over the link layer and do not rely on a reliable transmission
mechanism, so LSP packets may be dropped in the transmission process and
the LSDBs may become unsynchronized. Ensuring the synchronization of the
LSDBs therefore means ensuring the reliable delivery of the LSP packets.
The synchronization protection mechanisms differ for the point-to-point
network and the broadcast network.
Synchronization Protection Mechanism of the LSDB between IS
Systems in the Point-to-Point Network
In the point-to-point network, the sent LSP packets are acknowledged
through the PSNP packets to ensure the reliable transmission of the LSP
packets. The PSNP packets contain the abbreviated description information
about the LSP packets to be acknowledged.
Synchronization Protection Mechanism of the LSDB between IS
Systems in the Broadcast Network
On a broadcast network, unlike on a point-to-point network, LSDB
synchronization is driven by the DIS. The DIS periodically sends CSNP
packets to the broadcast network, advertising abbreviated information
about its LSDB, namely about the LSP packets in the LSDB. When the other
IS systems on the broadcast network receive a CSNP packet, they compare
it with their own LSDB. If an IS has LSP packets that the CSNP packet
does not list, it sends those packets to the broadcast network; if it
lacks certain LSP packets, it sends PSNP packets to the DIS to request
them. As a result, the LSDBs of all IS systems on the broadcast network
are synchronized.
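The comparison that an IS performs on receiving a CSNP packet can be sketched as follows. In this Python illustration the LSDB and the CSNP summary are both modelled as dictionaries mapping an LSP ID to a sequence number; treating a higher sequence number as a newer LSP goes beyond the description above and is an assumption, as are the function name and the sample LSP IDs.

```python
def compare_csnp(local_lsdb, csnp_summary):
    """Compare the local LSDB with the summary advertised by the DIS.

    Returns (to_send, to_request):
    to_send    -- LSP IDs to send to the broadcast network (missing from or
                  older in the CSNP)
    to_request -- LSP IDs to request from the DIS with a PSNP (missing from
                  or older in the local LSDB)
    """
    to_send = sorted(
        lsp_id for lsp_id, seq in local_lsdb.items()
        if seq > csnp_summary.get(lsp_id, -1)
    )
    to_request = sorted(
        lsp_id for lsp_id, seq in csnp_summary.items()
        if seq > local_lsdb.get(lsp_id, -1)
    )
    return to_send, to_request

# Hypothetical LSP IDs and sequence numbers.
local = {"0001.00-00": 5, "0003.00-00": 2, "0004.00-00": 1}
csnp = {"0001.00-00": 5, "0002.00-00": 7, "0003.00-00": 4}
to_send, to_request = compare_csnp(local, csnp)
```

In this sample the IS sends LSP 0004.00-00, which the DIS does not list, and requests 0002.00-00 and 0003.00-00 from the DIS.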
IS-IS Route Calculation
The route calculation of the IS-IS protocol includes the following two
steps:
Step 1: Calculate the SPF tree with the SPF algorithm from the network
topology formed by the adjacency information in the LSDB. This yields the
shortest path and the next hop to each network node (namely, each IS).
Step 2: Generate the routes from the SPF tree together with the reachable
subnet information (such as 10.0.0.0/8) advertised by each network node
(namely, each IS) in the LSDB.
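The two steps can be sketched in Python with a plain Dijkstra search. The sample topology is an assumption chosen for illustration: switches A, B, and C share one Ethernet segment represented by a pseudo-node (named "B.02" here), C connects to D, every link has metric 10, the pseudo-node reaches each IS at metric 0, and D advertises 10.0.0.0/8 with metric 10. Function and variable names are hypothetical.

```python
import heapq

def spf(adjacency, root, pseudo_nodes=()):
    """Step 1: Dijkstra from root.

    Returns {node: (cost, next_hop)}, where next_hop is the first real IS
    on the path (pseudo-nodes are skipped when the next hop is chosen).
    """
    result = {}
    heap = [(0, root, "")]
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if node in result:
            continue
        result[node] = (cost, hop)
        for neighbor, metric in adjacency.get(node, {}).items():
            if neighbor in result:
                continue
            # Leaving the root or a pseudo-node, the neighbor itself becomes the hop.
            next_hop = neighbor if node == root or hop in pseudo_nodes else hop
            heapq.heappush(heap, (cost + metric, neighbor, next_hop))
    return result

def build_routes(tree, subnets):
    """Step 2: combine the SPF tree with the subnets each IS advertises."""
    routes = {}
    for node, prefixes in subnets.items():
        if node not in tree:
            continue
        cost, hop = tree[node]
        for prefix, metric in prefixes:
            total = cost + metric
            if prefix not in routes or total < routes[prefix][0]:
                routes[prefix] = (total, hop)
    return routes

adjacency = {
    "A": {"B.02": 10},
    "B": {"B.02": 10},
    "B.02": {"A": 0, "B": 0, "C": 0},
    "C": {"B.02": 10, "D": 10},
    "D": {"C": 10},
}
subnets = {"D": [("10.0.0.0/8", 10)]}
tree = spf(adjacency, "A", pseudo_nodes={"B.02"})
routes = build_routes(tree, subnets)
```

Under these assumptions, the sketch reaches D at cost 20 with C as the next hop, and installs 10.0.0.0/8 with a total cost of 30 via C.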
Typical Application of the IS-IS Protocol
Figure 11-33 Network topology of the IS-IS typical application
Illustration
As shown in the preceding network topology, there are four switches (A, B,
C, and D), namely four IS systems. The following describes the route
learning process, using the example of switch A learning the route to
subnet 10.0.0.0/8 of switch D. The metric of each link is 10. The DIS
elected on the Ethernet network is switch B.
Step 1: Publ ishing Rout ing Informat ion Generation of Adjacency Information
Figure 11-34 Adjacency topology of the IS-IS typical application
The adjacency information generated by each system forms the preceding
adjacency topology. The adjacency information generated by each IS is as
follows:
Table 1-8 Adjacency information generated by IS in the IS-IS Example
Network Node    System ID                      Neighbor ID         Adjacency Information
IS A            0000.0000.0001                 0000.0000.0001.00   Adjacency to B (0000.0000.0002.02) metric 10
IS B            0000.0000.0002                 0000.0000.0002.00   Adjacency to B (0000.0000.0002.02) metric 10
Pseudo-node B   0000.0000.0002 (same as DIS)   0000.0000.0002.02   Adjacency to A (0000.0000.0001.00) metric 0
                                                                   Adjacency to B (0000.0000.0002.00) metric 0
                                                                   Adjacency to C (0000.0000.0003.00) metric 0
IS C            0000.0000.0003                 0000.0000.0003.00   Adjacency to B (0000.0000.0002.02) metric 10
                                                                   Adjacency to D (0000.0000.0004.00) metric 10
IS D            0000.0000.0004                 0000.0000.0004.00   Adjacency to C (0000.0000.0003.00) metric 10
Generation of Reachable Subnet Information
IS D publishes its directly connected reachable subnet 10.0.0.0/8 with a
metric of 10.
Publishing the Routing Information
Through the flooding of routing information, the LSDB of each IS contains
the preceding adjacency information and the reachable subnet information.
Step 2: Perform SPF Calculation to Get the Shortest Path from Switch A to Each Switch
Figure 11-35 SPF tree of IS-IS route calculation example
On IS A, take A as the root and run the SPF algorithm on the LSDB to
calculate the SPF tree shown in the preceding figure. The shortest path to
IS D is A->C->D (the pseudo-node is ignored when the path is listed). If
the Ethernet interface of A is vlan1 and the IP address of the Ethernet
interface of C is 3.3.3.3, then the next-hop interface to IS D is vlan1,
the next-hop address is 3.3.3.3, and the metric is 20.
Step 3: Generate Route According to Reachable Subnet
D advertises that it can reach subnet 10.0.0.0/8 with a metric of 10; the
next hop and the metric (20) from A to D were obtained through the SPF
calculation. Combining the two, A obtains the IPv4 route: the next-hop
interface to 10.0.0.0/8 is vlan1, the next-hop address is 3.3.3.3, and the
metric is 20 + 10 = 30.
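The three steps of this example can be reproduced with a short sketch. It uses a plain Dijkstra implementation over the adjacency table above; the node labels ("A", "B.02" for pseudo-node B, and so on) and the dictionary layout are assumptions made for the illustration, not the data structures of the switch.

```python
import heapq

# Adjacency information from the example table; "B.02" stands for pseudo-node B.
LSDB = {
    "A": {"B.02": 10},
    "B": {"B.02": 10},
    "B.02": {"A": 0, "B": 0, "C": 0},
    "C": {"B.02": 10, "D": 10},
    "D": {"C": 10},
}
# Reachable subnets advertised by each IS (subnet -> metric).
SUBNETS = {"D": {"10.0.0.0/8": 10}}


def is_pseudo(node):
    return "." in node


def spf(lsdb, root):
    """Dijkstra: return {node: (cost, first real IS on the path from root)}."""
    best = {root: (0, None)}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node][0]:
            continue
        hop = best[node][1]
        for nbr, metric in lsdb.get(node, {}).items():
            new_cost = cost + metric
            if nbr not in best or new_cost < best[nbr][0]:
                # The next hop is the first non-pseudo IS after the root.
                nbr_hop = hop if hop is not None else (None if is_pseudo(nbr) else nbr)
                best[nbr] = (new_cost, nbr_hop)
                heapq.heappush(heap, (new_cost, nbr))
    return best


def routes(lsdb, subnets, root):
    """Combine the SPF tree with the advertised subnets (Step 3)."""
    tree = spf(lsdb, root)
    table = {}
    for node, advertised in subnets.items():
        if node in tree:
            cost, hop = tree[node]
            for subnet, metric in advertised.items():
                table[subnet] = (hop, cost + metric)
    return table


print(spf(LSDB, "A")["D"])          # (20, 'C')
print(routes(LSDB, SUBNETS, "A"))   # {'10.0.0.0/8': ('C', 30)}
```

In the example topology, next hop C corresponds to interface vlan1 and address 3.3.3.3 on switch A.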
BGP Dynamic Routing Protocol
Main contents:
Terms of BGP protocol
Introduction to the BGP protocol
Terms of BGP Protocol
AS-Autonomous System. An AS is a set of routing devices and hosts under
the same administrative control and policy. AS numbers are allocated by
the Internet registration organizations.
EBGP-BGP between AS systems. An EBGP neighbor is a routing device
outside the administrative and policy control of the local AS.
IBGP-BGP within the same AS. An IBGP neighbor is a routing device in
the same administrative control domain.
CIDR-Classless Interdomain Routing. CIDR is an address allocation
scheme used to curb the explosive growth of entries in the IP routing
tables of routing devices and to ease the exhaustion of IP addresses.
In CIDR, an IP network is represented by a prefix, which is written as an
IP address plus the number of most significant bits that form the network
part.
NLRI-Network Layer Reachability Information. NLRI is the part of the BGP
update packet that lists the reachable destinations.
Supernet-a network advertisement whose prefix is at least one bit shorter
than the natural mask of the network. For example, the natural mask of the
class C network 202.11.1.0 is 255.255.255.0. If we use 202.11.0.0/16 to
represent the network address, the mask is 16 bits, which is less than 24
bits. Therefore, it is a supernet.
IP Prefix-an IP network address together with the number of mask bits
that form the network.
SYN-Synchronization. Before BGP advertises a route, the route must be
present in the current IP routing table. That is, BGP and the IGP must be
synchronized before the route is advertised.
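The natural-mask comparison used in the term list above (a /16 advertisement for a class C network) can be checked with the Python standard library. The helper names below are hypothetical and exist only for this sketch.

```python
import ipaddress


def natural_prefix_len(address):
    """Prefix length of the classful (natural) mask for an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return 8     # class A
    if first_octet < 192:
        return 16    # class B
    if first_octet < 224:
        return 24    # class C
    raise ValueError("class D/E addresses have no natural mask")


def is_shorter_than_natural_mask(prefix):
    """True if the advertised prefix is shorter than the natural mask."""
    net = ipaddress.ip_network(prefix)
    return net.prefixlen < natural_prefix_len(str(net.network_address))


print(is_shorter_than_natural_mask("202.11.0.0/16"))   # True
print(is_shorter_than_natural_mask("202.11.1.0/24"))   # False
# The /16 advertisement covers the original class C network.
print(ipaddress.ip_network("202.11.0.0/16").supernet_of(
    ipaddress.ip_network("202.11.1.0/24")))            # True
```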
Introduction to the BGP Protocol
Border Gateway Protocol (BGP) is a route selection protocol for
exchanging network layer reachability information (NLRI) between route
selection domains. Its main function is to exchange NLRI with other BGP
peers. A BGP peer refers to any device running BGP with which a BGP
connection is established.
BGP uses TCP as its transport protocol (port 179), which provides reliable
data transmission. Retransmission and acknowledgement of data are
handled by TCP instead of BGP. As a result, the protocol is simplified,
because reliability does not need to be designed into BGP itself.
A TCP connection is created between two routing devices running BGP; the
two routing devices are then called peers. Once the connection is created,
the two peers confirm the connection parameters by exchanging open
packets. The parameters include the BGP version number, AS number,
hold time, BGP identifier, and other optional parameters. After the two
peers negotiate the parameters successfully, BGP exchanges routes by
sending update packets. The update packets contain the list of reachable
destinations (namely NLRI) and the path attributes of each route,
including the AS systems that the route passes. When a route changes,
incremental update packets are used between peers to transmit the
change. BGP does not require routing information to be refreshed
periodically. If the routes do not change, the BGP peers only exchange
keepalive packets. The keepalive packets are sent periodically to confirm
that the connection is still valid.
BGP Message Header
The BGP message header contains a 16-byte marker field, a 2-byte length
field, and a 1-byte type field. The following figure illustrates the format
of the BGP message header.
Figure 11-36 Format of the BGP message header
The header may or may not be followed by data, depending on the message
type. For example, a keepalive message consists of the message header
only, with no data following it.
Marker: the marker field occupies 16 bytes and is used to detect loss of
synchronization between BGP peers. If the message type is open, or
the open packets do not contain authentication information, all bits of
the marker field must be set to 1. Otherwise, the marker field is
calculated by the authentication technology.
Length: the length field occupies 2 bytes. It indicates the total length of
the message, including the header. The minimum allowed length is 19 bytes
and the maximum is 4096 bytes.
Type: the type field occupies one byte. It indicates the type of the BGP
message. The four types of BGP message are as follows:
Figure 11-8 BGP message types
Number Type
1 Open
2 Update
3 Notification
4 Keepalive
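The header layout described above (16-byte marker, 2-byte length, 1-byte type) can be packed and checked with Python's struct module. This is a minimal sketch: the function names are hypothetical, and the all-ones marker covers only the case without authentication.

```python
import struct

HEADER = struct.Struct("!16sHB")        # marker, length, type (network byte order)
MARKER = b"\xff" * 16                   # all bits set to 1 (no authentication)
TYPES = {1: "Open", 2: "Update", 3: "Notification", 4: "Keepalive"}
MIN_LEN, MAX_LEN = 19, 4096


def build_message(msg_type, body=b""):
    """Prepend the fixed header to a message body."""
    length = HEADER.size + len(body)
    if msg_type not in TYPES or not MIN_LEN <= length <= MAX_LEN:
        raise ValueError("invalid type or length")
    return HEADER.pack(MARKER, length, msg_type) + body


def parse_header(data):
    """Validate the header and return (length, type name)."""
    marker, length, msg_type = HEADER.unpack_from(data)
    if marker != MARKER:
        raise ValueError("connection not synchronized")
    if not MIN_LEN <= length <= MAX_LEN:
        raise ValueError("message length is invalid")
    if msg_type not in TYPES:
        raise ValueError("message type is not supported")
    return length, TYPES[msg_type]


keepalive = build_message(4)            # header only, no data
print(len(keepalive), parse_header(keepalive))   # 19 (19, 'Keepalive')
```

The three ValueError cases mirror the three message header error subcodes of the notification message.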
Open Messages
After the TCP connection is created, the first packet sent is the open
message. The open message contains the BGP version number, AS number,
hold time, BGP identifier, and other optional parameters.
If the open message is acceptable, the peer routing device agrees with
the parameters. In this case, a keepalive message is sent to acknowledge
the open message.
In addition to the fixed BGP header, the open message contains the
following fields:
Figure 11-37 Format of the BGP open message
Version: the version field occupies one byte. It indicates the version
number of the BGP protocol. During neighbor negotiation, the peer
routing devices agree on the BGP version number. Usually, the latest
version supported by both routing devices is used.
My Autonomous System: the field is two bytes. It indicates the AS number
of the sending routing device.
Hold Time: the field is two bytes. It indicates the maximum time that the
sender waits between successive keepalive or update messages from its
neighbor. The BGP routing device negotiates with the peer and sets the
hold time to the smaller of the two proposed hold times.
BGP Identifier: the field is four bytes. It indicates the identifier of the
sending BGP routing device. The field is the ID of the routing device,
namely the highest loopback interface address or, failing that, the highest
IP address of the physical interfaces. You can also set the router-id
manually.
Optional Parameter Length: the field is one byte. It indicates the total
length of the optional parameter fields in bytes. If there are no
optional parameters, the field is set to 0.
Optional Parameters: variable length field. It provides the list of the
optional parameters of the BGP neighbor negotiation.
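The open message fields listed above map directly onto a fixed 10-byte structure followed by the optional parameters. The sketch below packs and parses that body (without the message header) and applies the hold time rule. The function names are hypothetical, and version 4 is an assumed default, since the text does not state a version number.

```python
import ipaddress
import struct

# version (1), my AS (2), hold time (2), BGP identifier (4), opt. param. length (1)
OPEN_FIXED = struct.Struct("!BHH4sB")


def build_open(asn, hold_time, bgp_id, optional=b"", version=4):
    """Build the body of an open message (version 4 is an assumption)."""
    return OPEN_FIXED.pack(version, asn, hold_time,
                           ipaddress.IPv4Address(bgp_id).packed,
                           len(optional)) + optional


def parse_open(body):
    version, asn, hold_time, raw_id, opt_len = OPEN_FIXED.unpack_from(body)
    optional = body[OPEN_FIXED.size:OPEN_FIXED.size + opt_len]
    if len(optional) != opt_len:
        raise ValueError("truncated optional parameters")
    return {"version": version, "as": asn, "hold_time": hold_time,
            "bgp_id": str(ipaddress.IPv4Address(raw_id)), "optional": optional}


def negotiated_hold_time(local, peer):
    """Both sides use the smaller of the two proposed hold times."""
    return min(local, peer)


peer_open = parse_open(build_open(asn=100, hold_time=90, bgp_id="1.1.1.1"))
print(peer_open)
print(negotiated_hold_time(180, peer_open["hold_time"]))   # 90
```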
Update Message
The update message is used to exchange routing information between BGP
peers. It is used both to advertise routes to a BGP peer and to withdraw
routes. The update message contains the fixed BGP header and the
following optional parts:
Unfeasible Routes Length: two-byte field. It indicates the total length of
the withdrawn routes field. If the field is 0, there are no withdrawn
routes.
Withdrawn Routes: variable-length field. It contains the list of IP address
prefixes of the routes withdrawn from service.
Total Path Attribute Length: the field is two bytes. It indicates the total
length of the path attribute field.
Path Attribute: the variable-length field contains the list of BGP
attributes related to the prefixes in the NLRI. The path attributes provide
attribute information about the advertised prefixes, such as the priority
or next hop. The information is used for route filtering and route
selection. The path attributes are classified into the following types:
1. Well-Known Mandatory: the attributes must be contained in the BGP
update message and must be implemented and recognized by all BGP
implementations. For example, ORIGIN, AS_PATH, and NEXT_HOP.
ORIGIN: a well-known mandatory attribute. It gives the origin of the
route in the update message. There are three possible origins: IGP,
EGP, and INCOMPLETE. The routing device uses the information when
selecting among multiple routes: the route with the lowest ORIGIN
attribute is selected. IGP is lower than EGP, and EGP is lower than
INCOMPLETE.
AS_PATH: a well-known mandatory attribute. AS_PATH indicates the AS
systems that the route in the update message has passed through.
NEXT_HOP: a well-known mandatory attribute. The attribute describes
the IP address of the next-hop routing device used to reach the
destinations listed in the update message.
2. Well-Known Discretionary: the attributes must be recognized by all
BGP implementations, but a BGP update message may or may not contain
them.
LOCAL_PREF: used to rank multiple routes to the same destination. The
higher the local preference is, the higher the route priority is.
LOCAL_PREF is not contained in the update messages sent to an EBGP
neighbor. If the attribute is contained in the update message from the
EBGP neighbor, the update message will be ignored.
ATOMIC_AGGREGATE: used to warn downstream routing devices that path
information has been lost. Some routing information is lost during route
aggregation because the aggregate is built from different sources with
different attributes. If a routing device sends an aggregate that causes
such information loss, the routing device must add the
ATOMIC_AGGREGATE attribute to the route.
3. Optional Transitive: not all BGP implementations support optional
transitive attributes. If the BGP process does not recognize the
attribute, it checks the transitive flag. If the transitive flag is set,
the BGP process accepts the attribute and passes it on to other BGP
peers.
AGGREGATOR: the attribute identifies the BGP peer (IP address) that
performed the route aggregation and its AS number.
COMMUNITY: the attribute indicates that a destination is a member of a
group of destinations that share one or more features. The type code of
the community attribute is 8. A community is a 32-bit value. To facilitate
management, the community values from 0 (0x00000000) to 65535
(0x0000FFFF) and from 4294901760 (0xFFFF0000) to 4294967295
(0xFFFFFFFF) are reserved. The remaining community values should use
the AS number as the first two bytes. The meaning of the last two bytes
can be defined by the AS. Within the reserved range, several well-known
community values are defined:
NO_EXPORT (4294967041 or 0xFFFFFF01): received routes with this
value cannot be published to EBGP peers. If an alliance is configured,
the route cannot be published beyond the alliance.
NO_ADVERTISE (4294967042 or 0xFFFFFF02): received routes with this
value cannot be published to any EBGP or IBGP peer.
LOCAL_AS (4294967043 or 0xFFFFFF03): received routes with this
value cannot be published to EBGP peers or to peers of other AS systems
in the alliance.
4. Optional Nontransitive: not all BGP implementations support optional
nontransitive attributes. If the BGP process does not recognize the
attribute, it checks the transitive flag. If the transitive flag is not
set, the attribute is ignored and is not passed on to other BGP peers.
MULTI_EXIT_DISC (MED): used by BGP peers to distinguish among multiple
exits to an adjacent AS. The lower the MED is, the higher the route
priority is. MED attributes are exchanged between AS systems. Once a MED
attribute enters an AS, it does not leave that AS (nontransitive). This
differs from the processing of local preference: through MED, an external
routing device can affect the route selection of another AS, whereas local
preference only affects the route selection within the AS.
ORIGINATOR_ID: the attribute is used by the route reflector. The attribute
is a 32-bit value generated by the route reflector. The value is the
ID of the routing device that originated the route in the AS. If the
originator finds its own router-id in the ORIGINATOR_ID of a received
route, it knows that a routing loop has occurred, and the route is ignored.
CLUSTER_LIST: the attribute is a list of the cluster IDs of the route
reflectors that the route has passed. If a route reflector finds its own
local cluster-id in the CLUSTER_LIST of a received route, it knows that a
routing loop has occurred, and the route is ignored.
Network Layer Reachability: the variable-length field contains the list of
reachable IP address prefixes advertised by the sender.
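The 32-bit COMMUNITY layout described in the attribute list above (AS number in the first two bytes, AS-defined value in the last two, with reserved ranges and three well-known values) can be illustrated as follows. The function names are hypothetical and chosen only for this sketch.

```python
WELL_KNOWN = {
    0xFFFFFF01: "NO_EXPORT",
    0xFFFFFF02: "NO_ADVERTISE",
    0xFFFFFF03: "LOCAL_AS",
}


def encode_community(asn, value):
    """Place the AS number in the first two bytes and the value in the last two."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both parts must fit in 16 bits")
    return (asn << 16) | value


def describe_community(community):
    """Return the well-known name, 'reserved', or the 'AS:value' form."""
    if community in WELL_KNOWN:
        return WELL_KNOWN[community]
    if community <= 0x0000FFFF or community >= 0xFFFF0000:
        return "reserved"
    return f"{community >> 16}:{community & 0xFFFF}"


print(describe_community(encode_community(100, 20)))   # 100:20
print(describe_community(4294967041))                  # NO_EXPORT
print(describe_community(65535))                       # reserved
```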
Keepalive Message
The keepalive messages are exchanged between peers periodically to
check whether the peer is reachable.
Notification Message
When any error is detected, a notification message is sent. The BGP
connection is closed after the message is sent. In addition to the fixed
BGP message header, the notification message contains the following
fields:
Error Code: one byte. The field indicates the error type.
Error Subcode: one byte. The field provides more details about the
error.
Data: variable-length field. The field contains the data related to the
error, for example, an invalid message header or an illegal AS number. The
following table lists the possible error codes and error subcodes.
Table 11-8 BGP Notification message error code and error subcode
Error Code                                        Error Subcode
1-Message header error                            1-Connection not synchronized
                                                  2-Message length is invalid
                                                  3-Message type is not supported
2-Open message error                              1-Version number not supported
                                                  2-AS number of the peer is invalid
                                                  3-Invalid BGP identifier
                                                  4-Optional parameter not supported
                                                  5-Authentication failed
                                                  6-Unacceptable hold time
                                                  7-Capability not supported
3-Update message error                            1-Format of the attribute list is incorrect
                                                  2-Well-known attribute cannot be recognized
                                                  3-Well-known attribute is missing
                                                  4-Attribute flag error
                                                  5-Attribute length error
                                                  6-ORIGIN attribute is invalid
                                                  7-AS routing loop
                                                  8-NEXT_HOP attribute is invalid
                                                  9-Optional attribute error
                                                  10-Network field is invalid
                                                  11-AS path format is incorrect
4-Hold timer timeout                              Not used
5-FSM error (errors detected by the FSM)          Not used
6-Cease (critical errors except those listed)     Not used
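The notification body (error code, error subcode, data) can be decoded with a few lines of Python. This is an illustrative sketch only: the lookup tables reproduce part of the table above, and the function names are hypothetical.

```python
import struct

ERROR_CODES = {
    1: "Message header error",
    2: "Open message error",
    3: "Update message error",
    4: "Hold timer timeout",
    5: "FSM error",
    6: "Cease",
}
# A subset of the subcodes from the table, keyed by (code, subcode).
SUBCODES = {
    (1, 1): "Connection not synchronized",
    (1, 2): "Message length is invalid",
    (2, 6): "Unacceptable hold time",
    (3, 8): "NEXT_HOP attribute is invalid",
}


def build_notification(code, subcode, data=b""):
    """Body of a notification message (the fixed header is not included)."""
    return struct.pack("!BB", code, subcode) + data


def parse_notification(body):
    code, subcode = struct.unpack_from("!BB", body)
    return (ERROR_CODES.get(code, "Unknown error code"),
            SUBCODES.get((code, subcode), f"subcode {subcode}"),
            body[2:])


body = build_notification(2, 6, struct.pack("!H", 1))
print(parse_notification(body))
# ('Open message error', 'Unacceptable hold time', b'\x00\x01')
```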
BGP Finite-State Machine
Before BGP peers can exchange NLRI, a BGP connection must be created.
The creation and maintenance of the BGP connection can be described by
an FSM. The following provides the complete BGP FSM and the input
events causing the state changes.
Figure 11-38 BGP FSM
Table 11-8-3 Input Events (IE)
IE Description
1 BGP starts
2 BGP ends
3 BGP transport connection opens
4 BGP transport connection is terminated
5 Fail to open the BGP transport connection
6 BGP transport fatal errors
7 Connection retry timer times out
8 Hold timer times out
9 Keepalive timer times out
10 Receive Open messages.
11 Receive Keepalive messages.
12 Receive update messages
13 Receive notification messages
Idle: the initial state. BGP stays in the Idle state until an operation
triggers a startup event. The startup event is usually triggered by the
creation or restart of a BGP session.
Connect: BGP is waiting for the transport protocol (TCP) connection to
complete. If the connection succeeds, BGP sends the open message and
enters the OpenSent state. If the connection fails, BGP moves to the
Active state. If the connection retry timer times out, BGP remains in the
Connect state; the timer is reset and a new transport connection is
started. If any other event occurs, BGP returns to the Idle state.
Active: in this state, BGP attempts to create a TCP connection with
the neighbor. If the connection succeeds, BGP sends the open message and
moves to the OpenSent state. If the connection retry timer times out, BGP
restarts the timer and goes back to the Connect state to monitor
connections from the peers.
OpenSent: in this state, the open message has been sent and BGP is
waiting for the open message from the peer. The received open message is
checked. If any error is found, the system sends a notification message
and goes back to the Idle state. If no error is found, BGP sends a
keepalive message to the peer, resets the keepalive timer, and enters the
OpenConfirm state.
OpenConfirm: in this state, BGP is waiting for a keepalive or notification
message. If a keepalive message is received, BGP enters the Established
state. If a notification message is received, BGP goes back to the Idle
state. If the hold timer times out before the keepalive message arrives,
BGP sends a notification message and goes back to the Idle state.
Established: the last phase of the neighbor negotiation. In this state, the
connection between the BGP peers is established, and update,
notification, and keepalive messages can be exchanged between the peers.
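The state descriptions above can be condensed into a transition table. The sketch below is a simplification under stated assumptions: the event names are hypothetical labels for the input events, only the transitions named in the text are listed, and every unlisted event is assumed to return the session to Idle.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_open"): "OpenSent",
    ("Connect", "tcp_open_failed"): "Active",
    ("Connect", "connect_retry_timeout"): "Connect",
    ("Active", "tcp_open"): "OpenSent",
    ("Active", "connect_retry_timeout"): "Connect",
    ("OpenSent", "open_received_ok"): "OpenConfirm",
    ("OpenSent", "open_received_error"): "Idle",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("OpenConfirm", "notification_received"): "Idle",
    ("OpenConfirm", "hold_timeout"): "Idle",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
    ("Established", "notification_received"): "Idle",
}


def next_state(state, event):
    """Assumption: any event not listed for a state drops the session to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    for event in events:
        state = next_state(state, event)
    return state


print(run(["start", "tcp_open", "open_received_ok", "keepalive_received"]))
# Established
print(run(["start", "tcp_open_failed", "connect_retry_timeout"]))
# Connect
```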
BGP Path Attributes
The path attribute is a major feature of the BGP route. The path
attributes provide the information needed for the basic routing function
and allow BGP to set and communicate route policy.
A path attribute can be one of the following:
Well-Known Mandatory;
Well-Known Discretionary;
Optional Transitive;
Optional Non-Transitive;
Well-known mandatory: all BGP update messages contain the attribute,
and all BGP implementations can parse messages containing the attribute.
Well-known discretionary: BGP update messages may contain the attribute,
and all BGP implementations can parse messages containing the attribute.
Optional Transitive: a BGP implementation does not need to support the
attribute, but it should accept a path carrying the attribute and
advertise the path onward.
Optional Non-Transitive: a BGP implementation does not need to support
the attribute. If the attribute is not recognized, it is ignored and is
not published to the peers.
The meanings of the common path attributes are as follows:
ORIGIN: Well-known mandatory; specifies the origin of the route in the
update message;
AS_PATH: Well-known mandatory; uses a sequence of AS numbers to describe
the path between AS systems, that is, the route to the destination
specified by the NLRI;
NEXT_HOP: Well-known mandatory; describes the next-hop IP address of
the path to the published destination;
MULTI_EXIT_DISC: Optional non-transitive; allows one AS to tell an
adjacent AS which entrance point it prefers;
LOCAL_PREF: Well-known discretionary; describes the degree of preference
that the BGP device assigns to the published route;
ATOMIC_AGGREGATE: Well-known discretionary; used to warn downstream
devices that path information has been lost;
AGGREGATOR: Optional transitive; indicates the AS number and IP
address of the device that generated the aggregate route;
COMMUNITY: Optional transitive; simplifies the implementation of policy;
ORIGINATOR_ID: Optional non-transitive; the route originator prevents
routing loops by checking the ID in the attribute;
CLUSTER_LIST: Optional non-transitive; the route reflector prevents
routing loops by checking the IDs in the attribute;
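How a receiver treats an attribute it does not recognize follows from the four categories above. The sketch below assumes the standard BGP-4 attribute flags octet, in which the high-order bit (0x80) marks an attribute as optional and the next bit (0x40) as transitive; those bit values come from the BGP-4 specification rather than from this manual, and the function names are hypothetical.

```python
OPTIONAL = 0x80      # assumption: flag bits as defined in the BGP-4 specification
TRANSITIVE = 0x40


def category(flags):
    """Classify an attribute by its flags octet."""
    if not flags & OPTIONAL:
        # Mandatory versus discretionary is not encoded in the flags.
        return "well-known"
    return "optional transitive" if flags & TRANSITIVE else "optional non-transitive"


def handle_unrecognized(flags):
    """Action for an attribute that the local BGP process does not recognize."""
    kind = category(flags)
    if kind == "well-known":
        return "error: well-known attribute cannot be recognized"
    if kind == "optional transitive":
        return "accept and pass on to other peers"
    return "ignore and do not pass on"


for flags in (0x40, 0xC0, 0x80):
    print(hex(flags), category(flags), "->", handle_unrecognized(flags))
```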
BGP Route Decision
BGP Path Decision Process
When multiple routes to the same destination with the same prefix length
exist, BGP selects the best route according to the following rules:
1. Next-hop unreachable route will be ignored;
2. Preferentially select the route with the maximum weight value;
3. Preferentially select the route with the maximum LOCAL_PREF value;
4. Preferentially select the route originated locally;
5. Preferentially select the route with the shortest AS_PATH;
6. Preferentially select the route with lowest ORIGIN attribute;
7. Preferentially select the route with the minimum MED value;
8. Preferentially select the route obtained through the EBGP, instead of
through IBGP;
9. Preferentially select the route whose next-hop has the minimum IGP
metric;
10. Preferentially select the first received EBGP route;
11. Preferentially select the route with the minimum BGP ROUTER-ID;
12. Preferentially select the route with shortest CLUSTER_LIST;
13. Preferentially select the route from the lowest neighbor address;
14. If the BGP load balancing is started, rules 10-13 are ignored. All routes
with the same AS_PATH length and MED values are installed in the
routing table.
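Rules 1 to 13 above amount to a lexicographic comparison, which can be sketched as a sort key. This is a simplified illustration and not the switch implementation: the Route fields and their default values are hypothetical, ORIGIN is encoded as 0 (IGP), 1 (EGP), 2 (INCOMPLETE), rule 10 is modeled as a plain arrival order, and load balancing (rule 14) is not modeled.

```python
from dataclasses import dataclass
import ipaddress


@dataclass
class Route:
    name: str
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100          # illustrative default
    local_origin: bool = False
    as_path_len: int = 0
    origin: int = 0                # 0 IGP, 1 EGP, 2 INCOMPLETE
    med: int = 0
    ebgp: bool = False
    igp_metric: int = 0
    arrival: int = 0               # lower means received earlier
    router_id: str = "0.0.0.0"
    cluster_list_len: int = 0
    neighbor: str = "0.0.0.0"


def preference_key(r):
    """Smaller key wins; 'maximum' rules are negated, 'minimum' rules kept."""
    return (-r.weight, -r.local_pref, not r.local_origin, r.as_path_len,
            r.origin, r.med, not r.ebgp, r.igp_metric, r.arrival,
            int(ipaddress.IPv4Address(r.router_id)), r.cluster_list_len,
            int(ipaddress.IPv4Address(r.neighbor)))


def best_route(routes):
    candidates = [r for r in routes if r.next_hop_reachable]   # rule 1
    return min(candidates, key=preference_key) if candidates else None


isp1 = Route("via ISP1", local_pref=200, as_path_len=3)
isp2 = Route("via ISP2", local_pref=100, as_path_len=1)
print(best_route([isp1, isp2]).name)     # via ISP1

link1 = Route("LINK1", med=20, ebgp=True)
link2 = Route("LINK2", med=10, ebgp=True)
print(best_route([link1, link2]).name)   # LINK2
```

The first comparison shows that a higher LOCAL_PREF wins even against a shorter AS_PATH, because LOCAL_PREF is evaluated earlier in the rule list.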
Example of LOCAL_PREF and MED Preferential Selection
Figure 11-39 In the same condition, preferentially select the route with
higher LOCAL_PREF value
User AS100 obtains routes from ISP1 and ISP2, and ISP1 is the preferred
ISP. When the device connected to ISP1 announces routes to switch-F, a
higher LOCAL_PREF value is set on them. For the same destination, the
routes learned from ISP1 are selected preferentially because their
LOCAL_PREF value is higher.
Figure 11-40 In the same condition, preferentially select the route with
lower MED value
A dual-homed structure is used between a user and an ISP. The ISP should
prefer LINK2 and use LINK1 as the backup. When the user publishes
routes to the ISP, the update packets with the lower MED value are
sent over LINK2. If the routes received over the EBGP neighbors created
on LINK2 and LINK1 do not differ in any earlier selection criterion, the
route with the lower MED is selected preferentially. As a result, the
traffic from the ISP to the user is carried over LINK2.
Route Filtering
Route filtering means that a BGP speaker can decide which routes it sends
to, and which routes it accepts from, any BGP peer. Route filtering is
done by defining a route policy. The procedure is as follows:
1. Identify routes
2. Permit or deny routes
3. Manipulate attributes
Route filtering can be completed through an access list, a prefix list, or
an AS path access list. A route map can also be used to implement
filtering and attribute manipulation.
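The three-step procedure above (identify, permit or deny, manipulate attributes) can be sketched as a small policy evaluator. The policy layout and function name are hypothetical and do not correspond to any switch command; denying routes that match no entry is an assumption made for the example.

```python
import ipaddress

# Each entry: (action, prefix to match, attribute changes applied on permit).
POLICY = [
    ("deny", "10.0.0.0/8", {}),
    ("permit", "202.11.0.0/16", {"local_pref": 200}),
]


def apply_policy(prefix, attrs, policy=POLICY):
    """Return the (possibly modified) attributes, or None if the route is denied."""
    net = ipaddress.ip_network(prefix)
    for action, match, changes in policy:
        if net.subnet_of(ipaddress.ip_network(match)):   # step 1: identify
            if action == "deny":                         # step 2: permit or deny
                return None
            return {**attrs, **changes}                  # step 3: set attributes
    return None   # assumption: routes matching no entry are denied


print(apply_policy("202.11.1.0/24", {"local_pref": 100, "med": 0}))
# {'local_pref': 200, 'med': 0}
print(apply_policy("10.1.0.0/16", {"local_pref": 100}))   # None
```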
Route Reflector
The route reflector is the centralized routing device, or focal point, of
all internal BGP (IBGP) sessions. A peer routing device of the route
reflector is called a route reflector client. The clients peer with the
route reflector and exchange routing information with it. The route
reflector then passes on, or reflects, the information to all other
clients, which removes the need for a fully meshed environment. As a
result, the cost of maintaining a full mesh of IBGP sessions is saved.
The route reflector is recommended only in a large-scale internal BGP
mesh. Route reflection increases the overhead on the route reflector
itself. If the configuration is incorrect, routing loops or route
instability may occur. Therefore, the route reflector is not recommended
in every topology.
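The loop protection that route reflection relies on (the ORIGINATOR_ID and CLUSTER_LIST checks described with the path attributes) can be sketched as follows. The function names are hypothetical, and prepending the local cluster ID when a route is reflected is the behavior assumed here from the description of CLUSTER_LIST as the list of clusters a route has passed.

```python
def accept_reflected_route(local_router_id, local_cluster_id,
                           originator_id, cluster_list):
    """Return False if the route has looped back to this device."""
    if originator_id == local_router_id:
        return False          # our own route came back to us
    if local_cluster_id in cluster_list:
        return False          # the route already passed through our cluster
    return True


def reflect(local_cluster_id, cluster_list):
    """Assumed behavior: a reflector records its cluster ID before reflecting."""
    return [local_cluster_id] + list(cluster_list)


clusters = reflect("10.0.0.1", [])
print(clusters)                                                       # ['10.0.0.1']
print(accept_reflected_route("2.2.2.2", "10.0.0.2", "1.1.1.1", clusters))  # True
print(accept_reflected_route("2.2.2.2", "10.0.0.1", "1.1.1.1", clusters))  # False
print(accept_reflected_route("1.1.1.1", "10.0.0.2", "1.1.1.1", clusters))  # False
```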
Alliance
The alliance (confederation) is another method for handling the sharp
growth of the IBGP mesh in the AS. Similar to the route reflector, the
alliance is recommended only in a large-scale internal BGP mesh.
The concept of the alliance is based on the fact that one AS can be
divided into multiple sub-AS systems. In each sub-AS, all IBGP rules
apply. For example, all BGP routing devices in the sub-AS must form
a full mesh. Each sub-AS has a different AS number. Therefore,
external BGP must be run between them. Although EBGP is used
between sub-AS systems, the route selection in the alliance is similar to
the IBGP route selection in a single AS. Namely, when a sub-AS border
is crossed, the next-hop, MED, and local preference information is
preserved. From the outside, an alliance looks like a single AS.
The drawbacks of the alliance are as follows: when a network is changed
from a non-alliance design to an alliance design, the routing devices must
be reconfigured and the logical topology must be changed. In addition, if
the BGP policy is not set manually, the best route may not be selected
through the alliance.
Route Damping
Route damping (route attenuation) is a technology for controlling route
instability. It significantly reduces the instability caused by route
oscillation.
Route damping divides routes into well-behaved routes and ill-behaved
routes. Well-behaved routes demonstrate high stability over the long
term, whereas ill-behaved routes demonstrate instability in the short
term. An ill-behaved route should be penalized in direct proportion to
its expected instability. Unstable routes should be suppressed until they
become stable.
The recent history of a route is the basis for evaluating its future
stability. To know the route history, you first need to know how many
times the route has flapped in a certain period. In route damping, a
route is penalized each time it flaps. When the penalty reaches a
predefined limit, the route is suppressed. After the route is suppressed,
it can continue to accumulate penalties. The more frequently the route
flaps, the earlier it is suppressed.
Similar rules are used to un-suppress the route and re-advertise it.
An algorithm reduces (decays) the penalty over time according to an
exponential law. The algorithm is configured through parameters defined
by users.
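The penalty mechanism can be illustrated with a small model. All numbers below (penalty per flap, suppress limit, reuse limit, half-life) are illustrative assumptions, not the defaults of the switch, and the decay is assumed to be exponential with a configurable half-life.

```python
# Illustrative parameters only; the real values are defined by the user.
PENALTY_PER_FLAP = 1000.0
SUPPRESS_LIMIT = 2000.0
REUSE_LIMIT = 750.0
HALF_LIFE = 900.0            # seconds


class DampedRoute:
    def __init__(self):
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        """Halve the penalty every HALF_LIFE seconds since the last update."""
        self.penalty *= 0.5 ** ((now - self.last_update) / HALF_LIFE)
        self.last_update = now
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False      # stable again: re-advertise

    def flap(self, now):
        self._decay(now)
        self.penalty += PENALTY_PER_FLAP
        if self.penalty >= SUPPRESS_LIMIT:
            self.suppressed = True

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed


route = DampedRoute()
for t in (0, 60, 120):                   # three flaps within two minutes
    route.flap(t)
print(route.is_suppressed(120))          # True
print(route.is_suppressed(1020))         # True  (one half-life later)
print(route.is_suppressed(1920))         # False (two half-lives later)
```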
BGP Graceful Restart
Principle of BGP Graceful Restart
After a routing device becomes faulty and restarts, its neighbors at the
BGP routing layer detect that the neighborship goes down and then comes
up again, which is called BGP neighbor oscillation. The oscillation of the
neighborship finally causes route oscillation. As a result, a route
blackhole exists for a while after the routing device is restarted, or
the data services of the neighbors bypass the restarted routing device.
Consequently, the reliability of the network is decreased.
When a routing device fails, BGP graceful restart prevents the route
disturbance and accelerates route convergence, which ensures the network
reliability.
Procedure for BGP Graceful Restart
BGP graceful restart extends BGP in the following aspects:
1. The graceful restart capability is added to the BGP OPEN message. The
fields are as follows:
Restart-flag: indicates whether the neighbor has restarted. 1: Yes; 0: No.
AFI/SAFI: the address family supporting graceful restart;
Fwd-flag: if an address family has the graceful restart capability and
requests that the routes of the address family be reserved, the value is
1. Otherwise, the value is 0;
2. An EOR flag is added to the BGP update packets to indicate that the
update is complete.
3. Three timers are added:
Restart-timer: started on the Helper end; the longest time to wait for
the session to be re-established after the GR flow is entered;
Stale-path-timer: started on the Helper end; the longest time for which
routes are reserved;
Defer-timer: started on the Restarter end; the longest time for which
route calculation and advertisement are delayed.
Figure 11-41 Graceful restart flow
Restarter end (Switch-A):
1. When the neighbors are first created, negotiate the GR capability
through the open message;
2. When a fault occurs, the forwarding layer of switch A reserves the
routes and continues to guide forwarding;
3. Re-establish the neighbor and send the open message. The restart-flag
is set to 1, which indicates that a restart has been performed. The
open message notifies the neighbors of the restart-time value and the
address families whose routes have been reserved.
4. After the neighbor is re-established, start the defer-timer and
receive updates from the neighbors.
5. Delay the route calculation until the EOR flag from the neighbor is
received or the defer-timer times out.
6. Calculate the routes, update the core routing table, and advertise the
routes.
Helper end (Switch-B):
1. When the neighbors are first created, negotiate the GR capability,
and record that the neighbor has the GR capability.
2. After the restarter end becomes faulty, if a TCP error is detected,
run step 3; if no TCP error is detected, run step 4.
3. Reserve the routes and start the restart timer.
4. Re-establish the neighbor and delete the restart timer. If the timer
existed, start the stale-path timer.
5. If the restart timer times out before the neighbor is re-established,
or the fwd-flag of the corresponding address family in the open
message is not 1, or the open message does not contain the
corresponding address family, run step 8.
6. Send routes to the restarting routing device, and then send the EOR
flag.
7. If the stale-path timer times out before the EOR is received, run
step 8.
8. Delete the reserved routes and then enter the normal BGP flow.
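The decision in helper step 5 can be written as a single predicate. This is a sketch of that one step only, under stated assumptions: the function name is hypothetical, and the restarter's new open message is modeled as a dictionary from (AFI, SAFI) to fwd-flag.

```python
IPV4_UNICAST = (1, 1)        # (AFI, SAFI) for IPv4 unicast


def helper_keeps_routes(restart_timer_expired, family, open_families):
    """Helper step 5: do the reserved routes of one address family survive?

    open_families maps (AFI, SAFI) -> fwd-flag taken from the open message
    that the restarter sends when the neighbor is re-established.
    """
    if restart_timer_expired:
        return False                         # session came back too late
    return open_families.get(family) == 1    # family present and fwd-flag set


# The restarter kept forwarding for IPv4 unicast: the reserved routes stay.
print(helper_keeps_routes(False, IPV4_UNICAST, {IPV4_UNICAST: 1}))   # True
# fwd-flag is 0, or the address family is missing: delete the reserved routes.
print(helper_keeps_routes(False, IPV4_UNICAST, {IPV4_UNICAST: 0}))   # False
print(helper_keeps_routes(False, IPV4_UNICAST, {}))                  # False
# The restart timer expired before the session was re-established.
print(helper_keeps_routes(True, IPV4_UNICAST, {IPV4_UNICAST: 1}))    # False
```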
ACL Technology
This chapter describes the ACL technology and its application. The
configurations related with the ACL function in the switch include the
action group configuration, traffic meter configuration, and time range
configuration.
Main contents:
ACL introduction and application
Introduction to action group
Introduction to traffic meter
Introduction to time domain
ACL Introduction and Application
This section describes the basic concepts and application of the ACL
technology.
Main contents:
Basic concepts of ACL
ACL classification
Typical application
Basic Concepts of ACL Access Control List (ACL) is the basic control mechanism of filtering traffic
on the switch. ACL is the traffic filter and can identify the specified types of
traffic according to the packet attributes, such as IP address and port
number. After identifying the traffic, ACL can perform the specified
operation, such as preventing the traffic from passing through an interface.
An ACL comprises a series of rules. Each rule is used to match one specified
type of traffic. The serial number of the rule (Sequence) decides the
location of the rule in the ACL. The ACL checks packets against the
rules in ascending order of sequence number. The first rule in the ACL that
matches the packet decides the processing result for the packet, permit or
deny. If no rule matches the packet, the packet is denied, that is
to say, the packets that are not permitted are denied. This shows that the
rule order is important.
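The first-match behavior described above can be sketched as follows. This is a minimal illustration, not the switch implementation: the Rule class, its fields, and the CIDR source format are hypothetical names chosen for the example, and only the source address is matched.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class Rule:
    sequence: int  # position of the rule in the ACL
    action: str    # "permit" or "deny"
    source: str    # source network in CIDR form (hypothetical rule format)


def evaluate(rules, src_ip):
    """Return the action of the first matching rule, or "deny" if none match."""
    addr = ip_address(src_ip)
    # Rules are checked in ascending order of sequence number.
    for rule in sorted(rules, key=lambda r: r.sequence):
        if addr in ip_network(rule.source):
            return rule.action  # the first match decides the result
    return "deny"  # packets that are not permitted are denied


acl = [
    Rule(20, "permit", "10.0.0.0/8"),
    Rule(10, "deny", "10.1.1.0/24"),
]
```

With this list, a packet from 10.1.1.5 is denied by rule 10 even though rule 20 would permit it, which is why the rule order matters.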
The following example defines one IP standard access list.
The Attribute field carries the specified authentication, authorization,
information and configuration details of the RADIUS request and response.
A packet can carry multiple Attribute instances. The Attribute format is as follows:
Byte offset | 0 | 1 | 2 …
Field | Type | Length | Value …
The Type field indicates the Attribute type.
The Length field indicates the length of the whole Attribute, including Type,
Length and Value.
The Value field is zero or more bytes and contains the specified Attribute
information. The format and length of Value depend on the Type and
Length.
The following lists several common Attributes:
Attribute | Type | Data Type | Attribute Length
User-Name | 1 | String | Length >= 3
User-Password | 2 | String | 18 <= Length <= 130
NAS-IP-Address | 4 | Address | Length = 6
Service-Type | 6 | Integer | Length = 6
Reply-Message | 18 | String | Length >= 3
Acct-Status-Type | 40 | Integer | Length = 6
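The Type/Length/Value layout above can be walked with a short loop. The following is a minimal sketch, assuming the attributes have already been extracted from the packet as a byte string; the function names are hypothetical and no attribute-specific length checks from the table are applied.

```python
def encode_attribute(attr_type: int, value: bytes) -> bytes:
    """Build one Attribute: Length covers Type, Length and Value."""
    return bytes([attr_type, len(value) + 2]) + value


def parse_attributes(data: bytes):
    """Split a byte string into (type, value) pairs."""
    attrs = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated attribute header")
        attr_type = data[offset]
        length = data[offset + 1]
        # Length includes the Type and Length bytes, so it is at least 2.
        if length < 2 or offset + length > len(data):
            raise ValueError("invalid attribute length")
        attrs.append((attr_type, data[offset + 2:offset + length]))
        offset += length
    return attrs


# User-Name (type 1) followed by NAS-IP-Address (type 4, 192.0.2.1).
raw = encode_attribute(1, b"bob") + encode_attribute(4, bytes([192, 0, 2, 1]))
parsed = parse_attributes(raw)
```

Here the User-Name attribute has Length 5 (3 value bytes plus Type and Length), and the NAS-IP-Address attribute has Length 6, matching the table.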
Introduction to TACACS
TACACS provides the authentication, authorization and accounting services.
TACACS uses TCP to transmit the data and receives the TCP packets on
port 49. The format of the TACACS packet header is as
follows. The packet header is always transmitted in plaintext.
major version field
It is the major version number.
minor version field
It is the minor version number.
type field
It is the packet type, indicating authentication, authorization or accounting.
1-authentication
2-authorization
3-accounting
seq_no field
It is the serial number of the packet.
flags field
It is the flag. The lowest bit indicates whether the packet is encrypted.
session_id field
It is the session ID. It is one random 4-byte number. The ID does not
change in one session.
length field
It is the length of the packet body (excluding the packet header).
The packet header is immediately followed by the authentication, authorization or
accounting packet body. The packet body is encrypted.
The authentication session has three types of packets: START, REPLY,
and CONTINUE. START and CONTINUE are sent by the client and
REPLY is sent by the server.
The authorization session uses one pair of packets, REQUEST and
RESPONSE, to complete the authorization; the accounting session uses
one pair of packets, REQUEST and REPLY, and carries the specified attributes
in the packets.
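The header fields listed above can be decoded as in the following sketch. The field widths are an assumption taken from the published TACACS+ specification (RFC 8907): a 12-byte header with a 4-bit major and 4-bit minor version, one byte each for type, seq_no and flags, and 4 bytes each for session_id and length. The manual itself only states the width of session_id. The names TacacsHeader and parse_header are hypothetical.

```python
import struct
from typing import NamedTuple

PACKET_TYPES = {1: "authentication", 2: "authorization", 3: "accounting"}


class TacacsHeader(NamedTuple):
    major_version: int
    minor_version: int
    packet_type: str
    seq_no: int
    flags: int
    session_id: int
    length: int  # length of the packet body, excluding the header


def parse_header(data: bytes) -> TacacsHeader:
    """Decode the 12-byte plaintext header (layout assumed from RFC 8907)."""
    if len(data) < 12:
        raise ValueError("header needs 12 bytes")
    version, ptype, seq_no, flags, session_id, length = struct.unpack(
        "!BBBBII", data[:12]
    )
    return TacacsHeader(
        major_version=version >> 4,    # high 4 bits of the first byte
        minor_version=version & 0x0F,  # low 4 bits of the first byte
        packet_type=PACKET_TYPES.get(ptype, "unknown"),
        seq_no=seq_no,
        flags=flags,  # in RFC 8907, lowest bit set means the body is NOT encrypted
        session_id=session_id,
        length=length,
    )


sample = struct.pack("!BBBBII", 0xC0, 1, 1, 0, 0x12345678, 40)
header = parse_header(sample)
```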
Introduction to ID Authentication Mechanism
Login Authentication
1. If AAA is not configured and Line is not configured, logins via the
console port or telnet pass the authentication directly; for SSH, you
should use the local login.
2. If AAA is not configured, but Line is configured, authenticate according
to the Line configuration.
Authentication Type Configured on Line | Description
no login | Pass
login (the default login mode of telnet) | Authenticate according to the line password. If the line password is not configured, the login via the console port passes the authentication, but the telnet and ssh logins do not pass the authentication. (Note: If the line password is not configured, the login fails.)
login local (the default login mode of ssh) | Authenticate according to the local password. (Note: If the local user is not configured, the login fails.)
3. If AAA is configured
Authenticate according to the configured method list. One method list
supports 4-6 authentication methods, but four authentication methods can
be configured at most.
When the user logs in via the interface or line, the system authenticates
the ID according to the method list referenced by the interface or line. If
the interface or line does not reference any method list or the referenced
method list is not defined, the system uses the default method list to
authenticate the ID; if the default method list is not configured, the system
uses the default method to authenticate.
For the login via console port, the default method is none; for telnet and
ssh login, the default method is local.
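The three login cases above can be summarized as one decision function. This is a minimal sketch of the rules as stated in this section, not the device code: the function name, parameter names and returned strings are hypothetical labels.

```python
def login_method(access, aaa_configured, line_auth=None,
                 line_password_set=False, line_method_list=None,
                 defined_lists=()):
    """Return how a login is authenticated.

    access: "console", "telnet" or "ssh"
    line_auth: None, "no login", "login" or "login local"
    defined_lists: names of the configured AAA method lists
    """
    if not aaa_configured:
        if line_auth is None:
            # Case 1: neither AAA nor Line is configured.
            return "local" if access == "ssh" else "pass"
        # Case 2: authenticate according to the Line configuration.
        if line_auth == "no login":
            return "pass"
        if line_auth == "login":
            if line_password_set:
                return "line-password"
            return "pass" if access == "console" else "fail"
        if line_auth == "login local":
            return "local"
        raise ValueError("unknown line authentication type")
    # Case 3: AAA is configured.
    if line_method_list in defined_lists:
        return "method-list:" + line_method_list
    if "default" in defined_lists:
        return "method-list:default"
    # No usable method list: fall back to the default method.
    return "none" if access == "console" else "local"
```

For example, a telnet login on a line that references an undefined method list falls back to the default list if one exists, and otherwise to the local method.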
If the user logged in with a valid user name, the user name does not need to
be entered again when authenticating the ID in the privileged mode; only
the password is required.
Authenticate in Privileged Mode
1. AAA is not configured.
Use the enable password to authenticate:
If the login user has the enable password, authenticate according to the
password;
Otherwise, if there is the global enable password, authenticate according
to the global enable password;
If there is no enable password at all, the user that logs in via the console
port directly passes the authentication, but the telnet user does not pass
the authentication.
2. AAA is configured.
Authenticate according to the configured default method list. The method
list supports four authentication methods.
After logging in to the device, the user requests to enter the privileged mode.
The system authenticates the ID according to the default method list; if
the method list does not exist, the system uses the default method to authenticate:
For the login via the console port, the default method is enable none;
For the telnet and ssh login, the default method is enable.
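The privileged-mode rules can be sketched in the same way. The function and parameter names are hypothetical, and the sketch assumes that an ssh user is treated like a telnet user when no enable password exists, which this section does not state explicitly.

```python
def enable_method(access, aaa_configured, user_enable_password=False,
                  global_enable_password=False, default_list_defined=False):
    """Return how entry to the privileged mode is authenticated."""
    if not aaa_configured:
        if user_enable_password:
            return "user-enable-password"
        if global_enable_password:
            return "global-enable-password"
        # No enable password at all: only the console user passes.
        # Assumption: ssh behaves like telnet here.
        return "pass" if access == "console" else "fail"
    if default_list_defined:
        return "method-list:default"
    # No default method list: use the default method.
    return "enable none" if access == "console" else "enable"
```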
EIPS Technology
EIPS is a link layer protocol designed for Ethernet rings. It can
prevent the broadcast storm caused by the data loop. When a link on the
Ethernet ring is disconnected, the standby link can be enabled rapidly to
recover the communication between the nodes on the ring network.
Compared with the STP protocol, EIPS converges quickly (in less than
50 ms), and the convergence time does not depend on the number of nodes
on the ring network.
The EIPS technology supports two modes. One is the sub ring mode. When
processing intersecting rings, decompose the two intersecting rings
into one master ring and one sub ring; there is one public link between the
master ring and the sub ring. The other mode is called the hierarchical mode.
When processing two intersecting rings, choose one ring as the master
ring. After removing the public link with the master ring, the ring
connected to the master ring becomes the low-level link connected to the
master ring.
Sub Ring Mode EIPS
Main contents:
Basic concepts of EIPS
EIPS packet format
Basic theory of EIPS
Typical application of EIPS
Basic Concepts of EIPS
EIPS (Ethernet Intelligent Protection Switchover): The IETF describes the
automatic protection switching of Ethernet rings in RFC 3619 (October 2003,
Informational), which specifies how automatic protection switching is
performed on an Ethernet ring.
EIPS domain: The EIPS domain is identified by the integer ID. A group of
switches that are configured with the same domain ID and are
interconnected form one EIPS domain. The EIPS domain comprises EIPS
ring, EIPS control VLAN, master node, transmission node, edge node and
assistant edge node.
EIPS ring: The EIPS ring is identified by the integer ID. It physically
corresponds with one ring Ethernet topology. Each EIPS ring is one local
unit of the EIPS domain. The EIPS protocol takes effect on the EIPS ring.
The EIPS rings in the EIPS domain are divided into the master ring and sub rings.
In one EIPS domain, there is only one master ring, but there can be one or
multiple sub rings. A sub ring intersects with the upper ring via the edge
node and the assistant edge node.
EIPS master ring: It is the EIPS ring with level as 0.
EIPS sub ring: It is the EIPS ring whose level is larger than 0.
EIPS control VLAN: It is relative to the data VLAN. In the EIPS domain,
the control VLAN can only be used to transmit the EIPS protocol packets.
Each EIPS ring has one control VLAN. The master ring protocol packets are
transmitted in the master control VLAN. The sub ring protocol packets are
transmitted in the sub control VLAN. It is not permitted to configure the IP
address on the master control VLAN and sub control VLAN interfaces. The
port connected to the Ethernet ring on the switch belongs to the control
VLAN and only the port connected to the Ethernet ring can be added to the
control VLAN. The port on the master ring belongs to the master control
VLAN and the sub control VLAN. The port on the sub ring only belongs to
the sub control VLAN. The whole master ring is regarded as one logical
node of the sub ring. The EIPS protocol packets of the sub ring are
transmitted transparently as the user packets of the master ring. The EIPS
protocol packets of the master ring do not enter the sub ring, but are only
transmitted in the master ring.
EIPS node: each switch on the EIPS ring is one node on the EIPS ring.
The nodes on one ring have the same EIPS domain ID and the EIPS ring
ID. Each EIPS node has two EIPS ports connected to the EIPS ring, which
are specified as the master port and standby port by the user during the
configuration.
Master node: The master node is the initiator of polling the status of the
ring network (the master node sends HEALTH packets periodically from
the master and standby ports. If at least one port can receive the packet
from another port, it indicates that the ring is complete. If the HEALTH
packet cannot be received for a long time, it is regarded that the ring fails).
The master node is also the decider of executing the operation after the
network topology status changes.
The master node has the following three states:
Complete State:
When all links on the ring network are in the UP state, the master node
can receive the HEALTH packet sent by itself from the standby port, which
indicates that the master node is in the complete state. The status of the
master node reflects the status of the EIPS ring. Therefore, EIPS ring is
also in the complete state. Here, the master node blocks the standby port,
so as to prevent the packets from forming the broadcast loop on the ring
topology.
Failed State:
When a link on the ring network is in the Down state, the
master node is in the Failed state. Here, the master node enables the
standby port to ensure that the communication between the nodes on the
ring network is not interrupted.
PRE-UP State:
When the master node is in the failed state, it first turns to the Pre-up
state after receiving the HEALTH packet. If it still can receive the HEALTH
packets within a period, it turns to the complete state. This is to prevent
the network flap.
Transmission node: All nodes on the EIPS ring other than the master node
are transmission nodes. The transmission node is responsible for
monitoring the status of the direct-connected link and reporting the status
change to the master node via the EIPS protocol packet, and then the
master node decides how to respond. The two transmission nodes
intersecting with the master ring on the sub ring are divided to edge node
and assistant edge node (there is only the transmission node on the
master ring; the edge node and assistant edge node are just on the sub
ring). If the transmission node on the master ring has the public port with
the edge node of the sub ring, it needs to send the sub ring protocol
channel status detection packet on its port. If the transmission node on
the master ring has the public port with the assistant edge node of the sub
ring, it needs to transmit the received sub ring protocol channel status
detection packet to the corresponding assistant edge node.
The transmission node has the following three states:
Link-Up State (UP state):
The master port and standby port of the transmission node are both in the
up state. The transmission node is in the Link-Up state.
Link-Down State (Down state):
When the master port or standby port of the transmission node is in the
Down state, the transmission node is in the Link-Down state. When the
transmission node in the Link-up state finds that the master port or
standby port is in the Link-Down state, it turns from the Link-Up state to
the Link-Down state and informs the master node by sending the Link-
Down packet.
Preforwarding State (temporary blocked state):
The transmission node cannot directly return to the Link-Up state from the
Link-Down state. When a transmission node in the Link-Down state has one
port up and the other port then recovers, so that the master port and standby
port are both in the Up state, the transmission node turns to the
Preforwarding state and blocks the last recovered port. At the moment
when the master port and standby port of the transmission node recover,
the master node does not learn of the recovery at once, and the
standby port of the master node is still in the enabled state. If the
transmission node returned to the Link-Up state at once, the packets would
form a broadcast loop on the ring network. Therefore, the transmission node
first turns from the Link-Down state to the Preforwarding state.
When the transmission node in the Preforwarding state receives the
COMPLETE-FLUSH-FDB packet sent by the master node, it turns to the
Link-Up state. If the COMPLETE-FLUSH-FDB packet is lost during
transmission, the EIPS protocol provides one backup mechanism to
recover the temporary-blocked port and trigger the status switchover, that
is, if the transmission node does not receive the COMPLETE-FLUSH-FDB
packet in the specified time, it automatically turns to the Link-Up state and
enables the temporary-blocked port.
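The three transmission node states and their transitions can be sketched as a small state machine. This is an illustrative model only: the class and method names are hypothetical, timers are represented by explicit method calls, and the sketch assumes that a port failure in any state leads to the Link-Down state.

```python
from enum import Enum


class TransitState(Enum):
    LINK_UP = "Link-Up"
    LINK_DOWN = "Link-Down"
    PREFORWARDING = "Preforwarding"


class TransitNode:
    def __init__(self):
        self.state = TransitState.LINK_UP
        self.port_up = {"master": True, "standby": True}
        self.blocked_port = None  # the temporary-blocked port, if any
        self.sent = []            # packets sent to the master node

    def port_changed(self, port, up):
        self.port_up[port] = up
        if not up:
            # A port went down: report it to the master node.
            self.state = TransitState.LINK_DOWN
            self.blocked_port = None
            self.sent.append("LINK-DOWN")
        elif self.state is TransitState.LINK_DOWN and all(self.port_up.values()):
            # Both ports are up again: block the last recovered port.
            self.state = TransitState.PREFORWARDING
            self.blocked_port = port

    def flush_received_or_timeout(self):
        # COMPLETE-FLUSH-FDB arrived, or the backup timer expired.
        if self.state is TransitState.PREFORWARDING:
            self.state = TransitState.LINK_UP
            self.blocked_port = None
```

The node never moves from Link-Down straight to Link-Up; the recovered port stays blocked until the master node confirms the ring state or the backup timer expires.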
Edge node and assistant edge node: The edge node and assistant edge
node are used to detect the status of the sub ring protocol packet channel
in the master ring. The edge node is the initiator of the detection
mechanism, the assistant edge node judges the channel status and
reports to the edge node, and at last, the edge node makes decision
according to the channel status.
The edge node and the assistant edge node are both special transmission
nodes, so they have the same three states as the transmission node, but the
meanings are slightly different, as follows:
Link-Up State (UP state):
When the edge port is in the UP state, it indicates that the edge node
(assistant edge node) is in the Link-Up state.
Link-Down State (Down state):
When the edge port is in the Down state, it indicates that the edge node
(assistant edge node) is in the Link-Down state.
Preforwarding State (temporary-blocked state):
The transferring of the edge node (assistant edge node) status is basically
the same as the transmission node. The difference is that when the port
link status change results in the status transferring of the edge node
(assistant edge node), it only depends on the status of the edge port
(refer to the previous introduction of the edge node status).
The edge node and the assistant edge node are the two main participants in the
mechanism of detecting the sub ring protocol packet channel status in the
master ring. The edge node is the initiator of the mechanism, the assistant
edge node judges the channel status and reports to the edge node, and
finally the edge node makes a decision according to the channel status. The
mechanism is described in detail later.
EIPS port: An EIPS port is an abstract concept, corresponding to one of the
links that form the EIPS ring. The link can be one single physical link or
the aggregation link formed by multiple physical links. On each EIPS node,
there are always two ports connected to the EIPS ring. The EIPS rings may
intersect, so one EIPS port may belong to multiple EIPS rings.
EIPS master port and EIPS standby port: The ports on the master
node and the common transmission node (a node that is neither an edge node
nor an assistant edge node) are divided into master port and standby port.
For the master node, when the loop is complete, the user data VLAN of the
standby port needs to be blocked; for the transmission node, the master port and
standby port do not have special meaning.
EIPS public port and EIPS edge port: The ports on the edge
transmission node and assistant edge transmission node are divided to
public port and edge port. The public port is the port connected to the
public link of two intersecting rings and belongs to multiple EIPS rings. The
edge port only belongs to one sub ring. When the public port fails, the
failure does not need to be reported to the master node of the sub ring; it
only needs to be reported to the master node of the master ring.
EIPS Packet Format
The protocol frame format of the Ethernet ring protection protocol is as
follows:
Table 15-1 EIPS packet format
Bits 0-15 | Bits 16-31 | Bits 32-47
Destination MAC address (6 bytes)
Source MAC address (6 bytes)
Type (Ether Type) (TPID) | PRI + CFI + VLAN ID | Frame Length
DSAP/SSAP | CONTROL | OUI = 0x00E02B
0x00BB | 0x99 | 0x0B | ERP_LENGTH
ERP_VER | ERP_TYPE | Domain_ID | Ring_ID
0x0000 | SYSTEM_MAC_ADDR (high 4 bytes)
SYSTEM_MAC_ADDR (low 2 bytes) | HEALTH_TIMER | FAIL_TIMER
STATE | 0x00 | HEALTH_SEQ | 0x0000
RESERVED (0x000000000000)
RESERVED (0x000000000000)
RESERVED (0x000000000000)
RESERVED (0x000000000000)
RESERVED (0x000000000000)
RESERVED (0x000000000000)
The description of the frame format:
Destination MAC address: 48bits
Table 15-2 The description of the destination MAC address
Destination MAC Description
0180.6307.0000 1. The destination MAC of the HEALTH packet, sent out by
the master and standby ports of the master node,
passing all transmission nodes or common L2 switches;
the transmission node only forwards the HEALTH packet,
but does not send it to CPU. The ports of the master
node receive the HEALTH packet;
2. The destination MAC address of the LINK-DOWN packet,
initiated by the transmission node, edge node or
assistant edge node; inform the master node when the
links of the nodes change;
3. The destination MAC address of the ASK-RING-STATE
packet.
0180.6307.0002 The destination MAC of the COMM-FLUSH-FDB/COMP-FLUSH-
FDB packet. The packet is initiated by the master node. The transmission node forwards the packet and sends it to CPU; the master node does not forward it, but just sends it to CPU.
0001.7A4F.4826 1. The destination MAC address of the EDGE-HEALTH
packet. The packet detects the master ring link between
the edge node and assistant edge node;
2. The destination MAC of the MAJOR-FAULT packet. It is
initiated by the assistant edge node. When the master
ring link between the edge node and assistant edge node
is disconnected, inform the edge node that the master
ring link fails;
3. The destination MAC address of the MAJOR-RESUME
packet. It is initiated by the assistant edge node. When
receiving the EDGE-HEALTH packet of the edge node
again, inform the edge node that the link is recovered.
0001.7A4F.4AB6 Topology request packet
0001.7A4F.4AB4 Uni-directional detection packet
0001.7A4F.4AB5 The HELLO1 packet sent after the standby node does not receive the Hello packets of the master node within some time.
Source MAC address: 48 bits, the MAC address of the sending node;
TPID: 16 bits, fixed as 0x8100;
PRI+CFI: 4 bits, not fixed; the priority can be defined (7 is
recommended by default); CFI is 0 for the standard frame format;
VLAN ID: 12 bits, not fixed;
Frame Length: 16bits, the length of the Ethernet frame, fixed as 0x48;
DSAP/SSAP: 16bits, fixed as 0xAAAA;
CONTROL: 8bits, fixed as 0x03;
OUI: 24bits, fixed as 0x00E02B;
ERP_LENGTH: 16bits, fixed as 0x40;
ERP_VER: 16bits, fixed as 0x0001;
ERP_TYPE: 16bits, the frame type;
Domain_ID: 16bits, the domain ID;
Ring_ID: 16bits, the ring ID;
SYSTEM_MAC_ADDR: 48bits, the MAC address of the sending node;
HEALTH_TIMER: 16bits, the period of sending the HEALTH frames set
by the master node and edge control node (the unit is 16ms);
FAIL_TIMER: 16bits, the timeout of not receiving the HEALTH frames
set by the master node and edge control node (the unit is 16ms);
STATE: 8bits, the node status;
HEALTH_SEQ: 16bits, the serial number of the HEALTH frame,
generated by the master node;
ERP_TYPE: the packet type, defined as follows:
Table 15-3 The definition of packet type
Packet Type | Value | Description
HEALTH packet | 5 | The packet is initiated by the master node, detecting the loop integrity of the network.
COMP-FLUSH-FDB packet | 6 | The packet is initiated by the master node. When the EIPS ring turns to the HEALTH state, inform the transmission node to update the MAC entries and inform the transmission node to un-block the temporary-blocked port.
COMM-FLUSH-FDB packet | 7 | The packet is initiated by the master node. When the EIPS ring turns to the DOWN state, inform the transmission node to update the MAC entries. The packet is also initiated when the transmission node has one port in the Link-Down state.
LINK-DOWN packet | 8 | The packet is initiated by the transmission node, edge node or assistant edge node. When the links of the nodes are down, inform the master node that the loop disappears.
ASK-RING-STATE | 9 | When the port is up and the ring status is not confirmed, the non-master node queries the current ring status from the master node.
EDGE-HEALTH packet | 10 | The packet is initiated by the edge node, detecting the master ring link between the edge node and the assistant edge node.
MAJOR-FAULT packet | 11 | The packet is initiated by the assistant edge node. When the master ring link between the edge node and the assistant edge node is disconnected, inform the edge node that the master ring link fails.
MAJOR-RESUME | 12 | After the assistant edge node finds that the master ring fault recovers, inform the edge node that the master ring fault recovers.
LINK-HELLO | 14 | The uni-directional detection packet.
TOPOLOGY | 15 | The topology collection packet, including the topology request and topology response packets.
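The ERP_TYPE values in Table 15-3 and the 16 ms timer unit of HEALTH_TIMER and FAIL_TIMER can be captured in a few lines. This is a minimal sketch for reference; the names EipsPacketType, describe and timer_to_ms are hypothetical.

```python
from enum import IntEnum


class EipsPacketType(IntEnum):
    HEALTH = 5
    COMP_FLUSH_FDB = 6
    COMM_FLUSH_FDB = 7
    LINK_DOWN = 8
    ASK_RING_STATE = 9
    EDGE_HEALTH = 10
    MAJOR_FAULT = 11
    MAJOR_RESUME = 12
    LINK_HELLO = 14
    TOPOLOGY = 15


def describe(erp_type: int) -> str:
    """Return the packet type name, or a marker for values not in Table 15-3."""
    try:
        return EipsPacketType(erp_type).name
    except ValueError:
        return f"UNKNOWN({erp_type})"


def timer_to_ms(raw: int) -> int:
    """Convert a HEALTH_TIMER or FAIL_TIMER field value (unit: 16 ms) to ms."""
    return raw * 16
```

Value 13 is not listed in the table, so describe(13) reports it as unknown rather than guessing a meaning.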
Basic Theory of EIPS
Basis of EIPS Protocol
All nodes in each domain are configured with the same EIPS domain
ID;
The master ring protocol packets are broadcast in the main control
VLAN; the sub ring protocol packets are broadcast in the sub control
VLAN;
The EIPS ports on the master ring node are added to the main control
VLAN and sub control VLAN; the EIPS ports on the sub ring are only
added to the sub control VLAN;
The protocol packets of the sub ring are processed as data packets in
the master ring, being blocked/enabled synchronously with the
data packets;
Polling Mechanism
The Polling mechanism is the mechanism by which the master node of the EIPS
ring actively detects the health status of the ring network. The master
node periodically sends HEALTH packets from two ports at the same time,
which are transmitted on the ring via the transmission nodes in turn. If the
master node can receive the HEALTH packet sent by itself from any port, it
indicates that the ring network link is complete and the assistant port is
blocked, as shown in Figure 15-1. If neither port receives the
HEALTH packets within the specified time, the ring network link is regarded
as failed; the master node enables the assistant port and sends COMM-FLUSH-FDB
packets from two ports, as shown in Figure 15-2. When the master node
in the Failed state receives the HEALTH packets sent by itself from the
assistant port, it first turns to the PRE-UP state. After some time, it turns
to the Complete state, blocks the assistant port, refreshes FDB, and sends
COMP-FLUSH-FDB packets from the master port to inform all transmission
nodes to enable the temporary-blocked ports and refresh FDB.
There are two reasons why the master node sends HEALTH
packets from two ports at the same time:
When the ring is uni-directional, if HEALTH packets are not sent from
two ports at the same time, the master node may fail to receive
the HEALTH packets; it then enables the assistant port and as a result,
the uni-directional link becomes a loop;
When the standby master node function is enabled and one link in the
loop is DOWN, the standby master node may be on the side of the master
node port that does not send the HEALTH packets, so the standby master
node function cannot take effect.
Figure 15-1 The running when the uni-ring is in the non-fault state
Figure 15-2 The master node cannot receive the HEALTH packets
Mechanism of Notifying Link Status Change
The mechanism of notifying link status change processes ring network
topology changes faster than the Polling
mechanism. The initiator of the mechanism is the transmission node,
which always monitors its port link status. Once the status changes, the
transmission node sends the packet to inform the master node and then
the master node decides how to respond. If the transmission node finds that
a port is Down, it sends the LINK-DOWN packet, as shown in Figure 15-3. After the master
node receives the LINK-DOWN packet, it turns to the Failed state and
sends the COMM-FLUSH-FDB packets to the transmission nodes on the ring
via two ports.
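The Polling mechanism and the link status notification both drive the master node between its three states. The following sketch models those transitions under stated assumptions: the class and method names are hypothetical, timers are represented by explicit method calls, the node is assumed to start in the Complete state, and FDB refreshing is omitted.

```python
from enum import Enum


class MasterState(Enum):
    COMPLETE = "Complete"
    FAILED = "Failed"
    PRE_UP = "PRE-UP"


class MasterNode:
    def __init__(self):
        self.state = MasterState.COMPLETE  # assumed starting state
        self.assistant_blocked = True
        self.sent = []  # (packet, ports) pairs

    def _fail(self):
        if self.state is not MasterState.FAILED:
            self.state = MasterState.FAILED
            self.assistant_blocked = False  # enable the assistant port
            self.sent.append(("COMM-FLUSH-FDB", "both ports"))

    def on_fail_timer_expired(self):
        # Polling: no HEALTH packet came back within the specified time.
        self._fail()

    def on_link_down_packet(self):
        # Notification: a transmission node reported a Down port.
        self._fail()

    def on_health_received(self):
        if self.state is MasterState.FAILED:
            self.state = MasterState.PRE_UP

    def on_pre_up_timer_expired(self):
        if self.state is MasterState.PRE_UP:
            self.state = MasterState.COMPLETE
            self.assistant_blocked = True
            self.sent.append(("COMP-FLUSH-FDB", "master port"))
```

Both failure paths end in the same Failed state; the LINK-DOWN path simply gets there without waiting for the fail timer.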
Figure 15-3 Transmission node detects that the physical line is down
Sub Ring Protocol Packet Detection Mechanism
The edge node sends the EDGE-HEALTH packets to the assistant edge
node from two directions via the two ports of the associated transmission
node, so as to detect the faults of the master ring link, as shown in Figure
15-4. When the assistant edge node does not receive the EDGE-HEALTH
packet, it indicates that at least two points on the master ring are broken.
The assistant edge node switches to the MAJOR-FAULT state and sends the
MAJOR-FAULT packets to the edge node via its edge port. After the edge
node receives the MAJOR-FAULT packet, its state machine switches to
the MAJOR-FAULT state and blocks the edge port, so as to avoid the loop
during the dual-homing, as shown in Figure 15-5.
When the edge node and assistant edge node receive the COMP-FLUSH-FDB
packets of the sub ring, they turn to the LINK-UP state unconditionally.
When the assistant edge node receives the EDGE-HEALTH packet, its
state machine turns to the LINK-UP state.
The edge node could end up in the wrong state if it receives the
MAJOR-FAULT and COMP-FLUSH-FDB packets out of order. To avoid this,
when the assistant edge node turns to the LINK-UP
state, it sends the MAJOR-RESUME packet to the edge node. After receiving
the packet, the edge node turns to the LINK-UP state.
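The exchange between the edge node and the assistant edge node can be sketched as two cooperating state holders. This is an illustrative model only: the class and method names are hypothetical, the EDGE-HEALTH timeout is an explicit method call, and the sketch assumes that the edge port is unblocked when the edge node returns to the LINK-UP state.

```python
from enum import Enum


class EdgeState(Enum):
    LINK_UP = "LINK-UP"
    MAJOR_FAULT = "MAJOR-FAULT"


class AssistantEdgeNode:
    def __init__(self):
        self.state = EdgeState.LINK_UP
        self.sent = []  # packets sent to the edge node via the edge port

    def on_edge_health_timeout(self):
        if self.state is EdgeState.LINK_UP:
            self.state = EdgeState.MAJOR_FAULT
            self.sent.append("MAJOR-FAULT")

    def _resume(self):
        if self.state is EdgeState.MAJOR_FAULT:
            self.state = EdgeState.LINK_UP
            self.sent.append("MAJOR-RESUME")  # tell the edge node to recover

    def on_edge_health_received(self):
        self._resume()

    def on_comp_flush_fdb(self):
        self._resume()


class EdgeNode:
    def __init__(self):
        self.state = EdgeState.LINK_UP
        self.edge_port_blocked = False

    def on_major_fault(self):
        self.state = EdgeState.MAJOR_FAULT
        self.edge_port_blocked = True  # avoid a loop during dual-homing

    def on_major_resume_or_comp_flush_fdb(self):
        self.state = EdgeState.LINK_UP
        self.edge_port_blocked = False  # assumption: unblock on recovery
```

Because MAJOR-RESUME is always sent after the assistant edge node recovers, the edge node ends in the LINK-UP state even if an earlier MAJOR-FAULT arrived late.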
Figure 15-4 The sub ring detecting the master ring link
Figure 15-5 The sub ring detects the master ring link fault
EIPS Typical Application
Uni-ring Networking Application
Figure 15-6 EIPS uni-ring networking
As shown in Figure 15-6, there is only one ring in the network topology.
Here, you just need to define one EIPS domain and one EIPS ring. The
feature of the networking is that when the topology changes, the response
speed is high and the convergence time is short, which can meet the
application when there is only one ring in the network.
Sub Ring Application
Figure 15-7 Typical network of the EIPS sub ring
As shown in Figure 15-7, there are two or more rings in the network
topology, but there are two public nodes between the rings. Here, you just
need to define one EIPS domain and select one ring as the master ring
and the other as the sub ring. The typical application of the networking is
that the master node of the sub ring can go upstream via two edge nodes
and provide the upstream link backup.
Hierarchical EIPS
Main contents:
Basic concepts and abbreviations of EIPS
Basic network topology of EIPS
Ports and protocol packets on the ring
EIPS protocol mechanism
Basic Concepts and Abbreviations
Basic Concepts
Ethernet Ring: It is a group of Ethernet switch nodes that are
interconnected as a ring.
Master Node (master, M for short): It is the main decision maker and
control node on the ring of one domain. There is only one master node on
one single ring. The two ports of the master node on the ring are the
master port and the assistant port. When the link of the domain controlled
by the master node is complete, the assistant port blocks all data to avoid
the loop. When the link on the ring fails and if the port of the faulty link is
not the assistant port of the master node, enable the forwarding function
of the assistant port.
Transmission node (transit, T for short): It is the node that transmits
data and cooperates with the master node to protect the ring in one
domain. It has two ports in the ring. When finding the link of the port fails,
the transmission node informs the master of the domain, updates the port
address forwarding table according to the received control packet, and
enables the port. In a single ring, all nodes other than the master node
are transmission nodes.
Topology level (level): It is the division of the loops protected by one
EIPS domain. The loop protected in one domain comprises one ring or
several intersecting rings. When there is only one ring in the domain, set
the ring level as major-level ring and the level is 0; when there are
multiple intersecting rings in the domain, choose one ring as the major-
level ring and topology level is 0. The ring connected to the major-level
ring becomes the low-level link connected to the major-level ring after
removing the public link part with the major-level ring. The ring connected
to the low-level link becomes the lower-layer link after removing the public
link with the low-level link. For the low-level links in the topology, the
lower the level is, the higher the level number is. The level number of the
major-level ring is highest (it is 0). Here, the major ring is one complete
ring. The low-level links are the un-complete ring link set after removing
the public links with the access upper layer.
Topology segment (segment): It is the division of different low-level
links on the same level in the domain. There can be multiple low-level
links on the same level in the domain, and the segment is used to
distinguish them: the low-level links of the same level use different
segment numbers. Here, the segment number of the major-level ring is 0.
After dividing the levels and segments in the domain, the ring or
low-level link of each level and segment has one unique combination of
level number and segment number in the whole domain, called the level
segment. A low-level link whose level number and segment number are
defined is called a low-level segment link. The low-level segment link is
the segment link connected to the major-level ring or between the two
edge ports of the upper-level segment link.
Edge-control node (edge-control, E-C for short): It is the main
decision maker and control node of a low-level segment link in the
domain. The edge-control node has one port in the level segment. The
ports that connect a low-level segment link to the upper-level segment
belong to the low-level segment link. If a node of the upper-level
segment controls its edge port to protect the low-level segment link, that
node is called the edge control node; in this role it belongs to the
low-level segment link, not to the upper-level segment it is attached to.
The function of the edge control node is similar to that of a master node
for the low-level link. When the links of the controlled level segment
are complete, the edge port blocks forwarding of the protected service
VLAN, avoiding a closed ring in the domain. When a link of the controlled
level segment fails and the port of the faulty link is not the edge port,
the edge port enables forwarding of the protected service VLAN so that the
protected service VLAN data can pass the edge port. Select one of the two
nodes that connect the low-level segment link to the upper-level segment
as the edge control node, which is responsible for controlling the level
segment link.
Edge assistant node (edge-assistant, E-A for short): It is the node of
the low-level segment link that transmits the data and cooperates with the
decision node to protect the ring in the domain. The edge-assistant node
has one port in the level segment. When the node that connects the
low-level segment link to the upper-level segment is not the edge control
node, it is the edge assistant node; in this role it belongs only to the
low-level segment link, not to the upper-level segment it is attached to.
The edge assistant node is responsible for transmitting the loop status
detection packets sent by the edge control node on the level segment
link. When it finds that the level segment link fails, the edge assistant
node serves as the decision node and sends the link fault notification
packet.
Edge node: It is the intersecting point of two rings. It is associated
with multiple different levels and has at least three ports in one domain.
It plays a compound role: the edge node can have different roles in
different levels. In the low-level segment link it is attached to, it can
be the edge control node or edge assistant node; in the upper-level
segment it is attached to, it can be the master node or a transmission
node.
Control VLAN: To control the EIPS protocol packets to be transmitted
only in the EIPS domain, use one VLAN to control the EIPS protocol
packets. The EIPS control VLAN cannot be configured with the L3 interface.
Abbreviations
ERP: Ethernet Ring Protection
EIPS: Ethernet Intelligent Protection Switching
MAC: Media Access Control
FDB: Forwarding Database
VLAN: Virtual Local Area Network
STP: Spanning Tree Protocol
MSTP: Multiple Spanning Tree Protocol
Basic Network Topology of EIPS
Uni-ring Topology
When the domain includes one single ring, define the single ring as the
major-level ring, with the level defined as 0 and the segment defined as
0, as shown in Figure 15-8.
Figure 15-8 EIPS uni-ring
In Figure 15-8, the nodes T1, T2, T3, and M form the major-level ring
(level 0, seg 0); the node M is the master node; the nodes T1, T2, and T3
are the transmission nodes. When the major-level ring is not faulty, EIPS blocks the services of the second port S.
Intersecting Ring Topology
When the domain includes multiple physical rings that intersect with each
other, decompose it into one hierarchical structure that includes one
major-level ring and several low-level segment links. The level of the
major-level ring is defined as 0 and the segment is defined as 0. Each
low-level segment link is assigned one level number and segment number.
The lower the level is, the higher the level number is.
Figure 15-9 EIPS intersecting rings
In Figure 15-9, choose one of the intersecting rings as the major-level ring
and the other rings degenerate into low-level segment links. The nodes
T1, T2, T3, T4, and M form the major-level ring; the node M is the master
node; the nodes T1, T2, T3 and T4 are the transmission nodes. Divide the
level and segment for other links; (level 1, segment 1) includes the nodes
T1, T2, T3 and T4. Here, the node T2 is the edge control node; the nodes
T1 and T2 are the transmission nodes; the node T3 is the edge assistant
node. When the (level 1, segment 1) link is not faulty, the node T2 blocks
the edge port connected to (level 1, segment 1). The major-level ring is
one single ring and the low-level segment link is one link. The larger the
level number is, the lower the level is.
Node Roles
Master Node:
The major-level ring of one domain has one master node, that is, the
master node of the major-level ring. The master node actively initiates
detection of the major-level ring status and decides which operation to
execute after the major-level ring topology changes.
The master node sends the HEALTH packets periodically from two ports,
which are transmitted via the transmission nodes on the ring. If the
master node can receive the HEALTH packets sent by itself, it indicates
that the major-level ring link is complete; if the two ports cannot receive
the HEALTH packets within the specified time, the master node regards the
ring network link as failed.
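The timeout check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the device implementation: the class name `HealthMonitor`, its fields, and the millisecond clock value passed in by the caller are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HealthMonitor:
    """Tracks whether the master node still receives its own HEALTH packets.

    All names here are hypothetical; fail_timer_ms stands for the
    "specified time" mentioned in the text.
    """
    fail_timer_ms: int
    last_rx_ms: int = 0  # time at which a HEALTH packet last came back

    def on_health_received(self, now_ms: int) -> None:
        # A HEALTH packet sent by the master node returned over the ring.
        self.last_rx_ms = now_ms

    def ring_complete(self, now_ms: int) -> bool:
        # The ring is treated as complete while the last returned HEALTH
        # packet is younger than the fail timer.
        return now_ms - self.last_rx_ms < self.fail_timer_ms


monitor = HealthMonitor(fail_timer_ms=3000)
monitor.on_health_received(now_ms=1000)
```

Calling `monitor.ring_complete(2000)` reports a complete ring, while `monitor.ring_complete(4000)` reports a failure because no HEALTH packet came back within the fail timer.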
The master node has the following four states:
Complete State
The major-level ring is in the stable state and there is no broken link in
the ring. The master node blocks the service forwarding function of the
protect VLAN of the assistant port, so as to prevent the network storm
caused by the loop. Meanwhile, the master node periodically sends the
HEALTH packet, which is transmitted via the transmission nodes when the
loop is normal and returns to the port of the master node.
Failed State
When the link of the major-level ring is disconnected, the master node
enters into the Failed State after receiving the event that the link is
disconnected. If the corresponding port of the faulty link is not assistant
port, the assistant port enables the data forwarding function of the protect
VLAN. Because the topology of the major-level loop changes, the master
node needs to send the COMM-FLUSH-FDB control messages from the
main port and assistant port to inform all other nodes of the level segment
to clear up the address entries of the master node and the protected VLAN.
Init State
When the master node starts initializing, the link status of the current
loop is not known, so the current status is set to Init State until the
actual status of the loop is detected.
PRE-UP State
To prevent repeated flapping of the fault point from switching the loop
status frequently and interrupting the service data, the master node
waits for some time before it moves from the Failed State to the Complete
State. During the waiting time, the master node is in the PRE-UP State.
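The four master node states can be read as a small state machine. The sketch below is one possible reading, not the vendor's implementation: the event names are hypothetical, and the transitions out of Init State (directly to Complete or Failed once the loop status is known) and from PRE-UP back to Failed are assumptions, since the text does not spell them out.

```python
from enum import Enum, auto


class MasterState(Enum):
    INIT = auto()
    COMPLETE = auto()
    FAILED = auto()
    PRE_UP = auto()


class Event(Enum):  # hypothetical event names
    HEALTH_RETURNED = auto()   # own HEALTH packet came back over the ring
    RING_BROKEN = auto()       # LINK-DOWN received or HEALTH timed out
    PRE_UP_TIMEOUT = auto()    # waiting time in PRE-UP elapsed


TRANSITIONS = {
    # Assumption: Init resolves as soon as the real loop status is known.
    (MasterState.INIT, Event.HEALTH_RETURNED): MasterState.COMPLETE,
    (MasterState.INIT, Event.RING_BROKEN): MasterState.FAILED,
    (MasterState.COMPLETE, Event.RING_BROKEN): MasterState.FAILED,
    # Recovery goes through PRE-UP to damp flapping.
    (MasterState.FAILED, Event.HEALTH_RETURNED): MasterState.PRE_UP,
    (MasterState.PRE_UP, Event.PRE_UP_TIMEOUT): MasterState.COMPLETE,
    # Assumption: a new fault during the wait returns to Failed.
    (MasterState.PRE_UP, Event.RING_BROKEN): MasterState.FAILED,
}


def next_state(state: MasterState, event: Event) -> MasterState:
    """Return the next state; events with no listed transition are ignored."""
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in one table makes it easy to see that the only path from Failed to Complete passes through PRE-UP.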
Transmission Node:
The transmission node is responsible for monitoring the status of the link
on the direct-connected loop. When the link fails, it sends the LINK-DOWN
packet to inform the control node of the level segment, and the control
node then decides how to handle the fault. When the COMP-FLUSH-FDB and
COMM-FLUSH-FDB packets of the control node are received, the transmission
node updates the FDB table related to the protection service VLAN.
The transmission node has the following four states:
Complete State:
When receiving the COMP-FLUSH-FDB packet of the level segment, enter
the Complete State.
Failed State:
When receiving the COMM-FLUSH-FDB packet of the level segment, enter
the Failed State.
Init State:
When the transmission node starts initializing, the link status of the
current loop is not known, so it sets the current status to Init State and
sends the ASK packet to query the control node of the level segment.
Pre-forwarding:
The status appears at the moment when the link recovers. When in the
state, the original Down port becomes up. The EIPS control VLAN is
enabled and can forward the EIPS protocol packets, but the service VLAN
is still blocked. After the loop enters the Complete state and the
transmission node receives the COMP-FLUSH-FDB packet of the control
node, enable the forwarding function of the service VLAN and turn to the
Complete state. If the transmission node does not receive the COMP-
FLUSH-FDB packet within the specified time, automatically turn to the
Complete state.
Edge Control Node:
The edge control node is the control node that has only one port on the
low-level segment link. There is no master node in the level segment link.
The edge control node periodically sends the HEALTH packet to the level
segment link from the access port. When the link is complete, the returned
HEALTH packet can be received. The edge control node is similar to the
master node and has the following four states:
Complete State
The level segment link is in the stable state and there is no broken link.
The edge control node blocks the service forwarding function of the protect
VLAN of the access port, so as to prevent the network storm caused by the
loop. Meanwhile, the access port periodically sends the HEALTH packets,
which are transmitted via the nodes of the low-level segment link when
the loop is normal and return to the access port of the edge control node.
Failed State
When the access port of the edge control node does not receive the
returned HEALTH packets within the specified time or receives the event
that the link is disconnected on the level segment link, the node enters the
Failed State. If the corresponding port of the faulty link is not the access
port of the edge control node, it enables the data forwarding function of
the protection service VLAN of the access port. Because the topology of the
level segment link changes, the edge control node needs to send the
COMM-FLUSH-FDB control message to inform the other nodes on the level
segment link and the related nodes of the upper level to clear up the FDB
table of the node and the protected VLAN.
Init State
When the edge control node starts initializing, the link status of the
current level segment link is not known, so the current status is set to
Init State until the actual status of the loop is detected.
PRE-UP State
To prevent repeated flapping of the fault point from switching the loop
status frequently and interrupting the service data, the edge control node
waits for some time before it moves from the Failed State to the Complete
State. During the waiting time, the edge control node is in the PRE-UP
State.
Edge Assistant Node:
The edge assistant node is the non-control node that has only one port on
the low-level segment link. When receiving the HEALTH packet sent by the
control node of the level segment link, return it to the control node from
the receiving port and cooperate with the control node to detect the level
segment link status. If the edge assistant node does not receive the
HEALTH packet within the specified time, it is regarded that the link
between the edge assistant node and the control node fails. When the
edge assistant node receives the LINK-DOWN packet of the level segment
link, it is also regarded that the link between the edge assistant node and
the control node fails. The edge assistant node is responsible for
monitoring the status of the link on the direct-connected loop. When the
link fails, it sends the LINK-DOWN packet to inform the control node of the
level segment. When the edge assistant node finds that the link between
itself and the control node of the level segment link fails, it serves as
the temporary control node and sends the COMM-FLUSH-FDB packet to inform
the other nodes on the level and the upper-level nodes to update the FDB
table related to the protection service VLAN.
Port and Protocol Packets on Ring
Main Port and Assistant Port
The master node and transmission node each connect to the ring link
through two ports. One is the main port and the other is the assistant
port. The port role depends on the user configuration. The main port and
assistant port of the master node differ in function, while the main port
and assistant port of the transmission node do not differ in function.
The master node of the main ring sends the HEALTH packets from two
ports. If at least one port can receive the packet from the other port, it
indicates that the main ring is complete, so the master node blocks the
data forwarding function of the protection service VLAN of the assistant
port. Conversely, if the HEALTH packet is not received within the specified
time or the LINK-DOWN packet of the main ring is received, it indicates
that the major-level ring fails. If the corresponding port of the faulty
link is not the assistant port, the master node enables the protection
service VLAN forwarding function of the assistant port, so as to ensure the
normal communication of all nodes on the ring. Besides, the master node of
the main ring receives the address update packets from the low-level
segment links, but does not forward them.
The main port and assistant port of the transmission node do not differ
in function. Their roles also depend on the user configuration.
Edge Port
The edge node has only one port connected to one level segment link and
the port is the edge port. When the address refresh message COMP-FLUSH-FDB
or COMM-FLUSH-FDB is received from the edge port and the upper level has
not yet been notified of the status change of the level segment link that
sent the control message, the edge node sends the packet to the upper
level and updates the FDB table of the port related to the protection
service VLAN.
Data Forwarding Function of Port
The data forwarding function of the node port (including main port,
assistant port and access port) has the following two states:
Block: block port, prohibit the data from being forwarded via the port;
Forward: enable port, permit the data to be forwarded via the port;
For example, when the link on the main ring is normal, the master node of
the main ring blocks the assistant port so that the data in the protection
service VLAN cannot pass the assistant port of the master node, avoiding
the loop. When the link on the main ring fails and the corresponding port
of the faulty link is not the assistant port of the master node, the master
node enables the assistant port and permits the data in the protection
service VLAN to pass the assistant port and recover the communication of
service data.
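The example above amounts to a small decision rule for the assistant port of the master node. The sketch below is illustrative only; the function and parameter names are hypothetical. The case where the faulty port is the assistant port itself follows the later statement in the loop fault description that the assistant port then stays blocked.

```python
from enum import Enum


class PortState(Enum):
    BLOCK = "block"      # data of the protection service VLAN may not pass
    FORWARD = "forward"  # data of the protection service VLAN may pass


def assistant_port_state(ring_complete: bool,
                         assistant_port_faulty: bool = False) -> PortState:
    """Forwarding state of the master node's assistant port (hypothetical helper)."""
    if ring_complete:
        # Normal ring: block the assistant port to avoid a loop.
        return PortState.BLOCK
    if assistant_port_faulty:
        # The fault is on the assistant port itself: it stays blocked.
        return PortState.BLOCK
    # Fault elsewhere on the ring: open the assistant port to restore traffic.
    return PortState.FORWARD
```

Note that this covers only the protection service VLAN; the control VLAN is not blocked, so EIPS protocol packets still pass a blocked assistant port.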
Format of EIPS Protocol Packet
The format of the Ethernet ring protection protocol packet is as follows:
Table 15-4
Bit offsets: 0-15 | 16-31 | 32-47 (each row is 48 bits wide)
Destination MAC address (6 bytes)
Source MAC address (6 bytes)
Type (Ether Type) (TPID) | PRI + CFI + VLAN ID | Packet length (Frame Length)
DSAP/SSAP | CONTROL | OUI = 0x00E02B
0x00BB | 0x99 | 0x0B | ERP_LENGTH
ERP_VER | ERP_TYPE | CTRL_VLAN_ID | LEVEL_ID | SEG_ID
0x0000 | SYSTEM_MAC_ADDR (high 4 bytes)
SYSTEM_MAC_ADDR (low 2 bytes) | HEALTH_TIMER | FAIL_TIMER
STATE | 0x00 | HEALTH_SEQ | 0x0000
RESERVED (0x000000000000)
RESERVED (0x000000000000)
RESERVED (0x000000000000)
RESERVED (0x000000000000)
RESERVED (0x000000000000)
RESERVED (0x000000000000)
The packet format is described as follows:
Destination MAC address: 48 bits, described as follows:
Destination MAC | Description
0180.c200.0035 | The destination MAC of the HEALTH, LINK-DOWN or ASK-RING-STATE packet; the transmission node sends the packet from another port to other nodes and does not send it to the CPU of the transmission node for processing.
00E0.2B00.0004 | The destination MAC of the COMM-FLUSH-FDB/COMP-FLUSH-FDB packet; the transmission node sends the protocol packet to the CPU for processing and sends it from another port to other nodes.
0001.7A4F.4AB6 | The topology request packet
0001.7A4F.4AB4 | Uni-directional detection packet
0001.7A4F.4AB5 | The HELLO1 packet sent when the standby master node does not receive the HELLO packet within the specified time
Source MAC address: 48 bits, the MAC address of the sending node;
TPID: 16 bits, fixed as 0x8100;
PRI+CFI: 4 bits, not fixed; the priority can be configured (7 is
recommended by default), and CFI is 0 for the standard format frame;
VLAN ID: 12 bits, not defined;
Frame Length: 16 bits, the length of the Ethernet frame, fixed as 0x48;
DSAP/SSAP: 16 bits, fixed as 0xAAAA;
CONTROL: 8 bits, fixed as 0x03;
OUI: 24 bits, fixed as 0x00E02B;
ERP_LENGTH: 16 bits, fixed as 0x40;
ERP_VER: 16 bits, fixed as 0x0001;
ERP_TYPE: 16 bits, the packet type;
CTRL_VLAN_ID: 16 bits, the ID of the control VLAN;
LEVEL_ID: 8 bits, the level number of the segment link; the major-level
ring is 0 and a low-level link is larger than 0;
SEG_ID: 8 bits, the ID of the segment link; the major-level ring is 0;
SYSTEM_MAC_ADDR: 48 bits, the MAC address of the sending node;
HEALTH_TIMER: 16 bits, the period of sending the HEALTH frames set
by the master node and edge control node (the unit is ms);
FAIL_TIMER: 16 bits, the timeout of not receiving the HEALTH frames
set by the master node and edge control node (the unit is ms);
STATE: 8 bits, the node status;
HEALTH_SEQ: 16 bits, the serial number of the HEALTH frame,
generated by the master node;
ERP_TYPE: the packet type, defined as follows:
HEALTH=5, the link health detection HEALTH packet; the
destination MAC address of the packet is 0x0180c2000035; the
protocol packet does not need to be transmitted to the CPU of the
transmission node;
COMP-FLUSH-FDB=6, the COMP-FLUSH-FDB packet of informing
that the link is complete; the destination MAC address of the
packet is 0x00E02B000004; the protocol packet needs to be
transmitted to the CPU of the transmission node;
COMM-FLUSH-FDB=7, the COMM-FLUSH-FDB packet of informing
that the link fails; the destination MAC address of the packet is
0x00E02B000004; the protocol packet needs to be transmitted to
the CPU of the transmission node;
LINK-DOWN=8, the link fault alarm LINK-DOWN packet; the
destination MAC address of the packet is 0x0180c2000035; the
protocol packet does not need to be transmitted to the CPU of the
transmission node;
ASK-RING-STATE=9, the link status query ASK packet, with which the
transmission node and edge assistant node ask the master node for
the current loop status during initialization; the
destination MAC address of the packet is 0x0180c2000035; the
protocol packet does not need to be transmitted to the CPU of the
transmission node;
LINK-HELLO=14, the uni-directional detection packet;
TOPOLOGY=15, the topology collection packet, including the topology
request and topology response packets;
Other values are reserved;
The definition of the STATE value:
IDLE=0
COMPLETE=1
FAILED=2
LINK-UP=3
LINK-DOWN=4
PRE-FORWARDING=5
The other values are reserved.
EIPS Protocol Mechanism
Uni-ring Running Mechanism
The uni-ring is one major-level ring. The nodes on the major-level ring
detect and protect the links of the major-level ring, ensuring that the data
communication of the protection service VLAN between any two nodes on the
major-level ring has at most one connected logical path and that the
Ethernet control packets of the major level can only be transmitted in the
major-level ring.
Non-fault Status
When the links and nodes on the uni-ring have no fault, the master node
periodically sends the HEALTH packets from the main port, which are
transmitted via the transmission nodes and links on the ring to reach the
assistant port of the master node. The master node blocks the protect
VLAN forwarding function of the assistant port so that the data in the
protect VLAN cannot be transmitted via the assistant port of the master
node, avoiding the loop. The control VLAN is not blocked, so the EIPS
protocol packets can pass the blocked assistant port of the master node.
As shown in Figure 15-10, the master node M periodically sends the
HEALTH packets; because the loop is not faulty, the HEALTH packet
reaches the assistant port of the master node; the master node blocks the
data forwarding function of the protect VLAN of the assistant port,
avoiding the loop.
Figure 15-10 The non-fault status of the uni-ring
Loop Fault Status
When a link on the ring fails, the neighbor nodes of the faulty link block
the data forwarding function of the corresponding port after detecting the
fault. To prevent loop protocol packets from crossing a link that has
failed in one direction only, protocol packets cannot pass the
corresponding port of the faulty link either. If the node that detects the
link fault is a transmission node, it sends the LINK-DOWN packet from its
other, non-faulty port. After the master node receives the LINK-DOWN
packet, it regards the ring as failed, as shown in Figure 15-11. In case
the LINK-DOWN packet is lost, the master node has a standby detection
mechanism: when the master node does not receive the HEALTH packet within
the specified time, it also regards the loop as failed. After the master
node detects that the link fails, it enables the data forwarding function
of the assistant port at once.
Figure 15-11 Transmission node detects the link fault
If the master node itself fails, the processing is different. If the main port
fails, block the main port and enable the data forwarding function of the
assistant port; if the assistant port fails, the assistant port is still blocked.
Fault Recovery
After the link fault on the ring disappears, the neighbor nodes of the
faulty link detect that the link fault of the port has disappeared. They
set the port of the recovered link to the status of forwarding the ring
network control packets so that the port can forward the EIPS protocol
packets. The port status is set to Pre-Forwarding, in which the port still
cannot forward the packets of the protect VLAN.
While the link is faulty, the master node keeps sending the HEALTH packet
periodically from the main port. After the link fault disappears, the
master node regards the link as recovered when the assistant port receives
the HEALTH packet. To prevent link status flapping, the master node turns
to the PRE-UP state, enables the PRE-UP timer, and enables the data VLAN.
After the PRE-UP timer times out, the master node turns to the COMPLETE
state, re-blocks the data forwarding function of the protect VLAN of the
assistant port, and sends the COMP-FLUSH-FDB packet from the main port.
Meanwhile, the master node updates the FDB address table of the port.
After a transmission node on the ring receives the COMP-FLUSH-FDB packet,
it updates the FDB table of the port; the two neighboring ports of the
faulty link are set to the Forward state, which enables the protect VLAN
data forwarding function of the port.
To guard against loss of the COMP-FLUSH-FDB packet, when the neighboring
node of the recovered link does not receive the COMP-FLUSH-FDB packet
within the specified time, it sets the Pre-Forwarding port to Forward and
enables the protect VLAN data forwarding function of the port so that the
data of the protect VLAN is transmitted according to the topology. To
prevent a transmission node that receives two COMP-FLUSH-FDB packets from
updating the port FDB table repeatedly, the transmission node records the
current loop status as Complete State when it receives the COMP-FLUSH-FDB
packet. If the recorded loop status is already Complete State, the
transmission node does not process a further COMP-FLUSH-FDB packet. To
keep the status of all transmission nodes on the ring consistent, the
master node periodically sends the COMP-FLUSH-FDB packets.
As shown in Figure 15-12, after the link fault between the nodes T2 and T3
recovers, the nodes T2 and T3 detect that the link fault of the port has
disappeared and set the port of the recovered link to the status of
permitting forwarding of the ring network control packets so that the port
can forward the Ethernet ring network protect control packets. The port
status is set to Pre-Forwarding, in which the port still cannot forward
the packets of the protect VLAN. If the HEALTH packets sent by the master
node from the main port can pass the recovered link to reach the assistant
port, the master node regards the loop as recovered, turns to the PRE-UP
state, and enables the PRE-UP timer. After the PRE-UP timer times out, the
master node turns to the COMPLETE state. As shown in Figure 15-13, the
master node blocks the protect VLAN data forwarding function of the
assistant port and sends the COMP-FLUSH-FDB packet to inform other nodes of
the loop recovery and to update the FDB table of the port. After the other
nodes on the ring receive the COMP-FLUSH-FDB packet, they update the FDB
table of the port, and the neighboring nodes of the recovered link enable
the Pre-Forwarding port so that the data of the protect VLAN can pass and
the loop completes the fault protect switchover.
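The recovery handling on a transmission node, including the suppression of repeated COMP-FLUSH-FDB processing, can be sketched as below. This is a simplified model with hypothetical names (`TransmissionNode`, `fdb_flushes`, and the method names); it counts FDB flushes instead of touching a real forwarding table.

```python
class TransmissionNode:
    """Simplified model of a transmission node during fault recovery."""

    def __init__(self):
        self.loop_complete = False        # recorded loop status
        self.fdb_flushes = 0              # how often the FDB table was updated
        self.pre_forwarding_ports = set()

    def on_link_up(self, port):
        # Recovered port: control packets pass, protect VLAN data stays blocked.
        self.pre_forwarding_ports.add(port)

    def on_comp_flush_fdb(self):
        """Return True if the packet was processed, False if ignored."""
        if self.loop_complete:
            return False                  # already Complete: skip the flush
        self.loop_complete = True
        self.fdb_flushes += 1
        self.pre_forwarding_ports.clear()  # Pre-Forwarding ports go to Forward
        return True

    def on_comm_flush_fdb(self):
        # Loop failed: record it and update the FDB table.
        self.loop_complete = False
        self.fdb_flushes += 1

    def on_pre_forwarding_timeout(self, port):
        # No COMP-FLUSH-FDB within the specified time: open the port anyway.
        self.pre_forwarding_ports.discard(port)


node = TransmissionNode()
node.on_link_up("port-1")
first = node.on_comp_flush_fdb()
second = node.on_comp_flush_fdb()
```

Here `first` is processed and `second` is ignored, so the periodic COMP-FLUSH-FDB packets from the master node cost only one FDB update per recovery.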
Figure 15-12 Fault recovering
Figure 15-13 Fault recovery is complete
Running Mechanism of Intersecting Rings
After the intersecting rings are divided into the major-level ring and
low-level segment links, the major-level ring is one single ring and is
protected according to the uni-ring protect running mechanism. The nodes on
a low-level segment link detect the low-level segment link, ensuring that
the data communication of the protect service VLAN between any two nodes on
the low-level ring has at most one connected logical path, and that the
HEALTH and LINK-DOWN packets of the low-level segment link can only be
transmitted in the low-level segment link. When the loop of the low-level
segment link switches, the edge node sends the COMP-FLUSH-FDB and
COMM-FLUSH-FDB packets to the high-level node, informing the high-level
node to update the FDB table of the port.
Non-fault Status of Low-level Link
The edge control node periodically sends the HEALTH packets from the
edge port, which are transmitted via the transmission nodes and links on
the low-level segment link to reach the edge assistant node. After the
edge assistant node receives the HEALTH packet, it checks that the level
and segment of the HEALTH packet are the local level segment and returns
the packet from the receiving port. The edge port of the edge control node
can receive the HEALTH packet returned by the edge assistant node. The edge
control node blocks the protect VLAN forwarding function of the edge port
so that the data in the protect VLAN cannot pass the edge port of the edge
control node, preventing the loop, but the Ethernet loop protect protocol
packets can pass the blocked edge port of the edge control node.
Low-level Link Fault
If the edge control node does not receive the HEALTH packets within the
specified time, it regards the link as failed. All the nodes detect the
link status on the ring. When a node detects that the link status of its
own port is faulty, it sends the LINK-DOWN packet to the edge control node
and edge assistant node of the level segment link so that the edge control
node and edge assistant node know that the link fails. To avoid the loop
when the link recovers, the two neighboring nodes of the faulty link
block the data forwarding function of the protect service VLAN of the
faulty port and prevent the EIPS protocol packets from being forwarded
via the faulty port.
After the edge control node and edge assistant node detect the fault
status of the level segment link, the edge control node sends the
COMM-FLUSH-FDB packets from the edge port and the two ports of the accessed
level. If the faulty port is not the edge port of the edge control node,
the edge control node enables the data forwarding function of the protect
service VLAN on its edge port. When the edge assistant node detects that
the local level segment link fails, it sends the COMM-FLUSH-FDB packets
from the edge port and the two ports of the accessed level.
When the transmission node receives the COMM-FLUSH-FDB packet and the
level of the node is higher than or equal to the level of the sending
source, it refreshes the port FDB table. When the edge node receives the
COMM-FLUSH-FDB packet from the edge port, the level of the edge access
port is higher than or equal to the level of the sending source, and the
upper-level node does not know the link status change of the sending
source level, it forwards the COMM-FLUSH-FDB packet to the upper level and
updates the FDB table of the port related to the protect service VLAN.
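Because a higher level has a smaller level number, the comparisons in these rules are easy to get backwards. The sketch below encodes them explicitly; the function names and the `upper_level_already_notified` flag are hypothetical, and the same checks apply to the COMP-FLUSH-FDB packet during recovery.

```python
def is_higher_or_equal(level_a: int, level_b: int) -> bool:
    """True when level_a ranks at least as high as level_b.

    Level 0 (the major-level ring) is the highest level, so a smaller
    level number means a higher level.
    """
    return level_a <= level_b


def transmission_node_should_flush(node_level: int, source_level: int) -> bool:
    # Refresh the port FDB table only for flushes from the same or a lower level.
    return is_higher_or_equal(node_level, source_level)


def edge_node_should_forward_up(edge_port_level: int, source_level: int,
                                upper_level_already_notified: bool) -> bool:
    # Forward the flush packet to the upper level only once per status change.
    return (is_higher_or_equal(edge_port_level, source_level)
            and not upper_level_already_notified)
```

For example, a node on the major-level ring (level 0) refreshes its FDB table for a flush that originates on a level 1 segment link, while a node on a level 2 link ignores it.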
Low-level Segment Link Recovery
While the link is faulty, the edge control node periodically sends HEALTH
packets from the edge port. When it receives the HEALTH packet returned
by the edge assistant node, it regards the low-level link between the edge
ports as recovered. It then blocks the data forwarding function of the
protect service VLAN of the edge port, sends the COMP-FLUSH-FDB packet from
the edge port and the two ports of the accessed level, and updates the FDB
table of the port related to the protect service VLAN. When the
transmission node receives the COMP-FLUSH-FDB packet and the level of the
node is higher than or equal to the level of the sending source, it
refreshes the port FDB table.
When the edge node receives the COMP-FLUSH-FDB packet from the edge
access port, the level of the edge access port is higher than or equal to
the level of the sending source, and the upper-level node does not know
the link status change of the sending source level, it forwards the
COMP-FLUSH-FDB packet to the upper level and updates the FDB table of the
port related to the protect service VLAN. After the two neighboring ports
of the faulty link detect that the link recovers, the EIPS protocol packets
are forwarded via the recovered port and the port status is set to
Pre-Forwarding. If the COMP-FLUSH-FDB packet of the local level segment
is received, the node enables the data forwarding function of the protect
service VLAN on the port; if the COMP-FLUSH-FDB packet is not received
within the specified time, the port times out automatically and is enabled.
Extended Functions
Ethernet intelligent protection switching is the basic and main function
of EIPS. The following describes several extended functions.
Main contents:
Payload balance function
Topology auto collection function
The networking mode of not sending HELLO command
Uni-directional detection function
Reliability realization
Payload Balance Function
The basic function of EIPS is to prevent network loops by blocking a
port. In this way, all user data has only one link to choose, regardless of
the networking, so a traffic bottleneck forms easily. Making the block
granularity as fine as an instance on the port solves the problem
effectively. The EIPS payload balance function is based on this method, so
the control granularity of each EIPS ring needs to be as fine as the
instance.
The EIPS node is based on one or multiple spanning tree instances.
Perform the protection and switch on the data of the instances. One
physical ring can be configured with multiple EIPS rings and different rings
block different ports, so as to realize the payload balance, as shown in the
following figure.
Figure 15-14 EIPS payload balance
The four switches M1, M2, M3, and M4 are interconnected with each other,
forming one physical ring. Configure four EIPS rings on the physical ring:
the master node of R1 is M1 and the protect instance is inst 1; the master
node of R2 is M2 and the protect instance is inst 2; the master node of R3
is M3 and the protect instance is inst 3; the master node of R4 is M4 and
the protect instance is inst 4. When the physical ring is complete, the
EIPS rings R1, R2, R3, and R4 are all complete. M1, the master node of R1,
blocks the data of inst 1 at its assistant port S; M2, the master node of
R2, blocks the data of inst 2 at its assistant port S; M3, the master node
of R3, blocks the data of inst 3 at its assistant port S; M4, the master
node of R4, blocks the data of inst 4 at its assistant port S. The data
traffic of each instance can therefore pass over a different link, which
realizes the payload balance.
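The per-instance blocking in the example above can be sketched as a small lookup table. This is only an illustrative model built from the ring names, master nodes and instance numbers of the example; `EipsRing` and `blocked_port_for` are hypothetical names, not part of the switch software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EipsRing:
    """One logical EIPS ring configured on the shared physical ring."""
    name: str      # ring name, e.g. "R1"
    master: str    # master node of the ring, e.g. "M1"
    instance: int  # spanning tree instance protected by the ring


# The four logical rings of the example; each one has a different master node.
RINGS = [
    EipsRing("R1", "M1", 1),
    EipsRing("R2", "M2", 2),
    EipsRing("R3", "M3", 3),
    EipsRing("R4", "M4", 4),
]


def blocked_port_for(instance: int) -> tuple:
    """Return (node, port) that blocks the instance while the ring is complete."""
    for ring in RINGS:
        if ring.instance == instance:
            # The master node blocks the instance on its assistant port S.
            return (ring.master, "S")
    raise KeyError(f"instance {instance} is not protected by any ring")


if __name__ == "__main__":
    for inst in (1, 2, 3, 4):
        node, port = blocked_port_for(inst)
        print(f"inst {inst}: blocked on {node}, port {port}")
```

Because every instance is blocked on a different node, no single link carries the traffic of all four instances.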
Topology Auto Collection Function
To manage and maintain the network nodes on the ring, EIPS provides the L2
topology auto collection function. Any node on which EIPS is enabled can
see the other nodes in the ring and can describe the topology structure.
Basic Theory
Each node on the ring collects the topology separately. When EIPS is
enabled on a node, the ports of the node actively send the multicast
topology request packet. When another node on the same logical ring
receives the packet, it adds one to the TTL value, and the receiving port
returns a unicast topology response packet to the requester. The response
packet contains the basic information of the node, including the node type,
the node status, information about its ports and so on. Meanwhile, the
master node and the transmission nodes continue to forward the topology
request from their other port. Each node needs to reply after receiving a
topology request sent by another node. When the requesting node receives a
topology response packet, it saves the information and determines the
location of the responding node according to the TTL value in the packet.
After all nodes respond, the whole topology structure can be described
completely.
The topology collection reflects the topology status of the current ring,
that is, whether it is one complete ring topology structure. The main ring
and sub ring cannot see the topology structure of each other; they can only
see whether there is an edge node on a transmission node.
The edge node and the assistant edge node have only one port on the ring,
so the topology they see is the topology collected by that port. The master
node and the transmission node have two ports: when the topology is
complete, the topologies collected by the two ports are complete and
consistent; when the topology is incomplete, for example because one link
is disconnected, each port collects only a part of the topology and the
collected parts need to be combined to form one complete topology. The
topology seen by the user on the node is the complete topology obtained by
combining the topologies collected by the two ports.
The realtime requirement is not high. Each node sends one topology request
every 10 seconds, so a topology change is not reflected at once; it is
discovered by the next topology collection, up to 10 seconds later. Each
collection updates the previous topology according to the new response
packets. If a node is not updated within 10 seconds, it is regarded as no
longer being in the topology.
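Combining the topologies collected by the two ports can be sketched as follows. The manual does not give the merge algorithm, so this is only one plausible reading: each port view is assumed to be a mapping from hop count (the TTL in the response) to node name, and `merge_views` is a hypothetical helper.

```python
def merge_views(local, view_a, view_b):
    """Combine the per-port topology views of a two-port node.

    view_a / view_b map hop count (TTL of the response) -> node name,
    as collected on each of the two ring ports.
    Returns (ordered node list, ring_is_complete).
    """
    side_a = [view_a[hop] for hop in sorted(view_a)]  # nearest node first
    side_b = [view_b[hop] for hop in sorted(view_b)]

    if side_a and set(side_a) == set(side_b):
        # Complete ring: both ports see every other node, so the two views
        # are consistent and one of them is enough.
        return [local] + side_a, True

    # Incomplete ring: each port sees only its own side of the break.
    # Join the two sides into one chain with the local node in between.
    chain = list(reversed(side_b)) + [local]
    chain += [node for node in side_a if node not in side_b]
    return chain, False


if __name__ == "__main__":
    # M1 on a complete ring M1-M2-M3-M4-M1
    print(merge_views("M1", {1: "M2", 2: "M3", 3: "M4"},
                      {1: "M4", 2: "M3", 3: "M2"}))
    # Same ring with the link between M2 and M3 disconnected
    print(merge_views("M1", {1: "M2"}, {1: "M4", 2: "M3"}))
```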
Topology Request Packet
The topology request packet is as follows:
Figure 15-15 The structure of the topology request packet
The request packet is formed by standard EIPS packet + topology
information head. In the standard EIPS packet field, the destination MAC
address of the Ethernet head field is 0001.7A4F.4AB6. Any node that
receives a packet with this destination address must send it to the CPU.
ERP_TYPE in the standard EIPS packet is TOPOLOGY(15).
The meanings of the fields in the topology information head are as follows:
type: one byte; 1 indicates the topology request; 2 indicates the
topology response;
ttl: one byte; indicating the location of the node relative to the request
node; fill 0 in the topology request packet; add one after passing one
node;
baseMac: 6 bytes, indicating the MAC address of the device; for the
topology request packet, it is the device MAC address of the request
node; it should be null in the topology response packet;
DMAC: 6 bytes, indicating the MAC address of the destination port; in
the topology request packet, it is all 1s; in the topology response
packet, it is the MAC address of the request port;
SMAC: 6 bytes, indicating the MAC address of the source port; in the
topology request packet, it is the MAC address of the request port;
in the topology response packet, it is the MAC address of the response
port;
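The topology information head described above can be sketched with Python's `struct` module. The field sizes come from the list; the field order on the wire, network byte order, and an all-zero baseMac for "null" are assumptions, because the figure is not reproduced here. The function names are hypothetical.

```python
import struct

TOPO_REQUEST = 1
TOPO_RESPONSE = 2
ALL_ONES_MAC = b"\xff" * 6   # DMAC of a topology request ("all 1")
NULL_MAC = bytes(6)          # assumed encoding of the "null" baseMac

# type(1) ttl(1) baseMac(6) DMAC(6) SMAC(6); order assumed from the field list
_HEAD = struct.Struct("!BB6s6s6s")


def build_request_head(base_mac: bytes, port_mac: bytes) -> bytes:
    """Head of a topology request as sent by the request port (ttl = 0)."""
    return _HEAD.pack(TOPO_REQUEST, 0, base_mac, ALL_ONES_MAC, port_mac)


def pass_one_node(head: bytes) -> bytes:
    """Head after passing one node: the TTL is incremented by one."""
    typ, ttl, base_mac, dmac, smac = _HEAD.unpack(head)
    return _HEAD.pack(typ, ttl + 1, base_mac, dmac, smac)


def build_response_head(request_head: bytes, responder_port_mac: bytes) -> bytes:
    """Head of the response: DMAC is taken from the SMAC of the request."""
    _typ, ttl, _base, _dmac, request_smac = _HEAD.unpack(request_head)
    return _HEAD.pack(TOPO_RESPONSE, ttl, NULL_MAC, request_smac,
                      responder_port_mac)


if __name__ == "__main__":
    req = build_request_head(bytes.fromhex("00017a000001"),
                             bytes.fromhex("00017a000002"))
    seen = pass_one_node(req)
    rsp = build_response_head(seen, bytes.fromhex("00017a0000aa"))
    print(len(req), rsp.hex())
```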
Topology Response Packet
The topology response packet is as follows:
Figure 15-16 The structure of the topology response packet
The topology response packet is formed by standard EIPS packet +
topology information head + node information.
In the standard EIPS packet field, the destination MAC address is the MAC
address of the initiating port of the topology request initiator. This MAC
address is obtained from the SMAC field in the information head of the
received topology request packet. ERP_TYPE in the standard EIPS packet is
TOPOLOGY(15). In the topology information head, type is 2; ttl is the
number of hops from the initiator to the responder; DMAC is the destination
MAC address, that is, the MAC address of the initiating port of the
initiator, taken from the SMAC field of the information head in the
topology request packet; SMAC is the MAC address of the sending port.
The meanings of the fields in the node information are as follows:
hop: one byte, indicating the hops from the responder to initiator,
equal to the TTL value in the packet;
nt: four bits, short for node type, indicating the type of the response
node;
ns: three bits, short for node status, indicating the current status of
the response node;
b: one bit, short for border, indicating whether there is the edge node
connection; 0 means no; 1 means yes;
bm: four bits, short for backup master, indicating whether it is the
backup master node; 0 means no; 1 means yes;
ar: four bits, short for actor role, only valid for the backup master
node; 0 means that the backup master node is not serving as the master
node; 1 means that the backup master node serves as the master
node;
host name: 32 bytes, the host name of the response node;
base mac: 6 bytes, the device MAC address of the response node;
sys oid: 16 bytes, the system OID of the response node;
r_role: one byte, indicating the port role of the port that receives the
request packet;
r_b: four bits, short for r_blockstatus, indicating the BLOCK status of
the port that receives the request packet on the ring of the node; 0
means non-BLOCK; 1 means BLOCK;
r_l: four bits, short for r_linkstatus, indicating the LINK status of the
port that receives the request packet; 1 means UP; 2 means DOWN;
r_i: two bytes, short for r_index, indicating the number of the port
that receives the request packet;
r_n: 16 bytes, short for r_name, indicating the name of the port that
receives the request packet. To save the memory space, intercept a
part of the port name. If it is the common port, omit “port”. For
example, save as “0/0/1” or “0/1”; if it is the aggregation port, omit
“linkaggregation”. For example, save the aggregation port 1 as “1”
and aggregation port 2 as “2”;
r_mac: 6 bytes, indicating the MAC address of the port that receives
the request packet;
s_role: one byte, indicating the role of the port that forwards the
request packet;
s_b: four bits, short for s_blockstatus, indicating the BLOCK status of
the port that forwards the request packet on the ring of the node; 0
means non-BLOCK; 1 means BLOCK;
s_l: four bits, short for s_linkstatus, indicating the LINK status of the
port that forwards the request packet; 1 means UP; 2 means DOWN;
s_i: two bytes, short for s_index, indicating the number of the port
that forwards the request packet;
s_n: 16 bytes, short for s_name, indicating the name of the port that
forwards the request packet. To save the memory space, intercept a
part of the port name. If it is the common port, omit “port”. For
example, save as “0/0/1” or “0/1”; if it is the aggregation port, omit
“linkaggregation”. For example, save the aggregation port 1 as “1”
and aggregation port 2 as “2”;
s_mac: 6 bytes, indicating the MAC address of the port that forwards
the request packet;
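The sub-byte fields in the node information can be decoded with shifts and masks. The sketch below assumes that fields listed first occupy the most significant bits (nt, ns and b share one byte; bm and ar share the next; a block status and a link status share one byte). That bit order is an assumption, and the function names are hypothetical.

```python
def decode_node_flags(byte0: int, byte1: int) -> dict:
    """Decode nt(4) ns(3) b(1) from byte0 and bm(4) ar(4) from byte1."""
    return {
        "nt": (byte0 >> 4) & 0x0F,  # node type
        "ns": (byte0 >> 1) & 0x07,  # node status
        "b": byte0 & 0x01,          # 1 = an edge node is connected
        "bm": (byte1 >> 4) & 0x0F,  # 1 = backup master node
        "ar": byte1 & 0x0F,         # 1 = backup master acts as master
    }


def decode_port_status(byte: int) -> dict:
    """Decode a blockstatus(4) + linkstatus(4) byte (r_b/r_l or s_b/s_l)."""
    block = (byte >> 4) & 0x0F      # 0 = non-BLOCK, 1 = BLOCK
    link = byte & 0x0F              # 1 = UP, 2 = DOWN
    return {"blocked": block == 1, "link_up": link == 1}


if __name__ == "__main__":
    print(decode_node_flags(0x27, 0x11))
    print(decode_port_status(0x12))
```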
Networking Mode of Not Sending HELLO
The master node supports the mode of not sending the HELLO packet. When the
hello timer of the master node is set to 0, the master node does not send
the HELLO packet; when the HELLO packet is not received and the receiving
timer times out, the EIPS node status is not modified. In the mode of not
sending the HELLO packet, as long as the master node detects that its two
ports are both up, it turns to the PRE-UP state.
Figure 15-17 EIPS supports connecting to higher-level network
As shown in Figure 15-17, the EIPS device is configured as the master node
and is connected to one network N via two lines, main and backup. If
neither the main line nor the backup line is faulty, EIPS blocks the port S
of the backup line and the data is aggregated to network N via the main
line. If the main line fails, EIPS enables the port S of the backup line
and the data is aggregated to network N via the backup line.
Uni-directional Detection Function
In the present EIPS technology, the EIPS node detects a line fault
according to the signal status of the physical line. If there are other
transmission devices between two nodes on the EIPS ring, as shown in
Figure 15-18, and a uni-directional fault appears between the transmission
devices M1 and M2, EIPS cannot detect the fault according to the line
signals. The EIPS master node also cannot detect the uni-directional fault
according to HELLO. In the actual network, the possibility of the
uni-directional fault is small.
Figure 15-18 Uni-directional fault on transmission device between two
EIPS nodes
To solve the problem, the EIPS nodes send the detection packet LINK-HELLO
to each other. LINK-HELLO adopts the standard EIPS packet and uses the
SYSTEM_MAC_ADDR field and the front two fields in the packet for the
detection. The destination MAC address in the standard EIPS packet is
0001.7A4F.4AB4, but it can be learned automatically according to the peer
destination MAC address. ERP_TYPE is LINK-HELLO(14). SYSTEM_MAC_ADDR
records the MAC address of the peer port and the front two fields record
the port number of the peer port. Meanwhile, the front fields of the
reserved field in the packet record the port number of the sending port.
When the peer is not yet known, the eight bytes of peer information are all
0.
As shown in Figure 15-19, if a node receives the LINK-HELLO packet of its
neighbor, and SYSTEM_MAC_ADDR in the packet is the MAC address of the local
port and the port number is the number of the local port, the line is
regarded as bi-directional.
Figure 15-19 EIPS node sends LINK-HELLO to detect the uni-direction
By default, the period of sending LINK-HELLO is 1s. The LINK-HELLO timeout
period is three times the sending period. The sending period can be
configured. When sending LINK-HELLO, the source MAC address in the packet
is the MAC address of the sending port of the sending node, and
SYSTEM_MAC_ADDR is the MAC address of the receiving port of the peer node.
As shown in the figure, the node S1 sends the LINK-HELLO packet whose
source MAC address is the MAC address of S1 and whose SYSTEM_MAC_ADDR is
the MAC address of S2. S1 gets the MAC address of S2 from the LINK-HELLO
packet of S2. A received LINK-HELLO packet takes part in the timeout
judging only when its source MAC address is the peer MAC address,
SYSTEM_MAC_ADDR in the packet is the MAC address of the local port, and the
port number is the local port number. When the node does not know the peer
MAC address, SYSTEM_MAC_ADDR and the port number of the LINK-HELLO packet
are set to all 0.
If SYSTEM_MAC_ADDR in a LINK-HELLO packet received by a port of an EIPS
node is not the MAC address of the receiving port, or the port number field
is not the number of the receiving port, a uni-directional fault is
regarded as having appeared. The node performs the shutdown operation on
the uni-directional physical port and sends the TRAP information to the
gateway. After the physical port is shut down, EIPS is notified at once.
A receive timeout indicates that one direction or both directions may be
disconnected. If one direction is disconnected, the neighbor can detect it;
if both directions are disconnected, the EIPS master node can detect it.
Therefore, when the receiving times out, the node only needs to clear the
recorded MAC address of the neighbor; no further operation is needed.
If the port belongs to multiple EIPS nodes, the control VLAN of any one of
them can be used as the VLAN field when forming the LINK-HELLO packet. For
convenience, the control VLAN of the EIPS node with the minimum node number
is selected.
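The LINK-HELLO checks described above can be sketched as a small receive handler. This is a simplified model under stated assumptions: the class and function names are hypothetical, an all-zero peer field is treated as "the sender has not learned this port yet" rather than as a fault, and packets from an unexpected source MAC are simply ignored.

```python
from dataclasses import dataclass
from typing import Optional

ZERO_MAC = bytes(6)  # SYSTEM_MAC_ADDR value while the peer is unknown


@dataclass
class LinkHello:
    src_mac: bytes          # MAC address of the sending port
    src_port_no: int        # port number of the sending port
    system_mac_addr: bytes  # peer port MAC as known by the sender
    peer_port_no: int       # peer port number as known by the sender


@dataclass
class RingPort:
    mac: bytes
    port_no: int
    peer_mac: Optional[bytes] = None
    peer_port_no: int = 0


def build_link_hello(port: RingPort) -> LinkHello:
    """Fill the peer fields with what was learned, or all 0 if nothing was."""
    return LinkHello(port.mac, port.port_no,
                     port.peer_mac or ZERO_MAC, port.peer_port_no)


def on_link_hello(port: RingPort, pkt: LinkHello) -> str:
    """Classify a received LINK-HELLO for the receiving port."""
    if pkt.system_mac_addr == ZERO_MAC and pkt.peer_port_no == 0:
        # Assumption: the sender does not know us yet; just learn the peer.
        port.peer_mac, port.peer_port_no = pkt.src_mac, pkt.src_port_no
        return "learning"
    if pkt.system_mac_addr != port.mac or pkt.peer_port_no != port.port_no:
        return "unidirectional"  # caller shuts the port down and sends a TRAP
    if port.peer_mac is None:
        port.peer_mac, port.peer_port_no = pkt.src_mac, pkt.src_port_no
    elif pkt.src_mac != port.peer_mac:
        return "ignored"         # not counted in the timeout judging
    return "bidirectional"       # restarts the receive timeout


def on_receive_timeout(port: RingPort) -> None:
    """On timeout only the recorded neighbor information is cleared."""
    port.peer_mac, port.peer_port_no = None, 0
```

With the default 1s sending period, the receive timeout in this model would be 3s.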
Reliability Realization
In a ring topology network, if the control platform of the master node
becomes abnormal and breaks down while the data platform stays intact, a
loop forms in the data platform. To avoid the problem, the master node is
backed up to realize the EIPS reliability; this is the purpose of the
backup master node. Its main function is to serve as the master node when
the control platform of the master node breaks down: when it detects that
the topology is complete, it blocks its assistant port to avoid the loop
and informs the other nodes to refresh their FDB.
Only a transmission node can be the backup master node. The edge node, the
assistant edge node, and any transmission node that is connected to an edge
node or assistant edge node cannot serve as the backup master node. To
avoid affecting the link when the assistant port of the backup master node
and the assistant port of the master node are blocked, the assistant port
of the configured backup master node must be directly connected to the
assistant port of the master node, as shown in the following figure.
Figure 15-20 Assistant port of backup master node is direct-connected to
assistant port of master node
Set the HELLO packet and LINKDOWN packet on the backup master node to be
both sent to the CPU and forwarded. When the backup master node cannot
receive the HELLO packet of the master node, it sends the HELLO1 packet to
detect the integrity of the data platform of the master node and the
complete status of the ring. The format of the HELLO1 packet is the same as
that of the HELLO packet; only the destination MAC address is different:
the destination MAC address of the HELLO1 packet is 0001.7A4F.4AB5. If the
assistant port receives the HELLO1 packet, the loop is complete and the
data platform of the master node is intact, but its control platform has
broken down. In this case, the backup master node blocks its assistant
port, sends the COMP-FLUSH-FDB packet from the main port, and sets its
working status to master node, as shown in Figure 15-21.
Figure 15-21 The control platform of master node breaks down
When the backup master node works as the master node, its working
principle is basically the same as that of the master node. When it
receives a LINKDOWN packet on the ring, it enables the assistant port and
sends the COMM-FLUSH-FDB packet to the ring via both ports. If it receives
the HELLO packet of the master node while the assistant port is in the
BLOCK state, it enables the assistant port and switches its working status
back to the transmission node status.
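The behaviour of the backup master node can be summarised as a small state machine. The sketch below is a simplified model, not the switch implementation: the class and method names are hypothetical, and sending HELLO1 from the main port is an assumption (the text only says that it is received on the assistant port).

```python
class BackupMaster:
    """Simplified model of the backup master node described above."""

    def __init__(self):
        self.role = "transmission"      # normal working status
        self.assistant_blocked = False
        self.sent = []                  # log of (packet, port) that were sent

    def on_hello_timeout(self):
        # HELLO of the master node is missing: probe the ring with HELLO1.
        self.sent.append(("HELLO1", "main"))  # sending port is an assumption

    def on_hello1_on_assistant_port(self):
        # The ring is complete but the master's control platform is down.
        self.assistant_blocked = True
        self.sent.append(("COMP-FLUSH-FDB", "main"))
        self.role = "master"

    def on_linkdown(self):
        # Acting as master: open the assistant port and flush both ways.
        if self.role == "master":
            self.assistant_blocked = False
            self.sent.append(("COMM-FLUSH-FDB", "main"))
            self.sent.append(("COMM-FLUSH-FDB", "assistant"))

    def on_master_hello(self):
        # The real master is back: release the port and step down.
        if self.assistant_blocked:
            self.assistant_blocked = False
            self.role = "transmission"


if __name__ == "__main__":
    node = BackupMaster()
    node.on_hello_timeout()
    node.on_hello1_on_assistant_port()
    print(node.role, node.assistant_blocked)
    node.on_master_hello()
    print(node.role, node.assistant_blocked)
```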
ULFD Technology
ULFD Protocol and Application
Unidirectional Link Fault Detection protocol (ULFD) is an L2 protocol. It
can be used by devices connected with fibers or twisted pairs so that they
can monitor the physical configuration of the cables and check whether a
uni-directional link exists. When discovering a uni-directional link, ULFD
disables the interface.
A uni-directional link results in a series of problems, including spanning
tree topology loops.
This section describes the theory and realization of the ULFD protocol.
Related Terms of ULFD Protocol
Uni-directional link: Sometimes a special phenomenon occurs, the
uni-directional link: the local device can receive the packets sent by the
peer device, but the peer device cannot receive the packets of the local
device. The uni-directional link causes a series of problems, such as
spanning tree topology loops.
Take fiber as an example. There are two types of uni-directional link. One
is that the fibers are cross-connected; the other is that one fiber is not
connected or is disconnected. In Figure 16-1, the fibers of the two devices
are cross-connected; in Figure 16-2, the hollow line means that one fiber
is not connected or is disconnected. The typical case of Figure 16-2 is
that one device is not connected or disconnected.
Figure 16-1 The cross connection of fibers
Figure 16-2 One fiber is not connected or disconnected
Introduction to ULFD Protocol
The ULFD protocol is used for uni-directional link detection in the
network.
The ULFD protocol has the following features. ULFD is a link layer protocol
that cooperates with the physical layer protocol to monitor the link status
of the devices. The auto negotiation mechanism of the physical layer
detects the physical signals and faults; ULFD identifies the peer devices
and uni-directional links and closes the unreachable ports. When the auto
negotiation mechanism and ULFD are both enabled, they work together to
detect and close physical and logical uni-directional connections and
prevent other protocols (such as STP) from becoming invalid. If the links
at the two ends each work at the physical layer, ULFD detects whether the
links are connected correctly and whether the two ends can exchange
packets. This detection cannot be realized via the auto negotiation
mechanism.
Protocol Packet Definition
The ULFD protocol runs at the LLC layer. It uses one special broadcast
address as the destination address and adopts the standard SNAP format.
Destination MAC address: 01-00-0C-CC-CC-CC.
Source MAC address: the L2 MAC address of the device.
ULFD SNAP format:
LLC value: 0xAAAA03
Org ID: 0x00017a
HDLC protocol type: 0x0111
ULFD PDU Field Definition:
Ver field (3 bits):
0x01: ULFD PDU Version Number, the current ULFD protocol version number
Opcode field (5 bits):
Packet Type         Value   Description
Keepalive (probe)   0x01    Used to discover the neighbor and keep the
                            neighbor alive; used when maintaining the
                            neighbor table and requesting
                            re-synchronization of the neighbor.
Detection (echo)    0x02    Used for the uni-directional detection; sent
                            when a new neighbor is detected (or an old
                            neighbor is re-synchronized).
Clear (flush)       0x03    Notifies the neighbor when ULFD is disabled on
                            a device or port; used to synchronize the
                            neighbor information rapidly. After receiving
                            the message, the neighbor clears up the
                            corresponding buffered information.
If the TLV type is not in the TLV type range defined by ULFD, the TLV is
regarded as invalid.
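The 3-bit Ver field and the 5-bit Opcode field together fill one byte. A minimal sketch of packing and unpacking that byte follows; it assumes that Ver occupies the three most significant bits, and the function names are hypothetical.

```python
ULFD_VERSION = 0x01
OPCODES = {0x01: "probe", 0x02: "echo", 0x03: "flush"}


def pack_ver_opcode(opcode: int, version: int = ULFD_VERSION) -> int:
    """Combine Ver (3 bits, assumed high) and Opcode (5 bits) into one byte."""
    if not 0 <= version < 8:
        raise ValueError("version must fit in 3 bits")
    if not 0 <= opcode < 32:
        raise ValueError("opcode must fit in 5 bits")
    return (version << 5) | opcode


def unpack_ver_opcode(byte: int) -> tuple:
    """Split the first PDU byte into (version, opcode)."""
    return (byte >> 5) & 0x07, byte & 0x1F


if __name__ == "__main__":
    first = pack_ver_opcode(0x02)
    version, opcode = unpack_ver_opcode(first)
    print(hex(first), version, OPCODES.get(opcode, "unknown"))
```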
Protocol Action
The work of the ULFD protocol contains the following aspects:
Neighbor discovery: The port sends its own information and the
re-synchronization request via the probe packet, and the peer port realizes
the neighbor discovery according to the content of the probe packet. After
a port receives a probe packet, it judges whether the sending port is in
the neighbor table. If not, the sender is a new neighbor: the port adds it
to the neighbor table and returns the echo packet for uni-directional
detection. If the sending port is in the neighbor table but the probe
packet is set with the RSY flag, the neighbor requests re-synchronization:
the port sends the echo packet to it for the uni-directional detection. If
the sending port is in the neighbor table and the RSY flag is not set, the
probe packet is a common keepalive packet and the port updates the
information of the neighbor.
Neighbor aging: After the neighbor is added to the neighbor table, the port
sets one aging time Tlf according to the Message Interval value in the
received probe packet. If the port does not receive a probe keepalive
packet from the neighbor before the time Tlf is reached, the neighbor is
aged and deleted from the neighbor table.
Uni-directional detection: The port performs the uni-directional detection
only when the neighbor table changes. The detection initiator first sends
one probe packet with the synchronization request (RSY flag), which
requests the peer to return an echo packet carrying the peer's own neighbor
table information in the echo-tlv field. If the initiator receives the echo
packet sent by the peer, it checks whether the contents of the echo-tlv
field are correct, including the packet format and whether the local port
and device ID information is contained. If the format of the received echo
packet is correct and the echo-tlv field contains the local port and device
ID information, the port is regarded as being in the bi-directional status;
if the contents of the received echo packet are not correct, the port is
regarded as being in the uni-directional status; if the echo packet is not
received, the processing method depends on the ULFD detection mode.
Uni-directional processing: After the port status is confirmed as
uni-directional, the neighbor table of the port is cleared up, the FLUSH
packet is sent to inform the neighbor to clear up the information of the
port, and then the port is shut down. To re-enable the port, the user needs
to execute the Reset operation manually or configure another auto recovery
mechanism.
Keepalive mechanism: After the port status is stable, the port periodically
sends the probe keepalive packet, informing other ports of its status. The
peer uses the keepalive packet to refresh the status of the neighbor. If
the probe keepalive packet is not received within the keepalive period, the
port is deleted from the neighbor table. The probe keepalive packet carries
all neighbor information of the port. The sending period of the probe
keepalive packet, Tmsg, can be set via the global command.
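The neighbor discovery and aging rules above can be sketched as a small neighbor table. This is an illustrative model only: the class names are hypothetical, and deriving the aging time Tlf as three times the Message Interval is an assumption, since the text only says that Tlf is set according to that value.

```python
from dataclasses import dataclass, field


@dataclass
class Probe:
    device_id: str
    port_id: str
    rsy: bool = False               # re-synchronization request flag
    message_interval: float = 15.0  # seconds, carried in the probe


@dataclass
class UlfdPort:
    # (device_id, port_id) -> time at which the neighbor ages out
    neighbors: dict = field(default_factory=dict)

    def on_probe(self, probe: Probe, now: float, hold_multiplier: int = 3) -> str:
        """Handle a received probe and return the action to take."""
        key = (probe.device_id, probe.port_id)
        is_new = key not in self.neighbors
        # Assumption: Tlf = hold_multiplier * Message Interval.
        self.neighbors[key] = now + hold_multiplier * probe.message_interval
        if is_new or probe.rsy:
            return "send_echo"      # start the uni-directional detection
        return "refresh"            # common keepalive, entry refreshed

    def age(self, now: float) -> list:
        """Delete neighbors whose aging time has passed; return their keys."""
        expired = [key for key, deadline in self.neighbors.items()
                   if deadline <= now]
        for key in expired:
            del self.neighbors[key]
        return expired


if __name__ == "__main__":
    port = UlfdPort()
    print(port.on_probe(Probe("SwitchB", "0/1", message_interval=16), now=0.0))
    print(port.on_probe(Probe("SwitchB", "0/1", message_interval=16), now=16.0))
    print(port.age(now=100.0))
```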
Two Kinds of Detection Modes
ULFD has two working modes, that is, normal mode and aggressive mode. The
two modes judge the uni-directional link differently.
In the normal mode, if the port does not receive the packet of the peer end
in the keepalive stage, the port is in the unconfirmed status; if the port
does not receive the echo packet of the peer end, or the received echo
packet does not contain the local port information, in the uni-directional
detection stage, the link between the local port and the peer is regarded
as being in the uni-directional state. The normal mode is often used to
check the uni-directional status caused by the crossover connection.
In the aggressive mode, if the port does not receive the packet of the peer
end so that all neighbors are aged in the keepalive stage, and no neighbor
is learned after the process of re-establishing the link, the local port is
regarded as unreachable (not a uni-directional link in the strict sense)
and is shut down; if the port does not receive the echo packet of the peer
end, or the received echo packet does not contain the local port
information, in the uni-directional detection stage, the link between the
local port and the peer is regarded as being in the uni-directional state.
The aggressive mode is used to check the uni-directional connection caused
by the fiber crossover connection or disconnection.
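The difference between the two modes can be condensed into one decision function. The sketch below follows the two paragraphs above; the function name, parameter names and result strings are hypothetical, and the "neighbor_relearned" outcome in the aggressive mode is an assumption for the case the text does not spell out.

```python
def judge_port(mode: str, stage: str, *, packet_received: bool,
               echo_has_local_info: bool = False,
               relearned_neighbor: bool = False) -> str:
    """Return the port verdict for the given mode and protocol stage."""
    if mode not in ("normal", "aggressive"):
        raise ValueError(f"unknown mode: {mode}")

    if stage == "detection":
        # Same rule in both modes: the echo must arrive and must contain
        # the local port information.
        if packet_received and echo_has_local_info:
            return "bidirectional"
        return "unidirectional"

    if stage == "keepalive":
        if packet_received:
            return "no_change"
        if mode == "normal":
            return "unconfirmed"
        # Aggressive mode: all neighbors aged, the link was re-established.
        if relearned_neighbor:
            return "neighbor_relearned"  # assumption: detection starts again
        return "unreachable_shutdown"

    raise ValueError(f"unknown stage: {stage}")


if __name__ == "__main__":
    print(judge_port("normal", "keepalive", packet_received=False))
    print(judge_port("aggressive", "keepalive", packet_received=False))
    print(judge_port("normal", "detection", packet_received=True,
                     echo_has_local_info=False))
```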
Typical Application
When using ULFD, ensure that the corresponding ports are configured with
the ULFD function and work in the same detection mode, and that the ULFD
global setting of the device is enabled.
This section configures one basic ULFD setup for reference.
The network topology is as follows:
ULFD configuration instance
Illustration
Port 0/0 of the local switch A is connected to port 0/1 of the peer switch
B via the fiber. Configure the ULFD function on the connection to detect
the connection status of the link.
The configuration of Switch A:
Command                                         Description
SwitchA(config)#port 0/0                        Enter the port configuration mode
SwitchA(config-port-0/0)#ulfd port aggressive   Configure the ULFD work mode as aggressive on port 0/0
SwitchA(config-port-0/0)#exit                   Exit the port configuration mode
SwitchA(config)#ulfd message time 16            Configure the interval of sending packets as 16s
SwitchA(config)#ulfd enable                     Enable ULFD globally
SwitchA(config)#exit                            Complete the ULFD configuration
The configuration of Switch B:
Command                                         Description
SwitchB(config)#port 0/1                        Enter the port configuration mode
SwitchB(config-port-0/1)#ulfd port aggressive   Configure the ULFD work mode as aggressive on port 0/1
SwitchB(config-port-0/1)#exit                   Exit the port configuration mode
SwitchB(config)#ulfd message time 16            Configure the interval of sending packets as 16s
SwitchB(config)#ulfd enable                     Enable ULFD globally
SwitchB(config)#exit                            Complete the ULFD configuration
OAM Technology
This chapter describes the MAN OAM technology and its applications. OAM
is short for Operation, Administration and Maintenance.
Main contents:
CFM protocol and its application
E-LMI protocol and its application
Ethernet OAM protocol and its application
CFM Protocol and Application
This section describes the basic theories of the Ethernet connectivity
fault management (CFM).
Main contents:
Terms of Ethernet CFM
Introduction to Ethernet CFM protocol
Terms of Ethernet CFM
CFM: Connectivity Fault Management;
OAM: Operation, Administration and Maintenance;
Maintenance Domain (MD): The part of the network covered by the
connectivity fault management. Its boundary is defined by a series of
maintenance points (MP) configured on the ports. The maintenance domain
name is used to identify the MD. According to the multi-domain OAM network
model of 802.1ag, MDs have hierarchical levels. A high-level MD can contain
a low-level MD, but the two cannot intersect; that is, the range covered by
the high level is larger than that covered by the low level. The integers
0-7 identify the different levels. The higher the level, the bigger the
number.
Maintenance Association (MA): A set of MPs within an MD. An MA is
identified by MD name + short MA name. An MA serves one VLAN: the packets
sent by the MPs in the MA are forwarded in that VLAN, and the packets sent
by the other MPs in the MA are received in it. Therefore, an MA is also
called a Service Instance (SI).
Maintenance Point (MP): A Maintenance Association End Point (MEP) or a
Maintenance Association Intermediate Point (MIP). It is configured on a
port and belongs to one MA. On one port, each MA can be configured with
only one MP.
Maintenance Association End Point (MEP): It can receive and send any CFM
packet. Each MEP is identified by an integer, which is called the MEP ID.
The MEP is configured on a port and decides the MD range. The MA and MD to
which the MEP belongs decide the VLAN attribute and level attribute of the
packets sent by the MEP. According to the location of the MEP in the MA,
the MEP direction is either inward or outward. If the packets in the MA are
received from the port on which the MEP is configured, the MEP direction is
outward; likewise, an outward MEP can send packets to the network only via
the port on which it is configured. Conversely, if the packets in the MA
are received from other ports, the MEP direction is inward; an inward MEP
cannot send packets to the network via the port on which it is configured.
Maintenance Association Intermediate Point (MIP): It can process and
respond to some CFM packets (such as LT packets, or LB packets at its own
level that are destined for it), but cannot send packets on its own
initiative. The MA and MD to which the MIP belongs decide the VLAN
attribute and the MD level of the received packets.
Introduction to Ethernet CFM Protocol
The IEEE 802.1ag protocol refers to the Ethernet OAM function as
connectivity fault management (CFM), which is a supplement to the 802.1Q
protocol. It is the end-to-end Ethernet OAM function based on VLAN. It
defines the protocol and protocol entities for checking, confirming and
locating connectivity faults in the VLAN-based network.
This section describes some basic concepts and functions of Ethernet CFM.
Maintenance Domain
The maintenance domain is the part of the network covered by the
connectivity fault management. Its boundary is defined by a series of
maintenance points (MP) configured on the ports, including MEP and MIP,
as shown in Figure 17-1.
Figure 17-1 Maintenance domain
The carrier-class Ethernet needs to provide different management scopes
and contents for different organizations. Usually, three kinds of
organizations are involved in carrier-class Ethernet services: customers,
service providers, and network carriers. Customers purchase Ethernet
services from service providers; service providers can use their own
network or other carriers' networks to provide end-to-end Ethernet
services. In IEEE 802.1ag, carrier-class Ethernet is divided into one
multi-domain OAM network model with three maintenance grades, that is,
customers, service providers, and carriers. They correspond to different
management domains. The service providers are responsible for the
end-to-end service management and the carriers provide the service
transmission.
Figure 17-2 shows the three maintenance domains, that is, customers,
service providers, and carriers, as well as the hierarchical structure of
the maintenance domains. CE is the edge device of the customer (Customer
Edge); PE is the edge device of the service provider (Provider Edge).
Hierarchical structure of Ethernet CFM maintenance domain
Maintenance Association
Maintenance Association (MA): It is a set of MPs within an MD. An MA is
identified by the MD name + the short MA name. An MA serves one VLAN: the
packets sent by the MPs in the MA are forwarded in that VLAN, and the
packets sent by the other MPs in the MA are received from it. Therefore, an
MA is also called a Service Instance (SI).
Maintenance Point
A maintenance point is a function point configured on a port, which
takes part in the CFM protocol operation. According to its
location in the maintenance domain, a
maintenance point is either a Maintenance association End Point (MEP) or a
Maintenance domain Intermediate Point (MIP).
MEP defines the limit of one maintenance domain. Meanwhile, these
maintenance points can limit the CFM packets in the range of the
maintenance domain according to the level of the maintenance domain.
MEP can send and receive any CFM packet.
Each MEP is identified by an integer, which is called MEP ID. MEP is
configured on the port and decides the MD range. The MA and MD to which
the MEP belongs decide the VLAN attribute and level attribute of the
packet sent by MEP. According to the location of the MEP in the MA, the MEP
direction is either inward or outward. If the packets in the MA are received
from the port on which the MEP is configured, the MEP direction is outward.
Accordingly, the outward MEP can send packets to the network only via the
port on which the MEP is configured. Conversely, if the packets in the MA are
received from another port, the MEP direction is inward. The inward MEP
cannot send packets to the network via the port on which the MEP is
configured.
A MIP can process and respond to some CFM packets (such as an LT packet, or
an LB packet that is addressed to it and is at the same level as itself),
but cannot initiate packets.
Figure 17-3 shows the case that MEP and MIP are on the devices of the
customers, service providers, and carriers.
Hierarchical management of MD and locations of MEP and MIP
802.1ag supports hierarchical management, and the management level is
identified by the level of the maintenance domain. Maintenance domains can be
nested: a high-level maintenance domain can contain a low-level
maintenance domain, but a low-level maintenance domain cannot
contain a high-level maintenance domain. All CFM packets are initiated
by MEPs. A MIP does not send any CFM packet actively, but responds to LT
or LB packets at the same level as itself.
Figure 17-3 shows the hierarchical management of the maintenance
domain. The bigger the ID, the higher the level, the wider the control
scope.
When the maintenance domains are used to locate a fault, you can first use
LT or LB to determine the faulty segment on Level 5. If the fault is between
two MIPs on Level 5, continue to use LT or LB to locate the fault on Level 3.
Repeat the process at lower levels until the minimum fault area is found.
The packets sent or received by each MP belong to its MA, carry the
VLAN and level attributes of that MA, and do not interfere with each other.
Similarly, MEP sends CCM, and remote MEP receives and processes it.
When the MD and MA configured by remote MEP are inconsistent with
those configured by the MEP that sends CCM, you can find out the
configuration error in the network.
Connectivity Check
The connectivity check (CC) function, also called continuity check, is the
most basic function in 802.1ag. It is used to detect connection failures of
the Ethernet flow between MPs. A connection failure may be caused by a
fault or a configuration error. The connectivity check is suitable for
detecting unidirectional connection failures. Figure 17-4 shows an example
of the CC function. The maintenance domain (Provider Domain) contains two
operator domains (Operator A and Operator B).
Connectivity checking
When the network connection is normal, each MEP periodically sends
multicast CCM (Continuity Check Message). The destination address is the
multicast address, which is determined by the level of the maintenance
domain where the MEP is located, as shown in Table 1-1.
After the MEP receives a CCM sent by a peer MEP in the same
maintenance domain and parses it correctly, the information of the peer
MEP is saved in the CCM database. The information includes the MEP ID, the
MAC address of the MEP, the Remote Defect Indication (RDI) of the MEP, the
Sender ID of the MEP, and so on.
The local MEP compares the MEP ID of the received CCM with the local
configuration to ensure that no MEP ID is duplicated. A duplicated
MEP ID indicates that the network configuration is wrong or that there is
a loop.
The CCM timeout is 3.5 times the sending interval, that is, the
connection between the local MEP and the remote MEP is regarded as
failed when three successive CCMs are lost.
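The timeout rule can be sketched in a few lines of Python. This is only an illustration of the 3.5 x interval arithmetic described above; the function names and the use of seconds as the unit are assumptions made for the example and are not part of the switch software.

```python
def ccm_timeout_seconds(interval_s: float) -> float:
    """Return the CCM loss timeout: 3.5 times the CCM sending interval."""
    if interval_s <= 0:
        raise ValueError("the CCM sending interval must be positive")
    return 3.5 * interval_s


def remote_mep_timed_out(last_ccm_at: float, now: float, interval_s: float) -> bool:
    """True when no CCM has arrived from the remote MEP within the timeout."""
    return (now - last_ccm_at) > ccm_timeout_seconds(interval_s)
```

With a 1-second interval, a remote MEP whose last CCM arrived 3 seconds ago is still considered reachable, while one that has been silent for more than 3.5 seconds is considered lost.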
Multicast address of connectivity check packet (CCM): 01-80-C2-00-00-3y
MD Level of CCM    Four address bits “y”
7                  7
6                  6
5                  5
4                  4
3                  3
2                  2
1                  1
0                  0
A CCM can reach any MEP in its MA. When another MEP receives CCMs
from the MA, it first extracts the packet information and saves it in the
CCM database, and then checks whether CCMs from all the other MEPs in the MA
have been received within the specified time.
Suppose a MEP sends one CCM. When the CCM reaches a MIP in the MA,
the MIP forwards it on; when the CCM reaches a destination
MEP of the same MA, that MEP checks whether its level is the same as that of
the CCM. If the timer has not timed out, the MEP processes the packet, resets
the timer, and waits for the next CCM from the remote MEP.
While receiving the CCMs sent by the other MEPs in the same MA, each MEP also
periodically multicasts its own CCMs. The local MEP is responsible for
checking whether any MEP in the local CCM database has timed out. If a MEP
has timed out, the connection with that remote MEP has failed, and
the error is reported to the network administrator.
When the sending interval of a received CCM is inconsistent with the
value configured in the MA, an error notification (FNG alarm) is triggered.
When the MA IDs of the received CCMs are inconsistent, there is a
cross-connection error, which also triggers an FNG alarm.
Loopback Check
The Loopback (LB) check function is used to check the connection status with
the remote device. It is suitable for detecting bidirectional connectivity
failures. The LB function is shown in Figure 17-5.
Loopback check
Execute the command to send Loop Back Message (LBM) actively on MEP
via the network management system. The target can be any MP in MA. For
the other remote MEP in MA, the local MEP can get its MAC address via
CCM; for MIP, the local MEP gets its MAC address by sending Link Trace
Message (LTM).
Each LBM has a unique serial number. After sending LBM, the serial
number of the packet is reserved for at least 5 seconds, used to
distinguish whether the received Loop Back Reply (LBR) is the correct
reply packet of the sent LBM.
When CC detects a network connectivity error, the network administrator
uses the command to send an LBM and trace the error. When an
MP receives an LBM, it first checks the validity of the packet (for example,
the source address must be a unicast address), and then replies with an LBR
to the source MEP. The source and destination addresses of the LBM are
swapped in the LBR, and the packet type is changed to LBR; the remaining
contents are the same as those of the LBM.
When a MEP receives an LBR, it checks whether the serial number is consistent
with that of the latest LBM. If it is not, there is an error.
If a MIP receives an LBR, it regards the packet as an error packet and drops it.
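The serial-number handling on the sending MEP can be sketched as follows. This is a simplified illustration, not the device implementation: the class name, the in-memory dictionary, and the choice to keep each serial number for exactly 5 seconds (the minimum stated above) are assumptions. Unlike the strict "latest LBM" comparison, the sketch accepts a reply to any LBM that is still being tracked.

```python
class LoopbackTracker:
    """Tracks serial numbers of sent LBMs so that LBRs can be validated."""

    HOLD_SECONDS = 5.0  # assumption: keep each serial number for the 5 s minimum

    def __init__(self) -> None:
        self._next_seq = 0
        self._pending = {}  # serial number -> time the LBM was sent

    def send_lbm(self, now: float) -> int:
        """Allocate a unique serial number for a new LBM and remember it."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = now
        return seq

    def receive_lbr(self, seq: int, now: float) -> bool:
        """Return True if the LBR matches an LBM that is still being tracked."""
        expired = [s for s, sent in self._pending.items()
                   if now - sent > self.HOLD_SECONDS]
        for s in expired:
            del self._pending[s]
        return self._pending.pop(seq, None) is not None
```

A reply that arrives twice, or after the hold time has passed, is reported as not matching, which corresponds to the error case described above.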
Link Trace Function
The Link Trace (LT) function is used to discover the neighboring relations and
locate the fault. The LT function is as shown in Figure 17-6.
Link trace function
LTM is a multicast packet. The multicast address is as shown in table 1-2.
Multicast address of link trace packet (LTM): 01-80-C2-00-00-3y
MD Level of LTM    Four address bits “y”
7                  F
6                  E
5                  D
4                  C
3                  B
2                  A
1                  9
0                  8
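The two address tables follow one pattern: for CCM the last hexadecimal digit equals the MD level, and for LTM it equals the MD level plus 8. A small Python sketch of that mapping is shown here; the function name and the string format are assumptions chosen for illustration.

```python
def cfm_group_address(md_level: int, message: str) -> str:
    """Return the multicast address 01-80-C2-00-00-3y for a CCM or an LTM."""
    if not 0 <= md_level <= 7:
        raise ValueError("MD level must be between 0 and 7")
    if message == "CCM":
        y = md_level          # y = 0..7
    elif message == "LTM":
        y = md_level + 8      # y = 8..F
    else:
        raise ValueError("message must be 'CCM' or 'LTM'")
    return f"01-80-C2-00-00-3{y:X}"
```

For example, a level-4 MD uses 01-80-C2-00-00-34 for CCM and 01-80-C2-00-00-3C for LTM.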
The TLV of the LTM contains one original address (Original MAC) and one
target address (Target MAC). The original address is the address of the port
where the MEP that sends the LTM is located; the target address is the MAC
address of the target MEP to which the LTM is sent. They are distinct from
the destination address and source address of the Ethernet data frame. The
LTM packet carries a unique serial number, which is incremented by one
each time an LTM is sent.
Each MP at the same level on the path to the target address sends one LTR
packet to the original address. The LTR is a unicast packet, whose source
address is equal to the target address of the LTM and whose destination
address is equal to the original address in the TLV of the LTM.
When the FNG alarm appears, send the LTM packet to track and locate the
error link. MEP sends one LTM and MIP decides whether to receive the LTM
packet according to the level of the maintenance domain. When receiving
the packet, MIP first checks whether the TTL value of LTM is 0. If yes, drop
the packet. Otherwise, subtract one from TTL and then search for the
egress port to forward the LTM packet according to the target address and
VLAN ID of LTM in the FDB table. If the egress port is not found in the FDB
table, drop the LTM packet. When the LTM packet is forwarded, the other
information except for the source MAC address and TTL value does not
change. The MIP on the port replies one LTR packet to the source MEP
after one random delay. When the network fails, LTM can only reach the
MP before the faulty point. The MPs between the faulty point and the
target MEP do not reply LTR. In this way, the faulty area can be found.
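The MIP processing steps in the paragraph above (TTL check, FDB lookup, forward and reply) can be sketched as follows. This is a minimal illustration under stated assumptions: the FDB is modeled as a plain dictionary keyed by (VLAN ID, MAC address), the function and key names are hypothetical, and the MD level check and the random delay before the LTR is sent are left out.

```python
from typing import Optional


def mip_handle_ltm(ttl: int, target_mac: str, vlan_id: int,
                   fdb: dict) -> Optional[dict]:
    """Return the forwarding decision for an LTM, or None if it is dropped."""
    if ttl == 0:
        return None                      # TTL exhausted: drop the LTM
    egress_port = fdb.get((vlan_id, target_mac))
    if egress_port is None:
        return None                      # no egress port in the FDB: drop
    return {
        "egress_port": egress_port,      # forward the LTM through this port
        "ttl": ttl - 1,                  # TTL is decremented before forwarding
        "send_ltr": True,                # the MIP also replies with an LTR
    }
```

An LTM whose target is unknown in the FDB, or whose TTL has reached 0, returns `None`, which matches the two drop cases in the text.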
CFM Packet
The EtherType of the CFM packet is 0x8902. The common header of the CFM
packet is as shown in Figure 17-7.
Common header of CFM packet
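Because the figure is not reproduced in this text, the following Python sketch shows one way to decode the four-octet header that all CFM packets share. The field layout (MD level in the upper 3 bits of the first octet, version in the lower 5 bits, then OpCode, Flags, and First TLV Offset) and the OpCode values are taken from IEEE 802.1ag rather than from this manual, and the function name is hypothetical.

```python
import struct

# OpCode values as defined in IEEE 802.1ag (assumption: not listed in this manual)
OPCODES = {1: "CCM", 2: "LBR", 3: "LBM", 4: "LTR", 5: "LTM"}


def parse_cfm_common_header(payload: bytes) -> dict:
    """Decode the first four octets that follow EtherType 0x8902."""
    if len(payload) < 4:
        raise ValueError("a CFM header needs at least 4 octets")
    first, opcode, flags, tlv_offset = struct.unpack("!BBBB", payload[:4])
    return {
        "md_level": first >> 5,          # upper 3 bits
        "version": first & 0x1F,         # lower 5 bits
        "opcode": OPCODES.get(opcode, opcode),
        "flags": flags,
        "first_tlv_offset": tlv_offset,
    }
```

For instance, a header starting with the octets 0x80 0x01 decodes to MD level 4 and OpCode CCM.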
The CCM packet is as shown in Figure 17-8.
CCM packet
The LBM and LBR packets are as shown in Figure 17-9.
LBM and LBR packets
The LTM packet is as shown in Figure 17-10.
LTM packet
The LTR packet is as shown in Figure 17-11.
LTR packet
E-LMI Protocol and Application
Main contents:
E-LMI protocol and application
Definition of E-LMI protocol
Relation of E-LMI protocol and 802.1ag
UNI-N of E-LMI
UNI-C of E-LMI
Typical application
Terms of E-LMI Protocol
EVC (Ethernet Virtual Connection): MEF defines an EVC as a port-class
point-to-point or multipoint-to-multipoint Ethernet L2 circuit. An EVC is
the association of two or more UNIs. The EVC status information can be used
by the CE as the routing basis for accessing the service provider’s network.
UNI (User Network Interface): It is the Ethernet physical interface between
the edge device of the service provider (PE) and the edge device of the
user (CE). It comprises UNI-N (on the PE device) and UNI-C (on the CE
device). The E-LMI protocol runs on one UNI and its endpoints are UNI-N and
UNI-C.
CE (Customer Edge): the edge device of the customer;
PE (Provider Edge): the edge device of the service provider;
Introduction to E-LMI Protocol
Referring to the local management interface standard of frame relay
(FR-LMI), MEF defines the Ethernet Local Management Interface (E-LMI).
E-LMI is an OAM protocol applied on the user network interface (UNI); it
works between the edge device of the customer (CE) and the edge device
of the service provider (PE). E-LMI enables the service provider to
automatically configure the CE according to the purchased services. With
E-LMI, the CE can automatically receive the mapping from a specific
Ethernet service instance (such as VLAN 100) to an EVC, together with the
bandwidth and QoS settings. The auto-configuration function of the CE device
reduces not only the service provisioning work, but also the negotiation
work between the service provider and the enterprise user. The user
therefore does not need to know the configuration of the CE device, which is
configured and managed by the service provider, reducing the risk of
manual configuration errors. In addition, E-LMI provides EVC status
information to the CE device. Once an EVC fault is detected (for example, by
802.1ag), the edge device of the service provider can inform the CE device
of the fault so that the CE device can react in time (for example, by
switching the access route).
Definition of E-LMI Protocol
The E-LMI protocol runs on one UNI; the protocol endpoints are UNI-C (on the
CE) and UNI-N (on the PE), as follows.
[Figure: CE devices connect to PE devices at the edge of the Metro Ethernet Network; E-LMI runs on each User Network Interface between a CE and a PE]
Typical topology of E-LMI protocol running on one UNI
E-LMI Protocol Action
The actions of the E-LMI protocol include CE polling and PE informing.
CE Polling:
The UNI-C device transmits the E-LMI Check message (E-LMI Check
STATUS ENQUIRY) to the UNI-N device for active polling; the polling
interval is T391 (10 s by default). After every N391 (360 by default)
active polls, UNI-C transmits one full status request
message (FULL STATUS ENQUIRY). UNI-N transmits the status and
configuration information of the UNI and EVCs to UNI-C in response. UNI-N
starts the T392 timer to wait for the request message from UNI-C. The
configured value of T392 must be larger than that of T391.
After receiving the correct response of Full Status Enquiry, CE modifies
and updates the status and configuration information of EVC and UNI in
the local database according to the information carried in the response, so
as to ensure that the EVC configuration and status information of CE is
synchronous with that of PE.
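The CE polling cycle can be illustrated with a short Python sketch. It assumes that every N391-th poll is the full status request and that all other polls are E-LMI Check requests; the function names are hypothetical, and the timer check only encodes the rule that T392 must be larger than T391.

```python
from itertools import islice


def polling_schedule(n391: int = 360):
    """Yield the report type of each successive STATUS ENQUIRY sent by UNI-C."""
    if n391 < 1:
        raise ValueError("N391 must be at least 1")
    count = 0
    while True:
        count += 1
        # Assumption: the N391-th poll of each cycle asks for the full status.
        yield "Full Status" if count % n391 == 0 else "E-LMI Check"


def first_polls(n: int, n391: int = 360) -> list:
    """Return the report types of the first n polls."""
    return list(islice(polling_schedule(n391), n))


def validate_timers(t391_s: float, t392_s: float) -> None:
    """Raise ValueError unless T392 is larger than T391."""
    if t392_s <= t391_s:
        raise ValueError("T392 must be larger than T391")
```

With the default values, one poll is sent every 10 seconds and a full status request is sent once per 360 polls, that is, once per hour.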
PE Informing:
If the PE finds that the status of an EVC has changed, it immediately sends
the Single EVC Asynchronous Status message to inform the CE. The CE modifies
and updates the EVC status information in the local database according to
the information carried in the message, so as to ensure that the EVC
status information of the CE is synchronous with that of the PE.
E-LMI Message Type
MEF 16 defines two message types for the E-LMI protocol
interaction: the STATUS ENQUIRY message and the STATUS message.
The content type (Report Type) carried by the E-LMI packet is one of
the following four types:
E-LMI Check: the checking packet during normal polling;
Full Status: full-status packet;
Full Status Continues: Full-status follow-up packet;
Single EVC Asynchronous Status: active EVC status informing packet; the
packet can only be sent by UNI-N to inform CE of the EVC status change
information.
STATUS ENQUIRY Message:
The STATUS ENQUIRY message is sent by UNI-C to ask UNI-N for the
configuration and status information of the EVCs and the UNI. After
receiving a valid STATUS ENQUIRY message, UNI-N should send a STATUS
message in reply.
The structure of the STATUS ENQUIRY message:
Message type: STATUS ENQUIRY
Direction: UNI-C to UNI-N
Information element Type
Protocol Version Mandatory
Message type Mandatory
Report Type Mandatory
Sequence Numbers Mandatory
Data Instance Mandatory
Structure of STATUS ENQUIRY message
STATUS Message:
The STATUS message is sent by UNI-N to reply to a STATUS ENQUIRY
message or to actively inform UNI-C of an EVC status change.
The contents of the STATUS message depend on its Report Type, as follows.
STATUS message
Information Element    Full Status    E-LMI Check    Single EVC Asynchronous Status    Full Status Continued
Sequence Numbers       X              X                                                X
Data Instance          X              X                                                X
UNI Status             X
EVC Status             X                             X                                 X
CE-VLAN ID/EVC Map     X                                                               X
E-LMI Message Frame Encapsulation Format
Destination Address    Source Address    E-LMI Ethertype    E-LMI PDU (message)            CRC
6 octets               6 octets          2 octets           46-1500 octets (Data + Pad)    4 octets
E-LMI message encapsulation frame format
In MEF-16, the destination address of the E-LMI message is defined as
01-80-C2-00-00-07, and the E-LMI EtherType is defined as 0x88EE. The PDU
contents consist of a series of TLVs. For details, refer to the MEF-16
standard.
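The encapsulation can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, the PDU is treated as an opaque byte string, the padding byte is assumed to be zero, and the 4-octet CRC is omitted because it is normally appended by the Ethernet hardware.

```python
import struct

ELMI_DST = bytes.fromhex("0180c2000007")   # 01-80-C2-00-00-07
ELMI_ETHERTYPE = 0x88EE
MIN_PAYLOAD = 46                            # Data + Pad, lower bound
MAX_PAYLOAD = 1500                          # Data + Pad, upper bound


def build_elmi_frame(src_mac: bytes, pdu: bytes) -> bytes:
    """Build an E-LMI frame without the trailing CRC."""
    if len(src_mac) != 6:
        raise ValueError("source MAC must be 6 octets")
    if len(pdu) > MAX_PAYLOAD:
        raise ValueError("E-LMI PDU exceeds 1500 octets")
    payload = pdu.ljust(MIN_PAYLOAD, b"\x00")   # pad short PDUs to 46 octets
    return ELMI_DST + src_mac + struct.pack("!H", ELMI_ETHERTYPE) + payload
```

A one-octet PDU therefore produces a 60-octet frame (6 + 6 + 2 + 46) before the CRC is added.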
Relation between E-LMI Protocol and 802.1ag
The E-LMI protocol runs on the UNI connection between the PE and the CE. It
gets the EVC and UNI configuration and status information from the UNI-N end
to complete the auto-configuration function of the CE at the UNI-C end. At
the UNI-N end, however, the E-LMI module cannot obtain the EVC status
information by itself; it depends on the CC (Continuity Check) function of
the 802.1ag protocol (CFM module) to check the connectivity between the UNIs
of the EVC, so as to determine the current operation status of the EVC.
UNI-N End of E-LMI
The EVC, UNI and CFM need to be configured on the UNI-N end.
The defined EVC needs to be applied on the UNI. On one UNI, an EVC
Reference ID is used to identify one EVC. Different EVCs map to CE-VLAN IDs.
The number of the bound EVCs depends on the UNI type.
CFM
Refer to the configurations and technology description of the 802.1ag
protocol.
EVC
The EVC needs to be defined on UNI-N. EVCs are divided into point-to-point
and multipoint-to-multipoint types.
A point-to-point EVC comprises only two UNIs; a multipoint-to-multipoint EVC
comprises two or more UNIs.
One EVC needs to be bound with a CFM management domain instance.
The connectivity between the UNIs in the EVC can be obtained via the CFM
management domain instance.
UNI
UNI has the following three types:
Bundling: Multiple EVCs can be configured on one UNI and one EVC can
map with multiple CE-VLAN IDs;
Service Multiplexing with no Bundling: Multiple EVCs can be
configured on one UNI, but each EVC can map with only one CE-VLAN ID;
All to one Bundling: One UNI can be bound with only one EVC and all
CE-VLAN IDs map to the EVC;
The port of the UNI-N end needs to be configured as a MEP node of one
CFM domain, with the CC function of CFM enabled. In this way, the UNI-N end
can get the connection status between the UNIs of the EVC configured on the
PE device via the 802.1ag protocol, so as to get the current operation
status of the EVC.
Enable PE Mode of E-LMI Protocol:
After the PE mode of the E-LMI protocol is enabled on the UNI-N, the UNI-N
waits for the request of UNI-C and makes the corresponding response.
When UNI-N finds that the status of an EVC bound to the UNI changes, it
actively sends the EVC status notification message to the CE.
UNI-C of E-LMI
The UNI-C of E-LMI only needs to enable the E-LMI protocol and run in the
CE mode. After being configured in the CE mode, UNI-C periodically sends
the E-LMI Check request to UNI-N. When the E-LMI Check message shows that
the Data Instance values of the EVC and UNI do not match, UNI-C initiates
one Full Status request to ask UNI-N for the EVC and UNI configuration and
status information, and then updates the local UNI-C information.
Typical Applications
The following is one typical application of E-LMI.
Topology of typical E-LMI application
In the above figure, one EVC, EVC_Provider, is defined to represent the
network connection of the service provider. It comprises PE1, PE2, and
PE3. The blue ellipse represents one CFM management domain, the Service
Provider Domain, whose level is 4. The three edge devices are configured
as three MEP nodes of the domain. The CFM management domain checks
the connectivity between the three MEPs to determine the EVC_Provider
operation status.
Enable the E-LMI protocol on the UNI connection UNI1 between CE1 and
PE1. CE1 gets the UNI1 configuration information, and EVC_Provider
configuration and status information from PE1 via the E-LMI protocol, so
as to complete the auto configuration function of CE1.
Ethernet OAM Protocol and Application
Main contents:
Ethernet OAM protocol and related terms
Introduction to Ethernet OAM protocol
Related Terms of Ethernet OAM Protocol
OAM: Operations, Administration and Maintenance
Errored symbol: the number of symbol errors on the port
Errored frame: the number of received error packets
Introduction to Ethernet OAM Protocol
As an L2 protocol, Ethernet OAM is a tool for monitoring and solving
network problems. It can report the network status at the data link layer
so that the network administrator can manage the network more
efficiently. Ethernet OAM is defined in IEEE 802.3ah.
Currently, Ethernet OAM mainly addresses the OAM problems of Ethernet
devices in the last mile, including link performance monitoring, fault
detection and alarming, loopback test, and remote MIB variable request.
All functions of Ethernet OAM become valid only after the Ethernet
OAM connection is set up.
The main functions of the Ethernet OAM are as follows:
1. Discovering and setup of Ethernet OAM connection
2. Link monitoring of Ethernet OAM connection
3. Remote fault diagnosis of Ethernet OAM connection
4. Remote loopback of Ethernet OAM connection
5. MIB variable request of Ethernet OAM connection
Location of Protocol in System
Location of Ethernet OAM in the system
As shown in the above figure, the Ethernet OAM is located between MAC
Control layer and the LLC layer.
Protocol Structure
Structure of Ethernet OAM protocol
As shown in the above figure, Ethernet OAM comprises the OAM sublayer
and OAM client.
The OAM sublayer is responsible for the flow dividing and remote loopback
policy processing of the sent and received packets on the interface; OAM
client is responsible for the connection maintenance and remote loopback
control of the protocol.
Structure of OAM sublayer
As shown in the above figure, the OAM sublayer comprises Multiplexer,
Parser, and Control.
Multiplexer is responsible for the OAM processing at the sending direction
of all packets (including service data packets) on the interface. There are
two modes, that is, Forward mode (send all packets normally) and Discard
mode (discard all packets of non-Ethernet OAM protocol).
Parser is responsible for the OAM processing at the receiving direction of
all packets (including service data packets) on the interface. There are
three modes, that is, Forward mode (receive all packets normally),
Loopback mode (loopback the non-Ethernet OAM protocol packets), and
Discard mode (discard all non-Ethernet OAM protocol packets).
Control is responsible for sending and receiving the Ethernet OAM protocol
packets.
Basic Format of OAM Protocol Packet
Basic format of Ethernet OAM packet
As shown in the above figure, the destination address of the Ethernet OAM
packet is 01-80-C2-00-00-02; the OAM packet belongs to the slow
protocols (the EtherType is 0x8809); the subtype is 0x03;
Flags identifies the status of the Ethernet OAM;
Code identifies the type of the Ethernet OAM packet;
Data/Pad is the data content of the Ethernet OAM packet, which varies
with Code;
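The fields listed above can be decoded with a short Python sketch. The field widths used here (a 2-octet Flags field followed by a 1-octet Code field) come from IEEE 802.3ah rather than from the text of this manual, and the function name is hypothetical; the frame is assumed to be untagged and passed without its CRC.

```python
import struct

OAM_DST = bytes.fromhex("0180c2000002")   # 01-80-C2-00-00-02
SLOW_PROTOCOLS_ETHERTYPE = 0x8809
OAM_SUBTYPE = 0x03


def parse_oampdu(frame: bytes) -> dict:
    """Split an Ethernet OAM frame into Flags, Code, and Data/Pad."""
    if len(frame) < 18:
        raise ValueError("frame is too short to hold an OAMPDU header")
    dst, src = frame[:6], frame[6:12]
    ethertype, subtype, flags, code = struct.unpack("!HBHB", frame[12:18])
    if (dst != OAM_DST or ethertype != SLOW_PROTOCOLS_ETHERTYPE
            or subtype != OAM_SUBTYPE):
        raise ValueError("not an Ethernet OAM frame")
    return {"src": src, "flags": flags, "code": code, "data": frame[18:]}
```

The Code value then selects how the Data/Pad field is interpreted, as described for each OAMPDU type in this section.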
Information OAMPDU:
Information OAMPDU packet is used to send the status information of the
OAM entity (including local information, remote information and
customized information) to the remote OAM entity, keeping the OAM
connection. The packet format is as follows:
Information OAMPDU packet format
Event Notification OAMPDU:
Event Notification OAMPDU packet is used for the link monitoring,
alarming the link fault of the remote OAM entity. The packet format is as
follows.
Event Notification OAMPDU packet format
Variable Request OAMPDU
Variable Request OAMPDU packet is the variable request packet, which is
sent when requesting the MIB variable. The packet format is as follows:
Variable Request OAMPDU packet format
Variable Response OAMPDU
The Variable Response OAMPDU packet is sent in response to a MIB
variable request. The packet format is as follows:
Variable Response OAMPDU packet format
Loopback Control OAMPDU
Loopback Control OAMPDU packet is used for remote loopback control. The
device can select whether to use the packet. To realize the loopback
control, the local DTE sends the loopback control command to the remote
DTE. If the loopback control function of the remote DTE is enabled, the
sent packet is returned to the sending party. The packet format is as
follows.
Loopback Control OAMPDU packet format
Discovery and Setup of OAM Connection
The OAM connection is set up during OAM Discovery. When setting up the
OAM connection, the connected devices exchange their OAM
configuration information and announce the OAM capabilities supported by
the local node. The other OAM functions can be performed only after the
OAM connection is set up.
Active Mode and Passive Mode
The device can select Active mode or Passive mode to set up the OAM
connection. The DTE (Data Terminal Equipment) processing capabilities in
active mode and passive mode are as follows.
Comparison of DTE processing capabilities in active mode and passive mode
Processing Capability                          DTE in Active Mode                               DTE in Passive Mode
Initiate OAM Discovery                         Yes                                              No
Respond OAM Discovery                          Yes                                              Yes
Need to send Information OAMPDUs               Yes                                              Yes
Allow sending Event Notification OAMPDUs       Yes                                              Yes
Allow sending Variable Request OAMPDUs         Yes                                              No
Allow sending Variable Response OAMPDUs        Yes, but the peer DTE must be in active mode     Yes
Allow sending Loopback Control OAMPDUs         Yes, but the peer DTE must be in active mode     No
Respond Loopback Control OAMPDUs               Yes                                              Yes
Allow sending Organization Specific OAMPDUs    Yes                                              Yes
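The capability comparison can also be expressed as a lookup, which is convenient when checking whether a given action is allowed for a pair of modes. The Python sketch below simply encodes the rows as they are listed in this section; the action keys and the function name are hypothetical.

```python
ALWAYS, NEVER, PEER_ACTIVE = "always", "never", "peer-active"

# action -> (rule for a DTE in active mode, rule for a DTE in passive mode)
RULES = {
    "initiate_discovery":         (ALWAYS, NEVER),
    "respond_discovery":          (ALWAYS, ALWAYS),
    "send_information":           (ALWAYS, ALWAYS),
    "send_event_notification":    (ALWAYS, ALWAYS),
    "send_variable_request":      (ALWAYS, NEVER),
    "send_variable_response":     (PEER_ACTIVE, ALWAYS),
    "send_loopback_control":      (PEER_ACTIVE, NEVER),
    "respond_loopback_control":   (ALWAYS, ALWAYS),
    "send_organization_specific": (ALWAYS, ALWAYS),
}


def permitted(action: str, local_mode: str, peer_mode: str) -> bool:
    """Return True if a DTE in local_mode may perform the action."""
    for mode in (local_mode, peer_mode):
        if mode not in ("active", "passive"):
            raise ValueError("mode must be 'active' or 'passive'")
    rule = RULES[action][0 if local_mode == "active" else 1]
    if rule == PEER_ACTIVE:
        return peer_mode == "active"   # allowed only when the peer is active
    return rule == ALWAYS
```

For example, a passive DTE may not send a Variable Request OAMPDU regardless of the mode of its peer.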
Status Transferring and Triggering Event of Connection
Status transferring of connection
The above figure shows the state transitions of the Ethernet OAM
connection. Besides the transitions described in the above figure,
there are several special transitions:
1. When the connection timeout timer times out, every state returns to
Active or Passive;
2. When the port is down or the OAM function is shut down, every state
returns to Fault;
Transferred Status of OAM Connection
Transferred status of connection
Status          Description
Fault           Ethernet OAM has not started running.
Active          Active status. The port periodically sends the Information
                OAMPDU packet that contains the Local Information TLV to
                discover the connection.
Passive         Passive status. The port passively waits for the Information
                OAMPDU packet that contains the Local Information TLV to
                accept the connection.
Discovered      The connection has been discovered. The port periodically
                sends the Information OAMPDU packet that contains the Local
                Information TLV and the Remote Information TLV to negotiate
                the connection, and enables the connection timeout timer.
Local-stable    The local end has passed the attribute matching. The port
                periodically sends the Information OAMPDU packet that
                contains the Local Information TLV and the Remote Information
                TLV to negotiate the connection, and enables the connection
                timeout timer.
Up              The connection is set up. The port periodically sends the
                Information OAMPDU packet that contains the Local Information
                TLV and the Remote Information TLV to keep the connection
                alive, and enables the connection timeout timer.
Event Triggering OAM Connection Status Transferring
Events triggering OAM connection status transferring
Event                                   Description
Ethernet OAM port UP                    The Ethernet OAM port becomes up.
Ethernet OAM port DOWN                  The Ethernet OAM port becomes down, including port down
                                        and Ethernet OAM function shutdown.
Receive the Information OAMPDU packet   The Information OAMPDU packet is received.
Local attribute matching passed         The local attribute is matched against the Information
                                        OAMPDU and the matching is passed.
Local attribute matching not passed     The local attribute is matched against the Information
                                        OAMPDU and the matching is not passed.
Remote attribute matching passed        According to the Flags field of the Information OAMPDU
                                        packet, the remote attribute matching is passed.
Remote attribute matching not passed    According to the Flags field of the Information OAMPDU
                                        packet, the remote attribute matching is not passed.
Connection times out                    The connection is invalid and the timer times out.
Serious Link Event of OAM Connection
When a serious link event occurs on the link, the related link status is set
in the Flags field of the Ethernet OAM packet header and the
connected peer end is informed via the Event Notification OAMPDU packet.
The serious link event types of the Ethernet OAM connection and their
definitions are as follows:
Serious link event of Ethernet OAM connection
Event             Definition
Link fault        The hardware PHY detects a link fault in the receiving
                  direction.
Dying gasp        An unrecoverable fault event happens at the local end. For
                  example, Ethernet OAM is down.
Critical event    An unpredictable serious event happens (currently, there is
                  no definition).
Link Monitor ing of OAM Connect ion Ethernet OAM can monitor and check the error signals and error frames on
the link periodically and execute the specified operation when the error
number exceeds the specified threshold (such as shut down the port) and
inform the connected peer end via the Event Notification OAMPDU packet.
The link monitoring types of the Ethernet OAM connection and their
definitions are as follows:
Link monitoring types and definitions of Ethernet OAM connection
Link monitoring event | Definition
Errored Symbol Period | The number of errored symbols exceeds the defined threshold within a window of the specified number of symbols
Errored Frame | The number of errored frames exceeds the defined threshold within the unit time period
Errored Frame Period | The number of errored frames exceeds the defined threshold within a window of the specified number of frames
Errored Frame Seconds Summary | The number of errored frame seconds exceeds the defined threshold within the unit time period
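The threshold checks behind the Errored Frame and Errored Frame Seconds Summary events can be sketched as follows. This is only an illustration under stated assumptions: the input is a hypothetical list of per-second errored frame counts, an "errored frame second" is assumed to be a second containing at least one errored frame, and the function names are not taken from the switch software.

```python
def errored_frame_events(per_second_errors, window, threshold):
    """Return (window_index, total) for every window whose errored frame
    count exceeds the threshold. window is given in seconds."""
    events = []
    for start in range(0, len(per_second_errors), window):
        total = sum(per_second_errors[start:start + window])
        if total > threshold:
            events.append((start // window, total))
    return events


def errored_frame_seconds_events(per_second_errors, window, threshold):
    """Same check, but counting the seconds that contain any errored frame."""
    events = []
    for start in range(0, len(per_second_errors), window):
        bad = sum(1 for n in per_second_errors[start:start + window] if n > 0)
        if bad > threshold:
            events.append((start // window, bad))
    return events


counts = [0, 2, 0, 5, 0, 0, 1, 0]
print(errored_frame_events(counts, window=4, threshold=3))
print(errored_frame_seconds_events(counts, window=4, threshold=1))
```

Each reported window would correspond to one Event Notification OAMPDU sent to the peer end, plus the configured local action.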
Remote Loopback of OAM Connection
After the OAM connection is set up, the Loopback Control OAMPDU packet
can be sent to make the peer end enter the remote loopback test
mode. During the remote loopback test, the packets sent by the local end are
looped back by the peer end, so as to test the parameters of the link, such
as packet loss rate and delay.
Remote Loopback test mode only influences the non-Ethernet OAM
protocol packets. The Ethernet OAM protocol packets are still sent and
received normally.
In the remote loopback test mode, the processing of the OAM sublayer is
as follows:
Port status in remote loopback mode
Port status | Multiplexer mode | Parser mode | Description
Master (initiating the loopback) | Forward | Discard | Enters the mode when receiving the Information OAMPDU packet that indicates that the peer end is in the loopback state
Slave (looping back) | Discard | Loopback | Enters the mode when receiving the command of enabling loopback in the Loopback Control OAMPDU packet
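The table above can be read as a small lookup, sketched below. It assumes that the multiplexer handles frames from the local end toward the link and the parser handles frames received from the link; the names `LOOPBACK_MODES` and `handle_frame` are hypothetical.

```python
# Multiplexer and parser modes per port status, copied from the table above.
LOOPBACK_MODES = {
    "master": {"multiplexer": "forward", "parser": "discard"},
    "slave": {"multiplexer": "discard", "parser": "loopback"},
}


def handle_frame(role, direction, is_oam):
    """Return the action for one frame during the remote loopback test.

    direction is "tx" (from the local end, seen by the multiplexer) or
    "rx" (from the link, seen by the parser). Both are assumptions.
    """
    if is_oam:
        # Ethernet OAM protocol packets are not affected by the loopback.
        return "process normally"
    stage = "multiplexer" if direction == "tx" else "parser"
    return LOOPBACK_MODES[role][stage]


print(handle_frame("slave", "rx", is_oam=False))
```

With these modes, test traffic sent by the master is returned by the slave, while the slave's own non-OAM traffic is held back during the test.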
In the Remote Loopback test mode, the non-Ethernet OAM protocol packets
are processed as follows:
Process of the non-Ethernet OAM protocol packets in Remote Loopback
test mode
MIB Variable Request of OAM Connection
The local OAM entity can send a remote MIB variable request (in an OAMPDU
packet) to the peer OAM entity to ask for the current MIB variables. The
function can be used to monitor the link status of the remote port in real
time.
EVC Technology
This chapter describes the EVC technology and application.
Main contents:
Related terms
Application description
Typical application
Related Terms
This section describes the related terms of EVC.
EVC (Ethernet Virtual Connection): EVC is put forward by MEF. It is
the virtual connection used to connect two or more UNIs and switch
Ethernet service frames between them.
EVC can be divided into three types according to the connection mode:
1. Point-to-point EVC: It is also called E-Line Service, including two types:
EPL: Ethernet private line
EVPL: Ethernet virtual private line
The difference is that there can be multiple EVPLs on one UNI, while there
can be only one EPL on one UNI.
2. Multipoint-to-multipoint EVC, also called E-LAN Service
3. Point-to-multipoint EVC: It is a special EVC. One side is called the
root and the other side is called the leaf. The EVC is formed by one or
multiple roots (usually one root) and one or multiple leaves. The main
feature is that the frames from the root node need to be copied
to all leaves, while the frames from a leaf node only
need to be transmitted to the root node and do not need to
be copied between the leaf nodes. The main usage is IPTV. Currently,
Maipu switch does not support this kind of EVC directly, but can
support it indirectly by configuring the port separation and L3
forwarding features between UNIs.
UNI (User Network Interface): It is the Ethernet physical connection
between the network edge device of the service provider (PE) and the
customer edge device (CE). It is formed by UNI-N (defined on PE device)
and UNI-C (defined on CE device). The E-LMI protocol runs on one UNI
and its two ends are UNI-N and UNI-C.
Currently, UNI supports three types of attributes:
Multiplexing with Bundling: One UNI can be configured with multiple
EVCs and each EVC can map with multiple CE-VLAN IDs;
Multiplexing with no Bundling: One UNI can be configured with
multiple EVCs, but each EVC can map with only one CE-VLAN ID;
All to one Bundling: One UNI can be configured with only one EVC; all
CE-VLAN IDs are mapped to the EVC;
The port of the UNI-N end needs to be configured as the MEP node of one
CFM domain, with the CC (Continuity Check) function of CFM enabled. In
this way, the UNI-N end can get the connection status between the UNI
ends of the EVC configured on the PE device via 802.1ag, so as to get the
current operation status of EVC.
CE (Customer Edge): customer edge device
PE (Provider Edge): edge device of service provider
EFP (Ethernet Flow Point): Ethernet service instance
QINQ, ELMI, and CFM: Refer to the related technical manuals.
Application Description
EVC provides the public attributes and configurations and cooperates with
the related modules to realize the service functions. For details, refer
to EVC Configuration Manual. The main attributes of EVC are described as
follows:
Realization type of EVC: There can be multiple schemes to realize EVC.
Currently, Maipu switch supports QinQ.
EVC type: There are two types, that is, point-to-point and multipoint-to-
multipoint. Point-to-point means that there are only two UNI ports in one
EVC virtual connection, while multipoint-to-multipoint means that there
are multiple UNI ports in one EVC virtual connection, as follows:
Figure 18.1 point-to-point EVC
Figure 18.2 multipoint-to-multipoint EVC
Local MEP and remote MEP of EVC: MEP is the end point used to
maintain the connection and can send/receive any CFM packet. Each MEP
is identified by one integer, called MEP ID.
QINQ type: There are two kinds, that is, double and mapping. Double
supports mapping multiple CEVLANs to one single EVC, while
mapping only supports mapping one single CEVLAN to one single
EVC.
QINQ mode: There are two kinds, that is, one and multiple. In the one
mode, the SVLAN and CEVLAN of EVC do not need to be configured and the
port default values are adopted; the multiple mode has no limitation.
The combination of EVC and the related modules is described as follows:
1. The application combination of EVC and QINQ (for QINQ, refer to
QINQ Technical Manual):
Associate EVC with the local port and run the QinQ function on the port to
set up the EVC connection. When EVC is bound on the port, the QinQ
information in EVC is obtained according to the EVC ID and converted to
the port configuration. The UNI mapping type of the port should match the
information in the bound EVC. The detailed matching rules are as follows:
You can bind EVC only to the Hybrid and Trunk port, but cannot bind EVC
to the Access port.
If the UNI mapping of the port is ALL-TO-ONE, the port can be bound to
only one EVC and all CEVLANs are mapped to the EVC.
If the UNI mapping of the port is BUNDLING, the port can be bound to
multiple EVCs and each EVC can be configured with multiple CEVLANs. The
CEVLANs in the multiple bound EVCs cannot be the same and SVLANs
cannot conflict with each other.
If the UNI mapping of the port is MULTIPLEXING, the port can be bound to
multiple EVCs, but each EVC can be configured with only one CEVLAN. The
CEVLANs in the multiple bound EVCs cannot be the same and SVLANs
cannot conflict with each other.
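The matching rules can be sketched as a validation routine. This is a minimal sketch, not the device logic: the dictionary layout, the function name `check_port_binding`, and the reading of "SVLANs cannot conflict" as "SVLANs must be distinct" are all assumptions.

```python
def check_port_binding(port_type, uni_mapping, evcs):
    """Return a list of rule violations for the EVCs bound to one port.

    evcs is a list of dicts: {"id": str, "svlan": int, "cevlans": set}.
    """
    errors = []
    if port_type not in ("hybrid", "trunk"):
        errors.append("EVC can be bound only to Hybrid and Trunk ports")

    if uni_mapping == "all-to-one":
        # All CEVLANs map to the single EVC, so no per-CEVLAN checks apply.
        if len(evcs) > 1:
            errors.append("ALL-TO-ONE allows only one EVC on the port")
        return errors

    seen_cevlans, seen_svlans = set(), set()
    for evc in evcs:
        if uni_mapping == "multiplexing" and len(evc["cevlans"]) != 1:
            errors.append(f"{evc['id']}: MULTIPLEXING needs exactly one CEVLAN")
        if seen_cevlans & evc["cevlans"]:
            errors.append(f"{evc['id']}: CEVLAN already used by another EVC")
        if evc["svlan"] in seen_svlans:
            errors.append(f"{evc['id']}: SVLAN conflicts with another EVC")
        seen_cevlans |= evc["cevlans"]
        seen_svlans.add(evc["svlan"])
    return errors


evc1 = {"id": "evc1", "svlan": 100, "cevlans": {10, 11}}
evc2 = {"id": "evc2", "svlan": 200, "cevlans": {12}}
print(check_port_binding("trunk", "multiplexing", [evc1, evc2]))
```

Returning every violation instead of stopping at the first one makes it easier to correct a rejected configuration in one pass.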
2. The application combination of EVC and ELMI (for ELMI, refer to ELMI
Technical Manual)
Bind the E-LMI protocol on the connected ports of the PE and CE devices,
and run the E-LMI protocol in PE mode and CE mode respectively. Through
the E-LMI exchange, the CE device can get from the PE device the
configuration information and status information of all EVCs bound on the
port connected to the CE device. Meanwhile, when the EVC status on the PE
port changes, the PE device actively informs the CE device via the E-LMI
protocol so that the CE device updates at once.
3. The application combination of EVC and 802.1ag (for 802.1ag, refer to
802.1ag Technical Manual)
One EVC needs to be bound with the CFM management domain instance.
With the CFM management domain instance, you can get the connectivity
between the UNIs in the EVC.
The current status of EVC depends on the status of all local ports and
remote ports in EVC. The status of the remote ports is obtained via
802.1ag. Therefore, EVC needs to handle the following events in 802.1ag:
add remote MEP, remote MEP status UP, delete remote MEP, remote MEP
status DOWN, and delete CFM management domain information. EVC processes
these events to update its current status.
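How the listed 802.1ag events could drive the EVC status is sketched below. The status names ("active", "partially active", "not active"), the rule for combining local and remote states, and the class name `EvcState` are assumptions made for illustration only; the manual does not define them.

```python
class EvcState:
    def __init__(self, local_ports_up):
        self.local_ports_up = dict(local_ports_up)  # port name -> True if up
        self.remote_meps = {}                       # MEP ID -> True if UP
        self.domain_present = True

    def on_event(self, event, mep_id=None):
        """Apply one 802.1ag event and return the new EVC status."""
        if event == "add_remote_mep":
            self.remote_meps[mep_id] = False  # unknown until reported UP
        elif event == "remote_mep_up":
            self.remote_meps[mep_id] = True
        elif event == "remote_mep_down":
            self.remote_meps[mep_id] = False
        elif event == "delete_remote_mep":
            self.remote_meps.pop(mep_id, None)
        elif event == "delete_domain":
            self.remote_meps.clear()
            self.domain_present = False
        else:
            raise ValueError(f"unknown event: {event}")
        return self.status()

    def status(self):
        local = list(self.local_ports_up.values())
        remote = list(self.remote_meps.values())
        if not self.domain_present or not any(local) or not any(remote):
            return "not active"
        if all(local) and all(remote):
            return "active"
        return "partially active"


evc = EvcState({"0/0/1": True})
evc.on_event("add_remote_mep", 2)
print(evc.on_event("remote_mep_up", 2))
```

Recomputing the status from the stored port and MEP states after every event keeps the result independent of the order in which events arrive.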
Typical Application
The following figure shows one typical application instance of combining
EVC and E-LMI.
Figure 18.3 EVC networking instance
In the above figure, one EVC is defined. EVC_Provider indicates the
network connection of the service provider, which comprises PE1, PE2 and
PE3. The blue ellipse indicates one CFM management domain, Service
Provider Domain, whose level is 4. The three edge devices are three MEP
nodes of the domain. The CFM management domain is responsible for
checking the connectivity among the three MEPs, so as to confirm the
operation status of EVC_Provider.
Enable the E-LMI protocol on UNI1 between CE1 and PE1. CE1 gets the
UNI1 configuration information and the configuration and status
information of EVC_Provider from PE1 via the E-LMI protocol, so as to
complete the auto configuration function of CE1.
LLDP Technology
This chapter describes the LLDP technology and application.
Main contents:
Overview
LLDP working mechanism
TLV information type
Typical application of LLDP
Overview
LLDP (Link Layer Discovery Protocol) is the link layer protocol defined in
IEEE 802.1AB. It organizes the information of the local device as TLVs
(Type/Length/Value) and encapsulates them in LLDPDU (Link Layer Discovery
Protocol Data Unit), which is sent to the directly connected neighbor.
Meanwhile, LLDP saves the LLDPDU received from the neighbor in MIB
(Management Information Base). With LLDP, the device can save and
manage the information of itself and the directly connected neighbor
devices, so that the network management system can query the information
and judge the communication status of the link. LLDP does not configure
or control network elements or traffic; it only reports the L2
configuration. IEEE 802.1AB also allows the network management software
to use the information provided by LLDP to discover L2 configuration
inconsistencies.
LLDP Working Mechanism
LLDP has the following four working modes:
TxRx: transmit and receive LLDPDU
Tx: only transmit, but not receive LLDPDU
Rx: only receive, but not transmit LLDPDU
Disable: not transmit or receive LLDPDU
LLDPDU Transmitting Mechanism
When the port works in TxRx or Tx mode, it transmits LLDPDU to the
neighbor device periodically according to the specified interval. To
inform the neighbor device of a change in the local configuration as soon
as possible, enable the polling function on the device and configure the
polling period; when the polling time is reached, LLDPDU is transmitted at
once. If the polling function is not enabled, a change of the local
configuration does not trigger an LLDPDU at once; the change is sent with
the next LLDPDU of the transmitting period. To prevent frequent changes of
the local information from causing lots of LLDPDUs to be sent, the device
waits for a delay after transmitting each LLDPDU before transmitting the
next one.
When some LLDP configuration of the local device (such as holdtime or the
selection of the released TLV types) changes, or when the polling
mechanism finds that the LLDP configuration information of the local
system has changed after the polling function is enabled, the rapid
transmitting mechanism is used so that other devices discover the change
of the local device as soon as possible: the specified number of LLDPDUs
(3 by default) are transmitted continuously at once, and then the normal
transmitting period is restored.
When the device disables LLDP globally, or when the port on which LLDP is
enabled is shut down, is added into an aggregation group, has LLDP
disabled, or the system is reloaded, the device transmits one CLOSE TLV
LLDPDU so that the neighbor device rapidly learns that LLDP is disabled on
the local device.
LLDPDU Receiving Mechanism
When the port works in the TxRx or Rx mode, it checks the validity of the
received LLDPDU and the carried TLVs. After the validity check passes, it
saves the neighbor information to the local device and sets the aging time
of the neighbor information in the local device according to the TTL (Time
To Live) carried by LLDPDU. If the TTL value in the received LLDPDU is 0,
the neighbor information is aged at once.
Set the aging time of the local information on the neighbor device by
configuring holdtime. The default value is 120s. The maximum value of
holdtime is 65535s.
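The TTL-driven aging of neighbor information can be sketched as follows. It is a simplified model under stated assumptions: a neighbor is keyed by receiving port, chassis ID and port ID, time is passed in explicitly in seconds, and the class name `NeighborTable` is hypothetical.

```python
class NeighborTable:
    def __init__(self):
        self.expiry = {}  # (local port, chassis ID, port ID) -> expiry time

    def receive(self, port, chassis_id, port_id, ttl, now):
        """Store or refresh a neighbor from a validated LLDPDU."""
        key = (port, chassis_id, port_id)
        if ttl == 0:
            # TTL 0 means the neighbor information is aged at once.
            self.expiry.pop(key, None)
        else:
            self.expiry[key] = now + ttl

    def age(self, now):
        """Delete neighbors whose TTL has run out and return their keys."""
        expired = [key for key, end in self.expiry.items() if end <= now]
        for key in expired:
            del self.expiry[key]
        return expired


table = NeighborTable()
table.receive("0/0/1", "00017a000001", "0/0/1", ttl=120, now=0)
print(table.age(now=119), table.age(now=120))
```

The TTL value used here (120) matches the default holdtime stated in the text; each refreshed LLDPDU restarts the aging timer.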
TLV Information Type
The TLVs that can be encapsulated by LLDP include the basic TLVs, the TLVs
defined by organizations, and the related TLVs of MED (Media Endpoint
Discovery). The basic TLVs are the group of TLVs that form the basis of
network device management; the TLVs defined by organizations and the
related TLVs of MED are defined by standards organizations and other
organizations and are used to improve the management of network
devices. You can configure whether each TLV is transmitted in LLDPDU as
desired.
Basic Management TLV
In basic TLV, some types of TLV are mandatory for realizing the LLDP
function, that is, they must be released in LLDPDU, as shown in Table 19-1.
Table 19-1 Description of basic management TLV
TLV Type | Description | Whether to be released
End of LLDPDU TLV | Indicating the end of LLDPDU | Yes
Chassis ID TLV | The MAC address of the sending device | Yes
Port ID TLV | Used to identify the port of the LLDPDU sending end; when the device does not send MED TLV, the content is the port name; when the device sends MED TLV, the content is the MAC address of the port | Yes
Time To Live TLV | The life time of the local device information on the neighbor device | Yes
Port Description TLV | The description character string of the port | No
System Name TLV | The device name | No
System Description TLV | The system description | No
System Capabilities TLV | The main functions of the system and which functions are enabled | No
Management Address TLV | Management address and the corresponding interface number and OID (Object Identifier). The management address is the main IP address of the VLAN with the minimum VLAN ID among the VLANs permitted by the interface. If the VLAN with the minimum VLAN ID is not configured with the main IP address, the management address is 127.0.0.1. By default, the TLV is sent. | Yes
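The Type/Length/Value layout of the mandatory TLVs can be sketched as follows. The header layout (7-bit type, 9-bit length) and the type and subtype numbers are assumptions taken from IEEE 802.1AB rather than from this manual, and the helper names `tlv` and `mandatory_lldpdu` are hypothetical. The Port ID is encoded as the port name, as described for the case where no MED TLV is sent.

```python
import struct


def tlv(tlv_type, value=b""):
    """Encode one TLV: 7-bit type and 9-bit length, followed by the value."""
    if not 0 <= tlv_type <= 127 or len(value) > 511:
        raise ValueError("type or length out of range")
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


def mandatory_lldpdu(chassis_mac, port_name, ttl):
    """Build an LLDPDU body that carries only the mandatory TLVs."""
    return (
        tlv(1, b"\x04" + chassis_mac)                  # Chassis ID, subtype 4: MAC address
        + tlv(2, b"\x05" + port_name.encode("ascii"))  # Port ID, subtype 5: interface name
        + tlv(3, struct.pack("!H", ttl))               # Time To Live in seconds
        + tlv(0)                                       # End of LLDPDU
    )


pdu = mandatory_lldpdu(bytes.fromhex("00017a000001"), "0/0/1", ttl=120)
print(pdu.hex())
```

Optional TLVs, when released, would be placed between the Time To Live TLV and the End of LLDPDU TLV.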
TLV Defined by Organization
1. TLV defined by IEEE 802.1
Port VLAN ID TLV: VLAN ID of the port;
Protocol VLAN ID TLV: the protocol VLAN ID of the port;
VLAN Name TLV: the VLAN name of the port;
Protocol Identity TLV: the protocol type supported by the port;
The device does not support sending Protocol Identity TLV, but can receive
this type of TLV.
2. TLV defined by IEEE 802.3
MAC/PHY Configuration/Status TLV: the rate and duplex status of the
port, whether to support the auto negotiation of the port rate, whether
to enable the auto negotiation function and the current rate and
duplex status;
Power Via MDI TLV: the power capability of the port;
Link Aggregation TLV: Whether the port supports the link aggregation
and whether to enable the link aggregation;
Maximum Frame Size TLV: The supported maximum frame length,
adopting the configured MTU of the port (Max Transmission Unit);
Related TLV of LLDP-MED
LLDP-MED Capabilities TLV: The MED device type of the current device and
the LLDP MED TLV types that can be encapsulated in LLDPDU;
Network Policy TLV: The VLAN ID of the port, the supported
application (such as voice and video), the applied priority and used
policy;
Hardware Revision TLV: the hardware version of the device;
Firmware Revision TLV: the firmware version of the device;
Software Revision TLV: the software version of the device;
Serial Number TLV: the serial number of the device;
Manufacturer Name TLV: the manufacturer of the device;
Model Name TLV: the Model Name of the device;
Asset ID TLV: the asset ID of the device, for the directory
management and asset tracking;
Location Identification TLV: the location ID information of the
connection device, used by other devices in the application based on
the location;
Neighbor Storage Capability of LLDP
The LLDP protocol can receive LLDPDU and store the neighbor in the form of
the neighbor information. The LLDP protocol has a limitation on the storage
capability of the neighbors. Currently, a single port on Maipu switch
supports the information of 20 neighbors at most. The whole device
supports the storage of 2000 neighbors at most. If the number of the
neighbors reaches 2000, the notification packets of more neighbors are
dropped and are not saved.
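The two storage limits can be sketched as a simple admission check. The limits come from the text above; the table layout, the function name `store_neighbor`, and the assumption that refreshing an already stored neighbor is always accepted are illustrative only.

```python
MAX_PER_PORT = 20      # neighbors per port, from the text above
MAX_PER_DEVICE = 2000  # neighbors per device, from the text above


def store_neighbor(table, port, neighbor_id):
    """Try to store a neighbor; table maps port -> set of neighbor IDs.

    Returns True if the neighbor is stored, False if it is dropped.
    """
    neighbors = table.setdefault(port, set())
    if neighbor_id in neighbors:
        return True  # refresh of a known neighbor, no new storage needed
    total = sum(len(ids) for ids in table.values())
    if len(neighbors) >= MAX_PER_PORT or total >= MAX_PER_DEVICE:
        return False
    neighbors.add(neighbor_id)
    return True


table = {}
results = [store_neighbor(table, "0/0/1", n) for n in range(21)]
print(results.count(True), results.count(False))
```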
Typical Application of LLDP
Networking of configuring LLDP
As shown in the above figure, the port 0/0/1 of SW1 is connected with
port 0/0/1 of SW2; port 0/0/2 of SW1 is connected with port 0/0/2 of SW3.
Configure LLDP function on the three devices. The three devices can
exchange information via LLDPDU and query the neighbor information of
each other. The remote NMS can be connected to the device for network
management and topology collection, so as to realize the cluster
management.
MAC Address Table Management Technology
This chapter describes the management technology of the MAC address
table and application.
Management and Application of MAC Address Table
This section describes the management theory of the MAC address table.
Main contents:
Related terms
Introduction
Related Terms
Dynamic MAC address: the MAC address learned automatically from the
packets received by the switch. When a port receives one packet, the
switch searches the MAC address table for the source MAC address of the
packet. If it is not found, the switch associates the port, VLAN and
source MAC address and saves them in the MAC address table.
Static MAC address: the static forwarding MAC address configured by the
user via the shell command or SNMP proxy. The static MAC address has the
same function as the dynamic MAC address, but unlike the dynamic MAC
address, the static MAC address does not age.
Filter MAC address: the static filtering MAC address configured by the
user via the shell command or the SNMP proxy. When the source or
destination MAC address of a packet received by the switch is the filter
MAC address, the packet is discarded directly.
MAC address entry: formed by the information, such as MAC address,
VLAN, port number and the type of the MAC address.
Aging time: the existing time of the dynamic MAC address in the MAC
address table after the switch learns the MAC address.
Introduction
The MAC address entry contains the address information of the packet
forwarding between the ports. There are three types of addresses in the
MAC address entries, including static MAC address, dynamic MAC address,
and filter MAC address. The MAC address entry is formed by the
information, such as MAC address, VLAN, port number and the type of the
MAC address.
The static MAC address can only be set manually or via other software.
Compared with the dynamic MAC address, the static MAC address is not
aged and cannot be learned, but can only be added and deleted manually.
According to the function, the static MAC address is divided into three
kinds: the static MAC address that forwards packets normally (FWD), the
static MAC address that only transmits the packet to CPU without
forwarding it (TRAP), and the static MAC address that both transmits the
packet to CPU and forwards it (F&T).
The filter MAC address is global and functions on the whole switch. If one
MAC address is configured as the filter address, the host with that
address is prohibited from accessing the network via the switch, that is,
any packet whose destination or source MAC address is that address is
dropped.
The dynamic MAC address is the MAC address that is learned according to
the source MAC address of the packet after the switch receives the packet.
The MAC address entry is associated and saved according to the MAC
address, VLAN ID and port value, and the MAC address table updates its
entries in this way. When receiving a packet, the switch writes the source
MAC address into the MAC address table if it is not there yet, that is,
learns one MAC address. If the destination MAC address of the packet is in
the MAC address table, the packet is forwarded directly via the recorded
port. Otherwise, the packet is forwarded to the other member ports of the
VLAN to which the receiving port belongs, that is, the packet is flooded.
When the number of the MAC addresses learned by the port reaches the
maximum value, the port does not learn any more and floods the packet. If,
after learning one MAC address, the device does not receive any packet
with that address as the source MAC address before the aging time of the
dynamic MAC address arrives, the MAC address entry is deleted when the
aging time arrives.
With the port-based MAC address learning number limitation, the user
can limit the number of the dynamic MAC addresses learned
by each port. Usually, the maximum number of the MAC addresses that
can be learned by one port is 32767. When the number of the MAC
addresses learned by the port reaches 32767, the port does not learn MAC
addresses any more. A new MAC address can be learned only after existing
MAC addresses are aged; once it is learned, the packets destined for the
new address are no longer flooded.
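The learning, forwarding, flooding, filtering and aging behavior described in this section can be sketched as follows. It is a simplified model, not the switch implementation: the class name `MacTable`, the entry layout, and the explicit `now` timestamps are assumptions, and the aging time is a parameter because the manual does not state a default here.

```python
class MacTable:
    def __init__(self, aging_time, port_limit=32767):
        self.aging_time = aging_time
        self.port_limit = port_limit
        self.entries = {}      # (vlan, mac) -> [port, kind, last_seen]
        self.filtered = set()  # filter MAC addresses, global to the switch

    def add_static(self, vlan, mac, port):
        self.entries[(vlan, mac)] = [port, "static", None]

    def receive(self, in_port, vlan, src, dst, vlan_ports, now):
        """Return the list of egress ports for one received packet."""
        if src in self.filtered or dst in self.filtered:
            return []  # filter MAC address: drop the packet
        self._learn(in_port, vlan, src, now)
        entry = self.entries.get((vlan, dst))
        if entry is not None:
            return [entry[0]] if entry[0] != in_port else []
        return [p for p in vlan_ports if p != in_port]  # unknown: flood

    def _learn(self, in_port, vlan, src, now):
        entry = self.entries.get((vlan, src))
        if entry is not None:
            if entry[1] == "dynamic":  # static entries are never overwritten
                entry[0], entry[2] = in_port, now
            return
        learned = sum(1 for port, kind, _ in self.entries.values()
                      if kind == "dynamic" and port == in_port)
        if learned < self.port_limit:
            self.entries[(vlan, src)] = [in_port, "dynamic", now]

    def age(self, now):
        """Delete dynamic entries not refreshed within the aging time."""
        old = [key for key, (_, kind, seen) in self.entries.items()
               if kind == "dynamic" and now - seen >= self.aging_time]
        for key in old:
            del self.entries[key]


table = MacTable(aging_time=300)
ports = ["p1", "p2", "p3"]
print(table.receive("p1", 10, "A", "B", ports, now=0))
print(table.receive("p2", 10, "B", "A", ports, now=1))
```

The first packet is flooded because B is unknown; the reply is forwarded only to the port where A was learned.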
The function of the static forwarding MAC address and dynamic MAC
address is fast forwarding, that is, the MAC address table is one fast
forwarding table, which can make the packet be forwarded via the
specified port rapidly and correctly, so as to prevent the packet from being
broadcast in the whole VLAN.
Note
The static MAC address entries configured by the user manually and the
filter MAC address entries are not overwritten by the dynamic MAC address
entries, but the dynamic MAC address entries can be overwritten by the
static MAC address entries and the filter (black-hole) MAC address
entries.
PWE3 Technology (Only for S3400/S3900)
PWE3 is an L2 VPN protocol that provides tunnels on the packet switching
network (IP/MPLS) to emulate some services (FR, ATM, Ethernet, TDM, and
SONET/SDH). The protocol can help to connect the traditional
network with the packet switching network, so as to realize the sharing of
resources and the expansion of the network. The protocol is an extension
of the Martini protocol. It adds new signaling (reducing the signaling
overhead) and specifies the multi-hop negotiation mode to make the
networking of the protocol more flexible. This chapter describes the
theory, key technologies, and typical applications of the circuit
emulation in the packet network.
The circuit emulation in the packet network is a technology of bearing
traditional TDM data on the packet switching network (PSN). It adopts the
circuit emulation mode in the PWE3 framework to provide end-to-end
transmission for PDH and SDH data flows on the packet switching network.
The main contents of the chapter:
Basic concepts
Technology theory
Realizing method
Typical application
Basic Concepts
With the evolution of the network technology and the network
convergence, the network data transmission and switching mode with the
packet as the basic unit will be dominant in the next generation
network. Both IP network and MPLS network are representatives of the
packet switching network. However, the next generation network (NGN)
cannot be constructed overnight. The current PDH/SDH network serving
PSTN public voice communication services will exist for a long time, and
the existing TDM devices of users on the network will still be used. To
protect the investments of users on the TDM devices, it is necessary to
provide the capabilities of accessing the TDM services and transmitting the
TDM data transparently in the next generation packet switching network.
For the data transparent transmission of the TDM circuit switching service
on the packet switching network, several standard organizations put
forward their own standards and solutions. Currently, the TDM circuit
emulation is the most mature.
Background of TDM Circuit Emulation Technology
At first, the TDM circuit emulation technology was intended to realize the
transparent transmission of the TDM circuit switching data on the IP
network. It appeared as a competitor of the VoIP technology, providing the
voice transmission service via the IP network with a processing flow that
is simpler than that of the VoIP protocol. The initial TDM service
transparent transmission devices only supported the transparent
transmission of the E1 and DS1/DS0 services. With the packet switching
network gradually becoming dominant in the NGN solutions, especially
with the rise of the Metro-E technology, the TDM circuit emulation
technology has become an important technology of transmitting TDM
services on the packet switching network. Currently, many protocol drafts
or technology standards are complete for the transparent transmission of
the E1/T1/E3/T3 structured and non-structured TDM service, the structured
transparent transmission of the SDH service, and the transmission of the
PDH and SDH signaling.
Related Technology Standards
The related standards of TDM circuit emulation technology are mainly from
four international standard organizations, that is, IETF, ITU-T, MEF, and
MFA. The organizations cooperate with each other. The transparent
transmission standards of the TDM service put forward by different
organizations are basically similar and differ slightly in the
specific technical details such as the data encapsulation format. Among
the standard organizations, IETF PWE3 working group plays a leading role
in making the transparent transmission standards of the TDM service. The
organization not only defines the standards of the technology at the data
layer, but also defines the standards at the control and management
layers, while the standards of other organizations mainly focus on the data
encapsulation method.
The standards put forward by MEF focus on how to encapsulate the original
TDM service in the Ethernet frame, while the standards of MFA focus on
how to bear the TDM service on the MPLS network. ITU-T standards also
focus on the data layer. They provide the mode of MPLS bearing the TDM
service data and the mode of IP bearing the TDM service data. Besides,
ITU-T defines the clock transmission solutions that are important for the
TDM service.
Commonly-used Terms
PWE3 (Pseudo Wire Emulation Edge-to-Edge): IETF defines the meaning
of PW in RFC3985, that is, an emulation of using the packet switching
network to bear the native service;
IWF (Interworking Function): the device that switches the data between
two different networks;
CE (Customer Edge): the device to initiate and terminate the TDM service;
PE (Provider Edge): the device that provides the PWE, which is equivalent
to IWF;
AC (Attachment Circuit): the connection link or virtual link between CE
and PE; all data on AC is required to be sent to the peer end without any
change;
Bundle: the bit flow sent by the TDM circuit of the PE devices at the two
sides of the PW; it can comprise any number of 64 kbps time slots in one
E1 or T1. Bundle is a uni-directional data flow. It is usually paired with
the Bundle in the opposite direction to form the full-duplex
communication. There can be several Bundles between two PE devices.
CESoPSN (Circuit Emulation Services over Packet Switched Network): It is
the emulation that concerns the structure of the TDM data frame;
SAToP (Structure-Agnostic TDM over Packet): It is the emulation that
does not identify the structure of the TDM data frame;
TDMoIP (Time Division Multiplexing over Internet Protocol): It is the
emulation related with the contents of the TDM data.
CAS: Channel Associated Signaling
Technical Theory
IETF PWE3 working group plays a leading role in making the standards of
the TDM service transparent transmission, so the standards of the TDM
service transparent transmission made by IETF PWE3 working group are
the most complete and become the mainstream standards in the field. The
following introduces the TDM transparent transmission technology by
analyzing the TDM PWE3 technical scheme.
TDM PWE3 Technical Scheme
PW Theory
PW is a mechanism that transmits the key elements of one emulated
service from one PE to another or several other PEs via the PSN. It
emulates various services (ATM, FR, HDLC, PPP, TDM, and Ethernet) via
one tunnel (IP/L2TP/MPLS) on the PSN network. The PSN network can
transmit various data payloads. The tunnel used by the scheme is defined
as the Pseudo Wire. The inner data service carried over the PW is
invisible to the core network, that is to say, the core network is
transparent for the CE data flow.
Figure 21-1 PW schematic
The PW scheme provides a technical frame. In the frame, various services
can use the PW to be transmitted transparently on the PSN network. TDM
Pseudo Wires emulation is a technology that uses the PW to emulate the
TDM service data on the PSN network.
Elements of TDM Emulation Service
When the PW mode is used to emulate the transmission of the TDM service on
the PSN network, the following elements need to be transmitted to the
other side of the PW.
1. TDM service data
2. The frame format of the TDM service data
3. The alarm and signaling of the TDM service at the AC side
4. TDM synchronous timing information
TDM Emulation Protocol
TDM circuit emulation service uses the special circuit emulation packet
header to encapsulate the TDM service data. The special packet header
carries the frame format information, alarm information, signaling
information and synchronous timing information of the TDM service data.
The encapsulated packet is called the PWE3 packet. The PWE3 packet is then
carried by the IP, MPLS, or L2TPv3 protocol to cross the corresponding
packet switching network. After the packet reaches the exit of the PW
tunnel, it is decapsulated and the TDM circuit switching service data
flow is reconstructed.
The following describes several TDM circuit emulation encapsulation
protocols.
1. SAToP protocol (RFC4553)
RFC4553 provides the emulation function for the low-rate PDH circuit
services such as E1/T1/E3/T3. SAToP transmits the unstructured (that
is, unframed) E1/T1/E3/T3 service data. It treats the TDM service as a
serial data stream, segments and encapsulates it, and transmits it on the
PW tunnel. Among the elements of the TDM emulation service described in
the above section, the protocol can provide the transparent transmission
of the TDM service data and the transmission of the synchronous timing
information, but cannot identify the TDM frame structure. Therefore, the
information about the TDM frame structure and the signaling in the TDM
frame cannot be identified and processed, and can only be transmitted
transparently. The protocol is the simplest mode of transmitting the PDH
low-rate service transparently in the TDM circuit emulation scheme.
Because it is simple to realize, it has been released by IETF as a formal
RFC standard.
RFC4553 provides three optional PW outer tunnel encapsulation modes:
UDP/IP mode, L2TPv3 mode, and MPLS mode. UDP/IP mode uses the UDP/IP
packet header to encapsulate the PWE3 packet and uses different UDP
port numbers to distinguish different PW outer tunnels. This
encapsulation mode is suitable for a pure IP network. Currently, the
TDM circuit emulation service developed by Maipu supports the UDP/IP
mode.
The L2TPv3 mode uses the L2TPv3 packet header to encapsulate the PWE3
packet and uses different session IDs to distinguish different outer
tunnels. The mode can use L2TPv3 protocol negotiation to set up the
outer tunnel and to assign different session IDs to the different PWs
in the tunnel. It is more flexible to use than the UDP/IP mode.
MPLS mode uses MPLS labels to encapsulate the PWE3 packet and uses the
LSP as the outer tunnel of the PW. The PW label is the innermost label
of the MPLS label stack. In MPLS mode, labels can be distributed and
managed dynamically via the LDP protocol, so MPLS mode is more
convenient to use than the manual binding of UDP/IP mode. Meanwhile,
several layers of MPLS labels can be stacked to nest PW outer tunnels,
which makes the mode convenient to apply in larger networks.
2. CESoPSN protocol
Compared with SAToP, CESoPSN provides structured TDM service emulation,
that is, it can identify, process, and transmit the frame structure
and the signaling in the TDM frame. Take E1 as an example. A structured
E1 comprises 32 time slots. Except for time slot 0, each of the other
31 time slots can carry one 64Kbps voice service. Time slot 0 is used
to transmit the signaling and the frame alignment signal. Because the
CESoPSN protocol can identify the frame structure of the TDM service,
idle time slots do not need to be transmitted. Only the time slots of
the E1 service flow that the CE device actually uses are encapsulated
into the PWE3 packet. Meanwhile, the protocol can identify and
transmit the CAS and CCS signaling in the E1 service flow.
The CESoPSN protocol scheme also provides three optional PW outer
tunnel encapsulation modes: UDP/IP mode, L2TPv3 mode, and MPLS mode.
Different from the SAToP protocol, the TDM service data carried inside
the PW by the CESoPSN protocol has a frame structure. Meanwhile, the PW
control word in the PWE3 packet has the M field to indicate the result
of the signaling check at the AC side. Currently, the TDM service
products developed by Maipu support the CESoPSN protocol in UDP/IP
encapsulation mode.
Besides the TDM service data, CESoPSN provides the scheme of identifying
and transmitting the CAS signaling.
3. TDMoIP protocol
TDMoIP describes the PW encapsulation modes (UDP/IP mode, L2TPv3 mode,
MPLS mode, and MEF mode) on different PSNs. Both SAToP and CESoPSN take
the TDM bit flow as the payload encapsulated by the PW, while TDMoIP
adds three new TDM payload types, that is, the AAL1 payload, AAL2
payload, and HDLC payload. Currently, the TDM service products
developed by Maipu support the HDLC TDM payload.
Besides, the PWE3 working group of IETF defines the structured circuit
emulation scheme for the high-end and low-end channel of SONET/SDH to
transmit the VC11/VC12 and VC2 TDM service data transparently via the
PWE3 mode.
Other Technical Schemes
Besides the PWE3 working group of IETF, MEF, MFA, and ITU-T also define
protocol standards related to circuit emulation. For example, MEF8.0
defines TDM circuit emulation packets encapsulated directly in raw
Ethernet, and distinguishes the different TDM circuit emulation data
flows by different ECIDs.
Figure 21-2 Mapping relation between the function layer and MEF
packet encapsulation
In the MEF8.0 standard defined by MEF, the CESoETH control word is
compatible with the PW control word defined by IETF, and the RTP header
follows the RFC3550 standard of IETF. MEF8.0 also uses a PWE3 tunnel to
transmit the TDM service transparently, but the bearer layer is raw
Ethernet.
Key Technologies
Data Jitter Buffer
After PW packets cross the packet switching network and reach the exit
PE device, their arrival intervals may vary and the packets may be out
of order. To ensure that the TDM service data flow can be reconstructed
on the exit PE device, a jitter buffer is needed to smooth the arrival
intervals of the PW packets and to re-order the packets that are out of
order. The capacity of the jitter buffer is a trade-off. A jitter
buffer with large capacity can absorb large variations of the packet
arrival interval in the network, but introduces a large delay when the
TDM service data flow is reconstructed. A good policy is to provide a
jitter buffer whose capacity the user can configure and adjust
according to the delay and jitter of the network. Currently, the TDM
circuit emulation products developed by Maipu support configuring
jitter buffers with different capacities via the command.
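The re-ordering and capacity trade-off described above can be sketched as follows. This is a minimal illustration, not Maipu's implementation: the class name `JitterBuffer`, the `depth` parameter (counted in packets), and the min-heap approach are all assumptions, and sequence number wrap-around is ignored for brevity.

```python
import heapq


class JitterBuffer:
    """Hypothetical jitter buffer: re-orders PW packets by sequence number.

    `depth` is the number of packets held back before playout. A larger
    depth tolerates more reordering and jitter but adds more delay.
    """

    def __init__(self, depth):
        self.depth = depth
        self._heap = []  # min-heap of (sequence_number, payload)

    def push(self, seq, payload):
        """Store a packet as it arrives from the PSN, in any order."""
        heapq.heappush(self._heap, (seq, payload))

    def pop_ready(self):
        """Release packets in sequence order while more than `depth` are held."""
        ready = []
        while len(self._heap) > self.depth:
            ready.append(heapq.heappop(self._heap))
        return ready

    def flush(self):
        """Release everything that is left, still in sequence order."""
        ready = []
        while self._heap:
            ready.append(heapq.heappop(self._heap))
        return ready


if __name__ == "__main__":
    jb = JitterBuffer(depth=2)
    played = []
    for seq in [3, 1, 2, 5, 4]:          # packets arrive out of order
        jb.push(seq, b"tdm-payload")
        played.extend(jb.pop_ready())
    played.extend(jb.flush())
    print([seq for seq, _ in played])
```

With `depth=2` the buffer holds two packets back, which is enough to restore the order of the arrival pattern used in the example; a smaller depth would reduce delay but could release packets out of order.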
Recover Clock Timing Information
TDM networks that adopt circuit switching (such as the SDH network)
natively have the capability of transmitting network synchronous
timing information, but most packet switching networks, especially the
current Ethernet network, do not have the function. Currently, there
are the following solutions.
1. Adopt the auto-sensing packet recovering algorithm: Use the
time window smoothing method and the auto-sensing algorithm to
extract the synchronous timing information from the PWE3
packets at the exit, so that the reconstructed TDM service data
flow is approximately synchronous with the sending end. But the
algorithm has limitations. In particular, when the packet loss
and transmission delay in the network change greatly, the
synchronous timing information cannot be recovered correctly.
2. Adopt synchronous Ethernet to transmit the clock: Upgrade the
clock system of the current Ethernet network and bring the SDH
idea of network-wide synchronous timing transmission into the
design of the Ethernet network.
3. TDM circuit emulation only transmits the service data. The
synchronous timing information is transmitted by another
synchronous timing system, such as the clock sent by the GPS
system or the clock sent by the synchronous clock network.
Check Link Fault
Link fault checking includes fault checking at the AC side, fault
checking of the PW tunnel link, and the series of actions taken after a
fault is found, such as notifying the peer side and switching away from
the faulty link. Currently, there are technical drafts covering link
fault checking at the AC side and peer notification. Fault checking of
the PW tunnel link also has many optional technologies, such as the
MPLS-OAM technology and the Ethernet OAM technology.
Analyze Packet Delay
For services with strict real-time requirements, such as voice
transmission, data delay and jitter greatly affect the service quality
and must be considered. When the TDM PW emulation mode is used to
transmit the TDM service transparently, the data delay comprises packet
encapsulation delay, service processing delay, and network transmission
delay.
1. Packet encapsulation delay is generated when the TDM service flow is
encapsulated into PWE3 packets. This delay is specific to the TDM
circuit emulation technology. Take E1 as an example. The E1 rate
is 2.048Mbps; each frame contains 32 time slots (256 bits); 8000
frames are transmitted every second; the duration of each frame is
0.125ms. If the structured encapsulation mode is adopted and every
four frames are encapsulated as one PW packet, the delay for
encapsulating one PW packet is 4×0.125ms=0.5ms. The more frames are
encapsulated in one packet, the larger the encapsulation delay.
2. Service processing delay is the time the device takes to process the
packet, including the packet validity check, packet filtering, parity
check and calculation, packet encapsulation, and receiving and
sending. The delay depends on the service processing capability of
the device. For a given device, it is fixed.
3. Network transmission delay is generated when the PWE3 packet
travels from the ingress PE to the egress PE via the packet
switching network. It varies greatly with the network topology and
the network traffic. It is also the main source of service jitter.
The jitter buffer technology can absorb the jitter, but it cannot
remove the delay.
The TDM service delay depends on the above three kinds of delays.
Channelized and Non-channelized Technologies
Non-channelized service transmission in TDM Pseudo Wire Emulation is
un-structured transmission. It does not identify the data format in the
TDM service flow and only processes the TDM data as a serial bit
stream. The RFC4553 (SAToP) un-structured encapsulation protocol
requires that un-structured circuit emulation at the E1 rate must
support 256 bytes as a basic payload unit, that is, the E1 frame
structure is not identified, but the TDM bit stream is segmented in an
integral multiple of the E1 frame length (256 bytes equal eight 32-byte
E1 frames) and encapsulated into PWE3 packets. Likewise, RFC4553
requires un-structured T1 circuit emulation to support 192 bytes as a
basic payload unit, and un-structured E3/T3 circuit emulation to
support 1024 bytes.
Correspondingly, the channelized service is structured TDM Pseudo Wire
Emulation. It needs to identify the frame format in the TDM service
flow, and the TDM bit stream must be segmented at the frame delimiter.
For example, an E1 frame must be segmented at the beginning of time
slot 0. Because the segmenting starts from the frame delimiter, the 32
time slots in the E1 frame can be identified for structured processing.
The structured processing for T1 and E3/T3 is similar.
Comparing the two modes, the un-structured mode is simpler. It does not
need to identify the frame format in the TDM data flow and is more
commonly used. For devices in the traditional data network that use
E1/T1 as a synchronous serial interface (that is, ignore the frame
format) and adopt clear channel transmission, un-structured TDM Pseudo
Wire Emulation is more convenient.
Structured (channelized) mode is more complicated. It needs to identify
the frame alignment signal in the TDM data flow. The time slots in the
frame and the signaling information carried by some special time slots
must be identified and processed. When the TDM interface works in the
framed mode and communicates via only some time slots of E1/T1,
structured TDM Pseudo Wire Emulation improves the bandwidth
utilization. It can distinguish the time slots being used in one E1
circuit from the idle time slots, encapsulate only the time slots in
use into the PWE3 packet, and discard the idle time slots. In this way,
network transmission bandwidth is saved. Besides, the structured mode
can insert time slots between different E1/T1 interfaces, so as to
further improve the bandwidth utilization.
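The bandwidth saving of the structured mode can be estimated with simple arithmetic. The sketch below is illustrative only: the function name `payload_rate_bps` is hypothetical, and the figures cover the TDM payload alone, ignoring the UDP/IP, control word, and RTP header overhead.

```python
FRAMES_PER_SECOND = 8000  # E1 frame rate
BITS_PER_TIME_SLOT = 8    # one byte per time slot per frame
E1_TIME_SLOTS = 32        # time slots in one E1 frame


def payload_rate_bps(used_time_slots):
    """TDM payload rate, in bit/s, needed to carry the given number of time slots."""
    if not 0 < used_time_slots <= E1_TIME_SLOTS:
        raise ValueError("an E1 frame has between 1 and 32 time slots")
    return used_time_slots * BITS_PER_TIME_SLOT * FRAMES_PER_SECOND


if __name__ == "__main__":
    unstructured = payload_rate_bps(E1_TIME_SLOTS)  # the whole E1 bit stream
    structured = payload_rate_bps(2)                # e.g. only two time slots in use
    print(unstructured, structured)                 # 2048000 128000
```

Carrying two time slots therefore needs 128 kbps of payload instead of the full 2.048 Mbps, before header overhead is added.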
Realizing Methods
PWE3 Packet Format
Currently, Maipu only supports PWE3 packets encapsulated in the
UDP/IPv4 mode. As shown in Figure 21-3, the TDM service data is
encapsulated in the TDMoIP PAYLOAD of the packet.
Figure 21-3 PWE3 packet encapsulated in UDP/IP mode
The format of the UDP/IPv4 header is as shown in Figure 21-4. The
source IP address is the local address of the Pseudo Wire. The source
addresses of the PWE3 packets sent from the local end are the same. The
destination IP address is the remote address of the Pseudo Wire. The
destination IP addresses of the PWE3 packets sent to different Pseudo
Wires are different. The UDP destination port number is fixed as 2142,
which is the port number assigned by IANA to TDM over IP. It identifies
a PWE3 packet encapsulated in the UDP/IP mode. The UDP source port
number is used to distinguish the PWE3 packets of different bundles on
one Pseudo Wire and the value range is 1-8063.
Figure 21-4 UDP/IPv4 head format
The control word gives the PWE3 packet a way to exchange the TDM
circuit status and the PSN status. The format is as shown in Figure
21-5. RES is the reserved field and must be set to 0. The L bit
indicates a local loss of synchronization: setting the L bit to 1 means
that a fault at the TDM physical layer has been detected locally or
reported to the local end. Such a fault makes the data incomplete, so
the bit can be used to indicate loss of synchronization at the physical
layer and to trigger the generation of the AIS signal at the remote
side. After the TDM fault is fixed, the L bit is cleared. The R bit
indicates a remote receiving fault. Setting the R bit to 1 means that
the remote side does not receive packets from the Ethernet port. The R
bit can be used to advertise congestion or other network faults.
Receiving the remote fault indication can trigger the rollback
mechanism to avoid the congestion. The R bit is set to 1 after a
pre-set number N of successive packets are not received; after packets
are received again, the R bit is cleared.
The FRG field indicates the fragmentation type and is used for the CAS
multi-frame structure in the CESoPSN protocol. FRG 00 means that the
whole multi-frame is in one packet; 01 means that the packet carries
the first segment of the multi-frame; 10 means that the packet carries
the last segment of the multi-frame; 11 means that the packet carries
a middle segment of the multi-frame. The LENGTH field is the total
number of bytes of the control word, payload, and RTP header (if
present); it is used when the length is less than 64 bytes. When the
length is 64 bytes or more, the field is set to 0. The SEQUENCE NUMBER
field is the serial number of the packet. The initial value is random
and the number increases with each packet sent. After reaching the
maximum value, it wraps around to 0. The field is used to check
whether packets are lost.
Figure 21-5 Control word format
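The control word fields described above can be packed and parsed as in the following sketch. The figure is not reproduced in this transcript, so the bit positions are an assumption: the code follows the 32-bit layout of RFC 5086 (4 reserved bits, L, R, 2-bit M, 2-bit FRG, 6-bit LENGTH, 16-bit SEQUENCE NUMBER). The function names are hypothetical.

```python
import struct


def pack_control_word(l_bit, r_bit, frg, total_length, seq, m=0):
    """Build a 4-byte control word (bit layout assumed from RFC 5086).

    total_length is the size in bytes of control word + payload + RTP
    header; it is only carried when it is below 64, otherwise LENGTH is 0.
    """
    length_field = total_length if total_length < 64 else 0
    word = (
        ((l_bit & 0x1) << 27)           # L: local TDM fault
        | ((r_bit & 0x1) << 26)         # R: remote receive fault
        | ((m & 0x3) << 24)             # M: modifier bits
        | ((frg & 0x3) << 22)           # FRG: 00 whole, 01 first, 10 last, 11 middle
        | ((length_field & 0x3F) << 16)
        | (seq & 0xFFFF)
    )                                   # top 4 bits (RES) stay 0
    return struct.pack("!I", word)


def unpack_control_word(data):
    """Parse the first 4 bytes of `data` into a dict of control word fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "L": (word >> 27) & 0x1,
        "R": (word >> 26) & 0x1,
        "M": (word >> 24) & 0x3,
        "FRG": (word >> 22) & 0x3,
        "LENGTH": (word >> 16) & 0x3F,
        "SEQ": word & 0xFFFF,
    }


def next_sequence(seq):
    """Sequence numbers wrap around to 0 after the 16-bit maximum."""
    return (seq + 1) & 0xFFFF


if __name__ == "__main__":
    cw = pack_control_word(l_bit=1, r_bit=0, frg=0b01, total_length=40, seq=0xFFFF)
    print(cw.hex(), unpack_control_word(cw), next_sequence(0xFFFF))
```

A receiver can detect loss by comparing the SEQ of each arriving packet with `next_sequence` of the previous one.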
The RTP header carries the clock information and helps the receiving
end recover the TDM clock from the PSN. The format is as shown in
Figure 21-6. V is the version and is fixed as 2. P is the padding bit
and is fixed as 0. CC is the CSRC count and is fixed as 0. M is the
marker bit and is fixed as 0. The PT field is the payload type and the
value of each bundle is unique. SN is the serial number of the packet
and is the same as SEQUENCE NUMBER in the control word. TS is the time
stamp and has two generating modes, that is, the absolute mode (it is
derived from the clock recovered on the TDM line and increases by 1
every 125 µs) and the relative mode (it is derived from the common
clock and increases by 1 for every bit received). SSRC identifies the
synchronization source.
Figure 21-6 RTP head format
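As an illustration of the fixed field values listed above, the following sketch packs a 12-byte RTP header in the RFC3550 layout. The function name is hypothetical, and the X (extension) bit, which the text does not mention, is assumed to be 0.

```python
import struct


def pack_rtp_header(payload_type, seq, timestamp, ssrc):
    """Build a 12-byte RTP header with V=2, P=0, X=0, CC=0, M=0."""
    byte0 = 2 << 6                # V=2 in the top two bits; P, X, CC all 0
    byte1 = payload_type & 0x7F   # M=0 in the top bit, 7-bit PT below it
    return struct.pack(
        "!BBHII",
        byte0,
        byte1,
        seq & 0xFFFF,             # SN, same value as SEQUENCE NUMBER
        timestamp & 0xFFFFFFFF,   # TS
        ssrc & 0xFFFFFFFF,        # SSRC
    )


if __name__ == "__main__":
    header = pack_rtp_header(payload_type=96, seq=1, timestamp=8, ssrc=0x11223344)
    print(len(header), header.hex())
```

The payload type 96 and the SSRC value in the example are arbitrary placeholders, not values taken from the product.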
SAToP Protocol
The TDM port on the PE works in the unframed mode. It ignores the frame
structure of the received TDM data and treats the data as a bit stream
with a fixed rate. As shown in Figure 21-7, SAToP processes the TDM
flow in units of bytes (8 bits). Every N received TDM bytes are
encapsulated into the TDM payload of a PWE3 packet and sent to the PSN.
After the PE device at the other side of the Pseudo Wire receives the
packet, it decapsulates the TDM payload from the PWE3 packet and sends
it to the TDM port.
Figure 21-7 SAToP sketch map
A PWE3 packet is generated only after N TDM bytes are received, so a
fixed delay is introduced, which is called the packet encapsulation
delay (PCT).
The PCT calculation method of SAToP: PCT=N×8×bit time=N×8÷bit rate
Take E1 as an example. The E1 rate is 2.048Mbps; 2048000 bits are
transmitted every second; each bit time is about 488ns. If every 256
bytes are encapsulated as one PWE3 packet, the delay for encapsulating
one PWE3 packet is 256×8×488ns≈1ms. The packet encapsulation delay
increases with the number of encapsulated bytes. The more bytes are
encapsulated in one packet, the larger the packet encapsulation delay
and the fewer packets are generated per unit time.
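The SAToP formula PCT=N×8÷bit rate can be checked numerically. The sketch below uses hypothetical function names and takes N=256 bytes on an E1 line as the example.

```python
E1_BIT_RATE_BPS = 2_048_000


def satop_pct_seconds(payload_bytes, bit_rate_bps=E1_BIT_RATE_BPS):
    """PCT = N x 8 / bit rate, with N the number of TDM bytes per packet."""
    return payload_bytes * 8 / bit_rate_bps


def packets_per_second(payload_bytes, bit_rate_bps=E1_BIT_RATE_BPS):
    """Packet rate is the inverse of the PCT: larger payloads mean fewer packets."""
    return bit_rate_bps / (payload_bytes * 8)


if __name__ == "__main__":
    for n in (64, 128, 256, 512):
        print(n, satop_pct_seconds(n) * 1000, "ms", packets_per_second(n), "packets/s")
```

Using the exact bit rate gives exactly 1 ms for 256 bytes; the rounded bit time of 488 ns gives a value slightly below 1 ms.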
CESoPSN Protocol
The TDM port on the PE works in the framed mode, which is divided into
non-CAS and CAS modes according to the TDM service type.
Non-CAS mode
As shown in Figure 21-8, CESoPSN processes the TDM flow in units of
frames. After every N frames are received, the data of the specified
time slots (time slots 4 and 25 in the figure) is encapsulated into the
TDM payload of a PWE3 packet and sent to the PSN. After the PE device
at the other side of the Pseudo Wire receives the packet, it
decapsulates the TDM payload from the PWE3 packet, inserts the data
into the specified time slots (time slots 4 and 25) respectively, and
then sends it to the TDM port.
Figure 21-8 CESoPSN sketch map of non-CAS
In this mode, PCT=N×frame time=N÷frame rate. Take E1 as an example.
The E1 rate is 2.048Mbps; every frame contains 32 time slots; 8000
frames are transmitted every second; the frame time is 0.125ms; every
32 frames are encapsulated as one PWE3 packet. Therefore, the delay for
encapsulating one PWE3 packet is 32×0.125ms=4ms. The packet
encapsulation delay increases with the number of frames encapsulated
in the PWE3 packet. The more frames are encapsulated, the larger the
packet encapsulation delay and the fewer packets are generated per
unit time.
CAS mode
As shown in Figure 21-9, the TDM flow has the CAS multi-frame
structure, that is, each multi-frame comprises 16 base frames. Time
slot 16 of each base frame is used to carry the signaling and the
multi-frame synchronization. CESoPSN processes the TDM flow in units of
CAS multi-frames. The data of the specified time slots (time slots 2,
4, and 25) in each base frame is encapsulated into the TDM payload of
the PWE3 packet in order, beginning with the first base frame of the
multi-frame and ending with the last base frame of the multi-frame.
Finally, the corresponding signaling information is appended to the end
of the time slot data, and the packet is sent to the PSN. After the PE
device at the other side of the Pseudo Wire receives the packet, it
decapsulates the TDM payload from the PWE3 packet and inserts the data
into the specified time slots (time slots 2, 4, and 25) respectively.
Meanwhile, it inserts the signaling into time slot 16, and then sends
the flow to the TDM port.
Figure 21-9 CESoPSN sketch map of CAS
In this mode, PCT=the number of the base frames in the multi-frame×
frame time=the number of the base frames in the multi-frame÷frame
rate. Take the E1 CAS multi-frame as an example. The E1 rate is
2.048Mbps; each frame contains 32 time slots; 8000 frames are
transmitted every second; the frame time is 0.125ms; each CAS
multi-frame contains 16 base frames. Therefore, the delay for
encapsulating one PWE3 packet is 16×0.125ms=2ms.
If the multi-frame contains many base frames, the packet encapsulation
delay is large and may not meet the delay target required by the
system. The CAS multi-frame segmenting mode can solve the problem, as
shown in Figure 21-10. The multi-frame is divided into N sub
multi-frames and each sub multi-frame contains M base frames. CESoPSN
processes the TDM flow in units of sub multi-frames. Each sub
multi-frame corresponds to one PWE3 packet. The signaling information
is appended to the last sub multi-frame of the multi-frame. The FRG in
the control word of the PWE3 packet that contains the first sub
multi-frame of the multi-frame is set to 01; the FRG of the packet
that contains the last sub multi-frame is set to 10; the FRG of the
packets that contain the middle sub multi-frames is set to 11. The PE
device at the other side of the Pseudo Wire can decapsulate the time
slot data and the signaling according to the FRG in the control word
of the PWE3 packet.
In the segmenting mode, PCT=the number of the base frames in the sub
multi-frame×frame time=the number of the base frames in the sub
multi-frame÷frame rate. Take the E1 CAS multi-frame as an example. The
E1 rate is 2.048Mbps; each frame contains 32 time slots; 8000 frames
are transmitted every second; the frame time is 0.125ms; each sub
multi-frame contains 4 base frames. Therefore, the delay for
encapsulating one PWE3 packet is 4×0.125ms=0.5ms. The packet
encapsulation delay increases with the number of the base frames in
the sub multi-frame.
Figure 21-10 CESoPSN segmenting sketch map of CAS
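The frame-based PCT values and the FRG marking of the segmented CAS multi-frame can be reproduced with a few lines. The function names are hypothetical; the FRG codes (00 whole, 01 first, 10 last, 11 middle) are the ones given in the control word description.

```python
FRAME_TIME_MS = 0.125  # one E1 frame lasts 0.125 ms (8000 frames per second)


def cesopsn_pct_ms(frames_per_packet):
    """PCT = number of frames carried in one PWE3 packet x frame time."""
    return frames_per_packet * FRAME_TIME_MS


def frg_codes(num_segments):
    """FRG value for each PWE3 packet of one multi-frame, in sending order."""
    if num_segments < 1:
        raise ValueError("a multi-frame needs at least one segment")
    if num_segments == 1:
        return [0b00]                                      # whole multi-frame in one packet
    return [0b01] + [0b11] * (num_segments - 2) + [0b10]   # first, middle..., last


if __name__ == "__main__":
    print(cesopsn_pct_ms(32))   # non-CAS, 32 frames per packet
    print(cesopsn_pct_ms(16))   # CAS, one 16-frame multi-frame per packet
    print(cesopsn_pct_ms(4))    # CAS segmenting, 4 base frames per sub multi-frame
    print([format(code, "02b") for code in frg_codes(16 // 4)])
```

Splitting the 16-frame multi-frame into four sub multi-frames cuts the PCT from 2 ms to 0.5 ms at the cost of four packets per multi-frame.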
HDLC Mode
The SAToP and CESoPSN circuit emulation modes are called flow modes
(transparent transmission modes), because what is encapsulated in the
packet is the original bit flow. The purpose is to transmit the TDM
bit flow between two TDM devices without any change.
However, in the HDLC mode, only the HDLC frames present in the TDM bit
flow are transmitted, as shown in Figure 21-11. No matter whether the
TDM flow is framed or not, it is processed in units of HDLC frames,
that is, the bit flow is searched for the start and the end of each
HDLC frame. When a complete HDLC frame is received, the data is
encapsulated into the TDM payload and then sent to the PSN. After the
PE device at the other side of the Pseudo Wire decapsulates the PWE3
packet, the payload is re-encapsulated as an HDLC frame and inserted
into the TDM bit flow.
Figure 21-11 Sketch map of HDLC mode
In the mode, PCT is meaningless. The number of the generated PWE3
packets is the same as that of the sent HDLC frames in the TDM flow.
Technology of Recovering Clock from Circuit Emulation Packet
The circuit emulation technology originated from the ATM network, which
uses virtual circuits to encapsulate the circuit service data in ATM
cells for transmission on the ATM network. Later, the theory of circuit
emulation was transplanted to the Metro-E, so that Ethernet provides
the emulation transmission of circuit switching services such as TDM.
Circuit emulation is the mechanism for transmitting circuit switching
services transparently on the packet network. It uses a special
circuit emulation header to encapsulate the TDM service and carries
the clock across the packet switching network via some mechanism. The
device that performs the encapsulation at the physical layer is called
a framer or mapper, and can be connected to the original TDM network
directly.
The technology of recovering the clock from circuit emulation packets
adopts the auto-sensing algorithm to recover the clock synchronization
information from the packets. The following describes the basic theory
of the algorithm.
Figure 21-12 Sketch map of auto-sensing clock recovering
As shown in Figure 21-12, the gateway (IWF) at the clock source side
sends the time information to the peer gateway regularly. The time
information is carried in the T1/E1 emulation packets. At the other
side, the gateway extracts the time stamp from the packets and recovers
the service clock (f-service) via the algorithm.
The core theory of the algorithm is as follows. The source IWF device
(on the left in the figure) sends packets to the destination IWF device
according to its own source clock. The destination IWF device buffers
the packets in a queue and sends them out according to its own local
clock. If the source clock and the destination local clock are not
consistent, even by a very small difference, the depth of the buffer
queue in the destination device changes. Therefore, whether the local
clock is consistent with the source clock can be judged from the depth
of the queue. If the queue depth keeps increasing, the local clock is
slower than the source clock and needs to be sped up; if the queue
depth keeps decreasing, the local clock is faster than the source clock
and needs to be slowed down. This is a negative feedback mechanism.
After it becomes stable, the local clock at the destination equals the
source clock in the long run. In this way, frequency synchronization is
achieved between the two IWF devices across the IP network.
A metaphor helps to understand the auto-sensing algorithm. The IWF
device at the clock source is equivalent to the inlet of a pool and
sends packets into the pool at a certain clock frequency. The IWF
device at the destination is equivalent to the outlet of the pool. The
water in the pool is kept at a constant level by adjusting the valve
at the outlet. In this way, the two devices are synchronized.
The difficulty in implementing the auto-sensing algorithm is that the
IP network inherently has packet delay variation (PDV). Packet jitter
also changes the depth of the buffer queue, and the IWF device at the
destination cannot tell whether a change is caused by the frequency
difference or by the delay jitter of the IP network, so it cannot
directly make the right response. However, the delay jitter of the IP
network is not cumulative, so statistical methods such as averaging
can be used to filter it out.
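The negative feedback loop and the averaging filter can be illustrated with a small simulation. This is a toy model under stated assumptions, not the algorithm used in the product: rates are in packets per tick, delay jitter is modeled as non-cumulative noise on the observed queue depth, the filter is a plain moving average, and the controller is a simple proportional one. All names and parameter values are hypothetical.

```python
import random
from collections import deque


def recover_rate(source_rate, nominal_rate, steps=5000, target_depth=50.0,
                 gain=0.01, window=50, jitter=2.0, seed=1):
    """Return the average recovered local rate over the last 1000 ticks."""
    rng = random.Random(seed)
    true_depth = target_depth          # real fill level of the buffer queue
    observed = deque(maxlen=window)    # recent depth readings, jitter included
    local_rate = nominal_rate
    tail = []
    for step in range(steps):
        # A frequency difference accumulates in the queue depth...
        true_depth += source_rate - local_rate
        # ...while network delay jitter only disturbs each reading.
        observed.append(true_depth + rng.uniform(-jitter, jitter))
        avg_depth = sum(observed) / len(observed)   # averaging filters the jitter
        # Negative feedback: queue above target -> speed up the local clock.
        local_rate = nominal_rate + gain * (avg_depth - target_depth)
        if step >= steps - 1000:
            tail.append(local_rate)
    return sum(tail) / len(tail)


if __name__ == "__main__":
    # The source runs 1% faster than the free-running local clock.
    print(recover_rate(source_rate=1.01, nominal_rate=1.0))
```

In this model the recovered rate settles close to the source rate while the queue depth stabilizes slightly away from the target; a longer averaging window filters jitter better but makes the loop react more slowly.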
PWE3 Typical Application
Figure 21-13 The connection and aggregation of the MAN private line
As shown in Figure 21-13, the TDM circuit emulation technology can be
used to connect and aggregate MAN private lines. For example, the LAN
of a district is connected to the PBX switches of the branches in the
district to provide E1 voice access for communication within the
district. This can also be realized by connecting the district to the
PSTN. The TDM circuit emulation service emulates the TDM physical
transmission and is not aware of the actual services transmitted in
E1. The DDN service, FR service, and ATM service over E1 can all be
transmitted transparently via TDM circuit emulation.
TDMoIP Gateway in the figure is the PWE3 device. The PWE3 packet
formats on ①②③④ paths are as shown in Figure 21-14.
Figure 21-14 PWE3 packet
Performance Test Result
Figure 21-15 Performance test environment
To test the reliability of the PWE3 circuit emulation, set up the test
environment as shown in Figure 21-15. Enable the PWE3 function on the
PEs; construct various background flows via the SMARTBIS devices to
simulate network congestion and bandwidth bursts. Then use Router A to
send test packets to Router B. The test result is that, with or
without the impact of background flows, Router B receives the data
from Router A completely.
Loopback Detection Technology
Introduction to Loopback Detection
Ethernet is a broadcast network. When the destination of a packet
cannot be identified, the switch broadcasts the packet in the VLAN.
When there is a loop in the network, the packet is forwarded
repeatedly and, eventually, the network bandwidth is used up and
communication fails. With the loopback detection function enabled on a
port, the device sends Loopback packets at an interval to check
whether there is a loop in the network. When a port receives a
Loopback packet sent by the local device, the device reads the source
port from the Loopback packet, sets that port to ERR-DISABLE, and
prints the log information.
This section describes the theory of the loopback detection protocol and
how to realize it.
Related Terms of Loopback Detection Protocol
LBD: Loopback detection
Introduction to Loopback Detection Protocol
The loopback detection protocol is used to detect a network loop on a
single port.
Ethernet is a multipoint-to-multipoint network as well as a broadcast
network. When the destination address of a packet cannot be
identified, the switch broadcasts the packet to all terminal stations.
Therefore, when there is a loop in the network, the packet is forwarded
repeatedly and, eventually, the network bandwidth is used up and
communication fails.
There are two cases of loop. In the first case, the loop is between
different ports of the switch, for example, two ports of one switch
are connected by mistake. In the second case, the loop is on one port
of the switch, for example, the port is connected to a bridge device
and the Ethernet port of the bridge is looped. In the first case, STP
can detect the loop, but in the second case, STP is useless and another
detection method is needed.
The theory of port loopback detection is to send a special packet
periodically. In the normal state, the device that receives the packet
drops it. If there is a loop, the packet returns to the source port.
By comparing the received packet with the sent packet, the device
knows whether there is a loopback.
Format of Loopback Detection Protocol Packet
The format of the Ethernet loopback detection protocol packet is as
follows:
The format of loopback detection packet
Fields of Loopback Detection Packet
DMAC field (6 bytes): the destination MAC address of the packet;
SMAC field (6 bytes): the source MAC address of the packet;
QTag field (4 bytes): If VLAN is configured, tag is four bytes. Otherwise,
there is no tag field;
Ethernet type field (2 bytes): the protocol type number of the loopback
packet, 0x9000;
Skip count field (2 bytes): The field is usually set as 0x0000;
Message type field (2 bytes): The message type; if it is 0x0100, it
means Reply message; if it is 0x0200, it means Forward_data;
Port Index field (2 bytes): the number of the port that sends the loopback
packet;
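The field list above can be turned into a frame builder and parser as a sketch. Several details are assumptions and not stated in the text: the QTag is taken to be a standard 802.1Q tag (TPID 0x8100 followed by the VLAN ID with priority 0), the 2-byte values are written in network byte order exactly as listed, and padding up to the minimum Ethernet frame size is left out. The function names and the example MAC address are hypothetical.

```python
import struct

ETHERTYPE_LOOPBACK = 0x9000
TPID_8021Q = 0x8100        # assumption: QTag is a standard 802.1Q tag
MSG_REPLY = 0x0100
MSG_FORWARD_DATA = 0x0200


def build_lbd_frame(dmac, smac, port_index, vlan_id=None, msg_type=MSG_REPLY):
    """Assemble DMAC | SMAC | [QTag] | EtherType | Skip count | Message type | Port Index."""
    frame = dmac + smac
    if vlan_id is not None:
        frame += struct.pack("!HH", TPID_8021Q, vlan_id & 0x0FFF)
    frame += struct.pack("!HHHH", ETHERTYPE_LOOPBACK, 0x0000, msg_type, port_index)
    return frame


def parse_lbd_frame(frame):
    """Split a frame built by build_lbd_frame back into its fields."""
    dmac, smac = frame[0:6], frame[6:12]
    offset, vlan_id = 12, None
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    if ethertype == TPID_8021Q:            # tagged frame: skip the 4-byte QTag
        (tci,) = struct.unpack_from("!H", frame, offset + 2)
        vlan_id = tci & 0x0FFF
        offset += 4
    ethertype, skip_count, msg_type, port_index = struct.unpack_from("!HHHH", frame, offset)
    return {
        "dmac": dmac, "smac": smac, "vlan_id": vlan_id, "ethertype": ethertype,
        "skip_count": skip_count, "msg_type": msg_type, "port_index": port_index,
    }


if __name__ == "__main__":
    mac = bytes.fromhex("00017a000001")    # placeholder device MAC
    print(parse_lbd_frame(build_lbd_frame(mac, mac, port_index=1, vlan_id=10)))
```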
Workflow of Loopback Detection
The workflow of the loopback detection is as follows:
A port that is configured with loopback detection sends the detection
packet at a fixed interval. The DMAC and the SMAC of the packet are both
MAC addresses of the switch (derived from the base MAC); Skip count is 0;
Message type is 0x0100; Port Index is the number of the sending port. If
the port does not belong to any VLAN, only one untagged loopback packet is
sent. Otherwise, if the port belongs to one or multiple VLANs, one
untagged loopback packet is sent, plus one tagged loopback packet for each
VLAN in which the port is a tagged member.
When a port that is not configured with loopback detection receives a
loopback packet, it drops the packet. Otherwise, the switch checks whether
the DMAC and SMAC of the packet are MAC addresses of the device. If yes,
it reports the port loopback to the user. If the port is in the controlled
state, the port is also shut down; otherwise, the port stays up and the
loopback is only reported.
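The sending and receiving rules above can be summarized in a short Python sketch. The function names and the returned action strings are hypothetical. What happens to a packet whose MAC addresses do not belong to the device is not stated in the text, so ignoring it is an assumption.

```python
def packets_to_send(tagged_vlans):
    """VLAN ids for one detection round on a port (None = untagged)."""
    # One untagged packet is always sent, plus one tagged packet for
    # every VLAN in which the port is a tagged member.
    return [None] + sorted(set(tagged_vlans))


def on_loopback_packet(detection_enabled, dmac, smac, device_macs, controlled):
    """Return the action taken for a received loopback packet."""
    if not detection_enabled:
        return "drop"
    if dmac in device_macs and smac in device_macs:
        # The packet came back to the device that sent it: a loop exists.
        return "report+shutdown" if controlled else "report"
    return "ignore"  # assumption: foreign loopback packets are ignored
```

For example, a port that is a tagged member of VLAN 10 and VLAN 20 sends three packets per round: one untagged, one tagged with VLAN 10 and one tagged with VLAN 20.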
Typical Application
When using loopback detection, ensure that the corresponding ports are
configured with the loopback detection function and work in the same
detection mode.
This section configures a basic loopback detection setup for reference.
The network topology is as follows:
Figure 22-1 Application instance of the loopback detection
Illustration
Ethernet port 0/1 of switch 1 is connected to Ethernet port 0/2 of
switch 2 with a network cable. Another network cable interconnects port
0/3 and port 0/4 of switch 2. Port 0/1 of switch 1 and ports 0/2, 0/3 and
0/4 of switch 2 are added to VLAN 10 in tagged mode. The loopback
detection function on switch 1 is used to check whether there is a loop on
switch 2.
The configuration of switch 1:
Command                                              Description
switch1(config)#loopback-detection enable            Enable the port loopback detection globally
switch1(config)#port 0/1                             Enter the port configuration mode
switch1(config-port-0/1)#port hybrid tagged vlan 10
RP: Rendezvous Point, the tree root of the shared tree;
RPF: Reverse Path Forwarding;
SPT: Shortest Path Tree, the shortest path to the source;
Introduction to PIM-SM Protocol
PIM-SM is similar to PIM-DM in that it can use any IP unicast routing
protocol (RIP, IRMP, STATIC or OSPF) to decide the RPF interface. The most
important difference is that PIM-SM adopts the pulling mode, while PIM-DM
adopts the pushing mode. The pulling mode assumes that multicast traffic is
not wanted: multicast data is not sent to a receiving station unless the
station explicitly joins the group.
Basic Hierarchy of PIM-SM in TCP/IP Protocol Stack
Basic hierarchy of PIM-SM in TCP/IP protocol stack
The PIM-SM protocol runs on top of the IP protocol and communicates with
IP via a raw socket. The protocol number of PIM-SM in the IP packet is 103.
PIM-SM Protocol
In the PIM-SM domain, each route/switch device that runs the PIM-SM
protocol periodically sends Hello messages, which are used to discover
neighboring PIM route/switch devices and to select the DR in a
multi-access network. The DR is responsible for sending the Join/Prune
(adding/pruning) messages and the Register messages.
PIM-SM sets up multicast distribution trees to forward multicast packets.
There are two kinds of distribution tree: the shared tree, rooted at the
RP of group G, and the source tree, rooted at the multicast source. PIM-SM
sets up and maintains the distribution trees via an explicit Join/Prune
mechanism.
When the directly connected network of a DR has an active member of group
G, the DR sends a Join message hop by hop toward the RP of group G to join
the shared tree (No. 1 in the following figure). As the Join travels
upstream along the shared tree, each route/switch device on the way sets
up multicast forwarding state (No. 2 in the following figure), that is, a
route entry. The route entry includes the source address, group address,
input interface of the multicast packet, output interface list of the
multicast packet, timers and flags, so that the route/switch device can
forward received multicast data along the tree. When a Prune message
travels upstream along the shared tree, each route/switch device on the
way updates its route entries, for example the output interface list.
Branches of the distribution tree that are not refreshed are deleted after
a timeout. To keep active branches from timing out, each route/switch
device on the distribution tree periodically sends Join/Prune messages
toward the RP of the group, so as to maintain the multicast distribution
tree state.
When a source host sends multicast data to the group, the DR of the
source encapsulates the data in Register messages and unicasts them to the
RP (No. 5 in the following figure). The RP decapsulates each Register
message and forwards the enclosed packet to the group members along the
shared tree. The RP can then send a Join/Prune message for the specified
source (No. 3 in the following figure) toward the source to join the
shortest path tree of the source. In this way, packets reach the RP along
the shortest path tree without being encapsulated. When multicast packets
arrive along the shortest path, the RP sends a Register-Stop message to
the DR of the source to make the DR stop the registering and encapsulating
process. From then on, the multicast data of the source is no longer
registered or encapsulated; it is sent to the RP along the shortest path
tree of the source (A-B-RP), forwarded by the RP onto the shared tree, and
finally delivered to the group members along the shared tree (RP-C-E).
Work process a of PIM-SM protocol
When the data transmission rate reaches a certain threshold, the DR can
send an explicit Join message to join the shortest path tree of the source
(No. 9 in the following figure), and multicast packets are then forwarded
along the shortest path tree. The DR then updates the shared tree and
deletes the corresponding shared-tree forwarding route (No. 8 in the
following figure).
Work process b of PIM-SM protocol
PIM-SM includes a mechanism for selecting the BSR and the RP. One or
multiple Candidate-BSRs are configured in the PIM-SM domain, and a
selection rule picks the single BSR of the domain from among them.
Candidate-RPs are also configured in the PIM-SM domain. Each Candidate-RP
unicasts to the BSR packets that contain its address and the multicast
groups it can serve, and the BSR regularly generates BootStrap messages
that contain the list of Candidate-RPs and the corresponding group
addresses. The BootStrap messages are sent hop by hop through the whole
domain, and each route/switch device receives and saves them. If a DR
receives an IGMP join packet from a directly connected host and does not
have a route entry for the group, it uses the hash algorithm to map the
group address to one Candidate-RP and multicasts the Join/Prune message
hop by hop toward that RP. If a DR receives multicast packets from a
directly connected host and does not have a route entry for the group, it
uses the hash algorithm to map the group address to one Candidate-RP,
encapsulates the multicast data in Register messages and unicasts them to
the RP.
In a multi-access network, PIM-SM introduces the following mechanisms:
the assert mechanism selects a unique forwarder, which avoids repeated
forwarding of multicast packets on the same segment; the Join/Prune
suppression mechanism reduces redundant Join/Prune messages; the prune
deny (override) mechanism rejects unnecessary pruning.
DR Selection
The rules of selecting DR are as follows:
1. If the PIM Hello packets of all neighbor route/switch devices on one
interface carry the priority field, first compare the priority values.
The larger the value, the higher the priority. If multiple route/switch
devices have the same priority, select the one with the largest IP
address as DR;
2. If the interface has any neighbor route/switch device whose PIM Hello
packets do not carry the priority field, select DR according to the IP
address only, that is, select the one with the largest IP address as DR.
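The two rules can be expressed as a single comparison. The following Python sketch is illustrative only; the `Candidate` type is hypothetical, and the candidate list is assumed to contain the local device together with all neighbors on the interface.

```python
import ipaddress
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    ip: str
    priority: Optional[int]  # None: the Hello carried no priority field


def elect_dr(candidates):
    """Return the candidate selected as DR on one interface."""
    # Rule 1 applies only if every Hello carried the priority field;
    # otherwise rule 2 applies and only the IP address is compared.
    use_priority = all(c.priority is not None for c in candidates)

    def key(c):
        addr = int(ipaddress.IPv4Address(c.ip))
        return (c.priority, addr) if use_priority else (0, addr)

    return max(candidates, key=key)
```

For example, with candidates 10.0.0.1 (priority 10) and 10.0.0.2 (priority 1), the DR is 10.0.0.1. If 10.0.0.2 does not carry a priority field, the DR is 10.0.0.2 because only the IP addresses are compared.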
BSR Selection
At first, a route/switch device configured as a candidate-BSR enters the
Pending-BSR status, sets the Bootstrap timer to the random veto value
(5s-23s) and begins to monitor Bootstrap messages.
A Bootstrap message contains the priority and IP address of the message
initiator. When a route/switch device in the Pending-BSR status receives a
Bootstrap message, it compares the priority and IP address in the message
with its own priority and IP address. The one with the higher priority is
better; if the priorities are the same, the one with the larger IP address
is better. If the message initiator is better, the device enters the
Candidate-BSR status and sets the Bootstrap timer to the Bootstrap timeout
value (130s). If the device itself is better, it performs no further
processing. When the Bootstrap timer of a route/switch device in the
Pending-BSR status times out, the device enters the Selected-BSR status,
sends a Bootstrap message and sets the Bootstrap timer to the Bootstrap
period value (60s).
When the Bootstrap timer of a route/switch device in the Candidate-BSR
status times out, the device enters the Pending-BSR status, sets the
Bootstrap timer to the random veto value and starts a new BSR selection
process. When a route/switch device in the Candidate-BSR status receives a
better Bootstrap message, it sets the Bootstrap timer to the Bootstrap
timeout value and keeps the Candidate-BSR status.
When the Bootstrap timer of a route/switch device in the Selected-BSR
status times out, the device sends a Bootstrap message, sets the Bootstrap
timer to the Bootstrap period value and keeps the Selected-BSR status.
When a route/switch device in the Selected-BSR status receives a poorer
Bootstrap message, it sends a Bootstrap message, sets the Bootstrap timer
to the Bootstrap period value and keeps the Selected-BSR status. When a
route/switch device in the Selected-BSR status receives a better Bootstrap
message, it enters the Candidate-BSR status and sets the Bootstrap timer
to the Bootstrap timeout value.
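The status transitions of the BSR selection can be sketched as a small state machine in Python. This is an illustration under stated assumptions, not the device implementation: the function names are hypothetical, a timer value of None means "timer unchanged", the last rule is read as applying to a better Bootstrap message, and a device in the Candidate-BSR status is assumed to take no action on a poorer message because the text does not cover that case.

```python
import ipaddress

PENDING, CANDIDATE, SELECTED = "Pending-BSR", "Candidate-BSR", "Selected-BSR"
BS_TIMEOUT = 130  # Bootstrap timeout value, seconds
BS_PERIOD = 60    # Bootstrap period value, seconds


def is_better(a, b):
    """a and b are (priority, ip) pairs; True if a is better than b."""
    def key(x):
        return (x[0], int(ipaddress.IPv4Address(x[1])))
    return key(a) > key(b)


def on_bootstrap(state, me, sender):
    """Return (new_state, new_timer, send_bootstrap) for a received message."""
    sender_better = is_better(sender, me)
    if state == PENDING:
        return (CANDIDATE, BS_TIMEOUT, False) if sender_better else (PENDING, None, False)
    if state == CANDIDATE:
        return (CANDIDATE, BS_TIMEOUT, False) if sender_better else (CANDIDATE, None, False)
    # Selected-BSR: yield to a better initiator, answer a poorer one.
    if sender_better:
        return (CANDIDATE, BS_TIMEOUT, False)
    return (SELECTED, BS_PERIOD, True)


def on_timer_expiry(state, random_veto):
    """Return (new_state, new_timer, send_bootstrap) when the timer times out."""
    if state == CANDIDATE:
        return (PENDING, random_veto, False)  # start a new selection round
    # Pending-BSR becomes Selected-BSR; Selected-BSR re-announces itself.
    return (SELECTED, BS_PERIOD, True)
```

Here `random_veto` stands for a value drawn from the 5s-23s range mentioned above.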
The Bootstrap message is sent to the "all PIM routers" multicast group
address 224.0.0.13 with TTL set to 1. When a PIM route/switch device
receives a Bootstrap message, it sends the message out on all interfaces
except the receiving interface. This process ensures that the Bootstrap
message spreads through the whole multicast domain and that every PIM
route/switch device receives it and thus knows which route/switch device
is the BSR.
RP Selection
A route/switch device can be configured as the candidate-RP (C-RP) of
specified multicast groups or of all multicast groups. After receiving the
Bootstrap message and learning the BSR location, the C-RP unicasts a
Candidate-RP-Advertisement message to the BSR. The message carries the RP
address of the message initiator, its priority and the multicast group
addresses that the C-RP serves.
The BSR collects all C-RPs, lists their priorities and their groups, and
forms the RP set. The BSR advertises the RP set to the whole multicast
domain via the Bootstrap message. The Bootstrap message also includes one
8-bit hash mask.
When a route/switch device receives an IGMP message or a PIM join message
and needs to join a shared tree, it checks the RP set obtained from the
BSR and selects the RP for the multicast group with the specified hash
algorithm.
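The manual does not spell out the hash algorithm. The Python sketch below assumes that the device follows the hash function defined in the PIM-SM standard (RFC 7761), that the 8-bit field in the Bootstrap message carries the hash mask length, and that `candidate_rps` has already been reduced to the C-RPs that serve the group with the best priority. The function names and the default mask length are hypothetical.

```python
import ipaddress


def rp_hash_value(group, rp, mask_len):
    """Hash value of one C-RP for a group (formula from RFC 7761)."""
    g = int(ipaddress.IPv4Address(group))
    c = int(ipaddress.IPv4Address(rp))
    mask = (0xFFFFFFFF << (32 - mask_len)) & 0xFFFFFFFF
    inner = (1103515245 * (g & mask) + 12345) ^ c
    return (1103515245 * inner + 12345) % (2 ** 31)


def select_rp(group, candidate_rps, mask_len=30):
    """Pick the C-RP with the highest hash value; ties go to the larger address."""
    return max(
        candidate_rps,
        key=lambda rp: (rp_hash_value(group, rp, mask_len),
                        int(ipaddress.IPv4Address(rp))),
    )
```

Because the group address is masked before hashing, consecutive group addresses that share the masked prefix map to the same RP, and every route/switch device that holds the same RP set computes the same result.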
Introduction to PIM SSM
PIM SSM is short for Protocol Independent Multicast - Source Specific
Multicast. It is the source-specific multicast mode of PIM, which applies
special processing to multicast services in the IPv4 address range
232.0.0.0/8. A multicast service whose group address is in the SSM range
requires completing the related SPT operations. The source S is discovered
out of band, that is, without PIM messages (such as the Register message).
SSM requires IGMPv3, because an IGMPv3 membership report can specify the
source and the group at the same time. PIM SSM can run on a device
together with PIM-SM, or it can run on a device alone, depending on the
protocol.
Introduction to PIM-DM Protocol
PIM-DM is short for Protocol Independent Multicast-Dense Mode. Like
PIM-SM, PIM-DM runs on top of the IP protocol and communicates with IP via
a raw socket. The protocol number of PIM-DM in the IP packet is 103. The
TTL of every PIM-DM protocol packet sent is 1, that is, the transmission
distance is only one hop.
The basic hierarchy of PIM-DM in TCP/IP protocol stack
PIM-DM Protocol
PIM-DM application topology
Neighbor Setup
After a PIM route/switch device starts, it periodically (every 30s by
default) sends hello packets to the all-PIM-routers group 224.0.0.13 to
set up neighbor relations. A route/switch device that receives a hello
packet adds the sender to its neighbor list and starts a timer for it.
The value of the timer is the value of the holdtime field in the hello
packet.
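The neighbor list and its per-neighbor timers can be sketched in Python as follows. The class and method names are hypothetical. Two behaviors are assumptions that the text does not state: a later hello packet from the same neighbor restarts its timer, and a neighbor is removed from the list when its timer times out.

```python
class NeighborTable:
    """Neighbor list keyed by neighbor address, with one timer per entry."""

    def __init__(self):
        self._expiry = {}  # neighbor address -> absolute expiry time

    def on_hello(self, neighbor, holdtime, now):
        # Add the sender (or refresh it) and arm its timer with the
        # holdtime value carried in the hello packet.
        self._expiry[neighbor] = now + holdtime

    def expire(self, now):
        """Remove and return the neighbors whose timers have run out."""
        dead = [n for n, t in self._expiry.items() if t <= now]
        for n in dead:
            del self._expiry[n]
        return dead

    def neighbors(self):
        return sorted(self._expiry)
```

Passing `now` explicitly keeps the sketch independent of the system clock, so the timer behavior can be checked step by step.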
Spreading and Pruning Process of Service Packets
When a source appears, it sends (S, G) service packets to the network. At
the beginning, the packets are spread to every corner of the network.
When a route/switch device receives the service packet, it sets up an
(S, G) entry, records the input interface and regards the other interfaces
as downstream interfaces. As shown in the figure, C receives the service
packets from both A and B, but there can be only one input interface. C
selects the interface with the smallest unicast route cost to the source
as the input interface and sends pruning information on the other
interface to prune it.
When the service packet is transmitted from E to I, I finds that it has
no downstream neighbor or local group member and that its egress port list
is empty, so I sends a pruning message to the upstream device E to ask for
pruning (note: the pruning message is sent out of the input interface and
its destination address is the address of the group to be pruned). E finds
that it has only one neighbor on that interface (for example, a
point-to-point connection between E and I), so E prunes I at once after
receiving the pruning message from I. After pruning, E finds that its own
egress port list is empty, so it continues to send the pruning message
upstream. After receiving the pruning message from E, C finds that there
is a local group member (refer to IGMP) in the network, so C ignores the
pruning of E.
When the service packet is transmitted to the network from F, G receives
the packet, finds that it has no egress port and sends a pruning message
upstream. Because there are other route/switch devices on this network, F
starts the pruning delay timer instead of pruning at once. H has a local
group member and an egress port, so it needs to receive the service
packet. When H overhears the pruning message of G (the pruning message
sent by G is addressed to the group to be pruned, so H also receives it),
H starts a deny timer. When the timer times out, H sends a Join packet
upstream (addressed to the desired group) to inform F that the service
packet is still needed on the network and the egress port must not be
pruned. Therefore, after F receives the Join packet, it keeps forwarding
the service packet.
Grafting Process
If I gets a local group member and therefore an egress port, it unicasts a
graft packet to E. After receiving the graft packet, E returns a Graft ACK
and changes the downstream interface to the forwarding status. After I
receives the Graft ACK, it changes its upstream interface to the
forwarding status and can forward packets as soon as they arrive. E now
has an egress port, so it unicasts a graft to its upstream RPF neighbor
(such as C, supposing that C is the winner of the declaring process). C
returns a Graft ACK to E, and the upstream interface of E changes to the
forwarding status. C is already forwarding the service packet, so E, with
its upstream interface in the forwarding status, receives the service
packet and forwards it. In this way, the service packet is delivered to
the newly added local group member.
Declaring Process
As shown in the figure, because the service packet is spread, the
route/switch device E may receive the service packet from both C and D,
which is redundant. Therefore, C and D need a declaring (assert) process.
D receives the service packet forwarded by C on its own egress port, which
triggers the declaring process: D sends a declare packet. Similarly, C
sends a declare packet, and the two devices compare the packets to select
one winner that forwards the service packets.
Status Refreshing Process
The PIM-DM protocol is a typical spreading and pruning protocol. After the
pruning timer times out, packets are spread to the network again. To
reduce the cost of this frequent spreading-pruning cycle, PIM-DM uses the
status refreshing mechanism to maintain the pruning status in the network.
The status refresh message (SRM) is generated by the route/switch device
directly connected to the source and is sent to all downstream neighbors
in the network. After a downstream neighbor receives the status refresh
packet, it does the following:
1. Responds according to the contents of the packet. For example, if the
status refresh packet shows that A has pruned the interface while C needs
to forward the packets, C sends a Join message to A; if A is in the
forwarding status while C does not have an egress port, C sends pruning
information to A.
2. Refreshes the pruning timer of each egress port toward a downstream
neighbor.
3. Modifies the status refresh packet according to its own information
and forwards the modified packet.
For example, the egress port of E in the figure is in the pruning status.
After receiving the status refresh packet, E refreshes the pruning timer
of the egress port, fills in its own information and sends the status
refresh packet to I. After receiving it, I finds that E is in the pruning
status, that its own ingress interface is also in the pruning status and
that it has no other downstream neighbor, so I does nothing.
Introduction to MSDP Protocol
Overview
MSDP application topology
In the PIM-SM mode, when a source begins to send a multicast service
flow, the first-hop DR connected to the source registers the source
information with the RP. In this way, the RP in PIM-SM always knows the
source information of all multicast service flows in the domain. In actual
applications, to meet network management requirements, the whole network
is divided into multiple PIM domains, and each domain has its own RP,
which manages the source information of all multicast service flows in
that domain. Usually, the RP of one domain does not know the source
information of other PIM domains, so it cannot receive the multicast
service flows of other domains. However, users in one domain often want to
receive the multicast service flows of other domains. Without further
help, a domain that wants to provide all multicast service flows must
depend on the RPs of other domains, which carriers do not want. MSDP was
introduced to solve this problem.
Multicast Source Discovery Protocol (MSDP) lets each MSDP domain keep its
own RP and still send multicast service flows to other domains or receive
multicast service flows from other domains.
MSDP sets up peer connections between domains. Through the defined
information exchange, the RPs of the domains share the active source
information in the network. Meanwhile, the RP of each domain maintains the
receiver information of its own domain. Therefore, for a multicast service
flow that has receivers, the RP can directly initiate a Join toward the
source without depending on the RPs of other domains. After the service
flow is drawn to the RP via the source tree (SPT), the RP transmits the
service flow to the receivers in the domain via the shared tree (RPT). In
this way, the multicast service flow is delivered within the domain
without depending on the RPs of other domains.
The MSDP peer relation is set up between the RPs of the domains via TCP
connections. When the RP of one domain learns of a new active source in
the domain, it sends an SA (Source-Active) message to all peers that have
set up the peer relation with it. An MSDP peer uses an improved RPF check
to decide whether to accept an SA message sent from another peer. After
accepting the SA message, the peer forwards it to its other peers until
all MSDP routers in the network have received the SA message. If the RP
that receives the SA has a (*, G) entry for the group, it sets up the
(S, G) entry and joins the SPT toward the source, importing the service
flow into the domain. The rest is processed by the PIM-SM protocol.
Besides, each MSDP router periodically sends out the source information of
its own domain in SA messages, letting the MSDP peers of all other domains
know that the source is still sending the service flow.
Setup of MSDP peer
After an MSDP peer is configured, the connection role is decided by
comparing the local address used to set up the connection with the peer
address. The side with the larger address sets up the passive connection,
and the side with the smaller address sets up the active connection. The
passive side must keep sending MSDP messages to the active side; when it
has no other MSDP message to send, it sends keepalive messages to prevent
the active side from resetting the connection. After the connection is set
up, the MSDP peer relation is formed.
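The address comparison that decides the connection role can be sketched in Python as follows. The function name is hypothetical, and the addresses are assumed to be IPv4 addresses compared as unsigned integers.

```python
import ipaddress


def msdp_connection_role(local_addr, peer_addr):
    """Return 'passive' or 'active' for the local side of an MSDP peering."""
    local = int(ipaddress.IPv4Address(local_addr))
    peer = int(ipaddress.IPv4Address(peer_addr))
    if local == peer:
        raise ValueError("local and peer addresses must differ")
    # The larger address waits for the connection (passive);
    # the smaller address initiates it (active).
    return "passive" if local > peer else "active"
```

For a peering between 10.0.0.1 and 10.0.0.2, the device with 10.0.0.1 takes the active role and the device with 10.0.0.2 takes the passive role, so exactly one side initiates the TCP connection.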
Sending of Source Active Message
After MSDP gets the multicast source information from PIM, it sends a
Source Active message to each connected MSDP peer to notify the peer of
the multicast source information. After the peer MSDP receives the
multicast source information, it passes the information to its PIM module,
so as to realize cross-domain multicast on demand.
MSDP Application
Inter-domain MSDP
PIM-SM can be regarded as a multicast IGP, because it is designed to run
in one single domain. The problem for PIM-SM is how to distribute
multicast packets across the AS boundary while maintaining the autonomy of
each AS. The PMBR (PIM Multicast Border Router) in the PIM-SM protocol is
one attempt to solve the problem. A PMBR is located at the edge of the AS
and sets up branches to all RPs in the AS. Each branch is expressed by
(*,*,RP), where the wildcards indicate all source and group addresses
mapped to the RP. When an RP receives traffic from a source, it forwards
the traffic to the PMBR, and the PMBR forwards the traffic to the other
domain. When the adjacent domain does not need the traffic, it sends a
prune to the PMBR, and the PMBR sends the prune to the RP, as follows:
PMBR solution
The key disadvantage of PMBR is its flooding and pruning behavior.
Moreover, PMBR was designed to connect a PIM-SM domain to a DVMRP domain.
Therefore, PMBR is not a good solution to the above problem.
To solve the above problem, the following two problems need to be solved:
1. When the source is in one domain but the group member is in another
domain, the RPF process must remain valid;
2. To keep autonomy, a domain cannot trust the RP in another domain;
For problem 1, PIM can use BGP routes to decide the RPF toward another
domain, but when unicast and multicast use different links, the RPF check
may fail. Static multicast routes can prevent the RPF problem, but using
static multicast routes on a large scale is not realistic. MBGP, an
extension of BGP, solves the problem.
Problem 2 arises because an AS (managed by its own ISP) does not want to
depend on an RP it cannot control (one in another domain or managed by
another ISP). If each AS sets up its own RP, there must be a protocol that
lets the RPs share information across the AS boundary and discover the
sources known to other RPs, as follows:
Inter-domain MSDP
MSDP lets the RPs of different ASs exchange the source information they
know. To PIM-SM, the shared sources appear to be in the same domain. In
this way, a receiver depends only on the RP in its local domain, which
preserves AS autonomy.
Intra-domain MSDP
Besides the PIM-SM problem between domains, MSDP also helps with a problem
inside a single PIM-SM domain, namely Anycast RP (anycast means that when
a packet is sent to a single address, one of multiple devices that share
the address responds).
Placing the RP in a large, dispersed PIM-SM domain is difficult. PIM-SM
permits only one RP per group mapping, so the following problems may
appear:
1. Traffic bottleneck;
2. Register decapsulation that does not scale (when the shared tree is
used);
3. Slow fault recovery when the active RP fails;
4. Multicast packets may be forwarded along a sub-optimal path;
5. Dependence on a remote RP;
The hash algorithm and auto RP filtering of the PIMv2 BootStrap protocol
can relieve the above problems, but cannot solve them completely. Anycast
RP is a method that permits one single group to be mapped to multiple RPs.
The RPs can be distributed across the whole domain and use the same RP
address. As a result, a virtual RP is formed, and MSDP is the basis of the
virtual RP.
As shown in Figure 3, four route/switch devices form the virtual RP. They
advertise one single RP address, 10.100.254.1, and use MSDP to exchange
information about the sources registered on each route/switch device. All
four route/switch devices run auto RP and advertise the RP address
10.100.254.1. A source DR in the domain knows only this one RP address and
registers with the nearest physical RP. This splits the PIM domain into
separate parts, but with an MSDP mesh group the Anycast RPs in the domain
can still exchange the source information.
MPLS Technology
This chapter describes the principle and application of Multi-protocol Label
Switching (MPLS).
Main contents:
Terms
Introduction to MPLS
MPLS architecture
Introduction to the LDP Protocol
Introduction to BGP/MPLS VPN
MPLS VPN user accesses Internet
Introduction to CSC
Introduction to MPLS L2VPN
MPLS traffic engineering
MPLS OAM
Terms of MPLS Protocol
MPLS: Multiprotocol Label Switching
Label: Label
FEC: Forwarding Equivalence Class
LSR: Label Switching Router
LDP: Label Distribution Protocol
Introduction to MPLS
MPLS integrates the latest developments in routing and switching
solutions. It combines the simplicity of L2 switching with the flexibility
of L3 routing. It provides the following features:
In the MPLS network, packet forwarding is based on a fixed-length label,
which simplifies the forwarding mechanism and improves the forwarding
rate.
Frame relay, ATM, PPP, HDLC, SDH, and DWDM are supported, which ensures
the interconnection of multiple types of network.
It supports QoS, traffic engineering and large-scale VPN.
MPLS Architecture
Separation of Control and Forwarding
The MPLS architecture is divided into two independent units: the control
unit and the forwarding unit, as shown in the following figure:
Figure 25-1 MPLS Architecture
The control unit uses the standard routing protocol (such as OSPF and
BGP4) to exchange routing information and maintain routing tables. At the
same time, it uses the label control protocol (such as LDP, MP-BGP, and
RSVP) to exchange the label forwarding information with the
interconnected label switching devices to create and maintain the label
forwarding table.
The forwarding unit forwards packets: it searches the label forwarding
table according to the information in the packet header, then processes
the label and forwards the packet according to the search result.
Forwarding Equivalence Class
A FEC is a collection of packets that use the same forwarding path in the
network (the destination addresses of the packets can be different). The
LSR processes these packets in the same way when forwarding them, so from
the point of view of forwarding the packets are equivalent. A FEC is
defined by a set of attributes (FEC elements), which can include source
address, destination address, source port, destination port, protocol
type, and CoS.
The entrance LSR of the MPLS domain assigns each IP packet entering the
MPLS domain to one FEC. It then looks up the label value that corresponds
to the FEC, encapsulates the label into the IP packet to form a labeled
packet, and transmits the packet in the MPLS domain.
Label Encapsulation and Label Operation
In the MPLS network domain, labeled packets are forwarded according to the
label carried in the packet. The label is inserted between the L2 header
and the L3 packet and is called the MPLS label header. The format is shown
as follows:
Figure 25-2 MPLS label
One MPLS packet can carry multiple label headers. This structure is called
a label stack. The labels are organized in "last in, first out" order. The
outermost label is called the stack top label and the innermost label is
called the stack bottom label (simple IP unicast routing does not use the
label stack, but other MPLS-based applications, including MPLS VPN, rely
on it).
Each label is composed of the following fields:
Time-to-Live
The TTL field is 8 bits and encodes a time-to-live value. Its function is
the same as that of the TTL field in the IP header: it prevents forwarding
loops caused by improper configuration, faults, or slow convergence of the
routing algorithm, and it restricts the scope of the packet.
Stack bottom bit (S)
The field is 1 bit. The value 1 indicates that the label is the last
(bottom) label in the label stack; 0 indicates any other label.
Service class information (EXP, also named experimental bits)
The field is 3 bits and carries CoS information (its function is similar
to the TOS data in the IP packet).
Label Value
The field is 20 bits and contains the actual value of the label. When an
LSR receives a labeled packet, it first checks the label value at the
stack top. Normally, the LSR learns the next-hop node from the label value
and replaces the current stack top label with a new label. The label
values 0-15 are reserved. Their meanings are as follows:
Label Value   Description
0      IPv4 explicit null label. When this label is at the stack top, the next step is to pop the label and forward the packet according to the new stack top label. If it is the only label in the label stack, so that the stack is empty after the pop, the packet is forwarded based on the IPv4 packet header.
1      Router alert label. When the stack top label of a received packet is 1, the packet is delivered to the local software. The forwarding of the packet is determined by the next entry in the label stack.
2      IPv6 explicit null label. The usage is similar to that of label value 0.
3      Implicit null label. LDP uses it to request that the upstream neighbor pop the label (penultimate hop popping). This label value never appears in the label encapsulation.
4-15   Reserved
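The four fields described above can be illustrated with a short Python sketch. It assumes the standard 32-bit label stack entry layout (label value in the high 20 bits, then EXP, then S, then TTL in the low 8 bits); the function names are hypothetical and are not part of any Maipu software.

```python
import struct

def pack_entry(label, exp, s, ttl):
    """Pack one label stack entry: label(20) | EXP(3) | S(1) | TTL(8)."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)  # network byte order, 4 bytes

def unpack_entry(data):
    """Decode the first 4 bytes of data into the four fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }

def unpack_stack(data):
    """Read entries from the stack top until the entry whose S bit is 1."""
    entries, offset = [], 0
    while True:
        if len(data) - offset < 4:
            raise ValueError("label stack ends before the stack bottom bit is set")
        entry = unpack_entry(data[offset:offset + 4])
        entries.append(entry)
        offset += 4
        if entry["s"] == 1:
            return entries
```

For example, a two-label stack is built by concatenating an entry with S=0 (stack top) and an entry with S=1 (stack bottom); `unpack_stack` stops at the second entry.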
MPLS Network Structure and Forwarding Process
In traditional IP forwarding, the router at each hop independently
analyzes the destination IP address and runs the routing algorithm of the
network. On this basis, it makes an independent forwarding decision to
determine the next hop of the packet. In MPLS, packets entering the
network are classified into different FECs. The corresponding label value
is looked up according to the FEC and encapsulated into the packet. The
routers in the network forward the packet according to the label it
carries. Throughout the MPLS domain, packets are forwarded according to
the label, and no operation on the IP header is needed. The label
insertion and forwarding process is as follows:
Figure 25-3 Label forwarding
The basic unit of the MPLS network is the label switching router (LSR).
Switches or routers that can distribute labels and forward packets
according to labels are LSRs. According to the functions they provide,
LSRs are divided into edge LSRs (LERs) and core LSRs.
An LSR that has a non-MPLS neighbor is considered a border LSR. The border
LSR inserts or removes labels at the MPLS border. At the ingress point of
the MPLS domain, it inserts the label; at the egress point of the MPLS
domain, it removes the label before forwarding packets to neighbors
outside the MPLS domain.
In the preceding MPLS network structure, R1-R7 form an MPLS domain, in
which R1, R2, R3, and R7 are border LSRs, and R4, R5, and R6 are core
LSRs. In the MPLS domain, each LSR maintains a label forwarding table.
A core LSR searches the label forwarding table according to the label
carried in the packet to determine the forwarding path. No operation is
required on the IP header.
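The forwarding behavior described above can be sketched as a table lookup per hop. The following Python example is only an illustration: the path R1-R4-R6-R7, the label values 17-19, and the prefix 10.2.1.0/24 are assumptions chosen for the example, not values taken from figure 25-3.

```python
import ipaddress

# Hypothetical FEC-to-label map on the ingress border LSR R1.
FTN_R1 = {ipaddress.ip_network("10.2.1.0/24"): (17, "R4")}

# Hypothetical label forwarding tables: in-label -> (action, out-label, next hop).
LFIB = {
    "R4": {17: ("swap", 18, "R6")},
    "R6": {18: ("swap", 19, "R7")},
    "R7": {19: ("pop", None, "IP lookup")},
}

def ingress_push(dst_ip):
    """Classify the packet into a FEC and insert the matching label."""
    addr = ipaddress.ip_address(dst_ip)
    for fec, (label, next_hop) in FTN_R1.items():
        if addr in fec:
            return [label], next_hop
    raise LookupError("no FEC matches " + dst_ip)

def lsr_forward(node, stack):
    """Look up the stack top label only; the IP header is not examined."""
    action, out_label, next_hop = LFIB[node][stack[0]]
    if action == "swap":
        return [out_label] + stack[1:], next_hop
    return stack[1:], next_hop  # "pop" at the egress border LSR

def trace(dst_ip):
    """Return (node, label stack after that node) for every hop."""
    stack, node = ingress_push(dst_ip)
    hops = [("R1", list(stack))]
    while stack:
        current = node
        stack, node = lsr_forward(current, stack)
        hops.append((current, list(stack)))
    return hops
```

Calling `trace("10.2.1.1")` walks the packet through the domain: the border LSR inserts a label once, each core LSR swaps it, and the egress border LSR removes it.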
Penultimate Hop Popping Mechanism
For a packet that is received from an MPLS neighbor and whose destination
is a subnet outside the MPLS domain, the MPLS egress border LSR must
perform two lookups, as shown by R7 in figure 25-3. The LSR must first
look up the label at the top of the label stack. Because the packet is to
be forwarded outside the MPLS domain, the label is popped. Then the LSR
performs an L3 lookup on the IP header and forwards the packet. The double
lookup on router R7 may decrease the performance of the node. In addition,
in an environment implementing both MPLS and IP switching, the double
lookup increases the complexity of the hardware implementation. To solve
the problem, the MPLS architecture adopts the penultimate hop popping
mechanism.
With the penultimate hop popping mechanism, the border LSR can request
the upstream neighbor to pop the label (by advertising the implicit null
label, value 3, to the upstream neighbor through a signaling protocol such
as LDP). In figure 25-3, router R6 pops the label from the packets and
then sends plain IP packets to router R7. Finally, R7 performs a simple L3
lookup and sends the packets to the destination.
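The effect of penultimate hop popping can be sketched as follows. This is a simplified model, not the device implementation: the function names and the label values 18 and 19 are hypothetical, and the lookup count only distinguishes label lookups from the final L3 lookup.

```python
IMPLICIT_NULL = 3  # reserved label value advertised to request a pop

def install_entry(advertised_label):
    """Action the upstream LSR installs for the label its downstream advertised."""
    if advertised_label == IMPLICIT_NULL:
        return ("pop", None)          # penultimate hop removes the label
    return ("swap", advertised_label)

def penultimate_hop(stack, advertised_label):
    """Apply the installed action to the label stack (stack top first)."""
    action, out_label = install_entry(advertised_label)
    if action == "pop":
        return stack[1:]
    return [out_label] + stack[1:]

def egress_lookups(stack):
    """One label lookup per label still on the stack, plus the final L3 lookup."""
    return len(stack) + 1
```

With the implicit null label advertised, the penultimate hop delivers an unlabeled packet and the egress performs a single lookup; with an ordinary label, the egress performs two.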
Introduction to the LDP Protocol
LDP, as a signaling protocol in the MPLS architecture, binds labels to
the unicast routes in the routing table, advertises the bindings, and
generates the MPLS label forwarding table. The relation between LDP and
the label forwarding table is similar to the relation between a routing
protocol and the core routing table.
Basic Concepts of LDP
LDP Peer
Two LSRs that use LDP to exchange FEC/label mapping information are
called LDP peers.
Label Space
The concept of label space relates to the assignment and distribution of
labels. It defines the scope within which labels are used and whether the
same label can be reused on different interfaces. There are two types of
label space:
Label space in the scope of each interface
An interface that uses interface resources as labels generally uses this
label space. If the LDP peers are connected through a specific interface
and labeled traffic is transmitted only over that interface, the
per-interface label space can be used. In this case, a label is unique
within each interface.
Label space in the scope of each platform
When interfaces share label resources, the per-platform label space is
used. In this case, a label is unique within a platform (an LSR).
LDP Identifier
The LDP identifier is 6 bytes long. It identifies the label space of a
specific LSR. The first four bytes are the IP address assigned to the
LSR. The remaining two bytes identify a specific label space within the
LSR. For a per-platform label space, the last two bytes of the LDP
identifier are always 0. The format of the LDP identifier is as follows:
<IP address>:<Label space SN>, such as 128.255.1.2:0 and
129.13.17.35:2
If two LSRs are connected by two physical links, and both are ATM links
using per-interface label spaces, multiple label spaces must be
advertised between the LSRs, and multiple LDP identifiers must be used.
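As an illustration of the 6-byte layout, the following Python sketch converts between the textual form and the binary form of an LDP identifier. The function names are hypothetical; only the layout (4-byte IP address followed by a 2-byte label space number) comes from the description above.

```python
import ipaddress
import struct

def parse_ldp_id(text):
    """Convert '<IP address>:<label space>' into the 6-byte identifier."""
    ip_text, space_text = text.split(":")
    lsr_ip = ipaddress.IPv4Address(ip_text.strip())
    label_space = int(space_text)
    if not 0 <= label_space <= 0xFFFF:
        raise ValueError("label space must fit in two bytes")
    return lsr_ip.packed + struct.pack("!H", label_space)

def format_ldp_id(raw):
    """Convert the 6-byte identifier back into its textual form."""
    if len(raw) != 6:
        raise ValueError("an LDP identifier is exactly 6 bytes")
    (label_space,) = struct.unpack("!H", raw[4:])
    return f"{ipaddress.IPv4Address(raw[:4])}:{label_space}"

def is_platform_wide(raw):
    """A per-platform label space always has 0 in the last two bytes."""
    return raw[4:] == b"\x00\x00"
```

Using the two examples from the text, 128.255.1.2:0 denotes a per-platform label space, while 129.13.17.35:2 denotes a per-interface label space.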
LDP Session
The LDP session is used to exchange label information between LSRs. If
one LSR advertises multiple label spaces to another LSR, a separate LDP
session must be created between the LSRs for each label space.
LDP Transmission
LDP uses TCP to ensure reliable transmission of the LDP session. If
multiple LDP sessions are required between two LSRs, each LDP session
uses its own TCP connection.
LDP Working Process
The LDP working process includes LDP discovery, session creation and
maintenance, and label distribution and management.
LDP Discovery
LDP discovers peers and creates adjacencies through the discovery
mechanism. LDP supports basic discovery and extended discovery.
The basic discovery mechanism is implemented by periodically sending
link hello messages on the interfaces where LDP is enabled. These are UDP
multicast packets with port 646, sent to the all-routers-on-this-subnet
multicast address 224.0.0.2.
The extended discovery mechanism discovers LDP neighbors that are not
directly connected by periodically sending targeted hello messages (UDP
unicast packets with port 646) to a specific IP address.
When an LSR receives an LDP hello message, it knows that a potentially
reachable LDP peer exists, and it can obtain the label space used by that
peer.
LDP Session Creation and Maintenance
The exchange of LDP discovery hello messages between two LSRs (LSR1 and
LSR2) triggers the creation of an LDP session. Creating an LDP session
includes two steps: creating the transport connection (TCP connection)
and session initialization.
Assume that the label space of LSR1 is LSR1: a and the label space of
LSR2 is LSR2: b. The following describes how LSR1 creates the LDP
session.
Process of creating the transport connection
After exchanging LDP discovery hello messages, the two parties create an
adjacency. The active party is then determined by comparing the transport
addresses of the two parties. If the transport address of LSR1 is larger
than that of LSR2, LSR1 is the active party: it initiates the TCP
connection (port 646) to LSR2, and LSR2 is the passive party, which waits
for the connection to be created.
The transport address is determined as follows:
If LSR1 uses the optional transport address TLV in the hello message sent
to LSR2 to advertise its address, the transport address of LSR1 is the
address advertised in the TLV.
If LSR1 does not use the optional transport address TLV, the transport
address of LSR1 is the source address of the hello message sent to LSR2.
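The two rules above (how the transport, or transmission, address is chosen and how the active party is selected) can be sketched in Python. The function names and the example addresses are hypothetical; the comparison treats the IPv4 addresses as unsigned integers.

```python
import ipaddress

def transport_address(hello_source, transport_tlv=None):
    """Use the address from the optional TLV if present, else the hello source."""
    chosen = transport_tlv if transport_tlv is not None else hello_source
    return ipaddress.IPv4Address(chosen)

def session_role(local_addr, remote_addr):
    """The party with the larger transport address opens the TCP connection."""
    # Equal addresses would indicate a misconfiguration; this sketch
    # simply reports "passive" in that case.
    return "active" if local_addr > remote_addr else "passive"
```

For example, if LSR1 advertises 1.1.1.1 in the TLV while LSR2 sends its hello from 10.0.12.2 without the TLV, LSR2 has the larger address and initiates the TCP connection to port 646.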
LDP session initialization
After LSR1 and LSR2 create the transport layer connection, they exchange
LDP initialization messages to negotiate the LDP session parameters. The
parameters to be negotiated include the LDP protocol version, the label
distribution mode, and the session hold timer value. When the parameters
are negotiated successfully, a session based on LSR1: a and LSR2: b is
created between LSR1 and LSR2. The
following describes the initialization process of a session through the state
F bit: forwarding bit. 1: the notification message should be forwarded to
the next-hop or previous-hop LSR of the LSP related to the message; 0: do
not forward.
State data: 30-bit unsigned integer that indicates the state information.
Message type: 0: the state TLV is not related to a specific message;
non-zero: the type of the message that the state TLV refers to.
BGP/MPLS VPN
BGP/MPLS VPN is a mechanism that allows a service provider (SP) to use
its IP backbone network to provide L3 VPN service for users. In this
mechanism, BGP is used to advertise the VPN routing information in the
backbone network of the SP, and MPLS is used to forward VPN traffic from
one VPN site to another.
Concepts and Terms of BGP/MPLS VPN
Term Description
P-Network Provider network, the backbone network of the service provider.
PE router Provider Edge Router
P router Provider Router
CE router Customer Edge Router
ASBR Autonomous System Border Router
Site The networks connected to a CE form a site.
VRF VPN Routing and Forwarding instance, supported on the PE. Each VRF has an independent route forwarding table.
VPN An abstract concept. It can be considered to be a group of sites sharing routing information. A VPN can include multiple VRFs (the VPN contains the routes of multiple VRFs); one VRF can belong to multiple VPNs (multiple VPNs contain the routes in the VRF).
RD Route Distinguisher, a 64-bit number. Address overlapping is allowed between different VRFs. When BGP advertises VRF routes, a different RD must be added to the IP address to ensure that the address is unique.
RT Route Target, used when BGP advertises VPN routes. It controls which VRFs import the received VPN routes.
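The roles of the RD and the RT can be illustrated with a small Python sketch. It assumes the common RD encoding with a 2-byte type field of 0, a 2-byte AS number, and a 4-byte assigned number; the AS number 100, the assigned numbers, and the RT strings are hypothetical example values.

```python
import ipaddress
import struct

def rd_type0(asn, number):
    """Build a 64-bit RD: type 0 (2 bytes), AS number (2 bytes), number (4 bytes)."""
    return struct.pack("!HHI", 0, asn, number)

def vpn_ipv4(rd, prefix):
    """Prepend the RD to the IPv4 prefix so overlapping prefixes stay distinct."""
    net = ipaddress.ip_network(prefix)
    return rd + net.network_address.packed, net.prefixlen

def imports_route(vrf_import_rts, route_rts):
    """A VRF imports a VPN route if they share at least one route target."""
    return bool(set(vrf_import_rts) & set(route_rts))
```

Two VRFs that both contain 10.2.1.0/24 produce different 12-byte VPN-IPv4 addresses when their RDs differ, so BGP can carry both routes; the RT then decides which VRFs on the receiving PE install each route.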
BGP/MPLS VPN Network Structure
The following illustrates the BGP/MPLS VPN network structure.
Figure 25-5 BGP/MPLS VPN network structure
In the preceding figure, each PE contains two VRFs, and connects two sites.
The two interfaces connecting sites belong to two different VRFs. Site1 and
site2 belong to one VPN; site3 and site4 belong to another VPN.
Process of Route Advertisement and Label Mapping Advertisement
In the P-Network, each device runs an IGP (such as OSPF) to mutually
advertise routes, including the routes to loopback interfaces.
In the P-Network, each device enables MPLS and mutually advertises label
mappings through a signaling protocol (such as LDP). For PE1, the routing
table contains a route to the loopback interface of PE2; the
corresponding outgoing MPLS label is L1.
An IGP such as OSPF or RIP runs between PE and CE. BGP can also be used,
or the routing information can be exchanged through static routes. For
PE2, the route 10.2.1.0/24 learned from site2 is saved in the routing
table of VRF1.
Run MP-BGP between PEs to mutually advertise VPN routes (including
In CSC-PE1, replace the external label with L4 according to external
label L5 and label forwarding entry L5-to-L4. Then push one more label,
L1, according to next hop 2.2.2.2 of the entry and FTN
2.2.2.2/32-to-L1. As a result, CSC-PE1 sends an "L1 L4 L7 10.2.1.1…"
MPLS packet to P1.
After P1 receives the packet, it pops the external label L1 and sends
the packet "L4 L7 10.2.1.1…" to CSC-PE2.
In CSC-PE2, replace the external label with L3 according to external
label L4 and label forwarding table entry L4-to-L3. As a result, CSC-PE2
sends an "L3 L7 10.2.1.1…" MPLS packet to CSC-CE2.
In CSC-CE2, replace the external label with L2 according to external
label L3 and label forwarding table entry L3-to-L2. As a result, CSC-CE2
sends an "L2 L7 10.2.1.1…" MPLS packet to USER-P.
After the packet reaches USER-P, the external label is popped according
to label entry L2-to-NULL. Then the packet "L7 10.2.1.1…" is forwarded
to USER-PE2.
When USER-PE2 receives the packet, label L7 is popped according to label
table entry L7-to-NULL. Then the IP packet is sent to USER-CE2.
As a result, the IP packets of USER-CE1 reach USER-CE2. The process of
forwarding packets is complete.
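The label operations in the steps above can be replayed with a short Python sketch. The starting stack "L5 L7" is inferred from the first step shown here (CSC-PE1 swaps L5 for L4 and keeps L7 underneath); the data structure and function names are hypothetical.

```python
# Per-node operations in the order described above (stack top is index 0).
STEPS = [
    ("CSC-PE1", [("swap", "L5", "L4"), ("push", "L1")]),
    ("P1", [("pop", "L1")]),
    ("CSC-PE2", [("swap", "L4", "L3")]),
    ("CSC-CE2", [("swap", "L3", "L2")]),
    ("USER-P", [("pop", "L2")]),
    ("USER-PE2", [("pop", "L7")]),
]

def apply_op(stack, op):
    """Apply one push, swap, or pop to the label stack and return the new stack."""
    kind = op[0]
    if kind == "push":
        return [op[1]] + stack
    if not stack or stack[0] != op[1]:
        raise ValueError(f"expected {op[1]} at the stack top, got {stack}")
    if kind == "swap":
        return [op[2]] + stack[1:]
    if kind == "pop":
        return stack[1:]
    raise ValueError("unknown operation " + kind)

def walk(stack):
    """Return the label stack of the packet as it leaves each node."""
    trace = []
    for node, ops in STEPS:
        for op in ops:
            stack = apply_op(stack, op)
        trace.append((node, list(stack)))
    return trace
```

Running `walk(["L5", "L7"])` reproduces the packets quoted in the text: "L1 L4 L7" leaves CSC-PE1, "L4 L7" leaves P1, and an unlabeled IP packet leaves USER-PE2.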
MPLS L2VPN
Terms
VPLS: Virtual Private LAN Service. It extends an Ethernet LAN across an
IP/MPLS network and provides users with a transparent LAN service across
the WAN.
VPWS: Virtual Private Wire Service, a point-to-point virtual private line
technology.
H-VPLS: Hierarchical VPLS, a technology that improves VPLS scalability.
PW: Pseudowire, an emulated leased line or virtual circuit between two
nodes.
AC: Attachment Circuit, the circuit connecting a CE device and a PE
device. It is a physical circuit.
VC: Virtual Circuit, a logical link between devices.
SVC: Spoke VC
uPE: User-facing PE
nPE: Network-facing PE
Q-in-Q: an Ethernet encapsulation technology that allows another 802.1Q
VLAN tag to be added to a frame that already carries an 802.1Q VLAN tag;
it is also called VLAN stacking.
VSI: Virtual Switch Instance. Multiple VPLS forwarders connected through
PWs form a VSI.
Basic Concepts
MPLS L2VPN provides L2 VPN service over the MPLS network. With the MPLS
L2VPN technology, carriers can provide users with L2 VPN services over
different media through the MPLS network, including ATM, FR, VLAN,
Ethernet, PPP, and HDLC. The same MPLS network also provides common IP,
L3 VPN, traffic engineering, and QoS services. As a result, carriers can
reduce their investment in network construction.
MPLS L2VPN transparently transmits L2 data over the MPLS network.
Through the MPLS L2VPN network, L2 connections can be created between
different sites. Take ATM as an example: an ATM virtual circuit is
configured on each CE and connected to a remote CE across the MPLS
network. This is the same as interconnection through an ATM network.
With MPLS L2VPN, the carrier only needs to provide L2 connectivity for
users and does not need to participate in the route calculation of VPN
users. However, like traditional L2 VPNs (for example, VPNs provided by
ATM PVCs), MPLS L2VPN has the N-squared problem. In each VPN, the
connection between any two CEs requires a link between CE and PE. For a
PE device, if a VPN has N sites, N-1 physical or logical port connections
between CE and PE must be created. In MPLS L2VPN, the PE device does not
participate in the route calculation of users. Therefore, the
scalability of L2VPN is better than that of L3VPN, but L2VPN is less
flexible.
The PPVPN working group of the IETF produced many framework drafts, of
which the two most important are Martini and Kompella. The Martini draft
implements MPLS L2VPN by extending LDP; the Kompella draft implements it
by extending MP-BGP. Currently, the Martini draft has become a standard.
Maipu supports this mode.
Relevant RFCs are as follows:
RFC4905, Encapsulation Methods for Transport of Layer 2 Frames over
MPLS Networks
RFC4906, Transport of Layer 2 Frames Over MPLS
MPLS L2VPN covers Virtual Private Wire Service (VPWS) and Virtual Private
LAN Service (VPLS). VPWS is a point-to-point virtual dedicated line
technology that supports most link layer protocols. VPLS provides
LAN-like services over the MPLS network, so that distributed users can
access each other as if they were directly connected to the same LAN.
VPWS
The basic principle of MPLS L2VPN is similar to that of BGP/MPLS VPN. It
also uses the label stack to implement transparent transmission of
packets over the MPLS network. The external label (tunnel label) is used
to transfer packets from one PE to another. The internal label (called
the VC label in MPLS L2VPN) is used to distinguish different connections
in different VPNs. The receiving PE determines the destination CE
according to the VC label. During forwarding, the label stack changes as
follows:
Figure 25-11 Forwarding process of MPLS L2VPN label
Illustration
L2PDU: link layer packets
V: internal VC label
T, T1: external Tunnel label, in the MPLS forwarding, the tunnel label will
be replaced.
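The label stack changes listed in the illustration can be sketched in Python. This is a simplified model with hypothetical function names: the labels are the symbolic names V, T, and T1 from the illustration, and the sketch assumes that the egress PE still receives the tunnel label (no penultimate hop popping).

```python
def ingress_pe(l2pdu, vc_label, tunnel_label):
    """Push the internal VC label first, then the external tunnel label."""
    return {"labels": [tunnel_label, vc_label], "payload": l2pdu}

def p_router(packet, swap_table):
    """A P device swaps only the tunnel label; the VC label is untouched."""
    top, *rest = packet["labels"]
    return {"labels": [swap_table[top]] + rest, "payload": packet["payload"]}

def egress_pe(packet, vc_table):
    """Remove both labels and use the VC label to pick the destination CE."""
    _tunnel_label, vc_label = packet["labels"]
    return vc_table[vc_label], packet["payload"]
```

A frame entering the ingress PE leaves as "T V L2PDU", crosses the P device as "T1 V L2PDU", and is delivered to the CE that the egress PE associates with V.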
Implementation Mode of Martini
Martini mode defines a method of implementing MPLS L2VPN by creating
point-to-point links. It distributes VC labels by extending the LDP
signaling protocol. Therefore, the mode is also called LDP L2VPN.
To allow LDP to distribute VC labels, RFC4447 extends the LDP protocol by
adding the VC FEC type. In addition, the two PEs exchanging VC labels may
not be directly connected. Therefore, LDP must create a session between
targeted peers and then transfer the VC FEC and VC labels over the
session. The process by which LDP distributes VC labels is the same as
the distribution process of other labels.
The L2VPN implemented by extending LDP can carry ATM, FR, Ethernet/VLAN,
PPP, and HDLC. It requires that the link layer protocols of all sites in
the VPN are the same. For example, the L2 VPN can be created only when
all sites are Ethernet or all sites are ATM. The disadvantage of L2VPN in
Martini mode is that only point-to-point L2 VPN connections can be
created, and the automatic discovery mechanism of the VPN is not
supported.
The L2VPN of Martini mode focuses on the problem of "how to create a
virtual circuit between two CEs". It uses VC-TYPE + VC-ID to identify a
VC. VC-TYPE indicates the VC type, such as ATM, ETHERNET, VLAN, or PPP.
VC-ID identifies a VC and must be unique on the PE device. The PEs
connecting the two CEs exchange VC labels through LDP and bind the
corresponding CEs through the VC-ID.
When the LSP connecting two PEs is created successfully, and the
exchange and binding of labels are complete, the VC is complete. The two
CEs transmit L2 data through the VC.
Point-to-Multipoint Connection (VPLS)
Background and Features of VPLS Technology
VPLS (Virtual Private LAN Service) is a MAN Ethernet technology. It can
connect the access points and implement point-to-point,
point-to-multipoint, and multipoint-to-multipoint Ethernet services in
the network topology.
In terms of connection mode, VPLS uses the IP/MPLS WAN backbone network
to provide enterprise users with an emulated LAN connection. In terms of
service provision, the emulated LAN of VPLS provides a convenient and
flexible Ethernet service. The emulated LAN connection is transparent to
the sub-LANs across the WAN: each sub-LAN appears to be connected to the
same switch.
VPLS uses IP/MPLS domain to classify the network and restrict the L2
service to the entrance/edge network. According to the networking
requirements, the MAN using VPLS technology includes the following two
modes.
1. The access network provides L2 service; the aggregation and core
network provide L3 service.
2. The access network and aggregation network provide L2 service; the
core network provides L3 service.
VPLS technology integrates the IP/MPLS, VPN, and Ethernet switching to
implement the multipoint-to-multipoint LAN interconnection in the WAN.
The advantage of VPLS is that, after the PE devices with multipoint
connectivity are configured, when CE devices are added, deleted, or
redeployed in the VPN, you only need to reconfigure the directly
connected PE device. If point-to-point L2VPN is used, the peer PE device
must also be reconfigured.
Two Graphical Concepts and Working Principle of VPLS
VPLS technology includes a signaling control plane and a data forwarding
plane. To implement the VPLS signaling control function, you can use BGP
or Targeted-LDP; the two approaches are called Kompella VPLS and Martini
VPLS respectively. Currently, only the VPLS control plane based on
Targeted-LDP is supported.
In the signaling control plane, VPLS uses the LDP signaling protocol to
create a pair of unidirectional MPLS VC-LSPs across the backbone network
and thereby create the corresponding PW between PEs. Ethernet data units
are transmitted over the backbone network through the PW. A VC-LSP can be
configured statically or set up dynamically by LDP. The created PSN
tunnel can carry multiple VPLS services. At the same time, it shields the
transmitted data to protect security across the backbone network.
For data forwarding, in a MAN built with VPLS technology, the PE devices
independently learn MAC addresses and maintain the MPLS FIB table, and
they encapsulate and de-encapsulate the received L2 data according to
RFC4447. The data is exchanged between PEs through the PSN tunnel created
by MPLS LSPs. One VPLS instance corresponds to one enterprise customer.
The PE maintains a separate MPLS FIB for each VPLS instance. The key
content of an MPLS FIB entry is the mapping between a MAC address and a
PW, namely, the mapping between a MAC address and an LSP. Note that one PW
is composed of two LSPs. A MAC address corresponds to the label of the
reverse direction, so that data can be forwarded properly.
When the PE maintains the MPLS FIB entries, it faces a problem similar to
MAC address aging on a switch. VPLS handles this by sending address
withdraw messages through the signaling protocol. An LDP address withdraw
message contains a FEC TLV (identifying the VPLS involved) and an
optional MAC address TLV.
VPLS technology emulates a transparent LAN in which the sub-LANs appear
to be connected to one switch, so loops will inevitably be encountered.
VPLS solves the problem in two ways: run STP on each PE and tunnel the STP
BPDUs, or interconnect all PEs in a full mesh and apply split horizon.
For the first method, STP was developed for LAN technology, and in a LAN
with many hosts its convergence time is long. Although STP has been
improved in multiple ways, it is not suitable for large-scale networks.
The second method can solve the loop problem at a certain scale. But as
the number of PEs increases, the full-mesh interconnection increases the
number of inter-PE LSPs, decreases the flexibility of network deployment,
and increases the load on the PEs. You can solve the problem by applying
hierarchical VPLS (H-VPLS) in a large-scale network.
H-VPLS uses a centralized star layout to create the hierarchy: full-mesh
tunnels are maintained between backbone sites (designated as PEs); CE
devices are connected to a uPE; and each uPE is connected to one nPE.
Through the hierarchy, H-VPLS enables carriers to assign bandwidth
dynamically in the network to create unique sections. H-VPLS can use
bandwidth effectively, especially for video applications. By pushing
multipoint broadcast to the edge of the carrier network, H-VPLS decreases
the load on the core of the MAN.
VPLS Packet Encapsulation
The packets transmitted over the AC and the PW in VPLS are Ethernet
frames. Two Ethernet encapsulation modes are supported: RAW and TAGGED.
RAW: the packets may or may not contain an 802.1Q VLAN tag, but the tag
is meaningless to the two connected nodes. The tag is transparently
transmitted.
TAGGED: each packet contains at least one 802.1Q VLAN tag. The tag is
meaningful to the two connected nodes, namely, the two connected nodes
have agreed on the tag (for example, through signaling or manual
configuration).
For a PE device, selecting the AC or PW encapsulation mode means
selecting the encapsulation format of the packets output from the AC or
PW. If the TAGGED mode is selected, for the AC, the tag is meaningful to
the CE and PE at its two ends; for the PW, the tag is meaningful to the
two ends of the pseudowire between PE1 and PE2.
The packets received from the AC, namely, the packets received from the
VLAN interface, may or may not contain a tag. If a tag is contained, it
can be the Service-Tag (S-TAG) added by users so that the SP network can
distinguish users, or it can be the customer VLAN tag (C-TAG). Whether the
tag is an S-TAG or a C-TAG is identified according to the customer
configuration (packets first match the TPID of OVID, then match the TPID
of IVID, namely, the per-chip configured inner TPID. If the two TPIDs are
equivalent, the tag is considered to be OVID).
Packets with tags can also be received from the PW. If the PW is in
TAGGED mode and the TPID in the packets is equivalent to the configured
TPID, the external tag is considered to be an S-TAG. Otherwise, it is a
C-TAG. If the PW is in RAW mode, the tag contained in the packets is a
C-TAG. The C-TAG is transparently transmitted in the VPLS processing. It
is not deleted or replaced.
1. Packet Encapsulation on AC
The packet encapsulation mode on the AC is determined by the VSI access
mode of the user VLAN interface. The user access modes include: Ethernet
access and VLAN access.
VLAN access: the Ethernet frame header sent to PE from CE or sent to CE
from PE contains a VLAN tag. The tag is an S-TAG added by the customer
so that the SP network can distinguish customers.
Ethernet access: the Ethernet frame header sent upstream by the CE and
sent downstream by the PE does not contain an S-TAG. If the frame header
contains a VLAN tag, it is the internal VLAN tag of the user packets and
is meaningless to PE devices. This internal VLAN tag is called the C-TAG.
The Ethernet access mode corresponds to the RAW encapsulation mode. The
VLAN access mode corresponds to the TAGGED VPLS encapsulation mode. The
packet processing modes are as follows:
A. RAW Mode
In the VPLS processing, the tags of packets sent from the AC are not
processed, regardless of whether an S-TAG or C-TAG exists. Whether an
S-TAG is added to the packets is determined by the port configuration and
VLAN configuration.
B. TAGGED Mode
If the packets sent from the AC contain an S-TAG in the VPLS processing,
check whether the S-TAG is the same as the S-TAG of the AC. If they are
the same, do not perform any operation; if they are different, replace
the tag. If the packets do not contain an S-TAG, add the S-TAG of the AC.
2. PW Encapsulation
The encapsulation mode in PW also contains two types: RAW mode and
Tagged mode.
A. RAW Mode
If the PW uses the RAW mode, the PW represents a virtual link between two
Ethernet ports. Packets are transparently transmitted. The packets can
contain tags, but the tags are meaningless for the ingress and egress PEs.
The S-TAG is not transmitted over the PW.
For packets received from the AC that are to be output to the PW: if the
packets contain an S-TAG, delete the S-TAG first and then push two layers
of MPLS labels before forwarding. If the packets do not contain an S-TAG,
push two layers of MPLS labels before forwarding.
B. TAGGED Mode
After the PW is configured in TAGGED mode, the PW represents a virtual
link between two VLANs. Each PW can represent a different VLAN to perform
switching for a different network. Each packet must contain a tag. The
tag value is meaningful for the ingress PE and egress PE.
For packets received from the AC that are to be output to the PW: if the
packets contain an S-TAG, push two layers of MPLS labels before
forwarding; if the packets do not contain an S-TAG, add an empty tag
(TAG VID=0) and then push two layers of MPLS labels before forwarding.
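The PW encapsulation rules for the two modes can be summarized in a short Python sketch. The frame is modeled as a dictionary with an optional "s_tag" VLAN ID; this representation, the function name, and the label values are hypothetical, and the two MPLS labels are taken to be the tunnel label and the VC label.

```python
def pw_encapsulate(frame, mode, vc_label, tunnel_label):
    """Prepare a frame received from the AC for output to the PW."""
    out = dict(frame)  # do not modify the caller's frame
    if mode == "raw":
        # RAW: the S-TAG is never carried over the PW; a C-TAG is untouched.
        out["s_tag"] = None
    elif mode == "tagged":
        # TAGGED: every packet must carry a tag; add an empty one (VID 0) if absent.
        if out.get("s_tag") is None:
            out["s_tag"] = 0
    else:
        raise ValueError("mode must be 'raw' or 'tagged'")
    # Two layers of MPLS labels: tunnel label outside, VC label inside.
    return {"labels": [tunnel_label, vc_label], "frame": out}
```

In RAW mode a frame that arrives with S-TAG 100 leaves without it, while in TAGGED mode the same frame keeps S-TAG 100 and an untagged frame receives a tag with VID 0.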
Basic VPLS
The full-mesh interconnection structure is adopted in the basic VPLS.
In a full-mesh network, session connections are created between the PEs
in the same VPLS instance, and the corresponding PWs are generated. The
packets received from a CE can be forwarded to one or multiple local
interfaces (ACs) and emulated LAN interfaces (PWs). To prevent broadcast
packets from looping in the network, packets received from a PW are not
forwarded to other PWs in the same VPLS instance. This is L2 split
horizon. In a full-mesh network, split horizon is a basic function.
The following figure illustrates the full-mesh connection.
Figure 25-12 Full-mesh connection of basic VPLS
In the preceding figure, enterprise user A and user B are connected to
three branch LANs through VPLS technology respectively. Red line
indicates the traffic flow of user A and blue dotted line indicates the traffic
flow of user B. Each branch LAN of the user is connected to the IP/MPLS
backbone of the carrier through PE to form a VPLS instance. In the
preceding figure, user A belongs to VPLS instance 1; user B belongs to
VPLS instance 2. The traffic flow can be transmitted in the LAN mode
between branch LANs in the same VPLS instance. Even if multiple
enterprises access the same backbone network through the same PE, the
traffic flows are independent from each other logically. This ensures the
privacy of user data. The VPLS instance data of user A and user B are
isolated and cannot be interconnected.
To connect different branch sites, you should create full-mesh
interconnection between the PEs of the same VPLS instance. The
interconnection is a data tunnel created through IP/MPLS LSPs. The PE
provides Ethernet-based bridge access for users. The PE directly receives
data frames in Ethernet encapsulation format from the user branch LAN
and, according to the destination MAC address, forwards the data to the
proper LSP to reach the branch LAN at the other end. With the VPLS
protocol running on the PE, the interfaces on the PE that connect user
networks provide L2 switching and MAC address learning capability, like a
bridge device. When the PE receives a data frame, it first checks whether
the destination MAC address in the frame header matches an entry in the
MAC address table. If an entry matches, the data frame is forwarded to
the corresponding LSP for transmission; if no entry matches, the data
frame is broadcast to the other logical ports serving the same VPLS
instance. When the PE device later receives data from the host that owns
the MAC address and learns the address, the MAC address table is updated,
and subsequent data frames are forwarded normally. This is similar to the
working principle of an Ethernet switch.
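The learning and flooding behavior described above, together with the rule that packets received from a PW are not forwarded to other PWs, can be sketched as a small Python class. The class name, port names, and MAC values are hypothetical; aging and address withdraw handling are omitted.

```python
class Vsi:
    """Minimal model of one VPLS instance on a PE."""

    def __init__(self, acs, pws):
        self.acs = set(acs)      # local attachment circuits
        self.pws = set(pws)      # pseudowires to the other PEs
        self.mac_table = {}      # learned MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC and return the sorted list of output ports."""
        self.mac_table[src_mac] = in_port
        known_port = self.mac_table.get(dst_mac)
        if known_port is not None:
            candidates = {known_port}            # known unicast
        else:
            candidates = self.acs | self.pws     # unknown: flood
        candidates.discard(in_port)              # never send back to the source port
        if in_port in self.pws:
            candidates -= self.pws               # frames from a PW never go to a PW
        return sorted(candidates)
```

A frame with an unknown destination that arrives on an AC is flooded to the other ACs and to all PWs, while the same frame arriving on a PW is flooded to the ACs only; once the destination has been learned, frames go to a single port.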
H-VPLS
Hierarchical VPLS (H-VPLS) is a technology that improves VPLS
scalability. It extends the access scope of the service provider VPLS and
decreases the network complexity to facilitate network management. At
the same time, the construction and operation cost is reduced. With
common VPLS, when one PE is added, it must be fully connected with every
other PE. If LDP is used, each PE device in the VSI must be configured,
and the quantity of control packets encounters the N-squared problem.
With H-VPLS, when a PE is added, you only need to modify the
configuration of the PE it connects to. In addition, the quantity of
packets does not encounter the N-squared problem.
New roles are introduced in H-VPLS: the uPE, namely the user-end PE, and the nPE, namely the network-end PE, which is the PE in the SP network connected to the uPE. A uPE can be an L2 device with only the Ethernet switching function, or an L3 device with both switching and routing functions. One end of the uPE connects to a PE of the SP network; the other end (multiple interfaces) connects to multiple user CE devices in the building. The uPE is part of the VPLS. It connects to the PE by creating a PW, which is also called an SVC.
In the H-VPLS network, the user-end PE (uPE) is usually placed at the entrance. Therefore, it is also called a Multi-Tenant Unit (MTU). If the MTU has only the switching function, H-VPLS can use L2 QinQ access. This mode suits the early stage of network construction, when the access devices in the system do not yet have the MPLS function, and small access networks where only a simple access function is required. If the MTU has routing and MPLS functions, H-VPLS can use MPLS LSP access. This mode suits medium-scale access or aggregation points, and it extends the MPLS network to the user end so that users can use other value-added services (VAS) of the MPLS network.
The core network in H-VPLS uses a full-mesh topology, and the edge network uses a hub-and-spoke star topology. In the preceding figure, the uPE is the hub and the multiple CEs are the spokes. The core layer and the edge layer are connected through pseudo wires.
If the uPEs in an H-VPLS network were instead fully connected as in basic VPLS, every uPE would have to act as a PE of the basic VPLS, and the quantity of sessions would be far greater than that of the fully connected PE devices in H-VPLS. H-VPLS therefore enhances the expansibility of VPLS and prevents the N² problem caused by expansion. For a new uPE, configure only the uPE and the PE it connects to. You do not need to
change other devices, which improves maintainability and manageability.
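The scaling difference can be checked with simple arithmetic. The sketch below counts pseudo wires per VPLS instance; it assumes every uPE is single-homed with one spoke PW, and the function names and the device counts (4 nPEs, 20 uPEs) are made up for illustration.

```python
def full_mesh_pws(n_pe: int) -> int:
    """PWs needed to fully mesh n_pe PEs: n(n-1)/2."""
    return n_pe * (n_pe - 1) // 2


def hvpls_pws(n_npe: int, n_upe: int) -> int:
    """Full mesh among the nPEs plus one spoke PW per single-homed uPE."""
    return full_mesh_pws(n_npe) + n_upe


flat = full_mesh_pws(4 + 20)   # all 24 devices act as PEs of a basic VPLS
hier = hvpls_pws(4, 20)        # 4 nPEs in the core, 20 uPEs at the edge
print(flat, hier)              # 276 26
```

Adding one more device costs 24 new PWs in the flat design, because every existing PE must be touched, but only 1 new spoke PW in the hierarchical design.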
Two signaling modes are available between the PE and the uPE. In one mode, the PW from the PE to the uPE is implemented through the spoke VC function of LDP. The other mode is H-VPLS based on QinQ, which is applicable only to Ethernet links.
Figure 25-13 H-VPLS connection
In H-VPLS, a uPE can connect multiple CEs, which can belong to one or several VPLS instances. Between the nPE and the uPE, a label or a VLAN ID is used to distinguish the VPLS instance. If the VLAN ID is used, QinQ is required, because the user data frame may already carry a VLAN tag. CEs of the same VPLS instance that are connected to the same uPE can exchange information through L2 switching on the uPE, without the participation of the nPE.
When CE2 sends data to the remote CE1 (a CE reached across the WAN of the SP) in the VPLS instance, the Ethernet frame is first sent to uPE1. If uPE1 has not learned the DEST-MAC of the frame, or the frame is a broadcast or multicast frame, uPE1 sends the frame on all ports (AC, SVC, and PW) other than the receiving port. When PE2 receives the frame, it floods the frame on all other ports (PW, AC, and other SVCs) of the VPLS instance if the DEST-MAC is not learned, or sends the frame on the corresponding PW if the DEST-MAC is learned. When PE1 at the other end receives the frame, it also forwards according to the DEST-MAC: if the DEST-MAC is not learned, it floods the frame on the other ports of the VPLS instance; if the DEST-MAC is learned, it sends the frame to the corresponding AC, which delivers it to CE1.
1. Access Through SVC
The connection between the uPE and the PE can use a VC, which is called a Spoke-VC (SVC). The SVC identifies the VPLS instance of the packets entering the PE. There are two cases for the SVC:
The uPE has the switching capability. Received packets are processed as described previously. Between the uPE and the PE, one PW is maintained for each VPLS.
The uPE has no switching capability. Between the uPE and the PE, multiple PWs must be maintained for one VPLS. The uPE works in the same mode as VPWS: each ingress interface on the uPE that a CE accesses corresponds to one PW toward the PE. In this case, the packets that the uPE receives from the AC are sent to the PE for processing. Even packets destined for another CE of the same VPLS connected to the local uPE are switched on the PE. This mode has some disadvantages; it is provided only for compatibility with already deployed uPE devices that support VPWS.
An SVC can also connect two VPLS instances (such as two VPLS instances in different MANs). This is called multi-domain VPLS, and the two PEs connected by the SVC are called border-PEs. If multiple domains must be interconnected, build a full mesh among the border-PEs of each VPLS through SVCs. As a result, an L3 VPLS network is formed.
2. QinQ Access
Figure 25-14 Packet process of QinQ Access
The preceding figure illustrates the packet forwarding process of QinQ access:
A. Enable QinQ at the CE access port. An outer VLAN tag is pushed onto the received packets to serve as the multiplexing/demultiplexing tag. Between the MTU and PE1, the packets are transparently transmitted to PE1 through the QinQ tunnel.
B. PE1 first determines the home VSI according to the VLAN tag added by the MTU, then pushes the multiplexing/demultiplexing label (MPLS label) of the PW selected according to the destination MAC of the packets, and finally forwards the packets.
C. When PE1 receives packets from the PW side, it determines the home VSI according to the multiplexing/demultiplexing label (MPLS label), adds the VLAN tag according to the destination MAC, and forwards the packets to the MTU through the QinQ tunnel. Finally, the MTU forwards the packets to the CE.
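The tag handling in steps A to C can be illustrated at byte level. The following is a minimal sketch, not the switch's implementation: the function names are hypothetical, and the outer TPID value 0x88A8 (the IEEE 802.1ad S-tag) is an assumption, since some devices use 0x8100 or a configurable value for the outer tag.

```python
import struct

TPID_OUTER = 0x88A8  # assumption: 802.1ad S-tag TPID; devices may use 0x8100


def push_outer_tag(frame: bytes, vlan_id: int, tpid: int = TPID_OUTER) -> bytes:
    """Insert a 4-byte outer VLAN tag after the destination and source MACs."""
    if not 0 < vlan_id < 4095:
        raise ValueError("VLAN ID must be in 1..4094")
    tag = struct.pack("!HH", tpid, vlan_id)  # priority and DEI bits left at 0
    return frame[:12] + tag + frame[12:]


def pop_outer_tag(frame: bytes):
    """Remove the outer tag; return (vlan_id, frame without the outer tag)."""
    _tpid, tci = struct.unpack("!HH", frame[12:16])
    return tci & 0x0FFF, frame[:12] + frame[16:]


# A made-up customer frame: dst MAC, src MAC, EtherType 0x0800, payload.
customer = bytes(6) + bytes([2, 0, 0, 0, 0, 1]) + b"\x08\x00" + b"payload"
tagged = push_outer_tag(customer, 100)      # step A: MTU pushes the outer tag
vlan_id, restored = pop_outer_tag(tagged)   # step B: PE1 reads it to pick the VSI
print(vlan_id, restored == customer)
```

Any inner VLAN tag that the customer frame already carries is left untouched, which is why QinQ can separate VPLS instances without interfering with the user's own VLANs.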
MAC Address Restriction
In a VPLS instance, the MAC address is learned before switching is performed, and the MAC address table is then searched according to the destination MAC address. One system can support multiple VSI instances. To keep the MAC address table of any one instance from growing too large, restrict the number of MAC addresses that can be learned in the VSI.
MAC Address Recycling
When a fault occurs, convergence is faster if the other PEs are notified to clear their local MAC table entries of the VSI, which triggers the re-learning of MAC addresses and rebuilds the MAC forwarding path as soon as possible. The address recycling message of the LDP protocol provides this mechanism.
The address recycling message carries the MAC TLV. The devices receiving the message delete or re-learn the MAC addresses according to the parameters specified by the TLV.
The destination of the MAC address recycling message depends on the fault type. The basic principle is to notify all devices that may have learned the MAC addresses. The fault types are AC interface fault, Mesh-PE device fault, and Spoke-PE device fault:
When the AC interface is faulty, send the MAC address recycling message to all Mesh-PE devices and Spoke-PE devices.
When a Mesh-PE device is faulty, notify all Spoke-PE devices.
When a Spoke-PE device is faulty, notify all Mesh-PE devices and the other Spoke-PE devices.
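The three notification rules can be written as a small lookup. This is a sketch of the selection logic only; the function name, the fault-type strings, and the peer names are hypothetical and do not correspond to any CLI keyword or LDP field.

```python
def recycle_targets(fault, mesh_pes, spoke_pes, failed=""):
    """Return the peers that should receive the MAC address recycling message.

    fault     -- "ac", "mesh-pe", or "spoke-pe" (illustrative labels)
    mesh_pes  -- set of Mesh-PE peers of the VSI
    spoke_pes -- set of Spoke-PE peers of the VSI
    failed    -- name of the failed Spoke-PE, used only for "spoke-pe"
    """
    if fault == "ac":
        return mesh_pes | spoke_pes            # everyone may hold stale entries
    if fault == "mesh-pe":
        return set(spoke_pes)                  # only the spokes are notified
    if fault == "spoke-pe":
        return (mesh_pes | spoke_pes) - {failed}
    raise ValueError(f"unknown fault type: {fault}")


mesh = {"pe2", "pe3"}
spokes = {"upe1", "upe2"}
print(sorted(recycle_targets("spoke-pe", mesh, spokes, failed="upe1")))
```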
Loopback Avoidance
As in common Ethernet, loopback avoidance must be taken into consideration for the virtual Ethernet. In VPLS, full mesh and split horizon must be adopted together to avoid loopback.
In the basic networking environment, PWs are created among all PEs of the same VPLS instance to form a full-mesh topology, so each PE can reach every other PE directly through a PW. At the same time, the PE is connected to the CE through the access circuit (AC). Under split horizon, broadcast frames, multicast frames, and frames to be flooded that are received from a PW are not sent to any PW (including the receiving PW) of the same VPLS instance, but can be sent to the ACs; broadcast frames, multicast frames, and frames to be flooded that are received from an AC can be sent to the PWs and to the other ACs of the same VPLS instance. In other words, the packets received from the PWs at the public network will not be forwarded to other
PWs of the public network. The packets can only be forwarded to the
private network.
The core network created in this mode does not have loopback.
If a loopback structure caused by a backdoor link exists in the CE network of the VPLS, the users in the LAN should run a loopback avoidance protocol, such as STP, to avoid loopback. The carrier network does not perceive the users' loopback avoidance protocol; its packets are transparently transmitted as user data.
In an H-VPLS with an MPLS access network, to avoid loopback in forwarding, nPE devices must enable L2 split horizon on the pseudo wires connecting to other nPE devices and disable split horizon on the pseudo wires connecting to uPEs. On an nPE, packets arriving on a pseudo wire from a uPE can be forwarded to the other pseudo wires, whereas packets arriving on a pseudo wire from another nPE are forwarded to the pseudo wires connecting uPEs but not to the pseudo wires connecting other nPEs.
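The split horizon rules for an nPE can be expressed as a filter over the flood list. The sketch below is illustrative only: the port kinds `ac`, `mesh-pw` (PW to another nPE, split horizon enabled), and `spoke-pw` (PW to a uPE, split horizon disabled) are labels invented for this example.

```python
def flood_targets(in_port, ports):
    """Return the ports a flooded frame is copied to on an nPE.

    ports -- dict mapping port name to kind: "ac", "mesh-pw", or "spoke-pw"
    """
    in_kind = ports[in_port]
    targets = set()
    for name, kind in ports.items():
        if name == in_port:
            continue  # never send a frame back out of the receiving port
        if in_kind == "mesh-pw" and kind == "mesh-pw":
            continue  # split horizon: no forwarding between nPE-facing PWs
        targets.add(name)
    return targets


ports = {
    "ac1": "ac",
    "pw-npe2": "mesh-pw",
    "pw-npe3": "mesh-pw",
    "pw-upe1": "spoke-pw",
}
print(sorted(flood_targets("pw-npe2", ports)))  # frame from another nPE
print(sorted(flood_targets("pw-upe1", ports)))  # frame from a uPE
```

A frame from another nPE reaches only the AC and the uPE-facing PW, while a frame from a uPE is also copied to both nPE-facing PWs.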
If a uPE connects to a single PE, the topology is a star and there is no loopback in the network. To protect against link faults, you can use MPLS FRR to ensure fast recovery. To protect against node faults, you can dual-home the uPE to two nPEs.
If a uPE is dual-homed to two PEs, L2 split horizon alone cannot prevent loopback. You must also enable the spanning tree protocol between the uPE and the nPEs.
Comparison between VPLS and VPWS
Item | VPWS | VPLS
Concept | Virtual private wire service. A point-to-point virtual circuit connection that, in users' eyes, is a circuit connecting to the other end and provides transparent transmission of L2 packets. | Virtual private LAN service. Provides a virtual Ethernet service through the WAN. In users' eyes, multiple VPN branches appear to be connected to one huge LAN provided by the SP, and bridge switching is performed in the LAN.
VPN | A point-to-point connection mode of L2VPN. | A point-to-multipoint connection mode of L2VPN.
Expansibility | There is a network expansibility problem, namely the N² problem. After multi-connectivity PE devices are provided for customers, when you add, delete, or re-deploy CEs in the L2 VPN, you must re-configure each peer PE. | Provides good expansibility; operation and maintenance are simple. After multi-connectivity PE devices are provided for VPLS customers, when you add, delete, or re-deploy CEs in the L2 VPN, you only need to re-configure the connected PEs.
Signaling protocol | LDP; the pseudo wire between PEs is called VC. | LDP; the pseudo wire between PEs is called PW or SVC.
Encapsulation mode | Add the VPWS label, and then add the label of the external MPLS tunnel. Take FR AC access as an example: when the AC interface encapsulation between CE and PE is FR and packets are received on the PE, the VPWS label is added before the FR header, and then the external MPLS label is added. | Add the VPLS label, and then add the label of the external MPLS tunnel. Take FR AC access as an example: when the AC interface encapsulation between CE and PE is FR, packets are received on the PE in the format FR header + Ethernet header + Data. The FR header is removed, the VPLS label is added before the Ethernet header carried in the FR packet, and then the MPLS label is added.
AC access | Multiple types of ACs are supported, such as PPP, HDLC, Ethernet, VLAN, FR, and ATM. | Multiple types of ACs are supported, such as PPP, HDLC, Ethernet, VLAN, FR, and ATM.
Packet processing flow | The network connection is as follows: CE1--------PE1--------P-------PE2--------CE2. Assume that data is sent from CE1 to CE2 and that the VC label is exchanged between PE1 and PE2 through the LDP protocol. The data processing flow is as follows: CE1 -> PE1; PE1 adds the VPWS label and then the global route label, and sends the packet to PE2; after PE2 receives the packet, it removes the labels and sends the packet to the CE2 interface. | The network connection is as follows: CE1--------PE1--------P-------PE2--------CE2. Assume that data is sent from CE1 to CE2 and that the VC label is exchanged between PE1 and PE2 through the LDP protocol. The data processing flow is as follows: CE1 -> PE1; after PE1 receives the packet, it learns the MAC address of CE1 and then searches the MAC address table of the VPLS instance with the destination MAC as the key. If the destination MAC is found, the packet is sent on the PW to PE2: PE1 adds the VPLS label in the encapsulation, then adds the global MPLS label, and sends the packet to PE2. After PE2 receives the packet, it learns the address and then searches the table. If the AC is found, PE2 removes the labels and sends the packet to the CE2 interface. If the destination is not found in the table, the packet is flooded in the VPLS instance based on the split horizon principle.
MPLS Traffic Engineering
As networks expand in scale, two disciplines arise: network engineering and traffic engineering.
Network engineering designs the network to meet the traffic requirements. The network designer must understand how traffic is transmitted in the network, and then purchase proper links and network
devices. Network engineering takes a long time to implement, because new links and devices must be purchased and installed.
Traffic engineering designs the traffic so that it is transmitted normally over the existing network. Despite the efforts of network designers, the actual traffic in the network is not the same as the predicted value. Traffic sometimes grows faster than expected, and the network designers cannot upgrade the network at once. Rapid traffic growth, emergencies, or network accidents may raise the bandwidth requirement at certain places while some links in the network are not fully utilized. The core concept of traffic engineering is to move traffic from congested links to links that are not fully utilized. Traffic engineering is not exclusive to MPLS; it is a universal solution. MPLS-based traffic engineering is one such attempt: it takes connection-oriented traffic engineering technology and integrates it with IP routing technology.
At the ingress port (which can be regarded as the source end of the data) of the MPLS network, MPLS traffic engineering controls the path to a specific destination: it creates the LSP, reserves network bandwidth along the route, balances the traffic load, and makes full use of the link bandwidth. MPLS traffic engineering is abbreviated as MPLS-TE.
MPLS-TE ensures the bandwidth for each traffic flow by creating tunnels. After the tunnel is created, the data is mapped to an FEC and forwarded in the tunnel along the LSP. At the head end, the tunnel exists as a tunnel interface, and any traffic that is to pass through the tunnel must be sent through that interface. In network routing, the tunnel interface can be reached through static routes or dynamic routes, and the routes pointing to the tunnel interface can be distributed through dynamic routing.
Another major feature of MPLS-TE is communication protection. Usually local protection, namely fast reroute, is adopted; graceful restart can also be adopted.
Ground of MPLS Traffic Engineering
MPLS-TE can be implemented with either of the following two protocols:
Constraint-based Routing Label Distribution Protocol (CR-LDP).
RSVP-TE, which is extended from RSVP.
RSVP-TE is supported by most vendors. The MP series switches support the RSVP-TE protocol.
RSVP-TE adds the LABEL_REQUEST, EXPLICIT_ROUTE, SESSION_ATTRIBUTE, RECORD_ROUTE, and LABEL objects, which are carried in the PATH and RESV messages. They are used, respectively, to request a label, to carry the complete path specified by the source end, to describe the session, to record the route, and to assign the label. The EXPLICIT_ROUTE object specifies the path of the tunnel, and the path in the object can consist of strict hops and loose hops. Usually, the path calculated by the source end is described in strict hops. If the source end cannot see all details of the entire network, or does not want to specify each hop in the path explicitly, it can use loose hops to describe the path. When a router node receives the PATH message and processes the path object, for a strict hop the first IPv4 address in the object must be an address of the local router; otherwise, the object cannot be processed. For a loose hop, the router generates, on behalf of the source end, a strict-hop path that contains the loose-hop node, and places the new strict-hop object into the PATH message for transmission.
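The strict-hop check and loose-hop expansion described above can be sketched as follows. This is a simplified illustration of the rule stated in the text, not the full RSVP-TE procedure: the function name `process_ero`, the tuple representation of hops, and the `expand` callback (standing in for a local path calculation) are all assumptions made for the example.

```python
def process_ero(local, ero, expand):
    """Return the explicit route to place in the PATH message sent onward.

    local  -- an IPv4 address of this router
    ero    -- list of (ipv4_address, is_loose) tuples, first hop first
    expand -- hypothetical callback: expand(target) returns the strict hops
              from this router's next hop up to and including target
    """
    if not ero:
        return []
    addr, is_loose = ero[0]
    if addr == local:
        ero = ero[1:]              # consume the hop that names this router
        if not ero:
            return []              # this router is the tail end of the path
        addr, is_loose = ero[0]
    elif not is_loose:
        # A strict first hop must be an address of the local router.
        raise ValueError("strict hop does not match the local router")
    if is_loose:
        # Replace the loose hop with a computed strict path that ends at it.
        strict = [(hop, False) for hop in expand(addr)]
        return strict + ero[1:]
    return ero


# Made-up topology knowledge: the strict path toward loose hop 10.0.0.9.
known_paths = {"10.0.0.9": ["10.0.0.5", "10.0.0.9"]}
ero_in = [("10.0.0.1", False), ("10.0.0.9", True), ("10.0.0.10", False)]
print(process_ero("10.0.0.1", ero_in, known_paths.__getitem__))
```

In the usage example, the loose hop 10.0.0.9 is replaced by two strict hops, and the remaining strict hop 10.0.0.10 is carried along unchanged.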
Releasing MPLS-TE Network Topology Information
The RSVP protocol cannot see the topology of the entire network. Therefore, MPLS-TE relies on a link-state routing protocol (OSPF or IS-IS) to release the network topology information and to calculate the tunnel path of MPLS-TE.
According to the known network topology and the advertised MPLS-TE network topology information, the link-state routing protocol (OSPF or IS-IS) calculates the shortest path of the required MPLS-TE tunnel.
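A common way to perform such a calculation is a constrained shortest path first (CSPF) search: links that cannot satisfy the tunnel's bandwidth are pruned, and an ordinary shortest-path search by TE metric runs on what remains. The sketch below assumes that approach; it is not taken from the switch software, and the node names, metrics, and bandwidth figures are invented.

```python
import heapq


def cspf(links, src, dst, bandwidth):
    """Shortest path by TE metric over links with enough unreserved bandwidth.

    links -- iterable of (node_a, node_b, te_metric, unreserved_bw),
             one entry per direction
    Returns (total_metric, [nodes]) or None if no path meets the constraint.
    """
    graph = {}
    for a, b, metric, unreserved in links:
        if unreserved >= bandwidth:      # prune links that cannot carry the tunnel
            graph.setdefault(a, []).append((b, metric))
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, metric in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + metric, nxt, path + [nxt]))
    return None


links = [
    ("A", "B", 1, 50), ("B", "D", 1, 50),     # short path, little bandwidth left
    ("A", "C", 2, 100), ("C", "D", 2, 100),   # longer path, more bandwidth
]
print(cspf(links, "A", "D", 10))   # (2, ['A', 'B', 'D'])
print(cspf(links, "A", "D", 80))   # (4, ['A', 'C', 'D'])
```

The second call shows the effect of the constraint: the tunnel needs more bandwidth than the short path can offer, so the longer path is chosen.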
Releasing MPLS-TE Network Topology Information on OSPF
The MPLS-TE network topology information released on OSPF is of two types: switch address information and switch link information. The released switch address information is the switch ID of MPLS-TE, which identifies the switch node in the MPLS-TE network topology. The relevant information is as follows:
Information Function Command
Router ID The Router ID (one interface IP address) of the switch, used to identify the switch node in the MPLS-TE network topology
The link information is the MPLS-TE information released for each individual link. The corresponding commands are configured in interface mode. The link information includes the following content:
Information | Function | Command
Link type | Specifies the link type: 1 = point-to-point, 2 = multiple access (for example, Ethernet) | N/A
Link ID | On a point-to-point link, the OSPF Router ID of the neighbor; on a multiple-access link, the interface address of the designated router (DR) | N/A
IPv4 interface address | The interface IP address of the advertising switch on the link | N/A
Neighbor address | On a point-to-point link, the interface address of the neighbor at the other end; on a multiple-access link, the interface address of the DR | N/A
TE metric | The cost of the link used when calculating the tunnel path | mpls traffic-eng admin-weight
Maximum physical link bandwidth | The maximum physical bandwidth on the link interface | bandwidth
Maximum reserved bandwidth | The maximum bandwidth that can be reserved on the link | ip rsvp bandwidth
Unreserved bandwidth for each priority | The reservable bandwidth on the link not yet assigned to tunnels at each priority | N/A
Attribute flag | The link attributes defined by the user. The link is included or excluded in the path calculation according to the attribute. | mpls traffic-eng attribute-flags
Releasing MPLS-TE Network Topology Information on IS-IS
The MPLS-TE network topology information released on IS-IS includes the global MPLS-TE network topology information and the attached MPLS-TE network topology information.
The global MPLS-TE network topology information released on IS-IS is as follows:
Information Function Command
Router ID The Router ID (one interface IP address) of the switch, used to identify the switch node in the MPLS-TE network topology
Rtr Id:1 is an ICMPECHO entity. The display shows the time the entity was created and the time it was last modified. The entity has been scheduled 0 times; the detected destination address is 1.1.1.2; two packets are sent per schedule; the packet size is 80 bytes; the timeout is 5s. The alarm mode is SHELL; for alarm-type, none means no alarm, log means the shell prompt, log-and-trap means the shell prompt plus a trap sent to inform the NMS, and trap means only a trap sent to inform the NMS. The round-trip delay threshold is 5ms; when the detected round-trip delay is no less than the threshold, an alarm is raised according to alarm-type. The packet loss threshold is 200000000; be means alarming when the value is no less than the threshold, se means alarming when the value is smaller than or equal to the threshold, and the alarm is raised according to alarm-type. At most 200 history records can be saved, and new records overwrite the oldest records when 200 is exceeded; a history record is saved once per schedule. Currently, the entity is not scheduled; the schedule interval is 23s. The link status is DEFAULT; if the destination is reachable, the link status is REACHABLE.
Rtr id 2 is the ICMP-PATH-ECHO entity. It was created at THU JAN 01 05:15:45 2009 and last modified at THU JAN 01 05:36:34 2009. The entity has been scheduled 0 times, that is, scheduling has not started. Only one ICMP packet is sent to the destination end and to each intermediate device during each schedule; the valid payload is 32 bytes; the timeout is 5000ms; the schedule interval is 60s. Only the network to the destination end is detected; detection of the network of the intermediate devices is set to FALSE. The data is not checked. The alarm mode LOG is the shell prompt; none means no alarm, log means the shell prompt, log-and-trap means the shell prompt plus a trap sent to inform the NMS, and trap means only a trap sent to inform the NMS. The packet loss threshold is 1 and can only be set to 1; be means alarming when the value is no less than the threshold, se means alarming when the value is smaller than or equal to the threshold, and the alarm is raised according to alarm-type. 100 history records are saved, and new records overwrite the oldest records when 100 is exceeded; a history record is saved on every detection. The entity is not in the debug state. The link status is DEFAULT; if the destination is reachable, the status is REACHABLE.
Rtr id 3 is the ICMP-PATH-JITTER entity. It was created at THU JAN 01 05:15:50 2009 and last modified at THU JAN 01 05:48:03 2009. The entity has been scheduled 0 times, that is, scheduling has not started. 10 ICMP packets are sent to the destination end and to the intermediate devices during each schedule; the valid payload is 32 bytes; the timeout is 5000ms; the schedule interval is 60s. The network to the destination end and between the source and the intermediate devices is detected. The data is not checked. The alarm mode LOG is the shell prompt; none means no alarm, log means the shell prompt, log-and-trap means the shell prompt plus a trap sent to inform the NMS, and trap means only a trap sent to inform the NMS. The round-trip delay threshold is 6ms, and an alarm is raised according to alarm-type when the detected round-trip delay is no less than the threshold. The packet loss threshold is 200000000; be means alarming when the value is no less than the threshold, se means alarming when the value is smaller than or equal to the threshold, and the alarm is raised according to alarm-type. The jitter threshold is 5ms. 100 history records are saved, and new records overwrite the oldest records when 100 is exceeded; a history record is saved once every three detections. The entity is not in the debug state. The link status is DEFAULT; if the destination is reachable, the status is REACHABLE.
Rtr Id:4 is a jitter entity. It was created at THU JAN 01 05:15:53 2009 and last modified at THU JAN 01 05:52:41 2009. The entity has been scheduled 0 times and is able to run. The destination IP address of the detection is 1.1.1.2 and the destination port number is 3434. The simulated codec is the well-known G729.A, that is, the packet size is 32 bytes, 1000 packets are sent during each schedule, the schedule interval is one minute, and the interval between packets is 20ms. The timeout is 5000ms. The alarm mode is the shell prompt plus a trap sent to inform the NMS; none means no alarm, log means the shell prompt, log-and-trap means the shell prompt plus a trap sent to inform the NMS, and trap means only a trap sent to inform the NMS; the alarm is raised according to alarm-type. The mos and icpif thresholds are stored as the actual value × 10^6; for example, a MOS threshold of 10.000000 is displayed as 10000000. The number of history records is 120, and new records overwrite the oldest records when 120 is exceeded. The link status is DEFAULT; if the destination is reachable, the status is REACHABLE.
Rtr id:5 is a UDPECHO entity. It was created at THU JAN 01 05:15:56 2009 and last modified at THU JAN 01 06:43:11 2009. The entity has been scheduled 0 times, that is, scheduling has not started; the entity is in the PEND state. The destination IP address of the detection is 1.1.1.2; the destination port is 1234; the timeout is 5000ms; the valid payload is 16 bytes; the schedule period is 6s. The alarm mode is none, that is, no alarm. The round-trip delay threshold is 15ms; be means alarming when the detected value is no less than the threshold, se means alarming when the detected value is smaller than or equal to the threshold, and the alarm is raised according to alarm-type. The packet filling field is abcd. The number of history records is limited to 10, and new records overwrite the oldest records when 10 is exceeded; a history record is saved on every schedule. The link status is DEFAULT; if the destination is reachable, the status is REACHABLE.
Rtr Id:6 is a FLOW-STATISTICS entity. It was created at THU JAN 01 05:15:59 2009 and last modified at THU JAN 01 06:51:15 2009. The entity has been scheduled 0 times, that is, scheduling has not started. The alarm mode is none, that is, no alarm. The threshold for the number of packets received by the interface is 20000; be means alarming when the number of packets actually received by the interface is no less than the threshold, se means alarming when the number of packets actually received by the interface is smaller than or equal to the threshold, and the alarm is raised according to alarm-type. The interface for detection is vlan2; the detection interval is 60s. The number of saved history records is limited to 220, and new records overwrite the oldest records when 220 is exceeded; a history record is saved on every schedule. The link status is DEFAULT; if the destination is reachable, the status is REACHABLE.
show rtr group
Displayed Information Explanation
26-8#show rtr group
There are 1 valid groups now in the system
----------------------------------------------
ID:2 name:rtrGroup2 Members schedule interval:200
*****************************
type:SINGLE Entity Id :3
type:SINGLE Entity Id :45
type:RANGE Entity start id:60 end id:80
type:SINGLE Entity Id :7
26-8#
There is one rtr group in the system.
Rtr group2: the members are scheduled at an interval of 200s, and the member list is 3, 45, 60-80, 7.
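To see which entities the group actually schedules, the SINGLE and RANGE members in the output can be expanded into a flat list. The sketch below is only an aid for reading the display; the tuple format and function name are hypothetical, and it assumes that the end id of a RANGE member is inclusive.

```python
def expand_members(members):
    """Expand group members into a flat list of entity IDs, in listed order.

    members -- list of ("SINGLE", id) or ("RANGE", start_id, end_id) tuples
    """
    ids = []
    for member in members:
        if member[0] == "SINGLE":
            ids.append(member[1])
        elif member[0] == "RANGE":
            # Assumption: the end id is part of the range.
            ids.extend(range(member[1], member[2] + 1))
        else:
            raise ValueError(f"unknown member type: {member[0]}")
    return ids


group2 = [("SINGLE", 3), ("SINGLE", 45), ("RANGE", 60, 80), ("SINGLE", 7)]
ids = expand_members(group2)
print(len(ids), ids[:4], ids[-1])
```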
show rtr schedule
Displayed Information Explanation
Rtr schedule38: schedules rtr entity 1; scheduling starts after three minutes; the life time is 500s; the ageout is 400s; the entity is scheduled twice; the schedule interval is 35s.
show rtr history
After scheduling, view the history records of rtr entity 1:
Displayed Information Explanation
26-8#show rtr history 1
--------------------------------------------------------------
ID:1 Name:IcmpEcho1 CurHistorySize:2 MaxHistorysize:200
History recorded as following:
THU JAN 01 01:06:18 1970 Rtt:1(ms) PktLoss:0
THU JAN 01 01:29:38 1970 Rtt: 1(ms) PktLoss:0
The scheduling result of rtr 1 is as follows: the ICMP-ECHO entity saves at most 200 history records; currently, two history records are saved, one per schedule at the schedule interval of 23s.
The bi-directional delay is 1ms and no packet is lost.
Note
When the number of history records has reached 200, each new record overwrites the oldest record.
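The overwrite behavior in the note is that of a fixed-size ring buffer. A minimal Python illustration follows, using `collections.deque` with `maxlen` set to the MaxHistorysize of entity 1; the record fields are invented for the example and are not the device's storage format.

```python
from collections import deque

history = deque(maxlen=200)      # MaxHistorysize of entity 1 in the example
for n in range(205):             # record 205 schedules, 5 more than fit
    history.append({"seq": n, "rtt_ms": 1, "pkt_loss": 0})

# The five oldest records (seq 0..4) have been overwritten.
print(len(history), history[0]["seq"], history[-1]["seq"])  # 200 5 204
```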
After scheduling, view the history record of rtr entity 2:
Displayed Information Explanation
26-8#show rtr history 2
--------------------------------------------------------------
ID:2 Name:IcmpPathEcho2
History of record from source to dest:
CurHistorySize:2 MaxHistorysize:100
THU JAN 01 00:11:59 1970 Rtt:3
THU JAN 01 00:21:59 1970 Rtt:3
The ICMP-PATH-ECHO entity saves at most 100 history records; currently, two history records are saved, one per schedule at the schedule interval of 60s.
The bi-directional delay is 3ms. If invalid is displayed, the network is unreachable; because the entity sends only one ICMP packet per schedule, this means that the packet was lost.
Note
When the number of history records has reached 100, each new record overwrites the oldest record.
After scheduling, view the history records of rtr entity 3:
Displayed Information Explanation
26-8#show rtr history 3
--------------------------------------------------------------
ID:3 Name:IcmpPathJitter3
History of hop-by-hop:
3.3.3.2 Rtt:1 Jitter:0 Pkt loss:0
1.1.1.2 Rtt:2 Jitter:0 Pkt loss:0
History of record from source to dest:
CurHistorySize:1 MaxHistorysize:100
THU JAN 01 02:30:03 1970 Rtt:2 Jitter:0 Pkt loss:0
The result of the rtr schedule is as follows:
The ICMP-PATH-JITTER entity saves at most 100 history records; currently, one history record is saved.
The result of the rtr 4 schedule is as follows: it is the JITTER entity; at most 120 history records are saved; currently, one history record is saved.
No packet is lost from source to destination or from destination to source. The round-trip delay is 16ms; the uni-directional delay from source to destination is 11ms and the uni-directional delay from destination to source is 15ms; the jitter from source to destination is 10ms; the jitter from destination to source is 10ms; the MOS value is 4.3; the icpif value is 10.0.
Note
1. When the number of history records has reached 120, each new record overwrites the oldest record.
2. The NTP protocol must be configured so that the clocks are synchronized.
After configuring the RTR entity 5, view the history records of rtr entity 5:
Displayed Information Explanation
26-8#show rtr history 5
--------------------------------------------------------------
ID:5 Name:UdpEcho5 CurHistorySize:2 MaxHistorysize:10
History recorded as following:
THU JAN 01 00:31:04 1970 Packet loss:0 Rtt:18(ms)
THU JAN 01 00:31:10 1970 Packet loss:0 Rtt:18(ms)
The detection type is UDPECHO; the maximum number of the history records is 10, currently, two history records are saved.
The following is the statistics information after the entity is scheduled:
The number of the lost packets is 0 and the roung-trip delay is 18ms.
Note
When the number of history records has reached 10, each new record overwrites the oldest record.
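The overwrite behavior described in the note can be modeled with a bounded queue. The following is a minimal sketch, not the device implementation; the record fields (`seq`, `packet_loss`, `rtt_ms`) are illustrative names only.

```python
from collections import deque

def make_history(max_size):
    # A deque with maxlen drops its oldest item when a new item is
    # appended to a full buffer, like the history table in the note.
    return deque(maxlen=max_size)

history = make_history(10)
for seq in range(12):
    # Hypothetical record layout, loosely mirroring the UDPECHO output.
    history.append({"seq": seq, "packet_loss": 0, "rtt_ms": 18})

oldest_kept = history[0]["seq"]
newest_kept = history[-1]["seq"]
```

After 12 records are written to a history of size 10, the two oldest records are gone and the buffer still holds exactly 10 entries.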
After configuring the RTR entity 6, view the history records of rtr entity 6:
Displayed Information Explanation
26-8#show rtr history 6
--------------------------------------------------------------
ID:6 Name:flow-statistics6 CurHistorySize:2 MaxHistorysize:220
History recorded as following:
THU JAN 01 00:31:27 1970 Input pkt:1 (packets/s) Input flow:0(bits/s) Output pkt:1 (packets/s) Output flow:0(bits/s)
THU JAN 01 00:31:37 1970 Input pkt:1 (packets/s) Input flow:0(bits/s) Output pkt:1 (packets/s) Output flow:0(bits/s)
The result of rtr 6 schedule is as follows:
It is the FLOW-STATISTICS entity; the maximum number of history records is 220; currently, two history records are saved.
The following is the traffic statistics of the interface:
The rate of receiving packets is 1 packets/s; the receiving traffic is 0 bits/s; the rate of sending packets is 1 packets/s; the sending traffic is 0 bits/s.
SLA Debug Commands
debug rtr all: displays all SLA debug information
debug rtr icmpecho: displays the detection debug information of the ICMPECHO entity
debug rtr icmp-path-echo: displays the detection debug information of the ICMP-PATH-ECHO entity
debug rtr icmp-path-jitter: displays the detection debug information of the ICMP-PATH-JITTER entity
debug rtr jitter: displays the detection debug information of the jitter entity
debug rtr udpecho: displays the detection debug information of the udpecho entity
debug rtr flow-statistics: displays the detection debug information of the flow-statistics entity
debug rtr macping: displays the detection debug information of the macping entity
debug rtr group: displays the debug information of the rtr group
debug rtr schedule: displays the debug information of the rtr schedule
debug rtr responder: displays the debug information of the rtr responder
Enable debugging while an entity is performing detection to see the specific debug information.
VRRP Technology
This chapter describes the principles of the VRRP protocol and how it is implemented.
Main contents:
Related terms of VRRP protocol
Introduction to VRRP protocol
Debug commands and debug information
Related Terms of VRRP Protocol
VRRP: Virtual Router Redundancy Protocol
Master: One of the VRRP states. The active device is in this state and is responsible for forwarding IP packets.
Backup: One of the VRRP states. The standby device is in this state and takes over promptly when the active device fails.
Introduction to VRRP Protocol
VRRP is a redundancy backup protocol. Usually, the hosts in one network are configured with one default route. Packets whose destination addresses are not in the local segment are sent to the default gateway A via the default route, which allows the hosts to communicate with outside networks. When gateway A fails, all the hosts in the local segment that use A as the next hop of their default route lose communication with the outside.
Here, the gateway can be any network device with the IP forwarding function, such as a switch or a router. For ease of understanding, the following text uses "router" to refer to the gateway.
VRRP is designed to solve this problem for LANs with multicast or broadcast capability (such as Ethernet). VRRP combines a group of routers on the LAN (one MASTER and several BACKUP routers) into one virtual router, called a backup group.
The virtual router (that is, the backup group) has its own IP address, and each router in the backup group also has its own IP address. The hosts on the LAN only need to know the IP address of the virtual router, not the IP addresses of the master or backup routers. They set the IP address of the virtual router as the next hop of their default route. The hosts therefore communicate with other networks via the virtual router. When the master router in the backup group fails, another backup router in the group becomes the new master and continues to provide routing service for the hosts, so that communication with the outside network is not interrupted.
Basic Hierarchy of VRRP in TCP/IP
VRRP packets are carried directly in IP packets; the IP protocol number is 112 (0x70).
Structure of VRRP Packet
The structure of the VRRP packet:
Version: the version number; it is 2.
Type: the packet type; it is 1, indicating ADVERTISEMENT.
VRID: the Virtual Router Identifier configured on the interface.
Priority: the priority configured on the interface. The priority of the router that owns the virtual IP address (the router whose interface IP address is the VIP) is 255; the priorities of the other routers range from 1 to 254, and the default value is 100.
Count IP Addr: the number of virtual IP addresses; usually, it is 1.
AuthType: the authentication type:
0: no authentication; the AuthData field is all 0.
1: simple text authentication.
Advertise Interval: the interval at which ADVERTISEMENT packets are sent, in seconds; the default value is 1s.
IP Address: the virtual IP address.
Checksum: the checksum of the packet.
Auth Data: the authentication data, 8 characters at most; if it is shorter than 8 characters, the remainder is filled with 0.
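The field list above can be illustrated with a small decoder. This is a minimal sketch that assumes the VRRP version 2 byte layout of RFC 2338 (version and type sharing the first byte, followed by VRID, priority, address count, authentication type, advertisement interval, checksum, the virtual IP addresses, and 8 bytes of authentication data). The function name `parse_vrrp_v2` is illustrative, and the checksum is not verified.

```python
import socket
import struct

def parse_vrrp_v2(payload: bytes) -> dict:
    """Decode the payload of an IP packet with protocol number 112."""
    if len(payload) < 8:
        raise ValueError("VRRP packet too short")
    ver_type, vrid, priority, count, auth_type, adver_int, checksum = \
        struct.unpack("!BBBBBBH", payload[:8])
    addr_end = 8 + 4 * count
    if len(payload) < addr_end + 8:
        raise ValueError("VRRP packet truncated")
    addresses = [socket.inet_ntoa(payload[8 + 4 * i:12 + 4 * i])
                 for i in range(count)]
    return {
        "version": ver_type >> 4,       # high 4 bits
        "type": ver_type & 0x0F,        # low 4 bits, 1 = ADVERTISEMENT
        "vrid": vrid,
        "priority": priority,
        "auth_type": auth_type,
        "adver_int": adver_int,
        "checksum": checksum,           # carried as-is, not validated here
        "addresses": addresses,
        "auth_data": payload[addr_end:addr_end + 8],
    }

# Hand-built sample: version 2, type 1, VRID 1, priority 100, one address.
sample = (bytes([0x21, 1, 100, 1, 0, 1, 0, 0])
          + socket.inet_aton("192.168.1.254") + bytes(8))
fields = parse_vrrp_v2(sample)
```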
VRRP Workflow
Simply speaking, VRRP is a fault tolerance protocol. It ensures that when the next-hop router of a host fails, another router replaces it in time, so as to keep communication continuous and reliable. To make VRRP work, configure the virtual router number and the virtual IP address on the router. In this way, one virtual router is added to the network, and the hosts on the network communicate with the virtual router without needing any information about the physical routers. One virtual router comprises one master router and several backup routers. The master router performs the actual forwarding. When the master router fails, one backup router becomes the new master router and takes over the work.
VRRP defines only one kind of packet, the VRRP packet, which is a multicast packet. The master router sends the packet to advertise its presence. The packet can be used to check the parameters of the virtual router and can also be used for the selection of the master router.
VRRP defines three states: Initialize, Master, and Backup. Only a router in the Master state serves the forwarding requests sent via the virtual IP address.
The VRRP protocol defined in RFC 2338 is based on Cisco's proprietary HSRP protocol, but VRRP simplifies the mechanism put forward by HSRP, reducing the additional load that the redundancy function brings to the network. For example, HSRP defines six states for the virtual router, while VRRP has only three, which reduces the complexity of the protocol. In the stable state, HSRP has two states that can send packets, while in VRRP only the router in the Master state sends packets and there is only one kind of packet, which reduces the occupied bandwidth. HSRP packets are based on UDP, while VRRP packets are encapsulated directly in IP packets. In addition, VRRP supports using the actual interface IP address as the virtual IP address.
VRRP routers form different virtual routers according to the VRID. The routers that form one virtual router are divided into the master router and the backup routers. The master and backup routers are determined by the following rules:
1. Select the master router according to the priority. The router with the highest priority is the master router and its state is Master. If the priorities of the routers are the same, compare the IP addresses of the interfaces; the router with the larger IP address becomes the master router.
2. The other routers serve as backup routers and monitor the state of the master router in real time. While the master router works normally, it sends VRRP multicast packets (to 224.0.0.18), informing the backup routers in the group that it is in the normal state. If a backup router in the group does not receive packets from the master router for a long time, it turns to Master. When there are multiple backup routers in the group, multiple routers may turn to Master at the same time. In this case, each master router compares the priority in the received VRRP packet with its local priority. If the local priority is lower than the priority in the VRRP packet, the router turns to Backup. Otherwise, it keeps its state. In this way, the router with the highest priority becomes the new master router and completes the backup function of VRRP.
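The two selection rules can be sketched as a comparison on the pair (priority, interface IP address). This is a simplified illustration of the rules as stated above, not the device implementation; the router names and the helper functions `rank`, `elect_master`, and `state_after_advertisement` are hypothetical.

```python
import ipaddress

def rank(priority, interface_ip):
    # Higher priority wins; equal priorities are decided by the larger IP.
    return (priority, int(ipaddress.IPv4Address(interface_ip)))

def elect_master(routers):
    """routers: iterable of (name, priority, interface_ip) tuples."""
    return max(routers, key=lambda r: rank(r[1], r[2]))[0]

def state_after_advertisement(local, advertised):
    """State of a router in Master state after hearing another Master.

    local and advertised are (priority, interface_ip) pairs.
    """
    return "Backup" if rank(*local) < rank(*advertised) else "Master"

group = [("R1", 100, "10.0.0.1"), ("R2", 120, "10.0.0.2"), ("R3", 120, "10.0.0.3")]
master = elect_master(group)
r2_state = state_after_advertisement((120, "10.0.0.2"), (120, "10.0.0.3"))
r3_state = state_after_advertisement((120, "10.0.0.3"), (120, "10.0.0.2"))
```

Here R2 and R3 share the highest priority, so the larger interface address decides and R3 stays Master while R2 falls back to Backup.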
The virtual router has three states: Initialize, Master, and Backup.
Master state:
Must answer ARP requests for the virtual IP address; the ARP response carries the MAC address that corresponds to the virtual router IP address;
Is responsible for forwarding the packets sent via the virtual IP address;
Cannot receive packets whose destination IP address is the virtual router IP address (unless it is the IP address owner);
Must receive packets whose destination is the virtual router IP address (if it is the IP address owner);
Must send and receive the protocol packets (multicast);
When turning to Master from another state, sends gratuitous ARP packets.
Backup state:
Cannot answer ARP requests for the virtual router IP address;
Cannot receive packets whose destination IP address is the virtual router IP address;
Cannot send the protocol packets; must receive the protocol packets (multicast).
Initialize state:
Performs no operation except waiting for the startup event.
The transitions among the three states:
VRRP Features
VRRP has the following features:
Gateway backup: Multiple routers share one virtual IP address, which prevents a single gateway address used by many connected clients from becoming unavailable and minimizes network black holes. This is the main function of VRRP.
Load balance: This is a value-added function of VRRP. Use multiple virtual routers to back up multiple gateways; the terminals are configured with different virtual router IP addresses to realize load balancing.
Security expanding: The exchange of VRRP protocol packets can be protected by an authentication mode. VRRP defines two authentication modes: no authentication and simple clear text passwords.
no authentication: In a secure network, you can set the authentication mode to NO. The router does not perform any authentication processing on the received and sent VRRP packets, which improves VRRP performance.
simple clear text passwords: In a network that may be exposed to security threats, you can set the authentication mode to SIMPLE. The router adds the password to the VRRP packets it sends and checks the password in the VRRP packets it receives. If the authentication fails, the illegal packet is rejected, so as to ensure the normal running of the VRRP protocol.
Debug Commands and Debug Information
1. Packet debug
debug vrrp packet or
debug vrrp interface _interface_ group _groupId_ packet
The command is used to print the information of the VRRP packet.
03:12:03: VBRP: vlan1 Grp 0 Standby router is local, was unknown
03:12:03: VBRP: vlan1 Grp 0 Speak -> Standby
No Hello packet is received from another Standby device, so the device turns from Speak to Standby.
The priority of the Standby device is adjusted to 200 and it turns to Active.
r2(config-if-vlan1)# standby priority 200
03:20:29: VBRP: vlan1 Grp 0 Standby: h/Hello rcvd from lower pri Active router (110/128.255.16.3)
03:20:29: VBRP: vlan1 API MAC address update
03:20:29: VBRP: vlan1 Grp 0 Active router is local, was 128.255.16.3
03:20:29: VBRP: vlan1 Grp 0 Standby router is unknown, was local
03:20:29: VBRP: vlan1 Grp 0 Standby -> Active
IPFIX Technology
Overview
This chapter describes the working principle of IPFIX.
Main contents:
Terms
Introduction to the principle
Terms
IPFIX: IP Flow Information Export.
IPFIX packets: the packets sent from the IPFIX module to the IPFIX workstation; they carry the IP flow statistics collected by IPFIX on the network device. IPFIX packets are UDP packets assembled according to the NetFlow v9 format.
IP flow: the IP packets processed by the network device are categorized according to the ingress port, protocol ID, source address, destination address, TOS field, TCP/UDP source port, and TCP/UDP destination port. Each category is an IP flow.
IPFIX flow recording template: a type of IPFIX packet; it defines the format of the subsequent IPFIX flow recording packets.
IPFIX option recording template: a type of IPFIX packet; it defines the format of the subsequent IPFIX option recording packets.
IPFIX flow record: a type of IPFIX packet; it records the statistics of an IP flow.
IPFIX option record: a type of IPFIX packet; it records the statistical options in IPFIX that are not related to a single IP flow.
Introduction to the Principle
Main contents:
IPFIX working flow
IPFIX restrictions
IPFIX packet structure
IPFIX Working Flow
When the IPFIX function is enabled in the system, IP packets are classified into different IP flows according to the ingress port, protocol ID, source address, destination address, TOS field, TCP/UDP source port, and TCP/UDP destination port. Each IP flow is counted independently. IPFIX periodically assembles the flow statistics into IPFIX packets and sends them to the specified IPFIX server. The IPFIX server provides powerful graphical display and calculation capability. It analyzes the flow statistics in the IPFIX packets to give network administrators the data needed for traffic monitoring and management.
When IPFIX is enabled on the switch, the simplest procedure is as follows:
1. Determine the ports on which to monitor traffic. These ports are called observation points.
2. On the observation points, use the ipfix ingress/egress command to enable IPFIX traffic monitoring. ipfix ingress monitors the IP flows received at the observation point; ipfix egress monitors the IP flows sent from the observation point.
3. Configure the address of the IPFIX server and the UDP destination port number. The IPFIX packets use the configured values as their destination address and UDP destination port number.
After the preceding configuration is complete, the IP traffic forwarded through the observation point is divided into different IP flows for processing and counting. The historical IP flow statistics are sent to the IPFIX module periodically. After receiving the statistics, the IPFIX module assembles them into IPFIX packets, fills in the destination address and the destination UDP port number according to the configuration, and sends the packets.
The interval at which IP flow statistics are delivered to IPFIX is determined by the IPFIX inactive timer configured on the port. The inactive timer specifies the expiry time of a flow. If no packet hits an existing flow within the inactive time, the flow record expires. When the inactive timer of a flow record times out, the statistics of the flow are delivered to IPFIX.
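The classification and inactive-timer behavior can be modeled in software as a flow cache. The sketch below is only an illustration under stated assumptions: on the switch the flow records are kept by the switching chip, and the class `FlowCache`, its methods, and the sample addresses are hypothetical names chosen for the example. The flow key follows the fields listed in the Terms section.

```python
from dataclasses import dataclass

@dataclass
class FlowStats:
    packets: int = 0
    octets: int = 0
    last_hit: float = 0.0

class FlowCache:
    def __init__(self, inactive_timeout):
        self.inactive_timeout = inactive_timeout
        self.flows = {}

    def observe(self, key, size, now):
        # key: (ingress port, protocol ID, source address, destination
        # address, TOS, source port, destination port)
        stats = self.flows.setdefault(key, FlowStats())
        stats.packets += 1
        stats.octets += size
        stats.last_hit = now

    def expire(self, now):
        # Flows not hit within the inactive time are removed from the
        # cache and returned so that they can be exported.
        expired = {k: s for k, s in self.flows.items()
                   if now - s.last_hit >= self.inactive_timeout}
        for key in expired:
            del self.flows[key]
        return expired

cache = FlowCache(inactive_timeout=30)
key_a = ("0/0/1", 6, "10.0.0.1", "10.0.0.2", 0, 1024, 80)
key_b = ("0/0/1", 17, "10.0.0.3", "10.0.0.4", 0, 5000, 53)
cache.observe(key_a, 100, now=0)
cache.observe(key_a, 200, now=10)
cache.observe(key_b, 60, now=35)
exported = cache.expire(now=45)
```

At time 45, flow A has been idle for 35 seconds and is exported with 2 packets and 300 bytes, while flow B was hit 10 seconds earlier and stays in the cache.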
IPFIX Restrictions
The restrictions of IPFIX on a switch are as follows:
1. IPFIX flow recording is performed by the switching chip, not by software. A switch whose switching chip does not support IPFIX cannot provide the IPFIX function.
2. For INGRESS flow statistics, only unicast flows are counted, that is, flows whose packets the chip forwards through a single port rather than through multiple ports (the packets are not flooded). Egress flow statistics are not restricted.
IPFIX Packet Structure
The IPFIX packet complies with the NetFlow v9 format. It is composed of a packet header and FlowSets.
Packet Header
Figure 32-1 Format of IPFIX Packet Header
Version: the version 9 format, 0x0009.
Count: the quantity of records carried in the packet.
System Uptime: the running time of the device, in ms.
UNIX Seconds: the number of seconds since 00:00 UTC on January 1, 1970.
Sequence: the sequence number of the packet; it increases cumulatively.
Source ID: the value is 0.
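The header fields above can be decoded as follows. This is a minimal sketch that assumes the standard 20-byte NetFlow v9 header layout (two 16-bit fields followed by four 32-bit fields, all in network byte order); the function name `parse_header` and the sample values are illustrative.

```python
import struct

# version, count, system uptime (ms), UNIX seconds, sequence, source ID
HEADER = struct.Struct("!HHIIII")

def parse_header(data: bytes) -> dict:
    version, count, uptime_ms, unix_secs, sequence, source_id = \
        HEADER.unpack_from(data, 0)
    if version != 9:
        raise ValueError("not a NetFlow v9 formatted packet")
    return {
        "version": version,
        "count": count,
        "uptime_ms": uptime_ms,
        "unix_secs": unix_secs,
        "sequence": sequence,
        "source_id": source_id,
    }

# Hand-built sample header for illustration.
sample = HEADER.pack(9, 2, 123456, 1700000000, 42, 0)
header = parse_header(sample)
```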
FlowSet
A FlowSet is either a Template FlowSet or a Data FlowSet. One IPFIX packet can contain multiple FlowSets.
Template FlowSet
One Template FlowSet is composed of multiple template records. Each template record defines a template, which specifies how the corresponding data records are interpreted. The IPFIX server interprets subsequently received data according to the received template.
Templates are classified into flow record templates and option record templates. The flow record template defines how to interpret flow records; the option record template defines how to interpret option records.
The format of the FlowSet composed of flow record templates is as follows:
Figure 32-2 Template FlowSet format of the flow template
FlowSet ID: the FlowSet composed of flow record templates uses ID 0.
Length: the total length of the FlowSet.
Template ID: used to match data with the template. It starts from 256.
Field Count: the number of fields in the template record.
Field Type: the type of the field, indicated by a number.
Field Length: the number of bytes of the field defined by the field type.
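A collector can read the flow template fields described above roughly as follows. This sketch assumes the NetFlow v9 layout in which every field named above is a 16-bit value in network byte order; the function name `parse_template_flowset` and the sample template are hypothetical.

```python
import struct

def parse_template_flowset(data: bytes) -> dict:
    """Return {template_id: [(field_type, field_length), ...]}."""
    flowset_id, length = struct.unpack_from("!HH", data, 0)
    if flowset_id != 0:
        raise ValueError("not a flow record Template FlowSet")
    templates = {}
    offset = 4
    while offset + 4 <= length:
        template_id, field_count = struct.unpack_from("!HH", data, offset)
        offset += 4
        fields = []
        for _ in range(field_count):
            field_type, field_length = struct.unpack_from("!HH", data, offset)
            fields.append((field_type, field_length))
            offset += 4
        templates[template_id] = fields
    return templates

# Sample: template 256 with IPV4_SRC_ADDR(8), IPV4_DST_ADDR(12),
# IN_BYTES(1) and IN_PKTS(2), each 4 bytes long.
sample = (struct.pack("!HH", 0, 24)
          + struct.pack("!HH", 256, 4)
          + struct.pack("!HHHHHHHH", 8, 4, 12, 4, 1, 4, 2, 4))
templates = parse_template_flowset(sample)
```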
The format of the FlowSet composed of option record templates is as follows:
Figure 32-3 FlowSet format of the option template
FlowSet ID: the FlowSet composed of option templates uses ID 1.
Length: the length of the FlowSet, including the length of Padding.
Template ID: used to match data with the template; it is greater than 255.
Option Scope Length: the number of bytes in the Scope field.
Options Length: the number of bytes in the Option field.
Scope Field Type: the type of the scope to which the relevant data of the IPFIX process refers. 0x1: system; 0x2: interface; 0x3: line card; 0x4: IPFIX cache; 0x5: template.
Scope Field Length: the length of the Scope field.
Option Field Type: the type of the option data; the values used are the same as the field type values described for the flow template.
Option Field Length: the length of the option data (number of bytes).
Padding: aligns the FlowSet to 32 bits.
The types of the fields used in the IPFIX template are as follows:
Type value  Name                Description
42          TOTAL_FLOWS_EXP     Total exported flow records
41          TOTAL_PKTS_EXP      Total exported IPFIX packets
1           IN_BYTES            Input bytes
2           IN_PKTS             Input packets
21          LAST_SWITCHED       The last hit time of the packets
22          FIRST_SWITCHED      The time of creating the flow
8           IPV4_SRC_ADDR       The source IP address
12          IPV4_DST_ADDR       The destination IP address
10          INPUT_SNMP          The MIB index at the input interface
14          OUTPUT_SNMP         The MIB index at the output interface
15          IPV4_NEXT_HOP       The IPv4 address of the next hop
7           L4_SRC_PORT         The source port number
11          L4_DST_PORT         The destination port number
4           PROTOCOL            Protocol
5           SRC_TOS             Source TOS
9           SRC_MASK            The length of source mask
13          DST_MASK            The length of destination mask
6           TCP_FLAGS           TCP flag
32          ICMP_TYPE           ICMP type
16          SRC_AS              The BGP AS of the source route
17          DST_AS              The BGP AS of the destination route
18          BGP_IPV4_NEXT_HOP   BGP route gateway
23          OUT_BYTES           Output bytes
24          OUT_PKTS            Output packets
Data FlowSet
Figure 32-4 Packet structure of the Data FlowSet
FlowSet ID: corresponds to the template ID; the IPFIX server interprets the data according to the matching template.
Length: the length of the FlowSet, including padding.
Padding: pads the FlowSet so that its length is aligned to 32 bits.
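Matching a Data FlowSet against its template can be sketched as follows. The sketch assumes that a template is available as a list of (field type, field length) pairs keyed by template ID; the function name `decode_data_flowset` and the sample values are hypothetical. Trailing bytes shorter than one record are treated as padding.

```python
import socket
import struct

def decode_data_flowset(data: bytes, templates: dict) -> list:
    """Split a Data FlowSet into records of {field_type: raw bytes}."""
    flowset_id, length = struct.unpack_from("!HH", data, 0)
    fields = templates[flowset_id]      # FlowSet ID equals the template ID
    record_len = sum(field_length for _, field_length in fields)
    if record_len == 0:
        raise ValueError("template describes an empty record")
    records = []
    offset = 4
    while offset + record_len <= length:
        record = {}
        for field_type, field_length in fields:
            record[field_type] = data[offset:offset + field_length]
            offset += field_length
        records.append(record)
    return records

# Template 256: IPV4_SRC_ADDR(8, 4 bytes), PROTOCOL(4, 1 byte), IN_PKTS(2, 4 bytes)
templates = {256: [(8, 4), (4, 1), (2, 4)]}
body = socket.inet_aton("10.0.0.1") + bytes([6]) + struct.pack("!I", 5)
# 4-byte FlowSet header + 9-byte record + 3 bytes of padding = 16 bytes
flowset = struct.pack("!HH", 256, 16) + body + bytes(3)
records = decode_data_flowset(flowset, templates)
src = socket.inet_ntoa(records[0][8])
pkts = int.from_bytes(records[0][2], "big")
```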
Port Isolation Technology
This chapter describes the port isolation technology of the switch.
Configure Port Isolation
Main contents:
Introduction to port isolation
Application instance of port isolation
Introduction to Port Isolation
Port isolation is a port-based security feature. By specifying the isolated ports of a port, the user can isolate L2 and L3 data between that port and its isolated ports, which improves network security and provides a flexible networking scheme.
By default, packets can be forwarded between any two ports in one VLAN of the switch. To prevent specified ports in one VLAN from communicating, configure the isolated ports in the configuration mode of a port; that port then cannot send packets to the specified isolated ports.
Port isolation does not depend on the VLAN of the port. Currently, the switch supports configuring isolated ports in both the common port mode and the aggregation port mode, and a configured isolated port can be a common port or an aggregation port. Port isolation drops packets in one direction only. Suppose that ports B, C, and D are configured as the isolated ports of port A. A packet that enters from port A and whose destination port is B, C, or D is dropped directly, but a packet that enters from port B, C, or D and whose destination port is A is forwarded normally.
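The one-way nature of the drop can be expressed as a lookup keyed by the ingress port. This is a small model of the rule described above, not switch code; the dictionary `isolated_ports` and the function `can_forward` are illustrative names.

```python
# Isolated ports configured on each ingress port (port A isolates B, C, D).
isolated_ports = {"A": {"B", "C", "D"}}

def can_forward(ingress, egress, table):
    # Only the configuration of the ingress port is consulted, so the
    # isolation applies in one direction only.
    return egress not in table.get(ingress, set())

a_to_b = can_forward("A", "B", isolated_ports)   # dropped
b_to_a = can_forward("B", "A", isolated_ports)   # forwarded
b_to_c = can_forward("B", "C", isolated_ports)   # forwarded
```

To block traffic in both directions, under this model the isolation would also have to be configured on ports B, C, and D.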
Port Isolation Application
Application Instance 1
Application instance of port isolation
Illustration
Three ports of the switch are connected to three terminal devices: port 0/0/1, port 0/0/2, and port 0/0/3 are connected to terminal 1, terminal 2, and terminal 3 respectively. The three ports belong to one VLAN. To prevent terminal 1 from communicating with terminal 2 and terminal 3, configure port 0/0/2 and port 0/0/3 as the isolated ports of port 0/0/1.
The switch configuration:
Command                                                  Description
switch(config)#port 0/0/1                                Enter the port configuration mode
switch(config-port-0/0/1)#isolate-port port0/0/2-0/0/3   Configure port0/0/1 to be isolated from port0/0/2 and port0/0/3
switch(config-port-0/0/1)#exit                           Exit the port configuration mode
IPv6 Unicast Routing
IPv6 RIPng Dynamic Routing Protocol
Main contents:
Terms of IPv6 RIPng protocol
Introduction to IPv6 RIPng protocol
Terms of IPv6 RIPng Protocol
UDPv6 (IPv6 User Datagram Protocol): a simple transport layer protocol for IP networks that provides unreliable packet transmission.
D-V algorithm (Distance-Vector): a method of calculating routes in a computer network, also called the Bellman-Ford algorithm.
IGP: Interior Gateway Protocol.
Request packet: used to request IPv6 RIPng route information from other route devices.
Response packet: used to advertise the local route information to IPv6 RIPng on adjacent route devices.
Split horizon: a route learned from one interface is not advertised back to that interface. It is one measure that IPv6 RIPng uses to prevent routing loops.
Poisoned reverse: a route learned from one interface is advertised back to that interface with the unreachable metric (16). It is another measure that IPv6 RIPng uses to prevent routing loops, and it is more active than split horizon.
Triggered updates: a measure that IPv6 RIPng uses to speed up convergence. When a route changes, a triggered update is generated to advertise the changed route. Regular updates are the counterpart of triggered updates: IPv6 RIPng sends updates containing all route information at an interval of 30s (by default).
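The difference between split horizon and poisoned reverse shows up when an update is built for one interface. The following is a minimal sketch of the two definitions above; the route dictionaries, interface names, and the function `build_update` are hypothetical.

```python
UNREACHABLE = 16

def build_update(routes, out_interface, poisoned_reverse=False):
    """Return the (prefix, metric) list advertised on out_interface."""
    update = []
    for route in routes:
        if route["learned_from"] == out_interface:
            if poisoned_reverse:
                # Advertise back with the unreachable metric.
                update.append((route["prefix"], UNREACHABLE))
            # With plain split horizon the route is simply left out.
            continue
        update.append((route["prefix"], route["metric"]))
    return update

routes = [
    {"prefix": "2001:db8:1::/64", "metric": 2, "learned_from": "vlan1"},
    {"prefix": "2001:db8:2::/64", "metric": 3, "learned_from": "vlan2"},
]
split = build_update(routes, "vlan1")
poisoned = build_update(routes, "vlan1", poisoned_reverse=True)
```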
Introduction to IPv6 RIPng Protocol
IPv6 RIPng (Routing Information Protocol for IPv6) is a Distance-Vector IGP, used for simple IPv6 route learning in small networks. This section describes how to configure the IPv6 RIPng dynamic routing protocol on Maipu route devices for IPv6 network interconnection.
The running mechanism of the IPv6 RIPng protocol is basically consistent with that of the IPv4 RIP protocol. The only difference is that the advertised and learned routes are IPv6 routes instead of IPv4 routes.
The advantages of the IPv6 RIPng protocol are that both the protocol and its configuration are simple. However, the amount of route information that IPv6 RIPng needs to advertise is proportional to the number of routes in the route table, so many network resources are consumed when there are many routes. Moreover, the IPv6 RIPng protocol limits the number of route devices that a route path can pass through to 15 hops. Therefore, the IPv6 RIPng protocol is suitable only for simple small and medium-sized networks.
The IPv6 RIPng protocol can be used for most campus networks and for area networks with simple structure and strong continuity. Generally, complicated environments do not use the IPv6 RIPng protocol.
Location of IPv6 RIPng Protocol in TCP/IP
Figure 34-1 Location of IPv6 RIPng protocol in TCP/IP (layers from bottom to top: Data Link Layer; Network Layer (IPv6); TCPv6 and UDPv6; IPv6 RIPng)
As shown in the above figure, IPv6 RIPng is a routing protocol based on UDP. The protocol packets sent by IPv6 RIPng are encapsulated in UDPv6 packets. By default, IPv6 RIPng uses port 521 to send and receive protocol packets. It receives protocol packets from remote route devices, updates the local route table according to the route information in the received packets, and then adds 1 to the metric before advertising the routes to the other adjacent route devices. In this way, all route devices in the route domain can learn all routes.
The IPv6 RIPng protocol sends protocol packets in three modes, as follows:
Table 34-1 Modes in which the IPv6 RIPng protocol sends packets
Mode       Address               Port                                   Usage
Multicast  ff02::9               521                                    Send the protocol packets to all adjacent route devices on one interface
Unicast    Unicast IPv6 address  The source port of the request packet  The response packet to one request packet
Unicast    Unicast IPv6 address  521                                    The protocol packet sent to the configured neighbor
IPv6 RIPng Protocol Packet Type
The IPv6 RIPng protocol has two kinds of protocol packets: the request packet and the response packet. Their types and functions are as follows:
IPv6 RIPng protocol packet type
Packet type: Request packet
Function: Request route information from IPv6 RIPng on the adjacent route device. You can request specified route information or all route information (to request all route information, the packet contains only one route entry, whose destination address is 0, prefix length is 0, and metric is 16).
Sending status: When IPv6 RIPng just starts running on the interface, request all route information from IPv6 RIPng on the adjacent route device.
Packet type: Response packet
Function: Advertise route information to IPv6 RIPng on the adjacent route device.
Sending status:
1. Answer the request packet;
2. When the route changes, trigger updating the route information;
3. Advertise all route information to IPv6 RIPng on the adjacent route device regularly (regular updates).
IPv6 RIPng Protocol Packet Structure
Figure 34-2 Basic structure of IPv6 RIPng protocol packet (encapsulation order: Data Link Header, IPv6 Header, UDPv6 Header, IPv6 RIPng Header, IPv6 RIPng routing information; the IPv6 RIPng header consists of command (1 byte), version (1 byte), and must be zero (2 bytes), and is followed by route table entries of 20 bytes each)
As shown in the above figure, the IPv6 RIPng protocol packet is encapsulated in the UDPv6 packet. In the IPv6 header of the IPv6 RIPng protocol packet, the Hop Limit field is set to 255, which prevents the IPv6 RIPng protocol packet from being forwarded by other route devices.
The IPv6 RIPng header has two fields: the command field identifies whether the packet is a request packet (the value is 1) or a response packet (the value is 2); the version field is always 1.
Route table entries are of two types, described as follows:
Table 34-2 Route table entry types of the IPv6 RIPng protocol
Route table entry type: The route table entry
Format: As shown in the following figure
Description: Bears the IPv6 route information.
Route table entry type: The next-hop route table entry
Format: As shown in the following figure
Description: Bears the next-hop address of the IPv6 route information. Usage: first, add the next-hop route table entry; then add the route table entries that use that address as their next hop; finally, end with a next-hop route table entry whose next-hop address is 0:0:0:0:0:0:0:0.
Route table entry: IPv6 prefix (16 bytes), Route Tag (2 bytes), Prefix len (1 byte), Metric (1 byte)
Next hop route table entry: IPv6 next hop address (16 bytes), Must be zero (2 bytes), Must be zero (1 byte), 0xFF (1 byte)
Format of the IPv6 RIPng protocol route information entry
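The header and the two entry formats can be packed and unpacked as shown below. This is a minimal sketch that follows the field sizes given above (a 4-byte header and 20-byte entries); the function names `build_packet` and `parse_packet` and the sample addresses are illustrative.

```python
import ipaddress
import struct

HEADER = struct.Struct("!BBH")    # command, version, must be zero
RTE = struct.Struct("!16sHBB")    # address, route tag, prefix len, metric

REQUEST, RESPONSE = 1, 2
NEXT_HOP_MARKER = 0xFF            # metric byte of a next-hop entry
UNREACHABLE = 16

def build_packet(command, entries):
    """entries: list of (address, route_tag, prefix_len, metric)."""
    data = HEADER.pack(command, 1, 0)
    for address, tag, prefix_len, metric in entries:
        packed = ipaddress.IPv6Address(address).packed
        data += RTE.pack(packed, tag, prefix_len, metric)
    return data

def parse_packet(data):
    command, version, _ = HEADER.unpack_from(data, 0)
    entries = []
    for offset in range(HEADER.size, len(data) - RTE.size + 1, RTE.size):
        addr, tag, prefix_len, metric = RTE.unpack_from(data, offset)
        kind = "next-hop" if metric == NEXT_HOP_MARKER else "route"
        entries.append((kind, str(ipaddress.IPv6Address(addr)),
                        tag, prefix_len, metric))
    return command, version, entries

# Request for all routes: one entry with address 0, prefix length 0, metric 16.
request_all = build_packet(REQUEST, [("::", 0, 0, UNREACHABLE)])
# Response: next-hop entry, one route using it, then the terminating entry.
response = build_packet(RESPONSE, [
    ("fe80::1", 0, 0, NEXT_HOP_MARKER),
    ("2001:db8:1::", 0, 64, 2),
    ("::", 0, 0, NEXT_HOP_MARKER),
])
parsed = parse_packet(response)
```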
Basic Work Principle of IPv6 RIPng Protocol
Basic work flow of the IPv6 RIPng protocol (figure with two flow charts)
Protocol start flow: IPv6 RIPng protocol start; send Request packet asking for all routing information from neighbor; update all routing information to neighbor every 30 sec.
Receive packet process flow: IPv6 RIPng receives packets and checks the packet type. Request packet: respond with routing information in unicast. Response packet: update routes in database by packet; if routes changed (Y), trigger update of routing information; otherwise (N), no update is triggered. Other packets take the "Else packet" branch.
The basic work flow of the IPv6 RIPng protocol is shown in the above figure. It includes two parts: the flow of starting the protocol and the flow of processing received packets.
Protocol Start Process
When IPv6 RIPng starts to run on an interface, it sends a route request packet out of the interface in multicast mode to request all route information from all adjacent route devices on the interface, so as to converge quickly.
After receiving the response packet for the request, it updates the routes in the route database according to the route information in the packet and then advertises the changed routes to IPv6 RIPng on the other adjacent route devices (triggered updates).
Meanwhile, it enables the Updates Timer and periodically uses route response packets to advertise all route information to IPv6 RIPng on all adjacent route devices. This keeps the route databases of IPv6 RIPng on all route devices synchronized and refreshes the advertised routes, so that previously advertised routes do not time out and become invalid on other route devices.
Route Database
The route database records all route information of the IPv6 RIPng
protocol. Each route entry comprises the following elements:
1. Destination subnet address: the destination host or subnet of the
route;
2. Metric: the metric of the destination;
3. Next-hop interface: the interface that forwards packets to the
destination, that is, the interface on which the route is learned;
4. Next-hop IPv6 address: the interface IPv6 address of the adjacent
route device that packets must pass through to reach the destination.
Generally, it is the source IPv6 address of the response packet from
which the route is learned;
5. Source IPv6 address: the source IPv6 address of the response packet
from which the route is learned;
6. Route tag: a user-defined value used to tag one type of route, for
example, to mark a route as obtained by re-distributing a BGP route.
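As an illustration, the six elements listed above map naturally onto a small record type. The field names, the interface name `vlan1`, and the sample addresses below are hypothetical and chosen only for the example; they are not taken from the switch software.

```python
from dataclasses import dataclass
from ipaddress import IPv6Address, IPv6Network
from typing import Optional


@dataclass
class RouteEntry:
    destination: IPv6Network                 # 1. destination host or subnet
    metric: int                              # 2. metric of the destination
    next_hop_interface: str                  # 3. interface that learned the route
    next_hop_address: Optional[IPv6Address]  # 4. neighbor address to forward to
    source_address: Optional[IPv6Address]    # 5. source of the response packet
    route_tag: int = 0                       # 6. user-defined tag


# A route learned from a neighbor whose link-local address is fe80::1.
entry = RouteEntry(
    destination=IPv6Network("2001:db8:10::/64"),
    metric=2,
    next_hop_interface="vlan1",
    next_hop_address=IPv6Address("fe80::1"),
    source_address=IPv6Address("fe80::1"),
    route_tag=100,
)
```

The two address fields are optional because a direct-connected or re-distributed route has no response packet behind it.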
Sources of Route Entries in Route Database
The sources of the route entries in the IPv6 RIPng protocol route database
are as follows:
1. The direct-connected routes of the interfaces covered by the protocol;
2. The routes of other protocols re-distributed by the protocol;
3. The routes of other RIPng instances re-distributed by the RIPng
instance;
4. The routes generated by protocol configuration commands, such as
the command for advertising the default route (default-information
originate);
5. The routes learned from IPv6 RIPng of the adjacent route devices.
How to Get Route Next Hop
In IPv6 RIPng, the next-hop interface of a route is the interface on which
the route is learned. The next-hop IPv6 address is selected from two
addresses: the source IPv6 address of the response packet from which the
route is learned, and the next-hop IPv6 address carried in the route
information. If the route information carries a next-hop IPv6 address and
it is a link-local address, that address becomes the next-hop IPv6 address
of the route. Otherwise, the next-hop IPv6 address of the route is the
source IPv6 address of the response packet. This realizes a function
similar to re-direction.
Therefore, for a re-distributed route, when the sending interface is the
next-hop interface of the route, the advertised route carries the next-hop
address of the route.
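The selection rule above can be expressed in a few lines. This is a sketch under the stated rule only; the function name `select_next_hop` and its parameters are hypothetical, and the link-local check relies on Python's standard `ipaddress` module.

```python
from ipaddress import IPv6Address
from typing import Optional


def select_next_hop(packet_source: str,
                    advertised_next_hop: Optional[str]) -> IPv6Address:
    """Pick the next-hop IPv6 address for a route learned from a response."""
    if advertised_next_hop is not None:
        candidate = IPv6Address(advertised_next_hop)
        # Only a link-local next hop carried in the route information is used.
        if candidate.is_link_local:
            return candidate
    # Otherwise fall back to the source address of the response packet.
    return IPv6Address(packet_source)


chosen = select_next_hop("fe80::0201:7aff:fe4f:73f8",
                         "fe80::0201:7aff:fe4f:73f7")
```

With the two link-local addresses used in the example that follows, the call returns the advertised next hop rather than the address of the sender.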
The following example describes how the next-hop address carried in the
route information is used in IPv6 RIPng.
Instance diagram of IPv6 RIPng route re-direction
As shown in the figure above, IPv6 RIPng runs on Switch-A; IPv6 RIPng and
IPv6 OSPFv3 run on Switch-B; IPv6 OSPFv3 runs on Switch-C. IPv6 RIPng
on Switch-B re-distributes the IPv6 OSPFv3 route 11::/24 learned by the
local device so that Switch-A can learn the route to the subnet 11::/24.
When Switch-A learns the route, the next hop is Switch-B, that is,
fe80::0201:7aff:fe4f:73f8, by default. As a result, the packets forwarded
from Switch-A to the destination subnet 11::/24 all first pass through
Switch-B and then reach Switch-C.
To solve the problem, when Switch-B advertises the route 11::/24 to
Switch-A, the next hop of the route is specified as Switch-C, that is,
fe80::0201:7aff:fe4f:73f7. When Switch-A learns the route, it sets the
next hop of the route 11::/24 to Switch-C, that is,
fe80::0201:7aff:fe4f:73f7. As a result, the packets forwarded from
Switch-A to the destination subnet 11::/24 are all forwarded directly to
Switch-C and do not need to pass through Switch-B.
Route Update
When IPv6 RIPng learns a route from an adjacent route device, it adds 1 to
the metric before processing the route, so as to accumulate the hop count.
When the metric is smaller than 16, the route is reachable; when the
metric is larger than or equal to 16, the route is unreachable.
If a learned route meets one of the following conditions, it is used to
update the routes in the route database:
1. The route does not exist in the route database and the metric of the
learned route is smaller than 16 hops;
2. The route exists in the database and its source IPv6 address is the
same as the source IPv6 address of the learned route;
3. The route exists in the database, and its metric is larger than or
equal to the metric of the learned route.
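The three conditions above, together with the metric increment, can be sketched as a single predicate. The `Route` type and the function name `should_update` are assumptions made for the example; the comparison in condition 3 follows the text as written.

```python
from dataclasses import dataclass
from typing import Optional

UNREACHABLE = 16  # metric value meaning "unreachable"


@dataclass
class Route:
    metric: int   # metric stored in the route database
    source: str   # source IPv6 address the route was learned from


def should_update(existing: Optional[Route],
                  advertised_metric: int,
                  source: str) -> bool:
    """Decide whether a learned route updates the route database."""
    # Add 1 to the advertised metric before any further processing.
    learned_metric = min(advertised_metric + 1, UNREACHABLE)
    if existing is None:
        return learned_metric < UNREACHABLE   # condition 1
    if existing.source == source:
        return True                           # condition 2
    return existing.metric >= learned_metric  # condition 3


current = Route(metric=4, source="fe80::1")
accept_equal = should_update(current, 3, "fe80::2")  # learned metric 4
reject_worse = should_update(current, 4, "fe80::2")  # learned metric 5
```

Condition 2 lets the original advertiser change a route in any direction, including marking it unreachable, which is how a metric of 16 reaches the database.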
Protocol Packet Authentication
IPv6 RIPng protocol packets are not authenticated by the protocol itself,
but are authenticated by UDPv6.
Status Transition of IPv6 RIPng Protocol Route Entry and Related Timer
[Figure: state diagram of a route entry. States: Valid (the invalid timer
runs on the next hops of routes); Invalid + Holddown (the holddown timer
and the flush timer run on routes); Invalid (the flush timer runs on
routes); Flush (the route is deleted from the database). Transition
labels: Invalid Timer timeout or metric updated to 16 (unreachable);
Holddown Timer timeout; Flush Timer timeout; Route Update.]
Status transition of IPv6 RIPng protocol route entry
The IPv6 RIPng protocol has four timers: Update Timer, Invalid Timer,
Holddown Timer, and Flush Timer. The timers are described as follows:
Timers of the IPv6 RIPng protocol
Timer Name: Update Timer
Operation Object: Route database
Default Value: 30s
Start Condition: When RIPng is enabled, the timer starts and runs
cyclically.
Function: Regularly use the response packet to advertise all route
information to RIPng of the adjacent route devices, so as to:
1. Ensure the route database synchronization between RIPng of each route
device;
2. Refresh the previously advertised routes so that they do not time out
or become invalid on other route devices.

Timer Name: Invalid Timer
Operation Object: The next hop of the route entry
Default Value: 180s
Start Condition: Start the timer when learning a route entry.
Function: A route becomes invalid when it is not updated within some
time. The status transition is shown in the figure above. The timer can be
refreshed by the response packet. When the route entry becomes invalid,
the timer is disabled.

Timer Name: Holddown Timer
Operation Object: Route entry
Default Value: 0s
Start Condition: Start the timer when the route entry enters the invalid
state.
Function: A route is not permitted to be updated by the response packet
within some time after becoming invalid, so as to prevent a route loop.
The status transition is shown in the figure above. The timer is disabled
when the route entry leaves the holddown state.

Timer Name: Flush Timer
Operation Object: Route entry
Default Value: 240s
Start Condition: Start the timer when the route entry enters the invalid
state.
Function: A route is deleted from the route database after being invalid
for some time. The status transition is shown in the figure above. The
timer is disabled when the route entry is deleted.
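To make the interaction of the Invalid, Holddown, and Flush timers concrete, the following sketch computes the state of a learned route from the time elapsed since it was last refreshed. It assumes that no response packet arrives in the meantime, that a timer counts as expired once its full value has elapsed, and that the Holddown and Flush timers both start when the route becomes invalid, as the start conditions above state. The function and constant names are hypothetical.

```python
INVALID_AFTER = 180   # seconds, Invalid Timer default
HOLDDOWN_FOR = 0      # seconds, Holddown Timer default
FLUSH_AFTER = 240     # seconds, Flush Timer default


def route_state(seconds_since_update: int,
                invalid: int = INVALID_AFTER,
                holddown: int = HOLDDOWN_FOR,
                flush: int = FLUSH_AFTER) -> str:
    """State of a route that has not been refreshed for the given time."""
    if seconds_since_update < invalid:
        return "valid"
    # Holddown and Flush timers are both measured from entering invalid.
    since_invalid = seconds_since_update - invalid
    if since_invalid >= flush:
        return "flushed"            # deleted from the route database
    if since_invalid < holddown:
        return "invalid+holddown"   # updates are refused in this state
    return "invalid"


states = [route_state(t) for t in (0, 179, 180, 419, 420)]
```

With the default Holddown value of 0s the holddown state is skipped entirely; passing a non-zero `holddown` shows the intermediate state.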
Avoidance of IPv6 RIPng Protocol Route Loop
The IPv6 RIPng protocol is a distance-vector dynamic routing protocol and
does not know the topology of the whole network. When the network
changes, the routes of the whole network need some time to converge, so
the route databases of the route devices are out of synchronization for a
period of time. Because the topology of the whole network is also
unknown, route loops may appear. The IPv6 RIPng protocol uses the
following mechanisms to reduce the possibility of route loops caused by
this inconsistency: Counting to Infinity, Split Horizon, Poisoned Reverse,
Holddown Timer, and Triggered Updates.
Counting to Infinity
The IPv6 RIPng protocol permits a maximum metric of 15. A destination
whose metric is larger than 15 is regarded as unreachable. This limits the
network size and prevents route information from being transmitted
without limit. Each time route information is transmitted from one route
device to another, the metric is increased by 1. When the metric exceeds
15, the route is deleted from the route table.
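The effect of the metric limit can be shown with a short calculation. The helper names below are hypothetical, and the sketch only models the increment of 1 per transmission described above.

```python
INFINITY = 16  # any metric above 15 is treated as unreachable


def metric_after_hops(origin_metric: int, hops: int) -> int:
    """Metric of a route after being passed along `hops` route devices."""
    metric = origin_metric
    for _ in range(hops):
        # Each transmission adds 1, capped at the unreachable value.
        metric = min(metric + 1, INFINITY)
    return metric


def is_reachable(metric: int) -> bool:
    return metric < INFINITY


last_usable = metric_after_hops(1, 14)    # still reachable
first_dropped = metric_after_hops(1, 15)  # reaches the unreachable value
```

Because the metric is capped, a route that circulates in a loop stops growing at 16 and is discarded instead of being re-advertised forever.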
Split Horizon
A route learned from an interface cannot be advertised back out of the
same interface, because doing so may result in a route loop.
The Split Horizon rule of the IPv6 RIPng protocol is as follows: if IPv6
RIPng of the route device learns route information A from an interface,
the response packets sent out of that interface cannot contain route
information A.
Split Horizon has one special case. When an interface receives a request
packet that asks for only part of the route information, the response to
that packet is not subject to Split Horizon.
Poisoned Reverse
The purpose of Poisoned Reverse is the same as that of Split Horizon,
but the rule differs slightly.
The Poisoned Reverse rule of the IPv6 RIPng protocol is as follows: if
IPv6 RIPng of the route device learns route information A from an
interface, the route response packets sent out of that interface contain
route information A, but with the metric set to 16 (that is, unreachable).
Compared with Split Horizon, the advantage of Poisoned Reverse is that it
advertises the route back to the source route device with the metric set
to unreachable. If there is a route loop, it is broken at once, whereas
Split Horizon can only wait for the wrong route entry to be deleted
because of timeout. The disadvantage is that Poisoned Reverse increases
the size of the route response packets and therefore the bandwidth
consumed by the protocol.
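The difference between the two rules is easiest to see side by side. The sketch below builds the set of routes placed in a response packet for one outgoing interface; the function name, the `mode` strings, and the interface names `vlan10` and `vlan20` are hypothetical.

```python
UNREACHABLE = 16  # metric advertised for a poisoned route


def build_response(routes: dict, out_interface: str,
                   mode: str = "split-horizon") -> dict:
    """routes maps prefix -> (metric, interface the route was learned on)."""
    advertised = {}
    for prefix, (metric, learned_on) in routes.items():
        if learned_on == out_interface and mode == "split-horizon":
            continue                           # leave the route out entirely
        if learned_on == out_interface and mode == "poisoned-reverse":
            advertised[prefix] = UNREACHABLE   # include it, but as unreachable
            continue
        advertised[prefix] = metric            # all other routes go out as-is
    return advertised


routes = {
    "2001:db8:1::/64": (2, "vlan10"),
    "2001:db8:2::/64": (3, "vlan20"),
}
split = build_response(routes, "vlan10")
poisoned = build_response(routes, "vlan10", mode="poisoned-reverse")
```

The poisoned response carries one more entry than the split-horizon response, which is the extra packet size and bandwidth cost noted above.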
Holddown Timer
The Holddown Timer prevents a route entry from being updated by route
response packets for a period of time after the entry becomes unreachable.
This ensures that an unreachable route is not updated by a response
packet before every route device has received the unreachable
information, because the route entry information in a received response
packet may be stale information that was advertised before the route
became unreachable.
Triggered Updates
With Triggered Updates, when a route changes, a route response packet is
used at once to advertise the changed route information to the adjacent
route devices.
Poisoned Reverse and Split Horizon break the route loop formed by any
two route devices, but a route loop formed by three or more route
devices can still appear and persists until the metric of the route is
transmitted and accumulated to unreachable (16). Triggered Updates speed
up route convergence, so as to shorten the time needed to break the route
loop.
IPv6 OSPFv3 Dynamic Routing Protocol
Main contents:
Terms of OSPFv3 Protocol
Introduction to the OSPFv3 protocol
Terms of OSPFv3 Protocol
AS (Autonomous System): a group of route devices exchanging routing
information through the same routing protocol.
Area: a collection of route devices that share the same topology
database. OSPFv3 divides one AS into multiple areas; the topology of one
area is invisible to another area, which reduces the amount of routing
information in an AS. The area is used to contain link state updates and
enables the administrator to create a hierarchical network.
Area ID: the 32-bit ID of the area in the AS.
IGP (Interior Gateway Protocol): the routing protocol running on the
route devices of an AS. Each AS has an independent IGP; different ASs may
run different IGPs. OSPFv3 is one kind of IGP.
Router ID: a 32-bit number assigned to each route device running OSPFv3
so that the route device can be identified in the AS.
Point-to-point network: a network composed of a pair of route devices,
such as a 56 kbps serial port connection.
Broadcast network: a network that supports multiple (more than two) route
devices, on which a route device can exchange information with all route
devices on the network (broadcast). Neighbor route devices are
dynamically detected by OSPFv3 hello packets. If the network has
multicast capability, OSPFv3 also uses multicast. Each pair of route
devices on the network is assumed to be able to communicate directly with
each other. Ethernet is an example of a broadcast network.