EVERYTHING YOU ALWAYS WANTED TO KNOW ABOUT OPENSTACK NETWORKING*
*But Were Afraid to Ask
AKA OpenStack debugging of a VLAN setup
Disclaimer
This is a tentative guide to testing and debugging, mostly of the networking, in the OpenStack cloud world. We have spent a huge amount of time looking at packet dumps in order to distill this information for you, in the belief that, following the recipes outlined in the following pages, you will have an easier time! Keep in mind that this comes more from day-by-day debugging than from a structured plan, so I tried to separate the pieces according to the architecture that I have in mind... but it is, and will remain, a work in progress.
Reference setup
The setup is the following:
1. compute node: Ubuntu server 14.04, 4 Ethernet interfaces mapped on em1-4 (3 used)
2. controller + compute node: Ubuntu server 14.04, 4 Ethernet interfaces mapped on em1-4 (3 used)
3. network node: Ubuntu server 14.04, 4 Ethernet interfaces mapped on em1-4 (3 used)
The networking configuration is implemented with the Neutron service and is based on a VLAN approach, so as to obtain a complete L2 separation in a multi-tenant environment. Follow the OpenStack guide to configure the services (in the appendix you can find the configuration files that have been used in this case and a few configuration scripts).
Preliminary checks
Once you have agreed with your network administrators on the switch configuration (if you have no direct access to the switches), let's double check the port configuration for the VLAN ids. Capture an LLDP packet (ether proto 0x88cc) from each host and for each interface:

# tcpdump -vvv -s 1500 ether proto 0x88cc -i em1
(wait for a packet and then CTRL-c)

This command will give you some information about the switch that you are connected to and the VLAN configuration. NB: if the port is in trunk mode you may get the same result as if the port had no VLAN settings. An example of the output of the command for an interface attached to a port that is configured as access:
tcpdump: WARNING: em1: no IPv4 address assigned
tcpdump: listening on em1, link-type EN10MB (Ethernet), capture size 1500 bytes
12:33:03.255101 LLDP, length 351
[...]
System Name TLV (5), length 13: stackdr2.GARR
  0x0000: 7374 6163 6b64 7232 2e47 4152 52
[...]
Port Description TLV (4), length 21: GigabitEthernet2/0/31
[...]
Organization specific TLV (127), length 6: OUI Ethernet bridged (0x0080c2)
  Port VLAN Id Subtype (1)
  port vlan id (PVID): 320
[...]
1 packet captured
1 packet received by filter
0 packets dropped by kernel
An example of the output of the command for an interface attached to a port that is configured as trunk:
# tcpdump -vvv -s 1500 ether proto 0x88cc -i em3
tcpdump: WARNING: em3: no IPv4 address assigned
tcpdump: listening on em3, link-type EN10MB (Ethernet), capture size 1500 bytes
12:32:11.513135 LLDP, length 349
[...]
System Name TLV (5), length 13: stackdr2.GARR
[...]
Port Description TLV (4), length 20: GigabitEthernet2/0/3
[...]
  Port VLAN Id Subtype (1)
  port vlan id (PVID): 1
[...]
^C
1 packet captured
1 packet received by filter
0 packets dropped by kernel
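A quick way to pull just the PVID out of that verbose dump is to filter it. The sketch below runs against saved output; the heredoc stands in for the real tcpdump text shown above:

```shell
# Extract the port VLAN id (PVID) from saved LLDP dump text.
# The heredoc mimics the output of:
#   tcpdump -vvv -s 1500 ether proto 0x88cc -i em1
pvid=$(awk '/port vlan id \(PVID\)/ {print $NF}' <<'EOF'
System Name TLV (5), length 13: stackdr2.GARR
Port Description TLV (4), length 21: GigabitEthernet2/0/31
port vlan id (PVID): 320
EOF
)
echo "PVID is $pvid"
```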
Check Interfaces
On compute nodes, use the following command to see information about the interfaces (IPs, VLAN ids) and to know whether the interfaces are up:

# ip a

One good initial sanity check is to make sure that your interfaces are up:

# ip a | grep 'em[1,3]' | grep state
2: em3: mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
6: em1: mtu 1500 qdisc mq state UP group default qlen 1000
37: br-em3: mtu 1500 qdisc noqueue state UNKNOWN group default
Troubleshooting Open vSwitch
Open vSwitch is a multilayer virtual switch. Full documentation can be found at the website. In practice you need to ensure that the required bridges (br-int, br-ex, br-em1, br-em3, etc.) exist and have the proper ports connected to them, using the ovs-vsctl and ovs-ofctl commands. To list the bridges on a system (VLAN networks are trunked through the em3 network interface):

# ovs-vsctl list-br
br-em3
br-ex
br-int
Example: on the network node (you should follow the same logic on the compute node). Let's check the chain of ports and bridges. The bridge br-em3 contains the physical network interface em3 (trunk network) and the virtual interface phy-br-em3, attached to the int-br-em3 port of br-int:

# ovs-vsctl list-ports br-em3
em3
phy-br-em3
# ovs-vsctl show
    Bridge "br-em3"
        Port "em3"
            Interface "em3"
        Port "phy-br-em3"
            Interface "phy-br-em3"
                type: patch
                options: {peer="int-br-em3"}
        Port "br-em3"
            Interface "br-em3"
                type: internal

br-int contains int-br-em3 (which pairs with phy-br-em3 to connect to the physical network used to reach the compute nodes), the TAP devices that connect to the DHCP instances, and the TAP interfaces that connect to the virtual routers:

# ovs-vsctl list-ports br-int
int-br-em3
int-br-ex
qr-9ae4acd4-92
qr-ae75168a-67
qr-e323976e-2b
qr-e3debf8d-ee
tap1474f18d-a9
tap7c29ce27-4e
tapc974ab53-25
tapd9762af3-4b
# ovs-vsctl show
    Bridge br-int
        fail_mode: secure
        Port "tapd9762af3-4b"
            tag: 5
            Interface "tapd9762af3-4b"
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        [...]
        Port "qr-9ae4acd4-92"
            tag: 1
            Interface "qr-9ae4acd4-92"
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port "tap1474f18d-a9"
            tag: 3
            Interface "tap1474f18d-a9"
                type: internal
# ovs-vsctl list-ports br-ex
em4
phy-br-ex
# ovs-vsctl show
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
        Port "em4"
            Interface "em4"
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}

If any of these links is missing or incorrect, it suggests a configuration error.
NB: you can also check the correct VLAN tag translation along the overall chain with ovs-ofctl commands, i.e. (more details follow):

# ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=6718.658s, table=0, n_packets=0, n_bytes=0, idle_age=6718, priority=3,in_port=1,dl_vlan=325 actions=mod_vlan_vid:4,NORMAL
 cookie=0x0, duration=6719.335s, table=0, n_packets=0, n_bytes=0, idle_age=6719, priority=3,in_port=1,dl_vlan=327 actions=mod_vlan_vid:3,NORMAL
 cookie=0x0, duration=6720.508s, table=0, n_packets=3, n_bytes=328, idle_age=6715, priority=3,in_port=1,dl_vlan=328 actions=mod_vlan_vid:1,NORMAL
 cookie=0x0, duration=5840.156s, table=0, n_packets=139, n_bytes=13302, idle_age=972, priority=3,in_port=1,dl_vlan=320 actions=mod_vlan_vid:5,NORMAL
 cookie=0x0, duration=6719.906s, table=0, n_packets=58, n_bytes=6845, idle_age=6464, priority=3,in_port=1,dl_vlan=324 actions=mod_vlan_vid:2,NORMAL
 cookie=0x0, duration=6792.845s, table=0, n_packets=555, n_bytes=100492, idle_age=9, priority=2,in_port=1 actions=drop
 cookie=0x0, duration=6792.025s, table=0, n_packets=555, n_bytes=97888, idle_age=9, priority=2,in_port=2 actions=drop
 cookie=0x0, duration=6793.667s, table=0, n_packets=203, n_bytes=22402, idle_age=4535, priority=1 actions=NORMAL
 cookie=0x0, duration=6793.605s, table=23, n_packets=0, n_bytes=0, idle_age=6793, priority=0 actions=drop
Bridges can be added with ovs-vsctl add-br, and ports can be
added to bridges with ovs-vsctl add-port.
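If one of the bridges or ports turns out to be missing from the chain, it can be recreated by hand. A minimal sketch, using this guide's names (adjust br-em3 and em3 to your own setup):

```shell
# Recreate the trunk bridge and attach the physical NIC to it
# (bridge and interface names follow this guide's setup).
ovs-vsctl add-br br-em3
ovs-vsctl add-port br-em3 em3
```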
Troubleshoot neutron traffic
Refer to the Cloud Administrator Guide for a variety of networking scenarios and their connection paths. We use the Open vSwitch (OVS) backend. See the following figure for reference.
1. The instance generates a packet and sends it through the virtual NIC inside the instance, such as eth0.
2. The packet transfers to a Test Access Point (TAP) device on the compute host, such as tap1d40b89c-fe. You can find out which TAP is being used by looking at the /etc/libvirt/qemu/instance-xxxxxxxx.xml file. Following, an example with the interesting parts highlighted:

instance-00000015
cc2b7876-6d3a-4b78-b817-ed36146a9b9e
[....]
fig: Neutron network paths (see the networking scenarios chapter for more details)
Looking also at the neutron part and highlighting the VLAN configuration, we have something like the following (I recycled the image, so br-eth1 is br-emXX in my setup and the ethYY are emZZ, but the flow is the point that I want to stress here):
1. The TAP device is connected to the integration bridge, br-int. This bridge connects all the instance TAP devices and any other bridges on the system. int-br-eth1 is one half of a veth pair connecting to the bridge br-eth1, which handles VLAN networks trunked over the physical Ethernet device eth1.
2. The TAP devices and veth devices are normal Linux network devices and may be inspected with the usual tools, such as ip and tcpdump. Open vSwitch internal devices are only visible within the Open vSwitch environment:

# tcpdump -i int-br-em3
tcpdump: int-br-em3: No such device exists (SIOCGIFHWADDR: No such device)
3. To watch packets on internal interfaces, you need to create a dummy network device and add it to the bridge containing the internal interface you want to snoop on. Then tell Open vSwitch to mirror all traffic to or from the internal port onto this dummy port, so that you can run tcpdump on the dummy interface and see the traffic on the internal port.
4. Capture packets from an internal interface on the integration bridge, br-int (advanced):
   1. Create and bring up a dummy interface, snooper0:
      # ip link add name snooper0 type dummy
      # ip link set dev snooper0 up
   2. Add device snooper0 to bridge br-int:
      # ovs-vsctl add-port br-int snooper0
   3. Create a mirror of, for example, the int-br-em3 interface to snooper0 (all on one line; it returns the UUID of the mirror port):
      # ovs-vsctl -- set Bridge br-int mirrors=@m -- --id=@snooper0 get Port snooper0 -- --id=@int-br-em3 get Port int-br-em3 -- --id=@m create Mirror name=mymirror select-dst-port=@int-br-em3 select-src-port=@int-br-em3 output-port=@snooper0
      dcce2c59-be1a-4f2d-b00b-9d906c77ee8a
   4. From here you can see the traffic going through int-br-em3 with a tcpdump -i snooper0.
   5. Clean up the mirrors:
      # ovs-vsctl clear Bridge br-int mirrors
      # ovs-vsctl del-port br-int snooper0
      # ip link delete dev snooper0
On the integration bridge, networks are distinguished using internal VLAN ids (unrelated to the segmentation IDs used in the network definition and on the physical wire), regardless of how the networking service defines them. This allows instances on the same host to communicate directly without transiting the rest of the virtual, or physical, network. On br-int, incoming packets are translated from external tags to internal tags. Other translations also happen on the other bridges and will be discussed later.
5. To discover which internal VLAN tag is in use for a given external VLAN, use the ovs-ofctl command:
   1. Find the external VLAN tag of the network you're interested in with:
      # neutron net-show --fields provider:segmentation_id
      +---------------------------+-------+
      | Field                     | Value |
      +---------------------------+-------+
      | provider:network_type     | vlan  |
      | provider:segmentation_id  | 324   |
      +---------------------------+-------+
   2. Grep for the provider:segmentation_id, 324 in this case, in the output of ovs-ofctl dump-flows br-int:
      # ovs-ofctl dump-flows br-int | grep vlan=324
      cookie=0x0, duration=105039.122s, table=0, n_packets=5963, n_bytes=482203, idle_age=1104, hard_age=65534, priority=3,in_port=1,dl_vlan=324 actions=mod_vlan_vid:1,NORMAL
   3. Here you can see that packets received on port ID 1 with the VLAN tag 324 are modified to carry the internal VLAN tag 1. Digging a little deeper, you can confirm that port 1 is in fact int-br-em3:
   4. # ovs-ofctl show br-int
      OFPT_FEATURES_REPLY (xid=0x2): dpid:0000029a51549b40
      n_tables:254, n_buffers:256
      capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
      actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
       1(int-br-em3): addr:52:40:bd:b3:88:9c
           config: 0  state: 0
           speed: 0 Mbps now, 0 Mbps max
       2(qvof3b63d31-a0): addr:4e:db:74:04:53:4d
           config: 0  state: 0
           current: 10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       3(qvo65fb5ad8-b5): addr:92:75:b8:03:cc:1d
           config: 0  state: 0
           current: 10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       4(qvoa6e8c6e3-1c): addr:82:22:71:c5:4e:f8
           config: 0  state: 0
           current: 10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       5(qvo1d40b89c-fe): addr:5e:e3:15:53:e5:16
           config: 0  state: 0
           current: 10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       6(qvoff8e411e-6e): addr:02:a9:38:d6:88:22
           config: 0  state: 0
           current: 10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       LOCAL(br-int): addr:02:9a:51:54:9b:40
           config: 0  state: 0
           speed: 0 Mbps now, 0 Mbps max
      OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
5. (NB: this is NOT valid if you are using a GRE tunnel.) VLAN-based networks exit the integration bridge via a veth interface, i.e. int-br-em3 (int-br-eth1 in the picture), and arrive on a bridge, i.e. br-em3 (br-eth1), on the other member of the veth pair, phy-br-em3 (phy-br-eth1). Packets on this interface arrive with internal VLAN tags and are translated to external tags in the reverse of the process described above:
# ovs-ofctl dump-flows br-em3 | grep 324
cookie=0x0, duration=105402.89s, table=0, n_packets=7374, n_bytes=905197, idle_age=1468, hard_age=65534, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:324,NORMAL
6. Packets, now tagged with the external VLAN tag, then exit onto the physical network via em3 (eth1). The Layer 2 switch this interface is connected to must be configured as trunk on the VLAN IDs used. The next hop for this packet must also be on the same Layer 2 network.
6. The packet is then received on the network node. Note that any traffic to the l3-agent or dhcp-agent will be visible only within their network namespace. Watching any interfaces outside those namespaces, even those that carry the network traffic, will only show broadcast packets like Address Resolution Protocol (ARP) requests; unicast traffic to the router or DHCP address will not be seen. See Dealing with Network Namespaces for detail on how to run commands within these namespaces.
7. Alternatively, it is possible to configure VLAN-based networks to use external routers rather than the l3-agent shown here, as long as the external router is on the same VLAN:
   1. VLAN-based networks are received as tagged packets on a physical network interface, eth1 in this example. Just as on the compute node, this interface is a member of the br-eth1 bridge.
   2. GRE-based networks will be passed to the tunnel bridge br-tun, which behaves just like the GRE interfaces on the compute node.
8. Next, the packets from either input go through the integration bridge, again just as on the compute node.
9. The packet then makes it to the l3-agent. This is actually another TAP device within the router's network namespace. Router namespaces are named in the form qrouter-<router-uuid>. Running ip a within the namespace will show the TAP device name, qr-e6256f7d-31 in this example:
10. # ip netns exec qrouter-e521f9d0-a1bd-4ff4-bc81-78a60dd88fe5 ip a | grep state
    10: qr-e6256f7d-31: mtu 1500 qdisc noqueue state UNKNOWN
    11: qg-35916e1f-36: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500
    28: lo: mtu 16436 qdisc noqueue state UNKNOWN
11. The qg interface in the l3-agent router namespace sends the packet on to its next hop through device eth2 on the external bridge, br-ex. This bridge is constructed similarly to br-eth1 and may be inspected in the same way.
12. This external bridge also includes a physical network interface, eth2 in this example, which finally lands the packet on the external network, destined for an external router or destination.
13. DHCP agents running on OpenStack networks run in namespaces similar to the l3-agents'. DHCP namespaces are named qdhcp-<network-uuid> and have a TAP device on the integration bridge. Debugging of DHCP issues usually involves working inside this network namespace.
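As a concrete sketch of working inside such a namespace (the namespace UUID and TAP name below are placeholders; take them from ip netns and from ip a run inside the namespace), DHCP requests and replies can be watched with:

```shell
# Watch DHCP traffic from inside a dhcp-agent's namespace.
# qdhcp-<uuid> and tapXXXXXXXX-XX are placeholders for your system.
ip netns exec qdhcp-<uuid> tcpdump -l -n -i tapXXXXXXXX-XX port 67 or port 68
```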
Debug a problem along the Path
Ping is your best friend! From an instance:
1. See whether you can ping an external host, such as 8.8.8.8 (Google, which usually is up: from stats, 99.9%).
2. If you can't, try the IP address of the compute node where the virtual machine is hosted.
3. If you can ping this IP, then the problem is somewhere between the compute node and that compute node's gateway.
4. If you can't, the problem is between the instance and the compute node. Check also the bridge connecting the compute node's main NIC with the vnet NIC of the VM.
5. Launch a second instance and see whether the two instances can ping each other. If they can, the issue might be related to the firewall on the compute node. See further for iptables debugging.
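The checklist above can be turned into a small loop that stops at the first unreachable hop. A sketch; the hop list is illustrative and must be replaced with your external host, the compute node's gateway and the compute node itself:

```shell
# Walk a list of hops and stop at the first one that does not answer.
# The addresses are placeholders: external host, gateway, compute node.
for hop in 8.8.8.8 10.0.0.1 10.0.0.42; do
    if ping -c 1 -W 2 "$hop" > /dev/null 2>&1; then
        echo "$hop reachable"
    else
        echo "$hop unreachable: the problem is between here and $hop"
        break
    fi
done
```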
tcpdump
This is your second best friend to help with troubleshooting network issues. Using tcpdump at several points along the network path should help you find where the problem is. For example, run the following command:

tcpdump -i any -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
on:
1. An external server outside of the cloud (in the example 193.206.159.201)
2. A compute node
3. An instance running on that compute node

In this example, these locations have the following IP addresses:

Instance:        10.0.2.24, 203.0.113.30
Compute node:    10.0.0.42, 203.0.113.34
External server: 1.2.3.4
Next, open a new shell to the instance and then ping the external host where tcpdump is running. If the network path to the external server and back is fully functional, you see something like the following.

On the external server:

$ tcpdump -i any -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
10:20:23.517242 IP (tos 0x0, ttl 64, id 65416, offset 0, flags [none], proto ICMP (1), length 84)
    193.206.159.201 > 90.147.91.10: ICMP echo reply, id 1606, seq 28, length 64
The external server received the ping request and sent a ping reply. On the compute node you can follow the traffic along the path:

1. On the tap device which connects the VM to the Linux bridge (to find the tap, see above):

# tcpdump -i tap88ab3af7-7d -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: tap88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on tap88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:36:31.000419 IP (tos 0x0, ttl 64, id 1469, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 1, length 64
2. On the two sides of the veth pair between the Linux bridge and the OVS br-int:

# tcpdump -i qbr88ab3af7-7d -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: qbr88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on qbr88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:36:59.035767 IP (tos 0x0, ttl 64, id 1497, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 29, length 64

# tcpdump -i qvb88ab3af7-7d -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: qvb88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on qvb88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:37:18.058899 IP (tos 0x0, ttl 64, id 1516, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 48, length 64
3. And finally on the outgoing interface (em1 in the example):

# tcpdump -i em1 -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: em1: no IPv4 address assigned
tcpdump: listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
10:37:49.099383 IP (tos 0x0, ttl 64, id 1547, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 79, length 64
On the instance:

# tcpdump -i any -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
09:27:04.801759 IP (tos 0x0, ttl 64, id 36704, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 192.168.21.107: ICMP echo request, id 1693, seq 27, length 64
NB: it can be useful to show the VLAN tag when debugging traffic. To do this, use:

# tcpdump -i <interface> -U -w - | tcpdump -e -n -r - vlan
iptables and security rules
OpenStack Compute automatically manages iptables, including forwarding packets to and from instances on a compute node, forwarding floating IP traffic, and managing security group rules.

# iptables-save

shows you all the rules.
Example of setup of security rules
To show the security rules:

# nova secgroup-list-rules default
+-------------+-----------+---------+----------+--------------+
| IP Protocol | From Port | To Port | IP Range | Source Group |
+-------------+-----------+---------+----------+--------------+
|             |           |         |          | default      |
|             |           |         |          | default      |
+-------------+-----------+---------+----------+--------------+

To set up a rule to let ICMP traffic pass through:

# nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
+-------------+-----------+---------+-----------+--------------+
| IP Protocol | From Port | To Port | IP Range  | Source Group |
+-------------+-----------+---------+-----------+--------------+
| icmp        | -1        | -1      | 0.0.0.0/0 |              |
|             |           |         |           | default      |
|             |           |         |           | default      |
+-------------+-----------+---------+-----------+--------------+
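Similarly, a rule to let SSH traffic in (a sketch, not part of the original setup) would be:

```shell
# Allow inbound TCP port 22 (SSH) from anywhere in the default group.
nova secgroup-add-rule default tcp 22 22 0.0.0.0/0
```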
Troubleshooting DNS
The SSH server does a reverse DNS lookup on the IP address that you are connecting from, so if you can use SSH to log into an instance but it takes on the order of a minute, then you might have a DNS issue.
A quick way to check whether DNS is working is to resolve a hostname inside your instance by using the host command. If DNS is working, you should see:

# host garr.it
garr.it mail is handled by 15 lx1.dir.garr.it.
garr.it mail is handled by 20 lx5.dir.garr.it.
Note: if you're running the Cirros image, it doesn't have the "host" program installed, in which case you can use ping to try to reach a machine by hostname to see whether it resolves.
Dealing with Network Namespaces
Linux network namespaces are a kernel feature the networking service uses to support multiple isolated layer-2 networks with overlapping IP address ranges. Your network nodes will run their dhcp-agents and l3-agents in isolated namespaces. NB: network interfaces and traffic on those interfaces will not be visible in the default namespace. L3-agent router namespaces are named qrouter-<router-uuid>, and dhcp-agent namespaces are named qdhcp-<network-uuid>. To see whether you are using namespaces, run ip netns:
# ip netns
qrouter-80fdf884-37c3-4d33-a340-cd1a09510e59
qdhcp-c3cfc51b-f07c-47ae-bdb4-b029035c08d7
qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b
qrouter-edcb7cb5-37fd-4b31-81c5-cee1bda75369
qdhcp-286f2844-6b76-42e5-9664-ab5123bde2d5
qrouter-3618b020-4f3c-4a72-8c02-e25db0c4769d
qdhcp-c8a29266-e9ac-45e0-be6d-79c32f501194
qrouter-301f264a-8ef1-413d-b252-c0886fc2c815
qrouter-9d378195-ee93-45f0-b27f-2bd48b774f5a
qdhcp-13c334c1-ad39-4c51-b396-953430059b22
This output shows a network node with 5 networks running dhcp-agents, each also running an l3-agent router. A list of the existing networks and their UUIDs can be obtained by running neutron net-list with administrative credentials:

# neutron net-list
+--------------------------------------+------------------+------------------------------------------------------+
| id                                   | name             | subnets                                              |
+--------------------------------------+------------------+------------------------------------------------------+
| 13c334c1-ad39-4c51-b396-953430059b22 | intnet324        | edd7678a-277c-477e-a5ac-84258e6b1794 192.168.1.0/24  |
| 286f2844-6b76-42e5-9664-ab5123bde2d5 | inafnet          | dbf5bd19-de67-4b84-a97b-8e322f9343dc 192.168.3.0/24  |
| 99e9c208-b72a-427f-97f6-2443cdd6de9c | extnetflat319    | e0ef8d6f-3fa9-4a05-ae2c-5ec229357f4b 90.147.91.0/24  |
| b4ef2523-bebe-4dbe-b5b7-82983fec6be8 | extnetflat319bis | 91ccda54-2af1-4a59-bf08-8bb0821c1c08 90.147.91.0/24  |
| c3cfc51b-f07c-47ae-bdb4-b029035c08d7 | intnet328        | 0d36feb3-4c83-4867-a227-fb972564125c 192.168.8.0/24  |
| c8a29266-e9ac-45e0-be6d-79c32f501194 | ingvnet          | 915f9929-e49b-4a95-a193-c71227ff870d 192.168.2.0/24  |
| f7bff056-1d27-4c12-a917-6ffe2925a44b | eneanet          | d9d1ba30-4a14-4aab-a95f-4ed2c3f895d3 192.168.4.0/24  |
+--------------------------------------+------------------+------------------------------------------------------+
Once you've determined which namespace you need to work in, you can use any of the debugging tools mentioned earlier by prefixing the command with ip netns exec <namespace>. For example, to see what network interfaces exist in the first qdhcp namespace returned above, do this:
# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
61: tapd9762af3-4b: mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether fa:16:3e:b8:2e:0c brd ff:ff:ff:ff:ff:ff
    inet 192.168.4.100/24 brd 192.168.4.255 scope global tapd9762af3-4b
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:feb8:2e0c/64 scope link
       valid_lft forever preferred_lft forever
From this you see that the DHCP server on that network is using the tapd9762af3-4b device and has an IP address of 192.168.4.100. The usual commands mentioned previously can be run in the same way. Note: it is also possible to run a shell and have an interactive session within the namespace, i.e.:

# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b bash
root@network:~# ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

tapd9762af3-4b Link encap:Ethernet  HWaddr fa:16:3e:b8:2e:0c
          inet addr:192.168.4.100  Bcast:192.168.4.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:feb8:2e0c/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:22 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1788 (1.7 KB)  TX bytes:738 (738.0 B)
Mapping of physnet vs network inside the neutron db
Sometimes there can be an error, unclear from the logs' point of view, claiming that no suitable resources were found at the moment of VM creation. It could be related to a problem in the neutron DB. To find out:

1. Check that the nova services are running on the compute nodes and controller:
# nova service-list
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host       | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+
| 1  | nova-compute     | compute    | nova     | enabled | up    | 2015-02-12T13:52:45.000000 | -               |
| 2  | nova-cert        | controller | internal | enabled | up    | 2015-02-12T13:52:40.000000 | -               |
| 3  | nova-consoleauth | controller | internal | enabled | up    | 2015-02-12T13:52:40.000000 | -               |
| 4  | nova-scheduler   | controller | internal | enabled | up    | 2015-02-12T13:52:45.000000 | -               |
| 5  | nova-conductor   | controller | internal | enabled | up    | 2015-02-12T13:52:44.000000 | -               |
| 6  | nova-compute     | controller | nova     | enabled | up    | 2015-02-12T13:52:46.000000 | -               |
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+
2. Check that there are enough hardware resources:

# nova hypervisor-stats
+----------------------+--------+
| Property             | Value  |
+----------------------+--------+
| count                | 2      |
| current_workload     | 0      |
| disk_available_least | 1130   |
| free_disk_gb         | 1274   |
| free_ram_mb          | 367374 |
| local_gb             | 1454   |
| local_gb_used        | 180    |
| memory_mb            | 386830 |
| memory_mb_used       | 19456  |
| running_vms          | 6      |
| vcpus                | 80     |
| vcpus_used           | 9      |
+----------------------+--------+
3. Check that there is no problem in the mapping of physnets and networks in the neutron DB (i.e. trunknet is our VLAN-tagged network):

select * from ml2_vlan_allocations;
+------------------+---------+-----------+
| physical_network | vlan_id | allocated |
+------------------+---------+-----------+
| trunknet         |     319 |         0 |
| trunknet         |     320 |         0 |
| trunknet         |     321 |         0 |
| trunknet         |     322 |         0 |
| trunknet         |     323 |         0 |
| trunknet         |     324 |         0 |
| trunknet         |     325 |         0 |
| trunknet         |     326 |         0 |
| trunknet         |     327 |         0 |
| trunknet         |     328 |         0 |
+------------------+---------+-----------+
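The same query can be run non-interactively from the controller, for example (assuming the MySQL root account and the default neutron database name; adjust to your installation):

```shell
# Dump the VLAN allocation table from the neutron database.
mysql -u root -p neutron -e 'SELECT * FROM ml2_vlan_allocations;'
```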
Debugging with logs - where are the Logs?
Following, a quick summary table of the services' log locations; more in OpenStack log locations.

Node type            Service                                Log location
Cloud controller     nova-*                                 /var/log/nova
Cloud controller     glance-*                               /var/log/glance
Cloud controller     cinder-*                               /var/log/cinder
Cloud controller     keystone-*                             /var/log/keystone
Cloud controller     neutron-*                              /var/log/neutron
Cloud controller     horizon                                /var/log/apache2/
All nodes            misc (swift, dnsmasq)                  /var/log/syslog
Compute nodes        libvirt                                /var/log/libvirt/libvirtd.log
Compute nodes        console (boot up messages) for VMs     /var/lib/nova/instances/instance-<id>/console.log
Block Storage nodes  cinder-volume                          /var/log/cinder/cinder-volume.log
Backup + Recovery (for Real)
This chapter describes only how to back up configuration files and databases that the various OpenStack components need to run. It does not describe how to back up objects inside Object Storage or data contained inside Block Storage.
Database Backups
The cloud controller is the MySQL server hosting the databases for nova, glance, cinder, and keystone. To create a database backup:

# mysqldump -u <user> -h controller -p --all-databases > openstack.sql

To back up a single database (i.e. nova) you can run:

# mysqldump -u <user> -h controller -p nova > nova.sql
You can easily automate this process. The following script dumps the entire MySQL database and deletes any backups older than seven days:

#!/bin/bash
backup_dir="/var/lib/backups/mysql"
filename="${backup_dir}/mysql-`hostname`-`eval date +%Y%m%d`.sql.gz"
# Dump the entire MySQL database
/usr/bin/mysqldump -u root -p123grid --all-databases | gzip > $filename
# Delete backups older than 7 days
find $backup_dir -ctime +7 -type f -delete
File System Backups
Compute
The /etc/nova directory on both the cloud controller and compute nodes should be backed up. /var/lib/nova is another directory to back up. Note: it is not useful to back up the /var/lib/nova/instances subdirectory on compute nodes, which contains the KVM images of running instances, unless you need to maintain backup copies of all instances.
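A minimal sketch of such a backup (the archive name and --exclude pattern are illustrative choices, not from the original guide):

```shell
# Back up nova's configuration and state, skipping the large
# per-instance disk images (file names are illustrative).
backup=/root/nova-backup-$(date +%Y%m%d).tar.gz
tar --exclude='/var/lib/nova/instances' -czf "$backup" /etc/nova /var/lib/nova
```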
Image Catalog and Delivery
/etc/glance and /var/log/glance should be backed up. /var/lib/glance should also be backed up.
There are two ways to ensure stability with this directory. The first is to make sure this directory is run on a RAID array: if a disk fails, the directory remains available. The second way is to use a tool such as rsync to replicate the images to another server:

# rsync -az --progress /var/lib/glance/images backup-server:/var/lib/glance/images/
Identity
/etc/keystone and /var/log/keystone follow the same rules as the other components. /var/lib/keystone should not contain any data being used.
Recovering Backups
Recovering backups is a simple process:
1. Ensure that the service you are recovering is not running. I.e., in the case of nova:

# stop nova-cert
# stop nova-consoleauth
# stop nova-novncproxy
# stop nova-objectstore
# stop nova-scheduler

2. Import a previously backed-up database:

# mysql -u root -p --one-database neutron < openstack.sql
Request flow for launching an instance
1. The dashboard or CLI gets the user credentials and authenticates with the Identity Service via REST API.
2. The Identity Service authenticates the user with the user credentials, and then generates and sends back an auth token, which will be used for sending requests to the other components through REST calls.
3. The dashboard or CLI converts the new instance request specified in the launch instance or nova boot form to a REST API request and sends it to nova-api.
4. nova-api receives the request and sends a request to the Identity Service for validation of the auth token and access permission.
5. The Identity Service validates the token and sends updated authentication headers with roles and permissions.
6. nova-api checks for conflicts with the nova database.
7. nova-api creates an initial database entry for the new instance.
8. nova-api sends the rpc.call request to nova-scheduler, expecting to get an updated instance entry with the host ID specified.
9. nova-scheduler picks up the request from the queue.
10. nova-scheduler interacts with the nova database to find an appropriate host via filtering and weighing.
11. nova-scheduler returns the updated instance entry with the appropriate host ID after filtering and weighing.
12. nova-scheduler sends the rpc.cast request to nova-compute for launching an instance on the appropriate host.
13. nova-compute picks up the request from the queue.
14. nova-compute sends the rpc.call request to nova-conductor to fetch the instance information, such as host ID and flavor (RAM, CPU, disk).
15. nova-conductor picks up the request from the queue.
16. nova-conductor interacts with the nova database.
17. nova-conductor returns the instance information.
18. nova-compute picks up the instance information from the queue.
19. nova-compute performs the REST call by passing the auth token to glance-api. Then, nova-compute uses the image ID to retrieve the image URI from the Image Service, and loads the image from the image storage.
20. glance-api validates the auth token with keystone.
21. nova-compute gets the image metadata.
22. nova-compute performs the REST call by passing the auth token to the Network API to allocate and configure the network so that the instance gets its IP address.
23. neutron-server validates the auth token with keystone.
24. nova-compute retrieves the network info.
25. nova-compute performs the REST call by passing the auth token to the Volume API to attach volumes to the instance.
26. cinder-api validates the auth token with keystone.
27. nova-compute retrieves the block storage info.
28. nova-compute generates data for the hypervisor driver and executes the request on the hypervisor (via libvirt or API).
Configuration options
All the details about the configuration options can be found at http://docs.openstack.org/juno/config-reference/content/index.html

Any comment is more than welcome!
Alex