Troubleshooting Common Networking Problems with Wireshark, Pt. 5: Broadcast Storms


Author’s Note: This is the fifth part in a six-part series about finding and solving many networking anomalies using the Wireshark network protocol analyzer. If you are new to the series, you can find part 1 here, and the whole series here.
Broadcast storms occur when a host floods the subnet with broadcasts. This is not a typical function of any given protocol, but can happen due to an application bug or hardware problem. Unfortunately, broadcast storms can be very difficult to isolate in a large network, but there are a few tricks that we can use to identify the source.
In the example we will examine here, all network printers on a given subnet become unresponsive whenever the DHCP server service on a Windows server on the same subnet is enabled. If the service is disabled, the problem goes away. While we may be tempted to blame the DHCP service itself, network traces will tell us the true cause. First, we need to determine whether the problem is due to a broadcast storm. To do this, we will open the trace and apply a Wireshark display filter that shows only broadcast packets:
eth.dst == ff:ff:ff:ff:ff:ff
Next, we will go to the Statistics menu and select ‘Summary’ to display statistics on the number of broadcast packets, like so:
[Screenshot: Wireshark Summary window showing total and filtered (broadcast) packet counts and capture duration]
Doing a little math, we can see that over 45% of the traffic in this trace is due to broadcasts! Even more bothersome is the fact that we are averaging 730 broadcasts a second. We definitely have a broadcast storm on our hands. So now we need to determine which system is causing the storm.
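The arithmetic behind those two figures is simple enough to sketch. The counts and capture duration below are illustrative placeholders, not the exact values from this trace; in the Summary window you would read them from the unfiltered and filtered packet totals and the elapsed time.

```python
# Sketch of the broadcast-storm math from Wireshark's Summary window.
# The values below are illustrative placeholders, not the trace's exact numbers.

total_packets = 100_000      # "Packets" from Summary, no filter applied
broadcast_packets = 45_500   # "Packets" with the broadcast display filter applied
capture_seconds = 62.3       # "Elapsed time" from Summary

broadcast_share = broadcast_packets / total_packets * 100
broadcasts_per_second = broadcast_packets / capture_seconds

print(f"Broadcast share: {broadcast_share:.1f}%")
print(f"Broadcast rate:  {broadcasts_per_second:.0f}/s")
```

As a rule of thumb, anything beyond a few percent of sustained broadcast traffic on a subnet deserves a closer look; 45% is unambiguous.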
To determine the culprit, we will examine our filtered trace. The first broadcast packet is below:

Frame 2 (368 bytes on wire, 368 bytes captured)
Ethernet II, Src: 160.1.153.47 (00:12:3f:68:3a:1c), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Internet Protocol, Src: 160.1.153.47 (160.1.153.47), Dst: 255.255.255.255 (255.255.255.255)
User Datagram Protocol, Src Port: bootps (67), Dst Port: bootpc (68)
Bootstrap Protocol

Message type: Boot Reply (2)
Hardware type: Ethernet
Hardware address length: 6
Hops: 0
Transaction ID: 0xec15491b
Seconds elapsed: 0
Bootp flags: 0x0000 (Unicast)
Client IP address: 0.0.0.0 (0.0.0.0)
Your (client) IP address: 172.16.33.128 (172.16.33.128)
Next server IP address: 0.0.0.0 (0.0.0.0)
Relay agent IP address: 0.0.0.0 (0.0.0.0)
Client MAC address: Fuji-Xer_29:d4:c0 (08:00:37:29:d4:c0)
Server host name not given
Boot file name not given
Magic cookie: (OK)
Option 53: DHCP Message Type = DHCP ACK
Option 58: Renewal Time Value = 10 days, 12 hours
Option 59: Rebinding Time Value = 18 days, 9 hours
Option 51: IP Address Lease Time = 21 days
Option 54: Server Identifier = 160.1.153.47
Option 1: Subnet Mask = 255.255.255.0
Option 3: Router = 172.16.33.5
Option 6: Domain Name Server
Option 15: Domain Name = "corp.com"
Option 44: NetBIOS over TCP/IP Name Server
Option 46: NetBIOS over TCP/IP Node Type = H-node
End Option

Looking at this packet, we can see that the broadcasts are indeed coming from the DHCP server, and examining the rest of the broadcast packets shows that almost all of them are identical to this one. So the DHCP server is generating the broadcast traffic. But is the server responsible for the storm? Disabling the broadcast filter and examining only the BOOTP traffic will show us the real culprit:

Frame 1 (590 bytes on wire, 590 bytes captured)
Ethernet II, Src: 160.1.30.2 (00:d0:04:7f:4c:00), Dst: 160.1.153.47 (00:12:3f:68:3a:1c)
Internet Protocol, Src: 172.16.33.128 (172.16.33.128), Dst: 160.1.153.47 (160.1.153.47)
User Datagram Protocol, Src Port: bootpc (68), Dst Port: bootps (67)
Bootstrap Protocol
Message type: Boot Request (1)
Hardware type: Ethernet
Hardware address length: 6
Hops: 0
Transaction ID: 0xec15491b
Seconds elapsed: 27582
Bootp flags: 0x8000 (Broadcast)
Client IP address: 172.16.33.128 (172.16.33.128)
Your (client) IP address: 0.0.0.0 (0.0.0.0)
Next server IP address: 0.0.0.0 (0.0.0.0)
Relay agent IP address: 0.0.0.0 (0.0.0.0)
Client MAC address: Fuji-Xer_29:d4:c0 (08:00:37:29:d4:c0)
Server host name not given
Boot file name not given
Magic cookie: (OK)
Option 53: DHCP Message Type = DHCP Request
Option 12: Host Name = "DELL29D4C0"
Option 57: Maximum DHCP Message Size = 548
Option 55: Parameter Request List
End Option
Padding

Frame 2 (368 bytes on wire, 368 bytes captured)
Ethernet II, Src: 160.1.153.47 (00:12:3f:68:3a:1c), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Internet Protocol, Src: 160.1.153.47 (160.1.153.47), Dst: 255.255.255.255 (255.255.255.255)
User Datagram Protocol, Src Port: bootps (67), Dst Port: bootpc (68)
Bootstrap Protocol
Message type: Boot Reply (2)
Hardware type: Ethernet
Hardware address length: 6
Hops: 0
Transaction ID: 0xec15491b
Seconds elapsed: 0
Bootp flags: 0x0000 (Unicast)
Client IP address: 0.0.0.0 (0.0.0.0)
Your (client) IP address: 172.16.33.128 (172.16.33.128)
Next server IP address: 0.0.0.0 (0.0.0.0)
Relay agent IP address: 0.0.0.0 (0.0.0.0)
Client MAC address: Fuji-Xer_29:d4:c0 (08:00:37:29:d4:c0)
Server host name not given
Boot file name not given
Magic cookie: (OK)
Option 53: DHCP Message Type = DHCP ACK
Option 58: Renewal Time Value = 10 days, 12 hours
Option 59: Rebinding Time Value = 18 days, 9 hours
Option 51: IP Address Lease Time = 21 days
Option 54: Server Identifier = 160.1.153.47
Option 1: Subnet Mask = 255.255.255.0
Option 3: Router = 172.16.33.5
Option 6: Domain Name Server
Option 15: Domain Name = "corp.com"
Option 44: NetBIOS over TCP/IP Name Server
Option 46: NetBIOS over TCP/IP Node Type = H-node
End Option

As you can see from the BOOTP traffic, the DHCP server is responding to a DHCP Request packet from a client. Looking at the entire trace, you can see these same two packets repeated over and over, 700+ times per second. If you were to disable the DHCP server service and take another capture, you would see the same client sending the same requests at the same rate, just with no DHCP server response. So the real issue is that this client is constantly re-requesting the same IP address. Resolving the issue therefore requires locating the client, and for that we once again turn to the traces.
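The repetition is easy to confirm outside the GUI. The sketch below counts (client MAC, transaction ID) pairs among BOOTP requests; a healthy client shows up a handful of times, while a looping client dominates the counts. In practice you might export these two fields with tshark (for example, `-T fields -e eth.src -e bootp.id`); the sample rows here are made up to mirror the trace.

```python
from collections import Counter

# Sketch: spot a looping DHCP client by counting (client MAC, transaction ID)
# pairs among BOOTP requests. The sample rows below are illustrative; in a
# real trace there would be hundreds of repeats per second.

requests = [
    ("08:00:37:29:d4:c0", "0xec15491b"),
    ("08:00:37:29:d4:c0", "0xec15491b"),
    ("08:00:37:29:d4:c0", "0xec15491b"),
    ("00:11:22:33:44:55", "0x1a2b3c4d"),  # a normal, one-off request
]

counts = Counter(requests)
(culprit_mac, xid), hits = counts.most_common(1)[0]
print(f"Most repeated request: MAC {culprit_mac}, xid {xid} ({hits} times)")
```

A transaction ID that never changes across retries, as in this trace, is itself a hint that the client is misbehaving rather than performing a normal lease renewal.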
In Frame 1, we can see the best identifier for the client: the source MAC address, carried in the BOOTP header. So how do we track down the client that owns this MAC address in a large environment? Unless the customer keeps an exhaustive list of every MAC address in the environment, the simplest method is usually to trace it through the MAC address tables in the switches. Most managed switches have a command to view these tables; on most Cisco switches it is ‘show mac-address-table address [MAC address]’. Using this command, the customer can follow the MAC address trail back to the client, though it may take some time if many switches need to be checked. Also pay attention to the first three octets (six hex digits) of the MAC address, which identify the vendor of the NIC; a site such as http://www.coffer.com/mac_find/ can help determine the vendor. If that vendor is unusual in the environment, finding the problematic client may be a little easier.
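The vendor lookup described above amounts to matching the MAC's leading three octets (the OUI) against a registry. A minimal sketch follows; the lookup table here is a tiny illustrative sample, and a real lookup would use the full IEEE OUI registry or a site like the one mentioned above.

```python
# Sketch: map the first three octets (the OUI) of a MAC address to a vendor.
# OUI_VENDORS is an illustrative sample, not a complete registry; the two
# entries correspond to the client and server MACs seen in this trace.

OUI_VENDORS = {
    "08:00:37": "Fuji Xerox",
    "00:12:3f": "Dell",
}

def vendor_for_mac(mac: str) -> str:
    oui = mac.lower()[:8]  # first six hex digits, e.g. "08:00:37"
    return OUI_VENDORS.get(oui, "unknown vendor")

print(vendor_for_mac("08:00:37:29:d4:c0"))
```

Here the OUI resolves to a printer vendor, which fits the symptom that the looping client is one of the network printers.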
