Troubleshooting Common Networking Problems with Wireshark, Pt. 5: Broadcast Storms


Author’s Note: This is the fifth part in a six-part series about finding and solving many networking anomalies using the Wireshark network protocol analyzer. If you are new to the series, you can find part 1 here, and the whole series here.
Broadcast storms occur when a host floods the subnet with broadcasts. This is not a typical function of any given protocol, but can happen due to an application bug or hardware problem. Unfortunately, broadcast storms can be very difficult to isolate in a large network, but there are a few tricks that we can use to identify the source.
In the example we will examine here, all network printers on a given subnet become unresponsive whenever the DHCP server service on a Windows server on the same subnet is enabled. If the service is disabled, the problem goes away. While we may be tempted to blame the DHCP service itself, network traces will tell us the true cause. First, we need to determine whether the problem is due to a broadcast storm. To do this, we will open the trace and apply a Wireshark display filter that shows only broadcast packets:
eth.dst == ff:ff:ff:ff:ff:ff
Next, we will go to the Statistics menu and select ‘Summary’ to display statistics on the number of broadcast packets, like so:
[Screenshot: Wireshark Summary window showing total and filtered (broadcast) packet counts and capture duration]
Doing a little math, we can see that over 45% of the traffic in this trace is due to broadcasts! Even more bothersome is the fact that we are averaging 730 broadcasts a second. We definitely have a broadcast storm on our hands. So now we need to determine which system is causing the storm.
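The arithmetic behind those two figures is simple enough to sketch. The counts and capture duration below are illustrative placeholders, not the exact values from this trace; in the Summary window you would read them from the unfiltered and filtered packet totals and the elapsed time.

```python
# Sketch of the broadcast-storm math from Wireshark's Summary window.
# The values below are illustrative placeholders, not the trace's exact numbers.

total_packets = 100_000      # "Packets" from Summary, no filter applied
broadcast_packets = 45_500   # "Packets" with the broadcast display filter applied
capture_seconds = 62.3       # "Elapsed time" from Summary

broadcast_share = broadcast_packets / total_packets * 100
broadcasts_per_second = broadcast_packets / capture_seconds

print(f"Broadcast share: {broadcast_share:.1f}%")
print(f"Broadcast rate:  {broadcasts_per_second:.0f}/s")
```

As a rule of thumb, anything beyond a few percent of sustained broadcast traffic on a subnet deserves a closer look; 45% is unambiguous.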
To determine the culprit, we will examine our filtered trace. The first broadcast packet is below:

Frame 2 (368 bytes on wire, 368 bytes captured)
Ethernet II, Src: 160.1.153.47 (00:12:3f:68:3a:1c), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Internet Protocol, Src: 160.1.153.47 (160.1.153.47), Dst: 255.255.255.255 (255.255.255.255)
User Datagram Protocol, Src Port: bootps (67), Dst Port: bootpc (68)
Bootstrap Protocol

Message type: Boot Reply (2)
Hardware type: Ethernet
Hardware address length: 6
Hops: 0
Transaction ID: 0xec15491b
Seconds elapsed: 0
Bootp flags: 0x0000 (Unicast)
Client IP address: 0.0.0.0 (0.0.0.0)
Your (client) IP address: 172.16.33.128 (172.16.33.128)
Next server IP address: 0.0.0.0 (0.0.0.0)
Relay agent IP address: 0.0.0.0 (0.0.0.0)
Client MAC address: Fuji-Xer_29:d4:c0 (08:00:37:29:d4:c0)
Server host name not given
Boot file name not given
Magic cookie: (OK)
Option 53: DHCP Message Type = DHCP ACK
Option 58: Renewal Time Value = 10 days, 12 hours
Option 59: Rebinding Time Value = 18 days, 9 hours
Option 51: IP Address Lease Time = 21 days
Option 54: Server Identifier = 160.1.153.47
Option 1: Subnet Mask = 255.255.255.0
Option 3: Router = 172.16.33.5
Option 6: Domain Name Server
Option 15: Domain Name = "corp.com"
Option 44: NetBIOS over TCP/IP Name Server
Option 46: NetBIOS over TCP/IP Node Type = H-node
End Option

Looking at this packet, we can see that the broadcasts are indeed coming from the DHCP server, and examining the rest of the broadcast packets shows that almost all of them are identical to this one. So the DHCP server is generating the broadcast traffic. But is the server responsible for the storm? Disabling the broadcast filter and examining only the BOOTP traffic will show us the real culprit:

Frame 1 (590 bytes on wire, 590 bytes captured)
Ethernet II, Src: 160.1.30.2 (00:d0:04:7f:4c:00), Dst: 160.1.153.47 (00:12:3f:68:3a:1c)
Internet Protocol, Src: 172.16.33.128 (172.16.33.128), Dst: 160.1.153.47 (160.1.153.47)
User Datagram Protocol, Src Port: bootpc (68), Dst Port: bootps (67)
Bootstrap Protocol
Message type: Boot Request (1)
Hardware type: Ethernet
Hardware address length: 6
Hops: 0
Transaction ID: 0xec15491b
Seconds elapsed: 27582
Bootp flags: 0x8000 (Broadcast)
Client IP address: 172.16.33.128 (172.16.33.128)
Your (client) IP address: 0.0.0.0 (0.0.0.0)
Next server IP address: 0.0.0.0 (0.0.0.0)
Relay agent IP address: 0.0.0.0 (0.0.0.0)
Client MAC address: Fuji-Xer_29:d4:c0 (08:00:37:29:d4:c0)
Server host name not given
Boot file name not given
Magic cookie: (OK)
Option 53: DHCP Message Type = DHCP Request
Option 12: Host Name = "DELL29D4C0"
Option 57: Maximum DHCP Message Size = 548
Option 55: Parameter Request List
End Option
Padding

Frame 2 (368 bytes on wire, 368 bytes captured)
Ethernet II, Src: 160.1.153.47 (00:12:3f:68:3a:1c), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Internet Protocol, Src: 160.1.153.47 (160.1.153.47), Dst: 255.255.255.255 (255.255.255.255)
User Datagram Protocol, Src Port: bootps (67), Dst Port: bootpc (68)
Bootstrap Protocol
Message type: Boot Reply (2)
Hardware type: Ethernet
Hardware address length: 6
Hops: 0
Transaction ID: 0xec15491b
Seconds elapsed: 0
Bootp flags: 0x0000 (Unicast)
Client IP address: 0.0.0.0 (0.0.0.0)
Your (client) IP address: 172.16.33.128 (172.16.33.128)
Next server IP address: 0.0.0.0 (0.0.0.0)
Relay agent IP address: 0.0.0.0 (0.0.0.0)
Client MAC address: Fuji-Xer_29:d4:c0 (08:00:37:29:d4:c0)
Server host name not given
Boot file name not given
Magic cookie: (OK)
Option 53: DHCP Message Type = DHCP ACK
Option 58: Renewal Time Value = 10 days, 12 hours
Option 59: Rebinding Time Value = 18 days, 9 hours
Option 51: IP Address Lease Time = 21 days
Option 54: Server Identifier = 160.1.153.47
Option 1: Subnet Mask = 255.255.255.0
Option 3: Router = 172.16.33.5
Option 6: Domain Name Server
Option 15: Domain Name = "corp.com"
Option 44: NetBIOS over TCP/IP Name Server
Option 46: NetBIOS over TCP/IP Node Type = H-node
End Option

As you can see from the BOOTP traffic, the DHCP server is responding to a DHCP Request packet from a client. Looking at the entire trace, you can see these same two packets repeated over and over, 700+ times per second. If you were to disable the DHCP server service and take another capture, you would see the same client sending the same requests at the same rate, just with no DHCP server response. So the real issue is that this client is constantly re-requesting the same IP address. Resolving the issue therefore requires locating the client, and for that we once again turn to the traces.
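The repetition is easy to confirm outside the GUI. The sketch below counts (client MAC, transaction ID) pairs among BOOTP requests; a healthy client shows up a handful of times, while a looping client dominates the counts. In practice you might export these two fields with tshark (for example, `-T fields -e eth.src -e bootp.id`); the sample rows here are made up to mirror the trace.

```python
from collections import Counter

# Sketch: spot a looping DHCP client by counting (client MAC, transaction ID)
# pairs among BOOTP requests. The sample rows below are illustrative; in a
# real trace there would be hundreds of repeats per second.

requests = [
    ("08:00:37:29:d4:c0", "0xec15491b"),
    ("08:00:37:29:d4:c0", "0xec15491b"),
    ("08:00:37:29:d4:c0", "0xec15491b"),
    ("00:11:22:33:44:55", "0x1a2b3c4d"),  # a normal, one-off request
]

counts = Counter(requests)
(culprit_mac, xid), hits = counts.most_common(1)[0]
print(f"Most repeated request: MAC {culprit_mac}, xid {xid} ({hits} times)")
```

A transaction ID that never changes across retries, as in this trace, is itself a hint that the client is misbehaving rather than performing a normal lease renewal.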
In Frame 1, we can see the best identifier for the client: the source MAC address, carried in the BOOTP header. So how do we track down the client that owns this MAC address in a large environment? Unless the customer keeps an exhaustive list of every MAC address in the environment, the simplest method is usually to trace it through the MAC address tables in the switches. Most managed switches have a command to view these tables; on most Cisco switches it is ‘show mac-address-table address [MAC address]’. Using this command, the customer can follow the MAC address trail back to the client, though it may take some time if many switches need to be checked. Also pay attention to the first three octets (six hex digits) of the MAC address, which identify the vendor of the NIC; a site such as http://www.coffer.com/mac_find/ can help determine the vendor. If that vendor is unusual in the environment, finding the problematic client may be a little easier.
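The vendor lookup described above amounts to matching the MAC's leading three octets (the OUI) against a registry. A minimal sketch follows; the lookup table here is a tiny illustrative sample, and a real lookup would use the full IEEE OUI registry or a site like the one mentioned above.

```python
# Sketch: map the first three octets (the OUI) of a MAC address to a vendor.
# OUI_VENDORS is an illustrative sample, not a complete registry; the two
# entries correspond to the client and server MACs seen in this trace.

OUI_VENDORS = {
    "08:00:37": "Fuji Xerox",
    "00:12:3f": "Dell",
}

def vendor_for_mac(mac: str) -> str:
    oui = mac.lower()[:8]  # first six hex digits, e.g. "08:00:37"
    return OUI_VENDORS.get(oui, "unknown vendor")

print(vendor_for_mac("08:00:37:29:d4:c0"))
```

Here the OUI resolves to a printer vendor, which fits the symptom that the looping client is one of the network printers.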
