Network professionals spending too much time troubleshooting, expert says

By Shannon Williams, Fri 9 Sep 2016

Network professionals are spending around a quarter of their time troubleshooting network concerns.

According to NETSCOUT, finding the root cause of these network issues is time consuming as networks become more sprawling and complex.

“If they’re intermittent issues, it can seem almost impossible to find the cause and resolve them,” comments Amit Rao, director, APAC Channels, NETSCOUT.

“However, by taking a methodical approach, it’s possible to effectively troubleshoot enterprise network problems,” he says.

According to Rao, there are six common enterprise network issues, but he says there are ways to troubleshoot each of them:

1. Infrastructure performance

End-user complaints often signify that there is an infrastructure issue. However, Rao says finding the root cause can be challenging when application servers and infrastructure devices are operating normally, obvious error states can’t be located, and legacy network monitoring tools report ‘green’.

Possible causes include bad cabling, network congestion, server network adapter issues, or DNS issues.

There are four steps to troubleshooting these issues:

a. use existing monitoring tools and extract information from SYSLOG receivers
b. check server and network device log files to understand if there are connectivity issues from the NIC side
c. examine WAN links and logs to understand whether traffic-shaping devices or policies are affecting performance
d. check for errors in web server, load balancer, and application logs.

2. Network services

Rao says there are numerous issues that can affect network services, such as DHCP issues or a slow DNS response. Possible causes include misconfigured DHCP or DNS servers, duplicate IP addresses caused by overlapping DHCP scopes, rogue DHCP servers, or users manually assigning static IPs. Such problems can enable a ‘man-in-the-middle’ attack and create significant security issues.
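
As an illustration of one cause named above, overlapping DHCP scopes, the following is a minimal sketch of how a team might check a list of configured scopes for overlaps. It assumes each scope can be written as a CIDR block; real DHCP scopes are often defined as start and end addresses, so this is a simplification. The scope names and addresses are hypothetical.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlapping_scopes(scopes):
    """Return pairs of scope names whose address blocks overlap.

    `scopes` maps a scope name to a CIDR string (an assumed format).
    """
    parsed = {name: ip_network(cidr) for name, cidr in scopes.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical scopes, for illustration only.
scopes = {
    "floor-1": "10.0.1.0/24",
    "floor-2": "10.0.2.0/24",
    "guest": "10.0.2.128/25",
}

print(find_overlapping_scopes(scopes))  # [('floor-2', 'guest')]
```

Any pair reported here could hand out the same address from two scopes, which is one way duplicate IP addresses arise.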

To troubleshoot, first confirm proper configuration of authorised DHCP servers.

3. Prove it’s not the network

Most of the time, the network is not to blame for performance issues, Rao says.

People blame the network due to lack of visibility into network operations, not enough bandwidth, network complexity, insufficient network expertise, and lack of effective, easy-to-use troubleshooting tools. 

To troubleshoot, the IT team should use packet captures, gather network data, review dropped packets, and check capture files for excessive retries and congestion. They should also check network device logs, ping to check response times, and use tracert to verify that the network path is correct.

4. Wi-Fi and BYOD threats

Rao says Wi-Fi networks, combined with bring your own device (BYOD) policies, can create security and performance issues if not managed carefully. These can include chatter, dropped connections, excessive bandwidth use, congestion, and poor user behaviour (such as streaming music). The sheer number of devices can swamp the network.

To troubleshoot, conduct regular Wi-Fi SSID surveys to detect rogue access points and routers. Look up MAC addresses to discover the types of devices attached to networks and implement MAC address filtering if necessary. Also, understand that some devices are well known for causing problems if improperly configured. For example, Apple TV AirPlay can badly impact performance.

5. Poor Wi-Fi performance

When the Wi-Fi network is underperforming, network teams should check for frequency interference, rogue routers (such as phones being used as hotspots), misconfigured Wi-Fi routers, and compatibility issues between certain Wi-Fi clients and routers, Rao says. Even excessive heat can cause strange symptoms.

To troubleshoot, teams should regularly use an SSID scanner to identify rogue routers and APs in the infrastructure, remember that strange DHCP behaviour is an indicator of rogue DHCP servers, relocate routers that may be suffering interference due to proximity to EMI sources, and ensure that all Wi-Fi devices are within their designed operating environment.

6. Intermittent performance issues

Transient issues can take time and, sometimes, luck to capture, diagnose, and resolve, Rao says. Causes can include cabling issues, external sources, power fluctuations, hardware failures, and excessive heat.

To troubleshoot, rule out logical sources, then look for illogical sources of interference. Track occurrences of the specific performance issue and look for patterns. As always, start at the physical layer, using a cable tester to see if the issue is related to cabling. 
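
The advice to track occurrences and look for patterns can be sketched in a few lines. This is only one possible approach, assuming the team records the time of each incident as an ISO 8601 timestamp; the function name and the sample incidents are hypothetical.

```python
from collections import Counter
from datetime import datetime


def incidents_by_hour(timestamps):
    """Count incidents per hour of day, most frequent hour first."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return hours.most_common()


# Hypothetical incident log, for illustration only.
incidents = [
    "2016-09-05T09:02:00",
    "2016-09-05T14:15:00",
    "2016-09-06T09:07:00",
    "2016-09-07T09:01:00",
    "2016-09-08T14:20:00",
]

print(incidents_by_hour(incidents))  # [(9, 3), (14, 2)]
```

A cluster around a particular hour might point to a scheduled job, a shift change, or equipment that switches on at that time, which narrows down where to look next.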

“Understanding how to effectively troubleshoot the most common issues can potentially reduce the amount of time network professionals spend on issue resolution,” Rao adds.