3 Network Nightmares | True Tales of IT Terror
It’s that time of year again when real Viavi customers share their greatest network nightmares. These bloodcurdling tales of IT terror will give you chills and send you scrambling for your monitoring tools.
1. The Case of the Disappearing Network
Let’s face it, nothing’s worse than logging on to find that your entire network has made like Houdini and disappeared from the face of the earth. But that’s exactly what happened to Peter Lee when a legacy switch went bump in the night.
“About 6 years ago, there were a number of legacy Cisco 3500XL series switches in existence on the campus,” says Lee, Network Technology Specialist at Northern Devon Healthcare in the UK. “I suspected as soon as I started working here that they could be a potential source of problems. One day, we came in and lost the whole network. It was completely gone.”
For a major medical center, any amount of network downtime can impact quality of care. It was important that Lee and his team act quickly.
“I sent my team around the node room to manually plug up the console of every switch and show me the logs on them,” says Lee. “One of the 3500 switches went completely mad. The ASIC on the motherboard was sending out millions and millions of packets and maxing out the core. If we turned the core off and brought all the links up gradually, everything was fine—until it started reoccurring. Then, it was 100% CPU again. We couldn’t troubleshoot anything because we couldn’t get any response.”
The log files eventually revealed a MAC error on one of the switches. “We shut that switch down, rebooted the core, and everything was fine. We dropped that switch out and the problem went away,” says Lee.
2. Voices in the Darkness
Sometimes the network doesn’t go away entirely, but individual applications (like VoIP) stop working the way they should. Is it gremlins? The Demogorgon? When you can’t tell who is on the other end of the line, productivity can be rendered nonexistent. That’s what happened to Ivan McDuffie at NEC Unified Solutions.
Because VoIP is extremely sensitive to overall network performance and delay, it is critical to monitor it constantly alongside other applications. As a result, McDuffie, the Area Engineering Manager, needed a single solution capable of presenting and analyzing everything running on a network.
The team also lacked the storage space to adequately record traffic during pre-deployment assessments.
“Usually, when we perform a network assessment we try to get a full week’s worth of data at a customer’s site,” explains McDuffie. “With our existing equipment, that just wasn’t possible. We would have to remove the appliances from the client’s site, unload the captured data at our lab, and then reinstall the appliances at the customer’s site. It became an unmanageable situation.”
McDuffie, tired of hauling equipment back and forth, settled on a portable solution with enough storage space to capture large volumes of traffic.
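For a sense of scale, the storage a week-long capture needs can be roughly estimated from the average traffic rate. The sketch below is only a back-of-the-envelope calculation: the function name, the 100 Mbit/s link, and the 10% utilization figure are illustrative assumptions, not numbers from McDuffie's assessments, and it ignores capture-file overhead and compression.

```python
def capture_storage_bytes(link_rate_bps: float, avg_utilization: float, days: float) -> float:
    """Estimate the bytes needed to store a full packet capture.

    link_rate_bps   -- link speed in bits per second
    avg_utilization -- average fraction of the link in use (0.0 to 1.0)
    days            -- capture duration in days
    """
    seconds = days * 24 * 60 * 60
    # Bits captured over the period, divided by 8 to convert to bytes.
    return link_rate_bps * avg_utilization * seconds / 8


# Hypothetical example: a 100 Mbit/s link averaging 10% utilization for 7 days.
needed = capture_storage_bytes(100e6, 0.10, 7)
print(f"{needed / 1e9:.0f} GB")  # prints "756 GB"
```

Even at modest utilization, a full week of traffic quickly outgrows a small appliance disk, which is consistent with the back-and-forth McDuffie describes.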
“The Observer Platform lets me rebuild and listen to an entire VoIP call,” says McDuffie. “By listening to that, I get a sense of the user’s experience.”
Today, jitter, dropouts, and other creepy noises are easily identified. With all the traffic captured, there’s nowhere to hide.
“Having everything saved to disk means we can quickly isolate the time of the event, pinpoint the source and present evidence to our client of the occurrence. It definitely cuts down on the time we spend trying to find the source of the problem.”
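Jitter, one of the symptoms mentioned above, has a standard definition for RTP-based voice traffic. The following is a minimal sketch of the running interarrival jitter estimate from RFC 3550 (section 6.4.1), assuming send and receive timestamps are already available in the same units. The function name and sample values are hypothetical, and this is not a description of how the Observer Platform computes jitter.

```python
def rtp_interarrival_jitter(send_times, recv_times):
    """Return the running jitter estimate, in the same units as the inputs.

    For each consecutive packet pair, D is the difference between the
    spacing at the receiver and the spacing at the sender. The estimate
    is smoothed as J = J + (|D| - J) / 16, as in RFC 3550.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("send_times and recv_times must have the same length")
    jitter = 0.0
    for i in range(1, len(send_times)):
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


# Hypothetical timestamps in milliseconds; packets are sent every 20 ms.
sent = [0, 20, 40, 60]
steady = rtp_interarrival_jitter(sent, [5, 25, 45, 65])   # constant delay
bumpy = rtp_interarrival_jitter(sent, [5, 25, 61, 65])    # third packet arrives 16 ms late
print(steady, bumpy)  # prints "0.0 1.9375"
```

A constant network delay produces no jitter at all; only variation in delay does, which is why a call can sound fine on a slow but steady link and choppy on a fast but erratic one.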
3. Bringing Creatures to Life
When it’s your job to create creatures, as it is for Aardman Animations’ award-winning CGI department, any network hiccup can interrupt the creative process.
“We have an incredible volume of bandwidth-intensive applications shared across a complex environment,” says Howard Arnault-Ham, Head of IT at the company. It’s easy to imagine the frightening results of characters left ill-rendered by downed applications and services.
Aardman’s IT network, spread across four sites in Bristol, England, has approximately 250 users. Its 10 Mbit/s extended LAN supports a wide variety of complex creative applications such as CGI, as well as traffic ranging from websites and bandwidth-intensive graphics to file servers and database replication.
“We needed a solution that could handle our systems. Having used the Observer Platform in my previous employment, I knew it was user-friendly and extremely reliable and therefore had no hesitation in recommending it to the management team.”
Going forward, the Oscar-winning enterprise hopes to continue building its creatures, unhindered by technology issues.
“Observer helps segment network and application issues so they can get a handle on both. Long-term data acquisition and log-in facilities will help establish baselines for normal operation. As Aardman is a growing organization, this will help the IT department justify capacity upgrades and pre-empt issues that may arise in the future.”
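As a rough illustration of what a baseline for normal operation can mean in practice, the sketch below summarizes historical measurements by their mean and standard deviation and flags readings that fall far outside that range. This is a generic statistical approach, not Observer's method; the function names, the three-sigma threshold, and the sample utilization figures are all assumptions made for the example.

```python
from statistics import mean, stdev


def baseline(samples):
    """Summarize historical measurements as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, samples, k=3.0):
    """Return True if value lies more than k standard deviations from the mean."""
    mu, sigma = baseline(samples)
    return abs(value - mu) > k * sigma


# Hypothetical daily peak link utilization, in percent.
history = [40, 42, 38, 41, 39, 40, 43]
print(is_anomalous(41, history))  # prints "False"
print(is_anomalous(85, history))  # prints "True"
```

The same history that flags an unusual day also shows slow growth over months, which is the kind of evidence that supports a capacity upgrade request.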
Learn more about how Viavi customers have triumphed over the forces of IT evil in one of our many case studies.