According to Gartner, the average cost for just one minute of unplanned network downtime is upward of $5,000. SAP has claimed that the cost of downtime of business critical SAP applications is $9,000 per minute.
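To put those per-minute figures in perspective, here is a rough back-of-the-envelope sketch. The two rates are the ones quoted above; the three-hour outage length and the function name `downtime_cost` are assumptions chosen purely for illustration, and since the Gartner figure is stated as "upward of", the network result should be read as a floor rather than an estimate.

```python
# Per-minute rates are the figures quoted in the text above.
# The outage duration below is a hypothetical value for illustration only.
COST_PER_MINUTE_USD = {
    "network (Gartner, lower bound)": 5_000,
    "business-critical SAP (SAP claim)": 9_000,
}


def downtime_cost(minutes: int, cost_per_minute: int) -> int:
    """Return the cost of an outage as duration times per-minute rate."""
    if minutes < 0 or cost_per_minute < 0:
        raise ValueError("minutes and cost_per_minute must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    outage_minutes = 180  # assumed: a three-hour outage
    for label, rate in COST_PER_MINUTE_USD.items():
        cost = downtime_cost(outage_minutes, rate)
        print(f"{label}: {outage_minutes} min = ${cost:,}")
```

Even with a simple linear model like this, a few hours of firefighting lands in seven figures at the SAP rate.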
It’s 2018. Yet organizations are still using legacy monitoring techniques which are severely inadequate and cannot keep up with the complex systems of today. With system availability increasingly crucial for revenue, this can create a potentially serious problem.
I was recently part of the IT support team of a company that had implemented SAP Enterprise Portal as a way to evaluate and test new functionalities. This gave rise to opportunities for new Java-based applications to be built. Over time, scope increased with the addition of business functionality including corporate news, social integration (forums and chats), HR data, CRM and the integration of non-SAP application sources. For end-users and employees, the portal was the only gateway to all SAP and non-SAP applications. While this made for convenient access to applications, it also created the potential for serious problems in the event of a failure.
And we were indeed hit by a system failure. After hours of firefighting, we managed to stabilize the system and allow business to resume. To minimize the possibility of future disruptions, we held a problem review discussion. When we looked into it, it was obvious that system issues had become painful and difficult to tackle for these reasons:
- Alert overload. It was simply impossible for the support team to investigate every one of the hundreds of alerts it received daily.
- Interconnections. With so many linkages among the various applications and aspects of the system, any one of these alerts could have been the key to preventing a system failure.
- Poor tools and processes. The monitoring and alerting software we used was clunky and unintuitive. Critical business continuity processes like disaster recovery, high availability, backup and restore didn’t have periodic check mechanisms in place, and the technical team wasn’t even sure if these tools would work when a continuity problem arose.
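One way to take the edge off alert overload is to collapse duplicates and rank what remains, so the team looks at a short ordered list rather than hundreds of raw alerts. The sketch below is not what our team ran, and it is not tied to any particular monitoring product; the `Alert` fields, the 1-to-3 severity scale, and the `triage` function are all hypothetical names made up for this illustration.

```python
from collections import Counter
from typing import NamedTuple


class Alert(NamedTuple):
    """Hypothetical alert record; the fields are assumptions for this sketch."""
    source: str
    severity: int  # assumed scale: 1 = critical, 2 = warning, 3 = informational
    message: str


def triage(alerts: list[Alert]) -> list[tuple[Alert, int]]:
    """Collapse identical alerts and order them for review.

    Most severe first; within the same severity, the most frequent first.
    """
    counts = Counter(alerts)  # identical (source, severity, message) collapse
    return sorted(counts.items(), key=lambda item: (item[0].severity, -item[1]))


if __name__ == "__main__":
    # Made-up sample data, not real alerts.
    raw = [
        Alert("portal", 3, "response time above threshold"),
        Alert("database", 1, "tablespace almost full"),
        Alert("portal", 3, "response time above threshold"),
        Alert("backup", 2, "last backup older than 24h"),
    ]
    for alert, count in triage(raw):
        print(f"sev {alert.severity} x{count} [{alert.source}] {alert.message}")
```

Grouping like this does not solve the interconnection problem, since a low-severity alert can still be the early sign of a failure elsewhere, but it does make the daily volume reviewable.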
Visualize the impact of a system failure like this at your company. Business disruption, lost revenue, lost end-user and IT productivity, and reputational damage are just some of the consequences you could experience. My years in IT operations support have convinced me that inadequate monitoring and evaluation is a big contributor not only to unanticipated failures but also to the inability to deal with them quickly.
Of course, there are many solutions and tools available in the market for monitoring servers, networks and applications. In the SAP world, tools such as SAP Solution Manager, SAP LoadRunner from HP, and Wily Introscope from CA are often used, alone or in combination.
While SAP includes Solution Manager at no extra charge, the other tools can end up costing your organization significant sums. Yet not one of them, free or otherwise, can be used to its full potential unless it is properly configured and optimized to meet your specific business needs. System misconfigurations can play a significant role in unplanned downtime and have a major negative impact on system performance.
So in addition to paying for the software, you need to bring on board trained technicians – often very hard to find – to properly maneuver through the complexity of these tools and configure them. Once you’ve done that, you then have to ensure that you schedule proper system evaluation and periodic checks to get maximum value. Many organizations fail to factor this into their thinking, and monitoring becomes an afterthought.
In my next post, I will discuss how to properly set up system monitoring and evaluation.