Five Keys to Success with APM in Production Environments – APM Analytics (Part 2 of 5)
December 16, 2011 at 3:52 pm Diego Lomanto
By Diego Lomanto (Twitter: diego_lomanto)
This is the second of a five-part series where we explore the critical factors of implementing APM in production environments successfully. You can find part one here. Please check back next week for part three.
In this series we are discussing how the Gartner Magic Quadrant provides a great start to implementing an APM solution. However, maximizing your APM investment in production hinges on critical capabilities that can make or break an implementation, capabilities that don’t get as much coverage in the media. They are:
- Continuous monitoring, NOT exception-based monitoring
- APM analytics that enable you to become more proactive with application/transaction data
- Real-time monitoring for proactive APM analysis
- Broad platform support eliminating all blind spots in your monitoring strategy
- Enterprise readiness for growth and scalability
Over these five blog entries I’ll spend a little bit of time on each of these success factors so you can be sure that you purchase a solution that will deliver the results you expect, not just in development and testing environments but also in production.
Part 2 – APM Analytics
Last week we talked about the virtues of a continuous monitoring strategy at length. But now that we can see everything, we’re going to have to find a way to make sense of it. A major risk for APM solutions in production environments is that they simply overwhelm the end-user with data or, at the opposite extreme, don’t provide enough actionable intelligence. It’s just hard to manually determine what’s important.
This is where APM analytics comes into play. Analytics should not be an optional component of APM – it is vital to fulfill the promise of APM. It enables you to analyze application performance in ways previously impossible or requiring massive amounts of work. And, analytics makes APM accessible to the enterprise.
Types of Analysis
The easiest way to understand APM analytics is to look at the use cases. The common use cases of analytics for APM are real-time analysis, short-term analysis and long-term planning.
Real-time (e.g. a product server is about to violate an SLA)
- Real-time OLAP
- Alerting to isolate problems while they are happening in transactions, infrastructure and business processes
In the Short-term (Why were transactions 10% slower today?)
- Business event correlation for root cause analysis
- Capacity management
In the Long-term (What applications can we move to the cloud?)
- Improving user and application behavior
- Capacity planning
- Cloud architecture planning
Here are some of the common types of reports you will get:
Real-time analytics is a topic that deserves its own post, so I’ll cover that next week in detail. In this post we’ll focus on the short- and long-term use cases.
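To make the short-term question (“Why were transactions 10% slower today?”) a little more concrete, here is a minimal Python sketch. It assumes you already have response-time samples grouped by tier for a baseline day and for today. The tier names, the numbers and the `slowdown_by_tier` helper are all made up for illustration and are not taken from any particular APM product.

```python
from statistics import mean

# Hypothetical per-tier response times in milliseconds (illustrative data only).
baseline = {
    "web": [100, 110, 90],
    "app": [200, 210, 190],
    "database": [300, 310, 290],
}
today = {
    "web": [100, 105, 95],
    "app": [205, 200, 195],
    "database": [400, 390, 410],
}


def slowdown_by_tier(baseline, today):
    """Return the percent change in mean response time for each tier."""
    changes = {}
    for tier, samples in baseline.items():
        before = mean(samples)
        after = mean(today[tier])
        changes[tier] = (after - before) / before * 100
    return changes


changes = slowdown_by_tier(baseline, today)
# The tier with the largest percent increase is the first place to look.
worst_tier = max(changes, key=changes.get)
print(worst_tier, round(changes[worst_tier], 1))
```

With this sample data the database tier accounts for the slowdown while the web and app tiers are flat. A real analysis would of course work from far more samples and would need per-tier timings for every transaction, which is where the data quality discussion that follows comes in.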
What Makes APM Analytics Work in Production?
Ok, sounds good so far, right? There’s a gotcha. You knew there would be! In order to provide the right amount of actionable intelligence in a production environment, you must first start with good data. The concept of “garbage-in, garbage-out” holds very much true for APM analytics. Here’s the secret to good APM data: entity relationships.
Entity relationships hold information about the interaction of a transaction with other components of the infrastructure (e.g. this transaction was in this tier for this long before moving to that tier). Entity relationships are crucial to APM analytics because they allow you to infer root cause. Most APM solutions cannot provide detailed entity relationships in production because they do not track all the tiers and they do not track all transactions. This all goes back to the continuous monitoring requirement from last week. You might be starting to see that the keys to success with APM in production are related to each other.
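One way to picture an entity relationship is as a record of each hop a transaction makes through a tier. The Python sketch below is only an illustration under that assumption: the `Hop` record, its field names and the sample tiers are hypothetical and do not describe how any specific APM product stores its data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One visit of a transaction to a tier (hypothetical record layout)."""
    transaction_id: str
    tier: str
    enter_ms: int
    exit_ms: int


def time_per_tier(hops):
    """Sum the time a transaction spent in each tier."""
    totals = defaultdict(int)
    for hop in hops:
        totals[hop.tier] += hop.exit_ms - hop.enter_ms
    return dict(totals)


def tier_edges(hops):
    """Return the ordered (from_tier, to_tier) pairs the transaction followed."""
    ordered = sorted(hops, key=lambda h: h.enter_ms)
    return [(a.tier, b.tier) for a, b in zip(ordered, ordered[1:])]


# Illustrative data: one transaction crossing three tiers.
hops = [
    Hop("tx-1", "web", 0, 20),
    Hop("tx-1", "app", 20, 70),
    Hop("tx-1", "database", 70, 400),
]

totals = time_per_tier(hops)
edges = tier_edges(hops)
slowest_tier = max(totals, key=totals.get)
print(totals, edges, slowest_tier)
```

Notice that both helpers only give useful answers if every tier and every transaction is recorded. Drop the database hop from the sample and the slowest tier silently becomes the app tier, which is the “garbage-in, garbage-out” problem in miniature.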
Ok, Sounds Good. How About Some Examples?
Sure thing. At OpTier, we call the customizable part of our APM analytics Business Events and we’ve helped customers use it to detect the following:
- Poorly designed SQLs as the root-cause of slow transactions
- ESB wrongly orchestrating transactions
- Retail banking payment transactions traversing certain application components before final booking
- Trading transactions having specific cut-off times in the day
- Order fallouts (common for telcos)
- Resource-intensive batch tasks impacting online transaction activities
- Specific users impacting system performance
Let’s take a deeper dive. Here’s an example of APM analytics uncovering the root cause of where transactions are failing in the short term. In the screenshot below we are analyzing transaction flows and can see that there is a missing step in the overall process flow: “Send Invoice.”
The APM solution can detect and report “Send Invoice” as a root cause because of the entity relationships. There are relationships between tiers in a transaction flow, and once the system understands those relationships it can detect when they start to change. The next step here is for an analyst to look at the invoicing system and determine why that step in the process is not occurring. This improves mean time to resolution, as the analyst is not forced to look at every tier, just the problematic ones. He or she can then get the issue over to developers to fix faster than they could have before APM analytics was available in production.
That is just one example of the power of APM analytics in a production environment. Because of the depth of information, APM analysts need a way to parse through the volume to get to the root causes. Analytics is the key to delivering this success in production.
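The idea of spotting a missing step can be sketched in a few lines of Python. This is a simplified illustration and not how any particular product does it: the sketch assumes each transaction flow is available as a list of step names, learns which steps normally appear, and flags the ones absent from a new flow. Only “Send Invoice” comes from the example above; the other step names, the `min_share` threshold and both helper functions are hypothetical.

```python
from collections import Counter


def learn_expected_steps(historical_flows, min_share=0.9):
    """Return the steps that appear in at least min_share of past flows."""
    counts = Counter(step for flow in historical_flows for step in set(flow))
    total = len(historical_flows)
    return {step for step, n in counts.items() if n / total >= min_share}


def missing_steps(flow, expected):
    """Return the expected steps that the observed flow never reached."""
    return sorted(expected - set(flow))


# Illustrative history: ten past transactions that completed normally.
history = [
    ["Receive Order", "Check Inventory", "Send Invoice", "Ship Order"],
] * 10

# A new transaction in which the invoicing step never happened.
observed = ["Receive Order", "Check Inventory", "Ship Order"]

expected = learn_expected_steps(history)
missing = missing_steps(observed, expected)
print(missing)
```

The output points the analyst straight at the invoicing step instead of at every tier in the flow, which is the mean-time-to-resolution benefit described above.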
What do you think? Have you come across any other good examples of analytics? I’d love to hear some of your stories.
Stay tuned for the third part of this series, where we will discuss leveraging real-time analysis to proactively monitor applications. If you’d like to be notified when the post goes up, please follow me on Twitter @diego_lomanto.