Author Archive
Business Transaction Management and IT
Recently, leading experts from Gartner and OpTier sat down to look at the challenges companies face in the business transaction management market and the solutions available to address those challenges.
Please click on this link to register and view the entire video.
Does change management impact your infrastructure or your business?
I’ve witnessed a lot in IT over the last decade. I’ve seen a DBA blow away (rm -rf) a live production database because they mistakenly thought they were logged into a test server shell. I’ve seen websites go bang several hours before, and even several minutes into, major product launches. I’ve filled out many change requests in my time, many of them processed by people who forgot to make the relevant changes despite signing off the change requests as completed. I’ve also seen many customers deploy applications into production using configuration from their test environments, with debug logging still enabled. The best one recently was when a security guard accidentally locked themselves in a data center room and hit a button thinking it was the door release, when in actual fact it was the EPS power button, which knocked out power to the entire data center. We can blame the rise of the machines for our IT woes, but the biggest liability by far is still us human beings.
Today, the only constant throughout the application lifecycle is change. Building an application is relatively cheap; supporting and maintaining it is where the costs start to spiral out of control. Change requests are an expensive activity: they require development, regression testing, documentation, planning, downtime, backup procedures and an eye for detail. However, when a change occurs, how many organisations can truly quantify the business impact?

What exactly changed?
For example, a DBA might look at the top 5 slowest SQL statements that execute in the database. They might optimise these in several ways: by creating a few indexes, updating relevant table statistics or tweaking I/O settings. Various change requests are then submitted and deployed in production. What the DBA doesn’t understand at the time is what impact their changes will have on the business. Their database could be serving multiple applications spanning hundreds of business transactions with thousands of users. Introducing a new index on one table might improve one SQL statement, but it could have a detrimental effect on several other SQL statements, which collectively could impact several key business transactions. It’s therefore virtually impossible to quantify whether changes like this will have a positive impact on the business.
The same goes for an application developer. I know because I’ve been there and tried to optimise many JVMs with APM tools in the past. I could spend all day knocking milliseconds off Java API calls or playing with container settings like connection pools or thread counts in a vain attempt to optimise the application sitting on top of the JVMs. You can find 101 interesting things a day to optimise with an APM tool. The trick is knowing which things will actually impact the business in the most positive way. It’s also good to know when to stop tuning: the more you change, the more you need to test. When you’re tweaking application code or changing container settings, it’s not that easy to figure out which business transactions you’re playing with. Again, you might be tuning your JVMs to make them more efficient, but being able to truly understand the business impact of your actions is still a black art. If a dev team of 5 people spends 4 weeks tuning application code and only improves business transaction response time by 5%, did they really do a great job? Did the 5% improvement impact important business transactions or less important ones?
Another problem is knowing when to schedule a change request. Many applications these days are 24/7 and global. No longer can organisations rely on midnight change requests. You want to schedule change requests at times with the least business impact. How many users are logged on at this time? How many business transactions execute at this time? Are the business transactions important or can they suffer unavailability?
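The scheduling questions above can be turned into a simple calculation. What follows is only a minimal sketch, assuming you already have hourly execution counts per business transaction and a hand-assigned importance weight for each one. The transaction names, weights and counts are hypothetical and do not come from any particular BTM product.

```python
from collections import defaultdict

# Hypothetical importance weights per business transaction (higher = more critical).
WEIGHTS = {"Execute Order": 5, "Cancel Order": 3, "Check Customer": 1}

def business_load_by_hour(samples, weights):
    """Sum weighted transaction counts for each hour of the day.

    samples: iterable of (hour, transaction_name, count) tuples.
    Transactions without an explicit weight are given a weight of 1.
    """
    load = defaultdict(int)
    for hour, name, count in samples:
        load[hour] += weights.get(name, 1) * count
    return dict(load)

def quietest_hour(samples, weights):
    """Return the hour with the lowest weighted load (earliest hour wins ties)."""
    load = business_load_by_hour(samples, weights)
    return min(sorted(load), key=lambda h: load[h])

# Illustrative data only: (hour, transaction, executions in that hour).
SAMPLES = [
    (0, "Execute Order", 40), (0, "Check Customer", 300),
    (3, "Execute Order", 5), (3, "Cancel Order", 2), (3, "Check Customer", 900),
    (14, "Execute Order", 10), (14, "Check Customer", 20),
]

best = quietest_hour(SAMPLES, WEIGHTS)
```

With this made-up data the quietest hour is 14:00 rather than midnight, which is the point of the paragraph above: for a global application the traditional midnight slot is not necessarily the window with the least business impact.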
Business Transaction Management solves a lot of these change management issues. When you capture all business transactions across all tiers all of the time you have full visibility into how each change request or tier impacts your business transactions and ultimately your business. You can also identify the best time to schedule changes based on business transaction activity. When Change Request #5463 was deployed it improved the SLA for several key business transactions by more than 25%. When Change Request #7653 was deployed it improved the response time of Execute Order by 80% but actually degraded the response time of Cancel Order and Check Customer by almost 350%. This is just a small sample of the benefits BTM can bring to change management.
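To make the before-and-after comparison concrete, here is a minimal sketch of how the impact of a change request could be quantified per business transaction. It assumes response-time samples (in milliseconds) captured before and after the deployment. The function name and the numbers are illustrative assumptions, chosen to mirror the Execute Order example above, and are not real measurements.

```python
from statistics import mean

def response_time_change(before, after):
    """Percentage change in mean response time per business transaction.

    before / after: dict mapping transaction name -> list of response times (ms).
    Negative values mean the transaction got faster, positive values slower.
    Only transactions present in both samples are compared.
    """
    changes = {}
    for name in before.keys() & after.keys():
        old, new = mean(before[name]), mean(after[name])
        changes[name] = (new - old) / old * 100.0
    return changes

# Illustrative samples only.
BEFORE = {
    "Execute Order": [500, 500],
    "Cancel Order": [100, 100],
    "Check Customer": [200, 200],
}
AFTER = {
    "Execute Order": [100, 100],
    "Cancel Order": [440, 460],
    "Check Customer": [880, 920],
}

impact = response_time_change(BEFORE, AFTER)
# Transactions that got slower after the change, in alphabetical order.
regressions = sorted(name for name, pct in impact.items() if pct > 0)
```

Here Execute Order improves by 80% while Cancel Order and Check Customer both regress, so a change that looks like a win from the database tier alone shows up as a net problem once it is viewed per business transaction.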
Manage IT with Business Impact not with Traffic Lights
I’ve been using the phrase “If everything is important then nothing is important” quite a lot in the last week. My desperate attempts as a product manager to respond to every email, enhancement request, PRD, conference call and tweet are becoming quite challenging, to say the least. I’m constantly fighting the battle of email and have even tried sending less email recently in the vain hope that I’ll receive less…which didn’t seem to work at all. I even tried setting up filters on my inbox, but still the emails keep getting through. It’s actually a novelty these days when someone picks up the phone and has the audacity to speak to me.
A typical day for me starts with a latte (and more often than not a chocolate chunk cookie) from Starbucks, followed by a quick prioritization session. What am I going to do today that will have the biggest impact on the company I work for? I could attempt each day to deal with email and tasks as they arrive on my desk in the vain hope that I’ll keep everyone happy, which normally requires working till 2am each day. Alternatively, I can be smart about how I work and push back on things that are less of a priority or have no tangible impact on the business.

Traffic lights don't always reflect the true business impact
What I go through daily as a product manager is pretty much identical to what operations and application support teams go through each day. Most support teams get email; in fact they get several hundred or even several thousand emails as a result of the enterprise monitoring solutions they have hooked up to every component of their infrastructure. They have alerts and traffic lights configured for their OS, networks, storage, middleware, messaging, databases and users across hundreds of applications and thousands of physical servers. Customers’ enterprise dashboards turn red and stay red because they simply cannot deal with the volume of alerts they receive daily. It’s a monumental task to browse through alerts and put all the pieces together in an attempt to identify and isolate an issue before the business picks up the phone and starts asking questions.
More importantly, 99% of these alerts have no business context. The alerts contain technical information based on KPI metrics for a given threshold breach or state; they do not provide any visibility into how the alert is impacting the business. If an enterprise monitoring team receives 5,000 alerts a day, how can they make sure they deal with the 3 or 4 alerts that are impacting the business rather than the thousands of alerts that are just noise?
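One way to picture alerts with business context is to join each infrastructure alert to the business transactions that flow through the component that raised it. The sketch below is an illustration under stated assumptions, not how any specific monitoring product works: the `Alert` class, the component names and the flow mapping are all hypothetical, and it assumes something else already tells you which transactions are breaching their service levels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alert:
    source: str   # component that raised the alert, e.g. a server or tier
    message: str  # technical detail from the monitoring tool

# Hypothetical mapping of components to the business transactions they carry.
FLOWS = {
    "db01": {"Execute Order", "Cancel Order"},
    "app07": {"Check Customer"},
    "batch03": set(),
}
# Hypothetical set of transactions currently breaching their service level.
BREACHING = {"Execute Order"}

def prioritise(alerts, flows, breaching):
    """Keep only alerts whose source carries a breaching transaction.

    Alerts touching more breaching transactions come first; the sort is
    stable, so alerts with equal scores keep their arrival order.
    """
    scored = []
    for alert in alerts:
        hit = flows.get(alert.source, set()) & breaching
        if hit:
            scored.append((len(hit), alert))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [alert for _, alert in scored]

ALERTS = [
    Alert("batch03", "disk 91% full"),
    Alert("db01", "lock wait threshold exceeded"),
    Alert("app07", "heap usage above 80%"),
]
urgent = prioritise(ALERTS, FLOWS, BREACHING)
```

Of the three alerts in this example only the one from `db01` survives, because it is the only component carrying a transaction that is actually breaching. The other two remain valid technical alerts, but they are not what the team should look at first.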
The answer is Business Transaction Management. When you can manage all business transactions across all tiers all of the time, you have total visibility into how your business runs on IT. More importantly, you can quantify business impact in real time by seeing with your own eyes which business transactions, users and applications are experiencing service level breaches. You manage IT with business impact so that you can truly prioritise your teams and resources to deal with the incidents that are most detrimental to your business. Gone are the days when your IT support department manages IT with traffic lights based on infrastructure alerts, or by investigating each alert as it arrives in an inbox that is running out of disk quota.
Not all business transactions, users and applications are equal, just as not all emails, enhancement requests and PRDs are equal for a product manager. If you can’t prioritize and focus on the things that have an impact on your business, then the amount of value you’re providing to that business is pretty questionable. In many organizations the business is IT, and without IT the business would fail. It’s therefore essential that IT is aligned to the needs and priorities of the business.
Business Transaction Management has Disco Fever
Life is dull when you can predict everything that is going to happen. For instance, I was driving home last week in rush hour on the M4 in the fast lane, and in my mirrors I could see a black car approaching quickly. A few seconds later a black TVR Tuscan with big yellow stripes was behind me: pretty cool, and a pretty rare sight on a motorway. As the traffic ground to a halt, the owner of the TVR pulled into the middle lane next to me and revved his engine to prove a point whilst looking at me with a smug grin. The first thing that entered my mind was “Your car’s not going to last long sitting in this traffic, mate”. Guess what? A few minutes later smoke started pouring out of the front of the TVR, with the owner looking pretty stressed. I was laughing and feeling smug too, but not surprised in the slightest, as the TVR pulled onto the hard shoulder in a cloud of white smoke. For those not familiar with TVR sports cars, they are about as reliable as the Windows operating system with no firewall or anti-virus protection: leave them to idle and you’re in trouble.
Today, I can’t help thinking that enterprise monitoring is largely predictable, or even somewhat dull, despite servers and applications going up in flames occasionally. For the people who manage helpdesks or application support, monitoring software is about as interesting as watching a set of traffic lights for 8 hours a day. The lights turn red, all hell breaks loose and the blame finger comes out. The lights stay green and you can kick back on Facebook or Twitter and see who’s updated their status (only joking) or read blogs that describe just how you’re feeling.
Enterprise monitoring needs an adrenaline boost. It needs mojo. It needs to shock and deliver answers to problems that you would never have predicted or guessed. If you can predict or assume why outages might have occurred, then it becomes quite boring blaming the same DBA or network administrator every week. When the database is slow, everybody assumes it’s a missing index or that the DBA hasn’t updated the table statistics for several years. If the JVM is throwing OutOfMemoryError, everybody assumes it’s a memory leak and gets paranoid about finding the irresponsible code without first checking JVM memory parameters like MaxPermSize, which will often resolve 90% of memory issues. Another classic example is where a JMX metric shows connections to the database are being exhausted, so the first thought is to increase the database connection pool size in the JVM without actually figuring out what’s holding onto the exhausted connections (like slow SQL) in the first place.
Imagine if your enterprise monitoring software provided you with answers that shocked you. Imagine if you were in denial for a split second, or even freaked out at the prospect that the solution to your problem is something you’ve never even considered before. To be shocked, you and your enterprise monitoring software first need to be able to discover new things. The traditional way to deploy enterprise monitoring software is to ask the customer “Which servers/tiers do we need to put an agent on or monitor?”. This approach means customers only get visibility into the servers/tiers they expect their application and business transactions to flow through. The data provided is therefore predictable and somewhat unexciting.
Forget about monitoring servers/tiers for a moment (or a few years). Imagine if you monitored business transactions and their respective flows instead. Things start to get interesting very quickly. Wherever the business transaction goes, so do your monitoring capabilities and visibility. You begin to discover servers and tiers that you never imagined your business transactions or applications utilised. You begin to learn new things about how your applications and business transactions behave: you learn their dependencies, their interactions and, more importantly, their contributions to managing your service levels and end user experience. Welcome to the world of Business Transaction Management (BTM).

Being shocked is a good thing
I’ve seen many customers shocked, in denial and, more importantly, buzzed about what BTM can do for them and their organisation. Seeing a customer’s face is priceless when you tell them that their business transactions flow from their production application servers to a UAT test database. It’s even more priceless when you hear them pick up the phone and explain to other people in their organisation that real users’ business transactions are executing against a UAT test database. It’s also more impressive to show customers their real application topology, based on business transaction flow, than to keep referencing the partial diagram they think their application actually uses. It was only last week that BTM pointed out four application tiers to a customer who had no idea the tiers actually existed. Shock, denial and then amazement is how I would describe that customer.
Apologies to those reading this blog who were expecting references to Disco Funk, big hair, big flares and the king of pop. All I can say is that Business Transaction Management discovers a lot of things that make life a bit more exciting and unpredictable. If everything was predictable then managing IT wouldn’t be fun each day.
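The production-to-UAT surprise described above is the kind of thing that falls out naturally once you build the topology from observed transaction flows instead of from a diagram. The following is a minimal sketch, assuming each transaction trace is simply an ordered list of the tiers it passed through. The tier names, the environment mapping and the "documented" set are all hypothetical.

```python
def discover_topology(traces):
    """Return the set of tier-to-tier edges actually seen in the traces.

    traces: iterable of lists, each the ordered tiers one transaction visited.
    """
    edges = set()
    for hops in traces:
        edges.update(zip(hops, hops[1:]))
    return edges

def cross_environment_edges(edges, environment_of):
    """Edges whose two ends are in different environments (e.g. prod -> uat).

    environment_of must contain an entry for every tier in the edges.
    """
    return sorted(
        (a, b) for a, b in edges if environment_of[a] != environment_of[b]
    )

# Illustrative traces: the second transaction hits a UAT database by mistake.
TRACES = [
    ["web-prod", "app-prod", "db-prod"],
    ["web-prod", "app-prod", "db-uat"],
]
ENVIRONMENT = {
    "web-prod": "prod", "app-prod": "prod", "db-prod": "prod", "db-uat": "uat",
}
# Tiers that appear on the (hypothetical) architecture diagram.
DOCUMENTED = {"web-prod", "app-prod", "db-prod"}

edges = discover_topology(TRACES)
suspicious = cross_environment_edges(edges, ENVIRONMENT)
undocumented = sorted({tier for hops in TRACES for tier in hops} - DOCUMENTED)
```

In this example `suspicious` contains the single edge from `app-prod` to `db-uat`, and `undocumented` lists `db-uat` as a tier nobody drew on the diagram. Neither finding requires anyone to have decided in advance to monitor that database.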
Simplicity is Good, Complexity is Evil.
For those of you who read my blog, you may perceive me as just another product manager of an IT company. One of my interests outside of work is motorsport and generally driving a car as fast as is physically possible. I’m not the type of guy who drives to work cruising on the motorway in 6th gear doing 50mph (no offense meant to people who do this, btw). For me it’s about getting from point A to point B as quickly as possible whilst maintaining strict adherence to government speed limits…or something along those lines.
Anyway, I was thinking the other day just how complex a car is underneath the glossy paint and metal shell that most people perceive a car to be. You’ve got the engine for starters (literally), then you’ve got things like air filters, radiators, oil tank, fuel tank, catalytic converters, spark plugs, exhausts, gearbox, clutch and so on (I won’t bore you with the other 1842 parts). The car also has hundreds of sensors to detect failure, tolerance levels of components and even stupid people who don’t wear their seatbelts (again, no offense intended to people who don’t wear seatbelts). It’s actually an engineering miracle that so many pieces can work together without failure for so long (unless you happen to own a TVR, of course). And the great thing is that when something does go wrong, your car dashboard lights up like a Christmas tree and tells you what’s wrong. How cool is that? The monitoring and operation of all those car components is simplified through a lovely glowing dashboard. The oil light comes on when you need more oil; the tyre light comes on when you need new tyres or more pressure. If you drive a BMW then the onboard computer even tells you that you’re not driving close enough to the car in front like other BMW drivers. In the unfortunate case of an engine light, the problem normally involves a trip to your garage, where some guy in white overalls plugs a computer into your car’s ECU. Usually within 2 minutes he’s detected that your car is broken and needs £2000 worth of work to fix it. Needless to say, 95% of issues can be fixed in no more than a few hours (which is why Audi, Mercedes and BMW garages charge £150 per hour labour).

Simple to drive but Complex to engineer
My point with the car is that it’s a simple bit of kit to use and monitor despite its hidden complexities. Car manufacturers have done a stellar job of simplifying complexity so that our cars don’t have 101 dashboards to report status or issues. You turn the key to start, turn the wheel to steer and plant your foot firmly to the floor to go fast. Your car’s dashboard does all the rest to inform you of what you need to know. When the car needs an update, the garage simply remaps the car’s ECU at the next service rather than letting the owner do it himself with a laptop, OBC connection and a hotfix off the internet.
If monitoring cars can be so simple, then why can’t monitoring applications? Applications have just as many components and just as much complexity. They are even built by engineers, who use keyboards rather than spanners. They even have a nice pretty appearance (unless they were built in the 1990s with Visual Basic or something). I know what you’re thinking: “Applications are more complex, nothing can be as complex as coding an EJB or writing some complex SQL”. Try telling that to the folks at Ferrari or Porsche who spend millions each year optimising the traction control and stability systems that stop people like me from ending up in a hedge.
As a product manager working for a software company in the monitoring space, I feel a sense of responsibility for putting an end to the complexity of monitoring business transactions, applications, SOA environments, end users, networks, JVMs, databases, servers, enterprise buses and pretty much everything else that requires several million products, agents, appliances, dashboards and user interfaces. Software vendors should do what car manufacturers have been doing for the last 20 years. They should provide simple, usable solutions that abstract away all the complexity and make it as straightforward as possible to manage business transactions and the IT infrastructure through which they flow.
New business transaction management research announced
On June 16th, OpTier announced that it had interviewed 2,000 UK IT decision makers at businesses of 1,000+ employees across a range of industries, including retail, government, finance, telecoms and manufacturing. The results were significant: two-thirds of IT managers are blinded by the complexity of management tools, and as a result this is costing large businesses more than £4.5 million annually.
The startling insight was picked up across the UK and brought business transaction management to the forefront of IT news.
Head over to the press release on OpTier.com to learn more about the research.
Why BTM Complements APM Solutions
Something that has become clear in my mind over the last year is that Business Transaction Management (BTM) is very different to the popular Application Performance Management (APM) solutions we’ve typically seen in the marketplace over the last 5 years. I base my opinion solely on an important group of people we’ve come to know over the years as “customers”.
My father once taught me an important lesson whilst he was arguing with a waiter in a restaurant, something along the lines of “The customer is always right”. In fact, it was only recently I used the exact same phrase whilst interacting with the security staff at EL AL airport check-in. After a few puzzled looks, baggage checks, several questions and two stickers I was on my way. Anyway, my point is that customers are generally a good indication as to whether something is good, bad, useful, different or valuable.
In the last year I’ve sat with several Fortune 500 customers who have ALL told me that BTM is changing the way they utilise their APM investments. In fact, two of these customers actually shared with me their IT service delivery and support processes so I could see with my own eyes where BTM and APM were playing a key role towards the common objectives of improving end user service levels, performance and availability. Simply put, BTM was used to identify, alert, prioritise (understand business impact) and isolate issues whereas APM was then used to understand root cause and resolution of these issues.
For example, BTM could detect in real time a user-specific transaction that breached its service level in the application, and could then provide an immediate latency breakdown across all the tiers that the problematic transaction traversed. Once the latency is isolated to a specific tier, the customer can focus their APM solutions on that tier, understand the root cause and apply a fix. The net result of all of this is that Mean Time To Resolution (MTTR) or Recovery (take your pick) is significantly reduced. One BTM customer dropped MTTR from 2 hours to under 15 minutes by using BTM and APM effectively together.
Users experience transactions. It’s therefore important that BTM provides you with visibility of every transaction from every user across every tier, so you can focus your APM solutions in seconds on the tiers that are causing issues. When your application spans tens or hundreds of tiers, you need to isolate the right haystack before you start looking for that needle.
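The "isolate the tier first" step can be illustrated with a small calculation. This is a minimal sketch, assuming each measurement for a transaction is a (tier, elapsed milliseconds) pair where the elapsed time is the time spent in that tier itself, excluding downstream calls. The tier names and timings are made up for illustration.

```python
def latency_breakdown(spans):
    """Total time (ms) spent in each tier for a single transaction.

    spans: iterable of (tier, elapsed_ms) pairs; a tier may appear more
    than once if the transaction visited it several times.
    """
    totals = {}
    for tier, elapsed_ms in spans:
        totals[tier] = totals.get(tier, 0) + elapsed_ms
    return totals

def slowest_tier(spans):
    """Return (tier, share) for the tier that contributed the most latency."""
    totals = latency_breakdown(spans)
    tier = max(totals, key=totals.get)
    return tier, totals[tier] / sum(totals.values())

# Illustrative trace of one slow transaction.
SPANS = [
    ("web", 40),
    ("app", 110),
    ("database", 700),
    ("app", 50),
    ("message-bus", 100),
]

tier, share = slowest_tier(SPANS)
```

For this trace the database accounts for 70% of the 1,000 ms total, so that is the tier where an APM tool, or a DBA, should start digging for the root cause.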
