What is AIOps?

AIOps, or artificial intelligence for IT operations, is all the buzz at the moment. Many software vendors have products that they call AIOps or that they say play in the AIOps space.

In my discussions with customers, a common theme has come up: there is too much data for the operations staff to look at, too many scenarios to manually programme automations for, and too few skilled people who understand the environment well enough to know what to do next when a problem arises.

Enter AIOps, a solution that knows how to work with “big data”, how to sift through lots of information and pick out what is of interest, group related events together and make recommendations on how problems can be solved.

We create more and more data of various kinds: log files, errors, performance data, chats, tickets and so on. Machines are good at working through a lot of data and finding both patterns and the anomalies that break those patterns. Some patterns are useful for seeing how things group together and are inter-related. Other patterns establish what is “normal”, so that when an anomaly breaks such a pattern, it is an early warning that something is not acting as it should and may be about to go wrong.
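To make the idea of “normal” and anomalies a little more concrete, here is a minimal sketch in Python. It assumes a very simple notion of normal (the average of past values, plus or minus a few standard deviations); real AIOps products use far richer models. The function name, the threshold of 3 and the sample response times are all made up for illustration.

```python
from statistics import mean, stdev


def find_anomalies(baseline, new_values, threshold=3.0):
    """Return the new values that sit more than `threshold` standard
    deviations away from the mean of the baseline ("normal") data."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value breaks the pattern.
        return [v for v in new_values if v != mu]
    return [v for v in new_values if abs(v - mu) / sigma > threshold]


# Hypothetical response times in milliseconds.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
latest = [101, 99, 180, 100]

print(find_anomalies(baseline, latest))  # prints [180]
```

The baseline here averages 100 ms with a standard deviation of 2 ms, so a reading of 180 ms stands out clearly, while 99 to 101 ms is business as usual.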

In addition, the AI can identify what solutions people used to solve problems in the past and apply those solutions should the same problems arise again. So, the AI learns and adapts as time goes on.
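As a rough illustration of learning from past fixes, the sketch below keeps a count of which fix resolved which kind of problem and recommends the one that has worked most often. It is a deliberately simple stand-in, not how any particular product does it; the class name, the problem signatures and the fixes are hypothetical.

```python
class KnownFixes:
    """Remembers which fix resolved which kind of problem."""

    def __init__(self):
        # problem signature -> {fix description: number of times it worked}
        self._fixes = {}

    def record(self, signature, fix):
        """Note that `fix` resolved a problem with this signature."""
        counts = self._fixes.setdefault(signature, {})
        counts[fix] = counts.get(fix, 0) + 1

    def recommend(self, signature):
        """Return the fix that has worked most often, or None if unknown."""
        counts = self._fixes.get(signature)
        if not counts:
            return None  # no known solution: hand over to a person
        return max(counts, key=counts.get)


fixes = KnownFixes()
fixes.record("disk_full:/var/log", "rotate and compress logs")
fixes.record("disk_full:/var/log", "rotate and compress logs")
fixes.record("disk_full:/var/log", "extend volume")

print(fixes.recommend("disk_full:/var/log"))     # prints rotate and compress logs
print(fixes.recommend("db_connection_timeout"))  # prints None
```

Every time a person resolves a new kind of problem and the fix is recorded, the set of problems that can be handled without them grows, which is the learning and adapting described above.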

Gartner says the central functions an AIOps solution should include are:

1. Ingesting data from multiple sources;

2. Enabling data analytics using machine learning at two points:

a) Real-time analysis at the point of ingestion (streaming analytics)

b) Historical analysis of stored data

3. Storing and providing access to the data;

4. Suggesting prescriptive responses to analysis; and

5. Initiating an action or next step based on the prescription (result of analysis).
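To show how these five functions fit together, here is a toy sketch in Python. It is my own illustration, not something taken from Gartner, and it makes some big simplifying assumptions: a fixed threshold stands in for the machine learning, a plain list stands in for the data store, and the "action" is only written to a list. All names, metrics and values are hypothetical.

```python
class MiniAIOps:
    """Toy pipeline: ingest, analyse, store, suggest, act."""

    def __init__(self, threshold=90.0, runbook=None):
        self.threshold = threshold    # stand-in for a learned model
        self.store = []               # 3. storage and access to the data
        self.runbook = runbook or {}  # metric name -> known response
        self.actions = []             # record of actions initiated

    def ingest(self, event):
        """1. Ingest an event from any source."""
        self.store.append(event)
        if self.analyse_stream(event):
            suggestion = self.suggest(event)
            self.act(event, suggestion)

    def analyse_stream(self, event):
        """2a. Real-time analysis at the point of ingestion."""
        return event["value"] > self.threshold

    def analyse_history(self, metric):
        """2b. Historical analysis of stored data (here, a simple average)."""
        values = [e["value"] for e in self.store if e["metric"] == metric]
        return sum(values) / len(values) if values else None

    def suggest(self, event):
        """4. Suggest a prescriptive response."""
        return self.runbook.get(event["metric"], "escalate to an engineer")

    def act(self, event, suggestion):
        """5. Initiate the next step (here, just record it)."""
        self.actions.append((event["source"], event["metric"], suggestion))


ops = MiniAIOps(threshold=90.0, runbook={"cpu_percent": "restart worker pool"})
events = [
    {"source": "web-01", "metric": "cpu_percent", "value": 45.0},
    {"source": "web-01", "metric": "cpu_percent", "value": 97.0},
    {"source": "db-01", "metric": "disk_percent", "value": 95.0},
]
for event in events:
    ops.ingest(event)

print(ops.actions)
print(ops.analyse_history("cpu_percent"))  # prints 71.0
```

In this run the high CPU reading has a known response and is handled automatically, while the disk problem has none and is escalated to a person.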

Imagine a future where we no longer have a team of people sitting in front of screens in an NOC (network operations centre), trying to figure out how to turn everything that is red green again by finding and fixing the underlying problems. Instead, those people are able to work a level higher, resolving only the problems that do not already have known solutions, because the AI has identified the real problems (the root causes) and has automatically fixed those that have known solutions.

In this future, downtime is significantly reduced, because root causes are identified faster and the resolutions to problems are automated. We all know that downtime costs companies a lot of money, whether through direct loss of income when a commercial site is down, loss of customers, loss of reputation or loss of productivity of internal staff.

So, in summary, AIOps can help companies make the best use of their people by automating what can be automated. It can also save companies money by reducing the downtime that costs them so much.

Should you want to engage in a further conversation on AIOps, please contact me, Birgit, on bsmythe@envisage.email.