2025 will be the year of AI agents, which will transform data backups, storage, security and data protection.
This is according to Ian Engelbrecht, system engineering manager at Veeam, who says AI and machine learning are becoming indispensable in data security and data protection.
“This year, we will see a huge increase in AI agents and tooling. Pre-training of foundational AI models has largely matured, and improvements now come through fine-tuning, reinforcement learning and retrieval-augmented methods on new data and knowledge. Next come agents, which take action by using the corpus of data that they have been trained on,” he says.
Engelbrecht says AI has made strides in the world of backup and security, and is now being used to monitor environments, identify issues and suggest resolutions, cutting through noise and enabling more proactive operations. This marks the start of AI-driven observability and AIOps.
“AI agents capable of autonomous decision-making are the next step. There is a data pipeline where information is fed into AI models and the AI is making informed decisions about this data. For example, if we input a log message into an AI system, it can identify what is key to the issue, go and research information about that online or within its own model and then generate a resolution based on its knowledge. Now consider AI being able to start handling that action automatically, assess the issue, use reasoning to find a resolution, define actionable steps and execute them,” he says.
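The flow Engelbrecht describes, where a log message is assessed, a resolution is found and actionable steps are defined, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the rule table, the `triage` function and the sample log line are illustrative only, simple keyword rules stand in for an AI model, and nothing is executed automatically.

```python
from dataclasses import dataclass

# Hypothetical knowledge base: keyword -> (diagnosis, suggested steps).
# A real agent would rely on a trained model rather than keyword rules.
KNOWLEDGE_BASE = {
    "disk full": ("Backup repository out of space",
                  ["Apply retention policy", "Extend repository capacity"]),
    "timeout": ("Network or storage latency",
                ["Check connectivity to the repository", "Retry the job"]),
}

@dataclass
class Resolution:
    diagnosis: str
    steps: list
    auto_executable: bool  # kept False here: a human approves every action

def triage(log_message: str) -> Resolution:
    """Assess a log line and propose actionable steps."""
    text = log_message.lower()
    for keyword, (diagnosis, steps) in KNOWLEDGE_BASE.items():
        if keyword in text:
            return Resolution(diagnosis, steps, auto_executable=False)
    return Resolution("Unknown issue", ["Escalate to an engineer"], False)

result = triage("ERROR job 'SQL-Daily' failed: repository disk full")
print(result.diagnosis)
for step in result.steps:
    print("-", step)
```

An autonomous agent would go one step further and run the proposed steps itself, which is why the sketch keeps an explicit `auto_executable` flag rather than acting by default.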
AI in backup monitoring
“From an analytics and monitoring perspective, AI is massive. Veeam is harnessing AI, machine learning (ML) and retrieval-augmented generation (RAG) to enhance data observability in our Veeam ONE solution, making it an even more powerful product,” he says.
Veeam ONE, part of the Veeam Data Platform, provides visibility into virtual, physical, cloud and SaaS platforms, with monitoring, reporting, analytics and tools for automation and control.
Engelbrecht says the addition of a new intelligent AI assistant in Veeam ONE allows users to query data in natural language, making it more accessible to non-technical stakeholders and saving time for technical users. The AI's capabilities in automating report generation and providing insights save time for security and data teams, while the integration of data protection with security operations bolsters overall system resilience.
Engelbrecht says: “It offers over 300 types of predefined alarms and hundreds of reports that you can use to get a view into your infrastructure and data. You can automate reports and automatically get them in your inbox every morning, but for a new user, our AI assistant is fantastic because you can speak to it in natural language, having a conversation with it. You might want to find out whether your databases are protected, how many databases you have protected, and whether they are secure, immutable and safe from ransomware. The AI assistant is then able to interrogate the data that we've collected about the environment and provide you with a response by using a machine learning retrieval-augmented method.”
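As a rough illustration of the retrieval-augmented idea, the Python sketch below picks the collected records most relevant to a natural-language question and places them in the prompt that would be sent to a language model. It is not Veeam ONE's implementation: the records, the function names and the word-overlap ranking (a stand-in for proper vector search) are all assumptions made for the example.

```python
import re

# Hypothetical inventory records, standing in for collected monitoring data.
RECORDS = [
    "Database SQL-01 is protected by a daily backup job and stored on an immutable repository.",
    "Database SQL-02 is not protected by any backup job.",
    "File server FS-01 is protected by a weekly backup job.",
]

def tokens(text: str) -> set:
    """Lower-case a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, records: list, top_k: int = 2) -> list:
    """Rank records by word overlap with the question and keep the best top_k."""
    q = tokens(question)
    ranked = sorted(records, key=lambda r: len(q & tokens(r)), reverse=True)
    return ranked[:top_k]

def build_prompt(question: str, records: list) -> str:
    """Augment the question with retrieved context before it goes to a model."""
    context = "\n".join(f"- {r}" for r in retrieve(question, records))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("Which database is immutable?", RECORDS))
```

The model only sees the retrieved records, which is what lets an assistant answer questions about an environment it was never trained on.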
AI in security
Veeam is integrating its solutions with security partner products to enhance data resilience, he says. Engelbrecht notes that data resilience is an evolution of security and continuity. “Traditional cyber security plans are not able to keep up with the proliferation of new threats. To Veeam, resilience means that regardless of what happens, you have the ability to recover or be able to bring your environment up anywhere. If you try to recover to your DR data centre, but it's also been compromised, you need to be able to go somewhere else, with your data intact, secure and available. This is where data mobility comes into play. Veeam enables data mobility, giving you the freedom of choice.”
Veeam AI/ML also bolsters traditional security.
“Where a customer might have an SOC, with a SIEM and SOARs, Veeam feeds information into these platforms to enhance them. We use an ML model trained to identify malicious and encrypted data, and deploy it for inline malware scanning as data is being backed up or read,” Engelbrecht explains.
“We flag malicious backup images and send that information to the SOC marked as infected or suspicious. The solution also informs backup engineers of the malware, along with information on the last clean backup in the historical chain.”
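Finding the last clean backup in a historical chain can be sketched in a few lines of Python. The `RestorePoint` type, the status labels and the sample dates are hypothetical and chosen only for illustration; they do not reflect Veeam's data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class RestorePoint:
    created: date
    status: str  # "clean", "suspicious" or "infected" (hypothetical labels)

def last_clean(chain: list) -> Optional[RestorePoint]:
    """Return the most recent restore point marked clean, or None if there is none."""
    clean = [p for p in chain if p.status == "clean"]
    return max(clean, key=lambda p: p.created, default=None)

chain = [
    RestorePoint(date(2025, 1, 10), "clean"),
    RestorePoint(date(2025, 1, 11), "clean"),
    RestorePoint(date(2025, 1, 12), "suspicious"),
    RestorePoint(date(2025, 1, 13), "infected"),
]
point = last_clean(chain)
print(point.created if point else "no clean restore point")
```

In this sample chain the restore point from 11 January is the one a backup engineer would recover from, since everything after it is flagged.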
“We can identify who the malicious actor is and what MITRE techniques they are using with a built-in indicators of compromise (IOC) scanner, then craft a YARA rule to scan the entire environment to determine whether the threat actor’s tools deployed during the kill chain are still within the environment, including the historical chain. This ensures there are no malicious tools in the infrastructure and archive.”
Engelbrecht says AI technology is able to review thousands of logs and events and find anomalies in minutes or hours that would take a person days or weeks to find. It not only finds important data, but can also improve its results by adding context and meaning from historical information, and can predict possible outcomes based on historical data.
“For example, imagine a typical backup environment; AI could monitor the backup infrastructure and then advise that it has detected a pattern of job slowing, which could mean that in three weeks’ time, this backup job could start failing,” he says.
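One simple way to picture this kind of prediction is to fit a straight line to recent job durations and estimate when the trend will exceed the backup window. The Python sketch below does that with ordinary least squares. It is an assumption-laden illustration, not a description of how any Veeam product forecasts failures: the durations, the 240-minute window and the function names are invented, and it needs at least two data points.

```python
def fit_line(ys: list) -> tuple:
    """Least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def runs_until_breach(durations_min: list, window_min: float):
    """Estimate how many more runs fit before the trend exceeds the window.

    Returns None when the durations are not growing.
    """
    slope, intercept = fit_line(durations_min)
    if slope <= 0:
        return None
    breach_x = (window_min - intercept) / slope  # run index where the trend hits the window
    return max(0.0, breach_x - (len(durations_min) - 1))

# Hypothetical nightly job durations in minutes, with a 240-minute backup window.
durations = [180, 182, 184, 186, 188, 190]
print(runs_until_breach(durations, 240))
```

With these sample figures the job grows by two minutes a night, so the sketch reports 25 more runs before the window is exceeded. A production system would use richer models and more signals than job duration alone.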
These are some of the possibilities that open up when AI agents are coupled with machine learning methods and allowed to act autonomously.
NLP in historical data analytics
Engelbrecht highlights the potential for AI/ML and natural language processing (NLP) to be deployed to optimise the value of backup data.
“You can give AI access to backup data – for example, your finance or CRM server with sales data. Using Veeam data APIs, you can automatically mount the sales data from backup without having to restore it. We make that data instantly available through Veeam without having to recover it, and when you tie in AI, you can interrogate sales metrics going back many years using NLP and machine learning techniques. You could use this for auditing, forensics, classification, prediction or creating a data lake to train a custom model,” he says.