The world of Artificial Intelligence (AI) seems to be exploding with the release of ChatGPT. But as soon as the the chat bot came into the hands of public people started finding self-sabotaging queries at worst (exploitable issues) and some weird interactions whereby people could write malware that could stay undetected by Endpoint Detection and Response (EDR) bypasses.

What is AI?

Very simply, Artificial Intelligence (per Wikipedia) is intelligence demonstrated by machines. But technically, it is a set of algorithms that can make do things that a human does by making an inference, similar to humans, on the basis of data that was historically provided as “reference” to make the decisions. This reference data is called as training data. And the data which is used to test the effectiveness of the algorithm to arrive at a decision on the basis of that reference, is called as test data. Any good machine learning course teaches how do you design data and how much data to use for training and how much to use for testing and metrics of performance but that is not relevant to our discussion here – however, what’s important is that it is the data that you provide that controls the decision-making in an artificially intelligent algorithm. This is a key difference between typical algorithms (where the code is more or less static and makes decisions on certain states in the program) whereas in an artificially intelligent system you can have the program arrive at different decisions depending on how one decides to “train” the algorithms.

What is ML?

Machine Learning (ML) is a subset of Artificial Intelligence (AI) where the artificial intelligent algorithms evolve their decision-making on the basis of data that has been processed and tagged as training data. ML systems have been used in classifying spam or anomaly detection in computer security. These systems tend to use statistical inference to establish a baseline and highlight situations where the input data does not fall within the norm. When operational data is being used to train ML-based system one has to be careful that we are not incrementally altering the baselines of what’s normal and what’s not. Such “tilting” may happen over time and its important to protect against drift of such systems. Some “drift” is ok but “bad drift” is not – which is hard to predict. E.g., let’s say you classify some data inaccurately and accidentally/maliciously end up using it for training your ML-models but if it inherently alters the behavior of the ML model, then the model becomes unreliable.

What is Adversarial AI?

Adversarial Artificial Intelligence (AI) based threats are ones where malicious actors design the inputs to make models predict erroneously. There are a couple of different types of attack here – poisoning attack (where you train models with bad data controlled by adversaries) or an evasion attack (where you make the artificial intelligence system make a bad inference with a security implication). The way to understand these attacks is that the poisoning attack is basically “Garbage-in-garbage-out” but its this really “special” garbage. This is garbage that changes the behavior of the algorithm in a way that the algorithms returns an incorrect result when it has to make a decision. The inferential attacks are different in that the decision made is wrong because the input is such that it appears differently to the ML algorithm than it does to humans. E.g., Gaussian noise being classified as a human or a fingerprint being matched incorrectly.

Can we attack these systems in other ways?

In a paper presented by Google researchers created a tool (TensorFuzz) that they were able to demonstrate finding a few varieties of bugs in Deep Neural Networks (DNNs). So typical software attack techniques do work against the deep neural networks too. Fuzzing has been used for decades and has caused faults in code forever. At its core, fuzzing is simple, send garbage input that causes a failure in the program. It’s just that the failures in DNN are different and you want to ensure the software relying on the DNN to make a decision handles such failures appropriately and do not cause a security failure with secure defaults.

Protection mechanisms

There are a few simple ways to look at ML systems and security thereof. Microsoft released an excellent howto on how to threat model ML systems. Additionally, using adversarial training data is imperative to ensure that artificially intelligent algorithms performs as you expect them to in the presence of adversarial data. When you rely on ML-based systems, its all the more important that you test it appropriately and continue to do so against baselines. Unfortunately, for Deep Neural Networks transparency of decision making continues to be an issue and needs the AI/ML researchers to establish appropriate transparency measures.