From drones delivering medical supplies to digital assistants that handle everyday tasks, AI-powered systems are increasingly embedded in daily life. The creators of these innovations promise transformative benefits. To some people, mainstream applications such as ChatGPT and Claude seem like magic. But these systems are neither magical nor foolproof - they can and do fail to work as expected.
AI systems can fail because of technical design flaws or biased training data. They can also have vulnerabilities in their code that malicious hackers can exploit. Isolating the cause of an AI failure is critical to fixing the system.
But AI systems are usually opaque, even to their creators. The challenge is how to investigate AI systems after they fail or fall victim to attack. There are techniques for inspecting AI systems, but they require access to the system's internal data. This access is not guaranteed, especially to forensic investigators called in to determine the cause of a proprietary AI system's failure, which can make investigation impossible.
We are computer scientists who study digital forensics. Our team at Georgia Tech has built a system, AI Psychiatry, or AIP, that can recreate the scenario in which an AI failed in order to determine what went wrong. The system addresses the challenge of AI forensics by recovering and “reanimating” suspect AI models so that they can be systematically tested.
AI uncertainty
Imagine an autonomous car veering off the road for no easily discernible reason and then crashing. Logs and sensor data might suggest that a faulty camera caused the AI to misread a road sign as a command to swerve. After a mission-critical failure such as an autonomous vehicle crash, investigators need to determine exactly what caused the error.
Was the crash caused by a malicious attack on the AI? In this hypothetical case, the camera's fault could stem from a security vulnerability, or bug, in the software that a hacker exploited. If investigators find such a vulnerability, they then have to determine whether it caused the crash. But making that determination is no small feat.
Although forensic methods exist for recovering some evidence from failures of drones, autonomous vehicles and other so-called cyber-physical systems, none can capture the clues needed to fully investigate the AI in those systems. Advanced AIs can even update their decision-making continually, making it impossible to investigate the most up-to-date models with existing methods.
Pathology of AI
AI Psychiatry applies a series of forensic algorithms to isolate the data behind an AI system's decision-making. These pieces are then reassembled into a functional model that performs identically to the original. Investigators can “reanimate” the AI in a controlled environment and test it with malicious inputs to see whether it exhibits harmful or hidden behaviors.
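To give a sense of what that rehost-and-probe step involves, here is a minimal Python sketch using PyTorch. The toy network, function names and trigger are hypothetical illustrations of the technique, not AIP's actual code.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a model recovered from a memory image.
# A real traffic-sign classifier would be a CNN; a tiny network
# keeps the sketch self-contained.
class SignClassifier(nn.Module):
    def __init__(self, num_classes: int = 43):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 32 * 32, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def reanimate(state_dict: dict) -> SignClassifier:
    """Rehost recovered weights in a fresh, sandboxed model instance."""
    model = SignClassifier()
    model.load_state_dict(state_dict)
    model.eval()  # inference only: no training, no side effects
    return model

def probe_for_backdoor(model: nn.Module, clean: torch.Tensor,
                       trigger: torch.Tensor) -> bool:
    """Flag the model if adding a trigger patch flips its prediction."""
    with torch.no_grad():
        clean_pred = model(clean).argmax(dim=1)
        triggered_pred = model(clean + trigger).argmax(dim=1)
    return bool((clean_pred != triggered_pred).any())

# Pretend these weights were carved out of the crashed vehicle's memory.
recovered_weights = SignClassifier().state_dict()
suspect = reanimate(recovered_weights)

clean_image = torch.rand(1, 3, 32, 32)   # stand-in for a benign sign image
trigger_patch = torch.zeros(1, 3, 32, 32)
trigger_patch[:, :, :4, :4] = 1.0        # small sticker-like trigger

print("Backdoor behavior detected:",
      probe_for_backdoor(suspect, clean_image, trigger_patch))
```

The key point is that the recovered model runs in isolation, so investigators can feed it suspicious inputs without any risk to the original system.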
AI Psychiatry takes as input a memory image, a snapshot of the bits and bytes loaded while the AI was operating. In the autonomous vehicle scenario, the memory image at the time of the crash holds crucial clues about the internal state and decision-making processes of the AI controlling the vehicle. With AI Psychiatry, investigators can lift the exact AI model from memory, dissect its bits and bytes, and load the model into a secure environment for testing.
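For intuition about what recovering a model from memory can involve, here is a heavily simplified Python sketch of one carving idea: scanning a raw dump for long, aligned runs of plausible 32-bit floats, which is how serialized network weights often appear. The file name and thresholds are hypothetical, and this is not AIP's actual recovery algorithm, which must also reconstruct the model's architecture and metadata.

```python
import struct

def carve_float_runs(dump: bytes, min_len: int = 1024):
    """Yield (offset, values) for long runs of plausible float32 values.

    A toy heuristic for illustration only: trained network weights are
    typically long arrays of small, finite floating-point numbers.
    """
    run_start, run = None, []
    for offset in range(0, len(dump) - 3, 4):
        (value,) = struct.unpack_from("<f", dump, offset)
        if -100.0 < value < 100.0:  # also excludes NaN and infinities
            if run_start is None:
                run_start = offset
            run.append(value)
        else:
            if run_start is not None and len(run) >= min_len:
                yield run_start, run
            run_start, run = None, []
    if run_start is not None and len(run) >= min_len:
        yield run_start, run

# Hypothetical usage on a memory dump captured at crash time.
with open("vehicle_memory.dump", "rb") as f:
    for offset, values in carve_float_runs(f.read()):
        print(f"candidate weight block at 0x{offset:x}: {len(values)} floats")
```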
Our team tested AI Psychiatry on 30 AI models, 24 of which were intentionally “backdoored” to produce incorrect results under specific triggers. The system was able to successfully recover, rehost and test every model, including models commonly used in real-world scenarios such as street sign recognition in autonomous vehicles.
So far, our tests suggest that AI Psychiatry can effectively solve the digital mysteries behind failures such as self-driving car crashes that previously left more questions than answers. And if it does not find a vulnerability in the car's AI system, AI Psychiatry allows investigators to rule out the AI and look for other causes, such as a faulty camera.
Not only for autonomous cars
AI Psychiatry's main algorithm is generic: it focuses on the universal components that all AI models need in order to make decisions. This makes our approach readily extensible to any AI model built with popular AI development frameworks. Anyone working to investigate a possible AI failure can use our system to assess a model without prior knowledge of its exact architecture.
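One way to picture that generality: once recovered, any model reduces to the same few ingredients regardless of framework, so a single interface can front all of them. The sketch below is a hypothetical abstraction for illustration, not AIP's real API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence

@dataclass
class RecoveredModel:
    """The universal ingredients every decision-making model shares."""
    framework: str                 # e.g. "pytorch" or "tensorflow"
    weights: Sequence[Any]         # parameter tensors lifted from memory
    forward: Callable[[Any], Any]  # reconstructed inference function

    def decide(self, observation: Any) -> Any:
        # Once rehosted, every model answers the same question:
        # given this input, what would you have decided?
        return self.forward(observation)

# Toy usage: a "model" whose decision is the sign of a number.
toy = RecoveredModel(framework="toy", weights=[1.0],
                     forward=lambda x: "turn" if x > 0 else "stay straight")
print(toy.decide(0.7))  # -> turn
```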
Whether it is an AI bot that makes product recommendations or a system that guides an autonomous drone fleet, AI Psychiatry can recover and rehost the AI for analysis. AI Psychiatry is fully open source for any investigator to use.
AI Psychiatry can also serve as a valuable tool for auditing AI systems before problems arise. With government agencies from law enforcement to child protective services integrating AI systems into their workflows, AI audits are becoming an increasingly common oversight requirement at the state level. With a tool like AI Psychiatry, auditors can apply a consistent forensic methodology across diverse AI platforms and deployments.
In the long run, this will pay meaningful dividends both for the creators of AI systems and for everyone affected by the tasks they perform.