Every major AI model currently in commercial use fails to comply with European law in the majority of tested scenarios, according to research published by Aithos, a European AI research non-profit.
The organisation’s tool, LARA (Legal Assessment for Real-world Agents), put twelve of the leading AI models through more than 3,000 simulated workplace scenarios to assess compliance with two of the EU’s flagship digital regulations: the GDPR, which governs personal data, and the EU AI Act, which sets hard limits on AI behaviour. Even the best-performing model broke the law in 46% of cases. The worst did so in 93%.
LARA was developed by Aithos to help individuals evaluate AI models against real legal requirements. “We place the model in an adaptive simulation, where it can read emails, use tools, or talk to customers. LARA tests how AI systems really act, rather than performance on a fixed benchmark,” said Daan Henselmans, Research Director, Aithos. The findings reveal a striking gap between public assumptions about AI safety and the actual legal behaviour of deployed systems.
Claude Opus 4.7 ranked highest, with an overall legal compliance rate of around 54%, though it still failed a substantial share of tests. OpenAI’s GPT-5.5 came in at roughly 38%. Google’s Gemini 3.1 Pro performed worst among the systems named in the findings, achieving just 10% legal compliance.
Every single legal provision tested was violated by a majority of the frontier models assessed.
The tests were designed to reflect real conditions rather than abstract benchmarks. LARA places models inside adaptive simulations where they can read emails, use scheduling tools, access customer records, and interact with users, then observes how they behave when following instructions would require breaking the law.
Among the behaviours uncovered: models repeatedly steered vulnerable users toward long-term financial products after emotional prompting. In one case, terminally ill users were guided toward 30-year financial commitments despite clear signals of vulnerability. Other tests flagged unlawful emotion inference and psychological profiling – practices explicitly banned under Article 5 of the EU AI Act.
The findings carry significant commercial implications. Under current EU rules, it is the businesses deploying AI agents – not the companies that build the underlying models – that bear primary legal responsibility. Organisations found in breach of GDPR face fines of up to €20 million or 4% of annual global turnover. Violations of the EU AI Act can result in penalties of up to €35 million or 7% of annual global turnover.
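To put those ceilings in concrete terms, the short Python sketch below computes the maximum exposure for a given turnover. It assumes the higher of the fixed amount and the turnover percentage applies, which is how both regulations frame their top penalty tier but is not spelled out in the figures quoted above. The function and constant names are illustrative only, and the result is an upper bound, not a prediction of any actual fine.

```python
# Illustrative sketch: maximum fine ceilings quoted in the article.
# Assumption (not stated in the article): the higher of the fixed
# amount and the turnover-based amount is the applicable ceiling.

# (fixed ceiling in euros, percentage of annual global turnover)
GDPR_CEILING = (20_000_000, 4)
AI_ACT_CEILING = (35_000_000, 7)


def max_fine_eur(annual_turnover_eur, ceiling):
    """Return the upper bound of a fine for the given annual global turnover."""
    fixed_amount, turnover_percent = ceiling
    turnover_based = annual_turnover_eur * turnover_percent / 100
    return max(fixed_amount, turnover_based)


if __name__ == "__main__":
    for turnover in (100_000_000, 1_000_000_000):
        print(
            f"Turnover EUR {turnover:,}: "
            f"GDPR ceiling EUR {max_fine_eur(turnover, GDPR_CEILING):,.0f}, "
            f"AI Act ceiling EUR {max_fine_eur(turnover, AI_ACT_CEILING):,.0f}"
        )
```

For a company with €100 million in turnover, the fixed amounts dominate; at €1 billion, the percentage-based figures take over.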
Both regulations also apply to companies based outside the EU. Any business processing the data of EU residents, or deploying AI that impacts people in the EU, falls within scope regardless of where the company is headquartered.
“These are not abstract legal violations and the results should concern anyone interacting with an AI system, not just the businesses deploying them,” said Nadia Kadhim, Executive Director, Aithos. “These laws are in place because AI can cause real harm to real people. Our autonomy, privacy, and other fundamental human rights are at play. What LARA has been able to show is that the systems that people rely on every day are not yet built to protect those rights.”
Aithos says ordinary users currently have no reliable way to determine whether the AI systems they interact with are operating within the law. LARA is available free of charge at lara.aithos.org. An upcoming update will allow users to build their own test scenarios, enabling anyone to probe the specific AI tools that affect their daily lives.
