AI can lie, cheat and fail – so why are we still measuring it with vibes?


Generative AI is moving so fast that we’ve skipped the part where we check whether it actually does what it’s supposed to do. We’re in a strange limbo where chatbots can write essays, fake court transcripts and even emotionally manipulate users, yet developers claim their systems are “correctly aligned” because someone ticked a checklist.

Ethics matter, deeply. But ethics without evidence are just branding. If we say we want AI to be fair, accurate and trustworthy, we need to prove it in practice. And right now, we can’t. Not consistently, not comparably, and not in ways grounded in how these systems actually operate.

This isn’t just a technical gap – it’s a failure of imagination from previous governments. We have mountains of frameworks discussing bias, robustness, and trust as abstract concepts, with little acknowledgment that AI systems are built to do specific things.

If you don’t measure how well an AI performs on its intended task, across different groups and conditions, then you’re not measuring safety – you’re measuring sentiment. It’s like focusing on a car’s paintwork without checking whether the brakes work.

When AI fails, it often fails quietly and badly. A misquote. A mislabelled name. An exam flagged as plagiarised because a neurodiverse student thinks, and therefore writes, differently. This isn’t sci-fi. It is real-world harm caused by systems that haven’t been tested where it counts.

And when systems fail more often for certain groups – because of poor training data or incomplete testing – that’s not just a technical issue. That’s an ethical failure. Bias isn’t a philosophical concept; it’s a broken system. Until we treat it like one, we’ll keep hiding accountability behind complexity.

Trust in AI isn’t built through glossy PR. It’s built through competence. When an AI system makes decisions about healthcare, finance, policing or education, people deserve to know how it works, whether it works for them, and what happens when it doesn’t.

That means testing AI at the task level – repeatedly, and in the messy real world, not just in sterile labs. Fairness and robustness must be part of how we define success, not bolted on as afterthoughts.

This work is hard. It involves agreeing on standards across sectors, listening to experts and communities who’ve often been ignored, and resisting the temptation to prioritise shiny demos over solid foundations.

Organisations like my former employer, the British Computer Society, The Chartered Institute for IT (where I headed up policy), are pushing for this kind of practical, rigorous AI assurance.

We need more of that. Because if we don’t get it right, we’ll keep making the same mistakes – mistakes that hurt real people.

AI has the potential to transform society – especially in towns like Weston-super-Mare, which stand to gain from new efficiencies and public service improvements. That’s why assessing AI’s impact on places like mine must be central to any national strategy.

It’s encouraging to see this Labour government’s commitment to realising AI’s benefits – boosting economic growth, improving productivity and delivering better services across the UK. But I’ll always push for those benefits to be intentionally designed with towns like Weston in mind.

Ambition will fall short if we don’t build trust in the communities we want AI to serve – especially where scepticism about technology already runs high. And we can’t build that trust without real ethics, real transparency, and real accountability.

Ethics aren’t optional – but ethics without rigour are hollow. If we want trustworthy AI, we need to earn that trust with transparent, testable systems that work as well in Weston-super-Mare as they do in Silicon Valley.

That’s why this government is putting accountability at the heart of AI regulation. Our Data (Use and Access) Bill sets new standards for strong, enforceable data protection: criminalising the creation of intimate deepfake images, introducing smart data schemes to lower costs, and driving transparency in automated decision-making.

Skipping over rigorous standards isn’t a shortcut – it’s a liability. Trust has to be built, task by task, in the real world. There’s no other way. 

Daniel Aldridge, Labour MP for Weston-super-Mare
