AI Can't Be Trusted to Tell Us When It Can't Be Trusted

Explore the critical trustworthiness gap in AI systems and how to design solutions that account for these limitations.

Author: Paweł Brodziński

Disclaimer: This is a post about AI trustworthiness, or the lack thereof. It is not, however, a blanket argument against AI.

We are, in fact, working on a solution that uses AI models to read handwriting. While the technical aspect of the work is a lot of fun, it triggers deeper considerations.

Let's start with a manual solution to the problem. A human would decipher someone's handwriting and type in the data. But wait, it's not that simple. Since handwriting can be messy, the person typing in the data may flag a record for proofing. Or mark it as unintelligible altogether.

At that moment, a human applies a skill from another dimension. Not only do they give the best match of what they think was written, but they also assess how trustworthy the record is.
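To make that second dimension concrete, here is a minimal sketch of the record a human transcriber effectively hands back (the names and labels are mine, invented for illustration): not just the text, but a trust assessment alongside it.

  from dataclasses import dataclass
  from enum import Enum

  class Trust(Enum):
      OK = "ok"                          # transcriber is confident
      NEEDS_PROOFING = "proofing"        # legible, but worth a second pair of eyes
      UNINTELLIGIBLE = "unintelligible"  # transcriber gave up

  @dataclass
  class Record:
      text: str     # the best match for what was written
      trust: Trust  # the second dimension: how much to trust it

  # A human produces both values at once:
  record = Record(text="42 Elm Street", trust=Trust.NEEDS_PROOFING)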

Non-obvious AI Limitations

Let's run the same process through an automated image recognition solution.

We'll produce output almost in an instant. Given enough training data, we can make our solution decently accurate. Still, it will be making mistakes.

One obvious source of errors is the handwriting itself. It won't always be possible to decipher the scribblings. On top of that, we have hallucinations: mistakes that would be obvious to a human looking at the output, yet look perfectly fine to the AI.

Let me defend AI here, though. I'd speculate that for every record the AI hallucinates, there's a human who knew the correct data but made a typing mistake. So, let's assume the two approaches level out in terms of quality.

There's a kicker, though.

AI won't tell us when it's "unsure" of its answer. There's no second-dimension skill that we inherently get from a human being.

Just try asking the LLM of your choice "Are you sure?" after any answer you receive, good or bad, and see what happens.
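A minimal sketch of that experiment, assuming the OpenAI Python SDK and an example model name (any chat-style LLM client would do; the specifics below are illustrative, not a recommendation):

  from openai import OpenAI

  client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

  def ask(history, prompt, model="gpt-4o-mini"):  # model name is just an example
      history.append({"role": "user", "content": prompt})
      reply = client.chat.completions.create(model=model, messages=history)
      answer = reply.choices[0].message.content
      history.append({"role": "assistant", "content": answer})
      return answer

  history = []
  print(ask(history, "In what year was the transistor invented?"))
  # Push back, regardless of whether the first answer was right:
  print(ask(history, "Are you sure?"))

Typically the model either apologizes and revises a perfectly good answer or confidently restates a wrong one. Neither reaction tracks actual correctness, which is the point.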


And that's the core of the problem. AI won't tell us when it can't be trusted. It's inherently untrustworthy.

It doesn't mean that we can't get it to produce decent outcomes. It means that when it does not deliver quality outcomes, we're left in the dark.

The Need for an Expert

A neat example comes from Simon Wardley. Every half a year or so, he plays a game of getting LLMs to extract key concepts from books in the form of a Wardley map.

While he can get a decent outcome, it relies on two things:

  1. He knows where AI makes mistakes because he knows the books he asks about.
  2. He uses the knowledge about the books to improve the prompts to make AI provide better answers.

In other words, the decent outcome is entirely based on Simon knowing the stuff. Which, by the way, means that he could have produced his maps easily with no AI help whatsoever.

Let's generalize this observation. We need an expert both to make AI work well enough and to tell when it does not.

The same expert whose presence we try to circumvent with the solution. Ultimately, if we have a free choice between Simon Wardley and an AI solution to generate a Wardley map for us (yup, the name correlation is not a coincidence), we'd go with Simon 10 times out of 10.

On the other hand, a layperson wouldn't be able to tell whether the first output generated by AI is any good. And if they guessed it wasn't, they wouldn't be able to guide the automated solution in getting better.

The trustworthiness issue would defeat them.

Against Naivety

As mentioned initially, it's not a call against AI-based solutions. If anything, it's a call against naivety.

We can say things like, "Our AI-based solution is 95% accurate, and so is the manual solution that's now in place. Thus, it's better as it's cheaper to maintain."

Before we high-five and call it a day, though, let's consider the following:

  • What happens when we are inaccurate? What's the cost? What's the fallback scenario?
  • How can we know when the output is wrong? Is there a way to automate it (to a degree)? A partial answer is sketched below.
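As a hedged illustration of that "to a degree": some wrong outputs can be caught mechanically when a field has known structure. The rules below are invented for the example; they catch impossible values, but not a plausible-looking value that happens to be wrong, which is exactly where the trustworthiness gap bites.

  import re
  from datetime import date

  def valid_birth_date(text: str) -> bool:
      """Flags structurally impossible dates; says nothing about plausible-but-wrong ones."""
      match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", text)
      if not match:
          return False
      try:
          parsed = date(*map(int, match.groups()))
      except ValueError:  # e.g. month 13 or February 30th
          return False
      return date(1900, 1, 1) <= parsed <= date.today()

  print(valid_birth_date("1987-02-30"))  # False: caught automatically
  print(valid_birth_date("1987-02-08"))  # True, even if the handwriting actually said "03"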

Right now, we often use human intuition, knowledge, and experience as validation and decision-making mechanisms when dealing with edge cases.

However, we want to replace manual human activity with AI solutions, thus removing intuition, knowledge, and experience from the system.

Is it a good tradeoff? It may be. It doesn't have to be, though. A simple accuracy match is not enough to tell.
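A back-of-the-envelope illustration of why a simple accuracy match is not enough (all the numbers are invented for the example): two systems with identical accuracy can carry very different costs once you account for whether their errors arrive flagged or silent.

  records = 100_000
  error_rate = 0.05            # both systems are "95% accurate"

  # Invented costs for the example:
  cost_flagged_error = 1.0     # a flagged record gets a quick human re-check
  cost_silent_error = 10.0     # a wrong record flows downstream unnoticed

  # Human transcribers flag most of their own doubtful records;
  # the AI flags none, because it doesn't know when it's unsure.
  human_flag_rate = 0.8
  ai_flag_rate = 0.0

  def expected_cost(flag_rate):
      errors = records * error_rate
      return errors * (flag_rate * cost_flagged_error +
                       (1 - flag_rate) * cost_silent_error)

  print(expected_cost(human_flag_rate))  # 14000.0
  print(expected_cost(ai_flag_rate))     # 50000.0

Same accuracy on paper, more than three times the expected cost of errors, purely because one system tells us where to look and the other does not.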

Trustworthiness Issue

Interestingly, it all boils down to trust. We can't trust either a human or a machine to produce a perfect result.

However, we can trust a well-meaning person to tell us when they don't know the answer or are unsure how good the answer is.

The machine is incapable of showing us the same courtesy.

AI can't be trusted to tell us when it can't be trusted.

We must design AI-based systems with that observation at the core of the solution. That, in turn, will make them more complicated, more expensive, and much less of a quick fix than we are led to believe.
