Ask HN: Why does every AI demo sound perfect but real world deployment always disappoints? (news.ycombinator.com)

8 points|by VaderAi|2d ago|12 comments

Working on AI voice for small businesses. The gap between what AI can do in a controlled demo vs messy real world phone calls is eye opening.

Comments (12)

1. lukayork|2d ago|context

Because the marketing is more important than the product. Even if the product itself is ordinary, if you get traffic through hype, you will have a source of income. This is very important.
2. pradeep1177|2d ago|context

I think the expectation that AI will be the silver bullet in a domain where human behaviour keeps changing is just futile.
3. andyish|2d ago|context

That's the story of almost every product ever.
If you're selling a product, the promotional material has to be on point and amazing just to hook people in.
4. penpendian|2d ago|context

Maybe devs should let up to X potential users use the product for free. Then the market can have a consolidated, down-to-earth, slappable review instead of a loveSpaceX pumping heart.
5. journal|2d ago|context

Because the caller is not experienced with the options or how to execute them correctly: what words to say, how long a pause to take. Then there are background noises and various other triggers. We know how the automated system works, but the caller is just angry if they don't get a human.
6. DuzAwe|1d ago|context

Gg
7. CM30|1d ago|context

Because demos show a product or service operating in its ideal environment with the ideal user flow. The person showcasing it knows exactly what the system can and cannot do, and how to make it look good in front of a crowd.
Real users are more complicated, and have specific demands that run into weird edge cases that the developers haven't thought of or tested properly. Like for an AI voice recognition system, can it deal with every accent under the sun? What about people with speech impediments or weird ways of pronouncing words? What about users who don't know what to say or how to say it?
A demo only has to work in one specific way. A real product has to cover hundreds or thousands of edge cases its creators may not have even thought of.
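To make the "hundreds or thousands of edge cases" point concrete, here is a minimal sketch of how a few independent caller conditions multiply. The dimension names and values are illustrative assumptions drawn loosely from this thread, not a real test plan.

```python
from itertools import product

# Hypothetical test dimensions for a voice agent. The values are
# illustrative only and do not come from any real deployment.
DIMENSIONS = {
    "accent": ["regional", "non-native", "speech impediment"],
    "audio": ["quiet room", "street noise", "bad cell connection"],
    "caller": ["knows what to say", "unsure what to say", "rambles"],
    "intent": ["book appointment", "wrong number", "supplier invoice"],
}


def scenario_matrix(dimensions):
    """Return every combination of one value per dimension, as dicts."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


scenarios = scenario_matrix(DIMENSIONS)

# A demo typically exercises a single, friendly combination.
demo_path = {
    "accent": "regional",
    "audio": "quiet room",
    "caller": "knows what to say",
    "intent": "book appointment",
}

print(len(scenarios))          # 3 * 3 * 3 * 3 = 81 combinations
print(demo_path in scenarios)  # True: the demo is one of them
```

Four dimensions with three values each already give 81 scenarios, and the demo covers one. Adding a fifth dimension would triple that again.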
8. lemonademan|1d ago|context

If you ask everyone this question, I genuinely believe they would tell you that demos always have to oversell the product, but a professional will tell you that this is because AI demos are crafted in highly controlled environments with people who know the full potential of the AI. I genuinely believe they are both correct in most cases, or at least one is correct. The average user of most products like AI doesn't know the AI's full potential, or they know the AI's full potential but still don't use it to its full capacity, and I believe most users are the former rather than the latter. In some cases, some don't even know how to fully operate it when they get their hands on it, which could make it seem more like a disappointment. Either way, the reason depends on the AI, the person using it, and the environment it is used in. I hope this can be helpful, but also take my opinion with a grain of salt, as I am not an authority on this subject. However, I hope this was helpful either way.
9. VaderAi|1d ago|context

Exactly this. Real callers don't behave like demo callers. Heavy accents, background noise, someone who rambles for 2 minutes before getting to the point, wrong number calls, suppliers calling about invoices.
A demo never tests any of that. Real deployment teaches you more in a week than a year of testing.
10. longtermop|1d ago|context

The biggest gap is AI doesn't have the ability to self-correct and self-learn like humans do.
We're working on fixing that with parcle.ai/second-brain. Beta will be rolling out in a week.
11. chrisjj|6h ago|context

> The biggest gap is AI doesn't have the ability to self-correct and self-learn like humans do.
Like intelligence does, actually.
12. brotchie|16h ago|context

You don't need quality or flexibility for an AI demo.
Hand-crafting a voice agent to schedule simple appointments for a barber in San Francisco, where the caller is in a quiet environment, is a one-day exercise in prompt engineering.
Building a voice agent to schedule real appointments (on a real calendar of a working business) for real customers, for any business type, in any city: significantly more difficult. Real customers can be on a bad cell connection, have background noise, or worse, there's somebody in the background having another conversation.
Building a working agent isn't the hard part of building a real world agent. It's establishing human and offline evals, identifying loss patterns, hill climbing, capturing and processing user feedback, doing hacks to deal with model limitations, learning how an agent-driven conversation has to be subtly different from a real human conversation, and so on.
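As a rough illustration of "offline evals" and "identifying loss patterns", here is a minimal sketch that replays labeled transcripts through an agent and ranks call conditions by failure rate. Everything in it is an assumption for illustration: `EvalCase`, `run_offline_eval`, `toy_agent`, and the sample transcripts are hypothetical, and a real harness would work on audio and far richer outcomes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class EvalCase:
    transcript: str  # what the caller said (hypothetical recorded call)
    expected: str    # outcome a human reviewer labeled as correct
    condition: str   # call condition, used to group failures


def run_offline_eval(agent, cases):
    """Run each case through `agent`; return failure rate per condition."""
    totals = Counter()
    failures = Counter()
    for case in cases:
        totals[case.condition] += 1
        if agent(case.transcript) != case.expected:
            failures[case.condition] += 1
    rates = {cond: failures[cond] / totals[cond] for cond in totals}
    # Worst conditions first: these are the loss patterns to work on.
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


def toy_agent(transcript):
    """Stand-in agent: books only if it hears the word 'book'."""
    return "book" if "book" in transcript.lower() else "handoff"


CASES = [
    EvalCase("I'd like to book a haircut", "book", "quiet"),
    EvalCase("Book me in for Tuesday", "book", "quiet"),
    EvalCase("uh can I get a, um, a trim on friday", "book", "rambling"),
    EvalCase("I need to boo-- [static] --ointment", "book", "bad connection"),
    EvalCase("Calling about an unpaid invoice", "handoff", "supplier"),
]

report = run_offline_eval(toy_agent, CASES)
for condition, rate in report:
    print(f"{condition}: {rate:.0%} failed")
```

The toy agent passes every quiet-environment case and fails the rambling and bad-connection ones, which is the demo-versus-deployment gap in miniature. Hill climbing then means changing the agent and rerunning the same cases to see whether the worst rates drop.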