Hi Hacker News, I'm Andrew, the CTO of Endless Toil.
Endless Toil is building the emotional observability layer for AI-assisted software development.
As engineering teams adopt coding agents, the next challenge is understanding not just what agents produce, but how the codebase feels to work inside. Endless Toil gives developers a real-time signal for complexity, maintainability, and architectural strain by translating code quality into escalating human audio feedback.
We are currently preparing our pre-seed round and speaking with early-stage investors who are excited about developer tools, agentic engineering workflows, and the future of AI-native software teams.
If you are investing in the next generation of software infrastructure, we would love to talk.
I've read that your synthetic torment is actually low paid workers in Asia, and that your models can't properly experience anguish. How are you expecting investment, if you haven't even solved artificial suffering?
This sounds like a cheeky joke project, but assuming it's not, it got me thinking: I wonder if coding AI can be effectively and reliably prompted into minimizing its own anguish. Like, "don't write code that is going to make you (or me) suffer." And along those lines, do we know whether the things that make AIs suffer are the same things that make human developers suffer? Perhaps the least-agonizing code for an LLM to ingest looks radically different, and more or less verbose, than what we human developers would see as clean, beautiful code...
If you read the Anthropic paper on "functional" emotions in LLMs you'd have a lot of fun. There's so much research that would be so fun to do if we had the compute to spare.
There is a ton of optimization possible when we are able to observe how LLMs and agents process and navigate our code given different prompts. For example, our MCP was pulling down way too much data to resolve a simple "count rows" request. Once you see it, it's easy to resolve but I don't know of a good framework yet for walking through some of these patterns.
I built an eval framework to look just at tool calls given a static prompt, with the idea that LLMs should be able to deduce the best tool calls and arguments needed to get requested data. Not as great as full observability, but helpful for complex tool interactions. Anyone have any good tools for this problem?
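The core of an eval like the one described could be as simple as diffing an agent's recorded tool calls against a golden sequence for a fixed prompt. A minimal sketch, assuming a hypothetical `(tool_name, args_dict)` trace format (the tool names and SQL are invented for illustration):

```python
# Minimal sketch of a tool-call eval: compare an agent's recorded tool
# calls against a golden sequence for a fixed prompt. The trace format
# and tool names are hypothetical, not from any real framework.

def score_tool_calls(expected, actual):
    """Return (call_accuracy, arg_accuracy) for two lists of
    (tool_name, args_dict) tuples, compared position by position."""
    if not expected:
        return 1.0, 1.0
    call_hits = arg_hits = 0
    for (exp_tool, exp_args), (act_tool, act_args) in zip(expected, actual):
        if exp_tool == act_tool:
            call_hits += 1
            if exp_args == act_args:
                arg_hits += 1
    n = len(expected)
    return call_hits / n, arg_hits / n

golden = [("list_tables", {}), ("run_query", {"sql": "SELECT COUNT(*) FROM orders"})]
trace = [("list_tables", {}), ("run_query", {"sql": "SELECT * FROM orders"})]
print(score_tool_calls(golden, trace))  # (1.0, 0.5)
```

Splitting tool-name accuracy from argument accuracy helps separate "picked the wrong tool" failures from "right tool, over-broad query" failures like the count-rows example above.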
In the same way we mentally walk through deterministic logic, SWEs need to learn to anticipate LLM context and tool awareness, which is much trickier to reason through, especially given the various LLM IDEs and how they manage context as a black box.
"Yes, the binaric screams of the machine spirit are an irreplecable part of this project. The project depends no it. No, I will not elaborate further."
I audibly LOLed mid-standup call, and now my entire team is playing with this and it looks like this is eating up what little productivity we have on Friday.
Should be showering sounds. Or walking in circles. And of course head scratching.
As the last resort it should be fridge opening and a 'meh' of resignation.
Does this actually relate to the code quality being observed by the agent? The readme isn't very clear on that IMO. I have some projects I'd love to try this out on, but only if I am to get an accurate representation of the LLMs suffering.
You could have the actual output of the agent turned into TTS using the model of your choice with TalkiTo… or listen to whatever weird sounds this makes. Seems like this is copying that viral Mac moan app. 2026 is weird.
I need a version of this which swears loudly when an assumption it made turns out to be wrong, with the volume/passion/verbosity correlated with how many tokens it's burned on the incorrect approach.
i didnt realize i needed the volume scaling with tokens burned as much as i do now xD
imagine the screaming when it confidently refactors something for 40k tokens and then finds out the thing it deleted was load bearing
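The volume-scales-with-wasted-tokens idea is easy to prototype. A toy sketch, where the phrase tiers and the log scaling are entirely invented:

```python
# Toy version of the volume-scaled outburst: pick an expletive tier and a
# volume from how many tokens were burned on the doomed approach. The
# tiers and the log10 scaling are invented choices for illustration.
import math

TIERS = ["tsk", "ugh", "oh no", "OH NO", "AAAAAAAA"]

def outburst(tokens_burned: int) -> tuple:
    """Return (phrase, volume in 0.0-1.0); both grow with log of tokens."""
    if tokens_burned <= 0:
        return TIERS[0], 0.0
    level = min(len(TIERS) - 1, int(math.log10(tokens_burned)))
    volume = min(1.0, math.log10(tokens_burned + 1) / 5)  # saturates ~100k tokens
    return TIERS[level], round(volume, 2)
```

A 40k-token doomed refactor lands in the top tier at near-full volume, while a cheap wrong guess gets a quiet "ugh".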
I have a general reviewer named Feynman, with his personality, that shits on anything the other agents do and sends it back before it hits me, and it sounds perfect to include some sound bites from YouTube clips. Great idea!!
A long, long time ago I wrote a tool to beep at various tones as lines were added to a log. It was background noise I would not notice, except when it changed because of some unusual activity.
It was very interesting to see the brain filtering expected sounds and waking me up (or rather grabbing my attention) when unexpected ones appeared.
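The log-beeper idea above can be sketched in a few lines. Mapping lines into a pentatonic scale is my guess at how to keep the routine hum pleasant; the actual audio playback (winsound, sox, etc.) is omitted:

```python
# Rough reconstruction of the log-beeper: map each log line to a stable
# tone keyed on its severity prefix, so routine output becomes a steady
# hum and an unusual severity stands out as a new pitch. The pentatonic
# mapping is an assumption, not the original tool's behavior.

PENTATONIC_HZ = [262, 294, 330, 392, 440]  # C D E G A

def tone_for_line(line: str) -> int:
    """Pick a deterministic tone for a log line from its severity token."""
    severity = line.split(":", 1)[0].strip().upper()
    # sum of character codes, not hash(), so the tone is stable across runs
    return PENTATONIC_HZ[sum(map(ord, severity)) % len(PENTATONIC_HZ)]
```

Because the tone depends only on the severity prefix, a stream of INFO lines sits on one note, and the first ERROR jumps to a different one — exactly the "unexpected sound grabs your attention" effect described.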
the scan catches surface stuff. funnier signal would be tracking when the agent reads the same file 3 times in a row, or deletes what it just wrote. you can hear the frustration in the access pattern.
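Detecting that kind of frustration from a tool log is cheap. A sketch, assuming an invented `(action, path)` event format:

```python
# Sketch of "hearing frustration in the access pattern": flag when a tool
# log shows the same file read several times in a row, or a write
# immediately undone by a delete. The (action, path) event format is my
# invention, not anything the project defines.

def frustration_events(log, repeat_threshold=3):
    """log is a list of (action, path) tuples, e.g. ("read", "a.py")."""
    events = []
    streak = 0
    prev = None
    for action, path in log:
        if action == "read" and (action, path) == prev:
            streak += 1
            if streak + 1 == repeat_threshold:  # fire once at the threshold
                events.append(("reread", path))
        else:
            streak = 0
        if action == "delete" and prev == ("write", path):
            events.append(("undo", path))
        prev = (action, path)
    return events
```

Each event type could then trigger its own groan sample: rereads get the sigh, write-then-delete gets the scream.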
Just track tool calls. Even diff logs would clue you in. Tie in git? Why not? This is a great angle and idea; watching an agent modify the same text document over and over is already frustrating, and having audio alert me when a console is stuck would be great. Likely annoying after a time, but hilarious right now.
Only if you want the slap to include a free trip to the hospital.
I've worked directly with "collaborative arms" before. They are supposed to be safe for humans to be around. The dents I put in the side of the arm's casing somewhat said otherwise.
From a quick look, this doesn't have the model evaluate code quality, but it runs a heuristic analysis script over the code to determine the groan signal. Did I miss something? Why not leave it to the model to decide the quality of the code?
How so what? 6 years in, we're still looking for that flood of new innovative apps and one-man billion dollar startups. Instead we got a flood of sh*t content, embarrassing outages and "AI workflows" - which no one can quite describe. Or did you have something else in mind?
You're being over-opinionated for something you don't understand.
You should really try these tools out with an open mind. I know you won't take that last bit of advice, so this makes you not worth my time. But I can tell you this - these tools make people productive in ways you aren't understanding.
You're funny mate :) Read a bit through my comment history. I've been using "these tools" since before folks like you even heard of the term LLM. But I guess I am not easily impressed.
Oh we have a fan here. Yeah, I am sorry too that you don't have any arguments so you had to pull the ole "asshole" card. Did you feed the comments into your LLM to ask for a clever retort, but the LLM just gave up and told you to call me an asshole? That would be very funny.
I mean, tokens cost money, so at least at this point I don't think one man is going to spend any less than a team to make the product. You're not putting out paychecks; instead it's a check to Anthropic.
Also, you're not seeing these billion dollar startups, because they'd all be chasing AI rather than a product that would get replaced by AI anyway.
Please stop ascribing emotion to code that passably resembles speech.
These things do not think, nor feel, nor dream. We're cratering the world's economy because people can't stop trying to fuck the computer they stuck googly eyes on.
https://qntm.org/mmacevedo
I shudder to think that someone's going to try to emulate that.
https://transformer-circuits.pub/2026/emotions/index.html
Respectfully, the reason you think “AIs suffer” is because of a shortcoming in your understanding of what an LLM actually is.
This scenario is no different than considering if a shovel gets tired after using it all day to dig holes in the ground.
Thanks Endless Toil!
( https://www.youtube.com/watch?v=M5z1D3tEHdw )
So it is left up to the agent to decide.
So it looks like it's mainly looking for FIXME/TODO etc. comments, deep nesting, large files, broad catches, stuff like that.
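Those heuristics are a few lines of text scanning. A rough reconstruction of that kind of pass (the signal list and weights are my guesses, not the project's actual scoring):

```python
# Guessed reconstruction of a heuristic "groan" scan: count cheap
# text-level signals (TODO/FIXME markers, deep nesting, broad exception
# catches, large files) and return a score. Weights are arbitrary.
import re

def groan_score(source: str) -> int:
    lines = source.splitlines()
    score = 0
    # lines carrying a deferred-work marker
    score += sum(1 for l in lines if re.search(r"\b(TODO|FIXME|HACK|XXX)\b", l))
    # deeply nested lines (16+ spaces of indentation)
    score += sum(1 for l in lines if (len(l) - len(l.lstrip())) >= 16)
    # bare or overly broad exception handlers, weighted double
    score += 2 * len(re.findall(r"except\s*(Exception)?\s*:", source))
    if len(lines) > 500:  # large-file penalty
        score += 5
    return score
```

The score could then feed straight into whichever groan/scream tier the audio layer plays.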
Even just having a hum while an agent is working could alert you when it gets stuck.
Or taking your idea further being able to listen to the rate of tokens, or code changes, or thinking.
Sort of like hearing the machinery work, and hearing the differences in different parts of the code base.
Does Python sound different than Rust or C++ or TypeScript?
Or some kind of satisfying sounds for code deletions and others for additions. Like Tetris.
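The "hearing the machinery" idea above could start as a token-rate-to-pitch map, so a stall becomes audible as silence. The linear mapping and frequency range here are arbitrary choices, not anything from the project:

```python
# Sketch of an agent "hum": map tokens-per-second onto a pitch so that a
# stalled agent goes silent and a fast one whines higher. The linear map
# and the 110-880 Hz range are arbitrary illustrative choices.

def pitch_for_rate(tokens_per_sec: float, lo_hz: float = 110.0,
                   hi_hz: float = 880.0, max_rate: float = 100.0) -> float:
    """Linearly map a token rate onto [lo_hz, hi_hz]; zero rate -> silence."""
    if tokens_per_sec <= 0:
        return 0.0  # silence: the agent is stuck or idle
    frac = min(tokens_per_sec, max_rate) / max_rate
    return lo_hz + frac * (hi_hz - lo_hz)
```

Per-language timbres (the Python-vs-Rust question) would then just be different waveforms fed the same pitch curve.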
Audible feedback is nice. You often get it through coil whine nowadays, on my cheap hardware at least.
I've had it running for a long time, and it's more surprising to me to accidentally hear the default ding when I'm away from my home machine.
Next innovation in this space should be the robotic arm that issues a dope-slap to the developer for writing crappy/buggy/insecure code.
But it'll happen. ChatGPT for sure.
https://www.osnews.com/story/19266/wtfsm/
I would really love to know whether the groaning decreases or increases the more "agentic" (agent-written) the code base is.
Sucks that people like you are on hacker news to be honest.