NewsLab
Apr 29 02:52 UTC

We decreased our LLM costs with Opus (mendral.com)

36 points | by shad42 | 7 comments

Comments (7)

  1. wxw
    > We switched to the "triager" pattern: a Haiku agent with a very specific and narrow job. Is this issue already tracked or not? If it is, stop right there. If not, escalate to Opus.

    > 4 out of 5 failures never reach Opus. A triager match costs around 25x less than a full investigation.

    The title feels misleading. Why use a clickbait title when you could just be upfront about the architecture?
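A minimal sketch of the "triager" pattern quoted above, assuming the two agents are plain callables. The names `route_failure`, `triage`, `investigate`, and `expected_relative_cost` are hypothetical and do not come from the article. The cost helper also assumes that every failure pays for triage and that a triage miss costs the same as a triage match, neither of which the article states.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    handled_by: str               # "triager" or "investigator"
    tracked_issue: Optional[str]  # issue id if the failure was already tracked
    report: Optional[str]         # output of the expensive investigation, if run


def route_failure(
    failure_log: str,
    triage: Callable[[str], Optional[str]],  # cheap agent (hypothetical stand-in)
    investigate: Callable[[str], str],       # expensive agent (hypothetical stand-in)
) -> Result:
    """Ask the cheap agent first; escalate only when the issue is not tracked."""
    issue_id = triage(failure_log)
    if issue_id is not None:
        # Already tracked: stop here, the expensive agent is never called.
        return Result("triager", issue_id, None)
    return Result("investigator", None, investigate(failure_log))


def expected_relative_cost(match_rate: float, triage_cost_ratio: float) -> float:
    """Average cost per failure, relative to a full investigation (= 1.0).

    Assumption: every failure pays for triage, and only unmatched
    failures additionally pay for the full investigation.
    """
    return triage_cost_ratio + (1.0 - match_rate) * 1.0
```

Under those assumptions, the quoted figures (4 out of 5 failures stopped at triage, triage about 25x cheaper) give `expected_relative_cost(0.8, 1 / 25)`, which is about 0.24 of the cost of investigating every failure in full.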

  2. idorosen
    The title does not match the article title: “We Upgraded to a Frontier Model and Our Costs Went Down”.
  3. stingraycharles
    It’s still misleading, though.
  4. cadamsdotcom
    I have rewritten the article to be slightly shorter:

    “Let a cheap agent decide if the expensive one is needed.”

  5. a_t48
    Sounds like L1 vs L2 support :)
  6. whalesalad
    Looking at the diagram, is this seriously a case of replacing basic functional concepts like "write to clickhouse" or "have we seen this before" with a model? Couldn't those be actual function calls in some language?

    It just seems wasteful all around. Having an agent in the critical path when a regular expression (or similar) could do the job seems odd. Yeah, haiku is cheap, but re.match() is cheaper.
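A rough sketch of the deterministic check this comment has in mind, assuming known failures can be recognized by text signatures. The issue ids and patterns are made up for illustration. It uses `search` rather than `re.match()`, since `match` only matches at the start of the string and an error signature can appear anywhere in a log.

```python
import re
from typing import Optional

# Hypothetical signatures for already-tracked issues (illustrative only).
KNOWN_SIGNATURES = {
    "ISSUE-101": re.compile(r"connection reset by peer", re.IGNORECASE),
    "ISSUE-202": re.compile(r"timeout after \d+s"),
}


def match_known_issue(log: str) -> Optional[str]:
    """Return the id of the first tracked issue whose signature appears in the log."""
    for issue_id, pattern in KNOWN_SIGNATURES.items():
        if pattern.search(log):
            return issue_id
    # No signature matched: this is where a model-based triager could take over.
    return None
```

A check like this could sit in front of the cheap agent, so the model is only consulted for failures that no pattern recognizes. Whether regexes cover enough real failures to replace the triager is a separate question that the thread does not settle.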

  7. saltyoldman
    I do a similar thing with a "planner agent" that uses the cheapest model (I think it's openai-gpt-5.2-mini or something, at like 20 cents per 1M tokens). It more or less emits a plan name and a task list, and each task in the list has a recommended model. It's not perfect, but many of our tasks are accomplished with lighter-weight models. For code generation or fixing we upgrade to a more expensive model; planning and decisions are done more cheaply. Keep in mind the tasks are relatively constrained, so planning with a cheap agent makes sense here. An open-ended agent would likely need a more expensive call for planning.
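One way the planner output described above might be represented, as a sketch only. The `Plan` and `Task` types, the `run_plan` helper, and the model labels are all hypothetical; the comment does not show the actual format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    description: str
    recommended_model: str  # label chosen by the planner, e.g. "cheap" or "expensive"


@dataclass
class Plan:
    name: str
    tasks: List[Task]


def run_plan(
    plan: Plan,
    models: Dict[str, Callable[[str], str]],  # label -> callable wrapping a model
    default_model: str,
) -> List[str]:
    """Run each task on its recommended model, falling back to the default label."""
    outputs = []
    for task in plan.tasks:
        # An unknown recommendation falls back to the default instead of failing.
        handler = models.get(task.recommended_model, models[default_model])
        outputs.append(handler(task.description))
    return outputs
```

The fallback to a default label is a design choice for the sketch: since the planner is described as "not perfect", an unrecognized recommendation is routed somewhere sensible rather than raising an error.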