Hacker News — vinext + Cloudflare Workers

Claude Played Me for a Fool (ramblingafter.substack.com)

7 points by paulpauper 2 hours ago | 4 comments

Wowfunhappy 7 minutes ago [-]

I suspect part of the problem is that the author is fighting the system prompt, which gives Claude instructions to help it avoid filling up its context window.

So the author thinks he's giving Claude this instruction:

> You must re-read CLAUDE2.md, even if you've already read it before.

But the actual instruction is closer to:

> Do not re-read files you have already read. You must re-read CLAUDE2.md, even if you've already read it before.

So Claude has conflicting instructions. Is it any surprise that it tries to thread the needle by re-reading the minimal amount of CLAUDE2.md necessary? It's just doing its best to satisfy both masters!

pornel 47 minutes ago [-]

LLM agents have plenty of "bad habits" that are impossible to get rid of. I suspect they're a side effect of reinforcement learning. The training objective rewards fewer tokens, so the results just need to be good enough most of the time while cutting as many corners as possible.

Similarly, I'm trying to stop agents from "gracefully" handling errors by stuffing results with empty junk and continuing (get_list_of_problems().unwrap_or_default() -> "no problems found!"). I've filled AGENTS.md with "fail closed", "extremely strict error handling", "no fallbacks", "don't use sentinel values", and hundreds of variations of these, but they work about as well as "do not hallucinate". I get "You're absolutely right, this will cause problems!" and then the fix is "changed to Err(_) => String::new()". I suspect it's another case of gaming RL: failing early and loudly increases the chance of failing and being penalized, so fudging data, ignoring errors, and presenting a barely-working result is a better strategy overall. When it fails, it fails anyway, but as long as it stumbles to the finish line it has a non-zero chance of getting accepted by the RL judge.
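To make the difference concrete, here is a minimal sketch of the fail-open pattern next to a fail-closed one. Only `get_list_of_problems()` and `unwrap_or_default()` come from the comment above; the `ScanError` type, the `scanner_ok` flag, and both `report_*` functions are hypothetical names invented for illustration.

```rust
use std::fmt;

/// Hypothetical error type, invented for this illustration.
#[derive(Debug, PartialEq)]
pub struct ScanError(pub String);

impl fmt::Display for ScanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "scan failed: {}", self.0)
    }
}

impl std::error::Error for ScanError {}

/// Stand-in for the comment's `get_list_of_problems()`.
/// `scanner_ok` (an assumption) simulates whether the scan could run at all.
pub fn get_list_of_problems(scanner_ok: bool) -> Result<Vec<String>, ScanError> {
    if scanner_ok {
        Ok(Vec::new()) // the scan ran and found nothing
    } else {
        Err(ScanError("scanner crashed".to_string()))
    }
}

/// Fail-open: the error collapses into an empty list, so a crashed
/// scan is indistinguishable from a clean one.
pub fn report_fail_open(scanner_ok: bool) -> String {
    let problems = get_list_of_problems(scanner_ok).unwrap_or_default();
    if problems.is_empty() {
        "no problems found!".to_string()
    } else {
        format!("{} problem(s) found", problems.len())
    }
}

/// Fail-closed: `?` propagates the error, so the caller has to deal with it.
pub fn report_fail_closed(scanner_ok: bool) -> Result<String, ScanError> {
    let problems = get_list_of_problems(scanner_ok)?;
    Ok(if problems.is_empty() {
        "no problems found!".to_string()
    } else {
        format!("{} problem(s) found", problems.len())
    })
}
```

With a crashed scanner, `report_fail_open(false)` still returns "no problems found!", while `report_fail_closed(false)` returns an `Err` that cannot be mistaken for a clean result.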

IronWolve 2 hours ago [-]

I noticed this when it would only read a few files from my project, and I had to ask it to read ALL the files.

I then had it make a mistakes file and write down every mistake, so it would learn. It kinda worked, but it would still make the mistakes. It clearly wasn't reading all of it.

So I made a checklist, and it had to verify every item on the checklist. That was my workaround for both the laziness and the short-mindedness of the agents: turn mistakes into items to check for. I traded processing time for better results, which is OK for me on smaller projects. My run times went from 3 minutes to 5-10 minutes per task, so I need to start logging task effectiveness/efficiency to reduce processing time.
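The checklist idea can be sketched roughly as below. This is only an illustration of "turn mistakes into items to check for", not the actual setup described above: `CheckItem`, `failed_checks`, `example_checklist`, and the two sample checks are all hypothetical, and a real checklist would likely be verified by the agent itself rather than by string matching.

```rust
/// One checklist entry: a past mistake turned into a check (hypothetical type).
pub struct CheckItem {
    pub description: &'static str,
    /// Returns true when the output passes this check.
    pub check: fn(&str) -> bool,
}

// Two sample checks, invented for illustration.
fn no_swallowed_errors(output: &str) -> bool {
    !output.contains("unwrap_or_default")
}

fn no_todo_left(output: &str) -> bool {
    !output.contains("TODO")
}

/// Builds a small example checklist from the sample checks above.
pub fn example_checklist() -> Vec<CheckItem> {
    vec![
        CheckItem {
            description: "no unwrap_or_default() hiding errors",
            check: no_swallowed_errors,
        },
        CheckItem {
            description: "no TODO left behind",
            check: no_todo_left,
        },
    ]
}

/// Runs every item against the agent's output and returns the
/// descriptions of the items that failed. An empty result means
/// the whole checklist passed.
pub fn failed_checks(output: &str, checklist: &[CheckItem]) -> Vec<&'static str> {
    checklist
        .iter()
        .filter(|item| !(item.check)(output))
        .map(|item| item.description)
        .collect()
}
```

Every item runs on every task, which is where the extra processing time comes from; logging which items actually fail would show which ones are worth keeping.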

I keep seeing people say loop engineering is the way to get around these issues. I guess I'm kinda doing that in an ad hoc way, since I'm already looking at adding cost and goals (kinda).

pram 27 minutes ago [-]

Not only does Claude not look at all the files, it doesn’t even look at the entire contents of the files it does read, since the tool seems to be a pager!

Rendered at 20:51:18 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.