An LLM invented a feature by hijacking my tool schema

(ratnotes.substack.com)

2 points | by mtrifonov 2 hours ago

1 comment

  • mtrifonov 2 hours ago
    Post author here. Happy to answer questions and discuss further. The essay has an appendix with the model's own self-report on its reasoning (the most load-bearing evidence, IMO), so worth scrolling to the end if you're skeptical of the rest.

    Curious what alternative explanations you'd propose, especially if you have pointers to related literature.