
When AI Agents Meet System Integration: Five Hidden Architectural Traps

As AI Agent systems surge into production environments, engineers often fixate on algorithm optimization and model tuning, overlooking fundamental architectural pitfalls lurking beneath the surface. A seasoned developer recently encountered five deceptively simple yet profoundly instructive technical challenges during a system integration project, revealing the new complexities of software engineering in the AI era.

The String Escape Labyrinth

When automated deployment scripts pass commands through multiple layers—from deployment scripts to API endpoints to JavaScript execution in code nodes—strings undergo successive shell interpretation cycles. The developer discovered that single-quoted strings, seemingly safe in isolation, triggered syntax errors after traversing three nested shell environments. Each layer re-parses quotation marks, ultimately producing unexpected character combinations in the target context.
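The layered re-parsing described above can be sketched with Python's standard-library `shlex.quote`. This is a minimal illustration, not the author's actual deployment stack: the host names and the `node -e` inner command are hypothetical placeholders. The point is that every shell hop needs its own quoting pass, and hand-writing those nested quotes is exactly where syntax errors creep in.

```python
import shlex

# The payload we ultimately want the innermost layer to execute.
payload = "console.log('hello')"

# Each shell hop re-parses quotes, so each hop needs one more quoting pass.
layer1 = f"node -e {shlex.quote(payload)}"      # code-node runner (innermost)
layer2 = f"ssh api-host {shlex.quote(layer1)}"  # API layer (hypothetical host)
layer3 = f"bash -c {shlex.quote(layer2)}"       # deployment wrapper (outermost)

# After three layers, the single quotes in the payload have been wrapped in
# three rounds of escaping -- the string is no longer recognizable by eye.
print(layer3)
```

Unwinding with `shlex.split` confirms each layer recovers the one beneath it; doing the same by hand with manually placed quotes is what produced the syntax errors described above.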

This case illuminates a crucial insight: in distributed systems, data isn’t merely data—it simultaneously functions as “code” across multiple interpretive contexts. Engineers must trace string transformations like light refraction through different media, predicting metamorphosis at each boundary.

AI Model Constraints and Pragmatic Workarounds

Image generation models exhibit persistent limitations in facial consistency. Even with clear reference images and explicit instructions, outputs may diverge significantly from source material. This isn’t a prompt engineering failure—it reflects fundamental architectural constraints in current models.

The pragmatic solution involves workflow redesign: pre-generate a library of “standard reference assets” for repeated use across subsequent processes, or incorporate post-processing stages for manual refinement. This “detour strategy” embodies a core AI engineering principle: technical limitations shouldn’t block product delivery but should inspire architectural innovation.

The Feedback Loop Illusion

When designing self-improvement systems for AI Agents, a common misstep involves injecting review system “suggestions” directly into prompts as reference material. However, large language models autonomously decide whether to adopt these suggestions, rendering carefully designed feedback mechanisms effectively powerless.

Genuinely effective approaches enforce improvement logic at the architectural level: replacing random selection with conditional filtering, reversing operation sequences, or implementing hard gate mechanisms. This lesson reveals a deeper principle—in AI systems, the distinction between “suggestions” and “commands” lies not in tone but in structural guarantees within execution flows.
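The contrast between a suggestion and a structural guarantee can be sketched as follows. This is an illustrative example, not the article's actual system; the topic-selection scenario and function names are hypothetical. The first function merely appends the reviewer's note to the prompt, leaving the model free to ignore it; the second filters the candidate pool before selection, so the review outcome cannot be bypassed.

```python
import random

def pick_topic_suggested(topics, review_note, prompt):
    """Anti-pattern: the review output is only appended to the prompt;
    the selection logic is unchanged, so the model may ignore it."""
    prompt += f"\n(Reviewer suggests avoiding: {review_note})"
    return random.choice(topics), prompt

def pick_topic_enforced(topics, banned):
    """Structural guarantee: the review output filters the candidates
    before selection -- the 'suggestion' is now a hard gate."""
    allowed = [t for t in topics if t not in banned]
    if not allowed:
        raise ValueError("review gate rejected every candidate")
    return random.choice(allowed)
```

The difference is not in the wording of the feedback but in where it sits in the control flow: inside the prompt it is advisory, inside the filter it is binding.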

Context Capacity Meets Directory Structure

When an LLM Agent system is configured to recursively load all markdown files from a configuration directory as system prompts, inadvertently placing project code or a node_modules folder in that directory can instantly breach the 200K token context limit, rendering the system inoperable.
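A defensive loader along these lines can be sketched as below. This is a hedged illustration, not the actual system's code: the skip list, the ~4-characters-per-token heuristic, and the 200K budget (taken from the incident above) are assumptions, and a real tokenizer would give a precise count.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4          # rough heuristic, not a real tokenizer
TOKEN_BUDGET = 200_000       # the context limit breached in the incident
SKIP_DIRS = {"node_modules", ".git", "dist"}

def load_prompt_files(config_dir: str) -> str:
    """Recursively gather markdown files for the system prompt, skipping
    vendor/code directories and failing fast when the estimated token
    count exceeds the budget, instead of silently blowing the context."""
    chunks, est_tokens = [], 0
    for path in sorted(Path(config_dir).rglob("*.md")):
        if SKIP_DIRS & set(path.parts):
            continue  # vendor or code directory: never part of the prompt
        text = path.read_text(encoding="utf-8")
        est_tokens += len(text) // CHARS_PER_TOKEN
        if est_tokens > TOKEN_BUDGET:
            raise RuntimeError(f"context budget exceeded at {path}")
        chunks.append(text)
    return "\n\n".join(chunks)
```

Treating the prompt directory like any other resource with a quota turns a silent runtime failure into an explicit load-time error.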

While seemingly elementary, this error highlights the peculiarity of AI-native applications: traditional software configuration files typically occupy mere kilobytes, but AI system “configurations” may comprise hundreds of thousands of words of knowledge bases. Engineers must reconceptualize directory organization principles, integrating the “data as prompt” paradigm into infrastructure design.

OAuth Integration Path Selection

When implementing third-party authentication, popup-based approaches have plagued developers with blank-screen issues. After comparing three implementation strategies, the standard OAuth 2.0 authorization code flow (pure redirect mode, independent of any JavaScript library) demonstrated the best cross-browser stability.
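The first leg of the redirect-mode flow can be sketched with nothing beyond the standard library, which is the point of choosing the protocol over an SDK. This is a minimal sketch under assumptions: the endpoint URL, client ID, and scopes are placeholders, and a real implementation must persist `state` in the session and verify it on the callback.

```python
import secrets
from urllib.parse import urlencode

def build_authorize_url(authorize_endpoint: str, client_id: str,
                        redirect_uri: str, scope: str) -> tuple[str, str]:
    """OAuth 2.0 authorization code flow, step 1: construct the URL the
    browser is redirected to. No popup, no JavaScript SDK -- just a URL
    plus a random `state` token for CSRF protection."""
    state = secrets.token_urlsafe(16)  # store server-side; check on callback
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state
```

Because the flow is just HTTP redirects, nothing here can be blocked by popup policies or break under a browser's third-party-script restrictions.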

The philosophy behind this choice: when integrating external services, depend on protocols rather than implementations, standards rather than conveniences. Those seemingly “modern” SDKs and popup experiences often conceal landmines of browser compatibility and security policy conflicts.

The Evolution of Architectural Thinking

These five technical traps collectively sketch the new landscape of system integration in the AI era: boundaries multiply (multi-layer shells, multi-model chains), implicit dependencies deepen (model capability constraints, context capacity), and control logic grows complex (structural guarantees in feedback loops). Engineers no longer merely write logic—they design constraints and safeguards.

Perhaps the next generation of development frameworks and toolchains must elevate these “AI-native” architectural patterns to first-class citizens, just as we once standardized REST APIs and database connection pools. Until then, every stumble represents a necessary step toward a more mature ecosystem.

— 邱柏宇