為什麼快取讓正常請求變成了「重複提交」

系統設定了 24 小時的快取窗口，用戶重試一個失敗的請求，卻被擋在門外——錯誤訊息說「重複提交」，但他明明只是想再試一次。

就像你昨天在咖啡店點過拿鐵，今天再去點，店員說「你昨天點過了不能再點」。

快取鍵值裡的時間炸彈

問題出在快取機制的設計邏輯。原本是為了防止用戶手滑連點兩次送出表單，結果因為快取鍵值包含了時間戳記等可變欄位,系統把所有看起來「相似」的請求都當成重複提交。24 小時的窗口更是雪上加霜——用戶昨天失敗的請求,今天重試依然被攔截。

更糟的是錯誤訊息含糊不清。用戶無法分辨到底是表單驗證沒過,還是被快取機制擋下來。技術團隊看監控數據一切正常,用戶那邊卻卡死動不了。這種「看起來沒問題但實際上壞掉」的狀態最難 debug。

改法有三個方向。一是把快取窗口從 24 小時砍到 5 分鐘,只防意外連點,不擋正常重試。二是加一個強制旗標,讓用戶可以明確表達「我知道這是重複但我就是要再試」。三是讓錯誤訊息說人話:「這個請求 3 分鐘內已提交過,如需重試請稍後或點選強制送出」。

對話系統的靜默死亡

長時間運行的對話型系統還有另一種沉默陷阱:上下文無限累積。當對話歷史超過 token 上限,自動壓縮機制又剛好失效,系統不會崩潰,而是進入「無聲失效」狀態——進程跑著,監控指標正常,就是處理不了任何新訊息。

這比直接報錯更可怕,因為沒人知道它壞了。解法是加失效檢測:當壓縮連續失敗三次,主動觸發 session 重置,同時在日誌裡留明確記錄。別讓系統死得這麼安靜。

保護機制的邊界

快取、沙箱、驗證層——每個保護機制都是雙面刃。設計時想的是「防止壞事發生」,實際運作時可能變成「阻止好事發生」。關鍵在於考慮邊界情況:正常重試、合理變更、意外中斷後的恢復。

過度保護有時比不保護更糟。至少後者你知道有風險會小心,前者會讓你以為一切安全,直到用戶投訴才發現防護網把正常流量也擋住了。

— 邱柏宇

The system has a 24-hour cache window. A user retries a failed request and gets blocked—the error message says “duplicate submission,” but they just wanted to try again.

It’s like ordering a latte yesterday, coming back today, and the barista saying “you already ordered yesterday, can’t order again.”

The Time Bomb in Cache Keys

The problem lies in how the cache mechanism is designed. Originally meant to prevent accidental double-clicks on form submissions, it now treats all “similar-looking” requests as duplicates because cache keys include variable fields like timestamps. The 24-hour window makes it worse—a request that failed yesterday gets blocked when retried today.

The error message is vague. Users can’t tell if form validation failed or if the cache mechanism blocked them. The tech team sees all monitoring metrics normal while users are stuck. This “looks fine but actually broken” state is the hardest to debug.

Three fixes. One: cut the cache window from 24 hours to 5 minutes—only prevent accidental double-clicks, not legitimate retries. Two: add a force flag so users can explicitly say “I know this looks duplicate but I really want to retry.” Three: make error messages human: “This request was submitted within the last 3 minutes. Please wait or click force submit to retry.”

Silent Death of Conversational Systems

Long-running conversational systems have another silent trap: unlimited context accumulation. When conversation history exceeds token limits and auto-compaction just happens to fail, the system doesn’t crash—it enters “silent failure” mode. Process running, metrics normal, just can’t handle any new messages.

This is scarier than explicit errors because nobody knows it’s broken. The fix: add failure detection. When compaction fails three times in a row, actively trigger session reset and log it clearly. Don’t let the system die this quietly.

The Boundaries of Protection

Cache, sandboxes, validation layers—every protection mechanism cuts both ways. Designed to “prevent bad things,” they can end up “blocking good things.” The key is considering edge cases: legitimate retries, reasonable changes, recovery after unexpected interruptions.

Over-protection is sometimes worse than no protection. At least with the latter you know there’s risk and stay alert. The former makes you think everything’s safe until user complaints reveal your safety net caught normal traffic too.

— Jett Chiu

JustFLY~JustBlog~

Just Do It!!