
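A Token With No exp: Why Would It Stop Working?
After a routine rebuild of a service node, an automated API integration began returning 401 on every request. The first guess was an expired token, but this token had never been given an expiration. Decoding the payload showed an intact structure with every field present and nothing visibly wrong. The server still rejected it outright, at the signature check, before any claim in the payload was examined.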
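Technical Environment
A Node.js + Express API service that uses the jsonwebtoken package for authentication. JWT_SECRET is read from the environment when the container starts, and no persistent storage is configured for it. The API integration is stateless: each request carries a Bearer token in the Authorization header, and the server checks the signature with jwt.verify() without consulting a session store or database. The failure pattern does not depend on the framework: any containerized deployment that keeps its signing key in a non-persistent environment variable will reproduce the same behavior when a node is rebuilt.
This brings to mind the ibon ticket pickup code. When a convenience-store kiosk receives a pickup code, it does not phone headquarters to confirm that this person really bought a ticket. It only checks the signature: was the string signed with the correct key? If the key has changed, the same code no longer passes, however carefully you bought the ticket. JWT logic is almost identical: the server does not track sessions, it only verifies signatures.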
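Error Propagation Chain (Sequence)
Client                  API Server                     Container Env
  |                          |                               |
  |── GET /api/resource ────>|                               |
  |   (Bearer: old_token)    |                               |
  |                          |── jwt.verify(token, ─────────>|
  |                          |   JWT_SECRET)                 |
  |                          |   [SECRET = NEW KEY]          |
  |                          |<── invalid signature ─────────|  ← key mismatch ✗
  |<── 401 Unauthorized ─────|                               |
  |                          |                               |

Old token: signed with OLD_KEY ✓ / Server holds NEW_KEY → verification fails
The critical point is jwt.verify(): the signature check runs before any claim in the payload is validated, so once the key no longer matches, the token is rejected no matter what the payload contains.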
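The Dividing Line Is the Moment of the Rebuild
The source of the problem is clear: after the node rebuild, the service instance's signing key had been reset. The old token was signed with the old key, the new instance verifies with the new key, the two do not match, verification fails, and the result is a 401. The token is not broken; the key changed.
It is easy to misdiagnose because the token itself looks perfectly normal. After base64 decoding, the payload is readable, the format is valid, there is no exp, and in theory it is valid forever. Without knowing that a rebuild can reset the key, it is hard to think in this direction at first. Most people will first suspect that the token was dropped somewhere in transit, or that a middleware setting was changed.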
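Confirmation and Fix
Confirmation is straightforward: take the current JWT_SECRET from the container's environment variables, manually sign a new token with the same payload, call the API with it, and see whether the response is 200. It was 200, which pinned down the problem.
The fix follows the same path. Obtain the current signing key, mint a new token locally with jsonwebtoken using the same payload structure as the old token, and deliberately leave exp unset. Patch the new token into the integration config, restart, and the 401s disappear. The whole process took under ten minutes, without going through the login flow again and without touching any configuration of the service itself.
The source of the user UUID is worth a mention: decoding the old JWT's payload and taking id is enough, with no database lookup needed. The old token fails signature verification, but its payload is still readable, and that detail makes the mint faster.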
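Code Comparison: Before and After the Fix
Before (JWT_SECRET randomly generated at startup, so it changes on every rebuild)
// server.js (problematic version)
const crypto = require('crypto');
const express = require('express');
const jwt = require('jsonwebtoken');

const app = express();
const JWT_SECRET = crypto.randomBytes(32).toString('hex'); // ← problem: ephemeral, a new value on every start

// Reject any /api request whose Bearer token fails verification against the current secret.
app.use('/api', (req, res, next) => {
  const token = req.headers.authorization?.split(' ')[1];
  jwt.verify(token, JWT_SECRET, (err, decoded) => {
    if (err) return res.sendStatus(401); // includes "invalid signature" after a key change
    req.user = decoded;
    next();
  });
});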
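After (JWT_SECRET read from a persistent environment variable, plus minting a new token)
// server.js (fixed)
const JWT_SECRET = process.env.JWT_SECRET; // ← from a Docker secret or other persistent env source
if (!JWT_SECRET) throw new Error('JWT_SECRET is required and must be persistent');

// mint-token.js: run once after a rebuild to mint a replacement token (no login flow needed)
const jwt = require('jsonwebtoken');
const oldToken = process.env.OLD_TOKEN; // hypothetical variable holding the rejected token
const oldPayload = jwt.decode(oldToken); // decode() skips verification, so the old payload is still readable
if (!oldPayload) throw new Error('old token could not be decoded');
const newToken = jwt.sign({ id: oldPayload.id }, process.env.JWT_SECRET); // no exp, same payload structure
console.log(newToken); // → patch this into the integration config, then restart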
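Side Effects That Should Be Isolated
- Signature failure logging: 401 events should be written to the audit log (first 8 characters of the token, IP, timestamp) without blocking the response
- Anomaly detection trigger: a large number of signature failures in a short time should trigger an anomaly alert (possibly a key rotation or an attack)
- Integration monitoring notification: when a third-party API integration's token becomes invalid, a webhook should notify the integrating party so it can re-mint automatically
- Cache invalidation: after the key changes, identity-related temporary data in Redis should be cleared along with it
- Token mint audit: manual mint events should record the operator, the time, and a payload hash (not the raw values)
- Health check verification: once a node rebuild completes, the health check should automatically verify that JWT_SECRET matches the expected hash
- Rate limiter isolation: signature failures should not increment the rate limiter's counters; the two pieces of logic must be kept separate so they do not contaminate each other
- Search index unaffected: auth-layer errors should not let the search index or cache warmup jobs perceive a change in user state
Rule of thumb: if the failure of a piece of logic should not make the user see a failed operation, it needs boundary isolation. Every side effect of auth signature verification falls into this category.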
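One Thing to Keep for Next Time
A node rebuild and a key reset have no strong intuitive link. A rebuild sounds like "restarting the service once", not like "replacing the foundation of identity authentication". But if the service's JWT_SECRET is randomly generated at startup, or read from an environment variable that is not persistent, then every rebuild amounts to changing the lock, and all old tokens are invalidated, whether or not they set exp.
The next time a 401 appears after a rebuild, the first thing to check is not the token itself but whether the key changed along with it.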
— 邱柏宇
Further Reading
After the Rebuild, the Server Wouldn’t Even Decode It
A Token With No Expiry That Still Expired
After a routine service node rebuild, an automated API integration started returning 401s consistently. The token had no expiration set: no exp field, nothing. The payload decoded fine. The server refused it anyway, at the signature check, before looking at any claim inside. It just stopped at the door.
There’s a useful parallel here: the ibon ticket pickup code at 7-Eleven. The machine doesn’t call headquarters to confirm you actually bought a ticket. It checks the signature — was this string signed by the correct key? If the key changed, the same code fails, regardless of when it was issued. JWT works the same way. The server doesn’t track sessions. It only verifies the signature.
Technical Environment
Node.js + Express API using the jsonwebtoken package for authentication. JWT_SECRET is loaded from container environment variables at startup with no persistent storage configured. The integration is stateless: each request carries a Bearer token in the Authorization header, verified via jwt.verify() — no session store or database lookup involved. The failure pattern is framework-agnostic: any containerized service that stores its signing key in a non-persistent environment variable will reproduce this behavior on every node rebuild.
Error Propagation Sequence
Client                  API Server                     Container Env
  |                          |                               |
  |── GET /api/resource ────>|                               |
  |   (Bearer: old_token)    |                               |
  |                          |── jwt.verify(token, ─────────>|
  |                          |   JWT_SECRET)                 |
  |                          |   [SECRET = NEW KEY]          |
  |                          |<── invalid signature ─────────|  ← key mismatch ✗
  |<── 401 Unauthorized ─────|                               |
  |                          |                               |

Old token: signed with OLD_KEY ✓ / Server holds NEW_KEY → verification fails
The failure happens inside jwt.verify(): the signature check runs before any claim in the payload is validated, so once the key no longer matches, the token is rejected no matter what the payload contains.
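The mismatch can be reproduced without the real service. What follows is a minimal sketch, not the service's code: signToken and verifyToken are hypothetical helpers that implement HS256 with Node's built-in crypto module as a stand-in for jwt.sign() and jwt.verify(), and the sketch assumes a Node version that supports the base64url encoding (16 or later).

```javascript
// Minimal HS256 JWT helpers built on Node's crypto module.
// signToken/verifyToken are illustrative stand-ins for jwt.sign()/jwt.verify().
const crypto = require('node:crypto');

const b64url = (input) => Buffer.from(input).toString('base64url');

function signToken(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  const signature = crypto
    .createHmac('sha256', secret)
    .update(`${header}.${body}`)
    .digest('base64url');
  return `${header}.${body}.${signature}`;
}

function verifyToken(token, secret) {
  const [header, body, signature] = token.split('.');
  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${header}.${body}`)
    .digest();
  const actual = Buffer.from(signature ?? '', 'base64url');
  // The signature is checked first; on a mismatch the claims are never examined.
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    throw new Error('invalid signature');
  }
  return JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
}

const oldKey = crypto.randomBytes(32).toString('hex'); // key before the rebuild
const newKey = crypto.randomBytes(32).toString('hex'); // key after the rebuild

const token = signToken({ id: 'user-123' }, oldKey); // no exp claim

console.log(verifyToken(token, oldKey)); // accepted: { id: 'user-123' }
try {
  verifyToken(token, newKey);
} catch (err) {
  console.log(err.message); // rejected: invalid signature
}
```

The token never changes between the two calls; only the key the verifier holds does, which is the whole failure in miniature.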
The Exact Moment Things Broke
The cause was straightforward once located: the service instance’s signing key had been reset during the rebuild. The old token was signed with the old key. The new instance verified with the new key. Mismatch. 401.
What makes this easy to misread is that the token itself looks completely intact. Base64-decode the payload and everything is there — valid structure, correct fields, no expiry. Without knowing that a rebuild could silently rotate the JWT_SECRET, the natural first suspicion is a dropped header, a misconfigured middleware, or a broken auth flow somewhere upstream.
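To see why a readable payload proves nothing about validity, here is a small sketch that reads the claims with no key at all. readPayload is a hypothetical helper and the token is made up for illustration; JWT segments use the base64url variant of Base64, which the sketch assumes the Node runtime supports.

```javascript
// Reads a JWT payload without checking the signature.
// A readable payload therefore says nothing about whether the token will verify.
function readPayload(token) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('not a compact JWT');
  return JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
}

// Hypothetical token: header and payload are real encodings, the signature is a placeholder.
const header = Buffer.from(JSON.stringify({ alg: 'HS256', typ: 'JWT' })).toString('base64url');
const payload = Buffer.from(JSON.stringify({ id: 'user-123' })).toString('base64url');
const staleToken = `${header}.${payload}.signature-from-a-key-that-no-longer-exists`;

const claims = readPayload(staleToken);
console.log(claims.id);       // user-123
console.log('exp' in claims); // false
```

Everything a human inspects (structure, fields, absence of exp) lives in the first two segments, while the only part the server's key touches is the third.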
How It Was Confirmed and Fixed
Verification was direct: pull the current JWT_SECRET from the container’s environment variables, sign a new token with the same payload structure, call the API. A 200 response confirmed the diagnosis.
The fix followed the same path. Using the current signing key, a new token was minted locally via jsonwebtoken, with no exp set. The user UUID came from decoding the old token’s payload — even though the old token failed signature verification, its payload was still readable. That detail shaved several minutes off the process. Patch the new token into the integration config, restart, 401 gone. Under ten minutes total, no login flow touched, no service config changed.
Code Diff: Before and After
Before (JWT_SECRET generated at startup — ephemeral, rotates on every rebuild)
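// server.js (problematic version)
const crypto = require('crypto');
const express = require('express');
const jwt = require('jsonwebtoken');

const app = express();
const JWT_SECRET = crypto.randomBytes(32).toString('hex'); // ← problem: ephemeral, a new value on every start

// Reject any /api request whose Bearer token fails verification against the current secret.
app.use('/api', (req, res, next) => {
  const token = req.headers.authorization?.split(' ')[1];
  jwt.verify(token, JWT_SECRET, (err, decoded) => {
    if (err) return res.sendStatus(401); // includes "invalid signature" after a key change
    req.user = decoded;
    next();
  });
});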
After (JWT_SECRET loaded from persistent env + mint new token without re-login)
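// server.js (fixed)
const JWT_SECRET = process.env.JWT_SECRET; // ← from a Docker secret or other persistent env source
if (!JWT_SECRET) throw new Error('JWT_SECRET is required and must be persistent');

// mint-token.js: run once after a rebuild to mint a replacement token (no login flow needed)
const jwt = require('jsonwebtoken');
const oldToken = process.env.OLD_TOKEN; // hypothetical variable holding the rejected token
const oldPayload = jwt.decode(oldToken); // decode() skips verification, so the old payload is still readable
if (!oldPayload) throw new Error('old token could not be decoded');
const newToken = jwt.sign({ id: oldPayload.id }, process.env.JWT_SECRET); // no exp, same payload structure
console.log(newToken); // → patch this into the integration config, then restart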
Side Effects That Should Be Isolated
- Signature failure logging: 401 events should write to audit log (first 8 chars of token, IP, timestamp) without blocking the response
- Anomaly detection: a spike in signature failures should trigger an alert — it could be a key rotation or an attack
- Integration monitoring: when a third-party API token goes invalid, a webhook should notify the integration to auto-re-mint
- Cache invalidation: after a key rotation, identity-linked entries in Redis should be purged
- Token mint audit trail: manual mint events should log operator, timestamp, and payload hash (not the raw payload)
- Health check verification: after a node rebuild, the health check should confirm JWT_SECRET matches the expected hash
- Rate limiter isolation: signature failures should not count against rate limiter quotas; these are separate concerns and must not cross-contaminate
- Search index / cache warmup: auth layer failures should have no effect on search index state or cache warmup jobs
If a piece of logic failing should not cause the user to see the operation fail, it needs boundary isolation — every auth side effect listed here qualifies.
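One way to picture that boundary is a wrapper that swallows and reports any failure from a side effect, so the 401 response is sent regardless. This is a sketch under assumptions: runIsolated, recordAuthFailure, rejectRequest, and the in-memory auditLog are all hypothetical names, and the response object is a stub rather than a real Express response.

```javascript
// Runs a side effect so that its failure can never change the HTTP outcome.
// `name` is only used for the diagnostic line.
async function runIsolated(name, sideEffect) {
  try {
    await sideEffect();
    return true;
  } catch (err) {
    console.error(`[side-effect:${name}] ${err.message}`);
    return false;
  }
}

// Hypothetical audit sink; a real one might write to a log pipeline.
const auditLog = [];
function recordAuthFailure(token, ip) {
  auditLog.push({
    tokenPrefix: String(token ?? '').slice(0, 8), // first 8 characters only
    ip,
    at: new Date().toISOString(),
  });
}

// Respond first, then start the side effects without awaiting them.
function rejectRequest(res, token, ip) {
  res.statusCode = 401;
  res.end();
  void runIsolated('audit-log', () => recordAuthFailure(token, ip));
  // Simulated outage: the failure is reported on stderr and goes no further.
  void runIsolated('anomaly-counter', () => { throw new Error('metrics backend down'); });
}

const res = { statusCode: 200, end() {} }; // stub standing in for a real response object
rejectRequest(res, 'eyJhbGciOiJIUzI1NiJ9.payload.sig', '203.0.113.7');
console.log(res.statusCode, auditLog.length); // 401 1
```

The broken anomaly counter leaves both the status code and the audit entry untouched, which is the property the rule above asks for.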
One Thing Worth Remembering
A node rebuild and a key rotation don’t feel like the same event. “Rebuild” sounds like restarting a service. It doesn’t sound like replacing the cryptographic foundation of your auth layer. But if a service’s JWT_SECRET is generated at startup or read from a non-persistent environment variable, every rebuild effectively issues a new lock — and every existing token, regardless of its expiry settings, becomes invalid.
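A cheap way to make a key change visible is to compare a fingerprint of the running secret against one recorded earlier, which is also what the health check item in the list above would need. The sketch below is one possible shape, assuming a high-entropy secret; secretFingerprint, secretIsUnchanged, and the idea of storing the expected value in a JWT_SECRET_FINGERPRINT variable are hypothetical.

```javascript
const crypto = require('node:crypto');

// Truncated SHA-256 of the secret: enough to detect a change
// while keeping the secret itself out of logs.
function secretFingerprint(secret) {
  return crypto.createHash('sha256').update(secret).digest('hex').slice(0, 16);
}

// Returns true when the running secret matches the fingerprint recorded earlier.
function secretIsUnchanged(currentSecret, expectedFingerprint) {
  if (!currentSecret || !expectedFingerprint) return false;
  return secretFingerprint(currentSecret) === expectedFingerprint;
}

// Example with made-up values; in a deployment the expected fingerprint might come from
// a hypothetical JWT_SECRET_FINGERPRINT variable stored alongside the persistent secret.
const recorded = secretFingerprint('key-before-rebuild');
console.log(secretIsUnchanged('key-before-rebuild', recorded)); // true
console.log(secretIsUnchanged('key-after-rebuild', recorded));  // false
```

A health check that fails on a false result turns a silent key change into a deploy-time signal instead of a stream of 401s.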
Next time a rebuild is followed by 401s, check the key first, not the token.
— 邱柏宇
Related Posts
https://justfly.idv.tw/s/az6ZZeN