git stash Fails Silently over SSH and Changes Nothing

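Ask a family member to tidy the papers on your desk into a drawer. They say it's done, but the drawer was stuck; they tried once, gave up, and never told you. You assume the papers were put away long ago. The problem a container upgrade script ran into has exactly the same structure.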

Stack: Shell Script Driving Git over SSH

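The execution environment is a shell script that connects to a remote container over SSH and runs a sequence of git operations inside it: fetch tags, switch to the target version, rebuild the Docker image, restart the service. The whole flow is designed to run unattended and is finished as soon as it completes. The SSH connection was given ConnectTimeout 600s with the intent of leaving room for the Docker build (strictly speaking, ConnectTimeout only bounds connection setup, not how long the session may run). The problem was not the network and not the build. It was git.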

Symptom: checkout Keeps Aborting, stash Appears to Have Run

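The upgrade script ran against the remote container several times and aborted at the version-switch step every time. The error was a familiar one: the working directory had uncommitted changes, so git refused to checkout.

Digging further showed that the Dockerfile inside the container had been modified by another automated process through sed injection, leaving uncommitted changes behind. The script already included git stash -u, which in theory should have shelved the local changes before switching tags. Yet the script kept going, checkout kept aborting, and the dirty state never went away.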

AutomationScript (SSH)
        │
        ▼
    git stash -u          ← [FAIL] fails silently in the remote SSH environment: no error output, no abnormal exit code
        │
        ▼ (stash did not actually run)
    git checkout v{tag}   ← [FAIL] Dockerfile dirty, git refuses, abort
        │
        ▼
    Docker build          ← never reached
        │
        ▼
    Service restart       ← never reached

── Success path (after the fix) ──

AutomationScript (SSH)
        │
        ▼
    git checkout -f v{tag} ← [OK] force-discards local changes, switches directly
        │
        ▼
    sed PM2 patch         ← idempotent, re-injected
        │
        ▼
    Docker build          ← proceeds
        │
        ▼
    Service restart       ← completes

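The Dividing Line: Remote SSH Execution Turns stash Failures from Loud to Silent

Run the same git stash -u in an interactive terminal and any problem shows up on screen. In a non-interactive remote SSH environment, failures triggered by certain conditions produce no output at all, and the exit code may not be abnormal either. The script sees no error, keeps going, and treats the stash as successful.

This is not a design flaw in git stash. It is the combination of a tool behaving differently across environments and an external process quietly contaminating the working directory. Take away either condition and the problem does not occur. If the sed injection left no dirty state, stash would not need to run at all; if stash ran in an environment with feedback, the failure would at least be visible.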

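Why It Is Easy to Misdiagnose: The Script Design Looks Correct

Adding git stash -u to the script was not a mistake in itself; the logic is sound. The problem is that on the remote SSH execution path, this command fails silently. With no error output, there is no reason to suspect it did nothing. The first instinct is to spend time on the checkout abort message, then trace the contents of the Dockerfile change and the timing of the sed injection. All of that is valid; the only misjudgment was the premise that stash had succeeded.

Confirming it is straightforward: run git stash -u by hand inside the SSH session, then run git status --short and see whether the dirty state is gone. If it is not, stash did nothing.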
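The manual check above can also be scripted. The following is a minimal sketch, not the author's script: the function name stash_and_verify and its repo_dir argument are hypothetical, and it uses git status --porcelain (the script-stable counterpart of --short) to decide whether the tree is really clean after the stash.

```shell
# Hypothetical helper (not from the original script): stash local changes,
# then verify that the working tree is actually clean afterwards.
# Returns 0 only if `git status --porcelain` prints nothing after the stash.
stash_and_verify() {
  repo_dir=$1
  # Capture the output instead of discarding it, so a failure leaves a trace.
  if ! stash_out=$(git -C "$repo_dir" stash -u 2>&1); then
    echo "git stash failed: $stash_out" >&2
    return 1
  fi
  # --porcelain output is empty when nothing is modified, staged, or untracked.
  remaining=$(git -C "$repo_dir" status --porcelain)
  if [ -n "$remaining" ]; then
    echo "working tree still dirty after stash:" >&2
    echo "$remaining" >&2
    return 1
  fi
  return 0
}
```

A caller could then write stash_and_verify /path/to/repo || exit 1 before the checkout step, so a stash that did nothing stops the run instead of surfacing later as a checkout abort.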

Before and After the Fix


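# Before: correct in theory, fails silently over remote SSH
git stash -u                  # ← may fail without a sound in a non-interactive SSH environment
git checkout v$TARGET         # ← dirty state still there, abort

# After: discard directly, no dependency on stash succeeding
git checkout -f v$TARGET      # ← force-discards all local changes, switches directly

# Re-apply the idempotent patch afterward (PM2 injection as the example)
if ! grep -q 'npm install -g pm2' Dockerfile; then
  sed -i '/&& chmod 755 \/app\/openclaw.mjs/a\    RUN npm install -g pm2' Dockerfile
fi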

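Taiwan's open source community has a motto, "Fork the Government": rather than repairing the old thing, fork a cleaner version. The reasoning behind git checkout -f is similar: do not try to stash and restore; discard directly, switch cleanly, then re-inject the necessary patch idempotently.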

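Side-Effect Isolation Checklist: What to Confirm Before Using git checkout -f

Can every local change be rebuilt? -f permanently discards uncommitted changes to tracked files, staged or not. If the working directory holds manual edits that cannot be restored from another source, do not use it.
Do untracked files need to be kept? git checkout -f does not clean up untracked files, apart from overwriting any that are in the way of files in the target commit; removing the rest takes a separate git clean -fd. Keep the behavioral boundary between the two clear.
Will the external process re-contaminate right after the switch? If the sed injection is persistent (it runs on every container start), the tree goes dirty again right after being cleaned, and the problem moves to the next cycle.
Is the sed patch really idempotent? Check with grep -q before injecting, to avoid duplicate insertions that break the Dockerfile syntax. If the anchor pattern in the sed command no longer matches the upstream format, the patch is skipped silently.
Is SSH ConnectTimeout enough to cover the build time? Docker build takes several minutes. If the SSH connection drops mid-build, the build may keep running in the background while the script has lost contact, leaving the state uncertain. ConnectTimeout itself only covers connection setup; whether an established session survives a long build depends on keepalive settings such as ServerAliveInterval.
Is the old container still running when the build fails? docker compose up -d does not replace the existing container when the build fails, so the old version keeps serving. This is safe, but confirm that monitoring will not raise a false alarm over the version mismatch.
Do plugins / symlinks need reinstalling after checkout? Native modules and symlinks stop working after the image is rebuilt, so a plugin reinstall step is needed at the end of the upgrade flow; otherwise the service comes up with features missing.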
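For the idempotency item in the checklist, the silent-skip case can be turned into a loud one. This is a sketch under assumptions: the function name apply_pm2_patch and its exit code 2 are hypothetical, the anchor and marker strings are copied from the sed command shown earlier, and GNU sed is assumed (as in the original one-liner).

```shell
# Hypothetical helper (not from the original script): append the PM2 RUN line
# after the anchor line at most once, and fail loudly if the anchor is gone.
# Returns 0 = patch present (already there or newly applied), 2 = anchor missing.
apply_pm2_patch() {
  dockerfile=$1
  anchor='&& chmod 755 /app/openclaw.mjs'   # anchor taken from the post's sed command
  marker='npm install -g pm2'

  # Already patched: nothing to do, so repeated runs stay harmless.
  if grep -qF -- "$marker" "$dockerfile"; then
    return 0
  fi
  # Anchor missing: report it instead of letting sed match nothing in silence.
  if ! grep -qF -- "$anchor" "$dockerfile"; then
    echo "anchor not found in $dockerfile; patch NOT applied" >&2
    return 2
  fi
  # GNU sed assumed: one-line `a\` form with in-place editing.
  sed -i '/&& chmod 755 \/app\/openclaw.mjs/a\    RUN npm install -g pm2' "$dockerfile"
  # Confirm the line actually landed before reporting success.
  grep -qF -- "$marker" "$dockerfile"
}
```

The separate anchor check is the design point: a sed address that matches nothing exits 0, so without it an upstream format change would pass unnoticed until the service came up without PM2.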

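One Question Left for Next Time

git stash -u fails silently in a non-interactive SSH environment, and so far there is no clean way for the script to tell from the command itself whether it really succeeded. Checking git stash list after the stash to confirm that an entry was created is the most direct check. If the script is going to keep using the stash path, this verification step cannot be skipped.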

— 邱柏宇

git stash Said Nothing and Did Nothing

Imagine asking someone to put your papers into a drawer before you leave. They say “done.” The drawer was stuck — they tried once and gave up without telling you. You find out only when you come back and the papers are still there. A container upgrade script ran into the exact same structure.

Stack: Shell Script, SSH Remote Execution, Git

The setup is a shell script connecting to a remote container over SSH, running a sequence of git operations: fetch tags, checkout a target version, rebuild the Docker image, restart services. Designed to run unattended. SSH ConnectTimeout was set to 600s with the intent of accommodating Docker build time (strictly, ConnectTimeout only bounds connection setup, not how long the session may run). The problem was not the network, not the build. It was git.

What Happened: checkout Kept Aborting, stash Appeared to Run

The upgrade script ran against a remote container several times. Every run aborted at the version-switch step. The error was familiar: uncommitted local changes, git refusing to checkout.

Tracing it back revealed that the Dockerfile inside the container had been modified by a separate automation process using sed injection, leaving a dirty working directory. The script already included git stash -u, which is logically correct and should have shelved the local changes before switching tags. But checkout kept aborting. Dirty state did not go away.

AutomationScript (SSH)
        │
        ▼
    git stash -u          ← [FAIL] silent failure in SSH non-interactive env, no error output
        │
        ▼ (stash did not execute)
    git checkout v{tag}   ← [FAIL] Dockerfile dirty, git refuses, abort
        │
        ▼
    Docker build          ← never reached
        │
        ▼
    Service restart       ← never reached

── Fixed path ──

AutomationScript (SSH)
        │
        ▼
    git checkout -f v{tag} ← [OK] force-discards local changes, switches cleanly
        │
        ▼
    sed PM2 patch         ← idempotent re-injection
        │
        ▼
    Docker build          ← proceeds
        │
        ▼
    Service restart       ← completes

The Boundary: SSH Remote Context Turns stash Failures Silent

The same git stash -u in an interactive terminal produces visible output when it fails. In an SSH non-interactive environment, certain failure modes produce no output and may not set a non-zero exit code. The script sees nothing wrong, continues, assumes stash succeeded. Two conditions have to combine: an external process leaving the working directory dirty, and stash silently failing in a non-interactive execution context. Remove either condition and the problem disappears.
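One way to keep a step from passing unnoticed is to make every step report what it printed and what it returned. The wrapper below is a sketch, not part of the original script, and the name run_step is hypothetical. It does not explain which failure mode was hit here (the post leaves that open); it only makes "no output" and the exit status visible in the log.

```shell
# Hypothetical wrapper (not from the original script): run one step,
# capture stdout and stderr, log both the output and the exit status,
# and hand the status back so the caller can stop the pipeline.
run_step() {
  label=$1
  shift
  rc=0
  out=$("$@" 2>&1) || rc=$?
  # Empty output is logged explicitly, so "said nothing" is itself on record.
  echo "[$label] exit=$rc output=${out:-<none>}" >&2
  return "$rc"
}
```

Chaining steps as run_step stash git stash -u && run_step checkout git checkout "v$TARGET" would stop at the first non-zero status and leave one log line per step, including for steps that printed nothing.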

The fix was a different frame entirely. Instead of shelving and restoring, discard directly: git checkout -f v$TARGET. Then re-apply the necessary Dockerfile patch idempotently afterward using sed with a grep guard. Fork the old approach, start clean.

Before and After


# Before: logically correct, silently broken over SSH
git stash -u                  # may fail silently in non-interactive SSH
git checkout v$TARGET         # dirty state persists, abort

# After: discard directly, no dependency on stash succeeding
git checkout -f v$TARGET      # force-discard all local changes, switch

# Re-apply idempotent patch afterward
if ! grep -q 'npm install -g pm2' Dockerfile; then
  sed -i '/&& chmod 755 \/app\/openclaw.mjs/a\    RUN npm install -g pm2' Dockerfile
fi

Side Effect Isolation Checklist Before Using git checkout -f

Are all local changes fully reproducible? -f permanently discards uncommitted modifications to tracked files, staged or not. If any manual edits cannot be reconstructed from another source, this is the wrong tool.
Are untracked files safe to leave? git checkout -f does not clean up untracked files, though it will overwrite any that are in the way of files in the target commit. Removing the rest requires git clean -fd separately. Know the boundary between the two behaviors.
Will the external process re-dirty the working directory immediately after? If sed injection runs on every container start, the working directory becomes dirty again next cycle. The problem shifts forward, not away.
Is the sed patch truly idempotent? A grep guard before insertion prevents duplicate lines. If the anchor pattern in the sed command drifts from the upstream Dockerfile format, the patch silently skips: the container starts but services depending on the injected tooling will break.
Is SSH ConnectTimeout long enough for the Docker build? If the SSH connection drops mid-build, the build may continue in the background while the script loses track of it. State becomes uncertain. ConnectTimeout itself only covers connection setup; whether an established session survives a long build depends on keepalive settings such as ServerAliveInterval.
Does the old container keep running if the build fails? docker compose up -d does not replace a running container when the build fails. The old version continues serving — which is safe, but monitoring should not flag the version mismatch as a crash.
Do plugins and symlinks need reinstallation after image rebuild? Native modules and symlinks inside the container become invalid after a Docker image rebuild. A plugin reinstall step must follow the upgrade, or services come up missing functionality.
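For the first item on the checklist, a cheap safety net is to save the tracked changes as a patch file before discarding them. This is a sketch under assumptions: the function name backup_then_force_checkout and the backup path argument are hypothetical, and the backup covers tracked files only (git diff HEAD does not include untracked files).

```shell
# Hypothetical helper (not from the original script): write all uncommitted
# changes to tracked files into a patch file, then force-switch to the target.
# Arguments: repository directory, target ref (e.g. a tag), backup file path.
backup_then_force_checkout() {
  repo_dir=$1
  target=$2
  backup=$3
  # `git diff HEAD` covers staged and unstaged changes; --binary keeps
  # binary files restorable with `git apply`.
  git -C "$repo_dir" diff --binary HEAD > "$backup" || return 1
  # Only discard local changes once the backup has been written.
  git -C "$repo_dir" checkout -f "$target"
}
```

If the discarded edits turn out to matter, they can be inspected in the patch file or re-applied with git apply, which keeps checkout -f from being a one-way door for tracked changes.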

One Open Question for Next Time

In an SSH non-interactive context, the output and exit code of git stash -u were not enough on their own to confirm that it actually ran successfully. The most direct check: run git stash list immediately after and verify that a new entry was created. Any script that still relies on the stash path should include this verification step. Without it, the script is trusting a silent operation that has already proven it can lie.
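The git stash list check can be sketched as a before-and-after count. The function name stash_with_entry_check and its return codes are hypothetical. One case this catches for certain: when git stash finds nothing to save, it prints "No local changes to save" and still exits 0, so the exit code alone cannot distinguish "stashed" from "did nothing".

```shell
# Hypothetical helper (not from the original script): run `git stash -u`
# and confirm that the stash list actually grew.
# Returns 0 = new entry created, 1 = stash command failed, 3 = no new entry.
stash_with_entry_check() {
  repo_dir=$1
  # tr strips the padding some wc implementations add around the count.
  before=$(git -C "$repo_dir" stash list | wc -l | tr -d ' ')
  if ! out=$(git -C "$repo_dir" stash -u 2>&1); then
    echo "git stash exited non-zero: $out" >&2
    return 1
  fi
  after=$(git -C "$repo_dir" stash list | wc -l | tr -d ' ')
  if [ "$after" -le "$before" ]; then
    echo "git stash exited 0 but created no entry: ${out:-<no output>}" >&2
    return 3
  fi
  return 0
}
```

Comparing counts rather than testing for a non-empty list matters when older stash entries already exist in the container's repository.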

— 邱柏宇
