Hardcoded Hours Don’t Know About Typhoon Days

Typhoon days in Taiwan are announced the night before — a livestream from the county government, media on loop, LINE notifications piling up. The next morning, some people still head out. They make it all the way to the MRT gate before seeing the handwritten sign: “Service suspended today due to typhoon.” They turn around and go home.

An automated trading system, on one such morning, did not turn around.

The market-hours logic was straightforward: if the current time falls between 9:00 AM and 1:30 PM, trading is permitted. Taiwan’s stock exchange closes at 1:30 PM — that number is correct. The rule ran without issue for a while. Then a typhoon day hit. The exchange closed unexpectedly. The system had no way of knowing. The time window passed its check, and signals went out anyway.

It’s the same failure as a company scheduling system that only knows “Monday through Friday, 9 to 6 is work time” and sends a reminder at 8:30 AM on a typhoon day: “You’re due in 30 minutes.”

Technical Environment

Python automated trading system with a scheduler triggering signal generation at fixed intervals. Original design: single-layer static time-window check via datetime.time comparison (09:00–13:30); if the window passes, signal emits. No external exchange status query, purely local clock arithmetic. Taiwan stocks use TWSE MIS as source of truth for whether today is a trading day; US markets require an additional half-day flag from a third-party API (e.g., FMP). Shared constraint: typhoon closures, makeup days, and emergency shutdowns are runtime facts. Static rules can’t track them.

The assumption buried in the rule

Hardcoding a time window isn’t wrong by itself. What’s wrong is the implicit premise: that every moment within that window is a normal, open-market session.

The edge cases aren’t exotic. Typhoon closures are announced the night before, so they don’t appear in any fixed holiday calendar. A makeup-workday arrangement, such as the one around Dragon Boat Festival, can turn an ordinary weekday into a market holiday. US markets run a half-day session on the day after Thanksgiving, closing at 1:00 PM Eastern, so the exchange shuts before a full-session window does. In all three cases, the time check passes, but the market isn’t open.

What these situations share: they’re facts that require a real-time query to know. They cannot be pre-written into a static rule.

Signal Escape Sequence: Clock Says Yes, Market Says No

Before fix:

Scheduler      TradingGate                  TWSE MIS
    |                |                          |
    |── fire() ─────>|                          |
    |                |── time_check(09:00–13:30)|
    |                |   return True  ✓         |
    |                |                          |
    |                | (no query step)          |
    |                |                          |
    |                |── generate_signal() ────>|  <-- typhoon day, market closed
    |<── signal_out()|                          |

After fix:

Scheduler      TradingGate                  TWSE MIS
    |                |                          |
    |── fire() ─────>|                          |
    |                |── time_check()           |
    |                |   return True  ✓         |
    |                |── GET /getLastTradingDay ─>|
    |                |<── last_date ≠ today ────|  <-- typhoon closure
    |                |   return False  ✗        |
    |<── silent ─────|                          |
State: signal blocked ✓ / TradingGate: silent ✓

Before the fix, only local time was checked — the system had no awareness of whether the exchange was open. After the fix, a single MIS query covers typhoon closures, public holidays, and emergency shutdowns automatically.

The fix: two-layer validation

The revised logic splits the check into two stages. The time window runs first. If it passes, the system queries whether the exchange actually opened today. For Taiwan stocks, this means checking whether TWSE MIS returns today as the most recent trading date — a single lookup that automatically covers typhoon closures, public holidays, and emergency shutdowns, with no need to maintain a separate exceptions list. US markets get an additional check for half-day sessions.
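For the US side, the half-day check can be sketched as a pure function. This is a minimal sketch, not the author's implementation: `us_market_open_at`, `is_trading_day`, and `early_close` are hypothetical names, the calendar lookup that would supply them (for example from a third-party API) is not shown, and the regular 09:30–16:00 Eastern session is hardcoded for illustration.

```python
from datetime import datetime, time
from typing import Optional

REGULAR_OPEN = time(9, 30)    # regular US session, Eastern wall-clock time
REGULAR_CLOSE = time(16, 0)

def us_market_open_at(now_eastern: datetime, is_trading_day: bool,
                      early_close: Optional[time] = None) -> bool:
    """Return True only if `now_eastern` falls inside today's actual session.

    `now_eastern` must already be expressed in US Eastern time.
    `is_trading_day` and `early_close` are assumed to come from an external
    calendar lookup (hypothetical, not shown here).
    """
    if not is_trading_day:
        return False
    # On a half-day the early close replaces the regular 16:00 close.
    close = early_close if early_close is not None else REGULAR_CLOSE
    return REGULAR_OPEN <= now_eastern.time() < close

# 14:00 Eastern on a half-day that ends at 13:00: the static window would
# still say "open", the half-day-aware check does not.
afternoon = datetime(2024, 11, 29, 14, 0)
print(us_market_open_at(afternoon, True, early_close=time(13, 0)))  # False
print(us_market_open_at(afternoon, True))                           # True
```

Passing the early-close time in as data keeps the function testable and keeps the provider-specific API call out of the decision logic.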

Query results are cached for 10 minutes to avoid burning through free API rate limits. If the data source goes down, the system falls back to the time-window judgment alone and does not block trading. That fallback matters: an unreachable API and a closed market are two different states. Conflating them — treating “can’t check” as “must be closed” — creates a different class of failure.
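One way to keep “can’t check” and “closed” apart is to model them as separate states instead of a single boolean. The sketch below is illustrative only: `MarketStatus`, `check_market`, `should_emit`, and the `fetch_last_trading_day` callable are hypothetical names, and the fetcher is assumed to return a YYYYMMDD string or raise on failure.

```python
from enum import Enum
from typing import Callable

class MarketStatus(Enum):
    OPEN = "open"
    CLOSED = "closed"
    UNKNOWN = "unknown"   # data source unreachable or response unusable

def check_market(fetch_last_trading_day: Callable[[], str], today: str) -> MarketStatus:
    """Classify today's status from the last trading day reported by the source."""
    try:
        last_td = fetch_last_trading_day()
    except Exception:
        # A failed lookup says nothing about the market itself.
        return MarketStatus.UNKNOWN
    return MarketStatus.OPEN if last_td == today else MarketStatus.CLOSED

def should_emit(in_window: bool, status: MarketStatus) -> bool:
    """Policy from the text: CLOSED blocks, UNKNOWN degrades to the time window."""
    if not in_window:
        return False
    return status is not MarketStatus.CLOSED
```

With three states, the caller can log or alert on `UNKNOWN` without changing the trading decision, and a later policy change (for example, blocking on `UNKNOWN`) touches only `should_emit`.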

Code Diff: Before and After

Before (static window, single-layer)

from datetime import datetime, time

def is_trading_window() -> bool:
    now = datetime.now()
    # only checks local clock — no awareness of whether the market is open  <-- problem here
    return time(9, 0) <= now.time() <= time(13, 30)

def run_signal():
    if is_trading_window():
        generate_signal()  # fires on typhoon day regardless

After (two-layer: time window + live exchange status)

import json
import time as time_mod
import urllib.request
from datetime import datetime, time

_market_cache: dict = {}   # {date_str: (is_open: bool, cached_at: float)}
CACHE_TTL = 600            # 10-minute cache, in seconds

def is_tw_market_open() -> bool:
    # layer 1: local time window, checked before the cache so that a cached
    # "open" result cannot leak past 13:30 (assumes the host clock is Taipei time)
    now = datetime.now()
    if not (time(9, 0) <= now.time() <= time(13, 30)):
        return False

    today = now.date().isoformat()

    # cache hit: reuse today's answer while it is younger than CACHE_TTL
    if today in _market_cache:
        result, ts = _market_cache[today]
        if time_mod.time() - ts < CACHE_TTL:
            return result

    try:
        # layer 2: confirm today is a trading day via TWSE MIS
        with urllib.request.urlopen(
            "https://mis.twse.com.tw/stock/api/getLastTradingDay.jsp",
            timeout=5
        ) as r:
            last_td = json.loads(r.read())["msgArray"][0]["z"]  # YYYYMMDD
    except Exception:
        # MIS unreachable or response malformed: fall back to the time window
        # (already passed above), do NOT treat as closed  <-- fallback policy
        return True

    result = (last_td == today.replace("-", ""))
    _market_cache[today] = (result, time_mod.time())
    return result

def run_signal():
    if is_tw_market_open():
        generate_signal()
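
Both versions above read the host’s local clock through datetime.now(), which only matches exchange hours if the machine runs on Taipei time. A minimal sketch of a timezone-explicit window check follows; `in_tw_window` and `TAIPEI` are hypothetical names, not part of the original code.

```python
from datetime import datetime, time, timedelta, timezone
from typing import Optional

# Taiwan does not observe daylight saving time, so a fixed UTC+8 offset suffices.
TAIPEI = timezone(timedelta(hours=8))

def in_tw_window(now: Optional[datetime] = None) -> bool:
    """Check the 09:00-13:30 window in Taipei time, whatever the host timezone."""
    if now is None:
        now = datetime.now(TAIPEI)
    local = now.astimezone(TAIPEI).time()
    return time(9, 0) <= local <= time(13, 30)

# 02:00 UTC is 10:00 in Taipei, so this instant is inside the window.
print(in_tw_window(datetime(2024, 7, 24, 2, 0, tzinfo=timezone.utc)))  # True
```

Accepting `now` as a parameter also makes the window testable without patching the system clock.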

Side Effects That Should Be Isolated

  • Cache invalidation: The 10-minute cache uses date as the key. Long-running processes that cross midnight must not carry the previous day’s cached result into a new trading session.
  • Fallback event logging: When MIS is unreachable and the system falls back to time-window mode, each fallback instance should be written to an event log (timestamp + reason) so post-hoc analysis can distinguish degraded-mode signals from legitimate ones.
  • Alert/notification: Consecutive MIS failures beyond a threshold should trigger an on-call alert. Silent fallback with no notification leaves operators flying blind.
  • Downstream webhook delivery: If signals feed a downstream order gateway via webhook, the market-open check must block the signal before the webhook fires — not after the receiver picks it up.
  • Multi-process cache consistency: Multiple workers each maintain their own in-memory cache. Within the 10-minute TTL after a typhoon closure is announced, some workers may still pass an old cached “open” result. Evaluate whether a shared cache (Redis) is warranted.
  • Test environment API pollution: If the dev environment hits the live TWSE MIS API directly, routine test runs generate real query traffic. Gate the call behind an environment variable so tests use a mock response.
  • Observability / analytics: Blocked signals (window passed, MIS returned closed) should emit structured events to analytics — to confirm that typhoon closures, makeup days, and emergency shutdowns are each being caught correctly after the fix ships.
  • Async queue stale jobs: If signals enter an async queue before execution, the market-open check must happen before enqueue — not inside the queue worker — to avoid stale jobs accumulating in the queue while the market is closed.

The test: if a side effect’s failure could make the system believe the market is open when it isn’t, that check belongs before signal generation — not delegated downstream.
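
The fallback-logging and alerting items can be sketched together as a small counter around the market-status lookup. The names here (`FallbackTracker`, `alert_threshold`, the `alert` callback) are hypothetical, and the threshold of three consecutive failures is an arbitrary illustration, not a recommendation from the original system.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("market_gate")

class FallbackTracker:
    """Logs every degraded-mode decision and alerts once past a failure threshold."""

    def __init__(self, alert_threshold: int = 3,
                 alert: Optional[Callable[[str], None]] = None):
        self.alert_threshold = alert_threshold
        # Default alert just logs at ERROR level; swap in a pager or chat hook.
        self.alert = alert or (lambda msg: logger.error(msg))
        self.consecutive_failures = 0

    def record_failure(self, reason: str) -> None:
        self.consecutive_failures += 1
        logger.warning("fallback to time window at %s: %s",
                       time.strftime("%Y-%m-%dT%H:%M:%S"), reason)
        # Fire exactly once, when the streak first reaches the threshold.
        if self.consecutive_failures == self.alert_threshold:
            self.alert(f"market-status source failed "
                       f"{self.consecutive_failures} times in a row")

    def record_success(self) -> None:
        self.consecutive_failures = 0
```

The lookup code would call `record_failure()` in its exception handler and `record_success()` after a valid response, so a later review can tell degraded-mode signals from normal ones.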

Why it wasn’t obvious

The system behaved correctly on every normal day. Typhoon closures happen a handful of times a year. Makeup-workday closures are rarer. A US half-day is an edge case for a Taiwan-based developer. The combined frequency is low enough that the flaw stays invisible until the day it isn’t.

The verification is simple: trigger the logic manually on a typhoon closure day and confirm no signal is produced when the market-open query returns closed. That single check is sufficient.
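
The same check can also be rehearsed ahead of time by injecting the clock and the exchange lookup, so no real closure or network call is needed. This is a sketch under stated assumptions: `gate` is a simplified, hypothetical stand-in for the two-layer function (no cache), and `fetch_last_trading_day` is assumed to return a YYYYMMDD string.

```python
from datetime import datetime, time
from typing import Callable

def gate(now: datetime, fetch_last_trading_day: Callable[[], str]) -> bool:
    """Two-layer check with the clock and the exchange lookup injected."""
    if not (time(9, 0) <= now.time() <= time(13, 30)):
        return False
    try:
        return fetch_last_trading_day() == now.strftime("%Y%m%d")
    except Exception:
        return True  # degraded mode: source unreachable, window already passed

def test_closure_day_is_silent():
    # The window passes (10:00), but the source still reports the previous
    # trading day, which is what a sudden closure looks like.
    now = datetime(2024, 7, 24, 10, 0)
    assert gate(now, lambda: "20240723") is False

def test_normal_day_emits():
    now = datetime(2024, 7, 24, 10, 0)
    assert gate(now, lambda: "20240724") is True

test_closure_day_is_silent()
test_normal_day_emits()
```

Because the lookup is a plain callable, the test suite never touches the live exchange endpoint, which also addresses the test-isolation concern listed earlier.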

What to watch for next time

A time window is a necessary condition, not a sufficient one. Any logic that treats “within scheduled hours” as equivalent to “system is operational and ready” should ask one more question: is the external system also behaving as expected right now?

Taiwan’s market closes at 1:30 PM. Typhoon days get announced the night before. Neither of these is a corner case — they happen every year, on a schedule nobody fully controls.

— 邱柏宇
