[Summary cards: raw values Claude can ever see; sanitized aggregate values per server run; server pipeline stages; hard rule: no SSH to data]
1. Project Purpose
Build a privacy-preserving data analysis pipeline. Claude writes scripts (statistical analysis, ML, neural networks) that run on a remote server. Claude never sees raw data, only schemas, performance metrics, and a small, bounded set of aggregated values per run. We discuss methods together; the server runs them and returns sanitized outputs through a fixed-size feedback channel.
What Claude can / cannot see
| Visible | Hidden |
|---|---|
| ✓ Column names, dtypes, schema | ✗ Raw row values |
| ✓ Aggregates: count, null %, min/max/mean | ✗ Individual records, sample rows |
| ✓ Model metrics + ≤30 chosen values per run | ✗ Per-sample predictions |
| ✓ Training curves, summary histograms | ✗ Row-level scatter plots |
| ✓ Code, configs, model architectures | ✗ Trained weights if they could leak data |
2. Architecture Overview
2.1 Operational Boundary: Where Claude Runs
Claude (this tool) must run on a separate machine from the 5090, e.g. your laptop. Do NOT add the 5090 as an SSH host in Claude Code. If we did, Claude would have shell access to raw data and could bypass the entire Gate.
Claude Code session modes: safety mapping
- Claude → server: Git push only.
- Server → Claude: synced folder of sanitized JSON only.
2.2 Single Machine, Two Zones
You have one physical computer (the 5090 desktop). To keep Claude blind to the data while still letting it operate the framework, the desktop is split into two logical zones with hard permission boundaries. Claude SSHes from the MacBook into the control zone only; the data zone has no inbound shell.
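- Control zone (claude_user): SSH, Git, scripts, manifests, and the output folder (read-only mount). Cannot access /srv/data.
- Data zone (data_user): holds the raw data and writes sanitized results to /srv/output; the control zone mounts /srv/output read-only.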
Implementation options
Avoid: Remote Desktop / GUI sharing
SSH to claude_user is enough. Don't enable Remote Desktop / VNC / RustDesk for Claude: it widens the attack surface (clipboard, screen reads, keyboard injection) without giving Claude anything it actually needs. An SSH terminal plus a synced output folder is the cleanest channel.
3. The Feedback Budget
Each server run returns a bounded number of values to Claude: 30 items per run by default. Claude chooses which 30 to return via the feedback manifest and refines that manifest each iteration. This is what lets Claude tune parameters without seeing the underlying data.
3.1 Visualizing the 30-slot budget
3.2 What counts as 1 item
| Item type | Example | Cost |
|---|---|---|
| Scalar metric | accuracy = 0.87 | 1 |
| Per-class metric (k classes) | F1 per class | k |
| Top-K feature importance | top 10 by SHAP | K |
| Confusion matrix (k classes) | 3×3 = 9 cells | k² |
| Histogram (n bins) | 10-bin residuals | n |
| Validation curve point | val loss at epoch 10 | 1 |
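The cost table above can be turned into a small accounting check. The following is a minimal sketch, not the project's actual gate: the `ManifestEntry` class, its `kind` and `size` fields, and the kind names are hypothetical, chosen only to mirror the table rows.

```python
from dataclasses import dataclass

BUDGET = 30  # default items per run, as stated in this section


@dataclass(frozen=True)
class ManifestEntry:
    """One requested output. Field names are illustrative assumptions."""
    name: str
    kind: str      # "scalar", "per_class", "top_k", "confusion", "histogram", "curve_point"
    size: int = 1  # k classes, K features, or n bins; unused for 1-item kinds


def cost(entry: ManifestEntry) -> int:
    """Number of budget items an entry consumes, following the cost table."""
    if entry.kind in ("scalar", "curve_point"):
        return 1
    if entry.kind in ("per_class", "top_k", "histogram"):
        return entry.size
    if entry.kind == "confusion":
        return entry.size ** 2
    raise ValueError(f"unknown item kind: {entry.kind!r}")


def check_budget(manifest: list[ManifestEntry], budget: int = BUDGET) -> int:
    """Return the total cost, or raise if the manifest exceeds the budget."""
    total = sum(cost(e) for e in manifest)
    if total > budget:
        raise ValueError(f"manifest costs {total} items, budget is {budget}")
    return total


manifest = [
    ManifestEntry("accuracy", "scalar"),
    ManifestEntry("f1_per_class", "per_class", size=3),
    ManifestEntry("confusion_matrix", "confusion", size=3),
    ManifestEntry("shap_top_features", "top_k", size=10),
]
print(check_budget(manifest))  # 1 + 3 + 9 + 10 = 23
```

Adding a 10-bin histogram to this manifest would bring the total to 33 and make `check_budget` raise, which is the behavior a budget gate needs: reject the whole manifest rather than silently truncate it.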
3.3 Iteration strategy: broad → narrow
- Round 1: data quality probe. What does the data look like?
- Round 2: baseline metrics (A). Does anything work?
- Round 3+: hyperparameter sweep (B) or training diagnostics (C). Tune the best candidate.
- Final: custom manifest targeting specific failure modes.
4. Stack: Decisions & Options
4.1 Data Storage
4.2 Server / Compute
4.3 RTX 5090: what fits in 32 GB VRAM
| Workload | 5090 fit |
|---|---|
| Tabular ML / classical stats | ✗ CPU is faster anyway |
| 14B – 32B local LLMs | ✓ excellent fit |
| 70B local LLMs | ~ tight; needs Q3 quantization or CPU offload |
| Custom NN training (TS, vision) | ✓ good fit |
| Foundation-model inference (Chronos, TimesFM) | ✓ very fast |
4.4 Optional: Local LLM as Server-Side Analyst
A locally hosted LLM (via Ollama) can run on the 5090 alongside the forecasting models. It CAN see raw data, but its output passes through the same Sanitize + Budget Gate before reaching Claude.
[Diagram: local LLM (private, sees raw data) → Sanitize + Budget Gate (≤30 items) → Claude (no raw data)]
Recommended Ollama models (32 GB VRAM)
| Use case | Model | VRAM (Q4) | Notes |
|---|---|---|---|
| General + strong Chinese | qwen2.5:32b / qwen3:32b | ~20 GB | ★ Top pick |
| Strong reasoning | deepseek-r1:32b | ~20 GB | Distilled from DeepSeek-R1 |
| Small & fast | phi4:14b / qwen2.5:14b | ~9 GB | Punches above weight |
| Code review | qwen2.5-coder:32b | ~20 GB | Best open code model |
| Embeddings (multilingual) | bge-m3 | ~1 GB | Includes Chinese |
5. Time Series: Methods Ladder
Work through the tiers in order, simplest first. Neural networks are not always best for time series; tree-based methods with lag features beat NNs on many real-world problems. Don't reach for transformers until the simpler tiers hit a wall.
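To make "lag features" concrete, here is a minimal standard-library sketch of how a series is turned into the (features, target) rows a tree-based model trains on. The function name `make_lag_rows` and the toy series are illustrative assumptions; in practice a library would build these features for you.

```python
def make_lag_rows(series, lags, horizon=1):
    """Turn a list of values into (features, target) training rows.

    For each forecast origin t, the features are the values at t - lag for
    every lag (strictly in the past), and the target is the value `horizon`
    steps ahead of the last observed point, i.e. series[t + horizon - 1].
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series) - horizon + 1):
        features = [series[t - lag] for lag in lags]
        target = series[t + horizon - 1]
        rows.append((features, target))
    return rows


toy_series = [10, 11, 12, 13, 14, 15]  # synthetic values, not project data
for features, target in make_lag_rows(toy_series, lags=(1, 2)):
    print(features, "->", target)
# First row printed: [11, 10] -> 12
```

Because every feature index is `t - lag` with `lag >= 1`, no row can see its own target, which is the property that keeps a lag-based backtest free of look-ahead leakage.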
5.1 Recommended framework: Nixtla
Nixtla covers Tiers 1–3 with one unified API: the same patterns apply whether you call AutoARIMA, LightGBM with auto-lags, or PatchTST. The neuralforecast models are GPU-accelerated via PyTorch Lightning, so the 5090 is well utilized.
| Package | Covers |
|---|---|
| statsforecast | ARIMA, ETS, Theta: fast classical models (Tier 1) |
| mlforecast | LightGBM/XGBoost/CatBoost + auto lag features (Tier 1) |
| neuralforecast | N-BEATS, N-HiTS, TCN, PatchTST, iTransformer, TimeMixer, TFT (Tier 2-3) |
| chronos-forecasting | Amazon's pretrained foundation model (Tier 4) |
Install (Python 3.11+)
# Time-series suite (Tier 1-3)
pip install statsforecast mlforecast neuralforecast
# Foundation models (Tier 4)
pip install chronos-forecasting
# PyTorch with CUDA 12.8 for the 5090 (Blackwell needs CUDA 12.8+ wheels)
pip install torch --index-url https://download.pytorch.org/whl/cu128
5.2 If you confirm time series, please tell me
- Univariate (one signal) or multivariate (many features per timestep)?
- One series, or many parallel series (per product / sensor / region)?
- Granularity: seconds, minutes, hours, days, months?
- Forecast horizon: next step, next N steps, next year?
- Anything special: strong seasonality, intermittent / sparse, hierarchical?
⚡ Gas Power Plant Decision System
- Project framing: gas power plant decision system, modeling-contest entry
- Data: ~20 GB, ~100 dimensions, hourly resolution, >1 year history; provided by operator
- Operator also provides action limitations / constraints
- 3-step purpose: (1) understand the plant's action restrictions / freedom; (2) forecast key factors (electricity price, gas price, consumption, etc.); (3) set a daily quotation strategy to the power network that maximizes income
- Competitor: traditional accounting-method decisions ("old school, lack of tech")
- Timeline: 1–2 months
- Reference materials: 4 zh-Hant HTML docs in old_reference_powerfactory/ (especially 系統架構圖.html) describing the standard operating model
- Hardware: local 5090 desktop is the execution server
- Privacy / architecture: Sanitize + Budget Gate (≤30 items per run); two-zone setup on the 5090 (claude_user via SSH + data_user with no inbound shell)
- Forecast framework: Nixtla suite (statsforecast + mlforecast + neuralforecast) + chronos-forecasting
- Round-1 baseline: LightGBM (Tier 1) + Chronos (Tier 4) side-by-side for the 6 forecast factors
- Phased delivery: Phase 1 = forecast layer (build distributions for the 6 factors); Phase 2 = strategy / quotation layer
- Decision approach: explicit decision engine (scenario optimization or rules + solver) + interactive UI for operator override; constraints inside the solver, not post-hoc
- Final outputs: executable actions (unit commitment / dispatch / gas procurement / gas resale or storage / skip generation and trade instead), not model scores
- "Quantify the win" metric: backtested P&L of the model's decisions vs the operator's recorded decisions over the same window, controlled for action constraints
- Stage 1 first deliverable: the Sanitize + Budget Gate, before any modeling code
- Optional helper: Qwen 32B via Ollama on the same 5090, for server-side hypothesis generation with structured-JSON output only
- Time granularity for decisions: hourly, 4-hour, or daily?
- Forecast horizon: next hour, next day-ahead market window (24h), next 168h, longer?
- Concrete action space: what specific actions can the plant take? (unit start/stop · output level · gas buy/sell/store · contract calls)
- Action-constraints format: how will the operator deliver them? Document, spreadsheet, code?
- Override authority: which constraints are hard-locked vs human-overridable (with audit)?
- Column contract: are tables already structured (PK + timestamp + units), or do we define the schema?
- Operator's decision log: do you have a record of past decisions to backtest the model against?
- Decision layer: rules engine, explicit solver (Pyomo / OR-Tools), or hybrid?
- Distribution output format: mean+variance / quantiles / scenario trees / all of the above?
- Update frequency: real-time hourly arrivals or daily batches?
- Privacy reason: commercial confidentiality, regulatory, both?
- Contest: deadline + submission format (paper / live demo / metric)?
Standard Operating Model: daily / per-run cycle
This is what the system repeats every run. Numbers indicate data-flow order, NOT development phases. (Source: old_reference_powerfactory/系統架構圖.html.)
Standard Data Intake
Standard Feature Confluence
Numeric features + event vectors + unit state + inventory + market signals, under a unified timestamp and column contract.
Standard Forecasting
Ensemble of forecast models (NOT a single monolith). Outputs distributions, not point predictions. Targets: power price, gas procurement cost, gas resale netback, net-load gap, unit availability, supply-disruption probability.
Standard Distribution Output
Every key factor is delivered to the decision layer in a uniform format: mean, variance, quantiles, peak probability, scenarios. Point forecasts alone make tail risk invisible.
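As an illustration of the uniform format, the sketch below collapses a set of scenario samples for one factor into mean, variance, quantiles, and peak probability. It is an assumption-laden example: the function name, the record keys, the choice of the 5/50/95 % quantiles, and the definition of "peak" as "above a given threshold" are all hypothetical, and the scenario list itself is left out of the record.

```python
import statistics


def summarize_distribution(samples, peak_threshold, probs=(0.05, 0.5, 0.95)):
    """Collapse scenario samples for one factor into a fixed summary record."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    n = len(ordered)

    def quantile(p):
        # Linear interpolation between neighbouring order statistics.
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        return ordered[lo] * (1 - frac) + ordered[hi] * frac

    return {
        "mean": statistics.fmean(ordered),
        "variance": statistics.pvariance(ordered),
        "quantiles": {f"q{round(p * 100):02d}": quantile(p) for p in probs},
        # Share of scenarios above the (assumed) peak threshold.
        "peak_probability": sum(x > peak_threshold for x in ordered) / n,
    }


# Synthetic price scenarios for a single hour, not real data.
record = summarize_distribution([40.0, 50.0, 60.0, 70.0, 80.0], peak_threshold=65.0)
print(record["mean"], record["peak_probability"])  # 60.0 0.4
```

The point of a fixed record shape is that the decision layer can consume every factor the same way. The tail quantiles and the peak probability are the fields a point forecast would drop.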
Standard Decision Interface
Constraints: built into the solver, not post-hoc
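A toy example may help show what "inside the solver" means. The sketch below is not the project's decision engine: it enumerates a handful of output levels, and the limit names (`min_out`, `max_out`, `max_ramp`), the prices, and the fuel cost are invented for illustration. A real implementation would more likely hand the same constraints to a solver such as Pyomo or OR-Tools.

```python
def feasible_levels(levels, prev_level, min_out, max_out, max_ramp):
    """Keep only output levels (MW) that respect the operating limits.

    0 means "unit off"; any running level must sit in [min_out, max_out],
    and the change from the previous hour must not exceed max_ramp.
    """
    return [
        x for x in levels
        if (x == 0 or min_out <= x <= max_out) and abs(x - prev_level) <= max_ramp
    ]


def best_action(levels, prev_level, scenarios, fuel_cost, *, min_out, max_out, max_ramp):
    """Pick the feasible level with the highest expected profit.

    scenarios: list of (price, probability) pairs whose probabilities sum to 1.
    """
    candidates = feasible_levels(levels, prev_level, min_out, max_out, max_ramp)
    if not candidates:
        raise ValueError("no feasible action under the given constraints")

    def expected_profit(x):
        return sum(prob * (price - fuel_cost) * x for price, prob in scenarios)

    return max(candidates, key=expected_profit)


choice = best_action(
    levels=[0, 100, 200, 300, 400],
    prev_level=100,
    scenarios=[(30.0, 0.5), (70.0, 0.5)],  # illustrative price scenarios
    fuel_cost=40.0,
    min_out=100, max_out=400, max_ramp=150,
)
print(choice)  # 200
```

An unconstrained optimizer would pick 400 MW here, and a post-hoc check could only reject that answer. Searching over the feasible set returns the best action the plant can actually execute.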
Standard Operational Output
Final outputs are executable actions, not model scores.
Standard Feedback & Monitoring
Actual prices, P&L, deviations, override reasons → write back for monitoring, retraining, governance.
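Once actual prices and decisions are written back, the model-versus-operator P&L comparison named in the project notes reduces to simple arithmetic. This is a minimal sketch under stated assumptions: the record keys (`price`, `fuel_cost`, `model_mwh`, `operator_mwh`) are hypothetical, both decision series are assumed to already satisfy the same action constraints, and the numbers are synthetic.

```python
def pnl(records, decision_key):
    """Sum of hourly margin times dispatched energy for one decision series."""
    return sum((r["price"] - r["fuel_cost"]) * r[decision_key] for r in records)


def backtest_uplift(records):
    """Compare model and operator decisions over the same window."""
    model = pnl(records, "model_mwh")
    operator = pnl(records, "operator_mwh")
    return {"model_pnl": model, "operator_pnl": operator, "uplift": model - operator}


records = [
    {"price": 70.0, "fuel_cost": 40.0, "model_mwh": 200.0, "operator_mwh": 150.0},
    {"price": 30.0, "fuel_cost": 40.0, "model_mwh": 0.0, "operator_mwh": 100.0},
]
print(backtest_uplift(records))
# {'model_pnl': 6000.0, 'operator_pnl': 3500.0, 'uplift': 2500.0}
```

The result is three scalars, so returning it through the feedback channel would cost three budget items.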
🎮 Gaming Platform Security Audit
- Project type: security audit on a gaming platform
- Data: ~2 TB database (needs cleansing first)
- Source code in GitLab, multiple snapshot versions
- System logs available
- Insider operator (the "cook") with full schema + data-flow knowledge
- Goal: detect internal or external security compromises
- Timeline: ~1 month
- Note: separate from power-factory project; may reuse the same framework
- Reuse the Sanitize + Budget Gate infrastructure on a different data plane
- Workflow: data cleansing → detection scripts → cross-reference (logs ↔ DB events ↔ source-code diffs across snapshots) → report
- Detection methods: anomaly detection on access patterns, transaction flows, login times, code-diff anomalies
- Sanitization output: only counts / severity scores / anonymized identifiers flow back to Claude; never raw log lines, user IDs, or balances
- Sequencing: likely after Power Factory (or in parallel if your bandwidth allows); reuses the two-zone architecture
- Type of gaming platform: online casino, sportsbook, mobile game, social gaming, esports?
- Suspected compromise type: data exfiltration, account takeover, code injection, financial fraud, internal abuse?
- Known IOCs: any indicators of compromise already identified to start from?
- DB type: MySQL, PostgreSQL, MongoDB, other?
- Source-code language(s): affects code-diff analysis tooling
- Time window: investigate past month, year, or all-time?
- Storage: will the 2 TB live on the 5090 (does it have the disk?), or external?
- Compliance: PCI-DSS, GDPR, gaming licensing requirements? Affects what can leave the data zone.
- Output format: written report, dashboard, SIEM-style alert feed?
- Timing: before, after, or in parallel with the Power Factory project?
Note: This project shares the Sanitize + Budget Gate infrastructure with the power-factory project but operates on a different data plane. It will likely reuse the same framework with separate scripts and manifests.
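For the audit's sanitized output (counts, severity scores, anonymized identifiers), one possible shape is sketched below. It assumes keyed hashing (HMAC-SHA256) with a secret that never leaves the data zone; the function names, the alert fields, and the sample IDs are hypothetical. Whether keyed hashing is sufficient anonymization depends on the compliance answers still open above.

```python
import hashlib
import hmac
from collections import Counter


def pseudonymize(user_id: str, secret_key: bytes, length: int = 12) -> str:
    """Map a raw user ID to a stable pseudonym using a keyed hash.

    The key is assumed to stay in the data zone, so the mapping cannot be
    recomputed or reversed from the control zone.
    """
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "u_" + digest[:length]


def summarize_alerts(alerts, secret_key, top_n=3):
    """Reduce raw alerts to counts, a severity score, and pseudonymous IDs."""
    counts = Counter(pseudonymize(a["user_id"], secret_key) for a in alerts)
    return {
        "alert_count": len(alerts),
        "max_severity": max((a["severity"] for a in alerts), default=0),
        "top_accounts": counts.most_common(top_n),
    }


# Synthetic alerts for illustration only.
alerts = [
    {"user_id": "player-001", "severity": 3},
    {"user_id": "player-001", "severity": 7},
    {"user_id": "player-002", "severity": 2},
]
summary = summarize_alerts(alerts, secret_key=b"data-zone-only-secret")
print(summary["alert_count"], summary["max_severity"])  # 3 7
```

The same account always maps to the same pseudonym within a run, so Claude can still reason about repeat offenders without ever seeing a real user ID. Each entry in `top_accounts` would still count against the 30-item budget.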
6. Stage Timeline
- Q1 Data type: time series
- Q3 Goal: forecasting (implied)
- Q5 / Q8 Hardware: local RTX 5090 (sunk cost)
- Framework: Nixtla suite + Chronos
6.1 Questions for You
Green = answered. Gray = still need an answer.
6.2 Starter Setup
- Compute: the 5090 machine itself is the execution server.
- Data store: PostgreSQL or Parquet on the 5090.
- Code: private Git repo (GitHub); Claude reads & writes.
- Time-series stack: statsforecast + mlforecast + neuralforecast + chronos-forecasting. PyTorch built for CUDA 12.8 or newer (needed for the Blackwell-generation 5090).
- Sanitize + Budget Gate: Python module enforcing ≤30 items.
- Optional local LLM: qwen2.5:32b via Ollama on the same 5090.
- Experiment tracking: MLflow (self-hosted) or JSON files.