MentorAI GPU Worklog

Current Assessment

Fastest usable talking-video pipeline

Wan2.2 S2V

About 13.55 s of generation per output-second

Most expressive sample

InfiniteTalk

Short samples are usable; full-length runs are currently too slow.

Reference image generation

Z-Image

About 4 s per image on later seeds

Long-run risk

>70 min

Projected for InfiniteTalk at 257 frames / 32 steps

Work Board

Done 5

ComfyUI + Cloudflare working end to end

ComfyUI on G49206 is reachable through the momooAi.com entry point.

infra, done

Wan2.2 S2V local workflow

Segment 01 produced 10.31 s of output in about 139.72 s of generation.

fastest, local
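
The "seconds per output-second" figure used on this page is generation wall time divided by clip duration. A minimal sketch of that arithmetic, using the Segment 01 numbers; the function name is hypothetical and not part of any existing tooling:

```python
def seconds_per_output_second(gen_seconds: float, output_seconds: float) -> float:
    """Generation wall time divided by the duration of the produced clip."""
    if output_seconds <= 0:
        raise ValueError("output_seconds must be positive")
    return gen_seconds / output_seconds


# Segment 01: 139.72 s of generation for a 10.31 s clip.
ratio = seconds_per_output_second(139.72, 10.31)
print(f"{ratio:.2f} s per output-second")
```

The same function can be applied to any row of the timing table that records both a generation time and an output duration.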

Z-Image teacher reference image

Selected a male-teacher reference image with a gentle smile and a clean green-screen background.

reference

Wan2.1 InfiniteTalk 81-frame sample

The 3.24 s sample is complete and was sent as Telegram message 24.

sample, slow

Model timing table

Markdown and CSV files are in place; every later generation must be appended to them.

benchmark

Watching 3

InfiniteTalk full-length runs are too slow

At 257 frames / 32 steps, the first sampler step took about 135.93 s.

bottleneck
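
The ">70 min" risk figure can be reproduced by extrapolating the first sampler step across all 32 steps. This is a rough sketch that assumes every step costs about as much as the first one (an assumption; later steps may differ once warm-up cost is excluded), and the function name is hypothetical:

```python
def projected_minutes(step_seconds: float, steps: int) -> float:
    """Linear extrapolation: one measured step time multiplied by the step count."""
    return step_seconds * steps / 60.0


# First sampler step of the 257-frame / 32-step InfiniteTalk attempt.
minutes = projected_minutes(135.93, 32)
print(f"projected sampling time: {minutes:.1f} min")
```

The estimate covers sampling only; model load, encoding, and decoding are not included.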

InfiniteTalk CFG limitation

`cfg=5` triggers an audio batch shape mismatch; the stable value is `cfg=1`.

known failure

Partner/API nodes not in use

Partner nodes such as Seedance and Kling require a Comfy API key and credits.

blocked

Next Steps 4

Append every new generation to this page

Do not leave results only in research files or terminal logs.

process

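Test Wan2.2 S2V quality parameters

Use Wan2.2 S2V as the main line for long segments, and try to improve expressiveness while keeping generation time under control.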

quality

Keep InfiniteTalk to short samples

Compare lip sync, expression, and identity at 81 frames first; do not jump straight to runs of 10 s or more.

cost control
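
Frame counts map to clip length through the frame rate, which helps when sizing a sample before committing GPU time. A small sketch; the 25 fps value comes from the InfiniteTalk long-attempt row of the timing table and is assumed to apply to the 81-frame sample as well:

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Duration in seconds of a clip with the given frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


# Assumed 25 fps for InfiniteTalk, 16 fps for Wan2.2 S2V (per the timing table).
print(f"81 frames  @ 25 fps: {clip_seconds(81, 25):.2f} s")
print(f"257 frames @ 25 fps: {clip_seconds(257, 25):.2f} s")
print(f"165 frames @ 16 fps: {clip_seconds(165, 16):.2f} s")
```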

If Trello is needed, export from this page

Treat this page as the single source of truth and create Trello cards from it to avoid double entry.

trello-ready
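
One way to keep this page as the single source is to hold each board item as structured data and derive card payloads from it. The sketch below only builds payload dictionaries and makes no network calls. The `BoardItem` type and the list IDs are hypothetical; the `name`, `desc`, and `idList` keys follow Trello's card-creation parameters as I understand them and should be checked against the Trello API reference before use.

```python
from dataclasses import dataclass


@dataclass
class BoardItem:
    title: str
    detail: str
    tag: str
    column: str  # board column, e.g. "Done", "Watching", "Next Steps"


def to_card_payload(item: BoardItem, list_ids: dict[str, str]) -> dict[str, str]:
    """Build one card payload; list_ids maps a column name to a Trello list ID."""
    return {
        "idList": list_ids[item.column],
        "name": item.title,
        "desc": f"{item.detail}\n\nTag: {item.tag}",
    }


# Placeholder list ID; a real one would come from the target Trello board.
example = BoardItem(
    title="Keep InfiniteTalk to short samples",
    detail="Compare lip sync, expression, and identity at 81 frames first.",
    tag="cost control",
    column="Next Steps",
)
payload = to_card_payload(example, {"Next Steps": "LIST_ID_PLACEHOLDER"})
print(payload["name"])
```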

Model Timing Table

Date	Pipeline	Params	Output	Gen time (s)	Gen s / output s	VRAM	Status	Notes
2026-06-13	Z-Image Turbo still reference	768x768; 8 steps; cfg=1.5; euler/simple	PNG	~10	n/a	n/r	pass	First run includes model load.
2026-06-13	Z-Image Turbo still reference	768x768; 8 steps; cfg=1.5; euler/simple	PNG	~4	n/a	n/r	pass	Later seeds.
2026-06-13	Wan2.2 S2V local smoke	512x512; 65 frames; 16fps; 8 steps; cfg=1	4.06s MP4	n/r	n/r	n/r	pass	Generation wall time not recorded.
2026-06-13	Wan2.2 S2V local segment01	512x512; 165 frames; 16fps; 8 steps; cfg=1	10.31s MP4	139.72	13.55	~30GB	pass	Fastest usable talking-video branch so far.
2026-06-13	Wan2.2 S2V local composite	1920x1080; 30fps composite	70.83s MP4	n/r	n/r	n/a	pass	Composite render time not captured separately.
2026-06-14	Wan2.1 InfiniteTalk long attempt	832x480; 257 frames; 25fps; 32 steps	target 10.3s MP4	229.10	projected slow	~40GB	interrupted	Step 1/32 took ~135.93s; projected >70 min.
2026-06-14	Wan2.1 InfiniteTalk bad CFG test	832x480; 81 frames; 20 steps; cfg=5	target 3.24s MP4	3.53	n/a	~27GB	fail	CFG/audio batch mismatch.
2026-06-14	Wan2.1 InfiniteTalk stable sample	832x480; 81 frames; 20 steps; cfg=1; audio_scale=1	3.24s MP4	267.06	82.43	~31GB	pass	Telegram message 24.

n/r = not recorded; n/a = not applicable.

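Single Source and Logging Rules

Logging rules for new entries

Append every new generation to this page and to the CSV first; do not leave it only in chat or the terminal.
Take generation time from ComfyUI's `Prompt executed in ... seconds` line; do not treat the video duration as the generation time.
Record at least the model, workflow, resolution, frames, fps, steps, CFG, audio scale, VRAM, and output path for every entry.
For long jobs, run an 81-frame or 3-5 second sample first, then decide whether to run the full segment.
To sync with Trello, create cards from this page's work board so that Trello and the repo documents do not drift apart.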
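
The rules above can be sketched as a small helper that reads the generation time from a ComfyUI log and appends one row to a CSV. The column names in `FIELDS` are hypothetical and should be aligned with the real benchmark CSV header; the regular expression only matches the seconds form quoted in the rule and raises on anything else.

```python
import csv
import re
from pathlib import Path

# Matches the log line quoted in the rules, e.g. "Prompt executed in 267.06 seconds".
PROMPT_TIME = re.compile(r"Prompt executed in ([0-9]+(?:\.[0-9]+)?) seconds")

# Hypothetical column names covering the minimum fields required by the rules.
FIELDS = [
    "date", "model", "workflow", "resolution", "frames", "fps", "steps",
    "cfg", "audio_scale", "vram", "output_path", "output_seconds", "gen_seconds",
]


def parse_gen_seconds(log_text: str) -> float:
    """Return the generation time of the last prompt found in the log text."""
    matches = PROMPT_TIME.findall(log_text)
    if not matches:
        raise ValueError("no 'Prompt executed in ... seconds' line found")
    return float(matches[-1])


def append_record(csv_path: Path, record: dict) -> None:
    """Append one row, writing the header first if the file does not exist yet."""
    missing = [name for name in FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    new_file = not csv_path.exists()
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow(record)


sample_log = "got prompt\nPrompt executed in 267.06 seconds\n"
print(parse_gen_seconds(sample_log))
```

Rejecting records with missing fields keeps incomplete rows out of the CSV, which matches the intent of the minimum-fields rule.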

Data Sources

Benchmark Markdown docs/research/avatar-generation-benchmarks.md

Benchmark CSV storage/research/avatar-generation-benchmarks.csv

Published Dashboard mentorai-worklog.pages.dev

Expression Research docs/research/avatar-expression-control-models.md

RunPod Skill Reference skills/runpod-mentorai/references/comfyui-longcat.md

Latest Deliverables

StudioV1 2026-06-21 Demo Worklog

Wan2.1 InfiniteTalk stable sample
