GPT-5.5-xhigh: 87.5
GPT-5.5-high: 112.5
GPT-5.5-medium: 62.5
GPT-5.4-xhigh: 62.5
GPT-5.4-high: 87.5
Multi-model metrics for this run
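| Item | 5.5-xhigh | 5.5-high | 5.5-medium | 5.4-xhigh | 5.4-high | Opus 4.8 high |
| --- | --- | --- | --- | --- | --- | --- |
| Passed | 7/12 | 9/12 | 5/12 | 5/12 | 7/12 | 7/12 |
| IQ | 87.5 | 112.5 | 62.5 | 62.5 | 87.5 | 87.5 |
| Agent steps | 495 | 448 | 401 | 476 | 460 | 611 |
| Cost | $40.7 | $29.8 | $21.7 | $20.9 | $16.8 | $58.5 |
| Cache hit rate | 95.2% | 95.3% | 95.1% | 94.9% | 96.4% | 97.6% |
| Duration | 2.6h | 1.9h | 1.7h | 3.2h | 2.3h | 2.9h |
| Total tokens | 41.1M | 31.7M | 23.4M | 37.9M | 34.7M | 57.7M |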
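Note: Opus 4.8 was run on CC Max 20x. The Opus column is an occasional reference column; it appears only in the table and is not included in the trend chart.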
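The reported IQ values are consistent with a linear scale of 12.5 points per passed task: 7 passes gives 87.5, 9 gives 112.5, and 5 gives 62.5. The page does not state the formula, so the Python sketch below treats that scale as an assumption inferred from the numbers. `POINTS_PER_PASS` and `iq_from_passes` are hypothetical names introduced for illustration.

```python
# Assumption: IQ scales linearly with the number of passed tasks.
# The 12.5 points-per-pass factor is inferred from the reported values,
# not taken from any documented formula.
POINTS_PER_PASS = 12.5

# Pass counts (out of 12), copied from the metrics for this run.
passes = {
    "5.5-xhigh": 7,
    "5.5-high": 9,
    "5.5-medium": 5,
    "5.4-xhigh": 5,
    "5.4-high": 7,
    "Opus 4.8 high": 7,
}


def iq_from_passes(passed: int) -> float:
    """Return the IQ implied by a pass count under the assumed linear scale."""
    return passed * POINTS_PER_PASS


iq = {model: iq_from_passes(n) for model, n in passes.items()}

for model, score in iq.items():
    print(f"{model}: {score}")
```

Under this assumed scale, 8 of 12 passes would correspond to an IQ of 100.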
Fixed evaluation task set
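Each daily IQ data point uses this fixed set of mixed-language DeepSWE tasks; each row shows today's result for every model configuration on the same task.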
| # | Task | Task ID | 5.5-xhigh | 5.5-high | 5.5-medium | 5.4-xhigh | 5.4-high |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 01 | Add JSONPath query APIs to orderedmap and Starlark modules | ytt-jsonpath-query-api | Pass | Fail | Pass | Pass | Pass |
| 02 | Add build-time grammar conflict analysis to participle | participle-grammar-conflict-analysis | Fail | Fail | Fail | Fail | Pass |
| 03 | Harden module loading, cache introspection, and script flags | abs-module-cache-flags | Fail | Pass | Pass | Pass | Fail |
| 04 | Add multipart response parsing to HTTPX | httpx-multipart-response-parsing | Pass | Pass | Pass | Pass | Pass |
| 05 | Add incremental cache controls to Bandit | bandit-incremental-cache-control | Pass | Pass | Fail | Fail | Fail |
| 06 | Add session bundle recording and replay to IPython | ipython-session-bundle-replay | Fail | Pass | Fail | Fail | Fail |
| 07 | Add a per-origin circuit breaker to ofetch | ofetch-per-origin-circuit-breaker | Pass | Pass | Pass | Pass | Pass |
| 08 | Add link format conversion between wiki and markdown syntax | obsidian-linter-link-format-conversion | Fail | Fail | Fail | Fail | Fail |
| 09 | Add atomic signal selectors to Kea | kea-atomic-signal-selectors | Pass | Pass | Fail | Fail | Pass |
| 10 | Add shorthand expansion and compression to the lexer | csstree-shorthand-expansion-compression | Pass | Pass | Fail | Fail | Pass |
| 11 | Add deterministic multi-key sorting to fd | fd-deterministic-multi-key-sorting | Fail | Pass | Fail | Fail | Pass |
| 12 | Preserve structure needed by stylesheet selectors | oxvg-structural-selector-preservation | Pass | Pass | Pass | Pass | Fail |
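The per-task results above can be tallied to cross-check the reported pass counts and to see which tasks separate the configurations. The following Python sketch is illustrative only: it encodes each configuration's results for tasks 01 to 12 as a string of `P` (pass) and `F` (fail), and the variable names are hypothetical.

```python
# Per-task outcomes for tasks 01..12, transcribed from the results above.
# "P" = passed, "F" = failed. The encoding is an illustrative choice.
results = {
    "5.5-xhigh":  "PFFPPFPFPPFP",
    "5.5-high":   "FFPPPPPFPPPP",
    "5.5-medium": "PFPPFFPFFFFP",
    "5.4-xhigh":  "PFPPFFPFFFFP",
    "5.4-high":   "PPFPFFPFPPPF",
}
N_TASKS = 12

# Number of passed tasks per configuration.
pass_counts = {model: row.count("P") for model, row in results.items()}

# For each task, how many configurations passed it.
solved_by = [
    sum(row[i] == "P" for row in results.values()) for i in range(N_TASKS)
]

# Task numbers (1-based) passed by every configuration, and by none.
all_pass = [i + 1 for i, n in enumerate(solved_by) if n == len(results)]
none_pass = [i + 1 for i, n in enumerate(solved_by) if n == 0]

print(pass_counts)
print("passed by all:", all_pass)
print("passed by none:", none_pass)
```

The tallies reproduce the reported pass counts of 7, 9, 5, 5, and 7. Tasks 04 and 07 are passed by all five configurations and task 08 by none, so the remaining nine tasks account for all of the differences between configurations in this run.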