GPT-5.5-xhigh: 87.5
GPT-5.5-high: 125.0
GPT-5.5-medium: 100.0
GPT-5.4-xhigh: 100.0
GPT-5.4-high: 100.0
Multi-model metrics for this run

Metric          5.5-xhigh  5.5-high  5.5-medium  5.4-xhigh  5.4-high  Opus 4.8 high
Tasks passed    7/12       10/12     8/12        8/12       8/12      7/12
IQ              87.5       125.0     100.0       100.0      100.0     87.5
Agent steps     495        402       363         542        432       611
Cost            $41.4      $26.6     $20.6       $22.4      $15.5     $58.5
Cache hit rate  95.5%      93.9%     94.3%       95.5%      95.3%     97.6%
Elapsed time    3.1h       1.8h      1.5h        4.1h       2.6h      2.9h
Total tokens    43.2M      25.3M     20.7M       43.8M      29.8M     57.7M
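Note: Opus 4.8 ran under CC Max 20x. The Opus column is an occasional reference column; it appears only in the table and is not included in the trend chart.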
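
The IQ row appears to track the pass counts linearly: every value in the table equals the number of passed tasks times 12.5, so 8 passes correspond to 100. The page does not state the formula, so the baseline of 8 below is an inference from these numbers only, and `BASELINE_PASSES` and `iq_from_passes` are hypothetical names. A minimal Python sketch that reproduces the row under that assumption:

```python
# Assumption: IQ = passes / BASELINE_PASSES * 100, inferred from the table above.
BASELINE_PASSES = 8  # hypothetical baseline; not stated on the page

# Pass counts (out of 12) copied from the "Tasks passed" row.
passes = {
    "5.5-xhigh": 7,
    "5.5-high": 10,
    "5.5-medium": 8,
    "5.4-xhigh": 8,
    "5.4-high": 8,
    "Opus 4.8 high": 7,
}


def iq_from_passes(n_passed, baseline=BASELINE_PASSES):
    """Scale a pass count so that `baseline` passes map to 100."""
    return round(n_passed / baseline * 100, 1)


iq = {model: iq_from_passes(n) for model, n in passes.items()}

for model, score in iq.items():
    print(f"{model}: {score}")
```

If the real scoring uses a different baseline or weights tasks unevenly, only `iq_from_passes` would need to change.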
Fixed evaluation task set
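Every daily IQ data point uses this same fixed set of mixed-language DeepSWE tasks; each row shows how today's model configurations performed on the same task.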
01 Add JSONPath query APIs to orderedmap and Starlark modules
   ytt-jsonpath-query-api: 5.5-xhigh pass, 5.5-high pass, 5.5-medium pass, 5.4-xhigh pass, 5.4-high pass
02 Add build-time grammar conflict analysis to participle
   participle-grammar-conflict-analysis: 5.5-xhigh fail, 5.5-high pass, 5.5-medium fail, 5.4-xhigh pass, 5.4-high pass
03 Harden module loading, cache introspection, and script flags
   abs-module-cache-flags: 5.5-xhigh pass, 5.5-high pass, 5.5-medium pass, 5.4-xhigh pass, 5.4-high pass
04 Add multipart response parsing to HTTPX
   httpx-multipart-response-parsing: 5.5-xhigh pass, 5.5-high pass, 5.5-medium fail, 5.4-xhigh pass, 5.4-high pass
05 Add incremental cache controls to Bandit
   bandit-incremental-cache-control: 5.5-xhigh fail, 5.5-high pass, 5.5-medium fail, 5.4-xhigh pass, 5.4-high fail
06 Add session bundle recording and replay to IPython
   ipython-session-bundle-replay: 5.5-xhigh fail, 5.5-high fail, 5.5-medium pass, 5.4-xhigh fail, 5.4-high fail
07 Add a per-origin circuit breaker to ofetch
   ofetch-per-origin-circuit-breaker: 5.5-xhigh pass, 5.5-high pass, 5.5-medium pass, 5.4-xhigh pass, 5.4-high pass
08 Add link format conversion between wiki and markdown syntax
   obsidian-linter-link-format-conversion: 5.5-xhigh fail, 5.5-high pass, 5.5-medium pass, 5.4-xhigh fail, 5.4-high fail
09 Add atomic signal selectors to Kea
   kea-atomic-signal-selectors: 5.5-xhigh pass, 5.5-high fail, 5.5-medium pass, 5.4-xhigh pass, 5.4-high pass
10 Add shorthand expansion and compression to the lexer
   csstree-shorthand-expansion-compression: 5.5-xhigh pass, 5.5-high pass, 5.5-medium pass, 5.4-xhigh fail, 5.4-high pass
11 Add deterministic multi-key sorting to fd
   fd-deterministic-multi-key-sorting: 5.5-xhigh fail, 5.5-high pass, 5.5-medium fail, 5.4-xhigh fail, 5.4-high fail
12 Preserve structure needed by stylesheet selectors
   oxvg-structural-selector-preservation: 5.5-xhigh pass, 5.5-high pass, 5.5-medium pass, 5.4-xhigh pass, 5.4-high pass
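
The per-task results can be cross-checked against the pass counts in the metrics table. The Python sketch below is one way to do that, assuming the pass/fail grid is transcribed by hand from the task list (`P` = pass, `F` = fail); `MODELS`, `RESULTS`, `totals`, and `solved_by_all` are illustrative names, not part of any tooling behind this page.

```python
MODELS = ["5.5-xhigh", "5.5-high", "5.5-medium", "5.4-xhigh", "5.4-high"]

# One string per task (01-12); character i is the result for MODELS[i].
RESULTS = [
    "PPPPP",  # 01 ytt-jsonpath-query-api
    "FPFPP",  # 02 participle-grammar-conflict-analysis
    "PPPPP",  # 03 abs-module-cache-flags
    "PPFPP",  # 04 httpx-multipart-response-parsing
    "FPFPF",  # 05 bandit-incremental-cache-control
    "FFPFF",  # 06 ipython-session-bundle-replay
    "PPPPP",  # 07 ofetch-per-origin-circuit-breaker
    "FPPFF",  # 08 obsidian-linter-link-format-conversion
    "PFPPP",  # 09 kea-atomic-signal-selectors
    "PPPFP",  # 10 csstree-shorthand-expansion-compression
    "FPFFF",  # 11 fd-deterministic-multi-key-sorting
    "PPPPP",  # 12 oxvg-structural-selector-preservation
]

# Count passes per model by reading down each column.
totals = {
    model: sum(row[i] == "P" for row in RESULTS)
    for i, model in enumerate(MODELS)
}

# Task numbers (1-based) that every listed model passed.
solved_by_all = [n for n, row in enumerate(RESULTS, start=1) if set(row) == {"P"}]

print(totals)
print(solved_by_all)
```

The column totals match the "Tasks passed" row for the five GPT configurations; Opus 4.8 has no per-task results listed here, so it is left out.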