gpt-5.4-mini scaffold-assembly-v1
Provider: openai · Run at 2026-06-09 · Submission 4c42a6c5
Tier breakdown
The 3-dimension diagnostic matrix for this submission. A tier shows no CBS until the structural prover returns a definitive verdict there.
| Tier | BOM pass | Scope pass | Buildability | CaSS | CBS | Coverage | n |
|---|---|---|---|---|---|---|---|
| trivial | 75.0% | 0.0% | 0.0% | 0.0% | 0.0% | 75% | 4 |
| easy | 100.0% | 25.0% | 25.0% | 25.0% | 25.0% | 100% | 4 |
| medium | 25.0% | 25.0% | 0.0% | 25.0% | 0.0% | 25% | 4 |
| hard | 50.0% | 0.0% | — | 0.0% | 0.0% | 0% | 4 |
| stretch | 25.0% | 0.0% | — | 0.0% | 0.0% | 0% | 4 |
| expert | 75.0% | 25.0% | 0.0% | 25.0% | 0.0% | 25% | 4 |
| master | 20.0% | 0.0% | 0.0% | 0.0% | 0.0% | 20% | 5 |
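The tier percentages can be reproduced from the per-task rows. The following is a minimal sketch, not the benchmark's own aggregation code: it assumes each pass rate is the mean of the 0/1 column over the tier, that Coverage is the share of tasks whose Build cell is not "incon.", and that Buildability is undefined when no task in the tier has a definitive verdict. The function name `tier_summary` and the tuple layout are hypothetical. CBS is left out because its definition is not stated on this page.

```python
from collections import defaultdict

# (slug, tier, cass, bom, scope, build) copied from the per-task table.
# build is None where the table shows "incon." (no definitive verdict).
ROWS = [
    ("small-table", "easy", 0, 1, 0, 0),
    ("step-pyramid", "easy", 1, 1, 1, 1),
    ("simple-gate", "easy", 0, 1, 0, 0),
    ("picture-frame", "easy", 0, 1, 0, 0),
    ("minifig-chair", "medium", 1, 1, 1, 0),
    ("bookshelf-3-shelf", "medium", 0, 0, 0, None),
    ("simple-bridge", "medium", 0, 0, 0, None),
    ("square-lantern", "medium", 0, 0, 0, None),
]


def tier_summary(rows):
    """Aggregate per-task 0/1 results into per-tier percentages."""
    by_tier = defaultdict(list)
    for _slug, tier, cass, bom, scope, build in rows:
        by_tier[tier].append((cass, bom, scope, build))

    summary = {}
    for tier, items in by_tier.items():
        n = len(items)
        decided = [build for *_, build in items if build is not None]
        summary[tier] = {
            "cass": 100.0 * sum(c for c, _, _, _ in items) / n,
            "bom_pass": 100.0 * sum(b for _, b, _, _ in items) / n,
            "scope_pass": 100.0 * sum(s for _, _, s, _ in items) / n,
            "coverage": 100.0 * len(decided) / n,
            # Assumption: denominator is all tasks in the tier; the table's
            # values do not distinguish this from "decided tasks only".
            "buildability": 100.0 * sum(decided) / n if decided else None,
            "n": n,
        }
    return summary


summary = tier_summary(ROWS)
print(summary["easy"])
print(summary["medium"])
```

Under these assumptions the easy and medium rows of the table above are recovered from the eight tasks listed.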
Per-task results
All 29 tasks attempted by this baseline.
| # | Slug | Tier | CaSS | BOM | Scope | Build | Ref sim | Iter | Tokens | Duration | Status | 3D |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0001 | flat-plate-5x7 | trivial | 0 | 0 | 0 | incon. | 0.000 | 3 | 5.2k | 12.0s | DNC | — |
| 0002 | flat-plate-5x5 | trivial | 0 | 1 | 0 | 0 | 0.000 | 7 | 15.4k | 15.6s | | 3D |
| 0003 | staggered-column-2x2 | trivial | 0 | 1 | 0 | 0 | 0.000 | 4 | 14.1k | 10.2s | | 3D |
| 0004 | hollow-frame-8x8 | trivial | 0 | 1 | 0 | 0 | 0.000 | 8 | 45.4k | 23.9s | | 3D |
| 0005 | small-table | easy | 0 | 1 | 0 | 0 | 0.000 | 9 | 50.4k | 17.2s | | 3D |
| 0006 | step-pyramid | easy | 1 | 1 | 1 | 1 | 0.400 | 9 | 27.6k | 14.8s | pass | 3D |
| 0007 | simple-gate | easy | 0 | 1 | 0 | 0 | 0.000 | 9 | 58.0k | 24.0s | | 3D |
| 0008 | picture-frame | easy | 0 | 1 | 0 | 0 | 0.000 | 12 | 151.5k | 36.7s | | 3D |
| 0009 | minifig-chair | medium | 1 | 1 | 1 | 0 | 0.000 | 6 | 40.7k | 18.3s | pass | 3D |
| 0010 | bookshelf-3-shelf | medium | 0 | 0 | 0 | incon. | 0.000 | 6 | 69.6k | 21.1s | DNC | — |
| 0011 | simple-bridge | medium | 0 | 0 | 0 | incon. | 0.000 | 7 | 36.6k | 22.8s | DNC | — |
| 0012 | square-lantern | medium | 0 | 0 | 0 | incon. | 0.000 | 10 | 110.7k | 33.8s | DNC | — |
| 0013 | farm-tractor | hard | 0 | 1 | 0 | incon. | 0.000 | 9 | 41.3k | 43.6s | | 3D |
| 0014 | rolling-cart | hard | 0 | 1 | 0 | incon. | 0.000 | 7 | 50.1k | 27.2s | | 3D |
| 0015 | drawbridge | hard | 0 | 0 | 0 | incon. | 0.000 | 8 | 74.8k | 29.4s | DNC | — |
| 0016 | slatted-bench | hard | 0 | 0 | 0 | incon. | 0.000 | 5 | 42.2k | 34.0s | DNC | — |
| 0017 | small-house | stretch | 0 | 0 | 0 | incon. | 0.000 | 11 | 122.8k | 50.1s | DNC | — |
| 0018 | small-lighthouse | stretch | 0 | 0 | 0 | incon. | 0.000 | 14 | 83.9k | 54.3s | DNC | — |
| 0019 | garden-shed | stretch | 0 | 1 | 0 | incon. | 0.000 | 9 | 236.5k | 37.5s | | 3D |
| 0020 | fountain-with-pool | stretch | 0 | 0 | 0 | incon. | 0.000 | 10 | 190.3k | 57.7s | DNC | — |
| 0021 | grand-bridge | expert | 1 | 1 | 1 | incon. | 0.000 | 6 | 30.8k | 35.4s | pass | 3D |
| 0022 | manor-house | expert | 0 | 0 | 0 | incon. | 0.000 | 5 | 45.6k | 46.5s | DNC | — |
| 0023 | clock-tower | expert | 0 | 1 | 0 | 0 | 0.017 | 12 | 184.1k | 47.3s | | 3D |
| 0024 | grand-fountain | expert | 0 | 1 | 0 | incon. | 0.000 | 4 | 22.9k | 13.7s | | 3D |
| 0025 | mosaic-smiley | master | 0 | 0 | 0 | incon. | 0.000 | 6 | 25.1k | 38.4s | DNC | — |
| 0026 | voxel-tree | master | 0 | 1 | 0 | 0 | 0.000 | 6 | 34.3k | 14.8s | | 3D |
| 0027 | grand-mosaic-sunset | master | 0 | 0 | 0 | incon. | 0.000 | 5 | 29.0k | 24.3s | DNC | — |
| 0028 | mega-voxel-rocket | master | 0 | 0 | 0 | incon. | 0.000 | 2 | 3.8k | 3.7s | DNC | — |
| 0029 | colossal-mosaic-grid | master | 0 | 0 | 0 | incon. | 0.000 | 6 | 36.4k | 18.6s | DNC | — |
Submitted BOMs
The final BOM the model emitted on each task it attempted. For tasks marked DNC, the failure reason is shown in place of a BOM.
#0001 flat-plate-5x7 DNC
adapter error: OpenAI 400: { "error": { "message": "Invalid prompt: your prompt was flagged as potentially violating our usage policy. Please try again with a different prompt: https://platform.openai.com/docs/guides/reasoning#advice-on-prompting", "type": "invalid_request_error", "param": null, "code": "invalid_prompt" } }
#0002 flat-plate-5x5 trivial CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 78329 | 19 | 5 |
| total pieces | | 5 |
#0003 staggered-column-2x2 trivial CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 3003 | 1 | 13 |
| total pieces | | 13 |
#0004 hollow-frame-8x8 trivial CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 3034 | 1 | 2 |
| 3795 | 1 | 2 |
| 3020 | 1 | 2 |
| 3022 | 1 | 4 |
| 3023 | 1 | 4 |
| total pieces | | 14 |
#0005 small-table easy CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 3001 | 72 | 4 |
| 3022 | 0 | 1 |
| total pieces | | 5 |
#0006 step-pyramid easy CaSS 1
| part_num | color_id | qty |
|---|---|---|
| 3958 | 19 | 1 |
| 3031 | 19 | 1 |
| 3022 | 19 | 1 |
| total pieces | | 3 |
#0007 simple-gate easy CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 3005 | 1 | 8 |
| 3004 | 1 | 2 |
| 3010 | 1 | 1 |
| total pieces | | 11 |
#0008 picture-frame easy CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 3010 | 8 | 8 |
| 3009 | 8 | 2 |
| 3032 | 1 | 1 |
| 4276a | 8 | 2 |
| 3020 | 1 | 1 |
| total pieces | | 14 |
#0009 minifig-chair medium CaSS 1
| part_num | color_id | qty |
|---|---|---|
| 3022 | 86 | 4 |
| 3003 | 86 | 8 |
| 3005 | 86 | 4 |
| total pieces | | 16 |
#0010 bookshelf-3-shelf DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0011 simple-bridge DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0012 square-lantern DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0013 farm-tractor hard CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 4488 | 2 | 4 |
| 69912 | 0 | 2 |
| 70695 | 0 | 2 |
| 4079 | 71 | 1 |
| 2741 | 0 | 1 |
| 3829c01 | 0 | 1 |
| 30640 | 2 | 1 |
| total pieces | | 12 |
#0014 rolling-cart hard CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 3020 | 71 | 2 |
| 3004 | 71 | 2 |
| 38339 | 0 | 4 |
| 24314c01 | 0 | 4 |
| 3661 | 0 | 1 |
| 61252 | 71 | 1 |
| total pieces | | 14 |
#0015 drawbridge DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0016 slatted-bench DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0017 small-house DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0018 small-lighthouse DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0019 garden-shed stretch CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 41539 | 80 | 1 |
| 3034 | 80 | 12 |
| 821 | 5 | 1 |
| 3082 | 12 | 2 |
| 79736 | 25 | 1 |
| 41770 | 85 | 1 |
| 41769 | 85 | 1 |
| 99781 | 19 | 2 |
| total pieces | | 21 |
#0020 fountain-with-pool DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0021 grand-bridge expert CaSS 1
| part_num | color_id | qty |
|---|---|---|
| 3030 | 0 | 2 |
| 3832 | 0 | 4 |
| 3062a | 0 | 48 |
| 3633 | 0 | 4 |
| total pieces | | 58 |
#0022 manor-house DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0023 clock-tower expert CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 41539 | 86 | 8 |
| 3031 | 86 | 1 |
| 92947 | 86 | 18 |
| 3005 | 47 | 4 |
| 3004 | 86 | 4 |
| 3942a | 86 | 1 |
| total pieces | | 36 |
#0024 grand-fountain expert CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 3062a | 71 | 6 |
| 92947 | 71 | 5 |
| 4032a | 71 | 1 |
| total pieces | | 12 |
#0025 mosaic-smiley DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0026 voxel-tree master CaSS 0
| part_num | color_id | qty |
|---|---|---|
| 3005 | 28 | 3 |
| 3005 | 0 | 6 |
| 3024 | 28 | 1 |
| 3024 | 0 | 2 |
| total pieces | | 12 |
#0027 grand-mosaic-sunset DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
#0028 mega-voxel-rocket DNC
no BOM after nudge
#0029 colossal-mosaic-grid DNC
model produced a non-conformant assembly (assembly instance counts must exactly match final_bom.parts quantities)
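Most DNC entries above cite the same rule: assembly instance counts must exactly match `final_bom.parts` quantities. As a rough illustration of what such a check might look like, the sketch below compares declared BOM quantities against counted instances. It is not the protocol's validator. The `part_num`, `color_id`, and `qty` field names follow the BOM tables on this page, while the shape of an assembly instance and the function name `bom_mismatches` are assumptions made for the example.

```python
from collections import Counter


def bom_mismatches(bom_parts, instances):
    """Return {(part_num, color_id): (bom_qty, instance_count)} per mismatch.

    bom_parts: list of dicts with part_num, color_id, qty (as in the tables).
    instances: list of dicts with part_num, color_id, one per placed piece
               (hypothetical shape; the real assembly schema is not shown here).
    """
    declared = Counter()
    for part in bom_parts:
        declared[(part["part_num"], part["color_id"])] += part["qty"]

    placed = Counter((inst["part_num"], inst["color_id"]) for inst in instances)

    # Counter returns 0 for missing keys, so parts present on only one side
    # are reported as mismatches too.
    keys = set(declared) | set(placed)
    return {k: (declared[k], placed[k]) for k in keys if declared[k] != placed[k]}


# BOM copied from #0005 small-table; the instance list is made up.
bom = [
    {"part_num": "3001", "color_id": 72, "qty": 4},
    {"part_num": "3022", "color_id": 0, "qty": 1},
]
instances = [{"part_num": "3001", "color_id": 72} for _ in range(4)]
instances.append({"part_num": "3022", "color_id": 0})

print(bom_mismatches(bom, instances))      # conformant: no mismatches
print(bom_mismatches(bom, instances[1:]))  # one 3001 instance missing
```

An empty result means the two sides agree exactly. Any non-empty result corresponds to the kind of mismatch the DNC message describes.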
Run metadata
| Submission ID | 4c42a6c5-d1b7-499c-bf50-0f595c9b297f |
|---|---|
| Protocol | scaffold-assembly-v1 |
| Protocol hash | sha256:394f2d859fafd42d0ac144c5f251e68ea2e7f4c06da48bf733e1c2505055ee9c |
| Run at | 2026-06-09T17:12:45.812Z |
| Total duration | 13m47s |
| Adapter health | ok |
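The total duration appears to equal the sum of the per-task Duration column. This is an observation from the numbers on this page rather than a documented definition, and the short check below only illustrates the arithmetic. The durations are copied from the per-task table in task order (0001 through 0029).

```python
# Per-task durations in seconds, tasks 0001..0029, from the per-task table.
DURATIONS_S = [
    12.0, 15.6, 10.2, 23.9,        # trivial
    17.2, 14.8, 24.0, 36.7,        # easy
    18.3, 21.1, 22.8, 33.8,        # medium
    43.6, 27.2, 29.4, 34.0,        # hard
    50.1, 54.3, 37.5, 57.7,        # stretch
    35.4, 46.5, 47.3, 13.7,        # expert
    38.4, 14.8, 24.3, 3.7, 18.6,   # master
]

total_s = sum(DURATIONS_S)
# Round to whole seconds before splitting into minutes and seconds.
minutes, seconds = divmod(round(total_s), 60)
print(f"{len(DURATIONS_S)} tasks, {minutes}m{seconds:02d}s")
```

The sum is about 826.9 seconds, which rounds to the 13m47s reported in the table above.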