gemini-3-pro-preview raw-v1

Provider: google · Run at 2026-04-28 · Submission a747a6b4

CaSS 39.6%
CBS 0.0% (v1: 0 by design)
Coverage 0%
Cost $4.20
Tokens (in/out) 1.65M / 11.3k
Cost circuit-breaker tripped. The $4.00 cap was exceeded at an estimated $4.20 at task 0015 (drawbridge); the 5 remaining task(s) were marked did_not_run and excluded from the CaSS denominator.
Diagnostic row, not ranked. Reasons: legacy task-set provenance; 20 task(s) missing canary/hash; 5 did_not_run.
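
The circuit-breaker behaviour noted above can be sketched as follows. This is a minimal illustration, not the harness's actual code: the helper name `run_with_cost_cap` and the flat per-task cost are hypothetical, and it assumes the cap is checked after each task completes, which is consistent with task 0015 having run before the remaining five were marked did_not_run.

```python
from typing import Callable, Dict, List, Tuple


def run_with_cost_cap(
    task_ids: List[str],
    run_task: Callable[[str], float],
    cap_usd: float,
) -> Tuple[Dict[str, str], float]:
    """Run tasks in order; once the running cost exceeds the cap,
    mark every remaining task did_not_run (hypothetical helper)."""
    status: Dict[str, str] = {}
    spent = 0.0
    tripped = False
    for task_id in task_ids:
        if tripped:
            status[task_id] = "did_not_run"
            continue
        spent += run_task(task_id)  # run_task returns the task's estimated cost in USD
        status[task_id] = "ran"
        if spent > cap_usd:
            tripped = True  # assumption: the cap is checked after each task
    return status, spent


# Illustrative costs only: 15 tasks at $0.28 each reach $4.20, past a $4.00 cap.
tasks = [f"{i:04d}" for i in range(1, 21)]
status, spent = run_with_cost_cap(tasks, lambda _task: 0.28, cap_usd=4.00)
skipped = [t for t, s in status.items() if s == "did_not_run"]
print(skipped, round(spent, 2))
```

Checking after each task rather than before it means the task that crosses the cap still produces a result, at the price of a small overshoot.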

Tier breakdown

This is the 3-dimension diagnostic matrix for this baseline. CBS can read 0 while the conservative v1 validator has low definitive coverage, which is expected for v1.

Tier     BOM pass  Scope pass  Buildability  CaSS   CBS   Coverage  n
trivial  75.0%     75.0%       -             75.0%  0.0%  0%        4
easy     75.0%     25.0%       -             25.0%  0.0%  0%        4
medium   75.0%     25.0%       -             25.0%  0.0%  0%        4
hard     66.7%     33.3%       -             33.3%  0.0%  0%        3
stretch  -         -           -             0.0%   0.0%  0%        0
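
The page does not state how the headline CaSS is aggregated from the tiers. One reading that reproduces the reported 39.6% is an unweighted mean of the per-tier CaSS values over tiers with n > 0; the plain per-task pass fraction (6 of 15) would give 40.0% instead. The sketch below checks both, using pass counts read off the per-task table. Treat the formula as an inference from the numbers, not a documented definition.

```python
# (tasks with CaSS = 1, tasks that ran) per tier, read off the per-task table.
tier_counts = {
    "trivial": (3, 4),
    "easy": (1, 4),
    "medium": (1, 4),
    "hard": (1, 3),
    "stretch": (0, 0),  # every stretch task was did_not_run
}

# Drop tiers with no runs so they do not enter either average.
ran = {tier: (p, n) for tier, (p, n) in tier_counts.items() if n > 0}
tier_rates = {tier: p / n for tier, (p, n) in ran.items()}

macro = sum(tier_rates.values()) / len(tier_rates)  # mean of tier rates
micro = sum(p for p, _ in ran.values()) / sum(n for _, n in ran.values())  # pooled tasks

print(f"macro {macro:.1%}, micro {micro:.1%}")  # macro 39.6%, micro 40.0%
```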

Per-task results

All 20 tasks in this baseline's task set. 15 were attempted; 5 were marked did_not_run (DNR) after the cost circuit-breaker tripped.

# Slug Tier CaSS BOM Scope Build Ref sim Iter Tokens Duration Status 3D
0001 flat-plate-5x7 trivial 0 0 0 incon. 0.000 3 3.9k 58.3s DNC
0002 flat-plate-5x5 trivial 1 1 1 incon. 0.143 13 96.8k 3m54s pass
0003 staggered-column-2x2 trivial 1 1 1 incon. 0.000 5 25.2k 2m10s pass
0004 hollow-frame-8x8 trivial 1 1 1 incon. 0.000 3 7.8k 1m06s pass
0005 small-table easy 0 1 0 incon. 0.000 7 72.1k 3m18s
0006 step-pyramid easy 1 1 1 incon. 0.400 6 28.1k 4m44s pass
0007 simple-gate easy 0 1 0 incon. 0.000 8 111.0k 4m12s
0008 picture-frame easy 0 0 0 incon. 0.000 6 89.0k 5m07s DNC
0009 minifig-chair medium 0 0 0 incon. 0.000 13 88.3k 5m04s DNC
0010 bookshelf-3-shelf medium 0 1 0 incon. 0.417 4 45.3k 3m05s
0011 simple-bridge medium 0 1 0 incon. 0.250 7 57.4k 2m53s
0012 square-lantern medium 1 1 1 incon. 0.500 10 264.3k 4m07s pass
0013 farm-tractor hard 0 1 0 incon. 0.086 16 400.4k 4m52s
0014 rolling-cart hard 1 1 1 incon. 0.000 12 256.4k 3m56s pass
0015 drawbridge hard 0 0 0 incon. 0.000 13 113.3k 5m35s DNC
0016 slatted-bench hard DNR 0 0 0ms DNR
0017 small-house stretch DNR 0 0 0ms DNR
0018 small-lighthouse stretch DNR 0 0 0ms DNR
0019 garden-shed stretch DNR 0 0 0ms DNR
0020 fountain-with-pool stretch DNR 0 0 0ms DNR

Submitted BOMs

The final BOM the model emitted on each task it attempted.

#0001 flat-plate-5x7 DNC

adapter error: Gemini 503: { "error": { "code": 503, "message": "This model is currently experiencing high demand. Spikes in demand are usually temporary. Please try again later.", "status": "UNAVAILABLE" } }

#0002 flat-plate-5x5 trivial CaSS 1

5x5 Flat Red Surface
part_num color_id qty
3031 4 2
3710 4 2
78329 4 2
total pieces 6
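
Each BOM is a list of (part_num, color_id, qty) rows, and the "total pieces" line is the sum of the qty column. A minimal sketch of that check for the BOM above; the `BomRow` alias and `total_pieces` helper are illustrative names, and part numbers are kept as strings because some carry suffixes (for example 2453a or 3829c01 in later BOMs).

```python
from typing import List, Tuple

# (part_num, color_id, qty) rows copied from the flat-plate-5x5 BOM.
BomRow = Tuple[str, int, int]
bom: List[BomRow] = [
    ("3031", 4, 2),
    ("3710", 4, 2),
    ("78329", 4, 2),
]


def total_pieces(rows: List[BomRow]) -> int:
    """Sum the qty column of a BOM."""
    return sum(qty for _part, _color, qty in rows)


print(total_pieces(bom))  # 6
```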

#0003 staggered-column-2x2 trivial CaSS 1

Staggered Column
part_num color_id qty
3004 4 26
total pieces 26

#0004 hollow-frame-8x8 trivial CaSS 1

Hollow Square Frame
part_num color_id qty
3460 0 4
3666 0 4
total pieces 8

#0005 small-table easy CaSS 0

Small Table
part_num color_id qty
3031 70 1
1751 70 1
2453a 70 4
total pieces 6

#0006 step-pyramid easy CaSS 1

Step Pyramid
part_num color_id qty
3958 19 1
3031 19 1
3022 19 1
total pieces 3

#0007 simple-gate easy CaSS 0

Simple Gate
part_num color_id qty
3003 0 10
3007 0 1
total pieces 11

#0008 picture-frame DNC

timeout after 300000ms

#0009 minifig-chair DNC

timeout after 300000ms

#0010 bookshelf-3-shelf medium CaSS 0

Small Bookshelf
part_num color_id qty
3795 70 3
3004 70 12
3010 70 6
total pieces 21

#0011 simple-bridge medium CaSS 0

Small Footbridge
part_num color_id qty
3003 71 8
3029 70 1
3633 70 6
total pieces 15

#0012 square-lantern medium CaSS 1

Small Square Lantern
part_num color_id qty
3031 72 2
3010 72 6
3004 72 6
3005 72 8
total pieces 22

#0013 farm-tractor hard CaSS 0

Small Farm Tractor
part_num color_id qty
4624 15 2
3641 0 2
6014a 15 2
6015 0 2
4600 0 2
3795 2 1
3022 2 1
2877 2 2
3829c01 0 1
4079b 0 1
3023 2 1
total pieces 17

#0014 rolling-cart hard CaSS 1

Open-top Cart
part_num color_id qty
3032 4 1
4600 0 2
30027a 15 4
30028 0 4
3009 4 2
3004 4 2
3666 4 2
3023 4 1
61252 71 1
3024 4 1
total pieces 20

#0015 drawbridge DNC

timeout after 300000ms

Run metadata

Submission ID: a747a6b4-639d-4483-a932-db0f0c01e79d
Protocol: raw-v1
Protocol hash: sha256:ec6641f2d101a9de13bfca90229cdb93b49f1bd9369f3f1e7587f41b33c4c69c
Run at: 2026-04-28T15:21:40.351Z
Total duration: 55m01s
Adapter health: ok
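
The total duration appears to be the sum of the per-task Duration column. The sketch below parses the duration strings from the per-task table and adds them up; `parse_duration` is a hypothetical helper, and truncating fractional seconds in the final display is an assumption.

```python
import re


def parse_duration(text: str) -> float:
    """Convert strings such as '58.3s', '3m54s' or '0ms' to seconds."""
    if text.endswith("ms"):
        return float(text[:-2]) / 1000.0
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# Duration column for tasks 0001 to 0020, in order.
durations = [
    "58.3s", "3m54s", "2m10s", "1m06s", "3m18s", "4m44s", "4m12s", "5m07s",
    "5m04s", "3m05s", "2m53s", "4m07s", "4m52s", "3m56s", "5m35s",
    "0ms", "0ms", "0ms", "0ms", "0ms",
]

total = sum(parse_duration(d) for d in durations)
minutes, seconds = divmod(int(total), 60)  # drop the fractional second
print(f"{minutes}m{seconds:02d}s")  # 55m01s
```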