An expert benchmark for multimodal-instructed 3D CAD model editing.
neuralCAD-Edit is the first benchmark for editing 3D CAD models that is collected from expert CAD engineers. Rather than relying on the text-only conditioning of prior work, it captures realistic CAD editing requests by recording ten consenting professional designers as they interact directly with CAD software: talking, pointing, and drawing.
Leading foundation models are benchmarked against human CAD experts carrying out the same edits. The comparison reveals a large gap in both automatic metrics and human evaluation: even the best foundation model (GPT 5.2) scores 53 percentage points lower than CAD experts in human acceptance trials. The benchmark is meant as an expert-grounded foundation against which 3D CAD editing methods and foundation models can be developed.
Each request is captured as a multimodal interaction (text, plus video with speech, on-screen interaction, and drawing) and is paired with the expert’s own edit and a human baseline produced by a second designer. The benchmark scores AI approaches across four modality combinations against the standard set by human CAD experts, isolating how much multimodal grounding (pointing, sketching, demonstrating) closes the gap that text instructions alone leave open.
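As a rough illustration of the pairing described above, one benchmark item could be modeled as a small record. This is only a sketch under stated assumptions: the class name `EditRequest`, all of its field names, the helper `available_modalities`, and the file paths are hypothetical and do not reflect the released data format. The four modality combinations used for scoring are not enumerated here, so the sketch does not attempt to encode them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EditRequest:
    """One benchmark item. Hypothetical schema, not the released format."""
    request_id: str
    text_instruction: str        # written form of the editing request
    video_path: Optional[str]    # recording with speech, on-screen interaction, drawing
    source_model_path: str       # CAD model before the edit
    expert_edit_path: str        # edit made by the requesting expert
    human_baseline_path: str     # same edit carried out by a second designer


def available_modalities(request: EditRequest) -> List[str]:
    """List which input channels this item actually carries (assumed helper)."""
    modalities = []
    if request.text_instruction:
        modalities.append("text")
    if request.video_path is not None:
        modalities.append("video")
    return modalities


# Usage with made-up placeholder values.
example = EditRequest(
    request_id="example-001",
    text_instruction="Make the mounting hole larger.",
    video_path="example-001/request.mp4",
    source_model_path="example-001/before.cad",
    expert_edit_path="example-001/expert_after.cad",
    human_baseline_path="example-001/baseline_after.cad",
)
print(available_modalities(example))
```

Keeping the expert edit and the second designer's baseline as separate fields mirrors the text: the first is the reference result for the request, while the second provides the human standard that AI approaches are compared against.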