News

Semgrep: GLM 5.2 beats Claude in our Cyber Benchmarks

  • Semgrep.dev
  • published date: 2026-06-28 17:50:47 UTC

Among models given nothing but a prompt, the best open-weight option beat Claude Opus 4.8.

We ran a set of popular open-source models against our IDOR benchmark, the same dataset and the same prompt we've used to evaluate frontier coding agents. The result surprised us: GLM 5.2, an open-we…