US Government Says China's Best AI Models Lag Behind. Experts Aren't So Sure
NIST's CAISI evaluated DeepSeek V4 Pro using private benchmarks and a cost-comparison filter that excluded every US model except GPT-5.4 mini. Critics call the methodology convenient.
In brief: CAISI's evaluation ranked DeepSeek V4 Pro eight months behind the U.S. frontier, using an IRT-based scoring system across nine benchmarks including two private, unverifiable dataset…