Intelligent Software for AI Corp., Juan A. Meza
Mathematical Proof as a Litmus Test: New Research Reveals Hidden Failure Modes in Advanced AI Reasoning Models (2025)
Study reveals hidden weaknesses in AI reasoning models such as R1 and o3, using mathematical proofs as an evaluation litmus test.
AI Evaluation, AI News, AI Research, AI Safety, Benchmarking, Large Language Models, Mathematical Reasoning
Dec 10, 2025
Our blog

Intelligent Software for AI Corp., Juan A. Meza
New Neural Framework Exposes Critical Compositional Gap in AI Reasoning: 97.5% Accurate Task Taxonomy Reveals Transformer Limitations
Researchers expose fundamental compositional reasoning gaps in AI transformers through a validated task taxonomy framework.
AI Benchmarks, AI News, AI Research, Abstract Reasoning, Machine Learning, Transformer Models
Dec 9, 2025
Our blog