The company said the biggest leap is in agentic coding and computer use. On Terminal-Bench 2.0, which tests complex command-line workflows requiring planning and tool coordination, GPT-5.5 hits 82.7% ...