Microsoft researchers have developed On-Policy Context Distillation (OPCD), a training method that permanently embeds ...
MIT introduces Self-Distillation Fine-Tuning to reduce catastrophic forgetting; it uses student-teacher demonstrations and requires 2.5x the compute.
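The distillation items above only name the techniques. As a rough illustration of the general student-teacher idea they share (not the actual OPCD or MIT implementation), distillation objectives typically push the student's token distribution toward a teacher's via a temperature-scaled KL term. A minimal numpy sketch, with all names and values hypothetical:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over token positions, at a softened temperature."""
    p = softmax(teacher_logits / temperature)
    q = softmax(student_logits / temperature)
    return float(np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)))

# Toy example: two token positions over a vocabulary of four tokens.
teacher = np.array([[2.0, 0.5, 0.1, -1.0],
                    [0.0, 1.5, -0.5, 0.2]])
student_far = np.array([[-1.0, 2.0, 0.0, 0.5],
                        [1.0, -1.0, 0.5, 0.0]])
student_near = teacher + 0.1  # a uniform logit shift leaves the softmax unchanged

# The student whose distribution matches the teacher's incurs the smaller loss.
assert distillation_loss(student_near, teacher) < distillation_loss(student_far, teacher)
```

In practice the loss would be minimized by gradient descent on the student's parameters; the sketch only shows the objective being compared at two points.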
Researchers at Google Cloud and UCLA have proposed a new reinforcement learning framework that significantly improves language models' ability to learn challenging multi-step reasoning ...
Training large AI models has become one of the biggest challenges in modern computing—not just because of complexity, but because of cost, power use, and wasted resources. A new research paper from ...
The company mainly trained Phi-4-reasoning-vision-15B on open-source data. The data included images and text-based descriptions of the objects depicted in those images. Before it started training the ...
Utkarsh Amitabh says he definitely wasn't in the market for a new job in January 2025, when data labeling startup micro1 approached him about joining its network of human experts who help companies ...
A recent partnership sent a clear signal through the market about the future of artificial intelligence (AI), and it has ...
Despite the hurdles, PewDiePie emphasized that the experiment was primarily about learning through trial and error. He ...
Microsoft releases Phi-4 Reasoning Vision 15B, a multimodal AI model that activates its own thinking mode and handles ...
Rohit Prasad, Amazon’s senior vice president and head scientist for artificial general intelligence, left, speaks at the Madrona IA Summit in Seattle with Madrona’s S. “Soma” Somasegar. (GeekWire ...