Alibaba's Qwen3.6-27B Beats Much Larger Models on Coding Benchmarks
Alibaba's open-source Qwen3.6-27B outperforms its 15x larger predecessor on most coding benchmarks with only 27 billion parameters.
A new benchmark reveals that even top AI models drop roughly 50% in accuracy when analyzing complicated charts, exposing a key limitation in visual reasoning.
Chinese AI startup DeepSeek is in talks to raise at least $300 million at a $10 billion valuation, signaling growing investor confidence in China's AI sector.
Z.AI releases GLM-5.1, a 754B-parameter open-source model designed for long-horizon agentic tasks, running autonomously for up to 8 hours and outperforming Claude Opus 4 on benchmarks.
Anthropic's run-rate revenue has surpassed $30 billion in 2026, up from $9 billion, driven by surging demand for its Claude AI model.
Meta says it will eventually release open-source versions of its new AI models led by Alexandr Wang, but plans to keep certain components proprietary initially.
Arcee AI released Trinity-Large-Thinking, a powerful new open-weights reasoning model under Apache 2.0 that enterprises can download and customize.
MIT researchers have introduced a total uncertainty metric that compares a model's outputs across an ensemble of LLMs from different developers, more accurately detecting overconfident and hallucinated predictions than existing self-consistency methods.
Anthropic announced it is doubling the usage limits for Claude AI subscribers during off-peak hours, a significant capacity expansion that comes as Claude's daily active users have surged over 140% since January 2026.
Anthropic launches Claude Sonnet 4.6, delivering frontier AI performance in coding, computer use, and agents with a 1M-token context window, just 12 days after Opus 4.6.
Claude Opus 4.6 introduces groundbreaking features including a 1M-token context window, agent teams for parallel coordination, and adaptive thinking for enterprise workflows.
Mount Sinai research shows LLMs believe medical misinformation 32-46% of the time, especially when it is framed as expert advice.
AI pioneer Yann LeCun has departed from Meta, warning that the AI industry is overly focused on large language models (LLMs) and is heading in the wrong direction. He advocates for a shift towards predictive world models.