Google Launches Gemma 4 12B For Local Multimodal AI On Laptops
Google introduced Gemma 4 12B, an encoder-free multimodal open model built to run locally on laptops with 16GB of memory.
Mira Murati's Thinking Machines Lab previewed interaction models designed for continuous real-time collaboration with AI.
Luma AI's Uni-1 uses an autoregressive architecture to beat Google Nano Banana 2 and OpenAI GPT Image 1.5 on reasoning benchmarks while cutting 2K resolution pricing by up to 30%.
Xiaomi unveiled MiMo-V2-Pro, MiMo-V2-Omni, and MiMo-V2-TTS — a trio of AI models featuring over 1 trillion parameters, multimodal perception, and emotional speech synthesis, rivaling Claude Opus 4.6 on agent benchmarks.
Google has launched Gemini Embedding 2, the first natively multimodal embedding model capable of jointly mapping text, images, and video into a unified vector space for retrieval and search tasks.
China's DeepSeek is on the verge of releasing its V4 multimodal model, capable of generating text, images, and video. It has reportedly denied early optimization access to Nvidia and AMD, granting it exclusively to domestic chipmakers Huawei and Cambricon ahead of China's annual parliamentary sessions.
DeepSeek job postings reveal plans for a multimodal AI search engine supporting text, images, and audio, directly targeting Google's search market share.
Beijing-based Moonshot AI released Kimi K2.5, an open-source multimodal AI model that rivals OpenAI and Anthropic while being four times cheaper to run, raising questions about the effectiveness of US semiconductor export controls in constraining China's AI development.