Google DeepMind Releases Gemini Robotics-ER 1.6
Odaily News Google DeepMind has released Gemini Robotics-ER 1.6, positioned as a high-level reasoning model for robots. Compared with its predecessor ER 1.5 and Gemini 3.0 Flash, it shows significant improvements in spatial reasoning and multi-view understanding. The model is now available to developers via the Gemini API and Google AI Studio. The core upgrades include three key capabilities:

1. Enhanced pointing accuracy: can be used for precise object detection, counting, spatial relationship reasoning (e.g., "point to all objects that can fit into the blue cup"), and motion trajectory planning. It can also correctly refuse to point to objects that do not exist in the scene.
2. Multi-view success detection: robots can now integrate views from multiple camera feeds to determine whether a task is complete, maintaining accuracy even in occluded or dynamic environments.
3. New instrument reading capability: can interpret various industrial instruments such as circular pressure gauges, vertical level indicators, and digital displays. It reasons step by step through agentic vision (visual reasoning plus code execution): first zooming in on the detail area, then calculating proportions and scale intervals via pointing and code, and finally combining world knowledge to derive the reading.
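For context on the pointing capability: earlier Gemini Robotics-ER models return points as JSON with `[y, x]` coordinates normalized to a 0–1000 grid. Assuming ER 1.6 keeps that convention (an assumption, not confirmed by this announcement), a minimal sketch of converting a pointing response into pixel coordinates might look like this:

```python
import json

def parse_points(response_text: str, width: int, height: int) -> list[dict]:
    """Convert a pointing response into pixel coordinates.

    Assumes the model replies with JSON like
    [{"point": [y, x], "label": "blue cup"}], where y and x are
    normalized to a 0-1000 grid (the convention used by earlier
    ER models; ER 1.6 may differ).
    """
    points = json.loads(response_text)
    return [
        {
            "label": p["label"],
            # Scale normalized coordinates to the actual image size.
            "x": p["point"][1] / 1000 * width,
            "y": p["point"][0] / 1000 * height,
        }
        for p in points
    ]

# Example: a hypothetical reply for a 640x480 camera frame.
reply = '[{"point": [500, 250], "label": "blue cup"}]'
print(parse_points(reply, width=640, height=480))
```

An empty reply (`[]`) would correspond to the refusal behavior described above, where the model declines to point at objects absent from the scene.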
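The gauge-reading procedure described above ultimately reduces to a proportion calculation. A minimal sketch of that final step, assuming a linear dial scale and that the needle angle and the scale-limit angles have already been located by pointing (the function and its parameters are illustrative, not part of the announced API):

```python
def gauge_reading(needle_angle: float,
                  min_angle: float, max_angle: float,
                  min_value: float, max_value: float) -> float:
    """Interpolate a dial reading from the needle angle.

    Assumes a linear scale: the needle's fractional position between
    the minimum and maximum tick marks maps directly onto the
    instrument's value range.
    """
    fraction = (needle_angle - min_angle) / (max_angle - min_angle)
    return min_value + fraction * (max_value - min_value)

# Example: a pressure gauge whose scale spans 45deg..315deg and reads 0..10 bar;
# a needle at 180deg sits halfway along the scale, i.e. 5 bar.
print(gauge_reading(180, min_angle=45, max_angle=315,
                    min_value=0, max_value=10))
```

World knowledge (e.g., the unit printed on the dial, or whether the scale is logarithmic) would then refine this raw proportion into the final reported reading.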