The Sequence Radar #885: Last Week in AI: Models, Games, and the Future of Evaluation
New model releases, new agents and a soccer cup.

TL;DR
- OpenAI released GPT 5.6 with the Sol, Terra, and Luna models, emphasizing intelligence tiered to different market needs and a phased-access strategy focused on safety and control.
- Anthropic introduced Claude Tag, a feature that lets users structure prompts and responses with semantic markers, which improves context tracking and moves human-AI interaction toward structured collaboration.
- General Intuition raised $320M to develop 'large action models' trained on action-labeled gameplay data, viewing video games as a rich substrate for embodied AI.
- The LayerLens Stratix Cup demonstrated a new method of AI evaluation through a soccer tournament, where models competed by writing their own strategies and adapting in real time.
- The article notes a shift in AI development from chatbots to more organism-like systems that sense, plan, act, fail, and adapt.
- Research papers covered include Autodata for synthetic data generation, iLLaDA for large language diffusion models, MEMPROBE for evaluating agent memory systems, Qwen-AgentWorld for general agents, and Tapered Language Models.
- Recent AI tech releases include GPT 5.6 Sol, Claude Tag, and Mistral OCR.
- Several AI companies received significant funding, including Patronus AI ($50M), General Intuition ($320M), Netris ($15M), and Groq ($650M).