Nvidia Unveils Faster AI Chips Sooner Than Expected (5 minute read) Nvidia revealed its latest AI server systems at the Consumer Electronics Show in Las Vegas. The Vera Rubin chips are set to go on sale in the second half of this year. They are designed to handle the enormous computing loads needed to create simulations of reality for use in model training. Rubin delivers a 10-fold reduction in cost compared with Blackwell chips. | AMD Unveils New Chip For Corporate Data Centers, Talks Up Demand (3 minute read) AMD's new MI440X chip is designed for use in smaller corporate data centers. The company's new Helios system, based on the new chip, will go on sale later this year. AMD's MI500 series of processors will debut in 2027. Those chips will deliver up to 1,000 times the performance of the MI300 series, which was first rolled out in 2023. | NVIDIA introduces Alpamayo (4 minute read) NVIDIA launched the Alpamayo family of open AI models, datasets, and simulators to address long-tail challenges in autonomous driving. The models use reasoning-based vision-language-action architectures to support safer, end-to-end AV systems that generalize to rare scenarios. | | Gross Profit per Token (2 minute read) Meta's $2 billion acquisition of Manus highlights the importance of gross profit per token, with Manus showing a 40x gross profit (GP) multiple. DeepSeek and Together AI have the lowest multiples because they resell inference, unlike Perplexity, which achieves the highest at 222x as an application. Investors prioritize token monetization over raw volume, as indicated by a 0.71 correlation between gross profit per token and valuation. | AI Capex: Built on Options, Priced as Certainty (6 minute read) Letters of intent are real options that move markets, guide counterparties, and shape capex decisions before obligations harden. Options can be enough to pull forward tens of billions in irreversible investment. 
If the option doesn't convert, the stranded capital becomes a restructuring problem. | | GRPO++: Tricks for Making RL Actually Work (72 minute read) Group Relative Policy Optimization (GRPO) is the RL optimizer used to train most open-source reasoning models. It is popular due to its conceptual simplicity and practical efficiency. The vanilla GRPO algorithm has subtle issues that can hinder the RL training process, especially at scale. This post provides an overview of the work done to solve the shortcomings of GRPO. | M2.1: Multilingual and Multi-Task Coding with Strong Generalization (17 minute read) MiniMax-M2.1 matches or surpasses global top-tier models on multiple internal and external benchmarks. The open-source model has exceptional performance in code generation, tool usage, instruction following, and long-range planning. This post discusses how the model was trained and shares insights gained during the process. | Deep Delta Learning (2 minute read) A new approach to neural architecture, Deep Delta Learning generalizes residual networks using a single scalar gate that interpolates between identity, projection, and reflection. | KernelEvolve: Scaling Agentic Kernel Coding for Heterogeneous AI Accelerators at Meta (1 minute read) Making deep learning recommendation model (DLRM) training and inference fast and efficient presents three key system challenges: model architecture diversity, kernel primitive diversity, and hardware generation and architecture heterogeneity. KernelEvolve is an agentic kernel coding framework that tackles heterogeneity at scale for DLRM. It is designed to take kernel specifications as input and automate the process of kernel generation and optimization for a recommendation model across heterogeneous hardware architectures. KernelEvolve is designed to optimize a wide variety of production recommendation models across generations of Nvidia and AMD GPUs, as well as Meta's AI accelerators. 
It significantly mitigates the programmability barrier for new AI hardware. | | OpenAI to Buy Pinterest? Strategic Analysis (8 minute read) OpenAI may acquire Pinterest, a visual search app disguised as a social platform. ChatGPT lacks Pinterest's level of visual efficiency because visual search is not a core capability of large language models. Pinterest's ability to go from intent to product, or from inspiration to intent to product, will be very important for OpenAI's commerce aspirations. ChatGPT must go visual to create a successful user experience for agentic commerce.