MiniMax's Hailuo 02 tops Google Veo 3 in user benchmarks at much lower video costs (4 minute read)
MiniMax's second-generation video AI model, Hailuo 02, features major upgrades in both performance and price. It uses an architecture called Noise-aware Compute Redistribution, which improves training and inference efficiency by a factor of 2.5 by handling long video sequences differently depending on the stage of training. Hailuo 02 has three times more parameters and was trained on four times more data than its predecessor. A video generated with the model is available in the article.

Inference Economics of Language Models (35 minute read)
The first comprehensive model of LLM serving economics reveals why current approaches to scaling inference hit walls faster than expected as AI companies race to serve token-intensive reasoning models and agents. Network latency, not bandwidth, is the primary bottleneck that prevents providers from simply adding more GPUs to increase capacity. Algorithmic breakthroughs like speculative decoding, which delivers double the speed at no additional cost, continue to reshape the economic landscape as providers struggle to match surging demand. (A sketch of how speculative decoding works appears at the end of this issue.)

Compiling LLMs into a MegaKernel: A Path to Low-Latency Inference (7 minute read)
Traditional large language model (LLM) serving systems rely on long sequences of GPU kernel launches and external communication calls, which leaves hardware underutilized. This post describes how a team built a compiler that automatically transforms LLM inference into a single megakernel, eliminating launch overhead, enabling fine-grained software pipelining, and overlapping computation with communication across GPUs. The end-to-end GPU fusion approach reduces LLM inference latency by 1.2x to 6.7x.

How 100+ Security Leaders Are Tackling AI Risk (Sponsor)
AI adoption is accelerating, and new research shows most security programs are still working to catch up. Get a clear view into how real teams are securing AI in the cloud:
✅ See where AI adoption is outpacing security
✅ Learn what top orgs are doing to manage shadow AI
✅ Benchmark your AI maturity against industry peers
✅ Get practical next steps to close the AI risk gap
Get the insights

Changes made to the Model Context Protocol (2 minute read)
This document lists the major changes made to the Model Context Protocol (MCP) specification since the previous revision, 2025-03-26. Notable changes include removing support for JSON-RPC batching, adding support for structured tool output, and clarifying security considerations and best practices in the authorization spec. A link to the complete list of changes is available. (An illustrative structured tool result appears at the end of this issue.)

Improving Fine-Grained Subword Understanding in LLMs (15 minute read)
StochasTok randomly decomposes tokens during training: instead of always seeing "strawberry" as one unit, models encounter it split as "straw|berry," "str|awberry," or even "s|t|r|a|w|b|e|r|r|y," learning the internal structure humans naturally perceive. Models trained with this method achieve near-perfect accuracy on character counting and multi-digit math while maintaining performance on standard benchmarks.
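To make the mechanism concrete, here is a minimal sketch of StochasTok-style stochastic decomposition, assuming a toy setup that operates on raw strings rather than vocabulary token IDs; the function names and the splitting probability are illustrative, not taken from the paper:

```python
import random

def stochastic_split(token, p_split, rng):
    """Split a token into smaller pieces with probability p_split, recursively.

    Most of the time (probability 1 - p_split at each level) the piece is kept
    whole, so the model still sees standard tokenization for most of training.
    """
    if len(token) < 2 or rng.random() > p_split:
        return [token]
    cut = rng.randint(1, len(token) - 1)      # random split point inside the token
    return (stochastic_split(token[:cut], p_split, rng)
            + stochastic_split(token[cut:], p_split, rng))

def stochastok_tokenize(tokens, p_split=0.3, seed=0):
    """Apply stochastic decomposition to an already-tokenized sequence."""
    rng = random.Random(seed)
    pieces = []
    for tok in tokens:
        pieces.extend(stochastic_split(tok, p_split, rng))
    return pieces

print(stochastok_tokenize(["strawberry", "has", "three", "rs"], seed=1))
# Different seeds give different splits, e.g. ['straw', 'berry', ...],
# ['str', 'awberry', ...], or, rarely, a full character-level split.
```

Because the whole token remains the most common view, standard-benchmark performance is preserved while the model repeatedly sees how tokens break down into subwords and characters.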
Six-month-old, solo-owned vibe coder Base44 sells to Wix for $80M cash (3 minute read)
Israeli developer Maor Shlomo recently sold his six-month-old, bootstrapped vibe-coding startup, Base44, to Wix for $80 million in cash. His eight employees will collectively receive $25 million of the $80 million as a 'retention' bonus. Base44 grew to 250,000 users in six months, mostly through word of mouth, and generated $189,000 in profit in May even after covering high LLM token costs.

Andrej Karpathy on How AI is Changing Software (39 minute video)
Andrej Karpathy argues we're entering "Software 3.0," in which LLMs function as cloud-based operating systems programmable in English, an idea best captured by his concept of "vibe coding." Rather than pursuing fully autonomous AI agents, he advocates for "autonomy sliders" in tools like Cursor that offset AI limitations with human oversight, and he emphasizes the need for LLM-friendly documentation as AI agents become major consumers of digital information.
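The Inference Economics item above mentions speculative decoding; the following is a minimal sketch of the greedy-acceptance variant, with toy stand-in functions for the draft and target models (the vocabulary, model functions, and draft length k are invented for illustration; production systems verify the draft probabilistically in one batched target forward pass):

```python
VOCAB = ["the", "cat", "sat", "on", "mat", "."]

def draft_model(context):
    """Toy stand-in for a small, cheap draft model (one token per call)."""
    return VOCAB[(len(context) * 3) % len(VOCAB)]

def target_model(context):
    """Toy stand-in for the large target model. In a real system this forward
    pass is the expensive part, so we want it to check many positions at once."""
    return VOCAB[(len(context) * 3 + len(context) % 2) % len(VOCAB)]

def speculative_step(context, k=4):
    """One round of greedy speculative decoding.

    1. The draft model proposes k tokens autoregressively (cheap).
    2. The target model checks every proposed position; on a GPU this is a
       single batched forward pass rather than the loop shown here.
    3. Keep the longest agreeing prefix plus one corrected target token, so the
       output is exactly what greedy decoding with the target alone would give.
    """
    draft, ctx = [], list(context)
    for _ in range(k):                       # 1. propose k draft tokens
        tok = draft_model(ctx)
        draft.append(tok)
        ctx.append(tok)

    accepted, ctx = [], list(context)
    for tok in draft:                        # 2. verify against the target
        target_tok = target_model(ctx)
        if target_tok != tok:
            accepted.append(target_tok)      # 3. first disagreement: take the
            break                            #    target's token and stop
        accepted.append(tok)
        ctx.append(tok)
    return accepted

context = ["the"]
while len(context) < 12:
    context.extend(speculative_step(context))
print(" ".join(context))                     # nonsense text, but several tokens
                                             # can be accepted per target "pass"
```

The saving comes from step 2: the expensive target model verifies several proposed tokens per call instead of producing one token per call, which is why the speedup costs nothing in output quality for greedy decoding.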
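For the MCP item above, here is a hypothetical illustration of what structured tool output could look like in a tool definition and a tool call result; the field names (outputSchema, structuredContent) and the get_weather tool are assumptions made for illustration, so consult the linked changelog for the authoritative schema:

```python
# Illustrative data shapes only; not taken from the MCP changelog itself.
tool_definition = {
    "name": "get_weather",                      # hypothetical tool
    "description": "Look up current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    "outputSchema": {                           # declares the shape of results
        "type": "object",
        "properties": {
            "temperature_c": {"type": "number"},
            "conditions": {"type": "string"},
        },
    },
}

tool_result = {
    "content": [                                # human-readable fallback text
        {"type": "text", "text": "22 C and sunny in Lisbon"}
    ],
    "structuredContent": {                      # machine-parseable payload
        "temperature_c": 22,
        "conditions": "sunny",
    },
}
```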