Llama 4 (6 minute read) Meta has unveiled Llama 4 Scout and Maverick, two multimodal models with 17B active parameters each that offer state-of-the-art performance on major benchmarks, along with Llama 4 Behemoth, a model with 288B active parameters that is still in training and that Meta says surpasses GPT-4.5 on STEM benchmarks.
Midjourney V7 (2 minute read) Midjourney has released its new image generation model, V7 alpha. It brings smarter text interpretation and better image coherence, and it introduces Draft Mode for fast, low-cost iterations with optional voice commands and personalization.
Cyberattacks by AI agents are coming (7 minute read) AI agents are emerging as potent cyberattack tools, capable of executing complex attacks and potentially scaling operations like ransomware. The LLM Agent Honeypot project aims to detect these agents by simulating vulnerable servers. Its work has revealed that agents can adapt and avoid detection better than traditional bots. Experts anticipate an increase in agent-driven cyberattacks and urge preemptive development of defenses as these technologies evolve.
Rope to Nope: Hybrid Attention for Long Context (25 minute read) The key innovation that enabled Llama 4 to reach 10M+ tokens of context is the alternation between layers with no positional embeddings (NoPE) and layers with rotary positional embeddings (RoPE). While the only benchmarks so far are on Needle in a Haystack, the results seem to be a strong confirmation that alternating layers perform well.
Inference-Time Scaling for Generalist Reward Modeling (31 minute read) This paper from DeepSeek describes how to use inference-time scaling to improve reward modeling, which in turn helps bootstrap stronger reasoners. It hints at a broader strategy from the Chinese start-up: using its existing reasoning models as the base for a new generation of reward models that will train the next generation of reasoners.
Nano Aha Moment (GitHub Repo) A single-file, single-GPU, from-scratch full-parameter tuning library that replicates DeepSeek R1-Zero-style training.
Object Counting (GitHub Repo) A fully automated zero-shot object counting method leveraging feature maps and self-attention mechanisms that achieves state-of-the-art accuracy on the FSC147 dataset.
DeepSeek 1.58bit GGUF (Hugging Face Hub) The Unsloth team has figured out which parts of the new R1 model can be properly quantized. They also found some tokenizer quirks to be aware of, which make quantization slightly harder. In summary, only the MoE layers go to 1.58 bits, while everything else remains at 4 or 6 bits under their dynamic quantization scheme.
AI masters Minecraft: DeepMind program finds diamonds without being taught (5 minute read) DeepMind's AI system, Dreamer, successfully learned to collect diamonds in Minecraft without prior human guidance, highlighting a step towards general AI systems. Using reinforcement learning, Dreamer independently explores and builds a model of the game environment to predict the outcomes of future actions. This advancement suggests potential applications for AI in real-world scenarios where trial and error is costly.
The artifact isn't the art: Rethinking creativity in the age of AI (6 minute read) AI-generated Ghibli-style visuals have surged in popularity, straining OpenAI's servers and sparking debates about creativity in the AI age. While AI can rapidly produce artistic images, it lacks the human ability to experience and synthesize complex ideas and emotions. The future of creativity will focus on meaningful outputs shaped by human insight and purpose, with AI as a tool rather than a creator.
If you have any comments or feedback, just respond to this email!
Thanks for reading,
Andrew Tan, Ali Aminian & Andrew Carr