OpenAI has launched three new models in its API: GPT‑4.1, GPT‑4.1 mini, and GPT‑4.1 nano. These models outperform GPT‑4o and GPT‑4o mini across the board, with major gains in coding and instruction following. They also have larger context windows—supporting up to 1 million tokens of context—and are able to better use that context with improved long-context comprehension. They feature a refreshed knowledge cutoff of June 2024.
Hugging Face, the center of the open source AI community, has long stated its goal is to be a decentralized DeepMind. While this isn't exactly the case, adding in an open source robotics platform via Pollen moves it closer to that goal.
DeepMind has announced DolphinGemma, a Google-developed large language model that helps scientists study how dolphins communicate and, they hope, work out what the animals are saying.
A ByteDance team has released a paper showing how to train a competitive 7B parameter video generation model on a "modest" compute budget of 655k H100 hours. The model performs strongly on several temporally difficult tasks.
Most generative models for continuous signals operate in latent space because of computational constraints. This work introduces a series of cascades that let generation happen directly in pixel space, which eliminates the need for a pretrained VAE.
A new VLM that can reason about 3D contact between humans and objects. It does so by starting from a strong base model and lifting its reasoning into 3D with clever multi-view rendering.
Scaling up image tokenizers is challenging because they tend to collapse. This work introduces GigaTok, which is a massive tokenizer with superior reconstruction performance. Decoder scaling and regularization helped with stability and overall quality.
C3PO introduces a new test-time optimization technique that improves accuracy in Mixture-of-Experts LLMs by re-mixing expert weights based on similar reference samples.
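To make the re-mixing idea concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: the function name, the `(embedding, expert_weights)` reference format, the Euclidean nearest-neighbor lookup, and the `k` and `alpha` parameters are all assumptions chosen for illustration. It only shows the general shape of blending one sample's routing weights with those of similar reference samples.

```python
import math


def remix_expert_weights(query_emb, query_weights, references, k=2, alpha=0.5):
    """Blend a sample's expert weights with those of its k nearest references.

    Hypothetical helper, not taken from the C3PO paper.
    references: non-empty list of (embedding, expert_weights) pairs.
    alpha: how far to move toward the neighbors' average (0 keeps the original).
    """
    def dist(a, b):
        # Euclidean distance between two embeddings of equal length.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Pick the k reference samples closest to the query embedding.
    nearest = sorted(references, key=lambda ref: dist(query_emb, ref[0]))[:k]

    # Average the neighbors' expert weights, expert by expert.
    n_experts = len(query_weights)
    neighbor_avg = [
        sum(weights[i] for _, weights in nearest) / len(nearest)
        for i in range(n_experts)
    ]

    # Interpolate between the original weights and the neighbor average,
    # then renormalize so the mixed weights sum to 1.
    mixed = [(1 - alpha) * q + alpha * n for q, n in zip(query_weights, neighbor_avg)]
    total = sum(mixed)
    return [m / total for m in mixed]
```

In this toy version, a sample whose two nearest references both route entirely to the first expert is pulled toward that expert, while distant references have no effect.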
OpenAI's BrowseComp is a new benchmark of 1,266 problems designed to evaluate AI agents' browsing skills in gathering complex, hard-to-locate information online.
Executives from nine companies share how they're leveraging Google Cloud's AI tools to drive innovation across sectors, with over 600 real-world use cases highlighted.
Vertex AI introduces updates to video, image, speech, and music generation models, enhancing creative workflows for businesses. Google AI is enabling specialized AI agents for companies, improving productivity and security. A new Agent2Agent Protocol allows different AI agents to securely communicate across platforms.
NVIDIA is localizing AI hardware production by building factories in Texas and Arizona, aiming to produce Blackwell chips and AI supercomputers entirely within the U.S.
Educators can now use Gemini to generate questions or quizzes from selected text in Google Classroom, enhancing lesson interactivity and streamlining content creation.