No More Disks: The Architecture Behind Stateless Compute in ClickHouse Cloud (23 minute read)
ClickHouse Cloud now uses a fully stateless compute architecture, enabled by a new in-memory database engine that stores all metadata centrally in a Shared Catalog rather than on local disks. This design allows compute nodes to quickly access the latest state at startup and supports stateless compute not only for ClickHouse's native format, but also for open table formats like Iceberg and Delta Lake.

Has Self-Serve BI Finally Arrived Thanks to AI? (18 minute read)
The MCP integration enables users to interact with BI data through a conversational interface, providing real-time answers without the need to manually navigate dashboards. By querying source data directly and supplementing responses with expert validation and visualizations, GenBI reduces the risk of errors or hallucinated results.

MCP and the reshaping of data visualisation & business intelligence (6 minute read)
Model Context Protocol (MCP) is an open standard that allows AI systems like Claude to directly connect to diverse data sources, from PDFs to databases to BI tools, without custom integrations. This could reshape the role of BI professionals, as executives might bypass traditional workflows to self-serve insights via AI. While the human edge in storytelling, governance, and complex analysis still holds, data teams should proactively engage with MCP to stay relevant as automation advances.

There is No Golden Path Anymore: Engineering Practices are Being Rewritten (36 minute podcast)
Ben Matthews from Stack Overflow and Loïc Houssier from Superhuman discuss strategies for engineering teams to navigate rapid technological change, emphasizing strong leadership and aligned autonomy to empower teams and increase organizational velocity. AI is transforming workflows at Superhuman, from improving onboarding and streamlining work to reviving stalled projects.

Rill (GitHub Repo)
Rill lets you quickly build ultra-responsive dashboards from your data lake by co-locating an embedded, in-memory DuckDB engine with a SvelteKit front end, enabling sub-second SQL queries, live profiling, and "dashboards as code" with Git versioning. It auto-profiles datasets on each keystroke, offers opinionated default visuals, and imports Parquet/CSV from S3, GCS, HTTP, or local files. This SQL-first, self-hosted BI tool slashes latency for exploratory analysis and embeds governance via project files.

Kompute (GitHub Repo)
This repository offers a general-purpose GPU compute framework built on Vulkan, designed for high-performance data processing across a range of graphics cards, including AMD, Qualcomm, and NVIDIA. Key features include asynchronous processing, mobile support, and optimization for advanced GPU data tasks, making it highly relevant for data engineers working on GPU-accelerated applications.

Sail 0.3: Long Live Spark (5 minute read)
Sail 0.3 enhances Spark compatibility with a Rust-native execution engine. Supporting both Spark 4.0 and 3.5 while improving performance and reducing latency in cloud-native storage, it introduces a lightweight PySpark client, allows flexible installation options, and automatically adjusts runtime behavior based on the installed Spark version, ensuring seamless integration and efficiency for data engineers.

Announcing Lakebase Public Preview (7 minute read)
Lakebase is a fully managed Postgres database built for AI and analytics on Databricks. It combines transactional and analytical workloads in one platform, supports serverless scaling and instant branching, and integrates with Unity Catalog for governance. This enables faster development of intelligent data apps without managing infrastructure.

SaaS 2.0 (12 minute read)
Traditional SaaS like Salesforce packages generic "opinionated lists" and rigid workflows that often misalign with a team's unique processes.
By contrast, a "specialist-and-a-spreadsheet" model uses AI agents prompted with explicit sales playbooks or matchmaking expertise to dynamically manage lists, ask clarifying questions, and apply nuanced rules on the fly. This flips the paradigm from buying monolithic software to consuming bespoke expertise at scale, promising highly tailored workflows without custom development or cumbersome UIs.

GraphRAG-powered AI Agent interfaces: Real-world applications in incident and change management (18 minute read)
GraphRAG, which combines structured knowledge graphs with retrieval-augmented generation, delivers significant gains in incident and change management compared to traditional RAG by surfacing actionable, context-rich insights. Prototype evaluations on real-world ICM datasets demonstrate that GraphRAG consistently outperforms unstructured and flat vector-based retrieval, especially under noisy or incomplete data. The dual-mode framework presented enables both automated dashboards and real-time AI agent interfaces, supporting human-in-the-loop and autonomous workflows.

No Code Is Dead (15 minute read)
Generative AI is overtaking traditional no-code platforms by enabling rapid app creation via natural language. Without proper controls, however, this accelerates technical debt and creates unmaintainable code, so industry leaders advocate hybrid models (combining AI-driven automation with visual or low-code environments) to ensure readability, governance, and scalability.

If you have any comments or feedback, just respond to this email!

Thanks for reading,
Joel Van Veluwen, Tzu-Ruey Ching & Remi Turpaud