How Perplexity Built an AI Google (16 minute read)
Perplexity built an "answer engine" by combining real-time web search with LLMs through a Retrieval-Augmented Generation (RAG) pipeline that searches the web, extracts relevant snippets, and generates cited answers. Its architecture uses Vespa AI to index 200+ billion URLs, routes queries between in-house "Sonar" models and third-party LLMs (GPT/Claude) based on complexity, and runs on a custom-built ROSE inference engine optimized for speed and cost.

Why we migrated from Python to Node.js (10 minute read)
Skald rewrote its backend from Python (Django) to Node.js a week after launching to address scaling concerns, primarily because Python's async support made LLM interactions complex and limiting and felt less intuitive than Node.js's. The migration took three days and yielded a roughly threefold increase in throughput and a more unified codebase, though it meant building more utilities from scratch and giving up the Python ecosystem's ML focus.

Why your retrospectives don't work and how to fix them (11 minute read)
Most retrospectives fail because teams document problems without fixing them, using vague action items like "investigate" instead of concrete solutions with owners and deadlines. Principles from the Toyota Production System can help. Assign a rotating "Fixer" role (one person responsible for fixes each week), conduct postmortems immediately while context is fresh, and implement permanent solutions that prevent problems from recurring rather than temporary patches.

Architectural debt is not just technical debt (8 minute read)
Architectural debt extends beyond code to include business and strategy layers, impacting whole organizations.
Enterprise Architects should focus on integration patterns, system overlaps, and vendor lock-in at the application/infrastructure layer, and on ownership, documentation, and cross-departmental processes at the business layer. At the strategy layer, they should ensure accurate capability definitions and frameworks to prevent flawed strategic decisions.

Web Codegen Scorer (GitHub Repo)
The Web Codegen Scorer is a tool developed by the Angular team at Google to evaluate the quality of web code generated by LLMs. It allows users to compare code quality across different models, frameworks, and tools using established code quality measures.

oRPC (GitHub Repo)
oRPC makes it easier to build end-to-end type-safe APIs that adhere to OpenAPI standards. It supports contract-first development and OpenTelemetry integration, along with various frameworks, schema validators, and runtimes. The library provides a suite of packages for building APIs, creating clients, generating OpenAPI specs, and integrating with popular frameworks such as React and NestJS.

Ruby Benchmark (GitHub Repo)
The Benchmark module in Ruby allows devs to measure the execution time of Ruby code snippets. It provides detailed reports, including user CPU time, system CPU time, total time, and real (elapsed) time.

The Case Against pgvector (13 minute read)
pgvector is a useful extension for bringing vector similarity search to Postgres, but online tutorials often oversimplify it. Running pgvector in production still poses many challenges, including complex index management, query planning difficulties with filtering, and the hidden costs of real-time indexing. Existing blog posts often omit the operational realities and trade-offs involved, and for many teams, traditional vector databases are usually simpler.
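
One concrete form the filtering difficulty can take: pgvector's documentation notes that with approximate indexes, filtering is applied after the index is scanned, so a filtered query can return fewer rows than requested. The sketch below is not taken from the linked article and does not use pgvector; it is a pure-Python illustration in which an exact nearest-neighbor search stands in for the index scan, and every name in it (`top_k_nearest`, `filter_after_search`, `filter_before_search`, the `tenant` field) is hypothetical.

```python
import math


def top_k_nearest(query, rows, k):
    """Return the k rows closest to query (exact search, standing in for an index scan)."""
    return sorted(rows, key=lambda r: math.dist(query, r["embedding"]))[:k]


def filter_after_search(query, rows, k, predicate):
    """Search first, then filter: the result may hold fewer than k rows."""
    return [r for r in top_k_nearest(query, rows, k) if predicate(r)]


def filter_before_search(query, rows, k, predicate):
    """Filter first, then search: returns k rows whenever at least k rows match."""
    return top_k_nearest(query, [r for r in rows if predicate(r)], k)


def is_tenant_3(row):
    return row["tenant"] == 3


# Toy data: 100 one-dimensional embeddings; every tenth row belongs to tenant 3.
rows = [{"id": i, "tenant": i % 10, "embedding": [float(i)]} for i in range(100)]
query = [0.0]

after = filter_after_search(query, rows, 10, is_tenant_3)
before = filter_before_search(query, rows, 10, is_tenant_3)

print([r["id"] for r in after])   # [3]
print([r["id"] for r in before])  # [3, 13, 23, 33, 43, 53, 63, 73, 83, 93]
```

Only one of the ten nearest rows belongs to tenant 3, so filtering after the search returns a single result even though ten matching rows exist. Filtering first finds all ten, but in a real database that ordering may not be able to use the vector index, which is the kind of trade-off a query planner has to weigh.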

Why engineers can't be rational about programming languages (12 minute read)
Programming language choices are often driven by identity and emotion rather than objective technical analysis, leading to costly mistakes. Neuroscience research supports this, showing that challenging identity-based beliefs activates threat responses in the brain. The article illustrates the point with examples, including a personal experience in which a CTO's Perl preference bankrupted a company and a more recent case in which a VP chose Rust based on hype.

AI's Dial-Up Era (17 minute read)
The current AI boom resembles the internet's early days in 1995. The article cautions against both extreme optimism and extreme pessimism about AI's impact. AI's effect on employment will vary by industry, depending on unmet demand and the pace of automation.

App Store web version (GitHub Repo)
This is an archive of the Apple App Store's frontend source code, unintentionally made public because Apple left sourcemaps enabled in production.

A Friendly Tour of Process Memory on Linux (20 minute read)
This is a detailed explanation of how Linux manages process memory. It covers virtual memory, page tables, VMAs, memory mapping, copy-on-write, transparent huge pages, TLB invalidation, Meltdown mitigations, and tools for inspecting memory usage.

Thanks for reading,
Priyam Mohanty, Jenny Xu & Ceora Ford