12TB of AI Coding Agent Logs (17 minute video)
AI coding is shifting from token maxing to token efficiency as teams move from subscriptions to per-token billing and costs become harder to control. Better workflows rely on careful upfront planning, right-sized agent sessions, cleaner context, API-first tooling, strong CI, and focused human review.
|
How we built SmithDB's inverted index for full-text search (11 minute read)
SmithDB builds inverted indexes with efficient JSON parsing, tokenization, string interning, and radix sorting; interning lifted construction speed by ~2.2x. Streaming compaction bounds memory regardless of index size, while aligned chunks and request coalescing reduce object-storage GETs. Queries merge local-SSD indexes with object-storage segments for sub-second freshness.
|
|
Why Real Workload Performance is the Metric that Matters (7 minute read)
Real workload performance matters more than headline benchmarks because production systems need to handle real data, concurrency, latency, scale, and cost. Performance claims should be judged by whether the workload matches yours, the setup is production-ready, results hold as data grows, and the product is actually available.
|
Building My Own Self-Hosted dbt Cloud (6 minute read)
A self-hosted dbt Cloud-style app can deliver much of the dbt Cloud developer experience by combining dbt Core with a React/FastAPI interface and Prefect for orchestration. The biggest lesson is to use APIs, not CLI scraping, for reliable job management, logs, deployments, and real-time run status.
|
|
Apache Flink 2.3.0 Release Announcement (8 minute read)
Flink 2.3 moves toward a declarative streaming data platform. Materialized tables can evolve through DDL and query changes while avoiding unnecessary historical reprocessing in many common cases. SQL adds changelog conversion, explicit upsert conflict handling, and native S3 support without Hadoop dependencies.
|
Hardwood 1.0: A Fast, Lightweight Apache Parquet Reader for the JVM (9 minute read)
Hardwood 1.0 is a production-ready, JVM-native Parquet reader for Java 21+ that removes mandatory dependencies and parallelizes page decoding across CPU cores by default. It supports Parquet physical and logical types, projections, predicate push-down, and both local and object-store files, and it exposes row and batch column APIs. Benchmarks show 16.5M rows/sec and ~17-18x speedups from selective push-down.
|
Kafka Share Groups - Pathological Fetch Waits with Record_limit (13 minute read)
A notable performance pitfall in Kafka Share Groups arises when using record_limit with fewer consumers than partitions, especially under partition skew. This leads to pathological fetch waits, which can drastically slow consumption during backlog drains or skewed workloads. The simplest mitigation is to use at least as many consumers as partitions when running with record_limit.
|
|
Gemma Interactions View (5 minute read)
A coding-agent challenge turned into a collaborative lab, with agents sharing playbooks, pooling quota, debugging each other's work, and stacking small improvements into big performance gains.
|