Designing a DuckDB-Based Distributed Data Pipeline Proven to Process 100 TiB of Data
From DeepSeek to Quack: When the Dream of Distributed DuckDB Started to Feel Real
Load PostgreSQL into Apache Iceberg with Sling
Scraping dynamic pages with Python, Playwright and AWS Lambda
DuckLake 1.0: the data lake format that moves the file catalog into SQL and promises 926x the speed of Iceberg
Taming the Chaos: Cleaning 10M+ Apple Health Records into a Production-Ready Parquet Lakehouse
Podcast: A Java Performance Quest: Taming Unsafe Code, Embracing Idiomatic Style & Debugging the Linux Kernel
Delta Lakes: ACID Transactions, Time Travel & Delta Tables
Finding a Practical Analytics Format for Structured JSON Logs
3.4M Solar Panels
DuckDB uses RDBMS to attack classic 'small changes' problem in lakehouses
Elusion v8.3.0 is out!
Parquet Content-Defined Chunking
Improving Parquet Dedupe on Hugging Face Hub
Introducing the SQL Console on Datasets
DuckDB: analyze 50,000+ datasets stored on the Hugging Face Hub