From 6a01cf76b2a2c59a5d820ba81a487ef6dd1d83e4 Mon Sep 17 00:00:00 2001
From: Whelan Boyd
Date: Fri, 23 Jan 2026 13:58:26 -0800
Subject: [PATCH 1/2] Add Jan update

Signed-off-by: Whelan Boyd
---
 src/content/blog/december-2025.mdx | 52 ++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
 create mode 100644 src/content/blog/december-2025.mdx

diff --git a/src/content/blog/december-2025.mdx b/src/content/blog/december-2025.mdx
new file mode 100644
index 0000000..325504c
--- /dev/null
+++ b/src/content/blog/december-2025.mdx
@@ -0,0 +1,52 @@
+---
+title: "December & January Bulletin"
+date: "2026-01-06"
+authors: ["Community Team"]
+excerpt: "Overview of all work happening in Vortex"
+published: true
+---
+
+[TODO - Adam to fill in Datafusion 52 support]
+
+The work mentioned in the previous bulletin to revamp Array evaluation to be fully lazy was released on January 6th. This happens by converting their execution to an Operator model that evaluates into Vectors (fully decompressed, zero-copy to Arrow representation). As a reminder, this work enables many more optimizations, and also provides unified abstractions for evaluating on
+different processor types (CPUs & GPUs).
+
+Speaking of, focus is now on GPU support. The goal is to enable querying training data on the fly and streaming it from object storage directly to GPU memory with high throughput. To achieve this, the team is: - adding the necessary encodings via Vortex's extensions to enable GPU-native decompression - integrating with NVIDIA's CUDA toolkit so GPUs can scan Vortex files in object storage directly
+
+## Around the Ecosystem
+
+- DuckDB Labs published performance benchmarks comparing Vortex (using the `duckdb-vortex` extension) versus Parquet. Spoiler alert: Vortex was ~18% faster!
+- Spice AI began a blog series about building their data accelerator, Cayenne, with Vortex and Datafusion
+  - [🌪️ Vortex: The Bet on Encoding-Efficient Columnar Storage for Hot Data](https://www.linkedin.com/posts/lukekim_datafusion-spiceai-data-activity-7417019189477126144-1TRe/)
+  - [🔬 The Research Behind Modern Data Compression & Vortex](https://www.linkedin.com/posts/lukekim_datafusion-developers-ai-activity-7417649503291498496-TD5_/)
+  - [🤖 Three Data Problems Vortex Solves for Applications and Agents in 2026](https://www.linkedin.com/posts/lukekim_vortex-efficient-columnar-storage-for-hot-activity-7419472524750798848-lZC6/)
+
+## Acknowledgments
+
+We want to thank everyone who has tried Vortex, provided feedback, asked questions, and filed issues.
+
+The following contributed to the December & January releases.
+
+```text
+Joe Isaacs
+Adam Gutglick
+Connor Tsui
+Nicholas Gates
+Alexander Droste
+Robert Kruszewski
+Alfonso Subiotto Marqués
+Andrew Duffy
+Cancai Cai
+Onur Satici
+Dmitrii Blaginin
+Dan King
+Baris Palaska
+godnight10061
+sherlockbeard
+Frederic Branczyk
+paultiq
+Pratham Agarwal
+Hao Huaijin
+Dave Bunten
+Harry Scholes
+```

From 0c520f8315bc20531cfd9ebb8966bcdbe7ba73ec Mon Sep 17 00:00:00 2001
From: Whelan Boyd
Date: Fri, 23 Jan 2026 14:14:46 -0800
Subject: [PATCH 2/2] Clean up GPU support

Signed-off-by: Whelan Boyd
---
 src/content/blog/december-2025.mdx | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/content/blog/december-2025.mdx b/src/content/blog/december-2025.mdx
index 325504c..04cc906 100644
--- a/src/content/blog/december-2025.mdx
+++ b/src/content/blog/december-2025.mdx
@@ -11,7 +11,14 @@ published: true
 The work mentioned in the previous bulletin to revamp Array evaluation to be fully lazy was released on January 6th. This happens by converting their execution to an Operator model that evaluates into Vectors (fully decompressed, zero-copy to Arrow representation). As a reminder, this work enables many more optimizations, and also provides unified abstractions for evaluating on
 different processor types (CPUs & GPUs).
 
-Speaking of, focus is now on GPU support. The goal is to enable querying training data on the fly and streaming it from object storage directly to GPU memory with high throughput. To achieve this, the team is: - adding the necessary encodings via Vortex's extensions to enable GPU-native decompression - integrating with NVIDIA's CUDA toolkit so GPUs can scan Vortex files in object storage directly
+## GPU Support
+
+Speaking of, focus is now on enabling reading Vortex files into GPUs. To achieve this, the team is:
+
+- Adding GPU decompression for existing kernels, as well as some new encodings optimized for GPU
+- Integrating with NVIDIA's CUDA toolkit for high performance I/O.
+
+The first supported output types will be Arrow Device Arrays and cuDF. As with all of Vortex, these capabilities are fully exposed to plugins so advanced users can extend and customize for their own use.
 
 ## Around the Ecosystem
 