Rust, Python

Case Study: VideoSwiper, how switching from Python to Rust made my software faster and more efficient

June 4, 2026

What is VideoSwiper?

VideoSwiper was born from a practical need: reviewing gigabytes of video footage without actually watching it.

The program generates customizable frame collages at high speed so the user can easily choose what to keep and what to move to the trash folder. To power this, a robust and fast collage generation algorithm is essential; without it, the tool would be useless.

The Switch: from Python to Rust

Phase 1: Python and OpenCV

The original Python algorithm used OpenCV, and it worked, but it had two critical issues:

  • Storage usage: The output collages (lossless PNGs) took up a massive amount of space (e.g., 555MB for 59 videos).
  • Execution overhead: The Python implementation relied on Dart launching multiple Python interpreters at the same time, which created huge overhead and made for an extremely inefficient substitute for multi-threading.

Phase 2: The initial Rust implementation

The first part of the transition was rough. I started with a 1:1 port of the Python algorithm, and the performance wasn't great at all.

That first Rust implementation took 2 minutes and 5 seconds to process a single high-quality video. This taught me a valuable lesson: Rust is faster by default, but you also need good code to support it.

Phase 3: Understanding and fixing the problem

I rewrote the code from the ground up, taking inspiration from the old Python logic. Here is what actually worked:

  • Massive parallelism: Using Rayon, I managed to get parallelism working in a matter of minutes, making the code much faster by properly leveraging multi-core processors.
  • Stride handling: For some videos with unusual resolutions, the FFmpeg library adds "padding" data to each row to align the video in memory, which altered the output. I eliminated that problem by making the program work out the true width and height of each video file.
  • Faster resizing: When making a program as fast as possible, you also need to select the right libraries. I noticed that with the standard image crate, resizing took almost 60% of the collage creation time. Then I stumbled upon fast_image_resize, a library that uses SIMD instructions to do exactly what I needed at blazing speeds.
  • Lanczos3 vs Nearest Neighbor: During the rewrite, I had to choose the resampling algorithm. I tested Lanczos3 and Nearest Neighbor for speed and quality. Since VideoSwiper is built to be as fast as possible (even at the cost of a tiny bit of image quality), I went with Nearest Neighbor. A single frame extraction could be up to 10 times slower with Lanczos3, for a difference that not every user would notice. (I’m still considering adding a settings tab for advanced users to choose their scaling algorithm.)
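
Two ideas from this list, spreading frames across cores and resizing with nearest neighbor, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not VideoSwiper's actual code: `RgbImage`, `resize_nearest`, and `resize_all` are hypothetical names, `std::thread::scope` stands in for Rayon (where the loop would typically become a `par_iter().map(...)` call), and the hand-written loop stands in for fast_image_resize.

```rust
use std::thread;

/// Hypothetical container: a tightly packed RGB8 image (3 bytes per pixel, no row padding).
#[derive(Clone, Debug, PartialEq)]
pub struct RgbImage {
    pub width: usize,
    pub height: usize,
    pub data: Vec<u8>,
}

/// Nearest-neighbor resize: every destination pixel copies exactly one source pixel.
pub fn resize_nearest(src: &RgbImage, dst_w: usize, dst_h: usize) -> RgbImage {
    assert!(src.width > 0 && src.height > 0, "source image must not be empty");
    assert_eq!(src.data.len(), src.width * src.height * 3, "unexpected buffer size");

    let mut data = Vec::with_capacity(dst_w * dst_h * 3);
    for y in 0..dst_h {
        // Map the destination row back to a source row (integer floor).
        let sy = y * src.height / dst_h;
        for x in 0..dst_w {
            let sx = x * src.width / dst_w;
            let i = (sy * src.width + sx) * 3;
            data.extend_from_slice(&src.data[i..i + 3]);
        }
    }
    RgbImage { width: dst_w, height: dst_h, data }
}

/// Resizes all frames using up to `workers` scoped threads (a std-only stand-in for Rayon).
pub fn resize_all(
    frames: &[RgbImage],
    dst_w: usize,
    dst_h: usize,
    workers: usize,
) -> Vec<RgbImage> {
    let workers = workers.max(1);
    // Ceiling division so every frame lands in a chunk; at least 1 so `chunks` is valid.
    let chunk_len = ((frames.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        let handles: Vec<_> = frames
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|frame| resize_nearest(frame, dst_w, dst_h))
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        // Joining in spawn order keeps the output in the same order as the input.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Nearest neighbor does a single copy per output pixel, which is why it is so much cheaper than a filter like Lanczos3 that has to weigh a neighborhood of source pixels for every output pixel.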

Performance analysis

To truly measure the improvements, I ran a series of benchmarks. The numbers speak for themselves.

1. The Single Video Test

  • Resolution: 2560x1440
  • Framerate: 60fps
  • Length: 1 minute
  • Size: 209MB
  • Frames extracted: 40
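
| Quality level | Python time | Python size | Rust time | Rust size |
|---|---|---|---|---|
| 0 | 7.14s | 4.44MB | 1.43s | 450KB |
| 1 | 6.12s | 9.61MB | 1.52s | 1.05MB |
| 2 | 6.36s | 16.40MB | 1.94s | 2.10MB |
| 3 | 7.66s | 34.20MB | 2.43s | 5.43MB |
| 4 | 9.14s | 55.60MB | 2.78s | 12.70MB |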

2. The Batch Test

I took a collection of 59 videos, mostly high resolution, with a total size of 32.7GB. (Settings: Quality 2, 40 Frames, 12 Threads)

| Engine | Total time | Total size |
|---|---|---|
| Python | 1 minute 4.97s | 555 MB |
| Rust | 25.60s | 57.7 MB |
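
To make the comparison explicit, the batch figures above can be turned into ratios. The helper below is only an illustrative sketch (`ratio` and the constant names are hypothetical); the numbers themselves are taken from the batch test, with "1 minute 4.97s" written as 64.97 seconds.

```rust
/// Batch test figures: total time in seconds, total output size in MB.
pub const PYTHON_SECS: f64 = 64.97; // 1 minute 4.97s
pub const RUST_SECS: f64 = 25.60;
pub const PYTHON_MB: f64 = 555.0;
pub const RUST_MB: f64 = 57.7;

/// How many times larger `before` is than `after`.
pub fn ratio(before: f64, after: f64) -> f64 {
    before / after
}

/// Returns (speedup, size reduction) for the batch test.
pub fn batch_ratios() -> (f64, f64) {
    (ratio(PYTHON_SECS, RUST_SECS), ratio(PYTHON_MB, RUST_MB))
}
```

With these inputs the speedup comes out at roughly 2.5x and the size reduction at roughly 9.6x.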

Technical deep dive: the stride problem

One of the trickiest things about video processing is stride.

Video decoders often align rows of pixels to memory boundaries for hardware efficiency, adding invisible bytes at the end of each row. For example, if a row is 500 bytes long and memory is aligned to 512 bytes, the decoder adds 12 bytes of padding so that the next row starts on an aligned address.
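
The arithmetic behind that example can be written down as a small sketch. The function names here are hypothetical, and the alignment value is whatever the decoder happens to use; in practice the stride should be read from the decoded frame rather than recomputed, so treat this purely as an illustration of the rounding.

```rust
/// Rounds `row_bytes` up to the next multiple of `align`.
pub fn aligned_stride(row_bytes: usize, align: usize) -> usize {
    assert!(align > 0, "alignment must be non-zero");
    (row_bytes + align - 1) / align * align
}

/// Number of padding bytes appended to each row for the given alignment.
pub fn row_padding(row_bytes: usize, align: usize) -> usize {
    aligned_stride(row_bytes, align) - row_bytes
}
```

For the 500-byte row above, `aligned_stride(500, 512)` gives 512 and `row_padding(500, 512)` gives 12. A row that is already aligned gets no padding at all.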

If we treat the buffer "as is," the padding bytes get read as pixels, so we can no longer tell where one row ends and the next one starts, and the frame ends up misaligned on the collage.

rust
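// Remove stride padding to get clean, tightly packed RGB data.
// `stride` is the length of one row in bytes, padding included;
// `u_w` and `u_h` are the frame's true width and height in pixels.
let mut clean_data = Vec::with_capacity(u_w * u_h * 3);
for row in 0..u_h {
    let start = row * stride; // first byte of this row in the padded buffer
    let end = start + (u_w * 3); // 3 bytes per RGB pixel, padding excluded
    clean_data.extend_from_slice(&data[start..end]);
}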
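
The same idea can be wrapped in a standalone function that validates its inputs instead of panicking on a short buffer. This is a sketch under assumptions, not the code VideoSwiper ships: `strip_stride` and its parameters are hypothetical, and it assumes a positive stride expressed in bytes.

```rust
/// Copies the visible bytes of each row out of a padded frame buffer.
/// Returns `None` if the stride is too small or the buffer is too short.
pub fn strip_stride(
    data: &[u8],
    width: usize,
    height: usize,
    stride: usize,
    bytes_per_pixel: usize,
) -> Option<Vec<u8>> {
    let row_bytes = width.checked_mul(bytes_per_pixel)?;
    if row_bytes > stride {
        return None; // a row cannot be longer than its stride
    }
    if height == 0 || row_bytes == 0 {
        return Some(Vec::new());
    }

    // The last row only needs its visible bytes; its trailing padding may be absent.
    let needed = (height - 1).checked_mul(stride)?.checked_add(row_bytes)?;
    if data.len() < needed {
        return None;
    }

    let mut clean = Vec::with_capacity(row_bytes * height);
    for row in data.chunks(stride).take(height) {
        clean.extend_from_slice(&row[..row_bytes]);
    }
    Some(clean)
}
```

Returning `Option` keeps a malformed frame from crashing a whole batch; the caller can skip that video and move on.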

Conclusions

VideoSwiper proves that choosing a fast language is only the beginning of a long process. A well-balanced codebase, rock-solid integrations, and the right libraries can drastically change how users experience your software.

  • Final speed: More than an hour of 1440p content summarized in under 30 seconds.
  • Relative speed: About 2.5x faster than the old Python solution on the batch test (25.60s vs. 1 minute 4.97s).
  • Efficiency: Roughly 10x storage savings on the generated collages (57.7 MB vs. 555 MB).