Everybody loves benchmarking, and we love it, too! We always claim that Savant is fast and highly optimized for Nvidia hardware because it uses TensorRT inference under the hood. However, without numbers and benchmarks, the declaration may sound unfounded. Thus, we decided to publish a benchmark demonstrating the inference performance for three technologies:
- PyTorch on CUDA + video processing with OpenCV;
- PyTorch on CUDA + hardware accelerated (NVDEC) video processing with Torchaudio (weirdly, video processing primitives lie in the Torchaudio library);
- Savant.
The 1st is what most developers usually use as a de-facto approach. The 2nd is used rarely because it requires a custom build, and developers often underestimate hardware-accelerated video decoding/encoding as the critical enabling factor for CUDA-based processing.
Continue reading Serving YOLO V8: Savant is More Than Three Times Faster Than PyTorch