December, 2025

Savant natively uses models in the NVIDIA TensorRT format, optimized for a particular hardware platform. However, users do not need to convert models manually; we encourage them to use the ONNX format, which allows Savant to build TensorRT engines internally.
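As a rough illustration of producing the ONNX file that Savant consumes, the sketch below uses the Ultralytics Python package. The helper names (`onnx_export_args`, `export_to_onnx`) and the chosen option values are assumptions made for this example, not settings prescribed by Savant; adjust them to your model and pipeline.

```python
from pathlib import Path


def onnx_export_args(imgsz: int = 640, dynamic_batch: bool = True) -> dict:
    """Keyword arguments for Ultralytics' YOLO.export(); values are illustrative."""
    return {
        "format": "onnx",          # export target
        "imgsz": imgsz,            # network input size (assumed square)
        "dynamic": dynamic_batch,  # dynamic axes, an assumption for this sketch
        "simplify": True,          # simplify the exported graph
    }


def export_to_onnx(weights: str, **overrides) -> Path:
    """Export Ultralytics weights (e.g. a .pt file) to ONNX and return the file path."""
    # Imported lazily so this module loads even where ultralytics is not installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    exported = model.export(**{**onnx_export_args(), **overrides})
    return Path(exported)
```

Calling `export_to_onnx("yolov8n.pt")` (the weights file name is only an example) should leave an `.onnx` file next to the weights, which can then be referenced from a Savant module configuration.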

Once built, the engines are cached and load quickly. Nvinfer, which Savant uses internally, rebuilds them when the cache is moved to a GPU of a different family (e.g., Turing to Ampere) or when the batch size changes. You may also want to rebuild an engine for a particular GPU even within the same family: TensorRT tunes the engine to the properties of the GPU it is built on, so the result can be better optimized, especially if you allow TensorRT to use more memory. It is worth trying if you want to maximize performance. In this manual, we walk through exporting an Ultralytics model to the ONNX format for use in Savant.
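The rebuild conditions described above can be pictured with a small sketch. It is only a mental model: `EngineKey` and `needs_rebuild` are hypothetical names, and Nvinfer's real cache logic is internal and may take more factors into account.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EngineKey:
    """What a cached engine is tied to in this simplified model."""
    gpu_family: str  # e.g. "Turing", "Ampere"
    batch_size: int


def needs_rebuild(cached: Optional[EngineKey], current: EngineKey) -> bool:
    """Return True when the cached engine cannot be reused as-is."""
    if cached is None:
        # Nothing cached yet: the engine has to be built from the ONNX file.
        return True
    # A different GPU family or a different batch size invalidates the cache.
    return cached != current
```

For example, an engine cached for `EngineKey("Turing", 4)` is reused on another Turing GPU with batch size 4, but is rebuilt for `EngineKey("Ampere", 4)` or `EngineKey("Turing", 8)`.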

As Blackwell GPUs see growing use in computer vision and video analytics, demand for their support in Savant has grown significantly. To add it while keeping support for the widely used NVIDIA edge (Jetson Orin) and datacenter devices, we decided to switch Savant from DeepStream 7.0 to a custom-built DeepStream 7.1 with TensorRT 10.9.

The new Savant release is numbered 0.6.0. The 0.6.x release line is compatible with the previous long-term support 0.5.x line, so users should be able to migrate to Blackwell GPUs without code changes. However, some models may need adjustments to work with TensorRT 10.9. Another important update in the 0.6.x line is the move to Python 3.12 (0.5.x used Python 3.10). Python 3.12 is known to run faster than Python 3.10, which should improve the performance of CPU-bound workloads.

In 0.6.x, we plan to develop Savant in the same evolutionary manner as in 0.5.x: all new features go to 0.6.x, and 0.5.x is frozen for new features.

The next significant Savant upgrade will be the transition from DeepStream 7.1 to DeepStream 8. The timeline is not defined yet; it depends mostly on the NVIDIA roadmap for supporting the entire Jetson line (Orin, Thor), DGX Spark, and top Blackwell GPUs (B300).

To use Savant 0.6.0 on discrete GPUs, you need driver version 570.133.20 or newer; on Jetson, you need JetPack 6.2.
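A quick way to check the discrete-GPU driver requirement is to compare the installed driver version against 570.133.20. The sketch below is one possible check; the function names are made up for this example, and it assumes `nvidia-smi` is on the `PATH` when `installed_driver_version` is called.

```python
import subprocess


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '570.133.20' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum driver version stated for Savant 0.6.0 on discrete GPUs.
MIN_DRIVER = parse_version("570.133.20")


def driver_is_supported(version: str) -> bool:
    """True when the given driver version is 570.133.20 or newer."""
    return parse_version(version) >= MIN_DRIVER


def installed_driver_version() -> str:
    """Ask nvidia-smi for the driver version of the first GPU."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return output.splitlines()[0].strip()
```

On a machine with an NVIDIA GPU, `driver_is_supported(installed_driver_version())` tells you whether the driver needs upgrading before you run the 0.6.0 images.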

Release notes and Docker images are available on our GitHub. The documentation has been updated to reflect the new system requirements.

Do not forget to join our Discord server, where you can ask questions, promote features, and get quick help.

Month: December 2025

How to prepare an Ultralytics model for use in Savant

Savant 0.6.0 is Out: the first release to support Blackwell GPUs