πŸš€ 0.2.7 Release Notes

Savant 0.2.7 was released on February 7, 2024. The release includes several bug fixes, four new demos, and other enhancements, including improved documentation and benchmarking.

Savant crossed the 400-star mark on GitHub, and our Discord server has become the go-to place for the community. The work on the release took three months. In the following sections, we cover the essential parts of the release in detail.

IMPORTANT: Savant 0.2.7 is the last feature release in the 0.2.X branch. The following releases in the 0.2.X branch will be maintenance and bugfix releases. Feature development switches to the 0.3.X branch based on DeepStream 6.4, which WILL NOT support the Jetson Xavier family because Nvidia does not support these devices in DS 6.4.

GitHub: https://github.com/insight-platform/Savant/releases/tag/v0.2.7

New Demos

In total, we now provide 26 demos and samples. The most significant samples are collected on the Savant website.

The release introduces new demos:

The RT-DETR transformer-based detection model demo. The demo shows how to use the real-time RT-DETR detection model with Savant. The model runs slightly slower than the de-facto standard YOLOv8 but opens the way for transformer-based models in real-time computer vision. Read more in a dedicated article.

CuPy-based postprocessing demo. Before Savant 0.2.7, postprocessing operated on tensors downloaded to CPU memory. In Savant 0.2.7, we implemented direct access to GPU-allocated tensors and a set of functions converting those tensors between OpenCV GpuMat, PyTorch, and CuPy. This is a significant improvement for models involving large tensors and complex postprocessing. Now, you can use a marker in a postprocessing function to instruct the framework to keep output tensors on the GPU and process them with CuPy as an almost drop-in replacement for NumPy. We updated the YOLOv8 segmentation demo to optionally use GPU-based postprocessing, which greatly offloads the CPU; previously, NumPy utilized almost 100% of all available CPU cores.
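To illustrate the "drop-in replacement" point, here is a minimal, hypothetical postprocessing function written against the NumPy API (the function name and shapes are illustrative, not Savant's actual API). Because CuPy mirrors NumPy's interface, the same code runs on GPU-resident tensors by swapping the array module:

```python
import numpy as np

def postprocess(raw, xp=np):
    # With Savant's GPU marker, `raw` would be a CuPy array; pass `xp=cupy`
    # and the identical code executes on the GPU without CPU transfers.
    scores = 1.0 / (1.0 + xp.exp(-raw))   # sigmoid over per-class logits
    return xp.argmax(scores, axis=-1)      # best class index per detection

labels = postprocess(np.array([[0.1, 2.0, -1.0], [3.0, -0.5, 0.2]]))
# labels -> array([1, 0])
```

The sigmoid is monotonic, so argmax picks the highest logit per row; the point is that elementwise ops and reductions share the same names and semantics in NumPy and CuPy.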

PyTorch integration demo. Savant uses the TensorRT inference engine for blazingly fast inference. However, PyTorch is a dominant technology with plenty of high-quality code and models. In the release, we demonstrate integrating PyTorch into a Savant pipeline and using GPU-accelerated inference and postprocessing in pure Python without excessive CPU-GPU transfers. PyTorch is not as fast as TensorRT, but it is beneficial in some cases. Read more in a dedicated article.
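The core idea of keeping inference and postprocessing on the GPU can be sketched in plain PyTorch (the model and function names here are illustrative stand-ins, not Savant's demo code); note that both the forward pass and the argmax postprocessing stay on the same device:

```python
import torch

# Pick the GPU when available; the sketch also runs on CPU for illustration.
device = "cuda" if torch.cuda.is_available() else "cpu"

# A toy classifier standing in for a real model in the pipeline.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
).to(device).eval()

@torch.inference_mode()          # disable autograd bookkeeping for inference
def infer(frame_tensor):
    # The input tensor, the model output, and the argmax postprocessing all
    # live on `device`; nothing is copied to CPU memory until the caller
    # explicitly asks for the result.
    out = model(frame_tensor)
    return out.argmax(dim=1)

frame = torch.rand(1, 3, 64, 64, device=device)  # stand-in for a video frame
label = infer(frame)
```

The design point is that `.item()` or `.cpu()` is deferred to the very end, so per-frame work never round-trips through host memory.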

Oriented bounding-boxes demo. Savant supports oriented bounding boxes (boxes with a rotation angle) out of the box. In the release, we finally implemented a demo showing how to use them.
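For readers unfamiliar with the representation, an oriented box is typically given as a center, a width/height, and a rotation angle. A small, hypothetical helper (not Savant's API) shows the geometry of recovering the four corners:

```python
import math

def rbbox_corners(cx, cy, w, h, angle_deg):
    """Corners of an oriented box: center (cx, cy), size (w, h),
    rotation angle in degrees. Illustrative helper, not Savant's API."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2),
                   (w / 2, h / 2), (-w / 2, h / 2)):
        # rotate each half-size offset around the center
        corners.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return corners

# Angle 0 degenerates to an ordinary axis-aligned box:
print(rbbox_corners(10, 10, 4, 2, 0))
# -> [(8.0, 9.0), (12.0, 9.0), (12.0, 11.0), (8.0, 11.0)]
```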

New Significant Features

  • Prometheus integration. The pipeline and buffer adapter can export runtime metrics to Prometheus and Grafana. It improves observability and helps users monitor pipeline performance. Developers can declare custom metrics exported along with system metrics. For details, explore this demo and read more in a separate article.
  • Buffer Adapter. The buffer adapter implements a persistent transactional on-disk buffer for data traveling between adapters and modules. With the adapter, it is possible to develop highly utilized pipelines that consume resources unpredictably and survive traffic bursts. The adapter exports its item count and byte-size load to Prometheus.
  • Compile-only run mode. Modules can now compile their TensorRT models without running the pipeline. We provide a special command that compiles the models and then shuts the pipeline down without serving traffic. It makes deployments more predictable: previously, when a model took a long time to compile, it was unclear what was happening. Read more in a separate article.
  • Shutdown handler in PyFunc. This new API allows handling pipeline shutdown operations properly to release resources and notify 3rd-party systems about the termination.
  • Frame filtering on Ingress and Egress. By default, the pipeline accepts all frames containing video data. With Ingress and Egress filtering, developers can filter out frames to avoid unnecessary processing. For example, they can select only keyframes for processing and drop the rest. These filters are advanced because they operate on encoded video streams and thus can corrupt them when misused.
  • Model post-processing on GPU. With this new feature, developers can instruct the framework to access model output tensors directly in GPU memory without downloading them to CPU memory. This is a significant improvement for models involving large tensors and complex postprocessing. Now, you can use a marker in a postprocessing function to instruct the framework to keep output tensors on the GPU and process them with CuPy as an almost drop-in replacement for NumPy. Read more in a separate article.
  • GPU memory representation functions. In the release, we provide functions for converting memory buffers between OpenCV GpuMat, PyTorch GPU tensors, and CuPy tensors. This enables efficient GPU-only data processing at scale without copying data to CPU memory. Libraries such as OpenCV CUDA, TorchVision, and CuPy provide a broad range of high-performance algorithms, allowing developers to utilize GPU resources efficiently and decrease the requirements for CPU capacity.
  • Advanced object attribute modification operations. In the release, we implemented a set of new operations allowing developers to modify object attributes more conveniently.
  • Queue utilization API for PyFunc. Savant allows adding queues between PyFuncs to implement parallel processing and traffic burst management. With the release, we implemented a new API allowing developers to access queues deployed in the pipeline and query their utilization.
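The GPU memory representation functions above rely on the standard interchange protocols that GPU array libraries share. As a conceptual illustration (not Savant's conversion API), the DLPack protocol lets two libraries wrap the same buffer with zero copies. The sketch below uses NumPy and PyTorch on the CPU so it runs anywhere; on a GPU host, the same calls exchange device memory between CuPy and PyTorch:

```python
import numpy as np
import torch

# A source buffer in one library...
src = np.arange(6, dtype=np.float32).reshape(2, 3)

# ...wrapped by another via DLPack: no copy, both views share memory.
t = torch.from_dlpack(src)
t[0, 0] = 42.0                  # mutation is visible through both views
assert src[0, 0] == 42.0

# Round-trip back; still the same underlying memory.
back = np.from_dlpack(t)
```

Because no bytes move, such conversions are effectively free, which is what makes mixing OpenCV CUDA, PyTorch, and CuPy stages in one GPU-resident pipeline practical.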

Next Release (0.3.7)

In the upcoming 0.3.7, we plan to switch to DeepStream 6.4 without introducing other features. The idea is to ship a release fully compatible with 0.2.7 but based on DeepStream 6.4 and its improved technology, with no API incompatibilities.

Don’t forget to subscribe to our X account to receive updates on Savant. We also have a Discord server, where we help users.

To learn more about Savant, read our article: Ten reasons to consider Savant for your computer vision project.