How To Count People In Polygonal Areas with Savant

In an era where data-driven decisions are paramount, the ability to accurately count and monitor people in specific areas has become invaluable. Nvidia’s PeopleNet neural network offers a solution for this. PeopleNet is a pretrained detection model that is commonly deployed with Nvidia’s DeepStream SDK, which specializes in AI-driven video analytics. It is trained on extensive datasets, which helps it detect individuals with high precision even in intricate or densely populated scenes. Other detectors, such as members of the YOLO family, can be used instead.

At its core, people counting in polygonal areas involves detecting and enumerating individuals within a predefined polygonal region in an image or video feed. Unlike generic people counting, this method narrows down the focus to specific zones, allowing for targeted monitoring and analysis.

Applications Of People Counting

People counting has many practical applications. Some of the most common cases are listed below:

Retail Analytics: By monitoring foot traffic in specific store sections, retailers can refine product placements, promotional displays, and overall store design.
Workspace Utilization: Companies can gauge the usage of meeting rooms, lounges, or workspaces, facilitating efficient space management.
Transit Stations: Authorities can monitor passenger flow in specific zones of train stations or bus terminals, aiding in crowd management and safety.
Airport Management: Airports can track passenger density in areas like check-in counters, security checks, and boarding gates, ensuring smooth operations and timely interventions.
Parking Facilities: By counting people entering and exiting parking zones, management can optimize space allocation and security measures.
Public Safety: Monitoring of public areas such as parks, squares, or festivals ensures crowd safety and can assist in emergency response planning.
Urban Planning: City administrators can gather data on pedestrian movement in specific city zones, aiding in the design of pedestrian-friendly spaces.
Health Monitoring: Especially relevant in times of health crises, authorities can oversee public adherence to guidelines like social distancing in designated areas.

Determining if a Point Lies Inside a Polygon: An Algorithmic Approach

Many educational articles use rectangular zones to count objects in areas. This is often sufficient, but it is a limited approach because cameras usually observe a scene in perspective rather than perpendicularly. Thus, even if a rectangle is the right shape for the area itself, in the frame it turns into a trapezoid or an arbitrary polygon. In general, polygonal areas provide greater flexibility and fewer limitations.

One of the classic computational geometry problems is determining whether a given point lies inside, outside, or on the boundary of a polygon. Various algorithms have been proposed to solve this problem, with the “ray casting” or “crossing number” method being among the most popular.

Ray Casting Algorithm: The basic idea behind the ray casting algorithm is straightforward:

Draw a Horizontal Ray: From the point in question, cast a horizontal ray to the right.
Count Intersections: Count how many times this ray intersects with the polygon’s edges.
Determine Position:
- If the number of intersections is odd, the point lies inside the polygon.
- If the number of intersections is even, the point lies outside the polygon.
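The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used by Savant; the function name `point_in_polygon` and the sample `zone` are made up for the example, and points lying exactly on the boundary are not treated specially.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Even-odd ray casting test for a point against a simple polygon.

    Boundary points get no special handling and may be classified
    either way.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count the crossing only if it is to the right of the point
            if x_cross > x:
                inside = not inside
    return inside


# A trapezoid, e.g. a rectangular floor area seen in perspective.
zone = [(2.0, 0.0), (8.0, 0.0), (10.0, 6.0), (0.0, 6.0)]
print(point_in_polygon((5.0, 3.0), zone))  # True: one crossing
print(point_in_polygon((0.5, 1.0), zone))  # False: two crossings
```

The second point falls inside the trapezoid’s bounding rectangle but outside the trapezoid itself, which is exactly the case a rectangular zone would get wrong.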

Video Analytical/Computer Vision Pipeline in Savant

The pipeline is simple and consists of four steps:

Determine whether polygons are defined for the stream (an optimization);
Run PeopleNet inference;
Assign the detected objects to the configured polygons;
Display the results.

The following video shows the working pipeline:

https://www.youtube.com/watch?v=MPsYUU5gWMM&ab_channel=IvanKud

As you can see, the pipeline skips processing when no polygons are defined for the stream. In addition, the Savant-RS Rust core library contains optimized algorithms that help developers determine spatial relations between objects. Thus, you can call high-performance functions to execute the ray-casting algorithm without coding it manually.
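The skip-and-assign logic can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the Savant or Savant-RS API: `count_per_area`, the `areas` mapping, and the box layout are hypothetical, `contains` stands for any point-in-polygon predicate (such as a ray-casting implementation), and using the bottom-center of the bounding box as the person’s position is an assumption made for this example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Polygon = Sequence[Point]
BBox = Tuple[float, float, float, float]  # left, top, width, height


def count_per_area(
    areas: Dict[str, Polygon],
    boxes: List[BBox],
    contains: Callable[[Point, Polygon], bool],
) -> Dict[str, int]:
    """Count detections whose anchor point falls inside each named area."""
    # Optimization: nothing to do for a stream without configured areas.
    if not areas:
        return {}
    counts = {name: 0 for name in areas}
    for left, top, width, height in boxes:
        # Assumption: a person's position is the bottom-center of the box.
        anchor = (left + width / 2, top + height)
        for name, polygon in areas.items():
            if contains(anchor, polygon):
                counts[name] += 1
    return counts
```

Passing the predicate in as an argument keeps the counting logic independent of how the geometric test is implemented, so an optimized library routine can replace a hand-written one without touching this function.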

An additional optimization is frame sampling: for example, processing only every n-th frame and skipping the rest.
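A minimal sketch of such sampling is shown below. The helper name `should_process` is hypothetical, and the example assumes frames carry a sequential index starting at zero.

```python
def should_process(frame_index: int, n: int) -> bool:
    """Return True for every n-th frame, starting with frame 0."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return frame_index % n == 0


# With n = 3, only frames 0, 3, 6 and 9 out of the first ten are processed.
processed = [i for i in range(10) if should_process(i, 3)]
print(processed)  # [0, 3, 6, 9]
```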

To suppress detector flickering, the pipeline can normalize the counters over a time window; alternatively, this can be done in the third-party system that receives the results.
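One possible way to do such normalization is to report the median of the counts seen during the last few seconds. The sketch below assumes this median-based approach; the class name `WindowedCounter` and the window length are hypothetical and are not taken from the Savant sample.

```python
from collections import deque
from statistics import median_low
from typing import Deque, Tuple


class WindowedCounter:
    """Smooths a per-frame count with a median over a sliding time window."""

    def __init__(self, window_sec: float) -> None:
        self.window_sec = window_sec
        self._samples: Deque[Tuple[float, int]] = deque()

    def update(self, timestamp: float, count: int) -> int:
        """Add a raw count and return the smoothed value for the window."""
        self._samples.append((timestamp, count))
        # Drop samples that are older than the window.
        while self._samples[0][0] < timestamp - self.window_sec:
            self._samples.popleft()
        # median_low always returns one of the observed counts.
        return median_low(c for _, c in self._samples)


counter = WindowedCounter(window_sec=1.0)
raw = [(0.0, 3), (0.1, 3), (0.2, 2), (0.3, 3), (0.4, 3)]
smoothed = [counter.update(ts, c) for ts, c in raw]
print(smoothed)  # [3, 3, 3, 3, 3]: the single-frame drop to 2 is suppressed
```

A median is used here rather than a mean so that a detection missed in a single frame does not pull the reported count down; the trade-off is that real changes show up only after they persist for about half the window.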

Extending The Pipeline To Other Domains

In the sample, PeopleNet is used to detect people; however, the model can easily be swapped for one trained to detect apples, cabbages, cars, or stray dogs. This simple change lets the same pipeline solve many counting tasks without extra hassle.

Savant Sample

The above-discussed pipeline is implemented with the Savant framework. You can easily investigate how it works and modify it for your needs.

URI: https://github.com/insight-platform/Savant/tree/develop/samples/area_object_counting

What is Savant?

Savant is an open-source, high-level framework for building real-time, streaming, highly efficient multimedia AI applications on the Nvidia stack. It helps to develop dynamic, fault-tolerant inference pipelines that utilize the best Nvidia approaches for data center and edge accelerators.

Savant is built on DeepStream and provides a high-level abstraction layer for building inference pipelines. It is designed to be easy to use, flexible, and scalable. It is a great choice for building smart CV and video analytics applications for cities, retail, manufacturing, and more.

Don’t forget to follow us on X to receive updates on Savant. We also have a Discord server, where we help users.

Want to know more about Savant? Read our article: Ten reasons to consider Savant for your computer vision project.