How To Work With MJPEG USB Camera in Savant

Many USB cameras offer MJPEG as one of their main video stream formats. MJPEG is a lossy format that compresses every frame independently as a JPEG image, which keeps latency low and reduces bandwidth enough for USB cameras to support high-resolution, high-FPS streaming. It is also a very popular solution for synchronized stereo cameras, as depicted in the lead image.
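To see why compression matters on the USB link, a back-of-the-envelope estimate helps. The sketch below assumes an uncompressed YUY2 stream (2 bytes per pixel, a common raw format for USB cameras) and the nominal 480 Mbit/s signalling rate of USB 2.0. Both are assumptions for illustration, not measurements from a particular camera.

```python
USB2_HIGH_SPEED_MBIT = 480  # nominal USB 2.0 signalling rate (assumed link)


def raw_bitrate_mbit(width: int, height: int, fps: int, bytes_per_pixel: int = 2) -> float:
    """Bitrate of an uncompressed stream in Mbit/s (YUY2 uses 2 bytes per pixel)."""
    return width * height * bytes_per_pixel * fps * 8 / 1_000_000


rate = raw_bitrate_mbit(1920, 1080, 30)
print(f"raw 1080p30: {rate:.0f} Mbit/s")
print(f"compression needed to fit USB 2.0: at least {rate / USB2_HIGH_SPEED_MBIT:.1f}x")
```

Raw 1080p at 30 FPS needs roughly twice the nominal USB 2.0 bandwidth, so the camera has to compress frames before sending them.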

How To emulate MJPEG camera in Linux
Learn how to create an emulated MJPEG USB camera in the dedicated article.

The software must process MJPEG efficiently to guarantee low latency and low CPU utilization. Nvidia DeepStream supports JPEG as a first-class codec in its pipelines and, more importantly, Nvidia Jetson has dedicated NVJPEG ASICs for high-performance JPEG processing, which makes the device ideal for use cases that involve encoding and decoding JPEGs. As of the recent DeepStream 6.4, hardware JPEG decoding is also supported on dGPUs (A100, A30, H100).

NVJPEG is a game-changer for Jetson devices because their CPUs are not as capable as x86 CPUs, so pipelines that rely on the CPU for JPEG processing suffer from constrained performance. In the following sections, we discuss how to access an MJPEG USB camera with the FFmpeg source adapter and produce low-latency MJPEG output with the module.

Let us begin by accessing MJPEG data. We will use the FFmpeg source adapter, a universal adapter that captures any video stream supported by FFmpeg. To access the USB camera, we will use the following Docker Compose service definition. The examples target Nvidia Jetson; the link to the corresponding compose file for dGPU is at the end of the article.

  usb-cam:
    image: ghcr.io/insight-platform/savant-adapters-gstreamer-l4t:latest
    restart: unless-stopped
    volumes:
      - zmq_sockets:/tmp/zmq-sockets
    environment:
      - URI=/dev/video0
      - FFMPEG_PARAMS=input_format=mjpeg,video_size=1920x1080
      - ZMQ_ENDPOINT=pub+connect:ipc:///tmp/zmq-sockets/input-video.ipc
      - SOURCE_ID=video
    devices:
      - /dev/video0:/dev/video0
    entrypoint: /opt/savant/adapters/gst/sources/ffmpeg.sh
    depends_on:
      module:
        condition: service_healthy

The adapter does not decode the MJPEG data; it sends the compressed images to the module under source_id=video. With FFMPEG_PARAMS you can configure extra FFmpeg properties for the V4L2 (video4linux2) input device, such as the video resolution and FPS.
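The FFMPEG_PARAMS value is a comma-separated list of key=value pairs. The following sketch shows one way to build and inspect such a string. The helper names are hypothetical and are not part of Savant. The framerate key is a standard option of FFmpeg's video4linux2 input device, and we assume here that the adapter passes it through like the other two keys.

```python
from typing import Dict, Optional


def mjpeg_capture_params(width: int, height: int, fps: Optional[int] = None) -> str:
    """Build an FFMPEG_PARAMS value that requests MJPEG at the given size."""
    parts = ["input_format=mjpeg", f"video_size={width}x{height}"]
    if fps is not None:
        # Assumption: the adapter forwards this option to FFmpeg unchanged.
        parts.append(f"framerate={fps}")
    return ",".join(parts)


def parse_ffmpeg_params(value: str) -> Dict[str, str]:
    """Split a 'key=value,key=value' string into a dict (illustrative only)."""
    params: Dict[str, str] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, val = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        params[key.strip()] = val.strip()
    return params


print(mjpeg_capture_params(1920, 1080, fps=30))
print(parse_ffmpeg_params("input_format=mjpeg,video_size=1920x1080"))
```

The camera must actually support the requested combination of format, resolution, and frame rate, so it is worth checking its capabilities before changing these values.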

From the USB adapter, captured frames flow to the module, where they are decoded (NVJPEG) straight into GPU-allocated memory.

  module:
    image: ghcr.io/insight-platform/savant-deepstream-l4t:latest
    restart: unless-stopped
    volumes:
      - zmq_sockets:/tmp/zmq-sockets
      - ../../cache:/cache
      - ..:/opt/savant/samples
    command: samples/multiple_rtsp/demo.yml
    environment:
      - MODEL_PATH=/cache/models/peoplenet_detector
      - DOWNLOAD_PATH=/cache/downloads/peoplenet_detector
      - ZMQ_SRC_ENDPOINT=sub+bind:ipc:///tmp/zmq-sockets/input-video.ipc
      - ZMQ_SINK_ENDPOINT=pub+bind:ipc:///tmp/zmq-sockets/output-video.ipc
      - METRICS_FRAME_PERIOD=1000
      - CODEC=jpeg
    runtime: nvidia

Jetson devices have two NVJPEG ASICs, allowing processing of up to 384 MP/sec (45+ FPS at 4K) on Xavier NX and 1200 MP/sec (120 FPS at 4K) on Orin. With this much headroom, the module listed above also sets the output encoder to jpeg, which enables a compressed output stream even on Jetson Orin Nano, which does not have a hardware video encoder (NVENC).
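The FPS figures follow from dividing the NVJPEG throughput by the number of megapixels in a frame. Here is a minimal sketch of that arithmetic, using the throughput numbers quoted above. The results are theoretical ceilings and assume the whole throughput is spent on a single stream.

```python
def max_fps(throughput_mp_per_sec: float, width: int, height: int) -> float:
    """Upper bound on frames per second for a given JPEG throughput."""
    megapixels_per_frame = width * height / 1_000_000
    return throughput_mp_per_sec / megapixels_per_frame


devices = (("Xavier NX", 384), ("Orin", 1200))
resolutions = (("1080p", 1920, 1080), ("4K", 3840, 2160))

for device, throughput in devices:
    for name, width, height in resolutions:
        fps = max_fps(throughput, width, height)
        print(f"{device:9s} {name:5s} {fps:7.1f} FPS")
```

For 4K this gives about 46 FPS on Xavier NX and about 144 FPS on Orin, so the 120 FPS quoted for Orin leaves some margin below the theoretical limit.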

To sum up, our module consumes JPEGs and produces JPEGs in hardware, keeping JPEG decoding and encoding off the CPU.

Finally, our pipeline delivers the video to the Always-On RTSP (AO-RTSP) sink adapter, which produces an HLS/RTSP stream.

  always-on-sink:
    image: ghcr.io/insight-platform/savant-adapters-deepstream-l4t:latest
    restart: unless-stopped
    ports:
      - "554:554"    # RTSP
      - "1935:1935"  # RTMP
      - "888:888"    # HLS
      - "8889:8889"  # WebRTC
      - "13000:13000"  # Stream control API
    volumes:
      - zmq_sockets:/tmp/zmq-sockets
      - ../assets/stub_imgs:/stub_imgs
    environment:
      - ZMQ_ENDPOINT=sub+connect:ipc:///tmp/zmq-sockets/output-video.ipc
      - SOURCE_IDS=video
      - FRAMERATE=20/1
      - STUB_FILE_LOCATION=/stub_imgs/smpte100_1280x720.jpeg
      - DEV_MODE=True
      - MAX_RESOLUTION=3840x2160
      - METRICS_FRAME_PERIOD=1000
      - METRICS_TIME_PERIOD=10
    command: python -m adapters.ds.sinks.always_on_rtsp
    # runtime: nvidia

This adapter is configured for software encoding so that it works on Jetson Orin Nano, which cannot encode H.264 in hardware. However, if you are on NX or AGX, you can uncomment the last line (runtime: nvidia) to enable the hardware encoder.

When the nvidia runtime is enabled, the adapter also decodes the incoming JPEG stream efficiently with NVJPEG.
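Once the sink is running, the stream can be opened in a player or a browser. The sketch below builds candidate URLs from the ports published in the compose service above. The /stream/<source_id> path is our assumption about how the adapter names its streams and is not stated in this article, so check it against the sample's README. The helper name is hypothetical.

```python
from typing import Dict


def stream_urls(host: str, source_id: str) -> Dict[str, str]:
    """Candidate URLs for the AO-RTSP output (the path layout is an assumption)."""
    path = f"stream/{source_id}"
    return {
        "rtsp": f"rtsp://{host}:554/{path}",     # port 554 is published as RTSP
        "hls": f"http://{host}:888/{path}/",     # port 888 is published as HLS
        "webrtc": f"http://{host}:8889/{path}/",  # port 8889 is published as WebRTC
    }


for protocol, url in stream_urls("127.0.0.1", "video").items():
    print(f"{protocol:7s} {url}")
```

The source_id passed here must match the SOURCE_ID set on the camera adapter and listed in SOURCE_IDS on the sink, which is video in this sample.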

The full sample for Nvidia Jetson and Nvidia dGPU can be found in the Savant repository on GitHub.

We have a Discord server; please join the community to get help. To learn more about Savant, read our article: Ten reasons to consider Savant for your computer vision project.