
AVT SDK Multimedia Framework

The getting started guide showed how to build your first application with AVT SDK, but it did not explain how the application works. This tutorial introduces the basic concepts of the multimedia framework in AVT SDK.

Overview

AVT SDK provides a comprehensive cross-platform multimedia processing/streaming framework, capable of performing complex tasks such as video/audio capture, encoding, AI inference, streaming, etc.

For Qualcomm platforms, the framework is built on top of GStreamer and Qualcomm GStreamer plugins, simplifying the development process. With AVT SDK, developers can easily build multimedia applications without any knowledge of the underlying GStreamer framework and plugins, while still benefiting from the hardware acceleration provided by Qualcomm platforms.

What is GStreamer?

GStreamer is an open-source media streaming framework. Qualcomm also provides a set of GStreamer plugins, so that developers can use GStreamer to build multimedia applications that are well optimized for Qualcomm platforms. However, the power and flexibility of GStreamer also make it complex and hard to learn, and that's where AVT SDK comes in.

Filter Graph Architecture

In AVT SDK, a filter graph, similar to a GStreamer pipeline, is managed by the AvtGraph class. It contains all the multimedia processing components, such as video sources, encoders, AI inference, etc.

graph LR
subgraph AvtGraph
    direction LR

    subgraph "Source 1"
        Src1[Source]
        AI1[AI Inference]
    end

    subgraph "Source 2"
        Src2[Source]
    end

    Mixer[Mixer]
    Encoder[Encoder]
    Preview[Preview]
    Stream[Stream]
    Record[Record]

    Src1 --> AI1
    AI1 --> Mixer
    Src2 --> Mixer
    Mixer --> Encoder
    Encoder --> Stream
    Encoder --> Record
    Mixer --> Preview

    style Mixer stroke-dasharray: 5 5
end

A conceptual diagram of a filter graph in AVT SDK. All the components drawn with solid lines can be added or removed dynamically, even while the graph is running.

The AvtGraph class provides a unified interface for managing the filter graph: adding and removing nodes, setting parameters, and executing the graph. The diagram above shows the basic structure of the filter graph. Every solid-line component in the diagram can be added or removed dynamically, even while the graph is running. The only dashed-line component is the Mixer, which can be enabled only when the AvtGraph is constructed.
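The lifecycle rules above can be sketched with a small toy model. This is only an illustration under stated assumptions: `ToyGraph` and all of its methods are hypothetical names invented here, not the real AvtGraph API, and the nodes are plain strings rather than real sources or encoders. The sketch captures just two rules from the text: the mixer flag is fixed at construction, and other nodes can be added or removed while the graph is running.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for illustration only; NOT the real AvtGraph API.
class ToyGraph {
public:
    // The mixer can only be enabled here, at construction time.
    explicit ToyGraph(bool enableMixer) : mixerEnabled_(enableMixer) {}

    bool mixerEnabled() const { return mixerEnabled_; }
    bool running() const { return running_; }

    void start() { running_ = true; }
    void stop() { running_ = false; }

    // Nodes may be added whether or not the graph is running.
    // Returns false if a node with that name already exists.
    bool addNode(const std::string& name) { return nodes_.insert(name).second; }

    // Returns false if no node with that name exists.
    bool removeNode(const std::string& name) { return nodes_.erase(name) > 0; }

    std::size_t nodeCount() const { return nodes_.size(); }

private:
    const bool mixerEnabled_;  // const: cannot change after construction
    bool running_ = false;
    std::set<std::string> nodes_;
};

// Walks through the lifecycle and returns the number of nodes left.
inline std::size_t toyGraphDemo() {
    ToyGraph graph(/*enableMixer=*/true);
    graph.addNode("source1");
    graph.addNode("encoder");
    graph.start();

    // Add and remove a node while the graph is running.
    const bool added = graph.addNode("record");
    const bool removed = graph.removeNode("record");
    assert(added && removed);
    assert(graph.running() && graph.mixerEnabled());

    graph.stop();
    return graph.nodeCount();
}
```

Making the mixer flag a `const` member is one way to express "construction-time only" in the type itself; how the real SDK enforces this is not stated in this document.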

Core Components

Here's a brief description of each key component in the AVT SDK filter graph:

| Component | Description | Related Example |
| --- | --- | --- |
| Source | Captures video/audio data from devices like cameras/microphones | All the examples in Examples |
| AI Inference | Performs AI inference tasks like object detection or segmentation. It is part of the source component. | Infer Demo |
| Mixer | Combines multiple video/audio sources into a single output stream. Disabled by default. | Mixer Demo |
| Encoder | Encodes video/audio data into specific formats (H.264, H.265, etc.) | Record Demo, RTSP Demo, RTMP Demo, Infer Demo |
| Preview | Displays processed video on screen | Preview Demo, Mixer Demo, Infer Demo |
| Stream | Transmits processed media over network (RTSP, RTMP, etc.) | RTSP Demo, RTMP Demo |
| Record | Saves processed media to a file in specified format (MP4, MOV, etc.) | Record Demo, Infer Demo |

For detailed usage of each component, please refer to the related examples.

Unsupported Features on Qualcomm Platforms

The API of AvtGraph and the other classes exposes a wide range of features, but only a subset is currently supported on Qualcomm platforms. If you are not sure whether a feature is supported, check the return value of the function you call: it may return AvtResult::AVT_RESULT_UNSUPPORTED or AvtResult::AVT_RESULT_UNLOADED_MODULE when the feature is not supported.
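A minimal sketch of such a return-value check follows. It does not include the SDK headers, so it declares a stand-in `AvtResult` enum: only `AVT_RESULT_UNSUPPORTED` and `AVT_RESULT_UNLOADED_MODULE` come from this document, while `AVT_RESULT_OK`, the enum layout, and the functions `isFeatureUnavailable`, `describe`, and `enableHypotheticalFeature` are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Stand-in for the SDK's result type so the sketch compiles on its own.
// AVT_RESULT_OK and the enumerator order are assumptions, not SDK facts.
enum class AvtResult {
    AVT_RESULT_OK,
    AVT_RESULT_UNSUPPORTED,
    AVT_RESULT_UNLOADED_MODULE,
};

// True when the result indicates the feature is not available on this platform.
inline bool isFeatureUnavailable(AvtResult result) {
    return result == AvtResult::AVT_RESULT_UNSUPPORTED ||
           result == AvtResult::AVT_RESULT_UNLOADED_MODULE;
}

// Human-readable text for logging.
inline std::string describe(AvtResult result) {
    switch (result) {
        case AvtResult::AVT_RESULT_OK:
            return "ok";
        case AvtResult::AVT_RESULT_UNSUPPORTED:
            return "feature unsupported on this platform";
        case AvtResult::AVT_RESULT_UNLOADED_MODULE:
            return "required module is not loaded";
    }
    return "unknown result";
}

// Hypothetical SDK call, used only to exercise the check above.
inline AvtResult enableHypotheticalFeature(bool supportedOnPlatform) {
    return supportedOnPlatform ? AvtResult::AVT_RESULT_OK
                               : AvtResult::AVT_RESULT_UNSUPPORTED;
}
```

In application code, the same check would wrap each SDK call whose support is uncertain, so the application can fall back or skip the feature instead of assuming it succeeded.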

Further Reading