
How to Download Qualcomm AI Hub Models

Qualcomm AI Hub provides various AI models optimized for Qualcomm devices. This guide introduces two methods for downloading these models:

  • Through the Qualcomm AI Hub website.
  • Using the Python package qai-hub-models.

You'll learn how to access all the Qualcomm-provided models, including those with licensing restrictions like YOLOv8 and YOLOv11.

Note

This guide uses the QL601 product, which is based on the QCS6490 chipset, as an example. However, users can follow similar steps to download models for other Qualcomm chipsets supported by the AI Hub.

Using Qualcomm AI Hub

The following steps show how to download models via Qualcomm AI Hub.

  1. Visit the Qualcomm AI Hub website, choose the Models tab and then All models.

    qai_hub_model_tab

  2. Choose the chipset QCS6490 (Proxy) to display all the supported models.

    qai_hub_chipset_6490

    • All currently supported models for the QCS6490 (Proxy) chipset are quantized.
    • Certain models are temporarily incompatible with the Qualcomm GStreamer plugins for various reasons that Qualcomm has yet to resolve.
  3. Click the card of the model you would like to download. In this example, DeepLabV3-Plus-MobileNet-Quantized is the model to be downloaded.

    qai_hub_choose_dlv3

  4. The download page shows the model metrics and technical details. To download the model, click the Download Model button.

    qai_hub_dlv3_download

    Models for QCS6490 support two different runtimes: TFLite and QNN.

    Remember to select the correct device, i.e., Qualcomm QCS6490, when downloading a QNN model.

    qai_hub_dlv3_download_qnn_6490

    TFLite models are device-agnostic, so you don't need to choose a device first.

    qai_hub_dlv3_download_tflite

  5. Certain models cannot be downloaded directly from Qualcomm AI Hub due to licensing constraints, such as YOLOv8. As a result, there is no Download Model button available when you browse the YOLOv8 page. To download these models, you may have to install qai-hub-models instead.

    qai_hub_yolov8_download_no_button

Using qai-hub-models

As mentioned before, certain models cannot be downloaded from Qualcomm AI Hub. Instead, you have to install Qualcomm Python packages to download them.

Environment setup

It's highly recommended that you set up a clean environment using a tool such as Miniconda or virtualenv; the Python version should be between 3.9 and 3.13.
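
For example, a minimal setup with Miniconda might look like this (the environment name qai-hub is arbitrary; any Python version between 3.9 and 3.13 should work):

conda create -n qai-hub python=3.10
conda activate qai-hub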

pip update

Remember to update pip before going further; otherwise, you may unexpectedly install older versions of the qai-hub packages.

pip install --upgrade pip setuptools wheel

Installation and configuration of qai-hub

Visit Qualcomm AI Hub and log in.

Go to Settings and open Getting started, where the installation and configuration commands are displayed along with your API token.


Copy the following commands and run them to both install and configure qai-hub.

pip install qai-hub
qai-hub configure --api_token <YOUR_TOKEN>
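
As a quick sanity check that the token is configured correctly, you can query the devices available on Qualcomm AI Hub through the Python API (a minimal one-liner sketch):

python -c "import qai_hub as hub; print(hub.get_devices())"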

qai-hub-models installation

You can either install qai-hub-models directly or install the extra dependencies required by certain models (e.g., YOLOv11), which installs qai-hub-models automatically.

pip install qai-hub-models

or

pip install --no-cache-dir "qai-hub-models[yolov11-det]"
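
Either way, you can confirm afterwards that the package is installed:

pip show qai-hub-models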

Model export

You can download most models directly from Qualcomm AI Hub, but there are exceptions.

Certain models such as YOLOv11-Detection can't be downloaded from there due to licensing constraints, so you won't see the Download model button. Instead, click the Model Repository button and check the Example & Usage section, where you will find the installation command mentioned before.

qai_hub_yolov11_q

pip install "qai-hub-models[yolov11-det]"

After the dependency installation, run the following command:

python -m qai_hub_models.models.yolov11_det.export --quantize w8a8

Note

w8a8 refers to quantizing both the weights and activations of the network to INT8 precision, while w8a16 keeps INT8 weights but quantizes activations to INT16 precision. If a model is only available in a w8a16 version, it likely suffers significant accuracy degradation with w8a8. Currently, w8a16 models are only available in ONNX or QNN format.
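
Assuming the model offers a w8a16 variant (not all do; check the export script's --help output, shown later in this guide), the export command is analogous, e.g.:

python -m qai_hub_models.models.yolov11_det.export --target-runtime qnn --chipset qualcomm-qcs6490-proxy --quantize w8a16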

The whole process first clones the Ultralytics repository, downloads the yolov11n.pt model to /root/.qaihm/models, and saves the COCO dataset to /root/.qaihm/fiftyone on your host machine.

The downloaded .pt model is then uploaded to Qualcomm AI Hub, where compilation, quantization, profiling, and inference jobs run on the uploaded model. You can check the details of those jobs at https://app.aihub.qualcomm.com/jobs/, as shown in the picture below.

qaihub_jobs
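
You can also inspect a job programmatically with the qai-hub Python API; a minimal sketch, where <JOB_ID> is a placeholder for an ID copied from the Jobs page:

python -c "import qai_hub as hub; print(hub.get_job('<JOB_ID>'))"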

After all the tasks above have finished, a TensorFlow Lite (.tflite) model is downloaded to your current working directory. A folder named build is created, and you will find the quantized model there.
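
You can inspect the exported artifacts there; the exact layout depends on the model and the qai-hub-models version:

ls -R build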

If you want to export a QNN (.bin) model, you need to add additional arguments. Run the following command to check the details:

python -m qai_hub_models.models.yolov11_det.export --help

qaihub_export_help

We did not specify the target runtime in the CLI command shown earlier, so a .tflite model is compiled, quantized, and downloaded by default. To download a QNN (.bin) model, the --target-runtime and --chipset arguments are required.

You can see the supported chipsets in the help output mentioned above. Here we take QCS6490 as an example.

python -m qai_hub_models.models.yolov11_det.export --target-runtime qnn --chipset qualcomm-qcs6490-proxy --quantize w8a8

It follows the same procedure and downloads a .bin model to your current working directory.

More about QNN models

Tip

It's advised that you use .tflite models, as they are not tied to a specific QAIRT version.

If you fail to run inference with QNN models downloaded either from Qualcomm AI Hub or via qai-hub-models, check whether the QAIRT version on your board matches that of the .bin model being used.

Click the "See more metrics" button to check the QAIRT version.

qai_hub_see_more_metrics_button

Details including the QAIRT version for the model are displayed.

qai_hub_see_more_metrics

You can enter the following command to check the supported QAIRT versions:

qai-hub list-frameworks

You should see the supported QAIRT versions on Qualcomm AI Hub as shown below:

Framework  API Version  API Tags     Full Version
QAIRT      2.31         []           2.31.0.250130151446_114721
QAIRT      2.32         ['default']  2.32.6.250402152434_116405
QAIRT      2.33         ['latest']   2.33.2.250410134701_117956

The API Version column is the value you need to check against your board.

Qualcomm also provides an option that allows users to compile a QNN model using an earlier version of QAIRT. An example command to export a QNN model using QAIRT 2.31 is as follows:

python -m qai_hub_models.models.yolov11_det.export --target-runtime qnn --compile-options "--qairt_version 2.31" --profile-options "--qairt_version 2.31" --chipset qualcomm-qcs6490-proxy --quantize w8a8
Be sure to specify the QAIRT version in both --compile-options and --profile-options so that the whole export process completes successfully.