How to Download Qualcomm AI Hub Models
Qualcomm AI Hub provides various AI models optimized for Qualcomm devices. This guide introduces two methods for downloading these models:
- Through the Qualcomm AI Hub website.
- Using the Python package qai-hub-models.
You'll learn how to access all the Qualcomm-provided models, including those with licensing restrictions like YOLOv8 and YOLOv11.
Note
This guide uses the QL601 product, which is based on the QCS6490 chipset, as an example. However, users can follow similar steps to download models for other Qualcomm chipsets supported by the AI Hub.
Using Qualcomm AI Hub
The following steps show how to download models via Qualcomm AI Hub.
1. Visit the Qualcomm AI Hub website, choose the Model tab and then All models.
2. Choose the chipset QCS6490 (Proxy) to display all the supported models.
- All currently supported models for the QCS6490 (Proxy) chipset are quantized.
- Certain models are temporarily incompatible with Qualcomm GStreamer plugins for reasons Qualcomm has yet to resolve.
3. Click the card of the model you would like to download. In this example, DeepLabV3-Plus-MobileNet-Quantized is the model to be downloaded.
4. The download page shows the model metrics and technical details. To download the model, click the Download Model button.
Models for the QCS6490 support two different runtimes: TFLite and QNN.
- Remember to select the correct device, i.e., Qualcomm QCS6490, when downloading a QNN model.
- TFLite models are device-agnostic, so you don't need to choose a device first.
5. Certain models, such as YOLOv8, cannot be downloaded directly from Qualcomm AI Hub due to licensing constraints. As a result, there is no Download Model button on the YOLOv8 page. To download these models, you have to install qai-hub-models instead.
Using qai-hub-models
As mentioned before, certain models cannot be downloaded from Qualcomm AI Hub. Instead, you have to install Qualcomm Python packages to download them.
Environment setup
It's highly recommended that you set up a clean environment using a tool such as Miniconda or virtualenv, with a Python version between 3.9 and 3.13.
pip update
Remember to update pip before going further. Otherwise, you may unexpectedly install older versions of the qai-hub modules.
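As a minimal sketch, the update can be done from inside the activated environment like this:

```shell
# Upgrade pip to the latest version in the active environment
python3 -m pip install --upgrade pip
```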
Installation and configuration of qai-hub
Visit Qualcomm AI Hub and log in.
Go to Settings and check Getting started, where you should see the following screen:
Copy the content shown there and execute it to both install and configure qai-hub.
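The snippet on that page typically looks like the following; the token below is a placeholder, so substitute the API token from your own Settings page:

```shell
# Install the qai-hub client
pip install qai-hub
# Register your personal API token with the client
# (<YOUR_API_TOKEN> is a placeholder, not a real token)
qai-hub configure --api_token <YOUR_API_TOKEN>
```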
qai-hub-models installation
You can either install qai-hub-models by itself, or install the additional dependencies required for certain models (e.g., YOLOv11), which will also automatically install qai-hub-models.
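As a sketch, the two installation options look like this; the exact extras name for YOLOv11 is an assumption, so check the model's page on Qualcomm AI Hub for the exact command:

```shell
# Option 1: install only the base package
pip install qai-hub-models

# Option 2: install the extra dependencies for a specific model,
# which pulls in qai-hub-models automatically
# (the extras name below is an assumption; the model card shows the exact one)
pip install "qai-hub-models[yolov11-det]"
```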
Model export
You can download most models directly from Qualcomm AI Hub, but there are exceptions.
Certain models, such as YOLOv11-Detection, can't be downloaded from there due to licensing constraints, so you don't see the Download Model button. Instead, click the Model Repository button and check the Example & Usage section, where you will see the command mentioned before.
After the dependency installation, run the following command:
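Based on the full command shown later in this guide, the basic export invocation (with no target runtime specified, so it defaults to TFLite) would be:

```shell
# Export YOLOv11-Detection with weights and activations quantized to INT8;
# this uploads the model to Qualcomm AI Hub and requires a configured API token
python -m qai_hub_models.models.yolov11_det.export --quantize w8a8
```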
Note
w8a8 refers to quantizing both weights and activations of the network to INT8 precision, while w8a16 keeps weights in INT8 but quantizes activations to INT16 precision. If a model is only available in a w8a16 version, it likely suffers significant accuracy degradation with w8a8. Currently, w8a16 models are only available in ONNX or QNN format.
The whole process first clones the Ultralytics repository, downloads the yolov11n.pt model to /root/.qaihm/models, and saves the COCO dataset to /root/.qaihm/fiftyone on your device.
The downloaded .pt model is then uploaded to Qualcomm AI Hub, where compilation, quantization, profiling, and inference of the uploaded model are executed. You can check the details of those jobs at https://app.aihub.qualcomm.com/jobs/, as shown in the picture below.
After all the tasks above have finished, a TensorFlow Lite (.tflite) model will be downloaded to your current working directory. A folder named build will be created, and you can find the quantized model there.
If you want to export a QNN (.bin) model, you need to add additional arguments. Run the following command to check the details:
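The export module's built-in help should list all supported arguments, including the available chipsets:

```shell
# Print all arguments accepted by the YOLOv11-Detection export module,
# including the list of supported chipsets
python -m qai_hub_models.models.yolov11_det.export --help
```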
We did not specify the target runtime in the previously shown CLI command, so a .tflite model is compiled, quantized, and downloaded by default. To download a QNN (.bin) model, the --target-runtime and --chipset arguments are required.
You can see the supported chipsets from the help command mentioned above. Here we take QCS6490 for example.
```shell
python -m qai_hub_models.models.yolov11_det.export --target-runtime qnn --chipset qualcomm-qcs6490-proxy --quantize w8a8
```
It follows the same procedure and downloads a .bin model to your current working directory.
More about QNN models
Tip
It's advised that you use .tflite models, as they are not affected by the QAIRT version.
If you fail to run inference with QNN models downloaded either from Qualcomm AI Hub or qai-hub-models, check whether the QAIRT version on your board matches that of the .bin model being used.
Click the "See more metrics" button to check the QAIRT version.
Details including the QAIRT version for the model are displayed.
You can enter the following command to check the supported QAIRT versions:
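With the qai-hub client installed and configured, the supported QAIRT versions can be listed from the CLI; the subcommand name below is based on the qai-hub client and may differ between client versions, so treat it as an assumption:

```shell
# List the frameworks (and QAIRT versions) currently supported by Qualcomm AI Hub;
# requires a configured API token
qai-hub list-frameworks
```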
You should see the supported QAIRT versions on Qualcomm AI Hub as shown below:
| Framework | API Version | API Tags    | Full Version               |
|-----------|-------------|-------------|----------------------------|
| QAIRT     | 2.31        | []          | 2.31.0.250130151446_114721 |
| QAIRT     | 2.32        | ['default'] | 2.32.6.250402152434_116405 |
| QAIRT     | 2.33        | ['latest']  | 2.33.2.250410134701_117956 |
API Version is the column you need to check.
Qualcomm also provides an option that allows users to compile a QNN model using an earlier version of QAIRT. An example command to export a QNN model using QAIRT 2.31 is as follows:
```shell
python -m qai_hub_models.models.yolov11_det.export --target-runtime qnn --compile-options "--qairt_version 2.31" --profile-options "--qairt_version 2.31" --chipset qualcomm-qcs6490-proxy --quantize w8a8
```
Pass the desired QAIRT version to both --compile-options and --profile-options so that the whole export process can finish successfully.