
Technical Blog

Accelerate VLM Development with AI Fusion Kit


The first barrier in any multimodal LLM project is often not the model itself, but the hardware. Sourcing a powerful computing platform, a high-quality camera, and a sensitive microphone takes considerable time and effort. Worse, these components may not work well together, leading to a tangled web of driver issues, compatibility conflicts, and frustrating debugging sessions before your real work even begins.

The AI Fusion Kit is designed to eliminate these challenges entirely. It is a complete, out-of-the-box solution where every component works seamlessly together.

Building a Resilient Smart Traffic Monitoring System on Jetson: Recovery via ERMI Virtual Media and API

Edge AI devices are increasingly used in smart traffic applications such as license plate recognition, vehicle flow analysis, and anomaly detection. However, real-world deployment challenges such as configuration corruption, system failure, or file damage can severely degrade device functionality. These issues are further complicated by the physical inaccessibility of many deployed systems.

We outline a practical, developer-oriented solution that uses the ERMI (Edge Remote Management Interface) API and its virtual media capabilities to enable remote recovery and minimize system downtime, set against the broader backdrop of industry adoption of edge AI and remote management standards.
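
As a rough illustration of the idea, the sketch below mounts a recovery image as virtual media over a management REST API and reboots the device from it. The endpoint paths, payload fields, host address, and credentials are hypothetical placeholders for illustration only, not the actual ERMI API.

```python
# Hypothetical sketch of remote recovery via a management-controller REST API.
# Endpoint paths, payload fields, host, and credentials below are placeholders,
# not the real ERMI API, which is documented separately.
import requests

ERMI_HOST = "https://192.168.1.100"   # management interface of the deployed device (placeholder)
AUTH = ("admin", "password")          # placeholder credentials

def mount_recovery_image(image_url: str) -> None:
    """Ask the management controller to attach a recovery ISO as virtual media."""
    resp = requests.post(
        f"{ERMI_HOST}/api/virtual-media/mount",   # hypothetical endpoint
        json={"image": image_url, "write_protected": True},
        auth=AUTH,
        verify=False,    # self-signed certificates are common on embedded management controllers
        timeout=30,
    )
    resp.raise_for_status()

def reboot_to_virtual_media() -> None:
    """Trigger a reboot so the device boots from the mounted recovery image."""
    resp = requests.post(
        f"{ERMI_HOST}/api/system/reset",           # hypothetical endpoint
        json={"boot_source": "virtual-media"},
        auth=AUTH,
        verify=False,
        timeout=30,
    )
    resp.raise_for_status()

if __name__ == "__main__":
    mount_recovery_image("http://file-server.local/recovery/recovery.iso")
    reboot_to_virtual_media()
```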

AI Fusion Kit Quick Start Guide

AVerMedia AI Fusion Kit is an all-in-one solution for LLM/VLM developers. It consists of a powerful AI box PC, a 4K camera, and an AI speakerphone, allowing you to easily build your own multimodal AI applications. This guide will walk you through the steps to get started with the AI Fusion Kit.

Porting Tier IV Edge.Auto to AVerMedia D135

This blog shares our first hands-on experience with Tier IV’s Edge.Auto perception framework. After validating the perception modules in the CARLA simulator, we took the next step by deploying the same pipeline on the AVerMedia D135 embedded platform. This journey represents a practical attempt to bridge the gap between simulation and real-world deployment, helping us better understand how to run ROS 2-based perception logic on edge hardware.
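
To give a flavor of what that looks like, here is a minimal sketch of a ROS 2 (rclpy) node that subscribes to camera frames, the kind of entry point a perception pipeline hooks into on an edge device. The topic name and QoS depth are placeholders, not taken from Edge.Auto itself.

```python
# Minimal ROS 2 node that receives camera frames; real perception logic
# (detection, tracking, etc.) would run in the callback.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image

class PerceptionStub(Node):
    def __init__(self):
        super().__init__("perception_stub")
        # "/camera/image_raw" is a placeholder topic, not an Edge.Auto topic name.
        self.subscription = self.create_subscription(
            Image, "/camera/image_raw", self.on_image, 10
        )

    def on_image(self, msg: Image) -> None:
        # Replace this logging with actual perception processing.
        self.get_logger().info(f"received frame {msg.width}x{msg.height}")

def main():
    rclpy.init()
    node = PerceptionStub()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```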

Voice Kiosk on QL601: Building a Low-Power, Voice-First Edge Terminal

Imagine walking up to a kiosk in a busy fast-food restaurant. Instead of tapping through a complex menu on a touchscreen, you simply say, "I'd like a cheeseburger, no onions, with a side of fries and a vanilla milkshake." The kiosk confirms your order instantly, and you're ready to pay. This isn't science fiction; it's the future of customer interaction, and it's powered by AI running directly at the edge.

Time to First Token

Time to First Token (TTFT) refers to the latency between the moment a user presses the Enter key and the moment the first character appears on the screen. Excessive TTFT can greatly diminish the overall user experience.

TTFT is a crucial response-time indicator for an online interactive application powered by a large language model (LLM), as it reflects how quickly users see the first character from the model on a web page.

Here, we will explore two simple ways to measure the first-token latency of a language model.
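
As an illustration, the sketch below times the gap between sending a streaming request and receiving the first non-empty token, assuming an OpenAI-compatible endpoint served locally; the base URL, API key, and model name are placeholders.

```python
# Measure TTFT against an OpenAI-compatible streaming endpoint.
# base_url, api_key, and model are placeholders for illustration.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

start = time.perf_counter()
stream = client.chat.completions.create(
    model="my-local-model",   # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in stream:
    # The first chunk that carries actual text marks the first token.
    if chunk.choices and chunk.choices[0].delta.content:
        ttft = time.perf_counter() - start
        print(f"Time to first token: {ttft:.3f} s")
        break
```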

Installing Isaac ROS on Jetson

This article explains how to set up and install NVIDIA Isaac ROS on Jetson platforms, including Docker configuration, SSD integration, developer environment setup, and compatibility between different JetPack and Isaac ROS versions.