Sneak Peek: Exploring the Qt AI Inference API (Proof-of-Concept)
April 07, 2025 by Qt Group


At Qt Group’s 2025 Hackathon, our developers got experimental. They built a proof-of-concept for what we’re calling the Qt AI Inference API (PoC) – a tool that is set to simplify AI integration into QML and C++ applications.
With this blog, we’re offering you an exclusive look at what might be coming down the road. This early preview is all about exploring possibilities and gathering feedback from our valued developer community and industry partners.
➤ Discover the exciting possibilities of the Qt AI Inference API (PoC) at our Qt World Summit 2025 and the Qt Contributor Summit 2025!
The AI Revolution in Industrial Automation
A 2024 Cisco study shows that 41% of firms plan to prioritize AI investments in the next two years. This reflects a clear industry shift: AI is becoming a core component of industrial automation and is no longer just an experimental technology but rather a fundamental driver of productivity, efficiency, and innovation.
However, integrating AI into industrial environments isn’t without its challenges. From hardware constraints to fragmented AI frameworks, companies struggle to deploy AI models efficiently across diverse platforms. Moreover, there’s growing market interest in pioneering innovations in robotics and industrial applications, where multiple models must interact seamlessly in a pipeline.
AI in industrial automation isn’t just about incremental improvements. It’s about fundamentally increasing productivity and efficiency.
On the Edge of Possibilities
Among the various applications transforming the industry, edge AI is emerging as a game-changer, enabling real-time decision-making on devices with limited computing power.
Deploying AI at the edge, however, introduces a unique set of technical hurdles compared to traditional cloud-based AI. These challenges include:
- Limited computing power: Embedded devices lack the raw processing power of cloud servers, so optimizing AI models for these environments is critical.
- Energy consumption: Running AI models on battery-powered devices can lead to excessive power drain.
- Real-time processing: Industrial automation requires AI models to react instantly; latency is not an option.
- Cybersecurity risks: Transmitting sensitive industrial data to the cloud raises security concerns.
Overcoming these constraints is vital to unlocking the full potential of edge AI in industrial applications.
The cloud is not secure enough in some cases. Local AI processing offers better security, but it comes with hardware limitations.
The Current AI Landscape: A Wild West of Fragmentation
Developers must navigate an overwhelming number of frameworks, ranging from TensorFlow, PyTorch, and ONNX to vendor-specific AI toolkits (NVIDIA, Intel, Qualcomm, etc.), each with its own APIs, deployment models, SDKs, and hardware requirements. Keeping up with the rapidly changing APIs and technologies makes it even harder. This landscape demands not only extensive time and learning but often the specialized expertise of AI professionals, who can be hard to find.
For many application developers, the focus is not on the underlying AI frameworks or models; they need robust solutions for converting speech to text, generating text-to-speech, leveraging the latest language models, or recognizing objects in images and videos – all integrated comprehensively and seamlessly.
The situation is further complicated by the rapid pace of change, which pressures businesses to decide whether to continue using outdated technologies or to invest resources in keeping up with the fast-evolving AI landscape.
Your Unified AI Solution | Built with Qt
Imagine a scenario where integrating AI doesn’t mean writing thousands of lines of adapter code. The unified AI solution built with Qt seamlessly integrates multiple AI models into a cross-platform pipeline, leveraging the Qt Multimedia and Qt AI Inference APIs to simplify media processing and inter-process communication, while enabling intuitive application logic and UI behavior through a declarative QML interface.
With the Qt AI Inference API (PoC), developers can instead take advantage of:
- Key capabilities: Integrate ASR (Whisper), LLMs (Ollama), and TTS (Piper) within one unified framework.
- A single API for multiple AI frameworks: No need to write vendor-specific code.
- Seamless QML & C++ integration: AI models can be integrated with just a few lines of code (see the sketch below).
- Easily swappable AI models: Change models, such as switching to DeepSeek, by simply adjusting one parameter in a Qt QML application.
- Local or cloud deployment of AI models: Achieve streamlined deployment with minimal effort.
- Flexible AI pipelines: Combine multiple AI models/solutions (e.g., speech recognition + text-to-speech) effortlessly.
- A backend plugin system: Supports both proprietary and open-source AI solutions.
Developers no longer need to worry about whether an AI model requires a REST API, a Python binding, or a C++ library because the Qt AI Inference API (PoC) abstracts it all into a single, consistent interface.
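As a taste of what this could look like, here is a minimal QML sketch. The AiModel element and its properties (type, model, prompt, resultReady) are illustrative placeholder names, not the final PoC API; the point is that integrating a model, or swapping it for another, comes down to a few declarative lines.

```qml
import QtQuick
import QtQuick.Controls

// Illustrative sketch only: "AiModel" and its properties are placeholder
// names, not the actual PoC API.
Column {
    AiModel {
        id: llm
        type: AiModel.TextGeneration
        model: "llama3"   // swapping to e.g. DeepSeek would be a one-line change
        onResultReady: (text) => answer.text = text
    }

    Button {
        text: "Ask"
        onClicked: llm.prompt("What does this machine's error code mean?")
    }

    Label { id: answer; width: 300; wrapMode: Text.WordWrap }
}
```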
Image 1. Typical ways to create an AI pipeline
Image 1 Explanation
Usually more than one AI model is needed to implement a use case, and the models must be connected into a pipeline; here, a "speech->text->text->speech" pipeline for a speech-based AI chat application.
There are typically two ways to implement this: either access every AI model directly from the application and build the AI pipeline inside the application, or create an external, custom-made, use-case-specific AI pipeline that the application uses.
In the first case, on the left side of Image 1, the application accesses each AI model via some API, typically a REST API, or sometimes gRPC or WebSockets. The API is provided by a Python server running either locally on the same device or in the cloud. The Python server then uses the Python API of some AI framework to load the AI model and run it, an operation called "inference". The application also needs to integrate the different AI models, each with a different API, into an AI pipeline. This results in many thousands of lines of code that are both use-case and AI-framework dependent and that could be avoided.
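To make that overhead concrete, here is a sketch of the QML-side plumbing for just one model in the first case. The endpoint and JSON payload are made up for illustration; every model in the pipeline needs its own variant of this, plus error handling, streaming, and media conversion.

```qml
// Inside some QML item: hand-written REST plumbing for a single model.
// The endpoint and payload shape are hypothetical.
function askLlm(prompt, onResult) {
    const xhr = new XMLHttpRequest()
    xhr.open("POST", "http://127.0.0.1:8080/v1/generate")  // local Python server
    xhr.setRequestHeader("Content-Type", "application/json")
    xhr.onreadystatechange = function() {
        if (xhr.readyState === XMLHttpRequest.DONE && xhr.status === 200)
            onResult(JSON.parse(xhr.responseText).text)
    }
    xhr.send(JSON.stringify({ prompt: prompt }))
}
```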
In the second case, on the right side of Image 1, that code moves out of the application into an external, use-case-specific AI pipeline. This means the AI pipeline must be implemented separately for each use case, and the application now accesses fully use-case-specific private APIs of that pipeline. Any change in the use case requires changes to both the application and the AI pipeline. Again, there are thousands of lines of use-case- and AI-framework-specific code that could be avoided.
In both cases, any video, audio, or speech data requires a platform-dependent multimedia framework for recording and playback, meaning the application and/or the Python servers must be implemented differently for each target platform.
AI apps: from days to minutes
Image 2. How Qt can do the AI pipeline better
Image 2 Explanation
How can Qt do it better?
For media recording and playback, the Qt Framework already provides the cross-platform Qt Multimedia module and its Qt Multimedia API. Since Qt also supports Python, that API can be used both on the application side and in the Python servers. This means that both are platform-independent, i.e., usable on many target platforms.
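For example, recording a spoken question for the pipeline takes the same few lines of Qt Multimedia QML on every supported platform (a minimal sketch; format selection and error handling omitted):

```qml
import QtQuick
import QtMultimedia

// Cross-platform audio capture with Qt Multimedia: the same QML works
// on Windows, Linux, and embedded targets.
Item {
    CaptureSession {
        audioInput: AudioInput {}
        recorder: MediaRecorder {
            id: recorder
            outputLocation: "file:question.wav"
        }
    }
    // Drive the capture, e.g. from a push-to-talk button:
    //   recorder.record()  ...  recorder.stop()
}
```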
To provide a cross-platform, cross-AI-framework API for inference, there is now the new Qt AI Inference API (PoC), provided by a new "Qt Data Processing" module. Each AI framework has its own backend adaptation plugin that connects to the corresponding Python server to run the inference.
The Qt Framework already provides its own mechanism for inter-process communication, the Qt Remote Objects (QtRO) module. This enables direct communication between the backends and the Python servers without writing code for message encoding, decoding, sending, and receiving.
In this way, thousands of lines of code are saved, the complexity of the application is reduced, and both the client and the Python servers can be used on any supported platform and with any supported AI framework without modification. More importantly, the AI pipeline is created automatically in the background simply by declaring it in QML in the application. AI elements for models and pipelines can then be used seamlessly alongside UI elements, letting the AI elements drive both application logic and UI behavior. This enables the use of AI in more comprehensive ways than before.
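For instance, the speech->text->text->speech pipeline from Image 1 could be declared along these lines (again with hypothetical placeholder names; the framework would create the servers and the inter-process plumbing behind the scenes):

```qml
import QtQuick

// Hypothetical sketch: chaining models declaratively replaces the
// hand-written pipeline code from both cases in Image 1.
Item {
    AiModel { id: asr; type: AiModel.SpeechToText }                // e.g. Whisper
    AiModel { id: llm; type: AiModel.TextGeneration; input: asr }  // e.g. Ollama
    AiModel { id: tts; type: AiModel.TextToSpeech; input: llm }    // e.g. Piper
}
```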
Image 3. Qt AI Inference API (PoC) - Architecture illustration
Image 3 Explanation
The new Data Processing module provides the new Qt AI Inference API together with AI backend plugins that adapt it to different inference services and AI frameworks. An inference service provides a higher-level API that hides the supported AI frameworks.
Each embedded board vendor typically provides one or two AI frameworks for inference. For example, one can be a proprietary AI framework for generative AI, optimized for the hardware acceleration available on the board, and the other for simpler AI models.
The AI frameworks use the CPU, the GPU, or a dedicated AI processor to execute the inference operation with parallel processing techniques.
A multimedia framework is needed for image, video, and audio/speech processing and generation: it decodes and encodes the data and converts it into a form suitable for inference. It is also needed for accessing audio and video sources, e.g., microphones and cameras, and likewise for playing audio and video. The existing Qt Multimedia module provides a cross-platform API for these services, hiding the platform-specific multimedia framework.
Seamless AI Integration, Simplified
The prototype was built across Windows, Linux, and even embedded platforms like NVIDIA Jetson.
The result was clear: by abstracting away the complexity of diverse AI frameworks, developers saved significant time and could deploy AI models in just minutes.
This flexibility benefits not just AI experts but also companies building in-house AI models. They no longer need to master vendor-specific AI frameworks; the Qt AI Inference API (PoC) handles the complexity for them.
We're indirectly providing value even for companies that build their models in-house. Now they don’t have to learn every vendor’s AI system just to maximize hardware performance.
We Want to Hear from You
We’re excited to share the repositories for the Qt AI Inference API (PoC) and the example app that we are developing to significantly speed up the development of AI applications at the edge. We invite you to try the API and see how it can simplify your process and future-proof your projects.
Your feedback on these resources would also be greatly appreciated!
If you’re a developer or industry leader working with AI, or if you see the potential for this approach in your projects, don't hesitate to get in touch with us to explore the Qt AI Inference API (PoC) and unlock new possibilities for industrial automation.
Also learn more about Qt in Industrial Automation.
Comments
I just want to say thank god that you are embracing GitLab, even if it is only a tiny bit. The current gerrit/jira/cgit setup is awful.
Thanks. We hope it will make it easier to collaborate around the PoC.
Why the choice of multiprocessing and especially why Python?
Thank you for the good comment. You are absolutely right; there is no real reason to limit ourselves to Python and multiprocessing. You could also add backends for native AI frameworks like whisper.cpp, llama.cpp, and Piper's C++ API and do the inference in the application process. Since Python has become the de facto language for AI frameworks, it was used here in the illustrations, but nothing prevents using other languages. It was also an easy path to get the PoC done.
That makes sense. Good choice for the PoC indeed! I'll try it soon. Thanks for making this!
You are welcome. It would be really nice to get backend plugins for those native frameworks. Maybe you could consider contributing them?