About Starwit Technologies
Intel’s Vision for Computer Vision Applications
Starwit Technologies was founded by three former Volkswagen engineers to develop software for modernizing and optimizing city traffic. Their hypothesis is that cities can address many traffic problems if they have access to more precise real-time data. Achieving greater mobility with the same number of vehicles would be a further benefit. Because optical cameras are the most adaptable and widely used way of observing the real world, they are the primary sensor for gathering that data. But image processing remains a computationally demanding problem, particularly under tight constraints on budget and available expertise.
The vast majority of AI-powered computer vision algorithms for object detection and tracking in use today are designed to run fast on GPUs. However, GPU-equipped servers and industrial PCs pose problems for smart city deployments on several fronts, including limited city budgets, energy consumption, and heat dissipation in embedded scenarios.
Starwit’s software products will have a significant competitive advantage if they can perform the required image processing quickly enough on CPU-only computer vision systems.
Building the Solution
The Starwit Awareness Engine, Starwit’s core product, is optimized for the newest GPU hardware. A proving ground was needed to demonstrate that a pure-CPU approach to computer vision is feasible. Starwit therefore ran a series of measurements on the Intel Developer Cloud using Intel’s latest Xeon processor. With the help of Intel engineers, multiple optimizations for object detection with YOLOv8 were tested using Intel Extension for PyTorch (IPEX), powered by oneAPI. For comprehensive measurement results, see the deep dive.
The business results were very positive: Intel hardware and Intel AI Tools will power the first field deployments in the two pilot cities, Wolfsburg, Germany, and Carmel, Indiana, USA.
Run a test to determine the ideal batch size
The number of images processed in a single inference call (that is, in one object detection run) is one of the many knobs and levers that can be adjusted to achieve the best performance. The diagram below shows the results of seven benchmark runs.
Intel’s Vision for Inclusive Computer Vision
Each benchmark run detects objects in 9,000 individual images using YOLOv8 at the nano model size. The only difference between runs is the batch size, that is, the number of images passed as input to each inference call, which is shown next to each result. The findings indicate that, with all other factors held constant, a batch size of 8 produces the best results, reaching approximately 120 fps on 12 cores (or threads) of an Intel Xeon Platinum 8480+ CPU when using AMX through IPEX.
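The shape of such a batch-size sweep can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Starwit's or Intel's benchmark code: `sweep_batch_sizes` and `placeholder_infer` are hypothetical names, the stand-in function does trivial arithmetic instead of running YOLOv8, and the batch sizes listed are arbitrary examples rather than the seven values used in the article's runs.

```python
import time
from typing import Callable, Dict, List, Sequence


def sweep_batch_sizes(
    infer: Callable[[Sequence], object],
    images: Sequence,
    batch_sizes: Sequence[int],
) -> Dict[int, float]:
    """Return measured throughput (images per second) for each batch size."""
    results: Dict[int, float] = {}
    for batch_size in batch_sizes:
        if batch_size < 1:
            raise ValueError("batch size must be at least 1")
        processed = 0
        start = time.perf_counter()
        # Feed the same image set through the detector, batch_size images per call.
        for i in range(0, len(images), batch_size):
            batch = images[i:i + batch_size]
            infer(batch)
            processed += len(batch)
        elapsed = time.perf_counter() - start
        # Guard against a zero reading on very fast runs.
        results[batch_size] = processed / max(elapsed, 1e-9)
    return results


def placeholder_infer(batch: Sequence) -> List[int]:
    """Hypothetical stand-in for a detector call; does trivial work per image."""
    return [sum(range(100)) for _ in batch]


if __name__ == "__main__":
    frames = list(range(9000))  # stands in for the 9,000 benchmark images
    example_batch_sizes = [1, 2, 4, 8, 16, 32, 64]  # illustrative values only
    measured = sweep_batch_sizes(placeholder_infer, frames, example_batch_sizes)
    for size, fps in measured.items():
        print(f"batch size {size:>2}: {fps:,.0f} images/s")
```

To measure a real model, `placeholder_infer` would be replaced by a function that runs the detector on one batch. Everything else (same images, same hardware, same thread count) stays fixed so that batch size is the only variable, as in the runs described above.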
Product trials found that an inference speed of 15 frames per second is adequate for traffic analysis. At roughly 120 fps in total, 12 cores can therefore handle inference for eight cameras (120 / 15 = 8). These outcomes, along with other benchmark runs, are highly promising for AI-powered computer vision on Intel Xeon CPUs.
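The camera count follows from dividing the aggregate throughput by the frame rate each camera needs. A minimal sketch of that arithmetic, where `max_cameras` is a hypothetical helper name and the calculation assumes that throughput is shared evenly across cameras:

```python
import math


def max_cameras(total_fps: float, fps_per_camera: float) -> int:
    """Number of camera streams a given aggregate inference throughput can serve."""
    if total_fps < 0 or fps_per_camera <= 0:
        raise ValueError("total_fps must be >= 0 and fps_per_camera must be > 0")
    # Round down: a partially served camera does not meet its frame-rate target.
    return math.floor(total_fps / fps_per_camera)


if __name__ == "__main__":
    # Figures from the article: about 120 fps on 12 cores, 15 fps needed per camera.
    print(max_cameras(120, 15))  # prints 8
```

Rounding down is the conservative choice here: at 100 fps, for example, the same helper reports six cameras rather than 6.67.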
Real-time data helps cities solve traffic problems and improve vehicle mobility. Optical cameras are the main data collection tool because of their flexibility, but image processing is computationally intensive, especially on a limited budget.
By demonstrating the viability of a pure-CPU approach, Starwit Technologies offers smart city applications an alternative to the widespread reliance on GPU-optimized AI algorithms for object detection. The company worked with Intel to perform careful measurement runs on the Intel Developer Cloud using Intel’s newest Xeon processor and Intel software tools enabled by oneAPI. With the help of Intel engineers, Starwit Technologies used Intel Extension for PyTorch to test multiple optimizations for object detection with YOLOv8.
Encouraging results point to a promising business outlook. Benchmark runs supported the choice of Intel Xeon CPUs: the most effective batch size was 8, yielding about 120 frames per second on 12 cores of an Intel Xeon Platinum 8480+ CPU.
These findings have practical consequences: because 15 frames per second is sufficient for traffic analysis, 12 cores can handle inference for eight cameras at once. Starwit Technologies’ experience highlights the potential of Intel’s most recent Xeon CPUs to open AI-based computer vision to a wider audience and to pave the way for advances in smart city solutions.