The new benchmark for digital signage audience measurement is 3D computer vision

23 June 2020 by Lorenz Buser, Product Manager Smart Signage


The use and applications of 2D computer vision in the digital signage industry over the years have shown that its adopters know its worth. But as networks and advertisers begin to feel the pressure to compete smarter with better data, and with smart technology becoming more and more embedded in everyday life, the flat worldview and basic data that 2D computer vision offers are no longer enough. Today, what matters most for networks is adopting a more capable technology that improves the accuracy of actionable data for digital signage audience measurement in a physical space.

Seeing the world as it really is

3D computer vision capabilities

In the above video of a controlled test environment (much like one would see at the entrance of a trade fair), we see people and data points in real time through the “eyes” of a 3D visual sensor integrated with a digital screen. Advanced 3D computer vision, installed together with digital signage, goes beyond the known limitations of 2D computer vision. In the instance shown in the video, with a tracking area of 317 m², a diagonal field of view of 120°, and a detection range of up to 15 m in front of the screen (without quality loss), advanced 3D computer vision can:

  • Accurately track 25 people at the same time without quality loss
  • Identify the gender of each audience member with an accuracy rate of 95%
  • Estimate the age of each audience member with an average deviation of +/-4 years
  • Deliver real-time, accurate, and granular 3D data about the position in space and the walking path of each audience member with an average deviation of 10 cm
  • Detect audience attention time through head pose estimation with an average deviation of only 4.3°
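
To make the head pose bullet more concrete, here is a minimal sketch of how an estimated head direction could be turned into a per-frame "is this person looking at the screen" decision. It is an illustration under assumptions, not Advertima's implementation: the function name, the coordinate convention (metres, screen centre at the origin at 1.5 m height), and the 20° threshold are all hypothetical.

```python
import math

def is_looking_at_screen(head_pos, head_dir,
                         screen_pos=(0.0, 1.5, 0.0), max_angle_deg=20.0):
    """Return True if the head direction points at the screen centre.

    head_pos   -- (x, y, z) head position in metres (assumed convention)
    head_dir   -- (x, y, z) direction the head is facing, any length
    screen_pos -- assumed position of the screen centre
    max_angle_deg -- hypothetical tolerance; it should be clearly larger
                     than the head pose estimation error
    """
    # Vector from the head to the screen centre.
    to_screen = tuple(s - h for s, h in zip(screen_pos, head_pos))
    dot = sum(a * b for a, b in zip(head_dir, to_screen))
    norm = math.hypot(*head_dir) * math.hypot(*to_screen)
    if norm == 0.0:
        return False  # degenerate input: no direction or zero distance
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg

# A person 5 m in front of the screen, facing it directly.
print(is_looking_at_screen((0.0, 1.5, 5.0), (0.0, 0.0, -1.0)))
# The same person looking sideways.
print(is_looking_at_screen((0.0, 1.5, 5.0), (1.0, 0.0, 0.0)))
```

Because the check uses the person's 3D position as well as the head direction, the same head angle can count as attention for someone standing off to the side and not for someone directly in front, which a flat 2D view cannot distinguish as easily.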

Bigger data

Turning visual data into actionable data with machine learning

To deliver this wide set of metrics in high quality and in real time, a 3D visual sensor captures the scene in high resolution with depth information. The visual data is then analyzed and interpreted by a complex computer vision solution running on performant hardware. To obtain higher-quality data, the collection of machine learning models is optimized for high accuracy and trained on proprietary domain-specific data. In the case of Advertima, the full analysis runs at 15 frames per second to enable continuous tracking, and our software allows an installation to be calibrated so that a sensor understands its position relative to the physical space.
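
As an illustration of what calibration means in practice, the sketch below converts a point measured in the sensor's own coordinate frame into room coordinates, given the sensor's mounting height and downward tilt. The axis conventions, parameter names, and the assumption that the sensor is only tilted around its horizontal axis are ours for the sake of the example; a real calibration would typically estimate a full rotation and translation.

```python
import math

def sensor_to_room(point, mount_height_m, tilt_down_deg):
    """Map a sensor-frame point to room coordinates.

    Assumed conventions (hypothetical, for illustration only):
      sensor frame: x right, y up, z forward along the optical axis
      room frame:   x right, y height above the floor, z distance from
                    the screen along the floor
    The sensor sits at (0, mount_height_m, 0) and is tilted downward by
    tilt_down_deg around its x axis.
    """
    xs, ys, zs = point
    t = math.radians(tilt_down_deg)
    x = xs
    y = ys * math.cos(t) - zs * math.sin(t) + mount_height_m
    z = ys * math.sin(t) + zs * math.cos(t)
    return (x, y, z)

# Untilted sensor mounted at 2.5 m: a point 1 m below the optical axis
# and 4 m ahead ends up 1.5 m above the floor, 4 m from the screen.
print(sensor_to_room((1.0, -1.0, 4.0), mount_height_m=2.5, tilt_down_deg=0.0))
```

Once every detection is expressed in room coordinates, quantities such as distance to the screen or walking direction across the floor become independent of how the sensor happens to be mounted.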

These factors combined enable 3D computer vision to resolve ambiguities in tracking (e.g. when people in the field of view cross and overlap) while running a comprehensive analysis, and therefore to deliver various in-depth metrics with enhanced accuracy. Although this is not possible in every outdoor advertising setting, handling these conditions makes the technology applicable to a large number of in-store and DOOH advertising formats.
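
To show why depth helps when people cross, here is a deliberately simple sketch that links existing tracks to new detections by 3D distance. It uses greedy nearest-neighbour matching, which is a textbook baseline and not a description of Advertima's tracker; the function name, the track IDs, and the 0.5 m gate are assumptions made for the example.

```python
import math

def associate(tracks, detections, max_dist_m=0.5):
    """Greedily match tracks to detections by 3D distance.

    tracks     -- dict of track ID -> last known (x, y, z) in metres
    detections -- list of (x, y, z) positions from the current frame
    max_dist_m -- hypothetical gate: pairs farther apart are never matched
    Returns a dict of track ID -> index into detections.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(pos, det), tid, j)
        for tid, pos in tracks.items()
        for j, det in enumerate(detections)
    )
    matches, used = {}, set()
    for dist, tid, j in pairs:
        if dist > max_dist_m:
            break  # remaining pairs are even farther apart
        if tid in matches or j in used:
            continue  # each track and each detection is used once
        matches[tid] = j
        used.add(j)
    return matches

# Two people who nearly overlap in the image (similar x) but stand
# 3 m apart in depth (z) stay unambiguous.
tracks = {"a": (0.0, 0.0, 3.0), "b": (0.1, 0.0, 6.0)}
detections = [(0.05, 0.0, 6.1), (0.05, 0.0, 3.1)]
print(associate(tracks, detections))  # {'a': 1, 'b': 0}
```

In a 2D image the two people in this example would project to almost the same horizontal position, so a purely image-based matcher would have little to separate them; the depth coordinate makes the assignment trivial.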

Of course, it is reasonable to assume that these capabilities come at an astronomical price in the form of hardware costs. But thanks to various measures and the continuous efforts of our team, the hardware can be purchased at prices in the same range as professional digital signage hardware components.

Accuracy is king

Real-time smart targeting for higher audience engagement

We know that digital signage networks are typically used in crowded public spaces like streets, airports, train and bus stations, shopping centers, and busy retail stores, where 3D vision is a prerequisite for knowing the exact positions and walking paths of people in front of digital signage screens. This information is required, for example, to target a specific group of people (audience members who will most likely see the content according to their position, walking path, body pose and head pose) with the most relevant content.

To enable real-time targeting and increase campaign performance, you need to keep the complexity of engagement in the physical space in mind. It is crucial to prioritize audience members based on their likelihood of paying attention to the next piece of content; if the people chosen do not pay attention to the screen, targeting them based on demographics is pointless (especially in bigger crowds). Once correct audience member prioritization is ensured, you also need a sophisticated demographic estimation model that achieves a high accuracy rate for gender identification and surpasses humans’ ability to estimate people’s age. This level of precision with smart targeting is maintained even when 25 people pass by at the same time. It is these real-time content targeting capabilities, enabled by 3D vision technology and sophisticated machine learning algorithms, that were shown (in Advertima’s evaluation phase for a global retailer) to increase audience engagement threefold.
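
The prioritization step can be sketched as a scoring function that ranks audience members by how likely they are to see the next piece of content, before demographics are considered at all. Everything in this example is hypothetical: the fields, the weights, and the thresholds are placeholders chosen for readability, and a production system would more plausibly learn such a score from data.

```python
from dataclasses import dataclass

@dataclass
class AudienceMember:
    distance_m: float       # distance from the screen
    approach_speed: float   # m/s towards the screen (negative: walking away)
    head_angle_deg: float   # angle between head direction and the screen
    gender: str             # estimated, anonymous attribute
    age: int                # estimated, anonymous attribute

def attention_score(m, max_range_m=15.0, max_angle_deg=60.0):
    """Heuristic likelihood (0..1) that this person sees the next content."""
    proximity = max(0.0, 1.0 - m.distance_m / max_range_m)
    facing = max(0.0, 1.0 - m.head_angle_deg / max_angle_deg)
    approaching = 1.0 if m.approach_speed > 0 else 0.0
    # Hypothetical weights: head pose matters most in this sketch.
    return 0.4 * facing + 0.3 * proximity + 0.3 * approaching

def pick_target(audience, min_score=0.5):
    """Return the member most likely to pay attention, or None."""
    candidates = [m for m in audience if attention_score(m) >= min_score]
    return max(candidates, key=attention_score, default=None)

audience = [
    AudienceMember(3.0, 1.0, 10.0, "female", 30),   # approaching, facing screen
    AudienceMember(2.0, -1.0, 90.0, "male", 45),    # closer, but turned away
]
target = pick_target(audience)
print(target.gender, target.age)  # female 30
```

The closer person loses here because they are walking away with their head turned, which mirrors the point above: demographics only matter for people who are actually likely to look at the screen.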

In-depth audience analytics for post-campaign performance

Besides precisely measuring the number of people and identifying their demographics, 3D computer vision (with advanced machine learning models) also enables accurate measurement of an audience’s attention span (unique or accumulated) for an advertisement, reducing the gap between online and offline campaign analytics and providing deeper insight into the performance of advertising campaigns. Accuracy in the software is key here: it assures advertisers and networks that the data and audience analytics are dependable enough to base key marketing decisions on.
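
As a rough illustration of how per-frame observations could roll up into campaign metrics, the sketch below turns frame-by-frame attention flags into unique viewers and accumulated attention time. It assumes the 15 frames per second analysis rate mentioned earlier, anonymous per-session track IDs, and our own reading of "unique" and "accumulated"; the function and field names are hypothetical.

```python
from collections import defaultdict

FPS = 15  # analysis rate stated in the article

def summarize_attention(frames):
    """Aggregate per-frame attention into campaign-level metrics.

    frames -- iterable of sets; each set holds the anonymous track IDs
              that were looking at the screen in that frame
    """
    frames_per_id = defaultdict(int)
    for attending in frames:
        for track_id in attending:
            frames_per_id[track_id] += 1
    # Convert frame counts to seconds of attention per viewer.
    seconds_per_id = {tid: n / FPS for tid, n in frames_per_id.items()}
    return {
        "unique_viewers": len(seconds_per_id),
        "accumulated_seconds": sum(seconds_per_id.values()),
        "seconds_per_viewer": seconds_per_id,
    }

# Viewer 1 watches for 3 s; viewer 2 joins for the last second.
frames = [{1}] * 30 + [{1, 2}] * 15
print(summarize_attention(frames))
```

Metrics of this shape are comparable to online measures such as unique views and total view time, which is the gap between online and offline analytics the paragraph above refers to.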

Privacy by design

One of the questions we are asked most often about 3D computer vision technology in public spaces is how it respects privacy laws and complies with GDPR. Advertima’s solution was built from the ground up with privacy as a top priority. Only anonymous aggregate data is extracted from each scene, and captured images are deleted from the edge computer within 70 milliseconds. Our goal is to provide demographic and behavioral (not personal) information about audience groups. As a result, our 3D computer vision technology delivers accurate and granular data while respecting people’s privacy.

What’s key to remember about 3D computer vision is that the depth information 3D visual sensors obtain is only valuable when combined with software that’s able to accurately interpret that visual information. Sure, you can roughly see what’s happening in front of a digital screen, but if the complexity of audience member selection for engagement optimization in the physical space is not considered, or audience information is wrongly detected, any targeted, real-time marketing efforts thereafter fail, resulting in useless campaign reports, and wasted time and money.

For more information, contact Lorenz Buser, Product Manager Smart Signage at