The art of 3D perception
We perceive the world in 3D. We interact with 3D objects, and we can differentiate between various 3D shapes and sizes, instinctively understanding and identifying what they are.
3D perception is the art of building technology that can do the same: understand and identify objects in three dimensions.
The technology builds on the concept of a 2D image, harnessing multiple innovations together to reconstruct a 3D shape.
According to Fatih Porikli, Senior Director of Technology at Qualcomm Technologies, lighting plays an important part in 3D perception.
Porikli said: “3D perception facilitates reliable results in varying light conditions, provides confident cues for object and scene recognition, and allows accurate size, pose, and motion estimation.”
Light is perceived differently in 2D compared to 3D
How an object appears in 2D changes with various outside factors, such as whether it’s day or night, and with the structure and finish of the object – its colour, texture, and pattern. These factors all have a bearing on how an image is relayed. This does not happen with 3D objects: 3D structure is rigid, so the structure’s appearance doesn’t change. The result is that 3D perception can recognise objects more reliably.
Qualcomm utilises simple technology to create 3D perception: it achieves state-of-the-art (SOTA) accuracy with 90% less computation than other leaders in the 3D perception field.
What does 3D perception help to achieve?
The aim of 3D perception is to make life better for everyone, no matter the technology they use. It can be applied to everything from mobile phone and camera technology through to autonomous driving and immersive XR.
For example, to facilitate autonomous driving, 3D perception combines several technologies, including the use of cameras, LiDAR, and radar.
In a fully immersive XR experience, 3D perception uses:
- Six degrees-of-freedom (6DOF) motion estimation – the freedom of movement of a rigid body in three-dimensional space
- Obstacle avoidance
- Object placement
- Photorealistic rendering
- Hand pose estimation
- Interacting in virtual environments
The combination of these perception technologies allows users to move and interact in a virtual world without colliding with ‘real world’ objects. Hand pose estimation, for example, means that although users can’t see their hands in real life, they can see them in the immersive world using cameras and sensors.
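To make 6DOF concrete: a rigid-body pose can be described by three rotations (roll, pitch, yaw) and three translations. The sketch below is a minimal, hypothetical illustration – not Qualcomm’s code – of applying such a pose to a 3D point, as a tracking system might when mapping a hand or headset position into room coordinates. The helper names (`apply_pose`, `rot_x`, and so on) are invented for this example.

```python
import math

# Hypothetical 6DOF sketch: three rotations (roll, pitch, yaw)
# and three translations (tx, ty, tz). Illustrative only.

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(A, B):
    """3x3 matrix product, as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_pose(point, roll, pitch, yaw, tx, ty, tz):
    """Rotate a 3D point by the pose's orientation, then translate it."""
    R = matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))
    x, y, z = point
    rotated = [R[i][0] * x + R[i][1] * y + R[i][2] * z for i in range(3)]
    return [rotated[0] + tx, rotated[1] + ty, rotated[2] + tz]

# A point 1 m ahead of a user who has turned 90 degrees (yaw)
# and is standing at (2, 0, 0) in the room:
world = apply_pose([1.0, 0.0, 0.0], 0.0, 0.0, math.pi / 2, 2.0, 0.0, 0.0)
# world is approximately (2, 1, 0)
```

Real XR stacks typically represent the rotation as a quaternion rather than Euler angles, but the pose still boils down to the same six numbers.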
However, 3D perception technology is not without its challenges
The two broadest challenges facing the advancement of 3D perception are data challenges and implementation challenges.
The limited availability of high-quality 3D video datasets, plus the sparse versus volumetric nature of the 3D point cloud, are just a couple of the obstacles to overcome.
Much of the 3D data is sparse: sensors pick up the surface of an object, but not the space in or around it. This affects how the data can be represented, and not all 3D acquisition devices provide complete 3D models.
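To illustrate that sparsity: a depth sensor scanning one face of an object returns only surface points, so voxelising the scan leaves most of a dense volumetric grid empty. This is a hypothetical sketch, not part of any Qualcomm pipeline; `voxelise` is an invented helper.

```python
# Hypothetical sketch of why surface scans are sparse.

def voxelise(points, voxel_size):
    """Return the set of voxel indices a point cloud occupies.

    Storing only occupied voxels keeps the surfaces the sensor saw
    and skips the empty space - a sparse representation.
    """
    return {tuple(int(c / voxel_size) for c in p) for p in points}

# Points sampled across one 1 m x 1 m face of an object, as a depth
# sensor sees it: a surface only, with no interior points.
surface = [(x / 10 + 0.05, y / 10 + 0.05, 0.05)
           for x in range(10) for y in range(10)]

occupied = voxelise(surface, voxel_size=0.1)
dense_total = 10 * 10 * 10  # a dense 10 x 10 x 10 volumetric grid

# Only 100 of the 1,000 voxels are occupied: 90% of the grid is empty.
```

This is why sparse data structures (point lists, octrees) are usually preferred over dense voxel grids for raw sensor data.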
There is also a lack of availability of high-quality 3D video datasets compared to the quality of 2D images.
Other challenges involve the hardware and software platforms – such as memory, SDKs, and tools. Because 3D perception is newer than 2D perception, some of its tooling has not yet reached the same level of maturity as its 2D counterparts.
There is also the computational load to consider – both training and inference – plus data manipulation and viewpoint management.
“There is a long list of challenges in 3D; that’s what makes 3D perception very interesting and exciting. There are many technical problems to solve,” says Porikli.
So, how can 3D perception scale-up?
According to Porikli, Qualcomm’s 3D perception research is unique: it bridges novel AI techniques and real-world deployments through full-stack AI research. By building energy-efficient platforms, it also aims to make 3D perception universal.
“Our purposeful innovation has led to many 3D perception breakthroughs both in novel research and in proof-of-concept demonstrations on target devices, thanks to our full-stack optimisations using the Qualcomm AI Stack toolkits and SDKs. I’d like to highlight four key areas of our leading 3D perception research: depth estimation, object detection, pose estimation, and scene understanding.”
Looking ahead, the team hope to continue to push the boundaries of 3D perception. They expect breakthroughs to occur in neural radiance fields (NeRF), 3D imitation learning, neuro-SLAM (Simultaneous Localisation and Mapping), and 3D scene understanding in RF (Wi-Fi/5G).
Porikli notes that there is even more 3D innovation to come, so the 3D world is urged to watch this space.