We define Situation Awareness as the ability to simultaneously perform Perception, Localization and Comprehension of the environment.
This ability is fundamental to any Smart Machine, such as a self-driving car: it is a prerequisite to taking action in the real world.
The industry's initial approach to this key challenge has been to increase the capabilities of separate sensors such as LiDAR, Cameras, Radar and IMUs; combine them via Sensor Fusion; and process the resulting data using Machine Learning (ML) on powerful computing platforms.
However, the Scalability and Reliability levels required for mass-produced Smart Machines can't be achieved without solving the new challenges that this approach creates, including the problems of Calibration, Synchronization, Latency, Energy consumption, Cost and "explainability".
Outsight has followed a different path:
We have combined a novel broadband laser imager with an innovative processing approach, enabling an industry first: real-time Full Situation Awareness in a standalone device.
We generate a multi-physics data-stream that delivers not only 3D Range, Velocity and Color, but also identifies the material composition of objects at a distance, such as Skin, Cotton, Metal, Plastic, Asphalt, Ice, Snow, and Oil.
This unique sensing capability, combined with our SLAM on Chip(R) software, allowed us to create a 3D Semantic Camera that delivers an unprecedented level of actionable information without relying on Machine Learning, including for Object Detection & Classification.
This will not only accelerate the emergence of fully automated Smart Machines like L4-L5 Self-Driving Cars, Robots, and Autonomous Flying Taxis, but will also bring the safety benefits of Full Situation Awareness to today's human-operated machines like L1-L3 ADAS (Advanced Driving Assistance Systems), Construction/Mining equipment, Helicopters and many more.
Situation Awareness is the problem to be solved
Perception is the ability of Smart Machines (including Self-Driving Cars) to collect data about their environment. This is possible with the aid of input sensors such as Cameras, Radar and LiDAR.
As Smart Machines evolve, the situations they are expected to handle become increasingly complex, so enhancing sensor performance is of paramount importance. Better sensors are the first answer to this riddle (e.g. higher-resolution cameras, longer-range LiDAR).
However, focusing only on the perception challenges could lead to merely incremental innovation, to the detriment of quantum-leap solutions that could emerge when tackling the problem more holistically.
The actual capability that Smart Machines require is Situation Awareness.
Partially solving this problem can still be useful in several situations:
- Perception + Semantics: a Face ID security application is a good example, where the Camera provides the image and specific software analyses and verifies the person's identity. The notion of where the Camera and/or the person are (in either relative or absolute terms) is not necessary.
- Perception + Localization: the creation of precise maps where the 3D data from LiDAR is combined with the Localization data from sensors like IMUs (Inertial Measurement Unit) to deliver a point-cloud map is a good example of the utility of localized perception data.
- Localization + Semantics: when using a GPS-based tracker with the appropriate software and a map, in the context of fleet management for example, you can optimize the vehicle's navigation.
However, most Smart Machines require the three aspects to be considered simultaneously.
Think about obstacle detection in a moving machine like a Self-Driving Car:
- The car needs to actually see the object
- The localization is critical: not only distance, velocity, and movement (is it far/close? is it getting closer?) but also its actual position (is it on the road?) and orientation with respect to ours (is it in my lane?) need to be accurately identified.
- The car needs to comprehend and analyze the obstacle: is it a car, a pedestrian, or a tree?
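A toy sketch can make the interplay of these three aspects concrete. Every name, threshold and unit below is hypothetical, invented purely for illustration; this is not Outsight's decision logic:

```python
from dataclasses import dataclass

@dataclass
class Obstacle:
    x: float        # longitudinal distance ahead of the ego car (m)
    y: float        # lateral offset from the ego lane centre (m)
    vx: float       # longitudinal velocity relative to the ego car (m/s)
    label: str      # semantic class, e.g. "pedestrian", "car", "tree"

LANE_HALF_WIDTH = 1.75  # m, assumed lane half-width

def needs_braking(obs: Obstacle, ttc_threshold: float = 3.0) -> bool:
    """Combine localization (is it in my lane?), perception (range,
    velocity) and semantics (what is it?) into one decision."""
    in_my_lane = abs(obs.y) < LANE_HALF_WIDTH   # localization
    closing = obs.vx < 0                        # perception: getting closer
    if not (in_my_lane and closing):
        return False
    time_to_collision = obs.x / -obs.vx
    # Semantics: brake earlier for vulnerable road users.
    margin = 2.0 if obs.label == "pedestrian" else 1.0
    return time_to_collision < ttc_threshold * margin

# A pedestrian 20 m ahead, in-lane, closing at 5 m/s: TTC = 4 s
print(needs_braking(Obstacle(x=20.0, y=0.5, vx=-5.0, label="pedestrian")))
```

Remove any one of the three inputs and the decision degrades: without localization the car brakes for objects in other lanes, without semantics it treats a pedestrian like a parked car.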
Currently, these functions are separately performed by different components in the Car.
Many other applications would also benefit if these three aspects were simultaneously solved using a static computer perception system. For example, a Surveillance solution in an airport or a mall is much more efficient when the typical Perception+Semantics output provided by Smart Security cameras is combined with the notion of 3D Localization of objects.
So, how is the industry tackling the Situation Awareness problem?
The infancy of Situation Awareness
In the case of Self-Driving Cars, the current approach of the industry is based on the combination of a heterogeneous set of separate sensing technologies with a central processing unit based on Machine Learning, running on high-end computing platforms.
This is an interesting and even necessary first step during the Research & Development phase, and also during the first small-scale deployments, where cost and scalability are not main concerns.
However, from the perspective of a scalable solution, and in the context of Full Situation Awareness, this architecture has serious drawbacks at no fewer than four levels: the sensing devices, the localization system, the processing methods, and the system integration.
The challenges related to the Sensing Devices (Perception)
There is no perfect sensor: each sensor provides only a partial understanding of reality.
Each kind of sensor performs differently in key metrics that are important for safe and reliable Smart Machines.
The mainstream approach to solving complex situations like Automated Driving is to combine data from several individual, separate sensors. This is an emerging discipline called Sensor Fusion.
- Cameras provide high-resolution images but they are passive. They lack native depth information and depend greatly on lighting conditions, even if good progress is being made in this field.
- Radars can measure velocity with great precision but fare poorly when measuring the position of objects due to their low resolution and have trouble detecting static objects. This will remain a challenge even after significant innovation in the field of Imaging Radar.
- LiDARs can perceive in 3D, but the "performance vs. cost" trade-off of current technologies is far from making them suitable for mass production except for simple use cases. Current LiDAR technologies are either high-performing but unscalable (Fiber Laser LiDAR) or affordable but with low performance in terms of Range and Resolution (Diode Laser, VCSEL LiDAR, due to Eye Safety constraints among other reasons).
Our contrarian approach is that a single device, a 3D Semantic Camera, must deliver a comprehensive output that includes not only 3D Range, Velocity and Color, but also the material composition of objects, such as Skin, Cotton, Metal, Plastic, Asphalt, Ice, Snow, and Oil, i.e. a Full Reality perception.
We achieve this through a process that we call Fused Sensing, which blends multi-physics information into a single 3D image.
A single 3D Image blends in real-time not only Color and Range, but also the Full Velocity and the Material composition (Skin, Cotton, Ice, Snow, Plastic, Metal...)
The challenges on the Localization front
While pinpointing the position of the car and its surroundings may seem like a solved issue, it is still a challenge: GPS/GNSS precision is not enough, and it struggles in situations like urban canyons; IMUs are much more precise than GPS for short periods of time, before drift becomes too significant; and wheel encoders can't be trusted alone because of wheel slip, changing tire diameter and other factors.
For best results, these different sensors and methods can be combined into a more robust localization. They are often integrated with a previously-created map (even if that opens a new challenge: how to keep the map up to date and available).
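As a minimal illustration of why combining sensors helps, here is a 1-D complementary-filter sketch. The velocity bias, gain and time step are invented for illustration; real systems typically use Kalman-style filters in several dimensions:

```python
# Fuse dead-reckoning (IMU-style: precise short-term, but drifting)
# with absolute fixes (GPS-style: noisy, but drift-free).

def fuse(position: float, velocity: float, dt: float,
         gps_fix: float, gain: float = 0.1) -> float:
    """One filter step: integrate the motion model, then pull the
    estimate a fraction of the way toward the absolute fix."""
    predicted = position + velocity * dt             # dead-reckoning
    return predicted + gain * (gps_fix - predicted)  # absolute correction

# Simulate 10 s at dt = 0.1 s: true speed is 1 m/s, but the integrated
# velocity under-reads at 0.95 m/s (a systematic drift source).
estimate = 0.0
for step in range(1, 101):
    true_pos = step * 0.1
    estimate = fuse(estimate, velocity=0.95, dt=0.1, gps_fix=true_pos)

# With the correction the error stays bounded (a few cm here) instead
# of accumulating to 0.5 m as pure dead-reckoning would.
print(abs(estimate - 10.0))
```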
The article Localizing perception gets into more detail about these different techniques.
While we believe these approaches are useful and should be used together, our 3D Semantic Camera delivers an uncorrelated Localization output (Relative Ego-motion if there is no reference map, Absolute Localization if there is a reference) thanks to a SLAM-on-Chip algorithm (Simultaneous Localization and Mapping embedded on a Chip) that relies only on Perception data.
This 3D SLAM capability was originally developed by the company Dibotics, now part of Outsight.
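The core idea, that ego-motion can be deduced from perception data alone with no training involved, can be shown with a toy translation-only scan matcher in 2D (point-to-point ICP). Everything below is illustrative; Outsight's actual SLAM on Chip algorithm is far more sophisticated:

```python
import math

def nearest(p, cloud):
    """Closest point of `cloud` to point `p` (brute force)."""
    return min(cloud, key=lambda q: math.dist(p, q))

def estimate_translation(prev_scan, curr_scan, iters=10):
    """Iteratively match each current point to its nearest previous
    point and shift by the mean residual: the estimated translation
    is the sensor's ego-motion between the two scans."""
    tx, ty = 0.0, 0.0
    for _ in range(iters):
        moved = [(x + tx, y + ty) for x, y in curr_scan]
        pairs = [(p, nearest(p, prev_scan)) for p in moved]
        dx = sum(q[0] - p[0] for p, q in pairs) / len(pairs)
        dy = sum(q[1] - p[1] for p, q in pairs) / len(pairs)
        tx, ty = tx + dx, ty + dy
    return tx, ty

# Five landmarks seen from two poses: the sensor moved by (+0.4, -0.2),
# so the same points appear shifted by (-0.4, +0.2) in the second scan.
landmarks = [(0.0, 0.0), (2.0, 1.0), (4.0, 3.0), (1.0, 5.0), (3.0, 2.0)]
scan2 = [(x - 0.4, y + 0.2) for x, y in landmarks]
print(estimate_translation(landmarks, scan2))
```

No dataset, no labels: the motion estimate is deduced geometrically from the two point clouds themselves.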
The challenge related to the processing methods (Semantics)
Regardless of the kind of data collected from the sensors (visual information, LiDAR, Radar), the main approach of the industry is to make extensive use of Machine Learning / Neural Network methods to deliver meaningful information.
While Machine Learning is a wonderful tool for high abstraction-level problems, we do not think it should be considered the only or definitive solution for most problems related to Situation Awareness, because of several serious drawbacks that are best described in the article Using the Machine Learning Hammer wisely.
Among the drawbacks of relying only on Machine Learning are its black-box nature and its reliability.
The huge amount of information required for training and the power consumption are also major inconveniences, as Karen Hao pointed out in the MIT Technology Review.
The mainstream approach of the industry is to rely extensively on Machine Learning methods, which find patterns in training datasets to infer qualitative (statistical) conclusions about new samples. This is useful for high-level decisions.
Our contrarian approach to perform Situation Awareness is not to use Machine Learning.
No training, no datasets, no labelling: we instead rely on deductive inference to get to quantitative conclusions.
We feed the central decision-making processes with edge-computed, already-classified data (i.e. only meaningful and relevant information).
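As a deliberately simplified contrast with statistical inference, a deductive classifier can be written as explicit rules whose every output is traceable to a measurement. The bands, ratios and thresholds below are entirely invented for illustration and are not Outsight's method:

```python
# Rule-based (deductive) classification: no training, no datasets,
# no labelling. Each branch is a human-readable, explainable rule.

def classify_surface(reflectance_band_a: float,
                     reflectance_band_b: float) -> str:
    """Map two hypothetical reflectance measurements to a surface
    label via explicit thresholds; any output can be traced back
    to the measurement that produced it."""
    ratio = reflectance_band_a / reflectance_band_b
    if ratio > 2.0:
        return "ice"
    if ratio > 1.2:
        return "asphalt"
    return "unknown"

print(classify_surface(0.8, 0.3))  # ratio ~2.67
```

Unlike a trained network, the rule yields a quantitative, auditable answer rather than a statistical guess.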
The challenge related to Sensor Fusion and System integration
In today's Situation Awareness solutions, the individual sensors (Perception, Localization) communicate through a network and are connected to a central processing unit (Semantics).
Most Sensor Fusion methods rely on either of the two architectures highlighted below:
- Big Brain + Dumb Sensors (aka Low-level fusion): this architecture employs very basic sensors that send their raw data to a central processing unit. This central unit often runs Machine Learning processes, which work better with as much data as possible.
- Black-Box Smart Sensors (aka High-level fusion): edge computing capabilities, embedded in the sensor itself, allow it to send only conclusions (i.e. detected objects) through the network. This makes it much easier for the central processing unit to fuse information from different sensors and requires much less network bandwidth.
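A minimal sketch of the high-level-fusion idea: the central unit only has to associate and merge per-sensor conclusions, taking from each sensor what it does best. All structures and field names here are illustrative, not a real sensor API:

```python
import math

def associate_and_merge(camera_objs, radar_objs, max_dist=2.0):
    """Pair each camera detection with the closest radar detection;
    keep the class from the camera and the velocity from the radar."""
    fused = []
    for cam in camera_objs:
        best = min(radar_objs,
                   key=lambda r: math.dist(cam["pos"], r["pos"]))
        if math.dist(cam["pos"], best["pos"]) <= max_dist:
            fused.append({"pos": cam["pos"],
                          "label": cam["label"],          # camera strength
                          "velocity": best["velocity"]})  # radar strength
    return fused

camera = [{"pos": (10.0, 1.0), "label": "car"}]
radar = [{"pos": (10.5, 0.8), "velocity": -3.2},
         {"pos": (40.0, 2.0), "velocity": 0.0}]
print(associate_and_merge(camera, radar))
```

Even this tiny example exposes the architecture's weak points: the association silently depends on both sensors being calibrated to the same frame and time-synchronized.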
Regardless of the advantages of each approach, they share a common set of drawbacks and specific challenges when considering scalability and cost, among them Latency, Calibration, Synchronization, Installation/Wiring and Sensor Location.
The article Fused Sensing vs. Sensor Fusion describes this in more detail.
Single Device Situation Awareness
Our approach is to provide a Full Situation Awareness in a single device, where Perception, Localization, and Semantics are simultaneously blended thanks to our Fused Sensing embedded software engine.
Because our Perception is based on multi-physics data, enabled by a new broadband laser imager, we can get well beyond what a LiDAR or 2D Camera could do, by not only providing Range and Color but also the Full Velocity of objects and their Material composition ID.
We call this comprehensive way of perceiving the environment Full Reality perception.
In turn, this results in better safety standards in a Smart Machine: if your car can assess the real-time condition of the road and distinguish black ice from sand, it can make a better judgment about the timing of braking and about speed.
An embedded 3D SLAM on Chip performs Localization and Mapping in real-time, in which a dense 3D occupancy grid is populated with the low-level point-wise classification and with high-level Object Detection and Tracking.
The Semantic output of our device is an explainable and actionable stream of information that can feed the decision-making of higher-level Machine Learning-based processes in a much more effective way.
Situation Awareness is one of the key challenges for Smarter and Safer machines, including Cars.
The mainstream approach of the industry, based on separate Sensors, isolated Perception/Localization/Semantics processes, and isolated Acquisition vs. Processing modules, brings a new set of complex and unsolved challenges.
Based on the unique combination of sensing hardware and embedded software, we've taken a different path and built a 3D Semantic Camera, a scalable device that delivers long-range comprehensive and meaningful information.
We believe in making a smarter and safer world by empowering cars, drones and robots with superhuman capabilities to perceive and understand their surroundings.
We aim to accelerate the emergence of these Smart Machines, but also to enable a full set of new applications in markets like Security & Surveillance, Agriculture, Mining, Industry 4.0 and Smart Cities.
Outsight's 3D Semantic Camera provides seamless Situation Awareness in a standalone device as a key enabler.