
Technological trends brought about by 3D machine vision

Author: Machine Vision Knowledge Recommendation Officer

In science fiction, robots are either enemies of humanity or machines that turn rogue. Today's real-world robots are very different: machines now see and act on the world in place of human eyes, making life smarter wherever they are deployed.

Machine vision ingests images to simulate the visual function of the human eye, then extracts, analyzes, and processes the information. It has become an indispensable "third eye" in the building of smart cities, with applications ranging from individual needs such as food production management, agricultural planting control, and medical testing to public projects such as transportation and security. Among them, the egg collection line counter developed by our company, Shenzhen Langrui Zhike Industrial Co., Ltd., has been deployed in poultry farms, where it improves efficiency and reduces costs.

With the development of machine vision, 3D machine vision has found a huge opportunity in the automation industry, where it is mainly used for quality assurance and inspection. Market forecasts put the compound annual growth rate from 2017 to 2022 at 11.07%, with the global 3D machine vision market expected to reach $2.13 billion in 2022.

At recent machine vision summits, almost half of the papers relate to 3D, and frontier exploration is intense. So what new technology trends at the intersection of 3D imaging and machine vision are still hidden in the fog? Some of these capabilities may appear in your phone, VR headset, or drone next year; others may become the next startup craze favored by capital.

3D data perception of large scenes

3D machine vision covers many problems, including how to let agents understand 3D data and how to acquire 3D model data through machine vision solutions.

In the traditional sense, 3D data acquisition, or 3D perception, generally relies on multi-angle photography or depth sensors to collect 3D data. The limitation of these techniques is that the scenes they can capture cannot be very large.

However, as requirements for 3D data escalate, 3D perception of very large scenes is becoming a hot topic. The high-precision city maps used in autonomous driving, for example, can be regarded as stitched-together super-large 3D scenes, and much of the urban data analysis used in smart cities is likewise rooted in the collection of urban 3D scenes.


Machine vision is providing many new methods for 3D perception of very large scenes. Automated imaging methods such as visual SLAM process continuous image frames online to reconstruct huge 3D scenes in real time. Point cloud segmentation and semantic understanding of aerial survey data likewise help acquire urban 3D data quickly and cost-effectively.
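Point clouds from large-scene scans are typically too dense to process directly, so pipelines usually begin by thinning them out. As a minimal sketch of one common step, voxel-grid downsampling, here is a plain-Python version (the function name and data layout are illustrative, not from any particular library):

```python
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Reduce a point cloud by merging all points that fall in the same voxel.

    points: iterable of (x, y, z) tuples; voxel_size: edge length of each cubic voxel.
    """
    buckets = defaultdict(list)
    for p in points:
        # Map each point to the integer index of the voxel that contains it.
        key = tuple(int(c // voxel_size) for c in p)
        buckets[key].append(p)
    # Represent each occupied voxel by the centroid of its points.
    return [tuple(sum(c) / len(pts) for c in zip(*pts))
            for pts in buckets.values()]

cloud = [(0.01, 0.02, 0.0), (0.03, 0.01, 0.02), (1.5, 1.5, 1.5)]
print(voxel_downsample(cloud, voxel_size=0.1))  # two nearby points merge into one centroid
```

Production systems (e.g. point cloud libraries used for aerial survey data) implement the same idea with spatial indexing and streaming so that city-scale clouds fit in memory.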

In general, there are three main application directions for 3D data perception in today's super-large scenes, which are likely to become new investment and entrepreneurship hotspots in their respective technology fields:

1. 3D high-precision models of buildings, which are used in the fields of engineering supervision, intelligent design, logistics and smart cities.

2. The combination of high-precision map and 3D data perception is an important part of unmanned driving.

3. Indoor and outdoor integrated 3D modeling, which is of great help for smart home design, environmental monitoring, and VR/AR experience.

Mobile phones and 3D vision have entered the honeymoon period

At present, smartphones have become the largest carrier for advanced technologies such as VR/AR and computational vision, and face recognition and AR features are hotspots of current smartphone development. Computational vision is, in effect, a simulation of biological vision using computer technology, and depth recognition and multi-dimensional imaging are its core technologies.

Depth recognition is a key prerequisite of computational vision and enables machine recognition of biological vision, including Apple's currently popular face recognition technology, while multi-dimensional imaging covers 3D display, that is, the reproduction of 3D pictures and videos. Beyond restoring what the naked eye can see, depth recognition may, as these technologies continue to converge, display things the naked eye cannot see. For example, a future smartphone might combine depth recognition with artificial intelligence to estimate the intensity of ultraviolet light and remind us about sun protection and skin care.


Eye-tracking technology in AR/VR

With the advancement of technology, we can now use the human eye for iris recognition, which is more reliable and secure than facial or fingerprint recognition, and many mobile phone manufacturers have begun to develop and adopt iris recognition features.

In addition to iris recognition, there is also eye-tracking technology. Eye tracking is a technology that tracks the movement of the eye and uses that eye movement to enhance the experience of a product or service.

Eye-tracking technology had a brief vogue in the smartphone space, dating back to 2013, when the Galaxy S4 became the first phone equipped with eye tracking, used mainly for video playback. For example, if you are watching a video and someone behind you taps you on the shoulder, the video automatically pauses when you turn your head, because your eyes are no longer on the screen, and resumes when you look back; there is no need to tap pause and play by hand. Similarly, when you read a web page on the phone, the page scrolls automatically once your gaze reaches the bottom of the screen. In the same year, LG also launched the Optimus G Pro with eye tracking.
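The pause/resume behaviour described above is, at its core, a tiny state machine driven by a single signal: whether the gaze is on the screen. A hypothetical sketch (not any vendor's actual implementation):

```python
def playback_state(prev_state, gaze_on_screen):
    """Gaze-driven playback: pause when the viewer looks away, resume when they look back."""
    if prev_state == "playing" and not gaze_on_screen:
        return "paused"
    if prev_state == "paused" and gaze_on_screen:
        return "playing"
    return prev_state  # no change otherwise

# Simulate a viewer glancing away for two frames, then looking back.
state = "playing"
for gaze in [True, True, False, False, True]:
    state = playback_state(state, gaze)
    print(gaze, state)
```

A real eye tracker would feed this loop with a noisy gaze estimate per camera frame, so shipping implementations debounce the signal over several frames before switching state.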

Unfortunately, eye tracking never made waves in the mobile phone space, probably for two reasons. First, there was little user demand: the average smartphone screen is only about 5 inches, and at that size people prefer to interact directly with their fingers; most phone functions already assume touch input, leaving little room for gaze-driven play/pause. Second, the technology was not mature at the time: resolution was low and recognition was not accurate enough, which left some users with tired eyes.


3D vision helps the intelligent transformation of the robot industry

As an exciting new technology, 3D vision has already appeared in consumer products such as Microsoft's Kinect and Intel's RealSense. In recent years, continuous progress on the hardware side and ongoing optimization of algorithms and software have greatly improved the accuracy and practicality of 3D depth vision, giving "3D depth camera + gesture/face recognition" the foundation for large-scale adoption in mobile intelligent terminals. Apple, as a leading phone maker, has taken the lead in adopting 3D vision technology at scale, which will fully activate the 3D vision market and open a new era.

3D vision technology not only greatly improves recognition accuracy but, more importantly, opens up a broader application space for artificial intelligence. With the development of machine vision, artificial intelligence, and human-computer interaction, highly intelligent robots have begun to enter reality, and 3D vision has become a key enabler of the manufacturing industry's "intelligent" transformation.

Well-known depth camera technologies and applications include Intel's RealSense, Microsoft's Kinect, Apple's PrimeSense, and Google's Project Tango. Most of this research and development, however, comes from foreign companies; domestic companies and startup teams in computational vision are few, and the technical barriers remain high.

There are three main technical routes for depth cameras on the market: passive binocular vision, structured light, and ToF. Passive binocular vision uses two optical cameras and recovers depth by triangulation after matching the left and right stereo image pair. The matching algorithm is complex and difficult, demanding high computing performance from the processing chip, and the approach inherits the shortcomings of ordinary RGB cameras: it performs poorly in dim environments and on surfaces without obvious features.
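Once the left/right match is found, the triangulation step itself is simple: for a rectified stereo pair, depth is Z = f · B / d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity in pixels. A minimal sketch (illustrative values, not from any specific camera):

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth of a point from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity_px

# e.g. 700 px focal length, 12 cm baseline, 35 px disparity -> 2.4 m
print(round(stereo_depth(700, 0.12, 35), 2))  # prints 2.4
```

The formula also shows why the hard part is matching, not geometry: a one-pixel disparity error at long range translates into a large depth error, which is why textureless or dim scenes defeat the method.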

Structured light works by projecting a pseudo-random but fixed pattern of spots with an infrared laser; the spots appear at different positions in the captured image depending on their distance from the camera. The displacement between the spots in the captured image and a calibrated reference pattern is then measured, and the distance from the object to the camera is computed using parameters such as the camera position and sensor size.

Microsoft used ToF technology in the second-generation Kinect. ToF is short for Time of Flight. A time-of-flight camera continuously sends light pulses toward the target, receives the light reflected from the object with a sensor, and derives the distance from the round-trip time of each pulse. By contrast, structured light is more mature and cheaper than ToF, which makes it better suited to mobile devices such as phones.
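The ToF arithmetic is one line: the pulse travels to the target and back, so distance = c · t / 2. A quick sketch showing the timescales involved:

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def tof_distance(round_trip_s):
    """Distance from a time-of-flight measurement; the pulse covers the path twice."""
    return C * round_trip_s / 2

# A 20 ns round trip corresponds to roughly 3 m.
print(tof_distance(20e-9))
```

The numbers explain the engineering difficulty: resolving depth to a centimetre requires timing the pulse to tens of picoseconds, which is why ToF sensors are the more expensive route mentioned above.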

The depth camera is an essential module for all 3D vision devices: with it, a device can obtain the 3D size and depth of surrounding objects in real time and understand the world more comprehensively. Depth cameras provide basic technical support for indoor navigation and positioning, obstacle avoidance, motion capture, 3D scanning and modeling, and other applications, and have become a hot topic in industry research. Now that the iPhone X ships with a 3D depth camera, the field of machine vision is bound to advance rapidly, helping the robot industry achieve its "intelligent transformation".


A better depth sensor solution

Machine vision technology and 3D are also converging in the drone field.

When drones carry out surveying and aerial photography today, they must also be able to understand space; otherwise a missed shot is a small problem, and crashing into a wall is a big one. This ability comes mainly from the cameras and sensors that read the surrounding space.

As consumer drones keep upgrading, people's expectations for drone footage keep rising. Drones must capture images at greater distances, in more extreme weather, and during more complex maneuvers, and traditional sensing system solutions are no longer keeping up with user expectations.

Today's consumer-grade drones generally adopt one of two perception solutions: binocular vision, as in some DJI products, or structured light sensors, as in Microsoft's Kinect. Both mainstream solutions have limitations. Their sensing range is limited, making long-distance operation difficult. Binocular vision fails in the dark, which is why drone night shooting has always been a weak spot, while structured light cannot cope with strong light, so drones also struggle at midday.

A better solution lies in combining sensors and smart cameras into a new sensing system that adapts to different weather and lighting conditions and can sense at long range.

Today, many machine vision algorithms are used to coordinate the work of different sensing devices, turning drones into "multi-eye drones", which is becoming a popular solution. Feeding the output of many drone sensors into machine vision algorithms may also improve trajectory capture, allowing drones to film the overall environment or accurately track moving objects such as animals and vehicles in motion.
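One simple way to coordinate several sensors is to fuse their depth estimates, weighting each by how much it can be trusted and skipping sensors that currently fail (e.g. stereo in the dark). The sketch below uses inverse-variance weighting; the function and data layout are hypothetical illustrations of the idea, not any vendor's pipeline:

```python
def fuse_depths(readings):
    """Fuse depth estimates from several sensors by inverse-variance weighting.

    readings: list of (depth_m, variance) pairs; a depth of None marks a sensor
    with no valid reading (e.g. a stereo camera at night).
    Returns the fused depth in metres, or None if no sensor is valid.
    """
    valid = [(d, v) for d, v in readings if d is not None and v > 0]
    if not valid:
        return None
    weights = [1.0 / v for _, v in valid]  # trust low-variance sensors more
    return sum(w * d for (d, _), w in zip(valid, weights)) / sum(weights)

# Stereo fails in the dark (None); ToF and ultrasonic still contribute.
print(fuse_depths([(None, 0.1), (4.2, 0.05), (4.0, 0.2)]))  # weighted estimate near 4.16 m
```

Real multi-eye systems go further, fusing raw measurements in a filter (e.g. a Kalman filter) rather than point estimates, but the weighting principle is the same.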

The technology trends above may become the next hotspots in machine vision and graphics applications. The field may seem off the beaten path, but it can in fact influence the dynamics of today's tech market.

The game of teaching machines to see the three-dimensional world has just begun, and the story may only end on the day when machines and humans can gaze at each other from the same perspective.

Source: Salon of Machine Vision

Disclaimer: Part of the content comes from the Internet and is provided only for readers' learning and communication. The copyright of the article belongs to the original author. If anything is amiss, please contact us for removal.