The technology behind 3D gesture recognition

As touch screen technology has spread, users have adapted to it and grown comfortable interacting with machines. Human-computer interaction has now moved to a higher level and entered the era of gesture recognition, although getting there has not been easy. Gesture recognition is already available in the entertainment and gaming markets, but how will this technology affect our daily lives? Imagine that someone sitting on a couch can control the lights and TV with a single wave, or that a car can automatically detect whether pedestrians are nearby. As gesture recognition continues to advance human-computer interaction, these and other functions will soon be realized. Gesture recognition has long been studied using 2D vision, but with the advent of 3D sensor technology its applications will become increasingly widespread and diverse.

Limitations of 2D vision

Computer vision technology has long aimed at intelligence comparable to that of humans in order to better understand scenes. A computer that cannot interpret the world around it cannot interact with people naturally. The main problems computers face in understanding their surroundings include segmentation, object representation, machine learning, and recognition. Because 2D scene representation is inherently limited, gesture recognition systems must draw on a variety of additional cues to obtain more useful information and better results. When the information of interest includes full-body tracking, a 2D representation struggles to deliver anything beyond basic gesture recognition, even when multiple cues are combined.

“z” (depth) innovation

The challenge in developing 3D vision and gesture recognition has always been acquiring the third coordinate, the z-axis coordinate. The human eye sees objects in 3D and naturally perceives the (x, y, z) axes, and the brain then represents them as a 3D image. One of the biggest obstacles keeping machines from achieving 3D vision is image analysis technology. There are currently three common solutions to the 3D acquisition problem, each with its own characteristics and specific uses: stereo vision, structured light, and time of flight (TOF). The 3D image output these technologies provide is what makes gesture recognition possible.

Stereo vision

The stereo vision system is probably the best-known 3D acquisition system. It uses two cameras to capture left and right images that are slightly offset from each other, much as the two human eyes are. By comparing the two images, the computer can compute a disparity image that records how far each object is displaced between the two views. The disparity image, or map, can be either color or grayscale depending on the needs of the particular system. Stereo vision systems are currently used in 3D movies, providing a low-cost and exciting entertainment experience.
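The relationship between displacement and depth can be sketched in a few lines. This is a minimal illustration that assumes a rectified camera pair and the standard pinhole relation Z = f × B / d; the function name and the rig values (700-pixel focal length, 6 cm baseline) are hypothetical and do not come from any particular product.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for one pixel of a rectified stereo pair.

    Assumes the pinhole relation Z = f * B / d, where d is the horizontal
    displacement (disparity) of the same point between left and right images.
    """
    if disparity_px <= 0:
        # No measurable displacement: treat the point as infinitely far away.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical rig: 700 px focal length, 6 cm between the two cameras.
    for disparity in (70.0, 35.0, 7.0):
        depth = depth_from_disparity(disparity, 700.0, 0.06)
        print(f"disparity {disparity:5.1f} px -> depth {depth:.2f} m")
```

Nearby objects shift more between the two views than distant ones, so a larger disparity maps to a smaller depth.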

Structured light

Structured light can be used to measure or scan 3D objects. In this type of system, a structured light pattern is projected across the entire object; the pattern can be created by laser interference or by projecting an image. Using a camera setup similar to that of a stereo vision system, the structured light system can obtain the 3D coordinates of the object. Alternatively, a single 2D camera can measure the displacement of each individual stripe, with the coordinates then obtained through software analysis. Whichever system is used, the coordinates can be used to create a digital 3D model of the object's shape.

Time of flight (TOF)

The time-of-flight (TOF) sensor is a relatively new type of depth sensing system. A TOF system is a form of light detection and ranging (LIDAR) system that likewise emits light pulses from a transmitter toward the object. The receiver determines the distance to the object, pixel by pixel, by calculating the time a light pulse takes to travel from the transmitter to the object and back to the receiver.

The TOF system is not a scanner, because it does not measure point by point. Instead, it acquires the entire scene at once and determines a 3D range image. 3D images can be created from the measured object coordinates and used for device control in robotics, manufacturing, medical technology, and digital photography.
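The distance calculation described above can be sketched as follows. This is a simplified illustration assuming an idealized pulsed system in which each pixel reports a round-trip time directly; the function names are hypothetical, and real sensors often derive the time indirectly (for example from phase shift).

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # speed of light in vacuum


def tof_distance_m(round_trip_s):
    """Distance to the object from the round-trip time of one light pulse.

    The pulse travels to the object and back, so the one-way distance
    is half of the total path length.
    """
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0


def tof_depth_image(round_trip_times):
    """Convert a 2D grid of per-pixel round-trip times into a range image."""
    return [[tof_distance_m(t) for t in row] for row in round_trip_times]


if __name__ == "__main__":
    # Hypothetical 2 x 2 sensor readout, in seconds.
    times = [[10e-9, 12e-9],
             [20e-9, 6.67e-9]]
    for row in tof_depth_image(times):
        print(["%.3f m" % d for d in row])
```

Because every pixel is converted independently, the whole range image is available from a single capture, which is the sense in which the system is not a scanner.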

The semiconductor devices required to implement a TOF system are available today. Current devices provide the processing performance, speed, and bandwidth that TOF systems require.

Comparison of 3D vision technology

Different 3D vision technologies suit different applications and markets. Figure 1 compares the 3D vision technologies and their relative advantages and disadvantages in terms of response time, software complexity, cost, and accuracy.

Stereo vision requires highly complex software to obtain high-precision 3D depth data, which is typically processed by a digital signal processor (DSP) or a multi-core scalar processor. Stereo vision systems can be small and inexpensive, which makes them a good choice for consumer devices such as mobile phones. However, their accuracy and response time are not as good as those of the other technologies, so they are not ideal for manufacturing systems that demand high precision, such as quality control systems.

Structured light is a good solution for 3D object scanning, including 3D computer-aided design (CAD) systems. The software complexity associated with these systems can be handled by hard-wired logic (such as ASICs and FPGAs), which entails high development and material costs. This computational complexity can also lead to slower response times. Structured light is superior to the other 3D vision technologies in achieving high precision at the micro level.

The TOF system balances performance and cost, making it ideal for device control in applications that require fast response times, such as manufacturing and consumer electronics. TOF system software is usually less complex, but these systems require expensive illumination components (LEDs, laser diodes) and high-speed interface components (fast ADCs, fast serial/parallel interfaces, fast PWM drivers), which increase material costs. Figure 1 shows a comparison of the three 3D sensor technologies.

How does “z” (depth) affect the human-machine interface?

With the addition of the "z" coordinate, displayed images become more natural and closer to what people actually see. A display can show things the way the human eye sees them in the surrounding environment. Adding this third dimension changes the types of displays and applications that can be used.

Display

Stereo display

Stereoscopic displays usually require the user to wear 3D glasses. The display presents a different image to each eye, which tricks the brain into perceiving a 3D image. This type of display is currently widely used in 3D TVs and 3D cinemas.

Multi-view display

Multi-view displays differ from stereoscopic displays in that they do not require special glasses. These displays project multiple images at the same time, each slightly displaced and angled appropriately, so that the user sees a different projected image of the same object from each viewing angle. These displays support holographic effects and will deliver a new 3D experience in the near future.

Detection and application

The ability to process and display "z" coordinates will enable new applications, including gaming, manufacturing control, security, interactive digital signage, telemedicine, automotive, and robot vision. Figure 2 shows some of the application areas supported by body skeleton and depth mapping sensing technology.

Human gesture recognition (consumer)

Human gesture recognition is a popular new technology that brings a new form of input to games, consumer devices, and mobile products. Users can interact with devices in an extremely natural and intuitive way, which helps drive product adoption. These gesture recognition products use 3D data at resolutions ranging from 160 x 120 pixels to 640 x 480 pixels and at 30 to 60 fps. Software modules such as raw-data-to-z-depth conversion, two-hand tracking, and full-body tracking require digital signal processors (DSPs) to process 3D data quickly and efficiently for real-time gaming and tracking.
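To give a feel for the processing load these figures imply, the following sketch computes the pixel and byte throughput at the two ends of the stated range. The 2 bytes per depth sample is an assumption made for illustration only; the source does not specify a sample size, and the function name is hypothetical.

```python
def depth_stream_rate(width, height, fps, bytes_per_pixel=2):
    """Return (pixels per second, bytes per second) for a depth stream.

    bytes_per_pixel defaults to 2 (a 16-bit depth sample), which is an
    assumption for illustration rather than a figure from the article.
    """
    pixels_per_second = width * height * fps
    return pixels_per_second, pixels_per_second * bytes_per_pixel


if __name__ == "__main__":
    low = depth_stream_rate(160, 120, 30)
    high = depth_stream_rate(640, 480, 60)
    print(low)   # (576000, 1152000)
    print(high)  # (18432000, 36864000)
```

Under these assumptions the high end of the range carries 32 times as many pixels per second as the low end, which is why real-time tracking places such weight on processor throughput.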

Industry

Most industrial 3D vision applications, such as industrial and manufacturing sensors, use imaging systems ranging from as little as 1 pixel up to 100k pixels. The 3D images can be analyzed using DSP technology to detect manufacturing defects or to select the right part from a set of components.

Interactive digital signage (precisely targeted marketing tools)

Every day we are bombarded with advertisements, whether we are watching TV, driving, or boarding at the airport. With interactive digital signage, companies can use precisely targeted marketing tools to deliver content tailored to each consumer. For example, when someone walks past a digital sign, an additional message may appear on it to acknowledge the customer. If the customer stops to read the information, the sign may interpret this as interest in the product and provide more targeted information. A microphone allows the sign to detect and identify key phrases so that it can refine the message further.

These interactive digital signage systems will require 3D sensors for full-body tracking, 2D sensors for facial recognition, and microphones for speech recognition. The software for these systems will run on more advanced DSPs and general-purpose processors (GPPs), enabling applications such as face recognition, full-body tracking, and Flash media players, as well as features such as MPEG4 video decoding.

Medical (fault-free virtual/remote care)

3D vision will bring unprecedented new applications to the medical field. A doctor will be able to hold a consultation without being in the same room as the patient. Remote virtual care, supported by medical robot vision systems built on high-precision 3D sensors, can ensure the highest quality medical care for every patient, no matter where they are.

Automotive (safety)

Automotive applications have recently made great strides in using 2D sensor technology to detect traffic signals, lanes, and obstacles. With the advent of 3D sensing, the “z” data from 3D sensors will greatly improve the reliability of scene analysis. With a 3D vision system, cars gain new ways to prevent accidents, both day and night. Using 3D sensors, a vehicle can reliably detect and interpret its surroundings and determine whether an object poses a safety threat to the vehicle and its passengers. These systems require hardware and software that support 3D vision, along with intensive DSP and GPP processing performance, to interpret 3D images quickly enough to avoid accidents.

Video conferencing

After years of development, video conferencing technology has evolved from choppy, frequently interrupted images to today's HD systems. Future enhanced video conferencing will take advantage of 3D sensors to deliver a more realistic and interactive experience. An enhanced video conferencing system combines integrated 2D sensors, 3D sensors, and microphones, and will interface with other enhanced systems for high-quality video processing, face recognition, 3D imaging, noise cancellation, content players (Flash, etc.), and other applications. Meeting this demand for intensive audio and video processing calls for DSPs with the best combination of performance and peripherals.

Technical processing steps

Many applications require both 2D and 3D camera systems to be fully implemented. Figure 3 shows the basic data path for these systems. Getting data from the sensor and then performing visual analysis is not as simple as the data path diagram makes it look. Specifically, TOF sensors require as much as 16 times the bandwidth of 2D sensors, which can create significant input/output (I/O) bottlenecks. Another bottleneck lies in converting raw 3D data into a 3D point cloud. Solving these problems with the right combination of hardware and software is critical to gesture recognition and to the successful application of 3D. Today the data path can be implemented with a combination of DSP/GPP processors, discrete analog components, and software libraries.
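The conversion from a depth image to a 3D point cloud can be sketched as a per-pixel back-projection. This minimal example assumes a pinhole camera model, depth values measured along the optical axis, and intrinsic parameters (fx, fy, cx, cy) that would normally come from calibration; the function name and all numeric values are hypothetical.

```python
def depth_to_point_cloud(depth_m, fx, fy, cx, cy):
    """Back-project a depth image (rows of z values in metres) into (x, y, z) points.

    Assumes a pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. Pixels with no valid depth (z <= 0)
    are skipped.
    """
    points = []
    for v, row in enumerate(depth_m):        # v: pixel row index
        for u, z in enumerate(row):          # u: pixel column index
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Hypothetical 2 x 3 depth image; 0.0 marks a pixel with no measurement.
    depth = [[1.0, 1.0, 0.0],
             [2.0, 2.0, 2.0]]
    for point in depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5):
        print(point)
```

Every pixel costs several multiplies and divides, so at the frame sizes and rates discussed earlier this step alone is a substantial share of the workload, which is consistent with the article identifying it as a bottleneck.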

3D vision embedded system challenges

Input challenge

As mentioned earlier, input bandwidth limitations pose a significant challenge for 3D vision embedded systems. In addition, the input interface is not standardized. Designers can choose among the various input options available for 2D sensors and general-purpose external memory interfaces, including serial and parallel interfaces. Until a standard input interface with optimal bandwidth emerges, designers can only work with the existing interfaces.

Two different processor architectures

The 3D depth mapping processing shown in Figure 3 can be divided into two categories: data-centric, vision-specific processing and application-level processing. Data-centric, vision-specific processing requires a processor architecture capable of fast single instruction, multiple data (SIMD) floating-point multiply and add operations, as well as fast search algorithms. A DSP is well suited to executing this kind of processing quickly and reliably. For application-level processing, a high-level operating system (OS) and protocol stack provide the set of features that any application needs.
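The multiply-and-add workload mentioned above has a very regular shape, which the following sketch illustrates with a sliding-window filter over one row of depth samples. Python does not execute this as SIMD; the example only shows the arithmetic pattern that SIMD and DSP hardware are designed to run many elements at a time. The function names and sample values are hypothetical.

```python
def multiply_accumulate(samples, weights):
    """Sum of element-wise products: the basic multiply-add kernel."""
    acc = 0.0
    for sample, weight in zip(samples, weights):
        acc += sample * weight
    return acc


def sliding_filter(samples, kernel):
    """Apply the kernel at every position where it fully overlaps the samples."""
    width = len(kernel)
    return [
        multiply_accumulate(samples[i:i + width], kernel)
        for i in range(len(samples) - width + 1)
    ]


if __name__ == "__main__":
    depth_row = [1.0, 2.0, 3.0, 4.0]   # hypothetical depth samples in metres
    smoothing = [0.5, 0.5]             # two-tap averaging kernel
    print(sliding_filter(depth_row, smoothing))  # [1.5, 2.5, 3.5]
```

Each output value is independent of the others, which is what allows a SIMD unit to compute several of them with a single instruction.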

Given these two sets of processor architecture requirements, a system-on-chip (SoC) that combines high-data-rate I/O with a GPP+DSP+SIMD processor is ideal for 3D vision processing, because it supports both the data processing and the application processing required.

Lack of standard middleware

Middleware for 3D vision processing is an integration of many different components from a variety of sources, including open source (such as OpenCV) and proprietary commercial sources. The commercial libraries primarily target body tracking, which is one specific 3D vision application. A standardized middleware interface covering all the different 3D vision applications has not yet been developed.

What comes after "z" (depth)?

No one doubts the appeal of 3D vision, and engineers are already looking ahead to future applications. So what technologies can we expect in the near future? Researchers are already developing a range of vision technologies for people and objects. Researchers around the world are using multipath light analysis to explore ways of seeing around corners or around objects. Transparency research will lead to systems that can see through objects and materials, while motion detection systems will lead to applications that look inside the human brain to test whether a person is lying.

The development of 3D vision and gesture recognition technology will open up endless possibilities. However, without the hardware and middleware needed to support these exciting new technologies, this research will have little practical significance. SoCs (systems-on-chip) with a GPP+DSP+SIMD architecture (general-purpose processor + digital signal processor + single instruction, multiple data) will provide the combination of processing performance, peripheral support, and bandwidth needed to enable these exciting technologies and applications.
