'Seeing' with High-speed Vision [UTokyo-IIS Bulletin Vol.12]

2023.09.04

Topics

State-of-the-art Technology Igniting Innovation for Industrial Robots, Driverless Cars

Industrial robots currently have limited applications. That's because they are often not capable of doing things that humans can easily do, such as handling soft objects and working in tandem with people. Associate Professor Yuji Yamakawa of UTokyo-IIS is determined to change all that, by applying "high-speed vision" to control robots and eventually detect pedestrians emerging from blind spots.

Enabling Robots to Handle Soft Objects

Yamakawa is developing high-speed intelligent systems to enable active interactions with robots. He is building sensor networks that revolve around high-speed vision, a system capable of capturing and analyzing real-world images at one of the fastest speeds in the world.

As a UTokyo graduate school student, Yamakawa became interested in enabling robots to handle soft objects, which at that time was considered difficult, by using state-of-the-art high-speed vision technology. The technology was the hallmark of the laboratory led by his mentor, Professor Masatoshi Ishikawa, who was a pioneering researcher in the field.

"I knew I wanted to conduct research on robots, but I had to choose a theme that truly intrigued me," Yamakawa recalled. "After careful consideration, I realized that leveraging high-speed vision technology to handle deformable and soft objects would be the most promising course to follow."

Even after obtaining his doctorate in 2011, Yamakawa's passion for controlling robots to handle soft objects remained undiminished. In his lab at UTokyo-IIS, he has conducted numerous research projects, including estimating the state of deformable linear objects such as wires and strings. Establishing such technology has a wide range of applications, such as routing electrical cables in manufacturing and using medical threads in surgery, which still rely heavily on human labor.

Controlling such robotic movements has proven daunting. Soft and flexible objects have infinitely many degrees of freedom in their movement, a problem sometimes compounded by occlusion (when one object hides part of another) during interactions between robots and objects.

To deal with such problems, Yamakawa uses various approaches, including creating models based on human or nonhuman movements. At times, he harnesses the power of deep learning to achieve desired outcomes, while using visual feedback to correct the robots' movements within milliseconds.

"It is essential to process the captured images to identify objects--whether it is a ball or a human being," Yamakawa said. "To gather information from those images within milliseconds, we use a simple yet efficient algorithm to ensure high-speed processing." A high-speed camera can capture around 1,000 images per second, which are instantly analyzed by a PC.
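At about 1,000 images per second, each frame leaves roughly 1 millisecond for processing. The sketch below illustrates the kind of lightweight operation that fits such a budget: thresholding a frame and taking the centroid of the bright pixels. It is an illustration only, not the algorithm used in Yamakawa's lab; the function name bright_centroid, the threshold value, and the synthetic frame are all assumptions.

```python
import numpy as np

FRAME_RATE_HZ = 1000                      # figure quoted in the article
FRAME_BUDGET_MS = 1000.0 / FRAME_RATE_HZ  # time available per frame: 1 ms


def bright_centroid(image, threshold):
    """Return the (row, col) centroid of pixels brighter than threshold, or None."""
    rows, cols = np.nonzero(image > threshold)
    if rows.size == 0:
        return None  # nothing bright enough in this frame
    return float(rows.mean()), float(cols.mean())


# Synthetic 8-bit frame (hypothetical data): dark background, one bright 3x3 blob.
frame = np.zeros((100, 100), dtype=np.uint8)
frame[40:43, 60:63] = 255
centroid = bright_centroid(frame, threshold=128)
```

The whole computation is one comparison and two averages per frame, which is why operations of this kind are attractive when the time budget is a single millisecond.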

Real-time Occlusion-robust Deformable Linear Object Tracking

Collaboration with Robots

Yamakawa also proposed using high-speed vision for a real-time human-robot collaborative (HRC) system based on visual feedback control, which was applied, for example, to a peg-in-hole task. The system's high-speed robot hand has three fingers and moves at speeds that far exceed human performance.

The peg-in-hole task worked as follows: A human moved a board measuring 22 centimeters in length and 10 centimeters in width, with retroreflective markers attached to its four corners. Even as the board's position and orientation changed, the robot hand was able to insert the peg into the hole, which had a radius of about 6.3 millimeters. This was made possible through image processing, which acquired image data at 1-millisecond intervals to measure the position and orientation of the board. The results were then sent to the real-time controller through a high-speed network so that the robot could precisely locate the board.
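To make the measurement step concrete, here is a minimal sketch of how a board's position and orientation could be recovered from four corner markers. It assumes a simplified planar (2D) case with marker positions already converted to centimeters, and it uses the principal axis of the marker positions to find the long side of the board. The function name board_pose_2d and the example values are hypothetical; the actual system's pose estimation is not described in the article.

```python
import numpy as np


def board_pose_2d(markers):
    """Estimate the centre and long-axis angle of a rectangular board.

    markers: four (x, y) marker positions, in any order.
    Returns (centre, angle) with the angle in radians in [0, pi).
    """
    pts = np.asarray(markers, dtype=float)
    centre = pts.mean(axis=0)                # board position
    centred = pts - centre
    cov = centred.T @ centred / len(pts)     # 2x2 scatter of the markers
    _, eigvecs = np.linalg.eigh(cov)         # eigenvalues come back ascending
    long_axis = eigvecs[:, -1]               # direction of largest spread
    angle = np.arctan2(long_axis[1], long_axis[0]) % np.pi
    return centre, angle


# Hypothetical example: a 22 cm x 10 cm board rotated 30 degrees, centred at (5, -3) cm.
half_l, half_w = 11.0, 5.0
corners = np.array([[half_l, half_w], [-half_l, half_w],
                    [-half_l, -half_w], [half_l, -half_w]])
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
observed = corners @ rot.T + np.array([5.0, -3.0])
centre, angle = board_pose_2d(observed)
```

Because the board is longer than it is wide, the principal axis is unambiguous up to a half turn, so the markers do not need to be labeled or ordered.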

"We took advantage of the system's high speed to let a robot play a leading role in collaborating with us, rather than the other way around. Previously, it was humans who went out of their way to collaborate with robots, which I found rather odd."

Dynamic Human-Robot Interaction -Realizations of collaborative motion and peg-in-hole-

Last Resort in Autonomous Driving

In recent years, Yamakawa has moved from the field of robotics into autonomous driving, believing his high-speed vision technology can be used for detecting pedestrians suddenly emerging in front of a car.

"Ordinary cameras mounted on vehicles are not fast enough to detect pedestrians coming in front of them," he said. "The use of deep learning has proven effective in predicting such situations. But the technology is based on past data. If something that the computer had never learned happens, it can't deal with the situation."

In a new approach proposed by Yamakawa, potentially dangerous blind spots are detected before the vehicle reaches them, using a monocular camera and an algorithm developed for this purpose. The system gathers depth information about the scene and uses it to gauge the possibility that pedestrians are hidden behind structures. During a demonstration, pedestrians were detected as soon as they appeared in the images, even though only half of their bodies were visible.
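One plausible way to turn depth information into blind-spot candidates is to look for abrupt depth jumps between neighboring pixels: a near structure next to a far background marks an edge that something could step out from behind. The sketch below shows that idea on a single image row. It is not the published algorithm; the depth values are assumed to be already estimated, and the function name occlusion_edges and the 3-meter threshold are hypothetical.

```python
import numpy as np


def occlusion_edges(depth_row, min_jump_m):
    """Return indices i where depth changes by more than min_jump_m between i and i+1."""
    jumps = np.abs(np.diff(np.asarray(depth_row, dtype=float)))
    return np.nonzero(jumps > min_jump_m)[0].tolist()


# Hypothetical depths in metres along one image row:
# a wall about 5 m away, then open road about 20 m away.
row = [5.0, 5.1, 5.0, 5.2, 20.0, 20.5, 21.0]
edges = occlusion_edges(row, min_jump_m=3.0)
```

In this example the only large jump lies between the fourth and fifth samples, where the wall ends, so that boundary would be the region to watch for an emerging pedestrian.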

"Our strength lies with speed, so it is best suited for detecting pedestrians coming out of blind spots," Yamakawa said. "I believe our system could be the last resort in terms of avoiding collisions."

High-Speed Recognition of Pedestrians out of Blind Spot with Monocular Vision

Aiming to Make the Fastest System

Yamakawa's main mission is to create the world's fastest vision system. "If robots can 'see' their working environments and cope with situations promptly, they can work continuously without stopping the conveyer belt during production, whereas the belt has to remain stopped until their procedures are finished under the current situation. This enhancement can greatly improve the productivity of a factory."

One challenge Yamakawa faces is that the high-speed camera produces dark images. He has added markers to the robot so that the system can work with dark images, but eventually aims to enable the system to see without markers. This can be achieved by combining deep learning and high-speed image processing, thus enabling robots to move faster and more intelligently.

Yamakawa believes that achieving this will expand the use of robots in various fields. "Faster and more intelligent robots will not only increase the production efficiency in the industrial fields but also ensure safety and improve the quality of life of humans by assuming more roles in household chores and elder care."

Short Dialogue

Associate Professor Tsukasa Mizutani (→ 'Seeing Through' Aging Infrastructure [UTokyo-IIS Bulletin Vol.12]):
I am amazed by how much we have in common in our approaches to research. For example, we both have to look selectively at the data we obtain, which is necessary for reducing the volume of real-time calculations when employing algorithms. So, it's essential to extract only the data that is necessary.

Associate Professor Yuji Yamakawa:
That's right. In the case of high-speed vision, we observe only some of the data we obtain. For example, even if we obtain an image with 1,000 by 1,000 pixels, we look at a section with 100 by 100 pixels. It is important to reduce the volume of calculations and find ways to cut corners.

Mizutani:
We're both focusing on the potential of automated vehicles in our research. That's another interesting thing we have in common.
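The data-reduction idea raised in the dialogue, looking at a 100 by 100 section of a 1,000 by 1,000 image, can be sketched as a region-of-interest (ROI) crop around the last known target position. This is a minimal illustration under assumed conditions: the function name crop_roi, the synthetic image, and the target location are hypothetical, and the window is assumed to be no larger than the image.

```python
import numpy as np


def crop_roi(image, centre, size):
    """Return a size x size window around centre, plus its top-left offset.

    The window is shifted as needed to stay fully inside the image.
    """
    height, width = image.shape[:2]
    top = min(max(centre[0] - size // 2, 0), height - size)
    left = min(max(centre[1] - size // 2, 0), width - size)
    return image[top:top + size, left:left + size], (top, left)


full = np.zeros((1000, 1000), dtype=np.uint8)
full[512, 498] = 255                      # hypothetical bright target
roi, (top, left) = crop_roi(full, centre=(500, 500), size=100)

# Find the brightest pixel inside the window, then map it back to full-image coordinates.
r, c = np.unravel_index(np.argmax(roi), roi.shape)
target = (top + int(r), left + int(c))
reduction = full.size // roi.size         # how many times fewer pixels are examined
```

Here the window covers 10,000 of the 1,000,000 pixels, so each frame requires examining 100 times fewer pixels, as long as the target stays inside the window between frames.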