Watch Atlas humanoid adapt to changing environment


Boston Dynamics published a new video highlighting how its new, electric Atlas humanoid performs tasks in the lab. You can watch the video above.

The first thing that hits me in the video is how Atlas showcases its real-time perception. The video shows Atlas actively registering frames of reference for the engine covers and all of the pick-and-place locations. The robot continually updates its understanding of the world to handle the parts effectively. When it picks something up, it evaluates the topology of the part – how to handle it and where to place it.


Atlas perceives the topology of the part held in its hand as it acquires the part from the shelf. | Credit: Boston Dynamics
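
To make that concrete, here is a minimal sketch, in Python, of the kind of bookkeeping a perception loop has to do: each camera frame refreshes the stored frame of reference for every part and pick/place location, so grasp planning always works from the latest estimate. The data structures and the stubbed detector are my own illustration and are not Boston Dynamics' code.

```python
# Minimal sketch of continuously refreshing object frames of reference.
# Everything here (Pose3D, detect_objects) is a hypothetical placeholder.
from dataclasses import dataclass

@dataclass
class Pose3D:
    position: tuple       # (x, y, z) in meters, in the robot's world frame
    orientation: tuple    # quaternion (w, x, y, z)

def detect_objects(camera_image) -> list[tuple[str, Pose3D]]:
    """Stub for a learned detector/pose estimator run on one camera frame."""
    return [("engine_cover_1", Pose3D((0.6, 0.1, 0.9), (1.0, 0.0, 0.0, 0.0)))]

world_model: dict[str, Pose3D] = {}

def update_world_model(camera_image) -> None:
    """Overwrite stale pose estimates with whatever the camera currently sees."""
    for name, pose in detect_objects(camera_image):
        world_model[name] = pose

def plan_pick(name: str) -> Pose3D:
    """Grasp planning always reads the most recent frame of reference."""
    return world_model[name]

update_world_model(camera_image=None)   # one perception cycle
print(plan_pick("engine_cover_1"))
```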

Then there is the moment at 1:14 in the demo when an engineer drops an engine cover on the floor. Atlas seems to hear the cover hit the floor. The humanoid then looks around, locates the part, figures out how to pick it up (again, evaluating its form), and places it with the necessary precision into the engine cover area.

The Robot Report reached out to Boston Dynamics to learn more about how Atlas knew a part was dropped onto the floor. We will update this article if we hear back.

“When the object is in view of the cameras, Atlas uses an object pose estimation model that uses a render-and-compare approach to estimate pose from monocular images,” Boston Dynamics wrote in a blog about the video. “The model is trained with large-scale synthetic data, and generalizes zero-shot to novel objects given a CAD model. When initialized with a 3D pose prior, the model iteratively refines it to minimize the discrepancy between the rendered CAD model and the captured camera image. Alternatively, the pose estimator can be initialized from a 2D region-of-interest prior (such as an object mask). Atlas then generates a batch of pose hypotheses that are fed to a scoring model, and the best fit hypothesis is subsequently refined. Atlas’s pose estimator works reliably on hundreds of factory assets which we have previously modeled and textured in-house.”
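
That description maps to a fairly simple loop. The Python sketch below is my paraphrase of the render-and-compare recipe, not Boston Dynamics' implementation: the renderer, learned scoring model, and refiner are stubbed out, and the 6-vector pose parameterization is an assumption.

```python
# Sketch of a render-and-compare pose estimator with stubbed components.
# Poses are assumed to be 6-vectors (translation + axis-angle); illustrative only.
import numpy as np

def render_cad(cad_model, pose, intrinsics):
    """Stub: render the CAD model at `pose` into a synthetic image."""
    return np.zeros((480, 640), dtype=np.float32)

def score(rendered, observed):
    """Stub for the learned scoring model: higher means better agreement."""
    return -float(np.mean((rendered - observed) ** 2))

def refine(pose, cad_model, observed, intrinsics, steps=10):
    """Stub: iteratively adjust `pose` to reduce render-vs-image discrepancy."""
    return pose

def estimate_pose(cad_model, observed, intrinsics, pose_prior, n_hypotheses=32):
    rng = np.random.default_rng(0)
    # 1. Sample a batch of pose hypotheses around the prior (3D pose or 2D ROI).
    hypotheses = [pose_prior + rng.normal(scale=0.02, size=6)
                  for _ in range(n_hypotheses)]
    # 2. Render each hypothesis and score it against the captured camera image.
    scores = [score(render_cad(cad_model, h, intrinsics), observed)
              for h in hypotheses]
    # 3. Refine the best-scoring hypothesis and return it.
    best = hypotheses[int(np.argmax(scores))]
    return refine(best, cad_model, observed, intrinsics)

pose = estimate_pose(cad_model=None, observed=np.zeros((480, 640), np.float32),
                     intrinsics=None, pose_prior=np.zeros(6))
```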

The video highlights Atlas’ ability to perceive its environment, adjust its model of that world, and still stick to its assigned task. It shows how Atlas can handle a chaotic environment, maintain its task objective, and make changes to its mission on the fly.


Atlas can scan the floor and identify a part on the floor that doesn’t belong there. | Credit: Boston Dynamics

I see, therefore I am

Robot vision guidance has been viable since the 1990s. Even then, robots could track items on moving conveyors and adjust local frames of reference for circuit board assembly based on fiducials. There is nothing surprising or novel about that state of the art in robot vision guidance.

What’s unique now for humanoids is the mobility of the robot. Any mobile manipulator must continuously update its world map. Modern robot vision guidance uses vision language models (VLMs) to understand the world through the camera’s eye.

Those older industrial robots were fixed in place and used 2D vision and complex calibration routines to map the camera’s field of view. What we’re seeing demonstrated with Atlas is a mobile, humanoid robot understanding its surroundings and continuing its task even as the environment changes around it. Modern robots have a 3D understanding of the world around them.
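
One way to see the difference: a fixed robot can bake the camera-to-workcell mapping into a one-time calibration, while a mobile robot has to re-chain transforms every time its own base pose changes. The frame names and numbers in this short Python sketch are made up for illustration.

```python
# Chaining homogeneous transforms so a camera-frame detection becomes a
# world-frame pose on a moving robot. All values are illustrative.
import numpy as np

def make_transform(rotation: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Assumed example values: the robot's base pose in the world (from odometry/SLAM),
# the camera's mounting pose on the robot, and a part detected in the camera frame.
T_world_base = make_transform(np.eye(3), np.array([2.0, 1.0, 0.0]))
T_base_camera = make_transform(np.eye(3), np.array([0.1, 0.0, 1.5]))
T_camera_part = make_transform(np.eye(3), np.array([0.0, 0.0, 0.8]))

# Every time the robot moves, T_world_base changes, so the part's world-frame
# pose has to be recomputed from fresh estimates rather than a fixed calibration.
T_world_part = T_world_base @ T_base_camera @ T_camera_part
print(T_world_part[:3, 3])   # part position in the world frame
```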

Boston Dynamics admits this demo is a mix of AI-based functions (like perception) and some procedural programming for managing the mission. The video is a good demonstration of how the software’s capabilities are evolving. For these systems to work in the real world, they must handle both subtle and macro changes to their operating environments.
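
A plausible way to picture that hybrid, assuming nothing about Boston Dynamics’ actual architecture, is procedural mission logic written as a small state machine that calls out to learned perception at each step. The states and flow below are hypothetical.

```python
# Hypothetical mission state machine wrapping AI-based perception (stubbed).
from enum import Enum, auto

class State(Enum):
    SCAN = auto()     # refresh the world model with learned perception
    PICK = auto()
    PLACE = auto()
    RECOVER = auto()  # e.g., a part unexpectedly found on the floor
    DONE = auto()

def run_mission(parts):
    state, queue = State.SCAN, list(parts)
    while state is not State.DONE:
        if state is State.SCAN:
            # In a real system, an AI perception call would update object poses here.
            state = State.PICK if queue else State.DONE
        elif state is State.PICK:
            print(f"picking {queue.pop(0)}")
            state = State.PLACE
        elif state is State.PLACE:
            print("placing part")
            state = State.SCAN
        elif state is State.RECOVER:
            # Procedural fallback: re-localize the dropped part, then resume scanning.
            state = State.SCAN

run_mission(["engine cover A", "engine cover B"])
```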

Making its way through the world

It’s fascinating to watch Atlas move. The movements, at times, seem a bit odd, but they are an excellent illustration of how the AI perceives the world and the choices it makes to move through it. We only get to witness a small slice of this decision-making in the video.

Boston Dynamics has previously published a video showing motion capture (mocap) based behaviors. The mocap video demonstrates the agility of the system and what it can do with smooth input. The jerkiness of this latest video, driven by AI decision making and control, is a long way from those uncanny-valley-adjacent mocap demonstrations. We also featured Boston Dynamics CTO Aaron Saunders as a keynote presenter at the 2025 Robotics Summit and Expo in Boston.

There remains a lot of real-time processing for Atlas to comprehend its world. In the video, we see the robot stop to process the environment before it makes a decision and continues. I’m confident this will only get faster over time as the code evolves and the AI models become better in their comprehension and adaptability. I think that’s where the race is now: developing the AI-based software that allows these robots to adapt, understand their environment, and continuously learn from a variety of multimodal data.

 
