Nvidia Leverages GPT-4 for New Robot-Training AI Agent
Nvidia has introduced a new AI training system called Eureka that leverages OpenAI’s GPT-4 large language model (LLM) to train robots to perform tasks faster than standard methods allow. The autonomous training setup teaches robots to make full use of their mechanical dexterity, in some cases going beyond what humans are capable of. Eureka was able to teach a robotic hand to flawlessly execute complex pen-spinning tricks that would challenge most people, marking the first time a robot hand had been trained to perform such tricks.
Generative AI Robot Training
Eureka creates reward algorithms for robots on its own, for tasks that range from the pen-spinning trick to juggling, using scissors, and opening and closing doors. The AI agent leverages reinforcement learning to automatically generate training programs that enable robots to master intricate real-world skills. In experiments, Eureka outperformed human-written training code over 80% of the time, improving robot success rates by more than 50%. The system uses generative AI to create novel reward functions, then optimizes robot learning through iterative simulation and feedback.
“Reinforcement learning has enabled impressive wins over the last decade, yet many challenges still exist, such as reward design, which remains a trial-and-error process,” Nvidia senior director of AI research Anima Anandkumar explained in a blog post. “Eureka is a first step toward developing new algorithms that integrate generative and reinforcement learning methods to solve hard tasks.”
The key innovation is Eureka’s ability to automatically generate optimized “reward” programs defining criteria for trial-and-error robotic learning. Eureka leverages AI to code reward algorithms that accelerate skill acquisition. This removes the need for human experts to manually formulate reinforcement learning goals. Eureka can simply ingest natural language descriptions of a desired task and handle programming robotic rewards from scratch.
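To make the idea of a “reward” program concrete, here is a minimal sketch of what one might look like for the pen-spinning task. It is an illustration only, not code produced by Eureka or published by Nvidia: the function name pen_spin_reward, its parameters, and the weighting are all hypothetical.

```python
import math


def pen_spin_reward(angular_velocity: float,
                    target_velocity: float,
                    drop_distance: float,
                    drop_penalty: float = 5.0) -> float:
    """Toy reward for one simulation step of a pen-spinning task.

    All inputs are assumed quantities a simulator could report:
      angular_velocity: pen spin rate about the desired axis (rad/s)
      target_velocity:  spin rate we want the hand to sustain (rad/s)
      drop_distance:    how far the pen has drifted from the palm (m)
    """
    # Highest (1.0) when the pen spins at exactly the target rate,
    # decaying smoothly as the spin rate moves away from it.
    spin_term = math.exp(-abs(angular_velocity - target_velocity))
    # Penalize the pen drifting away from the hand.
    return spin_term - drop_penalty * drop_distance


if __name__ == "__main__":
    print(pen_spin_reward(3.0, 3.0, 0.0))   # on-target spin, pen held
    print(pen_spin_reward(1.0, 3.0, 0.05))  # slow spin, pen drifting
```

Writing and tuning functions like this by hand is the trial-and-error work that a system like Eureka is meant to automate.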
For example, an engineer could tell Eureka to “teach a robotic hand to spin a pen” without any further guidance. Eureka would then synthesize and refine training code to make a robot proficient at the pen trick through repeated simulation. The system can also integrate human feedback to refine its automatically generated rewards, which allows it to align training with a developer’s specific vision for a robotics application. Developers can access the algorithms through the company’s Isaac Gym physics simulation platform for robotics. Eureka itself runs on Nvidia’s Omniverse simulation engine and the GPT-4 large language model.
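The generate, simulate, and refine cycle described above can be sketched in a few lines. This is a toy under stated assumptions, not Eureka’s implementation: propose_candidates stands in for the GPT-4 call, train_and_score stands in for reinforcement learning inside a physics simulator, and a single number (scale) stands in for an entire reward program. All three names are hypothetical.

```python
import random
from typing import List, Tuple


def propose_candidates(best_scale: float, rng: random.Random,
                       n: int = 4) -> List[float]:
    """Stand-in for the LLM: propose variations on the best reward so far."""
    return [max(0.01, best_scale + rng.uniform(-0.5, 0.5)) for _ in range(n)]


def train_and_score(scale: float) -> float:
    """Stand-in for training in simulation; returns a success score in (0, 1].

    Toy assumption: the task is solved best when scale equals 2.0.
    """
    return 1.0 / (1.0 + (scale - 2.0) ** 2)


def reward_search(rounds: int = 5, seed: int = 0) -> Tuple[float, float]:
    """Iteratively propose reward candidates and keep the best performer."""
    rng = random.Random(seed)
    best_scale = 1.0                       # initial, untuned reward
    best_score = train_and_score(best_scale)
    for _ in range(rounds):
        # Each round conditions new proposals on the current best result,
        # mirroring how simulation feedback steers the next generation.
        for scale in propose_candidates(best_scale, rng):
            score = train_and_score(scale)
            if score > best_score:
                best_scale, best_score = scale, score
    return best_scale, best_score


if __name__ == "__main__":
    scale, score = reward_search()
    print(f"best scale={scale:.2f}, success score={score:.2f}")
```

Because a candidate replaces the current best only when it scores higher, the result can never be worse than the starting reward. In a real system the scoring step would be a full training run in simulation, which is by far the most expensive part of the loop.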
“Eureka is a unique combination of large language models and NVIDIA GPU-accelerated simulation technologies,” Nvidia senior research scientist Linxi “Jim” Fan said. “We believe that Eureka will enable dexterous robot control and provide a new way to produce physically realistic animations for artists.”
Eureka is part of a wave of recent AI research advancements from Nvidia that aim to make developing and working with robots easier. For instance, Nvidia recently began upgrading its Jetson system for industrial robotics. Nvidia’s goal in that case was to improve the design and operation of industrial systems by incorporating generative AI features without needing lots of task-specific training data. Nvidia is providing developer tools like pre-trained models, APIs, and microservices so engineers can quickly build AI into edge devices like robots, manufacturing systems, and logistics networks.
That’s one of several updates extending Nvidia’s generative AI work over the last year, including new partnerships with groups like Hugging Face to develop Training Cluster as a Service, a tool for streamlining enterprise LLM creation. The company also recently showcased how its Grace Hopper CPU+GPU Superchip can outperform any rival on the MLPerf industry benchmark tests. Nvidia suggests the MLPerf results make its new TensorRT-LLM software, which is designed to improve LLM efficiency and inference performance, more enticing. TensorRT-LLM leverages Nvidia’s GPUs and compilers to improve LLM speed and usability by a significant margin. The TensorRT-based platform minimizes coding requirements and offloads performance optimizations to the software. The company’s TensorRT deep learning compiler and other techniques allow the LLMs to run across multiple GPUs without any code changes.