Tech Giant Microsoft Tests ChatGPT for Robotics
Microsoft is so taken with large language models like ChatGPT that it has committed to a "multi-year, multi-billion dollar" investment in OpenAI. ChatGPT is a large language model (LLM) built on OpenAI's GPT (Generative Pre-trained Transformer) architecture and trained on text scraped from the web and other sources.
Paired with a chat interface, the model's ability to respond to questions semi-coherently, if not always accurately, earned it a spot in Microsoft's Bing search engine, and set tongues wagging that ad-festooned, SEO-gamed, payment-propped Google Search's dominance may finally be coming to an end. Microsoft, which has lately been busy putting out fires caused by Bing's AI mind meld, is now proposing ChatGPT as a way to help people direct robots in the physical world. "Our goal with this research is to see if ChatGPT can think beyond text and reason about the physical world to aid in robotics tasks," the company wrote in a blog post.
Redmond's researchers have released PromptCraft, described as a collaborative open-source platform for sharing best practices for wording LLM queries and commands to robots. It turns out you can't go straight to "Open the pod bay doors, please, Hal," if you're interacting with ChatGPT as a voice control channel for a drone. You must set the stage for the model, and important navigational parameters must be specified. With some practice, though, you may be able to converse with ChatGPT and have it direct a drone to find you a drink in the surrounding area. Or it may generate Python code that, if no errors occur, will allow the drone to do your bidding.
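To give a flavour of what that stage-setting looks like, here's a minimal sketch in Python. The drone functions named in the prompt (takeoff, fly_to, get_position, land) are our own invention for illustration, not PromptCraft's actual vocabulary:

    SYSTEM_PROMPT = """You are controlling a drone. You may ONLY call these Python functions:
      takeoff()           -- lift off to a safe hover
      fly_to(x, y, z)     -- fly to coordinates in metres
      get_position()      -- return the current (x, y, z)
      land()              -- land at the current position
    Axes: x is forward, y is right, z is up; the origin is the launch point.
    Respond with Python code only, no explanation."""

    def build_request(task: str) -> list[dict]:
        # Package the stage-setting system prompt and the user's task
        # in the message format used by chat-style LLM APIs.
        return [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": task},
        ]

    messages = build_request("Fly two metres forward, then come back and land.")

Without that system prompt, the model has no idea which functions exist or which way is up; with it, the reply can be checked against a known vocabulary before anything leaves the ground.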
In other words, the same type of incorrect code produced by GitHub Copilot could be fed directly to a robot via ChatGPT to assist it in completing a specific mission. In a research paper [PDF] titled "ChatGPT for Robotics: Design Principles and Model Abilities," Sai Vemprala, Rogerio Bonatti, Arthur Bucker, and Ashish Kapoor of Microsoft's Autonomous Systems and Robotics Research Group describe their attempt to direct robots via ChatGPT. The project defines a high-level API for ChatGPT and maps it onto lower-level robot functions. The researchers then wrote text prompts for ChatGPT that described task goals, listed the available functions, and set task constraints.
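The pattern, roughly, is the one sketched below: a small high-level API is advertised to the model, and each call fans out to lower-level commands the model never sees. The class and method names here are hypothetical, not lifted from the paper:

    class DroneAPI:
        # High-level functions advertised to ChatGPT in the prompt; each
        # one hides the lower-level calls needed to actually move the craft.

        def __init__(self, client):
            self._client = client  # hypothetical simulator or SDK handle

        def fly_to(self, x: float, y: float, z: float) -> None:
            # One call the model sees fans out to several it does not.
            self._client.set_mode("guided")
            self._client.move_to_waypoint(x, y, z)
            self._client.wait_until_reached()

        def get_position(self) -> tuple[float, float, float]:
            state = self._client.read_telemetry()
            return (state["x"], state["y"], state["z"])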
ChatGPT then responded by generating device-specific code to achieve whatever simulation goal had been specified. The idea is that someone using ChatGPT can bug-test robot directives until they work properly. Based on its ability to control a robot with a camera, ChatGPT appears to be capable of "spatiotemporal reasoning," allowing it to use visual sensors to catch a basketball. "We see that ChatGPT can use the provided API functions appropriately, reason about the appearance of the ball and call relevant OpenCV functions, and command the robot's velocity using a proportional controller," they explain in the paper.
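Stripped to its essentials, that loop might look something like the following sketch: OpenCV finds the ball in the camera frame, and a proportional controller turns the pixel error into a velocity command. The colour thresholds, the gain, and the send_velocity function are placeholders of ours, not values from the paper:

    import cv2
    import numpy as np

    KP = 0.005  # proportional gain: pixels of error -> m/s (tuning assumption)

    def ball_offset(frame: np.ndarray):
        # Return the ball's (x, y) offset from the image centre in pixels,
        # or None if no ball-coloured blob is visible.
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, (5, 100, 100), (15, 255, 255))  # orange-ish
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        (x, y), _radius = cv2.minEnclosingCircle(
            max(contours, key=cv2.contourArea))
        h, w = frame.shape[:2]
        return (x - w / 2, y - h / 2)

    def control_step(frame: np.ndarray, send_velocity) -> None:
        # One proportional-control step: command velocity toward the ball.
        offset = ball_offset(frame)
        if offset is not None:
            dx, dy = offset
            send_velocity(KP * dx, KP * dy)  # error scaled by gain, no more

A proportional controller is the simplest feedback scheme going: the further the ball drifts from the centre of the frame, the harder the robot is told to chase it.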
Such reasoning, or having a common sense model of the world, is said to make it much easier for robots to operate effectively in a physical environment. The autonomous vehicle industry isn't there yet, and it appears that ChatGPT isn't either. Recently, Zhisheng Tang and Mayank Kejriwal of the University of Southern California published a paper on arXiv challenging the ability of ChatGPT and DALL-E 2 to make reasonable inferences about the world. According to the paper, "A Pilot Evaluation of ChatGPT and DALL-E 2 on Decision Making and Spatial Reasoning," the two models reason inconsistently.
They found that, in ChatGPT's case, "despite demonstrating some level of rational decision-making, many of its decisions violate at least one of the axioms even under reasonable constructions of preferences, bets, and decision-making prompts." And, they claim, ChatGPT sometimes makes the right decision for the wrong reasons. Microsoft's researchers acknowledge that ChatGPT has limitations and caution that the model's output should not be applied to a robot without supervision. "We emphasize that these tools should not be given complete control of the robotics pipeline, particularly for safety-critical applications," the authors write in their paper. "Given LLMs' proclivity to generate incorrect responses, it is fairly important to ensure solution quality and code safety with human supervision before executing it on the robot."
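In practice, that supervision could be as simple as the gate sketched below, in which nothing the model generates runs until a human has read and approved it. This is our illustration of the idea, not code from the paper, and restricting exec this way is a convenience for the demo rather than a real security boundary:

    def execute_with_approval(generated_code: str, api_namespace: dict) -> bool:
        # Show the LLM-generated code to an operator; run it only on approval.
        print("--- ChatGPT proposed the following code ---")
        print(generated_code)
        answer = input("Run this on the robot? [y/N] ").strip().lower()
        if answer != "y":
            print("Rejected; nothing was executed.")
            return False
        # Expose only the whitelisted robot API to the generated code.
        exec(generated_code, {"__builtins__": {}}, dict(api_namespace))
        return True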