Physical AI Must Sense, Think, Act and Optimize
Physical AI systems comprise software and hardware that use machine learning to interact with the physical world. At its simplest, physical AI relies on four steps: sense, think, act and optimize. It collects data from the physical world, analyzes it, makes decisions based on it, takes action in the physical world and then evaluates the results to improve the process in the future.
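To make those four steps concrete, here is a minimal sketch of the loop in Python. The sensor, model and actuator objects are hypothetical placeholders for whatever hardware interfaces and machine learning components a real system would use, so this is an outline of the pattern rather than a working implementation of any particular device.

```python
# A minimal sketch of the sense-think-act-optimize loop. The sensor,
# model and actuator objects are hypothetical placeholders, not a real API.

class PhysicalAISystem:
    def __init__(self, sensor, model, actuator):
        self.sensor = sensor      # hardware that samples the physical world
        self.model = model        # ML model that turns readings into decisions
        self.actuator = actuator  # hardware that acts on the physical world
        self.history = []         # logged (reading, decision, result) tuples

    def step(self):
        reading = self.sensor.read()             # sense
        decision = self.model.decide(reading)    # think
        result = self.actuator.apply(decision)   # act
        self.history.append((reading, decision, result))

    def optimize(self):
        # Optimize: periodically retune or retrain the model on logged
        # outcomes so the next pass through the loop does a better job.
        self.model.update(self.history)
        self.history.clear()
```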
The term “physical AI” was popularized in 2024 by Nvidia CEO Jensen Huang, who described it as “AI that understands the laws of physics.” The premise is that AI technologies should include bodily awareness, such as a sense of objects’ positions, motions and interactions in space.
That may sound like another way to describe robotics, but there is a distinction. Robotics focuses on automation rather than augmentation. Physical AI incorporates sensing, reacting and making decisions in real-world environments. If a physical system lacks that intelligence, it is just a machine. So a robot might or might not qualify as physical AI, depending on its analytics functions.
Today’s AI primarily benefits knowledge workers
Most people encounter AI through ChatGPT, Microsoft Copilot and similar generative AI platforms. However, in most contexts, those systems have benefited white-collar workers more than those in blue-collar industries. They have made life easier and more efficient primarily for knowledge workers — for example, by helping to write blog posts, generate code or analyze marketing campaigns.
But the benefits are less direct for hardware-centric businesses
While industrial organizations have used AI, most of that use has been digitally focused. A sensor continuously collects and shares data, such as temperature, and can generate a notification when a reading exceeds a specific threshold. AI may be involved in the analysis, including correlating that data with data from other sources. Today, the results are displayed on a digital dashboard that still needs a human for oversight and potential action.
As we approach physical AI, the sense-think-act-optimize process can automate responses without human intervention, such as adjusting equipment’s temperature or performing diagnostics to determine the cause of a malfunction. Factory machines can “see” what is happening on the shop floor and act on it, taking advantage of inexpensive sensors, edge devices and industrial internet-of-things tools.
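As a hedged illustration of that closed loop, the sketch below monitors a temperature reading and acts on a threshold breach instead of only raising a dashboard alert. The Equipment class is a simulated stand-in for a real device interface, and the limit values are invented.

```python
# Closing the loop on the temperature example: act on the breach, then
# diagnose. Equipment is a simulated stand-in, not a real device API.

TEMP_LIMIT_C = 85.0

class Equipment:
    def __init__(self):
        self.setpoint_c = 80.0

    def read_temperature(self):
        return 91.2  # a real device would sample a physical sensor here

    def set_setpoint(self, value_c):
        self.setpoint_c = value_c  # actuate: lower the operating temperature

    def run_diagnostics(self):
        return {"fault": None}  # real diagnostics would inspect the machine

def control_step(equipment):
    temp_c = equipment.read_temperature()           # sense
    if temp_c > TEMP_LIMIT_C:                       # think
        equipment.set_setpoint(TEMP_LIMIT_C - 10)   # act, no human needed
        report = equipment.run_diagnostics()        # look for a root cause
        if report["fault"]:
            print(f"Escalating to an operator: {report['fault']}")

control_step(Equipment())
```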
The speed of AI development keeps expanding what the technology can accomplish. Models are getting smaller, which makes them portable and, in turn, makes it possible to compute more at the edge. These advances support miniaturization, standardization and the seamless transmission of increasing amounts of data, signals and power.
The human element
Notably, the value pool affected by physical AI is human labor. There is a huge financial incentive for companies to adopt the technology, given that labor is the most significant expense for most businesses. In the technology industry, we could say, “If you enable physical AI, you can automate or optimize human labor.” For business managers, the promise of financial savings will impress the board of directors more than the statement “I got ChatGPT to write a staff member’s blog post.”
We like to believe that digital AI is a thought enhancer and optimizer. ChatGPT does not remove the staff member, we assert. It makes the staff member more productive. That may not hold true for all workers or managers, but the technology excels in assistive roles for knowledge workers.
We now have robots capable and intelligent enough to take physical action. Cobots, or collaborative robots, are smaller, lighter industrial robots that can share human workspaces. With automation and optimization capabilities, they can improve safety by removing humans from dangerous situations, or they can add capacity to improve outcomes — for example, by helping doctors perform more surgeries in a given amount of time.
Where is physical AI heading?
The next logical step toward physical AI is to act within the physical world based on an external signal. Today, advanced driver-assistance systems (ADAS) are the most pervasive use case. Consider them in the context of the four basic steps involved in physical AI; a simplified code sketch of the loop follows the steps below.
Sense: ADAS captures information about the world in real time and accurately perceives and monitors the surrounding environment. That includes the road’s physical characteristics, lane markings and nearby vehicles. It must distinguish among small cars, trucks, bikes and pedestrians.
Think: Computations use the system’s perceptions to make intelligent decisions. With all of those inputs, in addition to data on weather conditions and other external factors, an ADAS must calculate fast — ideally, faster than a human can.
Act: Intelligence becomes action. The ADAS needs automation and optimization. It must actuate the vehicle: slam on the brakes, change lanes or do whatever is required.
Optimize: Every AI system should get smarter on its next iteration. Automation is good; optimization is better. Physical AI should evaluate the efficiency and accuracy of its actions so it can do a better job every time, throughout the vehicle’s lifecycle. The process is similar to what happens in nature: The human body continuously receives feedback and optimizes its response. For example, we might shiver to warm up or sweat to cool down, based on self-regulatory feedback that adjusts behavior to maintain a healthy body temperature.
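Here is the simplified sketch promised above, tracing a single pass through the four steps. The perception output, braking deceleration and decision rules are invented for illustration; a production ADAS would rely on trained perception models and vehicle-grade control systems.

```python
# One illustrative pass through sense-think-act-optimize for an ADAS-style
# loop. All values and rules here are invented for demonstration.

def sense(camera_frame):
    # Sense: classify nearby objects and estimate distance (stubbed result).
    return {"object": "truck", "distance_m": 12.0, "lane_clear_left": True}

def think(perception, speed_mps):
    # Think: brake or change lanes if we cannot stop in time.
    stopping_distance_m = speed_mps ** 2 / (2 * 7.0)  # assumes ~7 m/s^2 braking
    if perception["distance_m"] < stopping_distance_m:
        return "change_lane" if perception["lane_clear_left"] else "brake"
    return "hold"

def act(decision):
    # Act: a real system would command the brakes or steering here.
    print(f"actuating: {decision}")

def optimize(log):
    # Optimize: review logged decisions offline to retune rules or models.
    pass

perception = sense(camera_frame=None)
decision = think(perception, speed_mps=20.0)
act(decision)
optimize([(perception, decision)])
```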
Automotive examples are far from unique. Industrial robots share several subsystems with vehicles, for instance: They need to perceive the environment, identify objects and obstacles, and navigate autonomously. And they must be continuously connected to broader computing ecosystems.
In energy industries, physical AI can synthesize weather data, pipeline pressure readings and other critical data in real time, providing operators with contextual awareness and predictive alerts about vulnerable system components. Physical AI can transform industries like transportation and healthcare, or any context that requires physical interaction. Injecting trustworthy AI into such applications enables machines to make more competent decisions, enhancing both efficiency and safety.
Data makes IT/OT convergence promises achievable
The long-desired integration of IT and operational technology (OT) has never taken off, despite CIOs’ and plant managers’ earnest desires. IT/OT convergence has been hard to justify, given the lack of infrastructure to connect them, ever-present security concerns and the lack of a “killer app” to make implementation an urgent task.
Two things have changed the situation. Devices at the edge have become smaller, cheaper and more powerful, and they can keep up with the heavy CPU demands of machine learning models.
Also, data is now perceived as a new competitive edge. Physical AI collects a lot of data, whose analysis can help streamline a manufacturing process or improve self-driving accuracy. IT/OT convergence is a logical next step. By combining IT data (think supply chain systems) with OT data (such as machine performance logs), companies can see actionable benefits in predictive maintenance and demand forecasting.
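As a small, hedged illustration of that combination, the pandas sketch below joins a hypothetical IT maintenance schedule with OT sensor logs to flag machines that may need service earlier than planned. All column names and thresholds are invented.

```python
# Joining IT data (maintenance schedule) with OT data (machine sensor logs)
# to flag early-service candidates. Columns and thresholds are hypothetical.
import pandas as pd

it_df = pd.DataFrame({  # IT side: planning and supply chain records
    "machine_id": ["M1", "M2"],
    "next_scheduled_service": ["2025-09-01", "2025-07-15"],
})

ot_df = pd.DataFrame({  # OT side: performance logs from the shop floor
    "machine_id": ["M1", "M2"],
    "vibration_rms": [0.8, 2.7],  # higher readings suggest mechanical wear
})

merged = it_df.merge(ot_df, on="machine_id")

# Predictive maintenance cue: flag machines whose sensor data suggests
# servicing sooner than the IT schedule plans for.
needs_early_service = merged[merged["vibration_rms"] > 2.0]
print(needs_early_service[["machine_id", "next_scheduled_service"]])
```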
Moreover, the combined data sources provide new opportunities. An interactive generative AI interface may help analysts and workers refine prompts, which can lead to more relevant answers. The better the AI model understands its world, the better the outcome. Superior input fidelity — that is, higher-quality sensors — improves data quality, which enables generative AI systems to deliver better outcomes.
Those automations and optimizations finally make IT/OT convergence a realistic prospect.