Online Learning of Object-Centric Symbolic Models in Partially Observable Environments
Leonardo Lamanna; Luciano Serafini; Paolo Traverso
2026-01-01
Abstract
Object-centric world models are increasingly used in agents that reason and act within an environment. These models specify which objects exist, what properties they have, how actions change them, and how they are perceived through the agent’s sensors. In this paper, we address the problem of how an agent can learn such models autonomously and online by executing actions and observing their effects. The agent models the environment as an object-centric, partially observable, relational MDP. We define an algorithm by which the agent incrementally learns the elements of such a model, namely: the signature for representing the objects and their properties, the observation function that links object states to observations, and a lifted specification of the transition function. We evaluate our approach by demonstrating how the agent can use the learned model to plan and solve a set of tasks not known a priori, and we compare these results with a set of reinforcement learning (RL) baselines. We show that our method learns an environment model that is effective for planning, requires significantly less training, and outperforms the RL baselines.

| File | Type | License | Size | Format |
|---|---|---|---|---|
| OLOM___ICAART_2026__camera_ready_.pdf (authorized users only) | Pre-print document | Not public (private/restricted access) | 539.8 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
