Online Learning of Object-Centric Symbolic Models in Partially Observable Environments

Leonardo Lamanna; Luciano Serafini; Paolo Traverso
2026-01-01

Abstract

Object-centric world models are increasingly used in agents that reason and act within an environment. These models specify which objects exist, their properties, how they are changed by actions, and how they are perceived through the agent’s sensors. In this paper, we address the problem of how an agent can learn such models autonomously and online by executing actions and observing their effects. The agent models the environment as an object-centric, partially observable, relational MDP. We define an algorithm by which the agent incrementally learns the elements of such a model, namely the signature for representing the objects and their properties, the observation function that links object states to observations, and a lifted specification of the transition function. We evaluate our approach by demonstrating how the agent can use the learned model to plan and solve a set of tasks not known a priori, and we compare these results with a set of Reinforcement Learning (RL) baselines. We show that our method learns an environment model that is effective for planning, while requiring significantly less training and outperforming the RL baselines.
Files in this item:
File: OLOM___ICAART_2026__camera_ready_.pdf
Access: authorized users only
Type: Pre-print
License: Not public - private/restricted access
Size: 539.8 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/368988