Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning

¹Mila ²Université de Montréal ³University of Waterloo ⁴Google DeepMind · *Co-first author · ⁺Equal advising

Abstract

While goal-conditioned behavior cloning (GCBC) methods can perform well on in-distribution training tasks, they do not necessarily generalize zero-shot to tasks that require conditioning on novel state-goal pairs, i.e., combinatorial generalization. In part, this limitation can be attributed to a lack of temporal consistency in the state representation learned by BC; if temporally correlated states are properly encoded to similar latent representations, then the out-of-distribution gap for novel state-goal pairs would be reduced. We formalize this notion by demonstrating how encouraging long-range temporal consistency via successor representations (SR) can facilitate generalization. We then propose a simple yet effective representation learning objective for GCBC, BYOL-γ, which theoretically approximates the successor representation in the finite-MDP case through self-predictive representations, and achieves competitive empirical performance across a suite of challenging tasks requiring combinatorial generalization.
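For reference, the successor representation the abstract alludes to has a standard definition in the finite-MDP case (notation ours, matching the usual convention):

```latex
M^{\pi}(s, s') \;=\; \mathbb{E}_{\pi}\!\left[\, \sum_{k=0}^{\infty} \gamma^{k}\, \mathbf{1}\{s_{t+k} = s'\} \;\middle|\; s_t = s \right]
```

Sampling the offset geometrically, $k \sim \mathrm{Geom}(1-\gamma)$ with $P(k) = (1-\gamma)\gamma^{k}$, gives $\mathbb{E}_k\big[\mathbf{1}\{s_{t+k} = s'\}\big] = (1-\gamma)\, M^{\pi}(s, s')$, so predicting a geometrically sampled future state estimates the SR up to the constant $(1-\gamma)$.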

Overview


To learn better policy representations for generalization, we utilize an auxiliary self-predictive objective that predicts a future representation φ(s_{t+k}) via ψ_f(φ(s_t)). We can also predict backwards with a separate predictor ψ_b(φ(s_{t+k})). The target offset is sampled geometrically, k ~ Geom(1 − γ).
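The objective above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the encoder `phi`, predictor `psi_f`, and the cosine-similarity loss form are our simplifications, and the stop-gradient on the target is only indicated in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_offset(gamma, rng):
    # k ~ Geom(1 - gamma): larger gamma favors longer-range targets.
    return int(rng.geometric(1.0 - gamma))

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def byol_gamma_loss(phi, psi_f, traj, t, gamma, rng):
    """Negative cosine similarity between the forward prediction
    psi_f(phi(s_t)) and the target representation phi(s_{t+k})."""
    k = sample_offset(gamma, rng)
    k = min(k, len(traj) - 1 - t)   # clip the offset to the trajectory end
    online = psi_f(phi(traj[t]))
    target = phi(traj[t + k])       # stop-gradient in practice
    return -cosine_similarity(online, target)

# Toy linear encoder/predictor on a random trajectory.
W_enc = rng.normal(size=(8, 4))
W_pred = rng.normal(size=(4, 4))
phi = lambda s: W_enc.T @ s
psi_f = lambda z: W_pred @ z
traj = rng.normal(size=(20, 8))
loss = byol_gamma_loss(phi, psi_f, traj, t=3, gamma=0.9, rng=rng)
```

A backward predictor ψ_b would mirror this, predicting φ(s_t) from φ(s_{t+k}).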


Representation Visualization


Representation visualizations for different auxiliary losses with BC, showing the cosine similarity between the forward prediction ψ_f applied to every state's representation and the goal's representation φ.


OGBench Results


We find that on OGBench tasks requiring combinatorial generalization, using BYOL-γ as an auxiliary objective leads to improvements over BC, and is competitive or better than other representation learning objectives across different tasks.

Horizon Generalization


We conduct experiments to understand how the success rate changes as an agent has to reach more challenging goals further away from its starting position. We consider the same 5 base evaluation tasks used in our main evaluation, but construct intermediate waypoints as shorter-horizon goals along the shortest path from the start to the final goal. We find that our objective can help generalize to further and more challenging goals.
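The waypoint construction can be sketched on a toy grid maze. This is our hedged illustration, not the paper's evaluation code: we assume a 4-connected grid, a BFS shortest path, and evenly spaced waypoints along it; all names here are hypothetical.

```python
from collections import deque

def shortest_path(grid, start, goal):
    """BFS shortest path on a 4-connected grid; grid[r][c] == 1 is a wall."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    q = deque([start])
    while q:
        cur = q.popleft()
        if cur == goal:
            path = []
            while cur is not None:      # walk parents back to the start
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parent):
                parent[(nr, nc)] = cur
                q.append((nr, nc))
    return None

def waypoints(path, n):
    """n evenly spaced intermediate goals along the path (last one = goal)."""
    idx = [round(i * (len(path) - 1) / n) for i in range(1, n + 1)]
    return [path[i] for i in idx]

grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
path = shortest_path(grid, (0, 0), (2, 3))
goals = waypoints(path, 3)
```

Evaluating the agent on each entry of `goals` in order then yields a curve of success rate versus goal horizon.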


BibTeX

@misc{lawson2025selfpredictiverepresentationscombinatorialgeneralization,
      title={Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning}, 
      author={Daniel Lawson and Adriana Hugessen and Charlotte Cloutier and Glen Berseth and Khimya Khetarpal},
      year={2025},
      eprint={2506.10137},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2506.10137}, 
}