Strengthening Generative Robot Policies through Predictive World Modeling

Han Qi¹, Haocheng Yin¹, Aris Zhu¹, Yilun Du¹, Heng Yang¹

¹School of Engineering and Applied Sciences, Harvard University

arXiv | Code (coming soon)

Abstract

We present generative predictive control (GPC), a learning control framework that (i) clones a generative diffusion-based policy from expert demonstrations, (ii) trains a predictive action-conditioned world model from both expert demonstrations and random explorations, and (iii) synthesizes an online planner that ranks and optimizes the action proposals from (i) by looking ahead into the future using the world model from (ii). Across a variety of robotic manipulation tasks, we demonstrate that GPC consistently outperforms behavior cloning in both state-based and vision-based settings, in simulation and in the real world.
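The three steps in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: `sample_proposals` is a hypothetical stand-in for the diffusion policy, `world_model` uses toy additive dynamics in place of a learned predictor, `reward` is a hypothetical squared-distance score, and `gpc_opt` uses finite-difference gradient ascent where a learned, differentiable world model would normally be differentiated directly.

```python
import numpy as np

rng = np.random.default_rng(0)
GOAL = np.array([1.0, 1.0])  # hypothetical target state, for illustration only


def sample_proposals(state, k, horizon):
    """Stand-in for the cloned policy: K noisy action sequences aimed at GOAL."""
    direction = (GOAL - state) / horizon
    return direction + 0.3 * rng.standard_normal((k, horizon, state.size))


def world_model(state, actions):
    """Stand-in for the learned world model: toy dynamics s' = s + a.

    Returns the predicted future states, shape (horizon, dim).
    """
    return state + np.cumsum(actions, axis=0)


def reward(pred_states):
    """Hypothetical reward: negative squared distance of the last state to GOAL."""
    return -float(np.sum((pred_states[-1] - GOAL) ** 2))


def gpc_rank(state, k=16, horizon=8):
    """Score every proposal by looking ahead with the world model; keep the best."""
    proposals = sample_proposals(state, k, horizon)
    scores = np.array([reward(world_model(state, a)) for a in proposals])
    return proposals[int(np.argmax(scores))], scores


def gpc_opt(state, actions, steps=5, lr=0.05, eps=1e-4):
    """Refine one proposal by gradient ascent on the predicted reward.

    The gradient is estimated with forward finite differences (an assumption
    of this sketch, chosen so the example needs only NumPy).
    """
    actions = actions.copy()
    for _ in range(steps):
        base = reward(world_model(state, actions))
        grad = np.zeros_like(actions)
        for idx in np.ndindex(actions.shape):
            bumped = actions.copy()
            bumped[idx] += eps
            grad[idx] = (reward(world_model(state, bumped)) - base) / eps
        actions += lr * grad
    return actions


state = np.zeros(2)
best, scores = gpc_rank(state)      # ranking over K sampled proposals
refined = gpc_opt(state, best)      # optimization of the selected proposal
```

In this toy setting the ranking step picks the proposal whose predicted final state is closest to the goal, and the optimization step nudges that proposal further along the reward gradient. With a learned world model, the quality of both steps depends on how accurate its predictions are.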

Simulation Evaluation

Push-T

Triangle Drawing

Block Stacking

Cube & Sphere Swapping

World Model Prediction

Plain Push-T

World Model Prediction in GPC-RANK.

World Model Prediction in GPC-OPT.

Push-T collided with A

World Model Prediction in GPC-RANK.

World Model Prediction in GPC-OPT.

Push-T collided with A & R

World Model Prediction in GPC-RANK.

World Model Prediction in GPC-OPT.

Real-world Evaluation

Full real-world evaluation results for each task are listed below.

Plain Push-T: Baseline (5 out of 10) vs. GPC-RANK (7 out of 10) vs. GPC-OPT (7 out of 10)

Baseline Test 0: Success.

GPC-RANK Test 0: Success.

GPC-OPT Test 0: Success.

Baseline Test 1: Failure.

GPC-RANK Test 1: Success.

GPC-OPT Test 1: Success.

Baseline Test 2: Success.

GPC-RANK Test 2: Success.

GPC-OPT Test 2: Success.

Baseline Test 3: Failure.

GPC-RANK Test 3: Failure.

GPC-OPT Test 3: Success.

Baseline Test 4: Failure.

GPC-RANK Test 4: Success.

GPC-OPT Test 4: Success.

Baseline Test 5: Failure.

GPC-RANK Test 5: Success.

GPC-OPT Test 5: Failure.

Baseline Test 6: Success.

GPC-RANK Test 6: Failure.

GPC-OPT Test 6: Failure.

Baseline Test 7: Failure.

GPC-RANK Test 7: Failure.

GPC-OPT Test 7: Success.

Baseline Test 8: Success.

GPC-RANK Test 8: Success.

GPC-OPT Test 8: Success.

Baseline Test 9: Success.

GPC-RANK Test 9: Success.

GPC-OPT Test 9: Failure.

Push-T collided with A: Baseline (2 out of 5) vs. GPC-RANK (4 out of 5) vs. GPC-OPT (3 out of 5)

Baseline Test 0: Failure.

GPC-RANK Test 0: Success.

GPC-OPT Test 0: Success.

Baseline Test 1: Success.

GPC-RANK Test 1: Success.

GPC-OPT Test 1: Success.

Baseline Test 2: Failure.

GPC-RANK Test 2: Success.

GPC-OPT Test 2: Failure.

Baseline Test 3: Failure.

GPC-RANK Test 3: Failure.

GPC-OPT Test 3: Failure.

Baseline Test 4: Success.

GPC-RANK Test 4: Success.

GPC-OPT Test 4: Success.

Push-T collided with A & R: Baseline (2 out of 5) vs. GPC-RANK (3 out of 5) vs. GPC-OPT (4 out of 5)

Baseline Test 0: Success.

GPC-RANK Test 0: Success.

GPC-OPT Test 0: Success.

Baseline Test 1: Success.

GPC-RANK Test 1: Success.

GPC-OPT Test 1: Success.

Baseline Test 2: Failure.

GPC-RANK Test 2: Success.

GPC-OPT Test 2: Success.

Baseline Test 3: Failure.

GPC-RANK Test 3: Failure.

GPC-OPT Test 3: Failure.

Baseline Test 4: Failure.

GPC-RANK Test 4: Failure.

GPC-OPT Test 4: Success.

Push-T collided with R: Baseline (2 out of 3) vs. GPC-RANK (3 out of 3) vs. GPC-OPT (3 out of 3)

Baseline Test 0: Success.

GPC-RANK Test 0: Success.

GPC-OPT Test 0: Success.

Baseline Test 1: Success.

GPC-RANK Test 1: Success.

GPC-OPT Test 1: Success.

Baseline Test 2: Failure.

GPC-RANK Test 2: Success.

GPC-OPT Test 2: Success.

Clothes Folding: Baseline (3 out of 10) vs. GPC-RANK (7 out of 10)

Baseline Test 0: Failure.

GPC-RANK Test 0: Success.

Baseline Test 1: Success.

GPC-RANK Test 1: Success.

Baseline Test 2: Failure.

GPC-RANK Test 2: Success.

Baseline Test 3: Failure.

GPC-RANK Test 3: Success.

Baseline Test 4: Failure.

GPC-RANK Test 4: Failure.

Baseline Test 5: Failure.

GPC-RANK Test 5: Success.

Baseline Test 6: Success.

GPC-RANK Test 6: Success.

Baseline Test 7: Success.

GPC-RANK Test 7: Success.

Baseline Test 8: Failure.

GPC-RANK Test 8: Failure.

Baseline Test 9: Failure.

GPC-RANK Test 9: Failure.
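The per-task counts in the headings above can be tallied into pooled success rates. The short sketch below copies the (successes, trials) pairs from those headings; the dictionary layout and the function name `pooled_counts` are illustrative choices. Note that GPC-OPT was not evaluated on Clothes Folding, so its pooled rate covers only the four Push-T variants and is not directly comparable to the other two.

```python
# (successes, trials) per method, copied from the task headings above.
results = {
    "Plain Push-T": {"Baseline": (5, 10), "GPC-RANK": (7, 10), "GPC-OPT": (7, 10)},
    "Push-T collided with A": {"Baseline": (2, 5), "GPC-RANK": (4, 5), "GPC-OPT": (3, 5)},
    "Push-T collided with A & R": {"Baseline": (2, 5), "GPC-RANK": (3, 5), "GPC-OPT": (4, 5)},
    "Push-T collided with R": {"Baseline": (2, 3), "GPC-RANK": (3, 3), "GPC-OPT": (3, 3)},
    "Clothes Folding": {"Baseline": (3, 10), "GPC-RANK": (7, 10)},
}


def pooled_counts(results):
    """Sum successes and trials per method across all tasks it was run on."""
    totals = {}
    for per_method in results.values():
        for method, (wins, trials) in per_method.items():
            prev_wins, prev_trials = totals.get(method, (0, 0))
            totals[method] = (prev_wins + wins, prev_trials + trials)
    return totals


totals = pooled_counts(results)
for method, (wins, trials) in totals.items():
    print(f"{method}: {wins}/{trials} = {wins / trials:.0%}")
```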

BibTeX

  @article{qi2025gpc,
    title={Strengthening Generative Robot Policies through Predictive World Modeling},
    author={Qi, Han and Yin, Haocheng and Zhu, Aris and Du, Yilun and Yang, Heng},
    journal={arXiv preprint arXiv:2502.00622},
    year={2025}
  }