Reinforcement Learning

· Last updated: 2025-09-13 · 302 words · 2 min read

Preface

Sim RL, real-world RL, RL with human-in-the-loop: we search for reward, for research and for life.

[Image]

Image source: https://arxiv.org/pdf/2508.08189

RL & VLA: A Review

What Can RL Bring to VLA Generalization? An Empirical Study

MoRE (No-env offline)

Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models

ReinboT (No-env offline)

Amplifying Robot Visual-Language Manipulation with Reinforcement Learning

OctoNav (Env-sim online)

Towards Generalist Embodied Navigation

ConRFT (Env-real online, RSS)

A Reinforced Fine-tuning Method for VLA Models via Consistency Policy

RLDG (Env-real online, RSS)

Policy Agnostic (Env-real online)

Offline RL and Online RL Fine-Tuning of Any Class and Backbone

GRAPE (test time)

Generalizing Robot Policy via Preference Alignment

BAIR (science, real) Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

  • paper link: https://arxiv.org/abs/2410.21845

  • time: 2024/10

  • core idea:

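  • reading note: omg, I have read too many conference papers; journal papers really are not easy to read… the main text is 22 pages, I could cry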

  • recommendation rating:

© Nataraj Basappa 2025