LLM Agents
[RetroFormer] Retrospective Large Language Agents with Policy Gradient Optimization
arXiv: https://arxiv.org/abs/2308.02151 | 4 Aug 2023 | Salesforce
This paper introduces Retroformer, a principled framework for reinforcing language agents by learning a plug-in retrospective model that automatically refines the agent's prompts from environment feedback through policy optimization. Specifically, the proposed agent architecture learns from arbitrary reward information across multiple environments and tasks to iteratively fine-tune a pre-trained language model. That model refines the agent's prompts by reflecting on failed attempts and assigning credit for the agent's actions with respect to future rewards.
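To make the idea of "refining prompts through policy optimization" concrete, here is a toy sketch, not the paper's implementation. It replaces the fine-tuned language model with a softmax policy over three hypothetical reflection strings and applies a REINFORCE-style update. The reflection texts, the return values, and the choice of "improvement in return after the reflection is added to the prompt" as the reward are all assumptions made for illustration.

```python
import math
import random

# Hypothetical reflections the retrospective policy can add to the agent prompt.
REFLECTIONS = [
    "Search for the entity before answering.",
    "Answer immediately without searching.",
    "Repeat the previous action.",
]

# Invented toy environment: return of the next attempt for each reflection.
NEXT_RETURN = [1.0, 0.2, 0.0]
# Invented return of the failed attempt made before any reflection.
BASE_RETURN = 0.3


def softmax(logits):
    """Turn logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=500, lr=0.5, seed=0):
    """REINFORCE on a softmax policy; returns the final probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(REFLECTIONS)
    for _ in range(steps):
        probs = softmax(logits)
        # Sample which reflection to put into the prompt.
        action = rng.choices(range(len(REFLECTIONS)), weights=probs)[0]
        # Assumed reward signal: how much the next attempt improved.
        reward = NEXT_RETURN[action] - BASE_RETURN
        # Gradient of log pi(action) w.r.t. logit i is 1[i == action] - probs[i].
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * reward * grad
    return softmax(logits)


if __name__ == "__main__":
    final_probs = train()
    for text, p in zip(REFLECTIONS, final_probs):
        print(f"{p:.3f}  {text}")
```

In this toy setting the policy shifts probability toward the reflection that most improves the next attempt. The real system operates on generated text and a fine-tuned language model rather than a fixed menu of strings.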