Policy Evaluation for Reinforcement Learning from Human Feedback: A Sample Complexity Analysis

Zihao Li, Xiang Ji, Minshuo Chen, Mengdi Wang

Research output: Contribution to journal, Conference article, peer-review

