Variational policy gradient method for reinforcement learning with general utilities

Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvári, Mengdi Wang

Research output: Contribution to journal, Conference article, peer-review

52 Scopus citations
