"Experiments with Infinite-Horizon, Policy-Gradient Estimation."

Jonathan Baxter, Peter L. Bartlett, Lex Weaver (2001)

DOI: 10.1613/JAIR.807

access: open

type: Journal Article

metadata version: 2022-08-16
