"Inferring the Optimal Policy using Markov Chain Monte Carlo."

Brandon Trabucco et al. (2019)