Jan Peters, J. Andrew Bagnell: Policy Gradient Methods. Encyclopedia of Machine Learning 2010: 774-776