Prashanth L. A.
Prashanth Lakshmanrao Ananthapadmanabharao
Person information
- affiliation: University of Maryland
- affiliation: INRIA Lille - Nord Europe
- affiliation: Indian Institute of Science, Department of Computer Science and Automation
2010 – today
- 2017
- [j8]Prashanth L. A., Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus:
Adaptive System Optimization Using Random Directions Stochastic Approximation. IEEE Trans. Automat. Contr. 62(5): 2223-2238 (2017)
- [c16]Aditya Gopalan, Prashanth L. A., Michael C. Fu, Steven I. Marcus:
Weighted Bandits or: How Bandits Learn Distorted Values That Are Not Expected. AAAI 2017: 1941-1947
- 2016
- [j7]Prashanth L. A., Mohammad Ghavamzadeh:
Variance-constrained actor-critic algorithms for discounted and average reward MDPs. Machine Learning 105(3): 367-417 (2016)
- [j6]Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra:
A constrained optimization perspective on actor-critic algorithms and application to network routing. Systems & Control Letters 92: 46-51 (2016)
- [c15]Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári:
(Bandit) Convex Optimization with Biased Noisy Gradient Oracles. AISTATS 2016: 819-828
- [c14]D. Sai Koti Reddy, Prashanth L. A., Shalabh Bhatnagar:
Improved Hessian estimation for adaptive random directions stochastic approximation. CDC 2016: 3682-3687
- [c13]Prashanth L. A., Cheng Jie, Michael C. Fu, Steven I. Marcus, Csaba Szepesvári:
Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control. ICML 2016: 1406-1415
- [i14]Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári:
(Bandit) Convex Optimization with Biased Noisy Gradient Oracles. CoRR abs/1609.07087 (2016)
- [i13]Aditya Gopalan, Prashanth L. A., Michael C. Fu, Steven I. Marcus:
Weighted bandits or: How bandits learn distorted values that are not expected. CoRR abs/1611.10283 (2016)
- 2015
- [j5]Shalabh Bhatnagar, Prashanth L. A.:
Simultaneous Perturbation Newton Algorithms for Simulation Optimization. J. Optimization Theory and Applications 164(2): 621-643 (2015)
- [j4]Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Dasgupta:
Simultaneous perturbation methods for adaptive labor staffing in service systems. Simulation 91(5): 432-455 (2015)
- [c12]Nathaniel Korda, Prashanth L. A., Rémi Munos:
Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits. AAAI 2015: 2708-2714
- [c11]H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar:
Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games. AAMAS 2015: 1371-1379
- [c10]Nathaniel Korda, Prashanth L. A.:
On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence. ICML 2015: 626-634
- [i12]Prashanth L. A., Shalabh Bhatnagar:
Adaptive system optimization using (simultaneous) random directions stochastic approximation. CoRR abs/1502.05577 (2015)
- [i11]Prashanth L. A., Cheng Jie, Michael C. Fu, Steven I. Marcus:
Cumulative Prospect Theory Meets Reinforcement Learning: Estimation and Control. CoRR abs/1506.02632 (2015)
- [i10]Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra:
A constrained optimization perspective on actor critic algorithms and application to network routing. CoRR abs/1507.07984 (2015)
- 2014
- [j3]Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar:
Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks. Wireless Networks 20(8): 2589-2604 (2014)
- [c9]
- [c8]Raphael Fonteneau, Prashanth L. A.:
Simultaneous perturbation algorithms for batch off-policy search. CDC 2014: 2622-2627
- [c7]Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar:
Adaptive sleep-wake control using reinforcement learning in sensor networks. COMSNETS 2014: 1-8
- [c6]Prashanth L. A., Nathaniel Korda, Rémi Munos:
Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control. ECML/PKDD (2) 2014: 66-81
- [i9]H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar:
Algorithms for Nash Equilibria in General-Sum Stochastic Games. CoRR abs/1401.2086 (2014)
- [i8]Raphael Fonteneau, Prashanth L. A.:
Simultaneous Perturbation Algorithms for Batch Off-Policy Search. CoRR abs/1403.4514 (2014)
- [i7]Prashanth L. A., Mohammad Ghavamzadeh:
Actor-Critic Algorithms for Risk-Sensitive Reinforcement Learning. CoRR abs/1403.6530 (2014)
- [i6]
- [i5]Nathaniel Korda, Prashanth L. A.:
On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence. CoRR abs/1411.3224 (2014)
- 2013
- [c5]Prashanth Lakshmanrao Ananthapadmanabharao, Horabailu Laxminarayana Prasad, Nirmit Desai, Shalabh Bhatnagar:
Mechanisms for hostile agents with capacity constraints. AAMAS 2013: 659-666
- [c4]Prashanth L. A., Mohammad Ghavamzadeh:
Actor-Critic Algorithms for Risk-Sensitive MDPs. NIPS 2013: 252-260
- [i4]Prashanth L. A., Nathaniel Korda, Rémi Munos:
Analysis of stochastic approximation for efficient least squares regression and LSTD. CoRR abs/1306.2557 (2013)
- [i3]Nathaniel Korda, Prashanth L. A., Rémi Munos:
Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits. CoRR abs/1307.3176 (2013)
- [i2]Prashanth Lakshmanrao Ananthapadmanabharao, Abhranil Chatterjee, Shalabh Bhatnagar:
Reinforcement Learning for Sleep-Wake Scheduling in Sensor Networks. CoRR abs/1312.7292 (2013)
- [i1]Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Dasgupta:
Simultaneous Perturbation Methods for Adaptive Labor Staffing in Service Systems. CoRR abs/1312.7430 (2013)
- 2012
- [j2]Prashanth L. A., Shalabh Bhatnagar:
Threshold Tuning Using Stochastic Optimization for Graded Signal Control. IEEE Trans. Vehicular Technology 61(9): 3865-3880 (2012)
- 2011
- [j1]Prashanth L. A., Shalabh Bhatnagar:
Reinforcement Learning With Function Approximation for Traffic Signal Control. IEEE Trans. Intelligent Transportation Systems 12(2): 412-421 (2011)
- [c3]Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Banerjee Dasgupta:
Stochastic Optimization for Adaptive Labor Staffing in Service Systems. ICSOC 2011: 487-494
- [c2]Prashanth L. A., Shalabh Bhatnagar:
Reinforcement learning with average cost for adaptive control of traffic lights at intersections. ITSC 2011: 1640-1645
2000 – 2009
- 2008
- [c1]Prashanth L. A., K. Gopinath:
OFDM-MAC algorithms and their impact on TCP performance in next generation mobile networks. COMSWARE 2008: 133-140
data released under the ODC-BY 1.0 license; see also our legal information page
last updated on 2017-12-29 19:58 CET by the dblp team