Odalric-Ambrym Maillard
Person information
- affiliation: Technion, Haifa, Faculty of Electrical Engineering
2020 – today
- 2024
- [j8]Timothée Mathieu, Debabrota Basu, Odalric-Ambrym Maillard:
Bandits Corrupted by Nature: Lower Bounds on Regret and Robust Optimistic Algorithms. Trans. Mach. Learn. Res. 2024 (2024)
- [c54]Shubhada Agrawal, Timothée Mathieu, Debabrota Basu, Odalric-Ambrym Maillard:
CRIMED: Lower and Upper Bounds on Regret for Bandits with Unbounded Stochastic Corruption. ALT 2024: 74-124
- [i37]Tuan Dam, Odalric-Ambrym Maillard, Emilie Kaufmann:
Power Mean Estimation in Stochastic Monte-Carlo Tree Search. CoRR abs/2406.02235 (2024)
- [i36]Odalric-Ambrym Maillard, Mohammad Sadegh Talebi:
How to Shrink Confidence Sets for Many Equivalent Discrete Distributions? CoRR abs/2407.15662 (2024)
- 2023
- [c53]Reda Ouhamma, Debabrota Basu, Odalric Maillard:
Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration & Planning. AAAI 2023: 9336-9344
- [c52]Hassan Saber, Fabien Pesquerel, Odalric-Ambrym Maillard, Mohammad Sadegh Talebi:
Logarithmic regret in communicating MDPs: Leveraging known dynamics with bandits. ACML 2023: 1167-1182
- [c51]Hippolyte Bourel, Anders Jonsson, Odalric-Ambrym Maillard, Mohammad Sadegh Talebi:
Exploration in Reward Machines with Low Regret. AISTATS 2023: 4114-4146
- [c50]Patrick Saux, Odalric Maillard:
Risk-aware linear bandits with convex loss. AISTATS 2023: 7723-7754
- [c49]Sayak Ray Chowdhury, Patrick Saux, Odalric Maillard, Aditya Gopalan:
Bregman Deviations of Generic Exponential Families. COLT 2023: 394-449
- [c48]Dorian Baudry, Fabien Pesquerel, Rémy Degenne, Odalric-Ambrym Maillard:
Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits. NeurIPS 2023
- [i35]Timothée Mathieu, Riccardo Della Vecchia, Alena Shilova, Matheus Centa de Medeiros, Hector Kohler, Odalric-Ambrym Maillard, Philippe Preux:
AdaStop: sequential testing for efficient and reliable comparisons of Deep RL Agents. CoRR abs/2306.10882 (2023)
- [i34]Tuan Dam, Pascal Stenger, Lukas Schneider, Joni Pajarinen, Carlo D'Eramo, Odalric-Ambrym Maillard:
Monte-Carlo tree search with uncertainty propagation via optimal transport. CoRR abs/2309.10737 (2023)
- [i33]Shubhada Agrawal, Timothée Mathieu, Debabrota Basu, Odalric-Ambrym Maillard:
CRIMED: Lower and Upper Bounds on Regret for Bandits with Unbounded Stochastic Corruption. CoRR abs/2309.16563 (2023)
- 2022
- [j7]Romain Gautron, Odalric-Ambrym Maillard, Philippe Preux, Marc Corbeels, Régis Sabbadin:
Reinforcement learning for crop management support: Review, prospects and challenges. Comput. Electron. Agric. 200: 107182 (2022)
- [j6]Kinda Khawam, Hassan Fawaz, Samer Lahoud, Odalric-Ambrym Maillard, Steven Martin:
A channel selection game for multi-operator LoRaWAN deployments. Comput. Networks 216: 109185 (2022)
- [j5]Lilian Besson, Emilie Kaufmann, Odalric-Ambrym Maillard, Julien Seznec:
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits. J. Mach. Learn. Res. 23: 77:1-77:40 (2022)
- [j4]Mahsa Asadi, Aurélien Bellet, Odalric-Ambrym Maillard, Marc Tommasi:
Collaborative Algorithms for Online Personalized Mean Estimation. Trans. Mach. Learn. Res. 2022 (2022)
- [c47]Fabien Pesquerel, Odalric-Ambrym Maillard:
IMED-RL: Regret optimal learning of ergodic Markov decision processes. NeurIPS 2022
- [i32]Sayak Ray Chowdhury, Patrick Saux, Odalric-Ambrym Maillard, Aditya Gopalan:
Bregman Deviations of Generic Exponential Families. CoRR abs/2201.07306 (2022)
- [i31]Debabrota Basu, Odalric-Ambrym Maillard, Timothée Mathieu:
Bandits Corrupted by Nature: Lower Bounds on Regret and Robust Optimistic Algorithm. CoRR abs/2203.03186 (2022)
- [i30]Romain Gautron, Emilio J. Padrón, Philippe Preux, Julien Bigot, Odalric-Ambrym Maillard, David Emukpere:
gym-DSSAT: a crop model turned into a Reinforcement Learning environment. CoRR abs/2207.03270 (2022)
- [i29]Mahsa Asadi, Aurélien Bellet, Odalric-Ambrym Maillard, Marc Tommasi:
Collaborative Algorithms for Online Personalized Mean Estimation. CoRR abs/2208.11530 (2022)
- [i28]Patrick Saux, Odalric-Ambrym Maillard:
Risk-aware linear bandits with convex loss. CoRR abs/2209.07154 (2022)
- [i27]Reda Ouhamma, Debabrota Basu, Odalric-Ambrym Maillard:
Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration and Planning. CoRR abs/2210.02087 (2022)
- 2021
- [c46]Sayak Ray Chowdhury, Aditya Gopalan, Odalric-Ambrym Maillard:
Reinforcement Learning in Parametric MDPs with Exponential Families. AISTATS 2021: 1855-1863
- [c45]Mohammad Sadegh Talebi, Anders Jonsson, Odalric Maillard:
Improved Exploration in Factored Average-Reward MDPs. AISTATS 2021: 3988-3996
- [c44]Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux:
Learning Value Functions in Deep Policy Gradients using Residual Variance. ICLR 2021
- [c43]Dorian Baudry, Romain Gautron, Emilie Kaufmann, Odalric Maillard:
Optimal Thompson Sampling strategies for support-aware CVaR bandits. ICML 2021: 716-726
- [c42]Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard:
Indexed Minimum Empirical Divergence for Unimodal Bandits. NeurIPS 2021: 7346-7356
- [c41]Dorian Baudry, Patrick Saux, Odalric-Ambrym Maillard:
From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits. NeurIPS 2021: 14029-14041
- [c40]Reda Ouhamma, Odalric-Ambrym Maillard, Vianney Perchet:
Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits. NeurIPS 2021: 18577-18589
- [c39]Fabien Pesquerel, Hassan Saber, Odalric-Ambrym Maillard:
Stochastic bandits with groups of similar arms. NeurIPS 2021: 19461-19472
- [c38]Reda Ouhamma, Odalric-Ambrym Maillard, Vianney Perchet:
Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge. NeurIPS 2021: 24430-24441
- [c37]Hassan Saber, Léo Saci, Odalric-Ambrym Maillard, Audrey Durand:
Routine Bandits: Minimizing Regret on Recurring Problems. ECML/PKDD (1) 2021: 3-18
- [i26]Reda Ouhamma, Odalric Maillard, Vianney Perchet:
Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge. CoRR abs/2111.01602 (2021)
- [i25]Dorian Baudry, Patrick Saux, Odalric-Ambrym Maillard:
From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits. CoRR abs/2111.09724 (2021)
- [i24]Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard:
Indexed Minimum Empirical Divergence for Unimodal Bandits. CoRR abs/2112.01452 (2021)
- 2020
- [c36]Edouard Leurent, Odalric-Ambrym Maillard:
Monte-Carlo Graph Search: the Value of Merging Similar States. ACML 2020: 577-592
- [c35]Edouard Leurent, Denis V. Efimov, Odalric-Ambrym Maillard:
Robust-Adaptive Interval Predictive Control for Linear Uncertain Systems. CDC 2020: 1429-1434
- [c34]Réda Alami, Odalric Maillard, Raphaël Féraud:
Restarted Bayesian Online Change-point Detector achieves Optimal Detection Delay. ICML 2020: 211-221
- [c33]Hippolyte Bourel, Odalric Maillard, Mohammad Sadegh Talebi:
Tightening Exploration in Upper Confidence Reinforcement Learning. ICML 2020: 1056-1066
- [c32]Dorian Baudry, Emilie Kaufmann, Odalric-Ambrym Maillard:
Sub-sampling for Efficient Non-Parametric Bandit Exploration. NeurIPS 2020
- [c31]Edouard Leurent, Odalric-Ambrym Maillard, Denis V. Efimov:
Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs. NeurIPS 2020
- [i23]Edouard Leurent, Denis V. Efimov, Odalric-Ambrym Maillard:
Robust Estimation, Prediction and Control with Linear Dynamics and Generic Costs. CoRR abs/2002.10816 (2020)
- [i22]Hippolyte Bourel, Odalric-Ambrym Maillard, Mohammad Sadegh Talebi:
Tightening Exploration in Upper Confidence Reinforcement Learning. CoRR abs/2004.09656 (2020)
- [i21]Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard:
Forced-exploration free Strategies for Unimodal Bandits. CoRR abs/2006.16569 (2020)
- [i20]Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard:
Optimal Strategies for Graph-Structured Bandits. CoRR abs/2007.03224 (2020)
- [i19]Edouard Leurent, Denis V. Efimov, Odalric-Ambrym Maillard:
Robust-Adaptive Interval Predictive Control for Linear Uncertain Systems. CoRR abs/2007.10401 (2020)
- [i18]Mohammad Sadegh Talebi, Anders Jonsson, Odalric-Ambrym Maillard:
Improved Exploration in Factored Average-Reward MDPs. CoRR abs/2009.04575 (2020)
- [i17]Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux:
Is Standard Deviation the New Standard? Revisiting the Critic in Deep Policy Gradients. CoRR abs/2010.04440 (2020)
- [i16]Dorian Baudry, Emilie Kaufmann, Odalric-Ambrym Maillard:
Sub-sampling for Efficient Non-Parametric Bandit Exploration. CoRR abs/2010.14323 (2020)
- [i15]Dorian Baudry, Romain Gautron, Emilie Kaufmann, Odalric-Ambrym Maillard:
Thompson Sampling for CVaR Bandits. CoRR abs/2012.05754 (2020)
2010 – 2019
- 2019
- [b2]Odalric-Ambrym Maillard:
Mathematics of Statistical Sequential Decision Making. (Mathématique de la prise de décision séquentielle statistique). Lille University of Science and Technology, France, 2019
- [c30]Mahsa Asadi, Mohammad Sadegh Talebi, Hippolyte Bourel, Odalric-Ambrym Maillard:
Model-Based Reinforcement Learning Exploiting State-Action Equivalence. ACML 2019: 204-219
- [c29]Odalric-Ambrym Maillard:
Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds. ALT 2019: 610-632
- [c28]Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin:
Budgeted Reinforcement Learning in Continuous State Space. NeurIPS 2019: 9295-9305
- [c27]Ronald Ortner, Matteo Pirotta, Alessandro Lazaric, Ronan Fruit, Odalric-Ambrym Maillard:
Regret Bounds for Learning State Representations in Reinforcement Learning. NeurIPS 2019: 12717-12727
- [c26]Mohammad Sadegh Talebi, Odalric-Ambrym Maillard:
Learning Multiple Markov Chains via Adaptive Allocation. NeurIPS 2019: 13322-13332
- [c25]Edouard Leurent, Odalric-Ambrym Maillard:
Practical Open-Loop Optimistic Planning. ECML/PKDD (3) 2019: 69-85
- [i14]Edouard Leurent, Yann Blanco, Denis V. Efimov, Odalric-Ambrym Maillard:
Approximate Robust Control of Uncertain Dynamical Systems. CoRR abs/1903.00220 (2019)
- [i13]Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin:
Scaling up budgeted reinforcement learning. CoRR abs/1903.01004 (2019)
- [i12]Edouard Leurent, Odalric-Ambrym Maillard:
Practical Open-Loop Optimistic Planning. CoRR abs/1904.04700 (2019)
- [i11]M. Sadegh Talebi, Odalric-Ambrym Maillard:
Learning Multiple Markov Chains via Adaptive Allocation. CoRR abs/1905.11128 (2019)
- [i10]Subhojyoti Mukherjee, Odalric-Ambrym Maillard:
Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits. CoRR abs/1905.13159 (2019)
- [i9]Mahsa Asadi, Mohammad Sadegh Talebi, Hippolyte Bourel, Odalric-Ambrym Maillard:
Model-Based Reinforcement Learning Exploiting State-Action Equivalence. CoRR abs/1910.04077 (2019)
- 2018
- [j3]Audrey Durand, Odalric-Ambrym Maillard, Joelle Pineau:
Streaming kernel regression with provably adaptive mean, variance, and regularization. J. Mach. Learn. Res. 19: 17:1-17:34 (2018)
- [c24]Mohammad Sadegh Talebi, Odalric-Ambrym Maillard:
Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs. ALT 2018: 770-805
- [i8]Mohammad Sadegh Talebi, Odalric-Ambrym Maillard:
Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs. CoRR abs/1803.01626 (2018)
- 2017
- [j2]Robin Allesiardo, Raphaël Féraud, Odalric-Ambrym Maillard:
The non-stationary stochastic multi-armed bandit problem. Int. J. Data Sci. Anal. 3(4): 267-283 (2017)
- [c23]Odalric-Ambrym Maillard:
Boundary Crossing for General Exponential Families. ALT 2017: 151-184
- [c22]Jaouad Mourtada, Odalric-Ambrym Maillard:
Efficient tracking of a growing number of experts. ALT 2017: 517-539
- [c21]Borja Balle, Odalric-Ambrym Maillard:
Spectral Learning from a Single Trajectory under Finite-State Policies. ICML 2017: 361-370
- [i7]Audrey Durand, Odalric-Ambrym Maillard, Joelle Pineau:
Streaming kernel regression with provably adaptive mean, variance, and regularization. CoRR abs/1708.00768 (2017)
- [i6]Jaouad Mourtada, Odalric-Ambrym Maillard:
Efficient tracking of a growing number of experts. CoRR abs/1708.09811 (2017)
- 2016
- [c20]Akram Erraqabi, Michal Valko, Alexandra Carpentier, Odalric-Ambrym Maillard:
Pliable Rejection Sampling. ICML 2016: 2121-2129
- [i5]Aditya Gopalan, Odalric-Ambrym Maillard, Mohammadi Zaki:
Low-rank Bandits with Latent Mixtures. CoRR abs/1609.01508 (2016)
- [i4]Robin Allesiardo, Raphaël Féraud, Odalric-Ambrym Maillard:
Random Shuffling and Resets for the Non-stationary Stochastic Bandit Problem. CoRR abs/1609.02139 (2016)
- 2014
- [c19]Ronald Ortner, Odalric-Ambrym Maillard, Daniil Ryabko:
Selecting Near-Optimal Approximate State Representations in Reinforcement Learning. ALT 2014: 140-154
- [c18]Odalric-Ambrym Maillard, Shie Mannor:
Latent Bandits. ICML 2014: 136-144
- [c17]Odalric-Ambrym Maillard, Timothy A. Mann, Shie Mannor:
"How hard is my MDP?" The distribution-norm to the rescue. NIPS 2014: 1835-1843
- [c16]Akram Baransi, Odalric-Ambrym Maillard, Shie Mannor:
Sub-sampling for Multi-armed Bandits. ECML/PKDD (1) 2014: 115-131
- [i3]Ronald Ortner, Odalric-Ambrym Maillard, Daniil Ryabko:
Selecting Near-Optimal Approximate State Representations in Reinforcement Learning. CoRR abs/1405.2652 (2014)
- 2013
- [c15]Phuong Nguyen, Odalric-Ambrym Maillard, Daniil Ryabko, Ronald Ortner:
Competing with an Infinite Set of Models in Reinforcement Learning. AISTATS 2013: 463-471
- [c14]Odalric-Ambrym Maillard:
Robust Risk-Averse Stochastic Multi-armed Bandits. ALT 2013: 218-233
- [c13]Odalric-Ambrym Maillard, Phuong Nguyen, Ronald Ortner, Daniil Ryabko:
Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning. ICML (1) 2013: 543-551
- [i2]Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko:
Selecting the State-Representation in Reinforcement Learning. CoRR abs/1302.2552 (2013)
- [i1]Odalric-Ambrym Maillard, Phuong Nguyen, Ronald Ortner, Daniil Ryabko:
Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning. CoRR abs/1302.2553 (2013)
- 2012
- [j1]Odalric-Ambrym Maillard, Rémi Munos:
Linear regression with random projections. J. Mach. Learn. Res. 13: 2735-2772 (2012)
- [c12]Odalric-Ambrym Maillard:
Hierarchical Optimistic Region Selection driven by Curiosity. NIPS 2012: 1457-1465
- [c11]Alexandra Carpentier, Odalric-Ambrym Maillard:
Online allocation and homogeneous partitioning for piecewise constant mean-approximation. NIPS 2012: 1970-1978
- 2011
- [b1]Odalric-Ambrym Maillard:
Sequential Learning: Bandits, Statistics and Reinforcement. (Apprentissage séquentiel : Bandits, Statistique et Renforcement). Lille University of Science and Technology, France, 2011
- [c10]Alexandra Carpentier, Odalric-Ambrym Maillard, Rémi Munos:
Sparse Recovery with Brownian Sensing. NIPS 2011: 1782-1790
- [c9]Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko:
Selecting the State-Representation in Reinforcement Learning. NIPS 2011: 2627-2635
- [c8]Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz:
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences. COLT 2011: 497-514
- [c7]Odalric-Ambrym Maillard, Rémi Munos:
Adaptive Bandits: Towards the best history-dependent strategy. AISTATS 2011: 570-578
- 2010
- [c6]Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric-Ambrym Maillard, Rémi Munos:
LSTD with Random Projections. NIPS 2010: 721-729
- [c5]Odalric-Ambrym Maillard, Rémi Munos:
Scrambled Objects for Least-Squares Regression. NIPS 2010: 1549-1557
- [c4]Odalric-Ambrym Maillard, Rémi Munos:
Online Learning in Adversarial Lipschitz Environments. ECML/PKDD (2) 2010: 305-320
- [c3]Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric, Mohammad Ghavamzadeh:
Finite-sample Analysis of Bellman Residual Minimization. ACML 2010: 299-314
2000 – 2009
- 2009
- [c2]Odalric-Ambrym Maillard, Nicolas Vayatis:
Complexity versus Agreement for Many Views. ALT 2009: 232-246
- [c1]Odalric-Ambrym Maillard, Rémi Munos:
Compressed Least-Squares Regression. NIPS 2009: 1213-1221
Coauthor Index
- Mohammad Sadegh Talebi (aka: M. Sadegh Talebi)
last updated on 2024-09-13 00:44 CEST by the dblp team
all metadata released as open data under CC0 1.0 license