Bilal Piot
Person information
- affiliation: Lille University of Science and Technology, Research center in Computer Science, Signal and Automatic Control (CRIStAL)
2020 – today
- 2024
- [c38] Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Rémi Munos, Mark Rowland, Michal Valko, Daniele Calandriello: A General Theoretical Paradigm to Understand Learning from Human Preferences. AISTATS 2024: 4447-4455
- [c37] Alaa Saade, Steven Kapturowski, Daniele Calandriello, Charles Blundell, Pablo Sprechmann, Leopoldo Sarra, Oliver Groth, Michal Valko, Bilal Piot: Unlocking the Power of Representations in Long-term Novelty-based Exploration. ICLR 2024
- [c36] Daniele Calandriello, Zhaohan Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot: Human Alignment of Large Language Models through Online Preference Optimisation. ICML 2024
- [c35] Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot: Nash Learning from Human Feedback. ICML 2024
- [c34] Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot: Generalized Preference Optimization: A Unified Approach to Offline Alignment. ICML 2024
- [i38] Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Ramé, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel: Direct Language Model Alignment from Online AI Feedback. CoRR abs/2402.04792 (2024)
- [i37] Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot: Generalized Preference Optimization: A Unified Approach to Offline Alignment. CoRR abs/2402.05749 (2024)
- [i36] Daniele Calandriello, Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot: Human Alignment of Large Language Models through Online Preference Optimisation. CoRR abs/2403.08635 (2024)
- [i35] Lior Shani, Aviv Rosenberg, Asaf B. Cassel, Oran Lang, Daniele Calandriello, Avital Zipori, Hila Noga, Orgad Keller, Bilal Piot, Idan Szpektor, Avinatan Hassidim, Yossi Matias, Rémi Munos: Multi-turn Reinforcement Learning from Preference Human Feedback. CoRR abs/2405.14655 (2024)
- [i34] Pierre Harvey Richemond, Yunhao Tang, Daniel Guo, Daniele Calandriello, Mohammad Gheshlaghi Azar, Rafael Rafailov, Bernardo Ávila Pires, Eugene Tarassov, Lucas Spangher, Will Ellsworth, Aliaksei Severyn, Jonathan Mallinson, Lior Shani, Gil Shamir, Rishabh Joshi, Tianqi Liu, Rémi Munos, Bilal Piot: Offline Regularised Reinforcement Learning for Large Language Models Alignment. CoRR abs/2405.19107 (2024)
- [i33] Morgane Rivière, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, Johan Ferret, Peter Liu, Pouya Tafti, Abe Friesen, Michelle Casbon, Sabela Ramos, Ravin Kumar, Charline Le Lan, Sammy Jerome, Anton Tsitsulin, Nino Vieillard, Piotr Stanczyk, Sertan Girgin, Nikola Momchev, Matt Hoffman, Shantanu Thakoor, Jean-Bastien Grill, Behnam Neyshabur, Olivier Bachem, Alanna Walton, Aliaksei Severyn, Alicia Parrish, Aliya Ahmad, Allen Hutchison, Alvin Abdagic, Amanda Carl, Amy Shen, Andy Brock, Andy Coenen, Anthony Laforge, Antonia Paterson, Ben Bastian, Bilal Piot, Bo Wu, Brandon Royal, Charlie Chen, Chintu Kumar, Chris Perry, Chris Welty, Christopher A. Choquette-Choo, Danila Sinopalnikov, David Weinberger, Dimple Vijaykumar, Dominika Rogozinska, Dustin Herbison, Elisa Bandy, Emma Wang, Eric Noland, Erica Moreira, Evan Senter, Evgenii Eltyshev, Francesco Visin, Gabriel Rasskin, Gary Wei, Glenn Cameron, Gus Martins, Hadi Hashemi, Hanna Klimczak-Plucinska, Harleen Batra, Harsh Dhand, Ivan Nardini, Jacinda Mein, Jack Zhou, James Svensson, Jeff Stanway, Jetha Chan, Jin Peng Zhou, Joana Carrasqueira, Joana Iljazi, Jocelyn Becker, Joe Fernandez, Joost van Amersfoort, Josh Gordon, Josh Lipschultz, Josh Newlan, Ju-yeong Ji, Kareem Mohamed, Kartikeya Badola, Kat Black, Katie Millican, Keelin McDonell, Kelvin Nguyen, Kiranbir Sodhia, Kish Greene, Lars Lowe Sjösund, Lauren Usui, Laurent Sifre, Lena Heuermann, Leticia Lago, Lilly McNealus: Gemma 2: Improving Open Language Models at a Practical Size. CoRR abs/2408.00118 (2024)
- [i32] Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu: Building Math Agents with Multi-Turn Iterative Preference Learning. CoRR abs/2409.02392 (2024)
- [i31] Tianqi Liu, Wei Xiong, Jie Ren, Lichang Chen, Junru Wu, Rishabh Joshi, Yang Gao, Jiaming Shen, Zhen Qin, Tianhe Yu, Daniel Sohn, Anastasiia Makarova, Jeremiah Z. Liu, Yuan Liu, Bilal Piot, Abe Ittycheriah, Aviral Kumar, Mohammad Saleh: RRM: Robust Reward Model Training Mitigates Reward Hacking. CoRR abs/2409.13156 (2024)
- [i30] Abbas Abdolmaleki, Bilal Piot, Bobak Shahriari, Jost Tobias Springenberg, Tim Hertweck, Rishabh Joshi, Junhyuk Oh, Michael Bloesch, Thomas Lampe, Nicolas Heess, Jonas Buchli, Martin A. Riedmiller: Preference Optimization as Probabilistic Inference. CoRR abs/2410.04166 (2024)
- 2023
- [c33] Pierre Harvey Richemond, Allison C. Tam, Yunhao Tang, Florian Strub, Bilal Piot, Felix Hill: The Edge of Orthogonality: A Simple View of What Makes BYOL Tick. ICML 2023: 29063-29081
- [c32] Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko: Understanding Self-Predictive Learning for Reinforcement Learning. ICML 2023: 33632-33656
- [i29] Pierre H. Richemond, Allison C. Tam, Yunhao Tang, Florian Strub, Bilal Piot, Felix Hill: The Edge of Orthogonality: A Simple View of What Makes BYOL Tick. CoRR abs/2302.04817 (2023)
- [i28] Alaa Saade, Steven Kapturowski, Daniele Calandriello, Charles Blundell, Pablo Sprechmann, Leopoldo Sarra, Oliver Groth, Michal Valko, Bilal Piot: Unlocking the Power of Representations in Long-term Novelty-based Exploration. CoRR abs/2305.01521 (2023)
- [i27] Mohammad Gheshlaghi Azar, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos: A General Theoretical Paradigm to Understand Learning from Human Preferences. CoRR abs/2310.12036 (2023)
- [i26] Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot: Nash Learning from Human Feedback. CoRR abs/2312.00886 (2023)
- 2022
- [c31] Rahma Chaabouni, Florian Strub, Florent Altché, Eugene Tarassov, Corentin Tallec, Elnaz Davoodi, Kory Wallace Mathewson, Olivier Tieleman, Angeliki Lazaridou, Bilal Piot: Emergent Communication at Scale. ICLR 2022
- [c30] Zhaohan Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot: BYOL-Explore: Exploration by Bootstrapped Prediction. NeurIPS 2022
- [d1] Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls: Figure Data for the paper "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning". Zenodo, 2022
- [i25] Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot: BYOL-Explore: Exploration by Bootstrapped Prediction. CoRR abs/2206.08332 (2022)
- [i24] Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls: Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022)
- [i23] Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko: Understanding Self-Predictive Learning for Reinforcement Learning. CoRR abs/2212.03319 (2022)
- 2021
- [i22] Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Ávila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos: Geometric Entropic Exploration. CoRR abs/2101.02055 (2021)
- [i21] Pedro A. Ortega, Markus Kunesch, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Joel Veness, Jonas Buchli, Jonas Degrave, Bilal Piot, Julien Pérolat, Tom Everitt, Corentin Tallec, Emilio Parisotto, Tom Erez, Yutian Chen, Scott E. Reed, Marcus Hutter, Nando de Freitas, Shane Legg: Shaking the foundations: delusions in sequence models for interaction and control. CoRR abs/2110.10819 (2021)
- 2020
- [c29] Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, Charles Blundell: Never Give Up: Learning Directed Exploration Strategies. ICLR 2020
- [c28] Adrià Puigdomènech Badia, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Charles Blundell: Agent57: Outperforming the Atari Human Benchmark. ICML 2020: 507-517
- [c27] Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar: Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. ICML 2020: 3875-3886
- [c26] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. NeurIPS 2020
- [i20] Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, Charles Blundell: Never Give Up: Learning Directed Exploration Strategies. CoRR abs/2002.06038 (2020)
- [i19] Adrià Puigdomènech Badia, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Charles Blundell: Agent57: Outperforming the Atari Human Benchmark. CoRR abs/2003.13350 (2020)
- [i18] Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar: Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. CoRR abs/2004.14646 (2020)
- [i17] Matt Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Feryal M. P. Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, Kate Baumli, Sarah Henderson, Alexander Novikov, Sergio Gómez Colmenarejo, Serkan Cabi, Çaglar Gülçehre, Tom Le Paine, Andrew Cowie, Ziyu Wang, Bilal Piot, Nando de Freitas: Acme: A Research Framework for Distributed Reinforcement Learning. CoRR abs/2006.00979 (2020)
- [i16] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko: Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. CoRR abs/2006.07733 (2020)
- [i15] Pierre H. Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel L. Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko: BYOL works even without batch statistics. CoRR abs/2010.10241 (2020)
2010 – 2019
- 2019
- [c25] Diana Borsa, Nicolas Heess, Bilal Piot, Siqi Liu, Leonard Hasenclever, Rémi Munos, Olivier Pietquin: Observational Learning by Reinforcement Learning. AAMAS 2019: 1117-1124
- [c24] Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, Rémi Munos: Hindsight Credit Assignment. NeurIPS 2019: 12467-12476
- [i14] Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Jean-Bastien Grill, Florent Altché, Rémi Munos: World Discovery Models. CoRR abs/1902.07685 (2019)
- [i13] Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Rémi Munos: Hindsight Credit Assignment. CoRR abs/1912.02503 (2019)
- 2018
- [c23] Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver: Rainbow: Combining Improvements in Deep Reinforcement Learning. AAAI 2018: 3215-3222
- [c22] Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Ian Osband, Gabriel Dulac-Arnold, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys: Deep Q-learning From Demonstrations. AAAI 2018: 3223-3230
- [c21] Julien Pérolat, Bilal Piot, Olivier Pietquin: Actor-Critic Fictitious Play in Simultaneous Move Multistage Games. AISTATS 2018: 919-928
- [c20] Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Matteo Hessel, Ian Osband, Alex Graves, Volodymyr Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg: Noisy Networks For Exploration. ICLR (Poster) 2018
- [c19] Audrunas Gruslys, Will Dabney, Mohammad Gheshlaghi Azar, Bilal Piot, Marc G. Bellemare, Rémi Munos: The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning. ICLR (Poster) 2018
- [i12] Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Vecerík, Matteo Hessel, Rémi Munos, Olivier Pietquin: Observe and Look Further: Achieving Consistent Performance on Atari. CoRR abs/1805.11593 (2018)
- [i11] Julien Pérolat, Mateusz Malinowski, Bilal Piot, Olivier Pietquin: Playing the Game of Universal Adversarial Perturbations. CoRR abs/1809.07802 (2018)
- [i10] Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Toby Pohlen, Rémi Munos: Neural Predictive Belief Representations. CoRR abs/1811.06407 (2018)
- 2017
- [j2] Bilal Piot, Matthieu Geist, Olivier Pietquin: Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning. IEEE Trans. Neural Networks Learn. Syst. 28(8): 1814-1826 (2017)
- [c18] Julien Pérolat, Florian Strub, Bilal Piot, Olivier Pietquin: Learning Nash Equilibrium for General-Sum Markov Games from Batch Data. AISTATS 2017: 232-241
- [c17] Florian Strub, Harm de Vries, Jérémie Mary, Bilal Piot, Aaron C. Courville, Olivier Pietquin: End-to-end optimization of goal-driven and visually grounded dialogue systems. IJCAI 2017: 2765-2771
- [c16] Matthieu Geist, Bilal Piot, Olivier Pietquin: Is the Bellman residual a bad proxy? NIPS 2017: 3205-3214
- [i9] Florian Strub, Harm de Vries, Jérémie Mary, Bilal Piot, Aaron C. Courville, Olivier Pietquin: End-to-end optimization of goal-driven and visually grounded dialogue systems. CoRR abs/1703.05423 (2017)
- [i8] Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys: Learning from Demonstrations for Real World Reinforcement Learning. CoRR abs/1704.03732 (2017)
- [i7] Diana Borsa, Bilal Piot, Rémi Munos, Olivier Pietquin: Observational Learning by Reinforcement Learning. CoRR abs/1706.06617 (2017)
- [i6] Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg: Noisy Networks for Exploration. CoRR abs/1706.10295 (2017)
- [i5] Matej Vecerík, Todd Hester, Jonathan Scholz, Fumin Wang, Olivier Pietquin, Bilal Piot, Nicolas Heess, Thomas Rothörl, Thomas Lampe, Martin A. Riedmiller: Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards. CoRR abs/1707.08817 (2017)
- [i4] Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Daniel Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver: Rainbow: Combining Improvements in Deep Reinforcement Learning. CoRR abs/1710.02298 (2017)
- 2016
- [c15] Julien Pérolat, Bilal Piot, Bruno Scherrer, Olivier Pietquin: On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games. AISTATS 2016: 893-901
- [c14] Layla El Asri, Bilal Piot, Matthieu Geist, Romain Laroche, Olivier Pietquin: Score-based Inverse Reinforcement Learning. AAMAS 2016: 457-465
- [c13] Julien Pérolat, Bilal Piot, Matthieu Geist, Bruno Scherrer, Olivier Pietquin: Softened Approximate Policy Iteration for Markov Games. ICML 2016: 1860-1868
- [i3] Bilal Piot, Matthieu Geist, Olivier Pietquin: Difference of Convex Functions Programming Applied to Control with Expert Data. CoRR abs/1606.01128 (2016)
- [i2] Matthieu Geist, Bilal Piot, Olivier Pietquin: Should one minimize the expected Bellman residual or maximize the mean value? CoRR abs/1606.07636 (2016)
- [i1] Julien Pérolat, Florian Strub, Bilal Piot, Olivier Pietquin: Learning Nash Equilibrium for General-Sum Markov Games from Batch Data. CoRR abs/1606.08718 (2016)
- 2015
- [c12] Bilal Piot, Olivier Pietquin, Matthieu Geist: Imitation Learning Applied to Embodied Conversational Agents. MLIS@ICML 2015: 1-5
- [c11] Julien Pérolat, Bruno Scherrer, Bilal Piot, Olivier Pietquin: Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games. ICML 2015: 1321-1329
- [c10] Thibaut Munzer, Bilal Piot, Matthieu Geist, Olivier Pietquin, Manuel Lopes: Inverse Reinforcement Learning in Relational Domains. IJCAI 2015: 3735-3741
- 2014
- [c9] Bilal Piot, Matthieu Geist, Olivier Pietquin: Boosted and reward-regularized classification for apprenticeship learning. AAMAS 2014: 1249-1256
- [c8] Bilal Piot, Olivier Pietquin, Matthieu Geist: Predicting when to laugh with structured classification. INTERSPEECH 2014: 1786-1790
- [c7] Bilal Piot, Matthieu Geist, Olivier Pietquin: Difference of Convex Functions Programming for Reinforcement Learning. NIPS 2014: 2519-2527
- [c6] Bilal Piot, Matthieu Geist, Olivier Pietquin: Boosted Bellman Residual Minimization Handling Expert Demonstrations. ECML/PKDD (2) 2014: 549-564
- 2013
- [j1] Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin: Classification structurée pour l'apprentissage par renforcement inverse. Rev. d'Intelligence Artif. 27(2): 155-169 (2013)
- [c5] Radoslaw Niewiadomski, Jennifer Hofmann, Jérôme Urbain, Tracey Platt, Johannes Wagner, Bilal Piot, Hüseyin Çakmak, Sathish Pammi, Tobias Baur, Stéphane Dupont, Matthieu Geist, Florian Lingenfelser, Gary McKeown, Olivier Pietquin, Willibald Ruch: Laugh-aware virtual agent and its impact on user amusement. AAMAS 2013: 619-626
- [c4] Maurizio Mancini, Laurent Ach, Emeline Bantegnie, Tobias Baur, Nadia Berthouze, Debajyoti Datta, Yu Ding, Stéphane Dupont, Harry J. Griffin, Florian Lingenfelser, Radoslaw Niewiadomski, Catherine Pelachaud, Olivier Pietquin, Bilal Piot, Jérôme Urbain, Gualtiero Volpe, Johannes Wagner: Laugh When You're Winning. eNTERFACE 2013: 50-79
- [c3] Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin: A Cascaded Supervised Learning Approach to Inverse Reinforcement Learning. ECML/PKDD (1) 2013: 1-16
- [c2] Bilal Piot, Matthieu Geist, Olivier Pietquin: Learning from Demonstrations: Is It Worth Estimating a Reward Function? ECML/PKDD (1) 2013: 17-32
- 2012
- [c1] Edouard Klein, Matthieu Geist, Bilal Piot, Olivier Pietquin: Inverse Reinforcement Learning through Structured Classification. NIPS 2012: 1016-1024