Eiji Uchibe
2020 – today
- 2024
- [j35] Hamed Jabbari Asl, Eiji Uchibe: Estimating cost function of expert players in differential games: A model-based method and its data-driven extension. Expert Syst. Appl. 255: 124687 (2024)
- [j34] Hamed Jabbari Asl, Eiji Uchibe: Online estimation of objective function for continuous-time deterministic systems. Neural Networks 172: 106116 (2024)
- [j33] Hamed Jabbari Asl, Eiji Uchibe: Inverse reinforcement learning methods for linear differential games. Syst. Control. Lett. 193: 105936 (2024)
- [c37] Jiexin Wang, Eiji Uchibe: Reward-Punishment Reinforcement Learning with Maximum Entropy. IJCNN 2024: 1-7
- [i11] Jiexin Wang, Eiji Uchibe: Reward-Punishment Reinforcement Learning with Maximum Entropy. CoRR abs/2405.11784 (2024)
- [i10] Satoshi Yagi, Mitsunori Tada, Eiji Uchibe, Suguru Kanoga, Takamitsu Matsubara, Jun Morimoto: Unsupervised Neural Motion Retargeting for Humanoid Teleoperation. CoRR abs/2406.00727 (2024)
- 2023
- [j32] Hamed Jabbari Asl, Eiji Uchibe: Online Reinforcement Learning Control of Nonlinear Dynamic Systems: A State-action Value Function Based Solution. Neurocomputing 544: 126291 (2023)
- 2022
- [j31] Yutaka Matsuo, Yann LeCun, Maneesh Sahani, Doina Precup, David Silver, Masashi Sugiyama, Eiji Uchibe, Jun Morimoto: Deep learning, reinforcement learning, and world models. Neural Networks 152: 267-275 (2022)
- [j30] Tomoya Yamanokuchi, Yuhwan Kwon, Yoshihisa Tsurumine, Eiji Uchibe, Jun Morimoto, Takamitsu Matsubara: Randomized-to-Canonical Model Predictive Control for Real-World Visual Robotic Manipulation. IEEE Robotics Autom. Lett. 7(4): 8964-8971 (2022)
- [j29] Eiji Uchibe: Model-Based Imitation Learning Using Entropy Regularization of Model and Policy. IEEE Robotics Autom. Lett. 7(4): 10922-10929 (2022)
- [c36] Hamed Jabbari Asl, Eiji Uchibe: Online Data-Driven Inverse Reinforcement Learning for Deterministic Systems. SSCI 2022: 884-889
- [i9] Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara: q-Munchausen Reinforcement Learning. CoRR abs/2205.07467 (2022)
- [i8] Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara: Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning. CoRR abs/2205.07885 (2022)
- [i7] Eiji Uchibe: Model-Based Imitation Learning Using Entropy Regularization of Model and Policy. CoRR abs/2206.10101 (2022)
- [i6] Tomoya Yamanokuchi, Yuhwan Kwon, Yoshihisa Tsurumine, Eiji Uchibe, Jun Morimoto, Takamitsu Matsubara: Randomized-to-Canonical Model Predictive Control for Real-world Visual Robotic Manipulation. CoRR abs/2207.01840 (2022)
- 2021
- [j28] Jiexin Wang, Stefan Elfwing, Eiji Uchibe: Modular deep reinforcement learning from reward and punishment for robot navigation. Neural Networks 135: 115-126 (2021)
- [j27] Eiji Uchibe, Kenji Doya: Forward and inverse reinforcement learning sharing network weights and hyperparameters. Neural Networks 144: 138-153 (2021)
- [j26] Tom Macpherson, Masayuki Matsumoto, Hiroaki Gomi, Jun Morimoto, Eiji Uchibe, Takatoshi Hikida: Parallel and hierarchical neural mechanisms for adaptive and predictive behavioral control. Neural Networks 144: 507-521 (2021)
- 2020
- [i5] Eiji Uchibe, Kenji Doya: Imitation learning based on entropy-regularized forward and inverse reinforcement learning. CoRR abs/2008.07284 (2020)
2010 – 2019
- 2019
- [j25] Shota Ohnishi, Eiji Uchibe, Yotaro Yamaguchi, Kosuke Nakanishi, Yuji Yasui, Shin Ishii: Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning. Frontiers Neurorobotics 13: 103 (2019)
- [j24] Yoshihisa Tsurumine, Yunduan Cui, Eiji Uchibe, Takamitsu Matsubara: Deep reinforcement learning with smooth policy update: Application to robotic cloth manipulation. Robotics Auton. Syst. 112: 72-83 (2019)
- [c35] Tadashi Kozuno, Eiji Uchibe, Kenji Doya: Theoretical Analysis of Efficiency and Robustness of Softmax and Gap-Increasing Operators in Reinforcement Learning. AISTATS 2019: 2995-3003
- 2018
- [j23] Ken Kinjo, Eiji Uchibe, Kenji Doya: Robustness of linearly solvable Markov games employing inaccurate dynamics model. Artif. Life Robotics 23(1): 1-9 (2018)
- [j22] Eiji Uchibe: Cooperative and Competitive Reinforcement and Imitation Learning for a Mixture of Heterogeneous Learning Modules. Frontiers Neurorobotics 12: 61 (2018)
- [j21] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Networks 107: 3-11 (2018)
- [j20] Eiji Uchibe: Model-Free Deep Inverse Reinforcement Learning by Logistic Regression. Neural Process. Lett. 47(3): 891-905 (2018)
- [c34] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Online meta-learning by parallel algorithm competition. GECCO 2018: 426-433
- [c33] Eiji Uchibe: Efficient sample reuse in policy search by multiple importance sampling. GECCO 2018: 545-552
- [c32] Jiexin Wang, Stefan Elfwing, Eiji Uchibe: Deep Reinforcement Learning by Parallelizing Reward and Punishment using the MaxPain Architecture. ICDL-EPIROB 2018: 175-180
- [i4] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Unbounded Output Networks for Classification. CoRR abs/1807.09443 (2018)
- 2017
- [j19] Jiexin Wang, Eiji Uchibe, Kenji Doya: Adaptive Baseline Enhances EM-Based Policy Search: Validation in a View-Based Positioning Task of a Smartphone Balancer. Frontiers Neurorobotics 11: 1 (2017)
- [c31] Chris Reinke, Eiji Uchibe, Kenji Doya: Average Reward Optimization with Multiple Discounting Reinforcement Learners. ICONIP (1) 2017: 789-800
- [c30] Yoshihisa Tsurumine, Yunduan Cui, Eiji Uchibe, Takamitsu Matsubara: Deep dynamic policy programming for robot control with raw images. IROS 2017: 1545-1550
- [i3] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. CoRR abs/1702.03118 (2017)
- [i2] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Online Meta-learning by Parallel Algorithm Competition. CoRR abs/1702.07490 (2017)
- [i1] Tadashi Kozuno, Eiji Uchibe, Kenji Doya: Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming. CoRR abs/1710.10866 (2017)
- 2016
- [j18] Jiexin Wang, Eiji Uchibe, Kenji Doya: EM-based policy hyper parameter exploration: application to standing and balancing of a two-wheeled smartphone robot. Artif. Life Robotics 21(1): 125-131 (2016)
- [j17] Stefan Elfwing, Eiji Uchibe, Kenji Doya: From free energy to expected energy: Improving energy-based value function approximation in reinforcement learning. Neural Networks 84: 17-27 (2016)
- [c29] Qiong Huang, Eiji Uchibe, Kenji Doya: Emergence of communication among reinforcement learning agents under coordination environment. ICDL-EPIROB 2016: 57-58
- [c28] Eiji Uchibe: Deep Inverse Reinforcement Learning by Logistic Regression. ICONIP (1) 2016: 23-31
- 2015
- [j16] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Expected energy-based restricted Boltzmann machine for classification. Neural Networks 64: 29-38 (2015)
- 2014
- [c27] Eiji Uchibe, Kenji Doya: Inverse reinforcement learning using Dynamic Policy Programming. ICDL-EPIROB 2014: 222-228
- [c26] Eiji Uchibe, Kenji Doya: Combining learned controllers to achieve new goals based on linearly solvable MDPs. ICRA 2014: 5252-5259
- 2013
- [j15] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces. Frontiers Neurorobotics 7: 3 (2013)
- [j14] Ken Kinjo, Eiji Uchibe, Kenji Doya: Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task. Frontiers Neurorobotics 7: 7 (2013)
- [c25] Naoto Yoshida, Eiji Uchibe, Kenji Doya: Reinforcement learning with state-dependent discount factor. ICDL-EPIROB 2013: 1-6
- 2011
- [j13] Stefan Elfwing, Eiji Uchibe, Kenji Doya, Henrik I. Christensen: Darwinian embodied evolution of the learning ability for survival. Adapt. Behav. 19(2): 101-120 (2011)
- 2010
- [j12] Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Jan Peters, Kenji Doya: Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning. Neural Comput. 22(2): 342-376 (2010)
- [c24] Stefan Elfwing, Makoto Otsuka, Eiji Uchibe, Kenji Doya: Free-Energy Based Reinforcement Learning for Vision-Based Navigation with High-Dimensional Sensory Inputs. ICONIP (1) 2010: 215-222
2000 – 2009
- 2009
- [c23] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Emergence of Different Mating Strategies in Artificial Embodied Evolution. ICONIP (2) 2009: 638-647
- [c22] Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Kenji Doya: A Generalized Natural Actor-Critic Algorithm. NIPS 2009: 1312-1320
- [p1] Stefan Elfwing, Eiji Uchibe, Kenji Doya: Co-evolution of Rewards and Meta-parameters in Embodied Evolution. Creating Brain-Like Intelligence 2009: 278-302
- 2008
- [j11] Stefan Elfwing, Eiji Uchibe, Kenji Doya, Henrik I. Christensen: Co-evolution of Shaping Rewards and Meta-Parameters in Reinforcement Learning. Adapt. Behav. 16(6): 400-412 (2008)
- [j10] Takashi Sato, Eiji Uchibe, Kenji Doya: Learning how, what, and whether to communicate: emergence of protocommunication in reinforcement learning agents. Artif. Life Robotics 12(1-2): 70-74 (2008)
- [j9] Tetsuro Morimura, Eiji Uchibe, Kenji Doya: Natural actor-critic with baseline adjustment for variance reduction. Artif. Life Robotics 13(1): 275-279 (2008)
- [j8] Eiji Uchibe, Kenji Doya: Finding intrinsic rewards by embodied evolution and constrained reinforcement learning. Neural Networks 21(10): 1447-1455 (2008)
- [c21] Takumi Kamioka, Eiji Uchibe, Kenji Doya: NeuroEvolution Based on Reusable and Hierarchical Modular Representation. ICONIP (1) 2008: 22-31
- [c20] Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Kenji Doya: A New Natural Policy Gradient by Stationary Distribution Metric. ECML/PKDD (2) 2008: 82-97
- 2007
- [j7] Stefan Elfwing, Eiji Uchibe, Kenji Doya, Henrik I. Christensen: Evolutionary Development of Hierarchical Learning Structures. IEEE Trans. Evol. Comput. 11(2): 249-264 (2007)
- [c19] Eiji Uchibe, Kenji Doya: Finding Exploratory Rewards by Embodied Evolution and Constrained Reinforcement Learning in the Cyber Rodents. ICONIP (2) 2007: 167-176
- 2006
- [j6] Eiji Uchibe, Minoru Asada: Incremental Coevolution With Competitive and Cooperative Tasks in a Multirobot Environment. Proc. IEEE 94(7): 1412-1424 (2006)
- 2005
- [j5] Kenji Doya, Eiji Uchibe: The Cyber Rodent Project: Exploration of Adaptive Mechanisms for Self-Preservation and Self-Reproduction. Adapt. Behav. 13(2): 149-160 (2005)
- [c18] Stefan Elfwing, Eiji Uchibe, Kenji Doya, Henrik I. Christensen: Biologically inspired embodied evolution of survival. Congress on Evolutionary Computation 2005: 2210-2216
- 2004
- [c17] Stefan Elfwing, Eiji Uchibe, Kenji Doya, Henrik I. Christensen: Multi-agent reinforcement learning: using macro actions to learn a mating task. IROS 2004: 3164-3169
- 2003
- [c16] Stefan Elfwing, Eiji Uchibe, Kenji Doya: An Evolutionary Approach to Automatic Construction of the Structure in Hierarchical Reinforcement Learning. GECCO 2003: 507-509
- 2002
- [j4] Eiji Uchibe, Masakazu Yanase, Minoru Asada: Behavior generation for a mobile robot based on the adaptive fitness function. Robotics Auton. Syst. 40(2-3): 69-77 (2002)
- 2001
- [j3] Minoru Asada, Eiji Uchibe: Multiagent Learning towards RoboCup. New Gener. Comput. 19(2): 103-120 (2001)
- [c15] Eiji Uchibe, Tatsunori Kato, Koh Hosoda, Minoru Asada: Dynamic Task Assignment in a Multiagent/Multitask Environment based on Module Conflict Resolution. ICRA 2001: 3987-3992
- [c14] Eiji Uchibe, Masakazu Yanase, Minoru Asada: Evolutionary Behavior Selection with Activation/Termination Constraints. RoboCup 2001: 234-243
- 2000
- [c13] Yasutake Takahashi, Eiji Uchibe, Takahashi Tamura, Masakazu Yanase, Shoichi Ikenoue, Shujiro Inui, Minoru Asada: Osaka University "Trackies 2000". RoboCup 2000: 607-610
1990 – 1999
- 1999
- [j2] Minoru Asada, Eiji Uchibe, Koh Hosoda: Cooperative Behavior Acquisition for Mobile Robots in Dynamically Changing Real Worlds Via Vision-Based Reinforcement Learning and Development. Artif. Intell. 110(2): 275-292 (1999)
- [c12] Eiji Uchibe, Minoru Asada: Multiple Reward Criterion for Cooperative Behavior Acquisition in a Muliagent Environment. RoboCup 1999: 519-530
- [c11] Sho'ji Suzuki, Tatsunori Kato, Hiroshi Ishizuka, Hiroyoshi Kawanishi, Takashi Tamura, Masakazu Yanase, Yasutake Takahashi, Eiji Uchibe, Minoru Asada: The Team Description of Osaka University "Trackies-99". RoboCup 1999: 750-753
- 1998
- [j1] Minoru Asada, Sho'ji Suzuki, Yasutake Takahashi, Eiji Uchibe, Masateru Nakamura, Chizuko Mishima, Hiroshi Ishizuka, Tatsunori Kato: TRACKIES: RoboCup-97 Middle-Size League World Cochampion. AI Mag. 19(3): 71-78 (1998)
- [c10] Eiji Uchibe, Minoru Asada, Koh Hosoda: State Space Construction for Behavior Acquisition in Multi Agent Environments with Vision and Action. ICCV 1998: 870-875
- [c9] Eiji Uchibe, Minoru Asada, Koh Hosoda: Cooperative Behavior Acquisition in Multi Mobile Robots Environment by Reinforcement Learning Based on State Vector Estimation. ICRA 1998: 1558-1563
- [c8] Eiji Uchibe, Minoru Asada, Koh Hosoda: Environmental Complexity Control for Vision-Based Learning Mobile Robot. ICRA 1998: 1865-1870
- [c7] Eiji Uchibe, Masateru Nakamura, Minoru Asada: Co-evolution for cooperative behavior acquisition in a multiple mobile robot environment. IROS 1998: 425-430
- [c6] Eiji Uchibe, Masateru Nakamura, Minoru Asada: Cooperative Behavior Acquisition in a Multiple Mobile Robot Environment by Co-evolution. RoboCup 1998: 273-285
- [c5] Sho'ji Suzuki, Tatsunori Kato, Hiroshi Ishizuka, Yasutake Takahashi, Eiji Uchibe, Minoru Asada: An Application of Vision-Based Learning in RoboCup for a Real Robot with an Omnidirectional Vision System and the Team Description of Osaka University "Trackies". RoboCup 1998: 316-325
- 1997
- [c4] Eiji Uchibe, Minoru Asada, Koh Hosoda: Vision Based State Space Construction for Learning Mobile Robots in Multi-agent Environments. EWLR 1997: 62-78
- [c3] Sho'ji Suzuki, Yasutake Takahashi, Eiji Uchibe, Masateru Nakamura, Chizuko Mishima, Hiroshi Ishizuka, Tatsunori Kato, Minoru Asada: Vision-Based Robot Learning Towards RoboCup: Osaka University "Trackies". RoboCup 1997: 305-319
- 1996
- [c2] Eiji Uchibe, Minoru Asada, Koh Hosoda: Behavior coordination for a mobile robot using modular reinforcement learning. IROS 1996: 1329-1336
- 1994
- [c1] Minoru Asada, Eiji Uchibe, Shoichi Noda, Sukoya Tawaratsumida, Koh Hosoda: Coordination of multiple behaviors acquired by a vision-based reinforcement learning. IROS 1994: 917-924