Room 1020 Emerging Tech. Building (ETB)
W. Bradley Knox
Media Lab, MIT
Software-based control systems are often deployed in the service of end users who lack programming skills. Examples of these control systems (i.e., autonomous agents) include personal robots and game-playing agents. In this talk, I’ll review my research on TAMER, a general framework for control algorithms that learn from real-valued signals of user approval and disapproval (i.e., reward) through simple human-machine interfaces that do not require technical expertise on the part of the user. These algorithms provide two distinct benefits: (1) giving general users the ability to specify correct behavior for a control system and (2) incorporating the wealth of available human task expertise to increase learning speed on tasks with predefined objective functions. I will focus in particular on the question: given the reward that human trainers actually provide, what task framing and reinforcement learning objective will lead to high performance on the task the trainer intends to teach? This work makes significant progress towards answering the open question of how best to learn from human-generated reward, a potential source of guidance that will be abundant for many robots through social cues such as smiles and attention.
Bio: Brad Knox is a postdoctoral researcher at the MIT Media Lab. He received a PhD in Computer Science from the University of Texas at Austin. He studied psychology and fulfilled the premedical curriculum as an undergraduate at Texas A&M University and was an NSF Graduate Research Fellow from 2008 to 2011. Brad’s research, in collaboration with his former advisor Peter Stone, won the Pragnesh Jay Modi Best Student Paper Award at AAMAS in 2010 and was a finalist for the CoTeSys Cognitive Robotics Best Paper Award at Ro-Man in 2012. His research interests span machine learning, robotics, and psychology, especially machine learning algorithms that learn through human interaction.
Host: Jiang Hu