Research
Keywords
Planning under uncertainty, sequential decision making,
(decentralized) partially observable Markov decision
processes (POMDPs / Dec-POMDPs), reinforcement learning,
artificial intelligence, cooperative
multiagent/multi-robot systems, smart energy grids,
traffic optimization.
Overview
Artificial Intelligence's societal impact is pervasive: citizens rely on spam filters, smart thermostats, intelligent personal assistants and, in the near future, on autonomous vehicles. The key skill of an intelligent system is decision making: how to react smartly to sensor inputs. Hence, a major challenge of AI is designing agents: systems that perceive their environment and execute actions. Optimal decision making is particularly challenging when uncertainty and many agents are involved, leading to a need for new models and algorithms.
The need for scalable and flexible multiagent decision
making is particularly pressing given that intelligent
distributed systems are becoming ubiquitous in
society. For instance, autonomous guided vehicles
transport cargo and people, inter-vehicle communication
enables cars to form vehicular networks, smart grid
infrastructure allows consumers to produce and sell
electricity, and surveillance cameras provide urban
security and safety. In these settings, the vehicle controllers, the consumers, and the camera controllers all need to act in the face of uncertainty.
Uncertainty manifests itself in various forms when computing plans for agents, particularly in real-world scenarios. As autonomous robots are applied in ever more complex domains, the need to handle uncertainty in sensors and actuators grows: the exact effect of executing a particular action is uncertain, and sensor readings provide only uncertain information due to noise or a limited view of the environment.
My research addresses the question of optimal decision making under uncertainty using decision-theoretic models such as the Partially Observable Markov Decision Process (POMDP) and its multiagent extension, the Decentralized POMDP (Dec-POMDP). These models are collectively known as Sequential Decision Making (SDM) models, and SDM forms a mathematically well-grounded framework for planning under uncertainty.
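For reference, the POMDP model can be summarized in standard textbook notation (a generic sketch, not tied to the notation of any particular paper):

```latex
% A POMDP is a tuple of states, actions, observations, a transition
% model, an observation model, a reward function, and a discount factor.
\[
\langle S, A, \Omega, T, O, R, \gamma \rangle,
\qquad T(s,a,s') = \Pr(s' \mid s,a),
\qquad O(s',a,o) = \Pr(o \mid a,s').
\]
% Because the state is not directly observable, the agent maintains a
% belief b(s), updated via Bayes' rule after taking action a and
% observing o:
\[
b'(s') = \frac{O(s',a,o) \sum_{s \in S} T(s,a,s')\, b(s)}{\Pr(o \mid b,a)} .
\]
% The objective is a policy that maximizes the expected discounted
% return E[ \sum_t \gamma^t R(s_t, a_t) ].
```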
Selected research topics in Sequential Decision Making
Below I discuss a selection of research topics that I have worked on; see my publication list for more papers.
Constrained sequential decision making
Agents often have to optimize their decision making under resource constraints, for instance when simultaneous charging of electric vehicles might exceed grid capacity. We have looked at approaches such as best-response planning (AAAI 2015) and fictitious play (ECAI 2016), as well as algorithms that bound the probability of a resource constraint violation (AAAI 2017) and that handle stochastic resource constraints (AAAI 2018).
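To give a flavour of the best-response idea, the toy sketch below lets each electric vehicle repeatedly re-plan its charging slots against congestion prices computed from the other vehicles' current schedules. All numbers and names are illustrative; this is not the actual AAAI 2015 algorithm.

```python
# Illustrative sketch of iterative best-response planning for EV charging
# under a shared grid-capacity constraint. All parameters are made up.
import random

NUM_AGENTS = 5   # electric vehicles
HORIZON = 6      # charging time slots
CAPACITY = 3     # vehicles that can charge simultaneously
DEMAND = 2       # slots of charging each vehicle needs

def congestion_prices(schedules):
    """Price each slot by its load, heavily penalizing slots above capacity."""
    load = [sum(t in s for s in schedules.values()) for t in range(HORIZON)]
    return [load[t] + 10.0 * max(0, load[t] - CAPACITY) for t in range(HORIZON)]

def best_response(prices):
    """Best response of a single vehicle: charge in the cheapest slots."""
    return set(sorted(range(HORIZON), key=lambda t: prices[t])[:DEMAND])

# Start from random schedules, then iterate best responses until (hopefully)
# no agent wants to deviate anymore.
schedules = {i: set(random.sample(range(HORIZON), DEMAND))
             for i in range(NUM_AGENTS)}
for _ in range(20):
    for i in range(NUM_AGENTS):
        others = {j: s for j, s in schedules.items() if j != i}
        schedules[i] = best_response(congestion_prices(others))

print(schedules)
```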
Single-agent planning under uncertainty
In a 2017 paper we focused on speeding up the state of the art in exact POMDP solving by applying Benders decomposition to the linear programs used for pruning vectors, which is the most expensive operation (AAAI 2017, Java code, C++ code).
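The pruning operation in question tests whether a value-function vector is dominated at every belief point. A plain, non-decomposed version of that dominance test (generic textbook formulation, not the Benders-decomposed one from the paper) could look like this:

```python
# Sketch of the standard LP-based dominance test used when pruning
# value-function vectors in exact POMDP solvers. Generic formulation,
# not the decomposition from the AAAI 2017 paper.
import numpy as np
from scipy.optimize import linprog

def is_dominated(alpha, others):
    """True if `alpha` never strictly improves on `others` at any belief,
    i.e. it can be pruned from the value function."""
    n = len(alpha)
    # Decision variables: belief b (n entries) and advantage d.
    c = np.zeros(n + 1)
    c[-1] = -1.0                        # maximize d  <=>  minimize -d
    A_ub, b_ub = [], []
    for other in others:
        row = np.append(np.asarray(other) - np.asarray(alpha), 1.0)
        A_ub.append(row)                # b.(other - alpha) + d <= 0
        b_ub.append(0.0)
    A_eq = [np.append(np.ones(n), 0.0)]  # belief sums to one
    b_eq = [1.0]
    bounds = [(0, None)] * n + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.success and -res.fun <= 1e-9   # best advantage d <= 0

# Toy example with two states: the first vector is needed, the second is not.
print(is_dominated([1.5, 1.5], [[2.0, 0.0], [0.0, 2.0]]))  # False
print(is_dominated([0.5, 0.5], [[2.0, 0.0], [0.0, 2.0]]))  # True
```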
I wrote an overview
chapter on POMDPs for the book Reinforcement
Learning: State of the Art (Springer, 2012).
During my PhD I developed Perseus, a
fast approximate POMDP planner which is easy to
implement (JAIR
2005, Java
code, C++
code, Matlab code). We also generalized approximate POMDP planning to
fully continuous domains (JMLR
2006).
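The core of Perseus is a randomized value-update stage: it backs up beliefs sampled at random and stops as soon as the value of every belief in the sampled set has improved. The sketch below is a simplified rendering of that stage; the point-based backup operator (`point_backup`), which requires access to the POMDP model, is assumed to be given.

```python
# Simplified sketch of one Perseus value-update stage (randomized
# point-based value iteration). `point_backup` is assumed to be provided.
import random
import numpy as np

def perseus_stage(beliefs, value_vectors, point_backup):
    """One value-update stage over a fixed set of sampled beliefs.

    beliefs       : list of belief vectors (numpy arrays over states)
    value_vectors : current value function, a list of alpha-vectors
    point_backup  : function (belief, value_vectors) -> new alpha-vector
    """
    def value(b, vectors):
        return max(np.dot(b, a) for a in vectors)

    new_vectors = []
    todo = list(beliefs)                       # beliefs not yet improved
    while todo:
        b = random.choice(todo)                # pick a random belief
        alpha = point_backup(b, value_vectors)
        if np.dot(b, alpha) >= value(b, value_vectors):
            new_vectors.append(alpha)          # backup improved this belief
        else:
            # keep the best old vector for b so its value never decreases
            new_vectors.append(max(value_vectors, key=lambda a: np.dot(b, a)))
        # drop every belief whose value is already improved by new_vectors
        todo = [bp for bp in todo
                if value(bp, new_vectors) < value(bp, value_vectors)]
    return new_vectors
```

Because a single backup tends to improve the value at many beliefs at once, each stage usually needs far fewer backups than there are belief points, which is what makes the planner fast while remaining easy to implement.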
Optimal multiagent planning under uncertainty
I have been working on planning under uncertainty for multiagent (and multi-robot) systems. For instance, we developed one of the currently fastest optimal planners for general Dec-POMDPs (IJCAI 2011, JAIR 2013). It is based on an algorithm that speeds up a key Dec-POMDP operation (the backup), achieving speedups of up to 10 orders of magnitude on benchmark problems (AAMAS 2010). This work builds on a journal paper that laid the foundations for value-based planning in Dec-POMDPs (JAIR 2008).
Applications
I have applied my decision-making algorithms in different contexts such as smart energy systems (UAI
2015, AAAI 2015), robotics (IJRR 2013, AAAI 2013) and traffic flow optimization (EAAI
2016).