Research

Keywords

Planning under uncertainty, sequential decision making, (decentralized) partially observable Markov decision processes (POMDPs / Dec-POMDPs), reinforcement learning, artificial intelligence, cooperative multiagent/multi-robot systems, smart energy grids, traffic optimization.

Overview

The societal impact of Artificial Intelligence is pervasive: citizens rely on spam filters, smart thermostats, intelligent personal assistants and, in the near future, on autonomously driving vehicles. The key skill of an intelligent system is decision making: how to react smartly to sensor inputs. Hence, a major challenge in AI is designing agents: systems that perceive their environment and execute actions. Optimal decision making is particularly challenging when uncertainty and many agents are involved, leading to a need for new models and algorithms.

The need for scalable and flexible multiagent decision making is particularly pressing, as intelligent distributed systems are becoming ubiquitous in society. For instance, automated guided vehicles transport cargo and people, inter-vehicle communication enables cars to form vehicular networks, smart grid infrastructure allows consumers to produce and sell electricity, and surveillance cameras provide urban security and safety. In all of these settings, the controller of the vehicle, the consumers, or the controller of the cameras must act in the face of uncertainty.

Uncertainty manifests itself in various forms when computing plans for agents, in particular in real-world scenarios. As autonomous robots are applied in ever more complex domains, the need to handle uncertainty in sensors and actuators grows: the exact effect of executing a particular action is uncertain, and sensor readings provide only uncertain information due to noise or a limited view of the environment.
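
As a toy illustration of these two sources of uncertainty, consider the sketch below, in which a robot's move action only succeeds with some probability and its position sensor occasionally misreports by one cell (all numbers are illustrative and not taken from any particular system):

    import random

    def step(position, corridor_length=10, p_success=0.8):
        """Stochastic transition: the move action may simply fail."""
        if random.random() < p_success:
            position = min(position + 1, corridor_length - 1)
        return position

    def sense(position, noise=0.1):
        """Noisy observation: occasionally read a neighboring cell."""
        if random.random() < noise:
            return max(0, position + random.choice([-1, 1]))
        return position

    position = 0
    for _ in range(5):
        position = step(position)
        print("observed:", sense(position), "actual:", position)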

My research addresses the question of optimal decision making under uncertainty using decision-theoretic models such as the Partially Observable Markov Decision Process (POMDP) and its multiagent extension, the Decentralized POMDP (Dec-POMDP). These models are collectively studied under the heading of Sequential Decision Making (SDM), which forms a mathematically well-grounded framework for planning under uncertainty.
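
For concreteness, a POMDP is commonly written as the tuple below (a standard textbook definition; notation varies across papers), together with the Bayesian belief update that summarizes the agent's action-observation history:

    \[
    \langle S, A, \Omega, T, O, R \rangle,
    \qquad
    b'_{a,o}(s') = \frac{O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s)}
                        {\Pr(o \mid b, a)}
    \]
    % S: states, A: actions, \Omega: observations, T(s'|s,a): transition
    % model, O(o|s',a): observation model, R(s,a): reward function.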

Selected research topics in Sequential Decision Making

Below I discuss a selection of research topics I have worked on; see my publication list for more papers.

Constrained sequential decision making

Agents often have to optimize their decision making under resource constraints, for instance when the simultaneous charging of electric vehicles might exceed grid capacity. We have looked at approaches such as best-response planning (AAAI 2015) and fictitious play (ECAI 2016), but also at algorithms that bound the probability of a resource violation (AAAI 2017) and at settings in which the resource constraints themselves are stochastic (AAAI 2018).
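
The sketch below conveys the best-response idea in a heavily simplified form (it is in the spirit of, but not identical to, the published algorithms; the toy agents, profiles, and learning rate are my own illustration). Per-timestep prices on the shared resource act as Lagrange multipliers: they rise wherever aggregate demand exceeds capacity, and each agent repeatedly best-responds to the current prices.

    def best_response(agent, prices):
        """Pick the profile minimizing own cost plus priced resource use."""
        def total(profile, cost):
            return cost + sum(p * u for p, u in zip(prices, profile))
        return min(agent, key=lambda pc: total(*pc))[0]

    def plan(agents, capacity, horizon, iters=100, lr=0.05):
        prices = [0.0] * horizon
        for _ in range(iters):
            profiles = [best_response(a, prices) for a in agents]
            for t in range(horizon):  # raise price where capacity is exceeded
                load = sum(pr[t] for pr in profiles)
                prices[t] = max(0.0, prices[t] + lr * (load - capacity))
        return profiles, prices

    # Each toy agent is a list of (consumption profile, own cost) pairs.
    agents = [[((2, 0), 1.0), ((0, 2), 1.5)],
              [((2, 0), 1.0), ((0, 2), 1.5)]]
    print(plan(agents, capacity=2, horizon=2))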

Single-agent planning under uncertainty

In a 2017 paper we focused on speeding up the state of the art in exact POMDP solving by applying a Benders decomposition to the linear programs used for pruning dominated vectors, which is the most expensive operation in such solvers (AAAI 2017, Java code, C++ code).
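
For context, the expensive operation is the classic LP-based dominance test sketched below (this is the standard test, not the Benders decomposition itself): an alpha-vector is kept only if there exists some belief at which it improves on all other vectors.

    import numpy as np
    from scipy.optimize import linprog

    def dominates_somewhere(w, U):
        """Return a belief where w beats every vector in U, or None."""
        n = len(w)
        # Variables: belief b (n entries) plus margin d; maximize d.
        c = np.zeros(n + 1)
        c[-1] = -1.0                              # linprog minimizes, so -d
        # For each u in U: b.(u - w) + d <= 0, i.e. b.w >= b.u + d.
        A_ub = np.hstack([np.array(U) - w, np.ones((len(U), 1))])
        A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])  # sum(b) = 1
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(U)),
                      A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * n + [(None, None)])
        if res.status == 0 and -res.fun > 1e-9:   # strictly positive margin
            return res.x[:n]
        return None

    # Here w is useful: it is the best vector at the corner belief [1, 0].
    print(dominates_somewhere(np.array([1.0, 0.0]),
                              [np.array([0.0, 1.0]), np.array([0.5, 0.5])]))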

I wrote an overview chapter on POMDPs for the book Reinforcement Learning: State of the Art (Springer, 2012).

During my PhD I developed Perseus, a fast approximate POMDP planner that is easy to implement (JAIR 2005, Java code, C++ code, Matlab code). We also generalized approximate POMDP planning to fully continuous domains (JMLR 2006).
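
The loop below is a compact rendering of the Perseus idea, assuming the model is given as numpy arrays T[a, s, s'], O[a, s', o], and R[a, s] (my own minimal sketch; the released implementations linked above differ in many details). The key trick is that a single point-based backup can improve the value at many beliefs at once, so each iteration only backs up a random subset of the belief set B.

    import numpy as np

    def backup(b, V, T, O, R, gamma=0.95):
        """Point-based backup: best alpha-vector for belief b given V."""
        best = None
        for a in range(T.shape[0]):
            g = R[a].astype(float)
            for o in range(O.shape[2]):
                # g_ao(s) = sum_s' O(o|s',a) T(s'|s,a) alpha(s')
                gao = np.array([T[a] @ (O[a, :, o] * alpha) for alpha in V])
                g = g + gamma * gao[np.argmax(gao @ b)]   # best alpha at b
            if best is None or b @ g > b @ best:
                best = g
        return best

    def perseus(B, T, O, R, gamma=0.95, iters=30):
        """Randomized point-based value iteration over belief set B."""
        V = [np.full(T.shape[1], R.min() / (1.0 - gamma))]  # pessimistic init
        for _ in range(iters):
            Vnew, todo = [], list(B)
            while todo:                        # back up a random belief
                b = todo.pop(np.random.randint(len(todo)))
                alpha = backup(b, V, T, O, R, gamma)
                if b @ alpha < max(b @ v for v in V):
                    alpha = max(V, key=lambda v: b @ v)  # keep old best
                Vnew.append(alpha)
                # only beliefs not yet improved by Vnew remain to be done
                todo = [bb for bb in todo
                        if max(bb @ v for v in Vnew) < max(bb @ v for v in V)]
            V = Vnew
        return V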

Optimal multiagent planning under uncertainty

I have been working on planning under uncertainty for multiagent (and multi-robot) systems. For instance, we developed one of the currently fastest optimal planners for general Dec-POMDPs (IJCAI 2011, JAIR 2013). It is based on an algorithm that speeds up a key Dec-POMDP operation, the backup, by up to 10 orders of magnitude on benchmark problems (AAMAS 2010). The work builds on a journal paper that laid the foundations for value-based planning in Dec-POMDPs (JAIR 2008).
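
For reference, the Dec-POMDP model underlying this line of work is standardly written as the tuple below (notation varies across papers):

    \[
    \langle n, S, \{A_i\}_{i=1}^{n}, T, \{\Omega_i\}_{i=1}^{n}, O, R, h \rangle
    \]
    % n agents; joint actions a = (a_1, ..., a_n) with transitions T(s'|s,a);
    % joint observations o = (o_1, ..., o_n) with O(o|s',a); a single shared
    % reward R(s,a); horizon h. Crucially, each agent i must select its
    % actions based only on its own observation history.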

Applications

I have applied my decision-making algorithms in different contexts, such as smart energy systems (UAI 2015, AAAI 2015), robotics (IJRR 2013, AAAI 2013), and traffic flow optimization (EAAI 2016).