Understanding sequential decisions via inverse reinforcement learning

Siyuan Liu, Miguel Araujo, Emma Brunskill, Rosaldo Rossetti, Joao Barros, Ramayya Krishnan

Research output: Contribution to journal > Conference article

11 Citations (Scopus)

Abstract

The execution of an agent's complex activities, comprising sequences of simpler actions, sometimes leads to the clash of conflicting functions that must be optimized. These functions represent satisfaction, short-term as well as long-term objectives, costs and individual preferences. The way that these functions are weighted is usually unknown even to the decision maker. But if we were able to understand the individual motivations and compare such motivations among individuals, then we would be able to actively change the environment so as to increase satisfaction and/or improve performance. In this work, we approach the problem of providing high-level and intelligible descriptions of the motivations of an agent, based on observations of such an agent during the fulfillment of a series of complex activities (called sequential decisions in our work). A novel algorithm for the analysis of observational records is proposed. We also present a methodology that allows researchers to converge towards a summary description of an agent's behaviors, through the minimization of an error measure between the current description and the observed behaviors. This work was validated using not only a synthetic dataset representing the motivations of a passenger in a public transportation network, but also real taxi drivers' behaviors from their trips in an urban network. Our results show that our method is not only useful, but also performs much better than the previous methods, in terms of accuracy, efficiency and scalability.
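The abstract describes converging on a description of an agent's motivations by minimizing an error between the current description and the observed behaviors. The sketch below illustrates that general idea only; it is not the authors' algorithm. It assumes a hypothetical passenger who trades off travel time against fare with a single unknown weight `w`, and it recovers a weight consistent with the observed route choices by a plain grid search. The names `best_route`, `mismatch`, `fit_weight` and the toy observations are all invented for illustration.

```python
from typing import List, Tuple

# A route is described by two hypothetical features: (travel_time, fare).
Route = Tuple[float, float]
# An observation pairs the candidate routes with the index the agent chose.
Observation = Tuple[List[Route], int]


def best_route(routes: List[Route], w: float) -> int:
    """Index of the cheapest route when cost = w * time + (1 - w) * fare."""
    costs = [w * t + (1.0 - w) * f for t, f in routes]
    return costs.index(min(costs))


def mismatch(observations: List[Observation], w: float) -> int:
    """Error measure: number of observed choices the weight w fails to explain."""
    return sum(1 for routes, chosen in observations
               if best_route(routes, w) != chosen)


def fit_weight(observations: List[Observation], steps: int = 100) -> float:
    """Grid search over w in [0, 1]; returns the first weight with minimal error."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: mismatch(observations, w))


# Toy data (assumed, not from the paper): the agent accepts a higher fare to
# save 10 minutes, but not to save only 5 minutes.
observations: List[Observation] = [
    ([(10.0, 5.0), (20.0, 1.0)], 0),
    ([(10.0, 8.0), (15.0, 2.0)], 1),
]

w_hat = fit_weight(observations)
print(f"recovered time weight: {w_hat:.2f}, "
      f"unexplained choices: {mismatch(observations, w_hat)}")
```

The recovered weight is only one of many that explain the data equally well, which is the usual ambiguity in inverse reinforcement learning: observed behavior constrains the reward description without pinning it down uniquely.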

Original language: English (US)
Article number: 6569134
Pages (from-to): 177-186
Number of pages: 10
Journal: Proceedings - IEEE International Conference on Mobile Data Management
Volume: 1
DOI: 10.1109/MDM.2013.28
State: Published - Sep 11 2013
Event: 14th International Conference on Mobile Data Management, MDM 2013 - Milan, Italy
Duration: Jun 3 2013 to Jun 6 2013

Fingerprint

Reinforcement learning
Scalability
Costs

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Liu, Siyuan ; Araujo, Miguel ; Brunskill, Emma ; Rossetti, Rosaldo ; Barros, Joao ; Krishnan, Ramayya. / Understanding sequential decisions via inverse reinforcement learning. In: Proceedings - IEEE International Conference on Mobile Data Management. 2013 ; Vol. 1. pp. 177-186.
@article{0f179e9e0fcd4890bb0a7a75878eea2e,
title = "Understanding sequential decisions via inverse reinforcement learning",
author = "Siyuan Liu and Miguel Araujo and Emma Brunskill and Rosaldo Rossetti and Joao Barros and Ramayya Krishnan",
year = "2013",
month = "9",
day = "11",
doi = "10.1109/MDM.2013.28",
language = "English (US)",
volume = "1",
pages = "177--186",
journal = "Proceedings - IEEE International Conference on Mobile Data Management",
issn = "1551-6245",

}

TY - JOUR

T1 - Understanding sequential decisions via inverse reinforcement learning

AU - Liu, Siyuan

AU - Araujo, Miguel

AU - Brunskill, Emma

AU - Rossetti, Rosaldo

AU - Barros, Joao

AU - Krishnan, Ramayya

PY - 2013/9/11

Y1 - 2013/9/11

UR - http://www.scopus.com/inward/record.url?scp=84883549455&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883549455&partnerID=8YFLogxK

U2 - 10.1109/MDM.2013.28

DO - 10.1109/MDM.2013.28

M3 - Conference article

AN - SCOPUS:84883549455

VL - 1

SP - 177

EP - 186

JO - Proceedings - IEEE International Conference on Mobile Data Management

JF - Proceedings - IEEE International Conference on Mobile Data Management

SN - 1551-6245

M1 - 6569134

ER -