Decision Making in Multiagent Settings

From THINC Lab Wiki
What is Decision Making in Multiagent Settings?

The problem of automated decision making has been studied under various settings in the past few years. Consider an agent tasked with achieving certain goals in an uncertain environment. The agent needs to model its limitations (in the form of faulty actuators and sensors) as well as the dynamically changing environment and compute the optimal sequence of actions that will most likely lead to the achievement of the goals.

Partially Observable Markov Decision Processes (POMDPs) provide a generic formulation for decision making in single agent settings. In multi-agent settings, however, the agent also has to model other agents sharing the environment. The other agent(s) may be cooperative or antagonistic. In either case, our agent would need to figure out a way to cooperate with or outsmart the other agent(s) in order to maximize its own reward.

In order to compute an optimal strategy, the agent needs to anticipate every possible strategy of the other agent(s) and plan accordingly. Of course, the other agent(s) will probably be modeling our agent and trying to outsmart it. In that case, our agent would have to model the other agents at a higher level. This pattern of "I think that you think that I think that you think..." leads to recursive reasoning that must be captured within the framework for multi-agent decision making.
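The recursive reasoning described above can be illustrated with a minimal sketch. It is not an I-POMDP solver: it assumes a one-shot game with hypothetical payoff tables (`PAYOFF_I`, `PAYOFF_J`), a level-0 agent that treats the other agent as acting uniformly at random, and a level-k agent that best-responds to a level-(k-1) model of the other agent. All names are illustrative.

```python
# Hypothetical payoff tables, indexed as payoff[own_action][other_action].
# Agent i prefers to match agent j's action; agent j prefers to mismatch.
PAYOFF_I = [[2, 0], [0, 1]]
PAYOFF_J = [[0, 2], [1, 0]]


def best_response(payoff, other_dist):
    """Return the own action with the highest expected payoff
    against a probability distribution over the other agent's actions."""
    expected = [
        sum(p * payoff[a][o] for o, p in enumerate(other_dist))
        for a in range(len(payoff))
    ]
    return max(range(len(expected)), key=expected.__getitem__)


def level_k_action(level, my_payoff, other_payoff):
    """Action chosen by an agent that reasons `level` steps deep."""
    n_other = len(other_payoff)  # number of actions of the other agent
    if level == 0:
        # Level 0: no model of the other agent, assume uniform play.
        other_dist = [1.0 / n_other] * n_other
    else:
        # Level k: predict the other agent as a level-(k-1) reasoner
        # (roles swapped), then best-respond to that prediction.
        predicted = level_k_action(level - 1, other_payoff, my_payoff)
        other_dist = [1.0 if o == predicted else 0.0 for o in range(n_other)]
    return best_response(my_payoff, other_dist)


actions_by_level = [level_k_action(k, PAYOFF_I, PAYOFF_J) for k in range(5)]
print(actions_by_level)
```

In this toy game the chosen action changes with the nesting depth, which is why the depth of the model of the other agent has to be part of the framework rather than an afterthought.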

Interactive POMDPs (I-POMDPs) incorporate all of the above features in a mathematical framework that generalizes POMDPs to multi-agent settings.

Project Description

I-POMDPs provide a generalized approach for decision making in uncertain multi-agent settings. However, even small multi-agent decision making problems quickly become computationally intractable. Solving the multi-agent decision making problem exactly has been proven to be NEXP-complete. Our research deals with finding scalable approximate solutions to multi-agent decision making problems.

In the past we have utilized particle filtering (IPF) and point-based (I-PBVI) approaches for approximate solutions to I-POMDPs. We are currently exploring more approaches for approximating I-POMDPs.
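As a rough illustration of the particle filtering idea, the sketch below performs one belief update with a basic bootstrap particle filter on a single-agent toy problem. It is not the interactive particle filter itself, which additionally samples and updates models of the other agent; the tiger-style states, the 0.85 sensor accuracy, and all function names are assumptions made for this example.

```python
import random


def particle_filter_update(particles, action, observation,
                           transition, obs_likelihood, rng):
    """One bootstrap particle filter step: propagate, weight, resample."""
    # Propagate every particle through the (possibly stochastic) transition.
    propagated = [transition(s, action, rng) for s in particles]
    # Weight each particle by how well it explains the observation.
    weights = [obs_likelihood(observation, s, action) for s in propagated]
    if sum(weights) == 0.0:
        # Observation impossible under every particle: skip resampling.
        return propagated
    # Resample with replacement in proportion to the weights.
    return rng.choices(propagated, weights=weights, k=len(particles))


# Hypothetical toy model: a tiger sits behind the left or the right door.
def tiger_transition(state, action, rng):
    return state  # listening does not move the tiger


def tiger_obs_likelihood(observation, state, action):
    # Assumed sensor model: the growl is heard on the correct side
    # with probability 0.85.
    return 0.85 if observation == "hear_" + state else 0.15


rng = random.Random(0)
particles = ["left"] * 1000 + ["right"] * 1000  # uniform prior belief
particles = particle_filter_update(
    particles, "listen", "hear_left",
    tiger_transition, tiger_obs_likelihood, rng,
)
belief_left = particles.count("left") / len(particles)
print(f"P(tiger is left | hear_left) is approximately {belief_left:.2f}")
```

The fraction of particles in each state approximates the posterior belief, so the cost of an update scales with the number of particles rather than with the size of the state space.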

To evaluate the quality of policies obtained from decision making algorithms, we use a UAV reconnaissance problem setting and simulate the policies on GaTAC.

Repository of all Papers on I-POMDPs/I-DIDs
  • Framework:
  1. Piotr Gmytrasiewicz, "How to Do Things with Words: A Bayesian Approach", JAIR, 2020
  2. Adam Eck, Maulik Shah, Prashant Doshi, Leen-Kiat Soh, "Scalable Decision-Theoretic Planning in Open and Typed Multiagent Systems", AAAI, 2020
  3. Yitao Chen, Deepanshu Vasal, "Multi-Agent Decentralized Belief Propagation on Graphs", arXiv, 2020
  4. Piotr Gmytrasiewicz, Sarit Adhikari, "Optimal Sequential Planning for Communicative Actions: A Bayesian Approach", AAMAS, 2019
  5. Pol Rosello, "Fully-Nested Interactive POMDPs for Partially-Observable Turn-Based Games", AAAI, 2016
  6. Muthukumaran Chandrasekaran, Adam Eck, Prashant Doshi, Leen-Kiat Soh, "Individual Planning in Open and Typed Agent Systems", UAI, 2016
  7. Le Tian, Jian Luo, Yifeng Zeng, He Wu, "Modeling and Algorithms for Multiagent Communication through Interactive Dynamic Influence Diagrams", Applied Artificial Intelligence, 2016
  8. Ekhlas Sonu, Yingke Chen, Prashant Doshi, "Individual Planning in Agent Populations: Exploiting Anonymity and Frame-Action Hypergraphs", AAAI, 2015
  9. Piotr Gmytrasiewicz and Prashant Doshi, "A Framework for Sequential Planning in Multiagent Settings", Journal of AI Research (JAIR), Vol 24: 49-79, 2005
  10. Muthukumaran Chandrasekaran, Prashant Doshi, Yifeng Zeng, "Team Behavior in Interactive Dynamic Influence Diagrams with Applications to Ad Hoc Teams", AAMAS, 2014
  11. Trong Nghia Hoang, Kian Hsiang Low, "Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents", AAAI, 2013
  12. Bo Li, Jian Luo, "Interactive Dynamic Influence Diagrams Modeling Communication", Communications and Information Processing, 2012
  13. Prashant Doshi, "Decision Making in Complex Multiagent Contexts: A Tale of Two Frameworks", AAAI, 2012
  14. Piotr Gmytrasiewicz and Prashant Doshi, "Interactive POMDPs: Properties and Preliminary Results", AAMAS, 2004
  15. Prashant Doshi, "A Framework for Optimal Sequential Planning in Multiagent Settings", AAAI, 2004
  • Method:
  1. Sarit Adhikari, Piotr Gmytrasiewicz, "Point Based Solution Method for Communicative IPOMDPs", EUMAS, 2021
  2. Yinghui Pan, Jing Tang, Biyang Ma, Yifeng Zeng, Zhong Ming, "Toward data-driven solutions to interactive dynamic influence diagrams", Knowledge and Information Systems, 2021
  3. Keyang He, Prashant Doshi, Bikramjit Banerjee, "Cooperative-Competitive Reinforcement Learning with History-Dependent Rewards", AAMAS, 2021
  4. Aditya Shinde, Prashant Doshi, Omid Setayeshfar, "Cyber Attack Intent Recognition and Active Deception using Factored Interactive POMDPs", AAMAS, 2021
  5. Rohith Vallam, Sarthak Ahuja, Surya Sajja, Ritwik Chaudhuri, Rakesh Pimplikar, Kushal Mukherjee, Ramasuri Narayanam, Gyana Parija, "Dynamic Particle Allocation to Solve Interactive POMDP Models for Social Decision Making", AAMAS, 2019
  6. Yanlin Han, Piotr Gmytrasiewicz, "IPOMDP-Net: A Deep Neural Network for Partially Observable Multi-Agent Planning Using Interactive POMDPs", 33rd AAAI Conference on Artificial Intelligence, 2019
  7. Shu D. Jiang, Jonathan Odom, "Toward Initiative Decision-Making for Distributed Human-Robot Teams", HAI, 2018
  8. Andreas Hula, Iris Vilares, Terry Lohrenz, Peter Dayan, P. Read Montague, "A model of risk and mental state shifts during social interaction", PLOS, 2018
  9. Iacopo Olivo, "Solving Interactive POMDPs in Julia", Master's Thesis, 2018
  10. Yanlin Han, Piotr Gmytrasiewicz, "Learning Others’ Intentional Models in Multi-Agent Settings Using Interactive POMDPs", 32nd Neural Information Processing Systems, 2018
  11. Yanlin Han, Piotr Gmytrasiewicz, "Learning Others' Intentional Models in Multi-agent Settings Using Interactive POMDPs", 28th Modern Artificial Intelligence and Cognitive Science Conference (MAICS), pages 69-76, 2017
  12. Alessandro Panella, Piotr Gmytrasiewicz, "Interactive POMDPs with Finite-state Models of Other Agents", Autonomous Agents and Multi-Agent Systems, 2017
  13. Yanlin Han, Piotr Gmytrasiewicz, "Decayed Markov Chain Monte Carlo for Interactive POMDP", NIPS Workshop on Deep Learning for Action and Interaction, 2016
  14. Roi Ceren, Prashant Doshi, Bikramjit Banerjee, "Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies with PAC Bounds", AAMAS, 2016
  15. a. Ross Conroy, Yifeng Zeng, Jing Tang, "Approximating Value Equivalence in Interactive Dynamic Influence Diagrams Using Behavioral Coverage", IJCAI, 2016
    b. Ross Conroy, Yifeng Zeng, Marc Cavazza, Jing Tang, Yinghui Pan, "A Value Equivalence Approach for Solving Interactive Dynamic Influence Diagrams", AAMAS, 2016
  16. Muthukumaran Chandrasekaran, Prashant Doshi, Yifeng Zeng, Yingke Chen, "Can Bounded and Self-interested Agents Be Teammates? Application to Planning in Ad Hoc Teams", Autonomous Agents and Multi-Agent Systems, 2016
  17. He Wu, Jian Luo, "Efficient Solutions of Interactive Dynamic Influence Diagrams Using Model Identification", Neurocomputing, 2016
  18. a. Alessandro Panella, Piotr Gmytrasiewicz, "Bayesian Learning of Other Agents’ Finite Controllers for Interactive POMDPs", AAAI, 2016
    b. Alessandro Panella, Piotr Gmytrasiewicz, "Nonparametric Bayesian Learning of Other Agents' Policies in Interactive POMDPs", AAMAS, 2015
  19. Ekhlas Sonu, Prashant Doshi, "Scalable algorithms for sequential decision making under uncertainty in multi agent systems", Ph.D. Dissertation, 2015
  20. a. Xia Qu, Prashant Doshi, "Improved Planning for Infinite-Horizon Interactive POMDPs using Probabilistic Inference", AAMAS, 2015
    b. Xia Qu, Prashant Doshi, "Individual Planning in Infinite-Horizon Multiagent Settings: Inference, Structure and Scalability", NIPS, 2015
  21. Ross Conroy, Yifeng Zeng, Marc Cavazza, Yingke Chen, "Learning Behaviors in Agents Systems with Interactive Dynamic Influence Diagrams", AAAI, 2015
  22. Yingke Chen, Prashant Doshi, Yifeng Zeng, "Iterative Online Planning in Multiagent Settings with Limited Model Spaces and PAC Guarantee", AAMAS, 2015
  23. a. Fadel Adoe, Yingke Chen, Prashant Doshi, "Fast Solving of Influence Diagrams for Multiagent Planning on GPU-enabled Architectures", ICAART, 2015
    b. Fadel Adoe, Yingke Chen, Prashant Doshi, "Speeding Up Planning in Multiagent Settings Using CPU-GPU Architectures", Agents and Artificial Intelligence, 2015
  24. Yifeng Zeng, Prashant Doshi, Yingke Chen, Yinghui Pan, Hua Mao, Muthukumaran Chandrasekaran, "Approximating behavioral equivalence for scaling solutions of I-DIDs", Knowledge and Information Systems, 2015
  25. Yinghui Pan, Yifeng Zeng, Yanping Xiang, Le Sun, Xuefeng Chen, "Time-critical Interactive dynamic influence diagram", International Journal of Approximate Reasoning, 2015
  26. a. Yinghui Pan, Yingke Chen, Jing Tang, Yifeng Zeng, "Interactive Dynamic Influence Diagrams for Relational Agents", WI-IAT, 2015
    b. Yinghui Pan, Yifeng Zeng, Hua Mao, "Learning Agents’ Relations in Interactive Multiagent Dynamic Influence Diagrams", Agents and Data Mining Interaction, 2014
  27. Prashant Doshi, Piotr Gmytrasiewicz, "Subjective Equilibria in Interactive POMDPs: Theory and Computational Limitations", GTDT Workshop, 2014
  28. Le Tian, Langcai Cao, Jian Luo, Yifeng Zeng, He Wu, "Model Identification of Interactive Dynamic Influence Diagrams", Journal of Computational Information Systems, 2014
  29. Le Tian, Jian Luo, Langcai Cao, "Improved Behavior Equivalence Algorithm of Multi-Agent Interactive Dynamic Influence Diagrams", Journal of Huazhong University of Science and Technology, 2014
  30. Ekhlas Sonu, Prashant Doshi, "Bimodal Switching for Online Planning in Multiagent Settings", IJCAI, 2013
  31. Le Tian, Jian Luo, Zhili Huang, "Communication Based on Interactive Dynamic Influence Diagrams in Cooperative Multi-Agent Systems", International Conference on Computer Science and Education, 2013
  32. Le Tian, Jian Luo, Langcai Cao, Zhiping Chen, "Approximate Algorithm of Interactive Dynamic Influence Diagrams Based on KL Distance", Systems Engineering and Electronics, 2013
  33. He Wu, Jian Luo, Le Tian, "Exploring Efficient Communication in Interactive Dynamic Influence Diagrams", Chinese Intelligent Automation Conference, 2013
  34. Ling Zhou, Jian Luo, "A Communication Model for Interactive POMDPs", ICCSE, 2012
  35. Ekhlas Sonu, Prashant Doshi, "Generalized and Bounded Policy Iteration for Finitely-Nested Interactive POMDPs: Scaling Up", AAMAS, 2012
  36. Brenda Ng, Kofi Boakye, Carol Meyers, Andrew Wang, "Bayes-Adaptive Interactive POMDPs", AAAI, 2012
  37. Mark P. Woodward, Robert J. Wood, "Learning from Humans as an I-POMDP", 2012
  38. Trong Nghia Hoang, Kian Hsiang Low, "Intention-Aware Planning Under Uncertainty for Interacting with Self-Interested, Boundedly Rational Agents", AAMAS, 2012
  39. Xia Qu, Prashant Doshi, Adam Goodie, "Modeling Deep Strategic Reasoning by Humans in Competitive Games", AAMAS, 2012
  40. Yifeng Zeng, Prashant Doshi, "Exploiting Model Equivalences for Solving Interactive Dynamic Influence Diagrams", Journal of AI Research, 2012
  41. Yifeng Zeng, Hua Mao, Prashant Doshi, Yinghui Pan, Jian Luo, "Learning Communication in Interactive Dynamic Influence Diagrams", WI-IAT, 2012
  42. Bo Li, Jian Luo, Jinfa Zhuang, Huayi Yin, "Approximate Solving-Solution of Interactive Dynamic Influence Diagrams", Journal of Huazhong University of Science and Technology, 2012
  43. Yinghui Pan, Jian Luo, Yifeng Zeng, "The Exploration on Modeling Methods for Interactive Multi-Agent Dynamic Influence Diagrams", Journal of Xiamen University, 2012
  44. Yifeng Zeng, Yinghui Pan, Hua Mao, Jian Luo, "Improved Use of Partial Policies for Identifying Behavioral Equivalence", AAMAS, 2012
  45. Yifeng Zeng, Yingke Chen, Prashant Doshi, "Approximating Model Equivalence in Interactive Dynamic Influence Diagrams Using Top K Policy Paths", WI-IAT, 2011
  46. Alessandro Panella, Piotr Gmytrasiewicz, "A Partition-Based First-Order Probabilistic Logic to Represent Interactive Beliefs", International Conference on Scalable Uncertainty Management, 2011
  47. Jian Luo, Bo Li, Yifeng Zeng, "Double Compression of Models for Interactive Dynamic Influence Diagrams", International Symposium on Innovations in Intelligent Systems and Applications, 2011
  48. Bo Li, Jian Luo, Jinfa Zhuang, "Research of Decision-Making in the Multi-Agent System Based on Interactive Influence Diagrams", International Conference on Materials, Mechatronics and Automation, 2011
  49. Jian Luo, Bo Li, Yinghui Pan, Huayi Yin, Changqing Wu, "Approximate Solution of Interactive Dynamic Influence Diagram", Conference on Technologies and Applications of Artificial Intelligence, 2011
  50. Yifeng Zeng, Prashant Doshi, Yinghui Pan, Hua Mao, Muthukumaran Chandrasekaran, Jian Luo, "Utilizing Partial Policies for Identifying Equivalence of Behavioral Models", AAAI, 2011
  51. Yifeng Zeng, Prashant Doshi, "Model identification in Interactive Influence Diagrams Using Mutual Information", Web Intelligence and Agent Systems, 2010
  52. Yifeng Zeng, Yanping Xiang, "Time-Critical Decision Making in Interactive Dynamic Influence Diagram", WI-IAT, 2010
  53. a. Prashant Doshi, Muthukumaran Chandrasekaran, Yifeng Zeng, "Epsilon-Subjective Equivalence of Models for Interactive Dynamic Influence Diagram", WI-IAT, 2010
    b. Muthukumaran Chandrasekaran, Prashant Doshi, Yifeng Zeng, "Approximate Solutions of Interactive Dynamic Influence Diagrams Using epsilon-behavioral Equivalence", ISAIM, 2010
  54. Le Sun, Yifeng Zeng, Yanping Xiang, "An Influence Diagram Approach for Multiagent Time-Critical Dynamic Decision Modeling", Pacific Rim International Conference on Artificial Intelligence, 2010
  55. Muthukumaran Chandrasekaran, "Approximate Model Equivalence for Interactive Dynamic Influence Diagrams", Master's Thesis, 2010
  56. Prashant Doshi, Yifeng Zeng, Qiongyu Chen, "Graphical Models for Interactive POMDPs: Representations and Solutions", Autonomous Agents and Multi-Agent Systems, 2009
  57. Prashant Doshi, Piotr Gmytrasiewicz, "Monte Carlo Sampling Methods for Approximating Interactive POMDPs", Journal of AI Research, 2009
  58. Josh Bryan, Piotr Gmytrasiewicz, Antonio Del Giudice, "Particle Filtering Approximation of Kriegspiel Play with Opponent Modeling", AAMAS, 2009
  59. Prashant Doshi, "Compact Approximations of Mixture Distributions for State Estimation in Multiagent Settings", AAMAS, 2009
  60. a. Prashant Doshi, Yifeng Zeng, "Improved Approximation of Interactive Dynamic Influence Diagrams Using Discriminative Model Updates", AAMAS, 2009
    b. Yifeng Zeng, Prashant Doshi, "Speeding Up Exact Solutions of Interactive Dynamic Influence Diagrams Using Action Equivalence", IJCAI, 2009
  61. a. Prashant Doshi, Dennis Perez, "Generalized Point Based Value Iteration for Interactive POMDPs", AAAI, 2008
    b. Dennis D. Perez, Prashant Doshi, "Approximate Solutions of Interactive POMDPs Using Point Based Value Iteration", ISAIM, 2008
  62. Yifeng Zeng, Prashant Doshi, "An Information-Theoretic Approach to Model Identification in Interactive Influence Diagrams", WI-IAT, 2008
  63. Prashant Doshi, Yifeng Zeng, Qiongyu Chen, "Graphical Models for Online Solutions to Interactive POMDPs", AAAI, 2007
  64. Prashant Doshi, "On the Role of Interactive Epistemology in Multiagent Planning", AIPR, 2007
  65. Dennis D. Perez, "Anytime Point Based Approximations For Interactive POMDPs", Master's Thesis, 2007
  66. Yifeng Zeng, Prashant Doshi, Qiongyu Chen, "Approximate Solutions of Interactive Dynamic Influence Diagrams Using Model Clustering", AAAI, 2007
  67. a. Prashant Doshi, "Improved State Estimation in Multiagent Settings with Continuous or Large Discrete State Spaces", AAAI, 2007
    b. Prashant Doshi, "Approximate State Estimation in Multiagent Settings with Continuous or Large Discrete State Spaces", poster paper, AAMAS, 2007
    c. Prashant Doshi, "Approximate State Estimation in Multiagent Settings with Continuous or Large Discrete State Spaces", MSDM Workshop, 2007
  68. Bharanee Rathnasabapathy, Prashant Doshi, Piotr Gmytrasiewicz, "Exact Solutions of Interactive POMDPs Using Behavioral Equivalence", AAMAS, 2006
  69. Prashant Doshi, Piotr Gmytrasiewicz, "On the Difficulty of Achieving Equilibrium in Interactive POMDPs", AAAI, 2006
  70. a. Prashant Doshi, Piotr Gmytrasiewicz, "A Particle Filtering Based Approach to Approximating Interactive POMDPs", AAAI, 2005
    b. Prashant Doshi and Piotr Gmytrasiewicz, "Approximating State Estimation in Multiagent Settings Using Particle Filters", AAMAS, 2005
  71. Prashant Doshi, Piotr Gmytrasiewicz, "A Particle Filtering Algorithm for Interactive POMDPs", GTDT, 2004
  • Application:
  1. Connor Brooks and Daniel Szafir, "Building Second-Order Mental Models for Human-Robot Interaction", arXiv:1909.06508, 2019
  2. Jinyuan He, Le Sun, "An interactive service composition model based on Interactive POMDP", IC3, 2018
  3. Kenneth D. Bogert, Sina Solaimanpour, Prashant Doshi, "Aerial Robotic Simulations for Evaluation of Multi-Agent Planning in GaTAC", AAMAS, 2015
  4. Andreas Hula, P. Read Montague, Peter Dayan, "Monte Carlo Planning Method Estimates Planning Horizons during Interactive Social Exchange", PLOS, 2015
  5. Fangju Wang, "An I-POMDP Based Multi-Agent Architecture for Dialogue Tutoring", ICAICTE, 2013
  6. Verena Rieser, Oliver Lemon, Simon Keizer, "Opponent Modeling for Optimizing Strategic Dialogue", SeineDial: Workshop on the Semantics and Pragmatics of Dialogue, 2012
  7. Prashant Doshi, Xia Qu, Adam S. Goodie, Diana L. Young, "Modeling Human Recursive Reasoning Using Empirically Informed Interactive Partially Observable Markov Decision Process", IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 2012
  8. Jian Luo, Huayi Yin, Bo Li, Changqing Wu, "Path Planning for Automated Guided Vehicles System via Interactive Dynamic Influence Diagrams with Communication", ICCA, 2011
  9. Michael Wunder, Michael Kaisers, John Robert Yaros, Michael Littman, "Using Iterated Reasoning to Predict Opponent Strategies", AAMAS, 2011
  10. Brenda Ng, Carol Meyers, Kofi Boakye, John Nitao, "Towards Applying Interactive POMDPs to Real-World Adversary Modeling", AAAI, 2010
  11. Richard S. Seymour, Gilbert L. Peterson, "Responding to Sneaky Agents in Multi-agent Domains", AAAI, 2009
Related Resources
  • POMDP
  • GaTAC, Georgia Testbed for Autonomous Control of vehicles
Collaborating Institutions
Demos
  • GaTAC, Georgia Testbed for Autonomous Control of vehicles
Researchers