July 2014, 1(3): 447-469. doi: 10.3934/jdg.2014.1.447

A primal condition for approachability with partial monitoring

1. Faculty of Electrical Engineering, The Technion, 32 000 Haifa, Israel

2. LPMA, Université Paris-Diderot, 8 place FM/13, 75 013 Paris, France

3. GREGHEC, HEC Paris - CNRS, 1 rue de la Libération, 78 351 Jouy-en-Josas, France

Received January 2013; Revised March 2013; Published July 2014

In approachability with full monitoring there are two types of conditions that are known to be equivalent for convex sets: a primal and a dual condition. The primal one is of the form: a convex set $\mathcal{C}$ is approachable if and only if all containing half-spaces are approachable in the one-shot game. The dual condition is of the form: a convex set $\mathcal{C}$ is approachable if and only if it intersects all payoff sets of a certain form. We consider approachability in games with partial monitoring. In previous works [5,7] we provided a dual characterization of approachable convex sets and we also exhibited efficient strategies in the case where $\mathcal{C}$ is a polytope. In this paper we provide primal conditions for a convex set to be approachable with partial monitoring. They depend on a modified reward function and lead to approachability strategies that are based on modified payoff functions and proceed by projections, similarly to Blackwell's (1956) strategy. This is in contrast with previously studied strategies in this context, which relied mostly on the signaling structure and aimed at accurately estimating the distributions of the signals received. Our results generalize classical results by Kohlberg [3] (see also [6]) and apply to games with an arbitrary signaling structure as well as to arbitrary convex sets.
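For reference, the two full-monitoring conditions mentioned above can be written out as follows. This is only a sketch of the classical statement, under notation that is assumed here and does not appear in the abstract: finite action sets $A$ and $B$ for the player and the opponent, a vector payoff $r$ extended bilinearly to mixed actions $p \in \Delta(A)$ and $q \in \Delta(B)$, and a closed convex set $\mathcal{C} \subseteq \mathbb{R}^d$.

$$
\text{(primal)}\quad \forall\, (a,c) \in \mathbb{R}^d \times \mathbb{R} \ \text{with}\ \mathcal{C} \subseteq \{x : \langle a, x\rangle \le c\}:\quad \exists\, p \in \Delta(A),\ \forall\, q \in \Delta(B),\quad \langle a, r(p,q)\rangle \le c
$$

$$
\text{(dual)}\quad \forall\, q \in \Delta(B),\ \exists\, p \in \Delta(A):\quad r(p,q) \in \mathcal{C}
$$

In this form, the primal condition only involves scalar one-shot games (one for each direction $a$), which is what allows strategies that proceed by projections, whereas the dual condition asks that the player be able to respond to each known mixed action of the opponent.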
Citation: Shie Mannor, Vianney Perchet, Gilles Stoltz. A primal condition for approachability with partial monitoring. Journal of Dynamics & Games, 2014, 1 (3) : 447-469. doi: 10.3934/jdg.2014.1.447
References:
[1] R. Aumann and M. Maschler, Repeated Games with Incomplete Information, MIT Press, 1995.

[2] D. Blackwell, An analog of the minimax theorem for vector payoffs, Pacific Journal of Mathematics, 6 (1956), 1-8. doi: 10.2140/pjm.1956.6.1.

[3] E. Kohlberg, Optimal strategies in repeated games with incomplete information, International Journal of Game Theory, 4 (1975), 7-24. doi: 10.1007/BF01766399.

[4] G. Lugosi, S. Mannor and G. Stoltz, Strategies for prediction under imperfect monitoring, Mathematics of Operations Research, 33 (2008), 513-528. doi: 10.1287/moor.1080.0312.

[5] S. Mannor, V. Perchet and G. Stoltz, Robust approachability and regret minimization in games with partial monitoring, http://hal.archives-ouvertes.fr/hal-00595695, 2012; an extended abstract was published in Proceedings of COLT'11.

[6] J.-F. Mertens, S. Sorin and S. Zamir, Repeated Games, Technical Report no. 9420, 9421, 9422, Université de Louvain-la-Neuve, 1994.

[7] V. Perchet, Approachability of convex sets in games with partial monitoring, Journal of Optimization Theory and Applications, 149 (2011), 665-677. doi: 10.1007/s10957-011-9797-3.

[8] V. Perchet, Internal regret with partial monitoring: Calibration-based optimal algorithms, Journal of Machine Learning Research, 12 (2011), 1893-1921.

[9] V. Perchet and M. Quincampoix, On an unified framework for approachability in games with or without signals, 2011.

[10] S. Sorin, A First Course on Zero-Sum Repeated Games, Mathématiques & Applications, no. 37, Springer, 2002.


