Citation: |
[1] |
J. Abernethy, P. Bartlett and E. Hazan, Blackwell approachability and low-regret learning are equivalent, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 27-46. |
[2] |
S. As Soulaimani, M. Quincampoix and S. Sorin, Repeated games and qualitative differential games: Approachability and comparison of strategies, SIAM J. Control Optim., 48 (2009), 2461-2479.doi: 10.1137/090749098. |
[3] |
P. Auer, N. Cesa-Bianchi and C. Gentile, Adaptive and self-confident on-line learning algorithms, J. Comput. System Sci., 64 (2002), 48-75, Special issue on COLT 2000 (Palo Alto, CA).doi: 10.1006/jcss.2001.1795. |
[4] |
R. J. Aumann, Subjectivity and correlation in randomized strategies, J. Math. Econom., 1 (1974), 67-96.doi: 10.1016/0304-4068(74)90037-8. |
[5] |
R. J. Aumann and M. B. Maschler, Repeated Games with Incomplete Information, MIT Press, Cambridge, MA, 1995, With the collaboration of Richard E. Stearns. |
[6] |
M. Benaïm and M. Faure, Consistency of vanishingly smooth fictitious play, Math. Oper. Res., 38 (2013), 437-450.doi: 10.1287/moor.1120.0568. |
[7] |
M. Benaïm, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions. II. Applications, Math. Oper. Res., 31 (2006), 673-695.doi: 10.1287/moor.1060.0213. |
[8] |
A. Bernstein, S. Mannor and N. Shimkin, Opportunistic strategies for generalized no-regret problems, J. Mach. Learn. Res.: Workshop Conf. Proc., 30 (2013), 158-171. |
[9] |
A. Bernstein and N. Shimkin, Response-based approachability and its application to generalized no-regret algorithms, Manuscript. |
[10] |
L. J. Billera and B. Sturmfels, Fiber polytopes, The Annals of Mathematics, 135 (1992), 527-549.doi: 10.2307/2946575. |
[11] |
D. Blackwell, An analog of the minimax theorem for vector payoffs, Pacific J. Math., 6 (1956), 1-8.doi: 10.2140/pjm.1956.6.1. |
[12] |
D. Blackwell, Controlled random walks, in Proceedings of the International Congress of Mathematicians, 1954, Amsterdam, vol. III, 1956, 336-338. |
[13] |
D. Blackwell and M. A. Girshick, Theory of Games and Statistical Decisions, John Wiley and Sons, Inc., New York, 1954. |
[14] |
A. Blum and Y. Mansour, From external to internal regret, in Learning theory, vol. 3559 of Lecture Notes in Comput. Sci., Springer, Berlin, (2005), 621-636.doi: 10.1007/11503415_42. |
[15] |
S. Bubeck, Introduction to online optimization, Manuscript. |
[16] |
N. Cesa-Bianchi and G. Lugosi, Potential-based algorithms in on-line prediction and game theory, Machine Learning, 51 (2003), 239-261. |
[17] |
N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games, Cambridge University Press, Cambridge, 2006.doi: 10.1017/CBO9780511546921. |
[18] |
X. Chen and H. White, Laws of large numbers for Hilbert space-valued mixingales with applications, Econometric Theory, 12 (1996), 284-304.doi: 10.1017/S0266466600006599. |
[19] |
A. P. Dawid, The well-calibrated Bayesian, J. Amer. Statist. Assoc., 77 (1982), 605-613.doi: 10.1080/01621459.1982.10477856. |
[20] |
A. P. Dawid, Self-calibrating priors do not exist: Comment, J. Amer. Statist. Assoc., 80 (1985), 339-342.doi: 10.1080/01621459.1985.10478117. |
[21] |
L. Evans and P. E. Souganidis, Differential games and representation formulas for solutions of Hamilton-Jacobi equations, Indiana Univ. Math. J., 33 (1984), 773-797.doi: 10.1512/iumj.1984.33.33040. |
[22] |
K. Fan, Minimax theorems, Proc. Nat. Acad. Sci. U. S. A., 39 (1953), 42-47.doi: 10.1073/pnas.39.1.42. |
[23] |
K. Fan, A minimax inequality and applications, in Inequalities, III (Proc. Third Sympos., Univ. California, Los Angeles, Calif., 1969; dedicated to the memory of Theodore S. Motzkin), Academic Press, New York, (1972), 103-113. |
[24] |
W. Feller, An Introduction to Probability Theory and its Applications. Vol. I, Third edition, John Wiley & Sons Inc., New York, 1968. |
[25] |
D. P. Foster, A proof of calibration via blackwell's approachability theorem, Games and Economic Behavior, 29 (1999), 73-78.doi: 10.1006/game.1999.0719. |
[26] |
D. P. Foster, A. Rakhlin, K. Sridharan and A. Tewari, Complexity-based approach to calibration with checking rules, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 293-314. |
[27] |
D. P. Foster and R. V. Vohra, Calibrated learning and correlated equilibrium, Games Econom. Behav., 21 (1997), 40-55.doi: 10.1006/game.1997.0595. |
[28] |
D. P. Foster and R. V. Vohra, Asymptotic calibration, Biometrika, 85 (1998), 379-390.doi: 10.1093/biomet/85.2.379. |
[29] |
D. P. Foster and R. V. Vohra, Regret in the on-line decision problem, Games Econom. Behav., 29 (1999), 7-35.doi: 10.1006/game.1999.0740. |
[30] |
D. P. Foster and R. V. Vohra, Calibration: Respice, adspice, prospice, Advances in Economics and Econometrics, 1 (2013), 423-442.doi: 10.1017/CBO9781139060011.014. |
[31] |
D. Fudenberg and D. M. Kreps, Learning mixed equilibria, Games Econom. Behav., 5 (1993), 320-367.doi: 10.1006/game.1993.1021. |
[32] |
D. Fudenberg and D. K. Levine, Conditional universal consistency, Games Econom. Behav., 29 (1999), 104-130.doi: 10.1006/game.1998.0705. |
[33] |
D. Fudenberg and D. K. Levine, An easier way to calibrate, Games Econom. Behav., 29 (1999), 131-137.doi: 10.1006/game.1999.0726. |
[34] |
F. Gul, D. Pearce and E. Stachetti, A bound on the proportion of pure strategy equilibria in generic games, Math. Oper. Res., 18 (1993), 548-552.doi: 10.1287/moor.18.3.548. |
[35] |
P. Hall and C. C. Heyde, Martingale Limit Theory and its Application, Academic Press Inc., New York, 1980, Probability and Mathematical Statistics. |
[36] |
J. Hannan, Approximation to bayes risk in repeated play, in Contributions to the Theory of Games, vol. III of Annals of Mathematics Studies 39, Princeton University Press, Princeton, N. J., 1957, 97-139. |
[37] |
S. Hart and A. Mas-Colell, A simple adaptive procedure leading to correlated equilibrium, Econometrica, 68 (2000), 1127-1150.doi: 10.1111/1468-0262.00153. |
[38] |
S. Hart and A. Mas-Colell, A general class of adaptive strategies, J. Econom. Theory, 98 (2001), 26-54.doi: 10.1006/jeth.2000.2746. |
[39] |
S. Hart and A. Mas-Colell, Regret-based continuous-time dynamics, Games Econom. Behav., 45 (2003), 375-394.doi: 10.1016/S0899-8256(03)00178-7. |
[40] |
E. Hazan and S. M. Kakade, (weak) calibration is computationaly hard, J. Mach. Learn. Res.: Workshop Conf. Proc., 23 (2012), 3.1-3.10. |
[41] |
J. Hofbauer and W. H. Sandholm, On the global convergence of stochastic fictitious play, Econometrica, 70 (2002), 2265-2294.doi: 10.1111/1468-0262.00376. |
[42] |
J. Hofbauer, S. Sorin and Y. Viossat, Time average replicator and best-reply dynamics, Math. Oper. Res., 34 (2009), 263-269.doi: 10.1287/moor.1080.0359. |
[43] |
S. M. Kakade and D. P. Foster, Deterministic calibration and Nash equilibrium, in Learning theory, vol. 3120 of Lecture Notes in Comput. Sci., Springer, Berlin, (2004), 33-48.doi: 10.1007/978-3-540-27819-1_3. |
[44] |
O. Kallenberg and R. Sztencel, Some dimension-free features of vector-valued martingales, Probability Theory and Related Fields, 88 (1991), 215-247.doi: 10.1007/BF01212560. |
[45] |
E. Kohlberg, Optimal strategies in repeated games with incomplete information, Internat. J. Game Theory, 4 (1975), 7-24.doi: 10.1007/BF01766399. |
[46] |
J. Kwon, Hilbert Distance, Bounded Convex Functions, and Application to the Exponential Weight Algorithm, Master's thesis, ENS Lyon, 2012. |
[47] |
E. Lehrer, Any inspection is manipulable, Econometrica, 69 (2001), 1333-1347.doi: 10.1111/1468-0262.00244. |
[48] |
E. Lehrer, Approachability in infinite dimensional spaces, Internat. J. Game Theory, 31 (2002), 253-268.doi: 10.1007/s001820200115. |
[49] |
E. Lehrer, A wide range no-regret theorem, Games Econom. Behav., 42 (2003), 101-115.doi: 10.1016/S0899-8256(03)00032-0. |
[50] |
E. Lehrer and E. Solan, Excludability and bounded computational capacity, Math. Oper. Res., 31 (2006), 637-648.doi: 10.1287/moor.1060.0211. |
[51] |
E. Lehrer and E. Solan, Learning to play partially-specified equilibrium, Manuscript. |
[52] |
E. Lehrer and E. Solan, Approachability with bounded memory, Games Econom. Behav., 66 (2009), 995-1004.doi: 10.1016/j.geb.2007.07.011. |
[53] |
E. Lehrer and S. Sorin, Minmax via differential inclusion, J. Conv. Analysis, 14 (2007), 271-273. |
[54] |
N. Littlestone and M. Warmuth, The weighted majority algorithm, Information and Computation, 108 (1994), 212-261.doi: 10.1006/inco.1994.1009. |
[55] |
R. D. Luce and H. Raiffa, Games and Decisions: Introduction and Critical Survey, John Wiley & Sons Inc., New York, N. Y., 1957. |
[56] |
S. Mannor and N. Shimkin, Regret minimization in repeated matrix games with variable stage duration, Games Econom. Behav., 63 (2008), 227-258.doi: 10.1016/j.geb.2007.07.006. |
[57] |
S. Mannor and G. Stoltz, A geometric proof of calibration, Math. Oper. Res., 35 (2010), 721-727.doi: 10.1287/moor.1100.0465. |
[58] |
S. Mannor, G. Stoltz and V. Perchet, Robust approachability and regret minimization in games with partial monitoring, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 515-536. |
[59] |
S. Mannor, G. Stoltz and V. Perchet, Set-valued approachability, with applications to regret minimization in games with partial monitoring, Manuscript. |
[60] |
S. Mannor and J. N. Tsitsiklis, Approachability in repeated games: Computational aspects and a Stackelberg variant, Games Econom. Behav., 66 (2009), 315-325.doi: 10.1016/j.geb.2008.03.008. |
[61] |
D. McFadden, Conditional logit analysis of qualitative choice behavior, Frontiers in econometrics, Academic Press: New York, 1974, 105-142. |
[62] |
J.-F. Mertens, S. Sorin and S. Zamir, Repeated games, CORE discussion paper, (1994), 9420-9422.doi: 10.1057/9780230226203.3424. |
[63] |
J. Neveu, Bases Mathématiques du Calcul des Probabilités, Masson et Cie, éditeurs, Paris, 1970. |
[64] |
J. Neveu, Martingales à Temps Discret, Masson et Cie, éditeurs, Paris, 1972. |
[65] |
D. Oakes, Self-calibrating priors do not exist, J. Amer. Statist. Assoc., 80 (1985), 339-342, With comments by A. P. Dawid and Mark J. Schervish.doi: 10.1080/01621459.1985.10478117. |
[66] |
W. Olszewski, Calibration and expert testing, in Handbook of Game Theory (eds. P. Young and S. Zamir), vol. IV, to appear. |
[67] |
V. Perchet, Calibration and internal no-regret with random signals, Algorithmic learning theory, 68-82, Lecture Notes in Comput. Sci., 5809, Springer, Berlin, 2009.doi: 10.1007/978-3-642-04414-4_10. |
[68] |
V. Perchet, Approchabilité, Calibration et Regret dans les Jeux à Observations Partielles (in French), PhD thesis, Université Pierre et Marie Curie, 2010. |
[69] |
V. Perchet, Approachability of convex sets in games with partial monitoring, J. Optim. Theory Appl., 149 (2011), 665-677.doi: 10.1007/s10957-011-9797-3. |
[70] |
V. Perchet, No-regret with partial monitoring: Calibration-based optimal algorithms, J. Mach. Learn. Res., 12 (2011), 1893-1921. |
[71] |
V. Perchet, Exponential weights approachability, applications to regret minimization and calibration, Manuscript. |
[72] |
V. Perchet and M. Quincampoix, Purely informative game: Application to approachability with partial monitoring, Manuscript. |
[73] |
A. Rakhlin, Lecture notes on online learning, Manuscript. |
[74] |
A. Rakhlin, K. Sridharan and A. Tewari, Online Learning: Random Averages, Combinatorial Parameters, and Learnability, in Neural Information Processing Systems, 2010. |
[75] |
A. Rakhlin, K. Sridharan and A. Tewari, Online learning: Beyond regret, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 559-594. |
[76] |
W. Rudin, Real and Complex Analysis, McGraw-Hill Series in Higher Mathematics, New York, 1974. |
[77] |
A. Sandroni, R. Smorodinsky and R. V. Vohra, Calibration with many checking rules, Math. Oper. Res., 28 (2003), 141-153.doi: 10.1287/moor.28.1.141.14264. |
[78] |
E. Seneta, Nonnegative Matrices and Markov Chains, 2nd edition, Springer Series in Statistics, Springer-Verlag, New York, 1981. |
[79] |
S. Sorin, A First Course on Zero-Sum Repeated Games, Springer-Verlag, 2002. |
[80] |
S. Sorin, Lectures on Dynamics in Games, Unpublished Lecture Notes, 2008. |
[81] |
S. Sorin, Exponential weight algorithm in continuous time, Math. Program., 116 (2009), 513-528.doi: 10.1007/s10107-007-0111-y. |
[82] |
X. Spinat, A necessary and sufficient condition for approachability, Math. Oper. Res., 27 (2002), 31-44.doi: 10.1287/moor.27.1.31.333. |
[83] |
G. Stoltz, Incomplete Information and Internal Regret in Prediction of Individual Sequences, PhD thesis, Université Paris-Sud, 2005. |
[84] |
G. Stoltz and G. Lugosi, Internal regret in on-line portfolio selection, Mach. Learn., 59 (2005), 125-159.doi: 10.1007/s10994-005-0465-4. |
[85] |
G. Stoltz and G. Lugosi, Learning correlated equilibria in games with compact sets of strategies, Games Econom. Behav., 59 (2007), 187-208.doi: 10.1016/j.geb.2006.04.007. |
[86] |
N. Vieille, Weak approachability, Math. Oper. Res., 17 (1992), 781-791.doi: 10.1287/moor.17.4.781. |
[87] |
Y. Viossat and A. Zapechelnyuk, No-regret dynamics and fictitious play, J. Econom. Theory, 148 (2013), 825-842.doi: 10.1016/j.jet.2012.07.003. |
[88] |
V. Vovk, Aggregating strategies, in Proceedings of the 3rd Annual Workshop on Computational Learning Theory, 1990, 372-383. |
[89] |
V. Vovk, I. Nouretdinov, A. Takemura and G. Shafer, Defensive forecasting for linear protocols, in Algorithmic Learning Theory, vol. 3734 of Lecture Notes in Comput. Sci., Springer, Berlin, (2005), 459-473.doi: 10.1007/11564089_35. |
[90] |
D. Walkup and R. J.-B. Wets, A lipschitzian characterization of convex polyhedra, Proceedings of the American Mathematical Society, 23 (1969), 167-173.doi: 10.1090/S0002-9939-1969-0246200-8. |
[91] |
A. Zapechelnyuk, Better-reply dynamics with bounded recall, Math. Oper. Res., 33 (2008), 869-879.doi: 10.1287/moor.1080.0323. |
[92] |
M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, in Proceedings of the Twentieth International Conference on Machine Learning (ICML), 2003. |