
A Bayesian nonparametric test for conditional independence
Department of Mathematics, Imperial College London, UK
This article introduces a Bayesian nonparametric method for quantifying the relative evidence in a dataset in favour of the dependence or independence of two variables conditional on a third. The approach places Pólya tree priors on spaces of conditional probability densities, accounting in a nonparametric way for uncertainty about the form of the underlying distributions. The Bayesian perspective yields an inherently symmetric probability measure of conditional dependence or independence, a feature particularly advantageous in causal discovery and not offered by existing procedures of this type.
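To make the Bayes-factor mechanics concrete, the sketch below computes truncated canonical Pólya tree marginal likelihoods on (0, 1) and combines them into a toy evidence measure for conditional independence with discretised Y and Z. This is not the construction used in the article: the function names (`polya_tree_log_marginal`, `log_bayes_factor_ci`), the truncation depth, the Beta(c·j², c·j²) split parameters and the crude binning of Y and Z are illustrative assumptions, chosen only to show how Pólya tree marginal likelihoods can be weighed against each other to favour conditional dependence or independence.

```python
# Minimal illustrative sketch (assumptions noted above, not the paper's method).
import numpy as np
from scipy.special import betaln
from scipy.stats import rankdata


def polya_tree_log_marginal(u, depth=6, c=1.0):
    """Log marginal likelihood of points u in (0, 1) under a truncated
    canonical Polya tree: dyadic partition, Beta(c*j^2, c*j^2) splitting
    probabilities at level j, uniform density within level-`depth` cells."""
    u = np.clip(np.asarray(u, dtype=float), 1e-12, 1 - 1e-12)
    logml = len(u) * depth * np.log(2.0)               # uniform-leaf normalisation
    for j in range(depth):
        a = c * (j + 1) ** 2                           # Beta parameter at child level j+1
        cells = np.floor(u * 2 ** (j + 1)).astype(int)  # child-cell index of each point
        counts = np.bincount(cells, minlength=2 ** (j + 1))
        left, right = counts[0::2], counts[1::2]       # sibling pairs under each level-j node
        logml += np.sum(betaln(a + left, a + right) - betaln(a, a))
    return logml


def log_bayes_factor_ci(u_x, y_bin, z_bin):
    """Toy log Bayes factor for 'X independent of (binned) Y given (binned) Z'.
    H0: one Polya tree for X within each Z stratum, pooled over Y.
    H1: a separate Polya tree for X within each (Y, Z) cell.
    Positive values favour conditional dependence."""
    log_h0, log_h1 = 0.0, 0.0
    for z in np.unique(z_bin):
        in_z = z_bin == z
        log_h0 += polya_tree_log_marginal(u_x[in_z])
        for yv in np.unique(y_bin[in_z]):
            log_h1 += polya_tree_log_marginal(u_x[in_z & (y_bin == yv)])
    return log_h1 - log_h0


# Toy usage: X and Y are related only through a discrete Z, so the log
# Bayes factor is typically negative, i.e. it favours conditional independence.
rng = np.random.default_rng(0)
n = 2000
z = rng.integers(0, 3, size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)
u_x = (rankdata(x) - 0.5) / n                          # rank-transform X into (0, 1)
print(log_bayes_factor_ci(u_x, (y > np.median(y)).astype(int), z))
```

The coarse binning of Y and Z is the shortcut that a genuine model of conditional densities avoids; it is used here only to keep the sketch short, and the Beta(c·j², c·j²) choice is the standard canonical Pólya tree parametrisation rather than anything specific to this article.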