Abstract
Learning structure from data is one of the most important fundamental tasks in Bayesian network research. In particular, learning the optimal structure of a Bayesian network is an NP-hard (non-deterministic polynomial-time hard) problem. To address this problem, many heuristic algorithms have been proposed, and some of them learn the Bayesian network structure with the help of different types of prior knowledge. However, the existing algorithms place restrictions on the prior knowledge, such as restrictions on its quality and on how it can be used, which makes it difficult to exploit the prior knowledge well. In this paper, we introduce prior knowledge into the Markov chain Monte Carlo (MCMC) algorithm and propose an algorithm called Constrained MCMC (C-MCMC) to learn the structure of a Bayesian network. Three types of prior knowledge are defined: existence of a parent node, absence of a parent node, and distribution knowledge, which includes the conditional probability distribution (CPD) of edges and the probability distribution (PD) of nodes. All of these types of prior knowledge are easily used in this algorithm. We conduct extensive experiments to demonstrate the feasibility and effectiveness of the proposed C-MCMC method.
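To make the idea of constraining a structure-learning MCMC with parent-node knowledge concrete, the following is a minimal sketch, not the C-MCMC algorithm of this paper. It assumes a generic Metropolis-Hastings sampler over directed acyclic graphs with a single edge-toggle proposal. The names `constrained_mcmc`, `has_cycle`, `required`, `forbidden`, and `log_score` are hypothetical, the scoring function is supplied by the caller, and the distribution knowledge (CPD of edges, PD of nodes) is not modeled here.

```python
import math
import random


def has_cycle(parents):
    """Return True if the graph {node: set_of_parents} contains a directed cycle."""
    state = {}  # node -> 1 while on the DFS stack, 2 once fully explored

    def visit(node):
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        if any(visit(p) for p in parents[node]):
            return True
        state[node] = 2
        return False

    return any(visit(n) for n in parents)


def constrained_mcmc(nodes, log_score, required, forbidden, steps, seed=0):
    """Sample DAGs by toggling one edge per step while honoring edge constraints.

    required / forbidden are sets of (parent, child) pairs (hypothetical encoding
    of "existence of parent node" / "absence of parent node" knowledge).
    Returns the best-scoring structure visited and its log-score.
    """
    rng = random.Random(seed)
    parents = {n: set() for n in nodes}
    for u, v in required:  # start from the graph made of the required edges
        parents[v].add(u)
    if has_cycle(parents):
        raise ValueError("required edges already form a cycle")

    current = log_score(parents)
    best = {n: set(ps) for n, ps in parents.items()}
    best_score = current

    for _ in range(steps):
        u, v = rng.sample(nodes, 2)  # ordered pair of distinct nodes
        removing = u in parents[v]
        if removing:
            if (u, v) in required:
                continue  # never delete a required parent
            parents[v].remove(u)
        else:
            if (u, v) in forbidden:
                continue  # never add a forbidden parent
            parents[v].add(u)
            if has_cycle(parents):
                parents[v].remove(u)
                continue  # proposal leaves the space of DAGs

        proposed = log_score(parents)
        # The toggle proposal is symmetric, so the acceptance ratio is the score ratio.
        if rng.random() < math.exp(min(0.0, proposed - current)):
            current = proposed
            if current > best_score:
                best_score = current
                best = {n: set(ps) for n, ps in parents.items()}
        elif removing:
            parents[v].add(u)  # rejected: restore the removed edge
        else:
            parents[v].remove(u)  # rejected: drop the added edge

    return best, best_score
```

Because invalid proposals are simply skipped, every visited structure contains all required edges, contains no forbidden edge, and is acyclic. In practice `log_score` would be a data-driven score such as a log marginal likelihood; any callable that maps the `parents` dictionary to a float works with this sketch.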
Funding
This work was supported by the National Natural Science Foundation of China under Grant No. 61372171 and the National Key Technology Research and Development Program of China under Grant No. 2012BAH23B03.
Acknowledgement
We thank the anonymous reviewers for their constructive and valuable comments. We also thank Professor Jian-Feng Zhan at the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, for his technical suggestions on this paper.