We study CFG parse tree enumeration in this paper. By dividing the set of all parse trees into infinite hierarchies according to height of parse tree, the hierarchical lexicographic order on the set of parse trees is ...We study CFG parse tree enumeration in this paper. By dividing the set of all parse trees into infinite hierarchies according to height of parse tree, the hierarchical lexicographic order on the set of parse trees is established. Then grammar-based algorithms for counting and enumerating CFG parse trees in this order are presented. To generate a parse tree of height n, the time complexity is O(n). If τ is a lowest parse tree for its yield, then O(n) =O(||τ|| + 1), where ||τ|| is the length of the sentence (yield) generated by τ. The sentence can be obtained as a by-product of the parse tree. To compute sentence from its parse tree (needn't be lowest one), the time complexity is O(node)+O(||τ|| + 1), where node is the number of non-leaf nodes of parse tree τ. To generate both a complete lowest parse tree and its yield at the same time, the time complexity is O(||τ|| + 1).展开更多
This paper proposes a tree kernel method of semantic relation detection and classification (RDC) between named entities. It resolves two critical problems in previous tree kernel methods of RDC. First, a new tree ke...This paper proposes a tree kernel method of semantic relation detection and classification (RDC) between named entities. It resolves two critical problems in previous tree kernel methods of RDC. First, a new tree kernel is presented to better capture the inherent structural information in a parse tree by enabling the standard convolution tree kernel with context-sensitiveness and approximate matching of sub-trees. Second, an enriched parse tree structure is proposed to well derive necessary structural information, e.g., proper latent annotations, from a parse tree. Evaluation on the ACE RDC corpora shows that both the new tree kernel and the enriched parse tree structure contribute significantly to RDC and our tree kernel method much outperforms the state-of-the-art ones.展开更多
This paper explores a tree kernel based method for semantic role labeling (SRL) of Chinese nominal predicates via a convolution tree kernel. In particular, a new parse tree representation structure, called dependenc...This paper explores a tree kernel based method for semantic role labeling (SRL) of Chinese nominal predicates via a convolution tree kernel. In particular, a new parse tree representation structure, called dependency-driven constituent parse tree (D-CPT), is proposed to combine the advantages of both constituent and dependence parse trees. This is achieved by directly representing various kinds of dependency relations in a CPT-style structure, which employs dependency relation types instead of phrase labels in CPT (Constituent Parse Tree). In this way, D-CPT not only keeps the dependency relationship information in the dependency parse tree (DPT) structure but also retains the basic hierarchical structure of CPT style. Moreover, several schemes are designed to extract various kinds of necessary information, such as the shortest path between the nominal predicate and the argument candidate, the support verb of the nominal predicate and the head argument modified by the argument candidate, from D-CPT. This largely reduces the noisy information inherent in D-CPT. Finally, a convolution tree kernel is employed to compute the similarity between two parse trees. Besides, we also implement a feature-based method based on D-CPT. Evaluation on Chinese NomBank corpus shows that our tree kernel based method on D-CPT performs significantly better than other tree kernel-based ones and achieves comparable performance with the state-of-the-art feature-based ones. This indicates the effectiveness of the novel D-CPT structure in representing various kinds of dependency relations in a CPT-style structure and our tree kernel based method in exploring the novel D-CPT structure. This also illustrates that the kernel-based methods are competitive and they are complementary with the feature- based methods on SRL.展开更多
基金Supported by the National Natural Science Foundation of China (Grant Nos. 60273023, 60721061)
文摘We study CFG parse tree enumeration in this paper. By dividing the set of all parse trees into infinite hierarchies according to height of parse tree, the hierarchical lexicographic order on the set of parse trees is established. Then grammar-based algorithms for counting and enumerating CFG parse trees in this order are presented. To generate a parse tree of height n, the time complexity is O(n). If τ is a lowest parse tree for its yield, then O(n) =O(||τ|| + 1), where ||τ|| is the length of the sentence (yield) generated by τ. The sentence can be obtained as a by-product of the parse tree. To compute sentence from its parse tree (needn't be lowest one), the time complexity is O(node)+O(||τ|| + 1), where node is the number of non-leaf nodes of parse tree τ. To generate both a complete lowest parse tree and its yield at the same time, the time complexity is O(||τ|| + 1).
基金Supported by the National Natural Science Foundation of China under Grant Nos.60873150,60970056 and 90920004
文摘This paper proposes a tree kernel method of semantic relation detection and classification (RDC) between named entities. It resolves two critical problems in previous tree kernel methods of RDC. First, a new tree kernel is presented to better capture the inherent structural information in a parse tree by enabling the standard convolution tree kernel with context-sensitiveness and approximate matching of sub-trees. Second, an enriched parse tree structure is proposed to well derive necessary structural information, e.g., proper latent annotations, from a parse tree. Evaluation on the ACE RDC corpora shows that both the new tree kernel and the enriched parse tree structure contribute significantly to RDC and our tree kernel method much outperforms the state-of-the-art ones.
基金Supported by the National Natural Science Foundation of China under Grant Nos.61331011 and 61273320the National High Technology Research and Development 863 Program of China under Grant No.2012AA011102the Natural Science Foundation of Jiangsu Provincial Department of Education under Grant No.10KJB520016
文摘This paper explores a tree kernel based method for semantic role labeling (SRL) of Chinese nominal predicates via a convolution tree kernel. In particular, a new parse tree representation structure, called dependency-driven constituent parse tree (D-CPT), is proposed to combine the advantages of both constituent and dependence parse trees. This is achieved by directly representing various kinds of dependency relations in a CPT-style structure, which employs dependency relation types instead of phrase labels in CPT (Constituent Parse Tree). In this way, D-CPT not only keeps the dependency relationship information in the dependency parse tree (DPT) structure but also retains the basic hierarchical structure of CPT style. Moreover, several schemes are designed to extract various kinds of necessary information, such as the shortest path between the nominal predicate and the argument candidate, the support verb of the nominal predicate and the head argument modified by the argument candidate, from D-CPT. This largely reduces the noisy information inherent in D-CPT. Finally, a convolution tree kernel is employed to compute the similarity between two parse trees. Besides, we also implement a feature-based method based on D-CPT. Evaluation on Chinese NomBank corpus shows that our tree kernel based method on D-CPT performs significantly better than other tree kernel-based ones and achieves comparable performance with the state-of-the-art feature-based ones. This indicates the effectiveness of the novel D-CPT structure in representing various kinds of dependency relations in a CPT-style structure and our tree kernel based method in exploring the novel D-CPT structure. This also illustrates that the kernel-based methods are competitive and they are complementary with the feature- based methods on SRL.