This paper proposes a new method for constructing the Chinese sentential semantic structure. The method uses features including predicates, relations between predicates and basic arguments, relations between words, and case types to train CRF++ models and a dependency parser. On the data set of the Beijing Forest Studio-Chinese Tagged Corpus (BFS-CTC), the proposed method obtains a precision of 73.63% in the open test. This result shows that formalized computer processing can construct the sentential semantic structure. The predicate, topic, and comment features extracted with the method can be applied directly in Chinese information processing to promote the development of Chinese semantic analysis. The method makes sentential semantic analysis on large-scale data possible. It is a tool for expanding the corpus and has theoretical research value and practical application value.
Medical image semantic segmentation faces multi-scale changes of segmentation targets, noise interference, rough segmentation results, and slow training. To address these problems, a multi-scale residual aggregation U-shaped attention network, MAAUNet (MultiRes aggregation attention UNet), is proposed based on MultiResUNet. First, an aggregate connection is introduced, building on the original feature aggregation at the same level. The skip connection is redesigned to aggregate features of different semantic scales at the decoder subnet, which further reduces the semantic gaps that may exist between skip connections. Second, a convolution block attention module is added after the multi-scale convolution module to focus and integrate features along the two attention directions of channel and space, adaptively optimizing the intermediate feature map. Finally, the original convolution block is improved. The convolution channels are expanded with a series convolution structure so that they complement each other and extract richer spatial features. Residual connections are retained, and the convolution block is turned into a multi-channel convolution block so that the model extracts multi-scale spatial features. The experimental results show that MAAUNet is highly competitive on challenging datasets and shows good segmentation performance and stability when dealing with multi-scale input and noise interference.
An ontology is a distinct, canonical, and shared system of concepts oriented to objects (fields). Nowadays, every discipline or field attaches great importance to establishing and applying ontologies for research. Ontologies related to linguistics include WordNet by the cognitive linguist Prof. Miller of Princeton University, FrameNet by Prof. Fillmore of the University of California, Berkeley, GOLD (General Ontology for Language Description) by Dr. Farrar of the University of Arizona, and DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering) by the CNR cognitive science and technology research centre of Italy. This article focuses on event structures, a widely discussed topic in cognitive linguistics, and uses an ontological analysis to give a systematic description of the concepts and semantic relationships involved in event structures. Any event structure can be represented through the 7S schema: "For some purpose, somebody does something for someone with some means, sometime and somewhere". An event therefore consists of 7 conceptual domains: purpose, actor, action, object, facility, location, and time. The article describes the main concepts of the 7 domains and over 20 semantic relationships between these domains in detail and illustrates them with examples.
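The seven conceptual domains named in this abstract can be pictured as a simple record type. The following is a minimal sketch, not the article's own formalization: the class name `Event`, the field names (including `object_`, chosen to avoid shadowing Python's built-in `object`), and the example sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Event:
    """One event described by the seven domains listed in the abstract."""
    purpose: Optional[str] = None
    actor: Optional[str] = None
    action: Optional[str] = None
    object_: Optional[str] = None   # the "object" domain (hypothetical field name)
    facility: Optional[str] = None
    location: Optional[str] = None
    time: Optional[str] = None

    def filled_domains(self) -> List[str]:
        """Return the names of the domains that carry a value."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


# Hypothetical sentence: "To pass the exam, Li studies grammar
# with a textbook in the library at night."
event = Event(
    purpose="pass the exam",
    actor="Li",
    action="study",
    object_="grammar",
    facility="textbook",
    location="library",
    time="at night",
)
print(event.filled_domains())
```

Leaving every field optional reflects that a concrete sentence rarely expresses all seven domains at once. The more than 20 semantic relationships mentioned in the abstract are not modeled here because the abstract does not list them.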
Writing is one of the most difficult skills to train and develop. Although some people can circumvent writing in many cases, modern civilization is imposing increasing demands on our ability to write, and to write well. In English writing, the consistency of semantic meaning with structure and function is a very important criterion for evaluating the quality of writing. Students' English academic writing covers different writing types, and the consistency of semantic meaning with structure and function varies across them. This thesis analyzes the consistency of semantic meaning with structure and function in essay writing and in academic and report writing.
This paper proposes a new method for semi-automatic extraction of semantic structures from unlabelled corpora in specific domains. The approach is statistical in nature. The extracted structures can be used for shallow parsing and semantic labeling. By iteratively extracting new words and clustering words, we get an initial semantic lexicon that groups words of the same semantic meaning together as a class. After that, a bootstrapping algorithm is adopted to extract semantic structures. Then the semantic structures are used to extract new ...
On the basis of previous studies, this paper summarizes the semantics of "hai" and "geng" in the comparative structure and identifies the contradictions and disputes in previous studies.
More and more web pages apply AJAX (Asynchronous JavaScript and XML) because of its rich interactivity and incremental communication. Observation shows that AJAX contents, which traditional crawlers cannot see, are generally well structured and belong to one specific domain. Extracting the structured data from AJAX contents and annotating their semantics is very significant for further applications. This paper proposes a structured AJAX data extraction method for the agricultural domain based on an agricultural ontology. First, Crawljax, an open AJAX crawling tool, was adapted to explore and retrieve the AJAX contents. Second, the retrieved contents were partitioned into items and then classified with the help of the agricultural ontology; HTML tags and punctuation were used to segment the retrieved contents into entity items. Finally, the entity items were clustered, and semantic annotations were assigned to the clustering results according to the agricultural ontology. Experimental evaluation showed the proposed approach to be effective in resource exploring, entity extraction, and semantic annotation.
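The segmentation step described in this abstract, splitting retrieved contents into entity items at HTML tags and punctuation, can be illustrated roughly as follows. This is only a sketch using Python's standard `re` module and is not the paper's implementation or Crawljax code. The function name `segment_items`, the two regular expressions, the chosen punctuation set, and the sample fragment are assumptions.

```python
import re

# Assumed separators: any HTML tag, plus a small set of Western and
# Chinese punctuation marks. The period is left out on purpose so that
# decimal numbers such as "2.4" stay in one piece.
TAG_PATTERN = re.compile(r"<[^>]+>")
PUNCT_PATTERN = re.compile(r"[,;:，；：。、]")


def segment_items(html_fragment):
    """Split a fragment into candidate entity items at tags and punctuation."""
    text = TAG_PATTERN.sub("\n", html_fragment)   # every tag becomes a boundary
    text = PUNCT_PATTERN.sub("\n", text)          # so does every listed mark
    return [piece.strip() for piece in text.split("\n") if piece.strip()]


# Hypothetical fragment of retrieved AJAX content.
sample = "<ul><li>Wheat: 2.4 yuan/kg</li><li>Rice; Shandong</li></ul>"
print(segment_items(sample))
```

A regular expression is enough for this toy fragment; real pages with nested markup, scripts, or comments would call for a proper HTML parser before any punctuation-based splitting.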
To improve the quality of web search, a new query expansion method that chooses meaningful structured data from a domain database is proposed. It categorizes attributes into three classes, namely concept attributes, context attributes, and meaningless attributes, according to their semantic features, which are document frequency features and distinguishing capability features. It also defines the semantic relevance between two attributes when they are correlated in the database. It then proposes a trie-bitmap structure and pair pointer tables to implement efficient algorithms for discovering attribute semantic features and detecting their semantic relevances. Using the semantic attributes and their semantic relevances, expansion words can be generated and embedded into a vector space model with interpolation parameters. The experiments use an IMDB movie database and real text collections to evaluate the proposed method by comparing its performance with that of a classical vector space model. The results show that the proposed method can improve text search and that both the semantic features and the semantic relevances have good separation capabilities.
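The abstract says expansion words are embedded into a vector space model with interpolation parameters but does not give the formula. As a rough illustration only, the sketch below assumes a linear interpolation in which original query terms get weight `1 - lam` and expansion terms get weight `lam`, and ranks by cosine similarity. The function names, the parameter `lam`, and the sample query and document are hypothetical.

```python
import math
from collections import Counter


def expand_query(query_terms, expansion_terms, lam=0.3):
    """Build a term-weight vector: (1 - lam) per original term, lam per expansion term."""
    weights = Counter()
    for term in query_terms:
        weights[term] += 1.0 - lam
    for term in expansion_terms:
        weights[term] += lam
    return dict(weights)


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * vec_b.get(term, 0.0) for term, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical document and query; the expansion terms stand in for
# attribute values that a domain database might relate to the query.
doc = dict(Counter("a thriller film directed by nolan".split()))
plain = expand_query(["thriller"], [], lam=0.0)
expanded = expand_query(["thriller"], ["nolan", "film"], lam=0.3)
print(cosine(plain, doc), cosine(expanded, doc))
```

In this toy case the expanded query scores higher against the document because the expansion terms also occur in it. How the paper selects expansion words and sets its interpolation parameters is not stated in the abstract.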
In recent years, many types of semantic similarity measures have been proposed for measuring the similarity between two concepts. It is necessary to characterize how these measures differ in definition, performance, and evaluation. The major contribution of this paper is to choose, among different similarity measures, the one that gives good results with the lowest error rate. The experiment was done on a taxonomy built to measure the semantic distance between two concepts in the health domain, which are represented as nodes in the taxonomy. The similarity measures were evaluated relative to human experts' ratings. The experiment was applied to the ICD10 taxonomy to determine the similarity value between two concepts. The similarity of 30 pairs of health-domain concepts was evaluated using different types of semantic similarity measures. The experimental results show that the measure of Hoa A. Nguyen and Hisham Al-Mubaid achieved a high matching score against the experts' judgments.
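To make the idea of measuring similarity between nodes of a taxonomy concrete, here is a minimal sketch of a plain path-length measure. It is not the Nguyen and Al-Mubaid measure named in the abstract, whose formula the abstract does not give. The toy taxonomy, its node labels (which are not real ICD10 codes), and the formula `1 / (1 + path length)` are assumptions chosen for illustration.

```python
# Hypothetical child -> parent map for a tiny health taxonomy.
PARENT = {
    "respiratory": "disease",
    "circulatory": "disease",
    "asthma": "respiratory",
    "pneumonia": "respiratory",
    "hypertension": "circulatory",
}


def ancestors(node):
    """Return the chain from the node up to the root, node first."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_length(a, b):
    """Number of edges on the path between a and b through their lowest common ancestor."""
    steps_from_b = {n: i for i, n in enumerate(ancestors(b))}
    for steps_from_a, n in enumerate(ancestors(a)):
        if n in steps_from_b:
            return steps_from_a + steps_from_b[n]
    raise ValueError("nodes share no common ancestor")


def path_similarity(a, b):
    """Simple path-based similarity: 1 for identical nodes, smaller as nodes get farther apart."""
    return 1.0 / (1.0 + path_length(a, b))


print(path_similarity("asthma", "pneumonia"))
print(path_similarity("asthma", "hypertension"))
```

Scores from a function like this could then be compared with expert ratings, for example through a correlation coefficient, which is the kind of evaluation the abstract describes.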
Spreadsheets contain a lot of valuable data and have many practical applications. The key technology behind these applications is making machines understand the semantic structure of spreadsheets, e.g., identifying cell function types and discovering relationships between cell pairs. Most existing methods for understanding the semantic structure of spreadsheets do not make use of the semantic information of cells. A few studies do, but they ignore the layout structure information of spreadsheets, which affects the performance of cell function classification and the discovery of different relationship types of cell pairs. In this paper, we propose a Heuristic algorithm for Understanding the Semantic Structure of spreadsheets (HUSS). Specifically, to improve cell function classification, we propose an error correction mechanism (ECM) based on an existing cell function classification model [11] and the layout features of spreadsheets. To improve table structure analysis, we propose five types of heuristic rules to extract four different types of cell pairs, based on cell style and spatial location information. Our experimental results on five real-world datasets demonstrate that HUSS can effectively understand the semantic structure of spreadsheets and outperforms the corresponding baselines.
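To effectively accumulate and reuse the knowledge contained in aviation rubber-plastic sealing structure cases, a method for building an aviation rubber-plastic sealing structure case base using Model Based Definition (MBD) is proposed. First, based on the industry design standards for aviation rubber-plastic sealing structures and on MBD three-dimensional modeling and annotation methods, an MBD-based content framework and representation for aviation rubber-plastic sealing structure cases is established, and the MBD representation of the cases is implemented with the SolidWorks MBD module. Then, geometric features and semantic features are extracted from the MBD case representation, and a "geometry + semantics" case retrieval algorithm is designed. Finally, a prototype system of the MBD-based aviation rubber-plastic sealing structure case base is developed, and its application shows that MBD-based case representation and retrieval achieve the accumulation and reuse of knowledge.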
Funding (Chinese sentential semantic structure construction): Supported by the Science and Technology Innovation Plan of Beijing Institute of Technology (2013).
Funding (MAAUNet medical image segmentation): National Natural Science Foundation of China (No. 61806006); Jiangsu University Superior Discipline Construction Project.
Funding (AJAX data extraction for the agricultural domain): Supported by the Knowledge Innovation Program of the Chinese Academy of Sciences and the National High-Tech R&D Program of China (2008BAK49B05).
Funding (query expansion from a domain database): Program for New Century Excellent Talents in University (No. NCET-06-0290); the National Natural Science Foundation of China (No. 60503036); the Fok Ying Tong Education Foundation Award (No. 104027).
Funding (HUSS spreadsheet semantic structure): Supported in part by the National Natural Science Foundation of China under Grants Nos. 62120106008, 61806065, 61906059, 62076085, 91746209, and 62076087, and by the Fundamental Research Funds for the Central Universities (No. JZ2020HGQA0186).