Abstract: Text classification, which automatically categorizes texts, is one of the foundational elements of natural language processing (NLP) applications. This study investigates how text classification performance can be improved through the integration of entity-relation information obtained from Wikidata (the Wikipedia database) and BERT-based pre-trained Named Entity Recognition (NER) models. Focusing on a significant challenge in NLP, the research evaluates the potential of using entity and relational information to extract deeper meaning from texts. The methodology comprises text preprocessing, entity detection, and the integration of relational information. Experiments conducted on text datasets in both Turkish and English assess the performance of several classification algorithms: Support Vector Machine, Logistic Regression, Deep Neural Network, and Convolutional Neural Network. The results indicate that integrating entity-relation information can significantly enhance algorithm performance in text classification tasks and offers new perspectives for information extraction and semantic analysis in NLP applications. The contributions of this work include the use of distantly supervised entity-relation information in Turkish text classification, the development of a Turkish relational text classification approach, and the creation of a relational database. By demonstrating potential performance improvements from integrating distantly supervised entity-relation information into Turkish text classification, this research aims to support the effectiveness of text-based artificial intelligence (AI) tools. It also contributes to the development of multilingual text classification systems by adding deeper meaning to text content, providing a reference point for future research.
Funding: National Natural Science Foundation of China (Grant Nos. 62376166, 62306188, 61876113); National Key R&D Program of China (No. 2022YFC3303504).
Abstract: Discourse relation classification is a fundamental task in discourse analysis, which is essential for understanding the structure and connection of texts. Implicit discourse relation classification aims to determine the relationship between adjacent sentences. It is very challenging because it lacks both explicit discourse connectives as linguistic cues and sufficient annotated training data. In this paper, we propose a discriminative instance selection method that constructs synthetic implicit discourse relation data from easy-to-collect explicit discourse relations. An expanded instance consists of an argument pair and its sense label. We introduce the argument pair type classification task, which distinguishes implicit from explicit argument pairs and selects, for data expansion, the explicit argument pairs that are most similar to natural implicit argument pairs. We also propose a simple label-smoothing technique to assign robust sense labels to the selected argument pairs. We evaluate our method on PDTB 2.0 and PDTB 3.0. The results show that our method consistently improves the performance of the baseline model and achieves results competitive with state-of-the-art models.
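The abstract does not say which label-smoothing variant the authors use. As a point of reference only, the following minimal sketch shows standard uniform label smoothing, in which a hard sense label is replaced by a distribution that keeps most of the probability mass on the labelled class. The function name `smooth_label` and the default `epsilon=0.1` are assumptions made for illustration, not details from the paper.

```python
def smooth_label(label_index, num_classes, epsilon=0.1):
    """Return a uniformly smoothed target distribution for one hard label.

    Standard uniform label smoothing (an assumed stand-in for the paper's
    unspecified technique): every class receives epsilon / num_classes,
    and the labelled class additionally receives 1 - epsilon.
    """
    if not 0 <= label_index < num_classes:
        raise ValueError("label_index must be in [0, num_classes)")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    off_target = epsilon / num_classes
    distribution = [off_target] * num_classes
    distribution[label_index] += 1.0 - epsilon
    return distribution


# Example: four sense classes, hard label on class 1.
target = smooth_label(label_index=1, num_classes=4, epsilon=0.1)
print(target)
```

A smoothed target of this kind penalizes over-confident predictions, which is one plausible reason to use it when the sense label of a synthetic argument pair may be noisy.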
Abstract: The Cheng index distinguishes indica and japonica rice based on six taxonomic traits. This index has been widely used for classification of indica and japonica varieties in China. In this study, a doubled haploid (DH) population derived from anther culture of ZYQ8/JX17 F, a typical inter-subspecies hybrid, was used to investigate the six taxonomic traits, i.e., leaf hairiness (LH), color of hull when heading (CHH), hairiness of hull (HH), length of the first and second panicle internode (LPI), length/width of grain (L/W), and phenol reaction (PH). The morphological index (MI) was also calculated. Based on the molecular linkage map constructed from this
Abstract: Classification methods for relative permeability curves are rarely reported. When multiple relative permeability curves are normalized directly without first being classified, the calculated result may contain a large error. For example, the curve relating oil displacement efficiency to water cut, derived from the relative permeability curves of the LD oilfield, has an uncertain shape in the low water cut stage, and if the curves are normalized directly, the interpreted result for the water-flooded zone is very high. This study solves two problems: 1) The mathematical equation relating oil displacement efficiency to water cut was derived and used to repair the missing data on the oil displacement efficiency versus water cut curve, which resolves the uncertain curve shape. Analysis shows that the curve was unusable because the relative permeability curves had not been classified and optimized; 2) Two classification and evaluation methods for relative permeability curves were put forward, the direct evaluation method and the analogy method, which obtain a typical relative permeability curve by identifying abnormal curves.
Funding: Financially supported by the National Natural Science Foundation of China (Grant No. 41272332).
Abstract: Moisture-induced disintegration of soft rock in Red Beds is common all over the world. The slake durability index test is the most useful way to quantify the durability of soft rocks. Based on a series of slaking tests, this article aims to develop a durability classification that accounts for particle size and the slaking procedure. To describe the slaking procedure in detail, the Relative Slake Durability Index (Id_i) is proposed. Id_i is the percentage ratio of the i-th weight of the oven-dry retained portion to the (i-1)-th weight of the oven-dry retained portion. Results show that the Id_i values of the samples differ greatly at certain stages of the slaking procedure, whereas the traditional Slake Durability Index (Id) is almost constant. Given this limitation of Id in durability classification, an advanced classification based on Id_i and the disintegration ratio (DR) is established in this article. Compared with the durability classification based on Id, the new classification accounts for the particle size of the slaked material and the slaking procedure, so it provides a better measure of the degree of slaking. The classification recommended in this article divides slake durability into three classes (low, medium, and high), and further divides both the low class and the medium class into three subclasses.
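The definition of Id_i given above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the weights are supplied as a list whose first element is the initial oven-dry weight (index 0) and whose later elements are the oven-dry retained weights after each slaking cycle; the function name and the sample weights are hypothetical and do not come from the article.

```python
def relative_slake_durability(retained_weights):
    """Return Id_i (%) for cycles i = 1, 2, ...

    retained_weights[0] is assumed to be the initial oven-dry weight and
    retained_weights[i] the oven-dry retained weight after cycle i.
    Id_i = 100 * W_i / W_(i-1), following the definition in the abstract.
    """
    if len(retained_weights) < 2:
        raise ValueError("need the initial weight and at least one cycle")
    if any(w <= 0 for w in retained_weights[:-1]):
        raise ValueError("every weight used as a denominator must be positive")
    return [
        100.0 * current / previous
        for previous, current in zip(retained_weights, retained_weights[1:])
    ]


# Hypothetical weights in grams: initial, after cycle 1, after cycle 2.
print(relative_slake_durability([500.0, 450.0, 360.0]))
```

Because each Id_i is computed relative to the previous cycle rather than to the initial weight, a sharp drop in one cycle shows up directly in that cycle's value.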
Funding: Under the auspices of the Natural Science Foundation of Jiangsu Province (No. BK2008360) and the Fundamental Research Funds for the Central Universities (Nos. 2009B12714, 2009B11714).
Abstract: Inland freshwater lake wetlands play an important role in regional ecological balance. Hongze Lake is the fourth largest freshwater lake in China. In the past three decades, there has been significant loss of freshwater wetlands within the lake and at the mouths of neighboring rivers, due to disturbance, primarily from human activities. The main purpose of this paper was to explore a practical technology for effectively differentiating wetlands from upland types in close proximity to them. An integrated method combining per-pixel and per-field classification was used to map the wetlands of Hongze Lake and their neighboring upland types. Firstly, Landsat ETM+ imagery was segmented and classified using spectral and textural features. Secondly, ETM+ spectral bands, textural features derived from ETM+ Pan imagery, relative relations between neighboring classes, shape features, and elevation were used in a decision tree classification. Thirdly, per-pixel classification results from the decision tree classifier were improved by using the results of object-oriented classification as context. The results show that the technology not only overcame the salt-and-pepper effect commonly observed in past studies, but also improved the accuracy of identification by nearly 5%.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 51478056 and 51208202).
Abstract: This study explores a suitable method for classifying landforms in order to support decision making for community siting in mountainous areas. It first proposes the landform classification for community siting (LCCS) method, with detailed discussion of its rationale and the chosen parameters. The method is then tested and verified in Quxian county. The LCCS method uses two grading parameters, relative relief as the first and slope as the second, followed by a synthesis process that forms a suitable landform classification system. Application of the LCCS method in Quxian county shows that its use of watersheds to identify geomorphometric units, together with its use of the altitude datum concept, can effectively classify landforms according to local cultural traditions and economic and environmental conditions. The verification shows that, compared with conventional methods, the LCCS method respects people's daily experience because of its bottom-up approach. It not only helps to minimize disturbance to nature when choosing locations for community development, but also helps to prepare more precise land management policies that maximize agricultural production and minimize terrain transformation.
Funding: Project supported by the Hi-Tech Research and Development Program (863) of China (No. 2003AA119010) and the China-American Digital Academic Library (CADAL) Project (No. CADAL2004002).
Abstract: As digital images rapidly grow in popularity on the Internet, semantic image classification has become an important research topic. However, well-known content-based image classification methods do not overcome the so-called semantic gap problem, in which low-level visual features cannot represent the high-level semantic content of images. Image classification using visual and textual information often performs poorly because the extracted textual features are often too limited to represent the images accurately. In this paper, we propose a semantic image classification approach using multi-context analysis. For a given image, we model the relevant textual information as its multi-modal context, and regard the related images connected by hyperlinks as its link context. Two kinds of context analysis models, i.e., cross-modal correlation analysis and a link-based correlation model, are used to capture the correlation among different modalities of features and the topical dependency among images induced by the link structure. We propose a new collective classification model called the relational support vector classifier (RSVC), based on the well-known Support Vector Machines (SVMs) and the link-based correlation model. Experiments showed that the proposed approach significantly improved classification accuracy over that of SVM classifiers using visual and/or textual features.
Abstract: This paper investigates the approach of presenting groups by generators and relations from an original angle. It starts by interpreting this familiar concept through the novel notion of “formal words” created by juxtaposing letters in a set. Taking that as a basis, several fundamental results related to free groups, such as Dyck’s Theorem, are proven. The paper then highlights three applications of the concept: classifying finite groups of a fixed order, representing all dihedral groups geometrically, and analyzing knots topologically. All three applications are of considerable significance in their respective areas and serve to illustrate both the advantages and certain limitations of the approach.
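As a concrete instance of a presentation by generators and relations, the standard presentation of the dihedral group of order 2n is shown below. The symbols r and s and the notation D_n are conventional choices assumed here; the paper's own notation may differ.

```latex
% Dihedral group of order 2n: symmetries of a regular n-gon.
% r: rotation by 2\pi/n, s: a reflection (conventional symbols, assumed here).
\[
  D_n \;=\; \langle\, r, s \;\mid\; r^{n} = 1,\; s^{2} = 1,\; (sr)^{2} = 1 \,\rangle ,
  \qquad |D_n| = 2n .
\]
% Given s^2 = 1, the relation (sr)^2 = 1 is equivalent to s r s^{-1} = r^{-1},
% so every element can be written uniquely as r^{k} or s r^{k} with 0 \le k < n.
```

Geometrically, the relation (sr)^2 = 1 says that conjugating the rotation by a reflection reverses its direction.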
Funding: Supported by the China National Major Science and Technology Project (2017ZX05030-002); the Natural Science Basic Research Plan in Shaanxi Province of China (2020JQ-747); the Fundamental Research Funds for the Central Universities (300102260107).
Abstract: Experiments on the electrical responses of water-flooded layers were carried out on porous, fractured, porous-fractured, and composite cores taken from carbonate reservoirs in the Zananor Oilfield, Kazakhstan, to determine the effects of injected water salinity on the electrical responses of carbonate reservoirs. On the basis of the experimental results, the mathematical model for calculating oil-water relative permeability of porous reservoirs from resistivity, and the relative permeability model of two-phase flow in fractured reservoirs, classification standards for water-flooded layers suitable for carbonate reservoirs with complex pore structure were established. The results show that the salinity of the injected water is the main factor affecting the resistivity of carbonate reservoirs. When low-salinity water (fresh water) is injected, the curve relating resistivity to water saturation is U-shaped. When high-salinity water (salt water) is injected, the curve is L-shaped. The classification criteria for water-flooded layers in carbonate reservoirs are as follows: (1) In porous reservoirs, the water cut (fw) is less than or equal to 5% in oil layers, 5%–20% in weakly water-flooded layers, 20%–50% in moderately water-flooded layers, and greater than 50% in strongly water-flooded layers. (2) In fractured, porous-fractured, and composite reservoirs, the water cut is less than or equal to 5% in oil layers, 5%–10% in weakly water-flooded layers, 10%–50% in moderately water-flooded layers, and greater than 50% in strongly water-flooded layers.
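The classification criteria above amount to a threshold lookup on water cut, which can be sketched as follows. The threshold values are those stated in the abstract; the function and constant names are hypothetical, and treating each upper bound as inclusive (for example, exactly 20% counted as weakly water-flooded in porous reservoirs) is an assumption, since the abstract does not say how boundary values are assigned.

```python
# Upper bounds on water cut (%) separating the four classes, per reservoir type.
# Values come from the abstract; the dictionary layout is illustrative.
WATER_CUT_BOUNDS = {
    "porous": (5.0, 20.0, 50.0),
    "fractured": (5.0, 10.0, 50.0),
    "porous-fractured": (5.0, 10.0, 50.0),
    "composite": (5.0, 10.0, 50.0),
}

CLASSES = (
    "oil layer",
    "weakly water-flooded",
    "moderately water-flooded",
    "strongly water-flooded",
)


def classify_layer(water_cut_pct, reservoir_type):
    """Map a water cut (%) to a water-flooded class for one reservoir type.

    Upper bounds are treated as inclusive, which is an assumption.
    """
    if not 0.0 <= water_cut_pct <= 100.0:
        raise ValueError("water cut must be a percentage between 0 and 100")
    bounds = WATER_CUT_BOUNDS[reservoir_type]
    for upper, label in zip(bounds, CLASSES):
        if water_cut_pct <= upper:
            return label
    return CLASSES[-1]  # above the last bound: strongly water-flooded


# The same 15% water cut falls in different classes by reservoir type.
print(classify_layer(15.0, "porous"))
print(classify_layer(15.0, "fractured"))
```

The example shows why the reservoir type matters: the narrower weakly water-flooded band for fractured reservoirs moves a 15% water cut into the moderate class.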