GeoLink has leveraged linked data principles to create a dataset that allows users to seamlessly query and reason over some of the most prominent geoscience metadata repositories in the United States. The GeoLink dataset includes such diverse information as port calls made by oceanographic cruises, physical sample metadata, research project funding and staffing, and authorship of technical reports. The data has been published according to best practices for linked data and is publicly available via a SPARQL Protocol and RDF Query Language (SPARQL) endpoint that at present contains more than 45 million Resource Description Framework (RDF) triples, together with a collection of ontologies and geo-visualization tools. This article describes the geoscience datasets, the modeling and publication process, and current uses of the dataset. The focus is on providing enough detail to enable researchers, application developers, and others who wish to leverage the GeoLink data in their own work to do so.
Ontology alignment has been studied for over a decade, and over that time many alignment systems and methods have been developed by researchers in order to find simple 1-to-1 equivalence matches between two ontologies. However, very few alignment systems focus on finding complex correspondences. One reason for this limitation may be that there are no widely accepted alignment benchmarks that contain such complex relationships. In this paper, we propose a real-world data set from the GeoLink project as a potential complex ontology alignment benchmark. The data set consists of two ontologies, the GeoLink Base Ontology (GBO) and the GeoLink Modular Ontology (GMO), as well as a manually created reference alignment that was developed in consultation with domain experts from different institutions. The alignment includes 1:1, 1:n, and m:n equivalence and subsumption correspondences, and is available in both Expressive and Declarative Ontology Alignment Language (EDOAL) and rule syntax. The benchmark has been expanded from its original version to contain real-world instance data from seven geoscience data providers that has been published according to both ontologies. This allows it to be used by extensional alignment systems or those that require training data. This benchmark has been incorporated into the Ontology Alignment Evaluation Initiative (OAEI) complex track to help researchers test their automated alignment systems and algorithms. This paper also analyzes the challenges inherent in effectively generating, detecting, and evaluating complex ontology alignments and provides a road map for future work on this topic.
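To make the distinction between 1:1, 1:n, and m:n correspondences concrete, the following is a minimal sketch of how such a correspondence might be represented and classified. It is not the benchmark's EDOAL or rule format; the `Correspondence` class and all entity names (for example `gbo:Award`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Correspondence:
    """One alignment cell: entities on each side plus the relation between them."""
    source: Tuple[str, ...]  # entities from the first ontology (hypothetical names)
    target: Tuple[str, ...]  # entities from the second ontology (hypothetical names)
    relation: str            # "equivalence" or "subsumption"

    def __post_init__(self) -> None:
        if not self.source or not self.target:
            raise ValueError("both sides need at least one entity")
        if self.relation not in ("equivalence", "subsumption"):
            raise ValueError("unknown relation: " + self.relation)

    def cardinality(self) -> str:
        """Classify by how many entities take part on each side."""
        m, n = len(self.source), len(self.target)
        if m == 1 and n == 1:
            return "1:1"
        if m == 1 or n == 1:
            # Assumption: one entity against several is reported as 1:n
            # regardless of which side holds the single entity.
            return "1:n"
        return "m:n"

    def is_complex(self) -> bool:
        """Anything beyond a simple 1:1 match counts as complex here."""
        return self.cardinality() != "1:1"


# Hypothetical examples; these names are not taken from GBO or GMO.
simple = Correspondence(("gbo:Award",), ("gmo:FundingAward",), "equivalence")
one_to_many = Correspondence(
    ("gbo:hasSponsor",), ("gmo:providesAgentRole", "gmo:isPerformedBy"), "equivalence"
)
many_to_many = Correspondence(
    ("gbo:Cruise", "gbo:hasChiefScientist"),
    ("gmo:Cruise", "gmo:providesAgentRole", "gmo:ChiefScientistRole"),
    "subsumption",
)

for cell in (simple, one_to_many, many_to_many):
    print(cell.cardinality(), cell.relation, cell.is_complex())
```

A system that only emits 1:1 equivalences would produce cells like `simple` exclusively, which is why a reference alignment containing the other two kinds is needed to evaluate complex matchers.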
Funding: This work was supported by the National Science Foundation [1440202].