Abstract
Human-centric services are an important domain of the smart city and include a rich set of applications that help residents with shopping, dining, transportation, entertainment, and other daily activities. These applications have generated a massive amount of hierarchical data with different schemas. To manage and analyze this city-wide, cross-application data in a unified way, data schema integration is necessary. However, data from human-centric services has some distinct characteristics, such as a lack of support for semantic matching, a large number of schemas, and incomplete schema element labels. These characteristics make schema integration difficult with existing approaches. We propose a novel framework for data schema integration of human-centric services in the smart city. The framework uses both schema metadata and instance data for schema matching, and introduces human intervention based on a similarity entropy criterion to balance precision and efficiency. Moreover, the framework works incrementally to reduce the computation workload. We conduct an experiment with a real-world dataset collected from multiple estate sale application systems. The results show that our approach can produce a high-quality mediated schema with fewer human interventions than the baseline method.
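The abstract does not define the similarity entropy criterion, so the following is only a minimal sketch of one plausible reading: similarity scores between a source element and its candidate matches are normalized into a distribution, and a human is consulted when the Shannon entropy of that distribution is high (that is, when no candidate clearly dominates). The function names, the normalization step, the example element names, and the threshold of 1.0 bit are all assumptions made for illustration, not the authors' method.

```python
import math


def similarity_entropy(scores):
    """Shannon entropy (in bits) of similarity scores normalized to sum to 1.

    Assumption: the normalization scheme is illustrative only.
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one similarity score must be positive")
    probs = [s / total for s in scores if s > 0]
    return -sum(p * math.log2(p) for p in probs)


def decide_match(candidates, threshold=1.0):
    """Pick the most similar candidate and flag ambiguous cases.

    candidates: dict mapping a candidate element name to its similarity score.
    threshold:  hypothetical entropy cutoff (bits) above which a human decides.
    Returns (best_candidate, needs_human).
    """
    best = max(candidates, key=candidates.get)
    entropy = similarity_entropy(list(candidates.values()))
    return best, entropy > threshold


# One candidate dominates: low entropy, so the match can be accepted automatically.
clear = {"price": 0.9, "address": 0.05, "area": 0.05}
# Scores are close together: high entropy, so the decision is deferred to a human.
ambiguous = {"price": 0.4, "total_price": 0.35, "unit_price": 0.25}

print(decide_match(clear))
print(decide_match(ambiguous))
```

Under this reading, raising the threshold reduces the number of human interventions at the cost of accepting more uncertain automatic matches, which is the precision and efficiency trade-off the abstract refers to.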
Funding
Funded by the National High Technology Research and Development Program of China (863) under Grant No. 2013AA01A605.