Natural language semantic construction improves the machine's natural language comprehension and analytical abilities. It is the basis for information exchange in an intelligent cloud-computing environment. This paper proposes a natural language semantic construction method based on a cloud database, consisting of two parts: natural language cloud database construction and natural language semantic construction. The natural language cloud database is built on the CloudStack cloud-computing environment and is composed of a corpus, a thesaurus, a word vector library, and an ontology knowledge base. In this part, we concentrate on the preprocessing of the corpus and the representation of the background knowledge ontology, and then put forward a TF-IDF and word vector distance based algorithm for duplicated webpages (TWDW), which raises the recognition efficiency of duplicated web pages. The natural language semantic construction part introduces the dynamic process of semantic construction and proposes a mapping algorithm based on semantic similarity (MBSS), which serves as a bridge between the Predicate-Argument (PA) structure and the background knowledge ontology. Experiments show that, compared with related algorithms, the precision and recall of both proposed algorithms are significantly improved. This work improves the understanding of natural language semantics and provides effective data support for the natural language interaction function of the cloud service.
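To illustrate the TF-IDF half of the duplicate-webpage idea mentioned in the abstract, here is a minimal sketch in Python. It is not the TWDW algorithm itself, which the abstract does not specify: the word vector distance component is omitted, and the whitespace tokenizer, the smoothed IDF formula, the 0.9 threshold, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency and a smoothed IDF (an assumed variant).
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def duplicate_pairs(pages, threshold=0.9):
    """Return index pairs of pages whose TF-IDF cosine similarity >= threshold."""
    docs = [page.lower().split() for page in pages]  # naive tokenizer (assumption)
    vecs = tf_idf_vectors(docs)
    return [
        (i, j)
        for i in range(len(vecs))
        for j in range(i + 1, len(vecs))
        if cosine(vecs[i], vecs[j]) >= threshold
    ]
```

The pairwise comparison is quadratic in the number of pages, so a sketch like this only suits small collections; a production system would need some form of candidate pruning.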
The archiving of Internet traffic is an essential function for retrospective network event analysis and forensic computer communication. The state-of-the-art approach to network monitoring and analysis involves the storage and analysis of network flow statistics. However, this approach loses much valuable information within the Internet traffic. With the advancement of commodity hardware, in particular the capacity of storage devices and the speed of interconnect technologies used in network adapter cards and multi-core processors, it is now possible to capture real-time network traffic at 10 Gbps and beyond using a commodity computer, for example with n2disk. Also, with the advancement of distributed file systems (such as Hadoop, ZFS, etc.) and open cloud computing platforms (such as OpenStack, CloudStack, and Eucalyptus, etc.), it is practical to store such large volumes of traffic data and to analyse the communication in depth within an acceptable latency. In this paper, based on the well-known TimeMachine, we present TIFAflow, the design and implementation of a novel system for archiving and querying network flows. Firstly, we enhance the traffic archiving system named TImemachine+FAstbit (TIFA) with flow granularity, i.e., we supply the system with a flow table and a flow module. Secondly, based on real network traces, we conduct performance comparison experiments of TIFAflow against other implementations such as a common database solution, TimeMachine, and the TIFA system. Finally, based on the comparison results, we demonstrate that TIFAflow achieves better storing and querying performance than TimeMachine and TIFA, in both time and space metrics.
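As a rough illustration of what "flow granularity" can mean, the sketch below groups packets into a flow table in Python. It is not TIFAflow's implementation: the abstract does not describe the flow table's layout, so the 5-tuple key, the `Packet` fields, and the per-flow counters are assumptions chosen for the example.

```python
from collections import namedtuple

# Hypothetical packet record; field names are illustrative only.
Packet = namedtuple("Packet", "ts src dst sport dport proto size")


def build_flow_table(packets):
    """Aggregate packets into flows keyed by the (assumed) 5-tuple.

    Each flow record keeps first/last timestamps plus packet and byte counts.
    """
    flows = {}
    for p in packets:
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        record = flows.get(key)
        if record is None:
            # First packet seen for this flow.
            flows[key] = {"first": p.ts, "last": p.ts, "packets": 1, "bytes": p.size}
        else:
            record["first"] = min(record["first"], p.ts)
            record["last"] = max(record["last"], p.ts)
            record["packets"] += 1
            record["bytes"] += p.size
    return flows
```

Keying on the directed 5-tuple treats the two directions of a connection as separate flows; a bidirectional table would normalize the endpoint order before the lookup.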
Funding: This paper is partially supported by the Natural Science Foundation of Hebei Province (No. F2015207009), the Hebei higher education research project (Nos. BJ2016019, QN2016179), the Research project of Hebei University of Economics and Business (No. 2016KYZ05), and the Education technology research Foundation of the Ministry of Education (No. 2017A01020). The paper is also supported by the National Natural Science Foundation of China under grant No. 61702305, the China Postdoctoral Science Foundation under grant No. 2017M622234, the Qingdao city Postdoctoral Researchers Applied Research Projects, the University Science and Technology Program of Shandong Province under grant No. J16LN08, and the Shandong Province Key Laboratory of Wisdom Mine Information Technology foundation under grant No. WMIT201601.
Funding: the National Key Basic Research and Development (973) Program of China (Nos. 2012CB315801 and 2011CB302805), the National Natural Science Foundation of China A3 Program (No. 61161140320), the National Natural Science Foundation of China (No. 61233016), and the Intel Research Councils UPO program under the title "Security Vulnerability Analysis based on Cloud Platform with Intel IA Architecture".