This work reports the progress of previous related work, based on an experiment to improve the intelligence of robotic systems, with the aim of achieving richer linguistic communication between humans and robots. In this paper, the authors attempt an algorithmic approach to natural language generation through hole semantics and by applying the OMAS-III computational model as a grammatical formalism. In the original work, a technical language is used, while in the later works this has been replaced by a limited Greek natural language dictionary. This particular effort was made to give the evolving system the ability to ask questions, and the authors also developed an initial dialogue system using these techniques. The results show that the techniques the authors apply can lead to a more sophisticated dialogue system in the future.
A method of automatic abstracting based on text clustering and natural language understanding is explored, aimed at overcoming the shortcomings of some current methods. The method makes use of text clustering and can realize automatic abstracting of multiple documents. An algorithm of twice word segmentation based on the title and the first sentences of paragraphs is investigated; its precision and recall are above 95%. For a specific domain (plastics), an automatic abstracting system named TCAAS is implemented, whose precision and recall for multi-document abstracting are above 75%. The experiments also prove that it is feasible to use the method to develop a domain-specific automatic abstracting system, which is valuable for further in-depth study.
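As a rough illustration of the clustering step only (not the TCAAS system or its twice-word-segmentation algorithm), the sketch below clusters sentences from several documents with TF-IDF and KMeans and keeps one representative sentence per cluster; the toy sentences and the cluster count are assumptions.

```python
# Hedged sketch: extractive multi-document summarization via sentence clustering.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def cluster_summary(sentences, n_clusters=3):
    """Cluster sentences and return one representative sentence per cluster."""
    vec = TfidfVectorizer()
    X = vec.fit_transform(sentences)          # sentence-level TF-IDF vectors
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    summary = []
    for c in range(n_clusters):
        idx = [i for i, lab in enumerate(km.labels_) if lab == c]
        if not idx:
            continue
        # pick the sentence closest to the cluster centroid
        sims = cosine_similarity(X[idx], km.cluster_centers_[c].reshape(1, -1))
        summary.append(sentences[idx[int(sims.argmax())]])
    return summary

if __name__ == "__main__":
    docs = ["Plastics recycling reduces waste.",
            "New polymer blends improve strength.",
            "Recycled plastics lower production cost.",
            "Polymer strength depends on blend ratio.",
            "Waste plastics can be reprocessed cheaply."]
    print(cluster_summary(docs, n_clusters=2))
```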
Spoken dialogue systems are an active research field with wide applications, but the differences in Chinese spoken dialogue systems are not as distinct as those in English. Chinese spoken dialogues exhibit many language phenomena: most utterances are ill-formed, and ellipsis, anaphora and negation are widely used. Determining how to extract semantic information from incomplete sentences and resolve negation, anaphora and ellipsis is crucial. SHTQS (Shanghai Transportation Query System) is an intelligent telephone-based spoken dialogue system providing information about the best route between any two sites in Shanghai. After a brief description of the system, its natural language processing is emphasized. Speech recognition output unavoidably contains errors, and in sequential language processing these errors can easily propagate to later stages with a ripple effect. To detect and recover from these errors as early as possible, the language-processing strategies are specially designed: for errors resulting from wrongly divided words in speech recognition, segmentation and POS tagging approaches that can rectify these errors are proposed. Since most inquiry utterances are ill-formed and negation, anaphora and ellipsis are common language phenomena, the language understanding must be adequately adaptive, so a partial syntactic parsing scheme with a chart algorithm is adopted. The parser is based on unification grammar, and the semantic frame extracted from the best arc set of the chart is used to represent the meaning of a sentence. Negation, anaphora and ellipsis are also analyzed and corresponding processing approaches are presented. The accuracy of the language processing part is 88.39%, and the testing results show that the language processing strategies are rational and effective.
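The semantic-frame idea can be illustrated with a much smaller stand-in than the SHTQS chart parser: the sketch below fills a hypothetical route-query frame (origin, destination, transport mode) from a segmented, possibly ill-formed utterance using simple patterns. The slot names and patterns are assumptions for illustration only, not the system's unification grammar.

```python
# Illustrative sketch only (not the SHTQS parser): filling a route-query
# semantic frame from a possibly ill-formed, already-segmented utterance.
import re

FRAME_SLOTS = ("origin", "destination", "transport_mode")

def fill_frame(tokens):
    """Very small pattern-based slot filler for route queries."""
    frame = dict.fromkeys(FRAME_SLOTS)
    text = " ".join(tokens)
    m = re.search(r"from (\w+)", text)
    if m:
        frame["origin"] = m.group(1)
    m = re.search(r"to (\w+)", text)
    if m:
        frame["destination"] = m.group(1)
    if "bus" in tokens:
        frame["transport_mode"] = "bus"
    elif "metro" in tokens or "subway" in tokens:
        frame["transport_mode"] = "metro"
    return frame

print(fill_frame("how go from Xujiahui to Pudong by metro".split()))
# {'origin': 'Xujiahui', 'destination': 'Pudong', 'transport_mode': 'metro'}
```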
With the increase of data on the Internet, data analysis has become indispensable for saving time and gaining efficiency, especially in bibliographic information retrieval systems. The number of active scientific journals is estimated at around 40,000, with about four million articles published each year. Machine learning and deep learning applied to recommender systems have become unavoidable, whether in industry or in research. In this work, we propose an optimized interface for bibliographic information retrieval as a running example, which allows different kinds of researchers to find what they need according to relevant criteria through natural language understanding. Papers indexed in Web of Science and Scopus are in high demand. Natural language text and linguistic-based techniques, such as tokenization, named entity recognition, and syntactic and semantic analysis, are used to express natural language queries. Our interface uses association rules to find more related papers for recommendation, and spanning trees are explored to optimize the search process of the system.
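The association-rule step can be sketched independently of the authors' interface: the toy code below mines pairwise rules over hypothetical paper keyword sets using plain support and confidence thresholds, which is the general technique the abstract names rather than the authors' implementation.

```python
# Hedged sketch: pairwise association rules over paper keyword sets,
# illustrating the "related papers" recommendation idea (not the authors' system).
from itertools import combinations
from collections import Counter

papers = [  # hypothetical keyword sets
    {"nlp", "ner", "tokenization"},
    {"nlp", "recommender", "association rules"},
    {"recommender", "association rules", "deep learning"},
    {"nlp", "ner", "deep learning"},
]

def association_rules(transactions, min_support=0.25, min_confidence=0.5):
    n = len(transactions)
    item_count = Counter(i for t in transactions for i in t)
    pair_count = Counter(p for t in transactions for p in combinations(sorted(t), 2))
    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = c / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules

for lhs, rhs, sup, conf in association_rules(papers):
    print(f"{lhs} -> {rhs}  support={sup:.2f} confidence={conf:.2f}")
```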
Understanding people's emotions through natural language is a challenging task for intelligent systems based on the Internet of Things (IoT). The major difficulty is the lack of basic knowledge about emotion expressions with respect to a variety of real-world contexts. In this paper, we propose a Bayesian inference method to explore latent semantic dimensions as contextual information in natural language and to learn the knowledge of emotion expressions based on these semantic dimensions. Our method simultaneously infers the latent semantic dimensions as topics over words and predicts the emotion labels of both word-level and document-level texts. The Bayesian inference results enable us to visualize the connection between words and emotions with respect to different semantic dimensions. By further incorporating a corpus-level hierarchy into the document emotion distribution assumption, we can balance the document emotion recognition results and achieve even better word and document emotion predictions. Our experiments on word-level and document-level emotion prediction, based on the well-developed Chinese emotion corpus Ren-CECps, show both higher accuracy and better robustness compared to state-of-the-art emotion prediction algorithms.
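The following toy computation is not the paper's Bayesian model, but it shows the kind of decomposition the abstract describes: once a topic layer supplies p(topic | word) and p(emotion | topic), word- and document-level emotion estimates follow by marginalizing over topics. All probabilities below are made up for illustration.

```python
# Toy numeric sketch (not the paper's inference procedure): composing a
# document emotion distribution from topic-conditioned emotion knowledge.
import numpy as np

p_topic_given_word = {          # hypothetical, 2 latent topics
    "exam":  np.array([0.9, 0.1]),
    "fail":  np.array([0.8, 0.2]),
    "party": np.array([0.1, 0.9]),
}
p_emotion_given_topic = np.array([  # rows: topics, cols: (sad, joy)
    [0.85, 0.15],
    [0.10, 0.90],
])

def doc_emotion(words):
    known = [p_topic_given_word[w] for w in words if w in p_topic_given_word]
    topic_mix = np.mean(known, axis=0)         # document-level topic mixture
    return topic_mix @ p_emotion_given_topic   # p(emotion | document)

print(doc_emotion(["exam", "fail"]))   # mostly "sad"
print(doc_emotion(["party"]))          # mostly "joy"
```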
Artificial intelligence based dialog systems are getting attention from both the business and academic communities. The key parts of such intelligent chatbot systems are domain classification, intent detection, and named entity recognition, and various supervised, unsupervised, and hybrid approaches are used for each. Such intelligent systems, also called natural language understanding systems, analyze user requests in sequential order: domain classification, then intent and entity recognition based on the semantic rules of the classified domain. This sequential approach propagates downstream errors; i.e., if the domain classification model fails to classify the domain, intent and entity recognition fail as well. Furthermore, training such intelligent systems necessitates a large number of user-annotated datasets for each domain. This study proposes a single joint predictive deep neural network framework based on long short-term memory, using only a small user-annotated dataset, to address these issues. It investigates the value added by incorporating unlabeled data from user chat logs into multi-domain spoken language understanding systems. A systematic experimental analysis of the proposed joint frameworks, along with the semi-supervised multi-domain model, using open-source annotated and unannotated utterances, shows a robust improvement in the predictive performance of the proposed multi-domain intelligent chatbot over a base joint model and a joint model based on adversarial learning.
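A minimal sketch of such a joint architecture, assuming a shared LSTM encoder with separate heads for domain, intent and slot tags (the hyperparameters, sizes and names are illustrative, not the authors' model), could look as follows in PyTorch:

```python
# Hedged sketch: one shared encoder, three prediction heads trained jointly,
# avoiding the error propagation of a sequential pipeline.
import torch
import torch.nn as nn

class JointNLU(nn.Module):
    def __init__(self, vocab_size, n_domains, n_intents, n_slot_tags,
                 embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.domain_head = nn.Linear(hidden_dim, n_domains)   # utterance-level
        self.intent_head = nn.Linear(hidden_dim, n_intents)   # utterance-level
        self.slot_head = nn.Linear(hidden_dim, n_slot_tags)   # token-level (NER)

    def forward(self, token_ids):
        x = self.embed(token_ids)                 # (batch, seq, embed_dim)
        outputs, (h_n, _) = self.lstm(x)          # outputs: (batch, seq, hidden)
        utt = h_n[-1]                             # final hidden state of last layer
        return self.domain_head(utt), self.intent_head(utt), self.slot_head(outputs)

model = JointNLU(vocab_size=1000, n_domains=3, n_intents=10, n_slot_tags=7)
domain_logits, intent_logits, slot_logits = model(torch.randint(0, 1000, (2, 12)))
print(domain_logits.shape, intent_logits.shape, slot_logits.shape)
# torch.Size([2, 3]) torch.Size([2, 10]) torch.Size([2, 12, 7])
```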
Human-computer dialogue systems provide a natural-language-based interface between humans and computers. They are widely demanded in network information services, intelligent accompanying robots, and so on. A human-computer dialogue system typically consists of three parts, namely Natural Language Understanding (NLU), Dialogue Management (DM) and Natural Language Generation (NLG), and each part has several subtasks. Each subtask has received a lot of attention, and many improvements have been achieved on each subtask respectively. But systems built in the traditional pipeline way, where different subtasks are assembled sequentially, suffer from problems such as error accumulation and expansion, and difficulty of domain transfer. Therefore, research on jointly modeling several subtasks within one part or across different parts has advanced greatly in recent years, especially with the rapid development of joint models based on deep neural networks. There is even some work aiming to integrate all subtasks of a dialogue system in a single model, namely end-to-end models. This paper first introduces two basic frames of current dialogue systems and gives a brief survey of recent advances on the various subtasks, and then focuses on joint models for multiple dialogue subtasks. We review several different joint models, including integration of several subtasks inside NLU or NLG, joint modeling across NLG and DM, and joint modeling through NLU, DM and NLG. Both the advantages and the problems of these joint models are discussed. We consider that joint models, or end-to-end models, will be one important trend in developing human-computer dialogue systems.
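The pipeline structure and its error-propagation problem can be made concrete with a schematic stand-in; the components and templates below are invented for illustration and are not any surveyed system.

```python
# Schematic sketch of the three-stage pipeline the survey describes (NLU -> DM -> NLG).
from dataclasses import dataclass

@dataclass
class SemanticFrame:
    intent: str
    slots: dict

def nlu(utterance: str) -> SemanticFrame:
    # stand-in for intent detection + slot filling
    if "weather" in utterance:
        return SemanticFrame("ask_weather", {"city": "Shanghai"})
    return SemanticFrame("unknown", {})

def dm(frame: SemanticFrame) -> str:
    # stand-in for dialogue policy: choose a system action
    return "inform_weather" if frame.intent == "ask_weather" else "ask_rephrase"

def nlg(action: str) -> str:
    templates = {"inform_weather": "It is sunny in Shanghai today.",
                 "ask_rephrase": "Sorry, could you rephrase that?"}
    return templates[action]

print(nlg(dm(nlu("what's the weather like"))))
# An NLU mistake ("unknown") propagates: DM and NLG can then only ask to rephrase,
# which is the error accumulation that joint and end-to-end models try to avoid.
```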
The traditional strategy of 3D model reconstruction mainly concentrates on orthographic projections or engineering drawings, but it has shortcomings: only a few kinds of solids can be reconstructed, the time complexity is high, and little information about the 3D model is recovered. This research extends the scope by treating the process card as part of the 3D reconstruction. A set of process data is a superset of the set of 2D engineering drawings: it comprises process drawings and process steps, and describes the sequential, asymptotic course by which a part is made from a rough blank into the final product. According to these characteristics, the object to be reconstructed is translated from complicated engineering drawings into a series of much simpler process drawings. With the plentiful process information added for reconstruction, disturbances such as irrelevant graphs, symbols and labels can be avoided. Moreover, the change of form between neighboring process drawings is so small that interpreting the engineering drawings poses no difficulty; in addition, abnormal solutions and multiple solutions can be avoided during reconstruction, and the problem of applicability to more objects is ultimately solved. Therefore, a practical method for 3D model reconstruction becomes possible. On the other hand, the feature information in process cards is provided to the reconstruction model. Focusing on process cards, the feasibility and requirements of Working Procedure Model reconstruction are analyzed, and the method of applying Natural Language Understanding to 3D reconstruction is studied. A method of asymptotic approximation to the product is proposed, by which a 3D process model can be constructed automatically and intelligently. The process model not only includes information about part characteristics, but can also deliver design, process and engineering information to downstream applications.
SHTQS is an intelligent telephone-based spoken dialogue system providing information about the best route between two sites in Shanghai. Instead of separating speech decoding from language parsing, SHTQS achieves close cooperation by integrating the automatic speech recognizer (ASR), language understanding, dialogue management and speech generator. In this way, erroneous analyses and uncertainty arising in the preceding stages can be recovered and resolved accurately with high-level knowledge. Moreover, instead of shallow word-level analysis or simple keyword or key-phrase matching, a deeper analysis is performed in our system by integrating a robust parser and a semantic interpreter. The robust parser is particularly important for spontaneous speech input because most of the inquiry sentences and phrases are ill-formed. In addition, in designing a mixed-initiative dialogue system, understanding users' inquiries is essential, and simply matching keywords and/or key phrases can hardly achieve this; therefore, a semantic interpreter is incorporated into our system. The performance of the system is also evaluated: the dialogue efficiency is 4.4 sentences per query on average and the case precision rate of the language understanding module is up to 81%. The results are satisfactory.
People often communicate with auto-answering tools such as conversational agents due to their 24/7 availability and unbiased responses. However, chatbots are normally designed for specific purposes and areas of experience and cannot answer questions outside their scope. Chatbots employ Natural Language Understanding (NLU) to infer their responses. There is a need for a chatbot that can learn from inquiries and expand its area of experience over time. Such a chatbot must be able to build profiles representing intended topics, in a similar way to the human brain, for fast retrieval. This study proposes a methodology to enhance a chatbot's brain functionality by clustering the available knowledge base into sets of related themes and building representative profiles. We used a COVID-19 information dataset to evaluate the proposed methodology, since the pandemic has been accompanied by an "infodemic" of fake news. The chatbot was evaluated by a medical doctor and in a public trial with 308 real users. Evaluations were obtained and statistically analyzed to measure effectiveness, efficiency, and satisfaction as described by the ISO 9214 standard. The proposed COVID-19 chatbot system relieves doctors from answering questions, and chatbots provide an example of the use of technology to handle an infodemic.
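A minimal sketch of the profile idea, assuming a TF-IDF representation and KMeans clustering (the Q&A pairs and parameters are invented, and this is not the evaluated COVID-19 system): route a new question to the closest theme profile, then match within that profile.

```python
# Hedged sketch: cluster a Q&A knowledge base into themed profiles,
# then route a new question to the closest profile before matching.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

kb = [("How does the virus spread?", "Mainly via respiratory droplets."),
      ("Do masks help?", "Yes, masks reduce transmission."),
      ("What are common symptoms?", "Fever, cough and fatigue."),
      ("Is loss of smell a symptom?", "Yes, it can be an early symptom.")]

questions = [q for q, _ in kb]
vec = TfidfVectorizer().fit(questions)
X = vec.transform(questions)
profiles = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

def answer(user_question):
    q = vec.transform([user_question])
    profile = profiles.predict(q)[0]                       # route to a theme profile
    idx = [i for i, lab in enumerate(profiles.labels_) if lab == profile]
    best = max(idx, key=lambda i: cosine_similarity(q, X[i])[0, 0])
    return kb[best][1]

print(answer("are face masks useful against spread"))
```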
The process of understanding natural language can be viewed as a process of model construction. This paper, employing Kripke frames for intuitionistic logic semantics as the tool of model construction for natural language, introduces a method of incremental model construction.
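As an illustration only (not the paper's formalism), the sketch below represents an intuitionistic Kripke model in code, grows it incrementally as discourse information arrives, and evaluates formulas with the standard intuitionistic forcing clauses; the world names and atoms are hypothetical.

```python
# Illustrative sketch: an intuitionistic Kripke model with a monotone valuation,
# extended incrementally and queried with the usual forcing relation.
class KripkeModel:
    def __init__(self):
        self.worlds = set()
        self.le = {}        # le[w] = set of worlds >= w (reflexive, transitive)
        self.val = {}       # val[w] = atoms true at w (monotone along <=)

    def add_world(self, w, parents=(), atoms=()):
        """Incremental step: add a world above the given parent worlds."""
        self.worlds.add(w)
        inherited = set(atoms)
        for p in parents:
            inherited |= self.val[p]          # persistence of atomic facts
            for v in self.le:
                if p in self.le[v]:
                    self.le[v].add(w)
        self.le[w] = {w}
        self.val[w] = inherited

    def forces(self, w, formula):
        kind = formula[0]
        if kind == "atom":
            return formula[1] in self.val[w]
        if kind == "and":
            return self.forces(w, formula[1]) and self.forces(w, formula[2])
        if kind == "or":
            return self.forces(w, formula[1]) or self.forces(w, formula[2])
        if kind == "imp":   # intuitionistic implication quantifies over future worlds
            return all(not self.forces(v, formula[1]) or self.forces(v, formula[2])
                       for v in self.le[w])
        raise ValueError(kind)

m = KripkeModel()
m.add_world("w0", atoms=["dog(fido)"])
m.add_world("w1", parents=["w0"], atoms=["barks(fido)"])
print(m.forces("w0", ("imp", ("atom", "dog(fido)"), ("atom", "barks(fido)"))))  # False
print(m.forces("w1", ("atom", "barks(fido)")))                                  # True
```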
Metaphor computation has attracted more and more attention because metaphor is, to some extent, the focus of the mechanisms of mind and language. However, it encounters problems not only due to the rich expressive power of natural language but also due to the cognitive nature of human beings. Machine understanding of metaphor is therefore becoming a bottleneck in natural language processing and machine translation. This paper first suggests how a metaphor is understood and then presents a survey of current computational approaches, in terms of their historical linguistic roots, underlying foundations, methods and techniques currently used, advantages, limitations, and future trends. A comparison between metaphors in English and Chinese is also introduced because, compared with the developments for English, Chinese metaphor computation is just at its starting stage; a separate summary of current progress in Chinese metaphor computation is therefore presented. As a conclusion, a few suggestions are proposed for further research on metaphor computation, especially Chinese metaphor computation.
Due to their significance and value in human-computer interaction and natural language processing, task-oriented dialog systems are attracting more and more attention in both the academic and industrial communities. In this paper, we survey recent advances and challenges in task-oriented dialog systems. We also discuss three critical topics for task-oriented dialog systems: (1) improving data efficiency to facilitate dialog modeling in low-resource settings, (2) modeling multi-turn dynamics for dialog policy learning to achieve better task-completion performance, and (3) integrating domain ontology knowledge into the dialog model. Besides, we review recent progress in dialog evaluation and some widely used corpora. We believe that this survey, though incomplete, can shed light on future research in task-oriented dialog systems.
In this paper we present the results of the Interactive Argument-Pair Extraction in Judgement Document Challenge held by both the Chinese AI and Law Challenge (CAIL) and the Chinese National Social Media Processing Conference (SMP), and introduce the related data set, SMP-CAIL2020-Argmine. The task challenged participants to choose, among five candidates proposed by the defense to refute or acknowledge the given argument made by the plaintiff, the correct argument, given the full context recorded in the judgement documents of both parties. We received entries from 63 competing teams, 38 of which scored higher than the provided baseline model (BERT) in the first phase and entered the second phase. The best-performing system in each of the two phases achieved an accuracy of 0.856 and 0.905, respectively. We present the results of the competition and a summary of the systems, highlighting commonalities and innovations among the participating systems. The SMP-CAIL2020-Argmine data set and baseline models have already been released.
A new tagging method is presented to build a Chinese semantic corpus. The method characterizes the sentence meaning as a linear sequence of dependency relationships, which are the semantic or syntactic relationships between words in the sentence. This representation is used to build a Chinese statistical parser model to understand sentence meaning. Experiments on automatic telephone switchboard conversations show that the proposed parser has a precision of 80%. This work provides a foundation for building a large-scale Chinese semantic corpus and for research on understanding modeling of the Chinese language.
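A toy rendering of the representation is sketched below; the relation labels and the example sentence are hypothetical and do not reproduce the corpus tag set, they only show what "a linear sequence of dependency relationships" looks like in practice.

```python
# Hedged sketch: a sentence meaning encoded as a linear sequence of
# dependency relationships between word positions (labels are invented).
sentence = ["I", "want", "to", "book", "a", "ticket"]

# (dependent_index, head_index, relation) -- one relation per word, left to right
dependencies = [
    (0, 1, "agent"),        # I      <- want
    (1, -1, "root"),        # want   is the root
    (2, 3, "marker"),       # to     <- book
    (3, 1, "object"),       # book   <- want
    (4, 5, "determiner"),   # a      <- ticket
    (5, 3, "object"),       # ticket <- book
]

for dep, head, rel in dependencies:
    head_word = sentence[head] if head >= 0 else "ROOT"
    print(f"{sentence[dep]:>7} --{rel}--> {head_word}")
```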
One of the major challenges in building a task-oriented dialogue system is that dialogue state transitions frequently happen between multiple domains, such as booking hotels or restaurants. Recently, the encoder-decoder model based on end-to-end neural networks has become an attractive approach to meet this challenge. However, it usually requires a sufficiently large amount of training data, and it is not flexible in handling dialogue state transitions. This paper addresses these problems by proposing a simple but practical framework called Multi-Domain KB-BOT (MDKB-BOT), which leverages both neural networks and a rule-based strategy in natural language understanding (NLU) and dialogue management (DM). Experiments on the data set of the Chinese Human-Computer Dialogue Technology Evaluation Campaign show that MDKB-BOT achieves competitive performance on several evaluation metrics, including task completion rate and user satisfaction.
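One plausible reading of the hybrid strategy, with rules applied first and a statistical model as fallback, is sketched below; the patterns, domains and the stub classifier are illustrative stand-ins, not MDKB-BOT.

```python
# Hedged sketch of a hybrid NLU: deterministic rules backed by a neural fallback.
import re

RULES = [  # patterns handled by the rule-based strategy
    (re.compile(r"\b(book|reserve)\b.*\bhotel\b"), "hotel_booking"),
    (re.compile(r"\b(book|reserve|table)\b.*\brestaurant\b"), "restaurant_booking"),
]

def neural_fallback(utterance):
    """Stand-in for a trained neural classifier used when no rule fires."""
    return ("chitchat", 0.55)

def understand(utterance):
    for pattern, domain in RULES:
        if pattern.search(utterance.lower()):
            return domain, 1.0                  # rule hits are treated as certain
    return neural_fallback(utterance)           # otherwise defer to the neural model

print(understand("Can you book a hotel near the Bund?"))   # ('hotel_booking', 1.0)
print(understand("Tell me a joke"))                        # ('chitchat', 0.55)
```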