Abstract
Retrieval-generation approaches, which are widely used in open-domain dialogue systems, rest on the assumption that similar questions have similar responses. Among them, the latest clustering-generation method applies hard clustering to the training questions and learns a response pattern for each question cluster from the responses associated with that cluster. However, existing methods ignore the semantic multiplicity of questions, which weakens the relevance and informativeness of the generated responses. To address this issue, we propose a dialogue generation model oriented toward semantic multiplicity, which uses a learnable soft-clustering network to assign the training questions to multiple semantic clusters and thereby better capture that multiplicity. Specifically, the soft clustering is implemented with a radial basis function (RBF) neural network, whose differentiability allows the soft-clustering and response-generation processes to be trained end to end and couples the two more tightly. Experiments on the Chat dataset show that the proposed model outperforms state-of-the-art baselines.
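To illustrate how soft clustering differs from hard clustering, the following is a minimal sketch of a normalized Gaussian RBF layer in NumPy. It is not the authors' implementation: the abstract does not specify the network's architecture, so the function name `rbf_soft_assign`, the width parameter `gamma`, and the toy 2-D question vectors and centers are all assumptions made for illustration. Each question receives a membership weight for every cluster rather than a single cluster label.

```python
import numpy as np

def rbf_soft_assign(question_vecs, centers, gamma=1.0):
    """Return soft cluster memberships (rows sum to 1).

    Hypothetical helper: a normalized Gaussian RBF layer, assumed here
    as a stand-in for the paper's soft-clustering network.
    """
    # Squared Euclidean distance between every question and every center.
    diff = question_vecs[:, None, :] - centers[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian RBF activation exp(-gamma * d^2), computed stably by
    # subtracting the row-wise maximum before exponentiating.
    logits = -gamma * sq_dist
    logits = logits - logits.max(axis=1, keepdims=True)
    activations = np.exp(logits)
    # Normalize so each row is a distribution over clusters.
    return activations / activations.sum(axis=1, keepdims=True)

# Toy data (assumed): three question embeddings and two cluster centers.
questions = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
memberships = rbf_soft_assign(questions, centers)
print(memberships)
```

The third question lies midway between the two centers, so it is shared equally between both clusters, whereas hard clustering would force it into one. Because every operation here is differentiable with respect to `centers`, the same computation written in an automatic-differentiation framework could be trained jointly with a generator, which is the property the abstract attributes to the RBF network.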
Authors
刘家
卢永美
何东
卜令梅
陈黎
于中华
LIU Jia; LU Yong-mei; HE Dong; BU Ling-mei; CHEN Li; YU Zhong-hua (College of Computer Science, Sichuan University, Chengdu 610065, China)
Source
《小型微型计算机系统》
CSCD
PKU Core (Peking University Core Journals)
2022, Issue 10, pp. 2028-2034 (7 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Key Research Project (2020YFB0704502).
Keywords
response generation
information retrieval
soft clustering
radial basis function network