
Multi-teacher joint knowledge distillation based on CenterNet
Abstract: This paper introduces a multi-teacher joint knowledge distillation scheme based on a lightweight CenterNet. The proposed scheme effectively mitigates the performance degradation caused by model lightweighting and significantly narrows the performance gap between the teacher model and the student model. A large-scale, complex model is used as the teacher model to guide the training of the lightweight student model. Compared with the conventional training scheme, the proposed knowledge distillation training scheme enables the lightweight model to reach better detection performance after the same number of training epochs. The main contribution of this paper is a new knowledge distillation training scheme tailored to the CenterNet object detection network: multi-teacher joint knowledge distillation. In follow-up experiments, a distillation attention mechanism is further introduced to improve the training effect of multi-teacher joint knowledge distillation. On the Visual Object Classes 2007 dataset (VOC2007), taking the MobileNetV2 lightweight network as the backbone as an example, compared with the traditional CenterNet (ResNet50 backbone), the proposed scheme compresses the number of parameters by 74.7%, increases the inference speed by 70.5%, and reduces the mean average precision (mAP) by only 1.99, achieving a better performance-speed balance. Experiments also show that, after the same 100 training epochs, the lightweight model trained with the multi-teacher joint knowledge distillation scheme improves mAP by 11.30 over the ordinary training scheme.
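
The abstract describes distilling several large teacher models into a lightweight CenterNet student. As a rough illustration of the general idea only (the paper's actual loss formulation, teacher weighting, and distillation attention mechanism are not given in this abstract), the following Python sketch shows a simple multi-teacher distillation term over CenterNet-style center heatmaps; the function name, the MSE criterion, and the uniform teacher weighting are all assumptions.

    # Illustrative sketch only: a multi-teacher distillation loss for
    # CenterNet-style heatmap outputs. The paper's exact loss terms and its
    # distillation attention mechanism are not specified in the abstract.
    import torch
    import torch.nn.functional as F

    def multi_teacher_distill_loss(student_heatmap, teacher_heatmaps, teacher_weights=None):
        # student_heatmap:  (B, C, H, W) tensor from the lightweight student.
        # teacher_heatmaps: list of (B, C, H, W) tensors from the teacher models.
        # teacher_weights:  optional per-teacher weights; uniform if omitted (an assumption).
        if teacher_weights is None:
            teacher_weights = [1.0 / len(teacher_heatmaps)] * len(teacher_heatmaps)
        loss = student_heatmap.new_tensor(0.0)
        for w, t in zip(teacher_weights, teacher_heatmaps):
            # Teachers are frozen during distillation, so detach their outputs.
            loss = loss + w * F.mse_loss(student_heatmap, t.detach())
        return loss

    # Hypothetical usage: total = detection_loss + lambda_distill * multi_teacher_distill_loss(s, [t1, t2])
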
Authors: LIU Shaohua, DU Kang, SHE Chundong, YANG Ao (School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100080, China)
Source: Systems Engineering and Electronics (《系统工程与电子技术》), 2024, No. 4, pp. 1174-1184 (11 pages); indexed in EI and CSCD.
Funding: Supported by the National Natural Science Foundation of China (91938301).
Keywords: lightweight; knowledge distillation; attention mechanism; joint training