
Multi-teacher joint knowledge distillation based on CenterNet
Abstract: This paper introduces a multi-teacher joint knowledge distillation scheme based on a lightweight CenterNet. The proposed scheme effectively mitigates the performance degradation caused by model lightweighting and significantly narrows the performance gap between the teacher model and the student model. A large-scale, complex model is used as the teacher model to guide the training of the lightweight student model. Compared with the conventional training scheme, the proposed knowledge distillation training scheme enables the lightweight model to reach better detection performance after the same number of training epochs. The main contribution of this paper is a new knowledge distillation training scheme tailored to the CenterNet object detection network: multi-teacher joint knowledge distillation. In follow-up experiments, a distillation attention mechanism is further introduced to improve the training effect of multi-teacher joint knowledge distillation. On the Visual Object Classes 2007 dataset (VOC2007), taking the MobileNetV2 lightweight network as the backbone as an example, compared with the traditional CenterNet (ResNet50 backbone), the proposed scheme compresses the number of parameters by 74.7%, increases the inference speed by 70.5%, and reduces the mean average precision (mAP) by only 1.99, achieving a better performance-speed balance. Experiments also show that, after the same 100 training epochs, the lightweight model trained with the multi-teacher joint knowledge distillation scheme improves mAP by 11.30 over the ordinary training scheme.
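
The abstract describes distilling several large teacher models into a lightweight CenterNet student. As a rough illustration of the general idea only (the paper's actual loss formulation, teacher weighting, and distillation attention mechanism are not given in this abstract), the following Python sketch shows a simple multi-teacher distillation term over CenterNet-style center heatmaps; the function name, the MSE criterion, and the uniform teacher weighting are all assumptions.

    # Illustrative sketch only: a multi-teacher distillation loss for
    # CenterNet-style heatmap outputs. The paper's exact loss terms and its
    # distillation attention mechanism are not specified in the abstract.
    import torch
    import torch.nn.functional as F

    def multi_teacher_distill_loss(student_heatmap, teacher_heatmaps, teacher_weights=None):
        # student_heatmap:  (B, C, H, W) tensor from the lightweight student.
        # teacher_heatmaps: list of (B, C, H, W) tensors from the teacher models.
        # teacher_weights:  optional per-teacher weights; uniform if omitted (an assumption).
        if teacher_weights is None:
            teacher_weights = [1.0 / len(teacher_heatmaps)] * len(teacher_heatmaps)
        loss = student_heatmap.new_tensor(0.0)
        for w, t in zip(teacher_weights, teacher_heatmaps):
            # Teachers are frozen during distillation, so detach their outputs.
            loss = loss + w * F.mse_loss(student_heatmap, t.detach())
        return loss

    # Hypothetical usage: total = detection_loss + lambda_distill * multi_teacher_distill_loss(s, [t1, t2])
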
Authors: LIU Shaohua, DU Kang, SHE Chundong, YANG Ao (School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100080, China)
Source: Systems Engineering and Electronics (《系统工程与电子技术》), 2024, No. 4, pp. 1174-1184 (11 pages); indexed in EI and CSCD.
Funding: Supported by the National Natural Science Foundation of China (91938301).
Keywords: lightweight; knowledge distillation; attention mechanism; joint training