Behavior recognition of lactating sows using improved AVSlowFast audio-video fusion model

Abstract: Automatic behavior monitoring of lactating sows is important for safeguarding sow health and detecting abnormal states in time. To integrate the information carried by visual and auditory signals when recognizing sow behavior, this study proposes a method for recognizing the key behaviors of lactating sows based on multimodal fusion of audio and video features. First, the three-branch AVSlowFast model is adopted as the base network. Its slow video pathway, fast video pathway, and audio pathway extract behavior-related features in the visual and auditory modalities, and multi-level lateral connections fuse the audio-visual information in depth. On this basis, a Gaussian context transformer (GCT) channel attention module is introduced at the late stage of feature fusion to further optimize the fusion of high-dimensional multimodal 3D features without adding model parameters, which improves recognition accuracy. Audio and video data of lactating sows were collected in a real farming environment for the experiments. The results show that the improved AVSlowFast audio-video fusion model recognizes six key behaviors (eating, nursing, sleeping, fence-hitting, drinking, and daily activities) with an average precision of 94.3% and an average recall of 94.6%. Compared with the single-modal behavior recognition method based on SlowFast, the proposed method improves the average F1 score over the six behaviors by 12.7 percentage points, offering an effective approach to multimodal behavior monitoring of livestock and poultry.

Automatic behavior recognition for lactating sows can greatly benefit pig farming. However, recognition accuracy has been limited for behaviors with similar visual characteristics. In this study, an audio-video fusion model was proposed for classifying the behaviors of lactating sows. A three-branch deep neural network (AVSlowFast) was employed as the backbone. The Gaussian context transformer (GCT) attention mechanism was introduced to optimize the model without increasing the number of parameters. The experiment was conducted at Lihua Pig Farm, Changzhou City, Jiangsu Province, China, from August 1, 2023 to September 10, 2023. Ten long white sows, with significant differences in their litter environment and farrowing houses, were randomly selected as the research objects. All of the sows were within three days postpartum. A camera and a sound recorder were used to collect the video and audio data, respectively, from which the dataset was constructed. The sow behaviors were then manually labelled into six classes: nursing, eating, drinking, sleeping, fence-hitting, and daily activities. Three behavior recognition models with different feature inputs were compared to verify the vision-audio fusion: MFCC-Vision Transformer used audio features, SlowFast used visual features, and AVSlowFast used vision-audio multimodal features. The results showed that the multimodal model (AVSlowFast) identified the six types of sow behavior with markedly higher accuracy than the two single-modal models, MFCC-Vision Transformer and SlowFast. Notably, AVSlowFast performed better on behaviors of lactating sows with similar visual features, such as eating, drinking, and fence-hitting. Nevertheless, the recognition accuracy for sleeping behavior decreased slightly with the multimodal approach compared with the vision-only model. The reason was that sleeping behavior often lacks distinct audio features, so the added audio information interfered with its recognition. Attention mechanisms (SENet and GCT) were then introduced to improve recognition performance, especially for sleeping behavior, and the accuracy of sleeping behavior recognition increased with the improved model. The attention mechanisms adjusted the weights of the feature channels during iterative training, thus mitigating the interference caused by the audio signals. In the comparison with SENet-AVSlowFast, GCT-AVSlowFast achieved 94.3% precision and 94.6% recall. Its average F1-score for behavior recognition was 12.7 percentage points higher than that of the single-modal model (SlowFast). Finally, because GCT-AVSlowFast achieves this performance without additional model parameters, it is suitable for deployment in resource-limited pig farm environments. The finding can also provide an effective approach to implementing multimodal behavior monitoring in livestock and poultry.
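The abstract states that a Gaussian context transformer (GCT) channel attention module reweights the fused feature channels without adding model parameters. The following is a minimal sketch of that idea, assuming the commonly described GCT formulation: global average pooling per channel, normalization of the pooled values across channels, and a Gaussian weighting with a fixed standard deviation. The function name `gct_channel_attention`, the defaults `c=2.0` and `eps=1e-5`, and the list-of-lists feature layout are illustrative assumptions, not the authors' implementation.

```python
import math
from statistics import fmean, pstdev


def gct_channel_attention(features, c=2.0, eps=1e-5):
    """Apply Gaussian-context channel attention to one sample (sketch).

    features: list of C channels, each a flat list of activations
              (the T*H*W positions of a 3D feature map, flattened).
    c:        fixed standard deviation of the Gaussian. It is a constant
              here (an assumption), so nothing is learned.
    Returns (reweighted_features, channel_weights).
    """
    # 1) Global average pooling: one context value per channel.
    context = [fmean(channel) for channel in features]

    # 2) Normalize the context values across channels.
    mean = fmean(context)
    std = pstdev(context)
    normalized = [(z - mean) / (std + eps) for z in context]

    # 3) Gaussian excitation: a channel whose context equals the mean
    #    gets weight 1; weights shrink as the context deviates from it.
    weights = [math.exp(-(z * z) / (2.0 * c * c)) for z in normalized]

    # 4) Rescale every activation of a channel by that channel's weight.
    reweighted = [
        [w * v for v in channel] for w, channel in zip(weights, features)
    ]
    return reweighted, weights


if __name__ == "__main__":
    # Three toy channels with two positions each.
    toy = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    out, w = gct_channel_attention(toy)
    print(w)
    print(out)
```

Because the weights come from a fixed function of the pooled statistics, this variant has nothing to train, which is consistent with the abstract's point about deployment in resource-limited environments.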
Authors: LI Bo (李泊); CHEN Tianming (陈天明); ZHU Jiaying (朱佳颖) (College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210031, China; Key Laboratory of Livestock Farming Equipment, Ministry of Agriculture and Rural Affairs, Nanjing 210031, China)
Source: Transactions of the Chinese Society of Agricultural Engineering (《农业工程学报》; indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2024, No. 7, pp. 182-190 (9 pages)
Funding: Jiangsu Provincial Agricultural Independent Innovation Fund Project (CX(21)3057).
Keywords: behavior recognition; sows; behavior monitoring; audio-video fusion; multimodal; channel attention mechanism; AVSlowFast