Abstract
To analyze the categories and characteristics of the causes of maritime administrative penalties in China, and to identify the factors that influence how often different categories of penalty cases occur, 414,475 maritime administrative penalty records were taken as the sample. Sparklyr distributed clustering was used to cluster the large-scale penalty-cause texts, and the chi-square test was used to extract the semantic features of the different penalty causes. Analysis of variance and LSD multiple comparisons were then used to examine the effects of maritime jurisdiction, season, cause category, and their interactions on the number of penalty cases. The results show that Sparklyr distributed clustering handles large-scale penalty-cause texts that are difficult to cluster on a single machine, and that the resulting clusters reveal the diversity of maritime violations. The number of penalty cases differs greatly across categories, and maritime jurisdiction, season, and the interaction between maritime jurisdiction and cause category are all significant factors behind this difference. Based on these findings, suggestions for differentiated maritime supervision are put forward.
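The chi-square feature extraction mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common 2x2 formulation, in which each term is scored by how strongly its presence is associated with membership in one cluster. The function names and the toy counts below are hypothetical and are not taken from the paper, which does not state its exact procedure here.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 term/cluster table.

    a: documents in the cluster that contain the term
    b: documents outside the cluster that contain the term
    c: documents in the cluster that lack the term
    d: documents outside the cluster that lack the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the term or cluster is degenerate; no association can be measured.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_terms(counts):
    """Sort terms by descending chi-square score.

    counts maps each term to its (a, b, c, d) tuple for one cluster.
    """
    scored = [(term, chi_square_2x2(*cells)) for term, cells in counts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts for one cluster (illustrative only, not data from the paper).
toy_counts = {
    "certificate": (20, 0, 0, 20),   # term appears only inside the cluster
    "vessel": (10, 10, 10, 10),      # term is evenly spread, so it carries no signal
}
ranking = rank_terms(toy_counts)
```

Terms with the highest scores are the ones most characteristic of the cluster, so the top of `ranking` can serve as a short semantic label for it.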
Authors
于卫红
程佳雪
齐勇凯
YU Weihong; CHENG Jiaxue; QI Yongkai (School of Transportation Economics and Management, Dalian Maritime University, Dalian 116026, China)
Source
《武汉理工大学学报(信息与管理工程版)》
CAS
2024, No. 3, pp. 490-496 (7 pages)
Journal of Wuhan University of Technology:Information & Management Engineering
Funding
Key Project of Development Strategy and Higher Education Research, Dalian Maritime University (GJ2022Z01).
Keywords
maritime supervision
administrative penalty
distributed text clustering
analysis of variance
chi-square test