Abstract
The microservice architecture is becoming the mainstream design for large-scale cloud applications, and microservice reliability is a key problem that cloud service providers urgently need to address. Accurately detecting and locating faults in microservice applications helps ensure their reliability and stability, and anomaly detection based on microservice call chains can identify abnormal system behavior promptly when a fault occurs and trigger an alarm. However, current mainstream detection methods, which require the establishment of a knowledge base by augmenting data from the microservice call chain, cannot guarantee that anomaly alarms are both timely and accurate. To address this, MicroTrace, an anomaly detection method for microservice call chains based on natural language processing and a Bi-directional Long Short-Term Memory (BiLSTM) network, is proposed. First, the events recorded in a call chain are parsed and represented as semantic sequences and response time sequences. Second, a word embedding algorithm (Word2vec) is used to extract vectorized representations of the events, and an attention-based BiLSTM simultaneously detects call path anomalies and performance anomalies of microservice instances. Experimental results on a real microservice call chain dataset show that both the precision and recall of the proposed method exceed 96%, and that its F1 score is at least 6.8% higher than that of the Multimodal-LSTM method.
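The first step described in the abstract, turning the events of a call chain into a semantic sequence and a response time sequence, can be sketched as follows. This is a minimal illustration only, not the MicroTrace implementation: the span field names (`service`, `operation`, `start_ms`, `duration_ms`) and the `service:operation` token format are assumptions made for the example, and the embedding and BiLSTM stages are omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical span record; the field names are assumptions for illustration.
Span = Dict[str, object]


def trace_to_sequences(spans: List[Span]) -> Tuple[List[str], List[float]]:
    """Split one call chain into two aligned sequences.

    The semantic sequence holds one token per event (here "service:operation"),
    and the response time sequence holds the matching durations.
    """
    # Order events by start time so both sequences follow the call order.
    ordered = sorted(spans, key=lambda s: s["start_ms"])
    semantic = [f'{s["service"]}:{s["operation"]}' for s in ordered]
    response_times = [float(s["duration_ms"]) for s in ordered]
    return semantic, response_times


# Made-up example trace with three spans, deliberately out of order.
trace = [
    {"service": "order", "operation": "POST /orders", "start_ms": 5, "duration_ms": 42.0},
    {"service": "gateway", "operation": "GET /checkout", "start_ms": 0, "duration_ms": 120.0},
    {"service": "payment", "operation": "POST /pay", "start_ms": 20, "duration_ms": 18.5},
]

semantic, response_times = trace_to_sequences(trace)
print(semantic)
print(response_times)
```

In a pipeline of the kind the abstract outlines, the semantic tokens would then be passed to a word embedding model and the two sequences fed to the sequence model; those stages are left out here because the abstract does not specify their configuration.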
Authors
张攀
高丰
周逸
饶涵宇
毛冬
李静
ZHANG Pan; GAO Feng; ZHOU Yi; RAO Hanyu; MAO Dong; LI Jing (State Grid Information & Telecommunication Branch, Beijing 100031, China; College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; Information & Telecommunication Branch, State Grid Zhejiang Electric Power Co., Ltd., Hangzhou 310016, China)
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2022, Issue 11, pp. 161-169 (9 pages)
Computer Engineering
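Funding
Science and Technology Project of State Grid Corporation of China, "Research on Cloud Migration of Business Applications and Full-Link Operation Analysis Technology" (5700-202152169A-0-0-00).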
Keywords
microservice
call chain
deep learning
anomaly detection
data mining