Funding: This work was supported by grants from the Natural Science Foundation of Hebei Province, under Grant No. F2021202021; the S&T Program of Hebei, under Grant No. 22375001D; and the National Key R&D Program of China, under Grant No. 2019YFB1312500.
Abstract: Human pose estimation is a basic and critical task in computer vision that involves determining the positions (or spatial coordinates) of the joints of the human body in a given image or video. It is widely used in motion analysis, medical evaluation, and behavior monitoring. In this paper, we propose a method for multi-view human pose estimation. Two image sensors were placed orthogonally to each other to capture the pose of the subject as they moved, which yielded accurate and comprehensive three-dimensional (3D) motion reconstruction that captured their multi-directional poses. Following this, we propose a method based on 3D pose estimation to assess the similarity of the motion features of patients with motor dysfunction by comparing their range of motion with that of normal subjects. We converted these differences into Fugl–Meyer assessment (FMA) scores in order to quantify them. Finally, we implemented the proposed method in the Unity framework and built a virtual reality platform that provides users with human–computer interaction, making the task more enjoyable and encouraging their active participation in the assessment process. The goal is to provide a suitable means of assessing movement disorders without requiring the immediate supervision of a physician.
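The two-camera idea in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes idealized orthographic cameras, one facing the subject (giving x and y) and one at a right angle (giving z and y), with aligned and equally scaled vertical axes. The function names and the sample coordinates are hypothetical. The mapping from range-of-motion differences to FMA scores is not given in the abstract, so it is not sketched here.

```python
import math

def fuse_orthogonal_views(front_xy, side_zy):
    """Combine a front view (x, y) and a side view (z, y) into (x, y, z).

    Assumption: orthographic cameras with aligned vertical axes, so the
    shared y coordinate can simply be averaged across the two views.
    """
    x, y_front = front_xy
    z, y_side = side_zy
    return (x, (y_front + y_side) / 2.0, z)

def joint_angle_deg(a, b, c):
    """Angle in degrees at joint b, formed by the 3D points a-b-c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.sqrt(sum(ui * ui for ui in u)) * math.sqrt(sum(vi * vi for vi in v))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding
    return math.degrees(math.acos(cos_angle))

# Hypothetical arm: upper arm hangs down, forearm points forward (along z).
shoulder = fuse_orthogonal_views((0.0, 1.4), (0.0, 1.4))
elbow = fuse_orthogonal_views((0.0, 1.1), (0.0, 1.1))
wrist = fuse_orthogonal_views((0.0, 1.1), (0.3, 1.1))
elbow_angle = joint_angle_deg(shoulder, elbow, wrist)
print(round(elbow_angle, 3))
```

The forward reach of the wrist is invisible to the front camera and only appears in the side view, which is why a second, orthogonal sensor helps recover the elbow angle.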
Funding: The National Natural Science Foundation of China (Grant Number 62076246).
Abstract: Human pose estimation aims to localize body joints from image or video data. With the development of deep learning, pose estimation has become a hot research topic in computer vision. In recent years, human pose estimation has achieved great success in fields such as animation and sports. However, to obtain accurate positioning results, existing methods may suffer from large model sizes, a high number of parameters, and increased complexity, leading to high computing costs. In this paper, we propose a new lightweight feature encoder to construct a high-resolution network that reduces the number of parameters and lowers the computing cost. We also introduce a semantic enhancement module that improves global feature extraction and network performance by combining the channel and spatial dimensions. Furthermore, we propose a densely connected spatial pyramid pooling module to compensate for the decrease in image resolution and the information loss in the network. Extensive experiments show that our method achieves competitive performance while dramatically reducing the number of parameters and the operational complexity. Specifically, our method obtains an AP score of 89.9% on MPIIVAL, while the number of parameters and the complexity of operations are reduced by 41% and 36%, respectively.
Funding: Supported by the Program of Entrepreneurship and Innovation Ph.D. in Jiangsu Province (JSSCBS20211175), the School Ph.D. Talent Funding (Z301B2055), and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (21KJB520002).
Abstract: 3D human pose estimation is a major focus area in computer vision and plays an important role in practical applications. This article summarizes the framework and research progress related to estimation from monocular RGB images and videos. An overall perspective of methods integrated with deep learning is introduced. Image-based and video-based inputs are proposed as a novel analysis framework. From this viewpoint, common problems are discussed. The diversity of human postures usually leads to problems such as occlusion and ambiguity, and the lack of training datasets often results in poor generalization ability of the model. Regression methods are crucial for solving such problems. For image-based input, the multi-view method is commonly used to solve occlusion problems, and it is analyzed comprehensively here. For video-based input, human prior knowledge of restricted motion is used to predict human postures. In addition, structural constraints are widely used as prior knowledge. Furthermore, weakly supervised learning methods are studied and discussed for these two types of inputs to improve the model's generalization ability. The problem of insufficient training datasets must also be considered, especially because 3D datasets are usually biased and limited. Finally, emerging and popular datasets and evaluation indicators are discussed. The characteristics of the datasets and the relationships among the indicators are explained and highlighted. Thus, this article can be useful and instructive for researchers who lack experience and find this field confusing. In addition, by providing an overview of 3D human pose estimation, this article sorts and refines recent studies on the topic. It describes core problems and commonly useful methods, and discusses the scope for further research.
Abstract: Human pose estimation (HPE) is a procedure for determining the structure of the body pose, and it is considered a challenging issue in the computer vision (CV) community. HPE finds applications in several fields, such as activity recognition and human-computer interfaces. Despite its benefits, HPE remains a challenging process due to variations in visual appearance, lighting, occlusions, dimensionality, etc. To resolve these issues, this paper presents a squirrel search optimization with a deep convolutional neural network for HPE (SSDCNN-HPE) technique. The main aim of the SSDCNN-HPE technique is to identify the human pose accurately and efficiently. First, the video is converted into frames, and pre-processing is performed via bilateral filtering-based noise removal. Then, the EfficientNet model is applied to identify the body points of a person with no problem constraints. The hyperparameters of the EfficientNet model are tuned using the squirrel search algorithm (SSA). In the final stage, a multiclass support vector machine (M-SVM) is used to identify and classify human poses. The combination of bilateral filtering followed by an SSA-based EfficientNet model for HPE constitutes the novelty of the work. To demonstrate the enhanced outcomes of the SSDCNN-HPE approach, a series of simulations is executed. The experimental results show that the SSDCNN-HPE system improves on recent existing techniques in terms of several measures.
Abstract: Human action recognition (HAR) and pose estimation from videos have gained significant attention among research communities due to their application in several areas, such as intelligent surveillance, human-robot interaction, and robot vision. Though considerable improvements have been made recently, designing an effective and accurate action recognition model remains difficult owing to obstacles such as variations in camera angle, occlusion, background, and movement speed. The literature shows that the temporal dimension is hard to deal with in the action recognition process. Convolutional neural network (CNN) models can be widely used to address this. With this motivation, this study designs a novel key point extraction with deep convolutional neural networks based pose estimation (KPE-DCNN) model for activity recognition. The KPE-DCNN technique first converts the input video into a sequence of frames, followed by a three-stage process: key point extraction, hyperparameter tuning, and pose estimation. In the key point extraction process, an OpenPose model is designed to compute accurate key points of the human pose. Then, an optimal DCNN model is developed to classify the human activity label based on the extracted key points. To improve the training process of the DCNN technique, the RMSProp optimizer is used to optimally adjust hyperparameters such as the learning rate, batch size, and epoch count. Experimental results on a benchmark dataset, the UCF Sports dataset, show that the KPE-DCNN technique achieves good results compared with benchmark algorithms such as CNN, DBN, SVM, STAL, and T-CNN.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12272104).
Abstract: Spacecraft pose estimation is an important technology for maintaining or changing spacecraft orientation in space. When two spacecraft are relatively distant, the depth extent of the space points is much smaller than the measuring distance, so the camera model can be treated as a weak perspective projection model. In this paper, a spacecraft pose estimation algorithm based on four symmetrical points of the spacecraft outline is proposed. The analytical solution of the spacecraft pose is obtained by solving the weak perspective projection model, which satisfies the requirements of the measurement model when the measurement distance is long. The optimal solution is then obtained by moving from the weak perspective projection model to the perspective projection model, which meets the measurement requirements when the measuring distance is small. The simulation results show that the proposed algorithm obtains better results even when the noise is large.
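The weak perspective approximation mentioned in this abstract can be checked numerically. The sketch below is illustrative only and is not the paper's algorithm: it compares full perspective projection with weak perspective projection (every point scaled by one reference depth) for four hypothetical symmetric points, at a short and a long range. The focal length, point coordinates, and function names are assumptions.

```python
import math

def perspective(point, f):
    """Full perspective projection: each point is divided by its own depth."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

def weak_perspective(point, f, z_ref):
    """Weak perspective: all points share one scale f / z_ref."""
    X, Y, _ = point
    s = f / z_ref
    return (s * X, s * Y)

def max_projection_error(points, f=1.0):
    """Largest image-plane gap between the two models over all points."""
    z_ref = sum(p[2] for p in points) / len(points)  # mean depth as reference
    worst = 0.0
    for p in points:
        u, v = perspective(p, f)
        uw, vw = weak_perspective(p, f, z_ref)
        worst = max(worst, math.hypot(u - uw, v - vw))
    return worst

def at_distance(shape, d):
    """Place the object so that its center lies at depth d."""
    return [(x, y, z + d) for (x, y, z) in shape]

# Four hypothetical symmetric outline points with +/-0.5 units of depth relief.
shape = [(1, 1, 0.5), (-1, 1, -0.5), (-1, -1, 0.5), (1, -1, -0.5)]
near_err = max_projection_error(at_distance(shape, 10.0))
far_err = max_projection_error(at_distance(shape, 1000.0))
print(near_err, far_err)
```

The error shrinks as the object recedes, which is consistent with using the weak perspective solution at long range and refining toward the full perspective model at short range.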
Funding: Supported by Universiti Sains Malaysia under FRGS Grant Number [FRGS/1/2020/STG07/USM/02/12(203.PKOMP.6711930)] and FRGS Grant Number [304PTEKIND.6316497.USM.].
Abstract: This article presents a comprehensive survey of deep learning-based (DL-based) human pose estimation (HPE) that can help researchers in the domain of computer vision. HPE is among the fastest-growing research domains of computer vision and is used to solve several problems in human endeavours. After a detailed introduction, three different human body models are presented, followed by the main stages of HPE and two pipelines of two-dimensional (2D) HPE. The details of the four components of HPE are also presented. The keypoint output formats of two popular 2D HPE datasets and the most cited DL-based HPE articles since the year of breakthrough are both shown in tabular form. This study highlights the limitations of published reviews and surveys with respect to presenting a systematic review of current DL-based solutions for 2D HPE. Furthermore, it provides a detailed and meaningful survey that will guide new and existing researchers on DL-based 2D HPE models. Finally, some future research directions in the field of HPE, such as limited data on disabled persons and multi-training DL-based models, are identified to encourage researchers and promote the growth of HPE research.
Funding: Supported by the Key Project of NSFC (Grant No. U1908214); the Special Project of Central Government Guiding Local Science and Technology Development (Grant No. 2021JH6/10500140); the Program for Innovative Research Team in University of Liaoning Province (LT2020015); the Support Plan for Key Field Innovation Team of Dalian (2021RT06); the Support Plan for Leading Innovation Team of Dalian University (XLJ202010); the Science and Technology Innovation Fund of Dalian (Grant No. 2020JJ25CY001); and in part by the National Natural Science Foundation of China under Grant 61906032 and the Fundamental Research Funds for the Central Universities under Grant DUT21TD107.
Abstract: With the advancement of image sensing technology, estimating 3D human pose from monocular video has become a hot research topic in computer vision. 3D human pose estimation is an essential prerequisite for subsequent action analysis and understanding. It enables a wide spectrum of potential applications in areas such as intelligent transportation, human-computer interaction, and medical rehabilitation. Currently, some methods for 3D human pose estimation in monocular video employ a temporal convolutional network (TCN) to extract inter-frame feature relationships, but most of them extract these relationships insufficiently. In this paper, we decompose the 3D joint location regression into bone direction and bone length. We propose the TCG, a temporal convolutional network incorporating Gaussian error linear units (GELU), to solve for bone direction. It enables more inter-frame features to be captured and makes the most of the feature relationships between data. Furthermore, we adopt kinematic structural information to solve for bone length, enhancing the use of intra-frame joint features. Finally, we design a loss function for joint training of the bone direction estimation network with the bone length estimation network. The proposed method has been extensively evaluated on the public benchmark dataset Human3.6M. Both quantitative and qualitative experimental results show that the proposed method achieves more accurate 3D human pose estimation.
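Two ingredients named in this abstract can be written down compactly: the GELU activation, and the recovery of joint positions from per-bone directions and lengths. The sketch below is a hedged illustration rather than the TCG network itself. The skeleton layout, parent indexing, and function names are hypothetical, and in the paper the directions and lengths would come from learned networks rather than hand-set values.

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def joints_from_bones(root, parents, directions, lengths):
    """Rebuild 3D joints from a root position plus per-bone direction and length.

    parents[j] is the index of joint j's parent (joint 0 is the root), and
    each parent must come before its children. directions[j] need not be
    unit length; it is normalized here.
    """
    joints = [tuple(root)]
    for j in range(1, len(parents)):
        d = directions[j]
        norm = math.sqrt(sum(c * c for c in d))
        p = joints[parents[j]]
        joints.append(tuple(p[i] + lengths[j] * d[i] / norm for i in range(3)))
    return joints

# Hypothetical leg chain: hip (root) -> knee -> ankle, pointing straight down.
parents = [-1, 0, 1]
directions = [None, (0.0, -1.0, 0.0), (0.0, -2.0, 0.0)]
lengths = [0.0, 0.45, 0.40]
leg = joints_from_bones((0.0, 1.0, 0.0), parents, directions, lengths)
print(leg)
print(gelu(1.0))
```

Separating direction from length means a bone's length can be held consistent across frames while only its direction changes, which is one plausible motivation for the decomposition.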
Funding: This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2018-0-01426) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation). This work has also been supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R239), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. This work was also partially supported by the Taif University Researchers Supporting Project Number (TURSP-2020/115), Taif University, Taif, Saudi Arabia.
Abstract: Identifying human actions and interactions is useful in many areas, such as security, surveillance, assisted living, patient monitoring, rehabilitation, sports, and e-learning. This wide range of applications has attracted many researchers to the field. Inspired by existing recognition systems, this paper proposes a new and efficient human-object interaction recognition (HOIR) model based on modeling human pose and scene feature information. Different aspects are involved in an interaction, including the humans, the objects, the various body parts of the human, and the background scene. The main objectives of this research are to critically examine the importance of all these elements in determining the interaction, to estimate human pose through the image foresting transform (IFT), and to detect the performed interactions based on an optimized multi-feature vector. The proposed methodology has six main phases. The first phase involves preprocessing: the videos are converted into image frames, their contrast is adjusted, and noise is removed. In the second phase, the human-object pair is detected and extracted from each image frame. The third phase involves identifying key body parts of the detected humans using IFT. The fourth phase applies three different kinds of feature extraction techniques. These features are then combined and optimized during the fifth phase. The optimized vector is used to classify the interactions in the last phase. The MSR Daily Activity 3D dataset was used to test this model and demonstrate its efficiency. The proposed system obtains an average accuracy of 91.7% on this dataset.
Funding: Supported in part by the Key Program of NSFC (Grant No. U1908214); the Special Project of Central Government Guiding Local Science and Technology Development (Grant No. 2021JH6/10500140); the Program for the Liaoning Distinguished Professor; the Program for Innovative Research Team in University of Liaoning Province (LT2020015), Dalian (2021RT06), and Dalian University (XLJ202010); the Science and Technology Innovation Fund of Dalian (Grant No. 2020JJ25CY001); and the Dalian University Scientific Research Platform Project (No. 202101YB03).
Abstract: Multi-view multi-person 3D human pose estimation is a hot topic in the field of human pose estimation due to its wide range of application scenarios. With the introduction of end-to-end direct regression methods, the field has entered a new stage of development. However, the regression results for joints that are more heavily influenced by external factors are not accurate enough, even for the best existing method. In this paper, we propose an effective feature recalibration module based on the channel attention mechanism, together with a relative optimal calibration strategy, and apply them to the multi-view multi-person 3D human pose estimation task to improve detection accuracy for joints that are more severely affected by external factors. Specifically, the recalibration module and strategy achieve relative optimal weight adjustment of joint feature information, which enables the model to learn the dependencies between joints and the dependencies between people and their corresponding joints. We call this method the Efficient Recalibration Network (ER-Net). Finally, experiments were conducted on two benchmark datasets for this task, Campus and Shelf, on which the PCP reached 97.3% and 98.3%, respectively.
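The PCP (percentage of correct parts) figures quoted in this abstract depend on how a "correct part" is defined. The sketch below uses one commonly cited 3D variant, in which a limb counts as correct when the mean error of its two endpoints is at most half the ground-truth limb length. Whether the paper uses exactly this variant and threshold is an assumption, and the skeleton and coordinates are made up for illustration.

```python
import math

def pcp(pred, gt, limbs, alpha=0.5):
    """Fraction of limbs whose mean endpoint error is <= alpha * limb length.

    pred and gt map joint index -> (x, y, z); limbs is a list of
    (start_joint, end_joint) pairs. alpha = 0.5 is a common default,
    assumed here rather than taken from the abstract.
    """
    correct = 0
    for s, e in limbs:
        endpoint_err = (math.dist(pred[s], gt[s]) + math.dist(pred[e], gt[e])) / 2.0
        if endpoint_err <= alpha * math.dist(gt[s], gt[e]):
            correct += 1
    return correct / len(limbs)

# Hypothetical three-joint chain with two limbs of length 1.
gt = {0: (0.0, 0.0, 0.0), 1: (0.0, 1.0, 0.0), 2: (0.0, 2.0, 0.0)}
pred = {0: (0.0, 0.0, 0.0), 1: (0.0, 1.2, 0.0), 2: (0.0, 3.5, 0.0)}
limbs = [(0, 1), (1, 2)]
score = pcp(pred, gt, limbs)
print(score)
```

In this toy case the first limb passes (mean endpoint error 0.1) and the second fails (mean endpoint error 0.85), so half of the parts are counted as correct.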