There are about 253 million people with visual impairment worldwide. Many of them use a white cane and/or a guide dog as the mobility tool for daily travel. Despite decades of effort, an electronic navigation aid that can replace the white cane is still a work in progress. In this paper, we propose an RGB-D camera based visual positioning system (VPS) for real-time localization of a robotic navigation aid (RNA) in an architectural floor plan for assistive navigation. The core of the system is the combination of a new 6-DOF depth-enhanced visual-inertial odometry (DVIO) method and a particle filter localization (PFL) method. DVIO estimates the RNA's pose using data from an RGB-D camera and an inertial measurement unit (IMU). It extracts the floor plane from the camera's depth data and tightly couples the floor plane, the visual features (with and without depth data), and the IMU's inertial data in a graph optimization framework to estimate the device's 6-DOF pose. Owing to the use of the floor plane and depth data from the RGB-D camera, DVIO achieves better pose estimation accuracy than conventional VIO methods. To reduce the accumulated pose error of DVIO when navigating a large indoor space, we developed the PFL method to locate the RNA in the floor plan. PFL leverages geometric information from the architectural CAD drawing of an indoor space to further reduce the error of the DVIO-estimated pose. Based on the VPS, an assistive navigation system is developed for the RNA prototype to assist a visually impaired person in navigating a large indoor space. Experimental results demonstrate that: 1) the DVIO method achieves better pose estimation accuracy than a state-of-the-art VIO method and performs real-time pose estimation (18 Hz pose update rate) on a UP Board computer; 2) PFL reduces the DVIO-accrued pose error by 82.5% on average and allows for accurate wayfinding (endpoint position error ≤ 45 cm) in large indoor spaces.
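The predict-weight-resample cycle at the heart of a particle filter like PFL can be sketched in a few lines. This is a minimal 1-D toy, not the paper's implementation: the "floor-plan geometry" is reduced to a single wall at a known position, the odometry to a scalar displacement, and all function names and numbers are illustrative assumptions.

```python
import math
import random

def pfl_step(particles, odom_dx, measured_range, wall_x, noise_std=0.05):
    """One predict-weight-resample cycle of a 1-D particle filter.
    particles: candidate robot positions; odom_dx: odometry displacement;
    measured_range: sensed distance to a wall at known position wall_x
    (the kind of geometric prior a floor plan provides)."""
    # Predict: propagate every particle by the odometry, with process noise
    moved = [p + odom_dx + random.gauss(0.0, noise_std) for p in particles]
    # Weight: Gaussian likelihood of the range measurement for each particle
    weights = [math.exp(-((wall_x - p) - measured_range) ** 2 / (2 * noise_std ** 2))
               for p in moved]
    total = sum(weights) or 1.0
    # Resample: draw a new particle set proportional to the weights
    return random.choices(moved, weights=[w / total for w in weights], k=len(moved))

random.seed(0)
particles = [random.uniform(0.0, 2.0) for _ in range(500)]   # broad initial belief
# True position after the move is 0.8 m; the wall sits at x = 5.0 m,
# so the (noise-free) range measurement is 4.2 m.
particles = pfl_step(particles, odom_dx=0.3, measured_range=4.2, wall_x=5.0)
estimate = sum(particles) / len(particles)
print(round(estimate, 2))
```

One such step already collapses the broad initial belief to within a few centimetres of the true position; the real system repeats this at every DVIO pose update.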
With the increasing demand for reliable printed circuit board (PCB) products, there is considerable need for high-speed, high-precision vision positioning systems. To locate a rectangular lead component with high accuracy and reliability, a new visual positioning method was introduced. Considering the limitations of the Ghosal sub-pixel edge detection algorithm, an improved algorithm was proposed in which Harris corner features are used to coarsely detect the edge points and Zernike moments are adopted to detect them accurately. In addition, two formulas were developed to determine the edge intersections, whose sub-pixel coordinates were calculated with bilinear interpolation and the conjugate gradient method. Experimental results show that the proposed method can detect the deflection and offset, with detection errors of less than 0.04° and 0.02 pixels, respectively.
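The coarse-to-fine idea above (integer-pixel detection followed by sub-pixel refinement) can be illustrated with a much simpler refinement than the paper's Zernike-moment method: fitting a parabola through three gradient samples around the coarse edge location. This is a stand-in sketch, not the Ghosal/Zernike scheme itself.

```python
def subpixel_edge(gradient, i):
    """Refine an integer edge location i to sub-pixel accuracy by fitting
    a parabola through the gradient magnitudes at i-1, i, i+1.
    (Illustrative stand-in for the Zernike-moment refinement in the paper.)"""
    g0, g1, g2 = gradient[i - 1], gradient[i], gradient[i + 1]
    denom = g0 - 2 * g1 + g2          # curvature of the fitted parabola
    if denom == 0:                    # flat triple: no refinement possible
        return float(i)
    offset = 0.5 * (g0 - g2) / denom  # vertex of the parabola, in [-0.5, 0.5]
    return i + offset

# A 1-D gradient-magnitude profile with its coarse peak at index 3
profile = [0.0, 0.1, 0.6, 1.0, 0.9, 0.2, 0.0]
print(round(subpixel_edge(profile, 3), 3))
```

The asymmetric neighbours (0.6 vs 0.9) pull the refined edge slightly to the right of the integer peak, which is exactly the effect sub-pixel detection is after.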
Methods for visual recognition, positioning, and orientation of simple 3D geometric workpieces are presented in this paper. The principle and operating process of multiple-orientation run-length coding, based on general orientation run-length coding, and the visual recognition method are described in detail. The method of positioning and orientation based on the moment of inertia of the workpiece's binary image is also described. It has been applied in research on a flexible automatic coordinate measuring system formed by integrating computer-aided design, computer vision, and computer-aided inspection planning with a coordinate measuring machine. The results show that integrating computer vision with a measurement system is a feasible and effective approach to improving its flexibility and automation.
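Orientation from the moment of inertia of a binary image reduces to the second-order central moments of the foreground pixels: the principal axis lies at half the angle of atan2(2μ11, μ20 − μ02). A minimal sketch under that standard formula (the blob coordinates are made up for illustration):

```python
import math

def orientation_from_moments(pixels):
    """Principal-axis orientation (radians) of a binary region, computed
    from the second-order central moments of its foreground pixels."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n          # centroid x
    cy = sum(y for _, y in pixels) / n          # centroid y
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)

# An elongated blob lying along the 45-degree diagonal
blob = [(i, i) for i in range(10)] + [(i, i + 1) for i in range(9)]
angle = math.degrees(orientation_from_moments(blob))
print(round(angle, 1))  # → 45.0
```

The centroid gives the position of the workpiece and the principal axis its orientation, which is all a pick-or-measure operation on a simple part needs.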
Indoor visual localization, which uses camera imagery to compute a user's pose, is a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally rely on scene-specific 3D representations or are trained on specific datasets, making it hard to balance accuracy and cost when they are applied to new scenes. To address this issue, this paper proposes a universal indoor visual localization method based on efficient image retrieval. First, a Multi-Layer Perceptron (MLP) is employed to aggregate features from intermediate layers of a convolutional neural network into a global image representation, ensuring accurate and rapid retrieval of reference images. Next, a new mechanism based on Random Sample Consensus (RANSAC) is designed to resolve the relative-pose ambiguity caused by decomposing the essential matrix estimated with the five-point method. Finally, the absolute pose of the queried user image is computed, achieving indoor user pose estimation. The proposed method is simple, flexible, and generalizes well across scenes. Experimental results show a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset, illustrating the method's strong performance.
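Decomposing an essential matrix yields four (R, t) candidates, and the classic way to disambiguate them is a cheirality check: only the true pose places the triangulated points in front of both cameras. The sketch below shows just that voting step, with hypothetical per-candidate depths standing in for real triangulation output; it is the textbook check, not the paper's specific RANSAC mechanism.

```python
def cheirality_select(candidates):
    """Resolve the four-fold (R, t) ambiguity of essential-matrix
    decomposition: keep the candidate for which the most triangulated
    points have positive depth in front of BOTH cameras.

    candidates: {name: [(z1, z2), ...]} mapping each of the four
    decompositions to the depths its triangulation assigns each point."""
    def votes(depths):
        return sum(1 for z1, z2 in depths if z1 > 0 and z2 > 0)
    return max(candidates, key=lambda name: votes(candidates[name]))

# Hypothetical depths for 5 points under the four classic candidates:
# only the true (R, t) places all points in front of both cameras.
decompositions = {
    "(R1, +t)": [(2.1, 1.9), (3.0, 2.8), (1.5, 1.4), (4.2, 4.0), (2.7, 2.5)],
    "(R1, -t)": [(-2.1, 1.9), (-3.0, -2.8), (1.5, -1.4), (-4.2, 4.0), (2.7, -2.5)],
    "(R2, +t)": [(2.1, -1.9), (-3.0, 2.8), (-1.5, -1.4), (4.2, -4.0), (-2.7, 2.5)],
    "(R2, -t)": [(-2.1, -1.9), (3.0, -2.8), (-1.5, 1.4), (-4.2, -4.0), (-2.7, -2.5)],
}
print(cheirality_select(decompositions))
```

Running the RANSAC loop over many point subsets, as the paper does, makes this vote robust to outlier correspondences.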
In the visual positioning of Unmanned Ground Vehicles (UGVs), visual odometry based on the direct sparse method (DSO) offers low computational cost, high real-time performance, and high robustness, so it is more widely used than feature-point-based visual odometry. However, ordinary vision sensors have a narrower viewing angle than panoramic vision sensors, and a single image frame contains fewer landmarks, resulting in poor landmark tracking and positioning and severely restricting the development of visual odometry. Based on these considerations, this paper proposes a binocular stereo panoramic vision positioning algorithm based on extended DSO that addresses these problems. Experimental results show that the algorithm can directly obtain a panoramic depth image around the UGV, greatly improving the accuracy and robustness of visual positioning compared with ordinary visual odometry. It has wide application prospects in the UGV field.
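What makes DSO a "direct" method is that it estimates motion by minimizing a photometric residual at sparse high-gradient points rather than by matching features. A deliberately tiny 1-D sketch of that residual, where the "pose" collapses to an integer pixel shift (all data below is synthetic):

```python
def photometric_error(ref, cur, points, shift):
    """Sum of squared intensity differences at sparse sample points:
    the residual a direct method such as DSO minimizes. 1-D toy version,
    where the camera 'pose' is a pure integer pixel shift."""
    return sum((ref[p] - cur[p + shift]) ** 2
               for p in points if 0 <= p + shift < len(cur))

# cur is ref shifted right by 2 pixels; recover that shift by searching
# for the pose that best explains the sparse intensity samples.
ref = [0, 0, 10, 40, 90, 40, 10, 0, 0, 0]
cur = [0, 0, 0, 0, 10, 40, 90, 40, 10, 0]
points = [2, 3, 4, 5, 6]            # sparse high-gradient points in ref
best = min(range(-3, 4), key=lambda s: photometric_error(ref, cur, points, s))
print(best)  # → 2
```

In the real algorithm the search is a Gauss-Newton optimization over a 6-DOF pose, and the panoramic extension simply gives it many more high-gradient points per frame to anchor on.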
This paper proposes an uncalibrated workpiece positioning method for peg-in-hole assembly performed by an industrial robot. Depth images are used to identify and locate the workpieces when a peg-in-hole assembly task is carried out by an industrial robot in a flexible production system. First, the depth image is thresholded according to the depth of the workpiece surface to filter out background interference. Second, a series of image processing and feature recognition algorithms are executed to extract the outer contour and locate the center point. This image information, fed back by the vision system, drives the robot to an approximate position. Finally, the Hough circle detection algorithm extracts the features and relevant parameters of the circular assembly hole from the color image for accurate positioning. Experimental results show that the positioning accuracy of this method is between 0.6 and 1.2 mm in the experimental system used. The entire positioning process requires no complicated calibration, and the method is highly flexible, making it suitable for automatic assembly tasks with multiple specifications or small batches in a flexible production system.
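The first step (thresholding the depth image around the workpiece surface to suppress the background) is a simple band-pass mask. A minimal sketch with a made-up 4x4 depth map, where the workpiece sits near 0.50 m and the table at 0.80 m:

```python
def threshold_depth(depth, z_min, z_max):
    """Mask a depth image: keep pixels whose depth lies in the band
    around the workpiece surface, zeroing out the background."""
    return [[1 if z_min <= z <= z_max else 0 for z in row] for row in depth]

# Toy depth map (metres): workpiece surface near 0.50 m, table at 0.80 m
depth = [
    [0.80, 0.80, 0.80, 0.80],
    [0.80, 0.51, 0.49, 0.80],
    [0.80, 0.50, 0.50, 0.80],
    [0.80, 0.80, 0.80, 0.80],
]
mask = threshold_depth(depth, 0.45, 0.55)
print(mask)
```

Everything downstream (contour extraction, center-point location) then operates only on the surviving foreground pixels, which is what makes the depth cue effective against clutter.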
To enable an unmanned aerial vehicle (UAV) to position itself automatically during power-line inspection, a visual positioning method that uses an encoded sign as a cooperative target is proposed. First, we discuss how to design the encoded sign and propose a robust contour-based decoding algorithm. Second, the AdaBoost algorithm is used to train a classifier that detects the encoded sign in an image. Finally, the position of the UAV is calculated from the projective relation between the object points and their corresponding image points. The experiments comprise two parts. First, simulated video data are used to verify the feasibility of the proposed method; the results show that the average absolute error in each direction is below 0.02 m. Second, a video acquired from an actual UAV flight is used to calculate the UAV's position; the results show that the calculated trajectory is consistent with the actual flight path. The method runs at 0.153 s per frame.
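The projective relation between a known cooperative target and its image can be shown in its simplest form: when the marker is fronto-parallel to the camera, similar triangles give the depth from the marker's known physical width, and the pixel offset from the principal point gives the lateral position. This is a simplification of the general pose solve in the paper; the intrinsics and pixel coordinates below are assumed values.

```python
def position_from_marker(f_px, cx, cy, marker_width_m, u_left, u_right, v_center):
    """Camera position relative to a planar marker, assuming the marker is
    fronto-parallel (a simplification of the general projective pose solve).
    f_px: focal length in pixels; (cx, cy): principal point;
    u_left/u_right: image x of the marker's edges; v_center: image y of its centre."""
    w_px = u_right - u_left                  # marker width on the image plane
    z = f_px * marker_width_m / w_px         # similar triangles: depth
    u_center = 0.5 * (u_left + u_right)
    x = z * (u_center - cx) / f_px           # lateral offset
    y = z * (v_center - cy) / f_px           # vertical offset
    return x, y, z

# 800 px focal length, 640x480 image, a 0.5 m marker spanning 100 px, centred
x, y, z = position_from_marker(800.0, 320.0, 240.0, 0.5, 270.0, 370.0, 240.0)
print(round(x, 3), round(y, 3), round(z, 3))  # → 0.0 0.0 4.0
```

With the marker tilted or rotated, the same idea generalizes to a full perspective-n-point solve over all decoded corner correspondences.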
In recent years, many visual positioning algorithms based on computer vision have been proposed and have achieved good results. However, these algorithms serve a single function, cannot perceive the environment, and have poor versatility; mismatches also occur, which degrades positioning accuracy. Therefore, this paper proposes a localization algorithm that combines target recognition with depth-feature matching to address UAV environment perception and multi-modal image-matching fusion localization. The algorithm is based on the single-shot object detector with a multi-level feature pyramid network (M2Det), replacing the original visual geometry group (VGG) feature extraction network with ResNet-101 to improve the model's feature extraction capability. By introducing a depth-feature matching algorithm that shares neural network weights, the design realizes both UAV target recognition and multi-modal image-matching fusion positioning. When the reference and real-time images are mismatched, a dynamic adaptive proportional constraint combined with the random sample consensus algorithm (DAPC-RANSAC) is used to optimize the matching results and improve the rate of correct matches. The proposed algorithm was compared and analyzed on a multi-modal registration dataset to verify its superiority and feasibility. The results show that it can effectively match multi-modal image pairs (visible–infrared, infrared–satellite, visible–satellite) and remains stable and robust under changes in contrast, scale, brightness, blur, and deformation. Finally, the effectiveness and practicality of the algorithm were verified in an aerial test with an S1000 six-rotor UAV.
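Filtering mismatched correspondences with RANSAC can be illustrated with a plain version over the simplest possible motion model, a pure translation; this is a stand-in sketch, not the paper's DAPC-RANSAC, and all match coordinates below are synthetic.

```python
import random

def ransac_translation(matches, iters=200, tol=2.0, seed=1):
    """Plain RANSAC over a pure-translation model between matched keypoints
    (a simplified stand-in for the paper's DAPC-RANSAC): hypothesise the
    shift from one randomly chosen match, keep the hypothesis that gathers
    the most inliers."""
    rng = random.Random(seed)
    best_shift, best_inliers = None, []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.choice(matches)    # minimal sample: 1 match
        dx, dy = x2 - x1, y2 - y1                   # hypothesised shift
        inliers = [m for m in matches
                   if abs((m[1][0] - m[0][0]) - dx) <= tol
                   and abs((m[1][1] - m[0][1]) - dy) <= tol]
        if len(inliers) > len(best_inliers):
            best_shift, best_inliers = (dx, dy), inliers
    return best_shift, best_inliers

# 8 correct matches shifted by (10, 5), plus 3 gross mismatches
good = [((i, i), (i + 10, i + 5)) for i in range(8)]
bad = [((0, 0), (50, -30)), ((3, 1), (-20, 40)), ((5, 5), (90, 90))]
shift, inliers = ransac_translation(good + bad)
print(shift, len(inliers))  # → (10, 5) 8
```

The consensus step is what lets a handful of gross multi-modal mismatches be rejected without disturbing the pose estimated from the consistent majority.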
A new visual measurement method is proposed to estimate the three-dimensional (3D) position of an object on the floor using a single camera. The camera, fixed on a robot, is inclined with respect to the floor. A measurement model involving the camera's extrinsic parameters, such as its height and pitch angle, is described. After the camera's intrinsic parameters are calibrated, a single image of a chessboard pattern placed on the floor is enough to calibrate the extrinsic parameters. The position of an object on the floor can then be computed with the measurement model. Furthermore, the height of an object can be calculated from paired points on a vertical line that share the same floor position. Compared with conventional methods that estimate positions only on the plane, this method obtains full 3D positions. Indoor experiments verify the accuracy and validity of the proposed method.
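The core of such a measurement model is back-projecting a pixel ray and intersecting it with the floor plane, using the calibrated height and pitch. A minimal sketch of that geometry (frame conventions and all numeric values are assumptions for illustration, not the paper's calibration):

```python
import math

def floor_point(u, v, f, cx, cy, height, pitch):
    """Back-project pixel (u, v) onto the floor plane for a camera mounted
    `height` metres above the floor and pitched down by `pitch` radians.
    Returns (X, Z): lateral offset and forward distance on the floor.
    Camera frame: x right, y down, z forward."""
    # Ray through the pixel in the camera frame (unit focal-plane coords)
    xc, yc, zc = (u - cx) / f, (v - cy) / f, 1.0
    # Rotate the ray about the x-axis by the pitch into the world frame
    yw = math.cos(pitch) * yc + math.sin(pitch) * zc
    zw = -math.sin(pitch) * yc + math.cos(pitch) * zc
    if yw <= 0:
        raise ValueError("ray does not hit the floor")
    t = height / yw                  # scale at which the ray meets the floor
    return xc * t, zw * t

# Camera 1 m above the floor, pitched down 45 degrees; the pixel at the
# principal point looks straight along the optical axis.
X, Z = floor_point(320, 240, 800.0, 320, 240, 1.0, math.radians(45))
print(round(X, 3), round(Z, 3))  # → 0.0 1.0
```

Two pixels on the same vertical edge of an object yield two such floor intersections, and the mismatch between them is what the paper exploits to recover the object's height.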
Funding: supported by the NIBIB and the NEI of the National Institutes of Health (R01EB018117).
Funding: Project (51175242) supported by the National Natural Science Foundation of China; Project (BA2012031) supported by the Jiangsu Province Science and Technology Foundation of China.
Funding: supported by the National Natural Science Foundation of China (Grant No. 61773059) and the National Defense Technology Foundation Program of China (Grant No. 20230028), which provided funds for conducting the experiments.
Funding: supported by the National Key Research Projects (No. 2016YFB0501403) and the National Demonstration Center for Experimental Remote Sensing & Information Engineering (Wuhan University).
Funding: supported in part by the National Natural Science Foundation of China under Grant 62276274, in part by the Natural Science Foundation of Shaanxi Province under Grant 2020JM-537, and in part by the Aeronautical Science Fund under Grant 201851U8012 (corresponding author: Xiaogang Yang).
Funding: supported by the National Natural Science Foundation of China (Nos. 61273352 and 61473295), the National High Technology Research and Development Program of China (863 Program) (No. 2015AA042307), and the Beijing Natural Science Foundation (No. 4161002).