Funding: Supported by the National Natural Science Foundation of China (61872024) and the National Key R&D Program of China (2018YFB2100603).
Abstract: Background: In this study, we propose a novel 3D scene graph prediction approach for scene understanding from point clouds. Methods: The approach automatically organizes the entities of a scene into a graph, where objects are nodes and their relationships are modeled as edges. More specifically, we employ DGCNN to capture the features of objects and their relationships in the scene. A Graph Attention Network (GAT) is introduced to exploit latent features obtained from the initial estimation and further refine the object arrangement in the graph structure. A loss function modified from cross-entropy with a variable weight is proposed to address the multi-category problem in object and predicate prediction. Results: Experiments show that the proposed approach performs favorably against state-of-the-art methods in predicate classification and relationship prediction, and achieves comparable performance in object classification. Conclusions: The 3D scene graph prediction approach can form an abstract description of the scene space from point clouds.
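The abstract does not spell out the refinement network or the variable-weight loss in detail; the sketch below is only a plausible reconstruction of the general pattern, assuming PyTorch and torch_geometric, in which GAT layers refine initial per-object features (e.g., from DGCNN) and an inverse-frequency weighted cross-entropy handles imbalanced object/predicate categories. Layer sizes, head counts, and the weighting rule are illustrative assumptions, not the paper's exact design.

```python
# Hedged sketch: GAT-based refinement of initial object features and a
# frequency-weighted cross-entropy; dimensions and weighting are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class GraphRefiner(nn.Module):
    def __init__(self, in_dim=256, hid_dim=256, num_obj=160, num_pred=27):
        super().__init__()
        self.gat1 = GATConv(in_dim, hid_dim, heads=4, concat=False)
        self.gat2 = GATConv(hid_dim, hid_dim, heads=4, concat=False)
        self.obj_head = nn.Linear(hid_dim, num_obj)
        self.pred_head = nn.Linear(2 * hid_dim, num_pred)

    def forward(self, x, edge_index):
        # x: [N, in_dim] initial object features (e.g., pooled DGCNN features)
        # edge_index: [2, E] candidate subject->object pairs
        h = F.elu(self.gat1(x, edge_index))
        h = F.elu(self.gat2(h, edge_index))
        obj_logits = self.obj_head(h)
        src, dst = edge_index
        # predicate logits from concatenated subject/object node features
        pred_logits = self.pred_head(torch.cat([h[src], h[dst]], dim=-1))
        return obj_logits, pred_logits

def weighted_ce(logits, target, class_freq, eps=1e-6):
    # one plausible "variable weight": inverse class frequency, renormalized
    w = 1.0 / (class_freq.float() + eps)
    w = w / w.sum() * len(class_freq)
    return F.cross_entropy(logits, target, weight=w)
```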
Funding: The National Natural Science Foundation of China (U22B2034) and the Fundamental Research Funds for the Central Universities (226-2022-00064).
Abstract: With the support of edge computing, the synergy and collaboration among the central cloud, edge cloud, and terminal devices form an integrated computing ecosystem known as the cloud-edge-client architecture. This integration unlocks the value of data and computational power, presenting significant opportunities for large-scale 3D scene modeling and XR presentation. In this paper, we explore the perspectives and highlight new challenges in point-cloud-based 3D scene modeling and XR presentation within the cloud-edge-client integrated architecture. We also propose a novel cloud-edge-client integrated technology framework and a demonstration of a municipal governance application to address these challenges.
Funding: Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 41930104).
Abstract: Three-dimensional (3D) high-fidelity surface models play an important role in urban scene construction. However, the data volume of such models is large and places a tremendous burden on rendering, so many applications must balance the visual quality of the models against rendering efficiency. This study provides a practical texture-baking pipeline for generating 3D models that reduces model complexity while preserving visually pleasing details. Concretely, we apply mesh simplification to the original model and use texture baking to create three types of baked textures: a diffuse map, a normal map, and a displacement map. The simplified model with the baked textures produces a pleasing visualization in a rendering engine. Furthermore, we discuss the influence of various factors in the process on the results, as well as the functional principles and characteristics of the baked textures. The proposed approach is very useful for real-time rendering on limited rendering hardware, as no additional memory or computing capacity is required to properly preserve the relief details of the model. Each step in the pipeline is described in detail to facilitate implementation.
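As a rough illustration of the baking principle described above (not the paper's pipeline), the snippet below measures, for each vertex of a simplified mesh, the signed offset to the high-poly surface along the vertex normal, which is the core of displacement-map baking; a production baker would rasterize the same quantity per texel in UV space. It assumes trimesh with its spatial-index extras; the meshes in the example are stand-ins.

```python
# Minimal sketch of the displacement-baking idea: signed offset from the
# simplified surface to the high-poly surface, computed per vertex for brevity.
# Requires: pip install "trimesh[easy]" numpy
import numpy as np
import trimesh

def bake_vertex_displacement(high: trimesh.Trimesh, low: trimesh.Trimesh) -> np.ndarray:
    # closest point on the high-poly surface for every low-poly vertex
    closest, dist, _ = trimesh.proximity.closest_point(high, low.vertices)
    offset = closest - low.vertices
    # positive where the detail lies outside the simplified surface
    sign = np.sign(np.einsum("ij,ij->i", offset, low.vertex_normals))
    return sign * dist  # one scalar per vertex

if __name__ == "__main__":
    high = trimesh.creation.icosphere(subdivisions=5)              # stand-in high-poly model
    low = trimesh.creation.icosphere(subdivisions=2, radius=0.98)  # stand-in simplified model
    print(bake_vertex_displacement(high, low).round(3))
```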
Abstract: Depth ambiguity is a major challenge in multi-person 3D pose estimation from a single image, and extracting image context has great potential to alleviate it. Most top-down methods model keypoint relations based on human detection; because human bounding boxes are coarse-grained and contain a large proportion of background noise, keypoints easily drift or are mismatched, which also undermines the reliability of absolute depth estimation based on the human scale factor. Bottom-up methods directly detect human keypoints in the image and then recover 3D poses one by one; although they explicitly capture scene context, they are weaker at relative depth estimation. We propose a new dual-branch network in which the top-down branch extracts human context from keypoint region proposals and the bottom-up branch extracts scene context in 3D space. A noise-suppressing human-context extraction method is proposed: human targets are described by modeling "keypoint region proposals", and pose-correlated dynamic sparse keypoint relations are modeled to prune weak connections and reduce noise propagation. A method for extracting scene context from a bird's-eye view is also proposed: image depth features are modeled and projected onto the bird's-eye plane to obtain the 3D spatial layout of human positions, and a human-scene context fusion network is designed to predict absolute human depth. Experimental results on the public MuPoTS-3D and Human3.6M datasets show that, compared with state-of-the-art models of the same type, the proposed HSC-Pose improves relative and absolute 3D keypoint position accuracy by at least 2.2% and 0.5%, respectively, and reduces the average root-keypoint position error by at least 4.2 mm.
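HSC-Pose's bird's-eye-view module is not reproduced here; the following is only a small numpy sketch of the general idea of projecting per-pixel depth onto a ground-plane grid to obtain a coarse spatial layout, under assumed pinhole intrinsics and an assumed occupancy-style pooling. The ranges and grid size are arbitrary.

```python
# Hedged sketch: scatter back-projected depth pixels onto a bird's-eye-view
# grid. Intrinsics (fx, cx), ranges, and pooling are illustrative assumptions.
import numpy as np

def depth_to_bev(depth, fx, cx, x_range=(-8.0, 8.0), z_range=(0.0, 16.0), cells=64):
    h, w = depth.shape
    us = np.arange(w, dtype=np.float32)
    xs = (us[None, :] - cx) * depth / fx   # lateral camera-space position per pixel
    zs = depth                             # forward distance per pixel
    bev = np.zeros((cells, cells), dtype=np.float32)
    xi = ((xs - x_range[0]) / (x_range[1] - x_range[0]) * cells).astype(int)
    zi = ((zs - z_range[0]) / (z_range[1] - z_range[0]) * cells).astype(int)
    valid = (xi >= 0) & (xi < cells) & (zi >= 0) & (zi < cells) & (depth > 0)
    np.add.at(bev, (zi[valid], xi[valid]), 1.0)   # occupancy-style pooling
    return bev
```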
Abstract: In this paper, a novel data- and rule-driven system for 3D scene description and segmentation in an unknown environment is presented. The system generates hierarchies of features that correspond to structural elements, such as the boundaries and shape classes of individual objects, as well as the relationships between objects. It is implemented as a high-level component added to an existing low-level binocular vision system [1]. Based on a pair of matched stereo images produced by that system, 3D segmentation is first performed to group object boundary data into several edge sets, each of which is believed to belong to a particular object. Gross features of each object are then extracted and stored in an object record. The final structural description of the scene is produced from the information in the object record, a set of rules, and a rule implementor. The system is designed to handle partially occluded objects of different shapes and sizes in the 2D image. Experimental results have shown its success in computing both object-level and structural-level descriptions of common man-made objects.
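The exact fields of the object record and the rule representation are not given in the abstract; the sketch below merely illustrates the kind of record-plus-rule organization described, with hypothetical field names and a toy support rule.

```python
# Hypothetical object record and a toy structural rule; field names and the
# rule are illustrative, not the paper's actual data structures.
from dataclasses import dataclass, field

@dataclass
class ObjectRecord:
    object_id: int
    edge_set: list                                  # grouped 3D boundary segments
    shape_class: str = "unknown"                    # e.g., "box", "cylinder"
    bbox_min: tuple = (0.0, 0.0, 0.0)
    bbox_max: tuple = (0.0, 0.0, 0.0)
    occluded: bool = False
    relations: list = field(default_factory=list)   # e.g., ("rests_on", 3)

def rule_rests_on(a: ObjectRecord, b: ObjectRecord, tol: float = 0.02) -> bool:
    """Toy rule: object a rests on object b if a's bottom meets b's top."""
    return abs(a.bbox_min[2] - b.bbox_max[2]) < tol
```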
Funding: Supported by the National Natural Science Foundation of China (No. 61976023).
Abstract: In this paper, we propose a Structure-Aware Fusion Network (SAFNet) for 3D scene understanding. As 2D images present more detailed appearance information while 3D point clouds convey more geometric information, fusing these two complementary sources of data can improve the discriminative ability of a model. Fusion is challenging, however, because 2D and 3D data are essentially different and are represented in different formats. Existing methods first extract 2D multi-view image features and then aggregate them onto sparse 3D point clouds, achieving superior performance; however, they ignore the structural relations between pixels and points and directly fuse the two modalities without adaptation. To address this, we propose a structural deep metric learning method on pixels and points to explore these relations and further use them to adaptively map the images and point clouds into a common canonical space for prediction. Extensive experiments on the widely used ScanNetV2 and S3DIS datasets verify the performance of the proposed SAFNet.
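SAFNet's actual loss and projection layers are not detailed in the abstract; the snippet below is only a generic PyTorch sketch of cross-modal metric learning in the spirit described: project pixel and point features into a shared space, pull matched pairs together, and push mismatched pairs apart. The dimensions and margin are assumptions.

```python
# Hedged sketch of cross-modal (pixel/point) metric learning in a shared space.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalProjector(nn.Module):
    def __init__(self, pix_dim=64, pt_dim=96, common_dim=128):
        super().__init__()
        self.pix_proj = nn.Linear(pix_dim, common_dim)
        self.pt_proj = nn.Linear(pt_dim, common_dim)

    def forward(self, pix_feat, pt_feat):
        # unit-normalized embeddings so dot products are cosine similarities
        return (F.normalize(self.pix_proj(pix_feat), dim=-1),
                F.normalize(self.pt_proj(pt_feat), dim=-1))

def contrastive_pixel_point_loss(pix_emb, pt_emb, margin=0.5):
    # pix_emb[i] and pt_emb[i] form a matched pixel/point pair
    sim = pix_emb @ pt_emb.t()                      # [B, B] similarities
    eye = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    pos, neg = sim[eye], sim[~eye]
    # matched pairs should be similar; mismatched pairs pushed below 1 - margin
    return (1.0 - pos).mean() + F.relu(neg - (1.0 - margin)).mean()
```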
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U2034202, 41871289, and 42171397) and the Sichuan Science and Technology Program (Grant No. 2020JDTD0003).
Abstract: As an important technology of digital construction, real 3D models can improve the immersion and realism of virtual reality (VR) scenes. The large amount of data in real 3D scenes requires more effective rendering methods, but current rendering optimization methods have shortcomings and cannot render real 3D scenes in virtual reality. In this study, the location of the viewing frustum is predicted by a Kalman filter, and eye-tracking equipment is used to recognize the region of interest (ROI) in the scene; the real 3D model of interest within the predicted frustum is then rendered first. The experimental results show that the proposed method can predict the frustum location approximately 200 ms in advance with a prediction accuracy of approximately 87%, improves scene rendering efficiency by 8.3%, and reduces motion sickness by approximately 54.5%. These results help promote the use of real 3D models and ROI recognition methods in virtual reality. In future work, we will further improve the prediction accuracy of viewing frustums in virtual reality and the application of eye tracking in virtual geographic scenes.
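The paper's exact filter design is not given in the abstract; below is a minimal constant-velocity Kalman filter sketch, in numpy, of how a tracked viewpoint position could be filtered and extrapolated roughly 200 ms ahead. The state layout, noise levels, and 90 Hz update step are assumptions.

```python
# Hedged sketch: constant-velocity Kalman filter that extrapolates the tracked
# viewpoint position ~200 ms ahead; dt and noise magnitudes are assumptions.
import numpy as np

class ViewpointPredictor:
    def __init__(self, dt=1 / 90, q=1e-3, r=1e-2):
        self.x = np.zeros(6)                    # state: [x, y, z, vx, vy, vz]
        self.P = np.eye(6)
        self.F = np.eye(6); self.F[:3, 3:] = dt * np.eye(3)
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])
        self.Q = q * np.eye(6)
        self.R = r * np.eye(3)

    def update(self, measured_pos):
        # predict one step, then correct with the latest tracked position
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        y = np.asarray(measured_pos) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P

    def predict_ahead(self, horizon=0.2):
        # extrapolate the position ~200 ms into the future
        Fh = np.eye(6); Fh[:3, 3:] = horizon * np.eye(3)
        return (Fh @ self.x)[:3]
```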
Funding: The National Natural Science Foundation of China (Nos. 61073086 and 70871078) and the National High Technology Research and Development Program (863) of China (No. 2008AA04Z126).
Abstract: The increasing scale and complexity of 3D scene design work call for an efficient way to understand designs in multi-disciplinary teams and to exploit the experience and underlying knowledge in previous work for reuse. However, previous research pays little attention to relationship maintenance and design reuse at the knowledge level. We propose a novel semantics-driven design reuse system, including a property computation algorithm that computes properties during the modeling process to maintain semantic consistency, and a vertex-statistics-based algorithm that recognizes scene design patterns as a universal semantic model for scenes of the same type. With the universal semantic model, the system guides the modeling process of future design work through suggestions and operational constraints. The proposed framework enables the reuse of 3D scene designs at both the model level and the knowledge level.
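How the vertex-statistics algorithm actually summarizes a scene is not described in the abstract; the toy sketch below only illustrates one way per-model vertex statistics could be turned into a fixed-length descriptor for comparing designs of the same type. The chosen statistics and the distance measure are assumptions.

```python
# Toy "vertex statistics" descriptor: extents, per-axis spread, and a radial
# histogram; the specific statistics and distance are illustrative assumptions.
import numpy as np

def vertex_statistics(vertices: np.ndarray) -> np.ndarray:
    # vertices: [N, 3] positions of one model in the scene
    centered = vertices - vertices.mean(axis=0)
    extents = centered.max(axis=0) - centered.min(axis=0)
    radial = np.linalg.norm(centered, axis=1)
    hist, _ = np.histogram(radial, bins=8,
                           range=(0.0, radial.max() + 1e-9), density=True)
    return np.concatenate([extents, centered.std(axis=0), hist])

def descriptor_distance(a_vertices: np.ndarray, b_vertices: np.ndarray) -> float:
    # relative distance between two descriptors (smaller = more similar)
    da, db = vertex_statistics(a_vertices), vertex_statistics(b_vertices)
    return float(np.linalg.norm(da - db) /
                 (np.linalg.norm(da) + np.linalg.norm(db) + 1e-9))
```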