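Abstract: Most mainstream human action recognition methods are built on convolutional neural networks (CNNs). However, CNNs tend to overlook spatial position information in video, which weakens action recognition in the spatial frequency domain of video. Conventional CNNs also cannot quickly locate key feature positions, and they cannot compute in parallel during training, which lowers efficiency. To address the problems of conventional CNNs in handling the temporal frequency domain and in parallel computation, a human action recognition algorithm is proposed that is based on the Vision Transformer (ViT) and C3D (Learning Spatiotemporal Features with 3D Convolutional Networks). C3D extracts multi-dimensional feature maps from the video, and the ViT feature-slicing window partitions these multi-dimensional features into global features; the Transformer encoder-decoder module then predicts the human action in the video. Experimental results show that the proposed algorithm improves action recognition accuracy on the UCF-101 and HMDB51 datasets.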
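The feature-slicing step described above can be pictured with a small sketch. It assumes a single 2D channel of a feature map and non-overlapping square windows; the function name `split_into_patches` and the toy sizes are hypothetical, and the abstract does not specify the actual window size or tensor layout.

```python
def split_into_patches(feature_map, patch):
    """Split an H x W feature map (list of rows) into flattened patch x patch tokens."""
    height, width = len(feature_map), len(feature_map[0])
    if height % patch or width % patch:
        raise ValueError("feature map size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one window row by row into a single token vector.
            token = [feature_map[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            tokens.append(token)
    return tokens


# 4 x 4 toy map standing in for one channel of a C3D feature map (assumed shape).
toy_map = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = split_into_patches(toy_map, 2)
```

Each resulting token would then be embedded and passed to the Transformer; that part is omitted here.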
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 62075241).
Abstract: Single-pixel imaging (SPI) can transform 2D or 3D image data into 1D light signals, which offers promising prospects for image compression and transmission. However, during data communication these light signals in public channels will easily draw the attention of eavesdroppers. Here, we introduce an efficient encryption method for SPI data transmission that uses the 3D Arnold transformation to directly disrupt 1D single-pixel light signals and utilizes the elliptic curve encryption algorithm for key transmission. This encryption scheme directly employs Hadamard patterns to illuminate the scene and then utilizes the 3D Arnold transformation to permute the 1D light signal of single-pixel detection. The transformation parameters then serve as the secret key, while the security of key exchange is guaranteed by an elliptic curve-based key exchange mechanism. Both computer simulations and optical experiments have been conducted to demonstrate that, compared with existing encryption schemes, the proposed technique not only enhances the security of encryption but also eliminates the need for complicated pattern scrambling rules. Additionally, this approach solves the problem of secure key transmission, thus ensuring the security of information and the quality of the decrypted images.
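As a rough illustration of permuting a 1D signal with a 3D Arnold-type map, the sketch below treats a signal of length n^3 as an n x n x n cube and moves each sample to the position given by an integer matrix with determinant 1, taken modulo n. The matrix, the function names, and the use of the iteration count as the only key parameter are assumptions for illustration; the abstract does not state the actual transformation parameters.

```python
# Hypothetical unimodular matrix (determinant 1), so the map is a bijection mod n.
ARNOLD_MATRIX = ((1, 1, 1), (1, 2, 2), (1, 2, 3))


def arnold3d_permutation(n, iterations=1):
    """Return perm where sample i of the original signal moves to index perm[i]."""
    a = ARNOLD_MATRIX
    perm = []
    for idx in range(n ** 3):
        x, rem = divmod(idx, n * n)
        y, z = divmod(rem, n)
        for _ in range(iterations):
            x, y, z = (
                (a[0][0] * x + a[0][1] * y + a[0][2] * z) % n,
                (a[1][0] * x + a[1][1] * y + a[1][2] * z) % n,
                (a[2][0] * x + a[2][1] * y + a[2][2] * z) % n,
            )
        perm.append((x * n + y) * n + z)
    return perm


def scramble(signal, n, iterations=1):
    """Permute a 1D signal of length n**3."""
    out = [None] * len(signal)
    for src, dst in enumerate(arnold3d_permutation(n, iterations)):
        out[dst] = signal[src]
    return out


def unscramble(scrambled, n, iterations=1):
    """Invert scramble() given the same parameters (the secret key in this sketch)."""
    return [scrambled[dst] for dst in arnold3d_permutation(n, iterations)]


signal = list(range(27))            # stand-in for 27 single-pixel measurements
cipher = scramble(signal, 3, iterations=2)
recovered = unscramble(cipher, 3, iterations=2)
```

The elliptic-curve key exchange mentioned in the abstract is a separate step and is not modeled here.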
Funding: Supported in part by the Major Project for New Generation of AI (2018AAA0100400), the National Natural Science Foundation of China (61836014, U21B2042, 62072457, 62006231), and the InnoHK Program.
Abstract: Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps with off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods attempt to convert estimated depth maps to pseudo-LiDAR and then use LiDAR-based object detectors, or focus on image and depth fusion learning. However, they demonstrate limited performance and efficiency as a result of depth inaccuracy and a complex convolution-based fusion mode. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps to achieve more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them using cross-attention to exchange information with each other. Furthermore, with the help of pixel-wise relative depth values in depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. Experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of our proposed method and its superior performance over previous counterparts.
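The cross-attention fusion between the two branches can be sketched in a few lines. This is generic scaled dot-product cross-attention, in which tokens from one branch query the tokens of the other; the learned projections, the Swin windowing, and the depth-based relative position embeddings of NF-DVT are deliberately left out, and all names and toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Each query token attends over all key/value tokens of the other branch."""
    scale = math.sqrt(len(keys[0]))
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        fused.append([sum(w * v[d] for w, v in zip(weights, values))
                      for d in range(len(values[0]))])
    return fused


rgb_tokens = [[1.0, 0.0], [0.0, 1.0]]      # queries from the RGB branch (toy values)
depth_keys = [[1.0, 0.0], [1.0, 0.0]]      # identical keys give uniform attention weights
depth_values = [[2.0, 4.0], [4.0, 8.0]]
fused = cross_attention(rgb_tokens, depth_keys, depth_values)
```

Because the two keys are identical in this toy input, every query averages the two value vectors, which makes the behaviour easy to check by hand.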
Funding: This research is supported by the National Natural Science Foundation of China (51932003, 51872115), the 2020 International Cooperation Project of the Department of Science and Technology of Jilin Province (20200801001GH), the Program for the Development of Science and Technology of Jilin Province (20190201309JC), the Jilin Province/Jilin University Co-Construction Project-Funds for New Materials (SXGJSF2017-3, Branch-2/440050316A36), the Project for Self-innovation Capability Construction of Jilin Province Development and Reform Commission (2021C026), the Open Project Program of Wuhan National Laboratory for Optoelectronics (2018WNLOKF022), the Program for JLU Science and Technology Innovative Research Team (JLUSTIRT, 2017TD-09), the Fundamental Research Funds for the Central Universities JLU, and the "Double-First Class" Discipline for Materials Science & Engineering.
Abstract: Energy density can be substantially raised and even maximized if the bulk of an electrode material is fully utilized. Transition metal oxides based on the conversion reaction mechanism are the imperative choice, because both nanostructure construction and intercalation pseudocapacitance have intrinsic limitations. However, full bulk utilization of transition metal oxides is hindered by the poor understanding of the atomic-level conversion reaction mechanism; in particular, it remains largely unclear how the phase transformation (conversion reaction) determines electrochemical performance such as power density and cyclic stability. Herein, α-Fe_(2)O_(3) is taken as a case to show how diffusional and diffusionless transformations determine the electrochemical behaviors, given its conversion reaction mechanism with full bulk utilization in alkaline electrolyte. Specifically, the discharge product α-FeOOH, formed diffusionally from Fe(OH)_(2), is structurally identified as the atomic-level chief culprit for the deterioration of cyclic stability, whereas its counterpart δ-FeOOH is theoretically diffusionless-like, unlocking the full potential of the pseudocapacitance with full bulk utilization. Thus, such pseudocapacitance, demonstrated as a proof of concept and termed conversion pseudocapacitance, is achieved via a diffusionless-like transformation. This work not only provides an atomic-level perspective for reassessing the potential electrochemical performance of transition metal oxide electrode materials based on the conversion reaction mechanism but also debuts a new paradigm for pseudocapacitance.
Funding: National Natural Science Foundation of China (51909136); Open Research Fund of Key Laboratory of Geological Hazards on Three Gorges Reservoir Area (China Three Gorges University), Ministry of Education (Grant No. 2022KDZ21); Fund of National Major Water Conservancy Project Construction (0001212022CC60001).
Abstract: The staggered distribution of joints and fissures in space constitutes the weak part of any rock mass. The identification of rock mass structural planes and the extraction of characteristic parameters are the basis of rock-mass integrity evaluation, which is very important for the analysis of slope stability. The laser scanning technique can be used to acquire the coordinate information of each point of the structural plane, but the large amount of point cloud data, uneven density distribution, and noise point interference mean that the efficiency and accuracy of identifying different types of structural planes are limited by point cloud data analysis technology. A new point cloud identification and segmentation algorithm for rock mass structural surfaces is proposed. Based on the distribution states of the original point cloud in different neighborhoods in space, the point clouds are characterized by multi-dimensional eigenvalues and calculated by the robust randomized Hough transform (RRHT). The normal vector difference and the final eigenvalue are proposed for characteristic distinction, and the identification of rock mass structural surfaces is completed through regional growth, which strengthens the difference expression of point clouds. In addition, nearest voxel downsampling is introduced in the RRHT calculation, which further reduces the number of sources of neighborhood noise, thereby improving the accuracy and stability of the calculation. The advantages of the method have been verified by laboratory models. The results show that the proposed method can better achieve the segmentation and statistics of structural planes with interfaces and sharp boundaries. The method works well in the identification of joints, fissures, and other structural planes on the Mangshezhai slope in the Three Gorges Reservoir area, China. It can provide a stable and effective technique for the identification and segmentation of rock mass structural planes, which is beneficial in engineering practice.
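One plausible reading of "nearest voxel downsampling" is sketched below: the cloud is binned into cubic voxels, and for each occupied voxel only the point closest to the voxel center is kept. This interpretation, the function name, and the voxel size are assumptions; the abstract does not define the exact variant used with RRHT.

```python
import math


def voxel_downsample_nearest(points, voxel_size):
    """Keep, for each occupied voxel, the point closest to that voxel's center."""
    best = {}
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        center = [(k + 0.5) * voxel_size for k in key]
        dist = sum((c - m) ** 2 for c, m in zip(p, center))
        # Replace the stored point only if this one is strictly closer to the center.
        if key not in best or dist < best[key][0]:
            best[key] = (dist, p)
    return [p for _, p in best.values()]


# Toy cloud: three points share the voxel [0, 1)^3, one point lies in the next voxel.
cloud = [(0.1, 0.1, 0.1), (0.4, 0.5, 0.6), (0.9, 0.9, 0.9), (1.2, 0.3, 0.3)]
reduced = voxel_downsample_nearest(cloud, 1.0)
```

Keeping an original point, rather than a voxel centroid, means the downsampled cloud contains only measured coordinates, which may matter when normals are estimated afterwards.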
Funding: Supported by the National Natural Science Foundation of China (U22B6004).
Abstract: The major enrichment type of shale oil in the Chang 7_(3) shale of the Upper Triassic Yanchang Formation in the Ordos Basin is unknown. This paper analyzes the organic matter transformation ratio, hydrocarbon expulsion efficiency and roof/floor sealing conditions of the Chang 7_(3) shale, and evaluates the major enrichment type of shale oil in this interval. The average organic matter transformation ratio of the Chang 7_(3) shale is about 45%; in other words, more than 50% of the organic matter has not transformed to hydrocarbons, and the lower the maturity, the greater the proportion of untransformed organic matter. The cumulative hydrocarbon expulsion efficiency of the transformed hydrocarbon is 27.5% on average, and the total proportion of untransformed organic matter plus retained hydrocarbons is greater than 70%. The relative hydrocarbon expulsion efficiency of the Chang 7_(3) shale is 60% on average; that is, about 40% of hydrocarbons are retained in the shale. The Chang 7_(3) shale has the Chang 7_(1+2) and Chang 8 sandstones as its roof and floor, respectively, and is further overlain by the Chang 6 shale, where extensive low porosity and low permeability–tight oil reservoirs have formed in the parts with relatively good porosity and permeability. Moreover, the Chang 7_(3) shale is tested to be in a negative pressure system (pressure coefficient of 0.80–0.85). Therefore, the roof/floor sealing conditions of the Chang 7_(3) shale are poor. The retained hydrocarbons appear mostly in an absorbed state, with low mobility. It is concluded that medium–high mature shale oil is not the major enrichment type of shale oil in the Chang 7_(3) shale, but there may be enrichment opportunities for shale oil with good mobility in areas where the sealing conditions are good, without faults and fractures, and oil reservoirs are formed off Chang 7_(1+2), Chang 6 and Chang 8. Furthermore, low–medium mature shale oil is believed to have great potential and is the major enrichment type of shale oil in the Chang 7_(3) shale. It is recommended to prepare relevant in-situ conversion technologies by pilot test and to determine the resource availability and distribution.
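The percentages quoted above fit together under one reading, sketched here as simple arithmetic. The assumption is that the cumulative expulsion efficiency is measured against the original organic matter, so it is roughly the transformation ratio times the relative expulsion efficiency; this interpretation and the variable names are not stated in the abstract, and the inputs are the rounded averages it reports.

```python
transformation_ratio = 0.45   # average share of organic matter converted (from the abstract)
relative_expulsion = 0.60     # average share of generated hydrocarbons expelled (from the abstract)

# Assumed reading: cumulative efficiency is expressed relative to the original organic matter.
cumulative_expulsion = transformation_ratio * relative_expulsion

untransformed = 1.0 - transformation_ratio                      # organic matter not yet converted
retained = transformation_ratio * (1.0 - relative_expulsion)    # generated but still in the shale
remaining_in_shale = untransformed + retained

print(f"cumulative expulsion: {cumulative_expulsion:.1%}")
print(f"untransformed + retained: {remaining_in_shale:.1%}")
```

With these rounded inputs the cumulative expulsion comes out close to the reported 27.5%, and the untransformed plus retained share stays above 70%, in line with the figures in the abstract.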
Abstract: Transmission of data over the internet has become a critical issue as a result of the advancement in technology, since it is possible for pirates to steal the intellectual property of content owners. This paper presents a new digital watermarking scheme that combines some operators of the Genetic Algorithm (GA) and the Residue Number System (RNS) to perform encryption on an image, which is embedded into a cover image for the purposes of watermarking. Thus, the image watermarking scheme uses an encrypted image. The secret image is embedded in decomposed frames of the cover image obtained by applying a three-level Discrete Wavelet Transform (DWT). This ensures that the secret information is not exposed even when there is a successful attack on the cover information. Content creators can prove ownership of the multimedia content by unveiling the secret information in a court of law. The proposed scheme was tested with sample data using MATLAB 2022, and the simulation results show a great deal of imperceptibility and robustness compared to similar existing schemes.
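To make the RNS part concrete, the sketch below encodes an 8-bit pixel value as residues and recovers it with the Chinese Remainder Theorem. The moduli set (7, 8, 9) is a hypothetical choice made only because it is pairwise coprime and its product, 504, covers the range 0 to 255; the abstract does not say which moduli, GA operators, or embedding rule the scheme uses.

```python
MODULI = (7, 8, 9)  # hypothetical pairwise-coprime moduli; product 504 covers 0..255


def to_residues(value, moduli=MODULI):
    """Forward conversion: represent an integer by its residues."""
    return tuple(value % m for m in moduli)


def from_residues(residues, moduli=MODULI):
    """Reverse conversion via the Chinese Remainder Theorem."""
    product = 1
    for m in moduli:
        product *= m
    total = 0
    for r, m in zip(residues, moduli):
        partial = product // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        total += r * partial * pow(partial, -1, m)
    return total % product


pixel = 200
encoded = to_residues(pixel)
decoded = from_residues(encoded)
```

In a scheme like the one described, the residues (possibly shuffled by GA-style operators) would be what gets embedded into the DWT sub-bands; that embedding step is not shown.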