996 articles found
MicroMagnetic.jl:A Julia package for micromagnetic and atomistic simulations with GPU support
1
Authors: Weiwei Wang, Boyao Lyu, Lingyao Kong, Hans Fangohr, Haifeng Du. Chinese Physics B (SCIE, EI, CAS, CSCD), 2024, No. 10, pp. 70-79, 10 pages
MicroMagnetic.jl is an open-source Julia package for micromagnetic and atomistic simulations. Using the features of the Julia programming language, MicroMagnetic.jl supports CPU and various GPU platforms, including NVIDIA, AMD, Intel, and Apple GPUs. Moreover, MicroMagnetic.jl supports Monte Carlo simulations for atomistic models and implements the nudged-elastic-band method for energy barrier computations. With built-in support for double and single precision modes and a design allowing easy extensibility to add new features, MicroMagnetic.jl provides a versatile toolset for researchers in micromagnetics and atomistic simulations.
Keywords: micromagnetic simulations; atomistic simulations; graphics processing units
Electromagnetic scattering and imaging simulation of extremely large-scale sea-ship scene based on GPU parallel technology
2
Authors: Cheng-Wei Zhang, Zhi-Qin Zhao, Wei Yang, Li-Lai Zhou, Hai-Yu Zhu. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2024, No. 2, pp. 16-23, 8 pages
To address the bottleneck of electromagnetic scattering simulation for extremely large-scale sea-ship scenes, a high-frequency method using graphics processing unit (GPU) parallel acceleration is proposed. To implement the physical optics (PO), shooting and bouncing ray (SBR), and physical theory of diffraction (PTD) methods, a CPU-GPU parallel computing scheme is realized to balance computing tasks. Finally, a multi-GPU framework is proposed to handle the computational difficulty caused by the massive number of ray tubes in the ray tracing process. Using the established simulation platform, signals of ships under different sea conditions are simulated and the corresponding images are obtained. The results show that higher sea states degrade the averaged peak signal-to-noise ratio (PSNR) of the radar image.
Keywords: Multi graphics processing unit; Radar imaging; Sea-ship; Shooting and bouncing rays
Compute Unified Device Architecture Implementation of Euler/Navier-Stokes Solver on Graphics Processing Unit Desktop Platform for 2-D Compressible Flows
3
Authors: Zhang Jiale, Chen Hongquan. Transactions of Nanjing University of Aeronautics and Astronautics (EI, CSCD), 2016, No. 5, pp. 536-545, 10 pages
A personal desktop platform with teraflops peak performance and thousands of cores is realized at the price of a conventional workstation using programmable graphics processing units (GPUs). A GPU-based parallel Euler/Navier-Stokes solver is developed for 2-D compressible flows using NVIDIA's Compute Unified Device Architecture (CUDA) programming model in the CUDA Fortran programming language. The implementation techniques for CUDA kernels, the double-layered thread hierarchy, and the varied memory hierarchy are presented to form the GPU-based algorithm for the Euler/Navier-Stokes equations. The resulting parallel solver is validated by a set of typical test flow cases. The numerical results show that a speedup of dozens of times relative to a serial CPU implementation can be achieved using a single GPU desktop platform, which demonstrates that a GPU desktop can serve as a cost-effective parallel computing platform to accelerate computational fluid dynamics (CFD) simulations substantially.
Keywords: graphics processing unit (GPU); GPU parallel computing; compute unified device architecture (CUDA) Fortran; finite volume method (FVM); acceleration
Simulation of fluid-structure interaction in a microchannel using the lattice Boltzmann method and size-dependent beam element on a graphics processing unit
4
Authors: Vahid Esfahanian, Esmaeil Dehdashti, Amir Mehdi Dehrouye-Semnani. Chinese Physics B (SCIE, EI, CAS, CSCD), 2014, No. 8, pp. 389-395, 7 pages
Fluid-structure interaction (FSI) problems in microchannels play a prominent role in many engineering applications. The present study is an effort toward the simulation of flow in a microchannel considering FSI. The bottom boundary of the microchannel is simulated by size-dependent beam elements for the finite element method (FEM) based on a modified couple stress theory. The lattice Boltzmann method (LBM) using the D2Q13 LB model is coupled to the FEM in order to solve the fluid part of the FSI problem. Because the LBM generally needs only nearest-neighbor information, the algorithm is an ideal candidate for parallel computing. The simulations are carried out on graphics processing units (GPUs) using compute unified device architecture (CUDA). In the present study, the governing equations are non-dimensionalized and the set of dimensionless groups is presented to show their effects on micro-beam displacement. The numerical results show that the displacements of the micro-beam predicted by the size-dependent beam element are smaller than those predicted by the classical beam element.
Keywords: fluid-structure interaction; graphics processing unit; lattice Boltzmann method; size-dependent beam element
PHUI-GA: GPU-based efficiency evolutionary algorithm for mining high utility itemsets
5
Authors: JIANG Haipeng, WU Guoqing, SUN Mengdan, LI Feng, SUN Yunfei, FANG Wei. Journal of Systems Engineering and Electronics (SCIE, CSCD), 2024, No. 4, pp. 965-975, 11 pages
Evolutionary algorithms (EAs) have been used in high utility itemset mining (HUIM) to address the problem of discovering high utility itemsets (HUIs) in the exponential search space. EAs have good running and mining performance, but they still require huge computational resources and may miss many HUIs. Because EAs and graphics processing units (GPUs) combine well, we propose a parallel genetic algorithm (GA) based on the GPU platform for HUIM (PHUI-GA). The evolution steps with improvements are performed on the central processing unit (CPU), and the CPU-intensive steps are sent to the GPU to be evaluated with multi-threaded processors. Experiments show that the mining performance of PHUI-GA outperforms the existing EAs. When mining 90% of HUIs, PHUI-GA is up to 188 times better than the existing EAs and up to 36 times better than the CPU parallel approach.
Keywords: high utility itemset mining (HUIM); graphics processing unit (GPU) parallel; genetic algorithm (GA); mining performance
Multi-relaxation-time lattice Boltzmann simulations of lid driven flows using graphics processing unit
6
Authors: Chenggong LI, J.P.Y. MAA. Applied Mathematics and Mechanics (English Edition) (SCIE, EI, CSCD), 2017, No. 5, pp. 707-722, 16 pages
Large eddy simulation (LES) using the Smagorinsky eddy viscosity model is added to the two-dimensional nine velocity components (D2Q9) lattice Boltzmann equation (LBE) with multi-relaxation-time (MRT) to simulate incompressible turbulent cavity flows with Reynolds numbers up to 1 × 10^7. To improve the computational efficiency of the LBM in numerical simulations of turbulent flows, the massively parallel computing power of a graphic processing unit (GPU) with a computing unified device architecture (CUDA) is introduced into the MRT-LBE-LES model. The model performs well compared with results from others, with a 76-fold increase in computational efficiency. It appears that the higher the Reynolds number is, the smaller the Smagorinsky constant should be if the lattice number is fixed. Also, for a selected high Reynolds number and a selected proper Smagorinsky constant, there is a minimum requirement for the lattice number so that the Smagorinsky eddy viscosity will not be excessively large.
Keywords: large eddy simulation (LES); multi-relaxation-time (MRT); lattice Boltzmann equation (LBE); two-dimensional nine velocity components (D2Q9); Smagorinsky model; graphic processing unit (GPU); computing unified device architecture (CUDA)
Research on accelerating the back-projection FFBP algorithm based on NVIDIA GPU
7
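Authors: 潘丰, 高伟, 罗俊, 刘文冬, 周春元, 张慧. 《电子测量技术》 (PKU Core), 2023, No. 22, pp. 148-152, 5 pages
The back-projection (BP) algorithm makes no approximations during imaging computation, yields high image quality, and suits any array configuration. In recent years it has been widely used in radar imaging. In millimeter-wave three-dimensional holographic imaging, however, its computational efficiency is low, which hinders real-time imaging. In three-dimensional polar coordinates, the fast factorized back-projection (FFBP) algorithm forms images by sub-aperture partitioning, which addresses the real-time imaging problem to some extent. This paper implements the FFBP algorithm on a four-thread CPU and on a GPU-accelerated CUDA platform and compares the imaging of multiple point targets; the results are essentially identical, which verifies the validity of the accelerated algorithm. Furthermore, a resolution test board is modeled and simulated in electromagnetic simulation software to emulate a real target, and GPU-accelerated imaging is performed. The computation is 33.97 times faster than on the four-thread CPU, making the approach suitable for three-dimensional near-field real-time imaging systems and better suited to human-body security screening.
Keywords: three-dimensional polar coordinate system; FFBP algorithm; graphics processing unit (GPU); sub-aperture partitioning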
Complex hexagonal close-packed dendritic growth during alloy solidification by graphics processing unit-accelerated three-dimensional phase-field simulations: demo for Mg–Gd alloy
8
Authors: Sheng-Lan Yang, Jing Zhong, Kai Wang, Xun Kang, Jian-Bao Gao, Jiong Wang, Qian Li, Li-Jun Zhang. Rare Metals (SCIE, EI, CAS, CSCD), 2023, No. 10, pp. 3468-3484, 17 pages
In this study, insights into the effect of interfacial anisotropy on complex hexagonal close-packed (hcp) dendritic growth during alloy solidification were gained by graphics processing unit (GPU)-accelerated three-dimensional (3D) phase-field simulations, as demonstrated for a Mg-Gd alloy. An anisotropic phase-field model with finite interface dissipation was developed by incorporating the contribution of the anisotropy of interfacial energy into the total free energy functional. The modified spherical harmonic anisotropy function was then chosen for the hcp crystal. The GPU parallel computing algorithm was implemented in the present phase-field model, and a corresponding code was developed on the compute unified device architecture parallel computing platform. Benchmark tests indicated that the calculation efficiency of a single TESLA V100 GPU could be ~80 times that of open multi-processing (OpenMP) with eight central processing unit cores. By coupling the phase-field model with reliable thermodynamic and interfacial energy descriptions, the 3D phase-field simulation of α-Mg dendritic growth in the Mg-6Gd (in wt%) alloy during solidification was performed. Various two-dimensional dendrite morphologies were revealed by cutting the simulated 3D dendrite along different crystallographic planes. Typical sixfold equiaxed and butterflied microstructures observed in experiments were well reproduced.
Keywords: Interfacial anisotropy; Dendrite solidification; Phase-field model; graphics processing unit (GPU); Mg–Gd
CUDA design and implementation of a real-time airborne SAR imaging algorithm based on NVIDIA GPU (cited by: 17)
9
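Authors: 孟大地, 胡玉新, 石涛, 孙蕊, 李晓波. 《雷达学报(中英文)》 (CSCD), 2013, No. 4, pp. 481-491, 11 pages
Synthetic aperture radar (SAR) imaging is computationally intensive and generally takes a long time on central processing unit (CPU)-based workstations or servers, so it cannot meet real-time requirements. Using the Compute Unified Device Architecture (CUDA) programming framework, this paper proposes a graphics processing unit (GPU)-based implementation of SAR imaging. The scheme solves the problem of overlapping data processing with transfers between host memory and GPU memory when GPU memory cannot hold one full scene of SAR data, supports parallel processing on multiple GPU devices, and makes full use of the GPU's computing resources. Tests on an NVIDIA K20C and an Intel E5645 show that, compared with traditional GPU-based SAR imaging algorithms, the scheme achieves a speedup of tens of times, significantly reduces the power consumption of the processing equipment, improves its portability, and reaches a real-time processing rate of about 36 megasamples per second.
Keywords: SAR; real-time imaging; graphics processing unit (GPU); Compute Unified Device Architecture (CUDA)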
A new approach for real time object detection and tracking on high resolution and multi-camera surveillance videos using GPU (cited by: 4)
10
Authors: Mohammad Farukh Hashmi, Ritu Pal, Rajat Saxena, Avinash G. Keskar. Journal of Central South University (SCIE, EI, CAS, CSCD), 2016, No. 1, pp. 130-144, 15 pages
High resolution cameras and multi-camera systems are being used in areas of video surveillance such as security of public places, traffic monitoring, and military and satellite imaging. This leads to a demand for computational algorithms for real-time processing of high resolution videos. Motion detection and background separation play a vital role in capturing the object of interest in surveillance videos, but as we move towards high resolution cameras, the time complexity of the algorithm increases and it can no longer be part of real-time systems. Parallel architecture provides a superior platform to work efficiently with complex algorithmic solutions. In this work, a method is proposed for accurately identifying the moving objects in videos using adaptive background making, motion detection and object estimation. The pre-processing part includes an adaptive block background making model and a dynamically adaptive thresholding technique to estimate the moving objects. The post-processing includes a competent parallel connected component labelling algorithm to accurately estimate the objects of interest. New parallel processing strategies are developed for each stage of the algorithm to reduce the time complexity of the system. This algorithm achieved an average speedup of 12.26 times for lower resolution video frames (320×240, 720×480, 1024×768) and 7.30 times for higher resolution video frames (1360×768, 1920×1080, 2560×1440) on the GPU relative to CPU processing. The algorithm was also tested by changing the number of threads in a thread block, and the minimum execution time was achieved for a 16×16 thread block. It was further tested on a night sequence where the amount of light in the scene is very low, and it still gave a significant speedup and accuracy in determining the object.
Keywords: central processing unit (CPU); graphics processing unit (GPU); morphology; connected component labelling (CCL)
A GPU-Based Parallel Algorithm for 2D Large Deformation Contact Problems Using the Finite Particle Method (cited by: 1)
11
Authors: Wei Wang, Yanfeng Zheng, Jingzhe Tang, Chao Yang, Yaozhi Luo. Computer Modeling in Engineering & Sciences (SCIE, EI), 2021, No. 11, pp. 595-626, 32 pages
Large deformation contact problems generally involve highly nonlinear behaviors, which are very time-consuming to solve and may lead to convergence issues. The finite particle method (FPM) effectively separates pure deformation from total motion in large deformation problems. In addition, the decoupled procedures of the FPM make it suitable for parallel computing, which may provide an approach to solve time-consuming problems. In this study, a graphics processing unit (GPU)-based parallel algorithm is proposed for two-dimensional large deformation contact problems. The fundamentals of the FPM for planar solids are first briefly introduced, including the equations of motion of particles and the internal forces of quadrilateral elements. Subsequently, a linked-list data structure suitable for parallel processing is built, and parallel global and local search algorithms are presented for contact detection. The contact forces are then derived and directly exerted on particles. The proposed method is implemented with the main solution procedures executed in parallel on a GPU. Two verification problems comprising large deformation frictional contacts are presented, and the accuracy of the proposed algorithm is validated. Furthermore, the algorithm's performance is investigated via a large-scale contact problem, and the maximum speedups of total computational time and contact calculation reach 28.5 and 77.4, respectively, relative to the commercial finite element software Abaqus/Explicit running on a single-core central processing unit (CPU). The contact calculation accounts for only 18% of the total calculation time with the FPM, much smaller than the 50% with Abaqus/Explicit, demonstrating the efficiency of the proposed method.
Keywords: Finite particle method; graphics processing unit (GPU); parallel computing; contact algorithm; LARGE
A survey and measurement study of GPU DVFS on energy conservation (cited by: 2)
12
Authors: Xinxin Mei, Qiang Wang, Xiaowen Chu. Digital Communications and Networks (SCIE), 2017, No. 2, pp. 89-100, 12 pages
Energy efficiency has become one of the top design criteria for current computing systems. Dynamic Voltage and Frequency Scaling (DVFS) has been widely adopted by laptop computers, servers, and mobile devices to conserve energy, while GPU DVFS is still at an early stage. This paper aims at exploring the impact of GPU DVFS on application performance and power consumption, and furthermore, on energy conservation. We survey the state-of-the-art GPU DVFS characterizations, and then summarize recent research works on GPU power and performance models. We also conduct real GPU DVFS experiments on NVIDIA Fermi and Maxwell GPUs. According to our experimental results, GPU DVFS has significant potential for energy saving. The effect of scaling core voltage/frequency and memory voltage/frequency depends not only on the GPU architectures, but also on the characteristics of GPU applications.
Keywords: graphics processing unit; Dynamic voltage and frequency scaling; Energy efficiency
NeuDATool: an Open Source Neutron Data Analysis Tools, Supporting GPU Hardware Acceleration, and across-Computer Cluster Nodes Parallel (cited by: 1)
13
Authors: Chang-li Ma, He Cheng, Tai-sen Zuo, Gui-sheng Jiao, Ze-hua Han, Hong Qin. Chinese Journal of Chemical Physics (SCIE, CAS, CSCD), 2020, No. 6, pp. 727-732, I0003, 7 pages
Empirical potential structure refinement (EPSR) is a neutron scattering data analysis algorithm and a software package. It was developed by the disordered materials group at the British spallation neutron source (ISIS) in the 1980s, and aims to construct the most probable atomic structures of disordered materials in the field of chemical physics. It has been extensively used during the past decades and has generated reliable results. However, it implements a shared-memory architecture with open multi-processing (OpenMP). With the extensive construction of supercomputer clusters and the widespread use of graphics processing unit (GPU) acceleration technology, it is now possible to rebuild the EPSR with these techniques in an effort to improve its calculation speed. In this study, an open source framework, NeuDATool, is proposed. It is programmed in the object-oriented language C++, can be parallelized across nodes within a computer cluster, and supports GPU acceleration. The performance of NeuDATool has been tested with water and amorphous silica neutron scattering data. The tests show that the software can reconstruct the correct microstructure of the samples, and that the calculation speed with GPU acceleration can increase by more than 400 times compared with the CPU serial algorithm for a simulation box of about 100 thousand atoms. NeuDATool provides another choice for implementing simulations in the (neutron) diffraction community, especially for experts who are familiar with C++ programming and want to define specific algorithms for their analysis.
Keywords: Neutron diffraction; Neutron scattering; Empirical potential structure refinement; graphics processing unit; C++
Accelerating finite difference wavefield-continuation depth migration by GPU
14
Authors: 刘国峰, 孟小红, 刘洪. Applied Geophysics (SCIE, CSCD), 2012, No. 1, pp. 41-48, 115, 9 pages
The most popular hardware used for parallel depth migration is the PC cluster, but its application is limited by its large space occupation and high power consumption. In this paper, we introduce a new hardware architecture, based on which finite difference (FD) wavefield-continuation depth migration can be conducted using the Graphics Processing Unit (GPU) as a CPU coprocessor. We describe the program module and three key optimization steps for implementing FD depth migration (memory, thread structure, and instruction optimizations) and consider methods for evaluating the amount of optimization. 2D and 3D models are used to test depth migration on the GPU. The test results show that the computational efficiency of depth migration increased greatly using the general-purpose GPU, by at least 25 times compared to the AMD 2.5 GHz CPU.
Keywords: Wavefield-continuation depth migration; finite difference; Graphic processing unit; efficiency
AN EFFICIENT GPU ACCELERATION FORMAT FOR FINITE ELEMENT ANALYSIS
15
Authors: Tian Jin, Li Gong, Fei Wu, Zeng Guohui. Journal of Electronics (China), 2013, No. 6, pp. 599-608, 10 pages
This paper proposes a new Graphics Processing Unit (GPU)-accelerated storage format to speed up Sparse Matrix Vector Products (SMVPs) for Finite Element Method (FEM) analysis of electromagnetic problems. The new format, called the Modified Compile Time Optimization (MCTO) format, is designed to reduce execution time and speed up the iterative solution of FEM equations, especially when rows have uneven lengths. The MCTO-applied FEM is about 10 times faster than conventional FEM on a CPU, and faster than other row-major ordering formats on a GPU. Numerical results show that the proposed GPU-accelerated storage format is an excellent accelerator.
Keywords: Finite Element Method (FEM); graphics processing unit (GPU); Parallelization strategy; Modified Compile Time Optimization (MCTO)
Design and implementation of multi-task scheduling on ARM GPU (cited by: 6)
16
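Authors: 丑文龙, 梅魁志, 高增辉, 李博良. 《西安交通大学学报》 (EI, CAS, CSCD, PKU Core), 2014, No. 12, pp. 87-92, 6 pages
To address the problem that existing GPU task scheduling systems cannot guarantee the response time of graphics tasks in multi-task environments, a scheduling scheme based on classification and multi-priority queues (CPMQ) is proposed and verified on an ARM embedded GPU. In this scheme, GPU tasks are divided into three classes (graphics tasks, general-purpose computing tasks, and real-time graphics tasks), each with its own queue: graphics tasks and general-purpose computing tasks are queued by priority within their respective queues, and real-time graphics tasks are queued by deadline. When scheduling across the queues, tasks are taken first from the real-time queue, and then from the graphics queue and the general-purpose computing queue according to a weighted fair algorithm. Experimental results show that, compared with the original scheduling system of the ARM GPU, CPMQ raises the frame rate of real-time graphics tasks by 5% to 20% without significantly increasing the execution time of general-purpose computing tasks or the scheduling overhead.
Keywords: graphics processing unit; multi-task; scheduling; queuing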
Study on the particle breakage of ballast based on a GPU accelerated discrete element method (cited by: 4)
17
Authors: Guang-Yu Liu, Wen-Jie Xu, Qi-Cheng Sun, Nicolin Govender. Geoscience Frontiers (SCIE, CAS, CSCD), 2020, No. 2, pp. 461-471, 11 pages
Particle breakage greatly influences the mechanical behavior of granular materials (GM), such as ballast, rockfill and sand, under external loads. The discrete element method (DEM) is one of the most popular methods for simulating GM, as each particle is represented individually. To study the mechanism of particle breakage, a cohesive contact model is developed based on the GPU-accelerated DEM code Blaze-DEM. A database of 3D geometry models of rock blocks is established based on the 3D scanning method. An agglomerate that describes each rock block with a series of non-overlapping spherical particles is used to build the DEM numerical model of a railway ballast sample, which is used in a DEM oedometric test to study the particle breakage characteristics of the sample under external load. Furthermore, to obtain the meso-mechanical parameters used in the DEM, a back-analysis method is applied based on laboratory tests of the rock sample. Based on the DEM numerical tests, the particle breakage process and mechanisms of the railway ballast are studied. All results show that the developed code is well suited to large-scale simulation of particle breakage in granular materials.
Keywords: Discrete element method (DEM); Particle breakage; Graphical processing unit (GPU); Railway ballast; Granular material (GM)
Real-time color holographic video reconstruction using multiple-graphics processing unit cluster acceleration and three spatial light modulators (cited by: 5)
18
Authors: Shohei Ikawa, Naoki Takada, Hiromitsu Araki, Hiroaki Niwase, Hiromi Sannomiya, Hirotaka Nakayama, Minoru Oikawa, Yuichiro Mori, Takashi Kakue, Tomoyoshi Shimobaba, Tomoyoshi Ito. Chinese Optics Letters (SCIE, EI, CAS, CSCD), 2020, No. 1, pp. 18-22, 5 pages
We demonstrate real-time three-dimensional (3D) color video using a color electroholographic system with a cluster of multiple-graphics processing units (multi-GPU) and three spatial light modulators (SLMs) corresponding respectively to red, green, and blue (RGB)-colored reconstructing lights. The multi-GPU cluster has a computer-generated hologram (CGH) display node containing a GPU, for displaying calculated CGHs on SLMs, and four CGH calculation nodes using 12 GPUs. The GPUs in the CGH calculation nodes generate CGHs corresponding to RGB reconstructing lights in a 3D color video using pipeline processing. Real-time color electroholography was realized for a 3D color object comprising approximately 21,000 points per color.
Keywords: color electroholography; real-time electroholography; multiple-graphics processing unit cluster; graphics processing unit
Real-time spatiotemporal division multiplexing electroholography for 1,200,000 object points using multiple-graphics processing unit cluster (cited by: 2)
19
Authors: Hiromi Sannomiya, Naoki Takada, Kohei Suzuki, Tomoya Sakaguchi, Hirotaka Nakayama, Minoru Oikawa, Yuichiro Mori, Takashi Kakue, Tomoyoshi Shimobaba, Tomoyoshi Ito. Chinese Optics Letters (SCIE, EI, CAS, CSCD), 2020, No. 7, pp. 28-32, 5 pages
The calculation of computer-generated holograms is computationally extremely expensive, and the image quality deteriorates when reconstructing three-dimensional (3D) holographic video from a point-cloud model comprising a huge number of object points. To solve these problems, we implement herein a spatiotemporal division multiplexing method on a cluster system with 13 GPUs connected by a gigabit Ethernet network. A performance evaluation indicates that the proposed method can realize a real-time holographic video of a 3D object comprising ~1,200,000 object points. These results demonstrate a clear 3D holographic video at 32.7 frames per second reconstructed from a 3D object comprising 1,064,462 object points.
Keywords: real-time electroholography; multiple-graphics processing unit cluster; graphics processing unit; spatiotemporal division multiplexing electroholography