5 articles found
Runtime Power Allocation Based on Multi-GPU Utilization in GAMESS
1
Authors: Masha Sosonkina, Vaibhav Sundriyal, Jorge Luis Galvez Vallejo. Journal of Computer and Communications, 2022, No. 9, pp. 66-80 (15 pages)
To improve the power consumption of parallel applications at runtime, modern processors provide frequency scaling and power limiting capabilities. In this work, a runtime strategy is proposed to maximize performance under a given power budget by distributing the available power according to the relative GPU utilization. Time series forecasting methods were used to develop workload prediction models that provide accurate prediction of GPU utilization during application execution. Experiments were performed on a DGX-1 multi-GPU computing platform equipped with eight NVIDIA V100 GPUs used for quantum chemistry calculations in the GAMESS package. For a limited power budget, the proposed strategy may deliver as much as a hundred times better GAMESS performance than that obtained when the power is distributed equally among all the GPUs.
Keywords: Time Series Forecasting; ARIMA; Power Allocation; Performance Modeling; GAMESS; GPU Utilization
Core and Uncore Joint Frequency Scaling Strategy (Cited by: 1)
2
Authors: Vaibhav Sundriyal, Masha Sosonkina, Bryce Westheimer, Mark Gordon. Journal of Computer and Communications, 2018, No. 12, pp. 184-201 (18 pages)
Energy-proportional computing is one of the foremost constraints in the design of next-generation exascale systems. These systems must have a very high FLOP-per-watt ratio to be sustainable, which requires tremendous improvements in power efficiency for modern computing systems. This paper focuses on the processor, still the biggest contributor to power usage, by considering both its core and uncore power subsystems. The uncore describes those processor functions that are not handled by the core, such as the L3 cache and on-chip interconnect, and contributes significantly to the total system power. The uncore frequency scaling (UFS) capability has been available to the user since the Intel Haswell processor generation. In this paper, performance and power models are proposed that use both UFS and dynamic voltage and frequency scaling (DVFS) to reduce the energy consumption of parallel applications. These models are then incorporated into a runtime strategy that performs processor frequency scaling during parallel application execution. The strategy can be implemented at the kernel/firmware level, which makes it suitable for improving the energy efficiency of exascale designs. Experiments on a 20-core Haswell-EP machine using the quantum chemistry application GAMESS and the NAS benchmarks resulted in up to 24% energy savings with as little as 2% performance loss.
Keywords: Uncore Frequency Scaling (UFS); Dynamic Voltage and Frequency Scaling (DVFS); Power; GAMESS; Energy Savings; NAS Benchmarks
Runtime Energy Savings Based on Machine Learning Models for Multicore Applications (Cited by: 1)
3
Authors: Vaibhav Sundriyal, Masha Sosonkina. Journal of Computer and Communications, 2022, No. 6, pp. 63-80 (18 pages)
To improve the power consumption of parallel applications at runtime, modern processors provide frequency scaling and power limiting capabilities. In this work, a runtime strategy is proposed to maximize energy savings under a given performance degradation. Machine learning techniques were used to develop performance models that accurately predict performance as the operating core and uncore frequencies change. Experiments performed on a node (28 cores) of a modern computing platform showed significant energy savings of as much as 26% with performance degradation of as low as 5% under the proposed strategy, compared with execution in the unlimited power case.
Keywords: Machine Learning; RAPL; DVFS; Uncore Frequency Scaling; Energy Savings; Performance Modeling
Maximizing Performance under a Power Constraint on Modern Multicore Systems
4
Authors: Vaibhav Sundriyal, Masha Sosonkina, Bryce Westheimer, Mark S. Gordon. Journal of Computer and Communications, 2019, No. 7, pp. 252-266 (15 pages)
Energy efficiency and energy-proportional computing have become a central focus in modern supercomputers. These supercomputers should provide high throughput per unit of power to be sustainable in terms of operating cost and failure rates. In this paper, a power-bounded strategy is proposed that maximizes parallel application performance under a given power constraint. The strategy dynamically allocates power to the core, uncore, and memory power domains within a node to maximize performance under a given power budget. Experiments on a 20-core Haswell-EP platform with the real-world parallel application GAMESS demonstrate that the proposed strategy delivers performance within 4% of the best possible performance for as much as a 25% reduction in the minimum power budget required for maximum performance.
Keywords: Uncore Frequency Scaling (UFS); Dynamic Voltage and Frequency Scaling (DVFS); Power Budget; GAMESS
Distributed Strategy for Power Re-Allocation in High Performance Applications
5
Authors: Vaibhav Sundriyal, Masha Sosonkina. Journal of Computer and Communications, 2020, No. 12, pp. 142-158 (17 pages)
To improve the power consumption of parallel applications at runtime, modern processors provide frequency scaling and power limiting capabilities. In this work, a runtime strategy is proposed to distribute a given power allocation among the cluster nodes assigned to the application while balancing their performance change. The strategy operates in a timeslice-based manner: it estimates the current application performance and power usage per node, then redistributes power across the nodes. Experiments performed on four nodes (112 cores) of a modern computing platform interconnected with InfiniBand showed that even a significant power budget reduction of 20% may result in a performance degradation of as low as 1% under the proposed strategy, compared with execution in the unlimited power case.
Keywords: Multinode Power Allocation; RAPL; UFS; DVFS; Maximizing Performance; Component Power