3 articles found
A TSE based design for MMSE and QRD of MIMO systems based on ASIP
1
Authors: 冯雪林, SHI Jinglin, CHEN Yang, FU Yanlu, ZHANG Qineng, XIAO Feng. High Technology Letters (EI, CAS), 2023, No. 2, pp. 166-173 (8 pages)
A Taylor series expansion (TSE) based design for minimum mean-square error (MMSE) and QR decomposition (QRD) of multi-input and multi-output (MIMO) systems is proposed on an application specific instruction set processor (ASIP). It uses the TSE algorithm instead of resource-consuming reciprocal and reciprocal square root (RSR) operations. The aim is to give a high-performance implementation of both MMSE and QRD on one programmable platform. Furthermore, the instruction set architecture (ISA) and the allocation of data paths in a single instruction multiple data, very long instruction word (SIMD-VLIW) architecture are provided, offering more data parallelism and instruction parallelism for different matrix dimensions and operation types. Meanwhile, multiple levels of numerical precision can be achieved with flexible table size and expansion order in the TSE ISA. The ASIP has been implemented in a 28 nm CMOS process, and its frequency reaches 800 MHz. Experimental results show that the proposed design provides perfect numerical precision within the fixed bit-width of the ASIP, a matrix processing rate exceeding the requirements of 5G systems, and rate-area efficiency comparable with ASIC implementations.
Keywords: multi-input and multi-output (MIMO); minimum mean-square error (MMSE); QR decomposition (QRD); Taylor series expansion (TSE); application specific instruction set processor (ASIP); instruction set architecture (ISA); single instruction multiple data (SIMD); very long instruction word (VLIW)
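The abstract above describes replacing reciprocal and reciprocal square root operations with a table lookup followed by a truncated Taylor series, with table size and expansion order as tunable precision knobs. The following is a minimal floating-point sketch of that general idea, not the paper's ISA or fixed-point data path. The function names (`make_tables`, `tse_reciprocal`, `tse_rsqrt`), the midpoint-based table layout, and the parameters `table_bits` and `order` are all assumptions chosen for illustration.

```python
import math

def make_tables(table_bits):
    """Build lookup tables over the mantissa range [0.5, 1).

    The range is split into 2**table_bits equal segments; for each segment
    midpoint x0 we store 1/x0 and 1/sqrt(x0). (Hypothetical layout.)
    """
    n = 1 << table_bits
    mids = [0.5 + (i + 0.5) / (2 * n) for i in range(n)]
    recip = [1.0 / x0 for x0 in mids]
    rsqrt = [1.0 / math.sqrt(x0) for x0 in mids]
    return mids, recip, rsqrt

def _lookup(x, tables):
    """Split x = m * 2**e and return (e, t, 1/x0, x0**-0.5), t = (m - x0) / x0."""
    if x <= 0:
        raise ValueError("x must be positive")
    mids, recip, rsqrt = tables
    m, e = math.frexp(x)                    # 0.5 <= m < 1
    idx = int((m - 0.5) * 2 * len(mids))    # segment index
    t = (m - mids[idx]) * recip[idx]        # small relative offset
    return e, t, recip[idx], rsqrt[idx]

def tse_reciprocal(x, tables, order=2):
    """Approximate 1/x as (1/x0) * sum_{k<=order} (-t)**k."""
    e, t, r0, _ = _lookup(x, tables)
    acc, term = 0.0, 1.0
    for _ in range(order + 1):
        acc += term
        term *= -t
    return math.ldexp(r0 * acc, -e)

def tse_rsqrt(x, tables, order=2):
    """Approximate x**-0.5 as x0**-0.5 * (1 - t/2 + 3t^2/8 - ...)."""
    e, t, _, s0 = _lookup(x, tables)
    acc, term = 0.0, 1.0
    for k in range(order + 1):
        acc += term
        term *= -t * (2 * k + 1) / (2 * k + 2)   # next binomial coefficient
    val = s0 * acc
    if e % 2:                     # odd exponent: fold in a factor 2**-0.5
        val *= math.sqrt(0.5)
    return math.ldexp(val, -(e // 2))
```

In this sketch, calling `make_tables(4)` and then `tse_reciprocal(7.5, tables)` gives an approximation whose error shrinks as either `table_bits` or `order` grows, which mirrors the table-size versus expansion-order trade-off the abstract mentions.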
HXPY: A High-Performance Data Processing Package for Financial Time-Series Data
2
Authors: 郭家栋, 彭靖姝, 苑航, 倪明选. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2023, No. 1, pp. 3-24 (22 pages)
A tremendous amount of data is generated by global financial markets every day, and such time-series data needs to be analyzed in real time to explore its potential value. In recent years, we have witnessed the successful adoption of machine learning models on financial data, where the importance of accuracy and timeliness demands highly effective computing frameworks. However, traditional financial time-series data processing frameworks have shown performance degradation and adaptation issues, such as outlier handling for stock suspensions in Pandas and TA-Lib. In this paper, we propose HXPY, a high-performance data processing package with a C++/Python interface for financial time-series data. HXPY supports miscellaneous acceleration techniques such as the streaming algorithm, the vectorization instruction set, and memory optimization, together with various functions such as time window functions, group operations, down-sampling operations, cross-section operations, row-wise or column-wise operations, shape transformations, and alignment functions. The results of benchmark and incremental analysis demonstrate the superior performance of HXPY compared with its counterparts. From MiBs to GiBs of data, HXPY significantly outperforms other in-memory dataframe computing rivals, by up to hundreds of times.
Keywords: dataframe; time-series data; SIMD (single instruction multiple data); CUDA (Compute Unified Device Architecture)
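The abstract above names streaming algorithms for time window functions and the handling of missing values caused by stock suspensions. As a rough illustration of that general technique only, the sketch below computes a rolling mean in one pass with constant work per element, skipping NaN entries. It does not use or reproduce the HXPY API; the function name `rolling_mean_stream` and the `min_periods` parameter are assumptions made for this example.

```python
import math
from collections import deque

def rolling_mean_stream(values, window, min_periods=1):
    """One-pass rolling mean over the last `window` slots, ignoring NaN.

    A running sum and a count of valid entries are updated as each value
    enters or leaves the window, so no window is ever re-scanned.
    """
    buf = deque()          # the slots currently inside the window
    total = 0.0            # sum of the non-NaN values in buf
    count = 0              # number of non-NaN values in buf
    out = []
    for v in values:
        buf.append(v)
        if not math.isnan(v):
            total += v
            count += 1
        if len(buf) > window:            # evict the oldest slot
            old = buf.popleft()
            if not math.isnan(old):
                total -= old
                count -= 1
        # Emit NaN when too few valid observations are available.
        ok = count >= max(min_periods, 1)
        out.append(total / count if ok else math.nan)
    return out
```

For example, a price series in which a suspended day is recorded as NaN still yields a defined mean for any window that contains at least `min_periods` valid prices. A running sum can accumulate rounding error on long series, so a production implementation would likely use a compensated update.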
Evaluating RISC-V Vector Instruction Set Architecture Extension with Computer Vision Workloads
3
Authors: 李若时, 彭平, 邵志远, 金海, 郑然. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2023, No. 4, pp. 807-820 (14 pages)
Computer vision (CV) algorithms are used extensively in a myriad of applications. As multimedia data are generally well-formatted and regular, it is beneficial to leverage the massive parallel processing power of the underlying platform to improve the performance of CV algorithms. Single Instruction Multiple Data (SIMD) instructions, capable of conducting the same operation on multiple data items in a single instruction, are extensively employed to improve the efficiency of CV algorithms. In this paper, we evaluate the power and effectiveness of the RISC-V vector extension (RV-V) on typical CV algorithms, such as Gray Scale, Mean Filter, and Edge Detection. Our examinations show that, compared with the baseline OpenCV implementation using scalar instructions, the equivalent implementations using RV-V (version 0.8) can reduce the instruction count of the same CV algorithm by up to 24x when processing the same input images. However, the actual performance improvement measured in cycle counts depends heavily on the specific implementation of the underlying RV-V co-processor. In our evaluation, using the vector co-processor (with eight execution lanes) of the Xuantie C906, the vector versions of the CV algorithms exhibit, on average, speedups of up to 2.98x over their scalar counterparts.
Keywords: RISC-V vector extension; single instruction multiple data (SIMD); computer vision; OpenCV
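The abstract above contrasts scalar and vector versions of kernels such as Gray Scale by instruction count. The sketch below is a toy Python model of why a strip-mined vector loop needs fewer loop iterations than a scalar loop; it is not RV-V code and does not reproduce the paper's measurements. The fixed-point weights (77, 150, 29), the function names, and `max_vl` (a hypothetical number of elements handled per vector iteration) are assumptions for illustration.

```python
def gray_scalar(pixels):
    """Convert (r, g, b) tuples to gray, one pixel per loop iteration."""
    out, iterations = [], 0
    for r, g, b in pixels:
        # Fixed-point approximation of 0.299 R + 0.587 G + 0.114 B.
        out.append((77 * r + 150 * g + 29 * b) >> 8)
        iterations += 1
    return out, iterations

def gray_strip_mined(pixels, max_vl=8):
    """Same conversion, processing up to max_vl pixels per loop iteration.

    Each pass picks vl = min(max_vl, remaining), which is roughly the role
    a vector-length-setting instruction plays in a vector ISA.
    """
    out, iterations = [], 0
    i, n = 0, len(pixels)
    while i < n:
        vl = min(max_vl, n - i)
        chunk = pixels[i:i + vl]
        # One "vector operation" applied to the whole chunk.
        out.extend((77 * r + 150 * g + 29 * b) >> 8 for r, g, b in chunk)
        i += vl
        iterations += 1
    return out, iterations
```

Both functions return identical gray values, while the iteration count of the strip-mined version drops by roughly a factor of `max_vl`. As the abstract notes, such a reduction in instruction count does not translate directly into cycle-count speedup, which depends on the vector unit's implementation.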