430 articles found
An Efficient Tunnel Lining Crack Detection Model Based on DCNv2 and Transformer Decoder
1
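Authors: 孙己龙, 刘勇, 周黎伟, 路鑫, 侯小龙, 王亚琼, 王志丰. 《图学学报》 (Journal of Graphics), CSCD, PKU Core, 2024, No. 5, pp. 1050-1061 (12 pages)
To address the low recognition accuracy, slow detection speed, and large parameter counts of existing models, which stem from the random shapes, dense distribution, and low annotation-box resolution of lining cracks, the YOLOv8 framework is improved using deformable convolutional networks version 2 (DCNv2) and an end-to-end Transformer decoder, yielding DTD-YOLOv8, a detection model for lining cracks. First, DCNv2 is fused into C2f, the backbone convolutional module of YOLOv8, so that the model perceives crack deformation features accurately and quickly. At the same time, the YOLOv8 detection head is replaced with a Transformer decoder to realize the complete object detection pipeline within an end-to-end framework, thereby eliminating the computational cost introduced by the anchor-free processing mode. Seven detection models, SSD, Faster-RCNN, RT-DETR, YOLOv3, YOLOv5, YOLOv8, and DTD-YOLOv8, are compared on a self-built crack dataset. The results show that the improved model achieves an F1 score of 87.05% and an mAP@50 of 89.58%. Its F1 score is higher than those of the other six models by 14.16%, 7.68%, 1.55%, 41.36%, 8.20%, and 7.40%, respectively, and its mAP@50 is higher by 28.84%, 15.47%, 1.33%, 47.65%, 10.14%, and 10.84%, respectively. The improved model has only one third of the parameters of RT-DETR, detects a single image in 16.01 ms, and runs at 65.46 frames per second (FPS), faster than the other models compared. The model performs efficiently on crack detection tasks for tunnels in operation.
Keywords: tunnel engineering; object detection; deformable convolutional networks version 2 (DCNv2); Transformer decoder; lining cracks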
Parallel Implementation of the CCSDS Turbo Decoder on GPU
2
Authors: Liu Zhanxian, Liu Rongke, Zhang Haijun, Wang Ning, Sun Lei, Wang Jianquan. 《China Communications》, SCIE, CSCD, 2024, No. 10, pp. 70-77 (8 pages)
This paper presents a software turbo decoder on graphics processing units (GPUs). Unlike previous works, the proposed decoding architecture for turbo codes mainly focuses on the Consultative Committee for Space Data Systems (CCSDS) standard. However, the information frame lengths of the CCSDS turbo codes are not suitable for flexible sub-frame parallelism design. To mitigate this issue, we propose a padding method that inserts several bits before the information frame header. To obtain low-latency performance and high resource utilization, two-level intra-frame parallelism and an efficient data structure are considered. The presented Max-Log-MAP decoder can be adopted to decode the Long Term Evolution (LTE) turbo codes with only small modifications. The proposed CCSDS turbo decoder at 10 iterations on an NVIDIA RTX3070 achieves throughputs of about 150 Mbps and 50 Mbps for the code rates 1/6 and 1/2, respectively.
Keywords: CCSDS; CUDA; GPU; parallel decoding; turbo codes
Quantized Decoders that Maximize Mutual Information for Polar Codes
3
Authors: Zhu Hongfei, Cao Zhiwei, Zhao Yuping, Li Dou. 《China Communications》, SCIE, CSCD, 2024, No. 7, pp. 125-134 (10 pages)
In this paper, we associate the mutual information with the frame error rate (FER) performance and propose novel quantized decoders for polar codes. Based on the optimal quantizer of binary-input discrete memoryless channels (BDMCs), the proposed decoders quantize the virtual subchannels of polar codes to maximize mutual information (MMI) between source bits and quantized symbols. The nested structure of polar codes ensures that the MMI quantization can be implemented stage by stage. Simulation results show that the proposed MMI decoders with 4 quantization bits outperform the existing nonuniform quantized decoders that minimize mean-squared error (MMSE) with 4 quantization bits, and yield even better performance than uniform MMI quantized decoders with 5 quantization bits. Furthermore, the proposed 5-bit quantized MMI decoders approach the floating-point decoders with negligible performance loss.
Keywords: maximize mutual information; polar codes; quantization; successive cancellation decoding
Unifying Convolution and Transformer Decoder for Textile Fiber Identification
4
Authors: 许罗力, 李粉英, 常姗. 《Journal of Donghua University(English Edition)》, CAS, 2023, No. 4, pp. 357-363 (7 pages)
At present, convolutional neural networks (CNNs) and transformers surpass humans in many situations (such as face recognition and object classification), but do not work well in identifying fibers in textile surface images. Hence, this paper proposes an architecture named FiberCT, which takes advantage of the feature extraction capability of CNNs and the long-range modeling capability of transformer decoders to adaptively extract multiple types of fiber features. Firstly, the convolution module extracts fiber features from the input textile surface images. Secondly, these features are sent into the transformer decoder module, where label embeddings are compared with the features of each type of fiber through multi-head cross-attention and the desired features are pooled adaptively. Finally, an asymmetric loss further purifies the extracted fiber representations. Experiments show that FiberCT extracts the representations of various types of fibers more effectively and achieves higher fiber identification accuracy than state-of-the-art multi-label classification approaches.
Keywords: non-destructive textile fiber identification; transformer decoder; asymmetric loss
Area optimization of parallel Chien search architecture for Reed-Solomon (255,239) decoder (cited by: 1)
5
Authors: 胡庆生, 王志功, 张军, 肖洁. 《Journal of Southeast University(English Edition)》, EI, CAS, 2006, No. 1, pp. 5-10 (6 pages)
A global optimization algorithm (GOA) for the parallel Chien search circuit in a Reed-Solomon (RS) (255,239) decoder is presented. By finding the common modulo-2 additions within groups of Galois field (GF) multipliers and pre-computing the common items, the GOA can reduce the number of XOR gates efficiently and thus reduce the circuit area. Unlike other, local optimization algorithms, the GOA is a global one. When there is more than one maximum match at a time, the GOA chooses the pair with the smallest relational value instead of choosing a pair randomly, so that the chosen match has the least impact on the final result. The results show that the area of parallel Chien search circuits can be reduced by 51% compared to the direct implementation when the group-based GOA is used for GF multipliers, and by 26% when the GOA is applied to GF multipliers separately. This optimization scheme can be widely used in general parallel architectures in which many GF multipliers are involved.
Keywords: RS decoder; Chien search circuit; area optimization; Galois field multiplier
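
For readers unfamiliar with the operation being optimized, the following is a minimal software sketch of a Chien search over GF(2^8). It is not the circuit or the GOA from the abstract. The primitive polynomial 0x11d, the table-based multiplier, and the names `build_tables`, `gf_mul`, and `chien_search` are all assumptions made for illustration. Each iteration multiplies term j by the constant α^j, which is the role the constant GF multipliers play in a hardware implementation.

```python
def build_tables(prim=0x11D):
    """Build exp/log tables for GF(2^8); 0x11D is an assumed primitive polynomial."""
    exp = [0] * 510
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= prim
    # Duplicate the table so log[a] + log[b] never needs a modulo reduction.
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    """Multiply two GF(2^8) elements using the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def chien_search(locator):
    """Return every exponent i in 0..254 for which locator(alpha^i) == 0.

    `locator` lists polynomial coefficients, lowest degree first.
    """
    terms = list(locator)
    roots = []
    for i in range(255):
        acc = 0
        for t in terms:
            acc ^= t  # addition in GF(2^m) is XOR
        if acc == 0:
            roots.append(i)
        # Advance from alpha^i to alpha^(i+1): term j is scaled by alpha^j.
        terms = [gf_mul(t, EXP[j]) for j, t in enumerate(terms)]
    return roots
```

A polynomial built from two known roots, such as (x + α^5)(x + α^100), is a convenient sanity check for the search.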
A Total Dose Radiation Hardened PDSOI CMOS 3-Line to 8-Line Decoder
6
Authors: 刘梦新, 韩郑生, 李多力, 刘刚, 赵超荣, 赵发展. 《Journal of Semiconductors》, EI, CAS, CSCD, PKU Core, 2008, No. 6, pp. 1036-1039 (4 pages)
The first domestic total dose hardened 2 μm partially depleted silicon-on-insulator (PDSOI) CMOS 3-line to 8-line decoder fabricated in SIMOX is demonstrated. The radiation performance is characterized by transistor threshold voltage shifts, circuit static leakage currents, and I-V curves as a function of total dose up to 3 × 10^5 rad(Si). The worst-case threshold voltage shifts of the front channels are less than 20 mV for nMOS transistors at 3 × 10^5 rad(Si) and follow-up irradiation, and less than 70 mV for the pMOS transistors. Furthermore, no significant radiation-induced leakage currents or functional degeneration are observed.
Keywords: PDSOI; decoder; total dose; radiation
Modified Benes network architecture for WiMAX LDPC decoder (cited by: 1)
7
Authors: 徐勐, 吴建辉, 张萌. 《Journal of Southeast University(English Edition)》, EI, CAS, 2011, No. 2, pp. 140-143 (4 pages)
A modified Benes network is proposed as an optimal shuffle network for worldwide interoperability for microwave access (WiMAX) low density parity check (LDPC) decoders. When the size of the input is not a power of two, the modified Benes network can achieve optimal performance. This modified Benes network is non-blocking and can perform any permutation, so it can support the 19 modes specified in the WiMAX system. Furthermore, an efficient algorithm to generate the control signals for all the 2 × 2 switches in this network is derived, which reduces the hardware complexity and overall latency of the modified Benes network. Synthesis results show that the proposed control signal generator saves 25.4% of chip area and that the overall network latency is reduced by 36.2%.
Keywords: worldwide interoperability for microwave access (WiMAX); quasi-cyclic low density parity check (QC-LDPC); LDPC decoder; Benes network
Optimization of Path Metric Storage Space in the ACS Unit of a Viterbi Decoder
8
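Authors: 郭正伟, 赵勇. 《现代电子技术》 (Modern Electronics Technique), 2007, No. 17, pp. 71-73 (3 pages)
The design of the add-compare-select (ACS) unit and the storage of path metric (PM) values are among the most important parts of a hardware implementation of a Viterbi decoder. This paper introduces a hardware implementation of the ACS part of a rate-1/2 hard-decision Viterbi decoder. A new design and storage scheme, in-place computation with rotating addresses, is adopted, which greatly reduces the RAM space used to store path metric values during ACS computation. Extensive experiments show that the designed decoder has a considerable advantage in resource consumption.
Keywords: convolutional codes; Viterbi decoder; ACS unit; path metric; branch metric; survivor path; traceback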
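
As background for the entry above, here is a minimal sketch of one add-compare-select step for a rate-1/2 hard-decision decoder. It is a plain two-buffer version, which is the baseline whose path metric storage an in-place scheme tries to shrink. It is not the rotating-address method from the abstract. The constraint length 3, the generators (7, 5) in octal, and the names `branch` and `acs_step` are assumptions chosen for illustration.

```python
INF = float("inf")


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def branch(state, u):
    """Return (next_state, output_pair) for input bit u from a 2-bit state.

    Assumed code: constraint length 3, generators 7 and 5 (octal).
    """
    reg = (u << 2) | state
    out = (parity(reg & 0b111), parity(reg & 0b101))
    return reg >> 1, out


def acs_step(pm, received):
    """One ACS step: add branch metrics, compare, select survivors.

    pm       -- list of 4 path metrics, one per state
    received -- pair of hard-decision bits for this trellis step
    Returns (new_pm, decision), where decision[s] is the surviving predecessor of s.
    """
    new_pm = [INF] * 4
    decision = [0] * 4
    for state in range(4):
        for u in (0, 1):
            nxt, out = branch(state, u)
            # Hamming distance between expected and received bits.
            bm = (out[0] != received[0]) + (out[1] != received[1])
            cand = pm[state] + bm
            if cand < new_pm[nxt]:
                new_pm[nxt] = cand
                decision[nxt] = state
    return new_pm, decision
```

Note that this version allocates a second metric array (`new_pm`) on every step. Avoiding that second buffer is the motivation for in-place storage schemes.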
Design and implementation of an efficient SDRAM controller for HDTV decoder (cited by: 3)
9
Authors: 王晓辉, Zhao Yiqiang, Xie Xiaodong, Wu Di, Zhang Peng. 《High Technology Letters》, EI, CAS, 2007, No. 4, pp. 402-406 (5 pages)
A high-performance SDRAM controller for an HDTV decoder is designed. Macroblock-based (MB-based) address mapping, adaptive precharge, and command interleaving are adopted in this controller. MB-based address mapping reduces the precharge operations of the video processing unit in one access; adaptive precharge avoids unnecessary precharge operations; and command interleaving inserts the precharge and activate commands of the next access into the command sequence of the current access, thus reducing the no-operation (NOP) cycles. The combination of these three schemes effectively improves SDRAM performance. Compared with the precharge-all scheme, adaptive precharge and command interleaving reduce the SDRAM overhead cycles by 70% and increase SDRAM performance by up to 19.2% in the best case. This controller has been implemented in an AVS SoC running at 200 MHz.
Keywords: SDRAM controller; MB-based address mapping; adaptive-precharge; command interleaving; HDTV decoder
Low-loss belief propagation decoder with Tanner graph in quantum error-correction codes (cited by: 1)
10
Authors: Dan-Dan Yan, Xing-Kui Fan, Zhen-Yu Chen, Hong-Yang Ma. 《Chinese Physics B》, SCIE, EI, CAS, CSCD, 2022, No. 1, pp. 143-149 (7 pages)
Quantum error-correction codes are immeasurable resources for quantum computing and quantum communication. However, existing decoders are generally unable to handle duplicated check nodes in belief propagation (BP) on quantum low-density parity-check (QLDPC) codes. Based on probability theory in machine learning, mathematical statistics, and topological structure, a GF(4) (GF denotes the Galois field) augmented-model BP decoder with a Tanner graph is designed. This decoder solves the problem of repeated check nodes. In simulation, when the random perturbation strength is p = 0.0115-0.0116 and the number of attempts is N = 60-70, the augmented-model BP decoder reaches its highest decoding efficiency, and the low-loss frame error rate (FER) decreases to 7.1975×10^(-5). Hence, we design a novel augmented-model decoder to compare GF(2) and GF(4) for the quantum code [[450,200]] on the depolarization channel. The results verify that the proposed decoder has a wide application range and better decoding performance on QLDPC codes.
Keywords: Tanner graph; belief propagation decoder; augmented model; Fourier transform
Real-Time Implementation for Reduced-Complexity LDPC Decoder in Satellite Communication (cited by: 4)
11
Authors: WANG Yongqing, LIU Donglei, SUN Lida, WU Siliang. 《China Communications》, SCIE, CSCD, 2014, No. 12, pp. 94-104 (11 pages)
This paper proposes a real-time, reduced-complexity implementation of a low-density parity-check (LDPC) decoder for satellite communication on an FPGA platform. By adopting a (2048, 4096) irregular quasi-cyclic (QC) LDPC code, the proposed partly parallel decoding structure balances the complexity between the check node unit (CNU) and the variable node unit (VNU) based on the min-sum (MS) algorithm, thereby using fewer Slice resources and achieving superior clock performance. Moreover, because a lookup table (LUT) is used to locate the node messages stored in the time-shared memory unit, the structure is simple to reuse and saves a large amount of storage resources. The implementation results on a Xilinx FPGA chip show that, compared with the conventional structure, the proposed scheme achieves at least 28.6% and 8% cost reduction in RAM and Slice, respectively. The clock frequency is also increased to 280 MHz without deterioration of decoding performance or reduction in convergence speed.
Keywords: quasi-cyclic code; LDPC decoder; min-sum algorithm; partial parallel structure; lookup table
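
The min-sum algorithm named in the entry above can be illustrated with a minimal sketch of a single check node update. This is the textbook, unscaled min-sum rule, not the CNU architecture from the abstract. The function name `check_node_update` and the convention of treating zero as positive are assumptions made for illustration.

```python
def check_node_update(llrs):
    """Min-sum check node update.

    llrs -- incoming variable-to-check messages (log-likelihood ratios)
    Returns the outgoing check-to-variable messages. Message i uses the sign
    product and the minimum magnitude of all inputs except input i.
    """
    signs = [1 if v >= 0 else -1 for v in llrs]
    total_sign = 1
    for s in signs:
        total_sign *= s

    # Track the two smallest magnitudes so each "all but i" minimum is O(1).
    min1, min2, min_idx = float("inf"), float("inf"), -1
    for i, v in enumerate(llrs):
        m = abs(v)
        if m < min1:
            min1, min2, min_idx = m, min1, i
        elif m < min2:
            min2 = m

    out = []
    for i in range(len(llrs)):
        mag = min2 if i == min_idx else min1
        # total_sign * signs[i] removes input i's own sign from the product.
        out.append(total_sign * signs[i] * mag)
    return out
```

Keeping only the two smallest magnitudes, rather than recomputing a minimum per edge, is a common way to keep a check node unit small. Whether the cited design does this is not stated in the abstract.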
Radiation Tolerant Viterbi Decoders for On-Board Processing (OBP) in Satellite Communications (cited by: 1)
12
Authors: Zhen Gao, Lina Yan, Jinhua Zhu, Ruishi Han, Ullah Anees, Reviriego Pedro. 《China Communications》, SCIE, CSCD, 2020, No. 1, pp. 140-150 (11 pages)
Modern satellite communication systems require on-board processing (OBP) for performance improvements, and SRAM-FPGAs are an attractive option for OBP implementation. However, SRAM-FPGAs are sensitive to radiation effects, among which single event upsets (SEUs) are important as they can lead to data corruption and system failure. This paper studies the fault tolerance of an SRAM-FPGA-implemented Viterbi decoder to SEUs on the user memory. Analysis and fault injection experiments verify that over 97% of the SEUs on user memory do not lead to output errors. Selective protection schemes are then proposed to further improve the reliability of the decoder to SEUs on user memory with very small overhead. Although the results are obtained for a specific FPGA implementation, the developed reliability estimation model and the general conclusions still hold for other implementations.
Keywords: Viterbi decoder; on-board processing; FPGA; user memory; fault tolerance; single event upsets
Determination of quantum toric error correction code threshold using convolutional neural network decoders (cited by: 1)
13
Authors: Hao-Wen Wang, Yun-Jia Xue, Yu-Lin Ma, Nan Hua, Hong-Yang Ma. 《Chinese Physics B》, SCIE, EI, CAS, CSCD, 2022, No. 1, pp. 136-142 (7 pages)
Quantum error correction is an important technique for countering the noise generated during the operation of quantum computers. To find the best syndrome of the stabilizer code in quantum error correction, we need a decoder that is fast and close to the optimal threshold. In this work, we build a convolutional neural network (CNN) decoder to correct errors in the toric code, based on a systematic study of machine learning. We analyze and optimize various conditions that affect the CNN and use the ResNet network architecture to reduce the running time by 30%-40%, and we finally design an optimized algorithm for the CNN decoder. In this way, the threshold accuracy of the neural network decoder reaches 10.8%, which is closer to the optimal threshold of about 11%. This is a slight improvement on the previous threshold of 8.9%-10.3%, and there is no need to verify the basic noise.
Keywords: quantum error correction; toric code; convolutional neural network (CNN) decoder
Functional Verification Based on FPGA for AVS Video Decoder (cited by: 1)
14
Authors: FU Fang-fang, YI Oing-ming, SHI Min. 《Semiconductor Photonics and Technology》, CAS, 2009, No. 4, pp. 219-224 (6 pages)
In this paper, an FPGA verification method for application-specific integrated circuit (ASIC) design is introduced, based on the Xilinx xc5vlx220 field-programmable gate array (FPGA). Firstly, the basic principles of FPGA verification are introduced. Then, the structure of the FPGA board and the verification methods are illustrated. Finally, the workflow of FPGA verification for an audio video coding standard (AVS) decoder and the method of restoring images are introduced in detail. The FPGA resource occupancy is shown and analyzed. The result shows that the FPGA can verify the ASIC rapidly and effectively, shortening the development cycle.
Keywords: FPGA verification; AVS video decoder; MATLAB
Efficient VLSI architecture of CAVLC decoder with power optimized (cited by: 1)
15
Authors: 陈光化, 胡登基, 张金艺, 郑伟峰, 曾为民. 《Journal of Shanghai University(English Edition)》, CAS, 2009, No. 6, pp. 462-465 (4 pages)
This paper presents an efficient, power-optimized VLSI architecture of the context-based adaptive variable length code (CAVLC) decoder for the H.264/advanced video coding (AVC) standard. In the proposed design, a first-one detector exploits the regularity of the codewords to overcome the low efficiency and high power dissipation of the traditional table-searching method. Considering the relevance of the data used in decoding run_before, arithmetic operations are combined with a finite state machine (FSM), which achieves higher decoding efficiency. Following the CAVLC decoding flow, clock gating is employed at the module level and the register level, which reduces the overall dynamic power dissipation by 43%. The proposed design can decode every syntax element in one clock cycle. When synthesized at a clock constraint of 100 MHz, the design costs 11 300 gates in a 0.25 μm CMOS technology, which meets the demand of real-time decoding in the H.264/AVC standard.
Keywords: H.264/advanced video coding (AVC); context-based adaptive variable length code (CAVLC); decoder
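
The first-one detector mentioned in the entry above can be sketched in software as a leading-zero counter over a fixed-width bitstream window. This is an illustrative model only. The 16-bit window width and the function name `first_one_position` are assumptions, and the abstract does not describe the detector's actual interface.

```python
def first_one_position(window, width=16):
    """Return the number of leading zeros before the first 1 bit.

    window -- integer holding the next `width` bits of the bitstream,
              most significant bit first
    Returns `width` if the window contains no 1 bit.
    """
    for pos in range(width):
        if window & (1 << (width - 1 - pos)):
            return pos
    return width
```

For a unary-style prefix, where a run of zeros is terminated by a single 1, this count directly gives the prefix value without a table search.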
Low complexity suboptimal decode algorithms for quasi-orthogonal space time block codes
16
Authors: 李正权, 吴名, 沈连丰, 王志功, 贾子彦. 《Journal of Southeast University(English Edition)》, EI, CAS, 2016, No. 1, pp. 1-5 (5 pages)
Due to the high complexity of the pairwise decoding algorithm and the poor performance of the zero forcing (ZF)/minimum mean square error (MMSE) decoding algorithms, two low-complexity suboptimal decoding algorithms, called the pairwise-quasi-ZF and pairwise-quasi-MMSE decoders, are proposed. First, two transmit signals are detected by the quasi-ZF or the quasi-MMSE algorithm at the receiver. Then, the two detected signals are substituted as decoding results into the two pairwise decoding algorithm expressions to detect the other two transmit signals. The bit error rate (BER) performance of the proposed algorithms is compared with that of the currently known decoding algorithms. The numbers of calculations of the ZF, MMSE, quasi-ZF, and quasi-MMSE algorithms are also compared. Simulation results show that the BER performance of the proposed algorithms is substantially improved in comparison to the quasi-ZF and quasi-MMSE algorithms. The BER performance of the pairwise-quasi-ZF (pairwise-quasi-MMSE) decoder is equivalent to that of the pairwise-ZF (pairwise-MMSE) decoder, while the computational complexity is significantly reduced.
Keywords: quasi-orthogonal space-time block code (QOSTBC); low-complexity decoding; pairwise-quasi-ZF; pairwise-quasi-MMSE; bit error rate (BER)
Low-complexity MP3 decoder based on Broadcom embedded platform
17
Authors: 冉川, 沈庭芝. 《Journal of Beijing Institute of Technology》, EI, CAS, 2011, No. 1, pp. 94-99 (6 pages)
A low-complexity MP3 decoder based on the Broadcom embedded platform was proposed. C-code-level optimization algorithms for inverse quantization, stereo decoding, and alias reduction, developed on a PC, were proposed to further reduce memory usage and computational complexity. Furthermore, the executable file of the optimized MP3 decoder was generated under the Linux environment and ported to a set-top box based on the Broadcom embedded platform. Experimental results showed that the total decoding time was reduced on the embedded platform and that real-time, fluent playback of audio files was achieved, which demonstrated the effectiveness of the proposed MP3 decoder. The proposed MP3 decoder could be applied in fields such as set-top boxes based on the Broadcom embedded platform and other portable devices.
Keywords: MP3 decoder; algorithm optimization; Linux; Broadcom
Construction of Quasi-Cyclic Low-Density Parity-Check Codes for Simplifying Shuffle Networks in Layered Decoder
18
Authors: 张建军, 董明科, 王达, 金野, 项海格. 《China Communications》, SCIE, CSCD, 2013, No. 12, pp. 102-113 (12 pages)
Offset Shuffle Networks (OSNs) interleave a posteriori probability messages in the Block Row-Layered Decoder (BRLD) of Quasi-Cyclic Low-Density Parity-Check (QC-LDPC) codes. However, OSNs usually consume a significant amount of computational resources and limit the clock frequency, particularly when the size of the Circulant Permutation Matrix (CPM) is large. To simplify the architecture of the OSN, we propose a Simplified Offset Shuffle Network Block Progressive Edge-Growth (SOSN-BPEG) algorithm to construct a class of QC-LDPC codes. The SOSN-BPEG algorithm constrains the shift values of CPMs and the difference of the shift values in the same column by progressively appending check nodes. Simulation results indicate that the error performance of the SOSN-BPEG codes is the same as that of the codes in WiMAX and DVB-S2. The SOSN-BPEG codes can reduce the complexity of the OSNs by up to 54.3% and can improve the maximum frequency by up to 21.7% for various code lengths and rates.
Keywords: QC-LDPC codes; construction algorithm; PEG algorithm; row-layered decoder; shuffle network
Decode Company Achieves a New Breakthrough in Gene Research
19
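Authors: 王旭静, 窦道龙. 《生物技术通报》 (Biotechnology Bulletin), CAS, CSCD, 2002, No. 5, p. 48 (1 page)
Keywords: pharmacogenetics; disease-related genes; research progress; Decode company; gene research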
A hardware/software co-optimization approach for embedded software of MP3 decoder
20
Authors: ZHANG Wei, LIU Peng, ZHAI Zhi-bo. 《Journal of Zhejiang University-Science A(Applied Physics & Engineering)》, SCIE, EI, CAS, CSCD, 2007, No. 1, pp. 42-49 (8 pages)
In order to improve the efficiency of embedded software running on a processor core, this paper proposes a hardware/software co-optimization approach for embedded software from the system point of view. The proposed stepwise methods aim at exploiting the structure and the resources of the processor as much as possible for software algorithm optimization. To achieve low memory usage and a low frequency requirement for the same performance, this co-optimization approach was used to optimize the embedded software of an MP3 decoder based on a 16-bit fixed-point DSP core. After the optimization, decoding 128 kbps, 44.1 kHz stereo MP3 on the DSP evaluation platform needs 45.9 MIPS and 20.4 kbytes of memory space. The optimization rate reaches 65.6% for memory and 49.6% for frequency, compared with the results of compiling with floating-point computation. The experimental result indicates the availability of the hardware/software co-optimization approach depending on the algorithm and architecture.
Keywords: hardware/software co-optimization; DSP; embedded software; MP3 decoder