Hybrid Hardware/Software Floating-Point Implementations for Optimized Area and Throughput Tradeoffs

PROJECT TITLE:

Hybrid Hardware/Software Floating-Point Implementations for Optimized Area and Throughput Tradeoffs - 2017

ABSTRACT:

Hybrid floating-point (FP) implementations improve software FP performance without incurring the area overhead of full hardware FP units. The proposed implementations are synthesized in 65-nm CMOS and integrated into small fixed-point processors with a RISC-like architecture. Unsigned, shift carry, and leading zero detection (USL) support is added to a processor to augment an existing instruction set architecture and increase FP throughput with little area overhead. The hybrid implementations with USL support increase software FP throughput per core by 2.18× for addition/subtraction, 1.29× for multiplication, 3.07-4.05× for division, and 3.11-3.81× for square root, and use 90.7-94.6% less area than dedicated fused multiply-add (FMA) hardware. Hybrid implementations with custom FP-specific hardware increase throughput per core over a fixed-point software kernel by 3.69-7.28× for addition/subtraction, 1.22-2.03× for multiplication, 14.4× for division, and 31.9× for square root, and use 77.3-97.0% less area than dedicated FMA hardware. The circuit area and throughput are found for 38 multiply-add, 8 addition/subtraction, 6 multiplication, 45 division, and 45 square root designs. Thirty-three multiply-add implementations are presented, which improve throughput per core versus a fixed-point software implementation by 1.11-15.9× and use 38.2-95.3% less area than dedicated FMA hardware.
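
To illustrate why leading zero detection can raise software FP throughput, the sketch below contrasts a bit-by-bit leading-zero count, as a plain fixed-point instruction set might require, with a single-step count of the kind a leading zero detector could provide. It is not the authors' implementation: the 16-bit width and the names `clz_software`, `clz_hardware`, and `normalize` are assumptions made only for this example.

```python
WIDTH = 16  # assumed datapath width; the abstract does not specify one


def clz_software(x, width=WIDTH):
    """Count leading zeros one bit at a time.

    Returns (leading_zero_count, loop_iterations) so the cost of the
    software-only approach is visible.
    """
    mask = 1 << (width - 1)          # selects the most significant bit
    word = (1 << width) - 1          # keeps the value within `width` bits
    count = 0
    while count < width and not (x & mask):
        x = (x << 1) & word
        count += 1
    return count, count              # one iteration per leading zero


def clz_hardware(x, width=WIDTH):
    """Model a hypothetical single-instruction leading zero detector."""
    return width - x.bit_length()    # assumes 0 <= x < 2**width


def normalize(significand, exponent, width=WIDTH):
    """Left-align a nonzero significand and adjust the exponent to match."""
    if significand == 0:
        return 0, 0                  # zero has no leading one to align
    shift = clz_hardware(significand, width)
    aligned = (significand << shift) & ((1 << width) - 1)
    return aligned, exponent - shift
```

In this model, normalizing a significand with many leading zeros costs one loop iteration per zero in software, while the detector returns the shift amount in a single step. Normalization after addition or subtraction is one place where such a count is needed.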
