Vitis HLS Libraries Reference
- Arbitrary Precision Data Types Library: arbitrary precision data types allowing C code to use variables with smaller bit-widths for improved performance and area in hardware.
- Vitis HLS Math Library: used to specify standard math operations for synthesis into Xilinx devices.
- HLS Stream Library: for modeling and compiling streaming data structures.
- HLS IP Libraries: IP functions, including fast Fourier transform (FFT) and finite impulse response (FIR).
You can use each of the C libraries in your design by including the library header file in your code. These header files are located in the include directory in the Vitis HLS installation area.
Arbitrary Precision Data Types Library
C-based native data types are on 8-bit boundaries (8, 16, 32, 64 bits). RTL buses (corresponding to hardware) support arbitrary lengths. HLS needs a mechanism to allow the specification of arbitrary precision bit-width and not rely on the artificial boundaries of native C data types: if a 17-bit multiplier is required, you should not be forced to implement this with a 32-bit multiplier.
Vitis HLS provides both integer and fixed-point arbitrary precision data types for C++. The advantage of arbitrary precision data types is that they allow the C code to be updated to use variables with smaller bit-widths and then for the C simulation to be re-executed to validate that the functionality remains identical or acceptable.
Using Arbitrary Precision Data Types
Vitis HLS provides arbitrary precision integer data types that manage the value of the integer numbers within the boundaries of the specified width, as shown in the following table.
| Language | Integer Data Type | Required Header |
|---|---|---|
| C++ | ap_[u]int<W> (1024 bits). Can be extended to 32K bits wide. | #include "ap_int.h" |
| C++ | ap_[u]fixed<W,I,Q,O,N> | #include "ap_fixed.h" |
The header files that define the arbitrary precision types are also provided with Vitis HLS as a standalone package with the rights to use them in your own source code. The package, xilinx_hls_lib_<release_number>.tgz, is provided in the include directory in the Vitis HLS installation area.
Arbitrary Integer Precision Types with C++
The header file ap_int.h defines the arbitrary precision integer data types ap_[u]int for C++. To use arbitrary precision integer data types in a C++ function:
- Add header file ap_int.h to the source code.
- Change the bit types to ap_int<N> for signed types or ap_uint<N> for unsigned types, where N is a bit-size from 1 to 1024.
The following example shows how the header file is added and two variables are implemented to use 9-bit signed integer and 10-bit unsigned integer types:
#include "ap_int.h"
void foo_top() {
    ap_int<9> var1; // 9-bit signed
    ap_uint<10> var2; // 10-bit unsigned
}
Arbitrary Precision Fixed-Point Data Types
In Vitis HLS, it is important to use fixed-point data types, because the behavior of the C++ simulations performed using fixed-point data types matches that of the resulting hardware created by synthesis. This allows you to analyze the effects of bit-accuracy, quantization, and overflow with fast C-level simulation.
These data types manage the value of real (non-integer) numbers within the boundaries of a specified total width and integer width, as shown in the following figure.
Fixed-Point Identifier Summary
The following table provides a brief overview of operations supported by fixed-point types.
| Identifier | Description | |
|---|---|---|
| W | Word length in bits | |
| I | The number of bits used to represent the integer value (the number of bits above the decimal point) | |
| Q | Quantization mode: This dictates the behavior when greater precision is generated than can be defined by smallest fractional bit in the variable used to store the result. | |
| ap_fixed Types | Description | |
| AP_RND | Round to plus infinity | |
| AP_RND_ZERO | Round to zero | |
| AP_RND_MIN_INF | Round to minus infinity | |
| AP_RND_INF | Round to infinity | |
| AP_RND_CONV | Convergent rounding | |
| AP_TRN | Truncation to minus infinity (default) | |
| AP_TRN_ZERO | Truncation to zero | |
| O | Overflow mode: This dictates the behavior when the result of an operation exceeds the maximum (or minimum in the case of negative numbers) possible value that can be stored in the variable used to store the result. | |
| ap_fixed Types | Description | |
| AP_SAT | Saturation | |
| AP_SAT_ZERO | Saturation to zero | |
| AP_SAT_SYM | Symmetrical saturation | |
| AP_WRAP | Wrap around (default) | |
| AP_WRAP_SM | Sign magnitude wrap around | |
| N | This defines the number of saturation bits in overflow wrap modes. | |
Example Using ap_fixed
In this example, the Vitis HLS ap_fixed type is used to define an 18-bit variable with 6 bits representing the numbers above the decimal point and 12 bits representing the value below the decimal point. The variable is specified as signed, the quantization mode is set to round to plus infinity, and the default wrap-around mode is used for overflow.
#include <ap_fixed.h>
...
ap_fixed<18,6,AP_RND > my_type;
...
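To make the effect of the quantization and overflow modes more concrete, the following is a minimal sketch in plain C++ (it deliberately does not use ap_fixed.h) of the value a signed W-bit variable with I integer bits might hold under round-to-plus-infinity quantization and wrap-around overflow. The helper name quantize_rnd_wrap is hypothetical, and the sketch assumes that AP_RND can be modeled as adding half an LSB and rounding down.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of ap_fixed.h: models the value stored by a
// signed fixed-point variable with W total bits and I integer bits, assuming
// round-to-plus-infinity quantization and wrap-around overflow.
double quantize_rnd_wrap(double x, int W, int I) {
    assert(W >= 2 && W <= 32 && I >= 1 && I <= W);
    const int F = W - I;  // number of fractional bits

    // Quantization: scale to integer LSBs, add half an LSB, round down.
    const double scaled = std::floor(std::ldexp(x, F) + 0.5);

    // Overflow: wrap into the W-bit two's complement range.
    const double modulus = std::ldexp(1.0, W);
    double wrapped = std::fmod(scaled, modulus);
    if (wrapped < 0) wrapped += modulus;             // now in [0, 2^W)
    if (wrapped >= modulus / 2) wrapped -= modulus;  // now in [-2^(W-1), 2^(W-1))

    // Scale back to a real value.
    return std::ldexp(wrapped, -F);
}
```

With W = 18 and I = 6, as in the example above, a value of 32.0 lies just outside the representable range and wraps to -32.0 in this model, while half an LSB (2^-13) rounds up to one LSB (2^-12).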
C++ Arbitrary Precision Integer Types
The native data types in C++ are on 8-bit boundaries (8, 16, 32 and 64 bits). RTL signals and operations support arbitrary bit-lengths.
Vitis HLS provides arbitrary precision data types for C++ to allow variables and operations in the C++ code to be specified with any arbitrary bit-widths: 6-bit, 17-bit, 234-bit, up to 1024 bits.
The default maximum width of 1024 bits can be overridden by defining the macro AP_INT_MAX_W with a positive integer value less than or equal to 32768 before inclusion of the ap_int.h header file.
Arbitrary precision data types have two primary advantages over the native C++ types:
- Better quality hardware: If, for example, a 17-bit multiplier is required, arbitrary precision types can specify that exactly 17 bits are used in the calculation. Without arbitrary precision data types, such a multiplication (17-bit) must be implemented using 32-bit integer data types, which results in the multiplication being implemented with multiple DSP modules.
- Accurate C++ simulation/analysis: Arbitrary precision data types in the C++ code allow the C++ simulation to be performed using accurate bit-widths and to validate the functionality (and accuracy) of the algorithm before synthesis.
The arbitrary precision types in C++ have none of the disadvantages of those in C:
- C++ arbitrary precision types can be compiled with standard C++ compilers (there is no C++ equivalent of apcc).
- C++ arbitrary precision types do not suffer from Integer Promotion Issues.
It is not uncommon for users to change a file extension from .c to .cpp so the file can be compiled as C++, where neither of these issues is present.
For the C++ language, the header file ap_int.h defines the arbitrary precision integer data types ap_(u)int<W>. For example, ap_int<8> represents an 8-bit signed integer data type and ap_uint<234> represents a 234-bit unsigned integer type.
The ap_int.h file is located in the directory $HLS_ROOT/include, where $HLS_ROOT is the Vitis HLS installation directory.
The code shown in the following example is a repeat of the code shown in
the Basic Arithmetic example in Standard Types. In this example, the data types in the top-level
function to be synthesized are specified as dinA_t, dinB_t, and so on.
#include "cpp_ap_int_arith.h"
void cpp_ap_int_arith(dinA_t inA, dinB_t inB, dinC_t inC, dinD_t inD,
    dout1_t *out1, dout2_t *out2, dout3_t *out3, dout4_t *out4
) {
    // Basic arithmetic operations
    *out1 = inA * inB;
    *out2 = inB + inA;
    *out3 = inC / inA;
    *out4 = inD % inA;
}
In this latest update to this example, the C++ arbitrary precision types are used:
- Add header file ap_int.h to the source code.
- Change the native C++ types to arbitrary precision types ap_int<N> or ap_uint<N>, where N is a bit-size from 1 to 1024 (as noted above, this can be extended to 32K-bits if required).
The data types are defined in the header cpp_ap_int_arith.h.
Compared with the Basic Arithmetic example in Standard Types, the input data types have simply been reduced to represent the maximum size of the real input data (for example, 8-bit input inA is reduced to 6-bit input). The output types have been refined to be more accurate; for example, out2, the sum of inA and inB, need only be 13-bit and not 32-bit.
The following example shows basic arithmetic with C++ arbitrary precision types.
#ifndef _CPP_AP_INT_ARITH_H_
#define _CPP_AP_INT_ARITH_H_
#include <stdio.h>
#include "ap_int.h"
#define N 9
// Old data types
//typedef char dinA_t;
//typedef short dinB_t;
//typedef int dinC_t;
//typedef long long dinD_t;
//typedef int dout1_t;
//typedef unsigned int dout2_t;
//typedef int32_t dout3_t;
//typedef int64_t dout4_t;
typedef ap_int<6> dinA_t;
typedef ap_int<12> dinB_t;
typedef ap_int<22> dinC_t;
typedef ap_int<33> dinD_t;
typedef ap_int<18> dout1_t;
typedef ap_uint<13> dout2_t;
typedef ap_int<22> dout3_t;
typedef ap_int<6> dout4_t;
void cpp_ap_int_arith(dinA_t inA,dinB_t inB,dinC_t inC,dinD_t inD,dout1_t
*out1,dout2_t *out2,dout3_t *out3,dout4_t *out4);
#endif
Synthesizing the C++ Arbitrary Precision Integer Types example results in a design that is functionally identical to Standard Types. Rather than use the C++ cout operator to output the results to a file, the built-in ap_int method .to_int() converts the ap_int results to integer types for use with the standard fprintf function.
fprintf(fp, "%d*%d=%d; %d+%d=%d; %d/%d=%d; %d mod %d=%d;\n",
inA.to_int(), inB.to_int(), out1.to_int(),
inB.to_int(), inA.to_int(), out2.to_int(),
inC.to_int(), inA.to_int(), out3.to_int(),
inD.to_int(), inA.to_int(), out4.to_int());
C++ Arbitrary Precision Integer Types: Reference Information
For comprehensive information on the methods, synthesis behavior, and all aspects
of using the ap_(u)int<N> arbitrary precision data types, see
C++ Arbitrary Precision Types. This
section includes:
- Techniques for assigning constant and initialization values to arbitrary precision integers (including values greater than 1024-bit).
- A description of Vitis HLS helper methods, such as printing, concatenating, bit-slicing and range selection functions.
- A description of operator behavior, including a description of shift operations (a negative shift value results in a shift in the opposite direction).
C++ Arbitrary Precision Types
Vitis HLS provides a C++ template class,
ap_[u]int<>, that implements arbitrary
precision (or bit-accurate) integer data types with consistent, bit-accurate behavior
between software and hardware modeling.
This class provides all arithmetic, bitwise, logical and relational operators allowed for native C integer types. In addition, this class provides methods to handle some useful hardware operations, such as allowing initialization and conversion of variables of widths greater than 64 bits. Details for all operators and class methods are discussed below.
Compiling ap_[u]int<> Types
To use the ap_[u]int<> classes, you must
include the ap_int.h header file in all source files that reference
ap_[u]int<> variables.
When compiling software models that use these classes, it may be necessary
to specify the location of the Vitis HLS header files, for example by
adding the -I/<HLS_HOME>/include option
for g++ compilation.
Declaring/Defining ap_[u] Variables
There are separate signed and unsigned classes:
- ap_int<int_W> (signed)
- ap_uint<int_W> (unsigned)
The template parameter int_W specifies the total
width of the variable being declared.
User-defined types may be created with the C/C++ typedef
statement as shown in the following examples:
#include "ap_int.h" // use ap_[u]int<> types
typedef ap_uint<128> uint128_t; // 128-bit user defined type
ap_int<96> my_wide_var; // a global variable declaration
The default maximum width allowed is 1024 bits. This default may be overridden
by defining the macro AP_INT_MAX_W with a positive
integer value less than or equal to 32768 before inclusion of the ap_int.h
header file.
Setting the value of AP_INT_MAX_W too high may cause slow software compile and run times.
Following is an example of overriding AP_INT_MAX_W:
#define AP_INT_MAX_W 4096 // Must be defined before next line
#include "ap_int.h"
ap_int<4096> very_wide_var;
Initialization and Assignment from Constants (Literals)
The class constructor and assignment operator overloads allow initialization of and assignment to ap_[u]int<> variables using standard C/C++ integer literals.
This method of assigning values to ap_[u]int<> variables is subject to the limitations of C++ and the system upon which the software will run. This typically leads to a 64-bit limit on integer literals (for example, those with LL or ULL suffixes).
To allow assignment of values wider than 64 bits, the ap_[u]int<> classes provide constructors that allow initialization from a string of arbitrary length (less than or equal to the width of the variable).
By default, the string provided is interpreted as a hexadecimal value as long as it contains only valid hexadecimal digits (that is, 0-9 and a-f). To assign a value from such a string, an explicit C++ style cast of the string to the appropriate type must be made.
Following are examples of initialization and assignment, including for values greater than 64 bits:
ap_int<42> a_42b_var(-1424692392255LL); // long long decimal format
a_42b_var = 0x14BB648B13FLL; // hexadecimal format
a_42b_var = -1; // negative int literal sign-extended to full width
ap_uint<96> wide_var("76543210fedcba9876543210", 16); // Greater than 64-bit
wide_var = ap_int<96>("0123456789abcdef01234567", 16);
ap_uint<N> a ={0}.
The ap_[u]int<> constructor may be explicitly instructed to interpret the string as representing the number in radix 2, 8, 10, or 16 formats. This is accomplished by adding the appropriate radix value as a second parameter to the constructor call.
A compilation error occurs if the string literal contains any characters that are invalid as digits for the radix specified.
The following examples use different radix formats:
ap_int<6> a_6bit_var("101010", 2); // 42d in binary format
a_6bit_var = ap_int<6>("40", 8); // 32d in octal format
a_6bit_var = ap_int<6>("55", 10); // decimal format
a_6bit_var = ap_int<6>("2A", 16); // 42d in hexadecimal format
a_6bit_var = ap_int<6>("42", 2); // COMPILE-TIME ERROR! "42" is not binary
The radix of the number encoded in the string can also be inferred by
the constructor, when it is prefixed with a zero (0)
followed by one of the following characters: “b”,
“o” or “x”.
The prefixes “0b”, “0o”
and “0x” correspond to binary, octal
and hexadecimal formats respectively.
The following examples use alternate initializer string formats:
ap_int<6> a_6bit_var("0b101010", 2); // 42d in binary format
a_6bit_var = ap_int<6>("0o40", 8); // 32d in octal format
a_6bit_var = ap_int<6>("0x2A", 16); // 42d in hexadecimal format
a_6bit_var = ap_int<6>("0b42", 2); // COMPILE-TIME ERROR! "42" is not binary
If the bit-width is greater than 53-bits, the ap_[u]fixed
value must be initialized with a string, for example:
ap_ufixed<72,10> Val("2460508560057040035.375");
Support for Console I/O (Printing)
As with initialization and assignment to ap_[u]int<> variables, Vitis HLS supports printing values that require more than 64 bits to represent.
Using the C++ Standard Output Stream
The easiest way to output any value stored in an ap_[u]int variable is to use the C++ standard output stream, std::cout (#include <iostream>).
The stream insertion operator (<<) is overloaded to correctly output the full range of values possible for any given ap_[u]int variable. The following stream manipulators are also supported:
- dec (decimal)
- hex (hexadecimal)
- oct (octal)
These allow formatting of the value as indicated.
The following example uses cout to print values:
#include <iostream>
using namespace std;
ap_uint<72> Val("10fedcba9876543210");
cout << Val << endl; // Yields: "313512663723845890576"
cout << hex << Val << endl; // Yields: "10fedcba9876543210"
cout << oct << Val << endl; // Yields: "41773345651416625031020"
Using the Standard C Library
You can also use the standard C library (#include <stdio.h>) to print out values larger than 64 bits:
- Convert the value to a C++ std::string using the ap_[u]int class method to_string().
- Convert the result to a null-terminated C character string using the std::string class method c_str().
Optional Argument One (Specifying the Radix)
You can pass the ap_[u]int::to_string() method an optional argument specifying the radix of the numerical format desired. The valid radix argument values are:
- 2 (binary) (default)
- 8 (octal)
- 10 (decimal)
- 16 (hexadecimal)
Optional Argument Two (Printing as Signed Values)
A second optional argument to ap_[u]int::to_string() specifies
whether to print the non-decimal formats as signed values. This argument is boolean.
The default value is false, causing the non-decimal formats to be printed as
unsigned values.
The following examples use printf to print values:
ap_int<72> Val("80fedcba9876543210");
printf("%s\n", Val.to_string(16).c_str()); // => "80FEDCBA9876543210"
printf("%s\n", Val.to_string(10).c_str()); // => "-2342818482890329542128"
printf("%s\n", Val.to_string(8).c_str()); // => "401773345651416625031020"
printf("%s\n", Val.to_string(16, true).c_str()); // => "-7F0123456789ABCDF0"
Expressions Involving ap_[u]int<> types
Variables of ap_[u]int<> types may generally be used freely in expressions involving C/C++ operators. Some behaviors may be unexpected. These are discussed in detail below.
Zero- and Sign-Extension on Assignment From Narrower to Wider Variables
When assigning the value of a narrower bit-width signed
(ap_int<>) variable to a wider one, the value is
sign-extended to the width of the destination variable, regardless of its
signedness.
Similarly, an unsigned source variable is zero-extended before assignment.
Explicit casting of the source variable may be necessary to ensure expected behavior on assignment. See the following example:
ap_uint<10> Result;
ap_int<7> Val1 = 0x7f;
ap_uint<6> Val2 = 0x3f;
Result = Val1; // Yields: 0x3ff (sign-extended)
Result = Val2; // Yields: 0x03f (zero-padded)
Result = ap_uint<7>(Val1); // Yields: 0x07f (zero-padded)
Result = ap_int<6>(Val2); // Yields: 0x3ff (sign-extended)
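The extension rule can also be reproduced without the HLS headers. The following is a minimal sketch in standard C++, assuming values are held as raw bit patterns in a 64-bit word; the helper name widen_bits is hypothetical and is not part of ap_int.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of ap_int.h: widens the low src_width bits of
// `bits` to dst_width bits. The source signedness alone decides between
// sign-extension and zero-extension, as described above.
std::uint64_t widen_bits(std::uint64_t bits, int src_width, bool src_signed,
                         int dst_width) {
    assert(src_width >= 1 && src_width <= dst_width && dst_width < 64);
    const std::uint64_t src_mask = (std::uint64_t{1} << src_width) - 1;
    const std::uint64_t dst_mask = (std::uint64_t{1} << dst_width) - 1;

    std::uint64_t value = bits & src_mask;
    const bool sign_bit = ((value >> (src_width - 1)) & 1u) != 0;
    if (src_signed && sign_bit) {
        value |= ~src_mask;  // replicate the sign bit into all upper bits
    }
    return value & dst_mask;  // keep only the destination width
}
```

Under these assumptions, widen_bits(0x7f, 7, true, 10) gives 0x3ff and widen_bits(0x3f, 6, false, 10) gives 0x03f, matching the first two assignments in the example above.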
Truncation on Assignment of Wider to Narrower Variables
Assigning the value of a wider source variable to a narrower one leads to truncation of the value. All bits beyond the most significant bit (MSB) position of the destination variable are lost.
There is no special handling of the sign information during truncation. This may lead to unexpected behavior. Explicit casting may help avoid this unexpected behavior.
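As an illustration of how truncation can change the sign of a value, the following is a minimal sketch in standard C++ that keeps only the low bits of a value and reinterprets them as a signed two's complement number. The helper name truncate_to_signed is hypothetical; the sketch only models the behavior described above and does not use ap_int.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of ap_int.h: keeps the low `width` bits of
// `value` and interprets them as a signed two's complement number, which is
// what assignment to a narrower signed variable amounts to.
std::int64_t truncate_to_signed(std::int64_t value, int width) {
    assert(width >= 1 && width < 64);
    const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
    const std::uint64_t low = static_cast<std::uint64_t>(value) & mask;
    const std::uint64_t sign = std::uint64_t{1} << (width - 1);

    if (low & sign) {
        // The new MSB is set, so the truncated value is negative: low - 2^width.
        return static_cast<std::int64_t>(low) - static_cast<std::int64_t>(mask) - 1;
    }
    return static_cast<std::int64_t>(low);
}
```

For example, the positive value 384 (0x180) truncated to 8 bits becomes -128 in this model, because bit 7 of the source lands in the sign position of the destination.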
Class Methods and Operators
The ap_[u]int types do not support implicit conversion from wide ap_[u]int (>64 bits) to built-in C/C++ integer types. For example, the following code example returns 1, because the implicit cast from ap_uint<65> to bool returns 0.
#include <stdio.h>
#include "ap_int.h"
bool nonzero(ap_uint<65> data) {
    return data; // This leads to implicit truncation to 64b int
}
int main() {
    if (nonzero((ap_uint<65>)1 << 64)) {
        return 0;
    }
    printf("FAIL\n");
    return 1;
}
To convert wide ap_[u]int types to built-in integers, use the explicit conversion functions included with the ap_[u]int types:
- to_int()
- to_long()
- to_bool()
In general, any valid operation that can be done on a native C/C++ integer
data type is supported using operator overloading for ap_[u]int
types.
In addition to these overloaded operators, some class specific operators and methods are included to ease bit-level operations.
Binary Arithmetic Operators
Standard binary integer arithmetic operators are overloaded to provide arbitrary precision arithmetic. These operators take either:
- Two operands of ap_[u]int, or
- One ap_[u]int type and one C/C++ fundamental integer data type
For example:
- char
- short
- int
The width and signedness of the resulting value is determined by the width and signedness of the operands, before sign-extension, zero-padding or truncation are applied based on the width of the destination variable (or expression). Details of the return value are described for each operator.
When expressions contain a mix of ap_[u]int and C/C++ fundamental integer types, the C++ types assume the following widths:
- char (8-bits)
- short (16-bits)
- int (32-bits)
- long (32-bits)
- long long (64-bits)
Addition
ap_(u)int::RType ap_(u)int::operator + (ap_(u)int op)
Returns the sum of:
- Two ap_[u]int operands, or
- One ap_[u]int and a C/C++ integer type
The width of the sum value is:
- One bit more than the wider of the two operands, or
- Two bits more if and only if the wider is unsigned and the narrower is signed
The sum is treated as signed if either (or both) of the operands is of a signed type.
Subtraction
ap_(u)int::RType ap_(u)int::operator - (ap_(u)int op)
Returns the difference of two integers.
The width of the difference value is:
- One bit more than the wider of the two operands, or
- Two bits more if and only if the wider is unsigned and the narrower is signed
This is true before assignment, at which point it is sign-extended, zero-padded, or truncated based on the width of the destination variable.
The difference is treated as signed regardless of the signedness of the operands.
Multiplication
ap_(u)int::RType ap_(u)int::operator * (ap_(u)int op)
Returns the product of two integer values.
The width of the product is the sum of the widths of the operands.
The product is treated as a signed type if either of the operands is of a signed type.
Division
ap_(u)int::RType ap_(u)int::operator / (ap_(u)int op)
Returns the quotient of two integer values.
The width of the quotient is the width of the dividend if the divisor is an unsigned type. Otherwise, it is the width of the dividend plus one.
The quotient is treated as a signed type if either of the operands is of a signed type.
Modulus
ap_(u)int::RType ap_(u)int::operator % (ap_(u)int op)
Returns the modulus, or remainder of integer division, for two integer values.
The width of the modulus is the minimum of the widths of the operands, if they are both of the same signedness.
If the divisor is an unsigned type and the dividend is signed, then the width is that of the divisor plus one.
The modulus is treated as having the same signedness as the dividend.
Following are examples of arithmetic operators:
ap_uint<71> Rslt;
ap_uint<42> Val1 = 5;
ap_int<23> Val2 = -8;
Rslt = Val1 + Val2; // Yields: -3 (44 bits) sign-extended to 71 bits
Rslt = Val1 - Val2; // Yields: +13 sign extended to 71 bits
Rslt = Val1 * Val2; // Yields: -40 (65 bits) sign extended to 71 bits
Rslt = 50 / Val2; // Yields: -6 (33 bits) sign extended to 71 bits
Rslt = 50 % Val2; // Yields: +2 (23 bits) sign extended to 71 bits
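The result-width rules stated for addition, subtraction, multiplication, and division can be written down as small functions. The following is a minimal sketch in standard C++; the IntKind struct and all function names are hypothetical, and the sketch assumes that an unsigned operand mixed with a signed one is first widened by one bit (which reproduces the "two bits more" case when the wider operand is unsigned).

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical description of an integer operand: its width and signedness.
struct IntKind {
    int width;
    bool is_signed;
};

// Assumption: an unsigned operand needs one extra bit when the other operand
// is signed, so that its full range fits in a signed representation.
static int effective_width(IntKind a, IntKind other) {
    return a.width + ((!a.is_signed && other.is_signed) ? 1 : 0);
}

// Sum: one bit more than the wider (effective) operand; signed if either is.
IntKind add_result(IntKind a, IntKind b) {
    assert(a.width > 0 && b.width > 0);
    return {std::max(effective_width(a, b), effective_width(b, a)) + 1,
            a.is_signed || b.is_signed};
}

// Difference: same width as the sum, but always treated as signed.
IntKind sub_result(IntKind a, IntKind b) {
    return {add_result(a, b).width, true};
}

// Product: sum of the operand widths; signed if either operand is signed.
IntKind mul_result(IntKind a, IntKind b) {
    assert(a.width > 0 && b.width > 0);
    return {a.width + b.width, a.is_signed || b.is_signed};
}

// Quotient: dividend width, plus one if the divisor is signed.
IntKind div_result(IntKind dividend, IntKind divisor) {
    assert(dividend.width > 0 && divisor.width > 0);
    return {dividend.width + (divisor.is_signed ? 1 : 0),
            dividend.is_signed || divisor.is_signed};
}
```

For the operands in the example above, a 42-bit unsigned value and a 23-bit signed value, this model gives a 44-bit signed sum and a 65-bit signed product; dividing a 32-bit int by the 23-bit signed value gives a 33-bit signed quotient.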
Bitwise Logical Operators
The bitwise logical operators all return a value with a width that is the maximum of the widths of the two operands. It is treated as unsigned if and only if both operands are unsigned. Otherwise, it is of a signed type.
Sign-extension (or zero-padding) may occur, based on the signedness of the expression, not the destination variable.
Bitwise OR
ap_(u)int::RType ap_(u)int::operator | (ap_(u)int op)
Returns the bitwise OR of the two operands.
Bitwise AND
ap_(u)int::RType ap_(u)int::operator & (ap_(u)int op)
Returns the bitwise AND of the two operands.
Bitwise XOR
ap_(u)int::RType ap_(u)int::operator ^ (ap_(u)int op)
Returns the bitwise XOR of the two operands.
Unary Operators
Addition
ap_(u)int ap_(u)int::operator + ()
Returns a copy of the ap_[u]int operand.
Subtraction
ap_(u)int::RType ap_(u)int::operator - ()
Returns the negated value of the operand, with:
- The same width as the operand if it is a signed type, or
- The operand width plus one if it is unsigned.
The return value is always a signed type.
Bitwise Inverse
ap_(u)int::RType ap_(u)int::operator ~ ()
Returns the bitwise-NOT of the operand with the same width and
signedness.
Logical Invert
bool ap_(u)int::operator ! ()
Returns a Boolean false value if and only if the operand is
not equal to zero (0).
Returns a Boolean true value if the operand is equal to zero
(0).
Ternary Operators
When you use the ternary operator with the standard C int type, you
must explicitly cast from one type to the other to ensure that both results have the same
type. For example:
// Integer type is cast to ap_int type
ap_int<32> testc3(int a, ap_int<32> b, ap_int<32> c, bool d) {
return d?ap_int<32>(a):b;
}
// ap_int type is cast to an integer type
ap_int<32> testc4(int a, ap_int<32> b, ap_int<32> c, bool d) {
return d?a+1:(int)b;
}
// Integer type is cast to ap_int type
ap_int<32> testc5(int a, ap_int<32> b, ap_int<32> c, bool d) {
return d?ap_int<33>(a):b+1;
}
Shift Operators
Each shift operator comes in two versions:
- One version for unsigned right-hand side (RHS) operands
- One version for signed right-hand side (RHS) operands
A negative value supplied to the signed RHS versions reverses the shift operations direction. That is, a shift by the absolute value of the RHS operand in the opposite direction occurs.
The shift operators return a value with the same width as the left-hand side (LHS) operand. As with C/C++, if the LHS operand of a shift-right is a signed type, the sign bit is copied into the most significant bit positions, maintaining the sign of the LHS operand.
Unsigned Integer Shift Left
ap_(u)int ap_(u)int::operator << (ap_uint<int_W2> op)
Integer Shift Left
ap_(u)int ap_(u)int::operator << (ap_int<int_W2> op)
Unsigned Integer Shift Right
ap_(u)int ap_(u)int::operator >> (ap_uint<int_W2> op)
Integer Shift Right
ap_(u)int ap_(u)int::operator >> (ap_int<int_W2> op)
Following are examples of shift operations:
ap_uint<13> Rslt;
ap_uint<7> Val1 = 0x41;
Rslt = Val1 << 6; // Yields: 0x0040, i.e. msb of Val1 is lost
Rslt = ap_uint<13>(Val1) << 6; // Yields: 0x1040, no info lost
ap_int<7> Val2 = -63;
Rslt = Val2 >> 4; //Yields: 0x1ffc, sign is maintained and extended
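The two properties described above, a result that keeps the width of the LHS operand and a negative shift amount that reverses the direction, can be modeled in a few lines. The following is a minimal sketch in standard C++ for an unsigned LHS operand; the helper name shift_left_model is hypothetical and the code does not use ap_int.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of ap_int.h: models a left shift of an
// unsigned `width`-bit value by a signed amount. A negative amount shifts
// right by its absolute value; the result keeps the width of the LHS.
std::uint64_t shift_left_model(std::uint64_t value, int width, int amount) {
    assert(width >= 1 && width < 64);
    assert(amount > -64 && amount < 64);
    const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
    value &= mask;
    const std::uint64_t shifted =
        (amount >= 0) ? (value << amount) : (value >> -amount);
    return shifted & mask;  // bits shifted beyond the LHS width are lost
}
```

In this model, shifting the 7-bit value 0x41 left by 6 yields 0x40, while the same shift on a 13-bit value yields 0x1040, mirroring the first two lines of the example above.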
Compound Assignment Operators
Vitis HLS supports compound assignment operators:
- *=
- /=
- %=
- +=
- -=
- <<=
- >>=
- &=
- ^=
- |=
The RHS expression is first evaluated then supplied as the
RHS operand to the base operator, the result of which is assigned
back to the LHS variable. The expression sizing, signedness, and
potential sign-extension or truncation rules apply as discussed above for the relevant
operations.
ap_uint<10> Val1 = 630;
ap_int<3> Val2 = -3;
ap_uint<5> Val3 = 27;
Val1 += Val2 - Val3; // Yields: 600 and is equivalent to:
// Val1 = ap_uint<10>(ap_int<11>(Val1) +
// ap_int<11>((ap_int<6>(Val2) -
// ap_int<6>(Val3))));
Increment and Decrement Operators
The increment and decrement operators are provided. All return a value of the same width as the operand, which is unsigned if and only if the operand is of an unsigned type and signed otherwise.
Pre-Increment
ap_(u)int& ap_(u)int::operator ++ ()
Returns the incremented value of the operand.
Assigns the incremented value to the operand.
Post-Increment
const ap_(u)int ap_(u)int::operator ++ (int)
Returns the value of the operand before assignment of the incremented value to the operand variable.
Pre-Decrement
ap_(u)int& ap_(u)int::operator -- ()
Returns the decremented value of, as well as assigning the decremented value to, the operand.
Post-Decrement
const ap_(u)int ap_(u)int::operator -- (int)
Returns the value of the operand before assignment of the decremented value to the operand variable.
Relational Operators
Vitis HLS supports all relational operators. They return a Boolean value based on the
result of the comparison. You can compare variables of ap_[u]int
types to C/C++ fundamental integer types with these operators.
Equality
bool ap_(u)int::operator == (ap_(u)int op)
Inequality
bool ap_(u)int::operator != (ap_(u)int op)
Less than
bool ap_(u)int::operator < (ap_(u)int op)
Greater than
bool ap_(u)int::operator > (ap_(u)int op)
Less than or equal to
bool ap_(u)int::operator <= (ap_(u)int op)
Greater than or equal to
bool ap_(u)int::operator >= (ap_(u)int op)
Other Class Methods, Operators, and Data Members
The following sections discuss other class methods, operators, and data members.
Bit-Level Operations
The following methods facilitate common bit-level operations on the value stored in
ap_[u]int type variables.
Length
int ap_(u)int::length ()
Returns an integer value providing the total number of bits in the
ap_[u]int variable.
Concatenation
ap_concat_ref ap_(u)int::concat (ap_(u)int low)
ap_concat_ref ap_(u)int::operator , (ap_(u)int high, ap_(u)int low)
Concatenates two ap_[u]int variables. The width of the returned value is the sum of the widths of the operands.
The High and Low arguments are placed in the higher and lower order bits of the result, respectively; the concat() method places the argument in the lower order bits.
When using the overloaded comma operator, the parentheses are required. The comma operator version may also appear on the LHS of assignment.
Explicitly cast C/C++ native types (including integer literals) to an appropriate ap_[u]int type before concatenating.
ap_uint<10> Rslt;
ap_int<3> Val1 = -3;
ap_int<7> Val2 = 54;
Rslt = (Val2, Val1); // Yields: 0x1B5
Rslt = Val1.concat(Val2); // Yields: 0x2B6
(Val1, Val2) = 0xAB; // Yields: Val1 == 1, Val2 == 43
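The bit placement performed by concatenation can be checked by hand with shifts and masks. The following is a minimal sketch in standard C++ that treats the operands as raw bit patterns; the helper name concat_bits is hypothetical and is not part of ap_int.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of ap_int.h: places the low high_width bits
// of `high` above the low low_width bits of `low`. The result is
// high_width + low_width bits wide.
std::uint64_t concat_bits(std::uint64_t high, int high_width,
                          std::uint64_t low, int low_width) {
    assert(high_width >= 1 && low_width >= 1 && high_width + low_width < 64);
    const std::uint64_t high_mask = (std::uint64_t{1} << high_width) - 1;
    const std::uint64_t low_mask = (std::uint64_t{1} << low_width) - 1;
    return ((high & high_mask) << low_width) | (low & low_mask);
}
```

With the 3-bit pattern of -3 (binary 101) and the 7-bit value 54, concat_bits(54, 7, 0x5, 3) gives 0x1B5 and concat_bits(0x5, 3, 54, 7) gives 0x2B6, the same values as in the example above.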
Bit Selection
ap_bit_ref ap_(u)int::operator [] (int bit)
Selects one bit from an arbitrary precision integer value and returns it.
The returned value is a reference value that can set or clear the corresponding bit in this
ap_[u]int.
The bit argument must be an int value. It specifies the index of the
bit to select. The least significant bit has index 0. The highest permissible index is one less
than the bit-width of this ap_[u]int.
The result type ap_bit_ref represents the reference to one bit of this
ap_[u]int instance specified by bit.
Range Selection
ap_range_ref ap_(u)int::range (unsigned Hi, unsigned Lo)
ap_range_ref ap_(u)int::operator () (unsigned Hi, unsigned Lo)
Returns the value represented by the range of bits specified by the arguments.
The Hi argument specifies the most significant bit (MSB) position of
the range, and Lo specifies the least significant bit (LSB).
The LSB of the source variable is in position 0. If the Hi argument has
a value less than Lo, the bits are returned in reverse order.
ap_uint<4> Rslt;
ap_uint<8> Val1 = 0x5f;
ap_uint<8> Val2 = 0xaa;
Rslt = Val1.range(3, 0); // Yields: 0xF
Val1(3,0) = Val2(3, 0); // Yields: 0x5A
Val1(3,0) = Val2(4, 1); // Yields: 0x55
Rslt = Val1.range(4, 7); // Yields: 0xA; bit-reversed!
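The reversed-order behavior when Hi is less than Lo is easy to misread, so the following minimal sketch in standard C++ models only the read side of range selection. The helper name range_bits is hypothetical, and the sketch works on a raw 64-bit pattern rather than an ap_[u]int.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of ap_int.h: returns the bits of `value`
// between positions hi and lo (inclusive). The bit at position `hi` always
// becomes the MSB of the result, so hi < lo yields the bits in reverse order.
std::uint64_t range_bits(std::uint64_t value, unsigned hi, unsigned lo) {
    assert(hi < 64 && lo < 64);
    std::uint64_t result = 0;
    if (hi >= lo) {
        // Normal order: walk from hi down to lo.
        for (int i = static_cast<int>(hi); i >= static_cast<int>(lo); --i) {
            result = (result << 1) | ((value >> i) & 1u);
        }
    } else {
        // Reversed order: walk from hi up to lo.
        for (unsigned i = hi; i <= lo; ++i) {
            result = (result << 1) | ((value >> i) & 1u);
        }
    }
    return result;
}
```

For instance, range_bits(0x55, 4, 7) returns 0xA, the bit-reversed upper nibble, while range_bits(0x5f, 3, 0) returns 0xF.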
AND reduce
bool ap_(u)int::and_reduce ()
- Applies the AND operation on all bits in this ap_(u)int.
- Returns the resulting single bit.
- Equivalent to comparing this value against -1 (all ones) and returning true if it matches, false otherwise.
OR reduce
bool ap_(u)int::or_reduce ()
- Applies the OR operation on all bits in this ap_(u)int.
- Returns the resulting single bit.
- Equivalent to comparing this value against 0 (all zeros) and returning false if it matches, true otherwise.
XOR reduce
bool ap_(u)int::xor_reduce ()
- Applies the XOR operation on all bits in this ap_(u)int.
- Returns the resulting single bit.
- Equivalent to counting the number of 1 bits in this value and returning false if the count is even or true if the count is odd.
NAND reduce
bool ap_(u)int::nand_reduce ()
- Applies the NAND operation on all bits in this ap_(u)int.
- Returns the resulting single bit.
- Equivalent to comparing this value against -1 (all ones) and returning false if it matches, true otherwise.
NOR reduce
bool ap_(u)int::nor_reduce ()
- Applies the NOR operation on all bits in this ap_(u)int.
- Returns the resulting single bit.
- Equivalent to comparing this value against 0 (all zeros) and returning true if it matches, false otherwise.
XNOR reduce
bool ap_(u)int::xnor_reduce ()
- Applies the XNOR operation on all bits in this ap_(u)int.
- Returns the resulting single bit.
- Equivalent to counting the number of 1 bits in this value and returning true if the count is even or false if the count is odd.
Bit Reduction Method Examples
ap_uint<8> Val = 0xaa;
bool t = Val.and_reduce(); // Yields: false
t = Val.or_reduce(); // Yields: true
t = Val.xor_reduce(); // Yields: false
t = Val.nand_reduce(); // Yields: true
t = Val.nor_reduce(); // Yields: false
t = Val.xnor_reduce(); // Yields: true
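The equivalences listed for each reduction can be checked with a plain C++ model. This is a sketch under the assumption of an 8-bit unsigned value; the helper names are hypothetical and are not part of ap_int.h.

```cpp
#include <cstdint>

// Plain-C++ models of the reduction methods for an 8-bit value (hypothetical names).
constexpr bool and_reduce8(std::uint8_t v) { return v == 0xFF; }  // all bits are 1
constexpr bool or_reduce8(std::uint8_t v)  { return v != 0; }     // at least one bit is 1
constexpr bool xor_reduce8(std::uint8_t v) {
    bool odd = false;                        // parity of the number of 1 bits
    for (int i = 0; i < 8; ++i) {
        const bool bit = ((v >> i) & 1u) != 0;
        odd = (odd != bit);
    }
    return odd;
}
constexpr bool nand_reduce8(std::uint8_t v) { return !and_reduce8(v); }
constexpr bool nor_reduce8(std::uint8_t v)  { return !or_reduce8(v); }
constexpr bool xnor_reduce8(std::uint8_t v) { return !xor_reduce8(v); }
```

For 0xaa (four 1 bits) the model gives false, true, false, true, false, true, in the same order as the example above.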
Bit Reverse
void ap_(u)int::reverse ()
Reverses the contents of ap_[u]int instance:
- The LSB becomes the MSB.
- The MSB becomes the LSB.
Reverse Method Example
ap_uint<8> Val = 0x12;
Val.reverse(); // Yields: 0x48
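As an illustration of what the reversal does to the bit pattern, the following sketch mirrors an 8-bit value in plain C++. reverse8 is a hypothetical helper, not the library method, and it returns the result instead of modifying the value in place.

```cpp
#include <cstdint>

// Mirror the 8 bits of v: bit 0 becomes bit 7, bit 1 becomes bit 6, and so on.
constexpr std::uint8_t reverse8(std::uint8_t v) {
    std::uint8_t r = 0;
    for (int i = 0; i < 8; ++i) {
        // Take bits from the LSB side of v and push them in from the LSB side of r.
        r = static_cast<std::uint8_t>((r << 1) | ((v >> i) & 1u));
    }
    return r;
}
```

For the value 0x12 (00010010) the model returns 0x48 (01001000), as in the example above.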
Test Bit Value
bool ap_(u)int::test (unsigned i)
Checks whether the specified bit of the ap_(u)int instance is 1.
Returns true if the bit is 1, false otherwise.
Test Method Example
ap_uint<8> Val = 0x12;
bool t = Val.test(5); // Yields: false (bit 5 of 0x12 is 0)
t = Val.test(4); // Yields: true
Set Bit Value
void ap_(u)int::set (unsigned i, bool v)
void ap_(u)int::set_bit (unsigned i, bool v)
Sets the specified bit of the ap_(u)int instance to the value of
the Boolean argument v.
Set Bit (to 1)
void ap_(u)int::set (unsigned i)
Sets the specified bit of the ap_(u)int instance to the value
1 (one).
Clear Bit (to 0)
void ap_(u)int:: clear(unsigned i)
Sets the specified bit of the ap_(u)int instance to the value
0 (zero).
Invert Bit
void ap_(u)int:: invert(unsigned i)
Inverts the bit specified in the function argument of the ap_(u)int
instance. The specified bit becomes 0 if its original value is
1 and vice versa.
Example of bit set, clear and invert bit methods:
ap_uint<8> Val = 0x12;
Val.set(0, 1); // Yields: 0x13
Val.set_bit(4, false); // Yields: 0x03
Val.set(7); // Yields: 0x83
Val.clear(1); // Yields: 0x81
Val.invert(4); // Yields: 0x91
Rotate Right
void ap_(u)int:: rrotate(unsigned n)
Rotates the ap_(u)int instance n places to right.
Rotate Left
void ap_(u)int:: lrotate(unsigned n)
Rotates the ap_(u)int instance n places to left.
ap_uint<8> Val = 0x12;
Val.rrotate(3); // Yields: 0x42
Val.lrotate(6); // Yields: 0x90
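The rotate results above can be reproduced with a plain C++ model. This is a sketch for an 8-bit value only; the helper names are hypothetical, and reducing n modulo 8 is an assumption of the sketch rather than documented library behavior.

```cpp
#include <cstdint>

// Rotate an 8-bit value right by n places; bits shifted out on the right re-enter on the left.
constexpr std::uint8_t rrotate8(std::uint8_t v, unsigned n) {
    n %= 8;  // assumption: rotation amounts wrap around the word width
    return n == 0 ? v : static_cast<std::uint8_t>((v >> n) | (v << (8 - n)));
}

// Rotate an 8-bit value left by n places; bits shifted out on the left re-enter on the right.
constexpr std::uint8_t lrotate8(std::uint8_t v, unsigned n) {
    n %= 8;
    return n == 0 ? v : static_cast<std::uint8_t>((v << n) | (v >> (8 - n)));
}
```

Applying rrotate8(0x12, 3) gives 0x42, and lrotate8(0x42, 6) gives 0x90, matching the sequence in the example.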
Bitwise NOT
void ap_(u)int:: b_not()
- Complements every bit of the ap_(u)int instance.
Bitwise NOT Example
ap_uint<8> Val = 0x12;
Val.b_not(); // Yields: 0xED
Test Sign
bool ap_(u)int::sign ()
- Checks whether the ap_(u)int instance is negative.
- Returns true if negative.
- Returns false if zero or positive.
Explicit Conversion Methods
To C/C++ “(u)int”
int ap_(u)int::to_int ()
unsigned ap_(u)int::to_uint ()
- Returns native C/C++ (32-bit on most systems) integers with the value contained in the ap_[u]int.
- Truncation occurs if the value is greater than can be represented by an [unsigned] int.
To C/C++ 64-bit “(u)int”
long long ap_(u)int::to_int64 ()
unsigned long long ap_(u)int::to_uint64 ()
- Returns native C/C++ 64-bit integers with the value contained in the ap_[u]int.
- Truncation occurs if the value is greater than can be represented by an [unsigned] long long.
To C/C++ “double”
double ap_(u)int::to_double ()
- Returns a native C/C++ double 64-bit floating point representation of the value contained in the ap_[u]int.
- If the ap_[u]int is wider than 53 bits (the number of bits in the mantissa of a double), the resulting double may not have the exact value expected.
ap_[u]int to other data types.
Sizeof
The standard C++ sizeof() function should not be used with
ap_[u]int or other classes or object instances. The
ap_int<> data type is a class, and sizeof
returns the storage used by that class or object instance.
sizeof(ap_int<N>) always returns the number of bytes used, not the bit-width N. For
example:
sizeof(ap_int<127>)=16
sizeof(ap_int<128>)=16
sizeof(ap_int<129>)=24
sizeof(ap_int<130>)=24
Compile Time Access to Data Type Attributes
The ap_[u]int<> types are provided with a static member that
allows the size of the variables to be determined at compile time. The data type is provided
with the static const member width, which is automatically assigned the
width of the data type:
static const int width = _AP_W;
You can use the width data member to extract the data width of an
existing ap_[u]int<> data type to create another
ap_[u]int<> data type at compile time. The following example shows
how the size of variable Res is defined as 1-bit greater than variables
Val1 and Val2:
// Definition of basic data type
#define INPUT_DATA_WIDTH 8
typedef ap_int<INPUT_DATA_WIDTH> data_t;
// Definition of variables
data_t Val1, Val2;
// Res is automatically sized at compile-time to be 1-bit greater than data type data_t
ap_int<data_t::width+1> Res = Val1 + Val2;
This ensures that Vitis HLS correctly models the bit-growth caused by the addition even if
you update the value of INPUT_DATA_WIDTH for data_t.
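The compile-time mechanism can be illustrated without the Vitis HLS headers. In the sketch below, int_model is a hypothetical stand-in for ap_int<W> that models only the static width member; it has no arithmetic behavior.

```cpp
// Hypothetical stand-in for ap_int<W>: only the compile-time width member is modeled.
template <int W>
struct int_model {
    static const int width = W;
};

#define INPUT_DATA_WIDTH 8
typedef int_model<INPUT_DATA_WIDTH> data_t;

// The result type is derived from data_t, so it follows any change to INPUT_DATA_WIDTH.
typedef int_model<data_t::width + 1> res_t;
```

Because res_t is computed from data_t::width, changing INPUT_DATA_WIDTH to 12 would make res_t 13 bits wide with no other edits.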
C++ Arbitrary Precision Fixed-Point Types
C++ functions can take advantage of the arbitrary precision fixed-point types included with Vitis HLS. The following figure summarizes the basic features of these fixed-point types:
- The word can be signed (ap_fixed) or unsigned (ap_ufixed).
- A word of any arbitrary size W can be defined.
- The number of places above the decimal point, I, also defines the number of decimal places in the word, W-I (represented by B in the following figure).
- The type of rounding or quantization (Q) can be selected.
- The overflow behavior (O and N) can be selected.
Arbitrary precision fixed-point types use more memory during C simulation.
If using very large arrays of ap_[u]fixed types, refer to
the discussion of C simulation in Arrays.
The advantages of using fixed-point types are:
- They allow fractional numbers to be easily represented.
- When variables have a different number of integer and decimal place bits, the alignment of the decimal point is handled.
- There are numerous options to handle how rounding should happen when there are too few decimal bits to represent the precision of the result.
- There are numerous options to handle how variables should overflow when the result is greater than the number of integer bits can represent.
These attributes are summarized by examining the code in the example below.
First, the header file ap_fixed.h is included. The
ap_fixed types are then defined using the typedef statement:
- A 10-bit input: 8-bit integer value with 2 decimal places.
- A 6-bit input: 3-bit integer value with 3 decimal places.
- A 22-bit variable for the accumulation: 17-bit integer value with 5 decimal places.
- A 36-bit variable for the result: 30-bit integer value with 6 decimal places.
The function contains no code to manage the alignment of the decimal point after operations are performed. The alignment is done automatically.
The following code sample shows the use of ap_fixed
types.
#include "ap_fixed.h"
typedef ap_ufixed<10,8, AP_RND, AP_SAT> din1_t;
typedef ap_fixed<6,3, AP_RND, AP_WRAP> din2_t;
typedef ap_fixed<22,17, AP_TRN, AP_SAT> dint_t;
typedef ap_fixed<36,30> dout_t;
dout_t cpp_ap_fixed(din1_t d_in1, din2_t d_in2) {
static dint_t sum;
sum += d_in1;
return sum * d_in2;
}
Using ap_(u)fixed types, the C++
simulation is bit accurate. Fast simulation can validate the algorithm and its accuracy.
After synthesis, the RTL exhibits the identical bit-accurate behavior.
Arbitrary precision fixed-point types can be freely assigned literal values
in the code. This is shown in the test bench (see the example below) used with the example
above, in which the values of in1 and in2 are declared and assigned constant values.
When assigning literal values involving operators, the literal values must
first be cast to ap_(u)fixed types. Otherwise, the C
compiler and Vitis HLS interpret the literal as an
integer or float/double type and may fail to find a
suitable operator. As shown in the following example, in the assignment of in1 = in1 + din1_t(0.25), the literal 0.25 is cast to an ap_fixed type.
#include <cmath>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <cstdio>
using namespace std;
#include "ap_fixed.h"
typedef ap_ufixed<10,8, AP_RND, AP_SAT> din1_t;
typedef ap_fixed<6,3, AP_RND, AP_WRAP> din2_t;
typedef ap_fixed<22,17, AP_TRN, AP_SAT> dint_t;
typedef ap_fixed<36,30> dout_t;
dout_t cpp_ap_fixed(din1_t d_in1, din2_t d_in2);
int main()
{
ofstream result;
din1_t in1 = 0.25;
din2_t in2 = 2.125;
dout_t output;
int retval=0;
result.open("result.dat");
// Persistent manipulators
result << right << fixed << setbase(10) << setprecision(15);
for (int i = 0; i <= 250; i++)
{
output = cpp_ap_fixed(in1,in2);
result << setw(10) << i;
result << setw(20) << in1;
result << setw(20) << in2;
result << setw(20) << output;
result << endl;
in1 = in1 + din1_t(0.25);
in2 = in2 - din2_t(0.125);
}
result.close();
// Compare the results file with the golden results
retval = system("diff --brief -w result.dat result.golden.dat");
if (retval != 0) {
printf("Test failed !!!\n");
retval=1;
} else {
printf("Test passed !\n");
}
// Return 0 if the test passes
return retval;
}
Fixed-Point Identifier Summary
The following table shows the quantization and overflow modes.
| Identifier | Mode | Description |
|---|---|---|
| W | | Word length in bits |
| I | | The number of bits used to represent the integer value (the number of bits above the decimal point) |
| Q | | Quantization mode: dictates the behavior when greater precision is generated than can be defined by the smallest fractional bit in the variable used to store the result. |
| | AP_RND | Rounding to plus infinity |
| | AP_RND_ZERO | Rounding to zero |
| | AP_RND_MIN_INF | Rounding to minus infinity |
| | AP_RND_INF | Rounding to infinity |
| | AP_RND_CONV | Convergent rounding |
| | AP_TRN | Truncation to minus infinity (default) |
| | AP_TRN_ZERO | Truncation to zero |
| O | | Overflow mode: dictates the behavior when more bits are generated than the variable to store the result contains. |
| | AP_SAT | Saturation |
| | AP_SAT_ZERO | Saturation to zero |
| | AP_SAT_SYM | Symmetrical saturation |
| | AP_WRAP | Wrap around (default) |
| | AP_WRAP_SM | Sign magnitude wrap around |
| N | | The number of saturation bits in wrap modes. |
C++ Arbitrary Precision Fixed-Point Types: Reference Information
For comprehensive information on the methods, synthesis behavior, and all aspects
of using the ap_(u)fixed<N> arbitrary precision fixed-point data
types, see C++ Arbitrary Precision Fixed-Point Types.
This section includes:
- Techniques for assigning constant and initialization values to arbitrary precision integers (including values greater than 1024-bit).
- A detailed description of the overflow and saturation modes.
- A description of Vitis HLS helper methods, such as printing, concatenating, bit-slicing and range selection functions.
- A description of operator behavior, including a description of shift operations (a negative shift value results in a shift in the opposite direction).
C++ Arbitrary Precision Fixed-Point Types
Vitis HLS supports fixed-point types that allow fractional arithmetic to be easily handled. The advantage of fixed-point arithmetic is shown in the following example.
ap_fixed<11, 6> Var1 = 22.96875; // 11-bit signed word, 5 fractional bits
ap_ufixed<12,11> Var2 = 512.5; // 12-bit word, 1 fractional bit
ap_fixed<16,11> Res1; // 16-bit signed word, 5 fractional bits
Res1 = Var1 + Var2; // Result is 535.46875
Even though Var1 and Var2
have different precisions, the fixed-point type ensures that the decimal
point is correctly aligned before the operation (an addition in this
case), is performed. You are not required to perform any operations in
the C code to align the decimal point.
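To make the automatic alignment concrete, the sketch below performs the same addition on raw integers, which is roughly the bookkeeping the fixed-point types hide. The function aligned_add is a hypothetical helper; each value is represented as a raw integer together with its number of fractional bits.

```cpp
#include <cstdint>

// Add two fixed-point values held as raw integers with different numbers of
// fractional bits. Both operands are first scaled to the larger fraction width,
// so the returned raw value carries max(frac_a, frac_b) fractional bits.
constexpr std::int64_t aligned_add(std::int64_t raw_a, int frac_a,
                                   std::int64_t raw_b, int frac_b) {
    const int frac = frac_a > frac_b ? frac_a : frac_b;
    return raw_a * (std::int64_t{1} << (frac - frac_a)) +
           raw_b * (std::int64_t{1} << (frac - frac_b));
}
```

Here Var1 = 22.96875 with 5 fractional bits is the raw value 735, and Var2 = 512.5 with 1 fractional bit is the raw value 1025. aligned_add(735, 5, 1025, 1) returns 17135, and 17135 / 32 is 535.46875, the result shown above.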
The type used to store the result of any fixed-point arithmetic operation must be large enough (in both the integer and fractional bits) to store the full result.
If this is not the case, the ap_fixed type
performs:
- overflow handling (when the result has more MSBs than the assigned type supports)
- quantization (or rounding, when the result has more LSBs than the assigned type supports)
The ap_[u]fixed type provides various options on
how the overflow and quantization are performed. The options are discussed
below.
ap_[u]fixed Representation
In ap_[u]fixed types, a fixed-point value is represented
as a sequence of bits with a specified position for the binary point.
- Bits to the left of the binary point represent the integer part of the value.
- Bits to the right of the binary point represent the fractional part of the value.
An ap_[u]fixed type is defined as follows:
ap_[u]fixed<int W,
int I,
ap_q_mode Q,
ap_o_mode O,
ap_sat_bits N>;
Quantization Modes
| Description | Mode |
|---|---|
| Rounding to plus infinity | AP_RND |
| Rounding to zero | AP_RND_ZERO |
| Rounding to minus infinity | AP_RND_MIN_INF |
| Rounding to infinity | AP_RND_INF |
| Convergent rounding | AP_RND_CONV |
| Truncation | AP_TRN |
| Truncation to zero | AP_TRN_ZERO |
AP_RND
- Round the value to the nearest representable value for the specific ap_[u]fixed type.
ap_fixed<3, 2, AP_RND, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.5
ap_fixed<3, 2, AP_RND, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.0
AP_RND_ZERO
- Round the value to the nearest representable value.
- Round towards zero.
- For positive values, delete the redundant bits.
- For negative values, add the least significant bits to get the nearest representable value.
ap_fixed<3, 2, AP_RND_ZERO, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.0
ap_fixed<3, 2, AP_RND_ZERO, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.0
AP_RND_MIN_INF
- Round the value to the nearest representable value.
- Round towards minus infinity.
- For positive values, delete the redundant bits.
- For negative values, add the least significant bits.
ap_fixed<3, 2, AP_RND_MIN_INF, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.0
ap_fixed<3, 2, AP_RND_MIN_INF, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.5
AP_RND_INF
- Round the value to the nearest representable value.
- The rounding depends on the least significant bit.
- For positive values, if the least significant bit is set, round towards plus infinity. Otherwise, round towards minus infinity.
- For negative values, if the least significant bit is set, round towards minus infinity. Otherwise, round towards plus infinity.
ap_fixed<3, 2, AP_RND_INF, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.5
ap_fixed<3, 2, AP_RND_INF, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.5
AP_RND_CONV
- Round the value to the nearest representable value.
- The rounding depends on the least significant bit.
- If least significant bit is set, round towards plus infinity.
- Otherwise, round towards minus infinity.
ap_fixed<3, 2, AP_RND_CONV, AP_SAT> UAPFixed4 = 0.75; // Yields: 1.0
ap_fixed<3, 2, AP_RND_CONV, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.0
AP_TRN
- Always round the value towards minus infinity.
ap_fixed<3, 2, AP_TRN, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.0
ap_fixed<3, 2, AP_TRN, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.5
AP_TRN_ZERO
Round the value as follows:
- For positive values, the rounding is the same as mode AP_TRN.
- For negative values, round towards zero.
ap_fixed<3, 2, AP_TRN_ZERO, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.0
ap_fixed<3, 2, AP_TRN_ZERO, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.0
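The seven quantization modes can be compared side by side with an integer model. The sketch below is not the library implementation: quantize and the Q_ enumerators are hypothetical names, and the interpretation of each mode (ties toward plus infinity, toward zero, and so on) is inferred from the examples in this section. A value is passed as a raw integer, and drop is the number of fractional bits to remove.

```cpp
enum QMode { Q_RND, Q_RND_ZERO, Q_RND_MIN_INF, Q_RND_INF, Q_RND_CONV, Q_TRN, Q_TRN_ZERO };

// Division by a positive m that rounds toward minus infinity.
constexpr long long floor_div(long long x, long long m) {
    return x / m - ((x % m != 0 && x < 0) ? 1 : 0);
}

// Remove `drop` (>= 1) fractional bits from raw value x under the given mode.
constexpr long long quantize(long long x, int drop, QMode mode) {
    const long long m = 1LL << drop;
    const long long half_up = floor_div(x + m / 2, m);      // nearest, ties toward +infinity
    const long long half_down = -floor_div(-x + m / 2, m);  // nearest, ties toward -infinity
    switch (mode) {
        case Q_RND:         return half_up;
        case Q_RND_MIN_INF: return half_down;
        case Q_RND_ZERO:    return x >= 0 ? half_down : half_up;   // ties toward zero
        case Q_RND_INF:     return x >= 0 ? half_up : half_down;   // ties away from zero
        case Q_RND_CONV:    return half_up % 2 == 0 ? half_up : half_down;  // ties to even LSB
        case Q_TRN:         return floor_div(x, m);                // toward -infinity
        case Q_TRN_ZERO:    return x / m;                          // toward zero
    }
    return 0;
}
```

For the ap_fixed<3, 2> examples, 1.25 is the raw value 5 with two fractional bits, and one bit is dropped, so a result of 3 means 1.5 and a result of -3 means -1.5. For instance, quantize(5, 1, Q_RND) is 3 and quantize(-5, 1, Q_TRN) is -3.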
Overflow Modes
| Description | Mode |
|---|---|
| Saturation | AP_SAT |
| Saturation to zero | AP_SAT_ZERO |
| Symmetrical saturation | AP_SAT_SYM |
| Wrap-around | AP_WRAP |
| Sign magnitude wrap-around | AP_WRAP_SM |
AP_SAT
Saturate the value.
- To the maximum value in case of overflow.
- To the negative maximum value in case of negative overflow.
ap_fixed<4, 4, AP_RND, AP_SAT> UAPFixed4 = 19.0; // Yields: 7.0
ap_fixed<4, 4, AP_RND, AP_SAT> UAPFixed4 = -19.0; // Yields: -8.0
ap_ufixed<4, 4, AP_RND, AP_SAT> UAPFixed4 = 19.0; // Yields: 15.0
ap_ufixed<4, 4, AP_RND, AP_SAT> UAPFixed4 = -19.0; // Yields: 0.0
AP_SAT_ZERO
Force the value to zero in case of overflow, or negative overflow.
ap_fixed<4, 4, AP_RND, AP_SAT_ZERO> UAPFixed4 = 19.0; // Yields: 0.0
ap_fixed<4, 4, AP_RND, AP_SAT_ZERO> UAPFixed4 = -19.0; // Yields: 0.0
ap_ufixed<4, 4, AP_RND, AP_SAT_ZERO> UAPFixed4 = 19.0; // Yields: 0.0
ap_ufixed<4, 4, AP_RND, AP_SAT_ZERO> UAPFixed4 = -19.0; // Yields: 0.0
AP_SAT_SYM
Saturate the value:
- To the maximum value in case of overflow.
- To the minimum value in case of negative overflow.
- Negative maximum for signed ap_fixed types
- Zero for unsigned ap_ufixed types
ap_fixed<4, 4, AP_RND, AP_SAT_SYM> UAPFixed4 = 19.0; // Yields: 7.0
ap_fixed<4, 4, AP_RND, AP_SAT_SYM> UAPFixed4 = -19.0; // Yields: -7.0
ap_ufixed<4, 4, AP_RND, AP_SAT_SYM> UAPFixed4 = 19.0; // Yields: 15.0
ap_ufixed<4, 4, AP_RND, AP_SAT_SYM> UAPFixed4 = -19.0; // Yields: 0.0
AP_WRAP
Wrap the value around in case of overflow.
ap_fixed<4, 4, AP_RND, AP_WRAP> UAPFixed4 = 31.0; // Yields: -1.0
ap_fixed<4, 4, AP_RND, AP_WRAP> UAPFixed4 = -19.0; // Yields: -3.0
ap_ufixed<4, 4, AP_RND, AP_WRAP> UAPFixed4 = 19.0; // Yields: 3.0
ap_ufixed<4, 4, AP_RND, AP_WRAP> UAPFixed4 = -19.0; // Yields: 13.0
If the value of N is set to zero (the default overflow mode):
- All MSB bits outside the range are deleted.
- For unsigned numbers, the value wraps around to zero after the maximum.
- For signed numbers, the value wraps around to the minimum value after the maximum.
If N>0:
- When N > 0, N MSB bits are saturated or set to 1.
- The sign bit is retained, so positive numbers remain positive and negative numbers remain negative.
- The bits that are not saturated are copied starting from the LSB side.
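The saturation and wrap-around modes described so far can be modeled on plain integers. The following sketch is an illustration, not the library code: overflow_model and the O_ enumerators are hypothetical names, only integer values are handled, and only the N = 0 case of AP_WRAP is modeled.

```cpp
enum OMode { O_SAT, O_SAT_ZERO, O_SAT_SYM, O_WRAP };

// Fit x into a field of w bits (signed or unsigned) using the given overflow mode.
constexpr long long overflow_model(long long x, int w, bool is_signed, OMode mode) {
    const long long vmax = is_signed ? (1LL << (w - 1)) - 1 : (1LL << w) - 1;
    const long long vmin = is_signed ? -(1LL << (w - 1)) : 0;
    if (x >= vmin && x <= vmax) return x;  // value fits, nothing to do
    switch (mode) {
        case O_SAT:      return x > vmax ? vmax : vmin;
        case O_SAT_ZERO: return 0;
        case O_SAT_SYM:  return x > vmax ? vmax : (is_signed ? -vmax : 0);
        case O_WRAP: {
            const long long span = 1LL << w;
            const long long low = ((x % span) + span) % span;   // keep the low w bits
            return (is_signed && low > vmax) ? low - span : low; // read as two's complement
        }
    }
    return x;
}
```

With w = 4, the model reproduces the examples above: 19 saturates to 7 (signed) or 15 (unsigned), -19 saturates symmetrically to -7, and 31 wraps to -1.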
AP_WRAP_SM
The value should be sign-magnitude wrapped around.
ap_fixed<4, 4, AP_RND, AP_WRAP_SM> UAPFixed4 = 19.0; // Yields: -4.0
ap_fixed<4, 4, AP_RND, AP_WRAP_SM> UAPFixed4 = -19.0; // Yields: 2.0
If the value of N is set to zero (the default overflow mode):
- This mode uses sign magnitude wrapping.
- The sign bit is set to the value of the least significant deleted bit.
- If the most significant remaining bit is different from the original MSB, all the remaining bits are inverted.
- If the MSBs are the same, the other bits are copied over.
For example, for the value -19.0 above (binary 101101, stored in 4 bits):
- Delete the redundant MSBs (10), leaving 1101.
- The new sign bit is the least significant bit of the deleted bits, 0 in this case.
- Compare the new sign bit with the sign of the new value (1).
- If different, invert all the bits. They are different in this case, giving 0010 (2.0).
If N>0:
- Uses sign magnitude saturation
- N MSBs are saturated to 1.
- Behaves similar to a case in which N = 0, except that positive numbers stay positive and negative numbers stay negative.
Compiling ap_[u]fixed<> Types
To use the ap_[u]fixed<> classes, you must
include the ap_fixed.h header file in all source files that
reference ap_[u]fixed<> variables.
When compiling software models that use these classes, it may be necessary
to specify the location of the Vitis HLS header files, for example by
adding the “-I/<HLS_HOME>/include”
option for g++ compilation.
Declaring and Defining ap_[u]fixed<> Variables
There are separate signed and unsigned classes:
- ap_fixed<W,I> (signed)
- ap_ufixed<W,I> (unsigned)
You can create user-defined types with the C/C++ typedef
statement:
User-Defined Types Examples
#include "ap_fixed.h" // use ap_[u]fixed<> types
typedef ap_ufixed<128,32> uint128_t; // 128-bit user defined type,
// 32 integer bits
Initialization and Assignment from Constants (Literals)
You can initialize an ap_[u]fixed variable with
normal floating point constants of the usual C/C++ width:
- 32 bits for type float
- 64 bits for type double
That is, the constant is typically a single-precision or double-precision floating point value.
Note that the value assigned to the fixed-point variable will be limited by the precision of the constant. Use string initialization as described in Initialization and Assignment from Constants (Literals) to ensure that all bits of the fixed-point variable are populated according to the precision described by the string.
#include <ap_fixed.h>
ap_ufixed<30, 15> my15BitInt = 3.1415;
ap_fixed<42, 23> my42BitInt = -1158.987;
ap_ufixed<99, 40> my99BitInt = 287432.0382911;
ap_fixed<36,30> my36BitInt = -0x123.456p-1;
The ap_[u]fixed types do not support initialization if they are used in an array of std::complex types.
typedef ap_fixed<DIN_W, 1, AP_TRN, AP_SAT> coeff_t; // MUST have IW >= 1
std::complex<coeff_t> twid_rom[REAL_SZ/2] = {{ 1, -0 },{ 0.9,-0.006 }, etc.}
The initialization values must first be cast to std::complex:
typedef ap_fixed<DIN_W, 1, AP_TRN, AP_SAT> coeff_t; // MUST have IW >= 1
std::complex<coeff_t> twid_rom[REAL_SZ/2] = {std::complex<coeff_t>( 1, -0 ),
std::complex<coeff_t>(0.9,-0.006 ),etc.}
Support for Console I/O (Printing)
As with initialization and assignment to ap_[u]fixed<>
variables, Vitis HLS supports printing values that require more than
64 bits to represent.
The easiest way to output any value stored in an ap_[u]fixed
variable is to use the C++ standard output stream, std::cout
(#include <iostream> or <iostream.h>). The stream
insertion operator, “<<“, is
overloaded to correctly output the full range of values possible for
any given ap_[u]fixed variable. The following
stream manipulators are also supported, allowing formatting of the value
as shown.
- dec (decimal)
- hex (hexadecimal)
- oct (octal)
#include <iostream.h>
// Alternative: #include <iostream>
ap_fixed<6,3, AP_RND, AP_WRAP> Val = 3.25;
cout << Val << endl; // Yields: 3.25
Using the Standard C Library
You can also use the standard C library (#include
<stdio.h>) to print out values larger than 64-bits:
- Convert the value to a C++ std::string using the ap_[u]fixed class method to_string().
- Convert the result to a null-terminated C character string using the std::string class method c_str().
Optional Argument One (Specifying the Radix)
You can pass the ap_[u]fixed::to_string() method an optional
argument specifying the radix of the numerical format desired. The valid radix
argument values are:
- 2 (binary) (default, as the to_string() example in this section shows)
- 8 (octal)
- 10 (decimal)
- 16 (hexadecimal)
Optional Argument Two (Printing as Signed Values)
A second optional argument to ap_[u]fixed::to_string() specifies
whether to print the non-decimal formats as signed values. This argument is boolean.
The default value is false, causing the non-decimal formats to be printed as
unsigned values.
ap_fixed<6,3, AP_RND, AP_WRAP> Val = 3.25;
printf("%s \n", Val.to_string().c_str()); // Yields: 0b011.010
printf("%s \n", Val.to_string(10).c_str()); // Yields: 3.25
The ap_[u]fixed types are supported by the following C++
manipulator functions:
- setprecision
- setw
- setfill
The setprecision manipulator sets the decimal precision to be used. It takes one
parameter n as the value of decimal precision, where
n specifies the maximum number of meaningful digits to
display in total (counting both those before and those after the decimal point).
The default value of n is 6, which is consistent with the native C
float type.
ap_fixed<64, 32> f =3.14159;
cout << setprecision (5) << f << endl;
cout << setprecision (9) << f << endl;
f = 123456;
cout << setprecision (5) << f << endl;
The example above displays the following results where the printed results are rounded when the actual precision exceeds the specified precision:
3.1416
3.14159
1.2346e+05
The setw manipulator:
- Sets the number of characters to be used for the field width.
- Takes one parameter w as the value of the width, where w determines the minimum number of characters to be written in some output representation.
If the standard width of the representation is shorter than the field width, the
representation is padded with fill characters. Fill characters are controlled by the
setfill manipulator which takes one parameter f as the
padding character.
For example, given:
ap_fixed<65,32> aa = 123456;
int precision = 5;
cout<<setprecision(precision)<<setw(13)<<setfill('T')<<aa<<endl;
The output is:
TTT1.2346e+05
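The same manipulators can be tried with the standard library alone. The sketch below uses a native double in place of the ap_fixed value (an assumption made so that the code compiles without the Vitis HLS headers), and format_value is a hypothetical helper that returns the formatted text instead of printing it.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Format a value using setprecision, setw, and setfill, and return the text.
// A native double stands in for the ap_fixed variable used in the example above.
std::string format_value(double value, int precision, int width, char fill) {
    std::ostringstream os;
    os << std::setprecision(precision)  // maximum number of meaningful digits
       << std::setw(width)              // minimum field width
       << std::setfill(fill)            // padding character
       << value;
    return os.str();
}
```

Calling format_value(123456, 5, 13, 'T') produces the ten characters 1.2346e+05 padded on the left with three T characters to reach the field width of 13.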
Expressions Involving ap_[u]fixed<> types
Arbitrary precision fixed-point values can participate in expressions that use any operators supported by C/C++. After an arbitrary precision fixed-point type or variable is defined, its usage is the same as for any floating point type or variable in the C/C++ languages.
Observe the following caveats:
- Zero and Sign Extensions
All values of smaller bit-width are zero or sign-extended depending on the sign of the source value. You may need to insert casts to obtain alternative signs when assigning smaller bit-widths to larger.
- Truncations
Truncation occurs when you assign an arbitrary precision fixed-point of larger bit-width than the destination variable.
Class Methods, Operators, and Data Members
In general, any valid operation that can be done on a native C/C++ integer
data type is supported (using operator overloading) for ap_[u]fixed
types. In addition to these overloaded operators, some class specific
operators and methods are included to ease bit-level operations.
Binary Arithmetic Operators
Addition
ap_[u]fixed::RType ap_[u]fixed::operator + (ap_[u]fixed op)
Adds an arbitrary precision fixed-point with a given operand
op.
The operand can be any of the following types:
- ap_[u]fixed
- ap_[u]int
- C/C++ integer types
The result type ap_[u]fixed::RType depends on the type information
of the two operands.
ap_fixed<76, 63> Result;
ap_fixed<5, 2> Val1 = 1.125;
ap_fixed<75, 62> Val2 = 6721.35595703125;
Result = Val1 + Val2; //Yields 6722.480957
Because Val2 has the
larger bit-width in both the integer part and the fraction part, the result type
has the same bit-widths as Val2 plus one extra integer bit, so that it can store all possible
result values.
Specifying the data's width controls resources by using the power functions, as shown below. In similar cases, Xilinx recommends specifying the width of the stored result instead of specifying the width of fixed point operations.
ap_ufixed<16,6> x=5;
ap_ufixed<16,7> y=hls::rsqrt<16,6>(x+x);
Subtraction
ap_[u]fixed::RType ap_[u]fixed::operator - (ap_[u]fixed op)
Subtracts an arbitrary precision fixed-point with a given operand
op.
The result type ap_[u]fixed::RType depends on the
type information of the two operands.
ap_fixed<76, 63> Result;
ap_fixed<5, 2> Val1 = 1625.153;
ap_fixed<75, 62> Val2 = 6721.355992351;
Result = Val2 - Val1; // Yields 6720.230957
Because Val2 has the larger bit-width in both the integer
part and the fraction part, the result type has the same bit-widths as Val2 plus
one extra integer bit, so that it can store all possible result values.
Multiplication
ap_[u]fixed::RType ap_[u]fixed::operator * (ap_[u]fixed op)
Multiplies an arbitrary precision fixed-point with a given operand
op.
ap_fixed<80, 64> Result;
ap_fixed<5, 2> Val1 = 1625.153;
ap_fixed<75, 62> Val2 = 6721.355992351;
Result = Val1 * Val2; // Yields 7561.525452
This shows the multiplication of Val1 and
Val2. The result type is the sum of their
integer part bit-width and their fraction part bit width.
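The width rules stated for addition, subtraction, and multiplication can be written out as compile-time arithmetic. This is a sketch of the rules as described in this section for two signed operands; FixedShape, add_shape, and mul_shape are hypothetical names, and any extra bit needed when signed and unsigned operands are mixed is not modeled.

```cpp
// Total width (w) and integer width (i) of a fixed-point type; fraction width is w - i.
struct FixedShape { int w; int i; };

constexpr int frac_bits(FixedShape s) { return s.w - s.i; }
constexpr int max_of(int a, int b) { return a > b ? a : b; }

// Addition and subtraction: widest integer part plus one bit, widest fraction part.
constexpr FixedShape add_shape(FixedShape a, FixedShape b) {
    return FixedShape{max_of(a.i, b.i) + 1 + max_of(frac_bits(a), frac_bits(b)),
                      max_of(a.i, b.i) + 1};
}

// Multiplication: integer widths add and fraction widths add.
constexpr FixedShape mul_shape(FixedShape a, FixedShape b) {
    return FixedShape{a.w + b.w, a.i + b.i};
}
```

For the operand types ap_fixed<5, 2> and ap_fixed<75, 62> used above, add_shape gives W = 76, I = 63 and mul_shape gives W = 80, I = 64, which are the Result types declared in the addition and multiplication examples.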
Division
ap_[u]fixed::RType ap_[u]fixed::operator / (ap_[u]fixed op)
Divides an arbitrary precision fixed-point by a given operand
op.
ap_fixed<84, 66> Result;
ap_fixed<5, 2> Val1 = 1625.153;
ap_fixed<75, 62> Val2 = 6721.355992351;
Result = Val2 / Val1; // Yields 5974.538628
This shows the division of Val2 by Val1. To preserve enough precision:
- The integer bit-width of the result type is the sum of the integer bit-width of Val2 and the fraction bit-width of Val1.
- The fraction bit-width of the result type is equal to the fraction bit-width of Val2.
Bitwise Logical Operators
Bitwise OR
ap_[u]fixed::RType ap_[u]fixed::operator | (ap_[u]fixed op)
Applies a bitwise OR operation on an arbitrary precision fixed-point and a
given operand op.
ap_fixed<75, 62> Result;
ap_fixed<5, 2> Val1 = 1625.153;
ap_fixed<75, 62> Val2 = 6721.355992351;
Result = Val1 | Val2; // Yields 6721.480957
Bitwise AND
ap_[u]fixed::RType ap_[u]fixed::operator & (ap_[u]fixed op)
Applies a bitwise AND operation on an arbitrary precision fixed-point and a
given operand op.
ap_fixed<75, 62> Result;
ap_fixed<5, 2> Val1 = 1625.153;
ap_fixed<75, 62> Val2 = 6721.355992351;
Result = Val1 & Val2; // Yields 1.00000
Bitwise XOR
ap_[u]fixed::RType ap_[u]fixed::operator ^ (ap_[u]fixed op)
Applies an xor bitwise operation on an arbitrary
precision fixed-point and a given operand op.
ap_fixed<75, 62> Result;
ap_fixed<5, 2> Val1 = 1625.153;
ap_fixed<75, 62> Val2 = 6721.355992351;
Result = Val1 ^ Val2; // Yields 6720.480957
Increment and Decrement Operators
Pre-Increment
ap_[u]fixed ap_[u]fixed::operator ++ ()
This operator function prefix increases an arbitrary precision fixed-point
variable by 1.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = ++Val1; // Yields 6.125000
Post-Increment
ap_[u]fixed ap_[u]fixed::operator ++ (int)
This operator function postfix:
- Increases an arbitrary precision fixed-point variable by 1.
- Returns the original value of this arbitrary precision fixed-point.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = Val1++; // Yields 5.125000
Pre-Decrement
ap_[u]fixed ap_[u]fixed::operator -- ()
This operator function prefix decreases this arbitrary precision fixed-point
variable by 1.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = --Val1; // Yields 4.125000
Post-Decrement
ap_[u]fixed ap_[u]fixed::operator -- (int)
This operator function postfix:
- Decreases this arbitrary precision fixed-point variable by 1.
- Returns the original value of this arbitrary precision fixed-point.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = Val1--; // Yields 5.125000
Unary Operators
Addition
ap_[u]fixed ap_[u]fixed::operator + ()
Returns a self copy of an arbitrary precision fixed-point variable.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = +Val1; // Yields 5.125000
Subtraction
ap_[u]fixed::RType ap_[u]fixed::operator - ()
Returns a negative value of an arbitrary precision fixed-point variable.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = -Val1; // Yields -5.125000
Equality Zero
bool ap_[u]fixed::operator ! ()
This operator function:
- Compares an arbitrary precision fixed-point variable with 0.
- Returns the result.
bool Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = !Val1; // Yields false
Bitwise Inverse
ap_[u]fixed::RType ap_[u]fixed::operator ~ ()
Returns a bitwise complement of an arbitrary precision fixed-point variable.
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = ~Val1; // Yields -5.25
Shift Operators
Unsigned Shift Left
ap_[u]fixed ap_[u]fixed::operator << (ap_uint<_W2> op)
This operator function:
- Shifts left by a given integer operand.
- Returns the result.
The operand can be a C/C++ integer type:
- char
- short
- int
- long
The return type of the shift left operation is the same width as the type being shifted.
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val = 5.375;
ap_uint<4> sh = 2;
Result = Val << sh; // Yields -10.5
The bit-width of the result is (W = 25, I
= 15). Because the shift left operation result type
is same as the type of Val:
- The high order two bits of
Valare shifted out. - The result is -10.5.
If a result of 21.5 is required, Val must be cast to
ap_fixed<10, 7> first, for
example, ap_fixed<10, 7>(Val).
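The loss of the high-order bits can be reproduced on the raw bit pattern. The sketch below is a plain C++ model with a hypothetical helper name: the ap_fixed<8, 5> value 5.375 is held as the raw integer 43 (5.375 multiplied by 8, because there are 3 fractional bits), and the shift result is cut back to 8 bits as the text describes. Shift amounts below 8 are assumed.

```cpp
#include <cstdint>

// Shift an 8-bit signed raw value left and keep only the low 8 bits, which is
// what happens when the result type has the same width as the shifted type.
constexpr std::int8_t shl8(std::int8_t raw, unsigned n) {
    return static_cast<std::int8_t>(
        static_cast<std::uint8_t>(static_cast<std::uint8_t>(raw) << n));
}
```

Here shl8(43, 2) is -84, and -84 / 8 is -10.5, the value shown above. Without the 8-bit cut, 43 shifted left by 2 is 172, and 172 / 8 is 21.5, the value obtained after casting to the wider type.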
Signed Shift Left
ap_[u]fixed ap_[u]fixed::operator << (ap_int<_W2> op)
This operator:
- Shifts left by a given integer operand.
- Returns the result.
The shift direction depends on whether the operand is positive or negative.
- If the operand is positive, a shift left is performed.
- If the operand is negative, a shift right (opposite direction) is performed.
The operand can be a C/C++ integer type:
- char
- short
- int
- long
The return type of the shift left operation is the same width as the type being shifted.
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val = 5.375;
ap_int<4> Sh = 2;
Result = Val << Sh; // Shift left, yields -10.5
Sh = -2;
Result = Val << Sh; // Shift right, yields 1.25
Unsigned Shift Right
ap_[u]fixed ap_[u]fixed::operator >> (ap_uint<_W2> op)
This operator function:
- Shifts right by a given integer operand.
- Returns the result.
The operand can be a C/C++ integer type:
- char
- short
- int
- long
The return type of the shift right operation is the same width as the type being shifted.
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val = 5.375;
ap_uint<4> sh = 2;
Result = Val >> sh; // Yields 1.25
If it is necessary to preserve all significant bits, extend fraction part
bit-width of the Val first, for example
ap_fixed<10, 5>(Val).
Signed Shift Right
ap_[u]fixed ap_[u]fixed::operator >> (ap_int<_W2> op)
This operator:
- Shifts right by a given integer operand.
- Returns the result.
The shift direction depends on whether the operand is positive or negative.
- If the operand is positive, a shift right is performed.
- If the operand is negative, a shift left (opposite direction) is performed.
The operand can be a C/C++ integer type (char, short, int, or long).
The return type of the shift right operation is the same width as the type being shifted. For example:
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val = 5.375;
ap_int<4> Sh = 2;
Result = Val >> Sh; // Shift right, yields 1.25
Sh = -2;
Result = Val >> Sh; // Shift left, yields -10.5
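The direction rule for signed shift amounts can be sketched as follows. This is an illustrative model in plain C++ on non-negative raw bit patterns; the function names are hypothetical and the wrap to the container width is left out.

```cpp
#include <cassert>
#include <cstdint>

// A negative shift amount reverses the direction of the shift.
inline int64_t shift_left_signed(int64_t raw, int sh) {
    return (sh >= 0) ? (raw << sh) : (raw >> -sh);
}

inline int64_t shift_right_signed(int64_t raw, int sh) {
    return shift_left_signed(raw, -sh);
}

void signed_shift_example() {
    const int64_t raw = 43;                        // 5.375 with 3 fraction bits
    assert(shift_right_signed(raw, 2) == 10);      // 10 / 8 = 1.25
    assert(shift_left_signed(raw, -2) == 10);      // same result, opposite operator
    assert(shift_right_signed(raw, -2) == 172);    // 172 / 8 = 21.5 before any wrap
}
```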
Relational Operators
Equality
bool ap_[u]fixed::operator == (ap_[u]fixed op)
This operator compares the arbitrary precision fixed-point variable with a given operand.
Returns true if they are equal and
false if they are not equal.
The type of operand op can be
ap_[u]fixed, ap_int
or C/C++ integer types. For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25; // Overflows the 4 integer bits and wraps to 1.25
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 == Val2; // Yields true
Result = Val1 == Val3; // Yields false
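The comparison results in this section depend on the value that each variable actually holds after initialization. The sketch below is a plain C++ model, assuming wrap-around overflow and truncation of non-negative inputs; store_wrapped is a hypothetical helper. It shows that 17.25 stored in 9 bits with 4 integer bits reads back as 1.25.

```cpp
#include <cassert>
#include <cstdint>

// Store value in a signed fixed-point container with W total bits and
// I integer bits, wrapping on overflow, and return the stored value.
inline double store_wrapped(double value, int W, int I) {
    const int F = W - I;
    int64_t raw = static_cast<int64_t>(value * (int64_t(1) << F));
    raw &= (int64_t(1) << W) - 1;                              // keep the low W bits
    if (raw & (int64_t(1) << (W - 1))) raw -= (int64_t(1) << W);  // sign-extend
    return static_cast<double>(raw) / (int64_t(1) << F);
}

void wrap_example() {
    assert(store_wrapped(17.25, 9, 4) == 1.25);   // integer bits overflow and wrap
    assert(store_wrapped(3.25, 10, 5) == 3.25);   // fits, stored unchanged
    assert(store_wrapped(17.25, 9, 4) == store_wrapped(1.25, 8, 5));
}
```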
Inequality
bool ap_[u]fixed::operator != (ap_[u]fixed op)
This operator compares this arbitrary precision fixed-point variable with a given operand.
Returns true if they are not equal and
false if they are equal.
The type of operand op can be:
- ap_[u]fixed
- ap_int
- C or C++ integer types
For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 != Val2; // Yields false
Result = Val1 != Val3; // Yields true
Greater than or equal to
bool ap_[u]fixed::operator >= (ap_[u]fixed op)
This operator compares a variable with a given operand.
Returns true if they are equal or if the variable is
greater than the operand and false otherwise.
The type of operand op can be
ap_[u]fixed, ap_int
or C/C++ integer types.
For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 >= Val2; // Yields true
Result = Val1 >= Val3; // Yields false
Less than or equal to
bool ap_[u]fixed::operator <= (ap_[u]fixed op)
This operator compares a variable with a given operand, and returns
true if it is equal to or less than the
operand and false if not.
The type of operand op can be ap_[u]fixed,
ap_int or C/C++ integer types.
For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 <= Val2; // Yields true
Result = Val1 <= Val3; // Yields true
Greater than
bool ap_[u]fixed::operator > (ap_[u]fixed op)
This operator compares a variable with a given operand, and returns
true if it is greater than the operand and
false if not.
The type of operand op can be
ap_[u]fixed,
ap_int, or C/C++ integer types.
For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 > Val2; // Yields false
Result = Val1 > Val3; // Yields false
Less than
bool ap_[u]fixed::operator < (ap_[u]fixed op)
This operator compares a variable with a given operand, and returns
true if it is less than the operand and
false if not.
The type of operand op can be ap_[u]fixed,
ap_int, or C/C++ integer types. For
example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 < Val2; // Yields false
Result = Val1 < Val3; // Yields true
Bit Operator
Bit-Select and Set
af_bit_ref ap_[u]fixed::operator [] (int bit)
This operator selects one bit from an arbitrary precision fixed-point value and returns it.
The returned value is a reference value that can set or clear the
corresponding bit in the ap_[u]fixed variable.
The bit argument must be an integer value and it specifies the index of
the bit to select. The least significant bit has index
0. The highest permissible index is one
less than the bit-width of this ap_[u]fixed
variable.
The result type is af_bit_ref with a value of either
0 or 1. For
example:
ap_fixed<8, 5> Value = 1.375;
Value[3]; // Yields 1
Value[4]; // Yields 0
Value[2] = 1; // Yields 1.875
Value[3] = 0; // Yields 0.875
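The values in this example follow from the raw bit pattern of 1.375 with three fraction bits (00001.011). A minimal sketch in plain C++ on an 8-bit pattern, with hypothetical helper names and non-negative values only:

```cpp
#include <cassert>
#include <cstdint>

inline bool get_bit(uint8_t raw, int bit) { return ((raw >> bit) & 1u) != 0; }

inline uint8_t put_bit(uint8_t raw, int bit, bool value) {
    const uint8_t mask = static_cast<uint8_t>(1u << bit);
    return value ? static_cast<uint8_t>(raw | mask)
                 : static_cast<uint8_t>(raw & ~mask);
}

// Interpret a non-negative raw pattern with frac_bits fraction bits.
inline double to_real(uint8_t raw, int frac_bits) {
    return static_cast<double>(raw) / (1 << frac_bits);
}

void bit_select_example() {
    uint8_t raw = 11;                 // 1.375 -> 00001.011
    assert(get_bit(raw, 3));          // bit 3 is 1
    assert(!get_bit(raw, 4));         // bit 4 is 0
    raw = put_bit(raw, 2, true);      // 00001.111
    assert(to_real(raw, 3) == 1.875);
    raw = put_bit(raw, 3, false);     // 00000.111
    assert(to_real(raw, 3) == 0.875);
}
```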
Bit Range
af_range_ref ap_[u]fixed::range (unsigned Hi, unsigned Lo)
af_range_ref ap_[u]fixed::operator () (unsigned Hi, unsigned Lo)
This operation is similar to bit-select operator [] except that it operates on a range of bits instead of a single bit.
It selects a group of bits from the arbitrary precision fixed-point
variable. The Hi argument provides the upper
range of bits to be selected. The Lo argument
provides the lowest bit to be selected. If Lo is
larger than Hi the bits selected are returned in
the reverse order.
The return type af_range_ref represents a reference
in the range of the ap_[u]fixed variable
specified by Hi and Lo.
For example:
ap_uint<4> Result = 0;
ap_ufixed<4, 2> Value = 1.25;
ap_uint<8> Repl = 0xAA;
Result = Value.range(3, 0); // Yields: 0x5
Value(3, 0) = Repl(3, 0); // Value now holds bits 1010, which is 2.5 for the unsigned type
// When Lo > Hi, the bits are returned in reverse order
Result = Value.range(0, 3); // Yields: 0x5 (bits 1010 reversed)
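Range selection and the reversed form can be modeled on a raw bit pattern. The sketch below is plain C++ with a hypothetical helper range_model; it illustrates the ordering rule and is not the library implementation.

```cpp
#include <cassert>
#include <cstdint>

// Return bits hi..lo of raw. If lo > hi, return the same bits in reverse order.
inline uint32_t range_model(uint32_t raw, int hi, int lo) {
    if (hi >= lo) {
        const int n = hi - lo + 1;
        const uint32_t mask = (n == 32) ? ~0u : ((1u << n) - 1u);
        return (raw >> lo) & mask;
    }
    uint32_t out = 0;
    for (int b = hi; b <= lo; ++b)            // lowest selected bit becomes the MSB
        out = (out << 1) | ((raw >> b) & 1u);
    return out;
}

void range_example() {
    assert(range_model(0x5, 3, 0) == 0x5);    // 0101 read in normal order
    assert(range_model(0x5, 0, 3) == 0xA);    // 0101 reversed is 1010
    assert(range_model(0xA, 0, 3) == 0x5);    // 1010 reversed is 0101
}
```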
Range Select
af_range_ref ap_[u]fixed::range ()
af_range_ref ap_[u]fixed::operator () ()
This operation is the special case of the range select operator
(). It selects all bits from this
arbitrary precision fixed-point value in the normal order.
The return type af_range_ref represents a reference to the range specified by Hi = W - 1 and Lo = 0. For example:
ap_uint<4> Result = 0;
ap_ufixed<4, 2> Value = 1.25;
ap_uint<8> Repl = 0xAA;
Result = Value.range(); // Yields: 0x5
Value() = Repl(3, 0); // Value now holds bits 1010, which is 2.5 for the unsigned type
Length
int ap_[u]fixed::length ()
This function returns an integer value that provides the number of bits in an arbitrary precision fixed-point value. It can be used with a type or a value. For example:
ap_ufixed<128, 64> My128APFixed;
int bitwidth = My128APFixed.length(); // Yields 128
Explicit Conversion Methods
Fixed to Double
double ap_[u]fixed::to_double ()
This member function returns this fixed-point value in IEEE double-precision format. For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
double Result;
Result = MyAPFixed.to_double(); // Yields 333.789
Fixed to Float
float ap_[u]fixed::to_float()
This member function returns this fixed-point value in IEEE single-precision (float) format. For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
float Result;
Result = MyAPFixed.to_float(); // Yields 333.789
Fixed to Half-Precision Floating Point
half ap_[u]fixed::to_half()
This member function returns this fixed-point value in HLS half-precision (16-bit) floating-point format. For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
half Result;
Result = MyAPFixed.to_half(); // Yields approximately 333.75 (half precision resolves steps of 0.25 at this magnitude)
Fixed to ap_int
ap_int ap_[u]fixed::to_ap_int ()
This member function explicitly converts this fixed-point
value to ap_int that captures all
integer bits (fraction bits are truncated). For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
ap_uint<77> Result;
Result = MyAPFixed.to_ap_int(); //Yields 333
Fixed to Integer
int ap_[u]fixed::to_int ()
unsigned ap_[u]fixed::to_uint ()
ap_slong ap_[u]fixed::to_int64 ()
ap_ulong ap_[u]fixed::to_uint64 ()
These member functions explicitly convert this fixed-point value to C built-in integer types. For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
unsigned int Result;
Result = MyAPFixed.to_uint(); //Yields 333
unsigned long long Result64;
Result64 = MyAPFixed.to_uint64(); //Yields 333
Use the member functions above to convert ap_[u]fixed to other data types.
Compile Time Access to Data Type Attributes
The ap_[u]fixed<>
types are provided with several static members that allow the size and
configuration of data types to be determined at compile time. The data
type is provided with the static const members: width, iwidth, qmode and
omode:
static const int width = _AP_W;
static const int iwidth = _AP_I;
static const ap_q_mode qmode = _AP_Q;
static const ap_o_mode omode = _AP_O;
You can use these data members to extract the following information from any
existing ap_[u]fixed<> data type:
- width: The width of the data type.
- iwidth: The width of the integer part of the data type.
- qmode: The quantization mode of the data type.
- omode: The overflow mode of the data type.
For example, you can use these data members to extract the data width of an
existing ap_[u]fixed<> data type to create
another ap_[u]fixed<> data type at compile
time.
The following example shows how the size of variable
Res is automatically defined as 1-bit greater
than variables Val1 and Val2 with the same quantization
modes:
// Definition of basic data type
#define INPUT_DATA_WIDTH 12
#define IN_INTG_WIDTH 6
#define IN_QMODE AP_RND_ZERO
#define IN_OMODE AP_WRAP
typedef ap_fixed<INPUT_DATA_WIDTH, IN_INTG_WIDTH, IN_QMODE, IN_OMODE> data_t;
// Definition of variables
data_t Val1, Val2;
// Res is automatically sized at compile time to be 1-bit greater than INPUT_DATA_WIDTH
// The bit growth in Res will be in the integer bits
ap_fixed<data_t::width+1, data_t::iwidth+1, data_t::qmode, data_t::omode> Res = Val1 + Val2;
This ensures that Vitis HLS correctly
models the bit-growth caused by the addition even if you update the
value of INPUT_DATA_WIDTH, IN_INTG_WIDTH, or the quantization modes for
data_t.
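The same compile-time pattern can be tried without the Vitis HLS headers. The sketch below uses a hypothetical stand-in template, fixed_shape, that carries only the width and iwidth members; it illustrates how a derived type tracks a change to the base type and does not model fixed-point arithmetic.

```cpp
#include <cassert>

// Hypothetical stand-in for a fixed-point type: compile-time attributes only.
template <int W, int I>
struct fixed_shape {
    static const int width = W;
    static const int iwidth = I;
};

typedef fixed_shape<12, 6> data_shape;

// One extra integer bit to hold the carry of an addition.
typedef fixed_shape<data_shape::width + 1, data_shape::iwidth + 1> sum_shape;

static_assert(sum_shape::width == 13, "sum is one bit wider");
static_assert(sum_shape::iwidth == 7, "growth is in the integer part");
static_assert(sum_shape::width - sum_shape::iwidth ==
              data_shape::width - data_shape::iwidth,
              "fraction bits are unchanged");
```

Changing the parameters of data_shape resizes sum_shape automatically, which is the property the static members provide.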
Vitis HLS Math Library
The Vitis HLS Math Library (hls_math.h) provides support for the synthesis of the standard C (math.h) and C++ (cmath) libraries and is automatically used to specify the math operations during synthesis. The support includes floating point (single-precision, double-precision and half-precision) for all functions and fixed-point support for some functions.
The hls_math.h library can optionally be used in C++ source code in place of the standard C++ math library (cmath), but it cannot be used in C source code. Vitis HLS will use the appropriate simulation implementation to avoid accuracy differences between C simulation and C/RTL co-simulation.
HLS Math Library Accuracy
The HLS math functions are implemented as synthesizable bit-approximate functions from the hls_math.h library. Bit-approximate HLS math library functions do not provide the same accuracy as the standard C function. To achieve the desired result, the bit-approximate implementation might use a different underlying algorithm than the standard C math library version. The accuracy of the function is specified in terms of ULP (Unit of Least Precision). This difference in accuracy has implications for both C simulation and C/RTL co-simulation.
The ULP difference is typically in the range of 1-4 ULP.
- If the standard C math library is used in the C source code, there may be a difference between the C simulation and the C/RTL co-simulation due to the fact that some functions exhibit a ULP difference from the standard C math library.
- If the HLS math library is used in the C source code, there will be no difference between the C simulation and the C/RTL co-simulation. A C simulation using the HLS math library may, however, differ from a C simulation using the standard C math library.
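A difference expressed in ULP can be measured in a test bench by comparing the bit patterns of two single-precision values. The following is a sketch in standard C++, assuming finite IEEE 754 single-precision inputs; ordered_bits, ulp_distance, and within_ulps are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Map a float bit pattern to an integer that increases with the float value.
inline int64_t ordered_bits(float x) {
    int32_t i;
    std::memcpy(&i, &x, sizeof i);
    return (i < 0) ? (static_cast<int64_t>(INT32_MIN) - i) : static_cast<int64_t>(i);
}

// Number of representable floats between a and b (0 when they are equal).
inline int64_t ulp_distance(float a, float b) {
    return std::llabs(ordered_bits(a) - ordered_bits(b));
}

inline bool within_ulps(float result, float reference, int64_t max_ulps) {
    return ulp_distance(result, reference) <= max_ulps;
}

void ulp_example() {
    const float ref = 1.0f;
    const float next = std::nextafter(ref, 2.0f);  // adjacent representable float
    assert(ulp_distance(ref, next) == 1);
    assert(within_ulps(next, ref, 4));
    assert(!within_ulps(1.001f, ref, 4));          // thousands of ULP away
}
```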
In addition, the following seven functions might show some differences, depending on the C standard used to compile and run the C simulation:
- copysign
- fpclassify
- isinf
- isfinite
- isnan
- isnormal
- signbit
C90 mode
Only isinf, isnan, and copysign are usually provided by the system header files, and they
operate on doubles. In particular, copysign
always returns a double result. This might result in unexpected results after
synthesis if it must be returned to a float, because a double-to-float conversion
block is introduced into the hardware.
C99 mode (-std=c99)
All seven functions are usually provided under the expectation that
the system header files will redirect them to __isnan(double) and __isnan(float). The usual GCC header files do not redirect
isnormal, but implement it in terms of
fpclassify.
C++ Using math.h
All seven are provided by the system header files, and they operate on doubles.
copysign always returns a
double result. This might cause unexpected results after synthesis if it must be
returned to a float, because a double-to-float conversion block is introduced into
the hardware.
C++ Using cmath
Similar to C99 mode (-std=c99), except that:
- The system header files are usually different.
- The functions are properly overloaded for float and double arguments, for example isnan(double) and isinf(double).
copysign and copysignf are handled as built-ins even when using namespace std;.
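The return type of copysign in C++ can be checked at compile time. This sketch relies only on the standard cmath overloads; with_sign_of is a hypothetical wrapper added for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// float arguments select the float overload, so no double is involved.
static_assert(std::is_same<decltype(std::copysign(1.0f, -1.0f)), float>::value,
              "float arguments give a float result");
static_assert(std::is_same<decltype(std::copysign(1.0, -1.0)), double>::value,
              "double arguments give a double result");
// Mixing float and double promotes the whole call to double.
static_assert(std::is_same<decltype(std::copysign(1.0f, -1.0)), double>::value,
              "a mixed call is promoted to double");

inline float with_sign_of(float magnitude, float sign_source) {
    return std::copysign(magnitude, sign_source);
}
```

Keeping both arguments as float avoids the double result, and with it the conversion back to float.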
C++ Using cmath and namespace std
No issues. Xilinx recommends using the following for best results:
- -std=c99 for C
- -fno-builtin for C and C++
To specify C compile options such as -std=c99, use the Tcl command add_files with the -cflags option. Alternatively, use the Edit CFLAGs button in the Project Settings dialog box.
The HLS Math Library
The following functions are provided in the HLS math library. Each
function supports half-precision (type half),
single-precision (type float) and double precision
(type double).
For each function func listed below, there is also an associated half-precision only function named half_func and single-precision only function named funcf provided in the library.
When mixing half-precision, single-precision and double-precision data types, check for common synthesis errors to prevent introducing type-conversion hardware in the final FPGA implementation.
Trigonometric Functions
| acos | acospi | asin | asinpi |
| atan | atan2 | atan2pi | cos |
| cospi | sin | sincos | sinpi |
| tan | tanpi |
Hyperbolic Functions
| acosh | asinh | atanh | cosh |
| sinh | tanh |
Exponential Functions
| exp | exp10 | exp2 | expm1 |
| frexp | ldexp | modf |
Logarithmic Functions
| ilogb | log | log10 | log1p |
Power Functions
| cbrt | hypot | pow | rsqrt |
| sqrt |
Error Functions
| erf | erfc |
Rounding Functions
| ceil | floor | llrint | llround |
| lrint | lround | nearbyint | rint |
| round | trunc |
Remainder Functions
| fmod | remainder | remquo |
Floating-point
| copysign | nan | nextafter | nexttoward |
Difference Functions
| fdim | fmax | fmin | maxmag |
| minmag |
Other Functions
| abs | divide | fabs | fma |
| fract | mad | recip |
Classification Functions
| fpclassify | isfinite | isinf | isnan |
| isnormal | signbit |
Comparison Functions
| isgreater | isgreaterequal | isless | islessequal |
| islessgreater | isunordered |
Relational Functions
| all | any | bitselect | isequal |
| isnotequal | isordered | select |
Fixed-Point Math Functions
Fixed-point implementations are also provided for the following math functions.
All fixed-point math functions support ap_[u]fixed and ap_[u]int data types with the following bit-width specifications:
- ap_fixed<W,I> where I<=33 and W-I<=32
- ap_ufixed<W,I> where I<=32 and W-I<=32
- ap_int<I> where I<=33
- ap_uint<I> where I<=32
Trigonometric Functions
| cos | sin | tan | acos | asin | atan | atan2 | sincos |
| cospi | sinpi |
Hyperbolic Functions
| cosh | sinh | tanh | acosh | asinh | atanh |
Exponential Functions
| exp | frexp | modf | exp2 | expm1 |
Logarithmic Functions
| log | log10 | ilogb | log1p |
Power Functions
| pow | sqrt | rsqrt | cbrt | hypot |
Error Functions
| erf | erfc |
Rounding Functions
| ceil | floor | trunc | round | rint | nearbyint |
Floating Point
| nextafter | nexttoward |
Difference Functions
| fdim | fmax | fmin | maxmag | minmag |
Other Functions
| fabs | recip | abs | fract | divide |
Classification Functions
| signbit |
Comparison Functions
| isgreater | isgreaterequal | isless | islessequal | islessgreater |
Relational Functions
| isequal | isnotequal | any | all | bitselect |
The fixed-point type provides a slightly less accurate version of the function value, but a smaller and faster RTL implementation.
The methodology for implementing a math function with fixed-point data types is:
- Determine if a fixed-point implementation is supported.
- Update the math functions to use ap_fixed types.
- Perform C simulation to validate the design still operates with the required precision. The C simulation is performed using the same bit-accurate types as the RTL implementation.
- Synthesize the design.
For example, a fixed-point implementation of the function sin is specified by using fixed-point types with the
math function as follows:
#include "hls_math.h"
#include "ap_fixed.h"
ap_fixed<32,6> my_input, my_output; // 6 integer bits so that 24.675 does not overflow
my_input = 24.675;
my_output = sin(my_input);
When using fixed-point math functions, the result type must have the same width and integer bits as the input.
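When validating precision, it helps to know the error introduced by the output format alone. The sketch below is plain C++ and does not model the HLS sin algorithm: it only rounds a double-precision reference to a chosen number of fraction bits. The helper names and the choice of 26 fraction bits are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Round x to the nearest multiple of 2^-frac_bits.
inline double quantize(double x, int frac_bits) {
    const double scale = std::ldexp(1.0, frac_bits);
    return std::round(x * scale) / scale;
}

// Error caused only by storing sin(angle) with frac_bits fraction bits.
inline double sin_quantization_error(double angle, int frac_bits) {
    const double reference = std::sin(angle);
    return std::fabs(reference - quantize(reference, frac_bits));
}

void quantization_example() {
    // Rounding to nearest is never off by more than half of one LSB.
    assert(sin_quantization_error(24.675, 26) <= std::ldexp(1.0, -27));
    assert(quantize(0.3, 2) == 0.25);   // 0.3 on a grid of 0.25
}
```

A synthesized fixed-point function cannot be more accurate than this bound, so it is a reasonable starting point when choosing a test bench tolerance.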
Verification and Math Functions
If the standard C math library is used in the C source code, the C simulation results and the C/RTL co-simulation results may be different: if any of the math functions in the source code have an ULP difference from the standard C math library it may result in differences when the RTL is simulated.
If the hls_math.h library is used in the C source code, the C simulation and C/RTL co-simulation results are identical. However, the results of C simulation using hls_math.h are not the same as those using the standard C libraries. The hls_math.h library simply ensures the C simulation matches the C/RTL co-simulation results. In both cases, the same RTL implementation is created. The following explains each of the possible options which are used to perform verification when using math functions.
Verification Option 1: Standard Math Library and Verify Differences
In this option, the standard C math libraries are used in the source code. If any of the synthesized functions do not have exact accuracy, the C/RTL co-simulation results differ from the C simulation results. The following example highlights this approach.
#include <cmath>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <cstdlib>
using namespace std;
typedef float data_t;
data_t cpp_math(data_t angle) {
data_t s = sinf(angle);
data_t c = cosf(angle);
return sqrtf(s*s+c*c);
}
In this case, the results between C simulation and C/RTL co-simulation are different. Keep in mind when comparing the outputs of simulation, any results written from the test bench are written to the working directory where the simulation executes:
- C simulation: Folder <project>/<solution>/csim/build
- C/RTL co-simulation: Folder <project>/<solution>/sim/<RTL>
where <project> is the project folder,
<solution> is the name of the solution folder and
<RTL> is the type of RTL verified (verilog or vhdl). The following
figure shows a typical comparison of the pre-synthesis results file on the left-hand side and
the post-synthesis RTL results file on the right-hand side. The output is shown in the third
column.
The results of pre-synthesis simulation and post-synthesis simulation differ by fractional amounts. You must decide whether these fractional amounts are acceptable in the final RTL implementation.
The recommended flow for handling these differences is using a test bench that
checks the results to ensure that they lie within an acceptable error range. This can be
accomplished by creating two versions of the same function, one for synthesis and one as a
reference version. In this example, only function cpp_math is synthesized.
#include <cmath>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <cstdlib>
using namespace std;
typedef float data_t;
data_t cpp_math(data_t angle) {
data_t s = sinf(angle);
data_t c = cosf(angle);
return sqrtf(s*s+c*c);
}
data_t cpp_math_sw(data_t angle) {
data_t s = sinf(angle);
data_t c = cosf(angle);
return sqrtf(s*s+c*c);
}
The test bench to verify the design compares the outputs of both functions to
determine the difference, using variable diff in the
following example. During C simulation both functions produce identical outputs. During C/RTL
co-simulation function cpp_math produces different
results and the difference in results are checked.
int main() {
data_t angle = 0.01;
data_t output, exp_output, diff;
int retval=0;
for (data_t i = 0; i <= 250; i++) {
output = cpp_math(angle);
exp_output = cpp_math_sw(angle);
// Check for differences
diff = ( (exp_output > output) ? exp_output - output : output - exp_output);
if (diff > 0.0000005) {
printf("Difference %.10f exceeds tolerance at angle %.10f \n", diff, angle);
retval=1;
}
angle = angle + .1;
}
if (retval != 0) {
printf("Test failed !!!\n");
retval=1;
} else {
printf("Test passed !\n");
}
// Return 0 if the test passes
return retval;
}
If the margin of difference is lowered to 0.00000005, this test bench highlights the margin of error during C/RTL co-simulation:
Difference 0.0000000596 at angle 1.1100001335
Difference 0.0000000596 at angle 1.2100001574
Difference 0.0000000596 at angle 1.5100002289
Difference 0.0000000596 at angle 1.6100002527
etc..
When using the standard C math libraries (math.h and cmath), create a “smart” test bench to verify that any differences in accuracy are acceptable.
Verification Option 2: HLS Math Library and Validate Differences
An alternative verification option is to convert the source code to use the HLS math library. With this option, there are no differences between the C simulation and C/RTL co-simulation results. The following example shows how the code above is modified to use the hls_math.h library.
- Include the hls_math.h header file.
- Replace the math functions with the equivalent hls:: function.
#include <cmath>
#include "hls_math.h"
#include <fstream>
#include <iostream>
#include <iomanip>
#include <cstdlib>
using namespace std;
typedef float data_t;
data_t cpp_math(data_t angle) {
data_t s = hls::sinf(angle);
data_t c = hls::cosf(angle);
return hls::sqrtf(s*s+c*c);
}
Verification Option 3: HLS Math Library File and Validate Differences
Including the HLS math library file lib_hlsm.cpp as a design file ensures Vitis HLS uses the HLS math library for C simulation. This option is identical to option 2; however, it does not require the C code to be modified.
The HLS math library file is located in the src directory in the Vitis HLS installation area. Simply copy the file to your local folder and add the file as a standard design file.
As with option 2, with this option there is now a difference between the C simulation results using the HLS math library file and those previously obtained without adding this file. These differences should be validated with C simulation using a “smart” test bench similar to option 1.
Common Synthesis Errors
The following are common use errors when synthesizing math functions. These are often (but not exclusively) caused by converting C functions to C++ to take advantage of synthesis for math functions.
C++ cmath
If the C++ cmath header file is used, the floating point functions (for example, sinf and cosf) can be used. These result in 32-bit operations in hardware. The cmath header file also overloads the standard functions (for example, sin and cos) so they can be used for float and double types.
C math.h
If the C math.h library is used, the single-precision functions (for
example, sinf and cosf) are required
to synthesize 32-bit floating point operations. All standard function calls (for
example, sin and cos) result in
doubles and 64-bit double-precision operations being synthesized.
Cautions
When converting C functions to C++ to take advantage of math.h support, be sure that the new C++ code compiles correctly before synthesizing with Vitis HLS. For example, if sqrtf() is used in the code with math.h, the following extern declaration must be added to the C++ code to support it:
#include <math.h>
extern "C" float sqrtf(float);
To avoid unnecessary hardware caused by type conversion, follow the warnings on mixing double and float types discussed in Floats and Doubles.
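A common source of unintended double-precision hardware is a double literal in an otherwise single-precision expression. The sketch below uses only standard C++ type rules; scale_single and scale_mixed are hypothetical functions written for illustration.

```cpp
#include <cassert>
#include <type_traits>

// A float times a float literal stays in single precision.
static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value,
              "float * float is float");
// A float times a double literal is computed in double precision.
static_assert(std::is_same<decltype(1.0f * 0.5), double>::value,
              "float * double is double");

inline float scale_single(float x) { return x * 0.5f; }

// Same numeric result here, but the multiply is performed in double
// and then narrowed back to float.
inline float scale_mixed(float x) { return static_cast<float>(x * 0.5); }
```

In synthesis, the second form would be expected to add the float-to-double and double-to-float conversions around a double-precision multiplier.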
HLS Stream Library
Streaming data is a type of data transfer in which data samples are sent in sequential order starting from the first sample. Streaming requires no address management.
Modeling designs that use streaming data can be difficult in C. The approach of using pointers to perform multiple read and/or write accesses can introduce issues, because there are implications for the type qualifier and how the test bench is constructed.
Vitis HLS provides a C++ template class hls::stream<> for modeling streaming data structures. The streams implemented with the hls::stream<> class have the following attributes.
- In the C code, an hls::stream<> behaves like a FIFO of infinite depth. There is no requirement to define the size of an hls::stream<>.
- They are read from and written to sequentially. That is, after data is read from an hls::stream<>, it cannot be read again.
- An hls::stream<> on the top-level interface is by default implemented with an ap_fifo interface.
- An hls::stream<> internal to the design is implemented as a FIFO with a depth of 2. The optimization directive STREAM is used to change this default size.
This section shows how the hls::stream<> class can more easily model designs with streaming data. The topics in this section provide:
- An overview of modeling with streams and the RTL implementation of streams.
- Rules for global stream variables.
- How to use streams.
- Blocking reads and writes.
- Non-Blocking Reads and writes.
- Controlling the FIFO depth.
The hls::stream class should always be passed between functions as a C++ reference argument. For example, &my_stream.
The hls::stream class is only used in C++ designs. Arrays of streams are not supported.
C Modeling and RTL Implementation
Streams are modeled as an infinite queue in software (and in the test bench during RTL co-simulation). There is no need to specify any depth to simulate streams in C++. Streams can be used inside functions and on the interface to functions. Internal streams may be passed as function parameters.
Streams can be used only in C++ based designs. Each hls::stream<> object must be written by a single process and read by a single process.
If an hls::stream is used on the top-level interface, it is by default implemented in the RTL as a FIFO interface (ap_fifo) but may be optionally implemented as a handshake interface (ap_hs) or an AXI-Stream interface (axis).
If an hls::stream is used inside the design
function and synthesized into hardware, it is implemented as a
FIFO with a default depth of 2. In some cases, such as when
interpolation is used, the depth of the FIFO might have to be
increased to ensure the FIFO can hold all the elements produced
by the hardware. Failure to ensure the FIFO is large enough to
hold all the data samples generated by the hardware can result
in a stall in the design (seen in C/RTL co-simulation and in the
hardware implementation). The depth of the FIFO can be adjusted
using the STREAM directive with the depth
option. An example of this is provided in the example design
hls_stream.
Ensure hls::stream variables are correctly sized when used in the default non-DATAFLOW regions.
If an hls::stream is used to transfer data between tasks (sub-functions or loops), you should immediately consider implementing the tasks in a DATAFLOW region where data streams from one task to the next. The default (non-DATAFLOW) behavior is to complete each task before starting the next task, in which case the FIFOs used to implement the hls::stream variables must be sized to ensure they are large enough to hold all the data samples generated by the producer task. Failure to increase the size of the hls::stream variables results in the error below:
ERROR: [XFORM 203-733] An internal stream xxxx.xxxx.V.user.V' with default size is
used in a non-dataflow region, which may result in deadlock. Please consider to
resize the stream using the directive 'set_directive_stream' or the 'HLS stream'
pragma.
This error informs you that, in a non-DATAFLOW region, the default FIFO depth of 2 may not be large enough to hold all the data samples written to the FIFO by the producer task.
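The sizing requirement can be estimated with a small occupancy simulation. The sketch below is an idealized plain C++ model, assuming one write and one read per cycle with no back-pressure; peak_occupancy and its parameters are hypothetical names, not part of any tool flow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Peak number of samples held in the FIFO when the producer writes one
// sample per cycle and the consumer starts reading at cycle consumer_start.
inline std::size_t peak_occupancy(std::size_t n_samples, std::size_t consumer_start) {
    std::size_t written = 0, read = 0, peak = 0;
    for (std::size_t cycle = 0; read < n_samples; ++cycle) {
        if (written < n_samples) ++written;                     // producer
        peak = std::max(peak, written - read);
        if (cycle >= consumer_start && read < written) ++read;  // consumer
    }
    return peak;
}

void occupancy_example() {
    assert(peak_occupancy(16, 16) == 16);  // sequential tasks: every sample is buffered
    assert(peak_occupancy(16, 0) == 1);    // fully overlapped tasks
    assert(peak_occupancy(16, 1) == 2);    // consumer starts one cycle late
}
```

In the sequential case the required depth equals the number of samples produced, which is why the default depth is rejected outside a DATAFLOW region.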
Global and Local Streams
Streams may be defined either locally or globally. Local streams are always implemented as internal FIFOs. Global streams can be implemented as internal FIFOs or ports:
- Globally-defined streams that are only read from, or only written to, are inferred as external ports of the top-level RTL block.
- Globally-defined streams that are both read from and written to (in the hierarchy below the top-level function) are implemented as internal FIFOs.
Streams defined in the global scope follow the same rules as any other global variables.
Using HLS Streams
To use hls::stream<> objects, include
the header file hls_stream.h. Streaming data objects are
defined by specifying the type and variable name. In this example, a 128-bit unsigned
integer type is defined and used to create a stream variable called my_wide_stream.
#include "ap_int.h"
#include "hls_stream.h"
typedef ap_uint<128> uint128_t; // 128-bit user defined type
hls::stream<uint128_t> my_wide_stream; // A stream declaration
Streams must use scoped naming. Xilinx
recommends using the scoped hls:: naming shown in the
example above. However, if you want to use the hls
namespace, you can rewrite the preceding example as:
#include <ap_int.h>
#include <hls_stream.h>
using namespace hls;
typedef ap_uint<128> uint128_t; // 128-bit user defined type
stream<uint128_t> my_wide_stream; // hls:: no longer required
Given a stream specified as hls::stream<T>, the type T may be:
- Any C++ native data type
- A Vitis HLS arbitrary precision type (for example, ap_int<>, ap_ufixed<>)
- A user-defined struct containing either of the above types
A stream can also be specified as hls::stream<Type, Depth>, where Depth indicates the depth of the FIFO
needed in the verification adapter that the HLS tool creates for RTL co-simulation.
Streams may be optionally named. Providing a name for the stream allows the name to be used in reporting. For example, Vitis HLS automatically checks to ensure all elements from an input stream are read during simulation. Given the following two streams:
stream<uint8_t> bytestr_in1;
stream<uint8_t> bytestr_in2("input_stream2");
Any warnings on elements left in the streams are reported as follows, where
it is clear which message relates to bytestr_in2:
WARNING: Hls::stream 'hls::stream<unsigned char>.1' contains leftover data, which
may result in RTL simulation hanging.
WARNING: Hls::stream 'input_stream2' contains leftover data, which may result in RTL
simulation hanging.
When streams are passed into and out of functions, they must be passed-by-reference as in the following example:
void stream_function (
hls::stream<uint8_t> &strm_out,
hls::stream<uint8_t> &strm_in,
uint16_t strm_len
)
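The reason for passing by reference can be seen with a software stand-in for a stream. The sketch below is plain C++ built on std::queue; fifo_model and copy_stream are hypothetical names, and the class models only the unbounded C-simulation behavior with destructive reads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>

// Hypothetical stand-in for a stream: an unbounded FIFO.
template <typename T>
class fifo_model {
public:
    void write(const T& v) { q_.push(v); }
    // Destructive read: the element is removed. Assumes the FIFO is not empty.
    T read() { T v = q_.front(); q_.pop(); return v; }
    bool empty() const { return q_.empty(); }
    std::size_t size() const { return q_.size(); }
private:
    std::queue<T> q_;
};

// Both streams are taken by reference so that the caller's FIFOs are the
// ones that are read and written, not temporary copies.
inline void copy_stream(fifo_model<uint8_t>& strm_out,
                        fifo_model<uint8_t>& strm_in,
                        uint16_t strm_len) {
    for (uint16_t i = 0; i < strm_len; ++i)
        strm_out.write(strm_in.read());
}
```

Passing either argument by value would operate on a copy and leave the caller's input stream full and output stream empty.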
Vitis HLS supports both blocking and non-blocking access methods.
- Non-blocking accesses can be implemented only as FIFO interfaces.
- Streaming ports that are implemented as ap_fifo ports and that are defined with an AXI4-Stream resource must not use non-blocking accesses.
A complete design example using streams is provided in the Vitis HLS examples. Refer to the hls_stream example in the design examples available from the
GUI welcome screen.
Blocking Reads and Writes
The basic accesses to an hls::stream<>
object are blocking reads and writes. These are accomplished using class
methods. These methods stall (block) execution if a read is attempted
on an empty stream FIFO, a write is attempted to a full stream FIFO,
or until a full handshake is accomplished for a stream mapped to an ap_hs
interface protocol.
A stall can be observed in C/RTL co-simulation as the continued execution of the simulator without any progress in the transactions. The following shows a classic example of a stall situation, where the RTL simulation time keeps increasing, but there is no progress in the inter or intra transactions:
// RTL Simulation : "Inter-Transaction Progress" ["Intra-Transaction Progress"] @
"Simulation Time"
///////////////////////////////////////////////////////////////////////////////////
// RTL Simulation : 0 / 1 [0.00%] @ "110000"
// RTL Simulation : 0 / 1 [0.00%] @ "202000"
// RTL Simulation : 0 / 1 [0.00%] @ "404000"
Blocking Write Methods
In this example, the value of variable src_var is pushed into the stream.
// Usage of void write(const T & wdata)
hls::stream<int> my_stream;
int src_var = 42;
my_stream.write(src_var);
The << operator is overloaded such that it may be used in a similar fashion to the stream
insertion operators for C++ stream (for example, iostreams and filestreams). The
hls::stream<> object to be written to is supplied as the left-hand side argument and the
value to be written as the right-hand side.
// Usage of void operator << (T & wdata)
hls::stream<int> my_stream;
int src_var = 42;
my_stream << src_var;
Blocking Read Methods
This method reads from the head of the stream and assigns the values to the variable dst_var.
// Usage of void read(T &rdata)
hls::stream<int> my_stream;
int dst_var;
my_stream.read(dst_var);
Alternatively, the next object in the stream can be read by assigning (using for example =, +=) the stream to an object on the left-hand side:
// Usage of T read(void)
hls::stream<int> my_stream;
int dst_var = my_stream.read();
The '>>' operator is overloaded to allow use similar to the stream extraction operator for
C++ stream (for example, iostreams and filestreams). The hls::stream is supplied as the
LHS argument and the destination variable the RHS.
// Usage of void operator >> (T & rdata)
hls::stream<int> my_stream;
int dst_var;
my_stream >> dst_var;
Non-Blocking Reads and Writes
Non-blocking write and read methods are also provided. These allow execution to continue even when a read is attempted on an empty stream or a write to a full stream.
These methods return a Boolean value indicating the status of the access (true if successful, false otherwise). Additional methods are included for testing the status of an hls::stream<> stream.
Non-blocking behavior is supported only on interfaces that use the ap_fifo protocol. More specifically, the AXI4-Stream standard and the Xilinx ap_hs I/O protocol do not support non-blocking accesses.
During C simulation, streams have an infinite size. It is therefore not possible to validate with C simulation if the stream is full. These methods can be verified only during RTL simulation when the FIFO sizes are defined (either the default size of 2, or an arbitrary size defined with the STREAM directive).
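Because C simulation never reports a full stream, it can help to reason about the non-blocking semantics with a depth-limited model. The following is a minimal sketch in plain C++; BoundedFifo is a hypothetical stand-in that only borrows the method names described in this section. It is not hls::stream<> and is not synthesizable.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a depth-limited RTL FIFO. This is NOT hls::stream;
// it only mimics the non-blocking method names described in this section.
template <typename T>
class BoundedFifo {
public:
    explicit BoundedFifo(std::size_t depth) : depth_(depth) {
        assert(depth > 0);  // a FIFO needs room for at least one element
    }

    bool full() const { return q_.size() >= depth_; }
    bool empty() const { return q_.empty(); }

    // Returns false and leaves the queue unchanged when the FIFO is full.
    bool write_nb(const T& wdata) {
        if (full()) return false;
        q_.push_back(wdata);
        return true;
    }

    // Returns false and leaves rdata unchanged when the FIFO is empty.
    bool read_nb(T& rdata) {
        if (empty()) return false;
        rdata = q_.front();
        q_.pop_front();
        return true;
    }

private:
    std::size_t depth_;
    std::deque<T> q_;
};
```

With a depth of 2, the third consecutive write_nb call returns false, which is the situation that only becomes visible in RTL simulation.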
Non-Blocking Writes
This method attempts to push variable src_var into the stream my_stream, returning a boolean true if successful. Otherwise, false is returned and the queue is unaffected.
// Usage of bool write_nb(const T & wdata)
hls::stream<int> my_stream;
int src_var = 42;
if (my_stream.write_nb(src_var)) {
// Perform standard operations
...
} else {
// Write did not occur
return;
}
Fullness Test
bool full(void)
Returns true, if and only if the
hls::stream<> object is full.
// Usage of bool full(void)
hls::stream<int> my_stream;
int src_var = 42;
bool stream_full;
stream_full = my_stream.full();
Non-Blocking Read
bool read_nb(T & rdata)
This method attempts to read a value from the stream, returning
true if successful. Otherwise, false is returned and the queue is unaffected.
// Usage of bool read_nb(T & rdata)
hls::stream<int> my_stream;
int dst_var;
if (my_stream.read_nb(dst_var)) {
// Perform standard operations
...
} else {
// Read did not occur
return;
}
Emptiness Test
bool empty(void)
Returns true if the hls::stream<> is empty.
// Usage of bool empty(void)
hls::stream<int> my_stream;
int dst_var;
bool stream_empty;
stream_empty = my_stream.empty();
The following example shows how a combination of non-blocking accesses and full/empty tests can provide error handling functionality when the RTL FIFOs are full or empty:
#include "hls_stream.h"
using namespace hls;
typedef struct {
short data;
bool valid;
bool invert;
} input_interface;
bool invert(stream<input_interface>& in_data_1,
stream<input_interface>& in_data_2,
stream<short>& output
) {
input_interface in;
bool full_n = false; // Remains false unless valid data is written
// Read an input value or return
if (!in_data_1.read_nb(in))
if (!in_data_2.read_nb(in))
return false;
// If the valid data is written, return not-full (full_n) as true
if (in.valid) {
if (in.invert)
full_n = output.write_nb(~in.data);
else
full_n = output.write_nb(in.data);
}
return full_n;
}
Controlling the RTL FIFO Depth
For most designs using streaming data, the default RTL FIFO depth of 2 is sufficient. Streaming data is generally processed one sample at a time.
For multirate designs in which the implementation requires a FIFO with a depth greater than 2, you must determine (and set using the STREAM directive) the depth necessary for the RTL simulation to complete. If the FIFO depth is insufficient, RTL co-simulation stalls.
Because stream objects cannot be viewed in the GUI directives pane, the STREAM directive cannot be applied directly in that pane.
Right-click the function in which an hls::stream<> object is declared (or is used, or exists in the argument list) to:
- Select the STREAM directive.
- Populate the variable field manually with the name of the stream variable.
Alternatively, you can:
- Specify the STREAM directive manually in the directives.tcl file, or
- Add it as a pragma in the source.
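For a multirate design, a first estimate of the required depth can be obtained from a simple occupancy model before running RTL co-simulation. The following is a minimal sketch under stated assumptions: the producer writes one sample per cycle for a burst, the consumer removes one sample every consumer_interval cycles, and the function name min_fifo_depth is hypothetical. The real schedule produced by synthesis can differ, so the result is a starting point for the STREAM directive, not a guarantee.

```cpp
#include <algorithm>
#include <cassert>

// Peak FIFO occupancy for an idealized producer/consumer pair.
// Producer: one write per cycle for `burst` cycles.
// Consumer: one read on every `consumer_interval`-th cycle, after the write.
unsigned min_fifo_depth(unsigned burst, unsigned consumer_interval) {
    assert(consumer_interval > 0);
    unsigned occupancy = 0;
    unsigned peak = 0;
    for (unsigned cycle = 0; cycle < burst; ++cycle) {
        ++occupancy;                       // producer writes
        peak = std::max(peak, occupancy);  // depth needed at this moment
        if (cycle % consumer_interval == consumer_interval - 1) {
            --occupancy;                   // consumer reads
        }
    }
    return peak;  // after the burst the occupancy can only fall
}
```

For example, a burst of 8 samples drained at half the production rate peaks at 5 entries in this model, which is well above the default depth.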
C/RTL Co-Simulation Support
The Vitis HLS C/RTL co-simulation feature does not support structures or classes containing hls::stream<> members in the top-level interface. Vitis HLS supports these structures or classes for synthesis.
typedef struct {
hls::stream<uint8_t> a;
hls::stream<uint16_t> b;
} strm_strct_t;
void dut_top(strm_strct_t indata, strm_strct_t outdata) {
}
These restrictions apply to both top-level function arguments and globally declared
objects. If structs of streams are used for synthesis, the design must be verified using an
external RTL simulator and user-created HDL test bench. There are no such restrictions on
hls::stream<> objects with strictly internal linkage.
HLS IP Libraries
Vitis HLS provides C++ libraries to implement a number of Xilinx IP blocks. The C libraries allow the following Xilinx IP blocks to be directly inferred from the C++ source code ensuring a high-quality implementation in the FPGA.
| Library Header File | Description |
|---|---|
| hls_fft.h | Allows the Xilinx LogiCORE IP FFT to be simulated in C and implemented using the Xilinx LogiCORE block. |
| hls_fir.h | Allows the Xilinx LogiCORE IP FIR to be simulated in C and implemented using the Xilinx LogiCORE block. |
| hls_dds.h | Allows the Xilinx LogiCORE IP DDS to be simulated in C and implemented using the Xilinx LogiCORE block. |
| ap_shift_reg.h | Provides a C++ class to implement a shift register which is implemented directly using a Xilinx SRL primitive. |
FFT IP Library
The Xilinx FFT IP block can be called within a C++ design using the library hls_fft.h. This section explains how the FFT can be configured in your C++ code.
To use the FFT in your C++ code:
- Include the hls_fft.h library in the code
- Set the default parameters using the pre-defined struct hls::ip_fft::params_t
- Define the run time configuration
- Call the FFT function
- Optionally, check the run time status
The following code examples provide a summary of how each of these steps is performed. Each step is discussed in more detail below.
First, include the FFT library in the source code. This header file resides in the include directory in the Vitis HLS installation area which is automatically searched when Vitis HLS executes.
#include "hls_fft.h"
Define the static parameters of the FFT. These include the input width, the number of channels, and the type of architecture, none of which change dynamically. The FFT library includes a parameterization struct hls::ip_fft::params_t, which can be used to initialize all static parameters with default values.
In this example, the default values for output ordering and the widths of the configuration
and status ports are over-ridden using a user-defined struct param1 based on the
pre-defined struct.
struct param1 : hls::ip_fft::params_t {
static const unsigned ordering_opt = hls::ip_fft::natural_order;
static const unsigned config_width = FFT_CONFIG_WIDTH;
static const unsigned status_width = FFT_STATUS_WIDTH;
};
Define types and variables for both the run time configuration and run time status. These values can be dynamic and are therefore defined as variables in the C code which can change and are accessed through APIs.
typedef hls::ip_fft::config_t<param1> config_t;
typedef hls::ip_fft::status_t<param1> status_t;
config_t fft_config1;
status_t fft_status1;
Next, set the run time configuration. This example sets the direction of the FFT (forward or inverse) based on the value of the variable direction and also sets the value of the scaling schedule.
fft_config1.setDir(direction);
fft_config1.setSch(0x2AB);
Call the FFT function using the HLS namespace with the defined static configuration
(param1 in this example). The function parameters are, in order, input data, output data,
output status and input configuration.
hls::fft<param1> (xn1, xk1, &fft_status1, &fft_config1);
Finally, check the output status. This example checks the overflow flag and stores the result in the variable ovflo.
*ovflo = fft_status1.getOvflo();
Design examples using the FFT C library are provided in the Vitis HLS examples.
FFT Static Parameters
The static parameters of the FFT define how the FFT is configured and specify the fixed parameters, such as the size of the FFT, whether the size can be changed dynamically, and whether the architecture is pipelined_streaming_io or radix_4_burst_io.
The hls_fft.h header file defines a struct
hls::ip_fft::params_t which can be used to
set default values for the static parameters. If the default values are to be used,
the parameterization struct can be used directly with the FFT function.
hls::fft<hls::ip_fft::params_t >
(xn1, xk1, &fft_status1, &fft_config1);
A more typical use is to change some of the parameters to non-default values. This is performed by creating a new user-defined parameterization struct based on the default parameterization struct and changing some of the default values.
In the following example, a new user struct my_fft_config is defined with a new value
for the output ordering (changed to natural_order). All other static parameters to the FFT
use the default values.
struct my_fft_config : hls::ip_fft::params_t {
static const unsigned ordering_opt = hls::ip_fft::natural_order;
};
hls::fft<my_fft_config >
(xn1, xk1, &fft_status1, &fft_config1);
The values used for the parameterization struct hls::ip_fft::params_t are explained
in FFT Struct Parameters. The default values for the parameters and a list of possible values
are provided in FFT Struct Parameter Values.
FFT Struct Parameters
| Parameter | Description |
|---|---|
| input_width | Data input port width. |
| output_width | Data output port width. |
| status_width | Output status port width. |
| config_width | Input configuration port width. |
| max_nfft | The size of the FFT data set is specified as 1 << max_nfft. |
| has_nfft | Determines if the size of the FFT can be run time configurable. |
| channels | Number of channels. |
| arch_opt | The implementation architecture. |
| phase_factor_width | Configure the internal phase factor precision. |
| ordering_opt | The output ordering mode. |
| ovflo | Enable overflow mode. |
| scaling_opt | Define the scaling options. |
| rounding_opt | Define the rounding modes. |
| mem_data | Specify using block or distributed RAM for data memory. |
| mem_phase_factors | Specify using block or distributed RAM for phase factors memory. |
| mem_reorder | Specify using block or distributed RAM for output reorder memory. |
| stages_block_ram | Defines the number of block RAM stages used in the implementation. |
| mem_hybrid | When block RAMs are specified for data, phase factor, or reorder buffer, mem_hybrid specifies whether or not to use a hybrid of block and distributed RAMs to reduce block RAM count in certain configurations. |
| complex_mult_type | Defines the types of multiplier to use for complex multiplications. |
| butterfly_type | Defines the implementation used for the FFT butterfly. |
When specifying parameter values which are not integer or boolean, the HLS FFT namespace should be used.
For example, the possible values for parameter butterfly_type in the following table are use_luts and use_xtremedsp_slices. The values used in the C program should be butterfly_type = hls::ip_fft::use_luts and butterfly_type = hls::ip_fft::use_xtremedsp_slices.
FFT Struct Parameter Values
The following table covers all features and functionality of the FFT IP. Features and functionality not described in this table are not supported in the Vitis HLS implementation.
| Parameter | C Type | Default Value | Valid Values |
|---|---|---|---|
| input_width | unsigned | 16 | 8-34 |
| output_width | unsigned | 16 | input_width to (input_width + max_nfft + 1) |
| status_width | unsigned | 8 | Depends on FFT configuration |
| config_width | unsigned | 16 | Depends on FFT configuration |
| max_nfft | unsigned | 10 | 3-16 |
| has_nfft | bool | false | True, False |
| channels | unsigned | 1 | 1-12 |
| arch_opt | unsigned | pipelined_streaming_io | automatically_select pipelined_streaming_io radix_4_burst_io radix_2_burst_io radix_2_lite_burst_io |
| phase_factor_width | unsigned | 16 | 8-34 |
| ordering_opt | unsigned | bit_reversed_order | bit_reversed_order natural_order |
| ovflo | bool | true | false true |
| scaling_opt | unsigned | scaled | scaled unscaled block_floating_point |
| rounding_opt | unsigned | truncation | truncation convergent_rounding |
| mem_data | unsigned | block_ram | block_ram distributed_ram |
| mem_phase_factors | unsigned | block_ram | block_ram distributed_ram |
| mem_reorder | unsigned | block_ram | block_ram distributed_ram |
| stages_block_ram | unsigned | (max_nfft < 10) ? 0 : (max_nfft - 9) | 0-11 |
| mem_hybrid | bool | false | false true |
| complex_mult_type | unsigned | use_mults_resources | use_luts use_mults_resources use_mults_performance |
| butterfly_type | unsigned | use_luts | use_luts use_xtremedsp_slices |
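Several entries in the table are derived from max_nfft and input_width. The following is a minimal sketch that restates those table formulas as compile-time helpers; the function names are hypothetical and are not part of hls_fft.h.

```cpp
#include <cassert>

// Number of points in the transform: 1 << max_nfft.
constexpr unsigned fft_points(unsigned max_nfft) { return 1u << max_nfft; }

// Default value of stages_block_ram as listed in the table.
constexpr unsigned default_stages_block_ram(unsigned max_nfft) {
    return (max_nfft < 10) ? 0 : (max_nfft - 9);
}

// Upper bound of the valid output_width range as listed in the table.
constexpr unsigned max_output_width(unsigned input_width, unsigned max_nfft) {
    return input_width + max_nfft + 1;
}

// Checks for the default configuration (input_width = 16, max_nfft = 10).
static_assert(fft_points(10) == 1024, "default max_nfft is a 1024-point FFT");
static_assert(default_stages_block_ram(10) == 1, "one block RAM stage by default");
static_assert(max_output_width(16, 10) == 27, "output_width may range from 16 to 27");
```

Similar static_assert checks can be placed next to a user-defined parameterization struct so that an out-of-range width is caught at compile time.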
FFT Runtime Configuration and Status
The FFT supports runtime configuration and runtime status monitoring through the
configuration and status ports. These ports are defined as arguments to the FFT
function, shown here as variables fft_status1
and fft_config1:
hls::fft<param1> (xn1, xk1, &fft_status1, &fft_config1);
The runtime configuration and status can be accessed using the predefined structs from the FFT C library:
- hls::ip_fft::config_t<param1>
- hls::ip_fft::status_t<param1>
The runtime configuration struct allows the following actions to be performed in the C code:
- Set the FFT length, if runtime configuration is enabled
- Set the FFT direction as forward or inverse
- Set the scaling schedule
The FFT length can be set as follows:
typedef hls::ip_fft::config_t<param1> config_t;
config_t fft_config1;
// Set FFT length to 512 => log2(512) => 9
fft_config1.setNfft(9);
The length specified at run time cannot exceed the size defined by max_nfft in the static configuration.
The FFT direction can be set as follows:
typedef hls::ip_fft::config_t<param1> config_t;
config_t fft_config1;
// Forward FFT
fft_config1.setDir(1);
// Inverse FFT
fft_config1.setDir(0);
The FFT scaling schedule can be set as follows:
typedef hls::ip_fft::config_t<param1> config_t;
config_t fft_config1;
fft_config1.setSch(0x2AB);
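The schedule value 0x2AB is easier to read once it is split into fields. The following is a minimal sketch, assuming the encoding used by the Xilinx FFT LogiCORE for the pipelined streaming and radix-4 architectures: two bits per stage (or stage pair), first stage in the least significant bits, each field giving a right shift of 0 to 3 bits. This encoding is not stated in this section, so treat it as an assumption and check the FFT product guide for the architecture in use. The function name total_scaling_shift is hypothetical.

```cpp
#include <cassert>

// Sum the right shifts encoded in a scaling schedule, assuming two bits per
// field with the first stage in the two least significant bits.
unsigned total_scaling_shift(unsigned schedule, unsigned num_fields) {
    assert(num_fields <= 16);  // 16 two-bit fields fit in 32 bits
    unsigned total = 0;
    for (unsigned i = 0; i < num_fields; ++i) {
        total += (schedule >> (2 * i)) & 0x3u;  // field i: shift of 0..3 bits
    }
    return total;
}
```

Under this assumption, 0x2AB splits into the fields 3, 2, 2, 2, 2 (first stage first), for a total shift of 11 bits across five fields.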
The output status port can be accessed using the pre-defined struct to determine:
- If any overflow occurred during the FFT
- The value of the block exponent
The FFT overflow mode can be checked as follows:
typedef hls::ip_fft::status_t<param1> status_t;
status_t fft_status1;
// Check the overflow flag
bool ovflo = fft_status1.getOvflo();
And the block exponent value can be obtained using:
typedef hls::ip_fft::status_t<param1> status_t;
status_t fft_status1;
// Obtain the block exponent
unsigned int blk_exp = fft_status1.getBlkExp();
Using the FFT Function
The FFT function is defined in the HLS namespace and can be called as follows:
hls::fft<STATIC_PARAM> (
INPUT_DATA_ARRAY,
OUTPUT_DATA_ARRAY,
OUTPUT_STATUS,
INPUT_RUN_TIME_CONFIGURATION);
The STATIC_PARAM is the static parameterization struct that defines the static parameters
for the FFT.
Both the input and output data are supplied to the function as arrays (INPUT_DATA_ARRAY
and OUTPUT_DATA_ARRAY). In the final implementation, the ports on the FFT RTL block will
be implemented as AXI4-Stream ports. Xilinx recommends always using the FFT function in
a region using dataflow optimization (set_directive_dataflow), because this ensures
the arrays are implemented as streaming arrays. An alternative is to specify both arrays as
streaming using the set_directive_stream command.
The data types for the arrays can be float or ap_fixed.
typedef float data_t;
complex<data_t> xn[FFT_LENGTH];
complex<data_t> xk[FFT_LENGTH];
To use fixed-point data types, the Vitis HLS arbitrary precision type ap_fixed should be used.
#include "ap_fixed.h"
typedef ap_fixed<FFT_INPUT_WIDTH,1> data_in_t;
typedef ap_fixed<FFT_OUTPUT_WIDTH,FFT_OUTPUT_WIDTH-FFT_INPUT_WIDTH+1> data_out_t;
#include <complex>
typedef hls::x_complex<data_in_t> cmpxData;
typedef hls::x_complex<data_out_t> cmpxDataOut;
In both cases, the FFT should be parameterized with the same correct data sizes. In the case of floating point data, the data widths will always be 32-bit and any other specified size will be considered invalid.
The multichannel functionality of the FFT can be used by using two-dimensional arrays for the input and output data. In this case, the array data should be configured with the first dimension representing each channel and the second dimension representing the FFT data.
typedef float data_t;
static complex<data_t> xn[CHANNEL][FFT_LENGTH];
static complex<data_t> xk[CHANNEL][FFT_LENGTH];
The FFT core consumes and produces data as interleaved channels (for example, ch0-data0, ch1-data0, ch2-data0, and so on, followed by ch0-data1, ch1-data1, ch2-data1, and so on). Therefore, to stream the input or output arrays of the FFT using the same sequential order that the data was read or written, you must fill or empty the two-dimensional arrays for multiple channels by iterating through the channel index first, as shown in the following example:
cmpxData in_fft[FFT_CHANNELS][FFT_LENGTH];
cmpxData out_fft[FFT_CHANNELS][FFT_LENGTH];
// Write to FFT Input Array
for (unsigned i = 0; i < FFT_LENGTH; i++) {
for (unsigned j = 0; j < FFT_CHANNELS; ++j) {
in_fft[j][i] = in.read().data;
}
}
// Read from FFT Output Array
for (unsigned i = 0; i < FFT_LENGTH; i++) {
for (unsigned j = 0; j < FFT_CHANNELS; ++j) {
out.data = out_fft[j][i];
}
}
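The loop order above can be checked on a workstation with plain integers in place of complex samples. The following is a minimal sketch; kChannels, kLength, Frame, and deinterleave are hypothetical names chosen for illustration, and std::vector stands in for the input stream.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kChannels = 2;  // hypothetical channel count
constexpr std::size_t kLength = 3;    // hypothetical samples per channel

using Frame = std::array<std::array<int, kLength>, kChannels>;

// Fill frame[channel][sample] from a channel-interleaved sequence
// (ch0-data0, ch1-data0, ch0-data1, ch1-data1, ...).
Frame deinterleave(const std::vector<int>& stream) {
    assert(stream.size() == kChannels * kLength);
    Frame frame{};
    std::size_t k = 0;
    for (std::size_t i = 0; i < kLength; ++i) {        // sample index outermost
        for (std::size_t j = 0; j < kChannels; ++j) {  // channel index innermost
            frame[j][i] = stream[k++];
        }
    }
    return frame;
}
```

Feeding the sequence 0, 1, 2, 3, 4, 5 places the even values in channel 0 and the odd values in channel 1, which confirms that consecutive stream elements belong to different channels.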
Design examples using the FFT C library are provided in the Vitis HLS examples.
FIR Filter IP Library
The Xilinx FIR IP block can be called within a C++ design using the library hls_fir.h. This section explains how the FIR can be configured in your C++ code.
To use the FIR in your C++ code:
- Include the hls_fir.h library in the code.
- Set the static parameters using the pre-defined struct hls::ip_fir::params_t.
- Call the FIR function.
- Optionally, define a run time input configuration to modify some parameters dynamically.
The following code examples provide a summary of how each of these steps is performed. Each step is discussed in more detail below.
First, include the FIR library in the source code. This header file resides in the include directory in the Vitis HLS installation area. This directory is automatically searched when Vitis HLS executes. There is no need to specify the path to this directory if compiling inside Vitis HLS.
#include "hls_fir.h"
Define the static parameters of the FIR. These include static attributes such as the input width, the coefficients, and the filter rate (single, decimation, hilbert). The FIR library includes a parameterization struct hls::ip_fir::params_t, which can be used to initialize all static parameters with default values.
In this example, the coefficients are defined as residing in array coeff_vec, and the default values for the number of coefficients, the input width, and the quantization mode are overridden using a user-defined struct myconfig based on the pre-defined struct.
struct myconfig : hls::ip_fir::params_t {
static const double coeff_vec[sg_fir_srrc_coeffs_len];
static const unsigned num_coeffs = sg_fir_srrc_coeffs_len;
static const unsigned input_width = INPUT_WIDTH;
static const unsigned quantization = hls::ip_fir::quantize_only;
};
Create an instance of the FIR function using the HLS namespace with the defined static
parameters (myconfig in this example) and then call the function with the run method to
execute the function. The function arguments are, in order, input data and output data.
static hls::FIR<myconfig> fir1;
fir1.run(fir_in, fir_out);
Optionally, a run time input configuration can be used. In some modes of the FIR, the data on this input determines how the coefficients are used during interleaved channels or when coefficient reloading is required. This configuration can be dynamic and is therefore defined as a variable. For a complete description of which modes require this input configuration, refer to the FIR Compiler LogiCORE IP Product Guide (PG149).
When the run time input configuration is used, the FIR function is called with three arguments: input data, output data and input configuration.
// Define the configuration type
typedef ap_uint<8> config_t;
// Define the configuration variable
config_t fir_config = 8;
// Use the configuration in the FIR
static hls::FIR<myconfig> fir1;
fir1.run(fir_in, fir_out, &fir_config);
Design examples using the FIR C library are provided in the Vitis HLS examples.
FIR Static Parameters
The static parameters of the FIR define how the FIR IP is parameterized and specify non-dynamic items such as the input and output widths, the number of fractional bits, the coefficient values, and the interpolation and decimation rates. Most of these parameters have default values; the coefficients are the exception and have no default values.
The hls_fir.h header file defines a struct hls::ip_fir::params_t that can be used to set
the default values for most of the static parameters.
In this example, a new user struct myconfig is defined with a new value for the coefficients. The coefficients are specified as residing in array coeff_vec. All other parameters to the FIR use the default values.
struct myconfig : hls::ip_fir::params_t {
static const double coeff_vec[sg_fir_srrc_coeffs_len];
};
static hls::FIR<myconfig> fir1;
fir1.run(fir_in, fir_out);
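When writing a test bench for a design like this, a software reference model is a convenient way to produce expected outputs. The following is a minimal sketch of a direct-form, single-rate FIR with integer coefficients; fir_reference is a hypothetical helper, it assumes zero initial state, and it does not model the quantization, rounding, or rate-change options of the FIR IP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct-form single-rate FIR: y[n] = sum over k of coeff[k] * x[n - k],
// with x[n] treated as 0 for n < 0 (zero initial state).
std::vector<long long> fir_reference(const std::vector<int>& coeff,
                                     const std::vector<int>& x) {
    std::vector<long long> y(x.size(), 0);
    for (std::size_t n = 0; n < x.size(); ++n) {
        for (std::size_t k = 0; k < coeff.size() && k <= n; ++k) {
            y[n] += static_cast<long long>(coeff[k]) * x[n - k];
        }
    }
    return y;
}
```

Driving the model with a unit impulse returns the coefficients themselves, which is a quick sanity check before comparing against the output of the synthesized design.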
FIR Struct Parameters describes the parameters used for the parameterization struct hls::ip_fir::params_t. FIR Struct Parameter Values provides the default values for the parameters and a list of possible values.
FIR Struct Parameters
| Parameter | Description |
|---|---|
| input_width | Data input port width |
| input_fractional_bits | Number of fractional bits on the input port |
| output_width | Data output port width |
| output_fractional_bits | Number of fractional bits on the output port |
| coeff_width | Bit-width of the coefficients |
| coeff_fractional_bits | Number of fractional bits in the coefficients |
| num_coeffs | Number of coefficients |
| coeff_sets | Number of coefficient sets |
| input_length | Number of samples in the input data |
| output_length | Number of samples in the output data |
| num_channels | Specify the number of channels of data to process |
| total_num_coeff | Total number of coefficients |
| coeff_vec[total_num_coeff] | The coefficient array |
| filter_type | The type of implementation used for the filter |
| rate_change | Specifies integer or fractional rate changes |
| interp_rate | The interpolation rate |
| decim_rate | The decimation rate |
| zero_pack_factor | Number of zero coefficients used in interpolation |
| rate_specification | Specify the rate as frequency or period |
| hardware_oversampling_rate | Specify the rate of over-sampling |
| sample_period | The hardware oversample period |
| sample_frequency | The hardware oversample frequency |
| quantization | The quantization method to be used |
| best_precision | Enable or disable the best precision |
| coeff_structure | The type of coefficient structure to be used |
| output_rounding_mode | Type of rounding used on the output |
| filter_arch | Selects a systolic or transposed architecture |
| optimization_goal | Specify a speed or area goal for optimization |
| inter_column_pipe_length | The pipeline length required between DSP columns |
| column_config | Specifies the number of DSP module columns |
| config_method | Specifies how the DSP module columns are configured |
| coeff_padding | Number of zero padding added to the front of the filter |
When specifying parameter values that are not integer or boolean, the HLS FIR namespace should be used.
For example the possible values for rate_change are shown in the following table to be integer and fixed_fractional. The values used in the C program should be rate_change = hls::ip_fir::integer and rate_change = hls::ip_fir::fixed_fractional.
FIR Struct Parameter Values
The following table covers all features and functionality of the FIR IP. Features and functionality not described in this table are not supported in the Vitis HLS implementation.
| Parameter | C Type | Default Value | Valid Values |
|---|---|---|---|
| input_width | unsigned | 16 | No limitation |
| input_fractional_bits | unsigned | 0 | Limited by size of input_width |
| output_width | unsigned | 24 | No limitation |
| output_fractional_bits | unsigned | 0 | Limited by size of output_width |
| coeff_width | unsigned | 16 | No limitation |
| coeff_fractional_bits | unsigned | 0 | Limited by size of coeff_width |
| num_coeffs | unsigned | 21 | Full |
| coeff_sets | unsigned | 1 | 1-1024 |
| input_length | unsigned | 21 | No limitation |
| output_length | unsigned | 21 | No limitation |
| num_channels | unsigned | 1 | 1-1024 |
| total_num_coeff | unsigned | 21 | num_coeffs * coeff_sets |
| coeff_vec[total_num_coeff] | double array | None | Not applicable |
| filter_type | unsigned | single_rate | single_rate, interpolation, decimation, hilbert_filter, interpolated |
| rate_change | unsigned | integer | integer, fixed_fractional |
| interp_rate | unsigned | 1 | 1-1024 |
| decim_rate | unsigned | 1 | 1-1024 |
| zero_pack_factor | unsigned | 1 | 1-8 |
| rate_specification | unsigned | period | frequency, period |
| hardware_oversampling_rate | unsigned | 1 | No limitation |
| sample_period | unsigned | 1 | No limitation |
| sample_frequency | double | 0.001 | No limitation |
| quantization | unsigned | integer_coefficients | integer_coefficients, quantize_only, maximize_dynamic_range |
| best_precision | bool | false | false true |
| coeff_structure | unsigned | non_symmetric | inferred, non_symmetric, symmetric, negative_symmetric, half_band, hilbert |
| output_rounding_mode | unsigned | full_precision | full_precision, truncate_lsbs, non_symmetric_rounding_down, non_symmetric_rounding_up, symmetric_rounding_to_zero, symmetric_rounding_to_infinity, convergent_rounding_to_even, convergent_rounding_to_odd |
| filter_arch | unsigned | systolic_multiply_accumulate | systolic_multiply_accumulate, transpose_multiply_accumulate |
| optimization_goal | unsigned | area | area, speed |
| inter_column_pipe_length | unsigned | 4 | 1-16 |
| column_config | unsigned | 1 | Limited by number of DSP48s used |
| config_method | unsigned | single | single, by_channel |
| coeff_padding | bool | false | false true |
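The table gives total_num_coeff as num_coeffs * coeff_sets, and the coefficient array must have that many entries. The following is a minimal sketch of a compile-time consistency check; fir_params_sketch is a hypothetical mirror of three parameters and is not the real hls::ip_fir::params_t, array_len is a hypothetical helper, and the array holds 21 example values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of three static parameters; NOT hls::ip_fir::params_t.
struct fir_params_sketch {
    static const unsigned num_coeffs = 7;
    static const unsigned coeff_sets = 3;
    static const unsigned total_num_coeff = 21;
};

// Number of elements in a statically sized array.
template <typename T, std::size_t N>
constexpr std::size_t array_len(const T (&)[N]) { return N; }

// Example coefficient values: 3 sets of 7 coefficients, 21 entries in total.
static const double example_coeffs[] = {6, 0, -4, -3, 5, 6, -6, -13, 7, 44, 64,
                                        44, 7, -13, -6, 6, 5, -3, -4, 0, 6};

static_assert(fir_params_sketch::num_coeffs * fir_params_sketch::coeff_sets ==
                  fir_params_sketch::total_num_coeff,
              "total_num_coeff must equal num_coeffs * coeff_sets");
static_assert(array_len(example_coeffs) == fir_params_sketch::total_num_coeff,
              "coefficient array length must match total_num_coeff");
```

A mismatch between the array length and the declared counts then fails at compile time instead of producing a misconfigured filter.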
Using the FIR Function
The FIR function is defined in the HLS namespace and can be called as follows:
// Create an instance of the FIR
static hls::FIR<STATIC_PARAM> fir1;
// Execute the FIR instance fir1
fir1.run(INPUT_DATA_ARRAY, OUTPUT_DATA_ARRAY);
The STATIC_PARAM is the static parameterization struct that defines most static
parameters for the FIR.
Both the input and output data are supplied to the function as arrays (INPUT_DATA_ARRAY
and OUTPUT_DATA_ARRAY). In the final implementation, these ports on the FIR IP will be
implemented as AXI4-Stream ports. Xilinx recommends always using the FIR function in a
region using the dataflow optimization (set_directive_dataflow), because this
ensures the arrays are implemented as streaming arrays. An alternative is to specify both
arrays as streaming using the set_directive_stream command.
The multichannel functionality of the FIR is supported through interleaving the data in a single input and single output array.
- The size of the input array should be large enough to accommodate all samples:
num_channels * input_length. - The output array size should be specified to contain all output samples:
num_channels * output_length.
The following code example demonstrates, for two channels, how the data is interleaved. In
this example, the top-level function has two channels of input data (din_i, din_q) and two
channels of output data (dout_i, dout_q). Two functions, at the front-end (fe) and
back-end (be) are used to correctly order the data in the FIR input array and extract it from
the FIR output array.
void dummy_fe(din_t din_i[LENGTH], din_t din_q[LENGTH], din_t out[FIR_LENGTH]) {
for (unsigned i = 0; i < LENGTH; ++i) {
out[2*i] = din_i[i];
out[2*i + 1] = din_q[i];
}
}
void dummy_be(dout_t in[FIR_LENGTH], dout_t dout_i[LENGTH], dout_t dout_q[LENGTH]) {
for(unsigned i = 0; i < LENGTH; ++i) {
dout_i[i] = in[2*i];
dout_q[i] = in[2*i+1];
}
}
void fir_top(din_t din_i[LENGTH], din_t din_q[LENGTH],
dout_t dout_i[LENGTH], dout_t dout_q[LENGTH]) {
din_t fir_in[FIR_LENGTH];
dout_t fir_out[FIR_LENGTH];
static hls::FIR<myconfig> fir1;
dummy_fe(din_i, din_q, fir_in);
fir1.run(fir_in, fir_out);
dummy_be(fir_out, dout_i, dout_q);
}
Optional FIR Runtime Configuration
In some modes of operation, the FIR requires an additional input to configure how the coefficients are used. For a complete description of which modes require this input configuration, refer to the FIR Compiler LogiCORE IP Product Guide (PG149).
This input configuration can be performed in the C code using a standard ap_int.h 8-bit data type. In this example, the header file fir_top.h specifies the use of the FIR and ap_fixed libraries, defines a number of the design parameter values and then defines some fixed-point types based on these:
#include "ap_fixed.h"
#include "hls_fir.h"
const unsigned FIR_LENGTH = 21;
const unsigned INPUT_WIDTH = 16;
const unsigned INPUT_FRACTIONAL_BITS = 0;
const unsigned OUTPUT_WIDTH = 24;
const unsigned OUTPUT_FRACTIONAL_BITS = 0;
const unsigned COEFF_WIDTH = 16;
const unsigned COEFF_FRACTIONAL_BITS = 0;
const unsigned COEFF_NUM = 7;
const unsigned COEFF_SETS = 3;
const unsigned INPUT_LENGTH = FIR_LENGTH;
const unsigned OUTPUT_LENGTH = FIR_LENGTH;
const unsigned CHAN_NUM = 1;
typedef ap_fixed<INPUT_WIDTH, INPUT_WIDTH - INPUT_FRACTIONAL_BITS> s_data_t;
typedef ap_fixed<OUTPUT_WIDTH, OUTPUT_WIDTH - OUTPUT_FRACTIONAL_BITS> m_data_t;
typedef ap_uint<8> config_t;
In the top-level code, the information in the header file is included, the static parameterization struct is created using the same constant values used to specify the bit-widths, ensuring the C code and FIR configuration match, and the coefficients are specified. At the top-level, an input configuration, defined in the header file as 8-bit data, is passed into the FIR.
#include "fir_top.h"
struct param1 : hls::ip_fir::params_t {
static const double coeff_vec[total_num_coeff];
static const unsigned input_length = INPUT_LENGTH;
static const unsigned output_length = OUTPUT_LENGTH;
static const unsigned num_coeffs = COEFF_NUM;
static const unsigned coeff_sets = COEFF_SETS;
};
const double param1::coeff_vec[total_num_coeff] =
{6,0,-4,-3,5,6,-6,-13,7,44,64,44,7,-13,-6,6,5,-3,-4,0,6};
void dummy_fe(s_data_t in[INPUT_LENGTH], s_data_t out[INPUT_LENGTH],
config_t* config_in, config_t* config_out)
{
*config_out = *config_in;
for(unsigned i = 0; i < INPUT_LENGTH; ++i)
out[i] = in[i];
}
void dummy_be(m_data_t in[OUTPUT_LENGTH], m_data_t out[OUTPUT_LENGTH])
{
for(unsigned i = 0; i < OUTPUT_LENGTH; ++i)
out[i] = in[i];
}
// DUT
void fir_top(s_data_t in[INPUT_LENGTH],
m_data_t out[OUTPUT_LENGTH],
config_t* config)
{
s_data_t fir_in[INPUT_LENGTH];
m_data_t fir_out[OUTPUT_LENGTH];
config_t fir_config;
// Create struct for config
static hls::FIR<param1> fir1;
//==================================================
// Dataflow process
dummy_fe(in, fir_in, config, &fir_config);
fir1.run(fir_in, fir_out, &fir_config);
dummy_be(fir_out, out);
//==================================================
}
Design examples using the FIR C library are provided in the Vitis HLS examples.
DDS IP Library
You can use the Xilinx Direct Digital Synthesizer (DDS) IP block within a C++ design using the hls_dds.h library. This section explains how to configure DDS IP in your C++ code.
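The C IP implementation of the DDS supports the fixed mode for the Phase_Increment and Phase_Offset parameters and the none mode for Phase_Offset, but it does not support programmable and streaming modes for these parameters.
To use the DDS in the C++ code: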
- Include the hls_dds.h library in the code.
- Set the default parameters using the pre-defined struct hls::ip_dds::params_t.
- Call the DDS function.
First, include the DDS library in the source code. This header file resides in the include directory in the Vitis HLS installation area, which is automatically searched when Vitis HLS executes.
#include "hls_dds.h"
Define the static parameters of the DDS. For example, define the phase width, clock rate, and the phase increment and phase offset values. The DDS C library includes a parameterization struct hls::ip_dds::params_t, which is used to initialize all static parameters with default values. By redefining any of the values in this struct, you can customize the implementation.
The following example shows how to override the default values for the phase width, clock rate, phase increment, and phase offset using a user-defined struct param1, which is based on the existing predefined struct hls::ip_dds::params_t:
struct param1 : hls::ip_dds::params_t {
static const unsigned Phase_Width = PHASEWIDTH;
static const double DDS_Clock_Rate = 25.0;
static const double PINC[16];
static const double POFF[16];
};
Create an instance of the DDS function using the HLS namespace with the defined static
parameters (for example, param1). Then, call the function with the run method to execute
the function. Following are the data and phase function arguments shown in order:
static hls::DDS<param1> dds1;
dds1.run(data_channel, phase_channel);
Design examples that use the DDS C library are provided in the Vitis HLS examples.
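For intuition about what the data and phase outputs represent, the following plain C++ sketch models a single-channel phase-accumulator DDS. It is a behavioral illustration only: DdsModel and DdsSample are hypothetical names, the sine and cosine values are computed in double precision instead of being read from a quantized lookup table, and nothing here is taken from the hls::DDS implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// One output sample of the hypothetical model.
struct DdsSample {
    std::uint32_t phase;  // accumulated phase plus offset, phase_width bits
    double sine;
    double cosine;
};

// Hypothetical behavioral model, not the hls::DDS class.
class DdsModel {
public:
    // phase_width must be between 1 and 32 in this sketch.
    DdsModel(unsigned phase_width, std::uint32_t pinc, std::uint32_t poff)
        : modulus_(std::uint64_t(1) << phase_width),
          pinc_(pinc), poff_(poff), acc_(0) {}

    // Emit one sample, then advance the accumulator by the phase increment.
    DdsSample step() {
        const double two_pi = 6.283185307179586;
        const std::uint64_t phase = (acc_ + poff_) % modulus_;
        const double angle = two_pi * static_cast<double>(phase) /
                             static_cast<double>(modulus_);
        acc_ = (acc_ + pinc_) % modulus_;
        DdsSample s = {static_cast<std::uint32_t>(phase),
                       std::sin(angle), std::cos(angle)};
        return s;
    }

private:
    std::uint64_t modulus_;
    std::uint64_t pinc_;
    std::uint64_t poff_;
    std::uint64_t acc_;
};

// Usage sketch: a quarter turn per sample with a 16-bit phase.
inline void dds_model_demo() {
    DdsModel dds(16, 16384, 0);
    assert(dds.step().phase == 0);
    const DdsSample quarter = dds.step();
    assert(quarter.phase == 16384);
    assert(std::fabs(quarter.sine - 1.0) < 1e-12);
    assert(dds.step().phase == 32768);
    assert(dds.step().phase == 49152);
    assert(dds.step().phase == 0);  // wrapped after 2^16
}
```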
DDS Static Parameters
The static parameters of the DDS define how to configure the DDS, such as the
clock rate, phase interval, and modes. The hls_dds.h header file defines an hls::ip_dds::params_t struct, which sets the default values for the
static parameters. To use the default values, you can use the parameterization
struct directly with the DDS function.
static hls::DDS< hls::ip_dds::params_t > dds1;
dds1.run(data_channel, phase_channel);
The following table describes the parameters for the hls::ip_dds::params_t parameterization struct.
| Parameter | Description |
|---|---|
| DDS_Clock_Rate | Specifies the clock rate for the DDS output. |
| Channels | Specifies the number of channels. The DDS and phase generator can support up to 16 channels. The channels are time-multiplexed, which reduces the effective clock frequency per channel. |
| Mode_of_Operation | Specifies one of the following operation modes. Standard mode: for use when the accumulated phase can be truncated before it is used to access the SIN/COS LUT. Rasterized mode: for use when the desired frequencies and system clock are related by a rational fraction. |
| Modulus | Describes the relationship between the system clock frequency and the desired frequencies. Use this parameter in rasterized mode only. |
| Spurious_Free_Dynamic_Range | Specifies the targeted purity of the tone produced by the DDS. |
| Frequency_Resolution | Specifies the minimum frequency resolution in Hz and determines the Phase Width used by the phase accumulator, including associated phase increment (PINC) and phase offset (POFF) values. |
| Noise_Shaping | Controls whether to use phase truncation, dithering, or Taylor series correction. |
| Phase_Width | Sets the width of the following: the PHASE_OUT field, the Phase fields, the phase accumulator, and the associated phase increment and offset registers. For rasterized mode, the phase width is fixed as the number of bits required to describe the valid input range. |
| Output_Width | Sets the width of SINE and COSINE fields within m_axis_data_tdata. The SFDR provided by this parameter depends on the selected Noise Shaping option. |
| Phase_Increment | Selects the phase increment value. |
| Phase_Offset | Selects the phase offset value. |
| Output_Selection | Sets the output selection to SINE, COSINE, or both in the m_axis_data_tdata bus. |
| Negative_Sine | Negates the SINE field at run time. |
| Negative_Cosine | Negates the COSINE field at run time. |
| Amplitude_Mode | Sets the amplitude to full range or unit circle. |
| Memory_Type | Controls the implementation of the SIN/COS LUT. |
| Optimization_Goal | Controls whether the implementation decisions target highest speed or lowest resource. |
| DSP48_Use | Controls the implementation of the phase accumulator and addition stages for phase offset, dither noise addition, or both. |
| Latency_Configuration | Sets the latency of the core to the optimum value based upon the Optimization Goal. |
| Latency | Specifies the manual latency value. |
| Output_Form | Sets the output form to two’s complement or to sign and magnitude. In general, the output of SINE and COSINE is in two’s complement form. However, when quadrant symmetry is used, the output form can be changed to sign and magnitude. |
| PINC[XIP_DDS_CHANNELS_MAX] | Sets the values for the phase increment for each output channel. |
| POFF[XIP_DDS_CHANNELS_MAX] | Sets the values for the phase offset for each output channel. |
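The relationship between clock rate, channel count, frequency resolution, and phase width can be made concrete with a little arithmetic. The sketch below uses the usual phase-accumulator relations for standard mode: resolution equals the per-channel clock divided by 2^B for a B-bit phase, and the increment for a desired output frequency is that frequency times 2^B divided by the per-channel clock. These relations are general DDS background rather than something stated in this section, the function names are hypothetical, and whether the PINC entries of the parameterization struct expect values in this form is not specified here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Smallest phase width (in bits) whose resolution is at most resolution_hz.
// Assumption: time-multiplexing divides the clock evenly between channels.
inline unsigned phase_width_for(double clock_hz, unsigned channels,
                                double resolution_hz) {
    const double per_channel_clock = clock_hz / channels;
    return static_cast<unsigned>(
        std::ceil(std::log2(per_channel_clock / resolution_hz)));
}

// Phase increment (in accumulator LSBs) that gives an output of out_hz.
inline std::uint64_t phase_increment_for(double clock_hz, unsigned channels,
                                         unsigned phase_width, double out_hz) {
    const double per_channel_clock = clock_hz / channels;
    const double full_scale = std::ldexp(1.0, static_cast<int>(phase_width));
    return static_cast<std::uint64_t>(
        std::llround(out_hz * full_scale / per_channel_clock));
}

// Usage sketch with illustrative numbers.
inline void dds_arithmetic_demo() {
    // 100 MHz clock, one channel, 10 Hz resolution needs a 24-bit phase.
    assert(phase_width_for(100e6, 1, 10.0) == 24);
    // A quarter of the clock rate is a quarter of the 16-bit phase range.
    assert(phase_increment_for(100e6, 1, 16, 25e6) == 16384);
}
```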
DDS Struct Parameter Values
The following table shows the possible values for the hls::ip_dds::params_t parameterization struct parameters.
| Parameter | C Type | Default Value | Valid Values |
|---|---|---|---|
| DDS_Clock_Rate | double | 20.0 | Any double value |
| Channels | unsigned | 1 | 1 to 16 |
| Mode_of_Operation | unsigned | XIP_DDS_MOO_CONVENTIONAL | XIP_DDS_MOO_CONVENTIONAL truncates the accumulated phase. XIP_DDS_MOO_RASTERIZED selects rasterized mode. |
| Modulus | unsigned | 200 | 129 to 256 |
| Spurious_Free_Dynamic_Range | double | 20.0 | 18.0 to 150.0 |
| Frequency_Resolution | double | 10.0 | 0.000000001 to 125000000 |
| Noise_Shaping | unsigned | XIP_DDS_NS_NONE | XIP_DDS_NS_NONE produces a phase truncation DDS. XIP_DDS_NS_DITHER uses phase dither to improve SFDR at the expense of an increased noise floor. XIP_DDS_NS_TAYLOR interpolates sine/cosine values using the otherwise discarded bits from phase truncation. XIP_DDS_NS_AUTO automatically determines noise shaping. |
| Phase_Width | unsigned | 16 | Must be an integer multiple of 8 |
| Output_Width | unsigned | 16 | Must be an integer multiple of 8 |
| Phase_Increment | unsigned | XIP_DDS_PINCPOFF_FIXED | XIP_DDS_PINCPOFF_FIXED fixes PINC at generation time, and PINC cannot be changed at run time. This is the only value supported. |
| Phase_Offset | unsigned | XIP_DDS_PINCPOFF_NONE | XIP_DDS_PINCPOFF_NONE does not generate phase offset. XIP_DDS_PINCPOFF_FIXED fixes POFF at generation time, and POFF cannot be changed at run time. |
| Output_Selection | unsigned | XIP_DDS_OUT_SIN_AND_COS | XIP_DDS_OUT_SIN_ONLY produces sine output only. XIP_DDS_OUT_COS_ONLY produces cosine output only. XIP_DDS_OUT_SIN_AND_COS produces both sin and cosine output. |
| Negative_Sine | unsigned | XIP_DDS_ABSENT | XIP_DDS_ABSENT produces standard sine wave. XIP_DDS_PRESENT negates sine wave. |
| Negative_Cosine | bool | XIP_DDS_ABSENT | XIP_DDS_ABSENT produces standard cosine wave. XIP_DDS_PRESENT negates cosine wave. |
| Amplitude_Mode | unsigned | XIP_DDS_FULL_RANGE | XIP_DDS_FULL_RANGE normalizes amplitude to the output width with the binary point in the first place. For example, an 8-bit output has a binary amplitude of 100000000 - 10 giving values between 01111110 and 10000010, which correspond to just less than 1 and just more than -1 respectively. XIP_DDS_UNIT_CIRCLE normalizes amplitude to half full range, that is, values range from 01000... (+0.5) to 11000... (-0.5). |
| Memory_Type | unsigned | XIP_DDS_MEM_AUTO | XIP_DDS_MEM_AUTO selects distributed ROM for small cases where the table can be contained in a single layer of memory and selects block ROM for larger cases. XIP_DDS_MEM_BLOCK always uses block RAM. XIP_DDS_MEM_DIST always uses distributed RAM. |
| Optimization_Goal | unsigned | XIP_DDS_OPTGOAL_AUTO | XIP_DDS_OPTGOAL_AUTO automatically selects the optimization goal. XIP_DDS_OPTGOAL_AREA optimizes for area. XIP_DDS_OPTGOAL_SPEED optimizes for performance. |
| DSP48_Use | unsigned | XIP_DDS_DSP_MIN | XIP_DDS_DSP_MIN implements the phase accumulator and the stages for phase offset, dither noise addition, or both in FPGA logic. XIP_DDS_DSP_MAX implements the phase accumulator and the phase offset, dither noise addition, or both using DSP slices. In the case of single channel, the DSP slice can also provide the register to store programmable phase increment, phase offset, or both and thereby, save further fabric resources. |
| Latency_Configuration | unsigned | XIP_DDS_LATENCY_AUTO | XIP_DDS_LATENCY_AUTO automatically determines the latency. XIP_DDS_LATENCY_MANUAL manually specifies the latency using the Latency option. |
| Latency | unsigned | 5 | Any value |
| Output_Form | unsigned | XIP_DDS_OUTPUT_TWOS | XIP_DDS_OUTPUT_TWOS outputs two's complement. XIP_DDS_OUTPUT_SIGN_MAG outputs signed magnitude. |
| PINC[XIP_DDS_CHANNELS_MAX] | unsigned array | {0} | Any value for the phase increment for each channel |
| POFF[XIP_DDS_CHANNELS_MAX] | unsigned array | {0} | Any value for the phase offset for each channel |
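Several of the valid ranges in the table can be checked at compile time before synthesis is attempted. The following sketch shows one possible way to do that with static_assert. Because hls_dds.h is not available outside a Vitis HLS installation, mock_params_t is a hypothetical stand-in that copies only a few fields and defaults from the table; in a real design the same checks would presumably be applied to a struct derived from hls::ip_dds::params_t. The helper dds_params_in_range is also hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for a few fields of hls::ip_dds::params_t,
// with default values copied from the table above.
struct mock_params_t {
    static const unsigned Channels = 1;
    static const unsigned Modulus = 200;
    static const unsigned Phase_Width = 16;
    static const unsigned Output_Width = 16;
};

// User struct that overrides two of the defaults.
struct my_params : mock_params_t {
    static const unsigned Channels = 4;
    static const unsigned Phase_Width = 24;
};

// Deliberately invalid struct: the table allows 1 to 16 channels.
struct bad_params : mock_params_t {
    static const unsigned Channels = 17;
};

// Checks the ranges listed in the table. Modulus is only meaningful in
// rasterized mode, but its default is in range, so it is checked always.
template <typename P>
constexpr bool dds_params_in_range() {
    return P::Channels >= 1 && P::Channels <= 16 &&
           P::Modulus >= 129 && P::Modulus <= 256 &&
           P::Phase_Width % 8 == 0 &&
           P::Output_Width % 8 == 0;
}

static_assert(dds_params_in_range<mock_params_t>(), "defaults out of range");
static_assert(dds_params_in_range<my_params>(), "my_params out of range");

// Usage sketch: the invalid struct is rejected.
inline void dds_param_check_demo() {
    assert(!dds_params_in_range<bad_params>());
}
```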
SRL IP Library
C code is written to satisfy several different requirements: reuse, readability, and performance. It is unlikely that existing C code was also written to produce the most efficient hardware after high-level synthesis.
As with the requirements for reuse, readability, and performance, certain coding techniques or pre-defined constructs can ensure that synthesis produces more optimal hardware, or can better model hardware in C for easier validation of the algorithm.
Mapping Directly into SRL Resources
Many C algorithms sequentially shift data through arrays. They add a new value to the start of the array, shift the existing data through the array, and drop the oldest data value. This operation is implemented in hardware as a shift register.
The most common way to implement a shift register from C into hardware is to completely partition the array into individual elements, and allow the data dependencies between the elements in the RTL to imply a shift register.
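The array-based idiom described above can be sketched in plain C++ as follows. The names kDepth and shift_in are hypothetical, and the register array is passed in by the caller so that the behavior is easy to exercise in a test bench. In a Vitis HLS design, the array would additionally be completely partitioned (for example with the ARRAY_PARTITION directive and its complete option), which is only noted in a comment here.

```cpp
#include <cassert>

const int kDepth = 4;

// Shift din into position 0, move every element up by one, and return the
// value that drops off the end. In Vitis HLS, regs would be completely
// partitioned so that each element becomes an individual register.
inline int shift_in(int (&regs)[kDepth], int din) {
    const int oldest = regs[kDepth - 1];
    for (int i = kDepth - 1; i > 0; --i) {
        regs[i] = regs[i - 1];
    }
    regs[0] = din;
    return oldest;
}

// Usage sketch: the first value pushed emerges after kDepth more pushes.
inline void shift_in_demo() {
    int regs[kDepth] = {0, 0, 0, 0};
    for (int v = 1; v <= 4; ++v) {
        assert(shift_in(regs, v) == 0);  // still draining initial zeros
    }
    assert(shift_in(regs, 5) == 1);      // oldest real value drops out
    assert(regs[0] == 5 && regs[3] == 2);
}
```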
Logic synthesis typically implements the RTL shift register into a Xilinx SRL resource, which efficiently implements shift registers. The issue is that sometimes logic synthesis does not implement the RTL shift register using an SRL component:
- When data is accessed in the middle of the shift register, logic synthesis cannot directly infer an SRL.
- Sometimes, even when the SRL is ideal, logic synthesis may implement the shift register in flip-flops due to other factors. (Logic synthesis is itself a complex process.)
Vitis HLS provides a C++ class (ap_shift_reg) to ensure that the shift register defined in the C code is always implemented using an SRL resource. The ap_shift_reg class has two methods to perform the various read and write accesses supported by an SRL component.
Read from the Shifter
The read method allows a specified location to be read from the shift register.
The ap_shift_reg.h header file that defines
the ap_shift_reg class is also included with
Vitis HLS as a standalone package. You
have the right to use it in your own source code. The package xilinx_hls_lib_<release_number>.tgz is located
in the include directory in the Vitis HLS installation area.
// Include the Class
#include "ap_shift_reg.h"
// Define a variable of type ap_shift_reg<type, depth>
// - Sreg must use the static qualifier
// - Sreg will hold integer data types
// - Sreg will hold 4 data values
static ap_shift_reg<int, 4> Sreg;
int var1;
// Read location 2 of Sreg into var1
var1 = Sreg.read(2);
Read, Write, and Shift Data
A shift method allows a read, write, and shift operation to be performed.
// Include the Class
#include "ap_shift_reg.h"
// Define a variable of type ap_shift_reg<type, depth>
// - Sreg must use the static qualifier
// - Sreg will hold integer data types
// - Sreg will hold 4 data values
static ap_shift_reg<int, 4> Sreg;
int var1, In1;
// Read location 3 of Sreg into var1
// THEN shift all values up one and load In1 into location 0
var1 = Sreg.shift(In1,3);
Read, Write, and Enable-Shift
The shift method also supports an enabled input, allowing the shift process to be controlled and enabled by a variable.
// Include the Class
#include "ap_shift_reg.h"
// Define a variable of type ap_shift_reg<type, depth>
// - Sreg must use the static qualifier
// - Sreg will hold integer data types
// - Sreg will hold 4 data values
static ap_shift_reg<int, 4> Sreg;
int var1, In1;
bool En;
// Read location 3 of Sreg into var1
// THEN if En=1
// Shift all values up one and load In1 into location 0
var1 = Sreg.shift(In1,3,En);
When using the ap_shift_reg class, Vitis HLS creates a unique RTL component for
each shifter. When logic synthesis is performed, this component is synthesized into an SRL
resource.
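To reason about the read, shift, and enabled-shift semantics without the Vitis HLS headers, the behavior described in the comments of the preceding examples can be mirrored by a small plain C++ class. ShiftRegModel is a hypothetical stand-in, not the ap_shift_reg class itself, and the zero initial contents are an assumption made so that the trace is deterministic.

```cpp
#include <cassert>

// Hypothetical software model of the behavior described for ap_shift_reg:
// read(addr) returns a location; shift(din, addr, enable) returns the
// location first, then (if enabled) shifts up by one and loads din at 0.
template <typename T, unsigned Depth>
class ShiftRegModel {
public:
    ShiftRegModel() {
        for (unsigned i = 0; i < Depth; ++i) data_[i] = T();  // assumption
    }

    T read(unsigned addr) const { return data_[addr]; }

    T shift(T din, unsigned addr, bool enable = true) {
        const T ret = data_[addr];  // read happens before the shift
        if (enable) {
            for (unsigned i = Depth - 1; i > 0; --i) data_[i] = data_[i - 1];
            data_[0] = din;
        }
        return ret;
    }

private:
    T data_[Depth];
};

// Usage sketch tracing the three access patterns.
inline void shift_reg_model_demo() {
    ShiftRegModel<int, 4> sreg;
    for (int v = 10; v <= 40; v += 10) sreg.shift(v, 3);  // 40 30 20 10
    assert(sreg.read(2) == 20);
    assert(sreg.shift(50, 3) == 10);         // now 50 40 30 20
    assert(sreg.shift(60, 3, false) == 20);  // disabled: contents unchanged
    assert(sreg.read(0) == 50);
}
```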