Vitis HLS Libraries Reference
 Arbitrary Precision Data Types Library: arbitrary precision data types allowing C code to use variables with smaller bit-widths for improved performance and area in hardware.
 Vitis HLS Math Library: used to specify standard math operations for synthesis into Xilinx devices.
 HLS Stream Library: for modeling and compiling streaming data structures.
 HLS IP Libraries: IP functions, including fast Fourier transform (FFT) and finite impulse response (FIR).
You can use each of the C libraries in your design by including the library header file in your code. These header files are located in the include directory in the Vitis HLS installation area.
Arbitrary Precision Data Types Library
C-based native data types are on 8-bit boundaries (8, 16, 32, 64 bits). RTL buses (corresponding to hardware) support arbitrary lengths. HLS needs a mechanism to allow the specification of arbitrary precision bit-width and not rely on the artificial boundaries of native C data types: if a 17-bit multiplier is required, you should not be forced to implement this with a 32-bit multiplier.
Vitis HLS provides both integer and fixed-point arbitrary precision data types for C++. The advantage of arbitrary precision data types is that they allow the C code to be updated to use variables with smaller bit-widths and then for the C simulation to be re-executed to validate that the functionality remains identical or acceptable.
Using Arbitrary Precision Data Types
Vitis HLS provides arbitrary precision integer data types that manage the value of the integer numbers within the boundaries of the specified width, as shown in the following table.
Language  Integer Data Type  Required Header
C++  ap_[u]int<W> (1024 bits; can be extended to 32K bits wide)  #include "ap_int.h"
C++  ap_[u]fixed<W,I,Q,O,N>  #include "ap_fixed.h"
The header files that define the arbitrary precision types are also provided with Vitis HLS as a standalone package with the rights to use them in your own source code. The package, xilinx_hls_lib_<release_number>.tgz, is provided in the include directory in the Vitis HLS installation area.
Arbitrary Integer Precision Types with C++
The header file ap_int.h defines the arbitrary precision integer data type for the C++ ap_[u]int data types. To use arbitrary precision integer data types in a C++ function:
 Add header file ap_int.h to the source code.
 Change the bit types to ap_int<N> for signed types or ap_uint<N> for unsigned types, where N is a bit-size from 1 to 1024.
The following example shows how the header file is added and two variables implemented to use 9-bit integer and 10-bit unsigned integer types:
#include "ap_int.h"
void foo_top (…) {
ap_int<9> var1; // 9-bit
ap_uint<10> var2; // 10-bit unsigned
Arbitrary Precision Fixed-Point Data Types
In Vitis HLS, it is important to use fixed-point data types, because the behavior of the C++ simulations performed using fixed-point data types matches that of the resulting hardware created by synthesis. This allows you to analyze the effects of bit-accuracy, quantization, and overflow with fast C-level simulation.
These data types manage the value of real (non-integer) numbers within the boundaries of a specified total width and integer width, as shown in the following figure.
Fixed-Point Identifier Summary
The following table provides a brief overview of operations supported by fixed-point types.
Identifier  Description
W  Word length in bits
I  The number of bits used to represent the integer value (the number of bits above the decimal point)
Q  Quantization mode: This dictates the behavior when greater precision is generated than can be defined by the smallest fractional bit in the variable used to store the result.
  ap_fixed Types  Description
  AP_RND  Round to plus infinity
  AP_RND_ZERO  Round to zero
  AP_RND_MIN_INF  Round to minus infinity
  AP_RND_INF  Round to infinity
  AP_RND_CONV  Convergent rounding
  AP_TRN  Truncation to minus infinity (default)
  AP_TRN_ZERO  Truncation to zero
O  Overflow mode: This dictates the behavior when the result of an operation exceeds the maximum (or minimum in the case of negative numbers) possible value that can be stored in the variable used to store the result.
  ap_fixed Types  Description
  AP_SAT  Saturation
  AP_SAT_ZERO  Saturation to zero
  AP_SAT_SYM  Symmetrical saturation
  AP_WRAP  Wrap around (default)
  AP_WRAP_SM  Sign magnitude wrap around
N  This defines the number of saturation bits in overflow wrap modes.
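The effect of the quantization and overflow modes can be explored without the Vitis HLS headers. The following is a minimal sketch in plain C++, not the ap_fixed implementation: it models only AP_TRN, AP_RND, AP_SAT, and AP_WRAP on a raw integer that counts multiples of 2^-frac_bits. All function names are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Illustrative model only: these helpers are NOT part of ap_fixed.h.

// AP_TRN: truncation to minus infinity (floor of the scaled value).
inline std::int64_t quantize_trn(double x, int frac_bits) {
    return static_cast<std::int64_t>(std::floor(std::ldexp(x, frac_bits)));
}

// AP_RND: round to plus infinity (add half an LSB, then floor).
inline std::int64_t quantize_rnd(double x, int frac_bits) {
    return static_cast<std::int64_t>(std::floor(std::ldexp(x, frac_bits) + 0.5));
}

// AP_SAT: clamp the raw value to the signed range of a w-bit word.
inline std::int64_t overflow_sat(std::int64_t raw, int w) {
    assert(w >= 1 && w < 63);
    const std::int64_t max = (std::int64_t{1} << (w - 1)) - 1;
    const std::int64_t min = -max - 1;
    return raw > max ? max : (raw < min ? min : raw);
}

// AP_WRAP: keep the low w bits and reinterpret them as a signed value.
inline std::int64_t overflow_wrap(std::int64_t raw, int w) {
    assert(w >= 1 && w < 63);
    const std::uint64_t mask = (std::uint64_t{1} << w) - 1;
    const std::uint64_t sign = std::uint64_t{1} << (w - 1);
    const std::uint64_t low = static_cast<std::uint64_t>(raw) & mask;
    // (low ^ sign) - sign copies bit w-1 into all higher bits.
    return static_cast<std::int64_t>(low ^ sign) - static_cast<std::int64_t>(sign);
}
```

With two fractional bits, 0.375 scales to 1.5, so quantize_trn gives a raw value of 1 and quantize_rnd gives 2. For an 8-bit word, a raw value of 200 saturates to 127 but wraps to -56.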
Example Using ap_fixed
In this example the Vitis HLS ap_fixed type is used to define an 18-bit variable with 6 bits representing the numbers above the decimal point and 12 bits representing the value below the decimal point. The variable is specified as signed, the quantization mode is set to round to plus infinity, and the default wrap-around mode is used for overflow.
#include <ap_fixed.h>
...
ap_fixed<18,6,AP_RND > my_type;
...
C++ Arbitrary Precision Integer Types
The native data types in C++ are on 8-bit boundaries (8, 16, 32 and 64 bits). RTL signals and operations support arbitrary bit-lengths.
Vitis HLS provides arbitrary precision data types for C++ to allow variables and operations in the C++ code to be specified with any arbitrary bit-widths: 6-bit, 17-bit, 234-bit, up to 1024 bits.
The default maximum width allowed is 1024 bits. This default may be overridden by defining the macro AP_INT_MAX_W with a positive integer value less than or equal to 32768 before inclusion of the ap_int.h header file.
Arbitrary precision data types have two primary advantages over the native C++ types:
 Better quality hardware: If, for example, a 17-bit multiplier is required, arbitrary precision types can specify that exactly 17 bits are used in the calculation. Without arbitrary precision data types, such a multiplication (17-bit) must be implemented using 32-bit integer data types, which results in the multiplication being implemented with multiple DSP modules.
 Accurate C++ simulation/analysis: Arbitrary precision data types in the C++ code allow the C++ simulation to be performed using accurate bit-widths and for the C++ simulation to validate the functionality (and accuracy) of the algorithm before synthesis.
The arbitrary precision types in C++ have none of the disadvantages of those in C:
 C++ arbitrary types can be compiled with standard C++ compilers (there is no C++ equivalent of apcc).
 C++ arbitrary precision types do not suffer from Integer Promotion Issues.
It is not uncommon for users to change a file extension from .c to .cpp so the file can be compiled as C++, where neither of these issues is present.
For the C++ language, the header file ap_int.h defines the arbitrary precision integer data types ap_[u]int<W>. For example, ap_int<8> represents an 8-bit signed integer data type and ap_uint<234> represents a 234-bit unsigned integer type.
The ap_int.h file is located in the directory $HLS_ROOT/include, where $HLS_ROOT is the Vitis HLS installation directory.
The code shown in the following example is a repeat of the code shown in the Basic Arithmetic example in Standard Types. In this example, the data types in the top-level function to be synthesized are specified as dinA_t, dinB_t, and so on.
#include "cpp_ap_int_arith.h"
void cpp_ap_int_arith(dinA_t inA, dinB_t inB, dinC_t inC, dinD_t inD,
dout1_t *out1, dout2_t *out2, dout3_t *out3, dout4_t *out4
) {
// Basic arithmetic operations
*out1 = inA * inB;
*out2 = inB + inA;
*out3 = inC / inA;
*out4 = inD % inA;
}
In this latest update to this example, the C++ arbitrary precision types are used:
 Add header file ap_int.h to the source code.
 Change the native C++ types to arbitrary precision types ap_int<N> or ap_uint<N>, where N is a bit-size from 1 to 1024 (as noted above, this can be extended to 32K bits if required).
The data types are defined in the header cpp_ap_int_arith.h.
Compared with the Basic Arithmetic example in Standard Types, the input data types have simply been reduced to represent the maximum size of the real input data (for example, 8-bit input inA is reduced to 6-bit input). The output types have been refined to be more accurate, for example, out2, the sum of inA and inB, need only be 13-bit and not 32-bit.
The following example shows basic arithmetic with C++ arbitrary precision types.
#ifndef _CPP_AP_INT_ARITH_H_
#define _CPP_AP_INT_ARITH_H_
#include <stdio.h>
#include "ap_int.h"
#define N 9
// Old data types
//typedef char dinA_t;
//typedef short dinB_t;
//typedef int dinC_t;
//typedef long long dinD_t;
//typedef int dout1_t;
//typedef unsigned int dout2_t;
//typedef int32_t dout3_t;
//typedef int64_t dout4_t;
typedef ap_int<6> dinA_t;
typedef ap_int<12> dinB_t;
typedef ap_int<22> dinC_t;
typedef ap_int<33> dinD_t;
typedef ap_int<18> dout1_t;
typedef ap_uint<13> dout2_t;
typedef ap_int<22> dout3_t;
typedef ap_int<6> dout4_t;
void cpp_ap_int_arith(dinA_t inA, dinB_t inB, dinC_t inC, dinD_t inD,
dout1_t *out1, dout2_t *out2, dout3_t *out3, dout4_t *out4);
#endif
If C++ Arbitrary Precision Integer Types are synthesized, it results in a design that is functionally identical to Standard Types. Rather than use the C++ cout operator to output the results to a file, the built-in ap_int method .to_int() is used to convert the ap_int results to integer types used with the standard fprintf function.
fprintf(fp, "%d*%d=%d; %d+%d=%d; %d/%d=%d; %d mod %d=%d;\n",
inA.to_int(), inB.to_int(), out1.to_int(),
inB.to_int(), inA.to_int(), out2.to_int(),
inC.to_int(), inA.to_int(), out3.to_int(),
inD.to_int(), inA.to_int(), out4.to_int());
C++ Arbitrary Precision Integer Types: Reference Information
For comprehensive information on the methods, synthesis behavior, and all aspects of using the ap_[u]int<N> arbitrary precision data types, see C++ Arbitrary Precision Types. This section includes:
 Techniques for assigning constant and initialization values to arbitrary precision integers (including values greater than 1024-bit).
 A description of Vitis HLS helper methods, such as printing, concatenating, bit-slicing and range selection functions.
 A description of operator behavior, including a description of shift operations (a negative shift value results in a shift in the opposite direction).
C++ Arbitrary Precision Types
Vitis HLS provides a C++ template class, ap_[u]int<>, that implements arbitrary precision (or bit-accurate) integer data types with consistent, bit-accurate behavior between software and hardware modeling.
This class provides all arithmetic, bitwise, logical and relational operators allowed for native C integer types. In addition, this class provides methods to handle some useful hardware operations, such as allowing initialization and conversion of variables of widths greater than 64 bits. Details for all operators and class methods are discussed below.
Compiling ap_[u]int<> Types
To use the ap_[u]int<> classes, you must include the ap_int.h header file in all source files that reference ap_[u]int<> variables.
When compiling software models that use these classes, it may be necessary to specify the location of the Vitis HLS header files, for example by adding the -I/<HLS_HOME>/include option for g++ compilation.
Declaring/Defining ap_[u]int Variables
There are separate signed and unsigned classes:
 ap_int<int_W> (signed)
 ap_uint<int_W> (unsigned)
The template parameter int_W specifies the total width of the variable being declared.
User-defined types may be created with the C/C++ typedef statement as shown in the following examples:
#include "ap_int.h" // use ap_[u]int<> types
typedef ap_uint<128> uint128_t; // 128-bit user defined type
ap_int<96> my_wide_var; // a global variable declaration
The default maximum width allowed is 1024 bits. This default may be overridden by defining the macro AP_INT_MAX_W with a positive integer value less than or equal to 32768 before inclusion of the ap_int.h header file.
Setting the value of AP_INT_MAX_W too high may cause slow software compile and run times.
Following is an example of overriding AP_INT_MAX_W:
#define AP_INT_MAX_W 4096 // Must be defined before next line
#include "ap_int.h"
ap_int<4096> very_wide_var;
Initialization and Assignment from Constants (Literals)
The class constructor and assignment operator overloads allow initialization of and assignment to ap_[u]int<> variables using standard C/C++ integer literals.
This method of assigning values to ap_[u]int<> variables is subject to the limitations of C++ and the system upon which the software will run. This typically leads to a 64-bit limit on integer literals (for example, for those with LL or ULL suffixes).
To allow assignment of values wider than 64 bits, the ap_[u]int<> classes provide constructors that allow initialization from a string of arbitrary length (less than or equal to the width of the variable).
By default, the string provided is interpreted as a hexadecimal value as long as it contains only valid hexadecimal digits (that is, 0-9 and a-f). To assign a value from such a string, an explicit C++ style cast of the string to the appropriate type must be made.
Following are examples of initialization and assignments, including for values greater than 64-bit:
ap_int<42> a_42b_var(1424692392255LL); // long long decimal format
a_42b_var = 0x14BB648B13FLL; // hexadecimal format
a_42b_var = -1; // negative int literal sign-extended to full width
ap_uint<96> wide_var("76543210fedcba9876543210", 16); // Greater than 64-bit
wide_var = ap_int<96>("0123456789abcdef01234567", 16);
ap_uint<N> a ={0}
The ap_[u]int<> constructor may be explicitly instructed to interpret the string as representing the number in radix 2, 8, 10, or 16 formats. This is accomplished by adding the appropriate radix value as a second parameter to the constructor call.
A compilation error occurs if the string literal contains any characters that are invalid as digits for the radix specified.
The following examples use different radix formats:
ap_int<6> a_6bit_var("101010", 2); // 42d in binary format
a_6bit_var = ap_int<6>("40", 8); // 32d in octal format
a_6bit_var = ap_int<6>("55", 10); // decimal format
a_6bit_var = ap_int<6>("2A", 16); // 42d in hexadecimal format
a_6bit_var = ap_int<6>("42", 2); // COMPILE-TIME ERROR! "42" is not binary
The radix of the number encoded in the string can also be inferred by the constructor, when it is prefixed with a zero (0) followed by one of the following characters: "b", "o" or "x". The prefixes "0b", "0o" and "0x" correspond to binary, octal and hexadecimal formats respectively.
The following examples use alternate initializer string formats:
ap_int<6> a_6bit_var("0b101010", 2); // 42d in binary format
a_6bit_var = ap_int<6>("0o40", 8); // 32d in octal format
a_6bit_var = ap_int<6>("0x2A", 16); // 42d in hexadecimal format
a_6bit_var = ap_int<6>("0b42", 2); // COMPILE-TIME ERROR! "42" is not binary
If the bit-width is greater than 53 bits, the ap_[u]fixed value must be initialized with a string, for example:
ap_ufixed<72,10> Val("2460508560057040035.375");
Support for Console I/O (Printing)
As with initialization and assignment to ap_[u]int<> variables, Vitis HLS supports printing values that require more than 64 bits to represent.
Using the C++ Standard Output Stream
The easiest way to output any value stored in an ap_[u]int variable is to use the C++ standard output stream, std::cout (#include <iostream> or <iostream.h>).
The stream insertion operator (<<) is overloaded to correctly output the full range of values possible for any given ap_[u]int variable. The following stream manipulators are also supported:
 dec (decimal)
 hex (hexadecimal)
 oct (octal)
These allow formatting of the value as indicated.
The following example uses cout to print values:
#include <iostream.h>
// Alternative: #include <iostream>
ap_uint<72> Val("10fedcba9876543210");
cout << Val << endl; // Yields: "313512663723845890576"
cout << hex << Val << endl; // Yields: "10fedcba9876543210"
cout << oct << Val << endl; // Yields: "41773345651416625031020"
Using the Standard C Library
You can also use the standard C library (#include <stdio.h>) to print out values larger than 64 bits:
 Convert the value to a C++ std::string using the ap_[u]int class method to_string().
 Convert the result to a null-terminated C character string using the std::string class method c_str().
Optional Argument One (Specifying the Radix)
You can pass the ap_[u]int::to_string() method an optional argument specifying the radix of the numerical format desired. The valid radix argument values are:
 2 (binary) (default)
 8 (octal)
 10 (decimal)
 16 (hexadecimal)
Optional Argument Two (Printing as Signed Values)
A second optional argument to ap_[u]int::to_string() specifies whether to print the non-decimal formats as signed values. This argument is boolean. The default value is false, causing the non-decimal formats to be printed as unsigned values.
The following examples use printf to print values:
ap_int<72> Val("80fedcba9876543210");
printf("%s\n", Val.to_string().c_str()); // => "80FEDCBA9876543210"
printf("%s\n", Val.to_string(10).c_str()); // => "-2342818482890329542128"
printf("%s\n", Val.to_string(8).c_str()); // => "401773345651416625031020"
printf("%s\n", Val.to_string(16, true).c_str()); // => "-7F0123456789ABCDF0"
Expressions Involving ap_[u]int<> Types
Variables of ap_[u]int<> types may generally be used freely in expressions involving C/C++ operators. Some behaviors may be unexpected. These are discussed in detail below.
Zero- and Sign-Extension on Assignment From Narrower to Wider Variables
When assigning the value of a narrower bit-width signed (ap_int<>) variable to a wider one, the value is sign-extended to the width of the destination variable, regardless of its signedness.
Similarly, an unsigned source variable is zero-extended before assignment.
Explicit casting of the source variable may be necessary to ensure expected behavior on assignment. See the following example:
ap_uint<10> Result;
ap_int<7> Val1 = 0x7f;
ap_uint<6> Val2 = 0x3f;
Result = Val1; // Yields: 0x3ff (sign-extended)
Result = Val2; // Yields: 0x03f (zero-padded)
Result = ap_uint<7>(Val1); // Yields: 0x07f (zero-padded)
Result = ap_int<6>(Val2); // Yields: 0x3ff (sign-extended)
Truncation on Assignment of Wider to Narrower Variables
Assigning the value of a wider source variable to a narrower one leads to truncation of the value. All bits beyond the most significant bit (MSB) position of the destination variable are lost.
There is no special handling of the sign information during truncation. This may lead to unexpected behavior. Explicit casting may help avoid this unexpected behavior.
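The extension and truncation rules above can be checked with ordinary integer arithmetic. The following is a minimal sketch in plain C++ that models the bit patterns only; it does not use ap_int.h, and the helper names truncate_bits, sign_extend, and assign are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: these helpers are NOT part of ap_int.h.

// Keep only the low `width` bits, as assignment to a narrower variable does.
inline std::uint64_t truncate_bits(std::uint64_t raw, int width) {
    assert(width >= 1 && width < 64);
    return raw & ((std::uint64_t{1} << width) - 1);
}

// Widen a signed source of `src_width` bits by copying its sign bit upward.
inline std::uint64_t sign_extend(std::uint64_t raw, int src_width) {
    const std::uint64_t low = truncate_bits(raw, src_width);
    const std::uint64_t sign = std::uint64_t{1} << (src_width - 1);
    return (low ^ sign) - sign;  // modulo-2^64 arithmetic fills the upper bits
}

// Model of `dst = src`: extend according to the SOURCE signedness,
// then keep the low dst_width bits.
inline std::uint64_t assign(std::uint64_t src_raw, int src_width,
                            bool src_signed, int dst_width) {
    const std::uint64_t wide = src_signed ? sign_extend(src_raw, src_width)
                                          : truncate_bits(src_raw, src_width);
    return truncate_bits(wide, dst_width);
}
```

For example, assign(0x7f, 7, true, 10) gives 0x3ff, while assign(0x7f, 7, false, 10) gives 0x07f. Truncation drops the sign: the 10-bit pattern 0x37f (-129) assigned to an 8-bit destination leaves 0x7f, which reads as +127.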
Class Methods and Operators
The ap_[u]int types do not support implicit conversion from wide ap_[u]int (>64 bits) to built-in C/C++ integer types. For example, the following code example returns 1, because the implicit cast from ap_uint<65> to bool in the if-statement returns a 0.
bool nonzero(ap_uint<65> data) {
return data; // This leads to implicit truncation to 64b int
}
int main() {
if (nonzero((ap_uint<65>)1 << 64)) {
return 0;
}
printf("FAIL\n");
return 1;
}
To convert wide ap_[u]int types to built-in integers, use the explicit conversion functions included with the ap_[u]int types:
 to_int()
 to_long()
 to_bool()
In general, any valid operation that can be done on a native C/C++ integer data type is supported using operator overloading for ap_[u]int types.
In addition to these overloaded operators, some class specific operators and methods are included to ease bit-level operations.
Binary Arithmetic Operators
Standard binary integer arithmetic operators are overloaded to provide arbitrary precision arithmetic. These operators take either:
 Two operands of ap_[u]int, or
 One ap_[u]int type and one C/C++ fundamental integer data type
For example:
 char
 short
 int
The width and signedness of the resulting value is determined by the width and signedness of the operands, before sign-extension, zero-padding or truncation are applied based on the width of the destination variable (or expression). Details of the return value are described for each operator.
When expressions contain a mix of ap_[u]int and C/C++ fundamental integer types, the C++ types assume the following widths:
 char (8 bits)
 short (16 bits)
 int (32 bits)
 long (32 bits)
 long long (64 bits)
Addition
ap_(u)int::RType ap_(u)int::operator + (ap_(u)int op)
Returns the sum of:
 Two ap_[u]int, or
 One ap_[u]int and a C/C++ integer type
The width of the sum value is:
 One bit more than the wider of the two operands, or
 Two bits more if and only if the wider is unsigned and the narrower is signed
The sum is treated as signed if either (or both) of the operands is of a signed type.
Subtraction
ap_(u)int::RType ap_(u)int::operator - (ap_(u)int op)
Returns the difference of two integers.
The width of the difference value is:
 One bit more than the wider of the two operands, or
 Two bits more if and only if the wider is unsigned and the narrower is signed
This is true before assignment, at which point it is sign-extended, zero-padded, or truncated based on the width of the destination variable.
The difference is treated as signed regardless of the signedness of the operands.
Multiplication
ap_(u)int::RType ap_(u)int::operator * (ap_(u)int op)
Returns the product of two integer values.
The width of the product is the sum of the widths of the operands.
The product is treated as a signed type if either of the operands is of a signed type.
Division
ap_(u)int::RType ap_(u)int::operator / (ap_(u)int op)
Returns the quotient of two integer values.
The width of the quotient is the width of the dividend if the divisor is an unsigned type. Otherwise, it is the width of the dividend plus one.
The quotient is treated as a signed type if either of the operands is of a signed type.
Modulus
ap_(u)int::RType ap_(u)int::operator % (ap_(u)int op)
Returns the modulus, or remainder of integer division, for two integer values.
The width of the modulus is the minimum of the widths of the operands, if they are both of the same signedness.
If the divisor is an unsigned type and the dividend is signed, then the width is that of the divisor plus one.
The modulus is treated as having the same signedness as the dividend.
Following are examples of arithmetic operators:
ap_uint<71> Rslt;
ap_uint<42> Val1 = 5;
ap_int<23> Val2 = -8;
Rslt = Val1 + Val2; // Yields: -3 (44 bits) sign-extended to 71 bits
Rslt = Val1 - Val2; // Yields: +13 sign-extended to 71 bits
Rslt = Val1 * Val2; // Yields: -40 (65 bits) sign-extended to 71 bits
Rslt = 50 / Val2; // Yields: -6 (33 bits) sign-extended to 71 bits
Rslt = 50 % Val2; // Yields: +2 (23 bits) sign-extended to 71 bits
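The result-width rules stated for addition, multiplication, and division can be written down as small functions. The following is a sketch in plain C++ that encodes only the rules as worded in this section; it is not taken from ap_int.h. The IntType struct and the function names are hypothetical, and treating equal-width mixed-sign operands like the "wider unsigned" case is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical descriptor of an operand: its width in bits and signedness.
struct IntType {
    int width;
    bool is_signed;
};

// Addition: one bit more than the wider operand, or two bits more when the
// wider operand is unsigned and the narrower one is signed.
inline IntType add_result(IntType a, IntType b) {
    int extra = 1;
    if (a.is_signed != b.is_signed) {
        const IntType& u = a.is_signed ? b : a;  // the unsigned operand
        const IntType& s = a.is_signed ? a : b;  // the signed operand
        // Assumption: equal widths are handled like "wider unsigned".
        if (u.width >= s.width) extra = 2;
    }
    return {std::max(a.width, b.width) + extra, a.is_signed || b.is_signed};
}

// Multiplication: the width is the sum of the operand widths.
inline IntType mul_result(IntType a, IntType b) {
    return {a.width + b.width, a.is_signed || b.is_signed};
}

// Division: the dividend width, plus one when the divisor is signed.
inline IntType div_result(IntType dividend, IntType divisor) {
    return {dividend.width + (divisor.is_signed ? 1 : 0),
            dividend.is_signed || divisor.is_signed};
}
```

Applied to the earlier cpp_ap_int_arith example, a signed 6-bit by signed 12-bit product needs 18 bits and their sum needs 13 bits, which matches the dout1_t and dout2_t widths chosen there.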
Bitwise Logical Operators
The bitwise logical operators all return a value with a width that is the maximum of the widths of the two operands. It is treated as unsigned if and only if both operands are unsigned. Otherwise, it is of a signed type.
Sign-extension (or zero-padding) may occur, based on the signedness of the expression, not the destination variable.
Bitwise OR
ap_(u)int::RType ap_(u)int::operator | (ap_(u)int op)
Returns the bitwise OR of the two operands.
Bitwise AND
ap_(u)int::RType ap_(u)int::operator & (ap_(u)int op)
Returns the bitwise AND of the two operands.
Bitwise XOR
ap_(u)int::RType ap_(u)int::operator ^ (ap_(u)int op)
Returns the bitwise XOR of the two operands.
Unary Operators
Addition
ap_(u)int ap_(u)int::operator + ()
Returns the self copy of the ap_[u]int operand.
Subtraction
ap_(u)int::RType ap_(u)int::operator - ()
Returns the negated value of the operand, with:
 The same width if it is a signed type, or
 Its width plus one if it is unsigned.
The return value is always a signed type.
Bitwise Inverse
ap_(u)int::RType ap_(u)int::operator ~ ()
Returns the bitwise-NOT of the operand with the same width and signedness.
Logical Invert
bool ap_(u)int::operator ! ()
Returns a Boolean false value if and only if the operand is not equal to zero (0).
Returns a Boolean true value if the operand is equal to zero (0).
Ternary Operators
When you use the ternary operator with the standard C int type, you must explicitly cast from one type to the other to ensure that both results have the same type. For example:
// Integer type is cast to ap_int type
ap_int<32> testc3(int a, ap_int<32> b, ap_int<32> c, bool d) {
return d?ap_int<32>(a):b;
}
// ap_int type is cast to an integer type
ap_int<32> testc4(int a, ap_int<32> b, ap_int<32> c, bool d) {
return d?a+1:(int)b;
}
// Integer type is cast to ap_int type
ap_int<32> testc5(int a, ap_int<32> b, ap_int<32> c, bool d) {
return d?ap_int<33>(a):b+1;
}
Shift Operators
Each shift operator comes in two versions:
 One version for unsigned right-hand side (RHS) operands
 One version for signed right-hand side (RHS) operands
A negative value supplied to the signed RHS versions reverses the shift operation's direction. That is, a shift by the absolute value of the RHS operand in the opposite direction occurs.
The shift operators return a value with the same width as the left-hand side (LHS) operand. As with C/C++, if the LHS operand of a shift-right is a signed type, the sign bit is copied into the most significant bit positions, maintaining the sign of the LHS operand.
Unsigned Integer Shift Right
ap_(u)int ap_(u)int::operator >> (ap_uint<int_W2> op)
Integer Shift Right
ap_(u)int ap_(u)int::operator >> (ap_int<int_W2> op)
Unsigned Integer Shift Left
ap_(u)int ap_(u)int::operator << (ap_uint<int_W2> op)
Integer Shift Left
ap_(u)int ap_(u)int::operator << (ap_int<int_W2> op)
Following are examples of shift operations:
ap_uint<13> Rslt;
ap_uint<7> Val1 = 0x41;
Rslt = Val1 << 6; // Yields: 0x0040, i.e. msb of Val1 is lost
Rslt = ap_uint<13>(Val1) << 6; // Yields: 0x1040, no info lost
ap_int<7> Val2 = -63;
Rslt = Val2 >> 4; // Yields: 0x1ffc, sign is maintained and extended
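Two of the shift rules above are easy to overlook: the result keeps the width of the LHS operand, and a negative signed shift amount reverses the direction. The following is a minimal sketch in plain C++ that models both for an unsigned LHS. It does not use ap_int.h, and shift_left is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: NOT part of ap_int.h.
// Models `lhs << n` for an unsigned LHS of `width` bits and a signed amount n.
inline std::uint64_t shift_left(std::uint64_t lhs, int width, int n) {
    assert(width >= 1 && width < 64);
    const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
    lhs &= mask;
    if (n <= -64 || n >= 64) return 0;   // every bit is shifted out
    if (n < 0) return lhs >> -n;         // negative amount: shift right instead
    return (lhs << n) & mask;            // result keeps the LHS width
}
```

With a 7-bit LHS, shift_left(0x41, 7, 6) gives 0x40 because the top bit falls off, whereas widening to 13 bits first, shift_left(0x41, 13, 6), keeps it and gives 0x1040.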
Compound Assignment Operators
Vitis HLS supports compound assignment operators:
 *=
 /=
 %=
 +=
 -=
 <<=
 >>=
 &=
 ^=
 |=
The RHS expression is first evaluated then supplied as the RHS operand to the base operator, the result of which is assigned back to the LHS variable. The expression sizing, signedness, and potential sign-extension or truncation rules apply as discussed above for the relevant operations.
ap_uint<10> Val1 = 630;
ap_int<3> Val2 = -3;
ap_uint<5> Val3 = 27;
Val1 += Val2 - Val3; // Yields: 600 and is equivalent to:
// Val1 = ap_uint<10>(ap_int<11>(Val1) +
// ap_int<11>((ap_int<6>(Val2) -
// ap_int<6>(Val3))));
Increment and Decrement Operators
The increment and decrement operators are provided. All return a value of the same width as the operand, which is unsigned if and only if the operand is of an unsigned type and signed otherwise.
Pre-Increment
ap_(u)int& ap_(u)int::operator ++ ()
Returns the incremented value of the operand.
Assigns the incremented value to the operand.
Post-Increment
const ap_(u)int ap_(u)int::operator ++ (int)
Returns the value of the operand before assignment of the incremented value to the operand variable.
Pre-Decrement
ap_(u)int& ap_(u)int::operator -- ()
Returns the decremented value of, as well as assigning the decremented value to, the operand.
Post-Decrement
const ap_(u)int ap_(u)int::operator -- (int)
Returns the value of the operand before assignment of the decremented value to the operand variable.
Relational Operators
Vitis HLS supports all relational operators. They return a Boolean value based on the result of the comparison. You can compare variables of ap_[u]int types to C/C++ fundamental integer types with these operators.
Equality
bool ap_(u)int::operator == (ap_(u)int op)
Inequality
bool ap_(u)int::operator != (ap_(u)int op)
Less than
bool ap_(u)int::operator < (ap_(u)int op)
Greater than
bool ap_(u)int::operator > (ap_(u)int op)
Less than or equal to
bool ap_(u)int::operator <= (ap_(u)int op)
Greater than or equal to
bool ap_(u)int::operator >= (ap_(u)int op)
Other Class Methods, Operators, and Data Members
The following sections discuss other class methods, operators, and data members.
Bit-Level Operations
The following methods facilitate common bit-level operations on the value stored in ap_[u]int type variables.
Length
int ap_(u)int::length ()
Returns an integer value providing the total number of bits in the ap_[u]int variable.
Concatenation
ap_concat_ref ap_(u)int::concat (ap_(u)int low)
ap_concat_ref ap_(u)int::operator , (ap_(u)int high, ap_(u)int low)
Concatenates two ap_[u]int variables. The width of the returned value is the sum of the widths of the operands.
The High and Low arguments are placed in the higher and lower order bits of the result respectively; the concat() method places the argument in the lower order bits.
When using the overloaded comma operator, the parentheses are required. The comma operator version may also appear on the LHS of assignment.
C/C++ native types, including integer literals, should be explicitly cast to an appropriate ap_[u]int type before concatenating.
ap_uint<10> Rslt;
ap_int<3> Val1 = -3;
ap_int<7> Val2 = 54;
Rslt = (Val2, Val1); // Yields: 0x1B5
Rslt = Val1.concat(Val2); // Yields: 0x2B6
(Val1, Val2) = 0xAB; // Yields: Val1 == 1, Val2 == 43
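Concatenation places the high operand above the low operand, so the result can be reproduced with a shift and an OR. The following is a minimal sketch in plain C++ that works on raw bit patterns. It does not use ap_int.h, and the free function concat is a hypothetical stand-in for the class method.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: NOT the ap_int.h concat() method.
// Places `high` (high_w bits) above `low` (low_w bits).
inline std::uint64_t concat(std::uint64_t high, int high_w,
                            std::uint64_t low, int low_w) {
    assert(high_w >= 1 && low_w >= 1 && high_w + low_w < 64);
    const std::uint64_t hi = high & ((std::uint64_t{1} << high_w) - 1);
    const std::uint64_t lo = low & ((std::uint64_t{1} << low_w) - 1);
    return (hi << low_w) | lo;
}
```

Using the 7-bit pattern 0110110 (0x36) and the 3-bit pattern 101 (0x5), concat(0x36, 7, 0x5, 3) gives 0x1B5 and concat(0x5, 3, 0x36, 7) gives 0x2B6, the two results quoted in the example above.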
Bit Selection
ap_bit_ref ap_(u)int::operator [] (int bit)
Selects one bit from an arbitrary precision integer value and returns it.
The returned value is a reference value that can set or clear the corresponding bit in this ap_[u]int.
The bit argument must be an int value. It specifies the index of the bit to select. The least significant bit has index 0. The highest permissible index is one less than the bit-width of this ap_[u]int.
The result type ap_bit_ref represents the reference to one bit of this ap_[u]int instance specified by bit.
Range Selection
ap_range_ref ap_(u)int::range (unsigned Hi, unsigned Lo)
ap_range_ref ap_(u)int::operator () (unsigned Hi, unsigned Lo)
Returns the value represented by the range of bits specified by the arguments.
The Hi argument specifies the most significant bit (MSB) position of the range, and Lo specifies the least significant bit (LSB).
The LSB of the source variable is in position 0. If the Hi argument has a value less than Lo, the bits are returned in reverse order.
ap_uint<4> Rslt;
ap_uint<8> Val1 = 0x5f;
ap_uint<8> Val2 = 0xaa;
Rslt = Val1.range(3, 0); // Yields: 0xF
Val1(3,0) = Val2(3, 0); // Yields: 0x5A
Val1(3,0) = Val2(4, 1); // Yields: 0x55
Rslt = Val1.range(4, 7); // Yields: 0xA; bit-reversed!
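The reversed case of range selection is easiest to see bit by bit. The following is a minimal sketch in plain C++ of the read side only. It does not use ap_int.h, and get_range is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: NOT the ap_int.h range() method.
// Reads bits hi..lo of `value`; when hi < lo the bits come back reversed.
inline std::uint64_t get_range(std::uint64_t value, unsigned hi, unsigned lo) {
    assert(hi < 64 && lo < 64);
    const unsigned count = (hi >= lo ? hi - lo : lo - hi) + 1;
    std::uint64_t result = 0;
    for (unsigned i = 0; i < count; ++i) {
        // Result bit i is source bit lo+i (normal) or lo-i (reversed).
        const unsigned src = (hi >= lo) ? lo + i : lo - i;
        result |= ((value >> src) & 1u) << i;
    }
    return result;
}
```

For the value 0x55 (binary 01010101), get_range(0x55, 7, 4) returns 0x5, while get_range(0x55, 4, 7) returns the same four bits in reverse order, 0xA.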
AND reduce
bool ap_(u)int::and_reduce ()
 Applies the AND operation on all bits in this ap_(u)int.
 Returns the resulting single bit.
 Equivalent to comparing this value against -1 (all ones) and returning true if it matches, false otherwise.
OR reduce
bool ap_(u)int::or_reduce ()
 Applies the OR operation on all bits in this ap_(u)int.
 Returns the resulting single bit.
 Equivalent to comparing this value against 0 (all zeros) and returning false if it matches, true otherwise.
XOR reduce
bool ap_(u)int::xor_reduce ()
 Applies the XOR operation on all bits in this ap_(u)int.
 Returns the resulting single bit.
 Equivalent to counting the number of 1 bits in this value and returning false if the count is even or true if the count is odd.
NAND reduce
bool ap_(u)int::nand_reduce ()
 Applies the NAND operation on all bits in this ap_(u)int.
 Returns the resulting single bit.
 Equivalent to comparing this value against -1 (all ones) and returning false if it matches, true otherwise.
NOR reduce
bool ap_(u)int::nor_reduce ()
 Applies the NOR operation on all bits in this ap_(u)int.
 Returns the resulting single bit.
 Equivalent to comparing this value against 0 (all zeros) and returning true if it matches, false otherwise.
XNOR reduce
bool ap_(u)int::xnor_reduce ()
 Applies the XNOR operation on all bits in this ap_(u)int.
 Returns the resulting single bit.
 Equivalent to counting the number of 1 bits in this value and returning true if the count is even or false if the count is odd.
Bit Reduction Method Examples
ap_uint<8> Val = 0xaa;
bool t = Val.and_reduce(); // Yields: false
t = Val.or_reduce(); // Yields: true
t = Val.xor_reduce(); // Yields: false
t = Val.nand_reduce(); // Yields: true
t = Val.nor_reduce(); // Yields: false
t = Val.xnor_reduce(); // Yields: true
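The "equivalent to" descriptions of the reduction methods map directly onto std::bitset queries. The following is a minimal sketch in plain C++ for an 8-bit value. These free functions are hypothetical stand-ins, not the ap_int.h methods.

```cpp
#include <bitset>
#include <cassert>

// Illustrative model only: NOT the ap_int.h reduction methods.
inline bool and_reduce(std::bitset<8> v)  { return v.all(); }            // all ones?
inline bool or_reduce(std::bitset<8> v)   { return v.any(); }            // any one?
inline bool xor_reduce(std::bitset<8> v)  { return v.count() % 2 == 1; } // odd parity?
inline bool nand_reduce(std::bitset<8> v) { return !and_reduce(v); }
inline bool nor_reduce(std::bitset<8> v)  { return !or_reduce(v); }
inline bool xnor_reduce(std::bitset<8> v) { return !xor_reduce(v); }
```

For 0xaa, which has four 1 bits, the six functions return false, true, false, true, false, true in the order listed, the same results as the method examples above.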
Bit Reverse
void ap_(u)int::reverse ()
Reverses the contents of the ap_[u]int instance:
 The LSB becomes the MSB.
 The MSB becomes the LSB.
Reverse Method Example
ap_uint<8> Val = 0x12;
Val.reverse(); // Yields: 0x48
Test Bit Value
bool ap_(u)int::test (unsigned i)
Checks whether the specified bit of the ap_(u)int instance is 1.
Returns true if it is, false if it is not.
Test Method Example
ap_uint<8> Val = 0x12;
bool t = Val.test(5); // Yields: false (0x12 is 0b00010010, so bit 5 is 0)
t = Val.test(4); // Yields: true
Set Bit Value
void ap_(u)int::set (unsigned i, bool v)
void ap_(u)int::set_bit (unsigned i, bool v)
Sets the specified bit of the ap_(u)int instance to the Boolean value v.
Set Bit (to 1)
void ap_(u)int::set (unsigned i)
Sets the specified bit of the ap_(u)int instance to the value 1 (one).
Clear Bit (to 0)
void ap_(u)int::clear(unsigned i)
Sets the specified bit of the ap_(u)int instance to the value 0 (zero).
Invert Bit
void ap_(u)int::invert(unsigned i)
Inverts the bit of the ap_(u)int instance specified in the function argument. The specified bit becomes 0 if its original value is 1 and vice versa.
Example of bit set, clear and invert bit methods:
ap_uint<8> Val = 0x12;
Val.set(0, 1); // Yields: 0x13
Val.set_bit(4, false); // Yields: 0x03
Val.set(7); // Yields: 0x83
Val.clear(1); // Yields: 0x81
Val.invert(4); // Yields: 0x91
Rotate Right
void ap_(u)int::rrotate(unsigned n)
Rotates the ap_(u)int instance n places to the right.
Rotate Left
void ap_(u)int::lrotate(unsigned n)
Rotates the ap_(u)int instance n places to the left.
ap_uint<8> Val = 0x12;
Val.rrotate(3); // Yields: 0x42
Val.lrotate(6); // Yields: 0x90
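The two rotations can be modelled on an 8-bit value in plain C++. This is a sketch of the behavior described above, not the ap_int.h implementation; the names rrotate8 and lrotate8 are hypothetical, and reducing n modulo 8 is an assumption of this model.

```cpp
#include <cassert>
#include <cstdint>

// Rotate an 8-bit value right by n places.
// Assumption of this model: n is reduced modulo 8.
inline std::uint8_t rrotate8(std::uint8_t v, unsigned n) {
    n %= 8;
    if (n == 0) return v;
    // Bits shifted out on the right re-enter on the left.
    return static_cast<std::uint8_t>((v >> n) | (v << (8 - n)));
}

// Rotate an 8-bit value left by n places (same modulo assumption).
inline std::uint8_t lrotate8(std::uint8_t v, unsigned n) {
    n %= 8;
    if (n == 0) return v;
    // Bits shifted out on the left re-enter on the right.
    return static_cast<std::uint8_t>((v << n) | (v >> (8 - n)));
}

// Reproduces the two steps of the example above.
inline void check_rotations() {
    std::uint8_t val = 0x12;
    val = rrotate8(val, 3);
    assert(val == 0x42);
    val = lrotate8(val, 6);
    assert(val == 0x90);
}
```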
Bitwise NOT
void ap_(u)int::b_not()
 Complements every bit of the ap_(u)int instance.
Bitwise NOT Example
ap_uint<8> Val = 0x12;
Val.b_not(); // Yields: 0xED
Test Sign
bool ap_int::sign()
 Checks whether the ap_(u)int instance is negative.
 Returns true if negative.
 Returns false if zero or positive.
Explicit Conversion Methods
To C/C++ “(u)int”
int ap_(u)int::to_int ()
unsigned ap_(u)int::to_uint ()
 Returns native C/C++ (32-bit on most systems) integers with the value contained in the ap_[u]int.
 Truncation occurs if the value is greater than can be represented by an [unsigned] int.
To C/C++ 64-bit “(u)int”
long long ap_(u)int::to_int64 ()
unsigned long long ap_(u)int::to_uint64 ()
 Returns native C/C++ 64-bit integers with the value contained in the ap_[u]int.
 Truncation occurs if the value is greater than can be represented by an [unsigned] long long.
To C/C++ “double”
double ap_(u)int::to_double ()
 Returns a native C/C++ double 64-bit floating point representation of the value contained in the ap_[u]int.
 If the ap_[u]int is wider than 53 bits (the number of bits in the mantissa of a double), the resulting double may not have the exact value expected.
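The 53-bit limit can be seen with native integers alone. The following sketch assumes an IEEE 754 double and uses a hypothetical helper, exact_in_double, that is not part of the Vitis HLS headers.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if v survives a round trip through double unchanged.
// Precondition for this sketch: v < 2^63, so the conversion back is well defined.
inline bool exact_in_double(std::uint64_t v) {
    return static_cast<std::uint64_t>(static_cast<double>(v)) == v;
}

inline void check_double_precision() {
    const std::uint64_t two_pow_53 = std::uint64_t{1} << 53;
    assert(exact_in_double(two_pow_53 - 1));   // fits in 53 significant bits
    assert(exact_in_double(two_pow_53));       // a power of two is always exact
    assert(!exact_in_double(two_pow_53 + 1));  // needs 54 significant bits
}
```

The same loss applies to any integer value that needs more than 53 significant bits, whichever type holds it before the conversion.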
Use these explicit conversion methods, rather than C-style casts, to convert ap_[u]int to other data types.
Sizeof
The standard C++ sizeof() operator should not be used with ap_[u]int or other classes or object instances. The ap_int<> data type is a class, and sizeof returns the storage used by that class or instance object. sizeof(ap_int<N>) always returns the number of bytes used. For example:
sizeof(ap_int<127>)=16
sizeof(ap_int<128>)=16
sizeof(ap_int<129>)=24
sizeof(ap_int<130>)=24
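The four values above are consistent with storage being allocated in whole 64-bit words. The sketch below expresses that reading as a formula. The 64-bit granularity is an assumption inferred from these four results only, and storage_bytes is a hypothetical helper, not a Vitis HLS function; other widths may be stored differently.

```cpp
#include <cstddef>

// Bytes occupied when a value of the given bit width is stored in whole
// 64-bit words. Assumption: the 64-bit granularity is inferred from the four
// sizeof results listed above, not taken from the ap_int.h sources.
constexpr std::size_t storage_bytes(std::size_t bits) {
    return ((bits + 63) / 64) * 8;
}

// The formula reproduces the listed values.
static_assert(storage_bytes(127) == 16, "two 64-bit words");
static_assert(storage_bytes(128) == 16, "two 64-bit words");
static_assert(storage_bytes(129) == 24, "three 64-bit words");
static_assert(storage_bytes(130) == 24, "three 64-bit words");
```

Because sizeof reports storage rather than bit width, use the width data member when the exact number of bits is needed.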
Compile Time Access to Data Type Attributes
The ap_[u]int<> types are provided with a static member that allows the size of the variables to be determined at compile time. The data type is provided with the static const member width, which is automatically assigned the width of the data type:
static const int width = _AP_W;
You can use the width data member to extract the data width of an existing ap_[u]int<> data type to create another ap_[u]int<> data type at compile time. The following example shows how the size of variable Res is defined as 1 bit greater than variables Val1 and Val2:
// Definition of basic data type
#define INPUT_DATA_WIDTH 8
typedef ap_int<INPUT_DATA_WIDTH> data_t;
// Definition of variables
data_t Val1, Val2;
// Res is automatically sized at compile time to be 1 bit greater than data type data_t
ap_int<data_t::width+1> Res = Val1 + Val2;
This ensures that Vitis HLS correctly models the bit growth caused by the addition even if you update the value of INPUT_DATA_WIDTH for data_t.
C++ Arbitrary Precision Fixed-Point Types
C++ functions can take advantage of the arbitrary precision fixed-point types included with Vitis HLS. The following figure summarizes the basic features of these fixed-point types:
 The word can be signed (ap_fixed) or unsigned (ap_ufixed).
 A word of any arbitrary size W can be defined.
 The number of places above the decimal point, I, also defines the number of decimal places in the word, W-I (represented by B in the following figure).
 The type of rounding or quantization (Q) can be selected.
 The overflow behavior (O and N) can be selected.
Arbitrary precision fixed-point types use more memory during C simulation. If using very large arrays of ap_[u]fixed types, refer to the discussion of C simulation in Arrays.
The advantages of using fixed-point types are:
 They allow fractional numbers to be easily represented.
 When variables have a different number of integer and decimal place bits, the alignment of the decimal point is handled.
 There are numerous options to handle how rounding should happen when there are too few decimal bits to represent the precision of the result.
 There are numerous options to handle how variables should overflow when the result is greater than the number of integer bits can represent.
These attributes are summarized by examining the code in the example below. First, the header file ap_fixed.h is included. The ap_fixed types are then defined using the typedef statement:
 A 10-bit input: 8-bit integer value with 2 decimal places.
 A 6-bit input: 3-bit integer value with 3 decimal places.
 A 22-bit variable for the accumulation: 17-bit integer value with 5 decimal places.
 A 36-bit variable for the result: 30-bit integer value with 6 decimal places.
The function contains no code to manage the alignment of the decimal point after operations are performed. The alignment is done automatically.
The following code sample shows the ap_fixed types in use.
#include "ap_fixed.h"
typedef ap_ufixed<10,8, AP_RND, AP_SAT> din1_t;
typedef ap_fixed<6,3, AP_RND, AP_WRAP> din2_t;
typedef ap_fixed<22,17, AP_TRN, AP_SAT> dint_t;
typedef ap_fixed<36,30> dout_t;
dout_t cpp_ap_fixed(din1_t d_in1, din2_t d_in2) {
static dint_t sum;
sum += d_in1;
return sum * d_in2;
}
Using ap_(u)fixed types, the C++ simulation is bit accurate. Fast simulation can validate the algorithm and its accuracy. After synthesis, the RTL exhibits the identical bit-accurate behavior.
Arbitrary precision fixed-point types can be freely assigned literal values in the code. This is shown in the test bench (see the example below) used with the example above, in which the values of in1 and in2 are declared and assigned constant values.
When assigning literal values involving operators, the literal values must first be cast to ap_(u)fixed types. Otherwise, the C compiler and Vitis HLS interpret the literal as an integer or float/double type and may fail to find a suitable operator. As shown in the following example, in the assignment of in1 = in1 + din1_t(0.25), the literal 0.25 is cast to an ap_fixed type.
#include <cmath>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <cstdio> // printf
using namespace std;
#include "ap_fixed.h"
typedef ap_ufixed<10,8, AP_RND, AP_SAT> din1_t;
typedef ap_fixed<6,3, AP_RND, AP_WRAP> din2_t;
typedef ap_fixed<22,17, AP_TRN, AP_SAT> dint_t;
typedef ap_fixed<36,30> dout_t;
dout_t cpp_ap_fixed(din1_t d_in1, din2_t d_in2);
int main()
{
ofstream result;
din1_t in1 = 0.25;
din2_t in2 = 2.125;
dout_t output;
int retval=0;
result.open("result.dat");
// Persistent manipulators
result << right << fixed << setbase(10) << setprecision(15);
for (int i = 0; i <= 250; i++)
{
output = cpp_ap_fixed(in1,in2);
result << setw(10) << i;
result << setw(20) << in1;
result << setw(20) << in2;
result << setw(20) << output;
result << endl;
in1 = in1 + din1_t(0.25);
in2 = in2 - din2_t(0.125);
}
result.close();
// Compare the results file with the golden results
retval = system("diff --brief -w result.dat result.golden.dat");
if (retval != 0) {
printf("Test failed !!!\n");
retval=1;
} else {
printf("Test passed !\n");
}
// Return 0 if the test passes
return retval;
}
Fixed-Point Identifier Summary
The following table shows the quantization and overflow modes.
Identifier  Description  

W  Word length in bits  
I  The number of bits used to represent the integer value (the number of bits above the decimal point)  
Q  Quantization mode dictates the behavior when greater precision is generated than can be defined by smallest fractional bit in the variable used to store the result.  
Mode  Description  
AP_RND  Rounding to plus infinity  
AP_RND_ZERO  Rounding to zero  
AP_RND_MIN_INF  Rounding to minus infinity  
AP_RND_INF  Rounding to infinity  
AP_RND_CONV  Convergent rounding  
AP_TRN  Truncation to minus infinity (default)  
AP_TRN_ZERO  Truncation to zero  
O  Overflow mode dictates the behavior when more bits are generated than the variable to store the result contains.  
Mode  Description  
AP_SAT  Saturation  
AP_SAT_ZERO  Saturation to zero  
AP_SAT_SYM  Symmetrical saturation  
AP_WRAP  Wrap around (default)  
AP_WRAP_SM  Sign magnitude wrap around  
N  The number of saturation bits in wrap modes. 
C++ Arbitrary Precision Fixed-Point Types: Reference Information
For comprehensive information on the methods, synthesis behavior, and all aspects of using the ap_(u)fixed<N> arbitrary precision fixed-point data types, see C++ Arbitrary Precision Fixed-Point Types.
This section includes:
 Techniques for assigning constant and initialization values to arbitrary precision integers (including values greater than 1024-bit).
 A detailed description of the overflow and saturation modes.
 A description of Vitis HLS helper methods, such as printing, concatenating, bit-slicing and range selection functions.
 A description of operator behavior, including a description of shift operations (a negative shift value results in a shift in the opposite direction).
C++ Arbitrary Precision Fixed-Point Types
Vitis HLS supports fixed-point types that allow fractional arithmetic to be easily handled. The advantage of fixed-point arithmetic is shown in the following example.
ap_fixed<11, 6> Var1 = 22.96875; // 11-bit signed word, 5 fractional bits
ap_ufixed<12,11> Var2 = 512.5; // 12-bit word, 1 fractional bit
ap_fixed<16,11> Res1; // 16-bit signed word, 5 fractional bits
Res1 = Var1 + Var2; // Result is 535.46875
Even though Var1 and Var2 have different precisions, the fixed-point type ensures that the decimal point is correctly aligned before the operation (an addition in this case) is performed. You are not required to perform any operations in the C code to align the decimal point.
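To make the alignment concrete, the addition can be traced on the raw integer contents of the two operands. The following plain C++ sketch is an illustration under stated assumptions: it treats a fixed-point value as an integer scaled by 2^(fractional bits), and the helper add_aligned is hypothetical, not part of ap_fixed.h.

```cpp
#include <cassert>
#include <cstdint>

// Adds two fixed-point values given as raw integers (value * 2^frac bits).
// The operand with fewer fractional bits is scaled up so that both binary
// points line up before the integer addition.
inline double add_aligned(std::int64_t raw1, int frac1,
                          std::int64_t raw2, int frac2) {
    const int frac = frac1 > frac2 ? frac1 : frac2;
    const std::int64_t a = raw1 * (std::int64_t{1} << (frac - frac1));
    const std::int64_t b = raw2 * (std::int64_t{1} << (frac - frac2));
    return static_cast<double>(a + b) /
           static_cast<double>(std::int64_t{1} << frac);
}

inline void check_alignment() {
    // Var1: 5 fractional bits, 22.96875 * 32 = 735
    // Var2: 1 fractional bit,  512.5    * 2  = 1025
    assert(add_aligned(735, 5, 1025, 1) == 535.46875);
}
```

With ap_[u]fixed types this scaling is implied by the type parameters, so none of it appears in the source code.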
The type used to store the result of any fixed-point arithmetic operation must be large enough (in both the integer and fractional bits) to store the full result.
If this is not the case, the ap_fixed type performs:
 overflow handling (when the result has more MSBs than the assigned type supports)
 quantization (or rounding, when the result has more LSBs than the assigned type supports)
The ap_[u]fixed type provides various options on how the overflow and quantization are performed. The options are discussed below.
ap_[u]fixed Representation
In ap_[u]fixed types, a fixed-point value is represented as a sequence of bits with a specified position for the binary point.
 Bits to the left of the binary point represent the integer part of the value.
 Bits to the right of the binary point represent the fractional part of the value.
The ap_[u]fixed type is defined as follows:
ap_[u]fixed<int W,
int I,
ap_q_mode Q,
ap_o_mode O,
ap_sat_bits N>;
Quantization Modes
Rounding to plus infinity  AP_RND 
Rounding to zero  AP_RND_ZERO 
Rounding to minus infinity  AP_RND_MIN_INF 
Rounding to infinity  AP_RND_INF 
Convergent rounding  AP_RND_CONV 
Truncation  AP_TRN 
Truncation to zero  AP_TRN_ZERO 
AP_RND
 Round the value to the nearest representable value for the specific ap_[u]fixed type.
ap_fixed<3, 2, AP_RND, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.5
ap_fixed<3, 2, AP_RND, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.0
AP_RND_ZERO
 Round the value to the nearest representable value.
 Round towards zero.
 For positive values, delete the redundant bits.
 For negative values, add the least significant bits to get the nearest representable value.
ap_fixed<3, 2, AP_RND_ZERO, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.0
ap_fixed<3, 2, AP_RND_ZERO, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.0
AP_RND_MIN_INF
 Round the value to the nearest representable value.
 Round towards minus infinity.
 For positive values, delete the redundant bits.
 For negative values, add the least significant bits.
ap_fixed<3, 2, AP_RND_MIN_INF, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.0
ap_fixed<3, 2, AP_RND_MIN_INF, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.5
AP_RND_INF
 Round the value to the nearest representable value.
 The rounding depends on the least significant bit.
 For positive values, if the least significant bit is set, round towards plus infinity. Otherwise, round towards minus infinity.
 For negative values, if the least significant bit is set, round towards minus infinity. Otherwise, round towards plus infinity.
ap_fixed<3, 2, AP_RND_INF, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.5
ap_fixed<3, 2, AP_RND_INF, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.5
AP_RND_CONV
 Round the value to the nearest representable value.
 The rounding depends on the least significant bit.
 If least significant bit is set, round towards plus infinity.
 Otherwise, round towards minus infinity.
ap_fixed<3, 2, AP_RND_CONV, AP_SAT> UAPFixed4 = 0.75; // Yields: 1.0
ap_fixed<3, 2, AP_RND_CONV, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.0
AP_TRN
 Always round the value towards minus infinity.
ap_fixed<3, 2, AP_TRN, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.0
ap_fixed<3, 2, AP_TRN, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.5
AP_TRN_ZERO
Round the value as follows:
 For positive values, the rounding is the same as mode AP_TRN.
 For negative values, round towards zero.
ap_fixed<3, 2, AP_TRN_ZERO, AP_SAT> UAPFixed4 = 1.25; // Yields: 1.0
ap_fixed<3, 2, AP_TRN_ZERO, AP_SAT> UAPFixed4 = -1.25; // Yields: -1.0
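The seven quantization modes can be compared side by side with a small model that works on double values. This is a sketch of the rounding rules described above, not the ap_fixed.h implementation: it ignores overflow, and the QMode tags and the quantize helper are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical mode tags for this model; the real enumerators live in ap_fixed.h.
enum class QMode { RND, RND_ZERO, RND_MIN_INF, RND_INF, RND_CONV, TRN, TRN_ZERO };

// Quantizes x to a grid with frac_bits fractional bits (overflow is not modelled).
inline double quantize(double x, int frac_bits, QMode mode) {
    const double scale = std::ldexp(1.0, frac_bits);  // 2^frac_bits
    const double v = x * scale;                       // value in units of one LSB
    double q = 0.0;
    switch (mode) {
    case QMode::RND:         q = std::floor(v + 0.5); break;  // ties toward +infinity
    case QMode::RND_MIN_INF: q = std::ceil(v - 0.5);  break;  // ties toward -infinity
    case QMode::RND_ZERO:    // ties toward zero
        q = (v >= 0) ? std::ceil(v - 0.5) : std::floor(v + 0.5); break;
    case QMode::RND_INF:     // ties away from zero
        q = (v >= 0) ? std::floor(v + 0.5) : std::ceil(v - 0.5); break;
    case QMode::RND_CONV: {  // ties toward the neighbour with an even LSB
        const double lo = std::floor(v);
        const double diff = v - lo;
        if (diff > 0.5)      q = lo + 1;
        else if (diff < 0.5) q = lo;
        else                 q = (std::fmod(lo, 2.0) == 0.0) ? lo : lo + 1;
        break;
    }
    case QMode::TRN:         q = std::floor(v); break;  // always toward -infinity
    case QMode::TRN_ZERO:    q = std::trunc(v); break;  // always toward zero
    }
    return q / scale;
}

// Reproduces the positive and negative cases for a type with 1 fractional bit.
inline void check_quantization() {
    assert(quantize( 1.25, 1, QMode::RND) == 1.5);
    assert(quantize(-1.25, 1, QMode::RND) == -1.0);
    assert(quantize(-1.25, 1, QMode::RND_MIN_INF) == -1.5);
    assert(quantize( 0.75, 1, QMode::RND_CONV) == 1.0);
    assert(quantize(-1.25, 1, QMode::TRN) == -1.5);
    assert(quantize(-1.25, 1, QMode::TRN_ZERO) == -1.0);
}
```

The rounding modes differ only in how they resolve a value that lies exactly halfway between two representable values; the two truncation modes never look at the discarded bits' magnitude at all.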
Overflow Modes
Saturation  AP_SAT 
Saturation to zero  AP_SAT_ZERO 
Symmetrical saturation  AP_SAT_SYM 
Wraparound  AP_WRAP 
Sign magnitude wraparound  AP_WRAP_SM 
AP_SAT
Saturate the value.
 To the maximum value in case of overflow.
 To the negative maximum value in case of negative overflow.
ap_fixed<4, 4, AP_RND, AP_SAT> UAPFixed4 = 19.0; // Yields: 7.0
ap_fixed<4, 4, AP_RND, AP_SAT> UAPFixed4 = -19.0; // Yields: -8.0
ap_ufixed<4, 4, AP_RND, AP_SAT> UAPFixed4 = 19.0; // Yields: 15.0
ap_ufixed<4, 4, AP_RND, AP_SAT> UAPFixed4 = -19.0; // Yields: 0.0
AP_SAT_ZERO
Force the value to zero in case of overflow, or negative overflow.
ap_fixed<4, 4, AP_RND, AP_SAT_ZERO> UAPFixed4 = 19.0; // Yields: 0.0
ap_fixed<4, 4, AP_RND, AP_SAT_ZERO> UAPFixed4 = -19.0; // Yields: 0.0
ap_ufixed<4, 4, AP_RND, AP_SAT_ZERO> UAPFixed4 = 19.0; // Yields: 0.0
ap_ufixed<4, 4, AP_RND, AP_SAT_ZERO> UAPFixed4 = -19.0; // Yields: 0.0
AP_SAT_SYM
Saturate the value:
 To the maximum value in case of overflow.
 To the minimum value in case of negative overflow:
  Negative maximum for signed ap_fixed types
  Zero for unsigned ap_ufixed types
ap_fixed<4, 4, AP_RND, AP_SAT_SYM> UAPFixed4 = 19.0; // Yields: 7.0
ap_fixed<4, 4, AP_RND, AP_SAT_SYM> UAPFixed4 = -19.0; // Yields: -7.0
ap_ufixed<4, 4, AP_RND, AP_SAT_SYM> UAPFixed4 = 19.0; // Yields: 15.0
ap_ufixed<4, 4, AP_RND, AP_SAT_SYM> UAPFixed4 = -19.0; // Yields: 0.0
AP_WRAP
Wrap the value around in case of overflow.
ap_fixed<4, 4, AP_RND, AP_WRAP> UAPFixed4 = 31.0; // Yields: -1.0
ap_fixed<4, 4, AP_RND, AP_WRAP> UAPFixed4 = -19.0; // Yields: -3.0
ap_ufixed<4, 4, AP_RND, AP_WRAP> UAPFixed4 = 19.0; // Yields: 3.0
ap_ufixed<4, 4, AP_RND, AP_WRAP> UAPFixed4 = -19.0; // Yields: 13.0
If the value of N is set to zero (the default overflow mode):
 All MSB bits outside the range are deleted.
 For unsigned numbers, the value wraps around to zero after the maximum.
 For signed numbers, the value wraps to the minimum value after the maximum.
If N>0:
 N MSB bits are saturated or set to 1.
 The sign bit is retained, so positive numbers remain positive and negative numbers remain negative.
 The bits that are not saturated are copied starting from the LSB side.
AP_WRAP_SM
Wrap the value around using sign-magnitude wrapping.
ap_fixed<4, 4, AP_RND, AP_WRAP_SM> UAPFixed4 = 19.0; // Yields: -4.0
ap_fixed<4, 4, AP_RND, AP_WRAP_SM> UAPFixed4 = -19.0; // Yields: 2.0
If the value of N is set to zero (the default overflow mode):
 This mode uses sign magnitude wrapping.
 Sign bit set to the value of the least significant deleted bit.
 If the most significant remaining bit is different from the original MSB, all the remaining bits are inverted.
 If MSBs are same, the other bits are copied over.
 Delete redundant MSBs.
 The new sign bit is the least significant bit of the deleted bits. 0 in this case.
 Compare the new sign bit with the sign of the new value.
 If different, invert all the numbers. They are different in this case.
If N>0:
 Uses sign magnitude saturation
 N MSBs are saturated to 1.
 Behaves similarly to the case in which N = 0, except that positive numbers stay positive and negative numbers stay negative.
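The overflow modes with N = 0 can be modelled for a signed 4-bit integer, which corresponds to the ap_fixed<4,4> examples above. The sketch below follows the rules as described in this section rather than the ap_fixed.h sources; the OMode tags and the overflow_s4 helper are hypothetical names.

```cpp
#include <cassert>

// Hypothetical mode tags for this model; the real enumerators live in ap_fixed.h.
enum class OMode { SAT, SAT_ZERO, SAT_SYM, WRAP, WRAP_SM };

// Fits an integer v into the signed 4-bit range [-8, 7] (N = 0 only).
inline int overflow_s4(int v, OMode mode) {
    const int lo = -8, hi = 7;
    const unsigned bits = static_cast<unsigned>(v);  // two's complement bit pattern
    unsigned kept = bits & 0xFu;                     // the four bits that remain
    switch (mode) {
    case OMode::SAT:      return v > hi ? hi : (v < lo ? lo : v);
    case OMode::SAT_ZERO: return (v > hi || v < lo) ? 0 : v;
    case OMode::SAT_SYM:  return v > hi ? hi : (v < -hi ? -hi : v);
    case OMode::WRAP:     break;                     // simply drop the upper bits
    case OMode::WRAP_SM: {
        const unsigned new_sign = (bits >> 4) & 1u;  // least significant deleted bit
        const unsigned kept_msb = (kept >> 3) & 1u;
        if (kept_msb != new_sign) kept = ~kept & 0xFu;  // invert the remaining bits
        break;
    }
    }
    // Interpret the remaining four bits as a signed value.
    return kept >= 8u ? static_cast<int>(kept) - 16 : static_cast<int>(kept);
}

// Reproduces the signed examples listed for each mode.
inline void check_overflow() {
    assert(overflow_s4( 19, OMode::SAT) == 7);
    assert(overflow_s4(-19, OMode::SAT) == -8);
    assert(overflow_s4(-19, OMode::SAT_SYM) == -7);
    assert(overflow_s4( 31, OMode::WRAP) == -1);
    assert(overflow_s4(-19, OMode::WRAP) == -3);
    assert(overflow_s4( 19, OMode::WRAP_SM) == -4);
    assert(overflow_s4(-19, OMode::WRAP_SM) == 2);
}
```

Values that already fit in the range pass through every mode unchanged, which is a quick sanity check for any model of this kind.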
Compiling ap_[u]fixed<> Types
To use the ap_[u]fixed<> classes, you must include the ap_fixed.h header file in all source files that reference ap_[u]fixed<> variables.
When compiling software models that use these classes, it may be necessary to specify the location of the Vitis HLS header files, for example by adding the “-I/<HLS_HOME>/include” option for g++ compilation.
Declaring and Defining ap_[u]fixed<> Variables
There are separate signed and unsigned classes:
 ap_fixed<W,I> (signed)
 ap_ufixed<W,I> (unsigned)
You can create user-defined types with the C/C++ typedef statement:
User-Defined Types Examples
#include "ap_fixed.h" // use ap_[u]fixed<> types
typedef ap_ufixed<128,32> uint128_t; // 128-bit user-defined type,
// 32 integer bits
Initialization and Assignment from Constants (Literals)
You can initialize an ap_[u]fixed variable with normal floating-point constants of the usual C/C++ width:
 32 bits for type float
 64 bits for type double
That is, the constant is typically a single-precision or double-precision floating-point value.
Note that the value assigned to the fixed-point variable will be limited by the precision of the constant. Use string initialization as described in Initialization and Assignment from Constants (Literals) to ensure that all bits of the fixed-point variable are populated according to the precision described by the string.
#include <ap_fixed.h>
ap_ufixed<30, 15> my15BitInt = 3.1415;
ap_fixed<42, 23> my42BitInt = 1158.987;
ap_ufixed<99, 40> my99BitInt = 287432.0382911;
ap_fixed<36,30> my36BitInt = 0x123.456p1;
The ap_[u]fixed types do not support initialization if they are used in an array of std::complex types.
typedef ap_fixed<DIN_W, 1, AP_TRN, AP_SAT> coeff_t; // MUST have IW >= 1
std::complex<coeff_t> twid_rom[REAL_SZ/2] = {{ 1, 0 },{ 0.9,0.006 }, etc.}
The initialization values must first be cast to std::complex:
typedef ap_fixed<DIN_W, 1, AP_TRN, AP_SAT> coeff_t; // MUST have IW >= 1
std::complex<coeff_t> twid_rom[REAL_SZ/2] = {std::complex<coeff_t>( 1, 0 ),
std::complex<coeff_t>(0.9,0.006 ),etc.}
Support for Console I/O (Printing)
As with initialization and assignment to ap_[u]fixed<> variables, Vitis HLS supports printing values that require more than 64 bits to represent.
The easiest way to output any value stored in an ap_[u]fixed variable is to use the C++ standard output stream, std::cout (#include <iostream> or <iostream.h>). The stream insertion operator, “<<”, is overloaded to correctly output the full range of values possible for any given ap_[u]fixed variable. The following stream manipulators are also supported, allowing formatting of the value as shown.
 dec (decimal)
 hex (hexadecimal)
 oct (octal)
#include <iostream.h> // Alternative: #include <iostream>
ap_fixed<6,3, AP_RND, AP_WRAP> Val = 3.25;
cout << Val << endl; // Yields: 3.25
Using the Standard C Library
You can also use the standard C library (#include <stdio.h>) to print out values larger than 64 bits:
 Convert the value to a C++ std::string using the ap_[u]fixed class method to_string().
 Convert the result to a null-terminated C character string using the std::string class method c_str().
Optional Argument One (Specifying the Radix)
You can pass the ap_[u]fixed::to_string() method an optional argument specifying the radix of the numerical format desired. The valid radix argument values are:
 2 (binary)
 8 (octal)
 10 (decimal)
 16 (hexadecimal) (default)
Optional Argument Two (Printing as Signed Values)
A second optional argument to ap_[u]fixed::to_string() specifies whether to print the non-decimal formats as signed values. This argument is boolean. The default value is false, causing the non-decimal formats to be printed as unsigned values.
ap_fixed<6,3, AP_RND, AP_WRAP> Val = 3.25;
printf("%s \n", Val.to_string().c_str()); // Yields: 0b011.010
printf("%s \n", Val.to_string(10).c_str()); // Yields: 3.25
The ap_[u]fixed types are supported by the following C++ manipulator functions:
 setprecision
 setw
 setfill
The setprecision manipulator sets the decimal precision to be used. It takes one parameter f as the value of decimal precision, where f specifies the maximum number of meaningful digits to display in total (counting both those before and those after the decimal point).
The default value of f is 6, which is consistent with the native C float type.
ap_fixed<64, 32> f =3.14159;
cout << setprecision (5) << f << endl;
cout << setprecision (9) << f << endl;
f = 123456;
cout << setprecision (5) << f << endl;
The example above displays the following results where the printed results are rounded when the actual precision exceeds the specified precision:
3.1416
3.14159
1.2346e+05
The setw manipulator:
 Sets the number of characters to be used for the field width.
 Takes one parameter w as the value of the width, where w determines the minimum number of characters to be written in some output representation.
If the standard width of the representation is shorter than the field width, the representation is padded with fill characters. Fill characters are controlled by the setfill manipulator, which takes one parameter f as the padding character.
For example, given:
ap_fixed<65,32> aa = 123456;
int precision = 5;
cout<<setprecision(precision)<<setw(13)<<setfill('T')<<aa<<endl;
The output is:
TTT1.2346e+05
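The same three manipulators behave identically on native floating-point values, which gives a way to check the expected formatting without the Vitis HLS headers. The sketch below uses a double in place of the ap_fixed variable; format_padded is a hypothetical helper name.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Formats a double with setprecision, setw and setfill, using the default
// (neither fixed nor scientific) floating-point notation.
inline std::string format_padded(double value, int precision, int width, char fill) {
    std::ostringstream os;
    os << std::setprecision(precision) << std::setw(width)
       << std::setfill(fill) << value;
    return os.str();
}

inline void check_formatting() {
    // Five significant digits need 10 characters; the field is padded to 13.
    assert(format_padded(123456.0, 5, 13, 'T') == "TTT1.2346e+05");
    // A field width of 0 adds no padding.
    assert(format_padded(3.14159, 5, 0, ' ') == "3.1416");
}
```

Note that setw applies only to the next inserted value, whereas setprecision and setfill remain in effect on the stream until changed.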
Expressions Involving ap_[u]fixed<> types
Arbitrary precision fixed-point values can participate in expressions that use any operators supported by C/C++. After an arbitrary precision fixed-point type or variable is defined, its usage is the same as for any floating point type or variable in the C/C++ languages.
Observe the following caveats:
 Zero and Sign Extensions
All values of smaller bit-width are zero or sign-extended depending on the sign of the source value. You may need to insert casts to obtain alternative signs when assigning smaller bit-widths to larger.
 Truncations
Truncation occurs when you assign an arbitrary precision fixed-point of larger bit-width than the destination variable.
Class Methods, Operators, and Data Members
In general, any valid operation that can be done on a native C/C++ integer data type is supported (using operator overloading) for ap_[u]fixed types. In addition to these overloaded operators, some class-specific operators and methods are included to ease bit-level operations.
Binary Arithmetic Operators
Addition
ap_[u]fixed::RType ap_[u]fixed::operator + (ap_[u]fixed op)
Adds an arbitrary precision fixed-point value and a given operand op.
The operand can be any of the following types:
 ap_[u]fixed
 ap_[u]int
 C/C++ integer types
The result type ap_[u]fixed::RType depends on the type information of the two operands.
ap_fixed<76, 63> Result;
ap_fixed<5, 2> Val1 = 1.125;
ap_fixed<75, 62> Val2 = 6721.35595703125;
Result = Val1 + Val2; //Yields 6722.480957
Because Val2 has the larger bit-width on both integer part and fraction part, the result type has the same bit-width plus one to be able to store all possible result values.
Specifying the data width controls resources when using the power functions, as shown below. In similar cases, Xilinx recommends specifying the width of the stored result instead of specifying the width of fixed-point operations.
ap_ufixed<16,6> x=5;
ap_ufixed<16,7> y=hls::rsqrt<16,6>(x+x);
Subtraction
ap_[u]fixed::RType ap_[u]fixed::operator - (ap_[u]fixed op)
Subtracts a given operand op from an arbitrary precision fixed-point value.
The result type ap_[u]fixed::RType depends on the type information of the two operands.
ap_fixed<76, 63> Result;
ap_fixed<5, 2> Val1 = 1625.153; // Stored as 1.125 (default AP_WRAP and AP_TRN)
ap_fixed<75, 62> Val2 = 6721.355992351; // Stored as 6721.35595703125
Result = Val2 - Val1; // Yields 6720.230957
Because Val2 has the larger bit-width on both integer part and fraction part, the result type has the same bit-width plus one to be able to store all possible result values.
Multiplication
ap_[u]fixed::RType ap_[u]fixed::operator * (ap_[u]fixed op)
Multiplies an arbitrary precision fixed-point value by a given operand op.
ap_fixed<80, 64> Result;
ap_fixed<5, 2> Val1 = 1625.153; // Stored as 1.125 (default AP_WRAP and AP_TRN)
ap_fixed<75, 62> Val2 = 6721.355992351; // Stored as 6721.35595703125
Result = Val1 * Val2; // Yields 7561.525452
This shows the multiplication of Val1 and Val2. The integer bit-width of the result type is the sum of the operands' integer bit-widths, and its fraction bit-width is the sum of their fraction bit-widths.
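The width rules stated for addition, subtraction and multiplication can be written as compile-time arithmetic. The sketch below is an illustration for two signed operands only and reproduces the Result types used in the examples above; FixedShape, add_shape and mul_shape are hypothetical names, not part of ap_fixed.h.

```cpp
// Total width (w) and integer width (i) of a signed fixed-point type.
struct FixedShape {
    int w;
    int i;
    constexpr int frac() const { return w - i; }  // fractional bits
};

constexpr int max_int(int a, int b) { return a > b ? a : b; }

// Addition and subtraction: the wider integer part plus one growth bit,
// together with the wider fraction part.
constexpr FixedShape add_shape(FixedShape a, FixedShape b) {
    return FixedShape{max_int(a.i, b.i) + 1 + max_int(a.frac(), b.frac()),
                      max_int(a.i, b.i) + 1};
}

// Multiplication: integer widths add, and fraction widths add.
constexpr FixedShape mul_shape(FixedShape a, FixedShape b) {
    return FixedShape{a.w + b.w, a.i + b.i};
}

// ap_fixed<5,2> combined with ap_fixed<75,62>, as in the examples above.
static_assert(add_shape({5, 2}, {75, 62}).w == 76, "sum is ap_fixed<76,63>");
static_assert(add_shape({5, 2}, {75, 62}).i == 63, "sum is ap_fixed<76,63>");
static_assert(mul_shape({5, 2}, {75, 62}).w == 80, "product is ap_fixed<80,64>");
static_assert(mul_shape({5, 2}, {75, 62}).i == 64, "product is ap_fixed<80,64>");
```

Sizing the destination from these rules avoids both overflow handling and quantization when the result is stored.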
Division
ap_[u]fixed::RType ap_[u]fixed::operator / (ap_[u]fixed op)
Divides an arbitrary precision fixed-point value by a given operand op.
ap_fixed<84, 66> Result;
ap_fixed<5, 2> Val1 = 1625.153; // Stored as 1.125 (default AP_WRAP and AP_TRN)
ap_fixed<75, 62> Val2 = 6721.355992351; // Stored as 6721.35595703125
Result = Val2 / Val1; // Yields 5974.538628
This shows the division of Val2 by Val1. To preserve enough precision:
 The integer bit-width of the result type is the sum of the integer bit-width of Val2 and the fraction bit-width of Val1.
 The fraction bit-width of the result type is equal to the fraction bit-width of Val2.
Bitwise Logical Operators
Bitwise OR
ap_[u]fixed::RType ap_[u]fixed::operator | (ap_[u]fixed op)
Applies a bitwise OR operation on an arbitrary precision fixed-point value and a given operand op.
ap_fixed<75, 62> Result;
ap_fixed<5, 2> Val1 = 1625.153; // Stored as 1.125 (default AP_WRAP and AP_TRN)
ap_fixed<75, 62> Val2 = 6721.355992351; // Stored as 6721.35595703125
Result = Val1 | Val2; // Yields 6721.480957
Bitwise AND
ap_[u]fixed::RType ap_[u]fixed::operator & (ap_[u]fixed op)
Applies a bitwise AND operation on an arbitrary precision fixed-point value and a given operand op.
ap_fixed<75, 62> Result;
ap_fixed<5, 2> Val1 = 1625.153; // Stored as 1.125 (default AP_WRAP and AP_TRN)
ap_fixed<75, 62> Val2 = 6721.355992351; // Stored as 6721.35595703125
Result = Val1 & Val2; // Yields 1.00000
Bitwise XOR
ap_[u]fixed::RType ap_[u]fixed::operator ^ (ap_[u]fixed op)
Applies a bitwise XOR operation on an arbitrary precision fixed-point value and a given operand op.
ap_fixed<75, 62> Result;
ap_fixed<5, 2> Val1 = 1625.153; // Stored as 1.125 (default AP_WRAP and AP_TRN)
ap_fixed<75, 62> Val2 = 6721.355992351; // Stored as 6721.35595703125
Result = Val1 ^ Val2; // Yields 6720.480957
Increment and Decrement Operators
Pre-Increment
ap_[u]fixed ap_[u]fixed::operator ++ ()
This prefix operator function increases an arbitrary precision fixed-point variable by 1.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = ++Val1; // Yields 6.125000
Post-Increment
ap_[u]fixed ap_[u]fixed::operator ++ (int)
This postfix operator function:
 Increases an arbitrary precision fixed-point variable by 1.
 Returns the original value of this arbitrary precision fixed-point variable.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = Val1++; // Yields 5.125000
Pre-Decrement
ap_[u]fixed ap_[u]fixed::operator -- ()
This prefix operator function decreases this arbitrary precision fixed-point variable by 1.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = --Val1; // Yields 4.125000
Post-Decrement
ap_[u]fixed ap_[u]fixed::operator -- (int)
This postfix operator function:
 Decreases this arbitrary precision fixed-point variable by 1.
 Returns the original value of this arbitrary precision fixed-point variable.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = Val1--; // Yields 5.125000
Unary Operators
Addition
ap_[u]fixed ap_[u]fixed::operator + ()
Returns a self copy of an arbitrary precision fixedpoint variable.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = +Val1; // Yields 5.125000
Subtraction
ap_[u]fixed::RType ap_[u]fixed::operator - ()
Returns the negated value of an arbitrary precision fixed-point variable.
ap_fixed<25, 8> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = -Val1; // Yields -5.125000
Equality Zero
bool ap_[u]fixed::operator ! ()
This operator function:
 Compares an arbitrary precision fixed-point variable with 0.
 Returns the result.
bool Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = !Val1; // Yields false
Bitwise Inverse
ap_[u]fixed::RType ap_[u]fixed::operator ~ ()
Returns a bitwise complement of an arbitrary precision fixedpoint variable.
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val1 = 5.125;
Result = ~Val1; // Yields -5.25
Shift Operators
Unsigned Shift Left
ap_[u]fixed ap_[u]fixed::operator << (ap_uint<_W2> op)
This operator function:
 Shifts left by a given integer operand.
 Returns the result.
The operand can be a C/C++ integer type:
char
short
int
long
The return type of the shift left operation is the same width as the type being shifted.
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val = 5.375;
ap_uint<4> sh = 2;
Result = Val << sh; // Yields -10.5
The bit-width of the result is (W = 25, I = 15). Because the shift left operation result type is the same as the type of Val:
 The high order two bits of Val are shifted out.
 The result is -10.5.
If a result of 21.5 is required, Val must be cast to ap_fixed<10, 7> first, for example, ap_fixed<10, 7>(Val).
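The effect of the fixed result width can be reproduced with a plain C++ model of a two's complement word. This is a sketch under stated assumptions, not the ap_fixed.h implementation: shl_fixed is a hypothetical helper, width must be between 1 and 62, and the input is assumed to be exactly representable with frac_bits fractional bits.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Shifts a fixed-point value left inside a word of the given total width.
// Bits shifted past the top of the word are lost, and the remaining bits
// are reinterpreted as a signed two's complement value.
inline double shl_fixed(double x, int width, int frac_bits, unsigned sh) {
    const std::int64_t raw = std::llround(std::ldexp(x, frac_bits));
    const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
    const std::uint64_t shifted = (static_cast<std::uint64_t>(raw) << sh) & mask;
    const std::uint64_t sign_bit = std::uint64_t{1} << (width - 1);
    const std::int64_t value = (shifted & sign_bit)
        ? static_cast<std::int64_t>(shifted) - static_cast<std::int64_t>(mask) - 1
        : static_cast<std::int64_t>(shifted);
    return std::ldexp(static_cast<double>(value), -frac_bits);
}

inline void check_shift_left() {
    // 8-bit word, 3 fractional bits: 5.375 is 00101.011, shifted to 10101.100.
    assert(shl_fixed(5.375, 8, 3, 2) == -10.5);
    // Widened to 10 bits (7 integer bits) first: no bits are lost.
    assert(shl_fixed(5.375, 10, 3, 2) == 21.5);
}
```

The second call corresponds to casting the operand to a wider type before shifting, which is the remedy described above.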
Signed Shift Left
ap_[u]fixed ap_[u]fixed::operator << (ap_int<_W2> op)
This operator:
 Shifts left by a given integer operand.
 Returns the result.
The shift direction depends on whether the operand is positive or negative.
 If the operand is positive, a shift left is performed.
 If the operand is negative, a shift right (opposite direction) is performed.
The operand can be a C/C++ integer type:
 char
 short
 int
 long
The return type of the shift left operation is the same width as the type being shifted.
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val = 5.375;
ap_int<4> Sh = 2;
Result = Val << Sh; // Shift left, yields -10.5
Sh = -2;
Result = Val << Sh; // Shift right, yields 1.25
Unsigned Shift Right
ap_[u]fixed ap_[u]fixed::operator >> (ap_uint<_W2> op)
This operator function:
 Shifts right by a given integer operand.
 Returns the result.
The operand can be a C/C++ integer type:
char
short
int
long
The return type of the shift right operation is the same width as the type being shifted.
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val = 5.375;
ap_uint<4> sh = 2;
Result = Val >> sh; // Yields 1.25
If it is necessary to preserve all significant bits, extend the fraction part bit-width of Val first, for example ap_fixed<10, 5>(Val).
Signed Shift Right
ap_[u]fixed ap_[u]fixed::operator >> (ap_int<_W2> op)
This operator:
 Shifts right by a given integer operand.
 Returns the result.
The shift direction depends on whether the operand is positive or negative.
 If the operand is positive, a shift right is performed.
 If the operand is negative, a shift left (opposite direction) is performed.
The operand can be a C/C++ integer type (char, short, int, or long).
The return type of the shift right operation is the same width as the type being shifted. For example:
ap_fixed<25, 15> Result;
ap_fixed<8, 5> Val = 5.375;
ap_int<4> Sh = 2;
Result = Val >> Sh; // Shift right, yields 1.25
Sh = -2;
Result = Val >> Sh; // Shift left, yields -10.5
Relational Operators
Equality
bool ap_[u]fixed::operator == (ap_[u]fixed op)
This operator compares the arbitrary precision fixed-point variable with a given operand.
Returns true if they are equal and false if they are not equal.
The type of operand op can be ap_[u]fixed, ap_int, or C/C++ integer types. For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 == Val2; // Yields true
Result = Val1 == Val3; // Yields false
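Val1 == Val2 is true in this example because 17.25 does not fit in the 4 integer bits of ap_fixed<9, 4>: with the default wrap overflow mode the upper bits are dropped and the stored value is 1.25. The following plain C++ sketch models that wrap. The helper wrap_fixed is hypothetical (not a library function) and assumes wrap-around overflow and a value that needs no quantization.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Value that a signed fixed-point type with W total bits and I integer bits
// would hold after wrap-around overflow. Assumes v * 2^(W-I) is an integer
// (no quantization needed) and W <= 32.
double wrap_fixed(double v, int W, int I) {
    assert(W > 0 && W <= 32 && I <= W);
    const int frac = W - I;
    int64_t raw = static_cast<int64_t>(std::ldexp(v, frac)); // v * 2^frac
    const int64_t modulus = int64_t(1) << W;
    raw = ((raw % modulus) + modulus) % modulus; // keep only the low W bits
    if (raw >= modulus / 2) raw -= modulus;      // reinterpret as signed
    return std::ldexp(static_cast<double>(raw), -frac);
}
```

Here wrap_fixed(17.25, 9, 4) evaluates to 1.25, while wrap_fixed(3.25, 10, 5) stays 3.25 because it fits in 5 integer bits.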
Inequality
bool ap_[u]fixed::operator != (ap_[u]fixed op)
This operator compares this arbitrary precision fixed-point variable with a given operand.
Returns true if they are not equal and false if they are equal.
The type of operand op can be:
 ap_[u]fixed
 ap_int
 C or C++ integer types
For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 != Val2; // Yields false
Result = Val1 != Val3; // Yields true
Greater than or equal to
bool ap_[u]fixed::operator >= (ap_[u]fixed op)
This operator compares a variable with a given operand.
Returns true if they are equal or if the variable is greater than the operand and false otherwise.
The type of operand op can be ap_[u]fixed, ap_int, or C/C++ integer types.
For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 >= Val2; // Yields true
Result = Val1 >= Val3; // Yields false
Less than or equal to
bool ap_[u]fixed::operator <= (ap_[u]fixed op)
This operator compares a variable with a given operand, and returns true if it is equal to or less than the operand and false if not.
The type of operand op can be ap_[u]fixed, ap_int, or C/C++ integer types.
For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 <= Val2; // Yields true
Result = Val1 <= Val3; // Yields true
Greater than
bool ap_[u]fixed::operator > (ap_[u]fixed op)
This operator compares a variable with a given operand, and returns true if it is greater than the operand and false if not.
The type of operand op can be ap_[u]fixed, ap_int, or C/C++ integer types.
For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 > Val2; // Yields false
Result = Val1 > Val3; // Yields false
Less than
bool ap_[u]fixed::operator < (ap_[u]fixed op)
This operator compares a variable with a given operand, and returns true if it is less than the operand and false if not.
The type of operand op can be ap_[u]fixed, ap_int, or C/C++ integer types. For example:
bool Result;
ap_ufixed<8, 5> Val1 = 1.25;
ap_fixed<9, 4> Val2 = 17.25;
ap_fixed<10, 5> Val3 = 3.25;
Result = Val1 < Val2; // Yields false
Result = Val1 < Val3; // Yields true
Bit Operator
Bit-Select and Set
af_bit_ref ap_[u]fixed::operator [] (int bit)
This operator selects one bit from an arbitrary precision fixed-point value and returns it.
The returned value is a reference value that can set or clear the corresponding bit in the ap_[u]fixed variable.
The bit argument must be an integer value and it specifies the index of the bit to select. The least significant bit has index 0. The highest permissible index is one less than the bitwidth of this ap_[u]fixed variable.
The result type is af_bit_ref with a value of either 0 or 1. For example:
ap_fixed<8, 5> Value = 1.375;
Value[3]; // Yields 1
Value[4]; // Yields 0
Value[2] = 1; // Yields 1.875
Value[3] = 0; // Yields 0.875
Bit Range
af_range_ref ap_[u]fixed::range (unsigned Hi, unsigned Lo)
af_range_ref ap_[u]fixed::operator () (unsigned Hi, unsigned Lo)
This operation is similar to the bit-select operator [] except that it operates on a range of bits instead of a single bit.
It selects a group of bits from the arbitrary precision fixed-point variable. The Hi argument provides the upper range of bits to be selected. The Lo argument provides the lowest bit to be selected. If Lo is larger than Hi, the bits selected are returned in the reverse order.
The return type af_range_ref represents a reference in the range of the ap_[u]fixed variable specified by Hi and Lo.
For example:
ap_uint<4> Result = 0;
ap_ufixed<4, 2> Value = 1.25;
ap_uint<8> Repl = 0xAA;
Result = Value.range(3, 0); // Yields: 0x5
Value(3, 0) = Repl(3, 0); // Value bits are now 0xA, that is 2.5
// When Lo > Hi, return the reversed bit string
Result = Value.range(0, 3); // Yields: 0x5 (0xA reversed)
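The reversal rule for Lo larger than Hi can be checked on raw bit patterns. The following plain C++ sketch uses a hypothetical helper, range_bits, that operates on an unsigned integer rather than on an ap_[u]fixed variable; it only illustrates the bit ordering described above.

```cpp
#include <cassert>
#include <cstdint>

// Extract bits [hi:lo] of raw. If lo > hi, the bits between them are
// returned in reverse order (bit hi becomes the most significant bit).
uint32_t range_bits(uint32_t raw, unsigned hi, unsigned lo) {
    assert(hi < 32 && lo < 32);
    uint32_t out = 0;
    if (hi >= lo) {
        for (unsigned i = hi + 1; i-- > lo;) {   // walk from hi down to lo
            out = (out << 1) | ((raw >> i) & 1u);
        }
    } else {
        for (unsigned i = hi; i <= lo; ++i) {    // walk upward: reversed
            out = (out << 1) | ((raw >> i) & 1u);
        }
    }
    return out;
}
```

For the 4-bit pattern 0x5 (binary 0101), range_bits(0x5, 3, 0) returns 0x5 and range_bits(0x5, 0, 3) returns 0xA (binary 1010).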
Range Select
af_range_ref ap_[u]fixed::range ()
af_range_ref ap_[u]fixed::operator () ()
This operation is the special case of the range select operator (). It selects all bits from this arbitrary precision fixed-point value in the normal order.
The return type af_range_ref represents a reference to the range specified by Hi = W - 1 and Lo = 0. For example:
ap_uint<4> Result = 0;
ap_ufixed<4, 2> Value = 1.25;
ap_uint<8> Repl = 0xAA;
Result = Value.range(); // Yields: 0x5
Value() = Repl(3, 0); // Value bits are now 0xA, that is 2.5
Length
int ap_[u]fixed::length ()
This function returns an integer value that provides the number of bits in an arbitrary precision fixed-point value. It can be used with a type or a value. For example:
ap_ufixed<128, 64> My128APFixed;
int bitwidth = My128APFixed.length(); // Yields 128
Explicit Conversion Methods
Fixed to Double
double ap_[u]fixed::to_double ()
This member function returns this fixed-point value in the form of IEEE double-precision format. For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
double Result;
Result = MyAPFixed.to_double(); // Yields 333.789
Fixed to Float
float ap_[u]fixed::to_float()
This member function returns this fixed-point value in the form of IEEE single-precision (float) format. For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
float Result;
Result = MyAPFixed.to_float(); // Yields 333.789
Fixed to HalfPrecision Floating Point
half ap_[u]fixed::to_half()
This member function returns this fixed-point value in the form of HLS half-precision (16-bit) floating-point format. For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
half Result;
Result = MyAPFixed.to_half(); // Yields approximately 333.75 (nearest half-precision value)
Fixed to ap_int
ap_int ap_[u]fixed::to_ap_int ()
This member function explicitly converts this fixed-point value to an ap_int that captures all integer bits (fraction bits are truncated). For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
ap_uint<77> Result;
Result = MyAPFixed.to_ap_int(); //Yields 333
Fixed to Integer
int ap_[u]fixed::to_int ()
unsigned ap_[u]fixed::to_uint ()
ap_slong ap_[u]fixed::to_int64 ()
ap_ulong ap_[u]fixed::to_uint64 ()
These member functions explicitly convert this fixed-point value to C built-in integer types. For example:
ap_ufixed<256, 77> MyAPFixed = 333.789;
unsigned int Result;
Result = MyAPFixed.to_uint(); //Yields 333
unsigned long long Result64;
Result64 = MyAPFixed.to_uint64(); //Yields 333
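Xilinx recommends explicitly calling these member functions, instead of using a C-style cast, to convert ap_[u]fixed to other data types.
Compile Time Access to Data Type Attributes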
The ap_[u]fixed<> types are provided with several static members that allow the size and configuration of data types to be determined at compile time. The data type is provided with the static const members width, iwidth, qmode, and omode:
static const int width = _AP_W;
static const int iwidth = _AP_I;
static const ap_q_mode qmode = _AP_Q;
static const ap_o_mode omode = _AP_O;
You can use these data members to extract the following information from any existing ap_[u]fixed<> data type:
 width: The width of the data type.
 iwidth: The width of the integer part of the data type.
 qmode: The quantization mode of the data type.
 omode: The overflow mode of the data type.
For example, you can use these data members to extract the data width of an existing ap_[u]fixed<> data type to create another ap_[u]fixed<> data type at compile time.
The following example shows how the size of variable Res is automatically defined as 1 bit greater than variables Val1 and Val2 with the same quantization modes:
// Definition of basic data type
#define INPUT_DATA_WIDTH 12
#define IN_INTG_WIDTH 6
#define IN_QMODE AP_RND_ZERO
#define IN_OMODE AP_WRAP
typedef ap_fixed<INPUT_DATA_WIDTH, IN_INTG_WIDTH, IN_QMODE, IN_OMODE> data_t;
// Definition of variables
data_t Val1, Val2;
// Res is automatically sized at compile time to be 1 bit greater than INPUT_DATA_WIDTH
// The bit growth in Res will be in the integer bits
ap_fixed<data_t::width+1, data_t::iwidth+1, data_t::qmode, data_t::omode> Res = Val1 + Val2;
This ensures that Vitis HLS correctly models the bit-growth caused by the addition even if you update the value of INPUT_DATA_WIDTH, IN_INTG_WIDTH, or the quantization modes for data_t.
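The same compile-time pattern can be tried in standard C++ without the Vitis HLS headers. In the sketch below, fixed_desc is a hypothetical stand-in that carries only the width and iwidth attributes (no arithmetic, no quantization or overflow modes); it shows how a derived type tracks changes to the base type automatically.

```cpp
// Hypothetical stand-in for a fixed-point type: only the static
// attributes are modelled, not the value or the arithmetic.
template <int W, int I>
struct fixed_desc {
    static const int width = W;
    static const int iwidth = I;
};

typedef fixed_desc<12, 6> data_t;

// Sum type derived from data_t: one extra bit, placed in the integer part.
typedef fixed_desc<data_t::width + 1, data_t::iwidth + 1> sum_t;

// All of these checks are evaluated by the compiler, not at run time.
static_assert(sum_t::width == 13, "one bit wider than data_t");
static_assert(sum_t::iwidth == 7, "the growth is in the integer bits");
static_assert(sum_t::width - sum_t::iwidth == data_t::width - data_t::iwidth,
              "the number of fraction bits is unchanged");
```

Changing the parameters of data_t changes sum_t with it, so the checks on the relationship between the two types keep holding without further edits.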
Vitis HLS Math Library
The Vitis HLS Math Library (hls_math.h) provides support for the synthesis of the standard C (math.h) and C++ (cmath) libraries and is automatically used to specify the math operations during synthesis. The support includes floating point (single-precision, double-precision, and half-precision) for all functions and fixed-point support for some functions.
The hls_math.h library can optionally be used in C++ source code in place of the standard C++ math library (cmath), but it cannot be used in C source code. Vitis HLS will use the appropriate simulation implementation to avoid accuracy differences between C simulation and C/RTL co-simulation.
HLS Math Library Accuracy
The HLS math functions are implemented as synthesizable bit-approximate functions from the hls_math.h library. Bit-approximate HLS math library functions do not provide the same accuracy as the standard C function. To achieve the desired result, the bit-approximate implementation might use a different underlying algorithm than the standard C math library version. The accuracy of the function is specified in terms of ULP (Unit of Least Precision). This difference in accuracy has implications for both C simulation and C/RTL co-simulation.
The ULP difference is typically in the range of 1-4 ULP.
 If the standard C math library is used in the C source code, there may be a difference between the C simulation and the C/RTL co-simulation because some functions exhibit a ULP difference from the standard C math library.
 If the HLS math library is used in the C source code, there will be no difference between the C simulation and the C/RTL co-simulation. A C simulation using the HLS math library may, however, differ from a C simulation using the standard C math library.
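One way a test bench could express a difference in ULP rather than as an absolute value is to compare the bit patterns of the two results. The following sketch is an illustration only: ulp_distance and ordered_bits are hypothetical helpers, not part of the HLS math library, and the code assumes IEEE-754 single-precision floats.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float onto an integer line where adjacent representable floats
// differ by exactly 1. Assumes IEEE-754 single precision.
static int64_t ordered_bits(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    // Negative floats are stored as sign and magnitude; mirror them below zero.
    return (u & 0x80000000u) ? -static_cast<int64_t>(u & 0x7FFFFFFFu)
                             : static_cast<int64_t>(u);
}

// Distance between a and b counted in representable float steps.
// 0 means the values are equal (+0 and -0 are treated as equal).
int64_t ulp_distance(float a, float b) {
    assert(!std::isnan(a) && !std::isnan(b));
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return d < 0 ? -d : d;
}
```

A test bench could then accept a result when ulp_distance(expected, actual) stays within a chosen bound, for example 4, instead of using a fixed absolute tolerance.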
In addition, the following seven functions might show some differences, depending on the C standard used to compile and run the C simulation:
 copysign
 fpclassify
 isinf
 isfinite
 isnan
 isnormal
 signbit
C90 mode
Only isinf, isnan, and copysign are usually provided by the system header files, and they operate on doubles. In particular, copysign always returns a double result. This might result in unexpected results after synthesis if it must be returned to a float, because a double-to-float conversion block is introduced into the hardware.
C99 mode (-std=c99)
All seven functions are usually provided under the expectation that the system header files will redirect them to __isnan(double) and __isnan(float). The usual GCC header files do not redirect isnormal, but implement it in terms of fpclassify.
C++ Using math.h
All seven are provided by the system header files, and they operate on doubles. copysign always returns a double result. This might cause unexpected results after synthesis if it must be returned to a float, because a double-to-float conversion block is introduced into the hardware.
C++ Using cmath
Similar to C99 mode (-std=c99), except that:
 The system header files are usually different.
 The functions are properly overloaded for float and double arguments, for example isnan(double) and isinf(double).
copysign and copysignf are handled as built-ins even when using namespace std;.
C++ Using cmath and namespace std
No issues. Xilinx recommends using the following for best results:
 -std=c99 for C
 -fno-builtin for C and C++
To specify C compile options such as -std=c99, use the Tcl command add_files with the -cflags option. Alternatively, use the Edit CFLAGs button in the Project Settings dialog box.
The HLS Math Library
The following functions are provided in the HLS math library. Each function supports half-precision (type half), single-precision (type float), and double-precision (type double).
For each function func listed below, there is also an associated half-precision only function named half_func and single-precision only function named funcf provided in the library.
When mixing half-precision, single-precision, and double-precision data types, check for common synthesis errors to prevent introducing type-conversion hardware in the final FPGA implementation.
Trigonometric Functions
acos  acospi  asin  asinpi 
atan  atan2  atan2pi  cos 
cospi  sin  sincos  sinpi 
tan  tanpi 
Hyperbolic Functions
acosh  asinh  atanh  cosh 
sinh  tanh 
Exponential Functions
exp  exp10  exp2  expm1 
frexp  ldexp  modf 
Logarithmic Functions
ilogb  log  log10  log1p 
Power Functions
cbrt  hypot  pow  rsqrt 
sqrt 
Error Functions
erf  erfc 
Rounding Functions
ceil  floor  llrint  llround 
lrint  lround  nearbyint  rint 
round  trunc 
Remainder Functions
fmod  remainder  remquo 
Floating Point
copysign  nan  nextafter  nexttoward 
Difference Functions
fdim  fmax  fmin  maxmag 
minmag 
Other Functions
abs  divide  fabs  fma 
fract  mad  recip 
Classification Functions
fpclassify  isfinite  isinf  isnan 
isnormal  signbit 
Comparison Functions
isgreater  isgreaterequal  isless  islessequal 
islessgreater  isunordered 
Relational Functions
all  any  bitselect  isequal 
isnotequal  isordered  select 
Fixed-Point Math Functions
Fixed-point implementations are also provided for the following math functions.
All fixed-point math functions support ap_[u]fixed and ap_[u]int data types with the following bitwidth specifications:
 ap_fixed<W,I> where I<=33 and W-I<=32
 ap_ufixed<W,I> where I<=32 and W-I<=32
 ap_int<I> where I<=33
 ap_uint<I> where I<=32
Trigonometric Functions
cos  sin  tan  acos  asin  atan  atan2  sincos 
cospi  sinpi 
Hyperbolic Functions
cosh  sinh  tanh  acosh  asinh  atanh 
Exponential Functions
exp  frexp  modf  exp2  expm1 
Logarithmic Functions
log  log10  ilogb  log1p 
Power Functions
pow  sqrt  rsqrt  cbrt  hypot 
Error Functions
erf  erfc 
Rounding Functions
ceil  floor  trunc  round  rint  nearbyint 
Floating Point
nextafter  nexttoward 
Difference Functions
erf  erfc  fdim  fmax  fmin  maxmag  minmag 
Other Functions
fabs  recip  abs  fract  divide 
Classification Functions
signbit 
Comparison Functions
isgreater  isgreaterequal  isless  islessequal  islessgreater 
Relational Functions
isequal  isnotequal  any  all  bitselect 
The fixed-point type provides a slightly less accurate version of the function value, but a smaller and faster RTL implementation.
The methodology for implementing a math function with fixed-point data types is:
 Determine if a fixed-point implementation is supported.
 Update the math functions to use ap_fixed types.
 Perform C simulation to validate the design still operates with the required precision. The C simulation is performed using the same bit-accurate types as the RTL implementation.
 Synthesize the design.
For example, a fixed-point implementation of the function sin is specified by using fixed-point types with the math function as follows:
#include "hls_math.h"
#include "ap_fixed.h"
ap_fixed<32,6> my_input, my_output; // 6 integer bits so that 24.675 is representable
my_input = 24.675;
my_output = sin(my_input);
When using fixed-point math functions, the result type must have the same width and integer bits as the input.
Verification and Math Functions
If the standard C math library is used in the C source code, the C simulation results and the C/RTL co-simulation results may be different: if any of the math functions in the source code have a ULP difference from the standard C math library, it may result in differences when the RTL is simulated.
If the hls_math.h library is used in the C source code, the C simulation and C/RTL co-simulation results are identical. However, the results of C simulation using hls_math.h are not the same as those using the standard C libraries. The hls_math.h library simply ensures the C simulation matches the C/RTL co-simulation results. In both cases, the same RTL implementation is created. The following explains each of the possible options which are used to perform verification when using math functions.
Verification Option 1: Standard Math Library and Verify Differences
In this option, the standard C math libraries are used in the source code. If any of the functions synthesized do not have exact accuracy, the C/RTL co-simulation results differ from the C simulation results. The following example highlights this approach.
#include <cmath>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <cstdlib>
using namespace std;
typedef float data_t;
data_t cpp_math(data_t angle) {
data_t s = sinf(angle);
data_t c = cosf(angle);
return sqrtf(s*s+c*c);
}
In this case, the results between C simulation and C/RTL co-simulation are different. Keep in mind when comparing the outputs of simulation, any results written from the test bench are written to the working directory where the simulation executes:
 C simulation: Folder <project>/<solution>/csim/build
 C/RTL co-simulation: Folder <project>/<solution>/sim/<RTL>
where <project> is the project folder, <solution> is the name of the solution folder, and <RTL> is the type of RTL verified (verilog or vhdl). The following figure shows a typical comparison of the pre-synthesis results file on the left-hand side and the post-synthesis RTL results file on the right-hand side. The output is shown in the third column.
The results of pre-synthesis simulation and post-synthesis simulation differ by fractional amounts. You must decide whether these fractional amounts are acceptable in the final RTL implementation.
The recommended flow for handling these differences is using a test bench that checks the results to ensure that they lie within an acceptable error range. This can be accomplished by creating two versions of the same function, one for synthesis and one as a reference version. In this example, only function cpp_math is synthesized.
#include <cmath>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <cstdlib>
using namespace std;
typedef float data_t;
data_t cpp_math(data_t angle) {
data_t s = sinf(angle);
data_t c = cosf(angle);
return sqrtf(s*s+c*c);
}
data_t cpp_math_sw(data_t angle) {
data_t s = sinf(angle);
data_t c = cosf(angle);
return sqrtf(s*s+c*c);
}
The test bench to verify the design compares the outputs of both functions to determine the difference, using variable diff in the following example. During C simulation both functions produce identical outputs. During C/RTL co-simulation function cpp_math produces different results and the difference in results is checked.
int main() {
data_t angle = 0.01;
data_t output, exp_output, diff;
int retval=0;
for (data_t i = 0; i <= 250; i++) {
output = cpp_math(angle);
exp_output = cpp_math_sw(angle);
// Check for differences
diff = ( (exp_output > output) ? exp_output - output : output - exp_output);
if (diff > 0.0000005) {
printf("Difference %.10f exceeds tolerance at angle %.10f \n", diff, angle);
retval=1;
}
angle = angle + .1;
}
if (retval != 0) {
printf("Test failed !!!\n");
retval=1;
} else {
printf("Test passed !\n");
}
// Return 0 if the test passes
return retval;
}
If the margin of difference is lowered to 0.00000005, this test bench highlights the margin of error during C/RTL co-simulation:
Difference 0.0000000596 at angle 1.1100001335
Difference 0.0000000596 at angle 1.2100001574
Difference 0.0000000596 at angle 1.5100002289
Difference 0.0000000596 at angle 1.6100002527
etc..
When using the standard C math libraries (math.h and cmath), create a “smart” test bench to verify any differences in accuracy are acceptable.
Verification Option 2: HLS Math Library and Validate Differences
An alternative verification option is to convert the source code to use the HLS math library. With this option, there are no differences between the C simulation and C/RTL co-simulation results. The following example shows how the code above is modified to use the hls_math.h library.
 Include the hls_math.h header file.
 Replace the math functions with the equivalent hls:: function.
#include <cmath>
#include "hls_math.h"
#include <fstream>
#include <iostream>
#include <iomanip>
#include <cstdlib>
using namespace std;
typedef float data_t;
data_t cpp_math(data_t angle) {
data_t s = hls::sinf(angle);
data_t c = hls::cosf(angle);
return hls::sqrtf(s*s+c*c);
}
Verification Option 3: HLS Math Library File and Validate Differences
Including the HLS math library file lib_hlsm.cpp as a design file ensures Vitis HLS uses the HLS math library for C simulation. This option is identical to option 2; however, it does not require the C code to be modified.
The HLS math library file is located in the src directory in the Vitis HLS installation area. Simply copy the file to your local folder and add the file as a standard design file.
As with option 2, with this option there is now a difference between the C simulation results using the HLS math library file and those previously obtained without adding this file. These differences should be validated with C simulation using a “smart” test bench similar to option 1.
Common Synthesis Errors
The following are common use errors when synthesizing math functions. These are often (but not exclusively) caused by converting C functions to C++ to take advantage of synthesis for math functions.
C++ cmath
If the C++ cmath header file is used, the floating point functions (for example, sinf and cosf) can be used. These result in 32-bit operations in hardware. The cmath header file also overloads the standard functions (for example, sin and cos) so they can be used for float and double types.
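The overloading can be confirmed with a few compile-time checks in standard C++. This sketch uses only the standard cmath and type_traits headers (nothing from Vitis HLS) and shows which precision each expression ends up with, which is what determines whether 32-bit or 64-bit operations would be implied.

```cpp
#include <cmath>
#include <type_traits>

// std::sin is overloaded: the argument type selects the precision.
static_assert(std::is_same<decltype(std::sin(1.0f)), float>::value,
              "float argument gives a single-precision result");
static_assert(std::is_same<decltype(std::sin(1.0)), double>::value,
              "double argument gives a double-precision result");

// A double literal promotes the whole expression to double.
static_assert(std::is_same<decltype(std::sin(1.0f) * 2.0), double>::value,
              "mixing in a double literal promotes to double");
// A float literal keeps the expression in single precision.
static_assert(std::is_same<decltype(std::sin(1.0f) * 2.0f), float>::value,
              "float literal keeps single precision");
```

The third check is the case to watch for: an unsuffixed literal such as 2.0 is a double, so it quietly widens an otherwise single-precision computation.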
C math.h
If the C math.h library is used, the single-precision functions (for example, sinf and cosf) are required to synthesize 32-bit floating point operations. All standard function calls (for example, sin and cos) result in doubles and 64-bit double-precision operations being synthesized.
Cautions
When converting C functions to C++ to take advantage of math.h support, be sure that the new C++ code compiles correctly before synthesizing with Vitis HLS. For example, if sqrtf() is used in the code with math.h, the following extern declaration must be added to the C++ code to support it:
#include <math.h>
extern "C" float sqrtf(float);
To avoid unnecessary hardware caused by type conversion, follow the warnings on mixing double and float types discussed in Floats and Doubles.
HLS Stream Library
Streaming data is a type of data transfer in which data samples are sent in sequential order starting from the first sample. Streaming requires no address management.
Modeling designs that use streaming data can be difficult in C. The approach of using pointers to perform multiple read and/or write accesses can introduce issues, because there are implications for the type qualifier and how the test bench is constructed.
Vitis HLS provides a C++ template class hls::stream<> for modeling streaming data structures. The streams implemented with the hls::stream<> class have the following attributes.
 In the C code, an hls::stream<> behaves like a FIFO of infinite depth. There is no requirement to define the size of an hls::stream<>.
 They are read from and written to sequentially. That is, after data is read from an hls::stream<>, it cannot be read again.
 An hls::stream<> on the top-level interface is by default implemented with an ap_fifo interface.
 An hls::stream<> internal to the design is implemented as a FIFO with a depth of 2. The optimization directive STREAM is used to change this default size.
This section shows how the hls::stream<> class can more easily model designs with streaming data. The topics in this section provide:
 An overview of modeling with streams and the RTL implementation of streams.
 Rules for global stream variables.
 How to use streams.
 Blocking reads and writes.
 Non-blocking reads and writes.
 Controlling the FIFO depth.
The hls::stream class should always be passed between functions as a C++ reference argument. For example, &my_stream.
The hls::stream class is only used in C++ designs. Array of streams is not supported.
C Modeling and RTL Implementation
Streams are modeled as an infinite queue in software (and in the test bench during RTL co-simulation). There is no need to specify any depth to simulate streams in C++. Streams can be used inside functions and on the interface to functions. Internal streams may be passed as function parameters.
Streams can be used only in C++ based designs. Each hls::stream<> object must be written by a single process and read by a single process.
If an hls::stream is used on the top-level interface, it is by default implemented in the RTL as a FIFO interface (ap_fifo) but may be optionally implemented as a handshake interface (ap_hs) or an AXI-Stream interface (axis).
If an hls::stream is used inside the design function and synthesized into hardware, it is implemented as a FIFO with a default depth of 2. In some cases, such as when interpolation is used, the depth of the FIFO might have to be increased to ensure the FIFO can hold all the elements produced by the hardware. Failure to ensure the FIFO is large enough to hold all the data samples generated by the hardware can result in a stall in the design (seen in C/RTL co-simulation and in the hardware implementation). The depth of the FIFO can be adjusted using the STREAM directive with the depth option. An example of this is provided in the example design hls_stream.
Ensure hls::stream variables are correctly sized when used in the default non-DATAFLOW regions.
If an hls::stream is used to transfer data between tasks (sub-functions or loops), you should immediately consider implementing the tasks in a DATAFLOW region where data streams from one task to the next. The default (non-DATAFLOW) behavior is to complete each task before starting the next task, in which case the FIFOs used to implement the hls::stream variables must be sized to ensure they are large enough to hold all the data samples generated by the producer task. Failure to increase the size of the hls::stream variables results in the error below:
ERROR: [XFORM 203733] An internal stream xxxx.xxxx.V.user.V' with default size is
used in a nondataflow region, which may result in deadlock. Please consider to
resize the stream using the directive 'set_directive_stream' or the 'HLS stream'
pragma.
This error informs you that, in a non-DATAFLOW region, the default FIFO depth of 2 may not be large enough to hold all the data samples written to the FIFO by the producer task.
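The sizing requirement can be illustrated with a small software model. The sketch below is plain C++ using std::queue, not hls::stream, and the function names are hypothetical. It models a producer task that runs to completion before the consumer task starts, which is the default non-DATAFLOW schedule described above.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Run a producer to completion before the consumer starts, against a FIFO
// of the given depth. Returns the number of samples written before the
// FIFO fills and the producer can make no further progress.
std::size_t samples_written_before_stall(std::size_t depth, std::size_t n_samples) {
    assert(depth > 0);
    std::queue<int> fifo;
    std::size_t written = 0;
    while (written < n_samples && fifo.size() < depth) {
        fifo.push(static_cast<int>(written));
        ++written;
    }
    return written;
}

// The sequential schedule completes only if every sample fits at once.
bool sequential_schedule_completes(std::size_t depth, std::size_t n_samples) {
    return samples_written_before_stall(depth, n_samples) == n_samples;
}
```

In this model a producer that generates 16 samples stalls after 2 writes into a depth-2 FIFO, and completes only when the depth is at least 16.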
Global and Local Streams
Streams may be defined either locally or globally. Local streams are always implemented as internal FIFOs. Global streams can be implemented as internal FIFOs or ports:
 Globally-defined streams that are only read from, or only written to, are inferred as external ports of the top-level RTL block.
 Globally-defined streams that are both read from and written to (in the hierarchy below the top-level function) are implemented as internal FIFOs.
Streams defined in the global scope follow the same rules as any other global variables.
Using HLS Streams
To use hls::stream<> objects, include the header file hls_stream.h. Streaming data objects are defined by specifying the type and variable name. In this example, a 128-bit unsigned integer type is defined and used to create a stream variable called my_wide_stream.
#include "ap_int.h"
#include "hls_stream.h"
typedef ap_uint<128> uint128_t; // 128-bit user-defined type
hls::stream<uint128_t> my_wide_stream; // A stream declaration
Streams must use scoped naming. Xilinx recommends using the scoped hls:: naming shown in the example above. However, if you want to use the hls namespace, you can rewrite the preceding example as:
#include <ap_int.h>
#include <hls_stream.h>
using namespace hls;
typedef ap_uint<128> uint128_t; // 128-bit user-defined type
stream<uint128_t> my_wide_stream; // hls:: no longer required
Given a stream specified as hls::stream<T>, the type T may be:
 Any C++ native data type
 A Vitis HLS arbitrary precision type (for example, ap_int<>, ap_ufixed<>)
 A user-defined struct containing either of the above types
A stream can also be specified as hls::stream<Type, Depth>, where Depth indicates the depth of the FIFO needed in the verification adapter that the HLS tool creates for RTL co-simulation.
Streams may be optionally named. Providing a name for the stream allows the name to be used in reporting. For example, Vitis HLS automatically checks to ensure all elements from an input stream are read during simulation. Given the following two streams:
stream<uint8_t> bytestr_in1;
stream<uint8_t> bytestr_in2("input_stream2");
Any warnings on elements left in the streams are reported as follows, where it is clear which message relates to bytestr_in2:
WARNING: Hls::stream 'hls::stream<unsigned char>.1' contains leftover data, which
may result in RTL simulation hanging.
WARNING: Hls::stream 'input_stream2' contains leftover data, which may result in RTL
simulation hanging.
When streams are passed into and out of functions, they must be passed-by-reference as in the following example:
void stream_function (
hls::stream<uint8_t> &strm_out,
hls::stream<uint8_t> &strm_in,
uint16_t strm_len
)
Vitis HLS supports both blocking and non-blocking access methods.
 Non-blocking accesses can be implemented only as FIFO interfaces.
 Streaming ports that are implemented as ap_fifo ports and that are defined with an AXI4-Stream resource must not use non-blocking accesses.
A complete design example using streams is provided in the Vitis HLS examples. Refer to the hls_stream example in the design examples available from the GUI welcome screen.
Blocking Reads and Writes
The basic accesses to an hls::stream<> object are blocking reads and writes. These are accomplished using class methods. These methods stall (block) execution if a read is attempted on an empty stream FIFO, a write is attempted to a full stream FIFO, or until a full handshake is accomplished for a stream mapped to an ap_hs interface protocol.
A stall can be observed in C/RTL co-simulation as the continued execution of the simulator without any progress in the transactions. The following shows a classic example of a stall situation, where the RTL simulation time keeps increasing, but there is no inter-transaction or intra-transaction progress:
// RTL Simulation : "InterTransaction Progress" ["IntraTransaction Progress"] @
"Simulation Time"
///////////////////////////////////////////////////////////////////////////////////
// RTL Simulation : 0 / 1 [0.00%] @ "110000"
// RTL Simulation : 0 / 1 [0.00%] @ "202000"
// RTL Simulation : 0 / 1 [0.00%] @ "404000"
Blocking Write Methods
In this example, the value of variable src_var is pushed into the stream.
// Usage of void write(const T & wdata)
hls::stream<int> my_stream;
int src_var = 42;
my_stream.write(src_var);
The << operator is overloaded such that it may be used in a similar fashion to the stream insertion operators for C++ streams (for example, iostreams and filestreams). The hls::stream<> object to be written to is supplied as the left-hand side argument and the value to be written as the right-hand side.
// Usage of void operator << (T & wdata)
hls::stream<int> my_stream;
int src_var = 42;
my_stream << src_var;
Blocking Read Methods
This method reads from the head of the stream and assigns the value to the variable dst_var.
// Usage of void read(T &rdata)
hls::stream<int> my_stream;
int dst_var;
my_stream.read(dst_var);
Alternatively, the next object in the stream can be read by assigning (using, for example, = or +=) the stream to an object on the left-hand side:
// Usage of T read(void)
hls::stream<int> my_stream;
int dst_var = my_stream.read();
The '>>' operator is overloaded to allow use similar to the stream extraction operator for C++ streams (for example, iostreams and filestreams). The hls::stream is supplied as the left-hand side argument and the destination variable as the right-hand side.
// Usage of void operator >> (T & rdata)
hls::stream<int> my_stream;
int dst_var;
my_stream >> dst_var;
Non-Blocking Reads and Writes
Non-blocking write and read methods are also provided. These allow execution to continue even when a read is attempted on an empty stream or a write is attempted on a full stream.
These methods return a Boolean value indicating the status of the access (true if successful, false otherwise). Additional methods are included for testing the status of an hls::stream<> stream.
Important: The non-blocking behavior is supported only on interfaces using the ap_fifo protocol. More specifically, the AXI-Stream standard and the Xilinx ap_hs I/O protocol do not support non-blocking accesses.
During C simulation, streams have an infinite size. It is therefore not possible to validate with C simulation whether the stream is full. These methods can be verified only during RTL simulation, when the FIFO sizes are defined (either the default depth of 2, or an arbitrary depth defined with the STREAM directive).
Non-Blocking Writes
This method attempts to push variable src_var into the stream my_stream, returning a Boolean true if successful. Otherwise, false is returned and the queue is unaffected.
// Usage of bool write_nb(const T & wdata)
hls::stream<int> my_stream;
int src_var = 42;
if (my_stream.write_nb(src_var)) {
// Perform standard operations
...
} else {
// Write did not occur
return;
}
Fullness Test
bool full(void)
Returns true if and only if the hls::stream<> object is full.
// Usage of bool full(void)
hls::stream<int> my_stream;
int src_var = 42;
bool stream_full;
stream_full = my_stream.full();
Non-Blocking Read
bool read_nb(T & rdata)
This method attempts to read a value from the stream, returning true if successful. Otherwise, false is returned and the queue is unaffected.
// Usage of bool read_nb(T & rdata)
hls::stream<int> my_stream;
int dst_var;
if (my_stream.read_nb(dst_var)) {
// Perform standard operations
...
} else {
// Read did not occur
return;
}
Emptiness Test
bool empty(void)
Returns true if the hls::stream<> is empty.
// Usage of bool empty(void)
hls::stream<int> my_stream;
int dst_var;
bool stream_empty;
stream_empty = my_stream.empty();
The following example shows how a combination of non-blocking accesses and full/empty tests can provide error handling functionality when the RTL FIFOs are full or empty:
#include "hls_stream.h"
using namespace hls;
typedef struct {
short data;
bool valid;
bool invert;
} input_interface;
bool invert(stream<input_interface>& in_data_1,
stream<input_interface>& in_data_2,
stream<short>& output
) {
input_interface in;
// No write has been attempted yet, so report not-full unless a write below fails
bool full_n = true;
// Read an input value or return
if (!in_data_1.read_nb(in))
if (!in_data_2.read_nb(in))
return false;
// If valid data is written, return not-full (full_n) as true
if (in.valid) {
if (in.invert)
full_n = output.write_nb(~in.data);
else
full_n = output.write_nb(in.data);
}
return full_n;
}
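Because C simulation treats streams as unbounded, the full condition cannot be exercised there. The following is a minimal sketch, in plain C++, of a bounded FIFO that mirrors the return-value contract described above for write_nb, read_nb, full, and empty. The class name BoundedFifo and its constructor argument are hypothetical; this is a software stand-in for reasoning about the behavior, not the hls::stream<> implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical software model of a FIFO with a fixed depth. It is NOT the
// hls::stream<> implementation; it only mirrors the return-value contract
// of the non-blocking methods described in this section.
template <typename T>
class BoundedFifo {
public:
    explicit BoundedFifo(std::size_t depth) : depth_(depth) { assert(depth > 0); }

    bool full() const { return data_.size() == depth_; }
    bool empty() const { return data_.empty(); }

    // Returns false and leaves the FIFO unchanged when it is full.
    bool write_nb(const T& wdata) {
        if (full()) return false;
        data_.push_back(wdata);
        return true;
    }

    // Returns false and leaves rdata and the FIFO unchanged when it is empty.
    bool read_nb(T& rdata) {
        if (empty()) return false;
        rdata = data_.front();
        data_.pop_front();
        return true;
    }

private:
    std::size_t depth_;      // maximum number of stored elements
    std::deque<T> data_;     // elements in arrival order
};
```

With a depth of 2, a third write_nb call returns false until a read frees a slot, which is the situation the error-handling example above is written to handle.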
Controlling the RTL FIFO Depth
For most designs using streaming data, the default RTL FIFO depth of 2 is sufficient. Streaming data is generally processed one sample at a time.
For multi-rate designs in which the implementation requires a FIFO with a depth greater than 2, you must determine (and set using the STREAM directive) the depth necessary for the RTL simulation to complete. If the FIFO depth is insufficient, RTL co-simulation stalls.
Because stream objects cannot be viewed in the GUI directives pane, the STREAM directive cannot be applied directly in that pane.
Right-click the function in which an hls::stream<> object is declared (or is used, or exists in the argument list) to:
 Select the STREAM directive.
 Populate the variable field manually with the name of the stream variable.
Alternatively, you can:
 Specify the STREAM directive manually in the directives.tcl file, or
 Add it as a pragma in the source.
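As a rough aid for choosing a starting depth in a multi-rate design, the peak FIFO occupancy can be estimated with a simple cycle-by-cycle software model. The sketch below assumes an idealized producer that writes one sample every write_period cycles and a consumer that reads one sample every read_period cycles; the function name and parameters are hypothetical, and the result is only an estimate that still has to be confirmed in RTL co-simulation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: idealized model of one producer and one consumer
// sharing a FIFO. Returns the largest number of samples that were waiting
// in the FIFO at any point while num_samples samples were transferred.
std::size_t peak_fifo_occupancy(unsigned write_period,
                                unsigned read_period,
                                unsigned num_samples) {
    assert(write_period > 0 && read_period > 0);
    std::size_t occupancy = 0;
    std::size_t peak = 0;
    unsigned written = 0;
    unsigned consumed = 0;
    for (unsigned long cycle = 0; consumed < num_samples; ++cycle) {
        // Producer writes on its scheduled cycles until all samples are sent.
        if (written < num_samples && cycle % write_period == 0) {
            ++occupancy;
            ++written;
        }
        peak = std::max(peak, occupancy);
        // Consumer reads on its scheduled cycles when data is available.
        if (occupancy > 0 && cycle % read_period == 0) {
            --occupancy;
            ++consumed;
        }
    }
    return peak;
}
```

In this model, a producer that is twice as fast as the consumer needs a depth that grows with the burst length, whereas matched rates never hold more than one sample.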
C/RTL Co-Simulation Support
The Vitis HLS C/RTL co-simulation feature does not support structures or classes containing hls::stream<> members in the top-level interface. Vitis HLS supports these structures or classes for synthesis.
typedef struct {
hls::stream<uint8_t> a;
hls::stream<uint16_t> b;
} strm_strct_t;
void dut_top(strm_strct_t indata, strm_strct_t outdata) { … }
These restrictions apply to both top-level function arguments and globally declared objects. If structs of streams are used for synthesis, the design must be verified using an external RTL simulator and a user-created HDL test bench. There are no such restrictions on hls::stream<> objects with strictly internal linkage.
HLS IP Libraries
Vitis HLS provides C++ libraries to implement a number of Xilinx IP blocks. These libraries allow the following Xilinx IP blocks to be directly inferred from the C++ source code, ensuring a high-quality implementation in the FPGA.
Library Header File  Description 

hls_fft.h  Allows the Xilinx LogiCORE IP FFT to be simulated in C and implemented using the Xilinx LogiCORE block. 
hls_fir.h  Allows the Xilinx LogiCORE IP FIR to be simulated in C and implemented using the Xilinx LogiCORE block. 
hls_dds.h  Allows the Xilinx LogiCORE IP DDS to be simulated in C and implemented using the Xilinx LogiCORE block. 
ap_shift_reg.h  Provides a C++ class to implement a shift register which is implemented directly using a Xilinx SRL primitive. 
FFT IP Library
The Xilinx FFT IP block can be called within a C++ design using the library hls_fft.h. This section explains how the FFT can be configured in your C++ code.
To use the FFT in your C++ code:
 Include the hls_fft.h library in the code
 Set the default parameters using the predefined struct hls::ip_fft::params_t
 Define the run time configuration
 Call the FFT function
 Optionally, check the run time status
The following code examples provide a summary of how each of these steps is performed. Each step is discussed in more detail below.
First, include the FFT library in the source code. This header file resides in the include directory in the Vitis HLS installation area which is automatically searched when Vitis HLS executes.
#include "hls_fft.h"
Define the static parameters of the FFT. These include the input width, number of channels, and type of architecture, which do not change dynamically. The FFT library includes a parameterization struct, hls::ip_fft::params_t, which can be used to initialize all static parameters with default values.
In this example, the default values for output ordering and the widths of the configuration and status ports are overridden using a user-defined struct param1 based on the predefined struct.
struct param1 : hls::ip_fft::params_t {
static const unsigned ordering_opt = hls::ip_fft::natural_order;
static const unsigned config_width = FFT_CONFIG_WIDTH;
static const unsigned status_width = FFT_STATUS_WIDTH;
};
Define types and variables for both the run time configuration and run time status. These values can be dynamic and are therefore defined as variables in the C code which can change and are accessed through APIs.
typedef hls::ip_fft::config_t<param1> config_t;
typedef hls::ip_fft::status_t<param1> status_t;
config_t fft_config1;
status_t fft_status1;
Next, set the run time configuration. This example sets the direction of the FFT (forward or inverse) based on the value of the variable direction and also sets the value of the scaling schedule.
fft_config1.setDir(direction);
fft_config1.setSch(0x2AB);
Call the FFT function using the HLS namespace with the defined static configuration (param1 in this example). The function parameters are, in order, input data, output data, output status, and input configuration.
hls::fft<param1> (xn1, xk1, &fft_status1, &fft_config1);
Finally, check the output status. This example checks the overflow flag and stores the result in the variable ovflo.
*ovflo = fft_status1.getOvflo();
Design examples using the FFT C library are provided in the Vitis HLS examples.
FFT Static Parameters
The static parameters of the FFT define how the FFT is configured and specify fixed attributes such as the size of the FFT, whether the size can be changed dynamically, and whether the implementation is pipelined or radix_4_burst_io.
The hls_fft.h header file defines a struct hls::ip_fft::params_t which can be used to set default values for the static parameters. If the default values are to be used, the parameterization struct can be used directly with the FFT function.
hls::fft<hls::ip_fft::params_t >
(xn1, xk1, &fft_status1, &fft_config1);
A more typical use is to change some of the parameters to non-default values. This is performed by creating a new user-defined parameterization struct based on the default parameterization struct and changing some of the default values.
In the following example, a new user struct my_fft_config is defined with a new value for the output ordering (changed to natural_order). All other static parameters to the FFT use the default values.
struct my_fft_config : hls::ip_fft::params_t {
static const unsigned ordering_opt = hls::ip_fft::natural_order;
};
hls::fft<my_fft_config >
(xn1, xk1, &fft_status1, &fft_config1);
The values used for the parameterization struct hls::ip_fft::params_t are explained in FFT Struct Parameters. The default values for the parameters and a list of possible values are provided in FFT Struct Parameter Values.
FFT Struct Parameters
Parameter  Description 

input_width  Data input port width. 
output_width  Data output port width. 
status_width  Output status port width. 
config_width  Input configuration port width. 
max_nfft  The size of the FFT data set is specified as 1 << max_nfft. 
has_nfft  Determines if the size of the FFT can be run time configurable. 
channels  Number of channels. 
arch_opt  The implementation architecture. 
phase_factor_width  Configure the internal phase factor precision. 
ordering_opt  The output ordering mode. 
ovflo  Enable overflow mode. 
scaling_opt  Define the scaling options. 
rounding_opt  Define the rounding modes. 
mem_data  Specify using block or distributed RAM for data memory. 
mem_phase_factors  Specify using block or distributed RAM for phase factors memory. 
mem_reorder  Specify using block or distributed RAM for output reorder memory. 
stages_block_ram  Defines the number of block RAM stages used in the implementation. 
mem_hybrid  When block RAMs are specified for data, phase factor, or reorder buffer, mem_hybrid specifies whether or not to use a hybrid of block and distributed RAMs to reduce block RAM count in certain configurations. 
complex_mult_type  Defines the types of multiplier to use for complex multiplications. 
butterfly_type  Defines the implementation used for the FFT butterfly. 
When specifying parameter values that are not integer or boolean, the HLS FFT namespace should be used.
For example, the possible values for parameter butterfly_type in the following table are use_luts and use_xtremedsp_slices. The values used in the C program should be butterfly_type = hls::ip_fft::use_luts and butterfly_type = hls::ip_fft::use_xtremedsp_slices.
FFT Struct Parameter Values
The following table covers all features and functionality of the FFT IP. Features and functionality not described in this table are not supported in the Vitis HLS implementation.
Parameter  C Type  Default Value  Valid Values 

input_width  unsigned  16  8-34 
output_width  unsigned  16  input_width to (input_width + max_nfft + 1) 
status_width  unsigned  8  Depends on FFT configuration 
config_width  unsigned  16  Depends on FFT configuration 
max_nfft  unsigned  10  3-16 
has_nfft  bool  false  True, False 
channels  unsigned  1  1-12 
arch_opt  unsigned  pipelined_streaming_io  automatically_select pipelined_streaming_io radix_4_burst_io radix_2_burst_io radix_2_lite_burst_io 
phase_factor_width  unsigned  16  8-34 
ordering_opt  unsigned  bit_reversed_order  bit_reversed_order natural_order 
ovflo  bool  true  false true 
scaling_opt  unsigned  scaled  scaled unscaled block_floating_point 
rounding_opt  unsigned  truncation  truncation convergent_rounding 
mem_data  unsigned  block_ram  block_ram distributed_ram 
mem_phase_factors  unsigned  block_ram  block_ram distributed_ram 
mem_reorder  unsigned  block_ram  block_ram distributed_ram 
stages_block_ram  unsigned  (max_nfft < 10) ? 0 : (max_nfft - 9)  0-11 
mem_hybrid  bool  false  false true 
complex_mult_type  unsigned  use_mults_resources  use_luts use_mults_resources use_mults_performance 
butterfly_type  unsigned  use_luts  use_luts use_xtremedsp_slices 
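Several entries in the table depend on other parameters. The following sketch, in plain C++, expresses three of those relationships as small helper functions so that a chosen configuration can be sanity-checked before synthesis: the FFT size implied by max_nfft, the default for stages_block_ram, and the valid range for output_width. The function names are hypothetical and are not part of hls_fft.h.

```cpp
#include <cassert>

// Number of points in the FFT data set: 1 << max_nfft.
constexpr unsigned long fft_points(unsigned max_nfft) {
    return 1UL << max_nfft;
}

// Default value of stages_block_ram as listed in the table:
// (max_nfft < 10) ? 0 : (max_nfft - 9).
constexpr unsigned default_stages_block_ram(unsigned max_nfft) {
    return (max_nfft < 10) ? 0 : (max_nfft - 9);
}

// True when output_width lies in the listed range
// [input_width, input_width + max_nfft + 1].
constexpr bool output_width_in_range(unsigned input_width,
                                     unsigned output_width,
                                     unsigned max_nfft) {
    return output_width >= input_width &&
           output_width <= input_width + max_nfft + 1;
}

// Compile-time check against the default configuration (max_nfft = 10).
static_assert(fft_points(10) == 1024, "default FFT size is 1024 points");
```

With the default max_nfft of 10 and input_width of 16, these helpers accept an output_width from 16 to 27.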
FFT Runtime Configuration and Status
The FFT supports runtime configuration and runtime status monitoring through the configuration and status ports. These ports are defined as arguments to the FFT function, shown here as variables fft_status1 and fft_config1:
hls::fft<param1> (xn1, xk1, &fft_status1, &fft_config1);
The runtime configuration and status can be accessed using the predefined structs from the FFT C library:
 hls::ip_fft::config_t<param1>
 hls::ip_fft::status_t<param1>
The runtime configuration struct allows the following actions to be performed in the C code:
 Set the FFT length, if runtime configuration is enabled
 Set the FFT direction as forward or inverse
 Set the scaling schedule
The FFT length can be set as follows:
typedef hls::ip_fft::config_t<param1> config_t;
config_t fft_config1;
// Set FFT length to 512 => log2(512) => 9
fft_config1.setNfft(9);
Note: The length specified during run time cannot exceed the size defined by max_nfft in the static configuration.
The FFT direction can be set as follows:
typedef hls::ip_fft::config_t<param1> config_t;
config_t fft_config1;
// Forward FFT
fft_config1.setDir(1);
// Inverse FFT
fft_config1.setDir(0);
The FFT scaling schedule can be set as follows:
typedef hls::ip_fft::config_t<param1> config_t;
config_t fft_config1;
fft_config1.setSch(0x2AB);
The output status port can be accessed using the predefined struct to determine:
 If any overflow occurred during the FFT
 The value of the block exponent
The FFT overflow flag can be checked as follows:
typedef hls::ip_fft::status_t<param1> status_t;
status_t fft_status1;
// Check the overflow flag
bool ovflo = fft_status1.getOvflo();
And the block exponent value can be obtained using:
typedef hls::ip_fft::status_t<param1> status_t;
status_t fft_status1;
// Obtain the block exponent
unsigned int blk_exp = fft_status1.getBlkExp();
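The values passed to setNfft and setSch in the examples above are encoded numbers rather than direct quantities. The following plain C++ sketch shows one way to derive them. The helper names are hypothetical and are not part of hls_fft.h. The scaling schedule layout (two bits per entry, first entry in the least significant bits) is an assumption based on the FFT LogiCORE documentation and should be checked against the product guide for the selected architecture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns log2(length) for use as the run time FFT size, or -1 when length
// is not a power of two or exceeds the static limit 1 << max_nfft.
int nfft_for_length(unsigned long length, unsigned max_nfft) {
    if (length == 0 || (length & (length - 1)) != 0) return -1;
    int nfft = 0;
    while ((1UL << nfft) < length) ++nfft;
    return (static_cast<unsigned>(nfft) <= max_nfft) ? nfft : -1;
}

// Packs per-entry scaling shifts (each 0 to 3) into a single word.
// ASSUMPTION: two bits per entry, first entry in the least significant bits.
std::uint32_t pack_scaling_schedule(const std::vector<unsigned>& shifts) {
    assert(shifts.size() <= 16);
    std::uint32_t sch = 0;
    for (std::size_t i = 0; i < shifts.size(); ++i) {
        assert(shifts[i] <= 3);
        sch |= static_cast<std::uint32_t>(shifts[i]) << (2 * i);
    }
    return sch;
}
```

Under the stated layout assumption, the shifts {3, 2, 2, 2, 2} pack to 0x2AB, the value used in the setSch example, and a length of 512 maps to the value 9 used in the setNfft example.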
Using the FFT Function
The FFT function is defined in the HLS namespace and can be called as follows:
hls::fft<STATIC_PARAM> (
INPUT_DATA_ARRAY,
OUTPUT_DATA_ARRAY,
OUTPUT_STATUS,
INPUT_RUN_TIME_CONFIGURATION);
The STATIC_PARAM is the static parameterization struct that defines the static parameters for the FFT.
Both the input and output data are supplied to the function as arrays (INPUT_DATA_ARRAY and OUTPUT_DATA_ARRAY). In the final implementation, the ports on the FFT RTL block are implemented as AXI4-Stream ports. Xilinx recommends always using the FFT function in a region using dataflow optimization (set_directive_dataflow), because this ensures the arrays are implemented as streaming arrays. An alternative is to specify both arrays as streaming using the set_directive_stream command.
The data types for the arrays can be float or ap_fixed.
#include <complex>
typedef float data_t;
std::complex<data_t> xn[FFT_LENGTH];
std::complex<data_t> xk[FFT_LENGTH];
To use fixed-point data types, the Vitis HLS arbitrary precision type ap_fixed should be used.
#include "ap_fixed.h"
typedef ap_fixed<FFT_INPUT_WIDTH,1> data_in_t;
typedef ap_fixed<FFT_OUTPUT_WIDTH,FFT_OUTPUT_WIDTH-FFT_INPUT_WIDTH+1> data_out_t;
#include <complex>
typedef hls::x_complex<data_in_t> cmpxData;
typedef hls::x_complex<data_out_t> cmpxDataOut;
In both cases, the FFT should be parameterized with the same data sizes. In the case of floating-point data, the data widths are always 32-bit and any other specified size is considered invalid.
The multichannel functionality of the FFT can be used by using two-dimensional arrays for the input and output data. In this case, the array data should be configured with the first dimension representing each channel and the second dimension representing the FFT data.
typedef float data_t;
static std::complex<data_t> xn[CHANNEL][FFT_LENGTH];
static std::complex<data_t> xk[CHANNEL][FFT_LENGTH];
The FFT core consumes and produces data as interleaved channels (for example, ch0-data0, ch1-data0, ch2-data0, and so on, followed by ch0-data1, ch1-data1, ch2-data1, and so on). Therefore, to stream the input or output arrays of the FFT using the same sequential order that the data was read or written, you must fill or empty the two-dimensional arrays for multiple channels by iterating through the channel index first, as shown in the following example:
cmpxData in_fft[FFT_CHANNELS][FFT_LENGTH];
cmpxData out_fft[FFT_CHANNELS][FFT_LENGTH];
// Write to FFT Input Array
for (unsigned i = 0; i < FFT_LENGTH; i++) {
for (unsigned j = 0; j < FFT_CHANNELS; ++j) {
in_fft[j][i] = in.read().data;
}
}
// Read from FFT Output Array
for (unsigned i = 0; i < FFT_LENGTH; i++) {
for (unsigned j = 0; j < FFT_CHANNELS; ++j) {
out.data = out_fft[j][i];
}
}
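The loop order above can be checked outside of HLS with ordinary containers. The following sketch assumes integer samples and a hypothetical helper named deinterleave; it fills a [channels][length] array from a flat, channel-interleaved sequence using the same channel-index-first iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fills a [channels][length] array from a flat sequence
// in which samples arrive channel-interleaved (ch0-data0, ch1-data0, ...,
// ch0-data1, ch1-data1, ...).
std::vector<std::vector<int>> deinterleave(const std::vector<int>& flat,
                                           std::size_t channels,
                                           std::size_t length) {
    assert(flat.size() == channels * length);
    std::vector<std::vector<int>> out(channels, std::vector<int>(length));
    std::size_t k = 0;
    for (std::size_t i = 0; i < length; ++i) {        // sample index (outer)
        for (std::size_t j = 0; j < channels; ++j) {  // channel index (inner)
            out[j][i] = flat[k++];
        }
    }
    return out;
}
```

For two channels, the flat sequence {0, 100, 1, 101, 2, 102} yields channel 0 as {0, 1, 2} and channel 1 as {100, 101, 102}.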
Design examples using the FFT C library are provided in the Vitis HLS examples.
FIR Filter IP Library
The Xilinx FIR IP block can be called within a C++ design using the library hls_fir.h. This section explains how the FIR can be configured in your C++ code.
To use the FIR in your C++ code:
 Include the hls_fir.h library in the code.
 Set the static parameters using the predefined struct hls::ip_fir::params_t.
 Call the FIR function.
 Optionally, define a run time input configuration to modify some parameters dynamically.
The following code examples provide a summary of how each of these steps is performed. Each step is discussed in more detail below.
First, include the FIR library in the source code. This header file resides in the include directory in the Vitis HLS installation area. This directory is automatically searched when Vitis HLS executes. There is no need to specify the path to this directory if compiling inside Vitis HLS.
#include "hls_fir.h"
Define the static parameters of the FIR. These include static attributes such as the input width, the coefficients, and the filter rate (single, decimation, hilbert). The FIR library includes a parameterization struct hls::ip_fir::params_t which can be used to initialize all static parameters with default values.
In this example, the coefficients are defined as residing in array coeff_vec and the default values for the number of coefficients, the input width, and the quantization mode are overridden using a user-defined struct myconfig based on the predefined struct.
struct myconfig : hls::ip_fir::params_t {
static const double coeff_vec[sg_fir_srrc_coeffs_len];
static const unsigned num_coeffs = sg_fir_srrc_coeffs_len;
static const unsigned input_width = INPUT_WIDTH;
static const unsigned quantization = hls::ip_fir::quantize_only;
};
Create an instance of the FIR function using the HLS namespace with the defined static parameters (myconfig in this example) and then call the function with the run method to execute the function. The function arguments are, in order, input data and output data.
static hls::FIR<myconfig> fir1;
fir1.run(fir_in, fir_out);
Optionally, a run time input configuration can be used. In some modes of the FIR, the data on this input determines how the coefficients are used during interleaved channels or when coefficient reloading is required. This configuration can be dynamic and is therefore defined as a variable. For a complete description of which modes require this input configuration, refer to the FIR Compiler LogiCORE IP Product Guide (PG149).
When the run time input configuration is used, the FIR function is called with three arguments: input data, output data and input configuration.
// Define the configuration type
typedef ap_uint<8> config_t;
// Define the configuration variable
config_t fir_config = 8;
// Use the configuration in the FIR
static hls::FIR<myconfig> fir1;
fir1.run(fir_in, fir_out, &fir_config);
Design examples using the FIR C library are provided in the Vitis HLS examples.
FIR Static Parameters
The static parameters of the FIR define how the FIR IP is parameterized and specify non-dynamic items such as the input and output widths, the number of fractional bits, the coefficient values, and the interpolation and decimation rates. Most of these parameters have default values; the coefficients have none.
The hls_fir.h header file defines a struct hls::ip_fir::params_t that can be used to set the default values for most of the static parameters.
In this example, a new user struct myconfig is defined with a new value for the coefficients. The coefficients are specified as residing in array coeff_vec. All other parameters to the FIR use the default values.
struct myconfig : hls::ip_fir::params_t {
static const double coeff_vec[sg_fir_srrc_coeffs_len];
};
static hls::FIR<myconfig> fir1;
fir1.run(fir_in, fir_out);
FIR Struct Parameters describes the parameters used for the parameterization struct hls::ip_fir::params_t. FIR Struct Parameter Values provides the default values for the parameters and a list of possible values.
FIR Struct Parameters
Parameter  Description 

input_width  Data input port width 
input_fractional_bits  Number of fractional bits on the input port 
output_width  Data output port width 
output_fractional_bits  Number of fractional bits on the output port 
coeff_width  Bitwidth of the coefficients 
coeff_fractional_bits  Number of fractional bits in the coefficients 
num_coeffs  Number of coefficients 
coeff_sets  Number of coefficient sets 
input_length  Number of samples in the input data 
output_length  Number of samples in the output data 
num_channels  Specify the number of channels of data to process 
total_num_coeff  Total number of coefficients 
coeff_vec[total_num_coeff]  The coefficient array 
filter_type  The type implementation used for the filter 
rate_change  Specifies integer or fractional rate changes 
interp_rate  The interpolation rate 
decim_rate  The decimation rate 
zero_pack_factor  Number of zero coefficients used in interpolation 
rate_specification  Specify the rate as frequency or period 
hardware_oversampling_rate  Specify the rate of oversampling 
sample_period  The hardware oversample period 
sample_frequency  The hardware oversample frequency 
quantization  The quantization method to be used 
best_precision  Enable or disable the best precision 
coeff_structure  The type of coefficient structure to be used 
output_rounding_mode  Type of rounding used on the output 
filter_arch  Selects a systolic or transposed architecture 
optimization_goal  Specify a speed or area goal for optimization 
inter_column_pipe_length  The pipeline length required between DSP columns 
column_config  Specifies the number of DSP module columns 
config_method  Specifies how the DSP module columns are configured 
coeff_padding  Number of zero padding added to the front of the filter 
When specifying parameter values that are not integer or boolean, the HLS FIR namespace should be used.
For example, the possible values for rate_change are shown in the following table to be integer and fixed_fractional. The values used in the C program should be rate_change = hls::ip_fir::integer and rate_change = hls::ip_fir::fixed_fractional.
FIR Struct Parameter Values
The following table covers all features and functionality of the FIR IP. Features and functionality not described in this table are not supported in the Vitis HLS implementation.
Parameter  C Type  Default Value  Valid Values 

input_width  unsigned  16  No limitation 
input_fractional_bits  unsigned  0  Limited by size of input_width 
output_width  unsigned  24  No limitation 
output_fractional_bits  unsigned  0  Limited by size of output_width 
coeff_width  unsigned  16  No limitation 
coeff_fractional_bits  unsigned  0  Limited by size of coeff_width 
num_coeffs  unsigned  21  Full 
coeff_sets  unsigned  1  1-1024 
input_length  unsigned  21  No limitation 
output_length  unsigned  21  No limitation 
num_channels  unsigned  1  1-1024 
total_num_coeff  unsigned  21  num_coeffs * coeff_sets 
coeff_vec[total_num_coeff]  double array  None  Not applicable 
filter_type  unsigned  single_rate  single_rate, interpolation, decimation, hilbert_filter, interpolated 
rate_change  unsigned  integer  integer, fixed_fractional 
interp_rate  unsigned  1  1-1024 
decim_rate  unsigned  1  1-1024 
zero_pack_factor  unsigned  1  1-8 
rate_specification  unsigned  period  frequency, period 
hardware_oversampling_rate  unsigned  1  No Limitation 
sample_period  unsigned  1  No Limitation 
sample_frequency  double  0.001  No Limitation 
quantization  unsigned  integer_coefficients  integer_coefficients, quantize_only, maximize_dynamic_range 
best_precision  bool  false  false true 
coeff_structure  unsigned  non_symmetric  inferred, non_symmetric, symmetric, negative_symmetric, half_band, hilbert 
output_rounding_mode  unsigned  full_precision  full_precision, truncate_lsbs, non_symmetric_rounding_down, non_symmetric_rounding_up, symmetric_rounding_to_zero, symmetric_rounding_to_infinity, convergent_rounding_to_even, convergent_rounding_to_odd 
filter_arch  unsigned  systolic_multiply_accumulate  systolic_multiply_accumulate, transpose_multiply_accumulate 
optimization_goal  unsigned  area  area, speed 
inter_column_pipe_length  unsigned  4  1-16 
column_config  unsigned  1  Limited by number of DSP48s used 
config_method  unsigned  single  single, by_channel 
coeff_padding  bool  false  false true 
Using the FIR Function
The FIR function is defined in the HLS namespace and can be called as follows:
// Create an instance of the FIR
static hls::FIR<STATIC_PARAM> fir1;
// Execute the FIR instance fir1
fir1.run(INPUT_DATA_ARRAY, OUTPUT_DATA_ARRAY);
The STATIC_PARAM is the static parameterization struct that defines most static parameters for the FIR.
Both the input and output data are supplied to the function as arrays (INPUT_DATA_ARRAY and OUTPUT_DATA_ARRAY). In the final implementation, these ports on the FIR IP are implemented as AXI4-Stream ports. Xilinx recommends always using the FIR function in a region using the dataflow optimization (set_directive_dataflow), because this ensures the arrays are implemented as streaming arrays. An alternative is to specify both arrays as streaming using the set_directive_stream command.
The multichannel functionality of the FIR is supported through interleaving the data in a single input and single output array.
 The size of the input array should be large enough to accommodate all samples: num_channels * input_length.
 The output array size should be specified to contain all output samples: num_channels * output_length.
The following code example demonstrates, for two channels, how the data is interleaved. In this example, the top-level function has two channels of input data (din_i, din_q) and two channels of output data (dout_i, dout_q). Two functions, at the front end (fe) and back end (be), are used to correctly order the data in the FIR input array and extract it from the FIR output array.
void dummy_fe(din_t din_i[LENGTH], din_t din_q[LENGTH], din_t out[FIR_LENGTH]) {
for (unsigned i = 0; i < LENGTH; ++i) {
out[2*i] = din_i[i];
out[2*i + 1] = din_q[i];
}
}
void dummy_be(dout_t in[FIR_LENGTH], dout_t dout_i[LENGTH], dout_t dout_q[LENGTH]) {
for(unsigned i = 0; i < LENGTH; ++i) {
dout_i[i] = in[2*i];
dout_q[i] = in[2*i+1];
}
}
void fir_top(din_t din_i[LENGTH], din_t din_q[LENGTH],
dout_t dout_i[LENGTH], dout_t dout_q[LENGTH]) {
din_t fir_in[FIR_LENGTH];
dout_t fir_out[FIR_LENGTH];
static hls::FIR<myconfig> fir1;
dummy_fe(din_i, din_q, fir_in);
fir1.run(fir_in, fir_out);
dummy_be(fir_out, dout_i, dout_q);
}
Optional FIR Runtime Configuration
In some modes of operation, the FIR requires an additional input to configure how the coefficients are used. For a complete description of which modes require this input configuration, refer to the FIR Compiler LogiCORE IP Product Guide (PG149).
This input configuration can be performed in the C code using a standard ap_int.h 8-bit data type. In this example, the header file fir_top.h specifies the use of the FIR and ap_fixed libraries, defines a number of the design parameter values, and then defines some fixed-point types based on these:
#include "ap_fixed.h"
#include "hls_fir.h"
const unsigned FIR_LENGTH = 21;
const unsigned INPUT_WIDTH = 16;
const unsigned INPUT_FRACTIONAL_BITS = 0;
const unsigned OUTPUT_WIDTH = 24;
const unsigned OUTPUT_FRACTIONAL_BITS = 0;
const unsigned COEFF_WIDTH = 16;
const unsigned COEFF_FRACTIONAL_BITS = 0;
const unsigned COEFF_NUM = 7;
const unsigned COEFF_SETS = 3;
const unsigned INPUT_LENGTH = FIR_LENGTH;
const unsigned OUTPUT_LENGTH = FIR_LENGTH;
const unsigned CHAN_NUM = 1;
typedef ap_fixed<INPUT_WIDTH, INPUT_WIDTH - INPUT_FRACTIONAL_BITS> s_data_t;
typedef ap_fixed<OUTPUT_WIDTH, OUTPUT_WIDTH - OUTPUT_FRACTIONAL_BITS> m_data_t;
typedef ap_uint<8> config_t;
In the top-level code, the information in the header file is included, the static parameterization struct is created using the same constant values used to specify the bit-widths, ensuring the C code and FIR configuration match, and the coefficients are specified. At the top level, an input configuration, defined in the header file as 8-bit data, is passed into the FIR.
#include "fir_top.h"
struct param1 : hls::ip_fir::params_t {
static const double coeff_vec[total_num_coeff];
static const unsigned input_length = INPUT_LENGTH;
static const unsigned output_length = OUTPUT_LENGTH;
static const unsigned num_coeffs = COEFF_NUM;
static const unsigned coeff_sets = COEFF_SETS;
};
const double param1::coeff_vec[total_num_coeff] =
{6,0,4,3,5,6,6,13,7,44,64,44,7,13,6,6,5,3,4,0,6};
void dummy_fe(s_data_t in[INPUT_LENGTH], s_data_t out[INPUT_LENGTH],
config_t* config_in, config_t* config_out)
{
*config_out = *config_in;
for(unsigned i = 0; i < INPUT_LENGTH; ++i)
out[i] = in[i];
}
void dummy_be(m_data_t in[OUTPUT_LENGTH], m_data_t out[OUTPUT_LENGTH])
{
for(unsigned i = 0; i < OUTPUT_LENGTH; ++i)
out[i] = in[i];
}
// DUT
void fir_top(s_data_t in[INPUT_LENGTH],
m_data_t out[OUTPUT_LENGTH],
config_t* config)
{
s_data_t fir_in[INPUT_LENGTH];
m_data_t fir_out[OUTPUT_LENGTH];
config_t fir_config;
// Create struct for config
static hls::FIR<param1> fir1;
//==================================================
// Dataflow process
dummy_fe(in, fir_in, config, &fir_config);
fir1.run(fir_in, fir_out, &fir_config);
dummy_be(fir_out, out);
//==================================================
}
Design examples using the FIR C library are provided in the Vitis HLS examples.
DDS IP Library
You can use the Xilinx Direct Digital Synthesizer (DDS) IP block within a C++ design using the hls_dds.h library. This section explains how to configure DDS IP in your C++ code.
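Important: The C IP implementation of the DDS IP core supports the fixed mode for the Phase_Increment and Phase_Offset parameters and supports the none mode for Phase_Offset, but it does not support programmable and streaming modes for these parameters.
To use the DDS in the C++ code: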
 Include the hls_dds.h library in the code.
 Set the default parameters using the predefined struct hls::ip_dds::params_t.
 Call the DDS function.
First, include the DDS library in the source code. This header file resides in the include directory in the Vitis HLS installation area, which is automatically searched when Vitis HLS executes.
#include "hls_dds.h"
Define the static parameters of the DDS. For example, define the phase width, clock rate, and phase and increment offsets. The DDS C library includes a parameterization struct hls::ip_dds::params_t, which is used to initialize all static parameters with default values. By redefining any of the values in this struct, you can customize the implementation.
The following example shows how to override the default values for the phase width, clock rate, phase increment, and phase offset using a user-defined struct param1, which is based on the existing predefined struct hls::ip_dds::params_t:
struct param1 : hls::ip_dds::params_t {
static const unsigned Phase_Width = PHASEWIDTH;
static const double DDS_Clock_Rate = 25.0;
static const double PINC[16];
static const double POFF[16];
};
Create an instance of the DDS function using the HLS namespace with the defined static parameters (for example, param1). Then, call the function with the run method to execute the function. The data and phase function arguments are shown in order:
static hls::DDS<param1> dds1;
dds1.run(data_channel, phase_channel);
Design examples that use the DDS C library are provided in the Vitis HLS examples.
DDS Static Parameters
The static parameters of the DDS define how to configure the DDS, such as the
clock rate, phase interval, and modes. The hls_dds.h header file defines an hls::ip_dds::params_t
struct, which sets the default values for the
static parameters. To use the default values, you can use the parameterization
struct directly with the DDS function.
static hls::DDS< hls::ip_dds::params_t > dds1;
dds1.run(data_channel, phase_channel);
The following table describes the parameters for the hls::ip_dds::params_t
parameterization struct.
Parameter  Description

DDS_Clock_Rate  Specifies the clock rate for the DDS output.
Channels  Specifies the number of channels. The DDS and phase generator can support up to 16 channels. The channels are time-multiplexed, which reduces the effective clock frequency per channel.
Mode_of_Operation  Specifies one of the following operation modes: standard mode, for use when the accumulated phase can be truncated before it is used to access the SIN/COS LUT, or rasterized mode, for use when the desired frequencies and system clock are related by a rational fraction.
Modulus  Describes the relationship between the system clock frequency and the desired frequencies. Use this parameter in rasterized mode only.
Spurious_Free_Dynamic_Range  Specifies the targeted purity of the tone produced by the DDS.
Frequency_Resolution  Specifies the minimum frequency resolution in Hz and determines the Phase Width used by the phase accumulator, including associated phase increment (PINC) and phase offset (POFF) values.
Noise_Shaping  Controls whether to use phase truncation, dithering, or Taylor series correction.
Phase_Width  Sets the width of the PHASE_OUT field, the Phase fields, the phase accumulator, and the associated phase increment and offset registers. For rasterized mode, the phase width is fixed as the number of bits required to describe the valid input range.
Output_Width  Sets the width of the SINE and COSINE fields within m_axis_data_tdata. The SFDR provided by this parameter depends on the selected Noise Shaping option.
Phase_Increment  Selects the phase increment value.
Phase_Offset  Selects the phase offset value.
Output_Selection  Sets the output selection to SINE, COSINE, or both in the m_axis_data_tdata bus.
Negative_Sine  Negates the SINE field at run time.
Negative_Cosine  Negates the COSINE field at run time.
Amplitude_Mode  Sets the amplitude to full range or unit circle.
Memory_Type  Controls the implementation of the SIN/COS LUT.
Optimization_Goal  Controls whether the implementation decisions target highest speed or lowest resource.
DSP48_Use  Controls the implementation of the phase accumulator and addition stages for phase offset, dither noise addition, or both.
Latency_Configuration  Sets the latency of the core to the optimum value based upon the Optimization Goal.
Latency  Specifies the manual latency value.
Output_Form  Sets the output form to two’s complement or to sign and magnitude. In general, the output of SINE and COSINE is in two’s complement form. However, when quadrant symmetry is used, the output form can be changed to sign and magnitude.
PINC[XIP_DDS_CHANNELS_MAX]  Sets the values for the phase increment for each output channel.
POFF[XIP_DDS_CHANNELS_MAX]  Sets the values for the phase offset for each output channel.
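The roles of Phase_Width, the phase increment, and the phase offset are easier to follow with the textbook phase-accumulator model of a DDS. The following is a rough software sketch, not the implementation used by the DDS IP. It assumes standard mode, a single channel, an integer phase increment, and the usual relationship f_out = f_clk * pinc / 2^phase_width. The names PhaseAccumulator, output_frequency, and lut_address are hypothetical and do not come from hls_dds.h.

```cpp
#include <cassert>
#include <cstdint>

// Textbook phase accumulator that wraps modulo 2^phase_width.
struct PhaseAccumulator {
    unsigned phase_width;  // accumulator width in bits (1 to 32 in this sketch)
    uint64_t pinc;         // phase increment added on every clock
    uint64_t poff;         // fixed phase offset added to the output phase
    uint64_t acc;          // accumulated phase

    PhaseAccumulator(unsigned w, uint64_t inc, uint64_t off)
        : phase_width(w), pinc(inc), poff(off), acc(0) {
        assert(w >= 1 && w <= 32);
    }

    uint64_t mask() const { return (uint64_t(1) << phase_width) - 1; }

    // Return the current output phase, then advance by one clock.
    uint64_t step() {
        uint64_t out = (acc + poff) & mask();
        acc = (acc + pinc) & mask();
        return out;
    }
};

// Output frequency in the same units as f_clk:
// f_out = f_clk * pinc / 2^phase_width.
inline double output_frequency(double f_clk, unsigned phase_width, uint64_t pinc) {
    assert(phase_width >= 1 && phase_width <= 32);
    return f_clk * double(pinc) / double(uint64_t(1) << phase_width);
}

// Standard mode: truncate the phase to the LUT address width before lookup.
inline uint64_t lut_address(uint64_t phase, unsigned phase_width, unsigned addr_bits) {
    assert(addr_bits >= 1 && addr_bits <= phase_width);
    return phase >> (phase_width - addr_bits);
}
```

In this model, a wider accumulator gives a finer frequency step for the same clock, which is why the frequency resolution determines the phase width.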
DDS Struct Parameter Values
The following table shows the possible values for the hls::ip_dds::params_t
parameterization struct parameters.
Parameter  C Type  Default Value  Valid Values 

DDS_Clock_Rate  double  20.0  Any double value 
Channels  unsigned  1  1 to 16 
Mode_of_Operation  unsigned  XIP_DDS_MOO_CONVENTIONAL  XIP_DDS_MOO_CONVENTIONAL truncates the accumulated phase. XIP_DDS_MOO_RASTERIZED selects rasterized mode. 
Modulus  unsigned  200  129 to 256 
Spurious_Free_Dynamic_Range  double  20.0  18.0 to 150.0 
Frequency_Resolution  double  10.0  0.000000001 to 125000000 
Noise_Shaping  unsigned  XIP_DDS_NS_NONE  XIP_DDS_NS_NONE produces phase truncation DDS. XIP_DDS_NS_DITHER uses phase dither to improve SFDR at the expense of increased noise floor. XIP_DDS_NS_TAYLOR interpolates sine/cosine values using the otherwise discarded bits from phase truncation. XIP_DDS_NS_AUTO automatically determines noise shaping. 
Phase_Width  unsigned  16  Must be an integer multiple of 8 
Output_Width  unsigned  16  Must be an integer multiple of 8 
Phase_Increment  unsigned  XIP_DDS_PINCPOFF_FIXED  XIP_DDS_PINCPOFF_FIXED fixes PINC at generation time, and PINC cannot be changed at run time. This is the only value supported. 
Phase_Offset  unsigned  XIP_DDS_PINCPOFF_NONE  XIP_DDS_PINCPOFF_NONE does not generate phase offset. XIP_DDS_PINCPOFF_FIXED fixes POFF at generation time, and POFF cannot be changed at run time. 
Output_Selection  unsigned  XIP_DDS_OUT_SIN_AND_COS  XIP_DDS_OUT_SIN_ONLY produces sine output only. XIP_DDS_OUT_COS_ONLY produces cosine output only. XIP_DDS_OUT_SIN_AND_COS produces both sine and cosine output. 
Negative_Sine  unsigned  XIP_DDS_ABSENT  XIP_DDS_ABSENT produces standard sine wave. XIP_DDS_PRESENT negates sine wave. 
Negative_Cosine  bool  XIP_DDS_ABSENT  XIP_DDS_ABSENT produces standard cosine wave. XIP_DDS_PRESENT negates cosine wave. 
Amplitude_Mode  unsigned  XIP_DDS_FULL_RANGE  XIP_DDS_FULL_RANGE normalizes amplitude to the output width with the binary point in the first place. For example, an 8-bit output has a binary amplitude of 100000000 - 10, giving values between 01111110 and 11111110, which correspond to just less than 1 and just more than -1 respectively. XIP_DDS_UNIT_CIRCLE normalizes amplitude to half full range, that is, values range from 01000 .. (+0.5) to 110000 .. (-0.5). 
Memory_Type  unsigned  XIP_DDS_MEM_AUTO  XIP_DDS_MEM_AUTO selects distributed ROM for small cases where the table can be contained in a single layer of memory and selects block ROM for larger cases. XIP_DDS_MEM_BLOCK always uses block RAM. XIP_DDS_MEM_DIST always uses distributed RAM. 
Optimization_Goal  unsigned  XIP_DDS_OPTGOAL_AUTO  XIP_DDS_OPTGOAL_AUTO automatically selects the optimization goal. XIP_DDS_OPTGOAL_AREA optimizes for area. XIP_DDS_OPTGOAL_SPEED optimizes for performance. 
DSP48_Use  unsigned  XIP_DDS_DSP_MIN  XIP_DDS_DSP_MIN implements the phase accumulator and the stages for phase offset, dither noise addition, or both in FPGA logic. XIP_DDS_DSP_MAX implements the phase accumulator and the phase offset, dither noise addition, or both using DSP slices. In the case of single channel, the DSP slice can also provide the register to store programmable phase increment, phase offset, or both and thereby, save further fabric resources. 
Latency_Configuration  unsigned  XIP_DDS_LATENCY_AUTO  XIP_DDS_LATENCY_AUTO automatically determines the latency. XIP_DDS_LATENCY_MANUAL manually specifies the latency using the Latency option. 
Latency  unsigned  5  Any value 
Output_Form  unsigned  XIP_DDS_OUTPUT_TWOS  XIP_DDS_OUTPUT_TWOS outputs two's complement. XIP_DDS_OUTPUT_SIGN_MAG outputs signed magnitude. 
PINC[XIP_DDS_CHANNELS_MAX]  unsigned array  {0}  Any value for the phase increment for each channel 
POFF[XIP_DDS_CHANNELS_MAX]  unsigned array  {0}  Any value for the phase offset for each channel 
SRL IP Library
C code is written to satisfy several different requirements: reuse, readability, and performance. It is unlikely that existing C code was written to produce the most ideal hardware after high-level synthesis.
As with the requirements for reuse, readability, and performance, certain coding techniques or predefined constructs can ensure that synthesis produces more optimal hardware, or can better model hardware in C for easier validation of the algorithm.
Mapping Directly into SRL Resources
Many C algorithms sequentially shift data through arrays. They add a new value to the start of the array, shift the existing data through the array, and drop the oldest data value. This operation is implemented in hardware as a shift register.
The most common way to implement a shift register from C into hardware is to completely partition the array into individual elements, and allow the data dependencies between the elements in the RTL to imply a shift register.
Logic synthesis typically implements the RTL shift register into a Xilinx SRL resource, which efficiently implements shift registers. The issue is that sometimes logic synthesis does not implement the RTL shift register using an SRL component:
 When data is accessed in the middle of the shift register, logic synthesis cannot directly infer an SRL.
 Sometimes, even when the SRL is ideal, logic synthesis may implement the shift register in flip-flops because of other factors. (Logic synthesis is also a complex process.)
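The conventional array-based coding style described above can be sketched in plain C++ as follows. This is an illustrative example only: the names DEPTH, regs, shift_array, and read_tap are assumptions, and the array partitioning directive that an HLS flow would apply to regs is not shown. The read_tap function is the kind of access to the middle of the register that can prevent logic synthesis from inferring an SRL directly.

```cpp
#include <cassert>

const int DEPTH = 4;     // illustrative depth
static int regs[DEPTH];  // zero-initialized; regs[0] holds the newest value

// Add a new value at the start of the array, shift the existing data
// through the array, and return the oldest value that is dropped.
int shift_array(int in) {
    int oldest = regs[DEPTH - 1];
    for (int i = DEPTH - 1; i > 0; --i) {
        regs[i] = regs[i - 1];  // move each element one place along
    }
    regs[0] = in;
    return oldest;
}

// Read a tap from anywhere in the register, including the middle.
int read_tap(int index) {
    assert(index >= 0 && index < DEPTH);
    return regs[index];
}
```

After four values have been shifted in, the fifth call to shift_array returns the first value that was written.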
Vitis HLS provides a C++ class (ap_shift_reg) to ensure that the shift register defined in the C code is always implemented using an SRL resource. The ap_shift_reg class has two methods to perform the various read and write accesses supported by an SRL component.
Read from the Shifter
The read method allows a specified location to be read from the shift register.
The ap_shift_reg.h header file that defines
the ap_shift_reg
class is also included with
Vitis HLS as a standalone package. You
have the right to use it in your own source code. The package xilinx_hls_lib_<release_number>.tgz is located
in the include directory in the Vitis HLS installation area.
// Include the Class
#include "ap_shift_reg.h"
// Define a variable of type ap_shift_reg<type, depth>
//  Sreg must use the static qualifier
//  Sreg will hold integer data types
//  Sreg will hold 4 data values
static ap_shift_reg<int, 4> Sreg;
int var1;
// Read location 2 of Sreg into var1
var1 = Sreg.read(2);
Read, Write, and Shift Data
The shift method allows a read, write, and shift operation to be performed.
// Include the Class
#include "ap_shift_reg.h"
// Define a variable of type ap_shift_reg<type, depth>
//  Sreg must use the static qualifier
//  Sreg will hold integer data types
//  Sreg will hold 4 data values
static ap_shift_reg<int, 4> Sreg;
int var1, In1;
// Read location 3 of Sreg into var1
// THEN shift all values up one and load In1 into location 0
var1 = Sreg.shift(In1,3);
Read, Write, and Enable-Shift
The shift method also supports an enable input, allowing the shift process to be controlled and enabled by a variable.
// Include the Class
#include "ap_shift_reg.h"
// Define a variable of type ap_shift_reg<type, depth>
//  Sreg must use the static qualifier
//  Sreg will hold integer data types
//  Sreg will hold 4 data values
static ap_shift_reg<int, 4> Sreg;
int var1, In1;
bool En;
// Read location 3 of Sreg into var1
// THEN if En=1
// Shift all values up one and load In1 into location 0
var1 = Sreg.shift(In1,3,En);
When using the ap_shift_reg class, Vitis HLS creates a unique RTL component for each shifter. When logic synthesis is performed, this component is synthesized into an SRL resource.
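To check the read-then-shift ordering described in the comments above without the Vitis HLS headers, a small software model can help. The following sketch is an assumption-laden stand-in, not the ap_shift_reg class: ShiftRegModel is a hypothetical name, and the behavior is taken only from the descriptions in this section (read a location first, then shift up by one and load the new value into location 0 when enabled).

```cpp
#include <cassert>

// Hypothetical software model of the behavior described in this section.
// This is NOT the ap_shift_reg class from ap_shift_reg.h.
template <typename T, unsigned DEPTH>
class ShiftRegModel {
public:
    ShiftRegModel() : data_() {}  // all locations start at zero

    // Read one location without shifting.
    T read(unsigned addr) const {
        assert(addr < DEPTH);
        return data_[addr];
    }

    // Read location addr first. Then, if enable is true, shift every
    // value up one place and load in into location 0.
    T shift(T in, unsigned addr, bool enable = true) {
        assert(addr < DEPTH);
        T out = data_[addr];
        if (enable) {
            for (unsigned i = DEPTH - 1; i > 0; --i) {
                data_[i] = data_[i - 1];
            }
            data_[0] = in;
        }
        return out;
    }

private:
    T data_[DEPTH];
};
```

With a depth of 4, a value shifted in appears at location 3 after three further enabled shifts, and a call with the enable set to false returns the addressed value while leaving the contents unchanged.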