I would like to precisely control the packing of LUTs into a delay chain, but MAP optimizes away the LUT1 buffers that form the chain.
Is there a way to prevent this optimization?
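For context, a delay chain of this kind is typically built by directly instantiating LUT1 primitives configured as buffers. The sketch below is illustrative only (the signal names din, d1, and d2 are hypothetical); the instance names match the UCF examples that follow.

Verilog example:
wire d1, d2;

// INIT = 2'b10 makes each LUT1 act as a buffer: O = I0
LUT1 #(.INIT(2'b10)) lut1 (.I0(din), .O(d1));
LUT1 #(.INIT(2'b10)) lut2 (.I0(d1),  .O(d2));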
The LUT1 buffers are maintained if a LOCK_PINS constraint is applied to the LUTs:
UCF syntax example:
INST "lut1" LOCK_PINS;
INST "lut2" LOCK_PINS;
...
For more complicated use cases involving multiple-input LUTs, the same constraint can be used to lock the original input pin usage. Alternatively, a different pin mapping can be specified and locked using the LOCK_PINS constraint.
UCF syntax example:
INST "some_lut4" LOCK_PINS=I0:A4,I1:A3,I2:A2,I3:A1;
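For illustration, the constraint above could apply to a LUT4 instantiated as follows (the instance name matches the constraint; the port connections a, b, c, d, y and the INIT value, a 4-input AND, are hypothetical). LOCK_PINS maps each logical input pin (I0-I3) to the specified physical LUT pin (A1-A4).

Verilog example:
// O = I0 & I1 & I2 & I3 (INIT bit 15 set only)
LUT4 #(.INIT(16'h8000)) some_lut4 (
  .I0(a), .I1(b), .I2(c), .I3(d),
  .O(y)
);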
For more variations on the use of this constraint, see page 131 of the Constraints Guide (UG625):
http://www.xilinx.com/support/documentation/sw_manuals/xilinx11/cgd.pdf
The LOCK_PINS constraint can be combined with other packing and placement constraints (BEL, RLOC, LOC, LUTNM) to completely control the LUT usage.
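As one possible combination, LOCK_PINS can be paired with BEL and LOC to pin a buffer to a specific LUT site in a specific slice. The instance name, BEL name, and slice location below are hypothetical and depend on the target device family.

UCF syntax example:
INST "lut1" LOCK_PINS;
INST "lut1" BEL = A6LUT;
INST "lut1" LOC = SLICE_X10Y20;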