When the FIR Compiler is used in System Generator, additional requirements apply to the data flow so that the core stays consistent with the Simulink simulation behavior.
In the Simulink environment, each signal has a sample rate associated with it. If that rate is slower than the system rate (the fastest rate in the design), the signal can only change value at multiples of its own sample period, measured in system-rate periods. For example, a signal with a sample period of 2 in a design with a system period of 1 can only change value every other clock cycle.
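The sample-period rule above can be sketched in a few lines of Python. This is only an illustration, not part of System Generator: the function names are hypothetical, the system period is normalized to 1, the sample period is assumed to be an integer multiple of it, and changes are assumed to be aligned to cycle 0.

```python
def allowed_change_cycles(sample_period, num_cycles):
    """Return the system-clock cycles on which the signal may take a new value.

    Assumption: system period is 1 and changes are aligned to cycle 0.
    """
    return [n for n in range(num_cycles) if n % sample_period == 0]


def hold_at_sample_period(samples, sample_period):
    """Expand one value per sample period into one value per system-clock cycle.

    Each value is held constant for `sample_period` system-clock cycles.
    """
    held = []
    for value in samples:
        held.extend([value] * sample_period)
    return held


# A signal with sample period 2 may change only on every other system cycle.
cycles = allowed_change_cycles(2, 6)
waveform = hold_at_sample_period([10, 20, 30], 2)
```

With a sample period of 2, `cycles` contains only the even cycle numbers, and each entry of the input list occupies two consecutive system-clock cycles in `waveform`.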
This means that if the input to the FIR Compiler is slower than the system rate, it can only change once per sample period. This differs from how the FIR Compiler can be used outside of System Generator, where multi-channel implementations allow you to input the samples of each channel sequentially at the system rate (the rate of the clock driving the FIR Compiler).
Because of this difference, and to match the behavior of the HDL generated by System Generator, the chan_in output must be synchronized to the input sample rate domain. This is done by registering the signal in that domain. The register adds one cycle of latency that is not present on the core when it runs outside of System Generator.
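The effect of the register described above can be modeled with a short Python sketch. It rests on several assumptions that are not stated in this text: the core's internal channel indicator is taken to be a counter that cycles 0, 1, ..., N-1, the register is taken to reset to 0, and both function names are hypothetical.

```python
def raw_chan_in(num_channels, num_samples):
    """Hypothetical model of the core's unregistered channel counter.

    Assumption: the counter cycles 0, 1, ..., num_channels - 1, 0, ...
    """
    return [n % num_channels for n in range(num_samples)]


def registered(signal, reset_value=0):
    """Model a one-sample register in the input sample rate domain.

    Each output equals the previous input; the first output is the
    (assumed) reset value.
    """
    return [reset_value] + signal[:-1]


raw = raw_chan_in(3, 6)   # what the core would show outside System Generator
seen = registered(raw)    # what the chan_in port shows after the register

# After the first sample, the registered value lags the raw value by one.
lag_ok = all(seen[n] == (raw[n] - 1) % 3 for n in range(1, len(raw)))
```

Under these assumptions, the value read on the registered chan_in port is always one step behind the core's own channel count, which is the extra cycle of latency that has to be compensated when driving the inputs.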
As a result, when using a multi-channel FIR Compiler implementation in System Generator, you must account for this extra cycle of latency: input channel 1 when the chan_in output reads 0, input channel 2 when it reads 1, and so on.
This behavior is now documented in the System Generator user guide in the help section for the FIR Compiler block.