Using clEnqueueMigrateMemObjects to Transfer Data

OpenCL™ provides a number of APIs for transferring data between the host and the device. Data movement APIs such as clEnqueueWriteBuffer and clEnqueueReadBuffer typically migrate memory objects to the device implicitly after they are enqueued, but they do not guarantee when the data is actually transferred. This makes it difficult for the host application to overlap the placement of memory objects on the device with the computation carried out by kernels.

OpenCL 1.2 introduced a new API, clEnqueueMigrateMemObjects, with which memory migration can be performed explicitly, ahead of the commands that depend on it. This allows the application to preemptively change the association of a memory object, through regular command queue scheduling, in order to prepare for an upcoming command. It also permits an application to overlap the placement of memory objects with other unrelated operations before these memory objects are needed, potentially hiding transfer latencies. Once the event returned by clEnqueueMigrateMemObjects has been marked CL_COMPLETE, the memory objects specified in mem_objects have been successfully migrated to the device associated with command_queue.

The clEnqueueMigrateMemObjects API can also be used to direct the initial placement of a memory object after creation, possibly avoiding the initial overhead of instantiating the object on the first enqueued command to use it.

Another advantage of clEnqueueMigrateMemObjects is that it can migrate multiple memory objects in a single API call. This reduces the scheduling and function-call overhead of transferring more than one memory object.

The following code snippet shows the use of clEnqueueMigrateMemObjects. It is taken from the Vector Multiplication for XPR Device example in the host category of the Xilinx On-boarding Example GitHub repository.

int err = clEnqueueMigrateMemObjects(