Wednesday, May 27, 2020

Creating an OpenCL demo application using the NXP Demo Framework (GPU SDK)

As mentioned before, in this post I am going to cover the setup and how to create an OpenCL-based application using the NXP Demo Framework.

The Demo Framework is described in its documentation as: “A multi-platform framework for fast and easy demo development. The framework abstracts away all the boilerplate & OS specific code of allocating windows, creating the context, texture loading, shader compilation, render loop, animation ticks, benchmarking graph overlays etc. Thereby allowing the demo/benchmark developer to focus on writing the actual 'demo' code.”

We will be cross-compiling our application, using Linux on the host and Yocto as the target's system. That requires a Yocto image and SDK (ARM GCC toolchain).

You can get both from the NXP website, or download the source code and build your own image using the latest release available (recommended). So, let's start with that.


1 - Building Yocto from source

a) Preparing the Host Machine

$ sudo apt-get install gawk wget git-core diffstat unzip texinfo build-essential chrpath libsdl1.2-dev xterm curl ncurses-dev
$ mkdir ~/bin
$ curl http://commondatastorage.googleapis.com/git-repo-downloads/repo > ~/bin/repo
$ chmod +x ~/bin/repo
$ export PATH=~/bin/:$PATH


b) BSP Source Code Download

$ mkdir fsl-arm-yocto-bsp
$ cd fsl-arm-yocto-bsp
$ repo init -u git://source.codeaurora.org/external/imx/imx-manifest.git -b imx-linux-zeus -m imx-5.4.3-2.0.0.xml
$ repo sync


c) Configuring the target (i.MX8QXP as an example)

$ EULA=1 MACHINE=imx8qxpmek DISTRO=fsl-imx-xwayland source ./imx-setup-release.sh -b build-imx8qxpmek-xwayland


d) Building the BSP image

$ bitbake imx-image-full    # full image; we will use this one, it already includes Qt5
$ bitbake fsl-image-qt5     # Qt5 image


e) Flashing the Yocto Image

Extract the image (compressed as xxxxxxxxx.rootfs.sdcard.tar.bz2), located in ../build-imx8qxpmek-xwayland/tmp/deploy/images/imx8qxpmek

$ dd if=../build-imx8qxpmek-xwayland/tmp/deploy/images/imx8qxpmek/imx-image-full-imx8qxpmek-xxxxxxxxxxxxx.rootfs.sdcard of=/dev/sde

Change /dev/sde to the device node of your SD card.


f) Building and installing the SDK package

$ bitbake imx-image-full -c populate_sdk
$ cd ../build-imx8qxpmek-xwayland/tmp/deploy/sdk
$ ./fsl-imx-xwayland-glibc-x86_64-imx-image-full-aarch64-imx8qxpmek-toolchain-5.4-zeus.sh  

The default install location is /opt/fsl-imx-xwayland/5.4-zeus/

Now with the BSP and Toolchain installed, we can proceed to the Demo Framework installation.


2 - Installing the Demo Framework

a) Download (or clone) the NXP Demo Framework source code

$ git clone https://github.com/NXPmicro/gtec-demo-framework.git


b) Set the toolchain environment variables by sourcing the setup environment script

$ source /opt/fsl-imx-xwayland/5.4-zeus/environment-setup-aarch64-poky-linux


c) Set the Demo Framework environment variables by sourcing the prepare.sh script from the root of the cloned repository

$ cd gtec-demo-framework
$ source prepare.sh




3 - Creating the OpenCL demo application

Change directory to DemoApps/OpenCL and create a new demo application from it using the FslBuildNew.py script as follows:

$ FslBuildNew.py OpenCL1_2 HelloWorld  

Use OpenCL1_1 instead of OpenCL1_2 if needed.




At this point we already have our OpenCL demo skeleton. In HelloWorld.cpp we can add our base OpenCL code (pretty much the same code we used in the last OpenCL post). As this post is already big enough, let's keep it short: the code can be downloaded directly here.

For the OCL kernel, as a test we can simply copy the input buffer to the output buffer. The input buffer has a size of 3840 * 2160 * 4 bytes, filled with randomly generated values:


// Copies one byte per work-item from the input buffer to the output buffer.
__kernel void hello_world(__global const uchar *input, __global uchar *output)
{
   int id = get_global_id(0);
   output[id] = input[id];
}
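To check that the kernel did its job, the buffer read back from the GPU can be compared against a CPU version of the same copy. The following is only a sketch in plain C++ with no OpenCL calls. The names MakeRandomInput, HelloWorldReference and OutputMatchesInput are hypothetical helpers made up for this example and are not part of the downloadable code or of the Demo Framework.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Buffer size used in the post: one 3840x2160 frame, 4 bytes per pixel.
constexpr std::size_t kWidth = 3840;
constexpr std::size_t kHeight = 2160;
constexpr std::size_t kBytesPerPixel = 4;
constexpr std::size_t kBufferSize = kWidth * kHeight * kBytesPerPixel;

// Hypothetical helper: fills a buffer with pseudo-random bytes.
// The fixed default seed is an assumption so that runs are repeatable.
std::vector<std::uint8_t> MakeRandomInput(std::size_t size, unsigned seed = 42)
{
  std::mt19937 rng(seed);
  std::uniform_int_distribution<int> dist(0, 255);
  std::vector<std::uint8_t> data(size);
  for (auto& byte : data)
  {
    byte = static_cast<std::uint8_t>(dist(rng));
  }
  return data;
}

// CPU equivalent of the hello_world kernel: each loop iteration plays
// the role of one work-item with id = get_global_id(0).
void HelloWorldReference(const std::vector<std::uint8_t>& input,
                         std::vector<std::uint8_t>& output)
{
  assert(input.size() == output.size());
  for (std::size_t id = 0; id < input.size(); ++id)
  {
    output[id] = input[id];
  }
}

// Returns true when the buffer read back from the device equals the input.
bool OutputMatchesInput(const std::vector<std::uint8_t>& input,
                        const std::vector<std::uint8_t>& deviceOutput)
{
  return input == deviceOutput;
}
```

In the demo, the input would be created with MakeRandomInput(kBufferSize), and the vector read back from the output buffer would be passed to OutputMatchesInput.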


Save the OCL kernel as hello_world.cl (you can create a folder called “Content” and place it there).
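Since the kernel lives in a separate file, the host code has to read it into memory before handing the text to OpenCL (for example through clCreateProgramWithSource). Below is a minimal sketch using only the C++ standard library. ReadKernelSource is a hypothetical helper name, and the Demo Framework may offer its own way of loading content that would fit better inside a real demo.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads a whole text file, such as
// "Content/hello_world.cl", into a string.
// Throws std::runtime_error if the file cannot be opened.
std::string ReadKernelSource(const std::string& path)
{
  assert(!path.empty());
  std::ifstream file(path, std::ios::in | std::ios::binary);
  if (!file)
  {
    throw std::runtime_error("Could not open kernel file: " + path);
  }
  std::ostringstream buffer;
  buffer << file.rdbuf();  // copy the complete file content
  return buffer.str();
}
```

The returned string holds the kernel source exactly as stored on disk, so the same file can be edited on the target without rebuilding the application.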

Once the steps above are completed, we can proceed to build our demo application.


4 - Building the OpenCL demo application

a) Change dir to the HelloWorld demo

$ cd HelloWorld


b) Use FslBuild.py to generate the makefile and build the demo. The script accepts several input arguments; the one you need now sets the target backend. For this sample that is Wayland (XWayland, to be precise); it can also be FB or X11.

$ FslBuild.py --Variants [WindowSystem=Wayland]




The first build can take some time, since the Demo Framework modules must be built as well.


c) If errors regarding third-party libraries occur, add the following argument to the build command line: --Recipes [*]

$ FslBuild.py --Variants [WindowSystem=Wayland] --Recipes [*]


d) The binary (executable) file will be placed at:

../gtec-demo-framework/build/Yocto/DemoApps/OpenCL/HelloWorld/OpenCL.HelloWorld


5 - Running the OpenCL demo application

a) Copy the OpenCL.HelloWorld and the hello_world.cl to the target rootfs




b) Run the application on the target

$ ./OpenCL.HelloWorld




The GPU process time is reported as zero because of the resolution of the timer I am using, and because I am only measuring the clEnqueueNDRangeKernel call, which is responsible for running the OCL kernel. If you also measure the time spent writing to and reading from memory, the number will be bigger. In that case, it is suggested to use physically allocated memory and then map the buffers through OCL, which is called a zero-copy operation. This will be a topic for another post.
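The effect of the timer resolution can be shown without any GPU. The sketch below assumes the timer reports whole milliseconds, which is a guess since the timing code is not shown in this post. Timing, ToTiming and TimeCall are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// One measured interval expressed at two resolutions.
struct Timing
{
  std::int64_t milliseconds;  // truncated to whole milliseconds
  std::int64_t nanoseconds;   // the same interval in nanoseconds
};

// Converts an interval to both units. Truncating to whole milliseconds
// yields 0 for anything shorter than 1 ms.
Timing ToTiming(std::chrono::nanoseconds elapsed)
{
  assert(elapsed.count() >= 0);
  Timing result;
  result.milliseconds =
    std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
  result.nanoseconds = elapsed.count();
  return result;
}

// Times an arbitrary callable with the monotonic steady_clock.
template <typename Func>
Timing TimeCall(Func&& func)
{
  const auto start = std::chrono::steady_clock::now();
  func();
  const auto end = std::chrono::steady_clock::now();
  return ToTiming(
    std::chrono::duration_cast<std::chrono::nanoseconds>(end - start));
}
```

For example, ToTiming(std::chrono::microseconds(250)) gives 0 in the milliseconds field and 250000 in the nanoseconds field. Keep in mind that clEnqueueNDRangeKernel only queues the work and can return before the kernel has finished, so the callable passed to TimeCall would need to include a clFinish on the same queue to cover the actual execution.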


EOF !

Wednesday, May 13, 2020

New topics in this blog !

It has been a long time since I posted anything new. From now on, I will start adding content that mixes computer vision and compute (OpenCL) and uses a very nice tool for designing very stable and robust applications: the NXP GPU SDK, also known as the Demo Framework, which will be the topic of the next post.

 EOF!