Wednesday, October 3, 2012

Video to Texture Streaming (Part 3) - i.MX6 processor

Hi there again! Sorry for the long time without posting anything; I have been running out of time to deal with cool stuff (unfortunately), but I have something nice to share.
Continuing with video to texture streaming on the i.MX6 processors: so far we have seen how to decode the video file, and now we need to write the data into the GPU buffer as fast as possible in order to get a good frame rate for our texture, even when using high-resolution media.

To do that, we have an API that makes our job as easy as possible. The main functions of this API are:

1 - GL_API void GL_APIENTRY glTexDirectVIV (GLenum Target, GLsizei Width, GLsizei Height, GLenum Format, GLvoid **Pixels);

2 - GL_API void GL_APIENTRY glTexDirectVIVMap (GLenum Target, GLsizei Width, GLsizei Height, GLenum Format, GLvoid **Logical, const GLuint *Physical);

3 - GL_API void GL_APIENTRY glTexDirectInvalidateVIV (GLenum Target);

So, what are all those parameters?

Answer:
1 - Target, Width, Height and Format have the same meaning as in glTexImage2D. Width alignment to 16 pixels is required. Example:
Target: GL_TEXTURE_2D Width: 640 Height: 480 Format: GL_VIV_YV12, GL_VIV_NV12, GL_VIV_YUY2, GL_VIV_UYVY, GL_VIV_NV21

2 - Pixels (first function) is a pointer returned by the driver, which means that any data you write through this pointer goes directly into the GPU buffer.

3 - Logical (Map function) is a pointer to the logical address of the application-defined buffer for the texture.

4 - Physical (Map function) is a pointer to the physical address of the application-defined buffer for the texture, or ~0 if no physical address has been provided (see the sketch below).
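Just to illustrate the Map variant, here is a minimal sketch based on the prototype above (not something the demo uses): my_yv12_buffer, width and height are placeholders for your own buffer and video size, and ~0 is passed because we have no physical address.

GLuint physical = ~0U;                 // no physical address available
GLvoid *logical = my_yv12_buffer;      // placeholder: application-allocated buffer, width * height * 3 / 2 bytes for YV12

glTexDirectVIVMap (GL_TEXTURE_2D, width, height, GL_VIV_YV12, &logical, &physical);
if (glGetError () != GL_NO_ERROR)
{
     printf ("\nError, glTexDirectVIVMap\n");
}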

Alright, we have all the parameters and their descriptions, so how do we use these functions?

First of all, we need to declare our pointer and then call the function to let the driver allocate memory for us:

GLvoid *pTexel;

The second step is to use the API and get our pointer pointing to the GPU buffer:

// let the driver allocate the buffer and hand us a pointer to it
glTexDirectVIV (GL_TEXTURE_2D, width, height, GL_VIV_YV12, &pTexel);
GLenum error = glGetError ();
if (error != GL_NO_ERROR)
{
     printf ("\nError, glTexDirectVIV 0x%08x\n", error);
}

This function should be called just once in your code! (This was my mistake for a long time...)

Now, somewhere in your code (in a loop function), you can use the following code to write the data:

glBindTexture (GL_TEXTURE_2D, texture);
memmove (pTexel, my_texture_data, w * h * 3/2);
glTexDirectInvalidateVIV (GL_TEXTURE_2D);
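For reference, the w * h * 3/2 above comes from the YV12 layout: a full-resolution Y plane followed by quarter-resolution V and U planes. A quick sketch of the offsets inside the buffer returned by glTexDirectVIV:

// YV12 plane layout inside the direct texture buffer (illustrative)
unsigned char *y_plane = (unsigned char *) pTexel;        // w * h bytes
unsigned char *v_plane = y_plane + (w * h);               // (w / 2) * (h / 2) bytes
unsigned char *u_plane = v_plane + ((w / 2) * (h / 2));   // (w / 2) * (h / 2) bytes
// total: w * h * 3 / 2 bytes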

This is not fully optimized yet: if you check this post, you will see that we are already performing a memmove to copy the frame data into our global buffer. You can play with this code and remove one of these memmoves.
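One way to do that (just a sketch, assuming the caps are already set to YV12 as shown below, that glTexDirectVIV was called with the video's width and height, and that pTexel and texture are visible to the callback) is to write the decoded frame straight into the GPU buffer inside on_handoff and only invalidate the texture in the render loop:

// handoff callback writing directly into the buffer returned by glTexDirectVIV (sketch)
void on_handoff (GstElement *pFakeSink, GstBuffer *pBuffer, GstPad *pPad)
{
    video_w = get_pad_width (pPad);
    video_h = get_pad_height (pPad);

    gst_buffer_ref (pBuffer);
    memmove (pTexel, GST_BUFFER_DATA (pBuffer), video_w * video_h * 3 / 2);   // YV12 frame
    gst_buffer_unref (pBuffer);
}

// render loop: no extra copy needed anymore
glBindTexture (GL_TEXTURE_2D, texture);
glTexDirectInvalidateVIV (GL_TEXTURE_2D);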

Also, the GStreamer code from that post captures the frame in RGB format, and here you will need YUV, so you can change the caps to:

// RGB
//gst_element_link_filtered (pFfConv, __videosink, gst_caps_new_simple ("video/x-raw-rgb","bpp",G_TYPE_INT,16, NULL));

// YV12
gst_element_link_filtered (pFfConv, __videosink, gst_caps_new_simple ("video/x-raw-yuv", "format", GST_TYPE_FOURCC, GST_MAKE_FOURCC ('Y', 'V', '1', '2'), NULL));

The full application can be downloaded here.

EOF !

Monday, March 12, 2012

Stereo Vision with Kinect (xbox360 device) on i.MX6 processor

After a long time with no posts, here goes one more!

It is well known in image-processing projects such as classifiers, object trackers, and so on, that a bad segmentation procedure can make the entire project fail.

Based on the gesture recognition project already presented in previous posts, we can clearly see that if we want the application to work fine with every person, independent of skin color, then the user needs to wear a glove or another accessory that stands out from the background. Also, in a gesture recognition demo, only movements on the X and Y axes are detected efficiently.

That is because, with images provided by a single camera, we need to do segmentation based on colors and then check the contours for the desired pattern. If the background has a color very similar to the skin color or the glove color, then we have a serious problem in the segmentation part: a lot of noise will appear in the final image, and the classification/recognition will not work properly.

Unless you are working on a project that will classify/track/detect some object in a controlled environment, one camera is sufficient; but if you are aiming at a project that could work anywhere, then we need a different approach.

The solution is quite simple: we just need to add another camera!

With 2 cameras and the correct setup (alignment and distance between them), we have 2 images with different perspectives of the same point, that is, a 3D image.

There are some algorithms on the web that correlate the images from the 2 cameras and return the disparity map, that is, the depth image.

With a depth image, the color segmentation in a project can be avoided and a segmentation based on distance can be used instead; also, more complex gestures can be recognized.
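Just to give an idea of what distance-based segmentation means in code, here is a minimal sketch, assuming a 640x480 buffer of Kinect depth values and hypothetical near/far thresholds chosen for your scene:

// build a binary mask keeping only the pixels inside a given depth range (sketch)
// depth: 640x480 Kinect depth values, mask: 640x480 output (255 = keep, 0 = background)
#include <stdint.h>

void segment_by_depth (const uint16_t *depth, uint8_t *mask, int width, int height, uint16_t near_limit, uint16_t far_limit)
{
    int i;

    for (i = 0; i < width * height; i++)
    {
        mask[i] = (depth[i] >= near_limit && depth[i] <= far_limit) ? 255 : 0;
    }
}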

Unfortunately, the disparity map calculation is too heavy for the CPU to do alone, so we have the following options to improve the performance:

1 - use OpenCL (share the tasks with the GPU)
2 - use a device that already delivers depth images computed from 2 different sensors.

As we are working on an embedded system, and the OpenCL port for embedded systems is not so complete, we could face some issues and spend a lot of time on it.

On the other hand, we could use a device which gives us the desired depth image directly. The chosen device for this task was the Microsoft Kinect. Yes! The one used for gaming on the Xbox 360. Of course there are other devices you could use, for example: http://www.tyzx.com/products/DeepSeaG3.html

As I already have a Kinect to test with (only for work, because for playing I still prefer my PS3, heheheheheh), the Kinect is the device we are going to use for the demo.

The library used for this project was OpenKinect, and you can get more information here: www.openkinect.org

Since I use Linux on my projects (LTIB), I'm going to show how to build the OpenKinect library for this platform:


NOTE: the i.MX6 is supported on the i.MX53 BSP, so that is the BSP used in the steps below.

1 - Freescale's Linux BSP: https://www.freescale.com/webapp/Download?colCode=L2.6.35_11.03_ER_SOURCE&prodCode=LEIMX&appType=license&location=null&fpsp=1&Parent_nodeId=1276810298241720831102&Parent_pageType=product

2 - follow the documentation on how to install LTIB on your host Linux machine

3 - once you have Linux running on your board, we can start building the OpenKinect driver.

4 - clone the openkinect git to your host: git clone git://github.com/OpenKinect/libfreenect.git

5 - it will create a libfreenect folder; cd into it: cd libfreenect

6 - create a build folder: mkdir build

7 - get into the LTIB shell using the ltib -m shell command: home/ltib> ./ltib -m shell

8 - as OpenKinect is intended for x86, the sample apps are based on OpenGL with GLUT, so we need to avoid building any sample code: just edit the file ../libfreenect/CMakeLists.txt and, at lines 51~63, turn the build options to OFF.


9 - once in the ltib shell (LTIB>) go to the build folder and type: cmake ..

10 - if everything goes fine, the log output should look like this:

LTIB> cmake ..
-- The C compiler identification is GNU
-- The CXX compiler identification is GNU
-- Check for working C compiler: /opt/freescale/ltib/usr/spoof/gcc
-- Check for working C compiler: /opt/freescale/ltib/usr/spoof/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /opt/freescale/ltib/usr/spoof/g++
-- Check for working CXX compiler: /opt/freescale/ltib/usr/spoof/g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Operating system is Linux
-- Got System Processor i686
-- libfreenect will be installed to /usr/local
-- Headers will be installed to /usr/local/include/libfreenect
-- Libraries will be installed to /usr/local/lib
-- Found libusb-1.0:
-- - Includes: /usr/include
-- - Libraries: /usr/lib/libusb-1.0.so
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Using unsigned short
-- Check if the system is big endian - little endian
-- Looking for include files CMAKE_HAVE_PTHREAD_H
-- Looking for include files CMAKE_HAVE_PTHREAD_H - found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Looking for XOpenDisplay in /usr/lib/libX11.so;/usr/lib/libXext.so
-- Looking for XOpenDisplay in /usr/lib/libX11.so;/usr/lib/libXext.so - found
-- Looking for gethostbyname
-- Looking for gethostbyname - found
-- Looking for connect
-- Looking for connect - found
-- Looking for remove
-- Looking for remove - found
-- Looking for shmat
-- Looking for shmat - found
-- Looking for IceConnectionNumber in ICE
-- Looking for IceConnectionNumber in ICE - not found
-- Found X11: /usr/lib/libX11.so
-- Configuring done
-- Generating done
-- Build files have been written to:
/home/andre/bsps/git_openkinect/libfreenect/build

11 - make

LTIB> make
Scanning dependencies of target freenect Scanning dependencies of target freenectstatic [ 10%] [ 20%] Building C object src/CMakeFiles/freenect.dir/core.c.o
[ 30%] Building C object src/CMakeFiles/freenect.dir/tilt.c.o
[ 40%] Building C object src/CMakeFiles/freenect.dir/cameras.c.o
[ 50%] [ 60%] Building C object src/CMakeFiles/freenectstatic.dir/core.c.o
Building C object src/CMakeFiles/freenectstatic.dir/tilt.c.o
Building C object src/CMakeFiles/freenect.dir/usb_libusb10.c.o
[ 80%] [ 80%] Building C object
src/CMakeFiles/freenect.dir/registration.c.o
Building C object src/CMakeFiles/freenectstatic.dir/cameras.c.o
[ 90%] /home/andre/bsps/git_openkinect/libfreenect/src/cameras.c: In function 'freenect_fetch_zero_plane_info':
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:914: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:915: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:916: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:917: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:918: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:919: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:920: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:921: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c: At top level:
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:749: warning:
'read_register' defined but not used
[100%] Building C object src/CMakeFiles/freenectstatic.dir/registration.c.o
Building C object src/CMakeFiles/freenectstatic.dir/usb_libusb10.c.o
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c: In function
'freenect_fetch_zero_plane_info':
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:914: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:915: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:916: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:917: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:918: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:919: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:920: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:921: warning:
dereferencing type-punned pointer will break strict-aliasing rules
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c: At top level:
/home/andre/bsps/git_openkinect/libfreenect/src/cameras.c:749: warning:
'read_register' defined but not used
Linking C shared library ../lib/libfreenect.so [100%] Built target freenect Linking C static library ../lib/libfreenect.a [100%] Built target freenectstatic
LTIB>

12 - copy the lib and header files to your rootfs:
cp ../build/lib/* /ltib/rootfs/usr/lib
cp ../include/* /ltib/rootfs/usr/include

With this procedure, OpenKinect is correctly installed in your rootfs. If you have any problems, for example with libusb, you should update the libusb package in LTIB. That is easily done by copying the newly downloaded package to /opt/freescale/package and changing the corresponding spec file at ../ltib/dist/lfs-5.1/package_name/package_name.spec. Every package can be updated using this approach; after that, you can rebuild the package using the LTIB commands:

./ltib -p package_name.spec -m prep
./ltib -p package_name.spec -m scbuild
./ltib -p package_name.spec -m scdeploy

Now, as a simple application, I used the last video to texture project, but displaying the images from the Kinect device instead:


This demo application was written based on the sample applications that come with the OpenKinect library. In future posts I'll give more details about the code.
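In the meantime, here is a minimal sketch of how depth frames can be grabbed with libfreenect (based on the library's standard C API; exact function names and constants may differ slightly depending on the libfreenect version you cloned):

#include <stdio.h>
#include <stdint.h>
#include <libfreenect.h>   // or <libfreenect/libfreenect.h>, depending on your include path

// called by libfreenect every time a new depth frame is available
static void depth_cb (freenect_device *dev, void *depth, uint32_t timestamp)
{
    uint16_t *d = (uint16_t *) depth;   // 640x480, 11-bit depth values
    printf ("depth frame %u, center pixel = %u\n", timestamp, d[240 * 640 + 320]);
}

int main (void)
{
    freenect_context *ctx;
    freenect_device *dev;

    if (freenect_init (&ctx, NULL) < 0 || freenect_open_device (ctx, &dev, 0) < 0)
    {
        printf ("could not open the Kinect\n");
        return -1;
    }

    freenect_set_depth_callback (dev, depth_cb);
    freenect_set_depth_mode (dev, freenect_find_depth_mode (FREENECT_RESOLUTION_MEDIUM, FREENECT_DEPTH_11BIT));
    freenect_start_depth (dev);

    // pump libfreenect events; depth_cb runs from inside this loop
    while (freenect_process_events (ctx) >= 0)
        ;

    freenect_stop_depth (dev);
    freenect_close_device (dev);
    freenect_shutdown (ctx);

    return 0;
}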

Source Code: git@github.com:andreluizeng/Kinect-hack.git

EOF !

Gesture Recognition on i.MX6

Hi there, just testing the power of this processor. Here is the port of the Gesture Recognition project presented on this blog, with some modifications, adding the code described in the previous post.

Now we can control a plane that has a video stream texture:



Only 2 gestures were enabled in this application:

1 - opened hand (enables control)
2 - closed hand (disables control)

The other gestures already learned by the ANN are still there, ready to use.

Source Code: git@github.com:andreluizeng/gesture-recognition.git

EOF !

Video to Texture Streaming (Part 2) - i.MX6 processor

After some months without any new stuff (sorry for that), here I am again! Continuing with the last post about video to texture, but now using Freescale's latest release, the powerful i.MX6 processor. It is a quad-core processor that runs at up to 1.2 GHz, but I'm using it at 1 GHz.

In that last application, a webcam was used as our video source; but what if we wanted to use a video file as the source for an application like that? The answer is simple: we need a video decoder.

Video decoding can be done in two ways:

1 - Software decoding
2 - Hardware decoding

As the processor used here supports hardware decoding, and the Freescale BSP already includes the VPU plugins/drivers, we can use it in GStreamer code directly, which makes a lot of things easier.

As an example, I made a simple demo application that decodes a video and uses it as a texture for a cube; but since we are dealing with a quad core here, why not decode 2 videos onto 2 cubes to test its performance?

Before showing the result, let's first go through the GStreamer code.
void gst_play (const char *uri, GCallback handoffHandlerFunc)
{
    GstElement *pFfConv = NULL;
    GstElement *pSinkBin = NULL;
    GstPad *pFfConvSinkPad = NULL;
    GstPad *pSinkPad = NULL;

    netplayer_gst_stop ();

    pipeline = gst_pipeline_new ("gst-player");

    bin = gst_element_factory_make ("playbin2", "bin");
    videosink = gst_element_factory_make ("fakesink", "videosink");
    //videosink = gst_element_factory_make ("mfw_v4lsink", "videosink");
    g_object_set (G_OBJECT (videosink), "sync", TRUE, "signal-handoffs", TRUE, NULL);
    g_signal_connect (videosink, "handoff", handoffHandlerFunc, NULL);

    g_object_set (G_OBJECT (bin), "video-sink", videosink, NULL);
    g_object_set (G_OBJECT (bin), "volume", 0.5, NULL);

    bus = gst_pipeline_get_bus (GST_PIPELINE (pipeline));
    gst_bus_add_watch (bus, bus_call, loop);
    gst_object_unref (bus);
    g_object_set (G_OBJECT (bin), "uri", uri, NULL);

    // colorspace conversion
    // it is added in a new bin, and then this bin is added to the first one (above)
    pFfConv = gst_element_factory_make ("ffmpegcolorspace", "ffconv");
    if (!pFfConv)
    {
        printf ("Couldn't create pFfConv\n");
    }

    // put the fake sink and caps filter into a single bin
    pSinkBin = gst_bin_new ("SinkBin");
    if (!pSinkBin)
    {
        printf ("Couldn't create pSinkBin\n");
    }
    gst_bin_add_many (GST_BIN (pSinkBin), pFfConv, videosink, NULL);
    gst_element_link_filtered (pFfConv, videosink, gst_caps_new_simple ("video/x-raw-rgb", "bpp", G_TYPE_INT, 16, NULL));

    // in order to link the sink bin to the playbin we have to create
    // a ghost pad that points to the capsfilter sink pad
    pFfConvSinkPad = gst_element_get_static_pad (pFfConv, "sink");
    if (!pFfConvSinkPad)
    {
        printf ("Couldn't create pFfConvSinkPad\n");
    }

    pSinkPad = gst_ghost_pad_new ("sink", pFfConvSinkPad);
    if (!pSinkPad)
    {
        printf ("Couldn't create pSinkPad\n");
    }
    gst_element_add_pad (pSinkBin, pSinkPad);
    gst_object_unref (pFfConvSinkPad);

    // force the SinkBin to be used as the video sink
    g_object_set (G_OBJECT (bin), "video-sink", pSinkBin, NULL);

    gst_bin_add (GST_BIN (pipeline), bin);

    gst_element_set_state (pipeline, GST_STATE_PAUSED);

    return;
}

The code above creates the GStreamer pipeline used for video decoding. Note that we could see the video directly on the framebuffer if we used mfw_v4lsink instead of fakesink.

fakesink is needed since we are not going to show the video in a video buffer; instead, we need to put all the video data in another buffer and then use this buffer as the texture for our cubes.

this buffer is updated by the callback function:

//handoff function, called every frame
void on_handoff (GstElement *pFakeSink, GstBuffer *pBuffer, GstPad *pPad)
{
    video_w = get_pad_width (pPad);
    video_h = get_pad_height (pPad);

    gst_buffer_ref (pBuffer);
    memmove (g_pcFrameBuffer, GST_BUFFER_DATA (pBuffer), video_w * video_h * 2);
    gst_buffer_unref (pBuffer);
}

As every GStreamer-based application needs a main loop for the message bus, and our main loop is already being used for rendering, we can use a thread for it:


void *GstLoop (void *ptr)
{
    while (1)
    {
        while ((bus_msg = gst_bus_pop (bus)))
        {
            // call your bus message handler
            bus_call (bus, bus_msg, NULL);
            gst_message_unref (bus_msg);
        }
    }
}


The bus_call function is a generic message-bus handler that can easily be found in any GStreamer documentation, or you can access: http://gstreamer.freedesktop.org/data/doc/gstreamer/head/manual/html/chapter-bus.html
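For reference, a minimal version of such a handler could look like the sketch below (only end-of-stream and error messages are handled; anything else is ignored):

// minimal bus handler: report end-of-stream and errors (sketch)
static gboolean bus_call (GstBus *bus, GstMessage *msg, gpointer data)
{
    switch (GST_MESSAGE_TYPE (msg))
    {
        case GST_MESSAGE_EOS:
            printf ("End of stream\n");
            break;

        case GST_MESSAGE_ERROR:
        {
            GError *err = NULL;
            gchar *debug = NULL;

            gst_message_parse_error (msg, &err, &debug);
            printf ("Error: %s\n", err->message);
            g_error_free (err);
            g_free (debug);
            break;
        }

        default:
            break;
    }

    return TRUE;
}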

with those functions, your main function could look like:


g_pcFrameBuffer = (gchar*)malloc(720*480*3); // video buffer
gst_init (&argc, &argv);
loop = g_main_loop_new (NULL, FALSE);
uri_to_play = g_strdup_printf ("file:///home/video_to_play.mp4");
gst_play(uri_to_play, (GCallback) on_handoff);
gst_resume();
pthread_create (&gst_loop_thread, NULL, GstLoop,(void *)&thread_id);


After that, we have the g_pcFrameBuffer variable being updated constantly, and then we can use it as a texture on the cube's face:


glTexImage2D (GL_TEXTURE_2D, 0, GL_RGB, w, h, 0, GL_RGB, GL_UNSIGNED_SHORT_5_6_5, g_pcFrameBuffer);
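Just as a side note (not something this demo does): since the texture storage is created on the first upload, the following frames could be pushed with glTexSubImage2D instead, which avoids re-allocating the texture every frame:

// update the already-created texture storage with the new frame (sketch)
glTexSubImage2D (GL_TEXTURE_2D, 0, 0, 0, w, h, GL_RGB, GL_UNSIGNED_SHORT_5_6_5, g_pcFrameBuffer);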


In this application we are not using the GPU buffers directly like in the last post; we will get back to that in the future.

and finally, the result:



EOF !