This page is about object tracking algorithms. It summarizes my Master's thesis in Computer Science, 'Localization of an object in serialized images', which is a comparative analysis of three tracking algorithms: Color-based, Template-based and Mean-Shift. All three are implemented in both Matlab and C. They were first tested in Matlab at the development stage, and then the same procedures were implemented in C and adapted to work with the Parrot Ar.Drone base application. They are intended for object tracking on an unmanned aerial vehicle (UAV).
- Color-Based Tracking
The first approach I tested is color-based tracking. I used the front camera of the Parrot Ar.Drone because it is convenient for testing while the drone is landed. I used the same color filter, based on an RGB to HSV transformation, that I described earlier on my 'Visual Servoing' page. After the HSV transformation I obtain a binary image by thresholding the H, S and V values. This binary image is the basis for blob analysis and for retrieving the information I need, in particular the center of the blob. The procedure is simple: I compute the mean of the X coordinates and the mean of the Y coordinates of all the white pixels. Here is a resulting picture:
Color-based tracking
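The HSV thresholding step can be sketched in C roughly as follows. This is a minimal sketch, not the code from the repository: the names `HsvRange`, `rgb_to_hsv` and `hsv_threshold` are hypothetical, the image is assumed to be interleaved 8-bit RGB, and hue is assumed to be measured in degrees from 0 to 360 with S and V between 0 and 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical HSV range: H in degrees [0, 360), S and V in [0, 1]. */
typedef struct {
    float h_min, h_max;
    float s_min, s_max;
    float v_min, v_max;
} HsvRange;

/* Convert one 8-bit RGB pixel to HSV. */
static void rgb_to_hsv(uint8_t r8, uint8_t g8, uint8_t b8,
                       float *h, float *s, float *v)
{
    float r = r8 / 255.0f, g = g8 / 255.0f, b = b8 / 255.0f;
    float cmax = r > g ? (r > b ? r : b) : (g > b ? g : b);
    float cmin = r < g ? (r < b ? r : b) : (g < b ? g : b);
    float delta = cmax - cmin;

    *v = cmax;
    *s = (cmax > 0.0f) ? delta / cmax : 0.0f;

    if (delta == 0.0f) {
        *h = 0.0f;                      /* grey pixel: hue undefined, use 0 */
    } else if (cmax == r) {
        *h = 60.0f * ((g - b) / delta);
        if (*h < 0.0f) *h += 360.0f;
    } else if (cmax == g) {
        *h = 60.0f * ((b - r) / delta) + 120.0f;
    } else {
        *h = 60.0f * ((r - g) / delta) + 240.0f;
    }
}

/* Threshold an interleaved RGB image into a binary mask.
 * mask[i] is 255 when pixel i lies inside the HSV range, 0 otherwise. */
void hsv_threshold(const uint8_t *rgb, size_t width, size_t height,
                   const HsvRange *range, uint8_t *mask)
{
    for (size_t i = 0; i < width * height; ++i) {
        float h, s, v;
        rgb_to_hsv(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2], &h, &s, &v);
        int inside = h >= range->h_min && h <= range->h_max &&
                     s >= range->s_min && s <= range->s_max &&
                     v >= range->v_min && v <= range->v_max;
        mask[i] = inside ? 255 : 0;
    }
}
```

A single interval per channel cannot express a red range that wraps around 0 degrees; that case would need two hue intervals.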
For convenience I decided to watch the color (source) image and the binary (result) image at the same time, so I used a single window and split the screen horizontally. I did not apply a scaling algorithm, because it would scale the image along both the X and Y axes, whereas I needed to shrink the image along only one axis. Instead, I skipped the odd rows of each image and kept only the even rows. This picture may make it clearer:
Split-screening
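The row-skipping idea can be sketched in C as below. It is only an illustration under some assumptions: `copy_even_rows` and `split_screen` are hypothetical names, both images are assumed to have the same row size in bytes (so the binary image has already been expanded to the pixel format of the color image), and the height is assumed to be even.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy every second row (rows 0, 2, 4, ...) of src into dst.
 * row_bytes is the size of one image row in bytes.
 * Returns the number of rows written. */
static size_t copy_even_rows(const uint8_t *src, size_t height,
                             size_t row_bytes, uint8_t *dst)
{
    size_t out = 0;
    for (size_t y = 0; y < height; y += 2) {
        memcpy(dst + out * row_bytes, src + y * row_bytes, row_bytes);
        ++out;
    }
    return out;
}

/* Build a split-screen frame: the top half shows the source image and the
 * bottom half shows the result image, each reduced to its even rows.
 * out must hold height rows of row_bytes bytes. */
void split_screen(const uint8_t *top, const uint8_t *bottom,
                  size_t height, size_t row_bytes, uint8_t *out)
{
    size_t rows = copy_even_rows(top, height, row_bytes, out);
    copy_even_rows(bottom, height, row_bytes, out + rows * row_bytes);
}
```

Dropping rows halves the height without touching the width, which is why no general scaling routine is needed.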
In some cases the thresholding leaves noise in the image. Because I use mean values for the coordinates of the object center, the noise has little influence on the estimate. Nevertheless, I used an additional filter for noise reduction. A window of a defined size (NxN) is moved over the image, and at each position the number of white pixels inside the window is counted. This count is compared with a threshold expressed as a share of the window cells: for example, with a threshold of 10% and a 10x10 window, the threshold is 10 pixels. If the count is below the threshold, the area of the image under the window is set to zero; if the count is above the threshold, the area is set to white.
As a result I receive the following image:
Filtered image
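The window filter described above could look roughly like this in C. This is a sketch under assumptions, not the repository code: `density_filter` is a hypothetical name, the window is assumed to move in non-overlapping NxN tiles (it could equally slide pixel by pixel), and a count exactly equal to the threshold is treated as white.

```c
#include <stddef.h>
#include <stdint.h>

/* Tile-wise density filter on a binary mask (0 or 255), applied in place.
 * The image is covered by non-overlapping n x n tiles; tiles at the right
 * and bottom borders are clipped to the image. A tile with fewer than
 * min_white white pixels is cleared, otherwise it is filled with white.
 * n must be greater than zero. */
void density_filter(uint8_t *mask, size_t width, size_t height,
                    size_t n, size_t min_white)
{
    for (size_t ty = 0; ty < height; ty += n) {
        for (size_t tx = 0; tx < width; tx += n) {
            size_t y_end = ty + n < height ? ty + n : height;
            size_t x_end = tx + n < width ? tx + n : width;

            /* Count the white pixels inside the current tile. */
            size_t white = 0;
            for (size_t y = ty; y < y_end; ++y)
                for (size_t x = tx; x < x_end; ++x)
                    if (mask[y * width + x]) ++white;

            /* Overwrite the whole tile with the decision. */
            uint8_t fill = white >= min_white ? 255 : 0;
            for (size_t y = ty; y < y_end; ++y)
                for (size_t x = tx; x < x_end; ++x)
                    mask[y * width + x] = fill;
        }
    }
}
```

With a 10x10 window and a 10% threshold, the call would pass `n = 10` and `min_white = 10`.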
The center of the object is estimated as follows:
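Based on the description earlier on this page (the mean of the X and of the Y coordinates of all the white pixels), the center estimate can be sketched in C as follows. The function name `blob_center` is hypothetical, and the mask is assumed to be a single-channel 8-bit image where any non-zero value counts as white.

```c
#include <stddef.h>
#include <stdint.h>

/* Center of a blob as the mean x and mean y of all white pixels.
 * Returns 1 and writes the center to (*cx, *cy), or returns 0 and leaves
 * the outputs untouched when the mask contains no white pixel. */
int blob_center(const uint8_t *mask, size_t width, size_t height,
                double *cx, double *cy)
{
    double sum_x = 0.0, sum_y = 0.0;
    size_t count = 0;

    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            if (mask[y * width + x]) {
                sum_x += (double)x;
                sum_y += (double)y;
                ++count;
            }
        }
    }
    if (count == 0) return 0;

    *cx = sum_x / (double)count;
    *cy = sum_y / (double)count;
    return 1;
}
```

Checking the return value matters in practice, because a frame in which the object is not visible produces an empty mask.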
- Template-Based Tracking
The second algorithm I compared is the Template-based method. The basic idea is to store a template of the tracked object, slide it over the image column by column and row by row, and compute the difference between the template and the image at each position.
As expected, the difference is minimal where the template coincides with the object. There are different ways to measure the difference; most of them are correlation-based similarity measures. The measure I used in this case is the 'Sum of squared differences' (SSD), which in its standard form is:
$$\mathrm{SSD}(x, y) = \sum_{i} \sum_{j} \left( T(i, j) - I(x + i, y + j) \right)^2$$
where T is the template image and I is the source image.
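An exhaustive SSD search can be sketched in C as below. This is a minimal sketch rather than the repository code: `match_template_ssd` is a hypothetical name, both images are assumed to be 8-bit grayscale, and the template is assumed to fit inside the image.

```c
#include <stddef.h>
#include <stdint.h>

/* Exhaustive SSD template search on 8-bit grayscale images.
 * Writes the top-left corner of the best match to (*best_x, *best_y) and
 * returns its SSD score. Assumes tw <= iw and th <= ih. */
uint64_t match_template_ssd(const uint8_t *img, size_t iw, size_t ih,
                            const uint8_t *tpl, size_t tw, size_t th,
                            size_t *best_x, size_t *best_y)
{
    uint64_t best = UINT64_MAX;

    for (size_t y = 0; y + th <= ih; ++y) {
        for (size_t x = 0; x + tw <= iw; ++x) {
            uint64_t ssd = 0;
            /* Stop summing rows once this position cannot beat the best. */
            for (size_t j = 0; j < th && ssd < best; ++j) {
                for (size_t i = 0; i < tw; ++i) {
                    int d = (int)tpl[j * tw + i]
                          - (int)img[(y + j) * iw + (x + i)];
                    ssd += (uint64_t)(d * d);
                }
            }
            if (ssd < best) {
                best = ssd;
                *best_x = x;
                *best_y = y;
            }
        }
    }
    return best;
}
```

The four nested loops make the cost proportional to the image area times the template area, which is consistent with the low frame rates reported for this method.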
- Mean-Shift Tracking
The third algorithm is Mean-Shift. Its theory and math are a little too complex to describe here, but I will try to explain the basic points. As sources I used this article and this lecture, where you can find all the steps leading to the final and most essential equations. The derivation is complicated, but in the end you need just 3-4 equations for the implementation. So, assuming you have read them, I will present my implementation. It differs a little from the approach described in the article, and maybe you will spot the differences. Here is the recipe of the algorithm:
- In the first frame, I use a square box to select the object to track in the source image. I do not normalize to a unit circle or represent the object by an ellipsoidal region in the image. This selection is needed for the procedures applied later.
- I represent the coordinates of the object by its center, which is the center of the image snippet. I keep the matrix coordinate system.
- The object patch is represented as a grayscale image. I generate an Epanechnikov kernel and use it as a mask for filtering the target patch and, later, the candidate patch from the source image. The kernel weights the pixel values so that the central pixels get greater weights and the border pixels get smaller weights.
- Then I compute its probability density function, i.e. its histogram. Matlab has a ready-to-use function for this; in C I wrote it myself. The target histogram is the actual target model, which is used later to search for the object.
- In the second frame, I use the same initial coordinates that I used for initializing the target model. The above procedure is repeated to obtain the candidate model histogram.
- As you may note, I do not estimate the Bhattacharyya coefficient as described in the article, and I skipped steps 5 and 6 of section 4.1, Distance Minimization. I just estimate the 'Wi' weight for every pixel and then the 'MhX' and 'MhY' components of the Mean-Shift vector. In effect, the Mean-Shift vector is estimated from the shift between the two histograms.
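The kernel weighting and histogram steps from the list above can be sketched in C like this. It is an illustration under assumptions: `NUM_BINS`, `epanechnikov` and `weighted_histogram` are hypothetical names, the bin count of 16 is chosen only for the example, and the kernel is used up to a constant factor because the histogram is normalized afterwards.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BINS 16   /* hypothetical bin count: 256 grey levels / 16 */

/* Epanechnikov profile up to a constant factor: 1 - r^2 inside the unit
 * circle and 0 outside. r2 is the squared normalized distance from the
 * patch center. */
static double epanechnikov(double r2)
{
    return r2 < 1.0 ? 1.0 - r2 : 0.0;
}

/* Kernel-weighted, normalized grey-level histogram of a square patch.
 * patch holds size x size 8-bit pixels; hist receives NUM_BINS values
 * that sum to 1. */
void weighted_histogram(const uint8_t *patch, size_t size, double *hist)
{
    double half = (double)(size - 1) / 2.0;   /* patch center */
    double radius = (double)size / 2.0;       /* normalizing distance */
    double total = 0.0;

    for (size_t b = 0; b < NUM_BINS; ++b) hist[b] = 0.0;

    for (size_t y = 0; y < size; ++y) {
        for (size_t x = 0; x < size; ++x) {
            double dx = ((double)x - half) / radius;
            double dy = ((double)y - half) / radius;
            double k = epanechnikov(dx * dx + dy * dy);
            /* Central pixels contribute more than border pixels. */
            hist[patch[y * size + x] * NUM_BINS / 256] += k;
            total += k;
        }
    }
    if (total > 0.0)
        for (size_t b = 0; b < NUM_BINS; ++b) hist[b] /= total;
}
```

Calling the same function on the selected patch in the first frame and on the patch at the same coordinates in the next frame would give the target and candidate histograms respectively.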
Then the new coordinates are computed:
The algorithm is repeated for every frame; each new frame starts from the object coordinates obtained in the previous frame.
Mean-Shift tracking
For my implementation this is enough to track the object very well, so there is no need to halve the shift of the final coordinates or to check whether the difference between Y(k) and Y(k-1) is small enough.
- Results
I will show you the results of the three tracking algorithms. I suppose you can draw your own conclusions about their advantages and disadvantages from the clips.
Here is the result of the Color-Based Tracking algorithm:
The result of the Template-Based Tracking algorithm in Matlab is:
Note that the frame rate is very low, because the algorithm is computationally expensive and very slow. The timing is hard to judge from this clip, because I used a ready-to-use Matlab function that saves each new frame and waits for the frame to be processed.
Here is its equivalent in C:
Note that even though the code is implemented in C, it is still slow and time consuming. Here the timing is visible, because I used a stand-alone screencast application that is independent of the image processing app.
Here is the result of the Matlab Mean-Shift tracking approach:
It is worth paying attention to the frame rate and the execution speed of the code, even though it is implemented in Matlab.
The performance of the C implementation is also satisfactory:
- Source Files
And here is the link to my GitHub repository with the source files:
https://github.com/nrgful/image-tracking