* Note: this post contains videos that were submitted as code output for the relevant classes. The homework itself is not provided, and code is available only on request. Likewise, not all videos for a given homework are shared, and those that are have either been permanently watermarked or are unmistakably mine. Copying this material in any part for academic purposes is not permitted.
Our first class on the subject of Computational Perception broadly taught how to manipulate images and videos into novel image types. More specifically, we learned how to blend, stitch, filter, and compare images in a variety of ways that facilitate the creation of complex panoramas and HDR photographs, along with a variety of other applications. Feature tracking was highlighted through keypoint detection and matching, and was used to match objects between photographs, as well as to stitch multiple views of a scene together (shown on the right). We also covered Fourier analysis and edge and seam detection, laying a foundation for advanced mathematical processing of images.
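To give a flavor of the blending step (this is my own minimal sketch in NumPy, not the course code, and it assumes two same-height grayscale arrays whose edges depict the same overlapping strip), a simple cross-fade across a seam looks like this:

```python
import numpy as np

def blend_overlap(left, right, overlap):
    """Cross-fade two same-height grayscale images across an overlapping strip.

    The last `overlap` columns of `left` and the first `overlap` columns of
    `right` are assumed to depict the same region; they are blended with a
    linear alpha ramp to hide the seam.
    """
    # Alpha ramps from 1 (all `left`) down to 0 (all `right`) across the seam.
    alpha = np.linspace(1.0, 0.0, overlap)
    seam = alpha * left[:, -overlap:] + (1.0 - alpha) * right[:, :overlap]
    return np.hstack([left[:, :-overlap], seam, right[:, overlap:]])

# Two constant "images" blend into a smooth ramp across the 3-column seam.
a = np.full((4, 6), 10.0)
b = np.full((4, 6), 20.0)
out = blend_overlap(a, b, overlap=3)
```

Real panorama pipelines refine this with multi-band (Laplacian pyramid) blending, but the alpha-ramp idea is the same.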
We also learned the fundamentals of camera design, camera geometry and stereoscopic vision, lighting and the physics of reflection and refraction, and a variety of camera properties required for proper photographic processing. It was exciting to learn how real-world electronics and physical properties affect our ability to build computer vision models.
As an advanced application of these techniques, I built a particle filter program that can track a template image in 2D space. I particularly enjoyed learning this method, since it also has applications in robot localization and machine learning. Because Romney moves only slightly left to right, and hardly at all along the z axis, this application is generally robust to noise over relatively long periods of time. I show both pixel and histogram tracking for a variety of results in the video on the right.
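The core predict/weigh/resample loop of such a tracker can be sketched as follows. This is a simplified illustration rather than the graded code: the toy scene, particle counts, and noise parameters are all assumptions, and similarity is scored with a Gaussian of the patch-vs-template MSE.

```python
import numpy as np

rng = np.random.default_rng(0)

def track_step(particles, img, template, sigma_dyn=5.0, sigma_mse=50.0):
    """One predict/weigh/resample cycle of a 2D particle filter.

    particles: (N, 2) array of candidate (row, col) window centers.
    Returns the resampled particles and their mean as the state estimate.
    """
    n = len(particles)
    th, tw = template.shape
    h, w = img.shape
    # Predict: diffuse each particle with Gaussian motion noise.
    particles = particles + rng.normal(0.0, sigma_dyn, particles.shape)
    # Weigh: compare the template against the patch under each particle.
    weights = np.empty(n)
    for i, (r, c) in enumerate(particles):
        r0 = int(np.clip(r - th // 2, 0, h - th))
        c0 = int(np.clip(c - tw // 2, 0, w - tw))
        patch = img[r0:r0 + th, c0:c0 + tw]
        mse = np.mean((patch - template) ** 2)
        weights[i] = np.exp(-mse / (2.0 * sigma_mse ** 2))
    weights /= weights.sum()
    # Resample: keep particles in proportion to their weights.
    resampled = particles[rng.choice(n, size=n, p=weights)]
    return resampled, resampled.mean(axis=0)

# Toy scene: a bright 8x8 square centered at (30, 40) in a dark frame.
img = np.zeros((100, 100))
img[26:34, 36:44] = 255.0
template = np.full((8, 8), 255.0)
particles = rng.uniform(20.0, 60.0, size=(300, 2))
for _ in range(15):
    particles, est = track_step(particles, img, template)
```

After a few iterations the particle cloud collapses onto the bright square, and the weighted mean serves as the tracked position.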
Further into the compilation on the right, we shift to tracking Romney's hand. Because of the more dynamic movement, we now implement a gradual re-evaluation of the template image (top-left) as the tracking progresses. This approach is less robust to noise, and still only accounts for 2D transformations; even so, it tracks the hand movements surprisingly well.
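The gradual template re-evaluation amounts to an exponential moving average between the old template and the best-matching patch from the current frame. This sketch is my own simplification, and the blend weight `alpha` here is an illustrative value, not the tuned one from the assignment:

```python
import numpy as np

def update_template(template, best_patch, alpha=0.1):
    """Blend the best-matching patch into the running template.

    A small alpha keeps the template stable; a large alpha adapts quickly
    to appearance changes but risks drifting onto the background.
    """
    return alpha * best_patch.astype(float) + (1.0 - alpha) * template

# Starting from a blank template, repeated updates converge toward the patch.
template = np.zeros((4, 4))
patch = np.full((4, 4), 100.0)
for _ in range(3):
    template = update_template(template, patch)
```

The drift risk is the key trade-off: once a bad patch is blended in, later matches reinforce the error, which is exactly why this variant was less robust to noise.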
In my final demonstration, I expanded the tracking software above to account for a z axis of motion and to be robust to long occlusions. To do this, I needed to give every particle a z coordinate and scale the template to each particle's apparent size based on its z distance before comparing crops. I also had to manually tune the particles' noise and selection parameters based on performance. Notice that the box tends to wander off right at the end (the video is cropped at the point of failure), and that the particles are packed much more tightly than in the previous video right off the bat. This is because the particles now occupy and center around a 3D volume, so their 2D variance is reduced. They also likely favor the z dimension in their exploratory wandering, due to the inaccuracies inherent in scaling the template for comparison.
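The per-particle template scaling can be sketched with a nearest-neighbor resize, under the assumption (mine, for illustration) that apparent size falls off roughly as 1/z relative to a reference depth:

```python
import numpy as np

def scale_template(template, z, z_ref=1.0):
    """Resize the template for a particle at depth z.

    Apparent size scales roughly as z_ref / z: a particle twice as far
    away compares the frame against a half-size template, and vice versa.
    """
    s = z_ref / z
    h, w = template.shape
    nh, nw = max(1, int(round(h * s))), max(1, int(round(w * s)))
    # Nearest-neighbor resampling via integer index maps.
    rows = (np.arange(nh) * h / nh).astype(int)
    cols = (np.arange(nw) * w / nw).astype(int)
    return template[np.ix_(rows, cols)]

t = np.arange(64, dtype=float).reshape(8, 8)
small = scale_template(t, z=2.0)  # farther away: half-size template
big = scale_template(t, z=0.5)    # closer: double-size template
```

The resampling error this introduces is one source of the scale-comparison inaccuracy mentioned above, which is why the z dimension needed extra tuning.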
This is just a small subset of our work in this program, since assignments must remain private for the online program to function. All in all, we learned quite a bit in these courses, and I now feel competent handling most Computer Vision problems. However, our courses did not cover Deep Learning methods, so I have bought the PyImageSearch book Deep Learning for Computer Vision Professional and will be cataloging my work through it in detail. I hope you will look forward to those posts and have enjoyed the samples demonstrated so far. Other material and code is available upon request.
Thumbnail Attribution: Wikimedia