Setting up for HAAR cascade training in OpenCV

OpenCV has been quite an alluring open source project for some time now. The goal of the project is to provide an easy-to-use collection of computer vision algorithms, but the aspect that interests me the most is machine learning. Being able to train a computer to recognize a thing or a person within a video feed is, in my view, going to be critical in the years ahead.

Since I plan on running this on Linux virtual machines, I followed the instructions for installing on Linux, which was fairly straightforward. But I always have my doubts, so I did some investigating to find out how others were faring with installing and running OpenCV. To put it mildly, there are a lot of people who are confused and never get this running.

I feel like a lot of this confusion comes from the number of articles describing the installation process, and from the OpenCV website not being up to date on all of the steps needed for the best installation experience. With version 3.0 in alpha, now would be a good time for them to aggregate what we all have gone through and make for a more pleasant experience. The differences, though, are really just minor.

After going through the instructions on the website, I found only a few things that would be good for them to update. Since I only went through the Linux installation, my comments apply solely to those instructions.

Install ffmpeg with libraries before OpenCV
There are a few dependencies that come down automatically, but there are also some optional libraries that will enhance OpenCV and ffmpeg. If you are the kind of person who likes compiling from source you can do that as well, but the version of ffmpeg available via Linux package managers usually does the job.
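
Before starting the OpenCV build, it can help to confirm that ffmpeg is actually on the PATH. The snippet below is a minimal sketch using only the Python standard library; the helper name find_tool is my own and not part of OpenCV or ffmpeg.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("ffmpeg")
    if path is None:
        print("ffmpeg not found on PATH; install it before building OpenCV")
    else:
        print("ffmpeg found at", path)
```

This only checks that the binary is present. It does not verify which optional libraries ffmpeg was built with.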

Ensure your machine has at least 2GB of RAM available for OpenCV
Sure, you can run this on just about any machine configuration, but everything performs better with more memory.
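
A quick way to check the 2GB figure on a Linux VM is to read /proc/meminfo. The following is a rough sketch, assuming a Linux kernel that reports MemAvailable (older kernels only report MemFree, so the code falls back to that). The function names are mine, chosen for illustration.

```python
import os

REQUIRED_KB = 2 * 1024 * 1024  # the 2GB suggestion, expressed in kB


def available_memory_kb(meminfo_text):
    """Parse /proc/meminfo-style text and return available memory in kB.

    Prefers MemAvailable and falls back to MemFree; returns None if neither is present.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    if "MemAvailable" in fields:
        return fields["MemAvailable"]
    return fields.get("MemFree")


def has_enough_memory(meminfo_text, required_kb=REQUIRED_KB):
    """Return True if the reported available memory meets the threshold."""
    available = available_memory_kb(meminfo_text)
    return available is not None and available >= required_kb


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print("Enough memory for OpenCV:", has_enough_memory(f.read()))
```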

Stop requiring additional application compiles
When doing HAAR training, part of getting set up is compiling a third-party application against your OpenCV installation. This application code has been in use within the community for at least three years and should be brought into the core and compiled when installing OpenCV.

So far, working with OpenCV is going well for me. At the time of this writing I have a VM in the cloud on stage 15 of 20 in training a cascade file. This was not without its own share of problems, but more on that later.
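
For readers wondering what a 20-stage run looks like, here is a hedged sketch of how such a training command could be assembled. It assumes the opencv_traincascade tool that ships with OpenCV is the one being used, which may not match every setup. The file names, sample counts, and window size are placeholders for illustration only.

```python
def build_traincascade_command(data_dir, vec_file, bg_file, num_pos, num_neg,
                               num_stages=20, width=24, height=24):
    """Build an argument list for opencv_traincascade (assumed to be installed).

    All paths and counts are supplied by the caller; nothing here is run.
    """
    return [
        "opencv_traincascade",
        "-data", data_dir,            # output directory for the cascade stages
        "-vec", vec_file,             # positive samples packed into a .vec file
        "-bg", bg_file,               # text file listing negative images
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-numStages", str(num_stages),
        "-featureType", "HAAR",
        "-w", str(width),
        "-h", str(height),
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from any real training run.
    cmd = build_traincascade_command("cascade_out", "positives.vec",
                                     "negatives.txt", 1000, 2000)
    print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) once the sample files exist. Building the list separately makes it easy to review the parameters before committing a VM to a long run.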

Thanks to Soulhuntre over at Samurai Developer for lending a hand to help get through some of the rough spots.