Sunday, June 19, 2011

Week 12 Sunday

~ Sunday ~


OpenCV on Mac OS X

Installed OpenCV on the Mac. The process is about the same as on Linux (thankfully), except that you install the Xcode package instead of using yum as on Fedora, and you install pkg-config manually:

Download pkg-config for Mac OS X: http://mac.softpedia.com/get/Developer-Tools/pkg-config.shtml
OR direct link
http://pkgconfig.freedesktop.org/releases/pkg-config-0.23.tar.gz

Unpack it wherever you like.
Go into that directory, then build and install:
$ cd <pkg-config unzipped dir>
$ ./configure
$ make
$ sudo make install

Then set the environment variables just as in Linux:

$ export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
$ export LD_LIBRARY_PATH=/usr

Check it by running these two commands. The output should look similar to this, pointing into your /usr/local/...:

$ pkg-config --cflags opencv
-I/usr/local/include/opencv -I/usr/local/include

$ pkg-config --libs opencv
-L/usr/local/lib -lopencv_core -lopencv_imgproc -lopencv_highgui -lopencv_ml -lopencv_video -lopencv_features2d -lopencv_calib3d -lopencv_objdetect -lopencv_contrib -lopencv_legacy -lopencv_flann

For the compile lines to use in Makefile, see the Linux installation guide posted before.
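
If you want to check the pkg-config output without reading it by eye, here is a minimal Python sketch. The function names split_pkg_config_flags and query_pkg_config are my own for illustration, not part of pkg-config or OpenCV. It only sorts the flags into include directories, library directories, and library names, assuming each flag is written without a space after -I, -L, or -l, as in the output above.

```python
import shlex
import subprocess


def split_pkg_config_flags(output):
    """Sort pkg-config output into include dirs, library dirs, and library names."""
    include_dirs, lib_dirs, libs = [], [], []
    for flag in shlex.split(output):
        if flag.startswith("-I"):
            include_dirs.append(flag[2:])
        elif flag.startswith("-L"):
            lib_dirs.append(flag[2:])
        elif flag.startswith("-l"):
            libs.append(flag[2:])
    return include_dirs, lib_dirs, libs


def query_pkg_config(package):
    """Run pkg-config for one package and return its combined flags as text."""
    result = subprocess.run(
        ["pkg-config", "--cflags", "--libs", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Shortened sample copied from the output shown in this post.
sample = (
    "-I/usr/local/include/opencv -I/usr/local/include "
    "-L/usr/local/lib -lopencv_core -lopencv_imgproc -lopencv_highgui"
)
include_dirs, lib_dirs, libs = split_pkg_config_flags(sample)
print("include dirs:", include_dirs)
print("library dirs:", lib_dirs)
print("libraries:   ", libs)
```

On a machine where pkg-config is set up, split_pkg_config_flags(query_pkg_config("opencv")) should give the live values instead of the sample.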


Xcode is so big... - - a 4 GB download that took 3+ hours, and a 10 GB install... O_O They should offer optional packages so that we can choose to download only the UNIX development package...

Some of the built-in tests in test_cv.sh failed though, with the errors FAIL(Bad accuracy) and FAIL(Invalid test data). Not sure why; the same process on Linux didn't give me any errors. test_cxcore.sh and test_ml.sh both passed, so I think I'll just ignore the ones that failed in test_cv.sh.

Subversion

Trying to check out the repository on a Mac, which I committed from Linux. Had a bunch of problems with Subversion. They seemed to be caused by files with the same name but different capitalization. Linux allows this because its file system is case-sensitive, but other systems don't, so SVN kept giving errors as I fixed and re-updated, fixed and re-updated. It seemed to cycle through 3 files.

Initial error looks like:

svn: In directory '.'
svn: Can't open file '.svn/tmp/text-base/foodRecog.cpp.svn-base': No such file or directory


That means there's a file with the same name but different capitalization. Just delete the problematic file from the repository with:
$ svn rm http://repositoryURL/theProblemFile

Run update again. Most likely it will refuse to run, saying:

svn update
svn: Working copy '.' locked
svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for details)


Then run
$ svn cleanup
If it doesn't let you run, then do
$ rm .svn/log

Then svn cleanup should run, then svn update should run.
Solve the problem the same way for any other files that give the same error. You may have to just delete the local copy, make a new folder, and check out again.
Otherwise your update may keep cycling through these kinds of errors for several files, even though there is nothing wrong with those files:


svn: Failed to add file 'myResize.sh': an unversioned file of the same name already exists


OR


svn: In directory '.'
svn: Can't move source to dest
svn: Can't move '.svn/prop-base/tempfile.7.tmp' to '.svn/prop-base/Makefile.svn-base': No such file or directory
(this error also means the same capitalization problem. But if this is not the initial error, this file might have no problem at all, it's just that you need to delete this entire folder, make a new one, and do a clean checkout, after you resolve the problems above with the other files. This was the case for me.)


Turns out the way to solve it is that after removing the offending file, don't update in the same directory. Delete it, then do a clean checkout in a new directory. That solved it :) Geez. Now hopefully it'll run on the Mac and we can start expanding our database.
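
To catch this kind of clash before it reaches a case-insensitive machine, a quick check can be run on the Linux working copy. This is a minimal Python sketch, not something I used at the time. The function names are my own, and the file name FoodRecog.cpp in the example is made up to stand in for whatever the clashing file was.

```python
import os
from collections import defaultdict


def find_case_collisions(paths):
    """Group paths that differ only by capitalization; return groups of 2 or more."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.lower()].append(path)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


def list_working_copy(root):
    """Yield file paths relative to root, skipping Subversion's .svn directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            yield os.path.relpath(os.path.join(dirpath, name), root)


# Hypothetical listing: FoodRecog.cpp is an invented name for illustration.
example = ["foodRecog.cpp", "FoodRecog.cpp", "Makefile", "myResize.sh"]
for group in find_case_collisions(example):
    print("clash:", group)
```

Running find_case_collisions(list_working_copy(".")) from the top of the working copy lists every group of files that would land on the same name on a Mac.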

Amazon Mechanical Turk

Thought about it. Looked at it. I think I'll still do some by myself and won't use it until I can't handle it. See how fast I am. If I'm just not fast enough, then I might just start using it. Not too sure how to set up the labeling yet though. I should do some tasks to try out how it works.

Friday, June 10, 2011

Week 11 Friday - Quarter Final Report

~ Friday ~

Final report for the course is here http://acsweb.ucsd.edu/~mezhang/cse190/MabelZhang_cse190_FinalReport.pdf

(Temporary location, might get wiped after we graduate in a few days... Will move to Google sites later.)

Friday, June 3, 2011

Week 10 Friday - The Plan

~ Friday ~

In order to submit to a workshop, we must expand the depth in all directions, including:
  • Getting more cuisines and more dishes for the database
  • Looking closer at the different SVM kernels and how well they can work
  • Different low-level feature descriptors (currently using EMD; there are also RGB histograms and other features that preserve color information, and then SIFT and others that discard color information, so it might be worth comparing preserving vs. discarding color)
  • Maybe the attribute (ingredient) types we chose and how effective they are (this requires relabeling if they change)
  • Any other part of the pipeline
The top concern is expanding the database, because labeling takes a lot of time, especially when we expand to more cuisines and more dishes. We might even consider Mechanical Turk from Amazon. Hmm I haven't used it before so I'll have to look at how to set it up, etc.

So we already have the code to grab images from Flickr, but we've found that Google Images results are much clearer and higher quality than the Flickr ones, so it would be good to have batch downloading code or software for Google Images.
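
For the downloading half of that, here is a minimal Python sketch, assuming a list of image URLs is already in hand. Getting that list out of Google Images is the hard part and is not covered here. The names local_name and download_all are my own, and the fetch parameter is only there so the downloader can be swapped out or tested without a network connection.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name(url, index):
    """Build a numbered file name that keeps the URL's extension (default .jpg)."""
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return f"{index:04d}{ext}"


def download_all(urls, dest_dir, fetch=urlretrieve):
    """Download each URL into dest_dir; return (saved paths, failed (url, reason) pairs)."""
    os.makedirs(dest_dir, exist_ok=True)
    saved, failed = [], []
    for index, url in enumerate(urls):
        target = os.path.join(dest_dir, local_name(url, index))
        try:
            fetch(url, target)
            saved.append(target)
        except OSError as err:  # urllib's URLError and HTTPError are OSError subclasses
            failed.append((url, str(err)))
    return saved, failed
```

Numbering the files avoids clashes when two URLs end in the same file name, and collecting the failures instead of stopping lets one dead link pass without killing the whole batch.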

Resources I've found:

// Seems like it could only do the first page
Google Image Ripper - online service
http://www.dearcomputer.nl/gir/
Linked from
http://labnol.blogspot.com/2006/07/how-to-leech-pictures-from-flickr-or.html
"This online service extracts the full size images [no thumbnails] from Google index and displays them in one page. You can then save the full page with attachments to build your offline gallery of Google Images."

// Hmm didn't work, probably old software
MultiImageDownloader (freeware, looks like this download is Windows only?)
http://www.addictivetips.com/windows-tips/download-google-images-in-bulk/
Download page
http://www.freewarefiles.com/MultiImageDownloader_program_55357.html
Its Developer's page
http://chesterway.co.uk/

// Trial version only allows 30 images to be downloaded...
WebImageGrab (formerly Googlegrab, Windows & Mac OS)
Review 

Batch Image Downloader - Firefox Addon on Google Code, Downloads selected images on a webpage
http://code.google.com/p/batch-images-downloader/

Review articles about various tools to batch download images from the web
http://www.ghacks.net/tag/download-images/


My second concern is that once the database is expanded, the algorithm won't work anymore! Hahaha... yeah. We'll see.

The other closer looks listed above require some additions to the code, such as calculating other feature descriptors, plus rerunning with different SVM parameters and other experiments.

Wednesday, June 1, 2011

Week 10 Wednesday - The Results!

~ Wednesday ~

I think it works!!!!!

Hold on.

Layer 1: SVM, RBF kernel
Layer 2: SVM, 3rd degree Polynomial

Final result (i.e. layer 2 cuisine classification) with cross validation:
Fold    Accuracy
0       60.000000
1       86.666664
2       95.555557
3       86.666664
4       85.416664
mean    82.861110
var     143.854935
stdev   11.993954
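
As a sanity check on the summary rows, here is a minimal Python sketch that recomputes them from the five fold accuracies. It assumes the variance was computed as a population variance (dividing by 5, not 4), which is what reproduces the var row. The sample standard deviation is printed only for comparison.

```python
import statistics

# Fold accuracies (percent) copied from the 3rd degree polynomial table.
fold_accuracy = [60.000000, 86.666664, 95.555557, 86.666664, 85.416664]

mean = statistics.mean(fold_accuracy)
var = statistics.pvariance(fold_accuracy)        # population variance: divide by n
stdev = statistics.pstdev(fold_accuracy)         # population standard deviation
sample_stdev = statistics.stdev(fold_accuracy)   # sample version: divide by n - 1

print(f"mean         {mean:.6f}")
print(f"var          {var:.6f}")
print(f"stdev        {stdev:.6f}")
print(f"sample stdev {sample_stdev:.6f}")
```

With only 5 folds the sample version is noticeably larger, so it is worth stating which one a reported number uses.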

Q1: How do I know whether it's overfitting? Can I check somehow? If I do 5th degree polynomial, then it's lower:
Layer 2: SVM, 5th degree Polynomial
Fold    Accuracy
0       51.111111%
1       77.777779%
2       84.444443%
3       77.777779%
4       83.333336%
mean    74.888890%
var     148.938279
stdev   12.204027
When I used an RBF kernel for layer 2, it didn't work at all: almost everything got classified as a single class, so the accuracy was around 33.33%.

Q2: The variance looks really large. Fold 0 for some reason doesn't work well.