Monday, April 25, 2011

Semaine 5 (Week 5) lundi

~ lundi ~

Test data: 3 classes, 19 images each:
bibimbap
pasta
sushi
Some images are cropped to include only the most relevant area.


Bibimbap (incomplete, collage needs to be updated):

Pasta:

Sushi:

Baseline 1: Global Color Histogram

20 bins
63.2% accuracy




Baseline 2: EMD (in progress)

Clusters shown with their ID numbers:


Each cluster's (r, g, b) values are saved as a 1D matrix, and each cluster is labeled as an ingredient; there are ~6 ingredient labels total, shared globally across all images.

Load each cluster for each image; where an ingredient doesn't exist in an image, the matrix value is -1. Calculate the EMD between each pair of clusters (ingredients) within one image.
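As a sketch of how one pairwise EMD might be computed (illustrative only: the actual code links against Rubner's C implementation, referenced in the April 13 entry below; makeSignature and clusterEMD are hypothetical names, and cv::EMD is the equivalent call in newer OpenCV versions):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

// Pack one cluster's dominant colors into a signature matrix,
// each row holding (weight, r, g, b) as CV_32F.
cv::Mat makeSignature(const std::vector<cv::Vec3f>& colors,
                      const std::vector<float>& weights)
{
    cv::Mat sig((int)colors.size(), 4, CV_32F);
    for (int i = 0; i < sig.rows; ++i) {
        sig.at<float>(i, 0) = weights[i];    // mass (e.g. pixel count) of this color mode
        sig.at<float>(i, 1) = colors[i][0];  // r
        sig.at<float>(i, 2) = colors[i][1];  // g
        sig.at<float>(i, 3) = colors[i][2];  // b
    }
    return sig;
}

// EMD between two ingredient clusters; ground distance is Euclidean in RGB.
float clusterEMD(const cv::Mat& sigA, const cv::Mat& sigB)
{
    return cv::EMD(sigA, sigB, CV_DIST_L2);
}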

Ground truth is loaded from a manually labeled file, with each cluster labeled as an ingredient.

Each grayscale image is output with cluster ID numbers overlaid to make labeling more convenient, so that the labeler can match up each cluster # with an ingredient #. Noisy clusters are discarded and labeled -1. The background cluster is also labeled -1, i.e. no ingredient.

Training data layout:

Matrix dimensions: (number of ingredient samples from all images) x (number of ingredient types).
Each row is one ingredient sample; each column holds the EMD distance from that sample to one ingredient in the same image.
If the ingredient is non-existent in the image, the distance is -1.
Distance to self is 0.

img 1 ing 1: EMD to ing1(0), ing2(-1), ing3, ing4(-1), ing5(-1), ing6 | 1
img 1 ing 3: EMD to ing1, ing2(-1), ing3(0), ing4(-1), ing5(-1), ing6 | 3
img 1 ing 6: EMD to ing1, ing2(-1), ing3, ing4(-1), ing5(-1), ing6(0) | 6
img 2 ing 2: EMD to ing1(-1), ing2(0), ing3(-1), ing4(-1), ing5, ing6(-1) | 2
img 2 ing 5: EMD to ing1(-1), ing2, ing3(-1), ing4(-1), ing5(0), ing6(-1) | 5
img 3 ... : ...
...
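A minimal sketch of filling one such training row (NUM_INGREDIENTS, selfId, present, and sig are hypothetical placeholder names; clusterEMD is the sketch above):

// One row per ingredient sample; ingredients missing from the image stay -1.
cv::Mat row(1, NUM_INGREDIENTS, CV_32F, cv::Scalar(-1.0f));
row.at<float>(0, selfId) = 0.0f;               // distance to self is 0
for (size_t k = 0; k < present.size(); ++k) {  // ingredients present in this image
    int j = present[k];
    if (j != selfId)
        row.at<float>(0, j) = clusterEMD(sig[selfId], sig[j]);
}
// The label after the "|" (the sample's own ingredient id) goes into a
// separate responses column for the classifier.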


(
Plan for later:

Save each image as 1 row in training data, row format:

| ing1 emd dist to ing2 3 4 5 | ing2 dist to 1 3 4 5 | ing3 nonexist | ing4 dist to 1 2 3 5 | ing5 nonexist |
)



EMD and the training matrix are implemented.
Currently dealing with some memory issues at runtime; using Valgrind to track them down.


Still need to label all the image clusters with the corresponding ingredient id.

Monday, April 18, 2011

Semaine 4 (Week 4) lundi

~ lundi ~

Baseline 1 (Global Color Histogram): the code is working. Testing on some images yields very bad prediction results, which is expected, because it goes by color alone. Many dishes can have the same colors but totally different spatial layouts, and a global color histogram loses that spatial information, unlike local color histograms. Hence the bad results.
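For reference, a minimal sketch of what this baseline computes (my reconstruction, assuming 20 bins per BGR channel concatenated into one feature vector; the actual code may differ):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Global color histogram: one bins-bin histogram per BGR channel,
// L1-normalized and flattened into a single 1 x (3*bins) row vector.
cv::Mat globalColorHist(const cv::Mat& bgr, int bins = 20)
{
    cv::Mat feature(1, 3 * bins, CV_32F);
    float range[] = {0, 256};
    const float* ranges[] = {range};
    for (int ch = 0; ch < 3; ++ch) {
        cv::Mat hist;
        int channels[] = {ch};
        cv::calcHist(&bgr, 1, channels, cv::Mat(), hist, 1, &bins, ranges);
        cv::normalize(hist, hist, 1.0, 0.0, cv::NORM_L1);  // invariant to image size
        for (int b = 0; b < bins; ++b)
            feature.at<float>(0, ch * bins + b) = hist.at<float>(b, 0);
    }
    return feature;  // note: all spatial information is gone
}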

Lessons learned in OpenCV and debug statements specifically:
  • When accessing matrix elements with at<uchar>(y,x) or at<float>(y,x), make sure the type inside the angle brackets <> matches the type of the matrix... My print() function assumed <uchar>, so when I tried to debug my CV_32SC1 matrix, it kept printing out 1 0 0 0 2 0 0 0 3 0 0 0... instead of 1 2 3... It took me way longer than it should have to figure out that my matrix was correct; it was just my debug statement that was wrong (see the snippet after this list).
    • I know I've had this problem before, that was one of the things that made me waste a lot of time and dislike OpenCV. The other thing is the bad documentation - -
  • Em, when printing with fprintf(), remember to supply it with an argument...? I was wondering why my SVM prediction results were all 0s, couldn't figure it out, and fell asleep. In the morning I changed all the parameters to values from a forum post, and then found that my printout statement was fprintf(stderr, "%f\n"); ... I had just forgotten to supply an actual float value.
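The at<> mismatch above in miniature (a hypothetical snippet, assuming a little-endian x86 machine like this CentOS box):

cv::Mat m = (cv::Mat_<int>(1, 3) << 1, 2, 3);   // type CV_32SC1
for (int i = 0; i < 3; ++i)
    printf("%d ", m.at<int>(0, i));             // correct: prints 1 2 3
// A print loop that reads the same data with at<uchar> walks byte-by-byte
// through the 4-byte ints, printing 1 0 0 0 2 0 0 0 3 0 0 0 instead.
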
TODO:
Going to test on a larger set of images and record the results.
Then move on to EMD. The code and example already compile and run, but I need to link them to my code and data.

Wednesday, April 13, 2011

Semaine 3 (Week 3) mercredi

~ mercredi ~

Baseline 1:
Global color histogram

The difference between local and global color histograms is explained in this paper, which uses local color histograms: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.23.4215&rep=rep1&type=pdf
Abstract:
“Global color histograms are well-known as a simple and often effective way to perform color-based image retrieval. However, it lacks spatial information about the image colors. The use of a grid of cells superimposed on the images and the use of local color histograms for each such cell improves retrieval in the sense that some notion of color location is taken into account. In such an approach however, retrieval becomes sensitive to image rotation and translation. In this thesis we present a new way to model image similarity, also using colors and a superimposing grid, via bipartite graphs. As a result, the technique is able to take advantage of color location but is not sensitive to rotation and translation. Experimental results have shown the approach to be very effective. If one uses global color histograms as a filter then our approach, named Harbin, becomes quite efficient as well (i.e., it imposes very little overhead over the use of global color histograms).”
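A sketch of the local variant the abstract describes, reusing the globalColorHist sketch from the April 18 entry above (n is the grid size; border pixels beyond the grid are ignored):

// One color histogram per cell of an n x n grid, preserving coarse
// spatial layout at the cost of rotation/translation sensitivity.
std::vector<cv::Mat> localColorHists(const cv::Mat& bgr, int n, int bins)
{
    std::vector<cv::Mat> hists;
    int cw = bgr.cols / n, ch = bgr.rows / n;
    for (int gy = 0; gy < n; ++gy)
        for (int gx = 0; gx < n; ++gx) {
            cv::Rect cell(gx * cw, gy * ch, cw, ch);
            hists.push_back(globalColorHist(bgr(cell), bins));
        }
    return hists;
}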

Baseline 2:
Earth Mover's Distance. Code (C): http://www.cs.duke.edu/~tomasi/software/emd.htm [6]

Fast EMD Code (C++, Matlab, Java): http://www.cs.huji.ac.il/~ofirpele/FastEMD/code/


[6] Y. Rubner, C. Tomasi, and L. J. Guibas. A Metric for Distributions with Applications to Image Databases. Proceedings of the 1998 IEEE International Conference on Computer Vision, Bombay, India, January 1998, pp. 59-66. http://www.cs.duke.edu/~tomasi/papers/rubner/rubnerIccv98.pdf

Sunday, April 10, 2011

Semaine 2 (Week 2) dimanche

~ dimanche ~

Bah, it's very nice out today.... but I'm sick... - -

Ran the textonizer (from Google Code) and it seems to work quite well on the test image [5]. It does its own segmentation, calculates the textons, and appears to use k-means to cluster the segments. Its output is an integer array whose values indicate which cluster each pixel belongs to; e.g. if the number of clusters == 4, then the output values are 0, 1, 2, 3, and 255, where 255 is the background cluster.

Parameters that the user can tune include the number of clusters, a pixel value marking pixels that should be clustered as background (value 255), and the texton size (default 30). The system comes with a nice .doc file that specifies the parameters and explains the algorithm like a paper.
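For example, given that output array (one int per pixel; the flat layout here is my assumption), cluster sizes can be tallied while skipping the background:

// Count pixels per cluster, ignoring the 255 background cluster.
std::vector<int> clusterSizes(const int* labels, int numPixels, int numClusters)
{
    std::vector<int> sizes(numClusters, 0);
    for (int i = 0; i < numPixels; ++i)
        if (labels[i] != 255)
            ++sizes[labels[i]];
    return sizes;
}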

Now that the similar-looking ingredients are clustered into the same group, we have better data than raw segments. Probably don't need JSEG anymore.

Problems:

NUM_CLUSTER is specified by the user. How do I get the optimal number of clusters? Ideally it should equal the number of ingredients in the dish, but how does the program know how many ingredients are in the dish before clustering occurs? Two ways:
1. Specify ONE constant number and hope it works well enough for all images.
2. Do some kind of guessing algorithm to get a good number? How do we guess?

Background: the background and the plate are correctly clustered separately from the ingredients, which is good. However, that's 2 clusters, not 1, because the background's appearance is not the same as the plate's. How do we prune the plate out of the data? Two ways:
1. Edit the images so that there is no table in the image, only the plate and food. This works because the table can have any color and texture; removing it up front leaves only one background cluster, the plate, which we can rule out easily in the algorithm. However, this is a lot of manual work.
2. In the algorithm, figure out which cluster is the plate and rule it out, along with the background cluster. But how do we figure out which cluster is the plate...?

Original and clustered images according to textons:

Now that we have the clusters, we need to label them with ingredient categories and send them into training.

First, we need to get the matrix out of the textonizer's internal data.

Each matrix has size (number of pixels) x 15, for a texton size of 30. Each pixel is assigned to a cluster (an ingredient). Each image has 4 clusters (ingredients), with cluster values 0, 1, 2, 3, and 255 (the background cluster).

Each pixel has its cluster as its label, and each cluster corresponds to an ingredient. Cluster numbers are not consistent across images: the same number may refer to different ingredients, depending on the image.

Pick 10 images from the dataset and output each image's texton vectors into a file *_textons.txt. Each row in the file is a 1x15 row vector representing that pixel's texton.

Output each image's cluster assignments (the answer given by k-means) into a file *_labels.txt.

TODO:
Hand-replace the cluster numbers in each of the 10 _labels.txt files with a number from 0-7 (assuming we have 8 ingredients total across the 10 images).

We can actually set the right number of clusters for each image, and train the ingredients using very accurate ingredient clusters.

Now we have 8 labeled ingredients. Feed them into an SVM for training and save the trained model.

For testing, do much the same thing, except there won't be labels; feed the data to the SVM to get predicted labels.
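A minimal sketch of that train/predict step with OpenCV 2.2's CvSVM (the parameter values are placeholders; trainData is assumed to be N x 15 CV_32FC1 texton vectors and labels an N x 1 CV_32FC1 column of the hand-edited 0-7 ingredient ids):

#include <opencv2/ml/ml.hpp>

// Train on the hand-labeled texton vectors and save the model.
CvSVMParams params;
params.svm_type    = CvSVM::C_SVC;
params.kernel_type = CvSVM::RBF;   // placeholder kernel choice
params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 1000, 1e-6);

CvSVM svm;
svm.train(trainData, labels, cv::Mat(), cv::Mat(), params);
svm.save("ingredient_svm.xml");

// Testing: one prediction per pixel's 1 x 15 texton vector.
float predictedIngredient = svm.predict(testRow);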

[5] Texton-izer : An Irregular Textonizer. http://code.google.com/p/texton-izer/

Saturday, April 9, 2011

Just for Me: ImageMagick Installation Manual for CentOS

ImageMagick 6.6.9-4

$ export LDFLAGS="-L/usr/local/lib -Wl,-rpath,/usr/local/lib"
$ export LD_LIBRARY_PATH="/usr/local/lib"
(Without this, running "display" fails with an error message about a .so.4 library file not being found.)

$ ./configure
$ make
$ make install

To check installation:
$ display

To check configuration:
$ identify -list configure
To check what image formats are supported:
$ identify -list format

To run a more comprehensive test suite:
$ make check
Should output after all the tests:
===================
All 48 tests passed
===================


* There is a special rpm instruction for Linux installation in the downloaded zip's Install-unix.txt. Didn't try it, but it would probably work.

Semaine 2 (Week 2) samedi

~ samedi ~

Textons

The texton code found online runs in Matlab, but I'm not sure how to get anything out of the resulting matrices to show visually on the image, so I can't interpret the results yet. Looking into code for OpenCV instead.

OpenCV surprisingly doesn't support loading GIF directly. I found code online to load GIFs, but it produces artifacts even when I just load and re-save an image. Decided to use ImageMagick and some shell script to convert the GIF images from JSEG back to JPG, then load those into an OpenCV matrix. Much easier, since my shell script for JSEG is already written, and using ImageMagick takes just 1 line of code.


Data
So... although we have thousands of images in some dozens of food categories, automatically queried and downloaded from Flickr, they are quite noisy. We need to clean them.

A snapshot of the categories currently downloaded:
bento, bibimbap, burrito, chowMein, crepes, DimSum, dumpling, fishAndchips, gyro, kebab, macaron, onigiri, PekingDuck, pizza, potAufeu, ramen, redCurry, sashimi, shrimpTempura, spaghetti, steakFrites, stirfry, stollen, sushi, taco, teriyaki, tiramisu, udon, wonton, yakisoba

Currently only working on a subset of the sushi images for initial testing, because sushi has many segmentation regions and ingredients to look for.

TODO: Separate sushi data into easy, medium and hard tests.

Use the easy ones for testing first. They shouldn't have too many segmented sections, so I can manually label the segments and make the *.labels files.

TODO: Pick out 10 images, and hand label those.


Machine Learning

When training the ingredients:

Q: Should we find images that contain just that one ingredient and train using those images, or can we hand-label the segmented regions and train using just the regions?

Q: Should I label the plate and the table as "unnecessary" and throw them away, or just NOT label them at all? Can I use a classifier that doesn't always pick one of the known classes, but automatically discards data whose pattern looks like nothing it has seen?

When training the cuisines:

Will obviously use the whole image

Just for Me: OpenCV Installation Manual for CentOS 32-bit 5.x

Log for self.
Worked on CentOS 5.3 and 5.5

64-bit CentOS has problems with g++, so you'll need to fix that before you can even start installing OpenCV. Run it and it'll give you errors; Google the error and fix it first.

Output log is in <my local drive>/opencv_linuxInstall.txt.



How to set up OpenCV in Linux:


(When the manual says ldconfig: on CentOS it's /sbin/ldconfig. Plain ldconfig won't work ("command not found"). See man ldconfig for more info.)

gcc:
$ yum install gcc
Make sure gcc is installed; most likely it already is.


pkg-config:
$ yum install pkgconfig
most likely it is already installed


cmake:
Download cmake from cmake website. Say we download the .sh file.

$ cd <wherever you put the downloaded .sh file>
$ chmod +x cmake-2.8.4-Linux-i386.sh
$ ./cmake-2.8.4-Linux-i386.sh
It'll unzip into the current folder, with a subdirectory (if you answered yes to the second question).

You can move that directory to your home directory, or anywhere is fine; that is the directory cmake is installed in. The cmake executable is
<unzipped dir>/bin/cmake
so whenever you need to run cmake, just run that.
(I have it in ~/cmake-2.8.4-Linux-i386/bin/cmake)


OpenCV:

Unzip anywhere.

Might need to use sudo on some of these. Just run
$ su
ahead of time to log into root, and then run the following.

Go into the directory where you unzipped Opencv-2.2... whatever

cd ~/OpenCV-2.2.0 # the directory containing INSTALL, CMakeLists.txt etc.
mkdir release
cd release
<cmake unzipped dir>/bin/cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_PYTHON_SUPPORT=ON -D BUILD_EXAMPLES=ON ..

It'll config and put files in
/home/master/OpenCV-2.2.0/release

It'll say so with a msg at the end:
-- Build files have been written to: /home/master/OpenCV-2.2.0/release



2.
Still in the opencv/release directory.
Run
$ make
(Some guides also mention nmake, but that's Microsoft's make for Windows; on CentOS, plain make is the one to use.)

This will build a bunch of .o files.
This will take a while, half an hour or more. Go do something else.

Then run
$ sudo make install
or just
$ make install
if you're already in root user.

Installation DONE!



Testing OpenCV

1.
To test OpenCV, run cmake again with BUILD_EXAMPLES.
(If the build was done without the BUILD_EXAMPLES=ON flag, opencv_test will say it can't find opencv_test.sum. So run this first.)

You may need to log into root first: the installation was done as root, so otherwise it might not let you write to some files.
$ cd <opencv dir>
$ cd release
$ su
$ <cmake unzipped dir>/bin/cmake -D BUILD_EXAMPLES=ON ..

It'll say:
-- Configuring done
-- Generating done
-- Build files have been written to: /home/master/OpenCV-2.2.0/release


2.
Copy tests/ data files.
Copy the data files used for testing, else tests will fail with something like:
[c29e21ad0b37f6f1]
calibrate-stereo-c:     FAIL(No test data)
>>>
The file ../../../../tests/cv/testdata//stereo/case1/stereo_calib.txt can not be opened or has invalid content
context: test case = -1, seed = c29e21ad0b37f6f1

Copy the images from the source tests/ directory to the bin/ directory:
$ su
$ cd <opencv dir>
$ cp -r tests/* release/bin


3.
Copy opencv_extra/ data files.
Still need to copy data files from opencv_extra/testdata/cv/, else some tests will fail with something like:
[9215677204570357]
calibrate-camera-c:     FAIL(No test data)
>>>
Could not open file with list of test files: /../../opencv_extra/testdata/cv/cameracalibration/datafiles.txt
context: test case = -1, seed = 9215677204570357

(Most of the fail cases in this round are associated with opencv_extra/testdata/cv)

It doesn't seem like opencv_extra/testdata/cv comes with the standard OpenCV download archive, so just download that directory manually from the OpenCV repository.

Install svn if you don't have it yet (I didn't... to the jungle's surprise.)
$ su
$ yum install subversion

cd into whatever dir you want to put the repository files:
$ cd /home/master
$ mkdir OpenCV-repository
$ cd OpenCV-repository
$ svn co https://code.ros.org/svn/opencv/trunk/opencv_extra

(Or maybe you can just check out trunk/opencv_extra/testdata; it seems like that covers all of the test files already.)

This will check out latest opencv_extra/ from the repository into the current directory. It'll say something like:

Checked out revision 4818.

(Don't check out trunk/opencv/ unless you want it! trunk/opencv_extra/ alone is sufficient for the tests. If you checked out trunk/opencv/ by accident, just remove it with
$ rm -rf opencv
)

Now we have all the test files in opencv_extra.
Copy it to where the test binaries want it, in /../../:
$ cp -r opencv_extra/ /../../
or simply
$ cp -r opencv_extra/ /
Either way is the same: the parent dir .. of the root / is just / itself.

(It's a bit weird that they want the directory in the root / dir; I guess they put a slash at the beginning where they didn't mean to.
Anyway, this is the easiest way that works.
Just remove opencv_extra from your root / directory after you finish running all the tests.

OR, probably easier, you can go into the shell script and just remove that slash!
$ vim <opencv dir>/release/bin/test_cv.sh

Remove the slash after $srcdir
so it'd be
  ./opencv_test -d $srcdir../../opencv_extra/testdata/cv
instead of
  ./opencv_test -d $srcdir/../../opencv_extra/testdata/cv
or just change it to ./opencv_extra/testdata/cv, if you have opencv_extra in the bin/ directory.

You'd probably also have to change it for the other shell script files, like test_cxcore.sh, etc.
)


4.
Now can run test, there are three .sh files:
$ cd <opencv dir>/release/bin
$ ./test_cv.sh        # this was copied from the tests/ dir in Step 2

All the tests should be passing, no more "FAIL(No test data)".
(All tests passed for me! :D )

(By the way all the tests are alphabetized. So if you want to check something, just find it in the output by first letter, e.g. canny, color-lab, shape-*, etc.)


NOW HERE-------------------------> Failed on z-highgui.
No idea how to fix it. It doesn't say which source file is missing, which means it's probably not a missing file; more likely just unexpected output.
Why? Environment variables? What do the ext= lines that it printed out mean?
Maybe I didn't install the highgui libraries?
highgui seems to be in release/modules/highgui and release/lib/libopencv_highgui.so.2.2, so it looks like highgui is installed.




If they aren't passing for you... I don't know what to tell you: something went wrong with the installation, you missed a step, or you missed an error message during installation and didn't fix it, etc.
Hence this section is called Testing OpenCV! That's why we test. If something failed, go fix your installation; reinstall or whatever.

The error log is in opencv_test.log; that might tell you what error was thrown at the failure.



Run the other two shell script files; all tests should pass:

$ ./test_cxcore.sh
(i.e. ./opencv_test_core)
=================================================
Summary: 0 out of 78 tests failed
Running time: 00:08:35
This just runs opencv_test_core with the right directory for opencv_extra. opencv_test_core actually doesn't use anything in opencv_extra, so if you just run it by itself, it'll also run all 78 tests successfully (that was the case for me at least, although I did have a copy of opencv_extra in the current dir; I don't know if it looks at that).

$ ./test_ml.sh
(i.e. ./opencv_test_ml)
=================================================
Summary: 0 out of 15 tests failed
Running time: 00:07:04
This just runs opencv_test_ml with the right directory for opencv_extra (you can see this if you open up test_ml.sh). So if you can't run opencv_test_ml by itself, no worries; it's probably just missing the test data files. Just run it via this shell script.



5.
There are other binaries in the directory that you can run, if applicable. I don't think I need any of them, so failures in those extra ones didn't matter to me.
For example, this failed for me:

$ ./opencv_test_gpu
OpenCV Error: No GPU support (The library is compilled without GPU support) in throw_nogpu, file /home/master/OpenCV-2.2.0/modules/gpu/src/precomp.hpp, line 84
terminate called after throwing an instance of 'cv::Exception'
  what():  /home/master/OpenCV-2.2.0/modules/gpu/src/precomp.hpp:84: error: (-216) The library is compilled without GPU support in function throw_nogpu

Aborted

I don't really care: I didn't build OpenCV with GPU support, and I'm not using OpenCV for any GPU-related stuff.

Run these other tests if needed.



6.
After all the tests are done, if you put opencv_extra in your root directory, you can remove it now:
$ rm -rf /opencv_extra




7. Use OpenCV for projects, using pkg-config

http://opencv.willowgarage.com/wiki/CompileOpenCVUsingMacOSX

The opencv.pc is in
/usr/local/lib/pkgconfig

NOT in your release/lib. It's important to get this dir right and export it to the environment variable PKG_CONFIG_PATH; then everything else follows.

$ cd <your source code dir>

For Bash:
$ export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
$ export LD_LIBRARY_PATH=/usr

Check it by running these two commands; you should see something similar, pointing into /usr/local/...:

$ pkg-config --cflags opencv
-I/usr/local/include/opencv -I/usr/local/include 

$ pkg-config --libs opencv
-L/usr/local/lib -lopencv_core -lopencv_imgproc -lopencv_highgui -lopencv_ml -lopencv_video -lopencv_features2d -lopencv_calib3d -lopencv_objdetect -lopencv_contrib -lopencv_legacy -lopencv_flann 


To compile, use:
gcc `pkg-config --cflags --libs opencv` -o mytest mytest.c

or, for C++ programs (same arguments):
g++ `pkg-config --cflags --libs opencv` -o mytest mytest.cpp


When run the program, if you get this error:
./a.out: error while loading shared libraries: libopencv_core.so.2.2: cannot open shared object file: No such file or directory

Then need to add /usr/local/lib to /etc/ld.so.conf.d/opencv.conf:
$ vim /etc/ld.so.conf.d/opencv.conf
It probably doesn't exist yet, so it'll be a new file.
In the first line of the file, insert:

/usr/local/lib

Save the file. Then run:

$ su
$ /sbin/ldconfig

Now it should run!

Monday, April 4, 2011

Semaine 1 (Week 1) dimanche

~ dimanche ~
  • Segmentation
    • Wrote a shell script to batch-run segmentation with JSEG. While there are over-segmented areas, the boundary between each pair of ingredients is clearly segmented, which is good. However, the plate is not as clearly separated from the food, and the plate itself is often over-segmented, so we'll need a way to discard the plate and tabletop.




  • Brightness
    • Tune all images to the same brightness; we might be able to deal with variations in lighting this way?
      • Result: Segmentation is better WITHOUT brightness adjustment, but I don't know about texton extraction yet.
      • The motivation for brightness adjustment is intra-class variation in lighting: color textons use color information and would misclassify if there is too much variation. Ting-Fan's cafeteria vision ended up depending heavily on color, so that's very important. Will see whether this brightness tuning is needed once textons are running.
    • TODO: Attach images 


[3] Song-Chun Zhu, Cheng-En Guo, Yizhou Wang, and Zijian Xu. What are Textons? IJCV 2005. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.76.8748&rep=rep1&type=pdf

[4] Manik Varma and Andrew Zisserman. A Statistical Approach to Texture Classification from Single Images. http://research.microsoft.com/en-us/um/people/manik/pubs/varma05.pdf