Sunday, May 8, 2011

Week 6 Sunday

~ Sunday ~

Cross-validation implemented.
Can specify the number of folds, and which fold to use as the test fold. The program takes a list of files (previously interlaced evenly with filenames from the different categories) and separates it into #num_folds files, one file per fold.

When running training, it skips #test_fold and reads all the images in the other folds as training data. When running testing, it reads only the images in the test fold and uses them for testing.
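
A minimal sketch of the split described above, not the actual program. It assumes the interlaced list is cut into contiguous chunks of len // num_folds filenames with the remainder going to the last fold, which is one scheme that gives the 11/11/11/11/13 fold sizes for 57 images. The names split_into_folds and select_train_test are made up for illustration, and the folds are kept in memory here instead of being written to one file each.

```python
def split_into_folds(filenames, num_folds):
    """Cut an interlaced filename list into contiguous folds.

    Assumption: every fold gets len // num_folds names and the
    last fold also takes whatever is left over.
    """
    if num_folds < 1 or num_folds > len(filenames):
        raise ValueError("num_folds must be between 1 and len(filenames)")
    size = len(filenames) // num_folds
    folds = [filenames[i * size:(i + 1) * size] for i in range(num_folds - 1)]
    folds.append(filenames[(num_folds - 1) * size:])
    return folds


def select_train_test(folds, test_fold):
    """Training data is every fold except test_fold; test data is test_fold only."""
    train = [name for i, fold in enumerate(folds) if i != test_fold for name in fold]
    test = list(folds[test_fold])
    return train, test


# Hypothetical filenames standing in for the real interlaced list.
names = ["img%02d.jpg" % i for i in range(57)]
folds = split_into_folds(names, 5)
train, test = select_train_test(folds, test_fold=4)
print([len(f) for f in folds], len(train), len(test))
```

Because the input list is already interlaced by category, contiguous chunks stay roughly balanced by category without any extra bookkeeping.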


All training / testing is run on low-res images (250 px wide).


EMD training on ingredients (not cuisines, just ingredients); each sample row is an ingredient.
ing row: | 0 1 2 3 4 5 6 7 8 9 10 11 | truth

Each column holds the EMD distance to one of the other ingredients found in THIS image.
Problem: the rows might be too sparse for effective learning, because each image contains only 1-3 ingredients, 4 at the very most.
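
To make the sparsity concrete, here is a rough sketch of how one sample row could be assembled. Everything in it is an assumption for illustration: the function name build_sample_row, the use of the row's own ingredient index as the truth label, the fill value 0.0 for ingredients that are not in the image, and the stand-in distance function (the real EMD computation is not shown).

```python
def build_sample_row(target, present, distance, num_ingredients=12, missing=0.0):
    """Build one row: a distance per ingredient column, plus the truth label.

    target   -- index of the ingredient this row describes
    present  -- indices of all ingredients found in this image
    distance -- callable (a, b) -> float, standing in for the EMD
    missing  -- assumed fill value for ingredients not in the image
    """
    row = [missing] * num_ingredients
    for other in present:
        if other != target:
            row[other] = distance(target, other)
    return row, target


# Stand-in distance, NOT the EMD: just something deterministic to run.
def fake_distance(a, b):
    return float(abs(a - b))


row, truth = build_sample_row(2, [2, 5, 7], fake_distance)
filled = sum(1 for value in row if value != 0.0)
print(row, truth, filled)
```

With 3 ingredients in an image, only 2 of the 12 columns carry a value, which is the sparsity concern noted above.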

12 categories (ingredients), images from 3 cuisines, 19 images per cuisine, 57 images total.
Images are divided into 5 folds: 11 in each of the first 4 folds, 13 in the 5th.
Each fold contains as even a number of images per cuisine as possible (e.g. 4 images for pasta, 4 for sushi, 3 for bibimbap in an 11-image fold), but is not balanced by ingredient.

Chance: 1 / 12 = 8.33%

Fold#   Accuracy
0       29.6%
1       33.3%
2       27.6%
3       26.1%
4       28.6%
avg     29.04%
var     5.8744
std dev 2.4237
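
A quick way to re-derive the summary rows from the per-fold numbers, using only the Python standard library. The variance here is the population variance (divide by the number of folds, 5), which is what matches the figures in the table; the sample variance (divide by 4) would come out larger.

```python
import statistics

# Per-fold accuracies (%) copied from the table above.
fold_accuracy = [29.6, 33.3, 27.6, 26.1, 28.6]

mean = statistics.mean(fold_accuracy)
variance = statistics.pvariance(fold_accuracy)  # population variance, divides by n
std_dev = statistics.pstdev(fold_accuracy)      # square root of the above

print("avg     %.2f%%" % mean)
print("var     %.4f" % variance)
print("std dev %.4f" % std_dev)
```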


Eh... TODO
The pixels passed to EMD are currently the first 100 points in a cluster; should uncomment the code that picks 100 random points and see if that's better.
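
For reference, the two selection strategies side by side, as a sketch only. The function names and the (x, y) tuple representation of a point are assumptions; the real code may store cluster points differently.

```python
import random


def first_points(cluster, n=100):
    """Current behaviour: take the first n points of the cluster."""
    return list(cluster[:n])


def random_points(cluster, n=100, seed=None):
    """Alternative: take n points uniformly at random, without replacement.

    Clusters with n points or fewer are returned whole.
    """
    if len(cluster) <= n:
        return list(cluster)
    rng = random.Random(seed)  # a fixed seed makes runs repeatable
    return rng.sample(cluster, n)


# Hypothetical cluster of 500 (x, y) points.
cluster = [(i % 25, i // 25) for i in range(500)]
head = first_points(cluster)
picked = random_points(cluster, seed=0)
print(len(head), len(picked))
```

If the points are stored in scan order, the first 100 all come from one corner of the cluster, so a random draw should cover the region more evenly.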



Color Histogram

RGB color space.
20 and 50 bins produce the same accuracy on each fold.
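
A small sketch of an RGB color histogram with a configurable bin count. It assumes one histogram per channel, concatenated and normalized by the pixel count; the notes above do not say whether the real feature is per-channel or a joint 3D histogram, so treat this layout and the name rgb_histogram as illustrative.

```python
def rgb_histogram(pixels, bins=20):
    """Concatenated per-channel histogram of (r, g, b) pixels in 0..255.

    Returns 3 * bins values; each channel's block sums to 1.0
    when at least one pixel is given.
    """
    hist = [0.0] * (3 * bins)
    count = 0
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            index = value * bins // 256  # maps 0..255 onto 0..bins-1
            hist[channel * bins + index] += 1.0
        count += 1
    if count:
        hist = [h / count for h in hist]
    return hist


# Two hypothetical pixels: pure black and pure white.
h20 = rgb_histogram([(0, 0, 0), (255, 255, 255)], bins=20)
h50 = rgb_histogram([(0, 0, 0), (255, 255, 255)], bins=50)
print(len(h20), len(h50))
```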

Chance: 1 / 3 = 33.3%

Fold#   Accuracy
0       27.3% (all predicted as 3)
1       27.3% (all 2)
2       27.3% (all 1)
3       27.3% (all 3)
4       30.8% (some 1s some 2s)
avg     28%
var     (.49 + .49 + .49 + .49 + 7.84 ) / 5 = 1.96
std dev 1.4
