Tasks: Three-Class Problem
Class 0 = crystal data types 1, 2, 3; Class 1 = crystal data types 7, 8, 9; Class 2 = crystal data type 5. Design a classifier to distinguish between Class 0, 1, and 2 as defined above. Again, keep in mind that misclassifying Class 1 data is more undesirable than misclassifying Class 0 data. The error costs are as defined in Challenge 1.
For three-class classification, we build upon our work from Challenge 1 to create a three-step procedure. Steps 1 and 2 are the same as in Challenge 1: Step 1 sifts out any images that are obviously from Class 0, while Step 2 uses a more refined method to discriminate between Class 0 and Class 1. This two-step procedure worked very well for the binary classification problem, and we would like to achieve a similar level of performance for the three-class problem. That has led us to take a very conservative approach, in which a third step attempts to sift out the Class 2 (type 5) images. The flowchart of the three-step algorithm is shown below.
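The control flow of the three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results on this page: the function name, the feature dictionary, and the three predicate arguments are hypothetical, and the sketch assumes that Step 3 only examines images that Steps 1 and 2 did not label Class 0.

```python
from typing import Callable, Dict

Features = Dict[str, float]


def classify_three_class(
    features: Features,
    is_obvious_class0: Callable[[Features], bool],
    is_class0_refined: Callable[[Features], bool],
    is_class2: Callable[[Features], bool],
) -> int:
    """Run the three-step cascade and return a label in {0, 1, 2}."""
    # Step 1: sift out images that are obviously Class 0.
    if is_obvious_class0(features):
        return 0
    # Step 2: refined Class 0 vs. Class 1 decision on the remaining images.
    if is_class0_refined(features):
        return 0
    # Step 3: among images still labeled Class 1, sift out Class 2.
    if is_class2(features):
        return 2
    return 1


# Toy stand-in predicates (hypothetical, for demonstration only).
demo = classify_three_class(
    {"num_objects": 4.0, "avg_size": 30.0},
    is_obvious_class0=lambda f: f["num_objects"] == 0,
    is_class0_refined=lambda f: False,
    is_class2=lambda f: f["avg_size"] < 50,
)
print(demo)
```

Keeping each step as a separate predicate mirrors the conservative design: an image is only labeled Class 2 after it has survived both Class 0 checks.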
Looking at the images making up the training data for Class 2 (type 5), we see that they are typified by tiny crystal-like precipitates. This visual examination is confirmed by the description given of type 5 images: “microcrystals”. So, we assume that the features we previously developed for identifying images containing crystals will probably also identify type 5 images. That leads us to the single feature used in Step 3: the average size of the segmented objects in an image. Presumably, an image with large crystals in it will return a high value for this metric, while a type 5 image will return a low value. It should be noted that this approach will probably not do a great job of discriminating between Class 0 and Class 2, but that is not a great worry for us, since our primary concern is making sure that images with good crystals in them are not thrown away. Shown below is the histogram of this feature for all of the training data from Class 0, Class 1, and Class 2, with the Step 3 decision boundary superimposed.

We use a 1-D classifier with a threshold of 50 pixels. An image whose average object size is less than 50 pixels is classified as Class 2; otherwise it is classified as Class 1. The threshold is set low in order to minimize the probability of a miss. It should also be noted that classifying Class 2 images is particularly challenging because we have only a small number of training images for Class 2 (15 images). This limits the selection of "good" features for use in Step 3.
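The Step 3 rule is simple enough to write out directly. The Python sketch below assumes the segmentation stage has already produced a list of object sizes in pixels; the function name and the handling of an empty list are assumptions made for illustration, and only the 50-pixel threshold comes from the text.

```python
from statistics import mean
from typing import Sequence

SIZE_THRESHOLD_PIXELS = 50  # threshold stated in the text


def step3_label(object_sizes: Sequence[float],
                threshold: float = SIZE_THRESHOLD_PIXELS) -> int:
    """Return 2 if the average object size is below the threshold, else 1."""
    if not object_sizes:
        # Assumption: images without segmented objects never reach Step 3.
        raise ValueError("Step 3 expects at least one segmented object")
    average_size = mean(object_sizes)
    return 2 if average_size < threshold else 1


print(step3_label([12, 20, 31]))     # small objects, average below 50
print(step3_label([400, 900, 150]))  # large objects, average above 50
```

An average of exactly 50 pixels falls on the Class 1 side, which matches the "less than 50 pixels" wording and errs toward keeping the image.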
Shown here are the error matrices for the three classes after each of the three steps of our algorithm. We can see that the overall error (= 10*#Miss + #F.A., with misclassified Class 2 images added at unit cost in the totals below) stays at a respectable value even after the jump from binary to three-class classification.
Multi-stage algorithm

Step 1

| True label | Classified 0 | Classified 1 | Classified 2 |
|---|---|---|---|
| 0 | 132 | 93 | 0 |
| 1 | 1 | 55 | 0 |
| 2 | 5 | 10 | 0 |

Total error: 10*1 + 93 + 15 = 118

Step 2

| True label | Classified 0 | Classified 1 | Classified 2 |
|---|---|---|---|
| 0 | 157 | 68 | 0 |
| 1 | 1 | 55 | 0 |
| 2 | 5 | 10 | 0 |

Total error: 10*1 + 68 + 15 = 93

Step 3

| True label | Classified 0 | Classified 1 | Classified 2 |
|---|---|---|---|
| 0 | 157 | 65 | 3 |
| 1 | 1 | 55 | 0 |
| 2 | 6 | 8 | 1 |

Total error: 10*1 + 68 + 14 = 92
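The totals above can be reproduced from the error matrices with a few lines of Python. This sketch reads each total as 10 times the number of misclassified Class 1 images, plus the misclassified Class 0 images, plus the misclassified Class 2 images; that reading is an interpretation that matches the printed sums, and the function name is hypothetical.

```python
from typing import List

MISS_COST = 10  # weight on misclassified Class 1 images, from the error formula


def total_error(confusion: List[List[int]], miss_cost: int = MISS_COST) -> int:
    """confusion[t][c] = number of images with true label t classified as c."""
    error = 0
    for true_label, row in enumerate(confusion):
        # Count every off-diagonal entry in this row as a misclassification.
        wrong = sum(count for c, count in enumerate(row) if c != true_label)
        error += miss_cost * wrong if true_label == 1 else wrong
    return error


step1 = [[132, 93, 0], [1, 55, 0], [5, 10, 0]]
step3 = [[157, 65, 3], [1, 55, 0], [6, 8, 1]]
print(total_error(step1), total_error(step3))
```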
It should be noted that other features and other classifiers were also tried, but all produced errors roughly 2.5 to 4.7 times as large as the error of our three-step approach (237 to 436, versus 92). Specifically, we tried a 4-D Gaussian classifier using all four features (number of objects, average local variance of objects, average intensity of objects, and average size of objects in pixels). We also tried different combinations of three features in a 3-D Gaussian classifier. In all cases, these classifiers produce more Class 1 misses, which are the errors with the largest cost. The results of these approaches are summarized below in terms of error matrices. These results support our choice of the three-step approach.
4-D Gaussian three-class classification

All 4 features

| True label | Classified 0 | Classified 1 | Classified 2 |
|---|---|---|---|
| 0 | 72 | 120 | 33 |
| 1 | 2 | 48 | 6 |
| 2 | 2 | 9 | 4 |

Total error: 10*8 + 153 + 11 = 244
3-D Gaussian three-class classification

Features 1, 2, 3

| True label | Classified 0 | Classified 1 | Classified 2 |
|---|---|---|---|
| 0 | 75 | 54 | 96 |
| 1 | 4 | 28 | 24 |
| 2 | 2 | 4 | 9 |

Total error: 10*28 + 150 + 6 = 436

Features 2, 3, 4

| True label | Classified 0 | Classified 1 | Classified 2 |
|---|---|---|---|
| 0 | 71 | 124 | 30 |
| 1 | 3 | 49 | 4 |
| 2 | 1 | 12 | 2 |

Total error: 10*7 + 154 + 13 = 237

Features 1, 2, 4

| True label | Classified 0 | Classified 1 | Classified 2 |
|---|---|---|---|
| 0 | 72 | 83 | 70 |
| 1 | 3 | 41 | 12 |
| 2 | 2 | 9 | 4 |

Total error: 10*15 + 153 + 11 = 314

Features 1, 3, 4

| True label | Classified 0 | Classified 1 | Classified 2 |
|---|---|---|---|
| 0 | 66 | 108 | 51 |
| 1 | 0 | 47 | 9 |
| 2 | 1 | 9 | 5 |

Total error: 10*9 + 159 + 10 = 259
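For readers unfamiliar with Gaussian classifiers, the Python sketch below shows the general idea: fit one Gaussian per class and assign each image to the class with the highest likelihood. It is a simplified illustration, not the classifier behind the tables above. It assumes a diagonal covariance (independent features) and equal class priors, neither of which is stated on this page; a full-covariance model would be the more general form. All names and the toy data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Params = Tuple[List[float], List[float]]  # (means, variances), one entry per feature


def fit_gaussians(samples: Dict[int, List[Sequence[float]]]) -> Dict[int, Params]:
    """Estimate a per-feature mean and variance for each class."""
    model = {}
    for label, rows in samples.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, 1e-9)  # floor avoids division by zero
            for col, m in zip(columns, means)
        ]
        model[label] = (means, variances)
    return model


def log_likelihood(x: Sequence[float], params: Params) -> float:
    """Log-density of x under an axis-aligned Gaussian."""
    means, variances = params
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, means, variances)
    )


def classify(x: Sequence[float], model: Dict[int, Params]) -> int:
    """Pick the class whose Gaussian gives x the highest log-likelihood."""
    return max(model, key=lambda label: log_likelihood(x, model[label]))


# Toy 2-D training data (made up for illustration; not the crystal features).
toy = {
    0: [(0.0, 1.0), (1.0, 2.0), (2.0, 0.0)],
    1: [(10.0, 9.0), (11.0, 12.0), (9.0, 10.0)],
    2: [(5.0, 30.0), (6.0, 28.0), (4.0, 32.0)],
}
model = fit_gaussians(toy)
print(classify((10.0, 10.0), model))
```

A maximum-likelihood rule of this kind treats all errors as equally costly, so it has no built-in preference for avoiding Class 1 misses.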
Function CLASSIFIER3CLASS - classifies Class 0 images (types 1, 2, 3), Class 1 images (types 7, 8, 9), and Class 2 images (type 5).
Input: crystal image in standard JPEG format
Output: label assigned to image
Usage: label = classifier3class('1a/i_1_14.jpg');