ECE 532 Crystallography Project

Tulaya Limpiti and David Winters

Instructor: Rob Nowak


Final challenge 2: Tertiary classification (1,2,3 or 7,8,9 or 5)

Tasks: Three-Class Problem

Class 0 = crystal data types 1,2,3
Class 1 = crystal data types 7,8,9
Class 2 = crystal data type 5

Design a classifier to distinguish between Class 0, 1 and 2 as defined above. Again, keep in mind that misclassifying Class 1 data is more undesirable than misclassifying Class 0 data. The error costs are as defined in Challenge 1.


Solution:


Overview

For tertiary classification, we build upon our work from Challenge 1 to create a three-step procedure. Steps 1 and 2 are the same as in Challenge 1: Step 1 sifts out any images that are obviously from Class 0, while Step 2 uses a more refined method to discriminate between Class 0 and Class 1. This two-step procedure worked very well for the binary classification problem, and we would like to achieve a similar level of performance for the tertiary problem. We therefore take a very conservative approach and add a third step that attempts to sift out the Class 2 (type 5) images. The flowchart of the three-step algorithm is shown below.
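In code form, the control flow of the three steps might look like the following minimal Python sketch. It is not the authors' Matlab implementation: the function name `classify_three_class` and the three predicate arguments are hypothetical placeholders, and the internals of Steps 1 and 2 (defined in Challenge 1) are deliberately left to the caller.

```python
from typing import Callable

def classify_three_class(
    features: dict,
    step1_is_obvious_class0: Callable[[dict], bool],
    step2_is_class0: Callable[[dict], bool],
    step3_is_class2: Callable[[dict], bool],
) -> int:
    """Run the three steps in order and return a label in {0, 1, 2}.

    The three predicates are hypothetical stand-ins for the real steps.
    """
    # Step 1: discard images that are obviously Class 0.
    if step1_is_obvious_class0(features):
        return 0
    # Step 2: finer Class 0 versus Class 1 decision.
    if step2_is_class0(features):
        return 0
    # Step 3: among the remaining images, sift out Class 2 (type 5).
    if step3_is_class2(features):
        return 2
    return 1

# Usage with toy predicates (illustrative only, not the real features).
label = classify_three_class(
    {"avg_size": 30},
    step1_is_obvious_class0=lambda f: False,
    step2_is_class0=lambda f: False,
    step3_is_class2=lambda f: f["avg_size"] < 50,
)
print(label)
```

Ordering the steps this way means an image is labeled Class 2 only after it has survived both Class 0 filters, which matches the conservative intent described above.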



Step 3:

Looking at the training images for Class 2 (type 5), we see that they are typified by tiny crystal-like precipitate. This visual impression is confirmed by the description given for type 5 images: “microcrystals”. We therefore assume that the features we developed earlier for identifying images with crystals in them will probably also respond to type 5 images, so these images will tend to be labeled Class 1 by Steps 1 and 2. That leads us to the single feature used in Step 3: the average size, in pixels, of the segmented objects in an image. Presumably, an image with large crystals in it will return a high value for this metric, while a type 5 image will return a low value. This approach will probably not do a great job of discriminating between Class 0 and Class 2, but that is not a great worry for us, since our primary concern is making sure images with good crystals in them don't get thrown away. Shown below is the histogram of this feature for all of the training data from Class 0, Class 1, and Class 2, with the Step 3 decision boundary superimposed.

We use a 1-D classifier with a threshold of 50 pixels. An image whose average object size is less than 50 pixels is classified as Class 2; otherwise it is classified as Class 1. The threshold is set low in order to minimize the probability of a miss, that is, of a Class 1 image being labeled Class 2. It should also be noted that classifying Class 2 images is particularly challenging because we have very little training data for Class 2 (15 images). This limits the selection of "good" features to be used in Step 3.
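The Step 3 rule can be illustrated with a short Python sketch. Only the 50-pixel threshold and the Class 1 / Class 2 outcomes come from the text; the function names and the assumption that the segmentation stage supplies a list of object sizes in pixels are hypothetical, and this is not the authors' Matlab code.

```python
STEP3_THRESHOLD_PIXELS = 50  # threshold stated in the text

def average_object_size(object_sizes):
    """Mean size in pixels of the segmented objects in one image.

    `object_sizes` is an assumed input: one size per segmented object.
    """
    if not object_sizes:
        raise ValueError("no segmented objects to average")
    return sum(object_sizes) / len(object_sizes)

def step3_label(object_sizes, threshold=STEP3_THRESHOLD_PIXELS):
    """Return 2 if the average object size is below the threshold, else 1."""
    return 2 if average_object_size(object_sizes) < threshold else 1

print(step3_label([12, 30, 45]))    # small objects, average 29 pixels
print(step3_label([40, 400, 900]))  # large objects dominate the average
```

An average of exactly 50 pixels falls on the Class 1 side here, since the rule in the text is "less than 50 pixels".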



Result

Shown here are the error matrices for the three classes after each of the three steps of our algorithm. We can see that the overall error (= 10*#Miss + #F.A. + #Class 2 errors, where a miss is a misclassified Class 1 image and a false alarm (F.A.) is a misclassified Class 0 image) stays at a respectable value even after the jump from binary to tertiary classification.

Multi-stage algorithm

Step 1

True label | Classified 0 | Classified 1 | Classified 2
0          | 132          | 93           | 0
1          | 1            | 55           | 0
2          | 5            | 10           | 0

Total error: 10*1 + 93 + 15 = 118

Step 2

True label | Classified 0 | Classified 1 | Classified 2
0          | 157          | 68           | 0
1          | 1            | 55           | 0
2          | 5            | 10           | 0

Total error: 10*1 + 68 + 15 = 93

Step 3

True label | Classified 0 | Classified 1 | Classified 2
0          | 157          | 65           | 3
1          | 1            | 55           | 0
2          | 6            | 8            | 1

Total error: 10*1 + 68 + 14 = 92
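As a check on the arithmetic, the total errors above can be recomputed from the error matrices. The Python sketch below assumes the reading that reproduces the reported totals: every misclassified Class 1 image counts as a miss with cost 10, and every misclassified Class 0 or Class 2 image counts with cost 1. The function name `total_error` is hypothetical.

```python
def total_error(confusion, miss_cost=10):
    """Weighted error for a 3x3 matrix indexed as confusion[true][classified]."""
    misses = confusion[1][0] + confusion[1][2]         # Class 1 labeled 0 or 2
    false_alarms = confusion[0][1] + confusion[0][2]   # Class 0 labeled 1 or 2
    class2_errors = confusion[2][0] + confusion[2][1]  # Class 2 labeled 0 or 1
    return miss_cost * misses + false_alarms + class2_errors

# Error matrices after each step, copied from the results above.
step1 = [[132, 93, 0], [1, 55, 0], [5, 10, 0]]
step2 = [[157, 68, 0], [1, 55, 0], [5, 10, 0]]
step3 = [[157, 65, 3], [1, 55, 0], [6, 8, 1]]

print(total_error(step1), total_error(step2), total_error(step3))  # 118 93 92
```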

Other features and other classifiers were also tried, but all achieved errors roughly 2.5 to 4.7 times as large as the error of our three-step approach (237 to 436, versus 92). Specifically, we tried a 4-D Gaussian classifier using all four features (number of objects, average local variance of objects, average intensity of objects, and average size of objects in pixels). We also tried different combinations of three features in a 3-D Gaussian classifier. In all cases, we have a larger number of Class 1 misses, which are the errors with the largest cost. The results of these approaches are summarized below as error matrices, and they validate our three-step approach.

4-D Gaussian tertiary classification

All 4 features

True label | Classified 0 | Classified 1 | Classified 2
0          | 72           | 120          | 33
1          | 2            | 48           | 6
2          | 2            | 9            | 4

Total error: 10*8 + 153 + 11 = 244

3-D Gaussian tertiary classification

Features 1, 2, 3

True label | Classified 0 | Classified 1 | Classified 2
0          | 75           | 54           | 96
1          | 4            | 28           | 24
2          | 2            | 4            | 9

Total error: 10*28 + 150 + 6 = 436

Features 2, 3, 4

True label | Classified 0 | Classified 1 | Classified 2
0          | 71           | 124          | 30
1          | 3            | 49           | 4
2          | 1            | 12           | 2

Total error: 10*7 + 154 + 13 = 237

Features 1, 2, 4

True label | Classified 0 | Classified 1 | Classified 2
0          | 72           | 83           | 70
1          | 3            | 41           | 12
2          | 2            | 9            | 4

Total error: 10*15 + 153 + 11 = 314

Features 1, 3, 4

True label | Classified 0 | Classified 1 | Classified 2
0          | 66           | 108          | 51
1          | 0            | 47           | 9
2          | 1            | 9            | 5

Total error: 10*9 + 159 + 10 = 259
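For readers unfamiliar with the Gaussian classifiers compared above, the following Python sketch shows the general idea: fit one Gaussian per class and assign each image to the class with the highest likelihood. It rests on several assumptions that may differ from what was actually used: features are treated as independent (diagonal covariance) to keep the code dependency-free, class priors are ignored, and the training vectors are synthetic toy numbers, not the project data.

```python
import math

def fit_gaussian(samples):
    """Per-feature mean and variance for one class (diagonal covariance)."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-9)
        for d in range(dims)
    ]
    return means, variances

def log_likelihood(x, means, variances):
    """Log density of x under an independent-feature Gaussian."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, means, variances)
    )

def classify(x, models):
    """Return the label whose Gaussian gives x the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(x, *models[label]))

# Synthetic 4-D feature vectors (NOT the project data), ordered as
# [number of objects, avg local variance, avg intensity, avg size in pixels].
training = {
    0: [[2.0, 1.0, 0.2, 10.0], [3.0, 1.5, 0.3, 12.0], [2.5, 0.8, 0.25, 9.0]],
    1: [[8.0, 5.0, 0.7, 300.0], [9.0, 6.0, 0.8, 350.0], [7.0, 5.5, 0.75, 280.0]],
    2: [[40.0, 3.0, 0.5, 20.0], [45.0, 3.5, 0.55, 25.0], [38.0, 2.8, 0.45, 18.0]],
}
models = {label: fit_gaussian(samples) for label, samples in training.items()}

print(classify([8.5, 5.2, 0.72, 310.0], models))
```

A 3-D variant simply drops one feature from every vector before fitting. Note that this maximum-likelihood rule does not weight Class 1 misses more heavily than other errors.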



Matlab Implementations

Function CLASSIFIER3CLASS - classifies Class 0 images (types 1, 2, 3), Class 1 images (types 7, 8, 9), and Class 2 images (type 5).
Input: crystal image in standard JPEG format
Output: label assigned to image

Usage: label = classifier3class('1a/i_1_14.jpg');

classifier3class.m


(Updated: 4/29/2005)