In this chapter we report on identification experiments that combine the feature sets described in the previous chapters, using the mixed-genera data set with 37 taxa. We developed an application framework that integrates the contributions of the project partners to make these mixed-method identifications possible. Identification performance is measured using bootstrap-aggregated (bagged) C4.5 decision trees. With combinations of contour-based features, over 90% of the diatoms are identified correctly, and ornamentation features give a similar result. If all features are combined, the identification rate increases to almost 97%. From an analysis of a collection of 25 decision trees, we select a set of 17 robust features, each of which was used by at least 12 trees. This small feature set yields an identification rate of almost 96%. Because a few feature sets (convex/concave curvature, Legendre polynomials) were still under development at the time of writing, we expect that the same experiments with the final feature sets will give even better identification rates. This chapter also describes a web-based application, called ADIACweb, that allows users to interact with the automatic identification system. Currently, it can identify 37 different taxa.
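The bagging and feature-selection steps mentioned in the abstract can be illustrated with a minimal Python sketch. It does not implement C4.5 or reproduce the chapter's code: the function names, the feature names, and the representation of each tree as the set of features it tests are all assumptions made for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (one bagging replicate)."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the most frequent label among the per-tree predictions."""
    return Counter(predictions).most_common(1)[0][0]


def robust_features(features_per_tree, min_trees):
    """Keep the features that are used by at least `min_trees` trees.

    `features_per_tree` holds one collection of feature names per tree
    (a hypothetical representation, not the chapter's data format).
    """
    counts = Counter()
    for used in features_per_tree:
        counts.update(set(used))  # count each feature once per tree
    return sorted(name for name, n in counts.items() if n >= min_trees)


if __name__ == "__main__":
    # Hypothetical feature names, three trees instead of 25.
    trees = [
        {"length", "width"},
        {"length", "stria_density"},
        {"length", "width"},
    ]
    print(robust_features(trees, min_trees=2))
    print(majority_vote(["taxon_a", "taxon_b", "taxon_a"]))
    print(bootstrap_sample([1, 2, 3, 4], random.Random(0)))
```

In the setting of the abstract, `features_per_tree` would hold 25 entries and `min_trees` would be 12. Each tree would be trained on its own bootstrap sample, and the ensemble prediction would be the majority vote over the trees.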
|Title of host publication||Automatic Diatom Identification|
|Editors||H. du Buf, M.M. Bayer|
|Place of Publication||Singapore|
|Publication status||Published - 2002|
|Name||Series in Machine Perception and Artificial Intelligence|