Abstract
We propose the Hierarchical Product Classification (HPC) framework for classifying products using a hierarchical product taxonomy. The framework uses a classification system with multiple classifier nodes, each residing on a different level of the taxonomy. The novelty of the framework lies in the definition of classifier recipes, which can be used to construct high-quality classifier nodes that make the best use of the product descriptions. These recipes are specifically tailored to the e-commerce domain and enable flexible classifiers that adapt to the depth-specific characteristics of product taxonomies. Furthermore, to gain insight into which components are required for high-quality product classification, we evaluate several feature selection methods and classification techniques in the context of our framework. On 3000 product descriptions obtained from Amazon.com, HPC achieves an overall accuracy of 76.80% for product classification. Using 110 categories from CircuitCity.com and Amazon.com, we obtain a precision of 93.61% for mapping the categories to the taxonomy of shopping.com.
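To make the idea of classifier nodes on different taxonomy levels concrete, the following is a minimal sketch of top-down routing through a category tree. It is not the HPC implementation: the abstract does not specify the classifier recipes, so the token-overlap scoring, the whitespace tokenizer, the `Node`, `add_example`, and `classify` names, and the toy training descriptions are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real feature selection)."""
    return text.lower().split()


@dataclass
class Node:
    """One taxonomy category; nodes with children act as classifier nodes."""
    name: str
    children: list = field(default_factory=list)
    vocab: Counter = field(default_factory=Counter)  # tokens seen in training

    def train(self, text):
        self.vocab.update(tokenize(text))

    def score(self, tokens):
        # Assumed scoring rule: fraction of tokens seen under this category.
        return sum(1 for t in tokens if t in self.vocab) / max(len(tokens), 1)


def add_example(root, path, text):
    """Register a training description along its root-to-leaf category path."""
    node = root
    for name in path:
        child = next((c for c in node.children if c.name == name), None)
        if child is None:
            child = Node(name)
            node.children.append(child)
        child.train(text)
        node = child


def classify(root, text):
    """Top-down routing: at each level, descend into the best-scoring child."""
    tokens = tokenize(text)
    node, path = root, []
    while node.children:
        node = max(node.children, key=lambda c: c.score(tokens))
        path.append(node.name)
    return path


# Toy taxonomy and descriptions, invented for this sketch.
root = Node("root")
add_example(root, ["Electronics", "Cameras"], "digital camera 12 megapixel zoom lens")
add_example(root, ["Electronics", "Televisions"], "lcd television 42 inch hdmi")
add_example(root, ["Books", "Fiction"], "paperback novel mystery thriller")

predicted_path = classify(root, "compact digital camera with optical zoom")
print(predicted_path)
```

In a depth-aware design such as the one the abstract describes, the scoring rule and feature selection could differ per taxonomy level; this sketch uses the same rule at every level for brevity.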
Original language | English |
---|---|
Pages (from-to) | 1-27 |
Number of pages | 27 |
Journal | Journal of Web Engineering |
Volume | 17 |
Issue number | 1-2 |
Publication status | Published - Mar 2018 |
Keywords
- Product descriptions
- Feature selection
- E-commerce
- Hierarchical clustering