Enhancing directed binary trees for multi-class classification

One approach to multi-class classification consists of decomposing the original problem into a collection of binary classification tasks. The outputs of these binary classifiers are combined to produce a single prediction. Winner-takes-all, max-wins and tree voting schemes are the most popular methods for this purpose. However, tree schemes can deliver faster predictions because they need to evaluate fewer binary models. Despite previous conclusions reported in the literature, this paper shows that their performance depends on the organization of the tree scheme, i.e. the position at which each pairwise classifier is placed on the graph. Different metrics are studied for this purpose, and a new one is proposed that considers the precision and the complexity of each pairwise model, which makes the method classifier-dependent. The study is performed using Support Vector Machines (SVMs) as base classifiers, but it could be extended to other kinds of binary classifiers. The proposed method, tested on benchmark data sets and on one real-world application, is able to improve on the accuracy of other decomposition multi-class classifiers while producing even faster predictions.

Keywords: Multi-class classification, Decomposition methods, Support Vector Machines, Directed Binary Trees, Generalization error bounds
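
As a rough illustration of the speed argument in the abstract, the following minimal Python sketch contrasts max-wins voting, which evaluates all K(K-1)/2 pairwise models, with an elimination-style directed tree, which evaluates only K-1. Everything in it is an assumption made for illustration: the 1-D centroids, the `pairwise_predict` stand-in for a trained binary model, and the first-versus-last elimination rule are hypothetical and are not taken from the paper, whose actual tree-ordering metric is not reproduced here.

```python
from itertools import combinations

# Hypothetical toy setup: each class is a 1-D centroid, and the "pairwise
# classifier" for (a, b) returns whichever class centroid is nearer to x.
CENTROIDS = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}


def pairwise_predict(a, b, x):
    """Stand-in for a trained binary model separating class a from class b."""
    return a if abs(x - CENTROIDS[a]) <= abs(x - CENTROIDS[b]) else b


def max_wins(x, classes):
    """Evaluate every pairwise model; return (most-voted class, evaluations)."""
    votes = {c: 0 for c in classes}
    evaluations = 0
    for a, b in combinations(classes, 2):
        votes[pairwise_predict(a, b, x)] += 1
        evaluations += 1
    return max(classes, key=lambda c: votes[c]), evaluations


def tree_elimination(x, order):
    """Walk a directed tree: each node tests the first and last remaining
    candidates and discards the loser, so only K - 1 models are evaluated.
    The initial `order` decides which pairwise model sits at which node."""
    remaining = list(order)
    evaluations = 0
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = pairwise_predict(a, b, x)
        remaining.remove(b if winner == a else a)
        evaluations += 1
    return remaining[0], evaluations


if __name__ == "__main__":
    classes = list(CENTROIDS)
    print(max_wins(2.2, classes))          # all 6 pairwise models evaluated
    print(tree_elimination(2.2, classes))  # only 3 pairwise models evaluated
```

In this toy setting every pairwise model is consistent with the others, so the class order passed to `tree_elimination` does not change the prediction. With real, imperfect binary classifiers the order can matter, which is the effect the abstract says the paper studies.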

URI:

http://hdl.handle.net/10651/7187

ISSN:

0020-0255

Local identifier:

20121298

DOI:

10.1016/j.ins.2012.10.011

Sponsored by:

The research reported in this paper has been partially supported by the Spanish Ministerio de Economía y Competitividad (Grant TIN2011-23558). We would also like to thank Begoña de la Roza and Ana Soldado from the Department of Animal Nutrition, Grasslands and Forages of the Regional Institute for Research and Agro-Food Development (SERIDA) for providing us with their animal feed data set.