Binary relevance efficacy for multilabel classification
Subject:
Multilabel classification
Binary relevance
Synthetic datasets
Label dependency
Publication date:
Publisher:
Springer
Publisher version:
Citation:
Physical description:
Abstract:
The goal of multilabel (ML) classification is to induce models able to tag objects with the labels that best describe them. The main baseline for ML classification is Binary Relevance (BR), which is commonly criticized in the literature because of its label independence assumption. Despite this, the paper discusses some interesting properties of BR, mainly that it produces optimal models for several ML loss functions. Additionally, we present an analytical study of ML benchmark datasets, pointing out some shortcomings. As a result, this paper proposes the use of synthetic datasets to better analyze the behavior of ML methods in domains with different characteristics. To support this claim, we perform experiments on synthetic data that show the competitive performance of BR with respect to a more complex method in difficult problems with many labels, a conclusion that was not stated by previous studies.
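As a rough illustration of the Binary Relevance idea mentioned in the abstract, the following sketch trains one independent binary classifier per label and scores the result with Hamming loss. Everything here is an assumption made for illustration: the nearest-centroid base learner, the class and function names, the toy data, and the choice of Hamming loss as an example of a label-wise loss (the abstract does not name the specific losses or the base learner used in the paper).

```python
from typing import List, Sequence


class CentroidClassifier:
    """Toy binary base learner (hypothetical, for illustration only):
    predicts the class whose training centroid is closest."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "CentroidClassifier":
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            if rows:  # skip a class that never appears in training
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        def sq_dist(c: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


class BinaryRelevance:
    """One independent binary model per label (label independence assumption)."""

    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[Sequence[int]]) -> "BinaryRelevance":
        n_labels = len(Y[0])
        # Column j of Y is the binary target for the j-th model.
        self.models = [
            CentroidClassifier().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[List[int]]:
        return [[m.predict_one(x) for m in self.models] for x in X]


def hamming_loss(Y_true: Sequence[Sequence[int]], Y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of individual label assignments that are wrong."""
    total = sum(len(row) for row in Y_true)
    wrong = sum(t != p for rt, rp in zip(Y_true, Y_pred) for t, p in zip(rt, rp))
    return wrong / total


# Toy data: label 0 follows the first feature, label 1 follows the second.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]

model = BinaryRelevance().fit(X, Y)
pred = model.predict(X)
print(pred, hamming_loss(Y, pred))
```

Because Hamming loss is an average over individual label decisions, each per-label model can be trained and evaluated separately, which is the kind of setting where a label-wise method such as BR is a natural fit.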
Sponsored by:
The research reported here is supported in part under grant TIN2011-23558 from the Ministerio de Economía y Competitividad, Spain.