A family of admissible heuristics for A* to perform inference in probabilistic classifier chains
Subject:
Multi-label classification
Probabilistic classifier chains
Inference A*
Admissible heuristics
Publication date:
Editorial:
Springer
Publisher version:
Citación:
Descripción física:
Abstract:
Probabilistic classifier chains have recently gained interest in multi-label classification, due to their ability to optimally estimate the joint probability of a set of labels. The main hindrance is the excessive computational cost of performing inference in the prediction stage. This pitfall has opened the door to propose efficient inference alternatives that avoid exploring all the possible solutions. The ϵϵ -approximate algorithm, beam search and Monte Carlo sampling are appropriate techniques, but only ϵϵ -approximate algorithm with ϵ=0ϵ=0 theoretically guarantees reaching an optimal solution in terms of subset 0/1 loss. This paper offers another alternative based on heuristic search that keeps such optimality. It consists of applying the A* algorithm providing an admissible heuristic able to explore fewer nodes than the ϵϵ -approximate algorithm with ϵ=0ϵ=0 . A preliminary study has already coped with this goal, but at the expense of the high computational time of evaluating the heuristic and only for linear models. In this paper, we propose a family of heuristics defined by a parameter that controls the trade-off between the number of nodes explored and the cost of computing the heuristic. Besides, a certain value of the parameter provides a method that is also suitable for the non-linear case. The experiments reported over several benchmark datasets show that the number of nodes explored remains quite steady for different values of the parameter, although the time considerably increases for high values. Hence, low values of the parameter give heuristics that theoretically guarantee exploring fewer nodes than the ϵϵ -approximate algorithm with ϵ=0ϵ=0 and show competitive computational time. Finally, the results exhibit the good behavior of the A* algorithm using these heuristics in complex situations such as the presence of noise
Probabilistic classifier chains have recently gained interest in multi-label classification, due to their ability to optimally estimate the joint probability of a set of labels. The main hindrance is the excessive computational cost of performing inference in the prediction stage. This pitfall has opened the door to propose efficient inference alternatives that avoid exploring all the possible solutions. The ϵϵ -approximate algorithm, beam search and Monte Carlo sampling are appropriate techniques, but only ϵϵ -approximate algorithm with ϵ=0ϵ=0 theoretically guarantees reaching an optimal solution in terms of subset 0/1 loss. This paper offers another alternative based on heuristic search that keeps such optimality. It consists of applying the A* algorithm providing an admissible heuristic able to explore fewer nodes than the ϵϵ -approximate algorithm with ϵ=0ϵ=0 . A preliminary study has already coped with this goal, but at the expense of the high computational time of evaluating the heuristic and only for linear models. In this paper, we propose a family of heuristics defined by a parameter that controls the trade-off between the number of nodes explored and the cost of computing the heuristic. Besides, a certain value of the parameter provides a method that is also suitable for the non-linear case. The experiments reported over several benchmark datasets show that the number of nodes explored remains quite steady for different values of the parameter, although the time considerably increases for high values. Hence, low values of the parameter give heuristics that theoretically guarantee exploring fewer nodes than the ϵϵ -approximate algorithm with ϵ=0ϵ=0 and show competitive computational time. Finally, the results exhibit the good behavior of the A* algorithm using these heuristics in complex situations such as the presence of noise
ISSN:
Patrocinado por:
This research has been supported by the Spanish Ministerio de Economía y Competitividad (grants TIN2011-23558, TIN2015-65069)
Collections
- Artículos [36139]
- Informática [789]
- Investigaciones y Documentos OpenAIRE [7870]