Expert Systems with Applications, 38(11), p.14276-14283 (2011); doi:10.1016/j.eswa.2011.05.003
Motivated by an agriculture case study, we discuss how to learn functions able to predict whether the value of a continuous target variable will be greater than a given threshold. In the application studied, the aim was to alert on high incidences of coffee rust, the main coffee crop disease in the world. The objective is to use chemical prevention of the disease only when necessary in order to obtain healthier quality products and reductions in costs and environmental impact. In this context, the costs of misclassifications are not symmetrical: false negative predictions may lead to the loss of coffee crops. The baseline approach for this problem is to learn a regressor from the variables that records the factors affecting the appearance and growth of the disease. However, the number of errors is too high to obtain a reliable alarm system. The approaches explored here try to learn hypotheses whose predictions are allowed to return intervals rather than single points. Thus,in addition to alarms and non-alarms, these predictors identify situations with uncertain classification, which we call warnings. We present 3 different implementations: one based on regression, and 2 more based on classifiers. These methods are compared using a framework where the costs of false negatives are higher than that of false positives, and both are higher than the cost of warning predictions
The research reported here has been supported in part under grant TIN2008–06247 from the MICINN (Ministerio de Ciencia e Innovación, of Spain), and grant 2009/07366–5 from the FAPESP (Fundação de Amparo á Pesquisa do Estado de São Paulo, Brazil).