When climate variables improve the dengue forecasting: a machine learning approach

8 Apr 2024  ·  Sidney T. da Silva, Enrique C. Gabrick, Paulo R. Protachevicz, Kelly C. Iarosz, Iberê L. Caldas, Antonio M. Batista, Jürgen Kurths

Dengue is a viral vector-borne infectious disease that affects many countries worldwide, infecting around 390 million people per year. The main outbreaks occur in subtropical and tropical countries. Here we study the influence of climate on dengue in Natal, Brazil (2016-2019); Iquitos, Peru (2001-2012); and Barranquilla, Colombia (2011-2016). For the analysis and simulations, we apply Machine Learning (ML) techniques, in particular the Random Forest (RF) algorithm. As input features for the ML model, we consider three options: only dengue cases (D); climate and dengue cases (CD); humidity and dengue cases (HD). Our results show that whether climate data improve the forecast depends on the city. For Natal, D gives the best forecast. For Iquitos, it is better to use CD. For Barranquilla, the forecast is best when we include cases and humidity data (HD). For Natal, when we use more than 64\% and less than 80\% of the time series for training, we obtain forecasts for the D case with correlation coefficients ($r$) between 0.917 and 0.949 and mean absolute errors (MAE) between 57.783 and 71.768. The optimal range for Iquitos is obtained when 79\% to 88\% of the time series is used for training. Here the best option is CD, with $r$ between 0.850 and 0.887 and MAE between 2.780 and 4.156. For Barranquilla, the optimal range is obtained when 72\% to 82\% of the time series is used for training. Here the best option is HD, with $r$ between 0.942 and 0.953 and MAE between 6.085 and 6.669. We show that forecasting dengue cases is a challenging problem and that climate variables do not always help. However, among the climate variables considered, humidity is the most important.
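The evaluation described in the abstract (train on a leading fraction of the time series, forecast the remainder, then score with $r$ and MAE for the feature sets D, CD, and HD) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the helper names (`chronological_split`, `build_features`, `pearson_r`, `mean_absolute_error`), the one-step lag, and the toy numbers are all assumptions, and a persistence forecast stands in for the Random Forest model, which is not reproduced here.

```python
from math import sqrt


def chronological_split(series, train_fraction):
    """Split a series into a leading training part and a trailing test part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def build_features(cases, extra_series=(), lag=1):
    """Build lagged predictor rows and targets (hypothetical helper).

    extra_series=() gives the D feature set (cases only);
    extra_series=[humidity] gives HD; adding further climate series gives CD.
    """
    rows, targets = [], []
    for t in range(lag, len(cases)):
        rows.append([cases[t - lag]] + [s[t - lag] for s in extra_series])
        targets.append(cases[t])
    return rows, targets


def pearson_r(observed, predicted):
    """Pearson correlation coefficient; undefined if either input is constant."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


if __name__ == "__main__":
    # Synthetic toy data for illustration only; not taken from the paper.
    cases = [12, 15, 21, 30, 28, 22, 18, 14, 11, 16, 24, 33]
    humidity = [70, 72, 78, 81, 80, 76, 73, 71, 69, 74, 79, 83]

    rows, targets = build_features(cases, extra_series=[humidity])  # HD set
    train_rows, test_rows = chronological_split(rows, 0.75)
    train_y, test_y = chronological_split(targets, 0.75)

    # Stand-in model: persistence (next value = last observed case count).
    # A trained regressor would use train_rows and train_y here instead.
    predictions = [row[0] for row in test_rows]

    print("r   =", pearson_r(test_y, predictions))
    print("MAE =", mean_absolute_error(test_y, predictions))
```

Repeating the split for several values of `train_fraction` and for each feature set would give the kind of comparison the abstract reports, with a range of $r$ and MAE values per city and per option.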
