Abstract:
Many rainfall prediction models have been proposed. The common methodology
is to train a model on the data preceding a target period, test it at one or
a few time points, and then claim that the model generalizes.
However, this project shows that this procedure is not sufficient to establish that a
rainfall prediction model generalizes, because in some target periods the models failed
to achieve acceptable prediction quality. Three models, Multilayer Perceptron (MLP), M5P,
and Linear Regression, were trained on weather data collected between 2002 and
2015 at the station located in Badulla, Sri Lanka. Initially, the target period was set to
the last week of the dataset and the training period was the one week before the target week.
Then, the training period was extended one week at a time, keeping the target fixed,
until the maximum training length was reached. Next, the target period was moved back
by one week and the same procedure was repeated, resulting in 695 models. The prediction
quality was measured using Mean Absolute Error (MAE) and presented as heat-maps.
The heat-maps show that the prediction quality varies over time. The highest accuracy
was achieved by the MLP, whose MAE fell between 0 and 10 mm in 61.7% of the
total instances. This indicates that testing a model at one or a few time points is not
sufficient to demonstrate generalization. The reasons for such drastic changes in
prediction quality will be investigated in future work.
Keywords: Linear regression, M5P, Prediction, Multilayer perceptron, Rainfall
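
The evaluation procedure described in the abstract (a fixed target week, a training window that grows one week at a time, then the target moved back one week) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the mean-predicting stand-in model, the daily-readings-per-week layout, and the tiny demo series are all hypothetical, and the sketch does not attempt to reproduce the reported model count or MAE figures.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """MAE: average absolute difference between observed and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def fit_mean_model(train_values):
    """Hypothetical stand-in for MLP / M5P / Linear Regression:
    it simply predicts the mean of the training data."""
    return mean(train_values)


def evaluate_grid(weekly_rain, max_train_weeks):
    """Return {(target_week, train_weeks): MAE} for every target/training pair.

    weekly_rain is a list of weeks, each a list of rainfall readings in mm
    (an assumed layout; the abstract does not specify one).
    """
    results = {}
    # Start with the last week as the target, then move the target back.
    for target in range(len(weekly_rain) - 1, 0, -1):
        longest = min(max_train_weeks, target)
        # Grow the training window one week at a time; it always ends
        # immediately before the target week.
        for train_weeks in range(1, longest + 1):
            window = weekly_rain[target - train_weeks:target]
            train = [value for week in window for value in week]
            prediction = fit_mean_model(train)
            actual = weekly_rain[target]
            results[(target, train_weeks)] = mean_absolute_error(
                actual, [prediction] * len(actual)
            )
    return results


# Made-up series of four weeks with two readings each (illustrative only).
demo = [[0.0, 2.0], [4.0, 4.0], [10.0, 0.0], [1.0, 3.0]]
grid = evaluate_grid(demo, max_train_weeks=3)
```

Each key of `grid` corresponds to one cell of a heat-map, with the target week on one axis and the training length on the other, so variation in MAE across target weeks becomes visible instead of being hidden behind a single test point.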