Results

This page presents the verification results for the different postprocessing methodologies.

Continuous Ranked Probability Score

The Continuous Ranked Probability Score (CRPS) is a measure of the accuracy of probabilistic forecasts, considering both the sharpness and the reliability of the forecast distribution. The results considering all stations and lead times are indicated in the table below, with DRN-Members being the methodology with the lowest CRPS, closely followed by DRN-Mean.

Forecast       CRPS
PME            1.082
IMPROVER       0.789
DRN-Members    0.768
DRN-Mean       0.776
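For reference, the scores above can be reproduced conceptually with the empirical (ensemble) form of the CRPS, which compares each member to the observation and penalises ensemble spread. The sketch below is illustrative only: the function name and NumPy-based setup are assumptions, not the verification code actually used.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Empirical CRPS for one ensemble forecast against a scalar observation.

    Uses the kernel form: CRPS = E|X - y| - 0.5 * E|X - X'|,
    where X, X' are independent draws from the ensemble.
    (Illustrative sketch; not the operational verification code.)
    """
    members = np.asarray(members, dtype=float)
    # Mean absolute difference between members and the observation
    term1 = np.mean(np.abs(members - obs))
    # Mean absolute difference between all member pairs (spread penalty)
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term1 - term2
```

Averaging this quantity over all stations and lead times yields a single CRPS per forecast system, as in the table above; lower is better, and a perfect deterministic forecast scores 0.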

The CRPS is also analysed from two perspectives:

Continuous Ranked Probability Score by lead time

  • The time series of the PME forecast exhibits a diurnal cycle with worse CRPS values during nighttime lead times and shows an increasing trend in CRPS values as the lead time increases.

  • Regardless of the postprocessing methodology, all techniques improve the mean CRPS value for each lead time compared to PME. DRN-Members is the technique that exhibits the lowest CRPS during most lead times.

  • There is a noticeable worsening of the CRPS beyond 48 hours of lead time, mainly due to the loss of 3 members in the ensemble. DRN-Mean is particularly affected.

Continuous Ranked Probability Score on a map

  • The highest CRPS values of the PME raw forecast nearly reach 3.0, with most of the highest values concentrated in the Pyrenees (north of the country). In contrast, the coastline stations show better performance regarding CRPS.

  • All postprocessing methodologies exhibit a reduction in CRPS compared to the PME raw forecast. This is evident when a comparison (forecast 1 vs. forecast 2) is selected. Cool colors indicate that the CRPS of forecast 1 is lower than that of forecast 2 for a given station.

  • IMPROVER and DRN-Members show similar performance in terms of CRPS reduction. Although DRN-Mean improves upon PME and outperforms IMPROVER at some stations, it generally does not surpass the performance of IMPROVER and DRN-Members methodologies.

Rank histograms

A rank histogram is used to verify ensemble forecasts by showing the distribution of the ranks of observations relative to the ensemble predictions. It indicates whether the ensemble forecast system is biased and whether it has the appropriate spread. A flat, uniform rank histogram suggests that the ensemble forecasts are reliable, while deviations from uniformity indicate biases or incorrect spread in the forecasts.
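The construction behind these plots can be sketched as follows: for each case, the observation's rank among the sorted ensemble members is recorded, and the counts of each rank are plotted. The function below is a minimal, assumed implementation (ties between members and observations would need random tie-breaking in practice, which is omitted here).

```python
import numpy as np

def rank_histogram(members, obs):
    """Counts of observation ranks within their ensembles.

    members: array of shape (n_cases, n_members)
    obs:     array of shape (n_cases,)
    Returns an array of n_members + 1 counts (rank 0 = obs below all
    members, rank n_members = obs above all members).
    Illustrative sketch; ties are not randomised here.
    """
    members = np.asarray(members, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Rank = number of members strictly below the observation
    ranks = np.sum(members < obs[:, None], axis=1)
    return np.bincount(ranks, minlength=members.shape[1] + 1)
```

A flat count across all ranks indicates a reliable ensemble; a U shape (counts piled in the outer ranks) indicates underdispersion, while counts skewed toward the low ranks indicate a positive forecast bias.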

Rank histograms are analysed from two points of view:

Rank histogram

  • The PME forecast shows clear underdispersion, indicated by a pronounced U shape in the rank histogram for both sets of lead times.

  • EMOS-IMPROVER mitigates the underdispersion observed in PME; however, it still exhibits some underdispersion regardless of the lead time range.

  • DRN-Mean and DRN-Members exhibit the best results for the 0-48 hour lead time set, with DRN-Members being less biased and fairly uniform. However, for the 49-72 hour lead time range, both show a positive bias, which is less pronounced for DRN-Members.

Rank histogram by lead time

  • During all night-hour lead times, PME and EMOS-IMPROVER forecasts exhibit clear underdispersion, which is more pronounced for PME. DRN-Mean and DRN-Members show better performance, but for night lead times after 49 hours, they exhibit some positive bias.

  • For daylight-hour lead times, DRN-Mean and DRN-Members exhibit a fairly flat rank histogram for most lead times, obtaining the best results compared to other methods.

  • From lead times 6 to 12 hours, PME exhibits a clear positive bias, while a negative bias is reported from 16 to 21 hours. This behavior is mitigated by IMPROVER postprocessing and is almost fully corrected by DRN methodologies.