Tropical cyclone ensemble track forecasts from 153 initialization times during 2017–18 are clustered using regression mixture models. Clustering is performed on a four-ensemble dataset [ECMWF + GEFS + UKMET + CMC (EGUC)] and on a three-ensemble dataset that excludes the CMC (EGU). For both datasets, five-cluster partitions are selected for analysis, and the relationship between cluster properties (size, ensemble composition) and 96–144-h cluster-mean error is evaluated. For both datasets, small clusters produce very large errors, with the least populous cluster producing the largest error in more than 50% of forecasts. The mean of the most populous EGUC cluster outperforms the most accurate (EGU) ensemble mean in only 43% of forecasts; however, when the most populous EGUC cluster in a forecast contains ≥30% of the ensemble population, its average cluster-mean error is significantly lower than when the most populous cluster is smaller. Forecasts with a highly populous EGUC cluster also appear to have smaller EGUC-, EGU-, and ECMWF-mean errors. Cluster-mean errors also vary substantially with the ensemble composition of the cluster. The most accurate clusters are EGUC clusters that contain threshold memberships of ECMWF, GEFS, and UKMET, but not CMC. The elevated accuracy of EGUC clusters that exclude the CMC indicates the potential utility of including the CMC in clustering, despite its large ensemble-mean errors. Pruning ensembles by removing members that belong to small clusters reduces 96–144-h forecast errors for both EGUC and EGU clustering. For five-cluster partitions, a pruning threshold of 10% affects 49% and 35% of EGUC and EGU ensembles, respectively, and improves 69%–74% of the forecasts affected by pruning.
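As an illustration of the pruning step described above, the following is a minimal sketch rather than the authors' implementation. It assumes that cluster labels are already available for each ensemble member (the regression mixture model itself is not reproduced), that a cluster is "small" when it holds strictly less than the threshold fraction of the ensemble, and that the pruned forecast is a plain arithmetic mean of the remaining member tracks. The function names `prune_small_clusters` and `pruned_mean_track`, and the array layout `(members, lead times, 2)`, are hypothetical.

```python
import numpy as np


def prune_small_clusters(tracks, labels, threshold=0.10):
    """Remove members that belong to clusters smaller than `threshold`.

    tracks: array of shape (n_members, n_times, 2), assumed (lat, lon) pairs.
    labels: array of shape (n_members,), one cluster id per member.
    threshold: minimum fraction of the ensemble a cluster must hold
        to be retained (assumption: strict "<" marks a cluster as small).
    """
    tracks = np.asarray(tracks, dtype=float)
    labels = np.asarray(labels)

    # Fraction of the full ensemble held by each cluster.
    cluster_ids, counts = np.unique(labels, return_counts=True)
    fractions = counts / labels.size

    # Keep only members whose cluster meets the threshold.
    retained_ids = cluster_ids[fractions >= threshold]
    keep = np.isin(labels, retained_ids)
    return tracks[keep], labels[keep], keep


def pruned_mean_track(tracks, labels, threshold=0.10):
    """Mean track of the members that survive pruning.

    Assumption: a simple arithmetic mean of positions, which ignores
    longitude wrap-around at the date line.
    """
    kept_tracks, _, _ = prune_small_clusters(tracks, labels, threshold)
    return kept_tracks.mean(axis=0)
```

Under these assumptions, a forecast is "affected by pruning" whenever the returned mask drops at least one member; otherwise the pruned mean equals the full ensemble mean.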
All Science Journal Classification (ASJC) codes
- Atmospheric Science