Virtual water flows are used to determine the indirect water requirements of a region or product, making them an indispensable tool for water sustainability analysis and assessment. Commodity flows are key data needed to compute virtual water but are typically available only every 5 years in the United States (US). The lack of continuous, annual commodity flow data severely limits our ability to study and understand the drivers, evolution, and alterations of virtual water in the US. We build and evaluate a machine learning model using Random Forest (RF) to predict annual commodity and virtual water flow networks. The model is used to perform several modeling experiments and to illustrate the prediction of annual virtual water flows in the US during 2013–2018. We show that the RF predictions consistently outperform those from a gravity model. The overall performance of the RF algorithm improves as commodities or regions are aggregated into coarser groups. Likewise, the inclusion of past commodity flows as an additional explanatory variable enhances the RF performance. The combination of RF classification and regression allows both network connections and flows to be predicted without compromising performance. Based on our RF predictions for 2013–2018, we find that temporal variations in virtual water flows can be large for some regions in the US, underscoring the need, addressed by this study, to reconstruct domestic virtual water changes over time. By capturing inter-regional water consumption interactions in space and time, such reconstructed data could be beneficial in the future for anticipating and managing local and regional water scarcity.
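As a rough illustration of how commodity flows translate into virtual water flows, the sketch below multiplies each flow by a water intensity at the origin and sums the result per region pair. It is a minimal example under stated assumptions, not the study's method: the region labels, commodities, tonnages, and intensities are all hypothetical, and the function names `virtual_water_flows` and `net_virtual_water_import` are introduced here for illustration only.

```python
from collections import defaultdict

# Hypothetical commodity flows: (origin, destination, commodity) -> tonnes.
commodity_flows = {
    ("A", "B", "grain"): 100.0,
    ("B", "A", "meat"): 10.0,
    ("A", "C", "grain"): 50.0,
}

# Hypothetical water intensity at the origin: (origin, commodity) -> m^3 per tonne.
water_intensity = {
    ("A", "grain"): 2.0,
    ("B", "meat"): 15.0,
}


def virtual_water_flows(flows, intensity):
    """Convert commodity flows to virtual water flows (m^3) per (origin, destination)."""
    vw = defaultdict(float)
    for (origin, dest, commodity), tonnes in flows.items():
        # Assumption: water embedded in a shipment = tonnage x origin water intensity.
        vw[(origin, dest)] += tonnes * intensity[(origin, commodity)]
    return dict(vw)


def net_virtual_water_import(vw):
    """Net virtual water import per region: inflows minus outflows (m^3)."""
    net = defaultdict(float)
    for (origin, dest), volume in vw.items():
        net[dest] += volume
        net[origin] -= volume
    return dict(net)


vw = virtual_water_flows(commodity_flows, water_intensity)
net = net_virtual_water_import(vw)
```

With predicted annual commodity flows in place of the hypothetical values, the same conversion would yield one virtual water network per year, which is what makes year-to-year comparison of regional net imports possible.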
All Science Journal Classification (ASJC) codes
- Water Science and Technology