Working paper

Nowcasting World Trade with Machine Learning: a Three-Step Approach

Published on 12 July 2023
Authors: Menzie Chinn, Baptiste Meunier, Sebastian Stumpner

Working Paper Series no. 917. We nowcast world trade using machine learning, distinguishing between tree-based methods (random forest, gradient boosting) and their regression-based counterparts (macroeconomic random forest, linear gradient boosting). While much less used in the literature, the latter are found to outperform not only the tree-based techniques, but also more “traditional” linear and non-linear techniques (OLS, Markov-switching, quantile regression). They do so significantly and consistently across different horizons and real-time datasets. To further improve performance when forecasting with machine learning, we propose a flexible three-step approach composed of (step 1) pre-selection, (step 2) factor extraction and (step 3) machine learning regression. We find that both pre-selection and factor extraction significantly improve the accuracy of machine-learning-based predictions. This three-step approach also outperforms workhorse benchmarks, such as a PCA-OLS model, an elastic net, or a dynamic factor model. Finally, on top of high accuracy, the approach is flexible and can be extended seamlessly beyond world trade.

Figure N1: Decomposition of accuracy gains relative to PCA-OLS

Real-time economic analysis is often hampered by the significant lags with which indicators are published. World trade in volumes is a case in point: the earliest indicator is published by the Dutch Centraal Planbureau (CPB) roughly eight weeks after month end, meaning that March 2023 data is available around May 25th. Since these data are widely used among economists, the delay poses a challenge for policy, as decisions should rely on timely information about the current business cycle. In the meantime, a number of early indicators are available. The purpose of this paper is to exploit such information to obtain advance estimates of world trade ahead of the CPB releases.

A key novelty of this paper is the use of machine learning techniques for nowcasting. We distinguish between tree-based and regression-based techniques. The first category – tree-based – includes random forest and gradient boosting and is the most popular in the literature. It is however found to perform poorly on our dataset, supporting recent evidence that such techniques might be ill-equipped to deal with the small samples of macroeconomic time series. In contrast, the regression-based techniques – macroeconomic random forest and linear gradient boosting – provide the most accurate predictions. They outperform all other techniques, not only tree-based machine learning but also more “traditional” non-linear techniques (Markov-switching and quantile regression) and Ordinary Least Squares (OLS). They do so significantly and consistently across different horizons, real-time datasets, and states of the economy.
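To make the tree-based versus regression-based distinction concrete, the sketch below implements one common form of boosting with linear base learners (componentwise L2 boosting): at each round the residuals are fitted by a univariate linear regression on the single best predictor, instead of by a tree. Whether this matches the exact linear gradient boosting variant used in the paper is an assumption; the function name `linear_boost`, its parameters, and the synthetic data are hypothetical and purely illustrative.

```python
import numpy as np

def linear_boost(X, y, n_rounds=200, learning_rate=0.1):
    """Componentwise L2 boosting with linear base learners (illustrative sketch)."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                      # center predictors
    coef = np.zeros(X.shape[1])
    resid = y - y.mean()                 # start from the mean forecast
    col_ss = (Xc ** 2).sum(axis=0)
    col_ss[col_ss == 0] = np.inf         # constant columns are never selected
    for _ in range(n_rounds):
        xtr = Xc.T @ resid
        # Pick the predictor whose univariate fit reduces squared error the most.
        j = int(np.argmax(xtr ** 2 / col_ss))
        step = learning_rate * xtr[j] / col_ss[j]   # shrunken OLS slope
        coef[j] += step
        resid -= step * Xc[:, j]
    intercept = y.mean() - x_mean @ coef
    return intercept, coef

# Synthetic example (not the paper's data): two of five predictors matter.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + 0.1 * rng.standard_normal(100)
intercept, coef = linear_boost(X, y)
```

Because each base learner is linear, the fitted model remains a linear combination of the predictors, which is one reason such methods may be less demanding on short samples than deep trees.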

A second key contribution is to propose a three-step approach for forecasting with machine learning and large datasets. The approach works sequentially: (step 1) a pre-selection technique identifies the most informative predictors in our dataset of 600 variables; (step 2) selected variables are summarized and orthogonalized into a few factors; and (step 3) factors are used as explanatory variables in the regression of world trade, using machine learning techniques. While such pre-selection and factor extraction have already been used in the literature, our contribution is to use them in a combined framework for machine learning. We compare different methods for each step: the best-performing triplet is formed by the Least Angle Regression (LARS, see Efron et al., 2004) for pre-selection, principal component analysis (PCA) for factor extraction, and the Macroeconomic Random Forest (MRF; Goulet-Coulombe, 2020) for prediction. LARS is similar to stepwise regression: when dealing with a large set of potential regressors, variables are included step by step, but the method ensures that regression coefficients are similar in absolute value when the variables have the same correlation with the residuals.
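The sequence of the three steps can be sketched as follows. This is a minimal illustration of the structure only, not the paper's implementation: pre-selection is done here by ranking predictors on absolute correlation with the target (a stand-in for LARS), and the final regression is plain OLS (a stand-in for the macroeconomic random forest). The function `three_step_nowcast`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def three_step_nowcast(X_train, y_train, X_new, n_keep=20, n_factors=3):
    """Pre-select, extract factors, regress (illustrative stand-ins throughout)."""
    # Standardize with training moments only, so no future data leaks in.
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0)
    sd[sd == 0] = 1.0
    Z_train = (X_train - mu) / sd
    Z_new = (X_new - mu) / sd

    # Step 1: pre-selection. Stand-in for LARS: keep the n_keep predictors
    # with the highest absolute correlation with the target.
    yc = y_train - y_train.mean()
    corr = np.abs(Z_train.T @ yc) / (len(yc) * yc.std())
    keep = np.argsort(corr)[::-1][:n_keep]

    # Step 2: factor extraction by PCA (via SVD) on the selected predictors.
    _, _, Vt = np.linalg.svd(Z_train[:, keep], full_matrices=False)
    loadings = Vt[:n_factors].T
    F_train = Z_train[:, keep] @ loadings
    F_new = Z_new[:, keep] @ loadings

    # Step 3: regression on the factors. Stand-in for MRF: OLS with intercept.
    A = np.column_stack([np.ones(len(F_train)), F_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    pred = np.column_stack([np.ones(len(F_new)), F_new]) @ beta
    return pred, keep

# Synthetic example (not the paper's data): 600 predictors, 30 informative.
rng = np.random.default_rng(0)
T, N = 120, 600
f = rng.standard_normal(T + 1)
X = rng.standard_normal((T + 1, N))
X[:, :30] += 2.0 * f[:, None]            # first 30 series load on a common factor
y = f + 0.1 * rng.standard_normal(T + 1)
pred, keep = three_step_nowcast(X[:T], y[:T], X[T:], n_keep=20, n_factors=3)
```

Each step is a separate, swappable component, which reflects the flexibility the approach is meant to offer: for instance, step 1 could be replaced by an actual LARS routine (such as `sklearn.linear_model.Lars`) and step 3 by any machine learning regressor.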

The three-step approach outperforms benchmarks significantly and consistently. This three-step approach can be viewed as an extension of the widely used “diffusion index” of Stock and Watson (2002), who combine factor extraction by PCA and OLS regression. Compared to a model à la Stock and Watson (2002), the three-step approach delivers on average a 26% lower RMSE, with accuracy gains coming both from the addition of a pre-selection step and from the use of the macroeconomic random forest (Figure N1). Finally, we check that the three-step approach outperforms workhorse nowcasting models such as a dynamic factor model.
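For readers unfamiliar with the metric, a relative RMSE gain of this kind is typically computed as one minus the ratio of the model's RMSE to the benchmark's RMSE. The sketch below shows the arithmetic on made-up numbers; the function names and the data are hypothetical and do not reproduce the paper's results.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def relative_rmse_gain(actual, model_pred, benchmark_pred):
    """Share by which the model's RMSE is lower than the benchmark's."""
    return 1.0 - rmse(actual, model_pred) / rmse(actual, benchmark_pred)

# Illustrative numbers only: the benchmark misses by 1.0, the model by 0.5.
actual = [1.0, 2.0, 3.0, 4.0]
benchmark_pred = [2.0, 3.0, 4.0, 5.0]
model_pred = [1.5, 2.5, 3.5, 4.5]
gain = relative_rmse_gain(actual, model_pred, benchmark_pred)
```

A value of 0.26 for `gain` would correspond to the 26% lower RMSE reported above.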

In the end, the three-step approach can be viewed as a step-by-step method for forecasters willing to employ machine learning techniques in order to improve forecast accuracy. Aside from the use of innovative regression-based machine learning techniques, the contribution of this paper is the combination of those three steps. We show that each step improves accuracy: alternative approaches that exclude either pre-selection, factor extraction, or machine learning are found to underperform. Such findings contribute to the growing literature on machine learning by showing empirically that: (i) on short samples, machine learning techniques work best if data is summarized into factors instead of taking all of the individual series as explanatory variables, and (ii) accuracy is even greater if only a subset of the potential regressors is pre-selected.