Departmental seminar, academic year 2019/2020
as part of the STaRs Programme - Supporting Talented Researchers
Modelling Non-stationary Big Data (joint with J. Castle and J. Doornik)
SPEAKER: Professor Sir David F. Hendry, University of Oxford
ABSTRACT: Seeking substantive relationships among vast numbers of spurious connections when modelling Big Data requires an appropriate approach. Big Data are useful if they can increase the probability that the data generation process (DGP) is nested in the postulated model, increase the power of specification and mis-specification tests, and yet do not raise the chances of adventitious significance. Simply choosing the best-fitting equation, or trying hundreds of empirical fits and selecting a preferred one (perhaps contradicted by others that go unreported), is not going to lead to a useful outcome. A crucial issue addressed in this paper is that wide-sense non-stationarity (cointegration and location shifts) must be taken into account if statistical modelling by mining Big Data is to be productive. Moreover, important computational problems must be resolved given the huge numbers of possible models to be selected over. The paper discusses the use of principal components analysis to identify cointegrating relations using saturation estimators as a route to handling non-stationary big data, and considers an empirical example of modelling the monthly UK unemployment rate with Google Trends data, searching over 3000 explanatory variables and yet identifying a parsimonious, well-specified and theoretically interpretable model specification.
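The "saturation estimators" mentioned in the abstract can be illustrated with a minimal split-half impulse-indicator saturation (IIS) sketch: add an impulse dummy for every observation in each half of the sample in turn, retain the dummies with large t-ratios, and thereby detect outliers or location shifts. The function name, the critical value, and the simulated data below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def iis_split_half(y, X, t_crit=2.5):
    """Minimal split-half impulse-indicator saturation (IIS) sketch.

    For each half of the sample, regress y on X plus one impulse
    dummy per observation in that half, and record the dummies whose
    absolute t-ratios exceed t_crit. Returns the sorted indices of
    retained dummies (candidate outliers / shift points).
    """
    n = len(y)
    k = X.shape[1]
    retained = []
    for block in (range(0, n // 2), range(n // 2, n)):
        # Build impulse dummies: one column per observation in this block.
        D = np.zeros((n, len(block)))
        for j, t in enumerate(block):
            D[t, j] = 1.0
        Z = np.hstack([X, D])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        dof = n - Z.shape[1]
        sigma2 = (resid @ resid) / dof
        cov = sigma2 * np.linalg.pinv(Z.T @ Z)
        tvals = beta / np.sqrt(np.diag(cov))
        # Keep the dummies (columns after the first k) that are significant.
        for j, t in enumerate(block):
            if abs(tvals[k + j]) > t_crit:
                retained.append(t)
    return sorted(retained)

# Illustrative use: a constant-mean series with one large outlier at t = 10.
rng = np.random.default_rng(0)
n = 60
X = np.ones((n, 1))
y = 1.0 + rng.standard_normal(n)
y[10] += 10.0
print(iis_split_half(y, X))
```

Under the null of no shifts, roughly a fraction alpha of the n dummies are retained by chance, which is why tight critical values are used; the full procedure in the literature also handles selection over the retained dummies jointly with the economic regressors, which this sketch omits.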
For information, write to: firstname.lastname@example.org