ON VARIABLE SELECTION IN LINEAR REGRESSION Academic Article

abstract

  • Shibata (1981, Biometrika 68, 45–54) considers data-generating mechanisms belonging to a certain class of linear regressions with errors that are independent and identically normally distributed. He compares the variable selection criteria AIC (Akaike information criterion) and BIC (Bayesian information criterion) using the following type of comparison: for each fixed possible data-generating mechanism, the criteria are compared as the data length increases. The results of this comparison have been interpreted as meaning that, in the context of the data-generating mechanisms considered by Shibata, AIC is better than BIC for large data lengths. Shibata's comparison is pointwise in the space of data-generating mechanisms (as the data length increases). Such comparisons are potentially misleading. We consider a simple class of data-generating mechanisms satisfying Shibata's assumptions and carry out a different type of comparison. For each fixed data length (possibly large), we compare the variable selection criteria for every possible data-generating mechanism in this class. According to this comparison, for this class of data-generating mechanisms, no matter how large the data length, AIC is not better than BIC.
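
For readers unfamiliar with the two criteria, the following is a minimal sketch of how AIC and BIC might be computed for nested linear regressions with i.i.d. normal errors. It does not reproduce the comparison carried out in the article. The criterion forms used here (n log(RSS/n) plus a penalty of 2k for AIC or k log n for BIC, with constants common to all candidates dropped) are the usual textbook ones and are an assumption, since the abstract does not state them. The function names `gaussian_ic` and `select_order`, the simulated design, and the coefficient vector `beta_true` are all hypothetical.

```python
import numpy as np


def gaussian_ic(y, X):
    """Return (AIC, BIC) for a least-squares fit with i.i.d. normal errors.

    Assumed forms, up to a constant shared by all candidate models:
        AIC = n*log(RSS/n) + 2*k
        BIC = n*log(RSS/n) + k*log(n)
    where k is the number of regression coefficients.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    fit = n * np.log(rss / n)
    return fit + 2 * k, fit + k * np.log(n)


def select_order(y, X):
    """Among nested models using the first k columns of X, return the
    k chosen by AIC and the k chosen by BIC."""
    scores = [gaussian_ic(y, X[:, :k]) for k in range(1, X.shape[1] + 1)]
    aic_k = 1 + int(np.argmin([s[0] for s in scores]))
    bic_k = 1 + int(np.argmin([s[1] for s in scores]))
    return aic_k, bic_k


# Hypothetical data-generating mechanism: only the first two
# coefficients are nonzero (illustration only, not from the article).
rng = np.random.default_rng(0)
n, p = 200, 6
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

aic_k, bic_k = select_order(y, X)
print("order selected by AIC:", aic_k)
print("order selected by BIC:", bic_k)
```

Because log n exceeds 2 once n is at least 8, BIC penalises each extra coefficient more heavily than AIC, so for nested candidates the order chosen by BIC is never larger than the order chosen by AIC. Which criterion performs better depends on the data-generating mechanism and on the type of comparison used, which is the question the article addresses.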

publication date

  • August 2002