2006-01-19 Book DOI: 10.18452/3702
Maximization of Empirical Shannon Information in Testing Significant Variables of Linear Model
We study the search for an unknown set A, Card(A) = s, of significant variables of a linear model with random IID discrete binary carriers and finitely supported IID noise. Two statistics, T_1 and T_s, are proposed, both based on maximization of the Shannon Information (SI) of the corresponding classes of joint empirical input-output distributions and inspired by the related study in Csiszár and Körner (1981). The first compares the sequence of values of each variable separately with the output sequence. The second explores the relation between the output sequence and the submatrix of the (N × t) design matrix corresponding to each subset of variables of given cardinality. Here N is the number of experiments and t is the total number of variables. Both statistics are shown to be asymptotically as efficient as the ML test for the corresponding classes of joint empirical distributions in the artificial case where the ML test is applicable, that is, when the parameters b_λ, λ ∈ A, of the model and the distribution of errors are known. Our tests do not require this information. Therefore, they are asymptotically uniformly most efficient in the corresponding classes of tests. The second statistic is shown to provide the asymptotically best rate of search for the set A of significant variables as t → ∞, but it requires about t^s log t computing cycles, which may be infeasible in some applications. The first statistic requires only t log t computing cycles and provides the best order of magnitude of the characteristics studied for the second class of tests.
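The two search strategies described above can be illustrated with a minimal sketch. It assumes a plug-in estimate of Shannon information computed from empirical counts; the exact definitions of T_1 and T_s are not given in the abstract, so the function names `empirical_information`, `select_separately` and `select_jointly`, and the toy model in `simulate`, are hypothetical and only show the general idea.

```python
import itertools
import math
import random
from collections import Counter


def empirical_information(xs, ys):
    """Plug-in Shannon information (in nats) of the joint empirical
    distribution of the paired sequences xs and ys."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    # Sum of p(x,y) * log( p(x,y) / (p(x) p(y)) ) over observed pairs.
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def select_separately(design, output, s):
    """Assumed T_1-style search: score each column on its own against
    the output and keep the s columns with the largest score."""
    t = len(design[0])
    scores = [empirical_information([row[j] for row in design], output)
              for j in range(t)]
    best = sorted(range(t), key=lambda j: scores[j], reverse=True)[:s]
    return sorted(best)


def select_jointly(design, output, s):
    """Assumed T_s-style search: score every s-subset of columns jointly
    against the output and keep the best subset (cost grows like t**s)."""
    t = len(design[0])

    def score(subset):
        rows = [tuple(row[j] for j in subset) for row in design]
        return empirical_information(rows, output)

    return sorted(max(itertools.combinations(range(t), s), key=score))


def simulate(n, t, coeffs, seed=0):
    """Toy data (an assumption, not from the source): binary carriers,
    noise uniform on {0, 1}, output = sum of b_j * x_j + noise."""
    rng = random.Random(seed)
    design = [[rng.randint(0, 1) for _ in range(t)] for _ in range(n)]
    output = [sum(b * row[j] for j, b in coeffs.items()) + rng.randint(0, 1)
              for row in design]
    return design, output


design, output = simulate(n=2000, t=6, coeffs={1: 1, 4: 2})
print(select_separately(design, output, s=2))
print(select_jointly(design, output, s=2))
```

The separate search evaluates t scores, while the joint search evaluates one score per s-subset, which mirrors the gap in computing cost between the two statistics noted in the abstract.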