edoc-Server
Open-access publication server of Humboldt-Universität
2021-09-29, Journal article, DOI: 10.1186/s12874-021-01373-z
Statistical model building: Background “knowledge” based on inappropriate preselection causes misspecification
Hafermann, Lorena
Becher, Heiko
Herrmann, Carolin
Klein, Nadja
Heinze, Georg
Rauch, Geraldine
Wirtschaftswissenschaftliche Fakultät
Background: Statistical model building requires selecting variables for a model according to the model’s aim. For descriptive and explanatory models, a common recommendation in the literature is to include all variables that are assumed or known to be associated with the outcome, regardless of whether data-driven selection procedures identify them. An open question is how reliable this assumed “background knowledge” truly is. In fact, “known” predictors might be findings from preceding studies that may themselves have employed inappropriate model building strategies.

Methods: We conducted a simulation study assessing the influence of treating variables as “known predictors” in model building when the knowledge derived from preceding studies might in fact be insufficient. Within randomly generated preceding study data sets, model building with variable selection was conducted. A variable was subsequently considered a “known” predictor if a predefined number of preceding studies identified it as relevant.

Results: Even if several preceding studies identified a variable as a “true” predictor, this classification is often a false positive. Moreover, variables not identified might still be truly predictive. This holds especially if the preceding studies employed inappropriate selection methods such as univariable selection.

Conclusions: The source of “background knowledge” should be evaluated with care. Knowledge generated from preceding studies can cause misspecification.
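The idea behind the simulation described in the abstract can be illustrated with a small sketch. This is not the authors' simulation design: the three-variable setup, the effect size, the correlation `rho`, the sample size, the significance threshold, and the function names `univariable_selection` and `simulate_background_knowledge` are all assumptions chosen for illustration. The sketch generates several "preceding studies", applies univariable selection in each, and declares a variable "known" when at least `min_hits` studies selected it.

```python
import numpy as np


def univariable_selection(X, y, z_crit=1.96):
    """Flag each column of X whose marginal correlation with y is significant.

    Assumption: a normal approximation to the t-test on the Pearson
    correlation, at roughly the two-sided 5% level.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return np.abs(t) > z_crit


def simulate_background_knowledge(n_studies=5, min_hits=3, n_obs=200,
                                  rho=0.6, n_reps=500, seed=0):
    """Estimate how often each variable ends up labelled a "known" predictor.

    Hypothetical setup: x0 truly affects y, x1 has no effect but is
    correlated with x0, and x2 is independent noise.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([0.5, 0.0, 0.0])           # only x0 is a true predictor
    cov = np.array([[1.0, rho, 0.0],
                    [rho, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
    known = np.zeros(3)
    for _ in range(n_reps):
        hits = np.zeros(3, dtype=int)
        for _ in range(n_studies):             # one "preceding study" each
            X = rng.multivariate_normal(np.zeros(3), cov, size=n_obs)
            y = X @ beta + rng.normal(size=n_obs)
            hits += univariable_selection(X, y)
        known += hits >= min_hits              # "known" if enough studies agree
    return known / n_reps


if __name__ == "__main__":
    rates = simulate_background_knowledge()
    for name, rate in zip(["x0 (true)", "x1 (correlated, null)", "x2 (noise)"], rates):
        print(f"{name}: labelled 'known' in {rate:.1%} of replications")
```

Under these assumed settings, the correlated null variable `x1` is usually labelled "known", because univariable selection ignores its correlation with `x0`. That is the kind of false-positive background knowledge the abstract warns about.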
Files in this item
s12874-021-01373-z.pdf (Adobe PDF, 1.357 MB)
MD5: 40098097ec074b7f19cc2ded22167dcf
License: Attribution 4.0 International (CC BY 4.0)
DOI
10.1186/s12874-021-01373-z
Permanent URL
https://doi.org/10.1186/s12874-021-01373-z