Exploring the Difference in Multicollinearity Adjustments Between Logistic and Linear Regression

Deanna Schreiber-Gregory
Henry M Jackson Foundation for the Advancement of Military Medicine


Abstract

Multicollinearity can be briefly described as the phenomenon in which two or more identified predictor variables are linearly related, or codependent. The presence of this phenomenon can have a negative impact on an analysis as a whole and can severely limit the conclusions of a research study, regardless of whether you are employing linear or logistic regression techniques.
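As a brief illustration of how this codependence surfaces in practice, the sketch below inspects a set of predictors for linear dependence by examining their pairwise correlations and variance inflation factors (VIFs). The dataset work.study and the variables outcome and x1-x3 are hypothetical placeholders used only for illustration, not data from this study.

   /* Minimal sketch: checking predictors for linear dependence.          */
   /* work.study, outcome, and x1-x3 are hypothetical placeholder names.  */

   proc corr data=work.study;
      var x1 x2 x3;                        /* pairwise correlations among predictors */
   run;

   proc reg data=work.study;
      model outcome = x1 x2 x3 / vif tol;  /* variance inflation factors and tolerances */
   run;
   quit;

Large pairwise correlations or VIF values well above common rules of thumb suggest that two or more predictors carry overlapping information.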

In this paper, we will briefly review how to detect multicollinearity and, once it is detected, which regularization techniques are most appropriate to combat it. The nuances and assumptions of L1 (LASSO), L2 (ridge regression), and elastic net regularization, and their application to linear and logistic regression model construction, will be covered in order to provide adequate background for appropriate analytic implementation. This paper is intended for any level of SAS® user. It is also written for an audience with a background in theoretical and applied statistics, though the information is presented in such a way that readers at any level of statistical or mathematical knowledge will be able to understand the content.
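As a preview of the regularization techniques discussed in the body of the paper, the hedged sketch below applies ridge (L2) and LASSO (L1) fitting to a linear model and LASSO selection to a logistic model. Dataset and variable names (work.study, outcome, event, x1-x3) are hypothetical placeholders, and the tuning choices shown are illustrative only, not recommendations from this paper.

   /* Minimal sketch of regularized fitting; all names are placeholders. */

   /* Ridge (L2) regression for a linear model via the RIDGE= option of PROC REG */
   proc reg data=work.study outest=ridge_est ridge=0 to 1 by 0.1;
      model outcome = x1 x2 x3;            /* ridge trace estimates stored in ridge_est */
   run;
   quit;

   /* LASSO (L1) selection for a linear model via PROC GLMSELECT                */
   /* (SELECTION=ELASTICNET is also available in recent SAS/STAT releases)      */
   proc glmselect data=work.study;
      model outcome = x1 x2 x3 / selection=lasso(choose=cv stop=none)
                                 cvmethod=random(5);   /* 5-fold cross validation */
   run;

   /* LASSO (L1) selection for a logistic model via PROC HPGENSELECT */
   proc hpgenselect data=work.study;
      model event(event='1') = x1 x2 x3 / distribution=binary link=logit;
      selection method=lasso;
   run;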