Regression - Diagnostic - Plot - Cook's Distance extension

Creates a line/rug plot showing Cook's distance for each observation used to fit the regression model.


Sample output from plotting the Cook's distances for a quasi-Poisson regression model. No data points appear to be overly influential.


The plot is used to detect highly influential data points, i.e., observations that can have a large effect on the outcome and accuracy of the regression. For large sample sizes, a rough guideline is to treat Cook's distances above 4/(n-p) as indicating highly influential points, where n is the sample size and p is the number of predictors including the intercept.
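
As a rough illustration of the 4/(n-p) guideline, the JavaScript sketch below computes the cutoff and lists the observations whose Cook's distance exceeds it. The function names cooksCutoff and flagInfluential and the distance values are hypothetical, chosen only for this example; they are not part of Q or of the stats R package. The sample is kept small for readability, even though the guideline is intended for large samples.

```javascript
// Rough large-sample cutoff for Cook's distance: 4 / (n - p).
// n = sample size, p = number of predictors including the intercept.
// (Hypothetical helper, not part of Q or the stats R package.)
function cooksCutoff(n, p) {
  if (!Number.isInteger(n) || !Number.isInteger(p) || p < 1 || n <= p) {
    throw new RangeError("expected integers with n > p >= 1");
  }
  return 4 / (n - p);
}

// Return the 1-based indices of observations whose Cook's distance
// exceeds the cutoff. (Hypothetical helper for illustration only.)
function flagInfluential(distances, p) {
  const cutoff = cooksCutoff(distances.length, p);
  const flagged = [];
  distances.forEach((d, i) => {
    if (d > cutoff) flagged.push(i + 1);
  });
  return flagged;
}

// Made-up distances for 12 observations from a model with an
// intercept and one predictor (p = 2).
const distances = [0.01, 0.03, 0.02, 0.55, 0.04, 0.01,
                   0.02, 0.05, 0.03, 0.02, 0.01, 0.06];

console.log(cooksCutoff(distances.length, 2)); // 0.4
console.log(flagInfluential(distances, 2));    // [ 4 ]
```

Only observation 4 is flagged here, because its distance of 0.55 is the only value above the cutoff of 4/(12-2) = 0.4. A flagged point is a candidate for closer inspection, not automatically an error in the data.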


Uses the plot.lm function from the stats R package, which also handles glm fits because glm objects inherit from class lm.


Cook, R. Dennis (1977). Detection of Influential Observations in Linear Regression. Technometrics. American Statistical Association. 19 (1): 15–18. DOI: 10.2307/1268249.

Williams, D. A. (1987). Generalized linear model diagnostics using the deviance and single case deletions. Applied Statistics. 36: 181–191. DOI: 10.2307/2347550.

Fox, J., & Weisberg, S. (2011). An R Companion to Applied Regression. 2nd Edition. SAGE Publications. ISBN: 9781412975148.


includeWeb("QScript R Output Functions");

const menu_location = "Regression > Diagnostic > Plot > Cook's Distance";