Regression - Diagnostic - Plot - Cook's Distance

A line/rug plot showing Cook's Distance for each observation fitted in the regression model.

Example

Sample output from plotting the Cook's distances for a quasi-Poisson regression model. No data points appear to be overly influential.

Details

Cook's distance is used to detect highly influential data points, i.e. data points that can have a large effect on the outcome and accuracy of the regression. For large sample sizes, a rough guideline is to treat values above 4/(n-p) as indicating highly influential points, where n is the sample size and p is the number of predictors including the intercept.
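
The 4/(n-p) guideline can be illustrated with a minimal sketch. This is not part of the Q menu item or of the stats R package; the function names cooksDistanceCutoff and flagInfluential and the distances are hypothetical, chosen only to show the arithmetic. The sample here is deliberately tiny so the numbers are easy to follow, whereas the guideline itself is intended for large samples.

```javascript
// Rough cutoff for Cook's distance: 4 / (n - p), where n is the sample size
// and p is the number of predictors including the intercept.
// (Hypothetical helper, for illustration only.)
function cooksDistanceCutoff(n, p) {
    if (n <= p) {
        throw new Error("Sample size n must exceed the number of coefficients p.");
    }
    return 4 / (n - p);
}

// Return the zero-based indices of observations whose Cook's distance
// exceeds the rough cutoff. (Hypothetical helper, for illustration only.)
function flagInfluential(distances, p) {
    const cutoff = cooksDistanceCutoff(distances.length, p);
    const flagged = [];
    distances.forEach((d, i) => {
        if (d > cutoff) flagged.push(i);
    });
    return flagged;
}

// Made-up Cook's distances for 10 observations from a model with p = 2
// (one predictor plus the intercept).
const exampleDistances = [0.02, 0.10, 0.65, 0.01, 0.05, 0.30, 0.04, 0.90, 0.03, 0.07];
const influential = flagInfluential(exampleDistances, 2);

// The cutoff is 4 / (10 - 2) = 0.5, so the observations at indices 2 and 7 are flagged.
console.log(influential);
```

A flagged observation is a candidate for closer inspection rather than automatic removal, since the cutoff is only a rough guideline.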

Acknowledgements

Uses the plot.lm function from the stats R package, which is also the plot method applied to models fitted with glm.

References

Cook, R. Dennis (1977). Detection of Influential Observations in Linear Regression. Technometrics. American Statistical Association. 19 (1): 15–18. DOI: 10.2307/1268249.

Williams, D. A. (1987). Generalized linear model diagnostics using the deviance and single case deletions. Applied Statistics. 36: 181–191. DOI: 10.2307/2347550.

Fox, J., & Weisberg, S. (2011). An R Companion to Applied Regression. 2nd Edition. SAGE Publications. ISBN: 9781412975148.

Code

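// Load the shared library of QScript helper functions for R outputs.
includeWeb("QScript R Output Functions");

// Menu path identifying this diagnostic; passed to the helper that builds the R output from the current selection.
const menu_location = "Regression > Diagnostic > Plot > Cook's Distance";
createDiagnosticROutputFromSelection(menu_location);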