Influential observations are data points, often outliers, that have a large effect on the slope of a regression line. It is useful to be able to remove them automatically before running a simple regression.
The function above runs a regression on colA and colB after removing the 10 most influential observations from the data set.
The approach is to remove each observation one at a time and re-run the regression on the data set without that observation. After each run, the difference in model fit is recorded. Once all the regression runs are complete, the N observations with the largest difference in fit are removed from the data set, and the final regression is run without them.
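
The leave-one-out procedure described here can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the function referred to above: the names fit_line and regress_without_influential are hypothetical, the inputs are assumed to be plain lists of numbers, and "difference in model fit" is assumed to mean the absolute change in R-squared between the full fit and the fit with one observation left out. Other measures of fit would slot into the same loop.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Assumes the x values are not
    all identical, otherwise the slope is undefined.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple regression, R-squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


def regress_without_influential(xs, ys, n_remove=10):
    """Drop the n_remove observations whose removal changes the fit most.

    Assumption: the change in fit is measured as |R^2 without i - R^2 full|.
    Returns ((slope, intercept, r_squared), dropped_indices).
    """
    _, _, full_r2 = fit_line(xs, ys)

    changes = []
    for i in range(len(xs)):
        # Re-run the regression on the data set minus observation i.
        loo_x = xs[:i] + xs[i + 1:]
        loo_y = ys[:i] + ys[i + 1:]
        _, _, loo_r2 = fit_line(loo_x, loo_y)
        changes.append((abs(loo_r2 - full_r2), i))

    # Keep the indices with the largest recorded change in fit.
    dropped = {i for _, i in sorted(changes, reverse=True)[:n_remove]}

    kept_x = [x for i, x in enumerate(xs) if i not in dropped]
    kept_y = [y for i, y in enumerate(ys) if i not in dropped]
    return fit_line(kept_x, kept_y), sorted(dropped)
```

Calling regress_without_influential(col_a, col_b, n_remove=10) on two equal-length lists would return the final fit together with the indices that were removed. The sketch refits the model once per observation, so the cost grows roughly with the square of the number of observations; that is usually acceptable for a simple regression on a modest data set.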