Here is code to calculate RMSE and MAE in R and SAS.

RMSE (root mean squared error), also called RMSD (root mean squared deviation), and MAE (mean absolute error) are both used to evaluate models by summarizing the differences between the actual (observed) and predicted values. MAE gives equal weight to all errors, while RMSE gives extra weight to large errors because each error is squared before the errors are averaged.
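Written out, the two measures look like this. The notation is introduced here for illustration only and does not come from the code: n is the number of observations, y_i is an actual value, and ŷ_i is the matching predicted value.

```latex
e_i = y_i - \hat{y}_i, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert e_i \rvert, \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_i^{2}}
```

As a small illustration of the weighting, the error sets (2, 2, 2, 2) and (0, 0, 0, 8) have the same MAE of 2, but their RMSEs are 2 and 4 respectively.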

First, in R:

    # Function that returns Root Mean Squared Error
    rmse <- function(error) {
      sqrt(mean(error^2))
    }

    # Function that returns Mean Absolute Error
    mae <- function(error) {
      mean(abs(error))
    }

    # Example data
    actual <- c(4, 6, 9, 10, 4, 6, 4, 7, 8, 7)
    predicted <- c(5, 6, 8, 10, 4, 8, 4, 9, 8, 9)

    # Calculate error
    error <- actual - predicted

    # Example of invocation of functions
    rmse(error)
    mae(error)

    # Example in a linear model
    ## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
    ## Page 9: Plant Weight Data.
    ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
    trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
    group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
    weight <- c(ctl, trt)
    lm.D9 <- lm(weight ~ group)
    rmse(lm.D9$residuals) # root mean squared error
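As a hand check on the example data, the errors (actual minus predicted) are -1, 0, 1, 0, 0, -2, 0, -2, 0, -2. A sketch of the arithmetic, worked from those ten values:

```latex
\mathrm{MAE} = \frac{1+0+1+0+0+2+0+2+0+2}{10} = \frac{8}{10} = 0.8

\mathrm{RMSE} = \sqrt{\frac{1+0+1+0+0+4+0+4+0+4}{10}} = \sqrt{1.4} \approx 1.183
```

So mae(error) should return 0.8 and rmse(error) should return about 1.183.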

In SAS, RMSE and MAE can be calculated with either a DATA step or PROC SQL. The PROC SQL version is much shorter than the DATA step version.

    /* Macro to calculate Mean Absolute Error and Root Mean Squared Error */
    /* Outputs to data set, log, and macro variable */
    %macro mae_rmse(
        dataset /* Data set which contains the actual and predicted values */,
        actual /* Variable which contains the actual or observed value */,
        predicted /* Variable which contains the predicted value */
        );
    %global mae rmse; /* Make the scope of the macro variables global */
    data &dataset;
        retain square_error_sum abs_error_sum;
        set &dataset
            end=last /* Flag for the last observation */
            ;
        error = &actual - &predicted; /* Calculate simple error */
        square_error = error * error; /* error^2 */
        if _n_ eq 1 then do;
            /* Initialize the sums */
            square_error_sum = square_error;
            abs_error_sum = abs(error);
            end;
        else do;
            /* Add to the sum */
            square_error_sum = square_error_sum + square_error;
            abs_error_sum = abs_error_sum + abs(error);
            end;
        if last then do;
            /* Calculate RMSE and MAE and store in SAS data set. */
            mae = abs_error_sum/_n_;
            rmse = sqrt(square_error_sum/_n_);
            /* Write to SAS log */
            put 'NOTE: ' mae= rmse=;
            /* Store in SAS macro variables */
            call symput('mae', put(mae, 20.10));
            call symput('rmse', put(rmse, 20.10));
            end;
    run;
    %mend;

    /* Alternative macro that uses PROC SQL. Output is only a macro variable */
    %macro mae_rmse_sql(
        dataset /* Data set which contains the actual and predicted values */,
        actual /* Variable which contains the actual or observed value */,
        predicted /* Variable which contains the predicted value */
        );
    %global mae rmse; /* Make the scope of the macro variables global */
    proc sql noprint;
        select count(1) into :count from &dataset;
        select mean(abs(&actual-&predicted)) format 20.10 into :mae from &dataset;
        select sqrt(mean((&actual-&predicted)**2)) format 20.10 into :rmse from &dataset;
    quit;
    %mend;

    /* Example data set */
    data error;
        input actual predicted;
        datalines;
    4 5
    6 6
    9 8
    10 10
    4 4
    6 8
    4 4
    7 9
    8 8
    7 5
    ;
    run;

    /* Example of macro invocation */
    %mae_rmse(error, actual, predicted);
    %put NOTE: mae=&mae rmse=&rmse;

    /* Example of SQL-based macro */
    %mae_rmse_sql(error, actual, predicted);
    %put NOTE: mae=&mae rmse=&rmse;

Tested with R 3.0.1 and SAS 9.3.

Thank you for the R rmse formula/function. This was helpful when I was having doubts about how to calculate it. It is just as simple as the name implies! 🙂
