# Blog moved to Blogger.com

After five years and 405,557 page views on WordPress.com, I will now be posting from http://heuristicandrew.blogspot.com/. Please update any bookmarks, RSS feeds, or Atom feeds. New Atom feeds:

- All posts: http://heuristicandrew.blogspot.com//feeds/posts/default/
- R posts: http://heuristicandrew.blogspot.com//feeds/posts/default/-/r
- SAS posts: http://heuristicandrew.blogspot.com//feeds/posts/default/-/sas

# Practical MD5 in SAS

This guide introduces MD5 and hash functions in general, lists common uses for hash functions, gives advice on how to best use MD5 in SAS, and covers common issues.
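To make the idea concrete, here is a minimal sketch in Python rather than SAS (the language swap is mine, not the guide's). It hashes a string with MD5 and shows one common use, building a fixed-length key from several fields. The helper names `md5_hex` and `row_key` and the `|` delimiter are assumptions for illustration only.

```python
import hashlib


def md5_hex(text: str, encoding: str = "utf-8") -> str:
    """Return the MD5 digest of text as 32 lowercase hex characters."""
    return hashlib.md5(text.encode(encoding)).hexdigest()


def row_key(*fields: str, sep: str = "|") -> str:
    """Hypothetical helper: hash several fields into one fixed-length key.

    Joining with a delimiter keeps ("ab", "c") and ("a", "bc") from
    producing the same input string, and therefore the same hash.
    """
    return md5_hex(sep.join(fields))


if __name__ == "__main__":
    print(md5_hex("hello"))
    print(row_key("ab", "c") == row_key("a", "bc"))
```

The digest is always 128 bits (32 hex characters) regardless of input length, which is what makes it convenient as a compact key. MD5 is not collision resistant against a deliberate attacker, so this kind of key suits deduplication and change detection, not security.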

# Email address normalization in SAS

This SAS macro performs email address normalization by changing email addresses like First.Last+tag@googlemail.com to the canonical form firstlast@gmail.com. It also demonstrates basic unit testing in SAS, which ensures quality and eases code maintenance. Email address normalization is often used to transform email addresses into unique keys for identifying or preventing duplicate accounts.
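The transformation in that example can be sketched in a few lines. This is a Python illustration, not the SAS macro itself, and it assumes only the rules implied by the example: lowercase everything, and for Gmail domains drop the `+tag` suffix and the dots in the local part, then map googlemail.com to gmail.com. The function name `normalize_email` is hypothetical, and the sketch does no address validation.

```python
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}


def normalize_email(address: str) -> str:
    """Return a canonical form of an email address for use as a key."""
    cleaned = address.strip().lower()
    local, at, domain = cleaned.rpartition("@")
    if not at:
        # No "@" found: not an address we can split, so return it unchanged.
        return cleaned
    if domain in GMAIL_DOMAINS:
        # Assumed Gmail rules: ignore "+tag" and dots in the local part.
        local = local.split("+", 1)[0].replace(".", "")
        domain = "gmail.com"
    return f"{local}@{domain}"


# Basic unit tests in the spirit of the post: each is one input and one expected key.
assert normalize_email("First.Last+tag@googlemail.com") == "firstlast@gmail.com"
assert normalize_email("Jane.Doe@Example.com") == "jane.doe@example.com"
```

Dots and tags are left alone for other domains because those providers may treat them as significant.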

# Binomial confidence intervals: exact vs. approximate

This graph and R code compare exact and normal-approximation 95% binomial confidence intervals for n trials with either one success or 50% success.
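For readers who want to see the two calculations side by side, here is a rough sketch in Python rather than R. It assumes "exact" means the Clopper-Pearson interval and "normal approximation" means the Wald interval, which the excerpt does not state. The exact limits are found by bisection on the binomial CDF so that only the standard library is needed; the function names are mine.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target: float, iters: int = 100) -> float:
    """Bisection on [0, 1] for g(p) = target, where g increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k: int, n: int, conf: float = 0.95) -> tuple[float, float]:
    """Clopper-Pearson interval (assumed meaning of 'exact')."""
    alpha = 1 - conf
    # Lower limit: P(X >= k | p) = alpha / 2. Upper limit: P(X <= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


def normal_ci(k: int, n: int, conf: float = 0.95) -> tuple[float, float]:
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = k / n
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


if __name__ == "__main__":
    for k, n in [(1, 10), (5, 10), (1, 100), (50, 100)]:
        print(k, n, exact_ci(k, n), normal_ci(k, n))
```

With one success in ten trials the normal approximation's lower limit falls below zero, which is impossible for a proportion, while the exact interval stays inside [0, 1]. At 50% success the two intervals are much closer.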

# Bar plot with error bars in R

Here’s a simple way to make a bar plot with three kinds of error bars: standard deviation, standard error of the mean, and a 95% confidence interval. The key step is to precalculate the statistics for ggplot2.
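The precalculation step is independent of the plotting library, so here is a small sketch of it in Python instead of R. It computes the three bar heights' error widths for one group of values. One assumption to flag: the 95% interval below uses a normal quantile (about 1.96), whereas a t quantile is often preferred for small samples, and the excerpt does not say which the post uses. The function name `error_bar_stats` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def error_bar_stats(values, confidence: float = 0.95) -> dict:
    """Return the mean plus three candidate error-bar half-widths."""
    n = len(values)
    sd = stdev(values)              # sample standard deviation (n - 1 denominator)
    se = sd / sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # assumed: normal, not t
    return {"mean": mean(values), "sd": sd, "se": se, "ci": z * se}


if __name__ == "__main__":
    stats = error_bar_stats([2, 4, 4, 4, 5, 5, 7, 9])
    # Each bar would be drawn from mean - width to mean + width.
    for name in ("sd", "se", "ci"):
        print(name, stats["mean"] - stats[name], stats["mean"] + stats[name])
```

Computing one row of statistics per group up front means the plotting call only has to draw what it is given.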

# Get free disk space in SAS on Windows

This SAS macro retrieves the amount of free disk space, and puts the value in the SAS log and in a global macro variable. It works with local and remote drives and mapped and UNC paths. To avoid data loss, use it as a sanity check to verify there is a reasonable amount of disk space before writing data.
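The same sanity check can be sketched outside SAS. The Python version below is an illustration, not the macro: it uses the standard library's `shutil.disk_usage`, and the helper names `free_bytes` and `has_room` are hypothetical.

```python
import shutil


def free_bytes(path: str = ".") -> int:
    """Return the free space, in bytes, on the volume that holds path."""
    return shutil.disk_usage(path).free


def has_room(path: str, required_bytes: int) -> bool:
    """True if the volume holding path has at least required_bytes free."""
    return free_bytes(path) >= required_bytes


if __name__ == "__main__":
    needed = 5 * 1024**3  # example threshold: 5 GiB
    if has_room(".", needed):
        print("Enough space, safe to write.")
    else:
        print(f"Only {free_bytes('.')} bytes free, skipping the write.")
```

Checking before the write, rather than handling a failure afterward, avoids leaving a truncated output file behind.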

# Calculate RMSE and MAE in R and SAS

Here is code to calculate RMSE and MAE in R and SAS. RMSE (root mean squared error), also called RMSD (root mean squared deviation), and MAE (mean absolute error) are both used to evaluate models. MAE gives equal weight to all errors, while RMSE gives extra weight to large errors.
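As a quick illustration of the difference in weighting, here is a minimal sketch in Python (not the post's R or SAS code). Both functions follow the standard definitions; the sample numbers are made up for the demonstration.

```python
from math import sqrt


def rmse(actual, predicted) -> float:
    """Root mean squared error: sqrt(mean((actual - predicted)^2))."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted) -> float:
    """Mean absolute error: mean(|actual - predicted|)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    actual = [1, 2, 3, 4]
    spread_out = [2, 3, 4, 5]   # four errors of size 1
    one_big = [1, 2, 3, 8]      # one error of size 4
    print(mae(actual, spread_out), rmse(actual, spread_out))
    print(mae(actual, one_big), rmse(actual, one_big))
```

Both sets of predictions have the same MAE, but the set with a single large error has twice the RMSE, because squaring magnifies large errors before they are averaged.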

# Kernel panic – not syncing: VFS: Unable to mount root fs on unknown block(0,0)

This is how to resolve the following error, which I encountered while upgrading Fedora 16 (EOL) to Fedora 17 using yum: `Loading Fedora (3.8.12-100.fc17.i686)`, then `Kernel panic – not syncing: VFS: Unable to mount root fs on unknown block(0,0)`, then `Pid: 1, comm: swapper/0 Not tainted 3.8.12-100.fc17.i686 #1`. The error and resolution are nearly identical to the Kernel…

# Geolocate IP addresses in R

This R function uses the free freegeoip.net geocoding service to resolve an IP address (or a vector of them) into country, region, city, zip, latitude, longitude, area and metro codes.

# Popup notification from R on Windows

After R is done running a long process, you may need to notify the operator to check the R console and provide the next commands. Without installing any more software or creating any batch files or VBS scripts, here is a simple way to create the popup notice in Windows.