SAS has powerful features for reading data from text files (including datalines and external .CSV files). While reading text, however, you may find that your data set is missing rows and the log contains this unintuitive message (technically a “note”, not an error):
NOTE: SAS went to a new line when INPUT statement reached past the end of a line
The best solution depends on the situation.
Example 1: number and character variables
First, let’s reproduce the note with one numeric and one character variable:
data rickroll;
    input line 1. lyric $200.;
    datalines;
0 I just wanna tell you how I'm feeling
1 Gotta make you understand
2 Never gonna give you up
3 Never gonna let you down
;
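SAS tries to fill the second variable, lyric, with 200 characters. Because every data line is shorter than that, SAS reaches the end of the line before the informat is satisfied and moves on to the next line to finish reading lyric. Each observation therefore consumes two input lines, and rows go missing from the data set. There are several options to deal with this situation (from the SAS documentation on INFILE):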
- FLOWOVER: causes an INPUT statement to continue to read the next input data record if it does not find values in the current input line for all the variables in the statement. FLOWOVER is the default behavior of the INPUT statement.
- MISSOVER: prevents an INPUT statement from reading a new input data record if it does not find values in the current input line for all the variables in the statement. When an INPUT statement reaches the end of the current input data record, variables without any values assigned are set to missing.
- STOPOVER: causes the DATA step to stop processing if an INPUT statement reaches the end of the current record without finding values for all variables in the statement. When an input line does not contain the expected number of values, SAS sets _ERROR_ to 1, stops building the data set as if a STOP statement has executed, and prints the incomplete data line.
- TRUNCOVER: overrides the default behavior of the INPUT statement when an input data record is shorter than the INPUT statement expects. By default, the INPUT statement automatically reads the next input data record. TRUNCOVER enables you to read variable-length records when some records are shorter than the INPUT statement expects. Variables without any values assigned are set to missing.
In our case, the most appropriate option is TRUNCOVER, which keeps whatever text is present on a short line:
data rickroll;
    infile datalines truncover;
    input line 1. lyric $200.;
    datalines;
0 I just wanna tell you how I'm feeling
1 Gotta make you understand
2 Never gonna give you up
3 Never gonna let you down
;
Now the data set is complete (with four rows), and the SAS log doesn’t have any complaints.
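To see why MISSOVER is a poorer fit for this example, here is a minimal sketch that reruns the same step with MISSOVER instead. The data set name rickroll_missover is made up for illustration. With formatted input, MISSOVER sets a variable to missing when the record ends before the informat width is satisfied, so lyric should come out blank in all four rows, whereas TRUNCOVER keeps the partial text.

```sas
/* Same data as above, but with MISSOVER instead of TRUNCOVER.   */
/* rickroll_missover is a hypothetical name used for comparison. */
data rickroll_missover;
    infile datalines missover;
    /* line is read from column 1; lyric asks for 200 columns,   */
    /* which is more than any of the input lines provides.       */
    input line 1. lyric $200.;
    datalines;
0 I just wanna tell you how I'm feeling
1 Gotta make you understand
2 Never gonna give you up
3 Never gonna let you down
;

/* Print the result to compare lyric against the TRUNCOVER run. */
proc print data=rickroll_missover;
run;
```

The row count matches the TRUNCOVER version, so the log stays quiet, but the lyrics are lost. That is why TRUNCOVER is usually the safer choice when a formatted informat is wider than the data.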
Example 2: three numbers
Now we’ll create a data set with three numeric variables but won’t specify how to treat the exception, and watch what happens to the data set.
data lame_numbers;
    infile datalines delimiter=',';
    input a b c;
    datalines;
0,1,2
3
.,4
;
Oh no! SAS defaulted to FLOWOVER behavior, which combined the second and third input lines into a single row (a=3, b missing, c=4). That could cause some problems. Let’s fix it with MISSOVER:
data lame_numbers;
    infile datalines missover delimiter=',';
    input a b c;
    datalines;
0,1,2
3
.,4
;
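Now we have three rows, one per input line, with missing values (shown as a period) where a line ran short:
a  b  c
0  1  2
3  .  .
.  4  .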
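The same option applies when the data comes from an external file rather than datalines. The sketch below is an illustration under assumptions, not part of the original test: the fileref numbers and the data set name lame_numbers_file are hypothetical, and a temporary file stands in for a real .CSV so that the example is self-contained.

```sas
/* Hypothetical temporary file standing in for an external .CSV. */
filename numbers temp;

/* Write the same three records used in Example 2 to the file. */
data _null_;
    file numbers;
    put '0,1,2';
    put '3';
    put '.,4';
run;

/* Read the file back; MISSOVER keeps one row per record and   */
/* sets variables without a value on a short record to missing. */
data lame_numbers_file;
    infile numbers delimiter=',' missover;
    input a b c;
run;

proc print data=lame_numbers_file;
run;
```

To read a real file, replace the fileref with a quoted path in the INFILE statement. The INFILE options stay the same.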
Tested with Base SAS 9.2 on Windows XP SP3.