
Accessing data
Accessing data from text files: comma delimited files

If your dataset is large (more than 100 observations or more than 10 variables), you will most likely want SAS to read your data directly from an external file in which you have it saved. This is easily done if your data are in a text file, such as a *.txt file.

SAS statements to read the data from a text file (Method 1):

SAS needs to be told two things: a SAS name for the dataset, and the name of the external text file. These are accomplished with one SAS statement:

      The PROC IMPORT statement has two options:
           DATAFILE= specifies the name of the text file
           OUT= specifies the SAS dataset name.

      SAS statements always start with a keyword and end with a semicolon.

      In addition, use the REPLACE option to ensure that SAS overwrites any pre-existing version of your dataset. Each time you run PROC IMPORT, it attempts to create the requested dataset. If that dataset already exists and you do not specify REPLACE, SAS will not create a new one (i.e., it leaves the old copy active). This can cause great confusion: you correct your data file, but SAS appears to ignore the changes because the old copy is still being used.

SAS statements to read the data from a text file (Method 2):

SAS needs to be told four things: a SAS name for the dataset, the external file name, names for each of your variables, and any special format for those variables. These are accomplished with three SAS statements:

      The DATA statement starts this process by naming the dataset, for example DATA one;

      The INFILE statement is used to identify the external file, and options needed to read the file.

      The INPUT statement names the variables and describes how they should be read in.


Examples

example 1: Using INPUT to read a comma-delimited file

DATA one; INFILE 'c:\dawg\tab1.csv' DSD FIRSTOBS=2;
INPUT x y z;
RUN;

This example reads 3 variables from the tab1.csv file located in the specified folder. DSD is required: it tells SAS that commas are the delimiters, and that when two commas appear with nothing between them, that variable's value is set to missing. Often the first row of the file contains column headings, so the FIRSTOBS=2 option requests that data be read starting on line 2.
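
To see how DSD treats two commas in a row, the following minimal sketch reads a few lines of inline data instead of an external file. The dataset name demo and the data values are made up for illustration; INFILE DATALINES simply points the INFILE options at the lines that follow the DATALINES statement.

```sas
* Hypothetical inline data standing in for an external .csv file;
DATA demo;
  INFILE DATALINES DSD;   * DSD: commas delimit values, and ",," means missing;
  INPUT x y z;
  DATALINES;
1,2,3
4,,6
;
RUN;

* Print the dataset to check how each value was read;
PROC PRINT DATA=demo;
RUN;
```

In the printed output, the second observation should show a missing value (a period) for y, because nothing appears between the two commas on that line.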

example 2: Using a formatted INPUT to read a comma-delimited file

If a variable has letter values, use the $ symbol after its name. Other informats that are commonly used are for dates; for example, MMDDYY8. reads month-day-year data with the / separator, up to 8 characters wide (e.g., 12/31/05). When the file is delimited, put a colon before the informat so that SAS still stops reading at the next comma rather than reading a fixed 8 columns. Otherwise the access instructions do not change, for example
DATA one; INFILE 'c:\dawg\tab1.csv' DSD FIRSTOBS=2;
INPUT x $ y :MMDDYY8. z;
RUN;

The variable z will be read in as a number by default.
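
The formatted read can be tried out on inline data, as in the sketch below. The dataset name demo2 and the data values are hypothetical. The colon before MMDDYY8. tells SAS to apply the informat while still treating the comma as the end of the value. The FORMAT statement is an addition that only affects how the date is displayed; without it, SAS prints the date as its internal day count.

```sas
* Hypothetical inline data: a character value, a date, and a number;
DATA demo2;
  INFILE DATALINES DSD;
  INPUT x $ y :MMDDYY8. z;   * colon: use the informat, but stop at the comma;
  FORMAT y MMDDYY8.;         * display y as a date instead of a day count;
  DATALINES;
a,12/31/05,10
b,01/15/06,20
;
RUN;

PROC PRINT DATA=demo2;
RUN;
```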

example 3: Using PROC IMPORT to read a comma-delimited file

An example of using PROC IMPORT is:
PROC IMPORT DATAFILE= 'c:\dawg\tab1.csv' OUT=one REPLACE;
RUN;
By default, PROC IMPORT infers the file type from the extension: .csv indicates a comma-delimited file, and .txt indicates a tab-delimited file. If your comma-delimited file has a different extension, then add the option DBMS=CSV to the PROC IMPORT statement:
PROC IMPORT DBMS=CSV DATAFILE= 'c:\dawg\tab1.csv' OUT=one REPLACE;
RUN;
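
After an import, it can be worth checking what SAS actually created. The sketch below assumes the dataset one from the example above already exists in the current session. PROC CONTENTS lists the variable names and types, and the OBS=5 dataset option limits PROC PRINT to the first five observations.

```sas
* List variable names, types, and lengths for the imported dataset;
PROC CONTENTS DATA=one;
RUN;

* Print only the first 5 observations as a quick visual check;
PROC PRINT DATA=one (OBS=5);
RUN;
```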
