
Working with SAS datasets
Subsetting the dataset

Occasionally you will want to run a procedure on only part of your data, so you need to create a dataset containing a subset of your observations.

Suppose you have several years of data in DATA One, and want to create a new dataset containing only 2005 data.
DATA two; SET one;
IF year=2005;
RUN;

will accomplish this, as only observations for which the IF statement is true will be retained in DATA Two.
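
To try this without real data, the sketch below first builds a small DATA One and then subsets it. The variable x and all of the values are invented for illustration; only the variable year comes from the example above.

```sas
/* Hypothetical sample data: the values of year and x are made up */
DATA one;
  INPUT year x;
  DATALINES;
2004 10
2004 12
2005 15
2005 17
2006 20
;
RUN;

/* Subsetting IF: only observations with year=2005 are kept */
DATA two; SET one;
  IF year=2005;
RUN;

/* Print DATA Two to check which observations were retained */
PROC PRINT DATA=two;
RUN;
```

With this sample data, DATA Two should contain only the two observations from 2005.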

Alternatively, you could delete observations that are not wanted ("not equal to", NE):
DATA two; SET one;
IF year NE 2005 THEN DELETE;
RUN;
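
Both approaches extend to more than one value. As a sketch, assuming you want to keep 2004 and 2005 together in a single dataset (the dataset names keep2 and drop2 are hypothetical), the IN operator lists the values to match:

```sas
/* Keep approach: retain observations whose year is 2004 or 2005 */
DATA keep2; SET one;
  IF year IN (2004, 2005);
RUN;

/* Delete approach: remove every observation whose year is not in the list */
DATA drop2; SET one;
  IF year NOT IN (2004, 2005) THEN DELETE;
RUN;
```

The two steps should produce datasets with the same observations.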

Several data subsets can be created in one step. This example puts 2004 and 2005 data into separate datasets.
DATA y2004 y2005; SET one;
IF year=2004 THEN OUTPUT y2004;
IF year=2005 THEN OUTPUT y2005;
RUN;

The OUTPUT statement tells SAS to write the observation to the specified dataset.
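
Note that once a DATA step contains an OUTPUT statement, SAS no longer writes each observation automatically at the end of the step, so observations that match none of the conditions are not written anywhere. If you also want to keep those, one possible variation is shown below; the catch-all dataset name other is hypothetical.

```sas
/* Split DATA One three ways; ELSE avoids re-testing observations already written */
DATA y2004 y2005 other; SET one;
  IF year=2004 THEN OUTPUT y2004;
  ELSE IF year=2005 THEN OUTPUT y2005;
  ELSE OUTPUT other;   /* every remaining year goes to the catch-all dataset */
RUN;
```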

Shortcut
If you need a subset of the data to pass to a SAS procedure, this can often be done without creating a new dataset. Instead, use the WHERE statement to instruct the procedure to only use certain observations. For example,
PROC MEANS DATA=one; WHERE year=2005;
VAR x;
RUN;

will calculate the mean (along with the other default summary statistics) of x for the year 2005. Almost all procedures accept the WHERE statement, and its syntax is the same in each of them.
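
As a further sketch, the WHERE condition can use other operators, and the same statement carries over to other procedures. The examples assume the dataset is named one and contains the variables year and x.

```sas
/* Summary statistics of x for 2004 and 2005 combined */
PROC MEANS DATA=one; WHERE year IN (2004, 2005);
  VAR x;
RUN;

/* The same WHERE syntax in a different procedure */
PROC PRINT DATA=one; WHERE year=2005;
RUN;

/* Alternative: the WHERE= dataset option applies the condition as the data are read */
PROC MEANS DATA=one(WHERE=(year=2005));
  VAR x;
RUN;
```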

 

 
