**********************************************************************
*                                                                    *
*                            SASGENE 1.1                             *
*                      Program for Analysis of                       *
*                    Gene Segregation and Linkage                    *
*                          November 5, 1997                          *
*                                                                    *
*********************************************************************;

Instructions for running SASGENE macros

The SASGENE program for Gene Segregation and Linkage Analysis is written
in the SAS macro language. There are four SAS files: three macro files
and one example file. The first macro, sgene, performs single-gene
goodness-of-fit tests. The second macro, linkage, analyzes gene linkage
relationships. The third macro, convert, is optional and converts gene
values to "D" for dominant and "R" for recessive. The file startup.sas
illustrates how to use the macros and can easily be modified for the
user's own experiments.

The macros are written for version 6 and later versions of SAS and are
designed to run on any computer platform, provided there is sufficient
disk space. For the linkage analysis, the amount of disk space required
increases with the number of genes.

To use the macros, the user creates an input data file that consists of
Plot No., Replication No., Plant No., Family No., Generation No., and
the gene (or trait) names. Plot No., Replication No., and Plant No. are
used only for collecting data; the program does not use them in
computing statistics. The user may specify any values for the FAMILY
variable, but the macros expect the GNR (generation) variable to take
one of the following values:

    1 - P1
    2 - P2
    3 - F1
    4 - F2
    5 - BC1P1
    6 - BC1P2

Any SAS names may be used for the gene names. The genes (or traits) are
variables (columns) and their values are observations (rows). Family and
generation are identification variables. In the data file, the
observations for P1, P2, and F1 should not be omitted, or the results
may be incorrect.
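Before running the macros, it can help to confirm that the generation
codes are valid and that each family includes P1, P2, and F1. The
following is only a sketch of one way to do that and is not part of
SASGENE. The dataset name mydata, the format name gnrfmt, and the few
data rows are made up for illustration; only the FAMILY and GNR
variables and the codes 1 to 6 come from the description above.

```sas
/* Made-up rows (FAMILY, GNR) used only to exercise the checks. */
data mydata;
   input family gnr;
   cards;
20 1
20 2
20 3
20 4
21 1
21 4
21 7
;
run;

/* Label the generation codes so the table is easier to read. */
proc format;
   value gnrfmt 1='P1' 2='P2' 3='F1' 4='F2' 5='BC1P1' 6='BC1P2';
run;

/* Count plants per family and generation. A family lacking P1, P2,
   or F1 shows a zero (or no column at all) for that generation. */
proc freq data=mydata;
   tables family*gnr / norow nocol nopercent;
   format gnr gnrfmt.;
run;

/* Keep only observations whose GNR code is missing or outside 1-6. */
data badgnr;
   set mydata;
   if gnr not in (1,2,3,4,5,6);
run;

proc print data=badgnr;
   title 'Observations with an unexpected GNR code';
run;
title;
```

With the made-up rows, family 21 has no P2 or F1 observations and one
row carries the invalid code 7, so both checks have something to report.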

The sgene and linkage macros require the gene values to be coded as "D"
for dominant, "R" for recessive, and "." or blank for a missing value.
An optional macro, convert, converts the original gene values to "D",
"R", or missing. For each gene and family, the most frequent value in
the F1 generation is taken as the dominant value. Any other non-missing
values are coded as recessive. Missing values are retained as missing.

An example of a SAS dataset is:

    data orig;
       input PLOT REP FAMILY GNR PLNT BI $ RC $ DV $ SP $ LL $ DF $
             F $ B $ D $ U $ TU $;
       cards;
    1 1 20 1 N R N S N N M B D N W
    2 1 20 1 N R N S N N M B D N W
    3 1 20 1 N R N S N N M B D N W
    4 1 20 1 N R N S N N M B D N W
    .
    .
    run;

Either the macro code itself or a %INCLUDE statement (alias %INC) is
needed to define a macro to the SAS system. The user may paste the macro
into the program editor or use a %INC statement, such as
%inc 'sgene.sas';. The %INC statement specifies the physical name of the
external file where the macro is stored. The physical name is the name
by which the host system recognizes the file. Depending on the host
system and the location of the file, the full path may need to be
specified.

Examples:

    %inc 'c:\mysas\sgene.sas';
    %inc '~/sasmacro/sgene.sas';

The file sgene.sas contains the SAS macro sgene. Files that hold a SAS
program or a SAS macro usually have the extension sas.

Once the macro is defined to SAS, it can be invoked. To invoke a macro,
specify the %, the macro name (sgene, linkage, or convert), and the
required parameters in parentheses.

The sgene macro has 3 parameters:

    DS    - name of the SAS dataset to analyze
    GENES - gene names from the SAS dataset
    P1    - critical value for about half of the frequency of one
            parent to determine the expected segregation ratio
            (1:1 or 1:0) in the BC1 generation.

Example:

    %sgene(ds=new, genes=BI RC DV SP LL DF F B D U TU, p1=9);

The linkage macro has 4 parameters:

    DS     - name of the SAS dataset to analyze
    GENES  - gene names from the SAS dataset
    P1, P2 - critical value for about half of the frequency of the
             parents to determine if the phase is coupling or
             repulsion.

Example:

    %linkage(ds=new, genes=BI RC DV SP LL DF F B D U TU, p1=9, p2=9);

The convert macro has 3 parameters:

    DS    - name of the SAS dataset to convert
    GENES - list of the desired gene names from the SAS dataset
    DSOUT - name of the SAS dataset after conversion.

Example:

    %convert(ds=orig, genes=BI RC DV SP LL DF F B D U TU, dsout=new);

There are several additional files stored in the same location as this
introduction. The files are:

    startup.sas - example that illustrates how to use the macros
    orig.dat    - sample data for the startup.sas file
    convert.sas - file that contains the SAS macro convert
    sgene.sas   - file that contains the SAS macro sgene
    linkage.sas - file that contains the SAS macro linkage
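
Putting the pieces together, a complete session might look roughly like
the sketch below. It makes several assumptions: that all of the listed
files are in the current directory, that orig.dat is laid out to match
the INPUT statement of the example dataset shown earlier, and that the
critical values of 9 from the examples suit the data. The macro variable
GENELIST is a convenience introduced here and is not part of SASGENE.
The startup.sas file remains the reference for actual usage.

```sas
/* Define the three macros; adjust the paths for the host system. */
%inc 'convert.sas';
%inc 'sgene.sas';
%inc 'linkage.sas';

/* Hypothetical helper: keep the gene list in one place. */
%let genelist = BI RC DV SP LL DF F B D U TU;

/* Read the sample data. The layout of orig.dat is assumed to match
   this INPUT statement. */
data orig;
   infile 'orig.dat';
   input PLOT REP FAMILY GNR PLNT BI $ RC $ DV $ SP $ LL $ DF $
         F $ B $ D $ U $ TU $;
run;

/* Recode the original gene values to "D", "R", or missing. */
%convert(ds=orig, genes=&genelist, dsout=new);

/* Single-gene goodness-of-fit tests on the converted data. */
%sgene(ds=new, genes=&genelist, p1=9);

/* Linkage analysis for the same genes. */
%linkage(ds=new, genes=&genelist, p1=9, p2=9);
```

If the gene values are already coded as "D", "R", or missing, the
%convert step can be skipped and the original dataset passed directly
to sgene and linkage.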