SAS Programming - Science topic

Questions related to SAS Programming
  • asked a question related to SAS Programming
Question
3 answers
Does anyone have suggestions for software that is good at analyzing data and producing publishable graphics, but does not require knowledge of a programming language such as R or SAS? Software that is freely available to install and use is the priority. Thank you!
Relevant answer
Answer
Yes, R is free, and there are many websites that offer help.
For Graphics, see:
  • asked a question related to SAS Programming
Question
2 answers
Does anyone have a SAS program to process data from the Recent Physical Activity Questionnaire (RPAQ)? The MRC provides Stata syntax (https://www.mrc-epid.cam.ac.uk/physical-activity-downloads/), but I do not use Stata.
Relevant answer
Answer
Convert Stata to SAS.
  • asked a question related to SAS Programming
Question
4 answers
Hi, everyone! I am new to SAS. I am reading an article on a study about minorities. The findings include the following sentence:
"A power analysis using MacCallum et al.’s (1996) SAS program indicated that the statistical power of the models were at 0.90."
I have also attached the conceptual model (a structural equation model) from the study. I am wondering how the author obtained this result (0.90) with SAS. As you can see in the other attached screenshot, there are many items under the main Power and Sample Size menu, such as ANOVA, t-tests, and multiple regression. However, I cannot find an item or button specifically for structural equation models or path analysis. Does anyone know which item I should choose to obtain the statistical power of the model (0.90)? Alternatively, if SAS has no ready-made menu or button for power analysis of SEM or path analysis, is there any other software that can produce such a result directly or easily?
Relevant answer
Answer
Hi Oscar, SAS is a very complex tool. For complicated questions, I'd recommend sending an email to their support, which is great. They are very responsive and will go to great lengths to solve your issue.
However, that only applies if you have a SAS license; they don't answer questions from free users.
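For reference, the approach of MacCallum, Browne and Sugawara (1996) computes power from the RMSEA and the noncentral chi-square distribution in a short DATA step rather than through a menu item. Here is a minimal sketch, assuming the usual test of close fit (null RMSEA of 0.05 against an alternative of 0.08). The degrees of freedom `d` and sample size `n` are placeholder values, not those of the study in question, and the data set name `sem_power` is made up for illustration.

```sas
/* RMSEA-based power for a covariance structure model            */
/* (after MacCallum et al., 1996). d and n are hypothetical:     */
/* replace them with the model degrees of freedom and sample size */
data sem_power;
  alpha  = 0.05;   /* significance level                   */
  rmsea0 = 0.05;   /* RMSEA under the null hypothesis      */
  rmseaa = 0.08;   /* RMSEA under the alternative          */
  d      = 50;     /* model degrees of freedom (assumed)   */
  n      = 200;    /* sample size (assumed)                */

  /* noncentrality parameters under the null and alternative */
  ncp0 = (n - 1) * d * rmsea0**2;
  ncpa = (n - 1) * d * rmseaa**2;

  if rmsea0 < rmseaa then do;          /* test of close fit */
    cval  = cinv(1 - alpha, d, ncp0);  /* critical value    */
    power = 1 - probchi(cval, d, ncpa);
  end;
  else do;                             /* test of not-close fit */
    cval  = cinv(alpha, d, ncp0);
    power = probchi(cval, d, ncpa);
  end;
run;

proc print data=sem_power noobs;
  var d n rmsea0 rmseaa power;
run;
```

Increasing `n` until `power` reaches the target (for example 0.90) gives the corresponding sample size for a given `d`.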
  • asked a question related to SAS Programming
Question
9 answers
I have data from a field trial established to screen sugarcane varieties for sugarcane grassy shoot disease (GSD). The trial used a randomized complete block design (RCBD) with three replicates. Standard varieties are not available for GSD, so comparison and rating against standards will not be possible.
The number of (phenotypically) GSD-infected clumps and the total number of clumps were recorded at one-month intervals from 1 to 12 months (12 disease counts). Disease incidence was calculated from these. In addition, yield data were taken.
1. Could you please explain what kind of statistical analysis in SAS is suitable for assessing the varietal response to GSD based on the disease data?
2. Which data (number of diseased clumps or disease incidence) are appropriate for the analysis? It would also be a great help if anyone could suggest how to write the CLASS and MODEL statements in SAS for analyzing these data.
Thank you
Relevant answer
Answer
You can analyze either the number of infected clumps or the disease incidence (which I understand as the number of infected clumps / total number of clumps). The first is count data, which follow a Poisson or negative binomial distribution, whereas the second is a proportion, which follows a binomial distribution. Therefore, classical methods such as analysis of variance (ANOVA), a special case of the general linear model, are not appropriate, even after a data transformation such as the square root or arcsine (O'Hara and Kotze, 2010; Warton and Hui, 2011).
Ignoring the repeated measurements and considering the data for each month separately, it is highly advisable to use a generalized linear model (GLM) with an appropriate distribution and link function, which leads to Poisson regression for the number of infected clumps and logistic regression for disease incidence. For Poisson regression, check for overdispersion and, if it is present, use negative binomial regression instead.
To account for the repeated measurements over time, the likely correlation between time points (months) should be modeled using generalized estimating equations (GEE) or generalized linear mixed models (GLMM) (Gbur et al., 2012; Stroup, 2013; Yirga et al., 2020).
In SAS, a generalized linear model is fitted with PROC GENMOD, while a GLMM is fitted with PROC GLIMMIX.
- Gbur EE, Stroup WW et al. (2012). Analysis of Generalized Linear Mixed Models in the Agricultural and Natural Resources Sciences. ASA, SSSA, CSSA. See Chapter 5.
- O’Hara RB and Kotze DJ. (2010). Do not log-transform count data. Methods in Ecology and Evolution, 1: 118-122.
- Stroup WW. (2013). Generalized Linear Mixed Models: Concepts, Methods, and Applications. CRC Press. See Chapter 14.
- Warton DI and Hui FKC. (2011). The arcsine is asinine: the analysis of proportions in ecology. Ecology, 92: 3-10.
- Yirga AA, Melesse SF, Mwambi HG, and Ayele DG. (2020). Negative binomial mixed models for analyzing longitudinal CD4 count data. Scientific Reports, 10: 16742.
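To make the CLASS and MODEL statements concrete, here is a minimal sketch of the two analyses described in this answer. It assumes a long-format data set `gsd` with one row per plot and month and variables `block`, `variety`, `month`, `infected` and `total`; all of these names are hypothetical. The random plot intercept in the second step is only one simple way to model the correlation between months; other covariance structures may fit better.

```sas
/* Single month (here month 12): logistic regression on the      */
/* proportion infected, using the events/trials syntax           */
proc genmod data=gsd;
  where month = 12;
  class block variety;
  model infected/total = block variety
        / dist=binomial link=logit type3;
run;

/* All 12 months: GLMM with a random intercept for each plot     */
/* (block*variety) to account for repeated counts on that plot   */
proc glimmix data=gsd method=laplace;
  class block variety month;
  model infected/total = block variety month variety*month
        / dist=binomial link=logit;
  random intercept / subject=block*variety;
  /* variety means on the probability scale, with comparisons */
  lsmeans variety / ilink diff adjust=tukey;
run;
```

For the raw counts, the same MODEL statement with `infected` as the response and `dist=poisson` or `dist=negbin` (log link) would correspond to the Poisson and negative binomial regressions mentioned above.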
  • asked a question related to SAS Programming
Question
6 answers
The solution seems quite simple, but I can't find code that works. I am just trying to run the equivalent of the following SAS code in RStudio:
proc ttest data=jump_data;
var JFM_BJH;
class clc_sex;
where ageg=2;
run;
ageg = age groups and this categorical variable has 5 levels.
Thank you very much in advance
Relevant answer
Answer
Simpler without tidyverse:
t.test(JFM_BJH ~ clc_sex, data = jump_data, subset = ageg==2, var.equal = T)
If there are several age groups, it might be better to include ageg explicitly in the model, for example:
model <- lm(JFM_BJH ~ clc_sex * factor(ageg), data = jump_data)
summary(model)
Note that using age groups (and categorization in general) is usually suboptimal. It would be far more informative to model JFM_BJH as a function of age (e.g. in years), and this function may depend in meaningful ways on clc_sex.
  • asked a question related to SAS Programming
Question
5 answers
I’m performing a multivariate regression and my residuals are not normal. I tried a log transformation of y, but this didn’t help. Should it have helped, or is there something else I could try?
Relevant answer
Answer
Hello Lauren,
You could try using bootstrap/resampling to obtain coefficient and error estimates. A number of statistical software packages offer this as an option. Here are some links to get you started:
Good luck with your work!
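In SAS, the bootstrap suggested above can be sketched with PROC SURVEYSELECT followed by BY-group regressions. This is only an illustration under assumed names: the data set `mydata`, the response `y` and the predictors `x1` and `x2` are hypothetical, and 1000 resamples with percentile intervals are one common choice rather than the only one.

```sas
/* 1. Draw 1000 bootstrap samples, each the size of the original */
/*    data, by sampling rows with replacement                    */
proc surveyselect data=mydata out=boot seed=20240
                  method=urs samprate=1 outhits reps=1000 noprint;
run;

/* 2. Refit the regression in every bootstrap sample and keep    */
/*    the coefficient estimates (one row per replicate)          */
proc reg data=boot outest=boot_est noprint;
  by replicate;
  model y = x1 x2;
run;
quit;

/* 3. Percentile bootstrap 95% intervals for the slopes          */
proc univariate data=boot_est noprint;
  var x1 x2;
  output out=boot_ci pctlpts=2.5 97.5 pctlpre=x1_ x2_;
run;

proc print data=boot_ci noobs;
run;
```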
  • asked a question related to SAS Programming
Question
8 answers
I ran an experiment with two factors, Factor A (two iron sources) and Factor B (two seasons), in three blocks. When I run the analysis in SAS, the interaction is not significant, but the treatment combination is significant.
What is the reason for this?
Relevant answer
Answer
With two factors A and B, there is a main effect of A, a main effect of B, and an interaction effect of A and B.
  • asked a question related to SAS Programming
Question
3 answers
I have water temperature data for morning and evening. I need to show the maximum and minimum values using SAS.
Can anyone tell me the SAS code for this?
Relevant answer
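One way to get the minimum and maximum is PROC MEANS. This is a minimal sketch, assuming a data set `water` with a numeric variable `temperature` and a grouping variable `period` that takes the values morning and evening; all three names are hypothetical.

```sas
/* Minimum and maximum water temperature, separately for the     */
/* morning and evening readings                                  */
proc means data=water min max maxdec=2;
  class period;        /* e.g. 'morning' / 'evening' (assumed)  */
  var temperature;
run;
```

Dropping the CLASS statement would give the overall minimum and maximum across both periods.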
  • asked a question related to SAS Programming
Question
2 answers
Hello, SAS recommends that I look at Fisher, R. A. (1938). Statistical Methods for Research Workers. 10th ed. Edinburgh: Oliver & Boyd, in order to understand how to transform nominal variables by optimally scoring the categories.
I tried reading it and it just made me even more confused. Can anyone explain optimal scaling in layman's terms (SAS calls it optimal scoring, or opscore)?
Relevant answer
Answer
Hello Marco,
Here's a related RG thread that you may find helpful: https://www.researchgate.net/post/Fishers_Scoring_Algorithm
Good luck with your work.
  • asked a question related to SAS Programming
Question
4 answers
Is there SAS code for implementing panel cointegration tests?
Relevant answer
Answer
Hello all,
I am refreshing this request in the hope that Mrs. Ruel will provide an updated link to the paper mentioned in her post. The existing link is dead.
All the best.
  • asked a question related to SAS Programming
Question
2 answers
SAS programming for clinical trials. Filing to NDA (FDA)
Relevant answer
Answer
Thank jos
  • asked a question related to SAS Programming
Question
2 answers
my model is related to earnings management.
Relevant answer
Answer
First, thank you.
Second, you are right, it was my fault: there were some differences between the data types.
I have another issue:
When I tried to apply propensity score matching in SAS, I got an error message saying that the PSMATCH statement was not found. When I googled it, all the answers said that my version is out of date, although my version is SAS 9.4 and valid. Do you have any idea how I can fix this in my version?
Thanks in advance.
  • asked a question related to SAS Programming
Question
1 answer
Hello everyone,
I have successfully installed Oracle Database 11g XE and SAS 9.4 on a system running Windows 10 Enterprise. I was able to create a database through SQL*Plus or Oracle SQL Developer. I have all the required SAS components installed, such as SAS/ACCESS to Oracle and SAS/ACCESS to ODBC.
I got an error while trying to connect to the database from SAS using
libname myora oracle user='Bakare' password='Agbaman' path='XE';
the error message I got is below:
2    libname myora oracle user='Moshood' passoword='Agbaman' path='XE' ;
ERROR: The SAS/ACCESS Interface to ORACLE cannot be loaded. ERROR: Image SASORA   found but not
       loadable..
Please make sure Oracle environment is set correctly.
Look in the install/Config doc for additional info for your platform.
Other possible reasons - incomplete Oracle client install, 32/64-bit mismatch between Oracle
       client & SAS, incorrect Oracle client version(Oracle client must match the version
       picked during post-install process), incompatible sasora for your OS or its attribs
       don't permit SAS to load it.
ERROR: Error in the LIBNAME statement.
Relevant answer
Answer
I have prior experience with XE, but not with SAS. I do remember having to hack the Registry on Oracle 10g. See if this helps...
  • asked a question related to SAS Programming
Question
6 answers
For my part, I use SAS and analyze my data myself.
Relevant answer
Answer
Thank you, Dr Vishwanath Shastry G.L., for your contribution.
  • asked a question related to SAS Programming
Question
5 answers
A likelihood ratio test comparing a reduced model with a full model that differs by one fixed effect results in a chi-square distribution with zero degrees of freedom.
/* reduced model */
proc mixed method = ml;
class block gen;
model rtwt = /ddfm = kr;
random block gen;
run;
/* full model*/
proc mixed method = ml;
class block gen;
model rtwt = prop_hav/ddfm = kr;
random block gen;
run;
The reduced model has 3 degrees of freedom: the block variance, the genotype variance, and the residual variance. The full model, which includes prop_hav as a covariate, has the same number. The difference in their -2 log likelihoods therefore has zero degrees of freedom under the chi-square distribution. Could anyone please guide me on how to compare these models to determine whether the full model is significantly different from the reduced model?
Relevant answer
Answer
The LRT only works when one model is a simplification of the other. You can’t take out terms and add others in one go; do it in two stages.
The models to be compared must be nested.
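A note on the degrees of freedom in the question: the two PROC MIXED models shown there are both fitted with method=ml, have the same random effects, and differ only by the fixed effect prop_hav, so the reduced model is nested in the full one. The parameter count for an ML likelihood ratio test includes the fixed effects as well as the variance components, so the full model has one more parameter (the prop_hav slope) and the test has 1 degree of freedom, not zero. Here is a minimal sketch of the calculation; the two -2 log likelihood values are made-up placeholders to be replaced with the values from the "Fit Statistics" tables.

```sas
/* Likelihood ratio test for one extra fixed effect (ML fits).   */
/* The two -2 log likelihood values below are hypothetical.      */
data lrt;
  m2ll_reduced = 250.4;   /* -2 Log Likelihood, reduced model    */
  m2ll_full    = 244.1;   /* -2 Log Likelihood, full model       */
  df      = 1;            /* one extra parameter: prop_hav slope */
  chisq   = m2ll_reduced - m2ll_full;
  p_value = 1 - probchi(chisq, df);
run;

proc print data=lrt noobs;
run;
```

This comparison is only valid with method=ml; REML likelihoods are not comparable between models with different fixed effects.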
  • asked a question related to SAS Programming
Question
3 answers
I am trying to locate the whereabouts of the specific source code for the SAS program mentioned in the paper below on page 102:
Unfortunately the site is no longer working and the authors are not contactable. If anyone can help, I would be most grateful.
Relevant answer
Answer
good question,thank you
  • asked a question related to SAS Programming
Question
1 answer
I tried the offset function, but it is quite complex. I have 60,000 rows of data from UV detection. I need to offset all of them downwards by +1 so that only the y-axis of the graphed chromatogram(s) is affected. If there is a way of manipulating the figure to do this, please let me know.
Percentage deviation and multiplication did not work.
I am using Excel 2017. If you know Origin, that is fine too, as it is similar. I couldn't download SPSS.
The end goal is attached below.
Relevant answer
Answer
As for statistical analysis: you can use the R language as effectively as SPSS, and it is now more widely used than either SPSS or SAS. There are many tutorials and demonstrations of how to use it on YouTube. I think a few might even be in Spanish.
Also see:
  • asked a question related to SAS Programming
Question
3 answers
I have a set of forest plots of which, for example, three were remeasured after three years, another three after five years, and two after six years. How can I compare mortality and recruitment fairly among these plots?
Any suggestions or literature please?
Relevant answer
Answer
A generalized linear mixed model. Include time as a random factor.
  • asked a question related to SAS Programming
Question
5 answers
I can open the program, enter data, write some statements, and run some analyses, but I am not an expert. I want to be able to use SAS professionally to analyze poultry breeding data (weights, BWG, egg production traits, blood biochemicals), including how to calculate correlations, means, GLM, and so on. What is your advice?
Relevant answer
Answer
Dear Esteftah,
SAS is a very good tool for statistical analysis but requires good training; you can train yourself with the exercises at the following link:
SPSS is simple, efficient software and requires only short training.
Good luck
  • asked a question related to SAS Programming
Question
4 answers
I use SPSS well for statistical analysis of poultry breeding data, but I do not use SAS well. Does SPSS perform the same analyses as SAS with the same accuracy and precision? Most articles use SAS.
Relevant answer
Answer
By using it.
I remembered a sentence "To follow the path, look to the master, follow the master, walk with the master, see through the master, become the master." and the following links. I like them and wanted to share.
  • asked a question related to SAS Programming
Question
6 answers
proc NLIN data = eleven
BEST = 10
MAXITER = 100
METHOD = Gauss
CONVERGE=1.0E-6
LIST
ALPHA = 0.05
OUTEST = outest(where = (_TYPE_ = "COVB"));
parms
B1 = 1
B2 = .5128
B3=-.0036;
model y = B1-B2*exp(-B3*x);
id id x y;
output
out = elevenOut
p = y_hat
r = residual
stdr = SE_Resid
LCLM = LCL_Mean
UCLM = UCL_Mean ;
run; quit; title;
Relevant answer
Answer
Thank you for your reply. I was able to use linear regression estimates as starting values. It took a while. It was a complicated equation. I have your comments printed out for review as I do more of these.
  • asked a question related to SAS Programming
Question
4 answers
I am trying to calculate the SASA of residue 128 in lysozyme for a 25 ns trajectory. What should my calculation group and output group be in that case?
Relevant answer
Answer
The protein is the calculation group and that residue is the output group. Note that this situation is described explicitly in the g_sas help information. You will need to create an index group to select the residue(s) of interest for output.
  • asked a question related to SAS Programming
Question
1 answer
I used an incomplete block design (lattice) with two replications under stress and control conditions. In some blocks there is a covariate, and I want to adjust the treatment means using analysis of covariance, but I don't know the SAS code.
I would appreciate your help with this.
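One possible starting point is a mixed-model analysis of covariance in PROC MIXED. This is a rough sketch only, and the layout is guessed from the description above: it assumes a data set `lattice` with variables `cond` (stress/control), `rep`, `block`, `trt`, a response `y` and a covariate `x`, all of which are hypothetical names. Replicates are treated as nested within conditions and incomplete blocks as nested within replicates.

```sas
/* Lattice (incomplete block) design with a covariate:           */
/* replicates and incomplete blocks are random, the covariate x  */
/* enters as a fixed regression term                             */
proc mixed data=lattice;
  class cond rep block trt;
  model y = cond trt cond*trt x / ddfm=kr;
  random rep(cond) block(rep*cond);
  /* treatment means adjusted to the average covariate value */
  lsmeans trt cond*trt / diff;
run;
```

If the covariate slope might differ between treatments, adding `x*trt` to the MODEL statement is one way to check that assumption before interpreting the adjusted means.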
  • asked a question related to SAS Programming
Question
9 answers
I will be working with cancer registry data (case level) and area based measures from the census and other sources to do a multilevel regression modeling. There will be thousands, if not tens of thousands, of cases, so considerable computing power is required. I personally have experience with SAS, but have heard STATA requires less memory/space for computation. Which program is preferable for such analysis? What are the pros and cons of each program for this analysis? Would it be useful to learn STATA programming?
Thank you very much in advance.
Relevant answer
Answer
Dear Whitney
I have been a SAS user since 1985 without interruption, and I have never ceased to be astonished by the many resources SAS offers for solving data analysis problems. I have worked with very large datasets, for example in an SNA (Social Network Architecture) project with information on all the phone calls made in a year by the users of a large cellular company, and with good hardware SAS could solve each of these problems on time. I have not used STATA, and I am not particularly interested, because changing from the software I know would considerably delay my production of results.
  • asked a question related to SAS Programming
Question
9 answers
I was able to calculate the phenotypic correlation coefficients and generate the correlation matrix with the R software. Now I would like to calculate the corresponding genetic correlation coefficients, but I don't know how.
Which other program can do it?
NB: I do not have SAS.
thanks
Relevant answer
Answer
I usually estimate genetic and phenotypic correlations through the analysis of variance (ANOVA) method with MS Excel.
  • asked a question related to SAS Programming
Question
3 answers
Assuming 12 varieties are evaluated in 4 replicates in a Randomized Complete Block Design (RCBD) where 20 variables are measured, is it appropriate to run correlation and principal component analysis on the replicated data or on the mean values of the varieties across replicates? Which is the better approach? That is, there would be 12*4 = 48 observations if the analysis is based on replicate values, while there would be 12 observations if the means are computed prior to correlation and principal component analysis.
Relevant answer
Answer
The first principal component is constructed to describe the maximum variance in the data set; the second PC is orthogonal to the first and describes the maximum of the remaining variance, and so on.
In my opinion, the replicates better represent the variance in the data set. On the other hand, using means reduces the scatter, so fewer PCs will be needed to describe the data. In my opinion, the findings from both approaches will be similar.
When you use the correlation coefficient, you should be very careful because Simpson's paradox can occur.
  • asked a question related to SAS Programming
Question
1 answer
Statisticians. Breeders. Biometricians
Relevant answer
Answer
Dear Omikunle,
First, you need to install the "agricolae" package in R. After loading the agricolae package, you can copy and paste the statements in the attached file. This stability analysis was done for 7 genotypes over 6 locations.
I think the following pdf file will help you very much.
  • asked a question related to SAS Programming
Question
4 answers
I intend to obtain a kinship matrix from pedigree data through PROC INBREED, to be used as input to PROC MIXED in order to obtain the additive genetic variance. The warning message is always "Individual clone=I011412 has been previously defined. Observation 27 corresponding to this individual will not be processed." How do I order the individuals to avoid this warning message?
proc inbreed data=pedigree covar outcov=kinmat;
var clone female male;
run;
Relevant answer
Answer
There are many functions (and libraries) in R that can help you sort your data in the proper order. The main rules are: grandparents before parents, and parents before offspring; however, sometimes this is not easy. The key is that, for any individual that has a sire and dam defined in the pedigree file, these parents need to be defined earlier in the file. I recommend you use GenoMatrix for this. This package has a model (Pedigree Sort) that will help you read and sort your pedigree file. Check it out at:
This software also has other routines to clean your molecular data and to generate your kinship matrix (additive, dominance and even epistatic matrices).
  • asked a question related to SAS Programming
Question
3 answers
In plant breeding, we often talk of estimate of random effect i.e. BLUP and variance component on quantitative variables such as yield, plant height and etc. My thought is that estimate of BLUP or variance component for qualitative trait that are ranked or ordinal is wrong and unreliable. Take for example, if disease severity is scaled from 1 = no symptom to 5 = symptom is extremely severe. This is qualitative trait that is ranked! Is it appropriate to estimate the BLUP or variance component of random effect on such trait?
Relevant answer
Answer
Yes, strictly speaking you cannot use normal assumptions for this type of data, and you will need to use Generalized Linear Mixed Models (GLMM) or another type of methodology (Bayesian?) in order to properly model the distribution of your data.
However, in practical terms, we do use LMM with normality assumptions for this type of variable on a regular basis. For example, sometimes we have a score from 1 to 5 for straightness of pine trees, where 0 is completely straight and 5 is very curved. This variable is discrete and ordinal, but if it meets certain conditions it can work quite well. First, you need to have equal spacing between your classes; this means that a change from 1 to 2 is the same score change as from 4 to 5, and this needs to be clearly defined in your measuring protocols. Second, you ideally need to have several class values in your variable (ideally 10 or more), because this gives a closer approximation to normality (central limit theorem) and therefore fewer issues. In any case, you will always have some 'doubts' about your heritability and BLUP estimates, as you are 'approximating' your variable (e.g. true level of departure from straightness) with something different (e.g. a score).
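For the GLMM route mentioned at the start of this answer, one option in SAS is a cumulative logit mixed model in PROC GLIMMIX. This is a rough sketch under assumed names: the data set `trial`, the ordinal `score`, and the factors `block` and `gen` are all hypothetical. The default estimation method is used here, and other methods may be preferable for a particular data set.

```sas
/* Ordinal severity score (e.g. 1 to 5) modeled with a           */
/* cumulative logit link; genotype enters as a random effect     */
proc glimmix data=trial;
  class block gen;
  model score = block / dist=multinomial link=cumlogit solution;
  /* the genotype variance appears in the covariance parameter   */
  /* estimates; SOLUTION prints the predicted genotype effects   */
  random intercept / subject=gen solution;
run;
```

The genotype variance and predicted effects are then on the latent logit scale rather than on the scale of the raw scores, which is worth keeping in mind when comparing them with results from a normal-theory LMM.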
  • asked a question related to SAS Programming
Question
1 answer
The 15 farms have new treatments in common but the local variety which is considered to be a check variety varies from one farm to another. There was NO replication of each treatment within a farm. Therefore, each farm is taken as a block in order to estimate block effect and residual term for comparison. Does it make sense to make a pairwise comparison since the local check is unique from one farm to another? The SAS script for the analysis is as follows
proc mixed data=onfarm covtest;
class farm variety;
model fyld=variety nohav/ddfm=satterth; /* nohav is a covariate */
random farm;
lsmeans variety/diff adjust=tukey;
run;
quit;
Relevant answer
Answer
Assuming that you randomized each block, it sounds like you have a slightly unusual augmented design (unusual because the checks are generally replicated, rather than the new treatments). I'd recommend that you read about augmented designs before continuing with your analysis (links below).
You should be able to make pairwise comparisons, but not with a thing called "check". Since your checks are not the same variety in all blocks, you should not estimate a combined mean for them. The augmented design analysis will allow you to compare each treatment variety with each control variety.
  • asked a question related to SAS Programming
Question
3 answers
I have reached my wits' end trying to run a PROC NLIN model in SAS on my data set. I got the syntax from a colleague doing a similar experiment, but I am not sure whether the constraints fit my data. I would appreciate any help in showing me what I am doing wrong. Here is my SAS syntax:
DATA ACC;
Input TIME IVGP SPECIES$;
Cards;
2 10.49293325 C1
2 8.32876575 C1
2 5.77111325 C11
2 4.78740075 C11
2 6.16459825 CN2
2 5.37762825 CN2
2 4.78740075 CN21
2 3.60694575 CN21
2 1.83626325 CN3
2 1.04687 CN3
2 3.88116175 CN31
2 3.150835925 CN31
2 8.32876575 U2
2 9.641 U2
2 2.42649075 U21
2 2.62323325 U21
2 9.31247825 U3
2 7.14831075 U3
2 4.19717325 U31
2 6.55808325 U31
2 9.50922075 U1
2 11.87013075 U1
2 7.854721 U11
2 9.0148674 U11
2 12.26361575 CN1
2 10.29619075 CN1
2 5.565580825 CN11
2 5.32876575 CN11
4 15.72311325 C1
4 13.19803575 C1
4 9.83712575 C11
4 8.26318575 C11
4 8.85341325 CN2
4 6.88598825 CN2
4 5.70553325 CN21
4 6.68924575 CN21
4 3.48732 CN3
4 2.34672 CN3
4 6.84022825 CN31
4 6.69674325 CN31
4 10.62409575 U2
4 11.303625 U2
4 5.70553325 U21
4 5.90227575 U21
4 13.57523325 U3
4 12.39477825 U3
4 7.08273075 U31
4 11.41106575 U31
4 16.34591575 U1
4 18.08402325 U1
4 13.6521897 U11
4 14.9912487 U11
4 19.08402325 CN1
4 17.11659825 CN1
4 10.788075 CN11
4 8.16546075 CN11
8 24.95372325 C1
8 22.06773575 C1
8 15.32962825 C11
8 14.77197575 C11
8 15.54265825 CN2
8 12.00129325 CN2
8 9.06644325 CN21
8 11.00129325 CN21
8 8.1297124 CN3
8 7.6971254 CN3
8 9.91856325 CN31
8 8.75439575 CN31
8 18.90356825 U2
8 20.30325 U2
8 11.80455075 U21
8 12.59152075 U21
8 19.87099325 U3
8 19.47750825 U3
8 14.16546075 U31
8 18.49379575 U31
8 23.05144825 U1
8 24.36349575 U1
8 19.87432 U11
8 20.1467912 U11
8 24.19932825 CN1
8 24.00258575 CN1
8 13.45992825 CN11
8 14.28076575 CN11
12 36.5945365 C1
12 35.741124 C1
12 25.051879 C11
12 24.658394 C11
12 26.625819 CN2
12 20.723544 CN2
12 17.640814 CN21
12 17.510514 CN21
12 19.7398315 CN3
12 15.0180115 CN3
12 13.247329 CN31
12 10.6896765 CN31
12 32.773699 U2
12 33.14078 U2
12 23.871424 U21
12 24.264909 U21
12 29.9704415 U3
12 31.347639 U3
12 27.412789 U31
12 32.3313515 U31
12 35.7248365 U1
12 36.4140815 U1
12 29.051879 U11
12 32.2662015 U11
12 36.856429 CN1
12 35.675974 CN1
12 24.788694 CN11
12 23.347639 CN11
24 75.5645465 C1
24 73.500689 C1
24 60.859014 C11
24 60.072044 C11
24 58.6948465 CN2
24 50.234919 CN2
24 48.4466565 CN21
24 49.5143915 CN21
24 50.8251465 CN3
24 46.1033265 CN3
24 45.2837815 CN31
24 43.971734 CN31
24 61.6121165 U2
24 61.198135 U2
24 64.5971215 U21
24 64.006894 U21
24 64.941744 U3
24 63.465529 U3
24 70.696139 U31
24 72.8603065 U31
24 72.335229 U1
24 72.598414 U1
24 64.285074 U11
24 66.2863665 U11
24 74.630989 CN1
24 72.8603065 CN1
24 58.728714 CN11
24 58.267494 CN11
48 144.884779 C1
48 143.2118215 C1
48 117.766889 C11
48 115.832039 C11
48 90.092639 CN2
48 90.5825565 CN2
48 88.8281615 CN21
48 90.666579 CN21
48 84.205359 CN3
48 90.9610465 CN3
48 119.127799 CN31
48 118.9962065 CN31
48 92.844449 U2
48 90.8641775 U2
48 149.162959 U21
48 151.328419 U21
48 131.2819714 U3
48 130.7126745 U3
48 130.177954 U31
48 129.9812115 U31
48 121.521284 U1
48 125.6528765 U1
48 139.2144065 U11
48 141.799464 U11
48 145.017664 CN1
48 143.0789365 CN1
48 122.112804 CN11
48 120.929764 CN11
;
PROC SORT DATA = WORK.ACC;
BY SPECIES;
RUN;
PROC PLOT DATA = WORK.ACC;
BY SPECIES;
PLOT IVGP*TIME;
RUN;
PROC NLIN BEST = 9; BY SPECIES;
PARMS A = 9 TO 15 BY 1
B = 39 TO 44 BY 1
C = 0.03 TO 0.05 BY 0.005;
MODEL IVGP = A + B*(1-EXP(-C*TIME));
OUTPUT OUT = WORK.ACC PARMS = A B C; /* PARMS= needs one variable name per parameter */
RUN;
DATA WORK.ACC;
SET WORK.ACC;
PD = A + B;
ED = A + B*C/(C+0.05);
RUN;
PROC MEANS MEAN NOPRINT DATA = WORK.ACC;
BY SPECIES;
VAR A B C PD ED;
OUTPUT OUT = WORK.ACC MEAN = A B C PD ED;
RUN;
PROC PRINT DATA = WORK.ACC;
RUN;
Relevant answer
Answer
You are most welcome!
  • asked a question related to SAS Programming
Question
3 answers
I have a dataset of 60,000 women with a propensity for vaccination ranging from around -0.5 to 1.5. What is the best way to individually match these women on propensity score in SAS? I know how to do this by creating multiple datasets, but I'm hoping there is a function that could save a lot of time/work? 
Relevant answer
Answer
Hi Annette - A propensity score is a conditional probability, so I'm curious about your statement that the propensity ranges from -0.5 to 1.5. What methods are you using?
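On the original question of matching without building multiple data sets: recent releases (SAS/STAT 14.2 and later) include PROC PSMATCH, which estimates the propensity score and does the matching in one step. Here is a minimal sketch, assuming a data set `women` with a 0/1 treatment variable `vaccinated` and covariates `age`, `parity` and `smoker`; all of these names are hypothetical, and the greedy 1:1 matching with a 0.25 caliper is just one reasonable setting.

```sas
/* Propensity score estimation by logistic regression followed   */
/* by 1:1 greedy nearest-neighbor matching on the logit of the   */
/* propensity score                                              */
proc psmatch data=women region=cs;
  class vaccinated smoker;
  psmodel vaccinated(treated='1') = age parity smoker;
  match method=greedy(k=1) distance=lps caliper=0.25;
  /* keep only matched observations; _MatchID links each pair */
  output out(obs=match)=matched matchid=_MatchID;
run;
```

Because the score comes from a logistic model here, it is bounded between 0 and 1, which also addresses the range issue raised above.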
  • asked a question related to SAS Programming
Question
2 answers
Which program would be better for analyzing these data? I need help analyzing these data in SAS or any other program. It would help if someone could provide an example of data preparation and running the analysis in SAS.
Relevant answer
Answer
Dear Friend thank you very much for your valuable suggestion.
  • asked a question related to SAS Programming
Question
2 answers
I am investigating the impact of a factor on 4 dependent variables (DVs). Two DVs follow normal distributions; the other two do not. One of them is count data, with the majority being 0s, and it follows a Poisson distribution. The other is a length of time measured in days. I know that I can use Poisson regression (or zero-inflated or negative binomial regression) for univariate analysis of the two variables that do not follow a normal distribution. However, I need to do the multivariate analysis first to control the type I error. My question is: which SAS procedure can I use for multivariate analysis with DVs like these? Is there a generalized method for all types of distributions?
Relevant answer
Answer
Thank you, Kelvyn, I will look into the mixed response models.
  • asked a question related to SAS Programming
Question
3 answers
I have a set of non-stationary variables which are not cointegrated.
I want to convert these non-stationary variables into stationary variables. I have tried the following methods (f represents a variable):
1) Augmented Dickey Fuller test using d(f)= lag(f) lag(df)
2) Dickey Fuller test with no intercept d(df)= lag(df)
-Thank you.
Relevant answer
Answer
Dear Soumya,
The attached two publications might be helpful:
1-Paper 456-2013
Exploring Time Series Data Properties in SAS®
David Maradiaga, Louisiana State University
Aude Pujula, Louisiana State University
Hector Zapata, Louisiana State University
ABSTRACT
Box and Jenkins popularized graphical methods for studying time series properties of time series data. Dickey and Fuller did the same for unit-root tests. Both methods seek to understand the non-stationary properties of data and SAS® is a popular software used by applied researchers. The purpose of this paper is to provide a series of steps using SAS MACRO Language, PROC SGPANEL, ARIMA, AUTOREG, and %DFTEST to diagnose non-stationary properties of data. A comparison among three competing SAS procedures is presented with SAS capabilities highlighted using simulated time series.
2-Paper 192-30
Stationarity Issues in Time Series Models
David A. Dickey
North Carolina State University
ABSTRACT
The decision on whether analyze a time series in levels or differences is an important aspect of forecasting. Visual methods have been around for a long time. Relatively recently, statistical tests for the null hypothesis that the series is nonstationary, meaning that differencing is required, have been developed. This paper reviews the development of these tests, give motivating examples of why they are needed, demonstrates their use, and shows some other related procedures. The paper will be accessible to beginning level SAS™ programmers, but the reader should have a reasonably strong grounding in statistics.
Rafik
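As a small illustration of the kind of workflow these papers describe, the sketch below tests a series in levels and in first differences with PROC ARIMA (part of SAS/ETS) and then stores the differenced series. The data set `series` and the variable `f` are assumed names, and the lag orders 0, 1 and 2 are arbitrary choices.

```sas
/* Augmented Dickey-Fuller tests on the series in levels and     */
/* after first differencing                                      */
proc arima data=series;
  identify var=f    stationarity=(adf=(0,1,2));  /* levels           */
  identify var=f(1) stationarity=(adf=(0,1,2));  /* first difference */
run;
quit;

/* Store the first difference as a new variable for later use */
data series_diff;
  set series;
  d_f = dif(f);   /* f(t) - f(t-1); missing for the first row */
run;
```

If the unit-root null is rejected for the differenced series but not for the levels, working with `d_f` is the usual next step.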
  • asked a question related to SAS Programming
Question
6 answers
These days I use the SAS mixed model procedure to compute variances (for example, proc mixed data=phe convf=1e-8 maxiter=50;). Through Google, I found some R packages, such as lme4 and nlme, but I cannot reproduce the SAS results. I would like to know which R package offers functionality similar to the SAS mixed model. Thank you very much!
Relevant answer
Answer
Dear prof. Kelvyn,
R2MLwiN is a fresh software for me. And from your description about MLwiN, I think it's the software that I want. Thank you!
  • asked a question related to SAS Programming
Question
11 answers
I want to normalize my data using log10 in SAS. Please write the related program for me.
Thank you
Relevant answer
Answer
Thank you for the clarification.  So you are asking how to transform one variable that currently has a non-normal distribution using a logarithmic transformation in the hopes that it will give an approximately normal distribution.  The necessary SAS code is as follows:
DATA dataset2;
  SET dataset1;
  /* log10 is defined only for VAR > 0; zero or negative values give a missing LOGVAR */
  LOGVAR = log10(VAR);
RUN;
This would create a new variable LOGVAR that is the log10 transformation of the variable VAR. 
  • asked a question related to SAS Programming
Question
1 answer
I need Error SSCP Matrix and SSCP Matrix for var, to calculate selection indices (Smith-Hazel index,.....) in Lattice design, but I don’t know SAS program to calculate them. Can anyone kindly help me for this?
Relevant answer
Answer
MANOVA Statement
 
MANOVA < test-options >< / detail-options > ;
If the MODEL statement includes more than one dependent variable, you can perform multivariate analysis of variance with the MANOVA statement. The test-options define which effects to test, while the detail-options specify how to execute the tests and what results to display.
When a MANOVA statement appears before the first RUN statement, PROC GLM enters a multivariate mode with respect to the handling of missing values; in addition to observations with missing independent variables, observations with any missing dependent variables are excluded from the analysis. If you want to use this mode of handling missing values and do not need any multivariate analyses, specify the MANOVA option in the PROC GLM statement.
If you use both the CONTRAST and MANOVA statements, the MANOVA statement must appear after the CONTRAST statement.
Test Options
The following options can be specified in the MANOVA statement as test-options in order to define which multivariate tests to perform.
H=effects | INTERCEPT | _ALL_
specifies effects in the preceding model to use as hypothesis matrices. For each H matrix (the SSCP matrix associated with an effect), the H= specification displays the characteristic roots and vectors of E⁻¹H (where E is the matrix associated with the error effect), Hotelling-Lawley trace, Pillai's trace, Wilks' criterion, and Roy's maximum root criterion with approximate F statistic.
Use the keyword INTERCEPT to produce tests for the intercept. To produce tests for all effects listed in the MODEL statement, use the keyword _ALL_ in place of a list of effects. For background and further details, see the "Multivariate Analysis of Variance" section.
E=effect
specifies the error effect. If you omit the E= specification, the GLM procedure uses the error SSCP (residual) matrix from the analysis.
M=equation,...,equation | (row-of-matrix,...,row-of-matrix)
specifies a transformation matrix for the dependent variables listed in the MODEL statement. The equations in the M= specification are of the form
c1 × dependent-variable ± c2 × dependent-variable ... ± cn × dependent-variable
where the ci values are coefficients for the various dependent-variables. If the value of a given ci is 1, it can be omitted; in other words 1 × Y is the same as Y. Equations should involve two or more dependent variables. For sample syntax, see the "Examples" section.
Alternatively, you can input the transformation matrix directly by entering the elements of the matrix with commas separating the rows and parentheses surrounding the matrix. When this alternate form of input is used, the number of elements in each row must equal the number of dependent variables. Although these combinations actually represent the columns of the M matrix, they are displayed by rows.
When you include an M= specification, the analysis requested in the MANOVA statement is carried out for the variables defined by the equations in the specification, not the original dependent variables. If you omit the M= option, the analysis is performed for the original dependent variables in the MODEL statement.
If an M= specification is included without either the MNAMES= or PREFIX= option, the variables are labeled MVAR1, MVAR2, and so forth, by default. For further information, see the "Multivariate Analysis of Variance" section.
MNAMES=names
provides names for the variables defined by the equations in the M= specification. Names in the list correspond to the M= equations or to the rows of the M matrix (as it is entered).
PREFIX=name
is an alternative means of identifying the transformed variables defined by the M= specification. For example, if you specify PREFIX=DIFF, the transformed variables are labeled DIFF1, DIFF2, and so forth.
Detail Options
You can specify the following options in the MANOVA statement after a slash as detail-options.
CANONICAL
displays a canonical analysis of the H and E matrices (transformed by the M matrix, if specified) instead of the default display of characteristic roots and vectors.
ETYPE=n
specifies the type (1, 2, 3, or 4, corresponding to Type I, II, III, and IV tests, respectively) of the E matrix, the SSCP matrix associated with the E= effect. You need this option if you use the E= specification to specify an error effect other than residual error and you want to specify the type of sums of squares used for the effect. If you specify ETYPE=n, the corresponding test must have been performed in the MODEL statement, either by options SSn, En, or the default Type I and Type III tests. By default, the procedure uses an ETYPE= value corresponding to the highest type (largest n) used in the analysis.
HTYPE=n
specifies the type (1, 2, 3, or 4, corresponding to Type I, II, III, and IV tests, respectively) of the H matrix. See the ETYPE= option for more details.
ORTH
requests that the transformation matrix in the M= specification of the MANOVA statement be orthonormalized by rows before the analysis.
PRINTE
displays the error SSCP matrix E. If the E matrix is the error SSCP (residual) matrix from the analysis, the partial correlations of the dependent variables given the independent variables are also produced.
For example, the statement
manova / printe;
displays the error SSCP matrix and the partial correlation matrix computed from the error SSCP matrix.
PRINTH
displays the hypothesis SSCP matrix H associated with each effect specified by the H= specification.
SUMMARY
produces analysis-of-variance tables for each dependent variable. When no M matrix is specified, a table is displayed for each original dependent variable from the MODEL statement; with an M matrix other than the identity, a table is displayed for each transformed variable defined by the M matrix.
Examples
The following statements provide several examples of using a MANOVA statement. 
proc glm;
   class A B;
   model Y1-Y5=A B(A) / nouni;
   manova h=A e=B(A) / printh printe htype=1 etype=1;
   manova h=B(A) / printe;
   manova h=A e=B(A) m=Y1-Y2,Y2-Y3,Y3-Y4,Y4-Y5
          prefix=diff;
   manova h=A e=B(A) m=(1 -1  0  0  0,
                        0  1 -1  0  0,
                        0  0  1 -1  0,
                        0  0  0  1 -1) prefix=diff;
run;
Since this MODEL statement requests no options for type of sums of squares, the procedure uses Type I and Type III sums of squares. The first MANOVA statement specifies A as the hypothesis effect and B(A) as the error effect. As a result of the PRINTH option, the procedure displays the hypothesis SSCP matrix associated with the A effect; and, as a result of the PRINTE option, the procedure displays the error SSCP matrix associated with the B(A) effect. The option HTYPE=1 specifies a Type I H matrix, and the option ETYPE=1 specifies a Type I E matrix.
The second MANOVA statement specifies B(A) as the hypothesis effect. Since no error effect is specified, PROC GLM uses the error SSCP matrix from the analysis as the E matrix. The PRINTE option displays this E matrix. Since the E matrix is the error SSCP matrix from the analysis, the partial correlation matrix computed from this matrix is also produced.
The third MANOVA statement requests the same analysis as the first MANOVA statement, but the analysis is carried out for variables transformed to be successive differences between the original dependent variables. The option PREFIX=DIFF labels the transformed variables as DIFF1, DIFF2, DIFF3, and DIFF4.
Finally, the fourth MANOVA statement has the identical effect as the third, but it uses an alternative form of the M= specification. Instead of specifying a set of equations, the fourth MANOVA statement specifies rows of a matrix of coefficients for the five dependent variables.
As a second example of the use of the M= specification, consider the following:
proc glm;
   class group;
   model dose1-dose4=group / nouni;
   manova h = group
          m = -3*dose1 -   dose2 +   dose3 + 3*dose4,
                 dose1 -   dose2 -   dose3 +   dose4,
                -dose1 + 3*dose2 - 3*dose3 +   dose4
          mnames = Linear Quadratic Cubic
          / printe;
run;
The M= specification gives a transformation of the dependent variables dose1 through dose4 into orthogonal polynomial components, and the MNAMES= option labels the transformed variables LINEAR, QUADRATIC, and CUBIC, respectively. Since the PRINTE option is specified and the default residual matrix is used as an error term, the partial correlation matrix of the orthogonal polynomial components is also produced.
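For the question as asked (getting the error and hypothesis SSCP matrices into data sets so that selection indices can be computed), the following is a minimal sketch and not a tested lattice analysis. The data set name lattice, the factors rep, block and gen, and the traits y1-y3 are hypothetical. The ODS table names ErrorSSCP and HypothesisSSCP are given as I recall them from the PROC GLM documentation; running ODS TRACE ON first will confirm the names in your release.

```sas
/* Hypothetical names: data set lattice; factors rep, block, gen; traits y1-y3 */
proc glm data=lattice;
   class rep block gen;
   model y1 y2 y3 = rep block(rep) gen / nouni;
   /* PRINTE displays the error SSCP matrix, PRINTH the hypothesis SSCP for gen */
   manova h=gen / printe printh;
   /* Save both matrices as data sets (table names assumed, see ODS TRACE ON) */
   ods output ErrorSSCP=E_sscp HypothesisSSCP=H_sscp;
run;
quit;
```

Dividing each SSCP matrix by its degrees of freedom gives the corresponding mean-product matrix, which is the usual starting point for phenotypic and genotypic covariance estimates.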
  • asked a question related to SAS Programming
Question
16 answers
I have a data set that should be fitted by segmented regression, and I am trying to find the break point x0 between the two models. If x<x0, the model is linear. If x>x0, the model is nonlinear. At x=x0, the y values of both models must be identical, and the slopes of both models must be identical too. Can anyone suggest how to calculate the best estimate of x0 and its confidence interval?
Relevant answer
Answer
Hi Effendi,
I am attaching a paper I recently wrote on interrupted time series analysis (which some people refer to as segmented regression). It describes the software package I wrote for Stata that performs the analyses you are asking about.
I hope this helps
Ariel
  • asked a question related to SAS Programming
Question
4 answers
I am developing a nutrient index from hyperspectral data. I have the SAS package, but how can I program stepwise discriminant analysis, principal component analysis, and band-to-band R-square? Please send me any example programs.
Relevant answer
Answer
 Hi, All
Jacques' answer is right. I introduced SAS, Statistica and SPSS in Japan, and wrote textbooks on them. After working with many statistical packages, I concluded that JMP is the easiest to use and does the most to improve our intellectual productivity.
Jack may be using an old version of JMP from before Ver. 5 in 2004. I became a JMP user in 2004.
In JMP, PCA is run from the Multivariate Correlations platform: after calculating the correlations, we can choose PCA. This is the traditional SAS way. In the oldest versions of SAS, PCA had to be run after the factor analysis, and many Japanese researchers complained that SAS did not support PCA. Recently, JMP has gained a PCA platform of its own.
If you download my "Tour Guide of my Researches" and the recent DEA paper suggested by researchmap, you can find examples of one-way ANOVA, cluster analysis, PCA and regression analysis applied to p inputs, q outputs, p*q ratios, and CCR and Inverted DEA efficiencies.
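For anyone who wants to stay in SAS for the two analyses named in the question, a minimal sketch follows. The data set spectra, the class variable nlevel, the band variables band1-band200 and the 0.05 entry and stay levels are all hypothetical choices, not values taken from this thread.

```sas
/* Hypothetical names: data set spectra; nlevel = nutrient class;
   band1-band200 = reflectance at each wavelength */

/* Stepwise discriminant analysis: selects the bands that best separate the classes */
proc stepdisc data=spectra method=stepwise slentry=0.05 slstay=0.05;
   class nlevel;
   var band1-band200;
run;

/* Principal component analysis: keeps the first five components and
   writes their scores (Prin1-Prin5) to the pcscores data set */
proc princomp data=spectra out=pcscores n=5;
   var band1-band200;
run;
```

The bands retained by PROC STEPDISC can then be passed to PROC DISCRIM if a classification rule is needed.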
  • asked a question related to SAS Programming
Question
8 answers
I am working through a SAS database in which "n/a" answers to a question were coded as 99. Is there a quick way to tell SAS to treat 99 as "." wherever it sees the number 99? I could of course do it for each variable separately with IF-THEN statements, but I am looking for a faster way. I could also use an ARRAY statement, but the variable names are not uniform (for example, they are not dx1, dx2, dx3, dx4, ...).
I am looking for a more general way, like the one we use in SPSS where we define missing values.
Thanks
Relevant answer
Answer
Thank you so much Patrick, this was very helpful. I was successful doing it your way.
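For readers with the same problem, one common approach is an array over the special variable list _NUMERIC_, which does not need uniform variable names. This is a minimal sketch that assumes 99 is never a legitimate value in any numeric variable (it would also blank out, say, an age of 99). The data set names have and want are hypothetical.

```sas
/* Hypothetical names: input data set have, output data set want */
data want;
   set have;
   /* _NUMERIC_ picks up every numeric variable defined so far,
      whatever the variables are called */
   array nums {*} _numeric_;
   do _i = 1 to dim(nums);
      if nums{_i} = 99 then nums{_i} = .;
   end;
   drop _i;   /* the loop counter is created after the array, so it is not in it */
run;
```

To restrict the recoding to some variables, list them explicitly in the ARRAY statement instead of using _NUMERIC_.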
  • asked a question related to SAS Programming
Question
2 answers
I'm trying to run the additive macro (for additive hazards models) written by Alicia Howell and John Klein, but it takes very long to run. I'm using SAS 9.3. I always break it after an hour without getting any output. I followed all the steps as per Alicia Howell's paper. It does not show any errors; I simply don't get any output. I'm not sure whether to leave it running overnight.
Relevant answer
Answer
Thanks Ellen Hertzmark, I have 642  observations with only 3 variables. I will try the steps you have mentioned and see what happens. Funny enough, if I leave out a semicolon it  quickly gives me an error. I was hoping the output will be that quick .
  • asked a question related to SAS Programming
Question
4 answers
Hi,
I'm using PROC SYSLIN in SAS to fit a system of equations that predicts biomass in Acacia spp. My goal is a weighted restricted SUR fit (WRSUR, as in Parresol, 1999), since the errors of the equations are correlated. I have different equations to predict the biomass of stem, foliage, branches and total biomass, each with a different variance function, for example:
Biomass(stem) = a0 + a1*(DAC^2*H) + ε
Biomass(branches) = b0 + b1*(DAC^2*H) + ε
Biomass(foliage) = c0 + c1*(DAC^2) + ε
Biomass(total) = d0 + d1*(DAC^2) + d2*(DAC^2*H) + ε
And the equations of the error variance, which will be used as weights, are:
σ²e(stem) = (DAC^2*H)^1.806
σ²e(branches) = (DAC^2)^1.8744
σ²e(foliage) = (DAC^2*H)^1.607
σ²e(total) = (DAC^2*H)^1.998
DAC=Diameter at the neck of the tree
H=Total tree height
I had no problems with a restricted SUR fit in SYSLIN; the procedure is relatively simple if the restrictions are set up properly through SRESTRICT. However, I am having syntax problems with the WRSUR fit, since I do not know how to specify the weights for each equation. I know I should define the four weights in the input data set, but not how to assign each weight to its corresponding equation for a weighted restricted SUR fit. Also, do I have to define a variance-covariance matrix beforehand in SYSLIN? Can anyone help me with the correct syntax? I would appreciate it a lot, since I'm stuck.
Relevant answer
Answer
you can try this:
proc model data=simul;
   /* bstem, bbranch, bfol and btotal stand in for Biomass(stem), Biomass(branches),
      Biomass(foliage) and Biomass(total), which are not valid SAS names.
      DAC2 (squared diameter) and H must be variables in the input data set. */
   var bstem bbranch bfol btotal DAC2 H;
   parms a0 a1 b0 b1 c0 c1;

   bstem = a0 + a1*(DAC2*H);
   /* weights; I recommend Balboa-Murias 2006, For. Ecol. Manage., for details */
   resid.bstem = resid.bstem / (((DAC2*H)**1.806)**0.5);
   e1 = actual.bstem - pred.bstem;

   /* and so on for all components, using the equations from the question */
   bbranch = b0 + b1*(DAC2*H);
   resid.bbranch = resid.bbranch / ((DAC2**1.8744)**0.5);
   e2 = actual.bbranch - pred.bbranch;

   bfol = c0 + c1*(DAC2);
   resid.bfol = resid.bfol / (((DAC2*H)**1.607)**0.5);
   e3 = actual.bfol - pred.bfol;

   /* total = sum of the component models, which enforces additivity */
   btotal = a0 + a1*(DAC2*H)
          + b0 + b1*(DAC2*H)
          + c0 + c1*(DAC2);
   resid.btotal = resid.btotal / (((DAC2*H)**1.998)**0.5);
   e4 = actual.btotal - pred.btotal;

   outvars e1 e2 e3 e4;
   fit bstem bbranch bfol btotal
       start=(a0 XX a1 XX b0 XX b1 XX c0 XX c1 XX)   /* XX = start values */
       / sur outs=Smatrix outest=coeff covout out=values outpredict collin;
run;
quit;
  • asked a question related to SAS Programming
Question
6 answers
I'm using the GLM procedure to analyze a factorial assay with additional treatment (3x3+1). However, I don't know how to put the additional treatment in the model.
Relevant answer
Answer
Perhaps have a look at the following paper.  It includes SAS code examples for the dummy variables referred to by Ariel to achieve what you are looking to do. 
Marini, R.P. 2003. Approaches to Analyzing Experiments with Factorial Arrangements of Treatments Plus Other Treatments. HortScience 38:117-120.
Hope this helps.
  • asked a question related to SAS Programming
Question
3 answers
I hear it has good recommendations for any software, but does anyone have a good recommendation for a similar book written specifically for SAS?
Relevant answer
Answer
Urko, thanks. I have that one too. I've always found it a little obtuse for some reason. I know it's popular but things seem out of order to me. I have Cody and Pass "SAS programming by example" which I like much more.
After coming across Long's book (and becoming a quick and unrepentant convert), I've been wondering why no one has written a similar book, with the same orientation, for SAS. You're right, Bernice, that the principles are 99-100% generalizable. I'm becoming a Stata convert because of the ease of documenting variables and files with metadata, and the general simplicity of the code. But I use a lot of Long's principles and rules in file management for all my projects. It's been a real workstyle changer for me.
  • asked a question related to SAS Programming
Question
7 answers
I am interested to use contrast and estimate statements between the same fixed-effect values in my mixed model. I know there won't be statistical difference and that the estimate will be zero (0). However, I am interested in the confidence interval estimates. I would like to perform linear contrast/estimate between two groups (i.e., -A +A +B -B, or in other words, -1 +1 +1 -1). Is that possible? How do I do that using SAS (PROC MIXED)?
Relevant answer
Answer
If I understand the question .... This is a classic situation for using a computer intensive method. This assumes that you have a large number of replicates (16 replicates for a permutation test, 20 for all others). If you have less data, then this is pointless.
In SAS take your data set and randomly select a subpopulation of size n. Run proc Mixed, save the results. Randomly select another subpopulation, and run proc mixed. Do this 10,000 to 50,000 times. If you really want to understand your data, try doing this again for subpopulations of different sizes. You can do this by modifying the program that is published in the appendix of this manuscript:
Randomization Tests: Example Using Morphological Differences in Aphis gossypii (Homoptera: Aphididae) T. A. EBERT, W. S. FARGO, B. CARTWRIGHT, F. R. HALL
Annals of the Entomological Society of America 10/1998; 91(6):761-770
The PDF is on my ResearchGate account.
I highly recommend that you spend some time playing with your data, or start by playing with the data that are also included in the cited manuscript, or use data generated using a random number generator. Think of things this way: You run an experiment, gather data, and calculate a mean and standard deviation. Yet you know that if you were to do exactly the same experiment you would get a slightly different answer. If someone else would do exactly the same experiment they too would get a different answer. We hope that the difference is not too large, but that is hope. The point is that you calculated a mean and standard deviation, but by repeating the experiment you will be able to calculate a mean and standard deviation for your original mean, and a mean and standard deviation for your original standard deviation. To start you used your experimental design to test your system. Now you are testing your experimental design.
How does changing subpopulation size influence the outcome?
How does changing the number of randomizations change the outcome? What happens if I use 10 randomizations (run 1000 sets of 10 randomizations)? What if I use 100, 1000, 10,000 randomizations? You can now use this outcome to quantitatively justify a specific number of randomizations that should be used in your system.
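The resampling loop described above can be sketched without an explicit loop by letting PROC SURVEYSELECT create all replicates at once and running PROC MIXED with a BY statement. This is a minimal sketch under assumed names, not the program from the cited manuscript: the data set trial, the response y, the two-level fixed effect grp, the subpopulation size of 20 and the simple difference between the two group means are all hypothetical.

```sas
/* Hypothetical names: data set trial; response y; fixed effect grp with two levels */
%let n = 20;                        /* subpopulation size, an assumed value */

/* Draw 10,000 random subsets of size &n; each is tagged with the variable Replicate */
proc surveyselect data=trial out=sub method=srs sampsize=&n
                  reps=10000 seed=2024 noprint;
run;

ods select none;                    /* hide printed output, ODS OUTPUT still works */
proc mixed data=sub;
   by replicate;
   class grp;
   model y = grp;
   estimate 'first minus second level' grp 1 -1;
   ods output Estimates=est;        /* one row per replicate */
run;
ods select all;

/* Empirical 2.5th and 97.5th percentiles of the 10,000 estimates */
proc univariate data=est noprint;
   var estimate;
   output out=limits pctlpts=2.5 97.5 pctlpre=p_;
run;
```

Changing &n or the REPS= value and rerunning gives a direct way to explore the two questions above about subpopulation size and number of randomizations.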
  • asked a question related to SAS Programming
Question
3 answers
I am trying to resolve a problem with count data. I first fitted a Poisson regression model, but the model showed underdispersion.
I then tried a restricted generalized Poisson regression model, but I ran into problems with the SAS code. Can anyone propose a suitable SAS procedure for this case?
Relevant answer
Answer
Another option might be to use a generalized Poisson distribution.  I know this can be done in STATA, but I don't know if/where one would do it in SAS.
Or, you could transform your data monotonically - say, via square root or log function.  This would be workable, since your minimum is >0 on the original scale.
  • asked a question related to SAS Programming
Question
5 answers
Having imported the map and data set into SAS, I used the CHORO statement in PROC GMAP to show the points on the different states. Since the polygons comprise different states (provinces) located in different zones, the points are shown at the wrong locations.
My data set contains X and Y (UTM) and ID.
Any idea what I should do to make the points appear at the right locations? Thank you in advance for any help.
Relevant answer
Answer
Check out the data structure of the internal dataset utilized in SAS/Help/GMAP, for its example below:   
goptions reset=all border;
title1 "Population in Europe";
footnote1 j=r "GMPCHORO";
proc gmap map=maps.europe(where=(id ne 405 and id ne 845))
          data=sashelp.demographics(where=(cont=93)) all;
   id id;
   choro pop / cdefault=yellow;
run;
quit;
Look at the structure of the data set, sashelp.demographics, and make sure your data structure is set up similarly.
  • asked a question related to SAS Programming
Question
6 answers
What is the best way to deal with quasi-complete separation of data points in a logistic regression?
Relevant answer
Answer
Hi Rolando,
it depends on the cause of separation and on what you are looking at. Can you provide more detail?
As a general advice, this might help you:
  • asked a question related to SAS Programming
Question
7 answers
I used a split plot design with two main plots and two replicates. Within each main plot, I used an 8 x 10 alpha lattice design to assign 79 varieties. I have the data but am facing difficulties to analyze them.
Relevant answer
Answer
hi Garcia,
find here the code:
PROC MIXED METHOD = REML;
CLASS gen treat rep block;
MODEL y = gen treat gen*treat rep / DDFM=SATTERTH;
RANDOM treat*rep block(treat rep);
LSMEANS gen*treat/PDIFF CL;
RUN;
  • asked a question related to SAS Programming
Question
4 answers
I have a dataset like:
obs name
1 Ram
2 Ram
3 Ram
4 Shyam
5 Shyam etc.
What is the easiest procedure to change/rename the value 'Ram' to 'Sharma' in SAS?
Relevant answer
Answer
Amir's code should work for the given case, you can alternatively use the TRANWRD function to do the same task.
Data two;
set one;
Name=TRANWRD(NAME, 'Ram' , 'Sharma');
run;
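Two caveats apply to TRANWRD: it replaces the substring wherever it occurs, so a name such as 'Ramesh' would become 'Sharmaesh', and the result is truncated if NAME is stored with fewer than six characters. A minimal sketch of an exact-match alternative follows, using the data set and variable names from the question. The length of 20 is an assumed value.

```sas
data two;
   length name $ 20;                 /* declared before SET so 'Sharma' is not truncated */
   set one;
   if name = 'Ram' then name = 'Sharma';   /* comparison is case-sensitive */
run;
```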
  • asked a question related to SAS Programming
Question
6 answers
I have a dataset which I want to fit a non-linear model. I've tried hyperbolic and logarithmic models that fitted with the same R Square. But I don't know which one is better. Can anyone help me with that?
Relevant answer
Answer
Hi Ehsan,
Good question. Comparing models using R-squared is problematic. One problem is that R-squared has no underlying distribution, so it's difficult to know how to test if one R-squared is better than another. The second issue, and I think one that is much more intuitive to understand, is that R-squared values increase as more terms are added to the model. Therefore, using these R-squared values to compare models will lead you to pick the more complex model, in almost every situation.
Solutions:
- Use adjusted R-squared: It's the same formula as R-squared except that you divide the Sum of Square (SS) of the residuals (the numerator) by "n-K," where n = number of data points and K = number of parameters in your model. Divide the total SS (the denominator) by "n-1."
- Use Akaike Information Criterion (adjusted for small sample sizes) to compare between models.
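Written out, the two criteria above look as follows. Here $SS_{res}$ and $SS_{tot}$ are the residual and total sums of squares, $n$ is the number of data points, $L$ is the maximized likelihood, and $K$ is the number of estimated parameters. Conventions for $K$ differ between texts (for AIC it usually also counts the error variance), so treat the exact count as an assumption to check against your software.

```latex
R^2_{adj} = 1 - \frac{SS_{res}/(n-K)}{SS_{tot}/(n-1)}

AIC = -2\ln L + 2K, \qquad
AIC_c = AIC + \frac{2K(K+1)}{n-K-1}
```

The model with the higher adjusted R-squared, or the lower AICc, is preferred, and both hyperbolic and logarithmic fits must be computed on the same response scale for the comparison to be valid.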
Depending on what your overall modeling goal is, you may also consider the kind of "nonlinear" model you've chosen. The hyperbolic and logarithmic models that you mention are good if you think that the underlying data have some natural functional form. But just like quadratic relationships (or other polynomials), these models are approximations of the data. In other words, the user is forcing the relationship on the data. I have no problem with this in most cases; for instance, for modeling growth rates of a cat or dog, the logistic growth curve is reasonable, simply because dogs start small, grow fast, and then reach adult size. Depending on your data, you may want to look into options that allow you to "let the data speak for themselves." One modeling framework I enjoy is the Generalized Additive Model, but there are others as well.
I hope this helps.
  • asked a question related to SAS Programming
Question
5 answers
Could someone please help me with SAS code for the following problem?
I have 1 patient and 35 observations.
I would like to make 10 sets of 2 observations selected randomly from the full set of 35, then 10 sets of 3 observations selected randomly from the 35, then 10 sets of 4 observations selected randomly from the 35, and so on and so forth until I have say 20 sets of 10 randomly selected observations.
This random selection can also be with replacement.
I see something similar using PROC PLAN (order and treat) but I have an actual dataset and want to apply the random selection to it and not produce a theoretical dataset. This goes beyond my SAS knowledge.
Relevant answer
Answer
Hi Inti, you can use PROC SURVEYSELECT to draw a random sample from your data without replacement, and a DO loop if you need sampling with replacement.
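A minimal sketch of how this could look for the sets described in the question, assuming the 35 observations sit in a hypothetical data set obs35. The macro name subsets and its parameters are made up for illustration. PROC SURVEYSELECT can also sample with replacement itself through METHOD=URS, so a hand-written loop is only one option.

```sas
/* Hypothetical names: data set obs35 holds the 35 observations */
%macro subsets(data=obs35, minsize=2, maxsize=10, reps=10, seed=1234);
   %local k;
   %do k = &minsize %to &maxsize;
      /* REPS= sets of size &k each; the output variable Replicate numbers the sets */
      proc surveyselect data=&data out=sets_&k
                        method=srs       /* for replacement: method=urs outhits */
                        sampsize=&k reps=&reps
                        seed=%eval(&seed + &k) noprint;
      run;
   %end;
%mend subsets;

%subsets()                               /* creates sets_2, sets_3, ..., sets_10 */
```

Calling the macro again with, for example, minsize=10, maxsize=10 and reps=20 gives the 20 sets of 10 observations mentioned at the end of the question.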
  • asked a question related to SAS Programming
Question
7 answers
I am modeling time to death, and have a major issue with a competing risk in that the risk of getting another event (besides death or study end) is more common than death or study end. What is the best way to obtain an accurate Hazard Ratio for the effect of my exposure of interest on death, given these competing risks?
Relevant answer
Answer
I had to do a similar problem on competing risks a couple of years ago on SAS and found I couldn't do it. I ended up using STATA as that had a module that worked quite well.
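More recent SAS/STAT releases (13.1 and later, as far as I know) can handle this in PROC PHREG through the EVENTCODE= option. The sketch below is a hedged illustration with hypothetical names: data set cohort, follow-up time time, a status variable coded 0 = censored, 1 = death, 2 = competing event, and a numeric or 0/1 covariate exposure.

```sas
/* Hypothetical names: data set cohort; status 0 = censored, 1 = death, 2 = competing event */

/* Subdistribution hazard (Fine and Gray): subjects with the competing event
   remain in the risk set */
proc phreg data=cohort;
   model time*status(0) = exposure / eventcode=1 risklimits;
run;

/* Cause-specific hazard: the competing event is treated as censoring */
proc phreg data=cohort;
   model time*status(0, 2) = exposure / risklimits;
run;
```

The two hazard ratios answer different questions, so which one counts as "accurate" depends on whether the interest is in the observed risk of death (subdistribution) or in the effect of the exposure on the death rate among those still event-free (cause-specific).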