CSI 991
Seminar in Computational Statistics:
Logistic Regression
Summer, 2003
Fridays 3:00pm -- 5:00pm, Science & Technology, Room 206
Contacts:
csutton@gmu.edu
jgentle@gmu.edu
We will work through some of the material in
Applied Logistic Regression,
Second Edition,
by David W. Hosmer and Stanley Lemeshow.
ftp site for data used in text.
More data (under "Regression - Logistic").
Datasets used by Hosmer and Lemeshow
Jim Shine converted the data files to text form and added headers.
I produced some SAS datasets with other variables used by Hosmer and
Lemeshow.
The descriptions are given in the tables indicated. For convenience,
I have included the descriptions that Jim S provided, but the statements
about where these datasets are discussed in the book are not correct (at least for the second edition).
ICU (Table 1.5)
Description
(ignore the statement about the book)
Data (in text form with header)
Low Birth Weight (Table 1.6)
Description
(ignore the statement about the book)
Data (in text form with header)
Prostate Cancer (Table 1.7)
Description
(ignore the statement about the book)
Data (in text form with header)
UMARU Impact (Table 1.8) (The dataset is called UIS)
Description
(ignore the statement about the book)
Data (in text form with header)
SAS dataset (with additional variables, see Table 4.7)
Age and CHD (Table 1.1)
Description
Data
If the colors are annoying, just copy the source into a convenient place
and edit it. Search-and-replace to substitute "cooler" for "color" and
"txet" for "text" everywhere.
(Browsers just ignore incorrect directives. Don't you
wish all processing programs were as forgiving?)
Exercise 1.1
A simple analysis using ICU data: the response is the binary STA variable,
and only one covariate or risk factor is modeled, AGE.
The things to be computed in the exercise are
(e) MLE; (g) LR, Wald, and score p values, deviance; (h) CIs; (i) estimated Cov,
logit and logistic probability with CIs for AGE=60; (j) estimated logit and
SEs for all, graph.
- SAS code (with PROC LOGISTIC and PROC CATMOD)
Questions:
- Can you plot directly in PROC LOGISTIC?
- Can CIs be computed for the logits (XBETA)?
- What is the difference between PROC LOGISTIC and PROC CATMOD here?
- SAS output
- SPlus code
Questions:
- How do you get the variance-covariance matrix for the coefficient estimates?
- How do you get other variances?
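As a rough illustration of the two variance questions above, here is a minimal sketch in plain Python (it is not the SAS or S-Plus code linked here). It fits the one-covariate model by Newton-Raphson, takes the estimated covariance matrix of the coefficients as the inverse of the information matrix at the MLE, and then gets the variance of the estimated logit at a chosen covariate value as Var(b0) + x0^2 Var(b1) + 2 x0 Cov(b0, b1). The function names are hypothetical, and the `ages` and `status` values are made-up toy numbers for illustration only; they are not the ICU data.

```python
import math

def score_and_info(b0, b1, x, y):
    """Score vector and information matrix for logit P(y=1) = b0 + b1*x."""
    u0 = u1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)              # binomial variance weight
        u0 += yi - p
        u1 += (yi - p) * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (u0, u1), (i00, i01, i11)

def invert_2x2(i00, i01, i11):
    """Inverse of the symmetric matrix [[i00, i01], [i01, i11]]."""
    det = i00 * i11 - i01 * i01
    return [[i11 / det, -i01 / det], [-i01 / det, i00 / det]]

def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Newton-Raphson MLE; returns (b0, b1) and the estimated covariance."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (u0, u1), info = score_and_info(b0, b1, x, y)
        c = invert_2x2(*info)
        d0 = c[0][0] * u0 + c[0][1] * u1   # Newton step = info^{-1} * score
        d1 = c[1][0] * u0 + c[1][1] * u1
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    # Re-evaluate the information at the final estimates before inverting.
    _, info = score_and_info(b0, b1, x, y)
    return (b0, b1), invert_2x2(*info)

def logit_and_prob_ci(beta, cov, x0, z=1.959964):
    """Wald interval for the logit at x0, mapped to the probability scale."""
    b0, b1 = beta
    g = b0 + b1 * x0
    se = math.sqrt(cov[0][0] + x0 * x0 * cov[1][1] + 2.0 * x0 * cov[0][1])
    lo, hi = g - z * se, g + z * se
    expit = lambda t: 1.0 / (1.0 + math.exp(-t))
    return (g, lo, hi), (expit(g), expit(lo), expit(hi))

# Toy values for illustration only (NOT the ICU data).
ages = [20, 30, 40, 50, 60, 70, 80, 25, 35, 45, 55, 65, 75, 85]
status = [0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1]

beta, cov = fit_logistic(ages, status)
logit_ci, prob_ci = logit_and_prob_ci(beta, cov, 60)
print("coefficients:", beta)
print("covariance matrix:", cov)
print("logit at 60 (estimate, lower, upper):", logit_ci)
print("probability at 60 (estimate, lower, upper):", prob_ci)
```

The interval for the probability is obtained by transforming the endpoints of the logit interval, which keeps it inside (0, 1). Any other linear combination of the coefficients gets its variance from the same covariance matrix in the same way.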
Selected Tables from Chapter 4
SAS code
SAS output
The output contains more than just what is in Hosmer and Lemeshow's tables.
Also, for Table 4.15, you have to pick out from my table
the 5 models that appear in their table.
These data are from the UMARU Impact study.
A SAS dataset that contains the original variables (described in Table 1.8)
as well as the variables constructed and used in Chapter 4 is available
from the link above. (The dataset is called uis, and it is in the file
uis.sas7bdat.)
Some things Darryl did for Chapter 4 and Chapter 5
SAS code
SAS output