Plan for Friday, Oct. 25 (3 PM meeting)

I think a reasonable plan would be to first tie up loose ends from what we had planned to cover this past Friday, and then continue in Ch. 4 of Harrell through the end of p. 72. (If we get through p. 72, we may be able to finish Ch. 4 the next week, even though we will only have 90 minutes for the Nov. 1 meeting.)

Specifically, the loose ends are:

  Sec. 3.8: D.G.
  p. 59: J.S.
  rest of Sec. 4.3: A.K. (Can you find a small economic or medical data set that has an estimated coefficient with the "wrong sign"?)
  Sec. 4.4: C.P.

For the new material, I think it may be best not to have specific section assignments. Rather, we'll just have a general group discussion about pp. 62-72 (& the very last part of p. 61), with the discussion concentrating on parts that group members have questions about. (We'll consider the material (sub)section by (sub)section, but move right along if no one has any questions.) My guess is that some portions of the material will be familiar to most, and will warrant little discussion. Other parts may be generally more confusing, and will warrant more discussion, but it isn't clear who can best explain the trouble spots, and so I won't suggest specific assignments.

Below, I'll indicate some portions of the material that may warrant some discussion (but in some cases it may be that we won't figure out for sure at this point what Harrell means, and we'll have to move ahead and hope that matters are clarified later in the book).

(a) Is the shrinkage factor on the last line (not counting footnote) of p. 63 a special case of (4.1) on p. 63? Perhaps we can consider both formulas for the case of y = beta_0 + beta_1 * x_1 + beta_2 * x_2 + beta_3 * (x_2)^2 + e. (It seems to me that if we generate data according to such a model, we can (try to) compute the shrinkage factor(s), and also compute the shrunken estimate of beta given by (4.3) on p. 64 ... and see if the shrunken estimates are closer to the truth.
*** Let's all try this using beta_0 = 0, beta_1 = 1, beta_2 = 1, & beta_3 = 1, with iid N(0, 1) error terms, with each person choosing a sample size and design (x values) to make an "interesting" example. *** With regard to the shrinkage factors, the 2nd one should be easy (just use (4.2)), but for the 1st one you'll have to figure out what is meant by the model chi-square (it's a likelihood ratio statistic, but you need to figure out how to compute its value).)

(b) Can you explain why the 1st sentence of Sec. 4.6 is true?

(c) In the last sentence of the 2nd to last paragraph on p. 65, does Harrell mean that VIF is *never* very informative? Also, what exactly does he mean by variables being algebraically connected to each other?

(d) Can we try out the MGV method using a simple example like the 2-variable one given near the top of p. 68? Then do we dare try it with 3 variables?

(e) Can anyone explain the part of subsection 4.7.4 on p. 70 after the first 2 sentences? (I.e., explain the part that begins with "For the ordinal count" and ends with "new summary variable.")

(f) Can anyone explain the last paragraph of subsection 4.7.5 on p. 72?

Looking ahead, the next 2 chapters after Ch. 4 are relatively short compared to Ch. 4. But before we start Ch. 5, it may be good to spend at least a week trying some problems from the book or trying some of the techniques described on some examples of our own. (Oddly, Ch. 4 doesn't have any end-of-chapter problems. Still, I think we should work through a few things from Ch. 4 before moving on to Ch. 5.)

CDS
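
A possible starting point for the simulation exercise in item (a), sketched in Python using only the standard library. Several things here are assumptions rather than anything taken from the book, and should be checked against pp. 63-64: the first shrinkage factor is taken to be (model chi-square - p) / model chi-square, the second to be ((n - p - 1)/(n - 1)) * R^2_adj / R^2, and the "shrunken estimate" is taken to be the simplest version (multiply each slope by the shrinkage factor and re-center the intercept), which may not be exactly what (4.3) says. The model chi-square is computed as the Gaussian likelihood ratio statistic n * log(SST/SSE), with the error variance estimated by maximum likelihood. The uniform design on (-1, 1) and the function names are placeholders; each person would substitute their own design.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def simulate(n=30, seed=1):
    """Generate one data set from the model in item (a) and shrink the fit."""
    rng = random.Random(seed)
    true_beta = [0.0, 1.0, 1.0, 1.0]  # beta_0 .. beta_3 as given in the plan
    # ASSUMPTION: placeholder design, x1 and x2 iid uniform on (-1, 1).
    x1 = [rng.uniform(-1, 1) for _ in range(n)]
    x2 = [rng.uniform(-1, 1) for _ in range(n)]
    X = [[1.0, a, b, b * b] for a, b in zip(x1, x2)]
    y = [sum(bj * xj for bj, xj in zip(true_beta, row)) + rng.gauss(0, 1)
         for row in X]

    b = ols(X, y)
    p = 3  # regression degrees of freedom, intercept excluded
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)
    sse = sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
              for row, yi in zip(X, y))
    r2 = 1 - sse / sst
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)

    # Likelihood ratio chi-square for all slopes = 0 (Gaussian errors,
    # error variance estimated by maximum likelihood under each model).
    chi2 = n * math.log(sst / sse)

    # ASSUMPTION: forms of the two shrinkage factors; verify against p. 63.
    gamma_lr = (chi2 - p) / chi2
    gamma_ols = (n - p - 1) / (n - 1) * r2_adj / r2

    # ASSUMPTION: simplest shrunken estimate (may differ from (4.3)):
    # scale the slopes by gamma_lr, then re-center the intercept so the
    # mean fitted value still equals the mean of y.
    slopes = [gamma_lr * bj for bj in b[1:]]
    col_means = [sum(row[j] for row in X) / n for j in range(1, 4)]
    intercept = ybar - sum(s * mu for s, mu in zip(slopes, col_means))

    def slope_error(est):
        return sum((e - t) ** 2 for e, t in zip(est, true_beta[1:]))

    return {
        "beta_hat": b,
        "beta_shrunk": [intercept] + slopes,
        "r2": r2,
        "chi2": chi2,
        "gamma_lr": gamma_lr,
        "gamma_ols": gamma_ols,
        "err_ols": slope_error(b[1:]),
        "err_shrunk": slope_error(slopes),
    }


if __name__ == "__main__":
    for key, value in simulate(n=30, seed=1).items():
        print(key, value)
```

A single data set will not settle whether shrinkage helps. Calling simulate() over many seeds and averaging err_ols and err_shrunk gives a fairer comparison, and printing gamma_lr next to gamma_ols for each sample bears directly on the question of whether one formula is a special case of the other.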