Starting Friday, October 4, at 3:30 PM in Room 206 of Science and Technology 1 (*not* Sci Tech 2), we will begin discussing Frank Harrell's Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. While this book isn't pushing at the edge of the envelope like Hastie, Tibshirani, and Friedman, it isn't old-fashioned either. It seems clear to me from H,T&F that the newer methods are built on a firm foundation of older methods, and it may be that a book like Harrell's is just what some of us need to fill in some gaps and set us up to better appreciate statistical methods that are close to the frontier. (To me, data mining is basically regression, classification, clustering, and estimation of conditional probabilities associated with categorical variables, with an emphasis on computer-intensive procedures to deal better with large data sets. From H,T&F one can see that simple issues such as the bias-variance tradeoff, matrix manipulations, and techniques for variable selection and model building are important in the "new statistics", just as they are for the old least squares dinosaurs. Harrell's book ought to help us get more comfortable with some of the basic issues, and see connections between older methods and newer methods.) Since Harrell's book makes use of S-Plus, the seminar ought to help some of us learn how to use that software better.

For the first meeting, we only have 90 minutes. (Most weeks we will go from 3 to 5 PM, but on the first Friday of Oct., Nov., and Dec., we will start at 3:30 PM.) Let's plan to have read the Preface, Ch. 1 & Ch. 2 prior to the meeting. This will be more pages than we will typically cover, but my guess is that the first 20 pages or so may be relatively simple compared to the rest of the book, and we won't need to spend a lot of time on them. I believe (not sure, since my book was only shipped out from Amazon.com this past Thursday) that the end of Ch. 2 touches on splines and trees, so there is a possibility that even in Ch. 2 there is coverage of topics that some of us haven't mastered.

The most important thing will be for you to do the reading each week, and prepare a short list of questions that you have or topics that you'd like to discuss. Since there may be as many as 10 of us this semester, I think it will be best if we try to have just one person talking at a time. (If too many conversations are going on at the same time, it's hard for people to follow.) Perhaps a good plan will be that at the start of each meeting, we'll go around the table and have everyone check in with where they stand with regard to the material. (E.g., someone might say, "I've read through p. 35, but became less confident in my understanding about p. 28 or so." Another person might warn us that he has a bunch of questions about a particular subsection, and maybe from time to time someone will say that she has prepared a 10-minute presentation on an example from the book, including the exploration of a possibility that the author overlooked.)

CDS