Statistical Analysis and Data Mining

I don’t keep many actual books next to my desk these days. I have found that my hard drive has become my main knowledge repository. For those interested, everything I receive online (email, documents, spreadsheets, video, research papers, etc.) is fed into my knowledge base using Devonthink.

A rare exception to this is a new book that has really impressed me: Handbook of Statistical Analysis and Data Mining Applications by Robert Nisbet, John Elder IV, and Gary Miner. It is available on Amazon for about AUD80.

Why has this 800+ page book squeezed its way onto my crowded desk? It’s useful to a part-time data miner whose post-graduate maths and stats courses are in the dim and distant 1990s. I have found it useful in a number of ways:

Reference Guide. Section II is a lexicon of the algorithms used in structured and unstructured (i.e. text) data mining.
Problem Solving. Section III is a substantial how-to guide to data mining in practice. The 13 tutorials cover a wide range of problems and industries/fields.
Mentoring. Section I is a great primer for people new to the field. I would use it to help any analyst who joins one of my teams.

I haven’t yet made use of Section IV of the book (Measuring True Complexity, the “right model for the right use”, Top Mistakes, and the Future of Analytics), but I know it’s something I should get to.

The book is a practical guide to using SAS Enterprise Miner and STATISTICA Data Miner. There is also a section on SPSS Clementine, and examples using STATISTICA’s C&RT, CHAID, MARSpline, and other data mining and graphical analytic tools are sprinkled throughout the book.

Here’s a link to the table of contents.

I don’t need it every week, but when I do I’m really glad I have it to hand.
