Machine Learning Books
Related Subjects: Case-Based Reasoning Companies Mailing Lists Conferences Research Groups Software Datasets Publications

Used price: $50.00

Great product & service Review Date: 2007-09-21
A Very Bad Sequel Review Date: 2007-03-08
I have tried to use this book, but after constant student complaints and my own difficulty with the text, I have finally concluded that the problem lies with the text and not with the users.
I think an indicator of problems was the large number of errors in the first printing; large here is an understatement. Even in later editions, the 4th, the size of the errata is huge. I think this is indicative of the authors' attention to detail and seriousness in preparation. I have found similar errors and ambiguities in the associated Computer Manual.
The bottom line is that this book has seen its final appearance in our curriculum. I would use any other text, even an older one.
There is simply not enough room or time to point out all the problems with this text. Do yourself a favor if considering this text for a class. Don't bother.
Terrible Problems Review Date: 2008-04-09
For one, the homework problems it provides are not really representative of what you're learning in the text. Almost all of the problems revolve around proofs, as opposed to using the concepts in practice. You can seemingly have a good grasp on the material, yet spend hours trying to solve each of the problems they provide for that particular section. My entire class has complained, and even my professor has admitted that even he isn't sure sometimes how they expect you to solve some of the problems.
Secondly, there are very few example problems demonstrated in the text, so the reader doesn't really get to see the concepts in action so to speak.
Also, there is a typo or error on almost every other page, sometimes even on important formulas.
Overall, I'd have to think there are better books out there. If this truly is "the best there is" as some reviewers claim, God help the field of Pattern Recognition.
excellent revision of a classical text on statistical pattern recognition Review Date: 2008-01-24
With this in mind the authors and their new coauthor David Stork go about the task of providing a revision. True to the goals of the original, the authors undertake to describe pattern recognition under a variety of topics and with several available methods to cover each topic. Important new areas are covered, and old ones now deemed less significant are dropped. Advances in statistical computing and computing in general also dictate the topics. So although the authors are the same and the title is almost the same (note that scene analysis is dropped from the title), it is more like an entirely new book on the subject than a revision of the old. For a revision, I would expect to see mostly the same chapters with the same titles and only a few new chapters along with expansion of old chapters.
Although I view this as a new book, that is not necessarily bad. In fact it may be viewed as a strength of the book. It maintains the style and clarity of the original that we all loved but represents the state-of-the-art in pattern recognition at the beginning of the 21st Century.
The original had some very nice pictures. I liked some of them so much that I used them with permission in the section on classification error rate estimation in my bootstrap book. This edition goes much further with beautiful graphics including many nice three-dimensional color pictures like the one on the cover page.
The standard classical material is covered in the first five chapters with new material included (e.g. the EM algorithm and hidden Markov models in Chapter 3). Chapter 6 covers multilayer neural networks (a totally new area). Nonmetric methods including decision trees and the CART methodology are covered in Chapter 8. Each chapter has a large number of relevant references and many homework exercises and computer exercises.
Chapter 9 is "Algorithm-Independent Machine Learning" and it includes the wonderful "No Free Lunch" theorem (Theorem 9.1), a discussion of the minimum description length principle, overfitting issues and Occam's razor, bias-variance tradeoffs, resampling methods for estimation and classifier evaluation, and ideas about combining classifiers.
Chapter 10 is on unsupervised learning and clustering. In addition to the traditional techniques covered in the first edition the authors include the many advances in mixture models.
I was particularly interested in that part of Chapter 9. There is good coverage of the topics and they provide a number of good references. However, I was a bit disappointed with the cursory treatment of bootstrap estimation of classification accuracy (section 9.6.3 on pages 485 - 486). I particularly disagree with the simplistic statement "In practice, the high computational complexity of bootstrap estimation of classifier accuracy is rarely worth possible improvements in that estimate (Section 9.5.1)". On the other hand, the book is one of the first to cover the newer and also promising resampling approaches called "Bagging" and "Boosting" that these authors seem to favor.
Davison and Hinkley's bootstrap text is mentioned for its practical applications and guidance for bootstrapping. The authors overlook Shao and Tu which offers more in the way of guidance. Also my book provides some guidance for error rate estimation but is overlooked.
My book also illustrates the limitations of the bootstrap. Phil Good's book provides guidance and is mentioned by the authors. But his book is very superficial and overgeneralized with respect to guiding practitioners. For these reasons I held back my enthusiasm and only gave this text four stars.
Stick with the first edition Review Date: 2007-11-19

Used price: $43.94

It shows me many examples Review Date: 2005-04-08
Very, Very, Very Bad Book ! Review Date: 2004-12-22
The book is indescribably bad.
It is bad if you want theory.
It is bad if you want practical advice.
It is just plain bad.
BAD! BAD! BAD!
Do yourself a favor and go out and get the books by Gordon Linoff et al. (Mastering Data Mining and Data Mining Techniques). I believe that Amazon will sell you both for not much more money than if you buy this book. Either one of those books is better than this! (I recommend buying both, you won't be sorry!)
SAVE YOUR MONEY, AVOID THIS BOOK !!!
finally a good statistical and computer science perspective on data mining Review Date: 2008-01-23
They also provide a very well organized structure for the text that is well described in the preface. It consists of three parts. Chapter 1 is an essential introduction that is informative to everyone. Chapters 2 through 4 go through basic statistical ideas that statisticians would be very familiar with and others could view as a refresher. The authors have experience teaching this course to engineering and science majors and have found that many of these students unfortunately do not have the prerequisite statistical inference ideas and need this material covered in the course.
Chapters 5 through 8 cover the components of data mining algorithms and the remaining chapters deal with the details of the tasks and algorithms.
The book features a further reading section at the end of each chapter that provides a very nice guide to the useful and most significant relevant literature. The authors have done a very good job at this. One mistake I found was a reference to Miller (1980). I think this was intended to be a reference to the second edition of Rupert Miller's text "Simultaneous Statistical Inference" which was published in 1981 by Springer-Verlag but the full citation is missing from the list of references in the back of the book.
This book deserves 5 stars because it does what it intends to do. It presents the field of data mining in a clear way covering topics on classification and kernel methods expertly. David Hand has published a great deal on these techniques including many fine books.
Mannila and Smyth bring to the text the computer science perspective. There is much useful material on optimization methods and computational complexity.
Statistical modeling and issues of the "curse of dimensionality" and the "overfitting problem" are key issues that this text emphasizes and expertly addresses.
The only thing the text misses is details on specific algorithms. But I do not grade them down for that because it was not their intention. They emphasize methodology and issues and that is the most critical thing a practitioner needs to know first before embarking on his own attack at mining data.
The text does provide most of the current important methods. Although Vapnik's work is mentioned and his two books are referenced, there is very little discussion of support vector machines and the use of Vapnik-Chervonenkis classes and dimension in data mining. The new book by Hastie, Tibshirani and Friedman goes into much greater detail on specific algorithms, including some only briefly discussed in this text (e.g. support vector machines). The support vector approach is also nicely treated in "Learning with Kernels" by Scholkopf and Smola.
I highly recommend this book for anyone interested in data mining. It is a great reference source and an eloquent text to remind you of the pitfalls of thoughtless mining or "data-dredging". It also has many nice practical examples and some interesting success stories on the application of data mining to specific problems.
make sure you are right audience Review Date: 2005-12-02
Good book for overall breadth of algorithms.. Review Date: 2004-08-15

Used price: $5.00

Could have been a great one. Review Date: 2003-12-13
First, the good. The description of stochastic context free grammars is the best I've seen. I don't know any other reference that even hints at how to use generative grammars to evaluate likelihoods. Once they caught my interest, though, the authors did not carry through with training and evaluation algorithms I could really use. I suspect that parts of the information are there, but I'll have to go back over their opaque notation again to work out just what they've given and just what's been left out.
This same pattern - an interesting introduction with missing or mysterious development - recurs throughout the book. The discussion on clustering and phylogeny goes the same way: a number of techniques are mentioned but not developed. The authors mention a tree drawing problem, not just building the tree's topology, but ordering the branches for the most informative rendering. Again, a critical topic and one that most authors miss - in the end, these authors miss it, too, by mentioning but not filling in the idea.
Their discussion of neural nets suffers badly from the authors' partial presentation. Evaluation of network output for a given input is relatively straightforward, and they present it in some detail. Training the net is the real problem, though, and is given less than a page.
Baldi and Brunak give more of the fundamentals than most authors. For example, they explain the maximum entropy principle well enough that I'll use it in lots of other areas. They give some coverage to topics of intermediate complexity, such as the forward and backward algorithms for HMM training. Finally, they fizzle out at the higher levels of complexity - the Baum-Welch algorithm could have followed from the forward and backward methods, but is left as a reference to another book.
There is some good here, especially in the fundamentals behind important techniques. The discussions I wanted - the more advanced topics, in forms I can use - are often weak, missing, or impenetrable. Just a bit more work, clearly within the authors' capability, would have made this a landmark reference.
An excellent book. Review Date: 2001-10-23
A very bad book. A collection of references w/o explanations Review Date: 2001-09-19
Here is why. The book is badly written, hard to read and follow. Although it is said that this book is for "many readers", it is really for those who already know all the algorithms. It is simply impossible to learn the algorithms from this book. The chapter on neural networks is a few pages. It provides a few equations for backpropagation. That is it! It is pretty much true for everything else. Equations, hard to understand sentences, abbreviations with no explanations, tons of citations everywhere. A book should strive to explain, and not just cite other papers and tell you to go look there all the time. I suspect the few good reviews here are from the authors themselves.
I have a good programming background. I have also read some papers on neural networks and hidden Markov models. This book is a lot worse than anything I have read in explaining the stuff. Very disappointed. Save your money and get something else.
the worst book I have ever read Review Date: 2005-11-06
Terrible Review Date: 2006-03-16

Used price: $66.98

Quite Useful Review Date: 2008-07-13
Nice print, no mistakes, MATLAB code. You get everything on Kernel Methods, from theory to implementation. A perfect book and helped me a lot in my research.
Nice introduction, but no more Review Date: 2006-08-11
A Useful Reference on Kernel Methods Review Date: 2005-02-21
The rest of the book is a cookbook with plenty of MATLAB code.
The website contains most of the same code + data online. Readable, complete.
coherent and accessible reference, ready-to-use algorithms Review Date: 2005-02-22
It is theoretically well-founded, the resulting algorithms are well-explained and made accessible for practitioners by providing pseudo-code and online, ready-to-use MATLAB code.
This book nicely complements the previous, yellow book, written by the same authors. Indeed, after "getting into the field" by reading the accessible introduction to support vector machines (SVMs), it was clear to me that SVMs were only an example of a significantly larger framework, i.e., kernel methods. The blue book is the reference book about that larger framework I have been waiting for since then. I particularly like the way the book is set up, making clear the modular, flexible approach in kernel methods.
Sloppy Review Date: 2006-11-29
Constant repetitions do not add any clarity either.

Used price: $62.60

Very good introduction in causal Modeling Review Date: 2006-03-09
In my opinion this book is well written and the chosen examples are insightful. What I do not like is part three of the book, which is devoted to case studies and practical examples. If this space had been used for the first two parts by providing more details, e.g., for the discussion of path models (which is given, but only briefly), this book could even be great on a more advanced level. In this form it is very good as an introduction to Bayesian Networks and related topics like the larger class of causal models.
Excellent Introductory Text Review Date: 2004-12-16
Bayesian Networks for Undergrads and Practitioners Review Date: 2004-01-12
The content is divided in three main sections: (1) The basics of probabilistic reasoning with BNs, (2) Causal discovery (finding BNs from data), and (3) "Knowledge engineering".
The first part covers the fundamental concepts and algorithms around BNs and (simple) decision networks. It is well-written and clear, but readers who are not totally new to the field might find only little new information (e.g., loopy belief propagation, continuous densities, large decision networks, etc. are not covered).
The second part is on how to deduce causal relationships from observational data. Constraint-based and Bayesian approaches are covered, but on a rather general level. I am not sure how easy it is to implement the algorithms from the descriptions provided. When it comes to details of the algorithms, proofs, or mathematical background the authors very often refer to the literature due to "lack of space". From a practical standpoint, it is unfortunate that the different methods are compared to each other only superficially. For instance, one method presented performs a large number of statistical tests; one would expect that this requires large amounts of data in order to avoid false positive results. Is this a problem? With questions like these the reader is often left alone.
I am not competent to talk about part three (knowledge engineering), so I end with my general impression of the book: I would have appreciated it if the authors had treated some of the algorithms in greater detail and had spent a few pages on advanced concepts and current research directions. On the other hand, some information provided could have easily been left out. (For instance, how to download and install certain software packages from the internet, Kevin Murphy's well-known survey on BN software packages, screenshots of user dialogs, etc. just eat pages. Providing the URLs to the corresponding sites on the internet is completely sufficient, and the information there is more likely to be up-to-date.) The saved pages could then be spent on information which is not readily available elsewhere.
To summarize: The book provides a mostly well-written general overview of the basic concepts and could serve as a first introduction to the field. However, it often leaves the reader alone when it comes to the mathematical background, potential practical pitfalls, or advanced algorithms.


Misleading title Review Date: 2003-05-29
Still the best analysis around. Review Date: 2004-06-23
I recommend this book wholeheartedly.
Thorough review and new results Review Date: 2001-07-28
The book has a lot of new theory that is easy to follow and gives recommendations to make parallel genetic algorithms work well in many circumstances. Although the theory makes many simplifying assumptions, the examples in the book demonstrate that the models are very accurate and the recommendations made in the book seem very reasonable.

Too outdated Review Date: 2000-12-31
Pretty good... Review Date: 2000-04-12

Used price: $29.95

Trick philosophy Review Date: 2004-07-13
Yet if you are asked to act like a computer by reading numbers, moving paper tape, erasing things and following instructions given on the paper tape, you will prove to be one of the slowest computers in the world. The original word `computer' referred to a man sitting in a room with paper, pencil and eraser. These human `computers' were replaced by machines a long time ago because they are too slow.
In summary, humans are fast and intelligent at being humans but slow at being computers. In the Chinese Room Argument, John Searle states that although we have a human mind which could otherwise be used to understand Chinese, this particular human mind does not in fact understand it. Given this stipulation, the human mind's ability to process language cannot be used and the only method of "understanding Chinese" is left to the "Chinese room" which consists of a computer run by the very slowest of CPUs, the human being sans abacus, sans calculator, sans silicon chips and sans hope.
The Chinese Room Argument is a trick argument that proves nothing. The computer room is so slow that it cannot ever think or understand Chinese. On the other hand, this doesn't say anything about whether a high-speed computer with the memory and processing power of the human brain might one day speak and understand Chinese quite well.
Ignore the previous comments on "trick philosophy" Review Date: 2005-04-24
For example, the concepts we employ to think and the words we use to speak have meanings. But there is nothing in computationalism as syntax that has any meaning whatsoever. Whatever meaning an implemented formal program has results from its being programmed or interpreted by us. Syntax (e.g., a computer program) has no causal powers. Whatever causal powers computers have (e.g., to fly airplanes) result from our programming and our assigning interpretations to the electrical charges inside a chip, not from the program in itself.
The chapters in Views Into the Chinese Room attack different aspects of the CRA. But they address it as an argument that stands or falls on the truth of the premises and the validity of the inference, not on engineering questions such as the speed of computers, which are irrelevant. Searle believes that there are, in fact, thinking machines -- we human beings are biological machines that think. And he believes that there also could be artificially made machines that think. The CRA is meant to show only that an implemented computer program by itself cannot generate mental content or semantic content.
For a clear explanation of the CRA, see chapter 15 of this book, by Stevan Harnad, the editor of The Behavioral and Brain Sciences, where Searle's original paper appeared twenty years ago. Do not rely on reviewers who do not understand the argument in the first place.


Good Book Review Date: 2006-03-01
A very good introduction to Bayesian networks Review Date: 2003-06-14
The author also provides a good introduction to decision graphs, a close relative of Bayesian networks.
The aspect of Bayesian networks that I find most attractive is the fact that there is a "rational" way of designing a network, based on hypothesis, informational, and mediating variables, and their "causal" relationships. Unlike neural networks, in which one is almost forced to guess the appropriate structure of the network, every node in a Bayesian network corresponds to a state or quantity that can be measured either directly or indirectly through other variables. Thus, changes in a system model should only induce local changes in a Bayesian network, whereas system changes might require the design and training of an entirely new neural network.
Another aspect of Bayesian networks that I find very compelling is the way in which they seem quite amenable to learning and the presentation of new evidence. This is true since knowledge updating is done locally (through variables), while the effects of those changes are witnessed globally through appropriate belief-updating algorithms.
On the downside, it should be noted that the operation of belief-updating is in general NP-hard, thus there exists a valid concern about the computational efficiency of Bayesian networks. Contrast this with the fact that once a neural network has been trained, it is quite easy to compute. One would hope that these concerns will subside with more research, for the above mentioned benefits of Bayesian networks lead me to believe that these networks will have quite an influence on the future directions of machine learning.
Although this book will not go down in history as the definitive reference for Bayesian networks, it serves as a good conduit for explaining this quite interesting area of learning at a time when so few complete and modern references exist.
Not worth the money Review Date: 2002-12-31
Accessible introduction to Bayesian Networks Review Date: 2003-01-21
Prerequisites of the book as stated in the preface include Graph Theory and Calculus, both at introductory level. I personally did not have exposure to Graph Theory, but I was able to understand most of the material without any help. The necessary probability theory is developed, but basic probability knowledge is still a prerequisite, since probability forms the core of the material and a reader without prior exposure would struggle to digest it.
The strength of this text is in Part I, where the author provides several examples to illustrate the use of Bayesian Networks, Influence Diagrams and other models. I find the Influence Diagram useful as an extension of Bayesian Networks.
Most answers to Exercises at the end of each chapter are provided at the author's homepage, except answers of the last chapter. Answers that require graphical modeling software are also provided in Hugin format. (Hugin Lite can be downloaded from Hugin site.)
The downsides are that the writing is somewhat awkward, which hinders understanding; that the model building chapter could have been discussed more thoroughly; that material on Learning is barely present; and that definitions are sometimes not introduced upon first encounter but appear later in the chapters. More varied and complex examples could have been discussed to illustrate the material. Note: the author provides a page for Learning at his homepage.
Although this is an introduction to Bayesian Networks and Influence Diagrams, a reader should be equipped with some level of abstract thinking in order to digest the material.
This book is suitable for self-study. It has motivations for the uninitiated. References are provided at the end of the book and I was able to find some of them online. A notable one is "A tutorial on Learning with Bayesian Networks" by Heckerman, which fills in the Learning material missing from this book.
Other books at this level from users' perspective are:
Edwards, Introduction to Graphical Modeling (Utilizes software MIM.)
Clemen, et al., Making Hard Decisions (Uses Palisade Decision Tools suite. The book discusses Influence Diagrams but not Bayesian Networks.)
Further studies after completion of this book include:
Cowell, et al., Probabilistic Networks and Expert Systems
Lauritzen, Graphical Models
Pearl, Probabilistic Reasoning in Intelligent Systems
Pearl, Causality
A lot about very little Review Date: 2003-05-06
Used price: $1.10

Too frustrating!!! Review Date: 2007-09-07
What a great book! It should be back in print! Review Date: 2001-10-24