

Machine Learning with R - Fourth Edition

Learn how to solve real-world data problems using machine learning and R. Purchase of the print or Kindle book includes a free eBook in PDF format.

Machine learning, at its core, is concerned with transforming data into actionable knowledge. R offers a powerful set of machine learning methods to quickly and easily gain insight from your data. Machine Learning with R, Fourth Edition, provides a hands-on, accessible, and readable guide to applying machine learning to real-world problems. Whether you are an experienced R user or new to the language, Brett Lantz teaches you everything you need to know for data pre-processing, uncovering key insights, making new predictions, and visualizing your findings.

This 10th Anniversary Edition features several new chapters that reflect the progress of machine learning in the last few years and help you build your data science skills and tackle more challenging problems, including making successful machine learning models and advanced data preparation, building better learners, and making use of big data. You'll also find this classic R data science book updated to R 4.0.0 with newer and better libraries, advice on ethical and bias issues in machine learning, and an introduction to deep learning. Whether you're looking to take your first steps with R for machine learning or making sure your skills and knowledge are up to date, this is an unmissable read that will help you find powerful new insights in your data.

This book is designed to help data scientists, actuaries, data analysts, financial analysts, social scientists, business and machine learning students, and any other practitioners who want a clear, accessible guide to machine learning with R. No R experience is required, although prior exposure to statistics and programming is helpful.

762 pages, Paperback

First published July 31, 2015

186 people are currently reading
524 people want to read

About the author

Brett Lantz

4 books, 5 followers

Community Reviews

5 stars: 144 (46%)
4 stars: 110 (35%)
3 stars: 43 (13%)
2 stars: 8 (2%)
1 star: 6 (1%)
Displaying 1 - 28 of 28 reviews
Peter Baumgartner
42 reviews, 6 followers
December 5, 2016
This book has opened a new world for me! I bought it to get some understanding of machine learning. The book delivers everything it promises in the title: the author gives not only a very gentle introduction to key issues in statistics (even explaining simple things like the difference between mean and median) but also a crash course on R, so that you can follow along and experiment with the data on your own.

Especially intriguing for me was that one could follow the data analysis hands-on, with no previous knowledge of R, using real data sets! (I didn't know previously that there are real data sets freely available on the internet (for instance at the ).)

I have to confess that some of the statistical details in the later chapters I didn't understand completely in my first reading. But I didn't expect that with my first dive into the domain of machine learning I will become a professional data scientist. I got some understanding about the main concepts and know now where to go for further practice and to build up my skills for analysing big data.

From an educational point of view the structure of the book is also (almost) perfect: After two introductory chapters (one about general features of machine learning and one about the first steps and general syntax of R) the next seven chapters follow the same outline:
(A) Providing a general understanding of the algorithms with their strengths and weaknesses: explaining the most important formulas and demonstrating their effects with some illustrative sample data. This provides you with a qualitative understanding of the method.
(B) The chapter continues with a practical demonstration in the following order:
Step 1: Collecting data: where to get the data set, references, and an explanation of the structure of the data.
Step 2: Exploring and preparing the data. Every R command to load the data, transform it, etc. is explained and written down as code. The data and even these commands are provided in a .zip archive on GitHub.
Step 3: Training the model on the data.
Step 4: Evaluating the model performance, looking for and discussing the false positives and false negatives, including their effects in the real world.
Step 5: Improving the performance of the model.
(C) And finally a summary with lessons learned from this chapter.
(D) Like the first two chapters, the last three chapters have a different structure: they are dedicated to strategies for evaluating and improving model performance and to some other specialised issues in machine learning.

Above I mentioned the word "almost perfect": The only three things I was missing: (1) Please provide a section with exercises and solutions for the next edition! This would be very important for the transfer from understanding to applicable skills. (2) I would like to see one application in learning analytics with a real data set from the educational domain. (3) And there should be a last chapter "Where to go from here now".

But all in all: One of the best tutorial books I have read!
333 reviews, 24 followers
June 20, 2018
Excellent tutorial for the R novice who wants to apply ML to any kind of project. All the main ML models are presented, as well as different performance metrics, bagging, pruning, tuning, ensembling, ***ing, etc. Nice pop-culture references, easy to scan through, many tips with fully-solved textbook problems. Certainly a very good starting point if you plan to compete on Kaggle. If you have already mastered both R and ML, this book is obviously not for you.
Walter Ullon
318 reviews, 154 followers
September 7, 2017
If you need a proper introduction to Machine Learning for professional reasons or even just for your own edification, do yourself a favor and pick up this gem of a text.

Make sure you are 'language agnostic' before you begin. Let me explain, right now the python libraries are all the rage: Pytorch, Keras, TensorFlow, ScikitLearn, etc... Thus, you might be tempted to believe that in getting yourself acquainted with ML in R you are putting yourself at a disadvantage. You'd be wrong.

Truth is, you should be approaching the subject with the idea of learning from a conceptual and practical standpoint, albeit at a high level. The language you use will make little difference at the beginning. This was my main concern as I needed to learn "python ML" for professional reasons. Make no mistake, this book, along with the available code up on the author's GitHub, will guide you through the language, the hard-to-grasp concepts, and the terminology in a way that is pedagogically so effective that you'll be left wondering how it is that most technical books never reach this level of clarity. You'll be carrying conversations with experienced ML practitioners in no time, without embarrassing yourself (too much).

Take it for what it is though, an introduction. If you need to know every pedantic detail about how neural networks learn, the heavy mathematical proofs behind the algorithms, etc., then you'd be much better served looking elsewhere.

Once you go through this text, you'll be able to jump on the Python bandwagon all while avoiding the risk of having the language's technicalities distract you from the core concepts.

Go for it, happy learning.
223 reviews, 6 followers
March 20, 2018
This is a very hands-on book written in easy-to-understand language. It kept me engaged through all the chapters with code examples on many of the machine learning algorithms. I generally don't like books that spoon-feed all the information; leaving some room for self-exploration motivates the reader to learn more through other media. Though this book provides some excellent walkthroughs for the most common ML algorithms, the last chapters pique the reader's interest by providing leads on improving model performance and on special machine learning topics.

Going through books like Statistics in Plain English and Fundamentals of Machine Learning for Predictive Data definitely helped me prepare for the coding aspects of the book.
Alexander Whyte
11 reviews
December 18, 2018
Good book if you are a complete beginner to machine learning. Helps if you have a little familiarity with R.
2 reviews
April 17, 2023
Not Useful

Book is really hard and difficult to follow. Examples are unrealistically simplified and misleading. Code doesn’t work most of the time.
4 reviews, 8 followers
January 31, 2014
[Full disclosure - I was given a free review copy of the book from the publisher. This review refers to the ebook version]

This is the most recent of a group of books that try to explore machine learning from a programming, rather than purely mathematical, perspective. The book is highly successful in this respect and deserves a place on the bookshelf of any data scientist, Kaggler or statistician.

The book takes a slightly different tack from previous ones in this field (see 'Programming Collective Intelligence' and 'Machine Learning for Hackers') in that it concentrates largely on the packages themselves and how to use them to solve real-world ML problems, rather than focusing on coding up simple algorithms from scratch and running these on toy datasets. Perhaps this way the book doesn't provide as much insight into how the algorithms are designed, but it does make the book much more practically useful, particularly since it spends a good chunk of each chapter explaining the algorithm in simple, plain English.

The book is well laid out and written. Despite a slightly shaky start (do we really still think of ML in terms of Skynet, the Matrix and Hal?), the introduction is excellent and gives a pleasing summary of the philosophical and ethical issues surrounding machine learning and big data. Next, there is a thoughtful introduction to data management and exploratory data analysis that highlights important and often missed tips on things like getting data out of SQL databases. It introduces some basic R functions and concepts (some I had managed to miss up until now) without feeling like a tacked on 'R for beginners' chapter.

In the guts of the book, each chapter focuses on a group of related algorithms (KNN, Naive Bayes, Decision trees, Regression, Neural nets and SVMs, association rules, clustering) and has a good introduction to the algorithm in question, followed by sections on finding and cleaning data, implementing the algorithm on the data and evaluating and improving model performance. There are clear and easy to understand tables and descriptions of the important distinctions between the algorithms and the reasons for choosing one over another. The datasets the author has chosen are large and interesting enough to well illustrate the points being made without being frustratingly unwieldy and many of them are 'classic' machine learning datasets from places such as the UCI Machine Learning Data Repository.

Next, the book looks more deeply at evaluating and improving model performance and discusses important ensemble and meta-learning techniques like bagging, boosting and RandomForests. This section will be of particular interest to people wanting to enter Kaggle or other data science competitions because they show how to milk as much performance as possible from the basic algorithms described earlier in the book.

The final section discusses getting the algorithms to run on big datasets and improving the performance of R itself using tricks like the data.table and ff packages and parallel processing. This is the only section of the book that feels slightly rushed, and many of these topics are discussed only briefly before linking to the relevant package documentation. This is only a small criticism though, since coding up these kinds of systems will depend strongly on the data you have, and these are difficult subjects to cover whilst retaining generality.

Obviously, the book cannot cover everything. It is decidedly light on graphs and has almost nothing on visualisation techniques and packages like ggplot2 which have become almost mandatory for doing data science today. Also, if you are new to R, you really want to get one of the excellent introductory books first and if you are new to ML, you probably want to spend a while learning some basic stats as well. Finally, this book doesn't pretend to be a deep text about the mathematics of the algorithms it covers. For that you will need to go for something like Bishop's classic 'Pattern Recognition and Machine Learning' and be prepared to put in some serious effort!

In short, if you are looking for a practical guide to implementing ML algorithms on real data and if you are more comfortable thinking in R code than in mathematical equations, this is the book for you and is probably the best that I have seen on the subject so far.
Arthur
96 reviews, 5 followers
May 25, 2019
We live in the Machine Learning and Artificial Intelligence Age: deny it or embrace it. Those who truly accept it and its many challenges as early adopters (yes, this age has just begun) will often need to rely on the sources and wisdom of the early pioneers, among whom I count Brett Lantz.
I was a reader of the earlier edition of this book, so upon the release of its 3rd edition I grabbed a copy and found that this edition is even better adapted to the most current challenges, covering most (if not all) of the changes that have happened around big data tools and techniques.
Not to mention that I found this book free of bias and perhaps one of the few that is very frank about the subject yet sets expectations quite realistically.
Who this book targets: in my view, CS graduates or undergrads, BI or data analysts (beginners or not), even software developers. I just happen to observe that most initiatives starting today already assume AI and/or ML built in.
A sidetrack on R (the aptly named programming language) - many debate on the supremacy of one language over another. In my practice, R has the lowest barrier to entry with maximum rewards. Not to mention flexibility and versatility.
To sum up, what I liked about the book:
* Sheer Big Data tooling coverage beyond R as a language, just enough to get one started with Hadoop, Spark, etc.
* Trending ML packages, frameworks (i.e. Tensorflow)
* Enterprise grade databases covered
* The latest R Studio covered
Where in my view it still requires some improvement:
* Maybe it's me - I like assignments so it would not harm to provide some home work
* No cartography (maps)
* No coverage for
Hope you will enjoy your read!
Jeremy
6 reviews, 8 followers
February 17, 2017
It gets you going without drowning you in theory or too much math.
Brian Powell
189 reviews, 34 followers
August 4, 2020
Very easy and friendly introduction to the major families of machine learning algorithms. Each is implemented in R and applied to sample data; you can follow along by downloading the datasets from the textbook website. You'll need to have R and a variety of machine learning-oriented libraries installed. I'm not an R user and so I can't say how well this text covers the basics, but it gives you what you need to load in a dataset, view it from various angles, prepare it for analysis, and then turn the cranks on the machine learning algorithms. The book does not delve deeply into the details of the algorithms: how they work, when they should be applied, what the gotchas are.

The book, therefore, can be used for good or for evil. The good (really great) thing is that it compels you just to dig right in, to start playing with data, running routines, and testing models. This is important: it can be paralyzing if one approaches machine learning too cautiously, worrying about hyperparameters, features, and all the rest up front. The evil thing is that, while this is a great way to get started, one cannot continue to do machine learning this way. There is a tendency, made all the more alluring with the advent of "fire and forget" software like sklearn, to approach data science as a mechanic who applies a broad suite of tools, ignorant of how they work, to fix a vehicle, ignorant of how it works. These data scientists are nearly guaranteed to churn out shitty models. It is therefore imperative to supplement this text with other more thorough sources to get a complete picture of what's going on -- it cannot stand on its own.
1 review, 2 followers
Want to read
February 16, 2017
mnmnmnm
6 reviews
June 3, 2015
A decent quick reference for machine learning in R

This book provides a quick overview of some of the most popular machine learning algorithms and their implementation in R. I found it easy to read and following the examples was fun. However, the edition is not the best, as you'll find plenty of typos.

If you are starting with machine learning (or thinking about it), this could be a useful reference to see how it can be used (there are plenty of practical examples) and get a basic notion of how each algorithm is implemented. If you already know a lot about it, this book is not for you! Even if the price is not so high, the explanations are too simple compared to other resources available online.

Overall, I gave this book 4 stars because of the practical examples, which are a useful way to see how machine learning can be used.

PS: Personally, I think this book would be a good companion to the MIT "Analytics Edge" MOOC. Some of the methods overlap between the two, and it follows the same format of learning by doing.
1 review
July 31, 2015
This was a useful book to begin learning how to practice machine learning techniques in R.

After completing the book, I felt like its audience is someone who wants to jump directly into practicing machine learning (ML), rather than understand the fundamentals of each ML model. Each chapter briefly introduces the relevant machine learning model. These introductions were just detailed enough to understand what each model is doing.

The book does a good job walking the reader through the use of each machine learning package in R. Each chapter has detailed, step-by-step instructions for how to code a machine learning model with real-world datasets (downloadable online).

Overall, the book was a useful resource.

217 reviews, 13 followers
April 13, 2017
This was a really awesome book; the initial chapters in particular nicely combined the introduction of a concept with a practical example showing it in practice in R. The only drawback of this book is also a downside of R: for every specialized topic there are dedicated packages, and some of these packages may or may not be available for your current R installation. This breaks some of the chapters of this book, but it also makes R a bit unstable (in my eyes at least). In this respect I also missed the fact that I could simply apt-get install certain R packages; it would have been nice to have this mentioned somewhere in the book.
Ganes Kesari
36 reviews, 7 followers
December 24, 2016
A very good book on the basics of machine learning that covers the breadth of techniques and does a good job of introducing model evaluation & tuning methods. Each chapter has parallel illustrations of R code, with application of techniques to practical problems solved using R libraries.

The model internals and advanced concepts of the techniques aren't covered, but that seems to be a well-thought-out plan.
Truc-Vien Nguyen
44 reviews, 1 follower
September 11, 2016
This book gives an overview of the functions and commands for implementing some machine learning algorithms in R. It's not recommended for those who want to go deeply into either machine learning or R programming. It is rather for those who already have ML knowledge but want to know how to apply it in R.
10 reviews, 1 follower
August 10, 2020
I enjoyed reading this book, as I practiced every single example and published them on Rpubs.
I went through Datacamp courses, and I learned about R and machine learning in an interactive way, but going through the projects/examples in the book is really exciting. You can build on your project in later stages.
Rajesh
96 reviews, 26 followers
June 11, 2015
Great R language reference and nice, accessible introduction to the incredible functionality built into R and its libraries. Recommended for anyone working on statistics and hypothesis tests, exploratory data analysis, designed experiments, data visualization, etc.
Ivan
34 reviews
December 15, 2020
This is a great introductory book to machine learning but also to R. There are easy-to-follow practical examples in each chapter. It is also very well written. Have a look at the contents if the topics interest you and if the answer is yes, don't hesitate and get it.
José María
49 reviews, 41 followers
December 7, 2013
It is a great introductory text. It explains much of the theory behind multiple machine learning techniques and also includes several exercises to do with R. It can be read in a day.
Katjp
42 reviews
July 16, 2014
A really useful and well-written resource.
Alvaro Tejada Galindo
178 reviews, 5 followers
April 4, 2017
Brilliant! One of the best R books I have ever read...this will blow your Machine Learning mind away...I have learnt so much by reading this book...I totally fell in love with it...
Yoly
669 reviews, 45 followers
September 5, 2015
Great introduction to machine learning in R. It provides a nice balance between theory and practice.
Packtpub books range from terrible to mediocre to great, this is one of the few great ones.
Erondi
1 review
July 6, 2016
This is a very good intro book to Machine Learning with good examples and code in R. Definitely recommended if you already know some basic R programming and want to get into ML concepts.
8 reviews
February 12, 2017
A great book for beginners in Machine Learning. Good choice of topics covered. Excellent insight into the practical aspects of ML. Missing only more in-depth discussions on the underlying maths.
Sidhartha Ray
3 reviews, 1 follower
Read
June 23, 2017
Chapter 3 talks about the kNN algorithm, the simplest classifier. It is not really modeling; rather, it stores the data in a particular format and chooses a number of nearest neighbors so that the labels of the test dataset can be predicted.