Visualizing Data is about visualization tools that provide deep insight into the structure of data. There are graphical tools such as coplots, multiway dot plots, and the equal count algorithm. There are fitting tools such as loess and bisquare that fit equations, nonparametric curves, and nonparametric surfaces to data. But the book is much more than just a compendium of useful tools. It conveys a strategy for data analysis that stresses the use of visualization to thoroughly study the structure of data and to check the validity of statistical models fitted to data. The result of the tools and the strategy is a vast increase in what you can learn from your data. The book demonstrates this by reanalyzing many data sets from the scientific literature, revealing missed effects and inappropriate models fitted to data.
TODO full review:
+ Nearly a decade after his classic book , returns to . This 1993 book is still well worth the time for the starting practitioner. Complements , , and (the best I found among the group of authors focusing on the basics) with the technical (read: mathematical) details, but does not (and cannot) cover the modern software used to create the plots.
+/- Covers various aspects of drawing, including touching ("brushing") an image to add labels for key points, marking ("slicing") specific areas of the plot, zooming in, and changing the aspect ratio ("banking"). The terms proposed by Cleveland have not passed the test of time, and the methods proposed here are still tentative.
++ Plenty of good material on Q-Q plots, box plots, distribution fits and residuals, curve fitting (all sorts of parametric fitting, plus LO[W]ESS), scatterplots, and higher-variate analysis (tri- and multi-, with coplots, level plots, contour plots, scatterplot matrices, and even the dreaded 3d[-to-2d] plots).
+/- Quite a bit of material from 's , but summarized well and explained for the beginner.
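For readers who have not met the Q-Q plots mentioned above, here is a minimal sketch of how the points of a normal Q-Q plot can be computed. It is not taken from the book: the function name `normal_qq_points`, the sample values, and the plotting positions (i - 0.5) / n are assumptions chosen for illustration, and only the Python standard library is used.

```python
from statistics import NormalDist


def normal_qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot.

    Hypothetical helper for illustration; not from the book.
    """
    ordered = sorted(sample)          # observed quantiles are the sorted data
    n = len(ordered)
    standard_normal = NormalDist()    # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n for i = 1..n: one common convention,
    # assumed here; other conventions exist.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    theoretical = [standard_normal.inv_cdf(p) for p in probs]
    return list(zip(theoretical, ordered))


# Made-up sample values, purely illustrative.
sample = [4.1, 5.0, 3.2, 6.3, 4.8, 5.5, 4.4]
points = normal_qq_points(sample)
```

Plotting the observed values against the theoretical ones gives the Q-Q plot; points that fall close to a straight line suggest the sample is roughly normal, while systematic curvature points to skewness or heavy tails.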
A tough yet necessary read for the non-statistician... At a time when anyone can produce colorful graphs in a few clicks, this book tells us how much thought, work, and hard-earned technique must go into plotting data, to reveal rather than distort the trends it conceals.
I bought this book years before I got around to reading it, and I had expected a very different book from the one I got. If I had known this wasn't a theoretical book, I probably would have read it much sooner.
Before writing this book, Cleveland was involved in for fitting a function to data. Loess does appear in this book, multiple times. After writing Visualizing Data, Cleveland as .
Visualizing Data is very much like . Cleveland simply shows how he would analyze various datasets, starting with single-variable datasets and continuing to what he calls hypervariate data (defined as "more than three variables"). Cleveland does a wonderful job presenting his visual techniques, and explains things in great detail. Unfortunately, he doesn't explain much of the math or vocabulary he uses. Ultimately, you can learn visual analysis from this book, but you may need another book, such as Exploratory Data Analysis, in order to follow along.
This is really great. It's not mathematically taxing, and it's certainly not a "definition, theorem, proof" book, but it doesn't intend to be that. It shows how one can visualize probability distributions, including joint probability distributions, and extract information from them graphically. Both numeric and graphical statistical inference are important, but this is the first book I've read with a graphical aspect.