
Math level: ♦
Experimental Design for Biologists (Glass DJ)
This book contains zero statistics. It is about approaching research problems with a question-and-answer framework rather than a hypothesis-testing framework. It discusses how to validate an experimental system, various types of experimental controls, and how to build and validate a biological model. If you only read one book on this list, it should be this one.


Math level: ♦
Data Analysis: A Model Comparison Approach (Judd CM, et al.)
This book is recommended for those who are looking for a good introduction to statistics. The book teaches statistics from a unified statistical modelling perspective, rather than the usual cookbook method of "this test goes with this type of data". T-tests, ANOVA, and regression are all examples of linear models, and the book shows how to model (i.e. analyse) data, rather than apply statistical tests to data. The only drawback is that examples are from the social sciences.
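
The claim that a t-test is a linear model can be checked directly. The sketch below is not taken from the book; the measurements and the function names (pooled_t, slope_t) are invented for illustration. It computes the classic pooled-variance t statistic and then the t statistic for the slope of a straight line fitted to the same data with group membership coded as 0 or 1.

```python
import math

# Hypothetical measurements, invented purely for illustration.
control = [4.1, 5.0, 4.6, 5.3]
treated = [5.9, 6.4, 5.2, 6.8]


def pooled_t(a, b):
    """Classic two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var * (1 / na + 1 / nb))


def slope_t(x, y):
    """Least-squares fit of y = b0 + b1*x; returns b1 and its t statistic."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


# Code group membership as a 0/1 dummy variable and fit a straight line.
x = [0] * len(control) + [1] * len(treated)
y = control + treated

t_test = pooled_t(control, treated)
slope, t_model = slope_t(x, y)

print(f"t from the t-test:       {t_test:.4f}")
print(f"t from the linear model: {t_model:.4f}")
print(f"slope (difference in group means): {slope:.4f}")
```

The two t statistics agree, and the fitted slope is simply the difference between the group means.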


Math level: ♦
Visualizing Data (Cleveland WS)
This hugely influential book is about how to understand your data by graphing it. A simple idea, but often not done well. The graphical methods are implemented in the lattice package for R. It also has a companion volume (see next book).


Math level: ♦
The Elements of Graphing Data (Cleveland WS)
A second book by Cleveland (see previous entry), which focuses on basic principles for constructing good graphics. Required reading for anyone who makes graphs to present data. The graphical methods are implemented in the lattice package for R.


Math level: ♦
Introduction to Meta-Analysis (Borenstein M et al.)
What happens when multiple studies are conducted to address a research question, and the results are p=0.12, p=0.032, p=0.002? Are these results conflicting? Is the effect real? Scientists usually evaluate such results with "vote counting" (one against vs. two for... but one p-value is just barely significant... hmm, maybe there's something there). This is not the way to proceed, but unfortunately this is how many scientists struggle to understand the results of multiple experiments. A meta-analysis allows one to numerically combine information across studies to get an overall picture of what is going on. The equations are also simple enough to do by hand.
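
As a rough illustration of combining studies numerically rather than counting votes, the sketch below applies Fisher's method to the three p-values quoted above. This is a swapped-in, simpler technique and not the book's approach: a proper meta-analysis combines effect sizes and their standard errors, which are not given here. It also assumes the studies are independent and that the effects point in the same direction. The function name fisher_combined_p is made up for this example.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    upper tail has a closed form, so no statistics library is needed.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    tail_sum = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail_sum


# The three p-values from the paragraph above.
p_values = [0.12, 0.032, 0.002]
combined = fisher_combined_p(p_values)
print(f"combined p-value: {combined:.5f}")
```

Under these assumptions the combined evidence is stronger than vote counting would suggest, even though one of the three studies was "not significant" on its own.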


Math level: ♦
Model Based Inference in the Life Sciences: A Primer on Evidence (Anderson DR)
This book is a good introduction to information-theoretic approaches. Examples are mostly from ecology, but it provides an alternative theoretical perspective on statistical inference, and the methods are easy to implement. The content is a subset of the next book, with most of the mathematics removed, and thus provides a more accessible book for biologists.
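
To give a flavour of how easy the methods are to implement, here is a minimal sketch of comparing candidate models with Akaike's information criterion (AIC) and Akaike weights. The model names, log-likelihoods, and parameter counts are hypothetical placeholders; in practice they would come from fitting each model to the same data set.

```python
import math

# Hypothetical fitted models: name -> (maximised log-likelihood, parameters).
models = {
    "linear": (-52.3, 3),
    "quadratic": (-49.8, 4),
    "saturating": (-50.1, 3),
}


def aic(log_likelihood, n_params):
    """AIC = 2k - 2 ln(L); lower values indicate better-supported models."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Turn AIC values into relative weights that sum to one."""
    best = min(aic_values.values())
    relative = {name: math.exp(-0.5 * (value - best))
                for name, value in aic_values.items()}
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}


aic_values = {name: aic(ll, k) for name, (ll, k) in models.items()}
weights = akaike_weights(aic_values)

# Print the models from best supported to least supported.
for name in sorted(aic_values, key=aic_values.get):
    print(f"{name:<12} AIC = {aic_values[name]:.1f}  weight = {weights[name]:.3f}")
```

Instead of a single accept/reject decision, each model receives a weight that expresses its support relative to the other candidates.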


Math level: ♦♦♦
Model Selection and Multi-Model Inference: A Practical Information-Theoretic Approach (Burnham KP, Anderson DR)
This book also provides an introduction to information-theoretic methods, but contains proofs and more advanced theoretical topics than the previous book.


Math level: ♦♦
Data Analysis Using Regression and Multilevel/Hierarchical Models (Gelman A, Hill J)
This book has nothing to do with experimental biology, but it is remarkable in how smoothly it transitions from regression models (familiar ground for biologists) to hierarchical models (which should be used more often, given the hierarchical nature of many data sets) to Bayesian methods. This book is also a good introduction to statistical modelling in general, and examples are in R (and BUGS for the Bayesian examples).


Math level: ♦
Doing Bayesian Data Analysis: A Tutorial with R and BUGS (Kruschke J)
Another great introduction to Bayesian methods using R and BUGS, but this time examples are from psychology. Unlike the previous book, this one is purely Bayesian and starts at a more basic level. It has been receiving great reviews from statisticians and scientists alike.


Math level: ♦♦
Cause and Correlation in Biology: A User's Guide to Path Analysis, Structural Equations and Causal Inference (Shipley B)
Laboratory-based biologists are fortunate because most factors are under experimental control and/or can be held constant. However, this is not always the case; for example, the interest is in manipulating X and observing Y, but X also affects Z, which is known to affect Y. Therefore, to what extent does X directly affect Y (if at all) and how much of X's effect is through Z? These types of questions can be addressed with structural equation models, and it is important to know how causality can be inferred from such data. Demonstrating cause-and-effect relationships is a core aspect of biological research, and therefore familiarity with these methods should be a part of every scientist's toolkit.
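
The X, Z, Y question above can be sketched with two ordinary regressions. This is a simplified linear path decomposition on simulated data, not a full structural equation model and not necessarily the procedure the book recommends. The path coefficients used to simulate the data (0.8, 0.5, 0.6) and the helper name cross_product are assumptions made for illustration.

```python
import random

random.seed(1)
n = 200

# Simulated data with assumed paths: X -> Z, Z -> Y, and X -> Y.
x = [random.gauss(0, 1) for _ in range(n)]
z = [0.8 * xi + random.gauss(0, 1) for xi in x]
y = [0.5 * xi + 0.6 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]


def cross_product(u, v):
    """Sum of products of deviations from the means."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))


sxx, szz, sxz = cross_product(x, x), cross_product(z, z), cross_product(x, z)
sxy, szy = cross_product(x, y), cross_product(z, y)

# Regress Z on X: path a (X -> Z).
a = sxz / sxx
# Regress Y on X alone: total effect of X on Y.
total = sxy / sxx
# Regress Y on X and Z together: direct effect of X, and path b (Z -> Y).
det = sxx * szz - sxz ** 2
direct = (szz * sxy - sxz * szy) / det
b = (sxx * szy - sxz * sxy) / det
# Effect of X on Y that travels through Z.
indirect = a * b

print(f"total effect of X on Y: {total:.3f}")
print(f"direct effect (X -> Y): {direct:.3f}")
print(f"indirect effect (X -> Z -> Y): {indirect:.3f}")
```

For least-squares fits the total effect equals the direct effect plus the indirect effect, which is what makes this kind of decomposition possible in the linear case.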


Math level: ♦♦♦
An Introduction to Optimal Designs for Social and Biomedical Research (Berger MPF, Wong WK)
Optimal design theory deals with getting the most information out of an experiment. Suppose you want to test the effect of a compound and are restricted to 20 animals. How do you decide on the number of groups, the actual dose levels, and the sample size in each group? Your choice will have profound implications for statistical power. In general, optimal designs allow you to do more with less, and to determine where further effort will be wasted.
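
The simplest version of the 20-animal question can be worked out by brute force. The sketch below covers only the most basic case, which is an assumption of this example and far narrower than the book's scope: two groups, equal variance in both, and interest in the difference between group means. The variance of that difference is proportional to 1/n1 + 1/n2, so the code just tries every split. The name variance_factor is made up for illustration.

```python
total_animals = 20


def variance_factor(n1, n2):
    """Variance of the difference in means, in units of the common variance."""
    return 1 / n1 + 1 / n2


# Every way of splitting the animals into two non-empty groups.
allocations = [(n1, total_animals - n1) for n1 in range(1, total_animals)]
best = min(allocations, key=lambda pair: variance_factor(*pair))

print(f"best split: {best}")
for n1, n2 in [(10, 10), (15, 5), (18, 2)]:
    print(f"{n1:>2} vs {n2:>2}: variance factor = {variance_factor(n1, n2):.3f}")
```

Under these assumptions the balanced split gives the smallest variance, and lopsided designs waste animals. Designs with several dose levels or unequal variances need the more general theory the book describes.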
