An Introduction to R: Biostatistics-I
2007  Brian C. McCarthy

Quantitative Methods in Plant Biology (PBIO 415-515)
Department of Environmental and Plant Biology
Ohio University




R is a free software environment designed explicitly for statistics and graphics. It consists of a language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files. It is rapidly becoming the industry standard among statisticians.
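
As a minimal sketch of what the language and its graphics look like in practice, the lines below could be typed at the R prompt or saved in a script file and run from there. The vector name leaf_length and its values are hypothetical, invented only for illustration; mean(), median(), var(), and hist() are standard base R functions.

```r
# Hypothetical leaf lengths (cm); these values are invented for illustration.
leaf_length <- c(4.2, 5.1, 3.8, 6.0, 4.9)

# Basic descriptive statistics using base R functions.
leaf_mean   <- mean(leaf_length)
leaf_median <- median(leaf_length)
leaf_var    <- var(leaf_length)   # sample variance (n - 1 denominator)

# Print the three summaries as a named vector.
print(c(mean = leaf_mean, median = leaf_median, variance = leaf_var))

# Draw a histogram with the built-in graphics system.
hist(leaf_length, main = "Hypothetical leaf lengths", xlab = "Length (cm)")
```

Saving these lines in a file (for example, a hypothetical first_session.R) and calling source("first_session.R") would run them as a script rather than interactively.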

R is an open-source program, meaning anyone can contribute. It runs on many platforms, including Windows, Mac OS, and Unix. There are many advantages to using it, not the least of which are that it is free, that you will have it available to you in any future work environment, and that you will simply learn statistics better because it is not a point-and-click software application.

The R-FAQ is probably the best place to start reading about what R is, how you can obtain a copy, and what it can do for you. To download a copy, go to a nearby CRAN mirror site, select your operating system, and follow the on-screen instructions.



One of the nice features of R is the large volume of easily accessible resources and documentation. Free manuals are available at the R website, along with a list of suggested third-party reference books. Two good introductory texts that I especially like are Dalgaard (2002) and Crawley (2005). Graphics are explained well by Murrell (2006) and linear models by Faraway (2005). In an effort not to reinvent the wheel, I have borrowed liberally from some of these resources, and I wish to explicitly acknowledge their intellectual contribution. There are also many specialized applications, especially in bioinformatics, where additional information can be found.

R can be intimidating at first glance, and it does require some commitment to learn. The rewards are great, however: you will be exposed to a large suite of sophisticated statistical tools for problem solving in ecology, systematics, and evolutionary biology, as well as genetics, molecular biology, and bioinformatics.



For the purposes of this course, which is intended as an introduction to biostatistics, I have designed a set of tutorials to help you through the basics of R. Each tutorial provides most of the background necessary for the corresponding problem set.

Tutorial-1: An Introduction to R, Central Tendency, and Graphics (PDF)

Tutorial-2: Session Management, Data Entry, and Variance (PDF)

Tutorial-3: Summary Statistics and Graphical Display of Data (PDF)

Tutorial-4: Probability Distributions; One & Two Sample Analysis (PDF)

Tutorial-5: Correlation and Regression (PDF)

Tutorial-6: Analysis of Variance (ANOVA) (PDF)