Computational Functional Genomics
(Statistical Models in Computational Biology)

26-BE-790
Instructor: Mario Medvedovic

Motivation: The very nature of functional genomics research has undergone major changes in the last 5 to 6 years. The changes were driven by the development of high-throughput technologies for measuring levels of biomolecules on the genomic scale and by the development of easily accessible databases containing a tremendous amount of accumulated information about genomic sequences, protein sequences, functional annotations, etc. Current functional genomics research depends heavily on computational tools that provide access to these information repositories, as well as on tools for mathematical and statistical modeling of data obtained by modern experimental technologies such as DNA microarrays, 2D gels, mass spectrometry, etc. While an increasing number of computational tools are being developed for analyzing such data, the appropriate use of these tools depends on one’s understanding of the underlying mathematical and statistical models. Furthermore, many of the newly developed mathematical and statistical approaches are not yet supported by a dedicated computational tool, but can be easily implemented in one of several widely available general-purpose software packages such as SAS, S-PLUS, MATLAB, R, etc. Unlike the statistical tools required for analyzing data generated by classical experimental approaches, which assessed only a few entities at a time, the analytical methods used for large-scale functional genomics data need to deal with the additional issues of multiple hypothesis testing, high-dimensional models, assessing the statistical confidence in patterns discovered by data mining techniques, etc. Human intuition, when not aided by formal mathematical analysis, breaks down in such situations. This makes it imperative that future generations of biomedical researchers acquire an understanding of the mathematical and statistical methods underlying the tools they use to analyze their data.

Objectives: The goal of the course is to introduce students to the statistical models and concepts, and the corresponding computational tools, used for the analysis of microarray data. Molecular biology students are expected to learn the principles on which computational tools used in the analysis of functional genomics (e.g. microarray) data are based. They are also expected to gain a basic understanding of how to write simple programs in R that make use of the analytical procedures within the Bioconductor package that are specifically developed for the analysis of microarray data. Students with quantitative backgrounds are expected to learn the basic concepts of molecular genetics and the specifics of applying familiar concepts and tools to modeling functional genomics data.

Syllabus

Previous Editions: 2005

Teaching Assistants:
Johannes Freudenberg: Johannes.Freudenberg@cchmc.org
Junhai Guo: guojs@email.uc.edu

Lectures from the Winter quarter 2005/2006:

R-programs:

Lecture 1/26/2006 (LimmaAnalysis.R)

Links
Nadon and Shoemaker paper (please read)
Benjamini and Hochberg FDR paper
Smyth and Speed: Normalization of cDNA Microarray Data
Limma Description
Smyth: Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments (you will need to "register" to download the paper for free)