Bioconductor is a widely used open source and open development software project for the analysis and comprehension of data arising from high-throughput experimentation in genomics and molecular biology. Bioconductor is rooted in the open source statistical computing environment R. This volume's coverage is broad and ranges across most of the key capabilities of the Bioconductor project, including
importation and preprocessing of high-throughput data from microarray, proteomic, and flow cytometry platforms;
curation and delivery of biological metadata for use in statistical modeling and interpretation;
statistical analysis of high-throughput data, including machine learning and visualization;
modeling and visualization of graphs and networks.
The developers of the software, who are in many cases leading academic researchers, jointly authored chapters. All methods are illustrated with publicly available data, and a major section of the book is devoted to exposition of fully worked case studies.
This book is more than a static collection of descriptive text, figures, and code examples that were run by the authors to produce the text; it is a dynamic document. Code underlying all of the computations that are shown is made available on a companion website, and readers can reproduce every number, figure, and table on their own computers.
Robert Gentleman is Head of the Program in Computational Biology at the Fred Hutchinson Cancer Research Center in Seattle. He is one of the two authors of the original R system and a leading member of the R core team.
Vincent Carey is Associate Professor of Medicine (Biostatistics), Channing Laboratory, Brigham and Women's Hospital, Harvard Medical School. Gentleman and Carey are co-founders of the Bioconductor project.
Wolfgang Huber is Group Leader in the European Molecular Biology Laboratory at the European Bioinformatics Institute in Cambridge. He has made influential contributions to the error modeling of microarray data.
Rafael Irizarry is Associate Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health in Baltimore. He is co-developer of RMA and GCRMA, two of the most popular methodologies for preprocessing high-density oligonucleotide arrays.
Sandrine Dudoit is Assistant Professor in the Department of Biostatistics at the University of California, Berkeley. She has made seminal discoveries in the fields of multiple testing and generalized cross-validation and spearheaded the deployment of these findings in applied genomic science.
Contents
Preprocessing data from genomic experiments.- Preprocessing Overview.- Preprocessing High-density Oligonucleotide Arrays.- Quality Assessment of Affymetrix GeneChip Data.- Preprocessing Two-Color Spotted Arrays.- Cell-Based Assays.- SELDI-TOF Mass Spectrometry Protein Data.
Meta-data: biological annotation and visualization.- Meta-data Resources and Tools in Bioconductor.- Querying On-line Resources.- Interactive Outputs.- Visualizing Data.
Statistical analysis for genomic experiments.- Analysis Overview.- Distance Measures in DNA Microarray Data Analysis.- Cluster Analysis of Genomic Data.- Analysis of Differential Gene Expression Studies.- Multiple Testing Procedures: the multtest Package and Applications to Genomics.- Machine Learning Concepts and Tools for Statistical Genomics.- Ensemble Methods of Computational Inference.- Browser-based Affymetrix Analysis and Annotation.
Graphs and networks.- Introduction and Motivating Examples.- Graphs.- Bioconductor Software for Graphs.- Case Studies Using Graphs on Biological Data.
Case studies.- limma: Linear Models for Microarray Data.- Classification with Gene Expression Data.- From CEL Files to Annotated Lists of Interesting Genes.