Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Third
Edition edited by Andreas D. Baxevanis, B. F. Francis Ouellette
(Wiley-Interscience) This fully revised version of a world-renowned bestseller
provides readers with a practical guide covering the full scope of key concepts
in bioinformatics, from databases to predictive and comparative algorithms.
Using relevant biological examples, the book provides background on and
strategies for using many of the most powerful and commonly used computational
approaches for biological discovery. This Third Edition reinforces key concepts
that have stood the test of time while making the reader aware of new and
important developments in this fast-moving field. With a new full-color and
enlarged page design, Bioinformatics, Third Edition offers the most readable,
up-to-date, and thorough introduction to the field for biologists.
This new edition features:
New chapters on genomic databases, predictive methods using RNA sequences, sequence polymorphisms, protein structure prediction, intermolecular interactions, and proteomic approaches for protein identification
Detailed worked examples illustrating the strategic use of the concepts presented in each chapter, along with a collection of expanded, more rigorous problem sets suitable for classroom use
Special topic boxes and appendices highlighting experimental strategies and advanced concepts
Annotated reference lists, comprehensive lists of relevant Web resources, and an extensive glossary of commonly used terms in bioinformatics, genomics, and proteomics
Bioinformatics, Third Edition is essential reading for researchers, instructors, and students of all levels in molecular biology and bioinformatics, as well as for investigators involved in genomics, clinical research, proteomics, and computational biology.
In the foreword to the Second Edition of Bioinformatics, Eric Lander conveyed the sentiment that modern biology had entered a new era with the official publication of the initial sequence and analysis of the human genome. Since that moment, in February 2001, the impact of having the human sequence in hand has been nothing short of tremendous. In the last few years, we have witnessed the completion of human genome sequencing, the completion of numerous model organism genome sequences, the development of new genomic technologies and approaches, and a proliferation of innumerable databases attempting to catalog all of the information that has been learned about genes, proteins, structures, mutations, polymorphisms, and many other biological features of interest. The advent of the genomic era has also laid a strong foundation for the development of new areas of endeavor, such as proteomics and systems biology, fields that are still in their infancy but that have the potential to have an even greater impact on our understanding of basic biological processes and human disease. What has become obvious in these last few years is that, regardless of one's specific area of endeavor, one of the critical keys to doing cutting-edge biological research in this new era lies in the ability to combine laboratory-based and computationally based approaches in a synergistic manner, allowing the investigator to design better experiments (based on database searches and the like) and facilitating the analysis of the larger and larger data sets generated through experimentation. Unfortunately, despite its great power and potential in solving biological problems, the realm of bioinformatics remains terra incognita for many biologists. To address the need for training and education in this area, we have developed a new edition of this book as a resource for our scientific colleagues.
This new edition of Bioinformatics follows in the tradition of the last two editions in keeping up with the quick pace of change in this field. In this edition, tried-and-true concepts and approaches that have stood the test of time are featured, as well as new approaches and algorithms that have emerged since the publication of the First and Second Editions. In considering how to refine the focus and content of the book, we sent a questionnaire to a number of professors currently teaching bioinformatics courses, as well as to people who are actively called upon to lecture on these topics. Based on these responses, published reviews of the Second Edition, and our own experience in the classroom, we have included a number of new features in the Third Edition.
Six chapters have been added on topics that have emerged as being important enough in their own right to warrant distinct and separate discussion: genomic databases, predictive techniques using RNA sequences, sequence polymorphisms, intermolecular (protein-protein) interactions, comparative genomics, and protein identification using proteomic techniques. The chapter on Internet basics has been retired, and the chapter on submitting sequence information to public databases has been folded into the chapter on sequence databases. We have supplemented many of the chapters with text boxes and appendices that highlight basic biological techniques or provide more advanced information that may be of interest to readers or useful to instructors. A more rigorous set of problem sets has been included, and we hope that the reader will work through these examples to reinforce the concepts presented throughout the book. The solutions to these problems are available through the book's Web site, at http://www.wiley.com/bioinformatics. We are also pleased that the current edition contains color figures throughout; this is in recognition of the way in which bioinformatic information is presented nowadays by many Web sites, using color much more than before to communicate basic biological information to the user. We are hopeful that the inclusion of all of these features, in response to the valuable feedback we have
As we move into the 21st century, we stand at a grand inflection point in biology: how we view and practice biology has forever changed. This inflection point has been catalyzed by a number of events, perhaps the most important of which is the human genome project. It provided a genetics parts list and catalyzed the development of high-throughput measurement tools (e.g., high-speed DNA sequencers, DNA arrays, and high-throughput mass spectrometry) and high-throughput measurement strategies (e.g., the yeast two-hybrid technique for measuring protein/protein interactions and the genome-wide localization technique for delineating protein/DNA interactions), as well as stimulating the development of powerful new computational tools for acquiring, storing, and analyzing biological information.
The human genome project also changed how we view and practice biology in several other ways. First, it has catalyzed the view that biology is an informational science. There are two fundamental types of biological information: the digital genome and the environmental signals (intracellular, extracellular, or even from outside the organism) that impinge on the genome to facilitate the development of living organisms. These two types of biological information operate across three different time dimensions with regard to the lifetime of individual organisms: evolution, development, and physiological responses. There are two major types of genomic (digital) information: the genes encoding proteins, which assemble to create the molecular machines and networks of life, and the cis-control elements that, through interactions with their cognate transcription factors, regulate the expression of their associated genes and establish the linkage relationships and architectures of the gene regulatory networks, those grand integrators of environmental signals, which then transduce the input information to the protein modules or protein networks that mediate the developmental and physiological responses of living organisms. Biological information is also hierarchical: as one moves from genomes to ecologies, successively higher levels of biological information are created (DNA, RNA, protein machines, protein and gene regulatory networks, cells, etc.). Since environmental signals change the information at each level, in order to understand systems one must collect and integrate information from as many different hierarchical levels as possible.
Second, biology has become increasingly cross-disciplinary as biologists, chemists, computer scientists, engineers, mathematicians, and physicists work together to develop the high throughput technologies and computational/mathematical tools required for this new biology—all driven by the contemporary needs of biology. Finally, all of these changes have enabled the emergence of systems biology—the idea that we can study the interactions of all the elements in a biological system and from these come to understand its systems or emergent properties. Systems approaches have been practiced for many years, but what is unique about today's systems biology is that it can make global measurements (e.g., all mRNA, all proteins, etc.) and can integrate the global measurements from different levels of biological information.
The world of biology is, accordingly, very different from what it was even ten years ago. There are many different categories of scientists who must be educated about this new world: undergraduates, graduate students, practicing biologists, and our cross-disciplinary colleagues. One of the biggest challenges with regard to this education is to bring an awareness and understanding of the central role that mathematics, computer science, and statistics play in deciphering the complexities of this new world of biology. Indeed, one of the most interesting exercises we at the Institute for Systems Biology have undertaken in the past year is a series of institute-wide discussions concerning ten grand computational and mathematical challenges in biology. I will not discuss these challenges here, but will point out that they represent a very broad list focused on deciphering biological information, ranging from the digital information of the genome, the three- and four-dimensional information of proteins, and the dynamic nature of protein and gene regulatory networks to the metrics and analyses necessary for global data sets, as well as their integration.
TABLE I: Grand Computational and Mathematical Challenges in Biology
This book attempts to bring this world of computer science and mathematics in biology to the entire spectrum of scientists with an interest in biology: advanced undergraduates, graduate students, practicing biologists, and our cross-disciplinary colleagues. This book is particularly important because it represents a readable and concise approach to educating ordinary biologists and equipping them with the fundamental tools necessary for participating in the paradigm changes that have occurred as a consequence of the grand inflection point in biology.