Views: 4377  |  Replies: 100

[Resource] Bioinformatics - BIOINFORMATICS: A CONCEPT-BASED INTRODUCTION

Scientific disciplines evolve and mature into different areas of specialization
to accommodate new knowledge and methods that are being developed by
the research community. The last decade has seen dramatic change in most
fields, and information technology has revolutionized several of them.
Bioinformatics is the perfect marriage between computer science and
advanced biology. Complex biological processes, macromolecular
components and their functional interplay define the basis of living cells.
Biological experiments that aim to reveal the complexity of cellular systems
and biomolecular functions produce huge volumes of data or information
that needs to be efficiently handled for tangible results. The exponential increase
in genome sequences, protein sequences, protein interactions and biological
networks/pathways information has created a demand for efficient
information handling. This led to the birth of the field of Bioinformatics, which
aims to handle biological information using computational methods and
algorithms. Bioinformatics is evolving into a mature field with ever-increasing
participation from the scientific community. The past five years
have seen a rapid increase in the number of scientific journals in this field. It
is impossible to include all the topics of Bioinformatics in a book and still
cater to the needs of newcomers attracted to this field. This is an
introductory book that provides a balance between computational methods
and biological information. Rather than going into depth, for each topic we
provide broad but essential content that will benefit readers with different
levels of expertise.
1 Introduction to Biological Systems
Claude-Henry Volmar, Nikunj Patel, Amita N. Quadros, Daniel Paris, Venkatarajan S. Mathura and Michael Mullan
1. Molecules of Life
2. Nucleic Acids: DNA Versus RNA
3. Understanding Proteins: Sequence–Structure–Function
4. Biological Systems, Signals, and Pathways
5. Technological Advances and Their Benefits to Biology
6. The Role of Bioinformatics in Big Picture
7. Exercises
References
2 Computer Programming Fundamentals and Concepts
Deepak N. Kolippakkam, Pankaj Gupta and Venkatarajan S. Mathura
1. Purpose
2. Learning Objective
3. Perl Programming
3.1 Variables
3.2 Operators
3.3 Control Structures
3.4 Regular Expressions
3.5 File Handling
3.6 Subroutines and Functions
4. PHP Programming
4.1 Language Syntax and Data Types
4.2 Creating Web Interfaces
5. Basic RDBMS and SQL
5.1 Data Definition Language (DDL)
5.2 Data Manipulation Language (DML)
5.3 Data Control Language (DCL)
6. Web-Pointers
3 Introduction to Algorithms
Senthilkumar Radhakrishnan, Deepak Kolippakkam and Venkatarajan S. Mathura
1. Introduction
1.1 Classification
1.2 Hypothesis Testing
1.3 Decision Tree
1.4 Clustering
1.5 Principal Component Analysis
1.6 Multidimensional Scaling
1.7 Regression Analysis
1.8 Linear Discriminant Analysis
1.9 Fuzzy Logic
1.10 Pattern Recognition
1.11 Bayesian Statistics
1.12 Neural Networks
1.13 Hidden Markov Model
1.14 Support Vector Machines
2. Exercises
3. Useful Web-Pointers
References
4 Biological Sequence Databases
Meena Sakharkar, Pandjassarame Kangueane and Venkatarajan S. Mathura
1. Purpose
2. Learning Objective
3. Introduction
3.1 Genomic Sequence Databases – GenBank, EMBL, DDBJ
3.2 Protein Sequence Databases
3.3 Secondary Databases on Molecular Evolution
References
5 Biological Sequence Search and Analysis
Venkatarajan S. Mathura
1. Purpose
2. Learning Objectives
3. Introduction
3.1 Similarity Matrices and Alignment
3.2 Sequence Search and Pair-Wise Alignment
3.3 Global Alignment Using Needleman-Wunsch Algorithm
3.4 Sequence Search Tools
3.5 Pair-Wise and Multiple-Sequence Alignment Tools
3.6 Sequence Motifs
References
6 Protein Structure Prediction
Hongyi Zhou, Yaoqi Zhou and Venkatarajan S. Mathura
1. Introduction
2. Secondary Structure Prediction
3. Comparative Modeling
3.1 Steps Involved in Comparative Modeling
3.2 Homologous Sequence Search Using Sequence Comparison Tools
3.3 Identifying Remote Templates Using Fold-Recognition Methods
3.4 Selection of the Alignment
3.5 Construction of 3D Models Using Modeling Programs
3.6 Protein Modeling Package – MPACK
3.7 SP3 – A Web-Based Structure-Prediction Tool Using Known Protein Structures as Templates
3.8 Modeling Servers
3.9 Critical Assessment of Structure Prediction
3.10 Objective Testing of Modeling Tools in CASP
References
7 Protein-Protein Interaction and Macromolecular Visualization
Arun Ramani, Venkatarajan S. Mathura, Cui Zhanhua and Pandjassarame Kangueane
1. Introduction
2. Experimental Methods
2.1 Yeast Two-Hybrid
2.2 Affinity Tagging
2.3 Computational Methods
2.4 Co-evolution
2.5 Structure Based Methods
3. Protein Structure Visualization
4. Databases
References
8 Genes, Genomics, Microarray Methods and Analysis
Ghania Ait-Ghezala and Venkatarajan S. Mathura
1. Introduction
2. Gene Identification and Characterization
2.1 Identifying Human Genes and Cloning
3. Microarray Experiments
3.1 Microarray Databases
3.2 Gene Annotations, Ontology, and Pathway Databases
References
9 Introduction to Proteomics
Fai Poon and Venkatarajan S. Mathura
1. Introduction
2. Sample Preparation
3. Two-Dimensional (2D) Gel Electrophoresis
3.1 Image Analysis and Statistical Analysis
3.2 In-Gel Digestion and Mass Spectrometry
4. Mass Spectrometry
4.1 Mass Spectrometry in Proteomics
5. Bioinformatics Applications for Identification
6. Conclusion
References
10 Biomedical Literature Mining
Chaolin Zhang and Michael Q. Zhang
1. Introduction
2. Literature Sources for Mining
3. Recognition of Biological Terms
3.1 Gene/Protein Name Recognition
3.2 Removing Gene/Protein Name Ambiguities
3.3 Collecting Other Keywords
4. Mining Biological Relationships
4.1 Detecting Gene Interactions by Co-occurrence
4.2 Inferring Implicit Relationships
4.3 Identifying Sub-networks of Communities
4.4 Evaluating Functional Coherence of Gene Group
5. Acknowledgments
References
11 Computational Immunology: HLA-Peptide Binding Prediction
Pandjassarame Kangueane, Bing Zhao and Meena K. Sakharkar
1. Background
2. HLA Molecules
3. HLA Binding Peptide Based Methods
3.1 Sequence Based Prediction Models
3.2 Molecular Structure Based Predictions
4. Conclusion
References
12 Bioinformatics Application: Eukaryotic Gene Count and Evolution
Meena K. Sakharkar and Pandjassarame Kangueane
1. Introduction
2. Methodology
2.1 Identification of SEG
2.2 Identification of MEG
2.3 Pseudogenes
2.4 Caveats
2.5 Total Genes
3. Results and Discussion
3.1 Utility of SEG and MEG Sequences to the Study of Evolution
3.2 Selection of SEG and MEG in Different Eukaryotic Genomes
3.3 Mechanism of SEG Origin
4. Conclusion
References
13 Bioinformatics Application: Predicting Protein Subcellular Localization by Applying Machine Learning
Pingzhao Hu, Clement Chung, Hui Jiang and Andrew Emili
1. Introduction
2. Methods
2.1 Data Sets and Preprocessing
2.2 Learning Algorithm
2.3 Evaluating Performance of the Learning Algorithm
2.4 Strategy for Multi-class/Multi-label Classification
2.5 Optimal Sampling Methods for Imbalanced Data Sets
2.6 Algorithm of Asymmetric Bagging Strategy
3. Results
4. Discussion
References
14 Bioinformatics Analysis: Gene Fusion
Meena Kishore Sakharkar, Yiting Yu
1. Introduction
2. Identification of Fusion Proteins
2.1 Human Fusion Proteins Mimicking Bacterial Operons
2.2 Human Fusion Proteins Simulating Bacterial Subunit Interfaces
2.3 Fusion Proteins Exhibiting Multiple Functions
2.4 Fusion Proteins Showing Alternative Splicing
3. Remarks on Fusion Proteins
References
Index

yawei1983

Iron Worm (forum newcomer)


★★★★★ Five stars, highly recommended

I can't get the download to work. Could someone kindly send it to me? My email is yawei1983@126.com. Thanks!
Floor 69, 2013-10-22 00:23:48

★★★★★ Five stars, highly recommended

Wow, you have so much good stuff.
Back when I took bioinformatics I only scored 60, and the teacher let me off easy.
I'll catch up on it when I have time.
Floor 2, 2011-08-12 21:57:16

florayo

Silver Worm (regular writer)


★★★★★ Five stars, highly recommended

Here to learn~~
Floor 5, 2011-08-13 12:07:57
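Quoted reply:
Floor 2: Originally posted by 西瓜 at 2011-08-12 21:57:16:
Wow, you have so much good stuff.
Back when I took bioinformatics I only scored 60, and the teacher let me off easy.
I'll catch up on it when I have time.

Sixty? I thought the passing mark was usually seventy.
Floor 6, 2011-08-13 17:44:45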
Brief replies
jinhx87, Floor 3
2011-08-13 07:01
Five-star rating. Bump, thanks for sharing!
lan_jixian, Floor 4
2011-08-13 10:21
Five-star rating. Bump, thanks for sharing!
july975, Floor 10
2011-08-14 22:08
Five-star rating. Bump, thanks for sharing!
okjstor, Floor 12
2011-08-16 11:20
Five-star rating. Bump, thanks for sharing!
Jarvis2010, Floor 13
2011-08-16 11:42
Five-star rating. Bump, thanks for sharing!