Protein–protein interaction prediction: Difference between revisions

{{prod|[[WP:NOR]]}}
{{wikify}}
 
'''The prediction of an interaction or binding between proteins.'''
 
 
The acronym '''PPIP''' stands for '''P'''rotein '''P'''rotein '''I'''nteraction '''P'''rediction.
 
==Protein Interaction Prediction Literature Review, 2005==
(Computational multi-genome assays are of primary interest.)
 
 
===Why PPIP===
To understand how organisms function, we need to shed more light on their innermost workings, that is, on their chemical interactions. The most complex of these are protein interactions. For any given genome we want a comprehensive list of protein interactions, numbering in the millions, together with their functions. The complete protein interaction network of an organism is called an interactome. Directly sensing what happens at this level is intuitively the first approach to consider, but it is not currently feasible.
 
 
===Overview of methodologies===
The technologies for direct sensing reside in the realm of physics and all rely on the four fundamental forces (strong, electromagnetic, weak, and gravitational). Currently, the best technologies lack sufficient resolution and are too disruptive, too slow to track or resolve the Brownian movement of proteins, or too narrow in their field of view for genome-wide PPIP. The highly advanced Stanford University optical trap microscope is only just capable of protein observation [141]. Even if direct observation and identification were possible, the sheer complexity of biological systems would still present a challenge to our understanding.
Because direct sensing of protein interaction is currently ruled out, various genome-wide methods of PPIP are being developed. Strictly biological methods developed so far include yeast two-hybrid (Y2H) [60, 92], correlated mRNA expression profiles, genetic interaction data, and protein complex purification with mass spectrometry.
Biological experiments are prohibitively expensive and have large errors induced by their inherent limitations. For example, it is estimated that a costly 10,000 pull-down assays would be needed to discover 90% of the human interactome. Computational methods can ease this burden by reducing the number of potential interactions that must be tested biologically by many orders of magnitude. If they ever become sufficiently accurate, computational methods could be used on their own, without biological verification, to predict protein interactions. To date, computational prediction accuracy is in the 80% range at best. Computer algorithms rely on accurate genome sequencing as well as knowledge of physics, chemistry, and biology.
Genome sequencing is becoming faster and more accurate with microfabricated high-density picolitre reactors [90]. Hypothetically, it is possible to predict any biological form or function from the DNA sequence; to date, however, experimentation has had only limited success.
Simulation predictions also rely on good programming. To perform PPIP accurately and quickly, speed optimization is imperative, as it is all too easy to write an algorithm that needs more time than the universe offers. It is of no relevance how accurate a prediction is if the result is never produced. Similarly, it is of no relevance how speedy an algorithm is if it produces inaccurate results.
The accuracy of a PPIP algorithm is measured by:
 
Accuracy = (TP+TN)/(TP+FP+TN+FN),
 
Precision (called specificity in some of the prediction literature) = TP/(TP+FP),
 
Sensitivity = TP/(TP+FN),
 
where TP = true positive (prediction), FP = false positive, TN = true negative, and FN = false negative.
 
This standard scoring method will prove useful when comparing PPIP algorithms.
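
As a minimal sketch of the scoring above, the three measures can be computed directly from the four confusion counts. The function name <code>confusion_scores</code> and the example counts are illustrative assumptions, not values from any cited PPIP study.

```python
def confusion_scores(tp, fp, tn, fn):
    """Return accuracy, precision and sensitivity from confusion counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one prediction is required")

    def ratio(numerator, denominator):
        # An undefined ratio (empty denominator) is reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": (tp + tn) / total,
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
    }


# Hypothetical counts for one PPIP algorithm evaluated on 1000 protein pairs.
scores = confusion_scores(tp=80, fp=20, tn=890, fn=10)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With heavily imbalanced data, where non-interacting pairs vastly outnumber interacting ones, accuracy alone can look high even for a poor predictor, so precision and sensitivity are worth reporting alongside it.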
Computer simulation for the prediction of protein function, which is our primary goal, proceeds in a series of steps. First the genome is sequenced; then open reading frames (ORFs) are found; then pre-processing of mRNA is simulated; then PPIP produces protein-protein interaction maps, from which unknown protein functions can be determined using known functional information. As the interaction maps become more accurate, more proteins will have more of their functions determined.
As algorithms, equations, and template solutions have many applications, it is natural to look across disciplines for solutions to similar problems. The successful algorithms in proteomics use methods that were previously in use in other fields. In spite of Occam's razor, computational algorithms rely on complexity to improve speed and accuracy.
 
===Method Analysis===
The '''Vector Learning Method''' is an alternative to the Graph Learning Method and is currently competing with it for the title of most efficient method. Both machine learning methods probably have equal potential. A training set is mapped to an n-dimensional space in which successful combinations of residues or amino acids are represented. Each piece of the pattern, or residue attribute, is mapped to a separate dimension ("vectorization"). Unlike ordinary two-dimensional (latitude and longitude) city maps, protein pattern maps are most effective when using more than 20 dimensions. If a potential protein pair lies within the region identified as successful, an interaction is predicted. (Similarly, if an address is mapped to a residential zone, it is likely to be a residence.) Support vector machines (SVMs), clustering, and other spatial approaches are often used as successful implementations of n-space mapping. The vector learning method leans towards a parallel approach but is often implemented on a regular CPU: this hints at Ω 1 PPIP n-space. Most people do not find it intuitive to think in more than three dimensions, so it becomes difficult to grasp why a particular interaction is probable. Advantageously, the complex pattern rules are "learned" in an inclusive and possibly exhaustive way. SVMs are interesting because, when faced with an additional problem or complication, they simply add another dimension to handle it; however, the number of dimensions is directly related to computational cost. The vector learning tools available are widely popular and are applied to docking, folding, and many other pattern recognition problems. Related articles include [6, 7, 9, 11, 20, 26, 28, 36, 37, 44, 50, 58, 59, 64, 79, 81, 82, 83, 98, 100, 106, 107, 114, 126, 132, 138, 139, 145, 146, 147, 148, and 152].
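
The idea of predicting an interaction when a protein pair falls in the "successful" region of feature space can be illustrated with a toy example. The sketch below is an assumption-laden stand-in: it uses a simple perceptron (a linear classifier) rather than a real SVM, only three dimensions instead of the 20 or more mentioned above, and invented feature values. The names <code>train_perceptron</code> and <code>predicts_interaction</code> are hypothetical.

```python
def train_perceptron(samples, labels, epochs=20, rate=1.0):
    """Learn a separating hyperplane; labels are +1 (interacts) or -1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            if label * activation <= 0:  # misclassified: move the hyperplane
                weights = [w + rate * label * x
                           for w, x in zip(weights, features)]
                bias += rate * label
    return weights, bias


def predicts_interaction(weights, bias, features):
    """True if the pair lies on the 'interacting' side of the hyperplane."""
    return sum(w * x for w, x in zip(weights, features)) + bias > 0


# Hypothetical training set: one feature vector per protein pair.
training_vectors = [
    [1.0, 0.9, 0.8],  # known interacting pair
    [0.9, 1.0, 0.7],  # known interacting pair
    [0.1, 0.2, 0.1],  # known non-interacting pair
    [0.2, 0.1, 0.3],  # known non-interacting pair
]
training_labels = [1, 1, -1, -1]

weights, bias = train_perceptron(training_vectors, training_labels)
print(predicts_interaction(weights, bias, [0.8, 0.8, 0.9]))
print(predicts_interaction(weights, bias, [0.2, 0.2, 0.2]))
```

Each extra attribute adds one entry to every feature vector, which is why the cost of training and prediction grows with the number of dimensions.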
Because a large amount of work has been done on interactomes, the '''Evolutionary Method''' is becoming a practical speedup method. It uses data from PPIPs or experimentally verified interaction maps to infer protein interactions in evolutionarily related organisms. This is a fast way to create new interactomes, and its accuracy is relatively high because many organisms are closely related. Unfortunately, it relies on good databases and knowledge of orthologs, neither of which is widely available at present. Related articles include [23, 25, 26, 37, 45, 49, 50, 52, 94, 111, 118, 119, 136, 139, 147, and 152].
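
The transfer step of the evolutionary approach can be sketched as follows, assuming an ortholog table is already available. All protein identifiers and the function name <code>transfer_interactions</code> are hypothetical; real implementations would also weigh the confidence of each ortholog assignment.

```python
def transfer_interactions(known_interactions, orthologs):
    """Map interactions from a source organism onto a target organism.

    known_interactions: iterable of (protein_a, protein_b) in the source.
    orthologs: dict mapping a source protein to its target ortholog.
    An interaction is transferred only if both partners have an ortholog.
    """
    predicted = set()
    for protein_a, protein_b in known_interactions:
        if protein_a in orthologs and protein_b in orthologs:
            pair = frozenset((orthologs[protein_a], orthologs[protein_b]))
            predicted.add(pair)  # unordered pair, duplicates collapse
    return predicted


# Hypothetical source interactome and ortholog table.
source_interactions = [("srcA", "srcB"), ("srcB", "srcC"), ("srcC", "srcD")]
ortholog_table = {"srcA": "tgtA", "srcB": "tgtB", "srcC": "tgtC"}

predicted_pairs = transfer_interactions(source_interactions, ortholog_table)
for pair in sorted(sorted(p) for p in predicted_pairs):
    print(" - ".join(pair))
```

The interaction involving <code>srcD</code> is dropped because no ortholog is known for it, which shows how gaps in the ortholog data limit coverage.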
 
 
===Result Presentation===
Displaying results can be problematic [117] because of the volume of data generated (note the resemblance to a hairball); the data should therefore be organised hierarchically, as an interaction "tree". The two best approaches to date are, first, to display only one or two interaction links deep of the hierarchy at a time [108] and, second, to make the highly interactive (hub) proteins the roots of the interaction trees [108]. This creates better groupings of functionally and spatially related proteins, making for a more easily interpreted interactome.
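
Both display ideas can be combined in a short sketch: choose the most connected protein as the root, then list only the proteins within two links of it. The edge list and the helper names <code>build_adjacency</code> and <code>neighbourhood</code> are illustrative assumptions, and the breadth-first search is just one straightforward way to limit the depth.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def neighbourhood(adjacency, root, max_depth):
    """Return {protein: depth} for proteins within max_depth links of root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the display limit
        for neighbour in sorted(adjacency[node]):
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return depth


# Hypothetical interaction map.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5"), ("P5", "P6")]
adjacency = build_adjacency(edges)

# The hub is the protein with the most interaction partners.
hub = max(sorted(adjacency), key=lambda protein: len(adjacency[protein]))
view = neighbourhood(adjacency, hub, max_depth=2)
for protein, level in sorted(view.items(), key=lambda item: (item[1], item[0])):
    print("  " * level + protein)
```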
The main goal of proteomics is to predict the structures, interactions and functions of the proteins [29]. Specific function is only found through interactions. Because structures are primarily used to help find interactions, the prediction of protein-protein interactions is of vital interest in proteomics.
 
 
===Future possibilities===
The methods also suffer because they are incapable of continuously updating themselves: they only "learn" before they generate predictions, when a training set (a list of interactions) is pushed through. In addition, many methods use only one specific approach, although it has been demonstrated [94] that combining two or more approaches increases true positives and reduces false positives.
A method that addresses the weaknesses described above could use a machine learning algorithm for initial selection and a slower algorithm, consisting of both folding and docking methods, to verify random and boundary cases. By staying up to date with the literature, verifications could also be drawn from known evolutionary interactome relationships. Although protein localization could be accounted for in machine learning PPIP training, it might be possible to improve PPIP by considering localization separately; current localization techniques are 83.6% accurate [44]. To maximize the accuracy of protein function prediction, more computation is needed on less data, and less computation is needed on more data.
These refined results would continuously [114] teach the machine learning algorithm to modify the pattern, resulting in a more accurate PPIP. Just as people learn from their mistakes, we can program an algorithm to learn from its mistakes or even update itself.
It is difficult to include third-body interactions in quick (SVM- or graph-based) methods; the slower verification methods (protein folding and docking) should therefore try to compensate by considering third-body interactions. The "quick" methods should be inclusive, so that the "slow" methods have a reduced data set to work with. To aid interaction prediction, it would be wise for interactomes to identify the active site used for each interaction. This can help with the prediction of third-body interactions.
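
One way to picture the proposed combination is a two-stage filter: a quick scorer screens every candidate pair, and only boundary cases are passed to a slow verifier, whose verdicts are kept as feedback for retraining. The sketch below is purely illustrative. The thresholds, the stand-in scoring and verification functions, and the name <code>two_stage_predict</code> are all assumptions, and the random spot checks mentioned above are omitted for brevity.

```python
def two_stage_predict(pairs, fast_score, slow_verify, low=0.3, high=0.7):
    """Screen pairs quickly, then verify only the uncertain ones slowly.

    Returns (accepted_pairs, feedback), where feedback is a list of
    (pair, verdict) tuples that could be used to retrain the fast stage.
    """
    accepted = []
    feedback = []
    for pair in pairs:
        score = fast_score(pair)
        if score < low:
            continue  # confidently rejected by the quick method
        if score > high:
            accepted.append(pair)  # confidently accepted
            continue
        verdict = slow_verify(pair)  # boundary case: run the slow method
        feedback.append((pair, verdict))
        if verdict:
            accepted.append(pair)
    return accepted, feedback


# Hypothetical stand-ins for a machine learning scorer and a docking check.
quick_scores = {("A", "B"): 0.9, ("A", "C"): 0.5, ("B", "C"): 0.1, ("C", "D"): 0.6}
docking_results = {("A", "C"): True, ("C", "D"): False}

accepted, feedback = two_stage_predict(
    pairs=list(quick_scores),
    fast_score=quick_scores.get,
    slow_verify=docking_results.get,
)
print(accepted)
print(feedback)
```

In this toy run only two of the four pairs reach the slow stage, which is the intended saving: the expensive method works on a reduced data set.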
The combination of methods described above addresses the four problems inherent in using single methods for PPIP. This approach might push the accuracy over the as yet unsurpassed 90% mark. It is theoretically possible to design a vector learning method with 100% accuracy, but even at current accuracy rates the computational methods provide significant insight for speeding up the biological methods of PPIP. Because of conserved proteins and domains, it will become progressively easier to make protein interaction maps of each genome. The advantage of this approach is that interaction maps can be produced quickly and then improved more slowly with a smaller dataset, in contrast to most implementations, which are a one-shot affair. However, it would be desirable to analyze whole libraries of genomes on an ongoing basis as they become available, despite the apparent difficulty of performing on the order of 10<sup>12</sup> interaction tests. The algorithms would need to be run periodically, but if they are to be used as a PPIP server this is to be expected anyway. Some implementations, such as [http://mysql5.mbi.ucla.edu/cgi-bin/functionator/pronav Prolinks], use data from known interactions as well as multiple prediction methods.
 
 
 
 
 
 
===Online PPIP services===
*[http://advice.i2r.a-star.edu.sg/search/pair.php advice]
*[http://www-appn.comp.nus.edu.sg/~bioinfo/bayesprot/bayesprot.htm Bayesian Protein Prediction]
*[http://interdom.lit.org.sg/validate/index_inter.php InterDom]
*[http://www.russell.embl-heidelberg.de/people/patrick/interprets/interprets.html InterPreTS]
*[http://interweaver.i2r.a-star.edu.sg/report/demo.php InterWeaver]
*[http://cbi.labri.fr/outils/ippred/ Ippred]
*[http://ophid.utoronto.ca/ophid/ppi.html OPHID]
*[http://gordion.hpc.eng.ku.edu.tr/prism/predictions.php#online PRISM]
*[http://mysql5.mbi.ucla.edu/cgi-bin/functionator/pronav Prolinks]
*[http://www.protsuggest.org/main.html Protsuggest]
*[http://point.bioinformatics.tw/ POINT]
*[http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi SVMProt]
*[http://string.embl.de/ String]
 
 
==Bibliography==
*[1] Albert, István. & Albert, Réka. (2004). Conserved Network Motifs Allow Protein-Protein Interaction. Bioinformatics., 20 (18), 3346-3352.
*[2] Aloy, Patrick & Russell, Robert B. (2003). InterPreTs Protein Interaction Prediction through Tertiary Structure. Bioinformatics., 19(1), 161-162.
*[3] Imberty, Anne., Piller, Veronique., Piller, Friedrich & Breton, Christelle. (1997). Fold recognition and molecular modeling of a lectin-like domain in UDP–GalNAc:polypeptide N-acetylgalactosaminyltransferases. Protein Engineering., 10(12), 1353–1356.
*[4] Ansari, Sam & Helms, Volkhard. (2005). Statistical Analysis of Predominantly Transient Protein-Protein Interfaces. Proteins: Structure, Function and Bioinformatics., 64, 344-355.
*[5] Boer, D. Roeland., Kroon, Jan., Cole, Jason C., Smith, Barry & Verdonk, Marcel L. (2001). Superstar comparison of CSD and PDB-based interaction fields as a basis for the prediction of protein-ligand interactions. J. Mol. Biol., 312, 275-287.
*[6] Bordner, Andrew J. & Abagyan, Ruben. (2005). Statistical analysis and Prediction of Protein-Protein Interfaces. Proteins: Structure, Function and Bioinformatics., 60, 353-366.
*[7] Borgwardt, Karsten M., Ong, Cheng Soon., Schönauer, Stefan., Vishwanathan, S. V. N., Smola, Alex J. & Kriegel, Hans-Peter. (2005). Protein function prediction via graph kernels. Bioinformatics., 21(1), i47-i56.
*[8] Bowie, James U. (2005). Solving the Membrane Protein- Folding Problem. Nature. 438, 581-589.
*[9] Bradford, James R. & Westhead, David R. (2005). Improved Prediction of Protein-Protein Binding Sites Using a Support Vector machines Approach. Bioinformatics., 21(8), 1487-1494.
*[10] Brun, Christine., Chevenet, François., Martin, David., Wojcik, Jérôme., Guénoche, Alain & Jacq, Bernard. (2003). Functional Classification of Proteins for the Prediction of Cellular Function from a Protein-Protein Interaction Network. Genome Biology., 5(1)
*[11] Bunescu, Razvan., Ge, Ruifang., Kate, Rohit J., Marcotte, Edward M., Mooney, Raymond J., Ramani, Arun K. & Wong, Yuk Wah. (2005). Comparative Experiments on Learning Information Extractors for Proteins and Their Interactions. Artificial Intelligence in Medicine., 33, 139-155.
*[12] Burguete, Alondra Schweizer., Harbury, Pehr B. & Pfeffer, Suzanne R. (2004). In Vitro Selection and Prediction of TIP47 Protein- Interaction Interfaces. Nature Methods., 1(1), 1-6.
*[13] Cai, C.Z., Han, L.Y., Ji, Z.L., Chen, X. & Chen, Y.Z. (2003). SVM-Prot: web-based support vector machine software for functional classification of a protein from its primary sequence. Nucleic Acids Research., 31(13), 3692-3697.
*[14] Cai, C.Z., Wang, W.L.,Sun, L.Z & Chen, Y.Z. (2003). Protein Function Classification via Support Vector machine approach. Mathematical Biosciences., 185, 111-122.
*[15] Cai, Yu-Dong & Lin, Shuo Liang. (2003). Support Vector Machines for Predicting rRNA-, RNA-, and DNA-binding proteins from amino acid sequence. Biochimica et Biophysica Acta., 1648, 127-133.
*[16] Cai, Yu-Dong., Lin, Shuo-Liang & Chou, Kuo-Chen. (2003). Support Vectors Machines for Prediction of Protein Signal Sequences and their Cleavage Sites. Peptides., 24, 159-161.
*[17] Cai, Yu-Dong., Liu, Xiao-Jun., Xu, Xue-biao & Chou, Kuo-Chen. (2000). Support Vector machines For Prediction of Protein Subcellular Location. Molecular Cell Biology Research Communications., 4, 230-233.
*[18] Cai, Yu-Dong., Liu, Xiao-Jun., Xu, Xue-Biao & Chou, Kuo-Chen. (2003). Support Vector Machines for Prediction of Protein Domain Structural Class. J. theory. Biol., 221, 115-120.
*[19] Cai, Yu-Dong., Liu, Xiao-Jun., Xu, Xue-biao., Chou, Kuo-Chen. Prediction of Protein Structural Classes by support vector machines. Computers and Chemistry., 26, 293-296.
*[20] Capriotti, Emidio., Fariselli, Piero., Calabrese, Remo & Casadio, Rita. (2005). Predicting protein stability changes from sequences using support vector machines. Bioinformatics. 21(2), ii54-ii58.
*[21] Carugo, Oliviero & Franzot, Giacomo. (2004). Prediction of Protein-Protein Interactions Based on Surface Patch Comparison. Proteomics., 4, 1727-1736.
*[22] Chattopadhyaya, Rajagopal. & Ghose, Asoke Chandra. (2002). Model of Vibrio cholerae toxin coregulated pilin capable of filament formation. Protein Engineering., 15(4), 297-304.
*[23] Chelliah, Vijayalakshmi., Blundell, Tom & Mizuguchi, Kenji. (2005). Functional Restraints on the Patterns of Amino Acid Substitutions Application to Sequence–Structure Homology Recognition. Proteins: Structure, Function and Bioinformatics., 61, 722-731.
*[24] Chen, Pai-Hsuen., Lin, Chih-Jen & Scholkopf, Bernhard. (2003). A Tutorial on v-Support Vector Machines. Department of Computer Science and Information Engineering, National Taiwan University. 1-29.
*[25] Chen, Xue-wen & Liu, Mei. (2005). Prediction of Protein-Protein Interactions Using Random Decision Forest Framework. Bioinformatics., 1-4.
*[26] Chen, Yu-ching & Hwang, Jenn-Kang. (2005). Prediction of Disulfide Connectivity From Protein Sequences. Proteins: Structure, Function and Bioinformatics., 61, 507-512.
*[27] Chen, Yu-Ching., Lin, Yeong-Shin., Lin, Chih-Jen & Hwang, Jenn-Kang. (2004). Prediction of the Bonding States of Cysteines Using the Support Vector machines Based on Multiple Feature Vectors and Cysteine State Sequences. Proteins: Structure, Function and Bioinformatics., 55, 1036-1042.
*[28] Cheng, Betty Yee Man., Carbonell, Jaime G. & Klein-Seetharaman, Judith. (2005). Protein Classification Based on Text Document Classification Techniques. Proteins: Structure, Function and Bioinformatics., 58, 955-970.
*[29] Chinnasamy, Arunkumar., Mittal, Ankush & Sung, Wing-Kin. (2005). Probabilistic prediction of protein–protein interactions from the protein sequences. Computers in Biology and Medicine., 1-12.
*[30] Chmiel, Agnieszka., Radlinska, Monika., Pawlak, Sebastion D., Krowarsch, Daniel., Bujnicki, Janusz M. & Skowronek, Krzysztof J. (2005). A Theoretical Model of Restriction Endonuclease NiaIV in Complex with DNA, Predicted by Fold Recognition and Validated by Site-Directed Mutagenesis and Circular Dichroism Spectroscopy. Protein Engineering, Design & Selection., 18(4), 181-189.
*[31] Choulier, Laurence., Andersson, Karl., Hämäläinen, Markku D., Van Regenmortel, Marc H.V., Malmqvist, Magnus & Altschuh, Danièle. (2002). QSAR studies applied to the prediction of antigen–antibody interaction kinetics as measured by BIACORE. Protein Engineering., 15(5), 373-382.
*[32] Tsai, Chung-Jung & Nussinov, Ruth. (2001). The building block folding model and the kinetics of protein folding. Protein Engineering., 14(10), 723–733.
*[33] Craig, P.O., Berguer, P.M., Ainciart, N., Zylberman, V., Thomas, M.G., Martinez Tosar, L.J., Bulloj, A., Boccaccio, G.L. & Goldbaum, F.A. (2005). Multiple Display of Protein Domain on a Bacterial Polymeric Scaffold. Proteins: Structure, Function and Bioinformatics., 61, 1089-1100.
*[34] Deprez, Paola & Inestrosa, Nibaldo. (2000). Molecular Modeling of the Collagen-like Tail of Asymmetric Acetylcholinesterase. Protein Engineering.,13(1), 27-34.
*[35] Ding, Chris H.Q. & Dubchak, Inna. (2001). Multi-class Protein Fold Recognition using Support Vector Machines and Neural Networks. Bioinformatics., 17(4), 349-358.
*[36] Dobson, Paul D. & Doig, Andrew J. (2005). Predicting Enzyme Class from Protein Structure Without Alignments. J. Mol. Biol., 345, 187-199.
*[37] Dubey, Anshul., Realff, Matthew J., Lee, Jay H. & Bommarius, Andreas S. (2005). Support vector machines for learning to identify the critical positions of a protein. Journal of Theoretical Biology., 234, 351-361.
*[38] Eisenhaber, Birgit., Eisenhaber, Frank., Maurer-Stroh, Sebastian & Neuberger, Georg. (2004). Prediction of sequence signals for lipid post-translational modifications: Insights from case studies. Proteomics., 4, 1614-1625.
*[39] English, Andrew C., Groom, Colin R. & Hubbard, Roderick E. (2001). Experimental and Computational Mapping of the Binding Surface of a Crystalline Protein. Protein Engineering., 14(1), 47-59.
*[40] Fariselli, Piero., Pazos, Florencio., Valencia, Alfonso & Casadio, Rita. (2002). Prediction of Protein-Protein Interaction Sites in Heterocomplexes with Neural Networks. Eur. J. Biochem., 269, 1356-1361.
*[41] Fernández- Recio, Juan., Totrov, Maxim & Abagyan, Ruben. (2004). Identification of Protein-Protein Interaction Sites from Docking Energy Landscapes. Journal of Molecular Biology., 335, 843- 865.
*[42] Fernández-Recio, Juan., Totrov, Max., Skorodumov, Constantin & Abagyan, Ruben. (2005). Optimal Docking Area: A New Method For Predicting Protein-Protein Interaction Sites. Proteins: Structure, Function and Bioinformatics., 58, 134-143.
*[43] Franzot, Giacomo & Carugo, Oliviero.(2004). Computational Approaches to Protein-Protein Interaction. Journal of Structural and Functional Genomics., 4, 245-255.
*[44] Gao, Qing-Bin & Wang, Zheng-Zhi. (2005). Using Nearest Feature Line and Tunable Nearest Neighbor methods for prediction of protein subcellular locations. Computational Biology and Chemistry., 29, 388-392.
*[45] Gomez, Manuel., Alonso-Allende, Ramón., Pazos, Florencio., Grana, Osvaldo., Juan, David.& Valencia, Alfonso. (2004). Accessible Protein Interaction Data for Network Modeling. Structure of the information and available repositories. Structural Bioinformatics group, Imperial College.
*[46] Gomez, Shawn M. & Rzhetsky, Andrey. (2002).Towards the Prediction of Complete Protein- Protein Interaction Networks. Columbia Genome Center, Department of Medical Informatics, Columbia University.
*[47] Gottschalk, Kay-Eberhard., Neuvirth, Hani & Schreiber, Gideon. (2004). A Novel Method for Scoring of Docked Protein Complexes Using Predicted Protein-Protein Binding Sites. Protein Engineering, Design & Selection., 17(2), 183-189.
*[48] Guo,Ting., Shi, Yanxin & Sun, Zhirong. (2005). A novel statistical ligand-binding site predictor: application to ATP-binding sites. Protein Engineering, Design & Selection., 18(2), 65-70.
*[49] Han, Dong-soo., Kim, Hong-soo., Jang, Wong-Hyuk., Lee, Sung-Doke & Suh, Jung-Keun. (2004). PreSPI: a domain combination based prediction system for protein–protein interaction. Nucleic Acids Research., 32(21), 6312-6320.
*[50] Han, Sangjo., Lee, Byung-chui, Yu, Seung Taek., Jeong, Chan-seok., Lee, Soyoung & Dongsup, Kim. (2005). Fold Recognition by combining profile-profile alignment and support vector machine. Bioinformatics., 21(11), 2667-2673.
*[51] Hermjakob, Henning., Motecchi- Palazzi, Luisa., Bader, Gary., Wojcik, Jérôme., Salwinski, Lukasz., Ceol, Arnaud et al. (2004). The HUPO PSI's Molecular Interaction format—a community standard for the representation of protein interaction data. Nature Biotechnology., 22(2) 177-183.
*[52] Heuser, Phillipp., Baù, Davide., Benkert, Pascal & Schomberg, Dietmar. (2005). Refinement of Unbound Protein Docking Studies Using Biological Knowledge. Proteins: Structure, Function and Bioinformatics., 61, 1059-1067.
*[53] Hirokawa, Takatsugu., Uechi, Junichi., Sasamoto, Hiroyuki., Suwa, Makiko & Mitaku, Shigeki. (2000). A Triangle Lattice Model that Predicts Transmembrane Helix Configuration using a Polar Jigsaw Puzzle. Protein Engineering., 13(11), 771-778.
*[54] Ho, Tin Kam. (1998). The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence., 20(8), 832- 842.
*[55] Horváth, Gábor V., Pettkó- Szandtner, Aladár., Nikovics, Krisztina., Bilgin, Metin., Boulton, Margaret., Davies, Jeffery W., Gutiérrez, Crisanto & Dudits, Dénes. (1998). Prediction of functional regions of the maize streak virus replication-associated proteins by protein-protein interaction analysis. Plant Molecular Biology., 38, 699-712.
*[56] Hsu, Chih-Wei., Chang, Chih-Chung & Lin, Chih-Jen. (2003). Practical guide to Support Vector Classification. Department of Computer science and Information Engineering, National Taiwan University, 1-12.
*[57] Hu, Hai., Columbus, John., Zhang, Yi., Wu, Dongying., Lian, Lubing., Yang, Song., Goodwin, Jennifer., Luczak, Christine., Carter, Mark., Chen, Lin., James, Michael., Davis, Roger., Sudol, Marius., Rodwell, John & Herrero, Juan J. (2004). A Map of WW Domain Family Interactions. Proteomics., 4,643-655.
*[58] Huang, Jing & Shi, Feng. (2004). Support vector machines for predicting Apoptosis Proteins Types. Acta Biotheoretica., 53, 39-47.
*[59] Huang, Ni., Chen, Hu & Sun, Zhirong. (2005). CTKPred: an SVM-based method for the prediction and classification of the cytokine superfamily. Protein Engineering, Design & Selection. 18(8), 365-368.
*[60] Ito, Takashi., Chiba, Tomoko., Ozawa, Ritsuko., Yoshida, Mikio., Hattori, Masahira & Sakaki, Yoshiyuki. (2001). A Comprehensive Two- Hybrid Analysis to Explore the Yeast Protein Interactome. Proceedings of National Academy of Science., 98(8), 4569-4574.
*[61] Jaffe, Jacob D., Berg, Howard C. & Church, George M. (2003). Proteogenomic mapping as a complementary method to perform genome annotation. Proteomics, 4, 59-77.
*[62] Jiang, Fan. (2003). Prediction of Protein Secondary Structure with a Reliability Score Estimated by Local Sequence Clustering. Protein Engineering., 16(9), 651-657.
*[63] Jones, S. & Thornton, JM. (1997). Prediction of Protein-Protein Interaction Sites using Patch Analysis. J. Mol. Biol., 272(1), 133-143.
*[64] Kikuchi, Tomonori & Abe, Shigeo. (2005). Comparison Between Error Correcting output Codes and Fuzzy Support Vector Machines. Pattern Recognition Letters., 26, 1937-1945.
*[65] Kim, Hyunsoo & Park, Haesum. (2004). Prediction of Protein Relative Solvent Accessibility with Support Vector Machines and Long-Range Interaction 3D Local Descriptor. Proteins: Structure, Function, and Bioinformatics., 54, 557-562.
*[67] Kim, Moon Kyu., Kim, Eun Sook., Kim, Dong Soo., Choi, In-Hong., Moon, Taesung, Yoon, Chang No & Shin, Jeon-Soo. (2004). Two novel mutations of Wiskott Aldrich syndrome the molecular prediction of interaction between the mutated WASP L101P with WASP interacting protein by molecular modeling. Biochimica et Biophysica Acta., 1690, 134-140.
*[68] Kim, Wan Kyu & Ison, Jon C. (2005). Survey of the Geometric Association of Domain-Domain Interfaces. Proteins., 61(4), 1075- 1088.
*[69] Kim, Wan Kyu., Park, Jong & Suh, Jung, Keun. (2002). Large Scale Statistical Prediction of Protein-Protein Interaction by Potentially Interacting Domain (PID) Pair. Genome Informatics., 13, 42-50.
*[70] Koike, Asako & Toshihisa, Takagi. (2004). Prediction of Protein-Protein Interaction Sites using Support Vector Machines. Protein Engineering, Design & Selection., 17(2), 165-173.
*[71] Kortemme, Tanja & Baker, David. (2003). Computational Design of Protein-Protein Interactions. Current Opinion in Chemical Biology., 8, 91-97.
*[72] Kortemme, Tanja., Joachimiak, Lukasz A., Bullock, Alex N., Schuler, Aaron D., Stoddard, Barry L. & Baker, David. (2004). Computational Redesign of Protein-Protein Interaction Specificity. Nature Structural & Molecular Biology., 11(4), 371-379.
*[73] Küster, Bernhard., Mortensen, Peter., Andersen, Jens S. & Mann, Matthias. (2001). Mass spectrometry allows direct identification of proteins in large genomes. Proteomics., 1, 641-650.
*[74] Lappe, Michael & Holm, Liisa. (2004).Unravelling Protein Interaction Networks with Near Optimal Efficiency. Nature Biotechnology., 22(1), 98-103.
*[75] Lee, CY., Yang, PK.,Tzou, WS. & Hwang, MJ. (1998). Estimates of Relative Binding Free Energies for HIV Protease Inhibitors Using Different Levels of Approximations. Protein Engineering., 11(6), 429-437.
*[76] Li, Chun Hua., Ma, Xiao Hui., Chen, Wei Zu & Wang, Cun Xin. (2003). A Protein-Protein Docking Algorithm Dependent on the Type of Complexes. Protein Engineering., 16(4), 265-269.
*[77] Li, Hui., Robertson, Andrew D. & Jensen, Jan H. (2005). Very Fast Empirical Prediction and Rationalization of Protein pKa Values. Proteins: Structure, Function and Bioinformatics., 61, 704- 721.
*[78] Liang, Shide., Zhang, Jian., Zhang, Shicui and Guo, Huarong. (2004).Prediction of the Interaction Site on the Surface of an Isolated Protein Structure by Analysis of Side Chain Energy Scores. Proteins: Structure, Function and Bioinformatics., 57, 548-557.
*[79] Lin, Yi., Lee, Yoonkyung & Wahba, Grace. (2002). Support Vector Machines for Classification in Nonstandard Situations. Machine Learning., 46, 191-202.
*[80] Lindauer, Klaus., Loerting, Thomas., Liedl, Klaus R. Kroemer, Romano T. (2001). Prediction of the Structure of Human Janus Kinase 2 (JAK2) Comprising of two Carboxy-terminal Reveals a Mechanism for Autoregulation. Protein Engineering., 14(1), 27-37.
*[81] Ling Lo, Siaw., Cai Zhong, Cong., Chen, Yu Zong & Chung, Maxey C. M. (2005). Effect of training datasets on support vector machine prediction of protein-protein interactions. Proteomics., 5, 876-884.
*[82] Lu, Wencong., Dong, Ning & Gábor Náray-Szabó. (2005). Predicting Anti-HIV-1 Activities of HEPT-analog Compounds by Using Support Vector Classification. QSAR Comb. Sci., 24, 1021-1025.
*[83] Lubec,Gert., Afjehi-Sadat,Leila., Yang, Jae-Won & John, Julius Paul Pradeep. (2005). Searching For Hypothetical Proteins: Theory and Practice Based Upon Original Data and Literature. Progress in Neurobiology., 77, 90-127.
*[84] Gromiha, Michael M., Oobatake, Motohisa., Kono, Hiditoshi., Uedaira, Hatsuho and Sarai, Akinori. (1999). Role of structural and sequence information in the prediction of protein stability changes: comparison between buried and partially buried mutations. Protein Engineering, 12(7),549–555
*[85] Gromiha, Michael M. and Selvaraj, S. (1998). Protein secondary structure prediction in different structural classes. Protein Engineering., 11(4).249–251.
*[86] Mahn, Andrea & Asenjo, Juan A. (2005). Prediction of Protein Retention in Hydrophobic Interaction Chromatography. Biotechnology Advances., 23,359-368.
*[87] Mamitsuka, Hiroshi. (2004). Essential Latent Knowledge for Protein-Protein Interactions: Analysis by an Unsupervised Learning Approach. IEEE/ACM Transactions on Computational Biology and Bioinformatics., 2(2),119-130.
*[88] Mandel-Gutfreund, Yael & Margalit, Hannah. (1998). Quantitative parameters for amino acid–base interaction implications for prediction of protein–DNA binding sites. Nucleic Acids Research., 26(10), 2306-2312.
*[89] Mandell, Jeffery G., Roberts, Victoria A., Pique, Michael E., Kotlovyi, Vladimir., Mitchell, Julie C., Nelson, Erik., Tsigelny, Igor & Eyck, Lynn F. Ten. (2001). Protein Docking Using Continuum Electrostatics and Geometric Fit. Protein Engineering., 14(2), 105-113.
*[90] Margulies, Marcel., Egholm, Michael., Altman, William E., Attiya, Said., Bader, Joel S., Bemben, Lisa A., Berka, Jan., Braverman, Michael S., Chen, Yi-Ju., Chen, Zhoutao., Dewell, Scott B., Du, Lei., Fierro, Joseph M. et al. (2005). Genome sequencing in microfabricated high-density picolitre reactors. Nature., 437, 376-380.
*[91] Markowetz, Florian., Edler, Lutz & Vingron, Martin. (2003). Support Vector Machines For Protein Fold Class Prediction. Biometrical Journal., 45(3), 377-389.
*[92] Mayordomo, Isabel & Sanz, Pascual. (2002). The Saccharomyces cerevisiae 14-3-3 protein Bmh2 is Required for Regulation of Phosphorylation Status of Fin1, a Novel Intermediate Protein. Biochemical Journal., 365, 51-56.
*[93] Mönnigmann, M. & Floudas, C.A. (2005). Protein Loop Structure Prediction With Flexible Stem Geometries. Proteins: Structure, Function and Bioinformatics., 61, 748-762.
*[94] Mooney, Sean D., Liang, Mike Hsing-Ping., Deconde, Rob & Altman, Ross B. (2005). Structural Characterization of Proteins Using Residue Environments. Proteins: Structure, Function and Bioinformatics., 61, 741-747.
*[95] Nabieva, Elena., Jim, Kam., Agarwal, Amit., Chazelle, Bernard & Singh, Mona. (2005). Whole-proteome Prediction of Protein Function Via Graph-Theoretic Analysis of Interaction Maps. Bioinformatics., 21, 302-310.
*[96] Nagl, Sylvia B., Das, Sudenshna & Smith, Temple F. (2000). Prediction of Interaction Partners for Orphan Nuclear Receptors by Prior-based Protein Sequence Profiles. Journal of Molecular Recognition., 13, 117-126.
*[97] Nanni, Loris. (2005). Fusion of Classifiers for Predicting Protein-Protein Interactions. Neurocomputing., 68, 289-296.
*[98] Nanni, Loris. (2005). Hyperplanes for Predicting Protein-Protein Interactions. Neurocomputing., 69, 257-263.
*[99] Nanni, Loris. (2005). Hyperplanes for Predicting Protein-Protein Interactions. Neurocomputing., 69, 257-263.
*[100] Nguyen, Minh N. & Rajapakse, Jagath C. (2005). Prediction of Protein Relative Solvent Accessibility With a Two-Stage SVM Approach. Proteins: Structure, Function and Bioinformatics., 59, 30-37.
*[101] Palma, Nuno P., Krippahl, Ludwig., Wampler, John E. & Moura, José J.G. (2000). A New (Soft) Docking Algorithm for Predicting Protein Interactions. Proteins: Structure, Function and Genetics., 39(4), 372-384.
*[102] Pazos, Florencio. & Valencia, Alfonso. (2001). Similarity in phylogenetic trees as indicator of protein-protein interaction. Protein Engineering., 14(9), 609-614.
*[103] Permyakov, Serge E., Makhatatadze, George I., Owenius, Rikard., Uversky, Vladimir N., Brooks, Charles L., Permyakov, Eugene A. & Berliner, Lawrence J. (2005). How to Improve Nature: Study of the Electrostatic Properties of the Surface of α-lactalbumin. Protein Engineering, Design & Selection., 18(9), 425-433.
*[104] Prusis, Peteris., Lundstedt, Torbjörn & Wikberg, Jarl E.S. (2002). Proteo-chemometrics analysis of MSH peptide binding to melanocortin receptors. Protein Engineering., 14(4), 305-311.
*[105] Qi, Yanjun., Klein-Seetharaman, Judith. & Bar-Joseph, Ziv. (2005). Random Forest Similarity for Protein-Protein Interaction prediction from Multiple Sources. School of Computer Science, Carnegie Mellon University.
*[107] Rausch, Christian., Weber, Tilmann., Kohlbacher, Oliver., Wohlleben, Wolfgang & Huson, Daniel H. (2005). Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs). Nucleic Acids Research., 33(18), 5799-5808.
*[108] Rhodes, David R., Tomlins, Scott A., Varambally, Sooryanarayana., Mahavisno, Vasudeva., Barrette, Terrence., Kalyana-Sundaram, Shanker., Ghosh, Debashis., Pandey, Akhilesh & Chinnaiyan, Arul M. (2005). Probabilistic model of the Human Protein-Protein Interaction Network. Nature Biotechnology., 23(8), 951-959.
*[109] Russell, Robert B., Alber, Frank., Aloy, Patrick., David, Fred P., Korkin, Dmitry., Pichaud, Matthieu., Topf, Maya & Sali, Andrej. (2004). A Structural Perspective on Protein-Protein Interactions. Current Opinion in Structural Biology., 14, 313-324.
*[110] Vajda, Sandor., Vakser, Ilya A., Sternberg, Michael J.E & Janin, Joël. (2001). First community wide experiment on the comparative evaluation of protein: Modeling of Protein Interactions in Genomes. Biomedical Engineering., Boston University.
*[111] Saraf, Manish C., Moore, Gregory L. & Maranas, Costas D. (2003). Using multiple sequence correlation analysis to characterize functionally important protein regions. Protein Engineering., 16(6), 397-406.
*[112] Sato, Tetsuya., Yamanshi, Yoshiro., Kanehisa, Minoru. & Toh, Hiroyuki. (2000). Prediction of Protein-Protein Interactions Based on Real-Valued Phylogenetic Profiles Using Partial Correlation Coefficient. Institute for Chemical Research, Kyoto University.
*[113] Seeger, Matthias. (2004). Gaussian Processes For Machine Learning. University of California.
*[114] Shilton, Alistair., Palaniswami, M., Ralph, Daniel & Tsoi, Ah Chung. (2005). Incremental Training of Support Vector Machines. IEEE Transactions on Neural Networks., 16(1), 114-131.
*[115] Shionyu-Mitsuyama, Clara., Shirai, Tsuyoshi., Ishida, Hirokazu & Yamane, Takashi. (2003). An empirical approach for structure-based prediction of carbohydrate-binding sites on proteins. Protein Engineering., 16(7), 467-478.
*[116] Wodak, Shoshana J. & Méndez, Raúl. (2004). Prediction of protein–protein interactions: the CAPRI experiment, its evaluation and implications. Current Opinion in Structural Biology., 14, 242–249.
*[117] Schwikowski, Benno., Uetz, Peter. & Fields, Stanley. (2000). A network of protein-protein interactions in yeast. Nature Biotechnology., 18, 1257-1261.
*[118] Soares, Dinesh C., Gerloff, Dietlind L., Syme, Neil R., Coulson, Andrew F.W., Parkinson, John & Barlow, Paul N. (2005). Large-scale modelling as a route to multiple surface comparisons of the CCP module family. Protein Engineering, Design & Selection., 18(8), 379-388.
*[119] Socolich, Michael., Lockless, Steve W., Russ, William P., Lee, Heather., Gardner, Kevin H. & Ranganathan, Rana. (2005). Evolutionary information for specifying a protein fold. Nature., 437, 512-518.
*[120] Song, Jie & Tang, Huanwen. (2005). Support vector Machines for Classification of Homo-oligomeric Proteins by incorporating Subsequence Distributions. Journal of Molecular structure: THEOCHEM., 722, 97-101.
*[121] Srinivasan, N., Antonelli, Marcelo., Jacob, Germaine., Korn, Iris., R-Sayed, Muhammed F., Blundell, Tom L., Allende, Catherine C. & Allende, Jorge C. (1999). Structural interpretation of site-directed mutagenesis and specificity of the catalytic subunit of protein kinase CK2 using comparative modelling. Protein Engineering., 12(2), 119-127.
*[122] Su, Zhengchang., Dam, Phuongan., Chen, Xin., Olman, Victor., Jiang, Tao., Palenik, Brian & Xu, Ying. (2003). Computational Inference of Regulatory Pathways in Microbes: an Application of Phosphorus Assimilation Pathways in Synechococcus sp WH8102. Genome Informatics., 14, 3-13.
*[123] Sudarsanam, Sucha & Srinivasan, Subhashini. (1997). Sequence-dependent conformational sampling using a database of φ<sub>i+1</sub> and ψ<sub>i</sub> angles for predicting polypeptide backbone conformations. Protein Engineering., 10(10), 1155-1162.
*[124] Sussman, Fredy., Villaverde, M. Carmen & Martinez, Luis. (2002). Modified Solvent Accessibility Free Energy Prediction Analysis of Cyclic Urea inhibitors binding to the HIV-1 protease. Protein Engineering., 15(9), 707-711.
*[125] Pitre, Sylvain., Chan, A., Cheetham, Jim., Dehne, Frank., Duong, Alex., Emili, Andrew., Greenblatt, Jack., Krogan, Nevan., Luo, Xuemei & Golshani, Ashkan. (2005). PIPE: A Protein-Protein Interaction Prediction Engine Based on the Re-occurring Short Amino Acid Sequences Between Known Interacting Protein Pairs.
*[126] Tang, Yuchun., Jin, Bo & Zhang, Yan-Qing. (2005). Granular Support Vector Machines with Association Rules Mining For Protein Homology Prediction. Artificial Intelligence in Medicine., 35, 121-134.
*[127] Taroni, Chiara., Jones, Susan. & Thornton, Janet M. (2000). Analysis and prediction of carbohydrate binding sites. Protein Engineering., 13(2), 89-98.
*[128] Terashi, Genki., Takeda-Shitaka, Mayuko., Takaya, Daisuke., Komatsu, Katsuichiro & Umeyama, Hideaki. (2005). Searching for Protein-Protein Interaction Sites and Docking by Methods of Molecular Dynamics, Grid Scoring, and the Pairwise Interaction Potential of Amino Acid Residues. Proteins: Structure, Function and Bioinformatics., 60, 289-295.
*[129] Thierry-Mieg, Nicolas. (2000). Protein-Protein Interaction Prediction for C. elegans. Laboratoire LSR-IMAG, France.
*[130] Tong, Amy Hin Yang., Drees, Becky., Nardelli, Giuliano., Bader, Gary D., Branetti, Barbara., Castagnoli, Luisa., Evangelista, Marie., Ferracuti, Silvia et al. (2002). A Combined Experimental and Computational Strategy to Define Protein Interaction Networks for Peptide Recognition Modules. Science., 295, 321-324.
*[131] Tropsha, Alexander & Edelsbrunner, Herbert. Biogeometry Applications of Computational Geometry to Molecular Structure. School of Pharmacy, University of North Carolina.
*[132] Tsuda, Koji., Shin, HyunJung & Schölkopf, Bernhard. (2005). Fast Protein Classification with Multiple Networks. Bioinformatics., 21(2), ii59-ii65.
*[133] Tuffery, Pierre & Derreumaux, Philippe. (2005). Dependency Between Consecutive Local Conformations Helps Assemble Protein Structures From Secondary Structures Using Go Potential and Greedy Algorithm. Proteins: Structure, Function and Bioinformatics., 61, 732-740.
*[134] Uetz, Peter & Vollert, Carolina S. (2005). Protein-Protein Interactions. Encyclopedic Reference of Genomics and Proteomics in Molecular Medicine.
*[135] Vajda, Sandor & Camacho, Carlos J. (2004). Protein-Protein Docking: Is the Glass Half-Empty? Trends in Biotechnology., 22(3), 110-116.
*[136] Valencia, Alfonso & Pazos, Florencio. (2002). Computational Methods for the Prediction of Protein Interactions. Current Opinion in Structural Biology., 12, 368-373.
*[137] Vazquez, Alexei., Flammini, Alessandro., Maritan, Amos. & Vespignani, Alessandro. (2003). Global Protein Function Prediction From Protein-Protein Interaction Networks. Nature Biotechnology., 21(6), 697-700.
*[138] Wang, Meng., Yang, Jie., Liu, Guo-Ping., Xu, Zhi-Jie & Chou, Kuo-Chen. (2004). Weighted-Support vector Machines for Predicting Membrane Protein Types based on Pseudo-amino acid Composition. Protein Engineering, Design & Selection., 17(6), 509-516.
*[139] Webb-Robertson, Bobbie-Jo., Oehmen, Christopher & Matzke, Melissa. (2005). SVM-BALSA Remote homology detection based on Bayesian sequence alignment. Computational Biology and Chemistry., 29, 440-443.
*[140] Weber, Irene T. & Harrison, Robert W. (1999). Molecular mechanics analysis of drug-resistant mutants of HIV. Protein Engineering., 12(6), 469-474.
*[141] Greenleaf, William J., Woodside, Michael T., Abbondanzieri, Elio A. & Block, Steven M. (2005). Passive All-Optical Force Clamp for High-Resolution Laser Trapping. Phys. Rev. Lett., 95, 208102.
*[142] Wodak, Shoshana J. & Mendez, Raul. (2004). Prediction of Protein-Protein Interactions: the CAPRI Experiment, its evaluation and implications. Current Opinion in Structural Biology., 14, 242-249.
*[143] Wojcik, Jérôme., Boneca, Ivo. & Legrain, Pierre. (2002). Prediction, Assessment and Validation of Protein Interaction Maps in Bacteria. J. Mol. Biol., 323, 763-770.
*[144] Wright, JD & Lim, C. (1998). Prediction of an anti-IgE binding site on IgE. Protein Engineering., 11(6), 421-427.
*[145] Xie, Dan., Li, Ao., Wang, Minghui., Fan, Zhewan & Feng, Huanqing. (2005). LOCSVMPSI: a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST. Nucleic Acids Research., 33, W105-W110.
*[146] Yang, Zheng Rong. (2005). Orthogonal Kernel Machine for the Prediction of Functional Sites in Proteins. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics., 35(1), 100-106.
*[147] Yu, Chenggang., Zavaljevski, Nela., Stevens, Fred J., Yackovich, Kelly & Reifman, Jaques. (2005). Classifying Noisy Protein Sequence Data: A case study of immunoglobulin light Chains. Bioinformatics., 21(1), i495-i501.
*[148] Yu, Hui., Gao, Lei., Tu, Kang & Guo, Zheng. (2005). Broadly Predicting Specific Gene Functions with Expression Similarity and Taxonomy Similarity. Gene., 352, 75-81.
*[149] Yuan, Zheng & Huang, Bixing. (2004). Prediction of Protein Accessible Surface Areas by Support Vector Regression. Proteins: Structure, Function and Bioinformatics., 57, 558-564.
*[150] Zavaljevski, Nela., Stevens, Fred J. & Reifman, Jaques. (2002). Support Vector Machines with Selective Kernel Scaling for Protein Classification and identification of Key Amino acid positions. Bioinformatics., 18(5), 689-696.
*[151] Zeng, Jun., Nheu, Thao., Zorzet, Anna., Catimel, Bruno., Nice, Ed., Maruta, Hiroshi., Burgess, Antony W. & Treutlein, Herbert R. (2001). Design of Inhibitors of Ras-Raf interaction using a Computational Combinatorial Algorithm. Protein Engineering., 14(1), 39-45.
*[152] Zhao, Xing-Ming., Cheung, Yiu-Ming & Huang, De-Shuang. (2005). A Novel Approach to Extracting Features from Motif Content and Protein Composition for Protein Sequence Classification. Neural Networks., 18, 1019-1028.
*[153] Zhou, Huan-Xiang & Shan, Yibing. (2001). Prediction of Protein Interaction Sites from Sequence Profile and Residue Neighbor List. Proteins: Structure, Function and Genetics., 44, 336-343.
[[Category:Bioinformatics]]