MyJournals Home  

RSS FeedsIJMS, Vol. 17, Pages 1623: Identification of Protein-Protein Interactions via a Novel Matrix-Based Sequence Representation Model with Amino Acid Contact Information (International Journal of Molecular Sciences)

 
 

24 september 2016 12:45:09

 
IJMS, Vol. 17, Pages 1623: Identification of Protein-Protein Interactions via a Novel Matrix-Based Sequence Representation Model with Amino Acid Contact Information (International Journal of Molecular Sciences)
 


Identification of protein-protein interactions (PPIs) is a difficult and important problem in biology. Since experimental methods for predicting PPIs are both expensive and time-consuming, many computational methods have been developed to predict PPIs and interaction networks, which can be used to complement experimental approaches. However, these methods have limitations to overcome. They need a large number of homology proteins or literature to be applied in their method. In this paper, we propose a novel matrix-based protein sequence representation approach to predict PPIs, using an ensemble learning method for classification. We construct the matrix of Amino Acid Contact (AAC), based on the statistical analysis of residue-pairing frequencies in a database of 6323 protein-protein complexes. We first represent the protein sequence as a Substitution Matrix Representation (SMR) matrix. Then, the feature vector is extracted by applying algorithms of Histogram of Oriented Gradient (HOG) and Singular Value Decomposition (SVD) on the SMR matrix. Finally, we feed the feature vector into a Random Forest (RF) for judging interaction pairs and non-interaction pairs. Our method is applied to several PPI datasets to evaluate its performance. On the S . c e r e v i s i a e dataset, our method achieves 94 . 83 % accuracy and 92 . 40 % sensitivity. Compared with existing methods, and the accuracy of our method is increased by 0 . 11 percentage points. On the H . p y l o r i dataset, our method achieves 89 . 06 % accuracy and 88 . 15 % sensitivity, the accuracy of our method is increased by 0 . 76 % . On the H u m a n PPI dataset, our method achieves 97 . 60 % accuracy and 96 . 37 % sensitivity, and the accuracy of our method is increased by 1 . 30 % . In addition, we test our method on a very important PPI network, and it achieves 92 . 71 % accuracy. In the Wnt-related network, the accuracy of our method is increased by 16 . 67 % . The source code and all datasets are available at https://figshare.com/s/580c11dce13e63cb9a53.


 
98 viewsCategory: Biochemistry, Biophysics, Molecular Biology
 
IJMS, Vol. 17, Pages 1620: miR33a/miR33b* and miR122 as Possible Contributors to Hepatic Lipid Metabolism in Obese Women with Nonalcoholic Fatty Liver Disease (International Journal of Molecular Sciences)
IJMS, Vol. 17, Pages 1629: Extracts of Magnolia Species-Induced Prevention of Diabetic Complications: A Brief Review (International Journal of Molecular Sciences)
 
 
blog comments powered by Disqus


MyJournals.org
The latest issues of all your favorite science journals on one page

Username:
Password:

Register | Retrieve

Search:

Molecular Biology


Copyright © 2008 - 2024 Indigonet Services B.V.. Contact: Tim Hulsen. Read here our privacy notice.
Other websites of Indigonet Services B.V.: Nieuws Vacatures News Tweets Nachrichten