光宣敏



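【光宣敏】Curriculum Vitae
Date of birth: December 1984
Hometown: Xiantao, Hubei
Alma mater: Wuhan Institute of Technology, School of Chemical Engineering and Pharmacy, major in Chemical Engineering and Technology (industrial analysis track)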
E-Mail:
Mailing address:
Interests:
Fishing, calligraphy, basketball, Chinese chess, travel, reading
Main research direction

Predicting disulfide bonds in protein sequences using support vector machines






Recent publications
  • In silico method for systematic analysis of feature importance in microRNA-mRNA interactions

  • BMC Bioinformatics, 2009, 10:427

    JiaMin Xiao, Yizhou Li, Ke Long Wang, Zhining Wen, Menglong Li*, LiFang Zhang, Guangxuan Min

    Abstract:  

    Background: MicroRNA (miRNA), which is short non-coding RNA, plays a pivotal role in the regulation of many biological processes and affects the stability and/or translation of mRNA. Recently, machine learning algorithms were developed to predict potential miRNA targets. Most of these methods are robust but are not sensitive to redundant or irrelevant features. Despite their good performance, the relative importance of each feature is still unclear. With increasing experimental data becoming available, research interest has shifted from higher prediction performance to uncovering the mechanism of microRNA-mRNA interactions.

    Results: Systematic analysis of sequence, structural and positional features was carried out for two different data sets. The dominant functional features were distinguished from uninformative features in single and hybrid feature sets. Models were developed using only statistically significant sequence, structural and positional features, resulting in area under the receiver operating characteristic curve (AUC) values of 0.919, 0.927 and 0.969 for one data set and of 0.926, 0.874 and 0.954 for another data set, respectively. Hybrid models were developed by combining various features and achieved AUC values of 0.978 and 0.970 for the two data sets. Functional miRNA information is well reflected in these features, which are expected to be valuable in understanding the mechanism of microRNA-mRNA interactions and in designing experiments.

    Conclusions: Differing from previous approaches, this study focused on systematic analysis of all types of features. Statistically significant features were identified and used to construct models that yield similar accuracy to previous studies in a shorter computation time.



  • Using pseudo amino acid composition to predict transmembrane regions in protein: cellular automata and Lempel-Ziv complexity

  • Chinese Chemical Letters, Volume 34, Number 1, Jan 2008

    Guangxuan Min, Yanzhi Guo, Menglong Li*, Tuanfei Zhu

    Abstract:  

    Knowledge of subnuclear localization in eukaryotic cells is indispensable for understanding the biological function of the nucleus, genome regulation and drug discovery. In this study, a new feature representation was proposed by combining the position specific scoring matrix (PSSM) and auto covariance (AC). The AC variables describe the neighboring effect between two amino acids, so they incorporate sequence-order information; the PSSM describes the evolutionary information of proteins. Based on this new descriptor, a support vector machine (SVM) classifier was built to predict subnuclear localization. To evaluate the power of our predictor, a benchmark dataset containing 714 proteins localized in nine subnuclear compartments was used. The total jackknife cross-validation accuracy of our method is 76.5%, which is higher than those of Nuc-PLoc (67.4%), OETKNN (55.6%), AAC-based SVM (48.9%) and ProtLoc (36.6%). The prediction software used in this article and the details of the SVM parameters are freely available at http://chemlab.scu.edu.cn/predict_SubNL/index.htm, and the dataset used in our study is from Shen and Chou’s work and can be downloaded at http://chou.med.harvard.edu/bioinf/Nuc-PLoc/Data.htm.
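
    To illustrate how auto covariance can turn a variable-length PSSM into a fixed-length vector, here is a minimal sketch. It uses the commonly cited AC definition (the mean product of centered column values at positions i and i + lag); the function name `auto_covariance`, the `max_lag` parameter and the toy matrix are assumptions made for illustration and are not taken from the article or its software.

```python
def auto_covariance(matrix, max_lag):
    """Return AC features for a matrix given as a list of rows.

    Each row is one sequence position and each column is one PSSM column.
    Assumed definition (not confirmed by the article):
        AC(lag, j) = sum_i (m[i][j] - mean_j) * (m[i+lag][j] - mean_j) / (L - lag)
    The output has max_lag * n_cols values regardless of sequence length L.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    if max_lag >= n_rows:
        raise ValueError("max_lag must be smaller than the number of rows")

    # Column means over the whole sequence.
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]

    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (matrix[i][j] - means[j]) * (matrix[i + lag][j] - means[j])
                for i in range(n_rows - lag)
            )
            features.append(total / (n_rows - lag))
    return features


# Toy 4-position, 1-column matrix (hypothetical values, not real PSSM scores).
toy = [[1.0], [2.0], [3.0], [4.0]]
ac = auto_covariance(toy, max_lag=2)
print(ac)
```

    Because the output length depends only on `max_lag` and the number of columns, proteins of different lengths map to vectors of the same size, which is what a standard SVM input requires.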



  • Prediction of neurotoxins by support vector machine based on multiple feature vectors

  • Interdisciplinary Sciences,

    Guangxuan Min

    Abstract:  

    A new method was proposed for the prediction of mitochondrial proteins by the discrete wavelet transform, based on a sequence–scale similarity measurement. This sequence–scale similarity, which reveals more information than other conventional methods, does not rely on subcellular location information and can directly handle protein sequences of different lengths. In our experiments, 499 mitochondrial protein sequences, constituting a mitochondria database, were used as the training dataset, and 681 non-mitochondrial protein sequences were tested. The system predicts these sequences with sensitivity, specificity, accuracy and MCC of 50.30%, 95.74%, 76.53% and 0.54, respectively. Source code of the new program is available on request from the authors.
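
    For readers unfamiliar with the four measures quoted above, the sketch below computes them from a binary confusion matrix using their standard textbook definitions. The function name `binary_metrics` and the counts in the example are hypothetical and chosen only for illustration; they are not the counts behind the reported figures.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, mcc) for a binary classifier.

    tp/fn count positives predicted correctly/incorrectly;
    tn/fp count negatives predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc


# Hypothetical counts for illustration only.
sens, spec, acc, mcc = binary_metrics(tp=40, fn=10, tn=45, fp=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f} mcc={mcc:.3f}")
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes differ in size.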