In silico method for systematic analysis of feature importance in microRNA-mRNA interactions

BMC Bioinformatics, 2009, 10:427

JiaMin Xiao, Yizhou Li, Ke Long Wang, Zhining Wen, Menglong Li*, LiFang Zhang, Guangxuan Min


Background: MicroRNAs (miRNAs) are short non-coding RNAs that play a pivotal role in the regulation of many biological processes and affect the stability and/or translation of mRNA. Recently, machine learning algorithms have been developed to predict potential miRNA targets. Most of these methods are robust but are not sensitive to redundant or irrelevant features. Despite their good performance, the relative importance of each feature remains unclear. As more experimental data become available, research interest has shifted from achieving higher prediction performance to uncovering the mechanism of microRNA-mRNA interactions.

Results: Systematic analysis of sequence, structural and positional features was carried out for two different data sets. The dominant functional features were distinguished from uninformative features in single and hybrid feature sets. Models were developed using only statistically significant sequence, structural and positional features, resulting in area under the receiver operating characteristic curve (AUC) values of 0.919, 0.927 and 0.969 for one data set and of 0.926, 0.874 and 0.954 for the other data set, respectively. Hybrid models were developed by combining various features and achieved AUC values of 0.978 and 0.970 for the two data sets, respectively. Functional miRNA information is well reflected in these features, which are expected to be valuable in understanding the mechanism of microRNA-mRNA interactions and in designing experiments.
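
For readers less familiar with the metric, an AUC value can be read as the probability that a randomly chosen true miRNA-mRNA interaction receives a higher model score than a randomly chosen non-interaction. The following is a minimal sketch of that pairwise definition, not the authors' implementation; the function name `auc_from_scores` and the toy labels and scores are hypothetical and are not taken from the study's data sets.

```python
def auc_from_scores(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = interaction, 0 = non-interaction.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

On this toy input, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9. The quadratic pairwise loop is chosen for clarity; a rank-based computation would be preferable for large data sets.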

Conclusions: Unlike previous approaches, this study focused on systematic analysis of all types of features. Statistically significant features were identified and used to construct models that yield accuracy similar to that of previous studies in a shorter computation time.
