Index


1. What is miRWalk database and how does it work?


miRWalk is a comprehensive database of the predicted and validated targets of human, mouse and rat microRNAs (miRNAs), linked to genes, pathways, diseases, organs, cell lines and transcription factors. miRWalk consists of two modules.

The Predicted Target module hosts miRNA-target interaction information on the complete sequence of all known genes of human, mouse and rat. In addition, the results are presented together with those from 8 established miRNA-target prediction programs, so that predictions from different algorithms can be compared. Furthermore, it provides predicted miRNA binding sites on genes associated with 449 human biological pathways and 2356 OMIM disorders.

The Validated Targets module hosts new and unique features based on experimentally validated miRNA interaction information associated with genes, pathways, diseases, organs, cell lines and OMIM disorders. Moreover, it offers information on proteins known to be involved in miRNA processing. miRWalk is the only database that provides the possible miRNA binding sites on the complete sequence (promoter, 5' UTR, CDS and 3' UTR) of known genes and on 3 complete mitochondrial genomes.

miRWalk Algorithm

The miRWalk algorithm is based on a computational approach that identifies the longest consecutive complementarity between miRNA and gene sequences. Based on Watson-Crick complementarity, it walks along the complete gene sequences and mitochondrial genomes, starting with a heptamer seed of the miRNA. It identifies possible miRNA binding sites, extends each match as far as possible on the complete sequence of all known genes, and returns the longest seed match. It then assigns the miRNA binding sites to four regions of protein-coding genes (promoter, 5' UTR, CDS and 3' UTR) and to mitochondrial genes. In addition, the probability distribution of random matches of a subsequence (the 5' end of the miRNA sequence) in the analysed sequence is calculated using the Poisson distribution, as shown by Rehmsmeier et al. 2004.

miRWalk then compares its identified miRNA binding sites with the results of 8 established miRNA-target prediction programs, i.e. DIANA-microT, miRanda, miRDB, PicTar, PITA, RNA22, RNAhybrid and TargetScan/TargetScanS. Finally, all the predicted miRNA binding sites produced by the miRWalk algorithm and the 8 established programs are incorporated into a relational database (miRWalk).

Thereafter, an extensive search of the PubMed database was made to retrieve all the available information on human, mouse and rat miRNAs linked to genes, pathways, diseases, organs, cell lines, OMIM disorders, and proteins known to be involved in miRNA processing. This information is compiled and stored as validated information on miRNAs in the miRWalk database. miRWalk data is generated by executing automated Perl and BioPerl scripts on the server of the bwGRID Cluster Heidelberg (High Performance Cluster). The figure below shows the miRWalk algorithm and analysis pipeline.
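The walking step can be illustrated with a minimal Python sketch. It is not the miRWalk implementation (which, according to the text, consists of Perl and BioPerl scripts). The function names `target_site_for` and `find_sites`, the choice to start the seed at the first miRNA nucleotide, and the strict Watson-Crick pairing without G:U wobble are all assumptions made for illustration.

```python
# Illustrative sketch only; names and seed position are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def target_site_for(mirna_prefix):
    """Return the target-strand sequence that pairs with a 5' miRNA prefix.

    Pairing is antiparallel, so this is the reverse complement of the prefix.
    """
    return "".join(COMPLEMENT[base] for base in reversed(mirna_prefix))


def find_sites(mirna, target, min_seed=7):
    """Find seed matches on `target` and extend each to its longest run.

    Returns a list of (start, end, length) tuples, where start/end are
    0-based, end-exclusive target coordinates and length is the number of
    consecutive miRNA nucleotides (from the 5' end) that are paired.
    """
    mirna = mirna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    seed_site = target_site_for(mirna[:min_seed])

    sites = []
    start = target.find(seed_site)
    while start != -1:
        end = start + min_seed
        length = min_seed
        # The next miRNA base pairs with the target base just upstream
        # (toward the 5' end) of the current match; extend while it pairs.
        while (length < len(mirna)
               and end - length - 1 >= 0
               and target[end - length - 1] == COMPLEMENT[mirna[length]]):
            length += 1
        sites.append((end - length, end, length))
        start = target.find(seed_site, start + 1)
    return sites
```

For example, `find_sites("UGAGGUAGUAGGUUGUAUAGUU", "GGGACUACCUCAGGG")` returns `[(3, 12, 9)]`: the heptamer seed match extends to 9 consecutive paired nucleotides.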




2. Why does miRWalk search possible miRNA binding sites in the complete sequence of a gene?


For more than a decade, attempts to study the interaction of miRNAs with their targets were confined to the 3' UTR of mRNAs. Tay Y et al. have shown the existence of many naturally occurring target sites of miR-134, miR-296 and miR-470 in the amino acid coding sequence (CDS) of the genes Nanog, Oct4 and Sox2 (47). Shouhong Guang et al. have shown that specific Argonaute proteins can transport specific classes of small regulatory RNA to distinct cellular compartments to regulate gene expression (48). In another study, Robert F. Place et al. have shown that miR-373 targets promoter sequences of the E-cadherin and CSDC2 genes to induce gene expression (49). In addition, a few experiments have indicated possible target sites in the 5' UTR (50-52). Recent studies have described a new miRNA target class containing simultaneous 5' UTR and 3' UTR interaction sites (53).
These findings reveal a new mode of action by which miRNAs may regulate gene expression. It therefore makes sense to search for possible miRNA binding sites on the complete sequence of a gene (promoter, 5' UTR, CDS and 3' UTR), rather than on the 3' UTR alone.


3. What does miRWalk database cover?


The miRWalk database provides information on the following features:
1. miRNA-target interaction information produced by miRWalk on the complete sequence (promoter, 5' UTR, CDS and 3' UTR) of all genes of human, mouse and rat, including all transcripts.
2. miRNA-target interaction information produced by the 8 established miRNA prediction programs, i.e. RNA22, miRanda, miRDB, TargetScan, RNAhybrid, PITA, PicTar and DIANA-microT.
3. Predicted miRNA-target interaction information on genes associated with 449 human biological pathways and 2356 OMIM disorders.
4. Experimentally verified miRNA interaction information associated with genes, pathways, organs, diseases, cell lines, transcription factors and OMIM disorders.
5. Information on proteins known to be involved in miRNA processing.


4. How do I use miRWalk database?


The miRWalk database is divided into two modules, i.e. the Predicted Targets and Validated Targets modules. The Predicted Targets module has four different search pages, i.e. Gene Targets, MicroRNA Targets, Pathway Targets and Chromosome Targets. The Validated Targets module has nine different search pages, i.e. Gene, MicroRNA, Pathway, Disease, Organ, Common Organs, Cell Line and Transcription Factor Targets, and MicroRNA Literature.

The predicted and validated target searches are explained with the help of two examples.
Example 1. Predicted Gene Targets
Step 1. Select a species from the drop-down list as shown in the figure below.
Step 2. Select the region(s) of interest in which you want to search for possible miRNA binding sites. There is an extra range box for the promoter (the upstream flanking region of the gene). The maximum upstream range is 10000.
Step 3. Select the transcripts of interest. Two options are available, i.e. all transcripts or the longest transcript.
Step 4. Enter the minimum seed length of the miRNA and/or a P-value to restrict the search.
Step 5. Select other available prediction tools to see the common predictions in the 3' UTR of the gene(s).
Step 6. Enter the gene(s) in the text area box or upload a file, then press the Submit button.
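The choices made in the six steps can be summarised as a small parameter record. The following Python sketch is purely illustrative: the class name `PredictedGeneSearch`, its field names and the `problems` method are hypothetical and do not correspond to any published miRWalk interface. Only the limits (maximum upstream range of 10000, minimum seed length of 7) and the defaults (3' UTR, longest transcript, P-value 0.05, four preselected programs) are taken from this FAQ.

```python
from dataclasses import dataclass, field

MAX_UPSTREAM = 10000   # maximum promoter range given in Step 2
MIN_SEED_LENGTH = 7    # minimum seed length given in this FAQ
REGIONS = {"promoter", "5utr", "cds", "3utr"}


@dataclass
class PredictedGeneSearch:
    """Hypothetical container for the options of a Predicted Gene Targets search."""
    species: str
    genes: list
    regions: set = field(default_factory=lambda: {"3utr"})
    upstream: int = 0                  # only used when "promoter" is selected
    transcripts: str = "longest"       # "all" or "longest"
    min_seed: int = MIN_SEED_LENGTH
    p_value: float = 0.05
    programs: set = field(
        default_factory=lambda: {"miRanda", "miRDB", "miRWalk", "TargetScan"})

    def problems(self):
        """Return a list of human-readable problems; empty if the options look valid."""
        found = []
        if self.species not in {"human", "mouse", "rat"}:
            found.append("species must be human, mouse or rat")
        if not self.genes:
            found.append("enter at least one gene")
        if not self.regions or not self.regions <= REGIONS:
            found.append("select one or more known regions")
        if "promoter" in self.regions and not 0 < self.upstream <= MAX_UPSTREAM:
            found.append("promoter range must be between 1 and 10000")
        if self.transcripts not in {"all", "longest"}:
            found.append("transcripts must be 'all' or 'longest'")
        if self.min_seed < MIN_SEED_LENGTH:
            found.append("minimum seed length is 7")
        if not 0 < self.p_value <= 1:
            found.append("P-value must be in (0, 1]")
        if len(self.programs) < 2:
            found.append("select at least two prediction programs")
        return found
```

A search built with the defaults, such as `PredictedGeneSearch(species="human", genes=["TP63"])`, reports no problems.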



5. What does "Choose Gene Region(s)" mean?


According to current understanding, miRNAs have a new mode of action through which they may regulate gene expression by binding to the promoter as well as to the coding sequence.
A user can therefore limit the prediction of miRNA binding sites to specific region(s) of the entered gene(s), i.e. the promoter and/or the mRNA (5' UTR, CDS and 3' UTR). By default, the 3' UTR is selected.
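Restricting a search to chosen regions amounts to keeping only the sites whose coordinates overlap those regions. The sketch below shows one way to express this in Python. The function names, the region keys and the coordinate layout (promoter followed by 5' UTR, CDS and 3' UTR on one concatenated sequence) are assumptions for illustration, and the numbers are invented.

```python
def overlapping_regions(site_start, site_end, regions):
    """Return names of regions that overlap the half-open site [site_start, site_end)."""
    return [name for name, (start, end) in regions.items()
            if site_start < end and start < site_end]


def filter_sites(sites, regions, selected=("3utr",)):
    """Keep sites that overlap at least one selected region (3' UTR by default)."""
    wanted = set(selected)
    return [site for site in sites
            if wanted & set(overlapping_regions(site[0], site[1], regions))]


# Invented layout: half-open coordinates on promoter + mRNA, for illustration only.
toy_regions = {
    "promoter": (0, 2000),
    "5utr": (2000, 2150),
    "cds": (2150, 3350),
    "3utr": (3350, 4000),
}
toy_sites = [(100, 107), (2148, 2156), (3500, 3509)]
utr3_sites = filter_sites(toy_sites, toy_regions)
```

A site that spans a boundary, such as `(2148, 2156)` above, is reported for both regions it touches; whether miRWalk handles boundary sites this way is not stated in the text.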


6. What does "Select a Transcript" mean?


Alternatively spliced transcript variants encoding different isoforms have been noted for several genes.
For example, the human TP63 gene encodes six different transcripts that vary in the length of the 5' UTR, the CDS and the 3' UTR.
When a user clicks the "All" radio button under the option "Select a Transcript", all transcripts encoded by the entered gene(s) are automatically selected for the search for possible miRNA binding sites.
When a user clicks the "Longest" radio button under the option "Select a Transcript", only the longest transcript encoded by each entered gene is automatically selected for the search for possible miRNA binding sites.
By default, the longest transcript is selected.
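The difference between the two options can be shown with a short Python sketch. The function name `select_transcripts` and the transcript identifiers and lengths are placeholders, not real annotation data.

```python
def select_transcripts(lengths, mode="longest"):
    """Choose transcripts of one gene.

    `lengths` maps a transcript identifier to its length in nucleotides.
    mode="all" returns every transcript; mode="longest" returns only the longest.
    """
    if mode == "all":
        return sorted(lengths)
    if mode == "longest":
        return [max(lengths, key=lengths.get)]
    raise ValueError("mode must be 'all' or 'longest'")


# Placeholder identifiers and lengths, for illustration only.
toy_gene = {"tx1": 4900, "tx2": 2800, "tx3": 4100}
```

With this toy input, `select_transcripts(toy_gene)` keeps only `tx1`, while `select_transcripts(toy_gene, "all")` keeps all three transcripts.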


7. What does "Prediction programs" mean?


It has become common practice to look at the miRNA interaction predictions produced by several target prediction programs and to concentrate on their intersection. The miRWalk database therefore provides miRNA-target interaction information produced by 8 different established miRNA prediction programs. Please select at least two miRNA prediction programs for the comparison of results. By default, miRanda, miRDB, miRWalk and TargetScan are selected.
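Concentrating on the intersection of several programs can be sketched as a simple counting step. This is a minimal Python illustration, not the comparison code used by miRWalk. The function name `consensus_targets`, the `min_programs` parameter and the miRNA and gene names in the toy input are placeholders.

```python
from collections import Counter


def consensus_targets(predictions, min_programs=2):
    """Keep miRNA-gene pairs predicted by at least `min_programs` programs.

    `predictions` maps a program name to the set of (miRNA, gene) pairs
    that the program predicts. Returns a dict of pair -> number of programs.
    """
    if len(predictions) < 2:
        raise ValueError("select at least two prediction programs")
    counts = Counter()
    for pairs in predictions.values():
        counts.update(set(pairs))  # count each pair once per program
    return {pair: n for pair, n in counts.items() if n >= min_programs}


# Toy input with placeholder names; these are not real predictions.
toy_predictions = {
    "miRWalk": {("miR-A", "GENE1"), ("miR-A", "GENE2")},
    "miRanda": {("miR-A", "GENE1")},
    "TargetScan": {("miR-A", "GENE1"), ("miR-B", "GENE3")},
}
shared_by_all = consensus_targets(toy_predictions, min_programs=3)
```

Setting `min_programs` to the number of selected programs gives the strict intersection; lower values give a looser "predicted by at least N programs" filter.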


8. What is minimum seed length?


The minimum seed length is the minimum number of nucleotides (nt) at the 5' end of the miRNA sequence through which the miRNA can make one or more interactions with its possible target sequences, i.e. promoter, 5' UTR, CDS and 3' UTR. The minimum seed length is 7 nt.


9. What is P-value?


The P-value is the probability of random matches of a subsequence (the 5' end of the miRNA sequence) in the given sequence (promoter or mRNA sequence), calculated using the Poisson distribution. A low probability implies a significant hit. This relationship to the Poisson distribution has been described in Sadygov, R.G. et al., 2003 and Havilio et al., 2003. By default, a P-value of 0.05 is selected.
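The exact formula used by miRWalk is not given here, so the following Python sketch shows only one common way to obtain such a Poisson-based probability. It assumes that all four nucleotides are equally likely (probability 0.25 each) and that the P-value is the chance of at least one random match. The function name `poisson_p_value` and its parameters are hypothetical.

```python
import math


def poisson_p_value(seed_length, sequence_length, base_prob=0.25):
    """Probability of at least one random seed match in a sequence.

    Assumption: each position matches independently, so the number of random
    matches is approximately Poisson with mean
    lam = (number of start positions) * base_prob ** seed_length.
    """
    positions = sequence_length - seed_length + 1
    if positions <= 0:
        return 0.0  # the sequence is too short to contain the seed
    lam = positions * base_prob ** seed_length
    # P(X >= 1) = 1 - exp(-lam); expm1 keeps precision for small lam.
    return -math.expm1(-lam)
```

Under these assumptions a 7-nt seed in a 1000-nt sequence gives a value of roughly 0.06, and a longer seed in the same sequence gives a smaller value, which is why longer matches count as more significant hits.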


10. What is "Pathway Targets" in Predicted Targets module?


The Pathway Targets page of the Predicted Targets module provides a broad search for possible miRNA binding sites in all genes that belong to a selected pathway. When the user selects a pathway from the drop-down list on the Pathway Targets search page, the search page automatically includes all the genes of the selected pathway and finds the possible miRNA binding sites.


11. What is "Chromosome Targets" in Predicted Targets module?


In Chromosome Targets, a user can search for all the possible binding sites of all miRNAs located on a selected miRNA chromosome against all protein-coding genes of a selected mRNA chromosome. This search page first selects all the miRNAs located on the selected miRNA chromosome and then searches for possible miRNA binding sites in all the protein-coding genes located on the selected mRNA chromosome.
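The two-step selection described above can be sketched in Python as a filter followed by a cross product. The function name `chromosome_pairs` and the miRNA, gene and chromosome assignments in the toy data are placeholders for illustration.

```python
from itertools import product


def chromosome_pairs(mirna_chrom, gene_chrom, mirna_locations, gene_locations):
    """List every (miRNA, gene) pair to be searched.

    `mirna_locations` and `gene_locations` map a name to its chromosome.
    Step 1 selects the miRNAs on `mirna_chrom`; step 2 pairs each of them
    with every protein-coding gene on `gene_chrom`.
    """
    mirnas = sorted(m for m, c in mirna_locations.items() if c == mirna_chrom)
    genes = sorted(g for g, c in gene_locations.items() if c == gene_chrom)
    return list(product(mirnas, genes))


# Placeholder names and chromosome assignments, not real annotation.
toy_mirnas = {"miR-A": "1", "miR-B": "1", "miR-C": "X"}
toy_genes = {"GENE1": "2", "GENE2": "2", "GENE3": "1"}
pairs_to_search = chromosome_pairs("1", "2", toy_mirnas, toy_genes)
```

Each resulting pair would then be passed to the binding-site search; here two miRNAs and two genes give four pairs.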


12. What is the current status of miRWalk database?


Currently, the Predicted Target module of miRWalk hosts predicted miRNA-target interaction information for more than 2,000 miRNAs on the complete sequence of all known genes (including mitochondrial genes) of human, mouse and rat, produced by miRWalk, together with the predictions of the 8 established miRNA-target prediction programs on the 3' UTR. In addition, it offers the predicted miRNA binding sites on genes linked to 449 human biological pathways and 2356 OMIM disorders.

Prediction program    Version or date
DIANA-microT          Version 3.0
miRanda               August 2010 release
miRDB                 April 2009
PicTar                March 2007
PITA                  August 2008
RNA22                 May 2008
RNAhybrid             Version 2.1
TargetScan            Version 5.1

In the current release of the miRWalk database, more than 7000 publications on miRNAs are documented. The Validated Target module of miRWalk hosts validated information on 2044 miRNAs from human, mouse and rat and reports more than 67598 relationships associated with 3821 genes, 375 pathways, 549 diseases, 468 organs, 74 cell lines and 2033 OMIM disorders. In addition, it presents information on the proteins known to be involved in miRNA processing. This data is updated automatically every 3 months by the execution of automated Perl scripts. This module was last updated on 15th March 2011.


13. What are the future plans of miRWalk database?


For future development, we will continue to incorporate validated and predicted miRNA-target interaction information on other species into the miRWalk database. Future plans include further improvement of the web interface as well as the incorporation of new modules for more annotation by adding new PHP scripts. An online tool for motif searches (fixed motif or regular expression search) against input sequences of interest will also be incorporated.
Please let us know if you encounter problems while using miRWalk, or if you have suggestions for improving the user interface or for new features to add to the miRWalk database. To obtain further information about miRWalk, please contact the miRWalk Team: mirwalkteam@medma.uni-heidelberg.de


14. How does miRWalk database store all the putative targets of 8 prediction programs?


All the possible targets predicted by the 8 established miRNA-target prediction programs (3rd-party algorithms) are stored in the miRWalk database with no threshold or filter applied. Currently, the miRWalk database hosts all the putative targets of the 3rd-party algorithms, both those that match the miRWalk prediction data and those that do not.
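Storing every third-party prediction and marking whether it matches a miRWalk prediction can be sketched with an in-memory SQLite database. The actual miRWalk schema is not described in this FAQ, so the table name `prediction`, its columns and the function name `third_party_with_match_flag` are assumptions made for illustration. The rows in the usage example are placeholders.

```python
import sqlite3


def third_party_with_match_flag(rows):
    """Store (mirna, gene, program) rows unfiltered and flag miRWalk matches.

    Returns one tuple (mirna, gene, program, matched) per third-party
    prediction, where matched is 1 if miRWalk predicts the same pair, else 0.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical single-table layout: one row per program and miRNA-gene pair.
    conn.execute("""
        CREATE TABLE prediction (
            mirna   TEXT NOT NULL,
            gene    TEXT NOT NULL,
            program TEXT NOT NULL,
            PRIMARY KEY (mirna, gene, program)
        )
    """)
    conn.executemany("INSERT INTO prediction VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT p.mirna, p.gene, p.program,
               EXISTS (SELECT 1 FROM prediction w
                       WHERE w.program = 'miRWalk'
                         AND w.mirna = p.mirna
                         AND w.gene = p.gene) AS matched
        FROM prediction p
        WHERE p.program <> 'miRWalk'
        ORDER BY p.gene, p.program
    """).fetchall()
    conn.close()
    return result


# Placeholder rows, not real predictions.
toy_rows = [
    ("miR-A", "GENE1", "miRWalk"),
    ("miR-A", "GENE1", "miRanda"),
    ("miR-A", "GENE2", "TargetScan"),
]
flags = third_party_with_match_flag(toy_rows)
```

Because no row is dropped at insert time, both matched and unmatched third-party predictions remain available, and the match with miRWalk is computed only when the data is queried.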