miRWalk2.0 is an improved version of the previous database (i.e. miRWalk). miRWalk2.0 is so far the only freely accessible,
comprehensive archive, supplying the biggest available collection of predicted and experimentally verified miRNA-target interactions
with various novel and unique features (missing in the previous version, miRWalk, and in other resources [2-17]) to greatly assist the miRNA research community.
Currently, it amalgamates miRNA-target interactions for human, mouse and rat. In addition, it provides miRNA-miRNA interactions for 15 species:
human, orangutan, chimpanzee, monkey, mouse, rat, pig, chicken, dog, cow, frog, zebrafish, opossum, fruitfly and worm.
miRWalk2.0 not only documents miRNA binding sites within the complete sequence of a gene, but also combines this information with a comparison of binding sites resulting from 12 existing miRNA-target prediction programs (DIANA-microTv4.0, DIANA-microT-CDS, miRanda-rel2010, mirBridge, miRDB4.0, miRmap, miRNAMap, doRiNA i.e. PicTar2, PITA, RNA22v2, RNAhybrid2.1 and Targetscan6.2) to build novel comparative platforms of binding sites for the promoter (4 prediction datasets), CDS (5 prediction datasets), 5'-UTR (5 prediction datasets) and 3'-UTR (13 prediction datasets) regions. It also documents experimentally verified miRNA-target interaction information collected via an automated text-mining search, together with data obtained from existing resources that offer such information (miRTarBase, PhenomiR2.0, miR2Disease and HMDD). A total of 13,650 publications are documented on validated miRNA-target interactions. It documents experimentally validated interactions on 3,081 miRNAs and reports more than 151,666,930 relationships associated with 19,395 genes, 1,955 DOs, 12 gene classes, 4,371 GOBPs, 1,331 GOMFs, 715 GOCCs, 6,463 HPOs, 4,087 OMIM disorders, 546 pathways, 28 protein classes, 450 diseases, 671 organs and 87 cell lines. In addition, it presents information on proteins known to be engaged in miRNA processing.
miRWalk2.0 was developed with the aim of providing a public resource that supplies putative as well as experimentally verified miRNA interactions associated with the complete sequence of genes, mitochondrial genomes, other miRNAs, pathways, gene, disease and human phenotype ontologies, OMIM disorders, classes, cell lines and organs. The structure of miRWalk2.0 can be broadly divided into four sections: putative miRNA-target interactions, validated miRNA-target interactions, functional annotation and the web-interface. Briefly, first, all the genomic sequences (promoter, mitochondrial and miRNA) were downloaded and five prediction algorithms were executed locally to generate putative miRNA binding sites within the downloaded sequences. In parallel, 8 prediction datasets were gathered from existing resources and merged with the findings of the locally executed algorithms. Thereafter, these miRNA binding sites were separated into 6 different lists. Second, the experimentally verified miRNA-target interactions were retrieved via an automated text-mining survey in PubMed, together with data obtained from four databases that host such information. Third, functional annotation information such as pathways, ontologies and diseases was obtained to further dissect the putative as well as verified miRNA-target interactions. In the last step, the web-interface was designed to host the collated information, which is stored in a MySQL database (miRWalk2.0). The web-interface of miRWalk2.0 has two modules (the Predicted Target and Validated Target modules), which can be interrogated to acquire miRNA-target interactions for human, mouse and rat. Moreover, external links have been integrated with the result pages, allowing users to obtain more annotation and information on queried genes, miRNAs, pathways, ontologies, and/or diseases.
For more than a decade, attempts to study the interaction of miRNAs with their targets were limited to the mRNA 3'-UTR region. However, several investigators have recently suggested an alternative mode of gene regulation in which miRNAs anneal within the promoter, CDS, 5'- and/or 3'-UTR regions of their targets, thereby regulating their translation [18-21]. Therefore, it is of paramount importance to search for possible miRNA binding sites within the complete sequence (promoter, 5'-UTR, CDS and 3'-UTR) of a gene.
In order to support such interactions, miRWalk2.0 offers possible miRNA interactions with all the regions of a gene by gathering 13 prediction datasets from existing miRNA-target resources [1-13]. These 13 prediction datasets are preprocessed and unified, and the processed information is used to build novel comparative platforms of miRNA interactions, enabling users to access new targets on the promoter, CDS, 5'- and/or 3'-UTR regions.
The web-interface of miRWalk2.0 is broadly classified into the Predicted Target (PTM) and the Validated Target (VTM) modules.
These two modules are further categorized into different search pages, allowing users to fetch miRNA associated information using different identifiers.
Search methods implemented under the PTM: Gene-miRNA Targets search page
Step1. Select a species, database and input identifier type from the given drop-down menus (Figure 2) and either paste or upload a list of identifiers.
Step2. Select at least one check box to obtain information on input identifiers and their functional association.
Step3. Select the starting position (from 1 to 6) of the miRNA seed and the region(s) of the input genes in which to search for possible miRNA binding sites (a maximum of 10kb, i.e. 10,000 bases, is allowed for the promoter region); enter the minimum seed length of the miRNA and/or a P-value; and choose at least two algorithms to obtain a comparative overview of miRNA binding sites resulting from 13 different prediction datasets within the promoter, 5'-UTR, CDS and 3'-UTR regions.
Step4. Click on the "SEARCH" button to execute the query.
In the Gene-miRNA Targets search, a tabulated result page (Figure 3) is presented with links for each gene, allowing users to retrieve data including gene information (Figure 4), genomic location (Figure 5), gene synonyms, RefseqIDs and homology information (Figure 6), external links (Figure 7), information on gene and protein classes (Figure 8), functional association (Figure 9) and miRNA binding sites predicted with different combinations of algorithms (Figure 10). Additionally, information on human homologous genes across 15 species can be downloaded to conduct an interspecies analysis (Figure 6). External links (Figure 7) are also provided, permitting the user to obtain data on phenotype, genotype, SNPs, splice junctions, functional networks, neighbouring genomic members, expression of genes and proteins in human organs, their MS/MS spectra and relevant PubMed articles. This page offers a one-stop place to collect an abundance of information on queried genes (Figure 3).
By clicking on the GeneTab link (Figure 3), a user can gather basic information (such as EntrezIDs, chromosome, map and definition) on queried genes (Figure 4), which can be downloaded with a single click on the "Download Table" link. The contents of this result table are hyperlinked to external databases (Gene and Taxonomy at NCBI) to obtain further information.
By clicking on the "Gene Location" link (Figure 3), one can obtain information on genomic location (such as ContigID, start and end positions, chromosome, map and strand) and epigenomics for queried identifiers (Figure 5), which can be downloaded with a single click on the "Download Table" link. This table is hyperlinked to external databases (Gene, Nucleotide and Epigenomics at NCBI) to obtain additional information.
By clicking on the "Synonymous", "Refseq Table" and "Homologous Table" links (Figure 3), a user can collect information on synonyms (such as genes, synonyms, EnsemblIDs, RefseqIDs, UCSCIDs, VegaIDs, UniGeneIDs, LocusTagIDs, RefseqPIDs, HGNCIDs, UniProtIDs, OMIMIDs and UniSTS), mRNAs (RefseqIDs, CDS start and end positions and the length of mRNAs) and homologs (a comprehensive atlas of human homologous genes among 15 different species) of queried genes (Figure 6). These tables can be downloaded with a single click on the "Download Table" links. They are hyperlinked to Gene and Nucleotide (Refseq) at NCBI to obtain additional information.
By clicking on the "External links" (Figure 3), users can obtain information on their genes of interest from several databases via the given links (Figure 7). These external databases are UniGene, HGNC, OMIM, Ensembl, UCSC, AceView, DGV, CCDS, Genotype, ClinVar, dbVar, PheGenI, GeneMania, Nucleotide, EST, Probe, Protein, CDD, GEO, ProteomicsDB, Human Proteome Map (HPM), UniProt, PubChem Compound, PMC and PubMed. All these external links can also be downloaded by clicking on the "Download Table" link.
Users can retrieve information on gene and protein classes (Figure 8) associated with their input identifiers by clicking on "Gene" and "Protein" classes links (Figure 3). Moreover, a comparative overview of protein and gene classes among 15 different species can be viewed and/or downloaded. The "Gene" and "Class" fields are also hyperlinked with external databases (Gene and Panther) to obtain further information on input genes and their protein classes.
One can fetch information on pathways and ontologies associated with queried identifiers (Figure 9) by clicking on "KEGG", "WIKI", "Panther", "GOBP", "GOMF" and "GOCC" links (Figure 3). Moreover, comparative overviews of pathways and ontologies among 15 different species can be viewed and/or downloaded. The "Gene", "KEGG", "Wiki", "Panther" and "GO" fields are also hyperlinked with external databases (Gene, KEGG, WikiPathways, Panther and Gene Ontology) to obtain further information.
One can obtain the possible miRNA binding sites within the complete sequence of genes resulting from the miRWalk algorithm and 12 other prediction datasets (Figure 10) by clicking on the Promoter, 5'-UTR, CDS and 3'-UTR links integrated on the result page (Figure 3). "Green" and "red" cells in the comparative platforms indicate whether a given miRNA-target interaction is "predicted" or "not predicted", respectively. Moreover, these tables can be downloaded at any time by clicking on the "Download Table" links. The "Gene", "RefseqID" and "miRNA" fields are also hyperlinked with external databases (Gene, Nucleotide and miRBase, respectively) for further annotation.
The MicroRNA-target search page (Figure 11) is organized similarly to the "Gene-based" interface (Figure 2). A user can carry out "miRNA-based" searches by selecting a species, database and type of identifier; providing miRNA identifiers; picking result tables; selecting search parameters such as promoter, 5'-UTR, p-value and external databases; selecting functional annotations; and clicking on the "SEARCH" button to execute the query (as shown in Figure 11).
The "miRNA-based" result page (Figure 12) is organized in a similar manner to the "Gene-based" result page (Figure 3). It hosts a multi-layered view of information on input miRNAs: sequences, accessions, families, other miRNAs having similar seeds, sequence alignments, host genes and other necessary data, including lists of putative targets and statistically enriched pathways, ontologies, and gene and protein classes (Figures 12 to 14). Tables integrated under the "miRNA-based" result page are hyperlinked with the miRBase, Gene, KEGG, WikiPathways, Panther, GO and Taxonomy databases to gather further annotation data.
Users can collect information on their miRNAs of interest, for example which other miRNAs have a similar sequence or similar seeds, data on their families along with their identities, alignments of family members and miRNA host-gene information (Figure 13). These tables are hyperlinked with miRBase for further annotation of queried miRNAs.
Users can assemble data such as pre-miRNAs, pre-miRNA aligned profiles, identity of pre-miRNA aligned profile, possible targets predicted by 13 different prediction data-sets and enriched miRNAs within different pathways, ontologies and classes (Figure 14) on their miRNAs of interest. These tables are hyperlinked with miRBase, Gene, Refseq, KEGG, WikiPathways, Panther and GO for further information.
Recently, miRNAs have been shown to base-pair with other miRNAs. These observations may not only help to understand the complexity of regulatory networks, but may also open new avenues to better understand how these regulators fine-tune each other to maintain the integrity of a cell. Nonetheless, this information is missing from existing resources. It has therefore been generated and integrated into miRWalk2.0 through a "miRNA-miRNA-based" search page (Figure 15). Using this search, one can gather basic information such as miRNA identifiers, sequences, alignments, miRNA host genes and miRNA-miRNA binding sites predicted by the miRWalk algorithm. In addition, a comprehensive platform offers a comparative overview of miRNA-miRNA binding sites (Figure 15) for queried miRNAs. These tables are hyperlinked with miRBase for further information.
Using the "Gene-miRNA-pathway Targets" or "Pathway information retrieval system" search, the users can gather putative miRNA binding sites within the complete sequence of all genes belonging to one or more queried pathways (a maximum of 10 is allowed). In addition, it is also possible to obtain a list of genes associated with a given pathway and to collect miRNAs which are enriched for their binding sites within these pathways (Figure 16). These tables are hyperlinked with miRBase, Gene, Refseq, KEGG and WikiPathways for further information.
The other search methods ("Gene-class Targets", "Chromosome Targets", "Gene-miRNA-OMIM Targets", "Disease Targets" and "Human Phenotype Ontologies (HPO) Targets") are organized in a similar manner to the "Gene-miRNA-Pathway Targets" search and result pages.
Using the "Mitochondrial Targets" search page, one can fetch putative miRNA binding sites within the complete mitochondrial genome by selecting a species of interest from the drop-down menu (Figure 17). The result page of Mitochondrial Targets is organized similar to other result pages. Users can obtain information on mitochondrial genes, their association with pathways and putative miRNA binding site predictions as well as a comparative view resulting from 5 different prediction datasets (Figure 17).
Large-scale experiments, such as next-generation sequencing or transcriptomic profiling, produce large amounts of data (> 1,000 significant genes/miRNAs). Still, no single miRNA resource is available that either allows users to perform functional enrichment analysis on all the significant candidates at once or supplies a way to download customized datasets for stand-alone tools, e.g. GSEA and DAVID. To foster large-scale enrichment analysis, a novel feature named "Customized data-sets" is implemented within miRWalk2.0, through which users can generate a customized list of putative targets for their miRNAs of interest from 13 different datasets for the promoter, CDS, 5'- and/or 3'-UTR regions (Figure 18).
Previous studies suggest that several mammalian miRNA genes are co-expressed with their host gene and/or neighbouring genes by utilizing their transcriptional machinery, and promote synergistic and/or antagonistic effects on them (Figure 19). miRWalk2.0 provides a genomic location search for genes to determine which miRNA(s) share the same or a nearby location (Figure 19). A list of disease-specific or significant genes obtained from a microarray profiling study can be interrogated to find miRNAs that may be co-expressed with the queried genes and could be involved in the genetic regulation of a specific condition. Further, one can use this information to choose miRNAs that are located near or within highly differentially regulated genes (Figure 19) and perform qPCR experiments to validate potential miRNAs without relying on miRNA microarray profiling studies.
Moreover, it is possible to obtain a list of all miRNAs located within the exon, intron, 5'- and/or 3'-UTR regions of the human, mouse and rat genomes by selecting the check-boxes given on the Genomic Location Search page (Figure 19).
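The location-based lookup described above can be pictured as a simple interval-overlap test. The following is only a minimal sketch and not the miRWalk2.0 implementation: the `Feature` record, the `nearby_mirnas` function, the `flank` parameter and all names and coordinates are hypothetical and serve purely as illustration.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A genomic feature (gene or miRNA) with hypothetical coordinates."""
    name: str
    chrom: str
    start: int
    end: int


def nearby_mirnas(gene, mirnas, flank=0):
    """Return names of miRNAs overlapping the gene, extended by `flank` bases."""
    lo, hi = gene.start - flank, gene.end + flank
    return [
        m.name
        for m in mirnas
        if m.chrom == gene.chrom and m.start <= hi and m.end >= lo
    ]


# Toy data, invented for illustration only.
gene = Feature("GENE_A", "chr1", 1000, 5000)
mirnas = [
    Feature("mir-x", "chr1", 2000, 2080),  # inside the gene
    Feature("mir-y", "chr1", 5500, 5580),  # 500 bases downstream
    Feature("mir-z", "chr2", 2000, 2080),  # different chromosome
]

within = nearby_mirnas(gene, mirnas)
within_1kb = nearby_mirnas(gene, mirnas, flank=1000)
```

With `flank=0` only miRNAs located within the gene are reported; a positive `flank` also reports neighbouring miRNAs.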
According to the current understanding, a new mode of action of miRNAs has been shown, through which they may regulate gene expression by binding to the promoter as well as to the coding sequence [18-21]. Therefore, it is of paramount importance to search for possible miRNA binding sites within the complete sequence (promoter, 5'-UTR, CDS and 3'-UTR) of a gene.
In order to incorporate such interactions, we have generated possible miRNA interactions with all the regions of a gene by gathering 13 prediction datasets from existing miRNA-target resources [1-13]. These interactions are documented in miRWalk2.0, enabling users to access new targets on the promoter, CDS, 5'- and/or 3'-UTR regions.
Yes, miRWalk2.0 integrates all transcripts encoded by a gene, as it has previously been shown that a gene can encode different transcripts of different lengths due to alternative splicing. For example, the TP63 gene is known to encode six different transcripts with varying lengths of the 5'-UTR, CDS and 3'-UTR regions.
After scanning the complete sequence of all genes/miRNAs (including mitochondrial genomes) of
human, mouse and rat for possible miRNA binding sites using the "miRWalk algorithm", the prediction datasets resulting from 12
databases are gathered to build novel comparative platforms to compare results. Indeed, it has become a common practice to consider union and/or intersection of miRNA-target interactions resulting from
multiple algorithms [50-56]. Therefore, miRWalk2.0 supplies novel platforms of miRNA-target interaction information on the promoter, 5'-UTR, CDS, 3'-UTR, mitochondrial genomes and miRNA-miRNA pairs.
It is important to select at least two algorithms with logical operators (OR or AND) to obtain a comparative view.
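The effect of the logical operators can be illustrated with plain set operations. The snippet below is a minimal sketch under stated assumptions: the `combine` function, the dictionary layout and the example miRNA-gene pairs are hypothetical toy data, not output of miRWalk2.0 or of the named programs.

```python
# Hypothetical toy predictions: algorithm name -> set of (miRNA, gene) pairs.
predictions = {
    "miRWalk": {("miR-a", "GENE1"), ("miR-a", "GENE2")},
    "miRanda": {("miR-a", "GENE1"), ("miR-b", "GENE3")},
    "Targetscan": {("miR-a", "GENE1"), ("miR-a", "GENE2")},
}


def combine(predictions, algorithms, operator="AND"):
    """Combine the predictions of at least two algorithms.

    "AND" keeps interactions predicted by every selected algorithm
    (intersection); "OR" keeps interactions predicted by any of them (union).
    """
    if len(algorithms) < 2:
        raise ValueError("select at least two algorithms")
    sets = [predictions[name] for name in algorithms]
    if operator == "AND":
        return set.intersection(*sets)
    if operator == "OR":
        return set.union(*sets)
    raise ValueError("operator must be 'AND' or 'OR'")


both = combine(predictions, ["miRWalk", "miRanda"], "AND")
either = combine(predictions, ["miRWalk", "miRanda"], "OR")
```

"AND" acts as the more stringent filter, since an interaction must be supported by all selected algorithms.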
The minimum seed length is the minimum number of nucleotides (nt) of the miRNA seed sequence (from the 5' end) through which a miRNA can bind to its targets, i.e. promoter, 5'-UTR, CDS, 3'-UTR and/or miRNA.
It is not possible to search for miRNA binding sites with a seed shorter than 7nt. Therefore, a user should enter at least 7 in the given text box.
The probability distribution of random matches of a subsequence (from the 5' end of the miRNA sequence) in a given sequence (gene, miRNA and/or mitochondrial genome sequence)
is calculated by using the Poisson distribution, where a low probability implies a significant hit. More information on the Poisson distribution is given in [1-13].
The default p-value is set to 0.05.
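As a rough illustration of how a Poisson-based probability of random matches can be obtained, the sketch below assumes that all four bases are equally likely and independent, so that the expected number of random matches of a k-nt subsequence in a sequence of length L is λ = (L − k + 1) · 0.25^k. This model, the function name `poisson_p_value` and its parameters are assumptions made for illustration; the exact calculation used by miRWalk2.0 is described in the cited references and may differ.

```python
import math


def poisson_p_value(seq_length, match_length, observed_hits=1, base_prob=0.25):
    """Probability of seeing at least `observed_hits` random matches.

    Assumes independent, equally likely bases (an illustrative simplification).
    """
    positions = seq_length - match_length + 1
    if positions <= 0:
        return 0.0  # the sequence is too short to contain any match
    lam = positions * base_prob ** match_length  # expected random matches
    # P(X >= n) = 1 - sum_{i < n} exp(-lam) * lam^i / i!
    below = sum(
        math.exp(-lam) * lam ** i / math.factorial(i)
        for i in range(observed_hits)
    )
    return 1.0 - below


p_7mer = poisson_p_value(seq_length=1000, match_length=7)
p_10mer = poisson_p_value(seq_length=1000, match_length=10)
```

Under this model a longer match is less likely to arise by chance, so `p_10mer` is smaller than `p_7mer`, in line with the statement that a low probability implies a significant hit.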
We sincerely acknowledge all publicly available data sources (listed below) which have been used in miRWalk2.0.
Information on genes and their synonyms, identifiers and sequences
|NCBI||April 2014||symbols & identifiers|
|Refseq||61||mRNA sequences|
|Organelle Genome Resources||July 2014||Mitochondrial genome|
|Ensembl||May 2014||Promoter (10kb upstream flanking region)|
|HomoloGene||May 2014||Homologous genes|
|MGI||April 2014||genes & identifiers|
|RGD||April 2014||genes & identifiers|
|HGNC||April 2014||genes & identifiers|
Information on miRNAs and their synonyms and identifiers
|miRBase||Release 10 to 20||miRNAs, synonyms & sequences (only rel20)|
|NCBI||April 2014||names & EntrezIDs|
|MGI||April 2014||Identifiers|
|RGD||April 2014||Identifiers|
|HGNC||April 2014||Identifiers|
|EMBL||April 2014||Identifiers|
|RFAM||April 2014||Identifiers|
Functional annotation information
|DAVID||6.7||KEGG pathways and their gene-sets|
|PANTHER||9||Pathways, protein classes and their gene-sets|
|WikiPathways||April 2014||Pathways and their gene-sets|
|GSEA||2.0.14||Gene classes and their gene-sets|
|DGV||July 2013||CNV genes|
|GO||April 2014||GOBP, GOMF, GOCC and their gene-sets|
|OMIM||July 2014||OMIM disorders and their gene-sets|
|HPO||Build 553||Human Phenotype Ontologies and their gene-sets|
|DO||September 2014||Diseases and their gene-sets|
Putative miRNA-target interaction information
|Diana-microT||4.0 and 5.0||miRNA binding sites within 3'-UTR|
|Diana-microT-CDS||5.0||miRNA binding sites within CDS|
|miRanda||August 2010||Locally executed to identify miRNA binding sites within the complete sequence|
|miRBridge||4.0||miRNA binding sites within 3'-UTR|
|miRDB||4.0||miRNA binding sites within 3'-UTR|
|miRMap||2013||miRNA binding sites within 3'-UTR|
|miRNAMap||2008||miRNA binding sites within 3'-UTR|
|doRiNA (PICTAR2)||Version 2||miRNA binding sites within 3'-UTR|
|PITA||2007||miRNA binding sites within CDS, 5'- & 3'-UTR, and miRNA (locally executed)|
|RNA22||version 2||miRNA binding sites within CDS, 5'- & 3'-UTR|
|RNAhybrid||2.1||Locally executed to identify miRNA binding sites within the complete sequence|
|Targetscan||6.1||Locally executed to identify miRNA binding sites within the complete sequence|
Validated miRNA-target interaction information
|miRTarBase||4.0||miRNA-target interactions|
|PhenomiR||2.0||miRNA-disease interactions|
|miR2Disease||2008||miRNA-disease interactions|
|HMDD||2.0||miRNA-disease interactions|
|PubMed||September 2014||miRNA interactions with miRNAs, genes, diseases, cell lines, organs, processing proteins|
We sincerely acknowledge all the useful data sources (see the table below) which have been hyperlinked with miRWalk2.0.
|PubChem Compound||Click Here|
Currently, the PTM of miRWalk2.0 hosts putative interaction information between more than 11,740 miRNAs and genes, miRNAs,
mitochondrial genomes of human, mouse, and rat resulting from 13 different prediction datasets. In addition, it supplies the predicted miRNA binding sites
on genes linked to biological pathways, gene ontologies, diseases, OMIM disorders, human phenotype ontologies, gene and protein classes.
In the VTM, more than 13,650 publications on miRNAs are documented. This module documents experimentally validated interactions on 3,081 miRNAs and reports more than 151,666,930 relationships associated with 19,395 genes, 1,955 DOs, 12 gene classes, 4,371 GOBPs, 1,331 GOMFs, 715 GOCCs, 6,463 HPOs, 4,087 OMIM disorders, 546 pathways, 28 protein classes, 450 diseases, 671 organs and 87 cell lines. In addition, it presents information on proteins known to be engaged in miRNA processing. This module was last updated on 29th September 2014.
|Gene ontologies (GO)||7,506||5,441||5,447||49,035|
|Disease ontologies (DO)||2,035||NA||NA||2,035|
|Human Phenotype ontologies (HPO)||6,727||NA||NA||6,727|
|Promoter (5 algorithms)||146,354,554||123,529,954||38,037,935||307,922,443|
|5'-UTR (5 algorithms)||71,409,379||18,621,336||3,663,590||93,694,305|
|CDS (5 algorithms)||143,634,119||25,667,955||16,594,605||185,896,679|
|miRNA-miRNA (5 algorithms)||2,116,365||1,559,205||579,638||9,747,305|
|Mitochondrial (5 algorithms)||30,372||20,745||7,805||58,922|
|category||Total (N)||Interactions||Articles||Genes (14 species)||miRNAs (14 species)|
More annotations and additional species will be integrated to further expand this resource.
Please let us know if you encounter problems while using miRWalk2.0, or if you have suggestions for improving the user interface or for incorporating new features into this resource.
To obtain further information about miRWalk2.0, please contact: Dr. Harsh Dweep at Harsh.Dweep@medma.uni-heidelberg.de
All possible targets predicted by the established miRNA-target prediction programs (3rd party algorithms), with no threshold or filter applied, are stored in the miRWalk database. Currently, miRWalk2.0 hosts all the putative targets of 3rd party algorithms, both those that match the miRWalk prediction data and those that do not.
In 2011, we developed the miRWalk algorithm to identify all possible interactions between miRNA and gene sequences.
Briefly, based on Watson-Crick complementarity, it walks along the complete gene sequence and mitochondrial genome with a starting miRNA seed of 7nt (heptamer),
identifies possible miRNA binding sites, extending each match as far as complementarity allows, on the complete sequence of all known genes, and returns all the identified binding sites.
It then assigns these miRNA binding sites to four regions (Promoter, 5'-UTR, CDS and 3'-UTR) of protein coding genes and mitochondrial genes.
In addition, the probability distribution of random matches of a subsequence (from the 5' end of the miRNA sequence) in the analyzed sequence is calculated by using the
Poisson distribution. In the next step, miRWalk compares its identified miRNA binding sites with the results of 8 established
miRNA-target prediction programs, i.e. DIANA-microT, miRanda, miRDB, PicTar, PITA, RNA22, RNAhybrid and TargetScan/TargetScanS.
Finally, it incorporates all the predicted miRNA binding sites produced by the miRWalk algorithm and the 8 established programs into a relational database (miRWalk).
Thereafter, it performs an automated text-mining search in the titles/abstracts of PubMed to retrieve the experimentally verified information on human, mouse and rat miRNAs and their interactions linked to
genes, pathways, diseases, organs, cell lines, OMIM disorders, and proteins known to be involved in miRNA processing.
This information is compiled and stored as experimentally verified miRNA-target interactions in the miRWalk database.
Information on the predicted as well as validated miRNA-target interactions is generated by executing automated Perl and BioPerl scripts on the server of bwGRID Cluster Heidelberg (High Performance Cluster). Please read Dweep et al.  for more information on the miRWalk algorithm.
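The seed-based walk summarised above can be sketched in a few lines. This is an illustrative re-implementation in Python under stated assumptions, not the authors' Perl/BioPerl code: the function names, the return format and the toy sequences are hypothetical, only Watson-Crick pairs are considered, and the extension rule (grow the match towards the 3' end of the miRNA while bases keep pairing) is one reading of the description.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_sites(mirna, target, seed_start=0, min_seed=7):
    """Return (target_start, match_length) for every seed match in `target`.

    `seed_start` is the 0-based offset of the seed from the miRNA 5' end.
    """
    if min_seed < 7:
        raise ValueError("the minimum seed length is 7 nt")
    mirna = mirna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    seed = mirna[seed_start:seed_start + min_seed]
    if len(seed) < min_seed:
        raise ValueError("miRNA is too short for the requested seed")
    site = reverse_complement(seed)  # what the target must contain
    sites = []
    pos = target.find(site)
    while pos != -1:
        left, m = pos, seed_start + min_seed
        # Extend: the next miRNA base pairs with the target base upstream.
        while left > 0 and m < len(mirna) and \
                target[left - 1] == mirna[m].translate(COMPLEMENT):
            left -= 1
            m += 1
        sites.append((left, m - seed_start))
        pos = target.find(site, pos + 1)
    return sites


# Toy sequences, invented for illustration only.
hits = find_sites("AAACCCGGGU", "GGCCGGGUUUAA")
```

Each hit reports where the paired stretch starts on the target and how many consecutive miRNA nucleotides are paired; assigning hits to the promoter, 5'-UTR, CDS or 3'-UTR would then only require the region boundaries of the transcript.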
All the customized datasets are available for download in two popular, ready-to-use file formats (Rdata and GMT) via the Holistic view search page implemented under the PTM of miRWalk2.0.
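For readers unfamiliar with GMT, it is a tab-delimited text format in which each line holds a gene-set name, a description and then the member genes. The sketch below shows how a custom target list could be serialised in that layout; the `gmt_text` function, the set name and the gene symbols are hypothetical placeholders, and the files offered by miRWalk2.0 may use different naming.

```python
def gmt_text(gene_sets, description="NA"):
    """Serialise {set_name: genes} as GMT: name, description, genes (tab-separated)."""
    lines = []
    for name, genes in gene_sets.items():
        fields = [name, description, *sorted(genes)]
        lines.append("\t".join(fields) + "\n")
    return "".join(lines)


# Hypothetical example: putative 3'-UTR targets of one miRNA.
example = gmt_text({"miR-a_3UTR": {"GENE2", "GENE1"}})

# To save the result for a stand-alone enrichment tool:
# with open("custom_targets.gmt", "w", encoding="utf-8") as handle:
#     handle.write(example)
```

Sorting the genes keeps the output reproducible between runs.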
Many computational approaches have been developed by considering different searching rules (e.g., base-pairing, thermodynamic stability,
conservation and cooperativity, and multiplicity of miRNA binding sites) to identify possible miRNA-target interactions [1-4].
These algorithms have proven to be useful; however, comparative investigations carried out with these algorithms suggest
that no program is consistently superior to all others [5-6]. Therefore, to overcome this issue, researchers have started
focusing on the prediction information generated by combination of different programs [7-9].
Moreover, this approach has become very popular and has been applied in hundreds of publications [7-9].
Therefore, in order to further explore whether considering different combinations of algorithms acts as a stringent filter,
we have estimated the median number of targets within the 3'-UTR region for human (hsa), mouse (mmu) and rat (rno) with different numbers of algorithms (e.g., at least 2 to 10).
By considering only those interactions predicted by at least 2 algorithms, the median numbers of targets found for hsa, mmu and rno were 7967, 7793.5 and 3857, respectively.
Interestingly, a rapid decrease in the median values was observed with an increase in the number of algorithms (Figure 20a).
For example, by considering at least 4 algorithms, the median values decreased to 3865 (hsa), 2724 (mmu) and 1146 (rno).
When using at least 6 algorithms, the median values decreased further to 1000, 1504 and 66 for hsa, mmu and rno, respectively.
Figure 20. Reduction of the median number of targets within the (a) 3'-UTR, (b) promoter, (c) 5'-UTR and (d) CDS regions as the number of algorithms is increased.
Similarly, this filtering criterion was also applied to the other regions, i.e. the promoter (2kb) (Figure 20b), 5'-UTR (Figure 20c) and CDS (Figure 20d), to find out how the median values change. The median values were found to decrease with an increasing number of algorithms in a similar fashion to that observed for the 3'-UTR region. For instance, the median values for the promoter, 5'-UTR and CDS for human with at least 2 algorithms were 5505, 2217 and 10279, respectively; after increasing the number of algorithms to at least 3, the values rapidly decreased to 3041, 608 and 4720. Therefore, these observations suggest that combining different algorithms can work as a stringent filter to reduce the number of target genes for one or more miRNAs.
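The kind of summary reported above (median number of targets per miRNA when requiring support from at least N algorithms) can be reproduced on any prediction table. The following is a minimal sketch with hypothetical names and toy data; it counts a miRNA that loses all of its targets at a given threshold as having zero targets, which is an assumption and not necessarily how the published medians were computed.

```python
from collections import Counter
from statistics import median


def median_targets(predictions, min_algorithms):
    """Median number of target genes per miRNA supported by >= min_algorithms."""
    # Count how many algorithms predict each (miRNA, gene) pair.
    support = Counter(pair for pairs in predictions.values() for pair in pairs)
    targets = {}
    for (mirna, gene), n_algorithms in support.items():
        targets.setdefault(mirna, set())
        if n_algorithms >= min_algorithms:
            targets[mirna].add(gene)
    return median(len(genes) for genes in targets.values())


# Toy data, invented for illustration only.
toy = {
    "algo_A": {("miR-a", "G1"), ("miR-a", "G2"), ("miR-b", "G1")},
    "algo_B": {("miR-a", "G1"), ("miR-a", "G2"), ("miR-b", "G3")},
    "algo_C": {("miR-a", "G1")},
}

medians = {n: median_targets(toy, n) for n in (1, 2, 3)}
```

In this toy table the median falls as the threshold rises, mirroring the trend described for Figure 20.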
Additionally, several studies have demonstrated that a considerable number of miRNAs co-target the 3'-UTR and the CDS or 5'-UTR region [10-13]. For example, Lee et al. showed that reporter constructs containing miRNA binding sites in both the 5'-UTR and 3'-UTR are down-regulated to a greater extent than those harboring a 3'-UTR site alone. Fang et al. reanalyzed previously published studies and observed that genes harboring miRNA binding sites within both regions (CDS and 3'-UTR) show significantly stronger regulation than those having sites in the 3'-UTR only. These observations were reconfirmed in another study [12], whose authors also found that some miRNAs (especially those related to the cell cycle) appear to anneal preferentially to the CDS region, which they found to be effective in rapid inhibition of translation.
Hence, these findings can also be applied as an additional filter to further reduce the number of target genes per miRNA. The steps needed to reduce the number of target genes are depicted in Figure 22. Briefly, first, one can obtain the miRNA binding sites within the promoter, 5'-UTR, CDS and/or 3'-UTR regions by applying the multiple-algorithm approach (as described in Figures 1-4). The second step is to combine these sites according to the region(s) of interest to collect co-target sites (only within 5'-UTR+3'-UTR and/or CDS+3'-UTR). In the final step, one can carry out an overrepresentation analysis of co-target sites within the genes of interest. This enrichment analysis will further decrease the number of miRNAs to a few potential candidates.
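The text does not name a specific statistic for the overrepresentation analysis. One commonly used choice is the one-sided hypergeometric test, sketched below as an assumption rather than as the method used by miRWalk2.0; the function name, parameter names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_p(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric distribution.

    population: all genes considered
    successes:  genes with co-target sites for the miRNA
    draws:      size of the gene list of interest
    observed:   genes of interest that carry co-target sites
    """
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(population, draws)


# Toy numbers: 10 genes in total, 3 carry co-target sites, and all 3 genes
# in a list of 3 genes of interest carry such sites.
p = hypergeom_p(population=10, successes=3, draws=3, observed=3)
```

A small p-value indicates that the miRNA's co-target sites occur in the gene list more often than expected by chance; with many miRNAs tested, a multiple-testing correction would normally follow.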
To further complement the information hosted by the comparative platform of miRWalk2.0, ~13 million interactions gathered from CLIP datasets are integrated through additional tables, which display validation information (how many of the predicted interactions are already verified and documented in miRTarBase and/or CLIP datasets) for putative gene-miRNA interactions (Figure 21). Moreover, these interactions are available for download in two formats (Rdata and GMT files) to enable stand-alone, large-scale overrepresentation analysis. Information on the holistic view of these datasets can also be downloaded via the Holistic.html page implemented under the Predicted Target module of miRWalk2.0.
In order to reduce the number of putative target genes for miRNAs of interest, the steps below can be followed (Figure 22).
Step 1. Collect information on target genes (by considering at least 2 algorithms) having binding sites for the miRNAs of interest within the mRNA 5'-UTR, CDS and 3'-UTR regions (as described in Figures 11-14) via the microRNA information retrieval system or Holistic.html implemented under the PTM of miRWalk2.0.
Step 2. Compile the information obtained from Step 1 and create separate lists (files) of target genes having binding sites for the miRNAs of interest within different combinations of regions, i.e. 5'-UTR+CDS, 5'-UTR+3'-UTR and CDS+3'-UTR.
Step 3. Subject all the files resulting from Step 2 to stand-alone enrichment analyses and/or map them to experimentally verified data (validated target genes and/or CLIP datasets). For further help, please contact Dr. Harsh Dweep at Harsh.Dweep@medma.uni-heidelberg.de.
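Step 2 above amounts to intersecting per-region target lists. A minimal sketch follows, assuming the Step 1 results have already been loaded into a dictionary that maps each region to a set of gene symbols; the function name, dictionary layout and gene symbols are hypothetical.

```python
REGION_PAIRS = [
    ("5'-UTR", "CDS"),
    ("5'-UTR", "3'-UTR"),
    ("CDS", "3'-UTR"),
]


def co_target_lists(targets_by_region):
    """Return genes targeted in both regions for each region combination."""
    return {
        f"{a}+{b}": targets_by_region.get(a, set()) & targets_by_region.get(b, set())
        for a, b in REGION_PAIRS
    }


# Toy target lists for one miRNA, invented for illustration only.
toy_targets = {
    "5'-UTR": {"GENE1", "GENE2"},
    "CDS": {"GENE2", "GENE3"},
    "3'-UTR": {"GENE1", "GENE2", "GENE4"},
}

co_targets = co_target_lists(toy_targets)
```

Each resulting gene list can then be written to its own file and passed to the enrichment or validation mapping described in Step 3.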