Wang et al.

Text summary

Supplementary information: Content of database; Querying, searching and visualization; Data sources; Implementation; Supplementary Figures 1 and 2.

Content of the database

We collected literature indexed in the PubMed database and published before September 2012. Using keyword combinations, we automatically screened thousands of abstracts and full-text articles with in-house scripts. The relevant hits were then inspected manually. In total, more than 370 publications were reviewed and 1149 curated entries were documented (807 lncRNA-associated, 229 miRNA-associated, 13 piRNA-associated and 100 snoRNA-associated) for three mammals (866 Homo sapiens-associated, 251 Mus musculus-associated and 32 Rattus norvegicus-associated entries) (Table 1). These ncRNA-disease entries cover 224 non-redundant lncRNAs, 100 non-redundant miRNAs, 2 non-redundant piRNAs and 8 non-redundant snoRNAs associated with 175 disease terms.

In the current version of MNDR, each entry contains detailed information on one ncRNA-disease relationship, including RNA category, species, ncRNA symbol, disease, tissue, interaction gene symbol, expression direction of the ncRNA, a literature reference and a detailed description.

To help researchers access information from external resources, we linked lncRNAs to the lncRNAdb database (www.lncrnadb.com/) or the Functional lncRNA database (www.valadkhanlab.org/database/); miRNAs to the miRBase database (www.mirbase.org); and snoRNAs to sno/scaRNAbase (bioinfo.fudan.edu.cn/snoRNAbase.nsf) or the snoRNA-LBME-db database (www-snorna.biotoul.fr). Correspondingly, functional data on the interaction genes were linked to the commonly used NCBI Gene resource (www.ncbi.nlm.nih.gov/gene). Through these links, users can retrieve a wide range of genomic and disease-associated data.
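As an illustration of the entry structure and counts described above, the following minimal Python sketch models one curated entry and tallies entries by field. The class name, field names and helper functions are hypothetical and do not reflect the actual MNDR schema; the example records are placeholders, and "non-redundant" is assumed here to mean distinct ncRNA symbols. Only the reported totals are taken from the text.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    # Hypothetical field names mirroring the attributes listed in the text;
    # this is not the actual MNDR schema.
    rna_category: str
    species: str
    ncrna_symbol: str
    disease: str
    tissue: str = ""
    interaction_gene: str = ""
    expression_direction: str = ""
    pmid: str = ""
    description: str = ""


def tally(entries, field):
    """Count entries per distinct value of one field, e.g. 'species'."""
    return Counter(getattr(entry, field) for entry in entries)


def non_redundant_symbols(entries, rna_category):
    """Return the distinct ncRNA symbols recorded for one RNA category."""
    return {e.ncrna_symbol for e in entries if e.rna_category == rna_category}


# Placeholder records for illustration only; they are not real MNDR entries.
EXAMPLE_ENTRIES = [
    Entry("lncRNA", "Homo sapiens", "LNC_A", "disease X"),
    Entry("lncRNA", "Mus musculus", "LNC_A", "disease Y"),
    Entry("miRNA", "Homo sapiens", "MIR_B", "disease X"),
]

# Totals reported in the text, broken down by RNA category and by species.
REPORTED_BY_CATEGORY = {"lncRNA": 807, "miRNA": 229, "piRNA": 13, "snoRNA": 100}
REPORTED_BY_SPECIES = {"Homo sapiens": 866, "Mus musculus": 251, "Rattus norvegicus": 32}

# Both breakdowns add up to the 1149 curated entries.
assert sum(REPORTED_BY_CATEGORY.values()) == 1149
assert sum(REPORTED_BY_SPECIES.values()) == 1149
```

Calling `tally(EXAMPLE_ENTRIES, "rna_category")` on the placeholder records groups them by RNA category, in the same way the per-category and per-species counts above partition the full set of entries.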
MNDR (the Mammalian ncRNA-disease repository) also welcomes submissions of experimentally identified novel ncRNA-disease relationships. In addition, all ncRNA-disease relationships can be downloaded directly in Excel format.

Querying, searching and visualization

MNDR provides an interface for conveniently retrieving all relationships between diverse ncRNAs and diseases. Users can browse and obtain full lists of the ncRNAs involved in any given disease through three paths. Path 1: users browse the ncRNA-disease relationships by selecting options for RNA category, species and RNA symbol, and obtain a list of relationships for any combination of these three. Alternatively, users can retrieve ncRNA-disease relationships for a specified disease term (Path 2) or tissue (Path 3).

The main result table lists RNA category, species, RNA symbol, disease, tissue, PMID and detail. Clicking the "detail" link in a record gives access to more specific information, such as the interaction gene symbol, expression of the ncRNA, detection method of ncRNA expression, PMID, reference title and a detailed description.

To help users examine the ncRNA-mediated interaction network in disease conditions, MNDR also provides visualization functionality: the global ncRNA-mediated disease network of each of the three mammals can be displayed rapidly and independently as an interactive network embedded with Cytoscape Web (cytoscapeweb.cytoscape.org/). For each of the three mammals, multiple data resources can be combined in a single visualization. Because the visualization supports panning and zooming, users can locate the ncRNAs associated with a specific disease within the global ncRNA interaction network.
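The three browsing paths described in this section can be sketched as a single filter over entry records. This is a minimal illustration in Python, not the PHP code behind the MNDR interface; the function name `browse`, the dictionary keys and the records are hypothetical placeholders.

```python
def browse(entries, **criteria):
    """Return the entries whose fields match every given criterion.

    Matching is exact but case-insensitive; a missing field never matches.
    """
    def matches(entry):
        return all(
            field in entry and str(entry[field]).lower() == str(value).lower()
            for field, value in criteria.items()
        )

    return [entry for entry in entries if matches(entry)]


# Placeholder records for illustration only; they are not real MNDR entries.
ENTRIES = [
    {"rna_category": "lncRNA", "species": "Homo sapiens",
     "rna_symbol": "LNC_A", "disease": "disease X", "tissue": "tissue P"},
    {"rna_category": "miRNA", "species": "Homo sapiens",
     "rna_symbol": "MIR_B", "disease": "disease X", "tissue": "tissue Q"},
    {"rna_category": "lncRNA", "species": "Mus musculus",
     "rna_symbol": "LNC_A", "disease": "disease Y", "tissue": "tissue Q"},
]

# Path 1: any combination of RNA category, species and RNA symbol.
path1 = browse(ENTRIES, rna_category="lncRNA", species="Homo sapiens")

# Path 2: a specified disease term.
path2 = browse(ENTRIES, disease="disease X")

# Path 3: a specified tissue.
path3 = browse(ENTRIES, tissue="tissue Q")
```

Accepting the criteria as keyword arguments lets Path 1 use one, two or all three of its fields without separate code for each combination.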
Data sources

To collect all available ncRNA symbols, we first integrated three major types of ncRNAs: lncRNA symbols collected from lncRNAdb [1] and the Functional lncRNA database [2], miRNA symbols collected from miRBase [3], and snoRNA symbols collected from sno/scaRNAbase [4] and snoRNA-LBME-db [5]. Because research on other ncRNAs, including promoter-associated small RNAs (PASRs), PIWI-interacting RNAs (piRNAs), promoter upstream transcripts (PROMPTs), transcription initiation RNAs (tiRNAs) and TSS-associated RNAs (TSSa-RNAs) [6], is still in its infancy, we searched the PubMed database using these ncRNA category names in place of specific ncRNA symbols. The list of disease terms was collected from the MeSH (Medical Subject Headings) vocabulary, which is created and maintained at the National Library of Medicine.

To reduce the burden of manual curation, we wrote scripts to search all abstracts and full-text articles in the PubMed database for the following keyword combinations: (each ncRNA symbol or ncRNA category name) and (each species: Homo sapiens, Mus musculus and Rattus norvegicus) and (each disease name). Since the miR2Disease database already provides a resource of disease-associated miRNAs in human, we have not integrated that information into the MNDR database.

Implementation

The MNDR database runs in a Windows environment and was implemented in HTML and PHP, a widely used general-purpose scripting language for web development. The interface consists of web pages designed and implemented in HTML/CSS. It has been tested in several web browsers, including Google Chrome, Safari, Mozilla Firefox and Internet Explorer.

Supplementary Figures

Supplementary Figure 1. The biggest sub-network of the ncRNA-associated disease network based on the MNDR database. Triangular, square, diamond and circular nodes represent lncRNAs, miRNAs, snoRNAs and protein-coding genes, respectively.

Supplementary Figure 2.
The biggest sub-network of the ncRNA-associated disease network obtained by integrating the MNDR database and miR2Disease data. Triangular, square, diamond and circular nodes represent lncRNAs, miRNAs, snoRNAs and protein-coding genes, respectively.

Supplementary information is available at the Cell Death and Disease website.

References

1. Amaral PP, Clark MB, Gascoigne DK, Dinger ME, Mattick JS. lncRNAdb: a reference database for long noncoding RNAs. Nucleic acids research 2011, 39(Database issue): D146-151.
2. Niazi F, Valadkhan S. Computational analysis of functional long noncoding RNAs reveals lack of peptide-coding capacity and parallels with 3' UTRs. RNA 2012, 18(4): 825-843.
3. Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. miRBase: tools for microRNA genomics. Nucleic acids research 2008, 36(Database issue): D154-158.
4. Xie J, Zhang M, Zhou T, Hua X, Tang L, Wu W. Sno/scaRNAbase: a curated database for small nucleolar RNAs and cajal body-specific RNAs. Nucleic acids research 2007, 35(Database issue): D183-187.
5. Lestrade L, Weber MJ. snoRNA-LBME-db, a comprehensive database of human H/ACA and C/D box snoRNAs. Nucleic acids research 2006, 34(Database issue): D158-162.
6. Esteller M. Non-coding RNAs in human disease. Nature reviews Genetics 2011, 12(12): 861-874.