OPEN

SUBJECT AREAS: EVOLUTIONARY ECOLOGY, ENVIRONMENTAL SCIENCES

Received 23 October 2014
Accepted 21 January 2015
Published 25 February 2015

Correspondence and requests for materials should be addressed to Y.Z. (zhaoyb@pku.edu.cn) or J.H. (hujy@urban.pku.edu.cn)

Families of Nuclear Receptors in Vertebrate Models: Characteristic and Comparative Toxicological Perspective

Yanbin Zhao1, Kun Zhang1, John P. Giesy2,3,4 & Jianying Hu1

1MOE Laboratory for Earth Surface Processes, College of Urban and Environmental Sciences, Peking University, Beijing 100871, China, 2Department of Veterinary Biomedical Sciences and Toxicology Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada, 3Department of Zoology, and Center for Integrative Toxicology, Michigan State University, East Lansing, MI, USA, 4Department of Biology & Chemistry and State Key Laboratory in Marine Pollution, City University of Hong Kong, Kowloon, Hong Kong, SAR, China.

Various synthetic chemicals are ligands for nuclear receptors (NRs) and can cause adverse effects in vertebrates mediated by NRs. While several model vertebrates, such as mouse, chicken, western clawed frog and zebrafish, are widely used in toxicity testing, few NRs have been well described for most of these classes. In this report, NRs in genomes of 12 vertebrates are characterized via bioinformatics approaches. Although numbers of NRs varied among species, from 40–42 genes in birds to 66–74 genes in teleost fishes, all NRs had clear homologs in human and could be categorized into seven subfamilies defined as NR0B-NR6A. Phylogenetic analysis revealed conservative evolutionary relationships for most NRs, which were consistent with traditional morphology-based systematics, with some exceptions in dolphin (Tursiops truncatus). Evolution of PXR and CAR exhibited unexpected, varied patterns, and the existence of CAR could possibly be traced back to ancient lobe-finned fishes and tetrapods (Sarcopterygii).
Compared to the more conserved DBD of NRs, sequences of the LBD were less conserved: sequences of THRs, RARs and RXRs were ≥90% similar to those of the human; those of ERs, AR, GR, ERRs and PPARs were more variable, with similarities of 60%–100%; and those of PXR, CAR, DAX1 and SHP were the least conserved among species.

Nuclear receptors (NRs) are one of the largest groups of transcription factors in vertebrates, and serve important functions in regulation of a range of physiological processes including growth and differentiation of cells, metabolic processes, reproduction, development and overall homeostasis. Transcriptional activities of NRs are regulated by binding of endogenous small lipophilic compounds1,2. There is growing evidence that diverse chemicals that occur in the environment, including synthetic molecules such as pharmaceuticals, endocrine disrupting chemicals and some industrial compounds, can mimic endogenous small compounds, bind to ligand binding domains (LBDs) and activate NR-mediated signals that then lead to toxic responses3,4. For example, interactions of some pesticides and industrial chemicals with estrogen (ER) and androgen (AR) receptors have been linked to a number of adverse effects including birth defects, developmental neurotoxicity, impaired male and female reproductive health, such as decreased quality of sperm, and increased incidences of cancers5–7. A series of in vitro bioassays, based on signaling of endocrine receptors including well-studied steroid hormone receptors such as ER, AR, glucocorticoid receptors (GRs) and progesterone receptor (PR), and the less well-studied retinoic acid receptor (RAR), retinoid X receptor (RXR) and thyroid hormone receptor (THR), have been established or are under assessment by OECD and/or US EPA8–10.
Due to their relatively clear physiological functions and responses to environmentally relevant organic micropollutants, these NR-based assays have been used in assessment of toxicological effects of chemicals in the environment. For example, ERs, AR and THRs, involved in development and maintenance of the endocrine system, have been demonstrated to be targets of alkylphenols, phthalates (PAEs), dichlorodiphenyltrichloroethane and some metabolites of polychlorinated biphenyls (PCBs) and polybrominated diphenyl ethers (PBDEs)11–13. Besides endocrine receptors, PXR and CAR, NRs that participate in metabolism of both endobiotics and xenobiotics to detoxify or bioactivate chemicals, can be activated by a variety of pharmaceuticals such as rifampicin, pesticides such as chlorpyrifos and methoxychlor, and other synthetic chemicals used in industry, such as PBDEs and BPA14–17.

SCIENTIFIC REPORTS | 5 : 8554 | DOI: 10.1038/srep08554 | www.nature.com/scientificreports

In addition to these well-known NRs, more NRs have been identified in genomes of several vertebrates during the past decade. These include 48 NR genes in human (Homo sapiens), 47 genes in rat (Rattus norvegicus), 49 genes in mouse (Mus musculus) and 68 genes in the teleost puffer fish Fugu rubripes18,19. Specifically, structures of 48 NRs in the human have been identified and categorized, based on sequence homology, into seven different subfamilies NR0B-NR6A (ref. 20). Except for two NRs in the subfamily NR0B, which lack a DNA binding domain (DBD), all other 46 NRs contain the following six functional domains: (A–B) variable N-terminal regulatory domain; (C) conserved DNA-binding domain; (D) variable hinge region; (E) conserved ligand binding domain (LBD) and (F) variable C-terminal domain20. In addition, the sets of NRs described in humans offered a better understanding of characteristics of NRs, and provided insight for uncovering novel molecular and signal targets and mechanisms of action of synthetic toxicants.
For instance, it has been found that some widely used pharmaceutical drugs that are found in the environment, including thiazolidine diones, trichloroacetic acid and toxaphene, are ligands for human RORα, PPARα and ERRα, respectively21–23.

Compared with the extensive understanding of NRs in human, fewer NRs have been identified in other vertebrates used as models to screen chemicals for toxic potencies, such as reptiles, amphibians and teleost fishes. In recent years, owing to extensive information about their developmental biology and molecular genetics and the availability of completely sequenced genomes, vertebrate species such as western clawed frog (Xenopus tropicalis), zebrafish (Danio rerio) and freshwater Japanese medaka (Oryzias latipes) have been much used as toxicological models24–26. However, information on NRs in these vertebrates is still limited to ERs, AR, GR, PXR, RARs and PPARs, though studies on some novel NRs, such as VDR, FXR and NURR, are in progress27–29. Additionally, since the sets of NRs in human, mouse and rat identified in previous studies were based on genomes assembled a decade ago18, there is also a need to reevaluate the characteristics of NRs in these genomes in light of the constantly updated sequence data and annotations. In addition to the sequences of genomes, the predicted transcriptomes and proteomes now available for all of these species in GenBank and Ensembl provide useful databases that can be further used to uncover and characterize additional NRs. Therefore, comprehensive descriptions of NRs and their families for these vertebrates, which are used as models to screen for toxic potencies of chemicals, will be helpful for their further development and for interpretation of results of studies of synthetic chemicals of environmental significance.

In this study, complete sets of NRs were described for genomes of 12 vertebrates used as models in studies of toxic potency and mechanisms of action of chemicals.
Several bioinformatics approaches were applied to four mammals (human, Homo sapiens; mouse, Mus musculus; rat, Rattus norvegicus and dolphin, Tursiops truncatus), two birds (chicken, Gallus gallus and mallard (wild duck), Anas platyrhynchos), a reptile (Chinese softshell turtle, Pelodiscus sinensis), an amphibian (western clawed frog, Xenopus tropicalis) and four teleost fishes (zebrafish, Danio rerio; medaka, Oryzias latipes; tilapia, Oreochromis niloticus and stickleback, Gasterosteus aculeatus). The locations of NRs on chromosomes, phylogenetic relationships and the conservation of DBD and LBD sequences among species were also analyzed to better understand the characteristics of NRs in these vertebrates.

Results and Discussion

Identification of NRs in 12 vertebrates. Substantial and continuously growing information from developmental biology and molecular genetics, together with the complete sequencing of genomes, has placed a series of vertebrate species in attractive positions for use in toxicological research. Twelve species were chosen for description, and complete sets of NR genes within their genomes were identified by use of a systematic bioinformatics approach. In total, 42–74 NR genes were uncovered within these vertebrates and a large number of variations were observed among classes (Fig. 1A, Table S2). Comparisons of sequences showed that all of these NRs displayed significant similarity to NRs of the human and could be categorized into the seven subfamilies NR0B-NR6A, with no novel subfamilies.

For mammals, there were 48, 49, 49 and 47 NRs identified in human, mouse, rat and dolphin genomes, respectively (Fig. 1A). Compared to the human, one more gene (NR1H5) was observed for mouse and rat and one (NR2F2) was absent from dolphin (Fig. 2). Sets of NRs in human and mouse were consistent with previous reports18, while two more NRs (NR1D2 and NR2E3) were newly identified for the rat.
The absence of these two NRs from the rat in the previous study18 was due to sequence gaps in the rat genome, which was assembled in 2003.

The numbers of NRs in birds were less than those in human, though some unique genes were observed. There were seven NRs (NR1B3, NR1D1, NR1H2, NR1I2, NR2B2, NR3B1 and NR4A1) present in the human that were absent from the chicken. Similarly, there were nine NRs (NR1B3, NR1D1, NR1H2, NR1I2, NR1I3, NR2B2, NR2E3, NR2F1 and NR3B1) present in the human that were absent from the mallard, though three new NRs (NR1F3, NR1H5 and NR2A3) were identified that were unique to chicken and mallard (Fig. 2). Similar absences were observed in the genomes of turkey (Meleagris gallopavo), flycatcher (Ficedula albicollis) and zebra finch (Taeniopygia guttata), where 9, 5 and 6 NRs, respectively, that are present in the human genome were absent from these birds (Fig. 3C). These results demonstrated that a cluster of NRs was indeed absent from genomes of the class Aves, especially in Galloanserae, having been deleted during the course of evolution.

Some NRs present in the human were absent from turtle and western clawed frog, while some others were unique to these species. In the one species of turtle, 48 NRs were identified, with four genes absent (NR1B3, NR1H2, NR1I2 and NR2B2) and four new genes gained (NR1F3, NR1H5, NR2A3 and NR2F1) compared with those in human. Similarly, 52 NRs were identified in western clawed frog, with two genes absent (NR1H2 and NR4A3) and six additional genes (NR1F2, NR1H5, NR2A3, NR2F5, NR3B3 and NR4A2) that were not present in the human (Fig. 2).

For the four teleost fishes studied, many additional NRs were uncovered in this study. Specifically, 73 and 74 NRs were identified in zebrafish and tilapia, respectively (Fig. 1A), which was consistent with the number reported for Fugu rubripes (68 NRs identified)19.
The additional NRs were mainly due to paralogous genes in their sets of NRs (Fig. 1C). In zebrafish, two or more paralogues were identified for each of 20 human NRs, and likewise for 18, 22 and 17 human NRs in medaka, tilapia and stickleback, respectively. The occurrence of paralogous genes in teleost fishes was not random but concentrated on some specific NRs. For instance, NR1F3 (RORγ) was the most abundant NR, with a total of seven paralogous gene copies in these four teleost fishes. The NRs NR1A1, NR1B3, NR1C1, NR1I1, NR2B2, NR2F6, NR3A2 and NR3B3 were also rich in paralogues, with one paralogous gene copy in each of the four teleosts (Fig. 3D).

Characteristics of NRs families. Genomic locations of NRs in seven vertebrate genomes (human, mouse, rat, chicken, zebrafish, medaka and stickleback) were retrieved via the Ensembl annotations. In general, distributions of NRs on chromosomes were more widespread in teleost fishes than in mammals and birds (Fig. 1B). This is possibly due to the existence of more paralogous genes in teleosts. For example, NRs in zebrafish, medaka and stickleback were distributed throughout their genomes except for 1–2 chromosomes. The most abundant clusters of NRs were observed on chromosomes 8 and 16 in zebrafish, each with 6 NRs; on chromosomes 7 and 16 in medaka, each with 7 NRs; and on chromosome 12 in stickleback, with 8 NRs. The narrowest distribution of NRs was observed in chicken, in which 44 NRs were distributed across 61% (19/31) of chromosomes.

Figure 1 | Identification of NRs in genomes of 12 toxicological vertebrate models. (A) Total number of NRs in each vertebrate genome. (B) The genomic distributions of NRs in seven vertebrate species. (C) The number of NRs for each type (NR0B-NR6A) and the paralogous gene numbers (P.G.) in total.
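The gene counts reported above for chicken, mallard, turtle and frog can be cross-checked against the human set of 48 NRs with simple arithmetic. The following is a minimal sketch, not part of the original analysis; the dictionary layout and function names are hypothetical, and it assumes that each gene listed as "new" or "additional" contributes exactly one extra gene.

```python
# Illustrative cross-check of NR gene totals; names and layout are hypothetical.
HUMAN_NR_COUNT = 48  # NR genes reported for human

# Genes reported in the text as absent from, or additional to, the human set.
DIFFERENCES = {
    "chicken": {
        "absent": ["NR1B3", "NR1D1", "NR1H2", "NR1I2", "NR2B2", "NR3B1", "NR4A1"],
        "additional": ["NR1F3", "NR1H5", "NR2A3"],
    },
    "mallard": {
        "absent": ["NR1B3", "NR1D1", "NR1H2", "NR1I2", "NR1I3",
                   "NR2B2", "NR2E3", "NR2F1", "NR3B1"],
        "additional": ["NR1F3", "NR1H5", "NR2A3"],
    },
    "turtle": {
        "absent": ["NR1B3", "NR1H2", "NR1I2", "NR2B2"],
        "additional": ["NR1F3", "NR1H5", "NR2A3", "NR2F1"],
    },
    "frog": {
        "absent": ["NR1H2", "NR4A3"],
        "additional": ["NR1F2", "NR1H5", "NR2A3", "NR2F5", "NR3B3", "NR4A2"],
    },
}


def expected_total(species: str) -> int:
    """Human count minus absent genes plus additional genes (assumed one copy each)."""
    diff = DIFFERENCES[species]
    return HUMAN_NR_COUNT - len(diff["absent"]) + len(diff["additional"])


def chromosome_coverage(chromosomes_with_nrs: int, total_chromosomes: int) -> float:
    """Percentage of chromosomes that carry at least one NR gene."""
    return 100.0 * chromosomes_with_nrs / total_chromosomes


totals = {species: expected_total(species) for species in DIFFERENCES}
chicken_coverage = chromosome_coverage(19, 31)
```

Under these assumptions the totals come to 44 for chicken, 48 for turtle and 52 for frog, matching the numbers stated in the text, and 42 for mallard, which matches the lower bound of the reported 42–74 range; 19 of 31 chromosomes rounds to 61%.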
Phylogenetic analyses, based on the full amino acid sequences and on the combined DBD plus LBD sequences of NRs, were performed for 48 types of NRs among these 12 vertebrates. The Neighbor-Joining (NJ) and Maximum-Likelihood (ML) phylogenetic analyses showed similar patterns, while the Neighbor-Joining algorithm gave better resolution at the base of the phylogram. Conservative evolutionary relationships were observed for most NRs, i.e. the evolutionary relationships were generally consistent with the traditional morphology-based systematics (Fig. S1). As exemplified by NR3A1 (ERα), closer relationships were observed within each class and the traditional teleost-amphibian-reptile-bird and mammal evolutionary relationships were followed (Fig. 3A). This was verified by the similarity of sequences of the LBD of ERα among species (Fig. 4). In detail, sequence similarities of about 82–93% among teleosts, 99% between birds and 98–99% among mammals were observed, and the sequence similarities among classes were relatively small (Fig. 5).

Some exceptions were observed in dolphin, such as NR2A1 and NR2A2 (Fig. S1). Though dolphin, which diverged from artiodactyls approximately 50 million years ago30, was thought to show the closest relationship with human among the 12 vertebrates, 32% of NRs showed closer relationships between rodents and human than between dolphin and human. Similarities between sequences of the DBD and LBD also confirmed this likely historical divergence. In rodents, 13% of amino acid sequences of the DBD and 26% of those of the LBD were more similar to those of the human than were those of dolphin (Fig. 3B). These variations in NRs in dolphin were possibly the result of positive Darwinian selection, the major driving force for adaptive evolution and diversification among species, during adaptation to the radical habitat transition from land to a marine environment.
Though increasing toxicological research has been performed using dolphins, and extrapolations from dolphin to human were thought to be more significant, results of the present study demonstrated more variations, indicating that more genetic characteristics should be taken into account when assessing toxicities of chemicals based on results of studies with dolphins.

In addition, since PXR and CAR displayed the largest variations and were absent from several vertebrates used in this study (Fig. 2 and 4), more comparisons among species were conducted. The existence of NR1I (VDR, PXR and CAR) genes was examined in 35 vertebrate species (20 mammals, 5 birds, 2 reptiles, 1 amphibian and 7 teleost fishes) for which complete sequences of genomes were available, and unexpected patterns were shown for their evolution. VDR genes appeared in all vertebrate genomes, a result which was consistent with previous reports that VDR could be detected in mammals, birds, amphibians, reptiles, teleost fishes, and even the sea lamprey31. PXR appeared in most teleost fishes (except for stickleback), amphibians and mammals (also known as SXR), but was totally absent from reptiles and birds. Though CAR also appeared in all mammals, it exhibited quite different patterns in other classes. CAR was mostly absent from birds (except for chicken), but was retained in reptiles and amphibians, and appeared in lobe-finned fishes and tetrapods (Sarcopterygii) (Fig. 3E). Since Sarcopterygii appeared nearly 400 million years ago during the Devonian, and are widely accepted as ancestors of all Tetrapoda, including amphibians, reptiles, birds and mammals32, the appearance of CAR in Sarcopterygii possibly indicated that CAR arose much earlier than previously thought. In general, these results revealed a novel evolutionary relationship for PXR/CAR.
These two NRs likely first coexisted in ancient Sarcopterygii as a result of duplication events and descended into amphibians and then mammals, but one of them was absent from reptiles and both were absent from most birds (Fig. S2).

Figure 2 | Nuclear receptor families in 12 model vertebrates. Each nuclear receptor is presented as a colored block. The white spaces indicate that no ortholog was identified. The nuclear receptor family for each vertebrate species is marked with a different color. From left to right: human, mouse, rat, dolphin, chicken, duck, turtle, frog, zebrafish, medaka, tilapia and stickleback.

Alignment of sequences of DBD and LBD. Since cross-species extrapolations from surrogate vertebrate species to humans are usually considered to be crucial for human risk assessment of chemicals, a better understanding of similarities of these NR sequences among species will be useful to facilitate these extrapolations and to better understand the toxicities of environmental chemicals. In the present study, pairwise alignments were constructed between sequences of the DBD/LBD of 48 human NRs and their corresponding orthologs in the other eleven vertebrate species (Fig. 4). As expected, DBDs of the orthologous proteins generally shared relatively great conservation with sequences in human (Fig. 4, left), especially for the mouse, rat and dolphin, in which 94%–100% sequence similarities were observed for most NRs, except CAR (70%–89%), and almost 70% (31/46, 32/46 and 31/42, respectively) of orthologous proteins showed 100% similarity with sequences of the human. For bird, reptile, amphibian and teleost fishes, most NRs also displayed conservation of sequences (usually >90%), especially RORβ (100% for all species).
There were also some exceptions, such as PXR (61%–73%), CAR (64%–67%), and PPARα and TR2 in teleosts (87%–90% and 84%–87%, respectively), which indicates potential alterations in target genes and signals for these NRs among vertebrate species.

Compared to the more conserved sequences of DBD regions of NRs among species, sequences of the LBD displayed more variation. The greatest variation was observed for DAX1 (40%–81%), while the least variation was observed for COUP-TFII (99%–100%) compared with those in human (Fig. 4, right). To the best of our knowledge, this is the first time that the LBDs of all NRs have been compared among vertebrates, which offers a broader and novel insight into LBD differences between species and between multiple NRs.

Figure 3 | Characteristics of the 12 NRs families. (A) Phylogenetic tree for 12 NR3A1 (ERα) genes. (B) The evolutionary relationships of NRs among dolphin, rodents and human. Left: the proportions of dolphin NRs with closer relationships with human compared to rodents are presented as percent/number in blue. The proportions of rodent NRs with closer relationships with human are presented as percent/number in orange. Green represents the numbers of NRs with equivalent sequence similarities with human for dolphin and rodents. Right: phylogenetic trees for NR2C1 and NR2A1 showing the different positions of NRs for dolphin. (C) Comparative searches for the ten absent NRs in five bird species. (D) Paralogous gene copy numbers for each type of NR. (E) Comparative searches for NR1I genes (VDR, PXR and CAR) in 35 vertebrates, including 20 mammals, 5 birds, 2 reptiles, 1 amphibian and 7 teleosts (details are described in Table S4). The phylogenetic tree was developed utilizing 35 full amino acid sequences of VDR.

In the present study, three general groups were identified based on similarities in sequences of NRs.
The first group contained 13 NRs, including THRα, THRβ, RARα, RARβ, RARγ, RORα, RXRα, RXRβ, RXRγ, COUP-TFII, ERRγ, NURR1 and LRH1 (except some orthologs for RARα, RORα, RXRβ, RXRγ and NURR1), with ≥90% similarity of sequences of the LBDs for all eleven vertebrates compared with those of the human (Fig. 4, right). As observed for RXRα, 97–100% similarities in sequences, for the best aligned orthologs, were observed from multiple sequence alignment (Fig. 5). Analysis of variation in conservation of sequences, window-averaged across 10 amino acid residues, showed that there were fewer than 5 variations in amino acid residues among these 12 vertebrate species, and most of them were observed in α-helix 3 to α-helix 6 of the LBD structures (Fig. 5). RXRα commonly functions as a heterodimer with other NRs and mainly mediates signaling of hormones derived from vitamin A (retinol), such as 9-cis retinoic acid, and is involved in multiple physiological functions of vertebrates such as embryonic patterning and organogenesis, proliferation of cells and differentiation of tissues33. It has been reported that among vertebrates, such as mouse and human, LBDs of RXRα interacted with similar types of ligands with similar binding affinities34,35. Sequence similarities of these 13 NRs among vertebrates suggested that interspecies extrapolations could be straightforward when assessing toxicity of chemicals via these NRs.

Approximately 77% of NRs, such as the well-known ERs, AR, PR, PPARs and VDR, can be sorted into the second group, exhibiting 60–100% similarities of sequences (for the best aligned orthologs) compared with those of human. Similarities in sequences of these NRs among the four fishes were substantially the same, and were usually ≥90% in mouse, rat and dolphin, showing apparent differences in sequences of amino acids between teleosts and mammals.
Specifically, LBDs of NRs in the second group, such as ERα and PPARγ, always shared the same variations in amino acids within the four fishes, which were quite different from those of mammals (Fig. 5 for ERα). ERα is a well-studied NR, activated by endogenous and exogenous estrogens, and plays a variety of central physiological roles, such as maintenance of reproductive, cardiovascular and central nervous systems in vertebrates36. Potencies of binding of ligands to LBDs of ERα were different for fishes when compared to mammals. It has been reported that widespread chemicals like 4-t-octylphenol and bisphenol A (BPA) bound with greater avidity to rainbow trout ER than to that of human or rat. The range of ligands also differed: of 34 chemicals tested, 29 bound to the ER of rainbow trout, while only 20 of them bound to the ER of human/rat37.

Figure 4 | Pairwise alignments between DBD/LBD amino acid sequences of 48 human NRs and the corresponding orthologs in the other eleven vertebrate species. Left: DBD sequence comparisons. Right: LBD sequence comparisons. The sequence similarities are presented as the percentage (%) and relevant color. NRs with incomplete amino acid sequences of DBD/LBD were not included in this comparison.

Figure 5 | Variations in LBD sequence conservation across the sequences of RXRα, ERα and SHP. Left: LBD sequences for eleven vertebrates compared to the related human nuclear receptors. All sequences were window-averaged across 10 residues. Right: multiple sequence alignments among the 12 vertebrates. The sequence similarities are presented as the percentage (%) and relevant color. The LBD sequence of ERα in dolphin was not included in this comparison due to the incomplete amino acid sequence.

PPARγ is also a well-studied transcription factor, which can be activated by fatty acids and is involved in lipid and glucose metabolism38.
Reports on binding strengths of LBDs for PPARγ were rare, but interspecies extrapolations of LBD binding activities can likely be estimated, owing to the similar sequence characteristics of PPARγ and ERα.

In the third group, with less than 85% similarity in sequences of eleven vertebrate species compared with those in human, four NRs, namely PXR, CAR, DAX1 and SHP (Fig. 4), were classified as being different from human. DAX1 and SHP, which belong to the subfamily NR0B, displayed the greatest variations among NRs and among vertebrates (Fig. 4 and 5), a result which is consistent with previous reports that NRs in the NR0B group are a unique class of NRs with among-species variability in sequences and lacking DBDs18. PXR and CAR were also assigned to this group, and exhibited apparent differences among vertebrates and even among fishes. PXR and CAR can be activated by xenobiotics and have relatively broad abilities to bind ligands39. The unusually great diversity in sequences of the LBD among species could be related to diversity in binding activities among species. This is exemplified by the fact that phenobarbital, a pharmaceutical that is generally detectable in effluents of municipal wastewater treatment plants (WWTPs), was a moderate activator of the zebrafish PXR and exhibited greater binding affinity with human PXR, while it did not bind to PXR of mouse39. These differences among species might be due to differences in diet and physiology among vertebrates, and such large differences in sequences of PXR and CAR among vertebrates complicate in silico extrapolations.

Here, for the first time, genes that code for NRs and their relative characteristics are provided for 12 vertebrate species used as model animals in screening of toxic potencies of chemicals.
These results will aid understanding of the NRs in vertebrates and will be useful for clarifying mechanisms of toxic effects of environmental chemicals on these model species, and also for extrapolating from effects on these surrogates to human.

Methods

Identification of NRs in 12 vertebrate genomes. Identification of sequences for NRs was performed as described previously40,41 with slight modifications. In brief, the putative NRs for each vertebrate were identified through a combination of BLASTn and BLASTp searches of the genome and protein databases, which were obtained from NCBI and Ensembl. The nucleotide and protein sequences of 165 described NRs in three vertebrates (48 in human, 49 in mouse and 68 in Fugu rubripes) were downloaded from GenBank and used as templates for interrogating the vertebrate databases. Nucleotide homology searches were performed using the full nucleotide sequences of each of the 165 NRs against the 12 genomic sequence databases at NCBI by use of nucleotide BLAST with the blastn algorithm and an e-value cut-off of 1e-04. Protein sequences were then used to construct multiple sequence alignments by ClustalX2 (http://www.clustal.org/clustal2/), and the DNA-binding domain (DBD) and ligand-binding domain (LBD) amino acid sequences were then identified. BLASTp searches were performed using the conserved DBD plus LBD domains against the non-redundant vertebrate protein sequence database at NCBI by use of protein BLAST with the blastp algorithm and an e-value cut-off of 1e-25. The e-value cut-offs were set to be just loose enough to find all the Fugu NRs when using human NRs as queries. Genes identified by BLASTn and BLASTp searches were then combined, and individual putative genes were sorted according to their unique DNA and amino acid sequences.
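The thresholding and merging of BLASTn and BLASTp hits described above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the study: it assumes hits saved in the standard 12-column BLAST tabular format (`-outfmt 6`), reads the blastn cut-off as 1e-04, and assumes that subject identifiers from both searches have already been mapped to a common gene identifier. The function names and the sample hits are hypothetical.

```python
from typing import Iterable, Set

# E-value cut-offs from the text; the blastn value is read as 1e-04 (assumption).
BLASTN_EVALUE_CUTOFF = 1e-4
BLASTP_EVALUE_CUTOFF = 1e-25


def passing_subjects(tabular_lines: Iterable[str], evalue_cutoff: float) -> Set[str]:
    """Collect subject IDs of hits whose e-value is at or below the cut-off.

    Expects BLAST tabular output (outfmt 6) with the default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    subjects = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= evalue_cutoff:
            subjects.add(subject_id)
    return subjects


def merge_candidates(blastn_lines: Iterable[str], blastp_lines: Iterable[str]) -> Set[str]:
    """Union of candidates from both searches; duplicates collapse automatically."""
    return (passing_subjects(blastn_lines, BLASTN_EVALUE_CUTOFF)
            | passing_subjects(blastp_lines, BLASTP_EVALUE_CUTOFF))


# Hypothetical hits, with subject IDs already mapped to shared gene identifiers.
blastn_hits = [
    "q1\tgeneA\t98.0\t300\t5\t0\t1\t300\t1\t300\t1e-50\t500",
    "q1\tgeneB\t70.0\t50\t15\t0\t1\t50\t1\t50\t0.01\t30",
]
blastp_hits = [
    "p1\tgeneA\t95.0\t250\t12\t0\t1\t250\t1\t250\t1e-80\t400",
    "p1\tgeneC\t60.0\t240\t90\t2\t1\t240\t1\t240\t1e-30\t150",
    "p2\tgeneD\t40.0\t100\t60\t1\t1\t100\t1\t100\t1e-10\t60",
]
candidates = merge_candidates(blastn_hits, blastp_hits)
```

In this toy input, geneB and geneD fail their respective cut-offs, and geneA is found by both searches but retained only once. Any candidate set produced this way would still need the false-positive screening described in the text.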
All of these putative genes were verified by the online software NRpred and iNR-PhysChem to remove false-positive hits, and NR0B1 and NR0B2, which are known to lack the DBD region, were added to the final sets of NRs. Details of the sequence searches are shown in Table S1. Finally, complete sequences for each NR in each vertebrate species were loaded into the Ensembl database. The nomenclature of NRs was based on Ensembl's GeneTree and Orthology annotations.

Genomic distributions. Genomic locations for each nuclear receptor in seven vertebrate genomes (human, mouse, rat, chicken, zebrafish, medaka and stickleback) were retrieved via the Ensembl annotations, and then mapped onto complete vertebrate karyograms.

Analyses of sequences of DBD and LBD. Sequences of peptides in the DBD and LBD domains for each NR were identified by use of Pfam software (http://pfam.sanger.ac.uk/, Pfam 27.0) and modified manually, based on characteristics of DBD and LBD regions reported previously. The sequence of the DBD, which is classified as a type-II zinc finger motif, corresponds to a 75–80 amino acid residue segment, starting two amino acid residues before the first conserved cysteine and encompassing both C4 zinc fingers. The LBD, a flexible unit made of α-helices containing 170 to 210 amino acid residues, begins at the 12th residue of α-helix 3 and extends through α-helix 10 (refs 42,43). The pairwise alignments between sequences of the DBD and LBD of each human protein and the corresponding orthologs in the other 11 vertebrates were constructed by use of the NCBI BLASTp software with default parameters. Similarities in sequences were calculated as the number of identical residues over the total number of aligned residues in human.

Phylogenetic analysis. Phylogenetic trees were constructed by use of amino acid sequences of 48 types of NRs downloaded from Ensembl based on the set of homologous NRs in the human.
Only full- length molecules were included for the analysis. Some genes without complete amino acid sequences in the Ensembl database were retrieved from NCBI/EMBL/DDBJ databases (Table S3). They were also included. The Ensembl ID of each NR used in the analyses is available in SI Table S2. Conserved sequences of DBD and LBD for each NR were also isolated and used as a supportive analysis. Sequences of DBD and LBD were combined and then aligned, SCIENTIFIC REPORTS | 5 : 8554 | DOI: 10.1038/srep08554 except for NR0B1 and NR0B2. Multiple alignments of sequences of amino acids were generated by use of ClustalX2 software with default parameters, and the results used for construction of phylogenetic trees by implementation of the Neighbour-Joining and Maximum-Likelihood algorithms with a Poisson model in MEGA6 software (http://www.megasoftware.net/mega.php). Confidence for branching patterns was assessed by bootstrap analysis (1000 replicates). For NR1I1 (VDR) analysis, the full amino acid sequences of NR1I1 in 35 vertebrates, including 20 mammals, 5 birds, 2 reptiles, 1 amphibian and 7 teleost fishes (Table S4), were downloaded from the Emsenbl database. These full amino acid sequences were then aligned and applied for gene phylogenetic analysis by use of the same method described above. 1. Mangelsdorf, D. J. et al. The nuclear receptor superfamily: the second decade. Cell 83, 835–839 (1995). 2. Pardee, K., Necakov, A. S. & Krause, H. Nuclear receptors: small molecule sensors that coordinate growth, metabolism and reproduction. Subcell. Biochem. 52, 123–153 (2011). 3. Janosek, J., Hilscherova, K., Blaha, L. & Holoubek, I. Environmental xenobiotics and nuclear receptors - interactions, effects and in vitro assessment. Toxicol. in Vitro 20, 18–37 (2006). 4. Grun, F. & Blumberg, B. Environmental obesogens: organotins and endocrine disruption via nuclear receptor signaling. Endocrinology 147, S50–S55 (2006). 5. Damstra, T., Barlow, S., Bergman, A., Kavlock, R. 
& Van Der Kraak, G. eds. International programme on chemical safety global assessment: the state-of-the-science of endocrine disruptors. Geneva: World Health Organization (2002). Available at: http://www.who.int/ipcs/publications/new_issues/endocrine_disruptors/en/ (Accessed: 23rd December 2014).
6. Huang, R. et al. Chemical genomics profiling of environmental chemical modulation of human nuclear receptors. Environ. Health. Perspect. 119, 1142–1148 (2011).
7. Toppari, J. et al. Male reproductive health and environmental xenoestrogens. Environ. Health. Perspect. 104, 741–803 (1996).
8. Kortenkamp, A. et al. State of the art assessment of endocrine disrupters, final report. 2011. Available at: http://ec.europa.eu/environment/chemicals/endocrine/documents/studies_en.htm (Accessed: 23rd December 2014).
9. Kavlock, R. et al. Update on EPA’s ToxCast Program: Providing high throughput decision support tools for chemical risk management. Chem. Res. Toxicol. 25, 1287–1302 (2012).
10. Martin, M. T. et al. Impact of environmental chemicals on key transcription regulators and correlation to toxicity end points within EPA’s ToxCast program. Chem. Res. Toxicol. 23, 578–590 (2010).
11. Kuiper, G. G. J. M. et al. Interaction of estrogenic chemicals and phytoestrogens with estrogen receptor beta. Endocrinology 139, 4252–4263 (1998).
12. Sonnenschein, C. & Soto, A. M. An updated review of environmental estrogen and androgen mimics and antagonists. J. Steroid Biochem. Mol. Biol. 65, 143–150 (1998).
13. Zoeller, R. T. Environmental chemicals as thyroid hormone analogues: new studies indicate that thyroid hormone receptors are targets of industrial chemicals? Mol. Cell. Endocrinol. 242, 10–15 (2005).
14. Jacobs, M. N., Nolan, G. T. & Hood, S. R. Lignans, bacteriocides and organochlorine compounds activate the human pregnane X receptor (PXR). Toxicol. Appl. Pharmacol. 209, 123–133 (2005).
15. Chang, T. K. & Waxman, D. J.
Synthetic drugs and natural products as modulators of constitutive androstane receptor (CAR) and pregnane X receptor (PXR). Drug Metab. Rev. 38, 51–73 (2006).
16. Zhao, Y. B., Luo, K., Fan, Z. L., Huang, C. & Hu, J. Y. Modulation of benzo[a]pyrene-induced toxic effects in Japanese medaka (Oryzias latipes) by 2,2′,4,4′-tetrabromodiphenyl ether. Environ. Sci. Technol. 47, 13068–13076 (2013).
17. DeKeyser, J. G., Laurenzana, E. M., Peterson, E. C., Chen, T. & Omiecinski, C. J. Selective phthalate activation of naturally occurring human constitutive androstane receptor splice variants and the pregnane X receptor. Toxicol. Sci. 120, 381–391 (2011).
18. Zhang, Z. et al. Genomic analysis of the nuclear receptor family: new insights into structure, regulation, and evolution from the rat genome. Genome Res. 14, 580–590 (2004).
19. Maglich, J. M. et al. The first completed genome sequence from a teleost fish (Fugu rubripes) adds significant diversity to the nuclear receptor superfamily. Nucleic Acids Res. 31, 4051–4058 (2003).
20. Germain, P., Staels, B., Dacquet, C., Spedding, M. & Laudet, V. Overview of nomenclature of nuclear receptors. Pharmacol. Rev. 58, 685–704 (2006).
21. Missbach, M. et al. Thiazolidine diones, specific ligands of the nuclear receptor retinoid Z receptor/retinoid acid receptor related orphan receptor alpha with potent antiarthritic activity. J. Biol. Chem. 271, 13515–13522 (1996).
22. Maloney, E. K. & Waxman, D. J. Trans-activation of PPARα and PPARγ by structurally diverse environmental chemicals. Toxicol. Appl. Pharmacol. 161, 209–218 (1999).
23. Yang, C. & Chen, S. Two organochlorine pesticides, toxaphene and chlordane, are antagonists for estrogen-related receptor alpha-1 orphan receptor. Cancer Res. 59, 4519–4524 (1999).
24. Showell, C. & Conlon, F. L. The western clawed frog (Xenopus tropicalis): an emerging vertebrate model for developmental genetics and environmental toxicology. Cold Spring Harb.
Protoc. 2009, pdb.emo131 (2009); DOI:10.1101/pdb.emo131.
25. Hill, A. J., Teraoka, H., Heideman, W. & Peterson, R. E. Zebrafish as a model vertebrate for investigating chemical toxicity. Toxicol. Sci. 86, 6–19 (2005).
26. Ankley, G. T. & Johnson, R. D. Small fish models for identifying and assessing the effects of endocrine-disrupting chemicals. Inst. Lab Anim. Res. 45, 469–483 (2004).
27. Ciesielski, F., Rochel, N., Mitschler, A., Kouzmenko, A. & Moras, D. Structural investigation of the ligand binding domain of the zebrafish VDR in complexes with 1alpha,25(OH)2D3 and Gemini: purification, crystallization and preliminary X-ray diffraction analysis. J. Steroid Biochem. Mol. Biol. 89–90, 55–59 (2004).
28. Howarth, D. L. et al. Two farnesoid X receptor α isoforms in Japanese medaka (Oryzias latipes) are differentially activated in vitro. Aquat. Toxicol. 98, 245–255 (2010).
29. Kapsimali, M., Bourrat, F. & Vernier, P. Distribution of the orphan nuclear receptor Nurr1 in medaka (Oryzias latipes): cues to the definition of homologous cell groups in the vertebrate brain. J. Comp. Neurol. 431, 276–292 (2001).
30. Meredith, R. W. et al. Impacts of the cretaceous terrestrial revolution and KPg extinction on mammal diversification. Science 334, 521–524 (2011).
31. Whitfield, G. K. et al. Cloning of a functional vitamin D receptor from the lamprey (Petromyzon marinus), an ancient vertebrate lacking a calcified skeleton and teeth. Endocrinology 144, 2704–2716 (2003).
32. Georges, D. & Blieck, A. Rise of the earliest tetrapods: an early Devonian origin from marine environment. PLoS ONE 6, e22136 (2011); DOI:10.1371/journal.pone.0022136.
33. Szanto, A. et al. Retinoid X receptors: X-ploring their (patho) physiological functions. Cell Death Differ. 11, S126–S143 (2004).
34. Heyman, R. A. et al. 9-cis retinoic acid is a high affinity ligand for the retinoid X receptor. Cell 68, 397–406 (1992).
35. Mangelsdorf, D. J. et al.
Characterization of three RXR genes that mediate the action of 9-cis retinoic acid. Genes Dev. 6, 329–344 (1992).
36. Heldring, N. et al. Estrogen receptors: How do they signal and what are their targets. Physiol. Rev. 87, 905–931 (2007).
37. Matthews, J., Celius, T., Halgren, R. & Zacharewski, T. Differential estrogen receptor binding of estrogenic substances: a species comparison. J. Steroid Biochem. Mol. Biol. 74, 223–234 (2000).
38. Lee, C. H., Olson, P. & Evans, R. M. Minireview: lipid metabolism, metabolic diseases, and peroxisome proliferator-activated receptors. Endocrinology 144, 2201–2207 (2003).
39. Moore, L. B. et al. Pregnane X receptor (PXR), constitutive androstane receptor (CAR), and benzoate X receptor (BXR) define three pharmacologically distinct classes of nuclear receptors. Mol. Endocrinol. 16, 977–986 (2002).
40. Thomson, S. A., Baldwin, W. S., Wang, Y. H., Kwon, G. & Leblanc, G. A. Annotation, phylogenetics, and expression of the nuclear receptors in Daphnia pulex. BMC Genomics 10, 500 (2009); DOI:10.1186/1471-2164-10-500.
41. Vogeler, S., Galloway, T. S., Lyons, B. P. & Bean, T. P. The nuclear receptor gene family in the Pacific oyster, Crassostrea gigas, contains a novel subfamily group. BMC Genomics 15, 369 (2014); DOI:10.1186/1471-2164-15-369.
42. Wurtz, J. M. et al. A canonical structure for the ligand-binding domain of nuclear receptors. Nat. Struct. Biol. 3, 87–94 (1996).
43. Greschik, H. et al. Characterization of the DNA-binding and dimerization properties of the nuclear orphan receptor germ cell nuclear factor. Mol. Cell Biol. 19, 690–703 (1999).

Acknowledgments
This study was supported by the National Natural Science Foundation of China [41330637 and 41171385] and the 111 Project (B14001). Prof.
Giesy was supported by the Canada Research Chair program and a Visiting Distinguished Professorship in the Department of Biology and Chemistry and State Key Laboratory in Marine Pollution, City University of Hong Kong.

Author contributions
Y.B.Z. and J.Y.H. designed the experiments; Y.B.Z. and K.Z. performed the experiments and analyzed the data; Y.B.Z., K.Z., J.P.G. and J.Y.H. wrote the manuscript. All authors contributed to scientific discussions of the manuscript.

Additional information
Supplementary information accompanies this paper at http://www.nature.com/scientificreports
Competing financial interests: The authors declare no competing financial interests.
How to cite this article: Zhao, Y., Zhang, K., Giesy, J.P. & Hu, J. Families of Nuclear Receptors in Vertebrate Models: Characteristic and Comparative Toxicological Perspective. Sci. Rep. 5, 8554; DOI:10.1038/srep08554 (2015).

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Supplementary information for:
Families of Nuclear Receptors in Vertebrate Models: Characteristic and Comparative Toxicological Perspective
Yanbin Zhao1, Kun Zhang1, John P. Giesy2,3,4, and Jianying Hu1
1 MOE Laboratory for Earth Surface Processes, College of Urban and Environmental Sciences, Peking University, Beijing 100871, China
2 Department of Veterinary Biomedical Sciences and Toxicology Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
3 Department of Zoology, and Center for Integrative Toxicology, Michigan State University, East Lansing, MI, USA
4 Department of Biology & Chemistry and State Key Laboratory in Marine Pollution, City University of Hong Kong, Kowloon, Hong Kong, SAR, China

Address for Correspondence
Dr. Yanbin Zhao; Prof. Dr. Jianying Hu
College of Urban and Environmental Sciences
Peking University, Yi Fu Second Building
Beijing 100871 China
TEL & FAX: 86-10-62765520
Email: zhaoyb@pku.edu.cn; hujy@urban.pku.edu.cn

Figure S1. Phylogenetic analysis for 48 types of nuclear receptor genes in twelve vertebrate species. Numbers at branches indicate the bootstrap probabilities (≥90%) with 1,000 replicates. Neighbour-Joining trees of ClustalX-aligned full amino acid/DBD plus LBD sequences were constructed and displayed for the majority of NRs. For some trees, which displayed better topological structures in Maximum-Likelihood analysis, the ML trees were constructed instead.
[Figure S1 tree panels, NR0B1 to NR6A1: tip labels give species and gene (e.g. Human_0B1, Zebrafish_1A1b); numbers at branches are bootstrap values.]
Figure S2. Schematic diagram depicting the evolution of PXR and CAR in vertebrates.

Table S1. Details for nuclear receptor sequence searches in 12 model vertebrates.

                      Human       Mouse       Rat         Dolphin     Chicken     Duck        Turtle      Xenopus     Zebrafish   Medaka      Tilapia     Stickleback
BLASTn Hits           33849       23014       8312        2834        3712        2381        2922        2289        9788        3601        7586        571
BLASTp Hits           24967       12540       8896        2752        3761        4034        3421        3850        9230        4090        6630        268
Sum                   58816       35554       17208       5586        7473        6415        6343        6139        19018       7691        14216       839
After sortation       57          62          70          74          50          48          48          53          72          78          83          64
Verified by software  46          47          47          45          42          40          46          50          70          65          71          64
NR0B Subfamily        2           2           2           2           2           2           2           2           3           2           3           2
Final sets of NRs     48          49          49          47          44          42          48          52          73          67          74          66

Table S2. Sequence IDs for each nuclear receptor gene in the Ensembl database. Within each row, IDs follow the column order Human (ENSG), Mouse (ENSMUSG), Rat (ENSRNOG), Dolphin (ENSTTRG), Chicken (ENSGALG), Duck (ENSAPLG), Turtle (ENSPSIG), Xenopus (ENSXETG), Zebrafish (ENSDARG), Medaka (ENSORLG), Tilapia (ENSONIG), Stickleback (ENSGACG); species with no entry in a row are omitted. Additional rows under a gene are further entries for that gene.

NR1A1  ENSG00000126351 ENSMUSG00000058756 ENSRNOG00000009066 ENSTTRG00000016893 ENSGALG00000000270 ENSAPLG00000016001 ENSPSIG00000012754 ENSXETG00000024399 ENSDARG00000000151 ENSORLG00000016941 ENSONIG00000018247 ENSGACG00000003766
       ENSDARG00000052654 ENSORLG00000012005 ENSONIG00000006456 ENSGACG00000006540
NR1A2  ENSG00000151090 ENSMUSG00000021779 ENSRNOG00000006649 ENSTTRG00000001859 ENSGALG00000011294 ENSAPLG00000006081 ENSPSIG00000008182 ENSXETG00000003871 ENSDARG00000021163 ENSORLG00000008122 ENSONIG00000010312 ENSGACG00000007996
NR1B1  ENSG00000131759 ENSMUSG00000037992 ENSRNOG00000009972 ENSTTRG00000016901 ENSGALG00000005629 ENSAPLG00000006377 ENSPSIG00000002372 ENSXETG00000024390 ENSDARG00000056783 ENSORLG00000004373 ENSONIG00000019915 ENSGACG00000012955
       ENSDARG00000034893 ENSONIG00000006314 ENSGACG00000005297
NR1B2  ENSG00000077092 ENSMUSG00000017491 ENSRNOG00000024061 ENSTTRG00000010874 ENSGALG00000011298 ENSAPLG00000006432 ENSPSIG00000007930 ENSXETG00000007272 ENSORLG00000008502 ENSONIG00000010320 ENSGACG00000007999
       ENSORLG00000016394 ENSONIG00000006493
NR1B3  ENSG00000172819 ENSMUSG00000001288 ENSRNOG00000012499 ENSTTRG00000002778 ENSXETG00000012670 ENSDARG00000034117 ENSORLG00000015382 ENSONIG00000012223 ENSGACG00000009372
       ENSDARG00000054003 ENSORLG00000007861 ENSONIG00000019165 ENSGACG00000000612
NR1C1  ENSG00000186951 ENSMUSG00000022383 ENSRNOG00000021463 ENSTTRG00000004136 ENSGALG00000022985 ENSAPLG00000010641 ENSPSIG00000018221 ENSXETG00000023454 ENSDARG00000031777 ENSORLG00000002413 ENSONIG00000016715 ENSGACG00000018958
       ENSDARG00000054323 ENSORLG00000011091 ENSONIG00000008831 ENSGACG00000003703
NR1C2  ENSG00000112033 ENSMUSG00000002250 ENSRNOG00000000503 ENSTTRG00000009416 ENSGALG00000002588 ENSAPLG00000004751 ENSPSIG00000005889 ENSXETG00000015121 ENSDARG00000044525 ENSORLG00000006636 ENSONIG00000011871 ENSGACG00000008288
       ENSDARG00000009473
NR1C3  ENSG00000132170 ENSMUSG00000000440 ENSRNOG00000008839 ENSTTRG00000016565 ENSGALG00000004974 ENSAPLG00000009031 ENSPSIG00000011100 ENSXETG00000017422 ENSDARG00000031848 ENSORLG00000004432 ENSONIG00000014331 ENSGACG00000001665
NR1D1  ENSG00000126368 ENSMUSG00000020889 ENSRNOG00000009329 ENSTTRG00000016894 ENSPSIG00000014806 ENSXETG00000024397 ENSDARG00000033160 ENSONIG00000009283 ENSGACG00000009356
NR1D2  ENSG00000174738 ENSMUSG00000021775 ENSRNOG00000046912 ENSTTRG00000010829 ENSGALG00000011291 ENSAPLG00000005753 ENSPSIG00000008488 ENSXETG00000003869 ENSDARG00000003820 ENSORLG00000016431 ENSONIG00000008699 ENSGACG00000012958
       ENSDARG00000009594 ENSONIG00000010308 ENSGACG00000007986
NR1D4  ENSDARG00000031161 ENSORLG00000007837 ENSONIG00000012213 ENSGACG00000000614
       ENSDARG00000059370 ENSORLG00000015399 ENSONIG00000019164
NR1F1  ENSG00000069667 ENSMUSG00000032238 ENSRNOG00000027145 ENSTTRG00000007718 ENSGALG00000003759 ENSAPLG00000005866 ENSPSIG00000011314 ENSXETG00000021123 ENSDARG00000031768 ENSORLG00000007645 ENSONIG00000015289
       ENSDARG00000001910 ENSONIG00000015603
NR1F2  ENSG00000198963 ENSMUSG00000036192 ENSRNOG00000013413 ENSTTRG00000008387 ENSGALG00000015150 ENSAPLG00000007187 ENSPSIG00000005579 ENSXETG00000031251 ENSDARG00000033498 ENSORLG00000012441 ENSONIG00000010762 ENSGACG00000011556
       ENSXETG00000008148
NR1F3  ENSG00000143365 ENSMUSG00000028150 ENSRNOG00000046831 ENSTTRG00000003151 ENSGALG00000025988 ENSAPLG00000013051 ENSPSIG00000008995 ENSXETG00000002131 ENSDARG00000087195 ENSORLG00000009486 ENSONIG00000004686 ENSGACG00000012280
       ENSGALG00000001035 ENSAPLG00000011493 ENSPSIG00000016262 ENSDARG00000057231 ENSORLG00000003765 ENSONIG00000010247 ENSGACG00000015341
       ENSDARG00000017780 ENSORLG00000014886 ENSONIG00000006222
NR1H3  ENSG00000025434 ENSMUSG00000002108 ENSRNOG00000013172 ENSTTRG00000014149 ENSGALG00000008202 ENSAPLG00000010925 ENSPSIG00000010360 ENSXETG00000000307 ENSDARG00000043170 ENSORLG00000001286 ENSONIG00000005828 ENSGACG00000017167
NR1H2  ENSG00000131408 ENSMUSG00000060601 ENSRNOG00000019812 ENSTTRG00000002416
NR1H5  ENSMUSG00000048938 ENSRNOG00000023073 ENSGALG00000002170 ENSAPLG00000008338 ENSPSIG00000003828 ENSXETG00000021443 ENSDARG00000031046 ENSONIG00000009252 ENSGACG00000004938
NR1H4  ENSG00000012504 ENSMUSG00000047638 ENSRNOG00000007197 ENSTTRG00000016373 ENSGALG00000011594 ENSAPLG00000013289 ENSPSIG00000005774 ENSXETG00000030372 ENSDARG00000057741 ENSORLG00000011270 ENSONIG00000014678 ENSGACG00000011745
NR1I1  ENSG00000111424 ENSMUSG00000022479 ENSRNOG00000008574 ENSTTRG00000012578 ENSGALG00000026166 ENSAPLG00000005087 ENSPSIG00000018108 ENSXETG00000010658 ENSDARG00000043059 ENSORLG00000001063 ENSONIG00000009200 ENSGACG00000004763
       ENSDARG00000070721 ENSORLG00000016402 ENSONIG00000019378 ENSGACG00000007975
NR1I2  ENSG00000144852 ENSMUSG00000022809 ENSRNOG00000002906 ENSTTRG00000016650 ENSXETG00000018029 ENSDARG00000029766 ENSORLG00000017953 ENSONIG00000014385
NR1I3  ENSG00000143257 ENSMUSG00000005677 ENSRNOG00000003260 ENSTTRG00000009227 ENSGALG00000028624 ENSPSIG00000004437 ENSXETG00000031759
NR2A1  ENSG00000101076 ENSMUSG00000017950 ENSRNOG00000008895 ENSTTRG00000013004 ENSGALG00000004285 ENSAPLG00000008950 ENSPSIG00000012689 ENSXETG00000001775 ENSDARG00000021494 ENSORLG00000016380 ENSONIG00000016515 ENSGACG00000011485
NR2A3  ENSGALG00000015670 ENSAPLG00000011331 ENSPSIG00000017650 ENSXETG00000016389 ENSDARG00000012764 ENSONIG00000005911
NR2A2  ENSG00000164749 ENSMUSG00000017688 ENSRNOG00000008971 ENSTTRG00000003691 ENSGALG00000005708 ENSAPLG00000011794 ENSPSIG00000003756 ENSXETG00000017845 ENSDARG00000071565 ENSORLG00000006996 ENSONIG00000014490 ENSGACG00000002422
NR2B1  ENSG00000186350 ENSMUSG00000015846 ENSRNOG00000009446 ENSTTRG00000009492 ENSGALG00000002626 ENSAPLG00000013150 ENSPSIG00000011977 ENSXETG00000012733 ENSDARG00000057737 ENSORLG00000012155 ENSONIG00000013076 ENSGACG00000018189
       ENSDARG00000035127 ENSORLG00000016690
NR2B2  ENSG00000204231 ENSMUSG00000039656 ENSRNOG00000000464 ENSTTRG00000004291 ENSXETG00000020416 ENSDARG00000078954 ENSORLG00000006476 ENSONIG00000020007 ENSGACG00000000096
       ENSDARG00000002006 ENSORLG00000007020 ENSONIG00000002873 ENSGACG00000007982
NR2B3  ENSG00000143171 ENSMUSG00000015843 ENSRNOG00000004537 ENSTTRG00000003653 ENSGALG00000003406 ENSAPLG00000004831 ENSPSIG00000004871 ENSXETG00000004750 ENSDARG00000005593 ENSONIG00000002143 ENSGACG00000011685
       ENSDARG00000004697
NR2C1  ENSG00000120798 ENSMUSG00000005897 ENSRNOG00000006983 ENSTTRG00000016305 ENSGALG00000011327 ENSAPLG00000006253 ENSPSIG00000017190 ENSXETG00000023840 ENSDARG00000045527 ENSORLG00000004114 ENSONIG00000008566 ENSGACG00000010174
NR2C2  ENSG00000177463 ENSMUSG00000005893 ENSRNOG00000010536 ENSTTRG00000009876 ENSGALG00000008519 ENSAPLG00000007538 ENSPSIG00000008928 ENSXETG00000004817 ENSDARG00000042477 ENSORLG00000010877 ENSONIG00000017240 ENSGACG00000002941
NR2E1  ENSG00000112333 ENSMUSG00000019803 ENSRNOG00000050550 ENSTTRG00000008863 ENSGALG00000015305 ENSAPLG00000010675 ENSPSIG00000006035 ENSXETG00000014853 ENSDARG00000017107 ENSORLG00000013426 ENSONIG00000013281 ENSGACG00000008934
NR2E3  ENSG00000031544 ENSMUSG00000032292 ENSRNOG00000050690 ENSTTRG00000009410 ENSGALG00000002093 ENSPSIG00000017480 ENSXETG00000005219 ENSDARG00000045904 ENSORLG00000000011 ENSONIG00000007109 ENSGACG00000004739
       ENSORLG00000007175 ENSONIG00000015396
NR2F1  ENSG00000175745 ENSMUSG00000069171 ENSRNOG00000014795 ENSTTRG00000001519 ENSGALG00000027907 ENSPSIG00000009818 ENSXETG00000011594 ENSDARG00000052695 ENSORLG00000010191 ENSONIG00000011840 ENSGACG00000010385
       ENSPSIG00000010198 ENSDARG00000017168
NR2F2  ENSG00000185551 ENSMUSG00000030551 ENSRNOG00000010308 ENSGALG00000007000 ENSAPLG00000010629 ENSPSIG00000017164 ENSXETG00000022346 ENSDARG00000040926 ENSORLG00000008429 ENSONIG00000015133 ENSGACG00000013235
       ENSONIG00000003070 ENSGACG00000014846
NR2F5  ENSXETG00000011046 ENSDARG00000033172 ENSORLG00000016315 ENSONIG00000008594 ENSGACG00000013191
NR2F6  ENSG00000160113 ENSMUSG00000002393 ENSRNOG00000016892 ENSTTRG00000003132 ENSGALG00000027294 ENSAPLG00000003193 ENSPSIG00000013773 ENSXETG00000013531 ENSDARG00000003607 ENSORLG00000008749 ENSONIG00000010512 ENSGACG00000007766
       ENSDARG00000003165 ENSORLG00000008911 ENSONIG00000010104 ENSGACG00000015583
NR3A1  ENSG00000091831 ENSMUSG00000019768 ENSRNOG00000019358 ENSTTRG00000002996 ENSGALG00000012973 ENSAPLG00000004585 ENSPSIG00000004166 ENSXETG00000012364 ENSDARG00000004111 ENSORLG00000014514 ENSONIG00000013354 ENSGACG00000008711
NR3A2  ENSG00000140009 ENSMUSG00000021055 ENSRNOG00000005343 ENSTTRG00000000517 ENSGALG00000011801 ENSAPLG00000011895 ENSPSIG00000018210 ENSXETG00000007257 ENSDARG00000016454 ENSORLG00000017721 ENSONIG00000005633 ENSGACG00000007514
       ENSDARG00000034181 ENSORLG00000018012 ENSONIG00000001710 ENSGACG00000000213
NR3B1  ENSG00000173153 ENSMUSG00000024955 ENSRNOG00000021139 ENSTTRG00000010296 ENSPSIG00000016751 ENSXETG00000007211 ENSDARG00000069266 ENSORLG00000010624 ENSONIG00000001778 ENSGACG00000020287
NR3B2  ENSG00000119715 ENSMUSG00000021255 ENSRNOG00000010259 ENSTTRG00000001302 ENSGALG00000010365 ENSAPLG00000012470 ENSPSIG00000017916 ENSXETG00000013217 ENSDARG00000040151 ENSORLG00000016581 ENSONIG00000015282 ENSGACG00000010561
       ENSORLG00000009126 ENSONIG00000020192 ENSGACG00000007542
NR3B3  ENSG00000196482 ENSMUSG00000026610 ENSRNOG00000002593 ENSTTRG00000006004 ENSGALG00000009645 ENSAPLG00000005309 ENSPSIG00000005595 ENSXETG00000020932 ENSDARG00000004861 ENSORLG00000011528 ENSONIG00000000573 ENSGACG00000013426
       ENSXETG00000016948 ENSDARG00000011696 ENSORLG00000016819 ENSONIG00000017162 ENSGACG00000016275
NR3B4  ENSDARG00000015064 ENSONIG00000001134 ENSGACG00000004898
NR3C1  ENSG00000113580 ENSMUSG00000024431 ENSRNOG00000014096 ENSTTRG00000003260 ENSGALG00000007394 ENSAPLG00000007318 ENSPSIG00000015245 ENSXETG00000001879 ENSDARG00000025032 ENSORLG00000006022 ENSONIG00000017907 ENSGACG00000018209
       ENSORLG00000001565 ENSONIG00000008483 ENSGACG00000020725
NR3C2  ENSG00000151623 ENSMUSG00000031618 ENSRNOG00000034007 ENSTTRG00000014440 ENSGALG00000010035 ENSAPLG00000015146 ENSPSIG00000006383 ENSXETG00000026061 ENSDARG00000037025 ENSORLG00000007530 ENSONIG00000010029 ENSGACG00000017193
NR3C3  ENSG00000082175 ENSMUSG00000031870 ENSRNOG00000006831 ENSTTRG00000000030 ENSGALG00000017195 ENSAPLG00000003887 ENSPSIG00000013654 ENSXETG00000005482 ENSDARG00000035966 ENSORLG00000002651 ENSGACG00000012162
NR3C4  ENSG00000169083 ENSMUSG00000046532 ENSRNOG00000005639 ENSTTRG00000004230 ENSGALG00000004596 ENSAPLG00000006566 ENSPSIG00000010176 ENSXETG00000005089 ENSDARG00000067976 ENSORLG00000008220 ENSONIG00000012854 ENSGACG00000018525
       ENSORLG00000009520 ENSONIG00000017538 ENSGACG00000020332
NR4A1  ENSG00000123358 ENSMUSG00000023034 ENSRNOG00000007607 ENSTTRG00000002817 ENSAPLG00000014123 ENSPSIG00000018018 ENSXETG00000000579 ENSDARG00000000796 ENSORLG00000015557 ENSONIG00000016717 ENSGACG00000010788
       ENSORLG00000015279 ENSONIG00000019260 ENSGACG00000000045
NR4A2  ENSG00000153234 ENSMUSG00000026826 ENSRNOG00000005600 ENSTTRG00000005740 ENSGALG00000012538 ENSAPLG00000012071 ENSPSIG00000008054 ENSXETG00000031753 ENSDARG00000017007 ENSORLG00000016692 ENSONIG00000008976 ENSGACG00000005831
       ENSXETG00000024016 ENSDARG00000044532 ENSORLG00000000050 ENSONIG00000012131
NR4A3  ENSG00000119508 ENSMUSG00000028341 ENSRNOG00000005964 ENSTTRG00000007458 ENSGALG00000013568 ENSAPLG00000011263 ENSPSIG00000012281 ENSDARG00000055854 ENSORLG00000008732 ENSONIG00000006026 ENSGACG00000009027
NR5A1  ENSG00000136931 ENSMUSG00000026751 ENSRNOG00000012682 ENSTTRG00000017390 ENSGALG00000001080 ENSAPLG00000004548 ENSPSIG00000006131 ENSXETG00000011456 ENSDARG00000017704 ENSORLG00000016486 ENSONIG00000020218 ENSGACG00000003539
       ENSDARG00000023362 ENSORLG00000013196 ENSGACG00000018317
NR5A2  ENSG00000116833 ENSMUSG00000026398 ENSRNOG00000000653 ENSTTRG00000003256 ENSGALG00000002182 ENSAPLG00000009302 ENSPSIG00000003632 ENSXETG00000000314 ENSDARG00000042556 ENSORLG00000006933 ENSONIG00000012517 ENSGACG00000008896
NR5A5  ENSDARG00000039116 ENSORLG00000006019 ENSONIG00000001686 ENSGACG00000009952
NR6A1  ENSG00000148200 ENSMUSG00000063972 ENSRNOG00000013232 ENSTTRG00000017391 ENSGALG00000001073 ENSAPLG00000004788 ENSPSIG00000006445 ENSXETG00000008578 ENSDARG00000018030 ENSORLG00000016492 ENSONIG00000020217 ENSGACG00000003560
       ENSDARG00000014480
NR0B1  ENSG00000169297 ENSMUSG00000025056 ENSRNOG00000003765 ENSTTRG00000013272 ENSGALG00000016287 ENSAPLG00000003894 ENSPSIG00000009740 ENSXETG00000015374 ENSDARG00000056541 ENSORLG00000011824 ENSONIG00000012111 ENSGACG00000002817
       ENSONIG00000006662
NR0B2  ENSG00000131910 ENSMUSG00000037583 ENSRNOG00000007229 ENSTTRG00000016680 ENSGALG00000000887 ENSAPLG00000010744 ENSPSIG00000017134 ENSXETG00000011771 ENSDARG00000044685 ENSORLG00000004442 ENSONIG00000006772 ENSGACG00000007198

Table S3. Genes with incomplete or missing DBD/LBD regions in the Ensembl database. Genes marked in red are those whose full sequences were retrieved from the NCBI/EMBL/DDBJ databases.

Species      Gene and related Ensembl ID
Human        -
Mouse        -
Rat          NR2E3 (ENSRNOG00000050690); NR3A1 (ENSRNOG00000019358); NR3C3 (ENSRNOG00000006831); NR5A2 (ENSRNOG00000000653)
Dolphin      NR1A1 (ENSTTRG00000016893); NR1B2 (ENSTTRG00000010874); NR1C1 (ENSTTRG00000004136); NR1F3 (ENSTTRG00000003151); NR1I2 (ENSTTRG00000016650); NR2A1 (ENSTTRG00000013004); NR2B1 (ENSTTRG00000009492); NR2B3 (ENSTTRG00000003653); NR2F6 (ENSTTRG00000003132); NR3A1 (ENSTTRG00000002996); NR3B2 (ENSTTRG00000001302); NR4A1 (ENSTTRG00000002817); NR4A3 (ENSTTRG00000007458)
Chicken      NR1B1 (ENSGALG00000005629); NR2F6 (ENSGALG00000027294)
Duck         NR1A1 (ENSAPLG00000016001); NR1B1 (ENSAPLG00000006377); NR1F2 (ENSAPLG00000007187); NR1F3b (ENSAPLG00000011493); NR1H4 (ENSAPLG00000013289); NR1I1 (ENSAPLG00000005087); NR2F6 (ENSAPLG00000003193); NR3A1 (ENSAPLG00000004585); NR3C3 (ENSAPLG00000003887); NR4A1 (ENSAPLG00000014123); NR4A2 (ENSAPLG00000012071); NR5A1 (ENSAPLG00000004548); NR6A1 (ENSAPLG00000004788); NR0B1 (ENSAPLG00000003894)
Turtle       NR1A1 (ENSPSIG00000012754); NR1B1 (ENSPSIG00000002372); NR1D1 (ENSPSIG00000014806); NR2A2 (ENSPSIG00000003756); NR2E3 (ENSPSIG00000017480); NR2F1b (ENSPSIG00000010198); NR3B1 (ENSPSIG00000016751); NR4A2 (ENSPSIG00000008054); NR5A1 (ENSPSIG00000006131); NR6A1 (ENSPSIG00000006445)
Xenopus      NR1C2 (ENSXETG00000015121); NR1C3 (ENSXETG00000017422); NR1F3 (ENSXETG00000002131); NR2A1 (ENSXETG00000001775); NR4A2b (ENSXETG00000024016)
Zebrafish    NR1C3 (ENSDARG00000031848); NR1I1 (ENSDARG00000043059)
Medaka       NR1B1 (ENSORLG00000004373); NR2A1 (ENSORLG00000016380); NR2F1 (ENSORLG00000010191); NR3C3 (ENSORLG00000002651); NR6A1 (ENSORLG00000016492)
Tilapia      -
Stickleback  -

Table S4. The vertebrate species used for NR1I1 (VDR) gene phylogenetic analysis.

Common name               Scientific name
Human                     Homo sapiens
Gibbon                    Nomascus leucogenys
Gorilla                   Gorilla gorilla gorilla
Macaque                   Macaca mulatta
Marmoset                  Callithrix jacchus
Bushbaby                  Otolemur garnettii
Cat                       Felis catus
Dog                       Canis lupus familiaris
Ferret                    Mustela putorius furo
Hedgehog                  Erinaceus europaeus
Rabbit                    Oryctolagus cuniculus
Dolphin                   Tursiops truncatus
Pig                       Sus scrofa
Opossum                   Monodelphis domestica
Cow                       Bos taurus
Sheep                     Ovis aries
Mouse                     Mus musculus
Rat                       Rattus norvegicus
Guinea Pig                Cavia porcellus
Squirrel                  Ictidomys tridecemlineatus
Flycatcher                Ficedula albicollis
Zebra Finch               Taeniopygia guttata
Duck                      Anas platyrhynchos
Chicken                   Gallus gallus
Turkey                    Meleagris gallopavo
Anole lizard              Anolis carolinensis
Chinese softshell turtle  Pelodiscus sinensis
Xenopus                   Xenopus tropicalis
Coelacanth                Latimeria chalumnae
Tilapia                   Oreochromis niloticus
Zebrafish                 Danio rerio
Tetraodon                 Tetraodon nigroviridis
Medaka                    Oryzias latipes
Platyfish                 Xiphophorus maculatus
Stickleback               Gasterosteus aculeatus