Hi Val,

Responses in red:

Thanks for this, it is very useful. Could you confirm a few things for me, to check I haven't misinterpreted? The situation is as I thought for pombe, but we are trying to find a GO solution which would allow queries to retrieve the functionally equivalent genes across species.

I guess you would have to call them elements, regions, features or something, rather than genes.

1. If I have interpreted correctly, for organisms with 'regional' centromeres

See below for expanded answers.

a) There is a distinct 'core region' in all organisms, ~yes
and the common defining feature of this region is that it is
b) not heterochromatic ~yes
c) enriched for CENP-A Yes
d) Is the region of kinetochore attachment
Region of kinetochore assembly and microtubule attachment. Yes.
e) As you move away from the 'central core' region the amount of CENP-A decreases and the DNA is assembled into heterochromatin (=pericentromeric heterochromatin / centromeric heterochromatin)
~yes. This is the picture in e.g. human, Drosophila (but the term ‘central core’ is not used).
--In pombe this region is not repetitive but in other organisms, including human, it is
Yes. In humans the whole of the centromere is repetitive (CENP-A and heterochromatin). In pombe the central core is ‘unique’ and the outer repeats are repetitive (but far, far larger than human alpha satellite repeats).

Q Are a-e above correct?

Mostly. Parts c and d are true (with the modification I wrote above). Part b is correct as far as we know – let’s say it’s our understanding. But when DNA is repetitive it can be pretty hard to assess – there could be mixing, but basically b is correct. Some organisms don’t have centromeric heterochromatin at all (so, whilst it’s true that the CENP-A domain doesn’t have heterochromatin, it’s also not relevant in those organisms). Part a – it is true that there is a CENP-A domain in all organisms (except Trypanosomes), but it wouldn’t necessarily be called a core region.
‘Central region’ / ‘core region’ are sometimes found, but it’s by no means universal. See below.

Q Does this 'central' region have any names in higher eukaryotes to distinguish this region from the "centromeric/pericentric heterochromatin"?

Probably the most common phrase you would find is something along the lines of ‘central CENP-A-containing region’. Although many/most regional centromeres have a central CENP-A-containing region flanked by heterochromatin, this isn’t true for all of them – for instance, Candida albicans centromeres have CENP-A domains spanning a few kb but no classical centromeric heterochromatin (C. albicans lacks an H3K9 histone methyltransferase). The main thing is that although the proteins found at centromeres are conserved, the DNA is completely different between species.

For a pictorial representation of DNA elements and chromatin domains see the Figure (and extensive figure legend) in our review - attached. It’s a few years old now, but the story is basically the same.

2. For GO terms would the following work for all species with modular centromeres?

(BTW, ‘modular centromeres’ is not a generally used term. You are using it to mean complex centromeres with distinct domains. ‘Regional centromeres’ is an often-used term, but it is perhaps not all that helpful as there is great variation between regional centromeres of different species.)

If we defined:
i) A generic "centromere complex" term which included the specialised chromatin with CENP-A assembled and the kinetochore proteins
ii) A 'sibling' term "centromeric/pericentric heterochromatin", and
iii) A broader 'parent' term "centromeric region" which included both the core domain associated proteins of the centromere complex, and the heterochromatic region?
This would allow people to query on "centromere complex" for the proteins of the central chromatin kinetochore platform and the kinetochore itself, on "pericentric heterochromatin" if they are only interested in the proteins at the heterochromatic/pericentric regions, or on the more general "centromeric region" if interested in all the proteins which are localised to either the kinetochore-associated region or the entire region associated with centromere sequence.

I’m no expert on GO terms. I think the basic structure of this tree (is it called a tree?) is fine. See below for a few suggestions for the terms that would be most appropriate. The term ‘centromere complex’ is not really used and might be confusing – it might be expected to include DNA.

Centromeric proteins
    “Kinetochore Complex” and “Centromeric Chromatin”
    (Peri)centromeric Heterochromatin

Centromeric proteins
    Kinetochore Proteins and CENP-A Chromatin
    (Peri)centromeric Heterochromatin

Or some variation on these.

A few more things to take into consideration:

Are you planning on including histone marks/modifications in your definitions? I guess you wouldn’t normally do so, but they are key features of centromeres. CENP-A is easy – it’s a distinct protein. But the key feature of heterochromatin is methylation (di and/or tri) on histone H3 lysine 9.

As I mentioned previously, there is some histone H3 in the central domain in pombe – it’s not all CENP-A. This is the case in other organisms as well – there is H3 within the CENP-A domain, and there are a number of publications on the composition of centromeric chromatin (i.e. the non-heterochromatic part), especially in vertebrates & flies, on the post-translational modifications of the histones within it (especially H3), and on its function (e.g. association of CENP-T etc). I can supply you with these if you want them.

In addition, other histone variants may be present / absent within centromeric regions. For instance, H2A.Z is low in the pombe central domain.
So, it’s all a bit complicated and may differ between organisms (or we don’t have the full picture in all organisms).

We may think of the centromere in cartoon, linear form, but actually in cells the centromeric DNA, heterochromatin and kinetochore form a three-dimensional structure. There are several papers on this topic. It also means that the term ‘outer’, for instance, might be used to mean the outer edges of a linear structure (which would be expected to be heterochromatic), or it might be used to mean the outer surface of the kinetochore that is CENP-A-containing, contains kinetochore proteins and connects to the microtubules. See Fig 2 of the Stellfox review I’ve attached.

3. Note that we will use the Sequence Ontology to define the individual repeats in fission yeast: the term "regional centromere", which is defined as a sequence region:
http://sequenceontology.org/browser/current_svn/term/SO:0001795
and the children of this can be used to capture the pombe-specific repeats dg/dh etc.

This sounds fine.

What we are doing in GO is different: here we are aiming to capture the recognised cellular components (protein and protein-DNA complexes) rather than the sequence features per se. We will be able to use both in annotations, depending on which is most appropriate for the experiment.

I am also CCing GO collaborators David Hill (GO editor at MGI) and Anna Melidoni (human GO curator at UCL) as we are trying to resolve some tricky issues with the existing GO graph. They may have a couple more questions to make the solution work for higher eukaryotes.

Many thanks, this was really helpful,
Val

On 03/12/2013 10:57, Alison L Pidoux wrote:

Hi Val,

Apologies for my delay in replying.

Which precise regions (with relation to the repeats) are people referring to when they use the phrases
1. Centromeric heterochromatin
2. Pericentric heterochromatin
(are these 2 terms interchangeable or is there a subtle difference?)

They're talking about the same thing.
So, historically we said 'centromeric heterochromatin' in pombe, or 'outer repeat heterochromatin'. 'Pericentromeric heterochromatin' was used to describe the heterochromatin at/adjacent to centromeres in mammalian cells etc. However, it began to be used in pombe (I believe partially to emphasise the similarity between the centromeres of pombe and multicellular eukaryotes). So, it's the same thing; they are used interchangeably.

3. Is the "centromere core region" heterochromatic? I have seen it described as specialised chromatin, but not as heterochromatin.

The central core region is NOT heterochromatic. The central core is packaged in CENP-A chromatin, which is entirely different from euchromatin and heterochromatin. The main defining feature of the central core region (or central domain, as Robin prefers) is that (most) histone H3 is replaced by the histone H3 variant CENP-A. The CENP-A chromatin forms the platform upon which the kinetochore is assembled. This is the case in all eukaryotes (except Trypanosomes) - CENP-A chromatin is assembled at centromeres and the kinetochore is built upon it (in other organisms it's not called the central core/domain/region).

4. When people refer to the "centromere", is the "pericentric heterochromatin" included in the definition? (I guess this is partially dependent on the answers to 1 & 2.)

What I mean by centromere in pombe is the sequence encompassing the central domain plus the outer repeats (i.e. the whole ~40 kb for cen1), with the CENP-A chromatin and kinetochore plus the heterochromatin on the outer repeats. So, yes, it would include the pericentromeric heterochromatin. I think that most people would use the term in this way for pombe.

5. Are there any species differences in the use of these terms?

Yes, I'd say it's variable. This is partly historical and partly due to the fact that centromeric sequences are not conserved between species, and the set-up of centromeres varies between species.
In human, for instance, centromeric DNA is composed of megabases of alpha-satellite (171 bp repeat). It appears that the 'good' repeats are towards the centre and have CENP-A, and as you move outwards the repeats are degraded and are assembled in heterochromatin (=pericentromeric heterochromatin). There isn't the defined domain structure that you find in pombe. Mammalian centromeres are far harder to analyse due to their highly repetitive nature.

Here is a general description of pombe centromeres:

The central domain is composed of the central core and the inner part of the innermost repeats (imr). The central domain is assembled in CENP-A chromatin, in which canonical H3 is replaced with CENP-A. The kinetochore (proteins such as Mis6, Cnp3 etc) is assembled on the CENP-A chromatin, and microtubules attach to the kinetochore. There is some H3 in the central domain (estimated to be around 10% of what you find elsewhere in the genome - but don't quote me on that). The function of the H3 in the central core is not known (but H3 is found at mammalian and Drosophila centromeres also, along with CENP-A). It is thought that CENP-A is in an octameric nucleosome that also contains two copies each of H4, H2A and H2B (this is a hugely controversial subject in centromere biology at present).

tRNAs are found in the imr repeats. See diagrams in attached ppt. The outer tRNAs of imr1 have been shown to have boundary function (Kristin Scott). So they form a boundary between the CENP-A/kinetochore domain and the heterochromatin domain of the outer repeats. There are also clusters of tRNAs at 5 of the 6 centromere extremities, suggesting a boundary function to prevent the spread of heterochromatin from the outer repeats to the euchromatin. There are also IRC elements at the edges of centromeres which have boundary function (Grewal etc).

The outer repeats (dg/dh or K, depending on whether you use Yanagida's or Clarke's nomenclature) are assembled in RNAi-directed heterochromatin.
Histones are underacetylated and methylated on histone H3 lysine 9 (dependent on Clr4).

Functions of the domains:

Central domain: assembly of CENP-A chromatin. Site of kinetochore assembly.

Outer repeats: Heterochromatin. Recruits a high density of cohesin. This promotes bi-orientation of centromeres on the spindle, and prevents merotely (because pombe has multiple microtubules per centromere, MT-binding sites must be co-ordinated so they all face the same way - failure causes merotely and lagging chromosomes).

In order to have any segregation function you must have a kinetochore. In order to have highly accurate segregation function you need the functions provided by the centromeric heterochromatin. (This is in pombe; the role of pericentromeric heterochromatin is less clear-cut in multicellular organisms.) In the absence of centromeric heterochromatin there is premature separation of sister centromeres, lagging chromosomes, etc. Segregation does not entirely fail because the chromosome arms have cohesion (non-heterochromatin directed).

We have also shown that heterochromatin is required for establishment but not maintenance of CENP-A chromatin.

The centromeres of Schizosaccharomyces are not conserved. No homology found between e.g. pombe and octosporus. There may be some similarity in domain organisation - but that's in progress.

Anyway, hope that helps. I've included a variety of cartoons on pombe centromeres which I hope will help in clarifying things. Let me know if you have any further questions.

Cheers,
Alison

Many thanks if you can help
Best wishes,
Val