Comments received and incorporated in specification - 29.06.2011
=====================================================

From Jim (on v0.0):
------------------------
Many corrections and suggestions, received over fax and discussed over phone, included in document v0.1.

From Sam (on v0.1 and v0.4):
-------------------------------------
VME addressing discussion (Sam, Ian, Steve, Murrough, Bruce and others).
Many corrections were made directly in the document.

From Uli (on v0.2):
----------------------
General issues:

I believe we should make a statement about parallel termination of the 400 backplane links on the CMM++. I have always been a bit reluctant to do parallel termination (though it is available on the BLT). However, if we think we do not need it, we should explicitly state that in the document, so as to make sure everybody knows what we agree on here. Just using the (always available) FPGA internal termination seems to be a risk wrt the thermal budget of the FPGA.
--> stated in the document (4.3, last paragraph)

We should also clearly state what frequency the module is meant to run off. For the GOLD we have decided not to support 40.00 MHz operation, so as to allow a 40.08 MHz crystal-based GTX receive reference clock to be used for both LHC-clock and crystal-clock based operation. If, for reasons of compatibility with the _old_ (TTCvx based) test rigs, 40.00 MHz were required, be aware that those data cannot be received on the GOLD / TP.
--> included (4.9)

I would suggest we spend a few moments reasoning about a single-FPGA vs. dual-FPGA approach. For a single FPGA the parallel lines JUST fit, and that is true only if you are using single-ended <--> differential converters external to the FPGA. Since you are, however, talking about possibly increasing the data rate on the legacy LVDS interfaces considerably, that might be a sub-optimal approach wrt signal integrity. I wouldn't mind seeing a separate system-merger FPGA (possibly a cheap Spartan-6) that is in the data path for legacy mode only and therefore doesn't impact TP-mode latency.
--> included (4.1, first paragraph)

CAN-based monitoring might be required on the TP as well (tbd). A CAN daughter module might in this case be a common item. Please note that there are architectural differences between the UK approach and the JEM scheme. While the UK modules try to make use of the Fujitsu analogue resources (I believe), the JEM uses the Fujitsu basically as a CAN-to-I2C bridge only, with all monitoring done via I2C sensors. This scheme requires only a very small number of lines to be routed between a daughter card and the main board.
--> included (4.7)

You mention the need to operate the CMM++ off the existing online software. I would dare to disagree. We definitely do not want to change the online software again and again, but at some stage we need to modify it anyway. Therefore, if the human resources were available on the software end, one could well write code supporting the CMM++, in its initial and later incarnations, at a rather early stage. In this way the CMM++ could be controlled in a consistent way across all iterations of the firmware. A single branch in the software to distinguish CMM from CMM++ probably wouldn't be a disaster. But this is an issue we should really discuss with our online software colleagues.
--> agree

Per section:

3.3 upgraded CTP: well, if you know when it will be available...
--> agree

4. In the first section and in fig. 1 / fig. 2 you are talking about a TCM.
Not sure whether you might actually wish to refer to DCS here? Or are you referring to the TTCdec/TTCrx?
--> clarified in the diagrams and in 4.7

4.3 No, we do not want to eliminate the crystal clock. DAQ/ROI will be driven off the 40.00 MHz crystal in the future as well.
--> corrected

4.4 If we want to go for 320 Mb/s we might be better off using differential signalling on the FPGA; see above.
--> (for the rear transition module) - corrected

6.1 Do we like that timeline? The production modules will come in rather late wrt the Phase-0 shutdown (according to current planning). Have we got a chance to speed things up?
--> will correct the schedule according to the overall upgrade schedule

6.2 We should specify the design tool very soon, I suggest. Any MSU preferences? Check with RAL whether the Cadence design files of the CMM are fully available, or whether there might be any intellectual property issues with any of the components coming from RAL CAD libraries.
--> agree, we are in contact with RAL

6.3 You mention me as the energy firmware contact. Well, I do not object to being listed there, but I would like to point out that, starting from some initial Mainz-supplied code, RAL has developed what is currently in there.
--> corrected

6.4 You are talking about "full hardware simulation". For clarification: the data should in fact be simulated bit for bit. We need to exactly match the binary representation of the real data. There is one thing we don't have to do in detail: a full timing simulation. However, in a fully synchronous system a timing simulation is of limited use anyway. The combination of behavioural simulation plus a properly constrained clock should be sufficient (I know that some people would tend to object, but I am still waiting to see an example where a properly constrained synchronous system doesn't match the behavioural simulation).
--> agree

From Uli (on v0.3):
----------------------
p.6 "The several parts of the CMM (e.g. – CPLD VME interface, TCM interface, Xilinx System ACE) – both schematics and firmware – can be reused to speed up the development." Just for clarification (this need not be documented in any publicly accessible document if that is considered problematic, but the reviewers would definitely wish to understand it): Is there agreement that RAL provides Cadence schematics and libraries of the components used? Are you going to use Cadence? Or does this phrase just mean that you are recycling the .pdf/paper copies?
--> Ian can provide design files to MSU. I will use Cadence at CERN. I think Dan will use Mentor at MSU.

p.7 "Therefore, the connectors for the CMX optical links may not fit into the existing CMM front panel and may be placed on the board away from the front panel." I would guess this means the use of mid-board transceivers with pigtails. However, space for the MTP/MPO feed-through connectors needs to be provided anyway. I would like to point out that it is important to provide plenty of spare connectivity on the CMX. If required, space must be made available on the front panel. The Fujitsu RS232 needn't necessarily be routed via a Sub-D9; maybe smaller connectors exist. And for the SystemACE, one should make sure that the long-planned in-situ update of the flash cards is developed into a reliable tool. If it works 100%, then there is no need to mount the flash card on the front panel of the production version of the CMX! Technically simple approach: two CF connector footprints, one near the front panel and one in a recessed position. Assembly where suitable.
The MPO feed-through is mechanically connected to the front panel only and isn't affected by an unused CF connector footprint on the PCB. Also, one should seriously consider MTP/MPO as a replacement for the existing SFPs.
--> OK! Thank you, we will use your suggestions during board design!

Bottom of p.8: Here you suggest that we do not need a lot of output bandwidth. In fact, further down (p.14) you go into some more detail and talk about data replication to more than one TP module. That's important, since parallelising the processing could possibly allow for a wider choice of algorithms *without* compromising latency. There is no need to wire up everything from the beginning, but opto sockets and front panel space should be made available.
--> It's a good point! We shall foresee a place for the extra optical transmitters and connectors needed.

p.9, clocking scheme: I would guess that this scheme requires a lot of global or regional resources for the incoming data. While regional buffers should generally do for latching parallel data in, I would suspect that the clock recovery scheme described here might actually need global buffers due to the dedicated routing between MMCMs and global buffers. On the other hand, plenty of global resources are required if we want to run the MGT links in low-latency mode. I suspect that with the currently existing devices we might run out of resources and have to give up on the idea of using phase-alignment mode for latency reduction. We would rather have to use clock-correction mode instead, and there someone would have to invest some effort in tuning the scheme towards low latencies. Clock-correction mode, as it is available out of the box, uses deep buffers and yields horrible latency figures afaik.
--> OK, thanks! We will look carefully into it during the engineering design.

4.5 Since you are referring to 72 MGT links per device here, you are including the 10 Gb links. Therefore it would make sense to refer to 10 Gb/s capable transceivers. They are in fact available from Avago.
--> Added in the text.

From Uli (on v0.5):
----------------------
As outlined in the GOLD documents, the GOLD will always run off a 40.08 MHz clock, whether connected to the real LHC bunch clock or to a local crystal clock. Even though I believe this doesn't impose any limitations on the CMX, I would nevertheless like to understand what you are planning local-clock wise. When running in USA15, there will obviously be a 40.08 MHz clock available through the TTC system. When running either CMX or GOLD standalone, the exact frequency doesn't matter. Running just CMX and GOLD together in the test rig, we would have to run with a 40.08 MHz TTCvi/TTCvx. Therefore the local clocks on both our modules should not matter; we just need to make sure that the TTC system is operating at 40.08 rather than 40.00 MHz. I do not know whether it matters for the CMX design what local clocks are used; however, it might be worth spelling it out in the specification. I believe we have already agreed that for full backward compatibility the DAQ and ROI links are going to run at 40.00 MHz. That is in fact the only use for that frequency on the GOLD; all real-time paths always run at 40.08 MHz, as pointed out above.
--> We will follow the GOLD approach: run the CMX from a 40.08 MHz clock (TTC or local) and use the 40.00 MHz clock for the DAQ and ROI links only.
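As a minimal sketch of the two clock domains agreed above (assuming only the numbers quoted in these comments: the 40.08 MHz LHC bunch clock for the real-time paths and the 40.00 MHz crystal for the DAQ/ROI readout links), the following Python snippet just makes the periods and the frequency offset explicit; it is illustrative bookkeeping, not part of the specification.

# Clock-domain bookkeeping for the CMX, using only the figures quoted above:
# 40.08 MHz = nominal LHC bunch clock (real-time paths), 40.00 MHz = local
# crystal used only for the DAQ / ROI G-link readout path.

LHC_BUNCH_CLOCK_MHZ = 40.08
READOUT_CLOCK_MHZ   = 40.00

def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1e3 / freq_mhz

if __name__ == "__main__":
    print(f"bunch-crossing period : {period_ns(LHC_BUNCH_CLOCK_MHZ):.3f} ns")  # ~24.95 ns (the '25 ns' of the text)
    print(f"readout clock period  : {period_ns(READOUT_CLOCK_MHZ):.3f} ns")    # 25.000 ns
    # Relative offset between the two domains (~0.2%): the reason data generated
    # from a 40.00 MHz reference cannot be received on the GOLD / TP.
    print(f"relative offset       : {(LHC_BUNCH_CLOCK_MHZ / READOUT_CLOCK_MHZ - 1) * 100:.2f} %")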
From Ian (on v0.2):
-----------------------
General comments:

There is an FPGA on the current CMM that isn't mentioned in your text or shown in your block diagrams: the I2C FPGA. This is a small device, an XCV100E, that implements a VME interface to the various control and status pins of the TTCrx (I2C, Brcst, etc). For the CMM++ you'll need to keep this functionality. Incorporating it into your main FPGA would require ~30 extra pins, so I assume you'll also want to use a small, separate FPGA. (Porting the firmware for this FPGA should be straightforward, by the way.)
--> agree, will be included in the technical design

Latency -- this is mentioned as a constraint in a few places but no figures are given. Is there a target latency (or latencies for the various modes)?
--> added in 5.5

Re figures 1 & 2: As mentioned above, you need to add the I2C FPGA (although you might consider giving it a better name, such as the TTC control FPGA). Alternatively, you could re-label the existing "VME CPLD" box as "VME Interface", so that this could notionally include this FPGA.
--> labeled as "VME interface"

There is a block here labeled "TCM", which I think should at least be labeled "TCM interface". If you've got room in the box, you might consider expanding this to "TCM interface (TTC + CANBus)" to make explicit the two distinct functions that comprise this conceptual box (and which you describe in the text).
--> labeled as "TCM interface (TTC + CANBus)"

There is a grey link between the TTCrx and the VME CPLD that I suspect people will interpret as the real-time TTC path (i.e., In and In_b). These in fact arrive at the TTCrx from the TCM interface. There is a slow-control path from the VME CPLD, via the I2C FPGA, to the TTCrx, but I don't think that is what you were intending to show with this line.
--> corrected

The VME paths need some corrections:
- add a path from the CPLD via the I2C FPGA to the TTCrx;
- add a path to the System ACE;
- remove the path to the G-links.
Alternatively, you might like to consider removing all the VME paths, to avoid complicating your diagram.
--> corrected

Per section:

Section 4.4: Are the RTM-RTM cables to be the same as the CMM++-CTP cables? This is how the current CMM is designed, so that we can change the connectivity for diagnostic purposes. If you retain this functionality, does it limit your bandwidth? That is, are SCSI-3 cables capable of carrying 320 MHz?
--> The same cables will be used. Corrected the bandwidth.

Section 4.7, FPGA configuration: I suggest you remove the first scenario, where you implement multiple modes of operation in a single design. I can't imagine the required modes of operation (backwards compatible, test, etc.) changing at such a rate that this will be required, so it looks like an unnecessary complication. I think you'll have enough work to do without implementing this, and if you put it in the spec, sooner or later somebody will ask for it...
--> agree and corrected (now in 4.8)

Section 6.3, Firmware development: Please list me as the contact for the CMM Energy firmware. The core of this firmware was originally developed at Mainz by Andrea Dahlhoff. However, she left the collaboration many years ago and since then I have been supporting it. In fact, after many years of modifications almost none of her original code remains.
--> done

Appendix B, Data Formats: You talk about many modes of operation in the spec: Backwards Compatible, Test, etc.
For clarity, I suggest you say explicitly which mode these formats are for ("Data Source for TP", I believe).
--> clarified

More generally about data formats: are we happy to converge on these formats now? That is, are we happy with the physics case? This isn't a question for the spec to address, but I ask it more generally.
--> These are the latest proposed formats.

Furthermore, do we have any conception of what a format for the Test mode might look like? Some half-way house between the current and upgrade formats?
--> Formats for the test mode (and the test mode itself) are under discussion.

From Dan (on v0.4):
------------------------
The new and technically challenging part of the CMX card is the large number of high-speed fiber optic links on it. Even though this will be a "functional" review of the CMX card, I expect that the review committee may look closely at this aspect of the card.
--> Yes, this is the most challenging part. We also have to use the results and experience of others...

Will the committee ask if the designers of the TP and the designers of the CMX have worked together to "iron out" all the details and agree on a common strategy for these high-speed optical links? For example, are we certain that SNAP 12 at 6 Gbps is the best way to build these cards?
--> Uli Schäfer from Mainz (TP designer) is a member of the review panel, so he knows the answer already. We are in close contact with him and will define all technical points during the engineering phase following the review. He actually proposes to use 12-fiber Avago transceivers, and we will define and use a common design for CMX/TP.

How many links do we actually propose for the CMX card? Should there be a table of the number of required links vs. mode of operation? This table would summarize the information that you present on pages 13, 14, and 15.
--> A table will certainly help; we have to fit it in somewhere. The number of high-speed fibers in the backward compatible mode is zero. In the upgrade mode the number of transmitters is 12 (one optical 12-fiber ribbon link) in the case where all backplane data are sent to the TP without internal CMX processing (as a reminder, there are up to 72 transceivers per FPGA). If there is internal data processing in the CMX, the data volume can be reduced by a factor of 2-3 (Sam's estimate), but we will probably still need one 12-fiber optical transmitter. The remaining transmitters can be used to duplicate (fan out) the data to multiple destinations (multiple TP slices). If we implement the standalone mode, we can also use the remaining transmitters to duplicate (fan out) the data to multiple destinations - in this case CMX modules. In addition, in this mode we have to implement optical receivers in the CMX module for the incoming data. The maximum number of CMX modules is 12 - this is an upper limit on the number of input/output optical 12-fiber ribbon links. With internal data processing in the CMX we may need, say, 6 x 11 fibers to send data from one CMX to the other 11 CMX modules. An additional passive interconnect card would regroup the output fibers from the CMX modules into fiber ribbons for the inputs of the CMX/TP modules.

The number of links depends on the link speed, so should the above table also show the required Gbps for the different modes of operation?
--> I agree, but this shall also be agreed with the TP; the optical data transfer will be the area of discussions/decisions in the coming months.
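As a rough illustration of the requested "links vs. mode of operation" table, the short Python sketch below tabulates the fiber counts quoted in the answer above. The 6 Gb/s per-fiber rate is the SNAP 12 figure from Dan's question and is used here only as an assumption; the mode names and counts are not a decision, just the numbers mentioned in these comments.

# Sketch of the "links vs. mode" table, using only numbers quoted above.
LINE_RATE_GBPS = 6.0          # assumed per-fiber rate (SNAP 12 class), not a decision
FIBERS_PER_RIBBON = 12
MAX_CMX_MODULES = 12

modes = {
    # mode                      (tx fibers, rx fibers)
    "backward compatible":      (0, 0),
    "upgrade, no processing":   (FIBERS_PER_RIBBON, 0),   # one ribbon to the TP
    "upgrade, with processing": (FIBERS_PER_RIBBON, 0),   # factor 2-3 reduction, still one ribbon
    "standalone (CMX/TP)":      (6 * (MAX_CMX_MODULES - 1),   # e.g. 6 fibers to each of the other 11 CMX
                                 6 * (MAX_CMX_MODULES - 1)),
}

print(f"{'mode':28s}{'Tx':>5s}{'Rx':>5s}{'Tx Gb/s':>10s}")
for name, (tx, rx) in modes.items():
    print(f"{name:28s}{tx:5d}{rx:5d}{tx * LINE_RATE_GBPS:10.1f}")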
Page 11, section 5.5 "Optical Links to TP and from other CMXs" mentions a potential of 72 FPGA transceivers, 72 optical receivers, and 72 optical transmitters for this high-speed optical link function. It's maybe a detail, but from these high-speed optical links we need to subtract the FPGA's GTX transceivers that are needed to implement the G-Link output function.
--> Yes, I will add this note. However, only 1-3 transmitters are needed for the G-Link. In the upgrade mode there are plenty of transmitters. In the standalone mode there will be internal data processing in the CMX and we will also have quite a few free transmitters.

It would be a full-time job to keep up with the fiber optic transmitter/receiver market and I certainly have not done so. If the reviewers ask, do we need to be ready to talk about this? How modern is SNAP 12? What fraction of SNAP 12 products operate at 6 Gbps vs 2.5 or 3.3 Gbps? What other high-speed, high-density optical standards did we look at? Are we in contact with other current CMS, ATLAS, CERN, and FERMI projects currently designing in high-speed optical links?
--> As I've said, we didn't look into the optical link implementation ourselves, mostly relying on the results of the GOLD (Generic Optical Link Demonstrator) study as part of the L1Calo trigger collaboration; the decision shall be taken together with the TP. Avago is the other possibility vs. SNAP12. In ATLAS, Mainz and RAL (L1Calo) are the groups we are in contact with, and we are following the developments in the new calorimeter ROD design. I also try to follow high-speed link development in the CMS trigger. As far as I know there is no CERN development of a high-speed optical link which satisfies our requirements; they concentrate on radiation-hard detector link studies.

The reviewers may ask, "How and when will you decide how many high-speed optical links to put on the CMX card?"
--> Up to now we have assumed that we will route all 72 in/out links on the board and lay out the connections for the transmitters/receivers, but we will not equip the board with the optical parts until they are really needed. The final decision shall be taken together with the TP and the TP development/commissioning schedule, the main question being the need for the standalone mode of the CMX.

Do we need any backup slides to prove that we have thought enough about the high-speed optical links, so that we are ready to start detailed PCB layout in 6 months? This is the critical part of this whole card. Everything else on the CMX has already been done before on the CMM card.
--> We are aware of other activities:
RAL (High speed Demo): https://indico.cern.ch/getFile.py/access?contribId=4&sessionId=0&resId=1&materialId=slides&confId=126214
Mainz (GOLD): https://indico.cern.ch/getFile.py/access?contribId=12&sessionId=1&resId=0&materialId=slides&confId=120126
In order to be ready to start detailed PCB layout in 6 months we either need to implement our own prototype or follow these activities more closely, or better, participate in them somehow - how to organize such participation can be discussed among MSU/RAL/Mainz.

An aspect of the CMX that may be difficult for the reviewers to understand is that, if the CMX is also used as a TP, then we currently plan to use a given single CMX card to function both as a CMM with the CMX high-speed output function and as the TP.
--> It's certainly a complication that we combine two functionalities in a single module. But remember that the current CMM also combines "crate" and "system" functionalities in the same physical module.
In my view it is somewhat similar (but maybe not of the same complexity).

This is very different from using a separate CMX card to function as a TP. Firmware management will be more difficult. A change to the TP firmware will require re-certification of the firmware that implements the CMM function.
--> I believe that we can "freeze" the CMM (CMX) part of the FPGA place-and-route in the design, so that its layout and timing characteristics will not change after modification of the TP part; the Xilinx tools should allow for this...

Testing will be more difficult, especially online testing. If the TP were a separate CMX card, then for testing it could run online, receiving the full online data stream but not having its output used for anything except checks to verify the correct operation of the TP.
--> I agree that the testing will be more complicated. I guess that for the initial tests we can use one CMX as a CMX and the other (in another crate) as the TP, in order to check the CMX and TP parts of the design separately. In the final system, of course, the CMX/TP functions will be merged in the same module. A real TP, if built, would require a new separate crate.

Will the review panel wonder why we don't simplify matters for ourselves and make some arrangement to provide space for separate CMX card(s) to run as the TP?
--> There are several points to mention: the TP is planned to be implemented in an ATCA crate and the CMX in a 9U VME mechanical crate with a custom backplane, so the CMX will not fit in the TP crate. In the current system, the CMMs occupy designated slots in the crates and can't be used in the other slots (as far as I understand the system), even though there are a few spare slots in the L1Calo crates. Therefore, having separate CMX cards will probably not help...

Do you need a backup slide to say why it is impossible to have a separate CMX card for the TP? Will all of us really be happy living with a single CMX card doing both the CMM function and the TP function, especially if there is a lot of TP algorithm development?
--> I said above why it's impossible. As for the happiness :-) - the aim of the Phase 1 upgrade is CMX+TP. The standalone mode of the CMX (CMX/TP) is a backup solution which may not happen if the TP comes in time (that's what Uli is aiming for). Therefore, the CMX/TP development will be done with the available hardware (CMX) and the firmware resources remaining in the FPGA after the CMX upgrade mode design is done. What is possible to achieve with this combination will be done. But we can't do everything - otherwise we would be trying to develop another TP in a not very optimal way.

Perhaps this is just part of an overall aspect that the review committee may find confusing, but why don't we as a group know whether there will be a real TP, or whether the CMX will also need to provide the TP function?
--> If we look at the planning presented during the Cambridge meeting (slide 6), https://indico.cern.ch/getFile.py/access?contribId=13&sessionId=1&resId=1&materialId=slides&confId=120126 the prototype TP shall be ready in time - during the long shutdown, when the CMX will be commissioned. As I've said, the first priority for us is to implement the upgrade mode of the CMX.

If there were going to be a real TP, then the design of the CMX would be much simpler: it could be ready sooner, there would be less risk that it will not work, and it would cost less. Do we need to be ready to justify why we do not know if there will be a real TP?
--> Agree. Most of the reviewers are from L1Calo or the L1 trigger.
I don't think we have to justify "why we do not know" to them - we don't know more than L1Calo as a whole. I believe we can't make this decision in isolation. I agree that the schematics and the PCB layout will be much simpler with only the upgrade mode of the CMX and a single 12-fiber ribbon link.

Small points:

The pin count table on page 8 includes 42 pins for the G-Link output connections. I believe that these are not needed, because the G-Link encoding function will be done in the FPGA.
--> It is mentioned in the text: "One way to reduce this count is by emulating the readout G-links in the FPGA (see 4.3)."

We probably need to reserve some FPGA pins for the "management" of the high-speed optical link transmitters and receivers, i.e. the TXEN, TXDIS, RESET-, and FAULT- lines for the transmitters and the RXEN, ENSD, SD, SQEN lines for the receivers. This may be about another 10 to 15 signal pins on the FPGA. The SFP transceivers for the G-Link outputs also have some management lines. This may be about another 5 or so pins on the FPGA.
--> Agree, this will be part of the next step - the engineering design, which, I believe, will be followed by another design review...

The current version of the document describes using the Infineon V23818-M305-B57 type optical transceiver for the G-Link output. It's a small point, but Infineon no longer makes any fiber optic components. I believe that an equivalent Avago part is the AFBR-57M5APZ.
--> Added to the spec.

From Philippe (on v0.4):
-----------------------
Typo: Incomplete sentence at the bottom of page 9? "by emulating the readout G-links in the FPGA (see 4.3), saving"
--> corrected

Typo: in 4.5, too many words? "In the initial backward compatible mode of CMX operation, no optical links will to the TP be exploited."
--> Yes, it shall be "no optical links to the TP will be exploited"; corrected

Question: In 5.4, about the number of G-links - instead of "2-4", shouldn't 4 be the minimum? Do I understand correctly that one pair of G-links would be used to send data separately to both the DAQ and the ROI builder, while we would still need one pair for the CMM functionality and one separate pair for the TP source or TP standalone functionality?
--> Yes, this is a valid point. We will need (in addition to the 2 G-Links of the CMM) at least 1 (or 2) extra G-Links for the CMX in the TP source mode and some extra G-Links in the standalone mode, so the number shall be increased to 4-6... However, this is still a guess until a detailed design is done.

I am also trying to reconcile this with section 3.3: "The CMX programming model and online software will require modifications to new CMX functionalities. As in the data source modes (see 3.2), the CMX will send more data to the RODs and therefore additional optical links to RODs shall be implemented." Does the above sentence imply that even more G-links may be needed, beyond 4?
--> see above...

Question: also in 5.4, I do not understand the numbers in the box for "Tx repl" and "Data source for TP"... If/when there is a TP in the system, are we still doing massive data replication?
--> It depends on the TP implementation. It can be implemented in several "slices", where each slice receives all the data from the CMX modules but runs different algorithms. This data fan-out can be done in different ways - passive optical replication outside of the CMX modules, or electrical/optical replication in the CMX (which is assumed as a possibility in the CMX spec). I believe we will have an overall Phase 1 upgrade document/scenario (and will then fill in ref. [1] in the spec).
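To illustrate the electrical fan-out option mentioned in the answer above, here is a minimal Python sketch of the transmitter budget, using only numbers quoted in these comments (72 transceivers per FPGA, a 12-fiber output ribbon, 1-3 transceivers reserved for the emulated readout G-links); the number of TP slices is a free parameter, not something decided anywhere.

# Transmitter budget for replicating the CMX output ribbon to several TP
# "slices" inside the FPGA. Figures from the comments above; slice count free.
TRANSCEIVERS_PER_FPGA = 72
FIBERS_PER_RIBBON = 12
GLINK_TX_RESERVED = 3          # worst case quoted above (1-3)

def max_tp_slices() -> int:
    """How many identical copies of the output ribbon the FPGA could drive."""
    return (TRANSCEIVERS_PER_FPGA - GLINK_TX_RESERVED) // FIBERS_PER_RIBBON

if __name__ == "__main__":
    print(f"transmitters free for TP data  : {TRANSCEIVERS_PER_FPGA - GLINK_TX_RESERVED}")
    print(f"max TP slices served by fan-out: {max_tp_slices()}")   # 69 // 12 = 5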
Question: also in 5.4, "Assumption is that in the standalone mode all CMX are used as the CMX/TP". Somewhere else in the document, don't we make the argument that only one CMX per crate can be a CMX/TP (for VME address space constraints)?
--> Yes, in 4.6 it is said: "The Virtex 6 FPGAs have up to 38 Mbits of block RAM, or 4.75 megabytes. This fits easily within the extra 6 megabytes of VME-- space as long as no more than one CMX per crate can act as a CMX/TP. Alternatively, instead of direct mapping of the CMX to the VME space, an indirect addressing might be used - a moveable window, where a register on the CMX defines the base address of this window. This would allow the many megabytes of CMX to be accessed through a smaller VME-- address space. Provided the window is big enough to encompass any of the single blocks of RAM, this solution would not be expected to slow access significantly." So it depends on the VME addressing mode we have to select.

Question: 5.5, Latency. Should we define the quantity "bunch crossing"?
--> It is an LHC parameter of 25 ns (to be precise, 1/40.08 MHz); I will define it.

Should the serialization/deserialization/transfer latency from CMX to TP or to CMX-TP be addressed explicitly?
--> Agree. It was already reported to be ~2 bunch crossings. Added.

From Hal (on v0.6):
----------------------
I think that I'm missing something basic here about the backplane receivers (end of sect. 4.1). The last paragraph in this section mentions an 80 MHz clock, but it wasn't obvious to me how this would fit in with, for example, Fig. 3, which seems to indicate only 40 and 160 MHz.
--> clarified in the text (4.1) and in Fig. 3

Present a clear timeline for specifying the baseline standalone mode functionality (along with some idea of who will actually do this work) at the review. Since this has a potentially large impact on the design, I think that it's very important to clarify the requirements as soon as possible.
--> agree; in case the TP comes in time, this mode may never happen. I mention in the document that this mode is optional.

From Richard (on v.7):
---------------------------
Could this detail about extracting the clock go into the specification document?
--> It is clarified in 4.1. In my firmware exercises I rather use a PLL to recover the 80 MHz clock from the encoded clock/parity line and an FPGA input DDR register to fetch the data. Therefore, a 160 MHz clock was neither transferred over the backplane nor used in the receiver part.

Do you intend to have a PLL on each of the 14 sets of input busses? Are there enough in the chip you intend to use? How does your DDR register scheme work?
--> The current plan is to do this on each input bus. There are 18 MMCMs in the FPGA. The encoded clock is recovered in a PLL, and the incoming data are fetched on each clock edge using input DDR registers.

From Chip (on v.7):
-----------------------
Multiple corrections and suggestions discussed over phone and included in the document v0.8.

From Jim (on v0.8):
------------------
Do new (or upgraded) rear transition modules need to be made for the CMX?
--> For the backward-compatible and upgrade modes - no new module, use the existing one. For the standalone mode - maybe, if we need to transfer additional data at a higher data rate and the existing modules are not capable. Included in the document.

One other thought: the self-testing of the CMX implies a number of receivers at least equal to the number of transmitters N, if you are doing the test with just 2 boards.
True, you could finesse this by configuring/soldering one board, laid out with N transceivers, as "all transmitters" and the other as "all receivers"; I leave it to you engineers whether that more naturally implies traces for N transmitters and N receivers both attached to N transceivers on the FPGA, or 2N with a dedicated layout for N out, N in, as a minimum complexity. My main point is defining a minimum sensible complexity for a self-testable CMX, even if it is destined never to do standalone. I think what's described in the CMX document is a layout with N transceivers and traces to connect all of them as either N transmitters or N receivers, rather than 2N transceivers dedicated as N out, N in.
--> Each transceiver on the FPGA has 1 receiver and 1 transmitter, independent of each other; therefore, in order to have N transmitters and N receivers, one needs N transceivers (not 2N). For the optical link test with a second CMX you are right; I mentioned it in the text.
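A minimal sketch of the counting argument above, just to make the N vs. 2N bookkeeping explicit; the transceiver count N = 72 is the figure quoted earlier in these comments, and the two-board pairing and the small Transceiver class are purely illustrative.

# Two-board optical self-test bookkeeping: each FPGA transceiver provides one
# independent transmitter and one independent receiver, so board A can act as
# "all transmitters" and board B as "all receivers" with N transceivers each.
from dataclasses import dataclass

@dataclass
class Transceiver:          # illustrative only
    tx_connected: bool = True   # trace routed on the TX side
    rx_connected: bool = True   # trace routed on the RX side

N = 72  # transceivers per FPGA, as quoted earlier in these comments

board_a = [Transceiver() for _ in range(N)]   # configured as the sender
board_b = [Transceiver() for _ in range(N)]   # configured as the receiver

# Pair TX(i) of board A with RX(i) of board B: every link is exercised
# with only N transceivers per board, not 2N.
links = [(i, i) for i in range(N)
         if board_a[i].tx_connected and board_b[i].rx_connected]

print(f"transceivers per board : {N}")
print(f"links under test       : {len(links)}")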