T. Kovácsházy et al. / Carpathian Journal of Electronic and Computer Engineering 4 (2011) 59-64

Hardware Assisted IEEE 1588 Clock Synchronization for Linux Based Network Embedded Systems

Tamás Kovácsházy, Bálint Ferencz
Budapest University of Technology and Economics, Department of Measurement and Information Systems, Budapest, Hungary

Abstract: The paper introduces the IEEE 1588 Clock Synchronization technology, primarily developed for Ethernet based Network Embedded Systems, and presents a hardware assisted PTP implementation for the Linux operating system (kernel version 2.6.30 or later) on the x86 architecture. Our software is based on the standard PTPd available for UNIX like operating systems, which offers only software time stamping. The modified software uses the Linux standard SO_TIMESTAMPING socket option to communicate with the Ethernet Network Interface Controller driver, so it is supposed to work with any other Ethernet Interface Controller with proper driver support. Our test system utilizes selected Gigabit Ethernet Interface Controllers supporting hardware time stamping, donated by Intel. The initial results show that a clock accuracy (master-slave clock difference) of less than one microsecond is achievable with our software in standard Linux, even in the case of high network traffic and high load on the slave node (a node that synchronizes its clock to a master clock). The paper also investigates how the coefficients of the clock servo influence initial time convergence and tracking behavior in case of disturbances such as changing network traffic and slave node load.

I. INTRODUCTION

Measuring time is a rather vague concept for the general public, who rarely see how fundamentally their lives depend on methods of precise time keeping in distributed systems. At the same time, the methods and precision of time keeping have been markers of technological development since the early ages.
Today, we have arrived at a stage of development in which our network embedded systems need sub-microsecond time precision relative to a global time reference for proper operation. This means that the clock of any participating node in the network embedded system must deviate by less than one microsecond, in the worst case, from the clock of any other node in the system. In most applications this precision must be delivered at an extremely low cost compared to earlier solutions, which makes it impossible to apply dedicated time synchronization networks and hardware for every node in the network embedded system (in other words, out of band time synchronization).

In band time synchronization protocols, such as the Network Time Protocol (NTP) [1], have been available for decades; however, these solutions were designed primarily for time synchronization of personal and server computers, to simplify the work of system operators by automatically adjusting the naturally wandering local clock of the computers to reference clocks available somewhere on the Internet (low bandwidth, high error rate, long delay, and large delay jitter communication). The main design requirement was to provide good enough time synchronization for human users (observers) with low network overhead using remote time servers. Therefore, their accuracy is not sufficient for most modern measurement and control problems.

ISSN 18474 – 9689

A. IEEE 1588 Precision Time Protocol

The IEEE 1588 Precision Time Protocol (PTP) [2] has been developed to fulfill the new requirements in distributed systems using standard local area network (LAN) communication technologies (practically Ethernet today), or even sensor network technologies, e.g., IEEE 802.15.4 [3].
The rationale behind developing PTP was the following:
• Precise time synchronization between sites or buildings can be done using the Global Positioning System (GPS) or various radio services operated by government organizations, or it is not required at all (autonomous operation). Therefore, wide area network (WAN) support is not required, and the solution can concentrate on the LAN or on the sensor network only (lower error rate, lower delay and delay jitter),
• NTP and other time synchronization solutions developed for WANs (communicating primarily over TCP/IP) cannot provide the required precision due to their design and observed performance.

First, the IEEE 1588-2002 (PTPv1) standard was developed; later, based on the experience gained with PTPv1, IEEE 1588-2008 (PTPv2) was announced. PTP is a hierarchical master-slave clock synchronization protocol, which can take into account not only delays incurred by messages on end nodes (masters or slaves), but also delays caused by compliant network devices, e.g., Ethernet switches supporting PTP (boundary or transparent clocks).

B. Precision Time Protocol daemon

On UNIX like operating systems, including Linux, the software package Precision Time Protocol daemon (PTPd) has been implemented [4,5] and has gained wide acceptance in the community. PTPd originally supported PTPv1 (PTPd 1.x), but in 2010 an operational PTPv2 compliant modification also appeared (PTPd 2.x).

The current PTPd runs as a regular daemon (service) in the operating system, and time stamping of the incoming and outgoing network packets is done in software [5]. This means that the arrival and departure times of PTP network packets are, in most cases, measured (time stamped) inaccurately by software components (PTPd and the underlying operating system).
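To make the effect of inaccurate time stamps concrete, the following Python sketch applies the delay request-response arithmetic commonly used with PTP to four time stamps. It is an illustration only and not code from PTPd: the function name, the assumption of a symmetric path delay, and all numeric values are assumptions chosen for the example.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way path delay (all values in ns).

    t1: Sync departure time, read on the master clock
    t2: Sync arrival time, read on the slave clock
    t3: Delay_Req departure time, read on the slave clock
    t4: Delay_Req arrival time, read on the master clock
    Assumption: the path delay is the same in both directions.
    """
    master_to_slave = t2 - t1   # path delay + offset
    slave_to_master = t4 - t3   # path delay - offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


# Hypothetical exchange: slave runs 500 ns ahead, path delay is 2000 ns.
accurate = ptp_offset_and_delay(t1=0, t2=2_500, t3=10_000, t4=11_500)
# accurate == (500.0, 2000.0)

# Same exchange, but the receive time stamp t2 is taken 30 us late in software.
late_rx = ptp_offset_and_delay(t1=0, t2=32_500, t3=10_000, t4=11_500)
# late_rx == (15500.0, 17000.0)
```

With these assumed numbers, a 30 µs latency added to a single receive time stamp shifts the computed offset by half of that latency (15 µs), which is far above a sub-microsecond target.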
Therefore, this kind of time stamping is strongly dependent on the operating system load, i.e., if the system load is heavy or changing, the precision of time synchronization may be influenced negatively and the possible synchronization error grows. Clearly, the only real solution to this problem is to assign time stamps to packets in hardware, in the network interface controller, which can measure the actual low level arrival and departure times of PTP packets much more accurately, and more independently of the system load, than software solutions.

This paper first introduces the advantages and disadvantages of hardware time stamping and details the properties of our hardware assisted PTPdv1 implementation in Section II. Then Section III presents our test setup. Section IV shows the experimental results with and without load, examines the results, and gives directions for future work. In Section V the conclusions are drawn.

II. HARDWARE TIME STAMPING

Time stamps captured by hardware devices are recorded when an easy to identify section of the PTP packet passes the physical layer (PHY); in the case of Ethernet, for example, this section can be the end of the Start of Frame delimiter octet. In essence, hardware time stamps are recorded at this time instant, with extremely low delay and low delay jitter, in a hardware register by capturing the value of a counter driven by a clock source. This hardware based solution, however, brings up additional problems to solve:
• There are two clocks keeping time in the node of the network embedded system in this case: one is the operating system's system clock, which is to be synchronized to the master clock, and the other is the timer on the Ethernet controller used to acquire hardware time stamps,
• HW based time stamping needs SW support (driver, operating system kernel, and application) to configure hardware time stamping and to deliver time stamps to applications in modern operating systems such as Linux.
The first problem can be solved by applying a properly crafted clock transformation between the two clocks, or by synchronizing both clocks to the master if that is possible (supported by the hardware and software). In this paper we consider synchronizing the clock of the operating system only. The solution to the second problem is detailed in this paper; our contribution to it is the modified PTPd daemon, which utilizes the already available hardware time stamping support of the device drivers and the operating system.

A. Hardware time stamping

Support for PTP compliant time stamping is slowly appearing on standalone Ethernet network interface controllers (NIC), on Ethernet physical layer devices (PHY), and on systems on a chip (SoC) containing Ethernet controllers.

In the high speed PCIE attached NIC product range, which is primarily used in servers and high end network embedded systems, only some Ethernet NICs from Intel support hardware time stamping. Examples of such devices are the Intel 82576 (dual port), 82580 (quad port), and i350 (dual or quad port) gigabit Ethernet controllers.

Some manufacturers offer PHYs with hardware time stamping support, making it possible to use this feature on any NIC or microcontroller with an MII/RMII interface; the most widely used one seems to be the National Semiconductor DP83640.

In addition, an increasing number of SoCs designed for implementing lower cost network embedded system nodes have time stamping hardware on board. A representative list of such SoCs is the following:
• Freescale PowerQUICC processors,
• Intel IXP4XX network processors, the EG20T Platform Controller Hub, and the EP80579 Integrated Processor Product Line,
• Texas Instruments Stellaris line of microcontrollers with on board Ethernet controller.
From this list, the Freescale and Intel products are capable of running Linux as an operating system; the Texas Instruments Stellaris line, however, can also be used as low performance slave nodes in IEEE 1588 networks with simple TCP/IP stacks like lwIP.

B. Software support for hardware time stamping in Linux

To utilize the hardware time stamping feature of the Ethernet network interface controller, the hardware must be properly configured to collect time stamps, and the collected time stamps must be passed to the application requesting them. In addition, the application must be able to utilize the provided time stamps and synchronize the system clock. In Linux, these functionalities require the proper interoperation of the operating system kernel, the NIC driver, and PTPd. Based on this, the following software requirements are present:
• The operating system must be prepared to transfer send and receive time stamps between the driver and the application. Since version 2.6.30 of the Linux kernel, the SO_TIMESTAMPING socket option provides the required functionality for both send and receive time stamps.
• The kernel driver for the previously mentioned Intel Ethernet Network Interface Controllers has supported hardware time stamping since version 2.0.6 of the Intel Gigabit (igb) driver.
• PTPd has no official support for hardware assisted clock synchronization. A proof of concept PTPd has been published by Intel for the Intel 82576 Ethernet Interface Controller [6]; however, it does not use the SO_TIMESTAMPING socket option but communicates time stamps in a proprietary way, in practice through a specially modified driver. In addition, this modified version of PTPd is not maintained. In [7] a proof of concept implementation using the SO_TIMESTAMPING socket option is introduced on various embedded platforms, but it has not been tested on the x86 architecture.
Therefore, the only missing component of the implementation is a PTPd that supports hardware time stamping, i.e., a hardware assisted PTPd implementation. So we decided to develop a PTPd version that is based on the current PTPd code base and uses the functionality built into current Linux kernels and hardware drivers. The effort has been supported by Intel Corporation by donating Ethernet Network Interface Controllers and by providing expert consultation.

Because PTPd exists in two versions (the PTPv2 version is practically a fork of the software package), both a PTPv1 and a PTPv2 compliant solution need to be developed. As a first step we decided to implement a PTPv1 based solution due to the wider acceptance and availability of that standard; however, we have later developed a PTPv2 compliant version as well. Our modifications have also been submitted to the PTPd maintainers, but due to some constraints they have not been merged into the code base. For interested users the code is available from [8]. In this paper we present our results using PTPv1, because that implementation is more mature.

The modifications to PTPd for hardware assisted time stamping are not detailed in this paper; the interested reader can examine the code directly. Our primary aim is to show the performance of the hardware assisted implementation.

C. Communication of the Time Stamps

Communication of the time stamp information between the hardware and the application is not a problem for received packets, because both the frame and the time stamp data structures traverse the hardware and software interfaces in the same direction (hardware to application) and at the same time (after reception of the frame); therefore, they can be fused into a single compound data structure even at the hardware level. This also means that frames and time stamps do not need to be paired, i.e., any number of frames can be time stamped efficiently and in an easy to use way.
Unfortunately, time stamping is much more complex for sent packets (send time stamps). In this case the application sends the information (by calling the appropriate system call), the information traverses the operating system, including the TCP/IP stack and the Ethernet driver, and finally the hardware sends it through the PHY. While the frame is being sent on the physical layer, the time stamp is captured in hardware by the PHY. Then the captured time stamp must be sent back to the application for processing in the opposite direction, from the hardware to the application. In this situation it is also hard to pair time stamps with information, i.e., it is not straightforward to determine which of the frames sent previously by the application (or by other applications) a time stamp read from the hardware pertains to.

The problem of send time stamps is properly solved by the SO_TIMESTAMPING socket option in Linux, which provides time stamps for sent packets in the error queue of the socket. However, matching the information in the error queue to the sent packet must be done in the application.

D. Multiple clocks keeping time

In standard software based PTPd slave implementations there is only one clock to be synchronized, the standard Linux system clock, which has an architecture specific mixed hardware and software implementation. This clock is used to time stamp information, and this same clock is synchronized to the master clock by the software. Consistency of the whole algorithm is therefore easy to guarantee due to the single time piece in the system. The accuracy (time synchronization error between master and slave) and precision (resolution of time keeping) are also determined by this single time piece. The typical resolution of this timer is in the microsecond or nanosecond range. However, the properties of this clock are very hard to examine on the x86 architecture due to the hardware architecture of the x86 platform.
The reason for this is that there are very deep queues between the CPUs and the peripherals in the x86 architecture, i.e., the x86 architecture is primarily optimized for throughput, not latency. These deep queues introduce large jitter into any attempt to measure the accuracy of the system clock; this is the same reason why hardware time stamping is needed on this architecture.

Hardware assisted PTPd implementations, on the other hand, have to deal with a minimum of two clocks. One of them is the standard Linux system clock, which should be synchronized to the master clock and is used for time stamping software events. The other clock is the counter on board the Ethernet Network Interface Controller, used to time stamp incoming or outgoing Ethernet frames in hardware. This clock has a pure hardware implementation and may be a free running clock with a clock signal derived from the clock of the NIC, or it may also support hardware synchronization to a master clock. The resolution of this clock is typically in the range of nanoseconds.

III. TEST SETUP

Our hardware assisted PTPd implementation has been tested in the test environment shown in Figure 1. Both the PTP Master and the Slaves are implemented by our hardware assisted PTPdv1 software, as we have no dedicated grandmaster clock such as the MEINBERG LANTIME M600/GPS/PTPv2. A third machine is used to generate disturbance (load) with the iperf network traffic generator, running between the disturbance generator and the PTPd Slave.

Figure 1. Hardware assisted PTPd test configuration

The generated load is the maximum possible load, i.e., the system's bottleneck defines the load.
All three machines have the following configuration:
• Intel C2D 2 GHz CPU,
• 512 MByte or more RAM (the size of physical memory does not influence the performance in this situation),
• Intel 82580 4 port Gigabit Ethernet Network Interface Controller, connected to a PCIE 4x slot,
• Latest Debian Linux with Linux kernel 2.6.32 (Ubuntu and Fedora have also been tested with similar results).

All three machines and the Ethernet switch have an additional network connection to the departmental network to provide remote network access and management; therefore, the network depicted in Figure 1 is used only for PTP communication. A non PTP compliant managed gigabit Ethernet switch connects the three machines using a Virtual LAN (VLAN). There are no quality of service classes (IEEE 802.1q/IEEE 802.1p) defined on the switch; it forwards all packets on a first come, first served basis.

We did not have a PTP compliant switch in operational, fully tested status at the time of the measurements; this is why we decided to use the PTPd Slave as the iperf client during the measurements shown in this paper. The reasoning behind this decision is the following:
• In this iperf configuration heavy traffic flows from the PTPd Slave to the disturbance generator, and queuing delay occurs only in the PTPd Slave, where hardware time stamping is applied, so PTP can eliminate the delay variations in this situation.
• If the PTPd Slave is configured as the iperf server, the heavy traffic generated by iperf causes queuing delay, and thus excessive delay variations, on the switch egress port towards the PTPd Slave. This delay is not measured by the non PTP compliant switch, which reduces the accuracy of time synchronization. In our measurements the delay and the accuracy reduction in this test setup were as we predicted (approximately 10 µs, lower than the queuing delay of one maximum size, 1518 byte Ethernet frame), but due to paper length constraints we have decided not to publish our results for these tests.
• In this paper we do not want to show that a non PTP compliant switch causes synchronization accuracy problems; we want to show how the hardware assist function increases PTP Slave accuracy under heavy PTP slave load conditions.

TABLE I. MEAN AND STANDARD DEVIATION OF OFFSET AFTER SETTLING WITH DEFAULT PTPD CONFIGURATION

Test           Mean offset    Standard deviation
SW w/o Load    1.2 µs         1.0 µs
SW w Load      37 µs          27 µs
HW w/o Load    0.77 µs        0.09 µs
HW w Load      0.82 µs        0.09 µs

As mentioned previously, measuring the synchronization accuracy of two mixed hardware and software based clocks, such as the Linux system clock, is not a simple technical problem. Unfortunately, on the x86 architecture it is problematic to generate external events, measurable for example by oscilloscopes, without introducing additional software inaccuracies that are orders of magnitude larger than the quantities to be measured, so external measurement tools cannot be used. Therefore, as a last option, we decided to employ the offset reported by PTPd as the basis of evaluation. Validation of our results by other means is therefore required, and we are working on this problem. We must note that the same type of evaluation is also done for NTP in most cases [9].

IV. RESULTS

We have compared the performance of the standard PTPdv1 using software time stamping with the performance of our modified PTPdv1 using hardware time stamping. The performance also depends on the PTPd slave load, so we have run tests with and without iperf load as detailed in the previous section. In addition, we tested the PTPd software in two configurations:
• One of them is the default configuration of PTPdv1,
• The other is a configuration with a modified clock servo time constant, as detailed later (the rationale behind the selection of this servo configuration is also introduced later).

In all cases, ntpdate is executed on the Slave before the tests to set the clock to an acceptable offset. After that, PTPd is started on the Slave machine.
The disturbance (load) is started 20 minutes after the start of the PTPd Slave. The measurements shown cover a 60 minute period, because over this length of measurement both the settling and the steady state performance of the clock delay can be observed.

A. Results in default PTPd configuration

Figures 2, 3, 4, and 5 show the absolute offset as a function of time in the four test cases (software or hardware time stamping, with or without load) for representative experiments. The mean and standard deviation of the offset are presented in numeric form in Table I.

The following observations can be made based on these experiments:
• The performance of PTPd using software time stamping drops drastically when the iperf load is introduced at 20 minutes (see Figure 3).
• Hardware time stamping provides better offset performance independently of load; however, the performance is drastically better under load, while only marginally better without load. This performance difference needs to be explained.
• The 1 µs worst-case master to slave accuracy cannot be achieved even with hardware time stamping in operation, and slave to slave errors can be even bigger. The performance of the hardware assisted solution needs to be improved.
• The initial slave offset is reduced very slowly, which is understandable because the equivalent default time constant of the PI controller used in PTPd is approximately 2000 s. However, the long settling time is unacceptable in some applications, i.e., it takes approximately 30 minutes to get the system operational (to minimize the offset) after a scheduled system start or a malfunction.

B. Results in a modified PTPd configuration

PTPd allows setting the time constants of the PI clock servo on the command line at startup; therefore, it is reasonable to experiment with these settings first, before more detailed analyses.
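How servo coefficients influence settling can be illustrated with a generic discrete-time PI loop. The sketch below is not the PTPd servo: the gains kp and ki, the model of one correction per step, the constant drift, and all numeric values are hypothetical, chosen only to show that a larger integral gain can shorten the time needed to settle.

```python
def simulate_pi_servo(kp, ki, initial_offset_ns, drift_ns_per_step, steps):
    """Simulate a generic PI clock servo (hypothetical model, not PTPd code).

    Each step the slave clock gains drift_ns_per_step relative to the master,
    and the servo removes kp * offset plus the accumulated integral term.
    Returns the list of offsets (ns) after each step.
    """
    offset = float(initial_offset_ns)
    integral = 0.0
    history = []
    for _ in range(steps):
        integral += ki * offset               # integral term learns the drift
        correction = kp * offset + integral   # PI output applied this step
        offset += drift_ns_per_step - correction
        history.append(offset)
    return history


def settling_step(history, band_ns):
    """Return the first index from which |offset| stays within band_ns."""
    last_outside = -1
    for index, value in enumerate(history):
        if abs(value) > band_ns:
            last_outside = index
    return last_outside + 1


# Assumed values: 10 us initial offset, 100 ns drift per step, 2000 steps.
slow = simulate_pi_servo(kp=0.1, ki=0.001, initial_offset_ns=10_000,
                         drift_ns_per_step=100, steps=2000)
fast = simulate_pi_servo(kp=0.1, ki=0.01, initial_offset_ns=10_000,
                         drift_ns_per_step=100, steps=2000)
slow_settle = settling_step(slow, band_ns=1.0)
fast_settle = settling_step(fast, band_ns=1.0)
```

With these assumed values, the loop with the larger integral gain enters and stays within the 1 ns band in fewer steps. A floating point model of this kind does not reproduce the rounding behavior of a fixed point implementation.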
Reducing the settling time to 1/10 of the default value is achievable by setting P I =100. Figures 6, 7, 8, and 9 show the absolute offset as a function of time in the four test cases (software or hardware time stamping, with or without load) for representative experiments with P I =100. The mean and standard deviation of the offset are presented in numeric form in Table II.

The following observations can be made based on these experiments:
• The settling time is reduced to 1/10 of the settling time of the default configuration while the stability of the solution is upheld. The properties of the servo loop should be investigated in detail.
• The hardware assisted solution shows a worst-case master to slave offset well under 1 µs; even a worst-case slave to slave offset under 1 µs is achievable.
• Assuming that the disturbance is not changed between the experiments with the default and the modified configuration, we must note that while the servo loop is faster for the modified solution, the disturbance rejection is also reduced.

However, this last observation contradicts the fundamentals of linear control theory, so there must be some other phenomenon causing this behavior, and we have to find an explanation for it. The initial analysis shows that the fixed point implementation of the PI controller in the servo loop exhibits some rounding errors [9], and these errors are directly proportional to the integrator coefficients in the PI controller of the servo. Our investigation shows that the current simplistic clock servo of PTPd can be improved; the results introduced in [10] may be used as a starting point for a simpler but robust clock servo, or the reconfigurable controller of [11] may be applied as an alternative.

TABLE II. MEAN AND STANDARD DEVIATION OF OFFSET AFTER SETTLING WITH THE MODIFIED PTPD CONFIGURATION

Test           Mean offset    Standard deviation
SW w/o Load    0.95 µs        1.16 µs
SW w Load      31 µs          30 µs
HW w/o Load    47 ns          38 ns
HW w Load      66 ns          57 ns

V. CONCLUSIONS

In this paper we have first introduced our hardware assisted PTPd implementation, based on a PTPd that offers only software time stamping, and we have shown our initial results using Intel 82580 Ethernet NICs. The solution uses the Linux standard SO_TIMESTAMPING socket option to communicate with the Ethernet Network Interface Controller driver, so it is supposed to work with any other Ethernet Interface Controller with proper driver support; however, due to the lack of such hardware we cannot prove this claim.

The performance of the developed hardware assisted PTPd modification is promising: based on our results, a worst-case slave to slave offset under 1 µs is achievable. Further evaluation and validation of these results is necessary. Our investigation shows that the performance of the hardware assisted PTPd software is today primarily limited by the implementation of the servo loop.

ACKNOWLEDGMENTS

Intel Corporation has supported this work by donating network interface controllers and other hardware, and by providing expert consultation. This work is connected to the scientific program of the "Development of quality-oriented and cooperative R+D+I strategy and functional model at BME" project. This project is supported by the New Hungary Development Plan (ProjectID: TÁMOP-4.2.1/B-09/1/KMR-2010-0002). This paper and the work it is based on have been supported by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences.

REFERENCES

[1] D.L. Mills, "A brief history of NTP time: Memoirs of an Internet timekeeper," ACM SIGCOMM Computer Communication Review, vol. 33, no. 2, pp. 9–21, 2003.
[2] IEEE Std 1588-2002 and IEEE Std 1588-2008 (for more information see http://www.nist.gov/el/isd/ieee/ieee1588.cfm).
[3] K. Kim, S.W. Lee, D. geun Park, and B.C. Lee, "PTP interworking 802.15.4 using 6LoWPAN," 11th International Conference on Advanced Communication Technology (ICACT 2009), vol. 1, pp. 873–876, 2009.
[4] K. Correll, PTPd - Precision Time Protocol daemon, http://ptpd.sourceforge.net/
[5] K. Correll, N. Barendt, and M. Branicky, "Design Considerations for Software Only Implementations of the IEEE 1588 Precision Time Protocol," in Proc. Conference on IEEE 1588, Winterthur, Switzerland, October 2005.
[6] P. Ohly, D.N. Lombard, and K.B. Stanton, "Hardware Assisted Precision Time Protocol. Design and case study," Proceedings of LCI International Conference on High-Performance Clustered Computing, Urbana, IL, USA: Linux Cluster Institute, pp. 121–131, 2008.
[7] R. Cochran and C. Marinescu, "Design and implementation of a PTP clock infrastructure for the Linux kernel," Proceedings of the 2010 International IEEE Symposium on Precision Clock Synchronization for Measurement Control and Communication (ISPCS), pp. 116–121, 2010.
[8] T. Kovacshazy, "Hardware assisted PTPd home page," http://home.mit.bme.hu/~khazy/ptpd/
[9] B. Widrow and I. Kollar, "Quantization Noise: Roundoff Error in Digital Computation, Signal Processing, Control, and Communications," Cambridge University Press, Cambridge, UK, 2008, 778 p.
[10] D.L. Mills, "Computer Network Time Synchronization: The Network Time Protocol on Earth and in Space," CRC Press, Inc., 2001.
[11] T. Kovacshazy, G. Peceli, and Gy. Simon, "Transients in Reconfigurable Signal Processing Channels," IEEE Transactions on Instrumentation and Measurement, 50:(4), pp. 936–940, 2001.

Figure 2. Behavior of PTPdv1 in its default configuration, using software time stamping, and in case of no load
Figure 3. Behavior of PTPdv1 in its default configuration, using software time stamping, and in case of load
Figure 4. Behavior of PTPdv1 in its default configuration, using hardware time stamping, and in case of no load
Figure 5. Behavior of PTPdv1 in its default configuration, using hardware time stamping, and in case of load
Figure 6. Behavior of PTPdv1 with modified servo configuration, using software time stamping, and in case of no load
Figure 7. Behavior of PTPdv1 with modified servo configuration, using software time stamping, and in case of load
Figure 8. Behavior of PTPdv1 with modified servo configuration, using hardware time stamping, and in case of no load
Figure 9. Behavior of PTPdv1 with modified servo configuration, using hardware time stamping, and in case of load