Problems and solutions for IEEE 1588 implementations

January 30, 2018

Time synchronization via 1588 has been established as an IEEE standard since 2008 and is already used in a wide variety of areas. Until now, the use of this standard was always associated with exotic hardware, i.e. with implementations of network adapters in various FPGAs or embedded controllers. With the introduction of the Intel network chip families Intel I21x and Intel I35x, this standard is now available for the consumer market. This lays the foundation for new projects based on consumer hardware.

Several commercial IEEE 1588 software implementations are available. The American LXI consortium even provides a virtually free implementation of the standard (only membership fees apply). The German company TSEP developed this IEEE 1588 implementation in cooperation with the LXI consortium. TSEP also distributes a paid version, intended especially for companies that are not interested in LXI membership.

The fundamental question that arises in every IEEE 1588 project is the accuracy with which the time synchronization must take place. The achievable accuracy usually depends on the hardware used, the topology and the control algorithm used. Modern IEEE 1588 implementations have the option of defining different control algorithms and simply exchanging them. TSEP has also defined the control algorithm as an independent module with defined interfaces. This means that the user can easily define their own algorithm and integrate and test it in the system.

If IEEE 1588 is used, for example, for the synchronization of WLAN loudspeakers, the human ear is the unit of measurement for accuracy. The human ear can detect differences in propagation time from 10 µs upward. The synchronization error between the WLAN speakers must therefore stay below 10 µs.

However, other accuracies are required for metrological tasks. In measurement technology, measurements are usually started by triggers. As a rule, these triggers are signal changes (rising or falling edges, exceeding level values, etc.). These signals are transmitted via cable from the source to the measuring device. The transit time within the trigger cable is therefore the decisive target variable for accuracy. Assuming cable lengths of approx. 5 meters, which is rather generous, a transit time of 25 ns can be assumed (5 ns per meter). The accuracy for measurement problems would therefore be in this order of magnitude. However, this order of magnitude has shifted significantly downwards in the field of measurement technology with the introduction of 5G technologies in mobile communications. Accuracies in the sub-nanosecond range would be desirable for these technologies.

The IEEE 1588 control algorithm

IEEE 1588 attempts to synchronize several free-running clocks. Each of these clocks is usually implemented as a counter that is incremented at a specified frequency. The current time can be derived at any moment from the frequency and the counter reading. As it is not technically possible for several oscillators to generate identical frequencies, the clock rate must be readjusted continuously. Because it is technically much easier to manipulate the counter's increment step than the oscillator itself, the increment step is what gets adjusted. This adjustment must be made by a control algorithm, as the adjustments are subject to various disturbances. Disturbances that may occur in the transport path must also be taken into account. As each IEEE 1588 implementation is based on its own hardware and hardware topology, there can be no such thing as a “general control algorithm”.

In principle, the algorithms can be divided into two groups. The first group tends to be based on simple algorithms that generally only concentrate on determining the error in the frequency of their own clock from the determined time difference between master and slave (also known as MeanPathDelay). This type of algorithm is independent of the hardware topology used and provides quite useful results. The free LinuxPTP implementation and the TSEP implementation each contain such an algorithm as standard.

The second group are algorithms that try to determine the errors in the system and include them in the calculation of the error of their own frequency. These algorithms are only useful if the hardware used and the expected topology are known. The error models can then be created and used on the basis of the hardware used. Kalman filters, which can be modeled specifically for the corresponding problem, are particularly suitable for this type of control algorithm.

Each control algorithm contains at least two states. In the first state, the offset between the master and slave (MeanPathDelay) is so large that the algorithm cannot close this gap in an acceptable control time. In this state, the time received from the master is adopted directly as the slave time without correction in the hope that the MeanPathDelay value determined in the next synchronization interval is significantly smaller. This state is maintained until an acceptable MeanPathDelay is reached. This state is the default state after starting the clock or if synchronization is lost due to problems. In the second state, the actual control algorithm takes effect, which attempts to determine the correction values of its own clock and to approximate its own time as closely as possible to the master time.
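The two states described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold STEP_THRESHOLD_NS, the ServoAction type and the function choose_action are hypothetical names and values, not part of any IEEE 1588 stack, and the offset is taken as slave time minus master time.

```python
from dataclasses import dataclass

# Hypothetical threshold: offsets above 1 ms are stepped instead of regulated.
STEP_THRESHOLD_NS = 1_000_000


@dataclass
class ServoAction:
    mode: str        # "step" (adopt master time) or "slew" (run the control loop)
    value_ns: float  # new absolute time for "step", measured offset for "slew"


def choose_action(master_time_ns: int, slave_time_ns: int) -> ServoAction:
    """Select the servo state from the current master/slave offset."""
    offset_ns = slave_time_ns - master_time_ns
    if abs(offset_ns) > STEP_THRESHOLD_NS:
        # State 1: the gap is too large to close by regulation,
        # so the master time is adopted directly.
        return ServoAction("step", float(master_time_ns))
    # State 2: the gap is small enough for the actual control algorithm.
    return ServoAction("slew", float(offset_ns))
```

In a real implementation the threshold would have to be chosen to match the acceptable control time of the system.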

Simple IEEE 1588 control algorithms

As already mentioned, these types of algorithms are designed to determine only the correction value of the individual clock based on the determined MeanPathDelay. A separate error model is not used here, or only to a limited extent. This type of algorithm is relatively simple and can be used independently of the hardware.

Based on the standard control algorithm of the TSEP IEEE 1588 implementation, the operation of such an algorithm will be illustrated. In the first step, this simple control algorithm attempts not to include invalid or incorrect MeanPathDelays in the control. The IEEE 1588 implementation discussed here is based on Intel network chips of the I21x family. These use Gigabit Ethernet in accordance with IEEE standard 802.3. This type of network system is not deterministic; any participant can access the network at any time. Access is organized via packet collisions. This can result in packets being transmitted much later than is actually assumed. This delay does not appear in the transmitted data packets. In order to protect the control algorithms from this incorrect and therefore disruptive data, the control algorithm attempts to detect these incorrect data packets and not include them in the control algorithm.

Figure 1: Incorrect MeanPathDelay due to network delays

The diagram above shows such an incorrect packet, which was subsequently included in the control algorithm and then had to be compensated. This incorrect data can be recognized by the significantly increased MeanPathDelay. To detect these incorrect MeanPathDelays, the standard deviation of the MeanPathDelay is calculated:

Figure 2: Calculation of the standard deviation MeanPathDelay

If a new MeanPathDelay significantly exceeds the calculated standard deviation, it is not used for further processing.
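A minimal sketch of such a plausibility check is shown below. It is not the TSEP implementation: the class name DelayFilter, the window length, the factor k = 3 and the lower bound on the standard deviation are all assumptions chosen for illustration.

```python
import statistics
from collections import deque


class DelayFilter:
    """Rejects MeanPathDelay samples that lie far outside the recent spread."""

    def __init__(self, window: int = 16, k: float = 3.0, min_sigma_ns: float = 1.0):
        self.samples = deque(maxlen=window)  # recently accepted samples
        self.k = k                           # hypothetical rejection factor
        self.min_sigma_ns = min_sigma_ns     # avoids a zero-width acceptance band

    def accept(self, delay_ns: float) -> bool:
        """Return True if the sample should be passed on to the control loop."""
        if len(self.samples) >= 4:
            mean = statistics.mean(self.samples)
            sigma = max(statistics.pstdev(self.samples), self.min_sigma_ns)
            if abs(delay_ns - mean) > self.k * sigma:
                return False  # outlier: keep it out of the window as well
        self.samples.append(delay_ns)
        return True
```

Because rejected samples never enter the window, a genuine, lasting change in the path would be rejected permanently by this sketch; a practical filter would need a rule for re-learning after several consecutive rejections.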

In the next step, an attempt is made to calculate the correction of the slave’s own clock from the calculated MeanPathDelay. To do this, the deviation of the slave from the master, which was determined using the MeanPathDelay algorithm, is converted to the frequency of the slave’s own counter (clock). In the first step, the error of the slave’s own clock is calculated per counter step:

Figure 3: Calculation of the error of the internal clock

The Intel I21x can vary the time after which its own counter is incremented (every 8 ns) within certain limits. The error per counter increment is programmed into the hardware by the algorithm (according to the above formula).
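The formula from Figure 3 is not reproduced here, so the following Python sketch shows only one plausible form of the calculation. It assumes that the offset (slave minus master) has accumulated over one synchronization interval and spreads it evenly over the counter increments of that interval. The function names are hypothetical, and no real register of the Intel chip is modeled.

```python
NOMINAL_INCREMENT_NS = 8.0      # one counter step at 125 MHz, as stated in the text
TICKS_PER_SECOND = 125_000_000  # counter increments per second


def error_per_increment_ns(offset_ns: float, interval_s: float) -> float:
    """Offset accumulated during one interval, spread over its increments."""
    ticks = interval_s * TICKS_PER_SECOND
    return offset_ns / ticks


def corrected_increment_ns(offset_ns: float, interval_s: float) -> float:
    """Increment value that would cancel the measured drift.

    A positive offset means the slave runs ahead, so each increment
    has to add slightly less than the nominal 8 ns.
    """
    return NOMINAL_INCREMENT_NS - error_per_increment_ns(offset_ns, interval_s)
```

For example, an offset of 2000 ns accumulated over one second corresponds to 0.000016 ns per increment.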

With such simple control algorithms, the system repeatedly oscillates, as the control algorithm applies the correction value accordingly depending on the error detected. To avoid such oscillation, the TSEP IEEE 1588 control algorithm considers the 1st derivative of the MeanPathDelay.

Figure 4: Extended control algorithm
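How a first-derivative term damps the oscillation can be illustrated with a small proportional-derivative sketch in Python. The gains kp and kd and the class name DampedServo are hypothetical; the actual TSEP algorithm shown in Figure 4 may be structured differently.

```python
from typing import Optional


class DampedServo:
    """Proportional correction with a first-derivative damping term."""

    def __init__(self, kp: float = 0.7, kd: float = 0.3):
        self.kp = kp  # hypothetical proportional gain
        self.kd = kd  # hypothetical derivative gain
        self.prev_offset_ns: Optional[float] = None

    def correction_ns(self, offset_ns: float, interval_s: float) -> float:
        """Correction to apply over the next synchronization interval."""
        if self.prev_offset_ns is None:
            rate_ns_per_s = 0.0  # no history yet, so no derivative
        else:
            rate_ns_per_s = (offset_ns - self.prev_offset_ns) / interval_s
        self.prev_offset_ns = offset_ns
        # If the offset is already shrinking, the rate is negative and the
        # correction is reduced, which counteracts overshoot.
        return self.kp * offset_ns + self.kd * rate_ns_per_s * interval_s
```

With kp = kd = 0.5, an offset that falls from 100 ns to 60 ns within one second yields a second correction of only 10 ns instead of 30 ns.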

Complex IEEE 1588 control algorithms

However, if error models and hardware topologies are to be considered, other approaches must be used. Kalman filters (or Kalman-Bucy-Stratonovich filters) can be used for such problems. The Kalman filter is named after its discoverers Rudolf E. Kálmán, Richard S. Bucy and Ruslan L. Stratonovich, who independently discovered the method or made significant contributions to it. The Kalman filter is used to reduce errors in real measured values and to provide estimates for non-measurable system variables. The prerequisite for this is that the necessary values can be described by a mathematical model. The special feature of the filter presented by Kálmán in 1960 [3] is its special mathematical structure, which enables it to be used in real-time systems in various technical areas. This includes its use in electronic control loops in communication systems. In the context of mathematical estimation theory, one also speaks of a Bayesian minimum variance estimator for linear stochastic systems in state space representation.

The Kalman filter attempts to incorporate error models into the estimation of the actual correction value. In the first approach, it is necessary to be able to estimate the error sources within an IEEE 1588 realization.

Stability of the oscillator within the Intel I21x network chips

One error variable in an IEEE 1588 implementation is the stability of the internal counter used. The current time is derived from an internal counter. The clock with which the counter is incremented is derived from an oscillator or another source. With the Intel network chips I21x and I35x, this is derived from the Ethernet clock used, i.e. 125 MHz.

Figure 5: Measurement setup with Omicron Grandmaster Clock

In order to get an overview of the stability of this clock, measurements were carried out in the TSEP laboratory (see image above). Several computer boards and PC plug-in cards with the corresponding network chips were measured. For this purpose, the internal clock on the network chips was operated with a constant adjustment over several hours. A pps signal (pulse per second) was programmed on the GPIO of the Intel network chip and measured with a corresponding oscilloscope. An Omicron Grandmaster Clock was used as the grandmaster clock.

Figure 6: Room temperature error distribution

The diagram above shows the deviation in the period of the pps signal. The clock error is distributed around the center position, and the measurement shows an error of around ±2000 ns per one-second pps period. At a counter clock of 125 MHz, i.e. 125 x 10^6 increments per second, this corresponds to an error per increment of:

2000 ns / (125 x 10^6) = 16 x 10^-15 s

This error value was also measured with other Intel I21x network cards and with embedded computers based on the Intel I21x.

Another point is the temperature stability of the internal clock of the Intel I21x chips. The ambient temperature of the network chip was varied for this purpose. Below are two diagrams showing the error in the corner temperatures for the chip.

Figure 7: Error distribution of cooled network chip

Figure 8: Error distribution of heated network chip

You can see from the diagrams that the error distribution is clearly different. The error also occurs noticeably more frequently at higher temperatures.

Multiplexing for Gigabit Ethernet data transmission

In order to better understand the following problems, some basics about Gigabit Ethernet according to IEEE standard 802.3 and the Intel network chips must be discussed. Gigabit Ethernet (1 Gbit/s) evolved from the previously established 100 Mbit/s Ethernet standard. The CAT 5 Ethernet cables used were intended for the transmission of signals at 125 MHz. These cables had four pairs (two wires each) for this purpose. However, data was only transmitted via two pairs. With Gigabit Ethernet, two bits are now transmitted per clock cycle on each of the four pairs:

125 MHz x 2 bits x 4 communication channels = 1000 Mbit/s = 1 Gbit/s

This means that two bits are always transmitted on each of the four twisted pairs. The transmitter must split each byte into four parts and the receiver must reassemble them. As already mentioned, the processing clock is 125 MHz.
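The splitting and reassembly of a byte can be illustrated as follows. The Python sketch models only the 4 x 2 bit partitioning described above; the real 1000BASE-T line coding (PAM-5 symbols with scrambling) is considerably more involved, and the function names are hypothetical.

```python
from typing import List


def split_byte(value: int) -> List[int]:
    """Split one byte into four 2-bit parts, most significant part first."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit into one byte")
    return [(value >> shift) & 0b11 for shift in (6, 4, 2, 0)]


def join_parts(parts: List[int]) -> int:
    """Reassemble the byte once all four parts have arrived."""
    if len(parts) != 4:
        raise ValueError("exactly four parts are required")
    value = 0
    for part in parts:
        value = (value << 2) | (part & 0b11)
    return value


# Data rate from the text: 125 MHz x 2 bits x 4 pairs, in bit/s.
LINE_RATE_BPS = 125_000_000 * 2 * 4
```

The receiver can only call join_parts when the slowest of the four parts has arrived, which is the source of the delay discussed in this section.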

The four wire pairs are used simultaneously in both directions. The frequency used within Gigabit Ethernet is 125 MHz according to GMII (Gigabit Media Independent Interface). The frequencies at the transmitter and receiver are not necessarily coupled. Depending on the data transmission, the frequency for the data is derived from the device's own clock or from that of the network. Because of this, and because the four wire pairs are unlikely to be identical in length, a different transit time must be expected for the individual partial data (4 x 2 bits). The parts can only be reassembled once all four data parts of a byte are available, which is why delays can occur. Depending on the scenario, these delays can reach up to 20 ns. The entire problem is described very well in the document “Improving IEEE 1588 synchronization accuracy in 1000BASE-T systems” [1]. As these effects are in the nanosecond range, they have a considerable impact on the accuracy of time synchronization via IEEE 1588. With copper-based Gigabit Ethernet implementations, it is therefore very difficult or almost impossible to reach the sub-nanosecond range. This was certainly one of the reasons why the White Rabbit system [2] resorts to fiber optic networks.

Consideration of runtime errors with “Long Linear Paths”

A setup with only a grandmaster clock, a single transparent clock (switch) and a slave clock can be expected only in small systems or during development. In practice, you have to assume connections in which the path from the grandmaster clock runs via several transparent clocks or even non-PTP-compliant switches. The errors of all these nodes add up over the transport chain from the master to the slave. To improve your own control algorithm, you can try to record the errors that occur in the individual switching nodes and incorporate them into the algorithm. To this end, the Kalman filter can be used to create a model that takes the individual errors into account. The paper “Accurate Time Synchronization in PTP-based Industrial Networks with Long Linear Path” [4] by Daniele Fontanelli and David Macii presents such a model using a Kalman filter. It also shows that this approach can improve the accuracy of IEEE 1588 systems in this scenario.

The Kalman filter

As early as 1960, Rudolf E. Kálmán developed a special method for discrete-time linear systems to estimate the states of a system (including their parameters) from noisy and partially redundant measurements. This method became known as the Kalman filter and was first published in [3]. Since then, many different variants of the Kalman filter have been published. The following description can also be found in detail in [5].

In order for the Kalman filter to be used correctly, the basic conditions of the measurement system must be known. A classic Kalman filter consists of a state space description alongside the real measurement system with its own system and measurement noise. From this, the prediction and correction steps can be calculated. In principle, a Kalman filter estimates the output variable ŷ(k) and compares it with the measured output variable y(k) of the real measurement system. The difference Δy(k) between the two values is weighted with the Kalman gain K(k) and used to correct the estimated state vector x̂(k). The structure can be described as follows:

Figure 9: Structure of a Kalman filter

The five basic Kalman filter equations can be derived from this structure. Here x̂ and P̂ denote the predicted state and error covariance, x̃ and P̃ the corrected ones, ^T the transpose and ^-1 the inverse.

Prediction:

x̂(k+1) = Ad * x̃(k) + Bd * u(k)

P̂(k+1) = Ad * P̃(k) * Ad^T + Gd * Q(k) * Gd^T, where Q(k) = variance(z(k))

Correction:

K(k) = P̂(k) * C^T * (C * P̂(k) * C^T + R(k))^-1, with R(k) = variance(v(k))

x̃(k) = x̂(k) + K(k) * (y(k) - C * x̂(k) - D * u(k))

P̃(k) = (I - K(k) * C) * P̂(k)
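To make the five equations more tangible, the following Python sketch applies them to a deliberately small model. All modeling choices are assumptions made for illustration and are not taken from the article: the state is [offset, drift], only the offset is measured (C = [1, 0], D = 0), there is no input term, Ad = [[1, ts], [0, 1]], Gd is the identity and Q is diagonal. Because the measurement is a scalar, the matrix inverse reduces to a division.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def kalman_step(x: Vector, P: Matrix, y: float, ts: float,
                q: Tuple[float, float], r: float) -> Tuple[Vector, Matrix]:
    """One prediction/correction cycle for x = [offset, drift], measured y = offset."""
    # Prediction of the state: x_pred = Ad * x (no input term in this model)
    x_pred = [x[0] + ts * x[1], x[1]]
    # Prediction of the covariance: P_pred = Ad * P * Ad^T + Q
    a, b, c, d = P[0][0], P[0][1], P[1][0], P[1][1]
    p00 = a + ts * (b + c) + ts * ts * d + q[0]
    p01 = b + ts * d
    p10 = c + ts * d
    p11 = d + q[1]
    # Kalman gain: K = P_pred * C^T * (C * P_pred * C^T + R)^-1
    s = p00 + r
    k0, k1 = p00 / s, p10 / s
    # State correction: x_new = x_pred + K * (y - C * x_pred)
    innovation = y - x_pred[0]
    x_new = [x_pred[0] + k0 * innovation, x_pred[1] + k1 * innovation]
    # Covariance correction: P_new = (I - K * C) * P_pred
    P_new = [
        [(1.0 - k0) * p00, (1.0 - k0) * p01],
        [p10 - k1 * p00, p11 - k1 * p01],
    ]
    return x_new, P_new
```

Called once per synchronization interval with the measured offset, the function returns the corrected state and covariance that serve as the starting point for the next interval. The values for q and r have to be supplied by the caller.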

The derivation is omitted due to its complexity. However, it can be found in [3] or [5].

When designing a Kalman filter, the physical system must be described in continuous time using differential equations. This description determines the output vector y(t), which must contain all noisy variables. The noise-free variables are described separately in the input vector u(t). There can be several approaches for choosing the state vector x(t). As a rule, these options should be evaluated, and the approach that best describes the problem should then be used.

Once all equations and parameters have been selected, the system has the following form:

ẋ(t) = A * x(t) + B * u(t) + G * z(t)

y(t) = C * x(t) + D * u(t)

This continuous-time description must then be converted into a discrete-time description. The following formulas can be used for this:
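A common textbook choice is the zero-order-hold discretization, given here as an assumption: it presumes that the input u(t) is held constant between two samples, and the author's own formulas may use a different notation.

```latex
A_d = e^{A T_S}, \qquad
B_d = \left( \int_0^{T_S} e^{A \tau} \, d\tau \right) B
```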

TS is the sampling interval of the system. The matrices C and D remain identical in the discrete-time system.

Several approaches can be used to determine the matrix Gd, such as sampling using the Dirac impulse. The method used must be determined individually.

Once the description of the system in the state space is available, the observability can be checked. According to [3] or [5], a linear time-invariant system of order n is observable if the observability matrix SB or S*B has rank n. However, Gilbert’s or Hautus’ criteria can also be used. If the system is not observable, it can be divided into an observable and a non-observable system.
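The rank criterion can be checked numerically with a few lines of Python. The sketch assumes the usual construction of the observability matrix by stacking C, C*A, ..., C*A^(n-1); the helper names are hypothetical and the rank is determined by simple Gaussian elimination with a tolerance.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(m: Matrix, tol: float = 1e-9) -> int:
    """Rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    rows, cols = len(m), len(m[0])
    r = 0
    for col in range(cols):
        if r == rows:
            break
        pivot = max(range(r, rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][col] / m[r][col]
            for j in range(col, cols):
                m[i][j] -= factor * m[r][j]
        r += 1
    return r


def is_observable(A: Matrix, C: Matrix) -> bool:
    """True if the stacked matrix [C; C*A; ...; C*A^(n-1)] has rank n."""
    n = len(A)
    stacked: Matrix = []
    block = C
    for _ in range(n):
        stacked.extend(block)
        block = matmul(block, A)
    return rank(stacked) == n
```

For a two-state clock model with A = [[1, 1], [0, 1]], measuring the first state (C = [[1, 0]]) gives an observable system, whereas measuring only the second state (C = [[0, 1]]) does not.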

Finally, the system noise Q(k) and the measurement noise R(k) must be described.

Q(k) = variance(z(k))

R(k) = variance(v(k))

To make optimum use of Kalman filters, these two variables must be determined as accurately as possible. In simple systems they can be treated as approximately constant, but this does not apply in the IEEE 1588 environment. Here it must be assumed that these variables change during runtime and therefore have to be adjusted adaptively.
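The simplest conceivable form of such an adaptation is a variance estimate over a sliding window, sketched below in Python. This is a generic illustration and not a specific published method: the class name SlidingVariance and the window length are assumptions, and how the noise samples v(k) are obtained from the measurements is left open.

```python
import statistics
from collections import deque


class SlidingVariance:
    """Running variance estimate over the most recent noise samples."""

    def __init__(self, window: int = 32):
        self.values = deque(maxlen=window)  # hypothetical window length

    def update(self, sample: float) -> float:
        """Add one sample and return the current variance estimate."""
        self.values.append(sample)
        if len(self.values) < 2:
            return 0.0  # not enough data for a variance yet
        return statistics.pvariance(self.values)
```

The returned value could be used as R(k) in the correction step; a short window reacts quickly to changes, a long one gives a smoother estimate.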

There is a variant of the Kalman filter that works with an adaptive determination of the two covariance matrices: the ROSE filter (Rapid Ongoing Stochastic Covariance Estimation Filter). The principle is based on cyclically redetermining the covariance of the measurement noise R(k) by observing the measurable variable y(k) using two embedded simple Kalman filters (with constant Kalman gain). Similarly, the covariance of the system noise Q(k) is determined using the value Δy(k), the measured quantity y(k), the covariance of the estimation error predicted for step k+1, the covariance of the measurement noise R(k) and a simple Kalman filter.

If one considers the individual steps that lead to a complete description of the state space and the determination of the covariances of the system and measurement noise, one can say that the creation of a Kalman filter is anything but trivial. However, the boundary conditions of the measurement system can be optimally embedded in the Kalman filter, which leads to the best possible correction of the internal IEEE 1588 clock.

Conclusion

The successful use of an IEEE 1588 implementation is not only based on the use of an existing IEEE 1588 stack or special hardware. The problem-oriented solution approach is the key to success here. The possibilities of IEEE 1588 are wide-ranging. However, without knowing the requirements for the necessary accuracy for an IEEE 1588 implementation and the available hardware topology, it is difficult to provide a system that functions within the framework of the requirements. Analysis in advance is absolutely essential and requires the appropriate know-how.

The choice of hardware is of essential importance, as the individual components influence the accuracy in different ways. Especially for high accuracies in the sub-nanosecond range, the use of fiber optic-based systems is indispensable. The White Rabbit project has done good groundwork here and created the appropriate hardware and the necessary boundary conditions. The Intel network chips can also work with fiber optic-based PHYs, and the corresponding hardware is available on the market as consumer goods.

A perfectly tuned control algorithm is essential for high-precision systems. Choosing the right algorithm is not easy, requires a lot of know-how and must be adapted to the hardware and its topology. The implementation and simulation of the algorithm, especially for high-precision systems, is a not insignificant part of an IEEE 1588 implementation. However, it is also the key to the success of such an implementation. Kalman filters are very well suited for the implementation of such efficient control algorithms. However, the description of the state space and the covariances of the system and measurement noise require a lot of effort and time.

This article does not claim to present all facets of this extremely complex subject, but is intended to provide a brief overview of the problems and possible solutions. Each available IEEE 1588 stack only represents the tools for implementation. The correct use and implementation of the control algorithm is the actual task.

Author: Peter Plazotta, TSEP

Source: channel-e