The Trend of DSP Applications in the Data Center



100G has begun to be deployed at scale in data centers, and the next-generation 400G is expected to enter commercial use around 2020. For 400G applications, the biggest change is the introduction of a new modulation format, PAM-4, which doubles the transmission rate at the same baud rate (i.e., the same device bandwidth). For example, the single-lane rate of DR4, used for transmission up to 500m, needs to reach 100Gb/s. To realize such rates, data center optical transceiver modules have begun to introduce Digital Signal Processor (DSP) chips based on digital signal processing, replacing the clock recovery chips of the past, in order to solve the sensitivity problems caused by the insufficient bandwidth of the optical devices. Can DSP become the broad solution for future data center applications that the industry expects? To answer this question, we need to understand what problems DSP can solve, what its architecture looks like, and how its cost and power consumption will trend in the future.




The Problems that DSP Can Solve


In the field of physical-layer transmission, DSP was first applied in wireless communications, for three reasons. First, wireless spectrum is a scarce resource while the demand for transmission rate keeps growing, so improving spectral efficiency is a fundamental requirement of wireless communications, and DSP is needed to support a variety of complex, highly efficient modulation formats. Second, the transfer function of the wireless channel is very complicated: the multipath effect, and the Doppler effect under high-speed motion, cannot be compensated adequately by traditional analog techniques, whereas DSP can use various mathematical models to compensate the channel's transfer function well. Third, the Signal-to-Noise Ratio (SNR) of the wireless channel is generally low, so Forward Error Correction (FEC) must be used to improve the sensitivity of the receiver.




In the field of optical communications, DSP was first commercially used in long-haul coherent transmission systems at 100G and above. The reasons are similar to those in wireless communications. First, since laying optical fiber is very expensive, improving spectral efficiency to achieve higher transmission rates on a single fiber is an inevitable requirement for operators; therefore, beyond WDM technology, the use of DSP-based coherent technology became an inevitable choice. Second, in long-haul coherent systems, a DSP chip can easily compensate the dispersion effects, the nonlinear effects caused by the transmitter (Tx) and receiver (Rx) devices and by the fiber itself, and the phase noise introduced by the Tx and Rx devices, without the Dispersion Compensation Fiber (DCF) that used to be placed in the optical link. Finally, in long-haul transmission, because of fiber attenuation, an optical amplifier (EDFA) is generally used to amplify the signal roughly every 80km in order to reach transmission distances up to 1000km. Each amplification adds noise to the signal and reduces its SNR, so FEC must be introduced to improve the receiver sensitivity.
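As a rough illustration of the kind of impairment a coherent DSP removes, the sketch below applies a frequency-domain chromatic dispersion compensator to a block of received samples. The dispersion parameter, wavelength and the sign convention of the phase term are illustrative assumptions, not taken from any particular product.

```python
import numpy as np

def cd_compensate(samples, fs_hz, length_km, D_ps_nm_km=17.0, wavelength_nm=1550.0):
    """Frequency-domain chromatic dispersion compensation (illustrative sketch).

    The sign of the phase term depends on the FFT/propagation convention in use.
    """
    c = 299_792_458.0                       # speed of light, m/s
    lam = wavelength_nm * 1e-9              # wavelength, m
    D = D_ps_nm_km * 1e-6                   # ps/(nm*km) -> s/m^2
    beta2 = -D * lam**2 / (2 * np.pi * c)   # group-velocity dispersion parameter
    L = length_km * 1e3                     # fiber length, m
    omega = 2 * np.pi * np.fft.fftfreq(len(samples), d=1.0 / fs_hz)
    H_inv = np.exp(-1j * 0.5 * beta2 * omega**2 * L)   # inverse of the fiber's CD response
    return np.fft.ifft(np.fft.fft(samples) * H_inv)
```

Because the fiber's dispersion response is a pure, known phase term, removing it digitally like this is far cheaper than deploying DCF spools in the link.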




To sum up, DSP can solve three problems. First, it supports high-order modulation formats and improves spectral efficiency. Second, it compensates the impairments caused by components and by channel transmission. Third, it addresses the SNR problem.




Whether similar requirements exist in the data center is therefore an important basis for judging whether DSP should be introduced there.




First of all, let's look at spectral efficiency. Does the data center need to improve spectral efficiency? The answer is yes. But unlike the transport network, where wireless spectrum and fiber resources are scarce, the reason for improving spectral efficiency in the data center is the insufficient bandwidth of the electrical/optical devices and the limited number of wavelength-division/parallel lanes (constrained by the size of the optical transceiver module). Therefore, to meet the needs of future 400G applications, we must rely on increasing the single-lane rate.
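For orientation, the arithmetic behind the single-lane rates mentioned above can be written down in a few lines; the DR4 lane parameters below follow the commonly cited IEEE 802.3bs figures and are shown purely as a worked example.

```python
# Worked rate example for 400G DR4 (illustrative; figures per commonly cited 802.3bs parameters)
lanes = 4
baud_per_lane_gbd = 53.125                 # optical lane symbol rate, GBd
bits_per_symbol = 2                        # PAM-4 carries 2 bits per symbol
line_rate_per_lane = baud_per_lane_gbd * bits_per_symbol    # 106.25 Gb/s (includes FEC overhead)
total_line_rate = lanes * line_rate_per_lane                # 425 Gb/s gross for a 400 Gb/s MAC rate
print(line_rate_per_lane, total_line_rate)
```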




The second point: for single-lane 100G and above, current Tx electrical driver chips and optical devices cannot reach bandwidths above 50GHz, which is equivalent to introducing a low-pass filter at the transmitter. In the time domain this manifests as inter-symbol interference. Taking 100G PAM-4 as an example, the bandwidth-limited modulation device makes the eye opening of the optical signal very small, so the analog-PLL-based clock recovery of the past cannot find the best sampling point and the receiver cannot recover the signal (this is also why TDECQ in the standards introduces an adaptive filter for equalization). With a DSP, the signal can be spectrally compressed directly at the Tx end. For example, the extreme approach is to deliberately introduce inter-symbol interference between two adjacent symbols to reduce the signal bandwidth at the Tx end; the PAM-4 eye diagram on the oscilloscope then takes on a 7-level (PAM-7-like) form, and the Rx end recovers the signal with an adaptive FIR filter. In this way, the uncontrollable analog bandwidth effects of the modulating/receiving devices become a known digital spectral compression, reducing the bandwidth requirement on the optical devices. Fujitsu's DMT (Discrete Multi-Tone) modulation technology, promoted together with DSP, can even use a 10G optical device to transmit 100G signals.
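A minimal sketch of the "extreme" spectral compression just described: deliberately summing each PAM-4 symbol with its neighbour (a duobinary-style 1+D partial response) yields a 7-level signal whose spectrum is roughly halved. The symbol values and the 1+D choice are illustrative assumptions, not a specific vendor scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
pam4 = rng.choice([-3, -1, 1, 3], size=10000)   # random PAM-4 symbol stream
shaped = pam4[1:] + pam4[:-1]                   # 1+D filter: intentional ISI between adjacent symbols
print(sorted(set(shaped.tolist())))             # 7 levels: [-6, -4, -2, 0, 2, 4, 6]
```

Because this ISI is introduced deliberately, it is fully known to the receiver, which undoes it with its adaptive FIR equalizer (a sketch of such an equalizer appears in the architecture section below).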




Third, does FEC really need to be introduced at the module end? Inside the data center, the maximum transmission distance is no more than 10km and the link budget is about 4dB including connector losses, so the SNR penalty caused by the link itself is basically negligible. FEC in the data center is therefore not intended to solve the link SNR but to compensate for the performance shortfall of the optical devices. At the same time, the electrical interface of the optical module is upgraded from 25G NRZ to 50G PAM-4 (net rate) in the 400G era, so the electrical FEC often has to be enabled anyway to meet the requirements of the transmission between the optical transceivers and the switches. In that case, enabling FEC again on the module side is unnecessary and brings no benefit, because what matters for an FEC is its error-correction threshold. For example, if a 7% overhead FEC has its error-correction threshold at a Bit Error Rate (BER) of 1E-3, it can correct essentially all errors below this BER, while above this BER it is essentially useless (leaving aside burst errors, which are usually handled with an interleaver). Therefore, using multiple FECs is no better than using only the best one. Considering the power consumption and latency that FEC adds on the module side, it may be better in the future to enable FEC on the switch side.
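The threshold behaviour referred to above can be illustrated with the standard tail formula for a hard-decision t-error-correcting block code. The parameters below resemble an RS(544,514) code over 10-bit symbols (t = 15), but they are used only as an example, not as the threshold of any specific standard FEC.

```python
from math import comb

def block_failure_prob(pre_fec_ber, n=544, t=15, m=10):
    """Probability that a hard-decision decoder cannot correct a codeword
    (random, independent errors; burst errors / interleaving ignored)."""
    p_sym = 1.0 - (1.0 - pre_fec_ber) ** m                        # symbol error probability
    p_ok = sum(comb(n, i) * p_sym**i * (1.0 - p_sym) ** (n - i)   # <= t symbol errors: correctable
               for i in range(t + 1))
    return 1.0 - p_ok

for ber in (1e-2, 3e-3, 1e-3, 3e-4, 1e-4):
    print(f"pre-FEC BER {ber:.0e} -> codeword failure probability {block_failure_prob(ber):.2e}")
```

The failure probability collapses by orders of magnitude once the input BER drops below the code's threshold, which is why stacking a second FEC of similar strength adds power and latency but essentially no reach.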




The Architecture of DSP


In the optical communication field, a DSP generally consists of several parts: the mixed-signal front-end section, including the ADC (Analog-to-Digital Converter, required), the DAC (Digital-to-Analog Converter, optional) and the SerDes; the digital signal processing section (including FEC); and the PHY section. The PHY section is similar to a CDR chip with PHY functions and will not be described here.







The main function of the ADC and DAC is to convert between analog and digital signals; they are the bridge between the modulation devices and the digital signal processing section. An ADC/DAC has four key indicators: sampling rate, effective number of bits, analog bandwidth and power consumption. For 100G PAM-4, the sampling rate of the Rx-end ADC needs to reach 100GS/s, otherwise aliasing will occur during sampling and distort the signal. The effective number of bits is also very important: for PAM-4, 2 effective bits are not enough for the digital signal processing; at least 4 are needed. Analog bandwidth is currently the main technical challenge for the ADC/DAC, and it is constrained by both the effective bit width and the power consumption. In general there are two ways to implement a high-bandwidth ADC/DAC: SiGe and CMOS. The former has a high cutoff frequency and can easily achieve high bandwidth, but its power consumption is very high, so it is generally used in instrumentation. The cutoff frequency of CMOS is much lower, so to achieve a high sampling rate, many sub-ADCs/DACs must be combined by time interleaving; the advantage is low power consumption. For example, in a coherent 100G system, a 65GS/s ADC with 6 effective bits is composed of 256 sub-ADCs, each with a sampling rate of 254MS/s. Note that although this ADC has a sampling rate of 65GS/s, its analog bandwidth is only 18GHz, and with a clock jitter of 100fs, the maximum analog bandwidth at 4 effective bits is only about 30GHz. An important conclusion follows: when a DSP is used, the bandwidth bottleneck of the system is generally no longer the optical device but the ADC and DAC.
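The interplay between analog bandwidth, effective bits and clock jitter follows from the standard aperture-jitter formula; the sketch below evaluates that ceiling for a 100fs clock. Real converters sit below this ceiling because thermal noise, comparator ambiguity and interleaving mismatch also erode the effective bits, so treat the numbers as an upper bound only.

```python
import math

def jitter_limited_enob(f_in_hz, jitter_rms_s):
    """Upper bound on effective bits set by sampling-clock jitter alone."""
    snr_db = -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)  # jitter-limited SNR
    return (snr_db - 1.76) / 6.02                                        # SNR -> effective number of bits

for f_ghz in (10, 18, 30, 50):
    enob = jitter_limited_enob(f_ghz * 1e9, 100e-15)
    print(f"{f_ghz:>2} GHz input, 100 fs rms jitter -> ENOB ceiling {enob:.1f} bits")
```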







In data center applications, the digital signal processing unit is still relatively simple. For 100G PAM-4, for example, it performs spectral compression of the transmitted signal, nonlinear compensation and (optionally) FEC encoding at the Tx end; at the Rx end, after the ADC, it compensates the signal with an adaptive filter and performs clock recovery in the digital domain (a separate external crystal is required). In the digital signal processing unit, an FIR filter is generally used to compensate the signal, and the number of taps and the design of the decision function directly determine both the compensation performance and the power consumption of the DSP. It should be pointed out that DSP in optical communications faces a massive parallel-computing problem. The main reason is the huge gap between the ADC sampling rate (tens of GS/s or even 100GS/s) and the operating frequency of digital circuits (up to several hundred MHz): to support an ADC sampling at 100GS/s, the digital circuit has to convert the serial 100GS/s stream into hundreds of parallel digital lanes for processing. One can imagine that when the FIR filter adds just one tap, in reality hundreds of taps have to be added. How to balance performance and power consumption in the digital signal processing unit is therefore the key factor that determines the quality of a DSP design. In addition, inside the data center, optical transceiver modules must be interoperable. In practice, the transmission performance of a link depends on the combined performance of the DSP and the analog optical devices at both the Tx and Rx ends, and designing a reasonable standard that correctly evaluates the performance of the Tx and Rx ends separately is also difficult. When the DSP supports enabling FEC in the physical layer, synchronizing the FEC between the transmitting and receiving transceivers further increases the difficulty of data center testing. This is why, so far, coherent transmission systems only interoperate between equipment from the same manufacturer and are not required to interoperate across manufacturers. (For PAM-4, IEEE 802.3 proposes the TDECQ method to evaluate transmitter performance.)
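To make the Rx-side adaptive FIR concrete, here is a minimal decision-directed LMS equalizer for a PAM-4 sample stream. The tap count, step size and slicer levels are illustrative assumptions; a real implementation would also unroll this loop across hundreds of parallel lanes, as discussed above.

```python
import numpy as np

def lms_pam4_equalizer(samples, n_taps=9, mu=1e-3):
    """Decision-directed LMS FIR equalizer (serial reference model, not a parallel hardware design)."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])      # nominal PAM-4 levels
    w = np.zeros(n_taps)
    w[n_taps // 2] = 1.0                           # start from a pass-through (centre-spike) filter
    buf = np.zeros(n_taps)                         # FIR delay line
    out = np.empty(len(samples))
    for k, x in enumerate(samples):
        buf = np.roll(buf, 1)                      # shift the delay line
        buf[0] = x
        y = float(w @ buf)                         # FIR output
        d = levels[np.argmin(np.abs(levels - y))]  # slicer: nearest PAM-4 level
        w += mu * (d - y) * buf                    # LMS tap update driven by the decision error
        out[k] = y
    return out, w
```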




Power Consumption and Cost


Because DSP introduces the DAC/ADC and the algorithms, its power consumption is bound to be higher than that of a traditional CDR chip based on analog technology, and the ways to lower DSP power consumption are relatively limited, relying mainly on advances in the fabrication process: for instance, moving from the current 16nm to a 7nm process can cut power consumption by about 65%. The current design power consumption of a 400G OSFP/QSFP-DD module based on a 16nm DSP is around 12W, which is a huge challenge for the thermal design of the module itself and of the future switch front panel. Solving the 400G DSP problem may therefore depend on the 7nm process.




Price is always a topic of concern in the data center. Unlike traditional optical devices, DSP chips are based on mature semiconductor technology, so their cost can be expected to fall considerably once volumes are large. Another advantage of DSP in future data center applications is flexibility: with the same optical device configuration, different data rates and scenarios can be supported simply by adjusting the DSP configuration.




Article Source: http://www.gigalight.com/news_detail/newsId=422.html



