ABSTRACT

In communication systems, a decision feedback equalizer (DFE) is used to combat noise and inter-symbol interference (ISI). In high-speed digital communication, the complexity of a time-domain DFE increases exponentially with the feedback filter order. In this paper we propose an architecture that reduces this complexity and increases the time-domain throughput for 5G. This is achieved by speeding up the conventional architecture in advance (pre-speed-up). With this method the number of feedback filter coefficients is reduced, which decreases the hardware complexity. The convergence and steady-state error performance are also improved without any additional hardware. The method stores the past values of the decision device in a separate look-up table (LUT) and updates it periodically using a parallel error multiplexer. The resulting architecture reduces power, area and logic utilization.

Keywords: Decision feedback equalizer, 5G, parallel error multiplexer, inter-symbol interference.

INTRODUCTION

Wireless communication is currently the fastest-growing segment of communication technology. Its rapid advancement over the past two decades has attracted wide attention, and this growth has been nothing short of monumental. These advances have led to operational data rates of up to gigabits per second, which in turn led to the development of the IEEE 802.11ac standard, referred to here as 5G. The estimated data rates for 5G are 1-10 Gbps, which can be achieved with multicarrier modulation techniques. Despite these high data rates, such systems are vulnerable to inter-symbol interference (ISI) at high frequencies and sensitive to inter-carrier interference (ICI) caused by frequency misalignment. Therefore, to provide robustness, QAM is coupled with OFDM.

EVOLUTION OF 5G

OFDM and OFDMA are the modulation and multiple-access techniques used in the LTE fourth-generation (4G) cellular network standards. Both techniques succeeded the CDMA used in 3G, for several reasons. The merits of OFDM and OFDMA are:

1. Ease of implementation with the FFT and IFFT in the transmitter and receiver.

2. Ability to overcome multipath distortion.

3. Subcarrier orthogonality to eliminate ICI.

4. Possibility of adapting the modulation cardinality and transmitted power.

5. Integration of multi-antenna hardware at both the receiver and the transmitter.

Although this approach has many advantages, it is limited by factors such as its large side lobes, which require null guard tones at the spectrum edges, and the need for a cyclic prefix (CP). This led to the development of a new generation of architectures that couple OFDM with QAM.

OFDM-QAM ARCHITECTURE

The block diagram of the OFDM-QAM system is shown below. It includes the TEQ block, FFT, DAC, FEQ and QAM mapper.

Fig.1 OFDM-QAM Architecture with TEQ

The QAM mapper at the input converts the incoming data bits into QAM symbols. Each QAM symbol comprises two components, namely in-phase (I) and quadrature (Q). The I and Q values define the amplitudes of the pulses given to the pulse shaping filter (S-P). The translation of bits to symbols and vice versa is represented by a symbol map diagram.
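
The bit-to-symbol translation can be illustrated with a minimal sketch for 16-QAM. The Gray-coded level table and the function names qam16_map and qam16_demap are assumptions made for illustration; the paper does not specify its symbol map.

```python
import numpy as np

# Assumed Gray-coded mapping of 2 bits to one axis amplitude (illustrative only).
GRAY_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def qam16_map(bits):
    """Map a bit sequence (length divisible by 4) to complex 16-QAM symbols."""
    rows = np.asarray(bits, dtype=int).reshape(-1, 4).tolist()
    i = np.array([GRAY_LEVELS[(r[0], r[1])] for r in rows])  # in-phase (I)
    q = np.array([GRAY_LEVELS[(r[2], r[3])] for r in rows])  # quadrature (Q)
    return i + 1j * q

def qam16_demap(symbols):
    """Recover bits by picking the nearest level on each axis."""
    inverse = {level: pair for pair, level in GRAY_LEVELS.items()}
    bits = []
    for s in np.asarray(symbols):
        i_level = min(inverse, key=lambda lv: abs(lv - s.real))
        q_level = min(inverse, key=lambda lv: abs(lv - s.imag))
        bits.extend(inverse[i_level] + inverse[q_level])
    return np.array(bits)
```

With this assumed table, every group of four bits yields one symbol: the first two bits select the I amplitude and the last two select the Q amplitude.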

The IFFT block returns the normalized, discrete, univariate inverse fast Fourier transform of the values given by the pulse shaping filter together with the pulse symbols. The AddCP block adds a cyclic prefix of the given length to the symbols. The result is then fed to the digital-to-analog converter (DAC) block. The obtained signal is then modulated into the RF range, between 20 kHz and 300 GHz, and transmitted over the air.
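
The IFFT and AddCP steps can be sketched as follows. This is a minimal sketch, assuming a unitary ("ortho") normalization and a single OFDM symbol; the function names and the parameter cp_len are hypothetical and are not taken from the paper.

```python
import numpy as np

def ofdm_modulate(symbols, cp_len):
    """Normalized IFFT of one block of symbols, followed by cyclic-prefix insertion."""
    x = np.fft.ifft(symbols, norm="ortho")        # unitary inverse FFT
    prefix = x[len(x) - cp_len:]                  # last cp_len time-domain samples
    return np.concatenate([prefix, x])            # prefix is prepended to the block

def ofdm_demodulate(samples, cp_len):
    """Receiver counterpart: drop the cyclic prefix and apply the forward FFT."""
    return np.fft.fft(samples[cp_len:], norm="ortho")
```

On an ideal channel, ofdm_demodulate(ofdm_modulate(s, cp_len), cp_len) returns the original block s, which is a quick way to check the two steps.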

At the receiver, the received RF signal is passed through the analog-to-digital converter (ADC) block, and the pulses are recovered from the digitized signal by time-domain equalization (TEQ). TEQ is widely used because of its simplicity and ease of implementation. Zero forcing (ZF) is a conventional TEQ technique that estimates the inverse of the channel transfer function. Because inverting the channel also amplifies noise at frequencies where the channel response is weak, the ZF technique has been replaced by maximum likelihood sequence detection (MLSD) and decision feedback equalization (DFE) techniques.

MLSD is used for short-span ISI because its complexity increases exponentially with the channel memory. In practice, wireless channels are not limited to short-span ISI, so MLSD is not used. In contrast, DFE provides better performance and is robust against ISI, although its feedback loop places an upper bound on its maximum speed. Many high-speed architectures have been proposed to overcome the speed limitation of DFE at the cost of more hardware. The main idea is to reformulate the original architecture into arrays of adders, slicers and multiplexers selected by past decisions.

Later, LUT-based feedback filters (FBF) were presented to lower the hardware complexity. This is achieved by precomputing the FBF coefficient terms and storing them in a look-up table (LUT) whose address lines are driven by the past decisions. The filter coefficients must be updated frequently as the channel changes. This category therefore falls under adaptive DFE, which is more complex than non-adaptive DFE because of the limited time available for updating. To reduce this computational complexity, the relaxed look-ahead technique is used, but the architectural complexity increases because more multipliers are needed.

The precomputation technique used here is distributed arithmetic (DA), which transforms the multiply-and-accumulate (MAC) units into look-up tables (LUTs). In this method the filter coefficients and inputs are stored in separate look-up tables, which increases the performance efficiency of the filter. In every iteration the coefficient LUT is updated, which consumes time and power. To overcome this limitation, the offset binary coding (OBC) scheme is used. On the whole, the computational complexity is reduced at the cost of circuit complexity. To overcome this limitation, we propose a low-complexity, high-efficiency architecture for both non-adaptive and adaptive DFEs using the concept of pre-speed-up.
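
The way distributed arithmetic replaces MAC units with table reads can be illustrated with a minimal sketch. It assumes fixed integer coefficients and signed two's-complement inputs, and it omits the LUT update and the OBC scheme; the function names are hypothetical and this is textbook DA rather than the exact circuit of the paper.

```python
def build_da_lut(coeffs):
    """LUT[a] holds the sum of coeffs[k] over every k whose bit is set in address a."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (a >> k) & 1)
            for a in range(1 << n)]

def da_inner_product(lut, x, bits):
    """Inner product of the coefficients with signed `bits`-bit integers x,
    computed with LUT reads, shifts and adds only (no multiplications by x)."""
    acc = 0
    for j in range(bits):
        addr = 0
        for k, xk in enumerate(x):
            addr |= ((xk >> j) & 1) << k      # bit j of every input word forms the address
        term = lut[addr] * (1 << j)           # weight of bit position j
        # In two's complement the most significant bit carries a negative weight.
        acc += -term if j == bits - 1 else term
    return acc
```

For N coefficients the table has 2^N entries, which is why the LUT size, rather than the multiplier count, becomes the dominant cost as the filter order grows.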

The existing design requires the feedforward filter output for precomputation of the feedback filter, which is not necessary in the proposed system.

DECISION FEEDBACK EQUALIZER

In communication systems, equalizers are used to mitigate ISI and to recover the original transmitted symbols. A linear equalizer is placed in series with the channel to approximate the inverse of the channel transfer function. It consists of real and complex FIR filters to handle the real and complex received data. The coefficients of these filters are updated using the least mean square (LMS) algorithm. However, the performance of a linear equalizer is poor on strongly distorted channels, where noise is amplified along with the symbols at higher frequencies, resulting in inter-symbol interference (ISI). To overcome these limitations, DFEs are used. A DFE uses feedback of the detected symbols to produce an estimate of the channel output contributed by past symbols, and this estimate is subtracted from the output of the linear equalizer.

Fig.2 Reformulated DFE for 2nd order FBF

Fig. 2 shows the architecture of the conventional adaptive DFE. It consists of two FIR filters, namely the feedforward filter (FFF) and the feedback filter (FBF), of orders L and N respectively. The operation of the DFE is described by the following set of equations:

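\begin{align}
\tilde{y}_n &= f_n^T x_n - b_n^T d_n \tag{1a} \\
e_n &= \hat{d}_n - \tilde{y}_n \tag{1b} \\
\hat{d}_n &= Q[\tilde{y}_n] \tag{1c} \\
f_{n+1} &= f_n + \mu e_n x_n \tag{1d} \\
b_{n+1} &= b_n + \mu e_n d_n \tag{1e}
\end{align}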

where n is the current time instant, x_n is the vector of received samples, d_n is the vector of detected samples, f_n is the FFF coefficient vector, b_n is the FBF coefficient vector, e_n is the error, µ is the step size and Q is the slicer (decision) function. The critical path of the DFE, shown by the dashed line, consists of multipliers, two adders and a slicer. To reduce the computational complexity, the two adders are replaced by a carry-save full adder and a two-input adder. Using the TSMC 90 nm CMOS standard, the computational delays of the multiplier, adder and slicer are found to be 1.54 ns, 0.74 ns and 0.05 ns respectively. In this case the critical path is estimated as 1.20 ns, which does not meet the throughput requirement of 5G. To overcome this limitation, the relaxed look-ahead DFE is used. It is similar to the conventional DFE but has an extra register in the decision feedback loop, which speeds up the architecture by a factor of 2. The architectural diagram of the relaxed look-ahead DFE is shown below.

Fig.3. Architectural diagram of the relaxed look-ahead DFE
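
The filtering and decision steps of equations (1a) to (1c) can be sketched in software as below. This is a minimal sketch under stated assumptions: BPSK symbols (so the slicer Q is a sign decision), fixed coefficients (the adaptation steps (1d) and (1e) are omitted), and a hypothetical function name dfe. It is not a model of the hardware architecture.

```python
import numpy as np

def dfe(received, f, b):
    """Non-adaptive DFE: FFF coefficients f, FBF coefficients b, BPSK slicer."""
    f = np.asarray(f, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros(len(f))                        # received-sample vector x_n
    d = np.zeros(len(b))                        # past-decision vector d_n
    decisions = []
    for r in received:
        x = np.concatenate(([r], x[:-1]))       # shift in the newest sample
        y = f @ x - b @ d                       # (1a): FFF output minus FBF output
        d_hat = 1.0 if y >= 0 else -1.0         # (1c): slicer decision
        decisions.append(d_hat)
        d = np.concatenate(([d_hat], d[:-1]))   # shift in the newest decision
    return np.array(decisions)
```

As a usage example, for an assumed noise-free channel with impulse response [1.0, 0.5, 0.2], calling dfe(r, [1.0], [0.5, 0.2]) on the channel output r cancels the two post-cursor terms and returns the transmitted symbols. The loop also shows why the feedback path is speed-critical: each decision must be available before the next value of y can be formed.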

HIGH SPEED DFE SCHEMES

To reduce the computational complexity, the feedback and feedforward filter coefficients are stored in separate LUTs and updated rapidly. This is done in two stages, hence the name two-stage precomputation DFE. In the first stage only a few FBF coefficients are precomputed and saved in a separate LUT1, which reduces the multiplier complexity. In the partially precomputed DFE (PP-DFE), the number of remaining multipliers is still high and they occupy more area. This is solved by precomputing the remaining multiplier coefficients and storing them in another LUT2 at stage 2. Thus both hardware and computational complexity are reduced.
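
The precomputation idea can be sketched as follows: every possible FBF output is computed once and then selected by the past decisions. This is a minimal sketch assuming BPSK decisions (+1 or -1) and hypothetical function names; the paper's LUT organization for QAM is not reproduced here.

```python
def build_fbf_lut(b):
    """Precompute the FBF output for every combination of past decisions.
    Address bit k is 1 when the k-th past decision is +1, and 0 when it is -1."""
    n = len(b)
    return [sum(bk if (a >> k) & 1 else -bk for k, bk in enumerate(b))
            for a in range(1 << n)]

def fbf_output(lut, past_decisions):
    """Replace the multiply-accumulate of the FBF with a single table read."""
    addr = 0
    for k, dk in enumerate(past_decisions):
        if dk > 0:
            addr |= 1 << k
    return lut[addr]
```

A two-stage arrangement in the spirit of LUT1 and LUT2 can be imitated by building one table from the first few coefficients and another from the rest, then adding the two table outputs. Each table is then much smaller than a single table over all N coefficients.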

PROPOSED METHOD

In this method, LUT1 and LUT2 are first transformed, and the proposed architecture is then rescaled and unfolded to achieve higher throughput. The content of LUT2 is reduced to half its size by rescaling, with a sign reversal operation supplying the remaining content. The contents of LUT1, however, disturb the symmetry. By adding the FFF output, a minor symmetry can be achieved, and LUT1 is then transformed in the same way as LUT2. Thus the LUTs of stages 1 and 2 are both reduced to half their size. Combining these LUTs with the DFE gives the proposed architecture. The iteration bound of the proposed architecture decreases for larger filter orders, which reduces the hardware complexity. Generally it is difficult to store the FFF output in the LUT, since the LUT has to be updated dynamically and the output is large, which consumes more power and time. By adding the FFF output outside the LUT, the proposed method overcomes these limitations.

Fig.4. Proposed Architecture
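
The halving of a LUT by sign reversal can be illustrated with a minimal sketch. It assumes antipodal (BPSK) decisions, for which inverting every address bit negates the stored value, so only the entries whose top address bit is 0 need to be stored. The function names are hypothetical, and the sketch shows the symmetry argument only, not the exact transformation used in the proposed architecture.

```python
def build_half_lut(b):
    """Store only the entries whose top address bit is 0: 2**(N-1) instead of 2**N."""
    n = len(b)
    return [sum(bk if (a >> k) & 1 else -bk for k, bk in enumerate(b))
            for a in range(1 << (n - 1))]

def half_lut_read(lut, addr, n):
    """Read a virtual 2**n-entry table from the half-size table."""
    mask = (1 << n) - 1
    if (addr >> (n - 1)) & 1:
        # Top bit set: invert all address bits (XOR with all ones in hardware)
        # and reverse the sign of the stored value.
        return -lut[~addr & mask]
    return lut[addr]
```

The address inversion corresponds to a row of XOR gates and the sign reversal to a conditional negation, so the saving in storage is paid for with a small amount of extra logic.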

RESULTS AND OUTPUT

Some effort was made in the proposed unfolded architecture to reduce latency and increase fan-out in both stages. The architecture uses special TSMC 90 nm CMOS inverted 2×1 multiplexers, whose latency is half that of normal multiplexers. Adding these multiplexers does not change the functionality of the architecture. Since inverted multiplexers are used, their inputs must be supplied the other way round. In general, a set of buffers and inverters is used to increase fan-out. Since our architecture has a large number of XOR gates to perform the inversion, the fan-out tends to be higher here. These methods reduce the latency in stage 1 only. To reduce the latency in stage 2, we use the retiming technique, in which the registers are repositioned so that the same operation is performed with reduced latency at each unfolded level.

HARDWARE COMPLEXITY

Compared with the existing method, the hardware complexity of the proposed ADFE design is reduced by half, irrespective of the QAM constellation. Although the number of inverters and buffers increases for higher orders (for the transformation of the LUT1 and LUT2 contents), the hardware complexity is reduced by a factor of 2. The following diagram compares the hardware complexity.

Fig.5. Hardware complexity comparison graph

The above graph compares the hardware complexity and LUT size, which are reduced several-fold for larger feedback filters. In a 5G system the volume of computation is large, which in turn requires more hardware. Our design is more complex for small feedback filters but less complex for higher-order feedback filters, which makes it efficient for use in 5G systems.

TIME COMPLEXITY

Throughput is defined as the ratio of the clock rate to the processing time per sample, measured in clock cycles. In other words, it is the maximum output produced within a given period of time. In a DFE the maximum clock rate is generally set by the iteration bound, whereas the time required to process the incoming samples depends on the orders of the feedback and feedforward filters (N and L). In the existing system the processing time additionally depends on the update time of the LUT contents and the number of clock cycles in the inner loops. In the proposed system the number of clock cycles required to update the LUT contents is minimized. The existing system has a longer computation time, which is reduced in the proposed system, thereby increasing the throughput. The following graph shows the reduction in time complexity.

Fig.6. Throughput vs. feedback filter order

CONCLUSION

The proposed system thus offers more benefits than the existing system, which makes the architecture more effective for 5G communication. It reduces the hardware complexity and computational complexity by a factor of 2 or more for higher-order systems, without any trade-off. A throughput of up to 2.5 Gbps is also achieved for 16-QAM. This makes it highly significant for 5G communication, where the volume of computation will be higher. We are currently working on the BER and convergence performance of the proposed architecture, so that it is efficient without any trade-off in the performance of the design. Our further research will aim to reduce the power consumption of the architecture.