ABSTRACT


In communication systems, the decision feedback equalizer (DFE) is used to
suppress noise and inter-symbol interference (ISI). In high-speed digital
communication, the complexity of a time-domain DFE increases exponentially with
the feedback filter order. In this paper we propose a way to reduce the
complexity and increase the throughput of time-domain equalization for 5G. This
is achieved by speeding up the conventional architecture in advance (pre
speed-up). With this method the number of feedback filter coefficients is
reduced, which decreases the hardware complexity. The convergence and the
steady-state error performance are also improved without any additional
hardware. The method stores the past values of the decision maker in a separate
look-up table and updates it from time to time using a parallel error
multiplexer. The resulting architecture is efficient, with lower power, area
and logic utilization.

Keywords: Decision feedback equalizer, 5G, Parallel error multiplexer,
Inter-symbol interference.



Wireless communication is the fastest growing segment of communication
technology. Its growth over the past two decades has been rapid and has
captured wide attention; the pattern of growth has been nothing short of
monumental. These advances have led to operational data rates of up to gigabits
per second, which in turn led to the development of new standards such as IEEE
802.11ac and to 5G. The estimated data rates for 5G are 1-10 Gbps, which can be
achieved with multicarrier modulation techniques. Despite these high data
rates, such systems are vulnerable to inter-symbol interference (ISI) at high
frequency and are sensitive to inter-carrier interference (ICI) caused by
frequency misalignment. Therefore, to provide robustness, QAM is coupled with
OFDM.



OFDM and OFDMA are the modulation and multiple-access techniques used in the
LTE fourth-generation cellular network standards. Both techniques succeeded the
CDMA used in 3G for many reasons. The merits of OFDM and OFDMA are:

1. Ease of implementation of the FFT and IFFT in the transmitter and receiver.

2. Ability to overcome multipath distortion.

3. Subcarrier orthogonality to eliminate ICI.

4. Possibility to adapt the modulation cardinality and transmitted power.

5. Integration of multi-antenna hardware at both the receiver and transmitter.

Although it has many positive outcomes, this approach is limited by certain
factors, such as its large side lobes, which require null guard tones at the
spectrum edges, and the need for a cyclic prefix (CP). This led to the
development of a new generation with a coupled OFDM and QAM architecture.


Block diagram of the OFDM-QAM is shown below. It includes TEQ block, FFT, DAC,
FEQ and QAM mapper.

OFDM-QAM Architecture with TEQ


The QAM mapper at the input converts the incoming data bits into QAM symbols.
Each QAM symbol comprises two components, namely in-phase (I) and quadrature
(Q). The I and Q values of a group of data bits define the amplitudes of the
pulses given to the pulse shaping filter (S-P). The translation of bits to
symbols and vice versa is represented by a symbol map diagram.
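
The bit-to-symbol translation described above can be illustrated with a small sketch. It assumes a 16-QAM constellation with a Gray-coded level map; the table `GRAY_LEVELS` and the function names are hypothetical and are not taken from the paper.

```python
# Gray-coded 2-bit -> amplitude level map. This table is an illustrative
# assumption, not a mapping specified in the paper.
GRAY_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def qam16_map(bits):
    """Map a bit sequence (length a multiple of 4) to (I, Q) pairs."""
    if len(bits) % 4 != 0:
        raise ValueError("16-QAM needs a multiple of 4 bits")
    symbols = []
    for k in range(0, len(bits), 4):
        i_level = GRAY_LEVELS[(bits[k], bits[k + 1])]      # first two bits -> I
        q_level = GRAY_LEVELS[(bits[k + 2], bits[k + 3])]  # last two bits -> Q
        symbols.append((i_level, q_level))
    return symbols


def qam16_demap(symbols):
    """Inverse translation: (I, Q) pairs back to the original bits."""
    inverse = {level: pair for pair, level in GRAY_LEVELS.items()}
    bits = []
    for i_level, q_level in symbols:
        bits.extend(inverse[i_level])
        bits.extend(inverse[q_level])
    return bits
```

For example, `qam16_map([0, 0, 1, 0])` yields one symbol with I = -3 and Q = 3, and `qam16_demap` recovers the four bits from it.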

The IFFT block returns the normalized, discrete, univariate inverse fast
Fourier transform of the values given by the pulse shaping filter together with
the pulse symbols. The AddCP block returns the symbols with a cyclic prefix of
the given length added. The result is then fed to the digital-to-analog
converter (DAC) block. The obtained signal is modulated to the RF range,
between 20 kHz and 300 GHz, and transmitted over the air.
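
A minimal sketch of the IFFT and AddCP steps follows. It assumes a 1/N normalization and uses a naive inverse DFT for clarity (an IFFT computes the same values faster); `idft` and `add_cyclic_prefix` are hypothetical names chosen for illustration.

```python
import cmath


def idft(freq_symbols):
    """Naive normalized inverse DFT, O(N^2). The 1/N scaling is an assumption."""
    n = len(freq_symbols)
    return [
        sum(freq_symbols[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def add_cyclic_prefix(time_samples, cp_len):
    """Copy the last cp_len samples of the block to its front."""
    if not 0 <= cp_len <= len(time_samples):
        raise ValueError("cp_len must lie between 0 and the block length")
    # Slice from an explicit index so that cp_len == 0 adds nothing.
    return time_samples[len(time_samples) - cp_len:] + time_samples
```

A block of N samples with a prefix of length `cp_len` is transmitted as N + `cp_len` samples; the receiver discards the prefix before the FFT.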

At the receiver end, the received RF signals are passed through the
analog-to-digital converter (ADC) block to recover the pulses, which are then
processed by time domain equalization (TEQ). TEQ is now widely used due to its
simplicity and ease of implementation. Zero forcing (ZF) is one of the
conventional TEQ techniques; it estimates the inverse of the channel transfer
function. Because inverting the channel also amplifies noise, the ZF technique
is replaced by maximum likelihood sequence detection (MLSD) and decision
feedback equalization (DFE) techniques.

MLSD is used for short-span ISI, since its complexity increases exponentially
with the channel memory; in practice, however, wireless channels are not
limited to short-span ISI, so MLSD is not used. In contrast, DFE provides
better performance and is robust against ISI, but its feedback loop places an
upper bound on its maximum speed. Many high-speed architectures have been
proposed to overcome the speed limitation of the DFE at the cost of more
hardware. The main idea is to reformulate the original architecture into arrays
of adders, slicers and multiplexers driven by past decisions.

Later, FBF designs based on look-up tables (LUTs) were presented for low
hardware complexity. This is achieved by precomputing the FBF coefficients into
the look-up table, with the past decisions used as address lines. The filter
coefficients must be updated frequently as the channel changes. This case is
the adaptive DFE, which is more complex than the non-adaptive DFE because of
the limited time available for updating. To reduce this computational
complexity the relaxed look-ahead technique is used, but the architectural
complexity increases due to the additional multipliers.

The precomputation technique used here is distributed arithmetic (DA), which
transforms the multiply-and-accumulate (MAC) units into look-up tables (LUTs).
In this method the filter coefficients and inputs are stored in separate
look-up tables, thus increasing the performance efficiency of the filter. In
every iteration the coefficient LUT is updated, which consumes time and power.
To overcome this limitation the offset binary coding (OBC) scheme is used. On
the whole, the computational complexity is reduced in exchange for circuit
complexity. To overcome this limitation we propose a low-complexity,
high-efficiency architecture for both non-adaptive and adaptive DFEs using the
concept of pre speed-up.
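
To make the MAC-to-LUT transformation concrete, here is a minimal sketch of textbook distributed arithmetic for a fixed-coefficient inner product. It is not the paper's architecture: it assumes integer coefficients, two's-complement inputs of `n_bits` bits and a single LUT without OBC. The function names are hypothetical.

```python
def build_da_lut(coeffs):
    """LUT[a] = sum of coeffs[k] for every bit k that is set in address a."""
    lut = []
    for addr in range(1 << len(coeffs)):
        lut.append(sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1))
    return lut


def da_inner_product(lut, inputs, n_bits):
    """Compute sum(coeffs[k] * inputs[k]) using table reads and shifts only.

    inputs are signed integers representable in n_bits two's-complement bits.
    """
    acc = 0
    for j in range(n_bits):
        # Build the address from bit j of every input, one address bit per tap.
        addr = 0
        for k, x in enumerate(inputs):
            addr |= ((x >> j) & 1) << k
        term = lut[addr] << j  # shift-accumulate replaces the multipliers
        # The most significant bit of a two's-complement word has negative weight.
        acc += -term if j == n_bits - 1 else term
    return acc
```

With coefficients `[3, -2, 5, 1]` and 4-bit inputs `[7, -8, -1, 4]`, the routine returns 36, the same value as the direct sum of products. The table has 2^K entries for K taps, which is the growth that motivates splitting or halving the LUT.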

The existing design requires the feedforward filter output for precomputation
of the feedback filter, which is not necessary in the proposed system.


In communication, equalizers are used to mitigate ISI and to recover the
original transmitted symbols. A linear equalizer is placed in series with the
channel to produce an estimate of the inverse channel transfer function. It
consists of real and complex FIR filters to handle the real and complex
received data. The coefficients of these filters are updated using the least
mean square (LMS) algorithm. However, the performance of the linear equalizer
is poor on strongly distorted channels, where noise is amplified along with the
symbols at higher frequencies, resulting in inter-symbol interference (ISI). To
overcome these limitations DFEs are used. A DFE uses feedback of the detected
symbols to produce the estimated channel output: it is fed with detected
symbols and produces an estimated output, which is subtracted from the output
of the linear equalizer.


Fig.2. Reformulated DFE for 2nd order FBF

Fig. 2 shows the architecture of the conventional adaptive DFE. It consists of
two FIR filters, namely the feedforward filter (FFF) and the feedback filter
(FBF); L and N are their respective orders. The operation of the DFE can be
explained with the following set of equations:


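$$\tilde{y}_n = f_n^T x_n + b_n^T d_n$$

$$e_n = \hat{d}_n - \tilde{y}_n$$

$$\hat{d}_n = Q[\tilde{y}_n]$$

$$f_{n+1} = f_n + \mu e_n x_n$$

$$b_{n+1} = b_n + \mu e_n d_n$$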




where n is the current time instance, x_n is the vector of received samples,
d_n is the vector of detected samples, f_n is the FFF coefficient vector and
b_n is the FBF coefficient vector; e_n is the error, µ is the LMS step size and
Q denotes the slicer. The critical path of the DFE, shown by the dashed line,
consists of multipliers, two adders and a slicer. To reduce the computational
complexity, the two adders are replaced by a carry-save full adder and a
two-input adder. Using the TSMC 90 nm CMOS standard, the computational delays
of the multiplier, adder and slicer are found to be 1.54 ns, 0.74 ns and
0.05 ns respectively. In this case the critical path is estimated as 1.20 ns,
which falls short of the throughput requirement of 5G. To overcome this
limitation the relaxed look-ahead DFE is used. It is similar to the
conventional DFE but has an extra register in the decision feedback loop, which
speeds up the architecture by a factor of 2. The architectural diagram of the
relaxed look-ahead DFE is shown below.

Architectural diagram with relaxed look ahead DFE
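
The conventional adaptive DFE recursion described earlier (filtering, slicing, error computation and LMS coefficient updates) can be sketched in software as follows. This is a behavioural sketch under stated assumptions, not the hardware architecture: it assumes real-valued samples, a binary (+1/-1) slicer, decision-directed adaptation, and the sign convention in which the feedback taps are added to the feedforward output (so a negative tap subtracts the estimated ISI). All names are hypothetical.

```python
def slicer(y):
    """Q: nearest symbol of a binary +1/-1 alphabet (assumed for illustration)."""
    return 1.0 if y >= 0 else -1.0


def adaptive_dfe(received, ff_order, fb_order, mu):
    """Decision-directed LMS DFE; returns (decisions, f, b)."""
    f = [0.0] * ff_order
    f[0] = 1.0              # start from a pass-through feedforward filter
    b = [0.0] * fb_order
    x = [0.0] * ff_order    # x_n: most recent received samples
    d = [0.0] * fb_order    # d_n: most recent decisions
    decisions = []
    for sample in received:
        x = ([sample] + x)[:ff_order]
        # Equalizer output: feedforward part plus feedback part.
        y = (sum(fi * xi for fi, xi in zip(f, x))
             + sum(bi * di for bi, di in zip(b, d)))
        d_hat = slicer(y)
        e = d_hat - y
        # LMS updates use the vectors that produced y.
        f = [fi + mu * e * xi for fi, xi in zip(f, x)]
        b = [bi + mu * e * di for bi, di in zip(b, d)]
        d = ([d_hat] + d)[:fb_order]
        decisions.append(d_hat)
    return decisions, f, b
```

On a noise-free channel in which each sample carries 0.4 times the previous symbol as ISI, a one-tap feedback filter in this sketch settles near -0.4, cancelling the interference. The multiply, add and slice operations inside the loop correspond to the critical path discussed above.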



To reduce the computational complexity, the feedback and feedforward filter
coefficients are stored in separate LUTs and rapidly updated. This process is
done in two stages, hence it is called two-stage precomputation DFE. In the
first stage only a few FBF coefficients are precomputed and saved in a separate
table, LUT1, which reduces the multiplier complexity. In this partially
precomputed DFE (PP-DFE), the number of remaining multipliers is still high and
they occupy more area. This can be solved by precomputing and storing the
remaining multiplier coefficients in another table, LUT2, at stage 2. Thus the
hardware and computational complexities are reduced.
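
The two-stage idea can be sketched as two small tables whose outputs are summed. This is an illustrative sketch only: it assumes binary (+1/-1) decisions and a static split of the feedback taps, and the function names and the `split` parameter are hypothetical.

```python
from itertools import product


def build_fbf_lut(fb_coeffs, alphabet=(-1, 1)):
    """Tabulate sum(b_k * d_k) for every possible vector of past decisions."""
    lut = {}
    for past in product(alphabet, repeat=len(fb_coeffs)):
        lut[past] = sum(b * d for b, d in zip(fb_coeffs, past))
    return lut


def build_two_stage_luts(fb_coeffs, split):
    """LUT1 covers the first `split` taps, LUT2 covers the remaining taps."""
    return build_fbf_lut(fb_coeffs[:split]), build_fbf_lut(fb_coeffs[split:])


def two_stage_fbf_output(lut1, lut2, past_decisions, split):
    """Two table reads and one addition replace all feedback multiplications."""
    first = tuple(past_decisions[:split])
    rest = tuple(past_decisions[split:])
    return lut1[first] + lut2[rest]
```

For four binary taps split two and two, the tables hold 4 + 4 entries instead of the 16 entries of a single table, at the cost of one extra adder.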


In this method, LUT1 and LUT2 are first transformed, and then the proposed
architecture is rescaled and unfolded to achieve higher throughput. The content
of LUT2 is reduced to half its size by rescaling; the remaining content is
obtained with a sign reversal operation. The contents of LUT1, however, disturb
the symmetry. By adding the FFF output, minor symmetry can be achieved, and
LUT1 is then transformed in the same way as LUT2. Thus the LUTs of stages 1 and
2 are both reduced to half. Combining these LUTs with the DFE gives our
proposed architecture. The iteration bound of the proposed architecture
decreases for larger filter orders, which reduces the hardware complexity.
Generally it is difficult to store the output of the FFF in the LUT, since the
LUT has to be dynamically updated and the output is larger, which consumes more
power and time. By adding the FFF output outside the LUT in our proposed
method, these limitations are overcome.
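
The halving by sign reversal can be illustrated with a short sketch. It assumes binary (+1/-1) decisions, for which the table is odd-symmetric (negating every past decision negates the stored sum), so only half of the entries need to be stored. It is a hedged illustration of that symmetry, not the paper's exact transformation; all names are hypothetical.

```python
from itertools import product


def build_half_lut(fb_coeffs):
    """Store only the entries whose first past decision is +1."""
    half = {}
    for rest in product((-1, 1), repeat=len(fb_coeffs) - 1):
        past = (1,) + rest
        half[past] = sum(b * d for b, d in zip(fb_coeffs, past))
    return half


def half_lut_lookup(half, past_decisions):
    """Read a stored entry, or its mirror entry with the sign reversed."""
    past = tuple(past_decisions)
    if past[0] == 1:
        return half[past]
    # Mirror entry: flip every sign in the address, then flip the output.
    mirrored = tuple(-d for d in past)
    return -half[mirrored]


def equalizer_output(ff_output, half, past_decisions):
    """The FFF output is added outside the table, so the LUT stays symmetric."""
    return ff_output + half_lut_lookup(half, past_decisions)
```

Keeping the FFF output out of the table is what preserves the odd symmetry in this sketch: a table that already included a data-dependent offset would no longer satisfy the sign-reversal property.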

            Fig.4. Proposed



In order to reduce latency and increase the fan-out in both stages, some
refinements are made in the proposed unfolded architecture. In this
architecture special TSMC 90 nm CMOS inverted 2×1 multiplexers are used. The
latency of these inverted 2×1 multiplexers is half that of normal multiplexers.
Adding these multiplexers does not change the functionality of our
architecture; since the multiplexers are inverted, their inputs must be given
the other way round. In general, a set of buffers and inverters is used to
increase the fan-out. Since our architecture has a large number of XOR gates to
perform the inverse operation, the fan-out tends to be higher here. These
methods reduce the latency in stage 1 alone. To reduce latency in stage 2 we
use the retiming technique, in which the registers are retimed to perform the
same operation with reduced latency at each unfolded level.


Comparing the existing and proposed methods, the hardware complexity of the
proposed ADFE design is half that of the existing design, irrespective of the
QAM constellation. Although the number of inverters and buffers increases for
higher orders (for the transformation of the LUT1 and LUT2 contents), the
hardware complexity is still reduced by a factor of 2. The following diagram
shows the comparison of hardware complexity.

          Fig.5. Hardware complexity comparison graph

The above graph compares the hardware complexity and LUT size; for larger
feedback filters the reduction is several-fold. In a 5G system the volume of
computation will be large, which in turn requires more hardware. Our design is
more complex for smaller feedback filters but less complex for higher-order
feedback filters, which makes it efficient for use in 5G systems.


Throughput is defined as the ratio of the clock rate to the processing time per
sample. In other words, it is the maximum output produced within a given period
of time. Generally, in a DFE the maximum clock rate is set by the iteration
bound, whereas the time required to process the incoming samples depends on the
orders of the feedback and feedforward filters (N and L). In the existing
system the processing time additionally depends on the update time of the LUT
contents and the number of clock cycles in the inner loops. In our proposed
system the number of clock cycles required to update the LUT contents is
minimized. The existing system has a longer computation time, which is reduced
in the proposed system, thereby increasing the throughput. The following graph
shows the reduction in time complexity.

Fig.6. Throughput vs. feedback filter order graph


Thus our proposed system has more benefits than the existing system, which
makes our architecture more effective for 5G communication. It reduces the
hardware complexity and computational complexity by a factor of 2, and more for
higher-order systems, without any trade-off. A throughput of up to 2.5 Gbps is
also achieved for 16-QAM. This makes it significant for 5G communication, where
the volume of computation will be higher. We are working on the BER and
convergence performance of the proposed architecture, so that it is efficient
without any trade-off in the performance of the proposed design.
Our further research and studies are to reduce the power consumption of the