Peak-to-Average-Power-Ratio (PAPR) reduction in WiMAX and OFDM/A systems
 Seyran Khademi^{1},
 Thomas Svantesson^{2},
 Mats Viberg^{1} and
 Thomas Eriksson^{1}
DOI: 10.1186/1687-6180-2011-38
© Khademi et al; licensee Springer. 2011
Received: 12 November 2010
Accepted: 9 August 2011
Published: 9 August 2011
Abstract
A peak-to-average power ratio (PAPR) reduction method is proposed that exploits the precoding or beamforming mode in WiMAX. The method is applicable to any OFDM/A system that implements beamforming using dedicated pilots, i.e., one that uses the same beamforming antenna weights for both pilots and data. Beamforming performance depends on the relative phase shift between antennas, but is unaffected by a phase shift common to all antennas. PAPR, on the other hand, changes with a common phase shift, and this paper exploits that property. An effective optimization technique based on sequential quadratic programming is proposed to compute the common phase shift. The proposed technique has several advantages compared with traditional PAPR reduction techniques: it does not require any side information and has no effect on power or bit-error rate, while providing better PAPR reduction performance than most other methods.
Keywords
WiMAX, OFDM, PTS, PAPR reduction, phase optimization, sequential quadratic programming
1. Introduction
Many recent wideband digital communication systems use a multicarrier technology known as orthogonal frequency-division multiplexing (OFDM), where the band is divided into many narrowband channels. A key benefit of OFDM is that it can be implemented efficiently using the fast Fourier transform (FFT), and that the receiver structure becomes simple since each channel or subcarrier can be treated as narrowband instead of as a more complicated wideband channel. Orthogonal frequency-division multiple access (OFDMA) is a similar technique, but the bands can be occupied by different users.
Although OFDM and OFDMA have many benefits contributing to their popularity, a well-known drawback is that the amplitude of the resulting time-domain signal varies with the transmitted symbols in the frequency domain. From OFDM symbol to OFDM symbol, the maximum amplitude can vary dramatically depending on the transmitted symbols. If the maximum amplitude of the time-domain signal is large, it may push the amplifier into its nonlinear region, which creates many problems that reduce performance. For example, it breaks the orthogonality of the subcarriers, which results in a substantial increase in the error rate. A common practice to avoid this peak-to-average-power-ratio (PAPR) problem is to reduce the operating point of the amplifier with a back-off margin. This back-off margin is selected so that most occurrences of high peaks do not fall in the nonlinear region of the amplifier. Of course, it is desirable to have a minimum back-off margin, since operating the amplifier below full power reduces the range of the system as well as the efficiency of the amplifier.
PAPR reduction is a well-known signal processing topic in multicarrier transmission, and a large number of techniques have been proposed in the literature during the past decades. These techniques include amplitude clipping and filtering, coding [1], tone reservation (TR) [2, 3] and tone injection (TI) [2], active constellation extension (ACE) [4, 5], and multiple signal representation methods, such as partial transmit sequence (PTS), selected mapping (SLM), and interleaving [6]. The existing approaches differ in terms of the requirements and restrictions they impose on the system. Therefore, careful attention must be paid to choosing a proper technique for each specific communication system.
WiMAX mobile devices (MS) are commercially available, and for the system to work, both mobile devices and base stations need to adhere to the WiMAX standard. Hence, it is not possible to modify the base station transmission technique if doing so makes the transmission non-compliant with the standard, since existing MS would not be able to decode the transmissions correctly. For example, phase manipulation techniques such as PTS and SLM [7–9], which require coded side information to be transmitted, would not be compatible with or compliant to the standard. One technique of inserting a PAPR-reducing sequence is part of the IEEE 802.16e standard. It is activated using the PAPR reduction/sounding zone/safety zone allocation IE. Using this technique reduces the throughput since it requires sending additional PAPR bits. It is also not a part of the WiMAX profile, so it is likely not supported by the majority of handsets.
Accordingly, each of the discussed techniques is associated with a cost in terms of bandwidth and/or power. The technique proposed in this paper requires neither additional bandwidth nor additional power, while delivering equal or better PAPR reduction gain compared with other existing methods. The proposed algorithm makes use of the antenna beamforming weights and dedicated pilots at the transmitter [10]. It reduces the PAPR by modifying the cluster weights in the WiMAX data structure in a manner similar to the PTS method [7, 8]. The main benefits of the proposed technique are:

It preserves the transmitted power by adjusting only the phase of the beamforming weights per cluster.

No extra side information regarding the phase change needs to be transmitted due to the property of dedicated pilots.

Not sending the phase coefficients allows for arbitrary phase shifts instead of a quantized set such as used for PTS.

A novel search algorithm based on gradient optimization is used to find the optimum phase shifts of the cluster weights.
The following presentation focuses on WiMAX, but the same technique applies to any OFDM/OFDMA system that uses a concept similar to dedicated pilots and does not explicitly announce the multiplied weights to the receiver.
The paper is organized as follows. In Sect. 2, the PAPR of an OFDM system is defined, and the data structure in the WiMAX profile and the relevant capabilities of the standard are explained. In Sect. 3, the proposed PAPR reduction method is described based on the PTS technique model, and the phase optimization problem is formulated. The optimization problem is written as a conventional minimax problem with inequality constraints in Sect. 4, and a sequential quadratic programming (SQP) technique is then proposed to solve the minimax optimization. This approach breaks the complex original problem into several convex quadratic subproblems with linear constraints. Pseudo code for a tailored SQP approach is given in Sect. 4C. The complexity evaluation in Sect. 5 reveals the advantage of the new optimization method compared with the exhaustive search approach in PTS, and simulation results in Sect. 6 confirm the significant PAPR reduction gain of the SQP algorithm over other techniques. Finally, the paper is concluded in Sect. 7 with a summary and a brief discussion of further research.
2. System Model
Although not explicitly written in Equation (2), it is well known that oversampling is required to accurately capture the peaks. In this paper, an oversampling factor of four is used.
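As a rough illustration of why oversampling matters, the following sketch computes the PAPR of one random OFDM symbol at the Nyquist rate and with four-times oversampling. It assumes the usual definition of PAPR as peak power over mean power; the function names, the 64-subcarrier QPSK symbol, and the zero-padding style of oversampling are illustrative choices and are not taken from the paper.

```python
import cmath
import math
import random


def oversampled_idft(X, L=4):
    """Evaluate the N-carrier OFDM sum on a grid of L*N time samples.

    This is equivalent to zero-padding the middle of the spectrum
    before a length L*N inverse DFT (direct O(N^2) evaluation here).
    """
    N = len(X)
    out = []
    for n in range(L * N):
        acc = 0j
        for k, Xk in enumerate(X):
            # Map the upper half of the spectrum to negative frequencies.
            f = k if k < N // 2 else k - N
            acc += Xk * cmath.exp(2j * math.pi * f * n / (L * N))
        out.append(acc / math.sqrt(N))
    return out


def papr_db(x):
    """Peak power divided by mean power, in dB."""
    powers = [abs(v) ** 2 for v in x]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


random.seed(0)
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
X = [random.choice(qpsk) for _ in range(64)]  # hypothetical test symbol

papr_nyquist = papr_db(oversampled_idft(X, L=1))
papr_4x = papr_db(oversampled_idft(X, L=4))
print(f"Nyquist-rate PAPR: {papr_nyquist:.2f} dB, 4x oversampled: {papr_4x:.2f} dB")
```

Because the oversampled grid contains the Nyquist-rate samples and the mean power is unchanged, the oversampled estimate can only be equal to or larger than the Nyquist-rate one.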
To extract frequency diversity, the WiMAX protocol specifies that the clusters in a subchannel are spread out across the band, i.e., a distributed permutation. The WiMAX standard further specifies two main modes of transmitting pilots: common pilots and dedicated pilots. Dedicated pilots allow per-cluster beamforming since channel estimation is performed per cluster, whereas for common pilots channel estimation across the whole band is allowed. The presentation so far has ignored the practical detail of guard bands, which are inserted to reduce spectral leakage. In WiMAX, a number of subcarriers at the beginning and the end of the available bandwidth do not carry any signal, leaving N_{usable} subcarriers that carry data and pilots. Although this number depends on the bandwidth and transmission mode, weights that are constant across each cluster are simply applied to only the N_{usable} subcarriers.
3. Proposed Technique
where W_{ s } (k) denotes the beamforming weight on subcarrier k, i.e., W_{ s } (k) = e^{jϕ(k)}. Since the channel is estimated using the pilots in each cluster, the beamforming weights need to be constant over each cluster, but can change from cluster to cluster, i.e., W_{ s } (k_{0}) = W_{ s } (k_{0} + 1) = ... = W_{ s } (k_{0} + 13), where k_{0} denotes the first subcarrier in a particular cluster. In the following, we will focus on the scenario of a single transmission antenna since it simplifies the expressions. However, the method can easily be extended to scenarios with multiple transmit antennas, which is the normal mode of dedicated pilots and beamforming.
For the case of wideband weights, i.e., the beamforming weights are the same across the whole band, the PAPR reduction method is identical and performed only once. For the typical case of narrowband weights, a different beamforming weight per cluster is used, so the PAPR reduction method is applied jointly over the transmitted signals from all antennas. Furthermore, the technique is readily extendable to single-user and multiuser MIMO systems using the same concept of dedicated pilots. Although there are now multiple streams, the base station has to transmit pilots beamformed in the same way as the data. Hence, the same technique as outlined above can be applied. For a base station sending multiple streams to one or many receivers, the weight optimization has to be performed jointly over the streams, but otherwise the concept is the same.
Note that for a 10 MHz WiMAX system, there are 60 clusters, so there are 60 phase shifts W_{s}(k) = e^{jϕ(k)}, where ϕ(k) ∈ [0, 2π) and k = 1, 2, ..., 60.
where h' = h e^{jϕ} denotes the effective channel. The BER performance of the effective channel is identical to that of the original channel. Furthermore, since both pilots and data are transmitted with the same phase shift, the channel estimation performance is also identical. In the proposed technique, the dedicated pilots for channel estimation are used, without interfering with their original job, as an indicator that informs the receiver about the phase rotation at the transmitter. Thus, the known symbols at the allocated pilot subcarriers are phase rotated along with the data subcarriers. Note that pilot symbols already exist in the current design of WiMAX and other similar wireless standards, so no bandwidth is sacrificed for PAPR reduction. The receiver is informed implicitly, with the information hidden in the known pilot symbols. The channel coefficients are estimated for equalization based on the received pilots, and the PAPR phase rotation is interpreted as part of the channel.
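The absorption of the phase rotation into the channel estimate can be checked with a small noise-free sketch. The flat per-cluster channel h, the phase value, and the symbol values below are hypothetical; the only property taken from the text is that the pilot and the data in a cluster are rotated by the same weight.

```python
import cmath
import math

h = 0.8 * cmath.exp(1j * 0.3)   # hypothetical flat channel over one cluster
phi = 2.1                       # hypothetical PAPR-reducing phase for this cluster
w = cmath.exp(1j * phi)         # unit-modulus cluster weight

pilot = 1 + 0j                  # known pilot symbol
data = [(1 + 1j) / math.sqrt(2), (-1 + 1j) / math.sqrt(2)]

# Transmitter: pilot and data in the cluster are rotated by the same weight.
rx_pilot = h * w * pilot
rx_data = [h * w * d for d in data]

# Receiver: the pilot yields the effective channel h' = h * e^{j phi}.
h_eff = rx_pilot / pilot
equalized = [r / h_eff for r in rx_data]

print(abs(h_eff), abs(h))   # same magnitude, so the received power is unchanged
print(equalized)            # matches the transmitted data symbols
```

The receiver never needs to know phi separately: dividing by the pilot-based estimate removes the channel and the rotation in one step.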
Moreover, the proposed technique does not impact the transmitted power since it is only a phase modification. In essence, the technique is similar to partial transmit sequence (PTS), but without the drawback of requiring side information, which would make it impossible to apply in existing communication standards such as WiMAX. These advantages make it a very attractive technique for reducing PAPR.
The dedicated pilot feature is designed for beamforming, and the standard explicitly states that only the beamformed pilots inside the beamformed clusters can be used for channel estimation and equalization. The weights are different from cluster to cluster. Since only those pilots can be used, no other side information is needed: in the WiMAX case, the phase change is incorporated into the channel just as any other type of beamforming weight would be. Remember that there is no difference between our beamforming weights and normal beamforming weights from a channel estimation perspective. In both cases, there is no need for extra side information. Note that it is possible to design a system different from the WiMAX dedicated-pilot setting that could use more side information, but that is outside the scope of this paper, which focuses on WiMAX.
In conclusion, cluster weights can be used to decrease the PAPR of the OFDM symbol. To preserve the average transmitted power, only the phases of the clusters are changed. These phase weights can be applied either before or after the IFFT block, and the result will be the same due to the linearity of the IFFT operation. However, it is more efficient for the optimization algorithm to apply the phase coefficients after the IFFT block. This is exactly the same approach as in PTS, which is described next. However, there are still substantial differences regarding the phase selection, subblock partitioning, etc.
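The equivalence of weighting before and after the IFFT can be sketched numerically. The example below uses a toy size of 16 subcarriers in 4 contiguous clusters (WiMAX clusters are larger and distributed across the band); the names and sizes are illustrative assumptions.

```python
import cmath
import math
import random


def idft(X):
    """Direct inverse DFT with 1/sqrt(N) scaling."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / math.sqrt(N)
            for n in range(N)]


random.seed(1)
N, M = 16, 4                       # toy sizes: subcarriers and clusters
size = N // M
X = [complex(random.choice([-1, 1]), random.choice([-1, 1])) for _ in range(N)]
phases = [random.uniform(0, 2 * math.pi) for _ in range(M)]

# Option 1: rotate each cluster in the frequency domain, then one IDFT.
before = idft([X[k] * cmath.exp(1j * phases[k // size]) for k in range(N)])

# Option 2: one IDFT per cluster (partial sequences), then rotate and sum.
partials = [idft([X[k] if k // size == m else 0 for k in range(N)]) for m in range(M)]
after = [sum(cmath.exp(1j * phases[m]) * partials[m][n] for m in range(M))
         for n in range(N)]

max_diff = max(abs(a - b) for a, b in zip(before, after))
print(max_diff)   # only floating-point rounding error remains
```

Option 2 is what makes an iterative search practical: the partial sequences are computed once, and each new candidate phase vector only costs a weighted sum in the time domain.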
A. Partial Transmit Sequence (PTS)
The objective is to find a set of phase factors that minimizes the PAPR. In general, the selection of the phase factors is limited to a set with a finite number of elements to reduce the search complexity. The set of possible phase factors contains K elements, where K is the number of allowed phases. The first phase weight is set to 1 without any loss of performance, so the search for the best choice is performed over the (M − 1) remaining places. The complexity increases exponentially with the number of subblocks M, since K^{M−1} possible phase vectors are searched to find the optimum set of phases. Also, PTS needs M IDFT operations for each data block, and log_{2}(K^{M−1}) bits of side information must be sent to the receiver. The amount of PAPR reduction depends on the number of subblocks and the number of allowed phase factors [9].
For each subblock that is rotated at the transmitter, the applied phase coefficient is sent to the receiver using a code book as explicit side information, which reduces the spectral efficiency. The receiver, in turn, uses the same code book to retrieve the phase applied at the transmitter from the side information bits. The code book therefore needs to be agreed upon between transmitter and receiver at the system design phase.
PTS performs an exhaustive search over all combinations of phase factors to find the optimum weights. For example, with two allowed phase factors, all permutations of ±1 are tried; in this case, the whole search space for 60 clusters consists of 2^{60} alternative vectors, which requires a tremendous amount of computation. Here, we propose a realistic optimization algorithm based on the basic configuration of the PTS subblocks.
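To make the growth of the search space concrete, the sketch below runs an exhaustive PTS search on a deliberately tiny example (16 subcarriers, M = 4 subblocks, K = 2). It assumes the allowed phase set is the K equally spaced points on the unit circle, which gives ±1 for K = 2; the function names and toy sizes are illustrative.

```python
import cmath
import math
import random
from itertools import product


def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / math.sqrt(N)
            for n in range(N)]


def papr(x):
    powers = [abs(v) ** 2 for v in x]
    return max(powers) / (sum(powers) / len(powers))


def pts_exhaustive(partials, K=2):
    """Try every phase vector with the first weight fixed to 1."""
    M, N = len(partials), len(partials[0])
    allowed = [cmath.exp(2j * math.pi * i / K) for i in range(K)]  # assumed phase set
    best_papr, best_weights, tried = float("inf"), None, 0
    for tail in product(allowed, repeat=M - 1):
        weights = (1 + 0j,) + tail
        s = [sum(weights[m] * partials[m][n] for m in range(M)) for n in range(N)]
        tried += 1
        p = papr(s)
        if p < best_papr:
            best_papr, best_weights = p, weights
    return best_papr, best_weights, tried


random.seed(2)
N, M, K = 16, 4, 2
X = [complex(random.choice([-1, 1]), random.choice([-1, 1])) for _ in range(N)]
partials = [idft([X[k] if k // (N // M) == m else 0 for k in range(N)]) for m in range(M)]

original_papr = papr(idft(X))
best_papr, best_weights, tried = pts_exhaustive(partials, K)
side_info_bits = (M - 1) * math.log2(K)
print(tried, side_info_bits, original_papr, best_papr)
```

With M = 4 the loop visits only 8 candidates; every additional subblock multiplies that count by K, which is why the same loop is hopeless for 60 clusters.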
B. Formulation of the Phase Optimization Problem
The s(n) are complex values and the ϕ_{m} are continuous phases in [0, 2π). Substituting b_{n,m} = R_{n,m} + jI_{n,m} and e^{jϕ_{m}} = cos ϕ_{m} + j sin ϕ_{m} in Equation (9) and taking the squared magnitude of s(n) results in Equation (10), where R_{n,m} and I_{n,m} stand for ℜ{b_{n,m}} and ℑ{b_{n,m}}, respectively. This is a very important equation: it gives the squared norm, i.e., the power, of the transmitted output samples, and it is the multivariable cost function to be minimized, since the largest |s(n)| determines the PAPR of the system. To emphasize its role as the objective function, |s(n)|^{2} is replaced with f_{n}(ϕ), as expressed in Equation (10).
The elements of the Jacobian matrix are expressed in Equation (11).
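As a sanity check on the structure of the cost function and its Jacobian, the sketch below assumes that each output sample has the PTS form s(n) = Σ_m b_{n,m} e^{jϕ_m}, so that f_n(ϕ) = |s(n)|^2 and ∂f_n/∂ϕ_m = −2 ℑ{s(n)* b_{n,m} e^{jϕ_m}}. This closed form is derived here from that assumption and is compared against central finite differences; the random b_{n,m} values are placeholders.

```python
import cmath
import random


def samples(b, phi):
    """s(n) = sum_m b[n][m] * exp(j*phi[m]) for every output sample n."""
    return [sum(b_nm * cmath.exp(1j * p) for b_nm, p in zip(row, phi)) for row in b]


def objective(b, phi):
    """f_n(phi) = |s(n)|^2 for every n."""
    return [abs(s) ** 2 for s in samples(b, phi)]


def jacobian(b, phi):
    """J[n][m] = d f_n / d phi_m = -2 * Im(conj(s(n)) * b[n][m] * exp(j*phi[m]))."""
    s = samples(b, phi)
    return [[-2 * (s_n.conjugate() * b_nm * cmath.exp(1j * p)).imag
             for b_nm, p in zip(row, phi)]
            for s_n, row in zip(s, b)]


random.seed(3)
N, M = 8, 3   # toy sizes: output samples and subblocks
b = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(M)] for _ in range(N)]
phi = [random.uniform(0, 6.28) for _ in range(M)]
J = jacobian(b, phi)

# Central finite differences as an independent check of the analytic Jacobian.
eps = 1e-6
max_err = 0.0
for m in range(M):
    up = [p + eps * (i == m) for i, p in enumerate(phi)]
    dn = [p - eps * (i == m) for i, p in enumerate(phi)]
    fd = [(fu - fl) / (2 * eps) for fu, fl in zip(objective(b, up), objective(b, dn))]
    max_err = max(max_err, max(abs(fd[n] - J[n][m]) for n in range(N)))
print(max_err)
```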
In agreement with this new setting, the objective function f(ϕ) is the maximum of the f_{n}(ϕ), or equivalently the largest IFFT sample power in the whole OFDM sequence, which characterizes the PAPR value. The remaining samples are appended as additional constraints of the form f_{n}(ϕ) ≤ f(ϕ). In effect, f(ϕ) is minimized over ϕ using SQP, and the additional constraints are included because we do not want other f_{n} to exceed the maximum value while it is being minimized. In this way, the whole OFDM sequence is kept below the value that is being minimized during the iterations.
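Written out, the minimax problem described above and its equivalent constrained form can be sketched as follows. The auxiliary variable t, which plays the role of the maximum being minimized, and the index range for n are notational assumptions made here for illustration.

```latex
\min_{\boldsymbol{\phi}} \; \max_{n} \; f_n(\boldsymbol{\phi})
\quad\Longleftrightarrow\quad
\begin{aligned}
  \min_{\boldsymbol{\phi},\, t} \quad & t \\
  \text{subject to} \quad & f_n(\boldsymbol{\phi}) - t \le 0
  \qquad \text{for every output sample } n .
\end{aligned}
```

The right-hand form has a linear objective and smooth inequality constraints, which is the shape that SQP methods handle directly.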
4. Solving the Optimization Problem
The proposed PAPR reduction technique has the unique feature of exploiting the dedicated pilots and the channel estimation procedure, but choosing the best phase coefficients is still a new challenge. In PTS, the optimum weights are selected by an exhaustive search over a quantized set of phase options, whereas here there is no restriction on the phase coefficients and they can be selected from the continuous interval [0, 2π). An efficient optimization algorithm is therefore needed to find the proper phases; the proposed algorithm is a gradient-based method that is modified and adapted for the phase optimization problem of the PAPR reduction technique.
A. Sequential Quadratic Programming
SQP is one of the most popular and robust algorithms for nonlinear constrained optimization. Here, it is modified and simplified for the phase optimization problem of PAPR reduction, but the basic configuration is the same as in general SQP. The algorithm proceeds by solving a sequence of subproblems, each of which minimizes a quadratic model of the objective subject to a linearization of the constraints. The SQP method has been applied successfully to many practical problems; see [12–14] for an overview. An efficient implementation with good performance on many sample problems is described in [15].
These equations are used to form the quasi-Newton updating step, which is an important step outlined below. The quasi-Newton steps are implemented by accumulating second-order information about the Kuhn-Tucker (KT) criteria and by checking for optimality during the iterations.
The SQP implementation consists of two loops: the phase solution is updated at each iteration of the major loop, with k as the counter, and the major loop itself contains an inner QP loop that solves for the optimum search direction d_{k}.
Major loop to find the ϕ that minimizes f(ϕ):
while k < maximum number of iterations do
  QP loop to determine d_{k} for the major loop:
  while optimal d_{k} not found do
    d_{l+1} = d_{l} + α d̂_{l}, where d̂_{l} is the QP search direction
  end while
  ϕ_{k+1} = ϕ_{k} + d_{k},
end while
The step length α is determined within the QP iterations, which are distinguished from the major iterations by using the index l as the counter.
The Hessian of the Lagrange function is required to form the quadratic objective function. Fortunately, it is not necessary to calculate this Hessian matrix explicitly, since it can be approximated at each major iteration using a quasi-Newton updating method, in which the Hessian matrix is estimated from the information provided by gradient evaluations. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) update is one of the most attractive quasi-Newton methods and is frequently used in nonlinear optimization. It approximates the second derivative of the objective function using Equation (17).
The Lagrange multipliers [according to Equation (16)] are nonzero and positive for the active-set constraints, and zero for the others. ∇f_{n}(ϕ_{k}) is the gradient of the nth constraint at the kth major iteration. The Hessian is kept positive definite at the solution point if q_{k}^{T} s_{k} is positive at each update, where s_{k} is the step and q_{k} the change in gradient used in the BFGS update. Here, we modify^{a} q_{k} on an element-by-element basis so that q_{k}^{T} s_{k} > 0, as proposed in [19].
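A minimal sketch of one BFGS update is given below, assuming that Equation (17) is the standard BFGS formula H_{k+1} = H_k + q q^T / (q^T s) − (H s)(H s)^T / (s^T H s). The helper names and the 2 × 2 numbers are hypothetical, and the element-by-element modification of q_k mentioned above is not implemented; the function simply refuses to update when q^T s is not positive.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(H, v):
    return [dot(row, v) for row in H]


def bfgs_update(H, s, q):
    """One standard BFGS update of the Hessian approximation H.

    s: step between successive iterates, q: change in gradient.
    Requires q^T s > 0 so that the result stays positive definite.
    """
    qs = dot(q, s)
    if qs <= 0:
        raise ValueError("q^T s must be positive to keep H positive definite")
    Hs = matvec(H, s)
    sHs = dot(s, Hs)
    n = len(s)
    return [[H[i][j] + q[i] * q[j] / qs - Hs[i] * Hs[j] / sHs for j in range(n)]
            for i in range(n)]


H0 = [[1.0, 0.0], [0.0, 1.0]]      # identity start, as in the pseudo code
s = [1.0, 0.5]                     # hypothetical step
q = [2.0, 1.5]                     # hypothetical gradient change
H1 = bfgs_update(H0, s, q)
print(H1)
print(matvec(H1, s))               # secant condition: equals q up to rounding
```

The updated matrix satisfies the secant condition H1 s = q, which is the property that lets gradient differences stand in for explicit second derivatives.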
We generally refer to the constraints of the QP subproblem as G(d) = A d − a, where ∇f_{n}(ϕ_{k})^{T} and −f_{n}(ϕ_{k}) are the nth row of the matrix A and the nth element of the vector a, respectively.
The quadratic objective function q(d) reflects the local properties of the original objective function, and the main reason to use a quadratic function is that such problems are easy to solve yet mimic the nonlinear behavior of the initial problem. The reasonable choice for the objective function is the local quadratic approximation of f(ϕ_{k}) at the current solution point, and the obvious option for the constraints is the linearization of the current constraints of the original problem around ϕ_{k}, which together form a convex optimization problem. In the next section, we explain the QP algorithm, which is solved iteratively by updating the initial solution. The notation used in the following section is summarized here for convenience.

d_{k} is the search direction in the major loop, while d̂_{l} is the search direction in the QP loop.

k is used as an iteration counter in the major loop and l is the counter in the QP loop.

ϕ_{ k }is the minimization variable in the major loop, it is the phase vector in this problem.

d_{ l }is the minimization variable in the QP problem.

f_{ n }(ϕ_{ k }) is the n_{ th }constraint of the original minimax problem at a solution point ϕ_{ k }.

G(d_{l}) = A d_{l} − a is the vector representing the constraints of the QP subproblem at a solution point d_{l}, and g_{n}(d_{l}) is the nth constraint.
B. Quadratic Programming
In a quadratic programming (QP) problem, a multivariable quadratic function is maximized or minimized subject to a set of linear constraints on the variables. Basically, the quadratic programming problem can be formulated as minimizing f(x) = 1/2 x^{T} C x + c^{T} x with respect to x, with linear constraints A x ≤ a, meaning that every element of the vector A x is less than or equal to the corresponding element of the vector a.
The quadratic program has a global minimizer if there exists some feasible vector x satisfying the constraints and f(x) is bounded below on the feasible region; this is true when the matrix C is positive definite. In that case, the quadratic objective function f(x) is strictly convex, so as long as the constraints are linear and the feasible region is nonempty, the problem has a unique global minimizer. If C is zero, the problem becomes a linear program [20].
A variety of methods are commonly used for solving a QP problem; the active set strategy has been applied in the phase optimization algorithm. We will see how this method is suitable for problems with a large number of constraints.
In general, the active set strategy involves an objective function to optimize and a set of constraints, defined here as g_{1}(d) ≤ 0, g_{2}(d) ≤ 0, ⋯, g_{n}(d) ≤ 0. The collection of all d satisfying these constraints defines the feasible region in which the optimal solution is sought. Given a point d in the feasible region, a constraint g_{n}(d) ≤ 0 is called active at d if g_{n}(d) = 0 and inactive at d if g_{n}(d) < 0^{b}. The active set at d is made up of those constraints g_{n}(d) that are active at the current solution point.
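The definition of active and inactive constraints can be illustrated with a short sketch for linear constraints of the form g(d) = A d − a ≤ 0. The function name, the tolerance, and the 3-constraint example are illustrative assumptions.

```python
def classify_constraints(A, a, d, tol=1e-9):
    """Split the constraints g_n(d) = A[n] . d - a[n] <= 0 into
    active (g_n = 0), inactive (g_n < 0) and violated (g_n > 0)."""
    active, inactive, violated = [], [], []
    for n, (row, a_n) in enumerate(zip(A, a)):
        g = sum(r * x for r, x in zip(row, d)) - a_n
        if g > tol:
            violated.append(n)
        elif abs(g) <= tol:
            active.append(n)
        else:
            inactive.append(n)
    return active, inactive, violated


# Hypothetical constraints: d0 <= 1, d1 <= 1, d0 + d1 <= 3.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a = [1.0, 1.0, 3.0]
active, inactive, violated = classify_constraints(A, a, d=[1.0, 0.5])
print(active, inactive, violated)
```

At d = (1, 0.5) only the first constraint holds with equality, so it alone forms the active set; a point with a nonempty violated list lies outside the feasible region.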
The active set specifies which constraints will particularly control the final result of the optimization, so they are very important in the optimization. For example, in quadratic programming as the solution is not necessarily on one of the edges of the bounding polygon, specification of the active set creates a subset of inequalities to search the solution within [21–23]. As a result, the complexity of the search is reduced effectively. That is why nonlinearly constrained problems can often be solved in fewer iterations than unconstrained problems using SQP, because of the limits on the feasible area.
In the phase optimization problem, the QP subproblem is solved to find the vector d_{k}, which is used to form a new ϕ vector in the kth major iteration, ϕ_{k+1} = ϕ_{k} + d_{k}. Since the matrix C of the general problem is replaced with a positive definite Hessian, as discussed earlier, the QP subproblem is a convex optimization problem with a unique global minimizer. This has been confirmed in the simulations: the d_{k} that minimizes a QP problem with a specific setting is always identical, regardless of the initial guess.
The QP subproblem is solved iteratively, where at each step the solution is given by d_{l+1} = d_{l} + α d̂_{l}. The set of active constraints at the lth iteration, Á_{l}, is used to set a basis for a search direction d̂_{l}. It constitutes an estimate of the constraint boundaries at the solution point, and it is updated at each QP iteration. When a new constraint joins the active set, the dimension of the search space is reduced, as expected.
d̂_{l} is the notation for the search direction in the QP iteration; it is different from d_{k} in the major iteration of the SQP, but it has the same role of showing the direction in which to move towards the minimum. The search direction in each QP iteration remains on any active constraint boundaries while it is calculated to minimize the quadratic objective function.
The possible subspace for d̂_{l} is built from a basis Z_{l}, whose columns are orthogonal to the active set Á_{l}, i.e., Á_{l} Z_{l} = 0. Therefore, any linear combination of the columns of Z_{l} constitutes a search direction that is assured to remain on the boundaries of the active constraints.
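The construction of such a basis can be sketched with a small example. The paper obtains Z_l from a QR decomposition of the active set; the sketch below substitutes a plain modified Gram-Schmidt procedure, which yields an orthonormal basis with the same defining property (every active constraint row is orthogonal to every column of Z). The function name and the 2-constraint, 4-variable example are hypothetical.

```python
import math


def null_space_basis(active_rows, tol=1e-12):
    """Return orthonormal vectors z with (active row) . z = 0 for all rows."""
    M = len(active_rows[0])
    row_basis = []    # orthonormal vectors spanning the active constraint rows
    null_basis = []   # orthonormal vectors orthogonal to all active rows

    def orthonormalize(v):
        w = list(v)
        for u in row_basis + null_basis:
            c = sum(x * y for x, y in zip(w, u))
            w = [x - c * y for x, y in zip(w, u)]
        norm = math.sqrt(sum(x * x for x in w))
        return [x / norm for x in w] if norm > tol else None

    for row in active_rows:
        u = orthonormalize(row)
        if u is not None:
            row_basis.append(u)
    for i in range(M):
        # Complete the basis with unit vectors; what survives is orthogonal
        # to every active row and to the null-space vectors found so far.
        u = orthonormalize([1.0 if j == i else 0.0 for j in range(M)])
        if u is not None:
            null_basis.append(u)
    return null_basis


active_rows = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]  # P = 2 constraints, M = 4
Z = null_space_basis(active_rows)
print(len(Z))   # M - P = 2 basis vectors
for z in Z:
    print([sum(r * x for r, x in zip(row, z)) for row in active_rows])  # ~0 for every row
```

Each added independent active constraint removes one vector from Z, which matches the statement that the search space shrinks when a constraint joins the active set.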
The active constraints must be linearly independent, so the maximum number of possible independent equations is equal to the number of design variables; in other words, P < M. For more details see [19].
Finally, there are two possible situations in which the search in the QP subproblem is terminated and the minimum is found: either the step length is 1, or the optimum d_{l} is found in the current subspace and its Lagrange multipliers are all positive.
C. SQP Pseudo Code
Here, a pseudo code is provided for the SQP implementation and we will refer to it in the complexity evaluation section. As discussed in the previous parts, the algorithm consists of two loops.
Step 0 Initialization of the variables before starting the SQP algorithm

An extra element (slack variable) is appended to the variables, so ϕ = [ϕ_{0}, ϕ_{1}, ϕ_{2}, ⋯, ϕ_{M}]. The objective function is defined as f(ϕ) = ϕ_{M}, which is initialized to zero; the other elements can be any random guess.

The initial Hessian is an identity matrix, H_{0} = I, and the gradient of the objective function is ∇f(ϕ_{k})^{T} = [0, 0, ⋯, 1].
Step 1 Enter the major loop and repeat until the defined maximum number of iterations is exceeded.
Step 2 Initialization of the variables before starting the QP iterations,

Find a feasible starting point d_{0} and
check that the constraints in the initial working set^{c} are not dependent; otherwise find a new initial point d_{0} which satisfies this initial working set.
Calculate the initial constraints A d_{0} − a,
if max(constraints) > ε then
  The constraints are violated and a new d_{0} needs to be searched for
end if

Initialize Q, R and Z, and compute the initial projected gradient ∇q(d_{0}) and the initial search direction d̂_{0}
Step 3 Enter the QP loop and repeat until the minimum is found

Find the distance in the search direction we can move before violating a constraint
Compute gsd, the gradient with respect to the search direction
ind = find(gsd_{n} > threshold)
if isempty(ind) then
  Set the distance to the nearest constraint as zero and put α = 1
else
  Add the constraint A_{i}^{d} to the active set Á_{l}
  Decompose the active set as in Equation (21)
  Compute the subspace Z_{l} = Q[:, P + 1 : M]
end if

Update d_{l+1} = d_{l} + α d̂_{l}

Calculate the gradient of the objective at this point, ∇q(d_{l})

Check if the current solution is optimal^{e}
if α = 1 or length(Á_{l}) = M then
  Calculate the Lagrange multipliers λ_{i} of the active constraints
end if
if all λ_{i} > 0 then
  return d_{k}
else
  Remove the constraints with λ_{i} < 0
end if

Compute the QP search direction according to the Newton step criteria, Equation (24),
where the projected Hessian is described in Appendix A.
Step 4 Update the solution ϕ for the kth iteration, ϕ_{k+1} = ϕ_{k} + d_{k}, and go back to Step 1
5. Complexity Analysis
The SQP algorithm has a quite complicated mathematical concept, and it can be implemented with different modifications. Therefore, the complexity evaluation is not straightforward. The number of QP iterations is not fixed^{f} and is different for each OFDM symbol; here, the average number of QP iterations is considered to evaluate the complexity. For 60 subblocks, 1024 subcarriers and 64 QAM, the average is obtained as 80 iterations for each major SQP iteration.
Another difficulty in computing the required number of operations is the length of the active set, which changes during the iterations, starting from 1 and reaching at most M at the end of the loop. Consequently, the size of R in the QR decomposition and of Z, the basis for the search subspace, are not fixed during the process, so the complexity cannot be assessed directly for each QP iteration and some numerical estimation is necessary.
To evaluate the amount of computation needed for this technique, all steps in the pseudo code are reviewed in detail and an explicit expression is given for each part. First, the complexity of the major loop is assessed in Steps 1 and 4, and then the QP loop is evaluated separately. Finally, the complexity is derived in terms of the number of subblocks and major iterations with some approximation and numerical analysis.
Major loop:
1) Objective function and constraints from Equation (10):
2) Jacobian matrix from Equation (11):
3) Hessian update from Equation (17):
2M × N multiplications and 2M × (N + 1) additions to calculate Equation (19),
4) The solution ϕ is updated, which requires M additions.
QP loop:
1) Gradient with respect to the search direction:
2) Distance to the nearest constraint from Equation (22):
3) Addition of a constraint to the active set:
4) Update of the solution d_{l}, which needs M additions.
5) The gradient of the objective at the new solution point needs M^{2} multiplications and M^{2} + 1 additions.
6) The Lagrange multipliers are obtained by solving a linear system of equations, which imposes a complexity on the order of M^{3} [24].
7) Removal of a constraint in case of λ_{i} < 0:
8) Search direction according to Equation (24):
It is a solution to a system of linear equations. The size of Z varies during the iterations, and starts from M × M and reduces to an M × 1 matrix at the end. Accordingly, the complexity in a QP iteration can be stated as 2S^{2}(M + S/ 3) where S is the number of columns in Z at each step.
First, the computation required for the major loop is obtained as 22NM + 9M + N. Next, the computation in the QP loop is divided into fixed and variable parts^{g}: (6M + 2)N + 2M^{2} + M operations are performed in the parts numbered 1, 2, 4 and 5 in every iteration. In addition, the other parts involve a variable number of operations, which are evaluated separately.
To resolve the search direction in Equation (24), two cases are possible: the first M iterations together need 0.4167M^{4} + 0.6667M^{3} + 0.25M^{2} operations, which is derived by numerical analysis and polynomial fitting, and each further iteration needs 2M operations. Therefore, the required number of flops can be approximated as 0.4M^{3} + 0.7M^{2} + 0.2M per iteration. In the QR decomposition part, which is performed in every iteration, the procedure is the same: the first M iterations need 0.25M^{4} − 0.3333M^{3} + 0.0833M^{2} operations, and the extra ones need (4/3)M^{3} flops. So this part of the computation is approximated as 0.25M^{3} per QP iteration by dividing the total operations by M.
Table 1 The complexity of different algorithms to search for the optimum phase set
Algorithm | Operations
OPT PTS |
PSO |
SQP |

There are other optimization methods that can be used to find the best phase weights. PSO is one of the methods proposed for the PTS phase search, and many modifications have been introduced to simplify the technique [25]. However, numerical optimization techniques like PSO are only applicable to PTS with a limited number of subblocks and subcarriers (at most 256 subcarriers and 16 subblocks), so that the algorithm converges fast enough to the optimal solution. Here there are 60 subblocks, and even when the allowed phase set is just ±1, the candidate solutions span 2^{60} possible options. To reduce the convergence time of the optimization technique, the number of randomly generated solutions needs to be a reasonable proportion of all possible solutions, while the complexity increases linearly with the number of particles in the initial swarm population. The continuous version of PSO is implemented, and the simulation result is shown in Figure 7 for a number of computations almost equal to that of the SQP curve.
The complexity of PSO is expressed as the number of required flops in Table 1 where k is the number of iterations and n is the number of initial solutions or the swarm population. For more details on the complexity of PSO, see [26].
The exhaustive search, whose complexity is shown in the first row of Table 1, is used in conventional optimal PTS and has a significantly higher cost than the proposed algorithm. Moreover, its performance is not as good as that of SQP, since the phase coefficients are optimized over a quantized phase set. The whole calculation in Equation (7) has to be repeated for every combination of phase vectors, which requires K^{M} × MN additions and multiplications, where K is the number of allowed phases and M is the number of subblocks. Additionally, K^{M} × (N + 1) comparisons are needed to find the largest sample in each candidate transmit sequence and then to choose the minimum among all PAPRs.
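The exhaustive search described above can be illustrated with a minimal sketch under toy assumptions that are not taken from the paper: N = 32 QPSK subcarriers, M = 4 contiguous subblocks, the phase set ±1, and a naive inverse DFT. The function names (`idft`, `papr`, `exhaustive_pts`) are hypothetical, and the subblock partition does not reproduce the WiMAX cluster structure.

```python
import cmath
import itertools
import random


def idft(freq):
    """Naive inverse DFT (O(N^2)); adequate for a small illustrative N."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr(samples):
    """Peak power divided by mean power of a time-domain block."""
    powers = [abs(s) ** 2 for s in samples]
    return max(powers) / (sum(powers) / len(powers))


def exhaustive_pts(freq, num_subblocks, phases=(1, -1)):
    """Try all K^M phase vectors over contiguous subblocks; return the best."""
    n = len(freq)
    size = n // num_subblocks  # assumes n is divisible by num_subblocks
    # One time-domain partial transmit sequence per zero-padded subblock.
    partial = []
    for m in range(num_subblocks):
        block = [freq[k] if m * size <= k < (m + 1) * size else 0 for k in range(n)]
        partial.append(idft(block))
    best_papr, best_weights, tried = float("inf"), None, 0
    for weights in itertools.product(phases, repeat=num_subblocks):
        combined = [
            sum(w * partial[m][t] for m, w in enumerate(weights)) for t in range(n)
        ]
        tried += 1
        value = papr(combined)
        if value < best_papr:
            best_papr, best_weights = value, weights
    return best_papr, best_weights, tried


rng = random.Random(0)
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
symbol = [rng.choice(qpsk) for _ in range(32)]  # toy size, not the paper's 1024
original = papr(idft(symbol))
best, weights, tried = exhaustive_pts(symbol, num_subblocks=4)
print(f"candidates tried: {tried}, PAPR {original:.2f} -> {best:.2f} with {weights}")
```

Because the all-ones weight vector reproduces the unmodified symbol, the best candidate is never worse than the original. The number of candidates grows as K^{M}, which is what makes the search impractical for large M.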
To put the PTS complexity in perspective, assume the allowed phase set is ±1, so that K = 2 and no phase rotation is required. With M = 60 subblocks and the same settings as for the SQP, approximately 10^{23} additions and 10^{21} comparisons have to be performed to find the optimum phase, which is clearly impractical. In contrast, the SQP requires about 10^{8} flops for 60 subblocks, which is roughly equivalent to the PTS exhaustive search with only 12 subblocks and two phase options. Given recent developments in DSP technology and the time schedule in the WiMAX and LTE standards, this amount of computation is affordable.
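The orders of magnitude quoted above can be reproduced from the stated operation counts, K^{M} × MN additions and multiplications and K^{M} × (N + 1) comparisons. The sketch below assumes N = 1024 subcarriers, as in the simulated system; the function name `pts_exhaustive_cost` is hypothetical.

```python
def pts_exhaustive_cost(num_phases, num_subblocks, num_subcarriers):
    """Operation counts quoted in the text for an exhaustive PTS search."""
    candidates = num_phases ** num_subblocks                # K^M phase vectors
    add_mul = candidates * num_subblocks * num_subcarriers  # K^M * M * N
    comparisons = candidates * (num_subcarriers + 1)        # K^M * (N + 1)
    return add_mul, comparisons


N = 1024  # assumption: same number of subcarriers as in the simulations
full_add_mul, full_cmp = pts_exhaustive_cost(2, 60, N)
small_add_mul, _ = pts_exhaustive_cost(2, 12, N)
print(f"M=60: {full_add_mul:.1e} add/mul, {full_cmp:.1e} comparisons")
print(f"M=12: {small_add_mul:.1e} add/mul")
```

With these assumptions, the M = 60 case gives about 7 × 10^{22} additions and multiplications and about 1.2 × 10^{21} comparisons, while the M = 12 case gives about 5 × 10^{7} operations, in line with the rounded figures in the text.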
Many methods in the literature are dedicated to developing suboptimal PTS schemes that reduce the complexity of the exhaustive search in conventional PTS, at the cost of performance degradation. In this paper, we introduced a systematic optimization technique to achieve the optimal solution of the phase rotation approach for PAPR reduction, which has not been studied before. Moreover, the proposed technique does not incur the common costs of increased BER at the receiver or increased transmit power, so the only costly part is the optimization procedure. In every other PTS technique, by contrast, side information is sent to the receiver, which reduces the spectral efficiency, increases the transmit power, or even degrades the BER in case of a transmission error.
Few PAPR reduction techniques work without side information, and it is not fair to compare the SQP technique with other PTS phase optimization approaches that require explicit information to be sent to the receiver.
6. Simulation Results
The proposed PAPR reduction technique is simulated for an OFDMA system with 1024 subcarriers and 64-QAM modulation, using the WiMAX data structure explained in Figure 1. The cumulative distribution function (CDF) of the PAPR is one of the most frequently used performance measures for PAPR reduction techniques. The complementary CDF (CCDF) is used here to evaluate the different methods; it denotes the probability that the PAPR of a data block exceeds a given threshold and is expressed as CCDF = 1 - CDF.
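The CCDF measure defined above can be estimated empirically from a set of simulated symbols. The following minimal sketch makes simplifying assumptions that differ from the paper's setup: 200 random symbols with N = 32 QPSK subcarriers, a naive inverse DFT, and no oversampling. All function names are hypothetical.

```python
import cmath
import math
import random


def idft(freq):
    """Naive inverse DFT (O(N^2)); adequate for a small illustrative N."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(samples):
    """PAPR of one time-domain block in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def empirical_ccdf(values, thresholds):
    """Fraction of values strictly above each threshold: CCDF = 1 - CDF."""
    total = len(values)
    return [sum(v > t for v in values) / total for t in thresholds]


rng = random.Random(1)
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
paprs = [papr_db(idft([rng.choice(qpsk) for _ in range(32)])) for _ in range(200)]
thresholds = [3.0 + 0.5 * i for i in range(15)]  # PAPR_0 grid in dB
ccdf = empirical_ccdf(paprs, thresholds)
for t, p in zip(thresholds, ccdf):
    print(f"Pr{{PAPR > {t:.1f} dB}} = {p:.3f}")
```

Plotting `ccdf` against `thresholds` on a logarithmic probability axis gives the kind of curve used to compare PAPR reduction methods. A steeper curve further to the left indicates better performance.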
The optimization problem has many local minima at slightly different levels. This is a promising property, because reaching a local minimum satisfies the PAPR reduction aim even if the global minimum is not found. As a result, the performance of the proposed algorithm is relatively insensitive to the initialization of the optimization.
A. Performance of Different Algorithms
The LSE algorithm minimizes the objective function f(x) = (f_{1}(x))^{2} + (f_{2}(x))^{2} + ⋯ + (f_{N}(x))^{2}, which is the sum of the OFDM subcarrier amplitudes^{h}. The components are forced to be equal in order to minimize the sum, so the large samples are pushed down to a specific level, whereas the smaller ones become larger. One of the optimization methods examined for searching the phase coefficients in PTS is particle swarm optimization (PSO) [27]. The gain achieved by PSO is slightly better than that of LSE, but PSO is expensive to implement, especially when the number of subblocks is large. The simulation results show that, for the same amount of computation, PSO is 2 dB worse than SQP when the initial particle number is n = 100 and k = 50 iterations are used [26].
If the search for the global minimum can be performed in each OFDM symbol, then the CCDF curve improves to some degree. In our test, each OFDM symbol has been processed 100 times with different initial guesses and the one with the smallest PAPR is selected. The result in Figure 7 (advanced SQP) shows an overall improvement of about 0.5 dB. In this case, the PAPR of the system can almost be considered as a deterministic value since the CCDF curve is almost vertical.
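The multi-start strategy described above, in which the optimization is run from several initial guesses and the smallest PAPR is kept, can be sketched as follows. The local optimizer here is a simple random-perturbation descent used purely as a stand-in. It is not the SQP algorithm of this paper, and the toy sizes (N = 32 QPSK subcarriers, M = 4 subblocks, 10 starts) and all function names are assumptions.

```python
import cmath
import random


def idft(freq):
    """Naive inverse DFT (O(N^2)); adequate for a small illustrative N."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr(samples):
    """Peak power divided by mean power of a time-domain block."""
    powers = [abs(s) ** 2 for s in samples]
    return max(powers) / (sum(powers) / len(powers))


def partial_sequences(freq, num_subblocks):
    """Time-domain sequence of each contiguous, zero-padded subblock."""
    n = len(freq)
    size = n // num_subblocks
    return [
        idft([freq[k] if m * size <= k < (m + 1) * size else 0 for k in range(n)])
        for m in range(num_subblocks)
    ]


def weighted_papr(partial, phases):
    """PAPR of the sum of subblock sequences, each rotated by exp(j*phase)."""
    rot = [cmath.exp(1j * p) for p in phases]
    n = len(partial[0])
    return papr([sum(r * seq[t] for r, seq in zip(rot, partial)) for t in range(n)])


def local_search(partial, start, rng, steps=40, scale=0.3):
    """Stand-in local optimizer: keep random perturbations that lower the PAPR."""
    phases, value = list(start), weighted_papr(partial, start)
    for _ in range(steps):
        trial = [p + rng.gauss(0, scale) for p in phases]
        trial_value = weighted_papr(partial, trial)
        if trial_value < value:
            phases, value = trial, trial_value
    return value, phases


def multi_start(partial, num_starts, rng):
    """Run the local search from several random initial guesses; keep the best."""
    results = []
    for _ in range(num_starts):
        start = [rng.uniform(0, 2 * cmath.pi) for _ in partial]
        results.append(local_search(partial, start, rng))
    return min(results, key=lambda r: r[0]), results


rng = random.Random(2)
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
partial = partial_sequences([rng.choice(qpsk) for _ in range(32)], 4)
best, results = multi_start(partial, num_starts=10, rng=rng)
print(f"single start: {results[0][0]:.2f}, best of 10 starts: {best[0]:.2f}")
```

By construction, the best-of-many result is never worse than any single start. The spread among the per-start values indicates how much the local minima differ in level.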
B. Evaluation of Effective Parameters in SQP Performance
Finally, the PAPR reduction performance in terms of the CCDF curve does not change with different initial guesses, because the maximum over all 10,000 simulated OFDM symbols defines the CCDF curve in the low-probability region of Pr{PAPR > PAPR_{0}}, and this does not depend on the initial solution. Within each OFDM symbol, however, a lower minimum can be found by examining various starting points, and the performance can be improved, as the advanced SQP curve in Figure 7 illustrates.
7. Concluding Remarks
We introduced a precoding PAPR reduction technique that is applicable to OFDM/A communication systems using dedicated pilots. We developed the technique for a WiMAX system, but it is applicable to OFDM/A systems in general where dedicated pilots and data are both beamformed. Beamforming performance depends on the relative phase shift between antennas but is unaffected by a phase shift common to all antennas. PAPR, on the other hand, changes with a common phase shift, and the PAPR reduction technique proposed in this paper is based on this property. Each cluster within the WiMAX data structure is weighted with a proper phase coefficient, and the coefficients are optimized to minimize the PAPR of the time domain transmitted signal.
The proposed technique comes with interesting unique features, making it a very appealing method, especially for standard-constrained applications. No side information is sent to the receiver, so the throughput is not affected, and neither the transmitted power nor the bit error rate increases, which are otherwise common drawbacks of many PAPR reduction techniques. Moreover, an optimization technique for finding the best weights was proposed. The PAPR reduction problem was formulated as a minimax problem that was solved by deriving the gradient analytically and modifying the SQP algorithm to solve the optimization.
The SQP algorithm works effectively and provides a large PAPR reduction gain. At the cost of a smaller PAPR reduction gain, it is possible to reduce the computational complexity of the technique by using other gradient-based optimization techniques. Even lower complexity can be achieved using a least-squares-based formulation, but simulation results indicated a substantial performance loss compared with the SQP approach. The SQP itself can be implemented in different ways to simplify the algorithm, and several steps can be done in parallel for a more practical hardware implementation.
Appendix A
Calculation of the search direction
Solving Equation (26) for b at each QP iteration gives the search direction, and the step is then taken along it. Since the objective is a quadratic function, there are only two choices of step length α: it is either 1 along the search direction or less than 1. If a step of length 1 can be taken without violating the constraints, then this is the exact step to the minimum of the quadratic function. Otherwise, the distance to the nearest constraint is found and the solution is moved along the search direction by that distance, as in Equation (22).
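The step-length choice described above can be sketched with the ratio test commonly used in active-set QP methods. The form below is an assumption: Equation (22) is not reproduced in this section, so the constraint convention A x ≤ b, the function name `step_length`, and the example data are all illustrative rather than the paper's own.

```python
def step_length(x, d, A, b, active=()):
    """Largest alpha in [0, 1] keeping A(x + alpha*d) <= b, plus the blocking row.

    Assumes linear inequality constraints A x <= b (a common convention, not
    necessarily the one used in Equation (22)).
    """
    alpha, blocking = 1.0, None
    for i, (row, bound) in enumerate(zip(A, b)):
        if i in active:
            continue  # constraints already in the active set are skipped
        slope = sum(a * di for a, di in zip(row, d))
        if slope <= 0:
            continue  # moving along d never violates this constraint
        distance = (bound - sum(a * xi for a, xi in zip(row, x))) / slope
        if distance < alpha:
            alpha, blocking = distance, i
    return alpha, blocking


# The full step would cross the second constraint (x2 <= 0.5).
alpha, blocking = step_length([0.0, 0.0], [1.0, 1.0], [[1, 0], [0, 1]], [2.0, 0.5])
print(alpha, blocking)

# No constraint blocks the step, so the unit step is taken.
alpha_free, blocking_free = step_length([0.0, 0.0], [1.0, 1.0], [[1, 0], [0, 1]], [2.0, 3.0])
print(alpha_free, blocking_free)
```

When a blocking row is returned, that constraint is the one to add to the active set before the next QP iteration.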
Endnotes
^{a}The general aim of this modification is to distort the elements of q_{k}, which contribute to a positive definite update, as little as possible. Therefore, in the initial phase of the modification, the most negative element of is repeatedly halved. This procedure is continued until is greater than or equal to a small negative tolerance. If, after this procedure, is still not positive, modify q_{k} by adding a vector v multiplied by a constant scalar w, and increase w systematically until becomes positive; see [19].
^{b}Equality constraints are always active, but there are no equality constraints in this phase optimization problem.
^{c}When it is not the first major iteration, the active set is not empty.
^{d}Where i is the index of the minimum in (22), which indicates the active constraint to be added.
^{e}The term "length" indicates the number of rows in A_{l} or, equivalently, the number of active constraints.
^{f}The QP is a convex optimization problem, so the iterations proceed until the optimum is found, but a modification of the algorithm can be used when the number of iterations is fixed.
^{g}The fixed operations belong to those matrices whose sizes do not change during the iterations, while there are other matrices, like Z, that have variable size and hence different complexity during the iterations.
^{h}This is the simplest scenario, but other modifications can be made to develop a more elaborate version of LSE.
References
1. Patterson K: Generalized Reed-Muller codes and power control in OFDM modulation. IEEE Trans Inf Theory 1997, 46: 104-120.
2. Tellado J: Multicarrier Modulation with Low Peak to Average Power Applications to xDSL and Broadband Wireless. Kluwer Academic, Norwell, MA; 2000.
3. Behravan A: Evaluation and Compensation of Nonlinear Distortion in Multicarrier Communication Systems. PhD thesis. Chalmers University of Technology, Department of Signals and Systems, Communication System Group, Gothenburg, Sweden; 2006.
4. Ciochina C, Buda F, Sari H: An analysis of OFDM peak power reduction techniques for WiMAX systems. Proceedings of IEEE International Conference on Communications 2006, 46: 104-120.
5. Krongold BS, Jones DL: PAR reduction in OFDM via active constellation extension. IEEE Trans Broadcast 2003, 3: 258-268.
6. Han SH, Lee JH: An overview of peak-to-average power ratio reduction techniques for multicarrier transmission. IEEE Wirel Commun Mag 2005, 12: 56-65. 10.1109/MWC.2005.1421929
7. Tellambura C: Phase optimization criterion for reducing peak-to-average power ratio in OFDM. IET Electron Lett 1998, 34: 169-170. 10.1049/el:19980163
8. Cimini LJ Jr, Sollenberger NR: Peak-to-average-power ratio reduction of an OFDM signal using partial transmit sequences. IEEE Commun Lett 2000, 4: 86-88. 10.1109/4234.831033
9. Müller SH, Huber JB: A novel peak power reduction scheme for OFDM. Proceedings of IEEE PIMRC 1997, 3: 1090-1094.
10. Andrews JG, Ghosh A, Muhamed R: Fundamentals of WiMAX: Understanding Broadband Wireless Networking. Prentice Hall; 2007.
11. Kang S, Kim J, Joo E: A novel subblock partition scheme for partial transmit sequence OFDM. IEEE Trans Commun 1999, 45: 333-338.
12. Fletcher R: Practical Methods of Optimization. 2nd edition. Wiley-Interscience; 2000.
13. Gill P, Murray W, Wright MH: Practical Optimization. Academic Press; 1981.
14. Powell MJD: Variable Metric Methods for Constrained Optimization. Springer Verlag; 1983.
15. Schittkowski K: NLPQL: a FORTRAN subroutine solving constrained nonlinear programming problems. Ann Oper Res 1985, 5: 485-500.
16. Kuhn HW, Tucker AW: Nonlinear programming. Proceedings of Second Berkeley Symposium on Mathematical Statistics and Probability 1951, 481-492.
17. Yi Z: Ab-initio Study of Semiconductor and Metallic Systems: From Density Functional Theory to Many Body Perturbation Theory. PhD thesis. University of Osnabruck, Department of Physics, Osnabruck, Germany; 2009.
18. Amjady N, Keynia F: Application of a new hybrid neuro-evolutionary system for day-ahead price forecasting of electricity markets. Appl Soft Comput 2010, 10: 784-792. 10.1016/j.asoc.2009.09.008
19. Matlab optimization toolbox user guide, constrained optimization. Volume ch 6. The MathWorks, Inc; 1984:227-235.
20. Murty KG: Linear Complementarity, Linear and Nonlinear Programming, Sigma Series in Applied Mathematics. Volume 3. Heldermann Verlag, Berlin; 1988.
21. Gill P, Murray W, Wright M: Numerical Linear Algebra and Optimization. Volume 1. Addison-Wesley; 1991.
22. Nocedal J, Wright SJ: Numerical Optimization. Operations Research and Financial Engineering. 2nd edition. Springer Verlag; 2006.
23. Qu YJ, Hu BG: RBF networks for nonlinear models subject to linear constraints. IEEE International Conference on Granular Computing 2009, 482-487.
24. Golub GH, Loan CV: Matrix Computations. 3rd edition. Johns Hopkins University Press, Baltimore, MD; 1996.
25. Wang Y, Chen W, Tellambura C: A PAPR reduction method based on artificial bee colony algorithm for OFDM signals. IEEE Trans Wirel Commun 2010, 9: 2994-2999.
26. Khademi S: OFDM peak-to-average-power-ratio reduction in WiMAX systems. Master's thesis. Chalmers University of Technology, Department of Signals and Systems, Communication System Group, Gothenburg, Sweden; 2011.
27. Kennedy J, Eberhart R: Particle swarm optimization. Proceedings of IEEE International Conference on Neural Networks 1995, 46: 1942-1945.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.