Identifying time-varying channels with aid of pilots for MIMO-OFDM
 Zijian Tang^{1, 2} and
 Geert Leus^{2}
DOI: 10.1186/1687-6180-2011-74
© Tang and Leus; licensee Springer. 2011
Received: 2 August 2010
Accepted: 26 September 2011
Published: 26 September 2011
Abstract
In this paper, we consider pilot-aided channel estimation for orthogonal frequency division multiplexing (OFDM) systems with a multiple-input multiple-output setup. The channel is time varying due to Doppler effects and can be approximated by an oversampled complex exponential basis expansion model. We use a best linear unbiased estimator (BLUE) to estimate the channel with the aid of frequency-multiplexed pilots. The applicability of the BLUE, which is referred to as the channel identifiability in this paper, relies upon a proper pilot structure. Depending on whether the channel is estimated within a single OFDM symbol or multiple OFDM symbols, we propose simple pilot structures that guarantee channel identifiability. Further, it is shown that by employing more receive antennas, the BLUE can combat the Doppler-induced interference more effectively and therefore improve the channel estimation performance.
Keywords
MIMO, OFDM, BLUE, time-varying channel, pilot-aided channel estimation, BEM
1 Introduction
Orthogonal frequency division multiplexing (OFDM) systems have attracted enormous attention recently and have been adopted in numerous existing communication systems. OFDM owes much of its popularity to its ability to transmit signals on separate subcarriers without mutual interference. To further enhance the capacity of the transmission link, OFDM systems can be combined with multiple-input multiple-output (MIMO) features.
The fact that OFDM can transmit signals on separate subcarriers can be mathematically represented in the frequency domain by a diagonal channel matrix. This property holds only in a situation where the channel stays (almost) constant for at least one OFDM symbol interval. In practice, a time-invariant channel assumption can become invalid due to, e.g., Doppler effects resulting from the motion between the transmitter and receiver. In such a case, the frequency-domain channel matrix is not diagonal but generally full, with the nonzero off-diagonal elements leading to inter-carrier interference (ICI).
To equalize such channels, the knowledge of all the elements in the channel matrix is required. In order to reduce the number of unknown channel parameters, a widely adopted approach is approximating the variation of the channel in the time domain with a parsimonious model, e.g., a basis expansion model (BEM). Consequently, channel estimation boils down to estimating the corresponding BEM coefficients. Among the various BEMs that have been proposed, this paper will concentrate on the so-called oversampled complex exponential BEM [(O)CE-BEM] [1]. By tuning the oversampling factor, the (O)CE-BEM is reported in [2] to fit time-varying channels much more tightly than its variant, the critically sampled complex exponential BEM [(C)CE-BEM] [3, 4], and it has a steady modeling performance for a wide range of Doppler spreads [5].
Based on a general BEM assumption, the OFDM channel is estimated in [6] utilizing pilots that are multiplexed with data in the frequency domain. The same paper shows that the channel estimators that view the frequency-domain channel matrix as full, such as the (O)CE-BEM, render a better performance than those that view the channel matrix as diagonal [5], or strictly banded [4], such as the (C)CE-BEM. In this paper, the results of [6] will be extended from a single-input single-output (SISO) scenario to MIMO, with a focus on channel identifiability issues.
Estimating time-varying channels in a MIMO-OFDM system gives rise to a number of additional challenges. In the first place, due to multiple transmit-receive links, more channel unknowns need to be estimated, which requires more pilots and thus imposes a higher pressure on the bandwidth efficiency. To alleviate this problem, we will employ more pilot-carrying OFDM symbols to leverage the channel correlation along the time axis as in [7, 8]. Although this comes at a penalty of a larger BEM modeling error, the overall channel estimation performance can still be improved.
Another challenge in a MIMO-OFDM system is how to distribute pilots in the time, frequency and spatial domains. Barhumi et al. [9] and Minn and Al-Dhahir [10] propose optimal pilot schemes but only for time-invariant channels or systems for which the time variation of the channel within one OFDM symbol can be neglected. Except for [7, 11], much less attention has been paid to systems dealing with channels varying faster. In this paper, we will use the channel identifiability criterion as a guideline to design pilot schemes. It is noteworthy that the proposed pilot structures can be independent of the oversampling factor of the (O)CE-BEM, which endows the receiver with the freedom to choose the most suitable oversampling factor.
Pilot structures can have a great impact on both channel identifiability and estimation performance. The latter is, however, difficult to tackle analytically for time-varying channels. In this paper, we will try to establish, by means of simulations, a guideline for designing pilots that render a satisfactory channel estimation performance for different channel situations.
The MIMO feature brings not only design challenges but also performance benefits. Due to the ICI, the contribution of the pilots is always mixed with the contribution of the unknown data in the received samples. By taking this interference explicitly into account in the channel estimator design, [6] shows that the resulting best linear unbiased estimator (BLUE) can cope with the interference reasonably well, producing a performance close to the Cramér-Rao bound (CRB). When multiple receive antennas are deployed, we observe that the channel estimation performance can even be further improved. This is attributed to the fact that each receive antenna gets a different copy of the same transmitted data. The interference is therefore correlated across the receive antennas, which can be exploited by the BLUE to suppress the interference more effectively than in the single receive antenna case. To the best of our knowledge, this effect has not been reported before.
The remainder of the paper is organized as follows. In Section 2, we present a general MIMO-OFDM system model. In Section 3, we describe how the BLUE can be used to estimate the BEM coefficients. Channel identifiability is discussed in Section 4, based on which we propose a variety of pilot structures. The simulation results are given in Section 5, where we discuss the impact of the various pilot structures on the performance. Conclusions are given in Section 6.
Notation: We use upper (lower) bold face letters to denote matrices (column vectors). (·)*, (·)^{T} and (·)^{H} represent conjugate, transpose and complex conjugate transpose (Hermitian), respectively. [x]_{p} indicates the pth element of the vector x, and [X]_{p,q} indicates the (p, q)th entry of the matrix X. $\mathcal{D}\{x\}$ is used to denote a diagonal matrix with x on the diagonal, and $\mathcal{D}\{A_{0},\dots,A_{N-1}\}$ is used to denote a block-diagonal matrix with the matrices A_{0}, ..., A_{N-1} on the diagonal. ⊗ and † represent the Kronecker product and the pseudo-inverse, respectively. I_{N} stands for the N × N identity matrix, 1_{M×N} for the M × N all-one matrix, and W_{K} for the K-point normalized discrete Fourier transform (DFT) matrix. We use $X^{\{\mathcal{R},\mathcal{C}\}}$ to denote the submatrix of X whose row and column indices are collected in the sets $\mathcal{R}$ and $\mathcal{C}$, respectively. Similarly, we use $X^{\{\mathcal{R},:\}}$ ($X^{\{:,\mathcal{C}\}}$) to denote the rows (columns) of X whose indices are collected in $\mathcal{R}$ ($\mathcal{C}$). The cardinality of the set $\mathcal{S}$ is denoted by $|\mathcal{S}|$.
2 System model
Let us consider a MIMO-OFDM system with N_{T} transmit antennas and N_{R} receive antennas, where the channel in the time domain is assumed to be a time-varying causal finite impulse response (FIR) filter with a maximum order L. Using $h_{p,l}^{(m,n)}$ to denote the time-domain channel gain of the lth lag at the pth time instant for the channel between the mth transmit antenna and nth receive antenna, we can assume that $h_{p,l}^{(m,n)}=0$ for l < 0 or l > L. Note that this channel model can take the transmit/receive filters, the propagation environment and the possible synchronization errors among different transmission links into account.
where z^{(n)}[j] represents the additive noise related to the nth receive antenna; $H_{\mathrm{c}}^{(m,n)}[j]$ denotes the channel matrix between the mth transmit antenna and nth receive antenna in the time domain, and $H_{\mathrm{d}}^{(m,n)}[j] := W_{K} H_{\mathrm{c}}^{(m,n)}[j] W_{K}^{H}$ represents its counterpart in the frequency domain. Under the FIR assumption of the channel and letting L_{cp} = L without loss of generality, we can express the entries of $H_{\mathrm{c}}^{(m,n)}[j]$ as $[H_{\mathrm{c}}^{(m,n)}[j]]_{p,q} = h_{j(K+L)+p+L,\,\mathrm{mod}(p-q,K)}^{(m,n)}$ with mod(a, b) standing for the remainder of a divided by b.
Obviously, if the channel stays constant within an OFDM symbol, ${H}_{\mathsf{\text{c}}}^{\left(m,n\right)}\left[j\right]$ will be a circulant matrix (hence the subscript c). This results in a diagonal matrix ${H}_{\mathsf{\text{d}}}^{\left(m,n\right)}\left[j\right]$ (hence the subscript d), which means that the subcarriers are orthogonal to each other. This property is however corrupted if the time variation within an OFDM symbol is not negligible.
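The relation between a circulant time-domain matrix and a diagonal frequency-domain matrix can be checked numerically. The following is only a minimal sketch for a single link and a single OFDM symbol with the cyclic prefix already removed; the tap values, the size K = 8, and the common phase rotation used to mimic a time variation are illustrative assumptions rather than parameters taken from this paper.

```python
import cmath

def dft_matrix(K):
    """Normalized K-point DFT matrix W_K."""
    scale = K ** -0.5
    return [[scale * cmath.exp(-2j * cmath.pi * r * c / K) for c in range(K)]
            for r in range(K)]

def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def hermitian(A):
    """Complex conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def time_domain_matrix(h, K, L):
    """[H_c]_{p,q} = h[p][mod(p - q, K)], with zero gain for lags above L.

    h[p][l] is the gain of lag l at time instant p within the symbol
    (simplified indexing: the symbol offset j(K + L) + L is dropped).
    """
    H = [[0j] * K for _ in range(K)]
    for p in range(K):
        for q in range(K):
            lag = (p - q) % K
            if lag <= L:
                H[p][q] = h[p][lag]
    return H

def frequency_domain_matrix(h, K, L):
    """H_d = W_K H_c W_K^H."""
    W = dft_matrix(K)
    return matmul(matmul(W, time_domain_matrix(h, K, L)), hermitian(W))

def off_diagonal_energy(Hd):
    """Total energy outside the main diagonal, i.e. the ICI terms."""
    n = len(Hd)
    return sum(abs(Hd[i][j]) ** 2 for i in range(n) for j in range(n) if i != j)

K, L = 8, 2
taps = [1.0, 0.5, 0.25]  # hypothetical channel taps
static = [[taps[l] for l in range(L + 1)] for p in range(K)]
# Hypothetical time variation: every tap rotates by 0.05 cycles per sample.
varying = [[taps[l] * cmath.exp(2j * cmath.pi * 0.05 * p) for l in range(L + 1)]
           for p in range(K)]

ici_static = off_diagonal_energy(frequency_domain_matrix(static, K, L))
ici_varying = off_diagonal_energy(frequency_domain_matrix(varying, K, L))
```

In this sketch, `ici_static` vanishes up to rounding error, whereas `ici_varying` is clearly nonzero, which is the loss of subcarrier orthogonality described above.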
3 Channel estimation
For ease of analysis, we will differentiate between two cases throughout the paper. The first case is based on a single OFDM symbol, which means that the channel will be estimated for each OFDM symbol individually. The other case employs multiple OFDM symbols. Because these two cases are characterized by some unique properties, we treat them separately.
3.1 Single OFDM symbol
3.1.1 Data model and BEM based on a single OFDM symbol
where κ stands for the oversampling factor with $\kappa =\frac{K}{K+L}$ used for the (C)CE-BEM and $\kappa >\frac{K}{K+L}$ for the (O)CE-BEM.
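To make the role of κ more tangible, the sketch below generates complex exponential basis sequences for both choices of κ. It assumes the commonly used form [u_q]_p = exp(j2πp(q - Q/2)/(κ(K + L))) evaluated over the K samples of one OFDM symbol; this form, the function names, and the values of L and Q are assumptions made for illustration and are not copied from the definitions of this paper.

```python
import cmath

def bem_basis(K, L, Q, kappa):
    """Q + 1 complex exponential sequences of length K (assumed basis form)."""
    period = kappa * (K + L)
    return [[cmath.exp(2j * cmath.pi * p * (q - Q / 2) / period) for p in range(K)]
            for q in range(Q + 1)]

def max_cross_correlation(basis):
    """Largest normalized inner product between two different basis sequences."""
    K = len(basis[0])
    worst = 0.0
    for a in range(len(basis)):
        for b in range(a + 1, len(basis)):
            inner = sum(x.conjugate() * y for x, y in zip(basis[a], basis[b]))
            worst = max(worst, abs(inner) / K)
    return worst

K, L, Q = 64, 3, 2          # L and Q are hypothetical values
critical = K / (K + L)      # kappa of the (C)CE-BEM: the basis period equals K

corr_critical = max_cross_correlation(bem_basis(K, L, Q, critical))
corr_oversampled = max_cross_correlation(bem_basis(K, L, Q, 2 * critical))
```

Under these assumptions, the critically sampled sequences are mutually orthogonal over the symbol, while oversampling by a factor of two yields closely spaced, non-orthogonal sequences.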
Because we will only concentrate on a single OFDM symbol in this section, we drop the index j for the sake of simplicity.
where $H_{\mathrm{d}}^{(m,n)\{\mathcal{O}_{g},\mathcal{P}^{(m)}\}}$ and $H_{\mathrm{d}}^{(m,n)\{\mathcal{O}_{g},\mathcal{D}^{(m)}\}}$ represent submatrices of $H_{\mathrm{d}}^{(m,n)}$, which are schematically depicted in Figure 1. As a consequence of the full matrix $H_{\mathrm{d}}^{(m,n)}$, we can see from (9) that $y^{(n)\{\mathcal{O}_{g}\}}$ depends not only on $p_{g}^{(m)}$, but also on the data d^{(m)} as well as the other pilot clusters.
A detailed derivation of (12)-(14) for the SISO case can be found in [6]. The extension to the MIMO case is rather straightforward.
3.1.2 Best linear unbiased estimator based on a single OFDM symbol
From (10), c can be estimated by diverse channel estimators. Due to space restrictions, this paper will not list all the possible channel estimators, but will only focus on the BLUE.
The BLUE is a compromise between the linear minimum mean-square error (LMMSE) and the least-squares (LS) estimator: it treats c as a deterministic variable, thus avoiding a possible error in calculating channel statistics, which are necessary for the LMMSE estimator; at the same time, it leverages the statistics of the data symbols and noise, which are easier to attain, such that the interference and the noise can still be better suppressed than with the LS estimator. Simulation results in [6] show that the BLUE is able to yield a performance close to that of the LMMSE estimator, even if the latter is equipped with perfect knowledge of the channel statistics.
The above expression is actually the maximum likelihood estimator [12] that is obtained by ignoring the interference i.
we can halt the iterative BLUE if Γ^{[k]} is smaller than a predefined value or the number of iterations K is higher than a predefined value.
In the previous section, we have mentioned that a different choice of ℓ in (9) will have an impact on the channel estimator. For the BLUE in the SISO scenario, it is shown in [6] that the best performance is attained when the whole OFDM symbol is employed for channel estimation.
3.2 Multiple OFDM symbols
In the previous section, the channel is estimated for each block separately. To improve the performance, we will exploit more observation samples in this section. It is nonetheless noteworthy that in the context of time-varying channels, the channel coherence time is rather short, which means that we cannot utilize an infinite number of OFDM symbols to enhance the estimation precision.
where j_{ v } stands for the position of the v th pilot OFDM symbol. Further, the symbol ${\mathcal{P}}^{\left(m\right)}\left[{j}_{v}\right]$, as analogously introduced in the previous section, represents the set of pilot subcarriers within the v th pilot OFDM symbol that is used by the m th transmit antenna. Similar extensions hold for ${\mathcal{D}}^{\left(m\right)}\left[{j}_{v}\right]$, ${\mathcal{O}}^{\left(m\right)}\left[{j}_{v}\right]$ and $\mathcal{O}\left[{j}_{v}\right]$. An interesting topic when utilizing multiple OFDM symbols is how to distribute the pilots along the time as well as frequency axis. To differentiate between various pilot patterns, let us borrow the terms used in [14] to categorize two pilot placement scenarios.^{a}
Block-type: This scheme is considered in [18–20], in which the pilots occupy the entire OFDM symbol, and such pilot OFDM symbols are interleaved along the time axis with pure data OFDM symbols. In mathematics, $|\mathcal{V}| = J$ and $|\mathcal{P}^{(m)}[j_{v}]| < K$. An example of the Block-type scheme with two transmit antennas is sketched in the right plot of Figure 2.
3.2.1 Data model and BEM based on multiple OFDM symbols
where $c_{q}^{(m,n)} := [c_{q,0}^{(m,n)},\dots,c_{q,L}^{(m,n)}]^{T}$. Note that in (24), each OFDM symbol is associated with a different BEM sequence u_{q}[j], but with common BEM coefficients $c_{q}^{(m,n)}$. This is in contrast to (5), where each OFDM symbol is associated with a common BEM, but with different BEM coefficients.
where B[j_{ v }] and d[j_{ v }] are defined as in (14) with the OFDM symbol index added.
3.2.2 Best linear unbiased estimator based on multiple OFDM symbols
where R[j_{ v }] is defined as in (16) with the OFDM symbol index added.
which can be attained in just one shot.
4 Channel identifiability
In this paper, we define channel identifiability in terms of the uniqueness of the BLUE. From (17) and (28), we understand that the BLUE is unique when A or $\tilde{A}$ is of full column-rank, and R or $\tilde{R}$ is nonsingular.
Normally, the nonsingularity of R or $\tilde{R}$ can be easily satisfied in a noisy channel. In contrast, the rank condition of A or $\tilde{A}$ is often difficult to examine, because its composition depends on the choice of the BEM and the pilot structure. Especially for the latter, it turns out to be very hard to give an analytical formulation for a general pilot structure. In this paper, we will adopt a specific pilot structure for each pilot OFDM symbol, which is similar to the frequency-domain Kronecker Delta (FDKD) scheme proposed in [7]. Note that for a general BEM assumption as taken in [6], the FDKD scheme always yields a good performance experimentally.
The basic pilot structure adopted in this paper can be summarized as follows:
where $\bar{p}^{(m)}[j_{v}]$ contains all the nonzero pilots sent by the mth transmit antenna during the vth pilot OFDM symbol, and Δ^{(m)}[j_{v}] gives the position of the nonzero pilot within the cluster.
Further, the following assumption is adopted throughout the remainder of the paper.
This assumption is shown in [6] to maximize the performance of the BLUE. In addition, it will greatly simplify the derivation of the channel identifiability conditions.
As in the previous sections, in order to derive the channel identifiability conditions, we find it instrumental to first explore the rank condition on A for the single OFDM symbol case and then extend the results to multiple pilot OFDM symbols.
4.1 Single OFDM symbol
Following Pilot Design Criterion 1, [7] shows conditions to ensure that the columns of A^{(n)} are orthonormal under a (C)CE-BEM assumption. However, these conditions are not suitable for an (O)CE-BEM assumption as adopted in this paper, and we need to impose more restrictions, especially on the pilot design across the transmit antennas. They are summarized in the following theorem (see Appendix A for a proof).
where μ^{(m)} denotes the position of the first nonzero pilot sent by the mth transmit antenna.
The following remarks are in order at this stage.
Such a pilot structure complies with (34) and (35) with a (C)CE-BEM assumption, i.e., $\kappa =\frac{K}{K+L}$.
We observe in (36) that the FDKD pilot structure contains a certain number of zeros, which are not specified in Theorem 1. These zeros are beneficial to combat the ICI, but not necessary for the rank condition. Later on, we will show that the total number of zeros within the pilot cluster plays a more significant role at high SNR where the ICI becomes more pronounced.
Remark 2. Viewing a time-invariant channel as a special case of a time-varying channel with a trivial Q = 0, we can establish the relationship between the conditions given in (34) and (35), and the conditions given for time-invariant channels. For instance, the pilot structure given in [9] requires the number of nonzero pilots per transmit antenna to be no fewer than L + 1. Further, the nonzero pilots from different transmit antennas must occupy different subcarriers, i.e., |μ^{(m')} - μ^{(m)}| > 0 for m' ≠ m.
4.2 Multiple OFDM symbols
In many practical situations, the conditions of Theorem 1 can be hard to satisfy. For instance, if the Doppler spread and/or the delay spread of the channel are large, the lower and upper bounds in (34) will approach each other, making it harder to find a suitable G. Fortunately, these constraints can be loosened by employing multiple pilot OFDM symbols.
One important issue of channel estimation based on multiple pilot OFDM symbols is how to distribute the pilots along the time axis. Prior to proceeding, let us introduce two possible schemes.
Adopting the above design criterion leads to the following theorem.
The proof is given in Appendix B.
Remark 3. We observe here again that the right inequality in (38) is identical to the channel identifiability condition in [9] for the time-invariant MIMO channel based on multiple OFDM symbols.
Remark 4. For realistic system parameters, $\frac{KQ}{\kappa V\left(K+L\right)}<1$ holds in most cases. From (39), it is hence sufficient if μ^{(m')} ≠ μ^{(m)} for m' ≠ m: this implies that the transmitter can be transparent to the oversampling factor used by the receiver.
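As a quick numerical illustration of Remark 4, the sketch below evaluates the ratio KQ/(κV(K + L)) for several oversampling factors. Only K = 64 is taken from the simulation setup; the values of L, Q and V as well as the function name are hypothetical. Since pilot positions are integer subcarrier indices, two distinct positions differ by at least one, so a ratio below one means that distinct positions are already enough.

```python
def min_offset_bound(K, L, Q, kappa, V):
    """Value of K*Q / (kappa * V * (K + L)) appearing in Remark 4."""
    return K * Q / (kappa * V * (K + L))

K, L, Q, V = 64, 3, 4, 6   # L, Q and V are hypothetical values
critical = K / (K + L)     # smallest kappa, i.e. the (C)CE-BEM

# Larger oversampling factors only shrink the bound further.
bounds = {factor: min_offset_bound(K, L, Q, factor * critical, V)
          for factor in (1, 2, 4)}
distinct_offsets_suffice = all(bound < 1 for bound in bounds.values())
```

For these example values the ratio is about 0.67 at critical sampling and decreases as κ grows, so the transmitter does not need to know which κ the receiver uses.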
An alternative way of designing the pilots is given by the following construction.
Adopting the above design criterion leads to the following theorem.
The proof is given in Appendix C.
Remark 5. Theorem 3 enables the transmitter to be completely transparent to the choice of the oversampling factor at the receiver.
If there is only one transmit antenna, the conditions given in Theorem 3 can be relaxed as stated in the following corollary.
The proof is given in the last part of Appendix C. This property has been explored in [21] where a SISO scenario is considered.
5 Simulations and discussions
For the simulations, we generate time-varying channels conforming to Jakes' Doppler profile [22] using the channel generator given in [23]. The channel taps are assumed to be mutually uncorrelated with a variance of $\sigma_{l}^{2} = 1/\sqrt{L+1}$. The variation of the channel is characterized by the normalized Doppler spread υ_{D} = f_{c}v/c, where f_{c} is the carrier frequency, v is the speed of the vehicle parallel to the direction between the transmitter and the receiver, and c is the speed of light.
We consider an OFDM system with 64 subcarriers. The pilots and data symbols are multiplexed in the frequency domain by occupying different subcarriers. The data symbols are modulated by quadrature phase-shift keying (QPSK). Further, we set the average power of the pilots to be equal to the average power of the data symbols.
Note that in the above criterion, the true channel $h_{k,l}^{(m,n)}$ is used, which implies that we also take the BEM modeling error into account.
For all the numerical examples below, we adopt the stop criterion that halts the iterative BLUE if either Γ^{[k]}, which is defined in (19) as the normalized difference in energy between the previous and current estimates, is smaller than 10^{-6} or the number of iterations K is higher than 30.
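The stop criterion can be sketched as a generic iteration loop. The exact expression of Γ^{[k]} in (19) is not reproduced here; the sketch assumes Γ^{[k]} = ||c^{[k]} - c^{[k-1]}||² / ||c^{[k]}||², and the function names and the toy update rules are hypothetical stand-ins for one BLUE iteration.

```python
def normalized_energy_change(previous, current):
    """Assumed form of Gamma: ||current - previous||^2 / ||current||^2."""
    numerator = sum(abs(a - b) ** 2 for a, b in zip(previous, current))
    denominator = sum(abs(b) ** 2 for b in current)
    return numerator / denominator if denominator > 0 else float("inf")

def iterate_until_converged(update, initial, tolerance=1e-6, max_iterations=30):
    """Apply `update` until Gamma < tolerance or max_iterations updates were made."""
    estimate = list(initial)
    for k in range(1, max_iterations + 1):
        new_estimate = update(estimate)
        gamma = normalized_energy_change(estimate, new_estimate)
        estimate = new_estimate
        if gamma < tolerance:
            break
    return estimate, k, gamma

# Toy update that contracts towards 2.0; it stands in for one estimator iteration.
estimate, iterations, gamma = iterate_until_converged(
    lambda c: [0.5 * x + 1.0 for x in c], [0.0])

# Toy update that never settles, so the iteration cap is what stops the loop.
_, stalled_iterations, _ = iterate_until_converged(lambda c: [-x for x in c], [1.0])
```

The first toy run stops through the tolerance test well before the cap, while the second one runs into the cap of 30 iterations.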
Study Case 1: Single OFDM Symbol
Pilot structure
               | G  | P + 1 | V_{a} | V_{b} | J
Comb-type I    | 4  | 8     | 6     | 1     | 6
Comb-type II   | 16 | 2     | 1     | 6     | 6
Block-type     | 16 | 4     | 1     | 3     | 6
Study Case 2: Short Channels
Again, we observe that the channel estimation performance degrades with more transmit antennas, but improves with more receive antennas, especially at high SNR. In contrast, this does not happen to the Block-type scheme. We understand that the interference induced by the Doppler spread to the channel estimator becomes the dominant nuisance at high SNR. At the same time, this interference is a function of the transmitted data and hence strongly correlated among different receive antennas. The BLUE is able to exploit this correlation to combat the interference better. The following heuristic analysis enables a better insight into this effect.
The above suggests that the rank of $\tilde{R}^{-1}$ increases with the number of receive antennas, the number of pilot OFDM symbols as well as the number of pilots within the OFDM symbol, but decreases with the number of transmit antennas. A higher rank of $\tilde{R}^{-1}$ is beneficial to the condition of the matrix $\tilde{A}^{H}\tilde{R}^{-1}\tilde{A}$, which is in turn related to the trace of $(\tilde{A}^{H}\tilde{R}^{-1}\tilde{A})^{-1}$. Following such a reasoning, it is not difficult to understand that increasing the number of receive antennas is beneficial to the performance just as increasing the number of pilots or decreasing the number of transmit antennas. To the best of our knowledge, this effect of the number of receive antennas on the channel estimation performance is not widely recognized. The main reason is that most works are based on a scenario where the interference is absent at the receiver, e.g., for time-invariant channels, or in the case of the Block-type scheme, where the pilots occupy the whole OFDM symbol and there is no interference either.
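The benefit of interference that is correlated across receive antennas can be illustrated with a deliberately simplified toy model, which is not the data model of this paper: a real scalar unknown c is observed at two antennas as y_n = a_n c + g_n d + z_n, where d is a unit-variance data symbol and z_n is white noise. The gains a and g, the noise variance and the function names are hypothetical. For such a model, the BLUE has variance 1/(a^T R^{-1} a), with R the covariance of interference plus noise.

```python
def covariance(g, noise_var, correlated=True):
    """2x2 covariance of the interference g[n]*d plus white noise (d has unit variance)."""
    off_diagonal = g[0] * g[1] if correlated else 0.0
    return [[g[0] ** 2 + noise_var, off_diagonal],
            [off_diagonal, g[1] ** 2 + noise_var]]

def blue_variance(a, R):
    """Variance 1 / (a^T R^{-1} a) of the BLUE of c from y = a*c + e, cov(e) = R."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    quad = (a[0] ** 2 * R[1][1]
            - a[0] * a[1] * (R[0][1] + R[1][0])
            + a[1] ** 2 * R[0][0]) / det
    return 1.0 / quad

a = [1.0, 1.0]       # hypothetical pilot gains at the two receive antennas
g = [0.8, -0.6]      # hypothetical interference gains at the two receive antennas
noise_var = 0.01     # high-SNR regime, where the interference dominates

# One receive antenna: interference and noise simply add up.
var_one_antenna = (g[0] ** 2 + noise_var) / a[0] ** 2
# Two antennas, but the cross-antenna correlation of the interference is ignored.
var_uncorrelated = blue_variance(a, covariance(g, noise_var, correlated=False))
# Two antennas with the true covariance: the BLUE can largely null the interference.
var_two_antennas = blue_variance(a, covariance(g, noise_var, correlated=True))
```

In this toy setting, the variance drops from 0.65 with one antenna to roughly 0.005 with two antennas, far more than the reduction obtained when the correlation is ignored. The gain relies on the interference reaching the antennas with a signature that differs from that of the pilot contribution.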
Note that the rank of $\tilde{R}^{-1}$ also increases with the number of pilot OFDM symbols. Comparing Figure 4 with Figure 3, we can indeed observe a performance improvement. However, for faster fading channels, multiple OFDM symbols work better only at low-to-moderate SNR, but suffer from a noise floor at high SNR, where the BEM modeling error plays a dominant role. The BEM modeling error will become larger if more OFDM symbols are considered and/or the channel varies faster. Increasing the BEM order Q can enhance the BEM modeling performance at the penalty that more channel unknowns need to be estimated. An alternative is not to estimate the channel of all the OFDM symbols, but only the middle part, e.g., the 3rd and 4th symbols. This means that the channel estimator will work like an overlapping sliding window, an approach that is adopted in [25].
Study Case 3: Long Channels
Study Case 4: Why Comb-type I Fails for Long Channels
Study Case 5: Convergence performance
6 Conclusions
In this paper, we have discussed how to design pilots to estimate time-varying channels in a MIMO-OFDM system. We underline that the proposed pilot design criteria can be made (almost) independent of the oversampling factor of the (O)CE-BEM such that each receiver can independently choose the best (O)CE-BEM.
We have compared the performance of three different pilot structures, all conforming to the proposed design criteria. By means of simulations, we have shown that

Each pilot OFDM symbol should contain as few pilot clusters as possible provided there are more than the channel order.

Comb-type pilots can estimate the time-varying channel better than the Block-type pilots because they suffer from a smaller interpolation error.

For comb-type pilots, it is possible to improve the channel estimation performance by employing more receive antennas, which combats the interference more effectively.
Appendices
A Proof of Theorem 1
Compared to (13), we keep here only the rows/columns that correspond to the positions of the nonzero pilots, which are represented by $\bar{\mathcal{P}}^{(m)}$. In addition, we have dropped the observation sample index $\mathcal{O}$ in the above as a result of Assumption 1.
The following two lemmas determine the rank of $\bar{A}_{\mathrm{c}}^{(n)}$ and $\bar{A}_{\mathrm{d}}^{(n)}$.
Lemma 1. If K/[N_{T}(Q + 1)] ≥ G, and $\mu^{(m+1)} - \mu^{(m)} > \frac{KQ}{\kappa(K+L)}$, the matrix $\bar{A}_{\mathrm{c}}^{(n)}$ has full column-rank N_{T}G(Q + 1).
In the above, we have downsampled the BEM sequence u_{q} into length-G subsequences with the xth subsequence being $u_{q,x} := [[u_{q}]_{xG},\dots,[u_{q}]_{(x+1)G-1}]^{T}$ for x = 0, ..., X - 1.
With $\bar{A}_{\mathrm{c}}^{(n)} = [\bar{A}_{\mathrm{c}}^{(0,n)},\dots,\bar{A}_{\mathrm{c}}^{(N_{\mathrm{T}}-1,n)}]$, we apply the procedure from (49) until (55) on all the submatrices $\bar{A}_{\mathrm{c}}^{(m,n)}$ for m = 0, ..., N_{T} - 1. It is not difficult to realize that the rank of $\bar{A}_{\mathrm{c}}^{(n)}$ is determined by the rank of the matrix $[\bar{\Phi}^{(0)},\dots,\bar{\Phi}^{(N_{\mathrm{T}}-1)}]$ multiplied by G. It is tall if X = K/G ≥ N_{T}(Q + 1). Besides, it contains distinct columns of a larger κX(K + L)-point DFT matrix if μ^{(m+1)}κ(K + L) > μ^{(m)}κ(K + L) + KQ, and is hence of full column-rank. □
Lemma 2. If G ≥ (L + 1), the matrix $\bar{A}_{\mathrm{d}}^{(n)}$ has full column-rank N_{T}(L + 1)(Q + 1).
is determined by the rank of $V_{\mathrm{L}}^{\{\bar{\mathcal{P}}^{(m)},:\}}$. The latter is a submatrix of the Vandermonde matrix W_{K}, and is thus of full column-rank L + 1 if G ≥ L + 1.
In this case, the matrix $\bar{A}_{\mathrm{d}}^{(n)}$ is of full column-rank N_{T}(L + 1). □
Combining Lemma 1 and Lemma 2 concludes the proof.
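The conditions of the two lemmas are simple enough to be checked programmatically for a candidate pilot design. The sketch below encodes them exactly as stated in Lemma 1 and Lemma 2; the function names and the example parameters (apart from K = 64) are hypothetical.

```python
def lemma1_holds(K, L, Q, kappa, n_tx, G, mu):
    """K / [N_T (Q + 1)] >= G and mu[m+1] - mu[m] > K Q / (kappa (K + L))."""
    enough_room = K / (n_tx * (Q + 1)) >= G
    min_gap = K * Q / (kappa * (K + L))
    gaps_ok = all(mu[m + 1] - mu[m] > min_gap for m in range(len(mu) - 1))
    return enough_room and gaps_ok

def lemma2_holds(L, G):
    """G >= L + 1."""
    return G >= L + 1

def pilot_design_ok(K, L, Q, kappa, n_tx, G, mu):
    """Both lemma conditions together."""
    return lemma1_holds(K, L, Q, kappa, n_tx, G, mu) and lemma2_holds(L, G)

K, L, Q, n_tx = 64, 3, 2, 2   # L, Q and the antenna count are hypothetical
critical = K / (K + L)        # (C)CE-BEM, for which the minimum gap equals Q

good_design = pilot_design_ok(K, L, Q, critical, n_tx, G=8, mu=[0, 3])
too_close = pilot_design_ok(K, L, Q, critical, n_tx, G=8, mu=[0, 1])
too_few_clusters = pilot_design_ok(K, L, Q, critical, n_tx, G=2, mu=[0, 3])
too_many_clusters = pilot_design_ok(K, L, Q, critical, n_tx, G=16, mu=[0, 3])
```

With these example values, G must lie between L + 1 = 4 and K/[N_T(Q + 1)] ≈ 10.7, and the first nonzero pilots of the two transmit antennas must be more than two positions apart.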
B Proof of Theorem 2
where $\bar{A}_{\mathrm{c}}^{(n)T}[j_{v}]$ and $\bar{A}_{\mathrm{d}}^{(n)T}[j_{v}]$ are defined in (48) but with the symbol index j_{v} added.
with K' := κV(K + L). As in Lemma 1, the rank of $\bar{A}_{\mathrm{c}}^{(n)}[j_{v}]$ is determined by the rank of $[\bar{\Phi}^{(0)},\dots,\bar{\Phi}^{(N_{\mathrm{T}}-1)}]$ multiplied by G. It is tall if X = K/G ≥ N_{T}(Q + 1). Besides, if μ^{(m+1)}K' > μ^{(m)}K' + KQ, this matrix contains distinct columns of a larger XK'-point DFT matrix, and is in that case of full column-rank.
Because $\tilde{\bar{\mathcal{P}}}^{(m)}$ contains VG distinct elements, $V_{\mathrm{L}}^{\{\tilde{\bar{\mathcal{P}}}^{(m)},:\}}$ is a tall Vandermonde matrix if VG ≥ L + 1.
Since $\tilde{\bar{A}}_{\mathrm{c}}^{(n)}$ and $\tilde{\bar{A}}_{\mathrm{d}}^{(n)}$ are both of full column-rank, we can utilize the rank inequality in [24] to conclude the proof.
C Proof of Theorem 3 and Corollary 1
The first matrix on the right-hand side of the above will have a full column-rank if X ≥ N_{T} and $\mu^{(m)} \ne \mu^{(m')}$. This means that there exists at least one column in Θ that is not all-zero. Hence, (62) cannot hold, and the matrix in (61) has a full column-rank. This concludes the proof of Theorem 3.