The theory of compressive sensing (CS) has shown that it is possible to accurately reconstruct a sparse signal from few (relative to the signal dimension) projection measurements [1, 2]. Though such a reconstruction is crucial for visually inspecting the signal, in many instances one is solely interested in identifying whether the underlying signal is one of several possible signals of interest. In such situations, a complete reconstruction is computationally expensive and does not optimize the correct performance metric. Recently, CS ideas have been exploited in [3–5] to perform target detection and classification from projection measurements without reconstructing the underlying signal of interest. In [3, 5], the authors propose nearest-neighbor based methods to classify a signal $\mathit{f}\in {\mathbb{R}}^{N}$ to one of m known signals given projection measurements of the form $\mathit{y}=\mathit{A}\mathit{f}+\mathit{n}\in {\mathbb{R}}^{K}$ for K≤N, where $\mathit{A}\in {\mathbb{R}}^{K\times N}$ is a known projection operator and $\mathit{n}\sim \mathcal{N}\left(0,{\sigma}^{2}\mathit{I}\right)$ is additive Gaussian noise. This model is simple to analyze but impractical, since in reality a signal is always corrupted by some kind of interference or background noise. Extending the methods in [3, 5] to handle background noise is nontrivial. Although Duarte et al. [4] provide a way to account for background contamination, they make the strong assumption that the signal of interest and the background are sparse in incoherent bases, which may not hold in many applications. Recent works on CS [6, 7] allow the input signal f to be corrupted by pre-measurement noise $\mathit{b}\sim \mathcal{N}\left(0,{\sigma}_{b}^{2}\mathit{I}\right)$, so that one observes y=A(f + b) + n, and study reconstruction performance as a function of the number of measurements, the pre- and post-measurement noise statistics, and the dimension of the input signal.
In this work, however, we are interested in performing target detection without an intermediate reconstruction step. Furthermore, the increased utility of high-dimensional imaging techniques such as spectral imaging or videography in applications like remote sensing, biomedical imaging, and astronomical imaging [8–15] necessitates extending compressive target detection ideas to such imaging modalities, in order to achieve reliable target detection from fewer measurements relative to the ambient signal dimensions.
For example, recent advances in CS have led to the development of new spectral imaging platforms which attempt to address challenges in conventional imaging platforms related to system size, resolution, and noise by acquiring fewer compressive measurements than spatio-spectral voxels [16–21]. However, these system designs have a number of degrees of freedom which influence subsequent data analysis. For instance, the single-shot compressive spectral imager discussed in [18] collects one coded projection of each spectrum in the scene. One projection per spectrum is sufficient for reconstructing spatially homogeneous spectral images, since projections of neighboring locations can be combined to infer each spectrum. Significantly more projections are required for detecting targets of unknown strengths without the benefit of spatial homogeneity. We are interested in investigating how several such systems can be used in parallel to reliably detect spectral targets and anomalies from different coded projections.
In general, we consider a broadly applicable framework that allows us to account for background and sensor noise, and perform target detection directly from projection measurements of signals obtained at different spatial or temporal locations. The precise problem formulation is provided below.
Problem formulation
Let us assume access to a dictionary of possible targets of interest
$\mathcal{D}=\{{\mathit{f}}^{\left(1\right)},{\mathit{f}}^{\left(2\right)},\dots ,{\mathit{f}}^{\left(m\right)}\}$, where
${\mathit{f}}^{\left(j\right)}\in {\mathbb{R}}^{N}$ for j=1,…,m is unit-norm. Our measurements are of the form
${\mathit{z}}_{i}=\mathit{\Phi}({\alpha}_{i}{\mathit{f}}_{i}^{\ast}+{\mathit{b}}_{i})+{\mathit{w}}_{i}$
(1)
where

• i∈{1,…,M} indexes the spatial or temporal locations at which data are collected;

• α_{i}≥0 is a measure of the signal-to-noise ratio at location i, which is either known or estimated from observations;

• $\mathit{\Phi}\in {\mathbb{R}}^{K\times N}$ for K < N is a measurement matrix to be specified in Section “Whitening compressive observations”;

• ${\mathit{b}}_{i}\in {\mathbb{R}}^{N}\sim \mathcal{N}({\mathit{\mu}}_{b},{\mathit{\Sigma}}_{b})$ is the background noise vector, and ${\mathit{w}}_{i}\in {\mathbb{R}}^{K}\sim \mathcal{N}(0,{\sigma}^{2}\mathit{I})$ is the i.i.d. sensor noise.
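As a concrete illustration, the measurement model (1) can be simulated in a few lines (a minimal sketch: the dimensions, noise levels, and random dictionary below are placeholder choices, not values from this article):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M, m = 128, 32, 50, 4          # signal dim, measurements, locations, dictionary size
sigma, sigma_b = 0.05, 0.1           # sensor and background noise levels

# Dictionary of unit-norm targets f^(1), ..., f^(m)
D = rng.standard_normal((m, N))
D /= np.linalg.norm(D, axis=1, keepdims=True)

Phi = rng.standard_normal((K, N)) / np.sqrt(K)   # measurement matrix, K < N

labels = rng.integers(0, m, size=M)              # which dictionary target at each location
alpha = rng.uniform(0.5, 2.0, size=M)            # per-location signal strengths alpha_i >= 0

Z = np.empty((M, K))
for i in range(M):
    f_star = D[labels[i]]
    b = sigma_b * rng.standard_normal(N)          # background noise b_i (isotropic here)
    w = sigma * rng.standard_normal(K)            # sensor noise w_i
    Z[i] = Phi @ (alpha[i] * f_star + b) + w      # z_i = Phi(alpha_i f_i* + b_i) + w_i

print(Z.shape)  # (50, 32)
```

Each row of `Z` is one compressive observation z_{i}; note that K = 32 measurements are taken per location even though the signals live in N = 128 dimensions.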
For example, in the case of spectral imaging
${\mathit{f}}_{i}^{\ast}$ represents the spectrum at the ith spatial location, and in video sequences
${\mathit{f}}_{i}^{\ast}$ represents the vectorized image frame obtained at the ith time interval. In this article we consider the following target detection problems:
 (1)
Dictionary signal detection (DSD): Here we assume that each ${\mathit{f}}_{i}^{\ast}\in \mathcal{D}$ for i∈{1,…,M}, and our task is to detect all instances of one target signal ${\mathit{f}}^{\left(j\right)}\in \mathcal{D}$ for some unknown j∈{1,…,m}, i.e., to locate $S=\left\{i:{\mathit{f}}_{i}^{\ast}={\mathit{f}}^{\left(j\right)}\right\}$. DSD is useful in contexts in which we know the makeup of a scene and wish to focus our attention on the locations of a particular signal. For instance, in spectral imaging, DSD is used to study a scene of interest by classifying every spectrum in the scene to different known classes [11, 22]. In a video setup, DSD could be used to classify video segments to one of several categories (such as news, weather, sports, etc.) by projecting the video sequence to an appropriate feature space and comparing the feature vectors to the ones in a known dictionary [23].
 (2)
Anomalous signal detection (ASD): Here, our task is to detect all signals which are not members of our dictionary, i.e., detect $S=\left\{i:{\mathit{f}}_{i}^{\ast}\notin \mathcal{D}\right\}$ (this is akin to anomaly detection methods in the literature which are based on nominal, nonanomalous training samples [24, 25]). For instance, ASD may be used when we know most components of a spectral image and wish to identify all spectra which deviate from this model [26].
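To make the DSD task concrete, a naive baseline (purely illustrative; this is not the detection procedure developed in this article, which whitens the background and accounts for target priors) is nearest-neighbor classification in the measurement domain: compare each observation z_{i} against the projected dictionary elements Φf^{(j)} and report the locations assigned to the target of interest.

```python
import numpy as np

def dsd_detect(Z, Phi, D, target_j):
    """Toy DSD baseline: assign each measurement to its nearest projected
    dictionary element (by normalized correlation), then return the
    estimated set S of locations assigned to target j."""
    P = D @ Phi.T                                    # projected dictionary, shape (m, K)
    P = P / np.linalg.norm(P, axis=1, keepdims=True)
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    labels_hat = np.argmax(Zn @ P.T, axis=1)         # nearest-neighbor labels
    return np.flatnonzero(labels_hat == target_j)    # estimate of S = {i : f_i* = f^(j)}
```

For example, with `Phi = np.eye(4)`, a dictionary `D = np.eye(3, 4)`, and noiseless observations `Z = D[[0, 1, 0]]`, the call `dsd_detect(Z, Phi, D, 0)` returns the index set `[0, 2]`.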
Our goal is to accurately perform DSD or ASD without reconstructing the spectral input ${\mathit{f}}_{i}^{\ast}$ from z_{i} for i∈{1,…,M}. Accounting for background is a crucial issue. Typically, the background corresponding to the scene of interest and the sensor noise are modeled together by a colored multivariate Gaussian distribution [27]. However, in our case it is important to distinguish the two because of the presence of the projection operator Φ: the projection operator acts upon the background spectrum in the same way as on the target spectrum, but it does not affect the sensor noise. We assume that b_{i} and w_{i} are independent of each other, and that the prior probabilities of the different targets in the dictionary, ${p}^{\left(j\right)}=\mathbb{P}\left({\mathit{f}}_{i}^{\ast}={\mathit{f}}^{\left(j\right)}\right)$ for j∈{1,⋯,m}, are known in advance. If these probabilities are unknown, the targets can be considered equally likely. Given this setup, our goal is to develop suitable target and anomaly detection approaches and provide theoretical guarantees on their performance.
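The asymmetry between b_{i} and w_{i} can be made explicit: under model (1), the effective noise at the detector is Φb_{i} + w_{i}, whose covariance ΦΣ_{b}Φ^{T} + σ²I mixes background and sensor contributions. A minimal whitening sketch follows (the function name and interface are our own illustration; the specific construction used in Section “Whitening compressive observations” may differ):

```python
import numpy as np

def whiten(Z, Phi, Sigma_b, sigma, mu_b):
    """Whiten compressive observations: the combined noise Phi b_i + w_i
    has covariance C = Phi Sigma_b Phi^T + sigma^2 I.  Removing the
    projected background mean and multiplying by C^{-1/2} yields
    observations with i.i.d. unit-variance Gaussian noise."""
    K = Phi.shape[0]
    C = Phi @ Sigma_b @ Phi.T + sigma**2 * np.eye(K)
    evals, evecs = np.linalg.eigh(C)                  # C is symmetric positive definite
    C_inv_sqrt = evecs @ np.diag(evals**-0.5) @ evecs.T
    return (Z - Phi @ mu_b) @ C_inv_sqrt.T
```

Note that C is always invertible when σ > 0, so whitening is well defined even if ΦΣ_{b}Φ^{T} is rank deficient.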
In this article, we develop detection performance bounds which show how performance scales with the number of detectors in a compressive setting as a function of SNR, the similarity between potential targets in a known dictionary, and their prior probabilities. Our bounds are based on a detection strategy which operates directly on the collected data, as opposed to first reconstructing each ${\mathit{f}}_{i}^{\ast}$ and then performing detection on the estimated signals. Reconstruction as an intermediate step in detection may be appealing to end users who wish to visually inspect spectral images instead of relying entirely on an automatic detection algorithm. However, this intermediate step has two potential pitfalls. First, the Rao–Blackwell theorem [28] tells us that an optimal detection algorithm operating on processed data (i.e., on statistics that are not sufficient) cannot perform better than an optimal detection algorithm operating on the raw data. In other words, optimal performance is possible on the raw data, but we have no such performance guarantee for the reconstructed signals. Second, the relationship between reconstruction errors and detection performance is not well understood in many settings. Although we do not reconstruct the underlying signals, our performance bounds are intimately related to the signal resolution needed to capture the signal diversity present in our dictionary. Since we have many fewer observations than signal samples at this resolution, we adopt the “compressive” terminology.
Performance metric
To assess the performance of our detection strategies, we consider the false discovery rate (FDR) metric and related quantities developed for multiple hypothesis testing problems [29]. Since we collect M independent observations of potentially different signals, we are simultaneously conducting M hypothesis tests when we search for targets. Unlike the probability of false alarm, which measures the probability of falsely declaring a target for a single test, the FDR measures the fraction of declared targets that are false alarms; that is, it provides information about the entire set of M hypotheses instead of just one. More formally, the FDR is given by
$\mathrm{\text{FDR}}=\mathbb{E}\left[\frac{V}{R}\right],$
where V is the number of falsely rejected null hypotheses, and R is the total number of rejected null hypotheses (with V/R defined as 0 when R=0). Controlling the FDR in a multiple hypothesis testing framework is akin to designing a constant false alarm rate (CFAR) detector in spectral target detection applications, which keeps the false alarm rate at a desired level irrespective of the background interference and sensor noise statistics [22].
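For concreteness, FDR can be controlled at a prescribed level q with the classical Benjamini–Hochberg step-up procedure; a minimal sketch (assuming per-location p-values have already been computed from the test statistics) is:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure: given p-values for M
    simultaneous tests, reject all hypotheses with p_(i) <= (i/M) q for
    the largest such i, which controls FDR = E[V/R] at level q for
    independent tests.  Returns indices of rejected hypotheses."""
    pvals = np.asarray(pvals, dtype=float)
    M = len(pvals)
    order = np.argsort(pvals)                     # sort p-values ascending
    thresh = q * np.arange(1, M + 1) / M          # BH thresholds (i/M) q
    below = np.flatnonzero(pvals[order] <= thresh)
    if below.size == 0:
        return np.array([], dtype=int)            # no rejections
    k = below.max()                               # largest i with p_(i) <= (i/M) q
    return np.sort(order[:k + 1])                 # reject the k+1 smallest p-values
```

For example, `benjamini_hochberg(np.array([0.001, 0.008, 0.039, 0.041, 0.6]), q=0.05)` rejects only the first two hypotheses: the third p-value 0.039 exceeds its threshold (3/5)·0.05 = 0.03.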
Previous investigations
Much of the classical target detection literature [30–34] assumes that each target lies in a P-dimensional subspace of ${\mathbb{R}}^{N}$ for P < N. The subspace in which the target lies is often assumed to be known or specified by the user, and the variability of the background is modeled using a probability distribution. Given knowledge of the target subspace, background statistics, and sensor noise statistics, detection methods based on likelihood ratio tests (LRTs) and generalized likelihood ratio tests (GLRTs) have been proposed in [30–35]. A subspace model is optimal if the subspace in which targets lie is known in advance; however, in many applications such subspaces might be hard to characterize. An alternative, more flexible option is to assume that the high-dimensional target exhibits some low-dimensional structure that can be exploited to perform efficient target detection. This approach is utilized in this work and in [5], where the target signal in ${\mathbb{R}}^{N}$ is assumed to come from a dictionary of m known signals such that m≪N, and in [3], where the targets are assumed to lie on a low-dimensional manifold embedded in the high-dimensional target space.
Recently, several methods for target or anomaly detection that rely on recovering the full spatio-spectral data from projection measurements [36, 37] have been proposed. However, they are computationally intensive, and the detection performance associated with these reconstructions is unknown. Other researchers have exploited CS to perform target detection and classification without reconstructing the underlying signal [3–5]. Duarte et al. [4] propose a matching pursuit based algorithm, called the incoherent detection and estimation algorithm (IDEA), to detect the presence of a signal of interest against a strong interfering signal from noisy projection measurements. The algorithm is shown to perform well on experimental data sets under some strong assumptions on the sparsity of the signal of interest and the interfering signal. Davenport et al. [3] develop a classification algorithm, called the smashed filter, to classify an image in ${\mathbb{R}}^{N}$ to one of m known classes from K projections of the signal, where K < N. The underlying image is assumed to lie on a low-dimensional manifold, and the algorithm finds the closest match among the m known classes by performing a nearest-neighbor search over the m different manifolds. The projection measurements are chosen to preserve the distances among the manifolds. Though Davenport et al. [3] offer theoretical bounds on the number of measurements necessary to preserve distances among different manifolds, it is not clear how the performance scales with K or how to incorporate background models into this setup. Moreover, this approach may be computationally intensive, since it involves learning and searching over different manifolds. Haupt et al. [5] use a nearest-neighbor classifier to classify an N-dimensional signal to one of m equally likely target classes based on K < N random projections, and provide theoretical guarantees on the detector performance.
While the method discussed in [5] is computationally efficient, it is nontrivial to extend to the case of target detection with colored background noise and non-equiprobable targets. Furthermore, their performance guarantees cannot be directly extended to our problem, since we focus on error measures that let us analyze the performance of multiple hypothesis tests simultaneously, as opposed to the above methods, which consider compressive classification performance for a single hypothesis test.
The authors of a more recent work [38] extend the classical RX anomaly detector [39] to directly detect anomalies from random, orthonormal projection measurements without an intermediate reconstruction step. They numerically show how the detection probability improves as a function of the signal-to-noise ratio as the number of measurements changes. Though probability of detection is a good performance measure, in many applications controlling the false discoveries below a desired level is more crucial. As a result, in our work we propose an anomaly detection method that controls the FDR below a desired level.
Contributions
This article makes the following contributions to the above literature:

A compressive target detection approach which (a) is computationally efficient, (b) allows the signal strengths of the targets to vary with spatial location, (c) allows for backgrounds mixed with potential targets, (d) considers targets with different a priori probabilities, and (e) yields theoretical guarantees on detector performance. This article unifies preliminary work by the authors [40, 41], presents previously unpublished aspects of the proofs, and contains updated experimental results.

A computationally efficient anomaly detection method that detects anomalies of different strengths from projection measurements and also controls the FDR at a desired level.

A whitening filter approach to compressive measurements of signals with background contamination, and associated analysis leading to bounds on the amount of background to which our detection procedure is robust.
The above theoretical results, which are the main focus of this article, are supported with simulation studies in Section “Experimental results”. Classical detection methods described in [22, 26, 27, 30–35, 39, 42–45] do not establish performance bounds as a function of signal resolution or target dictionary properties, and they rely on relatively direct observation models which we show to be suboptimal when the detector size is limited. The methods in [3, 4] do not contain performance analysis, and our analysis builds upon that in [5] to account for several specific aspects of the compressive target detection problem.