Plug-and-Play Priors (PnP) is one of the most widely-used frameworks for solving computational imaging problems through the integration of physical models and learned models. PnP leverages high-fidelity physical sensor models and powerful machine learning methods for prior modeling of data to provide state-of-the-art reconstruction algorithms. PnP algorithms alternate between minimizing a data-fidelity term to promote data consistency and imposing a learned regularizer in the form of an image denoiser. Recent highly-successful applications of PnP algorithms include bio-microscopy, computerized tomography, magnetic resonance imaging, and joint ptycho-tomography. This article presents a unified and principled review of PnP by tracing its roots, describing its major variations, summarizing main results, and discussing applications in computational imaging. We also point the way towards further developments by discussing recent results on equilibrium equations that formulate the problem associated with PnP algorithms.
- HISTORICAL BACKGROUND
Consider the inverse problem of estimating an unknown image $x \in \mathbb{R}^n$ from its noisy measurements $y \in \mathbb{R}^m$. It is common to formulate this problem as the optimization
$$\hat{x} = \operatorname*{argmin}_{x \in \mathbb{R}^n} f(x) \quad \text{with} \quad f(x) = g(x) + h(x), \qquad (1)$$
where $g$ is a data-fidelity term that quantifies consistency with the observed measurements $y$ and $h$ is a regularizer that enforces prior knowledge on $x$. The formulation in Eq. (1) corresponds to the maximum a posteriori probability (MAP) estimator when
$$g(x) = -\log(p_{y|x}(y \mid x)) \quad \text{and} \quad h(x) = -\log(p_x(x)), \qquad (2)$$
Figure 1. Alternating direction method of multipliers (ADMM) and fast iterative shrinkage/thresholding algorithm (FISTA) are two widely-used iterative algorithms for minimizing composite functions f(x) = g(x) + h(x) where the regularization term h is nonsmooth. Both algorithms avoid differentiating h by evaluating its proximal operator.
where $p_{y|x}$ is the likelihood relating $x$ to the measurements $y$ and $p_x$ is the prior distribution. For example, given measurements of the form $y = Ax + e$, where $A$ is the measurement operator (also known as the forward operator) characterizing the response of the imaging instrument and $e$ is additive white Gaussian noise (AWGN), the data-fidelity term reduces to the least-squares function $g(x) = \frac{1}{2}\|y - Ax\|_2^2$. On the other hand, many popular image regularizers are based on a sparsity-promoting penalty $h(x) = \tau\|Wx\|_1$, where $\tau > 0$ is the regularization parameter and $W$ is a suitable transform. Over the years, a variety of reasonable choices of $h$ have been proposed, with examples including the total variation (TV) and Markov random field functions.
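To make the two terms concrete, here is a small numerical sketch of the least-squares data-fidelity term and the sparsity-promoting regularizer. The sizes are toy values, and both the forward operator `A` and the transform `W` are taken as identity matrices purely for illustration:

```python
import numpy as np

def g(x, y, A):
    """Least-squares data-fidelity term g(x) = 0.5 * ||y - A x||_2^2."""
    r = y - A @ x
    return 0.5 * float(r @ r)

def h(x, W, tau):
    """Sparsity-promoting regularizer h(x) = tau * ||W x||_1."""
    return tau * float(np.abs(W @ x).sum())

# Toy example (identity A and W chosen only for illustration).
rng = np.random.default_rng(0)
A = np.eye(4)
W = np.eye(4)
x = rng.standard_normal(4)
y = A @ x  # noiseless measurements, so the data-fidelity term vanishes
print(g(x, y, A), h(x, W, 1.0))
```

In practice `A` encodes the imaging physics (e.g., a blur or tomographic projection) and `W` a sparsifying transform such as a wavelet frame.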
These functions have elegant analytical forms, and have had a major impact in applications ranging from tomography for medical imaging to image denoising for cell-phone cameras.
The solution of Eq. (1) balances the requirements to be both data-consistent and plausible, which can be intuitively interpreted as finding a balance between two manifolds: the sensor manifold and the prior manifold. The sensor manifold is represented by small values of g(x), and in the case of a linear forward model, is roughly an affine subspace of $\mathbb{R}^n$. Likewise, the prior manifold is represented by small values of h(x) and includes the images that are likely to occur in our application. Importantly, real images have enormous amounts of structure, departures from which are immediately noticeable to a domain expert. Consequently, plausible images lie near a lower dimensional manifold in the higher dimensional embedding space.
Proximal algorithms are often used for solving problems of the form in Eq. (1) when g or h are nonsmooth. One of the most widely-used and effective proximal algorithms is the alternating direction method of multipliers (ADMM), which uses an augmented Lagrangian formulation to allow for alternating minimization of each function in turn (see  for an overview of ADMM).
ADMM computes the solution of Eq. (1) by iterating the steps summarized in Algorithm 1 until convergence.
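As an illustration of these alternating proximal steps, the following is a minimal sketch of ADMM for a denoising problem, assuming an identity forward operator (so the data-fidelity proximal map has a closed form) and an ℓ1 regularizer (whose proximal map is soft-thresholding). Variable names and step ordering are one common convention, not a reference implementation:

```python
import numpy as np

def prox_l2(v, y, gamma):
    # Proximal map of g(x) = 0.5*||y - x||^2 (identity forward operator):
    # argmin_x 0.5*||y - x||^2 + (1/(2*gamma))*||x - v||^2
    return (gamma * y + v) / (gamma + 1.0)

def prox_l1(v, t):
    # Soft-thresholding: proximal map of t * ||x||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm(y, tau, gamma, iters=200):
    x = np.zeros_like(y)
    s = np.zeros_like(y)  # scaled dual variable
    for _ in range(iters):
        z = prox_l2(x - s, y, gamma)     # data-fidelity step
        x = prox_l1(z + s, gamma * tau)  # regularization step
        s = s + z - x                    # dual update
    return x
```

For this simple problem the minimizer of $\frac{1}{2}\|y - x\|^2 + \tau\|x\|_1$ is soft-thresholding of y by τ, so the iterates can be checked against that closed form.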
This interpretation of the proximal operator as a MAP image denoiser inspired the development of Plug-and-Play Priors (PnP) in , where the $\mathrm{prox}_{\gamma h}$ step in ADMM is simply replaced by a more general black-box denoiser $D : \mathbb{R}^n \to \mathbb{R}^n$, such as BM3D. That is, any black-box denoiser D can in principle replace ("plug" in for) $\mathrm{prox}_{\gamma h}$, and then the ADMM algorithm can run ("play") as before.
We refer to this original algorithm as PnP-ADMM in order to distinguish it from other methods inspired by this Plug-and-Play approach. In fact, there are multiple algorithms using proximal maps to minimize a sum of convex functions, and for each of these algorithms, there is a corresponding PnP version obtained by associating the proximal map with the prior term, then replacing the proximal map with a black-box denoiser. Below we provide more detail on PnP-ADMM and PnP-FISTA (based on the proximal-gradient method), as well as extensions and variations. See  for the roots of FISTA,  for more detail on proximal splitting methods in general, and  for a tutorial overview of some PnP methods.
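A minimal sketch of PnP-FISTA along these lines passes the denoiser in as an arbitrary black-box callable, replacing the proximal map of the prior after each gradient step on g. The identity "denoiser" in the usage example is purely a placeholder for a real denoiser such as BM3D or a CNN:

```python
import numpy as np

def pnp_fista(grad_g, denoiser, x0, gamma, iters=100):
    """PnP-FISTA sketch: gradient step on the data-fidelity term g,
    then a black-box denoiser in place of prox of the regularizer."""
    x = x0.copy()
    s = x0.copy()
    q = 1.0  # momentum parameter
    for _ in range(iters):
        x_new = denoiser(s - gamma * grad_g(s))            # gradient + denoise
        q_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * q * q))   # momentum update
        s = x_new + ((q - 1.0) / q_new) * (x_new - x)      # extrapolation
        x, q = x_new, q_new
    return x

# Usage sketch: g(x) = 0.5*||y - x||^2, so grad_g(x) = x - y.
y = np.array([1.0, 2.0, 3.0])
grad_g = lambda x: x - y
denoiser = lambda v: v  # placeholder identity "denoiser"
x_hat = pnp_fista(grad_g, denoiser, np.zeros(3), gamma=0.5)
```

With the identity denoiser and a quadratic g, the iteration reduces to accelerated gradient descent and converges to y; swapping in a real denoiser yields the PnP behavior described above.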
- PLUG-AND-PLAY INTEGRATION OF PHYSICAL AND LEARNED MODELS
Deep learning (DL) has emerged as a powerful paradigm for designing algorithms for various image restoration and reconstruction tasks, including denoising, deblurring, and super-resolution (the literature is vast, but see  for an early history).
Given a set of paired data $(x_i, z_i)$, where $x_i$ is the desired "ground truth" image and $z_i$ is its noisy or corrupted observation, the traditional supervised DL strategy is to learn a mapping from $z_i$ to $x_i$ by training a deep convolutional neural network (CNN). Despite its empirical success in computational imaging, an important drawback of DL relative to regularized inversion is the potential need to retrain the CNN for different measurement operators and noise levels.
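The supervised strategy can be sketched in a toy form, with a single linear layer standing in for the deep CNN and synthetic paired data; all sizes, noise levels, and learning-rate values here are illustrative assumptions:

```python
import numpy as np

# Synthetic paired training data (x_i, z_i): clean signals and noisy copies.
rng = np.random.default_rng(0)
n, n_pairs = 8, 200
X = rng.standard_normal((n_pairs, n))            # "ground truth" signals x_i
Z = X + 0.1 * rng.standard_normal((n_pairs, n))  # noisy observations z_i

# A single linear layer W stands in for a deep CNN denoiser.
W = np.zeros((n, n))
lr = 0.05
for _ in range(500):
    # Gradient of the mean-squared loss 0.5 * mean ||W z_i - x_i||^2 w.r.t. W.
    grad = (Z @ W.T - X).T @ Z / n_pairs
    W -= lr * grad

mse = np.mean((Z @ W.T - X) ** 2)  # training error of the learned mapping
```

A real DL denoiser replaces the linear map with a deep CNN and stochastic optimization, but the training objective, learning a mapping from corrupted to clean images on paired data, is the same.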
The success of CNNs as black-box denoisers leads naturally to their use with PnP (see the inset “Turning an Image Denoising Network into an Image Super-Resolver”). In its simplest form, PnP