\`x^2+y_1+z_12^34\`
Advanced Search
Article Contents
Article Contents

A parallel domain decomposition algorithm for large scale image denoising

  • * Corresponding author: Xiao-Chuan Cai

    * Corresponding author: Xiao-Chuan Cai
Abstract Full Text(HTML) Figure(9) / Table(8) Related Papers Cited by
  • Total variation denoising (TVD) is an effective technique for image denoising, in particular, for recovering blocky, discontinuous images from noisy background. The problem is formulated as an optimization problem in the space of bounded variation functions, and the solution is obtained by solving the associated Euler–Lagrange equation defined on the domain occupied by the entire image. The method offers high quality results, but is computationally expensive for large images, especially for three-dimensional problems. In this paper, we introduce a highly parallel version of the algorithm which formulates the problem as multiple overlapping, but independent, optimization problems, and each is defined on a portion of the image domain. This approach is similar to the overlapping Schwarz type domain decomposition method, but is non-iterative, for solving partial differential equations, and is highly scalable, without using any coarse grids, for parallel computers with a large number of processors. We show by a theory and also by some two- and three-dimensional numerical experiments that the new approach has similar numerical accuracy as the classical TVD approach, but is much more efficient on parallel computers.

    Mathematics Subject Classification: 65N55, 65Y05, 68U10.

    Citation:

    \begin{equation} \\ \end{equation}
  • 加载中
  • Figure 1.  An example of the domain partition. Here dots are image pixels and the image domain is partitioned into $ 16 $ subdomains. The domain with solid red lines is an example of a nonoverlapping subdomain $ \Omega_0^k $ and the corresponding overlapping subdomain $ \Omega_{\delta}^k $ is marked with dashed black lines. The overlapping size $ \delta $ is the distance between the boundaries of $ \Omega_0^k $ and $ \Omega_{\delta}^k $

    Figure 2.  Procedure of NiOS. Step 1: Overlapping domain decomposition (A); Step 2: Distribute overlapping subdomain problems to 4 processors and solve these four overlapping sub-problems (B); Step 3: Combine images from non-overlapping subdomains (C). Here the points with different colors are the image pixels distributed to different processor cores

    Figure 3.  The boat-$ 1024\times1024 $ image denoising results. From left to right: the original image, the noise image with $ \delta^2 = 0.04 $, and the restored image by NiOS. Here $ \alpha = 0.18 $, $ \beta = 10^{-4} $, and the PSNR of the restored image is $ 27.5444 $

    Figure 4.  A comparison of the results obtained by NiOS and NKS. The first row: the clean image (left), the noisy image with $ \sigma^2 = 0.04 $ (right); second row: the local zoom-in of the figures in the first row; third row: the restored images obtained by NiOS (left) and NKS (right); fourth row: the local zoom-in of the figures in the third row. The PSNR of the restored image with NiOS and NKS are $ 28.536962 $ and $ 28.536417 $, respectively

    Figure 5.  The surface plot of the error between NiOS and NKS: $ u_{\mbox{NKS}} - u_{\mbox{NiOS}} $. Here the image size is $ 1024 \hspace{-0.02in}\times\hspace{-0.02in} 1024 $, $ \alpha = 0.18 $, $ \beta = 1.0\times10^{-4} $. The overlapping size is $ 4 $ and the number of processors is $ 4 $

    Figure 6.  The speedup of the NiOS algorithm for the 2D cameraman image denoising problem

    Figure 7.  A noisy image (left) and an example of the partition of the image into 8 sub-images (right)

    Figure 8.  Slice 100 of the 3D MR images. From left to right, the first row: the original T1-w image, the T1-w image with a Racian noise at $ 9\% $ and the restored image; second row: the detailed partial images of the first row images; third row: the original T2-w images, the T2-w image with a Racian noise at $ 9\% $, and the restored image. The last row: the detailed partial images of the third row images

    Figure 9.  The 3D reconstruction of the MR image. From left to right: Top: the clean image, the image with $ 9\% $ Racian noise, and the restored image; Bottom: the corresponding zoomined images

    Table 1.  The comparison of NiOS and NKS. Here $ \beta = 1.0\times10^{-4} $, $ \alpha = 0.18 $, the overlapping size for NiOS is $ 4 $. "Sp" refers to the compute time speedup of the NiOS method compared with the NKS method, which is defined as the time of NKS divides the time of NiOS

    $ n_p $ cameraman-$ 2048\times2048 $ cameraman-$ 4096\times4096 $
    NKS NiOS Sp NKS NiOS Sp
    Newton Time Newton Time Newton Time Newton Time
    32 24 9.8 22 5.0 2.0 30 50.0 26 23.4 2.1
    64 27 6.2 21 2.8 2.2 37 35.6 23 12.6 2.8
    128 25 3.0 20 1.2 2.5 38 17.9 23 6.5 2.8
    256 25 1.7 19 0.6 2.8 33 8.7 21 2.9 3.0
    512 24 0.9 19 0.4 2.3 38 5.3 20 1.4 3.8
    1024 23 1.2 19 0.3 4.0 38 3.6 20 0.8 4.5
     | Show Table
    DownLoad: CSV

    Table 2.  The error of the NiOS method with respect to the parameters $ \alpha $ and $ \beta $. Here ERROR is the relative error $ ||u_{\mbox{NiOS}} - u_{\mbox{NKS}}||_2/||u_{\mbox{true}}||_2 $, $ u_{\mbox{NiOS}} $, $ u_{\mbox{NKS}} $, and $ u_{\mbox{true}} $ are the solution of NiOS, NKS, and the clean image, respectively

    $ \beta $ $ \alpha $ boat-$ 1024\times1024 $ cameraman-$ 2048 \times2048 $
    Newton GMRES ERROR PSNR Newton GMRES ERROR PSNR
    $ 10^{-1} $ 0.01 2 1 $ 2.1\times10^{-7} $ 15.1 2 1 $ 4.4\times10^{-9} $ 15.4
    $ 10^{-1} $ 0.05 3 1 $ 1.4\times10^{-5} $ 17.4 3 1 $ 2.9\times10^{-7} $ 17.6
    $ 10^{-1} $ 0.1 3 2 $ 6.3\times10^{-5} $ 19.5 3 2 $ 7.1\times10^{-6} $ 19.7
    $ 10^{-1} $ 0.15 4 2 $ 1.3\times10^{-4} $ 21.1 4 2 $ 1.6\times10^{-5} $ 21.2
    $ 10^{-1} $ 0.2 4 2 $ 2.0\times10^{-4} $ 22.3 4 2 $ 3.6\times10^{-5} $ 22.4
    $ 10^{-1} $ 0.25 6 2 $ 2.7\times10^{-4} $ 23.2 12 3 $ 6.1\times10^{-5} $ 23.2
    $ 10^{-2} $ 0.01 2 1 $ 9.3\times10^{-7} $ 15.6 2 1 $ 7.5\times10^{-7} $ 15.6
    $ 10^{-2} $ 0.05 4 2 $ 5.7\times10^{-5} $ 18.8 4 2 $ 3.7\times10^{-6} $ 19.0
    $ 10^{-2} $ 0.1 5 2 $ 2.3\times10^{-4} $ 22.3 5 2 $ 4.5\times10^{-5} $ 22.4
    $ 10^{-2} $ 0.15 6 2 $ 4.3\times10^{-4} $ 24.5 6 2 $ 1.4\times10^{-4} $ 24.6
    $ 10^{-2} $ 0.2 8 3 $ 6.3\times10^{-4} $ 25.7 7 3 $ 2.6\times10^{-4} $ 25.8
    $ 10^{-2} $ 0.25 10 3 $ 8.1\times10^{-4} $ 26.4 8 3 $ 3.8\times10^{-4} $ 26.6
    $ 10^{-3} $ 0.01 3 1 $ 2.2\times10^{-6} $ 15.4 3 1 $ 1.3\times10^{-7} $ 15.7
    $ 10^{-3} $ 0.05 9 2 $ 1.3\times10^{-4} $ 19.3 10 2 $ 1.9\times10^{-5} $ 19.6
    $ 10^{-3} $ 0.1 8 3 $ 4.8\times10^{-4} $ 23.9 8 3 $ 1.8\times10^{-4} $ 24.2
    $ 10^{-3} $ 0.15 10 3 $ 8.7\times10^{-4} $ 26.6 12 3 $ 5.0\times10^{-4} $ 26.9
    $ 10^{-3} $ 0.2 20 4 $ 1.2\times10^{-3} $ 27.4 12 4 $ 8.3\times10^{-4} $ 28.0
    $ 10^{-3} $ 0.25 17 4 $ 1.5\times10^{-3} $ 27.3 14 4 $ 1.1\times10^{-3} $ 28.4
    $ 10^{-4} $ 0.01 9 1 $ 3.2\times10^{-6} $ 15.4 5 2 $ 5.4\times10^{-7} $ 15.7
    $ 10^{-4} $ 0.05 16 3 $ 1.9\times10^{-4} $ 19.4 14 3 $ 5.1\times10^{-5} $ 19.7
    $ 10^{-4} $ 0.1 21 4 $ 7.1\times10^{-4} $ 24.5 20 3 $ 4.1\times10^{-4} $ 24.7
    $ 10^{-4} $ 0.15 19 5 $ 1.3\times10^{-3} $ 27.3 19 5 $ 1.0\times10^{-3} $ 27.9
    $ 10^{-4} $ 0.2 22 7 $ 1.7\times10^{-3} $ 27.5 21 6 $ 1.6\times10^{-3} $ 28.7
    $ 10^{-4} $ 0.25 21 15 $ 2.0\times10^{-1} $ 27.6 24 8 $ 2.1\times10^{-3} $ 28.7
     | Show Table
    DownLoad: CSV

    Table 3.  The error of the NiOS method with respect to the parameters $ \alpha $ and $ \beta $ changed proportionally, for example, when $ \alpha $ is reduced by $ \frac{1}{2} $, $ \beta $ is reduced by $ (\frac{1}{2})^4 $. "ERROR" has the same definition as in Table 2

    $ \beta $ $ \alpha $ boat-$ 1024\times1024 $ cameraman-$ 2048 \times2048 $
    Newton GMRES ERROR PSNR Newton GMRES ERROR PSNR
    $ 6.25\times10^{-3} $ 0.1 8 2 $ 2.8\times10^{-4} $ 22.7 6 2 $ 6.4\times10^{-5} $ 22.9
    $ 3.9\times10^{-4} $ 0.05 11 2 $ 1.6\times10^{-4} $ 19.3 11 2 $ 3.0\times10^{-5} $ 19.7
    $ 2.4\times10^{-5} $ 0.025 14 2 $ 4.1\times10^{-5} $ 16.8 12 2 $ 6.8\times10^{-6} $ 17.1
    $ 1.5\times10^{-6} $ 0.0125 17 2 $ 7.6\times10^{-6} $ 15.6 10 2 $ 1.1\times10^{-6} $ 15.9
     | Show Table
    DownLoad: CSV

    Table 4.  Parallel performance of the NiOS algorithm for the 2D cameraman image denoising

    $ n_p $ $ 2048\times2048 $ $ 4096 \times4096 $
    Newton GMRES Time(s) Newton GMRES Time(s)
    64 21 4 1.7 23 4 7.2
    128 20 4 0.8 22 4 3.3
    256 20 4 0.4 21 4 1.6
    512 19 4 0.2 20 4 0.8
    1024 18 4 0.1 20 4 0.4
     | Show Table
    DownLoad: CSV

    Table 5.  The detail of the sub-problem solving in NiOS with 24 processors. Image size is $ 2048\times2048 $. Here "rank" is the processor core ID

    rank Newton GMRES Time rank Newton GMRES Time
    0 11 5 7.3 12 17 5 9.1
    1 11 5 7.3 13 14 5 8.4
    2 17 5 9.2 14 10 5 6.4
    3 19 5 9.8 15 8 5 5.2
    4 12 5 7.2 16 14 5 8.0
    5 12 5 7.3 17 14 5 8.3
    6 12 5 7.3 18 16 5 8.8
    7 12 5 7.6 19 16 5 8.8
    8 16 5 8.8 20 13 5 7.6
    9 17 5 9.0 21 13 5 7.5
    10 11 5 6.8 22 16 5 8.5
    11 12 5 7.3 23 16 5 6.6
     | Show Table
    DownLoad: CSV

    Table 6.  Comparison of the NiOS and NKS for the 3D image denoising. The image size is $ 181\times217\times181 $. "Sp" refers to the compute time speedup of the NiOS method compared with the NKS method

    $ n_p $ NKS NiOS Sp
    Newton GMRES Time(s) Newton GMRES Time(s)
    32 35 17 38.5 35 12 22.0 1.8
    64 36 17 24.5 31 11 12.4 2.0
    128 36 18 12.0 29 10 5.0 2.4
    256 36 18 6.6 26 9 3.3 2.0
    512 39 18 3.9 24 8 1.4 2.8
    1024 35 19 2.4 22 8 0.7 3.4
     | Show Table
    DownLoad: CSV

    Table 7.  The effect of various choices of the overlapping parameter $ \delta $ in the NiOS method for the 3D image. The number of processors is $ 64 $ and the image size is $ 181\times217\times181 $. The PSNR of the NKS method is $ 27.84 $ for this image denoising. Here $ \text{DIFF} = || \mathbf{u}_{\text{NiOS}} - \mathbf{u}_{\text{NKS}}||_2 $ is the difference of the solution of NiOS ($ \mathbf{u}_{\text{NiOS}} $) and NKS ($ \mathbf{u}_{\text{NKS}} $)

    $ \delta $ Newton GMRES Time (s) PSNR DIFF
    0 14 5 2.7 27.83 5.89
    1 14 5 2.8 27.85 1.20
    2 14 5 3.1 27.84 0.36
    3 14 5 3.4 27.84 0.07
     | Show Table
    DownLoad: CSV

    Table 8.  Parallel performance of the NiOS-NKS algorithm for the 3D image denosing. The image size is $ 217\times217\times181 $

    $ n_p $ First Stage Second Stage
    Newton GMRES Time Newton GMRES Time
    32 15 5 4.48 3 7 2.39
    64 14 5 3.5 3 7 1.41
    128 15 5 1.71 3 7 0.77
    256 14 5 1.11 3 7 0.47
    512 14 5 0.45 3 7 0.36
    1024 14 5 0.26 3 8 0.42
     | Show Table
    DownLoad: CSV
  • [1] A. AsaduzzamanA. Martinez and A. Sepehri, A time-efficient image processing algorithm for multicore/manycore parallel computing, Southeastcon, IEEE, (2015).  doi: 10.1109/SECON.2015.7132924.
    [2] G. Aubert and P. Kornprobst, Mathematical Problems in Image Processing: Partial Differential Equations and the Calculus of Variations, Applied Mathematical Sciences, 147. Springer, New York, 2006.
    [3] S. Balay, S. Abhyankar, M. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, A. Dener, V. Eijkhout, W. Gropp, B. Smith, D. Karpeyev, D. Kaushik and et al., PETSc Web page, (2019), https://www.mcs.anl.gov/petsc.
    [4] M. Basu, Gaussian-based edge-detection methods-a survey, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 32 (2002), 252-260.  doi: 10.1109/TSMCC.2002.804448.
    [5] P. BlomgrenT. F. ChanP. MuletL. Vese and W. L. Wan, Variational PDE models and methods for image processing, Numerical analysis, Chapman and Hall CRC Research Notes in Mathematics, Boca Raton, FL, 420 (2000), 43-67. 
    [6] A. BuadesB. Coll and J. M. Morel, Image denoising methods, a new nonlocal principle, SIAM Review, 52 (2010), 113-147.  doi: 10.1137/090773908.
    [7] X.-C. CaiM. Dryja and M. Sarkis, Restricted additive Schwarz preconditioners with harmonic overlap for symmetric positive definite linear systems, SIAM Journal on Numerical Analysis, 41 (2003), 1209-1231.  doi: 10.1137/S0036142901389621.
    [8] X.-C. CaiW. D. GroppD. E. KeyesR. G. Melvin and D. P. Young, Parallel Newton-Krylov-Schwarz algorithms for the transonic full potential equation, SIAM Journal on Scientific Computing, 19 (1998), 246-265.  doi: 10.1137/S1064827596304046.
    [9] A. Chambolle and P.-L. Lions, Image recovery via total variation minimization and related problems, Numerische Mathematik, 76 (1997), 167-188.  doi: 10.1007/s002110050258.
    [10] T. F. Chan and J. H. Shen, Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2005. doi: 10.1137/1.9780898717877.
    [11] T. F. Chan, H. M. Zhou and R. H. Chan, Continuation Method for Total Variation Denoising Problems, Department of Mathematics, University of California, Los Angeles, 1995.
    [12] H. B. ChangX.-C. TaiL.-L. Wang and D. P. Yang, Convergence rate of overlapping domain decomposition methods for the Rudin-Osher-Fatemi model based on a dual formulation, SIAM Journal on Imaging Sciences, 8 (2015), 564-591.  doi: 10.1137/140965016.
    [13] H. B. ChangX. Q. ZhangX.-C. Tai and D. P. Yang, Domain decomposition methods for nonlocal total variation image restoration, Journal of Scientific Computing, 60 (2014), 79-100.  doi: 10.1007/s10915-013-9786-9.
    [14] R. L. Chen and X.-C. Cai, Parallel one-shot Lagrange-Newton-Krylov-Schwarz algorithms for shape optimization of steady incompressible flows, SIAM Journal on Scientific Computing, 34 (2012), B584-B605.  doi: 10.1137/110830769.
    [15] R. L. ChenY. Q. WuZ. Z. YanY. B. Zhao and X.-C. Cai, A parallel domain decomposition method for 3D unsteady incompressible flows at high Reynolds number, Journal of Scientific Computing, 58 (2014), 275-289.  doi: 10.1007/s10915-013-9732-x.
    [16] D. L. CollinsA. P. ZijdenbosV. KollokianJ. G. SledN. J. KabaniC. J. Holmes and A. C. Evans, Design and construction of a realistic digital brain phantom, IEEE Transactions on Medical Imaging, 17 (1998), 463-468.  doi: 10.1109/42.712135.
    [17] P. CoupéP. YgerS. PrimaP. HellierC. Kervrann and C. Barillot, An optimized blockwise nonlocal means denoising filter for 3-D magnetic resonance images, IEEE Transactions on Medical Imaging, 27 (2008), 425-441. 
    [18] X. M. DengX.-C. Cai and J. Zou, A parallel space-time domain decomposition method for unsteady source inversion problems, Inverse Problems and Imaging, 9 (2015), 1069-1091.  doi: 10.3934/ipi.2015.9.1069.
    [19] D. L. Donoho and I. M. Johnstone, Adapting to unknown smoothness via wavelet shrinkage, Journal of the American Statistical Association, 90 (1995), 1200-1224.  doi: 10.1080/01621459.1995.10476626.
    [20] A. EklundP. DufortD. Forsberg and S. M. LaConte, Medical image processing on the GPU-past, Present and Future, Medical Image Analysis, 17 (2013), 1073-1094. 
    [21] L. C. Evans, Partial Differential Equations, Graduate Studies in Mathematics, 19. American Mathematical Society, Providence, RI, 1998.
    [22] M. FornasierA. Langer and C.-B. Schönlieb, A convergent overlapping domain decomposition method for total variation minimization, Numerische Mathematik, 116 (2010), 645-685.  doi: 10.1007/s00211-010-0314-7.
    [23] M. Fornasier and C.-B. Schönlieb, Subspace correction methods for total variation and $L_1$-minimization, SIAM Journal on Numerical Analysis, 47 (2009), 3397-3428.  doi: 10.1137/070710779.
    [24] C. Gerhardt, Global regularity of the solutions to the capillarity problem, Annali della Scuola Normale Superiore di Pisa-Classe di Scienze, 3 (1976), 157-175. 
    [25] M. Hintermüller and A. Langer, Non-overlapping domain decomposition methods for dual total variation based image denoising, Journal of Scientific Computing, 62 (2015), 456-481.  doi: 10.1007/s10915-014-9863-8.
    [26] M. Hintermüller and G. Stadler, An infeasible primal-dual algorithm for total bounded variation-based inf-convolution-type image restoration, SIAM Journal on Scientific Computing, 28 (2006), 1-23.  doi: 10.1137/040613263.
    [27] T. Kohlberger, Variational Domain Decomposition For Parallel Image Processing, PhD thesis, Universitat Mannheim, 2007.
    [28] F. Kong and X.-C. Cai, A scalable nonlinear fluid-structure interaction solver based on a Schwarz preconditioner with isogeometric unstructured coarse spaces in 3D, Journal of Computational Physics, 340 (2017), 498-518.  doi: 10.1016/j.jcp.2017.03.043.
    [29] C.-O. Lee and C. M. Nam, Primal domain decomposition methods for the total variation minimization, based on dual decomposition, SIAM Journal on Scientific Computing, 39 (2017), B403-B423.  doi: 10.1137/15M1049919.
    [30] F. Malgouyres, Minimizing the total variation under a general convex constraint for image restoration, IEEE Transactions on Image Processing, 11 (2002), 1450-1456.  doi: 10.1109/TIP.2002.806241.
    [31] S. OsherM. BurgerD. GoldfarbJ. J. Xu and W. T. Yin, An iterative regularization method for total variation-based image restoration, Multiscale Model. Simul., 4 (2005), 460-489.  doi: 10.1137/040605412.
    [32] G. Pratx and L. Xing, GPU computing in medical physics: A review, Medical Physics, 38 (2011), 2685-2697. 
    [33] L. I. RudinS. Osher and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D: Nonlinear Phenomena, 60 (1992), 259-268.  doi: 10.1016/0167-2789(92)90242-F.
    [34] Y. Saad, Iterative Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2003. doi: 10.1137/1.9780898718003.
    [35] R. ShamsP. SadeghiR. A. Kennedy and R. I. Hartley, A survey of medical image registration on multicore and the GPU, IEEE Signal Processing Magazine, 27 (2010), 50-60.  doi: 10.1109/MSP.2009.935387.
    [36] J. Spruck, On the existence of a capillary surface with prescribed contact angle, Communications on Pure and Applied Mathematics, 28 (1975), 189-200.  doi: 10.1002/cpa.3160280202.
    [37] N. Ural'tseva, Solvability of the problem of capillaries, Vestn. Leningr. Univ, (1973), 54-64. 
    [38] L. A. Vese and T. F. Chan, A multiphase level set framework for image segmentation using the Mumford and Shah model, International Journal of Computer Vision, 50 (2002), 271-293. 
    [39] C. R. Vogel and M. E. Oman, Iterative methods for total variation denoising, SIAM Journal on Scientific Computing, 17 (1996), 227-238.  doi: 10.1137/0917016.
    [40] J. XuX.-C. Tai and L.-L. Wang, A two-level domain decomposition method for image restoration, Inverse Problem and Imaging, 4 (2010), 523-545.  doi: 10.3934/ipi.2010.4.523.
  • 加载中

Figures(9)

Tables(8)

SHARE

Article Metrics

HTML views(852) PDF downloads(342) Cited by(0)

Access History

Other Articles By Authors

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return