
Nonlocal regularized CNN for image segmentation

  • * Corresponding author: Jun Liu

Abstract
  • Non-local dependency is an important prior for many image segmentation tasks. Convolutional operations are building blocks that process one local neighborhood at a time, so convolutional neural networks (CNNs) usually do not make explicit use of the non-local prior in image segmentation. Although pooling and dilated convolution can enlarge the receptive field and thereby capture some non-local information during feature extraction, current CNN architectures have no non-local prior in the feature classification step. In this paper, we present a non-local total variation (TV) regularized softmax activation function for semantic image segmentation. The proposed method can be integrated into CNN architectures. Because the non-smoothness of the non-local TV makes back-propagation difficult, we develop a primal-dual hybrid gradient method to realize the back-propagation of non-local TV in CNNs. Experimental evaluations of the non-local TV regularized softmax layer on a series of image segmentation datasets showcase its good performance. Many CNNs can benefit from the proposed method on image segmentation tasks.

    Mathematics Subject Classification: 68U10, 68T07, 65K10.
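The idea of a TV regularized softmax in the abstract can be illustrated with a small forward-pass sketch. It is not the authors' primal-dual hybrid gradient scheme and does not cover back-propagation. It assumes the regularized softmax minimizes $ -\langle A, O\rangle + \langle A, \ln A\rangle + \lambda \sum_{e} w_e |A(j_e) - A(i_e)| $ over the probability simplex at each pixel, with an anisotropic graph TV over edges $ e = (i_e, j_e) $, and solves it by projected dual ascent. The function name, the edge-list arguments `src`, `dst`, `w`, and the default values are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax, shifted for numerical stability."""
    z = x - x.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def tv_regularized_softmax(O, src, dst, w, lam=3.0, tau=0.03, n_iter=200):
    """Sketch of a graph-TV regularized softmax (assumed formulation).

    O        : (N, K) class scores for N pixels and K classes
    src, dst : (E,) integer arrays, edge e links pixel src[e] to pixel dst[e]
    w        : (E,) non-negative edge weights (local or non-local)
    """
    bound = (lam * w)[:, None]            # dual constraint |xi_e| <= lam * w_e
    xi = np.zeros((len(src), O.shape[1]))  # one dual variable per edge and class
    A = softmax(O)
    for _ in range(n_iter):
        # Dual ascent along the graph difference D A, then projection.
        xi = np.clip(xi + tau * (A[dst] - A[src]), -bound, bound)
        # Adjoint D^T xi: add at the edge head, subtract at the edge tail.
        div = np.zeros_like(O)
        np.add.at(div, dst, xi)
        np.add.at(div, src, -xi)
        # Closed-form primal step: softmax of the corrected scores.
        A = softmax(O - div)
    return A
```

With `lam = 0` the dual variable stays at zero and the function reduces to the plain softmax. A non-local variant only changes the edge list and weights: instead of linking each pixel to its geometric neighbors, `src`, `dst` and `w` would connect pixels with similar features.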


  • Figure 1.  An example of segmentation results obtained by applying the algorithm of [34] and our proposed method to an image from BSD500. When the 4 geometrically nearest neighbors are used with all weights set to 1, the segmentation is overly smooth and misses details (Figure 1(b)). When Eq. (11) is used to compute W, the segmentation shows more detail and is more accurate

    Figure 2.  Given an input $ O $, $ \lambda = 3 $ and $ \tau = 0.03 $, we run the regularized softmax algorithm with the local operator and with the non-local operator. Figure 2(a) shows the convergence of softmax with the local operator: the primal energy curve has a peak during the iterations. In Figure 2(b), the energy curve drops rapidly at first and then converges smoothly

    Figure 3.  Segmentation results predicted by Unet, RUnet and NLUnet on images from the testing set of the White Blood Cell dataset. In rows 2 to 5, the black regions are background, the gray regions are cell sap, and the white regions are nuclei

    Figure 4.  An enlarged view of segmentation results from Figure 3

    Figure 5.  Segmentation results predicted by AUnet, RAUnet and NLAUnet on images from the testing set of the White Blood Cell dataset. In rows 2 to 5, the black regions are background, the gray regions are cell sap, and the white regions are nuclei

    Figure 6.  Segmentation results predicted by Segnet, RSegnet and NLSegnet trained on CamVid Dataset

    Figure 7.  An enlarged view of segmentation results from Figure 6 column 2

    Figure 8.  An enlarged view of segmentation results from Figure 6 column 1

    Table 1.  Results of Unet, RUnet and NLUnet trained on WBC Dataset

    Method     Unet [25]   RUnet [10]   NLUnet
    mIoU       89.79       90.15        90.80
    Accuracy   97.04       97.13        97.42
    RE         1.82        1.30         1.59
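The tables report mIoU and Accuracy without restating how they are computed. The sketch below uses the standard definitions (per-class intersection over union averaged over classes, and the fraction of correctly labeled pixels), which are assumed here rather than taken from the paper. The RE metric is not defined on this page and is left out. The function names are illustrative.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    idx = num_classes * y_true.ravel() + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def miou_and_accuracy(y_true, y_pred, num_classes):
    """Mean IoU over classes that occur, and overall pixel accuracy."""
    cm = confusion_matrix(y_true, y_pred, num_classes)
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0                      # skip classes absent from both maps
    miou = (tp[valid] / union[valid]).mean()
    accuracy = tp.sum() / cm.sum()
    return miou, accuracy
```

Both inputs are integer label maps of the same shape with values in `0 .. num_classes - 1`. Multiplying the returned values by 100 gives percentages on the scale used in the tables.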

    Table 2.  Results of AUnet, RAUnet and NLAUnet trained on WBC Dataset

    Method     AUnet [23]   RAUnet   NLAUnet
    mIoU       90.75        91.01    91.69
    Accuracy   97.35        97.40    97.57
    RE         1.43         1.41     1.43

    Table 3.  Results of Segnet, RSegnet, NLSegnet trained on CamVid Dataset

    Method     Segnet [3]   RSegnet [10]   NLSegnet
    mIoU       57.35        57.79          59.84
    Accuracy   87.74        88.01          88.59
    RE         4.10         2.43           3.40
  • [1] R. Adams and L. Bischof, Seeded region growing, IEEE Transactions on Pattern Analysis and Machine Intelligence, 16 (1994), 641-647.  doi: 10.1109/34.295913.
    [2] M. Z. Alom, M. Hasan, C. Yakopcic, T. M. Taha and V. K. Asari, Recurrent residual convolutional neural network based on u-net (r2u-net) for medical image segmentation, arXiv: 1802.06955.
    [3] V. Badrinarayanan, A. Kendall and R. Cipolla, Segnet: A deep convolutional encoder-decoder architecture for image segmentation, arXiv: 1511.00561. doi: 10.1109/TPAMI.2016.2644615.
    [4] L. Barghout and L. Lee, Perceptual information processing system, US Patent App. 10/618,543, (2004).
    [5] M. Benning, C. Brune, M. Burger and J. Müller, Higher-order TV methods: Enhancement via Bregman iteration, Journal of Scientific Computing, 54 (2013), 269-310.  doi: 10.1007/s10915-012-9650-3.
    [6] H. Birkholz, A unifying approach to isotropic and anisotropic total variation denoising models, Journal of Computational and Applied Mathematics, 235 (2011), 2502-2514.  doi: 10.1016/j.cam.2010.11.003.
    [7] J. Canny, A computational approach to edge detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, 8 (1986), 679-698.  doi: 10.1016/B978-0-08-051581-6.50024-6.
    [8] G. Gilboa and S. Osher, Nonlocal operators with applications to image processing, Multiscale Modeling & Simulation, 7 (2008), 1005-1028.  doi: 10.1137/070698592.
    [9] K. He, X. Zhang, S. Ren and J. Sun, Delving deep into rectifiers: Surpassing human-level performance on imagenet classification, in Proceedings of the IEEE International Conference on Computer Vision, IEEE, 2015, 1026–1034. doi: 10.1109/ICCV.2015.123.
    [10] F. Jia, J. Liu and X. Tai, A regularized convolutional neural network for semantic image segmentation, Analysis and Applications, (2020), 1–19.
    [11] M. Johnson-Roberson, C. Barto, R. Mehta, S. N. Sridhar, K. Rosaen and R. Vasudevan, Driving in the matrix: Can virtual worlds replace human-generated annotations for real world tasks?, preprint, arXiv: 1610.01983. doi: 10.1109/ICRA.2017.7989092.
    [12] M. Kass, A. Witkin and D. Terzopoulos, Snakes: Active contour models, International Journal of Computer Vision, 1 (1988), 321–331. doi: 10.1007/BF00133570.
    [13] P. Krähenbühl and V. Koltun, Efficient inference in fully connected crfs with gaussian edge potentials, Advances in Neural Information Processing Systems, (2011), 109–117.
    [14] A. Krizhevsky, I. Sutskever and G. E. Hinton, Imagenet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, (2012), 1097–1105. doi: 10.1145/3065386.
    [15] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard and L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Computation, 1 (1989), 541-551.  doi: 10.1162/neco.1989.1.4.541.
    [16] G. Lin, C. Shen, A. V. D. Hengel and I. Reid, Efficient piecewise training of deep structured models for semantic segmentation, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2016, 3194–3203. doi: 10.1109/CVPR.2016.348.
    [17] J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2015, 3431–3440. doi: 10.1109/CVPR.2015.7298965.
    [18] M. Lysaker, A. Lundervold and X.-C. Tai, Noise removal using fourth-order partial differential equation with applications to medical magnetic resonance images in space and time, IEEE Transactions on Image Processing, 12 (2003), 1579–1590. doi: 10.1109/TIP.2003.819229.
    [19] D. R. Martin, C. C. Fowlkes and J. Malik, Learning to detect natural image boundaries using local brightness, color, and texture cues, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26 (2004), 530-549.  doi: 10.1109/TPAMI.2004.1273918.
    [20] K. Mikula, A. Sarti and F. Sgallari, Co-volume level set method in subjective surface based medical image segmentation, in Handbook of Biomedical Image Analysis, Springer, (2005), 583–626. doi: 10.1007/0-306-48551-6_11.
    [21] D. Mumford and J. Shah, Optimal approximations by piecewise smooth functions and associated variational problems, Communications on Pure and Applied Mathematics, 42 (1989), 577-685.  doi: 10.1002/cpa.3160420503.
    [22] H. Noh, S. Hong and B. Han, Learning deconvolution network for semantic segmentation, in Proceedings of the IEEE International Conference on Computer Vision, IEEE, 2015, 1520–1528. doi: 10.1109/ICCV.2015.178.
    [23] O. Oktay, et al., Attention u-net: Learning where to look for the pancreas, preprint, arXiv: 1804.03999.
    [24] N. Otsu, A threshold selection method from gray-level histograms, IEEE Transactions on Systems, Man and Cybernetics, 9 (1979), 62-66.  doi: 10.1109/TSMC.1979.4310076.
    [25] O. Ronneberger, P. Fischer and T. Brox, U-net: Convolutional networks for biomedical image segmentation, in International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2015, 234–241. doi: 10.1007/978-3-319-24574-4_28.
    [26] L. I. Rudin, S. Osher and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D: Nonlinear Phenomena, 60 (1992), 259-268.  doi: 10.1016/0167-2789(92)90242-F.
    [27] B. Schölkopf, K. Tsuda and J.-P. Vert, Support Vector Machine Applications in Computational Biology, MIT Press, 2004.
    [28] L. Shapiro and G. C. Stockman, Computer Vision, Prentice Hall, 2001.
    [29] J. Shi and J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22 (2000), 888-908. 
    [30] K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, preprint, arXiv: 1409.1556.
    [31] M. Unger, T. Mauthner, T. Pock and H. Bischof, Tracking as segmentation of spatial-temporal volumes by anisotropic weighted tv, in International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, Springer, 2009, 193–206. doi: 10.1007/978-3-642-03641-5_15.
    [32] P. Wang, P. Chen, Y. Yuan, D. Liu, Z. Huang, X. Hou, and G. Cottrell, Understanding convolution for semantic segmentation, in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, 2018, 1451–1460. doi: 10.1109/WACV.2018.00163.
    [33] K. Wei, K. Yin, X.-C. Tai and T. F. Chan, New region force for variational models in image segmentation and high dimensional data clustering, preprint, arXiv: 1704.08218. doi: 10.4310/AMSA.2018.v3.n1.a8.
    [34] K. Yin and X.-C. Tai, An effective region force for some variational models for learning and clustering, Journal of Scientific Computing, 74 (2018), 175-196.  doi: 10.1007/s10915-017-0429-4.
    [35] F. Yu and V. Koltun, Multi-scale context aggregation by dilated convolutions, preprint, arXiv: 1511.07122.
    [36] L. Zelnik-Manor and P. Perona, Self-tuning spectral clustering, Advances in Neural Information Processing Systems, (2005), 1601–1608.
    [37] X. Zheng, Y. Wang, G. Wang and J. Liu, Fast and robust segmentation of white blood cell images by self-supervised learning, Micron, 107 (2018), 55-71.  doi: 10.1016/j.micron.2018.01.010.