RBDN

Recursively Branched Deconvolutional Network: DCNN architecture for "Generalized Deep Image to Image Regression." CVPR2017 (Spotlight).

RBDN is an architecture for Generalized Deep Image to Image Regression.

The core design principle behind RBDN is an analysis of the strengths and weaknesses of a wide range of diverse architectures, followed by an incremental modular construction with thorough empirical testing of each design decision.

Architecture

pipeline

:point_up_2: Architecture of the proposed generic RBDN approach with 3 branches. The branches extract features at multiple scales. Learnable upsampling with efficient parameter sharing is used to recursively upsample the activations of each branch until it merges with the POOL1 output, leading to a cheap multi-context representation of the input. This multi-context map is then passed through a series of 9 convolutions, which supply ample non-linearity and can automatically choose how much context is needed for the task at hand.
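As a rough illustration of the multi-scale layout described above, the sketch below traces spatial sizes only. It assumes that every pooling and upsampling step changes resolution by a factor of 2 and that branch k sits k poolings below POOL1. The function name `trace_branches`, its parameters, and those factors are illustrative assumptions, not values read from the released model.

```python
def trace_branches(height, width, num_branches=3, pool=2):
    """Trace spatial sizes for a hypothetical RBDN-style layout.

    Assumption: each pooling / upsampling step scales resolution by `pool`.
    """
    deepest = pool ** (num_branches + 1)
    if height % deepest or width % deepest:
        raise ValueError("height and width must be divisible by %d" % deepest)

    # POOL1 is the resolution at which all branches are merged.
    pool1 = (height // pool, width // pool)

    branches = []
    for k in range(1, num_branches + 1):
        scale = pool ** k
        size = (pool1[0] // scale, pool1[1] // scale)
        # Branch k needs k upsampling steps to get back to POOL1 resolution.
        path = [(size[0] * pool ** s, size[1] * pool ** s) for s in range(1, k + 1)]
        branches.append({"branch": k, "size": size, "upsample_path": path})

    return {"input": (height, width), "pool1": pool1, "branches": branches}


if __name__ == "__main__":
    for branch in trace_branches(64, 64)["branches"]:
        print(branch)
```

Every upsampling path ends at the POOL1 size, which is what allows the branch outputs to be merged into a single multi-context map before the 9 convolutions.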

Experimental Results

RBDN gives state-of-the-art performance on 3 diverse image-to-image regression tasks: Denoising, Relighting, Colorization.

Denoising

A single 3-branch RBDN model trained over a wide range of noise levels outperforms previously proposed noise-specific state-of-the-art models at every noise level.

dn_comp :point_up_2: Visual comparison of various denoising approaches on a test image from BSD300 corrupted by white Gaussian noise with st_dev=50.

dn_allnoise :point_up_2: Illustrating the capability of a single RBDN model to handle a range of noise levels (yellow box). Top Row: Noisy test image. Bottom Row: Denoised 3-branch RBDN result.

dn55_dncnn :point_up_2: Illustrating RBDN’s ability to reliably denoise at st_dev=55, outside our training bounds (st_dev in [8,50]). The 18-layer DnCNN (despite using st_dev=55 for training) is outperformed by our 9-layer RBDN. Red, Yellow, Green boxes show the PSNR.
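For readers who want to reproduce the evaluation setting, here is a minimal sketch of how white Gaussian noise of a given st_dev can be added and how PSNR is computed. It assumes 8-bit intensities (peak value 255) stored as a flat list and leaves the noisy values unclipped. The helper names `add_gaussian_noise` and `psnr` are hypothetical and are not part of the RBDN code.

```python
import math
import random


def add_gaussian_noise(pixels, st_dev, rng=None):
    """Add zero-mean white Gaussian noise to a flat list of intensities.

    Values are left unclipped; clipping or quantizing is a separate choice.
    """
    rng = rng or random.Random()
    return [p + rng.gauss(0.0, st_dev) for p in pixels]


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized signals."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [rng.uniform(0, 255) for _ in range(10000)]
    noisy = add_gaussian_noise(clean, st_dev=50, rng=rng)
    print("PSNR of noisy input: %.2f dB" % psnr(clean, noisy))
```

Under these assumptions, an unclipped noisy input at st_dev=50 has an expected PSNR of 20·log10(255/50), roughly 14.2 dB, which gives a baseline against which denoised results can be read.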

Relighting

cs0_aug :point_up_2: The goal is to render faces from various unknown lighting conditions to a fixed lighting condition. Odd rows: Inputs. Even rows: 3-branch RBDN output. Note that the model is trained exclusively on frontal face images with constrained illumination variations from CMU-MultiPie, but still generalizes reasonably well to unconstrained face images in Janus-CS0 under a variety of poses, illuminations, expressions, occlusions, and affordances (hats, glasses, etc.).

multipie :point_up_2: Analyzing RBDN with different numbers of branches for relighting a subject from the CMU-MultiPie validation set. Top row: Input images (ground truth is the top-left image). Second row: No branches (strong artifacts can be seen). Third to sixth rows: RBDN outputs for 1, 2, 3 and 4 branches respectively. Results improve as the number of branches increases up to 3. The network starts overfitting at 4 branches.

Colorization

We first transform a color image into YCbCr color space and predict the chroma Cb,Cr channels from the luminance Y-channel input using RBDN. The input Y-channel is then combined with the predicted Cb,Cr channels and converted back to RGB to yield the predicted color image. We denote this model as RBDN-YCbCr.
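The color-space round trip described above can be sketched per pixel as follows. This sketch assumes the full-range BT.601 (JFIF) definition of YCbCr on 8-bit values; the source does not say which YCbCr variant was used. The function names are hypothetical, and the predicted chroma values stand in for the network output.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) RGB -> YCbCr. The variant is an assumption."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr (same full-range BT.601 convention)."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b


def colorize_pixel(y, predicted_cb, predicted_cr):
    """Combine the input luminance with predicted chroma and return 8-bit RGB."""
    rgb = ycbcr_to_rgb(y, predicted_cb, predicted_cr)
    return tuple(min(255, max(0, round(c))) for c in rgb)


if __name__ == "__main__":
    y, cb, cr = rgb_to_ycbcr(200, 100, 50)
    # With perfect chroma "predictions" the original color is recovered.
    print(colorize_pixel(y, cb, cr))
```

Because the Y channel is passed through unchanged, any error in the output color comes only from the predicted Cb and Cr values.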

Inspired by the recently proposed Colorful Colorizations approach, we train another RBDN model which takes as input the L-channel of a color image in Lab space and tries to predict a 313-dimensional vector of probabilities for each pixel (corresponding to 313 ab pairs resulting from quantizing the ab-space with a grid-size of 10). Subsequently, the problem is treated as multinomial classification and we use a softmax-cross-entropy loss with class re-balancing. During inference, we use the annealed-mean of the softmax distribution to obtain the predicted ab channels. We denote this model as RBDN-Lab.
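The annealed-mean step can be illustrated for a single pixel as below. The sketch re-sharpens the softmax distribution with a temperature T and then takes the expectation over the ab bin centers. The default T=0.38 follows the Colorful Colorizations paper and is only an assumption for RBDN-Lab; `annealed_mean` and the two-bin example are hypothetical stand-ins for the real 313-bin setup.

```python
import math


def annealed_mean(probs, bin_centers, temperature=0.38):
    """Annealed mean of a per-pixel distribution over quantized ab bins.

    probs:       softmax probabilities, one per bin.
    bin_centers: (a, b) center of each bin, in the same order as probs.
    temperature: T=1 gives the plain mean; T -> 0 approaches the mode.
    """
    if len(probs) != len(bin_centers):
        raise ValueError("probs and bin_centers must have the same length")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Re-sharpen in log space; zero-probability bins keep zero weight.
    logits = [math.log(p) / temperature if p > 0 else -math.inf for p in probs]
    top = max(logits)
    if top == -math.inf:
        raise ValueError("at least one probability must be positive")
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)

    a = sum(w * c[0] for w, c in zip(weights, bin_centers)) / total
    b = sum(w * c[1] for w, c in zip(weights, bin_centers)) / total
    return a, b


if __name__ == "__main__":
    centers = [(-10.0, 0.0), (10.0, 0.0)]  # toy bins, not the real 313
    print(annealed_mean([0.9, 0.1], centers, temperature=1.0))
    print(annealed_mean([0.9, 0.1], centers))
```

Lowering the temperature pulls the prediction toward the most likely bin, which trades some spatial consistency of the mean for more saturated colors.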

comp_color :point_up_2: Colorization results for images from the MS-COCO test set. The 3- and 4-branch RBDN-YCbCr models produce decent colorizations, but the results are dull and highly under-saturated. The colorizations of RBDN-Lab have higher saturation and appear more colorful for all images.

comp_color_supp :point_up_2: Colorizing legacy black-and-white photos: comparing the 4-branch RBDN-Lab with the Colorful Colorizations model.

Installation & Usage

License & Citation

RBDN is released under a variant of the BSD 2-Clause license.

If you find RBDN useful in your research, please consider citing our paper:

@article{santhanam2016generalized,
  title={Generalized Deep Image to Image Regression},
  author={Santhanam, Venkataraman and Morariu, Vlad I and Davis, Larry S},
  journal={arXiv preprint arXiv:1612.03268},
  year={2016}
}

Acknowledgments