Figure 2 - available via license: Creative Commons Attribution 4.0 International
ByteESR Team: (a) Residual feature distillation block (RFDB). (b) Residual local feature block (RLFB). (c) Enhanced Spatial Attention (ESA).

Source publication
Preprint
This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution, with a focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of ×4 based on pairs of low- and corresponding high-resolution images. The aim was to design a network for single ima...

Contexts in source publication

Context 1
... proposed RLFN is modified from the residual feature distillation block (RFDB) [51]. As shown in Fig. 2a, RFDB uses three Conv-1 layers for feature distillation, and all the distilled features are concatenated together. Although aggregating multiple layers of distilled features can produce more powerful features, the concatenation accounts for most of the inference time. To reduce inference time and memory, RLFB (see ...
Context 2
... in Fig. 2a, RFDB uses three Conv-1 layers for feature distillation, and all the distilled features are concatenated together. Although aggregating multiple layers of distilled features can produce more powerful features, the concatenation accounts for most of the inference time. To reduce inference time and memory, RLFB (see Fig. 2b) removes the concatenation layer and the related feature distillation layers and replaces them with an addition for local feature learning. In addition, in RLFB, the Conv Groups in ESA [52] (see Fig. 2c) are simplified to a single Conv-3 to decrease the model depth and ...
Context 3
... powerful features, the concatenation accounts for most of the inference time. To reduce inference time and memory, RLFB (see Fig. 2b) removes the concatenation layer and the related feature distillation layers and replaces them with an addition for local feature learning. In addition, in RLFB, the Conv Groups in ESA [52] (see Fig. 2c) are simplified to a single Conv-3 to decrease the model depth and ...
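Because concatenation followed by a 1×1 fusion convolution is a linear operation, it can be rewritten as a weighted sum of the distilled branches; RLFB's plain addition is then the unit-weight special case, which avoids materializing the wide concatenated tensor at all. The following NumPy sketch (toy shapes and random weights, not the teams' actual code) checks this identity:

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 4, 8, 8

# three "distilled" feature maps, as in RFDB's three Conv-1 branches (toy shapes)
feats = [rng.standard_normal((C, H, W)) for _ in range(3)]

def conv1x1(x, w):
    # 1x1 convolution = per-pixel linear map over channels: (C_out, C_in) @ (C_in, H*W)
    c, h, wd = x.shape
    return (w @ x.reshape(c, -1)).reshape(w.shape[0], h, wd)

# RFDB-style fusion: concatenate the branches, then fuse with a 1x1 convolution
w = rng.standard_normal((C, 3 * C))
fused_concat = conv1x1(np.concatenate(feats, axis=0), w)

# By linearity, this equals a weighted sum of the individual branches ...
fused_sum = sum(conv1x1(f, w[:, i * C:(i + 1) * C]) for i, f in enumerate(feats))
assert np.allclose(fused_concat, fused_sum)

# ... and RLFB's plain addition is the special case with identity weights,
# which never builds the wide (3C, H, W) concatenated tensor
added = sum(feats)
```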
Context 4
... Network (MSDN) for efficient super-resolution, which stacks a group of multi-scale information distillation blocks (MSDB). In particular, inspired by RFDN, a 1×1 convolution is used for information distillation and a 3×3 convolution is used for feature refinement, alleviating the limitation of the channel splitting operation in IMDB. As shown in Fig. 20, in the l-th MSDB, a multi-scale feature refinement module (marked with green dotted boxes) replaces the 3×3 convolution of RFDB. In Fig. 20, an upsampling refinement module with a scale factor of 2 is designed. A 1×1 convolution is used for channel expansion and a 3×3 convolution with two groups is used for feature ...
Context 5
... by RFDN, a 1×1 convolution is used for information distillation and a 3×3 convolution is used for feature refinement, alleviating the limitation of the channel splitting operation in IMDB. As shown in Fig. 20, in the l-th MSDB, a multi-scale feature refinement module (marked with green dotted boxes) replaces the 3×3 convolution of RFDB. In Fig. 20, an upsampling refinement module with a scale factor of 2 is designed. A 1×1 convolution is used for channel expansion and a 3×3 convolution with two groups is used for feature refinement, which has a √sh × √sh receptive field to capture a larger neighborhood and acts equivalently on an upsampled feature. Then, a single 3×3 ...
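The contrast between IMDB's channel splitting and the 1×1-convolution distillation adopted by RFDN/MSDN can be made concrete: splitting is a fixed 0/1 channel selection, whereas a learned 1×1 convolution can mix information from all input channels. A minimal NumPy sketch, with toy shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
C, H, W = 8, 6, 6
x = rng.standard_normal((C, H, W))

def conv1x1(x, w):
    # 1x1 convolution = per-pixel linear map over channels
    c, h, wd = x.shape
    return (w @ x.reshape(c, -1)).reshape(w.shape[0], h, wd)

# IMDB-style channel splitting: keep the first half of the channels
split = x[: C // 2]

# The same split expressed as a fixed, non-learnable 1x1 convolution:
# a 0/1 selection matrix that copies channels 0..C/2-1
sel = np.eye(C)[: C // 2]
assert np.allclose(conv1x1(x, sel), split)

# RFDN/MSDN-style distillation: a *learned* 1x1 convolution, which can mix
# all C input channels instead of merely selecting half of them
w = rng.standard_normal((C // 2, C))
distilled = conv1x1(x, w)
```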
Context 6
... from the original RFDN [51], all 3×3 convolutional layers except those in the ESA block [51] are replaced by RepBlocks (RB) in the training stage. During the inference stage, the RepBlocks are converted into single 3×3 convolutional layers. Inspired by ECB [85] and ACB [12], 3×1 Conv and 1×3 Conv sub-branches are added to the original ECB (Fig. 22). The number of feature channels is set to 40, while in the original RFDN50 version it is set to ...
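The branch-merging step behind RepBlocks and the ECB/ACB sub-branches rests on the linearity of convolution: parallel 3×3, 3×1, and 1×3 branches can be folded into one 3×3 kernel by zero-padding the smaller kernels to 3×3 support and summing. An illustrative single-channel NumPy check (not the teams' implementation):

```python
import numpy as np

rng = np.random.default_rng(2)

def corr2d(x, k):
    # 'same' cross-correlation of a single-channel image with a 3x3 kernel
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

k33 = rng.standard_normal((3, 3))
k31 = rng.standard_normal((3, 1))
k13 = rng.standard_normal((1, 3))

# embed the 3x1 and 1x3 kernels into 3x3 support via zero padding
k31p = np.zeros((3, 3)); k31p[:, 1:2] = k31
k13p = np.zeros((3, 3)); k13p[1:2, :] = k13

# training-time forward pass: three parallel branches, summed
x = rng.standard_normal((8, 8))
branches = corr2d(x, k33) + corr2d(x, k31p) + corr2d(x, k13p)

# inference-time re-parameterization: one merged 3x3 kernel
k_merged = k33 + k31p + k13p
assert np.allclose(branches, corr2d(x, k_merged))
```

The same folding applies per input/output channel pair in a multi-channel convolution.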
Context 7
... diverse branch block. As shown in Fig. 23a, the details of the RepVGG Block, DBB, and ECB are presented. A total of eight different structures have been designed to improve the feature extraction ability of the vanilla convolution in different scenarios. Although performance may be higher with more re-parameterizable branches, the expensive training cost is unaffordable for ...
Context 8
... is that edge and structure information may be attenuated during the merging of parallel branches. To address these concerns, a more delicate and effective re-parameterization block is built, namely the Edge-enhanced Diverse Branch Block (EDBB), which can extract and preserve high-level structural information for the low-level task. As illustrated in Fig. 23b, the EDBB consists of seven branches of single convolutions and sequential ...
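The sequential convolutions mentioned for the EDBB can likewise be re-parameterized at inference: a 1×1 convolution followed by a 3×3 convolution collapses into a single 3×3 convolution by folding the 1×1 channel-mixing weights into the kernel. A NumPy sketch with assumed toy channel counts (illustrative only, not the EDBB's exact configuration):

```python
import numpy as np

rng = np.random.default_rng(3)

def corr2d(x, k):
    # 'same' cross-correlation of a single-channel image with a small kernel
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2,) * 2, (kw // 2,) * 2))
    return np.array([[np.sum(xp[i:i + kh, j:j + kw] * k)
                      for j in range(x.shape[1])] for i in range(x.shape[0])])

Cin, Cmid, Cout = 2, 3, 2
x = rng.standard_normal((Cin, 6, 6))
w1 = rng.standard_normal((Cmid, Cin))         # 1x1 conv weights
k3 = rng.standard_normal((Cout, Cmid, 3, 3))  # 3x3 conv weights

# training-time branch: 1x1 convolution followed by a 3x3 convolution
mid = np.einsum('mi,ihw->mhw', w1, x)
seq = np.array([sum(corr2d(mid[m], k3[o, m]) for m in range(Cmid))
                for o in range(Cout)])

# inference-time merge: fold the 1x1 weights into a single 3x3 kernel,
# K[o, i] = sum_m w1[m, i] * k3[o, m]
k_merged = np.einsum('mi,omhw->oihw', w1, k3)
single = np.array([sum(corr2d(x[i], k_merged[o, i]) for i in range(Cin))
                   for o in range(Cout)])
assert np.allclose(seq, single)
```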
Context 9
... architecture. Following IMDN [31] and RFDN [51], an EFDN is devised to reconstruct high-quality SR images with sharp edges and clear structure under restricted resources. As illustrated in Fig. 24, the EFDN consists of a shallow feature extraction module, multiple edge-enhanced feature distillation blocks (EFDBs), and an upscaling module. Specifically, a single vanilla convolution is leveraged to generate the initial feature ...
Context 10
... final SR model consists of 4 modified RFDB blocks with 32 channels. All other unmentioned parts of the model are the same as in RFDN. The modified RFDB block is shown in Fig. ...
Context 11
... cipher team proposed an end-to-end residual distillation network (ResDN) for lightweight image SR. As shown in Fig. 32, the proposed ResDN consists of three parts: the head, trunk, and tail ...
Context 12
... shown in Fig. 32, the F_i (i = 1, 2, 3, 4) are aggregated into the BFM; the feature dimension is first halved by a 1×1 convolution followed by the ReLU activation function (omitted in Fig. 32), and then sequential concatenations are applied. This can be formulated ...
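The excerpt truncates the formulation, so the exact BFM equations are not reproduced here; the NumPy sketch below only illustrates the stated steps (halving the channel dimension with a 1×1 convolution plus ReLU, then concatenating sequentially), under an assumed ordering of the features:

```python
import numpy as np

rng = np.random.default_rng(4)
C, H, W = 8, 5, 5
F = [rng.standard_normal((C, H, W)) for _ in range(4)]  # stand-ins for F_1..F_4

def halve(x, w):
    # 1x1 convolution that halves the channel dimension, followed by ReLU
    y = np.einsum('oc,chw->ohw', w, x)
    return np.maximum(y, 0.0)

ws = [rng.standard_normal((C // 2, C)) for _ in range(4)]
halved = [halve(f, w) for f, w in zip(F, ws)]

# one possible sequential concatenation (the exact BFM ordering is not given
# in the excerpt): fold the halved features in one at a time
agg = halved[0]
for h in halved[1:]:
    agg = np.concatenate([agg, h], axis=0)
```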
Context 14
... team created a network structure where the progressive refinement module (PRM) is repeated locally within the blocks and globally among the blocks to reduce the number of parameters. This is done in a way that the intermediate information collection (IIC) modules in the global setting are replaced with the proposed Global PRM. ...
Figure 42. Aselsan Research Team: GIDB Block
Context 15
... frequency grouping fusion block (FGFB) is shown in Fig. 52. The features with the largest difference between low and high frequencies are assigned to the first group, the features with the next largest difference to the second group, and so on. Then, starting from the feature group with the smallest frequency difference, the features of each group are gradually fused ...
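As an illustration only (the FGFB's actual filters and group count are not given in the excerpt), feature channels can be split into low- and high-frequency parts with a simple box blur and then ranked by the energy gap between the two before grouping:

```python
import numpy as np

rng = np.random.default_rng(5)
C, H, W = 8, 16, 16
feats = rng.standard_normal((C, H, W))

def lowpass(x):
    # simple 3x3 box blur (edge padding) as a stand-in low-pass filter
    xp = np.pad(x, 1, mode='edge')
    return sum(xp[i:i + x.shape[0], j:j + x.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

low = np.stack([lowpass(f) for f in feats])
high = feats - low                      # high-frequency residual

# rank channels by the gap between their high- and low-frequency energy
diff = np.abs(high).mean(axis=(1, 2)) - np.abs(low).mean(axis=(1, 2))
order = np.argsort(-np.abs(diff))       # largest difference first
groups = np.array_split(order, 4)       # assumed four frequency groups
# fusion would then proceed group by group, starting from the smallest-
# difference group, as described in the excerpt
```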