
Image rain removal techniques based on discriminative sparse coding

2026-04-06

This paper proposes an image deraining model and algorithm based on discriminative sparse coding. The image deraining problem is treated as an image signal separation problem. During the separation process, the intrinsic property differences between the clear image layer and the rain layer are utilized to perform mutually exclusive sparse coding on both, achieving the effect of separating the clear image from the rain layer. The effectiveness of this model is verified on a synthetic dataset.

Background Introduction

In recent years, with the continuous development of computer hardware and software technology, computer vision systems have been increasingly widely used in the real world. Most outdoor computer vision systems need to extract image features clearly and accurately for further analysis and processing. However, the various uncertainties of weather conditions can cause incalculable damage to image features. For example, the visibility and contrast of images are severely affected in foggy weather, while rain and snow may cause some images to be obscured. In order to ensure that these outdoor vision systems still perform well in adverse weather conditions, we need to model the degradation process of images under various weather conditions and eliminate the effects of adverse weather.

Due to the inherent uncertainties in the location and grayscale of raindrops in images, their complex and variable scales, and the difficulty in characterizing the imaging process, effectively extracting clear non-rain images from rain images has become a very challenging task. Existing literature largely focuses on processing rainy day videos. The redundancy between adjacent frames in a video aids in raindrop detection and provides rich information for repairing and restoring detected rainy areas. However, this redundancy can be destroyed by fast-moving objects in the video, and sometimes intelligent machine tasks require single images for recognition. Therefore, considering how to eliminate the damage to image features caused by rain from single rain images is also very meaningful. This paper starts with a single rain image and establishes a simple and effective optimization model. By constraining the mutual exclusion of sparse coding, a discriminative dictionary is obtained through continuous iterative learning, and the rain layer and image layer in the rain image gradually separate during the iteration.

Related work

Kang et al. [1] were the first to perform rain removal on a single image, mainly using dictionary learning. The main process is as follows: 1) Preprocessing: the input image is first passed through a bilateral filter; the filtered result is called the low-frequency image and retains the coarse structure of the original. Since this part contains no rain, it is kept unchanged. 2) Dictionary learning: sparse coding and dictionary learning are performed on the high-frequency image. The learned dictionary mixes rain atoms and image-boundary atoms, which must be further distinguished. 3) Dictionary partitioning: within a patch containing rain, the gradient directions are largely consistent, whereas patches containing image boundaries have more varied gradients. The gradient-direction histogram of each dictionary element is therefore extracted as a feature and clustered with k-means into two classes; the class with the larger variance is taken as the image-boundary sub-dictionary, and the other as the rain sub-dictionary. 4) Rain removal: using the sparse codes from 3), the coefficients corresponding to the rain sub-dictionary are set to 0, and the patches are reconstructed from the remaining coefficients and the dictionary, yielding the image details of the high-frequency layer. Adding these back to the low-frequency image gives the derained result. This method is biased toward images with few details and smooth regions. For detail-rich images, bilateral filtering may erase much of the detail information, which then mixes with the rain information in subsequent processing, and the two classes of atoms must be separated by a fragile clustering step, making separation difficult. Experiments show that images processed by this method are mostly over-smoothed, with many details lost.

The single-image rain removal method of Kim et al. [2] resembles video deraining methods in adopting a detect-then-repair strategy. The specific steps are as follows: 1) Raindrop detection: kernel regression is used to estimate, for each pixel, the dominant gradient direction and the aspect ratio (major to minor axis) of the local region, and thresholds are then applied to these features. The covariance matrix in the kernel regression is corrected and weighted according to prior knowledge about rain; the SVD of the corrected covariance matrix yields the dominant direction of rain in the pixel's neighborhood and the region's aspect ratio, and points whose direction and aspect ratio pass the thresholds are labeled as rain. 2) Repair of the raindrop regions: repair is performed by non-local means filtering. Because rain corrupts the grayscale values of raindrop pixels, the contribution weight of each patch must be corrected accordingly, and unreliable points are excluded when computing the weights. Kim et al.'s method depends heavily on the accuracy of raindrop detection, yet detection here relies mainly on the local gradient direction, and the correction from rain priors is weak; in experiments, the detected regions often coincide with image boundaries aligned with the rain direction. Furthermore, image inpainting is inherently challenging, and in the presence of rain the non-local means filter may draw inpainting information from dissimilar patches. This method is therefore also quite limited.

In addition, Pei et al. [3] first convert the rain image to the HSV color space and use the saturation and value channels to make the rain more visually prominent; high-pass and directional filtering then further enhance the raindrops' appearance. After enhancement, a fixed percentage of the brightest values in the processed image are selected as detected rain regions, which are repaired by filling in the mean of nearby neighborhood pixels. Like the method of Kim et al. [2], this is a detect-then-repair scheme, so the success of both methods hinges on detection accuracy and inpainting quality. The inpainting used here suits long, thin raindrops; for larger detected blocks, artificially filling in the mean of pixels near the region boundary makes the result look unrealistic, with obvious repair artifacts.

Eigen et al. [4] also attempted to remove rain from a single image. Their algorithm is a convolutional neural network trained on a large dataset; the trained network maps image patches containing raindrops or dirt to clean patches. To process a rain image, one simply cuts it into sub-patches, feeds them through the network to obtain clean output patches, and reassembles these into the whole derained image. This method requires a large amount of training data and a large number of trainable parameters, which makes it very time-consuming.

Compared to other video deraining methods, deraining a single image is more difficult because it lacks redundant information from adjacent frames in the video sequence. The existing deraining methods are not very effective and each has its own limitations.

Rain image model and its synthesis method

The process by which a background scene and rain combine into a rain image is very complex, so existing deraining research generally simplifies the rain image generation model. These simplifications fall roughly into two types: a simple additive model [5] and an α-blending model [6,7]. In this section, we propose a different choice, the color filter model.

Color filter model: This is the rain image generation model used by the algorithm in this paper; it corresponds to a layer blending mode in the classic image editing software Photoshop. Let the original image be I(x), the rain layer be R(x), and the rain image obtained by blending the two be J(x), where all gray values are normalized to [0,1] and x denotes the pixel position in the image. The model is as follows:

J(x) = I(x) + R(x) - I(x)R(x). (1)

Compared with the additive model, the color filter model adds a final nonlinear cross term. Under this model, the effect of rain on the image depends not only on R(x) but also on the background image: the rain's contribution becomes R(x)(1 - I(x)). When the background I(x) is bright and close to 1, the contribution of rain approaches 0; when the background is dark, it approaches R(x). In other words, rain is more visible against dark backgrounds and attenuated against bright ones, which matches photographs of rain taken in real life.
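As a sanity check on Eq. (1), here is a minimal numpy sketch of the screen-style blend, with made-up brightness values purely for illustration:

```python
import numpy as np

def screen_blend(I, R):
    """Color filter ("screen") blend of Eq. (1): J = I + R - I*R, inputs in [0, 1]."""
    I, R = np.asarray(I, dtype=float), np.asarray(R, dtype=float)
    return I + R - I * R

# The extra brightness rain adds is R * (1 - I):
# large on a dark background, nearly zero on a bright one.
dark, bright, rain = 0.1, 0.9, 0.5
added_on_dark = screen_blend(dark, rain) - dark      # = 0.5 * (1 - 0.1) = 0.45
added_on_bright = screen_blend(bright, rain) - bright  # = 0.5 * (1 - 0.9) = 0.05
```

This reproduces the observation above: the same rain value of 0.5 brightens a dark pixel by 0.45 but a bright pixel by only 0.05.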

Image rain removal method based on discriminative sparse coding

1. Image rain removal optimization model based on discriminative sparse coding

Given a rain image, the goal is to recover the original sharp image layer I and the rain layer R. Separating two layers from a single image is an ill-posed inverse problem with two unknowns and requires additional assumptions and constraints. Before adding priors, two key questions must be answered. First, how are the two image layers fused together? A correct model of the synthesis process is crucial for good separation: if the initial model assumption is flawed, accurate separation will be difficult. Second, what discriminative properties distinguish the sharp layer from the rain layer, and how can these properties be characterized accurately? Both questions are critical. Even with a correct synthesis model, without proper regularization constraints the two recovered layers admit infinitely many solutions; conversely, even with good regularization, an ill-chosen fusion model may drive the regularizer to a degenerate case and fail to achieve the desired separation. Therefore, to separate the sharp and rain layers well, both questions must be addressed carefully.

As mentioned earlier, the rain image generation model used in this paper is the color filter blending mode of the image editing software Photoshop: the two layers are inverted, multiplied element-wise, and the result is inverted again. Let the ideal clear image layer be I and the rain layer be R. The rain image obtained by merging the two is:

J = 1 - (1 - I) ∘ (1 - R) = I + R - I ∘ R, (2)

where ∘ denotes the element-wise (Hadamard) product of the two layers.

After obtaining the rain image generation model, the next issue to consider is how to apply regularization constraints to the rain layer image and the clear image separately. In recent years, algorithms based on sparse priors have been widely applied to tasks such as image restoration and have shown excellent performance. In this paper, we will also consider incorporating sparse priors into the constraints of the image layer and the rain layer. Here, sparse priors refer to the process of segmenting the image layer (or rain layer) into blocks and learning a redundant dictionary. Then, all image blocks segmented from the image layer (or rain layer) will be sparsely represented under this redundant dictionary; that is, each image block can be represented by a linear combination of only a few elements in the dictionary. Assume P is an operator that transforms the image layer into a stacked matrix of image blocks, where each column of the matrix represents an image block:

Y_I := PI;  Y_R := PR. (3)

Here, Y_I is the matrix of patches obtained by applying the operator P to the sharp image layer I, and Y_R is the matrix of patches obtained by applying P to the rain layer R. These patch matrices can be sparsely represented under a common dictionary D, that is:

Y_I ≈ D C_I;  Y_R ≈ D C_R, (4)

where C_I and C_R are the representation coefficients of the image patches and rain patches under the dictionary D, each column holding the coefficients of the corresponding patch. If the dictionary is learned well enough, the number of non-zero coefficients in each column should be very small, i.e., the codes are sparse.
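To make Eq. (4) concrete, the following numpy sketch builds a toy patch that is exactly 2-sparse under a small random dictionary (the dictionary and coefficient values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dictionary D: 16-dimensional patches, 24 atoms, unit-norm columns
D = rng.standard_normal((16, 24))
D /= np.linalg.norm(D, axis=0)

# A patch built from atoms {0, 2} has an exactly 2-sparse code under D
c_true = np.zeros(24)
c_true[[0, 2]] = [1.5, -0.7]
y = D @ c_true

# Given the support, the nonzero coefficients are a small least-squares fit
support = [0, 2]
c = np.zeros(24)
c[support] = np.linalg.lstsq(D[:, support], y, rcond=None)[0]
```

Only 2 of the 24 coefficients are nonzero, yet the patch is reconstructed exactly; sparse coding algorithms such as OMP find the support automatically.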

To accurately separate two image layers, simple sparsity is insufficient; additional discriminative constraints are needed. Since the corresponding dictionaries for the two image layers are difficult to learn directly, this paper bypasses learning separate dictionaries for the rain layer and the image layer. Instead, it uses a common dictionary to uniformly encode the rain layer and the image layer. The separation of the rain layer and the image layer is achieved not through dictionary constraints, but through constraints on the sparse coding coefficients. Due to the differences between the rain layer and the image layer, their representation coefficients will differ even within the same dictionary. Based on this difference, we can separate the rain layer and the image layer. To describe the coding coefficients, we first define the weight vector of the coefficients. Assuming the representation coefficient of a certain matrix block under the dictionary is C, the weight vector of this coefficient is defined as follows:

[B(C)]_k = Σ_j |C_{kj}|. (5)

The k-th entry of the weight vector B(C) measures how much the coding coefficients C use the k-th dictionary element: a larger value means the element is more important to the patch matrix, while a value of 0 means the patch matrix does not use the k-th element at all. With the weight vectors defined, and recalling from the earlier analysis that the rain layer and the image layer differ intrinsically, their coding coefficients under a common dictionary differ as well, and this difference is reflected in the weight vectors. In the ideal case, the correlation between the weight vectors of the rain layer's codes C_R and the image layer's codes C_I should be very small, or even zero. That is:
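As an illustration, the weight vector can be implemented as the row-wise l1 norm of the coefficient matrix (this concrete formula is an assumption consistent with the description above), and mutual exclusion then says the two layers' weight vectors have zero inner product:

```python
import numpy as np

def usage_weights(C):
    """B(C): per-atom usage of dictionary elements, taken here as the l1 norm
    of each row of C (one column of C per patch) -- an assumed concrete form."""
    return np.abs(np.asarray(C, dtype=float)).sum(axis=1)

# Toy codes: the image patches use atoms {0, 2}, the rain patches use atom {1}
C_img = np.array([[1.0, 0.5],
                  [0.0, 0.0],
                  [0.0, -2.0]])
C_rain = np.array([[0.0, 0.0],
                   [3.0, 1.0],
                   [0.0, 0.0]])

# Mutual exclusion: the weight vectors do not overlap
overlap = usage_weights(C_img) @ usage_weights(C_rain)
```

Here `usage_weights(C_img)` is [1.5, 0, 2] and `usage_weights(C_rain)` is [0, 4, 0], so their inner product `overlap` is 0: no atom serves both layers.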

⟨B(C_I), B(C_R)⟩ = 0. (6)

The correlation between the weight vectors of the two layers is 0, meaning that dictionary elements used to generate the rain layer will not be used to generate the image layer, and conversely, dictionary elements used to generate the image layer will not be used to represent the rain layer. We call this property mutual exclusion. In other words, in the following work, we will achieve the separation of the rain layer and the image layer by learning a dictionary with mutual exclusion properties. Based on the above discussion, we can obtain the following optimization model for separating the rain layer and the image layer, which we call the discriminative sparse coding deraining model:

min_{I, R, D, C_I, C_R}  ||PI - D C_I||_F^2 + ||PR - D C_R||_F^2
s.t.  J = I + R - I ∘ R;
      0 ≤ I ≤ 1,  0 ≤ R ≤ 1;
      ||c_i||_0 ≤ T_I for every column c_i of C_I,  ||c_r||_0 ≤ T_R for every column c_r of C_R;
      ⟨B(C_I), B(C_R)⟩ ≤ ε. (7)

Here T_I and T_R are the sparsity levels of the coding coefficients of the image patches and rain patches, respectively, and ε is a correlation threshold. The objective of the optimization model (7) is to make the dictionary D represent the patch matrices of the rain layer and the image layer as well as possible. Among the constraints, the first is the rain image generation model, the second keeps the brightness of the rain layer and the image layer in a reasonable range, the third is the sparsity constraint, and the fourth is the mutual exclusion constraint.

2. Solving the Discriminative Sparse Coding Model

The discriminative sparse coding model above is a very challenging non-convex optimization problem. A common approach to such problems is to solve for the variables one by one, iteratively. However, this model involves five basic variables: the brightness of the image layer I and the rain layer R, the dictionary D, and the coding coefficients C_I and C_R of the image and rain patches under this dictionary, so a traditional multivariate iteration converges very slowly. In this section we therefore propose a greedy iterative algorithm in its place. The main idea is as follows. In each iteration, the image patches and rain patches are first sparsely approximated separately: the image patches use the usual sparse approximation, while for the rain patches the rain component possibly remaining in the image patches is first estimated, accumulated onto the previously computed rain patch matrix, and only then sparsely coded. After obtaining the sparse codes, the rain patches are reconstructed from the codes and the dictionary and assembled back into a rain layer; the image layer is then computed from the rain layer's brightness via the rain image generation model. Finally, both layers are converted back into patch form, the dictionary is updated, and the next round of sparse coding begins.

Beyond the five basic variables, the idea above also mentions the rain component remaining in the image. To characterize it, before detailing the algorithm we introduce two auxiliary variables, ω and r. ω is an indicator vector with the same dimension as the number of dictionary elements, taking values 1 or 0: a 1 means the corresponding dictionary element participates in the sparse representation of rain patches, while a 0 means the element is used to represent image patches. The residual variable r represents the rain component possibly remaining in the image layer. Ideally, during the algorithm's iterations this residual approaches 0, meaning the correlation between the codes of the rain patches and the image patches approaches 0.

Next, we will introduce in detail the solution algorithm for the discriminative sparse coding model proposed in this paper. Following the basic idea mentioned earlier, we divide each iteration of the algorithm into four steps: sparse approximation, image update, dictionary update, and indicator variable update. The specific operations are as follows:

Sparse approximation: In the (l+1)-th iteration of the algorithm, the inputs at this stage are the dictionary D^l from the previous iteration, the stacked patch matrices Y^l_R and Y^l_I of rain patches and image patches, and the auxiliary indicator variable ω^l. The outputs are the new coding coefficients C^{l+1}_R and C^{l+1}_I of the rain patches and image patches. For the image patches, the codes C^{l+1}_I are the optimal solution of the following optimization model:

C^{l+1}_I = argmin_C ||Y^l_I - D^l C||_F^2  s.t.  ||c_i||_0 ≤ T_I for every column c_i of C.

The coding coefficients of the rain patches are sparsely approximated after accumulating the residual onto the rain patch matrix from the previous step. The residual is calculated as follows:

Here, diag(·) turns a vector into a diagonal matrix whose diagonal entries are the vector's components, with zeros elsewhere. After obtaining the residual, the new codes C^{l+1}_R are obtained by solving the following optimization model:

Both of the above sparse approximation optimization models can be solved using the OMP algorithm.
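OMP itself is short; below is a minimal numpy sketch (illustrative, not the authors' implementation) that codes a single patch y with at most T atoms of a dictionary D:

```python
import numpy as np

def omp(D, y, T):
    """Orthogonal Matching Pursuit: repeatedly pick the atom most correlated
    with the current residual, then refit all selected coefficients by
    least squares on the accumulated support."""
    support = []
    r = y.astype(float).copy()
    coef = np.zeros(0)
    for _ in range(T):
        k = int(np.argmax(np.abs(D.T @ r)))  # best-matching atom
        if k not in support:
            support.append(k)
        coef = np.linalg.lstsq(D[:, support], y, rcond=None)[0]
        r = y - D[:, support] @ coef         # orthogonalized residual
    c = np.zeros(D.shape[1])
    c[support] = coef
    return c
```

For the rain patches, the representation is restricted to the atoms flagged by ω; with this sketch that simply amounts to running OMP on the sub-dictionary `D[:, omega == 1]`.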

Image update: The image update covers the brightness of both the rain layer and the clear image layer. The previous sparse approximation step produced the codes of the rain and image patches, and the natural idea is to reconstruct both from their codes and the dictionary. To accelerate convergence, however, we reconstruct only the rain layer from its codes and the dictionary, and obtain the image layer's brightness directly and greedily from the rain image generation model. The specific steps are as follows:

First, a new rain patch matrix Y^{l+1}_R is obtained from the rain patch codes C^{l+1}_R computed in the previous step and the dictionary D^l from the previous iteration:

Y^{l+1}_R = D^l C^{l+1}_R.

Then the rain layer R^{l+1} is reconstructed from the new rain patch matrix. This is the inverse of the operator P, which we denote P^T; note that the reconstructed rain layer must be normalized back to [0,1]. The specific operation is:

R^{l+1} = min(max(P^T Y^{l+1}_R, 0), 1).

Finally, the image layer brightness is updated from the rain layer and the rain image generation model, by solving J = I + R - I ∘ R for I:

I^{l+1} = (J - R^{l+1}) / (1 - R^{l+1}),

where the division is element-wise.

Dictionary update: In the dictionary update stage, the inputs are the brightness of the rain layer and the sharp layer obtained in the image update step, and the corresponding sparse codes obtained in the sparse approximation step; the output is the updated dictionary. First, the rain layer and the image layer are converted into stacked patch matrices through the operator P:

Y^{l+1}_I := P I^{l+1};  Y^{l+1}_R := P R^{l+1}.

The dictionary update is then obtained by solving the following optimization model:

min_{D, C} ||[Y^{l+1}_I, Y^{l+1}_R] - D C||_F^2  s.t. each column of C uses at most T_I (image) or T_R (rain) atoms.

The above dictionary update can be solved using the K-SVD algorithm.

Indicator variable update: This step mainly serves to generate the residual for the next iteration. The inputs are the indicator variable ω^l from the previous round and the rain patch codes C^{l+1}_R obtained in this round. First, a new indicator variable is generated from the rain patch codes: its k-th entry is 1 if the k-th row of C^{l+1}_R is non-zero (i.e., atom k is actually used by the rain patches) and 0 otherwise.

Furthermore, the new indicator variable is required to be a subset of the previous round's indicator (an entry can only switch from 1 to 0, never from 0 to 1), which effectively prevents the algorithm from diverging.
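The indicator update can be sketched compactly; the element-wise AND with the previous indicator implements the subset rule (this exact formula is our reading of the description above, not code from the paper):

```python
import numpy as np

def update_indicator(omega_prev, C_rain):
    """An atom stays flagged as a rain atom only if it was flagged before
    AND the current rain codes C_rain actually use it (nonzero row)."""
    used = (np.abs(np.asarray(C_rain)).sum(axis=1) > 0).astype(int)
    return np.asarray(omega_prev) * used

omega_prev = np.array([1, 1, 0, 1])
C_rain = np.array([[1.0, 0.0],
                   [0.0, 0.0],   # atom 1 unused this round -> dropped
                   [2.0, 0.0],   # atom 2 was already 0 -> stays 0
                   [0.0, 3.0]])
omega_new = update_indicator(omega_prev, C_rain)
```

In this toy example the rain set shrinks from {0, 1, 3} to {0, 3}: atom 1 is released to the image side and can never be reclaimed.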

3. Experimental Comparison and Analysis

Comparison method selection: To verify the effectiveness of the discriminative sparse coding model and algorithm above, we ran experiments on synthesized rain images and selected the following two methods for comparison. The first, proposed by Kang et al. [1] in 2012, uses bilateral filtering to extract the low-frequency part of the image and then separates the rain layer from the high-frequency part through dictionary learning and clustering. The second, proposed by Kim et al. [2] in 2013, first performs raindrop detection on the rain image and then inpaints the detected regions.

Experimental configuration and initialization: The algorithm parameters are fixed throughout the experiments. Local image patches of size 16x16 are extracted from the image layer, and the dictionary D has 640 elements, i.e. D is of size 256x640. The sparsity levels T_R and T_I of the rain patches and image patches are 5 and 8 respectively, meaning each rain patch (image patch) may use at most 5 (8) dictionary elements. The variables requiring initialization are mainly the dictionary D and the indicator vector ω. The dictionary is initialized in two parts: a sub-dictionary for the rain component and a sub-dictionary for the clear image component. The latter can be chosen fairly freely; here we directly use the dictionary learned from the original input rain image, via the classic K-SVD method, as the clear image sub-dictionary. For the rain sub-dictionary, guided by the morphology of rain described earlier, we first estimate the approximate direction of the rain from the gradient directions of the input rain image and generate a motion blur kernel along that direction; we then superimpose a Gaussian filter onto this blur kernel to obtain the initial rain dictionary. The ratio of rain atoms to image atoms is set to 1:4. The indicator vector ω is initialized from the composition of the initial dictionary: entries corresponding to the initial rain atoms are set to 1 and the rest to 0. In addition, the algorithm's inputs include the initial rain patch and image patch matrices and their codes, all of which are initialized with the classic OMP algorithm.
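The rain-dictionary initialization can be sketched in pure numpy as follows (the angle, kernel sizes, and the choice to taper the streak by multiplying with a Gaussian are illustrative assumptions, not the paper's exact recipe):

```python
import numpy as np

def motion_kernel(size, angle_deg):
    """A line-shaped (motion blur) kernel along the estimated rain direction."""
    k = np.zeros((size, size))
    c = size // 2
    t = np.deg2rad(angle_deg)
    for s in np.linspace(-c, c, 4 * size):  # march along the line through the center
        i = int(round(c + s * np.sin(t)))
        j = int(round(c + s * np.cos(t)))
        if 0 <= i < size and 0 <= j < size:
            k[i, j] = 1.0
    return k / k.sum()

def gaussian(size, sigma):
    """Normalized isotropic 2D Gaussian window."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax[None, :] ** 2 + ax[:, None] ** 2) / (2 * sigma ** 2))
    return g / g.sum()

# One initial rain atom: a streak tapered by a Gaussian, vectorized, unit l2 norm
patch = motion_kernel(16, 75) * gaussian(16, 4.0)
atom = patch.ravel() / np.linalg.norm(patch.ravel())
```

Shifted and rotated variants of such streak patches, stacked as columns, would form the 1:4 rain portion of the initial 256x640 dictionary.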

Algorithm Time Evaluation: To ensure a fair comparison of efficiency, all comparison methods were run on a unified environment: a Matlab 2015a computing platform running Windows 10, with hardware including an Intel Core i7-3770 3.4GHz CPU and 32GB of RAM on a desktop computer. After multiple tests and average runtime calculations, for a 256x256 color image, the algorithm proposed in this section takes 140 seconds, while the methods by Kang et al. and Kim et al. take 358 seconds and 252 seconds respectively. Therefore, our algorithm takes the least time.

To objectively evaluate the effectiveness of existing rain removal methods, a benchmark database needs to be established. This database should include original rainy images as input to the algorithms and original rainless images as evaluations of the algorithm outputs. Since it's difficult to obtain rainy and non-rainy images under the same conditions in real life—as external factors such as lighting changes, visibility, and air quality can cause differences beyond just rain—we will temporarily use a synthetic approach to establish the benchmark database. Clear images will be selected from outdoor scenes in standard image libraries, and then rain images will be artificially synthesized.

Selection of clear images: The image database selected in this paper is the UCID[8] uncompressed color image database. The database was designed to evaluate content-based image retrieval algorithms. It contains 1338 uncompressed color images. In the experiment, we manually removed the images of indoor scenes and objects, leaving some outdoor scene photos. We then randomly selected 200 of them as clear image source data, including buildings, landscape scenes, statues, grasslands, etc.

Artificial rain image synthesis: Existing rain image synthesis methods all rely on image editing software such as Photoshop, but generating rain images in batches for objective evaluation would require substantial manual effort in that software. We therefore use MATLAB to approximate each Photoshop step and generate rain images in batches. To ensure diversity and avoid unfairness due to algorithmic bias, the rain direction is chosen randomly between [specific direction] and [specific direction] during generation. Figure 1 shows some of the synthesized rain images.
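A rough Python analogue of this batch synthesis is sketched below (the drop density, streak length, and the shift-and-max streaking are our simplifications, not the exact MATLAB pipeline used in the paper):

```python
import numpy as np

def synthesize_rain(I, density=0.05, length=9, angle_deg=80, seed=0):
    """Sketch of batch rain synthesis: sparse bright noise, streaked along a
    given direction by shift-and-max, then screen-blended onto the clear
    image I (grayscale, values in [0, 1]) via Eq. (1)."""
    rng = np.random.default_rng(seed)
    h, w = I.shape
    # sparse random raindrop seeds with random brightness
    drops = (rng.random((h, w)) < density) * rng.uniform(0.4, 1.0, (h, w))
    # crude directional streaking: take the max over shifted copies
    t = np.deg2rad(angle_deg)
    R = drops.copy()
    for s in range(1, length):
        di = int(round(s * np.cos(t)))
        dj = int(round(s * np.sin(t)))
        R = np.maximum(R, np.roll(drops, (di, dj), axis=(0, 1)))
    return I + R - I * R  # screen blend keeps the result in [0, 1]
```

Varying `angle_deg` per image, as the text describes, prevents any method tuned to one rain direction from gaining an unfair advantage.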

Figure 1. Rain images synthesized from some of the clear images.

Objective evaluation criteria: Here we use two classic image quality evaluation criteria to objectively evaluate the image restored by the algorithm, namely Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM).
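PSNR is straightforward to compute, while SSIM is more involved; a standard SSIM implementation ships with scikit-image (`skimage.metrics.structural_similarity`). A minimal PSNR sketch for images scaled to [0, 1]:

```python
import numpy as np

def psnr(ref, out, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB between a reference image and an output."""
    ref = np.asarray(ref, dtype=float)
    out = np.asarray(out, dtype=float)
    mse = np.mean((ref - out) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

For instance, a uniform error of 0.1 per pixel gives an MSE of 0.01 and thus a PSNR of 20 dB.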

To verify the effectiveness of the proposed algorithm, we conducted a rain removal test on 200 composite rain images. The results are shown in Table 1, where the data represents the average value obtained from multiple images. It can be seen that the proposed algorithm can significantly improve the image quality of rain images.

Table 1. Rain removal results of the discriminative sparse coding model.

Table 2 lists the rain removal results of the proposed algorithm and the two comparison algorithms (Kim and Kang) on 200 rain images. The data in the table are averages over all images. Regardless of whether PSNR or SSIM is used as the objective metric, the proposed algorithm outperforms the other two algorithms on the synthetic data.

Table 2 Comparison of three rain removal algorithms on synthetic data

Summary

This paper addresses the image quality degradation caused by rain, a common real-world weather phenomenon, and proposes an effective method for rain removal. Through dictionary learning, this paper analyzes the attribute differences between rain images and clear images. By leveraging these attribute differences reflected in sparse coding coefficients, the rain image and clear image are directly separated using image signal separation. Experimental results demonstrate the effectiveness of the proposed algorithm on synthetic data.

Furthermore, the proposed rain removal method has certain limitations. First, it is not suitable for rain in images that is not linearly regular; when the rain in the image consists of large raindrops, the proposed method cannot effectively handle and eliminate these raindrops. Second, when there are many structures in the image that resemble raindrops in shape, the proposed method may misclassify them as rain layers. Future research should further refine the algorithm and apply this discriminative sparse coding model to other visual signal separation problems.

References

[1] L. W. Kang, C. W. Lin, and Y. H. Fu. Automatic single-image-based rain streaks removal via image decomposition. IEEE Transactions on Image Processing, 2012, 21(4): 1742-1755.

[2] J. H. Kim, C. Lee, J. Y. Sim, et al. Single-image deraining using an adaptive nonlocal means filter. ICIP, 2013: 914-917.

[3] S. C. Pei, Y. T. Tsai, and C. Y. Lee. Removing rain and snow in a single image using saturation and visibility features. IEEE International Conference on Multimedia and Expo Workshops, 2014: 1-6.

[4] D. Eigen, D. Krishnan, and R. Fergus. Restoring an image taken through a window covered with dirt or rain. ICCV, 2013: 633-640.

[5] P. C. Barnum, S. Narasimhan, and T. Kanade. Analysis of rain and snow in frequency space. International Journal of Computer Vision, 2010, 86(2-3): 256-274.

[6] V. Santhaseelan and V. K. Asari. Utilizing local phase information to remove rain from video. International Journal of Computer Vision, 2014, 112(1): 71-89.

[7] S. You, R. T. Tan, R. Kawakami, et al. Raindrop detection and removal from long range trajectories. Springer International Publishing, 2014: 569-585.

[8] G. Schaefer and M. Stich. UCID: an uncompressed colour image database. Electronic Imaging, International Society for Optics and Photonics, 2003: 472-480.
