GNGTS 2018 - 37° Convegno Nazionale

In this paper, we refer to a specific family of CNN architectures known as U-nets, which can be easily adapted as autoencoders. The name comes from the typical U-shape observable in Fig. 2, which consists of two paths: (i) the contracting path, interpreted as the encoder; (ii) the expansive path, interpreted as the decoder. Indeed, we can think of the trained U-net as an instrument implicitly providing a multi-scale/multi-resolution hidden representation, able to describe the complex features of the seismic data, in which the missing traces are not modeled. Using computer vision terminology, we can cast the interpolation task as an image transfer problem, whose goal is to transform corrupted gathers (denoted as Ī) into regularly sampled ones (denoted as I).

Implementation of the U-net for Data Interpolation. In order to focus on local portions of the gather and to ensure a sufficiently large amount of data under analysis, we propose to work in a patch-wise fashion. We divide each gather into K patches P̄_k of size 128 × 128, with stride 64 (a minimal patch-extraction sketch is given at the end of this section). Following the same rationale as Ronneberger et al. (2015), our architecture is composed of the blocks shown in Fig. 2:
• An input layer that computes the gradients of the corrupted patch P̄_k in both directions and concatenates them with the patch itself. This is done to learn information related to the local gradients, which are highly informative for the problem. Note that we fix [P̄_k]_ij = 0 in correspondence of the missing samples.
• Six stages, each performing a 2D convolution followed by batch normalization and Leaky ReLU. These stages lead to the hidden representation (i.e., the output of the encoder).
• Six stages, each performing a ReLU, cropping, and a 2D convolution followed by batch normalization and dropout. The output of these stages is an estimated patch P̃_k, of the same size as the input patch.
• A masking stage that fills the corrupted traces of P̄_k with the reconstructed traces of P̃_k. Precisely, the output P̂_k is computed as

P̂_k = P̄_k + P̃_k ⊙ M,

where ⊙ denotes the Hadamard product and M is a binary mask, i.e., [M]_ij = 1 if [P̄_k]_ij is a missing sample and [M]_ij = 0 otherwise.
The overall architecture counts almost 42 million parameters and thus, as is typical of Deep Learning solutions, needs to be trained on a significant amount of images (a sketch of the blocks above, including the masking stage, is given after this section).

System Training and Validation. Once the architecture is defined, the key point is designing the training strategy through the definition of a cost function tailored to the specific problem under analysis. Indeed, training consists in estimating the network weights w through the minimization of a distance metric defined between the network input and its output. This distance is usually referred to as the loss function, and its minimization is carried out using iterative techniques (e.g., stochastic gradient methods); an illustrative training step is also sketched below.

Fig. 2 - Architecture of the used U-net.
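As a concrete illustration of the patch-wise processing described above, the following minimal NumPy sketch splits a 2D gather into 128 × 128 patches with stride 64. The gather orientation (time samples × traces), the border handling, and the helper name `extract_patches` are assumptions, since the paper does not specify them.

```python
import numpy as np

def extract_patches(gather, size=128, stride=64):
    """Split a 2D gather into overlapping size x size patches
    taken with the given stride (assumed time samples x traces)."""
    nt, nx = gather.shape
    patches = []
    for i in range(0, nt - size + 1, stride):
        for j in range(0, nx - size + 1, stride):
            patches.append(gather[i:i + size, j:j + size])
    return np.stack(patches)  # shape (K, 128, 128)
```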
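The sketch below is one possible PyTorch rendering of the described blocks: gradient computation and concatenation at the input, six encoder stages (2D convolution, batch normalization, Leaky ReLU), six decoder stages (ReLU, upsampling convolution, batch normalization, dropout) with the skip connections typical of U-nets, and the final masking stage P̂_k = P̄_k + P̃_k ⊙ M. Channel counts, kernel sizes, strides, the Leaky ReLU slope, the dropout rate, and the use of transposed convolutions for upsampling are all assumptions (the paper only reports the ~42 million parameter total); cropping is omitted here because the chosen strides keep spatial sizes matched.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def gradients(x):
    # Finite-difference gradients of the patch in both directions,
    # zero-padded to preserve the 128 x 128 size; missing samples
    # are assumed to be already set to 0, as in the paper.
    dx = F.pad(x[..., :, 1:] - x[..., :, :-1], (0, 1, 0, 0))
    dy = F.pad(x[..., 1:, :] - x[..., :-1, :], (0, 0, 0, 1))
    return torch.cat([x, dx, dy], dim=1)  # (B, 3, 128, 128)


class UNetInterp(nn.Module):
    # Channel counts are assumptions chosen to land in the tens of
    # millions of parameters, as reported in the paper (~42M).
    def __init__(self, ch=(3, 64, 128, 256, 512, 512, 512)):
        super().__init__()
        # Encoder: six stages of strided 2D convolution + batch norm + Leaky ReLU.
        self.enc = nn.ModuleList(
            nn.Sequential(nn.Conv2d(ch[i], ch[i + 1], 4, 2, 1),
                          nn.BatchNorm2d(ch[i + 1]),
                          nn.LeakyReLU(0.2))
            for i in range(6))
        # Decoder: six stages of ReLU + transposed 2D convolution (an
        # assumption) + batch norm + dropout; sizes stay matched, so no
        # cropping is needed in this sketch.
        rev = ch[::-1][:-1]  # (512, 512, 512, 256, 128, 64)
        self.dec = nn.ModuleList()
        for i in range(6):
            cin = rev[i] if i == 0 else rev[i] * 2  # skip connections double channels
            cout = rev[i + 1] if i < 5 else 1
            self.dec.append(
                nn.Sequential(nn.ReLU(),
                              nn.ConvTranspose2d(cin, cout, 4, 2, 1),
                              nn.BatchNorm2d(cout),
                              nn.Dropout2d(0.5)))

    def forward(self, p_bar, mask):
        # p_bar: corrupted patch (B, 1, 128, 128); mask: 1 on missing samples.
        x = gradients(p_bar)
        skips = []
        for stage in self.enc:
            x = stage(x)
            skips.append(x)
        for i, stage in enumerate(self.dec):
            x = stage(x if i == 0 else torch.cat([x, skips[-1 - i]], dim=1))
        # Masking stage: P_hat = P_bar + P_tilde (Hadamard) M.
        return p_bar + x * mask
```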
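Finally, a hedged sketch of one training iteration. The paper only states that a distance metric (the loss) is minimized with iterative, stochastic-gradient-type techniques, so the mean squared error loss and the Adam optimizer used here are illustrative assumptions, as are the variable names.

```python
import torch

# Hypothetical training step for the UNetInterp sketch above.
model = UNetInterp()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(p_bar, mask, p_true):
    # p_bar: corrupted patch, mask: binary mask M, p_true: target patch.
    opt.zero_grad()
    p_hat = model(p_bar, mask)
    loss = torch.mean((p_hat - p_true) ** 2)  # assumed MSE loss
    loss.backward()
    opt.step()
    return loss.item()
```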
