Estimation of Acoustic Impedance from Seismic Data using Temporal Convolutional Network


Authors: Ahmad Mustafa, Motaz Alfarraj, Ghassan AlRegib

Citation
A. Mustafa, M. Alfarraj, and G. AlRegib, "Estimation of Acoustic Impedance from Seismic Data using Temporal Convolutional Network," Expanded Abstracts of the SEG Annual Meeting, San Antonio, TX, Sep. 15-20, 2019.

Review
Date of presentation: 18 Sep 2019

Data and Code
[Github Link]

Bib
@incollection{amustafa2019AI,
  title={Estimation of Acoustic Impedance from Seismic Data using Temporal Convolutional Network},
  author={Mustafa, Ahmad and AlRegib, Ghassan},
  booktitle={SEG Technical Program Expanded Abstracts 2019},
  year={2019},
  publisher={Society of Exploration Geophysicists}
}

Contact
amustafa9@gatech.edu OR alregib@gatech.edu
http://ghassanalregib.com/

Estimation of Acoustic Impedance from Seismic Data using Temporal Convolutional Network
Ahmad Mustafa*, Motaz Alfarraj, and Ghassan AlRegib
Center for Energy and Geo Processing (CeGP), Georgia Institute of Technology

SUMMARY

In exploration seismology, seismic inversion refers to the process of inferring physical properties of the subsurface from seismic data. Knowledge of physical properties can prove helpful in identifying key structures in the subsurface for hydrocarbon exploration. In this work, we propose a workflow for predicting acoustic impedance (AI) from seismic data using a network architecture based on the Temporal Convolutional Network, by posing the problem as one of sequence modeling. The proposed workflow overcomes some of the problems that other network architectures usually face, such as vanishing gradients in Recurrent Neural Networks and overfitting in Convolutional Neural Networks. The proposed workflow was used to predict AI on the Marmousi 2 dataset with an average r² coefficient of 91% on a hold-out validation set.

INTRODUCTION

Reservoir characterization workflows involve the estimation of physical properties of the subsurface, such as acoustic impedance (AI), from seismic data by incorporating knowledge of well-logs.
However, this is an extremely challenging task in most seismic surveys due to the non-linearity of the mapping from seismic data to rock properties. Attempts to estimate physical properties from seismic data have been made using supervised machine learning algorithms, where the network is trained on pairs of seismic traces and their corresponding physical property traces from well-logs. The trained network is then used to obtain a map of physical properties for the entire seismic volume.

Recently, there has been a lot of work integrating machine learning algorithms into the seismic domain (AlRegib et al., 2018). The literature shows successful applications of supervised machine learning algorithms to estimate petrophysical properties. For example, (Calderón-Macías et al., 1999) used Artificial Neural Networks to predict velocity from prestack seismic gathers, (Al-Anazi and Gates, 2012) used Support Vector Regression to predict porosity and permeability from core- and well-logs, and (Chaki et al., 2015) proposed novel preprocessing schemes based on algorithms like Fourier Transforms and Wavelet Decomposition before using the seismic attribute data to predict well-log properties. More recently, (Lipari et al., 2018) used Generative Adversarial Networks (GANs) to map migrated seismic sections to their corresponding reflectivity sections. (Biswas et al., 2018) used Recurrent Neural Networks to predict stacking velocity from seismic offset gathers. (Alfarraj and AlRegib, 2018) used Recurrent Neural Networks to invert seismic data for petrophysical properties by modeling seismic traces and well-logs as sequences. (Das et al., 2018) used Convolutional Neural Networks (CNNs) to predict P-impedance from normal-incidence seismic data.

One challenge in all supervised learning schemes is to use a network that can train well on a limited amount of training data and can also generalize beyond the training data.
Recurrent Neural Networks (RNNs) circumvent this problem by sharing their parameters across all time steps and by using their hidden state to capture long-term dependencies. However, they can be difficult to train because of the exploding/vanishing gradient problem. CNNs have great utility in capturing local trends in sequences, but in order to capture long-term dependencies, they need more layers (i.e., deeper networks), which in turn increases the number of learnable parameters. A network with a large number of parameters cannot be trained on limited training examples.

In this work, we use Temporal Convolutional Networks (TCNs) to model traces as sequential data. The proposed network is trained in a supervised learning scheme on seismic data and the corresponding rock property traces (from well-logs). The proposed workflow encapsulates the best features of both RNNs and CNNs, as it captures long-term trends in the data without requiring a large number of learnable parameters.

TEMPORAL CONVOLUTIONAL NETWORKS

One kind of sequence modeling task is to map a given sequence of inputs {x(0), ..., x(T − 1)} to a sequence of outputs {y(0), ..., y(T − 1)} of the same length, where T is the total number of time steps. The core idea is that this kind of mapping, described by Equation 1, can be represented by a neural network parameterized by Θ (i.e., F_Θ):

ŷ(t) = F(x(0), ..., x(t)),  ∀ t ∈ [0, T − 1]    (1)

Convolutional Neural Networks (CNNs) have been used extensively for sequence modeling tasks like document classification (Johnson and Zhang, 2015), machine translation (Kalchbrenner et al., 2016), audio synthesis (van den Oord et al., 2016), and language modeling (Dauphin et al., 2016).
More recently, (Bai et al., 2018) performed a thorough comparison of canonical RNN architectures with their simple CNN architecture, which they call the Temporal Convolutional Network (TCN), and showed that the TCN was able to convincingly outperform RNNs on various sequence modeling tasks.

A TCN is based on a series of dilated 1-D convolutions organized into Temporal Blocks. Each temporal block has the same basic structure: two convolution layers interspersed with weight normalization, dropout, and non-linearity layers. Figure 1 shows the organization of the various layers inside a temporal block.

The weight normalization layers reparameterize the weights of the network. Each weight parameter is split into two parameters, one specifying its magnitude and the other its direction.

Figure 1: The structure of a Temporal Block.

This kind of reparameterization, as (Salimans and Kingma, 2016) show, helps improve convergence. The dropout layers randomly zero out layer outputs, which helps prevent overfitting. The ReLU non-linearity layers allow the network to learn more powerful representations. Each convolution layer adds padding to the input so that the output is of the same size as the input. There is also a skip connection from the input to the output of each temporal block. A distinguishing feature of TCNs is their use of dilated convolutions, which allows the network to have a large receptive field, i.e., a large number of input samples contributing to each output sample. The dilation factor increases exponentially at each temporal block. With regular convolution layers, one would have to use a very deep network to ensure the network has a large receptive field.
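As a concrete illustration of these building blocks, the following is a minimal pure-Python sketch of a causal dilated 1-D convolution, a simplified single-channel temporal block, and the resulting receptive field. The weight-normalization and dropout layers of Figure 1 are omitted for brevity, and all function names here are our own, not from the paper's code.

```python
def dilated_conv1d(x, kernel, dilation):
    """Causal dilated 1-D convolution whose output has the same length
    as the input: x is left-padded with (k - 1) * dilation zeros, so
    each output sample y[t] depends only on x[0..t]."""
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = [0.0] * pad + list(x)
    return [
        sum(kernel[j] * padded[pad + t - j * dilation] for j in range(k))
        for t in range(len(x))
    ]

def relu(v):
    """Rectified linear unit applied elementwise."""
    return [max(0.0, a) for a in v]

def temporal_block(x, kernel, dilation):
    """Simplified temporal block: two dilated convolutions, each followed
    by a ReLU, plus the skip connection from input to output."""
    h = relu(dilated_conv1d(x, kernel, dilation))
    h = relu(dilated_conv1d(h, kernel, dilation))
    return [a + b for a, b in zip(h, x)]

def receptive_field(kernel_size, num_blocks, convs_per_block=2):
    """Receptive field of a stack of temporal blocks in which the i-th
    block uses dilation 2**i: each dilated convolution widens the field
    by (kernel_size - 1) * dilation samples."""
    rf = 1
    for i in range(num_blocks):
        rf += convs_per_block * (kernel_size - 1) * (2 ** i)
    return rf
```

Assuming the dilation doubles at each temporal block as described, a stack of 6 blocks with kernel size 5 (the settings reported later in this paper) sees `receptive_field(5, 6)` = 505 input samples per output sample, whereas the same stack with undilated convolutions would see only 49.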
On the other hand, using sequential dilated convolutions allows the network to look at large parts of the input without needing many layers. This enables TCNs to capture long-term trends better than RNNs. Skip connections in the TCN architecture help stabilize training in the case of deeper networks.

The concept of the receptive field sits at the core of TCNs. Smaller convolution kernel sizes with fewer layers give the network a smaller receptive field, which allows it to capture local variations in sequential data well. However, such a network fails to capture long-term trends. On the other hand, larger kernel sizes with more layers give the network a large receptive field that makes it good at capturing long-term trends, but not as good at preserving local variations. This is mainly due to the large number of successive convolutions, which dilutes this information. This is also why adding skip connections to each residual block helps overcome this drawback.

PROBLEM FORMULATION

Let X = [x_1, x_2, ..., x_N] be a set of post-stack seismic traces, where x_i is the i-th trace, and let Y = [y_1, y_2, ..., y_N] be the corresponding AI traces. A subset of X is input to the TCN in the forward propagation step, and the network predicts the corresponding AI traces. The predicted AI traces are then compared to the true traces in the training dataset. The error between them is computed and used to obtain the gradients, which in turn update the weights of the TCN in a step known as back-propagation. Repeated applications of forward propagation followed by back-propagation change the weights of the network to minimize the loss between the actual and predicted AI traces. We hypothesized that by treating both the stacked seismic trace x_n and the corresponding AI trace y_n as sequential data, we would be able to use the TCN architecture to learn the mapping F from seismic to AI.
The training of the network can then be written mathematically as the following optimization problem:

Θ̂ = argmin_Θ (1/N) Σ_{n=1}^{N} L(y_n, F_Θ(x_n))    (2)

where L is a distance function between the actual and predicted AI traces, F represents the forward propagation of the TCN on the input seismic to generate the corresponding predicted AI trace, and Θ represents the network weights.

METHODOLOGY

The network architecture used is shown in Figure 2. The seismic traces are passed through a series of temporal blocks. The output of the TCN is concatenated with the input seismic and then mapped to the predicted AI using a linear layer. As discussed earlier, when using a larger kernel size with more layers, the network captures the low-frequency trend but not the high-frequency fluctuations. On the other hand, with a smaller kernel size and fewer layers, the network captures the high frequencies but fails to capture the smoother trend. This is why we concatenated the original seismic directly with the output of the TCN, so that any loss of high-frequency information due to successive convolutions in the temporal blocks might be compensated for. We found that this slightly improved the quality of our results. We experimented with different kernel sizes and numbers of layers, and found that the numbers reported in Figure 2 worked best in terms of capturing both high- and low-frequency content.

Training the network

There is a total of 2721 seismic traces and corresponding AI traces from the Marmousi model over a total length of 17000 m. We sampled both the seismic section and the model at intervals of 937 m to obtain a total of 19 training traces (≤ 1% of the total number of traces). We chose Mean Squared Error (MSE) as the loss function. Adam was used as the optimizer with a learning rate of 0.001 and a weight decay of 0.0001. We used a dropout of 0.2, a kernel size of 5, and 6 temporal blocks.
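To make the optimization in Equation 2 concrete, here is a toy pure-Python sketch of one training iteration for a hypothetical one-parameter model F_Θ(x) = Θ·x under the MSE loss, using plain gradient descent. The paper itself trains a TCN with Adam (learning rate 0.001, weight decay 0.0001); this stand-in model and its function names are purely illustrative.

```python
def mse(y_true, y_pred):
    """Mean squared error between an actual and a predicted trace."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def gradient_step(theta, pairs, lr):
    """One forward/backward pass for the toy model F_theta(x) = theta * x:
    accumulate the gradient of the average MSE over all (seismic, AI)
    training pairs, then move theta against the gradient."""
    grad, count = 0.0, 0
    for x, y in pairs:
        for xi, yi in zip(x, y):
            grad += 2.0 * (theta * xi - yi) * xi
            count += 1
    return theta - lr * grad / count

# The trace subsampling described above: keeping one training trace
# every 937 m along the 17000 m section selects 19 of the 2721 traces.
train_positions = list(range(0, 17000, 937))
```

Iterating `gradient_step` drives Θ toward the minimizer of the average loss, exactly the role that repeated forward/back-propagation plays for the TCN's weights.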
The TCN internally also uses weight normalization to improve training and speed up convergence. We trained the network for 2941 epochs, which took about 5 minutes on an NVIDIA GTX 1050 GPU. Once the network had been trained, inference on the whole seismic section was fast, taking only a fraction of a second.

Figure 2: TCN architecture for predicting AI. The TCN consists of a series of 6 temporal blocks, with the input and output channels of each specified in parentheses: (1,3), (3,5), (5,5), (5,5), (5,5), (5,6). The TCN output is concatenated with the input seismic and mapped to the predicted AI by a (7,1) linear layer.

RESULTS AND DISCUSSION

Figure 3 shows the predicted and actual AI, along with the absolute difference between the two. The predicted and actual AI sections show a high degree of visual similarity. The TCN is able to delineate most of the major structures. The difference image also shows that most of the discrepancy lies at the edge boundaries, which is because of sudden transitions in AI that the network is not able to predict accurately.

Figure 3: Comparison of the predicted and true acoustic impedance sections of the Marmousi 2 model, along with the absolute difference. (a) Predicted AI, (b) True AI, (c) Absolute Difference.

We also show traces at 3400 m, 6800 m, 10200 m, and 13600 m in Figure 4. As can be seen, the true and estimated AI traces at each location agree with each other to a large extent. Figure 5 shows a scatter plot of the true and estimated AI; it shows a strong linear correlation between the true and estimated AI sections.

Figure 4: Comparison of the predicted and true acoustic impedance traces at selected locations along the horizontal axis.
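The agreement visible in the trace comparisons and the scatter plot can be quantified with two standard metrics, the Pearson correlation coefficient and the coefficient of determination. Below is a minimal pure-Python sketch of both (the function names are our own, not from the paper's code):

```python
def pearson_r(a, b):
    """Pearson correlation coefficient: the covariance of the two traces
    divided by the product of their standard deviations."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot, i.e. the
    fraction of the true trace's variance explained by the prediction."""
    m = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Note the distinction between the two: a prediction that is a scaled or shifted version of the truth can still achieve PCC = 1, while r² also penalizes such amplitude and bias errors, which is why both are reported.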
For a quantitative evaluation of the results, we computed the Pearson correlation coefficient (PCC) and the coefficient of determination between the estimated and true AI traces. PCC is a measure of the overall linear correlation between two traces. The coefficient of determination (r²) is a measure of the goodness of fit between two traces. The averaged values are shown in Table 1 for the training dataset and for the entire section (labeled as validation data). As can be seen, both the training and validation traces report high values for the PCC and r² coefficients, which confirms that the network was able to learn to predict AI from seismic traces and to generalize beyond the training data.

Figure 5: Scatter plot of the true and estimated AI.

Metric   Training   Validation
PCC      0.96       0.96
r²       0.91       0.91

Table 1: Performance metrics for both the training and validation datasets.

CONCLUSION

In this work, we proposed a novel scheme for predicting acoustic impedance from seismic data using a Temporal Convolutional Network. The results were demonstrated on the Marmousi 2 model. The proposed workflow was trained on 19 training traces and was then used to predict acoustic impedance for the entire Marmousi model. Quantitative evaluation of the predicted AI (PCC ≈ 0.96 and r² ≈ 0.91) shows the great promise of the proposed workflow for acoustic impedance prediction. Even though the proposed workflow has been used for AI estimation in this paper, it can be used to predict other properties as well. Indeed, Temporal Convolutional Networks can be adapted to any problem that requires mapping one sequence to another.

ACKNOWLEDGEMENTS

This work is supported by the Center for Energy and Geo Processing (CeGP) at Georgia Institute of Technology and King Fahd University of Petroleum and Minerals (KFUPM).

REFERENCES

Al-Anazi, A., and I.
Gates, 2012, Support vector regression to predict porosity and permeability: Effect of sample size: Computers & Geosciences, 39, 64–76.
Alfarraj, M., and G. AlRegib, 2018, Petrophysical property estimation from seismic data using recurrent neural networks, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 2141–2146.
AlRegib, G., M. Deriche, Z. Long, H. Di, Z. Wang, Y. Alaudah, M. A. Shafiq, and M. Alfarraj, 2018, Subsurface structure analysis using computational interpretation and learning: A visual signal processing perspective: IEEE Signal Processing Magazine, 35, 82–98.
Bai, S., J. Z. Kolter, and V. Koltun, 2018, An empirical evaluation of generic convolutional and recurrent networks for sequence modeling: CoRR, abs/1803.01271.
Biswas, R., A. Vassiliou, R. Stromberg, and M. Sen, 2018, Stacking velocity estimation using recurrent neural network: 2241–2245.
Calderón-Macías, C., M. Sen, and P. Stoffa, 1999, Automatic NMO correction and velocity estimation by a feedforward neural network: Geophysics, 63.
Chaki, S., A. Routray, and W. K. Mohanty, 2015, A novel preprocessing scheme to improve the prediction of sand fraction from seismic attributes using neural networks: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 8, 1808–1820.
Das, V., A. Pollack, U. Wollner, and T. Mukerji, 2018, Convolutional neural network for seismic impedance inversion, in SEG Technical Program Expanded Abstracts 2018: 2071–2075.
Dauphin, Y. N., A. Fan, M. Auli, and D. Grangier, 2016, Language modeling with gated convolutional networks: CoRR, abs/1612.08083.
Johnson, R., and T. Zhang, 2015, Semi-supervised convolutional neural networks for text categorization via region embedding, in Advances in Neural Information Processing Systems 28: Curran Associates, Inc., 919–927.
Kalchbrenner, N., L. Espeholt, K. Simonyan, A. van den Oord, A. Graves, and K.
Kavukcuoglu, 2016, Neural machine translation in linear time: CoRR, abs/1610.10099.
Lipari, V., F. Picetti, P. Bestagini, and S. Tubaro, 2018, A generative adversarial network for seismic imaging applications: Presented at the .
Salimans, T., and D. P. Kingma, 2016, Weight normalization: A simple reparameterization to accelerate training of deep neural networks: CoRR, abs/1602.07868.
van den Oord, A., S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. W. Senior, and K. Kavukcuoglu, 2016, WaveNet: A generative model for raw audio: CoRR, abs/1609.03499.
