Predicting ice flow using machine learning
Authors: Yimeng Min, S. Karthik Mukkavilli, Yoshua Bengio
Abstract

Though machine learning has achieved notable success in modeling sequential and spatial data for speech recognition and computer vision, applications to remote sensing and climate science problems are seldom considered. In this paper, we demonstrate techniques from unsupervised learning of future video frame prediction to increase the accuracy of ice flow tracking in multi-spectral satellite images. As the volume of cryosphere data increases in the coming years, this is an interesting and important opportunity for machine learning to address a global challenge for climate change, risk management from floods, and conserving freshwater resources. Future frame prediction of ice melt and tracking the optical flow of ice dynamics present modeling difficulties, due to uncertainties in global temperature increase, changing precipitation patterns, occlusion from cloud cover, and rapid melting and glacier retreat caused by black carbon aerosol deposition from wildfires or human fossil fuel emissions. We show that an adversarial learning method helps improve the accuracy of tracking the optical flow of ice dynamics compared to existing methods in climate science. We present a dataset, IceNet, to encourage machine learning research and to help facilitate further applications in the areas of cryospheric science and climate change.

Affiliation: Mila - Quebec AI Institute, Université de Montréal, Montreal, Canada. Submitted to the NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning. Correspondence to: S. Karthik Mukkavilli <mukkavis@mila.quebec>.

1. Introduction

Recent developments in the climate sciences, satellite remote sensing, and high-performance computing are enabling new advancements that can leverage the latest machine learning techniques. Petabytes of data are being produced by a new generation of Earth observation satellites, and the commercialization of the space industry is driving the cost of acquiring data down further. Such large geoscience datasets, coupled with the latest supercomputer simulation outputs from global climate model intercomparison projects, are now available for machine learning applications (Kay et al., 2015; Schneider et al., 2017; Reichstein et al., 2019).

In a related study, O'Gorman & Dwyer (2018) used random forests to parameterize moist convection processes and successfully emulate physical processes from expensive climate model outputs. Scher (2018) was likewise able to faithfully approximate the dynamics of a simple climate model with deep learning after being presented with enough data. Recently, machine learning has demonstrated promise in resolving the largest source of uncertainty in climate projections (Sherwood et al., 2014; IPCC, 2018): cloud convection (Rasp et al., 2018; Gentine et al., 2018). Rasp et al. (2018) and Gentine et al. (2018) demonstrated the use of deep learning to emulate sub-grid processes and resolve model clouds within simplified climate models at a fraction of the computational cost of high-resolution physics models.

These developments provide machine learning researchers with opportunities to build models of ice flow and dynamics from satellite data, or to use new video prediction techniques (Mathieu et al., 2016; Denton & Fergus, 2018) to predict changes in glaciers and ice dynamics. In this work, our main contributions are (1) the development of an unsupervised learning model to track ice sheet and glacier dynamics; and (2) the introduction of IceNet, a dataset that we make available to the community as a first step in bridging gaps between machine learning and cryosphere climate research.

2. Dataset

In this paper, we investigate seven bands ranging from 0.43 µm to 2.29 µm (visible, near-infrared and shortwave light) with a resolution of 30 meters.
The details of LANDSAT 8 can be found at https://www.usgs.gov/land-resources/nli/landsat. In our dataset, we focus on a particular area of Antarctica at latitude 80°01'25" South and longitude 153°11'10" East, where the ice flow's moving pattern is dominated by the Byrd Glacier (path 54, row 118 in Worldwide Reference System-2). The region is shown in Figure 1.

Figure 1. (a) The region our dataset investigates. (b) Coastal signal (Band 1, 0.43 µm to 0.45 µm) collected by LANDSAT 8 on 22 November 2015. The four corners contain no information.

Our dataset contains satellite images ranging from November 2015 to February 2017, with 10675 images in total; every image has 12 frames of shape 128 by 128. The interval between frames ranges from two weeks to nine months, and each pixel represents a 30 meter by 30 meter region.

2.1. Labels

The images are denoted as F_i, where i runs from 1 to 12, and the frames (subscenes) in each image are x_i^j ∈ R^{128×128}, where i ∈ {1, ..., 12} and j ∈ {1, ..., 1525}. To find the next subscene, or chip, that best matches x_{i-1}^j, we compare x_{i-1}^j to a range of candidate regions by calculating the correlation between two chips:

    C_I(r, s) = \frac{\sum_{mn} (r_{mn} - \mu_r)(s_{mn} - \mu_s)}{\left[\sum_{mn} (r_{mn} - \mu_r)^2\right]^{1/2} \left[\sum_{mn} (s_{mn} - \mu_s)^2\right]^{1/2}}    (1)

where r and s are the two images and µ denotes the mean value. The ice flow is not static, and the moving areas of the large ice sheets remain a challenge for tracing the ice flow.

Figure 2. A larger subscene is selected in case the previous subscene moves outside the original grid.
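Equation (1) is the standard normalized cross-correlation between two chips, and the label search compares a chip against a range of candidate placements. The following is a minimal NumPy sketch of both steps; the function and variable names are ours, not taken from the paper's code:

```python
import numpy as np

def chip_correlation(r, s):
    """Normalized cross-correlation C_I(r, s) from Eq. (1) for two
    equal-shape 2-D chips; returns a value in [-1, 1]."""
    r = r - r.mean()
    s = s - s.mean()
    denom = np.sqrt((r ** 2).sum() * (s ** 2).sum())
    return float((r * s).sum() / denom)

def best_match(search_region, chip, stride=1):
    """Slide `chip` over `search_region` and return (row, col, score)
    of the placement with the highest correlation -- the candidate
    chosen as the next subscene (the ground-truth label)."""
    h, w = chip.shape
    H, W = search_region.shape
    best = (0, 0, -1.0)
    for i in range(0, H - h + 1, stride):
        for j in range(0, W - w + 1, stride):
            score = chip_correlation(search_region[i:i + h, j:j + w], chip)
            if score > best[2]:
                best = (i, j, score)
    return best
```

Identical chips correlate at exactly 1.0; in practice the search region is the area around the previous subscene, enlarged by the scale factor c described below.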
To tackle surface feature movement, we select a larger area, scaled by a factor c (c > 1) and centred around the previous subscene, in case the pattern moves outside the previous grid; the most correlated candidate is chosen as the next subscene (the ground truth). The pipeline is shown in Figure 2.

3. Model

We use stochastic video generation with a learned prior for prediction. The prior network observes frames x_{1:t-1} and outputs the parameters µ_ψ(x_{1:t-1}) and σ_ψ(x_{1:t-1}) of a normal distribution, and the model is trained by maximizing:

    \mathcal{L}_{\theta,\phi,\psi}(x_{1:T}) = \sum_{t=1}^{T} \Big[ \mathbb{E}_{q_\phi(z_{1:t} \mid x_{1:t})} \log p_\theta(x_t \mid x_{1:t-1}, z_{1:t}) - \beta\, D_{KL}\big(q_\phi(z_t \mid x_{1:t}) \,\|\, p_\psi(z_t \mid x_{1:t-1})\big) \Big]    (2)

where p_θ, q_φ and p_ψ are produced by convolutional LSTMs. q_φ and p_ψ denote the normal distributions drawn from x_t and x_{t-1} respectively, and p_θ is generated by encoding x_{t-1} together with z_t. The subscene x̂_t is generated by a decoder with a deep convolutional GAN architecture, by sampling a prior z_t from the latent space drawn from the previous subscenes combined with the last subscene x_{t-1}. After decoding, the predicted subscene is passed back to the input of the prediction model and the prior. The latent variable z_t is drawn from p_ψ(z_t | x_{1:t-1}).

The loss of our model contains three parts: the KL divergence of the prior loss D_{KL}, an ℓ2 penalty between x̂_t and x_t, and an additional ℓ2 penalty on the area centred around the peak of every subscene. The prediction results vary with the weight of the ℓ2 penalty on the peak: when the weight is too small, the model may ignore the low frequencies of the subscene and x̂_t predicts only the noisy small textures (crevasses) of the ice flow. When the weight is increased, the model predicts the peak regions but fails to generate the small textures of the ice flow.

4. Experiment Results and Discussion

We train our model with z ∈ R^{128} and 2 LSTM layers, each with 128 units.
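The β-weighted term in Eq. (2) above is a KL divergence between the inference distribution q_φ and the learned prior p_ψ, both diagonal Gaussians, so it has a closed form. A NumPy sketch of that single term, under our own naming (the full objective also requires the reconstruction likelihood and the recurrent encoder/decoder networks, which are omitted here):

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians,
    summed over latent dimensions -- the beta-weighted term in Eq. (2).
    Means and log-variances are per-dimension arrays."""
    mu_q, logvar_q = np.asarray(mu_q, float), np.asarray(logvar_q, float)
    mu_p, logvar_p = np.asarray(mu_p, float), np.asarray(logvar_p, float)
    var_q, var_p = np.exp(logvar_q), np.exp(logvar_p)
    kl = 0.5 * (logvar_p - logvar_q
                + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)
    return float(kl.sum())
```

The term vanishes when the inference network and the prior agree exactly, so minimizing it pulls the learned prior toward the posterior over z_t.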
By conditioning on the past eight subscenes, the results of our model on different types of subscenes are shown in Figures 3 and 4. For ice flow patterns with moderate slopes (not too steep), e.g. lines 2 and 6 in Figure 4, the machine learning model can reproduce the slopes' shapes and positions, successfully correlating the two subscenes. In our experiments, the capability of reproducing small textures grows as the hidden space and batch size are enlarged. However, the high pass filter's performance differs between these two examples: in line 2, the high pass model draws the textures from t_0 and t_2, and since the high pass filter's results are close to binary, the two subscenes correlate as long as the textures are extracted. For line 6, however, the filter on t_0 generates noisy signals and the correlation fails. Another example where the high pass filter fails is line 3, where the earlier frame t_0 does not capture the texture information (the satellite signal is affected by cloud); in this case, the filtered subscene lacks the key information needed to correlate with filter_2.

The machine learning model avoids the contaminated t_0 by generating parameters learned from a range of past subscenes. In this case, although some of the past subscenes' signals are contaminated, the model can still reproduce the slope and the small patterns, as shown in line 3 of Figure 3. By adding a proper weight around the peak area, the model successfully reproduces the peak and learns the small textures from previous subscenes, as shown in lines 3 and 4.

Figure 3. Subscenes generated with different models. The first three columns: the past three subscenes; fourth column: the machine-learning-predicted next subscene; fifth column: high pass of t_0; sixth and seventh columns: the subscenes at t_1 and t_2; eighth column: high pass of t_2.
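The paper does not specify the exact high pass filter used for the baseline, so the following is only an illustrative stand-in: subtracting a local-mean (box-blurred) copy of the subscene and thresholding produces the kind of near-binary texture map described above.

```python
import numpy as np

def box_blur(img, k=5):
    # Simple k x k box blur (local mean) using padded sliding sums,
    # NumPy only -- the low-frequency component of the subscene.
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for di in range(k):
        for dj in range(k):
            out += padded[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out / (k * k)

def high_pass_binary(subscene, k=5):
    """Illustrative high-pass baseline (our stand-in, not the paper's
    exact filter): subtract the local mean and threshold at zero,
    yielding a near-binary map of fine textures such as crevasses."""
    return (subscene - box_blur(subscene, k) > 0).astype(np.uint8)
```

A flat subscene maps to all zeros, while textured regions produce the noisy binary pixels that, as discussed below, help correlation on crevassed flats but hurt it elsewhere.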
The model also generates a continuous range of pixel values, which helps reduce the correlation error; this differs from the binary result generated by the high pass filter. For flat regions with complex textures, e.g. lines 1 and 7, the persistence model correlates when both t_0 and t_2 capture the patterns (i.e. are not affected by cloud).

The overall correlation map is shown in Figure 4 and the statistical results are shown in Table 1. The machine learning model improves the overall mean correlation¹ compared with the persistence and high pass filter models. For some flat regions with clear patterns such as crevasses (an example is line 7 in Figure 3), the high pass filter correlates better. However, these kinds of regions account for only a small percentage of the area we investigate, so the high pass filter improves the high-correlation regions while performing worse overall due to its noisy binary pixels. The machine learning model enlarges the medium-correlation regions by generating continuous pixels and the peak area, and by learning from a range of past frames instead of just t_0.

¹ The mean correlation is computed over the non-zero correlation subscenes.

Figure 4. The correlation maps. (a) persistence model (correlation between t_0 and t_2); (b) high frequency model (correlation between filter_0 and filter_2); (c) machine learning model (correlation between the prediction and t_2).

Table 1. Results of the three models.

                      Persistence (last frame)   High-pass filter   Machine learning
Mean correlation      0.237                      0.201              0.362
Low (< 0.3)           0.699                      0.598              0.393
Medium (0.3-0.7)      0.271                      0.337              0.557
High (> 0.7)          0.0300                     0.0651             0.0504

5. Conclusions and Future Work

We present the IceNet dataset and encourage the machine learning community to pay more attention to socially and scientifically relevant datasets in the cryosphere and to develop new models to help combat climate change. We also use an unsupervised learning model to predict future ice flow.
Compared to the high pass filter and persistence models, our model correlates the past and present ice flow better. Our model could be further improved if more physical and environmental parameters were introduced, for example wind speed and the aerosol optical depth of the atmosphere. The first parameter provides a trend for the ice flow movement, and the second gives a confidence factor for the quality of the satellite images; dropout of particular frames can be applied if the aerosol optical depth rises above a threshold. Furthermore, black carbon aerosols have been found to accelerate ice loss and glacier retreat in the Himalayas and the Arctic, from both wildfire soot deposition and fossil fuel emissions. Detailed analysis of the feedback effects of 'black ice' would be a future avenue of research.

The images of the IceNet dataset are very different from traditional video datasets such as Moving MNIST: some areas are dominated by 'small textures' while others are smooth areas with peaks. This suggests that the transfer capability of existing models needs to be investigated further, or that new models need to be developed for predicting ice flow on the different types of terrain around the planet.

References

Denton, Emily and Fergus, Rob. Stochastic video generation with a learned prior. arXiv preprint arXiv:1802.07687, 2018.

Gentine, P., Pritchard, M., Rasp, S., Reinaudi, G., and Yacalis, G. Could machine learning break the convection parameterization deadlock? Geophysical Research Letters, 45:5742-5751, 2018. doi: 10.1029/2018GL078202.

IPCC. Global warming of 1.5°C.
An IPCC special report on the impacts of global warming of 1.5°C above pre-industrial levels and related global greenhouse gas emission pathways, in the context of strengthening the global response to the threat of climate change, sustainable development, and efforts to eradicate poverty [V. Masson-Delmotte, P. Zhai, H. O. Pörtner, D. Roberts, J. Skea, P. R. Shukla, A. Pirani, Y. Chen, S. Connors, M. Gomis, E. Lonnoy, J. B. R. Matthews, W. Moufouma-Okia, C. Péan, R. Pidcock, N. Reay, M. Tignor, T. Waterfield, X. Zhou (eds.)]. 2018.

Kay, J., Deser, C., Phillips, A., Mai, A., Hannay, C., Strand, G., Arblaster, J. M., Bates, S. C., Danabasoglu, G., Edwards, J., Holland, M., Kushner, P., Lamarque, J.-F., Lawrence, D., Lindsay, K., Middleton, A., Munoz, E., Neale, R., Oleson, K., Polvani, L., and Vertenstein, M. The Community Earth System Model (CESM) Large Ensemble project. Bulletin of the American Meteorological Society, 96(8):1333-1349, 2015. doi: 10.1175/BAMS-D-13-00255.1.

Mathieu, Michael, Couprie, Camille, and LeCun, Yann. Deep multi-scale video prediction beyond mean square error. arXiv preprint, 2016.

O'Gorman, Paul A. and Dwyer, John G. Using machine learning to parameterize moist convection: Potential for modeling of climate, climate change, and extreme events. Journal of Advances in Modeling Earth Systems, 10(10):2548-2563, 2018.

Rasp, Stephan, Pritchard, Michael S., and Gentine, Pierre. Deep learning to represent subgrid processes in climate models. Proceedings of the National Academy of Sciences, 115(39):1-6, 2018. doi: 10.1073/pnas.1810286115.

Reichstein, Markus, Camps-Valls, Gustau, Stevens, Bjorn, Jung, Martin, Denzler, Joachim, and Carvalhais, Nuno. Deep learning and process understanding for data-driven Earth system science. Nature, 566:195-204, 2019. ISSN 1476-4687. doi: 10.1038/s41586-019-0912-1. URL http://dx.doi.org/10.1038/s41586-019-0912-1.

Scher, Sebastian.
Toward data-driven weather and climate forecasting: Approximating a simple general circulation model with deep learning. Geophysical Research Letters, 45(22):12-616, 2018.

Schneider, Tapio, Lan, Shiwei, Stuart, Andrew, and Teixeira, João. Earth system modeling 2.0: A blueprint for models that learn from observations and targeted high-resolution simulations. Geophysical Research Letters, 44:12396-12417, 2017. doi: 10.1002/2017GL076101.

Sherwood, Steven C., Bony, Sandrine, and Dufresne, Jean-Louis. Spread in model climate sensitivity traced to atmospheric convective mixing. Nature, 505:37-42, 2014. doi: 10.1038/nature12829.