Predicting cell phone adoption metrics using satellite imagery

Predicting cell phone adoption metrics using machine learning and satellite imagery

Edward J. Oughton 1* and Jatin Mathur 2*

1 College of Science, George Mason University, Fairfax, VA
2 Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, USA
*Both authors contributed equally to the manuscript and share joint first authorship

Corresponding authors: Edward J. Oughton (e-mail: eoughton@gmu.edu); Jatin Mathur (e-mail: jatinm2@illinois.edu)
Corresponding address: GGS, George Mason University, 4400 University Drive, Fairfax, VA

Abstract

Approximately half of the global population does not have access to the internet, even though digital connectivity can reduce poverty by revolutionizing economic development opportunities. Due to a lack of data, Mobile Network Operators and governments struggle to effectively determine if infrastructure investments are viable, especially in greenfield areas where demand is unknown. This leads to a lack of investment in network infrastructure, resulting in a phenomenon commonly referred to as the ‘digital divide’. In this paper we present a machine learning method that uses publicly available satellite imagery to predict telecoms demand metrics, including cell phone adoption and spending on mobile services, and apply the method to Malawi and Ethiopia. Our predictive machine learning approach consistently outperforms baseline models which use population density or nightlight luminosity, with an improvement in data variance prediction of at least 40%. The method is a starting point for developing more sophisticated predictive models of infrastructure demand using machine learning and publicly available satellite imagery. The evidence produced can help to better inform infrastructure investment and policy decisions.

Key Words: Cell phone adoption, deep learning, image recognition, satellite imagery.

1. Introduction

How do we predict local cell phone adoption?
And, could the combined purchasing power of households viably attract new digital infrastructure investment, for example, for 4G or 5G? Currently, digital ecosystem actors such as governments, regulators, and development agencies, as well as many Mobile Network Operators (MNOs), lack this vital insight in unconnected areas. Hence, infrastructure deployment is perceived as riskier in these places, often preventing needed investment and leading to a ‘digital divide’ between those with voice and data access, and those without. Ultimately, having internet access allows users to participate in the digital economy, providing revolutionary economic and societal opportunities, while those without access are left behind.

The United Nations’ Sustainable Development Goals (SDGs) provide a vision for achieving a better future for all which can be sustained over the long term (United Nations, 2019), and SDG target 9.c places a special focus on delivering universal and affordable broadband to help reduce poverty. Over many decades, Information and Communications Technologies (ICTs) have been seen as a key way to enable digitally-led economic development and help deliver the SDGs, potentially lifting millions out of poverty (Haile et al., 2019; Mansell, 2001, 1999; Mansell and Wehn, 1998). Hence, solving the digital divide is critical to this mission.

Universal broadband remains an ambitious goal even in established frontier economies such as the United States, demonstrating the challenge of achieving viable internet economics in rural and remote areas (Claffy and Clark, 2019). Particularly in the current era of the digital economy, users do not just require a stable 2G voice connection, but also a reliable data connection to enable functionality for the existing range of applications and services.
4G is the most common data connectivity technology, but this will increasingly be supplemented by 5G over the next decade (Hidalgo et al., 2020; Oughton et al., 2020; Tchamyou et al., 2019).

The importance of the analysis reported in this paper is highlighted when we consider the vast quantities of capital invested into digital divide projects globally every year. As just one example, the World Bank’s Digital Development program aims to provide the necessary knowledge and financing to help close the global digital divide, ensuring countries can take full advantage of the adoption of internet-based technologies for economic development purposes. Between 2015 and 2019, the World Bank invested almost $3.8 billion (USD) in ICT projects (World Bank, 2019), with over $1.1 billion going to countries on the African continent, as outlined in Table 1.

Table 1 World Bank ICT financing (World Bank, 2019)

| Region | 2015 ($m) | 2016 ($m) | 2017 ($m) | 2018 ($m) | 2019 ($m) | Total ($m) |
|---|---|---|---|---|---|---|
| Africa | 159 | 44 | 274 | 226 | 471 | 1,174 |
| East Asia & Pacific | 69 | - | 207 | 80 | 140 | 496 |
| Europe & Central Asia | 42 | 39 | 8 | 129 | 116 | 334 |
| Latin America & Caribbean | 48 | - | 122 | 13 | 46 | 229 |
| Middle East & North Africa | - | 145 | 183 | 62 | 308 | 698 |
| South Asia | 38 | 43 | 228 | 232 | 310 | 851 |
| Annual lending | 356 | 271 | 1,022 | 742 | 1,391 | 3,782 |

With such vast investments targeting the digital divide, there is strong motivation to develop data-driven broadband strategies to inform both telecom investment decisions and policies (Feijóo et al., 2020; Feijóo and Kwon, 2020; Taylor and Schejter, 2013). The supply-side cost of building new digital infrastructure has been well researched. It combines the equipment cost (which does not vary much between countries) with country-specific cost factors relating to geography, labor and taxation (Chiaraviglio et al., 2017; Jha and Saha, 2017; Oughton and Russell, 2020; Ovando et al., 2015).
In contrast, on the demand-side, there is considerable spatial heterogeneity in adoption from one local area to the next, dependent on a range of socio-economic, demographic and cultural factors (Blank et al., 2018; Francis et al., 2019; Kabbiri et al., 2018; Oughton et al., 2015; Rhinesmith et al., 2019; Whitacre et al., 2015).

In an ideal situation, potential MNOs, infrastructure investors and policy decision-makers would have a good knowledge of the potential demand in a local area, covering both barriers and pent-up demand to aid business and planning decisions (Martínez-Domínguez and Mora-Rivera, 2020; Mossberger et al., 2012; Owusu-Agyei et al., 2020; Reddick et al., 2020; Rosston and Wallsten, 2020; Taufique et al., 2017). Information would be collected via on-the-ground survey methods and cover (i) existing cell phone ownership for basic, featurephone and smartphone devices, (ii) the current Average Revenue Per User (ARPU), and (iii) the potential willingness to pay for new services. This information could be combined to estimate initial and future potential revenue for MNOs, and the effective magnitude of required state subsidies to make deployment viable.

While these surveys can provide rich context for informing digital divide policies, they can be expensive to undertake due to labor-intensive methods. A survey also only provides information for a single point in space. Data must therefore often be generalized from those few areas with known demand information to unknown locations, introducing uncertainty which can affect network design, private financing, and government support through schemes such as network subsidies. Ideally, we would like a predictive model that can better inform this data generalization process. Thus, there is motivation to explore new analytical options for quantifying the digital divide and providing improved evidence to design policies to reduce digital inequalities.
Such evidence is essential for governments and international aid agencies (Maitland et al., 2018).

Recently, there has been much development around the use of machine learning techniques to enhance telecom decision making (Balmer et al., 2020; Righi et al., 2020; Vesnic-Alujevic et al., 2020). However, this poses a significant challenge for digital divide researchers because there are very few critical assessments of the effectiveness of these techniques. Indeed, researchers should not accept machine learning conjecture without independent quantitative assessment of these methods, with such assessments conforming to the highest standards of scientific reproducibility. Considering these issues, this paper addresses a single research question.

How effective are different techniques at predicting cell phone adoption metrics from satellite imagery, such as device penetration and monthly spending on telephone services?

In answering this question, the key contributions of this paper include:

1. Providing a validated method for predicting cell phone adoption metrics from satellite images.
2. Evaluating independent quantitative data on the effectiveness of machine learning techniques, compared with existing approaches.
3. Developing a documented open-source codebase for the digital divide community to access, reproduce the results and further develop the method, via the Telecom Analytics for Demand using Deep Learning online repository:

Having articulated the main contributions, the structure of the paper is now outlined. Section 2 is a literature review focusing on the digital divide literature, and existing metric prediction from satellite imagery. Section 3 details the method employed, before reporting results in Section 4. Limitations with the method are covered in Section 5. Finally, a discussion is undertaken in Section 6 and conclusions are presented in Section 7.

2. Literature review

Two areas of literature are pertinent to the research question: issues associated with market failure and the digital divide, and existing analyses which have used satellite imagery to predict metrics of interest.

2.1. Market failure and the digital divide

Mobile network infrastructure is generally delivered via market-based methods. Indeed, evidence suggests that market competition combined with a suitable regulatory environment is positively correlated with telecom performance and better consumer outcomes, encouraging such an approach (Bauer, 2010; Cave, 2006; Wallsten, 2001). Hence, investments are generally made based on rational infrastructure investment decisions by profit maximizing private operators. Thus, a viable return must be feasible for the necessary infrastructure to be deployed.

The problem therefore is that in areas of demand uncertainty (often exacerbated by a lack of data), the necessary infrastructure required for economic development is not delivered, leading to market failure (Oughton et al., 2018; Thoung et al., 2016). Although solving coverage issues alone may not eliminate the digital divide (Reisdorf et al., 2020), basic infrastructure is a necessary prerequisite to gaining sustainable economic development benefits from digital technologies and the wider positive societal impacts (Chen et al., 2020; Chester and Allenby, 2019; Farquharson et al., 2018; Graham and Dutton, 2019; Hall et al., 2016; Parker et al., 2014; Saxe and MacAskill, 2019). Thus, market failure issues must be addressed.

A variety of technology, business model and policy options are available to attempt to do this, and these usually focus on using wireless technologies as the costs of deployment are lower.
An essential prerequisite, however, is mutual collaboration between private MNOs and governments, as improvements in coverage and capacity are most effective when there is simultaneous growth in both infrastructure and spectrum portfolios to enable economies of scale (Peha, 2017).

From an economic perspective, reducing the digital divide usually involves subsidizing both investments in rural areas and services for low-income people (Rosston and Wallsten, 2019). While the focus of the digital divide debate is very often on supply-side coverage gaps or connection speed differentials, the roll-out of infrastructure to unviable locations must ultimately be accompanied by demand-side programs to increase device ownership and digital literacy, as these are key determinants of adoption (Hauge and Prieger, 2010). Too often, digital divide issues are heavily compounded by existing socio-economic disparities, meaning lower income groups can be most affected (Riddlesden and Singleton, 2014). This can have a disproportionate effect on minority ethnic groups (Gant et al., 2010; Turner-Lee and Miller, 2011).

Estimating demand metrics, particularly in greenfield areas, is a serious challenge for MNOs (Suryanegara, 2018), telecom regulators and analysts (Oughton et al., 2019a, 2019b), leading to simplified modeling assumptions which do not necessarily reflect reality. Building new infrastructure is a balancing act (Greenstein, 2010) between delivering to areas of guaranteed demand (motivated by profit maximizing behavior), and incrementally rolling out new infrastructure to areas where coverage is needed but take-up of new services is uncertain (motivated by equitable access policies).
Although revenue metrics are frequently developed, they are rarely translated into spatial estimates of how and where infrastructure investment should next be directed, which for unviable areas may require government action (Sevastianov and Vasilyev, 2018; Vincenzi et al., 2019). Similarly, in forecasts of user adoption for cellular technologies (Jahng and Park, 2020; Jha and Saha, 2020; Kalem et al., 2021; Maeng et al., 2020; Neokosmidis et al., 2017), MNOs are left with very little spatial understanding of how many potential users of new services there might be in each local area, despite this being important. In conclusion, it would be beneficial to have new evidence on local cell phone adoption metrics to help inform both private and governmental actions to reduce the digital divide.

2.2. Metric prediction from satellite imagery

While cell phone adoption has been studied for many countries, including across Africa (Wesolowski et al., 2012), researchers usually focus on analyzing survey data, with few attempts to develop predictions at the national scale. This is surprising given that internet-enabled technologies are increasingly being used to address a range of issues relating to health, climate change, economic development, and disaster resilience. Therefore, it is essential to know who is connected, and where.

Currently there is considerable research which uses cell phone call records or location data, obtained from MNOs, to estimate metrics of interest, such as population density (Deville et al., 2014), urban growth (Bagan and Yamagata, 2015), cellular network anomalies (Sultan et al., 2018) and socio-economic characteristics (Fernando et al., 2018; Koebe, 2020; Schmid et al., 2017). However, the limitations of this approach relate to there being (i) no call data in areas with no coverage, and (ii) privacy issues associated with this type of data, affecting data sharing.
It is increasingly common for statistical frameworks to be developed which take advantage of satellite data to augment official statistics. Many papers have focused on using nightlight luminosity data to assess questions relating to economics (Henderson et al., 2012, 2011), human development (Bruederle and Hodler, 2018), urban extent (Zhou et al., 2015), conservation (Mazor et al., 2013), atmospheric composition (Proville et al., 2017) and measuring the post-disaster impacts of natural hazards (Elliott et al., 2015; Gillespie et al., 2014).

While the analysis of mobile phone data is well established (Steenbruggen et al., 2015), new developments are taking advantage of a combination of machine learning with call records and satellite imagery to address a similar set of questions relating to poverty estimation (Ayush et al., 2020; Jean et al., 2016; Perez et al., 2017; Steele et al., 2017), ecosystem monitoring (Cord et al., 2017), estimating land cover types (Goldblatt et al., 2018) and creating data layers relevant to the Sustainable Development Goals (Boyd et al., 2018; Pokhriyal and Jacques, 2017). However, such approaches have rarely been used to assess the digital divide. Importantly, the key advantages of remote sensing using satellite data are (i) access to an abundant, routinely collected body of data, (ii) very wide geographic coverage, allowing scalability across countries, and (iii) very high spatial resolution (Donaldson and Storeygard, 2016).

An increasingly used technique is transfer learning, where pretrained models are reapplied to new tasks to help tackle data limitations, such as with survey data (Jean et al., 2016). The goal of transfer learning is to reuse low-level learned aspects of the feature domain, from abundant data such as luminosity images or mobile phone records.
High-level specific features can then be learnt for problems with limited available data, preventing the need to fit a model from scratch. Several types of transfer learning have been surveyed in the literature (Pan and Yang, 2010), but inductive transfer learning is a commonly applied approach, where the domains of two machine learning problems are the same, but the tasks are different.

3. Method

The method contains five steps, starting by introducing the available data and then articulating the data preprocessing steps. The concept of transfer learning, and how this approach is used to turn an image into a feature vector, is then explained. Next, we describe how the feature vector is used to predict the metrics of interest. Lastly, we explain how to generalize the model to new regions.

3.1. Available data

To obtain measurements of the metrics of interest, data are taken from the World Bank’s Living Standards Measurement Study (LSMS), a multi-topic household survey undertaken in partnership with various national statistical offices. The survey collects up-to-date information for measuring poverty, livelihood, and living conditions for specific household clusters in space, but is not comprehensive across a whole country due to the prohibitive cost of surveying very large areas. The data are collected at what is called the “cluster” level, a small geographic region with a distinct latitude and longitude. We need to generalize these data to develop a national dataset.

Data are downloaded for the Malawian Fourth Integrated Household Survey 2016-2017 (World Bank, 2016a) and Ethiopian Socioeconomic Survey 2015-2016 (World Bank, 2016b), reported by metric in Table 2. Penetration is defined as the percentage of households with at least one cellphone, and consumption of phone services is based on the monthly spending on telephone services per capita (which is broadly similar to the Average Revenue Per User).
Table 2 LSMS telecom metrics (household level)

| Country | Variable description | WB data file name | Column | Households surveyed | Point clusters | Source |
|---|---|---|---|---|---|---|
| Malawi | Household has a phone | hh_mod_f.csv | hh_f34 | 12,447 | 780 | (World Bank, 2016a) |
| Malawi | Spend on phone services | hh_mod_f.csv | hh_f35 | 12,447 | 780 | (World Bank, 2016a) |
| Ethiopia | Household has a phone | sect9_hh_w3.csv | hh_s9q22 | 4,954 | 530 | (World Bank, 2016b) |
| Ethiopia | Spend on phone services | sect9_hh_w3.csv | hh_s9q23 | 4,954 | 530 | (World Bank, 2016b) |

Surveys are conducted in each cluster, and for the sake of anonymity LSMS cluster coordinates are offset by a small random amount. In Malawi, there are 780 clusters and 12,447 households surveyed, whereas in Ethiopia, there are 530 clusters and 4,954 households surveyed. To obtain images, the Planet web Application Programming Interface was used to query daytime PlanetScope satellite images in the time range 2014-2016 at a zoom level of 14 (based on ‘PSScene3Band’ with a ~3m resolution). We always use the latest timestamped image to obtain the closest visit to when the on-the-ground survey was undertaken, to reduce uncertainty in the analysis. The images were 256x256 pixels in size. A cloud cover filter of 5% was applied (removing images with more than 5% cloud coverage). The challenge is to be able to predict the desired metrics of interest from these images.

3.2. Data preprocessing

There are two cases to consider: 1) training a model to work in a single country (“single-country”), and 2) generalizing the model to work on multiple countries (“cross-country”). Cross-country generalization is limited in this study as only data from Malawi and Ethiopia are used. However, this analysis is conducted as a baseline to enable future cross-country improvements.

Using the LSMS data, a 10x10 km bounding box is generated around the geometric centroid of each surveyed cluster. For each bounding box, 20 download locations are uniformly sampled.
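The sampling of download locations can be illustrated with a minimal sketch. It is not taken from the paper's codebase: the function name `sample_download_locations`, the constant `KM_PER_DEG_LAT`, and the example centroid are hypothetical, and the conversion from kilometres to degrees uses a simple spherical approximation that is only reasonable for small boxes.

```python
import math
import random

# Approximate length of one degree of latitude in km (assumed constant here).
KM_PER_DEG_LAT = 111.32


def sample_download_locations(lat, lon, n=20, box_km=10.0, seed=None):
    """Uniformly sample n (lat, lon) points inside a box_km x box_km
    bounding box centred on a cluster centroid."""
    rng = random.Random(seed)
    half = box_km / 2.0
    # Half-widths of the box in degrees; longitude degrees shrink with latitude.
    dlat = half / KM_PER_DEG_LAT
    dlon = half / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return [
        (lat + rng.uniform(-dlat, dlat), lon + rng.uniform(-dlon, dlon))
        for _ in range(n)
    ]


# Hypothetical cluster centroid; 20 locations per cluster as in the text.
points = sample_download_locations(-13.96, 33.79, n=20, seed=0)
```

Each returned coordinate pair would then be passed to whichever imagery API is in use to download one image.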
For Malawi, 780 clusters with 20 images per cluster leads to 15,600 images, and in Ethiopia, 530 clusters with 20 images per cluster leads to 10,600 images. In accordance with previous work, we bin each metric into four numeric ranges (“bins”) identified using a quantile cut (Jean et al., 2016). Each cluster and its images are assigned to the corresponding bin. The Convolutional Neural Network (CNN) is trained to classify an image’s bin. Lastly, the clusters that should be held out from the training process are identified to properly validate the model. In the single-country case, 30% of the clusters are randomly held out for validation. In the cross-country case, an entire country is held out. The limitations of this approach are discussed later in the paper.

3.3. Transfer learning

Images are a very complex data source. Besides numerous raw inputs, images have many hard-to-quantify factors such as relative position, orientation, and shading, making them difficult for machines to start using from scratch. As such, it has become common practice to use a technique called transfer learning to “transfer” model learning in one context to another. Specifically, a parameter transfer method is used with the aim of fine-tuning pretrained models on the data at hand (Houlsby et al., 2019; Kumagai, 2017). In this application, a pretrained University of Oxford Visual Geometry Group (VGG) model is chosen, which is widely used because the architecture is both highly effective and open-source (Kolar et al., 2018; Simonyan and Zisserman, 2015). Using this approach, the VGG model is trained on the ImageNet dataset, which contains millions of images and over 1000 subjects (e.g. trees, vehicles, buildings etc.) (ImageNet, 2020).
By pretraining on ImageNet, the VGG model is a very good tool for parameter transfer learning, as specifically (i) the domains are the same (images) and (ii) the VGG model has been extensively exposed to different domains and learned to extract useful information. For those not familiar with this method, the approach is metaphorically like a language student learning from a wide variety of materials (books, audio sources, web resources etc.), prior to beginning conversational engagement with a human. The prior step provides a basic structural understanding of a language, whereas the second step helps to refine and further develop the existing understanding to develop fluency. For an introductory overview of transfer learning methods and applications, there are resources available in the literature (Sarkar et al., 2018; Yang et al., 2020).

To train the model, the pretrained VGG network is downloaded via PyTorch (PyTorch, 2020). Specific layers are reinitialized to function on a four-output classification task (as per the binning process described in Section 3.2). An image preprocessor is added during training that randomly chooses a subsection of the image to crop and pass to the CNN, preventing the CNN from being handed the same image repeatedly, thereby reducing overfitting (which is a common problem for deep learning models). A learning rate of $3 \times 10^{-6}$ is used with a batch size of 8, along with a custom loss function and the Adam optimizer.

The custom loss function is designed to mitigate the issue of assigning a real-valued variable into bins. For a significant number of cases, clusters will be “close” to the bin boundary, but that information will be lost because of the binning process. Cross-entropy loss aims to maximize the probability of the correct class and reduce error, yet for continuous variables the concept of correct class is artificial.
The custom loss function defines anything in the top 10% of a bin to be “close” to the higher bin, and anything in the bottom 10% of a bin to be “close” to the lower bin. For images that are not “close” to another bin, a regular cross-entropy loss is applied. For images that are “close” to another bin, the loss function is given in equation (1):

$$L(x, y) = \lambda \, CE(x, y) + (1 - \lambda) \, CE(x, N(y)) \quad (1)$$

Where $x$ is a vector representing the real-valued predictions for each bin and $y$ is the integer label for the correct bin. $CE$ refers to the Cross-Entropy Loss. $N(y)$ returns the integer label of the nearby bin (the bin the cluster is “close” to). A weighting factor ($\lambda$) assigns a degree of priority to the true class and a degree of priority to the nearby class. This custom loss function prevents the model from being punished too harshly if it predicts the nearby class, in cases where there is ambiguity about which bin the cluster belongs to. The first 5 epochs are used to train only the new layers (all other layers are “frozen”, to use PyTorch terminology). Another 25 epochs are spent training the entire model.

3.4. Prediction

The vector output of a layer near the end of the CNN is used as a feature vector representation of the image, with each layer being reported in Table 3. After the CNN finishes training, layer 33 is specifically extracted, with an output vector of length 4096. Layer 36 is reinitialized for the four-output class rather than the previous 1000-output class.
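The custom loss of Section 3.3 can be sketched for a single image in plain Python. This is an illustrative sketch, not the authors' PyTorch implementation, and it rests on several assumptions: the loss takes the weighted form λ·CE(x, y) + (1 − λ)·CE(x, N(y)), the "top 10%" and "bottom 10%" of a bin are measured as a fraction of the bin's value range, the outermost bins have no neighbor beyond the edge, and the default `lam=0.7` is a hypothetical weighting. All function names are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


def nearby_bin(value, label, edges, margin=0.10):
    """Return the neighboring bin label if `value` lies in the top or bottom
    `margin` fraction of its bin's range, otherwise None.
    `edges` holds the n_bins + 1 bin boundaries in increasing order."""
    lo, hi = edges[label], edges[label + 1]
    width = hi - lo
    n_bins = len(edges) - 1
    if value >= hi - margin * width and label < n_bins - 1:
        return label + 1
    if value <= lo + margin * width and label > 0:
        return label - 1
    return None


def custom_loss(logits, value, label, edges, lam=0.7):
    """Plain cross-entropy unless the sample is 'close' to a neighboring bin,
    in which case the loss is shared between the true and nearby bins."""
    near = nearby_bin(value, label, edges)
    if near is None:
        return cross_entropy(logits, label)
    return (lam * cross_entropy(logits, label)
            + (1.0 - lam) * cross_entropy(logits, near))
```

With `lam` close to 1 the loss reduces to ordinary cross-entropy, so the weighting controls how much credit a prediction of the neighboring bin receives.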
Table 3 Modified PyTorch VGG architecture

| Layer | Description | Layer | Description | Layer | Description |
|---|---|---|---|---|---|
| 0 | Conv2d | 12 | BatchNorm2d | 24 | ReLU |
| 1 | BatchNorm2d | 13 | ReLU | 25 | Conv2d |
| 2 | ReLU | 14 | MaxPool2d | 26 | BatchNorm2d |
| 3 | MaxPool2d | 15 | Conv2d | 27 | ReLU |
| 4 | Conv2d | 16 | BatchNorm2d | 28 | MaxPool2d |
| 5 | BatchNorm2d | 17 | ReLU | 29 | AdaptiveAvgPool2d |
| 6 | ReLU | 18 | Conv2d | 30 | Linear |
| 7 | MaxPool2d | 19 | BatchNorm2d | 31 | ReLU |
| 8 | Conv2d | 20 | ReLU | 32 | Dropout |
| 9 | BatchNorm2d | 21 | MaxPool2d | 33 | Linear |
| 10 | ReLU | 22 | Conv2d | 34 | ReLU |
| 11 | Conv2d | 23 | BatchNorm2d | 35 | Dropout |
| | | | | 36 | Linear |

Image feature vectors are averaged per cluster to find an aggregate cluster feature vector. Using only the clusters reserved for training, both random cross-validation and spatial cross-validation are performed to fit five models, each trained on four-fifths (4 “folds”) of the training data and validated on the other fifth (the “fold” held out). Each time a fold is held out, a hyperparameter search is performed internally on the four training folds. The only hyperparameter in Ridge Regression is the regularization coefficient. A list of potential regularization coefficients is enumerated, and for each coefficient an “inner” cross-validation is undertaken on the four folds. The coefficient with the best average R² is chosen. It is important to remember that this hyperparameter search does not pick the coefficient that worked best on the original 5th fold held out, but rather the one that works best during the “inner” cross-validation. This tests generalization onto the 5th fold correctly.

The five models create an ensemble by applying spatial cross-validation and implementing an equal voting scheme that averages the predictions of the five models. Given the limited number of samples, this captures input variability better. The spatially validated models are then used because the R² of those models on the training set tends to be far closer to the generalized R² on the validation set.
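The nested cross-validation and equal-vote ensemble described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the paper's implementation: it uses a single input feature instead of the 4096-dimensional cluster vector so that ridge regression has a closed form in plain Python, the folds are random rather than spatial, and the candidate regularization coefficients are hypothetical. All function names are hypothetical.

```python
import random
import statistics


def fit_ridge(xs, ys, alpha):
    """One-feature ridge regression with an unpenalized intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return slope, my - slope * mx


def predict(model, xs):
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def r2(ys, preds):
    my = statistics.fmean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def inner_cv_r2(xs, ys, folds, alpha):
    """Average R2 when each of the given folds is held out in turn."""
    scores = []
    for j, held in enumerate(folds):
        train = [i for m, f in enumerate(folds) if m != j for i in f]
        model = fit_ridge([xs[i] for i in train], [ys[i] for i in train], alpha)
        scores.append(r2([ys[i] for i in held],
                         predict(model, [xs[i] for i in held])))
    return statistics.fmean(scores)


def fit_ensemble(xs, ys, alphas, k=5, seed=0):
    """Fit k models; each holds out one fold and picks its regularization
    coefficient using only the remaining k - 1 folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    models = []
    for h in range(k):
        inner = [f for j, f in enumerate(folds) if j != h]
        best = max(alphas, key=lambda a: inner_cv_r2(xs, ys, inner, a))
        train = [i for f in inner for i in f]
        models.append(fit_ridge([xs[i] for i in train],
                                [ys[i] for i in train], best))
    return models


def ensemble_predict(models, xs):
    """Equal voting: average the predictions of all models."""
    per_model = [predict(m, xs) for m in models]
    return [statistics.fmean(col) for col in zip(*per_model)]
```

For spatial cross-validation, the only change needed in this sketch would be to build `folds` from geographic groupings of clusters instead of a random shuffle.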
Finally, the model ensemble is tested on the validation clusters. For the single-country case, the validation clusters are the 30% of clusters held out. For the cross-country case, the validation clusters are all clusters belonging to the country held out.

Prediction intervals are computed using a probabilistic formulation of linear regression, as shown in equation (2):

$$p(Y \mid X, w, \sigma) = \prod_{i=1}^{N} p(y_i \mid x_i, w, \sigma) = \prod_{i=1}^{N} \mathcal{N}(y_i \mid w^{T} x_i, \sigma^{2}) \quad (2)$$

Where $y_i = w^{T} x_i + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma^{2})$. $Y$ is an N x 1 matrix of observed values, and $X$ is an N x 4096 matrix of features. $x_i$ refers to the i-th row of $X$, as does $y_i$ with $Y$. $w$ is the vector of linear weights. Note that maximizing this likelihood with respect to $w$ is equivalent to minimizing the classic linear regression objective. Once $w$ is solved, $w^{T} x_i$ becomes a constant, which enables $\sigma$ to be solved as in equation (3):

$$\sigma^{2} = \frac{1}{N} \sum_{i=1}^{N} (y_i - w^{T} x_i)^{2} \quad (3)$$

These equations do not strictly apply to the method for two reasons: 1) the method includes L2 regularization (hence ridge regression) in its objective, and 2) the method creates an ensemble of regression models. For the sake of simplicity, the average of all ridge regression weight vectors is substituted for $w$. These two simplifications are not too drastic because regularization maintains a very similar objective and an ensemble of linear models is equivalent to a single linear model that contains averages of all model weights. With these simplifications, $\sigma$ can be determined, and from it the prediction intervals.

To compare the results to a baseline, non-CNN models are constructed based on (i) population density, and (ii) nighttime luminosity. Population density is a common way to make disaggregated estimates of telecom demand, for example, in telecom regulatory decision support models which utilize different urban, suburban and rural ‘geotype’ settlement patterns (Ofcom, 2018).
Additionally, nighttime luminosity has been used to scale telecom demand, for example in estimating the Average Revenue Per User (ARPU) (Oughton, 2021; Oughton et al., 2021; Oughton and Jha, 2021). Specifically, Ridge Regression is applied using the same cross-validation techniques with a model ensemble and held-out clusters. Nightlight luminosity data are collected using annual composites from 2015 via the Visible Infrared Imaging Radiometer Suite (VIIRS) dataset. Population data are obtained from the World Population (WorldPop) 1 km² raster data layer (Stevens et al., 2015; Tatem, 2017). These data are averaged across the 10x10 km bounding box around each surveyed cluster and used directly to predict the telecoms demand metrics.

3.5. Application step

Each country boundary is extracted from the Database of Global Administrative Areas (GADM) (GADM, 2019) and split into 10x10 km grid squares. In total, 20 images are downloaded per grid square and passed through the CNN to obtain their feature vector representation. Vectors are averaged across each grid square to get a feature vector per grid square. This is passed through the ensembled ridge regression model to obtain predictions for each metric per grid tile. Grid squares with very low populations are dropped to avoid these affecting the results. Finally, predictions are mapped. Figure 1 summarizes the method process, from model creation, to prediction validation, to application.

Figure 1 Method overview

3.6. Contextual background to selected countries

Malawi is a landlocked country in Southern Africa, sharing borders with Mozambique, Zambia, and Tanzania. The population is expected to double over the next two decades, from the 19 million citizens present in 2019. As a low-income country, Malawi is one of the poorest in the world, with nearly 80% of the population dependent on agriculture, making deployment of new digital infrastructure such as 4G challenging (World Bank, 2021a).
The two main mobile operators are Airtel Malawi Limited and Telekom Networks Malawi Limited, although a third national license has recently been granted. Policy guidance is provided by the Ministry of Information and Communications Technologies, and the Malawi Communications Regulatory Authority (MACRA) is responsible for regulating the sector (International Telecommunication Union, 2018). MACRA allocates spectrum on a “first-come, first-served” basis so long as frequencies are available, with more competitive processes being introduced should there be a spectrum shortage (MACRA, 2021).

The other selected country is Ethiopia, which has a strategically important location in the Horn of Africa and is also landlocked, sharing borders with Eritrea, Somalia, Kenya, South Sudan, and Sudan. Ethiopia has the second largest population in Africa, with more than 112 million people in 2019. It has the fastest growing economy in the region, but also one of the lowest per capita incomes (World Bank, 2021b). Previously the sector was dominated by the government-owned Ethio Telecom; however, this is now changing (International Telecommunication Union, 2018). The sector is in the process of being privatized, with a large share of Ethio Telecom being sold as part of a liberalization agenda and new MNO national licenses being issued, with the aim of attracting foreign investment (The Africa Report, 2020). The mobile sector is the responsibility of the newly established Ethiopian Communications Authority (ECA), created in 2019 (Ethiopian Communications Authority, 2019), which intends to allocate spectrum to MNOs efficiently via auction methods (Bloomberg, 2020; Cave and Nicholls, 2017).

Both countries are yet to achieve comprehensive mobile broadband coverage; for example, 4G coverage is at 16% in Malawi and 61% in Ethiopia (GSMA, 2020).
Therefore, the proposed satellite imagery-based approach could be useful for developing new deployment strategies. For example, in Malawi a substantial proportion of the population still needs to be covered (>80%), meaning these analytics could inform future roll-out. Similarly, in Ethiopia, with market liberalization taking place, the two new entrants will need to build a substantial amount of greenfield infrastructure, requiring analytics to support capital allocation processes.

4. Results

The validated CNN accuracies are reported in Table 4, as detailed previously in Section 3.4. As there are four equally represented bins, accuracies above 25% are an improvement over random guessing. Both metrics perform better than the 25% baseline in the single-country case. However, in the cross-country case the accuracies only slightly exceed random guessing.

Table 4 CNN validation accuracies

Type            Name      Binned device penetration  Binned monthly cost
Single-country  Malawi    44%                        41%
Single-country  Ethiopia  39%                        39%
Cross-country   Malawi    29%                        31%
Cross-country   Ethiopia  30%                        29%

Validation results for the ensembled ridge regression models are shown in Table 5 for both the single-country and cross-country cases. The results indicate that single-country CNN models outperform generalized cross-country usage, and that for both countries and metrics, single-country CNN models far outperform the baseline models.
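As a rough check on the gap between the CNN and the baselines, the single-country Pearson R² values reported in Table 5 can be compared directly. The sketch below is illustrative only: the dictionary layout, the function name, and the definition of improvement as the CNN score divided by the best baseline score minus one are assumptions made here, not something taken from the paper's codebase.

```python
# Pearson R^2 values transcribed from Table 5 (single-country models only).
# Layout (an assumption for this sketch):
#   (country, metric) -> {model: (random CV score, spatial CV score)}
TABLE_5 = {
    ("Malawi", "device_penetration"): {
        "population_density": (0.182, 0.182),
        "nightlight_luminosity": (0.211, 0.211),
        "cnn": (0.414, 0.410),
    },
    ("Malawi", "monthly_cost"): {
        "population_density": (0.201, 0.201),
        "nightlight_luminosity": (0.183, 0.183),
        "cnn": (0.282, 0.284),
    },
    ("Ethiopia", "device_penetration"): {
        "population_density": (0.069, 0.069),
        "nightlight_luminosity": (0.083, 0.083),
        "cnn": (0.268, 0.268),
    },
    ("Ethiopia", "monthly_cost"): {
        "population_density": (0.086, 0.086),
        "nightlight_luminosity": (0.152, 0.152),
        "cnn": (0.268, 0.268),
    },
}


def relative_improvement(scores):
    """Relative gain of the CNN over the best baseline, per CV technique."""
    gains = []
    for cv_index in range(2):  # 0 = random, 1 = spatial
        best_baseline = max(
            values[cv_index] for name, values in scores.items() if name != "cnn"
        )
        gains.append(scores["cnn"][cv_index] / best_baseline - 1.0)
    return gains


improvements = {key: relative_improvement(s) for key, s in TABLE_5.items()}
smallest = min(gain for gains in improvements.values() for gain in gains)
print(f"Smallest relative improvement over the best baseline: {smallest:.1%}")
```

Under this definition the smallest gain occurs for monthly cost in Malawi with random cross-validation (0.282 against 0.201), which comes out at roughly 40%.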
Table 5 Model Pearson R² country validation (best performing models highlighted in yellow)

Model                  Type            Name      Cross-validation technique  Device Penetration  Monthly Cost
Population density     Single-country  Malawi    Random                      0.182               0.201
Population density     Single-country  Malawi    Spatial                     0.182               0.201
Population density     Single-country  Ethiopia  Random                      0.069               0.086
Population density     Single-country  Ethiopia  Spatial                     0.069               0.086
Nightlight luminosity  Single-country  Malawi    Random                      0.211               0.183
Nightlight luminosity  Single-country  Malawi    Spatial                     0.211               0.183
Nightlight luminosity  Single-country  Ethiopia  Random                      0.083               0.152
Nightlight luminosity  Single-country  Ethiopia  Spatial                     0.083               0.152
CNN                    Single-country  Malawi    Random                      0.414               0.282
CNN                    Single-country  Malawi    Spatial                     0.410               0.284
CNN                    Single-country  Ethiopia  Random                      0.268               0.268
CNN                    Single-country  Ethiopia  Spatial                     0.268               0.268
CNN                    Cross-country   Malawi    Random                      0.165               0.090
CNN                    Cross-country   Malawi    Spatial                     0.169               0.124
CNN                    Cross-country   Ethiopia  Random                      0.144               0.168
CNN                    Cross-country   Ethiopia  Spatial                     0.151               0.162

The observed versus the predicted values are illustrated in Figure 2 with the associated prediction intervals. The Malawi model performed much better than the Ethiopia model in predicting cell phone adoption, but only marginally better in estimating the cost of phone services.

Figure 2 Observed versus predicted values by metric

Finally, prediction maps can be created from the results, as shown for Malawi in Figure 3 at 10x10 km spatial resolution. The spatial results are consistent with expectations. For example, the model estimates higher phone density in the capital Lilongwe in the mid-west area of Malawi, as well as in other populated areas, such as Blantyre in the south east and Mzuzu in the north.

Figure 3 Predicted device penetration (Left); Predicted monthly cost (Right)

5. Limitations and future areas of development

There are three main limitations with the method: (i) justifying results, (ii) the validation technique, and (iii) satellite image availability. Each of these forms a future area of development.

It is quite challenging to explain how a CNN arrives at a predicted result due to the vast quantity of parameters, and the size and structure of the network. This is a widely known issue, raising calls for more explainable machine learning approaches in telecoms (Guo, 2020).
Broadly speaking, lack of model interpretability has become an important inhibitor of widespread adoption of deep learning methods. Activation maps have emerged as one way to interpret a CNN. The idea is simple: project the areas that are “activated” by the CNN back onto the original image. This way, it is possible to visually inspect which parts of the image the CNN focuses on. By using activation maps we can evaluate the behavior and limitations of the CNN. One approach to creating activation maps is guided backpropagation. This method passes an image through the CNN and performs backpropagation on the known target class. At the first layer, we store the gradient with respect to each pixel in the image and remove those less than 0. Then, we plot a grayscale map of those gradients. The map has the same shape as the original image, and higher intensity corresponds to a stronger (positive) gradient. We use a gradient approach because backpropagation finds partial derivatives; a larger partial derivative can be thought of as a larger contribution.

In Figure 4, activation maps are presented for three images, with the original satellite image on the left-hand side and the activation map on the right-hand side. In the top and middle images, we can see that the CNN is able to identify the road network. However, large bodies of water can produce strange results, an example being the top left image, where a lake leads to activation of the CNN. Furthermore, in the bottom image, where the image provider has accidentally inserted a nighttime image, the CNN is also activating. These activations occur in meaningless parts of the image and are much more prevalent than those in either of the two “good” images. As these observations are qualitative and require individual analysis, it is challenging to make a statement about the whole dataset of >20,000 images.
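The guided backpropagation procedure described above can be illustrated on a toy example. The sketch below is a minimal, pure-Python illustration and not the paper's implementation: it assumes a hypothetical one-hidden-layer ReLU network acting on a flattened image, whereas the paper applies the idea to a full CNN. All names and weights are invented for illustration.

```python
def guided_backprop_map(pixels, hidden_weights, output_weights, target_class):
    """Return a non-negative per-pixel gradient map for one flattened image.

    Toy network (an assumption, not the paper's CNN):
        hidden = relu(hidden_weights @ pixels)
        scores = output_weights @ hidden
    """
    # Forward pass: keep pre-activations so the ReLU mask can be reused.
    pre_activations = [
        sum(w * p for w, p in zip(row, pixels)) for row in hidden_weights
    ]

    # Backward pass from the known target class score to the hidden layer.
    grad_hidden = list(output_weights[target_class])

    # Guided ReLU rule: pass a gradient only where the unit was active in the
    # forward pass AND the incoming gradient is positive.
    guided = [
        g if (z > 0 and g > 0) else 0.0
        for g, z in zip(grad_hidden, pre_activations)
    ]

    # Gradient with respect to each input pixel (transpose of hidden_weights).
    grad_pixels = [
        sum(hidden_weights[j][i] * guided[j] for j in range(len(guided)))
        for i in range(len(pixels))
    ]

    # Remove gradients below zero, as described in the text.
    return [max(g, 0.0) for g in grad_pixels]


# Hypothetical 2x2 grayscale image, flattened row by row.
image = [0.9, 0.1, 0.8, 0.2]
hidden_weights = [
    [1.0, 0.0, 1.0, 0.0],      # responds to the left column
    [0.0, 1.0, 0.0, 1.0],      # responds to the right column
    [-1.0, -1.0, -1.0, -1.0],  # never active for positive pixels
]
output_weights = [
    [1.0, -0.5, 2.0],  # class 0
    [-1.0, 1.0, 0.5],  # class 1
]

activation_map = guided_backprop_map(image, hidden_weights, output_weights, 0)
print(activation_map)
```

In this toy case only the left-column pixels receive a positive gradient for class 0, so plotting the map in grayscale would highlight the left column. In practice a deep learning framework would compute the gradients automatically, and the guided rule would be applied at every ReLU layer.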
Given the qualitative nature of these observations, it is difficult to make a claim about the robustness and generalizability of the CNN itself beyond traditional model validation. This means that during application, there could be unpredictable, unexpected, and unexplainable behavior.

Figure 4 CNN activation maps (Actual images in the left column, and activation maps on the right)

As for model validation, the two biggest constraints are the limited sample size and the long training time. In theory, 5-fold cross-validation would be a good way to test generalization. However, two countries, two metrics, and five folds would mean training a CNN 20 times. Furthermore, instead of randomized cross-validation there is also spatial cross-validation, which would make each fold contain clusters that are geographically close. Consequently, each iteration of cross-validation would be testing generalization not only onto unseen clusters, but onto unseen geographic areas. This prevents the model from training on one cluster and validating on another nearby cluster (which likely has very similar metrics). The advantage of spatial cross-validation is that it more closely reflects the use case of applying the model to new regions. However, running both schemes would raise the number of CNN runs to 40. To maintain a reasonable number of training runs, a simple random 30% is held out for validation of single-country models. The downside of this approach is that we do not get the numerical stability that comes with random cross-validation (averaging five results is much better than doing just one), and generalization onto new areas is not tested as thoroughly as is possible with spatial cross-validation. Lastly, because the method only involves two countries, further research needs to assess (i) how well the approach works when scaled to many countries, and (ii) what kind of training procedure and data quantity will be sufficient for cross-country generalization.
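The difference between random and spatial cross-validation, and the resulting count of training runs, can be sketched as follows. This is a minimal illustration under stated assumptions: the cluster identifiers and coordinates are hypothetical, and slicing clusters into contiguous longitude bands is just one simple way of forming geographically coherent folds, not necessarily the scheme used in the paper's codebase.

```python
import random


def random_folds(cluster_ids, n_folds, seed=0):
    """Shuffle clusters and deal them into folds, ignoring location."""
    shuffled = list(cluster_ids)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def spatial_folds(cluster_coords, n_folds):
    """Group clusters into folds of neighbours by slicing along longitude.

    cluster_coords maps a cluster id to a (longitude, latitude) pair. Cutting
    the sorted clusters into contiguous bands is an assumption of this sketch;
    other spatial blocking schemes could be substituted.
    """
    ordered = sorted(cluster_coords, key=lambda cid: cluster_coords[cid][0])
    fold_size, remainder = divmod(len(ordered), n_folds)
    folds, start = [], 0
    for fold_index in range(n_folds):
        end = start + fold_size + (1 if fold_index < remainder else 0)
        folds.append(ordered[start:end])
        start = end
    return folds


# Hypothetical survey clusters spread from west to east.
clusters = {f"c{i}": (33.0 + 0.1 * i, -13.5) for i in range(10)}

n_countries, n_metrics, n_folds = 2, 2, 5
runs_one_scheme = n_countries * n_metrics * n_folds  # 20 CNN training runs
runs_both_schemes = 2 * runs_one_scheme              # 40 with random + spatial

for held_out in spatial_folds(clusters, n_folds):
    training = [cid for cid in clusters if cid not in held_out]
    # A CNN would be trained on `training` and validated on `held_out` here.
    print(f"train on {len(training)} clusters, validate on {held_out}")
```

With spatial folds, each held-out set is a block of neighbouring clusters, so validation never relies on a training cluster sitting next to a validation cluster.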
Finally, a limitation of this type of research relates to obtaining comprehensive and consistent satellite imagery. For the survey years used in this analysis (2015-2017), the field of earth observation was still relatively limited in its ability to produce consistent imagery for whole countries in a single year. This is because obtaining high-quality images is dependent on the revisit rate of satellites in a constellation, as well as the presence of obstructions such as clouds. This meant that in this analysis, the most recent tile over the three-year period (2014-2016) had to be selected, introducing an element of uncertainty. While this is a common limitation of the whole field of earth observation, and not specific to this study, there are promising developments which will help overcome this issue in the future. For example, in recent years there has been a rapid increase in the number of commercial Low Earth Orbit observation satellites. As the revisit rates to each tile location rise, the probability of obtaining a high-quality cloud-free image within a defined time period increases, helping to boost the temporal resolution of available satellite imagery. Importantly, a key strength of this paper is that the whole codebase has been made open-source and available for the research community to access. Therefore, as the available imagery improves, further development of the codebase can take place, refining the results generated here based on more consistent image data.

6. Discussion

This section returns the focus to the research question stated in the introduction: How effective are different techniques at predicting cell phone adoption metrics from satellite imagery, such as device penetration and monthly spending on telephone services?
This paper demonstrated a machine learning approach for predicting spatially granular estimates of cell phone adoption with significant improvement over baseline modeling techniques. For example, population density is a common baseline model for predicting cell phone adoption metrics, yet it only captures approximately 7-20% of the variance in the data across the countries assessed (Malawi and Ethiopia). The use of nightlight luminosity was similar, capturing approximately 8-21% of the data variance. By contrast, the CNN method described in this paper captured up to 41% of the data variance, providing an improvement over the baseline models of at least 40%.

There are several key use cases for the high-resolution predictions generated by the method in the paper, primarily relating to national assessments. Firstly, international development institutions can quantitatively identify underserved areas and more effectively design interventions for the billions of dollars they invest annually in digital development projects in support of the SDGs. Secondly, national and local governments, including telecommunication regulators, can access data which support policy decision making on the digital divide. A standard decision-making tool used in telecoms is the Long Run Incremental Cost (LRIC) model, which is usually spreadsheet-based and focused on modeling a ‘hypothetical operator’ with average characteristics (e.g. assets, spectrum portfolio, market share etc.). Many assumptions are often used in this approach, particularly relating to the number of cell phones in rural areas and the level of existing demand. Rather than relying on hypothetical data and assumptions, the method produced here can help to reduce this uncertainty, supporting more effective decisions.
Thirdly, Mobile Network Operators can gain an understanding of two key demand metrics (device penetration and cost per month, which can also be thought of as average revenue per user) in greenfield areas where demand is unknown.

One key advantage of this method is that it only utilizes satellite images, which are globally available. Therefore, the method can be used in data-limited locations where we have no cell phone records or electricity usage data. By solely using images, the method learns to associate levels of local development with the availability of devices and the consumer purchasing power available to spend on phone services. Consequently, in an underserved area with little existing coverage, the method extrapolates the acquired understanding of the relationship between development and telecoms demand onto the new area. Thus, in an application context the approach can predict the general capability of residents to own mobile devices and pay for telecom services, based only on widely available visual evidence.

While there is significant technical complexity to the application of machine learning methods, such as the one presented in this paper, we remain optimistic about their use and application. Firstly, international development organizations are already experimenting with such techniques, meaning this knowledge can be shared with local stakeholders. Secondly, as many national MNOs are part of multinational telecom enterprises with centralized global strategy and intelligence functions, specialized skills can be developed and applied across many countries. For example, Sonatel in Senegal benefits from strategy and network intelligence input from the company owner Orange, much like Telefonica provides to its myriad South American MNOs, such as Movistar in Peru.
When this is combined with the fact that machine learning techniques have become a core part of science and engineering programs at universities around the world, it is becoming increasingly easy for companies to obtain this specialized labor and utilize it to its full advantage (even in resource-constrained economies).

7. Conclusion

This paper assessed the effectiveness of different modeling methods at estimating cell phone metrics, such as phone adoption and the capability to pay for cellular services. We find that the baseline models using population density and nightlight luminosity capture up to 20% and 21% of the data variance respectively. Comparatively, our CNN machine learning approach captured up to 41% of the data variance, demonstrating a predictive improvement over the baseline models of at least 40% in all circumstances.

The key contributions of the paper were threefold. Firstly, an accurate and validated method was provided that predicts telecoms demand metrics from satellite images. Secondly, there was a quantitative comparison of this method to existing methods. Finally, the codebase used for the analysis has been made open-source for other researchers and analysts to utilize, reproduce the results, and further develop the method via the online repository: Telecom Analytics for Demand using Deep Learning. Future research needs to be undertaken to expand the assessment method to include other indicators necessary for achieving the SDGs and to explore the application of the method to additional countries, including high-income nations such as the United States.

References

Ayush, K., Uzkent, B., Burke, M., Lobell, D., Ermon, S., 2020. Generating Interpretable Poverty Maps using Object Detection in Satellite Images. arXiv:2002.01612 [cs].
Bagan, H., Yamagata, Y., 2015.
Analysis of urban growth and estimating population density using satellite images of nighttime lights and land-use and population data. GIScience & Remote Sensing 52, 765–780. https://doi.org/10.1080/15481603.2015.1072400
Balmer, R.E., Levin, S.L., Schmidt, S., 2020. Artificial Intelligence Applications in Telecommunications and other network industries. Telecommunications Policy, Artificial intelligence, economy and society 44, 101977. https://doi.org/10.1016/j.telpol.2020.101977
Bauer, J.M., 2010. Regulation, public policy, and investment in communications infrastructure. Telecommunications Policy, Balancing Competition and Regulation 34, 65–79. https://doi.org/10.1016/j.telpol.2009.11.011
Blank, G., Graham, M., Calvino, C., 2018. Local Geographies of Digital Inequality. Social Science Computer Review 36, 82–102. https://doi.org/10.1177/0894439317693332
Bloomberg, 2020. Ethiopia Telecom Auction Set for 2021 With Orange in Contention. Bloomberg.com.
Boyd, D.S., Jackson, B., Wardlaw, J., Foody, G.M., Marsh, S., Bales, K., 2018. Slavery from Space: Demonstrating the role for satellite remote sensing to inform evidence-based action related to UN SDG number 8. ISPRS Journal of Photogrammetry and Remote Sensing 142, 380–388. https://doi.org/10.1016/j.isprsjprs.2018.02.012
Bruederle, A., Hodler, R., 2018. Nighttime lights as a proxy for human development at the local level. PloS one 13.
Cave, M., 2006. Encouraging infrastructure competition via the ladder of investment. Telecommunications Policy 30, 223–237. https://doi.org/10.1016/j.telpol.2005.09.001
Cave, M., Nicholls, R., 2017. The use of spectrum auctions to attain multiple objectives: Policy implications. Telecommunications Policy, Optimising Spectrum Use 41, 367–378. https://doi.org/10.1016/j.telpol.2016.12.010
Chen, P., Oughton, E.J., Tyler, P., Jia, M., Zagdanski, J., 2020.
Evaluating the impact of next generation broadband on local business creation. arXiv:2010.14113 [econ, q-fin].
Chester, M.V., Allenby, B., 2019. Toward adaptive infrastructure: flexibility and agility in a non-stationarity age. Sustainable and Resilient Infrastructure 4, 173–191. https://doi.org/10.1080/23789689.2017.1416846
Chiaraviglio, L., Blefari-Melazzi, N., Liu, W., Gutierrez, J.A., van de Beek, J., Birke, R., Chen, L., Idzikowski, F., Kilper, D., Monti, P., Bagula, A., Wu, J., 2017. Bringing 5G into Rural and Low-Income Areas: Is It Feasible? IEEE Communications Standards Magazine 1, 50–57. https://doi.org/10.1109/MCOMSTD.2017.1700023
Claffy, K., Clark, D., 2019. Workshop on Internet Economics (WIE2018) Final Report. ACM SIGCOMM Computer Communication Review 49, 25–30.
Cord, A.F., Brauman, K.A., Chaplin-Kramer, R., Huth, A., Ziv, G., Seppelt, R., 2017. Priorities to Advance Monitoring of Ecosystem Services Using Earth Observation. Trends in Ecology & Evolution 32, 416–428. https://doi.org/10.1016/j.tree.2017.03.003
Deville, P., Linard, C., Martin, S., Gilbert, M., Stevens, F.R., Gaughan, A.E., Blondel, V.D., Tatem, A.J., 2014. Dynamic population mapping using mobile phone data. PNAS 111, 15888–15893. https://doi.org/10.1073/pnas.1408439111
Donaldson, D., Storeygard, A., 2016. The View from Above: Applications of Satellite Data in Economics. Journal of Economic Perspectives 30, 171–198. https://doi.org/10.1257/jep.30.4.171
Elliott, R.J.R., Strobl, E., Sun, P., 2015. The local impact of typhoons on economic activity in China: A view from outer space. Journal of Urban Economics 88, 50–66. https://doi.org/10.1016/j.jue.2015.05.001
Ethiopian Communications Authority, 2019. Our mission [WWW Document]. Ethiopian Communications Authority. URL https://eca.et/our-mission/ (accessed 1.8.21).
Farquharson, D., Jaramillo, P., Samaras, C., 2018.
Sustainability implications of electricity outages in sub-Saharan Africa. Nature Sustainability 1, 589–597. https://doi.org/10.1038/s41893-018-0151-8
Feijóo, C., Kwon, Y., 2020. AI impacts on economy and society: Latest developments, open issues and new policy measures. Telecommunications Policy, Artificial intelligence, economy and society 44, 101987. https://doi.org/10.1016/j.telpol.2020.101987
Feijóo, C., Kwon, Y., Bauer, J.M., Bohlin, E., Howell, B., Jain, R., Potgieter, P., Vu, K., Whalley, J., Xia, J., 2020. Harnessing artificial intelligence (AI) to increase wellbeing for all: The case for a new technology diplomacy. Telecommunications Policy, Artificial intelligence, economy and society 44, 101988. https://doi.org/10.1016/j.telpol.2020.101988
Fernando, L., Surendra, A., Lokanathan, S., Gomez, T., 2018. Predicting population-level socio-economic characteristics using Call Detail Records (CDRs) in Sri Lanka, in: Proceedings of the Fourth International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM’18. Association for Computing Machinery, Houston, TX, USA, pp. 1–12. https://doi.org/10.1145/3220547.3220549
Francis, J., Ball, C., Kadylak, T., Cotten, S.R., 2019. Aging in the Digital Age: Conceptualizing Technology Adoption and Digital Inequalities, in: Neves, B.B., Vetere, F. (Eds.), Ageing and Digital Technology: Designing and Evaluating Emerging Technologies for Older Adults. Springer, Singapore, pp. 35–49. https://doi.org/10.1007/978-981-13-3693-5_3
GADM, 2019. Global Administrative Areas Database (Version 3.6) [WWW Document]. URL https://gadm.org/ (accessed 7.11.19).
Gant, J.P., Turner-Lee, N.E., Li, Y., Miller, J.S., 2010. National minority broadband adoption: Comparative trends in adoption, acceptance and use. Joint Center for Political and Economic Studies, Washington D.C.
Gillespie, T.W., Frankenberg, E., Chum, K.F., Thomas, D., 2014. Night-time lights time series of tsunami damage, recovery, and economic metrics in Sumatra, Indonesia. Remote Sensing Letters 5, 286–294. https://doi.org/10.1080/2150704X.2014.900205
Goldblatt, R., Stuhlmacher, M.F., Tellman, B., Clinton, N., Hanson, G., Georgescu, M., Wang, C., Serrano-Candela, F., Khandelwal, A.K., Cheng, W.-H., Balling, R.C., 2018. Using Landsat and nighttime lights for supervised pixel-based image classification of urban land cover. Remote Sensing of Environment 205, 253–275. https://doi.org/10.1016/j.rse.2017.11.026
Graham, M., Dutton, W.H., 2019. Society and the Internet: How Networks of Information and Communication are Changing Our Lives. Oxford University Press.
Greenstein, S., 2010. Building Broadband Ahead of Digital Demand. IEEE Micro 30, 6–8. https://doi.org/10.1109/MM.2010.111
GSMA, 2020. GSMA Intelligence Global Data [WWW Document]. URL https://www.gsmaintelligence.com/ (accessed 2.5.20).
Guo, W., 2020. Explainable Artificial Intelligence for 6G: Improving Trust between Human and Machine. IEEE Communications Magazine 58, 39–45. https://doi.org/10.1109/MCOM.001.2000050
Haile, M.G., Wossen, T., Kalkuhl, M., 2019. Access to information, price expectations and welfare: The role of mobile phone adoption in Ethiopia. Technological Forecasting and Social Change 145, 82–92. https://doi.org/10.1016/j.techfore.2019.04.017
Hall, J.W., Thacker, S., Ives, M.C., Cao, Y., Chaudry, M., Blainey, S.P., Oughton, E.J., 2016. Strategic analysis of the future of national infrastructure. Proceedings of the Institution of Civil Engineers - Civil Engineering 1–9. https://doi.org/10.1680/jcien.16.00018
Hauge, J.A., Prieger, J.E., 2010. Demand-Side Programs to Stimulate Adoption of Broadband: What Works? Review of Network Economics 9. https://doi.org/10.2202/1446-9022.1234
Henderson, J.V., Storeygard, A., Weil, D.N., 2012. Measuring Economic Growth from Outer Space. American Economic Review 102, 994–1028. https://doi.org/10.1257/aer.102.2.994
Henderson, V., Storeygard, A., Weil, D.N., 2011. A Bright Idea for Measuring Economic Growth. American Economic Review 101, 194–199. https://doi.org/10.1257/aer.101.3.194
Hidalgo, A., Gabaly, S., Morales-Alonso, G., Urueña, A., 2020. The digital divide in light of sustainable development: An approach through advanced machine learning techniques. Technological Forecasting and Social Change 150, 119754. https://doi.org/10.1016/j.techfore.2019.119754
Houlsby, N., Giurgiu, A., Jastrzebski, S., Morrone, B., de Laroussilhe, Q., Gesmundo, A., Attariyan, M., Gelly, S., 2019. Parameter-Efficient Transfer Learning for NLP. arXiv:1902.00751 [cs, stat].
ImageNet, 2020. ImageNet Database [WWW Document]. URL http://image-net.org/index (accessed 2.21.21).
International Telecommunication Union, 2018. Measuring the Information Society Report 2018 [WWW Document]. URL https://www.itu.int/en/ITU-D/Statistics/Pages/publications/misr2018.aspx (accessed 1.8.21).
Jahng, J.H., Park, S.K., 2020. Simulation-based prediction for 5G mobile adoption. ICT Express 6, 109–112. https://doi.org/10.1016/j.icte.2019.10.002
Jean, N., Burke, M., Xie, M., Davis, W.M., Lobell, D.B., Ermon, S., 2016. Combining satellite imagery and machine learning to predict poverty. Science 353, 790–794.
Jha, A., Saha, D., 2020. “Forecasting and analysing the characteristics of 3G and 4G mobile broadband diffusion in India: A comparative evaluation of Bass, Norton-Bass, Gompertz, and logistic growth models.” Technological Forecasting and Social Change 152, 119885. https://doi.org/10.1016/j.techfore.2019.119885
Jha, A., Saha, D., 2017. Techno-Economics Behind Provisioning 4G LTE Mobile Services over Sub 1 GHz Frequency Bands, in: Sastry, N., Chakraborty, S.
(Eds.), Communication Systems and Networks, Lecture Notes in Computer Science. Springer International Publishing, Cham, pp. 284–306. https://doi.org/10.1007/978-3-319-67235-9_17
Kabbiri, R., Dora, M., Kumar, V., Elepu, G., Gellynck, X., 2018. Mobile phone adoption in agri-food sector: Are farmers in Sub-Saharan Africa connected? Technological Forecasting and Social Change 131, 253–261. https://doi.org/10.1016/j.techfore.2017.12.010
Kalem, G., Vayvay, O., Sennaroglu, B., Tozan, H., 2021. Technology Forecasting in the Mobile Telecommunication Industry: A Case Study Towards the 5G Era. Engineering Management Journal 33, 15–29. https://doi.org/10.1080/10429247.2020.1764833
Koebe, T., 2020. Better coverage, better outcomes? Mapping mobile network data to official statistics using satellite imagery and radio propagation modelling. arXiv:2002.11618 [cs, stat].
Kolar, Z., Chen, H., Luo, X., 2018. Transfer learning and deep convolutional neural networks for safety guardrail detection in 2D images. Automation in Construction 89, 58–70. https://doi.org/10.1016/j.autcon.2018.01.003
Kumagai, W., 2017. Learning Bound for Parameter Transfer Learning. arXiv:1610.08696 [cs, stat].
MACRA, 2021. Telecommunications - Spectrum. MACRA. URL https://www.macra.org.mw/?page_id=10995 (accessed 1.8.21).
Maeng, K., Kim, J., Shin, J., 2020. Demand forecasting for the 5G service market considering consumer preference and purchase delay behavior. Telematics and Informatics 47, 101327. https://doi.org/10.1016/j.tele.2019.101327
Maitland, C., Caneba, R., Schmitt, P., Koutsky, T., 2018. A Cellular Network Radio Access Performance Measurement System: Results from a Ugandan Refugee Settlements Field Trial (SSRN Scholarly Paper No. ID 3141865). Social Science Research Network, Rochester, NY.
Mansell, R., 2001. Digital opportunities and the missing link for developing countries.
Oxford Review of Economic Policy 17, 282–295.
Mansell, R., 1999. Information and communication technologies for development: assessing the potential and the risks. Telecommunications Policy 23, 35–50.
Mansell, R., Wehn, U., 1998. Knowledge societies: Information technology for sustainable development. Oxford University Press.
Martínez-Domínguez, M., Mora-Rivera, J., 2020. Internet adoption and usage patterns in rural Mexico. Technology in Society 60, 101226. https://doi.org/10.1016/j.techsoc.2019.101226
Mazor, T., Levin, N., Possingham, H.P., Levy, Y., Rocchini, D., Richardson, A.J., Kark, S., 2013. Can satellite-based night lights be used for conservation? The case of nesting sea turtles in the Mediterranean. Biological Conservation 159, 63–72. https://doi.org/10.1016/j.biocon.2012.11.004
Mossberger, K., Tolbert, C.J., Bowen, D., Jimenez, B., 2012. Unraveling different barriers to Internet use urban residents and neighborhood effects. Urban Affairs Review 48, 771–810. https://doi.org/10.1177/1078087412453713
Neokosmidis, I., Rokkas, T., Parker, M.C., Koczian, G., Walker, S.D., Siddiqui, M.S., Escalona, E., 2017. Assessment of socio-techno-economic factors affecting the market adoption and evolution of 5G networks: Evidence from the 5G-PPP CHARISMA project. Telematics and Informatics 34, 572–589. https://doi.org/10.1016/j.tele.2016.11.007
Ofcom, 2018. Mobile call termination market review 2018-21: Final statement – Annexes 1 - 15. Ofcom, London.
Oughton, E., 2021. Policy options for digital infrastructure strategies: A simulation model for broadband universal service in Africa. arXiv:2102.03561 [cs, econ, q-fin].
Oughton, E., Tyler, P., Alderson, D., 2015. Who’s Superconnected and Who’s Not? Investment in the UK’s Information and Communication Technologies (ICT) Infrastructure. Infrastructure Complexity 2, 6.
https://doi.org/10.1186/s40551-015-0006-7
Oughton, E.J., Comini, N., Foster, V., Hall, J.W., 2021. Policy choices can help keep 4G and 5G universal broadband affordable. arXiv:2101.07820 [cs, econ, q-fin].
Oughton, E.J., Frias, Z., Dohler, M., Whalley, J., Sicker, D., Hall, J.W., Crowcroft, J., Cleevely, D.D., 2018. The strategic national infrastructure assessment of digital communications. Digital Policy, Regulation and Governance 20, 197–210. https://doi.org/10.1108/DPRG-02-2018-0004
Oughton, E.J., Frias, Z., van der Gaast, S., van der Berg, R., 2019a. Assessing the capacity, coverage and cost of 5G infrastructure strategies: Analysis of the Netherlands. Telematics and Informatics 37, 50–69. https://doi.org/10.1016/j.tele.2019.01.003
Oughton, E.J., Jha, A., 2021. Supportive 5G infrastructure policies are essential for universal 6G: Evidence from an open-source techno-economic simulation model using remote sensing. arXiv:2102.08086 [cs, econ, q-fin].
Oughton, E.J., Katsaros, K., Entezami, F., Kaleshi, D., Crowcroft, J., 2019b. An Open-Source Techno-Economic Assessment Framework for 5G Deployment. IEEE Access 7, 155930–155940. https://doi.org/10.1109/ACCESS.2019.2949460
Oughton, E.J., Lehr, W., Katsaros, K., Selinis, I., Bubley, D., Kusuma, J., 2020. Revisiting Wireless Internet Connectivity: 5G vs Wi-Fi 6. arXiv:2010.11601 [cs].
Oughton, E.J., Russell, T., 2020. The importance of spatio-temporal infrastructure assessment: Evidence for 5G from the Oxford–Cambridge Arc. Computers, Environment and Urban Systems 83, 101515. https://doi.org/10.1016/j.compenvurbsys.2020.101515
Ovando, C., Pérez, J., Moral, A., 2015. LTE techno-economic assessment: The case of rural areas in Spain. Telecommunications Policy, New empirical approaches to telecommunications economics: Opportunities and challenges; Mobile phone data and geographic modelling 39, 269–283. https://doi.org/10.1016/j.telpol.2014.11.004
Owusu-Agyei, S., Okafor, G., Chijoke-Mgbame, A.M., Ohalehi, P., Hasan, F., 2020. Internet adoption and financial development in sub-Saharan Africa. Technological Forecasting and Social Change 161, 120293. https://doi.org/10.1016/j.techfore.2020.120293
Pan, S.J., Yang, Q., 2010. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering 22, 1345–1359. https://doi.org/10.1109/TKDE.2009.191
Parker, M., Acland, A., Armstrong, H.J., Bellingham, J.R., Bland, J., Bodmer, H.C., Burall, S., Castell, S., Chilvers, J., Cleevely, D.D., Cope, D., Costanzo, L., Dolan, J.A., Doubleday, R., Feng, W.Y., Godfray, H.C.J., Good, D.A., Grant, J., Green, N., Groen, A.J., Guilliams, T.T., Gupta, S., Hall, A.C., Heathfield, A., Hotopp, U., Kass, G., Leeder, T., Lickorish, F.A., Lueshi, L.M., Magee, C., Mata, T., McBride, T., McCarthy, N., Mercer, A., Neilson, R., Ouchikh, J., Oughton, E.J., Oxenham, D., Pallett, H., Palmer, J., Patmore, J., Petts, J., Pinkerton, J., Ploszek, R., Pratt, A., Rocks, S.A., Stansfield, N., Surkovic, E., Tyler, C.P., Watkinson, A.R., Wentworth, J., Willis, R., Wollner, P.K.A., Worts, K., Sutherland, W.J., 2014. Identifying the Science and Technology Dimensions of Emerging Public Policy Issues through Horizon Scanning. PLoS ONE 9, e96480. https://doi.org/10.1371/journal.pone.0096480
Peha, J.M., 2017. Cellular economies of scale and why disparities in spectrum holdings are detrimental. Telecommunications Policy 41, 792–801. https://doi.org/10.1016/j.telpol.2017.06.002
Perez, A., Yeh, C., Azzari, G., Burke, M., Lobell, D., Ermon, S., 2017. Poverty Prediction with Public Landsat 7 Satellite Imagery and Machine Learning. arXiv:1711.03654 [cs, stat].
Pokhriyal, N., Jacques, D.C., 2017. Combining disparate data sources for improved poverty prediction and mapping. PNAS 114, E9783–E9792.
https ://doi.org /10.1073/pnas.1700 319114 Proville, J., Zaval a-Araiza, D., Wagner, G., 2017. Night- ti me lights: A gl obal, long term l ook at links to socio-economic trend s. PLOS ONE 12, e0174610. https://doi.org /10.1371/journ al.pone.0174610 PyTorch, 2020. PyTorch - An open s ource deep learnin g platform. [W WW Docum ent]. URL https://www.pyt orch.org (access ed 2.21.21). 30 Reddick, C.G., Enriq uez, R., Harris, R.J., Sh arma, B., 202 0. Determinants of broadband access and affordability: An anal ysis of a c ommunity survey on the digital divide. Cities 106, 102904. https://doi.org /10.1016/j.cities.2 020.102904 Reisdorf, B.C., Fernandez, L., Hampton, K.N ., Shin, I., Du tton, W.H., 2020. Mobile Phones Will N ot Eliminate Digital an d Social Divid es: How Variation in I nternet Activi ties Mediates the Relationship Bet ween Type of Int ernet Access and L ocal Social Capital in Detroit. Social Science Computer Review 0894 43932090944 6. https://doi.org /10.1177/0894 4393209094 46 Rhinesmith, C., R eisdorf, B., Bishop, M., 2019. The abil ity to pay for broad band. Communicati on Research and Pra ctice 5, 121 – 138. https: //doi.org/10. 1080/22041451.2019.16 01491 Riddlesden, D., Sin gleton, A.D., 201 4. Broadband spee d equity: A new digital d ivide? Applied Geography 52, 25 – 33. https://doi.org/ 10.1016/j.apge og.2014.04.008 Righi, R., Samoili, S., López Cobo, M., Vázquez-Prada Baillet, M., Cardona, M., De Prato, G., 2020. The AI techno-economic complex Syste m: Worldwide lan dscape, thematic subd omains and technological collab orations. Telecom munications Poli cy, Artificial intelligen ce, economy and society 44, 10 1943. https://doi. org/10.1016 /j.telpol.2020.101943 Rosston, G.L., Wallsten, S., 2019. Incr easing Low-Income Broad band Adoption through Private Incentives (SSRN Sch olarly Paper N o. ID 3431346 ). Social Science Research Netw ork, Rochester, NY. https ://doi.org/10. 2139/ssrn.3431 346 Rosston, G.L., Wallsten, S.J., 2020. 
Increasin g low-income br oadband adoption th rough private incentives. Telec ommunications Poli cy 44, 1020 20. https://doi.org /10.1016/j.telpol.2 020.102020 Sarkar, D., Bali, R., Ghosh, T., 201 8. Hands-On Transfer Learnin g with Pyth on: Implement advanced deep learning and neural n etwork models usin g Tenso rFlow and Keras. Pac kt Publishi ng Ltd. Saxe, S., MacAskill , K., 2019. Toward adaptive infras tructure: the role of existing i nfrastructure systems. Sustainab le and Resilient Infras tructure 0, 1 – 4. https://doi.org /10.1080/237896 89.2019.168182 2 Schmid, T., Brucksch en, F., Salvati, N., Zbira nski, T., 2 017. Constructing sociodemographic in dicators for national statistical institutes by usin g mobile phone data: estimating litera cy rates in Senegal. Journal of the Royal Statistical So ciety: Series A (Statistics in S ociety) 180, 1163 – 1190. https://d oi.org/10.1111/rs sa.12305 Sevastianov, L.A., V asilyev, S.A., 2018. Telecommunica tion market m odel and optimal prici ng scheme of 5G servic es, in: 20 18 10th International Con gress on Ultra Modern Telecommunications and Control Sys tems and Wor kshops (ICUMT). Presented at the 201 8 10th International C ongress on Ultra Modern Telec ommunications and C ontrol Systems and Workshops (ICUM T), pp. 1 – 6. https://d oi.org/10.110 9/ICUMT.2018.8631269 Simonyan, K., Zisser man, A., 201 5. Very Deep Con volutional Networks f or Large-Scale Image Recognition. arXiv :1409.1556 [c s]. Steele, J.E., Sund søy, P.R., Pezzulo, C., Alegana, V.A., Bird, T.J., Blumenst ock, J., Bjelland, J. , Engø- Monsen, K., de Montjoye, Y.-A., Iqbal, A.M., Hadiu zzaman, K.N., Lu, X., Wetter, E ., Tatem, A.J., Bengtsson, L., 2017. Mappin g poverty using m obile phone and satellite dat a. Journal of The Royal Society Interface 14, 2 0160690. https: //doi.org/10. 1098/rsif.2016.0 690 Steenbruggen, J., Tran os, E., N ijkamp, P., 2015. Da ta from mobile ph one operators: A to ol for smarter cities ? 
Telecommunications Policy, New e mpirical approach es to telecommunications economics: Oppo rtunities and ch allenges 39, 33 5 – 346. https://doi.org /10.1016/j.telpol.2014.0 4.001 Stevens, F.R., Gaughan , A.E., Linard, C., Tate m, A.J., 2015. Disaggregating Censu s Data for Population Mapping Using Rand om Forests with R emotely-Sensed and Ancillary Data. PL OS ONE 10, e0107042. https: //doi.org/10.1 371/journal.pone.010704 2 Sultan, K., Ali, H., Zhang , Z., 2018. Call Detail Records D riven Anomaly Detection and Traffic Prediction in Mobil e Cellular Netw orks. IEEE Access 6, 41728 – 41737. https://doi.org /10.1109/ACCESS.20 18.2859756 31 Suryanegara, M., 2 018. The Econ omics of 5G: Shifting f rom Revenue-per-User to Revenue-per- Machine, in: 20 18 18th International S ymposium on Communications and Information Technologies (ISCIT). Presented at the 2018 18th Inter national Symp osium on Communications an d Information Techn ologies (ISCIT), pp. 19 1 – 194. https://doi.org /10.1109/ISCIT.20 18.8588006 Tatem, A.J., 20 17. WorldPop , open data for spatial d emography. Sci Data 4, 1 – 4. https://doi.org /10.1038/sdata.2017. 4 Taufique, A., Jaber, M., Imran, A., Da wy, Z., Yacoub, E., 2017. Plannin g Wireless Cellular Netw orks of Future: Outlook, Challenges and Opp ortunities. IEEE A ccess 5, 482 1 – 4845. https://doi.org /10.1109/ACCESS.20 17.2680318 Taylor, R.D., Schej ter, A.M., 2013. Beyond Broadband Access: Developing Data-Based Information Policy Strategie s. Fordham Univ Press. Tchamyou, V.S., Erre ygers, G., Cassi mon, D., 201 9. Inequality, ICT and financial access in Afric a. Technological F orecasting and Social Chan ge 139, 1 69 – 184. https://doi.org /10.1016/j.techfore.20 18.11.004 The Africa Rep ort, 2020. Ethiopia: 45 % of telecoms co mpany Ethio to be s old off, despite conflic t in the north [WWW Document]. The Africa Report.co m. 
URL https://www.theafricar eport.com/5 1731/ethiopia- 45 - of -telec oms-company-ethi o- to - be - sold-off-despite-confli ct- in -the-north/ (a ccessed 1.8.21). Thoung, C., Beaven, R., Zuo, C., Birkin, M., Tyler, P., Cr awford-Brown, D., Ough ton, E.J., Kelly, S., 2016. Future de mand for infrastru cture services, in: The Future of Nati onal Infrastructure: A System- of -Syst ems Approach. Cambridge Univer sity Press, Cambridge. Turner-Lee, N.E., Miller, J.S., 2011. The Social Cost of Wireless Taxation : Wireless Taxation and its Consequences for Minorities and the P oor. Joint Center for Political & Ec onomic Studies (Nov. 2011), availab le at http://www. j ointcenter. org/sites/default /files/upload/rese arch/files/The% 2 0Social% 20Cost% 20of% 20Wireless % 20T affixation. pdf ( “The Social Cost of Wir eless Taxati on”). United Nations, 2019. The Sustainab le Development Goals [WWW Docu ment]. United Nati ons Sustainable Develop ment. URL https://ww w.un.org/sustainabled evelopment/ (a ccessed 1.3.20). Vesnic-Alujevic, L., Nascimento, S., Pólvora, A., 2020. Societal and ethic al impacts of a rtificial intelligence: Critical notes on European poli cy frame works. Telecommunica tions Policy, Artificial intelligenc e, economy and s ociety 44, 1019 61. https://doi.org /10.1016/j.telpol.2 020.101961 Vincenzi, M., Lop ez-Aguilera, E., Garcia- Vi llegas, E., 20 19. Maximizing Infrastru cture Providers’ Revenue Through Network Slicin g in 5G. IEEE Access 7, 128283 – 128297. https://doi.org /10.1109/ACCESS.20 19.2939935 Wallsten, S.J., 20 01. An Econometric An alysis of Tele com Competition, Privatizatio n, and Regulation in Africa and Latin America. The J ournal of Industrial E conomics 49, 1 – 19. https://doi.org /10.1111/1467-6451.00 135 Wesolowski, A., Eag le, N., N oor, A.M., Snow, R.W., Bu ckee, C.O., 2012. Heterogeneous m obile phone ownership and usag e patterns in Keny a. PloS one 7. Whitacre, B., Str over, S., Gallard o, R., 2015. 
How much does broadband inf rastructure mat ter? Decomposing th e metro – non-metro adop tion gap with the help of the National Broadband Map. Government Infor mation Quarterly 32, 261 – 269. https://doi.org /10.1016/j.giq .2015.03.002 World Bank, 20 21a. The World Bank In Malawi: Ov erview [WWW D ocument]. World Bank. URL https://www.worldban k.org/en/countr y/malawi/over view (accessed 1. 8.21). World Bank, 20 21b. The World Bank In Ethiopia: Overview [WWW Document]. World Bank. UR L https://www.worldban k.org/en/countr y/ethiopia/ overview (accessed 1. 8.21). World Bank, 20 19. Annual Report 2 019 Lending Data. World Bank, Wash ington D.C. 32 World Bank, 20 16a. Living Standard s Measurement St udy (LSMS) - Malawi 201 6 [WWW Document]. URL https://micr odata.worldban k.org/index.php/catal og/lsms (accessed 1.3.20). World Bank, 20 16b. Ethiopia Socioec onomic Survey 2015-2016 [WWW Documen t]. URL https://microdata.w orldbank.org/ind ex.php/catalog/l sms (accessed 1.3. 20). Yang, Q., Zhang, Y., Dai, W., Pan, S.J., 2020. Transfer L earning. Cambridge Uni versity Pres s, Cambridge. https:/ /doi.org/10.1017/ 9781139061773 Zhou, Y., Smith, S.J ., Zhao, K., Imhoff, M., Thomson, A. , Bond-Lamberty, B., Asrar, G.R., Zhang, X., H e, C., Elvidge, C.D., 2015. A global map of urban extent f rom nightlights. Envir on. Res. Lett. 10, 054011. https://d oi.org/10.1088/174 8-9326/10/5/0540 11
