A statistical framework for measuring the temporal stability of human mobility patterns

ZHIHANG DONG*,†, YEN-CHI CHEN* AND ADRIAN DOBRA*

CONTACT A. Dobra. Email: adobra@uw.edu.

Abstract. Despite the growing popularity of human mobility studies that collect GPS location data, the problem of determining the minimum required length of GPS monitoring has not been addressed in the current statistical literature. In this paper we tackle this problem by laying out a theoretical framework for assessing the temporal stability of human mobility based on GPS location data. We define several measures of the temporal dynamics of human spatiotemporal trajectories based on the average velocity process, and on activity distributions in a spatial observation window. We demonstrate the use of our methods with data that comprise the GPS locations of 185 individuals over the course of 18 months. Our empirical results suggest that GPS monitoring should be performed over periods of time that are significantly longer than what has been previously suggested. Furthermore, we argue that GPS study designs should take into account demographic groups.

KEYWORDS: Density estimation; global positioning systems (GPS); human mobility; spatiotemporal trajectories; temporal dynamics

Contents
1. Introduction
2. Methods
2.1. Measuring the temporal stability of human mobility patterns
2.2. The activity distribution of human mobility patterns
2.3. Measuring the temporal stability of human activity distributions
3. Application
4. Discussion
Funding
Acknowledgment
Appendix A. Proofs of theoretical results
A.1. Proof of Theorem 2.1
A.2. Proof of Theorem 2.2
References

1. Introduction

Recent developments on global positioning systems (GPS) for wearable technology such as smartphones have drawn a great amount of interest from scientists studying the effects of environmental influences on different population groups [34, 26, 33, 21, 3, 44, 22, 41, 14]. A recent article [27] documents more than 100 studies from 20 disciplines that collect and analyze human time-stamped GPS location data. This type of data is key for learning about the places where people routinely spend their time during activities of daily living in order to establish their relationship with socio-economic outcomes, crime victimization, and physical and mental well-being. There have been extensive studies on the social stratification of mobility, such as health disparities of different neighborhoods, mental health, and substance abuse intervention [13, 38, 41], on the assessment of human spatial behavior and spatiotemporal contextual exposures [26, 33, 21], on the characterization of the relationship between geographic and contextual attributes of the environment (e.g., the built environment) and human energy balance (e.g., diet, weight, physical activity) [3, 44], on the study of segregation, environmental exposure, and accessibility in social science research [22], or on the understanding of the relationship between health-risk behavior in adolescents (e.g., substance abuse) and community disorder [41, 1, 42].

Notwithstanding a general consensus across disciplines about the tremendous potential of GPS location data for studying human mobility, very little is currently known about how long a GPS study should last. There is an inherent trade-off between collecting location data from people for longer vs. shorter periods of time. Recording more GPS locations yields more information about the locations where an individual spends their time, as well as about the frequency, duration and timing of their visits to these places.
However, an individual's participation in a GPS study comes with burdens that often become significant if accumulated over longer periods of time: the individual needs to carry the device recording the data (a GPS tracker) everywhere they go, and needs to make sure the device is charged at all times and functions properly. Until recently, most GPS study designs stipulated mandatory regular visits to project coordination sites to download data from the location trackers, to replace batteries, and to replace the GPS tracking devices that were lost or were malfunctioning. While some of these issues have been addressed by using specialized apps on smartphones to collect GPS data and wirelessly transmit them into secure cloud databases, the costs of distributing smartphones to study participants, data plans, software development, and cloud computing are quite significant. In addition, there are important privacy considerations related to recording, for long periods of time, locations that might be sensitive for study participants. For these reasons, it is desirable to design GPS studies that are as short as possible to reduce the costs of the projects and the burden on study participants, while at the same time still providing guarantees that sufficient location data have been collected to properly address the research aims.

Despite the constant growth in the number of human mobility studies that collect GPS location data in the last 20 years, the question of how long GPS monitoring should last was not asked until recently [43]. The authors of [43] argue that an effective GPS study should last until a minimum of 14 to 15 days of valid GPS data have been collected.
While this finding is relevant for numerous research groups that, in the past, have designed GPS studies with a duration of 7 days (see [43] and the references therein), two weeks seems to severely underestimate the duration of other, more recent, GPS studies whose duration is significantly longer. For example, [6] and [29] represent studies that tracked adolescents in the San Francisco Bay area for one month. Another study [11] employs a more complex three site design that comprises five assessments that take place every six months over two years of follow-up for participants enrolled in Chicago, and three assessments that take place every six months over one year of follow-up for participants enrolled in Jackson and New Orleans. During each assessment, participants wear a GPS tracker for two weeks. Thus this study [11] records GPS locations for a total of 10 weeks and 6 weeks, respectively, but splits the period of observation into several contiguous two-week periods of GPS monitoring. These longer periods of observation time were suggested in [25], whose authors found 17 weeks to be an adequate period of time to monitor human mobility based on geotagged social media data.

In this paper we lay out a theoretical framework for assessing the temporal stability of human mobility based on GPS location data. Such a framework is missing from the current statistical literature. Previous work [43, 25] on the assessment of the duration of GPS observation periods is based on empirical findings, and lacks any theoretical underpinnings. We address this gap by introducing several measures of the temporal dynamics of spatiotemporal trajectories of individuals. We illustrate the use of these measures with publicly available data from a study that recorded GPS locations of 185 individuals who live in a city in Switzerland over the course of 18 months.

2. Methods

The spatiotemporal trajectory of an individual in a reference time frame $[t_{\min}, t_{\max}]$ and spatial observation window $W \subset \mathbb{R}^2_+$ is a curve

\[ X^{[t_{\min}, t_{\max}]} = \{ X(t) = (x_1(t), x_2(t)) : t \in [t_{\min}, t_{\max}] \} \subseteq W, \tag{1} \]

where $x_1(\cdot)$ and $x_2(\cdot)$ represent the longitude and latitude coordinates, respectively, and $X(t)$ is the location visited by this individual at time $t$. We assume that this curve is smooth: $x_1(\cdot)$ and $x_2(\cdot)$ have continuous derivatives. The length of the curve in Eq. (1) is defined as [9]:

\[ L\left(X^{[t_{\min}, t_{\max}]}\right) = \int_{t_{\min}}^{t_{\max}} \sqrt{\left(\frac{d x_1(t)}{dt}\right)^2 + \left(\frac{d x_2(t)}{dt}\right)^2} \, dt. \tag{2} \]

The complete trajectory $X^{[t_{\min}, t_{\max}]}$ is never observed in the real world. Instead, $n$ observation times $t_1, \ldots, t_n$ are sampled from a distribution on $[t_{\min}, t_{\max}]$ with density $\rho(\cdot)$, and the corresponding locations $X(t_1), \ldots, X(t_n)$ on the curve $X^{[t_{\min}, t_{\max}]}$ are recorded. These locations are realizations of a random variable $X(T)$ where $T \sim \rho(\cdot)$. Ideally we would like $T$ to follow a uniform distribution to have the same chance of recording a visited location anywhere in the reference time frame $[t_{\min}, t_{\max}]$. Due to technological limitations (e.g., GPS devices running out of power), heterogeneous built environments that prevent GPS devices from obtaining a location (e.g., skyscrapers in downtown areas or buildings without windows and Wi-Fi coverage), or human behavioral factors (e.g., individuals turning off their GPS devices around certain locations sensitive to them), the distribution of $T$ can be far from the uniform distribution.

We assume that GPS positional data from $K$ study participants were recorded. We denote by $X_k^{[t_{\min}, t_{\max}]} = \{ X_k(t) : t \in [t_{\min}, t_{\max}] \}$ the unobserved spatiotemporal trajectory of the $k$-th study participant.
The observation times in the reference time frame $[t_{\min}, t_{\max}]$ can vary between study participants. The GPS data for the $k$-th study participant are the time-stamped longitude and latitude locations:

\[ \{ X_{k,i} = X_k(t_{k,i}) : i = 1, \ldots, n_k \}, \tag{3} \]

where $n_k \ge 1$, the time $t_{k,i}$ was sampled from a distribution with density $\rho_k(\cdot)$ independently of the rest of the observation times, and $t_{\min} \le t_{k,1} \le \ldots \le t_{k,n_k} \le t_{\max}$. Here $t_{k,i}$ represents the time when the $i$-th location of study participant $k$ was recorded. Our framework allows for the possibility of having different reference time frames for various groups of study participants.

2.1. Measuring the temporal stability of human mobility patterns.

One possible measure of the dynamics of the spatiotemporal trajectory $X^{[t_{\min}, t_{\max}]}$ is the average velocity $V(\tau)$ at time $\tau$, which is a function of the length of the subcurve $X^{[t_{\min}, t_{\min} + \tau]}$ of $X^{[t_{\min}, t_{\max}]}$ from Eq. (1):

\[ V(\tau) = \frac{1}{\tau} L\left(X^{[t_{\min}, t_{\min} + \tau]}\right), \tag{4} \]

for $\tau \in (0, t_{\max} - t_{\min}]$ and $V(0) = 0$. A sample estimator of the average velocity for the $k$-th study participant is

\[ \widehat{V}_k(\tau) = \frac{1}{\tau} \sum_{\{ i \,:\, t_{k,i+1} \le \tau \}} \| X_{k,i+1} - X_{k,i} \|, \tag{5} \]

where $\| X_{k,i+1} - X_{k,i} \|$ represents an estimate of the distance traveled between times $t_{k,i}$ and $t_{k,i+1}$. In what follows we will assume that study participants traveled in a straight line or "as the crow flies" between two consecutive observed GPS locations. This is the simplest assumption one can make, and it leads to an easy way of calculating Great Circle (WGS84 ellipsoid) distances between two spatial locations [4]. However, this assumption underestimates actual distances traveled, and consequently underestimates the average velocity.
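As a minimal sketch of the estimator in Eq. (5), the following Python code sums straight-line distances between consecutive locations and divides by the elapsed time. Several choices here are assumptions made for illustration only: the function names `haversine_km` and `average_velocity` are hypothetical, distances are computed with the haversine formula on a sphere rather than on the WGS84 ellipsoid, and observation times are compared with $t_{\min} + \tau$ so that $\tau$ is read as time elapsed since $t_{\min}$.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation, not the WGS84 ellipsoid


def haversine_km(p, q):
    """Great-circle distance in km between two (longitude, latitude) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def average_velocity(times, locations, tau, t_min=0.0):
    """Sample average velocity after an elapsed time tau, in the spirit of Eq. (5).

    times must be sorted; locations[i] is the (longitude, latitude) pair
    recorded at times[i]. A pair of consecutive locations contributes only
    if its second observation falls no later than t_min + tau.
    """
    if tau <= 0:
        return 0.0  # convention V(0) = 0
    total = 0.0
    for i in range(len(times) - 1):
        if times[i + 1] - t_min <= tau:
            total += haversine_km(locations[i], locations[i + 1])
    return total / tau
```

With times in hours, the result is in km per hour. For a hypothetical participant who moves one degree of latitude along the meridian every hour, `average_velocity` returns roughly 111.2 for any whole number of hours.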
More accurate approximations of distances traveled can be defined based on the shortest distances between two locations on a road network that spans the spatial observation window $W$. Calculating distances based on a road network is more complex than calculating straight line distances, and involves significant GIS work since the maximum speed of travel on different segments of road needs to be taken into account [10]. Nevertheless, as the span of time between two consecutive observed locations becomes shorter, the difference between the road network and straight line distances decreases.

More generally, consider a stochastic process $Z = \{ Z(\tau) : \tau \in [0, t_{\max} - t_{\min}] \}$, where $Z(\tau)$ is a mapping $f(\cdot)$ of the subcurve $X^{[t_{\min}, t_{\min} + \tau]}$ into $\mathbb{R}_+$. The mapping $f(\cdot)$ is chosen such that $\lim_{\tau \to (t_{\max} - t_{\min})} Z(\tau) = Z(t_{\max} - t_{\min})$. We define the absolute percentage error (APE, henceforth) $\phi(Z; \tau)$, which measures the error made when approximating $Z(t_{\max} - t_{\min})$ with $Z(\tau)$ for $\tau \in [0, t_{\max} - t_{\min}]$:

\[ \phi(Z; \tau) = \frac{| Z(\tau) - Z(t_{\max} - t_{\min}) |}{Z(t_{\max} - t_{\min})}. \]

We quantify the temporal stability of the process $Z$ by introducing a related process called the last crossing time process $\mathrm{LCT}_Z = \{ \mathrm{LCT}_Z(\gamma) : \gamma \ge 0 \}$, where

\[ \mathrm{LCT}_Z(\gamma) = \max \{ \tau \in [0, t_{\max} - t_{\min}] : \phi(Z; \tau) > \gamma \}. \tag{6} \]

In Eq. (6), $\mathrm{LCT}_Z(\gamma)$ is the last time at which the APE made when $Z(t_{\max} - t_{\min})$ is approximated with $Z(\tau)$ is above a threshold $\gamma$. The last crossing time is well defined since $\lim_{\tau \to (t_{\max} - t_{\min})} \phi(Z; \tau) = 0$.

Consider the process $Z_k = \{ Z_k(\tau) : \tau \in [0, t_{\max} - t_{\min}] \}$ associated with the $k$-th study participant, $Z_k(\tau) = f\left( X_k^{[t_{\min}, t_{\min} + \tau]} \right)$, and let $\widehat{Z}_k$ be its sample estimator based on the positional data in Eq. (3). The average velocity in Eq. (4) and its sample estimator in Eq. (5) are examples of processes $Z_k$ and $\widehat{Z}_k$. A sample estimator of the last crossing time $\mathrm{LCT}_{Z_k}(\gamma)$ is

\[ \widehat{\mathrm{LCT}}_{Z_k}(\gamma) = \max_{i = 1, \ldots, n_k} \left\{ t_{k,i} - t_{\min} : \phi(\widehat{Z}_k; t_{k,i} - t_{\min}) > \gamma \right\}. \tag{7} \]

We note that $\widehat{Z}_k(\tau)$ in the APE $\phi(\widehat{Z}_k; \tau)$ is determined based on the locations recorded for the $k$-th study participant before time $\tau$: $\{ X_{k,i} : t_{\min} \le t_{k,i} \le \tau \}$. As an illustration, Figure 1 shows estimates of the average velocity of an individual in the MDC data, together with the last crossing time estimate at $\gamma = 0.1$.

The last crossing time of the APE associated with a process that is a function of the spatiotemporal trajectory of a study participant represents a measure of this individual's mobility. Study participants that have more irregular mobility patterns (e.g., regular travel to locations at various distances from the individual's residence that change after a few days or weeks) are expected to have larger last crossing times compared to study participants that travel to the same locations each week. An example individual with a very regular mobility pattern, who travels every day from home to the office and back by following the same route and goes nowhere else, will record an APE equal to 0 after one day, which leads to last crossing times of less than one day in Eq. (7).

Previous work [43] on the temporal stability of spatiotemporal trajectories has used the mean absolute percentage error (MAPE), which is the average of the APE across study participants:

\[ \phi_K(\tau) = \frac{1}{K} \sum_{k=1}^{K} \phi(\widehat{Z}_k; \tau). \tag{8} \]

We define two measures of the overall temporal stability of the spatiotemporal trajectories of multiple study participants. The first overall measure is the last crossing time process $\mathrm{LCT}_{\phi_K} = \{ \mathrm{LCT}_{\phi_K}(\gamma) : \gamma \ge 0 \}$ of the MAPE process $\phi_K = \{ \phi_K(\tau) : \tau \in [0, t_{\max} - t_{\min}] \}$. We refer to this measure as LCT-MAPE($Z$).
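The APE, the sample last crossing time of Eq. (7), the MAPE of Eq. (8) and the LCT-MAPE measure can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the estimate at the last observation time stands in for $Z(t_{\max} - t_{\min})$, all participants are assumed to share one grid of elapsed times, and the value 0 is returned when the error never exceeds $\gamma$.

```python
def ape(z, z_final):
    """Absolute percentage error of z relative to the final value z_final."""
    return abs(z - z_final) / z_final


def last_crossing_time(elapsed, z_values, gamma):
    """Sample last crossing time in the spirit of Eq. (7).

    elapsed[i] is t_i - t_min and z_values[i] is the estimate of Z at that
    time; the last entry is used as the reference value. Returns 0.0 when
    the APE never exceeds gamma (a convention assumed here).
    """
    z_final = z_values[-1]
    crossing = [t for t, z in zip(elapsed, z_values) if ape(z, z_final) > gamma]
    return max(crossing, default=0.0)


def mape(z_by_participant):
    """MAPE curve of Eq. (8): the APE curves of all participants, averaged pointwise."""
    curves = [[ape(z, zs[-1]) for z in zs] for zs in z_by_participant]
    return [sum(column) / len(column) for column in zip(*curves)]


def lct_mape(elapsed, z_by_participant, gamma):
    """LCT-MAPE: the last elapsed time at which the MAPE curve exceeds gamma."""
    crossing = [t for t, m in zip(elapsed, mape(z_by_participant)) if m > gamma]
    return max(crossing, default=0.0)
```

For example, with weekly estimates `[5, 12, 9.8, 10.1, 10]` at elapsed times `[1, 2, 3, 4, 5]` and $\gamma = 0.1$, the APE exceeds the threshold in weeks 1 and 2 only, so the last crossing time is 2.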
The second overall measure is defined as the average of the last crossing times of the APE of $\widehat{Z}_k$ for $k = 1, \ldots, K$, i.e., $\mathrm{LCT}_K = \{ \mathrm{LCT}_K(\gamma) : \gamma \ge 0 \}$ where

\[ \mathrm{LCT}_K(\gamma) = \frac{1}{K} \sum_{k=1}^{K} \mathrm{LCT}_{Z_k}(\gamma). \]

We denote this second measure by LCT-APE($Z$). These two measures are the same only if they are calculated for a single study participant ($K = 1$). They are useful for comparing the temporal regularity of mobility patterns of groups of study participants (e.g., younger vs. older individuals, men vs. women, high SES vs. low SES).

Figure 1. Estimate of the average velocity (gray curve) of an individual in the MDC data over $t_{\max} = 21$ weeks. The dashed line indicates the value of $\widehat{V}(t_{\max})$, and the two dotted lines represent the lower bound $(1 - \gamma) \widehat{V}(t_{\max})$ and the upper bound $(1 + \gamma) \widehat{V}(t_{\max})$ for $\gamma = 0.1$. These bounds correspond with times $\tau$ for which the APE $\phi(V; \tau) \le \gamma$. The crosses denote the times $\tau$ for which $\phi(V; \tau) = \gamma$. The last crossing time for $\gamma = 0.1$ is marked with a triangle, and occurs at the end of week 10.

2.2. The activity distribution of human mobility patterns.

The average velocity associated with the spatiotemporal trajectory of an individual does not provide any information about the spatial configuration of locations visited. Consider two example individuals that drive without stopping with the same speed for a long period of time. The first example individual drives back and forth between two places $A_1$ and $A_2$. The second example individual drives in a cycle from a place $A_1$ to another place $A_2$, then to places $A_3$ and $A_4$, then back to place $A_1$. Since the spatiotemporal trajectory of the second individual involves two additional places, more sample locations will be needed to understand the mobility pattern of the second individual compared to the mobility pattern of the first individual.
However, the mobility patterns of these two example individuals will be indistinguishable based on the last crossing time process associated with their average velocity processes. We address this issue by introducing a distribution of the locations visited by an individual.

We assume that the observation window $W$ is partitioned into a set of grid cells $\mathcal{G} = \{ G_1, \ldots, G_N \}$. Each location $X(t)$ on the curve $X^{[t_{\min}, t_{\max}]}$ representing the spatiotemporal trajectory of an individual is mapped into a grid cell $G(t) \in \mathcal{G}$. The observed locations for this individual mapped into $\mathcal{G}$ are the sequence of grid cells $g_1 = G(t_1), \ldots, g_n = G(t_n)$ that are realizations of a random variable $G(T)$, where $T$ is a random variable on $[t_{\min}, t_{\max}]$ with a distribution with density $\rho(\cdot)$.

We define the activity distribution $\pi = (\pi_1, \ldots, \pi_N)$ over the grid cells $\mathcal{G}$. Here $\pi_j$ represents the proportion of time in $[t_{\min}, t_{\max}]$ spent by an individual in cell $G_j \in \mathcal{G}$. We assume that $T$ follows a uniform distribution on $[t_{\min}, t_{\max}]$, and define:

\[ \pi_j = P(G(T) = G_j), \quad \text{for } j = 1, \ldots, N. \tag{9} \]

The activity distributions associated with the two example individuals we introduced earlier can differentiate between their mobility patterns if the grid cells in which $A_3$ and $A_4$ lie do not coincide with the grid cells of $A_1$ and $A_2$, and will show that the first example individual did not spend any time in the grid cells associated with $A_3$ and $A_4$.

To employ activity distributions we need to have a method for recovering them from the available data. The simplest estimator $\widehat{\pi} = (\widehat{\pi}_1, \ldots, \widehat{\pi}_N)$ of the activity distribution $\pi$ is based on the relative frequency of visitation of the grid cells $\mathcal{G}$:

\[ \widehat{\pi}_j = \frac{1}{n} \sum_{i=1}^{n} 1(g_i = G_j), \quad \text{for } j = 1, \ldots, N. \]
However, this estimator of $\pi$ is reasonable only if $T$ follows a uniform distribution as in Eq. (9). When $T$ follows an arbitrary distribution with density $\rho(\cdot)$, a better approach is to use a weighted average estimator $\widetilde{\pi} = (\widetilde{\pi}_1, \ldots, \widetilde{\pi}_N)$ where:

\[ \widetilde{\pi}_j = \frac{\sum_{i=1}^{n} \rho^{-1}(t_i) \, 1(g_i = G_j)}{\sum_{\ell=1}^{n} \rho^{-1}(t_\ell)}, \quad \text{for } j = 1, \ldots, N. \tag{10} \]

Although this estimator can be shown to be statistically consistent, it requires knowledge of the density $\rho(\cdot)$. There are many methods for estimating $\rho(\cdot)$ from the data such as histograms or kernel density estimators [40]. We suggest using an estimation method that assumes that the distribution of $T$ is approximated by a piecewise uniform distribution. We take $t_0 = t_{\min}$ and $t_{n+1} = t_{\max}$. If $T$ is approximately uniform in $[t_{i-1}, t_{i+1}]$ for $i = 1, \ldots, n$, then $\rho^{-1}(t_i) \approx t_{i+1} - t_{i-1}$. This is a reasonable assumption if the times when locations are collected are roughly equally spaced in time (e.g., a location is collected every 10 minutes) since the mean of $t_i$ is $(t_{i+1} + t_{i-1}) / 2$. Thus an estimator of $\rho(\cdot)$ is

\[ \widehat{\rho}(t_i) = \frac{\omega(t_i)}{\sum_{\ell=1}^{n} \omega(t_\ell)}, \quad \omega(t_i) = \frac{1}{t_{i+1} - t_{i-1}}, \quad \text{for } i = 1, \ldots, n. \]

The weighted average estimator from Eq. (10) becomes

\[ \widehat{\pi}_{o,j} = \frac{\sum_{i=1}^{n} \omega^{-1}(t_i) \, 1(g_i = G_j)}{\sum_{\ell=1}^{n} \omega^{-1}(t_\ell)} = \frac{\sum_{i=1}^{n} (t_{i+1} - t_{i-1}) \, 1(g_i = G_j)}{t_{\max} - t_{\min} + t_n - t_1}, \quad \text{for } j = 1, \ldots, N. \tag{11} \]

We call $\widehat{\pi}_o = (\widehat{\pi}_{o,1}, \ldots, \widehat{\pi}_{o,N})$ the ordinary proportional time estimator of the activity distribution $\pi$. This estimator relies on the assumption that the length of the time intervals in which an individual transitions between two grid cells is added to the time spent in both the grid cell they leave from, and the grid cell they arrive in.
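To see how the weighting in Eq. (11) differs from the relative frequency estimator, the following Python sketch computes both on a small made-up sequence. The function names and the toy data are hypothetical and serve only to illustrate the formulas; grid cells are represented by arbitrary hashable labels.

```python
from collections import defaultdict


def relative_frequency(cells):
    """Share of the observations that fall in each grid cell."""
    counts = defaultdict(int)
    for g in cells:
        counts[g] += 1
    return {g: c / len(cells) for g, c in counts.items()}


def ordinary_proportional_time(times, cells, t_min, t_max):
    """Ordinary proportional time estimator in the spirit of Eq. (11).

    Observation i receives the weight t_{i+1} - t_{i-1}, with t_0 = t_min
    and t_{n+1} = t_max, and the weights are normalised to sum to one.
    """
    padded = [t_min] + list(times) + [t_max]
    weight = defaultdict(float)
    for i, g in enumerate(cells, start=1):
        weight[g] += padded[i + 1] - padded[i - 1]
    total = sum(weight.values())  # equals t_max - t_min + t_n - t_1
    return {g: w / total for g, w in weight.items()}
```

Take observation times `[1, 2, 3, 4, 8]` in a time frame from 0 to 12, with the first four observations in cell `"A"` and the last one in cell `"B"`. The relative frequency estimator assigns 0.8 to `"A"` because that cell was sampled more densely, whereas the ordinary proportional time estimator assigns it 11/19, or about 0.58. The interval between times 4 and 8, during which the transition happened, contributes to the weights of both cells.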
More specifically, assume that the consecutive observation times $t_i$ and $t_{i+1}$ are such that $g_i \ne g_{i+1}$. Then $\widehat{\pi}_o$ allocates $(t_{i+1} - t_i)$ to the total time spent in both $g_i$ and $g_{i+1}$.

We introduce a second estimator $\widehat{\pi}_c = (\widehat{\pi}_{c,1}, \ldots, \widehat{\pi}_{c,N})$ of the activity distribution $\pi$:

\[ \widehat{\pi}_{c,j} = \frac{\sum_{i=2}^{n} (t_i - t_{i-1}) \, 1(g_i = g_{i-1} = G_j)}{\sum_{i=2}^{n} (t_i - t_{i-1}) \, 1(g_i = g_{i-1})}, \quad \text{for } j = 1, \ldots, N. \tag{12} \]

We call $\widehat{\pi}_c$ the conservative proportional time estimator. This estimator is more conservative than the ordinary proportional time estimator $\widehat{\pi}_o$ from Eq. (11) in the sense that any time interval defined by consecutive observation times $t_i$ and $t_{i+1}$ such that $g_i \ne g_{i+1}$ is ignored. That is, the time spent in a grid cell is calculated only based on time intervals in which an individual is known to have remained in that cell.

We show two important properties of the ordinary and the conservative proportional time estimators. First, we prove that both estimators are asymptotically equivalent. Second, we prove that both estimators are statistically consistent, that is, they will eventually recover the true activity distribution $\pi$ if sufficient location data are available. These properties rely on the assumptions (S1), (S2) and (S3) below:

(S1) The length of the time intervals between consecutive observation times $\max_{i = 1, \ldots, n-1} | t_{i+1} - t_i | \to 0$ as the sampling rate $n \to \infty$.
(S2) The sampling period is such that $t_1 \to t_{\min}$ and $t_n \to t_{\max}$ when $n \to \infty$.
(S3) The number of transitions between grid cells is finite, i.e., there exists $M < \infty$ such that $\sum_{t \in [t_{\min}, t_{\max}]} 1(G(t^+) \ne G(t^-)) \le M$, where $G(t^-)$ and $G(t^+)$ are the left and right limits of $G(\cdot)$ at $t$.

Assumptions (S1) and (S2) describe the meaning of asymptotics in our context. They imply that the observation times $t_1, \ldots, t_n$ will eventually be dense in the reference time frame, i.e., there will not exist a fixed region of $[t_{\min}, t_{\max}]$ without any observation times when $n \to \infty$. Assumption (S3) requires that the spatiotemporal trajectory $X^{[t_{\min}, t_{\max}]}$ is sufficiently smooth such that it will not jump between grid cells infinitely often.

Theorem 2.1 (Asymptotic Equivalence Rule with Large Sampling Rate). Under assumptions (S1), (S2) and (S3), the ordinary proportional time estimator $\widehat{\pi}_o$ from Eq. (11) and the conservative proportional time estimator $\widehat{\pi}_c$ from Eq. (12) are asymptotically the same.

The proof of this result is given in Appendix A.1. We can also show that the same assumptions imply that the two estimators are statistically consistent.

Theorem 2.2 (Convergence Rule with Large Sampling Rate). Under assumptions (S1), (S2) and (S3), the ordinary proportional time estimator $\widehat{\pi}_o$ from Eq. (11) and the conservative proportional time estimator $\widehat{\pi}_c$ from Eq. (12) converge to the true activity distribution $\pi$ from Eq. (9).

The proof of this result is given in Appendix A.2.

2.3. Measuring the temporal stability of human activity distributions.

We are interested in determining the temporal stability of the activity distribution of an individual. We assume that the reference time frame $[t_{\min}, t_{\max}]$ is divided into $D_{\max}$ time periods of equal lengths (e.g., days or weeks). We denote by $\pi^{(D)}$ the activity distribution from Eq. (9) associated with time period $D$, $D = 1, \ldots, D_{\max}$. Then $\pi^{(D)}$ can be viewed as an $N$-dimensional random vector whose distribution reflects the variability from time period to time period of the individual's mobility patterns. With this understanding, we are interested in determining the expectation $\bar{\pi} = E(\pi^{(D)})$.
We call $\bar{\pi}$ the time period activity distribution (e.g., daily or weekly activity distribution). The $j$-th component of $\bar{\pi}$ is interpreted as the average proportion of time spent by the individual in grid cell $G_j$ in a given time period (a day or a week). A simple estimator of $\bar{\pi}$ is

\[ \widehat{\bar{\pi}}(D) = \frac{1}{D} \sum_{d=1}^{D} \widehat{\pi}^{(d)}, \quad \text{for } D = 1, \ldots, D_{\max}, \tag{13} \]

where $\widehat{\pi}^{(d)}$ is the ordinary proportional time estimator $\widehat{\pi}_o$ from Eq. (11) or the conservative proportional time estimator $\widehat{\pi}_c$ from Eq. (12) calculated for time period $d$. Because $\widehat{\bar{\pi}}(D)$ is a consistent estimator of $\bar{\pi}$, the error we make when approximating $\bar{\pi}$ with $\widehat{\bar{\pi}}(D)$ decreases as we observe the spatiotemporal trajectory of the individual for a larger number of time periods $D_{\max}$. We define the last crossing time of the sequence of estimators $\{ \widehat{\bar{\pi}}(D) : D = 1, \ldots, D_{\max} \}$ as follows:

\[ \widehat{\mathrm{LCT}}_{\mathrm{dist}}(\gamma) = \max_{D = 1, \ldots, D_{\max}} \left\{ D : \| \widehat{\bar{\pi}}(D) - \widehat{\bar{\pi}}(D_{\max}) \|_1 > \gamma \right\}, \tag{14} \]

where $\| v \|_1$ is the usual $L_1$ norm for a vector $v$, i.e., $\| v \|_1 = \sum_i | v_i |$. Note that in Eq. (14) we used the fact that $\| \widehat{\bar{\pi}}(D) \|_1 = 1$ for any $D$.

The last crossing time in Eq. (14) is a measure of the temporal stability of the entire time period activity distribution $\bar{\pi}$. Individuals that spend approximately the same amount of time in the same places in every time period need to be observed for a smaller number of time periods to calculate the estimator $\widehat{\bar{\pi}}(D)$ with the same APE compared to individuals with heterogeneous mobility patterns that spend different amounts of time at locations that change substantially across time periods. Therefore $\widehat{\mathrm{LCT}}_{\mathrm{dist}}(\gamma)$ will be smaller for individuals whose mobility changes less from time period to time period, and larger for individuals with irregular mobility patterns.

The disadvantage of using the last crossing time in Eq. (14) as a measure of temporal stability comes from the fact that it gives the same weight to the error made when estimating the proportion of time spent in grid cells in which an individual spends a lot of their time, and in grid cells that the individual rarely visits. The number of grid cells with a large proportion of time spent in them is likely significantly smaller than the total number of grid cells $N$ because most people tend to spend time at their residence, at their workplace and perhaps in a few other select locations. For this reason, the error made when estimating the proportion of time spent in grid cells with sparse presence could dominate the overall APE of $\widehat{\bar{\pi}}(D)$, and lead to larger values of $\widehat{\mathrm{LCT}}_{\mathrm{dist}}(\gamma)$. To remedy this issue, we define a new measure of temporal stability that focuses on the grid cells in which an individual spends larger proportions of time.

We define the ranking time period activity distribution $\bar{r} = (\bar{r}_1, \ldots, \bar{r}_N)$ associated with $\bar{\pi}$ by replacing each component of $\bar{\pi}$ with the sum of those components of $\bar{\pi}$ that are no larger than that component, as follows [7]:

\[ \bar{r}_j = \sum_{l=1}^{N} \bar{\pi}_l \, 1(\bar{\pi}_l \le \bar{\pi}_j), \quad \text{for } j = 1, \ldots, N. \tag{15} \]

The $\alpha$-level set ($\alpha \in [0, 1]$) of $\bar{r}$ is defined to consist of all the grid cells whose corresponding components in $\bar{r}$ are at least $\alpha$:

\[ L_\alpha = \{ G_j : \bar{r}_j \ge \alpha \}. \tag{16} \]

It turns out that the $\alpha$-level set covers grid cells whose total sum of components of $\bar{\pi}$ is at least $1 - \alpha$:

\[ \sum_{G_j \in L_\alpha} \bar{\pi}_j \ge 1 - \alpha. \]

Level sets have an easy to understand interpretation: for a given level $\alpha$, say $\alpha = 0.7$, all the grid cells with a ranking time period activity distribution above $0.7$ will jointly cover at least $(1 - 0.7) \cdot 100 = 30\%$ of the time in the time period.
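The quantities in Eqs. (13) to (16) can be sketched as follows in Python. The function names are hypothetical, each per-period activity distribution is assumed to be given as a list of $N$ proportions that sum to one, and returning 0 when the $L_1$ distance never exceeds $\gamma$ is a convention chosen here.

```python
def cumulative_average(period_dists):
    """Running means of per-period activity distributions, as in Eq. (13).

    period_dists[d - 1] holds the N proportions estimated for period d; the
    result has one averaged distribution for each D = 1, ..., D_max.
    """
    running = [0.0] * len(period_dists[0])
    means = []
    for d, dist in enumerate(period_dists, start=1):
        running = [r + p for r, p in zip(running, dist)]
        means.append([r / d for r in running])
    return means


def lct_dist(period_dists, gamma):
    """Last D whose running mean is more than gamma away from the final one (L1 norm), Eq. (14)."""
    means = cumulative_average(period_dists)
    final = means[-1]
    crossing = [d for d, m in enumerate(means, start=1)
                if sum(abs(a - b) for a, b in zip(m, final)) > gamma]
    return max(crossing, default=0)


def ranking(dist):
    """Ranking distribution of Eq. (15): for each cell, the mass of all cells no larger than it."""
    return [sum(p for p in dist if p <= pj) for pj in dist]


def level_set(dist, alpha):
    """Indices of the grid cells in the alpha-level set of Eq. (16)."""
    return {j for j, r in enumerate(ranking(dist)) if r >= alpha}
```

For the distribution `[0.5, 0.25, 0.125, 0.125]`, the ranking distribution is `[1.0, 0.5, 0.25, 0.25]`, so the 0.5-level set contains the first two cells, which together cover 0.75 of the time, in agreement with the bound $1 - \alpha = 0.5$.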
Values of $\alpha$ closer to 1 lead to level sets $L_\alpha$ with a smaller coverage that comprise only the grid cells in which the individual spends the largest amounts of time. Values of $\alpha$ close to 0 lead to level sets $L_\alpha$ with a larger coverage that comprise the majority of grid cells the individual spent time in.

Let $\widehat{\bar{r}}(D)$ be the ranking distribution of the estimator $\widehat{\bar{\pi}}(D)$ of $\bar{\pi}$ in Eq. (13), and $L_\alpha(D)$ be the $\alpha$-level set associated with $\widehat{\bar{r}}(D)$ as in Eq. (16). Given a level $\alpha \in [0, 1]$ and a stability threshold $\gamma > 0$, we define the last crossing time of the sequence of level sets $\{ L_\alpha(D) : D = 1, \ldots, D_{\max} \}$ as follows:

\[ \widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(\gamma) = \max_{D = 1, \ldots, D_{\max}} \left\{ D : \frac{\| L_\alpha(D) \,\triangle\, L_\alpha(D_{\max}) \|}{\| L_\alpha(D_{\max}) \|} > \gamma \right\}, \tag{17} \]

where $\triangle$ denotes the symmetric difference of two sets, and $\| \cdot \|$ denotes the number of elements in a set. The LCT of the level sets from Eq. (17) is a measure of temporal stability of the time period activity distribution $\bar{\pi}$ that takes into account only the error made when estimating the time spent in the grid cells in which an individual spent most of their time. For the same value of $\gamma$, $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(\gamma)$ is decreasing as the level $\alpha$ is increasing.

3. Application

The data we analyze come from Nokia's Mobile Data Challenge (MDC) [18, 23, 24]. This was a mobile computing research initiative focusing on generating a deeper scientific understanding of social and behavioral patterns related to mobile technologies. The study took place in Switzerland, and collected various types of longitudinal information including time-stamped GPS data from the cell phones of 185 study participants over the course of 18 months. Demographic data such as age and sex are also available. There are approximately 57.5 million GPS location records. The average length of observation for study participants was about 55 weeks.
These data are publicly available upon request from the Idiap Research Institute.

Table 1. Means, medians and sample standard deviations of three measures of temporal stability of mobility patterns. The unit of time is weeks.

Mobility Measure                  Mean     Median   St. Dev.
LCT-velocity                      30.04    26       17.29
LCT-distribution                  37.18    37       16.06
LCT-level set ($\alpha = 0.2$)    17.69    17        9.50

Most activities of daily living of the study participants took place in a rectangular area that we partitioned into $4000^2$ square grid cells with sides of length 28 meters. The locations that do not belong to this spatial observation window were dropped. These locations typically correspond with longer trips taken by study participants away from their places of residence. Figure 2 displays summaries of the GPS locations that fall in our chosen spatial observation window.

Figure 2. Summary information of the GPS location data. Left panel: histogram of the total length of observation for each study participant expressed in weeks. Right panel: histogram of the average number of GPS locations per week for each study participant.

For each study participant, we calculated three measures of temporal stability of their mobility patterns: the last crossing time of the average velocity (LCT-velocity) as defined in Eq. (5) and Eq. (7), the last crossing time of the activity distribution (LCT-distribution) as defined in Eq. (14), and the last crossing time of the level sets of the weekly activity distribution (LCT-level set) as defined in Eq. (17). In the calculation of LCT-distribution and LCT-level set, we use the ordinary proportional time estimator defined in Eq. (11). We used $\alpha = 0.2$ in the determination of level sets, and $\gamma = 0.2$ as the stability threshold for all three measures. The results are summarized in Table 1.
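To make the level-set calculation concrete, the following Python sketch computes a last crossing time in the spirit of Eq. (17) from a sequence of weekly activity distributions. It is an illustration only and not the code behind Table 1: the function names are hypothetical, each weekly distribution is assumed to be a list of $N$ proportions, the level $\alpha$ is assumed to be strictly between 0 and 1 so that the final level set is not empty, and 0 is returned when the relative symmetric difference never exceeds $\gamma$.

```python
def level_set(dist, alpha):
    """alpha-level set of Eq. (16): cells whose ranking value is at least alpha."""
    rank = [sum(p for p in dist if p <= pj) for pj in dist]
    return {j for j, r in enumerate(rank) if r >= alpha}


def lct_level(period_dists, alpha, gamma):
    """Level-set last crossing time in the spirit of Eq. (17).

    period_dists[d - 1] is the estimated activity distribution of period d.
    The level set of each running mean is compared with the level set of
    the final running mean through the size of their symmetric difference.
    """
    running = [0.0] * len(period_dists[0])
    sets = []
    for d, dist in enumerate(period_dists, start=1):
        running = [r + p for r, p in zip(running, dist)]
        sets.append(level_set([r / d for r in running], alpha))
    final = sets[-1]
    if not final:
        raise ValueError("the final level set is empty; choose a smaller alpha")
    crossing = [d for d, s in enumerate(sets, start=1)
                if len(s ^ final) / len(final) > gamma]
    return max(crossing, default=0)
```

On the made-up four-week sequence `[[1, 0, 0], [0.5, 0.5, 0], [0.5, 0.25, 0.25], [0.5, 0.25, 0.25]]` with $\gamma = 0.2$, the sketch returns 1 for $\alpha = 0.2$ and 3 for $\alpha = 0.1$: the lower level admits a rarely visited third cell into the final level set, and that cell only enters the running level set in the last week.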
About 30 weeks of observation are needed until the mobility patterns stabilize according to the LCT-velocity measure. A longer period of time, 37 weeks, is needed until the weekly activity distribution stabilizes. The increased length of the period of observation for this measure is not surprising since it is based on an estimate of the full weekly activity distribution in $N = 4000^2$ grid cells. About half of this observation time (18 weeks) is needed to obtain estimates of the 0.2-level set of the weekly activity distribution, which comprises the grid cells in which the study participants spend 80% of their weekly time.

We exemplify how the $\alpha$-level set $L_\alpha$ from Eq. (16) and its corresponding LCT-level set $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(0.2)$ from Eq. (17) change for different values of $\alpha \in [0, 1]$. To this end, we define an adjacency graph $G_{\mathrm{grid}}$ whose vertices are the $N = 4000^2$ grid cells in the spatial observation window. Two grid cells are connected by an edge in $G_{\mathrm{grid}}$ if they share an edge or a corner in their arrangement in the spatial observation window [39, 4]. We denote by $G_{\mathrm{grid}}(L_\alpha)$ the subgraph of $G_{\mathrm{grid}}$ defined by the grid cells in $L_\alpha$. We chose a study participant, and determined the level set $L_\alpha$, the last crossing time $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(0.2)$ and the number of connected components of $G_{\mathrm{grid}}(L_\alpha)$ for $\alpha \in \{0.1, 0.2, \ldots, 1\}$ (see Figure 3). For smaller values of $\alpha$, $L_\alpha$ contains grid cells in which the study participant spends the largest proportion of time. When $\alpha \in \{0.1, 0.2, 0.3, 0.4\}$, $G_{\mathrm{grid}}(L_\alpha)$ has one connected component, which implies that the grid cells that belong to $L_\alpha$ are spatially adjacent, and define a single area in which the study participant spends larger amounts of time.
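The number of connected components of $G_{\mathrm{grid}}(L_\alpha)$ can be counted with a breadth-first search in which two cells are neighbors when they share an edge or a corner. The Python sketch below assumes that each grid cell is identified by a hypothetical `(row, column)` integer pair; the function name and the data layout are illustrative choices, not the authors' implementation.

```python
from collections import deque
from typing import Iterable, Set, Tuple

Cell = Tuple[int, int]  # assumed (row, column) index of a grid cell


def count_connected_components(cells: Iterable[Cell]) -> int:
    """Count connected components of the subgraph induced by `cells`,
    where cells sharing an edge or a corner are adjacent."""
    remaining: Set[Cell] = set(cells)
    components = 0
    while remaining:
        # Start a new component from any cell not yet visited.
        queue = deque([remaining.pop()])
        components += 1
        while queue:
            row, col = queue.popleft()
            # Visit the (up to) eight surrounding cells.
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    neighbor = (row + d_row, col + d_col)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        queue.append(neighbor)
    return components


# Toy level set: a diagonal pair, a horizontal pair, and an isolated cell.
toy_cells = {(0, 0), (1, 1), (5, 5), (5, 6), (9, 0)}
toy_components = count_connected_components(toy_cells)
```

Because only the cells in the level set are stored, the search never touches the full grid of $4000^2$ cells.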
For $\alpha \in \{0.1, 0.2, 0.3, 0.4\}$, the values of $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(\gamma)$ are less than 20 weeks, which represents the length of observation time needed for reliably detecting this single spatial area. For $\alpha \in \{0.5, 0.6\}$, $G_{\mathrm{grid}}(L_\alpha)$ has two connected components, and for $\alpha \in \{0.7, 0.8\}$, $G_{\mathrm{grid}}(L_\alpha)$ has three connected components. Thus this study participant spends their time in grid cells that define two or three spatially contiguous areas. Since these areas include grid cells in which the study participant spends smaller proportions of their weekly time, the length of the observation time needed to identify these areas doubles to about 40 weeks. For $\alpha = 1$, $G_{\mathrm{grid}}(L_\alpha)$ has 72 connected components because $L_\alpha$ includes grid cells in which the study participant spends very little time. Figure 3 shows that approximately 70 weeks of observation time are needed to detect these grid cells. The same type of plots constructed for other study participants show similar relationships between $\alpha$, $L_\alpha$, and $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(0.2)$.

Next we want to determine whether the temporal stability of activity distributions varies by the demographic characteristics of the population. We group the study participants by sex (male, female) and age group (young age 15–34 years old, middle age 35–54 years old, and old age ≥ 55 years old). For each of these five demographic groups, we calculated the average of the last crossing times of the level sets of the activity distribution $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(0.2)$ for every $\alpha \in \{0.1, 0.2, \ldots, 1\}$. The resulting curves are presented in Figure 4. The last crossing times at all levels are similar for men and women (see the top left panel). As such, there do not seem to be any sex-based differences in the temporal stability of the mobility patterns of men and women who live in Switzerland.
However, since Switzerland is known to be a country with very high equality between the two sexes, this finding might not extend to other countries with profound sex inequality. In the top right and bottom panels of Figure 4, we find evidence that the average last crossing times decrease with age, especially for levels below 0.5. This means that mobility patterns are more regular, and consequently are more temporally stable, for older study participants compared to younger study participants. The average last crossing times are larger, and become very similar across demographic groups, for levels above 0.5 compared to levels below 0.5. Thus study participants that belong to any of the five demographic groups also visit locations that they do not typically visit. Longer observation periods are needed to successfully determine these locations. Nevertheless, in order to identify the areas in which study participants spend most of their time, Figure 4 suggests that 10 weeks of observation of GPS locations should suffice for individuals older than 55. Middle age individuals require about 15 weeks of observation time, while young individuals require about 20 weeks.

Figure 3. Values of the LCT-level sets $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(0.2)$ for $\alpha \in \{0.1, 0.2, \ldots, 1\}$ for an MDC study participant. The unit of time is weeks. The number of connected components of $G_{\mathrm{grid}}(L_\alpha)$ defined by the $\alpha$-level sets $L_\alpha$ are shown above the curve.

4. Discussion

The contribution we made in this paper is twofold. On the theoretical side, we proposed the use of last crossing time processes associated with spatiotemporal trajectories of individuals to assess the temporal stability of their mobility patterns.
We defined several measures of the temporal dynamics of spatiotemporal trajectories based on the average velocity process, and on human activity distributions in a spatial observation window. We defined the ordinary and the conservative proportional time estimators of human activity distributions, and proved that they are consistent and asymptotically equivalent. We introduced the time period and the ranking time period activity distributions that capture the change in human activity distributions across time periods. We presented related estimators based on GPS location data.

Figure 4. Mean values and 90% confidence intervals of the LCT-level sets $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(0.2)$ for $\alpha \in \{0.1, 0.2, \ldots, 1\}$ calculated for five demographic groups: sex (male, female), and age (young, middle, old).

On the empirical side, we analyzed GPS location data collected over a period of 18 months. The previous empirical study [43] that focused on assessing the duration of GPS studies is based on data collected over 30 days. By using our new statistical methods and GPS data collected over a much longer period of time, we determined that GPS monitoring needs to be done for at least 15 weeks, which represents a minimum study duration about 7 times longer than the 14-day minimum duration recommended in [43]. We also put forward the idea that the duration of GPS studies should be assessed by demographic groups. We determined that younger population groups should be monitored for longer periods of time compared to middle age population groups because of their more irregular patterns of mobility. On the other hand, shorter monitoring periods might be needed for older population groups that exhibit mobility patterns that are temporally more stable.
We also suggest using our methods to assess the need for different time spans of GPS monitoring for men and women in countries with a known history of inequality between the two sexes. To the best of our knowledge, differential periods of GPS data collection based on demographic groups have not been discussed before. Our work suggests that GPS study designs should take demographic groups into account.

Funding

The work of Z.D. and A.D. was partially supported by the National Science Foundation Grant DMS/MPS-1737746 to University of Washington. Y.C. received partial support from the National Science Foundation Grant DMS-1810960 and National Institutes of Health Grant U01-AG016976. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgment

Portions of the research in this paper used the MDC Database made available by Idiap Research Institute, Switzerland and owned by Nokia.

Appendix A. Proofs of theoretical results

A.1. Proof of Theorem 2.1.

Proof. We note that the ordinary proportional time estimator in Eq. (11) can be written as

$$\hat{\pi}_{o,j} = \frac{\frac{1}{2} \sum_{i=2}^{n-1} (t_{i+1} - t_{i-1}) \, \mathbf{1}(g_i = G_j)}{\frac{1}{2} (T + t_n - t_1)}, \tag{18}$$

where $T = t_{\max} - t_{\min}$. We will first show that the denominators of $\hat{\pi}_{o,j}$ and $\hat{\pi}_{c,j}$ are asymptotically the same. Assumption (S2) implies that $\frac{1}{2}(T + t_n - t_1) \to T$, which shows the asymptotic behavior of the denominator of $\hat{\pi}_{o,j}$. For $\hat{\pi}_{c,j}$, we have

$$\begin{aligned}
\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i = g_{i-1}) &= \sum_{i=2}^{n} (t_i - t_{i-1}) - \sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i \neq g_{i-1}), \\
&= T - \sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i \neq g_{i-1}), \\
&\geq T - M \max_i |t_{i+1} - t_i|, \\
&\to T,
\end{aligned}$$

where $M$ is the constant from assumption (S3). The limit in the above equation is due to assumption (S1).
Thus, the denominators of $\hat{\pi}_{o,j}$ and $\hat{\pi}_{c,j}$ are asymptotically the same. Next we focus on the numerators of the two estimators. The numerator of $\hat{\pi}_{c,j}$ can be written as

$$\sum_{i=2}^{n} (t_{i+1} - t_i) \, \mathbf{1}(g_{i+1} = g_i = G_j) = \sum_{i=2}^{n} A_i,$$

where $A_i = (t_{i+1} - t_i) \, \mathbf{1}(g_{i+1} = g_i = G_j)$. Let $B_i = \frac{t_{i+1} - t_{i-1}}{2} \, \mathbf{1}(g_i = G_j)$. Using Eq. (18), the numerator of $\hat{\pi}_{o,j}$ can be written as

$$\frac{1}{2} \sum_{i=2}^{n-1} (t_{i+1} - t_{i-1}) \, \mathbf{1}(g_i = G_j) = \sum_{i=2}^{n-1} B_i.$$

When $g_{i-1} = g_i = g_{i+1} = G_j$, we have $2 B_i = A_i + A_{i-1}$. By assumption (S3), there are at most $2M$ time points $t_i$ such that the equality $g_{i-1} = g_i = g_{i+1} = G_j$ does not hold. Thus

$$\sum_{i=2}^{n-1} B_i \, \mathbf{1}(g_{i-1} = g_i = g_{i+1} = G_j) \geq \sum_{i=2}^{n-1} B_i - 2M \cdot \max_i |t_{i+1} - t_i|,$$

which implies that

$$\begin{aligned}
\hat{\pi}_{o,j} &\to \frac{1}{T} \sum_{i=2}^{n-1} B_i \, \mathbf{1}(g_{i-1} = g_i = g_{i+1} = G_j), \\
&= \frac{1}{T} \sum_{i=2}^{n-1} \frac{A_i + A_{i-1}}{2} \, \mathbf{1}(g_{i-1} = g_i = g_{i+1} = G_j).
\end{aligned} \tag{19}$$

Again, using the fact that there are at most $2M$ time points $t_i$ such that the equality $g_{i-1} = g_i = g_{i+1} = G_j$ does not hold, we obtain

$$\begin{aligned}
\sum_{i=2}^{n-1} A_i \, \mathbf{1}(g_{i-1} = g_i = g_{i+1} = G_j) &\geq \sum_{i=2}^{n} A_i - (2M + 1) \cdot \max_i |t_{i+1} - t_i|, \\
\sum_{i=2}^{n-1} A_{i-1} \, \mathbf{1}(g_{i-1} = g_i = g_{i+1} = G_j) &\geq \sum_{i=2}^{n} A_i - (2M + 1) \cdot \max_i |t_{i+1} - t_i|.
\end{aligned}$$

It follows that

$$\begin{aligned}
\hat{\pi}_{c,j} = \frac{\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i = g_{i-1} = G_j)}{\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i = g_{i-1})} &\to \frac{1}{T} \sum_{i=2}^{n} A_i, \\
&\to \frac{1}{T} \sum_{i=2}^{n-1} \frac{A_i + A_{i-1}}{2} \, \mathbf{1}(g_{i-1} = g_i = g_{i+1} = G_j),
\end{aligned}$$

which is the same limit in Eq. (19) we obtained for $\hat{\pi}_{o,j}$. Therefore the numerators of $\hat{\pi}_{o,j}$ and $\hat{\pi}_{c,j}$ are asymptotically the same, which proves that $\hat{\pi}_{o,j}$ and $\hat{\pi}_{c,j}$ are asymptotically equal. □

A.2. Proof of Theorem 2.2.

Proof. Theorem 2.1 proves that the two estimators are asymptotically equivalent.
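As a numerical illustration of the equivalence stated in Theorem 2.1, the Python sketch below evaluates both estimators on a densely sampled toy trajectory. It rests on several assumptions: the ordinary estimator is coded in the form printed in Eq. (18), so any endpoint terms of Eq. (11) are not reproduced; the conservative estimator is coded as the ratio that appears in the proof of Theorem 2.1; and the function names, the cell labels, and the toy data are hypothetical.

```python
from typing import Hashable, Sequence


def ordinary_estimator(times: Sequence[float], cells: Sequence[Hashable],
                       target: Hashable, t_min: float, t_max: float) -> float:
    """Ordinary proportional time estimator in the form of Eq. (18)."""
    n = len(times)
    # Interior observations only: each gets half the span between its neighbors.
    numerator = 0.5 * sum(
        times[i + 1] - times[i - 1]
        for i in range(1, n - 1)
        if cells[i] == target
    )
    denominator = 0.5 * ((t_max - t_min) + times[-1] - times[0])
    return numerator / denominator


def conservative_estimator(times: Sequence[float], cells: Sequence[Hashable],
                           target: Hashable) -> float:
    """Conservative estimator: only gaps whose two endpoints lie in the
    same grid cell are counted."""
    numerator = 0.0
    denominator = 0.0
    for i in range(1, len(times)):
        if cells[i] == cells[i - 1]:
            gap = times[i] - times[i - 1]
            denominator += gap
            if cells[i] == target:
                numerator += gap
    if denominator == 0.0:
        raise ValueError("no consecutive observations share a grid cell")
    return numerator / denominator


# Toy trajectory: one observation per time unit on [0, 100]; the individual
# is observed in cell "A" before time 60 and in cell "B" afterwards.
toy_times = [float(k) for k in range(101)]
toy_cells = ["A" if t < 60 else "B" for t in toy_times]
pi_ordinary = ordinary_estimator(toy_times, toy_cells, "A", 0.0, 100.0)
pi_conservative = conservative_estimator(toy_times, toy_cells, "A")
```

With this sampling the two estimates differ by less than 0.01 and both are close to 0.6, the approximate share of time spent in cell "A". Shrinking the gaps between observations makes the difference smaller, in line with assumption (S1).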
Given this equivalence, we only need to derive the convergence of one of the two estimators to the true activity distribution $\pi = (\pi_1, \ldots, \pi_N)$ from Eq. (9). In what follows we focus on the conservative proportional time estimator. Without loss of generality, we assume that there exist $K \geq 1$ disjoint time intervals in which the individual is inside grid cell $G_j$, i.e., there are $[a_1, b_1], \ldots, [a_K, b_K]$ such that $a_i < b_i < a_{i+1}$ for $i = 1, \ldots, K - 1$, $t_{\min} \leq a_1$, $b_K \leq t_{\max}$ and

$$\{ t : G(t) \in G_j \} = [a_1, b_1] \cup \cdots \cup [a_K, b_K].$$

Since, in the definition of the true activity distribution $\pi$, $T$ follows a uniform distribution on the reference time frame $[t_{\min}, t_{\max}]$, we can express $\pi_j$ as

$$\pi_j = P(G(T) \in G_j) = \sum_{k=1}^{K} P(T \in [a_k, b_k]) = \frac{1}{T} \sum_{k=1}^{K} (b_k - a_k).$$

As before, $T = t_{\max} - t_{\min}$. For the interval $[a_k, b_k]$, we let $t_{i^*}$ be the first observation time after $a_k$, and $t_{i^{**}}$ be the last observation time before $b_k$:

$$t_{i^*} \geq a_k, \quad t_{i^* - 1} < a_k, \quad t_{i^{**} + 1} > b_k, \quad t_{i^{**}} \leq b_k.$$

Because $G(t) \in G_j$ for all $t \in [a_k, b_k]$, we have $g_i \in G_j$ for all $i \in \{ i^*, i^* + 1, \ldots, i^{**} \}$. The conservative proportional time estimator estimates the length of the interval $[a_k, b_k]$ based on the length of the interval $[t_{i^*}, t_{i^{**}}]$. The corresponding error is

$$\begin{aligned}
| (b_k - a_k) - (t_{i^{**}} - t_{i^*}) | &\leq t_{i^*} - a_k + b_k - t_{i^{**}}, \\
&\leq (t_{i^*} - t_{i^* - 1}) + (t_{i^{**} + 1} - t_{i^{**}}), \\
&\leq 2 \max_{i = 1, \ldots, n-1} |t_{i+1} - t_i| \to 0,
\end{aligned}$$

due to assumption (S1). By applying the above argument to each interval $[a_k, b_k]$, $k = 1, \ldots, K$, we conclude that

$$\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i = G_j) \to \sum_{k=1}^{K} (b_k - a_k).$$
Because

$$\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i = G_j) \geq \sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i = g_{i-1} = G_j) - M \cdot \max_{i = 1, \ldots, n-1} |t_{i+1} - t_i|,$$

we further conclude that

$$\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i = g_{i-1} = G_j) \to \sum_{k=1}^{K} (b_k - a_k).$$

This proves the convergence of the conservative proportional time estimator to the true activity distribution:

$$\begin{aligned}
\hat{\pi}_{c,j} &\to \frac{\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbf{1}(g_i = g_{i-1} = G_j)}{T}, \\
&\to \frac{\sum_{k=1}^{K} (b_k - a_k)}{T}, \\
&= \pi_j.
\end{aligned}$$

□

References

[1] L.A. Basta, T.S. Richmond, and D.J. Wiebe, Neighborhoods, daily activities, and measuring health risks experienced in urban environments, Social Science & Medicine 71 (2010), pp. 1943–1950.
[2] J. Beekhuizen, H. Kromhout, A. Huss, and R. Vermeulen, Performance of GPS-devices for environmental exposure assessment, Journal of Exposure Science and Environmental Epidemiology 23 (2013), pp. 498–505.
[3] D. Berrigan, J.A. Hipp, P.M. Hurvitz, P. James, M.M. Jankowska, J. Kerr, F. Laden, T. Leonard, R.A. McKinnon, T.M. Powell-Wiley, E. Tarlov, S.N. Zenk, and The TREC Spatial and Contextual, Measures and Modeling Work Group, Geospatial and contextual approaches to energy balance and health, Annals of GIS 21 (2015), pp. 157–168.
[4] R.S. Bivand, E. Pebesma, and V. Gómez-Rubio, Applied Spatial Data Analysis with R, Springer, New York, 2013.
[5] C.R. Browning, C.A. Calder, B. Soller, A.L. Jackson, and J. Dirlam, Ecological networks and neighborhood social organization, American Journal of Sociology 122 (2017), pp. 1939–1988.
[6] H. Byrnes, B.A. Miller, C.N. Morrison, D.J. Wiebe, M. Woychik, and S.E. Wiehe, Association of environmental indicators with teen alcohol use and problem behavior: Teens' observations vs. objectively-measured indicators, Health & Place 43 (2017), pp. 151–157.
[7] Y.C. Chen, Generalized cluster trees and singular measures, Annals of Statistics 47 (2019), pp. 2174–2203.
[8] W.J. Christian, Using geospatial technologies to explore activity-based retail food environments, Spatial and Spatio-temporal Epidemiology 3 (2012), pp. 287–295.
[9] R. Courant and F. John, Introduction to Calculus and Analysis, Vol. I, Springer, New York, 1991.
[10] A. Dobra and N.E. Williams, Spatiotemporal detection of unusual human population behavior using mobile phone data, PLoS ONE 10 (2015), p. e0120449.
[11] D.T. Duncan, D.A. Hickson, W.C. Goedel, D. Callander, B. Brooks, Y.T. Chen, H. Hanson, R. Eavou, A.S. Khanna, B. Chaix, S. Regan, D.P. Wheeler, K.H. Mayer, S.A. Safren, M.S. Carr, C. Draper, V. Magee-Jackson, R. Brewer, and J.A. Schneider, International Journal of Environmental Research and Public Health 16 (2019), p. 1922.
[12] D.T. Duncan, F. Kapadia, S.D. Regan, W.C. Goedel, M.D. Levy, S.C. Barton, S.R. Friedman, and P.N. Halkitis, Feasibility and acceptability of Global Positioning System (GPS) methods to study the spatial contexts of substance use and sexual risk behaviors among young men who have sex with men in New York City: A P18 cohort sub-study, PLoS ONE 11 (2016), p. e0147520.
[13] K. Elgethun, M.G. Yost, C.T. Fitzpatrick, T.L. Nyerges, and R.A. Fenske, Comparison of Global Positioning System (GPS) tracking and parent-report diaries to characterize children's time-location patterns, Journal of Exposure Science and Environmental Epidemiology 17 (2007), pp. 196–206.
[14] B. Entwisle, Putting people into place, Demography 44 (2007), pp. 687–703.
[15] C. Graif, A.S. Gladfelter, and S.A. Matthews, Urban poverty and neighborhood effects on crime: Incorporating spatial and network perspectives, Sociology Compass 8 (2014), pp. 1140–1155.
[16] C. Harding, Z. Patterson, L. Miranda-Moreno, and S. Zahabi, Modeling the effect of land use on activity spaces, Transportation Research Record: Journal of the Transportation Research Board 2323 (2012), pp. 67–74.
[17] Y. Kestens, A. Lebel, M. Daniel, M. Thériault, and R. Pampalon, Using experienced activity spaces to measure foodscape exposure, Health & Place 16 (2010), pp. 1094–1103.
[18] N. Kiukkonen, J. Blom, O. Dousse, D. Gatica-Perez, and J. Laurila, Towards Rich Mobile Phone Datasets: Lausanne Data Collection Campaign, in Proc. ACM Int. Conf. on Pervasive Services (ICPS), Berlin, July 2010.
[19] J.A. Kopec, Concepts of disability: The activity space model, Social Science & Medicine 40 (1996), pp. 649–656.
[20] L.J. Krivo, H.M. Washington, R.D. Peterson, C.R. Browning, C.A. Calder, and M.P. Kwan, Social isolation of disadvantage and advantage: The reproduction of inequality in urban space, Social Forces 92 (2013), pp. 141–164.
[21] M.P. Kwan, The uncertain geographic context problem, Annals of the Association of American Geographers 102 (2012), pp. 958–968.
[22] M.P. Kwan, Beyond space (as we knew it): Toward temporally integrated geographies of segregation, health, and accessibility, Annals of the Association of American Geographers 103 (2013), pp. 1078–1086.
[23] J.K. Laurila, D. Gatica-Perez, I. Aad, J. Blom, O. Bornet, T. Do, O. Dousse, J. Eberle, and M. Miettinen, The Mobile Data Challenge: Big Data for Mobile Computing Research, in Proc. Mobile Data Challenge Workshop (MDC) in conjunction with Int. Conf. on Pervasive Computing, Newcastle, June 2012.
[24] J.K. Laurila, D. Gatica-Perez, I. Aad, J. Blom, O. Bornet, T.M.T. Do, O. Dousse, J. Eberle, and M. Miettinen, From big smartphone data to worldwide research: The Mobile Data Challenge, Pervasive and Mobile Computing 9 (2013), pp. 752–771.
[25] J.H. Lee, A.W. Davis, S.Y. Yoon, and K.G. Goulias, Activity space estimation with longitudinal observations of social media data, Transportation 43 (2016), pp. 955–977.
[26] S.A. Matthews and T.C. Yang, Spatial polygamy and contextual exposures (SPACEs): Promoting activity space approaches in research on place and health, The American Behavioral Scientist 57 (2013), pp. 1057–1081.
[27] J.D. Mazimpaka and S. Timpf, Trajectory data mining: A review of methods and applications, Journal of Spatial Information Science 13 (2016), pp. 61–99.
[28] H. Miller, Place-based versus people-based Geographic Information Science, Geography Compass 1 (2007), pp. 503–535.
[29] C.N. Morrison, H.F. Byrnes, B.A. Miller, E. Kaner, S.E. Wiehe, W.R. Ponicki, and D. Wiebe, Assessing individuals' exposure to environmental conditions using residence-based measures, activity location-based measures, and activity path-based measures, Epidemiology 30 (2019), pp. 166–176.
[30] T.H. Newsome, W.A. Walcott, and P.D. Smith, Urban activity spaces: Illustrations and application of a conceptual model for integrating the time and space dimensions, Transportation 25 (1998), pp. 357–377.
[31] A.J. Noah, Putting families into place: Using neighborhood-effects research and activity spaces to understand families, Journal of Family Theory & Review 7 (2015), pp. 452–467.
[32] B.K. Paul, Female activity space in rural Bangladesh, Geographical Review 82 (1992), pp. 1–12.
[33] C. Perchoux, B. Chaix, S. Cummins, and Y. Kestens, Conceptualization and measurement of environmental exposure in epidemiology: Accounting for activity space related to daily mobility, Health & Place 21 (2013), pp. 86–93.
[34] D.B. Richardson, N.D. Volkow, M.P. Kwan, R.M. Kaplan, M.F. Goodchild, and R.T. Croyle, Spatial turn in health research, Science 339 (2013), pp. 1390–1392.
[35] S. Schonfelder and K.W. Axhausen, Activity spaces: Measures of social exclusion?, Transport Policy 10 (2003), pp. 273–286.
[36] J.E. Sherman, J. Spencer, J.S. Preisser, W.M. Gesler, and T.A. Arcury, A suite of methods for representing activity space in a healthcare accessibility study, International Journal of Health Geographics 4 (2015), p. 24.
[37] L.K. VanWey, R.R. Rindfuss, M.P. Gutmann, B. Entwisle, and D.L. Balk, Confidentiality and spatially explicit data: Concerns and challenges, Proceedings of the National Academy of Sciences 102 (2005), pp. 15337–15342.
[38] G.M. Vazquez-Prokopec, D. Bisanzio, S.T. Stoddard, V. Paz-Soldan, A.C. Morrison, J.P. Elder, J. Ramirez-Paredes, E.S. Halsey, T.J. Kochel, and T.W. Scott, Using GPS technology to quantify human mobility, dynamic contacts and infectious disease dynamics in a resource-poor urban environment, PLoS ONE 8 (2013), p. e58802.
[39] L.A. Waller and C.A. Gotway, Applied Spatial Statistics for Public Health Data, John Wiley & Sons, Hoboken, NJ, 2004.
[40] L. Wasserman, All of Nonparametric Statistics, Springer Texts in Statistics, Springer, New York, 2007.
[41] S.E. Wiehe, M.P. Kwan, J. Wilson, and J.D. Fortenberry, Adolescent health-risk behavior and community disorder, PLoS ONE 8 (2013), p. e77667.
[42] S.E. Wiehe, A.E. Carroll, G.C. Liu, K.L. Haberkorn, S.C. Hoch, J.S. Wilson, and J.D. Fortenberry, Using GPS-enabled cell phones to track the travel patterns of adolescents, International Journal of Health Geographics 7 (2008), p. 22.
[43] S.N. Zenk, S.A. Matthews, A.N. Kraft, and K.K. Jones, How many days of Global Positioning System (GPS) monitoring do you need to measure activity space environments in health research?, Health & Place 51 (2018), pp. 52–60.
[44] S.N. Zenk, A.J. Schulz, S.A. Matthews, A. Odoms-Young, J. Wilbur, L. Wegrzyn, K. Gibbs, C. Braunschweig, and C. Stokes, Activity space environment and dietary and physical activity behaviors: A pilot study, Health & Place 17 (2011), pp. 1150–1161.

* Department of Statistics, University of Washington, Seattle, WA, USA; † Department of Sociology, University of Washington, Seattle, WA, USA
