Social Interactions in Large Networks: A Game Theoretic Approach
This paper studies social interactions in a game theoretic model with players in a large social network. We consider observations from one single equilibrium of a large network game with asymmetric information, in which each player chooses an action …
Authors: Haiqing Xu
SOCIAL INTERACTIONS IN LARGE NETWORKS: A GAME THEORETIC APPROACH*

HAIQING XU

ABSTRACT. This paper studies social interactions in a game theoretic model with players in a large social network. We consider observations from one single equilibrium of a large network game with asymmetric information, in which each player chooses an action from a finite set and is subject to interactions with her friends. Simple assumptions about the structure are made to establish the existence and uniqueness of equilibrium. In particular, we show that the equilibrium strategies satisfy a network decaying dependence (NDD) condition requiring that dependence between any two players' decisions decays with their network distance. The formulation of such an NDD property is novel and serves as the basis for statistical inference. Further, we establish the identification of the structural model and introduce a computationally feasible and efficient estimation method. We illustrate the estimation method with an actual application to college attendance, as well as in Monte Carlo experiments.

Keywords: Local interaction, social networks, Bayesian Nash Equilibrium, network decaying dependence condition, approximated maximum likelihood estimation

JEL: C14, C35, C62 and C72

Date: October 28, 2016.
Department of Economics, The University of Texas at Austin, BRB 3.160, Mailcode C3100, Austin, TX, 78712, h.xu@austin.utexas.edu.

* A previous version of this paper was circulated under the title "Social interactions: a game theoretic approach". I gratefully acknowledge Kalyan Chatterjee, Sung Jae Jun, Isabelle Perrigne, Joris Pinkse, Quang Vuong and Neil Wallace for their guidance and advice.
I also thank Jason Abrevaya, Victor Aguirregabiria, Edward Coulson, Stephen Donald, Frank Erickson, Bryan Graham, Ed Green, Paul Grieco, Han Hong, Hiroyuki Kasahara, Konrad Menzel, Margaret Slade, James Tybout, Halbert White, Daniel Xu, and the seminar participants at UBC, Toronto, UT Austin, Boston College, Rice, Ohio State University, Texas A&M, the 2011 Cornell-PSU Macro Seminar, the 2011 North American Econometric Society Summer Meeting, and 2012 Texas Econometrics Camp for providing helpful comments. All remaining errors are mine.

1. INTRODUCTION

Over the last decades, network effects on social behaviors have become important in social theory (see e.g. Granovetter, 1985). In particular, economics has been encouraged to broaden its scope to the analysis of social interactions while maintaining the rigor that is emblematic of economic analysis (Manski, 2000). Recently, game theory has played a central role in this regard, and a leading example is the study of network formation by Bala and Goyal (2000). In this paper, we propose a network game of incomplete information to study large-network-based social interactions. Simple assumptions about the game structure are made to ensure a unique equilibrium, and the equilibrium satisfies a decaying network effects condition. We then establish identification and estimation of the structural primitives using data from a single large network.

The structure of our network game follows the "preference interaction" approach suggested by Manski (2000). Specifically, a player's payoff from choosing an action over alternatives depends on other players' simultaneous actions as well. [1] Instead of interacting with all players on the social network, we assume each agent is only affected by the choices of her direct best friends.
We call these "local" interactions, a notion that was first introduced by Seim (2006) in the context of industrial organization. Such a specification is parsimonious, but rich enough to generate the interdependence of all agents' choices, which is shaped by the way the network is connected. For example, teenagers are inclined to be affected by their friends in terms of adolescent risky behaviors (see e.g. Nakajima, 2007), but such local effects can spread through the network. In particular, they are indirectly affected by the behaviors of their friends' friends, because their friends interact with those friends in turn. In equilibrium, all teenagers from the same network will affect each other directly or indirectly.

[1] In particular, we assume each player's payoff relevant covariates (including friend relationship) are public information, but payoff shocks (i.e. the error terms) are private information of the player.

Our local interaction specification differs from the "linear-in-mean" approach widely used in the literature on social interactions (see e.g. Manski, 1993). The latter captures the notion that an individual's behavior depends on the average behavior of all other social members. The local interaction approach is attractive in the study of large-network-based social interactions: First, our model allows one to study the counterfactuals and the policy effects from a change of the network graph. In contrast, much of the theoretical literature on network interventions has long focused attention on qualitative features like stability, but not on quantitative effects. A second advantage is that an equilibrium in our local interaction model exhibits features that reflect how the large network connects players to each other.
Last but not least, the strategic effects between any pair of friends in our model are not diluted by the large size of the network, whereas such dilution is a typical feature of linear-in-mean models.

By restricting the interaction strength to be sufficiently mild, we establish the uniqueness of the equilibrium. Uniqueness of the equilibrium is always crucial and of interest to both the theoretical and empirical sides of game theory. In the presence of multiple equilibria, it is quite difficult to characterize the whole set of equilibria in a large network game: When the network is not regular, we cannot use a Markov-type equilibrium solution concept to simplify empirical analysis as in dynamic model inference. Another more fundamental concern is the "incompleteness" of the econometric model due to the existence of multiple equilibria (see Tamer, 2003). In the empirical game literature, uniqueness can also be found in, e.g., Brock and Durlauf (2001) and Xu (2014).

When there are strategic interactions among friends in a large network, players' choices are mutually dependent on each other. Intuitively, one would expect such a dependence to decay with the network distance. Under primitive conditions on the strategic components, we show that the dependence of a player's equilibrium choice on her friends' choices decays (exponentially) with network distance, a so-called "network decaying dependence condition" (NDD condition) that more or less amounts to the restrictions for a stationary solution in the autoregressive model. Our NDD condition is related to a number of dependence decay conditions used in time series and spatial analysis (see e.g. Jenish and Prucha, 2009). When the data come from the equilibrium of a single large network game, all observations are dependent on each other due to strategic interactions.
The NDD implies that any two players' decisions are closer to being independent the farther apart the players are in the network. The formulation of NDD is novel and serves as the basis for our statistical inference.

For estimation, a key challenge arises because it is costly to solve the equilibrium analytically or numerically. We propose a new approach that approximates the equilibrium solution by solving $n$ (i.e. the number of players) Bayesian games of much smaller size, one for each player. Specifically, for player $i$, a Bayesian game is tailored from the original one by cutting off all those players whose network distances from $i$ are larger than $h$ ($h \in \mathbb{N}$). The set of players left on the sub-network, as well as their payoffs, action space, information structure and so on, defines a smaller-sized Bayesian game. We solve this subnetwork game and use the equilibrium solution of player $i$ to approximate her equilibrium strategy in the original large network game. The tuning parameter $h$ is chosen carefully depending on the network size $n$, i.e., $h$ needs to increase with $n$ at an exponential rate such that approximation errors are negligible for the limiting distribution of the estimator. By this approximation, we then define an approximated maximum likelihood estimator (AMLE), which is shown to be asymptotically equivalent to the infeasible MLE. We use Monte Carlo experiments to illustrate the proposed estimation method, which performs well.

It is worth pointing out that our asymptotic analysis is based on the number of players going to infinity, instead of the infinite repetition of the same game with a small fixed number of players. The latter asymptotic approach is used by most of the existing empirical game literature, e.g. Bjorn and Vuong (1984), Bresnahan and Reiss (1991), Brock and Durlauf (2001) and Tamer (2003).
Our asymptotic analysis applies to observations coming from one or a small number of large networks. In a seminal paper, Menzel (2015a) characterizes the asymptotic distribution of a large matching market. The analysis is similar in spirit to our approach in terms of using the limiting distribution as the number of players goes to infinity to approximate the distribution of the equilibrium in a large game. An important difference is that in Menzel (2015a) the strategic effects that cause the endogeneity issue become negligible as the number of players increases to infinity, which is not the case in our asymptotic analysis.

We apply our methods to study college attendance decisions of high school students by using the data from the National Longitudinal Study of Adolescent Health (Add Health). The Add Health data is a longitudinal survey containing a nationally representative sample of adolescents in the United States during the 1994-95 school year. A unique feature of the Add Health data is the availability of respondents' social network information, which is reconstructed from students' best friend nominations in the survey. Applying the proposed estimation procedure, we find statistically significant, positive peer effects, which have a similar scale to other empirical findings of peer effects on youth behaviors using similar or earlier datasets. See e.g. Calvó-Armengol, Patacchini, and Zenou (2009) and Gaviria and Raphael (2001).

The rest of the paper is organized as follows. In Section 2, we describe the data and provide descriptive statistics. In Section 3, we introduce the network game model and establish the uniqueness of the BNE and the NDD condition. In Section 4, we establish identification of the model. In Section 5, we propose an estimation procedure.
To present basic ideas, we first show the proposed estimator is asymptotically equivalent to the MLE and derive its asymptotic properties. We then study its finite sample performance by using Monte Carlo experiments. Applying the proposed method, we also present the baseline coefficient estimates and compare them with alternative empirical results in the literature. Section 6 concludes. Proofs are provided in the Appendix.

2. DATA

We study peer effects on college attendance of high school students using data from the National Longitudinal Study of Adolescent Health (Add Health), which is a longitudinal survey containing a nationally representative sample of adolescents in the United States during the 1994-95 school year. A unique feature of the Add Health data is the availability of respondents' social network information, as well as their social and economic characteristics (including college attendance): Each respondent provides his or her friendship information by nominating at most five male and five female best friends. Intuitively, one can then reconstruct the whole friendship network among respondents. All the respondents in our empirical study come from three high school networks and the total number of observations is $n = 831$. A detailed description of the data can be found on the website of the Carolina Population Center. [2]

[2] See http://www.cpc.unc.edu/projects/addhealth/data.

The college attendance decisions must have been made by individual families during a short period. Following the literature (see e.g. Christensen, Melder, and Weisbrod, 1975; Leslie and Brinkman, 1988), the exogenous covariates that affect college attendance include age, household income, GPA, parents' education level, race, gender, etc. Descriptive statistics are presented in Table 1.
The demographic variables, i.e., Household Income, Mother's Education and Father's Education, are recorded by codes. These codes are natural numbers increasing with the actual value of the variables. The median Household Income is between $50,000 and $74,999. Mother's Education and Father's Education are coded as 1 = never went to school, 2 = did not graduate from high school, 3 = high school graduate, 4 = graduated from college or a university, 5 = professional training beyond a four-year college. There is a severe missing data issue in these two variables: we treat missing observations as value 0. For the observed subsample, the medians of Mother's Education and Father's Education are both high school graduate. Over the whole sample, however, both medians are zero. As a matter of fact, we only use observations from the three largest schools. For schools with small numbers of respondents, the missing data issue is severe. Therefore, the descriptive statistics in Table 1 are slightly different from other studies on social interactions that use the whole Add Health dataset (see e.g. Calvó-Armengol, Patacchini, and Zenou, 2009).

The number of friends and the network centrality are two descriptive statistics on the network structure. Player $i$'s network centrality is defined by the number of players who take $i$ as a friend, i.e., $\sum_{j \neq i} \mathbf{1}(i \in F_j)$. In the data, the standard deviation of the number of friends is less than the standard deviation of the network centrality, which is a typical feature in social networks.

3. THE MODEL

Following our empirical application, we consider a game theoretic model on social interactions of high school students' college attendance decisions. All these students are denoted as players indexed by $i \in N \equiv \{1, \cdots, n\}$, with exogenously determined locations on the school network.
Using the terminology in graph theory, a vertex of the network denotes a student and a directed edge connects vertex $i$ to $j$ if student $j$ is one of $i$'s best friends. Following the network distribution theory (see e.g. Barabási and Albert, 1999), we can view the high school network as a random graph with vertex connectivities governed by some probability distribution, and the observed network in our data is a single realization of the large random network. Given the network, we denote by $F_i$ the group of $i$'s best friends, i.e., the set of students to whom $i$ is directly connected. Note that friendship may not be symmetric, i.e., $j \in F_i$ does not necessarily imply $i \in F_j$, which is an important feature in our data. Moreover, let $Q_i = \# F_i$ be the number of $i$'s best friends.

TABLE 1. Descriptive Statistics: 3 school networks; Year 1994-1995

Variable               Mean     Std. Dev.   Min    Max
Age                    17.088   1.138       15     21
Female                 0.502    0.500       0      1
Household Income       8.827    2.122       1      12
Mother's Education*    0.516    1.676       0      11
Father's Education     1.709    2.955       0      12
Overall GPA            2.376    0.772       0.11   4
American Indian**      0.039    0.193       0      1
Asian                  0.140    0.347       0      1
Black                  0.084    0.278       0      1
Hispanic               0.348    0.477       0      1
White                  0.651    0.477       0      1
Other Race             0.153    0.360       0      1
Number of Friends      1.303    1.575       0      8
Network Centrality     1.303    1.780       0      13
College Attendance     0.535    0.499       0      1

*Missing observations have been treated as 0.
**Note that some observations are associated with more than one race.

In our game theoretic model, we assume the school network structure is public information. Therefore, $F_i$ is also public information. Moreover, we assume each player $i$ simultaneously chooses a discrete action $Y_i \in A \equiv \{0, 1, 2, \ldots, K\}$. Following the convention, let $Y_{-i}$ denote a profile of actions of all other players.
Let further $X_i \in \mathcal{S}_X \subseteq \mathbb{R}^d$ be a vector of player $i$'s payoff relevant state variables, which are publicly observed by all players, as well as the researcher. Further, player $i$ observes a vector of action-specific payoff shocks labeled as $\epsilon_i \equiv (\epsilon_{i0}, \cdots, \epsilon_{iK}) \in \mathbb{R}^{K+1}$. We assume that $\epsilon_i$ is $i$'s private information, i.e., $\epsilon_i$ is not observed by any $j \neq i$. [3] In our empirical application, $Y_i$ is binary, indicating college attendance, where action 1 denotes college attendance; $X_i$ is a vector of demographic variables including e.g. age, gender, GPA, parents' education, household income and race; moreover, $\epsilon_i$ is an idiosyncratic preference shock for college attendance. For expositional simplicity, we denote all the public state variables associated with student $i$ by $S_i \equiv (X_i', F_i)'$.

[3] It should be noted that our specification rules out unobserved heterogeneity, i.e., variables observed by all the players but not by the researcher.

Players interact with each other through their utilities. Specifically, we specify player $i$'s payoff from choosing an action $k \in A$ as follows

$$U_{ik}(Y_{-i}, S_i, \epsilon_i) = \beta_k(X_i) + \sum_{j \in F_i} \alpha_k(Y_j, X_i, Q_i) + \epsilon_{ik}, \tag{1}$$

where $\beta_k(\cdot)$ is a choice-specific function, and $\alpha_k(\cdot, \cdot, \cdot)$ measures the strategic effects on $i$'s payoff (of choosing $k$) from her friend $j$'s decision. In our specification, the strategic effects depend on the state variable $X_i$ as well as $Q_i$. Because only the differences of choice-specific payoffs matter to decision makers, w.l.o.g., we normalize the mean utility of action 0 by letting $\beta_0(x) = \alpha_0(\ell, x, q) = 0$ for all $x \in \mathcal{S}_X$, $\ell \in A$ and $q \in \mathbb{N}$. Let $\theta_k = (\beta_k, \alpha_k)'$ and $\theta = (\theta_1', \cdots, \theta_K')'$ be the structural parameters of the game, which are unknown functions.
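To make eq. (1) concrete, the following minimal Python sketch evaluates $U_{ik}$ for one player given realized actions of the others. It is an illustration only: the functions `beta` and `alpha`, the three-player network, and every number are hypothetical choices made here (including setting the effect of a friend choosing action 0 to zero), not the paper's specification or estimates.

```python
import numpy as np

def payoff(k, i, Y, X, friends, beta, alpha, eps):
    """U_ik from eq. (1): own index + sum of strategic effects over i's friends + shock."""
    q = len(friends[i])  # Q_i = #F_i, the number of friends i nominates
    peer = sum(alpha(k, Y[j], X[i], q) for j in friends[i])
    return beta(k, X[i]) + peer + eps[i, k]

# Hypothetical primitives for a binary action set A = {0, 1}.
def beta(k, x):
    return 0.0 if k == 0 else 0.5 * x[0]           # beta_0 = 0 by normalization

def alpha(k, l, x, q):
    return 0.0 if (k == 0 or l == 0) else 1.0 / q  # alpha_0 = 0 by normalization

X = np.array([[1.0], [2.0], [0.0]])  # one covariate per player
friends = [[1, 2], [0], []]          # directed: player 0 nominates 2, but 2 nominates nobody
Y = np.array([0, 1, 1])              # realized actions (Y[0] does not enter player 0's payoff)
eps = np.zeros((3, 2))               # payoff shocks switched off for clarity

u = [payoff(k, 0, Y, X, friends, beta, alpha, eps) for k in (0, 1)]
best = int(np.argmax(u))             # player 0's best action given Y_{-0}
```

Only the actions of players in $F_0 = \{1, 2\}$ enter player 0's payoff, which is the "local" feature of the specification; the asymmetric friendship list mirrors the remark that $j \in F_i$ need not imply $i \in F_j$.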
It is worth pointing out that our model can be extended to allow for exogenous interaction effects, i.e., player $i$'s payoffs $U_{ik}$ depend on $X_j$ for all $j \in F_i$. See e.g. Manski (1993) and Bramoullé, Djebbari, and Fortin (2009). Our approach could be modified to accommodate such an extension. [4] In our empirical application on college attendance, high school students are less likely to make their attendance decisions according to friends' demographic variables (e.g. Household Income, Parents' Education level and Overall GPA). On the other hand, such payoff relevant covariates of friends can affect each individual's decision indirectly through her expectation of friends' equilibrium choices.

[4] We thank a referee for this point.

In our setting, direct interactions on payoffs only occur among friends. Although interactions are local, strategic effects can spread throughout the whole network if no subnetwork is isolated. For instance, each player needs to consider the decisions of the friends of her friends. This is because those decisions are relevant to her friends' choices, which in turn affect her payoffs. In equilibrium, each player's strategy depends on all other players' public observables $\{(X_j, F_j)\}_{j \neq i}$ as well as her own state variables $(X_i, F_i)$. [5]

3.1. Bayesian Nash Equilibrium (BNE).

Let $S_n = (S_1, \cdots, S_n)$ be all the public information in the network game. For simplicity, we will suppress the subscript $n$ in $S_n$ unless the subscript is necessary. To discuss the equilibrium solution, we fix the public state variable $S$. In this Bayesian game, player $i$'s strategy is a function $r_i(\cdot \mid S; \theta)$ that maps her private information $\epsilon_i$ to a discrete choice $Y_i$.
Following the BNE solution concept, player $i$'s equilibrium strategy, denoted as $r_i^*$, maximizes her (conditional) expected payoff given all other players' equilibrium strategies $r_{-i}^*$, i.e.,

$$r_i^*(\epsilon_i \mid S; \theta) = \arg\max_{k \in A} E\left[U_{ik}(Y_{-i}, S_i, \epsilon_i) \mid S, \epsilon_i\right] = \arg\max_{k \in A} \left[ \beta_k(X_i) + \sum_{\ell=0}^{K} \Big\{ \alpha_k(\ell, X_i, Q_i) \times \sum_{j \in F_i} P\big(r_j^*(\epsilon_j \mid S; \theta) = \ell \mid S, \epsilon_i\big) \Big\} + \epsilon_{ik} \right], \quad \forall i. \tag{2}$$

Thus, eq. (2) defines a simultaneous equation system in terms of $r_1^*, \cdots, r_n^*$. To characterize the BNE solution, we first make the following assumption on the private information $\epsilon_i$.

Assumption A. Let $\epsilon_{ik}$ be i.i.d. across both actions and players and conform to an extreme value distribution with a density function $f(t) = \exp(-t) \exp[-\exp(-t)]$.

Assumption A has been widely assumed in the discrete choice model literature, as well as in empirical discrete games (see, e.g., Brock and Durlauf, 2002; Bajari, Hong, Krainer, and Nekipelov, 2010).

The independence of $\epsilon_i$ across players in assumption A implies that players' equilibrium choices are conditionally independent given $S$. Therefore, the network dependences of players' decisions are all characterized by the dependence of players' equilibrium strategies $r_i^*$ on the common state variable $S$.

[5] A recent work by Manresa (2013) develops a reduced form to assess the dependence structure from social interactions in a linear setting.

By assumption A, we can rewrite (2) in terms of equilibrium choice probabilities. Let $\sigma_{ik}^*(S; \theta) = P(r_i^*(\epsilon_i \mid S; \theta) = k \mid S)$ and $\sigma_i^*(S; \theta) = (\sigma_{i0}^*(S; \theta), \cdots, \sigma_{iK}^*(S; \theta))'$ be the equilibrium choice probabilities of action $k$ and the action profile, respectively. Let further $\Sigma^*(S; \theta) = (\sigma_1^*(S; \theta), \cdots, \sigma_n^*(S; \theta))$ be the equilibrium choice probability profiles of all players.
By (2) and assumption A, we have

$$\sigma_{ik}^*(S; \theta) = \frac{\exp\Big[\beta_k(X_i) + \sum_{\ell=0}^{K} \big\{ \alpha_k(\ell, X_i, Q_i) \times \sum_{j \in F_i} \sigma_{j\ell}^*(S; \theta) \big\}\Big]}{1 + \sum_{q=1}^{K} \exp\Big[\beta_q(X_i) + \sum_{\ell=0}^{K} \big\{ \alpha_q(\ell, X_i, Q_i) \times \sum_{j \in F_i} \sigma_{j\ell}^*(S; \theta) \big\}\Big]}, \quad \forall i, k. \tag{3}$$

Note that solving the BNE solution $\{r_1^*(\cdot \mid S; \theta), \cdots, r_n^*(\cdot \mid S; \theta)\}$ to eq. (2) is equivalent to solving $\{\sigma_1^*(S; \theta), \cdots, \sigma_n^*(S; \theta)\}$ from (3). See Bajari, Hong, Krainer, and Nekipelov (2010).

Equation (3) is the common logit functional form, except for the presence of the equilibrium choice probabilities of $i$'s friends. The existence of a solution follows from Brouwer's fixed point theorem. Next, we establish the uniqueness of the equilibrium, and then show the equilibrium satisfies a decaying dependence condition. Both uniqueness and the decaying dependence condition are crucial for our empirical analysis.

3.2. Unique equilibrium.

The insight for deriving the unique equilibrium comes from the linear spatial autoregressive model literature: Strong interactions among individuals can induce multiple equilibria in a simultaneous equation system. To obtain the uniqueness of the BNE, we need to restrict the interaction strength to be sufficiently mild.

Assumption B. Denote

$$\lambda \equiv \frac{K}{K+1} \cdot \sup_{(x, q) \in \mathcal{S}_{XQ}} \max_{k, m, \ell \in A} q \, \big| \alpha_k(\ell, x, q) - \alpha_m(\ell, x, q) \big|.$$

Let $\lambda < 1$.

For estimation, we parametrize $\alpha_k(\ell, x, q)$ by $\alpha_{k\ell} / q$ for some $\alpha_{k\ell} \in \mathbb{R}$. Then, assumption B becomes

$$\max_{k, m, \ell \in A} |\alpha_{k\ell} - \alpha_{m\ell}| < (K+1)/K.$$

Similar to the requirement that all roots lie outside of the unit circle in spatial autoregressive models, such a condition ensures weak dependence. In our empirical application, each student makes a binary decision for college attendance.
Under the parametrization, assumption B can be rewritten as

$$\max\{ |\alpha_{10} - \alpha_{00}|, \; |\alpha_{11} - \alpha_{01}| \} < 2.$$

Note that $\alpha_{00} - \alpha_{10}$ and $\alpha_{11} - \alpha_{01}$ describe peer effects in social interactions, i.e., the principle that friends benefit from choosing the same action. Intuitively, $\alpha_{00} - \alpha_{10} \geq 0$ and $\alpha_{11} - \alpha_{01} \geq 0$ in our empirical context. Therefore, assumption B requires peer effects to be bounded above. Intuitively, this condition means that the college attendance decisions are mainly determined by students' own social and economic characteristics like GPA, household income etc., and by their idiosyncratic preference shocks on college attendance. If the average probability of friends' college attendance increases by one percentage point, then the peer effect on a student's own college attendance probability is bounded by $\lambda < 1$ percentage point. [6]

Assumption B generally holds in a wide range of empirical studies of youth behaviors, including e.g. substance use, church attendance, academic performance, and academic cheating. See e.g., Gaviria and Raphael (2001), Sacerdote (2001), Kawaguchi (2004), Carrell, Malmstrom, and West (2008) and Calvó-Armengol, Patacchini, and Zenou (2009). In these studies, the effects on a player's equilibrium choice probabilities from her friends' choices are significantly smaller than one. Although the NDD is a natural condition for peer effects in our empirical context, numerous prominent exceptions exist. For example, adolescent risky behaviors like substance (marijuana, alcohol, or tobacco) use are mainly driven by influence from friends. See e.g. Gaviria and Raphael (2001) and Kawaguchi (2004). Another leading example is the butterfly effect widely invoked in e.g. fashion, financial crises and gold rushes, which characterizes the sensitive dependence of players' choices on each other.

Lemma 1. Suppose assumptions A and B hold.
Then, there always exists a unique BNE, regardless of the number of players $n$ or the realization of the state variable $S$.

The proof of the uniqueness of the BNE relies on a contraction mapping argument. We can generalize such a result to the exponential family distribution for the private information $\epsilon_i$.

[6] To see this, note that assumption B ensures that a quasi-Lipschitz condition holds for the best response function: The best response function $\Gamma_i(s_i, \{\sigma_j : j \in F_i\})$ defined by (10) in the appendix satisfies the following condition:
$$\big\| \Gamma_i(s_i, \{\sigma_j : j \in F_i\}) - \Gamma_i(s_i, \{\tilde{\sigma}_j : j \in F_i\}) \big\|_1 \leq \lambda \cdot \max_{j \in F_i} \| \sigma_j - \tilde{\sigma}_j \|_1,$$
where $\| \cdot \|_1$ is the $L_1$-norm. See the proof of Lemma 1.

3.3. Network Decaying Dependence (NDD).

We begin with some notation. For any non-negative integer $h$, let $N_{(i,h)}$ be the subset of players defined inductively: $N_{(i,0)} = \{i\}$ and, for all $h \geq 1$,

$$N_{(i,h)} = N_{(i,h-1)} \cup \Big( \bigcup_{j \in N_{(i,h-1)}} F_j \Big).$$

By definition, $N_{(i,h)}$ is the set of players on the social network within distance $h$ of player $i$ (including $i$ herself). Moreover, let $G_{(i,h)}$ be the network graph that uses vertices and edges to describe all the connections among $N_{(i,h)}$. Let further

$$S_{(i,h)} = \big( \{X_j : j \in N_{(i,h)}\}; \; G_{(i,h)} \big).$$

By definition, $S_{(i,h)}$ describes the subnetwork centered around player $i$ within her $h$-distance, i.e., how these players are connected to each other and what the state variables are at each node of the graph. Note that players' identities do not matter in the definition of $S_{(i,h)}$.

The idea of the NDD condition is to examine how player $i$'s equilibrium choice probability $\sigma_i^*(S; \theta)$ responds to counterfactual changes of some other player $j$'s public state variable $S_j$. Note that in equilibrium $\sigma_i^*(S; \theta)$ depends on all the public information $S$, including $S_j$, no matter whether $j$ is $i$'s friend or not.
In a "stable" equilibrium, intuitively such a dependence should decay with distance. Therefore, the statistical dependence between $Y_i$ and $Y_j$ also diminishes with the network distance between $i$ and $j$.

Definition 1 (Network Decaying Dependence, NDD). In the above network game, the equilibrium satisfies the NDD condition if there exists a deterministic sequence $\{\xi_h : h = 1, \cdots, \infty\}$ with $\xi_h \downarrow 0$ as $h \to \infty$ such that for any given size $n$ of the network and positive integer $h$, we have

$$\sup_{s, s' \in \mathcal{S}_S : \; s_{(i,h)} = s'_{(i,h)}} \big\| \sigma_i^*(s; \theta) - \sigma_i^*(s'; \theta) \big\|_1 \leq \xi_h, \quad \forall i = 1, \cdots, n, \tag{4}$$

where $\| \cdot \|_1$ is the $L_1$-norm, i.e., for any $z \in \mathbb{R}^k$, $\| z \|_1 = \sum_{\ell=1}^{k} |z_\ell|$.

Our notion of NDD is related to the weak dependence in the time-series/spatial literature. In particular, NDD implies the near-epoch dependence (NED) condition in e.g. Andrews (1988). [7] Different from the time-series/spatial statistical literature that assumes weak dependence of unobserved errors across observations, the dependence of players' decisions results from network-based strategic interactions. Conditional on $S$, players' decisions are mutually independent under assumption A.

[7] This is because by eq. (4), $P\big( Y_i \neq E[ Y_i \mid \{(\epsilon_j, S_j) : j \in N_{(i,h)}\} ] \big)$ is bounded by $K \xi_h$.

In Definition 1, NDD requires the causal effects of $S_j$ on $\sigma_i^*$ to be bounded above by $\xi_{\rho(i,j)}$, where $\rho(i,j)$ denotes the network distance from $j$ to $i$. The game size $n$ is treated as a state variable. With NDD (and assumption J to be introduced later), if we increase the network size by repeatedly adding players to the "fringe" of the network, then the equilibrium choice probability for any existing player will converge to a limit. [8] The next lemma shows that the equilibrium in our network game satisfies NDD under weak conditions.

Lemma 2. Suppose assumptions A and B hold. Then the BNE satisfies NDD with $\xi_h = 2 \lambda^{h+1}$.
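Lemmas 1 and 2 can be illustrated numerically. The sketch below is a minimal Python example under assumptions chosen here, not taken from the paper: a binary action ($K = 1$), the parametrization $\alpha_k(\ell, x, q) = \alpha_{k\ell}/q$ with $\alpha_{10} = 0$, a linear index $x'\beta$, and a simulated random friendship network. Under these choices eq. (3) reduces to a logit in the friends' average choice probability and $\lambda = |\alpha_{11}|/2$. The names `solve_bne` and `neighborhood` are hypothetical helpers, and plain fixed-point iteration is used only because it mirrors the contraction argument; it is not presented as the paper's computational method.

```python
import numpy as np

def solve_bne(X, friends, beta, alpha, p0=None, tol=1e-12, max_iter=10_000):
    """Iterate the binary-choice version of eq. (3) until it reaches its fixed point."""
    n = len(friends)
    p = np.full(n, 0.5) if p0 is None else np.array(p0, dtype=float)
    for _ in range(max_iter):
        # average probability of action 1 among i's friends (0 if i nominates nobody)
        peer = np.array([p[f].mean() if len(f) > 0 else 0.0 for f in friends])
        p_new = 1.0 / (1.0 + np.exp(-(X @ beta + alpha * peer)))
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")

def neighborhood(i, h, friends):
    """N_(i,h): players reachable from i in at most h friendship nominations."""
    reached, frontier = {i}, {i}
    for _ in range(h):
        frontier = {j for m in frontier for j in friends[m]} - reached
        reached |= frontier
    return reached

rng = np.random.default_rng(0)
n, alpha, beta = 200, 1.5, np.array([0.3, -0.5])   # illustrative values only
lam = abs(alpha) / 2.0                             # lambda of assumption B when K = 1
X = rng.normal(size=(n, 2))
# each player nominates 0, 1 or 2 friends among the other players (directed links)
friends = [rng.choice(np.delete(np.arange(n), i), size=rng.integers(0, 3),
                      replace=False).tolist() for i in range(n)]

# Uniqueness (Lemma 1): two different starting points lead to the same equilibrium.
p = solve_bne(X, friends, beta, alpha)
p_alt = solve_bne(X, friends, beta, alpha, p0=rng.uniform(size=n))
gap = np.max(np.abs(p - p_alt))

# NDD (Lemma 2): change covariates only outside N_(i,h) and compare sigma_i.
ndd_ok = True
for i in range(5):
    for h in range(4):
        inside = neighborhood(i, h, friends)
        outside = [j for j in range(n) if j not in inside]
        X_far = X.copy()
        X_far[outside] += 5.0                      # large counterfactual shift far from i
        p_far = solve_bne(X_far, friends, beta, alpha)
        l1_gap = 2.0 * abs(p[i] - p_far[i])        # L1 distance between (sigma_i0, sigma_i1)
        ndd_ok = ndd_ok and bool(l1_gap <= 2.0 * lam ** (h + 1))
```

In this simulation the perturbation leaves $S_{(i,h)}$ unchanged, so the bound $2\lambda^{h+1}$ should hold for every $i$ and $h$ checked; the graph, covariates, and parameter values carry no empirical meaning.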
With the NDD, the distribution of the observable variables $P_{Y_i \mid S}$ can be nonparametrically estimated by using data from one single large network as the network size $n$ goes to infinity. See Appendix D.

[8] To see this, fix $i$ ($i \leq n$) and consider adding new players $n+1, n+2, \cdots$ one by one to an existing network with $n$ players. Assumption J ensures that for any existing player $i \leq n$, the network distance from the added player $n+k$ to $i$, i.e., $\rho(i, n+k)$, will go to infinity as $k$ goes to infinity, if any new player is added to the fringe of the previously existing network (i.e., a new player will not decrease the network distance of any pair of existing players). Therefore, $\{\sigma_i^*(S_{n'}; \theta) : n' = n, n+1, \cdots, \infty\}$ is a Cauchy sequence if the NDD condition holds.

4. IDENTIFICATION

In this section, we discuss the identification of the structural parameter $\theta$. Following Hurwicz (1950) and Koopmans and Reiersol (1950), the definition of identification in a structural model requires that there is a unique value of the structural parameter $\theta$ that generates the distribution of the observable variables $\{P_{Y_i \mid S} : i = 1, \cdots, n\}$.

Because of the uniqueness of the equilibrium by Lemma 1, $\sigma_{ik}^*(S; \theta)$ is identified by $\sigma_{ik}^*(S; \theta) = P(Y_i = k \mid S)$. Let $\delta_{ik}(S) = \ln P(Y_i = k \mid S) - \ln P(Y_i = 0 \mid S)$ for each $k \in A$. By definition, $\delta_{ik}(S)$ is also identified. Moreover, by (3),

$$\delta_{ik}(S) = \beta_k(X_i) + \sum_{\ell \in A} \Big[ \alpha_k(\ell, X_i, Q_i) \times \sum_{j \in F_i} P(Y_j = \ell \mid S) \Big], \quad \forall i, k.$$

Let further $\phi_{i\ell}(S) = \sum_{j \in F_i} P(Y_j = \ell \mid S)$. By definition, $\sum_{\ell \in A} \phi_{i\ell}(S) = Q_i$. It follows that

$$\delta_{ik}(S) = \beta_k(X_i) + \alpha_k(0, X_i, Q_i) \times Q_i + \sum_{\ell=1}^{K} \big[ \alpha_k(\ell, X_i, Q_i) - \alpha_k(0, X_i, Q_i) \big] \times \phi_{i\ell}(S). \tag{5}$$

Similar to Robinson (1988), eq. (5) is essentially a partial linear model as shown in Lemma 3.
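The step leading to eq. (5) uses only the adding-up identity $\sum_{\ell \in A} \phi_{i\ell}(S) = Q_i$, i.e., $\phi_{i0}(S) = Q_i - \sum_{\ell=1}^{K} \phi_{i\ell}(S)$. Written out in the notation above:

```latex
\begin{align*}
\sum_{\ell \in A} \alpha_k(\ell, X_i, Q_i)\, \phi_{i\ell}(S)
 &= \alpha_k(0, X_i, Q_i) \Big[ Q_i - \sum_{\ell=1}^{K} \phi_{i\ell}(S) \Big]
    + \sum_{\ell=1}^{K} \alpha_k(\ell, X_i, Q_i)\, \phi_{i\ell}(S) \\
 &= \alpha_k(0, X_i, Q_i) \times Q_i
    + \sum_{\ell=1}^{K} \big[ \alpha_k(\ell, X_i, Q_i) - \alpha_k(0, X_i, Q_i) \big] \times \phi_{i\ell}(S).
\end{align*}
```

Adding $\beta_k(X_i)$ to both sides gives eq. (5). The term $\alpha_k(0, X_i, Q_i) \times Q_i$ depends on $(X_i, Q_i)$ only, which is why it cannot be told apart from $\beta_k(X_i)$ without a normalization.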
Equation (5) suggests that $\beta_k(\cdot)$ and $\alpha_k(0, \cdot, \cdot)$ are not identified separately unless $0 \in \mathcal{S}_Q$.^9 Hence, we introduce the following normalization on $\alpha_k$.

^9 To see this, consider the following specification: $\alpha_k(0, X_i, Q_i) = \alpha_k(0, X_i) / Q_i$ for all $Q_i \ge 1$.

Assumption C. Let $\alpha_k(0, \cdot, \cdot) = 0$ for all $k \in A$.

Next, we assume a rank condition for identification. Let $\phi_i(S) = (1, \varphi_{i1}(S), \cdots, \varphi_{iK}(S))'$.

Assumption D (Rank Condition). Given the game size $n$, the matrix $\mathbb{E}[\phi_i(S) \times \phi_i(S)' \mid X_i = x, Q_i = q]$ is invertible for all $(x, q) \in \mathcal{S}_{XQ}$.

Assumption D is testable given that the conditional choice probabilities can be consistently estimated. The next lemma establishes the identification of the model. For the sake of simplicity, let $\alpha_k(\cdot, \cdot) = (\alpha_k(1, \cdot, \cdot), \cdots, \alpha_k(K, \cdot, \cdot))'$ be a vector of functions.

Lemma 3. Fix arbitrary $n$. Suppose assumptions A to D hold. Then the structural parameter $\theta$ is identified, i.e. $\mathbb{P}_{Y_1, \cdots, Y_n \mid S}(\theta') \neq \mathbb{P}_{Y_1, \cdots, Y_n \mid S}(\theta)$ for all $\theta' \neq \theta$. Specifically, for any $(x, q) \in \mathcal{S}_{XQ}$,
$$\begin{pmatrix} \beta_k(x) \\ \alpha_k(x, q) \end{pmatrix} = \Big\{ \mathbb{E}\big[ \phi_i(S) \times \phi_i(S)' \mid X_i = x, Q_i = q \big] \Big\}^{-1} \mathbb{E}\big[ \phi_i(S) \times \delta_{ik}(S) \mid X_i = x, Q_i = q \big].$$

Note that the identification result in Lemma 3 is established for each fixed $n$. For the purpose of estimation and asymptotic analysis, however, we need $n$ to go to infinity. Hence, we replace the rank condition D by the following assumption.

Assumption E (Rank Condition for large $n$). The matrix $\mathbb{E}[\phi_i(S) \times \phi_i(S)' \mid X_i = x, Q_i = q]$ is invertible for all $n$ sufficiently large and $(x, q) \in \mathcal{S}_{XQ}$, i.e., for any $(x, q) \in \mathcal{S}_{XQ}$,
$$\liminf_{n \to \infty} \det \mathbb{E}\big[ \phi_i(S) \times \phi_i(S)' \mid X_i = x, Q_i = q \big] > 0.$$

By relaxing the conditions in Lemma 3, the next theorem establishes identification of the model for all sufficiently large $n$.

Theorem 1. Suppose assumptions A to C and E hold.
Then the structural parameter $\theta$ is identified for all $n$ sufficiently large.

The semiparametric identification in Theorem 1 helps the applied researcher judge whether a fully parametric approach relies on an ad hoc specification (of the payoff function) for identification or merely for simplicity of estimation.

An analogous rank condition can be formulated in the fully parametric model that is used for our estimation. Let $\beta_k(x) = x'\beta_k$ and $\alpha_k(\ell, x, q) = \alpha_{k\ell} / q$, where $\beta_k \in \mathbb{R}^d$ and $\alpha_k \equiv (\alpha_{k1}, \cdots, \alpha_{kK})' \in \mathbb{R}^K$. Let $W_i = (X_i', \varphi_{i1}(S), \cdots, \varphi_{iK}(S))'$.

Assumption F (Rank Condition for linear-index setup). The matrix $\mathbb{E}(W_i W_i')$ is invertible for all $n$ sufficiently large.

Replacing assumption E with F in Theorem 1, the identification of $\theta_k = (\beta_k', \alpha_k')'$ is given as follows: for sufficiently large $n$,
$$\theta_k = \big[ \mathbb{E}(W_i W_i') \big]^{-1} \mathbb{E}\big[ W_i\, \delta_{ik}(S) \big].$$
Clearly, variations in the aggregated friends' choice probabilities $\varphi_{i\ell}(S)$ identify the strategic coefficients $\alpha_{k\ell}$.

5. ESTIMATION

This section discusses the parametric estimation of the structural parameter $\theta$. In particular, we specify the payoff function by
$$U_{ik}(Y_{-i}, S_i, \epsilon_i) = X_i'\beta_k + \sum_{\ell=1}^{K} \alpha_{k\ell} \times \frac{1}{Q_i} \sum_{j \in F_i} \mathbf{1}(Y_j = \ell) + \epsilon_{ik}, \tag{6}$$
where $\beta_k \in \mathbb{R}^d$ and $\alpha_k = (\alpha_{k1}, \cdots, \alpha_{kK})' \in \mathbb{R}^K$. Let $\theta_k = (\beta_k', \alpha_k')' \in \mathbb{R}^{K+d}$. Moreover, let $\beta = (\beta_1', \cdots, \beta_K')' \in \Theta_\beta \subseteq \mathbb{R}^{Kd}$ and $\alpha = (\alpha_1', \cdots, \alpha_K')' \in \Theta_\alpha \subseteq \mathbb{R}^{K \times K}$, where $\Theta_\beta$ and $\Theta_\alpha$ are the parameter spaces for $\beta$ and $\alpha$, respectively. Denote $\theta = (\theta_1', \cdots, \theta_K')'$ and $\Theta = \Theta_\beta \times \Theta_\alpha$.

Let $\{X_i, F_i, Y_i\}_{i=1}^{n}$ be the data generated from the equilibrium of a single large network. For asymptotic analysis, we let the network size $n$ go to infinity, since our empirical application involves a few large networks.
For the data generating process, our asymptotics require that, for any fixed $h$, the probability distribution of $G(i,h)$ converges to the same limiting distribution for all $i$ as $n \to \infty$, and that the random graph $G(i,h)$ is independent of $G(j,h)$ whenever the two share no common element.

We now proceed to motivate our estimation procedure. First, note that under Assumption A, the actions chosen by the players are conditionally independent given $S$. Thus, we have the (conditional) log-likelihood function
$$\hat{L}(\theta) = \frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \mathbf{1}(Y_i = k) \cdot \ln \sigma^*_{ik}(S;\theta). \tag{7}$$
Let $\hat{\theta}_{MLE} = \arg\max_{c \in \Theta} \hat{L}(c)$ be the MLE, which requires solving (3) for $\{\sigma^*_{ik}(S;\theta) : i \in N;\, k \in A\}$. For (7), we can verify that all the regularity conditions hold under additional weak conditions.^10 In practice, however, $\hat{\theta}_{MLE}$ is not computationally feasible when the network is large. This is because the equilibrium choice probability $\sigma^*_{ik}(S;\theta)$ has no closed-form expression and its numerical solution is costly to obtain in the large simultaneous equation system.

The key to our approach is to approximate $\sigma^*_{ik}(S;\theta)$ by some computable solution $\sigma^h_{ik}(S;\theta)$ to be defined later, where $h$ is an integer that depends on $n$ such that the approximation error $\|\sigma^h_{ik}(S;\theta) - \sigma^*_{ik}(S;\theta)\|_1$ is negligible relative to the sampling error. Thus, we define our approximated log-likelihood function
$$\hat{Q}(\theta) = \frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \mathbf{1}(Y_i = k) \cdot \ln \sigma^h_{ik}(S;\theta). \tag{8}$$
Further, our estimator maximizes the approximated likelihood, i.e., $\hat{\theta} = \arg\max_{c \in \Theta} \hat{Q}(c)$.

To define $\sigma^h_{ik}(S;\theta)$, we first define a Bayesian game of smaller size: let $N(i,h)$ be the set of players, and let each player $j \in N(i,h)$ simultaneously make a discrete choice $Y_j \in A$.
Moreover, each player $j$ in $N(i,h)$ has the same state variables $(X_j, \epsilon_j)$ as in the original network game, but player $j$'s set of friends is restricted to be $F_j \cap N(i,h)$. In other words, we artificially remove all the players outside of $N(i,h)$ from the original game. Note that player $i$ is located at the center of the subnetwork $N(i,h)$. Similarly to (3), the profile $\{\sigma_{jk} : j \in N(i,h),\, k \in A\}$ solves:
$$\sigma_{jk} = \frac{\exp\Big[ \beta_k' X_j + \sum_{\ell=1}^{K} \alpha_{k\ell} \times \frac{1}{Q_j} \sum_{j' \in F_j \cap N(i,h)} \sigma_{j'\ell} \Big]}{1 + \sum_{q=1}^{K} \exp\Big[ \beta_q' X_j + \sum_{\ell=1}^{K} \alpha_{q\ell} \times \frac{1}{Q_j} \sum_{j' \in F_j \cap N(i,h)} \sigma_{j'\ell} \Big]}. \tag{9}$$
By Lemma 1, there is a unique solution to (9). In this derived subnetwork game, player $i$ is in the center of the subnetwork and her equilibrium choice probability profile is denoted by $\sigma^h_i(S;\theta) = \big( \sigma^h_{i0}(S;\theta), \cdots, \sigma^h_{iK}(S;\theta) \big)$. By Lemma 2, the approximation error $\|\sigma^*_i(S;\theta) - \sigma^h_i(S;\theta)\|_1$ can be bounded by $2\lambda^{h+1}$.^11 To control the approximation error, we choose $h$ to increase with $n$ at a proper rate.^12

^10 For instance, Lemma 7 in the appendix ensures the differentiability of the objective function.

5.1. Asymptotic analysis. We now establish the consistency and limiting distribution of the proposed estimator. First, we make the following assumptions.

Assumption G. (i) The parameter space $\Theta$ is compact and the support $\mathcal{S}_{XQ}$ is bounded; (ii) The true parameter $\theta$ belongs to the interior of $\Theta$.

Assumption H. Let $\sup_{a \in \Theta_\alpha} \max_{k, \ell, m \in A} |a_{k\ell} - a_{m\ell}| < (K+1)/K$.

Assumption I. Given any $h \in \mathbb{N}$, the probability distribution of $G(i,h)$ converges to a limiting distribution $\mathbb{P}_{G,h}$ as $n \to \infty$ for all $i$; and $G(i,h)$ is independent of $G(j,h)$ if $N(i,h) \cap N(j,h) = \emptyset$. Moreover, the payoff covariates $X_i$ are i.i.d. across players given the exogenous random network.

Assumption J.
There exists a positive constant $c_0 \in \mathbb{N}$, which does not depend on $n$, such that $\max_{i \in N} \sum_{j \neq i} \mathbf{1}(i \in F_j) \le c_0$ with probability one.

Assumption K. (i) Let $h \to \infty$ as $n \to \infty$; (ii) Let further $h = [h_0 \cdot n^a]$ for some constants $h_0 > 0$ and $a > 0$, where $[t]$ is the largest integer that is no larger than $t$.

^11 To apply Lemma 2, let $S_{i,h}$ denote the state of the network derived from $S$ by eliminating all the network connections outside of $N(i,h)$. By definition, $S_{i,h} \in \{s' \in \mathcal{S}_S : s'(i,h) = S(i,h)\}$. Moreover, note that $\sigma^h_i(S;\theta) = \sigma^*_i(S_{i,h};\theta)$, since all players outside of $N(i,h)$ have no strategic effects on players in $N(i,h)$.

^12 Note that for $h = 0$, the proposed estimator becomes the classical multinomial logit estimator.

Let $P = K(d + K)$ denote the dimension of the parameter $\theta$. Moreover, let $f_i(Y_i \mid S;\theta) = \sum_{k \in A} \mathbf{1}(Y_i = k) \times \ln \sigma^*_{ik}(S;\theta)$ and
$$J_n(\theta) = \mathbb{E}\Big[ \frac{\partial}{\partial\theta} f_1(Y_1 \mid S;\theta) \times \frac{\partial}{\partial\theta'} f_1(Y_1 \mid S;\theta) \Big].$$
The latter is indexed by $n$ because of the dependence of $f_i$ on $n$ through $S$ and $\sigma^*_i$.

Assumption L. There exists a non-singular $P \times P$ matrix $J(\theta)$ such that $J_n(\theta) \to J(\theta)$.

Assumption G-(i) ensures that choice probabilities are bounded away from zero so that the log-likelihood function is bounded. Unbounded regressors can be accommodated using high order moment restrictions (see e.g. Van de Geer, 1990). Assumption G-(ii) is standard in the literature. Assumption H strengthens assumption B to hold for all the values in the parameter space $\Theta$.

Assumptions I and J impose restrictions on the distribution of the state variables as well as the network connections. For the first half of Assumption I, note that for any given $n$ and $h$, the probability distribution of $G(i,h)$ is well defined, since the subgraph $G(i,h)$ is a mapping to the space of graphs from the $n \times n$ matrix with 0-1 entries.
Note that the subgraph $G(i,h)$ here refers to all subgraphs that are homomorphic to $G(i,h)$,^13 because players' identities do not matter in the definition of $G(i,h)$. Moreover, the first half of Assumption I also requires that $G(i,h)$ should be i.i.d. across players who are at least $2h$ steps away from each other in the network. This condition generally holds in the random graph literature, since conditional on $G(i,h)$ and $G(j,h)$ not overlapping, the graph structure of $G(i,h)$ does not provide additional information on what $G(j,h)$ looks like. Moreover, for the second half of Assumption I, the independence of $X_i$ (conditional on the network connections) is a strong assumption. In practice, positive statistical dependence across friends' demographic variables (e.g. age, education, race, etc.) has been emphasized in the sociology literature (see e.g. Easley and Kleinberg, 2010), the so-called “homophily” phenomenon. For our asymptotic results to be established, this assumption could be relaxed to allow for some degree of dependence at the expense of longer proofs.^14

^13 In graph theory, the notion of graph homomorphism is defined as follows: For a graph $G$, let $V(G)$ and $E(G)$ be the set of vertices and the set of edges of $G$, respectively. Let $G$ and $H$ be two graphs. A mapping $\phi : V(G) \to V(H)$ is called a homomorphism if $\phi$ preserves edge adjacency, i.e., for every edge $\{v, w\} \in E(G)$, the edge $\{\phi(v), \phi(w)\}$ belongs to $E(H)$.

^14 For instance, as is suggested by spatial autoregressive models, one could assume that $X \equiv (X_1', \cdots, X_n')'$ takes a simultaneous autoregressive dependence structure:
$$X = \Psi(\gamma_0) \cdot X + \nu,$$
where $\Psi$ is an $n \times n$ weight matrix parametrized by a $q$-dimensional vector $\gamma_0$ such that the diagonal elements of $\Psi$ are zeros and $I_n - \Psi$ is non-singular. Moreover, $\nu \in \mathbb{R}^n$ is a vector of i.i.d. errors that are independent of $S$ and $(\epsilon_1, \cdots, \epsilon_n)'$. Our asymptotic results established in Theorems 2 and 3 still hold as long as for each $k \in A$,
$$\frac{1}{n} \sum_{i=1}^{n} \Big\{ \sigma^*_{ik}(S;\theta) \ln \sigma^*_{ik}(S;c) - \mathbb{E}\big[ \sigma^*_{ik}(S;\theta) \ln \sigma^*_{ik}(S;c) \big] \Big\} \xrightarrow{p} 0$$
holds uniformly in $c \in \Theta$. Such a high level condition can be satisfied if the weight matrix is modeled as $\Psi_{j\ell} = \psi(\min\{\rho(j,\ell), \rho(\ell,j)\})$, where $\psi$ is a decreasing function that decays sufficiently fast (i.e., is subject to exponential decay). Moreover, it is also possible to allow the dependence between $X_j$ and $X_\ell$, i.e., $\Psi_{j\ell}$, to depend not only on the network distance between $j$ and $\ell$, but also on the distance between $j$ (or $\ell$) and other players that connect (directly or indirectly) to both of them. See e.g. Pinkse, Slade, and Brett (2002).

Assumption J restricts the number of best friends that a single individual can have. Note that the upper bound $c_0$ does not depend on the network size $n$. This condition is crucial for the $\sqrt{n}$-asymptotics of the proposed estimator when the data come from one single large network game: by assumption J and the NDD condition, we can limit the dependence among all the observations. Similar assumptions can also be found in, e.g., Morris (2000) for the contagion analysis in local-interaction games.

It is worth pointing out that Assumption J is not imposed in most of the recent empirical network formation models. Such a restriction, however, can be easily accommodated in recent network formation models, e.g., Christakis, Fowler, Imbens, and Kalyanaraman (2010) and Mele (2010). On the theoretical side of network formation, e.g., Jackson and Wolinsky (1996) introduce a cost for players to maintain a direct friendship link, which similarly limits the number of direct links each individual could have. In our empirical application, each student was allowed to nominate at most ten best friends. Such a restriction is reasonable in light of capacity constraints (e.g. time and/or effort) for students to make and keep their best friends. Therefore, a network formation model using the same dataset should impose such a restriction; otherwise the model cannot rationalize the data.

Assumption K-(i) is intuitive for the approximation of $\sigma^*_i(S;\theta)$. Moreover, strengthening (i), assumption K-(ii) ensures that the approximation error is negligible in the limiting distribution of the estimator.

In assumption L, $J_n(\theta)$ is the Fisher information matrix of the $n$-player game. Assumption L requires that the Fisher information matrix has a non-degenerate limit when the network size goes to infinity. Note that the convergence of $J_n(\theta)$ is implied by Lemma 2 and assumption I, since the distribution of $S(i,h)$ converges to a limit for all $i$ as $n \to \infty$. Hence, in assumption L, the essential restriction is the non-degeneracy of the limit.

Theorem 2. Suppose that assumptions A, F, G-(i), I, J, and K-(i) hold. Then $\hat{\theta} \xrightarrow{p} \theta$.

Given the consistency of $\hat{\theta}$, we now establish its limiting distribution, which is shown to be identical to that of $\hat{\theta}_{MLE}$ under additional conditions.

Theorem 3. Suppose that assumptions A and F to L hold. Then $\sqrt{n}\,(\hat{\theta} - \theta) \xrightarrow{d} N\big(0, J(\theta)^{-1}\big)$.

Note that the infeasible likelihood function (7) is indeed what ultimately gives the information equality in Theorem 3. Furthermore, the limiting Fisher information matrix $J(\theta)$ can be consistently estimated by
$$\frac{1}{n} \sum_{i=1}^{n} \Big[ \frac{\partial}{\partial\theta} f^h_i(Y_i \mid S;\hat{\theta}) \times \frac{\partial}{\partial\theta'} f^h_i(Y_i \mid S;\hat{\theta}) \Big],$$
where $f^h_i(Y_i \mid S;\theta) = \sum_{k \in A} \mathbf{1}(Y_i = k) \ln \sigma^h_{ik}(S;\theta)$ and
$$\frac{\partial}{\partial\theta} f^h_i(Y_i \mid S;\theta) = \sum_{k \in A} \frac{\mathbf{1}(Y_i = k)}{\sigma^h_{ik}(S;\theta)} \times \frac{\partial}{\partial\theta} \sigma^h_{ik}(S;\theta).$$

Remark.
It is a generic aspect of our asymptotic analysis that the size of the network goes to infinity, but the maximum number of friends of each player remains fixed (i.e. assumption J). Therefore, the collection of state variables $\{S(i,h) : i \le n\}$ becomes an $m$-dependent sequence, where $m \le c_0^{h+1}$, which is crucial for the $\sqrt{n}$-consistency of the estimator in the proof of Theorem 3. This aspect rules out the “Small-World phenomenon” (see e.g. Watts and Strogatz, 1998), often referred to as six degrees of separation (see e.g. Guare, 1990). It should be noted that whether the network is a “small-world” is an empirical question that can be verified from the data. In a small-world network, the asymptotic analysis should allow the (average) number of friends to increase with the size of the network.^15 Small-World asymptotics remain an intriguing challenge.

^15 Watts and Strogatz (1998) develop a small-world model by rewiring a regular network with $n \gg Q_i \gg \ln n \gg 1$.

It is worth pointing out that it is possible to relax assumption J to accommodate some “intermediate” case of the network structure at the expense of longer proofs. For instance, consider a network where the maximum number of friends is not bounded from above, but the distribution of $Q_i$ is asymptotically stable (as the network size $n$ goes to infinity) with finite mean and variance. Hence, there can be a small but non-negligible number of nodes with many connections, which however does not turn the network into a “Small-World”. In the following Monte Carlo experiments, we consider such a specification to examine the finite sample performance of our approximated maximum likelihood estimator.

5.2. Monte Carlo Experiments. This section uses Monte Carlo experiments to illustrate the finite sample performance of the proposed estimator.
In particular, we consider a binary game with payoff:
$$U_{i1}(Y_{-i}, S_i, \epsilon_i) = X_i'\beta + \alpha \times \Big[ \frac{1}{Q_i} \sum_{j \in F_i} \mathbf{1}(Y_j = 1) \Big] + \epsilon_{i1},$$
where $\alpha \in \mathbb{R}$ and $X_i \in \mathbb{R}^2$.

Moreover, we consider two representative networks. First, we consider the Circle network specified in Salop (1979), where $n$ players are equally spaced on a circle and each player has two friends. In the circle network, $Q_i = 2$ for all players and the friendship relation between each pair of players is symmetric. The second network is a random network. For any $i \neq j$, we use a random variable $\vec{\ell}_{ij} \in \{0, 1, 2, 3\}$ to denote “no relationship”, “$i$ is $j$'s friend, but not vice versa”, “$j$ is $i$'s friend, but not vice versa” and “mutual friendship”, respectively. For $i \neq j$, $\vec{\ell}_{ij}$ is drawn independently from the probability mass distribution $\big( 1 - \frac{4}{n}, \frac{1}{n}, \frac{1}{n}, \frac{2}{n} \big)$. Moreover, set $\vec{\ell}_{ii} = 0$ for all $i$. By definition, $Q_i = \sum_{j=1}^{n} \mathbf{1}(\vec{\ell}_{ij} \in \{2, 3\})$, which follows a Binomial distribution $B(n, 3/n)$. As $n$ goes to infinity, the mean of $Q_i$ remains constant and $Q_i$ converges in distribution to Poisson(3).

Moreover, we take $X_{i1} \sim U(-0.5, 0.5)$, $X_{i2} \sim N(0, 1)$ and $X_{i1} \perp X_{i2}$. The results for other distributional specifications of $X$ are qualitatively similar. Further, we set $\beta = (1, 1)'$, which is held fixed across all the experiments. According to assumption H, we choose $\Theta_\alpha = [-1.99, 1.99]$ and set the true parameter $\alpha = 0$, $0.8$, and $1.6$, respectively. In particular, for $\alpha = 0$, our setting is equivalent to the classical Logit model.

We have performed experiments with the number of players $n = 500$, $n = 1000$ and $n = 2000$. In each design, we first compute the unique BNE given the underlying parameter value, i.e., we solve for the equilibrium by finding a fixed point of (3).
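The data-generating step of the circle-network design can be sketched as follows. This is a minimal illustration, not the code used for the reported experiments: the function names (`circle_friends`, `solve_bne`, `simulate`) are hypothetical, the equilibrium is assumed to take the binary logit form (the $K = 1$ case of the logit system), and simple successive approximation is assumed to be an adequate fixed-point solver for $|\alpha| < 2$.

```python
import numpy as np


def circle_friends(n):
    """Circle network: the friends of player i are her two neighbours, so Q_i = 2."""
    return [np.array([(i - 1) % n, (i + 1) % n]) for i in range(n)]


def logistic(v):
    return 1.0 / (1.0 + np.exp(-v))


def solve_bne(X, friends, beta, alpha, tol=1e-12, max_iter=10_000):
    """Iterate p_i = logistic(X_i'beta + alpha * average of p_j over i's friends)."""
    index = X @ beta
    p = logistic(index)  # start from the no-interaction logit probabilities
    for _ in range(max_iter):
        # Assumption: a player without friends gets a zero interaction term.
        peer = np.array([p[f].mean() if len(f) else 0.0 for f in friends])
        p_new = logistic(index + alpha * peer)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")


def simulate(n, alpha, beta=(1.0, 1.0), seed=0):
    """Draw covariates, solve for equilibrium probabilities, then draw actions."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([rng.uniform(-0.5, 0.5, n), rng.standard_normal(n)])
    friends = circle_friends(n)
    p = solve_bne(X, friends, np.asarray(beta), alpha)
    Y = (rng.uniform(size=n) < p).astype(int)  # Y_i = 1 with probability p_i
    return X, friends, p, Y


X, friends, p, Y = simulate(n=1000, alpha=0.8)
print(p[:3], Y.mean())
```

Because the logistic function has slope at most 1/4, the update map is a contraction in the sup norm whenever $|\alpha| < 4$, so the iteration converges from any starting point for the parameter values used here.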
With the (numerical) solution in hand, we are able to simulate the equilibrium decision made by each player.

Regarding estimation, it is crucial to choose the parameter $h \in \mathbb{N}$ according to the sample size $n$. Following assumption K, we set $h = [\sqrt{n}/10]$, i.e. $h = 2$, $3$ and $4$ for the three choices of sample size, respectively. It is worth pointing out that the computation time increases nonlinearly with $h$. For fixed $n$, we also investigate the performance of the proposed estimator under different choices of $h$. The results for different sample sizes are qualitatively similar and therefore we only report results for $n = 1000$. In addition, we perform 500 replications to approximate the finite sample distribution of our estimator.

Tables 2 and 3 report the finite sample performance of the proposed estimator under the different settings. The numbers in parentheses are the standard deviations. In all these designs the estimates are centered close to the true values, and the standard deviation diminishes at the $\sqrt{n}$-rate as we increase the sample size. In Table 4, we further investigate how the choice of $h$ affects the performance of $\hat{\theta}$. For $n = 1000$, it shows that the approximation behaves well by using $h \ge 3$ and additional gains of accuracy are minor from choosing larger $h$.

TABLE 2. Finite sample performance: β = (1, 1)′ and (n, h) = (1000, 3)

True value of α   Parameter   Circle Network     Random Network
0                 β_1         1.0131 (0.2454)    1.0292 (0.2493)
                  β_2         1.0036 (0.0826)    1.0058 (0.0833)
                  α           0.0068 (0.1326)    0.0109 (0.1402)
0.8               β_1         1.0018 (0.2468)    1.0204 (0.2557)
                  β_2         1.0091 (0.0833)    1.0060 (0.0834)
                  α           0.8066 (0.1042)    0.8023 (0.1114)
1.6               β_1         1.0059 (0.2464)    1.0179 (0.2721)
                  β_2         1.0008 (0.0849)    1.0064 (0.0839)
                  α           1.6256 (0.0950)    1.6169 (0.0930)

5.3. Empirical results for peer effects on college attendance.
We now apply our method to estimate peer effects on high school students' college attendance decisions. The specification of the payoff function is the same as the one used in our Monte Carlo experiments.

TABLE 3. Finite sample performance of α̂

True value of α   Sample size   Circle Network     Random Network
0                 500           0.0030 (0.1954)    0.0033 (0.2004)
                  1,000         0.0068 (0.1326)    0.0109 (0.1402)
                  2,000         0.0044 (0.0962)    0.0022 (0.0968)
0.8               500           0.8032 (0.1570)    0.8048 (0.1469)
                  1,000         0.8066 (0.1042)    0.8023 (0.1114)
                  2,000         0.8036 (0.0714)    0.7964 (0.0716)
1.6               500           1.6254 (0.1282)    1.6776 (0.1398)
                  1,000         1.6256 (0.0950)    1.6169 (0.0930)
                  2,000         1.6072 (0.0660)    1.6064 (0.0659)

*Note that h = 2, 3, 4 for n = 500, 1000 and 2000, respectively.

TABLE 4. Finite sample performance of θ̂ at different h (n = 1000, α = 0.8)

Parameter   h = 0             h = 1             h = 2             h = 3             h = 4
Circle Network
β_1         0.9790 (0.2501)   1.0121 (0.2448)   1.0155 (0.2461)   1.0157 (0.2462)   1.0157 (0.2462)
β_2         0.9627 (0.0821)   0.9967 (0.0845)   1.0002 (0.0848)   1.0004 (0.0849)   1.0004 (0.0849)
α           n/a               0.8560 (0.1118)   0.8014 (0.0996)   0.7974 (0.0986)   0.7972 (0.0984)
Random Network
β_1         0.9649 (0.2568)   1.0063 (0.2575)   1.0094 (0.2584)   1.0098 (0.2585)   1.0098 (0.2585)
β_2         0.9614 (0.0823)   0.9990 (0.0824)   1.0023 (0.0825)   1.0026 (0.0825)   1.0026 (0.0825)
α           n/a               0.8957 (0.1289)   0.8068 (0.1063)   0.7979 (0.1033)   0.7968 (0.1028)

Table 5 presents our estimation results. We also provide results using the pseudo MLE for comparison. The difference reflects the bias due to the misspecification of social interactions. Note that AMLE(h) refers to the approximated MLE with the parameter value $h$, and the pseudo MLE is equivalent to AMLE(0). From Table 5, the approximation of the equilibrium is sufficiently good for $h \ge 2$. So we can use AMLE(2) as our estimates. It is worth pointing out that the estimates of peer effects satisfy assumption B.
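To make concrete what AMLE(h) evaluates for each observation, the sketch below computes the subnetwork choice probability $\sigma^h_i$ for the binary case. It is an illustration under stated assumptions rather than the estimation code behind Table 5: the names `neighborhood` and `sigma_h` are hypothetical, $N(i,h)$ is assumed to be the set of players reachable from $i$ in at most $h$ steps along friend links, the weight $1/Q_j$ keeps counting all of $j$'s friends while only friends inside $N(i,h)$ enter the sum, and the restricted system is solved by successive approximation.

```python
from collections import deque

import numpy as np


def neighborhood(i, friends, h):
    """Players reachable from i in at most h steps along friend links (breadth-first search)."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        j = queue.popleft()
        if dist[j] == h:
            continue  # do not expand beyond distance h
        for m in friends[j]:
            if m not in dist:
                dist[m] = dist[j] + 1
                queue.append(m)
    return sorted(dist)


def sigma_h(i, X, friends, beta, alpha, h, tol=1e-12, max_iter=10_000):
    """Binary-choice probability of player i in the game restricted to N(i, h)."""
    nodes = neighborhood(i, friends, h)
    pos = {j: a for a, j in enumerate(nodes)}
    index = X[nodes] @ beta
    # Q_j counts all of j's friends; only friends inside N(i, h) enter the sum.
    Q = np.array([len(friends[j]) for j in nodes], dtype=float)
    inside = [np.array([pos[m] for m in friends[j] if m in pos], dtype=int)
              for j in nodes]
    p = 1.0 / (1.0 + np.exp(-index))  # h = 0 stops here: the plain logit probability
    for _ in range(max_iter):
        peer = np.array([p[f].sum() / q if q > 0 else 0.0
                         for f, q in zip(inside, Q)])
        p_new = 1.0 / (1.0 + np.exp(-(index + alpha * peer)))
        if np.max(np.abs(p_new - p)) < tol:
            return p_new[pos[i]]
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")


# Usage on a small circle network with simulated covariates.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([rng.uniform(-0.5, 0.5, n), rng.standard_normal(n)])
friends = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
beta = np.array([1.0, 1.0])
for h in (0, 1, 2, 3):
    print(h, sigma_h(0, X, friends, beta, alpha=0.8, h=h))
```

Each call solves a system whose size is the number of players in $N(i,h)$ rather than $n$, which is the source of the computational saving; the printed values should settle quickly as $h$ grows, in line with the geometric bound on the approximation error.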
The second column of Table 5 contains the corresponding estimates of the pseudo MLE, which has been typically adopted in the empirical analysis of college attendance. Compared with the pseudo MLE estimates, the most striking feature of our estimates (i.e. AMLE(2) in the fourth column) is that the peer effects coefficient is significant at the 5% level, while the pseudo MLE implicitly sets it to zero. Therefore, ignoring peer effects in the empirical analysis of college attendance results in biased estimates, which can be corrected by increasing $h$ from 0 to 2.

In Table 5, most of the coefficient estimates are significant at the 10% significance level. Regarding race, the coefficients of American Indian, Asian and Black are insignificant; this is simply due to the fact that these three categories have only a few observations in the sample. Moreover, due to missing data on parents' education, one would expect noisy estimates for the parents' education coefficients.

Our pseudo MLE estimates are qualitatively similar to the empirical results in Light and Strayer (2002), who estimate racial effects on college attendance with a Probit model using data from the 1979 National Longitudinal Survey of Youth (NLSY79), which consists of a sample of respondents born in 1957–1964. In particular, whites are less likely than minorities to attend college when other determinants of college attendance are held constant. For such a comparison, note that peer effects are not considered in Light and Strayer (2002). Our pseudo MLE results are also consistent with other early empirical evidence on college attendance. See e.g. Fuller, Manski, and Wise (1982).^16

Peer effects estimates provided by AMLE(2) are related to the empirical results in Calvó-Armengol, Patacchini, and Zenou (2009), who also use the Add Health data to study peer effects on a school performance index.
In particular, they specify a linear equation system for network-based social interactions and obtain statistically significant peer effects estimates of similar magnitude (i.e., 0.5505 with a standard error of 0.1247). Moreover, Gaviria and Raphael (2001) and Kawaguchi (2004) use the National Education Longitudinal Study (NELS) dataset and the National Longitudinal Survey of Youth 97 (NLSY97) dataset, respectively, to study peer effects on youth behaviors of high school students, e.g., drug use, alcohol drinking, cigarette smoking, church attendance and dropping out. Their empirical results also provide evidence for significant peer effects of similar magnitude to our estimates. For example, consider a typical student in our sample whose covariates take the mean values in Table 1. Suppose all her friends shift their college attendance probabilities together from 0% to 10%; then her college attendance probability would increase by about 1.52 percentage points (namely, from 37.93% to 39.45%). Similarly, if all her friends' college attendance probabilities shift jointly from 0% to 50%, then it would yield an increase of 11.83%.^17

^16 Fuller, Manski, and Wise (1982) use the 1972 National Longitudinal Study of the High School Class (NLSS72).

6. CONCLUSION

This paper provides a structural approach to study social interactions in a large network. Our benchmark model assumes that individuals are affected by their friends only, but all individuals are connected to each other directly or indirectly in a single network. By restricting the strength of interactions among friends, we establish the existence and uniqueness of the equilibrium and an NDD condition. We further establish the semiparametric identification of the model and propose a computationally feasible and novel estimation procedure. The classic MLE method developed for single-agent binary response models is naturally nested in our approach.
An important extension of the benchmark model is to allow for interdependence between a pair of friends' private information. Individuals tend to bond with similar others as their friends. In sociology, such a phenomenon is called “homophily”; see e.g. Easley and Kleinberg (2010). Homophily leads to friendship between people with similar characteristics (age, education, race, etc.) and with positively correlated types. The former can be directly observed from the data. Identifying the latter is more challenging for the researcher. In a discrete game with a (small) fixed number of players, Liu, Vuong, and Xu (2012) establish the nonparametric identification of homophily. Identification and estimation of homophily in a large network game is an important extension.

^17 In a study of tenth graders' substance use, Gaviria and Raphael (2001)'s estimates imply, for example, that moving a typical teenager from a school where none of his classmates use drugs to one where half use drugs would increase the probability by approximately 13%. Similar experiments would yield increases in the corresponding probabilities of 9% for alcohol use, 8% for cigarette smoking, 11% for church attendance, and 8% for dropping out of school. Moreover, Kawaguchi (2004) shows that if a teenager's perception of the percentage of his/her peers who use a substance (i.e. marijuana, alcohol, or tobacco) increases by 10%, the probability that he/she will use the substance increases from 1.4% to 2.6%.

TABLE 5. Estimation Results (standard errors in parentheses)

Variable             Pseudo MLE         AMLE(1)            AMLE(2)            AMLE(3)            AMLE(4)
Age                  -0.140* (0.076)    -0.135* (0.076)    -0.135* (0.076)    -0.135* (0.076)    -0.135* (0.076)
Female               -0.028 (0.171)     -0.038 (0.171)     -0.035 (0.171)     -0.034 (0.171)     -0.034 (0.171)
Household Income     0.150** (0.042)    0.134** (0.043)    0.134** (0.043)    0.134** (0.043)    0.134** (0.043)
Mother's Education   0.066 (0.052)      0.064 (0.053)      0.064 (0.053)      0.064 (0.053)      0.064 (0.053)
Father's Education   0.033 (0.029)      0.035 (0.029)      0.036 (0.029)      0.036 (0.029)      0.036 (0.029)
Overall GPA          1.749** (0.147)    1.714** (0.148)    1.717** (0.148)    1.717** (0.148)    1.717** (0.148)
American Indian      -0.559 (0.418)     -0.575 (0.423)     -0.574 (0.423)     -0.574 (0.423)     -0.574 (0.423)
Asian                -0.050 (0.428)     0.035 (0.435)      0.043 (0.435)      0.043 (0.435)      0.043 (0.435)
Black                0.206 (0.455)      0.351 (0.466)      0.363 (0.467)      0.364 (0.467)      0.364 (0.467)
Hispanic             0.891** (0.223)    1.043** (0.233)    1.051** (0.234)    1.052** (0.234)    1.052** (0.234)
White                -0.703* (0.393)    -0.718* (0.401)    -0.717* (0.401)    -0.718* (0.401)    -0.718* (0.401)
Other Race           -1.024** (0.422)   -1.096** (0.430)   -1.097** (0.430)   -1.098** (0.430)   -1.098** (0.430)
Constant             -2.680* (1.441)    -2.795* (1.445)    -2.806* (1.446)    -2.806* (1.446)    -2.806* (1.446)
Peer Effects         n/a                0.657** (0.297)    0.642** (0.286)    0.640** (0.285)    0.640** (0.285)
LogLikelihood        -437.537           -435.063           -434.988           -434.990           -434.990

* significant at 10% level. ** significant at 5% level.

Allowing for possible endogeneity of the network is another important research question in the study of large-network social interactions. Being popular in a high school network might be associated with a possible high draw of payoff shocks for college attendance. Part of the problem could be addressed by taking into account the network formation in the first stage; see e.g. Christakis, Fowler, Imbens, and Kalyanaraman (2010), Mele (2010), Badev (2013), Leung (2014) and Menzel (2015b).
In this regar ds, our identifi- cation and est imation results are useful for the second st age analysis of social interactions in the s ubgame. In a large networ k g ame, however , dif ficulties arise when each p layer has a small op portunity se t, relativ e to the large network size, of p layers to meet with, and more importantly , such opport u nity s e ts are no t observed in the dataset. Fo r the majority of pairs of distinct individuals, it is unclear whether an unconne cted link is du e to the lack of opportunity , or players’ unfavorable des ir e for s u ch a connection. As a matter of fact, our results g o well beyond the local interaction studied he re as they can be ge neralized to more g eneral social inter action games. For instance, one can cons ider that each player interacts d irectly with her friends, friends of friends , et c. In particular , the payoff function can be ge neralized as follows: fo r choo sing an action k ∈ A , U ik ( Y − i , S i , ǫ i ) = β k ( X i ) + ∑ j 6 = i α k ( Y j , d i j , X i , Q i ) + ǫ ik , where d i j is the netw o rk dist ance fr om j to i . By such an exte nsion, the interaction te rm α k ( Y j , d i j , X i , Q i ) dep ends on player j ’s choice as well as their network distance. In (1), direct interactions α k have bee n set to zero for all j 6∈ F i . By a s imilar argument, our uniqueness and NDD condition of the equilibrium can be established. A major difficulty in developing nonparametric identification and e stimation, however , is to consider a mo d el with an incr easing parameter space, since the suppo rt of d i j expands with the size of th e network. Though significant p rogress has been made in t he regr ession cont e xt (see, e.g., Belloni and Chernozhuko v, 2011), the differ ent nature of the structural analysis calls for further work. 27 R EF E R E N C E S A ND R E W S , D . W . 
(1988): “Laws of large numbers for dependent non-identically distributed random variables,” Econometric Theory, 4(03), 458–467.
BADEV, A. (2013): “Discrete games in endogenous networks: Theory and policy,” Discussion paper.
BAJARI, P., H. HONG, J. KRAINER, AND D. NEKIPELOV (2010): “Estimating static models of strategic interactions,” Journal of Business and Economic Statistics, 28(4), 469–482.
BALA, V., AND S. GOYAL (2000): “A noncooperative model of network formation,” Econometrica, 68(5), 1181–1229.
BARABÁSI, A.-L., AND R. ALBERT (1999): “Emergence of scaling in random networks,” Science, 286(5439), 509–512.
BELLONI, A., AND V. CHERNOZHUKOV (2011): High dimensional sparse econometric models: An introduction. Springer.
BJORN, P. A., AND Q. H. VUONG (1984): “Simultaneous equations models for dummy endogenous variables: a game theoretic formulation with an application to labor force participation,” Working Papers 537, California Institute of Technology, Division of the Humanities and Social Sciences.
BOLLOBÁS, B. (1998): Random graphs. Springer.
BRAMOULLÉ, Y., H. DJEBBARI, AND B. FORTIN (2009): “Identification of peer effects through social networks,” Journal of Econometrics, 150(1), 41–55.
BRESNAHAN, T. F., AND P. C. REISS (1991): “Empirical models of discrete games,” Journal of Econometrics, 48(1-2), 57–81.
BROCK, W., AND S. DURLAUF (2001): “Discrete choice with social interactions,” Review of Economic Studies, 68(2), 235–260.
BROCK, W., AND S. DURLAUF (2002): “A multinomial-choice model of neighborhood effects,” American Economic Review, 92(2), 298–303.
CALVÓ-ARMENGOL, A., E. PATACCHINI, AND Y.
ZENOU (2009): “Peer effects and social networks in education,” The Review of Economic Studies, 76(4), 1239–1267.
CARRELL, S. E., F. V. MALMSTROM, AND J. E. WEST (2008): “Peer effects in academic cheating,” Journal of Human Resources, 43(1), 173–207.
CHRISTAKIS, N. A., J. H. FOWLER, G. W. IMBENS, AND K. KALYANARAMAN (2010): “An empirical model for strategic network formation,” Discussion paper, National Bureau of Economic Research.
CHRISTENSEN, S., J. MELDER, AND B. A. WEISBROD (1975): “Factors affecting college attendance,” Journal of Human Resources, 10(2), 174–188.
EASLEY, D., AND J. KLEINBERG (2010): Networks, crowds, and markets: Reasoning about a highly connected world. Cambridge University Press.
FULLER, W. C., C. F. MANSKI, AND D. A. WISE (1982): “New evidence on the economic determinants of postsecondary schooling choices,” Journal of Human Resources, pp. 477–498.
GAVIRIA, A., AND S. RAPHAEL (2001): “School-based peer effects and juvenile behavior,” Review of Economics and Statistics, 83(2), 257–268.
GRANOVETTER, M. (1985): “Economic action and social structure: the problem of embeddedness,” American Journal of Sociology, 91(3), 481–510.
GUARE, J. (1990): Six degrees of separation: A play. Vintage.
HURWICZ, L. (1950): “Generalization of the concept of identification,” Statistical inference in dynamic economic models, 10.
JACKSON, M. O., AND A. WOLINSKY (1996): “A strategic model of social and economic networks,” Journal of Economic Theory, 71(1), 44–74.
JENISH, N., AND I. R. PRUCHA (2009): “Central limit theorems and uniform laws of large numbers for arrays of random fields,” Journal of Econometrics, 150(1), 86–98.
KAWAGUCHI, D.
(2004): “Peer effects on substance use among American teenagers,” Journal of Population Economics, 17(2), 351–367.
KOOPMANS, T., AND O. REIERSOL (1950): “The identification of structural characteristics,” The Annals of Mathematical Statistics, pp. 165–181.
LESLIE, L. L., AND P. BRINKMAN (1988): The economic value of higher education. American Council on Education, New York.
LEUNG, M. (2014): “A random-field approach to inference in large models of network formation,” Available at SSRN.
LIGHT, A., AND W. STRAYER (2002): “From Bakke to Hopwood: Does race affect college attendance and completion?,” Review of Economics and Statistics, 84(1), 34–44.
LIU, N., Q. VUONG, AND H. XU (2012): “Rationalization and nonparametric identification of discrete games with correlated types,” Discussion paper.
MANRESA, E. (2013): “Estimating the structure of social interactions using panel data,” Discussion paper.
MANSKI, C. (1993): “Identification of endogenous social effects: The reflection problem,” The Review of Economic Studies, 60(3), 531–542.
MANSKI, C. (2000): “Economic analysis of social interactions,” The Journal of Economic Perspectives, 14(3), 115–136.
MELE, A. (2010): “A structural model of segregation in social networks,” Discussion paper, cemmap working paper.
MENZEL, K. (2015a): “Large matching markets as two-sided demand systems,” Econometrica, 83(3), 897–941.
MENZEL, K. (2015b): “Strategic network formation with many agents,” Discussion paper, New York University.
MORRIS, S. (2000): “Contagion,” Review of Economic Studies, 67(1), 57–78.
NAKAJIMA, R. (2007): “Measuring peer effects on youth smoking behaviour,” Review of Economic Studies, 74(3), 897–935.
NEWEY, W., AND D. MCFADDEN (1994): “Large sample estimation and hypothesis testing,” Handbook of Econometrics, 4, 2111–2245.
PAGAN, A., AND A. ULLAH (1999): Nonparametric econometrics. Cambridge University Press, New York, NY, USA.
PINKSE, J., M. E. SLADE, AND C. BRETT (2002): “Spatial price competition: a semiparametric approach,” Econometrica, 70(3), 1111–1153.
ROBINSON, P. (1988): “Root-N-consistent semiparametric regression,” Econometrica, 56(4), 931–954.
SACERDOTE, B. (2001): “Peer effects with random assignment: results for Dartmouth roommates,” Quarterly Journal of Economics, 116(2), 681–704.
SALOP, S. (1979): “Monopolistic competition with outside goods,” The Bell Journal of Economics, 10(1), 141–156.
SEIM, K. (2006): “An empirical model of firm entry with endogenous product-type choices,” The RAND Journal of Economics, 37(3), 619–640.
TAMER, E. (2003): “Incomplete simultaneous discrete response model with multiple equilibria,” The Review of Economic Studies, 70(1), 147–165.
VAN DE GEER, S. (1990): “Estimating a regression function,” The Annals of Statistics, 18(2), 907–924.
VAN DER VAART, A. W. (2000): Asymptotic statistics, vol. 3. Cambridge University Press.
WATTS, D. J., AND S. H. STROGATZ (1998): “Collective dynamics of “small-world” networks,” Nature, 393(6684), 440–442.
XU, H. (2014): “Estimation of discrete games with correlated types,” The Econometrics Journal, 17(3), 241–270.

APPENDIX A. EQUILIBRIUM UNIQUENESS AND NETWORK STABILITY

A.1. Proof of Lemma 1. Fix $n$ and $S = s$. The proof is by contradiction. Suppose there are two BNEs, denoted by $\{\sigma^*_i : i = 1, \cdots, n\}$ and $\{\sigma^\dagger_i : i = 1, \cdots, n\}$, respectively. For notational simplicity, throughout we suppress their dependence on $S$ and $\theta$.
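For any choice probability profile $(\sigma_1, \cdots, \sigma_n)$, where each $\sigma_i$ is a $(K+1)$-choice probability distribution, let

$$\Gamma_{ik}\big(s_i, \{\sigma_j : j \in F_i\}\big) = \frac{\exp\Big[\beta_k(x_i) + \sum_{\ell=0}^{K} \big\{\alpha_k(\ell, x_i, q_i) \sum_{j \in F_i} \sigma_{j\ell}\big\}\Big]}{1 + \sum_{\ell'=1}^{K} \exp\Big[\beta_{\ell'}(x_i) + \sum_{\ell=0}^{K} \big\{\alpha_{\ell'}(\ell, x_i, q_i) \sum_{j \in F_i} \sigma_{j\ell}\big\}\Big]}. \tag{10}$$

Let further $\Gamma_i\big(s_i, \{\sigma_j : j \in F_i\}\big) = \big(\Gamma_{i0}(s_i, \{\sigma_j : j \in F_i\}), \cdots, \Gamma_{iK}(s_i, \{\sigma_j : j \in F_i\})\big)'$. By eq. (3), we have $\sigma^*_i = \Gamma_i(s_i, \{\sigma^*_j : j \in F_i\})$ and $\sigma^\dagger_i = \Gamma_i(s_i, \{\sigma^\dagger_j : j \in F_i\})$ for all $i \in N$. Therefore, for any $i \in N$,

$$\sigma^*_i - \sigma^\dagger_i = \Gamma_i\big(s_i, \{\sigma^*_j : j \in F_i\}\big) - \Gamma_i\big(s_i, \{\sigma^\dagger_j : j \in F_i\}\big) = \sum_{j \in F_i} \sum_{\ell \in A} \frac{\partial \Gamma_i(s_i, \{\tilde\sigma_j : j \in F_i\})}{\partial \sigma_{j\ell}} \cdot (\sigma^*_{j\ell} - \sigma^\dagger_{j\ell}),$$

where $\{\tilde\sigma_j : j \in F_i\}$ is a vector between $\{\sigma^*_j : j \in F_i\}$ and $\{\sigma^\dagger_j : j \in F_i\}$. By the definition of $\Gamma_{ik}$, we have

$$\frac{\partial \ln \Gamma_{ik}}{\partial \sigma_{j\ell}} = \alpha_k(\ell, x_i, q_i) - \sum_{\ell'=1}^{K} \Gamma_{i\ell'} \cdot \alpha_{\ell'}(\ell, x_i, q_i) = \sum_{\ell'=0}^{K} \Gamma_{i\ell'} \cdot \alpha_k(\ell, x_i, q_i) - \sum_{\ell'=0}^{K} \Gamma_{i\ell'} \cdot \alpha_{\ell'}(\ell, x_i, q_i),$$

where the last step holds because (i) $\sum_{\ell'=0}^{K} \Gamma_{i\ell'} = 1$ and (ii) $\alpha_0(\ell, x, q) = 0$. It follows that

$$\frac{\partial \Gamma_{ik}}{\partial \sigma_{j\ell}} = \Gamma_{ik} \sum_{k' \neq k} \big[\Gamma_{ik'} \cdot \{\alpha_k(\ell, x_i, q_i) - \alpha_{k'}(\ell, x_i, q_i)\}\big].$$

Therefore,

$$\sum_{k \in A} \left|\frac{\partial \Gamma_{ik}}{\partial \sigma_{j\ell}}\right| \le \Delta^*(x_i, q_i) \cdot \sum_{k \in A} \big[\Gamma_{ik}(1 - \Gamma_{ik})\big] \le \Delta^*(x_i, q_i) \cdot \frac{K}{K+1},$$

where $\Delta^*(x, q) \equiv \max_{k, \ell, m \in A} |\alpha_k(\ell, x, q) - \alpha_m(\ell, x, q)|$ and the last step comes from the fact that (i) $0 \le \Gamma_{ik} \le 1$ and (ii) $\sum_{k=0}^{K} \Gamma_{ik} = 1$. Hence,

$$\begin{aligned}
\|\sigma^*_i - \sigma^\dagger_i\|_1 &= \sum_{k \in A} \left| \sum_{j \in F_i} \sum_{\ell \in A} \frac{\partial \Gamma_{ik}(s_i, \{\tilde\sigma_j : j \in F_i\})}{\partial \sigma_{j\ell}} \cdot (\sigma^*_{j\ell} - \sigma^\dagger_{j\ell}) \right| \\
&\le \sum_{j \in F_i} \sum_{\ell \in A} \left( \big|\sigma^*_{j\ell} - \sigma^\dagger_{j\ell}\big| \cdot \sum_{k \in A} \left|\frac{\partial \Gamma_{ik}(s_i, \{\tilde\sigma_j : j \in F_i\})}{\partial \sigma_{j\ell}}\right| \right) \\
&\le \Delta^*(x_i, q_i) \cdot \frac{K}{K+1} \cdot \sum_{j \in F_i} \sum_{\ell \in A} \big|\sigma^*_{j\ell} - \sigma^\dagger_{j\ell}\big| \\
&\le \Delta^*(x_i, q_i) \cdot \frac{K}{K+1} \cdot q_i \cdot \max_{j \in F_i} \|\sigma^*_j - \sigma^\dagger_j\|_1 \\
&\le \lambda \cdot \max_{j \in F_i} \|\sigma^*_j - \sigma^\dagger_j\|_1.
\end{aligned}$$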
Therefore,

$$\max_{i \in N} \|\sigma^*_i - \sigma^\dagger_i\|_1 \le \lambda \cdot \max_{i \in N} \max_{j \in F_i} \|\sigma^*_j - \sigma^\dagger_j\|_1 \le \lambda \cdot \max_{j \in N} \|\sigma^*_j - \sigma^\dagger_j\|_1,$$

which leads to a contradiction since $\lambda < 1$ under assumption B.

A.2. Proof of Lemma 2. This lemma is shown by mathematical induction. Fix arbitrary $n, h \in \mathbb{N}$ and $s, s' \in \mathcal{S}$ such that $s_{(i,h)} = s'_{(i,h)}$. First, for all $j \in N_{(i,h)}$, we have $s_j = s'_j$. We now derive $\sigma^*_j(s; \theta) - \sigma^*_j(s'; \theta)$ using a Taylor expansion, i.e.,

$$\sigma^*_j(s'; \theta) - \sigma^*_j(s; \theta) = \sum_{j' \in F_j} \sum_{\ell \in A} \frac{\partial \Gamma_j(s_j, \{\tilde\sigma_{j'} : j' \in F_j\})}{\partial \sigma_{j'\ell}} \cdot \big(\sigma^*_{j'\ell}(s'; \theta) - \sigma^*_{j'\ell}(s; \theta)\big),$$

where $\{\tilde\sigma_{j'} : j' \in F_j\}$ is a vector between $\{\sigma^*_{j'}(s; \theta) : j' \in F_j\}$ and $\{\sigma^*_{j'}(s'; \theta) : j' \in F_j\}$. By an argument similar to the proof of Lemma 1, we have

$$\|\sigma^*_j(s; \theta) - \sigma^*_j(s'; \theta)\|_1 \le \lambda \cdot \max_{j' \in F_j} \|\sigma^*_{j'}(s; \theta) - \sigma^*_{j'}(s'; \theta)\|_1 \le \lambda \cdot \max_{j' \in F_j} \big\{\|\sigma^*_{j'}(s; \theta)\|_1 + \|\sigma^*_{j'}(s'; \theta)\|_1\big\} = 2\lambda,$$

where the last inequality comes from the triangle inequality. Because, for all $j \in N_{(i,h-1)}$, any friend $j'$ of $j$ belongs to $N_{(i,h)}$, we have

$$\|\sigma^*_j(s; \theta) - \sigma^*_j(s'; \theta)\|_1 \le \lambda^2 \cdot \max_{j'' \in F_{j'},\, j' \in F_j} \|\sigma^*_{j''}(s; \theta) - \sigma^*_{j''}(s'; \theta)\|_1 \le 2\lambda^2.$$

By induction, for all $j \in N_{(i,h-q)}$ where $q \le h$, we have $\|\sigma^*_j(s; \theta) - \sigma^*_j(s'; \theta)\|_1 \le 2\lambda^{q+1}$. Hence, for any $q \le h$,

$$\max_{j \in N_{(i,h-q)}} \|\sigma^*_j(s; \theta) - \sigma^*_j(s'; \theta)\|_1 \le 2\lambda^{q+1}.$$

Because $i \in N_{(i,0)}$, it follows that $\|\sigma^*_i(s; \theta) - \sigma^*_i(s'; \theta)\|_1 \le 2\lambda^{h+1}$. By assumption B, $2\lambda^{h+1} \downarrow 0$ as $h \to \infty$.

A.3. Proof of Lemma 3. First, by assumption C, (5) can be rewritten as

$$\delta_{ik}(S) = \varphi_i'(S) \times \begin{pmatrix} \beta_k(X_i) \\ \alpha_k(X_i, Q_i) \end{pmatrix}.$$

We further multiply both sides by $\varphi_i(S)$ and obtain

$$\varphi_i(S) \times \delta_{ik}(S) = \varphi_i(S) \times \varphi_i'(S) \times \begin{pmatrix} \beta_k(X_i) \\ \alpha_k(X_i, Q_i) \end{pmatrix}.$$
Moreover, we take the conditional expectation on both sides given $X_i = x$ and $Q_i = q$:

$$E\big[\varphi_i(S) \times \delta_{ik}(S) \mid X_i = x, Q_i = q\big] = E\big[\varphi_i(S) \times \varphi_i'(S) \mid X_i = x, Q_i = q\big] \times \begin{pmatrix} \beta_k(x) \\ \alpha_k(x, q) \end{pmatrix},$$

from which we invert the coefficient vector $(\beta_k(x), \alpha_k'(x, q))'$.

APPENDIX B. ASYMPTOTIC PROPERTIES UNDER PARAMETRIC SETTING

For any $c \in \Theta$, let $L_n(c) = \frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} E\big[\sigma^*_{ik}(S; \theta) \ln \sigma^*_{ik}(S; c)\big]$. For arbitrary $\epsilon > 0$, let $B_\epsilon(\theta)$ be an open ball centered at $\theta$ with radius $\epsilon$ in the space $\Theta$.

B.1. Proof of Theorem 2. By Lemma 4, it suffices to check conditions (i)–(iii) in the lemma. By the identification argument and assumption F, condition (i) holds. Moreover, condition (iii) also holds by Lemma 5. Hence, it suffices to verify condition (ii), i.e., $\sup_{c \in \Theta} \big|\hat L(c) - L_n(c)\big| \overset{p}{\to} 0$.

By Lemmas 6 and 7, $\sum_{k=0}^{K} 1(Y_i = k) \ln \sigma^*_{ik}(S; \cdot)$ is bounded and continuous on $\Theta$. Since $\Theta$ is compact, $\mathcal{F}_n = \big\{\sum_{k \in A} 1(Y_i = k) \ln \sigma^*_{ik}(S; c) : c \in \Theta\big\}$ can be covered by a finite number of $\epsilon$-brackets. To apply the classical Glivenko-Cantelli argument, it suffices to show the pointwise law of large numbers, i.e., for any $c \in \Theta$, $\hat L(c) - L_n(c) \overset{p}{\to} 0$.

We pick an integer $d_n \propto 0.5 \ln n / \ln c_0$. Clearly, $d_n \to \infty$ as $n \to \infty$. Then we have

$$\begin{aligned}
\hat L(c) - L_n(c) ={}& \frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \big[1(Y_i = k) - \sigma^*_{ik}(S; \theta)\big] \ln \sigma^*_{ik}(S; c) \\
&+ \frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \big[\sigma^*_{ik}(S; \theta) \ln \sigma^*_{ik}(S; c) - \sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c)\big] \\
&+ \frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \Big\{\sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c) - E\big[\sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c)\big]\Big\} \\
&+ \frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \Big\{E\big[\sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c)\big] - E\big[\sigma^*_{ik}(S; \theta) \ln \sigma^*_{ik}(S; c)\big]\Big\}. \qquad (11)
\end{aligned}$$

For the first term on the right-hand side of eq.
(11), we have

$$\begin{aligned}
E\left\{\left[\frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \big(1(Y_i = k) - \sigma^*_{ik}(S; \theta)\big) \ln \sigma^*_{ik}(S; c)\right]^2 \,\middle|\, S\right\} &= \frac{1}{n^2} \sum_{i=1}^{n} E\left\{\left[\sum_{k \in A} \big(1(Y_i = k) - \sigma^*_{ik}(S; \theta)\big) \ln \sigma^*_{ik}(S; c)\right]^2 \,\middle|\, S\right\} \\
&\le \frac{1}{n} (K+1)^2 (\ln \sigma_0)^2 \to 0,
\end{aligned}$$

where the first step holds because the $Y_i$ are conditionally independent across $i$ given $S$ and $E[1(Y_i = k) \mid S] = \sigma^*_{ik}(S; \theta)$, and the last inequality is due to the fact that $\ln \sigma_0 \le \big[1(Y_i = k) - \sigma^*_{ik}(S; \theta)\big] \ln \sigma^*_{ik}(S; c) \le -\ln \sigma_0$ under Lemma 6.

Next, for the second term on the RHS of eq. (11), note that

$$\begin{aligned}
& E\big|\sigma^*_{ik}(S; \theta) \ln \sigma^*_{ik}(S; c) - \sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c)\big| \\
&\quad \le E\Big[\big|\sigma^*_{ik}(S; \theta) - \sigma^{d_n}_{ik}(S; \theta)\big| \cdot \big|\ln \sigma^*_{ik}(S; c)\big|\Big] + E\Big[\sigma^{d_n}_{ik}(S; \theta) \cdot \big|\ln \sigma^*_{ik}(S; c) - \ln \sigma^{d_n}_{ik}(S; c)\big|\Big] \\
&\quad \le -\ln \sigma_0 \cdot E\big|\sigma^*_{ik}(S; \theta) - \sigma^{d_n}_{ik}(S; \theta)\big| + \frac{1}{\sigma_0} \cdot E\big|\sigma^*_{ik}(S; c) - \sigma^{d_n}_{ik}(S; c)\big| \to 0.
\end{aligned}$$

Similarly, we can show that the last term in eq. (11) is also $o_p(1)$. Therefore, it suffices to show that the third term on the RHS of eq. (11) is also $o_p(1)$. Note that

$$\begin{aligned}
& E\left\{\frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \Big[\sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c) - E\big(\sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c)\big)\Big]\right\}^2 \\
&\quad = \frac{1}{n^2} \sum_{i,j=1}^{n} \mathrm{Cov}\left(\sum_{k \in A} \sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c),\ \sum_{k \in A} \sigma^{d_n}_{jk}(S; \theta) \ln \sigma^{d_n}_{jk}(S; c)\right).
\end{aligned}$$

By definition and assumption I, $\sigma^{d_n}_i(S; \theta)$ is independent of $\sigma^{d_n}_j(S; \theta)$ if there does not exist a player $m \in N_{(i,d_n)} \cap N_{(j,d_n)}$. By assumption J, there are at most $n \cdot (1 + c_0 + \cdots + c_0^{d_n}) \le n c_0^{d_n+1}$ pairs $(i, j)$ such that $\sigma^{d_n}_i(S; \theta)$ and $\sigma^{d_n}_j(S; \theta)$ are dependent. Moreover, for any $i$ and $j$,

$$\begin{aligned}
& 2\,\mathrm{Cov}\left(\sum_{k \in A} \sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c),\ \sum_{k \in A} \sigma^{d_n}_{jk}(S; \theta) \ln \sigma^{d_n}_{jk}(S; c)\right) \\
&\quad \le E\left[\sum_{k \in A} \sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c)\right]^2 + E\left[\sum_{k \in A} \sigma^{d_n}_{jk}(S; \theta) \ln \sigma^{d_n}_{jk}(S; c)\right]^2 \le 2(1+K)^2 (\ln \sigma_0)^2.
\end{aligned}$$
Therefore,

$$E\left\{\frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \Big[\sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c) - E\big(\sigma^{d_n}_{ik}(S; \theta) \ln \sigma^{d_n}_{ik}(S; c)\big)\Big]\right\}^2 \le \frac{1}{n^2} \cdot n c_0^{d_n+1} \cdot 2(1+K)^2 (\ln \sigma_0)^2 \propto \frac{1}{\sqrt{n}} \cdot 2 c_0 (1+K)^2 (\ln \sigma_0)^2 \to 0.$$

Lemma 4. Suppose (i) $\limsup_{n \to \infty} \sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big) < 0$ holds for any $\epsilon > 0$; (ii) $\hat L_n$ converges uniformly in probability to $L_n$, i.e., $\sup_{c \in \Theta} \big|\hat L_n(c) - L_n(c)\big| \overset{p}{\to} 0$; (iii) $\hat L_n(\hat\theta) \ge \sup_{c \in \Theta} \hat L_n(c) - o_p(1)$. Then $\hat\theta \overset{p}{\to} \theta$.

Proof. To prove the lemma, we modify the proof of Theorem 2.1 in Newey and McFadden (1994). Note that the objective function $L_n(\cdot)$ in our case depends on $n$, and it converges to a limit as $n$ goes to infinity. By (ii) and (iii), with probability approaching one (w.p.a.1),

$$L_n(\hat\theta) > \hat L_n(\hat\theta) - \eta/3 > \hat L_n(\theta) - 2\eta/3 > L_n(\theta) - \eta, \quad \forall \eta > 0.$$

Then, for any $\epsilon > 0$, choose $\eta = -\frac{1}{2} \limsup_{n \to \infty} \sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big) > 0$. It follows that, w.p.a.1,

$$L_n(\hat\theta) - L_n(\theta) > \frac{1}{2} \limsup_{n \to \infty} \sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big).$$

For sufficiently large $n$,

$$\sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big) - \limsup_{n \to \infty} \sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big) \le \eta = -\frac{1}{2} \limsup_{n \to \infty} \sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big),$$

which implies

$$\frac{1}{2} \limsup_{n \to \infty} \sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big) \ge \sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big).$$

Therefore, w.p.a.1, $L_n(\hat\theta) - L_n(\theta) > \sup_{c \notin B_\epsilon(\theta)} \big(L_n(c) - L_n(\theta)\big)$, which implies that $\hat\theta \in B_\epsilon(\theta)$ w.p.a.1. Because $\epsilon$ can be arbitrarily small, $\hat\theta \overset{p}{\to} \theta$.

Lemma 5. Suppose that assumptions A, G-(i) and H hold. Then, $\hat L(\hat\theta) \ge \sup_{c \in \Theta} \hat L(c) - o_p(1)$.

Proof. By the definition of $\hat\theta$, it suffices to show that $\sup_{c \in \Theta} \big|\hat Q(c) - \hat L(c)\big| \to 0$. Note that

$$\sup_{c \in \Theta} \big|\hat Q(c) - \hat L(c)\big| \le \sup_{c \in \Theta} \frac{1}{n} \sum_{i=1}^{n} \sum_{k \in A} \big|\ln \sigma^h_{ik}(S; c) - \ln \sigma^*_{ik}(S; c)\big|.$$
By a Taylor expansion,

$$\sum_{k \in A} \big|\ln \sigma^h_{ik}(S; c) - \ln \sigma^*_{ik}(S; c)\big| = \frac{1}{\sigma^\dagger} \sum_{k \in A} \big|\sigma^h_{ik}(S; c) - \sigma^*_{ik}(S; c)\big| \le \frac{2\lambda^{h+1}}{\sigma_0},$$

where $\sigma^\dagger$ is some real value between $\sigma^h_{ik}(S; c)$ and $\sigma^*_{ik}(S; c)$, and $\sigma_0$ is the lower bound of the equilibrium choice probability. The last step uses Lemmas 2 and 6. Thus,

$$\sup_{c \in \Theta} \big|\hat Q(c) - \hat L(c)\big| \le \frac{2\lambda^{h+1}}{\sigma_0}.$$

Because of assumption K and $\lambda < 1$, we have $\sup_{c \in \Theta} \big|\hat Q(c) - \hat L(c)\big| \overset{p}{\to} 0$.

B.2. Proof of Theorem 3.

Proof. First, by the proof of Lemma 5 and assumption K-(ii),

$$\sup_{c \in \Theta} \big|\hat Q(c) - \hat L(c)\big| \le \frac{2(K+1)\lambda^h}{\sigma_0} = o_p(n^{-1}).$$

Hence, $\hat L(\hat\theta) \ge \sup_{c \in \Theta} \hat L(c) - o_p(n^{-1})$, which implies that $\partial \hat L(\hat\theta) / \partial c = o_p(n^{-1/2})$. By a Taylor expansion, we have

$$\frac{\partial \hat L(\theta)}{\partial c} + \frac{\partial^2 \hat L(\theta^\dagger)}{\partial c\, \partial c'} (\hat\theta - \theta) = o_p(n^{-1/2})$$

for some $\theta^\dagger$ between $\theta$ and $\hat\theta$. Now it suffices to show:

$$\sqrt{n} \times \frac{\partial \hat L(\theta)}{\partial c} \overset{d}{\to} N\big(0, J(\theta)\big), \tag{12}$$

$$\frac{\partial^2 \hat L(\theta^\dagger)}{\partial c\, \partial c'} \overset{p}{\to} -J(\theta). \tag{13}$$

We first show eq. (12). Let $\xi_i = \frac{\partial}{\partial c} \sum_{k \in A} 1(Y_i = k) \ln \sigma^*_{ik}(S; c) \big|_{c = \theta}$. Note that the true parameter $\theta$ always maximizes the likelihood function $E\big[\sum_{k \in A} 1(Y_i = k) \ln \sigma^*_{ik}(S; \cdot) \mid S\big]$ for any $n$ and $S$. Thus $E(\xi_i \mid S) = 0$. By definition, $\partial \hat L(\theta) / \partial c = n^{-1} \sum_{i=1}^{n} \xi_i$. Then, it suffices to show that $n^{-1/2} \sum_{i=1}^{n} \xi_i \overset{d}{\to} N(0, J(\theta))$. Equivalently, we need to show $n^{-1/2} \sum_{i=1}^{n} J(\theta)^{-1/2} \xi_i \overset{d}{\to} N(0, 1_P)$, where $1_P$ is the $P$-by-$P$ identity matrix. For this, we show that the conditional distribution of $n^{-1/2} \sum_{i=1}^{n} J(\theta)^{-1/2} \xi_i$ given $S$ always converges to the same limiting normal distribution $N(0, 1_P)$. Because the $\xi_i$ are conditionally independent across $i$ given $S$,

$$E\left[n^{-1/2} \sum_{i=1}^{n} \xi_i \cdot n^{-1/2} \sum_{i=1}^{n} \xi_i' \,\middle|\, S\right] = n^{-1} \sum_{i=1}^{n} E\big[\xi_i \cdot \xi_i' \mid S\big].$$
By an argument similar to that in the proof of Theorem 2, we have

$$n^{-1} \sum_{i=1}^{n} E\big[\xi_i \cdot \xi_i' \mid S\big] = n^{-1} \sum_{i=1}^{n} E\big[\xi_i \cdot \xi_i'\big] + o_p(1) = J_n(\theta) + o_p(1) = J(\theta) + o_p(1).$$

Thus,

$$E\left[n^{-1/2} \sum_{i=1}^{n} \xi_i \cdot n^{-1/2} \sum_{i=1}^{n} \xi_i' \,\middle|\, S\right] \overset{p}{\to} J(\theta).$$

Hence, by the Lindeberg-Feller Theorem (see, e.g., Van der Vaart, 2000), conditional on $S$,

$$n^{-1/2} \sum_{i=1}^{n} J(\theta)^{-1/2} \xi_i \overset{d}{\to} N(0, 1_P).$$

We now show eq. (13). Under assumption G, it follows from Lemmas 6 and 7 that $\frac{\partial^2}{\partial c\, \partial c'} \sum_{k \in A} 1(Y_i = k) \ln \sigma^*_{ik}(S; c)$ is bounded above uniformly in $n$, $S$ and $\theta$, and is a smooth function of $c \in \Theta$. Hence, by an argument similar to the proof of Theorem 2,

$$\sup_{c \in \Theta} \left[\frac{\partial^2 \hat L(c)}{\partial c\, \partial c'} - \frac{1}{n} \sum_{i=1}^{n} E\left\{\frac{\partial^2}{\partial c\, \partial c'} \sum_{k \in A} 1(Y_i = k) \ln \sigma^*_{ik}(S; c)\right\}\right] \overset{p}{\to} 0.$$

Because $\theta^\dagger \overset{p}{\to} \theta$ and by assumption I, we have

$$\frac{\partial^2 \hat L(\theta^\dagger)}{\partial c\, \partial c'} = E\left\{\frac{\partial^2}{\partial c\, \partial c'} \sum_{k \in A} 1(Y_1 = k) \ln \sigma^*_{1k}(S; \theta)\right\} + o_p(1).$$

Moreover, by the information matrix equality,

$$E\left\{\frac{\partial^2}{\partial c\, \partial c'} \sum_{k \in A} 1(Y_1 = k) \ln \sigma^*_{1k}(S; \theta)\right\} = -J_n(\theta) = -J(\theta) + o(1).$$

Then eq. (13) is proved.

APPENDIX C. AUXILIARY LEMMAS

Lemma 6. Suppose assumptions A and G-(i) hold. Then there exists $\sigma_0 \in (0, 1)$ such that $\sigma^*_{ik}(S; c) \ge \sigma_0$ for all $n \in \mathbb{N}$, $i \in N$, $k \in A$ and $c \in \Theta$.

Proof. By assumption A, for all $(i, k) \in N \times A$,

$$\sigma^*_{ik}(S; c) = \frac{\exp\Big\{(X_i', Q_i) \cdot b_k + \sum_{\ell \in A} a_{k\ell} \frac{1}{Q_i} \sum_{j \in F_i} \sigma^*_{j\ell}(S; c)\Big\}}{1 + \sum_{\ell'=1}^{K} \exp\Big\{(X_i', Q_i)\, b_{\ell'} + \sum_{\ell \in A} a_{\ell'\ell} \frac{1}{Q_i} \sum_{j \in F_i} \sigma^*_{j\ell}(S; c)\Big\}}.$$

Because $0 \le \frac{1}{Q_i} \sum_{j \in F_i} \sigma^*_{j\ell}(S; c) \le 1$ and by assumption G-(i), the RHS has a lower bound, denoted by $\sigma_0 > 0$. Note that the above argument does not depend on the values of $n$, $i$, $k$ and $c$.

Lemma 7. Suppose that assumptions A and H hold. Then, $\sigma^*_{ik}(S; \cdot) \in C^\infty(\Theta)$ for all $n \in \mathbb{N}$, $S$, $i \in N$, $k \in A$ and $c \in \Theta$.
Proof. We fix an arbitrary $n$ and $S$ in the following analysis. By Lemma 1, $\{\sigma^*_i(S; c) : i \in N\}$ is the unique solution to the equation system: for all $(i, k) \in N \times A$,

$$\sigma^*_{ik} = \frac{\exp\Big[b_k(X_i, Q_i) + \sum_{\ell=0}^{K} a_k(\ell, X_i, Q_i) \cdot \sum_{j \in F_i} \sigma^*_{j\ell}\Big]}{1 + \sum_{q=1}^{K} \exp\Big[b_q(X_i, Q_i) + \sum_{\ell=0}^{K} a_q(\ell, X_i, Q_i) \cdot \sum_{j \in F_i} \sigma^*_{j\ell}\Big]}.$$

Let $\Sigma^* = (\sigma^*_1, \cdots, \sigma^*_n)$. Then the above equation system can be represented as

$$\Sigma^* = BR(S, \Sigma^*; c),$$

where $BR$ is the $n(K+1)$-dimensional mapping representing the best response functions for all $(i, k) \in N \times A$. Fix $S$. Clearly, $BR$ belongs to $C^\infty\big(\mathbb{R}^{n(K+1)} \times \Theta\big)$. Then, by the implicit function theorem, the solution $\sigma^*_i(S; \cdot) \in C^\infty(\Theta)$ for all $i \in N$.

APPENDIX D. CONSISTENT NONPARAMETRIC ESTIMATOR OF $P_{Y_i|S}$

The NDD condition is important for large network asymptotics. In particular, it allows us to nonparametrically estimate the probability distribution $P_{Y_i|S}$ using observations from one single large network. To illustrate, we consider the simple circle network where each player has two direct friends and the friendship is symmetric. Such a specification helps highlight key features of the consistency argument for the nonparametric estimation.

Because our asymptotic analysis considers a sequence of games with $n \to \infty$, we use $S_n$ with subscript $n$ to emphasize its dependence on the network size in the following analysis. The sequence of games is described as follows. Let the set of players $\{1, 2, \cdots, n\}$ for $n \ge 2$ be located on a circle network in the following way: first, we randomly pick a location for player 1 on the circle. Next, players 2 and 3 are placed on 1's left and right, respectively; then players 4 and 5 are placed on 2's left and 3's right, respectively; and so on. Thus we obtain a circle network with $n = +\infty$ in the limit.
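The labelling of players on the circle can be sketched in a few lines of Python. This is only an illustration of the construction described above, not code from the paper: the function names `circle_positions` and `friends` are hypothetical, and the sketch assumes $n \ge 3$ so that every player has two distinct friends.

```python
def circle_positions(n):
    """Players listed in left-to-right order along the circle.

    Player 1 sits in the middle; even-numbered players (2, 4, ...) extend
    the left arm and odd-numbered players (3, 5, ...) extend the right arm.
    Assumes n >= 3 (an assumption of this sketch).
    """
    left = [p for p in range(2, n + 1) if p % 2 == 0]   # 2, 4, 6, ... moving leftwards
    right = [p for p in range(3, n + 1) if p % 2 == 1]  # 3, 5, 7, ... moving rightwards
    return left[::-1] + [1] + right


def friends(n):
    """Map each player to her (left, right) direct friends on the closed circle."""
    order = circle_positions(n)
    return {
        p: (order[(idx - 1) % n], order[(idx + 1) % n])
        for idx, p in enumerate(order)
    }
```

For a finite $n$ the two arms meet, so the last player added on the left and the last player added on the right become friends; this closing link is what makes the network a circle rather than a line.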
Given the network, state variables $X_i$ are i.i.d. across all the players. Similarly to the probability theory in time series, the probability distribution of the sequence $\{S_n : n \ge 2\}$ is well defined. For simplicity, let $A = \{0, 1\}$ and $X_i \in \mathbb{R}$. W.l.o.g., we consider the estimation of $P(Y_i = 1 \mid S_n = s_n)$ for $i = 1$.

To begin with, we consider the case where $X_i$ is binary, i.e., $X_i \in \{0, 1\}$. It is straightforward to generalize our arguments to the case of multiple-valued $X_i$'s. The case of continuous $X_i$'s will be discussed later. Intuitively, a nonparametric estimator $\hat P(Y_1 = 1 \mid S_n = s_n)$ can be defined as follows:

$$\frac{\sum_{j=1}^{n} 1(Y_j = 1) \cdot 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot 1\big[X_{j(\ell)} = x_{1(\ell)}, \text{ for } \ell = -h, \cdots, h\big]}{\sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot 1\big[X_{j(\ell)} = x_{1(\ell)}, \text{ for } \ell = -h, \cdots, h\big]},$$

where $j(\ell)$ denotes the $|\ell|$-th left vertex of $j$ if $\ell < 0$; otherwise it refers to the $|\ell|$-th right vertex of $j$. Note that because of the circle network, $G_{(j,h)} = g_{(1,h)}$ a.s. Then, the term $1\big[G_{(j,h)} = g_{(1,h)}\big]$ is redundant in the above expression. As is shown in the proof of the next lemma, the above estimator is essentially a kernel estimator with a specific choice of bandwidth and a uniform kernel.

In the above estimator, the choice of $h$ is crucial for consistency, and it carries a bias-variance trade-off. Intuitively, $h \in \mathbb{N}$ needs to increase properly with $n$ such that $P(Y_1 = 1 \mid S_{(1,h)})$ converges to $P(Y_1 = 1 \mid S_n)$ (note that the approximation error is bounded by $2\xi^{h+1}$, where $|\xi| < 1$). On the other hand, we require that the number of observations with $G_{(j,h)} = g_{(1,h)}$ goes to infinity with the network size, so that the variance of the estimator decreases to zero as $n \to \infty$.

W.l.o.g., suppose $P(X_i = 0) \le 1/2$. Moreover, let $p_h \equiv P\big(S_{(1,h)} = s_{(1,h)}\big) = \prod_{j=1}^{2h+1} P(X_j = x_j)$.
By definition, $P(X_i = 0)^{2h+1} \le p_h \le P(X_i = 1)^{2h+1}$. Therefore, we have $p_h \to 0$ as $h \to \infty$.

Lemma 8. Suppose that assumptions A, F, G-(i), I and J hold. Suppose $h \to \infty$ and $\frac{h}{n p_h} \to 0$ as $n \to \infty$. Then

$$\hat P(Y_1 = 1 \mid S_n = s_n) - P(Y_1 = 1 \mid S_n = s_n) \overset{p}{\to} 0.$$

Proof. First note that

$$\hat P(Y_1 = 1 \mid S_n = s_n) = \frac{\frac{1}{n p_h} \sum_{j=1}^{n} 1(Y_j = 1) \cdot 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot 1\big[X_{j(\ell)} = x_{1(\ell)}, \text{ for } \ell = -h, \cdots, h\big]}{\frac{1}{n p_h} \sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot 1\big[X_{j(\ell)} = x_{1(\ell)}, \text{ for } \ell = -h, \cdots, h\big]}.$$

We now show that the denominator and the numerator converge to one and to $P\big(Y_1 = 1 \mid S_{(1,h)} = s_{(1,h)}\big)$, respectively. First, we look at the denominator and show

$$E\left\{\frac{1}{n p_h} \sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot 1\big[X_{j(\ell)} = x_{1(\ell)}, \text{ for } \ell = -h, \cdots, h\big]\right\} \to 1; \tag{14}$$

$$\mathrm{Var}\left\{\frac{1}{n p_h} \sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot 1\big[X_{j(\ell)} = x_{1(\ell)}, \text{ for } \ell = -h, \cdots, h\big]\right\} \to 0. \tag{15}$$

Regarding (14), we have

$$E\left\{\frac{1}{n p_h} \sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot 1\big[\{X_\ell : \ell \in N_{(j,h)}\} = \{x_\ell : \ell \in N_{(1,h)}\}\big]\right\} = \frac{1}{p_h} E\Big\{1\big[S_{(1,h)} = s_{(1,h)}\big]\Big\} = 1.$$

To establish (15), note that

$$\begin{aligned}
& \mathrm{Var}\left\{\frac{1}{n p_h} \sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot 1\big[\{X_\ell : \ell \in N_{(j,h)}\} = \{x_\ell : \ell \in N_{(1,h)}\}\big]\right\} \\
&\quad = \frac{1}{n^2 p_h^2} \sum_{\ell=1}^{n} \sum_{j \neq \ell} \mathrm{Cov}\Big\{1\big[S_{(j,h)} = s_{(1,h)}\big],\ 1\big[S_{(\ell,h)} = s_{(1,h)}\big]\Big\} + \frac{1}{n p_h^2} \mathrm{Var}\Big\{1\big[S_{(1,h)} = s_{(1,h)}\big]\Big\} \\
&\quad = \frac{1}{n p_h^2} \sum_{j \neq 1} \mathrm{Cov}\Big\{1\big[S_{(j,h)} = s_{(1,h)}\big],\ 1\big[S_{(1,h)} = s_{(1,h)}\big]\Big\} + \frac{1 - p_h}{n p_h} \\
&\quad = \frac{1}{n p_h^2} \sum_{j=2}^{2h+1} \mathrm{Cov}\Big\{1\big[S_{(j,h)} = s_{(1,h)}\big],\ 1\big[S_{(1,h)} = s_{(1,h)}\big]\Big\} + \frac{1 - p_h}{n p_h},
\end{aligned}$$
T hus, V ar ( 1 n p h n ∑ j = 1 1 h G ( j , h ) = g ( 1, h ) i · 1 h { X ℓ : ℓ ∈ N ( j , h ) } = { x ℓ : ℓ ∈ N ( 1, h ) } i ) ≤ 2 h n p 2 h × V ar n 1 S ( j , h ) = s ( 1, h ) o + V ar n 1 S ( 1, h ) = s ( 1, h ) o 2 + 1 − p h n p h = ( 2 h + 1 ) ( 1 − p h ) n p h ∝ h n p h → 0. It follows that 1 n p h n ∑ j = 1 1 h G ( j , h ) = g ( 1, h ) i · 1 h X j ( ℓ ) = x 1 ( ℓ ) , for ℓ = − h , · · · , h i p → 1 By a similar argument, we have E ( 1 n p h n ∑ j = 1 1 ( Y j = 1 ) · 1 h G ( j , h ) = g ( 1, h ) i · 1 h X j ( ℓ ) = x 1 ( ℓ ) , for ℓ = − h , · · · , h i ) = P ( Y 1 = 1 | S ( 1, h ) = s ( 1, h ) ) = P ( Y 1 = k | S n = s n ) + o ( | ξ | h ) and V ar ( 1 n p h n ∑ j = 1 1 ( Y j = 1 ) · 1 h G ( j , h ) = g ( 1, h ) i · 1 h X j ( ℓ ) = x 1 ( ℓ ) , for ℓ = − h , · · · , h i ) → 0. Moreover , by Slutsky ’s theorem, we establish the consistency of the proposed es timator . In Lemma 8, it is r equired that h s hould incr ease t o infinity with n , but sufficiently slow . I n particular , the conditions imply p h → 0 and n p h → ∞ as n → ∞ . This sugg ests that the term p h plays the same role as the bandwidth in kernel est imation. In add ition, because of the de pende n ce bet ween S ( j , h ) and S ( i , h ) for ρ ( i , j ) ≤ h , we requir e n p h incr ease to infinity faster than h . Sup p ose one chooses h = [ h 0 × ln n ] for some constant h 0 > 0. Then, p h ∝ n − κ where κ > 0 that is determined by h 0 and P ( X i = 0 ) . Then the restrictions on h in Lemma 8 ar e satisfied if κ is sufficiently small. 43 Suppose X i is continuously d istributed. Let f X be t h e pd f of X i . Fo r s implicity , W e assume 0 < inf x ∈ R f X ( x ) < sup x ∈ R f X ( x ) < ∞ . As usual, additional assumptions on the st ructural parameters ar e needed to en s ure P ( Y 1 = 1 | S n = s n ) is R –th ( R ≥ 2) order continuously d if ferentiab le in each argument of S n . 
Moreover, a nonparametric estimator is defined by

$$\hat P(Y_1 = 1 \mid S_n = s_n) = \frac{\sum_{j=1}^{n} 1(Y_j = 1) \cdot 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot \prod_{\ell=1}^{2h+1} K\left(\frac{X_\ell - x_\ell}{b_\ell}\right)}{\sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot \prod_{\ell=1}^{2h+1} K\left(\frac{X_\ell - x_\ell}{b_\ell}\right)},$$

where $K$ and $b_\ell$ for $\ell = 1, \cdots, 2h+1$ are an $R$-th order kernel function and the bandwidths, respectively. For consistency, we need to choose $h \to \infty$ and $b_\ell \to 0$ for $\ell = 1, \cdots, 2h+1$ properly as $n \to \infty$. For simplicity, let $b_\ell f_X(x_\ell) = p$ for some $p \equiv p_n > 0$. Moreover, let $h \to \infty$, $p \to 0$ and $h / (n p^{2h+1}) \to 0$ as $n \to \infty$. By an argument similar to Lemma 8 and Bochner's Lemma, we can show consistency of the kernel estimator. In particular, we have

$$E\left\{\frac{1}{n p^{2h+1}} \sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot \prod_{\ell=1}^{2h+1} K\left(\frac{X_\ell - x_\ell}{b_\ell}\right)\right\} = 1 + O(p^R)$$

and

$$\mathrm{Var}\left\{\frac{1}{n p^{2h+1}} \sum_{j=1}^{n} 1\big[G_{(j,h)} = g_{(1,h)}\big] \cdot \prod_{\ell=1}^{2h+1} K\left(\frac{X_\ell - x_\ell}{b_\ell}\right)\right\} = O\left(\frac{h}{n p^{2h+1}}\right),$$

and similar expressions hold for the numerator of the kernel estimator, which together yield consistency.
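The binary-covariate version of the estimator in this appendix can be illustrated with a small simulation. The sketch below is not the paper's code, and everything specific in it is an assumption made for illustration: a binary action, a logit best response in which each player reacts to the average choice probability of her two neighbours, the coefficient values `b0`, `b1`, `a`, and the function names. Because the logistic function has slope at most 1/4, the chosen `a = 1.0` makes the best-response map a contraction, so plain fixed-point iteration recovers the unique equilibrium in the spirit of Lemma 1. Arrays are indexed by position on the circle, and position 0 plays the role of player 1.

```python
import numpy as np


def equilibrium(x, b0=-0.5, b1=1.0, a=1.0, tol=1e-12, max_iter=10_000):
    """Equilibrium choice probabilities on a circle by fixed-point iteration.

    Illustrative model (an assumption of this sketch):
    sigma_i = logistic(b0 + b1 * x_i + a * mean of the two neighbours' sigma).
    """
    sigma = np.full(x.shape, 0.5)
    for _ in range(max_iter):
        peer = 0.5 * (np.roll(sigma, 1) + np.roll(sigma, -1))  # left and right neighbours
        new = 1.0 / (1.0 + np.exp(-(b0 + b1 * x + a * peer)))
        if np.max(np.abs(new - sigma)) < tol:
            return new
        sigma = new
    raise RuntimeError("fixed-point iteration did not converge")


def frequency_estimate(y, x, i, h):
    """Average of y_j over players j whose (2h+1)-window of x matches player i's window."""
    n = len(x)
    offsets = np.arange(-h, h + 1)
    windows = x[(np.arange(n)[:, None] + offsets[None, :]) % n]  # shape (n, 2h+1)
    match = np.all(windows == windows[i], axis=1)                # always includes j = i
    return y[match].mean(), int(match.sum())


def simulate(n=20_000, h=2, seed=0):
    """Draw one large circle network and compare the estimate with the true probability."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n).astype(float)   # binary covariate, P(X = 1) = 1/2
    sigma = equilibrium(x)
    y = (rng.random(n) < sigma).astype(float)      # actions drawn from equilibrium probabilities
    est, n_match = frequency_estimate(y, x, i=0, h=h)
    return est, sigma[0], n_match
```

Calling `simulate()` returns the frequency estimate, the true equilibrium probability of the reference player, and the number of matched windows. Raising `h` shrinks the approximation error but also shrinks the number of matches, which is the bias-variance trade-off discussed before Lemma 8.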