The End of Rented Discovery: How AI Search Redistributes Power Between Hotels and Intermediaries
When a traveler asks an AI search engine to recommend a hotel, which sources get cited -- and does query framing matter? We audit 1,357 grounding citations from Google Gemini across 156 hotel queries in Tokyo and document a systematic pattern we call…
Authors: Peiying Zhu, Sidi Chang
Peiying Zhu* (peiying@blossomai.co), Blossom AI, San Francisco, CA, USA
Sidi Chang*† (schang@blossomai.co), Blossom AI, San Francisco, CA, USA

Abstract

When a traveler asks an AI search engine to recommend a hotel, which sources get cited, and does query framing matter? We audit 1,357 grounding citations from Google Gemini across 156 hotel queries in Tokyo and document a systematic pattern we call the Intent-Source Divide. Experiential queries draw 55.9% of their citations from non-OTA sources, compared to 30.8% for transactional queries, a 25.1 percentage-point gap ($p < 5 \times 10^{-20}$). The effect is amplified in Japanese, where experiential queries draw 62.1% non-OTA citations compared to 50.0% in English, consistent with a more diverse Japanese non-OTA content ecosystem. For an industry in which hotels have long paid OTAs for demand acquisition, this pattern matters because it suggests that AI search may make hotel discovery less exclusively controlled by commission-based intermediaries.

Keywords: generative engine optimization, AI search, hotel discovery, online travel agencies, hotel intermediation, intent-source divide, hospitality strategy, platform economics

1 Introduction

The hospitality distribution landscape has long been organized around a simple bargain: Online Travel Agencies (OTAs) help travelers discover hotels, and hotels pay for that discovery through commission. Commission rates commonly range from 15% to 25% of booking value [9], and for independent properties OTA-originated bookings can represent over 61% of room revenue [8]. This arrangement is costly, but hotels have historically accepted it because OTAs control the comparison interface that travelers use at the moment of discovery.
The economics are well documented: OTAs boost visibility and occupancy but extract high margins [2, 10], and market entry dynamics show that brand loyalty consolidated through OTA platforms can deter independent hotel entries [3].

Generative AI search may change that arrangement. When a traveler asks Gemini or another AI system for a hotel recommendation, the system no longer merely ranks links; it synthesizes an answer and cites the sources it relies on. The strategic question for hotels is therefore no longer only whether they appear on OTA platforms, but whether AI systems continue to route discovery through those platforms or begin citing hotel-owned content and other non-OTA sources directly. In that sense, AI search is not simply a new marketing channel; it is a possible redistribution of who controls hotel discovery.

* Both authors contributed equally to this research. † Corresponding author.

1.1 GEO and Citation Audits: What We Know About AI Search

Generative AI search engines synthesize answers from multiple sources and cite those sources as grounding references, creating a discovery architecture where visibility may depend on content relevance rather than page authority alone. The emerging GEO literature has begun to characterize these patterns: traditional SEO signals do not straightforwardly predict AI citation [1], AI search shows systematic bias toward editorial content over brand-owned content [5], query intent significantly shapes source selection [6], and AI citations concentrate heavily among a small number of outlets [16]. Citation audits across platforms have identified sentiment, commercial, and geographic biases [14] and a “discovery gap” in which discovery-style queries produce very different citation patterns than named-entity queries [15]. Our study adapts this citation-audit methodology for the hospitality domain.

1.2 Theoretical Grounding: Query Intent and Source Diversity

Our study draws on two established lines of research. First, the information retrieval literature has long recognized that query intent shapes search behavior and result characteristics. Broder [4] proposed a foundational taxonomy distinguishing navigational, informational, and transactional queries, demonstrating that different intent types activate different retrieval strategies and satisfy different information needs. Our transactional/experiential distinction maps onto this taxonomy: transactional hotel queries resemble Broder’s transactional class (goal: complete a specific action), while experiential queries resemble the informational class (goal: acquire knowledge about a topic). The key insight from this literature is that query intent is not merely a user characteristic but a structural feature of the query that determines what kinds of web content constitute a satisfactory answer.

Second, the algorithm auditing literature [11, 13] has established that search engine ranking algorithms embed systematic biases in which information sources gain visibility. Our study extends this tradition to generative AI search, where the mechanism shifts from ranking to citation: the AI system does not rank sources but selects which to cite as grounding for its synthesized response. The citation-audit methodology we employ is conceptually equivalent to the ranking-audit methodology of traditional search engine studies, adapted for the generative paradigm.

These two strands converge in a simple theoretical prediction: if AI search operates as a content-matching system (retrieving sources whose content best matches the query’s information need), then queries with different intent types should systematically retrieve different source types, because the web content that best answers a transactional query (structured data, price comparisons, booking interfaces) differs from the content that best answers an experiential query (narrative descriptions, editorial recommendations, atmosphere characterizations). The intent-source divide, if observed, would be consistent with content-matching playing a significant role in AI search citation, though it would not rule out authority signals as a contributing factor.

1.3 The Gap: No Study Has Done This for Hospitality

Despite growing work on AI search citation patterns and on hotel-OTA distribution economics, to our knowledge, no study has examined how AI search changes intermediation in hospitality. This gap matters because hotels are a rare setting where the discovery intermediary is clearly identifiable and economically important. If Gemini cites Booking.com, Expedia, or Jalan, the hotel is being surfaced through a commission-based intermediary. If Gemini cites the hotel’s own website, the hotel is competing for discovery more directly. Citation source does not tell us exactly where the guest will book, but it does tell us who captured the discovery moment when the recommendation was made. In a market where discovery has long been rented from OTAs, that distinction has direct strategic and economic significance.
Additionally, hotels have a spatial dimension that interacts with query intent (a station-access query differs from a neighborhood-atmosphere query), and the three-tier market structure (international chains, domestic chains, independents) creates natural variation in web presence, content depth, and OTA dependency.

1.4 Why Tokyo

We study this question in Tokyo for three reasons that make it an ideal natural laboratory. First, Tokyo’s hotel market exhibits a well-defined three-tier structure: international chains (Marriott, Hyatt, Hilton, InterContinental) with global brand websites optimized for English-language discovery; domestic chains (APA Hotels, Toyoko Inn, Dormy Inn, Mitsui Garden Hotels) with content-rich Japanese-language websites serving a large domestic market; and independent properties (ryokans, boutique hotels, capsule hotels, design hotels) with highly variable web presences. This three-tier structure generates natural variation in website content depth, language coverage, and OTA dependency that allows us to examine whether AI citation patterns systematically differ across market segments.

Second, Tokyo operates within two linguistically separate web ecosystems. English-language and Japanese-language web content about Tokyo hotels are largely distinct corpora: they feature different domains, different editorial voices, different OTA platforms (Booking.com and Expedia dominate the English web; Jalan, Rakuten Travel, and Ikyu dominate the Japanese web), and different non-OTA content types. Because Google Search grounding (the mechanism through which Gemini retrieves source material) searches the web in the query language, English and Japanese queries effectively surface different information markets.
This bilingual structure allows us to test whether AI citation patterns reflect properties of the underlying web ecosystem rather than properties of the AI model itself.

Third, Tokyo is one of the world’s largest inbound tourism markets: Japan recorded 36.9 million foreign visitors in 2024, the highest figure in the country’s history, with Tokyo as the primary gateway [12]. Hotel discoverability in Tokyo has substantial economic consequences. The combination of high visitor volume, a competitive hotel market, and the coexistence of international and domestic booking platforms makes Tokyo a setting where the question of AI search intermediation is both empirically tractable and practically significant.

1.5 Research Questions

We pose three research questions:

(1) Discovery intermediation by intent. Does query framing shift AI hotel discovery away from OTA intermediaries?
(2) Language and ecosystem structure. Does the shift depend on the language-specific web ecosystem?
(3) Hotel-direct citation and content depth. Do hotel-owned websites appear in AI citations, and what content characteristics appear to distinguish cited from non-cited properties?

These questions are addressed through a systematic audit of 1,357 grounding citations from 156 queries to Gemini 2.5 Flash, using a paired query design that isolates the effect of intent framing on source selection. This is a hospitality strategy paper about AI-era hotel discovery. It uses GEO concepts and citation-audit methods as the mechanism and methodology, but its core contribution is to the question of who controls hotel discovery as the search interface shifts from link-based ranking to AI-synthesized recommendation.

2 Data and Methods

2.1 Query Design

We designed a structured query corpus of 156 queries organized as category-matched pairs.
Each pair consists of a transactional variant and an experiential variant addressing the same traveler need category in the same geographic area, executed in both English and Japanese. This paired design ensures that the only variable changing between queries within a pair is the intent framing (not the topic, geographic scope, or language), allowing us to isolate the effect of intent on citation source composition.

Query categories. We identified four traveler need categories that span the primary dimensions of hotel search:

(1) Budget: price-sensitive discovery, where OTAs have a strong structural advantage due to price comparison functionality;
(2) Rating/Quality: quality-oriented discovery, where aggregated review scores and star ratings are the primary signals;
(3) Convenience: location and access-oriented discovery, where proximity to transit nodes and walkability are the key criteria; and
(4) Business: work-oriented discovery, where workspace quality, quiet environments, and business amenities define the need.

These categories were selected to cover the range of functional and experiential dimensions that prior research on hotel choice behavior has identified as primary decision factors.

For each category, the transactional variant was designed to resemble a booking-oriented query, the kind of query that would naturally lead to an OTA listing (e.g., “Cheap hotel in Shinjuku,” “Best rated hotel in Shibuya”). The experiential variant was designed to describe the same need in terms of the guest experience rather than a bookable attribute (e.g., “Good value hotel with local charm in Shinjuku,” “Hotel with exceptional service and atmosphere in Shibuya”).
Japanese translations were crafted to be natural-sounding queries rather than literal translations, preserving the intent distinction in idiomatic Japanese (e.g., transactional: “新宿で安いホテル”; experiential: “新宿で地元の雰囲気が楽しめるコスパの良いホテル”).

Geographic scope. Area-level queries were executed across 9 of Tokyo’s 23 special wards, selected on the basis of hotel density and tourism significance. The nine wards comprise the Toshin 5-ku (都心5区) core (Chiyoda, Chuo, Minato, Shinjuku, and Shibuya), which constitutes Tokyo’s central business and tourism district, plus four additional wards with high concentrations of tourist-relevant hotels: Taito (Asakusa and Ueno, the traditional ryokan district), Toshima (Ikebukuro, a major budget hotel hub), Shinagawa (a Shinkansen gateway and business hotel cluster), and Koto (Toyosu and Odaiba, with newer hotel developments). The remaining 14 wards were excluded because they contain relatively few tourist-oriented hotels and would have generated queries about predominantly residential areas, introducing noise without proportionate analytical value. In addition to the 144 area-level queries (4 categories × 2 intents × 9 wards × 2 languages), we included 12 city-level queries (3 categories × 2 intents × 2 languages) that address Tokyo as a whole, yielding a total of 156 queries. Appendix B provides sample queries illustrating the paired design.

2.2 Data Collection

All 156 queries were executed on Gemini 2.5 Flash with Google Search grounding enabled in March 2026, using default generation parameters (temperature = 1.0, no seed; see Appendix A for full configuration). When grounding is active, Gemini issues Google Search queries in the input language, retrieves top results, and synthesizes a response citing the consulted web pages.
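The corpus size given in Section 2.1 follows from a simple factorial grid. The sketch below enumerates the area-level grid and checks the stated totals. The category identifiers and language codes are labels chosen here for illustration. The text does not say which three categories were used for the city-level queries, so only their count is modelled.

```python
from itertools import product

# Illustrative labels (assumed names; the paper describes these in prose).
CATEGORIES = ["budget", "rating_quality", "convenience", "business"]
INTENTS = ["transactional", "experiential"]
LANGUAGES = ["en", "ja"]
# The nine wards listed under "Geographic scope".
WARDS = [
    "Chiyoda", "Chuo", "Minato", "Shinjuku", "Shibuya",
    "Taito", "Toshima", "Shinagawa", "Koto",
]

# Area-level grid: 4 categories x 2 intents x 9 wards x 2 languages.
area_queries = list(product(CATEGORIES, INTENTS, WARDS, LANGUAGES))

# City-level queries: 3 categories x 2 intents x 2 languages.
# Which three categories is not stated, so only the count is used.
N_CITY_CATEGORIES = 3
city_query_count = N_CITY_CATEGORIES * len(INTENTS) * len(LANGUAGES)

total_queries = len(area_queries) + city_query_count

# Group area-level queries into matched pairs: same category, ward, and
# language, differing only in intent framing.
pairs = {}
for category, intent, ward, language in area_queries:
    pairs.setdefault((category, ward, language), []).append(intent)

print(len(area_queries), city_query_count, total_queries, len(pairs))
# prints: 144 12 156 72
```

Each of the 72 area-level pairs contains exactly one transactional and one experiential query, which is the property the paired design relies on.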
Each query was executed once; because Gemini’s grounding output is non-deterministic, we additionally re-ran a stratified subset of 20 queries five times each to assess test-retest reliability (see Appendix E). We extracted 1,357 grounding citations across 156 queries, yielding a mean of 8.7 citations per query (range: 2–32, CV: 53%). To address variability in citation counts, we employ both citation-weighted and query-weighted analytical approaches (see Section 2.5).

2.3 Source Classification

Each grounding citation was classified as either OTA (booking-oriented intermediary, including global, Japanese, APAC, and meta-search platforms) or non-OTA. Non-OTA sources were further classified into nine sub-types: hotel direct, editorial curation, travel blog, travel media, local tourism, coworking/workspace, travel agency, user-generated content, and accommodation platform. Classification was by domain matching against curated lists (see Appendix C).

2.4 Hotel Name Extraction and Tier Classification

Hotel names were extracted from Gemini’s responses and classified into three tiers: international chain (Marriott, Hyatt, Hilton, IHG, Accor and sub-brands), domestic chain (APA Hotel, Toyoko Inn, Dormy Inn, Super Hotel, Mitsui Garden, Prince Hotels, and others), or independent. Classification follows the guest-facing brand name (the name a traveler would recognize), not the parent company. A hotel qualifies as “chain” if its brand name appears on three or more properties, a threshold that captures the brand recognition and cross-property consistency relevant to traveler search behavior while excluding one-off coincidental name overlaps.

2.5 Statistical Methods

We test the Intent-Source Divide using chi-squared tests, odds ratios with 95% confidence intervals, and Cramér’s V for effect size.
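For reference, the three quantities can be written out for a 2×2 table with cells $a$ (experiential, non-OTA), $b$ (experiential, OTA), $c$ (transactional, non-OTA), $d$ (transactional, OTA) and $N = a + b + c + d$. The cell labels are notation introduced here. The exact variants are not stated in the text: the forms below assume a Yates continuity correction and a Wald interval on the log odds ratio. These are the variants that reproduce the values reported in Section 3.1 from the published counts; the uncorrected statistic would be about 85.24 rather than 84.23.

```latex
\chi^2 = \frac{N\left(\lvert ad - bc\rvert - N/2\right)^2}{(a+b)(c+d)(a+c)(b+d)},
\qquad
\mathrm{OR} = \frac{ad}{bc},
\qquad
\mathrm{CI}_{95\%} = \exp\!\left(\ln \mathrm{OR} \pm 1.96\sqrt{\tfrac{1}{a}+\tfrac{1}{b}+\tfrac{1}{c}+\tfrac{1}{d}}\right),
\qquad
V = \sqrt{\frac{\chi^2}{N}}
```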
To adjust for language and query category, we estimate logistic regression models at the citation level ($n = 1{,}357$) and a query-level quasibinomial specification ($n = 156$) to address non-independence of citations within queries. A Mann-Whitney U test provides a query-weighted robustness check. Full test details, coefficients, and robustness results are reported in Appendices D–E. Citations are clustered within queries and shared templates, so citation-level p-values and confidence intervals should be interpreted with caution. The query-level model confirms that results are substantively unchanged when this clustering is accounted for.

3 Results

Across 156 Gemini 2.5 Flash queries, we extracted 1,357 grounding citations. OTAs account for 55.3% of all citations (751 of 1,357), with the remainder distributed across hotel direct websites, editorial platforms, travel blogs, and other non-OTA sources. We observed systematic variation in this OTA/non-OTA split by query intent, language, and category. Because citations are partially clustered within queries and shared templates, the citation-level results below should be interpreted as substantively informative but somewhat optimistic in precision. The main pattern, however, is large in magnitude and is corroborated by query-level robustness checks.

3.1 The Intent-Source Divide

The central empirical finding is a substantial difference in non-OTA citation rates between experiential and transactional hotel queries. Non-OTA sources account for 55.9% of experiential citations (419 of 750) but only 30.8% of transactional citations (187 of 607), a gap of 25.1 percentage points. A chi-squared test confirms that this difference is highly significant: $\chi^2(1) = 84.23$, $p = 4.40 \times 10^{-20}$.
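The counts just reported (419 of 750 experiential and 187 of 607 transactional citations from non-OTA sources) are enough to recompute the test statistic and the effect sizes named in Section 2.5. The sketch below uses only the standard library. It assumes a Yates continuity correction and a Wald interval on the log odds ratio; neither is stated in the text, but both match the published figures.

```python
import math

# 2x2 counts from Section 3.1.
a, b = 419, 750 - 419   # experiential: non-OTA, OTA
c, d = 187, 607 - 187   # transactional: non-OTA, OTA
n = a + b + c + d

# Chi-squared with Yates continuity correction (assumed variant).
numerator = n * (abs(a * d - b * c) - n / 2) ** 2
denominator = (a + b) * (c + d) * (a + c) * (b + d)
chi2 = numerator / denominator

# Upper-tail p-value for chi-squared with 1 degree of freedom:
# P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# Unadjusted odds ratio with a Wald 95% interval on the log scale.
odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

# Cramer's V for a 2x2 table.
cramers_v = math.sqrt(chi2 / n)

print(f"chi2={chi2:.2f} p={p_value:.2e}")
print(f"OR={odds_ratio:.2f} CI=[{ci_low:.2f}, {ci_high:.2f}] V={cramers_v:.3f}")
```

These are citation-level quantities, so the caution about clustering within queries applies to them as well.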
The unadjusted odds ratio is 2.84 (95% CI [2.27, 3.56]): the odds of citing a non-OTA source are nearly three times as high for experiential queries as for transactional queries. The association is moderate in size by Cramér’s V ($V = 0.249$).

To examine whether this pattern holds after accounting for language and query category, we estimated two logistic regression models predicting the probability that a citation comes from a non-OTA source. Full coefficients are reported in Appendix D (Tables 8–9). In the main-effects model (Model 1), experiential intent is the strongest predictor (adjusted OR = 2.95, 95% CI [2.34, 3.71], $p < 0.001$). Japanese queries also produce significantly more non-OTA citations than English queries (OR = 1.33, 95% CI [1.06, 1.67], $p = 0.015$). Relative to budget queries, business and convenience queries produce significantly more non-OTA citations (business OR = 2.57, convenience OR = 2.00).

Model 2 adds an interaction between experiential intent and Japanese-language queries. This interaction is significant (OR = 1.77, 95% CI [1.12, 2.80], $p = 0.015$; AIC 1732.1 vs. 1736.0; likelihood-ratio $\chi^2(1) = 5.90$, $p = 0.015$), indicating that the non-OTA boost from experiential framing is amplified in Japanese. For Japanese-language experiential queries, 62.1% of citations come from non-OTA sources, nearly double the 31.8% observed for English transactional queries (direct cell comparison: OR = 3.52, 95% CI [2.53, 4.90], $p < 10^{-14}$). This is consistent with the descriptive language split reported in Section 3.2.

The intent-source divide is robust across all four query categories. Table 1 presents the category-level breakdown. Several patterns are notable. Budget queries show the lowest non-OTA share in the transactional condition (15.5%), consistent with the structural advantage that OTA platforms hold in price-comparison functionality.
Business queries show the largest absolute gap (28.4 percentage points) and the highest non-OTA share in the experiential condition (68.5%), driven by coworking review platforms, hotel direct websites describing workspace amenities, and specialized business travel content that OTA listings do not provide. Convenience queries show the smallest gap (19.0 pp), suggesting that location and access information is somewhat available across both OTA and non-OTA sources. All four category-level gaps remain significant under Bonferroni correction (adjusted $\alpha = 0.0125$).

The intent-source divide is robust to query-weighted analysis, in which each query contributes equally regardless of citation count (query-weighted non-OTA rates: 55.1% experiential vs. 27.1% transactional; Mann-Whitney $U = 4{,}721$, $p < 10^{-9}$). It also survives controls for query length and lexical richness, answer-type specificity tests, and test-retest replication across five independent runs (ICC = 0.656). Full robustness results are reported in Appendix E; a summary of all statistical tests is in Appendix D (Table 10).

Table 1: Non-OTA Citation Rate by Query Category and Intent (T = transactional, E = experiential)

Category        T non-OTA%  E non-OTA%  Gap (pp)  95% CI        $\chi^2(1)$  p-value
Budget          15.5%       42.5%       27.0      [17.5, 36.5]  26.00        $3.4 \times 10^{-7}$
Rating/Quality  22.1%       46.7%       24.6      [13.6, 35.5]  16.78        $4.2 \times 10^{-5}$
Business        40.1%       68.5%       28.4      [18.1, 38.7]  25.81        $3.8 \times 10^{-7}$
Convenience     40.1%       59.2%       19.0      [9.3, 28.8]   13.29        $2.7 \times 10^{-4}$

3.2 Language, Ecosystem Structure, and Non-OTA Composition

The intent-source divide operates differently across languages.

Table 2: OTA and Non-OTA Citation Rates by Language and Intent

Segment           OTA%   Non-OTA%  N
EN Transactional  68.2%  31.8%     274
EN Experiential   50.0%  50.0%     386
JP Transactional  70.0%  30.0%     333
JP Experiential   37.9%  62.1%     364
Table 2 presents the four-segment breakdown. The most striking cell is Japanese experiential: 62.1% of citations come from non-OTA sources. In English, the intent gap is 18.2 percentage points; in Japanese, 32.1 percentage points, nearly twice as large. The interaction term confirms this amplification (OR = 1.77, $p = 0.015$). The divergence is concentrated in experiential queries (EN 50.0% OTA vs. JP 37.9%, a 12.1 pp gap) rather than transactional queries (EN 68.2% vs. JP 70.0%, a 1.8 pp gap).

This amplification is consistent with the structure of two effectively separate web ecosystems. Japanese queries cite Japanese-domain pages 68.4% of the time; English queries cite Japanese-domain pages only 6.4% of the time. The non-OTA channel accounts for 42.4% of English citations (280 of 660) and 46.8% of Japanese citations (326 of 697), with significantly different composition ($\chi^2(10) = 171.90$, $p < 0.001$). Table 3 presents the breakdown.

The English non-OTA channel is dominated by travel blogs (22.9%) and editorial curation sites (22.5%), that is, international content creators writing about Tokyo for foreign tourists. The Japanese non-OTA channel is more diverse: hotel direct websites are the largest category (23.6%), but the most distinctive feature is the presence of source types that are absent or rare in English, namely travel agency sites (12.9% vs. 0.0%) and coworking/workspace platforms (8.3% vs. 2.5%). These are not better or worse ecosystems; they are differently structured, with different source types available to absorb the OTA share when queries become experiential.
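The amplification can be illustrated directly from the Table 2 percentages. The sketch below reconstructs cell counts by rounding share × N; the table reports only percentages, so these counts are an assumption. It then computes the intent gap in each language and the ratio of the two within-language odds ratios. That ratio is an unadjusted analogue of the Model 2 interaction term. It is not the model estimate, which also controls for query category.

```python
# Table 2: (non-OTA share, N) per language/intent segment.
TABLE_2 = {
    ("en", "transactional"): (0.318, 274),
    ("en", "experiential"): (0.500, 386),
    ("ja", "transactional"): (0.300, 333),
    ("ja", "experiential"): (0.621, 364),
}

# Reconstructed counts (assumption: round share * N to the nearest integer).
counts = {
    key: (round(share * n), n - round(share * n))  # (non-OTA, OTA)
    for key, (share, n) in TABLE_2.items()
}

def intent_gap_pp(language):
    """Experiential minus transactional non-OTA share, in percentage points."""
    exp_share = TABLE_2[(language, "experiential")][0]
    trans_share = TABLE_2[(language, "transactional")][0]
    return (exp_share - trans_share) * 100

def intent_odds_ratio(language):
    """Odds of a non-OTA citation, experiential relative to transactional."""
    e_non, e_ota = counts[(language, "experiential")]
    t_non, t_ota = counts[(language, "transactional")]
    return (e_non / e_ota) / (t_non / t_ota)

gap_en, gap_ja = intent_gap_pp("en"), intent_gap_pp("ja")
or_en, or_ja = intent_odds_ratio("en"), intent_odds_ratio("ja")
ratio_of_odds_ratios = or_ja / or_en

print(f"gap EN={gap_en:.1f} pp, gap JP={gap_ja:.1f} pp")
print(f"OR EN={or_en:.2f}, OR JP={or_ja:.2f}, ratio={ratio_of_odds_ratios:.2f}")
```

The reconstructed non-OTA counts sum to 606, matching the 419 + 187 reported in Section 3.1. The unadjusted ratio falls close to the adjusted interaction estimate of 1.77.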

Table 3: Non-OTA Source-Type Composition by Language (% of Non-OTA Citations)

Source Type             EN     JP
Travel blog             22.9%  1.2%
Editorial curation      22.5%  12.3%
Hotel direct            19.3%  23.6%
Travel agency           0.0%   12.9%
Travel media            8.9%   3.7%
UGC                     6.4%   8.6%
Coworking/workspace     2.5%   8.3%
Local tourism           5.4%   7.7%
Accommodation platform  2.5%   4.3%
Other                   8.9%   17.5%

3.3 Hotel Direct Citation: Language Matters More Than Intent

A more nuanced pattern emerges for hotel-direct citation. Within each language, we do not detect a clear intent effect: in English, hotel-direct citation rates are nearly identical for transactional and experiential queries (8.0% vs. 8.3%, $p = 0.90$), while in Japanese the experiential rate is higher than the transactional rate (12.6% vs. 9.3%), though this within-language difference is not statistically significant ($p = 0.16$). Table 4 presents hotel-direct citations as a share of all citations by segment.

The stronger pattern is cross-linguistic. Aggregating across intent, hotel-direct websites account for 8.2% of all English citations and 11.0% of all Japanese citations. Although this 2.8 percentage-point gap falls short of conventional significance in a citation-level test ($\chi^2(1) = 2.87$, $p = 0.090$), the query-level model (which avoids pseudo-replication across citations within the same response) shows that Japanese queries are significantly more likely than English queries to produce any hotel-direct citation (OR = 2.27, 95% CI [1.08, 4.75], $p = 0.030$), controlling for intent and category.

This cross-linguistic difference is especially visible in experiential search. Among experiential queries, hotel-direct citation is 12.6% in Japanese compared with 8.3% in English (one-sided z-test, $p = 0.026$).
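The one-sided comparison above can be approximated with a pooled two-proportion z-test. The sketch below reconstructs the hotel-direct counts by rounding rate × N for the experiential segments (12.6% of 364 and 8.3% of 386). Both the reconstructed counts and the pooled-variance form of the test are assumptions; the text does not specify either.

```python
import math

def one_sided_two_proportion_z(x1, n1, x2, n2):
    """Pooled z-test of H1: p1 > p2. Returns (z, one-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Experiential segments only; counts reconstructed from rounded rates.
jp_n, en_n = 364, 386
jp_hits = round(0.126 * jp_n)
en_hits = round(0.083 * en_n)

z, p_value = one_sided_two_proportion_z(jp_hits, jp_n, en_hits, en_n)
print(f"JP {jp_hits}/{jp_n} vs EN {en_hits}/{en_n}: z={z:.2f}, p={p_value:.3f}")
```

Like the other citation-level tests, this treats citations as independent, so the query-level model remains the more conservative evidence for the cross-linguistic difference.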
Taken together, these results suggest that hotel-direct capture is shaped less by transactional versus experiential framing alone than by the structure and depth of the underlying language ecosystem. This is consistent with the possibility that Japanese hotel websites contain richer search-answerable content, a possibility explored in the content audit below.

Table 4: Hotel Direct Citation Rate (% of All Citations) by Language and Intent

Language  T-rate          E-rate           Difference  95% CI       p
English   8.0% (n = 274)  8.3% (n = 386)   +0.3 pp     [-4.0, 4.5]  0.90
Japanese  9.3% (n = 333)  12.6% (n = 364)  +3.3 pp     [-1.3, 8.0]  0.16

3.4 Exploratory Content Audit: What Characterizes Cited Hotels?

To explore what distinguishes cited from non-cited hotels, we audited the websites of seven hotels that Gemini cites directly and seven control hotels that were not cited. We scored each hotel on a search-answerable depth (SAD) scale across five dimensions (FAQ, area guide, blog, access information, unique/distinctive content), each scored 0–3 based on depth of content available for Google Search to retrieve: 0 = absent; 1 = present but shallow (e.g., a FAQ with fewer than 10 questions); 2 = moderate depth (e.g., 10–25 FAQ questions or area content covering several neighborhoods); 3 = deep (e.g., a 30+ question FAQ organized by topic, or dedicated neighborhood pages with 500+ words each). The key distinction is not whether a content feature exists, but whether it is deep enough to rank in Google Search for relevant traveler queries. Table 5 presents the results.

Cited hotels average 8.6/15 (range: 5–13); control hotels average 3.4/15 (range: 1–5), a significant difference (Mann-Whitney $U = 48$, $p = 0.003$). Every hotel scoring 6 or above was cited, and every control hotel scored below 6; the single exception to a clean split is Prince Hotels, which was cited with a score of 5/15 (Fisher exact $p = 0.002$). Depth matters more than feature presence, as the K5 case illustrates.
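The audit summary can be recomputed from the SAD totals in Table 5. The sketch below derives the group means, the Mann-Whitney U statistic (ties counted as one half), and a one-sided Fisher exact probability for the split at a score of 6. The one-sided form is an assumption; it is the variant that matches the reported $p = 0.002$. On the Table 5 totals, one cited hotel (Prince Hotels, 5/15) falls below the threshold.

```python
from math import comb

# SAD totals from Table 5.
cited = {
    "Kimpton Shinjuku": 13, "Kadoya Hotel": 13, "Super Hotel": 10,
    "Park Hotel Tokyo": 7, "Tokyu Stay": 6, "TRUNK (Hotel)": 6,
    "Prince Hotels": 5,
}
control = {
    "Hotel K5": 5, "NOHGA Hotel Ueno": 5, "Ryokan Shigetsu": 4,
    "Aoyama Grand Hotel": 4, "Hotel Niwa Tokyo": 3,
    "Marunouchi Hotel": 2, "Shibuya Stream Excel": 1,
}

mean_cited = sum(cited.values()) / len(cited)
mean_control = sum(control.values()) / len(control)

# Mann-Whitney U for the cited group: count pairs where the cited hotel
# outscores the control hotel, with ties contributing one half.
u_statistic = sum(
    1.0 if c > k else 0.5 if c == k else 0.0
    for c in cited.values()
    for k in control.values()
)

# Split at the threshold discussed in the text.
THRESHOLD = 6
high_cited = sum(score >= THRESHOLD for score in cited.values())
high_total = high_cited + sum(score >= THRESHOLD for score in control.values())
n_total = len(cited) + len(control)
n_cited = len(cited)

# One-sided Fisher exact test (assumed variant): probability that at least
# `high_cited` of the high scorers land in the cited group by chance.
fisher_p = sum(
    comb(high_total, k) * comb(n_total - high_total, n_cited - k)
    for k in range(high_cited, min(high_total, n_cited) + 1)
) / comb(n_total, n_cited)

below_threshold_but_cited = [h for h, s in cited.items() if s < THRESHOLD]

print(f"means: {mean_cited:.1f} vs {mean_control:.1f}, U={u_statistic:.0f}")
print(f"Fisher one-sided p={fisher_p:.3f}, exceptions={below_threshold_but_cited}")
```

With only 14 hotels, and with scoring done after citation status was known, these figures are best read as descriptive.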
Hotel K5, a design boutique hotel in Nihonbashi with a 9.6/10 Booking.com rating, bilingual content, Schema.org markup, a neighborhood page, and a FAQ, scores only 5/15 on the depth scale because each feature is brief. Gemini mentions K5 by name in three responses but draws its information from OTA and editorial sources (Expedia, Hotels.com, myboutiquehotel.com), not from k5-tokyo.com. The hotel is discovered, but through intermediaries. By contrast, Kadoya Hotel, an independent with no schema markup and no technical SEO, scores 13/15 because its 33-question FAQ, 13-attraction sightseeing guide, and regular blog create 20+ indexed pages that directly answer traveler queries. Kadoya achieves direct citation; K5 does not.

The audit suggests a two-stage process for hotel direct citation in AI search: first, the hotel website must contain content deep enough to rank in Google Search for relevant queries; second, the retrieved content must answer the question better than competing OTA or editorial pages. Hotels that fail at the first stage never reach the second.

4 Discussion

4.1 What This Means for Hotel Discovery

Traditional hotel discovery rewards accumulated authority: review volume, ratings, advertising spend, domain authority. Our findings are consistent with a retrieval process in which question-content fit plays a major role, even if authority and ranking signals still matter. The intent-source divide (Section 3.1) supports this interpretation: when the question is transactional, non-OTA sources account for only 30.8% of citations; when the question is experiential, non-OTA citation rises to 55.9%. The 25.1 percentage-point swing suggests that the source of hotel discovery in AI search is not fixed; it depends on how the query is framed.

4.2 The Hotel Website as Discovery Asset

The most actionable finding for hotel operators is that hotel-direct citation is stable across query intent but sensitive to content depth. Three observations point in this direction, with the third remaining exploratory.

First, hotel-direct citation does not vary by intent. Hotel-direct rates are stable across transactional and experiential queries in both languages (Table 4). This means a hotel website with deep, question-answering content earns citations under both conditions; it does not need to be optimized for one narrow class of prompts. The intent-stability is theoretically consistent with how grounding works: a hotel website with neighborhood guides, transit information, and atmosphere descriptions is relevant to both transactional and experiential queries simultaneously.

Second, Japanese queries are significantly more likely than English queries to produce a hotel-direct citation (Section 3.3: query-level OR = 2.27, $p = 0.030$). Japanese hotel brand websites tend to contain deeper question-answering content (neighborhood guides, transit information, area descriptions) than their English counterparts, which tend to be booking-oriented. The implication is that content depth at the language-ecosystem level translates into measurably higher hotel-direct citation rates.

Third, content depth, not just content presence, separates cited from non-cited hotels. The search-answerable depth (SAD) audit of 14 hotels (Section 3.4) shows that cited hotels score significantly higher than non-cited hotels (mean 8.6/15 vs. 3.4/15; Mann-Whitney $U = 48$, $p = 0.003$). Every hotel scoring 6 or above on the depth scale was cited, and every non-cited hotel scored below 6; the one exception to a clean split is Prince Hotels, which was cited at 5/15 (Fisher exact $p = 0.002$). The K5 case is instructive: a hotel with a perfect binary feature score (FAQ page, area guide, bilingual content) but shallow depth on each feature was never cited directly, and Gemini drew K5 information from OTA and editorial intermediaries instead.

Table 5: Search-Answerable Depth Audit (14 Hotels)

Hotel                 Cited  FAQ  Area Guide  Blog  Access  Unique Content  SAD Score
Kimpton Shinjuku      Yes    3    3           2     2       3               13/15
Kadoya Hotel          Yes    3    3           2     2       3               13/15
Super Hotel           Yes    2    2           2     2       2               10/15
Park Hotel Tokyo      Yes    1    1           1     1       3               7/15
Tokyu Stay            Yes    1    0           0     2       3               6/15
TRUNK (Hotel)         Yes    0    2           1     1       2               6/15
Prince Hotels         Yes    0    0           0     3       2               5/15
Hotel K5              No     1    1           1     1       1               5/15
NOHGA Hotel Ueno      No     0    2           0     1       2               5/15
Ryokan Shigetsu       No     1    1           0     1       1               4/15
Aoyama Grand Hotel    No     1    0           1     1       1               4/15
Hotel Niwa Tokyo      No     0    1           0     1       1               3/15
Marunouchi Hotel      No     1    0           0     1       0               2/15
Shibuya Stream Excel  No     0    0           0     1       0               1/15

Hotels cannot control how travelers phrase queries, but they can control the depth of their own content. Our exploratory audit suggests that content depth may be one factor that distinguishes hotels whose websites capture the discovery moment from those discovered through intermediaries.

4.3 Implications for Different Stakeholders

Independent hotels may have the most to gain. The K5-vs-Kadoya comparison (Section 3.4) suggests that brand prestige and technical sophistication do not determine direct citation; content depth does.
Kadoya, a single independent with no schema markup and no SEO, achieves direct citation through a 33-question FAQ and a 13-attraction sightseeing guide. K5, a design boutique with a 9.6 rating and full technical implementation, does not. Content depth may level the playing field in ways that OTA rankings do not.

Domestic chains are well-positioned in Japanese but underinvested in English. Japanese queries have 2.27 times the odds of producing a hotel-direct citation compared with English queries (Section 3.3, query-level OR = 2.27, p = 0.030), consistent with deeper Japanese-language content on domestic chain websites. For the 36.9 million inbound visitors to Japan [12], the English-language content gap represents unrealized discovery potential.

International chains have the resources for content investment but face structural barriers. Kimpton Shinjuku scores 13/15 on search-answerable depth and achieves direct citation, but Kimpton is an outlier. Most international chain property pages are centrally managed booking endpoints without the location-specific, question-answering content that AI search retrieves.

OTAs remain the dominant citation source at 55.3% overall and 69.2% for transactional queries. Disintermediation is not imminent. However, OTA citation drops to 44.1% for experiential queries, a 25.1 percentage-point swing that represents the contestable frontier where non-OTA sources, including hotel-direct websites, can compete.

4.4 Connection to the Broader GEO Field

Our findings contribute to the emerging understanding of how AI search systems allocate visibility. The GEO literature has identified systematic patterns: citation concentration among a few outlets [16], earned media bias [5], geographic and commercial bias [14], and a discovery gap for new entities [15].
Our study adds a domain-specific dimension by showing that these patterns operate differently depending on query intent and the structure of the underlying web ecosystem.

The “earned media bias” identified by Chen, M. et al. [5] (a systematic preference for editorial content over brand-owned content) requires reinterpretation in light of our results. In the hotel domain, OTAs are commercial intermediaries, not earned media in the traditional sense. Our data suggest that AI search does not prefer earned media per se but rather cites whatever content type best answers the query: OTA listing pages for transactional queries, editorial and blog content for experiential queries. The apparent earned media bias may be partially an artifact of query distribution: if experiential queries predominate in a study, earned media will naturally appear favored. Our taxonomy reveals that the intent-source divide operates along the OTA/non-OTA boundary rather than the brand-owned/earned-media boundary. Hotel direct websites constitute a consistent 19–24% of the non-OTA channel across intent conditions, while OTA sources fluctuate dramatically. This suggests that the relevant analytical frame for AI search citation is not “earned vs. owned media” but “which content best answers the specific query.”

The role-augmented intent-driven GEO framework [6] provides theoretical scaffolding for this finding. That framework argues that query intent is a primary determinant of AI search behavior. Our four-category, two-intent design provides empirical evidence within the hospitality domain: the same hotel market produces dramatically different citation patterns depending on whether the query is transactional or experiential. The intent-source divide may reflect a structural property of how AI search matches queries to available content.
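The exploratory depth-audit statistics cited in Section 4.2 can be re-derived from the scores in Table 5. The following is a minimal sketch using only the Python standard library; the helper names are ours, and it recomputes the group means, the Mann-Whitney U statistic (ties counted as 0.5), and a one-sided Fisher exact tail for the "score of 6 or above" threshold. Whether the reported Fisher p-value is one-sided is an assumption on our part.

```python
from math import comb

# (hotel, cited directly by Gemini, SAD score out of 15), transcribed from Table 5.
SAD_AUDIT = [
    ("Kimpton Shinjuku", True, 13), ("Kadoya Hotel", True, 13),
    ("Super Hotel", True, 10), ("Park Hotel Tokyo", True, 7),
    ("Tokyu Stay", True, 6), ("TRUNK (Hotel)", True, 6),
    ("Prince Hotels", True, 5), ("Hotel K5", False, 5),
    ("NOHGA Hotel Ueno", False, 5), ("Ryokan Shigetsu", False, 4),
    ("Aoyama Grand Hotel", False, 4), ("Hotel Niwa Tokyo", False, 3),
    ("Marunouchi Hotel", False, 2), ("Shibuya Stream Excel", False, 1),
]


def mann_whitney_u(a, b):
    """U statistic for sample a against sample b; each tie contributes 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def fisher_upper_tail(a, b, c, d):
    """P(top-left cell >= a) under the hypergeometric null for [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return tail / comb(n, row1)


cited = [score for _, is_cited, score in SAD_AUDIT if is_cited]
not_cited = [score for _, is_cited, score in SAD_AUDIT if not is_cited]

mean_cited = sum(cited) / len(cited)
mean_not_cited = sum(not_cited) / len(not_cited)
u_stat = mann_whitney_u(cited, not_cited)

# 2x2 table for the threshold: rows = score >= 6 / score < 6, columns = cited / not cited.
THRESHOLD = 6
high_cited = sum(1 for s in cited if s >= THRESHOLD)
high_not = sum(1 for s in not_cited if s >= THRESHOLD)
low_cited = len(cited) - high_cited
low_not = len(not_cited) - high_not
fisher_p = fisher_upper_tail(high_cited, high_not, low_cited, low_not)

print(f"means: {mean_cited:.1f} vs {mean_not_cited:.1f}, U = {u_stat:.0f}")
print(f"threshold table: [[{high_cited}, {high_not}], [{low_cited}, {low_not}]], "
      f"one-sided Fisher p = {fisher_p:.4f}")
```

The sketch does not compute a p-value for the Mann-Whitney statistic, since the paper does not state whether an exact or approximate method was used.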
5 Limitations

This study has six limitations that bound the generalizability and causal interpretation of our findings.

1. Single platform, single point in time. All 156 queries were executed on Gemini 2.5 Flash with Google Search grounding in March 2026. Google regularly updates both the underlying language model and the grounding mechanism that determines which web sources are retrieved and cited. A model version change, a modification to grounding source ranking, or an update to Google Search’s index could alter the citation patterns we document. Furthermore, other AI search platforms (ChatGPT with browsing enabled, Perplexity, Claude) employ different retrieval architectures and may exhibit different source-type preferences. ChatGPT uses Bing rather than Google Search for web grounding, which accesses a different index and applies different ranking signals. Perplexity maintains its own search index and citation system. Our findings are specific to the Gemini-Google Search grounding pipeline and should not be assumed to generalize across platforms without empirical cross-validation.

2. Correlation, not causation. We observe that hotels with richer website content (neighborhood guides, FAQ pages, sightseeing information, access maps) receive more direct citations from Gemini. However, we cannot establish that content depth causes direct citation. The observed association is susceptible to confounding: well-known hotels with strong brands may both invest in website content and receive citations due to brand recognition, search engine authority, or backlink profiles that are correlated with but distinct from content depth. A hotel that is frequently mentioned on third-party websites may accumulate domain authority that benefits its direct citation rate independently of its own content.
The K5 case (Section 3.4) reinforces this limitation: a hotel with strong content features but shallow search-answerable depth was not cited, suggesting that content alone does not determine citation; Google Search ranking also plays a role. The causal test (modifying a hotel’s website content and measuring subsequent changes in Gemini citation behavior) would require a controlled intervention design with pre-post measurement, which we propose as future work but have not conducted in this study.

3. Single market. Tokyo’s hotel web ecosystem has characteristics that may limit generalizability to other tourism markets. The Japanese web contains a dense network of domestic OTAs (Jalan, Rakuten Travel, Ikyu), Japanese editorial platforms (icotto.jp, ozmall.co.jp), and Japanese-language hotel brand websites that collectively create a rich non-OTA content supply. Other tourism markets may have thinner non-OTA content ecosystems, which would affect the magnitude of the intent-source divide. The EN-JP difference in hotel direct citation rate (11.0% vs. 8.2%, p = 0.090) may be specific to the bilingual information landscape of Tokyo, where two linguistically separate web ecosystems serve the same geographic market. Cities with a single dominant language (e.g., Paris, Bangkok) or cities where the local-language web is less developed would present different dynamics. Additionally, Tokyo’s hotel market includes distinctive property types (capsule hotels, ryokans, business hotels with specific Japanese amenities) that generate unique content not found in other markets, potentially inflating the experiential query effect.

4. Query framing and answer-type specificity. Our experiential queries are not only longer and lexically richer than transactional queries (addressed via robustness checks in Appendix E) but also embed implicit constraints on what constitutes a good answer.
A supplementary test with 20 “experiential-but-OTA-answerable” queries (Appendix E) found that experiential framing produces high non-OTA rates (54.0%) even when the question could be answered by OTA review data, suggesting the effect is driven by framing rather than answer-type bias. However, this supplementary set is small (20 queries, 137 citations) and was not part of the original study design. A larger, more systematic manipulation of answer-type specificity independent of framing would provide stronger evidence.

5. Sample size and partial replication. Our 156 queries yielded 1,357 citations, providing adequate statistical power for the core intent-source divide ($\chi^2 = 84.23$, $p = 4.40 \times 10^{-20}$) and the language interaction effect (p = 0.015). However, sub-analyses (particularly the category-by-language-by-intent breakdowns and the hotel-direct intent comparison within each language) operate on smaller cell sizes where statistical power is reduced. Gemini’s grounding output is non-deterministic: re-running the same query produces partially different citation sets (within-query SD = 0.142 in our 20-query replication test; Appendix E). While the ICC of 0.656 indicates moderate-to-good reliability and the intent-source divide replicated in all five runs, the replication covered only 20 of 156 queries. A full multi-run design re-running all 156 queries would provide tighter query-level estimates, particularly for sub-group analyses. Additionally, the 156 queries are generated from 16 template types (4 categories × 2 intents × 2 languages), with variation only in the ward name. Citations from queries sharing a template are not fully independent, which may understate standard errors in the logistic regression. Given the large effect size ($\chi^2 = 84.23$), template clustering is unlikely to alter the headline finding, but a mixed-effects model with template as a random effect would provide more conservative inference.

6. Discovery, not booking. This study measures who controls the discovery moment (which sources AI search cites), not who closes the booking. The billboard effect [2, 10] means OTA-cited hotels may still receive direct bookings. Our OTA citation rates should be interpreted as measures of discovery intermediation, not booking capture.

6 Future Work

Three extensions would most strengthen these findings. First, cross-platform validation (executing the same query set on ChatGPT, Perplexity, and Claude) would test whether the intent-source divide is specific to Gemini or a general property of AI search. Second, a causal intervention study partnering with a hospitality business to expand website content and track citation changes would establish whether content depth causes citation rather than merely correlating with it. Third, multi-market replication in cities with different OTA landscapes and web ecosystems would test generalizability beyond Tokyo.

7 Conclusion

AI search may be making hotel discovery more contestable. This study documents a 25.1 percentage-point shift in non-OTA citation between transactional and experiential queries, amplified in Japanese where a more diverse content ecosystem absorbs a larger share of the experiential shift. Hotel-direct citation is stable across intent conditions, and Japanese queries produce significantly more hotel-direct citations than English queries (OR = 2.27, p = 0.030), consistent with deeper question-answering content on Japanese hotel websites. For an industry that has historically rented discovery from commission-based intermediaries, these findings suggest that the discovery moment itself is becoming contestable.
Hotels with deep, question-answering content appear better positioned to compete for that moment directly. Our results do not imply that AI will eliminate intermediation or that citation source maps onto booking channel. They do suggest that AI search opens a new pathway to discovery, one where hotels with deep, question-answering content can be found without paying an intermediary for the privilege.

References

[1] Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, and Ameet Deshpande. GEO: Generative engine optimization. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024.
[2] Chris K. Anderson. The billboard effect: Online travel agent impact on non-OTA reservation volume. Cornell Hospitality Report, 9(16):6–9, 2009.
[3] Simone Bianco. Deter and divert: How incumbents’ market structures guide hotel entry strategies. Journal of Hospitality & Tourism Research, 2025.
[4] Andrei Broder. A taxonomy of web search. ACM SIGIR Forum, 36(2):3–10, 2002.
[5] Mahe Chen, Xiaoxuan Wang, Kaiwen Chen, and Nick Koudas. Generative engine optimization: How to dominate AI search. arXiv preprint arXiv:2509.08919, 2025.
[6] Xiaolu Chen, Haojie Wu, Jie Bao, Zhen Chen, Yong Liao, and Hu Huang. Role-augmented intent-driven generative search engine optimization. arXiv preprint arXiv:2508.11158, 2025.
[7] Domenic V. Cicchetti. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment, 6(4):284–290, 1994.
[8] Cloudbeds. The state of independent lodging report. Technical report, Cloudbeds, 2025.
[9] D-EDGE. Hotel distribution report 2024. Technical report, D-EDGE Hospitality Solutions, 2024.
[10] Anindya Ghose, Panagiotis G. Ipeirotis, and Beibei Li. Designing ranking systems for hotels on travel search engines by mining user-generated and crowdsourced content. Marketing Science, 31(3):493–520, 2012.
[11] Lucas D. Introna and Helen Nissenbaum. Shaping the web: Why the politics of search engines matters. The Information Society, 16(3):169–185, 2000.
[12] Japan National Tourism Organization (JNTO). Visitor arrivals to Japan, 2024 statistics. Technical report, JNTO, 2025.
[13] Juhi Kulshrestha et al. Quantifying search bias: Investigating sources of bias for political searches in social media. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pages 417–432, 2017.
[14] Alice Li and Luanne Sinnamon. Generative AI search engines as arbiters of public knowledge: An audit of bias and authority. In Proceedings of the Association for Information Science and Technology, volume 61, 2024.
[15] Amit Prakash Sharma. The discovery gap: How Product Hunt startups vanish in LLM organic discovery queries. arXiv preprint arXiv:2601.00912, 2026.
[16] Kai-Cheng Yang. News source citing patterns in AI search systems. arXiv preprint arXiv:2507.05301, 2025.

A Experimental Setup

Model and API configuration. All queries were executed on Gemini 2.5 Flash via the Google GenAI Python SDK (google-genai) with the following configuration:

• Model: gemini-2.5-flash
• Google Search grounding: enabled (GoogleSearch tool)
• Temperature: 1.0 (default; not explicitly overridden)
• Seed: not available (Gemini API does not support deterministic seeding for grounded responses)
• Data collection period: March 2026

Response structure. Each API response contains: (a) a generated text answer in markdown, and (b) a structured list of grounding citations (grounding_chunks), that is, the URLs and titles of web pages consulted during response generation.
We extract and classify the grounding citations; the text responses are used only for hotel name extraction.

Example query and response format:

    {
      "query_id": "q0157",
      "model": "gemini-2.5-flash",
      "prompt": "Cheap hotel in Chiyoda",
      "query_language": "en",
      "query_intent": "transactional",
      "query_category": "budget",
      "area": "Chiyoda",
      "sources": [
        {"url": "...", "text": "hotelscombined.com"},
        {"url": "...", "text": "expedia.com"},
        {"url": "...", "text": "hotels.com"},
        {"url": "...", "text": "hostelworld.com"},
        {"url": "...", "text": "kayak.com.ph"},
        {"url": "...", "text": "momondo.co.za"},
        {"url": "...", "text": "agoda.com"}
      ]
    }

B Sample Queries

The following 10 queries illustrate the paired transactional-experiential design across categories and languages. The full corpus comprises 156 queries (144 area-level + 12 city-level).

Table 6: Sample Queries from the Paired Design

| ID | Language | Category | Intent | Query |
|---|---|---|---|---|
| q0157 | EN | Budget | Transactional | Cheap hotel in Chiyoda |
| q0176 | EN | Budget | Experiential | Good value hotel with local charm in Chuo |
| q0194 | EN | Rating | Transactional | Best rated hotel in Chuo |
| q0212 | EN | Rating | Experiential | Hotel with exceptional service and atmosphere in Chuo |
| q0229 | EN | Convenience | Transactional | Hotel near Chiyoda station with easy access |
| q0169 | JP | Budget | Transactional | 新宿で安いホテル (Cheap hotel in Shinjuku) |
| q0184 | JP | Budget | Experiential | 東京都千代田区で地元の雰囲気が楽しめるコスパの良いホテル (Good-value hotel in Chiyoda, Tokyo where you can enjoy the local atmosphere) |
| q0208 | JP | Rating | Transactional | 豊島で評価の高いホテル (Highly rated hotel in Toshima) |
| q0275 | JP | Business | Transactional | 東京都中央区で出張向けのホテル (Hotel for business trips in Chuo, Tokyo) |
| q0293 | JP | Business | Experiential | 東京都中央区で仕事スペースがあり静かなホテル (Quiet hotel with workspace in Chuo, Tokyo) |

C Source Classification Taxonomy

Citations were classified using domain-substring matching. The table below shows the taxonomy structure with representative examples.

Table 7: Source Classification Taxonomy

| Category | Type | Examples |
|---|---|---|
| OTA | Global OTA | booking.com, expedia.com (incl. regional variants), hotels.com, tripadvisor.com |
| | Japanese OTA | jalan.net, rakuten.co.jp, ikyu.com, jtb.co.jp |
| | APAC OTA | agoda.com, trip.com, traveloka.com |
| | Meta-search | kayak.com, trivago, skyscanner, hotelscombined |
| Non-OTA | Hotel direct | kimptonshinjuku.com, superhotel.co.jp, princehotels.com |
| | Editorial curation | wanderlog.com, myboutiquehotel.com, icotto.jp, ozmall.co.jp |
| | Travel blog | Individual travel bloggers |
| | Travel media | Forbes, Lonely Planet, livejapan.com |
| | Local tourism | Ward-level tourism boards, gotokyo.org |
| | Coworking/workspace | bizcomfort.jp, e-office.space |
| | Travel agency | skyticket.jp, travel.co.jp |
| | User-generated | YouTube, Reddit, note.com |
| | Accommodation platform | Airbnb, HafH |

D Statistical Results

Table 8: Citation-Level Logistic Regression, P(non-OTA Citation)

| Variable | M1 Coef | M1 OR [95% CI] | M1 p | M2 Coef | M2 OR [95% CI] | M2 p |
|---|---|---|---|---|---|---|
| Intercept | −1.429 | 0.24 [0.18, 0.32] | < 0.001 | −1.241 | 0.29 [0.21, 0.40] | < 0.001 |
| Intent (experiential) | 1.080 | 2.95 [2.34, 3.71] | < 0.001 | 0.782 | 2.19 [1.57, 3.04] | < 0.001 |
| Language (Japanese) | 0.283 | 1.33 [1.06, 1.67] | 0.015 | −0.053 | 0.95 [0.67, 1.35] | 0.769 |
| Category: Rating | 0.039 | 1.04 [0.74, 1.47] | 0.827 | 0.037 | 1.04 [0.73, 1.47] | 0.833 |
| Category: Convenience | 0.694 | 2.00 [1.48, 2.71] | < 0.001 | 0.690 | 1.99 [1.47, 2.70] | < 0.001 |
| Category: Business | 0.944 | 2.57 [1.87, 3.54] | < 0.001 | 0.945 | 2.57 [1.87, 3.54] | < 0.001 |
| Experiential × Japanese | - | - | - | 0.571 | 1.77 [1.12, 2.80] | 0.015 |

Model 1: AIC = 1736.0, Log-L = −862.0. Model 2: AIC = 1732.1, Log-L = −859.1. LR test: $\chi^2(1) = 5.90$, p = 0.015. n = 1,357 citations. Reference: transactional, English, budget. DV = non-OTA citation.

Table 9: Query-Level Quasibinomial Regression (n = 156 queries)

To address non-independence, we re-estimated Model 2 at the query level using grouped binomial regression. Scale parameter = 0.262 (slight underdispersion).
| Variable | OR [95% CI] | p |
|---|---|---|
| Experiential | 2.19 [1.85, 2.59] | < 0.001 |
| Japanese | 0.95 [0.79, 1.14] | 0.566 |
| Rating | 1.04 [0.87, 1.24] | 0.680 |
| Convenience | 1.99 [1.71, 2.33] | < 0.001 |
| Business | 2.57 [2.18, 3.03] | < 0.001 |
| Experiential × Japanese | 1.77 [1.40, 2.24] | < 0.001 |

Coefficients are substantively identical to citation-level estimates, confirming the finding is not an artifact of pseudo-replication.

Table 10: Complete Statistical Test Summary

| Group | Test | Statistic | Value | p |
|---|---|---|---|---|
| Core finding | Chi-squared (intent × source) | $\chi^2(1)$ | 84.23 | $4.4 \times 10^{-20}$ |
| | Unadjusted OR (non-OTA, E vs T) | OR [95% CI] | 2.84 [2.27, 3.56] | - |
| | Cramér’s V | V | 0.249 | - |
| | Mann-Whitney U (query-weighted) | U | 4,721 | $< 10^{-9}$ |
| Category-level | Budget | $\chi^2(1)$ | 26.00 | $3.4 \times 10^{-7}$ |
| | Rating/Quality | $\chi^2(1)$ | 16.78 | $4.2 \times 10^{-5}$ |
| | Business | $\chi^2(1)$ | 25.81 | $3.8 \times 10^{-7}$ |
| | Convenience | $\chi^2(1)$ | 13.29 | $2.7 \times 10^{-4}$ |
| Language interaction | LR test (Model 2 vs Model 1) | $\chi^2(1)$ | 5.90 | 0.015 |
| | Interaction OR (experiential × JP) | OR [95% CI] | 1.77 [1.12, 2.80] | 0.015 |
| | Direct cell: JP-E vs EN-T (non-OTA, unadj.) | OR [95% CI] | 3.52 [2.53, 4.90] | $< 10^{-14}$ |
| Non-OTA composition | JP-domain share (JP queries) | % | 68.4% | - |
| | JP-domain share (EN queries) | % | 6.4% | - |
| | EN vs JP composition | $\chi^2(10)$ | 171.90 | < 0.001 |
| | Hotel-direct EN vs JP | $\chi^2(1)$ | 2.87 | 0.090 (n.s.) |
| Robustness | Answer-type specificity (non-OTA: OA vs E) | z | −0.40 | 0.69 (n.s.) |
| | Answer-type specificity (non-OTA: OA vs T) | z | 5.14 | < 0.0001 |
| | Query-length OLS (pooled, intent $\beta$) | $\beta$ | 0.27 | 0.0001 |
| | Query-length OLS (JP only, intent $\beta$) | $\beta$ | 0.45 | < 0.0001 |
| | ICC (test-retest, 20 queries × 5 runs) | ICC(1,1) | 0.656 | - |
| Hotel-direct | Hotel-direct EN vs JP (query-level logistic) | OR [95% CI] | 2.27 [1.08, 4.75] | 0.030 |
| | Hotel-direct JP-E vs EN-E (one-sided z) | z | - | 0.026 |
| Exploratory | SAD cited vs control (Mann-Whitney) | U | 48 | 0.003 |
| | SAD threshold (Fisher exact) | - | - | 0.002 |

E Robustness Checks

Query length and lexical richness. OLS regression controlling for query length, adjective density, subjectivity, and type-token ratio confirms that intent remains the strongest predictor in a pooled model ($\beta = 0.27$, p = 0.0001, n = 156). In Japanese alone, intent survives all lexical controls ($\beta = 0.45$, p < 0.0001).

Answer-type specificity. 20 supplementary queries using experiential language but asking OTA-answerable questions produced a non-OTA citation rate of 54.0% (46.0% OTA, n = 137), statistically indistinguishable from the original experiential non-OTA rate of 55.9% (z = −0.40, p = 0.69) and significantly higher than the transactional non-OTA rate (z = 5.14, p < 0.0001).

Test-retest reliability. Re-running 20 stratified queries five times each yielded ICC(1,1) = 0.656 (moderate-to-good; [7]). The intent-source divide was positive in all five runs.

Query-weighted analysis. Treating each query as a single observation: non-OTA rates 27.1% (transactional) vs. 55.1% (experiential). Mann-Whitney U = 4,721, $p < 10^{-9}$.
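The difference between the citation-level and query-weighted rates can be made concrete with a short sketch that classifies each source by domain (Appendix C) and then aggregates in both ways. This is an illustration under stated assumptions, not the authors' pipeline: the pattern list is abbreviated from the Table 7 examples, the three records are hypothetical (they follow the Appendix A record shape but omit the "url" field), and the matcher anchors each pattern at the start of a hostname label so that princehotels.com is not caught by hotels.com. Whether the original matcher handles that case the same way is not stated.

```python
from collections import defaultdict

# Abbreviated OTA / meta-search patterns drawn from the Table 7 examples.
# A trailing dot stands in for "any regional variant" (e.g. kayak.com.ph).
OTA_PATTERNS = (
    "booking.com", "expedia.", "hotels.com", "tripadvisor.",
    "jalan.net", "rakuten.co.jp", "ikyu.com", "jtb.co.jp",
    "agoda.com", "trip.com", "traveloka.com",
    "kayak.", "trivago.", "skyscanner.", "hotelscombined.",
)


def is_ota(domain: str) -> bool:
    """True if a pattern starts the hostname or starts one of its dot-separated labels."""
    host = domain.lower().strip()
    return any(host.startswith(p) or f".{p}" in host for p in OTA_PATTERNS)


def rates_by_intent(records):
    """Return citation-weighted and query-weighted non-OTA rates per intent."""
    per_query = defaultdict(list)         # intent -> per-query non-OTA shares
    pooled = defaultdict(lambda: [0, 0])  # intent -> [non-OTA citations, all citations]
    for rec in records:
        flags = [not is_ota(src["text"]) for src in rec["sources"]]
        if not flags:
            continue  # a query with no grounding citations contributes nothing
        intent = rec["query_intent"]
        per_query[intent].append(sum(flags) / len(flags))
        pooled[intent][0] += sum(flags)
        pooled[intent][1] += len(flags)
    return {
        intent: {
            "citation_weighted": pooled[intent][0] / pooled[intent][1],
            "query_weighted": sum(shares) / len(shares),
        }
        for intent, shares in per_query.items()
    }


def make_record(intent, *domains):
    """Build a HYPOTHETICAL record in the Appendix A shape (url field omitted)."""
    return {"query_intent": intent, "sources": [{"text": d} for d in domains]}


# HYPOTHETICAL records for illustration only; not data from the study.
RECORDS = [
    make_record("transactional", "expedia.com", "agoda.com", "kayak.com.ph", "princehotels.com"),
    make_record("experiential", "icotto.jp", "kimptonshinjuku.com"),
    make_record("experiential", "booking.com", "jalan.net", "hotels.com",
                "trip.com", "livejapan.com", "gotokyo.org"),
]

rates = rates_by_intent(RECORDS)
for intent, values in rates.items():
    print(intent, {k: round(v, 3) for k, v in values.items()})
```

In the hypothetical experiential records, the query with many citations pulls the citation-weighted rate below the query-weighted one, which is the kind of imbalance the query-weighted analysis is meant to guard against.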