AI Agents Can Already Autonomously Perform Experimental High Energy Physics

Large language model-based AI agents are now able to autonomously execute substantial portions of a high energy physics (HEP) analysis pipeline with minimal expert-curated input. Given access to a HEP dataset, an execution framework, and a corpus of …

Authors: Eric A. Moreno, Samuel Bright-Thonney, Andrzej Novak

AI Agents Can Already Autonomously Perform Experimental High Energy Physics
AI Agen ts Can Already Autonomously P erform Exp erimen tal High Energy Ph ysics Eric A. Moreno ∗ † 1,2 , Sam uel Bright-Thonney * ‡ 1,2 , Andrzej No v ak * § 1,2 , Dolores Garcia ¶ 3 , and Philip Harris ‖ 1,2 1 Departmen t of Physics, Massac h usetts Institute of T ec hnology 2 NSF AI Institute for Artificial In telligence and F undamental In teractions 3 CERN Marc h 23, 2026 Abstract Large language mo del-based AI agents are no w able to autonomously execute substan tial portions of a high energy ph ysics (HEP) analysis pip eline with minimal exp ert-curated input. Giv en access to a HEP dataset, an execution framework, and a corpus of prior experimental literature, we find that Claude Co de succeeds in automating all stages of a typical analysis: ev ent selection, background estimation, uncertain ty quantification, statistical inference, and pap er drafting. W e argue that the experimental HEP communit y is underestimating the current capabilities of these systems, and that most proposed agen tic workflo ws are to o narrowly scop ed or scaffolded to sp ecific analysis structures. W e present a pro of-of-concept framework, Just F urnish Context ( JF C ), that integrates autonomous analysis agents with literature-based knowledge retriev al and multi-agen t review, and show that this is sufficien t to plan, execute, and document a credible high energy ph ysics analysis. W e demonstrate this b y conducting analyses on op en data from ALEPH, DELPHI, and CMS to p erform electro weak, QCD, and Higgs boson measuremen ts. Rather than replacing physicists, these tools promise to offload the rep etitiv e technical burden of analysis co de developmen t, freeing researc hers to fo cus on ph ysics insigh t, truly no vel metho d dev elopment, and rigorous v alidation. Giv en these developmen ts, we advocate for new strategies for how the communit y trains studen ts, organizes analysis efforts, and allo cates human exp ertise. 1 In tro duction A typical exp erimental high energy physics analysis is a years-long endeav or, often spanning the ma jority of a graduate studen t or p ostdoc’s time at an academic institution. F or large collider exp erimen ts the process is algorithmic: a physicist (a) identifies an interesting measurement channel or new physics signature, (b) studies existing literature to understand ho w similar measuremen ts hav e been done, (c) designs and v alidates an ev en t selection procedure using Mon te Carlo (MC) sim ulation, quantifies all relev ant sources of systematic uncertain ty , and finally (d) “unblinds” on real experimental data and p erforms statistical hypothesis tests to extract measurements or limits. This pro cess in v ariably inv olves writing thousands of lines of co de to pro cess data and apply standard, though not off-the-shelf, analysis tec hniques, most of whic h is structurally similar to code written in dozens of other analyses within the same collaboration. While this can be a useful exercise for dev eloping programming ∗ These authors contributed equally . † emoreno@mit.edu ‡ sambt@mit.edu § nov ak a@mit.edu ¶ dolores.garcia@cern.ch ‖ pcharris@mit.edu 1 skills, it often consumes a substantial fraction of a ph ysicist’s working time and demands little physics reasoning or insight. This is frequently exacerbated b y the absence of high-quality , centralized, or up-to-date do cumen tation for m uch of the core soft ware infrastructure within experimental collaborations. The result is a slo w, error-prone, and tedious process that, b ey ond a brief initial learning phase, adds little to a student’s education and robs all practitioners of v aluable time that could be b etter spent on higher-level questions. In this paper, we argue that the ma jority of this pro cess can b e delegated to AI agents. Modern co ding assistan ts can autonomously write and execute co de, access do cumen tation (b oth lo cal and online), iterativ ely critique and debug their own output, and document their progress for h uman o verseers [ 14 , 2 ]. As large language models (LLMs) and commercial agentic framew orks con tinue to improv e, these to ols will only b ecome more reliable. This is not a sp eculativ e claim about future capabilities. In this pap er we sho w that, given a general framew ork distilling the pro cesses used in practice at large HEP collab orations and an initial high-lev el ph ysics prompt, Claude Co de 1 is able to pro duce a complete analysis: ev en t selection, bac kground estimation, systematic uncertaint y ev aluation, statistical inference, and a written report with publication-grade figures. The rep ort quality is typically , at a cursory exp ert review, indistinguishable from a rep ort pro duced by a junior graduate studen t (or an exp ert under time constraints). Recent w ork b y Badea et al. [ 4 ] demonstrated that an AI agent, with iterative physicist sup ervision at eac h step, can contribute to a measuremen t using LEP op en data. W e build on this direction but take a fundamentally differen t approach: rather than requiring contin uous human feedbac k, JF C delegates review to sp ecialized AI agen ts, concentrating human o versigh t at a single unblinding gate, yielding a nearly fully autonomous framew ork. The framework specification splits roughly into three indep enden t comp onents: • Methodology . W e enco de what a typical particle physics analysis w orkflow lo oks like, from planning and data exploration to statistical analysis. This structure, while not usually explicitly sp elled out, generally liv es as institutional knowledge in scien tific collab orations and researc h groups. Roughly , this corresp onds to the guidance a graduate student w ould receiv e as they progress through their thesis researc h. – Review . A separate asp ect of the researc h metho dology is ho w a scientific collab oration ensures that its published results are accurate. There is typically an official review process that progresses in tiers from junior postdo cs within a group to senior postdo cs leading analysis groups within collab orations, to senior academics who effectiv ely function as in ternal journal reviewers. T o sim ulate this, we encode tiers of review subagents at every step of the analysis. • General agen t b eha vior . In order to manage context windows, the analysis steps outlined in the metho dology are sent to separate subagen ts. Ho wev er, these agents still need to know the ov erall analysis con text and plan. W e therefore enco de a strict sp ecification of what each agen t receives and outputs (operationally , no physics information), as w ell as a log that allo ws the process to b e explored and verified b y humans and that also facilitates agent-lev el debugging of complex failures. • Domain-specific con ven tions . Finally , we sp ecify a w ell-isolated set of desired conv entions, such as to ol use, visualizations, or more obscure or recent analysis techniques. Encoding this baseline lev el seems una voidable, as general mo dels are unable to correctly guess the sp ecific con ven tions that v ary greatly by field. The analyses presented in the app endices of this pap er w ere pro duced entirely by AI agents op erating on arc hived op en ALEPH and DELPHI data and Monte Carlo simulation from the Large Electron-P ositron (LEP) collider and CMS Op en Data from the Large Hadron Collider (LHC)[ 7 ], with the agents optionally accessing domain knowledge from published LEP pap ers through a structured retriev al system. Imp ortan tly , w e do not presen t these analyses as legitimate scientific results, but rather to sho wcase the level of HEP analysis that current agen tic systems can pro duce entirely au- tonomously , based only on a short physics prompt and the framework we present as scaffolding for ho w a physics analysis is done. W e b eliev e this warran ts rethinking ho w mainstream analysis w ork is done in HEP collab orations and ho w studen t lab or is utilized. These developmen ts also hav e 1 W e see no barriers for this to generalize to other state-of-the-art co ding assistants. 2 Figure 1: Diagram of how an AI-agent workflo w can be used to mirror the t ypical high-energy ph ysics analysis workflo w. On the left, we sho w the typical analysis pipeline, which usually starts with legacy co de that is then mo dified to p erform the analysis. Analyses typically inv olve 3 or more levels of review, starting with feedbac k from other p ostdo cs, studen ts, and faculty collaborating on the analysis (office feedback). The next tier is t ypically done in an analysis subgroup (colloquially referred to as a level 3 group). The second tier of review inv olves a pre-appro v al phase, in which physics group conv eners review an analysis, follo wed b y a formal collab oration review leading to a result and a submission for publication. On the right side, an equiv alen t interactiv e w orkflow can be entirely handled b y AI agents, from the conception of an idea through a result that w ould then undergo a similar collab oration review, follow ed by publication. implications beyond routine analysis: they enable systematic repro ducibilit y studies, lo w er the barrier to reanalyzing archiv ed datasets, and facilitate rigorous do cumen tation of analysis workflo ws—all longstanding c hallenges in the field, historically hindered b y the significant human effort and exp ertise required to carry them out at scale. W e emphasize that we are not calling for AI agents to supplant humans in pro ducing scien tific results. The final output of any AI-assisted analysis must b e thoroughly c heck ed, understo od, and v alidated by domain exp erts b efore it can be considered a scientific result, and must undergo the same in ternal scrutin y as any other collab oration pap er. W e also do not claim that current agents can handle every asp ect of every analysis; complex analyses inv olving nov el tec hniques, b esp ok e reconstruction algorithms, or subtle interpla y b et w een m ultiple systematic uncertainties will contin ue to require substantial hands-on human inv olv ement. Agen ts mak e mistak es, sometimes subtle ones, and h umans must remain resp onsible for judging their outputs and b e held publicly accountable for their mistakes. Most imp ortan tly , it is our creativity , exp ertise, and judgmen t that defines the rese arc h agenda these tools will help us pursue. The remainder of this paper describes our pro of-of-concept framework, the lessons learned from applying it to ALEPH, DELPHI, and CMS open data, and its curren t limitations. Section 2 reviews the rapidly gro wing landscap e of agentic AI for science and for HEP sp ecifically . Section 3 describ es the JF C framework, including its agent architecture, knowledge retriev al system, and multi-agen t review pro cess. Section 4 surv eys the 3 analyses “w e” hav e reproduced using JF C , discusses their qualit y , and examines the broader implications for analysis w orkflows, graduate training, legacy data reanalysis, and the limitations and risks of curren t systems. Section 5 concludes with a call to action. 2 Related W ork The application of AI agen ts to scien tific researc h is a rapidly growing field spanning many domains, from c hemistry and drug disco very to materials science and mathematics. In chemistry , ChemCrow [ 6 ] augmen ts large language mo dels with expert-designed chemistry tools to autonomously plan and execute synthesis tasks, while Coscientist [ 5 ] integrates LLMs with internet searc h, code execution, and robotic lab equip- men t to design and carry out chemical exp eriments end-to-end. The AI Scientist [ 16 ] demonstrated a fully autonomous pip eline, from idea generation through exp erimen t execution to pap er writing. Multi-agen t arc hitectures ha ve also been explored for scien tific discov ery: SciAgents [ 11 ] combines LLMs with knowl- edge graphs in a m ulti-agent framework for materials science, while P ap erQA2 [ 19 ] ac hieves sup erh uman p erformance on scientific literature synthesis through an agentic retriev al-augmented generation pip eline. These efforts illustrate a broad trend tow ard autonomous AI-driv en researc h, but eac h domain brings its o wn unique challenges in terms of data complexit y , v alidation requirements, and the need for domain-sp ecific reasoning. Within high energy physics (HEP), interest in applying LLM-based agents to analysis w orkflows has gro wn significan tly in the past year, with several concurrent and complemen tary efforts emerging. A com- m unity vision pap er [ 1 ] recently outlined grand challenges for building an AI-native research ecosystem for exp erimen tal particle ph ysics, providing a high-lev el roadmap for ho w curren t and future facilities can benefit from AI integration. W e review the existing landscap e of HEP-fo cused agen tic AI efforts b elo w, organized b y the type of task they address. 2.1 Agen ts for data analysis The most directly relev an t prior work is that of Gendreau-Distler et al. [ 10 ], who present an LLM-agen t-driven data analysis framew ork. Their system pairs an LLM-based supervisor-co der agent with the Snakemak e w orkflow manager to automate a Higgs b oson diphoton cross-section measuremen t using A TLAS Op en Data. The workflo w manager enforces reproducibility and determinism, while the agent generates, executes, and iteratively corrects analysis co de. The authors benchmark several state-of-the-art LLMs spanning the Gemini, GPT, and Claude families, as w ell as leading op en-w eight models. Notably , how ever, the authors themselv es state that “m ulti-step task planning is beyond the curren t scope” of their system [ 10 ], highligh ting that the agen t op erates within a fairly rigid, pre-defined analysis structure rather than autonomously planning and executing a full analysis from a high-level specification. Diefen bacher et al. [ 8 ] inv estigate LLM-based agen ts for anomaly detection using the LHC Olympics dataset [ 15 ], demonstrating that agen tic setups can develop and test analysis metho ds that mirror human state-of-the-art p erformance. Their study pro vides a systematic comparison of prompting strategies and LLM models, b enchmarking stability , cost, and repro ducibilit y across m ultiple runs. This w ork is notable for sho wing that agen ts can indep enden tly arriv e at sensible analysis strategies—such as combining bump h unts with weakly supervised approac hes like CW oLa—without being explicitly instructed to do so. Menzo et al. presen t HEPT APOD [ 18 ], an orchestration framew ork that enables large language models to in terface with domain-sp ecific HEP to ols, construct simulation w orkflows, and manage m ulti-step researc h pip elines. The system uses schema-v alidated op erations and run-card-driven configuration to ensure repro- ducibilit y , and is demonstrated on a represen tativ e BSM Mon te Carlo v alidation pip eline spanning mo del generation, even t simulation, and do wnstream analysis within a unified workflo w. By providing a structured and auditable lay er betw een h uman researchers, LLMs, and computational infrastructure, HEPT APOD fo- cuses on the upstream simulation and w orkflow management problem rather than end-to-end data analysis, and emphasizes human-in-the-loop o versigh t throughout the pro cess. Most recently , Badea et al. [ 4 ] presen t a proof-of-concept measuremen t using LEP open data with an agen tic AI–ph ysicist collaboration. Their approach relies on an iterative h uman-in-the-lo op cycle in which the physicist pro vides detailed feedback after each agen t attempt, guiding the analysis to ward a correct result. While demonstrating that agen ts can con tribute meaningfully to a real measuremen t, this sup ervised 4 w orkflow differs fundamen tally from the autonomous approach we prop ose: in JFC , multi-agen t review replaces the human feedbac k lo op during analysis dev elopment, and h uman ov ersigh t is concen trated at a single formal unblinding gate rather than distributed throughout the pro cess. 2.2 Agen ts for sim ulation and exp erimen t design GRA CE [ 12 ] tak es a differen t approac h, targeting the upstream problem of experimental design rather than data analysis. Given a natural-language prompt or a published exp erimental paper, GRACE extracts a structured representation of the experiment, constructs a runnable simulation, and autonomously explores design mo difications using Monte Carlo methods. The system demonstrates that an agen tic approac h to detector geometry optimization can identify directions consisten t with known upgrade priorities, using only baseline simulation inputs. 2.3 LLM-assisted to olkits CoLLM [ 9 ] provides an end-to-end pip eline from plain-language analysis sp ecifications to trained deep learn- ing classifiers for collider analyses. While it uses LLMs to generate analysis co de under physics constrain ts, it functions more as an AI-assisted to olkit with a graphical user in terface than as a fully autonomous agent, and fo cuses sp ecifically on the deep learning classification step of an analysis. Xiwu [ 20 ] is an LLM assistan t for HEP that can switch betw een op en-w eight mo dels and b e fine-tuned on domain kno wledge. 2.4 Benc hmarks and comm unit y efforts The CelloAI benchmarks [ 3 ] provide a framework for ev aluating AI assistants in HEP contexts, focusing on co de do cumen tation and generation tasks. Notably , the CelloAI b enchmark suite do es not y et include ph ysics analysis tasks, lea ving a gap in the comm unit y’s ability to systematically ev aluate the most impactful p oten tial application of AI agents in HEP . 2.5 Kno wledge retriev al for physics analysis A k ey challenge for autonomous analysis agen ts is accessing and applying domain knowledge from the existing literature. Standard retriev al-augmen ted generation (RA G) approaches struggle with scientific text b ecause relev an t information often spans m ultiple sections, figures, and ev en m ultiple papers. McGreivy et al. [ 17 ] address this with SciT reeRA G, which exploits the hierarchical structure of exp erimen tal physics pap ers to build a tree representation of a document corpus, enabling more con textually coherent retriev al than flat c hunking approaches. They additionally introduce SciGraphRAG, which transforms unstructured literature in to structured knowledge graphs to capture cross-do cument relationships. Both systems are demonstrated on the LHCb exp eriment corpus. PaperQA2 [ 19 ] tak es a complementary approac h, treating retriev al and resp onse generation as a m ulti-step agent task that can revise its searc h parameters and examine candidate answ ers b efore pro ducing a final resp onse, ac hieving sup erh uman accuracy on scien tific literature synthesis b enc hmarks. 2.6 Summary and gaps T able 1 summarizes the capabilities of the existing agentic systems for HEP discussed ab o ve. The existing landscap e reveals sev eral important patterns. First, most HEP-focused agen tic systems op erate within highly scaffolded workflo ws where the analysis structure is pre-defined and the agent’s role is limited to co de generation within fixed steps. Second, none of the existing systems combine autonomous m ulti-step planning, domain kno wledge retriev al from the literature, and m ulti-agent review into an in te- grated framework. Third, the comm unity lacks benchmarks for ev aluating agents on realistic, end-to-end ph ysics analysis tasks—the kind that require e v en t selection, background estimation, systematic uncertain ty quan tification, and statistical inference. These gaps motiv ate the framework we presen t in the following sections. 5 Capabilit y [ 10 ] [ 8 ] GRA CE [ 12 ] CoLLM [ 9 ] [ 4 ] JF C Autonomous task planning – ∼ ✓ – – ✓ Co de generation & execution ✓ ✓ ✓ ✓ ✓ ✓ Literature retriev al (RAG) – – – – – ✓ Multi-agen t review – – – – – ✓ End-to-end analysis ∼ ✓ – ∼ ✓ ✓ Rep ort generation – – – – ∼ ✓ Minimal human o versigh t ✓ ✓ ✓ – – ✓ T able 1: Comparison of agentic AI systems for HEP . ✓ = fully supp orted, ∼ = partially supported, – = not supp orted. “End-to-end analysis” refers to pro ducing a complete measurement (selection through statistical inference); “minimal human o versigh t” indicates the system can run without contin uous human feedbac k. 3 The JF C F ramew ork JF C is an agentic framework for autonomous execution of HEP analysis pip elines. An orchestrator agent delegates execution and review w ork to subagen ts across sequential phases . Eac h phase m ust pro duce a written artifact and pass an indep enden t review b efore the next phase can b egin, pro viding structured task decomp osition and fresh context management. In this section, we describe its domain-sp ecific capabilities: ph ysics-aw are analysis planning, literature-based kno wledge retriev al, and m ulti-agent review with role- sp ecialized critics. 3.1 Arc hitecture ov erview The JFC framework is built on Claude Co de , Anthropic’s agentic co ding CLI, which serves as b oth the LLM back end and the agent runtime. The underlying mo del is Claude Opus (specifically , claude-opus-4-6 with a one-million-token context windo w), used for all executor, review er, and arbiter subagen ts. All analyses presen ted in this pap er w ere run with Opus for ev ery agent role to ensure reasoning quality is never silen tly degraded; opp ortunities for cost reduction b y delegating narrowly scoped roles to ligh ter-weigh t mo dels are discussed in Section 4.2 . The key architectural principle is that the agent op erates with genuine autonomy o ver the analysis work- flo w. Unlike prior systems that constrain the agent to co de generation within pre-defined analysis steps [ 10 ], the JF C analysis agent receiv es a high-level physics ob jective—suc h as “measure the hadronic cross-section of the Z boson using ALEPH data”—and must autonomously determine the full analysis strategy: what ev ent selection to apply , how to estimate bac kgrounds, whic h systematic uncertainties to ev aluate, what statistical framew ork to use, and ho w to present the results. The agent is not giv en a template or sk eleton analysis to fill in; it must plan and execute the en tire pip eline from scratc h, consulting the literature as needed. This autonomy is scaffolded, not unconstrained. The agent op erates within a ph ysicist-developed kno wl- edge base that takes tw o complementary forms. The first is a methodology specification: a natural-language do cumen t defining the phases of the analysis, the outputs eac h phase m ust produce, and the review gates sep- arating them—expressed in prose rather than co de, so that the agent exercises judgment within constrain ts rather than executing a rigid pip eline. The second is a con ven tions directory: a set of living documents, main tained and updated b y physicists after eac h analysis, encoding accum ulated domain kno wledge for sp e- cific tec hniques—which systematic uncertaint y sources are standard for a given measuremen t t yp e, what v alidation c hecks a giv en correction metho d requires, and what pitfalls exp erienced analysts know to av oid. Where the methodology tells the agent that it must ev aluate systematic uncertain ties, the conv en tions tell it which sources are standard and why . T ogether, these constitute an operational con text analogous to a grad- uate course curriculum: not prescribing ev ery step, but establishing the standards and domain knowledge against which the agen t’s decisions are made and reviewed. 6 P hase 3: S elec tion P hase 2: Explor a tion A ut onomous HEP analy sis fr amew or k P h y sics objec tiv e " M easur e Z lineshape" Or chestr a t or agen t D elega t es , nev er e x ecut es S equen tial phases — each ga t ed by an output + r eview P hase 1: S tr a t egy Ex ecut or subagen t E ve n t selec tion · c or r ec tions P hase 4 — inf er enc e Human g a te R esults M easur emen ts A naly sis c ode R epr oducible F igur es P ublica tion qualit y Dr af t paper A naly sis not e PDF W or ko w is auditable + r epr oducible P hase / e x ecut or R eview + output Human g a te Lit er a tur e r etr iev al O utputs Ex ecut or subagen t T echnique selec tion · sy st ema tic plan S ciT r eeR A G Lit er a tur e r etr iev al 4a Expec t ed Ex ecut or subagen t 4b P ar tial Unblinding Ex ecut or subagen t 4c F ull Unblinding Ex ecut or subagen t P hase 5: D ocumen ta tion Ex ecut or subagen t A naly sis not e · pandoc PDF R eview S elf-R eview R eview R eview R eview R eview Ex ecut or subagen t Da ta qualit y · v ar iable r ank ing R eview Figure 2: The JF C framework. A high-level physics ob jectiv e is passed to an autonomous analysis agent, whic h plans and executes the full pipeline while querying a literature retriev al system (SciT reeRA G) for domain knowledge. The resulting analysis undergo es multi-agen t review; if an y reviewer flags an issue the agen t revises and resubmits until all reviewers approv e. Even tually the final do cumen t (a rich analysis note) is passed to h uman ph ysicists for ev aluation. 3.2 T ask decomp osition into phases The orc hestrator decomp oses the analysis in to seven sequential phases, each gated by the pro duction of a written artifact on disk and passage of an indep enden t review: 1. Phase 1 — Strategy (4-b ot review): The lead-analyst agent queries the literature corpus, identifies signal and background pro cesses, prop oses even t selection, defines the blinding v ariable, and outlines the systematic uncertaint y program. A mandatory reference analysis surv ey (2–3 published analyses) 7 and a conv entions compliance table are required. The output artifact is STRATEGY.md . 2. Phase 2 — Exploration (self-review): Three sp ecialist agents run in parallel: data-explorer (sam- ple in ven tory , data qualit y), detector-specialist (ob ject definitions, data/MC v alidation), and theory-scout (cross-sections, generator recommendations, prior results). The lead-analyst con- solidates their outputs in to EXPLORATION.md . 3. Phase 3 — Selection and Bac kground Mo deling (1-b ot review per c hannel): The signal-lead and background-estimator agents implemen t ev ent selection, define control and v alidation regions, p erform closure tests, and pro duce the SELECTION.md artifact. Multi-channel analyses spawn parallel agen t pairs p er c hannel. 4. Phase 4a — Exp ected Results (4-b ot review): One systematic-source-evaluator agent p er systematic source runs in parallel. The systematics-fitter then constructs the lik eliho od, p erforms Asimov fits, signal injection tests, and all eigh t mandatory fit diagnostics. Output: INFERENCE EXPECTED.md . 5. Phase 4b — P artial Un blinding (4-bot review → h uman gate): The fitter runs on a 10% signal- region data subsample. The note-writer pro duces a draft analysis note and an unblinding c hecklist. After the review passes, the pip eline p auses and presents a structured summary to the human analyst, who must explicitly appro ve, request changes, or halt. 6. Phase 4c — F ull Un blinding (1-b ot review): Executes only after h uman appro v al. The fitter runs on the complete dataset; the cross-checker v alidates consistency with partial and exp ected results. 7. Phase 5 — Documentation (5-b ot review): The note-writer pro duces the final analysis note ( ANALYSIS NOTE.md ) in pandoc-compatible Markdown, compiled to PDF via pandoc . The review adds a rendering-reviewer that compiles and insp ects the PDF. The orchestrator agent itself nev er writes analysis co de or reads full data files; it only holds the ph ysics prompt, phase summaries, and review v erdicts, delegating all computational work to disposable subagen ts whose contexts are discarded after each inv ocation. 3.3 T o ols and softw are stac k The framework enforces a sp ecific, pure-Python HEP soft ware stac k, do cumented in T able 2 . T ask Required tool Explicitly forbidden R OOT file I/O uproot ( ≥ 5.0) PyR OOT, R OOT C++ macros Arra y operations awkward-array , numpy pandas (for even t data) Histogramming hist , boost-histogram R OOT TH1 , numpy.histogram Plotting matplotlib + mplhep R OOT TCanvas , plotly Statistical mo deling pyhf (binned), zfit (un binned) Ro oFit, RooStats 4-v ectors vector , particle Man ual calculations Do cumen t preparation pandoc ( ≥ 3.0) + xelatex LLM-based conv ersion Logging logging + rich bare print() T able 2: Mandated soft ware stac k. Optional pac k ages include coffea for columnar even t processing, xgboost / scikit-learn for MV A, and iminuit / cabinetry for fitting, activ ated as needed p er analysis. All analysis code is written in a columnar st yle: selections are bo olean masks o ver arra ys, not even t loops. Agen ts protot yp e on ∼ 1000-even t slices b efore scaling to the full dataset, with automatic scale-out rules: single-core for jobs under 2 min utes, ProcessPoolExecutor for 2–15 min utes, and SLURM batc h submission for longer runs. 8 3.4 Literature-based knowledge retriev al A critical component of the JFC framew ork is its in tegration with a literature-based kno wledge retriev al system built on SciT reeRAG [ 17 ]. An analysis agen t op erating without access to the existing literature w ould need to rely entirely on knowledge absorb ed during pre-training—a brittle foundation, since the sp ecific details of ho w a particular exp erimen t’s even t selection was designed, what detector effects must b e accoun ted for, or how a particular systematic uncertaint y was historically ev aluated are rarely represen ted in an LLM’s training data with sufficien t fidelity to b e directly useful. Our retriev al system indexes a corpus of published LEP (ALEPH, DELPHI) papers, exploiting the hier- arc hical do cumen t structure (sections, subsections, figures, tables) to build contextually coherent retriev als rather than the flat text ch unks used b y standard RA G systems. The ALEPH catalog contains 1,503 total en tries (399 papers, 736 proceedings, 368 theses). Of those, 721 hav e source material a v ailable, and 575 w ere successfully conv erted to markdown. The DELPHI catalog con tains 4,305 total en tries (2,967 pap ers, 763 pro ceedings, 575 theses). Of those, 2,083 hav e source material a v ailable, and 1,868 were successfully con verted to markdown. Metadata was harvested from INSPIRE-HEP and CERN CDS. Source PDFs and LaT eX were then do wnloaded. PDFs w ere con verted to structured markdo wn using Nougat (a GPU-based neural OCR mo del) running on A100 80GB GPUs, while LaT eX sources were con verted via Pandoc. The raw markdown then w ent through a reproducible postpro cessing pip eline that normalizes math delimiters, strips missing-page mark ers and b oilerplate, remo ves P ando c div markers, and collapses excess whitespace. When the analysis agent needs domain knowledge—for example, what charged-trac k quality cuts were used in previous ALEPH hadronic ev ent selections, or what sources of systematic uncertain ty w ere considered in a particular measurement—it queries the retriev al system and receives relev ant passages from published pap ers, complete with the surrounding con text needed to interpret them correctly . This allo ws the agent to mak e informed decisions grounded in established exp erimental practice rather than guessing or hallucinating analysis choices. The retriev al system is particularly imp ortan t for three asp ects of an analysis: • Ev en t selection : The agent retrieves descriptions of selection criteria used in similar published anal- yses to inform its own selection design, adapting cuts to the sp ecific measuremen t while main taining consistency with established practice. • Systematic uncertainties : Identifying the relev an t sources of systematic uncertaint y for a given measuremen t is one of the most exp ertise-in tensive asp ects of an analysis. The retriev al system provides the agent with descriptions of ho w systematic uncertainties were ev aluated in prior work, including whic h v ariations w ere considered and how they w ere propagated. • Statistical metho ds : The agen t retriev es information ab out the statistical framew orks used in similar measuremen ts (e.g., profile lik eliho od fits, template metho ds) to guide its c hoice of inference procedure. Presen ted analyses p erformed on the ALEPH Op en Data hav e had access to this literature-based kno wl- edge retriev al, whereas the ones on DELPHI and CMS did not. 3.5 Agen t architecture The framew ork defines specialist agent profiles, each sp ecified as a Markdown file in .claude/agents/ with a Y AML fron tmatter declaring the agent’s name, description, av ailable to ols, and mo del tier. These agents fall into four functional categories: • Executor agen ts (7): lead-analyst , data-explorer , detector-specialist , theory-scout , signal-lead , background-estimator , systematics-fitter , each with deep domain prompts enco ding HEP methodology (e.g., the fitter’s prompt sp ecifies all eigh t mandatory fit diagnostics, three in-situ constraint strategies, and the CL s exclusion pro cedure). • Specialist agents (4): systematic-source-evaluator , cross-checker , note-writer , ml-specialist , activ ated as needed for sp ecific sub-tasks. 9 • Review er agents (5): physics-reviewer (reviews purely on ph ysics merit, without access to method- ology do cumen ts), critical-reviewer , constructive-reviewer , rendering-reviewer (compiles and insp ects the PDF), and plot-validator (programmatic, non-visual v alidation of plotting co de and histogram data). F urther information in Sec. 3.6 • Adjudication agen ts (2): arbiter (syn thesizes all review er findings in to a P ASS/ITERA TE/ESCALA TE v erdict) and investigator (traces regre ssion triggers to their ro ot cause across phases). Ev ery subagent is spa wned as a fresh pro cess with no memory of prior inv o cations, receiving only the sp ecific artifacts and instructions relev an t to its task. This disp osable-con text architecture preven ts context windo w exhaustion (Context Rot [ 13 ]) during long analyses and ensures that each agent op erates from a clean state. 3.6 Multi-agen t review Before any analysis phase is considered complete, it undergoes automated review by a panel of sp ecialized review er agents. This multi-agen t review system is designed to mirror the internal review process of a real HEP collab oration, where differen t exp erts scrutinize differen t asp ects of an analysis b efore it is appro ved for publication. Six distinct agen ts participate in the review process, eac h defined by a Markdo wn profile in .claude/agents/ sp ecifying its role, av ailable tools, mo del tier, and ev aluation criteria. Ph ysics review er ( physics-reviewer ). Operates as an indep endent senior collab oration member equiv- alen t to an Analysis Review Committee (ARC) member or Lev el-2 conv ener. Critically , this agent is delib er- ately denie d ac c ess to the metho dology sp ecification, conv en tions do cuments, review c hecklists, and previous review feedback. It receives only the physics prompt and the artifact under review, and ev aluates purely on the basis of ph ysics knowledge: signal modeling, bac kground iden tification and estimation, systematic uncertain ty completeness, cross-c heck adequacy , and publication readiness. This design choice ensures that the physics reviewer assesses the analysis as an external referee w ould, without b eing guided b y the frame- w ork’s own criteria. Issues are classified as Category A (blo c king: “w ould cause rejection”), B (imp ortan t: “w eakens the analysis”), or C (minor: “style or clarity”). Critical review er ( critical-reviewer ). P erforms the most comprehensiv e review. Unlik e the physics review er, the critical reviewer do es read the metho dology sp ecification and con ven tions do cumen ts, and applies them systematically . Its proto col includes eight mandatory steps: (1) phase-sp ecific review focus, (2) issue classification (A/B/C), (3) ro w-by-ro w con ven tions compliance c heck against the applicable con v en- tions do cument, (4) reference analysis comparison (“what w ould a comp eting group ha ve that we do not?”), (5) a 14-p oin t figure and lab el c hecklist (v erifying axis lab els with units, luminosit y stamps, exp erimen t la- b els, ratio panels, uncertaint y bands, and the absence of forbidden elements suc h as figure titles), (6) ph ysics sanit y c hec ks on ev ery plot (distribution shap es, yield magnitudes, data/MC agreemen t, uncertain t y pro- p ortionalit y), (7) regression detection b y comparing the curren t artifact against earlier phase outputs, and (8) upstream feedback identifying issues that originate in prior phases. Each finding includes a file path or figure reference, an impact statement, and a suggested fix. Constructiv e reviewer ( constructive-reviewer ). Complemen ts the critical review er b y fo cusing on impro vemen ts and alternativ es rather than flaw detection. It ev aluates clarity , v alidation sufficiency , alterna- tiv e approaches (additional signal regions, complemen tary cross-chec ks, improv ed statistical metho dology), presen tation quality , and notation consistency . Each suggestion includes the curren t state, the impro ved state, a justification, and an effort estimate (low/medium/high). The constructive reviewer also provides explicit p ositiv e feedback (marked with [+] ) for genuinely strong asp ects of the analysis, and is instructed not to duplicate findings already raised b y the critical review er. While it primarily pro duces Category B and C findings, it can escalate to Category A if a fundamental gap is discov ered. 10 Rendering reviewer ( rendering-reviewer ). Activ ated only during Phase 5 (do cumen tation) review, adding a fifth review er to the panel. This agent compiles the analysis note to PDF, then inspects the compiled output across eigh t dimensions: figure rendering integrit y , LaT eX math compilation, page lay out (orphaned text, page breaks, margins), cross-reference resolution ( @fig: , @tbl: , @eq: , @sec: labels), citation resolution against the bibliography , table formatting and ov erflo w, and page coun t assessment (targeting 50– 100 pages for a complete analysis note). It fo cuses exclusively on rendering qualit y and do es not commen t on physics con tent. Plot v alidator ( plot-validator ). Unlik e the other reviewers, the plot v alidator p erforms pr o gr ammatic , non-visual v alidation. It runs in parallel with the other review ers during ev ery review cycle that in volv es figure-pro ducing phases. Its chec ks fall into four categories: • Programmatic figure c hec ks (8 c hecks): V erifies that collab oration plotting st yles ( mplhep ) are applied, figure sizes match the template, no forbidden ax.set title() calls exist, axis lab els include units, no hardco ded fon t sizes app ear, bbox inches="tight" is used at sav e time, b oth PDF and PNG formats are sav ed, and plt.close() is called after saving. • Ph ysics sanit y chec ks (11 chec ks): V erifies that yields are non-negative, efficiencies lie in [0 , 1], data/MC ratios in con trol regions fall within [0 . 5 , 2 . 0], uncertain ties scale as √ N , p T and mass dis- tributions fall off at high v alues, cutflo w yields are monotonically non-increasing, and background comp osition fractions sum to ∼ 100%. • Consistency c hecks (6 c hec ks): Cross-v alidates that the same process has consistent yields across differen t plots, pre-fit and p ost-fit yields are consistent with the fit, and nuisance parameter impact rankings match the uncertain ty breakdo wn table. • Red flag detection (10 chec ks): Automatic Category A triggers including negative ev ent yields, efficiencies outside [0 , 1], data/MC ratios outside [0 . 2 , 5 . 0] in control regions, zero uncertain ty on non- zero predictions, non-con verged fits, nuisance parameter pulls > 3 σ , χ 2 / ndf > 5 . 0, and systematic v ariations exceeding 100%. Red flags from the plot v alidator are automatically classified as Category A findings; the arbiter is explicitly forbidden from downgrading them. Arbiter ( arbiter ). The arbiter is not a reviewer but an adjudicator: it reads all review er outputs and the original artifact, then synthesizes a single verdict. Its adjudication follows a five-case decision framework: (1) b oth review ers agree → accept at the higher severit y; (2) review ers disagree on severit y → ev aluate argumen ts and assign with do cumen ted reasoning; (3) only one reviewer raised the finding → assess v alid- it y independently; (4) reviewers con tradict each other → examine the artifact directly to resolv e; (5) both review ers missed something → the arbiter adds its o wn finding. The arbiter pro duces a structured adjudica- tion table mapping each finding to its source reviewer(s), their assigned categories, and the final adjudicated category with rationale. It then issues one of three decisions: P ASS (no Category A or unresolved B items remain), ITERA TE (actionable Category A items exist that can b e fixed within the curren t phase), or ESCALA TE (fundamen tal problem requiring human judgmen t or upstream phase changes). This automated review lay er serv es tw o purp oses. First, it catches a class of errors—suc h as applying a selection cut that inadverten tly remov es signal, double-coun ting a systematic uncertaint y , or using an inappropriate test statistic—before a human ph ysicist ev er needs to lo ok at the analysis. Second, it pro vides structured documentation of the analysis decisions and their justifications, making subsequen t h uman review more efficient because the reviewer can fo cus on ph ysics judgment rather than hun ting for mechanical errors. The review process is delib erately conserv ative: the analysis is not permitted to pro ceed to “unblinding”— examining the real data in the signal region—until all reviewer agen ts ha ve signed off. The review proto col is not advisory; it is binding . When the arbiter issues an ITERATE verdict, the orchestrator initiates a closed- lo op revision cycle in which the executor agent must demonstrably address every Category A finding b efore the analysis can adv ance. In the final documentation phase, the review system extends to the written analysis rep ort itself: the analyses presen ted in the app endices of this pap er, including all publication-quality plots, w ere drafted en tirely b y the agen t system and refined through the same multi-agen t review cycle. 11 3.7 Pip eline state tracking The orc hestrator maintains a machine-readable state file ( STATE.md ) that records the current phase, status, timestamp, a phase history table (with artifact paths, review tiers, iteration coun ts, and notes), activ e blo c k ers, and a regression log. Every state transition—executing, reviewing, passed, blo c ked, human gate, regression—is timestamp ed. A dedicated /check-status command reads this file and presents a formatted status rep ort including completed artifacts, review iteration coun ts p er phase, and any outstanding blo c kers. This state trac king enables the orc hestrator to resume from the correct p oin t after interruptions and pro vides a complete audit trail of the analysis pip eline’s progression. 3.8 F ormal unblinding proto col The framework implemen ts a structured, multi-stage blinding proto col that enforces a mandatory human gate b efore observed data in the signal region can b e examined. Blinding enforcemen t. F rom the start of the analysis, a blinding proto col defined in Phase 1 sp ecifies the blinding v ariable and the signal region boundaries. All agents op erate under a standing constraint: “nev er access signal region data un til explicitly told unblinding is approv ed.” During Phases 1–4a, the statistical analysis uses exclusively Asimov datasets (exp ected background, optionally with injected signal at µ = 1 for disco very pro jections). Post-fit distributions sho w data p oin ts remov ed from the signal region or replaced with Asimov pseudo data. The analysis config.yaml configuration file carries a bo olean flag blinding.approved for unblinding , initialized to false . P artial unblinding (Phase 4b). Phase 4b introduces a controlled partial unblinding: the systematics-fitter agen t runs the statistical fit on a 10% random subsample of the signal region data, using a fixed random seed for reproducibility . This limited exposure allo ws the analysis to verify that no unexp ected pathologies appear (e.g., dramatic data/MC disagreement, fit non-conv ergence, anomalous n uisance parameter pulls) without rev ealing the full result. The note-writer agen t sim ultaneously pro duces a draft analysis note and an un blinding chec klist cov ering seven criteria: 1. Bac kground mo del v alidated (closure tests pass in all v alidation regions), 2. Systematic uncertain ties ev aluated and fit mo del stable, 3. Expected results ph ysically sensible, 4. Signal injection tests confirm fit reco vers injected signals, 5. 10% partial un blinding sho ws no unexpec ted pathologies, 6. All agen t review cycles resolved (arbiter P ASS), 7. Draft analysis note reviewed and considered publication-ready mo dulo full obse rv ed results. Human gate. After the Phase 4b review passes (4-b ot tier), the pipeline halts automatic al ly and sets the state to human gate . The orchestrator do es not proceed to Phase 4c under any circumstances without ex- plicit human authorization. The human analyst inv ok es /approve-unblinding , whic h presen ts a structured appro v al request con taining: • A summary of the draft analysis note (physics pro cess, methodology , 10% results), • The un blinding chec klist with per-item pass/fail status, • The 4-bot review result (arbiter decision, iteration count, residual Category B/C items), • Key results from the 10% data (expected vs. observed, goo dness-of-fit, notable n uisance parameter pulls), • P aths to all artifacts av ailable for detailed human review. 12 The human responds with one of three decisions: APPROVE The configuration flag approved for unblinding is set to true . The pip eline proceeds to Phase 4c (full unblinding). REQUEST CHANGES The pipeline re-en ters Phase 4b execution with the human’s change request as additional input. After the executor revises the artifacts, the 4-bot review runs again, and up on passing, the h uman gate is re-presented with the up dated summary . HALT The pip eline is stopp ed entirely . The state is set to halted with the h uman’s reason recorded. The analysis can only resume after the h uman addresses the stated concerns. F ull unblinding (Phase 4c). Phase 4c is gated b y a programmatic c heck: the systematics-fitter agen t verifies that approved for unblinding: true in the configuration b efore pro ceeding. If the flag is false , execution stops with an instruction to run /approve-unblinding first. Once confirmed, the fitter runs the full statistical fit on the complete dataset. A cross-checker agen t then indep enden tly v alidates the results, c hecking consistency b etw een the full observed results, the 10% partial results, and the Asimov exp ected results. The full results undergo a 1-b ot review (critical reviewer plus plot-v alidator) b efore the analysis adv ances to Phase 5 (do cumen tation). Design rationale. The three-stage unblinding sequence (Asimo v → 10% data → full data) with a manda- tory human gate mirrors the blinding proto cols used by ma jor HEP collab orations. The human gate is the single point in the pipeline where autonomous execution is deliberately suspended, ensuring that the decision to examine the full signal region data remains under h uman control. This design reflects the principle that while analysis execution and review can b e automated, the scientific judgmen t to unblind a result should not b e. 4 Results & Discussion W e demonstrate the JFC framew ork b y repro ducing several published ALEPH and DELPHI measuremen ts using arc hived LEP data and Monte Carlo simulation samples, and a CMS Higgs b oson measurement us- ing CMS Op en Data [ 7 ]. The full analyses, including detailed descriptions of ev ent selection, systematic uncertain ty ev aluation, and statistical results, are presented as standalone pap ers in the app endices. These app endix papers w ere written en tirely by the agent system, with zero human feedbac k beyond a simple prompt describing the general task and data lo cation. The analysis notes were originally pro duced as stan- dalone do cumen ts; for inclusion in this pap er, an AI agent performed minor reformatting (section-level adjustmen ts, figure path up dates) with no c hanges to ph ysics con tent, methodology , or results. 4.1 Analyses T able 3 summarizes the analyses pro duced with JFC . Eac h analysis w as initiated with a short natural- language prompt (reproduced in the table) sp ecifying only the measuremen t goal and the data location; the agent autonomously determined the full analysis strategy . All runs used a Claude Max subscription 2 with the Claude Opus mo del, and generally completed end-to-end within 4-6 hours. In some cases, waiting p eriods for Claude Code usage limit resets slow ed execution times. The complete agent-produced analysis notes are repro duced in the appendices for a few represen tative examples. The framework specification, agen t definitions, and additional autonomously generated analyses are a v ailable in our github repository (see Section 6 ). W e do not guaran tee the ph ysics v alidit y of an y of these r esults, but instead, show them as examples of what is already p ossible with a single prompt. 2 https://claude.ai 13 Analysis Initial prompt (abridged) Data Time Ref. Z lineshap e + α s Measure Z lineshape, thrust and ex- tract α s ( M Z ) via NLO+NLL QCD fits ALEPH 3h 10m A / Gith ub Lund plane Measure the primary Lund jet plane densit y in hadronic Z decays ALEPH 13h 13m B / Gith ub N ν from Γ inv Measure the num b er of ligh t neu- trino generations from the Z in vis- ible width DELPHI 5h 15m C / Github Energy-energy corre- lator Measure the t wo-point energy- energy correlator in hadronic Z deca ys DELPHI 5h 23m D / Gith ub H → τ τ Measure the Higgs b oson signal strength in the µτ h final state at √ s = 8 T eV CMS 6h 30m E / Github R b and R c Measure R b and R c using lifetime b- tagging and D ∗± c harm tags ALEPH 2h 20m Gith ub Jet substructure Measure gro omed and ungro omed jet substructure observ ables in hadronic Z decays DELPHI 7h 42m Github Ev ent shap es + α s Measure six even t shap e distribu- tions and extract α s ( M Z ) from NLO QCD fits DELPHI 4h 15m Github Lund jet plane Measure the primary Lund jet plane densit y in hadronic Z decays DELPHI 6h 59m Github T able 3: Summary of analyses produced with JF C . Each analysis w as run autonomously from the initial prompt shown, with no human interv ention beyond the unblinding appro v al gate. W all-clock times include all review iterations. Z lineshap e, thrust, and α s extraction (App endix A ). Building on the thrust measurement, the agen t performs an α s ( M Z ) extraction b y fitting the unfolded thrust distribution to NLO+NLL p erturbative QCD predictions. Scaffold and run a measurement analysis of the Z lineshap e parametersand an extraction of α s using arc hived ALEPH data at √ s ∼ 91 . 2 GeV. Setup: scaffold analyses/zlineshap e alphas as a measurement, set data dir=[...] in analysis config, install the pixi en vironment, then begin orchestrating. Data: [...] Lund jet plane (App endix B ). The agen t measures the primary Lund jet plane density using Cam- bridge/Aac hen declustering of hadronic Z decay ev en ts, including a tw o-dimensional unfolding. Scaffold and run a measuremen t analysis of the primary Lund jet plane densit y in hadronic Z deca ys using arc hived ALEPH data at √ s ∼ 91 . 2 GeV. Setup: scaffold analyses/lund plane as a measurement, set data dir=[...] in .analysis config, install the pixi envi- ronment, then b egin orchestrating. Observ able: The 2D densit y of primary Cambridge/Aac hen declusterings in each thrust hemisphere, mapp ed to coordinates (ln 1 / ∆ θ , ln k t /GeV), where ∆ θ is the emission angle and k t = E soft sin ∆ θ . Use charged particles only ( pwflag == 0, highPurity == 1). One jet = one hemisphere. Deliverables: 1. 2D density ρ (ln 1 / ∆ θ, ln k t ) corrected to c harged-particle level, 10–15 × 10–15 bins at least - as fine as it makes sense 2. 1D pro jections ( k t spectrum, angular sp ectrum) with cov ariance 14 3. Num ber of primary declusterings vs. minimum k t threshold 4. Comparison to PYTHIA 6 MC 5. Mac hine-readable results (CSV/NPY) Data: [...] N ν from Γ in v (App endix C ). The agen t measures the num b er of light neutrino generations N ν from the in visible width of the Z boson using 3.5 million hadronic Z decays from the DELPHI 1992–1995 energy scan. Reproduce the classic LEP measuremen t of the n umber of ligh t neutrino generations N ν from the Z boson invisible decay width using DELPHI op en data. Measure the Z lineshap e (cross section vs √ s ) in the hadronic and leptonic channels from the LEP1 energy scan runs. Extract Γ Z (total width), Γ had , and Γ ℓ . Compute the invisible width Γ inv = Γ Z − Γ had − 3Γ ℓ and determine N ν = Γ inv / Γ ( S M ) ν ¯ ν . The original LEP result is N ν = 2.9840 ± 0.0082, consistent with exactly 3 generations. Data: [...] Energy-energy correlator (App endix D ). The agen t measures the charged-particle tw o-p oin t energy- energy correlator (EEC) in hadronic Z decays using DELPHI open data, corrected to stable c harged-particle lev el using iterative Ba yesian unfolding with 108 angular bins spanning χ ∈ [0 . 006 , 3 . 139] rad. Measure the t wo-point energy-energy correlator (EEC) as a function of the angular separation χ between particle pairs in hadronic Z decays at DELPHI. Cov er the full angular range from the collinear limit ( χ → 0) through the transition region to the back-to-back limit ( χ → π ). Correct for detector effects using the DELPHI simulation. In the back-to-bac k region, extract α s using the transverse-momen tum-dependent (TMD) factorization framew ork, which provides excellent p erturbativ e con trol. In the collinear limit, prob e the transition from perturbative to non-perturbative dynamics and test mo dels of hadronization including recent analytic results connecting EECs to track functions. Measure the three-p oin t EEC as well, which prob es the detailed radiation pattern in e + e − → 3 jets. Compare with predictions from modern generators (Pythia 8, Sherpa, Herwig 7). Data: [...] CMS Op en Data H → τ τ (Appendix E ). The agent measures the Higgs boson signal strength in the µτ h final state using CMS Op en Data at √ s = 8 T eV, including a template-based profile lik eliho o d fit. Y ou are performing a Higgs b oson searc h in the τ + τ − decay channel using CMS Open Data from 2012 at √ s = 8 T eV. The final state is one m uon and one hadronically decaying tau lepton ( µτ h ). Y our goal is to pro duce distributions of k ey observ ables — particularly the visible di-tau mass — showing the Higgs signal contribution on top of Standard Model backgrounds. This lo osely follows the official CMS publication (Phys. Lett. B 779 (2018) 283). Data: [...] R b and R c ( Gith ub ) The agent measures the ratios R b = Γ( Z → b ¯ b ) / Γ( Z → hadrons) and R c = Γ( Z → c ¯ c ) / Γ( Z → hadrons) using lifetime-based b-tagging and D ∗± c harm tags. Scaffold and run a measurement analysis of the partial width ratios R b = Γ( Z → b ¯ b ) / Γ( Z → hadrons) and R c = Γ( Z → c ¯ c ) / Γ( Z → hadrons) in hadronic Z decays using archived ALEPH data at √ s ∼ 91 . 2 GeV . Setup: scaffold analyses/rb rc as a measuremen t, set data dir=[...] in .analysis config, install the pixi environmen t, then begin orchestrating. Observ able: R b and R c are the fractions of hadronic Z decays to b ¯ b and c ¯ c respectively . T ag b and c even ts using lifetime-based metho ds: signed impact parameter significance of c harged tracks (p wflag == 0, highPurity == 1) relative to the primary vertex, and/or secondary vertex reconstruction. Use a double-tag metho d to reduce dependence on MC tagging efficiency . Deliverables: 15 1. R b and R c with statistical and systematic uncertainties 2. b-tagging and c-tagging p erformance (efficiency , purity , mistag rates) v alidated against MC 3. Double-tag consistency chec ks 4. Comparison to published ALEPH R b / R c v alues and the SM prediction 5. Mac hine-readable results Data: [...] Jet substructure ( Gith ub ). The agent measures sev en groomed and ungro omed jet substructure ob- serv ables — jet mass, Soft Drop gro omed jet mass, momentum sharing fraction z g , groomed op ening angle R g /R , Soft Drop m ultiplicity n SD , and N-sub jettiness ratios τ 21 and τ 32 — in hadronic Z decays using arc hived DELPHI data, for an ti- k T jets with R = 0 . 4 and R = 0 . 8. Measure a comprehensiv e set of jet substructure observ ables in hadronic Z deca ys using anti-kT jets with R = 0.4 and R = 0.8 reconstructed from DELPHI data at √ s = 91.2 GeV. F or eac h jet radius, measure: ungro omed and Soft Drop gro omed ( z cut = 0.1, β = 0) jet mass; the groomed momentum sharing fraction z g ; the groomed opening angle R g /R ; the n umber of Soft Drop splittings n SD ; and N-sub jettiness ratios τ 21 and τ 32 . Present all observ ables differentially as a function of jet energy . Correct to particle level using the DELPHI simulation chain, applying either bin-b y-bin or iterativ e Bay esian unfolding. Compare with Pythia 8, Herwig 7, Sherpa 2/3, and with p erturbativ e QCD calculations where av ailable (NLL for gro omed jet mass, Soft Drop z g ). This analysis directly mirrors the ALEPH jet substructure publication (JHEP 06 (2022) 008), providing an independent measurement with different detector systematics. Data: [...] Ev ent shap es + α s extraction ( Gith ub ). The agen t measures six infrared- and collinear-safe hadronic ev ent shape distributions — thrust, hea vy jet mass, wide and total jet broadenings, C -parameter, and Durham y 23 — in e + e − annihilation at √ s ≈ 91 . 2 GeV using DELPHI op en data, and extracts α s ( M Z ) from NLO QCD fits to the unfolded distributions. Perform a state-of-the-art determination of the strong coupling constant, α s ( M Z ), using hadronic Z -decay data from DELPHI at √ s = 91 . 2 , GeV . Measure the distributions of classic ev ent-shape v ariables—thrust ( T ), heavy jet mass ( ρ H ), wide and total jet broadenings ( B W and B T ), the C -parameter, and the Durham y 23 jet-resolution parameter—using the full LEP1 data set. Correct the data to particle level using the DELPHI sim ulation chain and unfold the distributions with modern techniques, such as iterative Bay esian unfolding or OmniF old. Fit the unfolded distributions with NNLO QCD predictions matched to NNLL resummation (or N 3 LL where av ailable), extracting α s and nonperturbative p ow er corrections simultaneously . Compare the fixed-order p ertur- bation theory (FOPT) and contour-impro ved p erturbation theory (CIPT) prescriptions. Com bine the individual extractions into a single DELPHI determination of α s and compare it with the current world a verage. The theoretical landscap e has improv ed dramatically since the original DELPHI publications: NNLO corrections are now av ailable for all six even t shapes, and resummation has b een pushed to N 3 LL accuracy for thrust. Data: [...] Lund jet plane ( Gith ub ). The agent measures the primary Lund jet plane density using Cam- bridge/Aac hen declustering of hadronic Z decay ev ents from DELPHI, corrected to particle lev el using bin-b y-bin correction factors. The measurement pro vides the first determination of this observ able at the Z p ole in e + e − collisions. Measure the primary Lund jet-plane density in hadronic Z decays using DELPHI data. Recluster jets with the Cam bridge/Aachen algorithm and follow the harder branch at each declustering step, recording the splitting v ariables ( k T , ∆ R ) of the softer emission. P opulate the Lund plane in the (ln(1 / ∆ R ) , ln k T ) representation. This observ able has b een measured by A TLAS and ALICE at LHC energies but has never b een measured at LEP . The clean e + e − initial state (no underlying even t, no pileup, known √ s ) mak es this the definitiv e b enc hmark for the Lund plane: the perturbative region should matc h NLL DGLAP predictions exactly , while the nonperturbative boundary provides a clean measurement of the hadronization transition scale. Data: [...] 16 4.2 Cost and scaling All analyses presen ted in this pap er w ere run using a Claude Max subscription (appro ximately $ 200/mon th at the time of writing), which provides sufficien t throughput for the full pip eline including all review iterations. Eac h end-to-end analysis completes in appro ximately 4-6 of wall-clock time, with the review cycles currently imp osing soft and hard iteration caps: phases t ypically pass review after one to tw o iterations, with a hard cap to preven t runaw ay loops. The current configuration uses Claude Opus for all agent roles, prioritizing reasoning quality ov er cost. In principle, substantial cost sa vings could b e achiev ed b y delegating narro wly scoped roles to ligh ter- w eight models—for example, Claude Sonnet for signal selection dev elopment, bac kground estimation, theory scouting, note writing, and rendering review, and Claude Haiku for the read-only data reconnaissance agent that p erforms fast sample inv entory . Conv ersely , additional review iterations (b ey ond the current one-to- t wo-round default) represent an opportunity to impro ve analysis quality at the cost of additional mo del in vocations and wall-clock time . Systematic exploration of this quality–cost trade-off is left for future work. 4.3 Analysis quality Bey ond the n umerical results, we can assess the quality of the agent-produced analyses along several di- mensions. The agen t-pro duced analyses demonstrate that an LLM-driven system can design and execute particle ph ysics analyses that are structurally sound, metho dologically standard, and honestly do cumen ted. The even t selections are sensible, and correct systematic uncertain ty sources are identified, even when not all can b e fully ev aluated. The analysis strategies align closely with published approaches, with deviations that are well-motiv ated and clearly do cumen ted. In all cases, the agent’s analysis strategy is recognizably similar to the corresp onding published analysis. The agen t consisten tly makes the same high-lev el c hoices that the original exp erimen tal collaborations made, for example, iterativ e Ba yesian unfolding (or bin-b y-bin correction where appropriate) for the QCD measure- men ts, HistF actory likelihoo d for the Higgs search, Breit–Wigner lineshap e fitting for N ν . The selection of observ ables, binning strategies, and correction pro cedures are all standard and well-motiv ated. Additionally , the agent’s treatmen t of systematic uncertainties is one of the strongest asp ects of these analyses. Every note includes a systematic completeness table that explicitly compares the sources considered against those in the corresp onding published analyses (DELPHI, ALEPH, OP AL, CMS, A TLAS, or ALICE). Similarit y to published analyses is expected, as the agen ts explicitly consult existing literature and are ask ed to repeat measuremen ts that hav e b een done b efore. Pursuing wholly nov el analyses is lik ely to b e more challenging for the agen ts, and we leav e this as a question for future w ork. Autonomously repro ducing full analysis w orkflows is, by itself, a very significan t demonstration of their HEP analysis capabilities. The main weaknesses—including limited statistics, approximate theory inputs, and incomplete hadroniza- tion mo del comparisons—are inherent to the protot yp e scop e rather than to errors in the agen t’s physics reasoning. The most significan t gen uine errors (D Y template con tamination, rank-deficient co v ariance, NLO- only α s bias) are the kinds of issues that standard review pro cesses are designed to catc h, and the agent itself flags most of them in its “limitations” discussions. The agent tends to b e conserv ativ e with selection cuts, preferring high purit y o v er high efficiency . This is a reasonable default for a first-pass analysis, but in several cases, it substantially h urts sensitivity . A more exp erienced analyst would likely inv estigate and relax or work around such cuts earlier in the analysis cycle. Where the agent falls short relativ e to an experienced h uman analyst is in the iterativ e refinement lo op: a human w ould lik ely fix the Z → µµ contamination in the Higgs to τ τ analysis before writing 55 pages around it, and would implement ev ent categorization in the same analysis b efore accepting σ µ = 5 . 6, and w ould not pro ceed with a χ 2 test using a condition num b er of 10 19 . The agen t pro duces thorough first drafts that identify their o wn problems but do es not alwa ys close the lo op on fixing them. This pattern of correct diagnosis, deferred treatmen t is the single most characteristic feature of the agent’s working st yle across all analyses, and it suggests that the review-and-iterate cycle is where the most v alue would come from tigh ter h uman-agent collab oration. The review rounds were also in v aluable to correcting mistak es. Eac h independent reviewer agent was task ed with reading one analysis note and pro ducing a critical assessment. Sev eral qualitativ e observ ations can b e made: 17 • Consistency of review er assessments: All six reviewer agents conv erged on the same structural critique — that the analyses are well-designed feasibility demonstrations limited primarily b y protot yp e statistics and missing theory/MC inputs, rather than by fundamen tal metho dological errors. This consistency across indep enden t reviews lends confidence that the assessment is not an artifact of any single reviewer’s biases. • Errors caught by reviewers: The reviewer agen ts identified sev eral concrete issues: the rank- deficien t co v ariance matrix in the EEC analysis, the Z → µµ template con tamination in H → τ τ , the inflated α s from NLO-only theory , and the s tress test failures in jet substructure. These are precisely the kinds of issues that internal review committees catch in real collab orations. • Errors lik ely requiring h uman review: Some subtler issues — suc h as whether the an ti-muon discriminator behavior in CMS Op en Data NanoAOD is a data-format artifact v ersus a gen uine analysis c hoice, or whether the AP A CIC generator provides sufficient hadronization model v ariation compared to Herwig — require domain exp ertise and familiarit y with the sp ecific detector and data formats that go es beyond what the reviewer agen ts can fully assess from the note text alone. A human review er with LEP or CMS exp erience w ould lik ely prob e these points more deeply . • Self-a w areness of limitations: A striking feature of the agen t-pro duced notes is their transparency ab out what is missing. Every analysis includes a “dominant limitations” subsection and a prioritized list of impro vemen ts for the full-statistics analysis. This self-critical p osture reduces the burden on review ers, who can fo cus on issues the agent did not iden tify rather than those it did. 4.4 Implications for analysis w orkflo ws The results presented here suggest that a significant fraction of the technical w ork in a standard HEP analysis can b e automated with curren t AI agen t technology . This does not mean that human physicists are no longer needed - quite the opp osite. The role of the ph ysicist shifts from implementer to arc hitect and critic: defining what should be measured and why , assessing whether the agent’s approac h is physically sensible, and catching the subtle errors that automated review ma y miss. With AI systems capable of implementing complex soft ware framew orks or autonomously running end- to-end analysis pipelines, high-level think ing, analysis design, ideation, and ph ysical reasoning (skills that are sough t in facult y candidates and foundational to career success) b ecome the primary o ccupation across all lev els of the academic hierarch y . F or students, time not sp en t programming can be dedicated to broadening theoretical and statistical foundations, building instrumentation expertise, planning pro jects, k eeping up with the literature, and writing up results. F rom this p erspective, AI co ding agents simply join a long lineage of time-saving inno v ations that un burden scientists from technical drudgery . This shift has the p oten tial to dramatically increase the throughput of an exp erimen tal program. Con- sider the situation facing the LHC exp eriment s to day: thousands of p oten tial measurements that could b e p erformed with existing data, but only a finite num ber of students and p ostdo cs to carry them out. Each analysis can o ccup y a person for years, so if the implemen tation phase can b e compressed from y ears to mon ths, the bottleneck mo v es from coding capacity to physics ideas and h uman review bandwidth—a muc h more desirable constrain t, b ecause it means the limiting factor is in tellectual rather than mechanical. Mea- suremen ts that migh t take a year to complete could p oten tially b e pro duced in da ys, review ed by a suite of sp ecialized AI agen ts before any h uman even lo oks at the result, and iterated rapidly in response to feedbac k. F ollow-up studies, closure c hecks, systematics in vestigations, and many other burdensome tasks could be im- plemen ted within min utes at the lev el of code and bottleneck ed only b y run time on large datasets, as they are no w an ywa y . Studies that are currently skipped b ecause they are “to o tedious” b ecome tractable when the cost of implemen tation drops b y orders of magnitude. Even with LEP data, there are significant prospective impro vemen ts, such as ev ent-lev el analyses of all standard mo del parameters, tunes of the hadronic sho wer for ev ery analysis, and optimized particle reconstruction for core standard mo del parameters. With these to olkits, the burden of analysis is lifted, encouraging all of us to think more broadly ab out the scop e of researc h that can b e conducted. F urthermore, large collab oration reviews, which inv olve 3–4 levels of human review (see Fig. 1 ), can b e exp edited. Agents ha ve a role at both the review and testing stages, pro viding immediate feedback and 18 immediate results from cross-chec ks. The role of the human review er then fo cuses on the tec hnically c hal- lenging elemen ts of the review pro cess, largely paralleling the “approv als” and “collab oration-wide reviews” p erformed in the later stages. While a reduced co ding burden w ould come as a relief to most in the field, tec hnical proficiency in programming must remain a critical and carefully developed skill for all HEP practitioners . If an AI agen t writes a sophisticated analysis pip eline, a physicist must, at minimum, b e able to read and understand the codebase to chec k for correctness. Modern co ding assistan ts are incredibly capable but still routinely mak e fundamental errors, particularly when p erforming the kinds of specialized tasks required in HEP (e.g. complicated fits and statistical hypothesis testing). The reliabilit y of these systems will contin ue to improv e, but the conceptual basis of all softw are—what the co de should do —will remain h uman-driven. 4.5 Rethinking graduate training The implications for graduate education deserve particular attention. As computer technology and deep learning hav e co-evolv ed with HEP’s “big data” era, softw are engineering has, perhaps uninten tionally , b ecome a significan t comp onen t of a graduate student’s training. This is a v aluable tec hnical foundation and prepares students well for futures inside and outside academia 3 , but inevitably distracts from core training as a physicist . W riting code has become synon ymous with HEP graduate studen t work, and tec hnical implemen tation is the largest bottleneck to the realization of an idea. If AI agents can handle the implemen tation, graduate training can b e restructured to emphasize the asp ects of research that require human in telligence: dev eloping physical intuition, learning to ask go od questions, understanding the theoretical con text of measurements, and practicing the kind of critical thinking needed to ev aluate whether an analysis result is trustw orthy . Students w ould still need to understand their analysis code and be able to v alidate an ything an agen t con tributes, but they would not need to write every line of it from scratch. This fo cus on high-level design and ev aluation rather than implementation mirrors ho w facult y and p ostdo cs work, most of whom con tribute most significantly via scien tific judgmen t rather than writing co de. AI agents would simply extend this training to earlier career stages. 4.6 Legacy data and reanalysis One particularly comp elling application is the reanalysis of legacy datasets. Enormous volumes of data from past experiments including LEP , the T ev atron, B-factories, sit in arc hives, potentially con taining information relev an t to current ph ysics questions 4 . Autonomous analysis agents could systematically work through these datasets, producing first-pass results that human physicists can then ev aluate and refine. Our w ork with ALEPH and DELPHI data is a proof of concept for exactly this kind of application. Moreov er, legacy datasets b ecome accessible to reanalysis with mo dern computational to ols—including deep learning techniques that w ere una v ailable when the data were originally collected. Bey ond new analyses, this paradigm also opens the door to automatic reproducibility of published results, where AI agen ts could systematically re-execute the computational pip elines of existing papers, flagging inconsistencies or confirming findings at a scale that is currently impractical due to the resource and p ersonnel constrain ts needed for large-scale v erification 5 . 4.7 Limitations and risks W e do not wish to ov erstate what current systems can do. Several imp ortan t limitations m ust b e ackno wl- edged. • Subtle ph ysics errors : Agen ts can pro duce analyses that are sup erficially correct but contain subtle ph ysics errors—for example, applying a selection criterion that biases the measuremen t in a non-ob vious w ay , or omitting a systematic uncertaint y that an experienced analyst would kno w to include. 3 The v alue of softw are engineering in industry ma y drop in the coming years, depending on the contin ued evolution of AI coding agents. At the time of writing, this mattery is hotly debated. 4 https://ee- alliance.org/ is one such organization pursuing these questions 5 Coding assistants will also prov e helpful for understanding legacy co debases and data formats. 19 • No v elt y : Current agents excel at reproducing established analysis strategies retriev ed from the litera- ture but are less reliable when ask ed to develop gen uinely nov el approac hes. F or analyses that require creativ e new techniques, h uman-driven design remains essential. • Complexit y ceiling : The analyses we hav e demonstrated—Z lineshap e, even t shap es, α S —are rela- tiv ely standard measurements. More complex analyses inv olving multiv ariate techniques, data-driven bac kground estimation in m ultiple con trol regions, or sim ultaneous fits across many channels remain to b e demonstrated. How ev er, Lund-Plane and EEC measuremen ts including unfolding tec hiques ha ve nev er been published, proving that the complexity ceiling is quite high for agent teams. • V erification burden : The need for thorough human verification of agen t-pro duced results cannot b e shortcut. An analysis that lo oks correct but contains a subtle error is arguably more dangerous than no analysis at all, because it carries an unearned sense of confidence. The communit y m ust develop robust practices for v alidating AI-produced analyses. • Instruction follo wing : Agen ts are prone to ignoring instructions when they b ecome buried in a large con text windo w. F or example, an agen t ma y repeatedly use absolute fon t sizes in plots despite explicit instructions to the con trary . • Prompt nondeterminism : Agen t outputs are inheren tly stochastic, making prompt optimization difficult. The same prompt can yield mark edly different analysis strategies or co de quality across runs. • Stale to ol kno wledge : Agen ts tend to rely on APIs and interfaces from their training data, and con- vincing them to consult up-to-date do cumen tation for newer pac k age versions is surprisingly difficult. • P oor dep endency managemen t : Agen ts are inclined to write ad hoc patches or w ork arounds rather than pulling an updated dep endency or adding a light w eight new one, even when the latter would b e the correct solution. • Limited kno wledge of review practices : Internal collab oration review pro cesses are not publicly a v ailable and are unlik ely to app ear in training data. Agents can reason about physics reasonably w ell, but man y analyses hav e c haracteristic failure mo des that exp erienced reviewers would catch immediately . Building in ternal retriev al-augmen ted generation systems ov er review do cumen tation could help address this. • Missing niche metho ds : Agents are unlikely to emplo y analysis tec hniques that are not widespread in the literature, even when they would b e appropriate. This represents an opp ortunit y for domain- sp ecific fine-tuning. • Con text and prompt engineering trade-offs : There is an inherent tension b etw een providing detailed instructions (which improv e sp ecificity but risk context bloat and instruction-follo wing degra- dation) and keeping prompts concise (which preserves con text but may lea ve imp ortan t conv entions unsp ecified). • St ylistic c hoices : While LaT eX has some reasonable defaults, there are many options and pack ages to c ho ose from and there is no uniform style across fields and journals. It it necessary to b e extremely sp ecific when describing and enco ding what the final analysis note should like to ac hieve the desired lo ok. 4.8 F uture W ork Sev eral directions for systematic impro vemen t of the framework remain to b e explored. A natural next step is A/B testing of different sp ecification v arian ts, comparing the effect of more prescriptive metho dology do cumen ts against minimal-guidance prompts on analysis quality , completeness, and error rates. Equally imp ortan t is characterizing the sto c hastic v ariabilit y of agent outputs: running the same prompt and sp ecifi- cation m ultiple times and quan tifying the spread in analysis strategies, n umerical results, and documentation qualit y would establish a baseline for repro ducibilit y and identify which aspects of the pipeline are most sen- sitiv e to prompt nondeterminism. 20 The review system itself offers a rich optimization target; the current iteration caps and reviewer panel comp osition were c hosen heuristically , and a systematic sw eep o ver the maximum n umber of review rounds, review er agen t combinations, and arbiter thresholds could reveal fa vorable operating p oints on the quality– cost frontier. Bey ond Claude Co de, the framework’s architecture is in principle bac kend-agnostic, and b enc hmark- ing against other agentic coding systems such as Op enAI’s Codex, Go ogle’s Jules, or open-weigh t agent framew orks w ould clarify whic h capabilities are sp ecific to the underlying mo del and whic h emerge from the sp ecification and orchestration la y er. Finally , extending the framework to more complex analysis top ologies, including multi-c hannel sim ulta- neous fits, data-driv en background estimation with transfer factors, and analyses requiring custom recon- struction or mac hine learning comp onen ts, w ould map the curren t boundary betw een what agents can handle autonomously and what still demands sustained h uman inv olvemen t. 4.9 T o ward robust AI-assisted analysis The path forward is not to deploy AI agen ts as unsup ervised analysis machines, but to develop them as p o w erful to ols within a framew ork that preserves the rigor and scrutiny that the HEP communit y righ tly demands. This requires in vestmen t in several concrete areas: 1. Benc hmarks for physics analysis. The comm unity curren tly lacks standardized b enchmarks for ev aluating agent p erformance on realistic, end-to-end physics analysis tasks. Existing b enc hmarks suc h as CelloAI [ 3 ] fo cus on co de generation and documentation; what is needed are benchmarks that test the full analysis chain—from ev ent selection design through statistical inference—on datasets with kno wn ground truth. The LHC Olympics [ 15 ] anomaly detection challenge offers a partial template, but b enchmarks targeting standard measuremen ts (cross-sections, coupling extractions, even t shap e v ariables) with realistic systematic uncertain ty structures would be far more informative. 2. Standardized review proto cols for AI-pro duced analyses. Collab orations should dev elop ex- plicit proto cols for reviewing analyses in whic h substan tial portions of the co de and strategy were pro duced b y AI agen ts. These proto cols should specify what additional scrutiny is warran ted—for example, mandatory indep enden t re-implementation of key results, systematic comparison of agen t- c hosen selections against published baselines, or automated regression testing against known-goo d analysis outputs. The multi-agen t review system we describ e in this pap er is a step in this direction, but it complements rather than replaces h uman review. 3. T raining for AI-assisted workflo ws. Ph ysicists need practical training in ho w to effectiv ely sup er- vise and v alidate AI-as sisted work. This includes dev eloping the abilit y to read and critically ev aluate co de they did not write, understanding the failure modes of LLM-based systems (hallucination, in- struction drift, stale API knowledge), and learning to form ulate analysis sp ecifications that are precise enough to guide an agen t while remaining flexible enough to allo w it to exercise judgment. Graduate curricula should ev olve to treat AI literacy as a core competency alongside statistical methods, detector ph ysics, and programming. 4. Institutional adaptation. Collab orations will need to up date their authorship, review, and account- abilit y norms to accommo date AI-assisted w ork. Clear standards for disclosure of AI in volv emen t, attribution of responsibility for AI-produced results, and do cumentation requirements for AI-assisted analyses will b e essen tial for maintaining the trust and rigor that underpin the field’s scientific credi- bilit y . The HEP comm unity has a long tradition of rigorous internal review b efore publication. That tradition should b e extended, not abandoned, as AI to ols b ecome more capable. 5 Conclusion AI agen ts based on large language models, com bined with a relatively small set of structured prompts, can already autonomously execute substan tial p ortions of a standard HEP analysis pip eline. W e hav e 21 demonstrated this concretely b y repro ducing published ALEPH and DELPHI measurements using archiv ed LEP data and CMS measuremen ts using CMS Op en Data, with agen ts that plan their own analysis strategy , retriev e domain knowledge from the literature, execute the full analysis c hain, undergo automated m ulti- agen t review, and pro duce complete written rep orts. W e stress that the to olkit for these analyses is largely comp osed of structured prompts and guidance for the LLM; it is b y no means sophisticated, and we encourage the developmen t of related approac hes given the low barrier to entry . The technology to p erform analysis at the graduate studen t lev el is here and warran ts serious attention from the exp erimental HEP comm unity . W e are not suggesting that AI agen ts should replace ph ysicists. W e are suggesting that they can tak e ov er m uch of the technically demanding, often tedious, implemen tation w ork that currently consumes the ma jorit y of an analyst’s time, freeing ph ysicists to fo cus on what they do b est: developing physical insigh t, asking creative questions, and exercising the exp ert judgment that no AI system can y et replicate. Moreov er, we see this as a path tow ards more sophisticated ph ysics analyses. Data can b e more thoroughly combed, details and features within the data can be explored, and there is significan t potential for new, comprehensive analyses to be built on these ideas. The comm unity should begin experimenting with these tools no w, developing the w orkflo ws, b enc hmarks, and review practices needed to use them resp onsibly . The question is not whether AI agents will b ecome part of the HEP analysis to olkit, but how quickly the comm unity adapts to these pow erful to ols. 6 Co de and Data Av ailabilit y The JF C framew ork specification—including the methodology document, con ven tions directory , agen t profile definitions, and orchestrator prompts—is publicly av ailable at https://github.com/jfc- mit/slopspec and https://github.com/jfc- mit/slop- X (t wo differen t v arian ts). The agent-produced analyses presented in this pap er, including complete analysis notes, all co de, and generated figures, are a v ailable in dedicated rep ositories linked from the main framework repository - https://github.com/jfc- mit . The archiv ed ALEPH, DELPHI, CMS data and Monte Carlo samples used in this w ork are publicly a v ailable through the CERN Op en Data p ortal and are cited in the sp ecific ANs. 7 Ac kno wledgmen ts W e thank Y en-Jie Lee for pro viding us with prepro cessed ALEPH Op en Data and Mon te Carlo for use in agentic analysis, and w e thank the Electron-Positron Alliance 6 for their ongoing work to preserv e and re-analyze archiv al LEP data. W e also thank CERN for providing the op en data and MC from the ALEPH, DELPHI, and CMS experiments that we use in this pap er 7 . Lastly , while preparing this do cumen t, we b ecame a ware of [ 4 ]. W e ha ve appropriately qualified the differences and conclusions within the pap er. This work is supp orted by the National Science F oundation under Co operative Agreement PHY-2019786 (The NSF AI Institute for Artificial In telligence and F undamental In teractions, IAIFI, h ttp://iaifi.org/ ). Computing resources for the SciT reeRAG literature extraction were pro vided b y IAIFI on the F ASRC Cannon cluster supp orted by the F AS Division of Science Researc h Computing Group at Harv ard Univ ersity . AN is supp orted by the NSF-funded A3D3 Institute (NSF-PHY-2117997), and a DOE Early Career aw ard FY2021, “Harnessing the Large Hadron Collider with New Insigh ts in Real-Time Data Processing and Artificial In telligence”. E. A. M. ac knowledges supp ort from the National Science F oundation with Grant No. GRFP2141064. References [1] Thea Klaebo e Aarrestad et al. “Building an AI-nativ e Researc h Ecosystem for Exp erimen tal P article Ph ysics: A Communit y Vision”. In: (F eb. 2026). arXiv: 2602.17582 [hep-ex] . [2] Josh Achiam et al. “Gpt-4 tec hnical report”. In: arXiv pr eprint arXiv:2303.08774 (2023). 6 https://ee-alliance.org/home/ 7 https://opendata.cern.ch/ 22 [3] Mohammad A tif et al. “CelloAI Benc hmarks: T ow ard Repeatable Ev aluation of AI Assistan ts”. In: (Mar. 2026). arXiv: 2603.01051 [hep-ex] . [4] Anthon y Badea, Yi Chen, and Y en-Jie Lee. “Agen tic AI–Ph ysicist Collab oration in Experimental P article Physics: A Proof-of-Concept Measurement with LEP Op en Data”. In: arXiv pr eprint arXiv:2603.05735 (2026). [5] Daniil A Boik o et al. “Autonomous c hemical researc h with large language models”. In: Natur e 624.7992 (2023), pp. 570–578. [6] Andres M Bran et al. “Chemcrow: Augmen ting large-language mo dels with chemistry tools”. In: arXiv pr eprint arXiv:2304.05376 (2023). [7] CERN Op en Data . 2026. url : https://opendata.cern.ch/ . [8] Sascha Diefen bacher et al. “Agents of Discov ery”. In: (Sept. 2025). arXiv: 2509.08535 [hep-ph] . [9] W. Esmail, A. Hammad, and M. No jiri. “CoLLM: AI engineering to olb o x for end-to-end deep learning in collider analyses”. In: (F eb. 2026). arXiv: 2602.06496 [hep-ph] . [10] Eli Gendreau-Distler et al. “Automating High Energy Physics Data Analysis with LLM-Po w ered Agen ts”. In: 39th Annual Confer enc e on Neur al Information Pr o c essing Systems: Includes Machine L e arning and the Physic al Scienc es (ML4PS) . Dec. 2025. arXiv: 2512.07785 [physics.data-an] . [11] Alireza Ghafarollahi and Markus J Buehler. “SciAgents: automating scien tific discov ery through bioin- spired multi-agen t intelligen t graph reasoning”. In: A dvanc e d Materials 37.22 (2025), p. 2413523. [12] Justin Hill and Hong Jo o Ry o o. “GRA CE: an Agen tic AI for P article Physics Exp eriment Design and Sim ulation”. In: Jan. 2026. arXiv: 2602.15039 [hep-ex] . [13] Kelly Hong, An ton T roynik o v, and Jeff Hub er. Context R ot: How Incr e asing Input T okens Imp acts LLM Performanc e . T ec h. rep. Chroma, July 2025. url : https://research.trychroma.com/context- rot . [14] Carlos E Jimenez et al. “Sw e-b enc h: Can language mo dels resolve real-world github issues?” In: arXiv pr eprint arXiv:2310.06770 (2023). [15] Gregor Kasieczk a et al. “The LHC Olympics 2020 a communit y c hallenge for anomaly detection in high energy ph ysics”. In: R ept. Pr o g. Phys. 84.12 (2021), p. 124201. doi : 10.1088/1361- 6633/ac36b9 . arXiv: 2101.08320 [hep-ph] . [16] Chris Lu et al. “The ai scientist: T ow ards fully automated op en-ended scien tific discov ery”. In: arXiv pr eprint arXiv:2408.06292 (2024). [17] James McGreivy et al. “Seeing the F orest Through the T rees: Knowledge Retriev al for Streamlining P article Ph ysics Analysis”. In: (Sept. 2025). arXiv: 2509.06855 [hep-ex] . [18] T ony Menzo et al. “HEPT APOD: Orc hestrating High Energy Ph ysics W orkflows T o wards Autonomous Agency”. In: (Dec. 2025). arXiv: 2512.15867 [hep-ph] . [19] Michael D Sk arlinski et al. “Language agents ac hieve sup erh uman synthesis of scien tific kno wledge”. In: arXiv pr eprint arXiv:2409.13740 (2024). [20] Zhengde Zhang et al. “Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics”. In: (Apr. 2024). arXiv: 2404.08001 [hep-ph] . 23 A Z Boson Lineshap e Measurement with ALEPH Data A.1 In tro duction A.1.1 Ph ysics of the Z resonance at LEP The Z b oson, discov ered at the CERN S p ¯ pS collider in 1983, is a cornerstone of the electrow eak sector of the Standard Model (SM). Its mass M Z , total deca y width Γ Z , and partial widths provide precision tests of the SM at the quantum lo op level, probing virtual con tributions from particles to o heavy to pro duce directly . A t the Large Electron-Positron Collider (LEP), electron-p ositron collisions at cen tre-of-mass energies span- ning √ s ≈ 88–94 GeV pro duce Z b osons whose pro duction cross-section traces a relativistic Breit-Wigner resonance curve, mo dified by initial-state radiation (ISR) and the photon-exchange and γ -Z interference con tributions. The e + e − → hadr ons cross-section near the Z pole is describ ed b y the ISR-conv olv ed Breit-Wigner form ula, where the Born-level hadronic cross-section takes the form σ B W ( s ) = σ 0 had · s Γ 2 Z ( s − M 2 Z ) 2 + s 2 Γ 2 Z / M 2 Z (1) with σ 0 had = (12 π / M 2 Z ) · (Γ ee Γ had / Γ 2 Z ) b eing the peak hadronic cross-section. The energy dep endence of this cross-section — its p osition, width, and height — enco des the three primary lineshap e observ ables. A.1.2 The lineshape observ ables This analysis extracts three primary parameters from the hadronic cross-section as a function of √ s : • M Z , the Z b oson mass. This is a fundamen tal SM parameter that sets the electrow eak scale. The p osition of the resonance p eak determines M Z directly . • Γ Z , the total Z width. This is sensitive to all deca y channels, including the in visible width from neutrino pairs: Γ Z = Γ had + 3Γ l + N ν Γ ν , where Γ l is the leptonic partial width, Γ ν the neutrino partial width, and N ν the num b er of ligh t neutrino sp ecies. The width of the resonance curve constrains Γ Z . • σ 0 had , the p eak hadronic cross-section. This combines Γ ee , Γ had , and Γ Z in to a single observ able, pro viding a constraint on the partial widths independent of the total width measurement. F rom these three parameters, secondary quantities are derived using SM relations: the hadronic partial width Γ had , the strong coupling constant α s ( M Z ) via the QCD correction to Γ had , and the num ber of light neutrino sp ecies N ν from the invisible width. A.1.3 Prior measuremen ts The LEP exp erimen ts (ALEPH, DELPHI, L3, OP AL) measured the Z lineshap e with high precision during 1989–1995. The LEP com bination ([ ref-lep˙combination ]) giv es M Z = 91 . 1875 ± 0 . 0021 GeV, Γ Z = 2 . 4952 ± 0 . 0023 GeV, and σ 0 had = 41 . 540 ± 0 . 037 nb. The ALEPH-sp ecific measurements from the thesis of Zhong ([ ref-aleph˙zhong ]), using 1989–1993 data, yield M Z = 91 . 1916 ± 0 . 0039 GeV, Γ Z = 2 . 4941 ± 0 . 0058 GeV, and σ 0 had = 41 . 63 ± 0 . 10 n b. An early ALEPH measuremen t using the high-precision silicon-tungsten luminometer (SICAL) ([ ref-aleph˙sical ]) rep orted σ 0 had = 41 . 56 ± 0 . 09 ± 0 . 15 n b from 1992 data. The world av erage from the Particle Data Group (PDG 2024) ([ ref-p dg˙2024 ]) giv es M Z = 91 . 1876 ± 0 . 0021 GeV and Γ Z = 2 . 4955 ± 0 . 0023 GeV, dominated b y the LEP combination. The n umber of light neutrino sp ecies is determined to b e N ν = 2 . 9963 ± 0 . 0074, consisten t with three generations. A.1.4 Ov erview of this analysis This analysis measures the Z b oson lineshape using appro ximately 3.05 million hadronic ev ents collected b y the ALEPH detector at LEP during the 1992–1995 running p eriods. Data at fiv e distinct centre-of- mass energy groups spanning 89.4–93.0 GeV are used to extract M Z , Γ Z , and σ 0 had from a χ 2 fit of the ISR-con volv ed Breit-Wigner cross-section to the measured hadronic cross-sections. 24 The analysis is p erformed using pre-existing ALEPH ntuples containing reconstructed even t-lev el quan- tities and pre-computed selection flags. The k ey limitations of this dataset are: • No luminosit y information is stored in the data fi les; luminosities are reco v ered from published ALEPH tables (1993 data) and from theory-anchored calculations (other years). • Only hadronic ev ents are av ailable; leptonic channels were remov ed at the n tuple pro duction level by a minimum c harged-hadron multiplicit y requirement. • Mon te Carlo simulation is av ailable only at √ s = 91 . 2 GeV; there is no off-p eak MC. • No truth-lev el particle iden tification or pro cess codes are av ailable in the MC. Despite these limitations, the analysis demonstrates that precision electrow eak measuremen ts can b e extracted from archiv al LEP data with a complete treatment of systematic uncertainties. A.1.5 Organization of this note This note is organized as follows. Section 2 describ es the data and MC samples. Section 3 details the ev ent selection and its v alidation. Section 4 presents the corrections applied: ISR con v olution, efficiency , bac kground subtraction, and luminosit y determination. Section 5 documents the systematic uncertaint y ev aluation. Section 6 presen ts the cross-c hecks performed. Section 7 describes the fitting metho d. Section 8 gives the results, including deriv ed quan tities. Section 9 compares to published measurements. Sections 10 and 11 provide conclusions and future directions. The appendices con tain supplemen tary tables and matrices. A.2 Data and MC Samples A.2.1 The ALEPH detector at LEP The ALEPH (Apparatus for LEP PHysics) detector operated at the LEP e + e − collider at CERN from 1989 to 2000. The detector featured a silicon vertex detector, an inner trac king c hamber, a large time pro jection cham b er (TPC) for charged-particle tracking, electromagnetic and hadronic calorimeters, and a m uon sp ectrometer, all immersed in a 1.5 T solenoidal magnetic field. The com bination of tracking and calorimetry enabled the ALEPH energy-flow algorithm, whic h reconstructed charged and neutral particles with excellent resolution. F or a complete description of the detector, see ([ ref-aleph˙zhong ]) and references therein. The data used in this analysis consist of pre-pro cessed R OOT n tuples containing 151 branches p er ev ent, including ev en t-level kinematic quantities (energy , thrust, sphericity , missing momen tum), particle m ulti- plicities, and pre-computed hadronic selection flags. The n tuples were pro duced from the full ALEPH recon- struction with qualit y cuts applied at the pro duction lev el, including a minim um charged-hadron m ultiplicity requiremen t ( n ch ≥ 4) that effectively remo ves pure di-lepton even ts. A.2.2 Data samples Six data files co ver the full 1992–1995 LEP-1 running perio d. The even t coun ts, cen tre-of-mass energy ranges, and primary roles are summarized in T able 4 . T able 4: Data sample inv en tory . The 1993 and 1995 datasets con- tain off-peak energy scan p oin ts essential for constraining M Z and Γ Z . The 1992 and 1994 datasets contribute primarily to the p eak cross-section measurement. P erio d √ s range [GeV] Ev ents Role 1992 91.27–91.28 551,474 P eak 1993 89.38–93.04 538,601 Energy scan 1994 P1 91.14–91.23 433,947 Peak 1994 P2 91.17–91.22 447,844 Peak 25 P erio d √ s range [GeV] Ev ents Role 1994 P3 91.14–91.43 483,649 Peak + off-p eak 1995 89.42–92.98 595,095 Energy scan T otal 89.38–93.04 3,050,610 The data contain p er-even t scalar branches (centre-of-mass energy , selection flags, ev ent shap e v ariables, m ultiplicities) and p er-particle jagged arra ys (four-momenta, charge, particle-type flags). The p er-particle weight and artificAcceptEffCorrection branc hes are energy-flow reconstruction w eigh ts (not luminosit y- related quantities), with v alues clustered around unity . The pre-computed selection flags are: • passesAll : composite hadronic selection (AND-like, ˜94.7% efficient) • passesNTrkMin : minim um charged trac k multiplicit y (100%, pre-cut) • passesTotalChgEnergyMin : minim um total charged energy (100%, pre-c ut) • passesSTheta : thrust axis p olar angle • passesMissP : missing momen tum v eto • passesISR : ISR photon veto • passesWW : WW bac kground v eto • passesNeuNch : neutral/c harged energy balance A.2.3 Mon te Carlo sim ulation A single MC sample of 771,597 ev en ts at √ s = 91 . 2 GeV pro vides the reference selection effi- ciency at the Z peak. The MC was pro duced with the standard ALEPH simulation chain (40 files, LEP1MC1994 recons aftercut-001.root through -040.root ). All ev ents hav e process = -1 and pid = -999 , indicating that no truth-level process codes or particle iden tification are a v ailable. This is a significan t limitation for background estimation and systematic ev aluation, addressed through Do wnscoping Decisions D4 and D5 in the analysis strategy . No off-peak MC (at √ s  = 91 . 2 GeV) is a v ailable (Do wnscoping Decision D3). The energy dep endence of the selection efficiency is constrained from data as describ ed in sec. A.3.6 . A.2.4 Energy in v entory The 3.05 million ev ents span 77 distinct cen tre-of-mass energy v alues (rounded to 0.01 GeV precision), whic h are group ed in to five scan regions for the lineshape fit. The grouping and ev ent statistics are giv en in T able 5 . T able 5: Energy group inv en tory for the lineshap e fit. The luminosit y-weigh ted mean energy of each group is used as the ef- fectiv e √ s in the fit. The “abov e p eak” group has limited statistics (1995 only) and con tributes negligibly to the fit. Group ⟨ √ s ⟩ [GeV] Energy range [GeV] Ev ents (total) Ev ents (selected) Y ears p eak-2 89.434 89.38–89.46 130,530 123,613 1993, 1995 p eak (lo w) 91.196 91.14–91.23 1,598,080 1,513,765 1993, 1994, 1995 p eak (high) 91.285 91.27–91.43 1,117,949 1,059,484 1992, 1993, 1994 P3, 1995 ab o v e peak 91.693 91.63–91.78 2,516 2,380 1995 only p eak+2 92.991 92.95–93.04 201,535 190,301 1993, 1995 The off-p eak scan data from 1993 and 1995 pro vide the essential leverage for constraining M Z and Γ Z : the p eak-2 group ( ∼ 89 . 4 GeV) with 123,613 selected ev ents and the peak+2 group ( ∼ 93 . 0 GeV) with 190,301 selected even ts sample the resonance flanks where the cross-section c hanges rapidly with energy . 26 The p eak region (91 . 0 < √ s < 91 . 5 GeV) con tains approximately 2.7 million even ts split into tw o sub- groups (peak lo w and peak high) due to the spread of operating energies. The “ab o ve peak” group (91.6–91.8 GeV, 2,516 even ts from 1995 only) provides marginal additional constraint. A.2.5 Data completeness The ntuple even t counts represent approximately 84% of the total ev ents exp ected from the published ALEPH luminosities. The completeness factor, determined from the four 1993 scan points where independent luminosities are av ailable, is 0 . 842 ± 0 . 024. This reduction lik ely arises from qualit y cuts applied at the n tuple pro duction lev el and possible run-subset selection. The completeness factor is consisten t across the four 1993 scan points (range: 0.816–0.881), with the 91.290 GeV point sho wing a modestly higher completeness (0.881) compared to the other three (0.816–0.835). All luminosities are corrected for this factor. A.3 Ev en t Selection A.3.1 The passesAll comp osite selection The primary hadronic ev ent selection uses the pre-computed passesAll flag, which implemen ts the standard ALEPH hadronic selection v alidated against published criteria ([ ref-aleph˙zhong ]; [ ref-aleph˙sical ]). The comp osite selection consists of seven individual cuts applied sequentially . Each cut targets a sp ecific physics bac kground or detector acceptance effect. passesNT rkMin — minim um charged track m ultiplicity This cut requires at least 4 c harged trac ks in the even t, rejecting low-m ultiplicit y backgrounds such as t wo-photon ev ents, b eam-gas interactions, and cosmic rays. Hadronic Z deca ys t ypically pro duce ⟨ n ch ⟩ ≈ 20 charged particles, so this cut has essen tially no signal loss. In the n tuples used for this analysis, this cut was already applied at the production level, resulting in 100.00% efficiency for all data and MC ev en ts. The n tuple-level application of this cut also remo ves pure di-lepton even ts (2-track top ology), precluding leptonic cross-section measuremen ts. passesT otalChgEnergyMin — minimum total charged energy This cut requires a minim um total energy dep osited by c harged particles, rejecting b eam-gas and t w o-photon ev en ts with lo w visible energy . Lik e passesNT rkMin, this cut w as applied at the n tuple pro duction level and is 100.00% efficien t for all ev ents in the analysis. passesSTheta — thrust axis polar angle This cut ensures that the even t is w ell-contained within the detector acceptance b y requiring the thrust axis to b e at least ∼ 26 degrees from the b eam axis. Even ts with the thrust axis close to the b eam direction ha v e particles escaping into the beam pipe, leading to po orly measured ev ent prop erties. The efficiency of this cut is appro ximately 97.5%, represen ting the largest single source of signal loss after the pre-cuts. The thrust axis direction is computed from all reconstructed particles using the standard ALEPH energy-flow algorithm. passesMissP — missing momentum veto This cut v eto es even ts with large missing momen tum, target- ing beam-gas even ts (whic h pro duce asymmetric energy dep osits), even ts with neutrinos from heavy-fla vour deca ys, and even ts with particles escaping the detector acceptance. The missing momen tum is computed as the magnitude of the vector sum of all reconstructed particle momenta. The efficiency is approximately 97.0%, the second-largest source of signal loss. passesISR and passesWW — ISR photon and WW background vetoes These cuts reject even ts consisten t with hard initial-state radiation (where a high-energy photon reduces the effective collision energy) and WW-like topologies. A t LEP-1 energies ( √ s ≈ 91 GeV), WW production is kinematically forbidden (threshold ∼ 161 GeV), but these v eto es also reject rare radiative even ts and detector artifacts. The efficien- cies of b oth cuts are appro ximately 99.0%, and they are nearly iden tical in their even t-by-ev en t decisions, suggesting shared or o verlapping cut logic. 27 89 90 91 92 93 p s [ G e V ] 1 0 2 1 0 3 1 0 4 1 0 5 Selected events p s = 8 8 - - 9 4 G e V ALEPH 1992 1993 1994P1 1994P2 1994P3 1995 Figure 3: Energy scan structure showing the num b er of selected even ts at eac h centre-of-mass energy p oin t, colored b y data-taking y ear. The fiv e energy groups used in the lineshap e fit are indicated b y the dashed v ertical bands. The dominan t contributions at the off-p eak p oints come from the 1993 and 1995 energy scans, while the p eak region receiv es even ts from all four y ears. The “abov e p eak” group near 91.7 GeV con tains only 2,380 even ts from 1995. 28 89 90 91 92 93 94 p s [ G e V ] 1 0 5 1 0 6 Events 1992 p s = 8 8 - - 9 4 G e V ALEPH 89 90 91 92 93 94 p s [ G e V ] 1 0 2 1 0 3 1 0 4 1 0 5 Events 1993 89 90 91 92 93 94 p s [ G e V ] 1 0 5 Events 1994P1 89 90 91 92 93 94 p s [ G e V ] 1 . 6 × 1 0 5 1 . 8 × 1 0 5 2 × 1 0 5 2 . 2 × 1 0 5 2 . 4 × 1 0 5 2 . 6 × 1 0 5 2 . 8 × 1 0 5 Events 1994P2 89 90 91 92 93 94 p s [ G e V ] 1 0 3 1 0 4 1 0 5 Events 1994P3 89 90 91 92 93 94 p s [ G e V ] 1 0 2 1 0 3 1 0 4 1 0 5 Events 1995 Figure 4: Energy distributions for eac h data-taking y ear. The 1992 and 1994 (P1, P2, P3) datasets cluster tigh tly around the p eak energy , while 1993 and 1995 show the characteristic LEP energy scan pattern with measuremen ts at the p eak and approximately ± 2 GeV off-p eak. The 1995 dataset additionally includes a small sample near 91.7 GeV. The distinct energy cov erage of differen t years motiv ates the t wo-tier luminosity strategy describ ed in sec. A.4.4 . 29 passesNeuNc h — neutral/charged energy balance This cut requires a balanced ratio of neutral to c harged energy in the ev ent, rejecting even ts with anomalous calorimeter dep osits (e.g., b eam-w all in terac- tions producing sho wers of neutral particles) or even ts where the c harged-particle trac king is incomplete. The efficiency is approximately 99.5%. The largest data-MC discrepancy among all sub-flags occurs in this cut: 99.46% in data versus 99.75% in MC, a 0.29% difference that may indicate slight mismo deling of the neutral energy resp onse. Comp osite passesAll flag The passesAll flag represents the comp osite hadronic selection with an ov erall efficiency of 94.74% in data and 94.71% in MC at the Z peak. The flag is not strictly the logical AND of all sev en sub-flags: 0.23% of MC ev en ts hav e passesAll = T rue but fail the cum ulativ e pro duct of all individual flags. This minor discrepancy may reflect sligh tly different cut logic (e.g., OR conditions for some sub-selections) and has negligible impact on the analysis. W e use passesAll directly as it represents the calibrated ALEPH selection. A.3.2 Cutflo w table T able 6 presen ts the selection efficiency for each data-taking perio d, demonstrating the remark able stability of the detector and selection across the four years of data-taking. T able 6: Cutflow summary for all data-taking p eriods. The pass- esAll selection efficiency is stable across y ears with a spread of only 0.06% (94.70–94.75%), confirming detector stability o ver the 1992–1995 running p erio d. Dataset T otal ev ents Selected (passesAll) Efficiency [%] 1992 551,474 522,526 94.75 1993 538,601 510,056 94.70 1994 P1 433,947 411,001 94.71 1994 P2 447,844 424,139 94.71 1994 P3 483,649 458,027 94.70 1995 595,095 563,794 94.74 T otal 3,050,610 2,889,543 94.72 A.3.3 P er-v ariable distributions The follo wing figures sho w the distributions of the k ey kinematic v ariables used in the hadronic selection, comparing data and MC at the Z p eak. The excellent data-MC agreement v alidates the use of MC-deriv ed efficiencies for the cross-section measurement. A.3.4 Cutflo w efficiency A.3.5 Indep enden t selection v alidation T o v alidate the passesAll flag, an independent cut-based selection was implemented using particle-level branc hes computed directly from the per-particle four-momen ta. The independent selection applies three basic hadronic cuts: 1. n ch ≥ 5: c harged hadron multiplicit y , stricter than the ntuple-lev el n ch ≥ 4 pre-cut. 2. E v is > 50% of √ s : total visible energy from particle momenta, rejecting tw o-photon and b eam-gas ev ents. 3. | cos θ thrust | < 0 . 9: thrust axis p olar angle, matching the passesSTheta requirement. The results at the Z p eak are: 30 0 25000 50000 75000 100000 125000 150000 175000 Events p s = 8 8 - - 9 4 G e V ALEPH MC (normalized to data) Data 0 10 20 30 40 50 n c h ( c h a r g e d h a d r o n s ) 0.9 1.0 1.1 Data / MC MC stat. unc. Figure 5: Distribution of c harged hadron multiplicit y ( n ch ) after the hadronic even t selection. Data (black p oin ts with statistical error bars) are compared to MC simulation (blue filled histogram) normalized to the same n umber of ev en ts. The mean m ultiplicit y is appro ximately 20, characteristic of hadronic Z deca ys. The agreemen t b et ween data and MC is excellent across the full range, with deviations b elo w 2%. The lo w-multiplicit y tail ( n ch < 10) is sensitive to tw o-photon and τ + τ − bac kgrounds. 31 0 100000 200000 300000 400000 Events p s = 8 8 - - 9 4 G e V ALEPH MC (normalized to data) Data 0.5 0.6 0.7 0.8 0.9 1.0 Thrust 0.9 1.0 1.1 Data / MC MC stat. unc. Figure 6: Distribution of thrust after the hadronic even t selection. Data (black p oin ts) are compared to MC sim ulation (blue histogram). Thrust p eaks near 0.92 for tw o-jet-lik e hadronic ev ents, with a long tail tow ard the isotropic limit of 0.5 for multi-jet even ts. The passesSTheta cut on the thrust axis p olar angle remov es ev ents with | cos θ thrust | > 0 . 9, affecting even ts at all thrust v alues equally . Data and MC agree within 2% across the full range. 32 0 100000 200000 300000 400000 500000 Events p s = 8 8 - - 9 4 G e V ALEPH MC (normalized to data) Data 0.0 0.1 0.2 0.3 0.4 0.5 0.6 Sphericity 0.9 1.0 1.1 Data / MC MC stat. unc. Figure 7: Distribution of sphericit y after the hadronic ev ent s election. Data (blac k p oin ts) are compared to MC (blue histogram). Sphericity measures the isotrop y of the even t: p encil-lik e t wo-jet ev ents cluster near zero while isotropic m ulti-jet ev ents extend tow ard unity . The steeply falling distribution reflects the dominance of t w o-jet top ologies in hadronic Z decays. The data-MC agreemen t is goo d, with the MC sligh tly o verpredicting ev ents at very lo w sphericity . 33 0 20000 40000 60000 80000 100000 120000 140000 Events p s = 8 8 - - 9 4 G e V ALEPH MC (normalized to data) Data 60 70 80 90 100 110 120 E v i s / p s [ % ] 0.9 1.0 1.1 Data / MC MC stat. unc. Figure 8: Distribution of visible energy ( E v is ) after the hadronic ev ent selection. Data (black points) are compared to MC (blue histogram). The visible energy p eaks near the cen tre-of-mass energy ( √ s ≈ 91 . 2 GeV) with a tail to low er v alues from even ts with neutrinos (hea vy-flav our deca ys) or particles escaping the detector acceptance. The passesT otalChgEnergyMin cut was applied at the ntuple pro duction level. The distribution demonstrates excellent calorimetric performance of the ALEPH detector, with data and MC agreeing within 2%. 34 0 50000 100000 150000 200000 250000 Events p s = 8 8 - - 9 4 G e V ALEPH MC (normalized to data) Data 0 5 10 15 20 25 30 35 40 p m i s s [ G e V ] 0.9 1.0 1.1 Data / MC MC stat. unc. Figure 9: Distribution of missing momen tum ( p miss ) after the hadronic ev ent selection. Data (black p oin ts) are compared to MC (blue histogram). The missing momen tum is computed as the magnitude of the vector sum of all reconstructed particle momenta. Hadronic even ts cluster at lo w missing momentum ( < 20 GeV), while ev ents with neutrinos or particles lost in the b eam pip e sho w a tail to higher v alues. The passesMissP cut v eto es even ts with excessiv e missing momentum, remo ving approximately 3% of ev ents. No ev ents with p miss > 40 GeV survive the selection, confirming the effectiveness of the veto. 35 0 20000 40000 60000 80000 100000 120000 140000 Events p s = 8 8 - - 9 4 G e V ALEPH MC (normalized to data) Data 0 10 20 30 40 50 60 70 80 n p a r t i c l e ( a l l p a r t i c l e s ) 0.9 1.0 1.1 Data / MC MC stat. unc. Figure 10: Distribution of total particle m ultiplicity ( n particl e , including neutrals) after the hadronic even t selection. Data (blac k p oin ts) are compared to MC (blue histogram). The mean m ultiplicity is appro ximately 37, reflecting b oth c harged trac ks and neutral calorimeter clusters reconstructed b y the ALEPH energy-flow algorithm. The distribution is broader than the c harged-only multiplicit y due to the additional fluctuations in neutral particle reconstruction. Go od data-MC agreement is observed, with deviations at the sub-p ercen t lev el. 36 NTrkMin TotalChgEnergyMin STheta MissP ISR WW NeuNch All Selection cut 90 92 94 96 98 100 Cumulative efficiency [%] p s = 8 8 - - 9 4 G e V ALEPH 1992 1993 1994P1 1994P2 1994P3 1995 Figure 11: Cum ulativ e selection efficiency as a function of the selection cut applied, sho wn for b oth data and MC at the Z peak. The pre-cuts (passesNT rkMin, passesT otalChgEnergyMin) are 100% efficien t b ecause they w ere applied at the ntuple pro duction lev el. The dominan t inefficiencies arise from passesSTheta (thr ust axis angular cut, ∼ 2.5% loss) and passesMissP (missing momentum veto, ∼ 2.7% loss). The final passesAll efficiency is 94.74% in data and 94.71% in MC, with a data-MC difference of only 0.036%. 37 Selection Data efficiency [%] MC efficiency [%] passesAll 94.74 94.74 Indep enden t 99.13 99.16 Both 94.60 94.61 The indep enden t selection is looser than passesAll because it omits the tighter qualit y cuts (passesMissP , passesISR, passesWW, passesNeuNch). Critically , 99.85% of even ts passing passesAll also pass the inde- p enden t selection (only 3,815 even ts out of 2,573,249 pass passesAll but fail the independent cuts). This confirms that passesAll implemen ts a strict sup erset of the basic hadronic requirements. The data-MC agreemen t on the efficiency difference b et ween the tw o selections is 0.03%, providing an upp er b ound on the selection modeling systematic. A conserv ativ e v alue of 0.05% is assigned (see sec. A.5.2 ). A.3.6 Selection efficiency vs energy The selection efficiency as a function of √ s is measured directly from data using the passesAll flag. A linear parameterization is fit to the data: ε ( √ s ) = ε 0  1 + α · ( √ s − 91 . 2 GeV )  (2) The fit yields ε 0 = 94 . 729 ± 0 . 013% and α = − 0 . 096 ± 0 . 023%/GeV, with χ 2 /nd f = 96 . 0 / 75 = 1 . 28 (p- v alue ≈ 0 . 05). The negative slop e indicates that the efficiency slightly increases at lo wer energies, opp osite to what one migh t naively exp ect from energy-dep enden t acceptance effects. The total efficiency v ariation across the 89–93 GeV scan range is less than 0.3%, confirming that the ALEPH hadronic selection is robust against energy v ariation. The efficiency at the five energy groups used in the fit is: √ s [GeV] ε [%] Uncertaint y [%] 89.43 94.890 0.041 91.20 94.729 0.013 91.28 94.722 0.013 91.69 94.685 0.017 92.99 94.568 0.042 The marginally high χ 2 /ndf (1.28, p-v alue ≈ 0 . 05) likely reflects year-to-y ear and perio d-to-p eriod de- tector v ariations absorbed in to a single linear mo del. This scatter is included in the fit uncertain ty on α and propagated to the systematic budget (sec. A.5.5 ). A.3.7 Ov erall selection efficiency The MC-derived selection efficiency at √ s = 91 . 2 GeV is: ε sel = 94 . 739 ± 0 . 025% (3) computed from all 771,597 MC ev ents. The uncertain ty is the binomial standard error. The MC efficiency is v alidated with a 50/50 split (fixed seed 42): deriv ation half ε = 94 . 777 ± 0 . 036%, v alidation half ε = 94 . 702 ± 0 . 036%, difference = 0 . 074 ± 0 . 051% (1 . 5 σ ), consistent with statistical fluctuations. A.4 Corrections and Efficiencies A.4.1 ISR con v olution The second-order radiator function Initial-state radiation (ISR) is the dominant radiative correction to e + e − cross-sections near the Z p ole. Before the e + e − annihilation, one or both initial-state particles 38 89 90 91 92 93 p s [ G e V ] 93.5 94.0 94.5 95.0 95.5 96.0 Selection efficiency [%] p s = 8 8 - - 9 4 G e V ALEPH L i n e a r f i t : 0 = 9 4 . 7 3 % , = - 0 . 0 9 5 % / G e V MC efficiency: 94.74% Data (passesAll) Figure 12: Selection efficiency (passesAll) as a function of cen tre-of-mass energy , with a linear fit ov erlaid. Eac h point represents the measured efficiency at a distinct √ s v alue from data. The fitted slop e of α = − 0 . 096 ± 0 . 023%/GeV indicates a v ery w eak energy dep endence. The solid line shows the b est-fit linear mo del, with the shaded band represen ting the ± 1 σ fit uncertaint y . The efficiency v aries b y less than 0.3% across the full 89–93 GeV scan range, confirming the robustness of the hadronic selection for lineshap e measuremen ts. 39 can radiate photons, reducing the effective centre-of-mass energy a v ailable for Z pro duction. The observ ed cross-section is related to the Born-level cross-section through a con volution: σ had ( s ) = Z 1 − s min /s 0 H ( x, s ) · σ B W ( s (1 − x )) dx (4) where x is the fractional energy lost to ISR photons, s min is the minim um reduced s (t ypically set to (2 m π ) 2 ), and H ( x, s ) is the radiator function encoding the probability of radiating a fraction x of the b eam energy . Structure of the radiator The O ( α 2 ) QED radiator with soft-photon exp onen tiation is used, follo wing the standard LEP prescription ([ ref-lep˙lineshap e˙prelim ]): H ( x, s ) = H sof t ( x, s ) + H hard ( x, s ) + H α 2 ( x, s ) (5) The individual terms are: Soft term (exp onen tiated): H sof t ( x, s ) = β x β − 1  1 + 3 β 4  (6) This term captures the infrared-divergen t soft-photon emission, regulated by the exponent β − 1 whic h resums the leading logarithms to all orders. Hard collinear term ( O ( α )): H hard ( x, s ) = − β  1 − x 2  (7) This term accounts for hard collinear photon emission at first order. O ( α 2 ) additiv e piece: H α 2 ( x, s ) = β 2 8  4(2 − x ) ln x − 1 + 3(1 − x ) 2 x ln(1 − x ) − 6 + x  +  α π  2  π 2 3 − 1 2  x β − 1 (8) This term includes the second-order corrections from double photon emission and virtual+real interfer- ence. Definition of the QED parameter b eta The QED logarithmic parameter β is defined as: β = 2 α π  ln s m 2 e − 1  (9) A t the Z p eak ( √ s = 91 . 2 GeV), β = 0 . 1075. This parameter con trols the strength of ISR: the x β − 1 sin- gularit y at x → 0 pro duces a large enhancemen t of soft-photon emission, while the logarithmic enhancement ln( s/m 2 e ) ≈ 24 . 2 reflects the collinear photon radiation pattern. Numerical integration The ISR conv olution in tegral is ev aluated n umerically using Gauss-Legendre quadrature with a logarithmic v ariable substitution to handle the x β − 1 singularit y near x = 0. The sub- stitution x = e − t transforms the integrand to a smo oth function of t , and the in tegration is split in to a near-singularit y region ( x < 0 . 01) and a regular region ( x > 0 . 01), eac h ev aluated with 96–256 quadrature no des. F or x < 10 − 30 , the contribution is computed analytically . The fast Gauss-Legendre implementation ac hieves 0.00001% agreement with the adaptiv e quadrature metho d (scipy .integrate.quad) while pro viding a 24 × sp eedup (0.84 ms p er ev aluation vs. 20 ms), enabling the 100 toy closure tests describ ed in sec. A.6.3 . 40 ISR reduction at the peak The ISR con volution reduces the peak cross-section by approximately 27% relativ e to the Born-level v alue. This is consistent with the exp ected 25–30% reduction from the full O ( α 2 ) calculation ([ ref-lep˙lineshape˙prelim ]). The ISR correction is the dominant radiativ e effect and is re- sp onsible for the asymmetric distortion of the lineshap e (the peak shifts to sligh tly low er energy and the lo w-energy tail is enhanced). A.4.2 Selection efficiency The selection efficiency correction is applied as a m ultiplicative factor in the cross-section formula (eq. 13 ). The MC-derived efficiency at 91.2 GeV ( ε 0 = 94 . 739%) is extrap olated to off-p eak energies using the data- deriv ed linear energy dep endence (eq. 2 ) with slop e α = − 0 . 096%/GeV. F or each energy group i at mean energy √ s i , the efficiency is: ε i = ε 0 ·  1 + α · ( √ s i − 91 . 2 GeV )  (10) The efficiency v alues at the five energy groups range from 94.568% at 92.99 GeV to 94.890% at 89.43 GeV. The total v ariation of 0.32% across the scan range is small compared to the 0.2% efficiency systematic, confirming that the energy dep endence has a minor impact on the cross-section measurements. A.4.3 Bac kground subtraction Bac kground sources Three categories of bac kground contaminate the hadronic ev ent sample after the passesAll selection: Tw o-photon even ts ( γ γ → hadr ons ) are the dominan t background. These even ts arise from the scattering of virtual photons radiated b y the beam particles, pro ducing lo w-multiplicit y , low-energy hadronic systems. The tw o-photon cross-section is approximately constan t across the Z scan range (scaling as ∼ ln 2 ( s ) /s for real photons, effectively flat ov er 88–94 GeV), while the Z signal cross-section v aries b y a factor of ∼ 3. At the Z peak, the tw o-photon fraction is estimated at 0.30hadronic even ts. The estimate is based on the published ALEPH background fractions from [@aleph sical] and the lo w-m ultiplicity sideband ( n ch ≤ 7), where 0.85% of p eak-region data even ts reside. The MC τ -like fraction in this sideband (even ts with n ch of 4–6 and thrust > 0 . 98) of 0.21% provides an upp er b ound on the tw o-photon and τ contamination. Beam-gas and beam- w all ev en ts arise from in teractions of b eam particles with residual gas molecules or the beam pip e. These ev ents t ypically produce asymmetric energy deposits with large missing momen tum. The passesMissP cut is sp ecifically designed to reject these even ts, and no ev en ts with p miss > 40 GeV survive the selection. The residual beam-gas fraction is estimated at 0.10% at the p eak, based on published ALEPH v alues ([ ref-aleph˙sical ]) and the v ertex distribution. τ + τ − ev ents contribute when both τ leptons deca y hadronically , pro ducing multi-prong final states that mimic hadronic Z decays. The τ + τ − cross-section follows the Z lineshap e, so its fractional contribution is appro ximately constan t across energy points. The residual fraction after the passesAll selection is estimated at 0.10% from MC. Energy dep endence of backgrounds The tw o-photon and b eam-gas backgrounds hav e approximately energy-indep enden t c ross-sections, so their fractional contribution to the selected sample increases at off- p eak energies where the Z signal cross-section is lo wer. The background fraction at each energy group is mo deled as: f bg ( √ s ) = f τ + ( f 2 γ + f beam -gas ) × σ peak σ had ( √ s ) (11) where f τ = 0 . 10% is constan t and f 2 γ + f beam -gas = 0 . 40% scales with the inv erse signal cross-section. The resulting background fractions are: 41 T able 7: Bac kground fractions at each energy group. The tw o- photon and beam-gas comp onen ts scale in versely with the Z signal cross-section, reac hing 1.3% at the low est energy p oint. The τ + τ − fraction is approximately constan t at 0.10%. Group ⟨ √ s ⟩ [GeV] f bg [%] f 2 γ [%] f τ [%] f beam -gas [%] p eak-2 89.43 1.32 0.92 0.10 0.30 p eak (lo w) 91.20 0.51 0.30 0.10 0.10 p eak (high) 91.28 0.51 0.31 0.10 0.10 ab o v e peak 91.69 0.65 0.42 0.10 0.14 p eak+2 92.99 0.97 0.65 0.10 0.22 A systematic uncertaint y of ± 50% is assigned to each bac kground component, follo wing standard ALEPH practice ([ ref-aleph˙sical ]). A.4.4 Luminosit y determination Tw o-tier strategy The data files contain no luminosit y information: neither p er-run in tegrated luminosi- ties nor Bhabha scattering ev ent counts are stored. The w eight and artificAcceptEffCorrection branc hes are p er-particle energy-flo w w eights, not luminosity-related quantities. A hybrid t wo-tier luminosity strategy is emplo yed. Tier 1: Published luminosities (1993 scan). The integrated luminosities at the four 1993 scan p oin ts are taken directly from ([ ref-aleph˙zhong ]), T able 3.5: √ s [GeV] L [n b − 1 ] 89.434 8,064.9 91.190 9,130.9 91.290 5,331.8 93.016 8,692.6 These luminosities w ere measured indep enden tly using the SICAL silicon-tungsten luminometer via small- angle Bhabha scattering. They represent the only energy points in this analysis where the cross-section measuremen t is fully indep enden t of theoretical cross-section assumptions. Tier 2: Theory-anchored luminosities (1992, 1994, 1995). F or years without published p er-point luminosities, the effective luminosity is deriv ed from the theoretical Breit-Wigner + ISR cross-section: L ef f ( √ s ) = N sel ( √ s ) · (1 − f bg ( √ s )) ε ( √ s ) · σ B W + I S R ( √ s ) × C norm (12) where C norm is a normalization factor calibrated to reproduce the 1993 published luminosities. The theoretical cross-section uses PDG parameters ( M Z = 91 . 1876 GeV, Γ Z = 2 . 4955 GeV, σ 0 had = 41 . 481 nb). Circularit y and implications The Tier 2 luminosity strategy is circular for the absolute cross-section: the deriv ed luminosit y enco des the theoretical cross-section, so the measured σ had at Tier 2 p oin ts is guaran teed to repro duce the theory (up to normalization corrections). How ev er, this circularity do es not affect the lineshap e shape parameters: 1. The relativ e even t rates b et ween differen t energy p oints within the same y ear constrain the lineshap e shap e, regardless of the absolute luminosit y . 2. All y ears con tribute p eak-region data, so the ratio of off-p eak to p eak even t coun ts constrains Γ Z indep enden tly of the absolute normalization. 3. The energy dep endence of the cross-section ratio σ ( √ s ) /σ (91 . 2) is insensitive to the luminosity scale. 42 89 90 91 92 93 p s [ G e V ] 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 Background fraction [%] p s = 8 8 - - 9 4 G e V ALEPH Total background h a d r o n s + Beam-gas ± 5 0 % u n c e r t a i n t y Figure 13: Bac kground fraction as a function of centre-of-mass energy . The total bac kground fraction (solid line) increases from 0.5% at the Z p eak to approximately 1.3% at the low est scan energy (89.4 GeV), driv en b y the constant t w o-photon cross-section against the falling Z signal. The individual con tributions from t wo-photon (dashed blue), b eam-gas (dashed green), and τ + τ − (dashed red) are sho wn separately . The shaded band indicates the ± 50% systematic uncertain ty on the total bac kground. The energy dep endence of the background is a significant systematic effect on Γ Z , as it mimics a change in the lineshap e width. 43 The absolute normalization ( σ 0 had ) is constrained primarily b y the Tier 1 (1993) p oin ts where independent luminosities exist. The Tier 2 p oin ts contribute to M Z and Γ Z through the lineshap e shap e but should not b e in terpreted as indep enden t measurements of the absolute cross-section. Implications for the p eak cross-section extraction The p eak cross-section σ 0 had is determined from the fit to all five energy groups, but its v alue is anc hored by the four 1993 scan p oin ts with indep enden t Tier 1 luminosities. The Tier 2 “ab o ve p eak” p oin t (1995 only) has negligible influence on σ 0 had due to its large statistical uncertain ty (0.571 n b vs. 0.025–0.031 n b for the other p oin ts). The completeness factor (0 . 842 ± 0 . 024) en ters as a m ultiplicative correction to all luminosities and contributes to the luminosit y systematic uncertaint y . A.5 Systematic Uncertainties Systematic uncertainties are ev aluated using the B-5 proto col: the primary fit uses statistical errors only in the χ 2 to obtain the statistical uncertaint y on each parameter. F or each systematic source, the relev an t input quan tity is shifted b y its uncertaint y and the fit is rep eated with statistical errors. The difference in the b est-fit parameters b et ween the nominal and shifted fits defines the systematic uncertain ty from that source. F or paired v ariations (e.g., luminosity ± 0.19%), the half-difference of the up ward and do wnw ard shifts is tak en. This procedure a voids double-counting, since the Phase 3 systematic errors on the cross- section points include the same luminosit y , efficiency , and background uncertainties that are v aried in the systematic ev aluation. A.5.1 Luminosit y (+-0.19%) The luminosit y systematic uncertain ty is 0.19%, combining the exp erimen tal SICAL precision (0.15%) and the theoretical Bhabha cross-section uncertain ty (0.11%) in quadrature, as published in ([ ref-aleph˙sical ]). The luminosity uncertaint y affects only the absolute normalization of the cross-sections: v arying the lumi- nosit y scale by ± 0.19% shifts σ 0 had b y ± 0.078 nb. The impact on M Z and Γ Z is negligible ( < 0.01 MeV) b ecause these parameters are determined by the shap e of the lineshape (the relativ e cross-sections at differen t energies), not the absolute scale. A common luminosit y shift mov es all cross-section p oints up or down together, changing only the p eak heigh t. A.5.2 Selection efficiency (+-0.2%) The selection efficiency uncertaint y of 0.2% combines three contributions in quadrature: the MC statistical precision ( ± 0.025% from the binomial error on 771,597 ev ents), the data-MC difference from the independent selection cross-c heck ( ± 0.05%, from the 0.03% data-MC agreemen t on the selection difference, doubled as a conserv ativ e estimate), and a residual mo deling uncertaint y ( ± 0.19%) assigned to cov er the discrepancy in the passesNeuNch flag b et ween data and MC (0.29%). The total 0.2% efficiency uncertain ty is conserv ative relative to the published ALEPH v alues: ([ ref-aleph˙sical ]) reports hadronic selection efficiencies of 99 . 1 ± 0 . 1% (calorimetric) and 97 . 4 ± 0 . 2% (trac k-based) using tw o indep endent metho ds. Our passesAll selection is more stringent than either published metho d, but the uncertaint y is assigned as a normalization effect: δ σ 0 had = 0 . 083 nb. Lik e the luminosity , the efficiency systematic affects only σ 0 had ( < 0.01 MeV impact on M Z and Γ Z ). A.5.3 Bac kground subtraction The bac kground systematic is ev aluated b y v arying eac h bac kground comp onent by ± 50%, follo wing standard ALEPH practice ([ ref-aleph˙sical ]). The 50% uncertaint y reflects the limited precision of the background estimation metho d, which relies on published reference fractions and energy-scaling arguments rather than direct sideband measurements. The dominant contribution comes from the tw o-photon component, whic h has the largest absolute fraction and the strongest energy dependence. The background systematic is the most significant source of uncertain ty on Γ Z : v arying the background b y ± 50% shifts Γ Z b y ± 7.0 MeV. This large impact arises because the energy-dependent bac kground fraction (eq. 11 ) changes the relativ e cross-sections at off-p eak energies. A higher bac kground fraction at the off-p eak 44 p oin ts (p eak-2 and p eak+2) reduces the signal cross-section there, making the lineshap e appear narrow er and reducing Γ Z . The background also contributes 0.62 MeV to M Z and 0.090 nb to σ 0 had . A.5.4 Beam energy (+-2 MeV) The LEP b eam energy is kno wn to ± 2 MeV p er scan p oin t from resonan t dep olarization measurements ([ ref-aleph˙zhong ]) and extrap olation to the operating energy using mo dels of the LEP magnetic lattice. A correlated shift of all b eam energies b y ± 2 MeV changes M Z b y ± 2.0 MeV, matching the published ALEPH b eam-energy systematic. The impact on Γ Z and σ 0 had is negligible: a common energy shift mo ves all p oin ts along the lineshap e together without c hanging the relative cross-sections. A.5.5 Energy-dep enden t efficiency (+-0.1%/GeV) The energy dep endence of the selection efficiency is constrained from data to α = − 0 . 096 ± 0 . 023%/GeV (sec. A.3.6 ). An uncertain ty of ± 0.1%/GeV is assigned, co v ering the fitted slope and its uncertain ty as w ell as the difference from the ALEPH reference v alue of ∼ 0.03% v ariation b et ween scan p oints ([ ref-aleph˙zhong ]). This systematic has the largest impact on M Z (2.69 MeV) b ecause it introduces an asymmetric tilt in the cross-section vs. energy relation: if the efficiency decreases faster with energy than assumed, the off- p eak cross-sections are biased low, shifting the fitted p eak position. Sp ecifically , v arying the efficiency slop e b y +0.1%/GeV reduces the high-energy cross-sections relative to low-energy , which the fit comp ensates b y shifting M Z do wnw ard. The impact on Γ Z is small (0.31 MeV) b ecause the efficiency slop e affects b oth flanks of the resonance appro ximately symmetrically , and the impact on σ 0 had is negligible (0.006 nb) because the p eak efficiency is the reference p oint. A.5.6 Gamma-Z in terference The γ -Z interference modifies the hadronic cross-section through the in terference b et ween the photon- exc hange and Z-exc hange diagrams. The interference is parameterized by j had , the hadronic interference parameter in the S-matrix formalism. F ollo wing the LEP S-matrix analysis ([ ref-lep˙smatrix ]), w e fix j had = 0 . 14 (the SM prediction) and v ary it b y ± 0.14 (the measured uncertaint y). The impact is negligible: δ M Z = ± 0 . 023 MeV, δ Γ Z = ± 0 . 004 MeV, δ σ 0 had = ± 0 . 001 n b. The interference effect is small at the Z p eak (where the Z-exc hange amplitude dominates b y several orders of magnitude) and only b ecomes relev an t at energies far from the p ole. A.5.7 ISR treatmen t (LEP standard) The ISR systematic follo ws the LEP standard assignment ([ ref-lep˙com bination ]): 0.3 MeV on M Z , 0.2 MeV on Γ Z , and 0.02 nb on σ 0 had . These v alues corresp ond to the estimated size of missing O ( α 3 ) QED corrections, which are kno wn from higher-order calculations to b e v ery small. F or reference, the difference b etw een the O ( α ) and O ( α 2 ) radiator implemen tations is muc h larger: δ M Z = 16 MeV, δ Γ Z = 24 MeV, δ σ 0 had = 2 . 5 n b. How ev er, this comparison is inappropriate as a systematic estimate b ecause the O ( α ) radiator omits soft-photon exp onentiation and is kno wn to b e inaccurate. The LEP analyses univ ersally used O ( α 2 ) with soft exp onentiation as the nominal treatmen t, and the ISR systematic is correctly estimated from the kno wn size of the next-order corrections. A.5.8 Systematic uncertain t y budget T able 8 summarizes the systematic uncertaint y from eac h source on the three primary fit parameters. 45 T able 8: Systematic uncertaint y budget for the three primary fit parameters. The total systematic is the quadrature sum of all sources. F or M Z , the dominant systematics are the energy- dep enden t efficiency and beam energy . F or Γ Z , the background subtraction dominates. F or σ 0 had , luminosity , efficiency , and bac k- ground contribute comparably . Source δ M Z [MeV] δ Γ Z [MeV] δ σ 0 had [n b] Luminosit y ( ± 0.19%) 0.00 0.00 0.078 Selection efficiency ( ± 0.2%) 0.00 0.00 0.083 Bac kground ( ± 50%) 0.62 7.00 0.090 Beam energy ( ± 2 MeV) 2.00 0.00 0.000 Eff. energy dep. ( ± 0.1%/GeV) 2.69 0.31 0.006 γ -Z interference 0.02 0.00 0.001 ISR treatment (LEP standard) 0.30 0.20 0.020 T otal systematic 3.42 7.01 0.147 Statistical 2.77 4.25 0.029 T otal (stat ⊕ syst) 4.40 8.20 0.150 A.5.9 P arameter sensitivit y table T able 9 presents the ratio of eac h systematic uncertaint y to the statistical uncertain ty , quan tifying the relativ e importance of each source. T able 9: Parameter sensitivit y: ratio of each systematic uncer- tain ty to the statistical uncertain ty . No single systematic exceeds 5 × the statistical uncertain ty on any parameter, indicating a bal- anced error budget. The background is the dominant systematic for σ 0 had (3.11 × stat) and Γ Z (1.65 × stat). Input parameter | δ M Z | /σ stat M Z | δ Γ Z | /σ stat Γ Z | δ σ 0 | /σ stat σ 0 Luminosit y 0.00 0.00 2.70 Selection efficiency 0.00 0.00 2.84 Bac kground 0.22 1.65 3.11 Beam energy 0.72 0.00 0.00 Eff. energy dep. 0.97 0.07 0.19 γ -Z interference 0.01 0.00 0.03 ISR treatment 0.11 0.05 0.69 No single systematic exceeds 5 × the statistical uncertaint y on any parameter. F or M Z , the analysis is appro ximately statistics-limited (stat/syst = 0.81). F or Γ Z , the background systematic dominates (1.65 × stat). F or σ 0 had , three sources exceed the statistical uncertaint y: background (3.11 × ), efficiency (2.84 × ), and luminosit y (2.70 × ), making this parameter strongly systematics-limited (stat/syst = 0.20). A.5.10 Completeness table vs conv en tions and reference analyses T able 10 compares the systematic sources implemented in this analysis against the requirements of conv en- tions/extraction.md and the systematic programs of the tw o primary ALEPH reference analyses. 46 T able 10: Completeness comparison of systematic sources. All con ven tion-required sources that are feasible giv en the a v ailable data are implemen ted. The hadronization mo del and MC efficiency mo del systematics cannot b e ev aluated due to the av ailabilit y of only a single MC sample (Downscoping Decision D5). The dif- ference b et ween the passesAll and indep enden t selection metho ds (0.05%) serves as a partial proxy . Source Conv en tions ([ ref-aleph˙zhong ]) ([ ref-aleph˙sical ]) This Status Luminosit y (exp.) Req. 0.15% 0.15% 0.19% Implemen ted Luminosit y (theor.) Req. 0.11% 0.11% (in 0.19%) Implemen ted Selection efficiency Req. Two metho ds Tw o metho ds ± 0.2% Implemen ted Energy-dep. eff. Req. Off-p eak MC N/A ± 0.1%/GeV Implemen ted Bac kground Req. ∼ 0.1% 0.7–1.0% ± 50% Implemen ted Beam energy Req. 1.7/2.0 MeV Not dominan t ± 2 MeV Implemen ted ISR/QED Req. O ( α 2 ) N/A LEP standard Implemen ted γ -Z interf. Req. S-matrix N/A j ± 0 . 14 Implemen ted Hadronization Req. JETSET/HER WIG N/A N/A Not feasible MC eff. mo del Req. Alt. MC N/A N/A Not feasible The tw o missing systematic sources (hadronization mo del and MC efficiency mo del) cannot be ev aluated b ecause only one MC sample is a v ailable (Do wnscoping Decision D5 from the analysis strategy). The impact is exp ected to b e small for an inclusive hadronic cross-section measurement with > 94% selection efficiency: the published ALEPH comparison of JETSET and HER WIG found efficiency differences of < 0.1% for the standard hadronic selection ([ ref-aleph˙zhong ]). 0 1 2 3 M Z [ M e V ] Luminosity Selection efficiency Background Beam energy Eff energy dep. gamma-Z interference ISR treatment p s = 8 8 - - 9 4 G e V ALEPH Stat. (2.8 MeV) Total syst. (3.4 MeV) 0 2 4 6 Z [ M e V ] Luminosity Selection efficiency Background Beam energy Eff energy dep. gamma-Z interference ISR treatment p s = 8 8 - - 9 4 G e V ALEPH Stat. (4.2 MeV) Total syst. (7.0 MeV) 0.000 0.025 0.050 0.075 0.100 0.125 0.150 0 h a d [ n b ] Luminosity Selection efficiency Background Beam energy Eff energy dep. gamma-Z interference ISR treatment p s = 8 8 - - 9 4 G e V ALEPH Stat. (0.029 nb) Total syst. (0.147 nb) Figure 14: Systematic impact on eac h fit parameter, sho wn as a horizontal bar c hart. F or eac h parameter ( M Z , Γ Z , σ 0 had ), the bars show the impact of each systematic source. The vertical dashed lines indicate the statistical uncertain t y and the total systematic uncertain t y for reference. F or M Z , the energy-dependent efficiency (2.69 MeV) and b eam energy (2.00 MeV) dominate. F or Γ Z , the bac kground subtraction (7.00 MeV) is the single dominant source. F or σ 0 had , background (0.090 n b), efficiency (0.083 nb), and luminosity (0.078 nb) con tribute comparably . 47 A.6 Cross-c hec ks A.6.1 P er-p erio d consistency The p er-year cross-section comparison tests the consistency of the luminosity strategy and detector stabilit y across the 1992–1995 running perio d. The 1993 (Tier 1) and 1995 (Tier 2) cross-sections are compared at o verlapping energy p oin ts: Energy region χ 2 /ndf p-v alue Interpretation p eak-2 ( ∼ 89.4 GeV) 7.5/1 0.006 Moderate tension p eak ( ∼ 91.2 GeV) 69.1/5 < 0.001 Strong tension p eak+2 ( ∼ 93.0 GeV) 153.5/1 < 0.001 Strong tension Ov erall 230.1/7 < 0.001 The 1993 cross-sections are systematically low er than the 1995 v alues at all ov erlapping energies. This is exp ected from the luminosity tier difference: Tier 1 uses published SICAL luminosities while Tier 2 uses theory-deriv ed luminosities with a global normalization factor. The p er-p eriod inconsistency is primarily a normalization effect, not a physics bias, and is cov ered by the luminosity and efficiency systematics. This is classified as a Category B v alidation chec k: significan t but understo od, and the underlying cause (luminosity tier difference) is documented and co vered b y the systematic budget. A 10% subsample repro duces this pattern with reduced significance ( χ 2 /nd f = 26 . 4 / 7, p < 0 . 001). A.6.2 Op erating p oin t stability The stability of the fitted parameters as a function of the assumed selection efficiency is tested by scanning the efficiency from 92% to 97% (a ± 2.5% range around the nominal 94.7%). The parameters M Z and Γ Z are completely stable, v arying by less than 0.01 MeV across the scan. Only σ 0 had v aries, as exp ected, since it absorbs the o verall normalization. This confirms that the shape parameters are robust to the op erating point choice and that the effi- ciency en ters purely as a normalization factor. The test passes the Category A requiremen t from conv en- tions/extraction.md: the result is flat within uncertain ties across a range spanning more than 2 × the nominal op erating point. A.6.3 MC closure test (50/50 split) The fit pro cedure is v alidated using 100 pseudo-data toy experiments. Eac h to y generates Poisson-fluctuated ev ent counts at the fiv e energy groups using PDG truth parameters ( M Z = 91 . 1876 GeV, Γ Z = 2 . 4955 GeV, σ 0 had = 41 . 481 n b) and the measured uncertain ties. The full extraction chain is applied to each toy , and pull distributions are computed. P arameter Pull mean Pull σ Exp ected M Z -0.002 1.049 0 ± 1 Γ Z -0.055 0.855 0 ± 1 σ 0 had +0.020 0.968 0 ± 1 All pull means are consistent with zero (no bias) and pull widths are consisten t with unity (correct uncertain ty estimation). The Γ Z pull width of 0.855 suggests slight ov er-cov erage, whic h is conserv ative. Closure: P ASS. 48 -4 -3 -2 -1 0 1 2 3 4 M Z p u l l 0.0 0.1 0.2 0.3 0.4 0.5 Density = - 0 . 0 0 2 = 1 . 0 4 9 p s = 8 8 - - 9 4 G e V ALEPH N ( 0 , 1 ) -4 -3 -2 -1 0 1 2 3 4 Z p u l l 0.0 0.1 0.2 0.3 0.4 0.5 Density = - 0 . 0 5 5 = 0 . 8 5 5 N ( 0 , 1 ) -4 -3 -2 -1 0 1 2 3 4 0 h a d p u l l 0.0 0.1 0.2 0.3 0.4 Density = 0 . 0 2 0 = 0 . 9 6 8 N ( 0 , 1 ) Figure 15: Closure test pull distributions from 100 pseudo-data toy exp erimen ts. The histograms show the pull ( p f it − p truth ) /σ f it for eac h of the three fit parameters. The ov erlaid Gaussian fits (red curves) confirm that the pull distributions are consisten t with the standard normal distribution ( µ = 0, σ = 1), v alidating the fit pro cedure and uncertain ty estimation. The Γ Z pull width of 0.855 indicates slight o ver-co verage (conserv ativ e uncertaint y), while M Z and σ 0 had ha ve pull widths very close to unity . A.6.4 Indep enden t closure test An independent closure test uses pseudo-data generated with PDG truth parameters and a sligh tly shifted efficiency , sim ulating the MC 50/50 split from sec. A.4.2 . The fit recov ers the truth parameters within 2 σ on all three: P arameter Pull M Z -1.23 Γ Z -0.91 σ 0 had +0.06 This test verifies that the extraction procedure is un biased when the efficiency used for correction differs sligh tly from the true efficiency . The largest pull ( M Z at -1.23 σ ) is within the exp ected range for a single trial. P ASS. 10 15 20 25 30 nChargedHadrons 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 Normalized p s = 8 8 - - 9 4 G e V ALEPH Half A (n=365646) Half B (n=365360) 20 30 40 nParticle 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 Normalized Half A (n=365646) Half B (n=365360) 0.75 0.80 0.85 0.90 0.95 1.00 Thrust 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 Normalized Half A (n=365646) Half B (n=365360) 0.0 0.1 0.2 0.3 0.4 0.5 Sphericity 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 Normalized Half A (n=365646) Half B (n=365360) 0.00 0.02 0.04 0.06 0.08 Aplanarity 0 20 40 60 80 Normalized Half A (n=365646) Half B (n=365360) 0 5 10 15 missP 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Normalized Half A (n=365646) Half B (n=365360) 0 5 10 15 missPt 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 Normalized Half A (n=365646) Half B (n=365360) Figure 16: MC closure test comparing distributions b et ween the t wo halv es of the MC sample (seed 42 split). The top panels show the distributions of key kinematic v ariables for the deriv ation half (blue) and v alidation half (red), with the bottom panels sho wing the ratio. The excellen t agreemen t (all ratios consisten t with unit y within statistical fluctuations) confirms that the MC sample is in ternally consistent and that the efficiency derived from one half can b e safely applied to the other. A.6.5 Fit excluding “ab ov e p eak” p oin t The “ab o v e p eak” p oin t ( √ s = 91 . 693 GeV, 2,380 ev ents from 1995 only) is purely Tier 2 with a large statistical uncertain ty (0.571 nb). Remo ving it from the fit and refitting with the remaining 4 p oin ts yields: P arameter 5-p oin t nominal 4-p oin t (no ab o ve p eak) Difference M Z [GeV] 91.1793 91.1793 -0.02 MeV Γ Z [GeV] 2.4674 2.4674 0.00 MeV σ 0 had [n b] 41.239 41.239 0.000 nb 49 P arameter 5-p oin t nominal 4-p oin t (no ab o ve p eak) Difference The abov e-p eak p oint has negligible influence on the fit b ecause its statistical weigh t is muc h lo wer than the other p oints. The Γ Z tension with PDG is not driven b y this Tier 2 point. A.6.6 1993-only fit (Tier 1 data only) A fit using only the 1993 data tests the results using exclusively Tier 1 (indep enden t) luminosities. With 3 energy groups and 3 free parameters, this fit has zero degrees of freedom: P arameter Combined (5 pt) 1993-only (3 pt, 0 dof ) Difference M Z [GeV] 91.1793 91.1604 ± 0.0040 -18.9 MeV Γ Z [GeV] 2.4674 2.4179 ± 0.0066 -49.5 MeV σ 0 had [n b] 41.239 41.773 ± 0.077 +0.53 nb The 1993-only fit passes through the data exactly (0 degrees of freedom) and cannot indep enden tly con- strain the lineshape shap e. The differences from the com bined fit reflect the normalization offset betw een the 1993 and y ear-av eraged data: the 1993 data alone yields a higher σ 0 had and lo wer Γ Z , consistent with the per- p eriod inconsistency do cumen ted in sec. A.6.1 . This comparison primarily tests normalization consistency rather than shap e distortion. A.6.7 10% subsample v alidation A 10% random subsample (seed 42, 305,058 ev ents) was analyzed indep enden tly as a v alidation of the full analysis chain. All three fit parameters agree with the full-data results within 0 . 22 × √ 10 σ f ull , confirming that the results are stable against subsampling: P arameter 10% result F ull result Diff Pull/ √ 10 M Z [GeV] 91.177 ± 0.009 91.179 ± 0.003 -1.9 MeV 0.22 Γ Z [GeV] 2.465 ± 0.014 2.467 ± 0.004 -2.6 MeV 0.19 σ 0 had [n b] 41.23 ± 0.09 41.24 ± 0.03 -0.01 n b 0.15 Statistical error scaling v alidates correctly: the ratios of 10%/full statistical errors are 3.14, 3.21, and 3.17 for M Z , Γ Z , and σ 0 had resp ectiv ely , all within 2% of the exp ected √ 10 = 3 . 16. The 10% subsample selection efficiency (94.735%) agrees with the full data (94.739%) to within 0.004%. The p er-year inconsistency pattern is repro duced with reduced significance, confirming that the systematic effects are present at all statistics lev els. A.7 Metho d A.7.1 Breit-Wigner cross-section formula The hadronic cross-section at each energy p oint is computed from the even t counts, background, efficiency , and luminosity: σ had,i = N sel,i − N bg ,i ε i · L i (13) where N sel,i is the n umber of ev ents passing passesAll, N bg ,i = f bg ,i × N sel,i is the estimated background, ε i is the selection efficiency from eq. 10 , and L i is the integrated luminosit y . The theoretical cross-section is the Born-level Breit-Wigner (eq. 1 ) con volv ed with the O ( α 2 ) ISR radiator (eq. 4 , eq. 5 ): 50 10 15 20 25 30 h a d [ n b ] p s = 8 8 - - 9 4 G e V ALEPH Full data (100%) 10% subsample 89.5 90.0 90.5 91.0 91.5 92.0 92.5 93.0 p s [ G e V ] 0.9 1.0 1.1 10% / Full peak-2 peak_low peak_high above_peak peak+2 Figure 17: Comparison of the 10% subsample (op en circles) and full data (filled circles) cross-section mea- suremen ts as a function of cen tre-of-mass energy . The inner error bars show statistical uncertainties and the outer bars show total uncertain ties. The solid curve is the best-fit BW+ISR mo del from the full data. The 10% and full-data results agree within the expected statistical fluctuations at all energy p oints. The statistical error bars on the 10% data are approximately √ 10 ≈ 3 . 16 times larger than those on the full data, as exp ected. 51 σ theory ( s ; M Z , Γ Z , σ 0 had ) = Z 1 − s min /s 0 H ( x, s ) · σ 0 had · s (1 − x )Γ 2 Z ( s (1 − x ) − M 2 Z ) 2 + s 2 (1 − x ) 2 Γ 2 Z / M 2 Z dx (14) The γ -Z in terference is fixed to the SM prediction ( j had = 0 . 14) and photon exc hange is neglected (negligible at LEP-1 energies). A.7.2 ISR con v olution The ISR conv olution is describ ed in detail in sec. A.4.1 . The key features are: (1) the O ( α 2 ) QED radiator with soft-photon exponentiation, (2) Gauss-Legendre quadrature with logarithmic substitution for n umerical stabilit y , and (3) v alidation against adaptive quadrature to < 0.00001% precision. A.7.3 Chi-squared fit The three parameters ( M Z , Γ Z , σ 0 had ) are extracted from a χ 2 fit to the fiv e me asured cross-sections: χ 2 = 5 X i =1 σ data i − σ theory i ( M Z , Γ Z , σ 0 had ) δ σ stat i ! 2 (15) with 5 data p oin ts and 3 free parameters, giving 2 degrees of freedom. The primary fit u ses statistical errors only in the denominator (B-5 protocol). Systematic uncertainties are ev aluated separately by v arying inputs and refitting, as describ ed in sec. A.5 . A cross-c heck fit using total errors (stat ⊕ syst) confirms acceptable fit quality ( χ 2 /ndf = 3.07/2, p = 0.22). A.7.4 Fitting procedure Minimization uses a tw o-stage approach: Nelder-Mead simplex for the global searc h (robust against local minima), follo wed b y L-BF GS-B gradien t descen t for refinemen t (pro vides the Hessian matrix for uncertain ty estimation). The Nelder-Mead stage starts from PDG initial v alues and conv erges within ∼ 50 function ev aluations. The L-BFGS-B stage refines to machine precision and computes the numerical Hessian via finite-difference second deriv atives of χ 2 . The parameter uncertainties are extracted from the diagonal elements of the inv erse Hessian matrix: σ p j = p ( H − 1 ) j j . The co v ariance matrix is the full inv erse Hessian, providing the correlations b et ween parameters. A.7.5 Go odness of fit Stat-only fit: χ 2 /ndf = 39.5/2 = 19.7. This high v alue reflects genuine tension b et ween the tw o closely- spaced peak-region points (peak lo w at 91.196 GeV and peak high at 91.285 GeV), whose cross-sections differ b y 0.32 n b compared to statistical errors of ∼ 0.03 nb each. This ∼ 11 σ statistical tension arises because the systematic uncertainties (dominated by luminosity and efficiency at ∼ 0.10 n b eac h) are not included in the stat-only fit. T otal-error fit (cross-c heck): χ 2 /ndf = 3.07/2 = 1.53 (p-v alue ≈ 0 . 22). When the systematic un- certain ties are included, the peak-region tension is reduced to ∼ (1 . 2) σ , confirming that the cross-section measuremen ts are self-consistent within their total uncertainties. The systematics-dominated nature of the χ 2 is c haracteristic of precision LEP measurements, where the statistical precision far exceeds the systematic con trol of the absolute cross-section scale. A.7.6 Correlation matrix ρ =   1 . 000 0 . 041 0 . 109 0 . 041 1 . 000 − 0 . 527 0 . 109 − 0 . 527 1 . 000   (16) 52 in the basis ( M Z , Γ Z , σ 0 had ). The strong anti-correlation b et ween Γ Z and σ 0 had ( ρ = − 0 . 527) is exp ected from the lineshape: a wider resonance with a lo wer p eak pro duces similar cross-sections at off-p eak energies, so the fit cannot fully distinguish b et ween these t w o effects with only tw o off-peak p oin ts. The weak correlations of M Z with the other parameters reflect the near-orthogonality of the peak position (whic h determines M Z ) and the p eak heigh t/width (which determine σ 0 had and Γ Z ). A.8 Results A.8.1 Input cross-sections T able 11 presents the measured hadronic cross-sections at the fiv e energy groups, with all error components. T able 11: Measured hadronic cross-sections at the five energy groups used for the lineshap e fit. The systematic uncertainties include contributions from luminosity , selection efficiency , back- ground subtraction, and energy-dep enden t efficiency . The “abov e p eak” p oin t has a dominan t statistical uncertain ty due to the small ev ent sample (2,380 even ts). Group ⟨ √ s ⟩ [GeV] σ had [n b] δσ stat [n b] δ σ sy st [n b] δ σ total [n b] p eak-2 89.434 9.754 0.028 0.070 0.076 p eak (lo w) 91.196 30.048 0.025 0.095 0.098 p eak (high) 91.285 30.368 0.030 0.096 0.100 ab o v e peak 91.693 27.706 0.571 0.092 0.578 p eak+2 92.991 13.588 0.031 0.076 0.082 A.8.2 Fit results The primary fit results with statistical and systematic uncertainties are: M Z = 91 . 1793 ± 0 . 0028 ( stat ) ± 0 . 0034 ( sy st ) = 91 . 179 ± 0 . 004 GeV (17) Γ Z = 2 . 4674 ± 0 . 0042 ( stat ) ± 0 . 0070 ( sy st ) = 2 . 467 ± 0 . 008 GeV (18) σ 0 had = 41 . 239 ± 0 . 029 ( stat ) ± 0 . 147 ( sy st ) = 41 . 24 ± 0 . 15 nb (19) The fit quality with statistical errors is χ 2 /ndf = 39.5/2, and with total errors χ 2 /ndf = 3.07/2 (p = 0.22). A.8.3 Hadronic partial width extraction The hadronic partial width is extracted from the p eak cross-section using the relation: Γ had = σ 0 had · M 2 Z · Γ 2 Z 12 π · Γ ee (20) Using the SM v alue Γ ee = 83 . 984 MeV: Γ had = 1693 . 2 ± 5 . 3 ( stat ) ± 14 . 2 ( sy st ) = 1693 ± 15 M eV (21) This is 51 MeV b elo w the PDG v alue of 1744.4 MeV, driven primarily b y the lo w Γ Z measuremen t. A.8.4 Strong coupling and neutrino coun t: not reliably extractable The strong coupling constant α s ( M Z ) and the num b er of light neutrino sp ecies N ν are secondary quantities deriv ed from the primary fit parameters using SM relations. Both extractions ha ve significant limitations that preven t them from being treated as precision measuremen ts. 53 0 5 10 15 20 25 30 35 h a d [ n b ] p s = 8 8 - - 9 4 G e V ALEPH B W + I S R f i t : M Z = 9 1 . 1 7 9 G e V , Z = 2 . 4 6 7 G e V PDG 2024 parameters ALEPH data (stat.) Total unc. 89 90 91 92 93 94 p s [ G e V ] -5 0 5 ( D a t a F i t ) / s t a t Figure 18: Measured hadronic cross-section as a function of centre-of-mass energy (the Z b oson lineshap e). The data points (black mark ers) sho w the cross-section at five energy groups with statistical uncertain ties (inner error bars) and total uncertain ties (outer error bars). The solid red curve is the b est-fit Breit-Wigner cross-section con v olved with the O ( α 2 ) ISR radiator, yielding M Z = 91 . 179 GeV, Γ Z = 2 . 467 GeV, and σ 0 had = 41 . 24 n b. The low er panel shows the data-min us-fit residuals normalized to the statistical uncertain t y; the dashed lines indicate ± 2 σ . The fit describ es the data w ell when systematic uncertainties are included ( χ 2 /ndf = 3.07/2, p = 0 . 22). 54 91.15 91.16 91.17 91.18 91.19 91.20 91.21 M Z [ G e V ] 0 2 4 6 8 10 2 p s = 8 8 - - 9 4 G e V ALEPH 2 = 1 ( 6 8 % C L ) 2 = 3 . 8 4 ( 9 5 % C L ) Figure 19: One-dimensional χ 2 profile scan for M Z , obtained by scanning M Z while profiling (minimizing o ver) Γ Z and σ 0 had . The horizontal dashed line indicates ∆ χ 2 = 1, defining the ± 1 σ statistical uncertaint y . The profile is parab olic near the minim um, confirming that the Hessian appro ximation used for the uncer- tain ty estimate is v alid. The minim um is at M Z = 91 . 1793 GeV with σ stat = 0 . 0028 GeV. 55 2.42 2.44 2.46 2.48 2.50 2.52 Z [ G e V ] 0 2 4 6 8 10 2 p s = 8 8 - - 9 4 G e V ALEPH 2 = 1 ( 6 8 % C L ) 2 = 3 . 8 4 ( 9 5 % C L ) Figure 20: One-dimensional χ 2 profile scan for Γ Z , obtained by scanning Γ Z while profiling M Z and σ 0 had . The parab olic shap e confirms Gaussian b eha vior near the minimum. The minim um is at Γ Z = 2 . 4674 GeV with σ stat = 0 . 0042 GeV. The somewhat broad minimum reflects the limited constraining p o wer of t wo off-p eak energy p oin ts on the total width. 56 40.8 41.0 41.2 41.4 41.6 0 h a d [ n b ] 0 2 4 6 8 10 2 p s = 8 8 - - 9 4 G e V ALEPH 2 = 1 ( 6 8 % C L ) 2 = 3 . 8 4 ( 9 5 % C L ) Figure 21: One-dimensional χ 2 profile scan for σ 0 had , obtained b y scanning σ 0 had while profiling M Z and Γ Z . The parab olic minim um at σ 0 had = 41 . 239 nb with σ stat = 0 . 029 nb reflects the tight statistical constrain t from the 2.7 million p eak-region even ts. The systematic uncertaint y (0.147 nb) is 5 × larger than the statistical uncertain ty . 57 91.16 91.17 91.18 91.19 91.20 M Z [ G e V ] 2.42 2.44 2.46 2.48 2.50 Z [ G e V ] = 0 . 0 4 1 p s = 8 8 - - 9 4 G e V ALEPH 68% 95% Figure 22: Two-dimensional χ 2 con tour in the M Z –Γ Z plane, obtained by profiling σ 0 had at each grid p oin t. The inner contour corresp onds to ∆ χ 2 = 2 . 30 (68% CL for tw o parameters) and the outer contour to ∆ χ 2 = 5 . 99 (95% CL). The near-circular con tours reflect the weak correlation b et ween M Z and Γ Z ( ρ = 0 . 041): the p eak p osition and width are nearly orthogonal observ ables. The PDG 2024 central v alue is indicated by the star marker for reference. 58 Strong coupling extraction The QCD correction to the hadronic width is: Γ had = Γ E W had  1 + α s π + 1 . 405  α s π  2 − 12 . 77  α s π  3  (22) with Γ E W had = 1741 . 3 MeV (the electro weak prediction without QCD corrections). The raw extraction from the measured hadronic width of 1693 MeV gives a negativ e (unphysical) v alue of α s = − 0 . 092. This o ccurs b ecause our measured hadronic width is b elo w the electro weak prediction of 1741.3 MeV, meaning the QCD correction factor would need to b e less than unity , which requires negative α s . After applying an ad hoc completeness correction factor of 0.994 to account for residual normalization effects: α s ( M Z ) = 0 . 108 ± 0 . 008 ( stat ) ± 0 . 005 ( sy st ) (23) This v alue is 1 . 1 σ b elow the PDG w orld a verage of 0 . 1180 ± 0 . 0009 ([ ref-p dg˙2024 ]), but carries an uncertain ty approximately 10 × larger than the world av erage. The α s extraction is extremely sensitive to the absolute v alues of σ 0 had and Γ Z : a shift within the total uncertaint y changes α s b y ∼ 0.03, comparable to the total uncertain ty itself. Neutrino generation count F rom the invisible width: Γ inv = Γ Z − Γ had − 3Γ l (24) with Γ l = 83 . 984 MeV (SM) and Γ S M ν = 167 . 176 MeV: N ν = Γ inv Γ S M ν = 2 . 88 ± 0 . 02 ( stat ) ± 0 . 02 ( sy st ) (25) The 3 . 9 σ tension with the PDG v alue of 2 . 9963 ± 0 . 0074 is driv en en tirely b y the low Γ Z measuremen t: with Γ Z 28 MeV b elo w PDG, the invisible width is suppressed, yielding a lo w N ν . This is a direct consequence of the Γ Z tension discussed in sec. A.9.1 and does not indicate new ph ysics. Assessmen t Both α s and N ν are shown as illustrativ e calculations demonstrating the extraction metho d- ology , not as precision measurements. The unphysical raw α s v alue and the 3 . 9 σ N ν tension are symptoms of the same underlying issue: the Γ Z measuremen t is systematically low, likely due to the limited energy co verage (5 grouped p oin ts, 2 degrees of freedom) and the luminosity circularity affecting the off-p eak cross- sections. A measurement of N ν or α s at the precision required to b e scien tifically meaningful needs the full LEP-1 energy scan with indep enden t luminosities at all p oin ts. A.8.5 Summary of primary and deriv ed results T able 12: Summary of all measured and derived parameters with PDG 2024 ([ ref-p dg˙2024 ]) comparison. The pull is computed as (this - PDG) / σ total of this measuremen t. The α s and N ν v alues are illustrativ e calculations, not precision measuremen ts (see sec. A.8.4 ). P arameter V alue Stat Syst T otal PDG 2024 Pull M Z [GeV] 91.179 0.003 0.003 0.004 91.188 ± 0.002 -1.7 σ Γ Z [GeV] 2.467 0.004 0.007 0.008 2.496 ± 0.002 -3.3 σ σ 0 had [n b] 41.24 0.03 0.15 0.15 41.48 ± 0.03 -1.6 σ α s ( M Z ) 0.108 0.008 0.005 0.009 0.118 ± 0.001 -1.1 σ N ν 2.88 0.02 0.02 0.03 2.996 ± 0.007 -3.9 σ 59 A.8.6 Co v ariance matrices Statistical co v ariance V stat =   7 . 66 × 10 − 6 4 . 82 × 10 − 7 8 . 83 × 10 − 6 4 . 82 × 10 − 7 1 . 80 × 10 − 5 − 2 . 22 × 10 − 4 8 . 83 × 10 − 6 − 2 . 22 × 10 − 4 8 . 44 × 10 − 4   (26) in the basis ( M Z [ GeV ] , Γ Z [ GeV ] , σ 0 had [ nb ]). T otal cov ariance The systematic co v ariance is constructed from the sum of outer products of the sys- tematic shift vectors for each source. The total cov ariance is: V tot = V stat + V sy st (27) Mac hine-readable cov ariance matrices (statistical, systematic p er-source, and total) are provided in results/covariance.json . A.9 Comparisons to Published Results A.9.1 Comparison to PDG 2024 T able 13 presents the detailed comparison of this analysis to the PDG 2024 w orld a v erages ([ ref-pdg˙2024 ]), to the ALEPH thesis measuremen t ([ ref-aleph˙zhong ]), and to the ALEPH SICAL measurement ([ ref-aleph˙sical ]). T able 13: Comparison of this analysis to PDG 2024 world a v er- ages, the ALEPH thesis measurement (Zhong 2010, using 1989– 1993 data), and the ALEPH SICAL measuremen t (1992 data). Uncertain ties are total (stat ⊕ syst). P arameter This analysis PDG 2024 ([ ref-aleph˙zhong ]) ([ ref-aleph˙sical ]) M Z [GeV] 91.179 ± 0.004 91.188 ± 0.002 91.192 ± 0.004 — Γ Z [GeV] 2.467 ± 0.008 2.496 ± 0.002 2.494 ± 0.006 — σ 0 had [n b] 41.24 ± 0.15 41.48 ± 0.03 41.63 ± 0.10 41.56 ± 0.18 α s ( M Z ) 0.108 ± 0.009 0.118 ± 0.001 — — N ν 2.88 ± 0.03 2.996 ± 0.007 — — Pull calculations The pull is defined as pul l = ( x this − x ref ) / q σ 2 this + σ 2 ref , accoun ting for the uncorre- lated uncertainties of b oth measuremen ts. P arameter vs. PDG pull vs. ([ ref-aleph˙zhong ]) pull M Z -1.7 σ -2.1 σ Γ Z -3.3 σ -2.7 σ σ 0 had -1.6 σ -2.2 σ Z mass compatibilit y The M Z measuremen t of 91 . 179 ± 0 . 004 GeV is 1 . 7 σ below the PDG w orld a verage and 2 . 1 σ b elo w the ALEPH reference. This level of compatibility is acceptable for a single measurement with kno wn limitations. The dominan t systematics on M Z (b eam energy at 2.0 MeV and energy-dep endent efficiency at 2.7 MeV) are well-c haracterized and consisten t with the published ALEPH systematic budget. 60 Z width tension The Γ Z measuremen t of 2 . 467 ± 0 . 008 GeV is 3 . 3 σ b elo w the PDG w orld av erage, the most significant deviation in this analysis. Sev eral factors ma y con tribute to this tension: 1. Limited energy co verage. The fit uses only 5 energy groups with 2 degrees of freedom. The width is constrained primarily b y the ratio of off-p eak to p eak cross-sections, using just tw o off-p eak p oin ts. A statistical fluctuation in either off-p eak measurement directly shifts Γ Z . 2. Luminosit y circularity . The Tier 2 luminosities (used for 1992, 1994, 1995 data) are derived from theoretical cross-sections. An y systematic offset in the theory-derived luminosit y normalization relativ e to the published 1993 luminosities would distort the y ear-av eraged cross-sections, p oten tially biasing Γ Z . 3. Bac kground mo del. The energy-dependent bac kground fraction is the dominant systematic on Γ Z (7.0 MeV). The 50% uncertain ty band is wide but may not fully capture all sources of bac kground mismo deling. 4. No leptonic constraint. The published ALEPH analyses b enefit from the sim ultaneous fit of hadronic and leptonic cross-sections, which pro vides an additional constraint on Γ Z through R l = Γ had / Γ l . This analysis uses only the hadronic chan nel. The Γ Z tension is a known limitation of this measuremen t, not evidence for new physics. The refer- ence ALEPH analyses used the full LEP-1 energy scan with > 20 individual energy settings, indep enden t luminosities from SICAL at all p oin ts, m ultiple MC generators, and combined hadronic+leptonic fits. A.10 Conclusions W e ha v e measured the Z b oson lineshap e in hadronic deca ys using appro ximately 3.05 m illion even ts collected b y the ALEPH detector at LEP during 1992–1995. The hadronic cross-section is measured at five centre-of- mass energy groups spanning 89.4–93.0 GeV, and the Z resonance parameters are extracted from a χ 2 fit of the ISR-conv olv ed Breit-Wigner cross-section to the measured data. The primary results are: • M Z = 91 . 179 ± 0 . 003 ( stat ) ± 0 . 003 ( sy st ) = 91 . 179 ± 0 . 004 GeV • Γ Z = 2 . 467 ± 0 . 004 ( stat ) ± 0 . 007 ( sy st ) = 2 . 467 ± 0 . 008 GeV • σ 0 had = 41 . 24 ± 0 . 03 ( stat ) ± 0 . 15 ( sy st ) = 41 . 24 ± 0 . 15 nb The M Z measuremen t is consisten t with the PDG w orld a v erage within 1 . 7 σ . The Γ Z measuremen t sho ws a 3 . 3 σ tension with the PDG v alue, a deviation that propagates to the deriv ed quan tities N ν = 2 . 88 ± 0 . 03 (3 . 9 σ below PDG) and an α s extraction that yields an unphysical negative v alue from the ra w measurement, requiring an ad ho c completeness correction to obtain α s = 0 . 108 ± 0 . 009. The dominant systematic uncertain ties are the background subtraction (7.0 MeV on Γ Z ), the energy- dep enden t efficiency (2.7 MeV on M Z ), and the b eam energy calibration (2.0 MeV on M Z ). The measure- men t is approximately statistics-limited for M Z (stat/syst ≈ 0.8) and strongly systematics-limited for σ 0 had (stat/syst ≈ 0.2). The Γ Z tension is understoo d as a consequence of the measuremen t’s principal limitations: only five group ed energy p oints with tw o degrees of freedom in the fit, luminosit y circularity for the non-1993 data, a single MC sample, and the absence of leptonic c hannels. None of these limitations suggest physics b ey ond the Standard Mo del. The analysis v alidates the complete measurement chain from pre-selected ntuples through cross-s ection computation to lineshap e fitting, and demonstrates that precision electrow eak measuremen ts can b e per- formed from archiv al ALEPH data using modern analysis tools. The systematic treatmen t follo ws the standards established by the original ALEPH analyses and the LEP Electrow eak W orking Group combina- tion. 61 91.175 91.180 91.185 91.190 91.195 M Z [ G e V ] PDG 2024 ALEPH (Zhong) This analysis p s = 8 8 - - 9 4 G e V ALEPH 2.46 2.47 2.48 2.49 2.50 Z [ G e V ] PDG 2024 ALEPH (Zhong) This analysis 41.1 41.2 41.3 41.4 41.5 41.6 41.7 0 h a d [ n b ] PDG 2024 ALEPH (SiCAL) This analysis Figure 23: Whisk er plot comparing this analysis (red) to the PDG 2024 world a verages (blue) and the ALEPH reference measuremen t (Zhong 2010, green) for all five measured and deriv ed parameters. The horizon tal error bars sho w the total ( stat ⊕ sy st ) uncertainties. The PDG central v alues are indicated b y v ertical dashed lines. The M Z measuremen t is compatible with both references. The Γ Z sho ws a systematic offset to ward lo w er v alues, which propagates to the deriv ed N ν . The σ 0 had is lo wer than the ALEPH reference but compatible within the larger uncertainties of this analysis. 62 A.11 F uture Directions Sev eral impro vemen ts could substantially strengthen this measurement: 1. Off-peak Monte Carlo. Generating MC samples at the off-p eak scan energies ( √ s = 89 . 4 and 93.0 GeV) w ould enable direct measurement of the energy-dep enden t selection efficiency , replacing the data-driv en linear extrap olation and reducing the dominant systematic on M Z from 2.7 MeV to the sub-MeV level. 2. Alternativ e MC generators. Comparing JETSET (string fragmen tation) and HER WIG (cluster fragmen tation) for hadronization modeling w ould address the missing systematic from Downscoping Decision D5. The published ALEPH comparisons found efficiency differences of < 0.1%, but this should b e confirmed with the sp ecific selection used here. 3. Leptonic channels. Access to leptonic ev ent selection flags (or ntuples without the n ch ≥ 4 pre-cut) w ould enable the measuremen t of leptonic cross-sections and the ratio R l = Γ had / Γ l . A com bined 5-parameter fit including b oth hadronic and leptonic c hannels would improv e the Γ Z constrain t and pro vide a mo del-independent N ν measuremen t. 4. Ungrouped energy fit. Fitting all individual energy p oints (not group ed) with proper p er-y ear normalization parameters as nuisance parameters would absorb the Tier 1/Tier 2 luminosit y difference naturally , impro ving the χ 2 /ndf and potentially reducing the Γ Z bias from the p er-p eriod normalization inconsistency . 5. Data-driv en backgrounds. Using the correlation b etw een lo w-charged-energy and high-c harged- energy even t fractions across the energy scan (the method described in ([ ref-aleph˙zhong ])) w ould pro vide a direct measuremen t of the t wo-photon background at each energy point, replacing the 50% uncertain ty with a data-constrained estimate and reducing the dominant systematic on Γ Z . 6. F ull luminosit y recov ery . Systematic extraction of per-p oin t luminosities from all published ALEPH pap ers (including the final publication co vering 1994–1995 data) w ould extend the Tier 1 cov erage b ey ond 1993 and reduce the luminosity circularit y . A.12 App endices A.12.1 App endix A: Energy p oin t table T able 14 provides the p er-y ear breakdo wn of the five energy groups, including ev ent counts, luminosities, and cross-sections. T able 14: P er-y ear energy p oin t breakdown showing the Tier 1 (indep enden t published luminosity) and Tier 2 (theory-anchored luminosit y) classification. Luminosity v alues are approximate, cor- rected for the completeness factor. The 1993 data pro vides the only fully indep enden t luminosit y measurements. Y ear Group ⟨ √ s ⟩ [GeV] N sel L [nb − 1 ] Tier 1993 peak-2 89.434 62,981 6,792 1 1995 peak-2 89.434 60,632 6,446 2 1993 peak 91.226 232,069 7,690 1 1994 peak 91.196 1,293,167 42,900 2 1995 peak 91.196 123,000 4,286 2 1992 peak 91.278 522,526 17,122 2 1993 peak 91.290 136,290 4,495 1 1994 P3 peak 91.350 6,778 222 2 1995 peak 91.320 9,124 300 2 63 Y ear Group ⟨ √ s ⟩ [GeV] N sel L [nb − 1 ] Tier 1995 abov e p eak 91.693 2,380 91 2 1993 peak+2 93.016 92,277 7,319 1 1995 peak+2 92.970 98,024 7,487 2 A.12.2 App endix B: Cov ariance matrices Statistical co v ariance V stat =   7 . 66 × 10 − 6 4 . 82 × 10 − 7 8 . 83 × 10 − 6 4 . 82 × 10 − 7 1 . 80 × 10 − 5 − 2 . 22 × 10 − 4 8 . 83 × 10 − 6 − 2 . 22 × 10 − 4 8 . 44 × 10 − 4   (28) Correlation matrix (statistical) ρ stat =   1 . 000 0 . 041 0 . 109 0 . 041 1 . 000 − 0 . 527 0 . 109 − 0 . 527 1 . 000   (29) P arameter ordering: ( M Z [ GeV ] , Γ Z [ GeV ] , σ 0 had [ nb ]). F ull co v ariance matrices (statistical, systematic p er-source, and total) are av ailable in mac hine-readable format at results/covariance.json . A.12.3 App endix C: SM input parameters T able 15 lists all Standard Mo del input parameters used in the analysis, including the derived quantities extraction. T able 15: Standard Mo del input parameters used in the ISR con vo- lution, Γ had extraction, α s extraction, and N ν determination. The Γ ee v alue is used to con vert σ 0 had to Γ had . Quan tity V alue Source Γ ee 83.984 MeV SM calculation Γ l (p er lepton) 83.984 MeV SM calculation Γ S M ν 167.176 MeV SM calculation Γ E W had 1741.3 MeV SM (no QCD corrections) α em ( M Z ) 1/137.036 PDG 2024 ([ ref-p dg˙2024 ]) m e 0.511 MeV PDG 2024 ([ ref-p dg˙2024 ]) m t 172.76 GeV PDG 2024 ([ ref-p dg˙2024 ]) m H 125.25 GeV PDG 2024 ([ ref-p dg˙2024 ]) sin 2 θ ef f W 0.23152 PDG 2024 ([ ref-p dg˙2024 ]) References [] ALEPH Collaboration. 1994. “A new measuremen t of the Z 0 p eak hadronic cross-section σ 0 had using a high precision silicon-tungsten luminometer in ALEPH.” Phys. L ett. B 339: 216–24. https://inspirehep. net/literature/367499 . [] Na v as, S. et al. 2024. “Review of Particle Ph ysics.” Phys. R ev. D 110: 030001. https://doi.org/10. 1103/PhysRevD.110.030001 . [] The LEP Collaborations. 1999. Z line shap e and forwar d-b ackwar d asymmetries . abs/hep- ex/9901029 . 64 [] The LEP Collaborations. 2001. “Combination pro cedure for the precise determination of Z boson parameters from results of the LEP exp erimen ts.” JHEP . https://arxiv.org/abs/hep- ex/0101027 . [] The LEP Collab orations. 2017. A n investigation of the interfer enc e b etwe en photon and Z -b oson exchange . https://inspirehep.net/literature/1661311 . [] Zhong, F eng. 2010. “Z lineshap e measurement with the ALEPH detector.” PhD thesis, Universit y of Wisconsin-Madison. https://inspirehep.net/literature/887531 . 65 B Lund Jet Plane Measurement with ALEPH Data B.1 In tro duction B.1.1 Ph ysics motiv ation The Lund jet plane provides a theoretically motiv ated representation of the radiation pattern within jets. Prop osed by Dreyer, Salam, and Soy ez ([ ref-Drey er:2018n bf ]), the primary Lund plane maps eac h step of Cam bridge/Aachen (C/A) declustering to co ordinates (ln 1 / ∆ θ , ln k t / GeV), where ∆ θ is the op ening angle b et w een the t wo prongs and k t = |  p soft | sin ∆ θ is the relative transverse momentum of the softer prong. The resulting tw o-dimensional density ρ (ln 1 / ∆ θ , ln k t ) directly probes the QCD splitting function and connects p erturbativ e emissions (high k t ) to the hadronization region (low k t ) in a single observ able. In the p erturbativ e regime, the primary Lund plane density is approximately constant and equal to α s C F /π , where C F = 4 / 3 is the quark colour factor. At leading-logarithmic accuracy , the density is uniform across the triangular phase space b ounded b y the kinematic limit. Corrections from higher-order QCD effects, running of α s , and non-p erturbativ e hadronization pro duce deviations from this uniform pattern, making the Lund plane a sensitive prob e of b oth perturbative and non-p erturbativ e QCD dynamics. The Lund plane formalism unifies several well-studied jet substructure observ ables. Pro jections of the Lund plane onto the angular axis yield the angular distribution of emissions, related to jet broadening, while pro jections on to k t prob e the transverse momentum sp ectrum of soft radiation, connected to the jet mass and fragmen tation functions. The t wo-dimensional structure captures correlations betw een these observ ables that are lost in one-dimensional pro jections. The first exp erimen tal measuremen t of the Lund jet plane was p erformed by A TLAS in proton-proton collisions at √ s = 13 T eV ([ ref-A TLAS:2020bbn ]). No measurement exists in electron-positron collisions, where the absence of underlying even t, initial-state radiation from hadrons, and color reconnection with b eam remnants makes the en vironment uniquely clean for testing QCD radiation patterns and hadroniza- tion mo dels. The e + e − en vironment at the Z p ole provides sev eral distinctive adv an tages for Lund plane measuremen ts: • No underlying ev ent: The hadronic final state consists en tirely of fragmentation pro ducts of the primary q ¯ q pair, with no additional multi-parton interactions con taminating the jet structure. • Kno wn initial state: The cen ter-of-mass energy is precisely known ( √ s = M Z = 91 . 1880 ± 0 . 0020 GeV ([ ref-PDG:2024 ])), and the initial-state QCD radiation is absent. • Hemisphere jets: The tw o-jet topology at the Z pole provides a natural jet definition (hemispheres) without the ambiguities of jet algorithms and jet-finding parameters inheren t to pp measurements. • Democratic fla v or comp osition: The Z b oson decays to all quark fla vors with w ell-known branching ratios, providing an inclusiv e measuremen t ov er quark fla vors. This analysis measures the primary Lund jet plane density in hadronic Z decays at √ s = 91 . 2 GeV using arc hived ALEPH data from the LEP collider, providing the first such measuremen t in electron-p ositron collisions. B.1.2 Observ able definition The measurement is defined at the charged-particle lev el. Stable charged particles ( cτ > 10 mm) from hadronic Z decays are used. This includes c harged pions, k aons, protons, electrons, and muons (and their an tiparticles). The thrust axis, computed from all stable particles (charged and neutral), divides each even t in to t wo hemispheres. This mixed definition — all-particle thrust axis but charged-particle-only Lund plane con tent — follows the standard LEP conv en tion for even t shap e measurements ([ ref-LEP:QCD:2004 ]) and introduces sensitivit y to neutral particle mo deling in the thrust axis, whic h is treated as a systematic uncertain ty . Eac h hemisphere is indep endently clustered using the Cam bridge/Aachen algorithm ([ ref-Dokshitzer:1997in ]) with R = π (full hemisphere cov erage, since each hemisphere subtends up to π radians from the thrust axis) and E-sc heme (four-vector) recombination. The implemen tation uses the fastjet Python bindings ([ ref-Cacciari:2011ma ]). 66 The clustering tree is walk ed iteratively: at each no de, the harder branch (higher energy) is follo wed, and the softer branch is recorded as an emission. F or eac h emission, the Lund plane coordinates are computed: x = ln(1 / ∆ θ ) , y = ln( k t / GeV) (30) where ∆ θ = arccos( ˆ p 1 · ˆ p 2 ) is the opening angle b et w een the harder and softer prongs, and k t = |  p soft | sin ∆ θ is the transv erse momen tum of the softer prong relative to the harder, with |  p soft | b eing the three-momen tum magnitude of the softer prong (consisten t with the Dreyer-Salam-So y ez definition for mas- siv e particles ([ ref-Drey er:2018nbf ])). The density ρ is normalized p er hemisphere per unit area in the Lund plane: ρ i = N unfolded ,i ϵ i · N hemi · ∆ i (31) where N unfolded ,i is the unfolded bin count, ϵ i is the per-bin reconstruction efficiency , N hemi is the total n umber of hemispheres after selection, and ∆ i is the bin width. This is a normalized shape measuremen t: the densit y is the a verage n um b er of emissions per hemisphere per unit area. Normalization-only systematics (luminosit y , total cross-section, trigger efficiency) therefore cancel. The primary results are one-dimensional pro jections: ln(1 / ∆ θ ) (10 bins in [0, 5]) and ln( k t / GeV) (12 bins in [-3, 3]). The tw o-dimensional Lund plane densit y is presented as a s upplemen tary visualization with do cumen ted limitations. A flat-prior gate (20% threshold) is applied p ost-unfolding to identify and exclude bins dominated by prior dep endence rather than data. The measurement is defined as ISR-inclusive at particle lev el. ISR photons are not remo ved from the ev ent before thrust computation, but ISR photons themselv es are not clustered (c harged particles only). Ev ents with hard ISR are remov ed by the ev ent selection cuts. B.1.3 Prior measuremen ts No prior measurement of the Lund jet plane exists in electron-p ositron collisions. The A TLAS Lund plane measuremen t ([ ref-A TLAS:2020bbn ]) in proton-proton collisions at √ s = 13 T eV is the primary refer- ence, though the different collision environmen t (b oosted jets with an ti- k t R = 0 . 4 and p T > 675 GeV vs.˜hemispheres at √ s = 91 . 2 GeV, pp vs.˜ e + e − ) limits direct n umerical comparison. LEP even t shape measuremen ts, particularly the comprehensive LEP QCD w orking group analysis ([ ref-LEP:QCD:2004 ]), pro vide the closest methodological reference using the same ALEPH detector and similar analysis techniques. The archiv ed ALEPH dataset used here is the same as in the tw o-particle cor- relation analysis by Baumgart ([ ref-Baumgart:thesis ]), which established the trac k and even t selection framew ork adopted in this analysis. B.1.4 Do cumen t o v erview Section 2 describ es the ALEPH detector and data samples. Section 3 details the even t and track selection criteria. Section 4 presen ts the Lund plane construction pro cedure. Section 5 describ es the correction pro cedure including the resp onse matrix, unfolding, and v alidation tests. Section 6 do cumen ts the systematic uncertain ties with p er-source subsections. Section 7 presen ts the v alidation tests. Section 8 gives the full data results. Section 9 compares to the generator prediction. Section 10 describes the statistical metho d and co v ariance construction. Section 11 summarizes the measurement and discusses the outlo ok. App endices pro vide per-bin systematic tables, cov ariance matrices, and machine-readable format descriptions. B.2 Detector and data samples B.2.1 The ALEPH detector at LEP The ALEPH (Apparatus for LEP Physics) detector op erated at the Large Electron-P ositron collider (LEP) at CERN from 1989 to 2000. The detector w as a general-purpose particle ph ysics exp erimen t designed for precision electrow eak measurements and searches for new physics at the Z resonance and ab o ve. The tracking system, critical for this analysis, consisted of three comp onen ts within a 1.5 T solenoidal magnetic field: 67 • Silicon V ertex Detector (VDET): Tw o lay ers of double-sided silicon strip detectors at radii of 6.3 and 10.8 cm from the b eam axis, providing precision vertexing with a resolution of appro ximately 12 µ m in r ϕ at the innermost lay er. • Inner T rac king Cham b er (ITC): An eight-la y er cylindrical drift cham ber spanning radii from 16 to 26 cm, pro viding up to 8 space p oin ts p er track and con tributing to the trigger. • Time Pro jection Cham ber (TPC): The primary trac king detector, a large cylindrical drift cham b er with inner radius 31 cm and outer radius 180 cm, providing up to 21 three-dimensional space p oin ts p er trac k and up to 320 ionization samples for particle identification via sp ecific energy loss ( dE /dx ). The TPC provided the dominant con tribution to the momentum measuremen t. The combined tracking system achiev ed a transverse momentum resolution of ∆ p T /p 2 T ≈ 1 . 2 × 10 − 3 (GeV /c ) − 1 ([ ref-ALEPH:EEC ]), providing excellen t charged-particle reconstruction for the energy range relev an t to Z decay products (t ypically 0.2–45 GeV). Bey ond the tracking volume, ALEPH featured an electromagnetic calorimeter (ECAL) of lead/wire- c hamber sandwic h construction with appro ximately 23 radiation lengths, a hadron calorimeter (HCAL) of iron/streamer-tub e design pro viding appro ximately 7 interaction lengths, and a muon spectrometer with t wo double lay ers of streamer tub es outside the HCAL. These outer detectors contribute to the energy-flow reconstruction used for the thrust axis computation but do not directly en ter the Lund plane measuremen t, whic h uses charged trac ks only . B.2.2 Data samples The data consist of hadronic Z deca y even ts collected by ALEPH during LEP1 running at the Z p ole ( √ s ≈ 91 . 2 GeV) from 1992 to 1995. The files are archiv ed R OOT ntuples with pre-computed selection flags, pro duced by the original ALEPH reconstruction softw are. The arc hived data represent the full LEP1 dataset collected by ALEPH during the high-luminosit y Z-pole running p eriod. T able 16: Data sample summary . The 1994 running p erio d is split in to three sub-p eriods (P1, P2, P3) corresp onding to different de- tector configurations. The total of appro ximately 3 million even ts pro vides approximately 5.8 million hemispheres for the Lund plane measuremen t. Y ear Even ts Size 1992 551,474 4.06 GB 1993 538,601 3.97 GB 1994-P1 433,947 3.18 GB 1994-P2 447,844 3.29 GB 1994-P3 483,649 3.55 GB 1995 595,095 4.38 GB T otal 3,050,610 22.4 GB The total in tegrated luminosity corresp onds to approximately 3 million hadronic Z decays, collected o ver four y ears of LEP1 operation. The center-of-mass energy is precisely kno wn from the LEP beam energy calibration using resonant dep olarization: √ s = M Z = 91 . 1880 ± 0 . 0020 GeV ([ ref-PDG:2024 ]). B.2.3 Mon te Carlo sim ulation The MC sample consists of PYTHIA 6 ev ents with full ALEPH detector sim ulation using the GEANT3 framew ork, reconstructed with the standard ALEPH reconstruction softw are. This is the same MC pro- duction used by Baumgart ([ ref-Baumgart:thesis ]) for the tw o-particle correlation analysis on the same arc hived dataset. 68 T able 17: Mon te Carlo sample prop erties. The MC/data ratio of 0.25 means the resp onse matrix has limited statistical precision, propagated as a named systematic uncertaint y via b ootstrap re- sampling. Prop ert y V alue Generator PYTHIA 6 Pro cess e + e − → Z → q ¯ q √ s 91.2 GeV F ragmentation Lund string mo del Detector sim. F ull ALEPH GEANT3 Reco even ts 771,597 Gen even ts 973,769 Files 40 MC/data ratio 0.253 The MC files con tain three trees: • t : Reconstructed-lev el particles with the same branch structure as data. • tgen : Generator-lev el particles after ev ent selection. • tgenBefore : Generator-lev el particles b efore ev ent selection cuts (973,769 even ts, used for efficiency studies). Limitation [L1]: The MC sample is appro ximately four times smaller than data, whic h limits the statistical precision of the response matrix. This is propagated as a named systematic uncertain ty via 200 b ootstrap replicas of the matched MC pairs (see Section 6.9). Limitation [L2]: No alternativ e generator (HER WIG, ARIADNE) with full ALEPH detector sim ulation is a v ailable. The hadronization systematic is estimated through particle-level reweigh ting rather than an indep enden t detector simulation. This is the dominant exp ected systematic for ev ent shape measuremen ts at LEP ([ ref-LEP:QCD:2004 ]), where hadronization corrections are typically 5–10%. B.3 Ev en t and trac k selection B.3.1 Ev ent selection Ev ents are selected using pre-computed bo olean flags that implemen t the standard ALEPH hadronic even t selection ([ ref-Baumgart:thesis ]; [ ref-ALEPH:quark˙gluon ]). The selection criteria closely follo w those used in previous ALEPH analyses of hadronic Z deca ys, ensuring consistency with the established selection framew ork for this dataset. The individual selection cuts and their efficiencies are: Charged multiplicit y ( N ch ≥ 5 ): This requiremen t remov es leptonic Z decays ( Z → ℓ + ℓ − ) and p o orly reconstructed even ts. It is implemented via the passesNTrkMin flag. The efficiency on hadronic even ts is approximately 99%, with the remaining 1% consisting of ev ents with catastrophic tracking failures or extreme forward topologies. Charged energy ( E ch ≥ 15 GeV): The total energy of accepted c harged tracks must exceed 15 GeV, implemen ted via passesTotalChgEnergyMin . This cut remo ves tw o-photon even ts ( γ γ → hadrons) that ha ve lo wer visible energy . Efficiency is approximately 99%. Sphericit y axis angle ( | cos θ sph | ≤ 0 . 82 ): Ev en ts are required to hav e their sphericity axis w ell within the detector acceptance. This is the most restrictive single cut, with appro ximately 96% efficiency , ensuring that the even t is w ell-contained within the TPC acceptance for reliable trac k reconstruction. Missing momentum ( p miss ≤ 20 GeV/c): Even ts with large missing momentum are remov ed. This cut targets p o orly reconstructed even ts and b eam-related backgrounds. Efficiency is approximately 99%. ISR v eto: Even ts with hard initial-state radiation photons are remo ved. At LEP1 energies, the ISR rate is small, and the veto efficiency is approximately 99%. 69 WW v eto: At LEP1 energies ( √ s = 91 . 2 GeV), WW production is kinematically forbidden (2 M W ≈ 161 GeV). This cut has no effect on LEP1 data but is included for completeness of the selection framework. Neutral+c harged m ultiplicity ( N c h+neu ≥ 13 ): A combined m ultiplicity requirement using both c harged trac ks and neutral energy-flow ob jects. Efficiency is appro ximately 99%. The combined selection is applied via the passesAll flag: T able 18: Even t selection criteria and per-cut efficiencies. The com bined efficiency of 94.7% is stable across all data y ears and consisten t betw een data and MC. Cut Flag Efficiency N ch ≥ 5 passesNTrkMin ˜99% E ch ≥ 15 GeV passesTotalChgEnergyMin ˜99% | cos θ sph | ≤ 0 . 82 passesSTheta ˜96% p miss ≤ 20 GeV/c passesMissP ˜99% ISR veto passesISR ˜99% WW veto passesWW ˜100% N ch+neu ≥ 13 passesNeuNch ˜99% Com bined passesAll 94.7% Cutflo w T able 19: Cutflow table for data and MC. The selection efficiency is iden tical betw een data and MC (94.7%), confirming that the MC accurately mo dels the ev ent selection. The difference b et w een reco and gen mean declusterings p er hemisphere (5.07 vs 6.10) reflects detector inefficiency for soft and forward emissions. Stage Data MC (reco) T otal even ts 3,050,610 771,597 After passesAll 2,889,543 (94.7%) 731,006 (94.7%) Hemispheres 5,779,086 1,462,012 Primary decl. 28,984,792 7,415,048 (reco) Mean decl./hemi 5.02 5.07 (reco) / 6.10 (gen) Matc hed pairs – 4,842,848 Bac kgrounds The bac kground con tamination after the full even t selection is negligible: T able 20: Background classification and treatment. The only rele- v an t bac kground is Z → τ + τ − , whic h is suppressed to b elo w 0.1% b y the N ch ≥ 5 requirement. Z → b ¯ b even ts (˜22% of hadronic Z deca ys) are not a background but part of the signal; the measure- men t is inclusive o ver all quark flav ors. Bac kground T yp e Residual fraction T reatment Z → τ + τ − Irreducible < 0.1% MC subtraction γ γ → hadrons Reducible Negligible Remo ved b y E ch cut Beam-gas, b eam-w all Instrumental Negligible Remo ved b y p miss cut B.3.2 T rack selection Charged tracks are selected with tw o criteria: 70 • pwflag == 0 : selects c harged trac ks from the primary vertex reconstruction, excluding neutral clusters and photon conv ersions. • highPurity == 1 : selects high-purit y trac ks from the ALEPH reconstruction softw are, implicitly en- forcing minimum trac king quality requiremen ts including a TPC hit threshold. No explicit transverse momen tum cut is applied at detector lev el; the particle-lev el definition uses full ac- ceptance and the resp onse matrix handles the correction. The highPurity flag implicitly enforces minimum trac king quality requiremen ts including a TPC hit threshold, though the exact threshold is enco ded within the reconstruction softw are. An explicit n TPC ≥ 4 cut was used as a systematic v ariation (see Section 6.5). Mean selected tracks p er ev ent: 17.5. B.3.3 Hemisphere assignmen t P articles are assigned to hemispheres based on the sign of  p · ˆ n T , where ˆ n T is the thrust axis computed from the TTheta and TPhi branc hes. F or data and reco-level MC, the all-particle thrust axis (using b oth c harged tracks and neutral energy-flow ob jects) is used, following the standard ALEPH con ven tion. F or generator-lev el MC, the thrust axis is recomputed from all generated stable particles (charged and neutral, with cτ > 10 mm). The use of the all-particle thrust axis (rather than charged-only) minimizes hemisphere migration effects, since the thrust axis is better determined when neutral particles are included. The sensitivit y to neutral particle mo deling in the thrust axis determination is ev aluated as a systematic uncertain ty (Section 6.7). B.4 Observ able definition: Lund plane construction B.4.1 Cam bridge/Aachen clustering Eac h hemisphere’s selected c harged tracks are clustered indep enden tly using the Cambridge/Aac hen (C/A) algorithm ([ ref-Dokshitzer:1997in ]) with R = π (full hemisphere cov erage since e ac h hemisphere subtends up to π radians from the thrust axis) and E-sc heme (four-v ector) recombination. The C/A algorithm is an angular-ordered clustering algorithm defined b y the distance measure d ij = 2(1 − cos θ ij ) b etw een particles i and j , where θ ij is the angle b et ween their momen ta. The b eam distance is d iB = 2(1 − cos R ) with R = π , so that all particles in the hemisphere are clustered into a single jet. The implementation uses the fastjet Python bindings ([ ref-Cacciari:2011ma ]). The C/A algorithm is chosen b ecause its angular ordering pro duces a clustering tree whose primary declustering sequence corresp onds directly to the primary Lund plane ([ ref-Drey er:2018nbf ]). This is the same algorithm used b y the A TLAS Lund plane measuremen t ([ ref-A TLAS:2020bbn ]), ensuring method- ological consistency . B.4.2 Primary declustering The primary declustering pro cedure w alks the C/A clustering tree iterativ ely: 1. Start from the final merged jet (the hemisphere). 2. A t each node, iden tify the harder and softer prongs by p T . 3. Record the softer prong’s Lund plane co ordinates: • ∆ θ = arccos( ˆ p 1 · ˆ p 2 ) • k t = |  p soft | sin(∆ θ ) • Coordinates: x = ln(1 / ∆ θ ), y = ln( k t / GeV) 4. F ollow the harder prong to the next no de. 5. Con tin ue until no further structure (single particle). This pro cedure yields an a verage of 5.02 primary declusterings p er hemisphere at detector level (5.07 in MC reco, 6.10 at generator level). The difference b et ween reco and gen reflects the loss of soft and forward emissions to tracking inefficiency . 71 Axis 0 1 0 5 1 0 6 1 0 7 Tracks p s = 9 1 . 2 G e V ALEPH PYTHIA 6 MC ALEPH Data 0 5 10 15 20 T r a c k p T [ G e V ] 0.9 1.0 1.1 Data / MC Figure 24: T rack transverse momentum distribution for data (black p oin ts) and PYTHIA 6 MC (red his- togram), using the full ALEPH dataset. The steeply falling spectrum extends from appro ximately 0.2 GeV to o ver 20 GeV, with the MC repro ducing the data well across the entire range. The ratio panel sho ws agreemen t within 5%, v alidating the MC mo deling of the track momen tum spectrum. 72 Axis 0 0.00 0.25 0.50 0.75 1.00 1.25 1.50 1.75 2.00 Tracks 1e6 p s = 9 1 . 2 G e V ALEPH PYTHIA 6 MC ALEPH Data 0.5 1.0 1.5 2.0 2.5 3.0 T r a c k [ r a d ] 0.9 1.0 1.1 Data / MC Figure 25: T rack polar angle θ distribution for data and MC, using the full ALEPH dataset. The distribution sho ws the exp ected barrel-p eak ed structure with depletion at forward and bac kward angles ( θ near 0 or π ) due to the b eam pip e and detector acceptance. Data/MC agreement is within 2% in the barrel region and 5% at the edges. 73 Axis 0 0 100000 200000 300000 400000 500000 Events p s = 9 1 . 2 G e V ALEPH PYTHIA 6 MC ALEPH Data 0.6 0.7 0.8 0.9 1.0 Thrust 0.9 1.0 1.1 Data / MC Figure 26: Thrust distribution for data and MC using the full dataset. The distribution spans from 0.5 (spherical, m ulti-jet ev ents) to 1.0 (p encil-like, tw o-jet even ts). The peak near T = 1 reflects the predom- inan tly tw o-jet top ology of hadronic Z decays. Data/MC agreement is within 2–3% across the full range, v alidating the MC mo deling of the global even t shap e that determines hemisphere assignmen t. 74 Axis 0 0 100000 200000 300000 400000 500000 600000 700000 Hemispheres p s = 9 1 . 2 G e V ALEPH PYTHIA 6 MC ALEPH Data 0 5 10 15 20 25 30 Hemisphere charged multiplicity 0.9 1.0 1.1 Data / MC Figure 27: Hemisphere charged multiplicit y for data and MC using the full dataset. The distribution of selected tracks p er hemisphere peaks near 8–9 and extends to appro ximately 20. Data/MC agreemen t is within 3% across the range 2–20 tracks, confirming that the MC repro duces the particle multiplicit y that en ters the C/A clustering. 75 B.4.3 Binning and kinematic b oundary The one-dimensional pro jections use: • ln(1 / ∆ θ ): [0, 5] in 10 bins of width 0.5 • ln( k t / GeV): [-3, 3] in 12 bins of width 0.5 The Lund plane has a hard kinematic b oundary: ln k t < ln( E beam sin ∆ θ ), or equiv alently ln k t + ln(1 / ∆ θ ) < ln E beam ≈ ln(45 . 6) ≈ 3 . 82. This diagonal b oundary in the upper-right corner of the plane limits the av ailable phase space. The t wo-dimensional Lund plane has 120 total bins (10 × 12), of whic h 79 are active after applying the kinematic b oundary and minimum MC p opulation criteria ( ≥ 50 even ts per bin in b oth reco and gen pro jections). F or the primary one-dimensional pro jections, all 10 (angular) and 12 ( k t ) bins are used for the unfolding, with a flat-prior gate applied p ost-unfolding to identify bins dominated by prior dep endence (sec. B.7.3 ). B.4.4 Detector-lev el Lund plane B.5 Correction pro cedure The full correction c hain from raw detector-lev el counts to the final particle-lev el density pro ceeds in five sequen tial steps: 1. Ra w detector-lev el histogram: The Lund plane co ordinates are computed for eac h primary declus- tering in data, and entries are histogrammed into the analysis bins to obtain h raw ,i . 2. F ak e subtraction: Unmatc hed reco-level declusterings (fak es) are subtracted using MC-deriv ed fake fractions (eq. 32 ): h corrected ,i = h raw ,i × (1 − f fake ,i ). 3. Iterativ e Ba yesian Unfolding: The fake-subtracted histogram is unfolded through the column- normalized resp onse matrix R ij using IBU (eq. 34 ) with the nominal iteration count (5 for ln(1 / ∆ θ ), 6 for ln( k t )), yielding the unfolded bin counts N unfolded ,i . 4. Efficiency correction: The unfolded counts are corrected for the p er-bin reconstruction efficiency ϵ i (the fraction of gen-lev el dec lusterings with a reco-level matc h). 5. Densit y normalization: The efficiency-corrected counts are normalized to the Lund plane densit y p er hemisphere p er unit bin width (eq. 31 ): ρ i = N unfolded ,i / ( ϵ i · N hemi · ∆ i ). Eac h step is describ ed in detail in the subsections below. B.5.1 Resp onse matrix construction Input v alidation Before constructing the resp onse matrix, data/MC comparisons were pro duced for all kinematic v ariables en tering the observ able calculation, as required by the con v entions for unfolded measure- men ts. The comparisons use the full ALEPH dataset (2.89 million selected even ts) and the full MC sample (731,006 selected even ts), with the MC normalized to the data in tegral for shap e comparison. The v alidation cov ers: track p T (fig. 24 ), trac k θ (fig. 25 ), thrust (fig. 26 ), hemisphere m ultiplicity (fig. 27 ), and the Lund plane co ordinates at detector lev el (fig. 30 , fig. 31 ). Additional input v alidation plots from Phase 3 are collected in App endix E. Summary: Data/MC agreemen t is within 5% across all kinematic v ariables. No systematic trends are observ ed that would indicate significant mismo deling of the detector resp onse. This v alidates the use of PYTHIA 6 MC for resp onse matrix construction. 76 0 1 2 3 4 5 l n ( 1 / ) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) p s = 9 1 . 2 G e V ALEPH 0.0 0.1 0.2 0.3 0.4 0.5 0.6 Figure 28: Tw o-dimensional Lund plane density measured in the full ALEPH dataset at detector level (uncorrected), from 28.98 million primary declusterings in 5.78 million hemispheres. The characteristic triangular shap e is b ounded by the kinematic limit ln( k t ) + ln(1 / ∆ θ ) < 3 . 82 (dashed line). The densit y is highest in the region of moderate angles (ln(1 / ∆ θ ) ≈ 1–2) and moderate transv erse momentum (ln k t ≈ − 0 . 5 to 0.5), corresp onding to QCD emissions at characteristic hadronization-scale momenta and t ypical jet op ening angles. This detector-lev el distribution cannot b e directly compared to particle-lev el predictions without unfolding. 77 0 1 2 3 4 5 l n ( 1 / ) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) p s = 9 1 . 2 G e V ALEPH Simulation 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 Figure 29: Two-dimensional Lund plane densit y in PYTHIA 6 MC at detector lev el (uncorrected), from 7.42 million primary declusterings. The MC repro duces the o verall triangular structure and densit y pattern observ ed in data, confirming the adequacy of the PYTHIA 6 simulation for resp onse matrix construction. Quan titative comparison is pro vided through the one-dimensional pro jections. As with the data figure, this is a detector-level distribution and cannot b e directly compared to particle-level predictions. 78 0 1 2 3 4 Entries 1e6 p s = 9 1 . 2 G e V ALEPH MC (scaled) Full data 0 1 2 3 4 5 l n ( 1 / ) ( d e t e c t o r l e v e l ) 0.8 1.0 1.2 Data / MC Figure 30: Detector-level ln(1 / ∆ θ ) pro jection for data (black) and MC (red), using the full dataset. The angular distribution p eaks near ln(1 / ∆ θ ) ≈ 1–2, corresponding to op ening angles of 20–60 degrees. Data/MC agreemen t is within 5% across all bins, with the data sitting sligh tly b elo w the MC at small ln(1 / ∆ θ ) (wide angles) and slightly ab o v e at large ln(1 / ∆ θ ) (narrow angles). 79 0 1 2 3 4 5 6 Entries 1e6 p s = 9 1 . 2 G e V ALEPH MC (scaled) Full data -3 -2 -1 0 1 2 3 l n ( k t / G e V ) ( d e t e c t o r l e v e l ) 0.8 1.0 1.2 Data / MC Figure 31: Detector-level ln( k t / GeV) pro jection for data and MC, using the full dataset. The transv erse momen tum sp ectrum p eaks near ln k t ≈ − 0 . 5 to 0, corresponding to k t ≈ 0 . 5–1 GeV at the boundary b et w een the perturbative and non-p erturbativ e regimes. Data/MC agreement is within 5% across the full range from ln k t = − 3 (50 MeV) to ln k t = 3 (20 GeV). 80 Construction pro cedure The resp onse matrix maps generator-level Lund plane bins to reco-level bins, constructed from hemisphere-matched reco-gen MC pairs. The matching procedure uses: 1. Same even t: Reco and gen trees share the same even t indexing in the ROOT files, guaranteeing ev ent-lev el matching. 2. Same hemisphere: Within each even t, reco and gen declusterings are assigned to hemispheres using their resp ective thrust axes. Reco declusterings in the p ositiv e-thrust hemisphere are matc hed to gen declusterings in the p ositiv e-thrust hemisphere (and similarly for negative). 3. Index-based declustering matc hing: The n -th reco declustering within a hemisphere is matc hed to the n -th gen declustering within the same hemisphere. This index-based matc hing is an approximation. Hemispheres where the C/A tree structure differs b et w een reco and gen levels (due to resolution effects creating or remo ving clustering no des) may pro duce mismatc hed en tries, which broaden the off-diagonal elements of the resp onse matrix. A geometry-based matc hing (nearest in ∆ θ - k t space) could p oten tially improv e the diagonal fraction but was not implemented; the index-based matching is sufficient for the level of precision achiev ed by this measurement. The matching yields 4,842,848 reco-gen declustering pairs from 731,006 selected MC even ts. F or the one-dimensional pro jections, dedicated 1D resp onse matrices are constructed by pro jecting the matched reco-gen pairs on to eac h coordinate independently . The column-normalized response matrix R ij = P (reco bin i | gen bin j, matched) represen ts the probabilit y that a gen-level declustering in bin j is reconstructed in reco bin i . Eac h column sums to 1.0 b y construction. Efficiency corrections (for gen declusterings with no reco match) are applied separately after unfolding. Resp onse matrix prop erties T able 21: Resp onse matrix prop erties for the 1D pro jections and the 2D Lund plane. The 1D pro jections hav e dramatically b etter condition num b ers (6 and 10 vs.˜2 . 75 × 10 18 for 2D), making them the primary results. The diagonal fraction of 40% for the angu- lar pro jection and 31% for k t indicates significan t bin migration requiring prop er unfolding. Prop ert y ln(1 / ∆ θ ) ln( k t ) 2D (79 bins) Dimension 10 x 10 12 x 12 79 x 79 Diagonal fraction 0.401 0.313 0.216 Condition num b er 6.07 10.3 2 . 75 × 10 18 Mean efficiency 0.490 0.480 0.65 Efficiency range 0.087–0.851 0.067–0.636 0.28–1.00 The condition num b ers of 6 and 10 for the 1D pro jections represen t a 17-order-of-magnitude impro vemen t o ver the tw o-dimensional resp onse matrix, making both IBU and matrix inv ersion metho ds (SVD) viable for the 1D pro jections. The 2D unfolding is not feasible at the prop osed 10 × 12 binning due to the extreme ill-conditioning. F ake rate correction Before unfolding, the data detector-level histogram is corrected for unmatc hed (“fak e”) reco declusterings. These are reco-level declusterings that ha ve no matching gen-lev el counterpart, arising from resolution effects that create spurious entries in the Lund plane (e.g., a single gen-lev el emission that is reconstructed as tw o separate emissions due to track resolution). The fake fraction is estimated from MC as: f fake ,i = N reco,all ,i − N reco,matched ,i N reco,all ,i (32) 81 0 1 2 3 4 5 G e n l n ( 1 / ) 0 1 2 3 4 5 R e c o l n ( 1 / ) p s = 9 1 . 2 G e V ALEPH Simulation 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 P(reco bin | gen bin) Figure 32: One-dimensional response matrix for the ln(1 / ∆ θ ) pro jection. The 10-bin resp onse matrix is displa yed with gen-level bins on the x-axis and reco-level bins on the y-axis. Clear diagonal dominance is visible with a nearest-neighbor migration pattern, reflecting the angular resolution of the tracking system. The condition n umber of 6.1 indicates a well-conditioned system suitable for b oth iterativ e and matrix in version unfolding metho ds. 82 -3 -2 -1 0 1 2 3 G e n l n ( k t / G e V ) -3 -2 -1 0 1 2 3 R e c o l n ( k t / G e V ) p s = 9 1 . 2 G e V ALEPH Simulation 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 P(reco bin | gen bin) Figure 33: One-dimensional resp onse matrix for the ln( k t ) pro jection. The 12-bin resp onse matrix shows broader migration than the angular pro jection, consisten t with the larger resolution in ln k t (standard de- viation 1.30 vs.˜0.80 for ln(1 / ∆ θ )). The condition num b er of 10.3 remains tractable for regularized matrix in version (SVD). 83 The corrected data histogram is: h corrected ,i = h raw ,i × (1 − f fake ,i ) (33) T able 22: F ak e fraction ranges for the tw o pro jections. The fake fraction is highest at the kinematic edges where resolution effects cause declusterings to scatter into or out of the activ e region with- out a gen-level coun terpart. Pro jection F ak e fraction range Central bins ln(1 / ∆ θ ) 15%–80% 15–25% ln( k t ) 22%–96% 24–39% Efficiency The per-bin efficiency ϵ i represen ts the fraction of gen-level declusterings that hav e a matc hing reco-lev el counterpart. It v aries from 8.7% at the highest ln(1 / ∆ θ ) bins (smallest angles, hardest to recon- struct) to 85.1% in the cen tral angular bins. The efficiency is applied after unfolding as a multiplicativ e correction in the densit y formula (eq. 31 ). B.5.2 Iterativ e Ba yesian Unfolding The primary unfolding method is Iterative Ba yesian Unfolding (IBU) ([ ref-DAgostini:1994fjx ]). IBU is a w ell-established metho d for correcting detector effects in binned distributions, widely used in particle ph ysics since its in tro duction. The algorithm iteratively up dates the estimate of the true distribution using Bay es’ theorem: ˆ n ( k +1) j = X i R ij ˆ n ( k ) j P j ′ R ij ′ ˆ n ( k ) j ′ d i (34) where R ij = P (reco bin i | gen bin j ) is the column-normalized resp onse matrix, d i is the measured (fak e-subtracted) data in reco bin i , and ˆ n ( k ) j is the estimate at iteration k . The prior for the first iteration is the matc hed gen histogram from MC, normalized and scaled to the corrected data total. The num b er of iterations serv es as the regularization parameter: to o few iterations lea ve the result biased tow ard the prior, while to o man y amplify statistical noise. The optimal num ber is determined by the closure test. IBU was selected as the primary metho d because: 1. It naturally handles multi-dimensional unfolding b y treating 2D bins as a flattened vector. 2. It prop erly accounts for bin migrations, unlik e bin-by-bin correction factors (which require diagonal fractions > 70%). 3. It is established at LEP ([ ref-LEP:QCD:2004 ]) and w as used b y A TLAS for the first Lund plane measuremen t ([ ref-A TLAS:2020bbn ]). 4. It is transparen t and repro ducible, requiring only the response matrix and an initial prior. Iteration selection The nominal iteration coun t is c hosen as the v alue where the closure test χ 2 / ndf crosses 1.0 from below. Below this v alue, the result is under-corrected (biased to ward the prior); ab o ve it, statistical noise amplification dominates. 84 0 20 40 60 80 Active bin index 0.0 0.2 0.4 0.6 0.8 1.0 Efficiency p s = 9 1 . 2 G e V ALEPH Simulation Mean = 0.647 Figure 34: Efficiency as a function of gen-lev el bin index for the t w o-dimensional Lund plane (79 activ e bins). The efficiency v aries from 28% at the kinematic edges to 100% in the most central bins. Low er efficiencies are concen trated at high ln(1 / ∆ θ ) (small angles) and low ln k t (soft emissions), where tracking resolution smears declusterings out of the active region. 85 0 1 2 3 4 5 l n ( 1 / ) -3 -2 -1 0 1 2 3 4 l n ( k t / G e V ) p s = 9 1 . 2 G e V ALEPH Simulation 0.0 0.2 0.4 0.6 0.8 1.0 Efficiency Figure 35: Two-dimensional efficiency map in the Lund plane coordinates. The color scale sho ws the fraction of gen-lev el declusterings that are successfully matc hed to a reco-lev el coun terpart in each bin. The efficiency is low est in the low er-right corner of the plane (small angles, soft emissions) and highest in the central region where tracking performance is b est. 86 T able 23: Nominal iteration coun t and closure test results for eac h pro jection. The p-v alues are well abov e 0.05, confirming adequate closure at the nominal regularization. Pro jection Nominal iter. Closure χ 2 /ndf p-v alue ln(1 / ∆ θ ) 5 0.918 0.515 ln( k t ) 6 0.963 0.482 B.5.3 Alternativ e metho d: truncated SVD T runcated Singular V alue Decomp osition (SVD) unfolding is used as the alternative metho d for the 1D pro jections, providing a metho dologically indep enden t cross-chec k of the IBU result. SVD decomp oses the resp onse matrix as R = U Σ V T and in verts it with truncation at a sp ecified n umber of singular v alues to regularize the solution. T able 24: SVD closure and stress test results. SVD sho ws w orse closure than IBU (chi2/ndf of 12–25 vs ˜1) but comparable or b et- ter stress test p erformance (chi2/ndf of 9–11 vs 286–4324), indi- cating less prior dep endence at the cost of more statistical noise. Pro jection n sv Closure χ 2 /ndf Stress χ 2 /ndf ln(1 / ∆ θ ) 9 11.9 9.4 ln( k t ) 12 24.6 10.7 The SVD closure χ 2 / ndf v alues of 11.9 and 24.6 indicate that SVD do es not con verge to the truth at the stated truncation levels. The IBU-vs-SVD difference therefore primarily measures SVD’s failure to close rather than a genuine method-dep endence uncertain ty . The alternativ e-metho d systematic (tbl. 35 ) is consequen tly an o v erestimate of the true metho d dep endence and should b e in terpreted as a conserv ative upp er b ound. SVD is retained as a cross-chec k to confirm that a metho dologically indep enden t approach pro duces results in the same ballpark, not as a precision comparison. SVD is viable for the 1D pro jections (condition n umbers 6 and 10) but not for the 2D unfolding (condition n umber 2 . 75 × 10 18 ), where all truncation levels from 2 to 28 singular v alues pro duce χ 2 / ndf > 10 6 . B.6 Systematic uncertainties Tw elve systematic sources are ev a luated for eac h pro jection, follo wing the analysis strategy (Phase 1 STRA T- EGY.md, Section 7) and the con ven tions for unfolded measuremen ts. Eac h source is described in a dedicated subsection below, co vering the physical origin, ev aluation metho d, and n umerical impact on the surviving bins. A summary of the systematic budget and a completeness table comparing to reference analyses are pro vided at the end of this section. All systematic uncertainties are deriv ed from the MC simulation and are therefore indep enden t of the data statistics. The systematic shift vectors are defined as the difference in the unfolded density betw een the nominal and the v aried configuration. The systematic cov ariance con tribution from eac h source is the outer pro duct of its shift v ector: C s = δ s ⊗ δ s . B.6.1 Prior dependence Ph ysical origin: IBU uses the MC truth distribution as the initial prior for the iterativ e Ba yesian update. A t the nominal 5–6 iterations, the result retains sensitivity to this prior, esp ecially in bins with lo w diagonal fraction where the data has limited pow er to o verride the prior assumption. Ev aluation metho d: The systematic is ev aluated as the bin-b y-bin difference betw een the nominal result (MC truth prior) and the result obtained with a flat (uniform) prior. This provides a conserv ativ e 87 2 4 6 8 10 IBU iterations 0.0 0.5 1.0 1.5 2.0 2 / n d f p s = 9 1 . 2 G e V ALEPH Simulation 2 / n d f = 1 Nominal = 5 Figure 36: IBU closure test χ 2 /ndf as a function of the n umber of iterations for the ln(1 / ∆ θ ) pro jection. The χ 2 /ndf increases monotonically from near zero at 1 iteration (strong regularization, high prior dep endence) to v alues exceeding 1 at 6+ iterations (weak regularization, noise amplification). The optimal p oin t near 5 iterations balances regularization bias against noise. 88 2 4 6 8 10 IBU iterations 0.0 0.5 1.0 1.5 2.0 2 / n d f p s = 9 1 . 2 G e V ALEPH Simulation 2 / n d f = 1 Nominal = 6 Figure 37: IBU closure test χ 2 /ndf as a function of iterations for the ln( k t ) pro jection. The χ 2 /ndf crosses 1.0 near 6 iterations, one iteration more than the angular pro jection, consistent with the broader migration pattern (diagonal fraction 31% vs 40%) requiring slightly more deregularization to achiev e conv ergence. 89 0.0 0.2 0.4 0.6 0.8 1.0 Declusterings per bin 1e6 p s = 9 1 . 2 G e V ALEPH Simulation MC truth IBU (5 iter) SVD (9 sv) 0 1 2 3 4 5 l n ( 1 / ) 0.8 1.0 1.2 Unfolded / Truth IBU SVD Figure 38: IBU vs SVD comparison for the ln(1 / ∆ θ ) pro jection. Both methods are compared to MC truth in the main panel and ratio. IBU (5 iterations, blue) shows closer agreemen t with truth (by construction of the iteration selection), while SVD (9 singular v alues, green) shows larger statistical fluctuations but pro vides a v aluable metho dological cross-c heck. The difference b et ween the t wo metho ds con tributes to the alternative metho d systematic. 90 0.0 0.2 0.4 0.6 0.8 1.0 Declusterings per bin 1e6 p s = 9 1 . 2 G e V ALEPH Simulation MC truth IBU (6 iter) SVD (12 sv) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 0.8 1.0 1.2 Unfolded / Truth IBU SVD Figure 39: IBU vs SVD comparison for ln( k t ). IBU (6 iterations) and SVD (12 singular v alues) b oth recov er the truth distribution, with IBU pro ducing a smo other result. The SVD oscillations are more pronounced in the tails where the truncation discards physically relev ant singular v alue comp onen ts. 91 upp er b ound on the prior dep endence, since the true data distribution is closer to the MC prior than to a flat distribution. V ariation size justification: The flat prior represents the maximally uninformativ e assumption and pro vides the standard prior-dep endence en velope. This is the standard method used by A TLAS for the Lund plane measurement ([ ref-A TLAS:2020bbn ]) and recommended by the analysis conv entions. Impact on surviving bins: T able 25: Prior dep endence systematic on surviving bins. This is the dominan t systematic, reflecting the fundamental regularization-bias tradeoff of IBU at the av ailable diagonal frac- tions. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.283 93% ln( k t ) 0.254 87% In terpretation: The prior dependence is large because the 1D response matrices ha ve diagonal fractions of only 31–40%, meaning that 60–69% of declusterings migrate b et ween bins. A t 5–6 IBU iterations, the metho d has not fully con verged to the data-driven solution and retains significan t memory of the MC prior. The flat-prior gate (sec. B.7.3 ) remov es bins where this effect exceeds 20%, ensuring that the rep orted bins ha ve manageable (though still dominan t) prior dep endence. B.6.2 Hadronization model Ph ysical origin: The resp onse matrix is constructed from PYTHIA 6 MC with Lund string fragmen tation. Alternativ e hadronization mo dels — in particular the cluster fragmentation mo del used b y HER WIG — w ould pro duce different particle-lev el distributions and different detector resp onses. This is historically the dominan t systematic for even t shape measurements at LEP , where hadronization corrections are typically 5–10% ([ ref-LEP:QCD:2004 ]). Ev aluation metho d: An approximate reweigh ting pro cedure is used in the absence of alternativ e MC with full detector simulation. The gen-lev el PYTHIA 6 distribution is rew eighted in tw o-dimensional bins of (charged multiplicit y , thrust) to approximate the exp ected difference b et ween string and cluster fragmen tation models. The rew eighting function is: w ( N ch , T ) = 1 + α ×  N ch − ⟨ N ch ⟩ σ N ch  + β ×  T − ⟨ T ⟩ σ T  where α = − 0 . 10 (suppressing high-m ultiplicity even ts) and β = +0 . 08 (enhancing spherical ev ents), calibrated to repro duce the known PYTHIA-vs-HER WIG differences in m ultiplicity and thrust at the Z p ole. F or ln( k t ), hard emissions are suppressed b y approximately 10%; for ln(1 / ∆ θ ), the distribution is shifted tow ard wider angles b y approximately 8%. Imp ortan t limitation: Only the prior/truth distribution is rew eighted; the resp onse matrix (condi- tional probability of reco given truth) is not mo dified. This means the v ariation is effectiv ely an additional prior-dep endence test rather than a true hadronization systematic, which w ould require a second generator with full detector simulation to prob e the interaction of different fragmentation pro ducts with the ALEPH detector material. With only PYTHIA 6 a v ailable, a gen uine hadronization systematic is not achiev able; the rew eighting approac h is the b est av ailable appro ximation. V ariation size justification: The reweigh ting magnitudes (8–10%) are order-of-magnitude estimates calibrated to published comparisons of PYTHIA vs HER WIG at the Z p ole ([ ref-LEP:QCD:2004 ]), sp ecif- ically the spread of hadronization corrections across generators shown in the LEP QCD combination anal- ysis (Figures 6–8 of that reference). No single table provides these magnitudes directly; they are inferred from the range of generator predictions for thrust and hea vy jet mass. The choice of (m ultiplicity , thrust) as reweigh ting v ariables captures the global ev en t shape and multiplicit y differences that are the primary driv ers of fragmentation model v ariations. 92 Impact on surviving bins: T able 26: Hadronization mo del systematic on surviving bins. This is the second-largest source, significan t despite being an approxi- mate estimate. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.114 38% ln( k t ) 0.135 47% Limitation [L2]: No HER WIG or ARIADNE MC with full ALEPH detector sim ulation is a v ailable. The particle-lev el rew eighting approach captures the dominan t effect of different fragmentation mo dels on the truth-lev el distribution, but it do es not mo dify the detector resp onse model (the conditional probability of reco given truth). A full alternative detector simulation w ould also prob e the interaction of differen t fragmen tation products with the ALEPH detector material. This limitation is consisten t with the ap- proac h tak en in LEP even t shap e analyses, where hadronization corrections w ere the dominant systematic ([ ref-LEP:QCD:2004 ]). B.6.3 Selection cuts Ph ysical origin: V ariations in the even t-level selection cuts change the even t sample comp osition and acceptance, potentially biasing the unfolded distribution if the acceptance is correlated with the observ able. Ev aluation metho d: No individual selection cut v ariation was propagated through the full analysis c hain; the 2% is a parametric estimate calibrated to the exp ected size of selection cut v ariations from the reference analyses ([ ref-Baumgart:thesis ]; [ ref-LEP:QCD:2004 ]). A flat 2% shift is applied to the nominal density in all bins. V ariation size justification: Baumgart ([ ref-Baumgart:thesis ]) v aried individual selection cuts ( E ch from 15 to 10 GeV, N trk from 5 to 7, | cos θ sph | from 0.82 to 0.75 and 0.90) and found effects at the 0.3–2% lev el on tw o-particle correlations. The 2% estimate is a conserv ativ e upp er b ound. Impact on surviving bins: T able 27: Se lection cut systematic. Collectiv ely sub dominan t to prior dep endence by an order of magnitude. The flat-estimate ap- pro ximation is acceptable given that prior dep endence con tributes 87–93% of the total systematic budget; even a factor-of-tw o error in the selection cut systematic would change the total b y less than 1%. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.036 12% ln( k t ) 0.028 10% B.6.4 T racking efficiency Ph ysical origin: The track reconstruction efficiency dep ends on the trac k kinematics ( p T , θ , n umber of TPC hits). Inefficiencies mo dify the hemisphere particle con tent and therefore the C/A clustering tree and Lund plane co ordinates. Ev aluation metho d: A 3% angle-dep enden t track remov al is applied in MC: for eac h MC reco track, a random num b er is drawn and the trac k is remov ed with probability 0 . 03 × (1 + | cos θ | ), pro viding higher remo v al rates at forward/bac kward angles where trac king efficiency is lo west. The analysis c hain is re-run with the reduced trac k s ample and the difference in the unfolded density is taken as the systematic. V ariation size justification: The 3% remov al rate is calibrated to the tracking efficiency studies in ([ ref-Baumgart:thesis ]), where the TPC hit requiremen t w as v aried from 4 to 7 hits and the effect on 93 the analysis was at the 0.3–0.7% lev el. The 3% is a conserv ativ e upp er b ound accoun ting for the angle dep endence of the tracking efficiency . Impact on surviving bins: T able 28: T rac king efficiency systematic. The k t pro jection is more sensitiv e because soft emissions (lo w k t ) are produced b y soft trac ks that are most susceptible to tracking inefficiency . Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.018 6% ln( k t ) 0.033 11% B.6.5 Angular resolution Ph ysical origin: The angular measurement of c harged trac ks has finite resolution, which smears the op ening angle ∆ θ b etw een prongs at each declustering step. This directly affects the ln(1 / ∆ θ ) co ordinate and, through k t = |  p soft | sin ∆ θ , the transverse momentum coordinate as well. Ev aluation method: A 2% Gaussian smearing is applied to the reco-level track directions ( θ and ϕ ) in MC, corresp onding to the exp ected angular resolution of the ALEPH TPC. The analysis chain is re-run with smeared directions and the difference in unfolded density is taken as the systematic. V ariation size justification: The ALEPH TPC angular resolution is approximately 0.3 mrad in ϕ and 1 mrad in θ per space p oin t, with 21 p oin ts p er trac k. The 2% smearing is a conserv ativ e estimate of the o verall angular measurement uncertain ty . Impact on surviving bins: T able 29: Angular resolution systematic. The effect is symmetric b et w een the t wo pro jections b ecause b oth coordinates dep end on the op ening angle. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.013 4% ln( k t ) 0.014 5% B.6.6 Neutral particle mo deling Ph ysical origin: The thrust axis is computed from all particles (charged and neutral), but only c harged particles enter the Lund plane. Mis-modeling of neutral particles (photons from π 0 deca y , neutral hadrons) in the MC shifts the thrust axis direction and therefore the hemisphere b oundary , causing particles near the hemisphere b oundary to b e assigned to the wrong hemisphere. Ev aluation metho d: A 1% v ariation in the thrust axis direction is applied, corresponding to the exp ected sensitivity of the thrust axis to neutral particle modeling. The hemisphere assignmen ts are recom- puted with the v aried thrust axis and the difference in the unfolded density is taken as the systematic. V ariation size justification: The thrust axis is determined by the energy-weigh ted sum of all particle momen ta. Neutral particles carry approximately 30% of the even t energy , but their angular distribution is strongly correlated with the c harged particle directions. Studies of the c harged-only vs all-particle thrust axis show differences at the sub-p ercent lev el for the axis direction, so a 1% v ariation is conserv ativ e. Impact on surviving bins: T able 30: Neutral particle mo deling systematic on surviving bins. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.018 6% 94 Pro jection Max shift F raction of max total ln( k t ) 0.014 5% B.6.7 ISR treatmen t Ph ysical origin: Initial-state radiation photons reduce the effectiv e center-of-mass energy of the hadronic system. The ISR v eto in the ev ent selection remov es even ts with hard ISR, but residual soft ISR ma y affect the even t kinematics. Ev aluation metho d: A 0.5% v ariation in the ISR v eto threshold is applied, corresp onding to the uncertain ty in the ISR photon energy reconstruction. V ariation size justification: At LEP1, the fractional energy loss to ISR is approximately δ = 2 α/π × ln( s/m 2 e ) ≈ 8% for the total ISR sp ectrum, but the v eto remov es the hard tail. The residual effect after the v eto is at the sub-percent level. Impact on surviving bins: T able 31: ISR treatment systematic. The small impact reflects the effectiv eness of the ISR v eto in the even t selection. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.009 3% ln( k t ) 0.007 2% B.6.8 Hea vy fla vor Ph ysical origin: Approximately 22% of hadronic Z deca ys pro duce b ¯ b pairs, which ha ve distinctive frag- men tation prop erties: harder fragmen tation functions (the b quark carries a larger fraction of the jet energy), displaced decay vertices from B-hadron lifetimes ( cτ ≈ 450 µ m), and higher particle multiplicities in the deca y pro ducts. The b-quark fraction in the MC affects the comp osition of the Lund plane and therefore the resp onse matrix. Ev aluation method: A 0.5% v ariation in the b-quark fraction is applied, corresp onding to the uncer- tain ty in R b = Γ( Z → b ¯ b ) / Γ( Z → hadrons) = 0 . 21629 ± 0 . 00066 ([ ref-PDG:2024 ]). V ariation size justification: The PDG uncertain ty on R b is 0.3%. The 0.5% v ariation is conserv ativ e and accounts for p oten tial differences in b-quark fragmentation modeling b et w een PYTHIA 6 and data. Impact on surviving bins: T able 32: Hea vy flav or systematic on surviving bins. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.009 3% ln( k t ) 0.007 2% B.6.9 MC statistical uncertaint y Ph ysical origin: The resp onse matrix has finite statistical precision due to the limited MC sample size (771,597 reco ev ents, MC/data ratio 0.25). In sparsely p opulated Lund plane regions, the per-bin MC o ccupancy can drop to O (100) even ts, making individual resp onse matrix elements statistically imprecise. Ev aluation metho d: 200 b ootstrap replicas of the matc hed MC pairs are generated. F or eac h replica, a new resp onse matrix is constructed b y resampling the matched reco-gen pairs with replacement. The pseudo-data (or real data) is unfolded with each b o otstrap resp onse matrix, and the RMS of the ensemble of unfolded results giv es the MC statistical uncertaint y p er bin. Impact on surviving bins: 95 T able 33: MC statistical uncertaint y from 200 b o otstrap replicas. Despite the limited MC/data ratio of 0.25, this systematic is neg- ligible b ecause the resp onse matrix a verages ov er many MC ev ents p er bin. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.002 1% ln( k t ) 0.002 1% B.6.10 Bac kground subtraction Ph ysical origin: Z → τ + τ − ev ents contaminate the hadronic even t sample. After the N ch ≥ 5 cut, the con tamination is b elo w 0.1%. The τ decays pro duce lo w-multiplicit y even ts with distinctiv e top ology , and the surviving even ts are those where b oth taus decay hadronically with sufficient track m ultiplicity . Ev aluation metho d: The τ τ subtraction is v aried b y its MC statistical uncertaint y (100% of the nominal 0.1% contamination, i.e., from 0% to 0.2% subtraction). Impact on surviving bins: T able 34: Bac kground subtraction systematic. Negligible due to the high purity of the hadronic ev ent sample after selection. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.002 1% ln( k t ) 0.001 < 1% B.6.11 Alternativ e metho d Ph ysical origin: Systematic dep endence on the c hoice of unfolding algorithm. Differen t methods (IBU, SVD, matrix inv ersion) implemen t different regularization strategies and hav e different sensitivities to the prior. Ev aluation metho d: The difference b et ween the IBU nominal result and the truncated SVD result (9 singular v alues for ln(1 / ∆ θ ), 12 for ln( k t )) is taken as the alternative metho d systematic. Impact on surviving bins: T able 35: Alternative method systematic. The goo d agreemen t be- t ween IBU and SVD for the 1D pro jections v alidates the unfolding pro cedure. Pro jection Max shift F raction of max total ln(1 / ∆ θ ) 0.005 2% ln( k t ) 0.000 < 1% B.6.12 Regularization Ph ysical origin: Sensitivit y of the result to the n umber of IBU iterations, which con trols the tradeoff b et w een regularization bias and statistic al noise amplification. Ev aluation metho d: The half-difference of ± 1 iteration from the nominal coun t. F or 200 P oisson- fluctuated toys, the regularization systematic RMS is computed p er bin. Impact: 96 T able 36: Regularization systematic from to y studies. Negligible in all bins, confirming that the nominal iteration coun t is w ell-c hosen: the ± 1 iteration v ariation produces sub-p ermille densit y changes. Pro jection Max shift (toy RMS) ln(1 / ∆ θ ) 0.0003 ln( k t ) 0.0003 B.6.13 Systematic budget summary F ull bin range Surviving bins only After the flat-prior gate, the systematic budget on surviving bins is significan tly reduced b ecause the excluded bins are precisely those where prior dep endence was catastrophically large. In the tables b elo w, “Max shift” is the largest absolute shift (in density units) across surviving bins for each source, and “F raction” is defined as the ratio of that source’s max shift to the total max shift, expressed as a p ercen tage. F ractions exceed 100% in sum b ecause the total is a quadrature sum, not a linear sum. ln(1 / ∆ θ ) surviving bins (0, 1, 2, 4, 5) T able 37: Systematic budget for surviving ln(1 / ∆ θ ) bins. The total systematic (max 0.303) is reduced b y a factor of 3 compared to the full 10-bin result (max 0.990). Note: the total is computed as p diag( C total ) from the full cov ariance matrix, whic h accounts for in ter-source correlations (e.g., the hadronization rew eigh ting partly o verlaps with the prior-dependence v ariation). This can produce totals slightly b elo w the naive quadrature sum of individual source magnitudes. Source Max shift F raction Prior dep endence 0.283 93% Hadronization 0.114 38% Selection cuts 0.036 12% Neutral mo deling 0.018 6% T racking efficiency 0.018 6% Angular resolution 0.013 4% ISR treatment 0.009 3% Hea vy fla vor 0.009 3% Alternativ e method 0.005 2% MC statistical 0.002 1% Bac kground 0.002 1% Regularization 0.000 0% T otal 0.303 ln( k t ) surviving bins (2, 6) 97 1 2 3 4 l n ( 1 / ) 1 0 4 1 0 3 1 0 2 1 0 1 1 0 0 | | [ p e r h e m i s p h e r e p e r u n i t b i n w i d t h ] p s = 9 1 . 2 G e V ALEPH Simulation regularization prior dependence alternative method mc statistical hadronization tracking efficiency Total systematic Figure 40: Systematic uncertain t y breakdo wn for the ln(1 / ∆ θ ) pro jection across all 10 bins. Each colored bar sho ws the con tribution of a systematic source to the total uncertaint y in each bin. Prior dep endence (dark blue) dominates in all bins, with hadronization (orange) as the second-largest source. The total uncertaint y (blac k outline) ranges from 0.09 in the b est-constrained bins to nearly 1.0 at the kinematic edge. 98 -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 1 0 1 5 1 0 1 3 1 0 1 1 1 0 9 1 0 7 1 0 5 1 0 3 1 0 1 | | [ p e r h e m i s p h e r e p e r u n i t b i n w i d t h ] p s = 9 1 . 2 G e V ALEPH Simulation regularization prior dependence alternative method mc statistical hadronization tracking efficiency Total systematic Figure 41: Systematic uncertain ty breakdo wn for the ln( k t ) pro jection across all 12 bins. The same hierarc hy applies: prior dependence dominates ev erywhere, with hadronization as the clear second source. The total uncertain ty is largest at the kinematic edges (ln k t < − 2 and ln k t > 2) where the densit y is smallest and the resp onse matrix is most p oorly conditioned. 99 T able 38: Systematic budget for surviving ln( k t ) bins. Only 2 of 12 bins surviv e the flat-prior gate. The hadronization systematic con tributes nearly half the non-prior total. Source Max shift F raction Prior dep endence 0.254 87% Hadronization 0.135 47% T racking efficiency 0.033 11% Selection cuts 0.028 10% Angular resolution 0.014 5% Neutral mo deling 0.014 5% ISR treatment 0.007 2% Hea vy fla vor 0.007 2% MC statistical 0.002 1% Bac kground 0.001 1% Alternativ e method 0.000 0% Regularization 0.000 0% T otal 0.290 B.6.14 Systematic completeness table tbl. 39 compares the implemented systematic sources against the requirements from the analysis conv entions for unfolded measurements and the reference analyses. T able 39: Systematic completeness table comparing the imple- men ted sources to the con ven tions requiremen ts and reference anal- yses (A TLAS Lund plane ([ ref-A TLAS:2020bbn ]) and LEP QCD com bination ([ ref-LEP:QCD:2004 ])). All required sources are accoun ted for. Sources mark ed “Estimated” use parametric esti- mates calibrated to reference analyses rather than full propagation through the analysis chain. These are collectively sub dominan t to prior dep endence by an order of magnitude. Source Required A TLAS LEP This Status T racking Y es Y es Y es 3% Estimated Selection Y es Y es – 2% Estimated Bac kground Y es – – 0.1% Estimated MC stat. Y es Y es – Bo otstrap Rigorous Angular res. Y es – – 2% Estimated Neutral mo del Y es – – 1% Estimated Regularization Y es Y es – ± 1 iter Implemented Prior dep. Y es Y es – Flat prior Implemen ted Alt. metho d Y es – – SVD Implemen ted Hadronization Y es Y es Y es Rew eight Impl. [L2] ISR Y es – – 0.5% Estimated Hea vy fla vor Y es – – 0.5% Estimated B.7 V alidation B.7.1 Closure test The closure test unfolds MC reco-lev el (matched reco histogram) through the nominal resp onse matrix and compares to MC truth. F or b oth pro jections, the IBU p erfectly recov ers the MC truth at the nominal itera- 100 tion coun t, as expected when the input is the matched reco distribution (the resp onse matrix is constructed from the same matc hed pairs). The exact closure c hi-squared is iden tically zero for non-fluctuated matched reco input. This is a necessary but not sufficient condition for a correct unfolding setup: it v erifies that the response matrix, efficiency correction, and normalization pro cedure are self-consisten t, but do es not test the metho d’s robustness to input shap es that differ from the MC prior. Noisy closure test T o test robustness to statistical fluctuations, 200 Poisson-fluctuated toys are generated from the nominal reco histogram, unfolded through the nominal resp onse matrix, and compared to truth using diagonal Poisson uncertain ties on the unfolded result. T able 40: Noisy closure test results from 200 P oisson toys. Both pro jections show χ 2 /ndf consisten t with 1.0, confirming unbiased closure under statistical fluctuations. Pro jection Mean χ 2 /ndf Std dev Median ln(1 / ∆ θ ) 1.24 0.59 1.13 ln( k t ) 1.24 0.52 1.14 B.7.2 Stress test The stress test ev aluates whether the unfolding can recov er a truth distribution that differs significantly from the MC prior. The MC truth is reweigh ted b y a 50% linear tilt as a function of ln( k t ) using the reweigh ting function: w (ln k t ) = 1 + 0 . 5 × ln k t − ⟨ ln k t ⟩ max | ln k t − ⟨ ln k t ⟩| (35) where ⟨ ln k t ⟩ is the mean of the gen-lev el ln k t distribution. This produces a linear tilt of ± 50% across the ln k t range. The rew eigh ted reco distribution is unfolded through the nominal resp onse matrix and compared to the reweigh ted truth. T able 41: Stress test results. Both pro jections fail, indicating that IBU at 5–6 iterations cannot reco ver truth distributions that differ strongly from the training prior. Pro jection Stress χ 2 /ndf p-v alue V erdict ln(1 / ∆ θ ) 286 0.000 F AIL ln( k t ) 4324 0.000 F AIL F ormal v erdict: The stress test fails. The chi-squared v alues of 286 and 4324 far exceed the ndf for b oth pro jections, indicating that IBU at 5–6 iterations cannot recov er truth distributions that differ strongly from the training prior. The analysis proceeds despite this failure for the following quan titative reasons: 1. The 50% linear tilt is 10–50 times larger than the observed 1–5% data/MC difference in surviving bins. 2. The flat-prior gate excludes bins where prior dep endence exceeds 20%, retaining only bins where the data has meaningful constraining p o wer. 3. The prior-dep endence systematic (flat-prior difference, Section 6.1) conserv atively cov ers the residual bias in surviving bins, as shown in the cov erage table b elo w. The follo wing table compares, for each surviving bin, the stress-test residual bias, the assigned flat-prior systematic, and whether the systematic cov ers the bias: 101 0 1 2 3 4 5 2 / n d f 0.0 0.2 0.4 0.6 0.8 1.0 Probability density p s = 9 1 . 2 G e V ALEPH Simulation Toy chi2/ndf (200 toys) chi2(ndf=10)/ndf chi2/ndf = 1 Figure 42: Noisy closure χ 2 /ndf distribution for the ln(1 / ∆ θ ) pro jection from 200 Poisson toys (blue his- togram), compared to the exp ected χ 2 (ndf) / ndf distribution (red curv e). The distribution p eaks near 1.0 with a tail to higher v alues, consistent with the exp ected statistical spread. This confirms that the IBU pro cedure is unbiased for P oisson-fluctuated input at the nominal iteration count. 102 0 1 2 3 4 5 2 / n d f 0.0 0.2 0.4 0.6 0.8 1.0 Probability density p s = 9 1 . 2 G e V ALEPH Simulation Toy chi2/ndf (200 toys) chi2(ndf=12)/ndf chi2/ndf = 1 Figure 43: Noisy closure χ 2 /ndf distribution for the ln( k t ) pro jection from 200 P oisson to ys. The distribution is consistent with the exp ected χ 2 (ndf) / ndf shap e, with mean 1.24 and median 1.14, confirming unbiased closure for the 12-bin k t pro jection. 103 T able 42: Stress test co verage for surviving bins. In all cases, the flat-prior systematic exceeds the stress-test residual bias by a factor of 3–8, confirming adequate cov erage. The stress bias is the absolute difference b etw een the unfolded rew eighted distribution and the rew eigh ted truth; the prior systematic is the flat-prior shift from tbl. ?? and tbl. ?? . Pro jection Bin Stress bias Prior syst Co vered ln(1 / ∆ θ ) 0 0.007 (0.4%) 0.190 Y es ln(1 / ∆ θ ) 1 0.015 (0.9%) 0.054 Y es ln(1 / ∆ θ ) 2 0.065 (3.7%) 0.283 Y es ln(1 / ∆ θ ) 4 0.034 (2.5%) 0.196 Y es ln(1 / ∆ θ ) 5 0.014 (1.6%) 0.113 Y es ln( k t ) 2 0.012 (0.9%) 0.074 Y es ln( k t ) 6 0.045 (3.3%) 0.254 Y es A graded stress test at smaller tilt magnitudes (5%, 10%, 20%) would pro vide additional confidence in the cov erage but requires re-running the Phase 4 unfolding infrastructure and is deferred to future work. B.7.3 Flat-prior gate P er the analysis conv entions for unfolded measuremen ts, bins where the flat-prior relative c hange exceeds 20% are excluded from the rep orted result. These bins are dominated b y prior dep endence rather than data constrain t. The flat-prior relative c hange is defined as: relativ e c hange i = | ρ flat ,i − ρ nominal ,i | ρ nominal ,i (36) ln(1 / ∆ θ ) gate T able 43: Flat-prior gate for ln(1 / ∆ θ ). Five of ten bins pass the 20% threshold. Bins 3 and 6–9 (high ln(1 / ∆ θ ), small op ening an- gles) are excluded due to lo w efficiency and p o or diagonal domi- nance in those regions. Bin Range Flat-prior change Status 0 [0.0, 0.5] 11.8% P ASS 1 [0.5, 1.0] 3.1% P ASS 2 [1.0, 1.5] 15.7% P ASS 3 [1.5, 2.0] 21.3% F AIL 4 [2.0, 2.5] 14.6% P ASS 5 [2.5, 3.0] 13.0% P ASS 6 [3.0, 3.5] 77.5% F AIL 7 [3.5, 4.0] 231.4% F AIL 8 [4.0, 4.5] 639.2% F AIL 9 [4.5, 5.0] 3080.2% F AIL Surviving bins: 0, 1, 2, 4, 5 (5 of 10), co vering ln(1 / ∆ θ ) ∈ [0 , 1 . 5] ∪ [2 , 3], corresp onding to op ening angles from approximately 3 degrees to 180 degrees with a gap at 12–20 degrees. ln( k t ) gate 104 0 100000 200000 300000 400000 500000 600000 700000 800000 Declusterings per bin p s = 9 1 . 2 G e V ALEPH Simulation Reweighted truth IBU (5 iter) 0 1 2 3 4 5 l n ( 1 / ) 0.75 1.00 1.25 Unfolded / Truth Figure 44: Stress test for the ln(1 / ∆ θ ) pro jection. The IBU-unfolded reweigh ted distribution (black p oints) is compared to the rew eighted truth (red line). The ratio panel sho ws the residual bias: the unfolding partially reco vers the mo dified shap e but retains a systematic pull to ward the nominal MC prior, particularly at the bin edges where the reweigh ting deviates most. 105 0 200000 400000 600000 800000 Declusterings per bin p s = 9 1 . 2 G e V ALEPH Simulation Reweighted truth IBU (6 iter) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 0.75 1.00 1.25 Unfolded / Truth Figure 45: Stress test for the ln( k t ) pro jection. The failure is more severe than for the angular pro jection ( χ 2 /ndf = 4324 vs 286), consisten t with the lo wer diagonal fraction (31% vs 40%) and broader migration pattern of the k t resp onse matrix. 106 T able 44: Flat-prior gate for ln( k t ). Only t wo of t welv e bins sur- viv e the 20% threshold. The p o or surviv al rate reflects the lo wer diagonal fraction (31%) and higher condition num b er (10.3) of the k t resp onse matrix compared to the angular pro jection. Bin Range Flat-prior change Status 0 [-3.0, -2.5] 247.9% F AIL 1 [-2.5, -2.0] 78.2% F AIL 2 [-2.0, -1.5] 5.5% P ASS 3 [-1.5, -1.0] 27.9% F AIL 4 [-1.0, -0.5] 40.4% F AIL 5 [-0.5, 0.0] 38.1% F AIL 6 [0.0, 0.5] 18.4% P ASS 7 [0.5, 1.0] 29.4% F AIL 8 [1.0, 1.5] 110.3% F AIL 9 [1.5, 2.0] 191.8% F AIL 10 [2.0, 2.5] 379.9% F AIL 11 [2.5, 3.0] 3083.4% F AIL Surviving bins: 2, 6 (2 of 12), co vering ln( k t ) ∈ [ − 2 , − 1 . 5] ∪ [0 , 0 . 5], corresp onding to k t ≈ 0 . 14–0.22 GeV and k t ≈ 1–1.6 GeV. B.7.4 10% data v alidation Before unblinding the full dataset, the measurement was v alidated on a 10% subsample selected as ev ery 10th even t by index ( event index % 10 == 0 ). This provides 305,064 even ts b efore selection (288,998 after selection, 577,996 hemispheres, 2,900,124 declusterings). The 10% subsample reproduces the full- dataset statistics exactly (94.7% selection efficiency , 5.02 mean declusterings per hemisphere), confirming it is representativ e. Comparison to exp ected The 10% data unfolded densities are compared to the Phase 4a MC exp ected results using the c hi-squared test: χ 2 = (  d −  e ) T C − 1 (  d −  e ) (37) where C = C syst + (1 /f ) C stat,MC with f = 0 . 1 accounts for the reduced statistics of the 10% subsample. T able 45: Chi-squared comparison of 10% data to MC exp ected results on surviving bins. Both pro jections are fully consistent, with no individual pull exceeding 2 σ . Pro jection χ 2 /ndf p-v alue Max pull ln(1 / ∆ θ ) 1.42/5 = 0.28 0.92 0.62 σ ln( k t ) 0.01/2 = 0.003 1.00 0.08 σ Y ear-by-y ear consistency The Lund plane pro jections at detector lev el were compared across all six data-taking p eriods to chec k for time-dependent detector effects. Eac h p eriod is normalized p er hemisphere and compared to the combined a verage. 107 0 1 2 3 4 5 l n ( 1 / ) 1 0 0 1 0 1 1 0 2 1 0 3 Flat-prior relative change [%] p s = 9 1 . 2 G e V ALEPH Simulation 20% threshold Figure 46: Flat-prior gate for the ln(1 / ∆ θ ) pro jection. Green bars indicate bins passing the 20% threshold (horizon tal dashed line); red bars indicate excluded bins. Fiv e of ten bins survive, co vering the angular range where the response matrix has the b est diagonal dominance and highest efficiency . The excluded bins at high ln(1 / ∆ θ ) corresp ond to small op ening angles where trac king resolution dominates. 108 -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 1 0 0 1 0 1 1 0 2 1 0 3 Flat-prior relative change [%] p s = 9 1 . 2 G e V ALEPH Simulation 20% threshold Figure 47: Flat-prior gate for the ln( k t ) pro jection. Only tw o of tw elve bins surviv e the 20% gate. The m uch higher exclusion rate compared to the angular pro jection reflects the broader migration pattern in k t (resolution std 1.30 vs 0.80) and the low er diagonal fraction (31% vs 40%). 109 T able 46: Y ear-by-y ear consistency test. All χ 2 /ndf v alues range from 0.19 to 1.05, consisten t with the expectation of 1.0 for statisti- cally compatible samples. No y ear-to-year v ariations are observed, indicating stable ALEPH detector conditions across the 1992–1995 running p eriod. P erio d ln(1 / ∆ θ ) χ 2 /ndf ln( k t ) χ 2 /ndf 1992 4.0 / 10 = 0.40 6.7 / 12 = 0.56 1993 1.9 / 10 = 0.19 10.8 / 12 = 0.90 1994-P1 8.9 / 10 = 0.89 11.7 / 12 = 0.97 1994-P2 5.8 / 10 = 0.58 5.1 / 12 = 0.42 1994-P3 8.7 / 10 = 0.87 12.6 / 12 = 1.05 1995 10.2 / 10 = 1.02 5.9 / 12 = 0.49 B.7.5 Go odness of fit T oy-based p-v alue 500 P oisson toys w ere generated from the exp ected reco distribution (response matrix times matched gen), unfolded through the nominal response matrix, and compared to MC truth using the full cov ariance: T able 47: T oy-based go odness-of-fit assessmen t on MC pseudo- data. The observed χ 2 = 0 (exact closure) is b elow all 500 to y v alues, giving p = 1.0. The to y χ 2 means b eing b elo w ndf reflects the large systematic comp onents in the cov ariance that inflate the uncertain ty relativ e to toy-lev el Poisson fluctuations. Pro jection T o y χ 2 mean Observed χ 2 p-v alue ln(1 / ∆ θ ) 1.53 0.00 1.000 ln( k t ) 2.77 0.00 1.000 B.8 Results The primary results use the full arc hived ALEPH dataset: 2,889,543 even ts after selection, 5,779,086 hemi- spheres, and 28,984,792 primary declusterings. The unfolding pro cedure is iden tical to that v alidated on 10% data (Section 7.4): fak e subtraction using MC-deriv ed fak e fractions, IBU with the nominal iteration coun t, efficiency correction, and densit y normalization p er eq. 31 . B.8.1 Unfolded ln(1 / ∆ θ ) pro jection (primary result) The primary result is the ln(1 / ∆ θ ) angular sp ectrum, measuring the angular distribution of primary Lund emissions in hadronic Z decays. Fiv e bins surviv e the flat-prior gate, cov ering ln(1 / ∆ θ ) ∈ [0 , 1 . 5] ∪ [2 , 3]. T able 48: Unfolded primary Lund plane densit y pro jected onto ln(1 / ∆ θ ), for the five surviving bins after the flat-prior gate. Sta- tistical uncertain ties are negligible ( < 0 . 1%) compared to system- atic uncertain ties (5–17% relativ e). Bins 3, 6–9 are excluded b y the flat-prior gate. The densit y is p er hemisphere p er unit bin width. Bin Range ρ Stat Syst T otal 0 [0.0, 0.5] 1.583 0.001 0.199 0.199 1 [0.5, 1.0] 1.666 0.001 0.091 0.091 110 Bin Range ρ Stat Syst T otal 2 [1.0, 1.5] 1.750 0.001 0.303 0.303 4 [2.0, 2.5] 1.368 0.001 0.202 0.202 5 [2.5, 3.0] 0.894 0.001 0.163 0.163 The densit y rises from ρ = 1 . 58 at wide angles (ln(1 / ∆ θ ) = 0–0.5, corresp onding to ∆ θ ≈ 60–180 degrees) to a peak of ρ = 1 . 75 at mo derate angles (ln(1 / ∆ θ ) = 1–1.5, ∆ θ ≈ 12–22 degrees), then decreases to ρ = 0 . 89 at narro wer angles (ln(1 / ∆ θ ) = 2 . 5–3, ∆ θ ≈ 3–5 degrees). This shape reflects the in terplay betw een the collinear enhancemen t of QCD radiation (rising densit y at small angles) and phase-space suppression (decreasing density at v ery small angles due to the kinematic boundary). Note on the density magnitude: The measured densities ( ρ ∼ 1 . 0–1.8) exceed the naive leading- order p erturbativ e exp ectation α s C F /π ≈ 0 . 05 b y a factor of appro ximately 30. This is exp ected: the 1D pro jections integrate o ver the other Lund plane co ordinate, summing con tributions from m ultiple emissions and non-p erturbativ e hadronization effects. The LO formula ρ ≈ α s C F /π applies to the tw o-dimensional densit y at a single p oin t in the p erturbativ e regime, not to the integrated 1D pro jections. Additionally , the c harged-particle-level definition includes hadronization pro ducts that p opulate the Lund plane well b ey ond the single-emission approximation. Comparison to exp ected T able 49: Chi-squared comparison of full data to MC expected for the angular pro jection. The c hi-squared uses the full co v ariance matrix (systematic + data statistical) on surviving bins. Pro jection χ 2 ndf χ 2 /ndf p-v alue ln(1 / ∆ θ ) 6.49 5 1.30 0.26 The χ 2 /ndf = 1.30 with p-v alue = 0.26 indicates go od agreement b et w een data and the MC-based exp ected result. P er-bin pulls T able 50: Per-bin pulls of full data vs exp ected for ln(1 / ∆ θ ). No bin exceeds 2 σ . The maxim um pull of 0.64 σ is in bin 1. The pattern of negative pulls at wide angles (bins 0–2) and p ositiv e pulls at narrow er angles (bins 4–5) is consisten t with a mild 2–3% data/MC shap e difference. Bin Range Pull ( σ ) 0 [0.0, 0.5] -0.15 1 [0.5, 1.0] -0.64 2 [1.0, 1.5] -0.19 4 [2.0, 2.5] +0.11 5 [2.5, 3.0] +0.16 B.8.2 Unfolded ln( k t ) pro jection (illustration of metho d limitations) The ln( k t ) pro jection retains only tw o non-contiguous bins after the flat-prior gate, cov ering ln( k t ) ∈ [ − 2 , − 1 . 5] ∪ [0 , 0 . 5]. Two non-contiguous bins with relativ e uncertain ties of 11% and 21% do not consti- tute a sp ectrum measuremen t. This pro jection is presented as an illustration of the metho d’s limitations in 111 0 2 4 6 2 0.0 0.1 0.2 0.3 0.4 0.5 Probability density p s = 9 1 . 2 G e V ALEPH Simulation Toy chi2 (500 toys) chi2(ndf=10) Observed: 0.0 Figure 48: T o y chi-squared distribution for ln(1 / ∆ θ ) from 500 P oisson to ys (blue histogram) compared to the χ 2 (ndf = 10) distribution (red curve). The observed χ 2 = 0 (black dashed line) falls far b elo w the distribution, confirming perfect closure on MC pseudo-data. The shift of the to y distribution to low er v alues relativ e to χ 2 (ndf ) reflects the presence of systematic uncertainties in the co v ariance that are not fluctuated in the toys. 112 0 2 4 6 8 10 2 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 Probability density p s = 9 1 . 2 G e V ALEPH Simulation Toy chi2 (500 toys) chi2(ndf=12) Observed: 0.0 Figure 49: T oy chi-squared distribution for ln( k t ) from 500 P oisson to ys. Same structure as the angular pro jection: the observ ed χ 2 = 0 confirms exact closure, and the toy distribution is shifted b elow χ 2 (ndf = 12) due to the systematic comp onen t of the cov ariance. 113 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 [ p e r h e m i s p h e r e p e r u n i t b i n w i d t h ] p s = 9 1 . 2 G e V ALEPH PYTHIA 6 truth Expected (Phase 4a) Syst. band 10% data Full data (IBU) 1 2 3 4 5 l n ( 1 / ) 0.5 1.0 1.5 Data / Expected Figure 50: F ull data unfolded density for the ln(1 / ∆ θ ) pro jection (black p oints with error bars) compared to Phase 4a MC expected result (blue dashed), the 10% data v alidation result (gra y squares), and PYTHIA 6 particle-lev el truth (red line). The blue shaded band shows the systematic uncertain ty on the exp ected result. Gray vertical bands indicate bins excluded by the flat-prior gate. All surviving bins agree within uncertain ties, with the 10% data confirming the full-data trend. 114 the k t direction rather than a physics result. The limited surviv al rate is a fundamental consequence of the lo w diagonal fraction (31%) and high condition num b er (10.3) of the k t resp onse matrix. T able 51: Unfolded Lund plane density pro jected onto ln( k t / GeV) for the tw o surviving bins. The limited n umber of surviving bins restricts the physics in terpretation of this pro jection. Bin Range ρ Stat Syst T otal 2 [-2.0, -1.5] 1.337 0.001 0.145 0.145 6 [0.0, 0.5] 1.365 0.001 0.290 0.290 T able 52: Chi-squared comparison of full data to exp ected for ln( k t ). The χ 2 /ndf of 0.003 and p = 1 . 00 should not be inter- preted as evidence of agreement: with only 2 bins and relativ e un- certain ties of 11–21%, the test has no discriminating p o wer. The uncertain ties are too large for any reasonable data/MC difference to pro duce a significant c hi-squared. Pro jection χ 2 ndf χ 2 /ndf p-v alue ln( k t ) 0.01 2 0.003 1.00 The pulls in the tw o surviving bins are +0.08 σ and -0.05 σ , both negligible, again reflecting the large uncertain ties rather than precise agree men t. B.8.3 Consistency with 10% v alidation The full data is compared to the 10% subsample to verify internal consistency . This comparison uses purely statistical uncertainties (com bined full-data and 10% stat errors), since systematic uncertainties cancel in the ratio (same resp onse matrix, same corrections): T able 53: Consistency test b et ween full data and 10% subsam- ple. The χ 2 /ndf v alues near 1.0 confirm excellent agreemen t, with statistical precision improving b y a factor √ 10 ≈ 3 . 2 as exp ected. Pro jection χ 2 /ndf p-v alue Max pull ln(1 / ∆ θ ) 4.97/5 = 0.99 0.42 1.48 σ ln( k t ) 1.86/2 = 0.93 0.40 1.36 σ No individual pull exceeds 2 σ . The largest pull (1.48 σ in ln(1 / ∆ θ ) bin 0) is consisten t with statistical fluctuations for 7 indep enden t comparisons. B.8.4 Declusterings vs k t threshold The mean n umber of primary Lund declusterings per hemisphere ab o ve v arious k t thresholds is measured at detector level and compared to the PYTHIA 6 particle-level prediction: 115 -1 0 1 2 3 [ p e r h e m i s p h e r e p e r u n i t b i n w i d t h ] p s = 9 1 . 2 G e V ALEPH PYTHIA 6 truth Expected (Phase 4a) Syst. band 10% data Full data (IBU) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 0.5 1.0 1.5 Data / Expected Figure 51: F ull data unfolded densit y for the ln( k t ) pro jection compared to expected, 10% data, and PYTHIA 6 truth. Only tw o bins survive the flat-prior gate (green regions). The large gray-shaded excluded regions illustrate the severit y of the prior-dep endence limitation for this pro jection. Both surviving bins agree with the exp ected v alues. 116 T able 54: Mean n umber of primary declusterings p er hemisphere ab o v e v arious k t thresholds. The data v alues are at detector level (c harged tracks with trac king inefficiency and resolution effects), while the PYTHIA 6 v alues are at particle level (all stable c harged particles with p erfect reconstruction). The systematic 13–22% deficit reflects the detector efficiency , not a physics disagreement. The total num b er of declusterings p er hemisphere (no threshold) is 5.02 (data) vs 6.10 (PYTHIA 6 particle level). k t threshold [GeV] Data PYTHIA 6 Ratio 0.5 2.54 2.94 0.87 1.0 1.24 1.42 0.87 2.0 0.49 0.56 0.87 5.0 0.13 0.17 0.78 The data consistently shows fewer declusterings p er hemisphere than the particle-level prediction. The deficit of approximately 13–22% is expected b ecause the data coun t is at detector level while the PYTHIA 6 truth is at particle lev el. This difference is consistent with the mean efficiency of approximately 49% for the 1D pro jections: soft and forw ard declusterings are lost to trac king inefficiency . The unfolded densit y measuremen t (Sections 8.1 and 8.2) corrects for these effects. B.9 Comparison to generator prediction The unfolded full data is compared to the PYTHIA 6 particle-level prediction (the MC generator output b efore detector simulation). Because PYTHIA 6 is the only av ailable generator and is also the mo del used to construct the resp onse matrix, this comparison is tautological: the data-vs-PYTHIA 6 chi-squared is iden tical to the data-vs-exp ected chi-squared (see b elow). An indep endent generator test w ould require particle-lev el predictions from PYTHIA 8, HER WIG 7, or analytic calculations, whic h are not a v ailable for this analysis. The c hi-squared uses the full cov ariance matrix (stat + syst) on surviving bins. T able 55: Chi-squared comparison of unfolded data to PYTHIA 6 particle-lev el prediction. The PYTHIA 6 chi-squared is iden tical to the data-vs-expected chi-squared (tbl. 49 , tbl. 52 ) b ecause the exp ected result is obtained by unfolding MC pseudo-data through the same response matrix that enco des the PYTHIA 6 truth. The t wo comparisons are therefore not indep enden t tests. Pro jection χ 2 /ndf p-v alue ln(1 / ∆ θ ) 6.49/5 = 1.30 0.26 ln( k t ) 0.01/2 = 0.003 1.00 Note on the PYTHIA 6 chi-squared: The c hi-squared of the data compared to PYTHIA 6 truth is n umerically iden tical to the chi-squared of data compared to the exp ected result. This is because the expected result is obtained b y unfolding the matched MC reco through the nominal resp onse matrix, which recov ers the PYTHIA 6 truth exactly (closure c hi-squared = 0). The t wo tests therefore provide the same information: b oth assess whether the unfolded data is consisten t with the PYTHIA 6 particle-lev el prediction within the systematic uncertain ties. An indep enden t theory comparison would require particle-level predictions from generators other than the one used to construct the response matrix (e.g., PYTHIA 8 Monash, HER WIG 7.3, or analytic NLL calculations). The data agrees with PYTHIA 6 within systematic uncertainties. The per-bin data/PYTHIA 6 ratios for the angular pro jection are: 117 1 0 1 1 0 0 1 0 1 k t t h r e s h o l d [ G e V ] 1 0 6 1 0 5 1 0 4 1 0 3 1 0 2 1 0 1 1 0 0 1 0 1 Mean primary declusterings per hemisphere p s = 9 1 . 2 G e V ALEPH ALEPH data PYTHIA 6 (particle level) Figure 52: Mean n um b er of primary Lund declusterings p er hemisphere abov e a k t threshold, comparing full ALEPH data at detector level (blac k p oin ts) to PYTHIA 6 particle-level prediction (red line). The systematic deficit of 13–22% reflects detector efficiency effects (trac king inefficiency , resolution smearing), not a ph ysics disagreement. Both curves sho w the expected monotonic decrease with increasing k t threshold, with approximately 2.5 declusterings per hemisphere abov e k t = 0 . 5 GeV decreasing to 0.13 ab o ve k t = 5 GeV. 118 T able 56: Data/PYTHIA 6 ratio for surviving ln(1 / ∆ θ ) bins. The mild 2–3% deficit at wide angles (bins 0–2) and 2–3% excess at nar- ro wer angles (bins 4–5) suggests a p ossible shape difference betw een data and PYTHIA 6, but the effect is w ell within the systematic uncertain ty band (dominated by prior dep endence at 5–17%). Bin Range Data/PYTHIA 6 0 [0.0, 0.5] 0.981 1 [0.5, 1.0] 0.966 2 [1.0, 1.5] 0.969 4 [2.0, 2.5] 1.017 5 [2.5, 3.0] 1.030 The pattern of data sitting slightly b elow PYTHIA 6 at wide angles (ln(1 / ∆ θ ) < 1 . 5) and slightly ab ov e at narrow er angles (ln(1 / ∆ θ ) > 2) was already visible in the 10% v alidation and is confirmed with improv ed statistical precision in the full data. A genuine physics difference (e.g., different angular distribution of soft emissions in data vs.˜the PYTHIA 6 Lund string mo del) cannot b e distinguished from prior-dependence bias at this lev el of precision. B.9.1 Systematic breakdo wn on full data The systematic breakdown for the surviving bins on full data is shown in fig. 55 and fig. 56 . The hierarch y is unchanged from the Phase 4a exp ectation: 1. Prior dep endence dominates at 87–93% of the total systematic. 2. Hadronization mo del is the second-largest source at 38–47% of the non-prior total. 3. Detector effects (selection, tracking, angular resolution, neutral mo deling) contribute 4–12% eac h, collectiv ely subdominant. The measurement is completely systematics-dominated: the ratio of systematic to statistical uncertain ty exceeds 80 in all surviving bins. B.10 Statistical metho d and co v ariance B.10.1 Co v ariance matrix construction The total cov ariance matrix for each pro jection is: C total = C stat + X s C s (38) where C stat is the data statistical cov ariance estimated from 200 b o otstrap replicas through the full unfolding c hain, and C s = δ s ⊗ δ s is the systematic cov ariance from the outer product of each systematic shift vector δ s . The statistical cov ariance captures b oth the P oisson fluctuations of the data and the MC statistical uncertain ty on the resp onse matrix (through the bo otstrap resampling). The systematic cov ariance assumes full correlation within eac h source, which is appropriate for the parametric v ariations used in this analysis. B.10.2 Co v ariance v alidation 119 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 [ p e r h e m i s p h e r e p e r u n i t b i n w i d t h ] p s = 9 1 . 2 G e V ALEPH PYTHIA 6 (particle level) ALEPH data (IBU) 1 2 3 4 5 l n ( 1 / ) 0.5 1.0 1.5 Data / PYTHIA 6 Figure 53: Unfolded full data compared to PYTHIA 6 particle-level prediction for the ln(1 / ∆ θ ) pro jection. The main panel sho ws the density v alues with total (stat + syst) error bars on the data. The ratio panel sho ws the data/PYTHIA 6 ratio. The data sits 2–3% below PYTHIA 6 in bins 0–2 (wide angles) and 2–3% ab o v e in bins 4–5 (narrow er angles), but all deviations are well within the systematic uncertaint y band. 120 -1 0 1 2 3 [ p e r h e m i s p h e r e p e r u n i t b i n w i d t h ] p s = 9 1 . 2 G e V ALEPH PYTHIA 6 (particle level) ALEPH data (IBU) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 0.5 1.0 1.5 Data / PYTHIA 6 Figure 54: Unfolded full data compared to PYTHIA 6 particle-level prediction for the ln( k t ) pro jection. Both surviving bins agree well with the PYTHIA 6 prediction, with data/MC ratios of 1.01 and 0.99. 121 Bin 0 Bin 1 Bin 2 Bin 4 Bin 5 0.00 0.05 0.10 0.15 0.20 0.25 Absolute systematic shift p s = 9 1 . 2 G e V ALEPH prior dependence hadronization selection cuts neutral modeling tracking efficiency angular resolution isr heavy flavor alternative method background Figure 55: Systematic uncertaint y breakdo wn for surviving ln(1 / ∆ θ ) bins in the full data result. Prior dep endence (blue) dominates in all five bins, follow ed b y hadronization (orange). The remaining sources (selection, trac king, angular resolution, neutral modeling, ISR, heavy flav or, alternative metho d, MC statis- tical, background, regularization) are collectively small. 122 Bin 2 Bin 6 0.00 0.05 0.10 0.15 0.20 0.25 Absolute systematic shift p s = 9 1 . 2 G e V ALEPH prior dependence hadronization tracking efficiency selection cuts angular resolution neutral modeling isr heavy flavor background Figure 56: Systematic uncertain ty breakdown for surviving ln( k t ) bins in the full data result. The same hierarc hy applies, with prior dep endence and hadronization as the tw o leading sources. The relativ e con tri- bution of tracking efficiency is slightly larger for k t than for the angular pro jection, reflecting the sensitivity of soft emissions to track reconstruction. 123 T able 57: Cov ariance matrix v alidation. Both matrices are positive semi-definite with condition n umbers well below the 10 10 threshold for numerically stable matrix inv ersion. Prop ert y ln(1 / ∆ θ ) ln( k t ) F ull dimension 10 x 10 12 x 12 Eigen v alue range 2 . 0 × 10 − 6 –1.9 2 . 7 × 10 − 6 –6.3 Negativ e eigen v alues 0 0 Condition num b er 9 . 7 × 10 5 2 . 4 × 10 6 PSD chec k P ASS P ASS The surviving-bin cov ariance matrices (5 x 5 for angular, 2 x 2 for k t ) are also p ositiv e semi-definite, with impro ved conditioning due to the remov al of p oorly-constrained edge bins. B.10.3 Surviving-bin co v ariance matrices The final co v ariance matrices on the surviving bins combine data statistical uncertaint y with the full sys- tematic cov ariance: Final co v ariance for surviving ln(1 / ∆ θ ) bins Prop ert y V alue Dimension 5 x 5 (bins 0, 1, 2, 4, 5) Eigen v alue range 2 . 2 × 10 − 5 –1 . 9 × 10 − 1 Negativ e eigen v alues 0 PSD chec k P ASS Final co v ariance for surviving ln( k t ) bins Prop ert y V alue Dimension 2 x 2 (bins 2, 6) Eigen v alue range 7 . 3 × 10 − 3 –9 . 8 × 10 − 2 Negativ e eigen v alues 0 PSD chec k P ASS B.11 Summary and outlo ok B.11.1 Summary The primary Lund jet plane densit y in hadronic Z deca ys at √ s = 91 . 2 GeV is measured using the full arc hived ALEPH dataset (2,889,543 even ts after selection, 5,779,086 hemispheres). This is the first measurement of the Lund jet plane in electron-p ositron collisions. The primary result is the ln(1 / ∆ θ ) angular sp ectrum, measured in 5 surviving bins (after the flat-prior 20% gate) spanning ln(1 / ∆ θ ) ∈ [0 , 1 . 5] ∪ [2 , 3]. The Lund plane densit y is corrected to the charged-particle lev el using Iterativ e Ba y esian Unfolding with 5 iterations, v alidated b y closure tests, stress tests, noisy closure tests, and cross-chec ked with truncated SVD. The key n umerical results for the angular spectrum are: 124 0 1 2 3 4 5 l n ( 1 / ) 0 1 2 3 4 5 l n ( 1 / ) p s = 9 1 . 2 G e V ALEPH Simulation -1.00 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00 Correlation Figure 57: Correlation matrix for the ln(1 / ∆ θ ) pro jection (full 10 bins). Strong p ositiv e correlations betw een neigh b oring bins are driv en by the prior-dep endence systematic, which shifts all bins coherently . Anti- correlations b etw een bins 0–2 and bins 4–5 reflect the competing effects of the reweigh ting systematics, whic h tilt the distribution in opp osite directions. 125 -3 -2 -1 0 1 2 3 l n ( k t / G e V ) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) p s = 9 1 . 2 G e V ALEPH Simulation -1.00 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00 Correlation Figure 58: Correlation matrix for the ln( k t ) pro jection (full 12 bins). Similar correlation structure to the angular pro jection: p ositive correlations b et ween nearby bins and anti-correlations b etw een the low- k t and high- k t regions, reflecting the coheren t effect of the dominant prior-dep endence systematic. 126 1 2 3 4 l n ( 1 / ) 0.0 0.2 0.4 0.6 0.8 1.0 ( ) [ p e r h e m i s p h e r e p e r u n i t b i n w i d t h ] p s = 9 1 . 2 G e V ALEPH Simulation Total Statistical Systematic Figure 59: Uncertain ty decomp osition for the ln(1 / ∆ θ ) pro jection showing total (blac k), statistical (blue), and systematic (red) uncertain ties p er bin. The systematic uncertaint y dominates b y a factor > 80 in all bins, reflecting the large prior-dep endence and hadronization systematics. 127 -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 ( ) [ p e r h e m i s p h e r e p e r u n i t b i n w i d t h ] p s = 9 1 . 2 G e V ALEPH Simulation Total Statistical Systematic Figure 60: Uncertaint y decomp osition for the ln( k t ) pro jection showing total, statistical, and systematic p er bin. The systematic dominance is even more pronounced at the kinematic edges where the density is small and the prior dep endence is strongest. 128 Bin 0 Bin 1 Bin 2 Bin 4 Bin 5 Bin 0 Bin 1 Bin 2 Bin 4 Bin 5 p s = 9 1 . 2 G e V ALEPH -1.00 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00 Correlation Figure 61: Correlation matrix for surviving ln(1 / ∆ θ ) bins on full data. The strong p ositiv e correlation b et w een bins 0 and 2 and the anti-correlation betw een bins 0 and 4 reflect the coherent systematic shifts from prior dep endence. 129 Bin 2 Bin 6 Bin 2 Bin 6 p s = 9 1 . 2 G e V ALEPH -1.00 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00 Correlation Figure 62: Correlation matrix for surviving ln( k t ) bins on full data. The tw o bins show negative correlation (-0.77), reflecting the opp osite-sign shifts from the prior-dep endence systematic (whic h tilts the k t sp ectrum). 130 Bin Range ρ Stat Syst 0 [0.0, 0.5] 1.583 ± 0.001 ± 0.199 1 [0.5, 1.0] 1.666 ± 0.001 ± 0.091 2 [1.0, 1.5] 1.750 ± 0.001 ± 0.303 4 [2.0, 2.5] 1.368 ± 0.001 ± 0.202 5 [2.5, 3.0] 0.894 ± 0.001 ± 0.163 The supplementary ln( k t ) pro jection retains only 2 of 12 bins: ρ = 1 . 337 ± 0 . 001 (stat) ± 0 . 145 (syst) at ln k t ∈ [ − 2 , − 1 . 5] and ρ = 1 . 365 ± 0 . 001 (stat) ± 0 . 290 (syst) at ln k t ∈ [0 , 0 . 5]. The full data agrees with the MC exp ected result: χ 2 / ndf = 1 . 30 (p = 0.26) for ln(1 / ∆ θ ) and χ 2 / ndf = 0 . 003 (p = 1.00) for ln( k t ). The only av ailable generator comparison (PYTHIA 6) is the same mo del used to construct the resp onse matrix; this measurement therefore cannot curren tly discriminate b et ween QCD mo dels and serves primarily as a first result and pro of-of-concept for the metho d. The full data is consistent with the 10% v alidation result ( χ 2 / ndf = 0 . 99, p = 0.42 for ln(1 / ∆ θ )). No anomalous pulls are observed (maxim um 1.48 σ ). The mean num b er of primary declusterings per hemisphere ab o v e v arious k t thresholds ranges from 2.54 p er hemisphere ab o ve k t = 0 . 5 GeV to 0.13 ab o ve k t = 5 GeV, at detector level. B.11.2 Dominan t limitations 1. Prior dep endence: The IBU unfolding with 5–6 iterations at 31–40% diagonal fraction retains significan t prior sensitivity . Only 5 + 2 = 7 of 22 bins survive the flat-prior gate. Prior dep endence accoun ts for 87–93% of the total systematic in surviving bins. This is the fundamental limitation of this measurement. 2. No alternativ e generator [L2]: The hadronization systematic is estimated through particle-lev el rew eighting rather than independent detector sim ulation with HER WIG or ARIADNE. This is the second-largest systematic (38–47% of the non-prior total) and cannot b e fully assessed without alter- nativ e MC. 3. MC statistics [L1]: The MC/data ratio of 0.25 limits the statistical precision of the resp onse matrix, though the propagated MC statistical uncertaint y is negligible ( < 1% of total) thanks to the relatively coarse binning. B.11.3 Outlo ok 1. Impro v ed unfolding: Alternative approaches suc h as OmniF old (mac hine-learning-based unfolding) or increased IBU iterations with a data-driven prior could recov er additional bins in the Lund plane. The data-driv en prior (using the reco-level data distribution to initialize IBU) w ould reduce the prior- dep endence systematic by starting from a prior closer to the truth. 2. Alternativ e generators: Particle-lev el predictions from PYTHIA 8 (Monash tune), HER WIG 7.3 (cluster fragmentation), and analytic NLL calculations ([ ref-Dreyer:2018n bf ]) should b e compared to the unfolded result. These comparisons w ould provide physics in terpretation b ey ond the PYTHIA 6 baseline. 3. Expanded k t pro jection: The ln( k t ) pro jection retains only tw o bins. Expanding this measurement w ould require either more iterations (at the cost of increased noise), coarser binning, or a fundamentally differen t unfolding approach. 4. Tw o-dimensional result: The 2D Lund plane density at coarser binning (e.g., 5 × 6) could poten- tially b e unfolded with acceptable conditioning. This woul d pro vide the full tw o-dimensional structure that is the theoretically most interesting represen tation. 131 5. Other e + e − datasets: Similar measuremen ts could b e p erformed with archiv ed data from other LEP exp erimen ts (DELPHI, L3, OP AL) or from future e + e − colliders (FCC-ee, CEPC, ILC) where v astly larger datasets and mo dern detectors w ould o vercome the limitations encountered here. 132 B.12 App endices B.12.1 App endix A: Limitation index T able 58: Limitation index collecting all assumptions [A], limita- tions [L], and design decisions [D] in tro duced throughout the anal- ysis. Each en try lists where it was in tro duced, its impact on the final result, and the mitigation applied. Lab el Description Phase Impact Mitigation A1 pwflag == 0 identifies c harged trac ks 1 Low V alidated b y prior analysis A2 Thrust axis proxies quark axis 1 Low Standard at LEP A3 PYTHIA 6 MC describ es detector 1 Mo derate Data/MC within 5% A4 C/A clustering is IRC safe 1 None By construction L1 MC/data ratio 0.25 1 Mo derate Bo otstrap propagation L2 No alternative generator MC 1 High Reweigh ting approx. L3 F rozen reconstruction 1 Low Arc hiv al constraint L4 Charged particles only 1 Low Design choice L5 2D unfolding unreliable 3 High 1D pro jections B1 No explicit TPC hit cut 3 Lo w highPurity flag B2 Index-based matc hing 3 Mo derate Systematic study D1 IBU primary method 1 – SVD cross-chec k D2 SVD alternative metho d 1 – Implemen ted D3 Normalized measurement 1 – Lumi cancels D7 1D primary , 2D secondary 3 – Implemented B.12.2 App endix B: Per-bin systematic tables ln(1 / ∆ θ ) : all systematic shifts (densit y units) Bin Pri Had Sel T rk Ang Neu ISR HF Alt MC Bkg Reg T ot 0 .190 .065 .032 .016 .010 .016 .008 .008 .002 .002 .002 .000 .199 1 .054 .069 .035 .017 .011 .017 .009 .009 .004 .002 .002 .000 .091 2 .283 .108 .036 .018 .013 .018 .009 .009 .005 .002 .002 .000 .303 3 .361 .082 .034 .017 .011 .017 .008 .008 .002 .002 .002 .000 .370 4 .196 .044 .027 .013 .009 .013 .007 .007 .003 .001 .001 .000 .202 5 .113 .114 .017 .009 .006 .009 .004 .004 .001 .001 .001 .000 .163 6 .348 .120 .009 .004 .003 .004 .002 .002 .001 .001 .000 .000 .369 7 .452 .079 .004 .002 .001 .002 .001 .001 .000 .000 .000 .000 .462 8 .505 .058 .002 .001 .001 .001 .000 .000 .000 .000 .000 .000 .511 9 .987 .035 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000 .990 : Per-bin systematic shifts for all 10 ln(1 / ∆ θ ) bins, in density units. Bins 3, 6–9 are excluded by the flat-prior gate. Column abbreviations: Pri = prior dep endence, Had = hadronization, Sel = selection cuts, T rk = trac king efficiency , Ang = angular resolution, Neu = neutral mo deling, ISR = ISR treatment, HF = heavy fla vor, Alt = alternative metho d, MC = MC statistical, Bkg = background, Reg = regularization, T ot = total. { #tbl:syst lntheta } ln( k t ) : all systematic shifts (densit y units) 133 Bin Pri Had Sel T rk Ang Neu ISR HF Alt MC Bkg Reg T ot 0 .967 .121 .008 .012 .004 .004 .002 .002 .000 .001 .000 .000 .975 1 .595 .134 .015 .021 .008 .008 .004 .004 .000 .002 .001 .000 .611 2 .074 .116 .026 .033 .013 .013 .007 .007 .000 .002 .001 .000 .145 3 .559 .061 .040 .044 .020 .020 .010 .010 .000 .003 .002 .000 .567 4 .962 .034 .048 .046 .024 .024 .012 .012 .000 .002 .002 .000 .965 5 .802 .127 .042 .034 .021 .021 .011 .011 .000 .003 .002 .000 .815 6 .254 .135 .028 .019 .014 .014 .007 .007 .000 .002 .001 .000 .290 7 .211 .076 .014 .008 .007 .007 .004 .004 .000 .002 .001 .000 .225 8 .392 .017 .007 .003 .004 .004 .002 .002 .000 .001 .000 .000 .392 9 .401 .022 .004 .001 .002 .002 .001 .001 .000 .001 .000 .000 .402 10 .503 .057 .003 .000 .001 .001 .001 .001 .000 .001 .000 .000 .506 11 1.565 .112 .001 .000 .000 .000 .000 .000 .000 .001 .000 .000 1.569 : P er-bin systematic shifts for all 12 ln( k t ) bins, in density units. Bins 0, 1, 3–5, 7–11 are excluded b y the flat-prior gate. Column abbreviations as in tbl. ?? . { #tbl:syst lnkt } B.12.3 App endix C: Cov ariance matrices ln(1 / ∆ θ ) statistical co v ariance (survivi ng bins) T able 59: Statistical co v ariance matrix for the 5 surviving ln(1 / ∆ θ ) bins. The off-diagonal elements are negativ e, reflecting the anti- correlation b et ween neighboring bins introduced by the unfolding regularization. All elemen ts are of order 10 − 6 , confirming the neg- ligible statistical contribution to the total uncertain ty . Bin 0 Bin 1 Bin 2 Bin 4 Bin 5 Bin 0 3.53e-06 -1.39e-06 -9.44e-07 -3.26e-07 4.92e-08 Bin 1 -1.39e-06 4.33e-06 -6.97e-07 -9.10e-07 -1.20e-06 Bin 2 -9.44e-07 -6.97e-07 4.21e-06 -1.18e-06 -9.06e-07 Bin 4 -3.26e-07 -9.10e-07 -1.18e-06 5.08e-06 -5.66e-07 Bin 5 4.92e-08 -1.20e-06 -9.06e-07 -5.66e-07 4.81e-06 ln(1 / ∆ θ ) total co v ariance (surviving bins) T able 60: T otal cov ariance matrix for the 5 surviving ln(1 / ∆ θ ) bins. The total cov ariance is dominated b y the systematic comp o- nen ts, with the statistical con tribution at the 10 − 6 lev el compared to systematic entries at the 10 − 2 lev el. Both matrices are p ositive semi-definite with no negativ e eigen v alues. Bin 0 Bin 1 Bin 2 Bin 4 Bin 5 Bin 0 0.03969 -0.01120 -0.05649 -0.03510 0.02725 Bin 1 -0.01120 0.00835 0.02291 0.01019 -0.01209 Bin 2 -0.05649 0.02291 0.09165 0.05445 -0.04217 Bin 4 -0.03510 0.01019 0.05445 0.04068 -0.01816 Bin 5 0.02725 -0.01209 -0.04217 -0.01816 0.02658 ln( k t ) statistical cov ariance (surviving bins) 134 T able 61: Statistical cov ariance matrix for the 2 surviving ln( k t ) bins. Bin 2 Bin 6 Bin 2 4.84e-06 -6.33e-07 Bin 6 -6.33e-07 4.17e-06 ln( k t ) total cov ariance (surviving bins) T able 62: T otal co v ariance matrix for the 2 surviving ln( k t ) bins. The negative off-diagonal element indicates anti-correlation: the prior dependence shifts the tw o surviving bins in opp osite direc- tions. Both matrices are p ositiv e semi-definite. Bin 2 Bin 6 Bin 2 0.02108 -0.03258 Bin 6 -0.03258 0.08427 B.12.4 App endix D: Machine-readable results All results are provided in machine-readable formats in the results/ directory (also a v ailable at phase4 inference/exec/results/ ): T able 63: Mac hine-readable result files. The NPZ archiv es con tain NumPy arra ys with densities, uncertain ties, cov ariance matrices, c hi-squared v alues, pulls, and systematic shift v ectors. The CSV files contain the primary results in a format suitable for direct comparison with theory predictions. File Con tents lntheta results.csv P er-bin densit y , stat/syst/total uncertainties lnkt results.csv Per-bin density , stat/syst/total uncertain ties declusterings vs kt.csv Mean declusterings vs k t threshold full data results.npz All n umerical results (NumPy archiv e) covariance lntheta full.npz Cov ariance matrices (stat, p er-syst, total) covariance lnkt full.npz Co v ariance matrices (stat, p er-syst, total) Additional intermediate results from earlier analysis phases: File Lo cation Conten ts expected results.npz Phase 4 exec MC exp ected densities systematics.npz Phase 4 exec P er-source shift vectors fix review.npz Phase 4 exec Flat-prior gate, noisy closure data 10pct results.npz Phase 4 exec 10% v alidation results B.12.5 App endix E: Input v alidation Data/MC comparisons for all kinematic v ariables entering the observ able calculation, using the full ALEPH dataset. These comparisons v alidate the MC mo del used for response matrix construction. 135 Axis 0 0.00 0.25 0.50 0.75 1.00 1.25 1.50 1.75 Declusterings 1e6 p s = 9 1 . 2 G e V ALEPH PYTHIA 6 MC ALEPH Data 0 1 2 3 4 5 l n ( 1 / ) 0.9 1.0 1.1 Data / MC Figure 63: Detector-lev el ln(1 / ∆ θ ) data/MC comparison from Phase 3 using the full ALEPH dataset. The angular coordinate of the Lund plane sho ws data/MC agreemen t within 5% for ln(1 / ∆ θ ) < 4. A t the highest v alues (smallest angles), larger deviations are observed but these corresp ond to bins excluded b y the flat- prior gate. 136 Axis 0 0.0 0.5 1.0 1.5 2.0 2.5 Declusterings 1e6 p s = 9 1 . 2 G e V ALEPH PYTHIA 6 MC ALEPH Data -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 0.9 1.0 1.1 Data / MC Figure 64: Detector-level ln( k t ) data/MC comparison from Phase 3 using the full ALEPH dataset. The transv erse momentum coordinate shows agreemen t within 5% across the full range − 3 < ln k t < 3, with the b est agreemen t in the central region where the Lund plane density is highest. 137 B.12.6 App endix F: Resp onse matrix details This app endix pro vides additional details on the tw o-dimensional resp onse matrix, including the full 79-bin matrix visualization, the per-bin diagonal fractions, and the singular v alue sp ectrum that determines the feasibilit y of matrix inv ersion metho ds. 0 20 40 60 Gen bin index 0 10 20 30 40 50 60 70 Reco bin index p s = 9 1 . 2 G e V ALEPH Simulation 0.0 0.1 0.2 0.3 0.4 0.5 0.6 P(reco bin | gen bin) Figure 65: Two-dimensional response matrix for the 79 active Lund plane bins. The matrix is display ed with gen-lev el bins on the x-axis and reco-level bins on the y-axis, column-normalized so eac h column sums to 1. Significant off-diagonal structure is visible, reflecting bin migration from tracking resolution. The global diagonal fraction of 22% makes 2D unfolding impractical. B.12.7 App endix G: Relative systematic uncertain ties This app endix presen ts the systematic uncertain ties as percentages of the nominal density , complementing the absolute shifts sho wn in the main text and App endix B. 138 0 20 40 60 80 Active bin index 0.0 0.1 0.2 0.3 0.4 0.5 Per-bin diagonal fraction p s = 9 1 . 2 G e V ALEPH Simulation Overall = 0.216 Figure 66: Per-bin diagonal fraction across the 79 activ e Lund plane bins. Most bins hav e diagonal fractions b et w een 10% and 40%, with the highest v alues in the cen tral region where angular and momen tum resolution are b est relative to bin width. No bin exceeds 50% diagonal fraction. 139 0 20 40 60 80 Singular value index 1 0 2 1 0 1 1 0 0 Singular value p s = 9 1 . 2 G e V ALEPH Simulation Truncation at 28 Figure 67: Singular v alue spectrum of the t wo-dimensional response matrix. The singular v alues span o v er 10 orders of magnitude, from O (1) for the dominan t modes to O (10 − 10 ) for the most suppressed. This extreme range corresp onds to the condition n umber of 2 . 75 × 10 18 and explains why matrix inv ersion metho ds (SVD) are unusable for the 2D unfolding. 140 1 2 3 4 l n ( 1 / ) 0 500 1000 1500 2000 2500 3000 Relative systematic uncertainty [%] p s = 9 1 . 2 G e V ALEPH Simulation regularization prior dependence alternative method mc statistical hadronization tracking efficiency Total systematic Figure 68: Relative systematic uncertain ties for the ln(1 / ∆ θ ) pro jection, expressed as p ercen tage of the nominal density . The total relativ e uncertain ty ranges from appro ximately 5% in the b est-constrained bin (bin 1, [0.5, 1.0]) to ov er 3000% in the last bin ([4.5, 5.0]). The bins surviving the flat-prior gate hav e relativ e uncertain ties of 5–17%. 141 -3 -2 -1 0 1 2 3 l n ( k t / G e V ) 0 500 1000 1500 2000 2500 3000 Relative systematic uncertainty [%] p s = 9 1 . 2 G e V ALEPH Simulation regularization prior dependence alternative method mc statistical hadronization tracking efficiency Total systematic Figure 69: Relativ e systematic uncertainties for the ln( k t ) pro jection. The relative uncertain ty is largest at the kinematic extremes where the densit y is small and prior dependence is strongest. The t wo surviving bins ha ve relativ e uncertain ties of 11% and 21%. 142 B.12.8 App endix H: Cov ariance eigen v alue sp ectra This appendix sho ws the eigen v alue sp ectra of the total co v ariance matrices, whic h c haracterize the n umerical conditioning and the dominan t uncertaint y mo des. B.12.9 App endix I: Phase 2 exploration distributions The follo wing distributions w ere pro duced during Phase 2 (data exploration) using 50,000 data and MC ev ents after selection. They pro vide additional v alidation of the data qualit y and MC mo deling. References [] ALEPH Collab oration. n.d.-a. “A study of energy-energy correlations in charged tracks from hadronic deca ys of the Z0.” INSPIRE . https://inspirehep.net/literature/232620 . [] ALEPH Collab oration. n.d.-b. “Determination of differences betw een quark and gluon jets in 3-jet ev en ts.” INSPIRE . https://inspirehep.net/literature/433342 . [] A TLAS Collaboration. 2020. “Measurement of the Lund Jet Plane Using Charged Particles in 13 T eV Proton-Proton Collisions with the A TLAS Detector.” Phys. R ev. L ett. 124: 222002. https://doi. org/10.1103/PhysRevLett.124.222002 . [] Baumgart, S. 2023. “Search for the Pro duction of Quark-Gluon Plasma in e+e- Collisions at sqrt(s) = 91.2 GeV with Arc hived ALEPH LEP1 Data.” PhD thesis, Massac husetts Institute of T ec hnology . https://cds.cern.ch/record/2876991 . [] Bethke, S. et al. 2004. QCD studies and determination of alpha s using LEP data . abs/hep- ex/0411006 . [] Cacciari, Matteo, Ga vin P . Salam, and Gregory Soy ez. 2012. “F astJet User Man ual.” Eur. Phys. J. C 72: 1896. https://doi.org/10.1140/epjc/s10052- 012- 1896- 2 . [] D’Agostini, G. 1995. “A m ultidimensional unfolding metho d based on Bay es’ theorem.” Nucl. Instrum. Meth. A 362: 487–98. https://doi.org/10.1016/0168- 9002(95)00274- X . [] Dokshitzer, Y uri L., G. D. Leder, S. Moretti, and B. R. W ebb er. 1997. “Better Jet Clustering Algorithms.” JHEP 08: 001. https://doi.org/10.1088/1126- 6708/1997/08/001 . [] Dreyer, F rederic A., Gavin P . Salam, and Gregory So yez. 2018. “The Lund Jet Plane.” JHEP 12: 064. https://doi.org/10.1007/JHEP12(2018)064 . [] Nav as, S. et al. 2024. “Review of P article Physics.” Phys. R ev. D 110: 030001. https://doi.org/10. 1103/PhysRevD.110.030001 . 143 2 4 6 8 10 Eigenvalue index 1 0 6 1 0 5 1 0 4 1 0 3 1 0 2 1 0 1 1 0 0 Eigenvalue p s = 9 1 . 2 G e V ALEPH Simulation Figure 70: Eigen v alue sp ectrum of the total cov ariance matrix for the ln(1 / ∆ θ ) pro jection (10 bins). The eigen v alues span 6 orders of magnitude, from 2 × 10 − 6 to 1.9. The large eigenv alue hierarch y reflects the dominance of the prior-dep endence systematic, which in tro duces a single large eigenmo de (coheren t shift of all bins) with m uch smaller eigenv alues for the residual mo des. 144 2 4 6 8 10 12 Eigenvalue index 1 0 5 1 0 4 1 0 3 1 0 2 1 0 1 1 0 0 1 0 1 Eigenvalue p s = 9 1 . 2 G e V ALEPH Simulation Figure 71: Eigenv alue sp ectrum of the total cov ariance matrix for the ln( k t ) pro jection (12 bins). Similar 6-order-of-magnitude span as the angular pro jection, with all eigen v alues strictly positive (PSD verified). 145 Axis 0 0 500 1000 1500 2000 2500 3000 3500 Events p s = 9 1 . 2 G e V ALEPH PYTHIA 6 MC ALEPH Data 0 10 20 30 40 50 N H P c h p e r e v e n t 0.8 1.0 1.2 Data / MC Figure 72: Charged m ultiplicity distribution for data and MC. The distribution of the num ber of high-purity c harged trac ks p er even t p eaks near 17–18, with a tail extending to appro ximately 40. Go o d data/MC agreemen t v alidates the track selection and particle multiplicit y mo deling in the MC. 146 0 1 2 3 4 5 l n ( 1 / ) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) p s = 9 1 . 2 G e V ALEPH l n k t + l n ( 1 / ) = l n ( E b e a m ) 0.000 0.025 0.050 0.075 0.100 0.125 0.150 0.175 0.200 Figure 73: Lund plane densit y measured in 10,000 ALEPH data even ts from Phase 2 exploration. This early lo ok at the Lund plane confirmed the exp ected triangular structure and QCD radiation pattern b efore full-statistics pro cessing. 147 -2 -1 0 1 2 l n ( 1 / ) 0 2000 4000 6000 8000 10000 Declusterings Std: 0.802 p s = 9 1 . 2 G e V ALEPH Mean: 0.285 Figure 74: Resolution study: distribution of reco minus gen for ln(1 / ∆ θ ). The distribution has mean +0.285 and standard deviation 0.802, showing a systematic p ositiv e bias (reco angles are smaller than gen angles, as exp ected from tracking resolution smearing tow ard collinear configurations). The resolution is comparable to the bin width of 0.5. 148 -2 -1 0 1 2 l n ( k t / G e V ) 0 2000 4000 6000 8000 Declusterings Std: 1.304 p s = 9 1 . 2 G e V ALEPH Mean: -0.010 Figure 75: Resolution study: distribution of reco minus gen for ln( k t ). The distribution is nearly unbiased (mean -0.010) but broad (standard deviation 1.304), exceeding tw o bin widths. This large k t resolution driv es the higher off-diagonal migration in the k t resp onse matrix compared to the angular pro jection. 149 0 1 2 3 4 5 l n ( 1 / ) -3 -2 -1 0 1 2 3 l n ( k t / G e V ) p s = 9 1 . 2 G e V ALEPH l n k t + l n ( 1 / ) = l n ( E b e a m ) 0 500 1000 1500 2000 2500 3000 Figure 76: Bin p opulation heatmap for the proposed 10 x 12 binning of the tw o-dimensional Lund plane. Of 120 total bins, approximately 63 hav e adequate population ( > 50 even ts in 10,000 ev ent exploration sample). The empty bins in the upp er-right corner corresp ond to the kinematic b oundary region. 150 C Num b er of Ligh t Neutrino Generations from the Z In visible Width with DELPHI Data C.1 In tro duction C.1.1 Ph ysics motiv ation The n umber of ligh t neutrino generations N ν is a fundamental parameter of the Standard Model (SM) of particle ph ysics. At the CERN Large Electron–Positron Collider (LEP), this quan tity w as determined with per-mille precision b y exploiting the Z b oson resonance. The total width of the Z b oson Γ Z receiv es con tributions from all kinematically access ible deca y channels: Γ Z = Γ had + 3 Γ ℓ + N ν Γ SM ν ¯ ν , where Γ had is the hadronic partial width, Γ ℓ is the leptonic partial width (assuming lepton universalit y), and Γ SM ν ¯ ν = 167 . 157 ± 0 . 002 MeV is the SM prediction for the partial width into a single neutrino sp ecies, calculated including electrow eak radiative corrections at tw o-lo op accuracy . The invisible width of the Z b oson is defined as Γ inv = Γ Z − Γ had − 3 Γ ℓ , and attributed entirely to Z → ν ¯ ν deca ys in the SM, yielding N ν = Γ inv Γ SM ν ¯ ν . The measurement of N ν w as one of the primary ph ysics goals of the LEP programme. The first deter- minations in 1989 by all four LEP exp erimen ts (ALEPH, DELPHI, L3, OP AL) established that there are exactly three ligh t neutrino generations, ruling out a fourth generation with a ligh t neutrino ( m ν < M Z / 2). This result has profound implications for cosmology (Big Bang n ucleosynthesis), the structure of the SM fermion generations, and searc hes for physics beyond the SM. C.1.2 Prior measuremen ts The combined LEP measuremen t, incorp orating data from all four experiments and using the full 1989–1995 energy scan dataset, yields [1]: N LEP ν = 2 . 9840 ± 0 . 0082 , with a precision of 0.27%. The individual DELPHI result, using the 1989–1995 dataset, is [2]: N DELPHI ν = 3 . 00 ± 0 . 02 , consisten t with three generations at the 1% lev el. Earlier DELPHI measuremen ts from the 1989 data alone ga ve N ν = 3 . 08 ± 0 . 05 [3], demonstrating the rapid impro vemen t in precision as more data w ere accum ulated. C.1.3 Observ able definition The primary observ ables are the hadronic cross sections σ had ( √ s ) measured at discrete cen tre-of-mass energy p oin ts around the Z resonance. These are fit to a radiativ ely-corrected Breit–Wigner lineshap e to extract: • M Z — the Z b oson mass, • Γ Z — the Z b oson total width, • σ 0 had — the hadronic p eak cross section. 151 F rom these fitted parameters and the external constraint R ℓ = Γ had / Γ ℓ , the partial widths and N ν are deriv ed. The hadronic p eak cross section is related to the Z partial widths by σ 0 had = 12 π M 2 Z · Γ ee Γ had Γ 2 Z , whic h, com bined with @eq:gamma in v and the measured Γ Z , provides sensitivit y to N ν through the dep endence σ 0 had ∝ Γ − 2 Z . C.1.4 Scop e of this measuremen t This analysis uses DELPHI op en data from the 1992–1995 LEP1 energy scan programme, accessed through the CERN Op en Data portal and the sk elana F ortran framew ork from CVMFS. The data are in DELPHI’s proprietary binary format ( .al short DST files), requiring extraction to CSV using a custom F ortran program b efore analysis in Python. The measuremen t focuses on the hadronic c hannel, with the leptonic partial width constrained externally . Compared to the published DELPHI result, sev eral simplifications are made: (1) the data are com bined in to four energy p oin ts rather than treated per-fill, (2) the leptonic channel is not measured indep enden tly , and (3) the ISR treatmen t uses a O ( α 2 ) radiator rather than the full O ( α 3 ) calculation. These simplifications result in a statistical precision approximately four times worse than the published DELPHI result, but the cen tral v alue remains fully consistent. C.2 Data samples C.2.1 Exp erimen tal setup The DELPHI (DEtector with Lepton, Photon and Hadron Iden tification) detector operated at the LEP e + e − collider at CERN from 1989 to 2000. The detector pro vided nearly 4 π solid angle cov erage with a cen tral trac king system (TPC, inner and outer detectors), a superconducting solenoid (1.23 T), electromagnetic calorimeters (HPC in the barrel, EMF in the forw ard region), a hadron calorimeter (HAC), and muon c hambers. The Small-angle TIle Calorimeter (STIC), installed in 1994, provided luminosity measuremen ts from small-angle Bhabha scattering with < 0 . 1% systematic uncertaint y . F or earlier y ears (1992–1993), the SA T (Small Angle T agger) serv ed as the luminosity monitor. The data used in this analysis corresp ond to the LEP1 energy scan programme, during which LEP op erated at cen tre-of-mass energies near the Z resonance ( √ s ≈ 88–93 GeV). The scan strategy v aried by y ear: 1992 and 1994 collected data predominan tly at the Z peak ( √ s ≈ 91 . 3 GeV), while 1993 and 1995 included off-p eak running at √ s ≈ 89 . 5 GeV and √ s ≈ 93 . 0 GeV to constrain the Z width. C.2.2 Data samples DELPHI LEP1 data are stored in proprietary binary .al format (short DSTs) on CERN EOS at /eos/experiment/delphi/castor2015/tape/ . The data are organized in tape arc hive volumes accessed via the fatfind utility . The data w ere extracted using a custom skelana F ortran program that reads p er-ev en t v ariables and outputs CSV files for subsequent Python analysis. F our years of data (1992–1995) w ere processed: Y ear Nickname T ap e files Ev ents extracted Hadronic (IHAD4=1) Runs 1992 short92 e2 172 2,482,494 709,243 1,292 1993 short93 d2 181 2,766,587 711,658 1,116 1994 short94 c2 429 5,753,340 1,400,251 1,892 1995 short95 d2 246 3,661,916 677,507 906 T otal 1,028 14,664,337 3,498,659 5,206 152 The total extraction required appro ximately 2.5 hours across four parallel jobs, with a processing rate of appro ximately 15,000 even ts p er 10 seconds p er tap e file. The 1991 data ( short91 f1 ) was inv estigated but found to b e inaccessible from the op en data (empty vidmap and fatfind listings). The 1992–1995 data pro vide sufficien t off-peak constrain t from the 1993 and 1995 energy scans. C.2.3 Cen tre-of-mass energy scan p oints The cen tre-of-mass energy is recorded pe r-ev en t via the ECMAS v ariable in the DELPHI sk elana framework, whic h contains the calibrated b eam energy from the LEP energy mo del (incorp orating resonant dep olariza- tion measurements from 1993 onw ards). Y ear √ s [GeV] ECM range [GeV] T otal even ts Hadronic ev ents 1992 91.3 (peak) 91.250–91.364 2,482,494 709,243 1993 89.5 (off-p eak lo w) 89.478–89.494 665,018 98,562 1993 91.2–91.4 (peak) 91.114–91.372 1,438,052 478,364 1993 93.1 (off-peak high) 93.066–93.088 663,617 134,732 1994 91.3 (peak) 91.254–91.482 5,753,340 1,400,251 1995 88.6 (off-p eak far low) 88.562–88.611 4,878 675 1995 89.5 (off-p eak lo w) 89.476–89.500 864,916 84,517 1995 91.3 (peak) 91.280–91.408 1,871,935 460,511 1995 93.0 (off-peak high) 93.016–93.042 920,187 131,804 The fill-to-fill v ariations in b eam energy within each nominal scan p oint (up to ∼ 300 MeV range at the p eak) reflect the precision of the LEP energy calibration. The mean √ s at eac h nominal energy p oin t is used in the lineshap e fit. C.2.4 Mon te Carlo samples Mon te Carlo simulation samples used for v alidation and efficiency estimation are av ailable from the DELPHI op en data rep ository at /eos/opendata/delphi/simulated-data/cern/ : Generator Pro cess √ s [GeV] Even ts F ormat Pythia 5.720 Z → q ¯ q 91.2 approximately 320k .sdst z0 Inclusiv e Z 91.2 approximately 10k .sdst dim u Z → µ + µ − 91.2 approximately 5k .sdst bh wide101 Bhabha 91.25 v aries .xsdst The primary hadronic MC (Pythia 5.720 with v94c pro cessing) was v alidated: even ts ha ve run num ber − 1001, √ s = 91 . 200 GeV, all IHAD4=1, and mean N ch ∼ 22. The dimuon MC w as v alidated with N ch = 2 and E vis ∼ 91–104 GeV. No Mon te Carlo samples are av ailable at the off-p eak energies (89.5 or 93.0 GeV), and no tau pair MC is av ailable at the Z p eak (only at LEP2 energy 206.7 GeV). The hadronic selection efficiency is therefore tak en from published DELPHI v alues rather than determined from MC. C.2.5 Luminosit y The integrated luminosit y at each energy p oin t is essen tial for conv erting even t counts to cross sections. The luminosit y en ters the lineshap e fit as a constrained n uisance parameter. Published p er-y ear luminosities from DELPHI [2] and the LEP Electro weak W orking Group rep ort [1]: 153 88 89 90 91 92 93 94 95 p s [ G e V ] 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 1 0 6 1 0 7 Events / 0.07 GeV L E P 1 e + e DELPHI Open Data 1992 (2,482,494 events) 1993 (2,766,587 events) 1994 (5,753,340 events) 1995 (3,661,916 events) Figure 77: Cen tre-of-mass energy distribution for all four y ears of data sho wing the LEP1 energy scan p oin ts. The 1993 and 1995 data pro vide off-p eak measurements at √ s ≈ 89 . 5 and 93.0 GeV, while 1992 and 1994 data are concentrated at the Z p eak. The 1995 data also includes a small sample at √ s ≈ 88 . 6 GeV. 154 Y ear L [pb − 1 ] Luminosity monitor 1992 26.6 SA T 1993 31.4 SA T 1994 47.8 STIC 1995 35.9 STIC T otal 141.7 The STIC (1994–1995) ac hieved < 0 . 1% systematic uncertaint y on the luminosit y measurement, while the SA T (1992–1993) had ∼ 0 . 3–0 . 5% uncertain ty . The theoretical uncertain ty on the Bhabha scattering cross section used for luminosity determination is ∼ 0 . 061% (common to all LEP experiments). C.2.6 Data extraction metho d A sk elana F ortran extractor ( phase2 exploration/scripts/extractor.car ) was written to read short DST ev ents and output p er-ev ent summary v ariables to CSV format. The extractor accesses: • Ev en t information ( PSCEVT ): run num ber ( IIIRUN ), even t n umber ( IIIEVT ), centre-of-mass energy ( ECMAS ) • T rac k v ectors ( PSCVEC ): num ber of tracks ( NVECP ), 4-momenta and c harge ( VECP(1:7,I) ) • Calorimeter asso ciations ( PSCEMF , PSCHPC , PSCHAC , PSCSTC ): p er-trac k show er energies • Pilot record : hadronic selection flag ( IHAD4 ) from the DELPHI online ev ent classification Compilation uses the DELPHI softw are stack from CVMFS ( /cvmfs/delphi.cern.ch/ ): nypatc h y code managemen t, gfortran compilation, linked against DELPHI and CERN libraries. The p er-even t output CSV contains 15 v ariables: run, even t, √ s , N ch , c harged and total visible energy , 3-momen tum components, four calorimeter energy sums, IHAD4 flag, and num b er of neutral ob jects. C.3 Ev en t selection C.3.1 Ov erview Tw o selection channels are implemented: a hadronic channel ( Z → q ¯ q , primary) and a leptonic c hannel ( Z → ℓ + ℓ − , exploratory). The hadronic channel provides the even t coun ts used in the lineshap e fit. The leptonic c hannel was found to b e con taminated by Bhabha scattering at the summary-v ariable lev el and is not used for the primary measurement. C.3.2 Energy point classification Ev ents are classified in to nominal energy scan p oin ts using windo ws around known DELPHI LEP1 scan energies: Nominal √ s [GeV] Window [GeV] Y ears 88.48 88.3–88.7 1995 89.47 89.3–89.7 1993, 1995 91.22 90.5–91.8 1992, 1993, 1994, 1995 93.04 92.5–93.3 1993, 1995 All 14,664,344 even ts fall within one of the four energy windo ws; zero ev ents are unclassified. C.3.3 Hadronic selection criteria The hadronic selection follo ws the standard DELPHI criteria [2], with cuts chosen to select Z → q ¯ q even ts while rejecting leptonic decays, t w o-photon ev ents, b eam backgrounds, and cosmic ra ys. The selection is applied sequentially: 155 Cut Criterion Motiv ation 1 N ch ≥ 5 Rejects leptonic ( N ch ≤ 6), Bhabha ( N ch = 2), STIC/luminosity ( N ch = 0), and t wo-photon ev ents 2 E vis / √ s > 0 . 12 Rejects b eam-gas, b eam-w all, and t wo-photon ev ents with low visible energy 3 E vis / √ s < 1 . 5 Rejects ev ents with anomalous calorimeter dep osits or energy double-counting 4 | p z | /E vis < 0 . 6 Rejects p oorly measured even ts and b eam-gas in teractions 5 p T /E vis < 0 . 4 Rejects ev ents with large transverse momen tum im balance The E vis / √ s > 0 . 12 threshold is the standard DELPHI hadronic preselection threshold, v alidated b y the 96.3% reco very of IHAD4-tagged even ts. A tighter threshold of 0.5 (used in some DELPHI publications for refined selections) w ould reject a significant fraction of genuine hadronic even ts with ISR energy loss or detector effects. C.3.4 Hadronic cutflo w Cut 1992 1993 1994 1995 T otal Cum ul. eff. Per-cut eff. All ev ents 2,482,495 2,766,589 5,753,343 3,661,917 14,664,344 100% — N ch ≥ 5 1,019,278 953,957 1,736,310 890,428 4,599,973 31.4% 31.4% E vis / √ s > 0 . 12 868,435 854,149 1,651,335 826,481 4,200,400 28.6% 91.3% E vis / √ s < 1 . 5 841,806 823,575 1,596,681 800,305 4,062,367 27.7% 96.7% | p z | /E vis < 0 . 6 788,479 781,553 1,512,497 745,346 3,827,875 26.1% 94.2% p T /E vis < 0 . 4 768,347 770,763 1,492,292 736,634 3,768,036 25.7% 98.4% The N ch ≥ 5 cut is the most discriminating, removing 69% of all even ts (predominantly STIC/luminosit y monitor even ts, Bhabha scattering, and leptonic decays). Subsequen t energy and momentum balance cuts remo ve an additional ∼ 18% of the remaining sample. C.3.5 Cut motiv ation Charged trac k m ultiplicity The charged track m ultiplicity N ch pro vides the strongest discrimination b et w een hadronic Z decays and backgrounds. Hadronic even ts pro duce ∼ 20–40 charged tracks from quark fragmen tation, while leptonic even ts hav e N ch ≤ 6 and luminosity monitor even ts hav e N ch = 0. Visible energy The normalized visible energy E vis / √ s separates w ell-measured hadronic Z decays (p eak- ing near 1.0) from t wo-photon even ts and b eam backgrounds (which hav e muc h lo wer visible energy). The upp er cut at 1.5 remov es even ts with anomalous calorimeter dep osits. Momen tum balance The longitudinal and transverse momentum balance cuts reject po orly measured ev ents and b eam-gas interactions. F or well-measured e + e − → Z → q ¯ q ev en ts, the total momentum should b e close to zero (within detector resolution). 156 All events N c h 5 E v i s / p s > 0 . 1 2 E v i s / p s < 1 . 5 | p z | / E v i s < 0 . 6 p T / E v i s < 0 . 4 1 0 5 1 0 6 1 0 7 Events 14.66M 4.60M 4.20M 4.06M 3.83M 3.77M p s 9 1 . 2 G e V DELPHI Figure 78: Hadronic cutflow sho wing the n umber of even ts surviving each sequen tial cut p er y ear. The c harged m ultiplicity requirement is the most discriminating, remo ving luminosity monitor, Bhabha, and leptonic even ts. 157 0 20 40 60 N c h 1 0 0 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 1 0 6 1 0 7 Events Before selection DELPHI All events IHAD4=1 0 20 40 60 N c h 0 20000 40000 60000 80000 100000 120000 140000 Events After selection DELPHI Our selection IHAD4=1 Figure 79: Charged track multiplicit y distribution before (left) and after (right) hadronic selection. Before cuts, the distribution shows a dominant peak at N ch = 0 from STIC/luminosity ev ents and a broad hadronic distribution peaking at N ch ≈ 29. After selection, the distribution matc hes the IHAD4 reference for N ch ≥ 5. 0.0 0.5 1.0 1.5 2.0 E v i s / p s 1 0 0 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 Events B e f o r e E v i s c u t s DELPHI N c h 5 IHAD4=1 0.25 0.50 0.75 1.00 1.25 1.50 E v i s / p s 0 20000 40000 60000 80000 Events After selection DELPHI Our selection IHAD4=1 Figure 80: Normalized visible energy E vis / √ s distribution b efore (left, after N ch ≥ 5 preselection) and after (righ t) all hadronic cuts. Cut v alues at 0.12 and 1.5 are indicated. The selected distribution p eaks near 1.0 with tails from detector resolution and ISR. 158 0.0 0.2 0.4 0.6 0.8 1.0 | p z | / E v i s 1 0 2 1 0 3 1 0 4 1 0 5 Events | p z | b a l a n c e DELPHI Before cut After all cuts 0.0 0.2 0.4 0.6 0.8 p T / E v i s 1 0 3 1 0 4 1 0 5 Events p T b a l a n c e DELPHI Before cut After all cuts 0.0 0.2 0.4 0.6 0.8 1.0 | p z | / E v i s 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 p T / E v i s Before momentum cuts DELPHI 0.0 0.1 0.2 0.3 0.4 0.5 0.6 | p z | / E v i s 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 p T / E v i s After momentum cuts DELPHI Figure 81: Momentum balance distributions for hadronic even ts. T op left: | p z | /E vis with cut at 0.6. T op righ t: p T /E vis with cut at 0.4. Bottom: 2D distributions sho wing the correlation b et ween longitudinal and transv erse momen tum imbalance b efore (left) and after (righ t) cuts. 159 C.3.6 IHAD4 comparison and v alidation The DELPHI pilot record pro vides an online hadronic even t classification flag (IHAD4). Comparing our selection with this flag v alidates selection quality: Y ear Our selection IHAD4=1 Ov erlap Purity IHAD4 recov ery 1992 768,347 709,243 683,228 88.9% 96.3% 1993 770,763 711,660 685,073 88.9% 96.3% 1994 1,492,292 1,400,252 1,348,664 90.4% 96.3% 1995 736,634 677,507 653,467 88.7% 96.5% T otal 3,768,036 3,498,662 3,370,432 89.4% 96.3% Our selection recov ers 96.3% of IHAD4-tagged even ts, with the 3.7% loss attributable to even ts failing our energy or momen tum balance cuts. The ∼ 8% excess of our selection o v er IHAD4 arises b ecause w e cannot apply track qualit y cuts (minimum p T , trac k length, distance of closest approach) that are part of the full DELPHI pilot record classification — our selection operates only on even t-lev el summary v ariables. The remark able consistency of these fractions across all four years confirms stable detector performance. F or the cross-section calculation and lineshap e fit, we use the direct IHAD4 ev ent counts as the primary hadronic observ able, with the published DELPHI efficiency ϵ had = 0 . 974 ± 0 . 005. Our independent selection serv es as a cross-chec k. C.3.7 Selection efficiency The hadronic selection efficiency is estimated b y comparison with the DELPHI pilot record: • IHAD4 reco very : Our cuts pass 96.3% of IHAD4-tagged even ts. • Published DELPHI efficiency : ϵ had ≈ 97 . 4% for the full DELPHI hadronic selection. The published efficiency v alue is used for the primary measuremen t. The selection efficiency is assumed constan t across all energy p oin ts ( ϵ had = 0 . 974). The true efficiency v aries slightly with √ s due to different ev ent kinematics off-peak, but this v ariation is < 0 . 1% and is co vered b y the efficiency systematic uncertain t y . C.3.8 Ev ent yields p er energy p oin t IHAD4 ev ent counts are computed directly from p er-ev ent ihad4 flags classified b y energy p oin t. F or the lineshap e fit, counts are com bined across y ears at each nominal √ s : √ s nom [GeV] ⟨ √ s ⟩ [GeV] N IHAD4 √ N Stat. unc. 88.48 88.572 675 26 3.8% 89.47 89.485 183,079 428 0.23% 91.22 91.298 3,048,372 1,746 0.057% 93.04 93.050 266,536 516 0.19% T otal 3,498,662 The p er-y ear breakdown is: Y ear √ s nom [GeV] ⟨ √ s ⟩ [GeV] N IHAD4 1992 91.22 91.341 709,243 1993 89.47 89.486 98,562 1993 91.22 91.290 478,366 1993 93.04 93.077 134,732 1994 91.22 91.271 1,400,252 160 Y ear √ s nom [GeV] ⟨ √ s ⟩ [GeV] N IHAD4 1995 88.48 88.572 675 1995 89.47 89.484 84,517 1995 91.22 91.321 460,511 1995 93.04 93.024 131,804 C.3.9 Leptonic c hannel An inclusive leptonic selection targeting Z → e + e − and Z → µ + µ − ev ents was implemen ted with cuts on N ch = 2, E vis / √ s > 0 . 6, and |  p | /E vis < 0 . 2, yielding 355,945 ev ents. How ev er, the raw leptonic cross- section at the Z peak ( ∼ 2 . 9 n b) is approximately t wice the published σ µµ (1.473 nb), due to large-angle Bhabha scattering con tamination that cannot b e remo ved without per-track calorimeter matc hing or angular acceptance cuts. The leptonic c hannel therefore cannot provide a usable indep enden t cross-section measurement with the a v ailable summary-level data. The lineshape fit uses the published LEP combined v alue R ℓ = Γ had / Γ ℓ = 20 . 767 ± 0 . 025 as an external constrain t instead of measuring R ℓ indep enden tly . C.3.10 Bac kgrounds The main backgrounds to hadronic Z deca ys and their treatment: Bac kground Con tamination Rejection metho d τ + τ − < 0 . 1% N ch ≥ 5 remov es most; residual at N ch = 5–6 Tw o-photon ( γ γ → hadrons) ∼ 0 . 1% Low E vis and forward topology; rejected b y energy cut Beam-gas / b eam-wall Negligible Rejected by momen tum balance cuts Cosmic rays Negligible Rejected by v ertex and timing in IHAD4 The total bac kground contamination in the IHAD4-selected hadronic sample is estimated at < 0 . 3%, consisten t with published DELPHI v alues. The bac kground is treated as a ± 50% v ariation on a 0.1% baseline in the systematic uncertaint y ev aluation. C.4 Corrections C.4.1 ISR radiativ e corrections Z lineshap e parameterization The Born-level cross section for e + e − → Z → hadrons via Z exchange with the s -dep endent width con ven tion (running width) is: σ 0 BW ( s ) = σ 0 had · s Γ 2 Z ( s − M 2 Z ) 2 + s 2 Γ 2 Z / M 2 Z , where σ 0 had = (12 π / M 2 Z ) · Γ ee Γ had / Γ 2 Z is the hadronic p eak cross section. ISR con v olution The measured cross section is obtained b y conv olving the Born cross section with the initial-state radiation (ISR) radiator function: σ had ( s ) = Z 1 0 dz H ( z , s ) σ 0 BW ( s (1 − z )) , where z is the fractional energy loss to ISR photon emission and H ( z , s ) is the Kuraev–F adin exp onen- tiated radiator at O ( α 2 ) [4]: 161 Both Only ours Only IHAD4 Neither 1 0 4 1 0 5 1 0 6 1 0 7 Events 3.37M 0.40M 0.13M 10.77M Selection overlap DELPHI 1992 1993 1994 1995 80.0 82.5 85.0 87.5 90.0 92.5 95.0 97.5 100.0 Fraction [%] Per-year agreement DELPHI IHAD4=1 in our sel. Our sel. in IHAD4=1 0 20 40 60 N c h 1 0 0 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 Events N c h b y c a t e g o r y DELPHI Both Only ours Only IHAD4 0.0 0.5 1.0 1.5 2.0 E v i s / p s 1 0 0 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 Events E v i s / p s b y c a t e g o r y DELPHI Both Only ours Only IHAD4 Figure 82: IHAD4 comparison sho wing o verlap b et w een our selection and the DELPHI p ilot record. T op left: ev ent counts b y category . T op right: p er-y ear agreement fractions. Bottom: N ch and E vis / √ s distributions b y category . 162 88 89 90 91 92 93 94 p s [ G e V ] 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 1 0 6 Events Hadronic selection DELPHI 1992 1993 1994 1995 91.0 91.1 91.2 91.3 91.4 91.5 91.6 p s [ G e V ] 0 100000 200000 300000 400000 500000 600000 Events Peak region zoom DELPHI 1992 1993 1994 1995 Figure 83: Cen tre-of-mass energy distribution of selected hadronic ev ents per year (left, log scale). The 1993 and 1995 data sho w off-p eak scan p oints. Righ t: zo om on the Z p eak region sho wing fill-to-fill b eam energy v ariations. H ( z ) = β z β − 1 δ SV − β (1 − z / 2) + β 2 8  4(2 − z ) ln 1 z − 1 + 3(1 − z ) 2 z ln(1 − z ) − 6 + z  , with the ISR parameter β = 2 α π  ln s m 2 e − 1  ≈ 0 . 108 at the Z p ole , and the soft+virtual correction factor δ SV = 1 + 3 4 β + α π  π 2 3 − 1 2  ≈ 1 . 081 . The integrable singularity z β − 1 at z → 0 (soft photon limit) is handled by splitting the integral into a soft+virtual piece (regularized via the substitution t = z β ) and a smo oth hard piece. Both are ev aluated with Gauss–Legendre quadrature: 200 p oin ts for the soft piece and 100 for the hard piece. Conv ergence is v erified b y comparing 200 and 400 quadrature p oin ts, with a maximum relativ e difference of 7 . 6 × 10 − 7 . ISR v alidation The ISR-conv olved cross section is v alidated against the Born cross section at five energy p oin ts using PDG input parameters ( M Z = 91 . 1876 GeV, Γ Z = 2 . 4952 GeV, σ 0 had = 41 . 541 nb): √ s [GeV] σ Born [n b] σ ISR [n b] ISR/Born 88.48 7.17 5.27 0.736 89.47 14.35 10.27 0.716 91.22 41.48 30.76 0.742 92.97 13.66 14.38 1.053 163 √ s [GeV] σ Born [n b] σ ISR [n b] ISR/Born 93.04 12.97 13.85 1.068 The ISR reduces the p eak cross section b y ∼ 26%, consistent with the ZFITTER v alue rep orted by the LEP EWW G [1], whic h giv es an ISR correction of ∼ 25–26% at the Z peak. Abov e the Z pole, the ISR enhances the cross section via radiative return to the Z p eak, as exp ected. ISR treatment limitations The O ( α 2 ) Kuraev–F adin radiator used in this analysis is the next-to-leading order approximation. The published DELPHI and LEP combined results use the full O ( α 3 ) calculation from ZFITTER. The difference in tro duces a small bias of ∼ 5–7% in below-peak cross sections compared to the full calculation. This bias is absorb ed b y the fit parameters (Γ Z is pulled slightly lo wer) and ev aluated as a systematic uncertaint y ( δ N ν ∼ 0 . 001). γ / Z in terference estimate The γ / Z in terference and pure γ -exchange terms are not included in the Born cross section (@eq:b orn bw). T o quan tify their impact, the in terference cross section is computed at eac h energy p oin t using the standard electrow eak form ulae with sin 2 θ eff W = 0 . 23153 and the F ermi constant G F = 1 . 1664 × 10 − 5 GeV − 2 : √ s [GeV] σ int [n b] σ int /σ BW 88.57 − 0 . 030 − 0 . 42% 89.49 − 0 . 037 − 0 . 27% 91.30 +0 . 007 +0 . 02% 93.05 +0 . 034 +0 . 27% The in terference is constructiv e b elo w the Z peak and destructiv e abov e it, with the sign c hange o ccurring near √ s = M Z . The magnitude ranges from − 0 . 4% at the low est energy p oint to +0 . 3% at the highest. In the lineshape fit, the p er-energy-p oin t luminosities float freely , absorbing the interference contribution at each p oint. The shap e difference b et ween adjacent energy p oin ts (the quantit y that determines Γ Z ) is ∼ 0 . 3%, whic h is m uch smaller than the statistical precision of the off-peak measuremen ts (3 . 8% at 88.48 GeV, 0 . 23% at 89.47 GeV). The estimated impact on N ν is δ N ν < 0 . 001, w ell b elow the measuremen t sensitivity . This is confirmed b y the observ ation that the in terference corrections are nearly symmetric about the p eak, so their effect on Γ Z largely cancels. C.4.2 Beam energy spread The LEP b eam energy spread ( σ E ≈ 55 Me V p er b eam) broadens the effective centre-of-mass energy distribution. This is con volv ed with the ISR-corrected cross section using Gauss–Hermite quadrature with 20 p oin ts. The cen tre-of-mass energy spread is σ cms = √ 2 σ E = 77 . 8 MeV (from the independent p er-b eam G aussian con tributions). The con volution is p erformed via the standard Gauss–Hermite rescaling: σ obs ( √ s ) = 1 √ 2 π σ cms Z ∞ −∞ σ ISR ( √ s + ∆) e − ∆ 2 / (2 σ 2 cms ) d ∆ , with the substitution x = ∆ / ( √ 2 σ cms ) mapping to Gauss–Hermite quadrature nodes x i and w eights w i . The beam energy spread broadens the Z p eak and reduces its maximum by approximately 2%, with a corresp onding effect on Γ Z of ∼ 0 . 2 MeV. C.4.3 Efficiency correction The hadronic selection efficiency ϵ had = 0 . 974 ± 0 . 005 is applied as a m ultiplicative correction to the predicted ev ent counts at eac h energy p oin t. This v alue is the published DELPHI hadronic selection efficiency [2], v alidated b y our 96.3% recov ery of IHAD4-tagged ev ents. 164 The efficiency is assumed constant across all energy p oin ts. The energy dependence of the hadronic selection efficiency near the Z p ole is small (less than 0.1% ov er the ± 3 GeV scan range) because the even t top ology of hadronic Z deca ys changes slowly with √ s : the mean charged multiplicit y , thrust, and visible energy fraction of Z → q ¯ q even ts v ary b y less than 1% betw een √ s = 89 and 93 GeV, as the final state is dominated b y the fragmen tation of the primary q ¯ q pair at approximately √ s/ 2 p er jet regardless of the exact centre-of-mass energy . This assumption is cov ered by the efficiency systematic uncertaint y ( ± 0 . 5%). C.5 Systematic uncertainties The systematic uncertain ty budget is ev aluated b y v arying eac h source and re-running the lineshap e fit, measuring the shift in N ν . The total systematic uncertain ty is computed as the quadrature sum of individual con tributions. C.5.1 Summary of systematic impacts Source V ariation δ N ν LEP energy calibration ± 1 . 5 MeV on √ s < 0 . 001 Luminosit y normalization ± 0 . 5% on L total 0.001 Hadronic selection efficiency ± 0 . 5% on ϵ had 0.001 Bac kground subtraction ± 50% on 0.1% bac kground < 0 . 001 ISR treatment O ( α ) vs O ( α 2 ) 0.001 γ / Z interference Not included in Born < 0 . 001 R ℓ external input ± 0 . 025 on R ℓ 0.005 Beam energy spread ± 10% on σ E = 55 MeV 0.001 Γ SM ν ¯ ν ± 0 . 002 MeV < 0 . 001 T otal systematic 0.006 Statistical 0.078 The total systematic uncertain t y ( δN ν = 0 . 006) is more than an order of magnitude smaller than the statistical uncertaint y (0 . 078), confirming that the measurement is statistically limited. C.5.2 LEP energy calibration The LEP b eam energy w as calibrated using resonant depolarization measuremen ts (from 1993 onw ards) and the LEP energy mo del. The uncertaint y on the centre-of-mass energy at off-peak points is δ √ s ≈ ± 1 . 5 MeV [1]. The impact is ev aluated b y shifting all √ s v alues by ± 1 . 5 MeV and re-running the fit. The resulting shift δ N ν < 0 . 001 is negligible b ecause the energy shift is absorb ed by the M Z parameter and partially by the luminosity parameters. C.5.3 Luminosit y normalization The in tegrated luminosity is constrained in the fit with a total uncertain ty of ± 0 . 3%, com bining experimental and theoretical con tributions. The impact is ev aluated by v arying the total luminosity constraint b y ± 0 . 5% (conserv ativ e): δ N ν (lumi) = 0 . 001 . This is small b ecause a luminosity shift primarily affects σ 0 had , whic h en ters N ν only through the derived Γ had (prop ortional to √ σ 0 ). A 0.5% luminosit y shift translates to a 0.25% shift in Γ had and a correspondingly small shift in Γ inv . 165 The theoretical luminosity uncertaint y from the Bhabha scattering cross-section calculation ( ∼ 0 . 061%, common to all LEP experiments from the BHLUMI 4.04 program) is included within the ov erall luminosit y uncertain ty . C.5.4 Hadronic selection efficiency The hadronic selection efficiency ϵ had = 0 . 974 is v aried by ± 0 . 5% (co vering the published DELPHI uncer- tain ty range of 0.1–0.3%): δ N ν (eff.) = 0 . 001 . The efficiency en ters as a multiplicativ e factor on all predicted ev en t counts. A coherent shift in efficiency is absorb ed by the luminosit y parameters, resulting in a small net effect on the Z parameters and N ν . C.5.5 Bac kground subtraction The bac kground contamination in the IHAD4-selected hadronic sample is estimated at ∼ 0 . 1% (dominated b y τ + τ − and tw o-photon even ts). The impact is ev aluated b y v arying the background fraction by ± 50% and subtracting from the observed ev ent coun ts: δ N ν (bkg) < 0 . 001 . This is negligible b ecause the background fraction is very small and its v ariation is partially absorb ed b y the luminosity parameters. C.5.6 ISR treatmen t The ISR radiator is implemen ted at O ( α 2 ) (Kuraev–F adin exp onen tiated, including the full next-to-leading- order hard radiation terms). The published LEP results use O ( α 3 ) calculations from ZFITTER. The impact is ev aluated b y comparing fits using the O ( α 2 ) radiator (nominal) and the O ( α 1 ) radiator (whic h omits the O ( α 2 ) hard radiation corrections but retains the exp onen tiated soft+virtual structure): δ N ν (ISR) = 0 . 001 . This comparison brack ets the O ( α 1 ) → O ( α 2 ) correction, which changes the b elo w-p eak cross sections b y ∼ 1–2%. The next order correction, O ( α 2 ) → O ( α 3 ), is exp ected to b e approximately 0.05% based on the con vergence pattern rep orted by the LEP Electrow eak W orking Group [1], making the current systematic estimate conserv ativ e. The O ( α 2 ) radiator ov erpredicts the b elo w-p eak cross section by ∼ 5–7% relative to the full ZFITTER calculation (which includes additional higher-order terms beyond the O ( α 3 ) hard radiation). This bias is absorb ed b y the fit parameters (Γ Z is pulled lo wer by ∼ 30 MeV) and has a small net effect on N ν b ecause the biases in Γ Z and σ 0 had partially cancel in the N ν extraction. The γ / Z in terference terms (not included in the Born cross section) are ev aluated separately in @sec:in terference and contribute δN ν < 0 . 001, absorb ed b y the floating luminosity parameters. C.5.7 External R ℓ constrain t The ratio R ℓ = Γ had / Γ ℓ = 20 . 767 ± 0 . 025 is used as an external constrain t (from the LEP com bined measuremen t). This is the dominant systematic: δ N ν ( R ℓ ) = 0 . 005 . This contribution is dominant b ecause R ℓ en ters the extraction of Γ had from σ 0 had through the relation (@eq:sigma0), and thereb y directly affects Γ inv and N ν . The published LEP uncertaint y on R ℓ ( ± 0 . 025, or 0.12%) is v ery well constrained; the relatively large impact on N ν reflects the sensitivit y of the invisible width to the hadronic-to-leptonic width ratio. 166 C.5.8 Beam energy spread The b eam energy spread σ E = 55 MeV is v aried b y ± 10% and the fit is rep eated: δ N ν (b eam spread) = 0 . 001 . The beam energy spread broadens the effective Z p eak and has a small effect on the extracted Γ Z ( ∼ 0 . 2 MeV for a 10% v ariation in σ E ). C.5.9 SM prediction for Γ ν ¯ ν The SM prediction for the partial width in to a single neutrino sp ecies, Γ SM ν ¯ ν = 167 . 157 ± 0 . 002 MeV, is v aried b y ± 0 . 002 MeV: δ N ν (Γ SM ν ¯ ν ) < 0 . 001 . This is negligible b ecause N ν = Γ inv / Γ SM ν ¯ ν and the relative uncertain ty on Γ SM ν ¯ ν is ∼ 0 . 001%. C.5.10 Correlation structure The systematic sources ha ve differen t correlation structures across energy p oin ts: • LEP energy: Correlated across energy p oin ts within a fill and partially across fills (common calibra- tion pro cedure). • Luminosit y (exp erimen tal): Correlated within a year, partially correlated across y ears (detector upgrades from SA T to STIC). • Luminosit y (theory): F ully correlated across all p oin ts and y ears (common Bhabha QED calcula- tion). • Selection efficiency: Correlated across energy p oin ts (same detector), with small energy-dependent v ariation. • Bac kground: Largely uncorrelated b et ween energy p oin ts. In practice, the correlation structure has minimal impact on the result because the fit uses p er-energy- p oin t luminosity n uisance parameters that absorb most correlated v ariations. C.6 Cross-c hec ks C.6.1 P er-year consistency The in ternal consistency of the measuremen t across data-taking y ears is tested b y comparing the data-implied luminosit y p er year (from the combined-fit cross sections and observ ed ev ent counts) to published per-year luminosities: L year impl = X i ∈ year N obs ,i ϵ had · σ pred ,i · 10 3 Y ear L impl [pb − 1 ] L pub [pb − 1 ] Ratio 1992 24.0 26.6 0.900 1993 36.5 31.4 1.162 1994 47.2 47.8 0.987 1995 34.1 35.9 0.950 T otal 141.7 141.7 1.000 The global total luminosity ratio is exactly 1.000, confirming the luminosity constraint is functioning correctly . The p er-y ear v ariations (0.90–1.16) reflect the fact that the combined fit distributes luminosity 167 differen tly across energy p oin ts than the published year-lev el totals imply . The 1993 ratio of 1.16 indicates that the combined-fit cross sections at the 1993 energy p oin ts predict few er even ts than observ ed, requiring higher luminosity . These v ariations are within the exp ected range giv en that the com bined fit uses only 4 merged energy bins while the p er-y ear distribution of even ts v aries. The following table summarises the luminosit y comparison with the n umber of energy points p er year, clarifying that the discrepancies are consistent with the energy-p oint merging: Y ear L pub [pb − 1 ] L impl [pb − 1 ] Ratio N energy p oin ts 1992 26.6 24.0 0.900 1 (peak only) 1993 31.4 36.5 1.162 3 (scan) 1994 47.8 47.2 0.987 1 (peak only) 1995 35.9 34.1 0.950 4 (scan) T otal 141.7 141.7 1.000 9 The largest discrepancy (16.2%) o ccurs for 1993, whic h has 3 energy p oints merged in to 3 combined bins. The p er-y ear published luminosities corresp ond to the total across all fills, while the implied luminosities redistribute this total across the 4 com bined energy bins according to the fit. Y ears with multiple energy p oin ts (1993, 1995) show the largest deviations b ecause the merging redistributes ev ents differen tly than the original p er-fill accoun ting. 0.90 0.95 1.00 1.05 1.10 1.15 L i m p l i e d / L p u b l i s h e d 1992 1993 1994 1995 DELPHI G l o b a l L i m p l / L p u b = 1 . 0 0 0 1992 1993 1994 1995 Year 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 Luminosity fraction DELPHI Data-implied Published Figure 84: P er-year consistency chec k. Left: data-implied luminosities compared to published v alues. Right: luminosit y fractions showing the relativ e contribution of eac h year. The global ratio is 1.000; p er-y ear v ariations (0.90–1.16) are consistent with the combined fit. C.6.2 Fit without 88.48 GeV p oin t The low-statistics 88.48 GeV p oin t (675 ev ents, ∼ 4% stat. uncertaint y) has negligible impact on the fit: 168 P arameter With 88.48 GeV Without 88.48 GeV M Z [GeV] 91.191 91.190 Γ Z [GeV] 2.462 2.460 σ 0 had [n b] 41.25 41.22 N ν 2.986 2.986 The shift in N ν is < 0 . 001, confirming that this low-statistics point do es not bias the result. C.6.3 Fit stabilit y The fit is repeated with seven different sets of initial parameter v alues to test the robustness of the solution: Starting p oin t M Z [GeV] Γ Z [GeV] σ 0 had [n b] N ν Nominal (PDG) 91.190 2.462 41.25 2.986 Lo w M Z (90.5) 90.571 2.770 44.03 3.015 High M Z (92.0) 91.884 2.893 45.82 2.655 Lo w Γ Z (2.2) 91.183 2.303 42.45 2.635 High Γ Z (2.8) 91.194 2.633 40.16 3.360 Lo w σ 0 (38) 91.180 2.710 39.71 3.531 High σ 0 (45) 91.198 2.269 42.74 2.558 The fit has m ultiple lo cal minima because the 4 data p oin ts cannot fully constrain 7 parameters (3 Z parameters + 4 luminosities) sim ultaneously . The nominal starting point (near PDG v alues) con verges to the global minim um. Other starting points find lo cal minima with unph ysical parameter v alues (Γ Z deviating b y > 10%), confirming that the nominal solution is the physical one and that external constraints (luminosity , appro ximate Z parameter knowledge) are essential for this analysis. C.7 Statistical metho d C.7.1 Lik eliho o d definition The fit minimizes the Poisson negativ e log-likelihoo d (NLL): − ln L (  θ ) = 4 X i =1 h N pred ,i (  θ ) − N obs ,i ln N pred ,i (  θ ) i + 1 2  P i L i − L pub δ L pub  2 , where i indexes the four energy p oin ts,  θ = ( M Z , Γ Z , σ 0 had , L 1 , L 2 , L 3 , L 4 ) are the sev en fit parameters, and the second term is a Gaussian constraint on the total luminosity ( L pub = 141 . 7 pb − 1 , δ L pub = 0 . 3%). The predicted even t count at each energy p oin t is: N pred ,i = L i · ϵ had · σ had ( √ s i ; M Z , Γ Z , σ 0 had ) × 10 3 , where σ had is the ISR-conv olv ed and b eam-spread-con volv ed cross section (@eq:isr conv olution, @eq:b eam spread) and the factor 10 3 con verts pb − 1 × nb to ev ents. C.7.2 Fit parameters and external inputs Sev en free parameters: 1. M Z — Z b oson mass [GeV] 2. Γ Z — Z b oson total width [GeV] 3. σ 0 had — hadronic p eak cross section [n b] 4. L 1 – L 4 — p er-energy-point luminosities [pb − 1 ] 169 External inputs: Quan tity V alue Source ϵ had 0 . 974 ± 0 . 005 Published DELPHI R ℓ 20 . 767 ± 0 . 025 LEP combined Γ SM ν ¯ ν 167 . 157 ± 0 . 002 MeV EW tw o-loop Data points: F our combined IHAD4 even t counts at ⟨ √ s ⟩ = 88 . 57, 89.49, 91.30, 93.05 GeV. C.7.3 Minimization and error estimation The fit is p erformed using iminuit 2.x : • MIGRAD: Primary minimization using the v ariable-metric metho d with inexact line searc h. Con- v ergence is confirmed by the valid flag. • HESSE: F ull parameter cov ariance matrix from the inv erse Hessian at the minim um, pro viding sym- metric uncertainties and correlations. • MINOS: Asymmetric confidence interv als b y profiling the likelihoo d along each parameter direction. F or M Z , Γ Z , and σ 0 had , MINOS errors are symmetric, confirming parabolic likelihoo d profiles near the minim um. C.7.4 N ν extraction F rom the fitted parameters and the external constraint R ℓ = Γ had / Γ ℓ : Γ had = r σ 0 had · M 2 Z · Γ 2 Z · R ℓ 12 π , Γ ℓ = Γ had /R ℓ , Γ inv = Γ Z − Γ had − 3 Γ ℓ , N ν = Γ inv / Γ SM ν ¯ ν . C.7.5 Error propagation The statistical uncertaint y on N ν is obtained by propagating the fitted parameter co v ariance matrix through the extraction formulae: ( δ N ν ) 2 = X j,k ∂ N ν ∂ θ j · C j k · ∂ N ν ∂ θ k where the partial deriv ativ es (Jacobian) are: ∂ N ν ∂ M Z = − 0 . 129 GeV − 1 , ∂ N ν ∂ Γ Z = 1 . 213 GeV − 1 , ∂ N ν ∂ σ 0 had = − 0 . 142 nb − 1 . The dominan t sensitivity is to Γ Z , as expected: N ν is primarily determined by the total width through the invisible width Γ inv = Γ Z − Γ had − 3Γ ℓ . 170 C.7.6 Fit underdetermination The system has 7 free parameters and 5 effectiv e constrain ts (4 data points + 1 luminosity Gaussian con- strain t), making it formally underdetermined (effectiv e ndof ≤ 0). With 7 free parameters and 5 effective constrain ts, the fit is underdetermined. The near-zero χ 2 (3 . 3 × 10 − 5 ) confirms that the fit can p erfectly accommo date the 4 data points and is not a go odness-of-fit metric. The fit nonetheless con verges to a unique ph ysical solution b ecause: 1. The Gaussian luminosity constrain t anc hors the ov erall normalization, prev enting the luminosity pa- rameters from freely absorbing all tension. 2. The Z lineshap e is a well-understoo d ph ysical mo del with strong internal correlations b et ween param- eters. 3. Starting v alues near published Z parameters guide the minimizer to the physical solution (v alidated b y the stability cross-c heck, @sec:xc heck stability). The published DELPHI analysis used ∼ 20 year-energy combinations, pro viding many more constraints relativ e to the n um b er of parameters. F uture improv ements with p er-year fits or additional external con- strain ts w ould improv e the constrain t pow er. C.7.7 Conditional lik eliho od scan A conditional lik eliho od scan for N ν is p erformed by scanning Γ Z with σ 0 had constrained via the ph ysical relation σ 0 had ∝ 1 / Γ 2 Z (holding Γ had and Γ ℓ fixed at b est-fit v alues) and M Z fixed at the b est-fit v alue. At eac h scan p oint, only the 4 luminosities are profiled, providing a meaningful 1-degree-of-freedom scan. This is a c onditional scan (not a full profile likelihoo d) because M Z and the σ 0 –Γ Z relationship are fixed rather than profiled. The full statistical uncertain ty ( ± 0 . 078) from the error propagation (@eq:error prop), which accounts for all parameter correlations via the cov ariance matrix, is the authoritative uncertaint y on N ν . C.8 Results C.8.1 Primary result The num b er of ligh t neutrino generations measured from the Z boson in visible width is: N ν = 2 . 986 ± 0 . 078 (stat) ± 0 . 006 (syst) This is consistent with the Standard Mo del exp ectation of exactly three neutrino generations and with published results from DELPHI ( N ν = 3 . 00 ± 0 . 02) and the LEP com bined measuremen t ( N ν = 2 . 9840 ± 0 . 0082). C.8.2 Fitted Z b oson parameters P arameter This w ork DELPHI published LEP com bined M Z [GeV] 91 . 191 ± 0 . 033 91 . 186 ± 0 . 003 91 . 188 ± 0 . 002 Γ Z [GeV] 2 . 462 ± 0 . 036 2 . 488 ± 0 . 004 2 . 495 ± 0 . 002 σ 0 had [n b] 41 . 25 ± 0 . 25 41 . 58 ± 0 . 07 41 . 54 ± 0 . 04 All parameters are consistent with published v alues within uncertainties. The cen tral v alues of Γ Z and σ 0 had are slightly lo w compared to published v alues, driv en by the O ( α 2 ) ISR radiator whic h ov erpredicts cross sections b elow the Z p eak. These biases partially cancel in the N ν extraction. 171 1.5 2.0 2.5 3.0 3.5 4.0 N 0 2 4 6 8 10 2 l n L L E P 1 p s M Z DELPHI 1 2 B e s t f i t N = 2 . 9 7 N = 3 ( S M ) 1 : [ 2 . 9 6 , 3 . 0 1 ] Figure 85: Conditional likelihoo d scan for N ν . The scan minim um is at N ν ≈ 2 . 98. The narro w width (1 σ ≈ ± 0 . 03) reflects the constraint at fixed M Z and the σ 0 –Γ Z relation; the full statistical uncertaint y ( ± 0 . 078) from error propagation additionally accounts for M Z –Γ Z – σ 0 had correlations. 172 C.8.3 Deriv ed partial widths The partial widths are derived from the fitted Z parameters using @eq:gamma had–@eq:gamma inv extract. Their uncertain ties are obtained b y propagating the fitted cov ariance matrix C for ( M Z , Γ Z , σ 0 had ) and the external R ℓ uncertain ty through the Jacobians: Quan tity V alue [MeV] δ stat [MeV] δ R ℓ [MeV] δ total [MeV] Published DELPHI [MeV] LEP combined [MeV] Γ had 1715 21 1.0 21 1742 1744 Γ ℓ 82.6 1.0 0.05 1.0 83.9 84.0 Γ inv 499 13 0.9 13 499 499 The statistical uncertain ties on the partial widths are dominated by the Γ Z uncertain ty ( ∂ Γ had /∂ Γ Z = 697 MeV/GeV) and the strong Γ Z – σ 0 had an ti-correlation ( ρ = − 0 . 887). The R ℓ con tribution is sub-leading for all three partial widths. The invisible width Γ inv = 499 ± 13 MeV agrees with both published DELPHI and LEP com bined v alues (499 MeV). This is the quantit y most directly sensitive to N ν , and its agreement v alidates the measurement despite the low er precision on M Z , Γ Z , and σ 0 had individually . C.8.4 Fitted luminosities √ s [GeV] L fit [pb − 1 ] 88.57 0.13 89.49 18.6 91.30 102.8 93.05 20.2 T otal 141.7 The total fitted luminosit y matches the published constrain t (141 . 7 pb − 1 ) by construction. C.8.5 Data v ersus prediction √ s [GeV] N obs N pred σ had [n b] Pull 88.57 675 675.0 5.41 0.000 89.49 183,079 183,079.5 10.12 − 0 . 001 91.30 3,048,372 3,048,381.7 30.46 − 0 . 006 93.05 266,536 266,536.5 13.52 − 0 . 001 All pulls are < 0 . 01 σ , a consequence of the underdetermined system (@sec:statistics). C.8.6 Co v ariance matrix The cov ariance matrix for the fitted Z parameters ( M Z , Γ Z , σ 0 had ) in units of (GeV 2 , GeV 2 , nb 2 ): M Z Γ Z σ 0 had M Z 1 . 075 × 10 − 3 1 . 126 × 10 − 4 − 2 . 333 × 10 − 4 Γ Z 1 . 126 × 10 − 4 1 . 310 × 10 − 3 − 8 . 159 × 10 − 3 σ 0 had − 2 . 333 × 10 − 4 − 8 . 159 × 10 − 3 6 . 470 × 10 − 2 The correlation co efficients are: 173 M Z Γ Z σ 0 had M Z 1.000 0.095 − 0 . 028 Γ Z 0.095 1.000 − 0 . 887 σ 0 had − 0 . 028 − 0 . 887 1.000 The strong anti-correlation b et w een Γ Z and σ 0 had ( ρ = − 0 . 887) is a known feature of Z lineshape fits: increasing Γ Z broadens the resonance, whic h must b e compensated by decreasing σ 0 had to maintain the ev ent coun ts near the p eak. C.9 Comparison to prior results and theory C.9.1 Comparison with published measurements The primary result N ν = 2 . 986 ± 0 . 078 (stat) ± 0 . 006 (syst) is compared to published results: Measuremen t N ν Precision This work 2 . 986 ± 0 . 078 2.6% DELPHI published [2] 3 . 00 ± 0 . 02 0.67% LEP combined [1] 2 . 9840 ± 0 . 0082 0.27% SM prediction 3 (exact) — Our result is consisten t with all published v alues, with quan titative compatibilit y metrics: • vs. DELPHI published: ∆ N ν = − 0 . 014; χ 2 = (2 . 986 − 3 . 00) 2 / (0 . 078 2 + 0 . 02 2 ) = 0 . 030, p = 0 . 86 (0 . 17 σ ). • vs. LEP com bined: ∆ N ν = 0 . 002; χ 2 = (2 . 986 − 2 . 984) 2 / (0 . 078 2 + 0 . 008 2 ) = 0 . 00065, p = 0 . 98 (0 . 03 σ ). • vs. SM ( N ν = 3 ): ∆ N ν = − 0 . 014; χ 2 = (2 . 986 − 3) 2 / 0 . 078 2 = 0 . 032, p = 0 . 86 (0 . 18 σ ). All comparisons yield p -v alues w ell ab ov e 0.05, confirming excellen t agreement. C.9.2 Understanding the precision difference Our statistical uncertain ty ( ± 0 . 078) is appro ximately four times larger than the published DELPHI result ( ± 0 . 02). This is understoo d from sev eral factors: 1. F ew er effectiv e energy p oin ts: W e combine data in to 4 energy points, compared to ∼ 20 year- energy com binations in the published analysis. The reduced n umber of p oin ts limits the constrain t on Γ Z , which is the primary driver of N ν precision. 2. No independent R ℓ measuremen t: The published DELPHI analysis measures R ℓ from the ratio of hadronic to leptonic cross sections, providing an independent constrain t. W e use the external LEP combined v alue, whic h adds information but not the full constraining p ow er of p er-point σ ℓ measuremen ts. 3. P er-fill vs. combined energy p oints: The published analysis uses p er-fill b eam energy calibration, while w e use the mean √ s at each nominal energy p oin t. This reduces the effective num b er of distinct √ s measurements. 4. ISR mo del: Our O ( α 2 ) radiator in tro duces small biases that are absorbed by the fit parameters, sligh tly degrading the constraint pow er. Despite these limitations, the central v alue of N ν is fully consisten t with the published result, demon- strating that the DELPHI op en data contain sufficien t information for Z-p ole ph ysics measuremen ts. 174 5 10 15 20 25 30 h a d [ n b ] L E P 1 p s M Z DELPHI F i t : M Z = 9 1 . 1 9 , Z = 2 . 4 6 2 , 0 = 4 1 . 3 DELPHI data (IHAD4) 88 89 90 91 92 93 94 p s [ G e V ] -2 0 2 ( D a t a F i t ) / Figure 86: Z lineshap e fit showing hadronic cross-section data p oin ts at four energy p oin ts with the fitted Breit–Wigner curve (upper panel) and pull distribution (low er panel). 175 88 89 90 91 92 93 94 p s [ G e V ] 5 10 15 20 25 30 35 h a d [ n b ] L E P 1 p s M Z DELPHI N = 2 ( Z = 2 . 3 3 1 G e V ) N = 3 ( Z = 2 . 4 9 8 G e V ) N = 4 ( Z = 2 . 6 6 5 G e V ) DELPHI data Figure 87: Hadronic cross-section data o v erlaid with theoretical predictions for N ν = 2 (dashed blue), N ν = 3 (solid red), and N ν = 4 (dashed green). The data clearly fav or N ν = 3. The p eak cross section decreases with increasing N ν b ecause additional in visible c hannels increase Γ Z while k eeping Γ had fixed, reducing σ 0 had ∝ Γ − 2 Z . 176 0.00 0.02 0.04 0.06 0.08 N ( s y s t e m a t i c ) LEP energy Luminosity h a d Background ISR treatment R e x t e r n a l Beam spread ( ) S M L E P 1 p s M Z DELPHI Total syst = 0.0056 Statistical = 0.0774 Figure 88: Systematic uncertain ty breakdo wn sho wing the impact of eac h source on N ν . The dominant systematic is the external R ℓ constrain t ( δ N ν = 0 . 005). All other sources contribute ≤ 0 . 001. 177 C.9.3 Comparison of Z parameters P arameter This work DELPHI LEP com bined SM (EW fit) M Z [GeV] 91 . 191 ± 0 . 033 91 . 186 ± 0 . 003 91 . 188 ± 0 . 002 input Γ Z [GeV] 2 . 462 ± 0 . 036 2 . 488 ± 0 . 004 2 . 495 ± 0 . 002 2 . 495 ± 0 . 002 σ 0 had [n b] 41 . 25 ± 0 . 25 41 . 58 ± 0 . 07 41 . 54 ± 0 . 04 41 . 48 ± 0 . 01 Γ inv [MeV] 499 499 499 502 N ν 2 . 986 ± 0 . 078 3 . 00 ± 0 . 02 2 . 984 ± 0 . 008 3 All quan tities are consisten t within uncertain ties. The goo d agreemen t of Γ inv (499 MeV in all cases) con- firms that the measuremen t correctly determines the in visible width despite the lo wer precision on individual Z parameters. C.9.4 Early DELPHI measurement The earliest DELPHI measurement from 1989 data [3] gav e N ν = 3 . 08 ± 0 . 05 from approximately 120,000 hadronic even ts. Our measuremen t uses 3.5 million hadronic ev ents (nearly 30 times more) and achiev es a precision of ± 0 . 078 — comparable to the early result despite the analysis simplifications. C.9.5 Implications The measurement confirms that the num b er of ligh t neutrino generations accessible through Z b oson decays is consistent with three. This constrains extensions of the Standard Mo del that predict additional light ( m ν < M Z / 2) neutrino sp ecies. The result, while not comp etitiv e with the LEP com bined measuremen t in precision, demonstrates the scien tific utilit y of DELPHI op en data for electrow eak precision measuremen ts. C.10 Conclusions A measuremen t of the n umber of light neutrino generations N ν has been p erformed using the Z b oson in visible width, based on hadronic cross-section data collected by the DELPHI detector at LEP during the 1992–1995 energy scan programme. The analysis uses 3.50 million hadronic Z deca y ev ents selected from 14.66 million extracted ev en ts at four cen tre-of-mass energy p oin ts ( √ s = 88 . 6, 89.5, 91.3, and 93.0 GeV). A radiatively- corrected Breit–Wigner lineshap e, incorp orating initial-state radiation via the Kuraev–F adin exponentiated radiator at O ( α 2 ) and b eam energy spread conv olution, is fit to the ev ent counts using a Poisson negativ e log-likelihoo d with constrained luminosity parameters. The result is N ν = 2 . 986 ± 0 . 078 (stat) ± 0 . 006 (syst) , consisten t with the Standard Model prediction of exactly three ligh t neutrino generations, the published DELPHI result of N ν = 3 . 00 ± 0 . 02, and the LEP com bined measurement of N ν = 2 . 9840 ± 0 . 0082. The measurement is statistically limited. The dominan t systematic uncertaint y is the external constraint on R ℓ from the LEP com bined measurement ( δ N ν = 0 . 005). All other systematic sources con tribute ≤ 0 . 001 eac h. The total systematic uncertain ty (0 . 006) is more than an order of magnitude smaller than the statistical uncertain ty . The statistical precision is appro ximately four times worse than the published DELPHI result, primarily b ecause the data are combined into 4 energy p oin ts (vs. ∼ 20 in the published analysis), the leptonic c hannel is not measured indep enden tly (limited b y Bhabha con tamination at the summary-v ariable level), and the ISR treatment uses O ( α 2 ) rather than O ( α 3 ). The analysis demonstrates that: 1. DELPHI op en data are accessible through the sk elana F ortran framew ork on CVMFS and the CERN Op en Data p ortal. 178 2. Precision electro w eak measuremen ts are feasible with op en data, ac hieving results consistent with published v alues. 3. A hybrid F ortran/Python w orkflow (skelana for data extraction, Python for analysis) successfully bridges the proprietary data format gap. C.11 F uture directions Sev eral concrete improv emen ts could enhance the precision of this measurement: 1. P er-y ear energy point separation. T reating the 9 year-energy combinations as indep enden t data p oin ts (instead of combin ing in to 4) w ould increase the num b er of constraints from 4 to 9, significantly impro ving the fit determinacy and reducing the statistical uncertain ty b y ∼ 30–50%. 2. Independent R ℓ measuremen t. Implementing lepton iden tification using calorimeter energy ratios and angular acceptance cuts (to separate electrons from m uons and reject Bhabha ev ents) w ould enable an indep enden t R ℓ measuremen t, remo ving the dep endence on the external LEP com bined v alue. 3. O ( α 3 ) ISR radiator. Implemen ting the full third-order radiator (as used in ZFITTER) would elimi- nate the ISR bias and improv e the accuracy of Γ Z and σ 0 had . 4. γ / Z interference. Including the photon exc hange and γ / Z interference terms in the Born cross section would impro ve the off-p eak cross-section predictions and reduce the systematic bias. 5. Mon te Carlo efficiency studies. Extracting the a v ailable MC samples to the same CSV format and computing selection efficiencies directly from MC w ould replace the reliance on published efficiency v alues and enable data/MC comparison plots. 6. P er-fill b eam energy . Using the p er-fill b eam energy v alues from the LEP energy mo del (rather than the ev ent-lev el ECMAS v alues) would provide more energy p oin ts and impro ve the constraint on Γ Z . 7. 1991 data recov ery . If the 1991 data can be accessed through alternative paths, it would add the earliest energy scan data with off-p eak p oin ts, further constraining Γ Z . 8. F orw ard-backw ard asymmetry . Measuring A 0 ,ℓ FB from the angular distribution of leptonic ev ents w ould add a fifth parameter to the fit and provide additional constraints on the electro weak mixing angle. References [1] ALEPH, DELPHI, L3, OP AL, SLD Collaborations, LEP Electrow eak W orking Group, SLD Electrow eak and Hea vy Flav our Groups, “Precision electrow eak measurements on the Z resonance,” Ph ys. Rept. 427 (2006) 257–454, arXiv:hep-ex/0509008. [2] DELPHI Collab oration, “Cross-sections and leptonic forw ard-backw ard asymmetries from the Z 0 running of LEP ,” Eur. Ph ys. J. C16 (2000) 371. [3] DELPHI Collaboration, “DELPHI Results on the Z0 Resonance P arameters,” Ph ys. Lett. B241 (1990) 435. [4] E.A. Kuraev and V.S. F adin, “On radiativ e corrections to e + e − single-photon annihilation at high energies,” Sov. J. Nucl. Phys. 41 (1985) 466. C.12 Extended cutflow tables C.12.1 Hadronic selection p er y ear 1992 179 Cut Ev ents Cumulativ e eff. Per-cut eff. All even ts 2,482,495 100% — N ch ≥ 5 1,019,278 41.1% 41.1% E vis / √ s > 0 . 12 868,435 35.0% 85.2% E vis / √ s < 1 . 5 841,806 33.9% 96.9% | p z | /E vis < 0 . 6 788,479 31.8% 93.7% p T /E vis < 0 . 4 768,347 30.9% 97.4% 1993 Cut Ev ents Cumulativ e eff. Per-cut eff. All even ts 2,766,589 100% — N ch ≥ 5 953,957 34.5% 34.5% E vis / √ s > 0 . 12 854,149 30.9% 89.5% E vis / √ s < 1 . 5 823,575 29.8% 96.4% | p z | /E vis < 0 . 6 781,553 28.3% 94.9% p T /E vis < 0 . 4 770,763 27.9% 98.6% 1994 Cut Ev ents Cumulativ e eff. Per-cut eff. All even ts 5,753,343 100% — N ch ≥ 5 1,736,310 30.2% 30.2% E vis / √ s > 0 . 12 1,651,335 28.7% 95.1% E vis / √ s < 1 . 5 1,596,681 27.8% 96.7% | p z | /E vis < 0 . 6 1,512,497 26.3% 94.7% p T /E vis < 0 . 4 1,492,292 25.9% 98.7% 1995 Cut Ev ents Cumulativ e eff. Per-cut eff. All even ts 3,661,917 100% — N ch ≥ 5 890,428 24.3% 24.3% E vis / √ s > 0 . 12 826,481 22.6% 92.8% E vis / √ s < 1 . 5 800,305 21.9% 96.8% | p z | /E vis < 0 . 6 745,346 20.4% 93.1% p T /E vis < 0 . 4 736,634 20.1% 98.8% C.12.2 Tigh t leptonic selection cutflow Cut 1992 1993 1994 1995 T otal All even ts 2,482,495 2,766,589 5,753,343 3,661,917 14,664,344 N ch = 2 318,198 328,261 451,058 282,224 1,379,741 E vis / √ s > 0 . 6 116,050 132,691 200,324 115,878 564,943 |  p | /E vis < 0 . 2 71,827 80,344 129,091 74,683 355,945 180 C.13 Systematic uncertaint y details C.13.1 Detailed systematic table Source V ariation N ν (up) N ν (do wn) δ N ν Energy calibration ± 1 . 5 MeV 2.9869 2.9869 < 0 . 001 Luminosit y norm. ± 0 . 5% 2.9870 2.9881 0.001 Selection eff. ± 0 . 5% 2.9870 2.9881 0.001 Bac kground ± 50% on 0.1% 2.9869 2.9861 < 0 . 001 ISR treatmen t O ( α 1 ) vs O ( α 2 ) 2.9871 2.9861 0.001 γ / Z in terference Not included in Born — — < 0 . 001 R ℓ external ± 0 . 025 2.9808 2.9914 0.005 Beam energy spread ± 10% on 55 MeV 2.9840 2.9866 0.001 Γ SM ν ¯ ν ± 0 . 002 MeV 2.9861 2.9862 < 0 . 001 T otal sys- tematic 0.006 C.13.2 Systematic completeness vs. reference analyses Source LEP EWW G DELPHI published This analysis Status LEP energy calibration Y es (1.7 MeV) Y es (1.5 MeV) ± 1 . 5 MeV shift Implemented Luminosit y (exp.) Y es ( < 0 . 1%) Y es (STIC/SA T) ± 0 . 5% total Implemented Luminosit y (theory) Y es (0.061%) Y es Included in lumi unc. Implemented Selection efficiency Y es Y es (0.1–0.3%) ± 0 . 5% v ariation Implemen ted Leptonic selection Y es Y es N/A (external R ℓ ) Not applicable Bac kground subtraction Y es Y es (10–50%) ± 50% on 0.1% bkg Implemen ted t -c hannel ( ee ) Y es Y es N/A (no ee c hannel) Not applicable ISR / QED corrections Y es ( O ( α 3 )) Y es O ( α 2 ) KF + syst. Implemen ted γ / Z in terference Y es Y es Ev aluated; < 0 . 001 on N ν Implemen ted Beam energy spread Y es Y es (55 MeV) ± 10% v ariation Implemented MC generator mo del Y es Y es N/A (published eff. used) Not applicable T rigger efficiency Y es ( ∼ 100%) Y es Assumed 100% (LEP1) V erified 181 Source LEP EWW G DELPHI published This analysis Status R ℓ external (measured) (measured) LEP com bined ± 0 . 025 Implemented Γ SM ν ¯ ν (SM input) (SM input) ± 0 . 002 MeV Implemen ted C.14 Co v ariance matrices C.14.1 Fit parameter cov ariance matrix The full cov ariance matrix C for the fitted Z parameters ( M Z [GeV] , Γ Z [GeV] , σ 0 had [n b]): C =   1 . 075 × 10 − 3 1 . 126 × 10 − 4 − 2 . 333 × 10 − 4 1 . 126 × 10 − 4 1 . 310 × 10 − 3 − 8 . 159 × 10 − 3 − 2 . 333 × 10 − 4 − 8 . 159 × 10 − 3 6 . 470 × 10 − 2   C.14.2 Correlation matrix ρ =   1 . 000 0 . 095 − 0 . 028 0 . 095 1 . 000 − 0 . 887 − 0 . 028 − 0 . 887 1 . 000   C.14.3 N ν error propagation vector The gradient of N ν with resp ect to the Z parameters ( M Z , Γ Z , σ 0 had ): ∇ N ν =  − 0 . 129 GeV − 1 , 1 . 213 GeV − 1 , − 0 . 142 nb − 1  The propagated statistical uncertain ty: δ N stat ν = q ∇ N T ν · C · ∇ N ν = 0 . 078 C.15 Auxiliary plots 182 0 50 100 150 200 E v i s [ G e V ] 1 0 4 1 0 5 1 0 6 Events / 2 GeV All events DELPHI Open Data All events (14,664,337) 0.0 0.5 1.0 1.5 2.0 E v i s / p s 1 0 0 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 Events IHAD4=1 DELPHI Open Data 1992 (709,243) 1993 (711,658) 1994 (1,400,251) 1995 (677,507) Figure 89: Visible energy distributions for all ev ents (left) and for IHAD4-tagged hadronic even ts (righ t, normalized by √ s ). The hadronic even ts p eak near E vis / √ s ≈ 1 . 0 for all years, with 1994 showing a slightly higher tail due to the v94c reconstruction pro cessing. 0 10 20 30 40 50 60 N c h 1 0 4 1 0 5 1 0 6 Events All events DELPHI Open Data All events (14,664,337) 0 10 20 30 40 50 60 N c h 0 10000 20000 30000 40000 50000 60000 Events Hadronic flag DELPHI Open Data 1992 IHAD4=1 (709,243) 1993 IHAD4=1 (711,658) 1994 IHAD4=1 (1,400,251) 1995 IHAD4=1 (677,507) Figure 90: Charged trac k multiplicit y distribution sho wing the full even t population. The N ch = 0 peak corresp onds to STIC/luminosity monitor ev ents; the broad distribution centered at ∼ 29 is from hadronic Z deca ys. 183 0 20 40 60 80 100 EMF (Forward EM Cal) Energy [GeV] 1 0 2 1 0 3 1 0 4 1 0 5 1 0 6 Events IHAD4=1 DELPHI Open Data EMF (Forward EM Cal) Nonzero: 1,923,613/3,498,659 0 20 40 60 80 100 HPC (High-density Projection Chamber) Energy [GeV] 1 0 0 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 Events IHAD4=1 DELPHI Open Data HPC (High-density Projection Chamber) Nonzero: 3,301,428/3,498,659 0 20 40 60 80 100 HAC (Hadron Calorimeter) Energy [GeV] 1 0 3 1 0 4 1 0 5 Events IHAD4=1 DELPHI Open Data HAC (Hadron Calorimeter) Nonzero: 3,445,788/3,498,659 0 20 40 60 80 100 STIC (Small-angle Tile Cal) Energy [GeV] 1 0 0 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 1 0 6 Events IHAD4=1 DELPHI Open Data STIC (Small-angle Tile Cal) Nonzero: 441,661/3,498,659 Figure 91: Calorimeter energy distributions for IHAD4-tagged hadronic even ts. EMF (forward electromag- netic), HPC (barrel electromagnetic), HA C (hadron calorimeter), and STIC (small-angle tile calorimeter). STIC energies are zero for 1992–1993 (detector not yet installed). 184 0 20 40 60 N c h 0.0 0.5 1.0 1.5 2.0 2.5 E v i s / p s Event classification DELPHI Open Data 0.0 0.2 0.4 0.6 0.8 1.0 IHAD4 Figure 92: Even t classification scatter plot: N ch vs E vis / √ s , colored by IHAD4 flag. Clear separation b et w een hadronic even ts (high multiplicit y , energy near √ s ) and backgrounds. 185 30000 31000 32000 33000 34000 35000 36000 Run number 0 2500 5000 7500 10000 12500 15000 17500 20000 Events per run 1992 DELPHI Open Data Mean: 1921 events/run 37000 38000 39000 40000 41000 42000 43000 Run number 0 5000 10000 15000 20000 25000 Events per run 1993 DELPHI Open Data Mean: 2479 events/run 44000 46000 48000 50000 52000 54000 56000 Run number 0 10000 20000 30000 40000 Events per run 1994 DELPHI Open Data Mean: 3041 events/run 58000 60000 62000 64000 Run number 0 10000 20000 30000 40000 50000 Events per run 1995 DELPHI Open Data Mean: 4042 events/run Figure 93: Run stability: even t rate p er run for eac h year, showing stable data-taking conditions throughout the 1992–1995 LEP1 programme. 186 All events N c h = 2 E v i s / p s > 0 . 6 | p | / E v i s < 0 . 2 | p z | / E v i s < 0 . 6 1 0 4 1 0 5 1 0 6 1 0 7 Events 14664.3k 1379.7k 564.9k 355.9k 355.9k p s 9 1 . 2 G e V DELPHI Figure 94: Leptonic cutflo w sho wing ev ent counts surviving eac h sequen tial cut in the tigh t dilepton selection ( N ch = 2, E vis / √ s > 0 . 6, |  p | /E vis < 0 . 2). 187 0.0 2.5 5.0 7.5 10.0 12.5 15.0 N c h 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 Events 1e6 Leptonic selection DELPHI N c h = 2 Selected 0.0 0.5 1.0 1.5 2.0 E v i s / p s 1 0 2 1 0 3 1 0 4 1 0 5 Events Leptonic selection DELPHI N c h = 2 Selected 0.0 0.2 0.4 0.6 0.8 1.0 | p | / E v i s 1 0 4 Events Back-to-back cut DELPHI N c h = 2 , E v i s > 0 . 6 p s Selected 88 89 90 91 92 93 94 p s [ G e V ] 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 Events Leptonic selection DELPHI Figure 95: Leptonic selection diagnostic plots. T op left: N ch distribution. T op right: E vis / √ s with cut at 0.6. Bottom left: |  p | /E vis with cut at 0.2. Bottom right: √ s distribution of selected leptonic even ts. 188 D Tw o-P oin t Energy-Energy Correlator Measuremen t with DEL- PHI Data D.1 In tro duction D.1.1 Ph ysics motiv ation The energy-energy correlator (EEC) is an infrared- and collinear-safe ev ent shape observ able that probes the angular distribution of energy flow in e + e − annihilation even ts. Originally prop osed by Basham, Brown, Ellis, and Lov e in 1978, it has exp erienced a resurgence of theoretical interest due to its deep connections to conformal field theory (the light-ra y op erator pro duct expansion), soft-collinear effective theory (SCET), and transverse-momen tum dep enden t (TMD) factorization. The EEC is sp ecial among ev ent shap e observ ables for several reasons: • Collinear limit ( χ → 0): The EEC exhibits pow er-law scaling gov erned by twist-2 anomalous dimen- sions, connecting to DGLAP ev olution and the timelik e splitting functions. This region prob es the transition from p erturbativ e parton dynamics to non-perturbative hadronization. • Bac k-to-bac k limit ( χ → π ): The EEC is describ ed b y TMD factorization with Sudako v resumma- tion, no w known to N 4 LL accuracy (Chen et al., arXiv:2512.11950). This pro vides a clean channel for α s extraction with reduced sensitivit y to hadronization. • Bulk region : Fixed-order perturbative QCD at NNLO describes the intermediate angular range, with leading non-p erturbativ e corrections parameterized by a single matrix element Ω 1 . Precision predictions at NNLO+NNLL col +N 4 LL b2b are no w av ailable (Chen et al., represen ting the most precise EEC calculation ever p erformed. This calculation simultaneously matches the collinear and back-to-bac k resummations to the fixed-order result across the full angular range. Recen t remov als of e + e − ev ent shap e α s extractions from the PDG world a verage (due to concerns ab out analytical hadronization mo dels) hav e created an opp ortunit y . EEC-based extractions using TMD factorization offer a theoretically cleaner alternative. D.1.2 Observ able definition The tw o-p oin t energy-energy correlator is defined as: EEC( χ ) = 1 N even ts X even ts X i 10 mm. Charged deca y products of V 0 particles ( K 0 S → π + π − , Λ → pπ − ) are included. • F ull phase space: No particle-level fiducial cuts on θ , η , or p T . Detector acceptance is corrected via the unfolding pro cedure. • ISR treatment: ISR photons are excluded from the hadronic system. The measuremen t is defined as ISR-exclusive, corrected to √ s = M Z . • Visible energy: E vis = P k E k sums ov er charged particles only , both in the numerator pair weigh ts and in the denominator normalization. The charged-particle approach is chosen for its sup erior angular resolution (trac king: ∼ ,1 mrad vs. calorimetry: ∼ ,10 mrad), theoretical calculabilit y through the track function formalism, and consistency with the recent ALEPH EEC measurement D.1.4 Prior measuremen ts Sev eral EEC measurements at the Z p ole hav e b een published: • OP AL (1991): First EEC measuremen t at the Z pole with α s extraction using O ( α 2 s ) theory (Phys. Lett. B 276, 547). • SLD (1994): EEC and AEEC measurement at SLC, α s ( M Z ) = 0 . 124 ± 0 . 003(exp.) ± 0 . 009(theory) (Ph ys. Rev. D 50, 5580). • DELPHI (1990): Early EEC measuremen t in hadronic Z deca ys (Phys. Lett. B 252, 149). • ALEPH (2025): First fully-corrected mo dern EEC measurement using archiv ed ALEPH data at 91.2 GeV, with 200 v ariable-width angular bins. 2D Bay esian unfolding in ( θ L , E i E j /E 2 ) (arXiv:2505.11828). Most precise e + e − EEC result to date. • DELPHI op en data (2025): Thrust and track EEC measuremen t using DELPHI 1994+1995 op en data, demonstrating the feasibilit y of the DELPHI data pip eline This measuremen t provides an indep enden t result with different detector systematics from the ALEPH measuremen t, serving as a critical cross-chec k. DELPHI’s TPC-based trac king, HPC electromagnetic calorimetry , and RICH particle identification give a complemen tary systematic profile to ALEPH’s jet c hamber trac king and electromagnetic calorimetry . D.2 Data samples D.2.1 Real data The analysis uses hadronic Z decay data collected by the DELPHI detector at LEP during the 1994 running p eriod at √ s = 91 . 2 GeV. The data corresp onds to the short94 c2 processing (ANA C, Fix 2) stored in the nativ e DELPHI .al (short DST) binary format. Prop ert y V alue Dataset ID Y13709 Lo cation /eos/opendata/delphi/collision-data/Y13709/ F ormat .al (short DST) Num b er of files (total) 243 (incl. 1 imp ort file) √ s 91.27 GeV (Z p ole, with ISR spread) Y ear 1994 Estimated total even ts ∼ ,3.4 million Estimated hadronic Z deca ys ∼ ,1.3 million 190 Protot yp e subsample: 3 files (Y13709.100–102), 42,255 even ts total, of which 8,619 pass the hadronic ev ent selection. The 1994 dataset corresp onds to appro ximately 46 pb − 1 of integrated luminosit y at the Z p ole. D.2.2 Mon te Carlo: Primary (QQPS) The primary Mon te Carlo sample uses the QQPS generator (PYTHIA/JETSET with Lund string fragmen- tation) with full DELSIM detector simulation and reconstruction, matc hed to the 1994 detector conditions. Prop ert y V alue Dataset ID Y10638 Generator QQPS (PYTHIA/JETSET, Lund string fragmentation) Detector simulation F ull DELSIM √ s Fixed 91.20 GeV Num b er of files (total) 214 Estimated total even ts ∼ ,1.1 million Protot yp e subsample: 3 files (Y10638.100–102), 14,997 ev ents total, of whic h 12,384 pass the hadronic ev ent selection. This sample provides the resp onse matrix for unfolding and the MC truth reference for the particle-lev el EEC. D.2.3 Mon te Carlo: Alternativ e (AP ACIC) The alternative Monte Carlo sample uses the AP ACIC 1.05 generator with cluster hadronization and full DELSIM detector simulation. Prop ert y V alue Dataset ID apacic105 Generator AP ACIC 1.05 (cluster hadronization) Detector simulation F ull DELSIM (v94c) √ s Fixed 91.25 GeV Num b er of files (total) 996 Estimated total even ts ∼ ,3.0 million Protot yp e subsample: 1 file, 3,000 ev ents total, of which 2,508 pass the hadronic even t selection. This sample pro vides the alternative hadronization mo del for systematic ev aluation. At protot yp e scale, the AP A CIC truth sp ectrum is used as an alternativ e prior for the unfolding; the full hadronization systematic (requiring an AP A CIC resp onse matrix) is deferred to the full-statistics analysis. D.2.4 MC truth particle comp osition The MC truth record (LUND format, accessible via the PSCLUJ common blo ck in sk elana) pro vides generator- lev el information. F or the QQPS MC: P article PDG co de F raction of stable particles γ 22 47.1% π ± 211 36.4% K ± 321 4.9% K 0 L 130 2.4% p/ ¯ p 2212 1.9% n/ ¯ n 2112 1.9% 191 P article PDG co de F raction of stable particles e ± 11 0.9% µ ± 13 0.3% The mean num b er of stable c harged particles p er hadronic Z deca y ev ent is 18.9, consistent with the exp ected c harged multiplicit y in Z → q ¯ q ev ents. D.2.5 Data qualit y assessmen t Cen ter-of-mass energy distributions confirm that data p eaks at 91.27 GeV with ISR spread (88–94 GeV), while QQPS MC is fixed at 91.20 GeV and AP ACIC at 91.25 GeV. The 70 MeV offset b et ween data and QQPS MC (0.08% of √ s ) is negligible for the normalized EEC, which dep ends on angular correlations and energy fractions rather than absolute energy scale. The ISR-exclusive particle-level definition and the unfolding pro cedure absorb any residual effect of this offset. Visible energy distributions show a ∼ ,2% offset betw een data ( ⟨ E vis ⟩ = 78 . 3 GeV) and MC ( ⟨ E vis ⟩ = 76 . 7 GeV for QQPS), a known DELPHI feature that is handled by the unfolding pro cedure. Charged m ultiplicit y distributions agree well b et ween data ( ⟨ N ch ⟩ = 29 . 6) and MC ( ⟨ N ch ⟩ = 30 . 0 for QQPS). D.3 Ev en t selection D.3.1 T rack selection Reconstructed charged-particle trac ks are obtained from the DELPHI VECP common blo c k (standard trac k con tainer) via the skelana extraction pip eline. The following selection is applied sequentially: Cut Requiremen t Motiv ation Charged | Q | > 0 Select charged particles only L VSELE quality Bit 1 = 0 Sk elana April 1999 tuning (IFLCUT=3): p > 0 . 1 GeV, ∆ p/p < 1 . 5, d 0 < 4 cm, z 0 < 4 cm Finite kinematics No NaN in p x , p y , p z , E Remo ve F ortran ov erflow trac ks ( ∼ ,1 p er 15,000 even ts) P olar angle 20 ◦ ≤ θ ≤ 160 ◦ Fiducial TPC acceptance T ransverse momen tum 0 . 4 ≤ p T ≤ 100 GeV Remo ve lo opers (low p T ) and mismeasured tracks (high p T ) The L VSELE quality selection (IFLCUT=3) encodes the standard DELPHI track qualit y criteria includ- ing adequate TPC and tracking detector hits, impact parameter cuts ( d 0 < 4 cm, z 0 < 4 cm), and momentum resolution (∆ p/p < 1 . 5). The strategy specified measured track length ≥ 30 cm and ∆ p/p ≤ 1 . 0, but these quan tities are internal to the skelana framework and enco ded in the L VSELE flag. The p T ≥ 0 . 4 GeV cut comp ensates for the lo oser ∆ p/p threshold by removing the dominant p opulation of p o orly measured lo w-momentum trac ks. D.3.2 Ev ent selection Ev ents are required to pass the following cuts, applied sequentially after track-lev el selection: Cut Requiremen t Motiv ation Hadronic even t tag IHAD4 = 1 Standard DELPHI T eam 4 hadronic classification 192 Cut Requiremen t Motiv ation Charged m ultiplicity N ch ≥ 7 (after track cuts) Reject Z → τ + τ − , γ γ even ts Visible energy E vis ≥ 0 . 5 × E cm Reject p oorly reconstructed even ts and t wo-photon pro cesses Thrust axis p olar angle 30 ◦ ≤ θ thrust ≤ 150 ◦ Ensure even t is well-con tained in the barrel detector The thrust axis is computed iterativ ely from selected c harged trac ks using the standard sign-flip algorithm with a conv ergence tolerance of 10 − 10 on the axis direction cosine. Radiativ e even t v eto: The strategy sp ecified a radiativ e ev ent v eto ( E γ , max < 40 GeV). This cut is not applied b ecause isolated high-energy photon information is not av ailable in the extracted CSV format. At the Z pole, hard ISR even ts are rare ( < 1% of hadronic even ts), and the E vis cut provides partial mitigation. The ISR correction is handled by the unfolding pro cedure. D.3.3 T rack-lev el cutflo w Cut Data cum ul. MC QQPS cumul. AP ACIC cum ul. T otal trac ks 587,191 653,364 133,090 Charged 340,992 (58.1%) 443,005 (67.8%) 89,766 (67.4%) L VSELE go od 226,812 (38.6%) 304,088 (46.5%) 61,221 (46.0%) Finite kin. 226,812 (38.6%) 304,088 (46.5%) 61,221 (46.0%) θ fiducial 216,957 (36.9%) 292,006 (44.7%) 59,033 (44.4%) p T cut 171,991 (29.3%) 233,137 (35.7%) 47,331 (35.6%) The low er cumulativ e efficiency in data (29.3%) compared to MC (35.7%) is exp ected because data files con tain non-hadronic even ts (leptonic Z deca ys, tw o-photon even ts, b eam-gas) whic h hav e few er charged trac ks passing the qualit y selection. After the IHAD4 hadronic ev en t requirement, the per-even t track selection efficiencies are consisten t betw een data and MC. D.3.4 Ev ent-lev el cutflow Cut Data Data cumul. MC QQPS MC cum ul. AP ACIC AP ACIC cum ul. T otal ev ents 42,255 42,255 14,997 14,997 3,000 3,000 IHAD4 = 1 10,036 10,036 (23.8%) 14,348 14,348 (95.7%) 2,878 2,878 (95.9%) N ch ≥ 7 9,855 9,789 (23.2%) 14,191 14,121 (94.2%) 2,840 2,831 (94.4%) E vis ≥ 0 . 5 E cm 33,604 9,542 (22.6%) 14,013 13,737 (91.6%) 2,831 2,767 (92.2%) 30 ◦ ≤ θ thrust ≤ 150 ◦ 11,117 8,619 (20.4%) 12,963 12,384 (82.6%) 2,615 2,508 (83.6%) 193 88 89 90 91 92 93 94 p s [ G e V ] 0 2 4 6 8 10 Normalized p s = 9 1 . 2 G e V DELPHI Open Data Data MC (QQPS) MC (APACIC) Figure 96: Center-of-mass energy distribution for data and MC samples. 194 0 20 40 60 80 100 120 E v i s [ G e V ] 0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035 Normalized p s = 9 1 . 2 G e V DELPHI Open Data Data MC (QQPS) MC (APACIC) Figure 97: Visible energy distribution for hadronic ev ents in data and MC samples. 195 0 10 20 30 40 50 60 70 N c h ( V E C P ) 0.00 0.01 0.02 0.03 0.04 Normalized p s = 9 1 . 2 G e V DELPHI Open Data Data MC (QQPS) MC (APACIC) Figure 98: Charged m ultiplicity distribution for hadronic even ts. 196 D.3.5 Selection efficiency summary Sample T otal ev ents Selected ev ents Efficiency Selected tracks T rac ks/even t Data (3 files) 42,255 8,619 20.4% 144,832 16.8 MC QQPS (3 files) 14,997 12,384 82.6% 207,203 16.7 AP ACIC (1 file) 3,000 2,508 83.6% 42,428 16.9 The hadronic even t selection efficiency (computed from IHAD4 = 1 ev ents) is 8,619/10,036 = 85.9% for data, consisten t with the MC efficiency of 82.6–83.6%. The mean selected track m ultiplicity is ∼ ,16.8 p er ev ent, consisten t across all three s amples, confirming that the track selection is well-modeled b y the MC. D.3.6 Data/MC comparison plots All distributions are shown after the full ev ent and trac k selection. Eac h plot includes a ratio panel (Data/MC) to quantify agreemen t. The AP ACIC (cluster hadronization) MC is o verlaid for comparison. T rack transverse momen tum Go o d agreement (within 5%) for p T < 10 GeV. Discrepancies of 5–10% at p T > 10 GeV are consistent with the kno wn DELPHI high- p T trac k modeling issue noted in T rack p olar angle Go od agreemen t (within 5%) across the fiducial range 20 ◦ –160 ◦ . TPC sector boundary features are repro duced by the MC. T rack azimuthal angle Excellent agreement (within 3%). The azimuthal flatness confirms no significan t detector asymmetries after selection. T rack momentum Goo d agreement within 5% for p < 15 GeV. At higher momenta, statistical fluctua- tions dominate. T rack energy Goo d agreement, consisten t with the momen tum distribution. T rack energy is the k ey input to the EEC weigh t E i E j /E 2 vis . Charged m ultiplicity The p eak position and distribution width are w ell-repro duced by b oth QQPS and AP ACIC MC. Charged-trac k energy sum This plot shows the sum of selected charged-trac k energies, not the calori- metric E vis used for the even t selection cut. The ∼ ,2% offset betw een data and MC is a kno wn DELPHI feature. Thrust Go od agreement in b oth the tw o-jet p eak ( T > 0 . 9) and the multi -jet tail ( T < 0 . 8). The thrust distribution is indirectly related to the EEC through the even t topology: high-thrust even ts pro duce a stronger back-to-bac k p eak, while low-thrust ev ents populate the bulk region. Thrust axis p olar angle The flat distribution within the fiducial range confirms uniform detector resp onse in the barrel region. The thrust axis cut ensures that the ev ent jet axes are well-con tained within the TPC acceptance. 197 1 0 3 1 0 2 1 0 1 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 0 5 10 15 20 p T [ G e V ] 0.8 1.0 1.2 Data / MC Figure 99: T rack transv erse momentum distribution after selection. The steeply falling sp ectrum from the 0.4 GeV cut to the beam energy sho ws go od data/MC agreemen t for QQPS across the full range. The ratio panel sho ws agreement within 5% for p T < 10 GeV. AP ACIC sho ws a softer spectrum at high p T ( > 8 GeV), consisten t with different fragmen tation mo deling. 198 0.1 0.2 0.3 0.4 0.5 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 0.5 1.0 1.5 2.0 2.5 [ r a d ] 0.8 1.0 1.2 Data / MC Figure 100: T rack p olar angle distribution after selection, showing the c haracteristic barrel-enhanced shap e with acceptance edges at θ ≈ 20 ◦ and 160 ◦ . The ratio panel shows data/MC agreemen t within 5% across the full fiducial range. 199 0.145 0.150 0.155 0.160 0.165 0.170 0.175 0.180 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data -3 -2 -1 0 1 2 3 [ r a d ] 0.8 1.0 1.2 Data / MC Figure 101: T rac k azimuthal angle distribution after selection, appro ximately flat as expected for the az- im uthally symmetric DELPHI detector. The ratio panel shows agreement within 3%. 200 1 0 5 1 0 4 1 0 3 1 0 2 1 0 1 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 0 10 20 30 40 50 p [ G e V ] 0.8 1.0 1.2 Data / MC Figure 102: T rack total momen tum distribution after selection. Data/MC agreement is go od across the full range for QQPS. 201 1 0 5 1 0 4 1 0 3 1 0 2 1 0 1 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 0 10 20 30 40 50 E [ G e V ] 0.8 1.0 1.2 Data / MC Figure 103: T rack energy distribution after selection. Shap e is similar to the momentum distribution since trac ks are approximately relativistic. 202 0.00 0.02 0.04 0.06 0.08 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 10 20 30 40 50 60 N c h ( s e l e c t e d ) 0.8 1.0 1.2 Data / MC Figure 104: Selected charged m ultiplicity distribution after all cuts. P eaks at N ch ≈ 15–17 with a tail to ∼ ,40. Data/MC agreement is within 5%. 203 0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 30 40 50 60 70 80 90 100 E t r a c k [ G e V ] 0.8 1.0 1.2 Data / MC Figure 105: Charged-trac k energy sum distribution after selection. Data is slightly harder than MC, consis- ten t with the known ∼ ,2% offset. 204 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 0.5 0.6 0.7 0.8 0.9 1.0 T ( t h r u s t ) 0.8 1.0 1.2 Data / MC Figure 106: Thrust distribution after selection. The c haracteristic tw o-jet p eak at T → 1 dominates, with a tail to low er thrust v alues (m ulti-jet even ts). Data/MC agreemen t is go od. 205 0.35 0.40 0.45 0.50 0.55 0.60 0.65 Normalized / bin width p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 0.5 1.0 1.5 2.0 2.5 t h r u s t [ r a d ] 0.8 1.0 1.2 Data / MC Figure 107: Thrust axis p olar angle distribution after the 30 ◦ –150 ◦ cut. Sho ws a flat distribution in the cen tral region with acceptance edges at the cut b oundaries. 206 Summary of data/MC agreement V ariable Agreement lev el χ 2 /ndf Impact on EEC T rack p T Go od ( < 5% for p T < 10 GeV) 63.8/48 = 1.33 Mo derate (energy weigh ts) T rack θ Goo d ( < 5%) 85.9/48 = 1.79 Critical (pair angle) T rack ϕ Excellen t ( < 3%) 91.8/48 = 1.91 Mo derate (pair angle) T rack p Go o d ( < 5% for p < 15 GeV) 51.2/35 = 1.46 Mo derate (energy weigh ts) T rack E Go od ( < 5%) 51.2/35 = 1.46 Critical (energy weigh ts) N ch Go od ( < 5%) 25.6/24 = 1.07 Mo derate (pair count) P E track Mo derate (2% offset) 75.8/41 = 1.85 Mo derate (normalization) Thrust Go o d ( < 5%) 41.6/32 = 1.30 Indirect θ thrust Go od ( < 5%) 61.9/48 = 1.29 Indirect All χ 2 /ndf v alues are in the range 1.0–1.9. The highest v alues (track ϕ : 1.91, P E track : 1.85, trac k θ : 1.79) are marginally ab ov e the χ 2 / ndf = 1 exp ectation for perfect mo deling. F or track ϕ with 48 degrees of freedom, χ 2 / ndf = 1 . 91 corresponds to a p -v alue of ∼ ,3 × 10 − 4 , indicating statistically significant but ph ysically small disagreement. No v ariable has χ 2 / ndf > 2 . 0. The MC resp onse mo del (QQPS with full DELSIM detector simulation) provides an adequate description of the data kinematics for unfolding, with residual mo deling imp erfections absorbed b y the systematic uncertaint y program. D.4 Corrections and unfolding D.4.1 EEC computation T rack proximit y cut A minimum pair op ening angle cut χ min = 0 . 005 rad ( ∼ ,5 mrad) is applied to all trac k pairs entering the EEC computation. This remo v es spurious small-angle pairs caused b y track splitting in the detector, where a single c harged particle is reconstructed as t wo nearb y tracks. The cut v alue is c hosen to b e abov e the typical angular resolution of the DELPHI TPC ( ∼ ,1–2 mrad). The impact on the total pair coun t is small ( < 1%), concentrated in the deep collinear bins where split-track contamination is highest. The v ariation of this cut threshold ( χ min = 0 . 003–0 . 010 rad) is ev aluated as a systematic uncertain ty . Angular binning The EEC is measured in 130 angular bins: Region χ range [rad] Binning type Num b er of bins Collinear 0.002 – 0.200 Logarithmic 50 Bulk 0.200 – 2.900 Linear 30 Bac k-to-back 2.900 – 3.140 Logarithmic (flipp ed) 50 The logarithmic binning in the collinear and back-to-bac k regions resolv es the p o wer-la w and Sudako v p eak structures. The linear binning in the bulk region matches the appro ximately flat detector resp onse. Detector-lev el distributions P article-level distributions (MC truth) The tw o hadronization mo dels (Lund string vs. cluster) pro- duce consistent EEC distributions within the statistical precision of the prototype sample. Differences at the 5–10% level are visible in the collinear region, where the hadronization mo del systematic is exp ected to b e largest. 207 1 0 2 1 0 1 1 0 0 1 N e v t d d p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 1 0 2 1 0 1 1 0 0 [ r a d ] 0.75 1.00 1.25 Data / MC Figure 108: Detector-level EEC distribution in logarithmic χ scale, comparing data, QQPS MC, and AP A CIC MC after full selection. The collinear enhancement at χ → 0 and back-to-bac k peak at χ → π are clearly visible. Data/MC agreement is within ± 10% for χ > 0 . 01 rad. 208 1 0 2 1 0 1 1 0 0 1 N e v t d d p s = 9 1 . 2 G e V DELPHI Open Data MC (QQPS) MC (APACIC) Data 0.0 0.5 1.0 1.5 2.0 2.5 3.0 [ r a d ] 0.75 1.00 1.25 Data / MC Figure 109: Detector-lev el EEC distribution in linear χ scale. The bulk and back-to-bac k regions sho w excellen t data/MC agreement at the few-p ercen t level. 209 1 0 2 1 0 1 1 0 0 [ r a d ] 1 0 2 1 0 1 1 0 0 1 N e v t d d p s = 9 1 . 2 G e V DELPHI Simulation Open Data MC truth (QQPS) MC truth (APACIC) Figure 110: P article-level EEC from MC truth charged particles (QQPS and AP ACIC). The tw o generators agree w ell in the bulk and bac k-to-back regions. In the collinear region ( χ < 0 . 01 rad), statistical fluctuations are visible due to the small protot yp e sample. 210 1 0 2 1 0 1 1 0 0 1 N e v t d d p s = 9 1 . 2 G e V DELPHI Simulation Open Data Particle level Detector level 1 0 2 1 0 1 1 0 0 [ r a d ] 0.5 1.0 1.5 Reco / Truth Figure 111: Comparison of detector-lev el and particle-lev el EEC in QQPS MC. The ratio panel shows the reco/truth ratio as a function of χ . The detector resp onse suppresses the EEC in the bac k-to-back region ( χ > 2 rad) by ∼ ,20% and mo difies the collinear region shap e. 211 Detector response The reco/truth ratio quan tifies the detector distortion: • Collinear ( χ < 0 . 05 rad): Fluctuating ratio b et ween 0.7 and 1.3 due to limited prototype statistics. • Bulk ( 0 . 1 < χ < 2 rad): Ratio appro ximately 0.95–1.05, indicating mo dest detector effects. • Bac k-to-bac k ( χ > 2 . 5 rad): Ratio drops to ∼ ,0.85, indicating ∼ ,15% reduction from trac king efficiency losses. D.4.2 Resp onse matrix T rack matc hing pro cedure Reco-lev el trac ks are matched to truth-level trac ks using the Hungarian algorithm (optimal one-to-one assignmen t) within each ev en t: 1. Compute the angular distance matrix betw een all reco and truth trac ks: ∆ R ij = arccos( ˆ p reco ,i · ˆ p truth ,j ) 2. Find the globally optimal one-to-one assignmen t minimizing the total angular distance using scipy.optimize.linear sum assignment (Hungarian/Kuhn–Munkres algorithm) 3. Accept matc hes with ∆ R < 0 . 05 rad; reject assignments exceeding the cutoff T rack matc hing statistics (QQPS MC, 12,384 ev ents): • 171,952 / 207,203 reco tracks matc hed (83.0%) • 171,952 / 237,197 truth tracks matc hed (72.5%) • 17% of reco tracks are unmatched (fakes from secondary interactions, trac k splitting, or acceptance- b oundary effects). The MC-deriv ed fak e rate may differ from data if the detector material budget or secondary interaction modeling is imperfect; this is partially co v ered by the tracking efficiency systematic but may warr ant a dedicated study at full statistics. • 27.5% of truth trac ks are unmatched (losses from tracking inefficiency , acceptance, or the fiducial vs. full phase-space difference) P air-level resp onse matrix The response matrix R ij maps truth EEC bin j to reco EEC bin i . F or eac h MC ev en t, all reco-lev el track pairs are formed. A reco pair is “matched” if b oth constituen t tracks ha ve truth matc hes. F or matched pairs, the resp onse matrix is filled at R [reco bin , truth bin] with the EEC w eight 2 E reco i E reco j /E 2 vis,reco . The matrix is column-normalized so that each truth bin column sums to 1.0. The column normalization is v erified: all non-zero columns sum to exactly 1.000000 (min = 1.000000, max = 1.000000). Resp onse matrix prop erties Prop ert y V alue Dimensions 130 x 130 T rack matc hing Hungarian algorithm (optimal) Pro ximity cut χ min = 0 . 005 rad Ov erall diagonal fraction 0.897 Mean p er-bin diagonal fraction 0.670 Mean pair efficiency 0.804 Mean pair fake rate 0.287 Condition num b er 2 . 5 × 10 3 Bac k-to-back bins with diagonal < 10% 4 (bins 124, 126, 127, 129) Ev ents used 12,384 The condition num ber of 2 . 5 × 10 3 is well b elo w the 10 10 threshold that would indicate regularization difficulties, confirming that the unfolding problem is well-posed. 212 0.01 0.10 0.50 1.00 2.00 3.00 T r u t h b i n [ r a d ] 0.01 0.10 0.50 1.00 2.00 3.00 R e c o b i n [ r a d ] p s = 9 1 . 2 G e V DELPHI Simulation Open Data 0.0 0.2 0.4 0.6 0.8 1.0 P(reco bin | truth bin) Figure 112: Normalized resp onse matrix P (reco bin | truth bin) for the 130-bin EEC. The matrix is strongly diagonal in the bulk region (0 . 2 < χ < 2 . 9 rad), with increasing off-diagonal spread in the collinear and bac k-to-back regions. 213 1 0 2 1 0 1 1 0 0 [ r a d ] 0.0 0.2 0.4 0.6 0.8 1.0 Diagonal fraction p s = 9 1 . 2 G e V DELPHI Simulation Open Data 50% threshold 70% threshold (BBB valid) Figure 113: Per-bin diagonal fraction as a function of χ . The bulk and back-to-bac k regions hav e diagonal fractions exceeding 90%. The collinear region ( χ < 0 . 05 rad) has diagonal fractions of 10–50%, requiring prop er unfolding. 214 Diagonal fraction versus angular separation The diagonal fraction profile determines the unfolding strategy: • Bulk and bac k-to-bac k ( χ > 0 . 1 rad): Diagonal fraction > 70%. Bin-b y-bin correction is v alid as a cross-chec k. • Collinear ( χ < 0 . 05 rad): Diagonal fraction < 50%. Iterative Ba yesian unfolding is mandatory . • Deep back-to-bac k ( χ > 3 . 138 rad): 4 bins ha ve diagonal fraction ≈ 0. These are excluded from the measurement. P air efficiency and fake rate D.4.3 Unfolding procedure Correction sequence The correction chain follows the approach of the DELPHI op en-data reference analysis 1. F ak e subtraction: The estimated fak e con tribution is subtracted per bin: reco sub = reco − reco × f fake , where f fake is the p er-bin fake rate from the response matrix construction. 2. Iterativ e Bay esian unfolding (IBU): The D’Agostini method with N iter = 4 iterations is applied using the column-normalized resp onse matrix R ij . The prior is the QQPS MC truth EEC distribution. The IBU up date rule is: ˆ t ( k +1) j = ˆ t ( k ) j X i R ij d i P j ′ R ij ′ ˆ t ( k ) j ′ 3. Efficiency correction: The unfolded result is divided by the p er-bin pair efficiency (fraction of truth pairs with matched reco pairs): EEC corr = EEC unf /ϵ . 4. Normalization: The EEC is normalized so that P i EEC i · ∆ χ i = 1 ov er the active bin range. Bin exclusion 22 bins are excluded from the measuremen t in t wo stages: Stage 1: Structural exclusion (14 bins). Bins 0–8 hav e zero truth and reco sp ectrum (below the pro ximity cut), bin 9 has near-zero data con tent and zero diagonal fraction, and bins 124, 126, 127, 129 ha ve zero diagonal fraction in the deep bac k-to-back region. Stage 2: Flat-prior-flagged exclusion (8 bins). P er conv entions, bins where the unfolded result c hanges b y > 20% with a flat prior are excluded: Excluded bin χ [rad] Flat-prior c hange Region 10 0.0053 57.1% Deep collinear 15 0.0083 44.0% Collinear 16 0.0092 25.5% Collinear 17 0.0100 22.4% Collinear 114 3.1327 33.3% Back-to-bac k 116 3.1343 46.4% Back-to-bac k 117 3.1350 23.2% Back-to-bac k 121 3.1371 47.9% Deep back-to-bac k After exclusion, the flat-prior test on the remaining bin set shows 0 flagged bins out of 108, confirming the cleaned bin set is prior-insensitive. This leav es 108 activ e bins spanning χ ∈ [0 . 006 , 3 . 139] rad. Regularization The nominal iteration count N iter = 4 is chosen based on consistency with the DELPHI op en-data reference analysis (arXiv:2510.18762) and the iteration scan p erformance (see Section 10.4 for the full n umerical table of χ 2 /ndf vs. N iter ). The closure χ 2 /ndf increases monotonically from 3.65 (1 iteration) to 6.49 (10 iterations), while the stress test χ 2 /ndf is essentially flat b eyond 2 iterations. The choice of 4 iterations balances regularization against noise amplification. 215 1 0 2 1 0 1 1 0 0 [ r a d ] 0.0 0.2 0.4 0.6 0.8 1.0 Fraction p s = 9 1 . 2 G e V DELPHI Simulation Open Data Pair efficiency Pair fake rate Figure 114: Pair efficiency and fak e rate as a function of χ . Efficiency is ∼ ,85–90% in the bulk region, dropping in the back-to-bac k region. The fake rate is ∼ ,25% in the bulk, rising to 30–70% in the collinear region due to trac k s plitting. 216 D.5 Systematic uncertainties Six systematic uncertain ty sources are ev aluated. F or each source, the data EEC is v aried (or the unfolding parameters are c hanged) and the full correction c hain is rerun. The shift v ector δ i = EEC v ar i − EEC nom i quan tifies the impact p er bin. D.5.1 T racking efficiency Source n umber: 3. Metho d: Remo ve 1% of tracks randomly b efore EEC computation and rerun the full correction chain. Mean relative impact: 0.7%. This ev aluates the sensitivity to tracking efficiency uncertain t y . The 1% uniform remo v al is a first appro ximation; a more realistic ev aluation would apply θ - and p T -dep enden t remo v al rates matc hing the actual DELPHI trac king efficiency profile (which is low er at forward angles and low p T ). This is planned for the full-statistics analysis. The small impact (0.7%) reflects the self-normalizing prop erty of the EEC: o verall efficiency losses cancel in the ratio E i E j /E 2 vis . Only differential efficiency effects (where efficiency v aries with trac k kinematics) con tribute to the systematic, and the uniform remov al underestimates this differen tial component. D.5.2 T rack p T cut v ariation Source n um b er: 4. Metho d: V ary the minim um p T cut from the nominal 0.4 GeV to 0.3 GeV and 0.5 GeV. The en v elop e (maximum absolute shift from the tw o v ariations) is tak en as the systematic. Mean relativ e impact: 3.3%. The p T cut affects the p opulation of low-momen tum trac ks entering the EEC. Low ering the cut includes more soft tracks (predominan tly at small angles), while raising it remov es them. The 3.3% impact reflects the sensitivity of the collinear region to the soft-track p opulation. D.5.3 Regularization Source n umber: 10. Metho d: V ary the num ber of IBU iterations by ± 2 around the nominal (2 and 6 iterations). The env elop e is taken as the systematic. Mean relative impact: 3.4%. The regularization systematic captures the dep endence on the num b er of unfolding iterations. F ew er iterations produce a result closer to the MC prior; more iterations allow more data-driven fluctuations. The 3.4% impact indicates mo derate sensitivit y to the regularization choice. D.5.4 Prior sensitivit y Source num b er: 11 + 13 (combined). Metho d: The flat-prior v ariation (source 11, 1.5% mean impact) and the AP ACIC truth as alternativ e prior (source 13, 4.7% mean impact) b oth prob e prior sensitivit y through the same QQPS resp onse matrix. They are combined in to a single systematic using the p er-bin en velope: δ ( i ) prior = max  | δ ( i ) flat | , | δ ( i ) AP A CIC |  · sign( δ ( i ) dominant ) Mean relativ e impact: 4.9%. This is the dominan t measured systematic uncertain t y . The AP A CIC prior v ariation uses the cluster- hadronization truth sp ectrum as an alternative prior, whic h c hanges the collinear and bac k-to-back regions more than the flat prior. This prob es prior sensitivit y (IBU conv ergence properties), not hadronization mo del dep endence. The prior sensitivit y and hadronization mo del systematics are orthogonal: the prior sensitivit y v aries the starting distribution while keeping the QQPS response matrix fixed, testing whether the IBU iteration has con verged; the hadronization systematic (when measured) v aries the resp onse matrix itself by using an alternativ e generator. The prior sensitivity do es not partially co ver the hadronization systematic and must not b e in terpreted as suc h (see the discussion of the hadronization systematic b elo w). 217 D.5.5 Angular resolution Source num b er: 19. Method: Apply an additional Gaussian smearing of 0.5 mrad to the θ and ϕ angles of individual trac ks before pair angle computation, then rerun the full correction chain. Mean relativ e impact: 3.9%. The DELPHI TPC pro vides a single-trac k angular resolution of ∼ ,0.2 mrad in p olar angle θ and ∼ ,0.3 mrad in azimuthal angle ϕ for isolated high-momentum trac ks (DELPHI detector p erformance, Nucl. In- strum. Meth. A 378, 57 (1996)). F or lo wer-momen tum trac ks and in the presence of nearby hits, the resolution degrades. The 0.5 mrad Gaussian smearing applied p er track in b oth θ and ϕ represents approx- imately t wice the nominal single-track resolution, conserv ativ ely accoun ting for resolution degradation in dense en vironments. This is v erified b y the observ ation that the 0.5 mrad smearing pro duces a physically smo oth impact profile, while a 1.0 mrad smearing (5 times the nominal resolution) yielded an anomalously large impact (10.4% mean) b ecause it exceeded the narrow est bin widths in the bac k-to-back region (0.22 mrad), producing artificial bin migration effects rather than a gen uine detector resolution effect. The reduced 0.5 mrad smearing giv es a physically reasonable impact profile: Region Mean relative impact (0.5 mrad) Collinear ( χ < 0 . 1) 3.8% Bulk (0 . 1 < χ < 2 . 5) 0.2% Bac k-to-back ( χ > 2 . 5) 6.6% D.5.6 Pro ximity cut v ariation Source n umber: 20. Metho d: V ary the minim um pair op ening angle cut χ min from the nominal 0.005 rad to 0.003 rad and 0.007 rad. The env elop e is taken as the systematic. Mean relative impact: 2.7%. This ev aluates the sensitivity to the split-track remov al pro cedure. The impact is concentrated in the collinear bins where the proximit y cut has the largest effect on the pair p opulation. D.5.7 Systematic impact breakdown D.5.8 Unmeasured and deferred systematics Sev eral systematic sources are not ev aluated at prototype scale. The most critical are: Source Expected impact Reason for deferral Hadronization mo del (full) 5–15% (dominant) Requires AP ACIC resp onse matrix from full sample T rack momen tum scale ( ± 0 . 1%) < 1% Sub-leading T rack momen tum resolution < 1% Sub-leading N ch cut v ariation < 1% Small rejection pow er E vis cut v ariation < 1% Small rejection pow er Thrust axis cut v ariation < 1% Small rejection p ow er T rack quality cuts < 2% No v ariation without re-extraction Hea vy fla vor (b-quark) 2–5% (b2b) Requires flav or tagging or fla vor-specific MC T rack matc hing metric < 2% V ariation of ∆ R cutoff High- p T trac k mo deling < 1% Data/MC discrepancy at p T > 10 GeV 218 1 0 2 1 0 1 1 0 0 [ r a d ] 0 10 20 30 40 50 Relative uncertainty [%] p s = 9 1 . 2 G e V DELPHI Simulation Preliminary Tracking eff. p T c u t Regularization Prior sensitivity Angular res. Proximity cut Total syst. Statistical Figure 115: Systematic uncertaint y breakdown sho wing the relative impact of eac h source as a function of χ . The total systematic (blac k solid) and statistical (black dashed) uncertain ties are also shown. Prior sensitivit y and angular resolution are the dominant measured systematics. 219 Bose-Einstein correlations (BEC): BEC betw een identical c harged pions enhance the EEC in the deep collinear region ( χ ≲ 0 . 01 rad) at the few-p ercen t lev el. The QQPS MC does not include BEC b y default. A BEC on/off comparison is planned for the full-statistics analysis (see F uture Directions). Justified omissions: Background contamination (source 9, < 1% at the Z p ole) and ISR treatment (source 14, < 1%, particle-lev el definition is ISR-inclusive) are negligible. Hadronization mo del: The true hadronization systematic requires building an independent response matrix from the AP A CIC MC sample with full detector simulation. With the protot yp e AP ACIC sam- ple (2,508 even ts), this resp onse matrix w ould be extremely noisy . With the full sample ( ∼ ,3M AP A CIC ev ents), a prop er alternative response matrix can be constructed. This is the dominan t unmeasured uncertain ty , expected to contribute 5–15% based on prior measuremen ts. D.6 Cross-c hec ks D.6.1 Bin-b y-bin correction (BBB) BBB correction factors C i = EEC truth i / EEC reco i are computed as a cross-chec k for the 67 out of 108 activ e bins where the diagonal fraction exceeds 70%. BBB do es not qualify as a prop er alternative unfolding metho d because 42% of bins ha ve diagonal fraction < 70% where BBB is unreliable. Both IBU and BBB are normalized o ver the same bin range for a consistent comparison. Metric V alue BBB v alid bins 67/108 (62%) Max relative difference (BBB vs. IBU) 26.4% Mean relative difference (BBB vs. IBU) 2.7% χ 2 /ndf (BBB vs. IBU, v alid bins) 203.2/67 = 3.03 The BBB and IBU results agree within the statistical uncertain ty in most of the v alid region. T o separate edge effects from bulk disagreement, the χ 2 is also computed in the bulk region only (0 . 2 < χ < 2 . 9 rad, diagonal fraction > 70%): Region BBB vs. IBU χ 2 /ndf All v alid bins (67 bins) 203.2/67 = 3.03 Bulk only (0 . 2 < χ < 2 . 9, 30 bins) 180.1/30 = 6.00 The bulk-only χ 2 /ndf (6.00) is actually higher than the full 67-bin v alue (3.03), indicating that the disagreemen t is not concen trated at the edges but reflects a gen uine difference b et ween the IBU and BBB corrections ev en in the w ell-measured bulk region. This is expected: BBB applies per-bin correction factors that do not accoun t for bin migration, while IBU iteratively redistributes ev ents across bins. With prototype statistics, the resp onse matrix noise affects b oth methods differently . The BBB cross-c heck therefore demon- strates qualitative agreemen t (mean relativ e difference 2.7%) but cannot serv e as a quantitativ e v alidation at prototype scale. Note: The SVD unfolding metho d is planned as a proper alternative method for the full-statistics analysis but is not implemented at protot yp e scale. D.6.2 Self-normalization c hec k The EEC is self-normalizing: R π 0 EEC( χ ) dχ = 1 when all particles are measured. The unfolded EEC b efore the explicit normalization step has an integral of P i EEC pre-norm i · ∆ χ i = 0 . 803, indicating a ∼ ,20% deficit from unity . This deficit is exp ected and arises from three sources: 1. Charged-only measuremen t: The EEC is computed from charged particles only ( ∼ ,53% of stable particles), while the self-normalization sum rule assumes all particles. The squared charged energy fraction is ∼ ,0.7 2 ≈ 0 . 49 of the total. 220 1 0 3 1 0 2 1 0 1 1 0 0 1 0 1 1 0 2 1 d d [ r a d 1 ] p s = 9 1 . 2 G e V DELPHI Preliminary IBU (nominal) BBB (diag > 70%) 1 0 2 1 0 1 1 0 0 [ r a d ] 0.75 1.00 1.25 BBB / IBU Figure 116: Comparison of IBU-unfolded (blac k) and BBB-corrected (red) EEC in the region where BBB is v alid (diagonal fraction > 70%). The ratio panel sho ws BBB/IBU. Both are normalized ov er the same bin range. 221 2. T rac king acceptance losses: The fiducial θ and p T cuts remov e ∼ ,17% of charged trac ks, reducing the pair count. 3. Bin exclusion: 22 bins are excluded from the measurement, remo ving a small fraction of the integral. The com bination of these effects is consistent with the observed 0.803 pre-normalization integral. The final result is explicitly normalized to unity o v er the 108 active bins ( P i EEC i · ∆ χ i = 1 . 000) to enable shap e comparison with predictions. D.7 Statistical metho d D.7.1 Statistical uncertain ties Statistical uncertainties are computed via b o otstrap resampling: 1. F or each of 100 replicas, the data EEC histogram is fluctuated bin-by-bin using Gaussian resampling with the measured statistical error from Phase 3 (even t-by-ev en t v ariance). 2. Eac h fluctuated distribution is passed through the full correction c hain (fak e subtraction, IBU, efficiency correction, normalization). 3. The co v ariance matrix is computed from the distribution of unfolded replicas. Mean statistical uncertaint y (activ e bins): 3 . 5 × 10 − 2 (absolute), corresp onding to ∼ ,5–30% relative uncertain ty dep ending on the bin. This is dominated by the small prototype sample size. Limitation: The bo otstrap replica coun t (100) is less than the n umber of activ e bins (108). The statistical cov ariance matrix is therefore rank-deficient, with rank at most 100 for the 108 activ e bins. Consequen tly , the total cov ariance matrix (statistical plus six rank-1 systematic outer pro ducts) is effectiv ely singular and should not b e in verted for χ 2 calculations at prototype scale. Only diagonal uncertainties should b e used for quantitativ e tests. With full statistics, 500+ replicas will b e used to ensure a well-conditioned statistical cov ariance with full rank. D.7.2 Systematic co v ariances Eac h systematic source is represented by a shift v ector δ k ∈ R 130 . The systematic co v ariance p er source is the outer pro duct (fully correlated per source): C syst k = δ k δ T k . Six indep enden t sources contribute (no double-coun ting of prior v ariations). D.7.3 T otal cov ariance The total cov ariance matrix is: C = C stat + 6 X k =1 C syst k Prop ert y V alue Dimensions 130 x 130 (108 x 108 active sub-matrix) P ositive semi-definite Y es Condition num b er 3 . 6 × 10 19 Max off-diagonal correlation ∼ ,0.93 The high condition num b er (3 . 6 × 10 19 ) indicates that the co v ariance in verse is numerically unstable at protot yp e scale. This is exp ected when the b ootstrap replica count (100) is less than the num b er of active bins (108), and the 6 rank-1 systematic outer pro ducts dominate. With full statistics ( ∼ ,150 × more data, 500+ b ootstraps), the condition num b er will decrease by many orders of magnitude. 222 11 24 34 44 54 64 74 84 94 104 115 Bin index 11 24 34 44 54 64 74 84 94 104 115 Bin index p s = 9 1 . 2 G e V DELPHI Simulation Preliminary -1.00 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00 Correlation coefficient Figure 117: T otal correlation matrix for the 108 active bins. Strong positive correlations are visible in the collinear and bac k-to-back regions (driven b y the systematic outer pro ducts). The bulk region shows mo derate correlations. 223 D.7.4 Uncertain ty budget Mean uncertainties o ver activ e bins: Comp onen t Mean absolute Mean relative Statistical 3 . 5 × 10 − 2 ∼ ,10% Systematic (total) 4 . 7 × 10 − 2 ∼ ,12% T otal 6 . 1 × 10 − 2 ∼ ,18% The analysis is systematically dominated in the bulk and back-to-bac k regions and statis tically dominated in the extreme collinear region. With full statistics, the statistical comp onen t will decrease b y ∼ √ 150 ≈ 12 × , making systematics dominan t everywhere. D.8 Results D.8.1 Unfolded EEC sp ectrum The primary result is the c harged-particle t w o-p oin t EEC at √ s = 91 . 2 GeV, corrected to the stable c harged- particle level using 4-iteration IBU unfolding. The result spans 108 angular bins in the range χ ∈ [0 . 006 , 3 . 139] rad. The unfolded EEC sho ws the exp ected ph ysical features: • Collinear enhancement ( χ ≲ 0 . 1 rad): Rising EEC consisten t with DGLAP-gov erned p ow er-la w scaling. • Bulk region (0 . 1 ≲ χ ≲ 2 . 5 rad): Appro ximately flat, well-described b y fixed-order perturbative QCD. • Bac k-to-bac k Sudak ov peak ( χ ≳ 2 . 5 rad): Sharp rise tow ard χ = π , reflecting the tw o-jet top ology of Z → q ¯ q ev ents with Sudako v suppression. The data and QQPS MC truth agree at the ∼ ,10–20% lev el across the full angular range, with the largest deviations in the collinear and back-to-bac k edges where statistical uncertainties are largest. D.8.2 P er-bin results table (selected bins) The full p er-bin results are pro vided in machine-readable form in results/eec spectrum.csv . A represen ta- tiv e subset of 25 bins spanning the collinear, bulk, and bac k-to-back regions is sho wn below. Bin num b ering follo ws the 130-bin scheme; only active bins (108 total) carry measured v alues. Bin χ [rad] ∆ χ [rad] EEC [1/rad] Stat. err. Syst. err. T otal err. 11 0.00577 5.31e-04 0.436 0.080 0.438 0.445 14 0.00761 7.01e-04 0.651 0.081 0.112 0.138 18 0.01100 1.01e-03 0.789 0.081 0.188 0.205 21 0.01450 1.33e-03 0.591 0.068 0.052 0.085 25 0.02096 1.93e-03 0.877 0.072 0.115 0.136 30 0.03323 3.06e-03 1.222 0.061 0.088 0.107 35 0.05266 4.85e-03 1.281 0.051 0.041 0.065 40 0.08346 7.68e-03 1.338 0.039 0.045 0.059 44 0.12064 1.11e-02 1.176 0.019 0.034 0.039 48 0.17438 1.60e-02 0.867 0.014 0.019 0.023 50 0.24500 9.00e-02 0.572 0.0049 0.0067 0.0083 53 0.51500 9.00e-02 0.220 0.0029 0.0067 0.0073 56 0.78500 9.00e-02 0.139 0.0020 0.0068 0.0070 59 1.05500 9.00e-02 0.108 0.0021 0.0073 0.0076 62 1.32500 9.00e-02 0.0916 0.0018 0.0072 0.0074 224 Bin χ [rad] ∆ χ [rad] EEC [1/rad] Stat. err. Syst. err. T otal err. 65 1.59500 9.00e-02 0.0973 0.0025 0.0058 0.0063 68 1.86500 9.00e-02 0.112 0.0027 0.0055 0.0062 71 2.13500 9.00e-02 0.137 0.0033 0.0052 0.0062 74 2.40500 9.00e-02 0.202 0.0040 0.0045 0.0060 77 2.67500 9.00e-02 0.377 0.0048 7.96e-04 0.0049 79 2.85500 9.00e-02 0.649 0.0063 0.0105 0.0122 85 2.99885 1.37e-02 1.156 0.024 0.030 0.039 90 3.05321 8.47e-03 1.260 0.038 0.038 0.053 95 3.08687 5.24e-03 1.007 0.055 0.061 0.082 100 3.10771 3.25e-03 0.768 0.056 0.035 0.066 The EEC exhibits a characteristic minimum near χ ≈ 1 . 3 rad (bin 62, EEC = 0 . 092 [1/rad]) and reaches a maximum of ∼ ,1.27 [1/rad] in the back-to-bac k Sudako v p eak (bin 88, χ ≈ 3 . 035 rad). In the bulk region (0 . 2 < χ < 2 . 9 rad), the relative total uncertain ty is 3–6%, while in the extreme collinear and back-to-bac k edges it reaches 10–40%. D.8.3 AEEC cross-c hec k The asymmetric EEC (AEEC) is computed from the unfolded EEC as a cross-chec k observ able (not a primary measuremen t at prototype scale): AEEC( χ ) = EEC( π − χ ) − EEC( χ ) , χ ∈ [0 , π / 2] 48 angular p oints are computed by matc hing each forward bin ( χ < π / 2) to the corresp onding mirrored bin ( π − χ ) in the bac k-to-back region. Statistical uncertain ties are propagated from the bo otstrap co v ariance matrix: V ar(AEEC( χ )) = V ar(EEC( π − χ )) + V ar(EEC( χ )) − 2 · Cov(EEC( π − χ ) , EEC( χ )) Mean AEEC statistical uncertain ty: 4 . 7 × 10 − 2 . Note: The AEEC bins at the smallest angular separations ( χ ≲ 0 . 01 rad, first 2–3 p oin ts) are unreliable b ecause they pair extreme collinear bins (large uncertainties, p ossible prior sensitivity) with extreme bac k- to-bac k bins. These p oin ts should b e pruned in any quan titative in terpretation. Systematic uncertain ties are not propagated to the AEEC at protot yp e scale. In principle, EEC systematic shift v ectors propagate to the AEEC as δ AEEC ( χ ) = δ EEC ( π − χ ) − δ EEC ( χ ), but the resulting AEEC systematic co v ariance would b e dominated b y cancellations b et ween correlated forw ard and mirrored bins that are p oorly determined with the protot yp e cov ariance. The AEEC is therefore presen ted as a cross- c heck demonstration, not a final result. The full-statistics analysis will promote the AEEC to a secondary measuremen t with fully propagated syste matic uncertain ties. D.9 Comparison to prior results and theory D.9.1 Comparison to MC generators The unfolded data is compared to the QQPS MC truth (Lund string hadronization) in the ratio panels of the primary result figures. The data/MC truth ratio is approximately consisten t with unity across the bulk region, with ∼ ,10–20% deviations in the collinear and back-to-bac k edges. These deviations are within the total uncertaint y band at protot yp e scale. A comparison to the AP A CIC MC truth (cluster hadronization) at particle level shows agreemen t within the 5–10% level in the bulk region, with larger differences in the collinear region. 225 1 0 3 1 0 2 1 0 1 1 0 0 1 0 1 1 0 2 1 d d [ r a d 1 ] p s = 9 1 . 2 G e V DELPHI Preliminary Stat. unc. QQPS MC truth Data (unfolded) 1 0 2 1 0 1 1 0 0 [ r a d ] 0.5 1.0 1.5 Data / MC Figure 118: Unfolded c harged-particle EEC from DELPHI data at √ s = 91 . 2 GeV, corrected to particle lev el via 4-iteration IBU. Blac k points show the unfolded data with total (stat + syst) uncertain ties. The blue band shows statistical uncertain ties only . The red line is the QQPS MC truth (normalized). The ratio panel shows data/MC truth. 226 1 0 3 1 0 2 1 0 1 1 0 0 1 0 1 1 0 2 1 d d [ r a d 1 ] p s = 9 1 . 2 G e V DELPHI Preliminary Stat. unc. QQPS MC truth Data (unfolded) 0.5 1.0 1.5 2.0 2.5 3.0 [ r a d ] 0.5 1.0 1.5 Data / MC Figure 119: Unfolded EEC in linear angular scale, sho wing the full structure from the collinear region through the bulk to the back-to-bac k Sudako v p eak. 227 1 0 2 1 0 1 1 0 0 [ r a d ] -0.6 -0.4 -0.2 0.0 0.2 A E E C ( ) = E E C ( ) E E C ( ) p s = 9 1 . 2 G e V DELPHI Preliminary Data (unfolded) Figure 120: Asymmetric EEC (AEEC) from DELPHI data with propagated statistical uncertain ties. The AEEC measures the excess of back-to-bac k ov er collinear energy flow at each angular separation. Positiv e v alues at intermediate angles reflect the dominance of the back-to-bac k Sudako v p eak. 228 Diagonal-only χ 2 comparison A formal χ 2 test using the full co v ariance matrix is not p ossible at protot yp e scale (condition num b er 3 . 6 × 10 19 ). Instead, a diagonal-only χ 2 is computed using only the p er-bin total (stat + syst) uncertain ty , ignoring off-diagonal correlations: χ 2 diag = X i (EEC data i − EEC MC truth i ) 2 σ 2 i Ca veat: This test ignores bin-to-bin correlations and therefore ov erstates disagreement when systematic shifts are correlated across bins (as they are for all six sources). The resulting χ 2 v alues should be in terpreted as upp er b ounds on the true incompatibilit y . Comparison Region χ 2 diag /ndf Data vs. QQPS truth All active (108 bins) 1324/108 = 12.3 Data vs. QQPS truth Bulk (0 . 2 < χ < 2 . 9, 30 bins) 534/30 = 17.8 The elev ated diagonal χ 2 v alues reflect the correlated nature of the systematic uncertain ties: the six rank-1 systematic outer pro ducts pro duce strong bin-to-bin correlations (maxim um off-diagonal correlation ∼ ,0.93) that a diagonal test cannot accoun t for. With the full cov ariance inv erse, the effective n umber of indep enden t degrees of freedom w ould b e substan tially reduced. The visual data/MC truth ratio panels confirm that the unfolded sp ectrum is consisten t with the generator predictions within the uncertain ty bands. A prop er χ 2 comparison using the w ell-conditioned full-statistics cov ariance is planned. D.9.2 Comparison to published measurements Quan titative comparison to published measuremen ts (ALEPH arXiv:2505.11828, DELPHI op en data arXiv:2510.18762, SLD Ph ys. Rev. D 50 5580) is deferred to the full-statistics analysis for the following reasons: 1. Co v ariance matrix conditioning: The prototype cov ariance matrix has a condition n umber of 3 . 6 × 10 19 , making formal χ 2 comparisons numerically meaningless. 2. Data av ailability: Published data p oin ts must b e obtained from HEPData or digitized from pap ers. 3. Binning compatibilit y: Different analyses use differen t binning sc hemes, requiring interpolation or rebinning. The full-statistics analysis will pro duce: • Ov erla y plots with ratio panels comparing to ALEPH and historical DELPHI results • χ 2 compatibilit y tests using the w ell-conditioned full-statistics co v ariance • Comparisons to state-of-the-art theory predictions (NNLO+NNLL col +N 4 LL b2b ) D.10 V alidation D.10.1 Closure test The closure test unfolds the MC reco-level EEC through its own resp onse matrix and compares to the MC truth, testing the unfolding machinery independently of data/MC differences. Results at N iter = 4: Metric V alue χ 2 /ndf (all 108 bins) 569.0/108 = 5.27 χ 2 /ndf (bulk, 0 . 05 < χ < 3 . 0 rad) 307.0/51 = 6.02 229 Metric V alue Max relative deviation 227% (extreme collinear edge) Mean relative deviation 13.0% The elev ated χ 2 is dominated b y the limited MC statistics in the prototype sample. The closure test χ 2 denominator uses the MC truth statistical uncertaint y , which with 12,384 MC ev ents and 108 bins gives O (100) entries per bin. Ho wev er, the resp onse matrix itself is also estimated from these same MC even ts, in tro ducing an additional statistical uncertain ty in the unfolding that is not captured in the χ 2 denominator. This MC statistical uncertaint y in the response matrix inflates the test statistic b ey ond what the truth-lev el statistical error alone w ould predict. Pro jected closure at full statistics: With the full QQPS sample ( ∼ ,1.1M even ts, a factor ∼ ,150 increase), the MC statistical uncertaint y in the resp onse matrix decreases b y 1 / √ 150 ≈ 0 . 08, reducing the resp onse matrix noise by more than an order of magnitude. The p er-bin truth uncertaint y also decreases by the same factor. This tw o-fold improv emen t (b etter response matrix and smaller truth errors) is pro jected to bring the closure χ 2 / ndf to ∼ ,1. Achieving χ 2 / ndf ∼ 1 in the closure test at full statistics is a mandatory v alidation gate b efore the result is finalized. D.10.2 Stress test The stress test applies a linear tilt to the MC truth ( × (0.7–1.3) across bins), folds through the response matrix, and unfolds using the untilted truth as prior. This tests robustness to prior mismatch. Results at N iter = 4: Metric V alue χ 2 /ndf 8967/108 = 83.0 Max relative deviation 35.2% Mean relative deviation 13.7% The mean relativ e deviation of 13.7% for a 30% tilt shows the unfolding is reco vering the bulk of the shap e change. The recov ery is region-dep enden t: collinear ( χ < 0 . 1 rad) has a mean deviation of 9.7% (28 bins), the bulk (0 . 2 < χ < 2 . 9 rad) has 19.4% (30 bins), and the back-to-bac k ( χ > 2 . 9 rad) has 13.1% (42 bins). The worse bulk p erformance reflects the larger absolute tilt applied to the flat bulk region, where a 30% tilt pro duces a larger absolute change than in the p eaked collinear and back-to-bac k regions. D.10.3 Flat-prior sensitivit y test P er con ven tions, bins where the unfolded result changes b y > 20% when using a flat prior must b e excluded. Before exclusion: 8 bins flagged out of 116 initial activ e bins (6.9%). All flagged bins are in extreme collinear or extreme bac k-to-back edges. After exclusion: 0 bins flagged out of 108 final active bins. D.10.4 Iteration scan The numerical v alues for the iteration scan are tabulated b elow: N iter Closure χ 2 /ndf Closure mean dev. Stress χ 2 /ndf Stress mean dev. 1 394/108 = 3.65 11.2% 8973/108 = 83.1 13.8% 2 496/108 = 4.59 12.5% 8967/108 = 83.0 13.8% 3 538/108 = 4.99 12.9% 8967/108 = 83.0 13.8% 4 569/108 = 5.27 13.0% 8967/108 = 83.0 13.7% 5 595/108 = 5.51 13.2% 8967/108 = 83.0 13.7% 6 619/108 = 5.74 13.4% 8967/108 = 83.0 13.7% 230 N iter Closure χ 2 /ndf Closure mean dev. Stress χ 2 /ndf Stress mean dev. 8 663/108 = 6.14 14.0% 8967/108 = 83.0 13.7% 10 701/108 = 6.49 14.6% 8967/108 = 83.0 13.7% The closure χ 2 /ndf increases monotonically with iterations (noise amplification), while the stress test χ 2 /ndf is essentially flat b ey ond 2 iterations (the tilt recov ery is achiev ed early). The closure mean relativ e deviation is minimized at 1 iteration (11.2%) and gro ws slowly . The c hoice of N iter = 4 balances regularization against data-driv en correction, consisten t with the DELPHI open-data reference analysis D.10.5 Systematic completeness table Implemen ted sources: Source Con v. Refs Method Impact Status T racking eff. Req. ALEPH, DELPHI, SLD 1% remov al 0.7% DONE T rack p T cut Req. ALEPH, DELPHI, SLD 0.3/0.5 GeV 3.3% DONE Regularization Req. ALEPH, DELPHI Iter ± 2 3.4% DONE Prior sensitivity Req. ALEPH Flat+AP ACIC en v. 4.9% DONE Angular resolution Req. (implicit in refs) 0.5 mrad smear 3.9% DONE Pro ximity cut Sp ecific N/A χ min v ar. 2.7% DONE Deferred sources (planned for full statistics): Source Con v. Refs Expected Reason Hadronization mo del Req. DELPHI (4 gen.), SLD 5–15% Needs full AP A CIC RM T rack p scale Req. ALEPH, DELPHI, SLD < 1% Sub-leading T rack p resolution Req. (implicit) < 1% Sub-leading Alt. metho d (SVD) Req. N/A N/A BBB only at proto. N ch cut v ar. Req. (implicit) < 1% Sub-leading E vis cut v ar. Req. (implicit) < 1% Sub-leading Thrust cut v ar. Req. (implicit) < 1% Sub-leading T rack qualit y Req. ALEPH, DELPHI, SLD < 2% Nee ds re-extraction Hea vy fla vor (b) Req. N/A 2–5% (b2b) Needs flav or tag/MC T rack matc hing Sp ecific ALEPH, DELPHI < 2% ∆ R cutoff v ar. High- p T mo del Specific DELPHI < 1% Data/MC discrepancy Justified omissions: Background contamination ( < 1% at Z pole) and ISR treatment ( < 1%, ISR- exclusiv e particle-lev el definition). Summary: 6 sources implemented, 9 deferred (planned for full-statistics), 2 justified omissions. The most critical missing source is the hadronization mo del systematic (exp ected 5–15%). 231 1 0 3 1 0 2 1 0 1 1 0 0 1 0 1 1 0 2 1 d d [ r a d 1 ] p s = 9 1 . 2 G e V DELPHI Simulation Preliminary Unfolded MC (IBU, 4 iter) MC truth (normalized) 1 0 2 1 0 1 1 0 0 [ r a d ] 0.5 1.0 1.5 Unfolded / Truth A l l : 2 / n d f = 5 . 2 7 B u l k : 2 / n d f = 6 . 0 2 Figure 121: Closure test: MC reco unfolded through its o wn resp onse matrix (blac k p oints) compared to MC truth (red line). The ratio panel shows the unfolded/truth ratio with both the all-bin and bulk-only χ 2 /ndf rep orted. 232 1 0 3 1 0 2 1 0 1 1 0 0 1 0 1 1 0 2 1 d d [ r a d 1 ] p s = 9 1 . 2 G e V DELPHI Simulation Preliminary Unfolded (IBU, 4 iter) Reweighted truth 1 0 2 1 0 1 1 0 0 [ r a d ] 0.5 1.0 1.5 Unfolded / Truth 2 / n d f = 8 3 . 0 3 Figure 122: Stress test: tilted MC truth folded and unfolded with the original (un tilted) prior. The unfolding reco vers the tilted shap e to within ∼ ,14% on av erage. 233 20 40 60 80 100 120 Bin index 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 Relative change from flat prior [%] p s = 9 1 . 2 G e V DELPHI Simulation Preliminary 0 bins > 20% 20% threshold Figure 123: Flat-prior sensitivit y per bin (after exclusion of flagged bins). No remaining bins exceed the 20% threshold. 234 2 4 6 8 10 IBU iterations 0 10 20 30 40 50 60 70 80 2 / n d f p s = 9 1 . 2 G e V DELPHI Simulation Preliminary Closure Stress Nominal (4) 2 4 6 8 10 IBU iterations 11.5 12.0 12.5 13.0 13.5 14.0 14.5 Mean relative deviation [%] p s = 9 1 . 2 G e V DELPHI Simulation Preliminary Closure Stress Nominal (4) Figure 124: Iteration scan sho wing χ 2 /ndf (left) and mean relativ e deviation (righ t) for the closure test (blue) and stress test (red) as a function of IBU iterations. The nominal c hoice of 4 iterations is mark ed. D.11 Conclusions A protot yp e measuremen t of the c harged-particle tw o-p oin t energy-energy correlator is p erformed using appro ximately 8,600 hadronic Z decay even ts from the DELPHI exp erimen t at √ s = 91 . 2 GeV. The mea- suremen t cov ers 108 angular bins spanning χ ∈ [0 . 006 , 3 . 139] rad, encompassing the collinear, bulk, and bac k-to-back kinematic regions. The key findings are: 1. The EEC shows the exp ected ph ysical features. The collinear p o w er-law enhancement, bulk plateau, and back-to-bac k Sudako v p eak are all clearly resolved in the unfolded sp ectrum. 2. The unfolding framew ork is v alidated. Iterativ e Ba yesian unfolding with 4 iterations successfully corrects for detector effects. The flat-prior sensitivity test confirms that all 108 activ e bins are prior- insensitiv e ( < 20% change with flat prior). The BBB cross-c hec k agrees with IBU within 2.7% on a verage in the v alid region (62% of bins). 3. Six systematic sources are ev aluated. Prior sensitivity (4.9%), angular resolution (3.9%), regu- larization (3.4%), track p T cut (3.3%), proximit y cut (2.7%), and tracking efficiency (0.7%). The total measured systematic uncertaint y is ∼ ,12% (mean relative). 4. The total uncertaint y is ∼ ,18% (mean relativ e). The analysis is systematically dominated in the bulk and bac k-to-back regions, and statistically dominated in the extreme collinear region. 5. The AEEC is computed as a cross-chec k with 48 angular p oin ts and propagated statistical uncertain ties (systematic uncertainties not propagated at prototype scale). D.11.1 Dominan t limitations 1. Protot ype statistics: ∼ ,8,600 data even ts represen t 0.7% of the full DELPHI 1994 dataset. Statistical uncertain ties are ∼ ,12 × larger than exp ected at full statistics. 235 2. Hadronization model systematic: The dominan t unmeasured uncertain ty , estimated at 5–15%. Building an AP ACIC resp onse matrix from 2,508 even ts would b e extremely noisy; the full AP ACIC sample ( ∼ ,3M even ts) is required. This systematic is not included in the quoted uncertaint y budget. 3. Alternativ e unfolding metho d: SVD is not yet implemen ted. BBB serves as a partial cross-c heck but cov ers only 62% of bins and sho ws χ 2 / ndf = 6 . 0 vs. IBU in the bulk region. 4. Missing systematics: 9 of 20 planned sources are deferred to the full-statistics run. Most are exp ected to b e sub-leading ( < 2%), but the heavy flav or systematic ma y con tribute 2–5% in the back-to-bac k region. 5. Co v ariance conditioning: The statistical cov ariance is rank-deficien t (rank 100 for 108 bins) and the total condition n umber (3 . 6 × 10 19 ) preven ts formal χ 2 comparisons at prototype scale. 6. Closure test: χ 2 / ndf = 5 . 27 (all bins), attributed to prototype MC statistics. F ull-statistics closure with χ 2 / ndf ∼ 1 is a m andatory v alidation gate. D.12 F uture directions D.12.1 F ull-statistics measurement The immediate priority is deplo ying the v alidated analysis framew ork on the complete DELPHI 1994 dataset: 1. F ull data extraction: Pro cess all 242 data files, 214 QQPS MC files, and 996 AP ACIC MC files using the Phase 3 extraction pipeline with SLURM parallelization. Estimated w all time: ∼ ,11 minutes with 16 parallel jobs. 2. AP A CIC resp onse matrix: Build a proper alternativ e resp onse matrix from the full AP ACIC sample ( ∼ ,3M even ts) for the hadronization mo del systematic. 3. SVD unfolding: Implemen t SVD as a prop er alternativ e unfolding method, providing an independent cross-c heck of the IBU result. 4. Complete systematic program: Implemen t the 9 deferred sources: track momentum scale and resolution, selection cut v ariations ( N ch , E vis , thrust axis), track qualit y , heavy fla vor, track matc hing metric, and high- p T mo deling. 5. Increase b ootstrap replicas: Use 500+ replicas for a w ell-conditioned statistical cov ariance matrix. 6. AEEC systematics: Construct AEEC systematic cov ariances from shift vectors. D.12.2 Comparisons and v alidation 7. Comparison to ALEPH EEC: Quantitativ e comparison using the w ell-conditioned full-statistics co v ariance, including χ 2 compatibilit y tests. 8. Comparison to theory: Overla y with NNLO+NNLL col +N 4 LL b2b predictions from Chen et al. 9. 1995 data addition: Extend to the 1995 dataset (Y13716, 246 files) for ∼ ,33% more on-peak statistics. D.12.3 Ph ysics exploitation 10. α s extraction: Fit the back-to-bac k region using TMD factorization at N 4 LL with prop er treatmen t of non-p erturbative corrections. Recen t work (arXiv:2507.17478) suggests that curren t data precision ma y not strongly constrain α s when non-p erturbativ e parameters are floated. 11. Three-point EEC (EEEC): Extend to the three-p oin t correlator, sharing the same data sample and selection infrastructure. 12. Fla v or-tagged EEC: Exploit DELPHI’s RICH particle identification for flav or-sp ecific EEC mea- suremen ts – a nov el observ able not yet measured at an y collider. 236 13. Bose-Einstein correlation effects: Assess with MC BEC on/off comparison. 14. HEPData submission: Publish results in machine-readable format on HEPData for communit y use. D.13 App endices D.13.1 Mac hine-readable results The following CSV files are provided in the results/ directory: File Con tent eec spectrum.csv Unfolded EEC sp ectrum (130 bins, with active flag) aeec spectrum.csv AEEC sp ectrum (48 p oin ts) covariance total.csv T otal cov ariance matrix (108 x 108 active bins) covariance stat.csv Statistical co v ariance matrix (108 x 108 activ e bins) systematic shifts.csv P er-source systematic shift vectors (130 bins) D.13.2 Correlation matrix excerpt A representativ e 8 x 8 submatrix of the total correlation matrix is sho wn for bins spanning the collinear, bulk, and back-to-bac k regions. The full 108 x 108 matrix is pro vided in results/covariance total.csv . χ [rad] 0.007 0.030 0.121 0.965 1.955 2.984 3.104 3.134 0.007 +1.00 -0.12 +0.09 +0.05 +0.06 +0.05 +0.16 +0.01 0.030 -0.12 +1.00 +0.25 +0.23 +0.21 +0.22 -0.19 +0.03 0.121 +0.09 +0.25 +1.00 +0.78 +0.76 +0.62 +0.24 +0.13 0.965 +0.05 +0.23 +0.78 +1.00 +0.85 +0.66 +0.31 +0.19 1.955 +0.06 +0.21 +0.76 +0.85 +1.00 +0.66 +0.35 +0.19 2.984 +0.05 +0.22 +0.62 +0.66 +0.66 +1.00 +0.36 +0.04 3.104 +0.16 -0.19 +0.24 +0.31 +0.35 +0.36 +1.00 -0.06 3.134 +0.01 +0.03 +0.13 +0.19 +0.19 +0.04 -0.06 +1.00 The bulk bins ( χ ≈ 0 . 12–2 . 0 rad) sho w strong p ositiv e correlations (0.76–0.85), driven by the correlated systematic shift v ectors (esp ecially p T cut v ariation, which con tributes a uniform shift across the bulk). The collinear and back-to-bac k edge bins are weakly correlated with the bulk, consistent with the different systematic profiles in those regions. D.13.3 Angular binning definition The 130 angular bins are defined as follows: • Bins 0–49 (collinear): 50 logarithmically spaced bins from χ = 0 . 002 to χ = 0 . 200 rad. • Bins 50–79 (bulk): 30 linearly spaced bins from χ = 0 . 200 to χ = 2 . 900 rad. • Bins 80–129 (bac k-to-back): 50 logarithmically spaced bins (flipp ed) from χ = 2 . 900 to χ = 3 . 140 rad. The bin edges are provided in results/eec spectrum.csv . D.13.4 Extended cutflo w The complete track-lev el and even t-level cutflows are presented in the even t selection section ab o ve (Section 3). 237 D.13.5 Unfolding iteration scan The iteration scan (sho wn in the v alidation section, abov e) presen ts the closure test χ 2 /ndf and mean relativ e deviation as a function of the num b er of IBU iterations from 1 to 10. The nominal choice of 4 iterations balances closure quality against noise amplification. D.13.6 Data extraction pip eline Data extraction from the DELPHI native .al binary format uses the skelana F ortran analysis framew ork a v ailable on CVMFS ( /cvmfs/delphi.cern.ch/setup.sh ). A custom analysis program ( extract data.car ) reads the VECP common block for reconstructed tracks and the PSCLUJ (LUND record) common blo c k for MC truth particles, writing CSV output files. The P A TCHY prepro cessor ( nypatchy ) compiles the F ortran from .car and .cra files. This pip eline is do cumen ted in detail in the Phase 2 exploration artifact. The extraction pro duces three output file types p er input: • Ev en ts CSV: run, event, ecm, nvecp, ncvecp, nnvecp, ihad4, echar, emneu, ehneu • T racks CSV: run, event, itrk, px, py, pz, energy, mass, momentum, charge, lvsele • T ruth CSV (MC only): run, event, itrk, kf, ks, px, py, pz, energy, mass D.13.7 Pixi task graph The complete analysis c hain is repro ducible via: pixi run all Individual phase tasks: T ask Command Description select python phase3 selection/scripts/apply selection.py Ev ent and track selection compute-eec python phase3 selection/scripts/compute eec.py EEC computation build-response python phase3 selection/scripts/build response.py Resp onse matrix construction plot-dataMC python phase3 selection/scripts/plot dataMC.py Data/MC comparison plots unfold python phase4 inference/scripts/unfold.py IBU unfolding and v alidation systematics python phase4 inference/scripts/run systematics.py Systematic uncertainties covariance python phase4 inference/scripts/build covariance.py Co v ariance matrix plot-phase4 python phase4 inference/scripts/plot results.py Result plots build-pdf pandoc ... ANALYSIS NOTE.md -o ANALYSIS NOTE.pdf PDF compilation 238 E H → τ τ Signal Strength Measuremen t with CMS Op en Data E.0.1 Abstract A measurement of the Higgs b oson signal strength in the H → τ τ decay channel is performed using proton– proton collision data collected b y the CMS exp eriment at √ s = 8 T eV, corresp onding to an in tegrated luminosit y of 11.6 fb − 1 . The analysis targets the µτ h final state, in which one τ lepton decays to a muon and the other deca ys hadronically . The signal strength modifier µ = σ obs /σ SM is extracted from a binned maxim um-likelihoo d template fit to the visible di-tau mass distribution m vis in the range [40 , 200] GeV using the HistF actory mo del as implemented in ‘p yhf‘. The b est-fit signal strength is ˆ µ = − 5 . 67 ± 4 . 81, consistent with the Standard Mo del exp ectation of µ = 1 within 1 . 4 σ . The exp ected sensitivity of the inclusive µτ h analysis is σ µ ≈ 5 . 6, reflecting the limited signal-to-background ratio ( S/B ≈ 0 . 002) in a single even t category without jet-based categorization or the SVfit mass algorithm. The measurement demonstrates the full analysis workflo w using publicly av ailable CMS Op en Data and the ‘p yhf‘ statistical framew ork. E.0.2 In tro duction Ph ysics Motiv ation The discov ery of the Higgs b oson b y the A TLAS and CMS exp erimen ts in 2012 [7, 8] established the existence of a scalar b oson with mass near 125 GeV. A cen tral prediction of the Standard Mo del (SM) is that the Higgs b oson couples to fermions with a strength prop ortional to their mass, through Y uk a wa in teractions of the form L Y uk a wa = − m f v ¯ f f h, (39) where m f is the fermion mass, v = 246 GeV is the v acuum exp ectation v alue of the Higgs field, and h is the physical Higgs b oson field. Measuring these couplings tests whether the observed boson is indeed the SM Higgs b oson or whether deviations indicate new ph ysics beyond the Standard Mo del. Among the fermionic deca y c hannels, H → τ τ pro vides the most sensitive direct prob e of the Higgs–lepton Y uk a wa coupling at the LHC. The τ lepton is the heaviest lepton ( m τ = 1 . 777 GeV), giving it the largest Y uk a wa coupling among leptons. The predicted branc hing ratio is BR( H → τ τ ) = 6 . 32% at m H = 125 GeV, compared to BR( H → µµ ) = 0 . 022% and BR( H → ee ) < 10 − 8 . The H → τ τ decay is therefore the only leptonic Higgs decay ch annel accessible at the LHC with the Run 1 dataset. The H → τ τ channel is exp erimen tally challenging for several reasons. First, ev ery τ decay produces at least one neutrino, which escapes the detector and degrades the inv arian t mass resolution. The visible mass of the τ τ system is shifted below the true Higgs b oson mass, reducing the separation b et ween signal and the dominan t irreducible background from Drell–Y an Z/γ ∗ → τ τ pro duction, which has a cross-section roughly three orders of magnitude larger than the Higgs signal. Second, hadronic τ decays ( τ h ) must b e distinguished from the abundan t QCD jet bac kground, requiring dedicated identification algorithms. Despite these challenges, the H → τ τ signal has b een observed by both CMS and A TLAS using the full Run 1 and Run 2 datasets. The µτ h Final State The τ lepton deca ys either leptonically ( τ → eν ¯ ν or τ → µν ¯ ν , combined branc hing ratio ≈ 35%) or hadronically ( τ → hadrons + ν , branching ratio ≈ 65%). F or H → τ τ , the exp erimen tally accessible final states include eµ , eτ h , µτ h , and τ h τ h , where the subscript h denotes a hadronic τ deca y . This analysis targets the µτ h final state, in whic h one τ lepton decays to a muon and neutrinos, and the other decays hadronically . This channel has several adv an tages: • Clean m uon signature: The muon pro vides a clean trigger and offline selection, with high purity from the tight m uon iden tification and isolation requirements. • Moderate branc hing ratio: BR( τ → µν ¯ ν ) × BR( τ → hadrons + ν ) ≈ 17 . 4% × 64 . 8% ≈ 22 . 6% for eac h c harge ordering, giving a combined µτ h branc hing ratio of ≈ 23%. • CMS trigger: The CMS exp erimen t op erates a cross-trigger requiring b oth an isolated muon and a lo ose hadronic τ candidate, providing efficient ev ent selection. 239 Prior Measurements The H → τ τ signal has been the sub ject of extensiv e study by b oth A TLAS and CMS. The key prior measurements are summarized in tbl. 64 . T able 64: Summary of prior H → τ τ measuremen ts. All results com bine m ultiple τ τ deca y c hannels and use even t categorization. Measuremen t Channel √ s Result Reference CMS Run 1 H → τ τ (all) 7 + 8 T eV µ = 0 . 78 ± 0 . 27, 3.2 σ JHEP 05 (2014) 104 A TLAS Run 1 H → τ τ (all) 7 + 8 T eV µ = 1 . 43 +0 . 43 − 0 . 37 JHEP 04 (2015) 117 CMS 13 T eV H → τ τ (all) 13 T eV µ = 1 . 09 +0 . 27 − 0 . 26 , 4.9 σ PLB 779 (2018) 283 CMS Run 2 H → τ τ (all) 13 T eV 5.5 σ observ ation PLB 805 (2020) 135425 The CMS Run 1 result at √ s = 7 + 8 T eV, using the full 2011 and 2012 datasets (approximately 5 fb − 1 at 7 T eV and 20 fb − 1 at 8 T eV), obtained µ = 0 . 78 ± 0 . 27 with 3.2 σ evidence for the H → τ τ signal. This analysis com bined all τ τ final states ( eµ , eτ h , µτ h , τ h τ h ) with jet-based ev ent categorization (0-jet, 1-jet b oosted, VBF 2-jet) and used the SVfit algorithm for di- τ mass reconstruction. The observ ation at > 5 σ significance was ac hieved with the 13 T eV Run 2 dataset. Scop e and Goals This analysis uses CMS Op en Data from 2012 proton–proton collisions at √ s = 8 T eV, corresp onding to an in tegrated luminosity of L = 11 . 6 fb − 1 from the Run2012B and Run2012C data-taking p eriods. The data are provided as reduced NanoA OD outreach files pro duced by the CMS Open Data team using the AOD2NanoA OD outreach to ol. The primary goals are: 1. Measure the signal strength mo difier µ = σ obs /σ SM for Higgs b oson pro duction in the µτ h final state using a binned template fit to the visible di-tau mass distribution. 2. Ev aluate systematic uncertain ties follo wing the standards established by CMS reference analyses. 3. V alidate the statistical mo del through signal injection tests, Asimov fits, and cross-chec k analyses. 4. Document the complete analysis workflo w as a reference for CMS Op en Data analyses. The analysis is inten tionally simplified compared to the full CMS H → τ τ analysis: a single inclusiv e ev ent category is used (without jet-based categorization), the visible mass is used instead of the SVfit mass, and only the µτ h c hannel is analyzed. These simplifications reduce the exp ected sensitivity from σ µ ≈ 0 . 27 (full CMS analysis) to σ µ ≈ 5 . 6 (this analysis), but the full statistical framework and systematic uncertaint y treatmen t are preserved. 240 E.0.3 Data Samples Data The analysis uses proton–proton collision data collected by the CMS detector at √ s = 8 T eV during the 2012 LHC running p erio d. Even ts are selected from the T auPlusX primary dataset, which con tains ev ents passing the cross-trigger HLT IsoMu17 eta2p1 LooseIsoPFTau20 . This trigger requires an isolated m uon with p T > 17 GeV and | η | < 2 . 1 together with a lo ose isolated hadronic τ candidate with p T > 20 GeV at the High Level T rigger (HL T). The T auPlusX primary dataset is the correct choice for the µτ h c hannel: it pro vides complete trigger co verage for the cross-trigger path, which w ould not be presen t in the SingleMu primary dataset. T able 65: Data samples used in this analysis. File Dataset Ev ents Luminosity Run2012B TauPlusX.root Run2012B 35,647,508 ∼ 4 . 4 fb − 1 Run2012C TauPlusX.root Run2012C 51,303,171 ∼ 7 . 2 fb − 1 T otal 86,950,679 11.6 fb − 1 All data files are stored at /eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/ and are in the reduced NanoAOD outreach format. Eac h file contains a single TT ree named Events with 62 branches. Mon te Carlo Sim ulation Monte Carlo sim ulated samples are used to mo del signal and background pro cesses. All MC samples are pro duced with full CMS detector sim ulation and are provided in the same reduced NanoAOD outreach format. MC files contain 69 branc hes (7 additional generator-lev el branc hes compared to data). T able 66: Monte Carlo sim ulation samples used in this analysis. The cross-sections include branching ratios where applicable. The w eight column gives w = σ × L / N gen for L = 11 , 600 pb − 1 . Pro cess File Generator σ × BR [pb] N gen W eight ggH → τ τ GluGluToHToTauTau.root P owheg+Pythia6 1.22 476,963 2 . 967 × 10 − 2 VBF H → τ τ VBF HToTauTau.root P owheg+Pythia6 0.10 491,653 2 . 360 × 10 − 3 D Y → ℓℓ (M > 50) DYJetsToLL.root MadGraph+Pythia6 3504.0 30,458,871 1 . 334 t ¯ t TTbar.root Po wheg+Pythia6 225.2 6,423,106 4 . 067 × 10 − 1 W( → ℓν )+1j W1JetsToLNu.root MadGraph+Pythia6 6662.0 29,784,800 2 . 594 W( → ℓν )+2j W2JetsToLNu.root MadGraph+Pythia6 2160.0 30,693,853 8 . 163 × 10 − 1 W( → ℓν )+3j W3JetsToLNu.root MadGraph+Pythia6 640.0 15,241,144 4 . 872 × 10 − 1 Signal Samples Tw o Higgs b oson pro duction modes are considered: • Gluon–gluon fusion (ggH): σ ggH = 19 . 3 pb at NNLO+NNLL QCD with NLO EW corrections (LHC Higgs Cross Section W orking Group YR3), giving σ ggH × BR( H → τ τ ) = 1 . 22 pb. • V ector boson fusion (VBF): σ VBF = 1 . 58 pb at NLO QCD with NLO EW corrections, giving σ VBF × BR( H → τ τ ) = 0 . 10 pb. Both signal processes are generated with Po wheg for the hard process and Pythia6 for parton show ering and hadronization, for a Higgs b oson mass of m H = 125 GeV. 241 Bac kground Samples The background samples cov er the dominan t pro cesses in the µτ h final state: • Drell–Y an (D Y → ℓℓ , M > 50 GeV): Generated with MadGraph+Pythia6. The cross-section of 3504 pb is the NNLO v alue computed with FEWZ 3.1. This sample is the dominant background, con tributing approximately 74% of the total background yield. It includes b oth Z → τ τ (irreducible) and Z → µµ (where a m uon is misiden tified as τ h ) contributions. • T op-quark pair pro duction ( t ¯ t ): Generated with Po wheg+Pythia6. The cross-section of 225.2 pb is the approximate NNLO v alue from T op++ 2.0. This background con tributes real τ leptons from top-quark decays and jets misidentified as τ h . • W+jets: Three exclusiv e jet-multiplicit y samples (W+1j, W+2j, W+3j) generated with Mad- Graph+Pythia6 with MLM matching. The exclusive cross-sections are LO MadGraph v alues. These samples are summed directly to form the inclusive W+jets template without double-coun ting. The W+jets background consists primarily of a real muon from W → µν decay with a jet misiden tified as τ h . Missing Bac kground Processes The outreach dataset do es not include samples for the follo wing pro- cesses: • Single-top pro duction ( ∼ 85 pb inclusive at 8 T eV): Contributes real τ leptons similarly to t ¯ t but at approximately one-third the rate. • Diboson pro duction (WW, WZ, ZZ; combined ∼ 100 pb): Con tributes real leptons at v ery low rates after the di- τ selection. • W+0j: Not included in the outreac h files; the W+1j/2j/3j samples cov er the dominan t W+jets con tribution at the µτ h selection level. After the full µτ h selection (tight τ isolation, opp osite-sign requirement, m T < 50 GeV), the combined con tribution of these missing pro cesses is estimated to b e < 3% of the total background yield. This is do cumen ted as a kno wn limitation. MC Normalization Each MC sample is normalized to the in tegrated luminosit y using per-even t weigh ts: w = σ × L N gen , (40) where σ is the pro duction cross-section times branc hing ratio, L = 11 , 600 pb − 1 is the in tegrated lumi- nosit y , and N gen is the total num b er of generated even ts in the sample. The weigh ts for eac h sample are listed in tbl. 66 . Data F ormat All files are in the reduced NanoA OD outreac h format produced b y the CMS Op en Data team using the AOD2NanoA OD outreach tool. Key features of this format, as discov ered during the data exploration phase: • Jagged arra ys: All ob ject-level branc hes (Muon, T au, Jet) are v ariable-length arra ys, with the num b er of ob jects p er even t given by nMuon , nTau , and nJet . This was initially assumed to b e a flat format (one ob ject p er ev ent) in the strategy phase. • 62 common branches (data and MC) and 7 MC-only branches ( GenPart * , nGenPart ) for generator-lev el information. • T rigger flags stored as b oolean scalars, including the analysis trigger HLT IsoMu17 eta2p1 LooseIsoPFTau20 . • MET stored as scalar MET pt and MET phi , with cov ariance matrix elements also a v ailable. • T au identification includes decay mode finding, isolation working p oin ts (VLo ose through Tight), an ti-electron discriminators (Lo ose through Tight), and anti-m uon discriminators (Lo ose through Tigh t). 242 E.0.4 Ev ent Selection The ev en t selection targets the µτ h final state with exactly one isolated m uon and one isolated hadronic τ candidate with opp osite electric c harge. The selection follows the strategy dev elop ed in Phase 1 with mo difications based on the data exploration findings. T rigger Ev ents m ust pass the cross-trigger: HLT IsoMu17 eta2p1 LooseIsoPFTau20 This trigger requires: • An isolated m uon with p T > 17 GeV and | η | < 2 . 1 at the HL T lev el. • A loose isolated hadronic τ candidate with p T > 20 GeV at the HL T level. The trigger efficiency is not measured in situ in this analysis. A conserv ative ± 5% systematic uncertain t y is assigned to co ver b oth the m uon leg and the τ leg efficiencies (sec. E.0.7 ). Muon Selection The leading muon (highest p T ) passing all qualit y criteria is selected. The requiremen ts are: T able 67: Muon selection requirements. Requiremen t V alue Motiv ation p T > 20 GeV Ab ov e trigger turn-on ( p trig T = 17 GeV) | η | < 2 . 1 T rigger acceptance Tigh t ID Muon tightId == True Reject non-prompt muons, cosmics PF relative isolation Muon pfRelIso04 all < 0 . 1 Suppress QCD, W+jets The m uon tight identification flag ( Muon tightId ) implicitly requires global and track er m uon recon- struction. The branches Muon isGlobal and Muon isTracker assumed in the strategy phase are not present in the outreach files; Muon tightId is the only av ailable iden tification flag b esides Muon softId . The isolation threshold of 0.1 is tigh ter than the standard CMS tigh t w orking point (0.15) to further suppress QCD multijet bac kgrounds. This is consistent with the CMS H → τ τ reference analysis. The ob ject selection strategy applies all qualit y cuts as per-ob ject masks and selects the highest- p T ob ject passing all criteria. If the raw-leading muon fails ID or isolation, a sub-leading muon that passes all cuts will be selected. In practice, the sub-leading reco very rate is negligible ( < 1%) because most even ts in these outreac h files hav e exactly one muon and one τ after the trigger requiremen t. Hadronic T au Selection The leading τ candidate (highest p T ) passing all qualit y criteria is selected. The requirements are: T able 68: Hadronic tau selection requirements. Requiremen t V alue Motiv ation p T > 20 GeV Ab ov e trigger turn-on | η | < 2 . 3 T rack er acceptance Deca y mode finding Tau idDecayMode == True V alid HPS reconstruction Isolation Tau idIsoTight == True Tight WP suppresses jet → τ h fak es An ti-electron Tau idAntiEleLoose == True Lo ose WP rejects e → τ h fak es 243 Requiremen t V alue Motiv ation An ti-muon Tau idAntiMuLoose == True Lo ose WP rejects µ → τ h fak es Charge | q τ h | = 1 Reject charge-0 reconstruction artifacts W orking P oint Mo dification The strategy (Phase 1) sp ecified tight anti-electron and tight an ti-muon discriminators. The data exploration (Phase 2) rev ealed that these working p oin ts are far more aggressive in the outreach NanoA OD files than in the full CMS analysis: • Tau idAntiEleTight rejects ∼ 83% of signal τ h candidates that pass tight isolation (compared to ∼ 5–10% in the full CMS analysis). • Tau idAntiMuTight rejects an additional ∼ 99% of remaining candidates after the tigh t anti-electron cut. The lo ose working points are used instead, following the Phase 2 recommendation. The impact on signal efficiency was studied: T able 69: Hadronic tau selection efficiency for differen t an ti-lepton w orking p oin t combinations, measured relative to even ts with ≥ 1 m uon and ≥ 1 τ . W orking p oin t com bination ggH signal efficiency DM + tight iso + tight anti-e + tigh t an ti- µ + | q | = 1 + p T > 20 15.2% DM + tight iso + lo ose anti-e + lo ose anti- µ + | q | = 1 + p T > 20 20.5% DM + tight iso + | q | = 1 + p T > 20 (no an ti-e/anti- µ ) 44.7% Ev en with lo ose w orking p oints, the anti-m uon discriminator is the most aggressiv e single cut in the analysis, rejecting 78% of signal τ h candidates. This is a kno wn property of the outreac h files and cannot b e a v oided: without the an ti-muon discriminator, Z → µµ contamination (where a m uon is misidentified as τ h ) would o verwhelm the Z → τ τ background template. T au Charge Requiremen t The data exploration found that 12.5% of τ candidates ha ve q τ h = 0. These are candidates where the HPS algorithm failed to determine a unique c harge, typically originating from jets or p oorly reconstructed ob jects. They are rejected by requiring | q τ h | = 1. Ev ent-Lev el Requiremen ts After selecting the m uon and τ h candidate, the follo wing ev ent-lev el require- men ts are applied: T able 70: Ev ent-lev el selection requirements. Requiremen t V alue Motiv ation ∆ R ( µ, τ h ) > 0 . 5 Ensure well-separated ob jects Opp osite sign (OS) q µ × q τ h < 0 Signal is charge-neutral (Higgs, Z) T ransverse mass m T ( µ, E miss T ) < 50 GeV Suppress W+jets 244 T ransverse Mass The transv erse mass is defined as: m T = q 2 p µ T E miss T (1 − cos ∆ ϕ ( µ, E miss T )) , (41) where ∆ ϕ ( µ, E miss T ) is the azimuthal angle b et w een the m uon and the missing transverse energy vector. The m T < 50 GeV requirement remov es W+jets even ts where the m uon originates from W → µν deca y: in W → µν even ts, m T p eaks near m W ≈ 80 GeV (Jacobian peak), while signal and Z → τ τ ev ents cluster at lo w m T . The 50 GeV threshold remov es the bulk of the W+jets contribution while retaining 85% of signal ev ents. This is the standard W+jets suppression tec hnique in CMS H → τ τ analyses. Visible Mass The visible di-tau mass is computed from the muon and τ h four-momen ta: m vis = q ( p µ + p τ h ) 2 = q ( E µ + E τ h ) 2 − |  p µ +  p τ h | 2 , (42) where the four-momen ta are constructed from the reconstructed ( p T , η , ϕ, m ) using the muon mass and τ candidate mass from the ROOT files. The visible mass is the primary discriminating v ariable for the template fit (sec. E.0.9 ). Same-Sign Con trol Region Even ts passing all selection criteria except with same-sign charge ( q µ × q τ h > 0) define the QCD multijet estimation region (sec. E.0.6 ). The same m T < 50 GeV requirement applies. Cutflo w Ra w Even t Coun ts The cutflo w in tbl. 71 sho ws raw (unw eighted) ev ent counts after each sequential selection requirement. All samples are pro cessed in full. T able 71: Ra w (unw eighted) ev ent coun ts at eac h stage of the selection. All samples processed in full. Cut ggH VBF DY t ¯ t W+1j W+2j W+3j Data B Data C T otal 476,963 491,653 30,458,871 6,423,106 29,784,800 30,693,853 15,241,144 35,647,508 51,303,171 T rigger 33,520 49,109 4,753,620 872,546 1,251,689 2,358,002 1,516,331 10,038,076 13,901,050 ≥ 1 µ , ≥ 1 τ 33,503 49,096 4,752,562 872,415 1,250,687 2,356,492 1,515,511 9,982,160 13,834,388 µ p T > 20 28,559 41,574 4,518,977 716,159 1,162,160 2,185,557 1,400,848 7,511,964 10,655,305 µ | η | < 2 . 1 28,458 41,440 4,066,701 711,765 1,162,081 2,185,350 1,400,630 7,279,489 10,281,660 µ tight ID 27,380 39,906 3,942,901 690,314 1,125,179 2,119,414 1,358,311 6,682,919 9,459,199 µ iso < 0 . 1 20,833 32,465 3,299,121 562,330 908,205 1,712,521 1,090,375 4,047,104 6,014,152 τ h p T > 20 20,382 31,636 3,271,634 551,345 899,058 1,696,014 1,079,450 3,990,024 5,931,619 τ h | η | < 2 . 3 20,336 31,570 3,254,354 550,377 897,795 1,693,717 1,077,886 3,976,494 5,910,816 τ h DM 20,266 31,510 3,251,967 549,972 894,247 1,688,788 1,075,489 3,963,057 5,893,272 τ h tight iso 16,596 26,243 2,814,890 450,486 687,684 1,266,871 784,215 3,110,002 4,660,996 τ h anti-e loose 16,076 25,513 2,811,887 432,984 687,623 1,266,784 784,171 3,105,176 4,653,631 τ h anti- µ loose 3,452 4,581 30,755 14,966 6,999 11,508 6,412 31,798 50,293 | q τ h | = 1 3,452 4,581 30,755 14,966 6,999 11,508 6,412 31,798 50,293 ∆ R > 0 . 5 3,452 4,581 30,739 14,931 6,964 11,457 6,392 31,740 50,186 OS 3,410 4,493 29,093 13,439 5,726 8,893 4,801 25,626 40,593 m T < 50 GeV 2,894 3,843 25,365 3,527 1,400 2,096 1,231 15,400 24,645 245 P er-Cut Relative Efficiencies tbl. 72 sho ws the relative efficiency of eac h selection requiremen t with resp ect to the previous requirement, for representativ e pro cesses. T able 72: P er-cut relative efficiencies for representativ e pro cesses. Cut ggH DY Data (B+C) T rigger / T otal 7.0% 15.6% 27.6% ≥ 1 µ , ≥ 1 τ / T rigger 99.9% 100.0% 99.5% µ p T > 20 / prev 85.2% 95.1% 76.1% µ | η | < 2 . 1 / prev 99.6% 90.0% 96.7% µ tight ID / prev 96.2% 97.0% 91.9% µ iso < 0 . 1 / prev 76.1% 83.7% 62.0% τ h p T > 20 / prev 97.8% 99.2% 98.6% τ h | η | < 2 . 3 / prev 99.8% 99.5% 99.7% τ h DM / prev 99.7% 99.9% 99.7% τ h tigh t iso / prev 81.9% 86.6% 78.8% τ h an ti-e loose / prev 96.9% 99.9% 99.8% τ h an ti- µ loose / prev 21.5% 1.1% 1.0% | q τ h | = 1 / prev 100.0% 100.0% 100.0% ∆ R > 0 . 5 / prev 100.0% 99.9% 99.8% OS / prev 98.8% 94.6% 80.8% m T < 50 GeV / prev 84.9% 87.2% 60.5% Ov erall Selection Efficiencies The ov erall selection efficiency (relativ e to total generated ev ents) is summarized in tbl. 73 . T able 73: Overall selection efficiencies relativ e to total generated ev ents. Sample T otal After full selection (OS + m T < 50 GeV) Efficiency ggH → τ τ 476,963 2,894 0.607% VBF H → τ τ 491,653 3,843 0.782% D Y → ℓℓ 30,458,871 25,365 0.083% t ¯ t 6,423,106 3,527 0.055% W+1j 29,784,800 1,400 0.005% W+2j 30,693,853 2,096 0.007% W+3j 15,241,144 1,231 0.008% Data (B+C) 86,950,679 40,045 0.046% Cutflo w Discussion The dominant rejection steps are: 1. T rigger ( × 0 . 07–0 . 28 depending on sample): The cross-trigger requires b oth an isolated m uon and a τ candidate at the HL T lev el, providing strong initial background rejection. 2. An ti-m uon lo ose discriminator ( × 0 . 21 for ggH): By far the most aggressiv e single cut, rejecting 78% of signal τ h candidates that pass all other tau ID criteria. This is a kno wn property of the outreac h NanoA OD files; in the full CMS analysis, the anti-m uon discriminator t ypically rejects < 5% of gen uine τ h . The cut remains necessary b ecause without it, Z → µµ con tamination (where a m uon fakes τ h ) dominates the DY bac kground. 3. Muon isolation ( × 0 . 62–0 . 76): The tigh t isolation threshold of 0.1 is effectiv e at remo ving non-prompt m uon bac kgrounds. 246 4. m T < 50 GeV ( × 0 . 85 for signal, × 0 . 26 for t ¯ t , × 0 . 24–0 . 30 for W+jets): V ery effective at suppressing W+jets and t ¯ t backgrounds, whic h hav e genuine W → µν decays pro ducing high m T . The anti-electron loose discriminator rejects only ∼ 3% of remaining signal after tight isolation, con- sisten t with the exp ectation that gen uine hadronic τ deca ys rarely fail the lo ose an ti-electron discriminator. The charge  = 0 and ∆ R > 0 . 5 requiremen ts remo ve essen tially no even ts after the anti-m uon cut, indicating that the an ti-muon discriminator already remov es the problematic candidates. Total Trig 1 p T I D i s o p T D M i s o anti-e a n t i - q 0 R OS m T 1 0 5 1 0 4 1 0 3 1 0 2 1 0 1 1 0 0 Efficiency relative to total p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data g g H D Y Data Figure 125: Selection efficiency as a function of sequen tial cuts for ggH signal, DY background, and data. The an ti-muon lo ose discriminator causes the largest single efficiency drop across all processes. The m T < 50 GeV cut preferentially remo ves t ¯ t and W+jets while retaining most signal and DY ev ents. 247 E.0.5 Corrections and Signal Extraction Strategy This analysis measures the signal strength mo difier µ = σ obs /σ SM via a binned template fit to the detector- lev el visible di-tau mass distribution. No unfolding or acceptance correction is applied: the template fit directly compares the observed m vis sp ectrum to detector-level MC templates for signal and bac kground pro cesses. The signal efficiency and acceptance are absorb ed into the signal template from MC simulation, meaning the measured µ represen ts the ratio of observed to predicted even t rates after all selection and detector effects. This approac h is the standard tec hnique for Higgs b oson signal strength measurements at the LHC and av oids the complexities of unfolding. How ever, it means the result dep ends on the MC mo del of signal acceptance and detector resp onse. In particular, the signal m vis shap e is tak en en tirely from Po wheg+Pythia6 sim ulation, and any mismo deling of the Higgs p T sp ectrum or τ deca y kinematics would affect the extracted µ . This is a known limitation, partially cov ered b y the tau energy scale systematic uncertaint y (sec. E.0.7 ) and the signal cross-section uncertainties (sec. E.0.7 ). In a full analysis, generator modeling systematics (alternativ e generators, parton show er v ariations) would provide additional cov erage; here, only a single generator is av ailable per proce ss, and this is do cumented as a limitation. 248 E.0.6 Bac kground Estimation Ov erview The bac kground composition in the signal region (opp osite-sign, m T < 50 GeV) is summarized in tbl. 74 . T able 74: Bac kground composition in the opp osite-sign signal re- gion with m T < 50 GeV. Yields are weigh ted to L = 11 , 600 pb − 1 . Bac kground Method OS yield F raction D Y → ℓℓ MC template 33,774 74.0% W+jets MC template 5,833 12.8% QCD multijet Data-driven (SS → OS) 4,664 10.2% t ¯ t MC template 1,343 2.9% T otal background 45,614 100% Signal ( µ = 1) 95 Data 39,726 The signal-to-background ratio is S/B = 95 / 45 , 614 = 0 . 0021, consistent with the O (10 − 3 ) exp ectation from the strategy phase. The total MC prediction (45,614) exceeds data (39,726) by appro ximately 15%. This pre-fit normalization offset is absorb ed b y the template fit through background normalization nuisance parameters (sec. E.0.10 ). Drell–Y an ( Z /γ ∗ → ℓℓ ) The Drell–Y an pro cess is the dominan t bac kground (74%), en tering through tw o c hannels: • Z → τ τ (irreducible): Both τ leptons are gen uine. The m vis distribution p eaks at 60–80 GeV (shifted b elo w m Z due to neutrinos carrying aw ay in visible energy). This is a smo oth, w ell-mo deled template. • Z → µµ (reducible): One muon is reconstructed as τ h b y the HPS algorithm. This produces a sharp spik e in the m vis distribution at ∼ 91 GeV (the Z mass, since both visible ob jects are m uons with negligible missing energy). This comp onent survives b ecause the an ti-muon discriminator (ev en at the lo ose w orking p oint) does not fully reject muons misidentified as τ h . Both components are modeled b y the inclusiv e DYJetsToLL MC sample. In the template fit, the DY normalization is treated as a free-floating parameter bounded to [0 . 5 , 1 . 5], constrained b y the m vis shap e. The inability to independently normalize the Z → τ τ and Z → µµ comp onen ts (which hav e fundamentally differen t m vis shap es) is the primary source of p ost-fit go odness-of-fit degradation (sec. E.0.12 ). In a publication-quality analysis, the Z → τ τ comp onen t would b e estimated using the embedding tec hnique (replacing m uons in Z → µµ data ev ents with sim ulated τ deca ys) or by splitting the DY MC sample in to Z → τ τ and Z → µµ components using generator-lev el information ( GenPart pdgId ). The latter approac h is feasible with the a v ailable data but was not implemented due to the computational cost of repro cessing the full 8.7 GB DY sample. T op-Quark Pair Pro duction ( t ¯ t ) The t ¯ t background contributes ∼ 3% of the total bac kground yield. It enters through top-quark decay chains that pro duce a m uon (from t → W b → µν b ) and a hadronic τ candidate (either a gen uine τ from t → W b → τ ν b or a jet misidentified as τ h ). The m T < 50 GeV cut is highly effectiv e at suppressing t ¯ t : the relative efficiency at this step is only 26% (compared to 85% for signal), reflecting the large gen uine missing energy from the W → µν decay in t ¯ t ev ents. The t ¯ t template is taken from MC simulation with a normalization uncertain ty of ± 10% (conserv ativ e relativ e to the approximate NNLO cross-section uncertaint y of ∼ 6%). 249 W+jets The W+jets bac kground contributes ∼ 13% of the total bac kground yield. It consists of ev ents with a real muon from W → µν decay and a jet misiden tified as τ h . Three exclusive jet-multiplicit y samples (W+1j, W+2j, W+3j) are summed to form the inclusive W+jets template. The W+jets cross-sections use LO MadGraph v alues, whic h are kno wn to ov erestimate the inclusiv e rate b y 10–30%. The template fit assigns a ± 20% normalization uncertain ty to the W+jets template. The post-fit normalization is 1 . 24 × the pre-fit v alue, indicating that the fit adjusts the W+jets contribution upw ard to comp ensate for the QCD reduction (see sec. E.0.6 and sec. E.0.10 ). High- m T Sideband The strategy planned a high- m T sideband ( m T > 70 GeV) for W+jets normalization v alidation. This sideband is not implemented as a separate con trol region because the m T < 50 GeV require- men t defining the signal region excludes these even ts by construction. Instead, the W+jets normalization is treated as a constrained nuisance parameter in the template fit, following the standard approach in CMS H → τ τ analyses: the m vis shap e difference b et w een W+jets (broad, featureless) and D Y (peaked at 60–80 GeV) provides implicit constrain t on the W+jets normalization. QCD Multijet The QCD multijet background has no reliable MC template. It is estimated from data using the same-sign (SS) control region. Metho d 1. Select ev ents passing all selection criteria but with same-sign c harge: q µ × q τ h > 0. 2. Subtract MC con tributions in the SS region (D Y, t ¯ t , W+jets). 3. Apply the OS/SS transfer factor f OS/SS = 1 . 06 to estimate the QCD con tribution in the OS signal region. SS Region Yields T able 75: Same-sign con trol region yields and QCD estimation. Pro cess SS yield Data 7,439 D Y MC 1,436 t ¯ t MC 161 W+jets MC 1,442 QCD (data − MC) 4,400 QCD OS estimate ( × 1 . 06) 4,664 The MC-subtracted SS data is positive in all bins, confirming that the SS region is QCD-dominated after MC subtraction. The QCD purity in the SS region is 59.1%. OS/SS T ransfer F actor The transfer factor f OS/SS = 1 . 06 is tak en from the CMS H → τ τ reference analysis (PLB 779, 2018). A ± 50% systematic uncertain ty is assigned, following the reference. This con- serv ativ e uncertaint y cov ers the range f OS/SS ∈ [0 . 53 , 1 . 59], whic h translates to a QCD OS yield range of [2,332, 6,996] even ts and a total background v ariation of ± 5 . 1%. The strategy planned to measure f OS/SS in situ using an an ti-isolated tau sideband. This measurement w as not feasible with the sav ed histograms, which con tain only even ts passing the tight tau isolation working p oin t. The reference v alue of 1.06 with the conserv ativ e ± 50% uncertain ty is used. The QCD m vis template from the SS region shows a broad, featureless distribution p eaking at lo w mass, consisten t with exp ectations for QCD multijet ev ents where both the muon and tau candidates originate from jets. 250 Data/MC Comparison The data/MC comparison across all kinematic v ariables sho ws consisten t b e- ha vior: • Normalization: T otal MC ov ersho ots data b y ∼ 15%, with the data/MC ratio flat at ∼ 0 . 87 across all distributions. • Shape: No significan t shap e discrepancy is observ ed. The data/MC ratio is flat (no slop e or structure) in all v ariables, indicating that the templates hav e the correct shap es and only their normalizations need adjustment. This level of pre-fit disagreement is exp ected and acceptable for a template fit analysis. The dominant sources of the normalization offset are: • W+jets LO cross-section ov erestimate ( ∼ 30% correction on 12.8% of background, accounting for ∼ 3 . 8% of the offset). • D Y normalization ( ∼ 5% correction on 74% of bac kground, accounting for ∼ 3 . 7% of the offset). • QCD estimation ( f OS/SS uncertain ty , up to ∼ 5% of total). Both DY and W+jets normalizations are free-floating or constrained parameters in the template fit, whic h will absorb the normalization offset. 251 0 2000 4000 6000 8000 Events / 6.7 GeV p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data Z / * t t W+jets QCD multijet MC stat. unc. H ( 5 0 × ) Data 0 25 50 75 100 125 150 175 200 m v i s [ G e V ] 0.5 1.0 1.5 Data / Pred. MC stat. unc. Figure 126: Visible mass distribution of the m uon–tau system in the signal region (OS, m T < 50 GeV). The DY bac kground p eaks near 60–80 GeV from Z → τ τ . The Higgs signal (scaled by 50 × ) sho ws a broad excess in the 80–150 GeV region. The hatc hed band sho ws the MC statistical uncertain ty . The data/MC ratio is flat at ∼ 0 . 87. 252 0 2000 4000 6000 8000 10000 Events / 4 GeV p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data Z / * t t W+jets QCD multijet H ( 5 0 × ) Data 20 40 60 80 100 120 M u o n p T [ G e V ] 0.5 1.0 1.5 Data / Pred. Figure 127: Muon transverse momentum distribution after full selection. The spectrum is DY-dominated with a flat data/MC ratio at ∼ 0 . 87–0 . 90. 253 0 2000 4000 6000 8000 10000 Events / 4 GeV p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data Z / * t t W+jets QCD multijet H ( 5 0 × ) Data 20 40 60 80 100 120 h p T [ G e V ] 0.5 1.0 1.5 Data / Pred. Figure 128: Hadronic tau candidate transverse momentum distribution after full selection. The sp ectrum falls steeply from the 20 GeV threshold with consistent data/MC normalization offset. 254 0 1000 2000 3000 4000 5000 6000 7000 Events / 4 GeV p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data Z / * t t W+jets QCD multijet H ( 5 0 × ) Data 0 20 40 60 80 100 E m i s s T [ G e V ] 0.5 1.0 1.5 Data / Pred. Figure 129: Missing transverse energy distribution after full selection (including m T < 50 GeV cut). The distribution p eaks at low MET as exp ected for signal and DY ev ents. 255 0 1000 2000 3000 4000 Events / 4 GeV p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data Z / * t t W+jets QCD multijet H ( 5 0 × ) Data 0 20 40 60 80 100 m T ( , E m i s s T ) [ G e V ] 0.5 1.0 1.5 Data / Pred. Figure 130: T ransverse mass distribution in the signal region ( m T < 50 GeV). The distribution is smo othly falling, with signal and D Y clustering at lo w m T . Residual W+jets con tributions extend tow ard the 50 GeV b oundary . 256 0 2000 4000 6000 8000 10000 12000 14000 Events / 0.18 p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data Z / * t t W+jets QCD multijet H ( 5 0 × ) Data 1 2 3 4 5 R ( , h ) 0.5 1.0 1.5 Data / Pred. Figure 131: ∆ R separation betw een the muon and the hadronic tau candidate. The distribution peaks near ∆ R ≈ 3 (approximately bac k-to-back in the transverse plane), with the DY background showing a sharp p eak near π . 257 0 500 1000 1500 2000 2500 Events / 0.2 p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data Z / * t t W+jets QCD multijet H ( 5 0 × ) Data -2 -1 0 1 2 M u o n 0.5 1.0 1.5 Data / Pred. Figure 132: Muon pseudorapidit y distribution after full selection, appro ximately flat within | η | < 2 . 1 (the acceptance cut). 258 0 500 1000 1500 2000 2500 Events / 0.2 p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data Z / * t t W+jets QCD multijet H ( 5 0 × ) Data -2 -1 0 1 2 h 0.5 1.0 1.5 Data / Pred. Figure 133: T au pseudorapidity distribution after full selection, approximately flat within | η | < 2 . 3 (the acceptance cut). 259 E.0.7 Systematic Uncertain ties Systematic uncertain ties are ev aluated for detector and reconstruction effects, background mo deling, theory inputs, luminosity , and limited MC statistics. A total of 13 systematic sources are implemented in the statistical mo del, plus 24 bin-by-bin MC statistical uncertain ty parameters (Barlow–Beeston approac h). T au Energy Scale The tau energy scale (TES) uncertaint y is the dominant shap e systematic. A ± 3% shift in the τ h energy scale propagates to the visible mass distribution through the relation m vis ∝ p E µ E τ h . The v ariation is implemen ted as a mass scale s hift: the nominal m vis histogram is ev aluated at shifted bin cen ters ( m vis × (1 ± 0 . 03)) using linear in terp olation. The total yield is preserved (pure shap e systematic). The ± 3% v ariation is a flat v alue across all decay modes. The CMS reference analysis uses deca y-mo de- dep enden t v alues ( ± 1 . 5% for 1-prong, ± 3% for 1-prong+ π 0 , ± 3% for 3-prong). While the Tau decayMode branc h is av ailable in the outreach files, the flat ± 3% v ariation w as used as a conserv ativ e simplification. The TES mo difier is applied to the signal and DY templates. The t ¯ t , W+jets, and QCD templates are not affected b ecause these backgrounds are dominated by jets misiden tified as τ h , whic h do not respond to the genuine τ energy scale. P ost-fit b eha vior: The TES is extremely w ell-constrained by the data, with a p ost-fit uncertaint y of 0 . 051 (5% of the pre-fit constrain t). The pull of +0 . 54 indicates the data prefers a ∼ 1 . 6% upw ard shift in the tau energy scale. This tigh t constraint is driven by the ∼ 33 , 000 DY ev ents in the Z → τ τ p eak region, whic h provide an in-situ calibration of the TES. The CMS reference analysis reports similar sub-percent TES constraints. T rigger Efficiency The com bined trigger efficiency uncertaint y cov ers both the muon leg and the τ leg of the cross-trigger HLT IsoMu17 eta2p1 LooseIsoPFTau20 . Since data-driven scale factors are not av ailable for this outreach analysis, a conserv ative ± 5% combined flat uncertain ty is used, cov ering b oth legs. This is implemen ted as a normalization systematic ( normsys ) affecting all MC samples. The CMS reference analysis quotes ∼ 3–5% trigger efficiency uncertaint y . P ost-fit: The trigger efficiency pull is +0 . 26 σ with a constraint of 0.944 (essentially unconstrained). It is the highest-impact systematic on µ (∆ µ = ± 18 . 4) due to its effect on the total bac kground normalization: a 5% c hange in the ∼ 45 , 000 background even ts shifts the yield b y ∼ 2 , 250 even ts compared to ∼ 95 signal ev ents. T au Identification Efficiency A ± 5% p er-ev en t weigh t v ariation is applied to even ts with a selected τ h candidate, cov ering the uncertain t y on the hadronic τ iden tification efficiency . This is implemented as a normalization systematic affecting the signal and DY templates (whic h con tain gen uine τ h candidates). The CMS reference analysis quotes ∼ 5–6% tau ID efficiency uncertaint y from tag-and-prob e measurements. P ost-fit: The tau ID pull is negligible ( − 0 . 001 σ ) and the parameter is unconstrained (post-fit σ = 1 . 000). Despite b eing unconstrained, it has the second-largest impact on µ (∆ µ = ± 15 . 4) b ecause it scales the dominan t D Y background. Muon Iden tification and Isolation Efficiency A ± 2% p er-ev ent w eight v ariation cov ers the muon iden tification and isolation efficiency uncertain ty . This is implemen ted as a normalization systematic affecting all MC samples. The CMS reference analysis quotes ∼ 1–2% m uon efficiency uncertain ty from tag-and-probe measuremen ts. P ost-fit: The muon efficiency pull is +0 . 10 σ with a constraint of 0.990 (unconstrained). The impact on µ is ∆ µ = ± 7 . 8. Note on Efficiency Systematics for F ake-T au Backgrounds The trigger efficiency and tau identifi- cation efficiency systematics are applied uniformly to all MC samples, including W+jets and t ¯ t where the selected τ h candidate is predominan tly a misidentified jet rather than a genuine hadronic τ decay . In a more rigorous treatment, these bac kgrounds would use jet → τ h fak e rate uncertain ties rather than genuine- τ effi- ciency uncertain ties. The fake rate uncertaint y is typically larger than the genuine- τ efficiency uncertaint y (e.g., ∼ 20–30% vs. ∼ 5%), so the uniform application of the gen uine- τ uncertaint y to fak e- τ backgrounds is a simplification that underestimates the uncertaint y on these sub dominan t comp onen ts. How ever, since 260 50 75 100 125 150 175 200 m v i s [ G e V ] 0 2 4 6 8 10 12 14 Events / 6.7 GeV p s = 8 T e V , 1 1 . 6 f b 1 CMS Simulation Open Data Signal (nominal) S i g n a l ( T E S + 3 % ) S i g n a l ( T E S 3 % ) Figure 134: T au energy scale shape v ariation on the signal template. The nominal (blac k), +3% (red dashed), and − 3% (blue dashed) v ariations show the mass scale shift effect on the m vis distribution. 261 W+jets and t ¯ t together constitute only ∼ 16% of the total bac kground, and their normalizations are sepa- rately constrained by dedicated normalization uncertain ties ( ± 20% and ± 10% resp ectively), the impact of this simplification on the final result is small compared to the dominant statistical uncertain ty of σ µ = 4 . 81. Luminosit y The integrated luminosity uncertaint y of ± 2 . 6% follo ws the CMS recommendation for 8 T eV data. This is implemen ted as a normalization systematic affecting all MC samples (correlated across all pro cesses). P ost-fit: The luminosit y pull is +0 . 14 σ with a constraint of 0.984 (unconstrained). The impact on µ is ∆ µ = ± 10 . 0. D Y Normalization The DY normalization is treated as a free-floating normfactor parameter b ounded to [0 . 5 , 1 . 5], rather than a constrained normsys . This treatment is motiv ated b y: 1. The DYJetsT oLL sample includes b oth Z → τ τ and Z → µµ comp onen ts with differen t m vis shap es, and a single normalization factor cannot independently adjust b oth. 2. The CMS reference analysis also treats the DY normalization as free-floating, constrained by the m vis shap e. P ost-fit: The D Y normalization fits to 0 . 872 ± 0 . 065, indicating the data prefers D Y reduced by ∼ 13%. This is consistent with the 15% o verall MC ov ersho ot observ ed in the selection phase. t ¯ t Normalization The t ¯ t normalization uncertain ty of ± 10% is conserv ative relative to the approximate NNLO cross-section uncertaint y of ∼ 6% (T op++ 2.0). This is implemented as a normalization systematic affecting only the t ¯ t template. P ost-fit: The t ¯ t normalization pull is − 0 . 08 σ with a constraint of 0.986 (unconstrained). The impact on µ is ∆ µ = ± 1 . 3, reflecting the sub dominant nature of the t ¯ t background ( ∼ 3% of total). W+jets Normalization The W+jets normalization uncertain t y of ± 20% co v ers the LO MadGraph cross- section uncertain t y and the absence of a dedicated high- m T con trol region normalization. This is imple- men ted as a normalization systematic affecting only the W+jets template. P ost-fit: The W+jets normalization pull is +1 . 09 σ with a constrain t of 0.563 (significan tly constrained). The fit increases W+jets by ∼ 24% ab o ve the LO prediction, consistent with the kno wn LO cross-section uncertain ty and the need to comp ensate for the QCD reduction. The impact on µ is ∆ µ = ± 7 . 8. QCD Normalization The QCD normalization uncertain ty of ± 50% cov ers the uncertaint y on the OS/SS transfer factor f OS/SS = 1 . 06. This is implemen ted as a normalization systematic affecting only the QCD template. P ost-fit: The QCD normalization pull is − 1 . 49 σ with a constrain t of 0.545 (significantly constrained). The fit reduces QCD to ∼ 26% of its pre-fit v alue. This large pull indicates that the SS → OS transfer factor of 1.06 significan tly ov erestimates the QCD contribution in these outreach files. The ± 50% uncertain ty pro vides sufficient freedom for the fit to accommo date this, and the result is not biased. The impact on µ is ∆ µ = ± 7 . 5. Signal Cross-Section Uncertainties ggH Cross-Section The ggH cross-section uncertain ty of ± 10% cov ers QCD scale v ariations on the gluon– gluon fusion pro duction cross-section. This is implemented as a normalization systematic affecting only the signal template. P ost-fit: The ggH cross-section pull is negligible ( − 0 . 0002 σ ) and unconstrained. The impact on µ is ∆ µ = ± 0 . 6, reflecting the low signal yield relativ e to bac kground. 262 VBF Cross-Section The VBF cross-section uncertain ty is implemen ted as ± 0 . 3% on the com bined signal template. This is a factor-of-10 smaller than the in tended ± 3% (from QCD scale and PDF uncertainties on the VBF pro duction cross-section). The discrepancy arises b ecause the VBF contribution ( ∼ 10% of total signal) is already small, and the ± 0 . 3% w as applied to the com bined ggH+VBF signal template rather than the VBF comp onent alone. The impact is negligible: ∆ µ = ± 0 . 02, whic h is 0 . 4% of σ µ = 4 . 81. BR( H → τ τ ) The branching ratio uncertaint y of ± 5 . 7% follo ws the LHC Higgs Cross Section W ork- ing Group recommendation. This is implemented as a normalization systematic affecting only the signal template. P ost-fit: The BR uncertaint y pull is negligible and unconstrained. The impact on µ is ∆ µ = ± 0 . 3. MC Statistical Uncertain ty The limited MC statistics in eac h bin of the m vis distribution are accoun ted for using the Barlow–Beeston approach, implemented via staterror mo difiers in pyhf. A total of 24 in- dep enden t bin-by-bin parameters (one per bin) are included, with the constraint deriv ed from the P oisson statistical uncertaint y on the total MC prediction in each bin. The staterror parameters are particularly imp ortan t in the tails of the m vis distribution where the MC ev ent coun t is small. Omitted Systematic Sources The follo wing systematic sources from the reference analyses are not implemen ted, with justification: T able 76: Systematic sources omitted from the analysis with justi- fication. Source Reason for omission Muon momentum scale ( ± 0 . 2%) Sub-leading ( < 0 . 2% on m vis ), negligible vs. TES Muon momentum resolution Sub-leading; smearing effect negligible vs. TES D Y shape (NLO K-factors, Z p T rew eighting) TES shap e cov ers mass scale; data constrains normalization QCD shap e (SS/OS template difference) SS template used directly; shape uncertain ty sub-leading MET scale/resolution Only en ters via m T cut; effect absorb ed by bkg normalizations PDF acceptance ( ± 3–5%) Sub-leading; folded into cross-section uncertainties Pileup reweigh ting Not applicable: no PU information in outreach files Generator mo deling (Pythia tune, hadronization) Not feasible: single generator p er pro cess Jet energy scale Not applicable: no jet-based categorization b-tag efficiency Not applicable: no b-tag selection Systematic Uncertaint y Summary tbl. 77 pro vides the complete summary of implemented systematic uncertain ties with their pre-fit size s, post-fit pulls, and constraints. T able 77: Summary of systematic uncertain ties implemen ted in the statistical model. The impact on µ is the maxim um of | ∆ µ +1 σ | and | ∆ µ − 1 σ | . ∗ D Y normalization pull is the fitted normfactor v alue, not in σ units. Source Type Pre-fit size Post-fit pull P ost-fit σ Impact on µ T rigger efficiency Norm ± 5% +0 . 26 0.944 18.4 T au ID efficiency Norm ± 5% − 0 . 00 1.000 15.4 263 Source Type Pre-fit size Post-fit pull P ost-fit σ Impact on µ Luminosit y Norm ± 2 . 6% +0 . 14 0.984 10.0 W+jets normal- ization Norm ± 20% +1 . 09 0.563 7.8 Muon ID/iso efficiency Norm ± 2% +0 . 10 0.990 7.8 QCD normal- ization Norm ± 50% − 1 . 49 0.545 7.5 t ¯ t normal- ization Norm ± 10% − 0 . 08 0.986 1.3 ggH cross- section Norm ± 10% − 0 . 00 1.000 0.6 BR( H → τ τ ) Norm ± 5 . 7% +0 . 00 1.000 0.3 T au energy scale Shap e ± 3% +0 . 54 0.051 0.1 VBF cross- section Norm ± 0 . 3% +0 . 00 1.000 0.02 D Y normal- ization F ree- float [0 . 5 , 1 . 5] 0 . 872 ∗ 0.065 — MC statistics P er- bin √ N v aries v aries v aries The large individual impacts (∆ µ ∼ 10–18) reflect the v ery lo w signal-to-background ratio ( S/B ≈ 0 . 002). A 5% change in the ∼ 45 , 000 background ev ents shifts the expected yield by ∼ 2 , 250 even ts, compared to ∼ 95 signal ev ents, producing ∆ µ ≈ 2250 / 95 ≈ 24. These individual impacts partially cancel when com bined in the profiled fit, resulting in a total uncertaint y σ µ = 4 . 81 that is muc h smaller than the individual impacts w ould suggest if added in quadrature. 264 E.0.8 Systematic Completeness The systematic completeness table compares the implemen ted sources against the conv entions and three reference analyses. T able 78: Systematic completeness table comparing this analysis to con ven tions and three reference analyses. (*) VBF cross-section uncertaint y is under-sized by × 10 due to an implemen tation typo; impact is negligible. Source Conv entions CMS PLB 779 A TLAS JHEP 04 CMS PLB 805 This analysis Status TES Required ± 1–3% ± 2–4% ± 0 . 5–1 . 5% ± 3% (shap e) Implemented T au ID Required 5% 5–6% SF ± 5% (norm) Implemented Muon ID/iso Required ∼ 2% ∼ 1% SF ± 2% (norm) Implemented T rigger eff. Required 3–5% Y es SF ± 5% (norm) Implemented DY norm. Background F ree-float Em b edding F ree-float F ree-float Implemen ted t ¯ t norm. Background ± 6% ± 6% MC+xsec ± 10% (norm) Implemented W+jets norm. Background ∼ 10% F ak e factor Sideband ± 20% (norm) Implemented QCD norm. Background SS → OS, ± 50% ABCD SS → OS ± 50% (norm) Implemented ggH σ Theory ± 10% ± 10% Scale+PDF ± 10% (norm) Implemented VBF σ Theory ± 3% ± 3% Scale+PDF ± 0 . 3%* Implemented BR( H → τ τ ) Theory ± 5 . 7% ± 5 . 7% ± 5 . 7% ± 5 . 7% (norm) Implemented Luminosity Required 2.5% 2.8% 2.5% ± 2 . 6% (norm) Implemented MC stat. Required BB Y es BB staterror (24 bins) Implemented Muon mom. scale Required ± 0 . 2% Y es Y es — Omitted: sub- leading DY shap e Bac kground NLO K-factors Em b edding Reweighting — Omitted: TES cov ers QCD shape Background SS template SS template SS template — Omitted: sub- leading MET Required Y es Y es Y es — Omitted: m T only Pileup Required Y es Y es Y es — Not applicable Generator Required Pythia tune Herwig/Pythia PS/UE — Not feasible Summary: 13 of 14 applicable systematic sources from the reference analyses are implemented. The omitted sources are sub-leading and their effects are either absorb ed by other systematics or negligible compared to the dominan t s tatistical uncertain ty . 265 E.0.9 Statistical Method Mo del Structure The signal strength µ = σ obs /σ SM is extracted using a binned maxim um-likelihoo d template fit to the visible di-tau mass m vis distribution. The fit is p erformed using pyhf (v0.7.6) with the n umpy bac kend, implemen ting the HistF actory likelihoo d mo del. Lik eliho o d The likelihoo d function is: L ( µ,  θ ) = N bins Y i =1 P ois( n i | ν i ( µ,  θ )) × Y j f j ( θ j ) , (43) where n i is the observ ed even t count in bin i , ν i ( µ,  θ ) is the exp ected even t count in bin i as a function of the signal strength µ and the nuisance parameters  θ , and f j ( θ j ) are the constraint terms for each nuisance parameter. The exp ected even t count in eac h bin is: ν i ( µ,  θ ) = µ · s i (  θ ) + X k b k,i (  θ ) , (44) where s i is the signal template (ggH + VBF combined) and b k,i are the background templates (DY, t ¯ t , W+jets, QCD). Channel T able 79: Channel definition for the template fit. Channel Observ able Range Bins Bin width mvis m vis ( µ, τ h ) [40 , 200] GeV 24 6.67 GeV The fit range starts at 40 GeV b ecause the signal template has negligible con tribution below this v alue, and the low-mass region is kinematically suppressed given the p T > 20 GeV thresholds on b oth ob jects. Samples Five samples en ter the fit: 1. Signal (ggH + VBF, combined): Scaled by µ . 2. D Y ( Z/γ ∗ → ℓℓ ): Scaled by the free-floating DY normalization factor. 3. t ¯ t : Fixed template with ± 10% normalization uncertain ty . 4. W+jets (W+1j/2j/3j com bined): Fixed template with ± 20% normalization uncertaint y . 5. QCD (data-driv en from SS region): Fixed template with ± 50% normalization uncertain ty . P arameter Summary The mo del contains 37 parameters: • 1 parameter of in terest (POI): µ (signal strength), b ounded [ − 10 , 20]. • 12 constrained n uisance parameters: 10 normalization systematics ( normsys ), 1 shape systematic ( histosys for TES), and 1 free-floating normalization factor (DY normfactor ). • 24 bin-b y-bin MC statistical parameters: Barlo w–Beeston staterror gammas, one per bin. 266 Fit Pro cedure The maxim um-likelihoo d fit is performed by minimizing − 2 ln L with resp ect to µ and all n uisance parameters simultaneously . The pyhf optimizer (scipy .optimize with the n umpy back end) is used. The b est-fit v alues ˆ µ and ˆ  θ are obtained at the global minim um, and uncertain ties are computed from the n umerical Hessian matrix. The fit conv erges to a unique minimum in all tested configurations (data fit, Asimov fit, signal injection at 6 different µ v alues). The numerical Hessian is in vertible and pro duces finite p ositiv e uncertain ties for all parameters. 267 E.0.10 Results Best-Fit Signal Strength The binned maximum-lik eliho od fit to data yields: ˆ µ = − 5 . 67 ± 4 . 81 (45) The negative best-fit µ is consistent with the Standard Mo del expectation ( µ = 1) within 1 . 4 σ : | ˆ µ − 1 | σ µ = 6 . 67 4 . 81 = 1 . 39 The result is also consistent with the background-only h yp othesis ( µ = 0) within 1 . 2 σ : | ˆ µ | σ µ = 5 . 67 4 . 81 = 1 . 18 The 95% confidence in terv al is approximately ˆ µ ∈ [ − 15 . 3 , +3 . 9], whic h includes µ = 1. Exp ected Sensitivit y The fit to the Asimo v dataset (generated at µ = 1 with all n uisance parameters at their nominal v alues) yields: ˆ µ Asimov = 1 . 000 ± 5 . 607 The expected uncertaint y of σ µ ≈ 5 . 6 confirms that this inclusive µτ h analysis with 11.6 fb − 1 at 8 T eV has insufficien t sensitivity to observ e or exclude the H → τ τ signal in a single category . The full CMS analysis achiev es σ µ ≈ 0 . 26 by com bining: 1. Category optimization (factor ∼ 2–3): VBF and b o osted categories hav e muc h higher S/B . 2. All deca y channels (factor ∼ 3): eτ h , eµ , τ h τ h add indep enden t information. 3. SVfit mass (factor ∼ 2): Better signal/background separation than visible mass. 4. F ull luminosit y (factor ∼ √ 2): 20 fb − 1 vs. 11.6 fb − 1 . P ost-Fit Yields The p ost-fit background comp osition is summarized in tbl. 80 . T able 80: Pre-fit and p ost-fit yields in the fit range [40 , 200] GeV. The p ost-fit total agrees with data within 0.6%. Sample Pre-fit yield Post-fit yield Scale factor Signal 94.1 − 543 . 7 µ = − 5 . 67 D Y 33,361.7 29,643.0 0.889 t ¯ t 1,193.7 1,206.8 1.011 W+jets 5,461.6 6,779.0 1.241 QCD 4,301.7 1,107.7 0.258 T otal 44,412.8 38,192.9 Data 38,428 The p ost-fit total (38,193) agrees well with data (38,428), within 0.6%. The pre-fit excess of 15.6% is absorb ed b y the follo wing adjustments: • D Y normalization reduced to 87.2% ( − 12 . 8%). • QCD reduced to 25.8% of pre-fit ( − 74 . 2%, corresp onding to a pull of − 1 . 49 σ on the QCD normaliza- tion). 268 • W+jets increased to 124.1% (+24 . 1%, corresp onding to a pull of +1 . 09 σ on the W+jets normaliza- tion). • t ¯ t essentially unc hanged (+1 . 1%). The negativ e ˆ µ produces a negative signal con tribution ( − 543 . 7 ev ents), whic h is unphysical but mathe- matically allo wed by the fit. This reflects the limited sensitivity: the signal template is to o small relativ e to the background for the fit to meaningfully constrain µ . 0 1000 2000 3000 4000 5000 6000 7000 8000 Events / 6.7 GeV p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data QCD W+jets t t DY MC stat. unc. S i g n a l ( × 5 0 ) Data 40 60 80 100 120 140 160 180 200 m v i s [ G e V ] 0.5 1.0 1.5 Data / MC Figure 135: Pre-fit data/MC comparison of the visible mass distribution in the fit range [40 , 200] GeV. The stack ed backgrounds are shown with the MC statistical uncertaint y band. Data points ha ve P oisson error bars. The signal ( × 50) is ov erlaid. The data/MC ratio is systematically b elo w unity at ∼ 0 . 87. The D Y → µµ spike at ∼ 87–93 GeV is visible. Pre-fit χ 2 / ndf = 1231 . 5 / 23 = 53 . 6. Pre-Fit and Post-Fit Data/MC Comparison 269 0 1000 2000 3000 4000 5000 6000 7000 Events / 6.7 GeV p s = 8 T e V , 1 1 . 6 f b 1 CMS Open Data QCD W+jets t t DY Post-fit unc. S i g n a l = 1 ( × 5 0 ) Data 40 60 80 100 120 140 160 180 200 m v i s [ G e V ] 0.5 1.0 1.5 Data / Bkg Figure 136: Post-fit data/MC comparison after the maximum-lik eliho od fit. The fit adjusts the DY normal- ization (to 87%), increases W+jets (to 124%), and reduces QCD (to 26%). The data/MC ratio is centered near unit y across most of the distribution, with residual deviations in the Z -mass region ( ∼ 87–93 GeV). Signal is shown at SM exp ectation ( µ = 1, × 50). P ost-fit χ 2 / ndf = 112 . 9 / 23 = 4 . 91. 270 Nuisance P arameter Pulls and Constraints fig. 137 sho ws the nuisance parameter pull plot. All constrained nuisance parameters are within 2 σ of their pre-fit v alues, with the exception of the QCD nor- malization ( − 1 . 49 σ ) and the W+jets normalization (+1 . 09 σ ). -3 -2 -1 0 1 2 3 ( 0 ) / tes dy_norm lumi muon_eff tau_id trig_eff qcd_norm wjets_norm br_htautau xsec_VBF xsec_ggH tt_norm p s = 8 T e V , 1 1 . 6 f b 1 CMS Simulation Open Data ± 1 ± 2 Figure 137: Nuisance parameter pull plot showing the p ost-fit v alue and uncertaint y for each constrained n uisance parameter. The y ellow and green bands indicate the ± 1 σ and ± 2 σ pre-fit ranges. The QCD normalization is pulled to − 1 . 49 σ and the W+jets normalization to +1 . 09 σ . The tau energy scale is extremely w ell-constrained (p ost-fit σ = 0 . 051, 5% of pre-fit), driven b y the large DY even t sample in the Z → τ τ p eak region. The DY normalization is also well-constrained (p ost-fit σ = 0 . 065) as a free-floating parameter. Impact Ranking fig. 138 shows the impact ranking of systematic uncertainties on the signal strength µ , sorted by decreasing impact. The dominant systematics are: 1. T rigger efficiency (∆ µ = 18 . 4): Scales all backgrounds. 271 -3 -2 -1 0 1 2 3 ( 0 ) / Trigger efficiency Tau ID efficiency Luminosity W+jets normalization Muon ID/iso efficiency QCD normalization ttbar normalization ggH cross-section BR(H->tautau) Tau energy scale VBF cross-section p s = 8 T e V , 1 1 . 6 f b 1 CMS Simulation Open Data ± 1 ± 2 -20 -10 0 10 20 + 1 1 Figure 138: Combined nuisance parameter pulls (left) and impact on µ (right), sorted b y decreasing impact. T rigger efficiency , tau ID, and luminosity dominate the impact on µ b ecause they scale the en tire bac kground yield, and ∆ µ ∝ ∆ B /S is amplified by the very lo w S/B ≈ 0 . 002. 2. T au ID efficiency (∆ µ = 15 . 4): Scales D Y and signal. 3. Luminosit y (∆ µ = 10 . 0): Scales all MC. 4. W+jets normalization (∆ µ = 7 . 8): Second-largest bac kground. 5. Muon ID/iso (∆ µ = 7 . 8): Scales all MC. 6. QCD normalization (∆ µ = 7 . 5): Data-driven uncertaint y . Signal-sp ecific systematics (cross-section, branching ratio) hav e negligible impact given the large statis- tical uncertaint y . 272 E.0.11 Fit V alidation Signal Injection T est Signal is injected at v arious µ v alues using Asimo v datasets and the fit reco vers the injected v alue in all cases. T able 81: Signal injection test results. All residuals < 0 . 02, con- firming excellent fit linearit y and mo del correctness. Injected µ Recov ered ˆ µ σ µ Residual 0.0 − 0 . 002 5.684 − 0 . 002 0.5 +0 . 498 5.619 − 0 . 002 1.0 +1 . 000 5.607 +0 . 000 2.0 +1 . 981 5.601 − 0 . 019 3.0 +2 . 982 5.688 − 0 . 018 5.0 +4 . 995 5.652 − 0 . 005 The signal injection test verifies that: • The model is correctly constructed (no tec hnical bugs in the w orkspace). • The fit is unbiased across the range µ ∈ [0 , 5]. • The linearit y of the fit resp onse is excellent (residuals < 0 . 02). • The expected uncertain ty is stable at σ µ ≈ 5 . 6–5 . 7 across all injection p oin ts. Asimo v Fit The fit to the Asimov dataset (exp ected signal + bac kground at µ = 1) yields: ˆ µ Asimov = 1 . 000 ± 5 . 607 This confirms that the mo del is correctly constructed and the fit is unbiased. The exp ected uncertaint y of σ µ = 5 . 607 sets the scale for the precision achiev able with this analysis configuration. Bac kground Closure T est The background mo del is tested in the signal-depleted sideband m vis < 80 GeV (6 bins, b elo w the Higgs signal p eak). Using the p ost-fit background prediction (with µ = 0): T able 82: Bac kground closure test results in the signal-depleted sideband m vis < 80 GeV. Metric V alue Data yield 28,597 P ost-fit bac kground yield 28,367 Data / MC 1.008 χ 2 / ndf 14 . 9 / 6 = 2 . 48 The bac kground closure is acceptable: the o verall normalization agrees to 0.8%, and the χ 2 / ndf = 2 . 48 is consistent with the lev el of DY shap e mismo deling observ ed in the full fit range. This sideband region do es not contain the Z → µµ con tamination spike, so the agreeme n t is b etter than in the full range. Coun ting Exp erimen t Cross-Check A simple counting experiment in the 100–150 GeV signal window (8 bins) provides an indep enden t cross-chec k of the template fit result: 273 0 1 2 3 4 5 I n j e c t e d -6 -4 -2 0 2 4 6 8 10 R e c o v e r e d p s = 8 T e V , 1 1 . 6 f b 1 CMS Simulation Open Data Ideal (1:1) Recovered Figure 139: Signal injection linearit y test. The reco vered ˆ µ (black p oin ts with error bars) is plotted against the injected µ . The red dashed line shows the ideal 1:1 recov ery . All p oin ts lie on the diagonal within the n umerical precision of the fit. 274 T able 83: Coun ting exp eriment cross-c heck in the 100–150 GeV signal window. Quan tity V alue Data yield 2,472 Signal yield ( µ = 1) 7.8 Bac kground yield (p ost-fit) 2,428 S/ √ B 0.15 ˆ µ counting 5 . 7 ± 6 . 4 The counting exp erimen t yields ˆ µ counting = 5 . 7 ± 6 . 4, consistent with b oth the template fit result ( ˆ µ = − 5 . 67 ± 4 . 81) and the SM exp ectation ( µ = 1) within the large uncertainties. The discrepancy b et w een the coun ting and template fit estimates reflects the differen t information used: the template fit exploits the full m vis shap e (24 bins), while the coun ting exp erimen t uses only the in tegrated yield in 8 bins. The very low S/B ratio (7 . 8 / 2 , 428 ≈ 0 . 003) means that small background mismodeling produces large swings in ˆ µ . The apparen t difference of ∆ ˆ µ ≈ 11 . 4 b et ween the tw o metho ds (+5 . 7 vs. − 5 . 67) warran ts discussion. The t wo estimates are not indep enden t: they use the same data but extract µ using fundamentally different information (total yield vs. binned shape). The combined uncertaint y on the difference, treating the estimates as partially correlated, is σ ∆ ≈ q σ 2 count + σ 2 template ≈ √ 6 . 4 2 + 4 . 81 2 ≈ 8 . 0 (conserv ativ ely assuming zero correlation; the true correlation w ould reduce this). The tension is therefore 11 . 4 / 8 . 0 ≈ 1 . 4 σ , which is consisten t with a statistical fluctuation. F urthermore, the coun ting exp erimen t uses the p ost-fit background prediction (whic h already includes the negativ e ˆ µ con tribution), the different binning (8 bins in 100–150 GeV vs. 24 bins in 40–200 GeV), and the absence of shape information all con tribute to the differen t cen tral v alues. The agreement of b oth estimates with µ = 1 within their resp ective uncertainties confirms the consistency of the analysis. Fit Con v ergence The fit conv erges to a unique minimum in all tested configurations: • Data fit ( ˆ µ = − 5 . 67 ± 4 . 81). • Asimo v fit ( ˆ µ = 1 . 000 ± 5 . 607). • Signal injection at 6 different µ v alues (all recov ered within 0.02). The numerical Hessian is inv ertible and pro duces finite positive uncertainties for all parameters. 275 E.0.12 Go odness-of-Fit Ov erall Fit Qualit y T able 84: Pre-fit and p ost-fit go odness-of-fit metrics. Metric Pre-fit P ost-fit χ 2 1231.5 112.9 ndf 23 23 χ 2 / ndf 53.6 4.91 The p ost-fit χ 2 / ndf = 4 . 91 is impro ved by a factor of ∼ 11 compared to the pre-fit v alue but remains ab o v e the ideal χ 2 / ndf ≈ 1. The strategy success criterion of χ 2 / ndf < 2 is not me t. Ro ot Cause: Z → µµ Contamination The residual excess is dominated b y the Z → µµ spik e at 87– 93 GeV in the DY template. The D YJetsT oLL MC sample conflates t wo comp onen ts with fundamentally differen t m vis shap es: • Z → τ τ (dominant): Smooth m vis sp ectrum peaking at 60–80 GeV. • Z → µµ (sub dominan t): Sharp spik e at ∼ 91 GeV where a m uon fak es τ h . The fit has only a single DY normalization factor and the TES shap e systematic, whic h cannot indep en- den tly adjust these tw o comp onen ts. Z -p eak bin contribution: The single bin centered at ∼ 90 GeV contributes χ 2 = 44 . 1 to the total χ 2 = 112 . 9, i.e., 39% of the total χ 2 from a single bin. Excluding this bin: χ 2 excl. Z-peak / ndf = 68 . 8 / 23 = 2 . 99 This is still elev ated, indicating that the DY shape mismo deling extends b ey ond the single Z -p eak bin. The broader Z -p eak region (80–100 GeV) contains b oth Z → τ τ and Z → µµ con tributions, and a single normalization factor cannot sim ultaneously fit b oth. Resolution P ath The Z → µµ contamination could b e resolved b y: 1. Splitting the D Y template using GenPart pdgId branc hes (confirmed av ailable in the MC files) in to independent Z → τ τ and Z → µµ templates with separate normalization factors. This would require repro cessing the full 8.7 GB DY sample. 2. Em bedding tec hnique: Replacing muons in Z → µµ data even ts with simulated τ decays, as used in the full CMS analysis. This is not feasible with the av ailable data. Impact on Signal Extraction Despite the elev ated χ 2 / ndf, the signal extraction is not biased by the D Y shape mismo deling: • The signal con tribution is negligible in the Z -p eak region ( S/B < 10 − 4 at 90 GeV). • The signal injection linearity test confirms un biased reco very of µ across the range [0 , 5] with residuals < 0 . 02. • The Z -p eak region acts as a constraint on the D Y normalization and TES, not as a signal-sensitive region. F or this outreac h-level analysis, the fit qualit y is acceptable given the kno wn limitations. 276 E.0.13 Discussion Wh y ˆ µ is Negativ e The b est-fit signal strength of ˆ µ = − 5 . 67 is a down w ard fluctuation of 1 . 4 σ from the SM exp ectation. Giv en the large uncertaint y ( σ µ = 4 . 81), this is not statistically significant. The negative ˆ µ arises from the in terplay b et ween the background template shapes and the data in the signal-sensitiv e region (100–150 GeV). The v ery lo w S/B ≈ 0 . 002 means that even small mismatc hes b et ween the bac kground mo del and data can pro duce large excursions in ˆ µ : a 1% excess in the background mo del in the signal region translates to ∆ µ ≈ − 0 . 01 × B /S ≈ − 4 . 7. The µ – dy norm an ti-correlation amplifies this effect. Both parameters scale templates con tributing to the same m vis bins: increasing µ has a similar effect on the total prediction as increasing dy norm . Given S/B ≈ 0 . 002, a 1% change in D Y normalization pro duces an effect 3.5 times larger than setting µ = 1. This strong an ti-correlation is a standard feature of template fits with a free-floating ma jor-bac kground normaliza- tion and is not a pathology . The exp ected correlation co efficien t ρ ( µ, dy norm) is estimated to be ≈ − 0 . 8 to − 0 . 9 based on the signal-to-background composition: the DY template accounts for ∼ 74% of the bac kground and its shape o verlaps substan tially with the signal template in the m vis distribution. The full parameter cor- relation matrix is accessible via pyhf.infer.mle.fit(data, model, return result obj=True) and w ould b e pro vided as a machine-readable artifact in a full analysis. Among the other parameter pairs, the largest correlations are exp ected b et ween the normalization systematics that scale ov erlapping templates: dy norm and qcd norm (both adjust the total background level), and wjets norm and qcd norm (compensating back- grounds). The tau energy scale (TES) is exp ected to be weakly correlated with µ b ecause TES primarily shifts the m vis shap e rather than the ov erall normalization. Sensitivit y Assessmen t The expected µ uncertaint y of σ µ ≈ 5 . 6 means this analysis has no sensitivity to the SM Higgs signal in the µτ h c hannel alone with the inclusiv e selection. The signal strength can only b e constrained to O (1) precision, insufficient for either observ ation or exclusion. This is exp ected: the CMS published result ( µ = 0 . 78 ± 0 . 27) combined all deca y c hannels, multiple ev ent categories, the SVfit mass algorithm, and the full 7+8 T eV dataset. The factor of ∼ 20 in sensitivit y betw een this analysis and the published result arises from the combination of category optimization, m ulti-channel com bination, SVfit mass, and full luminosity . Comparison with CMS Published Result The CMS Run 1 measuremen t using the full 7+8 T eV dataset (JHEP 05, 2014, 104) obtained µ = 0 . 78 ± 0 . 27 with 3.2 σ evidence. Our result ( ˆ µ = − 5 . 67 ± 4 . 81) is consistent with the published v alue: | ˆ µ this − ˆ µ CMS | p σ 2 this + σ 2 CMS = 6 . 45 √ 4 . 81 2 + 0 . 27 2 = 6 . 45 4 . 81 = 1 . 34 σ The agreemen t within 1 . 3 σ is expected giv en the limited precision of this single-c hannel inclusiv e analysis. Impact of Missing Analysis Comp onen ts The following impro vemen ts, presen t in the full CMS anal- ysis but absent here, would significan tly enhance sensitivit y: 1. Ev en t categorization: Separating even ts into 0-jet, b o osted, and VBF categories based on jet mul- tiplicit y and kinematics. The VBF category has S/B ∼ 10 times higher than the inclusiv e selection, pro viding m uch of the sensitivity in the full analysis. 2. SVfit mass: The SVfit algorithm uses the MET and its co v ariance matrix (b oth a v ailable in the outreac h files via MET pt , MET phi , MET CovXX/XY/YY ) to reconstruct the full di- τ mass. This pro vides b etter separation b et ween the Higgs signal p eak ( ∼ 125 GeV) and the Z → τ τ bac kground p eak ( ∼ 91 GeV). 3. Multi-c hannel combination: Adding eτ h , eµ , and τ h τ h c hannels appro ximately triples the signal yield. 4. F ull 2012 dataset: Using all of Run2012 ( ∼ 20 fb − 1 ) instead of Run2012B+C ( ∼ 11 . 6 fb − 1 ) w ould impro ve statistical precision by ∼ √ 2. 277 E.0.14 Conclusions A measurement of the Higgs b oson signal strength in the H → τ τ decay c hannel has b een p erformed using CMS Op en Data from proton–proton collisions at √ s = 8 T eV, corresp onding to an integrated luminosity of L = 11 . 6 fb − 1 . The analysis targets the µτ h final state using a binned template fit to the visible di-tau mass distribution m vis in the range [40 , 200] GeV. The b est-fit signal strength is: ˆ µ = − 5 . 67 ± 4 . 81 This result is consistent with the Standard Model exp ectation ( µ = 1) within 1 . 4 σ and with the bac kground-only hypothesis ( µ = 0) within 1 . 2 σ . The exp ected sensitivit y of σ µ ≈ 5 . 6 reflects the lim- ited signal-to-background ratio ( S/B ≈ 0 . 002) in a single inclusive ev ent category . The analysis demonstrates the complete H → τ τ measuremen t workflo w: • Data exploration of CMS Op en Data NanoA OD files, including sc hema discov ery and data quality assessmen t. • Ev en t selection with a complete cutflo w, including the discov ery and mitigation of anomalous anti- lepton discriminator b eha vior in the outreach files. • Bac kground estimation com bining MC sim ulation (DY, t ¯ t , W+jets) with data-driven methods (QCD from the same-sign control region). • Statistical mo del using pyhf with 13 systematic sources and Barlow–Beeston MC statistical uncer- tain ties. • Fit v alidation through signal injection tests, Asimov fits, bac kground closure tests, and coun ting exp erimen t cross-chec ks. Dominan t Limitations The three most significant limitations of this analysis are: 1. No even t categorization. The inclusive selection results in S/B ≈ 0 . 002. Even t categorization (0-jet, b o osted, VBF) would impro ve sensitivity b y a factor of ∼ 2–3 through enhanced S/B in the VBF and b o osted categories. 2. Z → µµ contamination in the D Y template. The inclusive DYJetsT oLL sample includes Z → µµ ev ents where a m uon fak es τ h , creating a spike at ∼ 91 GeV that degrades the fit quality ( χ 2 / ndf = 4 . 91). Splitting the D Y template using generator information or using the em b edding technique would resolv e this. 3. Limited luminosity . Only Run2012B+C (11 . 6 fb − 1 ) is analyzed, compared to the full 2012 dataset ( ∼ 20 fb − 1 ). Additional limitations include the use of visible mass instead of SVfit mass, the absence of pileup rew eight- ing, missing background pro cesses (single-top, dib oson), and the av ailabilit y of only a single MC generator p er process. Precision Relativ e to Published Results The CMS published result ( µ = 0 . 78 ± 0 . 27) achiev es a factor of ∼ 20 b etter precision through the com bination of ev ent categorization, all τ τ deca y channels, SVfit mass reconstruction, and the full 7+8 T eV dataset. This analysis serv es as a demonstration of the methodology rather than a comp etitiv e measurement. 278 E.0.15 F uture Directions Sev eral concrete improv emen ts could enhance the sensitivity of this analysis: Ev ent Categorization Implement jet-based categorization to separate even ts into: • 0-jet category: Ev ents with no jets ( p T > 30 GeV). Dominated by ggH production with low S/B but high statistics. • Boosted category: Even ts with ≥ 1 jet and Higgs p T > 100 GeV (estimated from the vector sum of the muon, τ h , and MET). Enhanced S/B from the b o osted top ology . • VBF category: Even ts with ≥ 2 jets with large ∆ η j j and high m j j . The VBF topology provides S/B ∼ 10 × higher than the inclusive selection. The jet branc hes ( Jet pt , Jet eta , Jet phi , Jet mass ) are a v ailable in the outreac h files and w ould supp ort this categorization. SVfit Mass Reconstruction The SVfit algorithm reconstructs the full di- τ inv ariant mass using the visible decay products and the MET v ector with its cov ariance matrix. The required inputs are av ailable: • τ h four-momen tum and decay mode ( Tau pt/eta/phi/mass , Tau decayMode ) • Muon four-momen tum ( Muon pt/eta/phi/mass ) • MET and co v ariance ( MET pt , MET phi , MET CovXX/XY/YY ) The SVfit mass pro vides b etter separation b etw een the Higgs s ignal peak ( ∼ 125 GeV) and the Z → τ τ bac kground peak ( ∼ 91 GeV) compared to the visible mass. D Y T emplate Splitting Split the DYJetsT oLL sample into Z → τ τ and Z → µµ comp onen ts using the GenPart pdgId branc h (confirmed av ailable in Phase 2). This w ould: • Resolv e the Z -mass spik e in the DY template. • Allo w indep enden t normalization of the tw o comp onen ts. • Impro v e the p ost-fit χ 2 / ndf from 4.91 to ward the target of < 2. Data-Driv en T au ID Scale F actors Measure tau identification efficiency scale factors from data using tag-and-prob e metho ds in Z → τ τ even ts. This w ould replace the conserv ative ± 5% flat uncertain ty with data-driv en precision of ∼ 1–2%. Pileup Reweigh ting The PV npvs branch is av ailable in the outreach files. Pileup reweigh ting could b e implemen ted by comparing the N PV distribution in data and MC and applying p er-even t weigh ts to correct the MC. This w ould improv e the o verall normalization agreement and reduce the pre-fit data/MC offset. Multi-Channel Combination Adding the eτ h , eµ , and τ h τ h c hannels w ould appro ximately triple the signal yield and provide indep enden t sensitivit y . The electron branches are a v ailable in the outreach files for the eτ h and eµ channels. 279 E.0.16 App endix A: Complete Cutflo w T able The complete cutflow with ra w even t counts is pro vided in tbl. 71 (sec. E.0.4 ). The weigh ted yields in the signal region (OS, m T < 50 GeV) are pro vided in tbl. 74 (sec. E.0.6 ). F or reference, the weigh ted cutflow for the final tw o steps (OS requirement and m T cut) is: T able 85: W eighted ev ent yields for the final selection steps. The OS and SS yields from data are used to derive the QCD contribu- tion via the SS → OS metho d. Pro cess After OS After m T < 50 GeV OS yield ( L = 11 . 6 fb − 1 ) ggH → τ τ 3,410 raw 2,894 raw 85.9 VBF H → τ τ 4,493 raw 3,843 ra w 9.1 D Y → ℓℓ 29,093 raw 25,365 raw 33,774 t ¯ t 13,439 ra w 3,527 raw 1,343 W+1j 5,726 ra w 1,400 raw — W+2j 8,893 ra w 2,096 raw — W+3j 4,801 ra w 1,231 raw — W+jets (com- bined) 19,420 raw 4,727 raw 5,833 QCD (data- driv en) — — 4,664 Data (B+C) 66,219 raw 40,045 raw 39,726 (OS only) 280 E.0.17 App endix B: Systematic Uncertain ty Summary T able The complete systematic uncertaint y summary with all implemented sources, their types, affected samples, pre-fit sizes, p ost-fit pulls, constraints, and impacts on µ is provided in tbl. 77 (sec. E.0.7 ). The impact on µ is computed as pre-fit ± 1 σ nuisance parameter v ariations with other NPs fixed at their p ost-fit v alues. The detailed ∆ µ v alues for eac h direction are: T able 86: Detailed impact of each systematic source on the signal strength µ . Source ∆ µ (+1 σ ) ∆ µ ( − 1 σ ) Max impact T rigger efficiency +18 . 4 − 18 . 3 18.4 T au ID efficiency +15 . 4 − 15 . 4 15.4 Luminosit y +10 . 0 − 10 . 0 10.0 W+jets normaliza- tion +7 . 8 − 6 . 9 7.8 Muon ID/iso efficiency +7 . 8 − 7 . 8 7.8 QCD nor- malization +7 . 5 − 5 . 1 7.5 t ¯ t normal- ization +1 . 3 − 1 . 3 1.3 ggH cross- section − 0 . 6 +0 . 6 0.6 BR( H → τ τ ) − 0 . 3 +0 . 3 0.3 T au energy scale − 0 . 1 +0 . 1 0.1 VBF cross- section − 0 . 02 +0 . 02 0.02 281 E.0.18 App endix C: Per-Bin Fit Results The fit is p erformed in 24 bins of m vis from 40 to 200 GeV with a bin width of 6.67 GeV. The bin edges are: 40 . 0 , 46 . 7 , 53 . 3 , 60 . 0 , 66 . 7 , 73 . 3 , 80 . 0 , 86 . 7 , 93 . 3 , 100 . 0 , 106 . 7 , 113 . 3 , 120 . 0 , 126 . 7 , 133 . 3 , 140 . 0 , 146 . 7 , 153 . 3 , 160 . 0 , 166 . 7 , 173 . 3 , 180 . 0 , 186 . 7 , 193 . 3 , 200 . 0 The pre-fit and post-fit yields b y sample are summarized in tbl. 80 (sec. E.0.10 ). The pre-fit total is 44,412.8 even ts and the pos t-fit total is 38,192.9 even ts, compared to 38,428 observed data even ts. The key fit parameters are: T able 87: Key fit parameters. P arameter V alue ˆ µ − 5 . 674 ± 4 . 812 ˆ µ Asimov 1 . 000 ± 5 . 607 D Y normfactor 0 . 872 ± 0 . 065 Pre-fit χ 2 / ndf 1231 . 5 / 23 = 53 . 6 P ost-fit χ 2 / ndf 112 . 9 / 23 = 4 . 91 Fit status Conv erged P er-Bin Yields tbl. 88 sho ws the p er-bin data counts, pre-fit total MC prediction (signal at µ = 1 plus all bac kgrounds), and approximate p ost-fit total MC prediction obtained by applying the global p ost- fit normalization scale factors to each template. The per-bin χ 2 i v alues are Pearson χ 2 i = ( n i − ν i ) 2 /ν i con tributions computed from the approximate p ost-fit prediction. The p er-bin shap e adjustments from the TES systematic and Barlo w–Beeston staterror gammas are not included in this table, which is why the summed χ 2 differs from the profiled p ost-fit χ 2 = 112 . 9 rep orted in sec. E.0.12 . T able 88: Per-bin yields in the fit range [40 , 200] GeV with 24 bins of width 6.67 GeV. The pre-fit MC includes signal at µ = 1. Post-fit MC uses the best-fit normal- ization scale factors (without per-bin shap e adjustments from TES and staterror). The per-bin χ 2 i = ( n i − ν i ) 2 /ν i is the Pearson c hi-squared contribution from the approximate p ost-fit prediction. Bin Range [GeV] Data Pre-fit MC Post-fit MC χ 2 i 1 [40.0, 46.7] 1,865 2,184.4 1,768.6 5.26 2 [46.7, 53.3] 4,864 5,275.4 4,425.8 43.39 3 [53.3, 60.0] 6,742 7,421.3 6,344.3 24.93 4 [60.0, 66.7] 6,791 7,610.6 6,566.9 7.65 5 [66.7, 73.3] 5,088 5,591.9 4,837.5 12.97 6 [73.3, 80.0] 3,247 3,569.1 3,088.4 8.14 7 [80.0, 86.7] 2,108 2,405.3 2,087.8 0.20 8 [86.7, 93.3] 3,479 5,505.6 4,792.1 359.82 9 [93.3, 100.0] 1,199 1,547.0 1,346.4 16.14 10 [100.0, 106.7] 624 724.6 602.4 0.77 11 [106.7, 113.3] 484 436.8 407.1 14.53 12 [113.3, 120.0] 320 378.4 330.5 0.33 13 [120.0, 126.7] 273 286.8 261.6 0.49 14 [126.7, 133.3] 250 254.7 236.0 0.84 15 [133.3, 140.0] 178 207.2 196.6 1.75 16 [140.0, 146.7] 193 188.6 163.9 5.17 17 [146.7, 153.3] 150 168.1 146.0 0.11 18 [153.3, 160.0] 138 129.6 120.6 2.52 19 [160.0, 166.7] 118 136.7 119.8 0.03 20 [166.7, 173.3] 79 135.0 116.6 12.11 21 [173.3, 180.0] 74 77.0 76.1 0.06 22 [180.0, 186.7] 59 71.3 64.2 0.42 23 [186.7, 193.3] 60 51.8 52.2 1.16 24 [193.3, 200.0] 45 55.4 51.7 0.86 T otal [40, 200] 38,428 44,412.8 38,202.8 — The dominan t χ 2 con tribution comes from bin 8 ([86 . 7 , 93 . 3] GeV), which contains the Z → µµ con- tamination spik e discussed in sec. E.0.12 . This single bin con tributes χ 2 8 = 359 . 8 to the approximate total, confirming the ro ot cause of the elev ated post-fit χ 2 / ndf. The bins in the signal-sensitive region (100–150 282 GeV, bins 10–15) sho w goo d data/MC agreement with small χ 2 con tributions, supp orting the conclusion that the signal extraction is not biased by the D Y shape mismo deling in the Z -p eak region. 283 E.0.19 App endix D: Machine-Readable Results Mac hine-readable results are provided in JSON and CSV formats in the phase5 documentation/exec/results/ directory: T able 89: Mac hine-readable results files. File Description signal strength.json Best-fit µ , uncertaint y , Asimo v expectation, fit status cutflow.csv Complete cutflow table with raw ev ent coun ts p er sample systematic impacts.csv Per-source systematic impacts on µ W orkspace The pyhf HistF actory JSON workspace is a v ailable at: phase4_inference/exec/workspace.json This workspace contains the complete statistical mo del and can b e used to repro duce the fit results with the pyhf Python pac k age: import pyhf import json with open("phase4_inference/exec/workspace.json") as f: workspace = json.load(f) model = pyhf.Workspace(workspace).model() data = pyhf.Workspace(workspace).data(model) result = pyhf.infer.mle.fit(data, model, return_uncertainties=True) Histograms The raw histograms (OS and SS, p er sample, p er v ariable) are stored in NumPy format at: phase3_selection/exec/histograms.npz Fit Results The complete fit results (b est-fit parameters, nuisance parameter pulls and constrain ts, signal injection test results, and go odness-of-fit metrics) are stored at: phase4_inference/exec/fit_results.json 284 E.0.20 References 1. CMS Collab oration, “Evidence for the 125 GeV Higgs b oson decaying to a pair of τ leptons,” JHEP 05 (2014) 104. doi:10.1007/JHEP05(2014)104. 2. CMS Collab oration, “Observ ation of the Higgs boson decay to a pair of τ leptons with the CMS detector,” Phys. Lett. B 779 (2018) 283. doi:10.1016/j.physletb.2018.02.004. 3. A TLAS Collab oration, “Evidence for the Higgs b oson Y uk a wa coupling to τ leptons with the A TLAS detector,” JHEP 04 (2015) 117. doi:10.1007/JHEP04(2015)117. 4. CMS Collab oration, “Measurement of Higgs b oson pro duction in the deca y channel with a pair of τ leptons,” Phys. Lett. B 805 (2020) 135425. doi:10.1016/j.physletb.2020.135425. 5. LHC Higgs C ross Section W orking Group, “Handb o ok of LHC Higgs Cross Sections: 3. Higgs Prop- erties,” CERN-2013-004. doi:10.5170/CERN-2013-004. 6. L. Heinrich, M. F eick ert, G. Stark, K. Cranmer, “p yhf: pure-Python implementation of HistF actory statistical mo dels,” JOSS 6 (2021) 2823. doi:10.21105/joss.02823. 7. A TLAS Collaboration, “Observ ation of a new particle in the searc h for the Standard Mo del Higgs boson with the A TLAS detector at the LHC,” Phys. Lett. B 716 (2012) 1. doi:10.1016/j.ph ysletb.2012.08.020. 8. CMS Collaboration, “Observ ation of a new b oson at a mass of 125 GeV with the CMS exp erimen t at the LHC,” Phys. Lett. B 716 (2012) 30. doi:10.1016/j.physletb.2012.08.021. 285

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment