Mayfly: Private Aggregate Insights from Ephemeral Streams of On-Device User Data

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv paper.

This paper introduces Mayfly, a federated analytics approach enabling aggregate queries over ephemeral on-device data streams without central persistence of sensitive user data. Mayfly minimizes data via on-device windowing and contribution bounding through SQL-programmability, anonymizes user data via streaming differential privacy (DP), and mandates immediate in-memory cross-device aggregation on the server – ensuring only privatized aggregates are revealed to data analysts. Deployed for a sustainability use case estimating transportation carbon emissions from private location data, Mayfly computed over 4 million statistics across more than 500 million devices with a per-device, per-week DP $\varepsilon = 2$ while meeting strict data utility requirements. To achieve this, we designed a new DP mechanism for Group-By-Sum workloads leveraging statistical properties of location data, with potential applicability to other domains.


💡 Research Summary

Mayfly is a federated analytics system that enables privacy‑preserving aggregate queries over on‑device data streams without persisting any raw user data on a central server. The authors motivate the need for such a system by pointing out that modern on‑device AI models generate high‑frequency, high‑dimensional data (e.g., location traces, activity logs) that are valuable for analytics but highly sensitive. Traditional centralized analytics either store raw data, which violates privacy, or apply naïve differential privacy (DP) mechanisms that require prohibitive noise or consume large privacy budgets.

The core design of Mayfly rests on three pillars: (1) on‑device data minimization, (2) ephemeral in‑memory aggregation, and (3) streaming DP. Analysts write SQL‑style continuous queries limited to GROUP‑BY‑SUM operations. These queries are compiled into lightweight SQLite sub‑queries that run on the client. The client selects only the columns required for the query, applies activity‑level scaling (e.g., different scaling factors for walking, cycling, driving, flying) and a per‑device contribution bound, then sends the summarized rows to the server. This step dramatically reduces the amount of data transmitted and ensures that each device’s contribution is already bounded before aggregation.
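The on‑device step described above can be sketched with Python's built‑in sqlite3 module. This is only an illustration under stated assumptions, not Mayfly's actual client code: the `trips` table, its columns, the `on_device_summary` helper, and the use of `LIMIT` as a simple per‑device bound on the number of reported groups are all hypothetical.

```python
import sqlite3

# Hypothetical per-device contribution bound: max groups reported per window.
MAX_GROUPS_PER_DEVICE = 2


def on_device_summary(rows, max_groups=MAX_GROUPS_PER_DEVICE):
    """Run a GROUP-BY-SUM query locally and bound the number of reported groups.

    rows: iterable of (region, activity, distance_km, note) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trips (region TEXT, activity TEXT, distance_km REAL, note TEXT)"
    )
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?, ?)", rows)
    # Only the columns the query needs are selected; 'note' never leaves the device.
    cursor = conn.execute(
        """
        SELECT region, activity, SUM(distance_km) AS total_km
        FROM trips
        GROUP BY region, activity
        ORDER BY region, activity
        LIMIT ?
        """,
        (max_groups,),
    )
    summary = cursor.fetchall()
    conn.close()
    return summary


if __name__ == "__main__":
    local_rows = [
        ("A", "walking", 1.0, "morning"),
        ("A", "walking", 2.0, "evening"),
        ("B", "driving", 10.0, "commute"),
        ("C", "cycling", 5.0, "weekend"),
    ]
    print(on_device_summary(local_rows))
```

Only the summarized rows would be uploaded in this sketch; the raw rows and the unused column stay in the local in‑memory database.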

On the server side, Mayfly performs immediate, in‑memory aggregation for each predefined time window (TW). No intermediate results are persisted to disk; after the aggregation finishes, the server discards the raw contributions. The aggregated sums are then processed by a new DP mechanism tailored for Group‑By‑Sum workloads. The mechanism first rescales the aggregated values back to the original units, then adds Gaussian noise calibrated to a per‑device, per‑week privacy budget of ε = 2 (δ ≈ 10⁻⁵). A post‑processing threshold step removes aggregates whose noise dominates the signal, ensuring that released statistics retain high utility.
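A minimal sketch of the aggregate, add noise, then threshold flow is shown below. It is not the paper's mechanism: the function name, the report format, and the `sigma` and `threshold` parameters are assumptions, the rescaling step is omitted, and the noise scale is passed in directly rather than derived from ε and δ, since that calibration is specific to the paper.

```python
import random
from collections import defaultdict


def aggregate_and_privatize(device_reports, sigma, threshold, rng=None):
    """Sum bounded per-device reports in memory, add Gaussian noise, then threshold.

    device_reports: iterable of reports, each a list of (group_key, value) pairs.
    sigma: standard deviation of the Gaussian noise (assumed to be calibrated elsewhere).
    threshold: noisy aggregates below this value are suppressed.
    """
    if rng is None:
        rng = random.Random()

    # In-memory cross-device aggregation; individual reports are not stored.
    totals = defaultdict(float)
    for report in device_reports:
        for key, value in report:
            totals[key] += value

    released = {}
    for key, total in totals.items():
        noisy = total + rng.gauss(0.0, sigma)
        # Post-processing: drop aggregates too small relative to the threshold.
        if noisy >= threshold:
            released[key] = noisy
    return released


if __name__ == "__main__":
    reports = [
        [("A/walking", 1.0), ("B/driving", 0.5)],
        [("A/walking", 2.0)],
    ]
    print(aggregate_and_privatize(reports, sigma=0.1, threshold=1.0))
```

Because the threshold is applied to the already‑noised value, it is pure post‑processing and does not consume additional privacy budget.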

A key technical contribution is the activity‑level scaling combined with a single global clipping bound. Because transportation data exhibit extreme variance (e.g., a short walk versus an inter‑continental flight), naïve clipping would either require a huge bound (leading to massive noise) or would truncate large contributions, biasing results. By learning per‑activity mean and variance offline, Mayfly normalizes each record before clipping, reducing the overall ℓ₁‑sensitivity by an order of magnitude. This enables the system to achieve the target relative error of ≤ 3 % while staying within ε = 2, an 8× improvement over baseline DP mechanisms.
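To make the scale‑then‑clip idea concrete, here is a small sketch. The scaling factors in `SCALE`, the `scale_and_clip` helper, and the choice of proportional ℓ₁ clipping are illustrative assumptions; the summary does not give the paper's actual factors or clipping rule.

```python
# Hypothetical per-activity scaling factors (not taken from the paper).
SCALE = {"walking": 1.0, "cycling": 0.2, "driving": 0.05, "flying": 0.001}


def scale_and_clip(records, clip_bound, scale=SCALE):
    """Scale each record by its activity factor, then clip the total L1 norm.

    records: list of (activity, distance_km) pairs from one device.
    clip_bound: single global bound on the device's scaled L1 contribution.
    """
    scaled = [(activity, km * scale[activity]) for activity, km in records]
    l1_norm = sum(abs(value) for _, value in scaled)
    # Shrink all values proportionally if the device exceeds the bound.
    factor = min(1.0, clip_bound / l1_norm) if l1_norm > 0 else 1.0
    return [(activity, value * factor) for activity, value in scaled]


if __name__ == "__main__":
    device_records = [("walking", 2.0), ("flying", 8000.0)]
    print(scale_and_clip(device_records, clip_bound=5.0))
```

Without scaling, a single bound large enough for the 8,000 km flight would dwarf the 2 km walk; after scaling, both records sit on a comparable range and one bound can serve all activities.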

Mayfly is built on top of Google’s existing federated learning infrastructure, reusing its task distribution, device eligibility checks, and two‑person code‑review controls. Devices check in at most once per day, optionally running a lightweight eligibility pre‑computation to avoid unnecessary work. This design mitigates bias against low‑resource devices and boosts participation from 49 % to 93 % of eligible devices in the production deployment.

The system was evaluated in a real‑world sustainability use case: the Environmental Insights Explorer (EIE), which aggregates Google Maps Timeline data to estimate city‑level transportation carbon emissions. Over a period of several weeks, Mayfly processed data from more than 500 million devices, generating over 4 million distinct statistics (region × activity × direction aggregates of distance and duration). The authors report that the DP‑protected aggregates closely match non‑private baselines, with average relative errors well below the 3 % target, and that the privacy budget consumption is dramatically lower than what would be required by naïve DP approaches (which would need ε > 16).

Beyond transportation, the authors discuss applicability to health monitoring, smart‑home energy usage, and any domain where high‑dimensional, temporally granular data are collected on devices. Future work includes extending the DP mechanism to support other aggregation types (e.g., quantiles), exploring hybrid central‑local DP models, and integrating hardware‑based trusted execution environments for stronger insider‑threat protection.

In summary, Mayfly demonstrates that it is possible to run large‑scale, continuous analytics over hundreds of millions of devices while guaranteeing strong central DP, preserving data utility, and respecting strict privacy principles such as data minimization, ephemerality, and non‑targetability. The paper provides both a system architecture and a novel DP algorithm that together set a new benchmark for privacy‑first federated analytics.

