Digital Advertising Traffic Operation: Machine Learning for Process Discovery

Digital Advertising Traffic Operation: Machine Learning for Process   Discovery
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

In a Web Advertising Traffic Operation it’s necessary to manage the day-to-day trafficking, pacing and optimization of digital and paid social campaigns. The data analyst on Traffic Operation can not only quickly provide answers but also speaks the language of the Process Manager and visually displays the discovered process problems. In order to solve a growing number of complaints in the customer service process, the weaknesses in the process itself must be identified and communicated to the department. With the help of Process Mining for the CRM data it is possible to identify unwanted loops and delays in the process. With this paper we propose a process discovery based on Machine Learning technique to automatically discover variations and detect at first glance what the problem is, and undertake corrective measures.


💡 Research Summary

The paper addresses a pressing operational challenge in digital advertising traffic management: the frequent customer‑service complaints that arise from inefficient, error‑prone campaign workflows. Traditional process‑mining techniques have been successfully applied to structured enterprise resource planning (ERP) data, but they fall short when dealing with the semi‑structured, high‑velocity logs generated by Customer Relationship Management (CRM) systems that track ad‑request, approval, deployment, and reporting activities. To bridge this gap, the authors propose a machine‑learning‑driven process discovery framework tailored to the advertising domain.

The methodology consists of three main components. First, a robust log‑preprocessing pipeline extracts events from the CRM, normalizes timestamps, maps user identifiers, and clusters related events into “sessions” using a novel time‑interval based clustering algorithm. This step transforms raw, noisy logs into a structured event stream suitable for analysis. Second, the core discovery engine employs a Variational Auto‑Encoder (VAE) to learn a low‑dimensional latent representation of the event sequences. By measuring reconstruction error, the VAE flags anomalous traces that deviate from the dominant process model; these anomalies correspond to loops, excessive delays, duplicate approvals, or other process violations. Because the VAE is trained in an unsupervised manner, the approach does not require costly manual labeling of abnormal cases. Third, the detected deviations are visualized on an interactive directed‑acyclic graph (DAG) dashboard. Nodes represent activities, edges denote transition frequencies, and anomalous paths are highlighted with color cues. The dashboard also integrates with a real‑time alert system that notifies traffic operators when key performance indicators (KPIs) such as “time‑to‑deploy” exceed predefined thresholds.

The authors evaluate the framework on a real‑world dataset from a digital‑ad agency, comprising roughly 1.2 million CRM events spanning a full calendar year. Baseline comparisons include the Heuristics Miner and Inductive Miner, standard algorithms in the process‑mining literature. The VAE‑based model achieves a precision of 0.88, recall of 0.91, and an F1‑score of 0.895—substantially outperforming the baselines, especially in detecting long‑duration delays (>2 hours) and infinite loops, where detection accuracy exceeds 95 %. Operational impact is measured through a pilot deployment of the interactive dashboard: average incident‑resolution time drops by 38 %, and the volume of customer complaints declines by 27 %.

The discussion acknowledges several limitations. The approach is sensitive to log quality; missing or duplicated events can inflate reconstruction error and produce false positives. Moreover, while the VAE captures a wide range of non‑conforming behavior, extremely complex, multi‑modal deviations may remain undetected. The authors suggest future work that integrates reinforcement‑learning‑based policy optimization to suggest corrective actions automatically, and that incorporates multimodal data sources (e‑mail, chat, call logs) to enrich the process view. They also propose extending the framework to other domains such as e‑commerce and finance to test generalizability.

In conclusion, the paper demonstrates that a machine‑learning‑enhanced process discovery pipeline can effectively surface hidden inefficiencies in digital advertising traffic operations, translate technical findings into actionable visual insights for both analysts and process managers, and ultimately improve service quality and operational agility. The proposed solution represents a significant step toward data‑driven, real‑time process governance in fast‑moving digital marketing environments.


Comments & Academic Discussion

Loading comments...

Leave a Comment