A Hybrid DNN Transformer AE Framework for Corporate Tax Risk Supervision and Risk Level Assessment

Tax risk supervision has become a critical component of modern financial governance, as irregular tax behaviors and hidden compliance risks pose significant challenges to regulatory authorities and enterprises alike. Traditional rule-based methods often struggle to capture complex and dynamic tax-related anomalies in large-scale enterprise data. To address this issue, this paper proposes a hybrid deep learning framework (DNN-Transformer-Autoencoder) for corporate tax risk supervision and risk level assessment. The framework integrates three complementary modules: a Deep Neural Network (DNN) for modeling static enterprise attributes, a Transformer-based architecture for capturing long-term dependencies in historical financial time series, and an Autoencoder (AE) for unsupervised detection of anomalous tax behaviors. The outputs of these modules are fused to generate a comprehensive risk score, which is further mapped into discrete risk levels (high, medium, low). Experimental evaluations on a real-world enterprise tax dataset demonstrate the effectiveness of the proposed framework, achieving an accuracy of 0.91 and a Macro F1-score of 0.88. These results indicate that the hybrid model not only improves classification performance but also enhances interpretability and applicability in practical tax regulation scenarios. This study provides both methodological innovation and regulatory implications for intelligent tax risk management.

💡 Research Summary

The paper addresses the growing challenge of corporate tax‑risk supervision by proposing a novel hybrid deep‑learning architecture that simultaneously exploits static firm attributes, longitudinal financial time series, and unsupervised detection of anomalous tax behaviours. The framework consists of three complementary modules: (1) a Deep Neural Network (DNN) that ingests structured, non‑temporal variables such as firm size, industry, legal form, and past audit outcomes; (2) a Transformer‑based model that processes monthly revenue, expense, and tax‑filing figures, leveraging multi‑head self‑attention and positional encodings to capture long‑range dependencies and seasonal patterns; and (3) an Autoencoder (AE) that learns a compact latent representation of unlabeled, potentially irregular tax actions (e.g., abnormal deduction ratios, atypical counterparties) and flags high reconstruction error as an anomaly indicator. Each sub‑model outputs a normalized risk score (0–1). These scores are fused through a weighted average, where the weights are automatically tuned via Bayesian optimization on validation performance (accuracy, AUC, etc.). The aggregated score is then discretized into three risk levels—high, medium, low—using pre‑defined thresholds.
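The fusion and discretization step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the weights and the two thresholds below are hypothetical placeholders, since the summary says the weights are tuned by Bayesian optimization and the thresholds are pre-defined but reports neither.

```python
from typing import Dict

# Hypothetical fusion weights. The paper tunes these with Bayesian
# optimization; the tuned values are not reported in the summary.
WEIGHTS: Dict[str, float] = {"dnn": 0.4, "transformer": 0.4, "ae": 0.2}

# Hypothetical cut-offs for the three risk levels (not given in the source).
MEDIUM_THRESHOLD = 0.4
HIGH_THRESHOLD = 0.7


def fuse_scores(scores: Dict[str, float],
                weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of per-module risk scores, each expected in [0, 1]."""
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1], got {value}")
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total_weight


def risk_level(score: float) -> str:
    """Map a fused score in [0, 1] to one of three discrete risk levels."""
    if score >= HIGH_THRESHOLD:
        return "high"
    if score >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


# Made-up module outputs for a single firm.
firm = {"dnn": 0.8, "transformer": 0.9, "ae": 0.5}
fused = fuse_scores(firm)
print(round(fused, 2), risk_level(fused))
```

Dividing by the sum of the weights keeps the fused score in [0, 1] even if a tuning procedure returns weights that do not sum to one.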

The authors evaluated the system on a real‑world dataset comprising 12,000 Chinese enterprises over five fiscal years, collected in partnership with a major accounting firm. Data preprocessing involved K‑nearest‑neighbors imputation for missing values, one‑hot encoding for categorical fields, and log‑scaling plus standardization for continuous variables. Baselines included a traditional rule‑based engine, a standalone DNN, an LSTM, and a Gradient Boosting Machine. The hybrid model achieved an overall accuracy of 0.91, a macro‑averaged F1‑score of 0.88, and an ROC‑AUC of 0.94, outperforming all baselines. The largest gain, more than 7 percentage points, came in distinguishing high‑risk from low‑risk firms.
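To make the relationship between the two headline metrics concrete, the sketch below computes accuracy and macro-averaged F1 on a tiny three-class example. The labels are invented for illustration and are not drawn from the paper's dataset. Macro F1 averages the per-class F1 scores with equal weight, so weaker performance on the rarer classes pulls it below accuracy.

```python
from typing import Sequence

LEVELS = ("high", "medium", "low")


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 for one class, treating that class as positive and the rest as negative."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = LEVELS) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Invented toy labels: six low-risk, two medium-risk, two high-risk firms.
y_true = ["low"] * 6 + ["medium", "medium"] + ["high", "high"]
y_pred = ["low"] * 6 + ["medium", "low"] + ["high", "medium"]

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")
```

In this toy case every low-risk firm is classified correctly, but one medium-risk and one high-risk firm are missed, so macro F1 lands well below accuracy.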

Interpretability is a key contribution. The authors applied SHAP analysis to the DNN to reveal the most influential static features (e.g., firm size and prior audit flags). For the Transformer, attention heatmaps were visualized, showing that spikes in revenue combined with unusually high deduction ratios in specific months received the strongest attention, thereby explaining why the model raised the risk score at those points. The AE’s reconstruction error distribution was used to generate a ranked list of anomalous tax behaviours for downstream audit. These explainable outputs enable regulators to prioritize investigations and allow firms to proactively address risk factors.
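The ranked anomaly list can be illustrated with a minimal sketch. Two assumptions are made here and are not confirmed by the source: the reconstruction error is taken to be mean squared error, and the errors are min-max scaled to obtain a score in [0, 1]. The firm identifiers and vectors are made up, and the reconstructions are hard-coded stand-ins for the output of a trained autoencoder.

```python
from typing import Dict, List, Sequence, Tuple


def reconstruction_error(original: Sequence[float],
                         reconstructed: Sequence[float]) -> float:
    """Mean squared error between an input vector and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def rank_anomalies(originals: Dict[str, Sequence[float]],
                   reconstructions: Dict[str, Sequence[float]]
                   ) -> List[Tuple[str, float]]:
    """Return (firm_id, score) pairs sorted from most to least anomalous.

    Scores are min-max scaled errors (an assumption, see the lead-in).
    """
    errors = {fid: reconstruction_error(originals[fid], reconstructions[fid])
              for fid in originals}
    lowest, highest = min(errors.values()), max(errors.values())
    span = highest - lowest
    scored = [(fid, (err - lowest) / span if span > 0 else 0.0)
              for fid, err in errors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature vectors (e.g. scaled deduction ratios) per firm.
originals = {
    "firm_A": [0.2, 0.3, 0.1],
    "firm_B": [0.9, 0.1, 0.8],
    "firm_C": [0.5, 0.5, 0.5],
}
# Stand-ins for what a trained autoencoder would reconstruct.
reconstructions = {
    "firm_A": [0.2, 0.3, 0.1],
    "firm_B": [0.4, 0.1, 0.3],
    "firm_C": [0.4, 0.5, 0.6],
}

ranked = rank_anomalies(originals, reconstructions)
for firm_id, score in ranked:
    print(firm_id, round(score, 3))
```

Firms the autoencoder reconstructs poorly rise to the top of the list, which is the ordering an audit team would work through first.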

In conclusion, the study demonstrates that integrating DNN, Transformer, and Autoencoder components yields a robust, accurate, and interpretable solution for corporate tax‑risk supervision and risk‑level assessment. The authors suggest future extensions such as incorporating multimodal textual reports via natural‑language models, employing graph neural networks to model inter‑firm risk propagation, and developing online learning mechanisms for real‑time streaming tax data.

📜 Original Paper Content