Information Templates: A New Paradigm for Intelligent Active Feature Acquisition

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Active feature acquisition (AFA) is an instance-adaptive paradigm in which, at inference time, a policy sequentially chooses which features to acquire (at a cost) before predicting. Existing approaches either train reinforcement learning policies, which must contend with a difficult MDP, or greedy policies that either cannot account for the joint informativeness of features or require knowledge of the underlying data distribution. To overcome this, we propose Template-based AFA (TAFA), a non-greedy framework that learns a small library of feature templates – sets of features that are jointly informative – and uses this library to guide subsequent feature acquisitions. By identifying feature templates, the proposed framework not only significantly reduces the action space considered by the policy but also alleviates the need to estimate the underlying data distribution. Extensive experiments on synthetic and real-world datasets show that TAFA outperforms existing state-of-the-art baselines while achieving lower overall acquisition cost and computation.


💡 Research Summary

Active Feature Acquisition (AFA) addresses the practical need to acquire costly features at inference time by selecting them sequentially, balancing predictive performance against acquisition cost. Existing approaches fall into two main categories. Reinforcement‑learning (RL) methods formulate AFA as a Markov Decision Process (MDP) and learn a policy, but they suffer from large action spaces, sample inefficiency, training instability, and high computational overhead. Greedy methods based on Conditional Mutual Information (CMI) or discriminative variants (e.g., DIME) select the next feature with the highest estimated information gain, yet they cannot capture joint informativeness of feature sets and often require a generative model of the data distribution.
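For contrast, the greedy baselines described above can be caricatured in a few lines. This is a hypothetical sketch, not any paper's implementation: `estimate_gain` stands in for whatever CMI/DIME-style information-gain estimator a real method would learn. Each step myopically picks the single unobserved feature with the highest estimated gain, with no notion of jointly informative feature sets.

```python
# Caricature of one-feature-at-a-time greedy acquisition (CMI/DIME-style).
# estimate_gain is a stand-in for a learned information-gain estimator.

def greedy_acquire(x_true, n_features, estimate_gain, budget):
    """Acquire features one at a time, always taking the myopic best choice."""
    observed = {}                      # feature index -> acquired value
    while len(observed) < budget:
        remaining = [j for j in range(n_features) if j not in observed]
        if not remaining:
            break
        # Myopic choice: best single feature given what is observed so far.
        j_star = max(remaining, key=lambda j: estimate_gain(observed, j))
        observed[j_star] = x_true[j_star]
    return observed
```

Because the argmax is over individual features, a pair of features that is only informative jointly can never be preferred over a mediocre single feature, which is exactly the failure mode the paper targets.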

The paper proposes a fundamentally different paradigm: Template‑based AFA (TAFA). A “template” is defined as a small subset of features that, when acquired together, provides high information gain relative to its cost. By learning a library of such templates in advance, the policy’s decision space is dramatically reduced: at each step the agent chooses among a handful of templates rather than among all individual features. This eliminates the need for costly distribution estimation and sidesteps the difficulties of RL training.
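A minimal sketch of how template-guided acquisition could proceed at inference time (all names here are illustrative assumptions, not the authors' implementation): the policy scores each template in the learned library against the features already acquired and picks the most promising one, acquiring its features jointly, until a budget is exhausted.

```python
# Illustrative sketch of template-guided acquisition (not the paper's code).
# A "template" is a set of feature indices that are jointly informative.

def acquire_with_templates(x_true, templates, score_fn, budget):
    """Sequentially acquire features template-by-template within a budget.

    x_true    : dict mapping feature index -> value (the oracle we pay to query)
    templates : list of sets of feature indices (the learned library)
    score_fn  : heuristic scoring a candidate template given current observations
    budget    : maximum number of features we may acquire
    """
    observed = {}                      # features acquired so far
    while len(observed) < budget:
        # Consider only the not-yet-observed part of each template.
        candidates = [t - set(observed) for t in templates]
        candidates = [c for c in candidates
                      if c and len(observed) + len(c) <= budget]
        if not candidates:
            break
        best = max(candidates, key=lambda c: score_fn(observed, c))
        for j in best:                 # acquire the whole template jointly
            observed[j] = x_true[j]
    return observed
```

Note how the decision space per step is the handful of templates in the library, not the full set of individual features, which is the source of the claimed reduction in action space.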

The authors formalize the objective as minimizing the expected cost‑benefit loss
$$e(x_b, y) = \ell(\hat y_b(x_b), y) + \lambda \sum_{j\in b} c(j)$$
over a collection $\mathcal B$ of templates, where $\hat y_b$ denotes the prediction made from the features in template $b$, $\ell$ is a prediction loss, $c(j)$ is the acquisition cost of feature $j$, and $\lambda$ trades off accuracy against cost. The aggregate objective
$$g(\mathcal B) = \mathbb{E}_{x,y}\Big[\min_{b\in\mathcal B} e(x_b, y)\Big]$$
is then minimized over candidate template libraries $\mathcal B$.
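The cost‑benefit trade-off above can be made concrete with a toy computation (a hedged sketch; the loss values, feature costs, and $\lambda$ below are invented for illustration): for a fixed template the loss is the prediction loss plus the $\lambda$-weighted sum of its feature costs, and a library is evaluated by the best template per instance, averaged over samples.

```python
# Toy illustration of the cost-benefit loss e and the aggregate objective g.
# Symbols follow the summary; the numbers and losses are invented.

def e(pred_loss, template, cost, lam):
    """Cost-benefit loss: prediction loss plus lambda-weighted acquisition cost."""
    return pred_loss + lam * sum(cost[j] for j in template)

def g_empirical(samples, library, cost, lam):
    """Empirical aggregate objective: mean over samples of the best template's loss.

    samples : list of dicts mapping template (as frozenset) -> prediction loss
              achieved on that sample when only that template's features are seen
    """
    total = 0.0
    for losses in samples:
        total += min(e(losses[b], b, cost, lam) for b in library)
    return total / len(samples)
```

A larger $\lambda$ penalizes expensive templates more heavily, pushing the minimizer toward cheaper feature sets; $\lambda \to 0$ recovers pure accuracy maximization.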

