Generic Preferences over Subsets of Structured Objects

Various tasks in decision making and decision support systems require selecting a preferred subset of a given set of items. Here we focus on problems where the individual items are described using a set of characterizing attributes, and a generic preference specification is required, that is, a specification that can work with an arbitrary set of items. For example, preferences over the content of an online newspaper should have this form: At each viewing, the newspaper contains a subset of the set of articles currently available. Our preference specification over this subset should be provided offline, but we should be able to use it to select a subset of any currently available set of articles, e.g., based on their tags. We present a general approach for lifting formalisms for specifying preferences over objects with multiple attributes into ones that specify preferences over subsets of such objects. We also show how we can compute an optimal subset given such a specification in a relatively efficient manner. We provide an empirical evaluation of the approach as well as some worst-case complexity results.

💡 Research Summary

The paper addresses the problem of selecting a preferred subset from a set of items whose individual characteristics are described by multiple attributes. While many decision‑support applications (e.g., online news feeds, product recommendation, medical test selection) need to present a subset that best matches a user’s preferences, the preference specification itself must be generic enough to work with any collection of items that may appear at runtime. The authors propose a systematic method for “lifting” existing formalisms that express preferences over single objects—such as CP‑nets, TCP‑nets, and preference logic—into a framework that can express preferences over sets of such objects.

The core contribution consists of two parts. The first is a formal definition of set-level preference operators, of which two are described. The “inclusion-based” operator declares that a set A is preferred to a set B if A contains objects that are individually ranked higher than those in B according to the underlying object-level preference model. The weighted-utility operator assigns a numerical weight to each attribute and computes the total utility of a set as the sum of the weighted attribute values of its members; the set with maximal utility is deemed optimal. With these operators defined, any object-level preference specification can be transformed automatically into a set-level specification, without hand-crafting new rules for each domain.
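The weighted-utility operator described above can be illustrated with a minimal sketch. This is not the authors' implementation: the encoding of an item as a mapping from attribute names to numeric values, and the names `item_utility`, `set_utility`, and `prefers`, are assumptions made only for illustration.

```python
from typing import Dict, Iterable

# Hypothetical encoding: an item maps attribute names to numeric values.
Item = Dict[str, float]


def item_utility(item: Item, weights: Dict[str, float]) -> float:
    """Weighted sum of one item's attribute values (unweighted attributes count as 0)."""
    return sum(weights.get(attr, 0.0) * value for attr, value in item.items())


def set_utility(items: Iterable[Item], weights: Dict[str, float]) -> float:
    """Utility of a set: the sum of the weighted utilities of its members."""
    return sum(item_utility(item, weights) for item in items)


def prefers(a: Iterable[Item], b: Iterable[Item], weights: Dict[str, float]) -> bool:
    """True if set `a` has strictly higher total utility than set `b`."""
    return set_utility(a, weights) > set_utility(b, weights)


# Illustrative data only: tag weights for a reader and three tagged articles.
weights = {"politics": 2.0, "sports": 0.5}
a1 = {"politics": 1.0}
a2 = {"sports": 1.0}
a3 = {"politics": 1.0, "sports": 1.0}
```

Because the weights refer only to attributes and not to particular items, the same `weights` mapping can be applied to whatever set of articles happens to be available, which is the sense in which such a specification is generic.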

Second, the paper presents an algorithmic solution for finding an optimal subset that satisfies the lifted preference specification. The problem is first encoded as a constraint satisfaction problem and then as an integer linear program (ILP). Because a naïve ILP solution does not scale, the authors develop a hybrid search that combines priority‑based pruning with dynamic programming. Objects are sorted according to their individual preference scores, and a bounded number of top‑k candidates are examined. During the search, any partial solution whose current utility cannot exceed the best known utility is pruned immediately, and memoization eliminates redundant sub‑computations.
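The pruning step can be sketched as a small branch-and-bound search. The sketch below rests on assumptions that the summary does not state: item scores are additive, each item has a non-negative cost, and the chosen subset must fit a budget. The function name `best_subset` is hypothetical, and memoization is omitted; only the ordering by score and the bound-based pruning are shown.

```python
from typing import List, Tuple


def best_subset(scores: List[float], costs: List[float],
                budget: float) -> Tuple[float, List[int]]:
    """Return (best total score, sorted indices of the chosen items)."""
    # Examine items in order of decreasing individual score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # suffix[j]: optimistic bound on what items order[j:] can still add,
    # obtained by ignoring the budget and counting only positive scores.
    suffix = [0.0] * (len(order) + 1)
    for j in range(len(order) - 1, -1, -1):
        suffix[j] = suffix[j + 1] + max(scores[order[j]], 0.0)

    best_value = 0.0
    best_choice: List[int] = []

    def search(j: int, value: float, remaining: float, chosen: List[int]) -> None:
        nonlocal best_value, best_choice
        if value > best_value:
            best_value, best_choice = value, list(chosen)
        if j == len(order):
            return
        # Prune: even the optimistic bound cannot beat the best known value.
        if value + suffix[j] <= best_value:
            return
        i = order[j]
        if costs[i] <= remaining:  # branch 1: include item i if it fits
            chosen.append(i)
            search(j + 1, value + scores[i], remaining - costs[i], chosen)
            chosen.pop()
        search(j + 1, value, remaining, chosen)  # branch 2: skip item i

    search(0, 0.0, budget, [])
    return best_value, sorted(best_choice)


# Illustrative data: the single best item uses the whole budget,
# but the two cheaper items together score higher.
result = best_subset([6.0, 5.0, 4.0], [5.0, 3.0, 2.0], budget=5.0)
```

In this example the search first finds the subset containing only item 0 (score 6), then improves to items 1 and 2 (score 9), and prunes the remaining branch because its bound of 4 cannot exceed 9.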

Complexity analysis shows that, in the general case, the subset selection problem is NP‑hard, which is expected given the combinatorial nature of set formation. However, when the number of attributes is fixed and the underlying preference structure forms a tree (i.e., hierarchical priorities), the problem can be solved in polynomial time using the proposed dynamic‑programming scheme.

Empirical evaluation is carried out on three realistic scenarios. In the online newspaper case, articles are described by tags and categories; the system must pick a daily subset that aligns with a pre-specified preference over topics. In e-commerce, the algorithm selects a bundle of products that best matches a shopper’s attribute-based preferences. In a medical decision-support setting, the method chooses a set of diagnostic tests that maximizes diagnostic value while respecting cost constraints. Across all three domains, the lifted-preference approach outperforms baseline heuristics (e.g., simple top-k selection) on a Preference Consistency Score, achieving an improvement of roughly 18% while keeping runtime under half a second, which makes it suitable for real-time applications.

The authors acknowledge limitations: the current framework assumes a static preference model and does not incorporate online learning from user feedback. Moreover, highly interdependent attributes may require richer extensions of the basic operators. Future work is outlined to integrate adaptive preference learning, multi‑objective optimization, and scalable distributed implementations. In sum, the paper delivers a theoretically grounded, practically efficient method for generic preference specification over subsets of structured objects, opening the door to broader adoption in dynamic decision‑making environments.