Learning to Select: Query-Aware Adaptive Dimension Selection for Dense Retrieval


Dense retrieval represents queries and documents as high-dimensional embeddings, but these representations can be redundant at the query level: for a given information need, only a subset of dimensions is consistently helpful for ranking. Prior work addresses this via pseudo-relevance feedback (PRF) based dimension-importance estimation, which can produce query-aware masks without labeled data but often relies on noisy pseudo signals and heuristic test-time procedures. In contrast, supervised adapter methods leverage relevance labels to improve embedding quality, yet they learn global transformations shared across queries and do not explicitly model query-aware dimension importance. We propose a Query-Aware Adaptive Dimension Selection framework that *learns* to predict per-dimension importance directly from the query embedding. We first construct oracle importance distributions over embedding dimensions using supervised relevance labels, and then train a predictor to map a query embedding to these label-distilled importance scores. At inference, the predictor selects a query-aware subset of dimensions for similarity computation based solely on the query embedding, without pseudo-relevance feedback. Experiments across multiple dense retrievers and benchmarks show that our learned dimension selector improves retrieval effectiveness over the full-dimensional baseline as well as PRF-based masking and supervised adapter baselines.


💡 Research Summary

Dense retrieval systems map queries and documents into high‑dimensional vectors and rank documents by cosine similarity. While these embeddings capture rich semantic information, many dimensions are redundant for a given query: only a subset consistently contributes to relevance, and the rest may be neutral or even detrimental. Prior work addresses this redundancy in two main ways. First, dimension‑importance estimation methods such as DIME and Eclipse use pseudo‑relevance feedback (PRF) or LLM‑generated pseudo‑documents to derive per‑query importance scores. These approaches are unsupervised and can produce query‑aware masks, but they rely on noisy pseudo‑labels and require heuristic post‑processing at inference time, limiting robustness and deployment simplicity. Second, supervised adapter techniques (e.g., Search Adapter) learn a global linear transformation on top of frozen encoders using relevance labels. Although adapters improve full‑dimensional performance, they apply the same transformation to all queries, failing to capture query‑specific dimension patterns.

The authors propose a Query‑Aware Adaptive Dimension Selection (QADS) framework that learns to predict per‑dimension importance directly from supervised relevance signals, eliminating the need for PRF. The method proceeds in two stages. (1) Oracle importance distribution construction: for each training query q, they collect positively labeled documents D⁺(q) and sample hard negatives D⁻(q) from the top‑K retrieved non‑relevant items. They compute a weighted positive centroid p and a negative centroid n, then calculate a raw discrimination score for each dimension j as r_q(j) = e_{q,j}·(p_j − n_j). Applying a temperature‑scaled softmax yields a probability distribution π_q over dimensions, representing the oracle importance. (2) Predictor training: a lightweight fully‑connected layer f_θ maps the frozen query embedding e_q to logits ℓ = f_θ(e_q), and a softmax over ℓ (log‑softmax during optimization) gives the predicted distribution π̂_q. The predictor is trained by minimizing the KL divergence KL(π_q‖π̂_q); only its parameters θ are updated, while document embeddings and the ANN index remain untouched.
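The two stages above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the uniform centroid weights, the temperature value `tau`, and feeding the objective pre-computed logits (rather than a trained linear layer) are all assumptions made for brevity.

```python
import numpy as np

def oracle_importance(e_q, pos_embs, neg_embs, tau=0.1):
    """Stage 1 sketch: oracle importance distribution pi_q over dimensions.

    e_q: (d,) query embedding; pos_embs: (P, d) relevant documents;
    neg_embs: (N, d) hard negatives. Uniform centroid weights and the
    temperature tau are illustrative assumptions.
    """
    p = pos_embs.mean(axis=0)             # positive centroid
    n = neg_embs.mean(axis=0)             # negative centroid
    r = e_q * (p - n)                     # r_q(j) = e_{q,j} * (p_j - n_j)
    z = r / tau
    z = z - z.max()                       # stabilize before exponentiation
    pi = np.exp(z)
    return pi / pi.sum()                  # temperature-scaled softmax

def kl_objective(pi_q, logits):
    """Stage 2 sketch: training loss KL(pi_q || softmax(logits)).

    In the paper the logits come from the learned layer f_theta(e_q);
    here they are passed in directly.
    """
    z = logits - logits.max()
    log_pi_hat = z - np.log(np.exp(z).sum())          # log-softmax
    return float(np.sum(pi_q * (np.log(pi_q + 1e-12) - log_pi_hat)))
```

Minimizing `kl_objective` over θ pushes the predicted distribution toward the label-distilled oracle, so at test time no positive or negative documents are needed.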

At inference, the predictor outputs π̂_q for a query, the top‑k dimensions are selected, and a binary mask m_q^(k) zeroes out the remaining coordinates. The masked query vector e_q^(k) = e_q ⊙ m_q^(k) is then used with the original document embeddings to compute cosine similarity. Because document vectors are unchanged, the approach incurs no re‑indexing cost and can be deployed as a drop‑in module.
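The inference step can be sketched as follows; `argsort` tie-breaking and the normalization epsilon are illustrative choices, not details from the paper.

```python
import numpy as np

def masked_scores(e_q, pi_hat, doc_embs, k):
    """Keep the top-k dimensions by predicted importance, mask the query,
    and score the unchanged document embeddings by cosine similarity."""
    top = np.argsort(pi_hat)[-k:]                  # k most important dims
    mask = np.zeros_like(e_q)
    mask[top] = 1.0
    q = e_q * mask                                 # e_q^(k) = e_q ⊙ m_q^(k)
    q = q / (np.linalg.norm(q) + 1e-12)
    docs = doc_embs / (np.linalg.norm(doc_embs, axis=1, keepdims=True) + 1e-12)
    return docs @ q                                # cosine similarities
```

Since only the query side is masked, an existing inner-product ANN index over the document embeddings can be queried with the masked vector as-is.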

Experiments span seven dense encoders—including three Qwen‑Embedding models trained with multi‑scale (Matryoshka) objectives, OpenAI’s text‑embedding‑3‑large, and three LLM2VEC variants—and three benchmark datasets: SciFact, NFCorpus, and MS MARCO. For each model‑dataset pair, the authors evaluate NDCG@10 across a grid of retained dimension ratios (2 %–100 %). Baselines comprise (i) the full‑dimensional baseline, (ii) a static prefix cutoff, (iii) a simple “Norm” baseline that selects dimensions with largest absolute query values, (iv) PRF‑based DIME and Eclipse, and (v) a supervised Search Adapter.

Results show that QADS consistently outperforms PRF‑based methods and the Norm baseline, often achieving higher NDCG while using far fewer dimensions. For example, on Qwen‑4B, retaining only 56 % of dimensions yields NDCG 0.899, surpassing the full‑dimensional baseline (0.849) and matching or exceeding the Adapter’s performance (0.849) with a substantially smaller vector. When constrained to a fixed 30 % dimension budget, QADS still attains near‑peak performance (e.g., 0.895 on Qwen‑4B), whereas adapters require the full vector to reach comparable scores. The method also demonstrates robustness across datasets: on MS MARCO, QADS with 20–44 % dimensions achieves NDCG improvements over both the baseline and DIME/Eclipse, and on NFCorpus it yields the highest scores among all tested approaches.

The paper discusses limitations: the current binary top‑k masking may discard useful fine‑grained weighting information, and the hard‑negative sampling strategy introduces hyper‑parameters (K, M) that may need dataset‑specific tuning. Future work is suggested in three directions: (a) learning a soft weighting mask rather than a hard binary mask, (b) dynamic selection of k via meta‑learning or reinforcement learning, and (c) extending the framework to multimodal queries (e.g., image‑text) where dimension relevance may differ across modalities.

In summary, the authors introduce a novel supervised dimension‑selection mechanism that leverages relevance labels to distill per‑query importance distributions and trains a lightweight predictor to approximate them. By applying query‑specific masks at inference without altering document embeddings or indexes, the approach offers a practical, label‑efficient alternative to PRF‑based masking and global adapters, delivering consistent retrieval gains across a wide range of models and benchmarks.

