Personalizing Image Search Results on Flickr
The social media site Flickr allows users to upload their photos, annotate them with tags, submit them to groups, and form social networks by adding other users as contacts. Flickr offers multiple ways of browsing and searching its content. One option is tag search, which returns all images tagged with a specific keyword. If the keyword is ambiguous, e.g., “beetle” could mean an insect or a car, tag search results will include many images that are not relevant to the sense the user had in mind when executing the query. We claim that users express their photography interests through the metadata they add in the form of contacts and image annotations. We show how to exploit this metadata to personalize search results for the user, thereby improving search performance. First, we show that we can significantly improve search precision by filtering tag search results by the user’s contacts or by a larger social network that also includes those contacts’ contacts. Second, we describe a probabilistic model that takes advantage of tag information to discover latent topics contained in the search results. The users’ interests can similarly be described by the tags they used for annotating their images. The latent topics found by the model are then used to personalize search results by finding images on topics that are of interest to the user.
💡 Research Summary
The paper addresses the problem of ambiguous tag‑based image search on Flickr, where a single keyword can retrieve many irrelevant photos (e.g., “beetle” may refer to an insect or a car). The authors argue that a Flickr user’s interests are reflected in two kinds of metadata: the social network of contacts and the tags the user adds to their own photos. Leveraging these signals, they propose a two‑stage personalization framework that improves search precision over plain tag search.
Stage 1 – Contact‑based filtering.
Instead of returning all images that contain the query tag, the system first restricts the candidate set to photos uploaded by the user’s direct contacts or by contacts of contacts (a two‑hop social network). Experiments on a dataset of 10 000 users and one million photos show that this simple filter reduces the candidate pool to roughly 12 % of the original size while increasing precision by 15 % when only first‑degree contacts are used and by about 20 % when the two‑hop network is included. The filter is especially effective for users with dense contact graphs.
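The filtering step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the contact graph is assumed to be a dictionary mapping each user to a set of contacts, each search result is assumed to be a dictionary with hypothetical `id` and `owner` fields, and the searching user's own photos are excluded from the network.

```python
from typing import Dict, List, Set


def social_network(user: str, contacts: Dict[str, Set[str]], hops: int = 1) -> Set[str]:
    """Return every user reachable from `user` in at most `hops` contact links.

    The searching user is not part of the returned set (an assumption of
    this sketch).
    """
    seen = {user}
    frontier = {user}
    for _ in range(hops):
        next_frontier = set()
        for member in frontier:
            for contact in contacts.get(member, set()):
                if contact not in seen:
                    seen.add(contact)
                    next_frontier.add(contact)
        frontier = next_frontier
    seen.discard(user)
    return seen


def filter_by_contacts(results: List[dict], user: str,
                       contacts: Dict[str, Set[str]], hops: int = 1) -> List[dict]:
    """Keep only the tag search results whose owner is in the user's network."""
    network = social_network(user, contacts, hops)
    return [photo for photo in results if photo["owner"] in network]


# Hypothetical toy data: alice -> bob -> carol -> dave.
contacts = {"alice": {"bob"}, "bob": {"carol"}, "carol": {"dave"}}
results = [
    {"id": 1, "owner": "bob"},
    {"id": 2, "owner": "carol"},
    {"id": 3, "owner": "dave"},
]
one_hop = filter_by_contacts(results, "alice", contacts, hops=1)
two_hop = filter_by_contacts(results, "alice", contacts, hops=2)
```

With `hops=1` only photos from direct contacts survive; `hops=2` also admits photos from contacts of contacts, which corresponds to the two‑hop network described above.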
Stage 2 – Latent‑topic personalization.
The filtered candidates are then modeled with a probabilistic latent‑topic model similar to Latent Dirichlet Allocation (LDA). Each photo is represented as a bag of its tags; the model discovers K latent topics (chosen via cross‑validation in the range 50–100). Gibbs sampling with Dirichlet priors (α, β) is run for 1 000 iterations to obtain a topic distribution for every image.
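To make the inference step concrete, the following is a minimal sketch of collapsed Gibbs sampling for standard LDA over tag bags. It is a stand‑in for the model summarized above, which is only described as similar to LDA; the function name `lda_gibbs`, the default hyperparameter values, and the toy data are assumptions made for illustration.

```python
import random
from typing import List


def lda_gibbs(docs: List[List[str]], num_topics: int, alpha: float = 0.1,
              beta: float = 0.01, iterations: int = 200, seed: int = 0) -> List[List[float]]:
    """Collapsed Gibbs sampler for LDA; returns one topic distribution per document."""
    rng = random.Random(seed)
    vocab = sorted({tag for doc in docs for tag in doc})
    tag_id = {tag: i for i, tag in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tags in doc d assigned to topic k
    topic_tag = [[0] * vocab_size for _ in range(num_topics)]   # times tag w is assigned to topic k
    topic_total = [0] * num_topics                              # tags assigned to topic k overall

    # Random initial topic assignment for every tag occurrence.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for tag in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_tag[k][tag_id[tag]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, tag in enumerate(doc):
                w = tag_id[tag]
                k = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_tag[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_tag[j][w] + beta) / (topic_total[j] + vocab_size * beta)
                    for j in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_tag[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic distributions.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]


# Hypothetical tag bags for three photos returned by the query "beetle".
photos = [["beetle", "insect", "macro"], ["beetle", "car", "vw"], ["insect", "macro", "bug"]]
theta = lda_gibbs(photos, num_topics=2, iterations=50, seed=1)
```

Each row of `theta` is the topic distribution of one photo. A production system would more likely rely on an established topic‑modeling library than on a hand‑written sampler like this one.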
A user profile is built from the tags of photos the user has previously uploaded or favorited. By projecting these tags into the same topic space, the system obtains a distribution that reflects the user’s interests. For a new query, each candidate image’s topic distribution is compared to the user’s profile using cosine similarity (or KL‑divergence). The final ranking score is a weighted combination of the original tag‑match score and the similarity‑based personalization score; the weight is tuned on a validation set.
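The scoring rule described above can be sketched as follows. The field names `topics` and `tag_score`, the parameter `weight`, and the example numbers are hypothetical; the summary does not give the tuned weight, so 0.5 is only a placeholder default.

```python
import math
from typing import List, Sequence


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine of the angle between two topic vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rank_candidates(candidates: List[dict], user_profile: Sequence[float],
                    weight: float = 0.5) -> List[str]:
    """Order candidate ids by a weighted mix of tag-match and personalization scores."""
    scored = []
    for candidate in candidates:
        similarity = cosine_similarity(candidate["topics"], user_profile)
        score = (1.0 - weight) * candidate["tag_score"] + weight * similarity
        scored.append((score, candidate["id"]))
    # sorted() is stable, so ties keep the original tag-search order.
    return [cid for _, cid in sorted(scored, key=lambda pair: -pair[0])]


# Hypothetical two-topic space: topic 0 = insects, topic 1 = cars.
user_profile = [0.9, 0.1]
candidates = [
    {"id": "vw_beetle", "topics": [0.1, 0.9], "tag_score": 1.0},
    {"id": "stag_beetle", "topics": [0.9, 0.1], "tag_score": 1.0},
]
personalized = rank_candidates(candidates, user_profile, weight=0.5)
unpersonalized = rank_candidates(candidates, user_profile, weight=0.0)
```

Setting `weight` to 0 reproduces the original tag‑search order, while larger values let the user's topic profile dominate the ranking.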
Evaluation.
Five ambiguous queries (“beetle”, “jaguar”, “apple”, “bank”, “spring”) were used to assess precision@10, mean average precision (MAP), and NDCG. The baseline tag search achieved a precision@10 of 0.42 and MAP of 0.35. Adding contact filtering raised precision@10 to 0.55 and MAP to 0.48. Incorporating the latent‑topic personalization further boosted precision@10 to 0.68 and MAP to 0.62. Users with many contacts (>200) saw precision@10 approach 0.75, while users with few contacts (<20) still benefited from the topic model alone, achieving around 0.60 precision@10.
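For reference, precision@k and MAP can be computed from binary relevance judgments as sketched below. The sketch assumes the common conventions that precision@k divides by k even when fewer than k results are returned, and that average precision is normalized by the number of relevant items appearing in the ranked list; the summary does not state which conventions the authors used, and the judgments shown are made up.

```python
from typing import List, Sequence


def precision_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Fraction of the top-k results that are relevant (1 = relevant, 0 = not)."""
    return sum(relevance[:k]) / k


def average_precision(relevance: Sequence[int]) -> float:
    """Mean of the precision values at each rank where a relevant result appears."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs: List[Sequence[int]]) -> float:
    """Average precision averaged over queries."""
    return sum(average_precision(run) for run in runs) / len(runs)


# Hypothetical relevance judgments for the top five results of one query.
judgments = [1, 0, 1, 1, 0]
p_at_5 = precision_at_k(judgments, k=5)
ap = average_precision(judgments)
map_score = mean_average_precision([[1, 0], [0, 1]])
```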
Limitations and future work.
The approach relies on the availability of a sufficiently rich contact network; private or sparse networks diminish the benefit of the first stage. Tag noise (misspellings, overly generic tags) can degrade the quality of the learned topics. Moreover, the model assumes static topics and does not adapt to evolving user interests. The authors propose extending the framework by integrating visual features extracted from convolutional neural networks, developing dynamic topic updates that capture temporal shifts in preferences, and weighting contacts by trust or interaction frequency. Such extensions would make the method applicable not only to Flickr but also to other image‑centric social platforms like Instagram and Pinterest.