CupQ: A New Clinical Literature Search Engine
A new clinical literature search engine, called CupQ, is presented. It aims to help clinicians stay updated with medical knowledge. Although PubMed is currently one of the most widely used digital libraries for biomedical information, it frequently does not return clinically relevant results. CupQ uses a ranking algorithm that filters out non-medical journals, computes the semantic similarity between queries and document titles, and incorporates journal impact factor and publication date. It organizes search results into categories useful to medical practitioners: reviews, guidelines, and studies. Qualitative comparisons suggest that CupQ may return more clinically relevant information than PubMed. CupQ is available at https://cupq.io/.
💡 Research Summary
The paper introduces CupQ, a clinically focused literature search engine designed to address the shortcomings of PubMed for day‑to‑day medical practice. While PubMed remains the dominant biomedical repository, clinicians often find its interface cumbersome and its relevance ranking insufficient for rapid retrieval of clinically actionable information. The authors argue that PubMed’s “Best Match” algorithm does not incorporate two key signals that physicians value: the prestige of the publishing journal (as measured by Journal Impact Factor, JIF) and the recency of the article. Moreover, PubMed requires users to construct complex Boolean queries, select MeSH terms, and apply filters, which can be a barrier for busy clinicians.
CupQ’s architecture consists of four logical layers: data acquisition, preprocessing, indexing, and retrieval/ranking. MEDLINE/PubMed XML dumps are downloaded via FTP on a weekly schedule, verified with MD5 checksums, and parsed to extract titles, abstracts, journal names, authors, and publication dates. Only records from journals classified under the “medicine” subject area in ScimagoJR are retained, and each is annotated with the journal’s JIF obtained from the Journal Citation Reports. This filtering eliminates non‑clinical journals at the source.
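The acquisition step can be illustrated with a minimal sketch. The element names in the inline sample below are simplified stand-ins for the actual MEDLINE/PubMed XML schema, and `md5_of` shows the checksum verification idea without performing a real FTP download:

```python
import hashlib
import xml.etree.ElementTree as ET

# Tiny inline record standing in for a downloaded MEDLINE XML dump.
# Element names are illustrative, not the exact PubMed schema.
SAMPLE = """<PubmedArticle>
  <PMID>12345</PMID>
  <ArticleTitle>Management of myocardial infarction</ArticleTitle>
  <AbstractText>We review current therapy.</AbstractText>
  <Journal>N Engl J Med</Journal>
  <PubDate>2018</PubDate>
</PubmedArticle>"""

def md5_of(data: bytes) -> str:
    """Hex MD5 digest, as used to verify each downloaded dump file."""
    return hashlib.md5(data).hexdigest()

def parse_record(xml_text: str) -> dict:
    """Extract the fields the paper lists: title, abstract, journal, date."""
    root = ET.fromstring(xml_text)
    return {field: root.findtext(field) for field in
            ("PMID", "ArticleTitle", "AbstractText", "Journal", "PubDate")}

record = parse_record(SAMPLE)
print(record["Journal"])  # N Engl J Med
```

In the real pipeline the parsed record would then be kept only if its journal appears in ScimagoJR’s “medicine” subject area, and annotated with its JIF.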
For semantic processing, the system tokenizes titles and abstracts on spaces and hyphens, normalizes tokens with the LuiNorm API, removes stop‑words (preserving fully capitalized tokens), and trains a Word2Vec skip‑gram model (100‑dimensional vectors, window = 100, 10 epochs) using Gensim. Document title vectors are constructed as a weighted sum of token vectors, where each token’s weight is the log of (corpus size / document frequency), a TF‑IDF‑like scheme that emphasizes rare but informative terms.
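The title-vector construction can be sketched as follows. The 3-dimensional vectors and document-frequency counts below are made up for illustration (the real system uses 100-dimensional Word2Vec embeddings over the full corpus):

```python
import math
import numpy as np

# Toy embedding table and document frequencies (illustrative values only).
CORPUS_SIZE = 1_000_000
vectors = {
    "myocardial": np.array([0.2, 0.7, 0.1]),
    "infarction": np.array([0.3, 0.6, 0.2]),
    "the":        np.array([0.9, 0.1, 0.0]),
}
doc_freq = {"myocardial": 12_000, "infarction": 15_000, "the": 990_000}

def title_vector(tokens):
    """Weighted sum of token vectors, weight = log(corpus size / doc freq)."""
    vec = np.zeros(3)
    for tok in tokens:
        if tok in vectors:
            weight = math.log(CORPUS_SIZE / doc_freq[tok])  # IDF-like weight
            vec += weight * vectors[tok]
    return vec

v = title_vector(["myocardial", "infarction", "the"])
```

Note how the scheme behaves: a near-ubiquitous token such as “the” gets weight log(1,000,000 / 990,000) ≈ 0.01 and contributes almost nothing, while rare clinical terms dominate the document vector.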
An inverted index maps each token identifier (TID) to a list of PubMed IDs (PMIDs). The index is stored in a MySQL table with a composite primary key (TID, PMID) and is refreshed automatically as new MEDLINE records become available. When a user submits a query, the same tokenization and embedding pipeline is applied. The system selects the query token that appears in the fewest documents (the “rarest” token) to retrieve an initial candidate set from the inverted index, then filters candidates to retain only English‑language articles that contain all query tokens, are not retractions or errata, and were published after 1990.
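The rarest-token retrieval strategy can be sketched with an in-memory dictionary standing in for the MySQL (TID, PMID) table; the index contents below are illustrative:

```python
# In-memory stand-in for the inverted index: token -> sorted list of PMIDs.
index = {
    "stroke":   [101, 102, 103, 104],
    "ischemic": [102, 104],
    "acute":    [101, 102, 103, 104, 105],
}

def candidates(query_tokens):
    """Retrieve via the rarest token, then require all query tokens."""
    present = [t for t in query_tokens if t in index]
    if not present:
        return []
    # The rarest token yields the smallest initial candidate set,
    # minimizing the number of postings to scan.
    rarest = min(present, key=lambda t: len(index[t]))
    cands = set(index[rarest])
    for t in present:
        cands &= set(index[t])
    return sorted(cands)

print(candidates(["acute", "ischemic", "stroke"]))  # [102, 104]
```

The surviving candidates would then pass through the language, retraction/erratum, and post-1990 publication-date filters described above.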
Ranking is performed separately for three user‑oriented categories: Reviews, Guidelines, and Studies. For each document a set of sub‑scores is computed: (1) a semantic score (cosine similarity between the query vector and the document title vector), (2) a title‑token‑coverage score (1 if the title contains all query tokens, 0 otherwise), (3) a date score (the reciprocal of the number of days since publication, multiplied by 0.1 for articles older than 20 years), and (4) a journal score equal to the JIF. Each sub‑score is min‑max normalized, then multiplied by category‑specific boosting factors (e.g., Reviews: Title 4, Cosine 3, Date 1, Journal 2; Guidelines: Title 6, Cosine 8, Date 1, Journal 4; Studies: Title 3, Cosine 5, Date 1, Journal 4). The final relevance score is the sum of these weighted sub‑scores; any zero sub‑score nullifies the entire relevance, enforcing that a document must satisfy all basic criteria.
The user interface presents a simple search bar and a tab bar for selecting the desired category. Results are displayed as JSON‑driven HTML cards showing title, abstract snippet, author abbreviations, journal abbreviation, and year. The system caches the top 500 ranked results for each query to accelerate subsequent page loads.
To evaluate CupQ, the authors performed a qualitative comparison against PubMed on three representative queries: “myocardial infarction” (Reviews), “depression” (Guidelines), and “stroke” (Studies). For myocardial infarction, CupQ’s top three results were recent reviews from high‑impact journals (NEJM, Lancet, BMJ) with cosine similarities of 0.986, whereas PubMed’s top results came from lower‑impact or older journals, with a best cosine similarity of 0.832 and no overlap with CupQ’s list. For depression guidelines, both engines returned the same JAMA guideline, but CupQ additionally surfaced a high‑impact Annals of Internal Medicine recommendation that PubMed missed, while PubMed returned a multi‑topic cancer journal article that, despite a high JIF, was only tangentially related to depression. In the stroke study query, CupQ’s top results were all from NEJM (2017‑2018) and contained the keyword in the title; PubMed’s top results included older or lower‑impact journals and occasionally omitted the keyword from the title.
The authors discuss related work, noting that tools such as Quertle, MEDIE, and Semantic MEDLINE also incorporate linguistic or semantic information but are either proprietary (Quertle) or lack transparent ranking formulas, and none integrate journal impact or recency metrics. CupQ’s novelty lies in its open, reproducible pipeline that combines semantic similarity, journal prestige, and publication date, and in its automatic categorization of results into clinically meaningful buckets.
Limitations acknowledged include the reliance on qualitative, not quantitative, evaluation; the absence of click‑through or usage analytics; and the fact that JIF is an imperfect proxy for article quality. Future work is proposed to collect large‑scale user interaction data, conduct A/B testing, and explore additional relevance signals such as article‑level citation counts, Altmetric scores, or expert‑curated relevance judgments.
In conclusion, CupQ demonstrates that a search engine tailored to clinicians—by filtering to medical journals, weighting recent high‑impact publications, and using word‑embedding similarity—can return more clinically pertinent literature than the generic PubMed interface. The system offers a practical, open‑source alternative that may improve information retrieval efficiency in continuing medical education and point‑of‑care decision making.