The Discovery Gap: How Product Hunt Startups Vanish in LLM Organic Discovery Queries
📝 Original Info
Title: The Discovery Gap: How Product Hunt Startups Vanish in LLM Organic Discovery Queries
ArXiv ID: 2601.00912
Date: 2026-01-01
Authors: Amit Prakash Sharma
📝 Abstract
When someone asks ChatGPT to recommend a project management tool, which products show up in the response? And more importantly for startup founders: will their newly launched product ever appear? This research set out to answer these questions.
I randomly selected 112 startups from the top 500 products featured on the 2025 Product Hunt leaderboard and tested each one across 2,240 queries to two different large language models: ChatGPT (gpt-4o-mini) and Perplexity (sonar with web search).
The results were striking. When users asked about products by name, both LLMs recognized them almost perfectly: 99.4% for ChatGPT and 94.3% for Perplexity. But when users asked discovery-style questions like "What are the best AI tools launched this year?", the success rates collapsed to 3.32% and 8.29%, respectively. For ChatGPT, that is a gap of roughly 30-to-1.
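The size of the gap follows directly from the percentages quoted above. A minimal sketch of that arithmetic, using only the reported rates; the dictionary layout and the function name `gap_ratio` are illustrative and not taken from the paper:

```python
# Reported success rates (in percent) for direct and discovery queries.
rates = {
    "ChatGPT": {"direct": 99.4, "discovery": 3.32},
    "Perplexity": {"direct": 94.3, "discovery": 8.29},
}


def gap_ratio(direct: float, discovery: float) -> float:
    """Return how many times more often direct queries succeed than discovery queries."""
    if discovery <= 0:
        raise ValueError("discovery rate must be positive")
    return direct / discovery


# Hypothetical helper structure: one ratio per model.
ratios = {name: gap_ratio(r["direct"], r["discovery"]) for name, r in rates.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}-to-1")
```

By the same calculation, the Perplexity gap is smaller, at roughly 11-to-1.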
Perhaps the most surprising finding was that Generative Engine Optimization (GEO), the practice of optimizing website content for AI visibility, showed no correlation with actual discovery rates. Products with high GEO scores were no more likely to appear in organic queries than products with low scores.
What did matter? For Perplexity, traditional SEO signals like referring domains (r = +0.319, p < 0.001) and Product Hunt ranking (r = -0.286, p = 0.002) predicted visibility. After cleaning the Reddit data for false positives, community presence also emerged as significant (r = +0.395, p = 0.002).
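For readers less familiar with the statistic, the r values above are Pearson correlation coefficients. The sketch below shows how such a coefficient and its t-statistic can be computed. It is not the author's analysis code: the function names are hypothetical, the toy data are made up for illustration, and the sample size of 112 used in the note afterwards is an assumption, since the number of products entering each correlation is not stated here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t-statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up toy data (NOT from the study): referring domains vs. discovery rate (%).
referring_domains = [5, 12, 40, 3, 150, 60]
discovery_rate = [0.0, 5.0, 10.0, 0.0, 20.0, 5.0]

r = pearson_r(referring_domains, discovery_rate)
print(f"toy r = {r:+.3f}")
```

Under the assumption that all 112 products entered the referring-domains correlation, `t_statistic(0.319, 112)` is roughly 3.5, which is in line with the reported p < 0.001.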
The practical takeaway is counterintuitive: don't optimize for AI discovery directly. Instead, build the SEO foundation first and LLM visibility will follow.
📄 Full Content
The Discovery Gap: LLM Discoverability of Product Hunt Startups
The Discovery Gap
How Product Hunt Startups Disappear in LLM Discovery Queries
A Study in Generative Engine Optimization (GEO)
Amit Prakash Sharma
IIT Patna, December 2025
Keywords: Generative Engine Optimization, Large Language Models, Product Discovery, Startup
Visibility, AI Search
LIST OF SYMBOLS AND ABBREVIATIONS
LLM: Large Language Model
GEO: Generative Engine Optimization
SEO: Search Engine Optimization
POTD: Product of the Day
API: Application Programming Interface
r: Pearson correlation coefficient
p: Statistical significance (p-value)
NS: Not Significant
n: Sample size
CHAPTER 1
INTRODUCTION
1.1 The Problem
Something strange is happening with AI-powered search. Ask ChatGPT "What is
Notion?" and you'll get a detailed, accurate response. But ask "What are the best note-
taking apps?" and Notion might not even appear. This gap between recognition and
recommendation is what I set out to study.
The shift matters because people increasingly discover products through AI assistants. If this trend
continues, and there's every reason to think it will, then understanding how products
appear in AI responses becomes a business necessity, not just an academic curiosity.
Traditional search engine optimization has been studied exhaustively since Brin and Page
published their PageRank paper in 1998 [2]. We know how Google ranks pages. We
know what makes a site authoritative. But LLMs work differently. They don't return
ranked lists of links; they generate synthesized responses. The rules are different and we
don't fully understand them yet.
1.2 Why Startups Face a Harder Problem
New startups sit at a particular disadvantage here. Consider the mechanics:
First, there's the training data problem. ChatGPT's knowledge has a cutoff date. Any
product launched after that date simply doesn't exist in the model's world. No amount of
optimization can change that fundamental fact.
Second, even for products that launched before the cutoff, the training data naturally
overrepresents established players. Wikipedia has extensive articles about Salesforce; it
doesn't have articles about the CRM tool that launched last month. This creates what I
call an "authority concentration" effect. The rich get richer in terms of AI visibility.
Third, web-search augmented models like Perplexity introduce different dynamics. They
can find new products through real-time search, but now SEO factors come back into
play. It's a different game with different rules.
1.3 Research Questions
This study examines six questions:
RQ1: How large is the gap between direct queries (asking about a product by name) and
discovery queries (category searches where products might organically appear)?
RQ2: Do Product Hunt metrics like upvotes, comments, and daily rankings correlate with
LLM visibility?
RQ3: Does GEO actually improve discovery rates, as prior research
suggests?
RQ4: Do traditional SEO signals still matter for AI visibility?
RQ5: How much better are web-search LLMs at discovering new products compared to
knowledge-cutoff models?
RQ6: Do community signals fr