Data-mining the Foundational Patents of Photovoltaic Materials: An application of Patent Citation Spectroscopy
We apply Patent Citation Spectroscopy (PCS), originally developed as Reference Publication Year Spectroscopy for studying landmarks and milestones in scientific literature, to patent literature classified into the nine Y-subclasses of the Cooperative Patent Classification (CPC) that describe material photovoltaic technologies. For this study we extended the routine with the option to use the advanced search queries at PatentsView. On the basis of two normalizations of the longitudinal distribution of the publication years of the patents cited by the retrieved patents, the routine (at http://www.leydesdorff.net/comins/pcs/index.html) provides a best guess of the foundational patent for the subject specified in the string. In five of the nine cases, we found corroborating evidence for the foundational character of the patent indicated by the routine.
💡 Research Summary
The paper introduces Patent Citation Spectroscopy (PCS), a data‑mining technique adapted from Reference Publication Year Spectroscopy (RPYS), to automatically identify foundational patents within nine CPC Y‑subclasses that cover photovoltaic (PV) material technologies. PCS works by aggregating all cited patents of a selected set, grouping them by grant year, and then detrending the yearly citation counts using a five‑year moving median. This first normalization highlights years with unusually high citation activity. To distinguish whether a peak is driven by a single highly influential patent or by multiple moderately cited patents, a second normalization multiplies the detrended value by the proportion of citations that the most‑cited patent in that year receives. The resulting “spectral” peak therefore points to a candidate seminal patent.
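The two normalizations described above can be illustrated with a minimal Python sketch. It is one reading of this summary, not the authors' implementation: the function name `pcs_candidate`, the input layout (one `(cited_patent_id, year)` pair per citation), the treatment of missing years as zero counts, and the toy patent IDs are all assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def pcs_candidate(citations):
    """Return (patent_id, year, scores) for the strongest spectral peak.

    `citations` holds one (cited_patent_id, year) pair per citation.
    Hypothetical helper: a sketch of the idea, not the published routine.
    """
    citations = list(citations)
    if not citations:
        raise ValueError("no citations supplied")

    per_year = Counter(year for _, year in citations)
    per_patent = Counter(citations)  # citation count per (patent, year)

    scores, top_patent = {}, {}
    for year, n in per_year.items():
        # First normalization: deviation from the five-year median
        # (years y-2 .. y+2; years without citations count as 0).
        window = [per_year.get(year + d, 0) for d in range(-2, 3)]
        detrended = n - median(window)

        # Second normalization: weight by the share of that year's
        # citations received by its single most-cited patent.
        patent, hits = max(
            ((p, c) for (p, y), c in per_patent.items() if y == year),
            key=lambda pc: pc[1],
        )
        scores[year] = detrended * hits / n
        top_patent[year] = patent

    best_year = max(scores, key=scores.get)
    return top_patent[best_year], best_year, scores


# Toy data with made-up patent IDs, purely for illustration.
toy = (
    [("A", 1980)] * 2 + [("B", 1981)] * 1
    + [("C", 1982)] * 6 + [("D", 1982)] * 2
    + [("E", 1983)] * 2
    + [("F", 1984)] * 3 + [("G", 1984)] * 3
)
patent, year, scores = pcs_candidate(toy)
print(patent, year, scores[year])
```

In the toy data, 1982 and 1984 both stand out after detrending, but the 1984 citations are split evenly between two patents, so the second normalization halves its score and the single dominant 1982 patent wins.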
The authors implemented PCS as a web application that queries the PatentsView API. Users can submit advanced search strings based on CPC subclass identifiers; for example, the query ADVANCED={"cpc_subgroup_id":"Y02E10/541"} retrieves 962 granted US patents and 3,502 unique cited references for the CuInSe₂ material subclass. PCS processes these data, produces a visual citation spectrum, and outputs the most likely foundational patent. In this case, US 4335266 (“Methods for forming thin‑film heterojunction solar cells from I‑III‑VI₂”, granted 1982) emerged as the peak.
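As a small sketch of how such a search string could be assembled, the snippet below serializes the filter with Python's standard `json` module so that the quotes are plain ASCII, as JSON requires. The helper name `advanced_query` is hypothetical; only the `ADVANCED=` prefix, the `cpc_subgroup_id` field, and the `Y02E10/541` value come from the example above, and no network request is made.

```python
import json


def advanced_query(cpc_subgroup_id: str) -> str:
    """Build an ADVANCED= search string for one CPC subgroup (hypothetical helper)."""
    payload = {"cpc_subgroup_id": cpc_subgroup_id}
    # Compact separators reproduce the spacing used in the example query.
    return "ADVANCED=" + json.dumps(payload, separators=(",", ":"))


query = advanced_query("Y02E10/541")
print(query)
```

Building the string through `json.dumps` rather than by hand avoids the typographic (curly) quotes that word processors and web pages tend to introduce, which a JSON parser would reject.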
To validate the algorithmic output, the authors searched scholarly literature for papers that cite the identified patent as the technical basis of CuInSe₂ PV cells. A Materials Science Forum article explicitly references US 4335266 as the source of the “bilayer” process that enabled the first 10 % efficiency thin‑film cells in the 1980s, confirming the patent’s seminal role.
Applying the same workflow to all nine CPC subclasses, the study produced a table of candidate foundational patents. Five subclasses—CuInSe₂, dye‑sensitized, Group II‑VI, Group III‑V, and organic PV—showed corroborating scholarly evidence, while the remaining four (micro‑crystalline silicon, poly‑crystalline silicon, mono‑crystalline silicon, and amorphous silicon) lacked such validation, possibly due to differing citation practices or a stronger industry‑focused development path.
The authors argue that PCS offers a rapid, systematic “starting point” for technology‑landscape analyses. Traditional expert‑driven searches for landmark patents are time‑consuming and subject to bias; PCS can locate a plausible root patent within minutes, supporting corporate R&D planning, policy‑maker technology road‑mapping, and academic studies of innovation diffusion. However, they acknowledge limitations: the normalization parameters may need refinement for specific domains, and patent citations can be strategically motivated (e.g., defensive citing). Future work should explore domain‑specific weighting, integration with citation‑network main‑path analysis, text‑mining of patent claims, and co‑inventor network metrics to improve robustness. Combining PCS with longitudinal visualizations (e.g., PatViz) could further enable users to trace the evolution of a technology from its identified root through subsequent citation cascades. In sum, PCS demonstrates that systematic citation‑year analysis, when properly normalized, can surface plausible candidates for foundational patents (corroborated here in five of the nine subclasses), offering a useful starting point for navigating the ever‑growing patent corpus.