What’s in a Citation? Motivation and Classification of Citing References

To put the question “What’s in a citation?” in historical context, I’ll start with the legendary Eugene Garfield. In 1955, he set out to “propose a bibliographic system for science literature that can eliminate the uncritical citation of fraudulent, incomplete, or obsolete data by making it possible for the conscientious scholar to be aware of criticisms of earlier papers” (“Citation Indexes for Science: A New Dimension in Documentation Through the Association of Ideas,” Science, New Series, v. 122, no. 3159, Jul. 15, 1955: pp. 108–111).

But Garfield and others soon realized that tracing cited references from article to article could be leveraged for other purposes. In 1979, Garfield, in writing a defense of using citation counts for the evaluation of researchers, refutes three of the commonly held reasons for concern at that time—self-citations, negative/critical citations, and cites related to methodological issues. In it, he discusses problems with the Science Citation Index and Social Science Citation Index and gives counterarguments about why the issues raised do not significantly affect the measurement of a researcher’s contribution to scientific knowledge: “We know that citation rates say something about the contribution made by an individual’s work, at least in terms of the utility and interest the rest of the scientific community finds in it” (“Is Citation Analysis a Legitimate Evaluation Tool?” Scientometrics, v. 1, no. 4, 1979: pp. 359–375; garfield.library.upenn.edu/papers/scientometricsv1(4)p359y1979.pdf).

Garfield goes so far as to advocate for a metric that never gained much traction: the lifetime-citation count per paper. Developed by Geller, et al. in “Lifetime-Citation Rates to Compare Scientists’ Work,” and in press at the time of Garfield’s missive (published subsequently in Social Science Research, v. 7, no. 4, 1978: pp. 345–365), it has not, to my knowledge, become popular. (Part of this may have to do with the 40-year time span required to fully assess a researcher’s oeuvre using this metric. Research evaluation usually relies on a quick turnaround, and a new researcher’s early citation count is sometimes the indicator by which said researcher demonstrates the promise of future success. Research evaluators such as funders, labs, and other institutions do not have 40 years to wait for an answer about a researcher’s record of ostensible influence/impact.)

Still, Garfield agreed that as an evaluative measure, citation counts are not “completely definitive,” stating: “They very definitely are an interpretive tool that calls for thoughtful and subtle judgements on the part of those who employ them.” Also, in his conclusion Garfield notes, “[T]here is much about the meaning of citation rates that we do not know.”

WHY RESEARCH IS CITED

Thinking about cited references for even a moment, it stands to reason that a researcher’s impetus to cite a given work varies. There have been many peer-reviewed articles (not to mention a swath of opinion pieces) that consider possible justifications for citing sources in research articles. Conveniently, Dongqing Lyu, et al. published a meta-synthesis in 2021 that aggregates and classifies these various documented motivations (Lyu, D., Ruan, X., Xie, J., & Cheng, Y. “The Classification of Citing Motivations: A Meta-Synthesis,” Scientometrics, v. 126, no. 4, 2021: pp. 3243–3264; paywalled at doi.org/10.1007/s11192-021-03908-z). They started with 1,771 studies, of which 38 passed the criteria review and critical appraisal processes. Get this: Their analysis of the 38 rigorous research articles found 35 expressions of motivation for citing previous research. They boiled these down to 13 “themes.”

Really, though, when they got down to brass tacks, they found two basic motivations. The first would be “Scientific” reasons. The themes that count as scientific are likely what you would expect: “Background, Gap, Basis, Comparison, and Application.” The second, called “Tactical,” is defined essentially as non-scientific and includes themes they dub “Subjective Norm, Advertising, and Profit-seeking.” Another way of putting it: Tactical motivations tend to be more socially related, whereas Scientific motivations are more rhetorically related.

ENTER MACHINE LEARNING

In the 21st century, the ease of crunching large amounts of citation data, the development of Digital Object Identifiers (DOIs) to pull research articles more readily, and the rise of machine learning have afforded scientometricians and citation data providers new options for traditional bibliometric analyses. The development of Clarivate’s Category-Normalized Citation Impact (CNCI) and derivatives Journal Citation Indicator (JCI) and Collaborative CNCI, for example, has created a component of predictive analytics in that they measure actual citation counts (or mean citation counts) over a baseline or predicted citation counts for a given subject category.
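To make the "actual over baseline" idea concrete, here is a minimal sketch in Python. It is not Clarivate's implementation, and the paper counts and baselines are invented for illustration. The function name `normalized_citation_impact` is hypothetical. The sketch assumes each paper comes with an expected citation count for its subject category, and it simply divides actual citations by that figure and averages the ratios.

```python
from statistics import mean


def normalized_citation_impact(actual_citations, baseline_citations):
    """Return actual citations divided by the expected (baseline) count.

    A value above 1.0 means the paper is cited more than the baseline
    for its category; below 1.0 means less.
    """
    if baseline_citations <= 0:
        raise ValueError("baseline_citations must be positive")
    return actual_citations / baseline_citations


# Hypothetical data: (actual citations, expected citations for the category).
papers = [(12, 8.0), (3, 6.0), (20, 10.0)]

# One ratio per paper, then the mean across the researcher's papers.
scores = [normalized_citation_impact(actual, baseline) for actual, baseline in papers]
portfolio_score = mean(scores)

print(scores)
print(round(portfolio_score, 2))
```

The transparency question is visible even in this toy version: the result depends entirely on where the baseline numbers come from, and that is the part an outside reader cannot easily inspect.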

I have expressed concerns about the transparency of this type of calculation previously (see my Sept./Oct. 2021 column and the one from May/June 2022). At the same time, I see the value and the opportunities machine learning and analytics offer. We can glean new information about all kinds of bibliometric behavior and patterns. Such activity helps scientometricians and bibliometricians derive new and hopefully interesting insights through new methods of citation analysis.

One particularly interesting tool using machine learning and text mining goes a step further than Clarivate’s new indicators. scite (scite.ai) works by automating the classification of cited references into four categories: supporting, mentioning, contrasting, and not determined. When I first became aware of scite, I downloaded a browser extension that pops up a little window with the counts for these classes of citations every time you view a paper that is indexed by the tool. Originally, if you had a login, you could get a modicum of detailed information about all of the citations picked up by scite, but now you can only see a few of the snippets where the work was cited. The meaty aspects of the tool are behind a paywall.

However, faculty and students at my institution can obtain scite at a discount. So what the heck? I ponied up a nominal payment for a year’s subscription so I could see what scite is up to, and I was impressed. I reached out to scite to get some background and ask some questions about the tool. Josh Nicholson, CEO and co-founder of scite (linkedin.com/in/joshua-nicholson), indicated to me, first and foremost, that they are actively working with institutions to make scite available without having to go the individual subscription route.

Nicholson’s background is as a cellular biologist. The genesis of scite stemmed from a concern that he and his partners had about challenges in research settings, particularly the reproducibility crisis and confirming the validity of studies. They saw an opportunity to look at these issues in a different way—through deep learning and text mining.

To develop scite, the team set about to manually classify cited references in a set of publications. They then created an algorithm to attempt to replicate the manual classification. The current iteration still has a test set against which tweaks to the algorithm are applied. Since development, scite has leveraged more than 24 publisher agreements to go behind paywalls to pull the snippets and do the text analysis. As of March 2022, scite claims to have more than a billion “smart citations” in its dataset (scite.ai/blog/the-next-generation-of-citations-arrives-as-scite-crosses-one-billion-smart-citations). scite is a small operation but is aiming big, actively contrasting its approach to that of the likes of well-funded citation database vendors such as Clarivate, Digital Science, and Elsevier.

Automated classification of text statements is not a new thing. I first became aware of the practice in 2012, when David Milward from the U.K.-based firm Linguamatics spoke at the SLA annual conference. Linguamatics performed a natural language processing (NLP) analysis of tweets and correctly predicted the election of U.K. Prime Minister David Cameron in 2010 (linguamatics.com/blog/trend-analysis-%E2%80%93-can-prediction-be-made). This led me to wonder why someone hadn’t done this type of analysis with cited references on a large scale sooner.

Nicholson explained that research papers have a much more complicated sentence structure and syntax than popular sources such as news or social media. You may be thinking, “Duh”; however, consider how NLP algorithms must then be more complicated to generate accurate results. “Sentence segmentation” is the key issue … making sure the tool “reads” the right part of a sentence to properly classify the citation. Lyu, et al.’s article reinforces this statement. They posit that while their schema can be used to automate classification of citing references, the tactical motivations are “not easily identified through text parsing.”

How well does scite classify references that cite a given paper? To be honest, I struggled to find an article with a significant number of “contrasting” citing references. I settled on Loftus and Pickrell’s seminal and somewhat controversial work: “The Formation of False Memories” (Psychiatric Annals, v. 25, no. 12, 1995: pp. 720–725). scite pulled 542 citation statements from its data garnered via publisher agreements. Of those, 508 mention the article neutrally, 15 support the paper, four contrast, and 15 were unclassifiable.
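For a sense of proportion, the tallies reported for the Loftus and Pickrell paper can be turned into percentages. This is a small Python sketch using only the counts given above; the labels follow scite's four categories, and the variable names are my own.

```python
# Citation statement counts reported above for "The Formation of False Memories".
counts = {
    "mentioning": 508,
    "supporting": 15,
    "contrasting": 4,
    "not determined": 15,
}

# The four categories should add up to the 542 statements scite pulled.
total = sum(counts.values())

# Share of each category, as a percentage rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {total} ({pct}%)")
```

Put that way, the neutral “mentioning” class accounts for roughly 94% of the statements, and contrasting citations come to well under 1%.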

I looked at a few smart citation snippets in each of these categories. Many of the neutral citations are included in papers that actually build on the paper’s findings or adapt its methods. Citations categorized as supportive tended to be the most accurately categorized, and the four contrasting citing references were … wishy-washy at best; they were very close in sentiment to those in the neutral “mention” category. All 15 of the unclassifiable citing references were in languages other than English.

There is an imperial-sized load of other cool features in scite, yet corporate practices definitely bear a modicum of scrutiny. scite’s publisher agreements and datasets are proprietary, meaning we can’t tell what content is, and is not, included in the results. Librarians and info pros have long howled about the lack of publication lists used by citation indexes. (In some cases, such as with Web of Science, formerly non-public lists are now available.) Some, but not all, of scite’s code is on GitHub (github.com/scitedotai).

Nicholson explains that his company needs capital to keep working and improving, since he has some extremely well-funded competitors. For now, my point is that scite sheds light on the types of citing references, and in looking at the snippets, there are perhaps some hints at the citing researchers’ motivation for including the reference. This is a very different purpose than the evaluation of researchers, a task for which transparency is paramount.

CONTEXTUALIZING CITED REFERENCES

scite may not be perfect, but it is a start at contextualizing cited references. Circling back to the 1955 article from Garfield, I’m not sure we are truly able to “eliminate the uncritical citation of fraudulent, incomplete, or obsolete data by making it possible for the conscientious scholar to be aware of criticisms of earlier papers.” For one thing, if we were, the Retraction Watch website (retractionwatch.com) would not exist. Also, I don’t think I am going too far out on a limb when I allege researchers are likely to be cautious about stepping on their colleagues’ toes, at least when it comes to writing something that will undergo peer review.

Nicholson indicated to me an unforeseen market for scite: students. Using scite, students are better able to research their own essays and papers by seeing how others have cited an assigned paper or one they’ve found from their own searching. Back in my library school days, we called this type of search “pearl growing,” taking one very on-point article and obtaining others related to it through identifying cited and citing references. scite makes pearl growing even easier. Students don’t have to pore through reference lists or citation indexes and then pull the papers. Instead, they can get snippets through scite, thereby speeding up the filtering process. (One hopes that after filtering, the student pulls the relevant papers and does not just rely on a snippet.)

As info pros, we can use our expertise to learn, test, and understand the nuances of tools like scite and then inform our users of the tools’ limitations. Sometimes, we can follow up on the snippets produced in a tool like scite and determine additional insights for our customers/users. Of course, we can advise on how to use citation data in general, and when it is—and is not—appropriate for research evaluation. Our top-notch information literacy skills are certainly flexed when we meet user needs with regard to impact indicators, contextualizing citing references and explaining the nuances of new tools and resources.

In my experience, our knowledgebase in this area demonstrates the truly meaningful value of librarians and info pros in all kinds of settings where research and development can be found. Staying current on citation analytics tools is vital to preserving and increasing this value.