History of Peer Review: From the Royal Society to Modern Science

Peer review is so deeply embedded in contemporary science that it can feel like a permanent feature of the intellectual landscape — as fixed as the periodic table or the double helix. It isn't. The process has a specific history, a traceable origin, and a development shaped by institutional politics, technological change, and epistemological debate. Understanding where peer review came from clarifies what it actually does, why it is structured the way it is, and why it remains contested despite its near-universal adoption.


The Royal Society and the Birth of Formal Review

The Philosophical Transactions of the Royal Society, first published in 1665 under the editorship of Henry Oldenburg, is the oldest continuously published scientific journal in the world. Its founding is the conventional starting point for discussions of peer review, but the attribution requires some precision. Oldenburg did circulate manuscripts among members of the Royal Society for comment, and the Society formally took ownership of the journal in 1752, at which point a committee-based review process was established. That 1752 moment is closer to what modern scientists would recognize as institutional peer review.

The Royal Society's royal charter, granted by King Charles II in 1662 (the Society itself had been founded in 1660), established its authority to publish and disseminate natural knowledge. The Society continues to operate under this mission and publishes detailed guidelines on its editorial standards through the Royal Society Publishing platform. The early review process was less about gatekeeping statistical methodology than about social credibility — who vouched for a claim mattered enormously in an era when experimental science was still establishing its legitimacy against theological and philosophical traditions.


The Nineteenth Century: Specialization and the Rise of Disciplinary Journals

Through the 1700s and into the 1800s, most scientific communication happened through correspondence, letters to societies, and broad-scope periodicals. As scientific disciplines differentiated — chemistry from natural philosophy, biology from natural history — specialized journals multiplied. The Lancet, founded in 1823, and the British Medical Journal, launched in 1840 as the Provincial Medical and Surgical Journal, brought emerging review practices into medicine. The American Journal of Science, founded in 1818, performed similar work in the United States.

Crucially, nineteenth-century review was still largely editorial rather than systematic. An editor with appropriate expertise would judge submissions, sometimes consulting colleagues informally. The idea that multiple independent referees should evaluate a manuscript according to explicit criteria — blind to the author's identity — had not yet crystallized. What existed was a precursor: a culture of expert judgment without standardized procedure.


The Twentieth Century: Institutionalization and the Double-Blind Turn

The modern architecture of peer review consolidated during the mid-twentieth century, driven by two forces: the dramatic expansion of scientific output following World War II and the increasing professionalization of academic publishing.

The National Science Foundation, established by the National Science Foundation Act of 1950 (42 U.S.C. § 1861 et seq.), embedded peer review into federal grant-making from its inception. The NIH similarly formalized peer review through the Public Health Service Act, now codified at 42 U.S.C. § 289a, which mandates expert review of competing grant applications. These legislative anchors transformed peer review from a journal-level practice into a regulatory requirement with statutory force.

During this same period, journals began formalizing referee processes. Nature introduced systematic external refereeing in the 1970s. The concept of double-blind review — in which neither author nor reviewer knows the other's identity — gained traction through debate in the 1970s and 1980s, though adoption remained uneven. The American Psychological Association's Publication Manual formalized manuscript review standards for psychology and social science. The Council of Science Editors (CSE), founded in 1957 as the Conference of Biology Editors, developed guidelines that eventually became authoritative across life sciences publishing.

For a detailed breakdown of how the contemporary review process functions mechanically, see How It Works.


The Digital Era: Scale, Speed, and Structural Stress

The internet did not change what peer review is supposed to do. It changed the scale at which it operates and exposed structural weaknesses that slower publication cycles had partially concealed.

By 2023, the Directory of Open Access Journals (DOAJ) indexed over 20,000 peer-reviewed open-access journals. CrossMark, managed by Crossref, provides persistent metadata on publication corrections and retractions across thousands of journals. The sheer volume of submissions — estimated at several million manuscripts annually across all disciplines — has created a chronic reviewer shortage. Some journals report that reviewer acceptance rates have fallen below 50 percent, meaning editors may need to approach four, six, or more potential reviewers to secure two usable reports.
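The relationship between acceptance rate and editorial workload can be sketched with back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not data from any particular journal; the model simply assumes each invited reviewer accepts independently with a fixed probability.

```python
import math

def invitations_needed(reports_needed: int, acceptance_rate: float) -> int:
    """Rough estimate of reviewer invitations an editor must send,
    assuming each invitee accepts independently with probability
    acceptance_rate. Purely illustrative."""
    if not 0 < acceptance_rate <= 1:
        raise ValueError("acceptance_rate must be in (0, 1]")
    # Expected invitations = reports needed / acceptance rate,
    # rounded up since invitations are whole numbers.
    return math.ceil(reports_needed / acceptance_rate)

# At a 50% acceptance rate, two usable reports take about 4 invitations;
# at 30%, the same two reports take about 7.
print(invitations_needed(2, 0.5))  # → 4
print(invitations_needed(2, 0.3))  # → 7
```

The point of the sketch is only that workload grows inversely with the acceptance rate: a modest decline in reviewer willingness roughly doubles the number of invitations an editor must manage per manuscript.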

Preprint servers complicated the picture further. arXiv, launched in 1991, allowed physicists to share manuscripts before peer review; bioRxiv and medRxiv extended this practice to biology and medicine. During the COVID-19 pandemic, thousands of preprints circulated without peer review and were cited in policy documents and media reports. This was not a failure of peer review per se — preprints are explicitly not peer-reviewed — but it forced public attention onto the difference between peer-reviewed and non-peer-reviewed claims in a way that prior decades had not. Understanding the full scope of what peer review does and doesn't cover matters more now than it did in 1965.


Persistent Criticisms and Reform Efforts

Peer review has attracted sustained criticism from within science itself. The central objections are not new, but they have become sharper as empirical evidence on review outcomes has accumulated.

Reproducibility. A 2015 study published in Science by the Open Science Collaboration attempted to replicate 100 published psychology studies and found that only 39 produced results consistent with the original findings. Peer review had passed all 100 originals. Replication failure does not mean peer review is worthless; it means peer review was never designed to verify experimental results — only to evaluate methodological plausibility and logical coherence before publication.

Bias. Studies have documented gender bias, institutional prestige bias, and citation-network bias in referee recommendations. A frequently cited 1982 study by Peters and Ceci resubmitted previously published articles from prestigious institutions to the same journals under fictitious names from low-status institutions; a majority were rejected. The Committee on Publication Ethics (COPE), founded in 1997, has developed guidelines and case studies addressing conflicts of interest, reviewer misconduct, and editorial bias. COPE membership now exceeds 12,000 journals and institutions.

Transparency. Open peer review — in which referee reports are published alongside accepted articles — is now practiced by a growing number of journals, including those published by PLOS and by eLife (which restructured its entire publication model in 2022 to eliminate accept/reject decisions in favor of publishing all reviewed preprints with reviewer reports). The outcomes of these experiments are still being evaluated.

For questions about how peer review applies to specific situations and what qualifies as adequate review, the Peer Review FAQ addresses the most common practical queries.


What the History Actually Tells Us

Peer review was not designed by a committee with a complete theory of scientific validation. It grew incrementally from social practices in seventeenth-century learned societies, was formalized by professional organizations and government mandate in the mid-twentieth century, and is now under active revision as the infrastructure of science changes faster than any single institution can adapt.

Three things remain stable across all of this history. First, expert judgment is irreplaceable — no algorithm has successfully substituted for domain-specific knowledge in evaluating whether a claim follows from evidence. Second, peer review is not a truth guarantee; it is a quality filter, and the distinction matters enormously when interpreting published findings. Third, the process is a social institution, subject to the same incentive structures and power dynamics as any other.

