Manually evaluating academic papers has long been a time- and energy-consuming process. In search of a more efficient alternative, Professor Thelwall set out to test whether Large Language Models (LLMs) could deliver fair and swift evaluations.
As part of his research, he input the UK’s 2021 Research Excellence Framework (REF) criteria into ChatGPT and asked the model to simulate an expert review of academic papers. While the model’s initial assessments varied, he found that its performance improved significantly over multiple trials — producing results that increasingly mirrored those of human reviewers.
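To make the procedure concrete, the short Python sketch below illustrates how such an experiment might be scripted: REF-style quality instructions are supplied as a system prompt, a paper's title and abstract are scored several times, and the scores are averaged. This is not Professor Thelwall's actual code; the model name, prompt wording, and helper functions are illustrative assumptions.

```python
# Hypothetical sketch of LLM-based research assessment, loosely following
# the setup described in the talk. Model name, prompts, and helper names
# are assumptions, not the speaker's actual code.
from statistics import mean
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Abbreviated REF-style instructions; the real REF 2021 criteria are much longer.
REF_PROMPT = (
    "You are an expert REF 2021 assessor. Rate the following paper's "
    "originality, significance and rigour on the REF scale from 1* to 4*. "
    "Reply with a single number between 1 and 4."
)

def score_once(title: str, abstract: str, model: str = "gpt-4o-mini") -> float:
    """Ask the model for one REF-style quality score."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": REF_PROMPT},
            {"role": "user", "content": f"Title: {title}\n\nAbstract: {abstract}"},
        ],
    )
    # Assumes the model complies with the single-number instruction.
    return float(response.choices[0].message.content.strip())

def score_paper(title: str, abstract: str, trials: int = 5) -> float:
    """Average several independent scores, since any single run can vary."""
    return mean(score_once(title, abstract) for _ in range(trials))

if __name__ == "__main__":
    print(score_paper("Do Squirrel Surgeons Receive More Citations?",
                      "We examine citation rates of papers authored by squirrels."))
```

Averaging over several runs is the key step here: individual LLM judgements fluctuate, but their mean tends to be more stable and, as the talk noted, closer to human reviewers' assessments.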
To demonstrate both the capabilities and the limitations of large language models, Professor Thelwall shared a humorous yet insightful experiment: he submitted a fake paper titled “Do Squirrel Surgeons Receive More Citations?” to ChatGPT.
The model not only gave the fake paper a glowing review but also earnestly analyzed the impact of the author's species. However, it gave a clear-cut “no” when asked whether squirrels could write academic papers, showing that it retained basic common sense.
He concluded his talk with a note of caution: LLMs may tempt researchers to write for algorithms rather than for people, uploading papers to such platforms could raise copyright issues, and, most importantly, AI-generated evaluations should never replace formal peer review.
The session closed with a lively Q&A, where students and faculty engaged in thoughtful discussion. To mark the occasion, Associate Professor Liu Chang presented Professor Thelwall with a commemorative gift on behalf of the Department of Information Management.