Detection of fake papers in the era of artificial intelligence

  • Mehdi Dadkhah, Marilyn H. Oermann, Mihály Hegedüs, Raghu Raman and Lóránt Dénes Dávid
Published/Copyright: August 17, 2023

Abstract

Objectives

Paper mills are companies that write scientific papers, gain acceptance for them, and then sell authorship of those papers. They present a key challenge in medicine and other healthcare fields. This challenge is becoming more acute with artificial intelligence (AI): AI can write the manuscripts, and paper mills then sell the authorships. The aim of the current research is to provide a method for detecting fake papers.

Methods

The method reported in this article uses a machine learning approach to create decision trees to identify fake papers. The data were collected from Web of Science and multiple journals in various fields.
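To make the idea of a decision tree classifier concrete, the following is a minimal sketch in plain Python of an ID3-style tree that splits on information gain. It is not the authors' implementation. The feature names (`journal_indexed`, `references_verifiable`, `author_affiliations_consistent`) and the six training rows are hypothetical and invented purely for illustration; the abstract does not state which features the published trees use.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_feature(rows, labels, features):
    """Return the feature whose split gives the highest information gain."""
    base = entropy(labels)

    def gain(feature):
        remainder = 0.0
        for value in {row[feature] for row in rows}:
            subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return base - remainder

    return max(features, key=gain)


def build_tree(rows, labels, features):
    """Recursively build the tree; a leaf is simply a class label."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    for value in {row[feature] for row in rows}:
        idx = [i for i, row in enumerate(rows) if row[feature] == value]
        branches[value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return {"feature": feature, "branches": branches, "default": majority}


def classify(tree, row):
    """Walk the tree; fall back to the node's majority label for unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree


# Hypothetical features and toy data, invented for illustration only.
FEATURES = ["journal_indexed", "references_verifiable", "author_affiliations_consistent"]
ROWS = [
    dict(zip(FEATURES, values))
    for values in [
        (True, True, True),
        (True, True, False),
        (True, False, True),
        (False, False, False),
        (False, True, True),
        (False, False, True),
    ]
]
LABELS = ["genuine", "genuine", "fake", "fake", "genuine", "fake"]

tree = build_tree(ROWS, LABELS, FEATURES)
suspect = dict(zip(FEATURES, (True, False, True)))
verdict = classify(tree, suspect)
```

On this toy data the tree splits first on `references_verifiable`, because that hypothetical feature separates the two classes completely. A tree learned from real bibliographic data would generally be deeper and would need validation on held-out papers before any verdict is trusted.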

Results

The article presents a method to identify fake papers based on the results of decision trees. Use of this method in a case study indicated its effectiveness in identifying a fake paper.

Conclusions

This method to identify fake papers is applicable for authors, editors, and publishers across fields, whether to investigate a single paper or to analyze a group of manuscripts. Clinicians and others can use this method to evaluate articles they find in a search, confirming that those articles are not fake and instead report actual research that was peer reviewed prior to publication in a journal.


Corresponding author: Mehdi Dadkhah, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India; and Technology Forecasting Department, SnowaTec Technology Center and Innovation Factory, Entekhab Industrial Group, Isfahan, Iran, E-mail:

  1. Research ethics: Not applicable.

  2. Informed consent: Not applicable.

  3. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission. There is no AI generated content in this article.

  4. Competing interests: The authors declare no conflicts of interest.

  5. Research funding: None declared.

  6. Data availability: The raw data can be obtained on request from the corresponding author.

Received: 2023-07-18
Accepted: 2023-08-02
Published Online: 2023-08-17

© 2023 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 27.1.2026 from https://www.degruyterbrill.com/document/doi/10.1515/dx-2023-0090/html