Data-based prediction of microbial contamination in herbs and identification of optimal harvest parameters

Stefan Anlauf; Andreas Haghofer; Karl Dirnberger; Stephan Winkler

doi:10.1515/ijfe-2021-0027

Published/Copyright: August 9, 2021

From the journal International Journal of Food Engineering Volume 18 Issue 3

Abstract

The quality of freshly harvested herbs is affected by several crucial factors, e.g., weather, tillage, fertilization, drying, and the harvesting process. Our main goal is to learn models that can predict spore contamination in different types of herbs from information about the harvesting process, transport conditions, drying, and storage conditions. This should enable us to identify optimal processing parameters, which will allow more effective and cost-efficient contamination prevention. Using machine learning, we generated ensembles of models that predict the risk of spore contamination on the basis of harvest processing parameters. The training information about contamination in herbs comes from laboratory analysis results. We applied different modeling algorithms (random forests, gradient boosting trees, genetic programming, and neural networks). In this paper we report modeling results for yeast and mold contamination in peppermint and nettle; e.g., for yeast contamination in peppermint we obtained models with 78.13% accuracy. Additionally, we use descriptive statistics to identify those parameters that have a statistically significant influence on contamination; for example, our analysis indicates a relationship between mold in peppermint and information about harrowing and the growth height (p = 0.001).
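To make the idea of an ensemble classifier evaluated by accuracy concrete, the following is a minimal sketch and not the authors' pipeline. It trains a bagged ensemble of decision stumps with majority voting, a deliberately simplified stand-in for a random forest, using only the Python standard library. The data are synthetic, and the feature names (`growth_height_cm`, `harrowed`), the labeling rule, and the noise level are all assumptions made purely for illustration.

```python
import random

def make_synthetic_samples(n, seed=0):
    """Generate hypothetical (features, label) pairs; NOT the paper's data."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        growth_height_cm = rng.uniform(10.0, 60.0)  # hypothetical feature
        harrowed = rng.choice([0.0, 1.0])           # hypothetical feature
        # Assumed toy rule: short plants are "contaminated", with 10% label noise.
        contaminated = int(growth_height_cm < 30.0)
        if rng.random() < 0.1:
            contaminated = 1 - contaminated
        samples.append(((growth_height_cm, harrowed), contaminated))
    return samples

def fit_stump(samples):
    """Return the (feature, threshold, label_below) rule with fewest training errors."""
    best = None
    for f in range(len(samples[0][0])):
        for threshold in sorted({x[f] for x, _ in samples}):
            for label_below in (0, 1):
                errors = sum(
                    (label_below if x[f] <= threshold else 1 - label_below) != y
                    for x, y in samples
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, label_below)
    return best[1:]

def predict_stump(stump, x):
    f, threshold, label_below = stump
    return label_below if x[f] <= threshold else 1 - label_below

def fit_bagged_ensemble(samples, n_models=25, seed=1):
    """Fit each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [
        fit_stump([rng.choice(samples) for _ in samples])
        for _ in range(n_models)
    ]

def predict_ensemble(ensemble, x):
    """Majority vote over all stumps; returns 1 for 'contaminated'."""
    votes = sum(predict_stump(stump, x) for stump in ensemble)
    return int(2 * votes > len(ensemble))

def accuracy(ensemble, samples):
    hits = sum(predict_ensemble(ensemble, x) == y for x, y in samples)
    return hits / len(samples)

data = make_synthetic_samples(200)
train, test = data[:120], data[120:]  # simple hold-out split
ensemble = fit_bagged_ensemble(train)
test_accuracy = accuracy(ensemble, test)
print(f"held-out accuracy: {test_accuracy:.2%}")
```

The printed accuracy reflects only the synthetic rule above and says nothing about real herb data. In practice one would likely use an established library implementation together with a proper cross-validation scheme instead of a single hold-out split.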

Keywords: data science; herbs; machine learning; microbial contamination; statistics

Corresponding author: Stefan Anlauf, FFoQSI GmbH, Technopark 1C, 3430 Tulln, Austria; and University of Applied Sciences Upper Austria, Bioinformatics, Softwarepark 11, 4232 Hagenberg, Austria, E-mail: stefan.anlauf@ffoqsi.at

Stefan Anlauf and Andreas Haghofer contributed equally to the paper.

Acknowledgment

The work presented in this article was sponsored by FFoQSI, the Austrian Competence Centre for Feed and Food Quality, Safety and Innovation.

Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: None declared.
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Received: 2021-01-29

Accepted: 2021-07-10

Published Online: 2021-08-09
