
Interpreting the relationship between properties of wood and pulping & paper via machine learning algorithms combined with SHAP analysis

  • Xing Liu, Jie Hong, Mingming Zhang and Liang Zhou
Published/Copyright: January 3, 2025

Abstract

Pulping performance and paper quality rely heavily on wood properties, but the relationship between them is complex. Using digital information extracted from the anatomical, chemical, and physical properties of poplar wood, predictive models were developed for paper properties (tensile index, burst index, and tear index) and pulping properties (Kappa number and pulp yield) with six algorithms: PLSR, ENR, RF, XGBoost, LightGBM, and CatBoost. For PLSR, ENR, and RF, most prediction results exceeded 0.79. XGBoost, LightGBM, and CatBoost performed better, with results greater than 0.9 for every property except the tear index. SHAP analysis further suggested that cellulose content is the primary factor governing pulping ability and that the morphological features of the cell wall have a clear effect on the mechanical properties of paper. These results are expected to help evaluate the value of poplar wood from different sources and to guide genetic breeding programs and the forest management of poplar plantations.
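To make the attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values for a single sample, written in plain Python. It is not the authors' pipeline: the toy model `toy_pulp_yield`, its coefficients, the three feature names, and the sample and baseline values are all hypothetical and chosen only for illustration. The sketch also replaces "absent" features with a single baseline value, which is a simplification of how SHAP tooling typically averages over a background dataset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    Features outside a coalition take their baseline value
    (a simplifying assumption made for this sketch).
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Model output when only the features in `coalition` come from x.
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_pulp_yield(z):
    """Hypothetical stand-in for a fitted model; coefficients are made up."""
    cellulose, lignin, density = z
    return 20.0 + 0.6 * cellulose - 0.3 * lignin + 10.0 * density + 0.5 * cellulose * density


# Hypothetical feature values: cellulose (%), lignin (%), density (g/cm^3).
feature_names = ["cellulose", "lignin", "density"]
sample = [48.0, 22.0, 0.40]
baseline = [45.0, 24.0, 0.35]

phi = shapley_values(toy_pulp_yield, sample, baseline)
for name, contribution in zip(feature_names, phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets each feature's share of a prediction be read off directly. Exact enumeration costs on the order of 2^n model evaluations per feature, so it is practical only for a handful of features; for tree ensembles such as XGBoost, LightGBM, and CatBoost, faster tree-specific algorithms are normally used instead.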


Corresponding author: Liang Zhou, Key Laboratory of National Forestry and Grassland Administration “Wood Quality Improvement & Efficient Utilization”, School of Materials and Chemistry, Anhui Agricultural University, Hefei 230036, China, E-mail:
Xing Liu is the first author.

Award Identifier / Grant number: 32371802

Acknowledgments

The authors thank the Key Laboratory of National Forestry and Grassland Administration “Wood Quality Improvement & Efficient Utilization” of Anhui Agricultural University for providing the poplar samples.

  1. Research ethics: Not applicable.

  2. Informed consent: All authors are aware of and agree to the publication of this manuscript.

  3. Author contributions: Xing Liu and Liang Zhou conceived this research and conducted related experiments. Jie Hong and Mingming Zhang participated in the analysis of the experiment and the discussion of data. Xing Liu wrote this manuscript. All the authors read and approved the final version of the manuscript.

  4. Use of Large Language Models, AI and Machine Learning Tools: Not applicable.

  5. Conflict of interest: The authors declare no conflict of interest.

  6. Research funding: This work is supported by a grant from the Natural Science Foundation of China (Grant number: 32371802).

  7. Data availability: The raw data can be obtained on request from the corresponding author.


Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/npprj-2024-0066).


Received: 2024-09-10
Accepted: 2024-12-11
Published Online: 2025-01-03
Published in Print: 2025-03-26

© 2024 Walter de Gruyter GmbH, Berlin/Boston
