Prediction of Copper and Manganese Concentrations in Citrus Leaves Based on Easily Measured Soil Characteristics: Comparison of Stepwise Regression Models and Gene Expression Programming

Document Type: Research Article

Authors

1 Soil and Water Research Department, South Kerman Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education and Extension Organization (AREEO), Jiroft, Iran.

2 Soil and Water Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran.

10.22034/sps.2026.70403.1026

Abstract

Background and Objectives
The availability of micronutrients, particularly copper (Cu) and manganese (Mn), is critical for sustainable citrus production. Cu serves as a vital component of enzymes involved in photosynthesis and respiration, while Mn is crucial for enzyme activation and chlorophyll synthesis. Deficiencies in these nutrients can severely compromise plant vitality and yield, posing a significant economic threat to agricultural regions. This challenge is particularly acute in arid areas like southern Kerman, Iran, where the calcareous soils (characterized by high pH and low organic matter content) severely limit the bioavailability of Cu and Mn. Consequently, there is an urgent need to develop accurate predictive methods for assessing the nutritional status of citrus trees to guide precision fertilization strategies. Traditionally, researchers have relied on linear statistical methods, such as Stepwise Regression (SWR), to model the relationship between soil properties and nutrient uptake. However, these models are fundamentally limited by their assumption of linearity, whereas soil-plant systems are inherently non-linear and interactive. This limitation has prompted a shift toward more sophisticated machine learning approaches, such as Gene Expression Programming (GEP). GEP is an evolutionary algorithm that offers a unique advantage by producing explicit, transparent mathematical equations, thereby combining high predictive accuracy with interpretability. This study aims to fill a critical research gap by directly comparing the performance of traditional stepwise regression against GEP for predicting citrus leaf Cu and Mn concentrations under the specific soil conditions of southern Kerman. The central hypothesis was that the non-linear GEP model would significantly outperform its linear counterpart.
 
Materials and Methods
This study was conducted across 40 commercial Valencia orange orchards in southern Kerman, Iran. From each orchard, composite soil samples were collected from two depths (0–30 cm and 31–60 cm), alongside leaf samples from 4- to 6-month-old spring shoots. In the laboratory, leaf Cu and Mn concentrations were quantified using atomic absorption spectrometry. Soil samples were analyzed for key physicochemical properties using standard protocols, including texture, pH, electrical conductivity (EC), organic carbon (OC), total neutralizing value (TNV), and available phosphorus (P). Two distinct modeling approaches were developed and compared: (1) Stepwise Multiple Linear Regression (SWR) was employed to generate linear predictive models, and (2) GEP was implemented using GeneXproTools software to generate non-linear models. To ensure robust model validation, the full dataset (n=40) was randomly partitioned into a training set (70%, 28 samples) and a testing set (30%, 12 samples). The performance of all developed models was rigorously evaluated using the Coefficient of Determination (R2), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE).
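The validation scheme described above (a random 70/30 partition followed by R2, RMSE, and MAE) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and the fixed random seed are assumptions, and R2 is computed here as 1 - SS_res/SS_tot, which may differ from the definition used in the study (for example, a squared Pearson correlation).

```python
import math
import random


def split_70_30(n_samples, seed=0):
    """Randomly partition sample indices into 70% training and 30% testing.

    The seed is an assumption made only so the sketch is reproducible.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(0.7 * n_samples)
    return indices[:n_train], indices[n_train:]


def r2(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error, in the units of the observations."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# With n = 40 orchards, the split gives 28 training and 12 testing samples.
train_idx, test_idx = split_70_30(40)
print(len(train_idx), len(test_idx))
```

With only 12 test samples, the metrics depend noticeably on which orchards fall in the test set, so a different seed can shift the reported values.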
 
Results
Descriptive statistics showed that the soils were predominantly neutral to alkaline (mean pH 7.75). Pearson correlation analysis revealed that citrus leaf Cu concentration was negatively correlated with soil pH (r=-0.41) and positively with OC (r=0.39). Similarly, citrus leaf Mn concentration showed negative correlations with both pH (r=-0.33) and EC (r=-0.29).
The SWR models yielded limited predictive accuracy on the test data. For leaf Cu, the SWR model (predictors: pH, clay, P) resulted in an R2 of 0.36 and an RMSE of 1.30 mg/kg. The model for leaf Mn concentration (predictors: EC, pH) exhibited lower performance, with an R2 of 0.28 and an RMSE of 5.31 mg/kg. In contrast, the GEP models demonstrated markedly better predictive power. The optimal GEP model for Cu (GEP3; inputs: pH, OC, and clay) achieved an R2 of 0.60 and an RMSE of 0.82 mg/kg on the test set. The best model for Mn (GEP2; inputs: pH and EC) yielded an R2 of 0.53 and an RMSE of 3.67 mg/kg. Relative to the SWR models, these results represent a 67% improvement in R2 for Cu and a near-doubling of R2 for Mn (an 89% increase). Furthermore, the prediction error (RMSE) was reduced by 37% for Cu concentration and 31% for Mn concentration.
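The relative improvements quoted above follow directly from the reported test-set metrics. The short check below reproduces the arithmetic; the dictionary layout and function names are illustrative assumptions, while the numbers are those stated in the Results.

```python
# Test-set metrics as reported in the Results (RMSE in mg/kg).
reported = {
    "Cu": {"swr_r2": 0.36, "gep_r2": 0.60, "swr_rmse": 1.30, "gep_rmse": 0.82},
    "Mn": {"swr_r2": 0.28, "gep_r2": 0.53, "swr_rmse": 5.31, "gep_rmse": 3.67},
}


def relative_gain(old, new):
    """Percentage increase of `new` over `old`."""
    return 100.0 * (new - old) / old


def relative_reduction(old, new):
    """Percentage decrease of `new` relative to `old`."""
    return 100.0 * (old - new) / old


summary = {}
for element, m in reported.items():
    r2_gain = round(relative_gain(m["swr_r2"], m["gep_r2"]))
    rmse_cut = round(relative_reduction(m["swr_rmse"], m["gep_rmse"]))
    summary[element] = (r2_gain, rmse_cut)
    print(f"{element}: R2 +{r2_gain}%, RMSE -{rmse_cut}%")

# prints:
# Cu: R2 +67%, RMSE -37%
# Mn: R2 +89%, RMSE -31%
```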
 
Conclusions
The findings of this study demonstrate that Gene Expression Programming (GEP) is a significantly more powerful and accurate tool than conventional stepwise regression for predicting citrus leaf Cu and Mn concentrations in calcareous soils. The results confirm that the relationships between soil properties and nutrient uptake are fundamentally non-linear, and GEP’s capacity to model these complex interactions is the primary driver of its superior performance. The models consistently identified soil pH as a master variable controlling micronutrient availability. Moreover, the GEP model for Cu highlighted the crucial, non-linear role of organic carbon in protecting Cu from precipitation, a nuance that the linear model failed to capture. The practical implications of this research are twofold. First, the explicit mathematical equations generated by the GEP models can be directly embedded into spreadsheet software or mobile applications to serve as a preliminary screening tool. This allows growers to input routine soil test data and receive instant, reliable predictions of potential nutrient deficiencies. Second, on a broader scale, this work provides a methodological framework for other regions, encouraging the adoption of non-linear modeling to develop locally calibrated decision-support systems. Ultimately, this study offers a tangible pathway toward empowering farmers with data-driven tools for enhancing citrus productivity in challenging agricultural environments.
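As a rough illustration of how an explicit model equation might be embedded in a screening tool, the sketch below wraps any equation in a small function that returns a predicted leaf concentration and a status flag. Everything specific here is hypothetical: the placeholder equation is not the GEP3 model (the abstract does not give its form), and the coefficients, the 5.0 mg/kg threshold, and the soil values were chosen only to make the example run.

```python
from typing import Callable, Dict, Mapping


def screen_leaf_nutrient(
    soil_test: Mapping[str, float],
    equation: Callable[[Mapping[str, float]], float],
    sufficiency_threshold: float,
) -> Dict[str, object]:
    """Apply a fitted equation to routine soil-test values and flag low predictions."""
    predicted = equation(soil_test)
    status = "possible deficiency" if predicted < sufficiency_threshold else "adequate"
    return {"predicted_mg_per_kg": predicted, "status": status}


def placeholder_cu_equation(soil: Mapping[str, float]) -> float:
    """PLACEHOLDER ONLY: a made-up stand-in, not the study's GEP equation."""
    return 10.0 - 0.5 * soil["pH"] + 1.0 * soil["OC"]


# Hypothetical soil test (pH, organic carbon in %) and hypothetical threshold.
result = screen_leaf_nutrient(
    {"pH": 8.0, "OC": 0.5},
    placeholder_cu_equation,
    sufficiency_threshold=5.0,
)
print(result)
```

Passing the equation and threshold as arguments keeps the screening logic separate from any particular fitted model, so a locally calibrated equation could be substituted without changing the rest of the tool.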

Author Contributions

S. Heidari and S.A. Ghaffari Nejad conceived and planned the experiments. J. Sarhadi carried out the experiments. S. Heidari and S.A. Ghaffari Nejad analyzed the data. S. Heidari and S.A. Ghaffari Nejad wrote the first draft of the manuscript. All authors contributed to the interpretation of the results. All authors provided critical feedback and helped shape the research, analysis and manuscript.

Data Availability Statement
Data available on request from the authors.

Acknowledgements
The authors would like to thank the research council of the Soil and Water Research Institute, Iran, for the financial support of this research.

Ethical considerations
The authors avoided data fabrication, falsification, plagiarism, and misconduct.

Conflict of interest
The authors declare no conflict of interest.

