USE OF MACHINE LEARNING ALGORITHMS AND IN SITU DATA FOR ESTIMATING PARTICULATE ORGANIC CARBON FROM THE MEDITERRANEAN SEA

S. FELLOUS, A. BENDJAMA, Y. BENZAOUI

Abstract


Water quality indicators, including biological, chemical and physical properties, are usually determined by collecting field samples and analyzing them in the laboratory. These in situ measurements offer high accuracy but are costly and time-consuming. This study estimates particulate organic carbon (POC), a water quality parameter, by combining machine learning algorithms with hyperspectral in situ data. The approach is data-driven and requires no domain knowledge. The POC considered is that generated by bacteria, phytoplankton, zooplankton, detritus and sediments in the Mediterranean Sea over the period 15 May to 10 June 2017. The objective was to apply five machine learning regression frameworks to estimate POC from hyperspectral in situ data and to evaluate their performance. Based on the coefficient of determination (R²), the best-performing models were k-nearest neighbors (KNN), gradient boosting (GB) and random forest (RF), with R² ranging from 72.33% to 74.7%. These results reveal the potential of the approach, and the same machine learning models can be used to investigate other water quality parameters.
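As an illustration of the evaluation described above, the following minimal sketch fits a k-nearest-neighbors regressor and scores it with the coefficient of determination, R² = 1 − SS_res / SS_tot. It is not the authors' pipeline. The data are synthetic, and the five "bands", the target formula, the train/test split and the value of k are all assumptions made only for the example. The abstract reports R² as a percentage, so a score of 0.747 here would correspond to 74.7%.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_X, train_y, x, k=3):
    """Predict by averaging the targets of the k nearest training samples."""
    nearest = sorted(
        range(len(train_X)), key=lambda i: math.dist(train_X[i], x)
    )[:k]
    return sum(train_y[i] for i in nearest) / k


# Synthetic stand-in for hyperspectral samples (assumption): 5 hypothetical
# bands per sample, with a made-up target driven by two bands plus noise.
rng = random.Random(0)
X = [[rng.random() for _ in range(5)] for _ in range(200)]
y = [50 + 120 * row[1] - 40 * row[3] + rng.gauss(0, 5) for row in X]

# Hold out the last 50 samples for evaluation (assumed split).
split = 150
train_X, test_X = X[:split], X[split:]
train_y, test_y = y[:split], y[split:]

pred = [knn_predict(train_X, train_y, x, k=5) for x in test_X]
score = r2_score(test_y, pred)
print(f"KNN R^2 on held-out samples: {score:.3f}")
```

The same `r2_score` function can be applied to the predictions of any other regressor, which is how several models would be ranked against one another on a common held-out set.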


Keywords


POC, Machine learning, In situ measurement, Phytoplankton, Hyperspectral






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.