World Aquaculture 2023

May 29 - June 1, 2023

Darwin, Northern Territory, Australia

MODELLING HIGH SURVIVAL OUTCOMES IN VANNAMEI PRAWN FARMING USING DATA FROM CENTRAL PHILIPPINES

Neil Arvin Bretaña*, Maria Rosario Marin Marmol Kilrain, Vishwa Vijaysheel, Masum Patel, Yee Ting Chung, Mary Ann Solis, Neslly Marie Bretaña

Neil.Bretana@unisa.edu.au

University of South Australia, Adelaide, Australia


The contribution of aquaculture to global aquatic food production is expected to increase 5-fold by 2030. This rapid growth has created a need to manage resources efficiently to maximize productivity. However, climate change and emerging diseases contribute to the challenges of aquatic food farming. In particular, these factors greatly affect the survival of cultured stock, which can reduce yield, creating shortages in food supply and economic losses for aquaculture farmers. It is therefore essential to explore and leverage innovative technologies.

We analysed industry data consisting of 3 harvest cycles from 9 P. vannamei shrimp ponds in the Philippines. The dataset includes daily physico-chemical properties, feed and supplement data, and water management input data. The survival rate, which indicates the health outcome of the collective shrimp culture at the end of each harvest cycle, was taken as the target variable. The survival rate threshold was set to 100%, and predicting a high survival outcome was treated as a binary classification problem. We labelled the 870 data points resulting in below 100% survival as class 1 and the remaining 174 data points resulting in 100% and above as class 0. We trained 3 models on different configurations of the variables: all variables, only physico-chemical properties, and only feed variables. The models were trained with the XGBoost algorithm in Python. Variables were selected based on XGBoost feature importance, and redundant variables sharing a high correlation with another variable were removed. The Synthetic Minority Over-sampling Technique (SMOTE) was applied to balance the classes by generating synthetic data points. Parameters were tuned to reduce model overfitting, and cross-validation was applied to test the models. We compared the performance of each variable configuration based on accuracy, sensitivity, and specificity.
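The class-balancing step can be illustrated with a minimal sketch of the interpolation idea behind SMOTE, written in plain Python. It is not the authors' implementation: the function name `smote_like_oversample`, the two made-up features, the neighbour count `k=5`, and the random seeds are all assumptions for illustration. Only the class counts (870 and 174) come from the text.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (the core SMOTE idea).
    Hypothetical helper for illustration; requires at least 2 minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority points
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Made-up two-feature records standing in for pond measurements.
# Only the class sizes mirror the abstract: 870 in class 1, 174 in class 0.
rng = random.Random(42)
class_1 = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(870)]
class_0 = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(174)]

# Oversample the smaller class until both classes have the same size.
synthetic_0 = smote_like_oversample(class_0, n_new=len(class_1) - len(class_0))
balanced_0 = class_0 + synthetic_0

print(len(class_1), len(balanced_0))  # 870 870
```

In practice a library implementation, such as the `SMOTE` class in imbalanced-learn, would typically be used before fitting an XGBoost classifier. The abstract does not state which implementation was used.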

Results show that using all variables gives the best model performance, with a prediction accuracy of up to 85.17%. Using only physico-chemical variables reduces model accuracy to 67.94%, while using only feed variables brings it down further to 63.16%. These results indicate the importance of thorough and accurate data collection practices in farm settings. The study also indicates the potential for predicting survival and yield in a farm culture.
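The three metrics used to compare the models can be computed from actual and predicted labels, as in the following sketch. The function name `classification_metrics` and the six example labels are hypothetical. Treating class 1 (below-threshold survival) as the positive class is also an assumption, since the abstract does not say which class sensitivity refers to.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (true positive rate) and specificity
    (true negative rate) for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics["sensitivity"], metrics["specificity"])  # 0.75 0.5
```

Reporting sensitivity and specificity alongside accuracy shows how well each class is predicted separately, which matters when one class is much larger than the other.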