Background: Genomic selection (GS) is widely used in the breeding of aquaculture species and has greatly improved breeding efficiency. Various GS models have been applied in aquaculture research, including GBLUP, BayesA, BayesB, BayesCπ, Bayes Lasso, and several machine learning models. Choosing the optimal combination of model and genetic variants is key to accurate prediction in breeding.
Materials and methods: In this study, the main GS models were evaluated for two traits of large yellow croaker (Larimichthys crocea) in combination with different SNP datasets. A total of 534 samples were used as the reference population for the body weight trait, and 522 samples were used as the reference population for the muscle elasticity trait. In addition, 716 samples with genotype data only were used as the candidate population. All genotype data were obtained from two sources: SNP data from probe hybridization and mSNP data from targeted sequencing.
Results: Principal component analyses of all reference and candidate populations indicated that the populations share a close genetic background. GWAS was conducted for both traits with population structure taken into account, and the trait-associated SNPs were used as prediction datasets alongside the whole SNP sets. The SVM and logistic regression models showed higher prediction accuracies than the other models when using the mSNP-GWAS dataset, indicating the potential of machine learning models in fish genomic breeding.