To date, more than 160 welfare indicators for farmed fish have been described. In most species-specific welfare assessment systems, every indicator is scored on the same ordinal scale, typically with 3 to 6 score levels, and the number of levels is identical for all included indicators. Each welfare indicator must be valid, reliable, and practicable for the specific species and production system to which it is applied. Reliability is crucial when developing and using welfare assessment systems, yet almost none of these indicators have been tested for reliability in rigorous scientific studies. We evaluated a set of 25 welfare indicators of differing complexity for rainbow trout under laboratory and farm conditions. The set covered farm management, husbandry measures, resources, fish behavior, stunning and killing during slaughter, and fish health. An assessment system was specified for each indicator, and all indicators were statistically tested for intra- and inter-observer reliability.
The study consisted of four distinct phases: 1. A methodical review of all relevant welfare indicators and corresponding classification systems for rainbow trout; 2. A pre-test, on aquaculture farms, of the welfare indicators and corresponding assessment systems described in the literature; 3. Elaboration of a survey scheme for the indicators and corresponding assessment systems with regard to practicability, specific production systems, and corresponding welfare needs; 4. Training in, and testing of, the welfare indicators under laboratory and farm conditions.
For statistical validation, we calculated relative agreement as well as Gwet’s AC1 and the Brennan-Prediger agreement coefficient. We used 0.61 as the minimum benchmark for reliability for both AC1 and the Brennan-Prediger coefficient.
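The three statistics named above can be illustrated with a minimal sketch for the simplest case of two observers (or one observer scoring the same fish twice) and unweighted nominal categories. This is an assumption for illustration only: the function name `agreement_coefficients`, the constant `BENCHMARK`, and the example scores are hypothetical, and the study itself may have used multi-rater or weighted forms of these coefficients.

```python
from collections import Counter

BENCHMARK = 0.61  # minimum reliability benchmark stated in the text


def agreement_coefficients(scores_a, scores_b, categories):
    """Relative agreement, Brennan-Prediger and Gwet's AC1 for two score lists.

    Illustrative two-rater, unweighted versions; `categories` lists every
    score level of the assessment system, including levels never observed.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    q = len(categories)
    if q < 2:
        raise ValueError("at least two score levels are required")
    n = len(scores_a)

    # Relative (percent) agreement: share of subjects scored identically.
    p_a = sum(a == b for a, b in zip(scores_a, scores_b)) / n

    # Brennan-Prediger: chance agreement fixed at 1/q.
    pe_bp = 1.0 / q
    bp = (p_a - pe_bp) / (1.0 - pe_bp)

    # Gwet's AC1: chance agreement from the mean marginal proportion pi_k
    # of each category, pe = sum(pi_k * (1 - pi_k)) / (q - 1).
    count_a, count_b = Counter(scores_a), Counter(scores_b)
    pe_ac1 = sum(
        pi * (1.0 - pi)
        for pi in ((count_a[k] + count_b[k]) / (2.0 * n) for k in categories)
    ) / (q - 1)
    ac1 = (p_a - pe_ac1) / (1.0 - pe_ac1)

    return {"relative_agreement": p_a, "brennan_prediger": bp, "gwet_ac1": ac1}


# Hypothetical scores from two observers for ten fish on a 3-score system.
observer_1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 0]
observer_2 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
result = agreement_coefficients(observer_1, observer_2, categories=[0, 1, 2])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
print("meets benchmark:", all(result[k] >= BENCHMARK
                              for k in ("brennan_prediger", "gwet_ac1")))
```

For intra-observer reliability, the same function can be applied to two scoring rounds by a single observer instead of scores from two different observers.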
After training, observers were able to reliably assess each indicator across 10 different farms and more than 200 individual fish. Indicators based on binary assessment systems generally achieved reliabilities of ≥ 0.95. Indicators based on 3- and 4-score assessment systems achieved reliabilities of ≥ 0.69. Reliability for fin condition, the only indicator based on a 5-score assessment system, ranged between 0.69 and 0.92.
In general, reliability tended to be slightly lower in field assessments than in laboratory assessments. In addition, observers achieved higher reliability as they gained experience.
In the current study, we document a statistically validated set of operational welfare indicators for farmed rainbow trout, based on indicator-specific assessment systems and tested in both traditional pond systems and raceway systems.