Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores in Nelore cattle

- The aim of this study was to evaluate genomic information inclusion in genetic parameter estimation of standardized body weight at birth and at 240, 365, and 450 days of age, and visual scores for body structure, precocity, and body muscularity, measured as yearlings in Nelore cattle. We compared genetic parameters, (co)variance components (estimated from Bayesian inference and Gibbs sampling), breeding value accuracies, genetic trends, and principal component analysis (PCA) obtained through traditional GBLUP and ssGBLUP methods. For all traits analyzed, part of the phenotypic variation was explained by the additive genetic effect, thus indicating the capacity of traits to respond to the selection process. Estimates of genetic correlation, in both methodologies, between body weights and visual scores were, in general, high and positive, showing that the selection for visual scores can lead to heavier animals. Genetic trends showed genetic progress, both when estimated breeding values and genomic estimated breeding values were used. The PCA, genetic trends, and accuracy estimates on breeding values showed that inclusion of single nucleotide polymorphism information contributed towards slightly better estimates of the genetic variability of evaluated traits. Genomic information did not bring greater gains in genetic estimates, due to redundancy of kinship information from the pedigree, which already had complete animal kinship data.


Introduction
The use of genomic information for evaluating production animals makes it possible to obtain genetic gains more rapidly compared with the use of information obtained only through phenotype records (Meuwissen et al., 2016). Incorporating genomic information within genetic improvement can also include gains in accuracy regarding the genetic breeding values for selection candidates and reduction of the generation interval, which enables selection of younger animals. Given that the amount of genotype data available for Nelore breed commercial cattle herds is relatively small in comparison with the volume of pedigree data, the best manner for using the genomic data on these animals is to combine genomic, pedigree, and phenotype information, through single-step genomic best linear unbiased prediction (ssGBLUP) (Haile-Mariam et al., 2013). This methodology consists of using genomic information to correct the pedigree relationship matrix and ascertain the genetic breeding Selection for growth traits has advantages over selection for reproductive traits, considering the ease of measurement and the possibility of doing it at different ages (Moreira et al., 2015). Body weight (BW) measurements made during an animal's initial development phase are important for determining the economic efficiency of any beef cattle rearing system (Zuin et al., 2012). Heritability estimates (h 2 ) for BW traits indicate that a high proportion of phenotypic variance is attributed to additive genetic variance. Therefore, these traits tend to be used as a selection criterion in genetic improvement programs because of its favorable genetic correlation with several traits of economic interest (Araújo et al., 2014), especially for visual score traits (Abreu et al., 2018).
Beef cattle evaluation by visual scores has been shown to be important to selection for meat production. Like BW, this can be obtained simply and at low cost (Paterno et al., 2017). Through visual scores, it is possible to identify animals with better weight development and better morphological proportions, thus indicating animals with earlier development and finishing (Koury Filho et al., 2010) and also favoring sexual precocity (Abreu et al., 2018;Freitas et al., 2020).
One multivariate technique that is used to explore data variability, when working with several correlated traits, is principal component analysis (PCA). This analysis can assist in interpretation of the genetic variation of economically important traits (Buzanskas et al., 2013) and in exploring genetic relationships between animals and their breeding values (Vargas et al., 2018). This technique has been used in genome-wide association studies to improve quantitative trait loci detection (Zhang et al., 2018) and prospect for candidate genes associated with growth, carcass, conformation, and fatty acid composition traits (Vargas et al., 2020).
Therefore, the benefit of using visual scores in genetic breeding programs relies on the association between visual scores and BW. The objective of this study was to evaluate genomic information inclusion in genetic parameter estimation of standardized BW at birth (BWB) and at 240 (BW240), 365 (BW365), and 450 (BW450) days of age, visual scores for body structure (BS), precocity (FP), and body muscularity (MS), measured in yearling Nelore cattle. We compared genetic parameters, (co)variance components (estimated from Bayesian inference and Gibbs sampling), breeding value accuracies, genetic trends, and PCA obtained through traditional best linear unbiased prediction (BLUP) and ssGBLUP methods to verify whether genotyping animals brought gain in genetic estimate accuracies that guide the selection of superior animals.

Ethics statement
The use that was made of the animals in this study, including their welfare and husbandry, experimental procedures, and blood samples collected, was under the responsibility of the Associação Nacional de Criadores e Pesquisadores (ANCP; Ribeirão Preto, SP, Brazil).

Phenotypic and pedigree datasets
Pedigree data from 192,483 animals of the Nelore breed that were born between 1934 and 2016, along with 80,114 phenotype records and 8,652 genotype records, were obtained through the programa Nelore Brasil, which is from ANCP. We studied traits of BWB, BW240, BW365, and BW450 (Table 1), along with visual scores of BS, FP, and MS, based on the morphological evaluation system (Table 2), measured in yearlings. To these visual scores, conceptual scores relating to the management group were attributed on a scale of 1 to 6, in which 6 was the best trait expression and 1, the worst. This was done through a consensus reached among three evaluators.
The animals had been reared within a pasture-grazing system, with weaning at the age of 6-8 months.
Reproductive management consisted of the use of a mounting season of 60-120 days, with use of Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores... Watanabe et al. 3 artificial insemination or controlled natural mounting. Contemporary groups (CG) were created in relation to all the traits in accordance with the effects of sex, birth year, farm of origin, management group, and trimester of birth. Each CG had a minimum of five animals and a maximum of 894, 544, 500, and 461 animals per group for BWB, BW240, BW365, and BW450, respectively, and a maximum of 360 animals with regard to visual score traits.
Contemporary groups containing fewer than five animals were disregarded. Connectedness analysis among the groups for each trait was performed using the AMC software. This analysis consists of verifying direct genetic links (DGL) among CG, due to common sires and dams, or any common ancestor, weighted according to the corresponding additive relationship. A minimum DGL value is used by the software for CG to be considered connected (Roso and Schenkel, 2006). Connectedness analysis was done to ascertain whether any groups were disconnected from the dataset and, thus, reject them from the analyses. It was found that 5, 14, and 15 CG were disconnected regarding traits of BW240, BW365, and BW450 and six groups regarding traits of BS, FP, and MS, respectively.
The fixed effects that were considered in the genomic prediction models were determined using the least-squares method using the GLM function available for the R 3.4.1 statistics software (R Core Team, 2018). For all traits studied, the fixed effect of the CG was significant (P<0.05) and was, therefore, included in the model. For all the BW studied and BS, the linear quadratic effect of the covariable of age of cow at calving was included, since this was shown to be significant (P<0.05) in the analyses on these traits. For the other scores (FP and MS), only the linear effect of the covariable of age of cow at calving was significant (P<0.05), and this was considered in the models. The assumptions for analysis of variance on the normality of the residuals and for homogeneity of variance were met for all the traits. Residuals from the observations that were more than +3.0 standard deviations above or more than −3.0 standard deviations below the average were considered to be outliers and were excluded from the dataset.

Quality control for genomic dataset
The genomic dataset in this study comprised 7,008 animals, 2,446 males and 4,562 females. The dataset came from an initial dataset of 8,652 animals and was subjected to quality control using the  Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores... Watanabe et al. 4 R/Bioconductor package snpStats (Clayton, 2020). This consisted of removal of the single nucleotide polymorphism (SNP) markers using alleles of frequency lower than 5% and call rate lower than 95%. Animals with a call rate lower than 93% were removed. After this quality control step, there remained 456,241 SNP and 7,008 genotyped animals.
Among the 8,652 animals (original dataset), 960 bulls were genotyped using a high-density (HD) panel (Illumina Bovine HD BeadChip), while 1,000 animals were genotyped using a medium-density panel (50k) (Illumina BovineSNP50 BeadChip) and were subsequently imputed to the HD panel. Furthermore, 6,692 animals were genotyped using a low-density panel (12k) (Clarifide Nelore 2.0) and were subsequently imputed to the 50k panel and then to the HD panel. Thus, all the animals considered in the genomic analyses were imputed to the HD panel.

Variance component estimation
The prediction of estimated breeding values (EBV) was obtained through the traditional BLUP methodology. Genomic EBV (GEBV) were obtained through ssGBLUP. (Co)variance components were estimated from Bayesian inferences and Gibbs sampling. Bayesian inferences enabled to establish regions of credibility (intervals of higher a posteriori density, HPD) for each parameter or combination of parameters, constructed from the Gibbs sampling. (Co)variances, EBV, and GEBV presented in this study were estimated using THRGIBBS1F90 (Misztal et al., 2014;Tsuruta and Misztal, 2006). For the multi-trait models, initial variance values were obtained from single-trait models: estimates obtained using the GIBBS2F90 software (Misztal et al., 2014) for linear traits and THRGIBBS1F90 for categorical traits. Multi-trait analyses were performed considering combinations of each BW with the three visual scores. The posteriori estimates for variance components were obtained through the POSTGIBBSF90 software (Misztal et al., 2014). The general model can be represented as follows: in which y is the phenotype vector, β is the vector of systematic effects of CG, a is the vector of random additive genetic effects, e is the vector of random residual effects, and X and Z are the incidence matrices that correlate the observations to the systematic and random effects, respectively. The model assumptions were presented as follows: in which G a is the matrix of (co)variances of additive genetic effects, A is the relationship matrix, R is the matrix of residual (co)variances, I r is the identity matrix, and ⊗ represents the direct Kronecker product between two matrices. The relationship matrix H is used when ssGBLUP was considered and is presented as follows: in which A −1 is the inverse of the numerator relationship matrix, G −1 is the inverse of the genomic relationship matrix, and A -2 1 2 is the inverse of the pedigree-based relationship matrix for genotyped animals.
(Co)variance components were estimated by Bayesian inference, via Gibbs sampling. A prior normal distribution was considered in the single-and multi-trait models. In the implementation of Gibbs sampling from Monte Carlo and Markov sequences (MCMC), an initial chain of 500,000 iterations was used. The first 50,000 cycles were discarded (burn-in), and samples were removed after each 100 cycles (thinning interval), thus totaling 4,500 samples. For the convergence analysis, the convergence assessment criteria proposed by Heidelberger and Welch (1983) and Geweke (1992) were used. Convergence between the chains was ascertained using the R BOA package (Smith, 2005). The Heidelberger and Welch test considered that the sample values came from stationary distribution Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores... Watanabe et al. 5 for the null hypothesis (Heidelberger and Welch, 1983). A half-amplitude test was performed together with the Heidelberger and Welch test, with calculation of the 95% confidence interval for the mean using the fragment of the chain that went through the stationary test. Geweke's convergence diagnostics are based on a test for equality of means between the first and last portions of the Markov chain. For Geweke's diagnostic test, 10% of the initial samples and 50% of the last samples were used. Heritability estimates, genetic correlations, phenotype correlations, and environmental correlations were obtained from the mean values of the respective samples a posteriori.
The accuracies (ac) of EBV and GEBV were calculated considering the following equation: in which σ 2 e is the predictive variance of the error of genetic value estimates and σ 2 a is the additive genetic variance (BIF, 2018). Animals were divided into classes of EBV and GEBV accuracy to verify the symmetry data distribution. These classes were divided into proportions of animals with accuracy levels as follows: proportion of animals with EBV and GEBV accuracy values less than 0, between 0 and 0.4 (not including accuracy values of 0.4), between 0.4 and 0.6 (including accuracy values of 0.4 and not including values of 0.6), between 0.6 and 0.8 (including accuracy values of 0.6 and not including values of 0.8), and greater than 0.8 (including accuracy values of 0.8).

Principal component analysis and genetic trends
Principal component analysis was performed using the statistics package, available for the software R 3.4.1 (R Core Team, 2018), which uses the decomposition of singular values of the data matrix (centralized and scaled). In this analysis, the Kaiser criterion (Kaiser, 1960) was used to explain variable variance, such that more relevant proportions of the data variance were explained when eigenvalues were greater than 1. This was done by considering the means for the EBV and GEBV estimates, for the traits studied, which were estimated from multi-trait analyses. Through this analysis, the interaction between traits could be seen. The proportion of the animals' genetic variability that was explained through PCA was observed upon including SNP information from the animals in GEBV estimates.
Genetic trends were estimated from linear regression on EBV and GEBV, as a function of the animals' birth year. This was done using the statistics package, available for the software R 3.4.1 (R Core Team, 2018). The significance of the regression coefficient b was tested using the t test. The genetic trends were ascertained to determine whether there had been any genetic progress regarding the traits studied. Bias in predicting genetic breeding values was verified according to Yoshida et al. (2019). To avoid unbiased estimates, the angular coefficient of the statistical linear regression model for GEBV as a function of EBV needs to be 1. Values above 1 indicates inflated estimates and values below 1, deflated estimates.

Results
Convergence was achieved based on at least one of the convergency criteria, for all the variance components that imply reliability in the results obtained. Heritability estimates for the traits evaluated in ssGBLUP and BLUP ranged from 0.31±0.04 to 0.81±0.01 and 0.31±0.02 to 0.82±0.01, respectively. For all the traits analyzed, part of the phenotypic variation was explained by the additive genetic effect (Table 3), thus indicating the capacity of the traits to respond to selection. A slight increase in h 2 estimates was observed when genomic information for BW240, BW365, and B2450 was considered. The BW240 trait was the one that most presented gain when using ssGLBUP, increasing from 0.33±0.02 (traditional BLUP) to 0.38±0.02. The gain in the estimates was due to better knowledge of the relationship coefficients among related animals when molecular information was used.
The genetic variability could also be observed in the PCA (Table 4 and Figure 1). Principal components 1 and 2 (PC1 and PC2, respectively) were considered to represent the variability of the data analyzed as they passed the Kaiser criterion (ssGBLUP: 4.6255 and 1.1721 for PC1 and PC2, respectively; traditional BLUP: 4.5318 and 1.1801 for PC1 and PC2, respectively). In the present study, by using ssGBLUP, PC1 was capable of explaining 66.1% of the variability of the data, while PC2 explained 16.7% (Figure 1 b). In turn, through the use of BLUP, PCA was capable of identifying 64.74 and 16.86% of the BLUP -best linear unbiased prediction; ssGBLUP -single-step genomic best linear unbiased prediction; BWB -weight at birth; BW240 -weight at 240 days of age; BW365 -weight at 365 days of age; BW450 -weight at 450 days of age; BS -structure; FP -precocity; MS -muscularity; h 2 -heritability estimate; σ 2 a -estimate of additive genetic variance; σ 2 e -estimate of residual genetic variance; 95% HPD -estimates of higher a posteriori density. Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores... Watanabe et al. 7 variability of the data, in PC1 and PC2, respectively. Therefore, by using the ssGBLUP method, it was possible to identify a slightly higher proportion of the genetic variability. This was achieved through incorporation of SNP information into the prediction model, which enabled better identification of the genetic relationships among the animals.
The greater proximity between the vectors for the mean genetic breeding values of BWB and BS (Figure 1) corroborated the estimates for genetic correlations (Table 5). The estimate for genetic correlation between BWB and BS was greater than the estimates between the other visual scores. The BLUP -best linear unbiased prediction; ssGBLUP -single-step genomic best linear unbiased prediction.  proximities between the vectors for mean genetic breeding values relating to BW and visual scores for FP and MS can also be explained through the high estimates for genetic correlations between these traits. Genetic correlations between visual scores and BW were positive according to both methodologies, ranging from 0.20±0.09 between BWB and MS to 0.88±0.02, between FP and MS, obtained through traditional BLUP. Through ssGBLUP, genetic correlation estimates ranged from 0.24±0.10 between BWB and FP to 0.89±0.02 between FP and MS (Table 5). Genetic correlations were positive and high between BW measured after 240 days of age and visual scores, in both methodologies.
Estimates for GEBV were generally higher than those for EBV for all traits ( Table 6). The differences between GEBV and EBV increased with increasing age among animals and with increasing BW measurements. The greatest proportion of animals presented accuracy of genetic breeding values within the class going from greater than or equal to zero to less than 0.4 (Table 7). Through the use of the ssGBLUP method, there was a slight increase in the proportion of animals with accuracy estimates greater than or equal to 0.4 and a reduction in the proportion with accuracy less than zero, primarily for BW traits, compared with accuracy estimates from the traditional BLUP methodology. This indicated that inclusion of the SNP information enabled estimates of greater accuracy regarding genetic breeding values of animals, thus providing greater confidence in these estimates.
In the present study, all the angular coefficient values of the linear regression between GEBV and EBV, for all traits, were close to 1 (ranging from 1.02 for BWB to 1.12 for BW240), thus indicating that there was little bias in GEBV estimates in relation to EBV. The genetic trends showed the mean changes for GEBV and EBV estimates. The angular coefficient of the mean genetic breeding values per birth year was significant (P<0.05) for all the traits, thus showing genetic progress was occurring both for GEBV and EBV estimates (Figures 2 and 3).
For all the genetic trends, the coefficient of determination (R 2 ) showed that the model used was adequate for explaining the variation of genetic breeding values as a function of birth year. For visual  Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores... Watanabe et al. 9 scores measured as yearlings, genetic trend graphs showed that genetic gains occurred over the years both for GEBV and EBV.
The R 2 for the mean EBV had greater variation (0.57 for FP and 0.95 for BWB) compared with values obtained through regression of GEBV according to birth year. Like in the other results presented earlier, this increase can be attributed to use of genomic information in the prediction model.
Genetic trends for all the BW represented genetic progress through selection, especially after the year 1990. Therefore, after the implementation of a breeding program for the Nelore breed by the ANCP, gradual genetic progress was observed for all the evaluated traits, which indicates that favorable and effective selection was achieved by the selection of these traits.

Discussion
Heritability estimates in the present study (Table 3) were close to those found by Araújo et al. (2014) (0.36±0.03 and 0.31±0.01 for BW240 and BW365, respectively) by using only the pedigree matrix in evaluating the Nelore breed. Greater difference in heritability can be observed in relation to birth weight (0.37±0.02 for BWB, found by the cited authors). It is possible that the h 2 observed for BWB was overestimated in the present study because of non-inclusion of the maternal genetic effect in the model. Initial BW (measured from birth up to 240 days of age) may be more influenced by the maternal genetic effect and may inflate h 2 estimates for this trait when not considered in the analyses (Clément et al., 2001).
Heritability estimates for BW240 and BW365 similar to those obtained using the traditional BLUP in the present paper were observed by Zuin et al. (2012) (0.25±0.02 and 0.29±0.02 for BW240 and BW365, respectively). Those authors used a database that was part of the same database used in the present study. The authors did not consider the maternal permanent environmental effect for BW240 due to non-convergence when considering this effect in those models. They attributed this lack of convergence to the data structure, in which 49% of the mothers only had one offspring. The higher proportion of dams with only one offspring in the present study (78.68% on average) was due to the greater number BLUP -best linear unbiased prediction; ssGBLUP -single-step genomic best linear unbiased prediction; BWB -weight at birth; BW240 -weight at 240 days of age; BW365 -weight at 365 days of age; BW450 -weight at 450 days of age; BS -structure; FP -precocity; MS -muscularity.
Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores... Watanabe et al.
of individuals in the pedigree information compared with the pedigree used by Zuin et al. (2012). The non-convergence of the statistical models when considering the maternal permanent environmental effect (results not shown) in this study can be explained by the pedigree structure, which presented a high proportion of dams with only one offspring.
The gains in h 2 and σ 2 a estimates through inclusion of genotype information in constructing the H matrix (gains in h 2 of 0.05, 0.03, and 0.03 for BW240, 365, and 450, respectively), along with the decrease in σ 2 e , were concordant with estimates found by Veerkamp et al. (2011), who estimated genetic parameters in Holstein-Friesian heifers using different kinship matrices. The kinship matrices were constructed only using pedigree information (A), only using SNP information (G), and a combination of these two matrices (H). These authors reported that after combining the information on phenotype, genotype, and pedigree (using the H matrix), h 2 estimates for BW, milk yield (MY), and EBV -estimated breeding value; GEBV -genomic estimated breeding value; BWB -body weight at birth; BW240 -weight at 240 days of age; BW365 -weight at 365 days of age; BW450 -weight at 450 days of age. The angular coefficient was significantly different from zero according to the t test (P<0.05).  Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores... Watanabe et al. 11 feed intake (FI), measured over the first 15 weeks of lactation, presented smaller standard errors (0.09, 0.08, and 0.09 for MY, FI, and BW, respectively) than when the A matrix was used (0.12, 0.11, and 0.12 for MY, FI, and BW, respectively).
The traits that did not show gains in h 2 estimates (BWB and visual scores) when using information from genotypes of animals in the ssGBLUP methodology, possibly presented redundancy of information in relation to other sources that were already considered in the model. This may have led to no gains or only slight gains in the estimates of genetic parameters, and this may also have affected some EBV accuracy gains. Gordo et al. (2016) estimated genetic parameters for visual score traits and observed results that were similar to those from both methodologies proposed in the present study. Those authors attributed their results to the greater number of phenotypes (13,524 for visual scores of conformation, finishing precocity, and muscling) than of genotypes (1,634 genotyped animals after quality control) present in their study. The same may have occurred in the present study.
The genetic correlation results in this study (   Year of birth GEBV_MS Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores... Watanabe et al. 12 The genetic correlation estimates in the present study showed that a substantial proportion of additive action genes influenced these traits in the same direction, i.e., positive estimates would indicate that selection for one trait would influence another trait in the same direction as the selection. By selecting animals from visual score traits, animals with higher BW will also be selected. Direct selection for yearling visual score traits may promote indirect selection for BW, such that visual scores would be an indicator of the animal's biotype. These results are consistent with the results reported in the literature (Bertipaglia et al. (2012), with genetic correlation results of 0.77, 0.75, and 0.58 for BS, FP, and MS, with BW at 18 months of age, respectively; and Paterno et al. (2017)).
The results from the genetic trend analyses using both methodologies for visual scores were concordant with those of Koetz Júnior et al. (2017). These authors evaluated 186,293 visual score records for conformation, precocity, and musculature in yearling Nelore cattle. They observed genetic gains ranging from 1.46±0.007 to 9.90±0.002 per year and, as in the present study, they showed that genetic progress existed for these traits. The results from the present study are also concordant with those of Abreu et al. (2017), who evaluated visual score traits for BS, FP, and MS in a Guzerat breed and observed that the genetic gains per year were small (0.004, 0.0013, and 0.0039, for BS, FP, and MS, respectively) but significant. The genetic progress observed in relation to all BW in the present study corroborates the results presented in the literature for different BW among Nelore cattle (Laureano et al., 2011;Silva et al., 2013) and among other cattle breeds (Mucari and Oliveira, 2003;Abreu et al., 2017).
A slight improvement in breeding value estimate accuracy (Table 7) was noted when the ssGBLUP method was used, through better identification of genetic association among the animals, especially with regard to later-appearing BW (BW365 and BW450). The slight gains in accuracy obtained through both methods may have been due to the lower amount of genotype information compared with the phenotype records. The pedigree structure of this work may have influenced the small gains in the estimates with the addition of animal genotypes. The high values of the genetic breeding value standard deviations indicate the high data variation, and this may have occurred because these estimates came from animals of all pedigrees, i.e., animals born between 1934 and 2016. Environmental effects may have become better controlled after the beginning of the genetic breeding program, which started in 1988. The proportion of animals in the accuracy class of less than zero was greater for visual score traits, probably because of the smaller number of observations compared with the number related to BW traits.

Conclusions
The principal component analysis, genetic trend, and accuracy estimates on the genetic and genomic breeding values show that inclusion of information from single nucleotide polymorphism contributes towards slightly better estimates of the genetic variability of the studied traits. With inclusion of animal genotype information, genetic values for the later-appearing traits are estimated with slightly greater accuracy. The genomic information does not bring greater gains in the estimates, due to redundancy of kinship information from the pedigree, which already had complete animal kinship data. Genetic trends show genetic progress, both when estimated breeding values and when genomic estimated breeding values are used.
Inclusion of genomic information in estimation of genetic parameters for body weights and visual scores.