Article

Implementing structural equation models to observational data from feedlot production systems

Abstract

The objective of this study was to illustrate the implementation of a mixed-model-based structural equation modeling (SEM) approach to observational data in the context of feedlot production systems. Different from traditional multiple-trait models, SEMs allow assessment of potential causal interrelationships between outcomes and can effectively discriminate between direct and indirect effects. For illustration, we focused on feedlot performance and its relationship to health outcomes related to Bovine Respiratory Disease (BRD), which accounts for approximately 75% of morbidity and 50–80% of deaths in feedlots. Our data consisted of 1430 lots representing 178,983 cattle from 9 feedlot operations located across the US Great Plains. We explored functional links between arrival weight (AW; i = 1), BRD-related treatment costs (Trt$; as a proxy for health; i = 2) and average daily weight gain (ADG; as an indicator of productive performance; i = 3), accounting for the fixed effect of sex and correlation patterns due to the clustering of lots within feedlots. We proposed competing plausible causal models based on expert knowledge. The best fitting model selected for inference supported direct effects of AW on ADG as well as indirect effects of AW on ADG mediated by Trt$. Direct effects from outcome i′ to outcome i are quantified by the structural coefficient λ_ii′, such that every unit increase in kg/head of AW had a direct effect of increasing ADG by approximately (estimate ± standard error) λ̂_31 = 0.002 ± 0.0001 kg/head/day and also a direct effect of reducing Trt$ by an estimated λ̂_21 = 0.08 ± 0.006 USD per head. In addition, every 1 USD spent on Trt$ directly decreased ADG by an estimated λ̂_32 = 0.004 ± 0.0006 kg/head/day. From these estimates, we show how to compute the indirect, Trt$-mediated, effect of AW on ADG, as well as the overall effect of AW on ADG, including both direct and indirect effects. We further compared estimates of SEM-based effects with those obtained from standard linear regression mixed models and demonstrated the additional advantage of explicitly distinguishing direct and indirect components of an overall regression effect using SEMs. Understanding the direct and indirect mechanisms of interplay between health and performance outcomes may provide valuable insight into production systems.
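The indirect and overall effects mentioned in the abstract can be sketched from the three reported point estimates. This is an illustrative calculation only, not the authors' code: it assumes the usual product-of-coefficients rule for a recursive model, and it assigns negative signs to the two coefficients that the abstract describes in words as "reducing" Trt$ and "decreasing" ADG. Standard errors are left out because combining them would need the covariances between estimates, which the abstract does not report.

```python
# Point estimates reported in the abstract. The signs are an assumption
# read from the wording ("reducing Trt$", "decreased ADG"); the abstract
# itself prints magnitudes only.
lambda_21 = -0.08    # USD/head change in Trt$ per kg/head of AW
lambda_32 = -0.004   # kg/head/day change in ADG per USD/head of Trt$
lambda_31 = 0.002    # kg/head/day change in ADG per kg/head of AW (direct)

# Indirect (Trt$-mediated) effect: product of the coefficients along AW -> Trt$ -> ADG.
indirect_effect = lambda_21 * lambda_32

# Overall effect: direct path plus mediated path.
overall_effect = lambda_31 + indirect_effect

print(f"indirect effect of AW on ADG: {indirect_effect:.5f} kg/head/day")
print(f"overall effect of AW on ADG:  {overall_effect:.5f} kg/head/day")
```

Under these sign assumptions the mediated path adds to the direct effect, because heavier arrival weight lowers treatment costs and lower treatment costs go with higher gain.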

... accounting for the many interrelated risk factors such as season, sex, arrival body weight (AW), shipping distance, and the degree of commingling of cattle from multiple sources (Sanderson et al., 2008; Step et al., 2008; Cernicchiaro et al., 2012a). Cha et al. (2017) illustrated the use of SEM using observational data on BRD treatment costs and average daily gain (ADG). The study noted that a major advantage of SEM is its ability to consider both direct and indirect effects of an overall relationship. ...
... Cohorts were randomly assigned to consist of all heifers, all steers, or mixed cohorts (i.e., steers and bulls) with probabilities 0.40, 0.45, and 0.15, respectively, consistent with observed data in previous research (Sanderson et al., 2008; Cernicchiaro et al., 2012a; Cha et al., 2017). Contemporary groups (CG), consisting of 10 cohorts each, were defined to cluster cohorts managed within the same month and year. ...
... $\mathrm{Mb\_p}_{ij} = \beta_{0,\mathrm{Mb\_p}} + \beta_{1,\mathrm{Mb\_p}}\,\mathrm{steers}_{ij} + \beta_{2,\mathrm{Mb\_p}}\,\mathrm{mixed}_{ij} + \lambda_{\mathrm{Mb\_p},\mathrm{AW}}\,\mathrm{AW}_{ij} + u_{j,\mathrm{Mb\_p}} + e_{\mathrm{Mb\_p},ij}$ (10). Parameter values selected based on previous research from our group (Sanderson et al., 2008; Cernicchiaro et al., 2012a; Cernicchiaro et al., 2013; Cha et al., 2017). Each response also receives the fixed effect of sex (i.e. ...
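The simulation design in the excerpts above (sex categories drawn with probabilities 0.40, 0.45 and 0.15, contemporary groups of 10 cohorts, and the linear equation (10) for the morbidity proportion) could be sketched roughly as follows. Every numeric parameter value, the AW distribution, and the reading of i as cohort and j as contemporary group are assumptions made for illustration; the excerpt does not report them.

```python
import random

SEX_LEVELS = ["heifers", "steers", "mixed"]
SEX_PROBS = [0.40, 0.45, 0.15]   # probabilities quoted in the excerpt
COHORTS_PER_CG = 10              # cohorts per contemporary group (excerpt)

# Hypothetical parameter values, chosen only so that the sketch runs.
BETA_0, BETA_STEERS, BETA_MIXED = 12.0, -1.0, 0.5   # intercept and sex effects (%)
LAMBDA_AW = -0.02                                    # % morbidity per kg of AW
SD_CG, SD_RESID = 2.0, 3.0                           # SDs of u_j and e_ij


def simulate(n_cg, seed=1):
    """Simulate Mb_p for n_cg contemporary groups following the form of Eq. (10)."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_cg):
        u_j = rng.gauss(0.0, SD_CG)  # random effect shared by all cohorts in CG j
        for i in range(COHORTS_PER_CG):
            sex = rng.choices(SEX_LEVELS, weights=SEX_PROBS)[0]
            aw = rng.gauss(300.0, 40.0)  # hypothetical AW distribution, kg/head
            # Linear predictor plus residual; no truncation to [0, 100] is
            # applied, so the normal approximation can leave that range.
            mb_p = (BETA_0
                    + BETA_STEERS * (sex == "steers")
                    + BETA_MIXED * (sex == "mixed")
                    + LAMBDA_AW * aw
                    + u_j
                    + rng.gauss(0.0, SD_RESID))
            rows.append({"cg": j, "cohort": i, "sex": sex, "aw": aw, "mb_p": mb_p})
    return rows


data = simulate(n_cg=50)
print(len(data), "cohorts simulated")
```

Heifers act as the reference level here because Eq. (10) carries indicator terms for steers and mixed cohorts only.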
Article
Most commercial software for implementation of structural equation models (SEM) cannot explicitly accommodate outcome variables of binomial nature. As a result, SEM modeling strategies of binomial outcomes are often based on normal approximations of empirical proportions. Inferential implications of these approximations are particularly relevant to health-related outcomes. The objective of this study was to assess the inferential implications of specifying a binomial variable as an empirical proportion (%) in predictor and outcome roles in a SEM. We addressed this objective first by a simulation study, and second by a proof-of-concept data application on beef feedlot morbidity to bovine respiratory disease (BRD). We simulated data on body weight at feedlot arrival (AW), morbidity count for BRD (Mb), and average daily gain (ADG). Alternative SEMs were fitted to the simulated data. Model 1 specified a directed acyclic causal diagram with morbidity fitted as a binomial outcome (Mb) and as a proportion (Mb_p) predictor. Model 2 specified a similar causal diagram with morbidity fitted as a proportion for both outcome and predictor roles within the network. Structural parameters for Model 1 were accurately estimated based on the nominal coverage probability of 95% confidence intervals. In turn, there was poor coverage for most morbidity-related parameters under Model 2. Both SEM models showed adequate empirical power (>80%) to detect parameters not equal to zero. Model 1 and Model 2 produced predictions that were reasonable from a management standpoint, as determined by calculating the root mean squared error (RMSE) through cross-validation. However, interpretability of parameter estimates in Model 2 was impaired due to the model misspecification relative to the data generation. The data application fitted SEM extensions, Model 1* and Model 2*, to a dataset from a group of feedlots in the Midwestern US. Models 1* and 2* included explanatory covariates, specifically percent shrink (PS), backgrounding type (BG), and season (SEA). Lastly, we tested if AW exerted both direct and BRD-mediated indirect effects on ADG using Model 2*. In Model 1*, mediation was not testable due to the incomplete path from morbidity as a binomial outcome through Mb_p as a predictor to ADG. Model 2* supported a minor morbidity-mediated mechanism between AW and ADG, though parameter estimates were not directly interpretable. Our results indicate normal approximation to a binomial disease outcome in a SEM may be a viable option for inference on mediation hypotheses and for predictive purposes, despite limitations in interpretability due to inherent model misspecification.
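The two representations of morbidity contrasted in this abstract, a binomial count (Mb) and an empirical proportion in percent (Mb_p), can be illustrated with a small sketch. The logit link and the coefficient values `b0` and `lam_aw` are assumptions introduced here for illustration; the abstract does not state the data-generating details.

```python
import math
import random


def simulate_cohort(aw, n_head, rng, b0=-1.0, lam_aw=-0.004):
    """Draw a morbidity count for one cohort and return (count, percent).

    b0 and lam_aw are hypothetical logit-scale values, not estimates
    from the study.
    """
    p = 1.0 / (1.0 + math.exp(-(b0 + lam_aw * aw)))    # inverse logit
    mb = sum(rng.random() < p for _ in range(n_head))   # one Binomial(n_head, p) draw
    mb_p = 100.0 * mb / n_head                          # empirical proportion (%)
    return mb, mb_p


rng = random.Random(7)
mb, mb_p = simulate_cohort(aw=300.0, n_head=120, rng=rng)
print(f"Mb = {mb} cases out of 120 head, Mb_p = {mb_p:.1f}%")
```

A model that treats Mb_p as normally distributed discards the cohort size behind each percentage, which is one way to see why the abstract calls that specification an approximation.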
... The results showed an upward trend, suggesting that improving practices in the Amazonian Chakra entails a significant improvement in results. A significant relationship was found between sustainability constructs and economic performance, characterized by a sigmoid shape curve like the Cobb-Douglas function with decreasing returns concerning productive factors [23,79,80]. Surprisingly, the results related to Hypothesis 4 were rejected. ...
Article
This study focuses on investigating the dimensions of sustainability and their influence on financial-economic sustainability (FES) in traditional agroforestry systems (TAFS) using the case of the Amazonian Chakra. The main objectives were to analyze the dimensions of sustainability and to establish the causal relationships between these dimensions and the FES. To carry out this research, 330 households in Napo Province that use the Amazonian Chakra system to grow cocoa were selected in order to analyze the relationship between the different dimensions of sustainability and FES in this unique context. The results of the study show that practices related to food security (FS) and business factors (BF) have a positive and significant impact on the FES of cocoa-producing households in the Amazonian Chakra system. These findings support the importance of ensuring the availability and quality of food and promoting responsible business practices in these environments. In contrast, the dimensions of environmental resilience (ER) and biodiversity conservation (BC) showed a negative impact on FES, highlighting an economic-financial imbalance in relation to conservation and environmental resilience actions in the Amazonian Chakra. This study contributes to the knowledge needed to promote agricultural practices that include an equal focus on FES, biodiversity conservation, and environmental resilience practices in a globally significant area, providing valuable information for the design of sustainable agricultural policies and practices in the Amazonian Chakra.
... There have been many important research projects which have relied on observational feed yard data. Cha et al. (2017) used structural equation models with operational feed yard data with a focus on health outcomes related to bovine respiratory disease (BRD). Their model demonstrated indirect effects of arrival weight on average daily gain (ADG) mediated by BRD-related treatment costs. ...
... These types of studies (Cha et al. 2017; Irsik et al. 2006; Babcock et al. 2009; Babcock et al. 2013) are all part of a large body of research using operational feed yard data that could be described as retrospective. The data are used after cattle have finished the production cycle and knowledge gained from the research is used to adjust future production to increase the efficiency and profitability of the cattle finishing industry. ...
Article
Cattle feed yards routinely track and collect data for individual calves throughout the feeding period. Using such operational data from nine U.S. feed yards for the years 2016–2019, we evaluated the scalability and economic viability of using machine learning classifier predicted mortality as a culling decision aid. The expected change in net return per head when using the classifier predictions as a culling aid as compared to the status quo culling protocol for calves having been pulled at least once for bovine respiratory disease was simulated. This simulated change in net return ranged from −$1.61 to $19.46/head. The average change in net return and its standard deviation for the nine feed yards in this study were $6.31/head and $7.75/head, respectively.
... Thus, in contrast to classical regression approaches, the outcome and the predictors are not defined as such beforehand, but within the network different GLMs applicable to the data at hand are evaluated. ABN modelling is a pure data-driven technique, contrasting other approaches where the model is theory driven such as Structural Equation Modeling [13,14]. Consequently, the first step in an ABN analysis is to find the optimal or most complex network still supported by the data, based on a metric which is controlling for complexity, allowing for the maximum number of links or associations between all variables included. ...
... The colors represent the direction of the association, with green indicating a positive and red a negative association. The parents are listed in the columns and the children in the rows. 1 Gender (baseline male versus female); 2 Presence of pets (baseline no versus yes); 3 Farmsize (baseline S: small < 500, M: medium 500 to 1000 and L: large > 1000), M and L compared to S; 4 Management (baseline free range and semi-intensive versus intensive); 5 Eggtrays re-use (baseline no versus yes); 6 Vaccinator (baseline PS: private service, S: self or family member, E: employee), S and E compared to PS; 7 Disposal (baseline 1 = burying, 2 = burning, 3 = throwing away, 4 = giving to animals (dogs and pigs), 5 = drop in a pit); 8 Sulphonamides; 9 Ciprofloxacin; 10 Tetracycline; 11 Trimethoprim; 12 Sulfamethoxazole-trimethoprim; 13 Chloramphenicol; 14 Ampicillin ...
Article
Background: Multi-drug resistant bacteria are seen increasingly and there are gaps in our understanding of the complexity of antimicrobial resistance, partially due to a lack of appropriate statistical tools. This hampers efficient treatment, precludes determining appropriate intervention points and renders prevention very difficult. Methods: We re-analysed data from a previous study using additive Bayesian networks. The data contained information on resistances against seven antimicrobials and seven potential risk factors from 86 non-typhoidal Salmonella isolates from laying hens in 46 farms in Uganda. Results: The final graph contained 22 links between risk factors and antimicrobial resistances. Solely ampicillin resistance was linked to the vaccinating person and disposal of dead birds. Systematic associations were detected between ampicillin and sulfamethoxazole/trimethoprim and chloramphenicol, which was also linked to sulfamethoxazole/trimethoprim. Sulfamethoxazole/trimethoprim was also directly linked to ciprofloxacin and trimethoprim. Trimethoprim was linked to sulfonamide and ciprofloxacin, which was also linked to sulfonamide. Tetracycline was solely linked to ciprofloxacin. Conclusions: Although the results need to be interpreted with caution due to a small data set, additive Bayesian network analysis allowed a description of a number of associations between the risk factors and antimicrobial resistances investigated.
... PLS-SEM (partial least squares structural equation modeling) was employed to measure the association between the four phases of startup incubation and graduation rate. To test the posited hypotheses, we proposed a nonlinear model (Figure 2) with statistical estimates derived from PLS regression analysis [104][105][106]. The model was estimated using WarpPLS 8.0 software. ...
Article
Business incubators contribute to the growth of a country, and it is of great interest to deepen knowledge of the impact of incubation phases on the results of incubators to evaluate the effectiveness of developed incubation programs. The objective of this research was to propose a model that quantitatively related different incubation phases to the graduation rate of business incubators in Spain. A sample of 88 incubators was obtained. The survey included 42 items identified in different phases (spreading entrepreneurship, 9 items; pre-incubation, 9 items; basic incubation, 9 items; advanced incubation, 6 items; and graduation, 9 items) and four hypotheses relating to the existence of a positive influence from the startup incubation phases on the incubators' results. These were validated by using a structural equation model (SEM) with five latent variables. Three of the four proposed hypotheses, which linked startup pre-incubation (H2), basic incubation (H3), and advanced incubation (H4) with graduation rates in Spanish incubators, were accepted. These startup incubation stages showed a positive influence on the startup graduation rate. The advanced incubation stage had a very strong relationship with the graduation rate (β = 0.543). Furthermore, a strong indirect effect between business incubation and the graduation rate, explaining 71% of the success of the incubators, was found. Proposals for improvement in each incubation phase to enhance the results of the business incubators are provided. Furthermore, future challenges that should be incorporated into the development of incubator programs, such as the social focus, the implementation of a training and monitoring model, an increase in network businesses, the internationalization of incubators with a globalized approach, the sustainability of the startup’s approach, and the transfer focus, are raised. Given the high variability of Spanish incubators and the wide sampling range, the model could be extended to other contexts with similar behavior within the sample range.
... PLS-SEM (partial least squares-structural equation modeling) was employed to measure the association between the four phases of startup incubation, considering the proposed constructs. The association between constructs was measured through path coefficients, significance level, and cross-validated redundancy [105][106]. The model was estimated by applying the Partial Least Squares (PLS) procedure using the Warp PLS 8.0 software. ...
Article
This research quantified the relationships among the different phases of the business incubation process. A total of 89 surveys were collected from business incubators in Spain in the period 2022–2023. A structural equation model (SEM) was applied to determine the association among incubation phases 1, 2, 3, and 4. The results showed that the “spreading entrepreneurship” phase had a strongly positive and significant influence on preincubation, phases 1 and 2 (hypothesis 1), basic incubation, phase 3 (hypothesis 4), and advanced incubation, phase 4 (hypothesis 5). In addition, a moderate positive influence was found between preincubation and basic incubation (hypothesis 2) and between preincubation and advanced incubation (hypothesis 6). In this context, spreading entrepreneurship will be a useful tool to determine the success of entrepreneurship during the incubation process. Improving variables such as counseling, channels and training will positively impact incubation. Therefore, it is recommended to take action at the spreading entrepreneurship stage to improve business incubator results, and to evaluate the structural deficiencies of entrepreneurs in order to improve their training level and technicians' specialization. Applying SEM models in business incubators to evaluate their influence on graduation rates would also be of great interest.
... SEM is a multivariate statistical analysis used to analyze data based on cause-effect relationships; it is widespread in social, behavioral and commercial research and measures causality in complex data structures (Alkis, 2016; Barrett, 2007; Cha et al., 2017). ...
Article
Teachers' job satisfaction (TJS) can be defined as the emotional reactions of teachers to their jobs or teaching roles. This study aimed to investigate teacher-, principal- and school-based determinants of teachers' job satisfaction. Based on a relational survey model, secondary data obtained from the TALIS-2018 evaluation were analyzed with Multilevel Structural Equation Modeling. The sample consists of 196 principals and 3952 teachers from Turkey who participated in the TALIS-2018 survey. According to the results of the study, teachers' age, gender, career preferences and participation in professional development activities, the locations of the schools they work in, the type of school (state / private) and the gender of the school principals were found to be determinants of job satisfaction. Teachers' work experience, having foreign students in their classes, and the school principal's age and work experience did not affect teachers' job satisfaction.
... Structural equation modeling (SEM) is a widely used modeling tool in many areas such as behavioral, commercial and social sciences [3]. SEM is a multivariate statistical analysis which investigates the relationship between multiple results in complex systems with causality [4]. SEM is a modeling method used to test hypotheses based on cause-effect [5]. ...
Article
The Programme for International Student Assessment (PISA) is an international survey funded by the Organisation for Economic Co-operation and Development (OECD). The PISA survey has been conducted every three years since 2000 to measure and evaluate the educational quality of students aged between 15 and 16. The PISA survey aims to evaluate students' achievements through the concept of description that they have learned in Science, Mathematics and Reading Skills. In the PISA 2015 survey, the science literacy performance of the students was examined. Multilevel Structural Equation Modeling (MSEM) is a multilevel statistical analysis technique used in the analysis of models with complex data structure. Nowadays, data obtained from many projects such as PISA, TIMSS, and PIRLS have a complex and hierarchical structure. MSEM analysis is needed for hierarchical data. The aim of this study is to analyze the model created for the PISA 2015 science literacy performance of Turkish students using MSEM analysis, in comparison with Singaporean students, who ranked first among the participating countries. Data for the Turkish and Singaporean students were analyzed using the Mplus package program. The model established for both countries was observed to show good fit.
... Chains can convey causal effects either directly or indirectly through an intermediate mediator. Figure 3 illustrates 2 chain paths transmitting the effect of arrival weight (i.e., C 1 ) on beef cattle performance (i.e., C 3 ) in a simplified feedlot production system (adapted from Cha et al., 2017). One of the paths shows a direct effect (i.e., C 1 → C 3 ), whereas the other one indicates an indirect effect (i.e., C 1 → C 2 → C 3 ) mediated by the health indicator C 2 . ...
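The direct and indirect chains described in this excerpt can be listed mechanically from the edges of the diagram. The sketch below is a generic path enumeration over a small directed acyclic graph; the edge list is transcribed from the excerpt (C1 = arrival weight, C2 = health indicator, C3 = performance) and the function name is hypothetical.

```python
# Edges of the simplified diagram in the excerpt, stored as parent -> children.
EDGES = {"C1": ["C2", "C3"], "C2": ["C3"], "C3": []}


def directed_paths(graph, start, end, prefix=None):
    """Return every directed path from start to end in an acyclic graph."""
    prefix = (prefix or []) + [start]
    if start == end:
        return [prefix]
    paths = []
    for child in graph[start]:
        # Recursion terminates because the graph has no directed cycles.
        paths.extend(directed_paths(graph, child, end, prefix))
    return paths


for path in directed_paths(EDGES, "C1", "C3"):
    kind = "direct" if len(path) == 2 else "indirect"
    print(" -> ".join(path), f"({kind})")
```

A path with exactly two nodes is a direct effect; any longer path passes through at least one mediator.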
Article
Understanding causal mechanisms among variables is critical to efficient management of complex biological systems such as animal agriculture production. The increasing availability of data from commercial livestock operations offers unique opportunities for attaining causal insight, despite the inherently observational nature of these data. Causal claims based on observational data are substantiated by recent theoretical and methodological developments in the rapidly evolving field of causal inference. Thus, the objectives of this review are as follows: 1) to introduce a unifying conceptual framework for investigating causal effects from observational data in livestock, 2) to illustrate its implementation in the context of the animal sciences, and 3) to discuss opportunities and challenges associated with this framework. Foundational to the proposed conceptual framework are graphical objects known as directed acyclic graphs (DAGs). As mathematical constructs and practical tools, DAGs encode putative structural mechanisms underlying causal models together with their probabilistic implications. The process of DAG elicitation and causal identification is central to any causal claims based on observational data. We further discuss necessary causal assumptions and associated limitations to causal inference. Last, we provide practical recommendations to facilitate implementation of causal inference from observational data in the context of the animal sciences.
Article
The objective of this study was to evaluate the causal relationship between technological innovation and sheep farm's results, based on a Structural Equation Modeling Approach (SEM) in dairy sheep systems in the center of Spain. Different from traditional multiple-trait models, SEM analysis allows assessment of potential causal interrelationships among outcomes and can effectively discriminate effects. Information from 157 dairy sheep farms in Castilla La Mancha was used. The questionnaires included 38 technological innovations and 188 questions on productive, economic and social data. Four hypotheses were formulated oriented to understand how the farm's technological innovation will affect the productive structure and farm's performance. The results derived from the SEM analysis showed a positive relationship between the technological indicator and the farm's structure, productivity, and economic results. The variable technological adoption could be regarded as a predictable measure of structure, productivity, and economic performance. Technology is associated with the productive structure. Independent of sheep farms' size, dairy sheep farms can be positioned in the growing returns area as a consequence of a proper use of it. SEM approach to observational data in the context of dairy sheep system suggests that there is not a single optimal structure. The model built constitutes a tool of great utility to make decisions, as it allows predicting the impact of technologies on final results ex-ante.
Conference Paper
In this study, which uses Türkiye's TIMSS-2015 data, the aim was to examine the effect of classroom teachers' education levels and job satisfaction on the academic achievement of 4th-grade primary school students in mathematics and science. A correlational survey model, one of the quantitative research models, was used, and secondary data obtained from the TIMSS-2015 assessment were analyzed. The sample consists of the 6456 4th-grade primary school students from Türkiye who participated in TIMSS-2015, which uses a two-stage stratified sampling method. The data were obtained from the website of the International Association for the Evaluation of Educational Achievement (IEA). Multilevel Structural Equation Modeling (MSEM) was used to analyze the data. According to the findings, an increase in classroom teachers' education levels can be said to positively affect students' mathematics and science achievement scores. Primary school teachers' job satisfaction did not cause variability in students' mathematics achievement scores, whereas students' science achievement scores were negatively affected as teachers' job satisfaction decreased. The study concluded that increasing the number of classroom teachers with advanced academic degrees could positively affect students' mathematics performance, raise the quality of mathematics education, and help Türkiye rank higher in mathematics in international assessments, and that raising teachers' job satisfaction could positively affect students' science achievement. The results are relatively consistent with the related literature. Recommendations are offered based on the results of the study.
Conference Paper
In this study, which uses Türkiye's TIMSS-2015 data, the aim was to examine the effect of the education levels of 8th-grade mathematics and science teachers on the academic achievement of 8th-grade students in mathematics and science. A correlational survey model, one of the quantitative research models, was used, and secondary data obtained from the TIMSS-2015 assessment were analyzed. The sample consists of the 6079 8th-grade students from Türkiye who participated in TIMSS-2015, which uses a two-stage stratified sampling method. The data were obtained from the website of the International Association for the Evaluation of Educational Achievement (IEA). Multilevel Structural Equation Modeling (MSEM) was used to analyze the data. According to the findings, an increase in the education level of 8th-grade mathematics teachers can be said to positively affect students' mathematics achievement scores, whereas the education level of 8th-grade science teachers did not cause variability in students' science achievement scores. The study concluded that increasing the number of mathematics teachers with advanced academic degrees could positively affect students' academic performance, raise the quality of mathematics education, and help Türkiye rank higher in mathematics in international assessments. It also concluded that either the postgraduate education received by science teachers is not of a kind that raises students' academic achievement, or science teachers are unable to carry over what they gained in postgraduate education to their students' academic achievement. The results are relatively consistent with the related literature. Based on the results of the study, suggestions concerning the improvement of students' academic achievement are given.
Article
Structural equation models (SEM) are a type of multi-trait model increasingly being used for inferring functional relationships between multiple outcomes using operational data from livestock production systems. These data often present a hierarchical architecture given by clustering of observations at multiple levels including animals, cohorts and farms. A hierarchical data architecture introduces correlation patterns that, if ignored, can have detrimental effects on parameter estimation and inference. Here, we evaluate the inferential implications of accounting for, or conversely, misspecifying data architecture in the context of SEM. Motivated by beef cattle feedlot data, we designed simulation scenarios consisting of multiple responses in a clustered architecture. Competing fitted SEMs differed in their model specification so that data architecture was explicitly accounted for (M1; true model) or misspecified due to disregarding either the cluster-level correlation between responses (M2) or the correlation between observations of a response within a cluster (M3), or ignored all together (M4). Model fit was increasingly impaired when data architecture was misspecified or ignored. Both accuracy and precision of estimation were also negatively affected when data architecture was disregarded. Our findings are further illustrated using data from feedlot operations from the US Great Plains. Standing statistical recommendations that call for proper model specification capturing relevant hierarchical levels in data structure extend to the multivariate context of structural equation modeling.
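The within-cluster correlation that this abstract warns about can be made concrete with a single-response sketch: observations that share a cluster-level random effect are correlated, and the strength of that correlation is the intraclass correlation (ICC). The study itself simulated multiple responses under competing SEMs, so the code below is a deliberately simplified illustration with hypothetical variance values, using the standard one-way ANOVA estimator of the ICC for a balanced design.

```python
import random
import statistics


def simulate_clustered(n_clusters, n_per_cluster, sd_cluster, sd_resid, seed=0):
    """Return a list of clusters; each observation is cluster effect + residual."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, sd_cluster)  # effect shared by the whole cluster
        data.append([u + rng.gauss(0.0, sd_resid) for _ in range(n_per_cluster)])
    return data


def anova_icc(data):
    """One-way ANOVA estimator of the intraclass correlation (balanced design)."""
    n, k = len(data), len(data[0])
    grand = statistics.fmean(x for row in data for x in row)
    means = [statistics.fmean(row) for row in data]
    # Between-cluster and within-cluster mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(data, means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Equal cluster and residual variances imply a true ICC of 0.5.
icc = anova_icc(simulate_clustered(200, 10, sd_cluster=1.0, sd_resid=1.0))
print(f"estimated ICC: {icc:.2f}")
```

A model that drops the cluster effect treats these observations as independent even though, in this setup, about half of their variance is shared within clusters.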
Conference Paper
Phenotypic data on 30 production traits of 385 quails from two lines were modeled to predict total egg production (TEP). Prediction models included linear regression and artificial neural networks (ANN). Bayesian networks and a stepwise approach were applied as variable selection methods. The learned structures for the two lines show that partial egg production is the only variable in TEP’s Markov blanket, which implies expected independence from the other traits considered in these sets. Furthermore, even if no causal interpretation is projected on the output, such data-driven analysis is useful for verifying whether the statistical consequences of the recovered graph are consistent with prior biological beliefs about the system. The best predictive model was ANN after feature selection, with maximum prediction accuracies of 0.79 and 0.71 for lines 1 and 2, respectively. In conclusion, for prediction of TEP, a partial egg production measurement is necessary. Keywords: Phenotype prediction, networks
Article
Knowledge regarding causal relationships among traits is important to understand complex biological systems. Structural equation models (SEM) can be used to quantify the causal relations between traits, which allow prediction of outcomes to interventions applied to such a network. Such models are fitted conditionally on a causal structure among traits, represented by a directed acyclic graph and an Inductive Causation (IC) algorithm can be used to search for causal structures. The aim of this study was to explore the space of causal structures involving bovine milk fatty acids and to select a network supported by data as the structure of a SEM. The IC algorithm adapted to mixed models settings was applied to study 14 correlated bovine milk fatty acids, resulting in an undirected network. The undirected pathway from C4:0 to C12:0 resembled the de novo synthesis pathway of short and medium chain saturated fatty acids. By using prior knowledge, directions were assigned to that part of the network and the resulting structure was used to fit a SEM that led to structural coefficients ranging from 0.85 to 1.05. The deviance information criterion indicated that the SEM was more plausible than the multi-trait model. The IC algorithm output pointed towards causal relations between the studied traits. This changed the focus from marginal associations between traits to direct relationships, thus towards relationships that may result in changes when external interventions are applied. The causal structure can give more insight into underlying mechanisms and the SEM can predict conditional changes due to such interventions.
Article
We investigate two approaches to increase the efficiency of phenotypic prediction from genome-wide markers, which is a key step for genomic selection (GS) in plant and animal breeding. The first approach is feature selection based on Markov blankets, which provide a theoretically-sound framework for identifying non-informative markers. Fitting GS models using only the informative markers results in simpler models, which may allow cost savings from reduced genotyping. We show that this is accompanied by no loss, and possibly a small gain, in predictive power for four GS models: partial least squares (PLS), ridge regression, LASSO and elastic net. The second approach is the choice of kinship coefficients for genomic best linear unbiased prediction (GBLUP). We compare kinships based on different combinations of centring and scaling of marker genotypes, and a newly proposed kinship measure that adjusts for linkage disequilibrium (LD). We illustrate the use of both approaches and examine their performances using three real-world data sets with continuous phenotypic traits from plant and animal genetics. We find that elastic net with feature selection and GBLUP using LD-adjusted kinships performed similarly well, and were the best-performing methods in our study.
Article
In this paper, we suggest ways to improve mediation analysis practice among consumer behavior researchers. We review the current methodology and demonstrate the superiority of structural equations modeling, both for assessing the classic mediation questions and for enabling researchers to extend beyond these basic inquiries. A series of simulations are presented to support the claim that the approach is superior. In addition to statistical demonstrations, logical arguments are presented, particularly regarding the introduction of a fourth construct into the mediation system. We close the paper with new prescriptive instructions for mediation analyses.
Article
Objective: To evaluate associations between economic and performance outcomes with the number of treatments after an initial diagnosis of bovine respiratory disease (BRD) in commercial feedlot cattle. Animals: 212,867 cattle arriving in a Midwestern feedlot between 2001 and 2006. Procedures: An economic model was created to estimate net returns. Generalized linear mixed models were used to determine associations between the frequency of BRD treatments and other demographic variables with economic and performance outcomes. Results: Net returns decreased with increasing number of treatments for BRD. However, the magnitude depended on the season during which cattle arrived at the feedlot, with significantly higher returns for cattle arriving during fall and summer than for cattle arriving during winter and spring. For fall arrivals, there were higher mean net returns for cattle that were never treated ($39.41) than for cattle treated once ($29.49), twice ($16.56), or ≥ 3 times (-$33.00). For summer arrivals, there were higher least squares mean net returns for cattle that were never treated ($31.83) than for cattle treated once ($20.22), twice ($6.37), or ≥ 3 times (-$42.56). Carcass traits pertaining to weight and quality grade were deemed responsible for differences in net returns among cattle receiving different numbers of treatments after an initial diagnosis of BRD. Conclusions and clinical relevance: Differences in economic net returns and performance outcomes for feedlot cattle were determined on the basis of number of treatments after an initial diagnosis of BRD; the analysis accounted for the season of arrival, sex, and weight class.
Article
Data regularly recorded in commercial herds have been used extensively for estimation of disease incidence rates, for inferences regarding genetic and phenotypic associations between traits, or for developing predictive models for economically important traits. Some studies have also used field data to investigate potential causal relationships between variables. However, inferring causal effects from observational data is complex due to potential confounding effects, and careful analyses using specific statistical and data mining techniques, as well as different sets of assumptions, are required. Nonetheless, although virtually unknown in the agricultural research community, such methods are available and have been used in many other fields. In this paper, we review and discuss the analysis of observational data using field-recorded information, and its potential utility in the study of causal effects in livestock. It is our postulation that there is much to be learned from such data, which can be used either to explicitly investigate causal relationships between variables or to generate hypotheses for further investigation using controlled experiments or additional field-recorded data.
Article
This study examined perceived coping (perceived problem-solving ability and progress in coping with problems) as a mediator between adult attachment (anxiety and avoidance) and psychological distress (depression, hopelessness, anxiety, anger, and interpersonal problems). Survey data from 515 undergraduate students were analyzed using structural equation modeling. Results indicated that perceived coping fully mediated the relationship between attachment anxiety and psychological distress and partially mediated the relationship between attachment avoidance and psychological distress. These findings suggest not only that it is important to consider attachment anxiety or avoidance in understanding distress but also that perceived coping plays an important role in these relationships. Implications for these more complex relations are discussed for both counseling interventions and further research. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Recent work with graphical methods for inductive causal inference with observational econometric data is reviewed and compared with earlier work. Two alternative algorithms are described. Caveats on applications are discussed. Keywords: Machine learning, directed acyclic graphs, Markov condition, causality, econometrics
Article
Body weight loss during transport or shrink (SHK) is a common occurrence in feeder cattle that results from a physiological, complex process. Previous studies have assessed the effects of environmental and dietary stressors on transport-associated BW loss; however, data on associations between shrink and subsequent health and performance parameters in feeder cattle are limited. Operational data from 13 U.S. commercial feedlots (n = 16,590 cattle cohorts) were used to quantify how SHK was associated with bovine respiratory disease (BRD) morbidity and overall mortality risks, HCW and ADG in feeder cattle cohorts arriving to feedlots during 2000 to 2008. Multivariable mixed-effects negative binomial and linear regression models were employed to determine these associations while accounting for other cohort-level demographic variables. The median SHK among the study cohorts was 3.0% with a mean (± SEM) of 2.4 ± 0.02%. The mean (± SEM) cumulative BRD morbidity was 10.0% ± 0.09% (median = 5.8%; range 0 to 100%) and the mean (± SEM) overall cumulative mortality was 1.3% ± 0.01% (median = 0.9%; range: 0 to 25.6%). The mean and median number of days on feed of cohorts experiencing initial BRD cases was 143 and 150 d (range = 23 to 288 d). The effects of SHK were significantly (P < 0.05) associated with BRD morbidity, overall mortality, HCW and ADG, and these effects were significantly (P < 0.05) modified by gender, season and mean arrival BW of the cohort. Combining data on BW loss during transport with cohort demographics could allow a more precise prediction of health and performance of feedlot cattle.
Article
Phenotypic traits may exert causal effects between them. For example, on the one hand, high yield in dairy cows may increase the liability to certain diseases and, on the other hand, the incidence of a disease may affect yield negatively. Likewise, the transcriptome may be a function of the reproductive status in mammals and the latter may depend on other physiological variables. Knowledge of phenotype networks describing such interrelationships can be used to predict the behavior of complex systems, e.g. biological pathways underlying complex traits such as diseases, growth and reproduction. Structural Equation Models (SEM) can be used to study recursive and simultaneous relationships among phenotypes in multivariate systems such as genetical genomics, system biology, and multiple trait models in quantitative genetics. Hence, SEM can produce an interpretation of relationships among traits which differs from that obtained with traditional multiple trait models, in which all relationships are represented by symmetric linear associations among random variables, such as covariances and correlations. In this review, we discuss the application of SEM and related techniques for the study of multiple phenotypes. Two basic scenarios are considered, one pertaining to genetical genomics studies, in which QTL or molecular marker information is used to facilitate causal inference, and another related to quantitative genetic analysis in livestock, in which only phenotypic and pedigree information is available. Advantages and limitations of SEM compared to traditional approaches commonly used for the analysis of multiple traits, as well as some indication of future research in this area are presented in a concluding section.
Article
Biology is characterized by complex interactions between phenotypes, such as recursive and simultaneous relationships between substrates and enzymes in biochemical systems. Structural equation models (SEMs) can be used to study such relationships in multivariate analyses, e.g., with multiple traits in a quantitative genetics context. Nonetheless, the number of different recursive causal structures that can be used for fitting a SEM to multivariate data can be huge, even when only a few traits are considered. In recent applications of SEMs in mixed-model quantitative genetics settings, causal structures were preselected on the basis of prior biological knowledge alone. Therefore, the wide range of possible causal structures has not been properly explored. Alternatively, causal structure spaces can be explored using algorithms that, using data-driven evidence, can search for structures that are compatible with the joint distribution of the variables under study. However, the search cannot be performed directly on the joint distribution of the phenotypes as it is possibly confounded by genetic covariance among traits. In this article we propose to search for recursive causal structures among phenotypes using the inductive causation (IC) algorithm after adjusting the data for genetic effects. A standard multiple-trait model is fitted using Bayesian methods to obtain a posterior covariance matrix of phenotypes conditional to unobservable additive genetic effects, which is then used as input for the IC algorithm. As an illustrative example, the proposed methodology was applied to simulated data related to multiple traits measured on a set of inbred lines.
Article
Recent criticism of epidemiologic methods has focused on the limitations of ‘black box’ epidemiology, a pejorative label given to the simple identification of exposure–disease relationships. The assessment of mediation is an important tool for addressing this criticism. By using mediation analysis to open the black box, underlying mechanisms of the observed associations can be described and causal inference improved. An explicit theoretical motivation for such an analysis has been missing from the epidemiological literature. To provide this motivation, we integrate literature from epidemiology and other social sciences to describe the reasons that an investigator might want to assess mediation. We then describe the connections between these reasons and specific measures of indirect and direct effects that have been previously described.
Article
Structural equation models (SEMs) of a recursive type with heterogeneous structural coefficients were used to explore biological relationships between gestation length (GL), calving difficulty (CD), and perinatal mortality, also known as stillbirth (SB), in cattle, with the last two traits having categorical expression. An acyclic model was assumed, where recursive effects existed from the GL phenotype to the liabilities (latent variables) to CD and SB and from the liability to CD to that of SB considering four periods regarding GL. The data contained GL, CD, and SB records from 90,393 primiparous cows, sired by 1122 bulls, distributed over 935 herd-calving year classes. Low genetic correlations between GL and the other calving traits were found, whereas the liabilities to CD and SB were high and positively correlated, genetically. The model indicated that gestations of approximately 274 days of length (3 days shorter than the average) would lead to the lowest CD and SB and confirmed the existence of an intermediate optimum of GL with respect to these traits.
Article
Generalized linear mixed models were developed using retrospective feedlot data collected on individually treated cattle (n = 31,131) to determine whether cattle performance and health outcomes in feedlot cattle were associated with timing of treatment for bovine respiratory disease (BRD) during the feeding phase. Cattle that died at any point during the feeding phase were removed from the analysis. Information on individual animal performance (ADG, HCW, quality grade, yield grade) and health outcomes (treatments) were incorporated into an economic model that generated a standardized net return estimate for each animal. Prices were standardized to minimize variation between economic outcomes due to market conditions allowing direct comparisons of health and performance effects between animals. While controlling for sex, risk code, and arrival BW class, potential associations between net returns and the timing of BRD identification were investigated using 2 categorical variables created to measure time: 1) weeks on feed at initial BRD treatment, and 2) weeks from BRD treatment to slaughter. The first model using net return as the outcome identified an interaction between weeks on feed at initial BRD treatment and animal arrival BW. Cattle with arrival BW between 227 and 272 kg (5WT) and 273 and 318 kg (6WT) displayed decreased net returns (P < 0.05) if treated during wk 1 as compared with subsequent weeks in the first month of the feeding phase. The cattle with BW between 319 and 363 kg (7WT) and 364 and 408 kg (8WT) exhibited decreased net returns (P < 0.05) if treated during the later weeks of the feeding phase compared with earlier in the feeding phase. The number of times cattle were treated contributed to variation in net returns for the 5WT and 6WT cattle. For the 7WT and 8WT cattle, HCW was the main factor contributing to decreased net returns when cattle were treated late in the feeding phase. 
The second model identified an interaction between weeks from BRD treatment to slaughter and arrival BW. The 181 to 226 kg of BW, 5WT, 6WT, 7WT, and 8WT cattle all exhibited decreased net returns (P < 0.05) when cattle were on feed fewer weeks from BRD treatment to slaughter. Cattle with more weeks on feed between BRD treatment and slaughter had greater HCW, decreased ADG, and more total treatments compared with cattle treated closer to slaughter. This research indicates that timing of initial BRD treatment is associated with performance and health outcomes.
Article
The impact of respiratory disease during a 150-d feedlot finishing period on daily gain, carcass traits, and longissimus tenderness was measured using 204 steer calves. Feedlot health status was monitored in two ways. First, clinical signs of respiratory infection were evaluated each day; treatment with antibiotic was based on degree of fever (if rectal temperature exceeded 40 degrees C then calves were treated). Steers that were treated (n = 102) had lower (P<.05) final live weights, ADG, hot carcass weights (HCW), less external and internal fat, and more desirable yield grades. Steers that were treated had a higher prevalence of carcasses that graded U.S. Standard than steers that were never treated. Second, as an alternative index of health status, lungs of all steers were evaluated at the processing plant using a respiratory tract lesion classification system; this health index included presence or absence of preexisting pneumonic lesions in the anterioventral lobes plus activity of the bronchial lymph nodes (inactive vs active). Lung lesions were present in 33% of all lungs and were distributed almost equally between treated (37%) and untreated cattle (29%). Steers with lesions (n = 87) had lower (P<.05) daily gains, lighter HCW, less internal fat, and lower marbling scores than steers without lesions. Compared to steers with lesions but inactive bronchial lymph nodes (n = 78), steers with lung lesions plus active lymph nodes had lower (P<.01) ADG and dressing percentage. Longissimus shear force values for steaks aged 7 d were lower (P = .05) from steers without lung lesions than those for steaks from steers with lung lesions. Overall, morbidity suppressed daily gains and increased the percentage of U.S. Standard carcasses. 
Compared to health assessment by clinical appraisal (based on elevated body temperature), classification based on respiratory tract lesions at slaughter proved more reliable statistically and, thereby, more predictive of adverse effects of morbidity on production and meat tenderness.
Article
Several clostridial vaccines are currently being used in the beef cattle industry. Of greatest concern is altering the location and route of administration of these vaccines to reduce injection-site lesions while maintaining seroconversion. Two experiments were conducted to determine the effect of clostridial vaccines and injection sites on the performance, feeding behavior, and lesion size scores of beef steers. In Exp. 1, 80 crossbred beef steers (BW 237 ± 3.2 kg) were allotted randomly into five groups and given 14 d to adapt to the feed and individual feed intake-monitoring devices (Pinpointer devices) before starting the study. Each group was assigned randomly to one of the following vaccination treatments: 1) control (sterile saline water), 2) Alpha-7 Ear (A7E), 3) Alpha-7 Prescapula (A7P), 4) Vision-7 Prescapula (V7P), and 5) Ultrabac-7 Prescapula (U7P). All vaccines were injected s.c. in the ear or prescapular region, and injection sites were palpated on d 0 and 28 (Exp. 1) and on d 63 and 91 (Exp. 2). The protocol for Exp. 2 was exactly the same as for Exp. 1 except treatments included control, A7P, Alpha-CD Ear (ACDE), Alpha-CD Prescapula (ACDP), Fortress-7 Prescapula (F7P), and V7P. Also, control and steers receiving F7P and V7P were revaccinated on d 63 and palpated on d 91. Results of Exp. 1 indicated that the A7E and U7P steers had a feed intake lower (P < 0.01) than all other treatment groups. The ADG of the A7P and A7E steers were not different (P > 0.05) from those of the control steers. The gain:feed ratio of the A7E steers was 41% higher (P < 0.01) than that of the V7P steers (Exp. 1). The results of Exp. 2 indicated that the control, ACDP, and V7P steers had greater (P < 0.01) ADG than all other treatment groups, but the gain:feed ratios were not different (P > 0.05) among all treatment groups. Lesion sizes differed by vaccine and injection site in both experiments. These data suggest that vaccinating beef steers s.c. in the ear produced gain:feed ratios and lesion size scores that were similar to prescapular vaccinations. However, more research is required to determine the immune response of vaccinating cattle in the ear.
Article
Multivariate models are of great importance in theoretical and applied quantitative genetics. We extend quantitative genetic theory to accommodate situations in which there is linear feedback or recursiveness between the phenotypes involved in a multivariate system, assuming an infinitesimal, additive, model of inheritance. It is shown that structural parameters defining a simultaneous or recursive system have a bearing on the interpretation of quantitative genetic parameter estimates (e.g., heritability, offspring-parent regression, genetic correlation) when such features are ignored. Matrix representations are given for treating a plethora of feedback-recursive situations. The likelihood function is derived, assuming multivariate normality, and results from econometric theory for parameter identification are adapted to a quantitative genetic setting. A Bayesian treatment with a Markov chain Monte Carlo implementation is suggested for inference and developed. When the system is fully recursive, all conditional posterior distributions are in closed form, so Gibbs sampling is straightforward. If there is feedback, a Metropolis step may be embedded for sampling the structural parameters, since their conditional distributions are unknown. Extensions of the model to discrete random variables and to nonlinear relationships between phenotypes are discussed.
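As a minimal illustration of the fully recursive case described above, the following Python sketch computes direct, indirect, and overall effects among three traits from a matrix of structural coefficients. The coefficient values, the trait ordering, and the function names (`mat_mul`, `total_effects`) are hypothetical and are not taken from the article. The sketch assumes a linear system y = Λy + e with no feedback, so that Λ is strictly lower triangular and the overall effects are the finite sum Λ + Λ² + ... + Λ^(n-1).

```python
# Sketch only: hypothetical structural coefficients for a recursive 3-trait system.
# LAMBDA[i][j] is the direct effect of trait j on trait i (0-based indices).


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def total_effects(lam):
    """Overall (direct + indirect) effects for a recursive system.

    For a strictly lower-triangular n x n matrix, lam^n is zero, so the
    overall effects are lam + lam^2 + ... + lam^(n-1).
    """
    n = len(lam)
    total = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in lam]  # starts at lam^1
    for _ in range(n - 1):
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
        power = mat_mul(power, lam)  # advance to the next power of lam
    return total


# Hypothetical values: trait 0 affects traits 1 and 2; trait 1 affects trait 2.
LAMBDA = [
    [0.0, 0.0, 0.0],
    [-0.5, 0.0, 0.0],   # direct effect of trait 0 on trait 1
    [0.2, -0.1, 0.0],   # direct effects of traits 0 and 1 on trait 2
]

effects = total_effects(LAMBDA)
direct = LAMBDA[2][0]                    # trait 0 -> trait 2
indirect = LAMBDA[2][1] * LAMBDA[1][0]   # trait 0 -> trait 1 -> trait 2
overall = effects[2][0]                  # direct + indirect
print(direct, indirect, overall)
```

When the system includes feedback, the sum of powers no longer terminates, and the reduced form would instead be obtained from (I − Λ)⁻¹, provided that inverse exists.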
Article
An analysis of litter size and average piglet weight at birth in Landrace and Yorkshire using a standard two-trait mixed model (SMM) and a recursive mixed model (RMM) is presented. The RMM establishes a one-way link from litter size to average piglet weight. It is shown that there is a one-to-one correspondence between the parameters of SMM and RMM and that they generate equivalent likelihoods. As parameterized in this work, the RMM tests for the presence of a recursive relationship between additive genetic values, permanent environmental effects, and specific environmental effects of litter size, on average piglet weight. The equivalent standard mixed model tests whether or not the covariance matrices of the random effects have a diagonal structure. In Landrace, posterior predictive model checking supports a model without any form of recursion or, alternatively, a SMM with diagonal covariance matrices of the three random effects. In Yorkshire, the same criterion favors a model with recursion at the level of specific environmental effects only, or, in terms of the SMM, the association between traits is shown to be exclusively due to an environmental (negative) correlation. It is argued that the choice between a SMM or a RMM should be guided by the availability of software, by ease of interpretation, or by the need to test a particular theory or hypothesis that may best be formulated under one parameterization and not the other.
Article
The incidence of initial respiratory disease was followed for 12 weeks in 122 pens of feedlot cattle, based on producer-collected daily morbidity counts. Weekly incidence density was calculated based on the number of new cases and the population at risk. Incidence density was greatest in the 1st week after arrival and decreased in following weeks. Weekly incidence rate varied between pens and over time from 0 to 27.7 cases per 100 animal weeks at risk. A negative binomial model controlling for multiple events within pens and over time was used to model effects on the number of new cases. Mixed gender groups, cattle from multiple sources and increasing distance shipped were associated with increased risk for initial respiratory morbidity. Heavier entry weight was associated with decreased morbidity risk. These factors may be useful in categorizing groups of calves into risk groups for targeted purchase and management decision making.
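The incidence density described above can be sketched in Python as follows. The function names and pen numbers are hypothetical. The sketch makes two simplifying assumptions: every animal still at risk at the start of a week contributes one full animal-week, and animals leave the at-risk population after their first case. The study's actual calculation may have differed.

```python
def incidence_density(new_cases, animal_weeks_at_risk, per=100.0):
    """Return new cases per `per` animal-weeks at risk."""
    if animal_weeks_at_risk <= 0:
        raise ValueError("animal_weeks_at_risk must be positive")
    return per * new_cases / animal_weeks_at_risk


def weekly_incidence_density(initial_head_count, weekly_new_cases):
    """Weekly incidence density for initial (first) cases in one pen.

    Assumption: each animal at risk at the start of a week contributes one
    animal-week, and animals are removed from the at-risk pool after a case.
    """
    at_risk = initial_head_count
    densities = []
    for cases in weekly_new_cases:
        densities.append(incidence_density(cases, at_risk))
        at_risk -= cases  # treated animals are no longer at risk of a first case
    return densities


# Hypothetical pen: 100 head on arrival, new cases recorded for three weeks.
densities = weekly_incidence_density(100, [10, 5, 0])
print(densities)
```

In this hypothetical pen the first week has 10 new cases among 100 animal-weeks at risk, and the denominator shrinks in later weeks as treated animals leave the at-risk population.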
Article
Meat quality is one of the most important traits determining carcass price in the Japanese beef market. Optimized breeding goals and management practices for the improvement of meat quality traits require knowledge regarding any potential functional relationships between them. In this context, the objective of this research was to infer phenotypic causal networks involving beef marbling score (BMS), beef color score (BCL), firmness of beef (FIR), texture of beef (TEX), beef fat color score (BFS), and the ratio of MUFA to SFA (MUS) from 11,855 Japanese Black cattle. The inductive causation (IC) algorithm was implemented to search for causal links among these traits and was applied to their joint distribution conditional on genetic effects. This information was obtained from the posterior distribution of the residual (co)variance matrix of a standard Bayesian multiple trait model (MTM). Apart from BFS, the IC algorithm implemented with 95% highest posterior density (HPD) intervals detected only undirected links among the traits. However, as a result of the application of 80% HPD intervals, more links were recovered and the undirected links were changed into directed ones, except between FIR and TEX. Therefore, 2 competing causal networks resulting from the IC algorithm, with either the arrow FIR → TEX or the arrow FIR ← TEX, were fitted using a structural equation model to infer causal structure coefficients between the selected traits. Results indicated similar genetic and residual variances as well as genetic correlation estimates from both structural equation models. The genetic variances in BMS, FIR, and TEX from the structural equation models were smaller than those obtained from the MTM. In contrast, the variances in BCL, BFS, and MUS, which were not conditioned on any of the other traits in the causal structures, had no significant differences between the structural equation model and MTM. 
The structural coefficient for the path from MUS (BCL) to BMS showed that a 1-unit improvement in MUS (BCL) resulted in an increase of 0.85 or 1.45 (a decrease of 0.52 or 0.54) in BMS in the causal structures. The analysis revealed some interesting functional relationships, direct genetic effects, and the magnitude of the causal effects between these traits, for example, indicating that BMS would be affected by interventions on MUS and BCL. In addition, if interventions existed in this scenario, a breeding strategy based only on the MTM would lead to a mistaken selection for BMS. © 2016 American Society of Animal Science. All rights reserved.
Chapter
In this chapter, we provide a brief introduction about graphical models, with an emphasis on Bayesian networks, and discuss some of their applications in genetics and genomics studies with agricultural and livestock species. First, some key definitions regarding stochastic graphical models are provided, as well as basic principles of inference related to graphical structure and model parameters. Next is a discussion of some examples of applications, which include prediction of complex traits using genomic information or other correlated traits as well as the investigation of the flow of information from DNA polymorphisms to endpoint phenotypes, including intermediate phenotypes such as gene expression. A first example with prediction refers to the forecasting of total egg production in quails using early expressed traits (such as weekly body weight, partial egg production, and egg quality traits) as explanatory variables to support decision making (e.g., earlier culling decisions) in production/breeding systems. An additional example uses genomic information for the estimation of genetic merit of selection candidates for genetic improvement of economically important traits. An example with causal inference deals with the network underlying carcass fat deposition and muscularity in pigs by jointly modeling phenotypic, genotypic, and transcriptomic data. Some additional applications of Bayesian networks and other graphical model techniques are highlighted as well, including multitrait quantitative trait loci (QTL) analysis and structural equation models with latent variables. It is shown that graphical models such as Bayesian networks offer a powerful and insightful approach both for prediction and for causal inference, with a myriad of applications in the areas of genetics and genomics, and the study of complex phenotypic traits in agriculture.
Article
Meat quality is one of the most important traits determining carcass price in the Japanese beef market. Optimized breeding goals and management practices for the improvement of meat quality traits requires knowledge regarding any potential functional relationships between them. In this context, the objective of this research was to infer phenotypic causal networks involving beef marbling score (BMS), beef color score (BCL), firmness of beef (FIR), texture of beef (TEX), beef fat color score (BFS), and the ratio of MUFA to SFA (MUS) from 11,855 Japanese Black cattle. The inductive causation (IC) algorithm was implemented to search for causal links among these traits and was conditionally applied to their joint distribution on genetic effects. This information was obtained from the posterior distribution of the residual (co)variance matrix of a standard Bayesian multiple trait model (MTM). Apart from BFS, the IC algorithm implemented with 95% highest posterior density (HPD) intervals detected only undirected links among the traits. However, as a result of the application of 80% HPD intervals, more links were recovered and the undirected links were changed into directed ones, except between FIR and TEX. Therefore, 2 competing causal networks resulting from the IC algorithm, with either the arrow FIR → TEX or the arrow FIR ← TEX, were fitted using a structural equation model () to infer causal structure coefficients between the selected traits. Results indicated similar genetic and residual variances as well as genetic correlation estimates from both structural equation models. The genetic variances in BMS, FIR, and TEX from the structural equation models were smaller than those obtained from the MTM. In contrast, the variances in BCL, BFS, and MUS, which were not conditioned on any of the other traits in the causal structures, had no significant differences between the structural equation model and MTM. 
The structural coefficient for the path from MUS (BCL) to BMS showed that a 1-unit improvement in MUS (BCL) resulted in an increase of 0.85 or 1.45 (a decrease of 0.52 or 0.54) in BMS in the causal structures. The analysis revealed some interesting functional relationships, direct genetic effects, and the magnitude of the causal effects between these traits, for example, indicating that BMS would be affected by interventions on MUS and BCL. In addition, if interventions existed in this scenario, a breeding strategy based only on the MTM would lead to a mistaken selection for BMS.
Article
Structural equation models (SEQM) can be used to model causal relationships between multiple variables in multivariate systems. Among the strengths of SEQM is its ability to consider causal links between latent variables. The use of latent variables allows modeling complex phenomena while reducing at the same time the dimensionality of the data. One relevant aspect in the quantitative genetics context is the possibility of correlated genetic effects influencing sets of variables under study. Under this scenario, if one aims at inferring causality among latent variables, genetic covariances act as confounders if ignored. Here we describe a methodology for assessing causal networks involving latent variables underlying complex phenotypic traits. The first step of the method consists of the construction of latent variables defined on the basis of prior knowledge and biological interest. These latent variables are jointly evaluated using confirmatory factor analysis. The estimated factor scores are then used as phenotypes for fitting a multivariate mixed model to obtain the covariance matrix of latent variables conditional on the genetic effects. Finally, causal relationships between the adjusted latent variables are evaluated using different SEQM with alternative causal specifications. We have applied this method to a data set with pigs for which several phenotypes were recorded over time. Five different latent variables were evaluated to explore causal links between growth, carcass, and meat quality traits. The measurement model, which included 5 latent variables capturing the information conveyed by 19 different phenotypic traits, showed an acceptable fit to data (e.g., χ2/df = 1.3, root-mean-square error of approximation = 0.028, standardized root-mean-square residual = 0.041). Causal links between latent variables were explored after removing genetic confounders. 
Interestingly, we found that both growth (-0.160) and carcass traits (-0.500) have a significant negative causal effect on quality traits (P-value ≤ 0.001). This result may have important implications for strategies for pig production improvement. More generally, the proposed method allows further learning regarding phenotypic causal structures underlying complex traits in farm species. © 2015 American Society of Animal Science. All rights reserved.
Article
Cattle feedlots are concentrated in the central plains of the USA near areas of high grain production and slaughter plants. Cattle that are brought into these feedlots are generally young animals from six months to twelve months of age. Average morbidity rates are about 8% and mortality rates are under 1% during the feeding period. Bovine Respiratory Disease (BRD) is the most common disease of feedlot cattle, causing about 75% of the morbidity and over 50% of the mortality. BRD is also known as "shipping fever" since it occurs soon after the cattle arrive in the feedlot and the stress from shipping is considered to be one of the major factors in producing the disease. Reducing the expense of this disease complex is considered to be very worthwhile. Many preventive steps can be taken to reduce the calves' exposure and minimize shipping stress.
Article
Background: Joint modeling and analysis of phenotypic, genotypic and transcriptomic data have the potential to uncover the genetic control of gene activity and phenotypic variation, as well as shed light on the manner and extent of connectedness among these variables. Current studies mainly report associations, i.e. undirected connections among variables without causal interpretation. Knowledge regarding causal relationships among genes and phenotypes can be used to predict the behavior of complex systems, as well as to optimize management practices and selection strategies. Here, we performed a multistep procedure for inferring causal networks underlying carcass fat deposition and muscularity in pigs using multi-omics data obtained from an F2 Duroc x Pietrain resource pig population. Results: We initially explored marginal associations between genotypes and phenotypic and expression traits through whole-genome scans, and then, in genomic regions with multiple significant hits, we assessed gene-phenotype network reconstruction using causal structural learning algorithms. One genomic region on SSC6 showed significant associations with three relevant phenotypes, off-midline 10th-rib backfat thickness, loin muscle weight, and average intramuscular fat percentage, and also with the expression of seven genes, including ZNF24, SSX2IP, and AKR7A2. The inferred network indicated that the genotype affects the three phenotypes mainly through the expression of several genes. Among the phenotypes, fat deposition traits negatively affected loin muscle weight. Conclusions: Our findings shed light on the antagonistic relationship between carcass fat deposition and lean meat content in pigs. In addition, the procedure described in this study has the potential to unravel gene-phenotype networks underlying complex phenotypes.
Article
Milk losses associated with mastitis can be attributed to either effects of pathogens per se (i.e., direct losses) or effects of the immune response triggered by intramammary infection (indirect losses). The distinction is important in terms of mastitis prevention and treatment. Regardless, the number of pathogens is often unknown (particularly in field studies), making it difficult to estimate direct losses, whereas indirect losses can be approximated by measuring the association between increased somatic cell count (SCC) and milk production. An alternative is to perform a mediation analysis in which changes in milk yield are allocated into their direct and indirect components. We applied this method on data for clinical mastitis, milk and SCC test-day recordings, results of bacteriological cultures (Escherichia coli, Staphylococcus aureus, Streptococcus uberis, coagulase-negative staphylococci, Streptococcus dysgalactiae, and streptococci other than Strep. dysgalactiae and Strep. uberis), and cow characteristics. Following a diagnosis of clinical mastitis, the cow was treated and changes (increase or decrease) in milk production before and after a diagnosis were interpreted counterfactually. On a daily basis, indirect changes, mediated by SCC increase, were significantly different from zero for all bacterial species, with a milk yield decrease (ranging among species from 4 to 33 g and mediated by an increase of 1000 SCC/mL/day) before and a daily milk increase (ranging among species from 2 to 12 g and mediated by a decrease of 1000 SCC/mL/day) after detection. Direct changes, not mediated by SCC, were only different from zero for coagulase-negative staphylococci before diagnosis (72 g per day). We concluded that mixed structural equation models were useful to estimate direct and indirect effects of the presence of clinical mastitis on milk yield. Copyright © 2015. Published by Elsevier B.V.
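The allocation of an overall change into direct and indirect components can be illustrated with a deliberately simplified linear sketch. It assumes a single mediator and the product-of-coefficients rule (indirect effect = effect of exposure on mediator × effect of mediator on outcome). This is not the counterfactual mixed model used in the study, and the function name and all numeric values below are hypothetical, chosen only to show the arithmetic.

```python
def decompose_effect(exposure_to_mediator: float,
                     mediator_to_outcome: float,
                     direct: float) -> dict:
    """Split an overall linear effect into direct and mediated parts.

    Assumes one mediator and linear, additive paths (an illustrative
    simplification, not the model fitted in the abstract above).
    """
    # Indirect path: exposure -> mediator -> outcome
    indirect = exposure_to_mediator * mediator_to_outcome
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Hypothetical values: the exposure raises the mediator by 2 units, each
# mediator unit lowers the outcome by 10 units, and the exposure lowers the
# outcome by a further 5 units through paths that bypass the mediator.
effects = decompose_effect(exposure_to_mediator=2.0,
                           mediator_to_outcome=-10.0,
                           direct=-5.0)
print(effects)
```

Under these assumed values the mediated component dominates the total, which is the kind of pattern a mediation analysis is meant to reveal.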
Article
A nationwide longitudinal study was conducted to investigate risk factors for bovine respiratory disease (BRD) in cattle in Australian feedlots. After induction (processing), cattle were placed in feedlot pens (cohorts) and monitored for occurrence of BRD over the first 50 days on feed. Data from a national cattle movement database were used to derive variables describing mixing of animals with cattle from other farms, numbers of animals in groups before arrival at the feedlot, exposure of animals to saleyards before arrival at the feedlot, and the timing and duration of the animal's move to the vicinity of the feedlot. Total and direct effects for each risk factor were estimated using a causal diagram-informed process to determine covariates to include in four-level Bayesian logistic regression models. Mixing, group size and timing of the animal's move to the feedlot were important predictors of BRD. Animals not mixed with cattle from other farms prior to 12 days before induction and then exposed to a high level of mixing (≥4 groups of animals mixed) had the highest risk of developing BRD (OR 3.7) compared to animals mixed at least 4 weeks before induction with less than 4 groups forming the cohort. Animals in groups formed at least 13 days before induction comprising 100 or more (OR 0.5) or 50 to 99 (OR 0.8) were at reduced risk compared to those in groups of less than 50 cattle. Animals moved to the vicinity of the feedlot at least 27 days before induction were at reduced risk (OR 0.4) compared to cattle undergoing short-haul transportation (<6 hours) to the feedlot within a day of induction, while those experiencing longer transportation durations (6 hours or more) within a day of induction were at slightly increased risk (OR 1.2). Knowledge of these risk factors could potentially be used to inform management decisions to reduce the risk of BRD in feedlot cattle.
Article
A dataset of test-day records, fertility traits, and one health trait including 1275 Brown Swiss cows kept in 46 small-scale organic farms was used to infer relationships among these traits based on recursive Gaussian-threshold models. Test-day records included milk yield (MY), protein percentage (PROT-%), fat percentage (FAT-%), somatic cell score (SCS), the ratio of FAT-% to PROT-% (FPR), lactose percentage (LAC-%), and milk urea nitrogen (MUN). Female fertility traits were defined as the interval from calving to first insemination (CTFS) and success of a first insemination (SFI), and the health trait was clinical mastitis (CM). First, a tri-trait model was used which postulated the recursive effect of a test-day observation in the early period of lactation on liability to CM (LCM), and further the recursive effect of LCM on the following test-day observation. For CM and female fertility traits, a bi-trait recursive Gaussian-threshold model was employed to estimate the effects from CM to CTFS and from CM on SFI. The recursive effects from CTFS and SFI onto CM were not relevant, because CM was recorded prior to the measurements for CTFS and SFI. Results show that the posterior heritability for LCM was 0.05, and for all other traits, heritability estimates were in reasonable ranges, each with a small posterior SD. Lowest heritability estimates were obtained for female reproduction traits, i.e. h² = 0.02 for SFI, and h² ≈ 0 for CTFS. Posterior estimates of genetic correlations between LCM and production traits (MY and MUN), and between LCM and somatic cell score (SCS), were large and positive (0.56-0.68). Results confirm the genetic antagonism between MY and LCM, and the suitability of SCS as an indicator trait for CM. Structural equation coefficients describe the impact of one trait on a second trait on the phenotypic pathway. Higher values for FAT-% and FPR were associated with a higher LCM.
The rate of change in FAT-% and in FPR in the ongoing lactation with respect to the previous LCM was close to zero. Estimated recursive effects between SCS and CM were positive, implying strong phenotypic impacts between both traits. Structural equation coefficients explained a detrimental impact of CM on female fertility traits CTFS and SFI. The cow-specific CM treatment had no significant impact on performance traits in the ongoing lactation. For most treatments, beta-lactam-antibiotics were used, but test-day SCS and production traits after the beta-lactam-treatment were comparable to those after other antibiotic as well as homeopathic treatments.
Article
The paper analyses the impact of a priori determinants of biosecurity behaviour of farmers in Great Britain. We use a dataset collected through a stratified telephone survey of 900 cattle and sheep farmers in Great Britain (400 in England and a further 250 in Wales and Scotland respectively) which took place between 25 March 2010 and 18 June 2010. The survey was stratified by farm type, farm size and region. To test the influence of a priori determinants on biosecurity behaviour we used a behavioural economics method, structural equation modelling (SEM) with observed and latent variables. SEM is a statistical technique for testing and estimating causal relationships amongst variables, some of which may be latent using a combination of statistical data and qualitative causal assumptions. Thirteen latent variables were identified and extracted, expressing the behaviour and the underlying determining factors. The variables were: experience, economic factors, organic certification of farm, membership in a cattle/sheep health scheme, perceived usefulness of biosecurity information sources, knowledge about biosecurity measures, perceived importance of specific biosecurity strategies, perceived effect (on farm business in the past five years) of welfare/health regulation, perceived effect of severe outbreaks of animal diseases, attitudes towards livestock biosecurity, attitudes towards animal welfare, influence on decision to apply biosecurity measures and biosecurity behaviour. The SEM model applied on the Great Britain sample has an adequate fit according to the measures of absolute, incremental and parsimonious fit. 
The results suggest that farmers' perceived importance of specific biosecurity strategies, organic certification of farm, knowledge about biosecurity measures, attitudes towards animal welfare, perceived usefulness of biosecurity information sources, perceived effect on business during the past five years of severe outbreaks of animal diseases, membership in a cattle/sheep health scheme, attitudes towards livestock biosecurity, influence on decision to apply biosecurity measures, experience and economic factors are significantly influencing behaviour (overall explaining 64% of the variance in behaviour). Three other models were run for the individual regions (England, Scotland and Wales). A smaller number of variables were included in each model to account for the smaller sample sizes. Results show lower but still high levels of variance explained for the individual models (about 40% for each country). The individual models' results are consistent with those of the total sample model. The results might suggest that ways to achieve behavioural change could include ensuring increased access of farmers to biosecurity information and advice sources.
Article
In dairy cattle, many farming practices have been associated with occurrence of mastitis but it is often difficult to disentangle the causal threads. Structural equation models may reduce the complexity of such situations. Here, we applied the method to examine the links between mastitis (subclinical and clinical) and risk factors such as herd demographics, housing conditions, feeding procedures, milking practices, and strategies of mastitis prevention and treatment in 345 dairy herds from the Walloon region of Belgium. During the period January 2006 to October 2007, up to 110 different herd management variables were recorded by two surveyors using a questionnaire for the farm managers and during a farm visit. Monthly somatic cell counts of all lactating cows were collected by the local dairy herd improvement association. Structural equation models were created to obtain a latent measure of mastitis and to reduce the complexity of the relationships between farming practices, between indicators of herd mastitis and between both. Robust maximum likelihood estimates were obtained for the effects of the herd management variables on the latent measure of herd mastitis. Variables associated directly (p<0.05) with the latent measure of herd mastitis were the addition of urea in the rations; the practices of machine stripping, of pre-and post-milking teat disinfection; the presence of cows with hyperkeratotic teats, of cubicles for housing and of dirty liners before milking; the treatment of subclinical cases of mastitis; and the age of the herd (latent variable for average age and parity of cows, and percentage of heifers in the herd). Treatment of subclinical mastitis was also an intermediate in the association between herd mastitis and post-milking teat disinfection. 
The study illustrates how a structural equation model provides information regarding the linear relationships between risk factors and a latent measure of mastitis, distinguishes between direct relationships and relationships mediated through intermediate risk factors, allows the construction of latent variables and tests the directional hypotheses proposed in the model.
Article
The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.
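The abstract above gives no formula. The criterion derived from this result is commonly written as BIC = -2 log L + k log n, where L is the maximised likelihood, k the number of free parameters and n the number of observations. The sketch below assumes that form; the candidate names, log-likelihoods and sample size are hypothetical.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion in its common form (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical fits: (maximised log-likelihood, number of free parameters)
candidates = {"small": (-520.0, 3), "large": (-515.0, 6)}
n_obs = 200  # hypothetical sample size

scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

With these assumed inputs the log n penalty outweighs the larger model's gain in fit, so the smaller model is preferred.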
Article
During 1983-85, 279 calves requiring treatment for bovine respiratory disease and 290 comparison (control) animals from 15 different groups of feedlot calves were bled on arrival and again at 28 days postarrival. Their sera were then analyzed for antibodies to seven putative respiratory pathogens. On arrival, the prevalences of indirect agglutination titers to Pasteurella haemolytica, P. haemolytica cytotoxin, Mycoplasma bovis and M. dispar were greater than 50%, the prevalence of titers to bovine virus diarrhea virus (BVDV) was approximately 40%, and the prevalences of titers to infectious bovine rhinotracheitis virus (IBRV), bovine respiratory syncytial virus (RSV) and parainfluenza virus type 3 (PIV3) were all below 25%. Seroconversion during the first month after arrival occurred in more than half the calves to P. haemolytica cytotoxin, PIV3 and RSV. Seroconversion of agglutination titers to P. haemolytica, Mycoplasma and BVDV occurred in about 40% of calves, and seroconversion to IBRV was infrequent (less than 5%). Initial titers were negatively correlated to subsequent titer changes within organism. Initial titers, and titer changes between organisms were essentially independent. Light calves had an increased risk of being selected for treatment for respiratory disease. Seroconversion to P. haemolytica cytotoxin, RSV and BVDV were predictive of respiratory disease cases, explaining approximately 69% of all respiratory disease cases in the feedlots. It was not possible to accurately predict weight gain or relapse from the serological data.
Article
To quantify the effects of treatment for clinical respiratory tract disease and pulmonary lesions identified at slaughter on rate of weight gain in feedlot cattle. Prospective longitudinal study. 469 feedlot steers. Clinical respiratory tract disease was monitored between birth and slaughter. Steers were weaned at approximately 6 months old and entered into the feedlot for a mean of 273 days. Mean daily weight gain (MDG) was monitored during the feeding period. Lungs were collected at slaughter and evaluated for gross lesions indicative of active or resolved pneumonia. Mean daily weight gain during the feeding period was 1.30 kg, and ranged from 1.16 to 1.46 kg within individual pens. Thirty-five percent of steers received treatment for respiratory tract disease between birth and slaughter, whereas 72% had pulmonary lesions evident at slaughter. Among steers treated for clinical respiratory tract disease, 78% had pulmonary lesions, whereas 68% of untreated steers had pulmonary lesions. Pulmonary lesions at slaughter were associated (P < 0.01) with a 0.076-kg reduction in MDG during the feeding period. Treatment for clinical disease was not associated with MDG after adjustment for the effect of pulmonary lesions. Treatment of clinically affected feedlot cattle may be inadequate to prevent significant production losses attributable to respiratory tract disease.
Article
The data presented consistently demonstrated the cost of BRD from weaning to the packers to be approximately 7% of the total production cost when compared to animals with healthy respiratory tracts. As a clinician it is distressing to feel that we cannot accurately identify all the animals that require treatment. Greater emphasis must be placed on prevention of BRD.
Article
Morbidity and mortality of feedlot cattle have a variety of causes. Compared to respiratory disease, metabolic and digestive disorders generally are less prevalent and occur later in the feeding period. In addition to the obvious costs related to animal death and medication, subsequent performance of sick cattle often is depressed substantially. Closer coordination between veterinarians, nutritionists, and feedlot managers should help reduce the incidence of morbidity and mortality of feedlot cattle.
Article
Sources of variation in measures of reproductive performance in dairy cattle were evaluated using data collected from 3207 lactations in 1570 cows in 50 herds from five geographic regions of Reunion Island (located off the east coast of Madagascar). Three continuously distributed reproductive parameters (intervals from calving-to-conception, calving-to-first-service and first-service-to-conception) were considered, along with one Binomial outcome (first-service-conception risk). Multilevel models which take into account the hierarchical nature of the data were used to fit all models. For the overall measure of calving-to-conception interval, 86% of the variation resided at the lactation level with only 7, 6 and 2% at the cow, herd and regional levels, respectively. The proportion of variance at the herd and cow levels were slightly higher for the calving-to-first-service interval (12 and 9%, respectively) - but for the other two parameters (first-service-conception risk and first-service-to-conception interval), >90% of the variation resided at the lactation level. For the three continuous dependent variables, comparison of results between models based on log-transformed data and Box-Cox-transformed data suggested that minor departures from the assumption of normality did not have a substantial effect on the variance estimates. For the Binomial dependent variable, five different estimation procedures (penalised quasi-likelihood, Markov-Chain Monte Carlo, parametric and non-parametric bootstrap estimates and maximum-likelihood) yielded substantially different results for the estimate of the cow-level variance.
Article
Relationships between production and diseases may involve recursive or simultaneous effects between traits. Four structural equation models (SEqM) for somatic cell score and milk yield, with varying specifications for the effects relating the 2 traits, were compared. Data consisted of repeated records of milk yield and somatic cell score of 33,453 first-lactation daughters of 245 Norwegian Red sires that had their first progeny test in 1991 and 1992. All models included random effects of the sire and of the cow and were fitted using the LISREL software. The Bayesian information criterion clearly favored a model with a recursive effect from somatic cell score on milk yield over the 3 other models fitted (absence of recursive effects; an effect from milk yield on somatic cell score; simultaneity of effects between the 2 traits). This provides evidence that the negative association between milk yield and somatic cell score is more likely due to an effect of infection (measured indirectly by the somatic cell score) on production than to a dilution effect. Estimates indicated that a mastitis event would reduce milk yield in the following 15 d by about 900 g/d. The estimated genetic (co)variances did not change sizably when the specification of recursive or simultaneous effects was varied. However, estimates of the phenotypic covariance were altered when a recursive effect from somatic cell score on milk yield was included in the model.
Article
The history of the development of statistical hypothesis testing in time series analysis is reviewed briefly and it is pointed out that the hypothesis testing procedure is not adequately defined as the procedure for statistical model identification. The classical maximum likelihood estimation procedure is reviewed and a new estimate, the minimum information theoretical criterion (AIC) estimate (MAICE), which is designed for the purpose of statistical identification, is introduced. When there are several competing models the MAICE is defined by the model and the maximum likelihood estimates of the parameters which give the minimum of AIC defined by AIC = (-2) log(maximum likelihood) + 2 (number of independently adjusted parameters within the model). MAICE provides a versatile procedure for statistical model identification which is free from the ambiguities inherent in the application of conventional hypothesis testing procedure. The practical utility of MAICE in time series analysis is demonstrated with some numerical examples.
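As a minimal sketch of the definition above, the following computes AIC for several competing models and returns the one with the minimum value, as the MAICE rule prescribes. The model labels, log-likelihoods and parameter counts are hypothetical, and the helper names `aic` and `maice` are illustrative rather than taken from any library.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = -2 * log(maximum likelihood) + 2 * (number of adjusted parameters)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def maice(candidates: dict) -> str:
    """Return the name of the candidate model with the minimum AIC."""
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical fits: (maximised log-likelihood, number of adjusted parameters)
candidates = {"AR(1)": (-412.3, 2), "AR(2)": (-405.1, 3), "AR(3)": (-404.9, 4)}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = maice(candidates)
print(scores, best)
```

In this assumed example the third model improves the likelihood only marginally over the second, so its extra parameter is not worth the penalty and the second model is selected.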