Conference Paper

Pairs of Latin Squares to Counterbalance Sequential Effects and Pairing of Conditions and Stimuli

Authors:
  • MeasuringU

Abstract

This paper discusses methods with which one can simultaneously counterbalance immediate sequential effects and pairing of conditions and stimuli in a within-subjects design using pairs of Latin squares. Within-subjects (repeated measures) experiments are common in human factors research. The designer of such an experiment must develop a scheme to ensure that the conditions and stimuli are not confounded, or randomly order stimuli and conditions. While randomization ensures balance in the long run, it is possible that a specific random sequence may not be acceptable. An alternative to randomization is to use Latin squares. The usual Latin square design ensures that each condition appears an equal number of times in each column of the square. Latin squares have been described which have the effect of counterbalancing immediate sequential effects. The objective of this work was to extend these earlier efforts by developing procedures for designing pairs of Latin squares which ensure complete counterbalancing of immediate sequential effects for both conditions and stimuli, and also ensure that conditions and stimuli are paired in the squares an equal number of times.
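The three properties named in the abstract can be checked mechanically for any proposed design. The sketch below is only a checker, not the paper's construction procedure, which is not reproduced on this page. The function names and the tiny two-condition, two-stimulus design are hypothetical illustrations: each participant's sequence is assumed to be a list of (condition, stimulus) pairs in presentation order.

```python
from collections import Counter
from itertools import permutations, product


def tally(design):
    """Count immediate successions and condition-stimulus pairings.

    design: list of participant sequences, each a list of
    (condition, stimulus) tuples in presentation order (assumed layout).
    """
    cond_follow, stim_follow, pairing = Counter(), Counter(), Counter()
    for sequence in design:
        for cond, stim in sequence:
            pairing[(cond, stim)] += 1
        for (c1, s1), (c2, s2) in zip(sequence, sequence[1:]):
            cond_follow[(c1, c2)] += 1
            stim_follow[(s1, s2)] += 1
    return cond_follow, stim_follow, pairing


def is_uniform(counts, expected_keys):
    """True when every expected key occurs, and all occur equally often."""
    values = [counts.get(key, 0) for key in expected_keys]
    return min(values) > 0 and len(set(values)) == 1


def is_counterbalanced(design):
    """Check sequential balance for conditions and stimuli, plus equal pairing."""
    conditions = sorted({c for seq in design for c, _ in seq})
    stimuli = sorted({s for seq in design for _, s in seq})
    cond_follow, stim_follow, pairing = tally(design)
    return (
        is_uniform(cond_follow, permutations(conditions, 2))
        and is_uniform(stim_follow, permutations(stimuli, 2))
        and is_uniform(pairing, product(conditions, stimuli))
    )


# Hand-built illustration: conditions A/B, stimuli x/y, four participants.
design = [
    [("A", "x"), ("B", "y")],
    [("B", "y"), ("A", "x")],
    [("A", "y"), ("B", "x")],
    [("B", "x"), ("A", "y")],
]
print(is_counterbalanced(design))      # True
print(is_counterbalanced(design[:2]))  # False: A is never paired with y
```

Using only the first two participants keeps the sequential effects balanced but confounds conditions with stimuli, which is the situation the paired squares are meant to prevent.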
... After the training session, participants experienced each of the six treatment conditions (defensive, normal, and aggressive driving behaviors for signalized and unsignalized crosswalks) once. The conditions' sequence was counterbalanced using a Latin square design [131]. The standard Latin square design we employed in the study is available in Appendix A.1. ...
... The balanced Latin square design has a group of sequences of treatment conditions such that every condition appears before and after every other condition exactly once. This design helps to compensate for immediate sequential effects [131]. ...
Thesis
First and foremost, I would like to thank my advisor, Professor Dawn Tilbury, for her constant guidance and encouragement. She has been extremely helpful in developing my technical, research, and personal skills and immensely supportive of my ideas and endeavors throughout graduate school. She has been an excellent mentor and has always been there in my time of need, encouraging me and boosting my confidence when I needed it the most. I would like to specially thank my committee members and collaborators, Professors Lionel Robert and Jessie Yang, for their support and encouragement, right from the start of my graduate program. The multi-disciplinary nature of the research initiated by these three Professors is what first drew me towards pursuing a Ph.D. I would also like to thank my other committee members, Professors Ilya Kolmanovsky and Ram Vasudevan, for providing their support and feedback that improved the dissertation. I would like to thank the Department of Mechanical Engineering, Rackham Graduate School, and the University of Michigan for giving me the opportunity to pursue the doctoral degree and providing financial support during my time at the university. In addition, I would like to thank the Toyota Research Institute and the Automotive Research Center for providing financial assistance. I really appreciate the support I received from the MAVRIC lab members. The multi-disciplinary culture and environment that the Professors have fostered in the MAVRIC lab have deeply broadened my perspectives. Specifically, I would like to thank Hebert Azevedo-Sa. He is usually the first person I discuss my ideas with and has been an excellent critic. I would also like to thank Connor Esterwood, Na Du, Qiaoning Zhang, and Huajing Zhao for the numerous discussions and help with my user studies; especially Connor, who took on a variety of roles to help with my user study, from an engineer to a tailor to even a hidden driver.
Outside of the University of Michigan, I would like to thank my undergraduate advisor, Professor Madhu M., and my internship advisor at the Indian Institute of Technology-Madras, Professor Saravanan Gurunathan. They encouraged me to pursue research and provided me with the necessary opportunities. A special thanks to Sajaysurya Ganesh, a close friend and collaborator in my early research projects, with whom I discuss ideas even now. Last but not least, I would like to thank my family and friends for supporting me during the past several years. My friends at Ann Arbor made life away from home much easier; they are like my second family. A long list of people from my Master's and Ph.D. programs at the University of Michigan has played an essential role in my graduate experience. Still, I would like to especially thank Sandipp Krishnan Ravi, Subramaniam Balakrishna, Rahasudha Kannan, and Paavai Pari for all their love and support. I will fondly remember my time at the University of Michigan and in Ann Arbor because of all of the people I encountered, the friends I made, and the experiences I had. My parents, wife, and extended family have all been incredibly supportive of the pursuit of my degree, and I am eternally grateful for their love and guidance.
... After the training sessions described above, participants experienced each of the six treatment conditions (defensive, normal, and aggressive driving behaviors for each of signalized and unsignalized crosswalks) once. The conditions' sequence was counterbalanced using a Latin square design (Lewis, 1989). The standard Latin square design that we employed in the study is available in Appendix 1. ...
... The balanced Latin square design has a group of sequences of treatment conditions such that every condition appears before and after every other condition exactly once. This design helps to compensate for immediate sequential effects (Lewis, 1989). ...
Article
Pedestrians' acceptance of automated vehicles (AVs) depends on their trust in the AVs. We developed a model of pedestrians' trust in AVs based on AV driving behavior and traffic signal presence. To empirically verify this model, we conducted a human-subject study with 30 participants in a virtual reality environment. The study manipulated two factors: AV driving behavior (defensive, normal, and aggressive) and the crosswalk type (signalized and unsignalized crossing). Results indicate that pedestrians' trust in AVs was influenced by AV driving behavior as well as the presence of a signal light. In addition, the impact of the AV's driving behavior on trust in the AV depended on the presence of a signal light. There were also strong correlations between trust in AVs and certain observable trusting behaviors such as pedestrian gaze at certain areas/objects, pedestrian distance to collision, and pedestrian jaywalking time. We also present implications for design and future research.
... Presenting a large set of questionnaires in a non-random order would not be advisable as whichever instrument is presented last would always be completed by subjects in their most fatigued, bored, distracted, or impatient state. Rather than randomizing, the symptom questionnaires were presented in an order determined by constructing a Williams Pair [26,27]. This technique is a useful alternative to standard randomization when the number of possible questionnaire (or other "treatment") orderings far exceeds the number of subjects and all subjects are being administered all questionnaires, as it balances assignments over time and prevents any more than two consecutive subjects from having the same ordering. ...
Article
Purpose: To use artificial intelligence to identify relationships between morphological characteristics of the Meibomian glands (MGs), subject factors, clinical outcomes, and subjective symptoms of dry eye. Methods: A total of 562 infrared meibography images were collected from 363 subjects (170 contact lens wearers, 193 non-wearers). Subjects were 67.2% female and 54.8% Caucasian. Subjects were 18 years of age or older. A deep learning model was trained to take meibography as input, segment the individual MGs in the images, and learn their detailed morphological features. Morphological characteristics were then combined with clinical and symptom data in prediction models of MG function, tear film stability, ocular surface health, and subjective discomfort and dryness. The models were analyzed to identify the most heavily weighted features used by the algorithm for predictions. Results: MG morphological characteristics were heavily weighted predictors for eyelid notching and vascularization, MG expressate quality and quantity, tear film stability, corneal staining, and comfort and dryness ratings, with accuracies ranging from 65% to 99%. The number of visible MGs, along with other clinical parameters, was able to predict MG dysfunction, aqueous deficiency, and blepharitis with accuracies ranging from 74% to 85%. Conclusions: Machine learning-derived MG morphological characteristics were found to be important in predicting multiple signs, symptoms, and diagnoses related to MG dysfunction and dry eye. This deep learning method illustrates the rich clinical information that detailed morphological analysis of the MGs can provide, and shows promise in advancing our understanding of the role of MG morphology in ocular surface health.
... The following delay values were selected: 300 ms (minimum), 600 ms, 900 ms, An essential consideration when establishing experimental conditions is randomization and balancing [33]. To ensure that conditions were balanced, the Graeco-Latin distribution was used to organize the delay and figure conditions [23]. In this way, we ensured that the same number of pairs of conditions existed for each possible combination. ...
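As a rough illustration of how a Graeco-Latin arrangement yields every combination of two factors equally often, the sketch below builds one for an odd number of levels. It is a standard modular construction, not necessarily the one used in the cited study. The function name is hypothetical, and the integer labels are placeholders for the delay and figure conditions.

```python
def graeco_latin_square(n):
    """Superimpose two orthogonal Latin squares of odd order n.

    Cell (i, j) holds the pair ((i + j) mod n, (2i + j) mod n).
    The second square is only Latin when n is odd, hence the check.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("this construction needs an odd order n >= 3")
    return [
        [((i + j) % n, (2 * i + j) % n) for j in range(n)]
        for i in range(n)
    ]


# Five levels per factor: rows could stand for sessions or participant groups,
# columns for positions, and each cell for a (delay index, figure index) pair.
square = graeco_latin_square(5)
for row in square:
    print(row)

# Every one of the 25 possible pairs appears exactly once.
all_pairs = {cell for row in square for cell in row}
print(len(all_pairs))  # 25
```

Because each pair of indices appears exactly once, no delay level is systematically tied to a particular figure.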
Article
Immersive technologies like eXtended Reality (XR) are the next step in videoconferencing. In this context, understanding the effect of delay on communication is crucial. This paper presents the first study on the impact of delay on collaborative tasks using a realistic Social XR system. Specifically, we design an experiment and evaluate the impact of end-to-end delays of 300, 600, 900, 1200, and 1500 ms on the execution of a standardized task involving the collaboration of two remote users that meet in a virtual space and construct block-based shapes. To measure the impact of the delay in this communication scenario, objective and subjective data were collected. As objective data, we measured the time required to execute the tasks and computed conversational characteristics by analysing the recorded audio signals. As subjective data, a questionnaire was prepared and completed by every user to evaluate different factors such as overall quality, perception of delay, annoyance using the system, level of presence, cybersickness, and other subjective factors associated with social interaction. The results show a clear influence of the delay on the perceived quality and a significant negative effect as the delay increases. Specifically, the results indicate that the acceptable threshold for end-to-end delay should not exceed 900 ms. This article additionally provides guidelines for developing standardized XR tasks for assessing interaction in Social XR environments.
... Each participant was shown seven of the 49 videos (or 49 transcripts); the videos/transcripts were allocated such that every participant viewed each of the seven actors and each of the seven conditions of behavior, once. The assignment of actors to conditions of behavior and the condition order were counterbalanced simultaneously using a pair of Latin squares developed by Lewis (1989). ...
Article
Behaviors such as gaze aversion and repetitive movements are commonly believed to be signs of deception and low credibility; however, they may also be characteristic of individuals with developmental or mental health conditions. We examined the effect of five behaviors that are common among autistic individuals (gaze aversion, repetitive movements, misinterpretation of figurative language, monologues, and flat affect) on observers' evaluations of deception and credibility. This study focused on judgments made in everyday social situations, which contrasts with most previous studies, which have examined such judgments in contexts (e.g., legal proceedings) where they are of primary importance. In three experiments, we presented participants with video segments of individuals being interviewed about biographical information and participants then indicated their perception of the individuals' truthfulness and credibility. Overall, individuals were perceived as more deceptive and less credible when they displayed autistic behaviors than when they did not; however, the effect sizes detected were weak.
... Visual approach, season and vegetation structure had significant influences on landscape perception and preference except for experimental design, which is in line with the study of Charness et al. (2012), which confirmed that the between-subject and within-subject experimental designs can produce consistent results under certain conditions. In this study, it is probable that the experimental samples in each group were all selected undergraduate and graduate students and the treatment that randomly displayed the green spaces to each group of participants balanced the sequential and repetitive effects of the within-subject design (Lansdale et al., 2010; Lewis, 2016). Therefore, both experimental methods are suitable for experimental conditions with participants of similar demographic background and randomly assigned experimental stimuli. ...
Article
In order to identify the reliability and validity of the different visual approaches in assessing landscape perception and preference, off-site surveys with photo elicitation and virtual reality and on-site surveys of urban green spaces were conducted under certain conditions across four seasons and with different selections of participants as an experimental design. Nonparametric tests (Kruskal-Wallis and Mann-Whitney tests) and the Generalized Linear Model have been respectively applied to identify the differences among visual approaches. The results showed: (1) landscape perception and preference through on-site and off-site (photo elicitation and virtual reality) approaches were significantly different, and virtual reality was more consistent than on-site survey. (2) Season significantly influenced on-site and off-site visual strategies but experimental design did not. (3) The preferences for urban green spaces with different vegetation structure were significantly influenced by three visual approaches under different seasons. The three visual approaches were significantly different except for perception of open green space in winter and closed green space in autumn. It is suggested in practice that for open green space, photo elicitation could replace on-site survey particularly in autumn and winter; virtual reality could replace on-site survey in semi-open green space in any season and all green spaces in winter; and photo elicitation could replace virtual reality in winter. The results can provide scientific support for obtaining more accurate assessments of landscape perception and preference in the future.
... The within-subject variable was NCAN, which had five levels: 0, 1, 2, 3 and 4. The between-subject variable was the keyboard layout, which had three levels: QWERTY, alphabetic square and alphabetic two-row. A pair of Latin squares (Lewis, 1989, 1993) was used to counterbalance the immediate sequential effects for the within-subject variable and stimuli. The experimental conditions for each subject can be seen in Table 1, which includes only 10 subjects, since all these subjects, as mentioned in Section 4.1, were assigned into three groups, and the detailed experimental conditions of each group were the same. ...
Article
Entering text with a general five-key TV remote is a laborious task. A strategy for entering text with two interconnected cursors is proposed, whereby a secondary cursor is employed to maneuver the main cursor through fast-tracks. The main cursor is maneuvered in a 2D full-size onscreen keyboard space, whereas the sub-cursor moves among all predictive candidates in a 1D subspace. Each cursor is operated by a specific interaction method, and the movement of either must be mapped from one to the other. Compared to single-cursor methods, the combination of the main cursor and the sub-cursor operations usually results in fewer manual loadings, even when the target character is out of the prediction list range. A computer simulation based on a corpus of 57 258 multimedia titles (in Chinese) demonstrated that the keystrokes per character, powered by a dual-cursor technique, could be predicted to be reduced by 38.6–69.9% with very few predictive candidates for various keyboard layouts (compared with those of conventional non-predictive method). The keyboard layout and the number of candidates were further investigated by means of a usability test. The results revealed that with only 10 min of practice, novice users could achieve a mean text entry speed of 33.3, 29.5 and 22.8 characters per minute for QWERTY, alphabetic square, and alphabetic two-row layouts, respectively, which is 31.6%, 14.3% and 67.6% faster than the corresponding conventional input method, and is 12.7%, 6.9% and 25.0% faster than the current version of popup dialog method. The dual-cursor can significantly improve perceived usability and offers the potential to be applied to numerous other cursor-based text entry contexts.
RESEARCH HIGHLIGHTS
  • A new interaction strategy for TV input with two interconnected cursors.
  • This strategy employs long-pressing for jumping around predicted candidates and short-pressing for navigating through keyboard keys.
  • The number of keystrokes could be predicted to be reduced by 38.6–69.9% when compared with that of the conventional non-predictive input method for various keyboard layouts.
  • Few predicted candidates were required to achieve a substantial decrement of keystrokes.
  • The user experiment showed that novice users’ TV input speed could be substantially increased with dual-cursor.
Article
The IBM Design Center in Boca Raton studied the operating-point key force for a portable computer keyboard. Alden, Daniels, and Kanarick (1972) report that typists prefer operating-point key forces of between 25 and 150 grams. We compared different key forces that fell within the range recommended by Alden et al. The only difference between the keyboards we studied was the amount of force required to activate the keys. The first keyboard (58 keyboard) required 58 grams of force to activate the keys. The second keyboard (74 keyboard) required 74 grams of force to activate the keys. Sixteen skilled typists used both keyboards to enter text. Input speed was significantly faster on the 58-gram keyboard. A significant number of typists preferred the 58-gram keyboard. The results suggest that the optimal key force for portable computer keyboards is less than 74 grams.
Article
A Latin square is a matrix containing the same number of rows and columns. The cell entries are a sequence of symbols inserted in such a way that each symbol occurs only once in each row and only once in each column. Fisher (1925) proposed that Latin squares could be useful in experimental designs for controlling the effects of extraneous variables. He argued that a Latin square should be chosen at random from the set of possible Latin squares that would fit a research design and that the Latin-square design should be carried through into the data analysis. Psychological researchers have advanced our appreciation of Latin-square designs, but they have made only moderate use of them and have not heeded Fisher's prescriptions. Educational researchers have used them even less and are vulnerable to similar criticisms. Nevertheless, the judicious use of Latin-square designs is a powerful tool for experimental researchers.
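The definition above translates directly into a validity check. The following is a minimal sketch with hypothetical function names: it builds the simplest (cyclic) Latin square from a list of symbols and verifies that each symbol occurs exactly once in every row and column. It does not implement the random selection among all possible squares that the abstract attributes to Fisher.

```python
def cyclic_latin_square(symbols):
    """Build a Latin square by shifting the symbol list one step per row."""
    n = len(symbols)
    return [[symbols[(i + j) % n] for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """True if the grid is n x n and each symbol occurs once per row and column."""
    n = len(square)
    if n == 0 or any(len(row) != n for row in square):
        return False
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(
        {square[i][j] for i in range(n)} == symbols for j in range(n)
    )
    return rows_ok and cols_ok


square = cyclic_latin_square(["A", "B", "C"])
print(square)                   # [['A', 'B', 'C'], ['B', 'C', 'A'], ['C', 'A', 'B']]
print(is_latin_square(square))  # True
print(is_latin_square([["A", "B"], ["A", "B"]]))  # False: columns repeat a symbol
```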
Article
If there is an even number of experimental conditions (Latin letters), it is possible to construct a Latin Square in which each condition is preceded by a different condition in every row (and in every column, if desired). These designs are useful in counterbalancing immediate sequential, or other order, effects. A simple, and easily remembered, procedure by which to construct such squares is described and illustrated. A proof is offered which shows that the procedure is valid for any size square having an even number of cells on a side.
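One widely used construction with this property can be sketched as follows. It may or may not match the exact procedure the abstract describes, and the function names are hypothetical. The first row interleaves low and high condition indices (0, 1, n-1, 2, n-2, ...), and each later row adds 1 modulo n. The sketch balances immediate succession within rows only, which corresponds to participants' presentation orders.

```python
from collections import Counter


def balanced_latin_square(n):
    """Latin square of even order n in which every ordered pair of distinct
    conditions occurs exactly once as immediate neighbours within a row."""
    if n < 2 or n % 2 != 0:
        raise ValueError("this construction needs an even n >= 2")
    # First row: 0, 1, n-1, 2, n-2, ... (odd positions count up, even count down).
    first_row = [0] + [
        (j + 1) // 2 if j % 2 else n - j // 2 for j in range(1, n)
    ]
    # Each later row shifts every entry of the first row by one more, modulo n.
    return [[(c + shift) % n for c in first_row] for shift in range(n)]


def immediate_pair_counts(square):
    """Count how often condition a is immediately followed by condition b."""
    return Counter(
        (row[j], row[j + 1]) for row in square for j in range(len(row) - 1)
    )


square = balanced_latin_square(4)
print(square)  # [[0, 1, 3, 2], [1, 2, 0, 3], [2, 3, 1, 0], [3, 0, 2, 1]]

counts = immediate_pair_counts(square)
print(len(counts), set(counts.values()))  # 12 {1}
```

With four conditions there are 12 ordered pairs of distinct conditions, and each occurs once, so no condition is preceded by the same condition twice.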