Single Comments or Average Ratings: Which Elements of RateMyProfessors.com
Shape University Students' Judgments and Course Choice Intentions?
by
Sebastian Scherr, Philipp Müller & Victoria Fast
Abstract The use and abuse of course and lecturer rating websites such as
RateMyProfessors.com™ is a highly relevant topic for universities' evaluation and
assessment policies and practice. However, only a few studies have paid attention to the
actual influence of teaching evaluation websites on the students themselves—that is, their
perceptions of a certain course and their course choice intention at university. Findings point
to the fact that positive comments on the website about professors improve students'
evaluations. However, professor evaluation websites contain two types of information: single
student comments and average ratings. Research on exemplification effects has shown that
single cases often have a stronger influence on recipients than more valid base-rate
information. We test this assumption in an experiment (n = 126) using a professor evaluation
website stimulus. Results show that single comments strongly influence opinions and course choice intentions, but that their effects are moderated by the valence of the average rating.
Keywords university professor rating; teaching evaluation website; student
behaviour/attitudes; exemplification effect; experiment
1 Introduction
Student evaluation1 of teaching has been a substantial part of university classroom
experiences for decades (cf. Remmers 1927). In recent years, it has become a highly discussed
1 The terms 'rating' and 'evaluation' have been subject to a discussion about the degree of interpretation that differentiates both of these constructs (see Otto, Sanford and Ross 2008). As we do not focus on the accuracy of websites like RateMyProfessors.com, we keep the discussion in mind but have decided to use both terms interchangeably.
topic of educational research (Davison and Price 2009; Lewandowski, Higgins, and Nardoni
2012; Otto, Sanford and Ross 2008) due to the expansion of online information sharing
(Rafaeli and Raban 2005). The website RateMyProfessors.com™, which launched in 2001,
offers a platform for university students to share evaluations of their lecturers. These
evaluations are archived for an unlimited period. This suggests that they may have a higher
relevance for judgments about teaching staff than evaluations spread through word-of-mouth communication. Prospective students are able to review professor ratings by past and present students before they choose to participate in a class. Several studies have shown that students in fact use the evaluations on RateMyProfessors.com™ for this purpose (Davison and Price 2009; Edwards et al. 2007; Kindred and Mohammed 2005; Kowai-Bell et al. 2011).
The present study focuses on the effects of professor rating websites on students' judgments and course choice intentions. Studies have already found evidence for effects of rating websites on course and professor perceptions (Edwards et al. 2007; Lewandowski, Higgins, and Nardoni 2012), grade expectancies (Kowai-Bell et al. 2011) and learning success (Edwards et al. 2009). Our study focuses on an aspect of this context that research has so far neglected: the fact that professor evaluation websites contain two different types of information that can contradict each other, namely single users' comments and aggregate average ratings. Research in other fields suggests that contradictory single comments are often more important for people's judgments than statistical information (Zillmann and Brosius 2000). If the same mechanism applied to the processing of university instructor profiles on professor rating websites, this could have consequences for the resulting professor perception. Single comments could then have a large impact on professor perceptions even if they contradicted the aggregate average rating of all users. Therefore, we present an experiment that tests the notion of exemplification effects for professor evaluation websites.
2 Existing research on RateMyProfessors.com
In recent years, RateMyProfessors.com™ has provoked an ongoing and vivid discussion
among education scholars about 1) the reliability of professor ratings, 2) the use of professor
rating websites, 3) the quality of the ratings as compared to official in-class student
evaluations, 4) the awareness of such websites, 5) differences in professor ratings between disciplines in higher education, and 6) factors that influence professor ratings
(see Bleske-Rechek and Fritsch 2011; Bleske-Rechek and Michels 2010; Brown, Baillie, and
Fraser 2011; Coladarci and Kornfield 2007; Davison and Price 2009; Felton et al. 2008; Legg
and Wilson 2012; Sonntag, Bassett, and Snyder 2009). Most of the contributions to this
discussion focus on the accuracy of students' evaluations of their courses and professors, e.g.,
in comparison to in-class evaluations (see Coladarci and Kornfield 2007; Otto, Sanford and
Ross 2008; Timmerman 2008; Sonntag et al. 2009; Brown, Baillie, and Fraser 2011). Authors
have argued that the rating categories applied on the website do not represent the major
categories of teaching evaluation (Davison and Price 2009). Furthermore, students' teaching quality evaluations are strongly correlated with course easiness and perceived professor sexiness (Felton et al. 2008), and the website's given rating categories are deemed to encourage such superficial judgments (Davison and Price 2009; Freng and Webber 2009).
Given the nature of these criticisms, it is not surprising that many researchers are concerned about negative influences of RateMyProfessors.com™ and other professor evaluation websites. In sum, these concerns usually cover three domains: First, professor evaluations published on these websites could bias decisions on teaching staff's promotion and tenure if superiors consider them in such contexts (see Coladarci and Kornfield 2007). Second, ratings on RateMyProfessors.com™ could be used as an indicator of teaching quality in comparative university reputation rankings and could therefore bias such rankings (see Bleske-Rechek and Michels 2010). For example, the Forbes ranking of US colleges and universities includes students' evaluations which are explicitly taken from RateMyProfessors.com™. And third, the websites could influence students' judgments about professors and courses and could consequently affect students' course choice behaviors or in-class performance (Kindred and Mohammed 2005). This paper deals with the third aspect, the effects of professor evaluation websites on students.
3 The effects of professor rating websites on students
The discussion about the accuracy of online professor ratings implicitly contains the
assumption that ratings have some effect on students who read them (Brown, Baillie, and
Fraser 2011). Edwards et al. (2007) have experimentally investigated the influence of a
RateMyProfessors.com™ profile on the professor's perceived credibility and attractiveness,
attitudes about the teaching subject and learning motivation. Subjects were provided with a
positive or negative professor profile, or no profile at all, and were then exposed to a video
lecture of the professor. Afterwards, students were interviewed to assess their perceptions and
attitudes towards learning. The study showed a positive correlation between the valence of the
professor website profile and the dependent variables. In a subsequent experiment, Edwards et
al. (2009) demonstrated that exposure to a positive professor profile leads to higher learning
success and knowledge.
Kowai-Bell et al. (2011) showed that reading positive comments from
RateMyProfessors.com™ has a positive effect on perceived control, grade expectancy, and
attitude toward the class. These factors are relevant for course choice decisions. Furthermore,
Lewandowski, Higgins, and Nardoni (2012, p. 992) manipulated the valence of
RateMyProfessors.com™ profiles as well as the content of the comments. They compared the
influence of ratings based on superficial vs. legitimate categories (i.e., professor is “fun and
entertaining and never assigns homework” vs. professor is “intelligent and explains clearly”).
The results of the study suggest that legitimate professor evaluations have a stronger influence
on students' judgments about a professor than superficial ones. Most interestingly, this
category seems to interact with profile valence, such that the effects of a positive rating are
stronger for superficial comments whereas a negative profile shows stronger effects when it
contains legitimate comments. Moreover, Edwards et al. (2009) show that rating websites
have an effect on expectancies, which in turn can impact learning performance. However, this study does not focus on professor rating websites' influence on learning performance as a behavioral outcome.
One feature of professor rating websites, however, has not been considered by the research so far: the fact that teaching evaluation websites contain two different types of information. Basically, they are built upon (1) open comments about the professor to be evaluated and (2) standardized ratings on scales comparable to school grades, ranging from A to F or from 1 to 6. On RateMyProfessors.com™, raters are asked to evaluate easiness, helpfulness, clarity and overall quality on these standardized scales. The single ratings in these categories are then combined into mean scores that appear at the top of each professor's profile page. This means that users who want to inform themselves about a certain professor have the opportunity to read single comments and the accompanying standardized ratings, and they are additionally provided with a summary of all ratings in the form of mean scores, which are, from a logical point of view, the more reliable source of information (Tversky and Kahneman 1974).
However, Babad, Darley, and Kaplowitz (1999) observed that evaluations of teaching staff
and course choice decisions depend more on ‘juicy’ single-case information about a college instructor than on pooled quantitative student ratings. This finding is in line with a
branch of research focusing on “exemplification effects”.
Its basic argument is that single-case exemplars are cognitively easier to grasp and process than more valid aggregate statistical information (Tversky and Kahneman 1974). Thus, people erroneously rely upon single-case information when forming impressions and ignore base-rate information that is available at the same time (“base-rate fallacy”, Bar-Hillel 1980).
number of experiments have confirmed this hypothesis for different dependent variables: the
perceived climate of opinion, the personal opinion and intentions to act (for an overview, see
Zillmann and Brosius 2000). Such exemplification effects have consistently been replicated in different fields, for example for health campaigns (cf. Kim et al. 2012) or political
campaigns and issues (cf. Moy and Rinke 2012; Lefevere, De Swert, and Walgrave 2012).
The findings of Babad, Darley, and Kaplowitz (1999) can be regarded as supporting the
hypothesis in the area of professor ratings, although the authors did not explicitly refer to exemplification research.
Applied to the effects of professor rating websites, this would mean that students give more weight to the single comments of others than to the mean ratings at the top of the page when forming professor judgments and course choice intentions. This could have dramatic consequences for professor evaluation. If, for example, the most recent comments on a professor's rating profile were coincidentally strongly negative whereas the aggregate overall ratings showed that the general opinion about the professor was quite positive, a student who only read those first negative comments could deduce that the majority of students see the respective professor in a negative light, ignoring the positive base rate.
However, existing research on RateMyProfessors.com™ has not investigated this question: Kowai-Bell et al. (2011) did not include average ratings in their stimulus material, and Edwards et al. (2007), Edwards et al. (2009) as well as Lewandowski, Higgins, and Nardoni (2012) used stimuli in which the valence of the comments and that of the average ratings were consistent. This study seeks to bridge this research gap. We intend to explore which elements of a teaching evaluation website have the stronger effect on professor judgments and course choice intentions. For this purpose, we conducted an experiment in which an individual student comment and numerical average student ratings were simultaneously presented but separately manipulated in valence.
4 Hypotheses
Research suggests that, in an experimental setting, the valence of single comments should
have a stronger effect upon professor judgments and course choice intentions than the valence
of the mean score. Thus, we hypothesize that:
H1a: Participants will rely more strongly on individual student comments from a teaching evaluation website than on more valid base-rate information when assessing the described course climate.
H1b: Participants will rely more strongly on individual student comments from a teaching evaluation website than on more valid base-rate information when asked for their personal opinion on the described course.
H1c: Participants will rely more strongly on individual student comments from a teaching evaluation website than on more valid base-rate information for their intended future behavior, i.e., the probability of enrolling in the described course.
5 Method
5.1 Participants
A total sample of n = 127 undergraduate students from a large [country deleted to maintain the integrity of the review process] university volunteered for the experiment. The mean age was 20.49 years (SD = 1.99) and 76% were female. Due to limited resources and availability, all participants were recruited from introductory university classes in communication. As a result, participants were unlikely to unmask the purpose of this study and were probably less jaded in evaluating university professors than students in more advanced classes. The restricted representativeness of our student sample is of minor importance because we do not present descriptive statistics referring to the population. In fact, testing professor judgments and course choice intentions as a product of professor rating websites requires surveying students in order to enhance external validity. Due to missing data, numbers may vary slightly for some analyses. Students received no credit or extra credit for participation.
5.2 Procedure
Although, in general, the impact of professor rating websites on the perceived course climate and the personal opinion of prospective students is well investigated, existing research in the field has not presented single user comments and base-rate information simultaneously in consistent and inconsistent combinations. Our experimental survey is intended to close this research gap. For this purpose, participants were randomly assigned to one of four experimental groups and exposed to a manipulated screenshot of a professor rating website that mimicked the design and layout of the original website. Before and after exposure to the screenshot, participants completed different written measures and were debriefed afterwards.
5.3 Materials
Participants in each group were exposed to a printed screenshot of the website [name deleted to maintain the integrity of the review process], which is the [country deleted to maintain the integrity of the review process] equivalent of RateMyProfessors.com™, identical in website structure and very similar in design. The screenshot contained a single profile page of a fictitious professor. On the screenshot, 13 individual student comments on a given university lecture were depicted in the center, taking up about 50% of the space. Moreover, an average numerical rating of the lecture was shown at the top right of the screenshot in a larger font. Only one comment could be read completely, while the remaining comments were collapsed, that is, only partly shown, as is common on websites (‘click here to read more’). The average rating was furthermore verbalized beside the numerical rating (e.g., ‘fail’) to add clarity. The intention behind the profile being fictitious was to eliminate pre-existing attitudes as a competing influence. However, to enhance involvement and, thus, external validity, the instructions told participants that the professor in question was currently working at another [country deleted to maintain the integrity of the review process] university but in the same discipline they studied, and that their university was currently discussing hiring the professor. We assumed that participants would be 1) more involved in their estimations if the professor worked in the same discipline and might be hired in the future, but at the same time 2) not biased in their judgments, as they could be if the professor were personally known to them from their own university.
For all but one user comment, participants could only read a few words such as “To my mind […]”, indicating a longer comment that, on the original site, had to be expanded by clicking on […]. The unabbreviated comment concretely described what an exemplary student who had already taken a class with the said professor liked (or disliked) about him (“Honestly, I'm really excited about how Prof [name deleted to maintain the integrity of the review process] pulls off his courses. Once, I could even reach him late on his phone when I was totally freaking out about my studies. In the course, Prof [name] has always been constructive in a positive way; he encouraged the others in the course to get engaged in what we were doing and he's always been interested in feedback on him.”).
With regard to the gender of the professor, we included only a male professor. Limited evidence suggests that ratings of female professors are, for example, more likely to be influenced by stereotypes than to reflect professor characteristics, which can blur ratings (Sinclair and Kunda 2000). Furthermore, a study conducted by Bachen, McLoughlin and Garcia (1999) suggests that university students' evaluations of male and female professors are more favorable for same-sex professors. Nevertheless, there were no significant mean differences between male and female participants on the dependent variables in our study (tclimate(121) = .64, Cohen's d = .12, p > .05; topinion(122) = -1.00, Cohen's d = -.17, p > .05; tchoice(122) = -1.76, Cohen's d = -0.32, p > .05).
The simultaneous presentation of one exemplar and the base-rate information (there were
consistent as well as inconsistent combinations of the two types of information) is a well-
known setting for experiments that test the influence of individual statements in contrast to
base-rate information (for an overview of these “exemplification effects” studies, see
Zillmann and Brosius 2000).
In all, there were four experimental conditions (positive exemplar only, positive exemplar and positive base-rate information, positive exemplar and negative base-rate information, negative exemplar only). Based on the strong and consistent findings of existing research on exemplification effects and due to practical considerations, we did not implement a full factorial design. The stimulus was integrated as a printed screenshot within a pen-and-paper questionnaire to avoid confounding influences that could have arisen with an online questionnaire (e.g., the possibility of using an internet search engine, such as Google™, to look up aspects of the stimulus that were involved in the experimental manipulation).
Treatment checks indicated no problems resulting from using this stimulus (credibility: M = 2.47, SD = 1.00 on a five-point scale ranging from 1 “credible” to 5 “not credible”; comprehensibility: M = 1.90, SD = 0.95 on a five-point scale ranging from 1 “comprehensible” to 5 “incomprehensible”).
5.4 Measures of professor judgment, perceived course climate and course choice intentions
After exposure to the screenshot of the teaching evaluation website, participants answered questions about different aspects of the stimulus (e.g., its credibility, comprehensibility, language). These dimensions describing the perception of the stimulus were measured on 5-point semantic differentials ranging from 1 “credible” to 5 “not credible” (M = 2.5, SD = 1.0), from 1 “comprehensible” to 5 “not comprehensible” (M = 1.9, SD = 0.95), and from 1 “formal language” to 5 “colloquial language” (M = 3.9, SD = 0.95). The means and standard deviations of the variables used for treatment checking indicate that the manipulation of the stimulus worked.
Participants were also asked for their impressions of the course climate, for their personal opinion on the professor in question and for their intention to select the described course. The perceived course climate was defined as the sum of perceived professor ratings (in the form of individual statements and/or numerical average ratings) expressed online by unknown other students depicted on a professor rating website. The central dependent measures of our data analysis were indices (three items for the perceived course climate, Cronbach's alpha = .802, “For the students, Prof [name] is a good / involved instructor”, “For the testimonial, Prof [name] is a good instructor”; two items for the personal opinion, Cronbach's alpha = .807, “To me, Prof [name] is a good / involved instructor”; a single-item measure for course choice intention, “I would apply to a course of Prof [name]”, with answers ranging from 1 “strongly disagree” to 5 “strongly agree”).
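For illustration only, the following Python sketch shows how a mean index and its internal consistency (Cronbach's alpha) could be computed from item responses; the data are simulated and the variable names are placeholders, not the study's actual analysis script or dataset.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an item matrix (rows = respondents, columns = items)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Simulated responses to three course-climate items on 5-point scales.
rng = np.random.default_rng(42)
base = rng.integers(2, 6, size=(30, 1))                       # a common response tendency per respondent
climate_items = np.clip(base + rng.integers(-1, 2, size=(30, 3)), 1, 5)

alpha = cronbach_alpha(climate_items)
climate_index = climate_items.mean(axis=1)                    # per-respondent index on the original 1-5 scale

print(f"Cronbach's alpha = {alpha:.2f}")
print(f"index value of the first respondent = {climate_index[0]:.2f}")
```

Averaging rather than summing the items keeps the resulting index on the original 1 to 5 response scale, which matches how the dependent measures are reported in Table 1.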
6 Results
Overall, the results show significant mean differences between the four experimental groups in the perception of the course climate (F(3) = 154.41; p < .001; R2corr. = .79), the personal opinion (F(3) = 23.35; p < .001; R2corr. = .35) and the willingness to enroll in the described course (F(3) = 13.39; p < .001; R2corr. = .23). Figure 1 shows (1) that students are influenced by a single user comment on a course and follow the presented opinion, and (2) that numerical average ratings presented at the top of the website can bolster this effect. Both effects are strongest for the perceived course climate, but there are significant influences on the personal opinion and on the probability of application as well.
**** Figure 1 about here ****
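As a purely illustrative sketch of the omnibus test reported above, a one-way ANOVA across the four conditions can be run with scipy; the group labels follow the paper's abbreviations, but the scores below are invented and are not the study's data.

```python
import numpy as np
from scipy import stats

# Hypothetical course-climate scores (1-5) per experimental condition; values are invented.
groups = {
    "ex+":     np.array([4.3, 4.0, 4.5, 4.2, 4.1, 4.4]),
    "ex+/br+": np.array([4.5, 4.2, 4.6, 4.3, 4.4, 4.5]),
    "ex+/br-": np.array([3.1, 2.8, 3.2, 3.0, 2.9, 3.1]),
    "ex-":     np.array([1.8, 1.5, 2.0, 1.7, 1.6, 1.9]),
}

# One-way ANOVA across the four experimental conditions.
f_value, p_value = stats.f_oneway(*groups.values())
print(f"F({len(groups) - 1}) = {f_value:.2f}, p = {p_value:.4f}")
```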
Participants who were exposed to a negative comment estimated the course climate more negatively than participants who were exposed to a positive comment, even if an additional numerical average rating contradicted the single user comment (see Table 1).
**** Table 1 about here ****
The personal opinion as well as the course choice intention regarding the described course depend on (1) the valence of the single user comment and (2) presentation consistency: Participants showed a more negative personal opinion towards the course and were less likely to enroll in a comparable course when the single user comment was either negative or contradicted the additionally presented numerical average rating (see Table 1). Therefore, we confirm our hypotheses H1a-H1c: Individual student comments on a teaching evaluation website exert a larger influence than simultaneously presented base-rate information, whether participants are asked to assess the course climate, to state their personal opinion, or to indicate their course choice intention.
In sum, the results show a difference between a positive and a negative single user comment (ex+ vs. ex-) in the perception of the course climate, t(61) = 15.57, p < .001, the personal opinion, t(59) = 6.68, p < .001, and the probability of applying to the course as well, t(59) = 3.86, p < .001. A simultaneously presented positive average rating (ex+ vs. ex+/br+) does not exert a statistically significant influence on the estimations of the participants, tclimate(63) = 1.63, n.s., topinion(62) = 0.52, n.s., tchoice(62) = 1.33, n.s. (see Table 1). In contrast, a negative average rating (ex+ vs. ex+/br-) has a negative influence on the evaluations of the participants, tclimate(60) = 7.14, p < .001, topinion(60) = 5.86, p < .001, tchoice(60) = 3.22, p < .001 (see Table 1).
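The pairwise contrasts reported above correspond to independent-samples t-tests between two conditions at a time. A hedged sketch on the same kind of invented data (again, not the study's dataset) could look as follows.

```python
import numpy as np
from scipy import stats

# Hypothetical course-climate scores per condition; the same invented values as in the ANOVA sketch.
groups = {
    "ex+":     np.array([4.3, 4.0, 4.5, 4.2, 4.1, 4.4]),
    "ex+/br+": np.array([4.5, 4.2, 4.6, 4.3, 4.4, 4.5]),
    "ex+/br-": np.array([3.1, 2.8, 3.2, 3.0, 2.9, 3.1]),
    "ex-":     np.array([1.8, 1.5, 2.0, 1.7, 1.6, 1.9]),
}

# Contrasts corresponding to the comparisons reported in the text.
for a, b in [("ex+", "ex-"), ("ex+", "ex+/br+"), ("ex+", "ex+/br-")]:
    t_value, p_value = stats.ttest_ind(groups[a], groups[b])   # Student's t-test, equal variances assumed
    df = len(groups[a]) + len(groups[b]) - 2
    print(f"{a} vs. {b}: t({df}) = {t_value:.2f}, p = {p_value:.4f}")
```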
7 Discussion
Our experiment showed that profiles on professor rating websites have the largest effect on users' estimations of the course climate (i.e., their impression of how other students evaluate a professor), compared to their personal professor judgment and course choice intentions. However, the results support the notion of an exemplification effect of profiles on professor rating websites. A single user comment seems to determine the general tendency of professor evaluation; additional average ratings of all users' evaluations only moderate this general tendency. For positive average ratings, a ceiling effect occurs: they do not significantly boost positive estimations in comparison to the control group without a mean score. In contrast, negative average ratings show a negative influence on all three dependent variables. However, all values remain close to the middle of the scale, indicating that, in sum, both types of information presented on a teaching evaluation website exert similarly strong influences on students' professor judgments and course choice intentions.
7.1 Strengths and limitations
From a methodological point of view, it is important to note that the dependent variables are correlated, as they were all measured after treatment exposure. Thus, it is possible that course choice intention is not a distinct reaction of the participants to the presented stimulus, but a combined reaction to the stimulus and to the measured cognitions and perceptions. Nevertheless, our results can be regarded as externally valid, as direct and indirect experiences and perceptions of social reality also influence students' estimations in real life, outside the laboratory. Furthermore, it has to be considered that in real-life situations, students may even have pre-existing attitudes or prejudices towards professors or instructors of higher education before turning to professor rating websites. This could additionally reduce a website's real-life influence on students' personal opinions and intended course choices. Also, we have to be aware that user behavior in real life will hardly consist of exposure to merely one exemplary statement but rather of consecutive exposure to several statements, often accompanied by average values that frame the reception and processing of those single statements as well.
Moreover, it is worth explicitly noting that professor rating websites are especially popular in North America and among college students. There, professor rating websites can be a crucial factor for an academic career in terms of promotion, tenure and merit (Felton, Mitchell and Stinson 2004; Kowai-Bell et al. 2011; Otto, Sanford and Ross 2008). Hence, the results presented here must be interpreted with caution for countries with a different higher education system.
7.2 Conclusion
The results show that both average ratings and individual statements are important for teaching evaluation. On their own, single user comments have a strong influence on professor judgment and course choice. When complemented by a contradictory average rating, this influence is reduced, but it is not eliminated, as one would expect from a logical standpoint. Although average ratings are, by far, the more reliable indicator of professor quality, their influence on professor judgments and course choice intentions seemed to be only about as strong as that of a single user comment. This confirms the results of existing research on exemplification effects. Once again, vivid information seems to be more accessible to people than the more valid but pallid summary information.
Therefore, the aim of objectivity on a professor rating website is not easy to realize: it must be assumed that single negative user comments can offset the effect of positive average ratings (or the other way around). Professor judgments and course choices thus heavily depend upon which single comments a user of a professor rating website chooses to read. For example, if the latest single comments on a professor profile indicated a rather negative evaluation while the mean score was quite positive, users focusing on those latest comments would probably come to a more negative professor perception than the average ratings would suggest. This may result in inaccurate estimations of courses and professors and probably in inappropriate course choice decisions.
References
Babad, E., Darley, J. M., & Kaplowitz, H. (1999). Developmental aspects in students’ course
selection. Journal of Educational Psychology, 91(1), 157–168.
Bachen, C. M., McLoughlin, M. M., & Garcia, S. S. (1999). Assessing the role of gender in
college students' evaluations of faculty. Communication Education, 48, 193–210.
Bar-Hillel, M. (1980). The base-rate fallacy in probability judgments. Acta Psychologica,
44(3), 211–233.
Bleske-Rechek, A., & Fritsch, A. (2011). Student consensus on RateMyProfessors.com.
Practical Assessment, Research & Evaluation, 16(18).
http://pareonline.net/getvn.asp?v=16&n=18. Accessed 5 July 2012.
Bleske-Rechek, A., & Michels, K. (2010). RateMyProfessors.com: Testing assumptions about
student use and misuse. Practical Assessment, Research & Evaluation, 15(5).
http://pareonline.net/getvn.asp?v=15&n=5. Accessed 5 July 2012.
Brown, M. J., Baillie, M., & Fraser, S. (2011). Rating RateMyProfessors.com: A comparison
of online and official student evaluations of teaching. College Teaching, 57(2), 89–92.
Coladarci, T., & Kornfield, I. (2007). RateMyProfessors.com versus formal in-class student
evaluations of teaching. Practical Assessment, Research & Evaluation, 12(6).
http://pareonline.net/getvn.asp?v=12&n=6. Accessed 5 July 2012.
Davison, E., & Price, J. (2009). How do we rate? An evaluation of online student evaluations.
Assessment & Evaluation in Higher Education, 34(1), 51–65.
Edwards, C., Edwards, A., Qingmei, Q., & Wahl, S. (2007). The influence of computer-
mediated word-of-mouth communication on student perceptions of instructors and
attitudes toward learning course content. Communication Education, 56(3), 255–277.
Edwards, A., Edwards, C., Shaver, C., & Oaks, M. (2009). Computer-mediated word-of-
mouth communication on RateMyProfessors.com: Expectancy effects on student
cognitive and behavioral learning. Journal of Computer-Mediated Communication, 14(2),
368–392.
Felton, J., Koper, P. T., Mitchell, J., & Stinson, M. (2008). Attractiveness, easiness and other
issues: Student evaluations of professors on RateMyProfessors.com. Assessment &
Evaluation in Higher Education, 33(1), 45–61.
Felton, J., Mitchell, J., & Stinson, M. (2004). Web-based student evaluations of professors:
the relations between perceived quality, easiness and sexiness. Assessment & Evaluation
in Higher Education, 29(1), 91–108.
Freng, S., & Webber, D. (2009). Turning up the heat on online teaching evaluations: Does
‘hotness’ matter? Teaching of Psychology, 36(3), 189–193.
Kim, H. S., Bigman, C. A., Leader, A. E., Lerman, C., & Cappella, J. N. (2012). Narrative
health communication and behavior change: The influence of exemplars in the news on
intention to quit smoking. Journal of Communication, 62(3), 473–492.
Kindred, J., & Mohammed, S. N. (2005). “He will crush you like an academic ninja!”:
Exploring teacher ratings on Ratemyprofessors.com. Journal of Computer-Mediated
Communication, 10(3). doi: 10.1111/j.1083-6101.2005.tb00257.x
Kowai-Bell, N., Guadagno, R. E., Little, T., Preiss, N., & Hensley, R. (2011). Rate my
expectations: How online evaluations of professors impact students’ perceived control.
Computers in Human Behavior, 27(5), 1862–1867.
Lefevere, J., de Swert, K., & Walgrave, S. (2012). Effects of popular exemplars in television
news. Communication Research, 39(1), 103–119.
Legg, A. M., & Wilson, J. H. (2012). RateMyProfessors.com offers biased evaluations.
Assessment & Evaluation in Higher Education, 37(1), 89–97.
Lewandowski, G. W., Higgins, E., & Nardoni, N. N. (2012). Just a harmless website?: An
experimental examination of RateMyProfessors.com’s effect on student evaluations.
Assessment & Evaluation in Higher Education, 37(8), 987–1002.
Moy, P., & Rinke, E. M. (2012). Attitudinal and behavioral consequences of published
opinion polls. In C. Holtz-Bacha, & J. Strömbäck (Eds.), Opinion polls and the media:
Reflecting and shaping public opinion (pp. 225–245). London: Palgrave.
Otto, J., Sanford, D. A., & Ross, D. N. (2008). Does ratemyprofessor.com really rate my
professor? Assessment & Evaluation in Higher Education, 33(4), 355–368.
Rafaeli, S., & Raban, D. R. (2005). Information sharing online: A research challenge.
International Journal of Knowledge and Learning, 1(1/2), 62–79.
Remmers, H. H. (1927). The Purdue rating scale for instructors. Educational Administration
and Supervision, 6, 399–406.
Sinclair, L., & Kunda, Z. (2000). Motivated stereotyping of women: She's fine if she praised
me but incompetent if she criticized me. Personality and Social Psychology Bulletin, 26,
1329–1342.
Sonntag, M. E., Bassett, J. F., & Snyder, T. (2009). An empirical test of the validity of student
evaluations of teaching made on ratemyprofessors.com. Assessment & Evaluation in
Higher Education, 34(5), 499–504.
Timmerman, T. (2008). On the validity of RateMyProfessors.com. Journal of Education for
Business, 84(1), 55–61.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases.
Science, 185, 1124–1131.
Zillmann, D., & Brosius, H.-B. (2000). Exemplification in communication: The influence of
case reports on the perception of issues. Mahwah, NJ: Erlbaum.
Fig. 1 Influence of individual student comments and overall ratings on a professor rating
website on the perceived course climate, the personal opinion of an applicant and the course
choice intention.
Note n = 125–126; each scale ranging, e.g., from 1 “negative perception of the climate of opinion” to 5 “positive perception of the climate of opinion”
Abbreviations “ex+”: positive exemplar; “br-”: negative average ratings (base-rate information)
[Figure 1: values by stimulus version (ex+, ex+/br+, ex+/br-, ex-) on a 1–5 scale for perceived course climate, personal opinion, and course choice intention; chart not reproduced]
Table 1 Differences in the perceived course climate, the personal opinion of an applicant and the probability of enrolling in the course for the four experimental groups.

Experimental      Perceived Course      Personal            Course Choice
Condition         Climate, M (SD)       Opinion, M (SD)     Intention, M (SD)
ex+               4.2c (0.7)            3.7b (0.6)          3.4b (0.8)
ex+/br+           4.4c (0.4)            3.6b (0.7)          3.6b (0.7)
ex+/br-           3.0b (0.5)            2.8a (0.6)          2.7a (0.8)
ex-               1.7a (0.5)            2.5a (0.8)          2.6a (0.7)
                  F(3) = 154.41***      F(3) = 23.35***     F(3) = 13.39***
                  R2corr. = .79         R2corr. = .35       R2corr. = .23

Note: *p < .05; **p < .01; ***p < .001; all scales ranging from 1 “negative estimation” to 5 “positive estimation”. Superscript letters indicate group differences according to the ANOVA results; groups sharing the same letter do not differ significantly in means.
Abbreviations: “ex+”: positive exemplar; “br-”: negative average ratings (base-rate information)