Self- and peer-assessment in collaborative writing: Findings on
motivational-emotional aspects, usability and usage patterns
Gudrun Wesiak and Mohammad AL-Smadi, Graz University of Technology and
Christian Gütl, Graz University of Technology and Curtin University
Abstract
With the growing demands on learners in modern e-learning environments, new
ways for social learning and corresponding assessments have to be developed.
Computer-supported collaborative learning (CSCL) with integrated Web 2.0 features
offers a possibility to enhance e-learning processes with group work and social
interactions. This study reports findings from a collaborative writing assignment using
an enhanced wiki system with integrated self- and peer-assessments as well as
features to continuously monitor the group’s collaboration progress. Twenty-three
computer science students collaborated in small groups to write a short paper. Over
the course of the assignment, students rated their attitudes towards self- and peer-
assessments, their emotional status and the usability of the ‘co-writing Wiki’ by
means of three questionnaires. To obtain more information about students’ ways to
use the co-writing Wiki, usage patterns derived from computer log analysis were
examined. Results show large differences in individual usage patterns, for example
in the time spent in the co-writing Wiki, the time editing text, or the frequency of
access to different features provided by the co-writing Wiki. The combination of
questionnaire and log data indicates that different assessment forms are viewed
positively by the students but are only used if mandatory. Furthermore, results reveal
only low emotional involvement, little satisfaction with the tool's performance, a
positive relationship between satisfaction and happiness, and negative
relationships of satisfaction with anger and sadness. Positive correlations were
also found among time spent with the system, number of self-assessments and
achievement as well as between intrinsic motivation and interest in the contribution
progress. Thus, future research should focus on incentives for increasing intrinsic
motivation and participation as well as on providing well performing systems with
high usability.
Keywords
Peer-assessment; collaborative writing; wiki; computer log data; motivational-
emotional aspects.
Introduction
With the continuously growing development of information and communication
technology in the context of learning, the adjustment of educational goals, settings
and assessment methods becomes a major challenge. The increasing number of
courses offered fully online or blended meets the demands of today's learner
generation who grew up with digital media, such as email, instant messaging, wikis,
blogs, etc. (Redecker et al. 2009). With the emergence of Web 2.0 the challenge of
integrating social learning features (e.g. discussion forums, social networks, blogs or
wikis) could also be met. Computer-supported collaborative learning (CSCL) is a
promising way to let learners collaborate and work together in online learning
environments (Stahl, Koschmann and Suthers 2006).
However, research with wikis also showed that students do not contribute when the
contribution is optional (Ebner, Kickmeier-Rust and Holzinger 2008). Thus, Judd,
Kennedy and Cropper (2010) pointed out that research is still required to find
effective incentives for fostering participation and collaboration. AL-Smadi, Höfler
and Gütl (2011) developed an enhanced wiki system for collaborative writing and
peer-review. The tool is called ‘co-writing Wiki’ and it provides integrated self-, peer-,
and group assessments as well as an assessment rubric for assignment grading and
feedback.
The aim of the presented study was to use the co-writing Wiki (i) to research self-
and peer-assessment attitudes (motivation), emotional aspects and usability
perceptions over the course of a collaborative writing assignment, (ii) to investigate
how students use the co-writing Wiki (usage patterns) and (iii) to examine the
relationships among motivational, emotional and usability aspects, usage patterns,
and performance.
In the remainder of this paper we will first give a short overview of CSCL and
assessment. This will be followed by an introduction to the co-writing Wiki tool.
Thereafter, we will give a detailed description of the methodology and results of our
study, in which students were asked to work collaboratively on a writing assignment
with integrated self-, peer- and group-assessments. Finally, we will discuss our
findings and conclude the paper with some ideas for future research.
Collaborative learning and assessment
As known from constructivist theories of learning (Vygotsky 1978) and experiential
learning theory (Kolb 1984), cognitive development and knowledge construction are
fostered by social interaction with the environment, reflection on one’s experiences,
and social dialogue. For distance learning environments, CSCL supported with Web
2.0 features, can enhance learning processes by preventing learners from becoming
isolated when using e-learning systems (Elliott 2008). In CSCL, students learn by
interacting with content, the teacher and their peers which supports their learning
process in a collaborative way (Murphy 1994). In supporting social discussions and
negotiations, CSCL also helps students to reflect upon other ideas and to improve
interpersonal and social skills (Huang 2002).
Together with the advancements of e-learning activities, e-assessment has also had
to change to keep up with the new technological and pedagogical challenges. In
CSCL, collaborative activities are often accompanied by peer-assessments (Crisp
2007), which help students to track the group members’ individual efforts and to
evaluate the contributions. Nevertheless, there are still many challenges for
assessment in CSCL, for example the consideration of group interactions or affective
aspects such as self-confidence, motivation and emotion (Macdonald 2003).
Whereas, from a traditional viewpoint, the primary goal of assessment is to grade
students’ performances at the end of a course, current assessment research points
to the importance of supporting and improving one’s performance, knowledge or
skills. This also includes the consideration of motivational factors, which are not only
central to learning but also to the development of learning. According to Garris,
Ahlers and Driskell (2002) motivated learners are enthusiastic, focused, engaged
and interested in what they are doing, i.e. they have a high task value. Thereby
intrinsic motivation (e.g. challenge or curiosity) is supposed to be more effective than
extrinsic motivation (e.g. grades or job opportunity), although both are important for
learning. As with motivation, emotional aspects of learning have also been neglected
for a long time (see, for example, Pekrun, Elliot and Maier 2006; Kay and Loverock
2008).
New assessment forms should thus be applied both formatively and summatively and
consider multiple assessment domains (cognitive vs. affective), methods (e.g. tests,
observations, behavioural data, self-reports) and strategies (e.g. instructor vs. peer-
vs. self-assessment). Formative assessments have the advantage that the learning
progress can be monitored and improved continuously (Boston 2002); self- and
peer-assessments lead to a more active involvement, which enhances higher
cognitive abilities, critical reflection and self-awareness (McConnell 1995; Roberts
2006; see also Prins et al. 2005 or Wen and Tsai 2006 for a review of peer-assessment).
Co-writing Wiki
Wikis are websites with an easy-to-use group and knowledge management system
to support online collaboration (Ebner, Kickmeier-Rust and Holzinger 2008). Due to
the possibility to add, edit, delete and comment on current as well as previous
versions of a site, to receive and give feedback and to interact with peers, wiki users
are not simply readers and writers, but also editors and reviewers (Cubric 2007;
Judd, Kennedy and Cropper 2010). To overcome the above mentioned drawbacks of
wikis, namely the lack of effective incentives for participation, AL-Smadi, Höfler and
Gütl (2011) enhanced the ScrewTurn Wiki¹ in order to maintain task and social
awareness. The developed ‘co-writing Wiki’ system (co-Wiki for short) for
collaborative writing and peer-assessment has the following features:
• Continuous feedback provision for learner scaffolding and for teachers to follow
the collaboration progress and facilitate grading. Feedback can be given by peers
as well as by the instructor. Furthermore, self-assessments are provided to
foster self-reflection and to communicate quickly the meaningfulness of one’s
contribution to the group members.
• Enhanced assignment home page with visualization tools to support both
students and teachers in knowing who did what and when. Figure 1 shows an
example assignment home page with actions feed and contribution chart. The
actions feed lists which actions the group members have taken since the beginning of an assignment, whereas the contribution graph (on the right-hand side in figure 1) shows how much each group member has contributed to the assignment.

Figure 1: Actions feed and contribution chart in the co-Wiki's assignment home page

¹ ScrewTurn Wiki is free ASP.NET wiki software. http://www.screwturn.eu/ (accessed April 14, 2012).

Figure 2: Motivation charts page with contribution and assessment graphs
• Motivation charts page (figure 2) with integrated assessment overview to
motivate peers to contribute and work in comparison with others. It allows
students as well as instructors to monitor the contributions of each student over
the course of an assignment. Besides an assignment graph (on the left in figure 2),
which shows the structure or pages of the contribution, the motivation charts
page contains two contribution graphs and one assessment graph (in the middle).
The latter shows the self-rated importance of all assessments given for this
contribution. The assessment details (rubric as described next) can be seen by
clicking on the specific bar. The two contribution graphs to the left and right of the
assessment graph show how much each student has contributed to the
assignment before and after the assessment. Clicking on a specific student gives
information about his or her contribution per page.
• Self-, peer- and group-assessment with the use of flexible assessment rubrics for
grading and feedback (star-rating plus open comments). Figure 3 shows
examples for (a) a peer-assessment (top left) given after another group member
has contributed something to the group assignment, (b) a self-assessment (top
right) given after the student has edited something and (c) a group-assessment
given by a student regarding the work of a different group. The same rubric is
used by the instructor. Self- and peer-assessments should lead to greater
participation of students and empower them with assessment methods to reflect
on themselves, evaluate their peers and to provide feedback (Whitelock 2011;
Gouli, Gogoulow and Grigoriadou 2008; Roberts 2006; Boston 2002; Dochy and
McDowell 1997). Thus, students use their prior knowledge to evaluate their peers'
performance and products, which not only leads to new knowledge and
understanding of the learning domain, but also improves meta-cognitive skills
such as self-awareness and self-reflection (Orsmond, Merry and Reiling 2002;
Topping et al. 2000; Dochy and McDowell 1997). Assessment rubrics can provide
students with more informative feedback about their strengths and the areas in
need of improvement than traditional forms of assessment can (Andrade 2000).
However, for this purpose assessment rubrics need to be clear, usable and have
consistent criteria descriptors (Tierney and Simon 2004).

Figure 3: Peer- and self-assessments for latest contributions (top) and rubric for group- and instructor-assessments (bottom)
• Grading page with peer-scores, teacher-scores and weighting possibility (figure
4). This page summarizes the scores given by teachers and peers. Scores are
calculated from the stars given in the assessment rubric based on the rubric
criteria weights provided by the teacher in the rubric authoring phase (e.g. one
star equals 20%, four stars 80%, and five stars 100%). Additionally, the teacher
has the possibility to set the weight of the student score according to his or her
needs.

Figure 4: Grading page with average peer-, teacher- and weighted scores
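To make this concrete, the following is a minimal sketch of how such a weighted rubric score and a blended final grade could be computed, assuming the example star-to-percentage mapping above; the function names, the 2:1:1 criterion weighting and the 0.3 peer weight are purely illustrative, not the co-Wiki's actual implementation.

```python
# Hypothetical sketch of weighted rubric scoring, assuming the paper's example
# mapping (one star = 20%, ..., five stars = 100%); names are illustrative.

def rubric_score(star_ratings, criterion_weights):
    """Convert per-criterion star ratings (1-5) into a weighted percentage."""
    total_weight = sum(criterion_weights)
    # Each star contributes 20 percentage points, so five stars equal 100%.
    weighted = sum(stars * 20 * w for stars, w in zip(star_ratings, criterion_weights))
    return weighted / total_weight

def final_grade(peer_score, teacher_score, peer_weight=0.3):
    """Blend peer and teacher scores; the teacher sets the peer weight."""
    return peer_weight * peer_score + (1 - peer_weight) * teacher_score

# Example: three criteria weighted 2:1:1, rated 4, 5 and 3 stars.
peer = rubric_score([4, 5, 3], [2, 1, 1])      # -> 80.0
print(final_grade(peer, teacher_score=90.0))   # -> 87.0
```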
Research methodology
Students enrolled in the Information Search and Retrieval course were asked to use
the co-writing Wiki to communicate and collaborate in writing a scientific paper. We
used log data to get information on students’ usage patterns and three online
questionnaires to assess satisfaction, motivation and emotion. To evaluate the tool’s
usability, we differentiated between satisfaction, efficiency and effectiveness as
suggested by Frøkjær, Hertzum and Hornbæk (2000). The peer- and teacher-
assessments of the group work were taken as indicators of performance.
Participants
Out of 26 computer science students (telematics, information technology or software
development), 23 gave their consent to participate in the study (18 male, 5 female) by
filling out at least one out of three presented questionnaires. From the 21 participants
answering the pre-questionnaire (with a section on demographic data) 14.3% had a
master's degree, 76.2% a bachelor's degree and 9.5% a high school diploma. Their
age ranged between 21 and 39 years (M = 26.14, SD = 3.86) and the course was
mandatory for 52.4% of the participating students. Regarding their experience in
working collaboratively, 85.7% indicated that they had had quite a bit or a lot of
experience with face-to-face group work; 47.6% had experience of group work in an
online environment. Furthermore, 76.2% liked working collaboratively; the remaining
23.8% were undecided. 38.1% of the participants agreed or strongly agreed that they
had previous experience with wiki-tools, whereas 42.9% (strongly) disagreed. The
tools most often listed by the students were TWiki and MediaWiki. For the
collaborative writing assignments, students formed seven groups containing three
members and one group with four members. One student worked alone.
Materials and procedure
Each student group had to select a topic and collaboratively work out a short paper
using the co-Wiki. The assignment was divided into four phases, starting with
preparing a paper structure and performing a literature search in Phase 1. In Phase
2 students had to work out a first version of the paper and review the work of one
other group (in what follows we will refer to this review of another group’s work as
‘group-assessment’). After receiving additional feedback from the instructors, the first
draft was revised and re-submitted (Phase 3). These final papers were again
reviewed by the instructors, graded and presented in class (Phase 4).
While working with the co-Wiki, students could comment and rate the importance of
each of their own and their peers’ contributions on a five-star scale. These reviews of
oneself or for members of the same group are referred to as ‘self-‘ and ‘peer-
assessments’ (see figure 3). While peer-assessments were optional, self-
assessments were mandatory after creating or editing a page. The group-
assessment (assessment of another group’s contribution) at the end of Phase 2 was
based on an assessment rubric with three main categories: references, content, and
formal aspects with two, six and four subcategories respectively. For each sub-
category a short comment field and a 5-star rating scale were provided (see figure 3).
The same rubric was used by the instructor to provide the final grade at the end
of Phase 3. At all times, students were able to continuously monitor the actions of all
group members via the actions feed and the contribution charts on the assignment
home page (figure 1) and to use the motivation charts page (figure 2) for viewing the
progress of contributions and assessment results. Additionally, students could check
changes from the latest version of the assignment via the coloured difference page,
which highlights added text, removed text, edited text and edited style in different
colours.
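As an illustration of the kind of comparison that could underlie such a difference page, the following sketch classifies added, removed and unchanged lines between two page revisions using Python's difflib. The co-Wiki's actual diff and styling logic is not described in this paper, so this is an assumption-laden sketch with invented example text.

```python
# Minimal sketch of deriving a coloured difference view from two revisions.
import difflib

def classify_changes(old_lines, new_lines):
    """Yield (tag, line) pairs, where tag is 'added', 'removed' or 'unchanged'."""
    for token in difflib.ndiff(old_lines, new_lines):
        if token.startswith('+ '):
            yield 'added', token[2:]      # would be rendered e.g. in green
        elif token.startswith('- '):
            yield 'removed', token[2:]    # would be rendered e.g. in red
        elif token.startswith('  '):
            yield 'unchanged', token[2:]

old = ["Wikis support collaboration.", "They are easy to use."]
new = ["Wikis support online collaboration.", "They are easy to use."]
for tag, line in classify_changes(old, new):
    print(f"{tag:9s} {line}")
```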
In order to evaluate the co-Wiki, three online questionnaires (pre-, intermediate, and
post-questionnaire) were presented to the students via LimeSurvey². The pre-
questionnaire (PreQ) was delivered after the selection of topics at the beginning of
Phase 1 and contained sections about demographic data, previous experience with
group work and wiki-tools and the self- and peer-assessment scale by Tseng and
Tsai (2010; see below). The intermediate questionnaire (IntQ) was delivered in
Phase 2 after students had handed in the first version of their papers. It contained
questions on task awareness (i.e. how supportive students perceived the single
components of the co-Wiki), usability of the co-Wiki (System Usability Scale (SUS)
by Brooke 1996), and the Computer Emotions Scale (CES by Kay and Loverock
2008). The post-questionnaire (PostQ), presented at the end of Phase 4, contained
task awareness items, SUS, CES and the self-and peer-assessment scale by Tseng
and Tsai (reformulated to refer to current experiences). Further questions concerned
the recent experiences of working collaboratively and of group-assessment.
The selection of scales to measure usability, motivation and emotion was based on
theoretical as well as practical considerations. On the one hand, we had to keep the
time and effort of participants reasonably low in order to maintain their commitment
over the course of the study. On the other hand, we wanted to use open instruments
that had been used before in online environments. Therefore, the finally presented
scales are all publicly available, tested with respect to reliability and validity and do
not require extensive time and effort from the participants.
The self- and peer-assessment scale by Tseng and Tsai (2010) measures general
attitudes and is divided into the four subscales (i) extrinsic and (ii) intrinsic motivation
for doing the assessment activity, (iii) evaluating (confidence in evaluating the work
of peers) and (iv) receiving (ability to constructively use peer-assessments for
recognizing one’s own weaknesses). The latter two scales together with a reacting
scale measure students’ self-efficacy in online peer-assessment activities. Example
items for the subscales are ’In a peer-assessment activity ...’ (i) ‘... I liked opinions
from peers because I got more ideas’, (ii) ’... I think the opinions of my work from
teachers were more important than those from peers’, (iii) ‘... I found the strength of
my peer’s work when I reviewed it’, and (iv) ‘... I recognized my weaknesses when I
got comments from peers’. Performing exploratory factor analysis, Tseng and Tsai
(2010) found that the final fifteen items of the self-efficacy scales explain 65.38% of
the variance. The reliability (alpha) coefficients for evaluating and receiving were
0.90 and 0.71 respectively. The twelve items of the extrinsic and intrinsic motivation
scales explain 55.6% of the variance; their alpha coefficients were 0.89 and 0.71. In
our study, this section contained two items on self-assessment and seventeen items
on peer-assessment. From the original instrument, the reacting scale, three
motivation items and two self-efficacy items were not presented, in order to fit our
context.

² LimeSurvey is an open source survey application. http://www.limesurvey.org/ (accessed October 18, 2012).
Task awareness is based on ten questions asking how well the different features
(actions feed, coloured difference page, contribution graphs) of the co-Wiki support
and motivate students (e.g. ‘the contribution graphs gave me a good overview about
the progress of other groups’ or ‘the actions feed in the assignment home page
supported me in coordinating tasks with my group members’). The SUS by Brooke
(1996) is a simple ten-item attitude scale giving a global view of subjective
assessments of usability. It was used to evaluate the system in general. Example
items are ’I think that I would like to use this system frequently’ or ‘I find the system
unnecessarily complex’. SUS scores have a range of 0 to 100. Bangor, Kortum and
Miller (2008) analyzed data from 206 studies to evaluate the SUS empirically. In their
multi-survey study, SUS scores ranged from 30.00 to 93.93 with an average of 69.69
(SD = 11.87). A factor analysis revealed only one significant factor with factor
loadings between 0.66 and 0.85 for the ten statements. With regard to reliability, they
found an alpha value of 0.91. In addition to the ten SUS statements, the presented
questionnaires contained three open questions on what participants liked, disliked
and what they would improve about the system.
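For reference, SUS scores are derived with Brooke's standard scoring rule: odd-numbered (positively worded) items contribute their rating minus one, even-numbered items five minus their rating, and the sum is multiplied by 2.5 to map onto the 0-100 range. A minimal sketch:

```python
# Standard SUS scoring (Brooke 1996): responses are ten ratings from 1 to 5.

def sus_score(responses):
    """Return the 0-100 SUS score for ten ratings in questionnaire order."""
    assert len(responses) == 10
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # items 1, 3, ... are positive-toned
        for i, r in enumerate(responses)
    ]
    return 2.5 * sum(contributions)

print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 2]))  # -> 80.0
```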
Self- and peer-assessment, task awareness and usability scales are all based on a
5-point agreement scale ranging from (1) I strongly disagree to (5) I strongly agree.
High mean values indicate positive attitudes and tool evaluations.
The CES by Kay and Loverock (2008) was developed to measure emotions related
to learning new computer software. With a total of twelve items it covers the
emotions happiness, sadness, anxiety and anger. Example items for the four
emotional constructs are ‘When I used the tool, I felt ...
satisfied/disheartened/anxious/irritable'. Answers are given on a 4-point rating scale
ranging from (1) none of the time to (4) all of the time. Kay and Loverock report
internal reliability estimates that range from 0.65 to 0.73. Regarding validity, they
performed a principal components analysis, which revealed the four distinct
emotions that were also assumed theoretically. Small but significant correlations
among the emotional constructs supported the distinctness of the four factors.
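Because reported reliability guided the choice of scales, the following sketch shows the standard Cronbach's alpha computation behind such estimates; the response data are invented for illustration and are not the study's actual data.

```python
# Standard Cronbach's alpha for internal consistency (dummy data).
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array-like, rows = respondents, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Example: 5 respondents answering 4 Likert items.
ratings = [[4, 5, 4, 4], [3, 3, 4, 3], [5, 5, 5, 4], [2, 3, 2, 3], [4, 4, 3, 4]]
print(round(cronbach_alpha(ratings), 3))
```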
Results
From 23 participating students, 20 filled out the PreQ and 19 the IntQ. PostQ items
on motivational and emotional aspects were answered by 17 participants, SUS items
by 18 and task awareness items by 19 participants. The presented log data is based
on the behaviour of those 22 students who worked collaboratively. Thus, in the
analyses below, sample sizes and degrees of freedom vary from one test to another.
Questionnaire data
Generally students stated (PostQ) that they liked working collaboratively with the co-
Wiki (M = 3.59, SD = 0.94, Md = 4 on a 5-point scale). Students’ open comments
showed that they especially liked that the tool supported them in working together,
that the work could be shared and that it was possible to work on different aspects at
the same time. Furthermore, they liked receiving more input on the topic and getting
faster feedback.
Task awareness and usability
Mean task awareness ratings for the ten presented questions ranged between M = 2.42 (SD_inter = 1.35, SD_post = 1.22) and M = 3.26 (SD_inter/post = 1.33) per item for both the intermediate and the post-questionnaire. The overall mean score for task awareness (i.e. across all items) was M = 2.88 (SD = 0.96) for IntQ and M = 2.85 (SD = 0.93) for PostQ. Mean SUS scores of 44.11 (SD = 20.7) for IntQ and 41.17 (SD = 23.6) for PostQ indicate below-average usability of the tool as far as the aspect of satisfaction is concerned. Related t-tests show that both task awareness and SUS
scores did not change significantly over the course of the study (see table 1 for
details).
Motivational and emotional aspects
Figure 5 shows the mean ratings from the three questionnaires for motivational and
emotional aspects and mean task awareness. PostQ ratings are compared to those
from either PreQ or IntQ to show possible changes during the learning process.

Figure 5: Mean ratings for motivation, task awareness TA (both 5-point Likert scales), and emotion (4-point scale). IM/EM = intrinsic/extrinsic motivation; R = receiving; Ev = evaluating; Hp = happiness; Sd = sadness; Ax = anxiety; Ag = anger.
Table 1 summarizes the results from related t-tests for the different questionnaire
sections, giving the average scores from the respective questionnaires together with
t-values, degrees of freedom and obtained levels of significance, p. With the
exception of extrinsic motivation, we found no significant changes over the course of
the study.
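The related t-tests in table 1 are standard paired comparisons; a minimal sketch with SciPy, on hypothetical per-student scale means, looks like this:

```python
# Sketch of a related (paired) t-test as reported in table 1; data are dummies.
from scipy.stats import ttest_rel

preq_ratings  = [3.4, 3.8, 3.2, 3.9, 3.5, 3.6]   # hypothetical PreQ scale means
postq_ratings = [3.1, 3.3, 3.4, 3.2, 3.0, 3.3]   # same students at PostQ

t, p = ttest_rel(preq_ratings, postq_ratings)
print(f"t = {t:.2f}, df = {len(preq_ratings) - 1}, p = {p:.3f}")
```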
Students’ ratings on the 5-point Likert scales regarding their attitudes towards self-
and peer-assessments (Tseng and Tsai 2010) showed that at the beginning of the
study (PreQ) peer-assessments were motivated more intrinsically than extrinsically
(M_intr = 3.65, SD_intr = 0.49; M_extr = 2.65, SD_extr = 0.81; t = 4.16, df = 19, p = 0.001), whereas we found no difference in motivation after the course (PostQ: M_intr = 3.12, SD_intr = 0.49; M_extr = 3.41, SD_extr = 0.62; t = -1.43, df = 16, p = 0.172). Results from the scales for receiving and evaluating indicate that students are able to handle peer-assessments to recognize their own strengths and weaknesses (PreQ: M_Rc = 3.70, SD_Rc = 0.47) and that they are confident in evaluating their peers' work (PreQ: M_Ev = 3.80, SD_Ev = 0.52). These findings did not change significantly over the course of the
study (table 1). Thus, the feedback provided by the peer- and group-assessments supports the students in their learning process by helping them to recognize their own strengths and weaknesses (cf. Boston 2002; Nicol and Macfarlane-Dick 2006).

Table 1: Results of related t-tests (PreQ/IntQ vs. PostQ) for ratings regarding usability (task awareness, SUS), motivational and emotional aspects

Scale                  Questionnaire    M       SD       t        df    p
Task awareness         IntQ             2.77    1.006    -0.194   14    0.849
                       PostQ            2.85    0.960
SUS                    IntQ            46.14   22.160     0.349   13    0.732
                       PostQ           43.07   23.730
Intrinsic motivation   PreQ             3.57    0.514     2.11    13    0.055
                       PostQ            3.21    0.426
Extrinsic motivation   PreQ             2.71    0.825    -2.28    13    0.040
                       PostQ            3.29    0.611
Receiving              PreQ             3.64    0.497     0.00    13    1.000
                       PostQ            3.64    0.497
Evaluating             PreQ             3.71    0.469     1.39    13    0.189
                       PostQ            3.50    0.519
Happiness              IntQ             2.00    0.707     0.898   12    0.387
                       PostQ            1.77    0.725
Sadness                IntQ             1.69    0.947    -0.56    12    0.584
                       PostQ            1.85    1.068
Anxiety                IntQ             1.54    0.660    -1.0     12    0.337
                       PostQ            1.85    0.899
Anger                  IntQ             2.15    1.068    -0.185   12    0.856
                       PostQ            2.23    1.166
With respect to emotional aspects, results from the CES indicate that students’
emotions were not very strong during their work with the co-Wiki and this did not
change over the course of the study (see table 1). To check for differences regarding
participants’ emotional states, one-way ANOVAs with repeated measures were
calculated for the four types of emotions covered in the intermediate and post-
questionnaire. With F
(1.47,23.51)
= 1.21 and p = 0.303 we found no effect for the post-
questionnaire, but a small effect for the intermediate questionnaire, F
(1.73, 31.22)
= 3.72,
p = 0.041,
η
= .171. Related t-test performed with Bonferroni correction yielded
significant differences for sadness vs. anger (t = -3.284, df = 18, p = 0.004) and
anxiety vs. anger (t = -3.986, df = 18, p = 0.001).
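A one-way repeated-measures ANOVA over the four CES emotions can be set up as in the sketch below, using statsmodels on invented ratings; note that the fractional degrees of freedom reported above imply a sphericity correction (e.g. Greenhouse-Geisser), which this minimal sketch does not apply.

```python
# Minimal repeated-measures ANOVA sketch (dummy data, no sphericity correction).
import pandas as pd
from statsmodels.stats.anova import AnovaRM

ratings = pd.DataFrame({
    "student": [s for s in range(6) for _ in range(4)],
    "emotion": ["happiness", "sadness", "anxiety", "anger"] * 6,
    "rating":  [2, 1, 1, 2,  2, 2, 1, 3,  1, 1, 2, 2,
                3, 2, 1, 3,  2, 1, 1, 2,  2, 2, 2, 3],
})
result = AnovaRM(ratings, depvar="rating", subject="student", within=["emotion"]).fit()
print(result)
```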
Experience with group assessments
After students had finished the first version of their paper they were asked to
evaluate the contribution of one other group. For this group-assessment, the
provided assessment rubric (see figure 3) supported the students in learning more
about other groups' topics (M = 3.24, SD = 0.97, Md = 3). However, the
students neither agreed nor disagreed with the statements that the assessment rubric
supported them in reviewing the product of other groups (M = 2.71, SD = 1.16, Md =
3) or that it was easy to use (M = 2.71, SD = 1.26, Md = 3).
Open comments on the group-assessment revealed that the students liked using
new technologies, getting in touch with another topic, seeing how other groups
solved the assignment and to learn from that. On the other hand, some students
disliked the categories used in the rubric, the interface and the pre-structuring of the
peer review form. According to Andrade (2000) assessment rubrics are not only
useful instruments to inform students about their strengths and weaknesses, but
their use can also support students in focusing their effort and thus promote meta-
cognitive skills such as self-awareness and self-reflection. However, creating an
instructional rubric is a challenging task and might need several revisions to
become clear, usable and consistent in its criteria (Tierney and Simon 2004).
Behavioural data (log file analysis)
Usage patterns
For the analysis of students' behavioural data, we logged the number and timing of
typical actions performed while they were working with the co-Wiki. Figure 6 shows the
mean number of logins, how often students edited or created a page (edits and
creates), how often they accessed the assignment home page, difference page and
motivation charts page (HP, DP, MCP), as well as the mean number of self-, peer-,
and group-assessments.
Figure 6: Mean number of various actions performed during the assignment.
HP = assignment home page; DP = difference page; MCP = motivation charts page;
A.= assessment.
The results indicate that students mainly used the co-Wiki to edit a page, i.e. to work
on the assignment. The strong differences in the number of self-, peer-, and group-assessments (in the given order M_SA = 45.86, SD_SA = 26.36; M_PA = 0.09, SD_PA = 0.294; M_GA = 5.36, SD_GA = 5.90) can be explained by the fact that self-assessments
were mandatory after saving a page and students were explicitly asked to perform a
group-assessment of one other group (using the rubric from figure 3), whereas peer-
assessments (for group members) were optional.
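The figure 6 statistics are of the kind one obtains by aggregating logged actions per student. The sketch below assumes simple (user, action, timestamp) records; the record structure and action names are invented for illustration, not the co-Wiki's actual log format.

```python
# Illustrative per-user aggregation of logged actions, as plotted in figure 6.
from collections import Counter, defaultdict

log = [
    ("alice", "login", 0), ("alice", "edit", 5), ("alice", "self_assessment", 9),
    ("bob", "login", 2), ("bob", "view_difference_page", 3), ("bob", "edit", 7),
]

per_user = defaultdict(Counter)
for user, action, _ts in log:
    per_user[user][action] += 1

# Mean number of each action across users.
actions = {a for counts in per_user.values() for a in counts}
for action in sorted(actions):
    mean = sum(counts[action] for counts in per_user.values()) / len(per_user)
    print(f"{action:22s} mean = {mean:.2f}")
```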
Regarding the number of times students used the different features of the co-Wiki,
the log data show that, besides editing, the difference page was visited most often
(M = 14.5, SD = 14.72), followed by the assignment home page (M = 9.77, SD =
9.85). Generally, it should be noted that the standard deviations are extremely high,
indicating a wide range of usage behaviour. For example, the number of visits to the
difference page ranged from 1 to 59 and that for the home page from 0 to 46. Similar
data were found for the number and time of edits as well as the overall working time.
Some students spent only 2.86 hours in the co-Wiki (editing text for 23.42 min),
whereas others spent up to 63.36 hours (with a maximum edit time of 24.69 hours).
Another aspect concerning the students’ usage patterns is how and in which order
they entered the different pages. After logging in to the co-Wiki, students most often
started to edit a page, i.e. on average 37.31% (SD = 14.57) of the actions performed
right after the login concerned editing the contribution. This was followed by starting
a group review (17.76%, SD = 10.63) and going to the home page (13.47%, SD =
14.09). Sometimes, students also went to the motivation charts page (9.73%, SD =
9.64) or created a new page (5.39%, SD = 5.82) right after logging in (the reported
percentages are corrected for logins that were directly followed by a logout, which
happened in 46.53% of all cases). Having a closer look at the editing function, which
was used most often by students (on average they spent 21% (SD = 12.74) of the
overall working time on editing), the usage patterns show that before editing,
students usually visited either the difference page (M = 30.42%, SD = 19.8) or just
logged in to the co-Wiki (M = 22.7%, SD = 8.9). Regarding the paths right before
students left the co-Wiki, data show that students mostly logged out after editing a
page (M = 40.61%, SD = 14.6), starting a group review (M = 10.93%, SD = 6.38), or
visiting the home page (M = 10.13%, SD = 8.04).
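The reported paths correspond to transition frequencies over each student's logged action sequence; a sketch of the counting, on an invented single session, follows.

```python
# Count what follows a login and what precedes a logout (hypothetical data).
from collections import Counter

session = ["login", "edit", "view_difference_page", "edit", "logout",
           "login", "home_page", "edit", "logout"]

after_login, before_logout = Counter(), Counter()
for prev, nxt in zip(session, session[1:]):
    if prev == "login":
        after_login[nxt] += 1
    if nxt == "logout":
        before_logout[prev] += 1

total = sum(after_login.values())
for action, n in after_login.most_common():
    print(f"after login: {action:22s} {100 * n / total:.1f}%")
```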
Usability in terms of efficiency and effectiveness
To further evaluate the co-Wiki's usability, we used the quality of the solution (as
reflected in peer and teacher grades) as an indicator of effectiveness and completion
time (working and editing time) as an indicator of efficiency. In terms of effectiveness,
the collaborative work with the co-Wiki led to good grades being given by peers (in the
form of group-assessments) as well as teachers. Grades were calculated from the
assessment rubric by averaging the star-ratings given for the twelve sub-categories
(see figure 3). Out of a possible 100%, students graded their peers' first version of the
assignment with a mean of 84.45% (SD = 7.29) and teachers gave an average of
90.91% (SD = 9.03) on the revised version. As far as efficiency is concerned,
average working time (WT) was 1050 minutes (SD = 768) and average editing time
(ET) was 244 min (SD = 312). The variability of working and editing times indicates
that students used the co-Wiki in very different ways.
Relationships between motivational-emotional aspects and usage patterns
Considering the increase in extrinsic motivation over the course of the study and the
highly varying usage patterns, we also looked for possible relationships among
behavioural and questionnaire data.
Table 2 summarizes the correlations between
measures derived from the questionnaires as well as from the logged data. Because
of the applied 4- and 5-point rating scales and the great variability in the behavioural
data, we applied Spearman’s Rho coefficient. For the log data, Pearson’s product-
moment correlation coefficient yields similar results. Table 2 shows the means
obtained from rating scales and log data as well as the correlational matrix. Since not
all students filled out all the questionnaire sections, the number of students varies for
the different variables. For a more concise presentation only variables with at least
one significant correlation are listed.
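The correlations in table 2 (below) are rank-based; a minimal sketch of the computation with SciPy's spearmanr, on dummy values for two of the variables, is:

```python
# Sketch of the rank correlation used in table 2; values are hypothetical.
from scipy.stats import spearmanr

working_time = [120, 300, 980, 1500, 2200, 640]   # minutes (dummy data)
peer_grade   = [70, 78, 85, 90, 92, 80]           # out of 100 (dummy data)

rho, p = spearmanr(working_time, peer_grade)
print(f"rho = {rho:.3f}, p = {p:.3f}")
```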
Table 2: Means, SDs and Spearman's ρ for behavioural and self-report measures

      WT      ET      MC      SA      PG      SUS     Hp      Sd      Ax      Ag       IM
N     22      22      22      22      22      19      19      19      19      19       17
M     1050    244     6.14    45.86   84.45   44.11   1.84    1.74    1.58    2.32     3.12
SD    768     312     4.39    26.36   7.29    20.7    0.688   0.872   0.692   1.06     0.485

WT    -       0.627** 0.376   0.697** 0.547** 0.147   0.359   0.030   0.305   -0.045   0.140
ET            -       -0.075  0.724** 0.515*  -0.083  0.093   -0.088  0.098   0.016    -0.049
MC                    -       0.097   0.223   -0.360  -0.086  0.533*  0.362   0.226    0.528*
SA                            -       0.404   0.098   0.336   -0.043  0.182   -0.090   -0.118
PG                                    -       -0.394  -0.016  0.024   0.209   0.160    -0.164
SUS                                           -       0.554*  -0.512* -0.426  -0.612** -0.418
Hp                                                    -       -0.172  0.166   -0.308   -0.537
Sd                                                            -       0.639** 0.677**  0.728**
Ax                                                                    -       0.678**  0.494
Ag                                                                            -        0.622*

*p < 0.05. **p < 0.01. Pairwise N: 22 for correlations among the behavioural measures (WT, ET, MC, SA, PG); 19 for pairs involving SUS or the emotion scales; 16 for IM with behavioural measures; 13 for IM with SUS or the emotion scales.
WT/ET = working/edit time (minutes); MC = visits to the motivation charts page; SA = self-assessments (frequencies); PG = peer-grade (out of 100); Hp = happiness, Sd = sadness, Ax = anxiety, Ag = anger (4-point scales, IntQ); IM = intrinsic motivation (5-point scale, PostQ).
Note: Correlations were also calculated for group-assessments, teacher grades, task awareness, and extrinsic motivation; for all, p > 0.05.

When relating the three usability indicators (satisfaction, efficiency and effectiveness), we found a significant correlation between effectiveness (as indicated by peer
grades, PG) and efficiency (as indicated by working time, WT, with ρ = 0.547, p < 0.01,
and editing time, ET, with ρ = 0.515, p < 0.05), whereas the correlations with
satisfaction (as indicated by SUS and task awareness) are negligible (range: ρ =
-0.394 to 0.327, all p > 0.05). Grades given by the teacher were also unrelated to
all other variables.
With respect to motivation in self- and peer-assessment activities, we found that
higher scores on the intrinsic motivation (IM) scale in the PostQ are associated with
higher numbers of visits to the motivation charts page (which shows the amount of
contributions and assessment results), as well as with higher ratings for sadness and
anger. Having a closer look at the emotions, results show that anger, anxiety and
sadness are positively interrelated, whereas happiness is independent. Significant
correlations of SUS scores with happiness (ρ = 0.554, p < 0.05), sadness (ρ = -0.512,
p < 0.05), and anger (ρ = -0.612, p < 0.01) indicate that participants' emotional state is
also closely related to their satisfaction with the tool.
Discussion
The results of this study show that students generally liked working collaboratively.
Concerning the co-writing Wiki with integrated self- and peer-assessments, students
valued features such as the possibility to share the work, to work on different aspects
at the same time, to get an overview on their learning progress, to get more inputs
on the topic and to give and receive timely feedback. Regarding the last point,
feedback forms are valuable tools by which learners become aware of the gaps in
their knowledge, skill, or performance within a course (Boston 2002). Nicol and
Macfarlane-Dick (2006) provide seven principles of what they call 'good feedback
practice'. Among these principles, feedback delivered to students should be timely
and carefully designed so that it supports students in regulating their learning towards
the desired learning goals. Moreover, feedback should be descriptive rather than
merely evaluative. In the co-Wiki, feedback is both descriptive and evaluative, based
on the feedback text and the rating provided in the assessment rubric and the internal
peer-review tools (see figure 3). Another point in this context is that students may
prefer giving feedback to receiving it. In their open comments, students pointed out
that the group-assessment was especially helpful. This supports previous studies in
which evaluating their peers’ work helped students to learn from others’ mistakes
and regulate their progress towards their goals (Gouli, Gogoulow and Grigoriadou
2008; Roberts 2006; Boston 2002; Dochy and McDowell 1997).
Somekh (2007) lists five issues central to ICT innovation, namely level of resources
for students, nature of learning tasks, teacher role, course organization and
assessment. In order to smoothly integrate the tool into existing courses, the
features of the co-Wiki are based on pedagogical considerations, i.e. the tool is
meant to support students in their learning progress, foster motivation and maintain
social as well as task awareness. Furthermore, it is designed as a tool for writing
tasks and therefore does not entail a major change to the course structure.
Continuous teacher-integration is provided by means of different feedback tools
(assessment rubric and color-coded analytic feedback given within the text), which
also allow formative as well as summative assessments.
With respect to our first study goal, namely to research self- and peer-assessment
attitudes, emotional aspects and usability perception over the course of a
collaborative writing assignment, the results can be summarized and interpreted as
follows. For all three aspects, we compared ratings from two different phases of the
study and, except for extrinsic motivation, students' attitudes, emotional states and
perception of usability did not change. Thus, working collaboratively by using the co-
Wiki does not, for the most part, change students' emotions and motivation regarding
self- and peer-assessments. Extrinsic motivation increased during the collaborative
work, which means that external rewards, such as grades, became more important at
the end of the course. Considering that the teacher assessed the work and gave the
final grade at the end of Phase 4, this result is not surprising. Overall, the results
from the motivational scales indicate that students had neutral to positive attitudes
towards self- and peer-assessment at the beginning and at the end of their
assignment. Also the group-assessment activity with rubrics at the end of Phase 2
was evaluated positively. Nevertheless, the number of voluntary (in-group) peer-
assessments was extremely low (only two participants performed one review each).
Similarly, log data show that most students did not access the assignment home
page and the motivation charts page very often and thus did not frequently track their
peers’ contributions. This is in line with the findings by Ebner et al. (2008) who
reported that without pressure or rewards students did not contribute to wikis.
Results from the Computer Emotions Scale indicate that students’ emotions were not
very strong while working with the co-Wiki. Thus, although there seems to be no
need to develop strategies for reducing anxiety or anger, the promotion of curiosity
or excitement may be a way to increase students’ voluntary participation in
assessment activities. Regarding the two measures for usability in terms of
satisfaction, mean task awareness scores and SUS scores are both below average.
With a sample of computer science students, who are used to high-performing
tools and media, quick frustration with a prototype was to be expected. A closer look
at the single questions and open comments shows that more than half of the
students considered the co-Wiki’s features as supportive tools for getting an
overview of their learning progress, but they also found the system’s response time
to be too slow, and some features too complex. Also a redesign of some graphs was
suggested. Therefore, improvements of the tool should primarily concern the system
performance and graphical design. However, considering that the tool is still at the
prototype stage, the overall results are very promising.
Our second goal was to examine the relationships between motivational-emotional
aspects, usability, usage patterns and performance. Morris, Finnegan and Wu
(2005) found participation to be a significant factor for achievement and pointed out
that persistence is important for motivation. In our study, students with higher
intrinsic motivation had higher sadness and anger scores and used the provided
feature of the motivation charts page more often. This indicates that motivated
students showed more interest in the progress of their own work and how the work
was split up between the group members. Furthermore, students with higher
participation (longer working and editing times) achieved higher peer-grades from
the group-assessments. Therefore, finding further ways to foster intrinsic motivation
and to motivate students to increase their participation (in the form of contributions,
but also discussions and assessments) seems to be one important factor for future
developments. Our findings also show that high SUS scores go along with more
happiness, whereas low scores are related to sadness and anger. Thus, another
focus of future research needs to be on the enhancement of usability in terms of
satisfaction.
Acknowledgements
This research was supported by the European Commission under the Collaborative
Project ALICE ’Adaptive Learning via Intuitive/Interactive, Collaborative and
Emotional Systems‘, VII Framework Program, Theme ICT-2009.4.2 (Technology-
Enhanced Learning), Grant Agreement no. 257639. We are grateful to Margit Höfler,
Isabella Pichlmair and Dominik Kowald for their support in conducting this study.
References
AL-Smadi, M., M. Höfler and C. Gütl. 2011. Enhancing wikis with visualization tools
to support groups' production function and to maintain task and social awareness. In
Proceedings of the 4th International Conference on Interactive Computer-aided
Blended Learning, Antigua, Guatemala.
Andrade, H.G. 2000. Using rubrics to promote thinking and learning. Educational
Leadership, 57: 13-18.
Bangor, A., P.T. Kortum and J.T. Miller. 2008. An empirical evaluation of the system
usability scale. International Journal of Human-Computer Interaction 24, 6: 574-94.
Boston, C. 2002. The concept of formative assessment. Practical Assessment,
Research & Evaluation 8, 9. http://pareonline.net/getvn.asp?v=8&n=9 (accessed
August 9, 2010).
Brooke, J. 1996. SUS: A 'quick and dirty' usability scale. In Usability evaluation in
industry, ed. B.A. Weerdmeester and A.L. McClelland, 189-94. London: Taylor &
Francis.
Crisp, G. 2007. The e-assessment handbook. New York: Continuum International
Publishing Group.
Cubric, M. 2007. Wiki-based process framework for blended learning. In WikiSym’07,
11-22. Montréal, Québec, Canada.
Dochy, F.J., and L. McDowell. 1997. Introduction. Assessment as a tool for learning.
Studies in Educational Evaluation 23: 279-98.
Ebner, M., M. Kickmeier-Rust and A. Holzinger. 2008. Utilizing wiki-systems in
higher education classes: A chance for universal access? Universal Access in the
Information Society 7:199-207.
Elliott, B. 2008. Online collaborative assessment.
http://www.scribd.com/doc/9375123/Online-Collaborative-Assessment (accessed
April 14, 2012).
Frøkjær, E., M. Hertzum and K. Hornbæk. 2000. Measuring usability: Are
effectiveness, efficiency and satisfaction really correlated? In Proceedings of the CHI
2000 Conference on Human factors in computing systems, 345-52. The Hague, The
Netherlands.
Garris, R., R. Ahlers and J.E. Driskell. 2002. Games, motivation, and learning: A
research and practice model. Simulation & Gaming 33: 441-67.
Gouli, E., A. Gogoulou and M. Grigoriadou. 2008. Supporting self-, peer- and
collaborative-assessment in e-learning: The case of the PECASSE environment.
Journal of Interactive Learning Research 19, 4: 615-47.
Huang, H. 2002. Toward constructivism for adult learners in online learning
environments. British Journal of Educational Technology 33, 1: 27-37.
Judd, T., G. Kennedy and S. Cropper. 2010. Using wikis for collaborative learning:
Assessing collaboration through contribution. Australasian Journal of Educational
Technology 26, 3: 341-54.
Kay, R.H., and S. Loverock. 2008. Assessing emotions related to learning new
software: The computer emotion scale. Computers in Human Behavior 24: 1605-23.
Kolb, D.A. 1984. Experiential learning: Experience as the source of learning and
development. New Jersey: Prentice Hall.
Macdonald, J. 2003. Assessing online collaborative learning: Process and product.
Computers & Education 40, 4: 377-91.
McConnell, D. 1995. A methodology for designing post graduate professional
development distant learning CSCL programmes. In CSCL Proceedings.
http://delivery.acm.org/10.1145/230000/222192/p234-mcconnell.pdf (accessed
October 18, 2010).
Morris, L.V., C. Finnegan and S. Wu. 2005. Tracking student behavior, persistence,
and achievement in online courses. Internet and Higher Education 8: 221-31.
Murphy, S. 1994. Portfolios and curriculum reform: Patterns in practice. Assessing
Writing 1: 175-206.
Nicol, D.J., and D. Macfarlane-Dick. 2006. Formative assessment and self-regulated
learning: A model and seven principles of good feedback practice. Studies in Higher
Education 31, 2: 199-218.
Orsmond, P., S. Merry and K. Reiling. 2002. The use of exemplars and formative
feedback when using student-derived marking criteria in peer and self-assessment.
Assessment and Evaluation in Higher Education 27, 4: 309-23.
Pekrun, R., A.J. Elliot and M.A. Maier. 2006. Achievement goals and discrete
achievement emotions: A theoretical model and prospective test. Journal of
Educational Psychology 98: 583-97.
Prins, F.J., D.M. Sluijsmans, P.A. Kirschner and J.-W. Strijbos. 2005. Formative peer
assessment in a CSCL environment: A case study. Assessment & Evaluation in
Higher Education 30: 417-44.
Redecker, C., K. Ala-Mutka, M. Bacigalupo, A. Ferrari and Y. Punie. 2009. Review of
learning 2.0 practices: Study on the impact of web 2.0 innovations on education and
training in Europe. JRC Scientific and Technical Reports.
ftp://ftp.jrc.es/pub/EURdoc/EURdoc/JRC55629.pdf (accessed April 14, 2010).
Roberts, T.S. 2006. Self, peer and group assessment in e-learning. Hershey:
Information Science Publishing.
Somekh, B. 2007. Pedagogy and learning with ICT. Researching the art of
innovation. New York: Routledge.
Stahl, G., T. Koschmann and D. Suthers. 2006. Computer-supported collaborative
learning: An historical perspective. In Cambridge handbook of the learning sciences,
ed. R.K. Sawyer, 409-26. Cambridge: Cambridge University Press.
Tierney, R., and M. Simon. 2004. What's still wrong with rubrics: Focusing on the
consistency of performance criteria across scale levels. Practical Assessment,
Research & Evaluation 9, 2. http://PAREonline.net/getvn.asp?v=9&n=2 (accessed
February 2012).
Topping, K. J., E.F. Smith, I. Swanson and A. Elliot. 2000. Formative peer
assessment of academic writing between postgraduate students. Assessment &
Evaluation in Higher Education 25, 2: 150-69.
Tseng, S.-C., and C.-C. Tsai. 2010. Taiwan college students' self-efficacy and
motivation of learning in online peer-assessment environments. Internet and Higher
Education 13: 164-9.
Vygotsky, L.S. 1978. Mind in society: The development of higher psychological
processes. Cambridge and London: Harvard University Press.
Wen, M.L., and C.-C. Tsai. 2006. University students’ perceptions of and attitudes
toward (online) peer assessment. Higher Education 51: 27-44.
Whitelock, D. 2011. Activating assessment for learning: Are we on the way with Web
2.0? In Web 2.0-based e-learning: Applying social informatics for tertiary teaching,
ed. M.J.W. Lee and C. McLoughlin, 319-42. IGI Global.