Sentiment Analysis on Conversations in
Collaborative Active Learning as an Early
Predictor of Performance
Nasrin Dehbozorgi
Dept. of Computer Science
University of North Carolina
at Charlotte
Charlotte, USA
ndehbozo@uncc.edu
Mary Lou Maher
Dept. of Software and
Information Systems
University of North Carolina
at Charlotte
Charlotte, USA
m.maher@uncc.edu
Mohsen Dorodchi
Dept. of Computer Science
University of North Carolina at
Charlotte
Charlotte, USA
mohsen.dorodchi@uncc.edu
ABSTRACT:
This full research paper studies affective states in
students’ verbal conversations in an introductory
Computer Science class (CS1) as they work in teams and
discuss course content. Research on the cognitive process
suggests that social constructs are an essential part of the
learning process [1]. This highlights the importance of
teamwork in engineering education. Besides cognitive
and social constructs, performance evaluation methods
are key components of successful team experience.
However, measuring students’ individual performance
in low-stake teams is a challenge since the main goal of
these teams is social construction of knowledge rather
than final artifact production. On the other hand, in low-
stake teams the small contribution of teamwork to
students’ grade might cause students not to collaborate
as expected. We study affective metrics of sentiment and
subjectivity in collaborative conversations in low-stake
teams to identify the correlation between students’
affective states and their performance in a CS1 course. The
novelty of this research is its focus on students’ verbal
conversations in class and how to identify and
operationalize affect as a metric that is related to
individual performance. We record students’
conversations during low-stake teamwork in multiple
sessions throughout the semester. By applying Natural
Language Processing (NLP) algorithms, sentiment
classes and subjectivity scores are extracted from their
speech. The result of this study shows a positive
correlation between students’ performance and their
positive sentiment as well as the level of subjectivity in
speech. The outcome of this research has the potential to
serve as a performance predictor earlier in the
semester, providing timely feedback to students and
enabling instructors to make interventions that can lead
to student success.
This work is supported by the National Science Foundation
Award 1519160: IUSE/PFE: RED: The Connected Learner: Design
Patterns for Transforming Computing and Informatics Education.
Keywords
Active learning, CS1, verbal
conversation, Natural Language Processing (NLP),
sentiment analysis, affective domain, student performance
I. INTRODUCTION
Research in how people learn reveals that in
addition to the cognitive process, social constructs are
a part of the learning process [1]. This highlights the
importance of teamwork in educational settings such as
collaborative active learning. Effective teamwork is
vital since it creates knowledge, promotes innovation,
enhances productivity and ensures success [2]. A key
component of a good team experience is performance
measurement. However, the inherently complex nature
of teamwork in active learning calls for tools and
methods to measure and predict students’ performance
at both individual and collaborative levels [3, 2]. In
active learning, there are two types of teams: low-stake
and high-stake teams. Low-stake teams are typically
practiced in introductory-level courses where the goal
of teamwork is learning from peers and improving
students’ soft skills [4]. Students’ performance in low-
stake teams does not contribute much to their final
grade. On the other hand, high-stake teams are mainly
practiced in upper-level courses where students put
what they have learned into practice in order to create a
final product. Evaluating students’ performance in
high-stake teams is mainly based on evaluating the
final product and assessing the individual’s
contribution to the product. However, evaluating
individuals’ performance in low-stake teams where no
final product is produced is a challenge. To address this
issue, theories on team performance converge on
identifying attitude components that influence team
performance, such as affective states and behavioral
processes [5, 6, 2]. These essential components of
attitude are important to measure since they promote
team effectiveness and are associated with team
performance [2]. Research shows affective attributes
have been measured using self-report by having
students fill out surveys and by using a Likert scale
expressing their feelings [2] or by analyzing offline
textual dialogues while students communicate on
forums asynchronously [2, 7, 8]. The drawbacks of
surveys include students’ lack of commitment to filling
them out in a timely manner, not taking them seriously
enough to give precise answers, or not being aware of
their emotional states in the moment. More advanced
tools allow researchers to retrieve emotional
information signals by capturing facial expressions,
gestures, and posture, but these have drawbacks when
applied in educational settings, such as disrupting
the learning process [9, 10]. Research suggests that
speech is a good source of affective information;
however, because of challenges such as environmental
noise, it is not often used in educational settings [11].
The goal of this research is to operationalize affect
(sentiment and subjectivity) from students’ real-time
speech in class and investigate if there are correlations
between students’ affect in low-stake teams and their
individual performance. The correlations can lead us to
predict student performance based on the emotions that
they express and identify at-risk students earlier in the
semester to provide timely feedback to them. To collect
the verbal data, we record students’ conversations in
multiple active learning class sessions during the
semester. By using NLP algorithms, we quantify
affective states of sentiment and subjectivity. The
results of the data analysis can serve as helpful
feedback for both the instructor and students. Research
shows that presenting detected emotions to students
during their interactions makes them more conscious of
their situation and can serve as a prompt to adjust their
behavior in the learning process [12]. Beyond
providing feedback to the student, the results of our
research can guide instructors in applying interventions
that improve students’ performance.
In section 2, we review the relevant literature and
elaborate on different types of teams that are practiced
in active learning. We describe team performance
measurement techniques and attitudinal constructs as
important metrics in the cognitive process. We
demonstrate the existing gap in the literature and the
need for developing measurement tools for low-stake
teams in active learning. In section 3, we present our
research methodology to operationalize affective
metrics from verbal conversations to study the
correlations of the affect constructs with students’
performance. Finally, we present the results of data
analysis on a CS1 class as a case study.
II. BACKGROUND
Teamwork plays two important roles in active
learning: one is peer learning or the social construction
of knowledge which helps students to learn from each
other and the second is about improving social and soft
skills [13, 28]. Teamwork in active learning is
practiced either in the form of high-stake or low-stake
teams. The low-stake format is typically applied in
introductory-level courses where students interact with
each other and improve their social skills as they learn
from peers in a socially supported environment. This
type of teamwork best suits less-challenging concepts
where students get a chance to learn from peers and
reduce the gaps between team members’ computing
backgrounds. This low-stake teamwork model is also
defined as ‘lightweight’ teams in which teamwork has
less contribution to students’ final grades [4]. In low-stake
teams there are normally no assigned roles, and
students contribute equally to solve given problems
without pressure for grades. In such teams, effective
evaluation helps in a fair assessment of individuals
besides providing timely feedback to them.
A. Team performance measurement
Researchers have proposed different tools and
insights for evaluating team performance over the past
30 years. A review of the literature on teamwork
assessment shows there are mainly four ways to collect
data on team performance: 1) self-report, 2) peer
assessment, 3) observation and 4) objective outcomes.
For optimal results, it is suggested to combine different
ways of both qualitative and quantitative data
collection [2]. In high-stake teams, the most common
way to measure success in teams is to evaluate the
quality of the artifact generated by those teams. In low-
stake teams, there is no significant team-level final
product to be evaluated as a team performance
measure. In these types of teams, students do not have
assigned roles, and given that teamwork outcome has a
low contribution to final grades, there is a high chance
that team members rely on peers and don’t attend team
activities as expected. This scenario has negative
consequences in active learning classes since students
will not learn from peers and will not use class time to
learn the course material. In such cases, the
emphasis of the team evaluation should be at the
individual level, in order to provide timely feedback to
students. This raises the question of what data needs
to be collected and what factors should be used to
evaluate individuals in low-stake teams to ensure
they have a positive teamwork experience.
There is a large body of recent research on the
affective domain in teamwork and how it affects
team performance [14]. Research shows the first step
in evaluating team performance is identifying the
characteristics that individuals in teams possess, such as
motivation, attitude, and personality traits [15]. The
most common form of measuring attitude is having
students fill out surveys by using a Likert scale to
express their feelings [2]. However, such tools may not
necessarily reveal reliable information about the
affective state, and they have other drawbacks, such as
the need for direct student involvement and the
overhead they impose on students. This challenge calls
for methods to quantify the affective metrics at the
individual level without individual self-report.
B. Affective domain
Multiple studies have reported that affective states
impact the interpersonal relationships in the
educational domain [16, 17]. The shortage of soft skills
among employees in the workplace is further evidence
of a lack of focus on individual affect in educational
settings, which needs to be integrated into
curricula [18]. It is reported that students’ attitudes are
observable through their behavior in class and how
they engage in the class activities [19]. One aspect of
attitude is emotion which can influence students’
behavior in collaborative environments and impact
performance at both individual and team level [7].
Research shows emotional obstacles can hinder
students’ learning process while feelings of joy,
happiness, and satisfaction about the given subject
positively influence students’ performance [20].
Recent studies claim that students who experience
emotional and behavioral difficulties are not identified
early and therefore may not receive appropriate
feedback and intervention [21]. The affective state is
most important in collaborative active learning where
learning occurs during teamwork [7].
Capturing attitude information and emotional
awareness can benefit both students and instructors [7].
From the students’ perspective, achieving emotional
awareness involves the acquisition of skills to manage
emotions and establish positive relationships with
teammates and to learn how to handle challenging
situations [17]. From the instructor’s perspective
identifying students’ emotional state can lead to
cognitive scaffolding [7]. Researchers propose diverse
methods of affect recognition including heart rate,
diagnostic tasks, self-reports, facial expressions and
knowledge-based methods [11]. Some researchers
recognize affective state by doing sentiment analysis
on students’ journals and learning diaries or from the
chat and discussions in the collaborative conversations
in forums or other asynchronous textual data [22]. In
these methods, they either identify the polarity of
students’ emotions to see whether they are negative,
positive, or neutral, or they identify the expressions most
related to the eight fundamental emotions of anger, trust,
surprise, sadness, joy, fear, disgust, and anticipation
[22]. Although sentiment analysis is a more promising
way to identify the affective domain, it has not been
widely applied in the educational setting compared to
the social media domain or review corpora because of
limitations in existing educational corpora [20,22]. The
selection of a suitable method depends on the type of
emotions to be recognized, the required resources to
collect data and the context and setting in which the
task is performed [11]. For example, in some contexts
using self-reports may be more appropriate than using
sensors, since sensors can cause interference with the
given task [11]. Some cases may need real-time
detection which requires computational resources for
data analysis. Researchers believe speech is a very
good source of affective information; however,
because of challenges such as environmental noise,
it is not often used [11]. We have identified
a way to capture individual student’s speech while in
groups in the classroom and use this data to measure
their affective states. We investigate correlations
between the identified affective metrics in students’
speech and their individual performance.
III. RESEARCH METHOD
The research question we pursue in this study is:
Is there a relationship between a student’s positive
sentiment or level of subjectivity during in-class
collaborative conversation and their individual
performance in the course?
To answer this question, we operationalize
sentiment and the level of subjectivity from students’
verbal conversations during the class activities and
identify their correlation with students’ individual
performance. The study protocol involves the
processes of identifying modules to measure affect,
identifying team characteristics and class design,
recording students’ conversations in teams during class
and applying voice filtering and transcription modules
on audio data for text mining and analysis. “Fig. 1”
shows the steps of the study protocol.
Fig. 1. Study protocol
The two hypotheses that we have in this research
are:
H1: There is a correlation between a student’s
positive sentiment in teamwork and their individual
performance in the course.
H2: There is a correlation between the level of
subjectivity in a student’s speech in teamwork and their
individual performance in the course.
The rest of this section describes our methodology
to operationalize sentiment and subjectivity, our data
collection approach, and analysis of data from the case
study.
A. Methodology for operationalizing sentiments
For data collection, the students were recorded
while they were working on the class activities with
their assigned peers. During the recording process,
environmental, technical, or human errors sometimes
occurred that made data unusable for
analysis. For example, sometimes the TAs who were
assigned to set up recorders misplaced the microphone
cord. In other cases, students accidentally pressed the
stop button on the recorder which led to the loss of data.
On top of these challenges, the environmental noise
level was high when all team members were talking at
the same time while sitting close to each
other in the classroom. In order to overcome these
challenges we employed some protocols such as
assigning the tasks related to recorders only to well-
trained TAs and encouraging teams to sit in certain
locations to minimize the noise level in recordings. One
of the major steps we took was conducting extensive
research on recording devices. The recorders we
required for our study were expected to have
bidirectional paired microphones, a built-in noise-cancellation
feature, and a long-lasting battery, and to be user-friendly
and cost-effective. After trying three different types of
recorders we identified the one that best matched our
requirements.
In order to transcribe the conversations, first we
filtered the audio data and reduced the noise level to
improve the quality. Next, we transcribed the audio
data by assigning a unique ID to each speaker based on
voice recognition. Each team was recorded on one
device, so their voices were stored in one audio
file. Because teams were sitting close to each other,
or sometimes asked questions of the teacher or
TAs, the transcriptions included speech utterances
from people other than the team members. In the next
step, we removed the speech utterances from speakers
other than the team members and stored speech
utterances related to each ID (i.e. student) into a
separate dataset. This resulted in 28 datasets each
containing the transcription of speech of that individual
in multiple sessions of the class. At this point, the
textual datasets (i.e., ‘speech corpora’) were ready to
be imported into the text-mining algorithm that we
developed for this study.
The text-mining algorithm is shown in “Fig. 2”.
The first step of the algorithm is segmentation where
we segmented each corpus and separated the speech
segments (i.e., vectors) based on speech initiation points,
such that each vector in the dataset represents the
speech of a person until it finishes or is interrupted
by the other teammate. This means that the number of
vectors in each dataset denotes the number of times a
person initiated the talk (in both active and reactive
mode).
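As an illustration, the segmentation step might be sketched as follows in Python. The (speaker_id, utterance) transcript structure and the function name are assumptions for illustration, not the authors' actual code:

```python
# Illustrative sketch (not the study's code): segment a transcript into
# per-speaker "vectors", where a new vector starts each time a speaker
# initiates talk and ends when they finish or are interrupted.

def segment_by_speaker(transcript):
    """Group consecutive utterances into vectors, one per speech initiation."""
    datasets = {}  # speaker_id -> list of vectors (speech segments)
    current_speaker, current_vector = None, []
    for speaker_id, utterance in transcript:
        if speaker_id != current_speaker:
            # Previous speaker finished or was interrupted: close their vector.
            if current_vector:
                datasets.setdefault(current_speaker, []).append(" ".join(current_vector))
            current_speaker, current_vector = speaker_id, [utterance]
        else:
            current_vector.append(utterance)
    if current_vector:
        datasets.setdefault(current_speaker, []).append(" ".join(current_vector))
    return datasets

transcript = [
    ("s1", "so the loop runs"), ("s1", "ten times"),
    ("s2", "wait why ten"), ("s1", "because range starts at zero"),
]
vectors = segment_by_speaker(transcript)
# len(vectors["s1"]) == 2: s1 initiated talk twice (once interrupted by s2)
```

Under this reading, the number of vectors per speaker directly gives the talk-initiation count described above.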
Next, we applied a contraction filtering on the text
so that we do not miss any meaningful words while
tokenizing the vectors based on the regular expression
which is explained in the next step. In the third step of
the algorithm, we tokenized the vectors based on a
customized regular expression to clean the text and
eliminate the extra characters which did not impact the
sentiment score. Next, we applied the standard English
dictionary to remove the stop words. We also performed a
word frequency count on each dataset, defined a
dictionary of the most frequent words that speakers
used habitually and that had no impact in the
context of this study, and removed those words
from the text.
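The contraction filtering, tokenization, and stop-word removal steps can be sketched as below. The contraction map, regular expression, and stop-word list are illustrative stand-ins; the study's actual dictionaries are not given in the paper:

```python
import re

# Illustrative stand-ins for the study's dictionaries (assumptions):
CONTRACTIONS = {"don't": "do not", "it's": "it is", "i'm": "i am"}
STOP_WORDS = {"the", "a", "is", "to", "do", "not", "it", "i", "am"}

def preprocess(vector, habitual_words=frozenset()):
    text = vector.lower()
    # 1) Expand contractions so meaningful words are not lost in tokenization.
    for contraction, expansion in CONTRACTIONS.items():
        text = text.replace(contraction, expansion)
    # 2) Tokenize with a regular expression that drops extra characters
    #    (punctuation, digits) that do not affect the sentiment score.
    tokens = re.findall(r"[a-z]+", text)
    # 3) Remove stop words and habitually repeated filler words.
    return [t for t in tokens if t not in STOP_WORDS and t not in habitual_words]

tokens = preprocess("Don't forget, it's the LAST test-case!!", habitual_words={"like"})
# tokens == ["forget", "last", "test", "case"]
```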
In the last step of the algorithm, we performed sentiment
analysis on the parsed text by applying the TextBlob
library, the Natural Language Toolkit (NLTK), and the
Valence Aware Dictionary and sEntiment Reasoner
(VADER) tool to measure the positive sentiment and
subjectivity scores of all vectors in each dataset.
TextBlob, a rule-based sentiment analysis tool,
provides essential components for basic
natural-language processing, such as calculating
polarity and subjectivity [26]. The output subjectivity
level is a float within the range [0.0, 1.0], where
0.0 is very objective and 1.0 is very subjective [27].
We apply TextBlob to measure the subjectivity of
the text and use NLTK and VADER for measuring the
sentiments (i.e. polarity and valence of the records).
Fig. 2. Text-mining algorithm
NLTK is a tool that allows tokenization and word
frequency analysis on the corpus. The VADER
sentiment analysis algorithm is applied due to its higher
precision and accuracy, particularly on short,
string-level tokens, compared to other well-known
sentiment analysis tools [23, 24, 25]. Most sentiment analysis
tools have either polarity-based or valence-based
approaches. Polarity determines if a part of the text is
positive or negative, while valence-based approaches
determine the intensity of each sentiment class.
VADER is both polarity-based and valence-based
which outputs sentiment scores into 4 classes of
‘Negative’, ‘Neutral’, ‘Positive’ and ‘Compound’ with
values between -1 to 1. The compound value is the
normalized value of the sum of valence scores of each
word in the lexicon, adjusted according to the rules.
Equation (1) shows how the compound value is calculated
based on the normalized sum of valence scores:

compound = sum_val / sqrt(sum_val^2 + α)    (1)

where sum_val is the sum of the sentiment arguments passed
to the score_valence() function in the VADER algorithm, and
α is a normalization constant (α = 15 in the VADER
implementation).
Compound value is the most useful metric for a
unidimensional measure of sentiment. Depending on
the context, the threshold for neutral compound value
normally can be anywhere between -0.05 and 0.05 [32].
In this study, to determine the threshold for the neutral
compound value, we performed k-means clustering on the
compound values using the elbow method. The result
indicated that the optimal number of clusters is
three. The 3-means clustering of the data showed that most
records fall into the class near the zero value.
Therefore, we consider zero the threshold for
classifying sentences as positive, neutral, or negative,
meaning any negative compound value is labeled as
negative, positive values are labeled as positive, and
vectors with a zero compound value are considered
neutral. Finally, we normalized the sum of all
compound values greater than 0 (i.e.,
positive compound values) and the subjectivity scores of all
vectors in each dataset.
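The normalization in equation (1) and the zero-threshold labeling can be sketched in Python as follows. Here α = 15 matches the constant in the reference VADER implementation, and the zero threshold follows the clustering result above rather than the usual [-0.05, 0.05] band:

```python
import math

ALPHA = 15  # normalization constant used by the reference VADER implementation

def compound(sum_val):
    """Normalize the sum of valence scores into [-1, 1], per equation (1)."""
    return sum_val / math.sqrt(sum_val * sum_val + ALPHA)

def label(compound_value):
    """Classify a vector by its compound value using a zero threshold."""
    if compound_value > 0:
        return "positive"
    if compound_value < 0:
        return "negative"
    return "neutral"

print(round(compound(4.0), 4))  # 4/sqrt(31) ≈ 0.7184
print(label(compound(-1.5)))    # negative
```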
B. Data collection (study design)
To implement the algorithm and test the hypotheses,
we targeted a CS1 active learning class during spring
2019 for data collection, where the students worked in
low-stake, semester-long teams [13, 28, 29]. Teams
were formed at the beginning of the semester by a
gamified activity such that the students were paired
based on their month of birth regardless of their gender.
Since students themselves were involved in the process
of team formation, we observed a significant level of
satisfaction in students about their teammates. The
class was conducted twice a week, with each session
lasting 75 minutes. In each class, the last 40 minutes were
dedicated to the class activity when the students
worked on class activities in pairs as we recorded their
speech. In order to maintain consistency in data
collection, if a team member was absent from a class,
that team was excluded from recording that day. The
number of participants analyzed in this study is 28
students (i.e. 14 teams) during 5 weeks of the semester.
In this course, students’ performance was measured
in a formative style. Before every class, they took a
mini quiz on their understanding of the prep material.
Each class started with a short poll quiz from prep-
work [28] followed by a mini-lecture if students had a
problem understanding the content. Then students were
assigned to work on a graded class activity in their
teams and were asked to submit an exit form (minute
paper) individually once they finished the class activity
[33]. During the semester individuals were assessed in
4 tests, 4 major assignments, and 4 lab tests. The final
grade of the students was calculated based on their
grades in class and lab tests, assignments, class
activities and polls as well as prep work quiz grades
with different proportions. In this study, we consider
students’ final grades as the performance metric and
examine whether students’ positive sentiment and
subjectivity correlate with their performance. In the following, we
show the results of the data analysis and the hypothesis tests.
C. Results
To identify the correlation between positive
sentiments and performance, we consider both the
intensity and frequency of positive compound values
for analysis. The intensity of positive compound value
determines the level of positive sentiment, while
frequency of positive compound value denotes the
occurrence rate of positive sentiment in the corpus.
“Fig. 3” plots the distribution of positive sentiment and
subjectivity scores of all 28 participants. The horizontal
axis shows the participant numbers and the vertical axis
indicates the scores (i.e. level of positive sentiment and
subjectivity).
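One plausible way to compute these two metrics from a student's list of compound values is sketched below. Since the paper does not give explicit formulas, the exact definitions here (mean of positive values for intensity, share of positive vectors for frequency) are assumptions:

```python
# Assumed operationalization (not spelled out in the paper):
# frequency = share of a student's vectors with a positive compound value,
# intensity = mean of those positive compound values.

def positive_metrics(compound_values):
    positives = [v for v in compound_values if v > 0]
    frequency = len(positives) / len(compound_values) if compound_values else 0.0
    intensity = sum(positives) / len(positives) if positives else 0.0
    return intensity, frequency

intensity, frequency = positive_metrics([0.5, 0.0, -0.2, 0.3, 0.0])
# frequency ≈ 0.4 (2 of 5 vectors are positive); intensity ≈ 0.4 (mean of 0.5 and 0.3)
```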
Fig. 3. Positive sentiment and subjectivity scatter plot
As shown in “Fig. 3”, the intensity of positive
compound values is consistently lower than the
frequency of positive compound values, and the
range of subjectivity scores is larger than that of the
intensity and frequency of positive compound values.
The grade distribution of the participants in this
study is shown in Table 1. The normal distribution of
the grades shows an equal number of high-performing and
low-performing students, while most of the participants
are in the medium range of grades B and C.
TABLE 1. GRADE DISTRIBUTION OF PARTICIPANTS
In “Fig. 4”, “Fig. 5”, “Fig. 6”, and “Fig. 7” the
kernel density plot of the intensity of compound values
for the four grade categories (A, B, C and DFW) are
presented. In these plots the horizontal axis shows the
intensity level of compound value and the vertical axis
denotes the density level. The plots show more density
in positive compound values for students with higher
grades, and as students’ grades decrease, the density
of positive compound values decreases.
Fig. 4. Compound value kernel density plot of Grade A
Fig. 5. Compound value kernel density plot of Grade B
Fig. 6. Compound value kernel density plot of Grade C
Fig. 7. Compound value kernel density plot of Grade DFW
To answer the research question, we measure the
correlation of the positive compound value (intensity
and frequency) and the performance score as well as
the correlation of the subjectivity score and the
performance score by applying Spearman's rank
correlation coefficient. For hypothesis testing, we
applied a two-tailed p-value statistical method and
measured the p value for each correlation.
Spearman’s rank correlation coefficient is a
nonparametric (distribution-free) rank statistic for
measuring the strength of an association between two
variables [31]. It assesses how well the relationship
between two variables can be described using a
monotonic function, without making any assumptions
about the frequency distribution of the variables [31].
The coefficient value is signified by r_s, which can be
anywhere between -1 and 1. The interpretation is that
the closer r_s is to +1 or -1, the stronger the monotonic
relationship between the two variables. The strength
of the correlation can be described using the following
guide for the absolute value of r_s [30]: r_s = .00-.19 “very
weak”, r_s = .20-.39 “weak”, r_s = .40-.59 “moderate”,
r_s = .60-.79 “strong”, r_s = .80-1.0 “very strong”. The r_s
value is calculated using equation (2) (n = number of cases):

r_s = 1 - (6 Σ d_i^2) / (n(n^2 - 1))    (2)

where d_i is the difference in ranks for each pair of observations.
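Equation (2) can be implemented directly for tie-free data, as sketched below (with tied ranks a correction is needed, or a library routine such as scipy.stats.spearmanr can be used instead). The sample data are hypothetical:

```python
# Sketch of Spearman's r_s from equation (2), valid when there are no ties.

def ranks(values):
    """Rank each value from 1 (smallest) to n (largest), assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rs(x, y):
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d_sq = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d_sq) / (n * (n * n - 1))

# Hypothetical data: per-student positive-sentiment scores vs. final grades.
sentiment = [0.10, 0.25, 0.40, 0.55, 0.70]
grades = [72, 65, 88, 80, 95]
print(spearman_rs(sentiment, grades))  # ≈ 0.8, "very strong" per the guide above
```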
By using Spearman’s correlation coefficient
equation, we measured the correlation of the
positive compound values (intensity and frequency), as
well as subjectivity, with the performance score. The
coefficient values (r_s) are presented in Table 2.
TABLE 2. COEFFICIENT VALUES OF POSITIVE SENTIMENT, SUBJECTIVITY AND PERFORMANCE
The coefficient values (r_s) related to the intensity
and frequency of positive sentiment indicate that both have
a strong positive correlation with performance. The
calculated coefficient value of subjectivity shows a
moderate positive correlation with performance.
“Fig. 8”, ”Fig. 9” and “Fig. 10” visualize linear
regression analysis of positive sentiment and
subjectivity vs performance. In these plots the
horizontal axis specifies the positive compound value
(intensity and frequency) and subjectivity while the
vertical axis shows performance score of each
participant.
In “Fig. 8” and “Fig. 9”, we observe a
consistent pattern between positive sentiment
(intensity and frequency) and performance (i.e., higher
performance scores correspond to higher positive sentiment).
On the other hand, in “Fig. 10”, the regression plot of
subjectivity and performance does not show any
consistent pattern between these two metrics.
Fig. 8. Positive compound value (intensity) - performance
regression plot
Fig. 9. Positive compound value (frequency) - performance
regression plot
Fig. 10. Subjectivity - performance regression plot
For hypothesis testing, we applied a two-tailed p-value
statistical method. The p values of positive sentiment
and subjectivity scores and performance are presented
in Table 3.
TABLE 3. P-VALUES OF POSITIVE SENTIMENT, SUBJECTIVITY AND PERFORMANCE
In testing H1, the null hypothesis states:
H1_0: There is no correlation between a student’s
positive sentiment in teamwork and that student’s
individual performance in the course.
The calculated p value for intensity of positive
compound value is 0.001 and the p value for frequency
of positive compound value is 0.002. Since the p values
for both the intensity and frequency of positive
compound values are statistically significant, the null
hypothesis is rejected, confirming that there is a
correlation between students’ positive sentiment and
their performance.
For validating H2, the null hypothesis states:
H2_0: There is no correlation between the level of
subjectivity in a student’s speech in teamwork and that
student’s individual performance. Considering the
subjectivity and the performance metrics, the
calculated p value is 0.03. Since the p value is less than
the significance level of 0.05, the null hypothesis is
rejected, which indicates there is a correlation between
students’ level of subjectivity in speech and their
individual performance.
In another observation from the data analysis, we found
a strong negative correlation between the frequency of
students’ neutral sentiment and their performance. The
calculated coefficient value (r_s) for the neutral
sentiment is -0.61, and the p value is 0.001. This means
that students with higher performance scores had fewer
records with neutral sentiment scores. “Fig. 11” shows
the regression plot of the frequency of neutral sentiment
values vs. performance.
Fig. 11. Neutral sentiment frequency and performance
regression plot
However, we did not find any correlation between
students’ negative sentiments and their individual
performance.
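The intensity and frequency metrics reported above can be operationalized from per-utterance compound scores. The following is a minimal sketch, not the authors’ pipeline: it assumes VADER-style compound values in [-1, 1] and VADER’s documented class thresholds (compound ≥ 0.05 is positive, ≤ -0.05 is negative, otherwise neutral); the scores below are illustrative, not the study’s data.

```python
def sentiment_metrics(compounds):
    """Classify VADER-style compound scores and return the frequency of
    each sentiment class plus the mean intensity of the positive records."""
    classes = []
    for c in compounds:
        if c >= 0.05:
            classes.append("positive")
        elif c <= -0.05:
            classes.append("negative")
        else:
            classes.append("neutral")
    n = len(classes)
    # Frequency: share of a student's utterances falling in each class.
    freq = {label: classes.count(label) / n
            for label in ("positive", "neutral", "negative")}
    # Intensity: mean compound value over the positive utterances only.
    positives = [c for c in compounds if c >= 0.05]
    intensity = sum(positives) / len(positives) if positives else 0.0
    return freq, intensity

# Illustrative per-utterance compound scores for one student.
freq, intensity = sentiment_metrics([0.6, 0.0, 0.4, -0.3, 0.02, 0.8])
```

Per-student values of `freq["positive"]`, `freq["neutral"]`, and `intensity` computed this way are what a correlation against course performance would consume.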
IV. CONCLUSION AND FUTURE WORK
The purpose of this study is to analyze the
correlation between students’ positive sentiments as
they talk during teamwork and their individual
performance in the course. The identified correlations
help instructors make novel interventions to
encourage students’ engagement in teams and to
identify at-risk students in order to provide them with
timely feedback and support.
The novelty of this study is: 1) using verbal
conversations as a medium to measure affective states,
and 2) focusing on low-stake teams (i.e., lightweight
teams). There has been much focus on methods to
evaluate students’ performance in capstone teams.
However, due to the low grade contribution and lack of
student engagement in low-stake teams, evaluating
students in such settings has been a challenge.
Supported by this research, we argue that capturing
students’ affective states in low-stake teams can help us
evaluate individuals’ performance.
By developing a text-mining algorithm as shown in
“Fig. 2”, we operationalized the positive sentiment and
subjectivity level for each student as they spoke in
teams. Based on the results of this work, we conclude
that there is a strong correlation between performance
and positive sentiment, as well as the level of
subjectivity. This method helps in evaluating students’
level of involvement in low-stake teams, and by
performing frequent formative analysis of their speech
during teamwork, we can identify lower performers and
provide them with more learning opportunities.
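Subjectivity scoring of this kind is typically lexicon-based: TextBlob, for example, averages per-word subjectivity values drawn from a lexicon. The sketch below is a toy illustration of that idea only, not the authors’ algorithm or TextBlob’s actual implementation; the lexicon entries and example utterances are hypothetical.

```python
# Toy lexicon: each word maps to a subjectivity value in [0, 1], where
# opinion words score high and technical terms score near zero.
# These entries are hypothetical, chosen for illustration only.
SUBJECTIVITY_LEXICON = {
    "great": 0.9, "awful": 0.9, "think": 0.7, "maybe": 0.6,
    "loop": 0.0, "compile": 0.0, "array": 0.0,
}

def subjectivity(utterance):
    """Average the subjectivity of lexicon words found in an utterance;
    return 0.0 when no lexicon word matches."""
    words = utterance.lower().split()
    scores = [SUBJECTIVITY_LEXICON[w] for w in words
              if w in SUBJECTIVITY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

# An opinionated utterance scores higher than a purely technical one.
opinionated = subjectivity("I think this loop is great")
technical = subjectivity("compile the array")
```

Averaging a student’s per-utterance scores over a session yields the kind of per-student subjectivity level that can then be correlated with performance.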
In future work we will conduct multi-class
sentiment analysis on the collected data by studying
diverse classes of sentiment, such as joy, anger, and
anxiety, rather than just the polarity of the sentiments.
We will analyze these sentiment classes to identify their
correlation with students’ performance. As a next step,
we will conduct aspect-based sentiment analysis on the
corpus to identify the areas or themes in which students
show more positive or negative sentiments. The results
of this analysis would help instructors obtain more
precise feedback from the context of speech in teams
and enable them to implement cognitive interventions.
References:
[1] Salomon, G., & Perkins, D. N. (1998). Individual and social
aspects of learning. Review of Research in Education, 23, 1–
24.
[2] Salas, E., Reyes, D. L., & Woods, A. L. (2017). The assessment
of team performance: Observations and needs, 21–37.
[3] Von Davier, A. A., & Zapata-Rivera, D. (2016). A tough nut to
crack, 344–359.
[4] Latulipe, C., Long, N. B., & Seminario, C. E. (2015, February).
Structuring flipped classes with lightweight teams and
gamification. In Proceedings of the 46th ACM Technical
Symposium on Computer Science Education (pp. 392-397).
ACM.
[5] Ilgen, D., Hollenbeck, J., Johnson, M., & Jundt, D. (2005).
Teams in organizations: From input-process-output models to
IMOI models. Annual Review of Psychology, 56, 517–543.
[6] Kozlowski, S. W. J., & Bell, B. S. (2003). Work groups and
teams in organizations. In W. C. Borman, D. R. Ilgen, & R. J.
Klimoski (Eds.), Handbook of psychology: Industrial and
organizational psychology (Vol. 12, pp. 333–375). London,
England: Wiley.
[7] Arguedas, M., Daradoumis, T., & Xhafa, F. (2014, July).
Towards an emotion labeling model to detect emotions in
educational discourse. In 2014 Eighth International
Conference on Complex, Intelligent and Software Intensive
Systems (pp. 72-78). IEEE.
[8] Pekrun, R., Goetz, T., Frenzel, A. C., Barchfeld, P., & Perry, R.
P. (2011). Measuring emotions in students’ learning and
performance: The Achievement Emotions Questionnaire
(AEQ). Contemporary educational psychology, 36(1), 36-48.
[9] Anders, S., Heinzle, J., Weiskopf, N., Ethofer, T., & Haynes, J.
(2011). Flow of affective information between communicating
brains. Neuroimage, 54, 439–446
[10] Schippers, M., Roebroeck, A., Renken, R., Nanetti, L., &
Keysers, C. (2010). Mapping the information flows from one
brain to another during gestural communication. Proceedings
of the National Academy of Sciences USA, 107, 9388–9393
[11] Hudlicka, E. (2003). To feel or not to feel: The role of affect
in human–computer interaction. International journal of
human-computer studies, 59(1-2), 1-32.
[12] Arguedas, M., Daradoumis, A., & Xhafa, F. (2016).
Analyzing how emotion awareness influences students'
motivation, engagement, self-regulation and learning
outcome. Educational Technology and Society, 19(2), 87-103.
[13] Dehbozorgi, N., MacNeil, S., Maher, M. L., & Dorodchi, M.
(2018, October). A Comparison of Lecture-based and Active
Learning Design Patterns in CS Education. In 2018 IEEE
Frontiers in Education Conference (FIE) (pp. 1-8). IEEE.
[14] Salas, E., Burke, C. S., & Fowlkes, J. E. (2005). Measuring
team performance “in the wild:” Challenges and tips. In W.
Bennet Jr., C. E. Lance, & D. J. Woehr (Eds.), Performance
measurement: Current perspectives and future challenges (pp.
245–272). Mahwah, NJ, USA: Erlbaum
[15] Driskell, J. E., Salas, E., & Hughes, S. (2010). Collective
orientation and team performance: development of an
individual differences measure. Human Factors, 52(2), 316–
328.
[16] Schutz, P. A., & Pekrun, R. (Eds.). (2007). Emotion in
education. San Diego, CA: Academic Press.
[17] Araújo-Simões, A. C., & Guedes-Gondim, S. M. (2016).
Performance and affects in group problem-solving. Revista de
Psicología del Trabajo y de las Organizaciones, 32(1), 47-54.
[18] McLeod, D. B. (1992). Research on affect in mathematics
education: A reconceptualization. Handbook of research on
mathematics teaching and learning, 1, 575-596.
[19] Laguador, J. M. (2013). Developing students’ attitude leading
towards a life-changing career. Educational Research
International, 1(3), 28-33.
[20] Munezero, M., Montero, C. S., Mozgovoy, M., & Sutinen, E.
(2013, November). Exploiting sentiment analysis to track
emotions in students' learning diaries. In Proceedings of the
13th Koli Calling International Conference on Computing
Education Research (pp. 145-152). ACM.
[21] Maras, P., & Kutnick, P. (1999). Emotional and behavioural
difficulties in schools: Consideration of relationships between
theory and practice. Social Psychology of Education, 3(3),
135-153.
[22] Tarmazdi, H., Vivian, R., Szabo, C., Falkner, K., & Falkner,
N. (2015, June). Using learning analytics to visualise computer
science teamwork. In Proceedings of the 2015 ACM
Conference on Innovation and Technology in Computer
Science Education (pp. 165-170). ACM.
[23] Ahmed, T., Bosu, A., Iqbal, A., & Rahimi, S. (2017, October).
SentiCR: a customized sentiment analysis tool for code review
interactions. In Proceedings of the 32nd ieee/acm international
conference on automated software engineering (pp. 106-111).
IEEE Press.
[24] Hood, K., & Kuiper, P. K. (2018, January). Improving student
surveys with natural language processing. In 2018 Second
IEEE International Conference on Robotic Computing
(IRC)(pp. 383-386). IEEE.
[25] Hutto, C. J., & Gilbert, E. (2014, May). Vader: A parsimonious
rule-based model for sentiment analysis of social media text.
In Eighth international AAAI conference on weblogs and
social media.
[26] Hasan, A., Moin, S., Karim, A., & Shamshirband, S. (2018).
Machine learning-based sentiment analysis for twitter
accounts. Mathematical and Computational Applications,
23(1), 11.
[27] Loria, S. (2018). textblob Documentation (pp. 1-73). Technical
report.
[28] Dehbozorgi, N., Maher, M. L., MacNeil, S., & Dorodchi, M.
(2018). An object-based design pattern model for
collaborative active learning (manuscript submitted to the
Computer Science Education Journal, peer reviewed).
[29] Dehbozorgi, N. (2017, August). Active Learning Design
Patterns for CS Education. In Proceedings of the 2017 ACM
Conference on International Computing Education
Research(pp. 291-292). ACM.
[30] Mukaka, M. M. (2012). A guide to appropriate use of
correlation coefficient in medical research. Malawi Medical
Journal, 24(3), 69-71.
[31] Hauke, J., & Kossowski, T. (2011). Comparison of values of
Pearson's and Spearman's correlation coefficients on the same
sets of data. Quaestiones geographicae, 30(2), 87-93.
[32] Bonta, V., & Janardhan, N. K. N. (2019). A comprehensive
study on lexicon based approaches for sentiment analysis.
Asian Journal of Computer Science and Technology, 8(S2), 1-
6.
[33] Dehbozorgi, N., & MacNeil, S. (2019). Semi-automated
analysis of reflection as a continuous course improvement
tool. In 2019 IEEE Frontiers in Education Conference (FIE).
IEEE.
The abundance of teams within organizations illustrates the importance of team performance measurement—tools that measure teamwork. Taking into account the inherently complex nature of teams, this chapter presents a few insights and a picture of the research and practice on teamwork measurement over time. We define what makes a team and identify the characteristics of an effective team. Then, we present critical observations to team performance measurement that reflect the 30 years of experience of the first author, at observing, measuring, and assessing team performance in various domains. These observations provide insight into what attitudes, behaviors, and cognitions—how teams feel, act, and think—play an integral role in performance assessment, while taking situational factors and construct considerations into account. Support is presented from the literature on teams and performance measurement, and we provide major contributions from a sample of team performance measurement literature in the past 30 years. We conclude with a discussion on needs for developing future team-based measurement approaches. In this discussion of the future, emphasis is placed on our need, as a field, to continue closing the gap between research and practice through designing and validating effective performance-based measures that target practitioner needs.