Carolyn Penstein Rosé

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

Publications (170) · 16.44 Total impact

  • ABSTRACT: In this paper, we explore student dropout behavior in a Massively Open Online Course (MOOC). We use a survival model to measure the impact of three social factors that make predictions about attrition along the way for students who have participated in the course discussion forum.
    Proceedings of the first ACM conference on Learning @ scale conference; 03/2014
  • ABSTRACT: The unique social presence of robots can be leveraged in learning situations to reduce student evaluation anxiety, while still providing instructional guidance on multiple levels of communication. Furthermore, social role of the instructor can also impact the prevalence of evaluation apprehension. In this study, we examine how human and robot social role affects help-seeking behaviors and learning outcomes in a one-on-one tutoring setting. Our results show that help-seeking is a moderator of the significant relationship between condition and learning, with the "human teacher" condition resulting in significantly less learning (and marginally less help-seeking) than the "human assistant" and both robot conditions.
    Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction; 03/2014
  • Rohit Kumar, Carolyn P. Rosé
    ABSTRACT: Conversational agent technology is an emerging paradigm for creating a social environment in online groups that is conducive to effective teamwork. Prior work has demonstrated advantages in terms of learning gains and satisfaction scores when groups learning together online have been supported by conversational agents that employ Balesian social strategies. This prior work raises two important questions that are addressed in this article. The first question is one of generality. Specifically, are the positive effects of the designed support specific to learning contexts? Or are they in evidence in other collaborative task domains as well? We present a study conducted within a collaborative decision-making task where we see that the positive effects of the Balesian social strategies extend to this new context. The second question is whether it is possible to increase the effectiveness of the Balesian social strategies by increasing the context sensitivity with which the social strategies are triggered. To this end, we present technical work that increases the sensitivity of the triggering. Next, we present a user study that demonstrates an improvement in performance of the support agent with the new, more sensitive triggering policy over the baseline approach from prior work. The technical contribution of this article is that we extend prior work where such support agents were modeled using a composition of conversational behaviors integrated within an event-driven framework. Within the present approach, conversation is orchestrated through context-sensitive triggering of the composed behaviors. The core effort involved in applying this approach involves building a set of triggering policies that achieve this orchestration in a time-sensitive and coherent manner. 
    In line with recent developments in data-driven approaches for building dialog systems, we present a novel technique for learning behavior-specific triggering policies, deploying it as part of our efforts to improve a socially capable conversational tutor agent that supports collaborative learning.
    ACM Transactions on Interactive Intelligent Systems (TiiS). 01/2014; 3(4).
  • Diyi Yang, Tanmay Sinha, David Adamson, Carolyn Penstein Rose
    ABSTRACT: In this paper, we explore student dropout behavior in Massive Open Online Courses (MOOCs). We use as a case study a recent Coursera class from which we develop a survival model that allows us to measure the influence of factors extracted from that data on student dropout rate. Specifically, we explore factors related to student behavior and social positioning within discussion forums using standard social network analytic techniques. The analysis reveals several significant predictors of dropout.
    NIPS Workshop on Data Driven Education; 12/2013
  • Elijah Mayfield, M Barton Laws, Ira B Wilson, Carolyn Penstein Rosé
    ABSTRACT: Coding of clinical communication for fine-grained features such as speech acts has produced a substantial literature. However, annotation by humans is laborious and expensive, limiting application of these methods. We aimed to show that through machine learning, computers could code certain categories of speech acts with sufficient reliability to make useful distinctions among clinical encounters. The data were transcripts of 415 routine outpatient visits of HIV patients which had previously been coded for speech acts using the Generalized Medical Interaction Analysis System (GMIAS); 50 had also been coded for larger scale features using the Comprehensive Analysis of the Structure of Encounters System (CASES). We aggregated selected speech acts into information-giving and requesting, then trained the machine to automatically annotate using logistic regression classification. We evaluated reliability by per-speech act accuracy. We used multiple regression to predict patient reports of communication quality from post-visit surveys using the patient and provider information-giving to information-requesting ratio (briefly, information-giving ratio) and patient gender. Automated coding produces moderate reliability with human coding (accuracy 71.2%, κ=0.57), with high correlation between machine and human prediction of the information-giving ratio (r=0.96). The regression significantly predicted four of five patient-reported measures of communication quality (r=0.263-0.344). The information-giving ratio is a useful and intuitive measure for predicting patient perception of provider-patient communication quality. These predictions can be made with automated annotation, which is a practical option for studying large collections of clinical encounters with objectivity, consistency, and low cost, providing greater opportunity for training and reflection for care providers.
    Journal of the American Medical Informatics Association 09/2013; · 3.57 Impact Factor
  • ABSTRACT: This paper investigates the use of conversational agents to scaffold online collaborative learning discussions through an approach called academically productive talk (APT). In contrast to past work on dynamic support for collaborative learning, which has involved using agents to elevate the conceptual depth of collaborative discussion by leading students in groups through directed lines of reasoning, this APT-based approach lets students follow their own lines of reasoning and promotes productive practices such as explanation of reasoning and refinement of ideas. Two forms of support are contrasted, namely, Revoicing support and Feedback support. The study provides evidence that Revoicing support resulted in significantly more intensive reasoning exchange between students in the chat and significantly more learning during the chat than when that form of support was absent. Another form of support, namely, Feedback support increased expression of reasoning while marginally decreasing the intensity of the interaction between students and did not affect learning.
    IEEE Transactions on Learning Technologies 01/2013; 6(3):240-247. · 0.76 Impact Factor
  • NAACL HLT 2013; 01/2013
  • ABSTRACT: In this paper, we propose a novel semi-supervised approach for detecting profanity-related offensive content in Twitter. Our approach exploits linguistic regularities in profane language via statistical topic modeling on a huge Twitter corpus, and detects offensive tweets using these automatically generated features. Our approach performs competitively with a variety of machine learning (ML) algorithms. For instance, our approach achieves a true positive rate (TP) of 75.1% over 4029 testing tweets using Logistic Regression, significantly outperforming the popular keyword matching baseline, which has a TP of 69.7%, while keeping the false positive rate (FP) at the same level as the baseline at about 3.77%. Our approach provides an alternative to large scale hand annotation efforts required by fully supervised learning approaches.
    Proceedings of the 21st ACM international conference on Information and knowledge management; 10/2012
  • Elijah Mayfield, Miaomiao Wen, Mitch Golant, Carolyn Penstein Rosé
    ABSTRACT: For users of online support groups, prior research has suggested that a positive social environment is a key enabler of coping. Typically, demonstrating such claims about social interaction would be approached through the lens of sentiment analysis. In this work, we argue instead for a multifaceted view of emotional state, which incorporates both a static view of emotion (sentiment) with a dynamic view based on the behaviors present in a text. We codify this dynamic view through data annotations marking information sharing, sentiment, and coping efficacy. Through machine learning analysis of these annotations, we demonstrate that while sentiment predicts a user's stress at the beginning of a chat, dynamic views of efficacy are stronger indicators of stress reduction.
    Proceedings of the 17th ACM international conference on Supporting group work; 10/2012
  • Miaomiao Wen, Carolyn Penstein Rose
    ABSTRACT: This paper presents an automatic analysis method that enables efficient examination of participant behavior trajectories in online communities, which offers the opportunity to examine behavior over time at a level of granularity that has previously only been possible in small scale case study analyses. We provide an empirical validation of its performance. We then illustrate how this method offers insights into behavior patterns that enable avoiding faulty oversimplified assumptions about participation, such as that it follows a consistent trend over time. In particular, we use this method to investigate the connection between user behavior and distressful cancer events and demonstrate how this tool could assist in cancer story summarization.
    Proceedings of the 17th ACM international conference on Supporting group work; 10/2012
  • ABSTRACT: We present a systematic analysis of existing multi-domain learning approaches with respect to two questions. First, many multi-domain learning algorithms resemble ensemble learning algorithms. (1) Are multi-domain learning improvements the result of ensemble learning effects? Second, these algorithms are traditionally evaluated in a balanced class label setting, although in practice many multi-domain settings have domain-specific class label biases. When multi-domain learning is applied to these settings, (2) are multi-domain methods improving because they capture domain-specific class biases? An understanding of these two issues presents a clearer idea about where the field has had success in multi-domain learning, and it suggests some important open questions for improving beyond the current state of the art.
    Conference on Empirical Methods in Natural Language Processing; 07/2012
  • Elijah Mayfield, David Adamson, Carolyn Penstein Rosé
    ABSTRACT: Conversational practices do not occur at a single unit of analysis. To understand the interplay between social positioning, information sharing, and rhetorical strategy in language, various granularities are necessary. In this work we present a machine learning model for multi-party chat which predicts conversation structure across differing units of analysis. First, we mark sentence-level behavior using an information sharing annotation scheme. By taking advantage of Integer Linear Programming and a sociolinguistic framework, we enforce structural relationships between sentence-level annotations and sequences of interaction. Then, we show that clustering these sequences can effectively disentangle the threads of conversation. This model is highly accurate, performing near human accuracy, and performs analysis on-line, opening the door to real-time analysis of the discourse of conversation.
    Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue; 07/2012
  • David Adamson, Carolyn Penstein Rosé
    ABSTRACT: The field of computer supported collaborative learning has evolved an ontology of types of support for group learning. In recent years, conversational agents have been used successfully to realize forms of dynamic micro and macro level script based support for group learning. However, when existing architectures are used to manage the coordination of these agent-based behaviors (which can vary widely in scope, timing, and constraints), infelicitous "collisions" of behaviors have been observed. In this paper, we introduce a new architecture that facilitates the development, coordination, and co-performance of multiple agent-based support behaviors.
    Proceedings of the 11th international conference on Intelligent Tutoring Systems; 06/2012
  • ABSTRACT: In this paper, we explore using an intelligent dialogue tutor to influence student academic self-efficacy, as well as its interaction with group self-efficacy composition in a dyadic learning environment. We find providing additional tutor prompts encouraging students to participate in discussion may have unexpected negative effects on self-efficacy, especially on students with low self-efficacy scores who have partners with low self-efficacy scores.
    Proceedings of the 11th international conference on Intelligent Tutoring Systems; 06/2012
  • Gregory Dyke, David Adamson, Iris Howley, Carolyn Penstein Rosé
    ABSTRACT: In this paper, we investigate the use of conversational agents to scaffold on-line collaborative learning discussions through an approach called academically productive talk. In contrast to past work, which has involved using agents to elevate the conceptual depth of collaborative discussion by leading students in groups through directed lines of reasoning, this approach lets students follow their own lines of reasoning and promotes productive practices such as explaining, stating agreement and disagreement, and reading and revoicing the statements of other students. We contrast two types of academically productive talk support for a discussion about 9th grade biology and show that one type in particular has a positive effect on the overall conversation, while the other is worse than no support. This positive effect carries over onto participation in a full-class discussion the following day. We use a sociolinguistic style analysis to investigate how the two types of support influence the discussion and draw conclusions for redesign. In particular, our findings have implications for how dynamic micro-scripting agents such as those scaffolding academically productive talk can be used in consort with more static macro- and micro- scripting.
    Proceedings of the 11th international conference on Intelligent Tutoring Systems; 06/2012
  • ABSTRACT: SimStudent, an intelligent-agent architecture that generates a cognitive model from worked-out examples, currently interacts with human subjects only in a limited capacity. In our application, SimStudent attempts to solve algebra equations, querying the user about the correctness of each step as it solves, and the user explains the step in natural language. Based on that input, SimStudent can choose to ask further questions that prompt the user to think harder about the problem in an attempt to elicit deeper responses. We show how text classification techniques can be used to train models that can distinguish between different categories of student feedback to SimStudent, and how this enables interaction with SimStudent in a pilot study.
    Proceedings of the 11th international conference on Intelligent Tutoring Systems; 06/2012
  • ABSTRACT: This paper describes an application of machine translation technology for supporting collaboration in Wikipedia. Wikipedia hosts separate language Wikipedias for hundreds of different languages. While some content is specific to these different versions of Wikipedia, some topics have pages within multiple different Wikipedias. Similarly, while some users participate only in one Wikipedia, we find users who play a bridging role between these sub-communities and participate in the process of maintaining similar pages in different Wikipedias. Since these are not the majority of users, a support tool that allows stretching the effort of these specialized users further by indicating where their effort is needed could be a tremendous benefit to the community. An evaluation of the proposed approach demonstrates promise that such a tool could substantially reduce the effort involved in playing this bridging role on Wikipedia.
    01/2012;
  • ABSTRACT: In this work, we employ quantitative methods to describe the discourse practices observed in a direction giving task. We place a special emphasis on comparing differences in strategies between two separate populations and between successful and unsuccessful groups. We isolate differences in these strategies through several novel representations of discourse practices. We find that information sharing, instruction giving, and social feedback strategies are distinct between subpopulations in empirically identifiable ways.
    01/2012;
  • Philip Gianfortoni, David Adamson, Carolyn P. Rosé
    ABSTRACT: In this paper we describe a novel feature discovery technique that can be used to model stylistic variation in sociolects. While structural features offer much in terms of expressive power over simpler features used more frequently in machine learning approaches to modeling linguistic variation, they frequently come at an excessive cost in terms of feature space size expansion. We propose a novel form of structural features referred to as "stretchy patterns" that strike a balance between expressive power and compactness in order to enable modeling stylistic variation with reasonably small datasets. As an example we focus on the problem of modeling variation related to gender in personal blogs. Our evaluation demonstrates a significant improvement over standard baselines.
    Proceedings of the First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties; 07/2011
  • Dong Nguyen, Noah A. Smith, Carolyn P. Rosé
    ABSTRACT: While the study of the connection between discourse patterns and personal identification is decades old, the study of these patterns using language technologies is relatively recent. In that more recent tradition we frame author age prediction from text as a regression problem. We explore the same task using three very different genres of data simultaneously: blogs, telephone conversations, and online forum posts. We employ a technique from domain adaptation that allows us to train a joint model involving all three corpora together as well as separately and analyze differences in predictive features across joint and corpus-specific aspects of the model. Effective features include both stylistic ones (such as POS patterns) as well as content oriented ones. Using a linear regression model based on shallow text features, we obtain correlations up to 0.74 and mean absolute errors between 4.1 and 6.8 years.
    Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities; 06/2011

Publication Stats

2k Citations
16.44 Total Impact Points

Institutions

  • 2–2014
    • Carnegie Mellon University
      • Language Technologies Institute
      • Human-Computer Interaction Institute
      Pittsburgh, Pennsylvania, United States
  • 1998–2006
    • University of Pittsburgh
      • Learning Research and Development Center
      Pittsburgh, Pennsylvania, United States