Chapter

What Are IBD Patients Talking About on Twitter?

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

In recent years, social networking sites and online communities have served as alternative information sources for patients, who use social media to share health and treatment information, learn from each other’s experiences, and provide social support. This research aimed to investigate what patients with Inflammatory Bowel Disease (IBD) are talking about on Twitter and to learn from the experiential knowledge of living with the disease that they share online. We collected tweets of 337 IBD patients who openly tweeted about their disease on Twitter and used the Natural Language Understanding (NLU) module by IBM Cloud to apply category classification and keyword extraction to their tweets. To evaluate the results, we proposed a method for sampling the general population of Twitter users and forming a control group. We found statistically significant differences between the thematic segmentations of the patients and those of random Twitter users. We identified keywords that patients frequently use in the contexts of health, fitness, or nutrition, and obtained their sentiment. The results of the research suggest that the personal information shared by IBD patients on Twitter can be used to better understand the disease and how it affects patients’ lives. By leveraging posts describing patients’ daily activities and how they influence their wellbeing, we can derive complementary knowledge about the disease that is based on the wisdom of the crowd.
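The comparison between the patients and the control group can be illustrated with a small worked example. The abstract does not name the statistical test that was used, so the chi-square test of independence below is an assumption, and the category names and tweet counts are invented purely for illustration. The sketch compares how tweets from two groups are distributed over three categories and checks the statistic against the standard critical value for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and category were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical tweet counts per category: [health, food, sports].
# These numbers are NOT from the study; they only illustrate the calculation.
patients = [60, 25, 15]
control = [20, 40, 40]

statistic = chi_square_statistic([patients, control])

# Critical value of the chi-square distribution for
# (2 rows - 1) * (3 columns - 1) = 2 degrees of freedom at alpha = 0.05.
CRITICAL_VALUE_DF2_ALPHA_05 = 5.991
significant = statistic > CRITICAL_VALUE_DF2_ALPHA_05

print(f"chi-square = {statistic:.3f}, significant at 0.05: {significant}")
```

With real data, the rows would hold the per-category tweet counts of the patient group and of the sampled control group, and the critical value would change with the number of categories.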


... In addition to the above, to further support research and development in this field, we present an open-access dataset of 522,886 Tweet IDs of the same number of tweets about the COVID-19 Omicron variant, which have been posted publicly on Twitter since the first detected case of this variant on 24 November 2021. The development of Twitter datasets has been of significant interest to the scientific community, as can be seen from the recent Twitter datasets on the 2020 U.S. Presidential Elections [72], 2022 Russia-Ukraine war [73], climate change [74], natural hazards [75], European Migration Crisis [76], movies [77], toxic behavior amongst adolescents [78], music [79], civil unrest [80], drug safety [81], and Inflammatory Bowel Disease [82]. Since the outbreak of COVID-19, there have been a few works that focused on the development of Twitter datasets. ...
... The development of Twitter datasets has been of significant importance and interest to the scientific community in the areas of Big Data mining, Data Analysis, and Data Science. This is evident from the recent Twitter datasets on 2020 U.S. Presidential Elections [72], 2022 Russia-Ukraine war [73], climate change [74], natural hazards [75], European Migration Crisis [76], movies [77], toxic behavior amongst adolescents [78], music [79], civil unrest [80], drug safety [81], and Inflammatory Bowel Disease [82]. ...
Article
Full-text available
This paper presents the findings of an exploratory study of the Big Data continuously generated on Twitter through the sharing of information, news, views, opinions, ideas, knowledge, feedback, and experiences about the COVID-19 pandemic, with a specific focus on the Omicron variant, which is the globally dominant variant of SARS-CoV-2 at this time. A total of 12,028 tweets about the Omicron variant were studied, and the characteristics analyzed were sentiment, language, source, type, and embedded URLs. The findings of this study are manifold. First, sentiment analysis showed that 50.5% of tweets had a ‘neutral’ emotion. The other emotions (‘bad’, ‘good’, ‘terrible’, and ‘great’) were found in 15.6%, 14.0%, 12.5%, and 7.5% of the tweets, respectively. Second, language interpretation showed that 65.9% of the tweets were posted in English, followed by Spanish or Castilian, French, Italian, Japanese, and other languages, which were found in 10.5%, 5.1%, 3.3%, 2.5%, and <2% of the tweets, respectively. Third, source tracking showed that “Twitter for Android” was associated with 35.2% of tweets, followed by “Twitter Web App”, “Twitter for iPhone”, “Twitter for iPad”, “TweetDeck”, and all other sources, which accounted for 29.2%, 25.8%, 3.8%, 1.6%, and <1% of the tweets, respectively. Fourth, retweets accounted for 60.8% of the tweets, followed by original tweets and replies, which accounted for 19.8% and 19.4% of the tweets, respectively. Fifth, embedded URL analysis showed that the most common domain embedded in the tweets was twitter.com, followed by biorxiv.org, nature.com, wapo.st, nzherald.co.nz, recvprofits.com, science.org, and other domains. Finally, to support research and development in this field, we have developed an open-access Twitter dataset that comprises Tweet IDs of more than 500,000 tweets about the Omicron variant, posted on Twitter since the first detected case of this variant on 24 November 2021.
... In addition to the above, to further support research and development in this field, we present an open-access dataset of 537,702 Tweet IDs of the same number of tweets about the COVID-19 omicron variant, posted publicly on Twitter since the first detected case of this variant on November 24, 2021. The development of Twitter datasets has been of significant interest to the scientific community, as can be seen from the recent Twitter datasets on the 2020 U.S. Presidential Elections [72], 2022 Russia Ukraine war [73], climate change [74], natural hazards [75], European Migration Crisis [76], movies [77], toxic behavior amongst adolescents [78], music [79], civil unrest [80], drug safety [81], and Inflammatory Bowel Disease [82]. Since the outbreak of COVID-19, there have been a few works that focused on the development of Twitter datasets. ...
... The development of Twitter datasets has been of significant importance and interest to the scientific community in the areas of Big Data Mining, Data Analysis, and Data Science. This is evident from the recent Twitter datasets on 2020 U.S. Presidential Elections [72], 2022 Russia Ukraine war [73], climate change [74], natural hazards [75], European Migration Crisis [76], movies [77], toxic behavior amongst adolescents [78], music [79], civil unrest [80], drug safety [81], and Inflammatory Bowel Disease [82]. ...
Preprint
Full-text available
This paper presents the findings of an exploratory study of the Big Data continuously generated on Twitter through the sharing of information, news, views, opinions, ideas, feedback, and experiences about the COVID-19 pandemic, with a specific focus on the Omicron variant, which is the globally dominant variant of SARS-CoV-2 at this time. A total of 12,028 tweets about the Omicron variant were studied, and the characteristics analyzed were sentiment, language, source, type, and embedded URLs. The findings of this study are manifold. First, sentiment analysis showed that 50.5% of tweets had a neutral emotion. The other emotions (bad, good, terrible, and great) were found in 15.6%, 14.0%, 12.5%, and 7.5% of the tweets, respectively. Second, language interpretation showed that 65.9% of the tweets were posted in English, followed by Spanish, French, Italian, and other languages. Third, source tracking showed that Twitter for Android was associated with 35.2% of tweets, followed by Twitter Web App, Twitter for iPhone, Twitter for iPad, and other sources. Fourth, retweets accounted for 60.8% of the tweets, followed by original tweets and replies, which accounted for 19.8% and 19.4% of the tweets, respectively. Fifth, embedded URL analysis showed that the most common domain embedded in the tweets was twitter.com, followed by biorxiv.org, nature.com, and other domains. Finally, to support similar research in this field, we have developed a Twitter dataset that comprises more than 500,000 tweets about the SARS-CoV-2 Omicron variant since the first detected case of this variant on November 24, 2021.
... Enriching the dataset by identifying more patients on Twitter or expanding the search to other social media could improve the classifier's performance and precise classification. In a previous study, we identified 337 IBD patients who actively discussed their disease on Twitter (Stemmer, Parmet, & Ravid, 2021). In future research, we intend to add these patients to our model after determining the values of the classification features by mining the patients' Twitter timelines. ...
... For classification, they used two predictive models. They trained a COVID-relevant classifier using a sample of multilingual tweets developed by [14] and considered them as the positive class. To train the misinformation detection model, they used two publicly available datasets as a positive class. ...
Article
Full-text available
The development and rollout of COVID-19 vaccination around the world offers hope for controlling the pandemic. People turned to social media such as Twitter seeking information or to voice their opinion. Therefore, mining such conversation can provide a rich source of data for different applications related to the COVID-19 vaccine. In this data article, we developed an Arabic Twitter dataset of 1.1 M Arabic posts regarding the COVID-19 vaccine. The dataset was streamed over one year, covering the period from January to December 2021. We considered a set of crawling keywords in the Arabic language related to the conversation about the vaccine. The dataset consists of seven databases that can be analyzed separately or merged for further analysis. The initial analysis depicts the embedded features within the posts, including hashtags, media, and the dynamic of replies and retweets. Further, the textual analysis reveals the most frequent words that can capture the trends of the discussions. The dataset was designed to facilitate research across different fields, such as social network analysis, information retrieval, health informatics, and social science.
... At present, there are about 192 million daily active users on Twitter, and approximately 500 million tweets are posted on Twitter every day [49]. Mining of social media conversations, such as Tweets, to develop datasets has been of significant interest to the scientific community in the areas of Big Data, Data Mining, and Natural Language Processing, as can be seen from these recent works where relevant Tweets were mined to develop Twitter datasets on the 2020 US Presidential Election [50], 2022 Russia-Ukraine war [51], climate change [52], natural hazards [53], European Migration Crisis [54], movies [55], toxic behavior amongst adolescents [56], music [57], civil unrest [58], drug safety [59], and Inflammatory Bowel Disease [60]. ...
Article
Full-text available
The COVID-19 Omicron variant, reported to be the most immune-evasive variant of COVID-19, is resulting in a surge of COVID-19 cases globally. This has caused schools, colleges, and universities in different parts of the world to transition to online learning. As a result, social media platforms such as Twitter are seeing an increase in conversations related to online learning in the form of tweets. Mining such tweets to develop a dataset can serve as a data resource for different applications and use-cases related to the analysis of interest, views, opinions, perspectives, attitudes, and feedback towards online learning during the current surge of COVID-19 cases caused by the Omicron variant. Therefore, this work presents a large-scale, open-access Twitter dataset of conversations about online learning from different parts of the world since the first detected case of the COVID-19 Omicron variant in November 2021. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data management. The paper also briefly outlines some potential applications in the fields of Big Data, Data Mining, Natural Language Processing, and their related disciplines, with a specific focus on online learning during this Omicron wave that may be studied, explored, and investigated by using this dataset.
... Therefore, mining and studying this Big Data of conversations from Twitter has been of significant interest to the research community. In the last few years, there have been several works in the fields of Big Data, Data Mining, and Natural Language Processing related to the development of datasets of Twitter conversations related to different topics, technologies, events, diseases, viruses, etc., such as movies [38], COVID-19 [39], elections [40], toxic behavior amongst adolescents [41], music [42], natural hazards [43], personality traits [44], civil unrest [45], drug safety [46], climate change [47], hate speech [48], migration patterns [49], conspiracy theories [50], and Inflammatory Bowel Disease [51], just to name a few. Recent studies [52][53][54] have shown that sharing such data helps in the advancement of research, improves the quality of innovation, supports better investigation, and helps to avoid redundant efforts. ...
Preprint
Full-text available
The exoskeleton technology has been rapidly advancing in the recent past due to its multitude of applications and diverse use-cases in assisted living, military, healthcare, firefighting, and industry 4.0. The exoskeleton market is projected to increase by multiple times of its current value within the next two years. Therefore, it is crucial to study the degree and trends of user interest, views, opinions, perspectives, attitudes, acceptance, feedback, engagement, buying behavior, and satisfaction, towards exoskeletons, for which the availability of Big Data of conversations about exoskeletons is necessary. The Internet of Everything style of today's living, characterized by people spending more time on the internet than ever before, with a specific focus on social media platforms, holds the potential for the development of such a dataset by the mining of relevant social media conversations. Twitter, one such social media platform, is highly popular amongst all age groups, where the topics found in the conversation paradigms include emerging technologies such as exoskeletons. To address this research challenge, this work makes two scientific contributions to this field. First, it presents an open-access dataset of about 140,000 tweets about exoskeletons that were posted in a 5-year period from May 21, 2017, to May 21, 2022. Second, based on a comprehensive review of the recent works in the fields of Big Data, Natural Language Processing, Information Retrieval, Data Mining, Pattern Recognition, and Artificial Intelligence that may be applied to relevant Twitter data for advancing research, innovation, and discovery in the field of exoskeleton research, a total of 100 Research Questions are presented for researchers to study, analyze, evaluate, ideate, and investigate based on this dataset.
... At present, there are about 192 million daily active users on Twitter, and approximately 500 million tweets are posted on Twitter every day [49]. Mining of social media conversations, for instance, Tweets, to develop datasets has been of significant interest to the scientific community in the areas of Big Data, Data Mining, and Natural Language Processing, as can be seen from these recent works where relevant Tweets were mined to develop Twitter datasets on 2020 US Presidential Elections [50], 2022 Russia Ukraine war [51], climate change [52], natural hazards [53], European Migration Crisis [54], movies [55], toxic behavior amongst adolescents [56], music [57], civil unrest [58], drug safety [59], and Inflammatory Bowel Disease [60]. ...
Preprint
Full-text available
The COVID-19 Omicron variant, reported to be the most immune evasive variant of COVID-19, is resulting in a surge of COVID-19 cases globally. This has caused schools, colleges, and universities in different parts of the world to transition to online learning. As a result, social media platforms such as Twitter are seeing an increase in conversations related to online learning in the form of tweets. Mining such tweets to develop a dataset can serve as a data resource for different applications and use-cases related to the analysis of interest, views, opinions, perspectives, attitudes, and feedback towards online learning during the current surge of COVID-19 cases caused by the Omicron variant. Therefore, this work presents a large-scale open-access Twitter dataset of conversations about online learning from different parts of the world since the first detected case of the COVID-19 Omicron variant in November 2021. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data management. The paper also briefly outlines some potential applications in the fields of Big Data, Data Mining, Natural Language Processing, and their related disciplines, with a specific focus on online learning during this Omicron wave that may be studied, explored, and investigated by using this dataset.
... Mining of social media conversations, for instance, Tweets, to develop datasets has been of significant interest to the scientific community in the last few years, as can be seen from these recent works where relevant Tweets were mined to develop Twitter datasets on COVID-19 [31,32], 2022 war between Ukraine and Russia [33,34], European Migration Crisis [35], Inflammatory Bowel Disease [36], toxic behavior amongst adolescents [37], music [38], civil unrest [39], drug safety [40], and movies [41]. Such Twitter datasets serve as a data resource for a wide range of applications and use-case scenarios related to studying the associated conversation paradigms as well as for investigating the patterns of the underlying information-seeking and sharing behavior. ...
... The Internet of Everything lifestyle of today's living is centered around people engaging in online conversations via the internet, specifically social media platforms, and spending a lot more time on the internet than ever before [25]. As a result, there has been a tremendous increase in the use of social media platforms in the recent past ... [35], Inflammatory Bowel Disease [36], toxic behavior amongst adolescents [37], music [38], civil unrest [39], drug safety [40], and movies [41]. Such Twitter datasets serve as a data resource for a wide range of applications and use-case scenarios related to studying the associated conversation paradigms as well as for investigating the patterns of the underlying information-seeking and sharing behavior. ...
Preprint
The world is currently facing an outbreak of the monkeypox virus, and confirmed cases have been reported from 28 countries. Following a recent “emergency meeting”, the World Health Organization is considering whether the outbreak should be assessed as a “potential public health emergency of international concern” or PHEIC, as was done for the COVID-19 and Ebola outbreaks in the past. During this time, people from all over the world are using social media platforms, such as Twitter, for information seeking and sharing related to the outbreak, as well as for familiarizing themselves with the guidelines and protocols that are being recommended by various policy-making bodies to reduce the spread of the virus. This is resulting in the generation of tremendous amounts of Big Data related to such paradigms of social media behavior. Mining this Big Data and compiling it in the form of a dataset can serve as a data resource for a wide range of use-cases and applications such as analysis of public opinions, interests, views, perspectives, attitudes, and sentiment towards this outbreak. Therefore, this work presents MonkeyPox2022Tweets, an open-access dataset of Tweets related to the 2022 monkeypox outbreak that were posted on Twitter since the first detected case of this outbreak on May 7, 2022. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data management.
... The Omicron variant currently accounts for 86% of the COVID-19 cases worldwide [8], and some ... there are about 192 million daily active users on Twitter, and approximately 500 million tweets are posted on Twitter every day [49]. Mining of social media conversations, for instance, Tweets, to develop datasets has been of significant interest to the scientific community in the areas of Big Data, Data Mining, and Natural Language Processing, as can be seen from these recent works where relevant Tweets were mined to develop Twitter datasets on 2020 U.S. Presidential Elections [50], 2022 Russia Ukraine war [51], climate change [52], natural hazards [53], European Migration Crisis [54], movies [55], toxic behavior amongst adolescents [56], music [57], civil unrest [58], drug safety [59], and Inflammatory Bowel Disease [60]. ...
Preprint
The COVID-19 Omicron variant, reported to be the most immune evasive variant of COVID-19, is resulting in a surge of COVID-19 cases globally. This has caused schools, colleges, and universities in different parts of the world to transition to online learning. As a result, social media platforms such as Twitter are seeing an increase in conversations related to online learning. Mining such conversations, such as Tweets, to develop a dataset can serve as a data resource for interdisciplinary research related to the analysis of interest, views, opinions, perspectives, attitudes, and feedback towards online learning during the current surge of COVID-19 cases caused by the Omicron variant. Therefore, this work presents a large-scale public Twitter dataset of conversations about online learning since the first detected case of the COVID-19 Omicron variant in November 2021. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data management. The paper also briefly outlines some potential applications in the fields of Big Data, Data Mining, Natural Language Processing, and their related disciplines, with a specific focus on online learning during this Omicron wave that may be studied, explored, and investigated by using this dataset.
Article
The exoskeleton technology has been rapidly advancing in the recent past due to its multitude of applications and diverse use cases in assisted living, military, healthcare, firefighting, and industry 4.0. The exoskeleton market is projected to increase by multiple times its current value within the next two years. Therefore, it is crucial to study the degree and trends of user interest, views, opinions, perspectives, attitudes, acceptance, feedback, engagement, buying behavior, and satisfaction, towards exoskeletons, for which the availability of Big Data of conversations about exoskeletons is necessary. The Internet of Everything style of today’s living, characterized by people spending more time on the internet than ever before, with a specific focus on social media platforms, holds the potential for the development of such a dataset by the mining of relevant social media conversations. Twitter, one such social media platform, is highly popular amongst all age groups, where the topics found in the conversation paradigms include emerging technologies such as exoskeletons. To address this research challenge, this work makes two scientific contributions to this field. First, it presents an open-access dataset of about 140,000 Tweets about exoskeletons that were posted in a 5-year period from 21 May 2017 to 21 May 2022. Second, based on a comprehensive review of the recent works in the fields of Big Data, Natural Language Processing, Information Retrieval, Data Mining, Pattern Recognition, and Artificial Intelligence that may be applied to relevant Twitter data for advancing research, innovation, and discovery in the field of exoskeleton research, a total of 100 Research Questions are presented for researchers to study, analyze, evaluate, ideate, and investigate based on this dataset.
Article
Background: Patients use social media as an alternative information source, where they share information and provide social support. Although large amounts of health-related data are posted on Twitter and other social networking platforms each day, research using social media data to understand chronic conditions and patients’ lifestyles is limited. Objective: In this study, we contributed to closing this gap by providing a framework for identifying patients with inflammatory bowel disease (IBD) on Twitter and learning from their personal experiences. We enabled the analysis of patients’ tweets by building a classifier of Twitter users that distinguishes patients from other entities. This study aimed to uncover the potential of using Twitter data to promote the well-being of patients with IBD by relying on the wisdom of the crowd to identify healthy lifestyles. We sought to leverage posts describing patients’ daily activities and their influence on their well-being to characterize lifestyle-related treatments. Methods: In the first stage of the study, a machine learning method combining social network analysis and natural language processing was used to automatically classify users as patients or not. We considered 3 types of features: the user’s behavior on Twitter, the content of the user’s tweets, and the social structure of the user’s network. We compared the performances of several classification algorithms within 2 classification approaches. One classified each tweet and deduced the user’s class from their tweet-level classification. The other aggregated tweet-level features to user-level features and classified the users themselves. Different classification algorithms were examined and compared using 4 measures: precision, recall, F1 score, and the area under the receiver operating characteristic curve. In the second stage, a classifier from the first stage was used to collect patients' tweets describing the different lifestyles patients adopt to deal with their disease. Using IBM Watson Service for entity sentiment analysis, we calculated the average sentiment of 420 lifestyle-related words that patients with IBD use when describing their daily routine. Results: Both classification approaches showed promising results. Although the precision rates were slightly higher for the tweet-level approach, the recall and area under the receiver operating characteristic curve of the user-level approach were significantly better. Sentiment analysis of tweets written by patients with IBD identified frequently mentioned lifestyles and their influence on patients’ well-being. The findings reinforced what is known about suitable nutrition for IBD as several foods known to cause inflammation were pointed out in negative sentiment, whereas relaxing activities and anti-inflammatory foods surfaced in a positive context. Conclusions: This study suggests a pipeline for identifying patients with IBD on Twitter and collecting their tweets to analyze the experimental knowledge they share. These methods can be adapted to other diseases and enhance medical research on chronic conditions.
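The two classification approaches described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the user's class is deduced from tweet-level predictions, so a majority vote is assumed here, and averaging is assumed as the feature aggregation. All function names are hypothetical, and the area under the ROC curve is left out for brevity.

```python
from collections import Counter
from statistics import mean


def user_label_from_tweets(tweet_labels):
    """Tweet-level approach (assumed rule): majority vote over per-tweet
    predictions; a tie counts as 'not a patient'."""
    counts = Counter(tweet_labels)
    return counts[True] > counts[False]


def user_features_from_tweets(tweet_features):
    """User-level approach (assumed rule): average each tweet-level
    feature into a single feature vector for the user."""
    return [mean(column) for column in zip(*tweet_features)]


def precision_recall_f1(y_true, y_pred):
    """Three of the four measures named in the abstract."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

In the tweet-level path a classifier runs once per tweet and `user_label_from_tweets` combines its outputs; in the user-level path `user_features_from_tweets` runs first and the classifier sees one vector per user.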
Article
We analyze data from Twitter to uncover early-warning signals of COVID-19 outbreaks in Europe in the winter season 2019–2020, before the first public announcements of local sources of infection were made. We show evidence that unexpected levels of concerns about cases of pneumonia were raised across a number of European countries. Whistleblowing came primarily from the geographical regions that eventually turned out to be the key breeding grounds for infections. These findings point to the urgency of setting up an integrated digital surveillance system in which social media can help geo-localize chains of contagion that would otherwise proliferate almost completely undetected.
Article
In December 2019, a cluster of severe respiratory infections was reported in Wuhan, Hubei Province, China. Initially, it appeared that some patients had a history of attending or working in the wholesale fish and seafood market[1]. The market was immediately shut down on January 1, 2020, and local environmental health and disinfection measures were fully implemented. On January 9, 2020, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), formerly known as 2019-nCoV, was declared the causative agent in 15 of the 59 hospitalized patients, causing great concern: this new coronavirus has 70% genetic association with SARS and is a subspecies of Sarbecovirus. The virus was temporarily named the 2019-nCoV virus[2], and the Coronavirus Study Group has nominated the virus as SARS-CoV-2[3]. On January 11, 2020, the first deaths from the virus were reported in China, and by January 20, 2020, more positive cases had been reported from other countries such as Thailand, Japan, South Korea, and the United States of America; individual-to-health-care transmission further complicated the situation[4]. Coronaviruses are zoonotic, meaning they are transmitted between animals and people, but the ways in which they are transmitted, their animal reservoirs, prophylaxis, and precise clinical manifestations require more investigation. There is currently no vaccine or appropriate treatment for COVID-19, so a high index of clinical suspicion and inquiring about the history of travel and contact from patients with fever and respiratory symptoms play a critical role in the prevention and control of the disease[5]. On a daily basis, a large number of websites and online social media produce a large amount of data in a variety of fields such as technology, medicine, history, political and social news, arts and other fields. Analyzing and classifying these data produces knowledge, and this work has attracted the attention of many researchers[6].
Article
Background: Nowadays, the use of social media is part of daily life, with more and more people, including governments and health organizations, using at least one platform regularly. Social media enables users to interact among large groups of people that share the same interests and suffer the same afflictions. Notably, these channels promote the ability to find and share information about health and medical conditions. Objective: This study aimed to characterize the bowel disease (BD) community on Twitter, in particular how patients understand, discuss, feel, and react to the condition. The main questions were as follows: Which are the main communities and most influential users? Where are the main content providers from? What are the key biomedical and scientific topics under discussion? How are topics interrelated in patient communications? How do external events influence user activity? What kind of external sources of information are being promoted? Methods: To answer these questions, a dataset of tweets containing terms related to BD conditions was collected from February to August 2018, accounting for a total of 24,634 tweets from 13,295 different users. Tweet preprocessing entailed the extraction of textual contents, hyperlinks, hashtags, time, location, and user information. Missing and incomplete information about the user profiles was completed using different analysis techniques. Semantic tweet topic analysis was supported by a lexicon-based entity recognizer. Furthermore, sentiment analysis enabled a closer look into the opinions expressed in the tweets, namely, gaining a deeper understanding of patients' feelings and experiences. Results: Health organizations received most of the communication, whereas BD patients and experts in bowel conditions and nutrition were among those tweeting the most. In general, the BD community was mainly discussing symptoms, BD-related diseases, and diet-based treatments. Diarrhea and constipation were the most commonly mentioned symptoms, and cancer, anxiety disorder, depression, and chronic inflammations were frequently part of BD-related tweets. Most patient tweets discussed the bad side of BD conditions and other related conditions, namely, depression, diarrhea, and fibromyalgia. In turn, gluten-free diets and probiotic supplements were often mentioned in patient tweets expressing positive emotions. However, for the most part, tweets containing mentions of foods and diets showed a similar distribution of negative and positive sentiments because the effects of certain food components (eg, fiber, iron, and magnesium) were perceived differently, depending on the state of the disease and other personal conditions of the patients. The benefits of medical cannabis for the treatment of different chronic diseases were also highlighted. Conclusions: This study shows that Twitter is becoming an influential space for conversation about bowel conditions, namely, patient opinions about associated symptoms and treatments. Further qualitative and quantitative content analyses therefore hold the potential to support decision making among health-related stakeholders, including the planning of awareness campaigns.
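As a rough illustration of what a lexicon-based entity recognizer does, the sketch below matches tweet text against a small term-to-category dictionary. The lexicon entries, category names, and function name are hypothetical; the study's actual lexicon and matching rules are not described in the abstract.

```python
import re

# Hypothetical mini-lexicon for illustration only; not the study's lexicon.
LEXICON = {
    "diarrhea": "symptom",
    "constipation": "symptom",
    "gluten-free diet": "diet",
    "probiotic": "treatment",
}


def recognize_entities(text, lexicon=LEXICON):
    """Return (term, category) pairs for lexicon terms that occur in the
    text as whole words or phrases, case-insensitively."""
    lowered = text.lower()
    found = []
    for term, category in lexicon.items():
        # Lookarounds stop a term from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.append((term, category))
    return found
```

Exact matching of this kind misses inflected forms such as plurals, so a practical recognizer would usually add stemming or list the variants in the lexicon.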
Article
Background: Data concerning patients originate from a variety of sources on social media. Objective: The aim of this study was to show how methodologies borrowed from different areas, including computer science, econometrics, statistics, data mining, and sociology, may be used to analyze Facebook data to investigate the patients' perspectives on a given medical prescription. Methods: To shed light on patients' behavior and concerns, we focused on Crohn's disease, a chronic inflammatory bowel disease, and the specific therapy with the biological drug Infliximab. To gain information from the basin of big data, we analyzed Facebook posts in the time frame from October 2011 to August 2015. We selected posts from patients affected by Crohn's disease who were being treated or had previously been treated with the monoclonal antibody drug Infliximab. The selected posts underwent further characterization and sentiment analysis. Finally, an ethnographic review was carried out by experts from different scientific research fields (eg, computer science vs gastroenterology) and by a software system running a sentiment analysis tool. The patient feeling toward the Infliximab treatment was classified as positive, neutral, or negative, and the results from the computer science expert, the gastroenterologist, and the software tool were compared using the square weighted Cohen's kappa coefficient method. Results: The first automatic selection process returned 56,000 Facebook posts, 261 of which exhibited a patient opinion concerning Infliximab. The ethnographic analysis of these 261 selected posts gave similar results, with an interrater agreement between the computer science and gastroenterology experts amounting to 87.3% (228/261), a substantial agreement according to the square weighted Cohen's kappa coefficient method (w2K=0.6470). Positive, neutral, and negative feelings were attributed to 36%, 27%, and 37% of posts by the computer science expert and to 38%, 30%, and 32% by the gastroenterologist, respectively. Only a slight agreement was found between the experts' opinions and the software tool. Conclusions: We show that data posted on Facebook by Crohn's disease patients are a useful dataset for understanding the patient's perspective on the specific treatment with Infliximab. The genuine, nonmedically influenced patients' opinion obtained from Facebook pages can be easily reviewed by experts from different research backgrounds, with a substantial agreement on the classification of patients' sentiment. The described method allows fast collection of large amounts of data, which can be easily analyzed to gain insight into the patients' perspective on a specific medical therapy.
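For readers unfamiliar with the agreement statistic named above, here is a minimal sketch of a square (quadratic) weighted Cohen's kappa for two raters over ordered categories. It uses the standard textbook definition with disagreement weights (i - j)^2 / (k - 1)^2; the function name and the sample labels are illustrative assumptions, and none of the study's data is reproduced.

```python
def quadratic_weighted_kappa(rater_a, rater_b, categories):
    """Quadratic weighted Cohen's kappa for two equally long rating lists.

    `categories` lists the ordered labels, e.g. negative < neutral < positive,
    and must contain at least two labels.
    """
    index = {label: i for i, label in enumerate(categories)}
    k = len(categories)
    n = len(rater_a)

    # Observed proportion of items in each (rater A, rater B) cell.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # 0 on the diagonal
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]

    if weighted_expected == 0:
        # Both raters used one and the same category throughout.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

Because the weights grow with the squared distance between categories, a positive-versus-negative disagreement is penalized four times as heavily as a positive-versus-neutral one on a three-point scale.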
Article
The present study examines the online discussion on Twitter regarding the stigmatization of HIV-positive women in Athens in May 2012. The method of critical discourse analysis is applied on the anti-sovereign discourses that were articulated on Twitter, while the incident was taking place. The virtual countersphere is analyzed with regards to its political implications, such as the reproduction of the unfree sovereign discourse and the mobilization towards political action.
Article
People who undergo ostomy surgery are often worried about being stigmatized. Ostomates explain that they appreciate contact with others who can relate to their experiences and are concerned with where to get advice and support about their ostomies. Internet communities are one outlet to challenge stigma and find support, and the web site Uncover Ostomy is one such community. The web site is the work of Jessica Grossman, a college student who challenges ostomy stigma through provocative photos. Through the associated Facebook fan page, users upload their own photos uncovering their ostomies. This project examined uploaded photos to see how people challenge stigma using new media technology. Additionally, the project examined comments left by Facebook users to understand how the ostomy community reacted to these photos. Implications of such spaces for users to challenge a stigmatizing condition are discussed.
Article
Social media are being increasingly used for health promotion, yet the landscape of users, messages and interactions in such fora is poorly understood. Studies of social media and diabetes have focused mostly on patients, or public agencies addressing it, but have not looked broadly at all of the participants or the diversity of content they contribute. We study Twitter conversations about diabetes through the systematic analysis of 2.5 million tweets collected over 8 months and the interactions between their authors. We address three questions. (1) What themes arise in these tweets? (2) Who are the most influential users? (3) Which type of users contribute to which themes? We answer these questions using a mixed methods approach, integrating techniques from anthropology, network science and information retrieval such as thematic coding, temporal network analysis and community and topic detection. Diabetes-related tweets fall within broad thematic groups: health information, news, social interaction and commercial. At the same time, humorous messages and references to popular culture appear consistently, more than any other type of tweet. We classify authors according to their temporal ‘hub’ and ‘authority’ scores. Whereas the hub landscape is diffuse and fluid over time, top authorities are highly persistent across time and comprise bloggers, advocacy groups and NGOs related to diabetes, as well as for-profit entities without specific diabetes expertise. Top authorities fall into seven interest communities as derived from their Twitter follower network. Our findings have implications for public health professionals and policy makers who seek to use social media as an engagement tool and to inform policy design.
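The 'hub' and 'authority' scores mentioned above come from the HITS family of link-analysis methods. The sketch below is a plain, static HITS power iteration, not the temporal variant the study uses. The edge direction (for example, from a user who retweets to the user being retweeted) and the function name are assumptions made for illustration.

```python
def hits(edges, iterations=50):
    """Static HITS scores for a directed graph given as (source, target) pairs.

    Returns two dicts, (hub, authority), each normalized to unit Euclidean
    length. Good hubs point to good authorities, and good authorities are
    pointed to by good hubs.
    """
    edges = list(edges)
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    authority = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority update: sum of hub scores of nodes linking in.
        authority = {node: 0.0 for node in nodes}
        for source, target in edges:
            authority[target] += hub[source]
        norm = sum(v * v for v in authority.values()) ** 0.5 or 1.0
        authority = {node: v / norm for node, v in authority.items()}

        # Hub update: sum of authority scores of nodes linked to.
        hub = {node: 0.0 for node in nodes}
        for source, target in edges:
            hub[source] += authority[target]
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {node: v / norm for node, v in hub.items()}

    return hub, authority
```

A temporal analysis along the lines of the abstract would need scores per time window, for instance by running such a computation on the interactions of each window separately; that design is an assumption here, not a description of the study's method.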
Conference Paper
Anonymity, speed and big data are all ingredients at the basis of an intense online medical information prosumerism supported by search engines, websites, forums and social networks. Fostering reliable medical ecosystems requires exploiting such data, controlling quality and delineating user behaviors. This need emerges strongly with chronic patients: they are expected to use the Internet over long periods for the same health topics; nonetheless, only a limited amount of research has investigated their online resources and behavior. We uncover the role of online social networks for a growing community of chronic patients: Crohn's disease patients. Our contribution is twofold: (a) we characterize the data exchanged by Crohn's patients, and (b) we analyze how they deal with given topics of medical interest. In particular, of great medical relevance is the result that Infliximab is the treatment that most influences the Crohn's patient community sentiments. We are confident our analysis of the health heritage exchanged online can help improve Crohn's patients' experience, exceeding the traditional practices that typically concentrate on individuals rather than on communities.
Article
Background: Biomedical research has traditionally been conducted via surveys and the analysis of medical records. However, these resources are limited in their content, such that non-traditional domains (eg, online forums and social media) have an opportunity to supplement the view of an individual’s health. Objective: The objective of this study was to develop a scalable framework to detect personal health status mentions on Twitter and assess the extent to which such information is disclosed. Methods: We collected more than 250 million tweets via the Twitter streaming API over a 2-month period in 2014. The corpus was filtered down to approximately 250,000 tweets, stratified across 34 high-impact health issues, based on guidance from the Medical Expenditure Panel Survey. We created a labeled corpus of several thousand tweets via a survey, administered over Amazon Mechanical Turk, that documents when terms correspond to mentions of personal health issues or an alternative (eg, a metaphor). We engineered a scalable classifier for personal health mentions via feature selection and assessed its potential over the health issues. We further investigated the utility of the tweets by determining the extent to which Twitter users disclose personal health status. Results: Our investigation yielded several notable findings. First, we find that tweets from a small subset of the health issues can train a scalable classifier to detect health mentions. Specifically, training on 2000 tweets from four health issues (cancer, depression, hypertension, and leukemia) yielded a classifier with precision of 0.77 on all 34 health issues. Second, Twitter users disclosed personal health status for all health issues. Notably, personal health status was disclosed over 50% of the time for 11 out of 34 (33%) investigated health issues. Third, the disclosure rate was dependent on the health issue in a statistically significant manner (P
Article
Incidence of pediatric inflammatory bowel disease (IBD) is rising. Adult gastroenterologists are seeing increasing numbers of young adults with IBD, a subpopulation with unique needs and challenges that can impair their readiness to thrive in an adult healthcare system. Most adult gastroenterologists might not have the training or resources to address these needs. “Emerging adulthood” is a useful developmental lens through which this group can be studied. With complex disease phenotype and specific concerns of medication side effects and reproductive health, compounded by challenges of geographical and social flux and lack of adequate health insurance, emerging adults with IBD (EAI) are at risk of disrupted care with lack of continuity. Lessons learned from structured healthcare transition process from pediatric to adult services can be applied towards challenges in ongoing care of this population in the adult healthcare system. This paper provides an overview of the challenges in caring for the post transition EAI from the perspective of adult gastroenterologists and offers a checklist of provider and patient skills that enable effective care. This paper discusses the system-based challenges in care provision and search for meaningful patient-oriented outcomes and presents a conceptual model of determinants of continuity of care in this unique population.
Article
In this paper we compare the social network structure of people talking about Crohn's disease, Cystic Fibrosis, and Type 1 diabetes on Facebook and Twitter. We find that the Crohn's community's contributors are most emotional on Facebook and Twitter and most negative on Twitter, while the T1D community's communication network structure is most cohesive.
Article
Background: Twitter is an interactive, real-time media that could prove useful in health care. Tweets from cancer patients could offer insight into the needs of cancer patients. Objective: The objective of this study was to understand cancer patients' social media usage and gain insight into patient needs. Methods: A search was conducted of every publicly available user profile on Twitter in Japan for references to the following: breast cancer, leukemia, colon cancer, rectal cancer, colorectal cancer, uterine cancer, cervical cancer, stomach cancer, lung cancer, and ovarian cancer. We then used an application programming interface and a data mining method to conduct a detailed analysis of the tweets from cancer patients. Results: Twitter user profiles included references to breast cancer (n=313), leukemia (n=158), uterine or cervical cancer (n=134), lung cancer (n=87), colon cancer (n=64), and stomach cancer (n=44). A co-occurrence network is seen for all of these cancers, and each cancer has a unique network conformation. Keywords included words about diagnosis, symptoms, and treatments for almost all cancers. Words related to social activities were extracted for breast cancer. Words related to vaccination and support from public insurance were extracted for uterine or cervical cancer. Conclusions: This study demonstrates that cancer patients share information about their underlying disease, including diagnosis, symptoms, and treatments, via Twitter. This information could prove useful to health care providers.
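A co-occurrence network of the kind mentioned above can be built by counting how often two keywords appear in the same tweet; the counts then serve as edge weights. The sketch below assumes tweets have already been tokenized into keyword lists (for Japanese text this would need a morphological analyzer, which is not shown), and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(token_lists):
    """Count, for every unordered keyword pair, the number of tweets in
    which both keywords appear. Pairs are stored in sorted order."""
    counts = Counter()
    for tokens in token_lists:
        # A set ensures a repeated word is counted once per tweet.
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return counts
```

For example, `cooccurrence_counts(tweets).most_common(20)` would list the twenty strongest edges, which is often enough to see the overall shape of the network for one condition.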
Article
The article presents advertising research on user-generated content in public service Internet advertising presented on the Internet video sharing Web site YouTube. The effectiveness of such public service advertising is compared between user-generated advertising and advertising produced by persons with expertise in the public service or issue being advocated. It was found that advertising produced by a person perceived as a peer by the consumer was more effective in creating positive attitudes towards the advertising.
Article
Chronic pain is a pervasive and expensive public health problem affecting roughly one-third of the American population. The inability of language to accurately convey pain expressions combined with the social stigmas associated with discussing pain persuade many sufferers to remain silent about their pain. Gender politics and fear of professional repercussions further encourage silence. This article explores the need for a safe and secure place for chronic pain sufferers to talk of their pain experiences. The extent to which digital communication technology can fulfill this need is examined. This descriptive study examines the use of one online chronic pain management workshop for its ability to create an engaged community of choice. Workshop admittance was based on participants having a qualifying chronic pain condition. A thematic discourse analysis is conducted of all entries chronic pain participants posted. In addition to goal setting, participants discuss the ways in which pain affects them on a daily basis. Two themes emerge: validation and encouragement. This study suggests that chronic pain users need a discursive space to legitimate their chronic pain identity. It confirms that online websites and virtual audiences facilitate disclosure and allow for authentic communication. The benefits of computer-mediated discussion as well as its limitations are examined.
Article
Background: Patients increasingly turn to the Internet for information on medical conditions, including clinical news and treatment options. In recent years, an online patient community has arisen alongside the rapidly expanding world of social media, or “Web 2.0.” Twitter provides real-time dissemination of news, information, personal accounts and other details via a highly interactive form of social media, and has become an important online tool for patients. This medium is now considered to play an important role in the modern social community of online, “wired” cancer patients. Results: Fifty-one highly influential “power accounts” belonging to cancer patients were extracted from a dataset of 731 Twitter accounts with cancer terminology in their profiles. In accordance with previously established methodology, “power accounts” were defined as those Twitter accounts with 500 or more followers. We extracted data on the cancer patient (female) with the most followers to study the specific relationships that existed between the user and her followers, and found that the majority of the examined tweets focused on greetings, treatment discussions, and other instances of psychological support. These findings went against our hypothesis that cancer patients’ tweets would be centered on the dissemination of medical information and similar “newsy” details. Conclusions: At present, there exists a rapidly evolving network of cancer patients engaged in information exchange via Twitter. This network is valuable in the sharing of psychological support among the cancer community.
Article
To undertake a metasynthesis of qualitative studies to understand the health and social needs of people living with inflammatory bowel disease (IBD). A systematic search strategy identified qualitative studies exploring the phenomenon of living with inflammatory bowel disease. Databases included MEDLINE, PsychInfo, EMBASE, CINAHL and the British Nursing Index via the OVID platform. Qualitative search filters were adapted from Hedges database (http://www.urmc.rochester.edu/hslt/miner/digital_library/tip_sheets/Cinahl_eb_filters.pdf). Qualitative empirical studies exploring the health and social needs of people living with inflammatory bowel disease were selected. Study eligibility and data extraction were independently completed using the Critical Appraisal Skills Programme for qualitative studies. The studies were analysed and synthesised using metasynthesis methodology. The themes from the studies allowed for common translations into a new interpretation of the impact of living with inflammatory bowel disease. Of 1395 studies, six published studies and one unpublished thesis fulfilled the inclusion criteria. First iteration of synthesis identified 16 themes, 2nd iteration synthesised these into three main 2nd order constructs: "detained by the disease"; "living in a world of disease" and "wrestling with life". "Detained by the disease" is the fear of incontinence, the behaviour the patients display due to the fear, and the impact this has on the individual, such as social isolation and missing out on life events. All of these serve to "pull" the patient back from normal living. "Living in a world of disease" is the long term effects of living with a long term condition and the fear of these effects. "Wrestling with life" is the continued fight to thrive, the "push" to continue normal living. The metasynthesis provides a comprehensive representation of living with IBD. The unmistakeable burden of incontinence is exposed and its ongoing effects are demonstrated. 
The combined overall impact of living with IBD is the tension these patients live with: "Pushed and pulled: a compromised life". People living with IBD experience a constant conflict throughout their lives: they push to be normal, but IBD pulls them back. The impact of the fear of incontinence, and the resulting behaviour of the individual, requires further qualitative enquiry.
Article
To understand the impact of Crohn's disease (CD) on various aspects of daily life from the perspective of patients living with CD. Awareness of the disease and biologic therapies, patient satisfaction and adherence, and physician (provider) relationships were also assessed. CD is a chronic, inflammatory, autoimmune disorder of the gastrointestinal tract that substantially impacts patients' physical and emotional well-being. For patients eligible for biologic therapy, anti-tumor necrosis factor agents represent an important addition to the available therapies for CD. The study sample included biologic-naïve and biologic-experienced patients who had self-reported moderate to severe CD, were under the care of a specialist, and agreed to film a video diary and participate in a focus group. Data from the videos and group interviews were collected from May to June of 2009 and summarized qualitatively by grouping similar answers and quotations. Of the 44 participants who submitted video diaries, 23 were biologic-experienced and 21 were biologic-naïve. Participants stated that CD caused fear and embarrassment, that they were reluctant to share the full impact of CD with family and providers, and that they relied on their provider for treatment decisions. Many participants accepted a new state of normalcy if their current medication helped their most bothersome symptoms without providing sustained remission. Participants receiving biologic therapy generally were more informed, more satisfied, and more likely to adhere to treatment regimens. Participants' responses suggest a need for more patient education and more collaborative relationships between patients and providers (physicians) regarding treatment decisions.
Conference Paper
As ubiquitous computing becomes increasingly mobile and social, personal information sharing will likely increase in frequency, the variety of friends to share with, and range of information that can be shared. Past work has identified that whom you share with is important for choosing whether or not to share, but little work has explored which features of interpersonal relationships influence sharing. We present the results of a study of 42 participants, who self-report aspects of their relationships with 70 of their friends, including frequency of collocation and communication, closeness, and social group. Participants rated their willingness to share in 21 different scenarios based on information a UbiComp system could provide. Our findings show that (a) self-reported closeness is the strongest indicator of willingness to share, (b) individuals are more likely to share in scenarios with common information (e.g. we are within one mile of each other) than other kinds of scenarios (e.g. my location wherever I am), and (c) frequency of communication predicts both closeness and willingness to share better than frequency of collocation.
Article
On Twitter, people answer the question, "What are you doing right now?" in no more than 140 characters. We investigated the content of Twitter posts meeting search criteria relating to dental pain. A set of 1000 tweets was randomly selected from 4859 tweets over 7 non-consecutive days. The content was coded using pre-established, non-mutually-exclusive categories, including the experience of dental pain, actions taken or contemplated in response to a toothache, impact on daily life, and advice sought from the Twitter community. After excluding ambiguous tweets, spam, and repeat users, we analyzed 772 tweets and calculated frequencies. Of the sample of 772 tweets, 83% (n = 640) were primarily categorized as a general statement of dental pain, 22% (n = 170) as an action taken or contemplated, and 15% (n = 112) as describing an impact on daily activities. Among the actions taken or contemplated, 44% (n = 74) reported seeing a dentist, 43% (n = 73) took an analgesic or antibiotic medication, and 14% (n = 24) actively sought advice from the Twitter community. Twitter users extensively share health information relating to dental pain, including actions taken to relieve pain and the impact of pain. This new medium may provide an opportunity for dental professionals to disseminate health information.
Preprint
BACKGROUND Social media serve as an alternate information source for patients, who use them to share information and provide social support. Though large amounts of health-related data are being posted on Twitter and other social networking platforms each day, research using social media data for understanding chronic conditions and patients' lifestyles is still lacking.
OBJECTIVE In this research we contribute to closing this gap by providing a framework for identifying patients with Inflammatory Bowel Disease (IBD) on Twitter and learning from their personal experience. We enable the analysis of patients' tweets by building a classifier of Twitter users that distinguishes patients from other entities. The research aims to assess the feasibility of using social media data to promote chronically ill patients' wellbeing, by relying on the wisdom of the crowd for identifying healthy lifestyles. We seek to leverage posts describing patients' daily activities and the influence on their wellbeing for characterizing different treatments and understanding what works for whom.
METHODS In the first stage of the research, a machine learning method combining both social network analysis and natural language processing was used to automatically classify users as patients or non-patients. Three types of features were considered: (1) the user's behavior on Twitter, (2) the content of the user's tweets, and (3) the social structure of the user's network. Different classification algorithms were examined and compared using two measures (F1-score and precision) over 10-fold cross-validation. In the second stage of the research, the obtained classification methods were used to collect tweets of patients, in which they refer to the different lifestyle changes they endure in order to deal with their disease. Using IBM Watson Service for entity sentiment analysis, we calculated the average sentiment of 420 lifestyle-related words that IBD patients use when describing their daily routine.
RESULTS The best classification results (F1-score 0.808 and precision 0.809) for identifying IBD patients among Twitter users were achieved by a multiple-instance learning approach, which constitutes the novelty of this research. The sentiment analysis of tweets written by IBD patients identified frequently mentioned lifestyles and their influence on patients' wellbeing. The findings reinforced what is known about suitable nutrition for IBD, and several foods that are known to cause inflammation were highlighted as words with negative sentiment.
CONCLUSIONS Patients everywhere use social media to share health and treatment information, learn from each other's experiences, and provide social support. Mining these informative conversations may shed some light on patients' ways of life and support chronic conditions research.
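The two evaluation measures named in the abstract above (precision and F1-score, averaged over 10-fold cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the unshuffled fold assignment, and the assumption that held-out predictions are already available as 0/1 labels are all hypothetical choices made only for illustration.

```python
from typing import List, Sequence, Tuple


def precision_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, F1) for the positive class, encoded as label 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, f1


def k_fold_indices(n: int, k: int = 10) -> List[List[int]]:
    """Assign indices 0..n-1 to k folds round-robin (assumption: no shuffling)."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_validated_scores(
    y_true: Sequence[int], y_pred: Sequence[int], k: int = 10
) -> Tuple[float, float]:
    """Average precision and F1 over k folds of held-out predictions."""
    scores = [
        precision_f1([y_true[i] for i in fold], [y_pred[i] for i in fold])
        for fold in k_fold_indices(len(y_true), k)
    ]
    mean_precision = sum(p for p, _ in scores) / k
    mean_f1 = sum(f for _, f in scores) / k
    return mean_precision, mean_f1
```

The sketch covers only the scoring step; training a classifier on the remaining folds, as a full cross-validation would, is left out. A fold with no positive predictions scores 0.0 here, which is one common convention among several.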
Article
Background: Content published on social media has an impact on individuals and on their decision making. Knowing the sentiment toward diabetes is fundamental to understanding the impact that such information could have on people affected with this health condition and their family members. The objective of this study is to analyze the sentiment expressed in messages on diabetes posted on Twitter. Method: Tweets including one of the terms "diabetes," "t1d," and/or "t2d" were extracted for one week using the Twitter standard API. Only the text message and the number of followers of the users were extracted. The sentiment analysis was performed by using SentiStrength. Results: A total of 67 421 tweets were automatically extracted; of those, 3.7% specifically referred to T1D and 6.8% specifically mentioned T2D. One or more emojis were included in 7.0% of the posts. Tweets specifically mentioning T2D that did not include emojis were significantly more negative than the tweets that included emojis (-2.22 vs -1.48, P < .001). Tweets on T1D that included emojis were both significantly more positive and also less negative than tweets without emojis (1.71 vs 1.49 and -1.31 vs -1.50, respectively; P < .005). The number of followers had a negative association with positive sentiment strength (r = -.023, P < .001) and a positive association with negative sentiment (r = .016, P < .001). Conclusion: The use of sentiment analysis techniques on social media could increase our knowledge of how social media impact people with diabetes and their families and could help to improve public health strategies.
Conference Paper
Millions of users share their experiences on social media sites, such as Twitter, which in turn generate valuable data for public health monitoring, digital epidemiology, and other analyses of population health at global scale. The first, critical, task for these applications is classifying whether a personal health event was mentioned, which we call the personal health mention (PHM) problem. This task is challenging for many reasons, including the typically short length of social media posts, inventive spelling and lexicons, and figurative language, including hyperbole using diseases like "heart attack" or "cancer" for emphasis, and not as a health self-report. This problem is even more challenging for rarely reported, or frequent but ambiguously expressed conditions, such as "stroke". To address this problem, we propose a general, robust method for detecting PHMs in social media, which we call WESPAD, that combines lexical, syntactic, word embedding-based, and context-based features. WESPAD is able to generalize from few examples by automatically distorting the word embedding space to most effectively detect the true health mentions. Unlike previously proposed state-of-the-art supervised and deep-learning techniques, WESPAD requires relatively little training data, which makes it possible to adapt, with minimal effort, to each new disease and condition. We evaluate WESPAD on both an established publicly available Flu detection benchmark, and on a new dataset that we have constructed with mentions of multiple health conditions. Our experiments show that WESPAD outperforms the baselines and state-of-the-art methods, especially in cases when the number and proportion of true health mentions in the training data is small.
Article
Many people with inflammatory bowel disease (IBD), sometimes lacking adequate face-to-face sources of support, turn to online communities to meet others with the disease. These online communities are places of support and education, but through the use of social media communication technologies, people with IBD are redefining what it means to live with the disease. This ethnographic study followed 14 online communities to understand how people with IBD used social media technologies to construct their own meanings about living with the disease. The following redefinitions were observed: the refiguring of the body is beautiful; inflammatory bowel disease is serious and deadly; inflammatory bowel disease is humorous; the disease makes one stronger; and the disease is invisible, but needs to be made visible. This study will help health communication scholars understand how technology is appropriated by patients, and will help practitioners understand how their patients conceptualize their disease.
Article
Over 1 million residents in the USA and 2.5 million in Europe are estimated to have IBD, with substantial costs for health care. These estimates do not factor in the 'real' price of IBD, which can impede career aspirations, instil social stigma and impair quality of life in patients. The majority of patients are diagnosed early in life and the incidence continues to rise; therefore, the effect of IBD on health-care systems will rise exponentially. Moreover, IBD has emerged in newly industrialized countries in Asia, South America and the Middle East and has evolved into a global disease with rising prevalence on every continent. Understanding the worldwide epidemiological patterns of IBD will prepare us to manage the burden of IBD over time. The goal of this article is to establish the current epidemiology of IBD in the Western world, contrast it with the increase in IBD in newly industrialized countries and forecast the global effects of IBD in 2025.
Article
Search engines and social media are two of the most commonly used online services; in this paper, we examine how users appropriate these platforms for online health activities via both large-scale log analysis and a survey of 210 people. While users often turn to search engines to learn about serious or highly stigmatic conditions, a surprising amount of sensitive health information is also sought and shared via social media, in our case the public social platform Twitter. We contrast what health content people seek via search engines vs. share on social media, as well as why they choose a particular platform for online health activities. We reflect on the implications of our results for designing search engines, social media, and social search tools that better support people's health information seeking and sharing needs.
Article
We present a descriptive analysis of Twitter data. Our study focuses on extracting the main side effects associated with HIV treatments. The crux of our work was the identification of personal tweets referring to HIV. We summarize our results in an infographic aimed at the general public. In addition, we present a measure of user sentiment based on hand-rated tweets.
Article
The aim of this study was to describe the impacts of inflammatory bowel disease (IBD) from the patients' perspective and to inform the development of a conceptual model. Focus groups and one-on-one interviews were undertaken in adult patients with IBD. Transcripts from the focus groups and interviews were analyzed to identify themes and links between themes, assisted by qualitative data software MaxQDA. Themes from the qualitative research were supplemented with those reported in the literature and concepts included in IBD-specific patient-reported outcome measures. Twenty-seven patients participated. Key physical symptoms included pain, bowel-related symptoms such as frequency, urgency, incontinence, diarrhea, passing blood, and systemic symptoms such as weight loss and fatigue. Participants described continuing and variable symptom experiences. IBD symptoms caused immediate disruption of activities but also had ongoing impacts on daily activities, including dietary restrictions, lifestyle changes, and maintaining close proximity to a toilet. More distal impacts included interference with work, school, parenting, social and leisure activities, relationships, and psychological well-being. The inconvenience of rectal medications, refrigerated biologics, and medication refills emerged as novel burdens not identified in existing patient-reported outcome measures. IBD symptoms cause immediate disruption in activities, but patients may continue to experience some symptoms on a chronic basis. The conceptual model presented here may be useful for identifying target concepts for measurement in future studies in IBD.
Article
The problems associated with ulcerative colitis and its treatment have effects on adolescents and young adults dissimilar from, as well as more profound than, those on older individuals. Adolescents are confronted with problems such as biological, psychological and social changes as well as role changes related to peers and family. This inductive study aimed to describe the adolescents' experiences of living with ulcerative colitis. A total of 28 subjects were asked about their experiences both at the present time and at the time their first symptoms appeared. Verbatim transcribed thematized interviews were analysed according to a method influenced by the constant comparative method for grounded theory. Eight categories were grounded in the data, forming a model which describes the process from onset of disease to present time. The main variable identified was reduced living space, a strategy to manage the new situation. Dependent on the reactions received from significant others, the outcome for the adolescents hovered between feelings of self-confidence and lack of self-confidence. If the adolescents experienced support, the living space was expanded again. The results might be of great value when caring for and assisting young persons with a chronic disease in general, and in particular when taking care of adolescents with a recently diagnosed inflammatory bowel disease.
Article
This study was designed to identify the impact chronic ulcerative colitis (UC) has on the lives of patients compared to other chronic conditions. Overall, 451 patients with UC, 309 with rheumatoid arthritis, 305 with asthma, and 305 with migraine headaches were recruited in an Internet survey designed to assess a variety of disease-impact indices. Patients with UC reported a mean of eight (self-defined) flare-ups in the previous 12 months. Significantly more patients with UC (81%) believed that the quantity of flare-ups they experienced was 'normal', compared to patients with migraine headaches (64%) or asthma (75%). Patients with UC also reported significantly more worry about disease complications (84%), depression (62%), and embarrassment (70%) than patients with the other chronic conditions. Compared to patients with other chronic conditions, patients with UC perceive substantially more negative impact upon their lives, especially with regard to the psychological burden.
Article
Although the incidence and prevalence of ulcerative colitis and Crohn's disease are beginning to stabilize in high-incidence areas such as northern Europe and North America, they continue to rise in low-incidence areas such as southern Europe, Asia, and much of the developing world. As many as 1.4 million persons in the United States and 2.2 million persons in Europe suffer from these diseases. Previously noted racial and ethnic differences seem to be narrowing. Differences in incidence across age, time, and geographic region suggest that environmental factors significantly modify the expression of Crohn's disease and ulcerative colitis. The strongest environmental factors identified are cigarette smoking and appendectomy. Whether other factors such as diet, oral contraceptives, perinatal/childhood infections, or atypical mycobacterial infections play a role in expression of inflammatory bowel disease remains unclear. Additional epidemiologic studies to define better the burden of illness, explore the mechanism of association with environmental factors, and identify new risk factors are needed.
Article
This article reports on the experiences of individuals living with IBD and identifies a range of coping strategies used by them. Qualitative data from 15 individual interviews and three focus groups were analysed using a grounded theory approach. The main focus is on the emergent core concept of 'health-related normality'. A theoretical framework is proposed to explain how individuals with IBD assess their health-related normality, their fight to maintain it and their need to retain the appearance of normality to others. It is concluded that individuals maintain their health-related normality along certain time and context sensitive continuums rather than fitting into a distinct typology.
Data mining Twitter for cancer, diabetes, and asthma insights
  • K Chulis
The impact of ulcerative colitis on patients’ lives compared to other chronic diseases: a patient survey
  • D T Rubin
  • M C Dubinsky
  • R Panaccione
  • C A Siegel
  • D G Binion
  • S V Kane
Dirichlet Regression in R. Version 0.4-0. R Foundation for Statistical Computing
  • M Maier
The ‘who’ and ‘what’ of#
  • M Beguerisse-Díaz
  • A K Mclennan
  • G Garduño-Hernández
  • M Barahona
  • S J Ulijaszek