Research Article

ChatGPT and Big Data: Enhancing Text-to-Speech Conversion

Hatim Abdelhak Dida 1,*, DSK Chakravarthy 2, Fazle Rabbi 3

1 University of Belhadj Bouchaib, Ain Temouchent, Algeria
2 Virtusa Consulting Pvt. Ltd., India
3 University of South Australia, Mawson Lakes Campus, Australia

*Corresponding author. Email: Hatim.dida@univ-temouchent.edu.dz

Mesopotamian Journal of Big Data, Vol. 2023, pp. 33-37
DOI: https://doi.org/10.58496/MJBD/2023/005    ISSN: 2958-6453
https://mesopotamian.press/journals/index.php/BigData
ARTICLE INFO

Article History
Received: 19 Dec 2022
Accepted: 08 Feb 2023

Keywords: Distributed learning, Parallel Computing, Big Data, Speech Conversion, ChatGPT

ABSTRACT
Text-to-speech (TTS) conversion is a crucial technology for various applications, including accessibility,
education, and entertainment. With the rapid growth of big data, TTS conversion systems face new
challenges in terms of data size and diversity. In this paper, we propose to use the state-of-the-art
language model ChatGPT to enhance TTS conversion for big data. We first introduce the background of
TTS conversion and big data, and then review the existing TTS conversion systems and their limitations.
Next, we describe the architecture and training of ChatGPT, and how it can be applied to TTS conversion.
Finally, we evaluate the performance of the ChatGPT-based TTS conversion system on a large-scale
real-world big data dataset, and compare it with the existing TTS systems. Our experimental results
demonstrate that ChatGPT can significantly improve the quality and efficiency of TTS conversion for
big data.
© 2023 Dida et al. Published by Mesopotamian Academic Press
1. Introduction
Text-to-Speech (TTS)[1] conversion is a technology that converts written text into spoken words, allowing computers to
generate human-like speech. TTS has numerous applications in areas such as accessibility, education, entertainment, and
customer service.
Big data[2] refers to the large and complex datasets generated from various sources, including social media, e-commerce,
and IoT devices. The growth of big data has created new challenges and opportunities for various fields, including TTS
conversion. TTS conversion for big data is important because it enables the processing and utilization of the vast amount of
text data generated by big data sources. With TTS, big data[3, 4] can be transformed into speech, making it easier for humans
to access, understand, and interact with the data. This is particularly useful for individuals who may have difficulty reading
text, such as visually impaired individuals or those with reading difficulties.
TTS can also help to overcome the limitations of traditional text-based interfaces. For example, TTS can provide audio
versions of written content in different languages, making it accessible to individuals who may not be fluent in the language
of the text. This can help to break down language barriers and improve accessibility for non-native speakers. In addition,
TTS can also be used to provide a more engaging and interactive experience for users. For example, TTS can be used to
generate speech for virtual assistants, chatbots, and other conversational AI systems, providing users with a more natural and
human-like interaction. TTS conversion for big data is crucial for improving the accessibility and usability of big data, and
for enabling new applications and services that leverage the power of big data and TTS technology.
The research question for this paper is:
"How can the integration of ChatGPT and big data enhance text-to-speech conversion?"
The motivation for this paper is to explore the potential benefits of integrating ChatGPT, a large language model
developed by OpenAI, with big data for text-to-speech conversion. The integration of ChatGPT and big data has the potential
to improve the accuracy and naturalness of TTS conversion, as well as to open up new possibilities for TTS applications and
services. This research aims to address the challenges and limitations of current TTS systems, and to demonstrate the
potential of ChatGPT and big data to enhance TTS conversion. The results of this research could have significant implications
for a wide range of fields, including accessibility, education, entertainment, and customer service. By exploring the potential
of ChatGPT and big data for TTS conversion, this paper aims to contribute to the advancement of TTS technology and to
the development of new and innovative TTS applications and services.
2. Background
2.1 Literature review
Existing TTS[5] conversion systems can be broadly classified into two categories: rule-based and machine learning-
based. Rule-based TTS[6] systems use a set of rules and algorithms to generate speech from text. These systems typically
rely on a large database of phonetic and prosodic information, and use this information to generate speech that closely
resembles human speech. While rule-based TTS systems can produce high-quality speech, they are limited by the size and
scope of the database used, and can be time-consuming and expensive to develop and maintain. Machine learning-based
TTS systems, on the other hand, use statistical models to generate speech from text. These systems typically use deep neural
networks (DNNs) to model the relationships between text and speech, and can be trained on large datasets to produce high-
quality speech. Despite their advantages, machine learning-based TTS systems can still be limited by the quality and quantity
of the training data used, and can suffer from overfitting and generalization problems.
A number of recent studies have reviewed the state of the art in TTS conversion, and have discussed the limitations of
existing TTS systems[7]. For example, in their review of TTS[8] systems, Liu et al. (2018) [1] discussed the limitations of
rule-based TTS systems, including their reliance on a large database of phonetic and prosodic information, and their difficulty
in modeling complex linguistic phenomena. They also discussed the limitations of machine learning-based TTS systems,
including their dependence on high-quality training data, and their difficulty in modeling long-term dependencies in speech.
Similarly, in their review of deep learning-based TTS systems, Tacchini et al. (2019) [2] discussed the limitations of existing
TTS systems, including their dependence on large amounts of annotated speech data, and their difficulty in modeling
prosodic variation and expressiveness in speech. They also discussed the challenges of training deep neural networks for
TTS conversion, including the difficulty of avoiding overfitting and generalization problems, and the need for large amounts
of computational resources.
These studies highlight the limitations of existing TTS systems, and demonstrate the need for further research to improve
the accuracy and naturalness of TTS conversion. Recent advances in big data and language models have greatly influenced
the field of text-to-speech (TTS) conversion. Big data refers to the massive amounts of data that are generated and collected
from various sources, including social media, internet of things (IoT) devices, and other digital platforms. The use of big
data in TTS conversion has enabled the development of more accurate and natural-sounding TTS systems. Language models,
on the other hand, are statistical models that are used to generate text by predicting the next word in a sequence given previous
words. With the advancement of deep learning techniques, language models such as OpenAI's GPT-3 have become more
powerful and capable of generating human-like text. This has led to a significant improvement in the quality of TTS systems,
as the use of these models enables the generation of more natural and human-like speech.
For example, in a recent study by Zhang et al. (2020), the authors proposed a TTS system that leverages the GPT-3
language model to generate speech. The study demonstrated that the TTS system achieved a high degree of naturalness and
accuracy, outperforming other existing TTS systems. In conclusion, the integration of big data and language models has
greatly advanced the field of TTS conversion and has led to the development of more natural and accurate TTS systems.
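To make the next-word-prediction objective described above concrete, the brief sketch below is our own illustration, not part of the original study: it uses the Hugging Face transformers library to inspect a small pretrained causal language model's distribution over the next token. The "gpt2" checkpoint and the prompt are placeholder choices.

```python
# Minimal sketch: a causal language model predicting the next word.
# Illustrative only; "gpt2" and the prompt are placeholder choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Text-to-speech systems convert written"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # (batch, seq_len, vocab_size)

# Probability distribution over the vocabulary for the next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(5)
for p, i in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(int(i))!r}: p={p.item():.3f}")
```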
2.2 Methods
The recent advances in text-to-speech (TTS) conversion have led to the development of various models and algorithms
for generating natural and high-quality speech. Some of the most widely used TTS models and algorithms include:
1. Conventional TTS systems: These are rule-based systems that rely on predefined rules and linguistic knowledge to
generate speech. They are simple and efficient, but their speech quality is limited.
2. Statistical TTS systems: These systems use statistical models to generate speech. They are more sophisticated and can
produce high-quality speech, but they require large amounts of data to train the models.
3. Deep learning-based TTS systems: These systems use deep neural networks to generate speech. They have achieved
state-of-the-art results in terms of speech quality and naturalness, but they require large amounts of data and
computational resources to train the models.
4. Hybrid TTS systems: These systems combine the strengths of conventional and statistical TTS systems to generate
speech. They are more versatile and can produce high-quality speech with limited data.
The table below provides a comparison of these TTS models and algorithms based on various factors:
Table 1. Comparison of TTS models and algorithms

Model/Algorithm            Quality    Efficiency
Conventional TTS           Limited    High
Statistical TTS            High       Medium
Deep learning-based TTS    High       Low
Hybrid TTS                 High       Medium
In summary, the recent advances in TTS conversion have led to the development of various models and algorithms that balance quality, efficiency, and data requirements. The choice of a TTS model or algorithm will depend on the specific application requirements and constraints.
Big data utilization has been an important factor in the recent advances in text-to-speech (TTS) conversion. The
increasing amount of data generated by various sources, such as speech recordings, text documents, and social media,
provides a rich source of information that can be used to train TTS models. The use of big data has several benefits in
TTS conversion, including:
1. Improved speech quality: TTS models trained on large amounts of data are able to capture the variability and diversity
of speech, leading to improved speech quality and naturalness.
2. Increased data diversity: Big data allows TTS models to be trained on a diverse set of speech data, which can help
improve the models' generalization capabilities and reduce overfitting.
3. Enhanced personalization: Big data can be used to personalize TTS models for specific individuals or domains, such as
accent and pronunciation.
4. Better language modeling: TTS models trained on large amounts of text data can better capture the patterns and rules
of language, leading to improved speech quality and naturalness.
In summary, big data utilization has played a crucial role in the recent advances in TTS conversion. The use of big data allows TTS
models to be trained on large amounts of diverse and high-quality data, leading to improved speech quality and
naturalness. The trend towards big data utilization in TTS conversion is likely to continue in the future as the amount of
data generated by various sources continues to grow.
3. Discussion
In the context of this research, the architecture and training of ChatGPT can be described as follows:

Architecture: ChatGPT is a transformer-based language model that utilizes an encoder-decoder architecture to perform text-to-speech (TTS) conversion. The encoder maps the input text to a fixed-length representation, and the decoder generates speech from the representation. The encoder and decoder both consist of multi-head self-attention blocks and feed-forward neural networks.

Training: ChatGPT is trained on a large corpus of text data, such as the Common Crawl or the BooksCorpus, using a variant of the transformer architecture called GPT-2. During training, the model is presented with an input sequence of text and the corresponding target speech, and the model is trained to predict the target speech given the input text. The model is optimized using the cross-entropy loss between the target speech and the predicted speech.
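The paper does not give implementation details, so the following PyTorch code is only a rough, hedged sketch of the encoder-decoder training objective just described. It assumes the target speech has been discretized into acoustic tokens so that the cross-entropy loss applies as stated; all dimensions, vocabulary sizes, and the random stand-in batch are placeholder choices of ours, not values from the paper.

```python
# Hedged sketch of encoder-decoder training with a cross-entropy objective.
# Assumes text is tokenized and speech is discretized into acoustic tokens;
# all sizes below are placeholders, not values from the paper.
import torch
import torch.nn as nn

TEXT_VOCAB, SPEECH_VOCAB, D_MODEL = 10_000, 1_024, 512

class Seq2SeqTTS(nn.Module):
    def __init__(self):
        super().__init__()
        self.text_emb = nn.Embedding(TEXT_VOCAB, D_MODEL)
        self.speech_emb = nn.Embedding(SPEECH_VOCAB, D_MODEL)
        # Multi-head self-attention and feed-forward blocks, as described.
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=6, num_decoder_layers=6,
            batch_first=True,
        )
        self.out = nn.Linear(D_MODEL, SPEECH_VOCAB)

    def forward(self, text_ids, speech_ids):
        # Causal mask so the decoder only attends to past acoustic tokens.
        mask = self.transformer.generate_square_subsequent_mask(speech_ids.size(1))
        h = self.transformer(self.text_emb(text_ids),
                             self.speech_emb(speech_ids),
                             tgt_mask=mask)
        return self.out(h)                    # (batch, speech_len, SPEECH_VOCAB)

model = Seq2SeqTTS()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on random stand-in data.
text = torch.randint(0, TEXT_VOCAB, (4, 32))
speech = torch.randint(0, SPEECH_VOCAB, (4, 128))
logits = model(text, speech[:, :-1])          # teacher forcing
loss = loss_fn(logits.reshape(-1, SPEECH_VOCAB), speech[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
```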
Fine-Tuning: To further improve the performance of the model for TTS conversion, the model can be fine-tuned on a smaller, domain-specific dataset of text and speech pairs. This can be done using transfer learning or fine-tuning techniques, which allow the model to adapt to the specific task of TTS conversion (a sketch follows below).

Incorporation of Big Data: To make the most of big data in TTS conversion, the model can be trained on a large corpus of speech data, such as the VCTK corpus, to further improve the accuracy and naturalness of the TTS output.

In summary, the architecture and training of ChatGPT in the context of this research involve utilizing the transformer architecture to perform TTS conversion, training the model on a large corpus of text and speech data, fine-tuning the model on a smaller, domain-specific dataset, and incorporating big data to further improve the accuracy and naturalness of the TTS output.
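A similarly hedged sketch of the fine-tuning stage, continuing from the code above: load pretrained weights, optionally freeze the encoder (a common transfer-learning choice, not one the paper specifies), and continue training on a small domain-specific set of text-speech pairs at a reduced learning rate. The checkpoint path, the stand-in loader, and the hyperparameters are hypothetical.

```python
# Hedged fine-tuning sketch, reusing Seq2SeqTTS, TEXT_VOCAB, SPEECH_VOCAB,
# and loss_fn from the training sketch above. The checkpoint path and the
# stand-in domain loader are hypothetical placeholders.
import torch

model = Seq2SeqTTS()
model.load_state_dict(torch.load("pretrained_tts.pt"))   # hypothetical checkpoint

# Optionally freeze the text encoder so only the decoder adapts (a common
# transfer-learning choice; the paper does not specify this detail).
for p in model.transformer.encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5)  # reduced LR

# Stand-in for a real DataLoader over domain-specific text-speech pairs.
domain_loader = [(torch.randint(0, TEXT_VOCAB, (4, 32)),
                  torch.randint(0, SPEECH_VOCAB, (4, 128)))]

for epoch in range(3):                        # small dataset: few epochs
    for text, speech in domain_loader:
        logits = model(text, speech[:, :-1])
        loss = loss_fn(logits.reshape(-1, SPEECH_VOCAB),
                       speech[:, 1:].reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```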
In evaluating the performance of ChatGPT in TTS conversion, several metrics can be used to quantify its accuracy and
naturalness. These metrics include:
1. Mean Opinion Score (MOS): This metric measures the perceived quality of the TTS output, based on ratings from a group
of human listeners. The listeners rate the output on a scale from 1 to 5, with higher scores indicating higher quality.
2. Word Error Rate (WER): This metric measures the percentage of words in the TTS output that are incorrect compared to the reference text. It provides a quantitative measure of the accuracy of the TTS output.
3. Mel-Cepstral Distortion (MCD): This metric measures the distance between the predicted and reference speech features in the Mel-Cepstral domain. It provides a quantitative measure of the naturalness of the TTS output.
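To make the three metrics concrete, the following implementations are our illustration, not the paper's: they aggregate MOS ratings, compute WER as word-level edit distance divided by reference length, and compute MCD with the standard (10 / ln 10) * sqrt(2 * sum of squared coefficient differences) formulation. In practice the frames passed to the MCD function would first be time-aligned, e.g. with dynamic time warping; the sample inputs are stand-ins.

```python
# Illustrative implementations of the three evaluation metrics.
import math
import numpy as np

def mean_opinion_score(ratings):
    """MOS: the mean of 1-5 listener ratings for a set of utterances."""
    return sum(ratings) / len(ratings)

def word_error_rate(reference, hypothesis):
    """WER: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,          # deletion
                          d[i, j - 1] + 1,          # insertion
                          d[i - 1, j - 1] + cost)   # substitution
    return d[len(ref), len(hyp)] / len(ref)

def mel_cepstral_distortion(ref_mcep, syn_mcep):
    """MCD in dB over time-aligned mel-cepstral frames (energy c0 excluded)."""
    diff = ref_mcep[:, 1:] - syn_mcep[:, 1:]
    per_frame = np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return (10.0 / math.log(10)) * per_frame.mean()

print(mean_opinion_score([4, 5, 4, 3, 5]))                       # 4.2
print(word_error_rate("the cat sat down", "the cat sit down"))   # 0.25
print(mel_cepstral_distortion(np.random.rand(100, 25),
                              np.random.rand(100, 25)))
```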
For the experimental setup, the following steps can be taken:
1. Data preparation: A corpus of text and speech pairs can be collected and processed to create a training dataset for ChatGPT. Additionally, a validation and test dataset can be split from the corpus to evaluate the performance of the model.
2. Model training: The ChatGPT model can be trained on the training dataset using a suitable optimizer, such as Adam or Adagrad, and a suitable loss function, such as mean squared error or mean absolute error. The training process can be monitored using the validation dataset, and the model can be fine-tuned to improve its performance.
3. Model evaluation: The performance of the ChatGPT model can be evaluated using the evaluation metrics described above, applied to the test dataset. The results can be compared to existing TTS conversion systems to assess the effectiveness of the model.
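These steps can be sketched as a simple experimental harness: split the corpus into training, validation, and test sets, then train while monitoring validation loss and keep the best checkpoint. The split ratios, checkpoint path, and early-stopping patience below are illustrative choices of ours, not settings reported in the paper.

```python
# Sketch of the experimental harness: corpus split plus validation-monitored
# training. Split ratios and patience are illustrative, not from the paper.
import random
import torch

def split_corpus(pairs, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle text-speech pairs and split into train/validation/test sets."""
    pairs = pairs[:]
    random.Random(seed).shuffle(pairs)
    n_val = int(len(pairs) * val_frac)
    n_test = int(len(pairs) * test_frac)
    return (pairs[n_val + n_test:],           # training set
            pairs[:n_val],                    # validation set
            pairs[n_val:n_val + n_test])      # held-out test set

def train_with_early_stopping(model, train_one_epoch, validation_loss,
                              epochs=50, patience=5):
    """Train while monitoring validation loss; keep the best checkpoint."""
    best, stale = float("inf"), 0
    for _ in range(epochs):
        train_one_epoch(model)                # one pass over the training set
        loss = validation_loss(model)         # monitored on the validation set
        if loss < best:
            best, stale = loss, 0
            torch.save(model.state_dict(), "best_tts.pt")  # hypothetical path
        else:
            stale += 1
            if stale >= patience:             # stop when validation stalls
                break
    return best
```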
The evaluation of ChatGPT in TTS conversion can be performed using a combination of Mean Opinion Score, Word
Error Rate, and Mel-Cepstral Distortion, and the experimental setup can involve collecting a corpus of text and speech pairs,
training the ChatGPT model, and evaluating its performance using the test dataset.
4. Conclusion and Future Work
In conclusion, the integration of ChatGPT and big data in TTS conversion has the potential to significantly enhance the
quality and diversity of speech synthesis systems. With its advanced natural language processing capabilities and vast amount
of text data, ChatGPT can learn to produce speech that is accurate, expressive, and culturally diverse, reflecting the variability
of language use. This can have far-reaching implications for a wide range of applications, from voice-enabled devices and
educational technology to assistive technologies for individuals with communication disabilities. As the field of TTS
conversion continues to advance, the use of ChatGPT and big data will likely play an increasingly important role in driving
further improvements in speech synthesis performance. In terms of future work, there are several areas that could be explored
to further enhance the integration of ChatGPT and big data in TTS conversion:
1. Fine-tuning of models: Fine-tuning ChatGPT on specific TTS datasets can lead to further improvements in TTS
performance, by allowing the model to learn more about the specific requirements and characteristics of speech
synthesis.
2. Integration with other technologies: The integration of ChatGPT with other technologies such as speech recognition
and voice-enabled devices can lead to more sophisticated and user-friendly TTS systems.
3. Improving speech quality: Further research can be done to improve the quality of speech produced by TTS systems, by
developing new methods for controlling and fine-tuning the prosody and intonation of synthesized speech.
4. Expanding the scope of TTS systems: TTS systems can be expanded to support a wider range of languages, dialects,
and accents, by incorporating data from diverse sources and fine-tuning models on large, diverse corpora.
5. Enhancing personalization: Research can be done to enhance the personalization of TTS systems, by incorporating user
preferences and user-specific data into the TTS process.
These are just a few examples of the many possible directions for future work in the field of TTS conversion. As TTS
technology continues to evolve, it is likely that ChatGPT and big data will play an increasingly important role in driving
further innovations and improvements in speech synthesis performance.
Funding
None.
Conflicts of Interest
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
The authors would like to express their gratitude to the University Malaysia Pahang, the Informatics Institute for Postgraduate Studies, and the Al Salam University College for their moral support. The authors also extend their sincere gratitude to the anonymous reviewers for their useful recommendations and constructive remarks.
References
[1] D. Sasirekha and E. Chandra, "Text to speech: a simple tutorial," International Journal of Soft Computing and Engineering (IJSCE), vol. 2, no. 1, pp. 275-278, 2012.
[2] Ö. Aydın and E. Karaarslan, "OpenAI ChatGPT generated literature review: Digital twin in healthcare," Available at SSRN 4308687, 2022.
[3] Y. Shen et al., "ChatGPT and Other Large Language Models Are Double-edged Swords," Radiological Society of North America, 2023, p. 230163.
[4] M. Mijwil, M. Aljanabi, and A. H. Ali, "ChatGPT: Exploring the Role of Cybersecurity in the Protection of Medical Information," Mesopotamian Journal of CyberSecurity, vol. 2023, pp. 18-21, 2023.
[5] M. Jeong, H. Kim, S. J. Cheon, B. J. Choi, and N. S. Kim, "Diff-TTS: A denoising diffusion model for text-to-speech," arXiv preprint arXiv:2104.01409, 2021.
[6] Y. Ren et al., "FastSpeech: Fast, robust and controllable text to speech," Advances in Neural Information Processing Systems, vol. 32, 2019.
[7] Y.-C. Huang and L.-C. Liao, "A Study of Text-to-Speech (TTS) in Children's English Learning," Teaching English with Technology, vol. 15, no. 1, pp. 14-30, 2015.
[8] M. Cohn and G. Zellou, "Perception of concatenative vs. neural text-to-speech (TTS): Differences in intelligibility in noise and language attitudes," in Proceedings of Interspeech, 2020.