Han Kyul Kim

  • Master of Science
  • PhD Student at University of Southern California

About

19 Publications
1,404 Reads
304 Citations
Current institution
University of Southern California
Current position
  • PhD Student

Publications (19)
Preprint
Full-text available
Misogynistic content in cyberspaces is a problem highlighted in many previous studies and is potentially harmful in the context of women's equality efforts and the normalization of discrimination on online platforms. Many studies highlight the presence of misogynistic content and how this content may be proliferated and magnified through algorithms...
Article
Full-text available
In this work, we introduce EILEEN (Efficient Inference for Language-based Extraction of EHR Notes), a novel multi-modal natural language processing (NLP) framework designed to extract various alcohol consumption patterns from unstructured clinical notes, particularly in bilingual and non-English contexts. Recent advances in NLP have significantly i...
Preprint
Full-text available
This paper highlights the developing need for quantitative modes for capturing and monitoring malicious communication in social media. There has been a deliberate "weaponization" of messaging through the use of social networks, including by politically oriented entities, both state-sponsored and privately run. The article identifies a use of AI/ML ch...
Article
Full-text available
A rapidly developing threat to societal well-being comes from misinformation widely spread on social media. Even more concerning is "mal-info" (malicious information), which is amplified on certain social networks. Now there is an additional dimension to that threat, which is the use of Generative AI to deliberately augment the mis-info and mal-info. This paper hi...
Conference Paper
The growing popularity of generative AI, particularly ChatGPT, has sparked both enthusiasm and caution among practitioners and researchers in education. To effectively harness the full potential of ChatGPT in educational contexts, it is crucial to analyze its impact and suitability for different educational purposes. This paper takes an initial ste...
Article
Full-text available
As a key modifiable risk factor, alcohol consumption is clinically crucial information that allows medical professionals to further understand their patients’ medical conditions and suggest appropriate lifestyle modifying interventions. However, identifying alcohol-related information from unstructured free-text clinical notes is often challenging....
Article
Full-text available
Featured Application: The study presents an improved and easily obtainable method for automatic smoking classification from unstructured bilingual electronic health records. Abstract: Smoking is an important variable for clinical research, but there are few studies regarding automatic obtainment of smoking classification from unstructured bi...
Preprint
BACKGROUND: Smoking is a major risk factor and important variable for clinical research, but there are few studies regarding automatic obtainment of smoking classification from unstructured bilingual electronic health records (EHR). OBJECTIVE: We aim to develop an algorithm to classify smoking status based on unstructured EHRs using natural language...
Article
Full-text available
Featured Application: With its term mapping capability, MARIE can be used to improve data interoperability between different biomedical institutions. It can also be applied to text data pre-processing or normalization in non-biomedical domains. Abstract: With growing interest in machine learning, text standardization is becoming an increasingly impo...
Article
Due to its simplicity and intuitive interpretability, spherical k-means is often used for clustering a large number of documents. However, there exist a number of drawbacks that need to be addressed for more effective document clustering. Without well-dispersed initial points, spherical k-means fails to converge quickly, which is critical for clust...
Article
While driving a vehicle, data are collected from a huge number of sensors that generate both categorical and continuous variables with varying scales. In order to understand the status of the vehicles and the drivers’ behaviors, it is crucial to segment and identify different phases within this time series data. However, data often lack labels to...
Article
Two document representation methods are mainly used in solving text mining problems. Known for its intuitive and simple interpretability, the bag-of-words method represents a document vector by its word frequencies. However, this method suffers from the curse of dimensionality, and fails to preserve accurate proximity information when the number of...
