Conference Paper

A novel machine learning model that guides graduate students to write more organized and structured texts

Authors:
  • WriteWise Inc.

Abstract

Academic writing is one of the most valuable skills a scientist can develop. A primary challenge for graduate students is to coherently and concisely organize and present ideas within a manuscript. Writing a quality research manuscript requires transmitting the most relevant information through precise sentences that fulfill diverse communicational roles, ultimately resulting in a coherent, understandable text connected by cohesive mechanisms (e.g. lexical relationships between pairs of terms). Despite technological advances, the execution and teaching of the writing process have not similarly advanced. Therefore, a top priority for graduate programs is to implement new methodologies and technologies that aid students in communicating research advances. Through our investigation, we developed a novel, unsupervised machine-learning model applied to cell biology and biomedical texts that guides students in writing better organized and more structured texts.
Javier Vera¹, Hector Allende-Cid², René Venegas³, Sebastián Rodríguez², Wenceslao Palma², Sofía Zamora³, Fernando Lillo³,
Humberto González², Ashley Van Cott¹,⁴, Eduardo N. Fuentes¹,⁴*
In conclusion, our research proposes an unsupervised machine-learning model that reveals
the hierarchy of information within cell biology and biomedical texts and provides
automatic cohesion feedback, helping graduate students write more coherent, structured
abstracts. Our findings show how computational tools can significantly help young
scientists improve their communication skills. Future technologies and tools that provide
deeper and more detailed advice for constructing and writing academic texts (e.g.
scientific papers, theses, grants) remain to be developed.
¹ WriteWise Research Group, Artificial Intelligence Unit, Santiago, Chile
² Pontificia Universidad Católica de Chile, Escuela de Ingeniería Informática
³ Instituto de Literatura y Ciencias del Lenguaje, Chile
⁴ BioPub, Scientific Writing Unit, Santiago, Chile
* Corresponding author: ef@writewise.cl
[Figure omitted. Panel titles: UNSUPERVISED MACHINE LEARNING; INTERVENTION; ABSTRACT WRITING QUALITY ASSESSMENT.]
Model labels: text; natural language processing; graph construction; key concepts; text organization and structure; word connectivity and hierarchy; text reorganization and feedback; software analysis; user interaction.
Intervention labels: original abstract; pre-test; pilot activities; software demo (30 min); software use (1 hour); revised abstract; post-test.
ABSTRACT WRITING QUALITY ASSESSMENT
1. Rubric development (experts in academic discourse)
2. Rubric validation (external reviewer)
3. Rubric improvement
4. Training and induction of rubric raters
5. Minimum agreement between raters
6. Randomized-blinded revision and rating (2 raters/abstract)
7. Average between raters
8. Statistical validation and results
FIG. 1. MACHINE LEARNING MODEL, EXPERIMENTAL DESIGN, AND WRITING QUALITY ASSESSMENT
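Steps 5 to 7 of the assessment (minimum agreement between raters, two raters per abstract, average between raters) and the pre- and post-test comparison can be illustrated with a minimal Python sketch. The poster does not state the rubric scale, the agreement measure, or the statistical test, so the 1 to 4 scale, the tolerance-based agreement rate, and every score below are hypothetical and only show the shape of the calculation.

```python
def average_raters(rater_a, rater_b):
    """Average two raters' scores, criterion by criterion."""
    return {c: (rater_a[c] + rater_b[c]) / 2 for c in rater_a}


def agreement_rate(rater_a, rater_b, tolerance=1):
    """Share of criteria on which the raters differ by at most `tolerance`.

    This simple measure is an assumption; the poster does not name one.
    """
    hits = sum(abs(rater_a[c] - rater_b[c]) <= tolerance for c in rater_a)
    return hits / len(rater_a)


def pre_post_gain(pre, post):
    """Post-test score minus pre-test score for each criterion."""
    return {c: post[c] - pre[c] for c in pre}


# Hypothetical scores on an assumed 1-4 scale for one abstract.
pre_a = {"topic adequacy": 2, "semantic relationships": 1, "sentence length": 3}
pre_b = {"topic adequacy": 3, "semantic relationships": 2, "sentence length": 3}
post_a = {"topic adequacy": 4, "semantic relationships": 3, "sentence length": 3}
post_b = {"topic adequacy": 3, "semantic relationships": 3, "sentence length": 4}

pre = average_raters(pre_a, pre_b)
post = average_raters(post_a, post_b)
gain = pre_post_gain(pre, post)

print("rater agreement (pre):", agreement_rate(pre_a, pre_b))
print("gain per criterion:", gain)
```

A real analysis would repeat this for every abstract and apply a paired statistical test to the averaged scores; which test the authors used is not stated in the poster.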
Figure labels (writing quality categories): topic adequacy; audience adequacy; communicational process; semantic relationships; sentence length; holistic appreciation; average; conclusion presentation.
All-trans-retinoic acid (AtRA) is the most active metabolite derived from vitamin A metabolism
and has been used for treatments of some erythropoietic diseases. Recent studies using human
erythrocytes (RBC) have suggested that the interaction mechanism induces structural changes in
lipid composition. However, the detail of these changes is unclear. In the present study, the
molecular interaction between AtRA and RBC as well as molecular models of membrane
structural changes were investigated. The latter consisted of dimyristoylphosphatidylcholine
(DMPC) and dimyristoylphosphatidylethanolamine (DMPE), representative of phospholipid classes
located in the outer and inner monolayers of the RBC respectively. X-ray diffraction and
differential scanning calorimetry (DSC) showed that AtRA induced structural and thermotropic
perturbations in multilayers and vesicles of both DMPC and DMPE, particularly at the hydrophobic
region of the membranes. Scanning electron microscopy (SEM) observations revealed that AtRA
induced morphological alterations in RBC from their normal discoid form to stomatocytes. These
outcomes suggested that AtRA molecules were located preferentially in the inner monolayer of
the RBC membrane. The results obtained from this study suggest that the location of AtRA
molecules into the RBC membrane and the modulation of the membrane properties thus
providing deeper insight into the structural biology of these type of cells.
FIG. 3. MACHINE LEARNING MODEL HELPS TO COMMUNICATE KEY CONCEPTS AND ORGANIZE TEXT STRUCTURE (REVISED ABSTRACT)
Ranking of the 3 most important key concepts:
1. AtRA
2. RBC
3. Structural
[Figure omitted. Labels: 1% most important and connected concepts; text represented as an interactive graph.]
FIG. 4. UNSUPERVISED MACHINE LEARNING MODEL HELPS TO COMMUNICATE KEY CONCEPTS AND ORGANIZE TEXT STRUCTURE
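The idea of representing a text as a graph and ranking its most connected concepts can be sketched in a few lines of Python. This is not the authors' model: the poster does not describe how the graph is built or how concepts are ranked, so sentence-level co-occurrence edges, degree-based ranking, the small stopword list, and the sample text are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not from the poster).
STOPWORDS = {"the", "of", "and", "in", "that", "is", "a", "to", "were", "was"}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def build_graph(text):
    """Link every pair of terms that appear in the same sentence."""
    graph = defaultdict(set)
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        terms = sorted(set(tokenize(sentence)))
        for u, v in combinations(terms, 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def rank_concepts(graph, top=3):
    """Rank terms by number of neighbours; ties are broken alphabetically."""
    return sorted(graph, key=lambda w: (-len(graph[w]), w))[:top]


# Hypothetical three-sentence text loosely modelled on the example abstract.
text = (
    "AtRA induced changes in RBC membranes. "
    "AtRA altered RBC shape. "
    "Membranes contain lipids."
)
graph = build_graph(text)
print(rank_concepts(graph))  # ['atra', 'membranes', 'rbc']
```

Terms that recur across sentences accumulate more neighbours and rise in the ranking, which is one simple way a consistently named concept such as AtRA could surface as a key concept.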
PRE- AND POST-TEST COMPARISON
[Figure omitted: eight paired PRE and POST labels.]
FIG. 2. UNSTRUCTURED AND UNORGANIZED ABSTRACT (ORIGINAL)
All-trans-retinoic acid (AtRA) is a metabolite derived from vitamin A metabolism has been
used for the treatment of inflammatory skin diseases such as acne or psoriasis and as a
potential chemotherapeutic agent in some types of cancer. In the present study, the
molecular interaction with human erythrocytes as well as molecular models of its membrane
were investigated. The latter consisted of dimyristoylphosphatidylcholine (DMPC) and
dimyristoylphosphatidylethanolamine (DMPE), representative of phospholipid classes
located in the outer and inner monolayers of the human erythrocyte membrane,
respectively. X-ray diffraction and differential scanning calorimetry (DSC) showed that the
molecule induced structural and thermotropic perturbations in multilayers and vesicles of
both DMPC and DMPE, particularly at the hydrophobic region of the membranes. Scanning
electron microscopy (SEM) observations revealed that the retinoid induced morphological
alterations from their normal discoid form to stomatocytes. These outcomes suggested that
the retinoic were located preferentially in the inner monolayer and suggest that the location
of AtRA molecules into the RBC membrane and the modulation of the membrane properties
could be an important issue to help clarify the potent biological effects shown by this
retinoid.