Abstract

Question classification is the task of predicting the entity type of the answer for a given natural-language question. It plays an important role in finding or constructing accurate answers and therefore helps to improve the quality of automated question answering systems. In previous studies, different lexical, syntactic and semantic features were extracted automatically from a question to serve the classification. However, combining all those features does not always give the best results for all types of questions. Different from previous studies, this paper focuses on the problem of how to extract and select efficient features adapted to each type of question. We first propose a method that uses a feature selection algorithm to determine the appropriate features for each question type. Second, we design a new type of feature based on question patterns. We tested our proposed approach on the benchmark TREC dataset using Support Vector Machines (SVM) as the classification algorithm. The experiments show accuracies of 95.2% and 91.6% for the coarse-grained and fine-grained datasets respectively, which are much better than in previous studies.
Indian Journal of Science and Technology, Vol 9(17), DOI: 10.17485/ijst/2016/v9i17/93160, May 2016
ISSN (Print): 0974-6846
ISSN (Online): 0974-5645

Improving Question Classification by Feature Extraction and Selection

Nguyen Van-Tu (1), Le Anh-Cuong (2)*
(1) VNU University of Engineering and Technology, Ha Noi City, Vietnam; tuspttb@gmail.com
(2) Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam; leanhcuong@tdt.edu.vn
* Author for correspondence
1. Introduction

Automated question answering has become an important research direction in natural language processing [1,2]. Its purpose is to seek an accurate and concise answer to a free-form factual question from a large collection of text data, rather than a full document judged relevant as in standard information retrieval tasks. Although different types of question answering systems have different architectures, most of them follow a framework in which question classification plays an important role [3]. Furthermore, some studies have demonstrated that the performance of question classification has a significant influence on the overall performance of a question answering system [2,4,5]. The task of question classification is to predict the entity type of the answer of a natural language question [6]. For example, for the question "Where is the Eiffel Tower?", the task of question classification is to return the label "location", so that the answer to this question is a named entity of type "location". Since we predict the type of the answer, question classification is also referred to as answer type prediction.

Many studies have addressed this problem; they belong to either the rule-based approach [7,8] or the machine learning-based approach [4,9-11]. In this paper, we follow the machine learning approach and pay attention to the importance of feature extraction and selection. From the view of machine learning, we can easily formulate this task as a classification problem. Various supervised learning methods have been used, such as Nearest Neighbors (NN), Naive Bayes (NB), Decision Tree (DT), Sparse Network of Winnows (SNoW), and Support Vector Machines (SVM). However, as experimental results in previous studies show, feature sets strongly affect the quality of question classification.
According to previous studies, various types of features have been investigated. The most common types are bag of words and n-grams, which were used in all studies. Some other studies (e.g. [12]) tried to enrich the feature set by adding more linguistic information such as part-of-speech tags or head words, or even semantic features. However, from our observation, combining all features is not always the best solution for all questions. Therefore, in this paper we give an experimental investigation for finding the best feature sets corresponding to different groups of questions. In addition, we also extract a new type of feature based on question patterns. These new features are then integrated into the existing feature sets and yield better classification results. We tested our proposed feature sets using an SVM classifier, which has been experimentally shown to achieve the best results [6,9,13,14]. Like most previous studies, the TREC dataset is chosen for conducting experiments.

The rest of this paper is organized as follows: Section 2 presents the basic issues in question classification, including the question type taxonomy and feature extraction. Section 3 presents our proposal for feature selection. Section 4 presents the experiments. Conclusions and future work are presented in Section 5.
2. Basic Issues of Question Classification

2.1. Question Type Taxonomy

The set of question categories (classes) is usually referred to as the question taxonomy. Different question taxonomies have been proposed in different works, but most of the recent studies use the two-layer taxonomy proposed by [15]. This taxonomy consists of 6 coarse-grained classes and 50 fine-grained classes. Table 1 lists this taxonomy.

Once the entity type of the answer is determined, we can combine it with other information to find correct answers. For example, if we know the question is asking about a location (or, more concretely, a city), it is easier to find the exact information for answering as well as to form the appropriate answer.
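For illustration only, the two-layer taxonomy can be held as a simple mapping from coarse classes to fine classes, with a combined label written as COARSE:fine (for example LOC:city or HUM:ind, as used in the TREC annotation). The snippet below is a partial sketch of Table 1, not a complete or official encoding.

```python
# A partial view of the two-layer taxonomy of Table 1 (not the full 6 x 50 set).
# Labels combine both layers, e.g. "LOC:city" or "NUM:date".
TAXONOMY = {
    "ABBR": ["abbreviation", "expression"],
    "DESC": ["definition", "description", "manner", "reason"],
    "HUM":  ["group", "individual", "title", "description"],
    "LOC":  ["city", "country", "mountain", "other", "state"],
    "NUM":  ["code", "count", "date", "distance", "money", "order", "other",
             "period", "percent", "speed", "temperature", "size", "weight"],
}

def fine_labels(coarse):
    """Return the combined COARSE:fine labels for one coarse class."""
    return [f"{coarse}:{fine}" for fine in TAXONOMY[coarse]]

print(fine_labels("LOC"))   # ['LOC:city', 'LOC:country', 'LOC:mountain', 'LOC:other', 'LOC:state']
```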
2.2. Classication Algorithms and
Evaluation
Machine Learning Approach
Most studies in question classication follow supervised
machine learning approach. ere are many dier-
ent classication methods used such as: Support Vector
Machine, Naive Bayesian classication, Maximum
Entropy Models16,17, Sparse Network of Winnows12.
Among these methods, Support Vector Machine with lin-
ear kernel function is shown as the most eective method,
according to6,9,13,14. erefore, SVM is the machine learn-
ing method used in our system. We can easily search for
a many documents introducing about SVM methods and
applications, thus it is not necessary for presenting it in
detail here.
A general framework in supervised machine learning
method for question classication is briey described in
the following steps:
• First, we need to build a training dataset, it includes
questions assigned with classication labels.
• Second, each labeled question in the training dataset is
represented as a vector of features.
• ird, a machine learning method (here SVM) is used to
learn on the training vectors and generate the classier.
• Finally, for each a test question we represent it by a
vector of features and use the learnt classier to obtain
a label (i.e. a question category).
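The following sketch illustrates these steps with scikit-learn, using a trivial unigram feature extractor and a few toy questions; it is not the authors' implementation, and in the paper the feature vectors would combine the lexical, syntactic, semantic and question-pattern features described below.

```python
# Minimal sketch of the supervised pipeline: feature dicts -> vectors -> linear SVM.
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics import accuracy_score
from sklearn.svm import LinearSVC

def extract_features(question):
    # Placeholder feature extractor: unigram counts only. The paper combines
    # lexical, syntactic, semantic and question-pattern features instead.
    feats = {}
    for tok in question.lower().split():
        feats["uni=" + tok] = feats.get("uni=" + tok, 0) + 1
    return feats

train = [("LOC:city", "What is the oldest city in Spain ?"),
         ("HUM:ind", "Who invented the slinky ?"),
         ("NUM:date", "When did Idaho become a state ?")]
test = [("LOC:city", "What city has the zip code of 35824 ?")]

vectorizer = DictVectorizer()
X_train = vectorizer.fit_transform([extract_features(q) for _, q in train])
y_train = [label for label, _ in train]

classifier = LinearSVC()            # SVM with a linear kernel, as used in the paper
classifier.fit(X_train, y_train)

X_test = vectorizer.transform([extract_features(q) for _, q in test])
predicted = classifier.predict(X_test)
print(accuracy_score([label for label, _ in test], predicted))
```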
Table 1. The coarse-grained and fine-grained question classes

Coarse          Fine
ABBREVIATION    abbreviation, expression
ENTITY          animal, body, color, creative, currency, dis.med, event, food, instrument, lang, letter, other, plant, product, religion, sport, substance, symbol, technique, term, vehicle, word
DESCRIPTION     definition, description, manner, reason
HUMAN           group, individual, title, description
LOCATION        city, country, mountain, other, state
NUMERIC         code, count, date, distance, money, order, other, period, percent, speed, temperature, size, weight
Classification Evaluation

Performance in question classification is evaluated by the global accuracy of the classifier over all the coarse or fine classes [12]:

Accuracy = (# of correct predictions) / (# of predictions)

The accuracy of a question classifier can also be measured on a specific class. Precision in question classification on a specific class c is defined as follows:

Precision(c) = (# of correct predictions of class c) / (# of predictions of class c)

For systems in which a question can have only one class, a question is correctly classified if the predicted label is the same as the true label. For systems which allow a question to be classified into more than one class [12,15], a question is correctly classified if one of the predicted labels is the same as the true label.

2.3. Feature Extraction

There are various types of features currently used for question classification. They can be grouped into three different categories based on the kind of linguistic information: lexical, syntactic and semantic features.

2.3.1. Lexical Features

Lexical features are usually the context words appearing in the question. In question classification, a question is represented similarly to document representation in the vector space model, i.e., a question can be represented as the vector:

q = (q1, q2, …, qN)

where qi is defined as the frequency of the i-th term in question q and N is the total number of terms. Note that only non-zero valued features are kept in the feature vector. A question q is therefore represented in the form:

q = {(t1, f1), …, (tp, fp)}

where fi is the frequency of the term ti in question q. These features are called bag-of-words or unigram features.

Unigrams are a special case of the so-called n-grams. To extract n-gram features, any n consecutive words in a question are considered as a feature. Table 2 lists the lexical features of the sample question "Who was elected president of South Africa in 1994?"

Note that some special cases of lexical information, such as question words (i.e. who, how, when, what) or word shapes, are also put into the lexical feature set.

Table 2. Example of lexical features

Feature space    Features
Unigram          {(Who, 1) (was, 1) (elected, 1) (president, 1) (of, 1) (South, 1) (Africa, 1) (in, 1) (1994, 1) (?, 1)}
Bigram           {(Who-was, 1), (was-elected, 1), (elected-president, 1), (president-of, 1), (of-South, 1), (South-Africa, 1), (Africa-in, 1), (in-1994, 1), (1994-?, 1)}
Trigram          {(Who-was-elected, 1), (was-elected-president, 1), …, (in-1994-?, 1)}
Wh-Word          {(Who, 1)}
Word-Shapes      {(lowercase, 5) (mix, 3) (digit, 1) (other, 1)}

2.3.2. Syntactic Features

Syntactic features are extracted from the syntactic structure of a question. There are two common kinds of syntactic features used for question classification: tagged unigrams and head words.

Tagged Unigrams

Tagged unigrams indicate the part-of-speech tag of each word in a question, such as NN (noun), NP (noun phrase), VP (verb phrase), JJ (adjective), etc. The following example shows these features extracted from the sentence "Who was elected president of South Africa in 1994?":

{Who_WP, was_VBD, elected_VBN, president_NN, of_IN, South_NNP, Africa_NNP, in_IN, 1994_CD, ?_.}

Head Words

A head word is the key word or the central word in a sentence, a clause or a phrase. This word is determined based on the syntactic parse tree of the input sentence. As mentioned in [6], head words contain important information for specifying the object that a question is seeking. Therefore, identifying the head word correctly can improve the classification accuracy, since it is the most informative word in the question.
Improving Question Classication by Feature Extraction and Selection
Indian Journal of Science and Technology
Vol 9 (17) | May 2016 | www.indjst.org
4
For example, for the question "What is the oldest city in Spain?" the head word is "city". The word "city" in this question can contribute strongly to classifying the question as "LOC:city". Table 3 lists a sample of questions from the TREC dataset together with their class labels.

To determine the head word of a sentence, a syntactic parser is required. For sentences written in English, the Stanford PCFG parser [18] is usually used; it is also the parser used in this paper.
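To make the lexical features of Section 2.3.1 and the head-word idea concrete, the sketch below extracts unigram, bigram, wh-word and word-shape features (as in Table 2) and guesses a head word with a deliberately naive heuristic. It is illustrative only: the paper derives head words from a Stanford PCFG parse, not from this kind of shortcut.

```python
import re
from collections import Counter

WH_WORDS = {"who", "what", "when", "where", "which", "why", "how"}

def word_shape(tok):
    # Coarse word-shape classes used in Table 2: lowercase, mix, digit, other.
    if tok.isdigit():
        return "digit"
    if tok.isalpha():
        return "lowercase" if tok.islower() else "mix"
    return "other"

def lexical_features(question):
    """Unigram, bigram, wh-word and word-shape counts, prefixed by feature space."""
    tokens = re.findall(r"\w+|[^\w\s]", question)
    feats = Counter()
    feats.update("uni=" + t for t in tokens)
    feats.update("bi=" + a + "-" + b for a, b in zip(tokens, tokens[1:]))
    feats.update("wh=" + t for t in tokens if t.lower() in WH_WORDS)
    feats.update("shape=" + word_shape(t) for t in tokens)
    return feats

PREPOSITIONS = {"of", "in", "on", "at", "for", "from", "between", "against", "about"}
FUNCTION_WORDS = WH_WORDS | {"is", "was", "are", "were", "did", "do", "does", "the", "a", "an"}

def head_word_guess(question):
    """Naive head-word guess: last content word before the first preposition.
    The paper uses the Stanford PCFG parser instead of this heuristic."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", question)]
    for i, t in enumerate(tokens):
        if t in PREPOSITIONS:
            tokens = tokens[:i]
            break
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return content[-1] if content else None

print(lexical_features("Who was elected president of South Africa in 1994?"))
print(head_word_guess("What is the oldest city in Spain?"))   # -> city
```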
2.3.3. Semantic Features

Semantic features are useful in the case of sparse data. From higher-level semantic concepts we can obtain the relationship (i.e. the semantic similarity) between different words. WordNet is a well-known resource used for determining semantic features. WordNet is a lexical database of English words providing a lexical hierarchy that associates a word with higher-level semantic concepts, namely hypernyms [6]. For example, a hypernym of the word "city" is "municipality".

There are three kinds of semantic features used for question classification, as shown in [12]:

Question Category (QC)

The WordNet hierarchy is used to estimate the similarity between the question's head word and the question categories. The category with the highest similarity is added as a new feature to the feature vector. For example, the question "What American composer wrote the music for "West Side Story"?" has the head word "composer". To find the question category feature, this word is compared with all question categories, and the category with the highest similarity is added to the feature vector. In this example the most similar category is "individual", and therefore the question category feature will be {(individual, 1)}.
Question Expansion (QE)

Another semantic feature, called query expansion, is basically very similar to the hypernym feature. As explained above, we add hypernyms of a head word to the feature vector using words from the WordNet hierarchy. Instead of imposing a hard limit on the hypernym depth, we define a weight parameter which decreases as the distance of a hypernym from the original word increases. For example, for the question "What river flows between Fargo, North Dakota and Moorhead, Minnesota?", the head word is "river". The query expansion features of this question will be as follows, given that the weight of "river" is 1:

{(river, 1) (stream, 0.6) (body-of-water, 0.36) (thing, 0.22) (physical-entity, 0.13) (entity, 0.08)}.
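A sketch of this query-expansion feature using NLTK's WordNet interface is shown below. It assumes the WordNet data has been downloaded (nltk.download("wordnet")), always takes the first (most common) sense with no word sense disambiguation, and uses a decay factor of 0.6, which reproduces the weights of the "river" example but is our assumption rather than a value stated in the paper.

```python
from nltk.corpus import wordnet as wn

def query_expansion_features(head_word, decay=0.6, max_depth=5):
    """Weight the head word's hypernym chain with geometrically decaying weights."""
    feats = {head_word: 1.0}
    synsets = wn.synsets(head_word, pos=wn.NOUN)
    if not synsets:
        return feats
    synset = synsets[0]            # first sense only; no disambiguation here
    weight = 1.0
    for _ in range(max_depth):
        hypernyms = synset.hypernyms()
        if not hypernyms:
            break
        synset = hypernyms[0]
        weight *= decay
        name = synset.lemmas()[0].name().replace("_", "-")
        feats[name] = round(weight, 2)
    return feats

print(query_expansion_features("river"))
# expected to resemble: {'river': 1.0, 'stream': 0.6, 'body-of-water': 0.36, ...}
```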
Related Words (RW)

Another semantic feature that we also use in this work is the related words feature, as presented in [12]. In that study, the authors defined groups of words, each represented by a category name. If a word in the question exists in one or more groups, the corresponding category names are added to the feature vector. For example, if any of the words {birthday, birthdate, day, decade, hour, week, month, year} exists in a question, then its category name, "date", is added to the feature vector.
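A minimal sketch of such a related-words lookup is given below; the two word groups are abbreviated and partly hypothetical stand-ins for the much larger lists used in [12].

```python
# Hypothetical, abbreviated word groups; the real lists in [12] are far larger.
RELATED_GROUPS = {
    "date": {"birthday", "birthdate", "day", "decade", "hour", "week", "month", "year"},
    "loca": {"city", "country", "river", "mountain", "state"},
}

def related_word_features(question):
    """Count, per group, how many of its words occur in the question."""
    tokens = {t.lower().strip("?,.") for t in question.split()}
    feats = {}
    for category, words in RELATED_GROUPS.items():
        hits = len(tokens & words)
        if hits:
            feats["rel:" + category] = hits
    return feats

print(related_word_features("What river flows between Fargo , North Dakota and Moorhead , Minnesota ?"))
# -> {'rel:loca': 1}
```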
Table 4 lists the semantic features of the question "What river flows between Fargo, North Dakota and Moorhead, Minnesota?".
Table 3. A sample of questions with their head words and appropriate categories

Question                                                                   Category
What city has the zip code of 35824 ?                                     LOC:city
Who developed the vaccination against polio ?                             HUM:ind
Who invented the slinky ?                                                 HUM:ind
George Bush purchased a small interest in which baseball team ?           HUM:gr
When did Idaho become a state ?                                           NUM:date
What river flows between Fargo, North Dakota and Moorhead, Minnesota ?    LOC:other
What is the oldest city in Spain ?                                        LOC:city
Table 4. Example of semantic features

Feature space        Features
Hypernyms            {(river, 1) (stream, 1) (body-of-water, 1) (thing, 1) (physical-entity, 1) (entity, 1)}
Related Words        {(rel:What, 1) (rel:list.tar, 2) (rel:loca, 2)}
Question Category    {(other, 1)}
Query Expansion      {(river, 1) (stream, 0.6) (body-of-water, 0.36) (thing, 0.22) (physical-entity, 0.13) (entity, 0.08)}
3. Our Proposal for Feature Selection and a New Feature Type

3.1. Combination of Different Feature Sets

Suppose that each feature type mentioned in Section 2.3 generates a single set of features. A natural way to obtain the final set of features for question classification is to combine all the single sets. However, we found that the combination of all these feature sets is not efficient and does not always give the best results for all questions.

From our observation, each type of question can be sensitive to particular types of features. Therefore, assigning different feature sets to different question types can be a solution. The question types here relate to the question words "who", "when", "how", "why", "which", "where" and "what"; each question word defines one question type. We reserve one additional type for the remaining questions, which do not contain any of those question words.

We propose to use a simple feature selection procedure to determine the best combination of single feature sets for each question type, as presented in Algorithm 1 below.
3.2. Extracting Features from Question Patterns

By studying the TREC dataset we found that some questions inherently do not have any suitable head word. For example, in the question "What is an atom?" the only noun, "atom", does not provide the information needed to classify this question as "definition". We recognize that by integrating lexical, syntactic and semantic information into a unified form, we can obtain richer features for determining the correct labels of such questions. This new kind of feature also brings additional evidence to the classification and therefore may lead to better results.

We first design some patterns (i.e. templates) that integrate lexical, syntactic and semantic information. Table 5 shows some of the designed question patterns.

From these patterns, which we call question patterns, we generate corresponding features. For example, from the question "How is thalassemia defined?" we obtain the features (How-is, 1) and (How-is-defined, 1). We then combine these features with the existing feature sets to get the final feature sets for classification.
Algorithm 1. Determining the feature set for each question type

Input: a training dataset and a development dataset corresponding to the selected question type; a machine learning method (here SVM).
Output: a set of single feature types which gives the best result on the development dataset.

Step 1: Extract all the single feature sets, denoted SF1, SF2, …, SFn.
        Set the remaining feature sets SF = {SF1, SF2, …, SFn}.
        Set the initial feature set F = {}.
        Set the initial accuracy A = 0.
Step 2: For each SFi in SF, train a new classifier with the feature set F + {SFi} and get its accuracy on the development set, denoted Ai.
Step 3: Let Ak be the highest accuracy, corresponding to the feature set F + {SFk}.
        If Ak > A then set A = Ak, SF = SF \ {SFk}, F = F + {SFk};
        else return F and quit.
Step 4: If SF is not empty, repeat from Step 2; else return F and quit.
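Algorithm 1 is a standard greedy forward selection over feature sets. A direct sketch is given below; train_and_score is a stand-in for training the SVM of Section 2.2 on the union of the chosen feature sets (for one question type) and returning its accuracy on the development data.

```python
def select_feature_sets(single_sets, train_and_score):
    """Greedy forward selection of single feature sets (Algorithm 1).

    single_sets: names of the candidate feature sets, e.g.
                 ["unigram", "bigram", "word-shape", "head-word", "QP", ...]
    train_and_score: callable taking a list of names and returning the
                 development-set accuracy of a classifier trained on them.
    """
    remaining = list(single_sets)       # SF
    selected = []                       # F
    best_accuracy = 0.0                 # A
    while remaining:
        # Step 2: try adding each remaining feature set to the current selection.
        scores = {name: train_and_score(selected + [name]) for name in remaining}
        # Step 3: keep the best candidate only if it improves the accuracy.
        best_name = max(scores, key=scores.get)
        if scores[best_name] > best_accuracy:
            best_accuracy = scores[best_name]
            selected.append(best_name)
            remaining.remove(best_name)
        else:
            break                       # no improvement: stop and return F
    return selected, best_accuracy
```

Running this once per question type (one run for "how" questions, one for "who" questions, and so on) yields the per-type feature combinations reported later in Table 8.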
Table 5. Example of question patterns

Question pattern                  Semantic information
Wh-word + Tobe + word-shape
Wh-word + weather-word            weather-word: hot, cold, warm, wet, …
Wh-word + distance-word           distance-word: far, long, …
Wh-word + Tobe + distance-word    distance-word: far, long, …
Wh-word + money-word              money-word: money, cost, rent, sell, spend, charge, pay, …
Wh-word + place-word              place-word: city, county, mountain, state, …
Wh-word + reason-word             reason-word: causes, used, known, …
4. Experiments and Results

The dataset we used for conducting our experiments was created by [15]. They provided a question dataset which is widely used in question classification studies and known as the UIUC or TREC dataset. It consists of 5500 labeled questions, used as the training set, and 500 independent labeled questions, used as the test set. The 5500 training questions are split randomly into 5 different training sets of size 1000, 2000, 3000, 4000 and 5500 respectively.
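A sketch of loading this dataset is shown below. It assumes the commonly distributed files train_5500.label and TREC_10.label, in which each line starts with a COARSE:fine label followed by the question text, and which are usually encoded in Latin-1; adjust the paths and details to your own copy.

```python
import random

def load_trec(path):
    """Read (coarse, fine_label, question) triples from a UIUC/TREC label file."""
    data = []
    with open(path, encoding="latin-1") as f:
        for line in f:
            label, question = line.strip().split(" ", 1)   # e.g. "LOC:city What ..."
            data.append((label.split(":")[0], label, question))
    return data

train = load_trec("train_5500.label")      # assumed file names
test = load_trec("TREC_10.label")

random.seed(0)
subsets = [random.sample(train, n) for n in (1000, 2000, 3000, 4000, 5500)]
```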
We design dierent experiments as follows.
A. Experiment 1
For the rst experiment we combine all the single feature
sets for the task of classication, that includes: Unigram
(U), Bigram (B), Wh-Word (WH), Word-Shapes (WS),
Head-Word (H), Query-Expansion (QE), Question-
Category (QC), Related-Words (R). Table 6 shows the
results corresponding with dierent training data sets.
Table 6. The accuracy of the SVM classifier when combining the feature kinds U, B, WH, WS, H, QE, QC, R

Training size    1000      2000      3000      4000      5500
Coarse 1         90.20%    91.20%    92.00%    92.60%    94.20%
Fine 1           79.00%    85.40%    86.60%    88.00%    90.40%
B. Experiment 2

In this experiment we examine the contribution of the question-pattern features by adding the QP features to the feature set from Experiment 1. The results are shown in Table 7.
Table 7. Accuracy when adding the Question-Pattern features

Training size    1000      2000      3000      4000      5500
Coarse 2         90.40%    91.40%    92.80%    93.20%    95.00%
Fine 2           79.20%    86.00%    87.00%    88.60%    91.00%
Comparing the results of Experiment 1 and Experiment 2, we can see that the QP feature set actually improves the accuracy of question classification for all the training data sets.
C. Experiment 3

This experiment implements Algorithm 1 for feature selection. Note that the QP feature set is also considered as a single feature set in the algorithm. Table 8 shows the selected feature types for each kind of question. It is worth emphasizing that the QP feature set is selected for the questions containing Wh-words, but not for the other type of questions. This seems reasonable because the QP features are designed around Wh-words, and therefore they do not affect the questions without Wh-words.
Table 8. Result of feature selection

Question types                         Features
How, Who, Why, When, Where, Which      Unigram, Bigram, Word-Shapes, Question Pattern
What                                   Unigram, Bigram, Head Word, Word-Shapes, Related Words, Question Pattern, Query Expansion, Question Category
Other questions                        Unigram, Bigram, Word-Shapes, Related Words
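Read operationally, Table 8 means that at prediction time the wh-word of a question decides which feature combination is extracted before the corresponding classifier is applied. The routing sketch below illustrates this; the feature-set names and the dispatch function are ours, not code from the paper.

```python
# Per-question-type feature combinations, transcribed from Table 8.
FEATURES_BY_TYPE = {
    "what":  ["unigram", "bigram", "head-word", "word-shape", "related-words",
              "question-pattern", "query-expansion", "question-category"],
    "other": ["unigram", "bigram", "word-shape", "related-words"],
}
for wh in ("how", "who", "why", "when", "where", "which"):
    FEATURES_BY_TYPE[wh] = ["unigram", "bigram", "word-shape", "question-pattern"]

def question_type(question):
    """Map a question to its type via its wh-word, or 'other' if none is present."""
    for tok in question.lower().split():
        if tok in FEATURES_BY_TYPE and tok != "other":
            return tok
    return "other"

print(question_type("How far is it from Denver to Aspen ?"))            # -> how
print(FEATURES_BY_TYPE[question_type("Name a food high in zinc .")])    # -> the 'other' combination
```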
Table 9 shows the accuracy of question classification when using the result of the feature selection.
Table 9. Accuracy when using feature selection corresponding to question types

Training size    1000      2000      3000      4000      5500
Coarse 3         90.40%    91.60%    93.20%    93.80%    95.20%
Fine 3           79.20%    86.60%    87.40%    89.00%    91.60%
Figure 1 and gure 2 show the comparisons of the
experiment 2 and experiment 3 for the coarse and ne
grained question classes.
Figure 1 Result for Coarse classes in the experiment 2 and
the experiment 3.
Figure 2 Result for ne grained classes in the experiment 2
and the experiment 3.
Comparing the result from Table 9 with the results from Tables 6 and 7, and as illustrated in Figure 1 and Figure 2, we can see that combining both solutions (using QP features and using feature selection) significantly improves question classification.

Comparison with previous studies:

In addition, we also compare our results with well-known previous studies of this task which used the same dataset. Table 10 shows the accuracy for the coarse classes and the fine-grained classes.

Table 10. Comparison with previous studies for the same task and the same dataset

Study                        Classifier                  Features                 Coarse    Fine
Li and Roth (2002) [15]      SNoW                        U+P+HC+NE+R              91.0%     84.2%
Li and Roth (2004) [12]      SNoW                        U+P+HC+NE+R+S            -         89.3%
Metzler et al. (2005) [10]   RBF kernel SVM              U+B+H+HY                 90.2%     83.6%
Huang et al. (2008) [6]      Linear SVM                  U+WH+WS+H+HY+IH          93.4%     89.2%
Silva et al. (2011) [11]     Linear SVM                  U+H+C                    95.0%     90.8%
Hardy et al. (2013) [19]     Extreme Learning Machine    WH+H+HY                  92.8%     84.6%
Our work                     Linear SVM                  U+B+WS+H+R+QE+QC+QP      95.2%     91.6%

Table 10 shows that our proposal achieves accuracies of 95.2% and 91.6% for the coarse-grained and fine-grained classes respectively, which is much better than the previous studies.
5. Conclusion

In this paper we have presented our proposal of feature extraction and feature selection for improving question classification. We have investigated various types of features, including lexical, syntactic and semantic features. We have also proposed a new type of feature based on question patterns and applied a feature selection algorithm to determine the most appropriate feature set for each type of question. The experimental results show that our proposal gives the best accuracies for both the coarse classes and the fine-grained classes of questions, in comparison with using the conventional feature set as well as with previous studies.
6. Acknowledgments

This work is supported by the Nafosted project 102.01-2014.22.
7. References

1. Lehnert WG. A conceptual theory of question answering. In Proceedings of the 5th International Joint Conference on Artificial Intelligence. 1977; 1:158-64.
2. Moldovan D, Pasca M, Harabagiu S, Surdeanu M. Performance issues and error analysis in an open-domain question answering system. ACM Transactions on Information Systems. 2003; p. 133-54.
3. Voorhees EM. Overview of the TREC 2001 question answering track. In Proceedings of the Tenth Text REtrieval Conference (TREC). 2001; p. 42-51.
4. Hermjakob U, Hovy E, Lin CY. Automated question answering in Webclopedia - a demonstration. In Proceedings of ACL-02. 2002.
5. Ittycheriah A, Franz M, Zhu WJ, Ratnaparkhi A, Mammone RJ. IBM's statistical question answering system. NIST, In Proceedings of the 9th Text REtrieval Conference. 2001.
6. Huang Z, Thint M, Qin Z. Question classification using head words and their hypernyms. EMNLP '08, In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2008; p. 927-36.
7. Hull DA. Xerox TREC-8 question answering track report. In Voorhees and Harman. 1999.
8. Prager J, Radev D, Brown E, Coden A. The use of predictive annotation for question answering in TREC8. In NIST Special Publication 500-246: The Eighth Text REtrieval Conference (TREC 8). 1999; p. 399-411.
9. Huang Z, Thint M, Celikyilmaz A. Investigation of question classifier in question answering. EMNLP '09, In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing. 2009; p. 543-50.
10. Metzler D, Croft WB. Analysis of statistical question classification for fact-based questions. Information Retrieval. 2005; p. 481-504.
11. Silva J, Coheur L, Mendes A, Wichert A. From symbolic to sub-symbolic information in question classification. Artificial Intelligence Review. 2011; p. 137-54.
12. Li X, Roth D. Learning question classifiers: the role of semantic information. In Proceedings of the International Conference on Computational Linguistics (COLING). 2004; p. 556-62.
13. Krishnan V, Das S, Chakrabarti S. Enhanced answer type inference from questions using sequential models. HLT '05, In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. 2005; p. 315-22.
14. Zhang D, Lee WS. Question classification using support vector machines. SIGIR '03, In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2003; p. 26-32.
15. Li X, Roth D. Learning question classifiers. COLING '02, In Proceedings of the 19th International Conference on Computational Linguistics. 2002; p. 1-7.
16. Blunsom P, Kocik K, Curran JR. Question classification with log-linear models. SIGIR '06, In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2006; p. 615-16.
17. Li X, Huang XJ, Wu LD. Question classification using multiple classifiers. In Proceedings of the 5th Workshop on Asian Language Resources and First Symposium on Asian Language Resources Network. 2005.
18. Petrov S, Klein D. Improved inference for unlexicalized parsing. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Main Conference. 2007; p. 404-11.
19. Hardy, Cheah Yu-N. Question classification using Extreme Learning Machine on semantic features. Journal of ICT Research and Applications. 2013; 7(1):36-58.