Embedding Wikipedia Title Based on Its Wikipedia Text and Categories
Chi-Yen Chen, Wei-Yun Ma
Institute of Information Science
Academia Sinica
Taipei, Taiwan
{ccy9332, ma}@iis.sinica.edu.tw
Abstract—Distributed word representations are widely used in many NLP tasks, and knowledge-based resources also provide valuable information. Compared to conventional knowledge bases, Wikipedia provides semi-structural data in addition to structural data. We argue that a Wikipedia title's categories can help complement the title's meaning beyond the Wikipedia text, so the categories should be utilized to improve the title's embedding. We propose two directions of using categories, in cooperation with conventional context-based approaches, to generate embeddings of Wikipedia titles. We conduct extensive large-scale experiments on the generated title embeddings on Chinese Wikipedia. Experiments on a word similarity task and an analogical reasoning task show that our approaches significantly outperform conventional context-based approaches.
Keywords: word embedding, Wikipedia, Wikipedia category, knowledge base
I. INTRODUCTION
Distributed word representations have been widely used for various NLP tasks. Many approaches have been proposed to learn word embeddings from a large corpus [1], [2], [3], [4], [5], [6].
Knowledge embedding, which embeds knowledge graphs into a continuous vector space while preserving the original properties of the graph, has also attracted considerable research effort [7], [8], [9], [10]. Instead of using mere graph node pairs as features, GAKE [11] is designed to learn knowledge embeddings by utilizing the graph's structural information.
Researchers have also explored making full use of knowledge-based resources, such as Wikidata, Freebase [12] and WordNet [13], to improve context-based embeddings. Models that learn word embeddings jointly from both knowledge bases and text have been proposed as well [14], [8], [15], [16]. Among the various knowledge bases, Wikipedia provides, for each named entity, not only structural data, i.e., knowledge graphs via infoboxes, but also nonstructural data, i.e., Wikipedia text, and semi-structural data, i.e., the title's categories. Most Wikipedia categories are long noun phrases rather than single nouns, so they are sometimes able to provide more complete information than infoboxes. In our approaches, we define the graph nodes as Wikipedia categories in the form of long noun phrases. To the best of our knowledge, we are the first to utilize noun phrases or Wikipedia category information to improve word embeddings.
In this paper, we argue that a Wikipedia title's categories can help define or complement the meaning of the title beyond the Wikipedia text, so the categories should be utilized to improve the title's embedding. We propose two directions of using categories, in cooperation with conventional context-based approaches, to generate embeddings of Wikipedia titles. One direction is to first obtain the embedding of each category and then add the result to the context-based embedding trained from the Wikipedia text. The other direction is to first decide which category best represents the title and then concatenate that category's embedding with the context-based embedding. We conduct extensive large-scale experiments on the generated title embeddings on Chinese Wikipedia. Experiments on a word similarity task and an analogical reasoning task show that our approaches significantly outperform the conventional context-based approaches.
II. DATA BACKGROUND
Wikipedia provides data at three levels of structure: nonstructural data, i.e., the article text; semi-structural data, i.e., categories; and structural data, i.e., infoboxes. Categories are usually long noun phrases and often provide more information than infoboxes. The page Albert Einstein, for example, is in the categories ETH Zurich alumni, ETH Zurich faculty, and 20 more. The categories convey that Einstein was both an alumnus and a faculty member of ETH Zurich, while the infobox only shows that Einstein was related to ETH Zurich.
We downloaded the Chinese Wikipedia dump file of September 2016, which comprised 1,243,319 articles at that time. There are 244,430,247 words in the Chinese Wikipedia corpus, and each title has 2.16 categories on average.
III. METHOD
A. Context-based Embedding
Data preprocessing, such as word segmentation, is necessary for Chinese data. We segment the Wikipedia text with the CKIP Chinese Word Segmentation System [17], [18]. We also collect Wikipedia titles as our lexicon to identify titles during word segmentation. On the other hand, to obtain general descriptions of categories, we do not apply the lexicon when segmenting Wikipedia categories. After preprocessing, we obtain the Chinese Wikipedia corpus and apply the Skip-gram model [4] to obtain 300-dimensional context-based embeddings.
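For readers who want to reproduce this step, the following is a minimal sketch of training the context-based embeddings with the gensim implementation of Skip-gram; the corpus file name and all hyperparameters except the 300 dimensions are illustrative assumptions, not settings reported in the paper.

```python
# Minimal sketch (not the authors' exact script): train 300-dimensional
# Skip-gram embeddings on the segmented Chinese Wikipedia corpus with gensim.
# The file name and every hyperparameter except the dimensionality are
# illustrative assumptions.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# One segmented sentence per line, tokens separated by whitespace.
corpus = LineSentence("zhwiki_segmented.txt")

model = Word2Vec(
    sentences=corpus,
    vector_size=300,  # 300-dimensional context-based embeddings (gensim >= 4.0)
    sg=1,             # sg=1 selects the Skip-gram architecture
    window=5,
    min_count=5,
    workers=4,
)
model.wv.save("context_embeddings.kv")

# e_context for a Wikipedia title that appears in the vocabulary:
# e_context = model.wv["<some Wikipedia title>"]
```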
B. Category Embeddings
We propose several approaches for acquiring embeddings from Wikipedia categories, which can partially represent the corresponding title. We call these embeddings Wikipedia category embeddings, or category embeddings for short.
Given a title $t$ and its corresponding categories $C_1$ to $C_n$, each category $C_i$ has been word-segmented into $K$ words $W^i_1$ to $W^i_K$. We use $c_i$ and $w^i_j$ to denote the embeddings of category $C_i$ and word $W^i_j$, respectively.
1) Average of Category Words: We acquire the category embedding $e_{category}$ by averaging over every category of a single title. Considering the completeness of the category information, we obtain each category embedding by averaging all words in that category. The computation of $e_{category}$ is as follows:
$$e_{category} = \frac{1}{n}\sum_{i=1}^{n} c_i, \quad (1)$$
where
$$c_i = \frac{1}{K}\sum_{j=1}^{K} w^i_j. \quad (2)$$
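As a concrete illustration of Eqs. (1) and (2), here is a small NumPy sketch; the lookup table `word_vec` and the skipping of out-of-vocabulary words are assumptions of the sketch rather than details specified in the paper.

```python
import numpy as np

def category_embedding(category_words, word_vec, dim=300):
    """Eq. (2): c_i is the average of the word embeddings in one category.

    category_words: the word-segmented category, e.g. a list of tokens.
    word_vec: dict-like mapping from word to its 300-dim vector (assumed).
    Out-of-vocabulary words are skipped, which is an assumption of this sketch.
    """
    vecs = [word_vec[w] for w in category_words if w in word_vec]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def e_category_avg_words(categories, word_vec, dim=300):
    """Eq. (1): e_category is the average of the n category embeddings c_i."""
    cs = [category_embedding(c, word_vec, dim) for c in categories]
    return np.mean(cs, axis=0) if cs else np.zeros(dim)
```

The headword variant described in the next subsection simply replaces the inner average with the embedding of the last word in each category.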
2) Average of Category Headwords: In Chinese linguistic structure, the headword of each category $C_i$ is generally the last word $W^i_K$. We presume that words other than the headword may introduce noisy information; thus, in this method, we acquire $e_{category}$ by averaging only the headword of every category of a single title and do not consider any other words:
$$e_{category} = \frac{1}{n}\sum_{i=1}^{n} c_i, \quad (3)$$
where
$$c_i = w^i_K. \quad (4)$$
3) Weighted Sum of Categories (WSC): We assume that each category of a title has a different degree of representativeness, which depends on how often its headword occurs in the context of the title. Therefore, we assert that categories should be treated distinctively according to their degrees of representativeness. In this method, we acquire $e_{category}$ by summing the categories of a single title weighted by the occurrence of each category headword in the context. Considering the completeness of the category information, we obtain each category embedding by averaging all word embeddings in the category but apply a different weight $d$ to the headword:
$$e_{category} = \sum_{i=1}^{n} a_i c_i, \quad (5)$$
where $a_i$ is the normalized frequency of the category headword in the context, and
$$c_i = \frac{1}{K+d-1}\left(\sum_{j=1}^{K-1} w^i_j + d\, w^i_K\right), \quad (6)$$
where $d$ is the weight applied to the headword.
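A sketch of Eqs. (5) and (6) under stated assumptions: `headword_counts` is assumed to hold how often each category headword appears in the title's Wikipedia text, and the uniform fallback when no headword occurs in the context is our own choice, not specified in the paper.

```python
import numpy as np

def category_embedding_wsc(category_words, word_vec, d=1.0, dim=300):
    """Eq. (6): average the words of one category, giving the headword
    (the last word) weight d and normalizing by K + d - 1.
    Out-of-vocabulary words are skipped (an assumption of this sketch)."""
    vecs = [word_vec[w] for w in category_words if w in word_vec]
    if not vecs:
        return np.zeros(dim)
    k = len(vecs)
    return (np.sum(vecs[:-1], axis=0) + d * vecs[-1]) / (k - 1 + d)

def e_category_wsc(categories, headword_counts, word_vec, d=1.0, dim=300):
    """Eq. (5): sum the category embeddings weighted by the normalized
    frequency a_i of each category headword in the title's context."""
    counts = np.array([headword_counts.get(c[-1], 0) for c in categories],
                      dtype=float)
    if counts.sum() > 0:
        weights = counts / counts.sum()
    else:
        # No headword appears in the context: fall back to uniform weights
        # (our assumption, not stated in the paper).
        weights = np.full(len(categories), 1.0 / len(categories))
    cs = [category_embedding_wsc(c, word_vec, d, dim) for c in categories]
    return np.sum([a * c for a, c in zip(weights, cs)], axis=0)
```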
4) Top N Categories: Considering category occurrence in the context, we argue that categories with higher frequency should receive more attention. Therefore, we compute the category embedding $e_{category}$ by summing only the top 1, top 2, or top 3 categories, respectively:
$$e_{category} = \sum_{i=1}^{N} c_i, \quad (7)$$
where $N = 1, 2, 3$ and
$$c_i = \frac{1}{K}\sum_{j=1}^{K} w^i_j. \quad (8)$$
C. Wikipedia Title Embedding
The Wikipedia title embedding, or title embedding for short, is the improved word embedding obtained by combining the context embedding and the category embedding. We propose two approaches to acquire the title embedding: linear combination and concatenation.
1) Linear Combination: We acquire the title embedding $e_{title}$ by linearly combining the context embedding and the category embedding:
$$e_{title} = \alpha\, e_{context} + (1-\alpha)\, e_{category}, \quad (9)$$
where $0 < \alpha < 1$, $e_{context}$ is obtained as in III-A, and $e_{category}$ is obtained as in III-B.
2) Concatenation: This method obtains the category embedding of the title via the methods in III-B and then concatenates it with the context-based embedding.
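Both combination strategies reduce to a few lines; a minimal sketch, assuming `e_context` and `e_category` are NumPy arrays of the same dimensionality:

```python
import numpy as np

def title_embedding_linear(e_context, e_category, alpha=0.9):
    """Eq. (9): e_title = alpha * e_context + (1 - alpha) * e_category,
    with 0 < alpha < 1 (alpha = 0.9 is the best value reported in Section IV)."""
    return alpha * e_context + (1.0 - alpha) * e_category

def title_embedding_concat(e_context, e_category):
    """Concatenation: stack the 300-dim context and category embeddings
    into a single 600-dimensional title embedding."""
    return np.concatenate([e_context, e_category])
```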
IV. EXPERIMENTS
A. Evaluation Set Translation
There are hardly any evaluation sets with a large enough amount of data for Chinese word embeddings. Therefore, we translate three evaluation sets from English to Chinese and check them manually: two for the word similarity task and one for the analogical reasoning task. We obtain 3,000 word pairs by translating the MEN-3k [19] dataset, 287 word pairs by translating the MTurk-287 [20] dataset, and 11,126 analogical questions by translating the Google analogy dataset [3]. After finishing our research, we will release the Chinese datasets.
1) Difficulties: We encountered some obstacles while translating the datasets. In the word similarity datasets, some English word pairs have slight differences in meaning that are barely noticeable in Chinese. For instance, in word pairs such as (stairs, staircase), both words mean stairs in most dictionaries. We looked these words up in various dictionary resources to select distinct but appropriate Chinese meanings. In the analogical reasoning question set, some English words are difficult to map to appropriate Chinese words because of differences in word usage. For example, in questions of the syntactic relationship type, such as the adjective-adverb relation (e.g., apparent, apparently), an appropriate Chinese mapping is hard to find. We therefore discard questions of the syntactic relationship type.
B. Word Similarity Tasks
The word similarity datasets contain relatedness scores for word pairs; the cosine similarity of the two word embeddings should correlate highly with these scores. We have two datasets in their Chinese versions: MEN-3k and MTurk-287. To obtain development and testing sets, we split both datasets in half, i.e., 1,500 word pairs each in the development and testing sets for MEN-3k, and 143 word pairs in the development set and 144 word pairs in the testing set for MTurk-287.
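The evaluation itself can be sketched as follows; the treatment of out-of-vocabulary pairs (simply skipping them) is an assumption of the sketch, not a protocol stated in the paper.

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def word_similarity_spearman(pairs, embed):
    """pairs: iterable of (word_a, word_b, human_score).
    embed: dict-like mapping from word to its embedding.
    Returns the Spearman correlation between model cosine similarities
    and human relatedness scores; out-of-vocabulary pairs are skipped
    (an assumption of this sketch)."""
    model_scores, human_scores = [], []
    for a, b, score in pairs:
        if a in embed and b in embed:
            model_scores.append(cosine(embed[a], embed[b]))
            human_scores.append(score)
    rho, _pvalue = spearmanr(model_scores, human_scores)
    return rho
```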
Table I shows the results of Linear Combination from section III-C1. We tune the weight $\alpha$ on the development set and then apply the best value, $\alpha = 0.9$, to the testing set. Compared to the baseline, our proposed methods achieve significant improvement.
Method            | MEN-3k dev | MEN-3k test | MTurk-287 dev | MTurk-287 test
------------------|------------|-------------|---------------|---------------
Skip-gram         | 67.10      | 64.60       | 50.40         | 59.40
Avg(words)        | 67.80      | 65.20       | 50.20         | 59.60
Avg(headwords)    | 67.70      | 65.20       | 50.10         | 60.20
WSC (d=1)         | 67.50      | 65.10       | 50.10         | 59.40
WSC (d=2)         | 67.50      | 65.10       | 50.10         | 59.50
WSC (headwords)   | 67.50      | 65.10       | 49.90         | 59.50
Top 3 categories  | 67.70      | 65.20       | 50.00         | 59.70
Top 2 categories  | 67.70      | 65.20       | 50.00         | 59.50
Top 1 category    | 67.80      | 65.20       | 50.20         | 59.60

Table I: Spearman correlation on the word similarity task. All embeddings are 300 dimensions.
Table II shows the results of Concatenation from section III-C2. We in fact realize concatenation in two ways. In the first procedure, Integrated Match (Int. Match), we concatenate the category embedding right after the context embedding to form a 600-dimensional embedding:
$$e_{title} = e_{context} \,\|\, e_{category}, \quad (10)$$
where $e_{category}$ is obtained from the methods in III-B. In the other procedure, Individual Match (Ind. Match), for each word pair in the word similarity dataset $Q$, we first obtain the context embedding of each word and its corresponding category embedding via the methods in III-B. We then have four combinations of embedding pairs and select the pair with the highest cosine similarity score. The procedure is illustrated in Algorithm 1.
Algorithm 1 Individual Match
1: procedure IndividualMatch(Q)
2:   for each word pair (A, B) in Q do
3:     A_list = {e_context, e_category} of A
4:     B_list = {e_context, e_category} of B
5:     (â, b̂) = argmax_{a, b} CosSim(a, b),
6:       where a ∈ A_list and b ∈ B_list
7:     Sim(A, B) = CosSim(â, b̂)
8:   end for
9: end procedure
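A runnable rendering of Algorithm 1, assuming `context_emb` and `category_emb` are dictionaries from words to NumPy vectors; skipping out-of-vocabulary words is an assumption of this sketch.

```python
import numpy as np
from itertools import product

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def individual_match(word_pairs, context_emb, category_emb):
    """Algorithm 1: for each word pair (A, B), build the candidate lists
    {e_context, e_category} for A and for B, and keep the cosine similarity
    of the best-matching candidate pair."""
    sims = {}
    for A, B in word_pairs:
        A_list = [e for e in (context_emb.get(A), category_emb.get(A)) if e is not None]
        B_list = [e for e in (context_emb.get(B), category_emb.get(B)) if e is not None]
        if not A_list or not B_list:
            continue  # out-of-vocabulary pair; skipping it is our assumption
        sims[(A, B)] = max(cosine(a, b) for a, b in product(A_list, B_list))
    return sims
```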
The result of Integrated Match, compared to its baseline of a 600-dimensional context embedding obtained from the Skip-gram model [4], does not meet our initial expectation. The result of Individual Match, compared to its baseline of a 300-dimensional context embedding obtained from the Skip-gram model, validates our presumption. For the categories, we obtain the category embeddings via two methods applied to the top one category (using $N = 1$ in III-B4): averaging the category words and taking the category headword, following the methods in III-B1 and III-B2 respectively.
Method                    | MEN-3k dev | MEN-3k test | MTurk-287 dev | MTurk-287 test
--------------------------|------------|-------------|---------------|---------------
SG-300 (baseline)         | 67.10      | 64.60       | 50.40         | 59.40
SG-600 (baseline)         | 67.30      | 65.40       | 50.10         | 56.50
Int. Match                | 65.40      | 61.60       | 49.30         | 59.65
Ind. Match, Avg(w of c)   | 73.20      | 77.90       | 82.50         | 64.30
Ind. Match, hword of c    | 75.40      | 81.40       | 83.60         | 68.90

Table II: Spearman correlation on the word similarity task. The embeddings in SG-600 and Int. Match are 600 dimensions; all other embeddings are 300 dimensions. w denotes the words in the top one category c.
Compared to the baseline, our proposed method Ind. Match achieves significant improvement, though Int. Match does not beat its baseline. In Ind. Match, each word in a word pair has several independent embedding candidates to choose from, i.e., each word can use either its context embedding or its category embedding; it can therefore choose the more representative embedding and achieve better performance.
C. Analogical Semantics Tasks
The analogical reasoning dataset is composed of analogous word pairs, i.e., pairs of word tuples that follow a common relation. We use the translated Google dataset and split it in half into development and testing sets. Each set contains 5,563 questions.
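The paper does not spell out how an analogy question is answered from the embeddings; a common choice, used by Mikolov et al. [3] and sketched below under that assumption, is the vector-offset (3CosAdd) method: for a question a : b :: c : ?, predict the word whose embedding is closest to b - a + c, excluding a, b, and c.

```python
import numpy as np

def answer_analogy(a, b, c, vocab, vectors):
    """3CosAdd sketch: return the word d maximizing cos(v_b - v_a + v_c, v_d),
    excluding a, b and c. vocab is a list of words; vectors is a matrix of
    L2-normalized rows aligned with vocab (all assumptions of this sketch)."""
    idx = {w: i for i, w in enumerate(vocab)}
    target = vectors[idx[b]] - vectors[idx[a]] + vectors[idx[c]]
    target /= np.linalg.norm(target)
    scores = vectors @ target          # cosine scores against every word
    for w in (a, b, c):
        scores[idx[w]] = -np.inf       # exclude the question words
    return vocab[int(np.argmax(scores))]

def analogy_accuracy(questions, vocab, vectors):
    """questions: list of (a, b, c, d_gold) tuples whose words are all in vocab."""
    correct = sum(answer_analogy(a, b, c, vocab, vectors) == d_gold
                  for a, b, c, d_gold in questions)
    return correct / len(questions)
```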
Table III shows the results of Linear Combination from III-C1. We tune the weight $\alpha$ on the development set and then apply the best value, $\alpha = 0.9$, to the testing set. Compared to the baseline, our proposed methods achieve significant improvement.
Method            | Google dev | Google test
------------------|------------|------------
Skip-gram         | 53.12      | 34.71
Avg(words)        | 55.71      | 36.33
Avg(headwords)    | 53.66      | 35.65
WSC (d=1)         | 54.43      | 35.12
WSC (d=2)         | 54.14      | 35.00
WSC (headwords)   | 53.17      | 34.96
Top 3 categories  | 54.57      | 35.30
Top 2 categories  | 54.84      | 35.38
Top 1 category    | 55.51      | 36.10

Table III: Accuracy on the analogical reasoning task. All embeddings are 300 dimensions.
V. DISCUSSION
According to the experimental results, linear combination using the category embedding obtained by averaging all category words performs best. This contradicts one of our initial assumptions: that categories have different degrees of representativeness which depend on their occurrence in the context of the title. That is, we assumed that methods stressing headwords should perform better; however, it turns out that our assumption is not rigorous enough. Perhaps the categories have no distinctive degrees of representativeness. Or, if distinguishable degrees of representativeness exist, they may be related to other factors, such as the position where the category appears in the context. It is plausible that the category referred to in the first sentence of the context deserves the most attention.
Although the effect of the representativeness of categories is unclear in linear combination, the improvement is obvious in Individual Match. We apply Ind. Match only to the word similarity task, due to the coverage of Wikipedia titles among the question words and the computational complexity. It is likely that the degree of representativeness matters only for certain tasks and methods. In either circumstance, it is undeniable that categories can provide valuable information for improving context-based embeddings.
VI. CONCLUSION
In this paper, we argue that a Wikipedia title's categories can help define or complement the meaning of the title beyond the title's Wikipedia text, so the categories should be utilized to improve the title's embedding. We propose two directions of utilizing Wikipedia categories to improve context-based embeddings trained from Wikipedia text. Experiments on both a word similarity task and an analogical reasoning task show that our approaches significantly outperform conventional context-based approaches.
In the future, it is worth investigating whether the categories have distinct degrees of representativeness and, if so, what factors affect these degrees.
REFERENCES
[1] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, "Natural language processing (almost) from scratch," Journal of Machine Learning Research, vol. 12, pp. 2493–2537, 2011.
[2] P. Dhillon, J. Rodu, D. Foster, and L. Ungar, "Two step CCA: A new spectral method for estimating vector models of words," arXiv preprint arXiv:1206.6403, 2012.
[3] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.
[4] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Advances in Neural Information Processing Systems, 2013, pp. 3111–3119.
[5] R. Lebret and R. Collobert, "Rehabilitation of count-based models for word vector representations," arXiv preprint arXiv:1412.4930, 2014.
[6] J. Pennington, R. Socher, and C. D. Manning, "GloVe: Global vectors for word representation," in EMNLP, 2014, pp. 1532–1543.
[7] A. Bordes, J. Weston, R. Collobert, and Y. Bengio, "Learning structured embeddings of knowledge bases," in AAAI, 2011.
[8] Z. Wang, J. Zhang, J. Feng, and Z. Chen, "Knowledge graph embedding by translating on hyperplanes," in AAAI, 2014, pp. 1112–1119.
[9] Y. Lin, Z. Liu, M. Sun, Y. Liu, and X. Zhu, "Learning entity and relation embeddings for knowledge graph completion," in AAAI, 2015, pp. 2181–2187.
[10] G. Ji, S. He, L. Xu, K. Liu, and J. Zhao, "Knowledge graph embedding via dynamic mapping matrix," in ACL (1), 2015, pp. 687–696.
[11] J. Feng, M. Huang, Y. Yang, and X. Zhu, "GAKE: Graph aware knowledge embedding," in COLING, 2016, pp. 641–651.
[12] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor, "Freebase: A collaboratively created graph database for structuring human knowledge," in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. ACM, 2008, pp. 1247–1250.
[13] G. A. Miller, "WordNet: A lexical database for English," Communications of the ACM, vol. 38, no. 11, pp. 39–41, 1995.
[14] M. Yu and M. Dredze, "Improving lexical embeddings with semantic knowledge," in ACL (2), 2014, pp. 545–550.
[15] D. Bollegala, M. Alsuhaibani, T. Maehara, and K.-i. Kawarabayashi, "Joint word representation learning using a corpus and a semantic lexicon," in AAAI, 2016, pp. 2690–2696.
[16] H.-Y. Wang and W.-Y. Ma, "Integrating semantic knowledge into lexical embeddings based on information content measurement," in EACL, 2017, p. 509.
[17] W.-Y. Ma and K.-J. Chen, "A bottom-up merging algorithm for Chinese unknown word extraction," in Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, Volume 17. ACL, 2003, pp. 31–38.
[18] W.-Y. Ma and K.-J. Chen, "Introduction to CKIP Chinese word segmentation system for the first international Chinese word segmentation bakeoff," in Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, Volume 17. Association for Computational Linguistics, 2003, pp. 168–171.
[19] E. Bruni, G. Boleda, M. Baroni, and N.-K. Tran, "Distributional semantics in technicolor," in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 2012, pp. 136–145.
[20] K. Radinsky, E. Agichtein, E. Gabrilovich, and S. Markovitch, "A word at a time: Computing word relatedness using temporal semantic analysis," in Proceedings of the 20th International Conference on World Wide Web. ACM, 2011, pp. 337–346.
... For example, researchers at York University released a dataset 1 for Chinese word similarity; nevertheless, there are only 50 word pairs in the trial data and around 500 word pairs in the full dataset. Chen and Ma (2017) also created a small amount of datasets for Chinese word embedding, but the quantity and types are still far from English counterpart. Therefore, we generate several new evaluation sets for Chinese word embedding on both word similarity task and analogical task through translating some existing popular evaluation sets from English to Chinese. ...
... Most Wikipedia categories are long noun phrases other than noun words, so they are sometimes able to provide more complete information than info-boxes. Chen and Ma (2017) attempted to obtain title embedding of Wikipedia based on Wikipedia's content and categories; however, just a small amount of datasets for evaluation are used to evaluate the effectiveness and are not able to provide comprehensive evaluation results and error analysis. Thus, in this paper, we introduce the procedure of creating the new benchmarks and describe the difficulties we face along with the solutions we adopted. ...
... There are 244,430,247 words in the Chinese Wikipedia corpus and each title has 2.16 categories in average. Chen and Ma (2017) obtained Wikipedia title embedding by linear combining context embedding and categories embedding. They generated context embedding by skipgram model and proposed several methods to generate categories embedding. ...
Article
Full-text available
Methods for learning word representations using large text corpora have received much attention lately due to their impressive performance in numerous natural language processing (NLP) tasks such as, semantic similarity measurement, and word analogy detection. Despite their success, these data-driven word representation learning methods do not consider the rich semantic relational structure between words in a co-occurring context. On the other hand, already much manual effort has gone into the construction of semantic lexicons such as the WordNet that represent the meanings of words by defining the various relationships that exist among the words in a language. We consider the question, can we improve the word representations learnt using a corpora by integrating the knowledge from semantic lexicons?. For this purpose, we propose a joint word representation learning method that simultaneously predicts the co-occurrences of two words in a sentence subject to the relational constrains given by the semantic lexicon. We use relations that exist between words in the lexicon to regularize the word representations learnt from the corpus. Our proposed method statistically significantly outperforms previously proposed methods for incorporating semantic lexicons into word representations on several benchmark datasets for semantic similarity and word analogy.
Conference Paper
Full-text available
Recent works on word representations mostly rely on predictive models. Distributed word representations (aka word embeddings) are trained to optimally predict the contexts in which the corresponding words tend to appear. Such models have succeeded in capturing word similarties as well as semantic and syntactic regularities. Instead, we aim at reviving interest in a model based on counts. We present a systematic study of the use of the Hellinger distance to extract semantic representations from the word co-occurence statistics of large text corpora. We show that this distance gives good performance on word similarity and analogy tasks, with a proper type and size of context, and a dimensionality reduction based on a stochastic low-rank approximation. Besides being both simple and intuitive, this method also provides an encoding function which can be used to infer unseen words or phrases. This becomes a clear advantage compared to predictive models which must train these new words.
Article
Full-text available
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
Article
Full-text available
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
Article
Knowledge graph completion aims to perform link prediction between entities. In this paper, we consider the approach of knowledge graph embeddings. Recently, models such as TransE and TransH build entity and relation embeddings by regarding a relation as translation from head entity to tail entity. We note that these models simply put both entities and relations within the same semantic space. In fact, an entity may have multiple aspects and various relations may focus on different aspects of entities, which makes a common space insufficient for modeling. In this paper, we propose TransR to build entity and relation embeddings in separate entity space and relation spaces. Afterwards, we learn embeddings by first projecting entities from entity space to corresponding relation space and then building translations between projected entities. In experiments, we evaluate our models on three tasks including link prediction, triple classification and relational fact extraction. Experimental results show significant and consistent improvements compared to state-of-the-art baselines including TransE and TransH.
Conference Paper
Word embeddings learned on unlabeled data are a popular tool in semantics, but may not capture the desired semantics. We propose a new learning objective that incorporates both a neural language model objective (Mikolov et al., 2013) and prior knowledge from semantic resources to learn improved lexical semantic embeddings. We demonstrate that our embeddings improve over those learned solely on raw text in three settings: language modeling, measuring semantic similarity, and predicting human judgements.