Weigang Li’s research while affiliated with Harbin Institute of Technology and other places



Publications (3)


Bootstrapping for extracting relations from large corpora
  • Article · October 2008 · 15 Reads · 1 Citation

Journal of Electronics (China)

Weigang Li

A new approach to relation extraction is described in this paper. It adopts a bootstrapping model with a novel iteration strategy, which generates more precise examples of a specific relation. Compared with previous methods, the proposed method has three main advantages: first, it needs less manual intervention; second, richer and more reasonable information is introduced to represent a relation pattern; third, it reduces the risk of circular dependency in bootstrapping. A scalable evaluation methodology and metrics are developed for the task, and the method is evaluated against comparable techniques on the TianWang 100G corpus. The experimental results show that it achieves 90% precision and has excellent extensibility.


Automated generalization of phrasal paraphrases from the web

January 2005 · 38 Reads · 11 Citations

Weigang Li · [...]

Paraphrase templates have strong representation capacity and can be used to generate many paraphrase examples, which avoids creating and storing thousands of examples. This paper describes a new template representation and generalization method. Drawing on a semantic dictionary, it uses multiple semantic codes to represent a paraphrase template. It then uses an existing search engine to extend the word clusters and generalize the examples. We also design three metrics to measure our generalized templates. The experimental results show that the representation method is reasonable and that the generalized templates achieve higher precision and coverage.


Combining Sentence Length with Location Information to Align Monolingual Parallel Texts

October 2004 · 33 Reads · 1 Citation

Lecture Notes in Computer Science

Abundant Chinese paraphrasing resources on the Internet can be obtained from different Chinese translations of a single foreign masterpiece. A paraphrase corpus is a corpus of sentence pairs that convey the same information. The irregular characteristics of real monolingual parallel texts, especially the lack of strictly aligned paragraph boundaries between two translations, pose a challenge to alignment technology, and traditional alignment methods for bilingual texts have difficulty handling them. This paper describes a new method for aligning real monolingual parallel texts using the length and location information of sentence pairs. The model was motivated by the observation that the location of a sentence pair of a given length is distributed similarly across the whole text. So far, a paraphrase corpus of about fifty thousand sentence pairs has been constructed.


Citations (1)


... These differences make sentence category without a clear unified standard [1]. In this paper, the classification of complex sentences is based on Jiaoyan Jia [2], which puts the sentences into the joint complex sentence, subordinate complex sentence and multiple complex sentence in three categories. The joint sentence and compound sentence contains five kinds of small class. ...

Reference:

Research of Paraphrasing for Chinese Complex Sentences Based on Templates
Automated generalization of phrasal paraphrases from the web
  • Citing Article
  • January 2005