-
ABSTRACT: Adopting the regression SVM framework, this paper proposes a linguistically motivated feature engineering strategy to develop
an MT evaluation metric with a better correlation with human assessments. In contrast to current practices of “greedy” combination
of all available features, six features are suggested according to the human intuition for translation quality. Then the contribution
of linguistic features is examined and analyzed via a hill-climbing strategy. Experiments indicate that, compared to either
the SVM-ranking model or the previous attempts on exhaustive linguistic features, the regression SVM model with six linguistic
information based features generalizes across different datasets better, and augmenting these linguistic features with proper
non-linguistic metrics can achieve additional improvements.
Keywords: machine translation; automatic evaluation; regression SVM (support vector machine); linguistic feature
Journal of Computer Science and Technology 04/2012; 26(1):57-67. · 0.56 Impact Factor
-
ABSTRACT: Stitching multiple animation or video clips together is a widely used technique in multimedia processing, with applications ranging from digital video editing to scene generation in text-to-scene conversion systems. Many general-purpose and specialized software tools have been developed for this task, but few of them stitch VRML clips together. In this paper, we propose an approach to stitching VRML clips that is based on the structure of VRML; it completes the task correctly and takes very little time. It has been used in a text-to-scene conversion system developed by HIT MI&TLab.
Internet Computing for Science and Engineering (ICICSE), 2009 Fourth International Conference on; 01/2010
-
ABSTRACT: Converting spatial concepts is one of the essential tasks in a text-to-scene conversion system. This paper presents a hybrid method for instantiating spatial concepts, based on clustering, cognition, and rules. The spatial ontology instantiated by this method is tested in an experiment on displaying objects from natural language, with good results.
Internet Computing for Science and Engineering (ICICSE), 2009 Fourth International Conference on; 01/2010
-
ABSTRACT: Selecting the wavelet type, decomposition level, and fusing rule is a key problem when the wavelet transform is applied to image fusion. In this paper, 2916 different fusing methods (54×5×9, including 54 wavelet types, 5 decomposing levels, and 9 fusing rules) are analyzed and compared in an experiment on fusing multi-focus images. Fusion performance is evaluated by calculating the comparability degree of the fused images. The experiment shows that the similarities between the results and the ideal pictures are all over 0.999, indicating good performance.
Journal of Computational Information. 01/2010; 6:1-131.
-
ABSTRACT: Different speech detection sensors have been developed over the years but they are limited by the loss of high frequency speech energy, and have restricted non-contact detection due to the lack of penetrability. This paper proposes a novel millimeter microwave radar sensor to detect speech signals. The utilization of a high operating frequency and a superheterodyne receiver contributes to the high sensitivity of the radar sensor for small sound vibrations. In addition, the penetrability of microwaves allows the novel sensor to detect speech signals through nonmetal barriers. Results show that the novel sensor can detect high frequency speech energies and that the speech quality is comparable to traditional microphone speech. Moreover, the novel sensor can detect speech signals through a nonmetal material of a certain thickness between the sensor and the subject. Thus, the novel speech sensor expands traditional speech detection techniques and provides an exciting alternative for broader application prospects.
Sensors 01/2010; 10(5):4622-33. · 1.74 Impact Factor
-
International Conference on Asian Language Processing, IALP 2010, Harbin, Heilongjiang, China, 28-30 December 2010; 01/2010
-
ICEIS 2009 - Proceedings of the 11th International Conference on Enterprise Information Systems, Volume AIDSS, Milan, Italy, May 6-10, 2009; 01/2009
-
Natural Language Processing and Cognitive Science, Proceedings of the 6th International Workshop on Natural Language Processing and Cognitive Science, NLPCS 2009, In conjunction with ICEIS 2009, Milan, Italy, May 2009; 01/2009
-
Fifth International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2008, 18-20 October 2008, Jinan, Shandong, China, Proceedings, Volume 2; 01/2008
-
ABSTRACT: Named entity recognition is a fundamental task in biomedical data mining. In this letter, a named entity recognition system
based on CRFs (Conditional Random Fields) for biomedical texts is presented. The system makes extensive use of a diverse set
of features, including local features, full text features and external resource features. All features incorporated in this
system are described in detail, and the impacts of different feature sets on the performance of the system are evaluated.
To improve the performance of the system, post-processing modules are used to handle abbreviations,
cascaded named entities, and boundary error identification. Evaluation of the system shows that feature selection has
an important impact on system performance, and that the post-processing makes an important contribution
to achieving better results.
Journal of Electronics (China) 10/2007; 24(6):838-844.
-
ABSTRACT: This paper proposes a novel Chinese syntactic parsing model based on semantic classes, a variant of the standard lexicalized statistical model. It exploits the syntactic and semantic similarity between Chinese words to produce a better-informed estimate of the probability of grammar rules. A simple but effective unsupervised method is designed to determine the proper semantic class of given words, and the semantic classes are used to improve the performance of the parsing model. We evaluate our methods on the widely used Penn Chinese Treebank. Experimental results show that the model significantly outperforms a well-known lexicalized model at appropriate semantic class levels.
Machine Learning and Cybernetics, 2007 International Conference on; 09/2007
-
Knowledge Science, Engineering and Management, Second International Conference, KSEM 2007, Melbourne, Australia, November 28-30, 2007, Proceedings; 01/2007
-
ABSTRACT: To improve the effectiveness of cross-lingual information retrieval (CLIR), a method based on domain ontology knowledge is presented and applied to C-E CLIR. In this study, the domain ontology knowledge is acquired from both source language user queries and target documents, and is used to select target translations and re-rank the initially retrieved document set. The C-E CLIR dataset from the NTCIR-4 Workshop is used to evaluate the effectiveness of this method. Unlike previous work, we make use of source language user queries throughout C-E CLIR; compared with previous work, this method improved the precision.
Computational Intelligence and Security, 2006 International Conference on; 12/2006
-
ABSTRACT: For information retrieval, users hope to acquire more relevant information from the top N ranked documents. In this paper, a hybrid Chinese language model is presented, defined as a combination of ontology and statistical methods, to improve the precision of the top N ranked documents by reordering the initially retrieved documents. An experiment on the NTCIR-3 formal Chinese test collection shows that the proposed method improved precision at the top N ranked documents level.
Information Acquisition, 2006 IEEE International Conference on; 09/2006
-
ABSTRACT: Natural language is an easy and effective medium for spatial description. Thus, we foresee the emergence of methods to extract spatial relationships from text as constraints on the layout of objects in a 3D scene. This would allow users to quickly display objects in 3D scenes without having to touch a desktop window-oriented interface, acquire artistic skills, or even understand natural language. The aim of this paper is to propose a method for automatic extraction of spatial relationships from text. A spatial relationship is first divided into two parts: the spatial expression and the corresponding trajectory. Extraction is then done in three stages: extracting spatial expressions, acquiring trajectories, and resolving ellipsis on landmarks. The first stage is based on a linear classifier. The other stages are based on limited knowledge. Our hybrid method relies on HowNet, a semantic knowledge base. Experimental results have demonstrated the effectiveness of our scheme.
Information Acquisition, 2006 IEEE International Conference on; 09/2006
-
ABSTRACT: Text classification is becoming one of the key techniques for organizing and handling large amounts of text data. In this paper, a combination of ontology and statistical methods is presented to improve the precision of text classification. First, different kinds of linguistic ontology knowledge are acquired by learning from a training corpus to determine the text classifiers. For a given document, a semantic evaluation value is obtained from each kind of linguistic ontology knowledge, and the category is decided by the highest evaluation value. Compared with Bayes, k-nearest neighbor, and support vector machine classifiers, the proposed approach performs better.
Machine Learning and Cybernetics, 2006 International Conference on; 09/2006
-
ABSTRACT: For information retrieval, users hope to acquire more relevant information from the top-ranked documents. In this paper, a combination of ontology and statistical methods is presented to retrieve an initial document set and to improve the precision of the top N ranked documents by re-ranking that set. An experiment on the NTCIR-3 Chinese CLIR dataset shows that the proposed method improved the precision of information retrieval.
Machine Learning and Cybernetics, 2006 International Conference on; 09/2006
-
ABSTRACT: The paper presents some of the main progress and achievements in Chinese information processing. It focuses on six aspects, i.e.,
Chinese syntactic analysis, Chinese semantic analysis, machine translation, information retrieval, information extraction,
and speech recognition and synthesis. The important techniques and likely key problems of each branch in the near
future are discussed as well.
Journal of Computer Science and Technology 08/2006; 21(5):838-846. · 0.56 Impact Factor
-
Knowledge Science, Engineering and Management, First International Conference, KSEM 2006, Guilin, China, August 5-8, 2006, Proceedings; 01/2006
-
ABSTRACT: Automatic acquisition of translation templates is important for an MT system to improve its translation quality and its ability to adapt to new domains. In this paper, translation equivalences are obtained from the translation corresponding trees of bilingual sentence pairs. An error-driven learning method is employed to acquire templates from the extracted equivalences. At the same time, an optimization method based on automatic translation evaluation is used to clean these templates. They are then applied to a transfer-based MT system, and the 2003 "863" dialog corpus is used for an open test. Experimental results show that the newly acquired templates outperform the original ones. Combining the newly acquired templates with the original ones improves the 5-gram NIST score on the open test corpus by 8.11%.
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on; 09/2005