Using variable length ngrams for retrieving technical abstracts in Japanese (poster session).
DOI: 10.1145/355214.355250
Conference: Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages, Hong Kong, China, September 30 - October 1, 2000
Previous studies have reported that bigrams work well for many Asian languages, including Chinese, Korean, and Japanese. Most of these studies have focused on newspaper text. We report an experiment with a very different genre, technical abstracts, and find that performance can be improved by combining short and long ngrams. Working with ngrams of all lengths is a sound approach because it provides more information than bigrams alone.
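To make the idea of combining short and long ngrams concrete, the following is a minimal sketch of variable-length character ngram extraction. It is not the authors' system: the function names `char_ngrams` and `overlap_score`, the length bounds `min_n` and `max_n`, and the simple count-overlap score are all assumptions introduced for illustration. The abstract does not specify how ngrams are weighted or matched.

```python
from collections import Counter


def char_ngrams(text, min_n=1, max_n=3):
    """Return every character ngram of `text` with min_n <= n <= max_n.

    Hypothetical helper: ngrams are listed shortest first, and within each
    length in left-to-right order.
    """
    grams = []
    for n in range(min_n, max_n + 1):
        # A string of length L has L - n + 1 ngrams of length n (none if n > L).
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


def overlap_score(query, document, min_n=1, max_n=3):
    """Count the ngrams shared by query and document (with multiplicity).

    An illustrative stand-in for a retrieval score, not the weighting
    used in the paper.
    """
    q = Counter(char_ngrams(query, min_n, max_n))
    d = Counter(char_ngrams(document, min_n, max_n))
    # Counter returns 0 for missing keys, so unmatched ngrams add nothing.
    return sum(min(count, d[gram]) for gram, count in q.items())


if __name__ == "__main__":
    query = "情報検索"
    document = "情報検索システム"
    # Bigrams only, versus bigrams combined with longer ngrams.
    print(overlap_score(query, document, min_n=2, max_n=2))
    print(overlap_score(query, document, min_n=2, max_n=4))
```

With bigrams alone, the query contributes three matching units; allowing lengths 2 through 4 adds the longer matches as well, which is the extra information the abstract refers to.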