Tomoko Fukuda

Fukuoka Jo Gakuin College, Hukuoka, Fukuoka, Japan

Publications (18) · 1.77 Total impact

  • Source
    ABSTRACT: Waka is a form of traditional Japanese poetry with a 1300-year history. In this paper, we attempt to discover characteristics common to a collection of waka poems. As a schema for characteristics, we use regular patterns where the constant parts are limited to sequences of auxiliary verbs and postpositional particles. We call such patterns fushi. The problem is to automate the process of finding significant fushi patterns that characterize the poems. Solving this problem requires a reliable significance measure for the patterns. Brāzma et al. (1996) proposed such a measure according to the MDL principle. Using this method, we report successful results in finding patterns from five anthologies. Some of the results are quite stimulating, and we hope that they will lead to new discoveries.
    New Generation Computing 04/2012; 18(1):61-73. · 0.80 Impact Factor
  • Source
    ABSTRACT: Waka is a form of traditional Japanese poetry with a 1300-year history. In this paper, we attempt to discover characteristics common to a collection of Waka poems. As a formalism for characteristics, we use regular patterns where the constant parts are limited to sequences of auxiliary verbs and postpositional particles. We call such patterns fushi. The problem is to automatically find significant fushi patterns that characterize the poems. Solving this problem requires a reliable significance measure for the patterns. Brāzma et al. (1996) proposed such a measure according to the MDL principle. Using this method, we report successful results in finding patterns from five anthologies. Some of the results are quite stimulating, and we hope that they will lead to new discoveries. Based on our experience, we also propose a pattern-based text data mining system. Further research into waka poetry is now proceeding using this system.
    01/2008: pages 129-141;
  • Source
    ABSTRACT: We attempt to extract characteristic expressions from literary works. That is, given two collections of literary works, one of which is written by a particular author (positive examples) and the other by a different author (negative examples), the problem is to find expressions that appear frequently in the positive examples but which are seldom found in the negative examples. This is considered as a special case of the optimal pattern discovery from textual data, in which only the substring patterns are considered. One approach would be to create a list of text substrings sorted according to goodness, and to scrutinize the first part of the list by human efforts. Since there is no word boundary in Japanese texts, a substring is often a fragment of a word or phrase. A method to assist domain experts who are involved in this task is a key problem. In this paper, we propose partitioning the text substrings into equivalence classes under an equivalence relation on strings, originally defined by Blumer et al. (J. ACM 34(3) (1987) 578). The equivalence relation has the desirable property that all members of each equivalence class necessarily have a unique goodness value. This idea effectively reduces the inefficiency of the task of evaluating mined patterns. We also present a method for browsing possible superstrings of a focused string as well as its context. We report successful results with two pairs of anthologies of classical Japanese poems. We expect that the extracted expressions may lead to discovering overlooked aspects of individual poets.
    Theoretical Computer Science 01/2003; 292:525-546. · 0.49 Impact Factor
  • Source
    ABSTRACT: Waka is a form of traditional Japanese poetry with a 1300-year history. In this paper, we attempt to semi-automatically discover instances of poetic allusion, or more generally, to find similar poems in anthologies of Waka poems. One reasonable approach would be to arrange all possible pairs of poems in two anthologies in decreasing order of similarity values, and to scrutinize high-ranked pairs by human effort. The means of defining similarity between Waka poems plays a key role in this approach. In this paper, we generalize existing (dis)similarity measures into a uniform framework, called string resemblance systems, and using this framework, we develop new similarity measures suitable for finding similar poems. Using the measures, we report successful results in finding instances of poetic allusion between two anthologies, Kokin-Shū and Shin-Kokin-Shū. Most interestingly, we have found an instance of poetic allusion that has never before been pointed out in the long history of Waka research.
    Theoretical Computer Science 01/2003; · 0.49 Impact Factor
  • Ayumi Shinohara, Tomoko Fukuda, Ichiro Nanri
    ABSTRACT: The class of pattern languages was introduced by Angluin (1980), and a lot of studies have been undertaken on it from the theoretical viewpoint of learnabilities. However, there have been few practical studies except for the one by Shinohara (1982), in which patterns are restricted so that every variable occurs at most once. In this paper, we distinguish repetitive variables from those occurring only once within a pattern, and focus on the number of occurrences of a repetitive-variable and the length of strings it matches, in order to model the rhetorical device based on repetition of words in classical Japanese poems. Preliminary result suggests that it will lead to characterization of individual anthology, which has never been achieved, up till now.
    06/2002;
  • Masayuki Takeda, Tomoko Fukuda, Ichiro Nanri
    ABSTRACT: This paper surveys our recent studies of text mining from literary works, especially classical Japanese poems, Waka. We present methods for finding characteristic patterns in anthologies of Waka poems, as well as those for finding similar poem pairs. Our aim is to obtain good results that are of interest to Waka researchers, not just to develop efficient algorithms. We report successful results in finding patterns and similar poem pairs, some of which led to new discoveries.
    Progress in Discovery Science, Final Report of the Japanese Discovery Science Project; 01/2002
  • Source
    ABSTRACT: The class of pattern languages was introduced by Angluin (1980), and a lot of studies have been undertaken on it from the theoretical viewpoint of learnabilities. However, there have been few practical studies except for the one by Shinohara (1982), in which patterns are restricted so that every variable occurs at most once. In this paper, we distinguish repetitive variables from those occurring only once within a pattern, and focus on the number of occurrences of a repetitive-variable and the length of strings it matches, in order to model the rhetorical device based on repetition of words in classical Japanese poems. Preliminary result suggests that it will lead to characterization of individual anthology, which has never been achieved, up till now.
    Discovery Science, 4th International Conference, DS 2001, Washington, DC, USA, November 25-28, 2001, Proceedings; 01/2001
  • Source
    ABSTRACT: We attempt to extract characteristic expressions from literary works. That is, our problem is, given literary works by a particular writer as positive examples and works by another writer as negative examples, to find expressions that appear frequently in the positive examples but not in the negative examples. It is considered as a special case of the optimal pattern discovery from textual data, in which only the substring patterns are considered. One reasonable approach is to create a list of substrings arranged in descending order of their goodness, and to have a human expert examine the first part of the list. Since there is no word boundary in Japanese texts, a substring is often a fragment of a word or a phrase. How to assist the human expert is a key to success in discovery. In this paper, we propose (1) to restrict to the prime substrings in order to remove redundancy from the list, and (2) a way of browsing the neighbor of a focused string as well as its context. Using this method, we report successful results against two pairs of anthologies of classical Japanese poems. We expect that the extracted expressions will possibly lead to discovering overlooked aspects of individual poets.
    Discovery Science, Third International Conference, DS 2000, Kyoto, Japan, December 4-6, 2000, Proceedings; 01/2000
  • Source
    ABSTRACT: We attempt to extract characteristic expressions from literary works. That is, our problem is, given literary works by a particular writer as positive examples and works by another writer as negative examples, to find expressions that appear frequently in the positive examples but not in the negative examples. It is considered as a special case of the optimal pattern discovery from textual data, in which only the substring patterns are considered. One reasonable approach is to create a list of substrings arranged in descending order of their goodness, and to have a human expert examine the first part of the list. Since there is no word boundary in Japanese texts, a substring is often a fragment of a word or a phrase. How to assist the human expert is a key to success in discovery. In this paper, we propose (1) to restrict to the prime substrings in order to remove redundancy from the list, and (2) a way of browsing the neighbor of a focused string as well as its context. Using this method, we report successful results against two pairs of anthologies of classical Japanese poems. We expect that the extracted expressions will possibly lead to discovering overlooked aspects of individual poets.
    12/1999: pages 112-126;
  • Source
    ABSTRACT: Waka is a form of traditional Japanese poetry with a 1300-year history. In this paper we attempt to semi-automatically discover instances of poetic allusion, or more generally, to find similar poems in anthologies of waka poems. The key to success is how to define the similarity measure on poems. We first examine the existing similarity measures on strings, and then give a unifying framework that captures the essences of the measures. This framework makes it easy to design new measures appropriate to finding similar poems. Using the measures, we report successful results in finding poetic allusion between two anthologies, Kokinshū and Shinkokinshū. Most interestingly, we have found an instance of poetic allusion that has never been pointed out in the long history of waka research.
    Discovery Science, Second International Conference, DS '99, Tokyo, Japan, December, 1999, Proceedings; 01/1999
  • Source
    ABSTRACT: WAKA is a form of traditional Japanese poetry with a 1300-year history. In this paper, we attempt to discover characteristics common to a collection of WAKA poems. As a formalism for characteristics, we use regular patterns where the constant parts are limited to sequences of auxiliary verbs and postpositional particles. We call such patterns FUSHI. The problem is to automatically find significant FUSHI patterns that characterize the poems. Solving this problem requires a reliable significance measure for the patterns. Brāzma et al. (1996) proposed such a measure according to the MDL principle. Using this method, we report successful results in finding patterns from five anthologies. Some of the results are quite stimulating, and we hope that they will lead to new discoveries. Based on our experience, we also propose a pattern-based text data mining system. Further research into WAKA poetry is now proceeding using this system.
    Discovery Science, First International Conference, DS '98, Fukuoka, Japan, December 14-16, 1998, Proceedings; 01/1998
  • Source
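    ABSTRACT: The string analysis tool e-CSA ("efficient character string analyzer", pronounced "īkusa") is a general-purpose software tool developed from the standpoint of treating text data simply as a sequence of characters. In this article, we describe the features and usage of e-CSA, with particular attention to its use by researchers in Japanese linguistics and Japanese literature.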