Article

Informative Spoken Language Summarization of the Diet Minutes

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

No full-text available


... On the other hand, our research targets general lecture speech. Moreover, Yamamoto et al. have proposed a summarization technique for the Diet minutes [2]. In this technique, parenthetical expressions are deleted, redundant expressions peculiar to spoken language are deleted or paraphrased, and honorific expressions are converted into plain expressions. ...
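The three operations named in the excerpt above can be illustrated with a minimal rule-based sketch. Everything concrete here is an assumption made for illustration, not the method of Yamamoto et al.: the function name `edit_utterance`, the filler list, the honorific-to-plain table, and the choice to treat text inside parentheses as a parenthetical expression are all hypothetical.

```python
import re

# Hypothetical rule tables for illustration only; not taken from the cited work.
FILLERS = ["えー、", "あのー、", "まあ、"]
HONORIFIC_TO_PLAIN = [
    ("でございます", "です"),
    ("いたします", "します"),
    ("おります", "います"),
]

# Assumption: a parenthetical expression is text enclosed in full-width
# or ASCII parentheses.
PARENTHETICAL = re.compile(r"（[^（）]*）|\([^()]*\)")


def edit_utterance(text: str) -> str:
    """Apply the three edits in order: delete, delete, convert."""
    # 1. Delete parenthetical expressions.
    text = PARENTHETICAL.sub("", text)
    # 2. Delete redundant spoken-language fillers.
    for filler in FILLERS:
        text = text.replace(filler, "")
    # 3. Convert honorific endings into plain ones.
    for honorific, plain in HONORIFIC_TO_PLAIN:
        text = text.replace(honorific, plain)
    return text


print(edit_utterance("えー、検討いたします（拍手）"))
```

A real system would need morphological analysis rather than plain string replacement, since these surface patterns can also occur inside unrelated words.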
Article
As typified by the World Wide Web, a great deal of information has accumulated on the Internet. However, most of the information currently distributed consists of written documents, whereas spoken documents are hardly distributed at all. If a mechanism for distributing them can be built, society will be able to share much more information. This paper proposes a technique for editing sentences in spoken documents in order to convert them into accessible, readable Internet content. By aligning the recorded video or speech data with the edited text at a fine-grained level, the result can be used as accessible multimedia content. Our technique consists of three sentence-level operations: (1) paraphrasing, (2) division, and (3) structuring. We implemented a spoken-document editing system based on these techniques. In an editing experiment on lecture speech data, our technique achieved high accuracy. These results confirm the effectiveness of our technique.
Article
Full-text available
In this study, we collected minutes from national and local assemblies published on the web, constructing a large corpus. In addition, we developed pre-trained language models adapted to the Japanese political domain using the constructed corpus of meeting records, incorporating several derivatives. Our models demonstrated superior and comparable performances to conventional models for tasks within the political and nonpolitical domains, respectively. In addition, we showed that increasing the number of training steps during domain adaptation with additional pre-training improves performance significantly. Furthermore, leveraging the corpus from the initial pre-training enhances performance in the adapted domain while maintaining performance in the non-adapted domains.
Article
In this study, we attempt to reveal the relationship between the National Diet of Japan and Japanese ministries by text analysis of minutes data. The policy-making process mainly consists of two routes: one is the parliamentary initiative route, and the other is the ministries initiative route, which is often consulted by advisory committees. These routes are not independent, but affect one another. While many studies and reports have explored these relationships, most of them are qualitative case studies, which have methodological limitations such as little comparability among cases. We propose a method of measuring the relationship between the National Diet of Japan and Japanese ministries through text similarity and the time stamps contained in the minutes of public organizations, which have been published online as machine-readable open data. Our analysis suggests that the method yields results consistent with existing qualitative analyses and can effectively support and improve understanding of the relationship between the Diet and ministries. In addition, the method has the advantage of analyzing a wide variety of topics in the same way, ensuring comparability for researchers.
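The idea of combining text similarity with time stamps can be sketched as follows. The abstract does not say which similarity measure or decision rule the authors use, so the cosine similarity over word counts, the `threshold` parameter, and the function names `cosine_similarity` and `first_mover` are all assumptions made for illustration.

```python
import math
from collections import Counter
from datetime import date


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between whitespace-tokenized word-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def first_mover(diet, ministry, threshold=0.5):
    """Given (date, text) pairs, report which body raised a shared topic first.

    Returns None when the texts are not similar enough to count as one topic.
    The threshold is a hypothetical parameter, not a value from the study.
    """
    (d_date, d_text), (m_date, m_text) = diet, ministry
    if cosine_similarity(d_text, m_text) < threshold:
        return None
    if d_date == m_date:
        return "same day"
    return "diet" if d_date < m_date else "ministry"


print(first_mover(
    (date(2020, 1, 1), "energy policy reform"),
    (date(2020, 2, 1), "energy policy reform"),
))
```

Japanese minutes would first need word segmentation, because whitespace tokenization as used here only works for space-delimited text.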
Article
Full-text available
In recent years, minutes of regional assemblies and the National Diet have been published on the web. Those minutes have long recorded transcribed discussions of mayors and members of assemblies. Therefore, they are a target of study in various fields such as politics, economics, linguistics, and information engineering. Since the minutes of the National Diet are maintained in electronic form and freely available via a search system, many researchers have utilized them as a target of study. Minutes of regional assembly meetings are also the focus of researchers in various fields. However, researchers have had trouble gathering and preparing minutes for their studies, because the way in which minutes are made available to the public varies from assembly to assembly. It is very inefficient for each researcher to digitize minutes separately. To improve the situation and contribute to research communities, we have collected minutes of regional assemblies and constructed a corpus of regional assembly minutes. In this paper, we discuss the construction of this corpus. The corpus records minutes from regional assemblies all over Japan that are available on the web. We added information to the corpus, such as “date,” “name of meeting,” “name of speaker,” and “text of statement,” so that users may search statements across the corpus using such information. The final goal of our project is to build a political information system that can recommend a suitable person, or members of an assembly, according to the consistency between users’ opinions and statements of assembly members. As a preliminary step of development, we annotated a part of the corpus with information about the speaker’s attitude to specific political subjects, including degree of approval/disapproval. In this paper, we also report the results of the annotation.
Article
Full-text available
This paper describes the extraction of the political activities of an assemblyman from the minutes of municipal assemblies using political categories. The extraction system is oriented to a political information support service between local assemblymen and inhabitants through the World Wide Web. First, we constructed local political categories based on the names of the committees in assemblies and their subjects, and annotated the minutes accordingly. The target is 7,084 paragraphs in the minutes of the Otaru city assembly in 2007. We carried out an experiment using SVMs with several new features for minutes, followed by an experiment on the extraction of political activities from minutes using the estimated political categories. The correspondence rate between the annotators' results and the system-estimated results is 91.7% when the second-highest rank is also permitted.