Sentence-End Reshaping for "Shinkansen Summarization"

Satoshi Ikeda, Kazuteru Ohashi, Kazuhide Yamamoto. Sentence-End Reshaping for "Shinkansen Summarization". IPSJ SIG Technical Report, NL163-22 / FI76-22 (2004.9)
Processing sentence end reduction of a newsflash
Satoshi Ikeda, Kazuteru Ohashi, Kazuhide Yamamoto
Department of Electrical Engineering, Nagaoka University of Technology
E-mail:{ikeda,ohashi,ykaz}@nlp.nagaokaut.ac.jp
Electric bulletin board news consists of highly condensed expressions. Its sentences
end in a distinctive form, namely a noun or a case particle. This paper focuses on
sentence-end expressions and attempts to summarize them by reshaping them so that they
end in a noun or a case particle. We summarize news sentences with a pattern-matching
approach. Our evaluation shows that the summarizer removes 2.50 characters on average
and that the summarization ratio of sentence ends is 52%. We also show that 95% of the
reductions are correct.
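As an illustration of the pattern-matching approach, the following minimal Python sketch rewrites a sentence end so that it finishes on a noun or a case particle. The rules in `RULES` and the function name `reduce_sentence_end` are hypothetical and work on surface strings only. They are not the rule set evaluated in this paper, and a practical system would presumably check parts of speech with a morphological analyzer before deleting anything.

```python
import re

# Hypothetical surface-string rules (NOT the paper's rule set).
# Each entry is (pattern anchored at the sentence end, replacement).
RULES = [
    # "...を発表した。" -> "...を発表" (ends in a noun)
    (re.compile(r"(しました|している|した|する)。?$"), ""),
    # "...過去最高になった。" -> "...過去最高に" (ends in a case particle)
    (re.compile(r"になった。?$"), "に"),
    # Fallback: drop only the final period.
    (re.compile(r"。$"), ""),
]


def reduce_sentence_end(sentence: str) -> str:
    """Apply the first matching rule; return the input unchanged if none match."""
    for pattern, replacement in RULES:
        if pattern.search(sentence):
            return pattern.sub(replacement, sentence, count=1)
    return sentence


def characters_removed(sentence: str) -> int:
    """Number of characters removed from one sentence by the sketch."""
    return len(sentence) - len(reduce_sentence_end(sentence))
```

Trying the rules in a fixed order and stopping at the first match keeps the behaviour easy to inspect; a sentence that matches no rule is returned unchanged, so it contributes zero removed characters.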
Table 1:
   3365
  21127
  40374
Table 2: [%]
          23.7     55.92
  ( )    (5.00)   (39.90)
         28.66     15.91
          1.80      0.19
          0.20      0.22
          1.56      8.83
  ( )    (0.34)    (6.41)
         38.59     18.52
          5.42      0.40
Table 3:
          (a)       (b)      a/b
        1.059     3.126    0.335
        0.622     2.147    0.290
        0.342     2.522    0.136
        0.188     2.633    0.072
        1.356     3.605    0.376
        0.446     0.191    2.432
        5.914    50.882    0.116
   7    2.712     1.011    0.373
Example 40: (5 characters) → (2 characters)
\[ \text{summarization ratio of example 40} = \frac{2}{5} = 0.40 \]
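The worked example above (5 units reduced to 2, giving 0.40) can be read as the length of the sentence-end expression after reduction divided by its length before reduction. The sketch below is a minimal illustration under that reading, assuming lengths are counted in characters. The function name and the sample strings are hypothetical and are not the paper's example 40.

```python
def summarization_ratio(before: str, after: str) -> float:
    """Length after reduction divided by length before (characters assumed)."""
    if not before:
        raise ValueError("the original sentence-end expression must not be empty")
    return len(after) / len(before)


# Hypothetical 5-character sentence end reduced to 2 characters.
ratio = summarization_ratio("発表した。", "発表")
```

Under this definition a smaller ratio means a stronger reduction, and a ratio of 1.0 means nothing was removed.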
Table 4:
Section                 3.1     3.2     3.3     3.4     3.5     3.6     3.8     3.9   Total
Summarization ratio    0.60    0.33    0.49    0.45    0.62    0.66    0.41    0.36    0.52
Count                 16825    1313   37995    7510     199    7194     197     848   72727
Table 5:
Section        3.1    3.2    3.3    3.4    3.5    3.6    3.7    3.8    3.9   Total
Evaluated      231     19    492    107      9    116     21      3     13    1000
Correct        205     18    481    106      8    113     17      3     12     952
Correctness   0.89   0.95   0.98   0.99   0.89   0.97   0.81      1   0.92    0.95
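The third row of the table above appears to be the second row divided by the first, column by column. The following sketch checks that arithmetic, assuming the first row counts the evaluated cases and the second the cases judged correct. That reading, the dictionary names and the `total` key are assumptions, since the row labels are not stated.

```python
# Counts copied from the table above; the row meanings are an assumption.
evaluated = {"3.1": 231, "3.2": 19, "3.3": 492, "3.4": 107, "3.5": 9,
             "3.6": 116, "3.7": 21, "3.8": 3, "3.9": 13, "total": 1000}
correct = {"3.1": 205, "3.2": 18, "3.3": 481, "3.4": 106, "3.5": 8,
           "3.6": 113, "3.7": 17, "3.8": 3, "3.9": 12, "total": 952}


def correctness(section: str) -> float:
    """Share of evaluated cases judged correct, rounded to two decimals."""
    return round(correct[section] / evaluated[section], 2)


rates = {section: correctness(section) for section in evaluated}
```

With these inputs the overall rate comes out at 0.95, in line with the 95% correctness reported in the abstract, and the lowest per-section rate is the 0.81 of Section 3.7.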
Table 6:
     1      2      3
  0.98   0.95   0.91
Table 7:
Count                  72727    100
Summarization ratio     0.52   0.51

Table 8:
Count                         73512    100
Characters removed           183606    290
Average characters removed     2.50   2.90
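The averages in the last row above follow from dividing the number of removed characters by the count in the same column. A short check, assuming that reading of the rows (the function and variable names are hypothetical):

```python
def average_removed(total_chars_removed: int, count: int) -> float:
    """Average number of characters removed per counted item."""
    return total_chars_removed / count


first_column = average_removed(183606, 73512)
second_column = average_removed(290, 100)
```

Rounded to two decimals, the two columns give 2.50 and 2.90, the first of which is the average reduction quoted in the abstract.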
(B) 16700134, (A) 16200009
(1) NIKKEI-goo, http://nikkeimail.goo.ne.jp/
(2) 2000
(3) ChaSen, Ver. 2.3.3, http://chasen.naist.jp/hiki/ChaSen/
(4) 2000
[1] NL-133-7, pp. 45-52, 1999.
[2] 10, pp. 693-696, 2004.
[3] Web, NL-153-1, pp. 1-8, 2003.
[4] Web, NL-159-27, pp. 193-200, 2004.
[5] Vol. 6, No. 6, pp. 65-81, 1999.
[6] NL-122-13, pp. 83-89, 1997.