&OXVWHULQJWKH8VHUVRI/DUJH:HE6LWHVLQWR&RPPXQLWLHV
*HRUJLRV3DOLRXUDV #
,QVWLWXWHRI,QIRUPDWLFVDQG7HOHFRPPXQLFDWLRQV1&65'HPRNULWRV$JKLD3DUDVNHYL*5*5((&(
&KULVWRV3DSDWKHRGRURX
#
'LYLVLRQRI$SSOLHG7HFKQRORJLHV1&65'HPRNULWRV$JKLD3DUDVNHYL*5*5((&(
9DQJHOLV.DUNDOHWVLV
#
&RQVWDQWLQH'6S\URSRXORV
#
,QVWLWXWHRI,QIRUPDWLFVDQG7HOHFRPPXQLFDWLRQV1&65'HPRNULWRV$JKLD3DUDVNHYL*5*5((&(

Abstract

In this paper we analyze the performance of clustering methods on the task of constructing community models for the users of large Web sites. Community models represent patterns of usage of the Web site, which can be associated with different types of user. Knowledge of this type is clearly valuable for commercial sites, where each user is a potential customer. We argue that it is equally valuable for non-commercial sites, because it can assist greatly in the improvement of the site. We evaluate three clustering methods on usage data from a large site that covers online resources in Chemistry. The size of the site and its high hit rate impose a serious constraint on the scalability of the methods. We also examine two ways of encoding usage data, which give complementary information about the behavior of the users. Finally, the emphasis is on the construction of meaningful community models, by identifying the descriptive characteristics of communities at a post-processing stage.

Introduction

Interest in the analysis of user behavior on the Internet has been increasing rapidly, especially since the advent of electronic commerce. New concepts, such as electronic customer relationship management (e-CRM), web usage analysis and web mining, have appeared recently in the literature. All of these share the same goal: understanding the needs, interests and knowledge of the users of Web sites. The motivation stems from the fact that added value is not gained merely through larger quantities of data on a site, but through easier access to the required information at the right time and in the most suitable form. Seen from a different viewpoint:
“The quantity of people visiting your site is less important than the quality of their experience” (Schwartz).
&RPPHUFHRQ WKH ,QWHUQHWSURYLGHV D XQLTXHRSSRUWXQLW\
IRU EXVLQHVVHV WR PHHW WKHLU FXVWRPHUV DQG DGDSW WKHLU
VHUYLFH WR WKHP 7KH VDPH KROGV IRU VLWHV WKDW SURYLGH
QRQFRPPHUFLDO VHUYLFHV WKH VXFFHVV RI ZKLFK GHSHQGV
ODUJHO\RQWKHLUXQGHUVWDQGLQJRIWKHLUXVHUV¶LQWHUHVWVDQG
QHHGV7KHXVH RIWKHFRPSXWHUDVDQLQWHUPHGLDU\LQWKH
SURYLVLRQ RI VHUYLFHV DOORZV WKH FROOHFWLRQ RI WUDQVDFWLRQ
GDWD ZLWK OLPLWHGFRVW DQG HIIRUW+RZHYHU WKH WUDQVIRU
PDWLRQRIWKHVHGDWDWRXVHIXONQRZOHGJHLVQRWVLPSOH
0DFKLQHOHDUQLQJWHFKQLTXHVKDYHEHHQVKRZQWRDGGUHVV
WKLVLVVXH ZHOO OHDGLQJWR WKHFUHDWLRQ RIDVHSDUDWH ILHOG
RI VWXG\ LH WKDW RI NQRZOHGJH GLVFRYHU\ IURP GDWD
.'' 2XU DSSURDFK LV DQ DWWHPSW WR DSSO\ WKH VDPH
LGHDV WR XVDJH GDWD IURP ,QWHUQHWEDVHG VHUYLFHV ,Q WKLV
HIIRUWZH PDNH XVHRI EDVLF FRQFHSWVDQG LGHDV IURPWKH
DUHD RI XVHU PRGHOLQJ 7KH PRWLYDWLRQ IRU ERUURZLQJ
WKHVHLGHDVLV WKHIDFWWKDWD:HEVLWHLVVWLOODFRPSXWHU
EDVHGV\VWHPEHLQJXVHGE\SHRSOHRQWKH,QWHUQHW
,QSDUWLFXODUZHIRFXVRXUZRUNRQWKHFRQFHSWRIDXVHU
FRPPXQLW\2UZDQWZKLFKVHHPVWRDSSO\EHVWWR
D SXEOLF :HE VLWH $OWHUQDWLYH DSSURDFKHV DQG RWKHU UH
ODWHGZRUNLVSUHVHQWHGLQ6HFWLRQ$FRPPXQLW\FRUUH
VSRQGVWRDJURXSRIXVHUVZKRH[KLELWFRPPRQEHKDYLRU
LQ WKHLU LQWHUDFWLRQ ZLWK WKH V\VWHP 2XU DSSURDFK WR WKH
FRQVWUXFWLRQ RI FRPPXQLW\ PRGHOV LV WR XVH FOXVWHULQJ
PHWKRGVRQWKHXVDJHGDWDDQGWKHQXVHDSRVWSURFHVVLQJ
PHWKRG WR LGHQWLI\ WKH GLVWLQJXLVKLQJ FKDUDFWHULVWLFV RI
HDFKFOXVWHU:HJLYHSDUWLFXODUHPSKDVLVRQWKHLQWHUSUH
WDWLRQVWDJHEHFDXVHZHEHOLHYHWKDWWKHGHVFULSWLYHFKDU
DFWHUL]DWLRQ RI D FRPPXQLW\ LQ WHUPV RI D PRGHO LVZKDW
WKHVLWHRZQHUQHHGV6HFWLRQRIWKLVSDSHUH[SODLQVRXU
DSSURDFKIXUWKHU)RUWKHFOXVWHULQJSURFHVVLWVHOIZHXVH
H[LVWLQJWHFKQLTXHVZKLFKDUHEULHIO\GHVFULEHGLQ6HFWLRQ
 $Q LPSRUWDQW FULWHULRQ IRU WKH FKRLFH RI D FOXVWHULQJ
PHWKRG LV LWV VFDODELOLW\ WR ODUJH VHWV RI GDWD ZKLFK LV D
UHTXLUHPHQWIRUDQ\PHWKRGWKDWZHKRSHWRDSSO\WRUHDO
OLIH:HEVLWHV ,QWKLVSDSHUZHHYDOXDWHWKUHHFOXVWHULQJ
PHWKRGV RQ D ODUJH VLWH FRQWDLQLQJ LQIRUPDWLRQ IRU UH
VHDUFKHUVLQ&KHPLVWU\6HFWLRQGHVFULEHVWKHXVDJHGDWD
DFTXLUHG IURP WKLV VLWH DQG WKH SUHSURFHVVLQJ WKDW ZH
SHUIRUPHG WR WKHP 6HFWLRQ  SUHVHQWV RXU HYDOXDWLRQ RI
WKH WKUHH FOXVWHULQJ PHWKRGV RQ WKH GDWD DQG 6HFWLRQ 
FRQFOXGHV RXU ZRUN LQWURGXFLQJ VRPH LQWHUHVWLQJ LVVXHV
WKDWDUHVWLOORSHQ
5HODWHG:RUN
$V PHQWLRQHG DERYH RXU ZRUN IRFXVHV RQ WKH FRQVWUXF
WLRQRIPRGHOVIRUXVHUFRPPXQLWLHVLHJURXSVRIXVHUV
ZLWKFRPPRQEHKDYLRU2QHDOWHUQDWLYHFRQFHSWLVWKDWRI
D SHUVRQDO XVHU PRGHO LH D PRGHO FRUUHVSRQGLQJ WR D
VLQJOH XVHU 3HUVRQDO XVHU PRGHOOLQJ KDV DOUHDG\ EHHQ
VWXGLHG H[WHQVLYHO\ ZLWK WKH XVH RI PDFKLQH OHDUQLQJ
PHWKRGVHJ/DQJOH\-RDFKLPVHWDO3D]
]DQL %LOOVXV 7KHGLIILFXOW\ LQDGRSWLQJWKLV DS
SURDFKIRU :HE VLWHVLVWKH UHTXLUHPHQWIRUXVHU LGHQWLIL
FDWLRQ DW HDFK LQWHUDFWLRQ RI D XVHU ZLWK WKH V\VWHP $O
WKRXJK UHJLVWUDWLRQ SURFHGXUHV KDYH EHHQ DGRSWHG E\
VRPH:HEVLWHVWKHQHHGIRULGHQWLILFDWLRQLVZLGHO\FRQ
VLGHUHGD³VFDUHFURZ´IRUSRWHQWLDOXVHUV$QRWKHUDOWHUQD
WLYHWRFRPPXQLWLHVZKLFKZHKDYHORRNHGDWLQWKHSDVW
3DOLRXUDVHWDOLVWKHFRQFHSWRIDXVHUVWHUHRW\SH
5LFK  ZKLFK FRUUHVSRQGV WR D FRPPXQLW\ DVVRFL
DWHG ZLWK FRPPRQ SHUVRQDO FKDUDFWHULVWLFV RI WKH XVHUV
VXFK DV DJH JHQGHU HWF $OWKRXJK VWHUHRW\SHV DUH PRUH
LQIRUPDWLYH WKDQ FRPPXQLWLHV WKH FROOHFWLRQ RI SHUVRQDO
LQIRUPDWLRQH[RJHQRXVWR WKHV\VWHPLQ DZD\WKDWGRHV
QRWYLRODWHWKHSULYDF\RIWKHXVHUVLVDWKRUQ\LVVXH
7KHEURDGHU UHVHDUFK DUHDLQZKLFK RXUZRUNFRQWULEXWHV
PRVWO\LV:HEXVDJHPLQLQJ&RROH\HWDOZKLFK
DSSOLHVGDWDPLQLQJWHFKQLTXHVWR:HEXVDJHGDWDDLPLQJ
WRLGHQWLI\LQWHUHVWLQJXVDJHSDWWHUQV$VLQPRVWDSSOLFD
WLRQVRIGDWDPLQLQJGLIIHUHQWWHFKQLTXHVH[WUDFWGLIIHUHQW
W\SHVRINQRZOHGJHIURPXVDJHGDWD7KLVNQRZOHGJHFDQ
EH FRPELQHG WR SURYLGH D GHWDLOHG SLFWXUH RI WKHYLVLWRUV
RIDVLWH7KHVLWHDGPLQLVWUDWRUFDQXVHWKLVNQRZOHGJHWR
LPSURYH WKH VHUYLFH LQ WHUPV RI LWV SUHVHQWDWLRQ LQ WKH
VLWHHJGLUHFWLQJWKHXVHUVTXLFNO\WRWKHPRVWDSSURSUL
DWHSDJHV RUHYHQE\ VKLIWLQJWKHIRFDO SRLQWVRIWKH VHU
YLFH LH UHDFWLQJ WR WKH YLVLWRUV¶ LPSOLFLW UHTXHVWV 7KH
ZRUNWKDWKDVEHHQSUHVHQWHGVRIDULQ:HEXVDJHPLQLQJ
KDVORRNHGPDLQO\DWWKHFRQVWUXFWLRQRIDVVRFLDWLRQUXOHV
DQG VHTXHQFH PLQLQJ HJ &RROH\ HW DO  %FKQHU
HWDO7KHRQO\ZRUNWKDWZHDUHDZDUHRIXVLQJD
FOXVWHULQJ PHWKRG LV WKDW RI )X HW DO :HH[WHQG
WKLVZRUNE\ORRNLQJDWRWKHUFOXVWHULQJPHWKRGVDQGWZR
GLIIHUHQWZD\VWR HQFRGHWKHGDWD)XUWKHUPRUHZHIRFXV
RQ WKH FKDUDFWHUL]DWLRQ RI WKH UHVXOWLQJ FOXVWHUV ZKLFKLV
QRWDGGUHVVHGE\)XHWDO
$QRWKHUUHVHDUFKDUHDWKDWLVUHODWHGWR:HEXVDJHPLQLQJ
LVFROODERUDWLYHILOWHULQJ$WLWVRXWVHWFROODERUDWLYHILOWHU
LQJZDVPDLQO\ XVHGLQLQIRUPDWLRQ ILOWHULQJVHUYLFHVDQG
DLPHGWRUHODWHRQHSDUWLFXODUXVHUWRDJURXSRIRWKHUXV
HUVZLWKVLPLODULQIRUPDWLRQQHHGV5HVHDUFKLQFROODERUD
WLYHILOWHULQJKDVEHHQYHU\DFWLYHHJ%DVXHWDO
%DODEDQRYLF  6KRKDP  DQG KDV DOVR SURYLGHG
FRPPHUFLDO SURGXFWV 7KHUH LV VXEVWDQWLDO RYHUODS EH
WZHHQ FROODERUDWLYH ILOWHULQJ DQG :HE XVDJH PLQLQJ EXW
WKHUHLV DOVR DQLPSRUWDQW GLIIHUHQFH RIJRDOV&ROODERUD
WLYH ILOWHULQJ LV D XVHUFHQWHUHG DSSURDFK DLPLQJ WR KHOS
WKHXVHUGLUHFWO\ZKLOH:HEXVDJHPLQLQJDLPVWRH[WUDFW
YDOXDEOH NQRZOHGJH IRU WKH V\VWHPRZQHU 7KXV WKHWZR
DSSURDFKHV FRPSOHPHQW HDFKRWKHU 7HFKQLFDOO\ PRVWRI
WKH ZRUN LQ FROODERUDWLYH ILOWHULQJ XVHV LQVWDQFHEDVHG
PHWKRGV ZKLOH :HE XVDJH PLQLQJ DQG LQ SDUWLFXODU RXU
ZRUN LV EDVHG RQ DQ DFWLYHVHDUFK IRU JHQHUDOPRGHOV $
QRWDEOHH[FHSWLRQLVWKHZRUNRI%UHHVHHWDOZKR
XVHGPRGHOEDVHGPHWKRGVIRUFROODERUDWLYHILOWHULQJ7KH
SULPDU\JRDORIWKLVZRUNUHPDLQHGWKHDELOLW\WRKHOSWKH
XVHUVGLUHFWO\UDWKHUWKDQWRDQDO\]HWKHFKDUDFWHULVWLFVRI
WKHPRGHOVWKDWZHUHJHQHUDWHG
&RQVWUXFWLQJ DQG (YDOXDWLQJ &RPPXQLW\
0RGHOV
7KHRULJLQDO GHILQLWLRQ RID XVHU FRPPXQLW\DVVXPHGWKH
LGHQWLILFDWLRQRILQGLYLGXDOXVHUVLHDFRPPXQLW\PRGHO
FRUUHVSRQGHG WR DQ LGHQWLILDEOH JURXS RI XVHUV 6RPH RI
RXU HDUOLHU ZRUN RQ WKLV SUREOHP 3DOLRXUDV HW DO 
ZDVEDVHGRQWKHVDPHDVVXPSWLRQ,QWKDWZRUNZHXVHG
FRQFHSWXDO FOXVWHULQJ DOJRULWKPV VXFK DV &2%:(%
)LVKHU  DQG ,7(5$7( %LVZDV HW DO  WR
EXLOGFRPPXQLW\PRGHOV E\FOXVWHULQJSHUVRQDOXVHUSUR
ILOHV +RZHYHU DV PHQWLRQHG DERYH WKH DVVXPSWLRQ RI
XVHU LGHQWLILFDWLRQ LV XQUHDOLVWLF IRU WKH PDMRULW\ RI WKH
H[LVWLQJ :HE VLWHV 'HVSLWH WKLV FRPPXQLW\ PRGHOV FDQ
VWLOOEH FRQVWUXFWHG RQXVDJH GDWD SURYLGLQJYDOXDEOHLQ
IRUPDWLRQWRWKHRZQHURID:HEEDVHGVHUYLFH,PSOLFLWO\
FRPPXQLW\PRGHOVEXLOWLQWKLVPDQQHUVWLOOFRUUHVSRQGWR
JURXSVRIXVHUVZLWKFRPPRQEHKDYLRUDOWKRXJKRQHPD\
QRWEHDEOHWRLGHQWLI\LQGLYLGXDOFRPPXQLW\PHPEHUV
&RQFHSWXDOFOXVWHULQJDOJRULWKPVFDQVWLOOEHXVHGIRUWKH
FRQVWUXFWLRQ RI WKH PRGLILHG W\SH RI FRPPXQLW\ WKDW ZH
DUHLQWHUHVWHGLQ+RZHYHUWKHUHDUHWZRSUREOHPVLQGR
LQJWKDWDWKHRUHWLFDODQGDSUDFWLFDORQH7KHWKHRUHWLFDO
SUREOHP DULVHV IURP WKH FOXVWHULQJ SURFHVV XVHG E\ FRQ
FHSWXDO FOXVWHULQJ DOJRULWKPV ZKLFK FRQVWUXFWV QRQ
RYHUODSSLQJFOXVWHUV7KLVLVQRWGHVLUDEOHZKHQFOXVWHULQJ
XVHUVLQWRFRPPXQLWLHV VLQFHDXVHU PD\EHORQJWRPRUH
WKDQ RQH FRPPXQLW\ 7KH SUDFWLFDO SUREOHP WKDW ZH HQ
FRXQWHUHG LV WKDW WKH SXEOLFO\DYDLODEOH YHUVLRQV RI WKH
WZR DOJRULWKPV WKDW ZH KDG FRXOG RQO\ KDQGOH VPDOO VHWV
RIGDWD7KLVUHVWULFWLRQZDVSURKLELWLYHIRUWKHZRUNSUH
VHQWHGKHUH 'XH WRWKHVH SUREOHPV ZHGHFLGHGWR H[DP
LQH WKH WKUHH DOJRULWKPV WKDW DUH SUHVHQWHG LQ 6HFWLRQ 
ZKLFKDOORZRYHUODSSLQJFOXVWHUVDQGFDQKDQGOHWKHODUJH
TXDQWLWLHVRIGDWDWKDWZHKDYHDYDLODEOH
7KHILQDO JRDO RIRXU DSSURDFK LVD VHWRI PRGHOVZKLFK
FRUUHVSRQG WR EHKDYLRUDO SDWWHUQV IRU GLIIHUHQW W\SHV RI
XVHU&OXVWHULQJWKHXVHUVLQWRFRPPXQLWLHVZLWKFRPPRQ
EHKDYLRUDOFKDUDFWHULVWLFVLVDILUVWVWHSWRZDUGVWKLVJRDO
EXWGRHVQRWSURYLGHWKHGHVLUHGSDWWHUQV,QRUGHUWRRE
WDLQ WKHVH SDWWHUQV ZH QHHG WR LGHQWLI\ WKH GHVFULSWLYH
FKDUDFWHULVWLFV RI HDFK FOXVWHU 7KH ZD\ LQ ZKLFK ZH
DFKLHYH WKLV GLIIHUV IRU GLIIHUHQW FOXVWHULQJ PHWKRGV EXW
WKHXQGHUO\LQJLGHDLVFRPPRQ
$ FRPPXQLW\ PRGHO LV H[SUHVVHG LQ WHUPV RI WKH VDPH
SDUDPHWHUV DV WKH XQGHUO\LQJ XVDJH GDWD )RU LQVWDQFH LI
WKH XVDJH GDWD VLPSO\ UHFRUG WKH SDJHV LQ D VLWH WKDW DUH
YLVLWHG E\ D XVHU ZLWKLQ DQDFFHVV VHVVLRQ WKHFRPPXQL
WLHVZLOODOVREHGHVFULEHGLQWHUPVRIWKHSDJHVLQWKHVLWH
,QDPRUH IRUPDOODQJXDJHJLYHQ DVHWRIERROHDQDWWULE
XWHV$ GHVFULELQJWKHLQVWDQFHV LQWKHGDWDVHWWKH PRGHO
RI D FRPPXQLW\ &
FRQVLVWV RI D VXEVHW RI $ $  ZKLFK
FKDUDFWHUL]HV WKH PHPEHUV RI WKH FRPPXQLW\ LH ZKLFK
DUH XVXDOO\ WUXH IRU WKH PHPEHUV RI WKH FRPPXQLW\ ,I $
GRHVQRWFRQWDLQERROHDQDWWULEXWHVFKDUDFWHULVWLFDWWULEXWH
YDOXHV IRU QRPLQDO DWWULEXWHV RU UDQJHV RI YDOXHV IRU
QXPHULF DWWULEXWHV ZLOO SURYLGH WKH FRPPXQLW\ PRGHO
)RUWKHVDNHRIVLPSOLFLW\ZHXVHERROHDQDWWULEXWHVKHUH
7KHVHOHFWLRQRIWKHGHVFULSWLYHDWWULEXWHVLVGRQHZLWKWKH
DLGRIVLPSOHPHWULFVZKLFKDUHEDVHGRQWKHLGHDWKDWDQ
DWWULEXWHLVVSHFLDOIRUDFRPPXQLW\LILWVIUHTXHQF\ZLWKLQ
WKH FRPPXQLW\ LV VLJQLILFDQWO\ KLJKHU WKDQ LWV IUHTXHQF\
LQWKHZKROHGDWDVHW7KHQDWXUDOFKRLFHRIPHWULFGLIIHUV
IRUGLIIHUHQW FOXVWHULQJPHWKRGV)RU WKHFRQFHSWXDOFOXV
WHULQJ DOJRULWKPV ZH KDYH XVHG D VTXDUHGGLIIHUHQFH
PHDVXUH FDOOHG IUHTXHQF\ LQFUHDVH 3DOLRXUDV HW DO
 PRWLYDWHG E\ WKH FDWHJRU\ XWLOLW\ VHDUFK KHXULVWLF
WKDW WKHVH DOJRULWKPV XVH 7KH FKRLFHV IRU WKH FOXVWHULQJ
DOJRULWKPVWKDWDUHXVHGKHUHDUHH[SODLQHGLQ6HFWLRQ
+DYLQJ GHFLGHG RQ D PHWKRG WR REWDLQ WKH FRPPXQLW\
PRGHOV ZH QHHG WR GHFLGH RQ WKH GHVLUHG SURSHUWLHV RI
WKHVH PRGHOV 2XU SULPDU\ REMHFWLYH LV WR SURYLGH XVHIXO
FRPPXQLW\PRGHOV ,Q RUGHUIRU WKH PRGHOVWR EHXVHIXO
WKH\QHHGWREHUHODWLYHO\IHZLQQXPEHUDQGVPDOOLQVL]H
$VD UHVXOW WKHQXPEHU RIPRGHOV DQGWKHLUDYHUDJH VL]H
DUHWZRLPSRUWDQWPHDVXUDEOHFULWHULDIRUWKHVXFFHVVRID
PHWKRG7KHH[DFWILJXUHVIRUWKHVHFULWHULDGHSHQGRQWKH
QDWXUHRIWKHSUREOHPHJWKHVL]HRIWKHDWWULEXWHVHW
+RZHYHU D GLJHVWLEOH VHW RI PRGHOV LV QRW QHFHVVDULO\
LQWHUHVWLQJ :KHQ WKHUH DUH RQO\ VPDOO GLIIHUHQFHV EH
WZHHQ WKH PRGHOV DFFRXQWLQJ IRU YDULDQWV RI WKH VDPH
FRPPXQLW\WKHVHJPHQWDWLRQRIXVHUVLQWRFRPPXQLWLHVLV
QRW LQWHUHVWLQJ 7KXV ZH DUH LQWHUHVWHG LQ FRPPXQLW\
PRGHOV WKDW DUH DV GLVWLQFW IURP HDFK RWKHU DV SRVVLEOH
:H PHDVXUH WKH GLVWLQFWLYHQHVV RI D VHW RI PRGHOV 0E\
WKHUDWLREHWZHHQWKHQXPEHURIGLVWLQFWDWWULEXWHVWKDWDUH
FRYHUHGDQGWKHVL]HRIWKHPRGHOVHW07KXVLIWKHUHDUH
- FRPPXQLWLHV LQ 0 $
WKH DWWULEXWHV XVHG LQ WKH MWK
PRGHO DQG $ WKH DWWULEXWHV DSSHDULQJ DW OHDVW LQ RQH
PRGHOGLVWLQFWLYHQHVVLVJLYHQE\WKHIROORZLQJHTXDWLRQ
__
$
$
0HQHVV'LVWLQFWLY =

6LQFHZHDUH LQWHUHVWHGLQDVPDOOQXPEHURIPRGHOVWKDW
DUH GLVWLQFW WKH HPSW\ PRGHO VHW WULYLDOO\ VDWLVILHV RXU
FULWHULD ,Q D PRUH UHDOLVWLF VLWXDWLRQ ZH PLJKW KDYH D
VPDOO VHW RI GLVWLQFW PRGHOV ZKLFK DFFRXQW IRU RQO\ D
VPDOO SDUW RI WKH XVDJH RI WKH V\VWHP ,Q RUGHU WR DYRLG
WKLV SUREOHP ZH LQWURGXFH D IXUWKHU FULWHULRQ WKDW FRXQ
WHUDFWV GLVWLQFWLYHQHVV DQG VL]H 7KH QHZ FULWHULRQ LV WKH
RYHUDOO FRYHUDJH RI WKH FRPPXQLW\ PRGHOV LH WKHSUR
SRUWLRQ RI DWWULEXWHV FRYHUHG E\ WKH PRGHOV ,I $ LV WKH
VHWRIDWWULEXWHVDSSHDULQJDWOHDVWLQRQHPRGHOWKHFRY
HUDJHRIWKHVHWRIPRGHOV0LV
$
$
0&RYHUDJH
=

7KH VLPXOWDQHRXV RSWLPL]DWLRQ RI GLVWLQFWLYHQHVV DQG
FRYHUDJH E\ D VHW RI FRPPXQLW\ PRGHOV LQGLFDWHV WKH
SUHVHQFH RI XVHIXO NQRZOHGJH LQ WKH VHW $OO RI WKH IRXU
PHDVXUHVLQWURGXFHGLQWKLVVHFWLRQLHQXPEHURIPRG
HOVDYHUDJHPRGHOVL]HGLVWLQFWLYHQHVVDQGFRYHUDJHDUH
LQGHSHQGHQWRIWKHELDVHVRIWKHFOXVWHULQJDOJRULWKPV)RU
WKLVUHDVRQWKH\FRQVWLWXWHREMHFWLYHFULWHULDDQGWKH\DUH
XVHGLQWKHHYDOXDWLRQRIWKHWKUHHFOXVWHULQJPHWKRGV
,QWURGXFWLRQRIWKH&OXVWHULQJ0HWKRGV
7KLV VHFWLRQ GHVFULEHV EULHIO\ WKUHH FOXVWHULQJ PHWKRGV
WKDW ZH XVH LQ RXU ZRUN 7ZR RI WKHP KDYH EHHQ XVHG
ZLGHO\LQPDFKLQHOHDUQLQJUHVHDUFK7KHVHDUH$XWRFODVV
+DQVRQHWDODQGVHOIRUJDQL]LQJPDSV.RKRQHQ
 2XU SUHVHQWDWLRQ RI WKRVH WZR PHWKRGV LV D VKRUW
VXPPDU\ RI WKH GHVFULSWLRQV RQH FDQ ILQG LQ WKH RULJLQDO
UHIHUHQFHV 7KH WKLUG PHWKRG LV D IDLUO\ QHZ RQH ZKLFK
ZHKDYHDOVRPRGLILHGWRVRPHH[WHQW,WLVFDOOHGFOXVWHU
PLQLQJ 3HUNRZLW]  (W]LRQL  DQG LW ZLOO EH SUH
VHQWHGLQPRUHGHWDLOWKDQWKHRWKHUWZR

Autoclass

Autoclass is an unsupervised classification algorithm that uses mixture modeling as the basic clustering method, supplemented by a Bayesian method for efficiently discovering the optimal classes in large datasets. Like all unsupervised classifiers, its goal is to find the most probable class descriptions of a dataset. Specifically, the algorithm considers that each class $C_j$ has its own probability distribution $T_j$. Its fundamental model is the finite mixture distribution, which deals with two main probabilities: (i) the interclass probability of an instance $X_i$ being a member of class $C_j$, $P(X_i \in C_j)$, and (ii) the class probability of observing the instance attribute values $X_i$, conditional on the assumption that $X_i$ is a member of $C_j$, $P(X_i \mid X_i \in C_j)$.
$XWRFODVV SHUIRUPV WZR OHYHOV RI VHDUFK L WKH PRGHO
OHYHO VHDUFK ZKLFK GHWHUPLQHV WKH QXPEHU RI FODVVHV -
DQGDOWHUQDWH FODVV PRGHOVIRU HDFK 7
DQG LL WKHVHDUFK
IRU PD[LPXP SRVWHULRU SDUDPHWHU YDOXHV ZKLFK GHWHU
PLQHV IRU DQ\ IL[HG 7
 WKH VHW RI WKH FRUUHVSRQGLQJ SD
UDPHWHUVWKDWDUHPD[LPDOO\SUREDEOH
$Q LPSRUWDQW GLIIHUHQFH RI $XWRFODVV WR RWKHU XQVXSHU
YLVHGFODVVLILHUVLV WKDWLWGRHVQRWDVVLJQLQVWDQFHVWRWKH
FODVVHV 6LQFH LW KROGV WKDW QR ILQLWH DPRXQW RI HYLGHQFH
FDQ GHWHUPLQH DQ LQVWDQFHV FODVV PHPEHUVKLS LW XVHV D
ZHLJKWHG DVVLJQPHQW ZHLJKWLQJ RQ WKH SUREDELOLW\ RI
FODVV PHPEHUVKLS $V H[SODLQHG DERYH WKLV SUREDELOLVWLF
DVVLJQPHQW RI LQVWDQFHV WR FODVVHV LV YHU\ VXLWDEOH WR
FRPPXQLW\PRGHOLQJ
$XWRFODVVKDVDEXLOWLQPHWULFWRDVVLVWLQWKHGHVFULSWLYH
FKDUDFWHUL]DWLRQ RI WKH FOXVWHUV LH WKH FRQVWUXFWLRQ RI
FRPPXQLW\PRGHOV7KLV PHWULFLVFDOOHG LQIOXHQFHDQGLW
LVGHILQHGIRUDQDWWULEXWHD
DQGDFODVV& DVIROORZV
_
ORJ
_
_
D3
&D3
D3
&D3
&D, =

7KLV LV D PHDVXUH RI KRZ LPSRUWDQW DQ DWWULEXWH LV IRU D
SDUWLFXODU FODVV 7KH FRPPXQLW\ PRGHO VKRXOG FRQVLVW RI
WKHDWWULEXWHVZLWKWKHKLJKHVWLQIOXHQFHYDOXHV7KHTXHV
WLRQWKDWDULVHVLVKRZPDQ\DWWULEXWHVWRNHHSSHUPRGHO
&OHDUO\QHJDWLYHLQIOXHQFHYDOXHVLQGLFDWHWKDWWKHDWWULE
XWHVKRXOGQRWEHXVHGLQWKHPRGHO:HGHFLGHGWRLQYHV
WLJDWHWKLVSDUDPHWHU IXUWKHUE\QRUPDOL]LQJWKHLQIOXHQFH
YDOXH ,
D _&  ,D _& PD[ ,D _&  DQG H[DPLQLQJ WKH
HIIHFWRIGLIIHUHQWWKUHVKROGYDOXHV
,Q WKH GHVFULEHG H[SHULPHQWV ZH XVHG WKH SXEOLFGRPDLQ
SURJUDP$XWRFODVV&YHUVLRQIRU:LQGRZV17
GRZQORDGHG IURP WKH IROORZLQJ :HE VLWH
KWWSLFZZZDUFQDVDJRYLFSURMHFWVED\HVJURXSDXWRFODVV
6HOI2UJDQL]LQJ0DSV
7KH VHOIRUJDQL]LQJ PDS 620 PHWKRG LV RQH RI WKH
PRVW SRSXODUQHXUDO QHWZRUN DSSURDFKHVWR XQVXSHUYLVHG
OHDUQLQJ :H XVH WKH EDWFK LPSOHPHQWDWLRQ RI WKH 620
LQFOXGHGLQWKH,QWHOOLJHQW0LQHUVRIWZDUHE\,%0
620SHUIRUPVD NPHDQVW\SHRI FOXVWHULQJE\WU\LQJWR
LGHQWLI\ SURWRW\SH YHFWRUV IRU WKH N FOXVWHUV 3URWRW\SHV
DFWDVFHQWHUVRIJUDYLW\IRUWKHFOXVWHUVDQGWKHLUSRVLWLRQ
LQWKHLQSXWVSDFHLVRSWLPL]HGLWHUDWLYHO\7KLVSURFHVVLV
FRPPRQ WR D ODUJH IDPLO\ RI RWKHU FOXVWHULQJ PHWKRGV
+RZHYHU WKH SRZHU RI 620 OLHV LQ LWV DELOLW\ WR VHDUFK
HIILFLHQWO\IRUWKHRSWLPDOSURWRW\SHV7KLVLVDFKLHYHGE\
DOORZLQJ QHLJKERULQJ FOXVWHUV WR DIIHFW WKH FKRLFH RI D
QHZSURWRW\SHYHFWRU 7KHHQGUHVXOW LVVLPLODUWR D9R
URQRL WHVVHOODWLRQ RI WKH LQSXW VSDFH WKH ERXQGDULHV RI
ZKLFK DSSUR[LPDWH WKH WKHRUHWLFDO %D\HVLDQ GHFLVLRQ
ERXQGDULHV ,Q WKLV VHQVH WKH 620 PHWKRG SURYLGHV D
GLIIHUHQWSDWKWRWKHVDPHGHVWLQDWLRQDV$XWRFODVVGRHV
)RU WKH WDVN RI FRPPXQLW\ PRGHOLQJ HDFK FOXVWHU FRUUH
VSRQGVWRDFRPPXQLW\DQGFRPPXQLWLHVWKDWDUHFORVHWR
HDFK RWKHU WHQG WR KDYH VLPLODU PRGHOV :H FRQVWUXFW
FRPPXQLW\ PRGHOV XVLQJ WKH VDPH PHWULF DV ZH GR IRU
$XWRFODVV LH LQIOXHQFH 2QH WHFKQLFDO GLIILFXOW\ ZLWK
620LVWKDWRQHQHHGVWRVSHFLI\DIL[HGQXPEHURIFRP
PXQLWLHV ,Q SUDFWLFH WKLV QXPEHU GRHV QRW QHHG WR EH
DFFXUDWH EXW LW QHHGV WR EH ODUJH HQRXJK WR FRYHU WKH
QXPEHURIUHDOFRPPXQLWLHVLQWKHGDWD,%0
&OXVWHU0LQLQJ
7KH FOXVWHU PLQLQJ DOJRULWKP LV D VLPSOH JUDSKEDVHG
FOXVWHULQJ PHWKRG &OXVWHU PLQLQJ GLVFRYHUV SDWWHUQV RI
FRPPRQEHKDYLRUE\ORRNLQJIRUDOOIXOO\FRQQHFWHGVXE
JUDSKVFOLTXHVRIDJUDSKWKDWUHSUHVHQWVWKHXVHUVFKDU
DFWHULVWLF DWWULEXWHV ,W VWDUWV E\ FRQVWUXFWLQJ D ZHLJKWHG
JUDSK*$(:
: 7KHVHWRIYHUWLFHV$FRUUHVSRQGVWR
WKHGHVFULSWLYHDWWULEXWHVXVHGLQWKHLQSXWGDWD7KHVHWRI
HGJHV ( FRUUHVSRQGV WR DWWULEXWH FRRFFXUUHQFH DV RE
VHUYHGLQWKHGDWD)RULQVWDQFHLQWKH:HEVLWHRQ&KHP
LVWU\WKDW ZH H[DPLQHLI WKHXVHUYLVLWV SDJHVFRQFHUQLQJ
2UJDQLF &KHPLVWU\ DQG 3RO\PHUV DQ HGJH LV DGGHG
EHWZHHQWKHUHOHYDQWYHUWLFHV7KHZHLJKWVRQWKHYHUWLFHV
:
DQG WKH HGJHV : DUH FRPSXWHG DV WKH DWWULEXWH IUH
TXHQFLHV DQG DWWULEXWH FRRFFXUUHQFH IUHTXHQFLHV UHVSHF
WLYHO\(GJHIUHTXHQFLHVDUHQRUPDOL]HGE\GLYLGLQJWKHP
ZLWKWKH PD[LPXP RIWKH IUHTXHQFLHV RIWKH WZR YHUWLFHV
WKDW WKH\ FRQQHFW 7KH HIIHFW RI QRUPDOL]DWLRQ LV WR UH
PRYH WKH ELDV IRU DWWULEXWHV WKDW DSSHDU YHU\ RIWHQ LQ DOO
XVHUV7KHUHVXOWLQJJUDSKLVJLYHQLQ)LJXUH
7KH FRQQHFWLYLW\ RI WKH JUDSK LV XVXDOO\ YHU\ KLJK )RU
WKLVUHDVRQZH PDNHXVHRIDFRQQHFWLYLW\WKUHVKROGDLP
LQJ WR UHGXFH WKH HGJHV RI WKH JUDSK ,Q RXU H[DPSOH LQ
)LJXUHLI WKHWKUHVKROGHTXDOVWKHHGJH,QRUJDQLF
&KHPLVWU\%LRFKHPLVWU\LVGURSSHG
7KH FOXVWHU PLQLQJ PHWKRG ZDV LQWURGXFHG LQ WKH
3DJH*DWKHU V\VWHP 3HUNRZLW]  (W]LRQL  7KH DO
JRULWKPWKDWZHXVHGLIIHUVLQWZRZD\VIURPWKHRULJLQDO
DOJRULWKPD3DJH*DWKHUGRHVQRWQRUPDOL]HWKHZHLJKWV
:
DQGELWUHVWULFWVLWVVHDUFKWRFOLTXHVRIVL]HNDQGWR
FRQQHFWHGFRPSRQHQWV'HVSLWHWKHODUJHWKHRUHWLFDOFRP
SOH[LW\RIWKHFOLTXHILQGLQJSUREOHPLQSUDFWLFHWKHDOJR
ULWKP WKDW ZH LPSOHPHQWHG %URQ  .HUERVFK  LV
IDVW
7KHHIILFLHQF\RIWKHDOJRULWKPDOORZHGDIXOOLQYHV
WLJDWLRQRIWKHHIIHFWRIWKHFRQQHFWLYLW\WKUHVKROG
[Figure: weighted graph with the vertices Organic Chemistry, Inorganic Chemistry, Polymers, Industrial Chemistry and Biochemistry]
Figure: Normalized graph for cluster mining.
,QFRQWUDVWWRWKHRWKHUFOXVWHULQJPHWKRGVVXFKDV$XWR
FODVV DQG 620 WKH FOXVWHUV JHQHUDWHG E\ FOXVWHU PLQLQJ
JURXSWRJHWKHUFKDUDFWHULVWLFIHDWXUHVRIWKHXVHUVGLUHFWO\
(DFKFOLTXHGLVFRYHUHGE\FOXVWHUPLQLQJLVDOUHDG\DEH
KDYLRUDO SDWWHUQ 7KHUHIRUH WKHUH LV QR QHHG WR SRVW
SURFHVVWKHFOXVWHUVWRFRQVWUXFWGHVFULSWLYHPRGHOV
'HVFULSWLRQRIWKH8VDJH'DWD
)RU WKLV H[SHULPHQW ZH XVHG WKH DFFHVV ORJV RI WKH VLWH
³,QIRUPDWLRQ 5HWULHYDO LQ &KHPLVWU\´
KWWSPDFHGRQLD
FKHPGHPRNULWRVJU
 ZKLFK FRQVLVWV RI D IHZ WKRXVDQG
SDJHV ZLWK D KLJK KLW UDWH 7KH ORJ ILOHV FRQVLVWHG RI
 :HEVHUYHU FDOOV ORJ ILOH HQWULHV DQG FRYHUHG
WKH SHULRG EHWZHHQ -DQXDU\ DQG $XJXVW  (DFK ORJ
HQWU\ UHFRUGHG DFFHVV GDWH DQG WLPH WKH YLVLWRU¶V ,3 DG
GUHVVDQGGRPDLQQDPHDQGWKHWDUJHWSDJH85/
,QRUGHUWRFRQVWUXFWDWUDLQLQJVHWIRUWKHFOXVWHULQJDOJR
ULWKPVWKHGDWDLQWKHORJILOHVSDVVHGWKURXJKWZRVWDJHV
RI SUHSURFHVVLQJ )LUVW ZH H[WUDFWHG DFFHVV VHVVLRQVDQG
WKHQ ZH WUDQVODWHG WKH SDWKV UHFRUGHG LQ WKH DFFHVV VHV
VLRQVLQWRDWWULEXWHYHFWRUV
$FFHVV VHVVLRQV ZHUH H[WUDFWHG IURP ORJ ILOHV XVLQJ WKH
IROORZLQJSURFHGXUH
 *URXSLQJWKHORJVE\GDWHDQG,3DGGUHVV
 6HOHFWLQJDWLPHIUDPHZLWKLQZKLFKWZRKLWVIURPWKH
VDPH ,3 DGGUHVV FDQ EH FRQVLGHUHG WR EHORQJ LQ WKH
VDPHDFFHVVVHVVLRQ
 *URXSLQJ WKH SDJHV DFFHVVHG E\ WKH VDPH ,3 DGGUHVV
ZLWKLQWKHVHOHFWHGWLPHIUDPHWRIRUPDVHVVLRQ
,Q RUGHU WR VHOHFW WKH DSSURSULDWH WLPHIUDPH ZH JHQHU
DWHG WKH IUHTXHQF\ GLVWULEXWLRQ RI WKH SDJH WUDQVLWLRQV LQ
PLQXWHV $FFRUGLQJ WR WKLV GLVWULEXWLRQ WUDQVLWLRQV IURP
RQHSDJHWRDQRWKHUPDGHZLWKDWLPHLQWHUYDOORQJHUWKDQ
RQHKRXUKDGYHU\ORZIUHTXHQF\7KXVDVHQVLEOHGHILQL
WLRQRIWKHDFFHVVVHVVLRQLVDVHTXHQFHRISDJHWUDQVLWLRQV
IRUWKHVDPH,3DGGUHVVZKHUHHDFKWUDQVLWLRQLVGRQHDWD
WLPHLQWHUYDOVPDOOHUWKDQRQHKRXU%DVHGRQWKLVGHILQL
WLRQRXUORJILOHVFRQVLVWHGRIDFFHVVVHVVLRQV
&RQFHUQLQJ WKH WUDQVODWLRQ RI DFFHVV VHVVLRQV WR DWWULEXWH
YHFWRUV ZH H[DPLQHG WZR DOWHUQDWLYH DSSURDFKHV ,Q WKH
ILUVWDSSURDFKHDFKDWWULEXWHLQWKHYHFWRUUHSUHVHQWHGWKH
SUHVHQFH RI D SDUWLFXODU SDJH RI WKH :HE VLWH LQ WKH VHV
VLRQ,QWKHVHFRQGDSSURDFKZHXVHGWUDQVLWLRQVEHWZHHQ
SDJHVUDWKHUWKDQLQGLYLGXDOSDJHVDVWKHEDVLFSDWKFRP
SRQHQWV ,Q ERWK FDVHV WKH DWWULEXWH YHFWRU FRQVLVWHG RI
ERROHDQ IHDWXUHV UHSUHVHQWLQJ ZKHWKHU DQ DWWULEXWH D
SDJHRUDWUDQVLWLRQZDVSUHVHQWLQDVHVVLRQRUQRW
7KHUH ZHUH  SDJHV LQ WKH VLWH WKDW ZHUH YLVLWHG DW
OHDVW RQFH &OHDUO\ WKH QXPEHU RI DOO SRVVLEOH WUDQVLWLRQV
EHWZHHQWKHVHSDJHVLVSURKLELWLYHO\ODUJH(YHQWKHQXP
EHU RI GLIIHUHQW WUDQVLWLRQV WKDW DSSHDU LQ WKH ORJ ILOHV LV
YHU\ODUJH7KXVZHQHHGHGDPHWKRGWRUHGXFHWKHQXP
EHURI DWWULEXWHV LQERWK H[SHULPHQWV 7KLVUHGXFWLRQ ZDV
DFKLHYHGE\ H[DPLQLQJ WKHIUHTXHQF\ GLVWULEXWLRQV RIWKH
SDJHV DQG WKH WUDQVLWLRQV IURP RQH SDJH WR DQRWKHU 7KH
WZR GLVWULEXWLRQV ZHUH KLJKO\ VNHZHG LH WKHUH ZDV D
VPDOO QXPEHU RI YHU\ IUHTXHQW SDJHV DQG WUDQVLWLRQV
7KXVZH GHFLGHG RQD FXWRIIIUHTXHQF\ RIIRU SDJHV
DQG  IRU WUDQVLWLRQV ZKLFK ZHUH WKH SRLQWV ZKHUH WKH
FRUUHVSRQGLQJ GLVWULEXWLRQV ZHUH EHFRPLQJ IODW $GGL
WLRQDOO\ ZH UHPRYHG DOO WUDQVLWLRQV IURP D SDJH WR LWVHOI
$V D UHVXOW  SDJHV DQG  WUDQVLWLRQV VXUYLYHG WKLV
VHOHFWLRQDQGZHUHXVHGWRIRUPDWWULEXWHYHFWRUV:HDOVR
WULHGDPHWKRGWKDWXVHV0XWXDO,QIRUPDWLRQDVDFULWHULRQ
IRUVHOHFWLQJDWWULEXWHVIRUXQVXSHUYLVHGOHDUQLQJ6DKDPL
 0RUH WKDQ  RI WKH DWWULEXWHV VHOHFWHG E\ WKLV
PHWKRG ZHUH ZLWKLQ WKH KLJKIUHTXHQF\ UDQJH WKDW ZH
VHOHFWHG+RZHYHUVRPHRIWKHDWWULEXWHVWKDWZHUHHOLPL
QDWHG ZHUH FOHDUO\ LPSRUWDQW HJ SDJHV FRYHULQJ PDMRU
UHVHDUFK DUHDV RI FKHPLVWU\ VXFK DV SKDUPDFHXWLFDO
FKHPLVWU\ )RU WKLV UHDVRQ ZH SUHIHUUHG WKH VLPSOH IUH
TXHQF\WKUHVKROGDSSURDFK
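
The two encodings and the frequency cut-off can be sketched as follows. In this sketch the frequency of an attribute is taken to be the number of sessions that contain it, which is an assumption; the function names, the cut-off value and the toy sessions are hypothetical.

```python
from collections import Counter


def page_attributes(session):
    """Page-based encoding: the set of pages visited in the session."""
    return set(session)


def transition_attributes(session):
    """Transition-based encoding: consecutive page pairs, without self-transitions."""
    return {(a, b) for a, b in zip(session, session[1:]) if a != b}


def encode(sessions, extract, min_count):
    """Return the surviving attributes and one boolean vector per session."""
    counts = Counter()
    for session in sessions:
        counts.update(extract(session))
    kept = sorted(a for a, c in counts.items() if c >= min_count)
    vectors = [[a in extract(session) for a in kept] for session in sessions]
    return kept, vectors


# Hypothetical sessions, each an ordered list of visited pages.
sessions = [
    ["index", "organic", "polymers"],
    ["index", "organic"],
    ["index", "index", "biochemistry"],
]
print(encode(sessions, page_attributes, min_count=2))
print(encode(sessions, transition_attributes, min_count=2))
```

The same `encode` function serves both representations; only the attribute extractor changes, which keeps the two experiments directly comparable.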
7KHWKUHHFOXVWHULQJ PHWKRGVZHUHDSSOLHG WRERWKUHSUH
VHQWDWLRQV RI WKH GDWD ,Q WKH ILUVW UHSUHVHQWDWLRQ WKH UH
VXOWLQJPRGHOIRUDFRPPXQLW\LVDJURXSRISDJHVZKLFK
DUHSRSXODUIRUXVHUVZLWKLQWKHFRPPXQLW\,QWKHVHFRQG
UHSUHVHQWDWLRQ HDFK FRPPXQLW\ PRGHO LV D VHW RI SDJH
WUDQVLWLRQV7KH SDJHEDVHGUHSUHVHQWDWLRQ SURYLGHVVWDWLF
PRGHOVRI XVHU LQWHUHVWVVLPLODU WR WKHXVHU SURILOHVXVHG
LQFROODERUDWLYHILOWHULQJ)RULQVWDQFHRQHFRPPXQLW\RI
FKHPLVWVPD\EHIRXQGWREHLQWHUHVWHGLQRUJDQLFFKHPLV
WU\ SRO\PHUV DQG ELRFKHPLVWU\ 2Q WKH RWKHU KDQG WKH
WUDQVLWLRQEDVHG UHSUHVHQWDWLRQ SURYLGHV QDYLJDWLRQDO
PRGHOV ZKLFK VKRZ WKH SDWKVWKURXJK WKH VLWHWKDW XVHUV
XVXDOO\ IROORZ )RU LQVWDQFH RQH FRPPXQLW\ PD\ VWDUW
IURP WKH ,QGH[ SDJH WKHQ PRYH WRD KLJKOHYHO FDWHJRU\
DQGWKHQ QDYLJDWHKRUL]RQWDOO\ WKURXJKWKHWKHPDWLF FDWH
JRULHV%RWKW\SHVRIPRGHODUHRILQWHUHVWLQDQLQIRUPD
WLRQ UHWULHYDO VLWH OLNH WKH RQH ZH DUH H[DPLQLQJ ,Q WKLV
UHVSHFW WKH WZR UHSUHVHQWDWLRQV FDQ EH VHHQ WR SURYLGH
FRPSOHPHQWDU\NQRZOHGJH
(YDOXDWLRQ5HVXOWV
(DFKRIWKHWKUHHFOXVWHULQJPHWKRGVSUHVHQWHGLQ6HFWLRQ
 ZHUH DSSOLHG WR WKHXVDJH GDWD IURPWKH VLWH ³,QIRUPD
WLRQ 5HWULHYDO LQ &KHPLVWU\´ XVLQJ ERWK W\SHV RI GDWD
UHSUHVHQWDWLRQ 7KH UHVXOWV LQ WKLV VHFWLRQ H[DPLQH WKH
EHKDYLRURIWKHWKUHHPHWKRGVLQWHUPVRIWKHIRXUFULWHULD
LQWURGXFHGLQ6HFWLRQQXPEHURIFRPPXQLWLHVDYHUDJH
VL]H RI FRPPXQLW\ PRGHOV GLVWLQFWLYHQHVV DQG FRYHUDJH
6XEVHFWLRQVDQGSUHVHQWWKHUHVXOWV6XEVHFWLRQ
H[SODLQVKRZ WKHVH UHVXOWVFDQ EH XVHGWR FKRRVHWKH GH
VLUHG VHW RI FRPPXQLW\ PRGHOV DQG 6XEVHFWLRQ  LOOXV
WUDWHVWKHXVHRIWKHPRGHOVE\WKHVLWHDGPLQLVWUDWRU
1XPEHUDQG6L]HRI&RPPXQLW\0RGHOV
$XWRFODVVGHFLGHVDXWRPDWLFDOO\RQWKHQXPEHURIFOXVWHUV
WKDW LW FUHDWHV GXULQJ WKH PRGHOOHYHO VHDUFK )RU WKH
SDJHEDVHG UHSUHVHQWDWLRQ $XWRFODVV FUHDWHG  FRPPX
QLWLHVZKLOHIRUWKHWUDQVLWLRQEDVHGUHSUHVHQWDWLRQRQO\
%RWKQXPEHUVDUHUHDVRQDEO\ORZEXWRQHQHHGVWRFRP
ELQHWKHVHZLWKWKHDYHUDJHVL]HRIFRPPXQLW\PRGHOVLQ
RUGHUWRMXGJH ZKHWKHUWKHUHVXOWV DUHXVDEOH7KHFKRLFH
RILQIOXHQFHWKUHVKROGKDVDGUDPDWLFHIIHFWRQPRGHOVL]H
%RWKLQ WKH SDJHEDVHGDQGLQ WKHWUDQVLWLRQEDVHGUHSUH
VHQWDWLRQV WKHUH LV D ODUJH QXPEHU RI DWWULEXWHV LQ HDFK
PRGHOWKDWKDYH YHU\ORZLQIOXHQFHYDOXHV$VDUHVXOWDW
WKUHVKROGOHYHORIWKHUHDUHMXVWDERYHDWWULEXWHVSHU
PRGHOZKLFKFRUUHVSRQGVWRDERXWRIWKHDWWULEXWHVHW
)RUWKHWUDQVLWLRQEDVHGUHSUHVHQWDWLRQZKHUHWKHQXPEHU
RIPRGHOVLVVPDOOWKLVVKDUSIDOOLVERXQGWRKDYHDVLJ
QLILFDQWHIIHFWRQFRYHUDJH
)RUWKH620PHWKRGDVPHQWLRQHGLQ6HFWLRQZHKDG
WRIL[WKH QXPEHURIFRPPXQLWLHV)RUERWKW\SHVRIUHS
UHVHQWDWLRQ ZH DVNHG IRU  FRPPXQLWLHV WR EH FRQ
VWUXFWHGFKRVHQWREHVOLJKWO\KLJKHUWKDQWKHODUJHUQXP
EHU RI FRPPXQLWLHV JHQHUDWHG E\ $XWRFODVV 6LPLODU WR
$XWRFODVV WKH QXPEHU RI LQIOXHQWLDO DWWULEXWHV LQ HDFK
FRPPXQLW\LVYHU\VPDOO6HWWLQJWKHWKUHVKROGWRWKH
DYHUDJHVL]HRIWKHFRPPXQLW\PRGHOVLVOHVVWKDQ
,Q WKH FOXVWHU PLQLQJ DOJRULWKP WKH JUDSKFRQQHFWLYLW\
WKUHVKROG DIIHFWV WKH QXPEHU DV ZHOO DV WKH VL]H RI WKH
FRPPXQLW\PRGHOV)RUVPDOOYDOXHVRIWKHWKUHVKROGWKH
JUDSKLVKLJKO\FRQQHFWHGDQGFRQWDLQVPDQ\FOLTXHV,WLV
UHDOO\ DERYH WKH WKUHVKROG YDOXH  WKDW WKH QXPEHU RI
FOLTXHVGURSVWRPDQDJHDEOHOHYHOVEHORZ7KHHIIHFW
LVDOPRVWLGHQWLFDOIRUERWKW\SHVRIUHSUHVHQWDWLRQ$WWKH
VDPH WLPH WKH DYHUDJH VL]H RI WKH JHQHUDWHG PRGHOV LV
YHU\ VPDOO )RU WKUHVKROG YDOXHV DERYH  DOPRVW DOO
FOLTXHVDUHSDLUVRISDJHVRUWUDQVLWLRQV7KHUHVXOWRIWKLV
SKHQRPHQRQ LV WKDW WKH DVVRFLDWLRQV IRXQG EHWZHHQ WKH
DWWULEXWHV DUH UHDOO\ FRRFFXUUHQFH SDWWHUQV UDWKHU WKDQ
VXEVWDQWLDOSDJHJURXSVRUWUDQVLWLRQVHTXHQFHV
'LVWLQFWLYHQHVVDQG&RYHUDJHRIWKH0RGHOV
$VH[SODLQHGLQ6HFWLRQDPHDVXUDEOHLQGLFDWLRQWKDWD
VHWRIFRPPXQLW\PRGHOVLVLQWHUHVWLQJFDQEHREWDLQHGE\
RSWLPL]LQJ WKH GLVWLQFWLYHQHVV DQG WKH FRYHUDJH RI WKH
PRGHOV 7KH FRYHUDJH PHDVXUH FDSWXUHV DOVR WKH FRP
ELQHGHIIHFWRQWKHQXPEHUDQGWKHVL]HRIWKHPRGHOV)RU
LQVWDQFHORZFRYHUDJHLVXVXDOO\REVHUYHGZKHQERWKWKH
QXPEHU DQG WKH VL]H RI WKH PRGHOV DUH VPDOO 6LQFH ZH
DUHLQWHUHVWHGLQWKHFRPELQHGRSWLPL]DWLRQRIGLVWLQFWLYH
QHVV DQG FRYHUDJH ZH ZRXOG OLNH WR SUHVHQW WKH UHVXOWV
DORQJ WKRVH WZR GLPHQVLRQV LQ D FRPELQHG PDQQHU $
JRRGFKRLFHIRUVXFKDSUHVHQWDWLRQLVWKHXVHRI5HFHLYHU
2SHUDWLQJ &KDUDFWHULVWLF 52& FXUYHV 52& FXUYHV DUH
FRPPRQO\ XVHG IRU FRVWVHQVLWLYH FODVVLILFDWLRQ WDVNV
VXFKDVPHGLFDOGLDJQRVLVLQRUGHUWRSUHVHQWWKHWUDGHRII
EHWZHHQ WZR W\SHV RI HUURU HJ RYHUGLDJQRVLV DQG XQ
GHUGLDJQRVLV7KHWZRW\SHVRIHUURUDUHPHDVXUHGE\WZR
FRUUHVSRQGLQJPHDVXUHVFDOOHGVHQVLWLYLW\DQGVSHFLILFLW\
$ 52& FXUYH LV D SORW RI VHQVLWLYLW\ DJDLQVW 
VSHFLILFLW\ $GDSWLQJ WKLV LGHD WR RXU PHDVXUHV ZH SORW
FRYHUDJHDJDLQVWGLVWLQFWLYHQHVV:HQDPHWKLVW\SHRI
SORW D WUDGHRII FXUYH LQ RUGHU WR DYRLG FRQIXVLRQ ZLWK
52& FXUYHV VLQFH ZH DUH QRW PHDVXULQJ VHQVLWLYLW\ DQG
VSHFLILFLW\7KHUHVXOWVWKDWZHREWDLQHGIRUWKHWZRW\SHV
RIGDWDUHSUHVHQWDWLRQDUHVKRZQLQ)LJXUHVDQG
(DFK FXUYH LQ WKH WZR ILJXUHV LV JHQHUDWHG E\ PHDVXULQJ
FRYHUDJH DQG GLVWLQFWLYHQHVV IRU GLIIHUHQW YDOXHV RI WKH
LQIOXHQFH DQG WKH FRQQHFWLYLW\ WKUHVKROGV ,Q WKH H[SHUL
PHQWVZHYDULHGWKHWKUHVKROGYDOXHVIURPWRZLWKDQ
LQWHUYDORI$ODUJHSURSRUWLRQRIWKHUHVXOWVOLHLQWKH
ORZ FRYHUDJH ± KLJK GLVWLQFWLRQ DUHD EHFDXVH WKH FRP
PXQLW\ PRGHOV DUH XVXDOO\ VPDOO IRU WKUHVKROG YDOXHV
DERYH  DV PHQWLRQHG LQ 6XEVHFWLRQ  7KLV LV WKH
UHDVRQ IRU WKH ³QRLV\´ ORRN RI VRPH RI WKH FXUYHV DW WKH
ORZHUOHIWVLGHRIWKHJUDSK
6LPLODUWRWKH52&FXUYHVWKHRSWLPDOSRVLWLRQLVWKHWRS
OHIWFRUQHUZKHUHFRYHUDJHDQGGLVWLQFWLYHQHVVUHDFKWKHLU
PD[LPXP YDOXHV 7KXV WKH VXUIDFH XQGHUQHDWK HDFK
FXUYH LV DQ LQGLFDWLRQ RI WKH RYHUDOO SHUIRUPDQFH RI WKH
PHWKRG LQ WKH H[SHULPHQW ,Q WKLV UHVSHFW $XWRFODVV LV
GRLQJZHOOLQERWKUHSUHVHQWDWLRQV,QWKHSDJHEDVHGUHS
UHVHQWDWLRQ $XWRFODVV RXWSHUIRUPV FOXVWHU PLQLQJ RQO\
IRUKLJKOHYHOVRIFRYHUDJH+RZHYHUGXHWRWKHGHFUHDVH
LQ GLVWLQFWLYHQHVV WKH PRGHOV EHFRPH WRR ODUJH WR EH RI
XVH WR WKH VLWH DGPLQLVWUDWRU )RU WKLVUHDVRQ ZH QHHGWR
WUDGHRIIFRYHUDJHIRUGLVWLQFWLYHQHVVPRYLQJWRDUHDVRI
WKH JUDSK ZKHUH WKH WZR PHWKRGV EHFRPH FRPSDUDEOH
620LVFOHDUO\GRLQJZRUVHWKDQWKHRWKHUPHWKRGVXVLQJ
WKLV UHSUHVHQWDWLRQ ,WV SHUIRUPDQFH LQGLFDWHV WKDW WKH
QXPEHURIFOXVWHUVWKDWZHFKRVHLVWRRODUJHUHVXOW
LJXUH7UDGHRIIFXUYHVIRUWKHSDJHEDVHGUHSUHVHQWDWLRQ
LJXUH7UDGHRIIFXUYHVIRUWKHWUDQVLWLRQEDVHGUHSUHVHQWDWLRQ
LQJLQORZGLVWLQFWLYHQHVVHYHQZKHQWKHFRYHUDJHLVORZ
,Q WKH WUDQVLWLRQEDVHG UHSUHVHQWDWLRQ $XWRFODVV FOHDUO\
DFKLHYHV WKH EHVW SHUIRUPDQFH 7KH EHKDYLRU RI 620 LV
VLPLODUWRWKDWRI$XWRFODVVGHVSLWHWKHPXFKODUJHUQXP
EHURIFOXVWHUVWKDW620JHQHUDWHVLQVWHDGRI
&KRRVLQJDVHWRI&RPPXQLW\0RGHOV
7KH UHVXOWV RI WKH H[SHULPHQWV SUHVHQWHG DERYH FDQ KHOS
XV FKRRVH D WKUHVKROG YDOXH IRU HDFK PHWKRG DQG HDFK
GDWD HQFRGLQJ 6LQFH ZH DUH LQWHUHVWHGLQ D KLJKOHYHO RI
FRYHUDJH ZH VKRXOG UHMHFW KLJK WKUHVKROG YDOXHV DERYH
 WKDW UHGXFH FRYHUDJH GUDVWLFDOO\ 2Q WKH RWKHU KDQG
VHWWLQJ WKH WKUHVKROG YDOXH WRR ORZ LV OLNHO\ WR LQFUHDVH
VLJQLILFDQWO\ WKH VL]H RI WKH PRGHOV DQG GHFUHDVH WKHLU
GLVWLQFWLYHQHVV7DEOHSUHVHQWVRXUSUHIHUUHGFKRLFHVIRU
WKHWKUHHPHWKRGVDQGIRUWKHWZRW\SHVRIUHSUHVHQWDWLRQ
7KH FKRLFH RI ORZ WKUHVKROG YDOXHV  IRU $XWRFODVV
DQG620LQWKHWUDQVLWLRQEDVHGUHSUHVHQWDWLRQ7LVGXH
WR WKH KLJK OHYHO RI GLVWLQFWLYHQHVV WKDW WKH WZR PHWKRGV
DFKLHYH2QWKHRWKHUKDQGWKHFKRLFHRIDORZWKUHVKROG
YDOXHIRU620LQWKHSDJHEDVHGUHSUHVHQWDWLRQ3
LVGLFWDWHGE\WKHVKDUSIDOOLQFRYHUDJHIRUKLJKHUWKUHVK
ROGYDOXHV7KHUHVXOWVLQ7DEOHDUHLQDFFRUGDQFHWRWKH
REVHUYDWLRQVPDGH LQ6XEVHFWLRQ $XWRFODVVVHHPV WR
KDYH WKH EHVW RYHUDOO SHUIRUPDQFH ZKLOH 620 IROORZV
FORVHO\ZKHQXVLQJWKHWUDQVLWLRQEDVHGUHSUHVHQWDWLRQ
7DEOH0RGHOVHWSURSHUWLHVIRUWKHVHOHFWHGWKUHVKROGV

     
     
     
     
     
     
8VLQJ&RPPXQLW\0RGHOV
$OWKRXJK WKH VHOHFWLRQ RI FRPPXQLW\ PRGHOV JRHV VRPH
ZD\WRZDUGVGHOLYHULQJXVHIXONQRZOHGJHDERXWWKHXVDJH
RI WKH V\VWHP WKH SUHVHQWDWLRQ RI WKH PRGHOV WR WKH VLWH
DGPLQLVWUDWRULQDGLJHVWLEOHIRUPDWUHTXLUHVIXUWKHUHIIRUW
,W LV QRW LQ WKH VFRSH RI WKLV SDSHU WR DGGUHVV WKLV LVVXH
IXOO\EXWZH SURYLGHKHUHVRPH LQGLFDWLRQRIWKH W\SHRI
IHHGEDFN WKDWZH KDYH UHFHLYHGIURP WKH VLWHDGPLQLVWUD
WRUVZKR DUH DOVR&KHPLVWU\ VFLHQWLVWV,Q RUGHUWRLQWUR
GXFH RXU UHVXOWV WR WKH DGPLQLVWUDWRUV ZH KDYH VHOHFWHG
WKH ³VWURQJHVW´ FRPPXQLW\ PRGHO IRU HDFK PHWKRG DQG
HDFK W\SH RI UHSUHVHQWDWLRQ :H GHILQH WKH VWURQJHVW
PRGHODVWKHODUJHVWPRGHOWKDWVXUYLYHVIRUODUJHWKUHVK
ROG YDOXHV 7DEOH  SUHVHQWV RXU FKRLFHV 3DJHEDVHG
PRGHOVDUHSUHVHQWHGDVOLVWVRISDJHQDPHVVHSDUDWHGE\
VHPLFRORQV 7UDQVLWLRQEDVHG PRGHOV DUH SUHVHQWHG DV
WUDQVLWLRQVHTXHQFHVFRQQHFWHGE\DQDUURZ
Table [...]. The strongest community models discovered by the three algorithms.

Atmospheric Chem Databases
Pharmaceutical Chem Databases
Medicinal Chem Databases
Chemistry Overview Databases
Hellas Internet Overview

Computational Chemistry Journals
Chemistry Overview Databases
Thermochemistry

Engineering; Environmental Sciences;
Crystallography; Other Topics

Chemistry -> related -> Internet
demok -> akno
Chemistry -> Internet -> WWW
Chemistry -> Biochemistry

Internet -> Institute
Akno -> stats -> stats_all -> awards
7KH DGPLQLVWUDWRUV KDYH LGHQWLILHG SDWWHUQV WKDW ZHUH H[
SHFWHG DV ZHOO DV LQWHUHVWLQJ ³VXUSULVHV´$Q H[DPSOH RI
WKH IRUPHU W\SH LV WKH SDJHEDVHG PRGHO IRU $XWRFODVV
ZKLFK FRUUHVSRQGV WR D JURXS RI &KHPLVWU\ VFLHQWLVWV
ZKRNQRZKRZWRXVHWKHVLWH7KHLUVSHFLDOWLHVDUHTXLWH
WHFKQLFDO MXVWLI\LQJ IDPLOLDULW\ ZLWK ,QWHUQHWEDVHG VHU
YLFHV2QWKHRWKHUKDQGWKHSDJHEDVHGPRGHOLGHQWLILHG
E\FOXVWHUPLQLQJ ZDVDVXUSULVHWKDWKDVFDXVHGWKRXJKW
7KH H[SODQDWLRQ WKDW ZDV JLYHQ WR WKLV SDWWHUQ ZDV WKDW
VRPHILHOGVVXFKDVµ
(QYLURQPHQWDO6FLHQFHV¶DUHQRWFRY
HUHG VXIILFLHQWO\ IRU WKH HQJLQHHUV LQ WKH ILHOG FDXVLQJ
WKHP WR QDYLJDWH WR PRUH JHQHUDOWKHPH SDJHV VXFK DV
µ
(QJLQHHULQJ¶DQGµ2WKHU7RSLFV¶7KLVLVVXHLVZRUWKIXUWKHU
FRQVLGHUDWLRQDQGFRXOGFDXVHDFKDQJHLQWKHVLWH
Conclusions
Unsupervised machine learning methods seem to be a good choice for extracting useful knowledge from usage data on a Web site. We have looked at three clustering methods, two of which are popular in machine learning research and have already been used successfully in practice. We have included the clustering methods in a methodology, which consists of preprocessing the usage data in two different ways, constructing the communities using clustering and, most importantly, extracting community models. Using four evaluation criteria and data from a large Web site, we have examined the behavior of the three methods and have shown how the four criteria can be used to select a set of models. In the site that we looked at, Autoclass seemed to outperform the other methods, but further study with data from different sites is necessary in order to draw a general conclusion.
,Q DGGLWLRQ WR IXUWKHU HPSLULFDO HYDOXDWLRQ ZH DUH LQWHU
HVWHG LQ DVVRFLDWLQJ WKLV PHWKRGRORJ\ WR GRPDLQ NQRZO
HGJHDERXWWKH VWUXFWXUHDQGWKH FRQWHQWRI:HEVLWHV,Q
SDUWLFXODUZHDUHLQWHUHVWHGLQSURYLGLQJJXLGHOLQHVDERXW
WKHXVHRIWKHFRPPXQLW\PRGHOVLQGLIIHUHQWW\SHVRIVLWH
$GGLWLRQDOO\ ZH DUH ORRNLQJ LQWR WKH V\VWHPDWL]DWLRQ RI
WKH SDUWV LQ RXU DSSURDFK ZKLFK VWLOO UHTXLUH PDQXDO LQ
WHUYHQWLRQHJDWWULEXWHVHOHFWLRQDQGWKUHVKROGVHOHFWLRQ
)LQDOO\ WKH LPSRUWDQW LVVXH RI SUHVHQWLQJ WKH PRGHOV WR
WKH DGPLQLVWUDWRU UHPDLQV RSHQ 9LVXDOL]DWLRQ WHFKQLTXHV
ZLOOEHRIXVHLQWKLVSUREOHP
Acknowledgements
:HZRXOG OLNH WRWKDQN WKHPHPEHUV RIWKHWHDP ,QIRU
PDWLRQ 5HWULHYDO LQ &KHPLVWU\ ( 9DUYHUL $ 9DUYHULV
DQG37HORQLVIRUSURYLGLQJWKHGDWDDQG37]LW]LUDVIRU
KLVKHOSZLWKWKHH[SHULPHQWV