(2016). De-identification in learning analytics. Journal of Learning Analytics, 3(1), 129–138. http://dx.doi.org/10.18608/jla.2016.31.8

ISSN 1929-7750 (online). The Journal of Learning Analytics works under a Creative Commons License, Attribution - NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0)

De-Identification in Learning Analytics
Mohammad Khalil and Martin Ebner
Educational Technology
Graz University of Technology, Austria
mohammad.khalil@tugraz.at

ABSTRACT: Learning analytics has reserved its position as an important field in the educational sector. However, the large-scale collection, processing, and analyzing of data has steered the wheel beyond the borders to face an abundance of ethical breaches and constraints. Revealing learners' personal information and attitudes, as well as their activities, are major aspects that lead to identifying individuals personally. Yet, de-identification can keep the process of learning analytics in progress while reducing the risk of inadvertent disclosure of learners' identities. In this paper, the authors discuss de-identification methods in the context of the learning environment and propose a first prototype conceptual approach that describes the combination of anonymization strategies and learning analytics techniques.

Keywords: Learning analytics, anonymization, de-identification, ethics, privacy

1 INTRODUCTION
Learning analytics is an active area of the research field of online education and Technology Enhanced Learning (TEL). It applies analysis techniques to the education data stream in order to achieve several objectives. These objectives mainly aim to intervene and predict learners' performance in pursuance of enhancing the learning context and its environment. Higher Education (HE) and online course institutions are looking at learning analytics with an interest in improving retention and decreasing the total dropout rate (Slade & Galpin, 2012). However, ethical issues emerge while applying learning analytics in educational data sets (Greller & Drachsler, 2012). At the first International Conference on Learning Analytics and Knowledge (LAK '11), held in Banff, Alberta, Canada in 2011, participants agreed that learning analytics raises issues relevant to ethics and privacy and "it could be construed as eavesdropping" (Brown, 2011). The massive data collection and analysis of these educational data sets can lead to questions related to ownership, transparency, and privacy of data. These issues are not unique to the education sector only, but can be found in the human resource management and health sectors (Cooper, 2009). At its key level, learning analytics involves tracking students' steps in learning environments, such as videos of MOOCs (Wachtler, Khalil, Taraghi & Ebner, 2016), in the interest of identifying who are the students "at risk," or to help students with decisions about their futures. Nevertheless, tracking interactions of students could unveil critical issues regarding their privacy and their identities (Boyd, 2008).

Ethical issues for learning analytics fall into different categories. We mainly summarize them as the following (Khalil & Ebner, 2015b): 1) transparency of data collection, usage, and involvement of third parties; 2) anonymization and de-identification of individuals; 3) ownership of data; 4) data accessibility and accuracy of the analyzed results; 5) security of the examined data sets and student records from any threat. These criteria point to the widely based security model CIA, which stands for Confidentiality, Integrity from alteration, and Availability for authorized parties.

The learning analytics community needs to deal carefully with the potential privacy issues while analyzing student data. Educational data analysis techniques can reveal personal information, attitudes, and activities related to learners (Bienkowski, Feng, & Means, 2012). However, there has been limited research, and there are still numerous unanswered questions related to privacy, personal information, and other ethical issues in the context of learning analytics (Bienkowski, Feng, & Means, 2012; Greller & Drachsler, 2012; Slade & Galpin, 2012; Slade & Prinsloo, 2013). For example, some educators claim that educational institutions are using applications that collect sensitive data about students without sufficiently respecting data privacy and how the data will eventually be used (Singer, 2014). Thus, data degradation (Anciaux et al., 2008), de-identification methods, or deletion of specific data records, may be required as a solution to preserve learners' information. In this paper, we will mainly focus our discussion on the de-identification process in the learning analytics atmosphere and afford a first prototype conceptual approach that combines learning environment, de-identification techniques, and learning analytics.

The paper is organized as follows: Section 2 covers de-identification in general and the current laws associated with education, as well as the drivers linked with learning analytics. In Section 3, we propose the de-identification–learning analytics approach. Section 4 discusses the limitations of the de-identification process in learning analytics, and Section 5 concludes the paper.

2 BACKGROUND
2.1 Personal Information and De-Identification
Personal information is any information that can identify an individual. In fields such as the health sector, it is named Personal Health Information or PHI. In other fields, such as the education sector, this information is named Personal Identifiable Information or PII. The National Institute of Standards and Technology (NIST) defines PII as "any information about an individual maintained by an agency, including 1) any information that can be used to distinguish or trace an individual's identity, such as name, social security number, date and place of birth, mother's maiden name, or biometric records; and 2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information" (McCallister, Grance, & Scarfone, 2010). The personal information of learners can be categorized into details such as name, sex, photograph, date of birth, age, address, religion, marital status, e-mail address, insurance number, ethnicity, et cetera, or educational details such as qualifications, courses attended, degrees, and study records. A leak of individuals' personal information can lead to misuse of data, embarrassment, and loss of reputation. However, organizations may be required to publish details extracted from personal information. For instance, some educational institutions are required to provide statistics about student progress; likewise, health organizations may need to report special cases from their patient records, such as communicable diseases. As a result, de-identification helps organizations to protect privacy while still informing the public. The de-identification process is used to prevent revealing individual identity and to keep the PII confidential.

In learning analytics, it is common for stakeholders to request additional information about the results extracted from educational data sets. Educational data mining and learning analytics mainly aim to enhance the learning environment and empower learners and instructors (Greller & Drachsler, 2012). Therefore, the analysis of these data may have interesting trends that could lead to further and deeper analysis by other institutions or researchers. Requests for more extensive analysis may involve the use of student-level data. Accordingly, ethical issues arise, such as privacy disclosure, and the need to de-identify the data becomes paramount.

Recently, Harvard and MIT universities released de-identified data from 16 courses offered in 2012–2013 from their well-known edX Massive Open Online Course (MOOC) (MIT News, 2014). The Harvard and MIT edX ensures that the anonymity of the released data complies with the Family Educational Rights and Privacy Act (FERPA).¹ Furthermore, Prinsloo and Slade (2015) suggested different approaches that inform students in higher education of the implications of learning analytics on their private data.

2.2 De-Identification Legislation
De-identification of student records has been regulated in the United States and the European Union. The United States adopted FERPA regarding the privacy of student educational records. In the European Union, the Data Protection Directive (DPD; 95/46/EC²) regulates the processing of personal data and the movement of such information. FERPA §99.31(b) deals with the de-identification of data rule. It clearly states that institutions "may release, without consent, education records, or information from education records, that has been de-identified through the removal of all Personally Identifiable Information (PII)." This section of FERPA requires institutions to use reasonable methods to identify the other parties who disclose education records. On the other hand, the most explicit citation of de-identification in the European DPD is Article 26 on anonymization, in which "principles of data protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable." Moreover, parties are encouraged to use de-identification techniques to render identification of data subjects impossible. It is not obvious, however, what level of de-identification is required to anonymize education records under European law. However, the Article 29 Data Protection Working Party has an opinion on the identification of data: "Once a data set is truly anonymized and individuals are no longer identifiable, European data protection law no longer applies" (2014, p. 5).

¹ http://www2.ed.gov/policy/gen/guid/fpco/ferpa/index.html (last access January 2015)
² http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:EN:HTML (last access January 2015)

2.3 Drivers of De-Identification in Learning Analytics
A study by Petersen (2012) addressed the need to de-identify data used in academic analysis before making it available to institutions, to businesses, or for operational functions. Petersen (2012) pointed to the idea of keeping a unique identifier in case a researcher may need to study the behaviour of a particular individual. Slade and Prinsloo (2013), however, drew attention to the ambiguity of data mining techniques in monitoring student behaviour in educational settings. The authors linked de-identification with consent and privacy and stressed the need to guarantee student anonymity in their education records in order to achieve learning analytics objectives such as interventions based on student characteristics. An example of the link between consent and de-identification would be a questionnaire or survey whose respondents are told that it will be used for research only. In that case, the use of their data is clearly limited to that one study. If the survey includes personal information, however, then assurances of anonymizing their data should be considered.

Ryan Baker (2013) discussed the demands of de-identifying educational data sets in his "Learning, Schooling, and Data Analytics" chapter in the Handbook on Innovations in Learning for States, Districts, and Schools. De-identification of these data sets means being able to share them among other researchers without violating FERPA regulations. Baker stressed that educational policies should include rules for anonymizing data in order to prevent identifiable information from being leaked without authorization. Furthermore, Drachsler and Greller covered the topic of anonymization in their DELICATE approach (Drachsler & Greller, 2016). A "strictly guarded key" should be held so that researchers may link their results from learning analytics and educational data mining with individual students in order to benefit the students. De-identification techniques have been reviewed as a right of access principle in learning analytics deployment (Pardo & Siemens, 2014). In addition, Pardo and Siemens further suggest that semantic analysis might be required to detect identifiable records in anonymized data sets.

3 PROPOSED APPROACH
In this section, we propose a conceptual de-identification–learning analytics framework as shown in Figure 1. The framework begins with learners involved in learning environments. Currently, a large number of learning environments support online learning, such as MOOCs, Learning Management Systems (LMS), Immersive Learning Simulations (ILS), mobile learning, and Personalized Learning Environments (PLE). These platforms offer environments with rich, vast amounts of data that can be quantitatively/qualitatively analyzed to benefit learners and enhance the learning context.

Figure 1: The proposed conceptual de-identification–learning analytics framework

The next step is the de-identification process where techniques to convert personal and private information into anonymized data take place. De-identification techniques include such methods as anonymization, masking, blurring, and perturbation. The last step includes the de-identified data linked with a unique descriptor that may be examined by learning analytics researchers and benefit stakeholders, but ultimately must be used only to the advantage of students.

3.1 De-Identification Techniques
In our proposed de-identification–learning analytics conceptual framework, there are several techniques available to de-identify student data records. Figure 3 lists several methods of de-identification and provides examples (based on Article 29 Data Protection Working Party, 2014; Cormode & Srivastava, 2009; Eurostat, 1996; Petersen, 2012).

Anonymization
Data anonymization techniques have recently been keenly researched in different structured data records with the goal of guaranteeing the privacy of sensitive information against unintended disclosure and a variety of attacks (Cormode & Srivastava, 2009). Ohm (2010) defined reasons behind anonymization when organizations want to release the data to the public, sell the information to third parties, or share the information within the same organization. The difference between anonymization and de-identification, however, is often misunderstood. Anonymization principles are a subset of holistic de-identification methodologies. Data anonymization is the process of de-identifying data while preserving its original format (Raghunathan, 2013). In the educational context, anonymization refers to different procedures to de-identify student data in such a way that it cannot be re-identified (the opposite of de-identification) unless there is a record code. Anonymization is not reserved only for tabular data records, but can also be applied to other types of data, such as visualized data or graphs, where institutions intend to present their outcomes without revealing sensitive information.

On the other hand, in addition to anonymization, de-identification includes masking, randomization, blurring, and so on. For instance, replacing "Bernard" with "*******" is a method of masking while altering "Bernard" to "Wolfgang" would be an example of anonymization. However, masking and blurring are not as well known as anonymization. In practice, de-identification, pseudonymization, and anonymization are often treated as interchangeable topics under the information concealing umbrella. To clarify the differences in simple terms, pseudonymization means cloaking the original data with false information with the ability to track it back to its original formation; anonymization, conversely, cannot be reversed (Raghunathan, 2013).

As previously mentioned, educational data records may include private information, such as name or student ID, which singularly are called direct identifiers. Removing or hiding these identifiers does not assure a true data anonymization. Identifiers could be linked with other information that would allow identification of individuals (see Figure 2). However, quasi-identifiers can be used to ensure better de-identification of data. "Date of Birth + Sex + Name" is an example of a quasi-identifier. In 2006, AOL released the search records of 500,000 of its users. Several days after AOL's database release, New York Times journalists were able to reveal the identity of a 62-year-old widow using a similar process to that shown in Figure 2 (Soghoian, 2007). AOL admitted that the data release was a mistake and the research team responsible for sharing the data was fired.

Figure 2: Linking data sources leads to name identification

Another example of identifying individuals was reported in 2000 when demographic information led to retrieving the names and contact information of patients whose medical data had been released in the United States (Sweeney, 2000).
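
The kind of linkage described in these incidents can be illustrated with a minimal sketch. It assumes two entirely fictional tables: a released data set with names removed, and a public list that carries names alongside the same quasi-identifiers. The function name `link_records` and all field names are hypothetical and chosen only for illustration; they do not come from the cited studies.

```python
def link_records(released, public, quasi_identifiers):
    """Join two tables on quasi-identifiers and keep only unambiguous matches."""
    # Index the public table by its combination of quasi-identifier values.
    index = {}
    for person in public:
        key = tuple(person[q] for q in quasi_identifiers)
        index.setdefault(key, []).append(person["name"])

    matches = []
    for row in released:
        key = tuple(row[q] for q in quasi_identifiers)
        names = index.get(key, [])
        if len(names) == 1:  # exactly one candidate: the record is re-identified
            matches.append((names[0], row))
    return matches


# Fictional "de-identified" course data: names removed, quasi-identifiers kept.
released = [
    {"birth_date": "1990-04-12", "sex": "F", "final_grade": 1.3},
    {"birth_date": "1988-11-02", "sex": "M", "final_grade": 3.7},
]
# Fictional public list with names (hypothetical example data).
public = [
    {"name": "Student A", "birth_date": "1990-04-12", "sex": "F"},
    {"name": "Student B", "birth_date": "1988-11-02", "sex": "M"},
    {"name": "Student C", "birth_date": "1988-11-02", "sex": "M"},
]

reidentified = link_records(released, public, ["birth_date", "sex"])
```

In this toy data only the first record is re-identified. The second stays ambiguous because two people share its quasi-identifier values, which is the intuition behind grouping-based protections such as k-anonymity.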
Samarati and Sweeney (1998) provided a well-known anonymization technique, namely k-anonymization. This method addresses the problem of linking records to identify the individual's information when releasing data, thus safeguarding anonymity. The k-anonymity technique requires that each released record be indistinguishable from at least k − 1 other records with respect to its quasi-identifiers, so that no record can be narrowed down to fewer than k individuals (Cormode & Srivastava, 2009).
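
As a rough sketch of the idea, the snippet below measures k for a small table and shows how generalizing one quasi-identifier can raise it. The helper names (`k_anonymity`, `generalize_birth_year`), the field names, and the data are assumptions made for this illustration, and decade-wide generalization is only one of many possible strategies, not the procedure of Samarati and Sweeney.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalize_birth_year(record):
    """Replace an exact birth year with its decade range (a simple generalization)."""
    out = dict(record)
    decade = (record["birth_year"] // 10) * 10
    out["birth_year"] = f"{decade}-{decade + 9}"
    return out


# Fictional student records; birth_year and sex act as quasi-identifiers.
students = [
    {"birth_year": 1991, "sex": "F", "grade": 1.3},
    {"birth_year": 1994, "sex": "F", "grade": 2.0},
    {"birth_year": 1992, "sex": "M", "grade": 1.7},
    {"birth_year": 1998, "sex": "M", "grade": 3.0},
]
quasi = ["birth_year", "sex"]

k_before = k_anonymity(students, quasi)      # every record is unique
generalized = [generalize_birth_year(s) for s in students]
k_after = k_anonymity(generalized, quasi)    # records now fall into groups of two
```

The trade-off is visible even here: after generalization the exact birth year is no longer available for analysis.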

Figure 3: Examples of de-identification techniques

Masking
Masking is a de-identification technique that replaces sensitive data with fictional data in order to disclose results outside the institution. Data masking can modify the data records so that they remain usable while keeping personal information confidential. For instance, character masking replaces a string with special characters.
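
Character masking as described above can be sketched in a few lines. The function name `mask_characters` and its parameters are hypothetical, introduced only for this example.

```python
def mask_characters(value, visible=0, mask_char="*"):
    """Keep the first `visible` characters and replace the rest with `mask_char`."""
    hidden = max(len(value) - visible, 0)
    return value[:visible] + mask_char * hidden


fully_masked = mask_characters("Bernard")
partly_masked = mask_characters("Bernard", visible=1)
```

Keeping the string length, as this sketch does, can itself leak a little information; a fixed-length mask would be a stricter variant.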

Blurring
Blurring involves reducing precision to minimize the identification of data. There are several ways to achieve blurring, such as dividing the data into subcategories, randomizing the data fields, or adding noise to data records.
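
Two of these blurring strategies, grouping values into subcategories and adding noise, might look as follows. This is a sketch under stated assumptions: the helper names `to_age_band` and `add_noise`, the band width, and the uniform noise model are illustrative choices rather than anything prescribed by the cited sources.

```python
import random


def to_age_band(age, width=5):
    """Reduce precision by mapping an exact age to a band such as '20-24'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def add_noise(value, scale, rng):
    """Perturb a numeric value with uniform noise in [-scale, +scale]."""
    return value + rng.uniform(-scale, scale)


rng = random.Random(42)            # seeded only to make the sketch repeatable
band = to_age_band(23)             # exact age replaced by its band
noisy_score = add_noise(87.5, 2.0, rng)
```

Wider bands or larger noise scales give more protection but make later analysis less precise.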
3.2 Coding Data Records
In scientific research, data usually requires further investigation with researchers looking deeper into the details. Having de-identified data might be insufficient for these purposes; researchers may require additional information in order to do more analysis. The American federal Health Insurance Portability and Accountability Act (HIPAA), which is responsible for protecting the confidentiality of patient records, authorizes using an "assigned code" that can be appended to the records in order to permit the information to be re-identified for research purposes.³ Based on that HIPAA rule, we found that FERPA §99.31(b) allows for using a unique descriptor for student data records in order to match an individual's information for research and institutional use. Accordingly, we conclude that assigning a code to student records in our proposed framework can grant learning analytics researchers the ability to study behaviours of specific students and, therefore, can benefit learners. Despite the fact that learning analytics poses ethical challenges, the main goal is still to benefit learning environments and students, such as making recommendations, classifying students into profiles or predicting their performance (Ebner & Schön, 2013; Greller & Drachsler, 2012; Slade & Prinsloo, 2013; Khalil & Ebner, 2015a; Khalil, Kastl & Ebner, 2016).
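
One way such an assigned code could work is sketched below. The class `CodeBook`, the function `de_identify`, and the field names are hypothetical. The codes are generated randomly rather than derived from the student identifier, an assumption made here so that the code itself carries no information about the student, and the code book stands in for a key table that would be kept apart from the released data.

```python
import secrets

# Fields treated as direct identifiers in this sketch (an assumption).
DIRECT_IDENTIFIERS = ("student_id", "name", "email")


class CodeBook:
    """Hypothetical key table mapping student IDs to random codes and back."""

    def __init__(self):
        self._code_by_id = {}
        self._id_by_code = {}

    def code_for(self, student_id):
        """Return the student's code, creating a new random one on first use."""
        if student_id not in self._code_by_id:
            code = secrets.token_hex(8)
            while code in self._id_by_code:  # regenerate on the unlikely collision
                code = secrets.token_hex(8)
            self._code_by_id[student_id] = code
            self._id_by_code[code] = student_id
        return self._code_by_id[student_id]

    def re_identify(self, code):
        """Look up the original student ID; only the key holder can do this."""
        return self._id_by_code[code]


def de_identify(record, codebook):
    """Drop direct identifiers and attach the assigned code instead."""
    released = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    released["code"] = codebook.code_for(record["student_id"])
    return released


codebook = CodeBook()
first = de_identify(
    {"student_id": "s-001", "name": "Student A", "email": "a@example.org", "quiz": 8},
    codebook,
)
second = de_identify(
    {"student_id": "s-001", "name": "Student A", "email": "a@example.org", "quiz": 9},
    codebook,
)
```

Because the same student always receives the same code, researchers can follow one learner across records, while re-identification requires access to the separately guarded code book.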
4 LIMITATIONS
Despite the fact that de-identification protects confidential information and privacy, the de-identified data still poses some privacy risks (Petersen, 2012). In many cases, some attributes are capable of identifying individuals; in other cases, attackers can link records together from different sources and therefore "code break" the de-identification. On the other hand, in their paper "Privacy, Anonymity, and Big Data in the Social Sciences," Daries et al. (2014) asserted that with de-identification, there is no guarantee of keeping the analysis process uncorrupted. Pardo and Siemens agree that "data can be either useful or perfectly anonymous, but never both" (2014, p. 447). The bottom line is that the stricter the de-identification guidelines, the greater the negative effect on the ultimate analysis.

5 CONCLUSION
Since learning analytics first became known in 2011, it has helped learners to improve their performance based on analyzing their educational data. Nevertheless, this field raises many issues related to ethics and ownership. The massive scale of data collection and analysis leads to questions about the consent and privacy of personal information. This paper mainly discusses one of the attainable solutions for preserving learners' sensitive information, the "de-identification of data" to facilitate learning analytics applications. We shed light on this topic via US and EU regulations regarding data privacy. We proposed a conceptual approach with examples of de-identification techniques that assist us with our "iMooX" platform (http://www.imoox.at) and can help learning analytics specialists preserve confidential learner information.

Although de-identification is not a foolproof solution for protecting learner privacy, it is an imperative consideration in examining the ethical dimensions of learning analytics.

³ Rule 45 C.F.R. § 164.514(c).

REFERENCES

Article 29 Data Protection Working Party. (2014). Opinion 05/2014 on Anonymisation Techniques (0829/14/EN WP216). Retrieved from http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf
Anciaux, N., Bouganim, L., Van Heerde, H., Pucheral, P., & Apers, P. M. (2008). Data degradation: Making private data less sensitive over time. In J. G. Shanahan, S. Amer-Yahia, I. Manolescu, Y. Zhang, D. A. Evans, A. Kolcz, K.-S. Choi, A. Chowdhury (Eds.), Proceedings of 17th ACM International Conference on Information and Knowledge Management (CIKM 2008) (pp. 1401–1402). New York: ACM. http://dx.doi.org/10.1145/1458082.1458301
Baker, R. S. J. d. (2013). Learning, schooling, and data analytics. In M. Murphy, S. Redding, & J. Twyman (Eds.), Handbook on innovations in learning for states, districts, and schools (pp. 179–190). Philadelphia, PA: Center on Innovations in Learning, Temple University.
Bienkowski, M., Feng, M., & Means, B. (2012). Enhancing teaching and learning through educational data mining and learning analytics: An issue brief. Retrieved from the website of the Office of Educational Technology, US Department of Education, https://tech.ed.gov/wp-content/uploads/2014/03/edm-la-brief.pdf
Brown, M. (2011). Learning analytics: The coming third wave (EDUCAUSE Learning Initiative Brief). Retrieved from EDUCAUSE library https://net.educause.edu/ir/library/pdf/ELIB1101.pdf
Cooper, N. (2009). Workforce demographic analytics yield health-care savings. Employment Relations Today, 36(3), 13–18. http://dx.doi.org/10.1002/ert.20256
Cormode, G., & Srivastava, D. (2009). Anonymized data: Generation, models, usage. In C. Binnig & B. Dageville (Eds.), Proceedings of the 35th International Conference on Management of Data (pp. 1015–1018). New York: ACM. http://dx.doi.org/10.1109/ICDE.2010.5447721
Boyd, D. (2008). Facebook's privacy trainwreck: Exposure, invasion, and social convergence. Convergence: The International Journal of Research into New Media Technologies, 14(1), 13–20. http://dx.doi.org/10.1177/1354856507084416
Daries, J. P., Reich, J., Waldo, J., Young, E. M., Whittinghill, J., Ho, A. D., ... & Chuang, I. (2014). Privacy, anonymity, and big data in the social sciences. Communications of the ACM, 57(9), 56–63. http://dx.doi.org/10.1145/2643132
Drachsler, H. & Greller, W. (2016). Privacy and analytics – it's a DELICATE issue. A checklist to establish trusted learning analytics. Proceedings of the 6th International Conference on Learning Analytics and Knowledge (LAK '16), 89–98. http://dx.doi.org/10.1145/2883851.2883893
Ebner, M., & Schön, M. (2013). Why learning analytics in primary education matters. In C. Karagiannidis & S. Graf (Eds.), Bulletin of the Technical Committee on Learning Technology, 15(2), 14–17.
Eurostat. (1996). Manual on disclosure control methods. Luxembourg: Office for Official Publications of the European Communities. Retrieved from http://ec.europa.eu/eurostat/ramon/statmanuals/files/manual_on_disclosure_control_methods_1996.pdf
Greller, W., & Drachsler, H. (2012). Translating learning into numbers: A generic framework for learning analytics. Educational Technology and Society, 15(3), 42–57.
Khalil, M., & Ebner, M. (2015a). A STEM MOOC for school children: What does learning analytics tell us? International Conference on Interactive Collaborative Learning (ICL 2015), (pp. 1217–1221). Florence, Italy: IEEE.
Khalil, M., & Ebner, M. (2015b). Learning analytics: Principles and constraints. In S. Carliner, C. Fulford, & N. Ostashewski (Eds.), Proceedings of EdMedia: World Conference on Educational Media and Technology, 2015(1), 1789–1799.
`>24,49( d'9( `28/49( P'9( _( WS.*59( d'( !"#$%&'( j35/527,.6( deeP8( 4*25.*58@( M( 14K8/*5,.6( *B?*5,*.1*( K8,.6(
4*25.,.6(2.247/,18'(D.(d'(`>24,49(d'(WS.*59(d'(`3??9(M'(L35*.Y9(_(d'(`24Y(!W-8'&9(N$"1++9,%-2(")(0A+(
T#$"W+&%( >0&;+A"'9+$( >#BB,0( "%( +ZW+$,+%1+2( &%9( :+20( W$&10,1+2( ,%( &%9( &$"#%9( J[[O2(
QTJ[[O>(574L&(!??'("%H+"G=&'(F35-*58/*-/g(X*5R2.7@(U33O8(,.()*R2.-(XRS['(
d1P244,8/*59( W'9( X52.1*9( I'9( _( E12503.*9( `'( !"#$#&'( ]#,9+( 0"( W$"0+10,%-( 0A+( 1"%),9+%0,&',0/( ")( W+$2"%&''/(
,9+%0,),&:'+( ,%)"$B&0,"%(QN<<S(!\*13RR*.-2/,3.8( 30( />*( F2/,3.24( D.8/,/K/*( 30(E/2.-25-8( 2.-(
I*1>.34367&((\*/5,*Q*-(053R(/>*(N*S8,/*(30(P3R?K/*5(E*1K5,/7(),Q,8,3.(30(/>*(F2/,3.24(D.8/,/K/*(
30(E/2.-25-8(2.-(I*1>.34367(>//?@AA1851'.,8/'63QA?KS4,12/,3.8A.,8/?KS8A=##+$""A8?=##+$""'?-0(
dDI( F*N8'( !"#$h9( d27( <#&'( dDI( 2.-( [25Q25-( 5*4*28*( -*+,-*./,0,*-( 4*25.,.6( -2/2( 053R( 3?*.( 3.4,.*(
13K58*8'( wf*S( ?38/( S7( />*( dDI( F*N8( e00,1*x'( \*/5,*Q*-( 053R( >//?@AA.*N8'R,/'*-KA"#$hAR,/+
2.-+>25Q25-+5*4*28*+-*+,-*./,0,*-+4*25.,.6+-2/2+3?*.+3.4,.*+13K58*8(
e>R9(j'(!"#$#&'(U53O*.(?53R,8*8(30(?5,Q217@(\*8?3.-,.6(/3(/>*(8K5?5,8,.6(02,4K5*(30(2.3.7R,Y2/,3.'(VO*.(
*&F(X+=,+F9(IM9($G#$'(
Pardo, A., & Siemens, G. (2014). Ethical and privacy principles for learning analytics. British Journal of Educational Technology, 45, 438–450. http://dx.doi.org/10.1111/bjet.12152
Petersen, R. J. (2012, July 18). Policy dimensions of analytics in higher education. EDUCAUSE Review. Retrieved from http://er.educause.edu/articles/2012/7/policy-dimensions-of-analytics-in-higher-education
Prinsloo, P., & Slade, S. (2013). An evaluation of policy frameworks for addressing ethical considerations in learning analytics. Proceedings of the 3rd International Conference on Learning Analytics and Knowledge, 240–244. http://dx.doi.org/10.1145/2460296.2460344
Raghunathan, B. (2013). The complete book of data anonymization: From planning to implementation. Boca Raton, FL: CRC Press.
Samarati, P., & Sweeney, L. (1998). Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression (Technical report). Retrieved from the Electronic Privacy Information Center website: https://epic.org/privacy/reidentification/Samarati_Sweeney_paper.pdf
Singer, N. (2014, November 18). ClassDojo adopts deletion policy for student data. New York Times. Retrieved from http://bits.blogs.nytimes.com/2014/11/18/classdojo-adopts-deletion-policy-for-student-data/
Slade, S., & Galpin, F. (2012). Learning analytics and higher education: Ethical perspectives. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (LAK '12), 16–17. http://dx.doi.org/10.1145/2330601.2330610
Slade, S., & Prinsloo, P. (2013). Learning analytics: Ethical issues and dilemmas. American Behavioral Scientist, 57, 1509–1528. http://dx.doi.org/10.1177/0002764213479366
Soghoian, C. (2007, December 1). AOL, Netflix and the end of open access to research data. C-Net. Retrieved from http://www.cnet.com/news/aol-netflix-and-the-end-of-open-access-to-research-data/
Sweeney, L. (2000). Simple demographics often identify people uniquely. Health (San Francisco), 671, 1–34. San Francisco, CA.
Wachtler, J., Khalil, M., Taraghi, B., & Ebner, M. (2016). On using learning analytics to track the activity of interactive MOOC videos. In M. Giannakos, D. G. Sampson, L. Kidzinski, & A. Pardo (Eds.), Proceedings of the LAK 2016 Workshop on Smart Environments and Analytics in Video-Based Learning (pp. 8–17). Edinburgh, Scotland: CEUR-WS. Retrieved from http://ceur-ws.org/Vol-1579/paper3.pdf
... LA concerns: Current LA practices anonymize data prior to sharing among LA researchers [40]. Anonymization is becoming increasingly hard to sustain because it requires removing all personal identifiers from a dataset [8]. Each dataset must, therefore, be considered on a case-by-case basis and evaluated for identifiers that are linked to an individual. ...
... As an essential part of interventions, the individual data subjects (students) need to be explicitly identified, algorithmically or manually, which deteriorates the meaning of anonymization of data [44]. As [8] mentions, de-identification of data is a step forward in the process of anonymization of data and refers to all the identifiable measures of individuals after aggregating data from many sources. This is a way out of GDPR, but it is a challenging and mathematically expensive process. ...
Article
Personalized learning is one of the main focuses in 21st-century education, and Learning Analytics (LA) has been recognized as a supportive tool for enhancing personalization. Meanwhile, the General Data Protection Regulation (GDPR), which concerns the protection of personal data, came into effect in 2018. However, contemporary research lacks the essential knowledge of how and in which ways the presence of GDPR influences LA research and practices. Hence, this study intends to examine the requirements for sustaining LA in light of GDPR. According to the study outcomes, the legal obligations for LA could be simplified to data anonymization, with the consequence of limiting personalized interventions, one of the powers of LA. Explicit consent from the data subjects (students) prior to any data processing is mandatory under GDPR. The consent agreements must include the purpose, types of data, and how, when and where the data is processed. Moreover, transparency of the complete process of storing, retrieving, and analysing data, as well as how the results are used, should be explicitly documented in LA applications. The need for academic institutions to have specific regulations for supporting LA is emphasized. Regulation of data sharing with third parties is left as a further extension of this study.
... While much of the focus of early learning analytics research has related to the digital traces within learning management systems and MOOCs (Khalil & Ebner, 2016a), there is increasing interest in capturing and analysing students' data from real-world learning contexts such as gaze, postures, motions, and gestures inside classrooms and face-to-face sessions. Significant records of student behavior in the classroom are often constrained within learning analytics research, due to ethics, privacy, and security concerns (Khalil & Ebner, 2016b). ...
Chapter
It is with a sense of irony that we offer a conclusion to this book. As we acknowledged in the introductory chapter, when we invited authors to submit proposals for a book on exploring the potential and challenges of learning analytics for open, distance and distributed learning institutions and forms of delivery, no one would have imagined how the world, and in particular the education sector would be disrupted by the Covid-19 pandemic.
... In the context of MOLAM, privacy, confidentiality, and anonymity remain paramount. Learning analytics may reveal personal information and attitudes, as well as learner activities, which could lead to the identification of individuals to unwanted stakeholders (Khalil & Ebner, 2016a). It should be stressed then that developing mobile applications using MOLAM (and other approaches) should follow national and international frameworks such as the General Data Protection Regulation (GDPR) in the Euro zone, and the Family Educational Rights and Privacy Act (FERPA) and the Student Privacy Compass in the US. ...
Chapter
Online distance learning is highly learner-centred, requiring different skills and competences from learners, as well as alternative approaches for instructional design, student support, and provision of resources. Learner autonomy and self-regulated learning (SRL) in online learning settings are considered key success factors that predict student performance. Research suggests that learners may struggle in online, open, and mobile learning environments when they do not use critical SRL strategies. This chapter argues that the effective use of SRL would be beneficial in these contexts, although this can be difficult for both learners and educators, particularly when students are learning online and/or independently. The chapter introduces a Mobile Multimodal Learning Analytics approach (MOLAM) aimed at guiding learners, teachers and researchers wanting to develop, successfully employ and/or evaluate learning analytics approaches for mobile learning activities for the purposes of measuring and fostering student SRL in diverse online learning environments. MOLAM is especially valuable for continuous measurement and interventions, thus fostering students’ transferable SRL skills, strategies, and knowledge across formal, informal, and non-formal online learning settings. The chapter concludes suggesting that mobile multimodal learning analytics should be performed with careful integration of relevant support mechanisms and frameworks to protect student privacy and ensure their agency.
... As an example, one relatively established form of PET is that of anonymisation or pseudonymisation. This might involve the disclosure of de-identified data using approaches such as generalisation, suppression, encryption and masking (Khalil & Ebner, 2016). However, such approaches are prone to human mistakes and privacy attacks, which can reveal private user information through background knowledge engineering and intersection attacks. ...
Article
Evidence shows that appropriate use of technology in education has the potential to increase the effectiveness of, e.g., teaching, learning and student support. There is also evidence that technology can introduce new problems and ethical issues, e.g., student privacy. This article maps some limitations of technological approaches that ensure student data privacy in learning analytics from a critical data studies (CDS) perspective. In this conceptual article, we map the claims, grounds and warrants of technological solutions to maintaining student data privacy in learning analytics. Our findings suggest that many technological solutions are based on assumptions, such as that individuals have control over their data ('data as commodity'), which can be exchanged under agreed conditions, or that individuals embrace their personal data privacy as a human right to be respected and protected. Regulating student data privacy in the context of learning analytics through technology mostly depends on institutional data governance, consent, data security and accountability. We consider alternative approaches to viewing (student) data privacy, such as contextual integrity; data privacy as ontological; group privacy; and indigenous understandings of privacy. Such perspectives destabilise many assumptions informing technological solutions, including privacy enhancing technology (PET).
Practitioner notes. What is already known about this topic: Various actors (including those in higher education) have access to and collect, use and analyse greater volumes of personal (student) data, with finer granularity, increasingly from multiplatforms and data sources. There is growing awareness and concern about individual (student) privacy. Privacy enhancing technologies (PETs) offer a range of solutions to individuals to protect their data privacy. What this paper adds: A review of the assumption that technology provides adequate or complete solutions for ensuring individual data privacy. A mapping of five alternative understandings of personal data privacy and its implications for technological solutions. Consideration of implications for the protection of student privacy in learning analytics. Implications for practice and/or policy: Student data privacy is not only a technological problem to be solved but should also be understood as a social problem. The use of PETs offers some solutions for data privacy in learning analytics. Strategies to protect student data privacy should include student agency, literacy and a whole-system approach.
... Currently there are no judicial decisions on that, but it may not implement the GDPR stringently enough. Another approach is the de-identification and anonymization of data (Khalil and Ebner, 2016). This approach complies with the EU GDPR but affects the possibilities to apply EDM or LA at the level of the learner. ...
Conference Paper
The popularity of Educational Big Data (EBD) is increasing fast. The presentation will cover Educational Data Mining (EDM) and Learning Analytics (LA), the two main research fields that have arisen from it, and will focus on stakeholders from various educational fields who are developing applications of EBD for their own purposes. Based on a reflection on the whole process of learning, a model will be presented that has been developed to map the most important stakeholders and their applications of EDM/LA.
... Other approaches include the de-identification and anonymization of data sets (Khalil and Ebner, 2016). The principle of de-identification or anonymization is aimed at removing direct identifiers so that each data record cannot be related to an individual. ...
Thesis
By combining the concepts of privacy, big data, and digital footprints related to the field of education, the current thesis demonstrated that decisions about the use of particular software tools have a societal relevance. Decisions must not only be based on juridical or economical criteria but must also include additional aspects, such as individual, cultural, and moral perceptions with regard to privacy. This work is unique in developing a method for the evaluation of digital footprints. Based on the state of the art in research and the findings from this study, a model was developed and proposed that operationalizes how particular digital tools and services can be evaluated with regard to their contribution to the digital footprints of an individual. By virtue of the ease of using the model, called "Digital Footprint Estimation Model," individuals can better reflect upon the effects of digital tools. The model can further assist decision makers, like educational authorities or teachers, in choosing between alternative products and can serve as a suggestion for further development of acceptable applications. For this purpose, the results of this research have been embodied in a practical application. Through a sequential explanatory mixed methods study, the privacy attitude, perceived usefulness of digital tools, digital footprint awareness, digital footprint experience and digital footprint practice of Austrian teachers were investigated with regard to demographic variables and the use of particular software tools. Unlike previous studies, this study also scrutinized the relationships among the variables of the digital footprint, so this study made a significant contribution to the concept of the digital footprint. The study represents a valuable contribution to practice, as it derives recommendations for action for the Austrian school system based on the qualitative and quantitative results.
... While public sharing of students' (anonymized) data for research purposes can accelerate scientific progress, it may also increase privacy risks [15,95]. This heightened fear of violating students' privacy, coupled with the instantiating of stricter privacy laws (such as the General Data Protection Regulations, GDPR, in Europe), discouraged publicizing learning analytics datasets [46,55]. Indeed, in a recent survey of public MOOC datasets, Lohse, McManus, and Joyner noted that most research papers on learning analytics experiment on proprietary datasets, and no dataset has been made public since 2016 [62]. ...
Article
Education technologies (EdTech) are becoming pervasive due to their cost-effectiveness, accessibility, and scalability. They also experienced accelerated market growth during the recent pandemic. EdTech collects massive amounts of students' behavioral and (sensitive) demographic data, often justified by the potential to help students by personalizing education. Researchers voiced concerns regarding privacy and data abuses (e.g., targeted advertising) in the absence of clearly defined data collection and sharing policies. However, technical contributions to alleviating students' privacy risks have been scarce. In this paper, we argue against collecting demographic data by showing that gender, a widely used demographic feature, does not causally affect students' course performance, arguably the most popular target of predictive models. Then, we show that gender can be inferred from behavioral data; thus, simply leaving it out does not protect students' privacy. Combining a feature selection mechanism with an adversarial censoring technique, we propose a novel approach to create a 'private' version of a dataset comprising fewer features that predict the target without revealing the gender, and are interpretable. We conduct comprehensive experiments on a public dataset to demonstrate the robustness and generalizability of our mechanism.
... ESG criteria should be built to make the most of data while intruding as little as feasible on people's privacy and the environment (Lords, 2018; Sethu, 2019). Data anonymization and de-identification techniques may address this problem in the tourism department (Garfinkel, 2015; Khalil & Ebner, 2016). The second predictive stage focuses on the need of developers, consumers, and regulators to comprehend and justify artificial intelligence for any industry (Lords, 2018; Monroe, 2018). ...
Article
Intelligent automation in travel and tourism is likely to grow in the future, which is possible due to advances in artificial intelligence (AI) and associated technologies. Intelligent automation in tourism is a socio-economic activity, which needs an explanation of theory and practice. The objective of the study is to determine the predictive relationship between artificial intelligence and intelligent automation in tourism, with the mediating role of the internet of things (IoT), sustainability, facilitating adoption, and ESG investment. Designing valuable AI, promoting adoption, analyzing the implications of intelligent automation, and establishing a sustainable future with artificial intelligence are the fundamental constructs of this study. Research in these areas enables systematic knowledge creation that shows a concentrated effort on the part of the scientific community to ensure the positive uses of intelligent automation in the tourist industry. A quantitative research approach was used to collect and analyze data. A purposive sampling technique was applied, and data were collected from 402 respondents (N = 402). The results revealed that artificial intelligence has a predictive relationship with intelligent automated tourism. Similarly, IoT, sustainability, facilitating adoption, and ESG investment have influenced tourism. In conclusion, artificial intelligence design can improve the tourism department if the intelligent automation framework is applied to it.
Article
For the developers of next-generation education technology (EdTech), the use of Learning Analytics (LA) is a key competitive advantage as the use of some form of LA in EdTech is fast becoming ubiquitous. At its core LA involves the use of Artificial Intelligence and Analytics on the data generated by technology-mediated learning to gain insights into how students learn, especially for large cohorts, which was unthinkable only a few decades ago. This LA growth-spurt coincides with a growing global "Ethical AI" movement focussed on resolving questions of personal agency, freedoms, and privacy in relation to AI and Analytics. At this time, there is a significant lack of actionable information and supporting technologies, which would enable the goals of these two communities to be aligned. This paper describes a collaborative research project that seeks to overcome the technical and procedural challenges of running a data-driven collaborative research project within an agreed set of privacy and ethics boundaries. The result is a reference architecture for ethical research collaboration and a framework, or roadmap, for privacy-preserving analytics which will contribute to the goals of an ethical application of learning analytics methods.
Practitioner notes. What is already known about this topic: Privacy Enhancing Technologies, including a range of provable privacy risk reduction techniques (differential privacy) are effective tools for managing data privacy, though currently only pragmatically available to well-funded early adopters. Learning Analytics is a relatively young but evolving field of research, which is beginning to deliver tangible insights and value to the Education and EdTech industries. A small number of procedural frameworks have been developed in the past two decades to consider data privacy and other ethical aspects of Learning Analytics. What this paper adds: This paper describes the mechanisms for integrating Learning Analytics, Data Privacy Technologies and Ethical practices into a unified operational framework for Ethical and Privacy-Preserving Learning Analytics. It introduces a new standardised measurement of privacy risk as a key mechanism for operationalising and automating data privacy controls within the traditional data pipeline. It describes a repeatable framework for conducting ethical Learning Analytics. Implications for practice and/or policy: For the Learning Analytics (LA) and Education Technology communities the approach described here exemplifies a standard of ethical LA practice and data privacy protection which can and should become the norm. The privacy risk measurement and risk reduction tools are a blueprint for how data privacy and ethics can be operationalised and automated. The incorporation of a standardised privacy risk evaluation metric can help to define clear and measurable terms for inter- and intra-organisational data sharing and usage policies and agreements (Author, Ruth Marshall, is an Expert Contributor on ISO/IEC JTC 1/SC 32/WG 6 "Data usage", due for publication in early 2022).
Article
For many years, educational researchers have been challenged to prove and justify the effective use of computers in teaching and learning in the classroom. In reviewing the antecedents of computer use in education, many studies have adopted a relatively restricted perspective and confined their research to only technology-based variables, namely students' attitudes towards computers and their experience in using the computer. In contrast, this study includes an investigation of teachers' educational perceptions (constructivist beliefs, traditional beliefs) as an antecedent of computer use, while regulating the influence of technology-related variables (computer experience, general computer attitudes) and demographic variables (gender, age). To identify differences in the determinants of computer use in the classroom, multilevel modelling was used (N = 525). To assess primary school teachers' use of computers in supporting the pedagogical process, an adapted version of the "Class Use of Computers" scale of van Braak et al. (2004) was used. It basically explained the various forms of computer use among primary school teachers, supporting the hypothesis that "teachers' beliefs are significant determinants in explaining why teachers adopt computers in the classroom." Concerning the effect of computer experience, general computer attitudes and gender, the findings indicate a positive impact of constructivist beliefs on the classroom use of computers. The use of computers in the classroom is negatively affected by traditional beliefs.
Technical Report
In data mining and data analytics, tools and techniques once confined to research laboratories are being adopted by forward-looking industries to generate business intelligence for improving decision making. Higher education institutions are beginning to use analytics for improving the services they provide and for increasing student grades and retention. The U.S. Department of Education's National Education Technology Plan, as one part of its model for 21st-century learning powered by technology, envisions ways of using data from online learning systems to improve instruction. With analytics and data mining experiments in education starting to proliferate, sorting out fact from fiction and identifying research possibilities and practical applications are not easy. This issue brief is intended to help policymakers and administrators understand how analytics and data mining have been, and can be, applied for educational improvement. At present, educational data mining tends to focus on developing new tools for discovering patterns in data. These patterns are generally about the microconcepts involved in learning: one-digit multiplication, subtraction with carries, and so on. Learning analytics, at least as it is currently contrasted with data mining, focuses on applying tools and techniques at larger scales, such as in courses and at schools and postsecondary institutions. But both disciplines work with patterns and prediction: If we can discern the pattern in the data and make sense of what is happening, we can predict what should come next and take the appropriate action. Educational data mining and learning analytics are used to research and build models in several areas that can influence online learning systems. One area is user modeling, which encompasses what a learner knows, what a learner's behavior and motivation are, what the user experience is like, and how satisfied users are with online learning.
At the simplest level, analytics can detect when a student in an online course is going astray and nudge him or her on to a course correction. At the most complex, they hold promise of detecting boredom from patterns of key clicks and redirecting the student's attention. Because these data are gathered in real time, there is a real possibility of continuous improvement via multiple feedback loops that operate at different time scales: immediate to the student for the next problem, daily to the teacher for the next day's teaching, monthly to the principal for judging progress, and annually to the district and state administrators for overall school improvement. The same kinds of data that inform user or learner models can be used to profile users. Profiling as used here means grouping similar users into categories using salient characteristics. These categories then can be used to offer experiences to groups of users or to make recommendations to the users and adaptations to how a system performs. User modeling and profiling are suggestive of real-time adaptations. In contrast, some applications of data mining and analytics are for more experimental purposes. Domain modeling is largely experimental with the goal of understanding how to present a topic and at what level of detail. The study of learning components and instructional principles also uses experimentation to understand what is effective at promoting learning. These examples suggest that the actions from data mining and analytics are always automatic, but that is less often the case. Visual data analytics closely involve humans to help make sense of data, from initial pattern detection and model building to sophisticated data dashboards that present data in a way that humans can act upon. K-12 schools and school districts are starting to adopt such institution-level analyses for detecting areas for instructional improvement, setting policies, and measuring results.
Making visible students' learning and assessment activities opens up the possibility for students to develop skills in monitoring their own learning and to see directly how their effort improves their success. Teachers gain views into students' performance that help them adapt their teaching or initiate tutoring, tailored assignments, and the like. Robust applications of educational data mining and learning analytics techniques come with costs and challenges. Information technology (IT) departments will understand the costs associated with collecting and storing logged data, while algorithm developers will recognize the computational costs these techniques still require. Another technical challenge is that educational data systems are not interoperable, so bringing together administrative data and classroom-level data remains a challenge. Yet combining these data can give algorithms better predictive power. Combining data about student performance (online tracking, standardized tests, teacher-generated tests) to form one simplified picture of what a student knows can be difficult and must meet acceptable standards for validity. It also requires careful attention to student and teacher privacy and the ethical obligations associated with knowing and acting on student data. Educational data mining and learning analytics have the potential to make visible data that have heretofore gone unseen, unnoticed, and therefore unactionable. To help further the fields and gain value from their practical applications, the recommendations are that educators and administrators:
• Develop a culture of using data for making instructional decisions.
• Involve IT departments in planning for data collection and use.
• Be smart data consumers who ask critical questions about commercial offerings and create demand for the most useful features and uses.
• Start with focused areas where data will help, show success, and then expand to new areas.
• Communicate with students and parents about where data come from and how the data are used.
• Help align state policies with technical requirements for online learning systems.
Researchers and software developers are encouraged to:
• Conduct research on usability and effectiveness of data displays.
• Help instructors be more effective in the classroom with more real-time and data-based decision support tools, including recommendation services.
• Continue to research methods for using identified student information where it will help most, anonymizing data when required, and understanding how to align data across different systems.
• Understand how to repurpose predictive models developed in one context to another.
A final recommendation is to create and continue strong collaboration across research, commercial, and educational sectors. Commercial companies operate on fast development cycles and can produce data useful for research. Districts and schools want properly vetted learning environments. Effective partnerships can help these organizations codesign the best tools.
Conference Paper
It is widely known that interaction and communication are very important parts of successful online courses. These features are considered crucial because they significantly improve students' attention. In this publication, the authors present an innovative application that adds different forms of interactivity to learning videos within MOOCs, such as multiple-choice questions or the possibility to communicate with the teacher. Furthermore, Learning Analytics using exploratory examination and visualizations has been applied to unveil learners' patterns and behaviors as well as to investigate the effectiveness of the application. Based upon the quantitative and qualitative observations, our study identified common patterns behind dropping out using video indicators and suggested enhancements to increase the performance of the application as well as learners' attention.
Conference Paper
Massive Open Online Courses are remote courses notable for the heterogeneity and number of their students. Because of this massive scale, the large datasets generated by MOOC platforms require advanced tools to reveal hidden patterns for enhancing learning and educational environments. This paper presents a study on using one of these tools, clustering, to portray learners' engagement in MOOCs. The study analyses a MOOC that was mandatory for university students and also open to the public, in order to classify students into appropriate profiles based on their engagement. We compared the clustering results across MOOC variables and, finally, evaluated our results against a student motivation scheme from the eighties to examine the contrast between classical classes and MOOC classes. Our research indicated that MOOC participants closely follow Cryer's scheme of Elton (1996).
Conference Paper
The widespread adoption of Learning Analytics (LA) and Educational Data Mining (EDM) has somewhat stagnated recently, and in some prominent cases has even been reversed following concerns by governments, stakeholders and civil rights groups about privacy and ethics applied to the handling of personal data. In this ongoing discussion, fears and realities are often indistinguishably mixed up, leading to an atmosphere of uncertainty among potential beneficiaries of Learning Analytics, as well as hesitation among institutional managers who aim to innovate their institution's learning support by implementing data and analytics with a view to improving student success. In this paper, we try to get to the heart of the matter by analysing the most common views and the propositions made by the LA community to address them. We conclude the paper with an eight-point checklist named DELICATE that can be applied by researchers, policy makers and institutional managers to facilitate a trusted implementation of Learning Analytics.
Conference Paper
Massive Open Online Courses (MOOCs) have spread rapidly across Science, Technology, Engineering and Mathematics (STEM) academic disciplines. These MOOCs have served a wide variety of learner groups across the world. The leading MOOC platform in Austria, iMooX, offers such courses. This paper highlights the authors' experience of applying Learning Analytics to examine the participation of secondary school pupils in one of its courses, called "Mechanics in everyday life". We observed several patterns and, contrary to the positive results usually expected of an educational MOOC, we show that pupils seemingly treated it not as a genuinely motivating learning route but rather as optional homework.
Conference Paper
Within the evolution of technology in education, Learning Analytics has reserved its position as a robust technological field that promises to empower instructors and learners in different educational fields. The 2014 Horizon Report (Johnson et al., 2014) expects it to be adopted by educational institutions in the near future. However, its processes and phases, as well as its constraints, are still not deeply debated. In this research study, the authors discuss the essence, objectives and methodologies of Learning Analytics and propose a first prototype life cycle that describes its entire process. Furthermore, the authors raise substantial questions related to challenges such as security, policy and ethics issues that limit the beneficial applications of Learning Analytics processes.
Article
Open data has tremendous potential for science, but, in human subjects research, there is a tension between privacy and releasing high-quality open data. Federal law governing student privacy and the release of student records suggests that anonymizing student data protects student privacy. Guided by this standard, we de-identified and released a data set from 16 MOOCs (massive open online courses) from MITx and HarvardX on the edX platform. In this article, we show that these and other de-identification procedures necessitate changes to data sets that threaten replication and extension of baseline analyses. To balance student privacy and the benefits of open data, we suggest focusing on protecting privacy without anonymizing data by instead expanding policies that compel researchers to uphold the privacy of the subjects in open data sets. If we want to have high-quality social science research and also protect the privacy of human subjects, we must eventually have trust in researchers. Otherwise, we'll always have the strict tradeoff between anonymity and science illustrated here.
Conference Paper
Higher education institutions have collected and analysed student data for years, with their focus largely on reporting and management needs. A range of institutional policies exist which broadly set out the purposes for which data will be used and how data will be protected. The growing advent of learning analytics has seen the uses to which student data is put expanding rapidly. Generally, though, the policies setting out institutional use of student data have not kept pace with this change. Institutional policy frameworks should not only provide an enabling environment for the optimal and ethical harvesting and use of data, but also clarify who benefits and under what conditions, establish conditions for consent and the de-identification of data, and address issues of vulnerability and harm. A directed content analysis of the policy frameworks of two large distance education institutions shows that current policy frameworks do not facilitate the provision of an enabling environment for learning analytics to fulfil its promise.