
An Efficient Multi-GPU Implementation for Linear-Response Time-Dependent Density Functional Theory

Authors:
Qingcai Jiang, Lingyun Wan, Shizhe Jiao, Wei Hu, Junshi Chen†∗ and Hong An†∗
School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
Hefei National Laboratory for Physical Sciences at the Microscale, Department of Chemical Physics, and Synergetic Innovation Center of Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, China
Email: jqc@mail.ustc.edu.cn, {wanly, jsz226}@mail.ustc.edu.cn, whuustc@ustc.edu.cn, cjuns@mail.ustc.edu.cn, han@ustc.edu.cn
#bi`+iěLQr/vb- EQ?M@a?K /2MbBiv 7mM+iBQMH i?2Q`v
U.6hV +H+mHiBQM ?b /`rM KQ`2 M/ KQ`2 ii2MiBQM BM
+?2KBbi`v M/ Ki2`BH b+B2M+2 bBKmHiBQMbX >Qr2p2`- /m2 iQ
i?2 2ti`2K2 H`;2 >KBHiQMBM Ki`Bt M22/2/ iQ #2 ;2M2`i2/
/m`BM; i?2 +H+mHiBQM- r?2M i?2 bim/B2/ bvbi2K BM+`2b2b- i?2
+Qbi Q7 +H+mHiBQM #2+QK2b mM#2`#H2 #Qi? BM ;`QmM/ M/
2t+Bi2/ bii2 2H2+i`QMB+ bi`m+im`2 bBKmHiBQMb rBi? H`;2 mMB7Q`K
#bBbX AM i?Bb TT2`- r2 T`QTQb2  ?B;?@T2`7Q`KM+2 KmHiB@:Sl
TT`Q+? 7Q` HBM2`@`2bTQMb2 iBK2@/2T2M/2Mi /2MbBiv 7mM+iBQMH
i?2Q`v UG_@h..6hV +H+mHiBQM iQ +QKTmi2 i?2 2t+BiiBQM
2M2`;B2b BM KQH2+mH2b M/ bQHB/b rBi? i?2 THM2 rp2 #bBb b2i
mM/2` i?2 T2`BQ/B+ #QmM/`v +QM/BiBQMX q2 +`27mHHv /2bB;M i?2
T`HH2H BKTH2K2MiiBQM- +H+mHiBQM bi2Tb M/ /i /Bbi`B#miBQM
b+?2K2b BM i?2 Mśp2 *Sl BKTH2K2MiiBQM iQ KBMiBM ;QQ/
b+H#BHBiv r?2M i?2 bim/B2/ bvbi2K 2tTM/b- i?2M TQ`i i?2 KQbi
iBK2@+QMbmKBM; T`i iQ KmHiB@:Sl THi7Q`K HQM; rBi? b2p2`H
2z2+iBp2 QTiBKBxiBQM bi2TbX h?2 `2bmHib b?Qr i?i rBi? /mH
oRyy :Slb- i?2 T`QTQb2/ TT`Q+? +M +?B2p2 M p2`;2
Q7 eXe3t bT22/mT +QKT`2/ rBi? /mH Rk@+Q`2 s2QM *Sl rBi?
#mHF bBHB+QM bvbi2Kb i?i +QKT`Bb2b i?QmbM/b Q7 iQKb UR-yk9
iQKbVX
AM/2t h2`KběGBM2`@`2bTQMb2 iBK2@/2T2M/2Mi /2MbBiv 7mM+@
iBQMH i?2Q`v- :Sl- S`HH2H TT`Q+?
I. Introduction
Owing to its good balance between accuracy and computational efficiency, time-dependent density functional theory (TDDFT), established by the Runge-Gross theorem [1], is a self-consistent framework for describing the excited-state properties of systems such as molecules and solids, and has been widely used in condensed matter physics, quantum chemistry and materials science [2]–[4]. Generally speaking, there are two ways to solve the time-dependent Schrödinger equation within the framework of TDDFT [5], [6]. The most widely used approach is linear-response time-dependent density functional theory (LR-TDDFT), often referred to simply as TDDFT in the literature, which solves many-body quantum problems through Fourier transformation of the time-dependent linear response functions. It obtains excitation energies and the corresponding oscillator strengths from the poles and residues of the complex response function [7], [8].
†∗ Corresponding authors: Hong An and Junshi Chen.
qBi?BM G_@h..6h 7`K2rQ`F- i?2 KQbi +QKKQM rv
iQ Q#iBM i?2 2t+BiiBQM 2M2`;v M/ +Q``2bTQM/BM; rp2
7mM+iBQMb Bb iQ bQHp2 i?2 HBM2` `2bTQMb2 *bbB/ 2[miBQMX
h?2 KQbi iBK2@+QMbmKBM; T`i BM G_@h..6h +H+mHiBQM
Bb iQ 2tTHB+BiHv +QMbi`m+i i?2 G_@h..6h >KBHiQMBM rBi?
i?2 +QKTH2tBiv Q7 O(N5
eV rBi? `2bT2+i iQ i?2 MmK#2` Q7
2H2+i`QMb BM i?2 bim/B2/ bvbi2K- NeX aQ b i?2 bvbi2K
2tTM/b- i?2 +QKTmiiBQM M/ K2KQ`v +Qbi Q7 G_@h..6h
+H+mHiBQM BM ;2M2`B+ *Sl THi7Q`K #2+QK2b T`Q?B#BiBp2Hv
2tT2MbBp2- 2bT2+BHHv BM H`;2 mMB7Q`K #bBb b2ib bm+? b
THM2 rp2 #bBb b2iX h?2`27Q`2- Bi Bb biBHH  p2`v +?HH2M;BM;
rQ`F iQ 2tTHQ`2 i?2 2t+Bi2/ bii2 T`QT2`iB2b Q7 bvbi2K rBi?
?mM/`2/b Q7 iQKb rBi? i?2 G_@h..6h 7`K2rQ`FX
6Q`imMi2Hv- i?2 bBimiBQM ?b #22M BKT`Qp2/ i?MFb iQ
#Qi? M2r H;Q`Bi?K /2p2HQTK2Mib M/ i?2 /2p2HQTK2Mi
Q7 ?2i2`Q;2M2Qmb +QKTmiBM; `+?Bi2+im`2 HBF2 :2M2`H@
Sm`TQb2 :`T?B+b S`Q+2bbBM; lMBi U:S@:SlVX PM i?2
H;Q`Bi?KB+ bT2+i- r2 +M +``v hKK@.M+Qz TT`Qt@
BKiBQM M/ mb2 HQ+H@/2MbBiv TT`QtBKiBQM 7mM+iBQMH iQ
`2/m+2 +QKTmiiBQMH +Qbi r?BH2 KBMiBMBM; i?2 +?2KB+H
++m`+v i i?2 +QKTmiiBQMH H2p2H- b  +QMi`bi- i`/B@
iBQMH G_@h..6h +H+mHiBQM rBHH +``v i?2 +QKTmiiBQMH
bi2T rBi?  Km+? H`;2` G_@h..6h >KBHiQMBM- r?B+?
BM+`2b2b i?2 +Qbi Q7 +QKTmiiBQM- K2KQ`v M/ +QKKmMB@
+iBQMX PM i?2 ?`/r`2 bT2+i- ?2i2`Q;2M2Qmb `+?Bi2+im`2
TQr2`2/ #v :Sl ++2H2`iQ`b ?b #2+QK2 i?2 KQbi rB/2Hv
mb2/ `+?Bi2+im`2 KQM; bmT2`+QKTmi2`b- b M 2tKTH2-
h?2 hQT@k 7bi2bi bmT2`+QKTmi2` amKKBiǶb N8W +QKTmiBM;
TQr2` Bb T`QpB/2/ #v i?2 Hi2bi LoA.A h2bH oRyy :Sl-
r?B+? ?b bB;MB}+MiHv BM+`2b2/ i?2 pBH#H2 +QKTmiBM;
TQr2`X *QKT`2/ rBi? i?2 ;2M2`B+ *Sl T`Q+2bbQ`- :Sl
T`QpB/2b Km+? ?B;?2` ~QiBM;@TQBMi +QKTmiBM; 2{+B2M+v-
r?B+? +M #2 ;`2iHv #2M2}+BH 7Q` G_@h..6h +H+mHiBQMX
AM i?Bb TT2`- KQiBpi2/ #v H;Q`Bi?K M/ ?`/r`2
2pQHp2K2Mib- r2 T`QTQb2  ?B;?Hv 2{+B2Mi KmHiB@:Sl BK@
TH2K2MiiBQM Q7 G_@h..6h- r?B+? rb /2bB;M2/ +`27mHHv

*&&&OE*OUFSOBUJPOBM$POGFSFODFPO)JHI1FSGPSNBODF$PNQVUJOHBOE$PNNVOJDBUJPOT*&&&UI*OUFSOBUJPOBM
$POGFSFODFPO4NBSU$JUZ*&&&UI*OUFSOBUJPOBM$POGFSFODFPO%BUB4DJFODFBOE4ZTUFNT)1$$4NBSU$JUZ%44
978-1-7281-7649-9/20/$31.00 ©2020 IEEE
DOI 10.1109/HPCC-SmartCity-DSS50907.2020.00025
2020 IEEE 22nd International Conference on High Performance Computing and Communications | 978-1-7281-7649-9/20/$31.00 ©2020 IEEE | DOI: 10.1109/HPCC-SmartCity-DSS50907.2020.00025
Authorized licensed use limited to: University of Science & Technology of China. Downloaded on May 07,2021 at 02:48:28 UTC from IEEE Xplore. Restrictions apply.
BM Ki?KiB+H rv M/ Q|Q/b i?2 Ki`Bt KmHiBTHB+@
iBQM- 6bi 6Qm`B2` h`Mb7Q`K M/ b2p2`H +QKTmiiBQMHHv
BMi2MbBp2 F2`M2Hb +QKTmiiBQMb 7`QK *Slb iQ +QKKQ/Biv
:SlbX
q2 bmKK`Bx2 i?2 +QMi`B#miBQM b 7QHHQrb,
q2 /2bB;M M 2{+B2Mi T`HH2H bi`i2;v HQM; rBi? mb2@
7mH /i /Bbi`B#miBQM b+?2K2b iQ BKTH2K2Mi H`;2 b+H2
HBM2`@`2bTQMb2 iBK2@/2T2M/2Mi /2MbBiv 7mM+iBQMH i?2@
Q`vX
q2 +``v Qmi  K2i?Q/ 7Q` KmHiB@:Sl ++2H2`iBQM BM
G_@h..6h M/ T`QTQb2 bQK2 `2bmHi7mH QTiBKBxiBQMb
iQ 7m`i?2` `2/m+2 i?2 rHH +HQ+F iBK2X
q2 T2`7Q`K 2ti2MbBp2 2tT2`BK2Mib iQ 2pHmi2 i?2
T2`7Q`KM+2 Q7 i?2 T`QTQb2/ G_@h..6h +H+mHiBQM
QM :Slb iQ T`Qp2 i?2 2z2+iBp2M2bb Q7 Qm` K2i?Q/X
h?2 `2bi Q7 i?2 KMmb+`BTi Bb Q`;MBx2/ b 7QHHQrbX
a2+iBQM AA T`2b2Mib i?2 i?2Q`2iB+H H;Q`Bi?K M/ T`HH2H
BKTH2K2MiiBQM Q7 G_@h..6hX JmHiB@:Sl BKTH2K2Mi@
iBQM M/ bQK2 Qi?2` QTiBKBxiBQMb `2 b?QrM BM b2+iBQM
AAAX h?2 MmK2`B+H `2bmHi M/ +Q``2bTQM/BM; MHvbBb `2
/Bb+mbb2/ BM a2+iBQM AoX AM a2+iBQM o- r2 `2pB2r i?2 `2Hi2/
rQ`FbX q2 +QM+Hm/2 Qm` rQ`F M/ T`QTQb2 M QmiHQQF 7Q`
Qm` 7mim`2 rQ`F BM a2+iBQM oAX
AAX S`HH2H AKTH2K2MiiBQM Q7 G_@h..6h
AM i?Bb b2+iBQM- r2 rBHH BMi`Q/m+2 i?2 i?2Q`2iB+H TT`Q+?
M/ T`HH2H BKTH2K2MiiBQM Q7 G_@h..6h rBi? i?2 THM2
rp2 b2i QM *Sl THi7Q`K BM /2iBHX q2 `2K`F i?i
i?2 b+H#H2 BKTH2K2MiiBQM Q7 G_@h..6h Bb 7+BHBii2/
#v i?2 JSA@PT2MJS ?v#`B/ T`HH2H bi`i2;v iQ `2/m+2
i?2 KbbBp2Hv ?B;? +QKTmiiBQMH +Qbi BM +QMbi`m+iBM; M/
+H+mHiBM; i?2 2B;2MpHm2b Q7 G_@h..6h >KBHiQMBMX
X h?2Q`2iB+H H;Q`Bi?K
G_@h..6h bQHp2b 2B;2MpHm2 2[miBQM Q7 i?2 7Q`K
HX XURV
q?2`2 X`2T`2b2Mib i?2 +Q2{+B2Mi Q7 2t+BiiBQM rp2@
7mM+iBQM BM i?2 EQ?M@a?K Q`#BiHb M/ ΛT`2b2Mib i?2
2t+BiiBQM 2M2`;B2bX h?2 HBb r?i r2 +HH G_@h..6h
>KBHiQMBM #2+mb2 Bi ?b bBKBH` 7Q`K rBi? >KBHiQMBM
BM Ea@.6h (3)X
$$H = \begin{pmatrix} D + 2V_{Hxc} & 2W_{Hxc} \\ -2W_{Hxc}^{*} & -D - 2V_{Hxc}^{*} \end{pmatrix} \qquad (2)$$
Here $D(i_v i_c, j_v j_c) = (\varepsilon_{i_c} - \varepsilon_{i_v})\,\delta_{i_v j_v}\,\delta_{i_c j_c}$, where $\delta$ denotes the Kronecker delta, is an $N_{cv} \times N_{cv}$ diagonal matrix ($N_{cv} = N_c \times N_v$, where $N_c$ is the number of conduction orbitals and $N_v$ is the number of valence orbitals). These energies and orbitals are typically obtained from Kohn-Sham density functional theory (KSDFT) [9] calculations. $V_{Hxc}$ and $W_{Hxc}$ are the Hartree-exchange-correlation integrals [10]
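$$V_{Hxc}(i_v i_c, j_v j_c) = \iint \Psi_{i_v}^{*}(r)\,\Psi_{i_c}(r)\, f_{Hxc}(r, r')\, \Psi_{j_v}(r')\,\Psi_{j_c}^{*}(r')\, dr\, dr'$$
$$W_{Hxc}(i_v i_c, j_v j_c) = \iint \Psi_{i_v}^{*}(r)\,\Psi_{i_c}(r)\, f_{Hxc}(r, r')\, \Psi_{j_v}^{*}(r')\,\Psi_{j_c}(r')\, dr\, dr' \qquad (3)$$
Here $f_{Hxc}$ is the Hartree-exchange-correlation kernel
$$f_{Hxc}(r, r') = f_H(r, r') + f_{xc}[n](r, r') = \frac{1}{|r - r'|} + \frac{\delta V_{xc}[n](r)}{\delta n(r')} \qquad (4)$$
where $n(r) = \sum_{i=1}^{N_v} |\Psi_i(r)|^2$ is the electron density and $f_{xc}$ is the exchange-correlation kernel used in the LR-TDDFT calculation.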
h?MFb iQ i?2 hKK@.M+Qz TT`QtBKiBQM Uh.V (RR)-
WHxc Bb M2;H2+i#H2 M/ H#2+QK2b >KBHiQMBM Ki`Bt
rBi? i?2 7Q`K
H=D+2VHxc U8V
aQ #bB+HHv r2 M22/ iQ ;2i >`i`22@2t+?M;2@+Q``2HiBQM
BMi2;`H V>t+ iQ +QMbi`m+i i?2 G_@h..6h >KBHiQMBM-
M/ BM /Bb+`2i2 +b2b VHxc +M #2 +QMbi`m+i2/ b i?2
KmHiBTHB+iBQM Q7 i?2 Ki`Bt f>t+ M/ i`MbTQb2/ E?i`B@
_Q T`Q/m+i Ki`Bt (Rk) Pvc ={Ψiv(ric(r)}rBi? i?2
pH2M+2 M/ +QM/m+iBQM Q`#BiHb iv(r)M/ Ψic(r)) BM `2H
bT+2- ?2`2 Nr/2MQi2b i?2 MmK#2` Q7 ;`B/ TQBMib 7Q` 
rp27mM+iBQM BM `2H bT+2X
VHxc =P
vcfHxcPvc UeV
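Equations (5) and (6) can be illustrated on a toy grid. The sketch below is not the authors' implementation: it assumes real-valued orbitals, a kernel that is diagonal on the grid (as for the LDA exchange-correlation part only; the Hartree part is left out), and no grid integration weight. All names and numbers are hypothetical.

```python
def pair_products(psi_v, psi_c):
    """Columns of P_vc: pointwise products psi_v(r) * psi_c(r), one per (v, c) pair."""
    return [[a * b for a, b in zip(pv, pc)] for pv in psi_v for pc in psi_c]


def coupling_matrix(pairs, f_diag):
    """V[p][q] = sum_r pairs[p][r] * f_diag[r] * pairs[q][r], i.e. P^T f P for real orbitals."""
    return [[sum(x * f * y for x, f, y in zip(p, f_diag, q)) for q in pairs]
            for p in pairs]


def tda_hamiltonian(eps_v, eps_c, v):
    """H = D + 2 V with D the diagonal matrix of eps_c - eps_v."""
    d = [ec - ev for ev in eps_v for ec in eps_c]
    n = len(d)
    return [[(d[i] if i == j else 0.0) + 2.0 * v[i][j] for j in range(n)]
            for i in range(n)]


# Toy data on a 3-point grid: one valence and two conduction orbitals (hypothetical)
psi_v = [[1.0, 0.0, -1.0]]
psi_c = [[0.5, 0.5, 0.5], [1.0, -1.0, 1.0]]
f_diag = [0.1, 0.2, 0.3]

pairs = pair_products(psi_v, psi_c)
v_mat = coupling_matrix(pairs, f_diag)
h_mat = tda_hamiltonian([-0.5], [0.1, 0.3], v_mat)
print(h_mat)
```

The triple sum in `coupling_matrix` is the operation that a production code would hand to a GEMM routine, which is why the cost of this step grows as $N_r N_v^2 N_c^2$.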
6Q` bBKTHB+Biv M/ +QKTmiiBQMH b+H#BHBiv- r2 +?QQb2
iQ mb2 i?2 HQ+H@/2MbBiv TT`QtBKiBQM UG.V 7mM+iBQMH
(Rj) BM i?2 Ea.6h M/ G_@h..6h +H+mHiBQMX qBi?
G. 7mMiBQMH- fxc i?i /2MQi2b 2t+?M;2@+Q``2HiBQM
TQi2MiBH- M/ Bi Bb /B;QMH BM `2H bT+2 {ri}Nr
i=1-bQ
i?2 2t+?M;2@+Q``2HiBQM T`Q/m+i fxcPvc Bb KQ`2 2{+B2MiHv
+QKTmi2/ BM `2H bT+2 pB :2M2`H Ji`Bt JmHiBTHv rBi? 
+QKTmiiBQMH +QKTH2tBiv Q7 O(NrN2
vN2
c)O(N5
e)- ?2`2
Ne/2MQi2b i?2 MmK#2` Q7 2H2+i`QMb BM i?2 bim/B2/ bvbi2KX
q2 MQiB+2 i?i i?2 >`i`22 TQi2MiBH vHBb /B;QMH BM
`2+BT`Q+H bT+2 {Gi}Ng
i=1- ?2`2 NgBb i?2 ;`B/ TQBMib
BM `2+BT`Q+H bT+2X q2 i`Mb7Q`K VHiQ `2+BT`Q+H bT+2
bQ ˆvHˆ
Pvc +M #2 +QKTmi2/ BM  KQ`2 2{+B2Mi Tii2`MX
ˆvHˆ
Pvc +M #2 +H+mHi2/ #v :2M2`H Ji`Bt JmHiBTHv rBi?
 +QKTmiiBQMH +QKTH2tBiv Q7 O(NgN2
vN2
c)O(N5
e)7i2`
r2 mb2 i?2 6bi 6Qm`B2` h`Mb7Q`K U66hV iQ +``v i?2
i`Mb7Q`K 7`QK `2H bT+2 iQ `2+BT`Q+H bT+2X 7i2` i?2b2
bi2Tb- r2 T2`7Q`K 66h iQ i?2 T`Q/m+i Q7 >`i`22 TQi2MiBH
QT2`iBQM #2ir22M `2+BT`Q+H M/ `2H bT+2 iQ // vHPvc
M/ fxcPvc M/ KmHiBTHv P
vc iQ ;2i i?2 `2bmHi Q7 >`i`22@
2t+?M;2@+Q``2HiBQM BMi2;`H VHxcX
++Q`/BM; iQ 1[miBQM k- G_@h..6h >KBHiQMBM +M
#2 +QMbi`m+i2/ #v //BM; VHxc M/ DX _B;?i 7i2` +QM@
bi`m+iBM; i?2 G_@h..6h >KBHiQMBM H- i?2 `2bi Q7 i?2
G_@h..6h +H+mHiBQM Bb /B;QMHBxBM; i?2 G_@h..6h

Authorized licensed use limited to: University of Science & Technology of China. Downloaded on May 07,2021 at 02:48:28 UTC from IEEE Xplore. Restrictions apply.
>KBHiQMBM HiQ ;2i i?2 2t+BiiBQM rp27mM+iBQMb X
M/ 2M2`;B2b ΛX h?2 T`Q+2bb Q7 G_@h..6h +H+mHiBQMb
Bb T`2b2Mi2/ BM H;Q`Bi?K RX
"X S`HH2H AKTH2K2MiiBQM
q2 BKTH2K2Mi i?2 G_@h..6h KQ/mH2 iQ TQbi@T`Q+2bb
i?2 rp27mM+iBQMb ΨM/ ;`QmM/@bii2 2M2`;B2b εi++mH@
Hi2/ #v Sq.6h USHM2rp2 /2MbBiv 7mM+iBQMH i?2Q`vV
(R9)- r?B+? Bb  bm#KQ/mH2 Q7 .:.6h U.Bb+QMiBMmQmb
:H2`FBM .2MbBiv 6mM+iBQMH h?2Q`vV (R8)- (Re)X .:.6h Bb
 KbbBp2Hv T`HH2H bQ7ir`2 T+F;2 iQ T2`7Q`K H`;2@b+H2
EQ?M@a?K /2MbBiv 7mM+iBQMH i?2Q`v U.6hV +H+mHiBQMb
2{+B2MiHv M/ Sq.6h Bb  b2H7@+QMiBM2/ KQ/mH2 7Q` T2`@
7Q`KBM; +QMp2MiBQMH biM/`/ THM2@rp2 #b2/ 2H2+i`QMB+
bi`m+im`2 +H+mHiBQMbX hQ KBMiBM i?2 ?B;? +QKTmiiBQMH
2{+B2M+v i  H`;2 b+H2- r2 /2bB;M  JSA@PT2MJS ?v#`B/
T`HH2HBxiBQM bi`i2;v BM G_@h..6h BKTH2K2MiiBQMX AM
bT2+B}+- r2 mb2 JSA iQ ?M/H2 /Bz2`2Mi ivT2b Q7 /i M/
ibF /Bbi`B#miBQM b+?2K2b M/ r2 mb2 PT2MJS iQ 7m`i?2`
/Bb+Qp2` i?2 T`HH2HBbK Q7 2+? JSA ibF bQ Bi +M #2 7mHHv
b+H#H2 QM KQ/2`M bmT2`+QKTmi2`bX
b r2 +M b22 7`QK };m`2 R- i?2`2 `2 irQ KBM /i
/Bbi`B#miBQM b+?2K2b 7Q` i?2 rp27mM+iBQMb BM i?2 G_@
h..6h BKTH2K2MiiBQMX h?2 }`bi QM2 Bb +QHmKM #HQ+F
H;Q`Bi?K R h?2 Tb2m/Q+Q/2 7Q` +``vBM; i?2 G_@h..6h
+H+mHiBQMbX
AMTmi,
:`QmM/@bii2 2M2`;B2b i
qp27mM+iBQMb Ψμ(r)M/ Ψν(r)/Bbi`B#mi2/ ++Q`/BM;
iQ i?2 +QHmKM #HQ+F BM/2tX
R, 7Q` 2+? JSA ibF BM G_@h..6h +H+mHiBQM /Q
k, AMBiBHBx2 Pvc(r)={Ψμ(rν(r)}BM `2H bT+2
O(NvNcNr)
j, *``v Qmi JSAnHHiQHH iQ rp27mM+iBQMb ΨiQ i`Mb@
72` /i /Bbi`B#miBQM b+?2K2 7`QK `Qr #HQ+F T`iBiBQM
iQ +QHmKM #HQ+F T`iBiBQM
9, i`Mb72` Pvc(r)BMiQ `2+BT`Q+H bT+2 Uzi_k*V
O(NrHQ;NrNvNc)
8, TTHv i?2 >`i`22 TQi2MiBH BM `2+BT`Q+H bT+2 M/
i`Mb72` BMiQ `2H bT+2 v>Pvc ∼O(NrHQ;NrNvNc+
NcNvNr)
e, *``v Qmi JSAnHHiQHH iQ rp27mM+iBQMb ΨiQ
i`Mb72` /i /Bbi`B#miBQM b+?2K2 7`QK +QHmKM #HQ+F
T`iBiBQM iQ `Qr #HQ+F T`iBiBQM
d, *QKTmi2 i?2 >`i`22@2t+?M;2@+Q``2HiBQM BMi2;`Hb
V>t+ BM `2H bT+2 U:1JJV ∼O(NrN2
vN2
c)
3, bmKK`Bx2 V>t+ rBi?BM HH JSA ibFb #v
JSAnHH`2/m+2
N, 2M/ 7Q`
Ry, P#iBM G_@h..6h >KBHiQMBM #v +QKTmiBM; i?2
/Bz2`2M+2 Q7 EQ?M@a?K 2M2`;v 2B;2MpHm2b
RR, .B;QMHBx2 i?2 G_@h..6h >KBHiQMBM H
PmiTmi, 1t+Bi2/@bii2 2M2`;B2b {λi}M/ rp27mM+iBQMb
{xij }
T`iBiBQM- 2+? +QHmKM Q7 rp27mM+iBQM `2 biQ`2/ iQ QM2
JSA ibF #b2/ QM Bib +QHmKM BM/2tX h?Bb /i /Bbi`B#miBQM
b+?2K2 Bb ?B;?Hv 2{+B2Mi iQ TTHv >i`22 QT2`iQ` bBM+2
/Bz2`2Mi JSA ibFb `2 #H2 iQ T2`7Q`K 66hb BM/2T2M/2MiHv
BM `2+BT`Q+H bT+2X h?2 b2+QM/ QM2 Bb `Qr #HQ+F T`iBiBQM-
BM i?Bb b+?2K2 i?2 /i Bb /Bbi`B#mi2/ ++Q`/BM; iQ i?2
T`iBiBQM Q7 JSA ibFbX h?Bb /Bbi`B#miBQM b+?2K2 7+BHBii2b
i?2 2pHmiBQM Q7 Ki`Bt@Ki`Bt KmHiBTHB+iBQM U:1JJVX AM
G_@h..6h i?2 rp27mM+iBQMb Ψ`2 KQbiHv /Bbi`B#mi2/ BM
i?2 +QHmKM #HQ+F T`iBiBQM iQ T`2b2Mi bmT2`TQbBiBQM Q7  b2i
Q7 2B;2Mrp2 7mM+iBQMb BM i?2 bvbi2KX h?2 +QMp2`bBQM 7`QK
+QHmKM #HQ+F T`iBiBQM iQ `Qr #HQ+F T`iBiBQM Bb T2`7Q`K
pB JSAnHHiQHHX
III. Multi-GPU Acceleration
When computing excitation energies in molecules and solids with the plane-wave basis set, applying the Hartree potential in reciprocal space and computing the Hartree-exchange-correlation integrals in real space usually take 80% of the wall clock time in our naïve CPU implementation. More precisely, the most time-consuming part consists of operations such as the Fast Fourier Transform (FFT), General Matrix Multiply (GEMM), matrix diagonalization (SYEVD) and some MPI communication such as MPI_Allreduce. In the CPU implementation, these matrix operations are performed with the level 3 Basic Linear Algebra Subprograms (BLAS), Linear Algebra PACKage (LAPACK) and Fastest Fourier Transform in the West (FFTW) libraries. To fully accelerate the time-intensive part of LR-TDDFT, we port these kernels to GPUs using either GPU-accelerated libraries or custom CUDA kernels. We also adopt a better data partition to overlap MPI communication with GPU computation, in order to fully exploit the parallel potential of the heterogeneous architecture.
6B;X RX S`HH2H /i M/ ibF /Bbi`B#miBQM b+?2K2b Q7 G_@h..6hX UV
"M/ T`HH2HBxiBQM rBi? +QHmKM #HQ+F T`iBiBQM U7Q` 66hV M/ U#V
;`B/ T`HH2HBxiBQM rBi? `Qr #HQ+F T`iBiBQM U7Q` :1JJVX AHHmbi`iBQM
Q7 i?2 T`HH2H b+?2K2 rBi? rp27mM+iBQM MmK#2` Ncv 4 3 M/ 9
+QKTmiiBQM +Q`2bX

Authorized licensed use limited to: University of Science & Technology of China. Downloaded on May 07,2021 at 02:48:28 UTC from IEEE Xplore. Restrictions apply.
AM Q`/2` iQ +``v Qmi 2{+B2Mi ++2H2`iBQM QM KmHiB@:Sl
THi7Q`K- r2 T2`7Q`K  b2`BH Q7 QTiBKBxiBQM bi2TbX
A. Point-to-point Replacement
The first step in porting LR-TDDFT to GPUs is to replace the most computationally expensive kernels, such as FFT and GEMM, with cuBLAS, cuFFT and custom CUDA kernels to accelerate the computation of Algorithm 1. In our implementation, the custom CUDA kernels fill the gaps between the cuFFT calls, such as applying a damping coefficient to accelerate convergence. The data distribution scheme of our CPU implementation is well suited to multi-GPU parallelization because each MPI task holds all the data needed to perform its numerical operations, so no GPU-GPU data transfer is needed between calculation steps. After these kernels are mapped to the GPU, the CPUs perform only a small amount of computation and most of the CPU time is spent on MPI communication; a data copy between CPU and GPU is necessary before and after each GPU operation.
B. Overlap of computation and communication
In principle, CUDA-aware MPI communication can naturally overlap with GPU computation. However, as Algorithm 1 shows, right after computing the Hartree-exchange-correlation integrals via General Matrix Multiply (GEMM), we perform MPI_Allreduce to gather $V_{Hxc}$ on all MPI tasks. Because of this data dependence, MPI_Allreduce must wait for GEMM to finish, which prevents computation and communication from overlapping. As the system size increases, both GEMM and MPI_Allreduce become expensive even though GPUs handle GEMM very efficiently. To fully accelerate LR-TDDFT, we therefore set out to overlap computation and communication.
6B`bi Q7 HH- 7i2` MHvxBM; i?2 /i T`iBiBQM Q7 G_@
h..6h- r2 }M/ i?i iQ +H+mHi2 i?2 /Bz2`2M+2 #2ir22M
EQ?M@a?K 2M2`;v 2B;2MpHm2b- MQi HH JSA ibFb M22/ iQ
biQ`2 i?2 r?QH2 V>t+ Ki`BtX aQ 7i2` :1JJ- r2 +?M;2
i?2 rv Q7 /i T`iBiBQM- b r2 +M b22 7`QK 6B;m`2 k- i?2M
r2 ;2i i?2 `2bmHi Q7 :1JJ- 2+? JSA ibF QMHv M22/b iQ
biQ`2 T`i Q7 i?2 V>t+ Ki`BtX
6B;X kX .i T`iBiBQM QTiBKBxiBQM Q7 V>t+
6B;X jX .i `2/m+iBQM QTiBKBxiBQM- iF2 i?2 }`bi `Qr Q7 Ji`Bt V>t+
b 2tKTH2
h?2 #Qp2 ii2KTib #`BM; #Qmi irQ K2`BibX 6B`bi- i?Bb
/i T`iBiBQM rv +M `2/m+2 K2KQ`v mb;2 7Q` i?2
MmK#2` Q7 JSA T`Q+2bb2b iBK2bX a2+QM/- r2 rBHH MQi M22/ iQ
T2`7Q`K JSAnHH`2/m+2 iQ ;i?2` i?2 r?QH2 V>t+ Ki`Bt-
BMbi2/- r2 mb2 JSAn_2/m+2 iQ i`Mb72` T`i Q7 V>t+
Ki`Bt iQ 2+? JSA ibF #b2/ QM i?2 BM/2tX
PrBM; iQ i?2b2 ii2KTib r2 2HBKBMi2 T`i Q7 i?2 /i
/2T2M/2M+B2b- iQ #2 /2iBH2/- r2 +M bTHBi i?2 Ki`Bt BMiQ
bKHH #HQ+Fb M/ T2`7Q`K :1JJ KMmHHv iQ i?2b2 bKHH
#HQ+FbX h?2 #bB+ ~Qr Q7 :1JJ M/ _2/m+2 Bb b?QrM BM
6B;m`2 9 M/ -i?2 TBT2HBM2 Q7 :1JJ M/ `2/m+2 7i2` Qm`
QTiBKBxiBQM `2 b?QrM BM 6B;m`2 8X PM+2 2+? #HQ+F ;2ib
i?2 `2bmHi- r2 +M BKK2/Bi2Hv `2/m+2 i?2 #HQ+F Ki`Bt iQ
2+? JSA ibF #v JSAn_2/m+2X 7i2`r`/ r2 rBHH ;2i i?2
r?QH2 V>t+ #mi /Bbi`B#mi2/ BM 2+? JSA ibFX
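The block-wise GEMM followed by a reduction to the owning task can be emulated in a single process. The sketch below is a serial stand-in, not the authors' implementation: ranks are list entries, the local GEMM is a plain triple sum over the grid rows each rank holds, the kernel is absorbed into the pair products, and each output block is summed onto one owner as a reduction to that rank would do. In a real pipeline the reduction of one block would run while the GEMM of the next block is in progress; here the two simply alternate. All names are hypothetical.

```python
def block_ranges(n_items, n_tasks):
    """Split range(n_items) into n_tasks contiguous, nearly equal blocks."""
    base, extra = divmod(n_items, n_tasks)
    out, start = [], 0
    for rank in range(n_tasks):
        stop = start + base + (1 if rank < extra else 0)
        out.append((start, stop))
        start = stop
    return out


def partial_block(p_local, rows):
    """Local GEMM restricted to output rows `rows`, summed over this rank's grid points."""
    n_pairs = len(p_local[0])
    return [[sum(r[i] * r[j] for r in p_local) for j in range(n_pairs)] for i in rows]


def blockwise_reduce(p_parts):
    """Compute V block by block; block b is summed onto owner rank b only."""
    n_tasks = len(p_parts)
    n_pairs = len(p_parts[0][0])
    owned = []
    for owner, (i0, i1) in enumerate(block_ranges(n_pairs, n_tasks)):
        rows = range(i0, i1)
        contribs = [partial_block(p, rows) for p in p_parts]        # GEMM on every rank
        owned.append([[sum(c[a][j] for c in contribs) for j in range(n_pairs)]
                      for a in range(len(rows))])                   # reduce to `owner`
    return owned


# Two ranks, each holding one grid row of a 4-pair P matrix (hypothetical values)
p_parts = [[[1.0, 2.0, 3.0, 4.0]], [[0.5, 0.0, -1.0, 2.0]]]
owned = blockwise_reduce(p_parts)
print(owned[0])
```

Each rank ends up with only its own rows of $V_{Hxc}$ (two of four here), which is where the memory saving proportional to the number of tasks comes from.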
JSA `MF y
JSA `MF R
JSAn_2/m+2
6B;X 9X h?2 #bB+ BKTH2K2MiiBQM Q7 :1JJ M/ _2/m+2- iF2 JSA
ibF 4 k b 2tKTH2
C. Mixed-precision computation
Because the elements of $V_{Hxc}$ in the Hamiltonian matrix of the LR-TDDFT equation are much smaller than those of $D$, the LR-TDDFT excitation energy is determined mainly by the energy level differences, so we can relax the accuracy of the construction of $V_{Hxc}$ to a certain extent. To further optimize the code while retaining computational accuracy, we test different levels of mixed precision. As a result, we find an accurate and fast scheme that uses single precision in the GEMM and FFT operations but double precision when initializing $P_{vc}(r)$ in real space and when diagonalizing the LR-TDDFT Hamiltonian $H$. More precisely, the $\Psi$ matrix held by each MPI task in the row block partition is converted from double precision to single precision before the GEMM and FFT; for the other numerical calculations, we convert the output matrix back to double precision to maintain accuracy.
q2 +QKT`2 i?2 KBt2/@T`2+BbBQM BKTH2K2MiiBQM rBi?
i?2 /Qm#H2 T`2+BbBQM KQ/2H #v mbBM;  ivTB+H +QM};m`iBQM
Q7  bBHB+QM +`vbiH bvbi2K +QKTQb2/ Q7 Ryk9 iQKb M/
rBi? Nvc = 128 ×128 = 16384- M/ Q#b2`p2  /2pBiBQM Q7
`QQi K2M b[m`2 yXkN2ofKQH2+mH2 BM i?2 2M2`;v T`2/B+iBQMX
aBM+2 i?2b2 /2pBiBQMb `2 Km+? H2bb i?M i?2 2``Q` +mb2/
#v HQ+H@/2MbBiv TT`QtBKiBQM- bQ i?2 KBt2/ T`2+BbBQM
BKTH2K2MiiBQM Bb Q7 biBb7+iQ`v ++m`+vX AM i2`Kb Q7
bT22/ M/ :Sl K2KQ`v `2[mB`2K2Mi- i?2 F2`M2H rBi?
KBt2/ T`2+BbBQM Bb RXd iBK2b 7bi2` i?M M/ +QMbmK2b
8yW H2bb K2KQ`v i?M i?2 /Qm#H2 T`2+BbBQM p2`bBQMX 7i2`
i2biBM;  HQi Q7 Qi?2` KBt2/ T`2+BbBQM THMb- r2 `2K`F
i?i Hi?Qm;?  7mHHv@~Qi BKTH2K2MiiBQM Bb KQ`2 TQr2`
2{+B2Mi QM i?2 LoA.A oRyy :Sl- Qm` i2bib b?Qr i?i-
/m2 iQ i?2 HBKBi2/ `2T`2b2MiiBQM `M;2 rBi? jk #BM`v #Bib-
i?2 +Q``2bTQM/BM; ii2KTi +MMQi T`2b2`p2 i?2 `2[mB`2/ +@
+m`+v Q7 i?2 2t+Bi2/ 2M2`;v M/ rp27mM+iBQMbX h?2`27Q`2-
i i?Bb KQK2Mi- r2 +MMQi iF2 /pMi;2 Q7 i?2 K2`Bib Q7
i?2 7mHHv@~Qi BKTH2K2MiiBQMX
.X J2KQ`v ++2bb QTiBKBxiBQM
AM i?2 #b2HBM2 :Sl BKTH2K2MiiBQM- Ki`Bt fxc M/
p2+iQ` Eigenvalue `2 biQ`2/ BM i?2 :Sl ;HQ#H K2KQ`v
7Q` i?2 T2MmHiBKi2 T`i Q7 i?2 +QKTmiiBQM iQ +H+mHi2
fxc+=EigenvaluejEigenvaluekX >Qr2p2`- :Sl ;HQ#H
K2KQ`v ?b ?B;? Hi2M+v M/ HQr #M/rB/i? +QKT`2/
iQ i?2 b?`2/ K2KQ`v BMbB/2 2+? :Sl bi`2KBM; KmH@
iBT`Q+2bbQ` UaJVX q2 `2K`F i?i fxc Bb +H+mHi2/ THmb
i?2 2H2K2Mi@rBb2 T`Q/m+i Q7 EigenvaluejEigenvaluekX
hQ BKT`Qp2 i?2 T2`7Q`KM+2- r2 mb2 i?2 HQr@Hi2M+v QM@
+?BT b?`2/ K2KQ`v iQ `2TH+2 i?2 :Sl ;HQ#H K2KQ`v
iQ biQ`2 fxc M/ EigenvalueX h?2M- r?2M +H+mHiBM; fxc-
i?2 H;Q`Bi?K ++2bb2b i?2 b?`2/ K2KQ`v i?i Bb Km+?
7bi2` i?M ++2bbBM; :Sl ;HQ#H K2KQ`vX
E. Single precision MPI
After the above optimizations, we found that, as the number of MPI tasks grows, the wall clock time of LR-TDDFT is dominated by the two MPI_Alltoall steps. This is quite intuitive, because MPI_Alltoall sends data from all processes to all processes, and the data volume becomes very large as the input size increases. To reduce the communication time of the data transformation between the schemes in Figure 1, we use the single-precision format for sending and receiving the wavefunctions, which halves the communication volume. The single-precision format is used only in the MPI_Alltoall communication; the wavefunctions are converted back to double precision for computation. We find that this optimization leads to a negligible deviation in the accuracy of the LR-TDDFT results: in our experiment, the error of the total energy relative to the double-precision implementation is as little as $10^{-7}$ eV/molecule for a silicon system with 1024 atoms and $N_{vc} = 128 \times 128 = 16384$.
Figure 6 shows the reduction in computational time associated with the different optimization steps. The test system is the 1024-atom silicon system shown in Figure 7, which is discussed in Section IV. The naïve CPU version of LR-TDDFT runs on dual Intel Xeon E5-2695 12-core sockets, and in our tests 6 CPU cores are bound to each MPI task. The GPU version uses dual V100 GPUs in an NVIDIA DGX-1 server [17] and is around 6.7 times faster than the CPU version for the LR-TDDFT calculation. We find that step 1 (point-to-point replacement) accounts for most of the performance improvement from CPUs to GPUs, and this step also accounts for most of the implementation effort.
Fig. 6. Wall clock time for LR-TDDFT calculation with different steps of optimization for a system with 1024 silicon atoms. The CPU version uses 24 CPU cores, and the GPU version uses 2 GPUs
AoX LmK2`B+H _2bmHi M/ MHvbBb
AM i?Bb b2+iBQM- r2 `2TQ`i i?2 MmK2`B+H `2bmHib- BM+Hm/BM;
i?2 bT22/mT- bi`QM; M/ r2H b+HBM; T2`7Q`KM+2 Q7 i?2
G_h..6h +Q/2 rBi? b2p2`H bBx2b Q7 #mHF bBHB+QM bvbi2KbX
.Bz2`2Mi bBx2 Q7 i2biBM; bvbi2Kb `2 ;2M2`i2/ iQ 7Bi?@
7mHHv BM/B+i2 i?2 T2`7Q`KM+2 Q7 G_@h..6h rBi? :Sl
++2H2`i2/- BM+Hm/BM; bBHB+QM bvbi2Kb rBi? e9- k8e- 8Rk-
Ryk9- ky93 iQKb r?B+? rBHH #2 Hi2` `2K`F2/ b aBe9-
aBk8e- aB8Rk- aBRyk9 M/ aBky93- `2bT2+iBp2HvX AM Qm` i2bib-
r2 T2`7Q`K i?2 MmK#2` Q7 pH2M+2 M/ +QM/m+iBQM Q`#BiHb
Nv4Nc4 R3y- i?mb i?2 ?2B;?i M/ H2M;i? Q7 i?2 G_@
h..6h >KBHiQMBM `2 #Qi? N+p 4 R3y Ɠ R3y 4 jk9yyX
q2 T2`7Q`K HH MmK2`B+H i2bib QM LoA.A .:s@R b2`p2`
b?QrM BM 6B;m`2 3X .:s@R b2`p2` ?b /mH ky@*Q`2 AMi2H
s2QM 18@keN3 *Sl bQ+F2ib `mMMBM; i kXk :>x M/ 3
h2bH oRyy :Sl +QMM2+i2/ pB S*A2 jXyX oRyy :Slb
`2 +QMM2+i2/ pB LoGBMF rBi?  #M/rB/i? Q7 ky:"fbX
h?2 h2bH oRyy :Sl ?b Re :" /2pB+2 K2KQ`v M/
8Rky *l. +Q`2b `mMMBM; i RXj3 :>x- r?B+? T`QpB/2

Authorized licensed use limited to: University of Science & Technology of China. Downloaded on May 07,2021 at 02:48:28 UTC from IEEE Xplore. Restrictions apply.
JSA `MF y
JSA `MF R
JSAn_2/m+2
JSA `MF y
JSA `MF R
JSAn_2/m+2
JSA `MF y
JSA `MF R JSAn_2/m+2
JSA `MF y
JSA `MF R JSAn_2/m+2
6B;X 8X Pp2`HT Q7 +QKTmiiBQM M/ +QKKmMB+iBQM- i?Bb };m`2 b?Qrb i?2 TBT2HBM2 Q7 :1JJ M/ _2/m+2- r2 bTHBi i?2 Ki`B+2b KMMmHv
iQ +H+mHi2 b2p2`H bKHH #HQ+F Ki`B+2bX hF2 JSA ibF4kb2tKTH2
 i?2Q`2iB+H T2F T2`7Q`KM+2 Q7 dX3h6GPSa /Qm#H2
T`2+BbBQM QT2`iBQMbX h?2 QT2`iBM; bvbi2K Q7 i?2 b2`p2`
Bb e9@#Bi _2/ >i 1Mi2`T`Bb2 GBMmt M/ i?2 bvbi2K ?b BM
iQiH 8Rk:" ._J K2KQ`vX
AM i?Bb TT2`- i?2 *Sl H;Q`Bi?K M/ :Sl H;Q`Bi?K `2
`mMMBM; QM *YY M/ *l. RyXy- `2bT2+iBp2HvX q?B+? r2
i2bi :Sl p2`bBQMb Q7 G_@h..6h- 2+? JSA ibF Bb #QmM/
iQ M BM/BpB/mH :Sl M/ 2+? 2tT2`BK2Mi Bb 2t2+mi2/ 7Qm`
iBK2b M/ r2 `2TQ`i i?2 p2`;2 `2bmHibX
A. Speedup
First, we perform experiments with several sizes of silicon systems to verify that our implementation works well on systems of different sizes. Three versions of the algorithm are executed for comparison: the CPU version, the unoptimized GPU version and the optimized GPU version. The unoptimized and optimized GPU versions correspond to the point-to-point approach and the final implementation, respectively. The CPU version uses 20 CPU cores and 5 MPI tasks, so each MPI task is bound to 4 CPU cores, and data dispatch and multithreaded execution are provided by OpenMP. Figure 9 shows the running time and speedups of the LR-TDDFT calculation on the Tesla V100 GPUs and on the CPUs. The optimized GPU algorithm achieves an average speedup of 5.75× and up to 6.94× over the CPU version. In contrast, the unoptimized GPU algorithm achieves an average speedup of 3.01× and up to 3.20× over the CPU version. The results demonstrate the effectiveness and generality of the optimization strategies in Section III. The figure also shows that the speedups keep increasing as the system size grows. However, the maximum system size that can be tested on the GPU is Si2048, because of limited GPU memory.
Fig. 7. Bulk silicon systems with 1024 atoms
Fig. 8. The architecture of NVIDIA DGX-1.
Fig. 9. Running time and speedups for test systems of different sizes on the two platforms. GPU-unoptimized denotes our naïve implementation and GPU-optimized denotes our final version after 4 steps of optimization
B. Strong Scaling
The most significant concern for LR-TDDFT calculations is strong scalability on heterogeneous multi-GPU architectures, which reflects how much benefit additional hardware resources bring for a fixed system as the number of GPUs increases. We test the GPU-optimized version of LR-TDDFT on the bulk silicon system with 1024 atoms and evaluate strong scalability by the parallel efficiency defined in Equation 7, where the speedup is measured relative to the wall clock time on a single GPU.
$$\text{Efficiency} = \frac{\text{speedup}}{\text{Number of GPUs}} \qquad (7)$$
b r2 +M b22 7`QK 6B;m`2 Ry- T`HH2H 2{+B2M+v ;Q2b
/QrM r?2M i?2 MmK#2` Q7 :Sl BM+`2b2bX h?2 KBM `2bQM
7Q` i?Bb T?2MQK2MQM Bb i?i bBM+2 r2 /Bbi`B#mi2 Ki`B+2b
BMiQ 2+? JSA ibF 2[m#Hv r?B+? #QmM/b iQ  :Sl
M/ i?2 bBx2 Q7 2+? Ki`Bt T2`7Q`K2/ :1JJ M/ 66h
#2+QK2b bKHH2` r?2M r2 mb2 KQ`2 :Slb- bQ :Sl +MMQi
2z2+iBp2Hv mb2 Bib +QKTmiBM; TQr2`X "mi T`HH2H 2{+B2M+v
Bb biBHH #Qp2 88W- r?B+? Bb [mBi2 ++2Ti#H2 bBM+2 BM i?2
+H+mHiBQM Bi 2tBbib KMv iBK2b Q7 ;HQ#H +QKKmMB+iBQMX
*X q2F a+HBM;
MQi?2` BKTQ`iMi +`Bi2`BQM 7Q` i?2 G_@h..6h +H+m@
HiBQMb Bb i?2 r2F b+H#BHBiv- r?B+? `2~2+ib i?2 T`HH2H
T2`7Q`KM+2 7Q`  b+H2/ T`Q#H2K bBx2 HQM; rBi?  }t2/
MmK#2` Q7 :SlX .Bz2`2Mi bBx2b Q7 bBHB+QM bvbi2Kb `2 i2bi2/
QM k oRyy :Slb #2ir22M lMQTiBKBx2/ M/ PTiBKBx2/
:Sl p2`bBQMbX h#H2 A b?Qr i?i +QKT`2/ rBi? aB8Rk
bvbi2K- aBRyk9 M/ aBky93 +?B2p2 HKQbi HBM2` b+HBM; BM
QTiBKBx2/ :Sl p2`bBQM- r?B+? /2KQMbi`i2b i?2 b+H#BHBiv
Q7 Qm` BKTH2K2MiiBQMX
Fig. 10. Strong scaling: the total time and parallel efficiency of the GPU-optimized version using different numbers of GPUs.
TABLE I
Wall clock time for test systems of different scale with 2 GPUs

Testing system    GPU time (unoptimized)    GPU time (optimized)
Si64              45.98                     34.13
Si256             94.68                     53.51
Si512             111.6                     69.53
Si1024            260.95                    121.68
Si2048            503.66                    221.9
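The entries of Table I can be turned into two derived quantities: the gain of the optimized over the unoptimized GPU version, and the growth in optimized run time each time the atom count doubles. The short script below does that arithmetic on the table values as printed; the variable names are arbitrary.

```python
# (unoptimized, optimized) wall clock times copied from Table I
table = {
    "Si64": (45.98, 34.13),
    "Si256": (94.68, 53.51),
    "Si512": (111.6, 69.53),
    "Si1024": (260.95, 121.68),
    "Si2048": (503.66, 221.9),
}

# Gain of the optimized version over the unoptimized one, per system
gain = {name: unopt / opt for name, (unopt, opt) in table.items()}

# Growth of the optimized time when the number of atoms doubles
growth_512_to_1024 = table["Si1024"][1] / table["Si512"][1]
growth_1024_to_2048 = table["Si2048"][1] / table["Si1024"][1]

for name, g in gain.items():
    print(name, round(g, 2))
print(round(growth_512_to_1024, 2), round(growth_1024_to_2048, 2))
```

Both growth factors stay below 2 for a doubling of the system, and the gain from the optimizations increases with system size.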
oX _2Hi2/ qQ`F
AM ;`QmM/ bii2 2H2+i`QMB+ bi`m+im`2 +H+mHiBQMb- p`B@
Qmb ?B;?Hv 2{+B2Mi >S* Ea@.6h bQ7ir`2 T+F;2b mb2
:Sl iQ `2/m+2 +QKTmiiBQM iBK2 bm+? b "ALAh (R3)-
"B;.6h (RN)- P+iQTmb (k)- SqKi (ky)- (kR)- ZmMimK
1aS_1aaP (kk)- .:.6h (R8)- (kj) M/ oaS (k9)- (k8)-
r?B+? `2 #2M2}+BH iQ iF2 7mHH /pMi;2 Q7 i?2 KbbBp2
T`HH2HBbK pBH#H2 QM KQ/2`M ?2i2`Q;2M2Qmb bvbi2KbX
"mi QMHv b2p2`H >S* bQ7ir`2 T+F;2b 7Q` 2t+Bi2/ bii2
2H2+i`QMB+ bi`m+im`2 +H+mHiBQMb ?p2 #22M /2p2HQT2/- bm+?
b Lq*?2K (ke)- "2`F2H2v:q (kd) M/ Z*?2K (k3)- /m2 iQ
bm+? mHi`?B;? +QKTmiiBQM M/ K2KQ`v +Qbi BM i?2 2t+Bi2/
bii2 2H2+i`QMB+ bi`m+im`2 +H+mHiBQMb- 2bT2+BHHv rBi? THM2
rp2 #bBb b2iX AM T`iB+mH`- CB 2i HX (kN) ++2H2`i2 ?v#`B/
7mM+iBQMH `i@h..6h +H+mHiBQMb- r?B+? Bb MQi?2` 7Q`K
Q7 h..6h- mbBM; i?2 T`HH2H i`MbTQ`i ;m;2 7Q`KHBbK
#v :Sl QM amKKBi- i?2B` BKTH2K2MiiBQM +M 2{+B2MiHv
b+H2 iQ d3e :Slb 7Q`  H`;2 bvbi2K rBi? R8je bBHB+QM
iQKbX
VI. Conclusion and Outlook
In this paper, we propose an efficient multi-GPU implementation of linear-response time-dependent density functional theory calculations to compute excitation energies in molecules and solids with the plane-wave basis set. We carefully design the mathematical approximation model to reduce the computational cost while retaining chemical accuracy. In our naïve CPU design, we use different data partition and task distribution schemes to ensure the flexibility and efficiency of the computation. In addition, a hybrid MPI-OpenMP programming method is used to achieve high scalability on modern multi-core computing systems. The most challenging part of an LR-TDDFT calculation is generating the LR-TDDFT Hamiltonian, which usually takes 80% of the total wall clock time, so we implement a multi-GPU version by porting this most time-consuming part to multiple GPUs and apply several optimizations to fully tap the potential of modern heterogeneous GPU systems. We perform numerical tests on systems of different sizes; the GPU version achieves an average speedup of 5.75× and up to 6.94× over the CPU version, and both strong and weak scaling remain at a high level.
GQQFBM; 7Q`r`/ iQ Qm` 7mim`2 rQ`F- i?2`2 rBHH biBHH
`2KBM  HQi Q7 +?HH2M;b iQ 7m`i?2` 2tTHQ`2 i?2 2t+Bi2/
bii2 T`QT2`iB2b Q7 KQH2+mH2b M/ bQHB/b rBi? KQ/2`M ?2i@
2`Q;2M2Qmb +QKTmiBM; `+?Bi2+im`2X PM QM2 ?M/- r2 rBHH
7i2`r`/ b22F i?2 K2i?Q/ iQ 2z2+iBp2Hv mb2 i?2 KbbBp2Hv
+QKTmiBM; TQr2` BM amMrv hB?mGB;?i bmT2`+QKTmi2`
(jy) rBi?BM G_@h..6h +H+mHiBQMX PM i?2 Qi?2` ?M/-
i?2 HBKBiiBQM Q7 G_@h..6h Bb i?i i?2 ++m`+v +MMǶi
K22i T`2+Bb2 /2KM/ Q7 bBKmHi2/ bT2+i`mK- bQ r2 i`v iQ
#QQbi i?2 ++m`+v Q7 2t+Bi2/ bii2 T`QT2`iB2b #v mbBM; KQ`2
2H#Q`i2 Ki?2KiB+H K2i?Q/ HBF2 :qf"a1 (jR)X
Acknowledgment
We are thankful to the reviewers for evaluating this study and providing valuable feedback. This work is supported by the National Key Research and Development Program of China (Grants No.2018YFB0204100).
_272`2M+2b
[1] E. Runge and E. K. U. Gross, "Density-functional theory for time-dependent systems," Phys. Rev. Lett., vol. 52, pp. 997–1000, Mar 1984. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.52.997
[2] X. Andrade, J. Alberdi-Rodriguez, D. A. Strubbe, M. J. Oliveira, F. Nogueira, A. Castro, J. Muguerza, A. Arruabarrena, S. G. Louie, A. Aspuru-Guzik et al., "Time-dependent density-functional theory in massively parallel computer architectures: the Octopus project," Journal of Physics: Condensed Matter, vol. 24, no. 23, p. 233202, 2012.
[3] C. A. Ullrich, Time-dependent density-functional theory: concepts and applications. OUP Oxford, 2011.
[4] K. Yabana and G. Bertsch, "Time-dependent local-density approximation in real time," Physical Review B, vol. 54, no. 7, p. 4484, 1996.
[5] T. L. Beck, "Real-space mesh techniques in density-functional theory," Reviews of Modern Physics, vol. 72, no. 4, p. 1041, 2000.
[6] G. Onida, L. Reining, and A. Rubio, "Electronic excitations: density-functional versus many-body Green's-function approaches," Rev. Mod. Phys., vol. 74, pp. 601–659, Jun 2002. [Online]. Available: https://link.aps.org/doi/10.1103/RevModPhys.74.601
(d) JX 1X *bB/ M/ .X *?QM;- dz_2+2Mi /pM+2b BM /2MbBiv 7mM+@
iBQMH K2i?Q/b-Ǵ *QKTmiiBQMH *?2KBbi`v, _2pB2rb Q7 *m``2Mi
h`2M/b- RNN8X
(3) _X ai2`M?2BK2`- dz1H2+i`QMB+ TQH`Bx#BHBiB2b Q7 BQMb 7`QK i?2
?`i`22@7Q+F rp2 7mM+iBQMb-Ǵ S?vbB+H _2pB2r- pQHX Ne- MQX 9-
TX N8R- RN89X
(N) SX >Q?2M#2`; M/ qX EQ?M- dzAM?QKQ;2M2Qmb 2H2+i`QM ;b-Ǵ
S?vbB+H `2pB2r- pQHX Rje- MQX j"- TX "3e9- RNe9X
(Ry) qX EQ?M M/ GX CX a?K- dza2H7@+QMbBbi2Mi 2[miBQMb BM+Hm/BM;
2t+?M;2 M/ +Q``2HiBQM 2z2+ib-Ǵ S?vbB+H `2pB2r- pQHX R9y-
MQX 9- TX RRjj- RNe8X
(RR) X GX 62ii2` M/ CX .X qH2+F- dzZmMimK i?2Q`v Q7 KMv@
T`iB+H2 bvbi2Kb-Ǵ [iKT- RNdRX
(Rk) *X E?i`B M/ *X _X _Q- dzaQHmiBQMb iQ bQK2 7mM+iBQMH 2[miBQMb
M/ i?2B` TTHB+iBQMb iQ +?`+i2`BxiBQM Q7 T`Q##BHBiv /Bbi`B#m@
iBQMb-Ǵ aMF?v, h?2 AM/BM CQm`MH Q7 aiiBbiB+b- a2`B2b - TTX
RedĜR3y- RNe3X
(Rj) aX :Q2/2+F2`- JX h2i2`- M/ CX >mii2`- dza2T`#H2 /mH@bT+2
;mbbBM Tb2m/QTQi2MiBHb-Ǵ S?vbB+H _2pB2r "- pQHX 89- MQX j- TX
Rdyj- RNNeX
[14] W. Hu, L. Lin, A. S. Banerjee, E. Vecharynski, and C. Yang, "Adaptively compressed exchange operator for large-scale hybrid density functional calculations with applications to the adsorption of water on silicene," Journal of Chemical Theory and Computation, vol. 13, no. 3, pp. 1188–1198, 2017.
[15] W. Hu, L. Lin, and C. Yang, "DGDFT: A massively parallel method for large scale density functional theory calculations," The Journal of Chemical Physics, vol. 143, no. 12, p. 124110, 2015.
[16] L. Lin, J. Lu, L. Ying, and W. E, "Adaptive local basis set for Kohn–Sham density functional theory in a discontinuous Galerkin framework I: Total energy calculation," Journal of Computational Physics, vol. 231, no. 4, pp. 2140–2154, 2012.
[17] NVIDIA DGX-1 with Tesla V100 system architecture. [Online]. Available: https://images.nvidia.com/content/pdf/dgx1-v100-system-architecture-whitepaper.pdf
[18] X. Gonze, F. Jollet, F. A. Araujo, D. Adams, B. Amadon, T. Applencourt, C. Audouze, J.-M. Beuken, J. Bieder, A. Bokhanchuk et al., "Recent developments in the ABINIT software package," Computer Physics Communications, vol. 205, pp. 106–131, 2016.
[19] L. E. Ratcliff, A. Degomme, J. A. Flores-Livas, S. Goedecker, and L. Genovese, "Affordable and accurate large-scale hybrid-functional calculations on GPU-accelerated supercomputers," Journal of Physics: Condensed Matter, vol. 30, no. 9, p. 095901, 2018.
(ky) qX CB- wX *Q- GX qM;- CX 6m- sX *?B- qX :Q- M/ GX@qX
qM;- dzh?2 MHvbBb Q7  THM2 rp2 Tb2m/QTQi2MiBH /2MbBiv
7mM+iBQMH i?2Q`v +Q/2 QM  ;Tm K+?BM2-Ǵ *QKTmi2` S?vbB+b
*QKKmMB+iBQMb- pQHX R39- MQX R- TTX NĜR3- kyRjX
(kR) qX CB- CX 6m- wX *Q- GX qM;- sX *?B- qX :Q- M/ GX@qX qM;-
dz6bi THM2 rp2 /2MbBiv 7mM+iBQMH i?2Q`v KQH2+mH` /vMKB+b
+H+mHiBQMb QM KmHiB@;Tm K+?BM2b-Ǵ CQm`MH Q7 *QKTmiiBQMH
S?vbB+b- pQHX k8R- TTX RykĜRR8- kyRjX
(kk) CX _QK2`Q- 1X S?BHHBTb- :X _m2ib+?- JX 6iB+- 6X aTB;- M/
SX :BMMQxxB- dz T2`7Q`KM+2 bim/v Q7 [mMimK 2bT`2bbQǶb Trb+7
+Q/2 QM KmHiB@+Q`2 M/ ;Tm bvbi2Kb-Ǵ BM AMi2`MiBQMH qQ`Fb?QT
QM S2`7Q`KM+2 JQ/2HBM;- "2M+?K`FBM; M/ aBKmHiBQM Q7 >B;?
S2`7Q`KM+2 *QKTmi2` avbi2KbX aT`BM;2`- kyRd- TTX edĜ3dX
(kj) qX >m- sX ZBM- ZX CBM;- CX *?2M- >X M- qX CB- 6X GB- sX GBm-
.X *?2M- 6X GBm 2i HX- dz>B;? T2`7Q`KM+2 +QKTmiBM; Q7 /;/7i
7Q` i2Mb Q7 i?QmbM/b Q7 iQKb mbBM; KBHHBQMb Q7 +Q`2b QM bmMrv
iB?mHB;?i-Ǵ a+B2M+2 "mHH2iBM- kykyX
(k9) JX >+2M2- X M+Bmt@a2/`FBM- sX _QxMbF- .X EH?`-
hX :mB;MQM- M/ SX 6H2m`i@G2bb`/- dz++2H2`iBM; pbT 2H2+@
i`QMB+ bi`m+im`2 +H+mHiBQMb mbBM; ;`T?B+ T`Q+2bbBM; mMBib-Ǵ
CQm`MH Q7 +QKTmiiBQMH +?2KBbi`v- pQHX jj- MQX jk- TTX k83RĜ
k83N- kyRkX
(k8) JX >mi+?BMbQM M/ JX qB/QK- dzobT QM  ;Tm, TTHB+iBQM iQ
2t+i@2t+?M;2 +H+mHiBQMb Q7 i?2 bi#BHBiv Q7 2H2K2MiH #Q`QM-Ǵ
*QKTmi2` S?vbB+b *QKKmMB+iBQMb- pQHX R3j- MQX d- TTX R9kkĜ
R9ke- kyRkX
(ke) JX oHB2p- 1X CX "vHbF- LX :QpBM/- EX EQrHbFB- hX SX
ai`ibK- >X CX oM .K- .X qM;- CX LB2THQ+?- 1X T`- hX GX
qBM/mb 2i HX- dzLr+?2K,  +QKT`2?2MbBp2 M/ b+H#H2 QT2M@
bQm`+2 bQHmiBQM 7Q` H`;2 b+H2 KQH2+mH` bBKmHiBQMb-Ǵ *QKTmi2`
S?vbB+b *QKKmMB+iBQMb- pQHX R3R- MQX N- TTX R9ddĜR93N- kyRyX
(kd) CX .2bHBTT2- :X aKbQMB/x2- .X X ai`m##2- JX CBM- JX GX
*Q?2M- M/ aX :X GQmB2- dz"2`F2H2v;r,  KbbBp2Hv T`HH2H
+QKTmi2` T+F;2 7Q` i?2 +H+mHiBQM Q7 i?2 [mbBT`iB+H2 M/
QTiB+H T`QT2`iB2b Q7 Ki2`BHb M/ MMQbi`m+im`2b-Ǵ *QKTmi2`
S?vbB+b *QKKmMB+iBQMb- pQHX R3j- MQX e- TTX RkeNĜRk3N- kyRkX

(k3) uX a?Q- wX :M- 1X 1TB7MQpbFv- X hX :BH#2`i- JX qQ`KBi-
CX EmbbKMM- X qX GM;2- X "2?M- CX .2M;- sX 62M; 2i HX-
dz/pM+2b BM KQH2+mH` [mMimK +?2KBbi`v +QMiBM2/ BM i?2 [@
+?2K 9 T`Q;`K T+F;2-Ǵ JQH2+mH` S?vbB+b- pQHX RRj- MQX k- TTX
R39ĜkR8- kyR8X
(kN) qX CB- GX@qX qM;- M/ GX GBM- dzS`HH2H i`MbTQ`i iBK2@
/2T2M/2Mi /2MbBiv 7mM+iBQMH i?2Q`v +H+mHiBQMb rBi? ?v#`B/
7mM+iBQMH QM bmKKBi-Ǵ BM S`Q+22/BM;b Q7 i?2 AMi2`MiBQMH *QM@
72`2M+2 7Q` >B;? S2`7Q`KM+2 *QKTmiBM;- L2irQ`FBM;- aiQ`;2
M/ MHvbBb- kyRN- TTX RĜkjX
(jy) >X 6m- CX GBQ- CX uM;- GX qM;- wX aQM;- sX >mM;- *X uM;-
qX sm2- 6X GBm- 6X ZBQ 2i HX- dzh?2 bmMrv iB?mHB;?i bmT2`@
+QKTmi2`, bvbi2K M/ TTHB+iBQMb-Ǵ a+B2M+2 *?BM AM7Q`KiBQM
a+B2M+2b- pQHX 8N- MQX d- TX ydkyyR- kyReX
(jR) :X ai`BMiB- dzTTHB+iBQM Q7 i?2 ;`22MǶb 7mM+iBQMb K2i?Q/ iQ i?2
bim/v Q7 i?2 QTiB+H T`QT2`iB2b Q7 b2KB+QM/m+iQ`b-Ǵ G _BpBbi
/2H LmQpQ *BK2MiQ URNd3@RNNNV- pQHX RR- MQX Rk- TTX RĜ3e- RN33X
