A Novel 3D Paradigm for Target Expansion of
Augmented Reality SSVEP
Beining Cao1, Charlie Li-Ting Tsai1, Nan Zhou1, Thomas Do1, Chin-Teng Lin*1
1GrapheneX-UTS Human-centric AI Centre, Australian AI Institute, School of Computer Science, Faculty of Engineering and Information Technology, University of Technology Sydney
*Corresponding author: Chin-Teng Lin. Email: chin-teng.lin@uts.edu.au
Abstract—Steady-State Visual Evoked Potentials (SSVEP) have proven to be practical in Brain-Computer Interfaces (BCI), particularly when integrated with augmented reality (AR) for real-world applications. However, unlike conventional computer screen-based SSVEP (CS-SSVEP), which benefits from stable experimental environments, AR-based SSVEP (AR-SSVEP) systems are susceptible to interference from the real-world environment and device instability. In particular, the performance of AR-SSVEP declines significantly as the number of target frequencies increases. Therefore, our study introduces a 3D paradigm that combines flicker frequency with rotation patterns as stimuli, enabling expansion of the target set without additional frequencies. In the proposed design, in addition to the conventional frequency-based SSVEP feature, a biomarker elicited by the visual perception of rotation was investigated. An experimental comparison between this novel 3D paradigm and a traditional 2D approach, which increases targets by adding frequencies, reveals significant advantages. The 12-class 3D paradigm achieved an accuracy of 76.5% and an information transfer rate (ITR) of 70.42 bits/min using 1-second EEG segments. In contrast, the 2D paradigm exhibited lower performance, with 72.07% accuracy and 62.28 bits/min ITR. These results underscore the 3D paradigm's superiority in enhancing the practical application of SSVEP-based BCIs in AR settings, especially with shorter time windows, by effectively expanding target recognition without compromising system efficiency.
Index Terms—Brain-computer interface, augmented reality, SSVEP, target expansion, multiple biomarkers
I. INTRODUCTION
BRAIN-COMPUTER interface (BCI) represents a direct communication pathway that captures human intentions without the involvement of the peripheral nervous system or muscle tissue [1]. Among the various types of BCI signals, EEG is particularly favored for its portability and user-friendliness. Several EEG-based BCI paradigms have been proposed, including steady-state visually evoked potentials (SSVEP) [2], P300 [3] and motor imagery (MI) [4]. SSVEP is a neural response to visual stimulation at specific frequencies [5]. When people observe a stimulus flickering at a constant frequency, EEG signals at that frequency and its harmonics are synchronously elicited in the visual cortex. This synchronization allows users to effectively express their intentions by focusing on the target stimulus. Due to its robust neural mechanisms and high accuracy, SSVEP-based BCI has been widely used in a variety of applications, such as keyboard operation [6] and navigation [7].
The SSVEP paradigm requires the presentation of visual stimuli at specific frequencies. Thus, the signal quality is significantly influenced by the device used to display these stimuli. Currently, most SSVEP stimuli are presented using devices such as liquid-crystal displays (LCD) [8], light-emitting diodes (LED) [9] or computer screens (CS-SSVEP) [10]. These devices have achieved considerable success in BCI applications due to their stable refresh rates and reliable power supply. For instance, Chen et al. developed a 45-target SSVEP speller using an LED screen, achieving an accuracy of 84.1% and an information transfer rate (ITR) of 105 bits per minute with a stimulation duration of 2 seconds [11]. Wang et al. employed an LCD to develop a 40-command SSVEP paradigm for a BCI-based speller [12]. However, using CS-SSVEP requires users to frequently shift their focus between the monitor and the real-world visual environment, which can be highly inconvenient [13]. Therefore, CS-SSVEP exhibits certain limitations for practical applications.
Augmented reality-based SSVEP (AR-SSVEP) presents a promising alternative, enabling users to view stimuli within a real-world scenario through an AR device [14, 15]. This approach allows for more natural interaction with the real world via SSVEP, which could be the focal point for future SSVEP research. However, it is hindered by lower accuracy and a limited number of target frequencies [16, 17]. Notably, the accuracy of AR-SSVEP declines as the number of target frequencies increases. As the number of frequencies grows, the resulting smaller frequency intervals are a major factor contributing to the decline in AR-SSVEP accuracy [17, 18]. On the other hand, the refresh rates of AR devices are not as stable as those of LED or computer screens [19, 20]. Such instability can prevent the target stimuli from achieving the expected frequencies, leading to misrecognition [21, 22]. The likelihood of incorrect frequency labeling increases when the frequency intervals are narrower. Therefore, increasing the number of targets by expanding the frequency range is not advisable in an AR setting. It is essential to address the limitations in the number of available targets while maintaining accuracy for AR-SSVEP.
Various methods have been proposed to increase the number of targets in SSVEP from different perspectives. Hwang et al. proposed a modified dual-frequency chessboard stimulation that combines four different frequencies to expand the number of targets to ten [23], achieving an accuracy of 87.23% with ten participants. However, the classification time windows ranged from 4 to 6 seconds, making this dual-frequency method inconvenient for practical applications due to the long stimulus duration required for accurate classification. Chen et al. introduced a spectrally-dense joint frequency-phase modulation (sJFPM) method, increasing the target number to 120 [24]. The average accuracy of sJFPM was 92.47% with a frequency interval of 0.1 Hz. Nonetheless, the prerequisite for achieving good results with sJFPM is ensuring that the stimulus can be presented steadily at the specified frequency, which is a challenging requirement in the AR environment. Some studies have utilized high-frequency stimuli to increase the target number, achieving relatively good results [25]. However, high-frequency SSVEP also requires a stable device refresh rate. On the other hand, it is feasible to expand SSVEP targets by introducing additional biomarkers. Currently, the mainstream approach includes the use of event-related potential (ERP) components for auxiliary discrimination [26, 27]. Han et al. proposed a hybrid BCI that integrates SSVEP, P300, and motion visual evoked potential (mVEP), expanding the original 36 SSVEP targets to 216 [28]. However, ERP components are not stable enough without superimposed averaging. Kim et al. employed AR-P300 for drone control [29], achieving satisfactory accuracy by averaging 20 epochs, which is time-consuming. Moreover, unlike in a laboratory environment, ERP is susceptible to being elicited by unpredictable stimuli in real-world scenarios [30], which can also impact discrimination. Therefore, for AR applications, it is necessary to introduce a more stable biomarker. Overall, previous methods for target expansion are either time-consuming or impose high requirements on equipment, rendering them unsuitable for real-world AR environments. In addition, most research has primarily focused on increasing CS-SSVEP targets. An effective strategy for expanding AR-SSVEP targets remains to be developed.
Meanwhile, most SSVEP research primarily focuses on two-dimensional paradigms, such as plane flickers or chessboards, although three-dimensional SSVEP paradigms have gradually attracted more research attention in recent years [21, 31, 32]. In particular, 3D stimuli possess additional properties such as direction, shape, and motion, unlike 2D stimuli, which differ only in frequency. These multiple properties of 3D stimuli may elicit more distinct biomarkers under different conditions. Within the visual-evoked potential (VEP) research community, various vision-related EEG mechanisms such as covert attention [33], orientation perception of ambiguous 3D objects [34], and motion perception [35] have been clarified. Compared to simple frequency features, the number of targets and the performance of AR-SSVEP might be enhanced by incorporating other visually evoked biomarkers. For instance, Kelly et al. integrated spatial attention information into SSVEP and achieved significant improvements by combining SSVEP features with attention-related parieto-occipital alpha band modulations [36]. However, most SSVEP paradigms still predominantly expand targets by adding more frequencies. There are few studies on SSVEP that integrate it with other potential visual biomarkers induced by 3D objects.
Therefore, leveraging the advantages of 3D objects, a novel paradigm is proposed in this study. This paradigm incorporates 12 rotating cube flickers (6 frequencies and 2 rotation orientations), which may elicit both frequency-related and rotation-related biomarkers. For comparison with the conventional method, a baseline experiment using a 2D paradigm consisting of 12 target frequencies was conducted. Our aim is to verify the feasibility of the 3D paradigm in effectively enlarging the target number of AR-SSVEP. Compared to 2D paradigms, the 3D paradigm aims to achieve more accurate discrimination by integrating rotational and frequency information. In this study, 12 subjects were recruited for the experiment. Three methods, including filter bank canonical correlation analysis (FBCCA) [37], extended canonical correlation analysis (ECCA) [38], and task-related component analysis (TRCA) [39], were employed for SSVEP classification. Specifically, a common spatial pattern (CSP) based squeeze-and-excitation convolutional neural network (CSP-SECNN) [40] was innovatively applied to classify rotation-related visual EEG in this study. Finally, classification accuracy and ITR [41] were compared between the 2D and 3D paradigms.
The contributions of this study are outlined as follows:
1) Development of a novel 3D hybrid paradigm: We proposed a novel 3D paradigm that merges SSVEP and EEG elicited by the visual perception of different rotation patterns to increase the number of targets effectively.
2) Building an effective model for the classification of rotation-evoked EEG: We innovatively leveraged CSP-SECNN for the classification of rotation-evoked EEG and achieved good results.
3) Demonstration of enhanced performance in AR environments: Experimental results with 12 participants validate the better performance of the 3D paradigm over the 2D paradigm in AR settings.
The rest of this paper is organized as follows: Section II introduces the experiment details. Section III describes the methodology employed for the classification of the 2D and 3D paradigms. Section IV presents the classification results. Section V describes the online experiment. Section VI offers a discussion from different perspectives. Finally, Section VII concludes this study and outlines future work.
II. EXPERIMENT
A. Stimulation Design
1) 3D paradigm: For 3D stimulation (see Figure 2(B)), 12 rotating cube flickers were selected as targets to elicit the corresponding frequencies in EEG as well as patterns related to the visual perception of rotation. By combining six frequencies (7 Hz, 8 Hz, 9 Hz, 11 Hz, 12 Hz and 13 Hz) with two rotation patterns (left and right rotation), twelve target stimuli were generated with an approximate frequency interval of 1 Hz. Ideally, the inclusion of a 10 Hz frequency would ensure a uniform 1 Hz interval. However, the occipital alpha rhythm at 10 Hz can respond to other stimulation frequencies [42]. Specifically, in response to external visual stimulation, EEG at 10 Hz can be elicited simultaneously with other target frequencies, potentially leading to misrecognition of SSVEP. Therefore, the 10 Hz stimulus was excluded from this experiment.
The distance between the AR glasses and the stimulation was set at 3000 mm, and the distance between adjacent flickers was 150 pixels horizontally and 135 pixels vertically. Each cube was sized at 472 × 472 × 472 pixels, corresponding to a visual angle of 2.1°. The selection of these parameters is based on the experiment detailed in [14].
Fig. 1. Illustration of the experiment procedure.
Fig. 2. Stimulation in the HoloLens-based AR environment: (A) Stimulation of the 2D paradigm. (B) Stimulation of the 3D paradigm.
Sinusoidal coding was employed for the modulation of flicker frequency: the chroma value of each flicker was modulated by a sine wave at the target frequency, so the color of the cube flickers alternated between black and white. The rotation speed of each flicker was set to 30 degrees per second, as participants reported that this speed provided the most comfortable experience during pre-tests.
2) 2D paradigm: Twelve static flashing planes with different frequencies were selected as the 2D stimulation (see Figure 2(A)), serving as a baseline paradigm for comparison with the 3D paradigm. The size of the 2D stimuli was 472 × 472 pixels, the same as the 3D stimuli, and the layout parameters of the 2D paradigm matched those of the 3D one as well. As a baseline, the 2D paradigm expands the targets by increasing the number of frequencies. Accordingly, twelve frequencies (7 Hz, 7.5 Hz, 8 Hz, 8.5 Hz, 9 Hz, 9.5 Hz, 11 Hz, 11.5 Hz, 12 Hz, 12.5 Hz, 13 Hz and 13.5 Hz) were selected for the 2D flickers, yielding a smaller frequency interval.
3) Paradigm Implementation: Both the 2D and 3D paradigms were implemented with Unity 3D (Unity Technologies, Inc.) and presented with a HoloLens 2 headset (Microsoft Corporation). Initially, the paradigm projects were set up in Unity for the Universal Windows Platform (UWP). The Mixed Reality Toolkit (MRTK) was used for HoloLens features, and the paradigms were then built and deployed to the HoloLens device through Visual Studio. The refresh rate of the HoloLens 2 is 60 Hz, which is high enough to support the stimulus display.
B. Experiment Procedure
The experimental procedure is depicted in Figure 1. Here, an illustration of the 2D experiment is given to show the procedural details. The procedures for the 2D and 3D experiments were identical; the only difference was the stimulation type. At the beginning of each round, a preparatory interface displaying twelve cubes and a 'Ready' virtual button was presented to the subjects. During this period, subjects were allowed to adjust their status and positions. This interface remained visible until the subject pressed the 'Ready' button. After preparation, subjects could start the experiment by pressing the 'Ready' button contactlessly with a finger gesture (dragging the AR cursor to the 'Ready' button with the thumb and index finger, then releasing the two fingers). Subsequently, one of the twelve cubes turned red to indicate the target. The cue period lasted 1000 ms, providing sufficient time for subjects to locate the target flicker and adjust their head posture. Following this, all twelve flickers flashed for 5000 ms, and subjects were expected to fixate on the target without moving their eyes. This was followed by a 3000 ms rest period. Each round comprised 12 trials, with each flicker serving as the target once, and ended once all flickers had been targeted. The sequence of targets was random within a round. Subjects could start the next round whenever they were ready.
To minimize the influence of participant fatigue and battery-level fluctuations on the two experiments, the 3D and 2D paradigms were conducted alternately, switching every six rounds. Each participant completed 36 rounds in total, comprising 18 rounds each of the 2D and 3D paradigms. Consequently, 216 samples were collected for each paradigm.
C. Participants and Data Acquisition
Twelve subjects, including 7 males and 5 females, participated in this experiment. They ranged in age from 22 to 32. Among them, three subjects had prior experience with SSVEP experiments, while the remainder participated for the first time. All subjects were mentally healthy and possessed either normal or corrected-to-normal vision. Prior to the experiment, written ethical consent, approved by the University of Technology Sydney's Ethics Committee (Grant number: UTS HREC REF No. ETH20-5371), was obtained from each participant. For data acquisition, EEG signals were collected using a 64-channel Neuroscan electroencephalography cap, which followed the 10-20 electrode placement standard [43]. A Synamps2 amplifier (Compumedics Neuroscan, Charlotte, NC, USA) was used to amplify the signal. The sampling rate was set at 1000 Hz. At the onset of each trial, a marker corresponding to the target was transmitted to CURRY 8 (the data acquisition software of Neuroscan) via UDP and recorded synchronously with the EEG data.
D. Data Preprocessing
Initially, the left and right mastoid electrodes (M1 and M2) were designated as the reference channels to re-reference the data from the other 62 channels. To eliminate artifacts such as electrooculography (EOG) and electromyography (EMG), independent component analysis (ICA) [44] was employed to perform blind source separation with EEGLAB [45]. The EEG data were thereby decomposed into several independent components, and significant artifact components were identified and removed with the ADJUST toolbox [46]. An infinite impulse response (IIR) band-pass filter ranging from 0.5 Hz to 90 Hz was employed for data filtering, and a 50 Hz notch filter was applied to eliminate power line interference. With reference to the event markers, EEG samples were extracted by slicing the EEG with a 5000 ms time window. Finally, 216 pre-processed samples, each with the dimension of 62 × 5000, were extracted for each paradigm.
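As a concrete sketch of the filtering and epoching steps (the ICA cleaning was performed in EEGLAB and is omitted here), the pipeline could look as follows; the filter order and notch quality factor are assumptions not stated in the paper.

import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 1000  # sampling rate in Hz

def filter_eeg(eeg, fs=FS):
    # eeg: (n_channels, n_samples), already re-referenced and ICA-cleaned.
    b, a = butter(4, [0.5, 90.0], btype="bandpass", fs=fs)  # IIR band-pass
    eeg = filtfilt(b, a, eeg, axis=-1)
    bn, an = iirnotch(50.0, Q=35.0, fs=fs)  # 50 Hz power-line notch
    return filtfilt(bn, an, eeg, axis=-1)

def epoch_eeg(eeg, onsets, win=5000):
    # Slice a 5000 ms window starting at each event-marker sample index.
    return np.stack([eeg[:, s:s + win] for s in onsets])  # (n_trials, 62, 5000)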
III. METHODS
In this study, an experiment was conducted to compare the
conventional 2D paradigm with the proposed 3D paradigm.
For the 2D SSVEP, three classification methods, including
ECCA, TRCA and FBCCA, were employed. Regarding the
3D SSVEP, it is also necessary to develop a classification
model for EEG signals related to the visual perception of
different rotation orientations. By integrating the results from a
6-class SSVEP classifier and a 2-class rotation classifier, twelve
targets were established, as shown in Supplementary Figure 1.
The classification methods for the 2D and 3D paradigms are
described as follows:
A. SSVEP Classifiers
1) FBCCA: FBCCA is an unsupervised method for SSVEP classification that utilizes harmonic information and outperforms the original CCA. In FBCCA, EEG data are filtered into multiple frequency bands using a band-pass filter, and CCA is then used to calculate the correlation coefficients between the sub-band data and the reference signals. These coefficients are subsequently weighted and summed, and the target frequency is identified as the one with the largest weighted correlation coefficient. According to [47], the FBCCA approach primarily involves three hyperparameters: the number of sub-bands N, and a and b for calculating the weighting coefficients. In this study, the parameters for FBCCA were set as N = 5, a = 1.25 and b = 0.25.
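A minimal FBCCA sketch under these settings is shown below; the sub-band weighting rule w(n) = n^(-a) + b follows [47], while the sub-band edges (here, the n-th band starting near the n-th multiple of the lowest stimulus frequency) are an assumption, as the paper does not list them.

import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.cross_decomposition import CCA

def cca_corr(X, Y):
    # Largest canonical correlation between X and Y (rows are samples).
    Xc, Yc = CCA(n_components=1).fit_transform(X, Y)
    return abs(np.corrcoef(Xc[:, 0], Yc[:, 0])[0, 1])

def sine_reference(freq, fs, n_samples, n_harmonics=5):
    # Sine/cosine reference signals at the stimulus frequency and harmonics.
    t = np.arange(n_samples) / fs
    return np.stack([f(2 * np.pi * (h + 1) * freq * t)
                     for h in range(n_harmonics)
                     for f in (np.sin, np.cos)], axis=1)

def fbcca_predict(eeg, freqs, fs=1000, N=5, a=1.25, b=0.25):
    # eeg: (n_channels, n_samples); returns the index of the detected frequency.
    w = np.arange(1, N + 1) ** (-a) + b  # sub-band weights from [47]
    scores = []
    for f in freqs:
        Y = sine_reference(f, fs, eeg.shape[1])
        rho = 0.0
        for n in range(1, N + 1):
            lo = 7.0 * n  # assumed sub-band lower edges
            bb, ab = butter(4, [lo, 90.0], btype="bandpass", fs=fs)
            rho += w[n - 1] * cca_corr(filtfilt(bb, ab, eeg, axis=-1).T, Y) ** 2
        scores.append(rho)
    return int(np.argmax(scores))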
2) ECCA: ECCA incorporates subject-specific reference signals into CCA and has shown strong performance in SSVEP classification. In ECCA, in addition to utilizing standard sinusoidal and cosinusoidal signals as references, templates obtained by averaging each subject's SSVEP trials at each frequency are also employed as individual-specific references, thereby enhancing the recognition of individual SSVEP responses. The details of ECCA can be found in [38]. Furthermore, in calculating ECCA, the filter-bank concept was also adopted, which involves calculating correlation coefficients for SSVEP signals across multiple frequency bands and weighting them to obtain the final coefficient. The parameters of the filter bank are the same as those of FBCCA.
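As a simplified, hedged illustration of the template idea (reusing cca_corr and sine_reference from the FBCCA sketch), a per-frequency score can fuse the reference correlation with a template correlation; the full ECCA of [38] adds further projection-based coefficients and the same filter-bank weighting.

import numpy as np

def ecca_like_score(X, Y, T):
    # X: test trial (n_samples, n_channels); Y: sinusoidal reference;
    # T: subject-specific template for this frequency (average of that
    # subject's training trials), shape (n_samples, n_channels).
    r1 = cca_corr(X, Y)  # trial vs. sinusoidal reference (plain CCA term)
    r2 = cca_corr(X, T)  # trial vs. individual template (ECCA's extra term)
    return r1 ** 2 + r2 ** 2  # combined score

def ecca_like_predict(X, refs, templates):
    # refs/templates: per-frequency lists aligned with the stimulus set.
    return int(np.argmax([ecca_like_score(X, Y, T)
                          for Y, T in zip(refs, templates)]))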
3) TRCA: TRCA extracts task-related components by maximizing their reproducibility during the task period. In TRCA, a frequency-specific template is constructed by averaging the EEG data associated with each frequency, and spatial filters are subsequently derived through the TRCA method. The result is determined by calculating the correlation coefficients between the projected EEG data and the multiple reference templates. The details of the TRCA calculation can be found in [39]. TRCA likewise incorporates a filter-bank mechanism to process EEG data across multiple frequency bands, with its parameters consistent with those used in FBCCA.
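A sketch of the core TRCA computation is given below, following the usual formulation of [39]: the spatial filter maximizes the summed cross-trial covariance relative to the covariance of the concatenated data.

import numpy as np
from scipy.linalg import eigh

def trca_filter(trials):
    # trials: (n_trials, n_channels, n_samples) for one stimulus frequency.
    X = trials - trials.mean(axis=-1, keepdims=True)  # zero-mean per channel
    n_trials, n_ch, _ = X.shape
    concat = X.transpose(1, 0, 2).reshape(n_ch, -1)
    Q = concat @ concat.T  # covariance of the concatenated data
    S = np.zeros((n_ch, n_ch))
    for j in range(n_trials):  # summed cross-trial covariances
        for k in range(n_trials):
            if j != k:
                S += X[j] @ X[k].T
    _, vecs = eigh(S, Q)  # generalized eigenproblem S w = lambda Q w
    return vecs[:, -1]  # leading eigenvector is the spatial filter

def trca_score(test_trial, template, w):
    # Correlation between the filtered test trial and the class template.
    return np.corrcoef(w @ test_trial, w @ template)[0, 1]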
B. Classifier for 3D paradigm
Classification for the 3D paradigm is divided into two parts (see Supplementary Figure 1). EEG data are input simultaneously into the SSVEP classifier and the classifier for rotation-related EEG, and the intersection (∩) of the results from the two classifiers is used as the final output.
For the SSVEP classifier, the three methods mentioned previously (ECCA, TRCA and FBCCA) are also employed for the 3D paradigm.
For rotation classification, the CSP algorithm [48] was leveraged for feature extraction from EEG related to the different rotations. The details of the feature extraction and discrimination processes for rotation are depicted in Figure 3.
CSP feature extraction is executed in two steps. First, a set of spatial filters is derived. During this phase, covariance matrices for the different classes of EEG data are computed. These covariance matrices are then jointly diagonalized and whitened to obtain a mapping matrix, referred to as a CSP filter. The computation details of the CSP filter in this study are based on [48]. In the second step, EEG data are projected onto a common space with the CSP filters, which enhances the variance difference of the projected data. Thus, the variance of the projected data is selected as the feature for classification. The CSP feature f_n is calculated as Equation (1):
Fig. 3. Structure of CSP-SECNN model leveraged for rotation classification.
f_n = \log(1 + \mathrm{var}(W E_n)) \qquad (1)
where E_n denotes the EEG signal of the n-th class and W is the CSP filter. Based on previous work [49], the CSP features derived from the first and last few rows of the CSP filters are the most distinctive. Typically, the number of selected CSP filters ranges from 2 to 6. In this study, the first three and the last three rows of the CSP filters were selected, as these six filters demonstrated optimal performance during testing.
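The two CSP steps can be sketched via the standard generalized eigendecomposition; the trace normalization of the covariance matrices is a common convention assumed here.

import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_pairs=3):
    # trials_*: (n_trials, n_channels, n_samples) for the two rotation classes.
    def mean_cov(trials):
        return np.mean([x @ x.T / np.trace(x @ x.T) for x in trials], axis=0)

    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    # Joint diagonalization/whitening via Ca w = lambda (Ca + Cb) w.
    _, vecs = eigh(Ca, Ca + Cb)
    # Keep the first and last three eigenvectors, as selected in this study.
    return np.concatenate([vecs[:, :n_pairs], vecs[:, -n_pairs:]], axis=1).T

def csp_feature(W, trial):
    # Equation (1): log(1 + variance of the spatially filtered signal).
    return np.log(1.0 + (W @ trial).var(axis=-1))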
Next, the filter-bank idea was applied to feature extraction. Compared with CSP, Filter-bank Common Spatial Pattern (FBCSP) extracts features from multiple sub-bands, which leverages more useful information and can enhance classification performance [50]. In the FBCSP module, the original EEG data are initially filtered through N IIR bandpass filters, generating N sub-band signals. Then N CSP filters are trained with these sub-band data to extract multi-band features.
Following feature extraction, SECNN is applied for discrimination. The N features are stacked vertically, so the input of the CNN model is an N × L_csp feature map, where L_csp is the length of one CSP feature. Specifically, Layer 1 consists of two kernels with N_1 = 1 and L_1 = 3, while Layer 2 has four kernels with N_2 = 2 and L_2 = 3. The SE-Net module is then used to adaptively assign a 1 × N weight w to the feature of each band, enhancing the useful features and suppressing the redundant ones. The details of SE-Net can be found in [51]. Finally, the weighted features are combined with a fully-connected layer, and a softmax layer is applied for the binary classification of rotation. In this study, Floating Point Operations (FLOPs) [52] were used to evaluate the computational complexity, calculated with the pytorch-flops-counter tool (https://github.com/sovrasov/flops-counter.pytorch). The FLOPs of CSP-SECNN with 1 second of data as input is 0.665 MMac.
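A PyTorch sketch consistent with these layer sizes is given below; the placement of the SE re-weighting before the convolutions and the padding choices are assumptions about the architecture of [40], not a faithful reproduction.

import torch
import torch.nn as nn

class SEBand(nn.Module):
    # Squeeze-and-excitation over the N sub-band rows of the feature map.
    def __init__(self, n_bands, reduction=2):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(n_bands, n_bands // reduction), nn.ReLU(),
            nn.Linear(n_bands // reduction, n_bands), nn.Sigmoid())

    def forward(self, x):  # x: (batch, 1, n_bands, l_csp)
        s = x.mean(dim=(1, 3))  # squeeze: one statistic per band
        w = self.fc(s)  # excitation: the 1 x N band weights
        return x * w[:, None, :, None]  # re-weight each band's features

class CSPSECNN(nn.Module):
    def __init__(self, n_bands=6, l_csp=6, n_classes=2):
        super().__init__()
        self.se = SEBand(n_bands)
        self.conv = nn.Sequential(
            nn.Conv2d(1, 2, kernel_size=(1, 3), padding=(0, 1)), nn.ReLU(),
            nn.Conv2d(2, 4, kernel_size=(2, 3), padding=(0, 1)), nn.ReLU())
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(4 * (n_bands - 1) * l_csp, n_classes))

    def forward(self, x):  # x: stacked FBCSP features, (batch, 1, n_bands, l_csp)
        return self.head(self.conv(self.se(x)))  # logits; softmax in the loss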
C. Performance Evaluation and Details
The purpose of this evaluation is to compare the performance of the proposed 3D paradigm with the conventional 2D paradigm (Figure 4). In the 3D paradigm, there are two parallel processing streams for EEG decoding: data from the 3D paradigm are input into both a binary rotation classifier and a six-class SSVEP classifier. The rotation classifier outputs the target orientation (left/right), and the SSVEP classifier outputs the target frequency. Taking the intersection of the results from the two classifiers as the final outcome, 12 targets can be generated, as sketched below. For the 2D paradigm, data are input into an SSVEP classifier, which outputs the classification result for the 12 target frequencies.
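The fusion step itself is a simple intersection of the two label spaces; the target indexing below is an assumed convention, since the paper only states that the combination of frequency and orientation identifies the target.

def fuse_3d_prediction(freq_idx, rot_idx, n_freqs=6):
    # freq_idx: 0-5 from the 6-class SSVEP classifier;
    # rot_idx: 0 (left) or 1 (right) from the rotation classifier.
    return rot_idx * n_freqs + freq_idx  # one of 12 targets

# Example: 9 Hz (frequency index 2) with right rotation (1) -> target 8.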
According to [12], channels Pz, POz, PO3, PO4, PO5, PO6, O1, Oz and O2 were selected as the input channels for the SSVEP classifier. As few studies have clearly defined which brain regions and frequency bands are involved in the visual perception of rotation, several combinations of cortical areas (frontal, motor, parietal, occipital) and frequency bands (delta, theta, alpha, beta, and low gamma) were tested on the rotation classifier. The results indicated that the optimal performance was achieved using the low gamma band of the occipital and posterior parietal lobes. Consequently, EEG from 19 occipital-parietal channels, including P1-P8, Pz, PO3-PO8, POz, O1, Oz, and O2, was selected as the input for the rotation classifier. For feature extraction in the rotation classifier, six sub-bands ranging from 30 Hz to 45 Hz, segmented with an interval of 2.5 Hz, were utilized.
Fig. 4. Illustration of the comparison between the 3D and 2D paradigms.
Except for FBCCA, the remaining methods are supervised learning models for which training and test datasets are necessary. Five-fold cross-validation was implemented for evaluation, where 80% of the data served as the training set and the remaining 20% as the test set. The length of the EEG segments used for testing was increased from 0.5 s to 2.0 s in increments of 0.25 s.
In addition to accuracy, the ITR was also calculated for evaluation:
\mathrm{ITR} = \frac{60}{T_g + T_s + T_w}\left[\log_2 N + P\log_2 P + (1-P)\log_2\frac{1-P}{N-1}\right] \qquad (2)
where T_g is the gaze shifting time, T_s denotes the response latency, T_w denotes the duration of the EEG input, N is the total number of targets and P is the accuracy. Here, T_g was 0.5 s and T_s was 0.135 s [37].
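Equation (2) translates directly into a small helper; the example below plugs in the 1 s, 12-target operating point (the paper's 70.42 bits/min is the mean of per-subject ITRs, so inserting the mean accuracy gives a slightly different value).

import math

def itr_bits_per_min(P, N=12, Tw=1.0, Tg=0.5, Ts=0.135):
    # P: accuracy, N: number of targets, Tw: EEG window (s),
    # Tg: gaze-shifting time (s), Ts: response latency (s).
    bits = math.log2(N)
    if 0.0 < P < 1.0:
        bits += P * math.log2(P) + (1 - P) * math.log2((1 - P) / (N - 1))
    return 60.0 / (Tg + Ts + Tw) * bits

print(itr_bits_per_min(0.765))  # ~72.9 bits/min at 76.5% accuracy, 1 s window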
IV. RESULT
The average accuracies and ITRs of the 2D and 3D paradigms are presented in Figure 5 and Figure 6, respectively. With ECCA, it is evident that the 3D paradigm outperforms the 2D paradigm (p < 0.05) when the EEG segment length is shorter than 1.5 s. With a 1 s EEG segment, the 3D paradigm achieved an average accuracy of 76.50% and an ITR of 70.42 bits/min, while the accuracy and ITR of the 2D paradigm were 72.07% and 62.28 bits/min, respectively (p_acc = 2.91 × 10⁻⁴, p_ITR = 8.13 × 10⁻⁶, α = 0.05). With TRCA, the average accuracy and ITR of the 3D paradigm were higher than those of the 2D paradigm, although the difference was not statistically significant. The accuracies of both paradigms were below 70% when the time length was 1.0 s; nevertheless, the 3D paradigm still outperformed the 2D one (p_acc = 0.0287, α = 0.05). Regarding FBCCA, both the accuracy and ITR of the 3D paradigm were significantly higher than those of the 2D paradigm only when the EEG segment was shorter than 1 s (p_acc,0.5s = 0.0175, p_acc,0.75s = 0.0236, p_ITR,0.5s = 0.0023, p_ITR,0.75s = 0.0312, α = 0.05).
Comparing the three SSVEP classification algorithms, ECCA had the best performance for both paradigms, achieving an average accuracy of 81.63% at 1.5 s. When the time length was relatively short (less than 1 s), ECCA and TRCA outperformed FBCCA (p < 0.01, α = 0.05). When the time window was long enough (t > 1.25 s), ECCA significantly outperformed both TRCA and FBCCA (p < 0.001, α = 0.05), while the performance difference between the latter two methods was negligible (p > 0.05, α = 0.05). Compared to FBCCA, ECCA introduces individual SSVEP templates as supplementary reference signals. This enhances the similarity between individual SSVEP signals and the reference signals, improving the discriminability between target and non-target frequencies [53], thereby achieving better classification results. TRCA relies on phase differences and performs well only when there is a significant phase difference [54]. In this study, the phase of all stimuli was consistent, which may explain the poorer performance of TRCA. Overall, the mean accuracy of the best method (ECCA) illustrates that the 3D paradigm achieved better performance on shorter time segments (t < 1.25 s) than the 2D paradigm, and when the time window is sufficiently long (t > 1.25 s), there are only slight differences between the two approaches.
The accuracy of the two sub-classifiers in the 3D paradigm is shown in Figure 7, which presents the classification results of the SSVEP and rotation-induced EEG signals in the 3D paradigm, respectively. ECCA significantly outperformed TRCA and FBCCA for SSVEP classification (p < 0.01, α = 0.05). The mean accuracy of TRCA was higher than that of FBCCA when the time length was less than 1.25 s (p < 0.01, α = 0.05) and lower when the time length was longer (t > 1.5 s). For rotation classification, the average accuracy reached 82.38% with a 0.5 s time window and exceeded 90% when the time length was 2 s. Moreover, the results of the 6-class 3D SSVEP classifier (Figure 7 (A)) were consistent with those of the 12-class classification of the 3D paradigm (Figure 5), where ECCA performed best. Thus, the results of the sub-classifiers indicate that a good SSVEP classifier can effectively improve the performance of the 3D paradigm.
Moreover, the individual results of both the 3D and 2D paradigms are provided in Supplementary Figure 2. S3 exhibited the best performance, achieving an accuracy of 80.56% and an ITR of 95.65 bits/min using the 3D paradigm with a duration of only 0.75 seconds. The individual results show that not all subjects performed better in the 3D paradigm than in the 2D one. For S1 and S6, there was little difference in performance between the 2D and 3D paradigms, and both subjects achieved relatively good results with both. When the duration was 1 s, the accuracy of S1 in the 2D and 3D paradigms was 80.42% and 81.86% respectively, while the accuracy of S6 reached 78.64% and 80.13% respectively. S2, S4, S8, and S9 achieved better accuracy and ITR with the 3D paradigm when the time window was shorter than 1.25 s; with longer time windows, their performance in the 3D paradigm was worse than in the 2D paradigm. Taking S2 as an example, at a time length of 0.75 s the accuracy was 76.85% for the 3D paradigm but dropped to 70.31% for the 2D paradigm, whereas with a time window of 1.25 s the accuracy of the 3D and 2D paradigms was 85.19% and 89.58%, respectively. For S3, S5, and S7, when the time window was shorter than 1.25 s, their performance in the 3D paradigm was superior to that in the 2D paradigm; with longer time windows, their performance was relatively similar in both paradigms. S10, S11, and S12 consistently achieved better performance in the 3D paradigm regardless of the length of the EEG segment.
Fig. 5. Mean accuracy of the 2D and 3D paradigms. In each subplot, the horizontal axis represents the length of the EEG signal in seconds and the vertical axis indicates accuracy. The blue bars indicate the accuracy of the 12-class classification of the 3D paradigm. From left to right, the results of the 3D paradigm were calculated with (ECCA, CSP-SECNN), (TRCA, CSP-SECNN), and (FBCCA, CSP-SECNN). The orange bars indicate the accuracy of the 2D paradigm (12-class SSVEP). * indicates significant differences between the accuracy of the 3D and 2D paradigms, as assessed by paired t-test (*p < 0.05, ***p < 0.001).
Fig. 6. Mean ITR of the 2D and 3D paradigms. In each subplot, the horizontal axis represents the length of the EEG signal in seconds and the vertical axis indicates ITR in bits/min. The blue bars and orange bars indicate the ITR of the 3D and 2D paradigms, respectively. * indicates significant differences between the ITR of the 3D and 2D paradigms, as assessed by paired t-test (*p < 0.05, **p < 0.01, ***p < 0.001).
Fig. 7. Accuracy of the two sub-classifiers of the 3D paradigm. In each subplot, the horizontal axis represents the length of the EEG signal in seconds and the vertical axis indicates accuracy. (A) Mean accuracy of the three SSVEP classifiers (6-class SSVEP classification of the 3D paradigm). * indicates significant differences between the accuracy of two methods, as assessed by paired t-test (*p < 0.05, **p < 0.01). (B) Mean accuracy of the rotation classifier (CSP-SECNN).
Overall, except for S1 and S6, the remaining ten subjects achieved better performance in the 3D paradigm with short EEG segments. These findings are consistent with the group-level results shown in Figures 5 and 6, implying that the performance of AR-SSVEP can be effectively improved in shorter time windows with the 3D paradigm.
V. ONLINE EXPERIMENT
A. Experimental Details
To validate the feasibility of the proposed 3D paradigm and compare it with the original 2D paradigm, this study conducted online experiments using the HoloLens 2 headset for stimulus presentation and a 64-channel Neuroscan EEG system for signal acquisition. The four participants who performed best in the offline experiments took part in the online tests. Each participant's model was trained using data from their offline experiments. The sampling rate was set to 1000 Hz in the online experiments, and the channel selection was consistent with the offline experiments. Considering the cross-session instability of EEG signals [55] and the interference of spontaneous EEG activity [56], the presentation time of the stimuli in the online experiments was extended to 3 seconds to ensure stable elicitation of the corresponding EEG components.
Each experimental session included 12 trials, with each of the 12 stimuli randomly chosen as the target once, and participants were instructed to focus on the target stimuli during the task period. After each stimulus presentation, the EEG signals from the task phase were transmitted via TCP/IP [57] and preprocessed with an IIR filter ranging from 0.5 Hz to 90 Hz before being input into the classifiers. The results were recorded and compared with the labels to calculate the final accuracy. Each participant completed three rounds of online experiments, totaling 36 trials, with each stimulus tested three times.
B. Online Result
The online results are shown in Table I. For the 3D paradigm, three subjects achieved accuracies above 80%, with the highest reaching 86.11%. The remaining participant, whose accuracy was 77.78%, still far exceeded the random baseline of 16.67%. Compared to the 2D paradigm, all subjects except subject 1 achieved better performance in the 3D paradigm. In particular, subject 4's performance in the 2D paradigm was relatively poor. This result further indicates that as the number of target frequencies increases, some participants with lower sensitivity to flicker frequency may face challenges.
The results of the online experiments largely validate the feasibility and superior performance of the 3D paradigm. The discrepancy between the online and offline results may stem from overfitting during offline model training and variations in EEG data across sessions. Therefore, addressing cross-session transfer learning in the 3D paradigm is a significant direction for future improvement.
VI. DISCUSSION
In this section, we provide explanatory analyses. Firstly, we analyze the rotation-evoked EEG patterns, providing explanations for why EEG signals induced by different rotations
TABLE I
RESULTS OF THE ONLINE EXPERIMENT FOR 4 SUBJECTS
Subject ID        1      2      3      4      Mean
3D Accuracy (%)   86.11  83.33  83.33  77.78  82.63
2D Accuracy (%)   88.89  77.78  80.56  66.67  78.48
Fig. 8. Topological maps of CSP filters extracted from rotation classification. The left figure of each subject denotes the active area under left rotation; the right figure denotes the active area under right rotation. Dark blue and dark red denote higher negative and positive weights, respectively, and both indicate a high degree of activity.
are distinguishable. Secondly, we discuss two key factors that might affect the SSVEP classification: the introduction of rotation and the number of target frequencies. By exploring the impact of these factors, we explain why the 3D paradigm achieves superior results.
A. Pattern analysis of rotation-evoked EEG
1) Spatial filter analysis: Firstly, we investigated the brain regions associated with rotation-evoked EEG. As mentioned above, the CSP method was utilized for feature extraction in rotation classification. During this process, a set of spatial filters was derived, which can represent the activity levels of the electrodes involved in classification [58]. Notably, motor imagery studies have shown that CSP filters assign high weights (in absolute value) to task-dependent electrodes (e.g., C3 and C4 in the motor cortex) [58], and some researchers have selected electrodes of interest according to these weights [59]. Thus, by examining the distribution of the CSP filters, it is possible to identify the brain regions activated by the rotation task.
In detail, according to the CSP calculations [48], the number of CSP filters is equal to the number of electrodes. In this study, 62 electrodes were utilized, resulting in 62 CSP filters. Previous studies have indicated that the first and last CSP filters can effectively represent the brain's activity levels under two distinct tasks [49], respectively. Therefore, the first and last CSP filters derived from the rotation-evoked EEG (filtered between 30-45 Hz) are shown in Figure 8, illustrating the active brain areas during left and right rotations, respectively.
It is evident that the active electrodes are primarily located in the occipital and posterior parietal lobes, aligning with the regions typically associated with visual-evoked potentials.
However, no significant differences were observed between the filter weights for left and right rotations (p > 0.05, α = 0.05). Only minor contralateral differences were noted in the data of individual subjects. For example, S11 and S12 exhibited high weights in the left posterior area during left rotation and in the right posterior area during right rotation. Conversely, for subjects such as S1 and S4, nearly all electrodes in the posterior brain regions showed high weights. Although no universal distribution of filter weights was established, these findings suggest that the occipital and posterior parietal lobes are relevant to rotation classification, offering guidance for electrode selection in practical applications of this paradigm.
2) Power spectral density (PSD) analysis: Considering that the visual perception of different rotations may elicit EEG synchronization or desynchronization, as in motor imagery, PSD analysis was conducted to elucidate the power difference between the two rotation conditions. In this study, the PSD difference for each electrode under left and right rotations was calculated, and the significantly different values (PSD_left − PSD_right) were projected onto a topological map. Specifically, a data length of 2 seconds was used for the PSD calculation. The details of the calculation are shown in Algorithm 1, where Bandpower is the function used to calculate the PSD of a specific frequency band and ttest is the paired t-test that assesses the significance of the differences between the PSDs of EEG elicited by left and right rotation. The False Discovery Rate (FDR) was used to correct the results of the t-test and prevent false positive results [60].
Algorithm 1 Pseudocode for calculating the PSD difference between EEG elicited by left rotation (LR) and right rotation (RR) and outputting significant results.
Require: EEG_LR, EEG_RR
1: PSD_LR ← Bandpower(EEG_LR), PSD_RR ← Bandpower(EEG_RR)
2: p, t_value_LR ← ttest(PSD_LR, PSD_RR)
3: p_adj ← FDR(p)
4: t_value_LR ← 0 if p_adj > 0.05
5: return t_value_LR
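Algorithm 1 can be realized with standard tools as below; Welch's method for Bandpower and the Benjamini-Hochberg procedure for the FDR step are assumptions, as the paper does not name specific estimators.

import numpy as np
from scipy.signal import welch
from scipy.stats import ttest_rel
from statsmodels.stats.multitest import multipletests

def psd_difference_map(eeg_lr, eeg_rr, fs=1000, band=(30.0, 45.0)):
    # eeg_*: (n_trials, n_channels, n_samples), 2 s segments as in the text.
    def bandpower(trials):
        f, pxx = welch(trials, fs=fs, nperseg=fs, axis=-1)
        sel = (f >= band[0]) & (f <= band[1])
        return pxx[..., sel].mean(axis=-1)  # (n_trials, n_channels)

    t, p = ttest_rel(bandpower(eeg_lr), bandpower(eeg_rr), axis=0)
    reject, _, _, _ = multipletests(p, alpha=0.05, method="fdr_bh")
    t = np.asarray(t)
    t[~reject] = 0.0  # zero out electrodes that do not survive FDR correction
    return t  # per-electrode t-values for the topological map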
The individual results indicate significant PSD differences in the occipital and parietal areas. For subjects S1, S4, S5, S6, S7 and S10, the PSD in the left posterior cortex was significantly higher during left rotation than during right rotation. Additionally, compared to left rotation, a significantly higher PSD in the right posterior regions was observed during right rotation in all subjects except S7 and S10. At the group level, significant PSD differences were found only in the left posterior brain region. Therefore, it can be concluded that when subjects viewed a cube rotating to the left or right, EEG synchronization likely occurred in the ipsilateral posterior cortex. However, the PSD results were not completely consistent across all subjects due to inevitable individual differences, which is acceptable. Moreover, the significant PSD differences were observed in the occipital and posterior parietal cortex, consistent with the activated regions detected by the CSP filters. This consistency provides an explanation for the effective classification of the different rotation orientations.
Fig. 9. Topological maps of the significant PSD differences between left and right rotation. The values of all colorbars denote the t-value of the comparison between left and right rotation, ranging from -6 to 6. The group-level result was calculated with the data of all subjects. Dark red and dark blue indicate high positive and high negative values, respectively.
Regarding this phenomenon, previous studies have suggested that the gamma response of the posterior brain may be associated with the visual cognition of moving objects [61, 62]. Müller et al. conducted an experiment in which subjects were required to view a coherent stimulus (a single bar moving to the right) and an incoherent stimulus (two bars moving in opposite directions) [63]. They reported that gamma power increased significantly with the coherent stimulus compared to the incoherent one. Interestingly, higher gamma power was observed at the right parieto-occipital scalp sites (P4, O2, and T6), which aligns with our PSD results. However, they did not conduct experiments with bars moving to the left. A similar finding was reported by [62], where subjects viewed a grating moving horizontally from left to right. Their results showed significantly higher gamma power in the right occipital-parietal lobes, consistent with our PSD analysis. In this study, the horizontal rotation might have a visual effect similar to the horizontally moving stimulation used in these studies and induce synchronization in unilateral brain areas. However, it is difficult to draw complete conclusions, as few studies have reported gamma EEG patterns in response to moving visual stimuli in various directions. Nonetheless, we did observe significant contralateral differences in gamma power between the two rotation stimuli in our experiment, yielding good classification performance. Some studies suggest that gamma synchronization is related to bottom-up attentional processing [64, 65]. In this experiment, watching a rotating flicker and generating the associated EEG is also a form of bottom-up cognition. Rotating motion may also induce a shift in the user's attention, potentially causing differences in gamma power across different posterior regions.
B. Analysis of the superior performance of the 3D paradigm
1) Effect of introducing rotation on SSVEP classification: An ablation evaluation was conducted to determine whether the introduction of rotation enhances or inhibits SSVEP classification compared to the conventional stationary 2D flicker, involving three classification tasks under three ablation conditions (see Supplementary Figure 3).
Fig. 10. Results of the ablation evaluation on ECCA. The horizontal axis represents the length of the EEG signal in seconds and the vertical axis indicates accuracy. * indicates significant differences between the accuracy of two cases, as assessed by paired t-test (*p < 0.05).
Sub-dataset1 contained SSVEP elicited solely under left rotation (LR), and sub-dataset2 only under right rotation (RR). Sub-dataset3 was separated from the 2D SSVEP dataset and shares the same target frequencies as the 3D paradigm. All sub-datasets contained the same number of samples and target frequencies. According to Figures 5 and 6, as ECCA had the best performance and demonstrated the most significant difference between the 2D and 3D paradigms, the ablation experiments used ECCA as the classification algorithm for SSVEP.
The results of the ablation evaluation are presented in Figure 10, which indicates that the accuracy of 3D-RR with ECCA is lower than that of 3D-LR and 2D when the time segment is short (t < 1.25 s). No significant difference was found between 2D and 3D-LR. Moreover, 2D SSVEP even performed significantly better than 3D-RR with time windows of 1 s and 1.25 s. Overall, compared with the conventional 2D paradigm, there was no significant improvement in the SSVEP classification of the 3D paradigm. Thus, introducing rotation to the flickers appears to have little effect on the discrimination of SSVEP, indicating that the superior performance of the 3D paradigm is not due to the involvement of rotation itself.
2) Effect of increasing the number of target frequencies: To assess the impact of increasing the number of target frequencies on SSVEP decoding, confusion matrices for the 12-class 2D paradigm classification with ECCA are displayed in Figure 11. For 9 of the 12 target frequencies, samples were most likely to be misclassified as adjacent frequencies. For instance, 31 samples labeled as 12 Hz were misclassified as 11.5 Hz. Here, the fast Fourier transform (FFT) was computed for the Oz electrode data of all misclassified samples labeled as 12 Hz, and the average spectrum is presented in Figure 12. It is evident that the frequency peak of the misclassified samples appears at 11.5 Hz rather than the expected 12 Hz. We suspect that this may be related to frame loss in the AR device, reducing the original 12 Hz stimulus to approximately 11.5 Hz. The misclassification of adjacent frequencies indicates that adding more target frequencies and reducing the interval between them may impair SSVEP classification. This finding partly explains why the accuracy of the 2D paradigm significantly decreases
Fig. 11. Confusion matrix of the 12-class 2D paradigm with a 0.8 s time length (summed over all 12 subjects). In the rightmost column of the figure, the green and red numbers respectively represent the precision and false discovery rate for each class. In the bottom row of the figure, the green and red numbers respectively represent the recall and false negative rate for each predicted class. The bold green and red numbers in the lower right corner of the figure represent the overall accuracy and error rate, respectively. The cell enclosed by a blue box denotes the most misidentified frequency for each label.
Fig. 12. The averaged FFT results of the error cases labeled as 12 Hz. The horizontal axis represents frequency in Hz and the vertical axis indicates the single-sided amplitude spectrum in volts.
when the number of frequencies is increased from 6 to 12. In the 3D paradigm, fewer target frequencies were employed, reducing the adverse effects on classification caused by smaller frequency intervals. Meanwhile, the 6-class SSVEP and 2-class rotation classifiers maintained relatively high accuracy even with short EEG segments (as shown in Figure 7, when t = 0.75 s, ACC_3D-SSVEP = 81.58% and ACC_3D-rotation = 85.75%). Consequently, by integrating these two sub-classifiers, the overall performance of the 12-class 3D paradigm surpassed that of the 2D paradigm.
The negative effects of increasing the frequency count can be explained by the unpredictability of device refresh rates [21, 22]. For instance, in the experiments conducted by Arpaia et al. [19], although the declared refresh rate of the AR device was 30 Hz, the actual refresh rate fluctuated up to 32 Hz. This incorrect refresh rate caused the target frequency of 15 Hz to shift to 16 Hz, impacting SSVEP accuracy. According to Figure 11, five target frequencies (7 Hz, 7.5 Hz, 11 Hz, 11.5 Hz and 13 Hz) were most frequently misclassified as their adjacent higher frequencies, which is consistent with [19]. Additionally, frame loss issues in the HoloLens were also observed in the AR-SSVEP experiments by Ke et al. [66]. Wang et al. further noted that HoloLens-based flickers are susceptible to refresh rate variability [67]. Moreover, their results demonstrated that AR-SSVEP was significantly less accurate than CS-SSVEP with shorter EEG segments (t < 1.5 s). The result of the 12-class 2D paradigm in this study is likewise poor when the EEG segments are short (t < 1.25 s), which is consistent with their findings.
VII. CONCLUSION AND FUTURE WORK
In this study, we proposed a 3D hybrid paradigm that combines visual rotation-evoked potentials and SSVEP to effectively expand the number of AR-SSVEP targets. When subjects viewed flickers rotating in different orientations, EEG components containing both frequency and rotation information were simultaneously induced, enabling the generation of more targets from their combination. We conducted a comparative experiment to assess the performance of this novel method against the conventional 2D paradigm, which increases targets by adding more frequencies. ECCA was effectively applied to SSVEP classification, and a CSP-SECNN method was leveraged for rotation classification. Based on these two sub-classifiers, the proposed 3D paradigm achieved strong classification results. Twelve subjects participated in the experiment within an AR environment, and ten of them achieved enhanced performance with the 3D paradigm, particularly in shorter time windows. The mean accuracy and ITR further demonstrated the superior performance of the proposed 3D paradigm compared to the 2D approach.
We doubled the number of targets by introducing a 2-class biomarker elicited by the visual perception of rotation. However, the current approach still has a limitation: it can only double the number of targets and cannot expand them further. In the future, we aim to incorporate additional biomarkers by investigating other properties of the stimuli, such as additional rotation directions or different rotational speeds, to expand the targets in AR-SSVEP-based BCI systems more effectively.
REFERENCES
[1] S. Samejima, A. Khorasani, V. Ranganathan, J. Nakahara, N. M. Tolley, A. Boissenin, V. Shalchyan, M. R. Daliri, J. R. Smith, and C. T. Moritz, "Brain-computer-spinal interface restores upper limb function after spinal cord injury," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 29, pp. 1233–1242, 2021.
[2] X. Gao, D. Xu, M. Cheng, and S. Gao, "A BCI-based environmental controller for the motion-disabled," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 11, no. 2, pp. 137–140, 2003.
[3] M. Salvaris and F. Sepulveda, "Visual modifications on the P300 speller BCI paradigm," Journal of Neural Engineering, vol. 6, no. 4, p. 046011, 2009.
[4] G. Townsend, B. Graimann, and G. Pfurtscheller, "Continuous EEG classification during motor imagery-simulation of an asynchronous BCI," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 12, no. 2, pp. 258–265, 2004.
[5] R. Kuś, A. Duszyk, P. Milanowski, M. Łabęcki, M. Bierzyńska, Z. Radzikowska, M. Michalska, J. Żygierewicz, P. Suffczyński, and P. J. Durka, "On the quantification of SSVEP frequency responses in human EEG in realistic BCI conditions," PLoS ONE, vol. 8, no. 10, p. e77536, 2013.
[6] E. Yin, T. Zeyl, R. Saab, T. Chau, D. Hu, and Z. Zhou, "A hybrid brain–computer interface based on the fusion of P300 and SSVEP scores," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 23, no. 4, pp. 693–701, 2015.
[7] C. Farmaki, M. Krana, M. Pediaditis, E. Spanakis, and V. Sakkalis, "Single-channel SSVEP-based BCI for robotic car navigation in real world conditions," in 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering (BIBE). IEEE, 2019, pp. 638–643.
[8] S. Park, H.-S. Cha, and C.-H. Im, "Development of an online home appliance control system using augmented reality and an SSVEP-based brain–computer interface," IEEE Access, vol. 7, pp. 163604–163614, 2019.
[9] C. Liu, M. Duan, Z. Duan, A. Liu, Z. Lu, and H. Wang, "An SSVEP-based BCI with LEDs visual stimuli using dynamic window CCA algorithm," Biomedical Signal Processing and Control, vol. 76, p. 103727, 2022.
[10] A. Luo and T. J. Sullivan, "A user-friendly SSVEP-based brain–computer interface using a time-domain classifier," Journal of Neural Engineering, vol. 7, no. 2, p. 026010, 2010.
[11] X. Chen, Z. Chen, S. Gao, and X. Gao, "A high-ITR SSVEP-based BCI speller," Brain-Computer Interfaces, vol. 1, no. 3-4, pp. 181–191, 2014.
[12] Y. Wang, X. Chen, X. Gao, and S. Gao, "A benchmark dataset for SSVEP-based brain–computer interfaces," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, no. 10, pp. 1746–1752, 2016.
[13] R. Zhang, L. Cao, Z. Xu, Y. Zhang, L. Zhang, Y. Hu, M. Chen, and D. Yao, "Improving AR-SSVEP recognition accuracy under high ambient brightness through iterative learning," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 31, pp. 1796–1806, 2023.
[14] X. Zhao, C. Liu, Z. Xu, L. Zhang, and R. Zhang, "SSVEP stimulus layout effect on accuracy of brain-computer interfaces in augmented reality glasses," IEEE Access, vol. 8, pp. 5990–5998, 2020.
[15] H. Si-Mohammed, J. Petit, C. Jeunet, F. Argelaguet, F. Spindler, A. Evain, N. Roussel, G. Casiez, and
A. Lécuyer, "Towards BCI-based interfaces for augmented reality: feasibility, design and evaluation," IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 3, pp. 1608–1621, 2018.
[16] A. Ravi, J. Lu, S. Pearce, and N. Jiang, "Enhanced system robustness of asynchronous BCI in augmented reality using steady-state motion visual evoked potential," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 30, pp. 85–95, 2022.
[17] R. Zhang, Z. Xu, L. Zhang, L. Cao, Y. Hu, B. Lu, L. Shi, D. Yao, and X. Zhao, "The effect of stimulus number on the recognition accuracy and information transfer rate of SSVEP–BCI in augmented reality," Journal of Neural Engineering, vol. 19, no. 3, p. 036010, 2022.
[18] D. Zhu, J. Bieger, G. Garcia Molina, and R. M. Aarts, "A survey of stimulation methods used in SSVEP-based BCIs," Computational Intelligence and Neuroscience, vol. 2010, no. 1, p. 702357, 2010.
[19] P. Arpaia, E. De Benedetto, L. De Paolis, G. D'Errico, N. Donato, and L. Duraccio, "Performance enhancement of wearable instrumentation for AR-based SSVEP BCI," Measurement, vol. 196, p. 111188, 2022.
[20] L. Angrisani, P. Arpaia, E. De Benedetto, L. Duraccio, F. L. Regio, and A. Tedesco, "Wearable brain–computer interfaces based on steady-state visually evoked potentials and augmented reality: A review," IEEE Sensors Journal, vol. 23, no. 15, pp. 16501–16514, 2023.
[21] H. Liu, Z. Wang, R. Li, X. Zhao, T. Xu, T. Zhou, and H. Hu, "A comparative study of stereo-dependent SSVEP targets and their impact on VR-BCI performance," Frontiers in Neuroscience, vol. 18, p. 1367932, 2024.
[22] Y. Mustafa, M. Elmahallawy, T. Luo, and S. Eldawlatly, "A brain-computer interface augmented reality framework with auto-adaptive SSVEP recognition," in 2023 IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering (MetroXRAINE). IEEE, 2023, pp. 799–804.
[23] H.-J. Hwang, D. H. Kim, C.-H. Han, and C.-H. Im, "A new dual-frequency stimulation method to increase the number of visual stimuli for multi-class SSVEP-based brain–computer interface (BCI)," Brain Research, vol. 1515, pp. 66–77, 2013.
[24] X. Chen, B. Liu, Y. Wang, and X. Gao, "A spectrally-dense encoding method for designing a high-speed SSVEP-BCI with 120 stimuli," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 30, pp. 2764–2772, 2022.
[25] A. Chabuda, P. Durka, and J. Żygierewicz, "High frequency SSVEP-BCI with hardware stimuli control and phase-synchronized comb filter," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 26, no. 2, pp. 344–352, 2017.
[26] Y. Li, J. Pan, F. Wang, and Z. Yu, "A hybrid BCI system combining P300 and SSVEP and its application to wheelchair control," IEEE Transactions on Biomedical Engineering, vol. 60, no. 11, pp. 3156–3166, 2013.
[27] M. Xu, H. Qi, B. Wan, T. Yin, Z. Liu, and D. Ming, "A hybrid BCI speller paradigm combining P300 potential and the SSVEP blocking feature," Journal of Neural Engineering, vol. 10, no. 2, p. 026001, 2013.
[28] J. Han, M. Xu, X. Xiao, W. Yi, T.-P. Jung, and D. Ming, "A high-speed hybrid brain-computer interface with more than 200 targets," Journal of Neural Engineering, vol. 20, no. 1, p. 016025, 2023.
[29] S. Kim, S. Lee, H. Kang, S. Kim, and M. Ahn, "P300 brain–computer interface-based drone control in virtual and augmented reality," Sensors, vol. 21, no. 17, p. 5765, 2021.
[30] P. Johnston, J. Robinson, A. Kokkinakis, S. Ridgeway, M. Simpson, S. Johnson, J. Kaufman, and A. W. Young, "Temporal and spatial localization of prediction-error signals in the visual brain," Biological Psychology, vol. 125, pp. 45–57, 2017.
[31] L. Niu, J. Bin, J. K. S. Wang, G. Zhan, J. Jia, L. Zhang, Z. Gan, and X. Kang, "Effect of 3D paradigm synchronous motion for SSVEP-based hybrid BCI-VR system," Medical & Biological Engineering & Computing, vol. 61, no. 9, pp. 2481–2495, 2023.
[32] S. R. Zehra, J. Mu, B. V. Syiem, A. N. Burkitt, and D. B. Grayden, "Evaluation of optimal stimuli for SSVEP-based augmented reality brain-computer interfaces," IEEE Access, vol. 11, pp. 87305–87315, 2023.
[33] L. Tonin, R. Leeb, A. Sobolewski, and J. del R. Millán, "An online EEG BCI based on covert visuospatial attention in absence of exogenous stimulation," Journal of Neural Engineering, vol. 10, no. 5, p. 056007, 2013.
[34] J. Kornmeier, M. Pfäffle, and M. Bach, "Necker cube: Stimulus-related (low-level) and percept-related (high-level) EEG signatures early in occipital cortex," Journal of Vision, vol. 11, no. 9, pp. 12–12, 2011.
[35] Y. Jiang, C. N. Boehler, N. Nönnig, E. Düzel, J.-M. Hopf, H.-J. Heinze, and M. A. Schoenfeld, "Binding 3-D object perception in the human visual cortex," Journal of Cognitive Neuroscience, vol. 20, no. 4, pp. 553–562, 2008.
[36] S. P. Kelly, E. C. Lalor, R. B. Reilly, and J. J. Foxe, "Visual spatial attention tracking using high-density SSVEP data for independent brain-computer communication," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 13, no. 2, pp. 172–178, 2005.
[37] X. Chen, Y. Wang, S. Gao, T.-P. Jung, and X. Gao, "Filter bank canonical correlation analysis for implementing a high-speed SSVEP-based brain–computer interface," Journal of Neural Engineering, vol. 12, no. 4, p. 046008, 2015.
[38] C. M. Wong, F. Wan, B. Wang, Z. Wang, W. Nan, K. F. Lao, P. U. Mak, M. I. Vai, and A. Rosa, "Learning across multi-stimulus enhances target recognition methods in SSVEP-based BCIs," Journal of Neural Engineering, vol. 17, no. 1, p. 016026, 2020.
[39] M. Nakanishi, Y. Wang, X. Chen, Y.-T. Wang, X. Gao, and T.-P. Jung, "Enhancing detection of SSVEPs for a high-speed brain speller using task-related component analysis," IEEE Transactions on Biomedical Engineering, vol. 65, no. 1, pp. 104–112, 2017.
[40] B. Cao, H. Niu, J. Hao, and G. Wang, "Building EEG-based CAD object selection intention discrimination model
using convolutional neural network (CNN)," Advanced Engineering Informatics, vol. 52, p. 101548, 2022.
[41] B. Obermaier, C. Neuper, C. Guger, and G. Pfurtscheller, "Information transfer rate in a five-classes brain-computer interface," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 9, no. 3, pp. 283–288, 2001.
[42] C. S. Herrmann, "Human EEG responses to 1–100 Hz flicker: resonance phenomena in visual cortex and their potential correlation to cognitive phenomena," Experimental Brain Research, vol. 137, pp. 346–353, 2001.
[43] R. Oostenveld and P. Praamstra, "The five percent electrode system for high-resolution EEG and ERP measurements," Clinical Neurophysiology, vol. 112, no. 4, pp. 713–719, 2001.
[44] T.-W. Lee, Independent Component Analysis. Springer, 1998.
[45] A. Delorme and S. Makeig, "EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis," Journal of Neuroscience Methods, vol. 134, no. 1, pp. 9–21, 2004.
[46] A. Mognon, J. Jovicich, L. Bruzzone, and M. Buiatti, "ADJUST: An automatic EEG artifact detector based on the joint use of spatial and temporal features," Psychophysiology, vol. 48, no. 2, pp. 229–240, 2011.
[47] T. Lee, S. Nam, and D. J. Hyun, "Adaptive window method based on FBCCA for optimal SSVEP recognition," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 31, pp. 78–86, 2022.
[48] H. Ramoser, J. Muller-Gerking, and G. Pfurtscheller, "Optimal spatial filtering of single trial EEG during imagined hand movement," IEEE Transactions on Rehabilitation Engineering, vol. 8, no. 4, pp. 441–446, 2000.
[49] F. Jamaloo and M. Mikaeili, "Discriminative common spatial pattern sub-bands weighting based on distinction sensitive learning vector quantization method in motor imagery based brain-computer interface," Journal of Medical Signals & Sensors, vol. 5, no. 3, pp. 156–161, 2015.
[50] K. K. Ang, Z. Y. Chin, H. Zhang, and C. Guan, "Filter bank common spatial pattern (FBCSP) in brain-computer interface," in 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence). IEEE, 2008, pp. 2390–2397.
[51] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132–7141.
[52] M. E. Paoletti, J. M. Haut, X. Tao, J. Plaza, and A. Plaza, "FLOP-reduction through memory allocations within CNN for hyperspectral image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 7, pp. 5938–5952, 2020.
[53] M. Nakanishi, Y. Wang, Y.-T. Wang, and T.-P. Jung, "A comparison study of canonical correlation analysis based methods for detecting steady-state visual evoked potentials," PLoS ONE, vol. 10, no. 10, p. e0140703, 2015.
[54] M. Nakanishi, Y.-T. Wang, and T.-P. Jung, "Optimizing phase intervals for phase-coded SSVEP-based BCIs with template-based algorithm," in 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2018, pp. 650–655.
[55] Y. Zhu, Y. Li, J. Lu, and P. Li, "EEGNet with ensemble learning to improve the cross-session classification of SSVEP based BCI from ear-EEG," IEEE Access, vol. 9, pp. 15295–15303, 2021.
[56] X. Chen, Y. Wang, M. Nakanishi, X. Gao, T.-P. Jung, and S. Gao, "High-speed spelling with a noninvasive brain–computer interface," Proceedings of the National Academy of Sciences, vol. 112, no. 44, pp. E6058–E6067, 2015.
[57] L. Parziale, W. Liu, C. Matthews, N. Rosselot, C. Davis, J. Forrester, D. T. Britt et al., "TCP/IP tutorial and technical overview," 2006.
[58] A. P. Costa, J. S. Møller, H. K. Iversen, and S. Puthusserypady, "An adaptive CSP filter to investigate user independence in a 3-class MI-BCI paradigm," Computers in Biology and Medicine, vol. 103, pp. 24–33, 2018.
[59] N. Masood, H. Farooq, and I. Mustafa, "Selection of EEG channels based on spatial filter weights," in 2017 International Conference on Communication, Computing and Digital Systems (C-CODE). IEEE, 2017, pp. 341–345.
[60] Y. Benjamini and Y. Hochberg, "Controlling the false discovery rate: a practical and powerful approach to multiple testing," Journal of the Royal Statistical Society: Series B (Methodological), vol. 57, no. 1, pp. 289–300, 1995.
[61] E. V. Orekhova, A. V. Butorina, O. V. Sysoeva, A. O. Prokofyev, A. Y. Nikolaeva, and T. A. Stroganova, "Frequency of gamma oscillations in humans is modulated by velocity of visual motion," Journal of Neurophysiology, vol. 114, no. 1, pp. 244–255, 2015.
[62] S. D. Muthukumaraswamy and K. D. Singh, "Visual gamma oscillations: the effects of stimulus type, visual field coverage and stimulus motion on MEG and EEG recordings," NeuroImage, vol. 69, pp. 223–230, 2013.
[63] M. M. Müller, J. Bosch, T. Elbert, A. Kreiter, M. V. Sosa, P. V. Sosa, and B. Rockstroh, "Visually induced gamma-band responses in human electroencephalographic activity—a link to animal studies," Experimental Brain Research, vol. 112, pp. 96–102, 1996.
[64] W. Lutzenberger, F. Pulvermüller, T. Elbert, and N. Birbaumer, "Visual stimulation alters local 40-Hz responses in humans: an EEG-study," Neuroscience Letters, vol. 183, no. 1-2, pp. 39–42, 1995.
[65] M. M. Müller and T. Gruber, "Induced gamma-band responses in the human EEG are related to attentional information processing," Visual Cognition, vol. 8, no. 3-5, pp. 579–592, 2001.
[66] Y. Ke, P. Liu, X. An, X. Song, and D. Ming, "An online SSVEP-BCI system in an optical see-through augmented reality environment," Journal of Neural Engineering, vol. 17, no. 1, p. 016066, 2020.
[67] Y. Wang, K. Li, X. Zhang, J. Wang, and R. Wei, "Research on the application of augmented reality in SSVEP-BCI," pp. 505–509, 2020.
A brain-computer interface (BCI) system and virtual reality (VR) are integrated as a more interactive hybrid system (BCI-VR) that allows the user to manipulate the car. A virtual scene in the VR system that is the same as the physical environment is built, and the object's movement can be observed in the VR scene. The four-class three-dimensional (3D) paradigm is designed and moves synchronously in virtual reality. The dynamic paradigm may affect their attention according to the experimenters' feedback. Fifteen subjects in our experiment steered the car according to a specified motion trajectory. According to our online experimental result, different motion trajectories of the paradigm have various effects on the system's performance, and training can mitigate this adverse effect. Moreover, the hybrid system using frequencies between 5 and 10 Hz indicates better performance than those using lower or higher stimulation frequencies. The experiment results show a maximum average accuracy of 0.956 and a maximum information transfer rate (ITR) of 41.033 bits/min. It suggests that a hybrid system provides a high-performance way of brain-computer interaction. This research could encourage more interesting applications involving BCI and VR technologies.