THE FEATURE EXTRACTION AND ESTIMATION OF A STEADY-STATE VISUAL EVOKED POTENTIAL BY THE KARHUNEN-LOEVE EXPANSION


Introduction
There are situations in ophthalmology when standard parametric diagnostic methods (visual acuity tests, perimetry, etc.) are inexpedient or impossible to use. Such situations occur, firstly, when diagnosing the visual analyzer of newborns and non-speaking children, who are unable to express themselves clearly; secondly, they arise when a visual pathology involves both optical and neural (sensory) disorders. A diagnostic method based on the visual evoked potential (VEP) does not require the patient's response, so it allows the activity of the visual analyzer to be assessed as a whole: the peripheral organ (the eye), the visual pathways, and the visual centers in the cortex.
The VEP is a type of brain electrical activity that is registered on the scalp above the visual areas and evoked by external stimuli (light flashes, FVEP; pattern changes, PVEP; or changes in an image on a monitor). At a low stimulation frequency (Fs < 4 Hz), the visual analyzer produces a transient VEP (TVEP), whereas at a higher stimulation frequency (Fs > 5 Hz), a steady-state VEP (SSVEP) appears. In other words, at high stimulation frequencies, the human visual system passes from generating a TVEP to generating an SSVEP, producing responses at the stimulation frequency.
In contrast to a transient VEP, the SSVEP characteristics (phase, frequency, and period) remain stable throughout the recording and are less affected by artifacts and noise [1]. Consequently, the use of the SSVEP is not limited to ophthalmological problems (assessment of visual acuity, injuries of the visual centers of the brain, optic neuritis, amblyopia, etc. [2]); it is also applied in cognitive (assessment of visual attention, working memory, and binocular vision [3]) and clinical (schizophrenia, autism, epilepsy, depression, migraine [4]) neuroscience, brain-computer interfaces [5], and neuromarketing [6]. Research on the SSVEP with a view to its use in the information technology of ophthalmological diagnostics is therefore an important problem, since its solution contributes to detailed study of the visual analyzer and to appropriate comprehensive treatment.

Literature review and problem statement
Let us consider the mathematical models of the SSVEP and the corresponding parameters that could serve as informative features for further diagnostics. A component model, which is described in [1], represents the SSVEP as the sum of three components: primary, secondary, and rhyth-
The authors of [2] use a contrast response function for amblyopia diagnostics. In healthy people, the maximum amplitude of the evoked potentials should grow as the contrast of the stimulation source of the visual analyzer changes. If the increase is only slight, the ophthalmologist suspects a pathology. The diagnostic parameter is the magnitude of the amplitude change, and the method consists in comparing the measured value with a standardized one. It should be noted that this method is not effective for some visual pathologies because the probability of a false diagnosis is high.
Researchers average sets of signal realizations (coherent signal accumulation) and describe the analyzed signal by an additive model [7]. The first component of this model is a deterministic function that represents the VEP, and the second component is a centered weakly stationary process that represents the background electroencephalogram (EEG). Several methods (GSA, SRM, and SDEM [7]) are used to extract the VEP from the EEG background. However, only the first moment is studied in this analysis, and only the amplitude-time characteristics of the extrema (N75, P100, and N145) are used for ophthalmological diagnostics, which is not enough.
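The effect of coherent accumulation under the additive model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sine-shaped response, the white Gaussian noise that stands in for the weakly stationary background EEG, the epoch length, and the number of epochs are hypothetical and are not taken from [7].

```python
import math
import random


def coherent_average(epochs):
    """Average equal-length epochs sample by sample (coherent accumulation)."""
    count = len(epochs)
    length = len(epochs[0])
    return [sum(epoch[i] for epoch in epochs) / count for i in range(length)]


random.seed(0)
length = 64
# Hypothetical deterministic response: one sine cycle per epoch.
vep = [math.sin(2 * math.pi * i / length) for i in range(length)]
# Additive model: epoch = deterministic VEP + zero-mean background noise.
epochs = [[v + random.gauss(0.0, 1.0) for v in vep] for _ in range(400)]

average = coherent_average(epochs)


def rms_error(signal):
    """Root-mean-square deviation of a signal from the noise-free response."""
    return math.sqrt(sum((s - v) ** 2 for s, v in zip(signal, vep)) / length)


rms_single = rms_error(epochs[0])   # error of one raw epoch
rms_average = rms_error(average)    # error after averaging 400 epochs
```

For uncorrelated noise, the residual error of the average is expected to shrink roughly as the square root of the number of epochs. The average estimates only the first moment, which is the limitation noted above.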
The authors of [8] apply independent component analysis to extract the SSVEP. With such a component approach, it is necessary to choose the components that reflect visual activity before processing them further, and this procedure increases the computational complexity.
An autoregressive model is applied to describe the SSVEP in [9], and the visual analyzer can be diagnosed from the estimates of the autoregression parameters. The complexity of this model increases when it is applied to a two-channel SSVEP, and researchers must also take into account the cyclical nature of the studied process in parameter estimation.
The authors of [10] apply a discrete wavelet transform to the SSVEP; the informative parameters are the corresponding decomposition coefficients, and the classification method is the support vector machine. It is important to choose and justify the basis function and the number of decomposition levels correctly. Increasing the number of decomposition levels of the wavelet transform increases the number of informative parameters. This technique is applied in brain-computer interfaces, but for ophthalmological diagnostics, the relation between the results of the wavelet decomposition and the biophysical mechanism of signal generation must be investigated so that the results can be interpreted.
Thus, the primary problem is to create a mathematical model that takes into account the simultaneous interaction between multiple sources of brain activity, includes the cyclical properties and the stochasticity of the SSVEP, reflects the mechanism by which individual neurons generate electrical activity, and allows the informative features of the process to be estimated.
Obtaining informative parameters on the basis of a validated model that meets the requirements listed above is the critical stage in creating the information technology of ophthalmological diagnostics, and it is the objective of this research.

The purpose and objectives of the study
The purpose of the research is feature extraction from a two-channel SSVEP for further ophthalmological diagnostics, using a linear transformation based on a mathematical model in the form of a two-dimensional linear periodic random process (LPRP).
To achieve this purpose, the following tasks must be solved:
- to justify the use of the Karhunen-Loeve expansion for the feature extraction of a two-channel SSVEP;
- to determine the optimal number of informative characteristics that sufficiently characterize the investigated process;
- to investigate the effect of a stochastic correlation between the channels on the optimal number of informative parameters.

1. The digital electroencephalograph specification and the 2-channel SSVEP registration protocols
The SSVEP was registered with the DX-NT32 electroencephalograph (Kharkiv, Ukraine), which has the following specifications: sampling frequency, 512 Hz; signal quantization, 12 bits; notch filter, 50 Hz; high-pass filter, 0.05-1 Hz; input impedance, 20 MΩ; phase noise reduction coefficient, 100 dB; light stimulation, three LED lamps with an adjustable stimulation frequency (1-30 Hz).
The research involved 20 participants (12 men and 8 women) aged 18-23 years (the average age was 20 years). Each participant took part in two experiments that conformed to the ISCEV standard for clinical registration of a VEP [11]. The international 10-20 system was used to locate the scalp electrodes. The active electrodes were placed at positions O1 and O2, and the reference electrode was placed at position Fz. The experiments were performed in a dimly lit laboratory. The source of external visual stimulation consisted of three LED bulbs that simultaneously produced light flashes lasting 30 microseconds. The horizontal visual field was 140°. The mean luminance was 3 cd·m⁻².
The first experiment comprised three trials with stimulation frequencies of 6, 8, and 10 Hz, as shown schematically in Fig. 1. Each trial was recorded for 120 s and was divided into four sessions: 10 seconds of rest, during which the participant could look at the stimulation source without any response; 10 seconds of an adaptation session, during which the source produced light flashes at the selected frequency; 10 seconds of rest; and 90 seconds of an active session, during which the SSVEP was registered at a stimulation frequency of 6, 8, or 10 Hz.
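The trial layout described above can be written down as a small configuration, which makes the sample ranges of each session explicit. This is a sketch under stated assumptions: the session names, the helper `session_bounds`, and the use of the nominal 512 Hz sampling frequency for indexing are illustrative choices and not part of the registration software.

```python
# Session layout of one 120 s trial, in the order given in the protocol.
SESSIONS = [("rest", 10), ("adaptation", 10), ("rest", 10), ("active", 90)]
SAMPLING_HZ = 512  # nominal sampling frequency of the electroencephalograph


def session_bounds(sessions, sampling_hz):
    """Return (name, first_sample, end_sample_exclusive) for every session."""
    bounds = []
    start = 0
    for name, seconds in sessions:
        stop = start + seconds * sampling_hz
        bounds.append((name, start, stop))
        start = stop
    return bounds


bounds = session_bounds(SESSIONS, SAMPLING_HZ)
total_seconds = sum(seconds for _, seconds in SESSIONS)
# Number of flashes delivered during the 90 s active session at each frequency.
flashes_in_active = {freq: freq * 90 for freq in (6, 8, 10)}
```

Under these assumptions, the active session occupies samples 15360 to 61439 of a trial, and it contains 540, 720, or 900 stimulation periods at 6, 8, or 10 Hz, respectively.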
The second experiment differed from the first only in that the active electrode changed during the active session: for the first 45 seconds, the active electrode was O1; for the next 45 seconds, it was O2. In other words, the two channels of the SSVEP were not registered simultaneously.

2. Feature extraction of a SSVEP by the Karhunen-Loeve expansion
In [12, 13], a mathematical model of the SSVEP in the form of a two-dimensional linear periodic random process (LPRP) is constructed and justified, taking into account all the requirements described above. The basic provisions of the model are summarized below.
Suppose the random process $\xi(t)$, which reflects the visual activity caused by cyclic flash stimulation, is presented in the form of a linear random process (LRP) [11]:

$$\xi(t) = \int_{-\infty}^{\infty} \varphi(\tau, t)\, d\pi_1(\tau), \quad t \in (-\infty, \infty), \tag{1}$$

where $\pi_1(\tau)$, $\tau \in (-\infty, \infty)$, is a non-uniform generalized Poisson process for which $P\{\pi_1(0) = 0\} = 1$, and $\varphi(\tau, t)$ is the kernel of the LRP, a nonrandom function given by expression (2), in which $U(s)$ is the Heaviside function, $\omega(\tau)$ is a nonrandom function that describes the impulse damping coefficient, and $\beta(\tau)$ is a nonrandom function that describes the impulse frequency. It is assumed that only one active electrode placed on the human scalp was used to register the random process (a one-channel record).
According to expression (1), an SSVEP is the sum of all impulses (2) produced by neurons at successive time moments $\tau_n$, $n \in \mathbb{Z}$. The random magnitudes of the jumps of the generating process $\pi_1(\tau)$, $\tau \in (-\infty, \infty)$, characterize the impulse amplitudes.
The authors of [12] use the LPRP to represent $\xi(t)$, having justified the periodic properties of the mathematical model (1). The model takes into account the cyclic stimulation, which is important for SSVEP-based diagnostics. Suppose that the visual system of a subject is exposed to cyclic stimulation with a certain frequency $F$, that is, with the stimulation period $T = 1/F$. Based on the conclusions of [12], we claim that the mathematical expectation and the covariance function of the investigated process (1) are $T$-periodic, namely:

$$m_\xi(t) = m_\xi(t + T), \qquad R_\xi(t_1, t_2) = R_\xi(t_1 + T, t_2 + T). \tag{3}$$
Let us consider the case where the SSVEP is recorded by two electrodes, O1 and O2 (a two-channel record). The corresponding random processes are denoted $\xi_1(t)$ and $\xi_2(t)$, and the resulting two-channel process is represented as the two-dimensional model (4). Based on the anatomy of the optic tract, we argue that a signal registered above the projection of the left visual area of the brain contains useful information obtained not only from the left eye but also from part of the right eye.
We then define the vector of generating processes (5) and the matrix of the LPRP kernels of the elements of the two-dimensional model (4), which is given by expression (6). Based on expressions (5) and (6), the two-dimensional stochastic process (4) is represented in the integral form (7). Using modern theories of electrogenesis, the cyclostationarity theorems given in [14], and the periodic properties of the increments of the generating process and of the kernels of the LRP (1), the authors of [13] justified the $T$-periodicity of the mathematical expectation and the covariance function of the two-dimensional process (7). As a result, the two-channel SSVEP is described by the two-dimensional mathematical model (7).
Since the information technology of diagnostics works with a discrete-time signal, the random process (1) should be presented as an $L$-periodic linear random sequence:

$$\xi_t = \sum_{\tau = -\infty}^{\infty} \varphi_{\tau, t}\, \eta_\tau, \tag{8}$$

where $\varphi_{\tau, t} = \varphi_{\tau + L, t + L}$ is an $L$-periodic nonrandom function (the kernel of the stochastic sequence), $\eta_\tau$ is white noise, $L = T / \Delta t$, and $\Delta t$ is the sampling interval. In the general case, (8) is a linear transformation of the white noise sequence, which can be represented by the linear operator $A_t$ of expression (9).

The Karhunen-Loeve expansion/transform (KLE or KLT) is simple to use, and its mathematical apparatus is thoroughly investigated [15, 16] and widely used. Therefore, it was selected as the operator $A_t$ from the large number of linear operators to perform the specified task. Let us consider the features of applying the KLE to the two-channel SSVEP. It should be noted that, according to [17], the representation of an $L$-periodic cyclostationary random sequence requires the KLE only on the set $[0, L-1]$:

$$\xi_t = \sum_{k=1}^{L} \eta_k\, \varphi_k(t), \quad t \in [0, L-1], \tag{10}$$

where $\xi_t$ is the centered linear $L$-periodic random sequence, $\varphi_k(t)$ are real functions on the set $[0, L-1]$ that are orthogonal in the space $\mathbb{R}^L$, and $\eta_k$ are pairwise uncorrelated random variables with the variance $\operatorname{Var}(\eta_k) = \lambda_k$, where $\lambda_k$ are the eigenvalues of the covariance matrix $R_\xi(t_1, t_2)$ of the investigated sequence (8).
It should be noted that the KLE is completely determined by the covariance matrix of the sequence because the orthogonal basis consists of its eigenvectors, and the variances of the expansion coefficients are the associated eigenvalues.
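The dependence of the KLE on the covariance matrix alone can be made concrete with a minimal sketch that computes the eigenvalues and eigenvectors of a small symmetric matrix. The cyclic Jacobi method and the 3 × 3 matrix `cov` are assumptions chosen so that the example needs only the standard library; they are not the authors' implementation, and a library routine such as `numpy.linalg.eigh` would normally be used instead.

```python
import math


def jacobi_eigen(matrix, sweeps=50, tol=1e-20):
    """Eigen-decomposition of a real symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, eigenvectors) sorted by decreasing eigenvalue;
    eigenvectors[i] is the unit eigenvector that belongs to eigenvalues[i].
    """
    n = len(matrix)
    a = [list(row) for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the element a[p][q].
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                for k in range(n):  # A <- J^T A (rows p and q)
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    return [a[i][i] for i in order], [[v[k][i] for k in range(n)] for i in order]


# Hypothetical covariance matrix of a short centered sequence.
cov = [[4.0, 2.0, 0.6],
       [2.0, 2.0, 0.5],
       [0.6, 0.5, 1.0]]
eigenvalues, eigenvectors = jacobi_eigen(cov)
trace = sum(cov[i][i] for i in range(len(cov)))
```

The eigenvectors form the orthogonal basis of the expansion, and the eigenvalues sum to `trace`, the total variance of the sequence.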
Because the two electrodes at positions O1 and O2 were active during the recording of the investigated process, the corresponding realizations are denoted $X_t$ and $Y_t$, and their mathematical representation is the same as in (8). Taking into account the interaction between the two channels, we introduce the random sequence (11). The problem of the KL expansion of the random sequence (11) reduces to the problem of estimating its covariance matrix, which is represented in the form (12), where $X_t$ and $Y_t$ are centered linear $L$-periodic random sequences, and the cross terms are the corresponding cross-covariance functions that characterize the interaction between the sequences $X_t$ and $Y_t$.
A simple way to estimate the mathematical expectation and the covariance function is the method of j-series [18], where a j-series is a set of samples of the sequence ($X_t$ or $Y_t$) arranged in time and taken at intervals of one period $L$, as defined by expression (13). The j-series are stationary and jointly stationary sequences; therefore, their probabilistic characteristics can be estimated by the established methods of statistical analysis of stationary sequences. The estimates of the mathematical expectation of the j-series, expression (14), where $n$ is the volume of the investigated sequence, give the estimate of the mathematical expectation of the $L$-periodic sequence over one period. The centered linear $L$-periodic random sequence $X_t$ (or $Y_t$) is presented in the matrix form $X$ (or $Y$), each row of which is a centered j-series, expression (15). The $L$-periodic random sequence is then represented in the matrix form $Z$ as a concatenation of the centered matrices of the sequences $X_t$ and $Y_t$, expression (16). The next step is to construct a square matrix of size $(2L \times 2L)$ whose elements are asymptotically unbiased and consistent estimates of the elements of the covariance matrix (12):

$$\hat{R} = \frac{1}{m} Z Z^{T}. \tag{17}$$

The final step of the KLE is to obtain the set of eigenvectors $\{\varphi_1, \varphi_2, \ldots, \varphi_k\}$ and the corresponding eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_k\}$ of the covariance matrix $R$.
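The estimation steps described above (forming the j-series, centering them, concatenating the two channels, and estimating the covariance matrix) can be sketched as follows. The sketch rests on several assumptions that are not confirmed by the text: the rows of the concatenated matrix are the centered j-series, the estimate is normalized by the number of recorded periods, and the synthetic two-channel signal with a shared component is hypothetical.

```python
import math
import random


def centered_j_series(sequence, period):
    """Split a sequence into `period` j-series and subtract each series mean."""
    cycles = len(sequence) // period
    rows = []
    for j in range(period):
        series = [sequence[j + i * period] for i in range(cycles)]
        mean = sum(series) / cycles  # mean estimate for phase j of the period
        rows.append([value - mean for value in series])
    return rows


def covariance_estimate(rows):
    """Covariance estimate (normalized by 1/m) between all pairs of centered rows."""
    m = len(rows[0])
    return [[sum(a * b for a, b in zip(r1, r2)) / m for r2 in rows] for r1 in rows]


random.seed(1)
L, cycles = 5, 200  # hypothetical period length (samples) and number of periods
# Activity common to both channels, which creates the cross-covariance.
shared = [random.gauss(0.0, 1.0) for _ in range(L * cycles)]
x = [math.sin(2 * math.pi * t / L) + s + random.gauss(0.0, 0.5) for t, s in enumerate(shared)]
y = [math.cos(2 * math.pi * t / L) + s + random.gauss(0.0, 0.5) for t, s in enumerate(shared)]

# Concatenate the centered j-series of both channels: 2L rows, one per series.
z = centered_j_series(x, L) + centered_j_series(y, L)
R = covariance_estimate(z)  # square matrix of size 2L x 2L
```

The diagonal blocks of `R` describe each channel on its own, while the off-diagonal blocks, for example `R[0][L]`, estimate the cross-covariance between the channels. The eigenvectors and eigenvalues of `R` then give the expansion.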

3. The estimation of the optimal number of the SSVEP informative features
We use the following statistic [17] to choose the number of informative features that sufficiently characterize the investigated process:

$$e_k = \frac{\sum_{i=1}^{k} \lambda_i}{\operatorname{tr} R}, \tag{18}$$

where $e_k$ is the fraction of the energy of the initial sequence that is contained in the first $k$ elements of the expansion, and $\operatorname{tr} R$ is the total energy of the sequence, calculated as the trace of the covariance matrix.
This statistic follows from a property of the KLE: the variance of the investigated sequence, the estimates of which lie on the main diagonal of the covariance matrix, equals the sum of its eigenvalues. The optimal number of informative features $k$ is defined by the inequality $e_k > 0.95$.
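The energy criterion can be sketched as a short function that returns the smallest number of leading eigenvalues whose share of the trace exceeds the threshold. The function name `components_for_energy` and the eigenvalue list are hypothetical; the list is not data from the study.

```python
def components_for_energy(eigenvalues, threshold=0.95):
    """Smallest k whose k largest eigenvalues hold more than `threshold` of the trace."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)  # equals the trace of the covariance matrix
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total > threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues that sum to 100, so each value reads as a percentage.
eigenvalues = [42.0, 17.0, 15.0, 12.0, 8.0, 4.0, 2.0]
k_95 = components_for_energy(eigenvalues)       # threshold 0.95
k_50 = components_for_energy(eigenvalues, 0.5)  # a lower threshold keeps fewer components
```

For this list, the cumulative shares are 0.42, 0.59, 0.74, 0.86, 0.94, and 0.98, so six components are needed to exceed 0.95 and two to exceed 0.5. The result depends directly on the chosen threshold.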
Kaiser's rule, expression (19), is also used to estimate the number of informative features. According to (19), the informative parameters are the eigenvectors whose corresponding eigenvalues are greater than the power of the initial sequence.
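Kaiser's rule can be sketched in a few lines. The sketch assumes that the power of the initial sequence is the mean eigenvalue, that is, the trace of the covariance matrix divided by its dimension; this reading of (19), the function name `kaiser_count`, and the eigenvalue list are assumptions made for illustration.

```python
def kaiser_count(eigenvalues):
    """Number of eigenvalues strictly above the mean eigenvalue (assumed reading of the rule)."""
    mean = sum(eigenvalues) / len(eigenvalues)
    return sum(1 for value in eigenvalues if value > mean)


# Hypothetical eigenvalues; their mean is 100 / 7, about 14.3.
eigenvalues = [42.0, 17.0, 15.0, 12.0, 8.0, 4.0, 2.0]
kept = kaiser_count(eigenvalues)
```

For this list, three eigenvalues (42, 17, and 15) exceed the mean, so three eigenvectors are retained. Unlike the energy criterion, the rule has no threshold to tune.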

The KLE results of the two-channel SSVEP at different stimulation frequencies
The stepwise results of applying the KLE to the two-channel SSVEP obtained in the first experiment at a stimulation frequency of 10 Hz are presented below.
The first stage is preprocessing of the sampled signal: the mathematical expectation of each single-channel realization is estimated, and the realization is centered (Fig. 2, a). The main diagonal of the covariance matrix, whose dimensions are (100 × 100), contains the variance estimates of each j-series, as shown in Fig. 3; their sum is the total energy of the investigated two-channel SSVEP. Based on the calculated estimates of the covariance matrix, the KLE was implemented, and a set of eigenvectors and eigenvalues was obtained for the matrix $R$. Fig. 4 shows the first two eigenvectors, $\varphi_1$ and $\varphi_2$ ($k \in [1, 100]$), which account for the largest share of the signal energy.
Based on the array of eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_k\}$, $k \in [1, 100]$, shown in Fig. 5, the optimal number of informative features was estimated according to expressions (18) and (19). Fig. 5 shows only the first 25 eigenvalues because the others are insignificant. The dot-dashed line represents the SSVEP signal power and illustrates Kaiser's rule, according to which the informative parameters are only those eigenvectors (in this case, 15) whose corresponding eigenvalues lie above the line. Fig. 5 thus shows that the informative features are the eigenvectors that correspond to the first few largest eigenvalues, which reflect the investigated process fully enough.
Table 1 shows the average number of eigenvectors that are used as informative features for diagnostics. The numbers were obtained with statistics (18) and (19) for the two-channel SSVEP recorded during the first experiment, for each of the 20 participants and at different stimulation frequencies.

Table 1
The correspondence between the average optimal number of the informative features and the stimulation frequency (from the first experiment, calculated for the 2-channel registration)

Table 2 shows the average number of selected eigenvectors obtained with statistics (18) and (19) for the SSVEP registered during the first experiment for each of the 20 participants at different stimulation frequencies; the values were calculated separately for each channel.

Table 2
The correspondence between the average optimal number of the informative features and the stimulation frequency (from the first experiment, calculated for each separately registered channel)

Table 3 shows the average number of selected eigenvectors obtained with statistics (18) and (19) for the SSVEP registered during the second experiment and calculated for each of the 20 participants.

Table 3
The correspondence between the average optimal number of the informative features and the stimulation frequency (from the second experiment, calculated for each separately registered channel)
Columns: statistics; the number of the informative features.

The results were calculated separately for each channel, and the corresponding values are the following: $L = 25$, $m = 450$, and $k \in [1, 25]$. It should be noted that the results presented in Table 2 and Table 3 are almost identical.

Discussion of the KLE results of the 2-channel SSVEP at different stimulation frequencies
The optimal number of informative parameters was evaluated with two statistics, which made it possible to compare them and to determine the more appropriate one for the research objectives. According to Fig. 5, the optimal number of informative parameters of the two-channel SSVEP at a stimulation frequency of 10 Hz equals 14, that is, the number of eigenvalues that lie above the dot-dashed line. This approach is based on the assumption that the informative component of the signal has a high amplitude and a much smaller dimension, whereas the noise has a small amplitude and a large dimension. This assumption is not always true, which is a disadvantage of Kaiser's rule.
The estimate based on the fraction of the signal energy contained in the first $k$ components also has a disadvantage: the optimal number of informative parameters grows as the required fraction of the energy increases; therefore, an appropriate threshold for $e_k$ must be chosen. In the considered case, $e_k > 0.95$, and the number of informative parameters is 12, which is fewer than with Kaiser's rule. The author of [19] also states that Kaiser's rule overestimates the number of informative parameters for data of large dimensions.
Table 1 shows an inverse relation between the stimulation frequency and the number of eigenvectors of the covariance matrix of the two-channel SSVEP. The reason is the change in the length of the period $L$. For example, the dimension of the covariance matrix of the SSVEP is (100 × 100) at a stimulation frequency of 10 Hz and (160 × 160) at a stimulation frequency of 6 Hz. Consequently, the number of eigenvalues and eigenvectors increases as the stimulation frequency decreases. It should also be noted that the number of informative parameters increases as the threshold for $e_k$ increases.
Let us consider Tables 1-3 and examine the results at a stimulation frequency of 10 Hz to substantiate the expediency of determining the optimal number of informative parameters from visual signals recorded simultaneously at positions O1 and O2. Table 1 shows the average optimal number of selected eigenvectors obtained with the two statistics from the Karhunen-Loeve expansion of the covariance matrix of the two-channel SSVEP, which includes information about the interaction between the channels. At a frequency of 10 Hz, it is enough to select 14 (according to Kaiser's rule) or 12 (according to the energy percentage) informative parameters. It should be noted that in this case the two-channel signal must be presented in the form (11) before the elements of the covariance matrix are estimated and the expansion is calculated, which complicates the processing.
For comparison, the average optimal number of informative parameters was calculated for each of the signals registered simultaneously at positions O1 and O2. The Karhunen-Loeve expansion was implemented on the basis of two separate covariance matrices. According to the data in Table 2, at a frequency of 10 Hz, it is enough to select 15 (according to Kaiser's rule) or 16 (according to the energy percentage) informative parameters. The number of parameters is higher than with the first approach (15 > 14 and 16 > 12), which means that some informative parameters duplicate the primary useful information; this confirms the correlation between the registration channels [13]. This should be taken into account in diagnostics.
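A toy calculation, which is an illustration and not part of the study's data, suggests why the joint analysis can need fewer components. Assume two centered channels with unit variance, each reduced to a single sample per period, and a hypothetical correlation coefficient $\rho$ with $0 < \rho < 1$. The joint covariance matrix, its eigenvalues, and the energy share of the first component are then:

```latex
R = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \qquad
\lambda_1 = 1 + \rho, \quad \lambda_2 = 1 - \rho, \qquad
e_1 = \frac{\lambda_1}{\operatorname{tr} R} = \frac{1 + \rho}{2}.
```

For $\rho = 0.95$, $e_1 = 0.975 > 0.95$, so a single component of the joint expansion passes the energy criterion, whereas a separate analysis keeps one component for each channel, two in total. The stronger the correlation between the channels, the more information the separate expansions duplicate.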
The average optimal number of informative parameters was also calculated for each signal registered non-simultaneously at positions O1 and O2 (the second experiment); the results are presented in Table 3. The data in Tables 2 and 3 are almost identical. This confirms that the KLE of a two-channel SSVEP based on the covariance matrix of each channel separately produces results equal to those obtained with the KLE of a one-channel SSVEP recorded non-simultaneously, that is, without any stochastic correlation between the channels.

Conclusions
1. Based on a mathematical model of the steady-state VEP as a two-dimensional linear periodic random process, the use of its Karhunen-Loeve expansion for feature extraction is justified: the features are the eigenvalues and eigenvectors of the covariance matrix of a random vector formed by concatenating the vectors of the observed SSVEP data that are recorded simultaneously from two separate channels.
2. Taking into account the statistical estimates of the informative features and the stochastic cyclostationarity of the signal, it has been found that the first 18 components of the Karhunen-Loeve expansion reflect more than 95 % of the signal energy at a stimulation frequency of 6 Hz (15 components at a stimulation frequency of 8 Hz and 12 components at a stimulation frequency of 10 Hz). This suggests using the estimated number of informative features in actual diagnostics.
3. It has been shown that, owing to the stochastic dependence between the signals registered from different channels, analyzing the signals together allows fewer informative features (chosen according to the energy criterion) to be used than analyzing the signals from the channels independently.

Fig. 4. The eigenvectors of the covariance matrix R of the 2-channel SSVEP: a, the first eigenvector, which reflects 42 % of the process energy; b, the second eigenvector, which reflects 17 % of the process energy