



Sleep Medicine Reviews 16 (2012) 251–263


TECHNICAL REVIEW

Sleep scoring using artificial neural networks

Marina Ronzhina a,*, Oto Janoušek a,d, Jana Kolářová a,e, Marie Nováková b,g, Petr Honzík c,h, Ivo Provazník a,f

a Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Kolejní 4, Brno 61200, Czech Republic
b Department of Physiology, Faculty of Medicine, Masaryk University, Kamenice 753/5, Brno 62500, Czech Republic
c Department of Control and Instrumentation, Faculty of Electrical Engineering and Communication, Brno University of Technology, Kolejní 4, Brno 61200, Czech Republic

Article info

Article history:
Received 18 March 2011
Received in revised form 30 June 2011
Accepted 30 June 2011
Available online 24 October 2011

Keywords:
Polysomnographic data
Sleep scoring
Features extraction
Artificial neural networks


doi:10.1016/j.smrv.2011.06.003

Summary

Rapid development of computer technologies leads to the intensive automation of many different processes traditionally performed by human experts. One of the spheres characterized by the introduction of new high intelligence technologies substituting analysis performed by humans is sleep scoring. This is a classification task and can be solved, next to other classification methods, by use of artificial neural networks (ANN). ANNs are parallel adaptive systems suitable for solving non-linear problems. Using ANNs for automatic sleep scoring is especially promising because of new ANN learning algorithms allowing faster classification without decreasing the performance. Both appropriate preparation of the training data and selection of the ANN model make it possible to perform effective and correct recognition of the relevant sleep stages. Such an approach is highly topical, taking into consideration the fact that there is no automatic scorer utilizing ANN technology available at present.

© 2011 Elsevier Ltd. All rights reserved.

Introduction

Sleep disorders represent one of the serious problems of modern society. The pressure of work and an unhealthy lifestyle cause a decrease of sleep quality, which may produce various mental disorders such as depression. Moreover, the presence of certain sleep disorders serves as an indicator of serious disturbances such as cardiovascular disease,1,2 diabetes mellitus and obesity.3,4 Thus, observation of sleep structure helps to detect abnormal changes early enough and to prevent disorder progression.

Manual visual sleep scoring based on the analysis of different biological signals is a difficult, time-consuming process. The classification of an 8-h recording (whole night record) requires approx. 2–4 h. Moreover, scoring by a human expert is not absolutely correct. It is characterized by the subjectivity of decision making:

. Ronzhina), [email protected]á�rová), [email protected].ík), [email protected]


agreement between the results of visual scoring obtained by two experts reaches only 83 ± 3%,5 which is quite a low value. Therefore, the development of systems for automatic sleep scoring is a very important area of sleep studies.6–14

Many different methods for sleep stage classification have been proposed recently. Rapid development of high intelligence technologies makes it possible to create increasingly sophisticated systems to substitute for human experts. Classification methods based on Bayesian probability (linear and quadratic discrimination, k-Nearest Neighbor and Parzen classifiers) can be used for sleep scoring. They require homogeneity and Gaussian distribution of the input data.7,14 Although the data must be transformed before classification, the linear and quadratic statistical classifiers have one big advantage: they are fast.7 Another group of classification methods is represented by artificial neural networks (ANNs), which are parallel adaptive systems generally allowing the solving of non-linear problems.15,16 In contrast to the previous methods, this approach is not sensitive to the extreme values often present in real signals and, consequently, does not require special transformation of the data.7,14 Moreover, there are new ANN algorithms developed to increase the speed of the learning phase without decreasing the resulting performance (see below). Therefore, ANN systems have been widely adopted for processing and analysis of not only sleep recordings11,12 but also of various types of data: for example, in chemistry for modelling silver nanoparticle dimensions17 and prediction of the thermal conductivity of electrolyte solutions,18 in mathematics for solving partial differential equations,19 in economics for credit scoring20 and for forecasting the prices of agricultural commodities,21 in civil engineering for structural damage detection,22 in computer science for spam detection,23 in signal and image processing for auto-regressive model order estimation24 and object-based classification of satellite images,25 and in medicine for diagnosing chronic obstructive pulmonary disease26 and predicting patients with heart disease.27

Abbreviations

AASM  American Academy of Sleep Medicine
ANN  artificial neural network
ANOVA  analysis of variance
BP  backpropagation
CV  cross-validation
EDF  European data format
EEG  electroencephalogram
EMG  electromyogram
EOG  electrooculogram
FT  Fourier transform
LM  Levenberg–Marquardt algorithm
MLNN  multilayer neural network
MLP  multilayer perceptron
MSE  mean of squared error
MT  movement time
PSD  power spectral density
PSG  polysomnographic
REM  rapid eye movement
R&K  Rechtschaffen and Kales
RMS  root mean square
RUM  Rumelhart algorithm
SD  standard deviation
SOM  self-organizing map
SS  sleep spindles
SSE  sum of squared error
STFT  short-time Fourier transform
SWS  slow wave sleep
W  wakefulness
WT  wavelet transform

Standards for sleep scoring have been proposed based on the ability of human experts to recognize some characteristic patterns of signals with no extra tools.28,29 Thus, the results of machine scoring are generally compared with those obtained by visual analysis to estimate the classifier performance.

The present paper reviews the principle of automatic sleep scoring using the neural network technique and reports some of the systems developed during the last two decades. The practical example of ANN based sleep scoring using only a one-channel electroencephalogram shows the efficacy of this approach. More detailed information about the presented methods may be found in the recommended literature.

Polygraphic data and visual sleep scoring

In sleep studies, polysomnographic (PSG) measurement is usually used. Various biological signals are simultaneously recorded during PSG, for example respiratory effort, heart rate, nasal airflow, and others.30 The vigilance levels of a patient (wakefulness and sleep phases) can be successfully represented by three of those:

• electroencephalogram (EEG) is recorded for evaluation of brain electrical activity;

• electrooculogram (EOG) is recorded for detection of eye movements; and

• electromyogram (EMG) is recorded for muscle tone monitoring.

In clinical practice, these signals are manually scored by trained experts according to special rules. Primary data are consequently replaced with a sequence of numbers, the so called hypnogram. Each sample of the hypnogram represents one signal segment of fixed length, usually 20–30 s (epoch). This form of vigilance level presentation is very simple and easily interpretable.

The scoring is based on recognition of EEG sleep patterns. The first human EEG was recorded by Hans Berger in 1924.31 Since then, measurement and subsequent analysis of EEG have been used for various purposes in clinical practice as well as in research. The changes of the vigilance level (awake–sleep) are caused by changes of the activity of nerve centers. Therefore, sleep may be studied by brain signal analysis. In comparison to EOG and EMG, EEG represents the immediate electrical signals of the brain. For that reason, some authors use only EEG for automatic sleep scoring.8,9,32–38 In neurology, it is a convention to describe EEG in terms of its frequency components. The main ones are the delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), and beta (12–35 Hz) bands. For sleep spindle (see below) detection, the sigma band (approx. 12–16 Hz) can also be distinguished.

There are two approaches to visual scoring of PSG data. In 1968, Rechtschaffen and Kales (R&K) tried to standardize the method for sleep scoring of healthy adults by producing a manual, which contained the rules for sleep pattern recognition from PSG data.28

According to R&K, seven stages of vigilance are distinguished: wakefulness, four stages of non-REM sleep (stage 1, stage 2, stage 3, and stage 4), REM (rapid eye movement) sleep and movement time (MT). They are generally identified as W, S1, S2, S3, S4, REM and MT (or '0', '1', '2', '3', '4', '5', '6' in hypnograms) and defined as follows.

• W is characterized by low voltage (10–30 μV), mixed frequency and/or possible alpha activity in EEG and relatively high tonic EMG (muscle activity higher than 30 Hz is usually taken into consideration30);

• S1 is defined as a stage with the amplitude of EEG not exceeding 200 μV and frequencies placed within the 2–7 Hz range; alpha components should not exceed 50% of the total spectral band, slow eye movements can be present in EOG, and the EMG level should be lower than in the previous stage;

• S2 is characterized by the presence of so called sleep spindles (SS) (waves in the 12–14 Hz range) and K-complexes (sharp negative waves followed by smooth positive waves) with duration of at least 0.5 s and the absence of slow waves in EEG;

• S3 is a stage with delta activity occupying 20–50% of the total band, with amplitude higher than 75 μV, where K-complexes and SS can also occur;

• S4 is very similar to S3; delta activity, however, appears in more than 50% of the total band;

• REM is characterized by low voltage, mixed frequency and the possible presence of saw-tooth waves in EEG, the lowest level of EMG, and episodic REMs (period from the beginning of eye deflection to the initial peak shorter than 500 ms30) detected in EOG;

• MT labels an epoch that can be classified as neither sleep nor wakefulness due to amplifier blocking or muscle activity.


R&K rules have remained the gold standard for experts in visually scored sleep and for inventors of automatic classification systems for a long time. However, the routine use of the R&K standard has revealed some limitations.39 As a result, a new advanced method replaced the R&K rules. Scoring according to the American Academy of Sleep Medicine (AASM) implies the application of modified rules and terminology.29 There are the following changes in vigilance stage definitions: N1, N2 and N3 are used instead of S1–S4 (S3 and S4 are merged into N3), and MT is abolished. The definition of sleep stages is similar to that described by R&K with some exceptions. The studies of different authors have shown that the scoring of sleep by means of these two approaches produces significantly different results both in adults40 and in children.41 It is obvious that new sleep studies are needed for adapting the existing age-dependent normative values to AASM criteria.40 Differences can be explained by the requirement of using at least three EEG derivations for analysis according to the new standard and by the new set of scoring rules. In children, for instance, the alpha and slow wave activities can be less pronounced in the occipital and frontal leads, respectively, in comparison with adults.41 With the help of inter-rater reliability analysis, it has been found that agreements for all sleep stages, except N2 (due to the new rule for detecting the end of N2 based on cortical arousals), are higher for AASM in comparison with R&K. For detailed information, see the study of Danker-Hopfer and co-authors.42

Automatic sleep scoring by ANN

An algorithm for automatic classification generally consists of three steps: data pre-processing, characteristic feature extraction and classification on the basis of the extracted features. In the case of sleep stage classification, the feature vector is extracted from PSG data which were previously transformed by an analogue-to-digital converter into sequences of samples. PSG features are then used as input for an ANN classifier to provide sleep scoring. A minimal sketch of such a pipeline is given below.
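
As an illustration only, the following Python sketch outlines the three-step structure described above (pre-processing into epochs, feature extraction, classification). The function names, the sampling rate and the placeholder features are assumptions for the example, and the classifier stands for any trained feedforward network rather than a particular published system.

```python
import numpy as np

FS = 100          # sampling rate in Hz (assumed for the example)
EPOCH_LEN = 30    # epoch length in seconds, as in R&K/AASM scoring

def split_into_epochs(signal, fs=FS, epoch_len=EPOCH_LEN):
    """Cut a 1-D PSG channel into non-overlapping fixed-length epochs."""
    samples_per_epoch = fs * epoch_len
    n_epochs = len(signal) // samples_per_epoch
    return signal[:n_epochs * samples_per_epoch].reshape(n_epochs, samples_per_epoch)

def extract_features(epoch):
    """Placeholder feature vector; real systems use the time, frequency
    and non-linear features discussed in this review."""
    return np.array([epoch.mean(), epoch.std(), np.abs(np.diff(epoch)).mean()])

# Hypothetical usage: eeg is a 1-D numpy array, ann is any trained classifier
# exposing a predict() method (e.g., a multilayer perceptron).
# epochs = split_into_epochs(eeg)
# X = np.vstack([extract_features(e) for e in epochs])
# hypnogram = ann.predict(X)   # one sleep-stage label per epoch
```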

Methods of artefacts reduction

Before feature determination, PSG signals must be pre-processed in order to remove artefacts and noise and/or magnify some interesting signal components (for example, alpha activity in EEG).

All PSG signals often contain so called technical artefacts, which are usually related to the electrodes or equipment. They can be manifested as sudden changes in the level of the signal baseline due to movements of electrodes, as the presence of a 50/60 Hz component representing the influence of electromagnetic fields, or as noise of the amplifier, analogue-to-digital converter and other devices.43

In addition to technical artefacts, EEG recordings contain artefacts of physiological non-cerebral origin; the most important of them are eye movements, blinks, muscle and cardiac activity.10,43

The eye movement artefact can be present in EEG during the whole recording, i.e., not only in the wake state but also in REM, when it interferes with rapid eye movements. Moreover, this type of artefact is sometimes very similar to slow EEG activity (theta and delta) and therefore cannot be easily identified. The blink related artefact is characterized by a higher frequency of its components in comparison with the eye movement artefact and a higher amplitude in the frontal EEG channels. The muscle activity artefact, typical for the wake stage of EEG, has a frequency close to beta activity. This fact complicates its removal from EEG using simple filtering. The presence of the cardiac artefact in EEG is easily detectable because of its characteristic shape (similar to the QRS complex of the ECG). However, epileptic activity or cardiac arrhythmias may lead to misinterpretation.

Various computational methods can be applied for reducing the artefacts described above. One of such methods is linear filtering, which allows reducing, for example, 50/60 Hz powerline interference36,43–45 (using a notch filter) and muscle activity artefacts (using a lowpass filter).43 As mentioned above, linear filtering of muscle activity artefacts is appropriate only if there is no need to analyse the higher EEG rhythms. For reduction of EOG and ECG related artefacts, the corresponding recordings containing only information about these signals (not about EEG activity) can be used to perform adaptive filtering, subtraction or correlation methods.43,46 It is also possible to remove such artefacts without reference signals by application of the wavelet transform.47 There are more sophisticated approaches avoiding the limitations which are caused by overlapping spectral characteristics of the artefacts and the useful signal. They are based on new technologies such as artificial neural networks,12,48 independent component analysis,49–51 principal component analysis,52 regression methods,53 and morphological component analysis.54 Each approach has some advantages and disadvantages. The method for removing artefacts should be chosen according to the available data and software/hardware possibilities.
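
As a simple illustration of the linear-filtering option mentioned above, the sketch below removes 50 Hz powerline interference from a single EEG channel with an IIR notch filter; the sampling rate and filter quality factor are assumptions chosen for the example, not values prescribed by the reviewed systems.

```python
import numpy as np
from scipy import signal

def remove_powerline(eeg, fs=250.0, line_freq=50.0, quality=30.0):
    """Suppress the 50/60 Hz powerline component with an IIR notch filter.

    fs        -- sampling rate in Hz (assumed value for this example)
    line_freq -- mains frequency to reject (50 Hz in Europe, 60 Hz in the US)
    quality   -- notch quality factor; higher values give a narrower notch
    """
    b, a = signal.iirnotch(w0=line_freq, Q=quality, fs=fs)
    # filtfilt applies the filter forward and backward to avoid phase shift
    return signal.filtfilt(b, a, eeg)

# Example with a synthetic signal: 4 Hz "EEG" plus 50 Hz interference
t = np.arange(0, 10, 1 / 250.0)
eeg = np.sin(2 * np.pi * 4 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
clean = remove_powerline(eeg)
```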

Features extraction from PSG data

The studies of various authors show that different parameters of PSG signals can be used for sleep stage classification. The values of the extracted parameters are individual for each epoch of the signal and characterize the stage represented by this epoch.55 According to R&K and AASM, the recommended length of an epoch is 20–30 s. These suggestions are often taken into account when creating an automatic scoring system,7,8,14,35 the time resolution of which is consequently limited by the chosen epoch length used for extracting features. The achieved resolution can, however, be insufficient for evaluating the microstructure of sleep. Shorter signal segments (2–10 s) in combination with special rules for performing a final decision similar to that of a human have been used in some studies to solve this task.5,34,36

There are many different methods which enable extraction of the necessary information from PSG signals. All sleep-stage-related features can be divided into four main groups: time, frequency, time-frequency domain, and non-linear features. The most widely used ones are described below.

Time domain features

Such features are calculated from the signal itself. Some methods which are widespread in the statistical analysis of experimental data can be applied to the PSG data. The statistical parameters mean value, median, standard deviation (SD), root mean square (RMS), kurtosis and skewness can be computed from any type of signal (EEG, EOG or EMG) and used as time domain features.7,14,45

Another widely used group of features has been proposed by Hjorth.5,9,55,56 The so called activity, mobility and complexity are calculated from the EEG based on the variance and the first and second derivatives of the signal.
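
A minimal sketch of the time-domain features just mentioned is given below. The Hjorth formulas follow the standard definitions (activity as the signal variance, mobility and complexity from the variances of the first and second differences), which is an assumption about the exact formulation since the review itself does not spell them out.

```python
import numpy as np
from scipy import stats

def time_domain_features(epoch):
    """Basic statistical descriptors of one PSG epoch."""
    return {
        "mean": np.mean(epoch),
        "median": np.median(epoch),
        "sd": np.std(epoch),
        "rms": np.sqrt(np.mean(np.asarray(epoch, dtype=float) ** 2)),
        "kurtosis": stats.kurtosis(epoch),
        "skewness": stats.skew(epoch),
    }

def hjorth_parameters(epoch):
    """Hjorth activity, mobility and complexity (standard definitions assumed)."""
    d1 = np.diff(epoch)            # first derivative (finite differences)
    d2 = np.diff(d1)               # second derivative
    var0, var1, var2 = np.var(epoch), np.var(d1), np.var(d2)
    activity = var0
    mobility = np.sqrt(var1 / var0)
    complexity = np.sqrt(var2 / var1) / mobility
    return activity, mobility, complexity
```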

Frequency and time-frequency domain features

The frequency and time-frequency domain features of signals describe their spectral character, which changes depending on the sleep stage. Computing these features requires applying a special transform to the signal sequence for transition from the time to the frequency or time-frequency domain in order to get the spectral components of the signal.

The main idea of methods for signal spectrum estimation is that every signal can be decomposed into a combination of elementary functions. Because of the stochastic character of PSG signals, the power spectrum and the values of power spectral density (PSD) are used to describe their spectral structure. There are two main categories of methods for power spectrum estimation: nonparametric, when PSD values are estimated directly from the signal, and parametric methods, when PSD values are estimated with the help of modelling of the signal origin. As examples of nonparametric approaches, the periodogram and Welch57 methods based on the well-known Fourier transform (FT) can be given. Among the parametric ones, the Yule-Walker and Burg methods are adopted for solving this task.45

The parametric approaches differ in the procedure for the determination of the hypothetical model parameters. For PSD estimation using parametric methods, some model characteristics such as model order and sampling rate should be known in advance, which is often difficult in practice. Values of these characteristics dramatically affect the estimated PSD values and consequently the resulting features used for sleep scoring. There are special criteria for choosing the model order.43 It is generally suggested that the order should be at least twice the number of spectral peaks.43

The FT based methods are easily implemented due to the algorithm for fast FT calculation. However, there are problems with smearing and leakage that may cause a loss of frequency resolution.43 Thus, nonparametric methods require long records to obtain appropriate resolution. Parametric methods avoid these problems. Correctly designed auto-regressive models (quite low model order and sampling rate) allow obtaining spectra with improved resolution, which is necessary, e.g., for analysis of peaks in the power spectra of short data recordings.43,58

In contrast to frequency domain features, time-frequency features describe the time distribution of the signal spectrum, the so called spectrogram. Hence, there is a two-dimensional space of features in this case. It is very important, since PSG signals have a non-stationary character. Therefore, it is appropriate to examine their spectral composition using special algorithms, the best known of which are the short-time Fourier transform (STFT) proposed by Gabor59 and the wavelet transform (WT).43,60,61 Time-frequency analysis by means of the STFT consists of calculating Fourier spectra from short signal segments identified with a window. The STFT decomposes the input data into a set of harmonic signals. The use of a window with stationary width causes limited time resolution. The segments are often overlapped to increase the time resolution. The WT has a non-stationary resolution that varies with the length of the analysed segments: frequency resolution is higher for low frequencies (long segments), while time resolution is higher for high frequencies (shorter segments). In this case, the input signals are decomposed by shifted and scaled wavelets corresponding to characteristic sleep-related patterns of the signals. The vector of coefficients obtained during the wavelet decomposition procedure can be used as an input for a classifier. This vector contains a lower number of values in comparison with the time course of the EEG epoch.

The comparison of the FT and WT results in higher WT efficiency for visual analysis of EEG. WT coefficients represent the time courses of several components of EEG activity (delta, theta, alpha, sigma, beta) and, consequently, allow monitoring not only their presence in EEG but also the time during which their power was significant.62 On the other hand, studying the accuracy of Bayesian and ANN classifiers with feature sets composed of FT and WT based parameters has shown that these two approaches give quite similar results.14 The powers of EEG components (i.e., squares of the absolute values of the corresponding WT coefficients) are often not used directly but are used to calculate, for instance, RMS values of several EEG frequency components in order to additionally reduce the dimensionality of the data.8,14,36 This procedure leads to the fact that the information provided by the WT and FT is in fact the same. Both approaches are usable, although WT computation is more difficult as compared with the FT.
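
As an illustration of the wavelet route described above, the sketch below derives RMS values of discrete-wavelet sub-bands from one EEG epoch using the third-party PyWavelets package; the choice of the 'db4' wavelet, the decomposition depth and the sampling rate are assumptions for the example rather than settings taken from the cited studies.

```python
import numpy as np
import pywt  # PyWavelets, a third-party package

def wavelet_band_rms(epoch, wavelet="db4", level=5):
    """RMS of each sub-band of a discrete wavelet decomposition.

    With a 100 Hz sampling rate (assumed), a 5-level decomposition yields
    detail bands roughly covering 1.6-3.1, 3.1-6.25, 6.25-12.5, 12.5-25 and
    25-50 Hz, plus a low-frequency approximation band.
    """
    coeffs = pywt.wavedec(epoch, wavelet, level=level)
    # coeffs[0] is the approximation; coeffs[1:] are details ordered from the
    # coarsest (lowest-frequency) to the finest (highest-frequency) level
    return [np.sqrt(np.mean(c ** 2)) for c in coeffs]
```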

The output of the transform (the vector of transform coefficients, i.e., PSD values) is then used to calculate parameters that are the input for an ANN-based classifier. The PSD values can obviously be given to the input neurons of the ANN without any processing.13,36 The following spectral features are very often used in automatic sleep scoring systems: total power, defined as the sum of the PSD over the total signal spectrum,8,33,35,37,63 relative power, defined as the ratio of the power in a certain spectral band to the total power,5,7,9,14,34,37,45,55,63 ratio power, defined as the ratio of the power in two different spectral bands or their combinations,8,34,35,37,63 central frequency and power at the central frequency,45,63 and spectral edge frequency.5,45 The use of these parameters instead of transform coefficients reduces the feature dimensionality, which results in a simpler architecture of the used ANN and consequently in smaller time and personal computer (PC) memory requirements.
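
A minimal sketch of these spectral features is given below, estimating the PSD with Welch's method and deriving total power, relative band powers and the spectral edge frequency. The band limits repeat those quoted earlier in this review, while the 95% edge level, the window length and the sampling rate are assumptions for the example.

```python
import numpy as np
from scipy import signal

BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 12),
         "sigma": (12, 16), "beta": (12, 35)}

def spectral_features(epoch, fs=100.0, edge_level=0.95):
    """Total power, relative band powers and spectral edge frequency."""
    freqs, psd = signal.welch(epoch, fs=fs, nperseg=int(4 * fs))
    df = freqs[1] - freqs[0]                     # frequency resolution
    total_power = np.sum(psd) * df
    features = {"total_power": total_power}
    for name, (lo, hi) in BANDS.items():
        band_power = np.sum(psd[(freqs >= lo) & (freqs < hi)]) * df
        features[f"rel_{name}"] = band_power / total_power
    # spectral edge frequency: frequency below which edge_level of the power lies
    cumulative = np.cumsum(psd) / np.sum(psd)
    features["spectral_edge"] = freqs[np.searchsorted(cumulative, edge_level)]
    return features
```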

Non-linear features

Besides the parameters described above, there are non-linear approaches based on the principles of non-linear dynamics and chaos theory. According to this concept, biological signals are the outcomes of chaotic processes. Consequently, they can be represented by chaotic parameters such as Lyapunov exponents, dimensional complexity, entropy, the Hurst exponent and fractal dimensions. Some studies show that the values of the EEG chaotic parameters are dependent on the sleep stage and can be used as its characteristic pattern.5,14,64–68 In particular, it has been shown that dimensional complexity and the Lyapunov exponent allow better discrimination between S1 and S2 in comparison with linear spectral parameters, which are more powerful for detection of S1 and the so called slow wave sleep (SWS) comprising S3 and S4.65
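
For orientation, the sketch below computes two of the simpler descriptors from this family, the Petrosian fractal dimension and the Shannon entropy of the amplitude distribution; these are illustrative stand-ins only, since the Lyapunov exponents and correlation dimension cited in the studies above require considerably more involved algorithms.

```python
import numpy as np

def petrosian_fd(epoch):
    """Petrosian fractal dimension, based on the number of sign changes
    in the first difference of the signal."""
    diff = np.diff(epoch)
    n_delta = np.sum(diff[1:] * diff[:-1] < 0)   # sign changes of the derivative
    n = len(epoch)
    return np.log10(n) / (np.log10(n) + np.log10(n / (n + 0.4 * n_delta)))

def shannon_entropy(epoch, bins=32):
    """Shannon entropy of the amplitude histogram of one epoch."""
    hist, _ = np.histogram(epoch, bins=bins)
    p = hist / np.sum(hist)
    p = p[p > 0]                                  # ignore empty bins
    return -np.sum(p * np.log2(p))
```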

Hybrid features

It has been shown that hybrid features (the combination of different types of features) allow the best separation of sleep stages.45 For example, using the combination of non-linear features (Lyapunov exponent and correlation dimension) with linear (spectral) features (spectral entropy) or with the entropy of the amplitude of the EEG epoch results in better distinction of the S1, S2, SWS, and REM stages (the total error rate for the hybrid set is about 23%, whereas for spectral and nonlinear sets each containing three different features it is 37–44%).65 Time- and frequency-domain features are often used together to increase the classification ability of the system.5,7,9,45

Selection of optimal set of features

As shown above, many different features can be derived from PSG data for sleep scoring. Each feature represents information about several sleep stages. For example, the power of the sigma band is a marker of SS occurrence, which is characteristic for shallow sleep N2 according to AASM or for the S2 stage of non-REM sleep according to R&K.5 Beta band power reflects the high-frequency activity which is typical for activated brain stages such as W and R (REM).5 EEG Shannon's entropy and Hjorth's activity seem to be good markers of deep sleep (SWS or N3).5,14 The differences between EMG and EOG activity in REM and S1 allow distinguishing them using EMG and EOG entropies, which are opposite for these two stages.14 It has also been shown that S1 and S2 can be better discriminated by means of the correlation dimension and Lyapunov exponent in comparison with spectral features; the latter are suitable for separating S2 and SWS.65

Kurtosis serves to measure the flatness/peakedness of a distribution. Thus, higher values of kurtosis reflect the presence of abrupt EOG variations which can be associated with the occurrence of rapid eye movements during the REM stage.14 The variety of possible sleep-related features raises the question of choosing which of them are optimal for solving the given problem.

The first aspect which might be examined is what kind of signals should be taken into account for extracting the features and further sleep scoring. It is known that the R&K and AASM standards are based on information contained not only in EEG but also in EMG and EOG (see above). Therefore multi-channel recording is preferred to distinguish the sleep stages accurately. This is especially important for stages N1 (S1) and R (REM), and for W and R (REM). The stages from the first couple are represented by EEGs with similar waveforms (similar frequency content); thus, analysis of EOG and EMG is essential to distinguish between them.5,14,34,55 Those from the second couple are usually distinguished based on the EMG, which, however, is often of poor quality due to problems with electrode fixation and cannot provide the full information required for correct scoring.10 Testing of ANN based systems has revealed that even if the features of EEG (Hjorth mobility and activity, relative delta and sigma powers) and EMG (spectral edge frequency at the 95% level) have been taken into consideration, W, N1 and R epochs were confused with one another (W was misclassified as N1 in 41% and as R in 13% of cases, N1 as R in 48%, and R as N1 in 15%).5 It has also been shown that the accuracy of classification of W, S2, and SWS using only spectral EEG features (relative powers) is at least 80%, whereas for classification of S1 and REM it is about 30% and 70%, respectively, and increases to 65% and 75% due to adding EMG (entropy) and EOG (entropy, kurtosis, and SD) features.14 In particular, the most hardly distinguished S1 stage becomes better recognized due to extension of the feature set by EMG and EOG entropy and EOG kurtosis; EMG entropy improves REM stage detection.14 Nevertheless, in order to reduce the amount of data which must be processed, researchers attempt to use only single-channel recording (usually EEG) for automatic classification.8,9,13,32–38 Some authors avoid the problem of the badly separable S1 and REM by combining them into one stage; the resulting agreement of recognition of this stage is within the range of 78 to 91%.9,35,37 A study of a system using FFT calculated EEG spectral features (powers in 9 different frequency regions) for classification by a self-organizing map (see below) combined with fuzzy rules has shown a low ability of such a system to separate the sleep transitions, i.e., neighbouring states (such as S1 and S2 and others).13 However, agreement for all stages according to R&K reached a value of more than 70% and equals 84.8% and 80.3% for REM and S1, respectively.13 Improved agreement for REM was achieved (89%) when 6 spectral EEG features (total power and RMS of five spectral bands) were used as the input of an ANN based REM classifier.33 Using 13 different WT features computed from 30-s EEG epochs in combination with an ANN results in 65%, 74% and 75% agreement for REM, W and S1, respectively.8 Computing WT features from shorter EEG epochs (2 s) leads to better accuracy: SS (96.84%), REM (93.68%) and W (95.52%).36 One of the promising approaches is based on the interpretation of single-channel scoring as a non-linear problem which requires the system to be adaptive; thus, a set of optimized ANNs leads to reasonable agreement for REM (82–97%).34 Generally, it can be concluded that the accuracies of single- and multi-channel systems are approximately the same and both can be improved by using short-time segments for feature extraction and further classification by an ANN combined with a set of rules.

Practice points

1) Analysis of EEG spectral features is sufficient to recognize W, N2 (S2) and N3 (SWS) accurately; thus, single-channel systems, which are more comfortable for a person, can be effectively used when there is no need to distinguish other stages correctly.
2) Including EMG and EOG recordings in the analysis significantly improves the accuracy of distinction between the N1 (S1) and R (REM) stages.
3) Analysis of features prior to scoring enables estimating the classifier accuracy which could be achieved, and making a decision about their suitability for solving the given problem and the possible addition of other features for improving system properties.

Research agenda

1) Studying the possibilities of using sophisticated methods for artifact removal in single-channel scoring systems.
2) Determination of new sleep-related parameters (features) derived from PSG data, or combinations of current ones, in order to overcome difficulties related to the worse distinguishability of some sleep stages (mainly N1 (S1) and R (REM)) for both multi-channel and single-channel applications; other 'non-traditional' signals can be taken into account.
3) Evaluation of these parameters in healthy persons and in patients of various ages.

The second question is what combination of features could provide a full description of the sleep stages without redundancy. It can obviously be solved by using the trial and error method, when all possible features and their combinations are tested and the most suitable of them are selected for further classification. Another way, one of the easiest, to select the appropriate features is performing a statistical test (for example, a t-test or the more complex analysis of variance, ANOVA) to find out which of the parameters differ significantly in different sleep stages.36,55,65 For this purpose, a genetic algorithm,34 principal component analysis,45 and sequential forward5,14 and sequential backward selection algorithms14 can also be applied as a pre-processing step. With the help of these methods, an optimal set with a minimal number of features can be obtained, which avoids the problem of overfitting (see below) and lack of PC memory.
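
As one possible realization of the statistical-test route, the sketch below ranks features by a one-way ANOVA F-test across sleep-stage labels using scikit-learn; the number of retained features is an arbitrary choice for the example, and X/y stand for a previously computed feature matrix and the corresponding expert labels (synthetic placeholders are used here so the sketch runs on its own).

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# X: (n_epochs, n_features) feature matrix, y: expert sleep-stage labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = rng.integers(0, 5, size=300)       # e.g., W, S1, S2, SWS, REM coded 0-4

selector = SelectKBest(score_func=f_classif, k=8)   # keep the 8 "best" features
X_reduced = selector.fit_transform(X, y)

# F-statistics per feature: larger values indicate stronger stage dependence
ranking = np.argsort(selector.scores_)[::-1]
print("features ranked by ANOVA F-score:", ranking)
```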

The ANNs and selection of their parameters

Information extracted from PSG signals and presented as a feature space serves for the automatic classification of sleep stages. ANNs are often chosen by various authors as a tool for sleep scoring because of their relatively easy implementation and high effectiveness in solving classification problems. There are several tasks that must be solved when creating an ANN classifier. This is accompanied by the selection of parameters which affect the performance of the classifier. The main ones are the architecture, neuron transfer function, training algorithm, method of error calculation, and parameters determining the duration of ANN training. Generally, the ANN type must be determined taking into account the kind and complexity of the solved problem and the facilities of the hardware and/or software whereby the ANN classifier is implemented. Full information about the principles of the ANN technique, its properties and possibilities, and the creation of ANN systems can be found in the literature.15,16,69

The ANN architecture

ANNs consist of elementary units (neurons). Each of them processes the data arriving at its input, like the biological neuron that has been the inspiration for the designers of the ANN technique. The neuron with vector input x composed of n elements (x1, x2, ..., xn) is shown in Fig. 1a.

The neuron multiplies the input values by weights (w1,1, w1,2, ..., w1,n) and sums the weighted values up, i.e., computes the dot product of the vectors x and w. Then it sums the product with a bias b and transforms the obtained value y* (y* = w1,1x1 + w1,2x2 + ... + w1,nxn + b) using a transfer function f that produces the output of the neuron y (y = f(wx + b)). From a geometrical point of view, the bias b determines the shift of the transfer function along the abscissa and can be omitted. The weights and biases of the neurons are adjusted during training (see below) in such a way that the ANN exhibits the desired output.
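
The computation just described can be written in a few lines; the sketch below is a plain numpy illustration of y = f(wx + b) for a single neuron and for one layer of m neurons, with a logistic transfer function chosen only as an example.

```python
import numpy as np

def logistic(z):
    """Log-sigmoid transfer function (one of the options shown in Fig. 2)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b, f=logistic):
    """Single neuron: weighted sum of the inputs plus bias, passed through f."""
    return f(np.dot(w, x) + b)

def layer_output(x, W, b, f=logistic):
    """One layer with m neurons: W has shape (m, n), b has shape (m,)."""
    return f(W @ x + b)

# Example: 3 inputs, a single neuron and a layer of 2 neurons
x = np.array([0.2, -0.5, 1.0])
print(neuron_output(x, w=np.array([0.4, 0.1, -0.3]), b=0.05))
print(layer_output(x, W=np.array([[0.4, 0.1, -0.3], [0.2, 0.2, 0.2]]),
                   b=np.array([0.05, -0.1])))
```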

Fig. 1. Schematic representation. (a) Single neuron with vector input. (b) One-layer network with m neurons.

Fig. 2. Transfer functions. (a) Log-sigmoid. (b) Tan-sigmoid. (c) Hard limit. (d) Linear.


The neurons of the same level form a so called layer. The numbers of neurons in the input and output layers are defined by the number of features in the input vector and the number of classes into which the input data must be scored, respectively. A one-layer network with n elements in the input and m neurons is shown in Fig. 1b. In this case, each input value is connected with each neuron. The matrix of weights w and the vector of biases b are adjusted on the same principles as in the previous example, which results in the vector output y of the ANN.

Additional layers are often placed between the input and output layers to increase ANN performance. These hidden layers generally have no contact with the environment and have access only to information entered from the layers with which they are connected. In practice, ANNs with one or more hidden layers, multilayer ANNs (MLNN), are widespread. The MLNN is created by replicating the single layer (Fig. 1b) as many times as needed to obtain the desired number of hidden layers. It is important to determine the number of hidden layers and the number of neurons in them. These parameters of the ANN have a dramatic effect on its classification ability. At the same time, they determine the time and memory required for training and using the ANN: complex ANNs (with a large number of hidden layers and/or neurons in them) require high processing time and PC memory capacity. There are no standard guidelines to determine the number of hidden layers and neurons. Most authors find them by the trial and error method and based on their experience.16

The neurons of an ANN can be interconnected in two different ways. In the first one, there are no feedback connections between neurons and data flow strictly from the input to the output neurons through the hidden units, the so called feedforward ANNs. The simplest type of feedforward network is the perceptron.70 A single perceptron provides binary classification (classification into two classes). In other words, it maps an input vector to an output value equal to 0 or 1. The main disadvantage of perceptrons is that they can be used only for linear classification problems, that is, in geometrical terms, if the patterns of the feature space can be separated by a single line. Multilayer perceptrons (MLP) are applied to solving non-dichotomous classification and are widely used in sleep research.5,7,9,35,44,63,71 Classical MLNNs (non-perceptron type) are not limited in the output value due to using other types of transfer function (see below). Therefore they are the most suitable tools for a number of applications.

The second kind of neuron arrangement is with feedback connections. This means that there are connections between the outputs of some neurons and the inputs of neurons in the same layer or previous layers. In this case the ANN is called recurrent. Elman and Hopfield ANNs are the most frequently used recurrent ANNs. The Elman ANNs72 usually have one hidden layer with a feedback connection between the outputs and inputs of the hidden units, and the backpropagation algorithm of training (see below). The Hopfield ANNs73 are used as associative memory and allow storing stable target vectors which can be recalled by the network in the case of similar vectors.

Neuron transfer functions

The neurons generate their output using a special function, the transfer function (or activation function). For the first hidden layer of the ANN, the neurons 'apply' the transfer function to the sum of the weighted inputs and the bias. The neurons of the other layers transform the outputs of the previous layer. The most popular transfer functions are shown in Fig. 2. The type of transfer function (especially for the neurons of the output layer) must be selected depending on the desired output. If the preferable values of the output are in the range between 0 and 1, it is appropriate to use the log-sigmoid (or logistic) function (Fig. 2a). The tan-sigmoid (or hyperbolic tangent) transfer function is the alternative to the log-sigmoid and generates output values between -1 and +1 (Fig. 2b). The output of the hard-limit (step) function is limited to 0 if the input of the function is less than 0, and 1 if it is greater than or equal to 0 (Fig. 2c). This type of transfer function is used in perceptrons. The linear transfer function (Fig. 2d) allows taking any output values.
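
The four transfer functions of Fig. 2 can be expressed directly; the sketch below simply restates their standard definitions in numpy.

```python
import numpy as np

def log_sigmoid(z):
    """Logistic function: output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tan_sigmoid(z):
    """Hyperbolic tangent: output in (-1, 1)."""
    return np.tanh(z)

def hard_limit(z):
    """Step function used in perceptrons: 0 for z < 0, 1 otherwise."""
    return np.where(z < 0, 0.0, 1.0)

def linear(z):
    """Identity: output unrestricted."""
    return z
```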

Training of the ANN

Before practical use, the ANN must be trained to respond adequately to input data. The main purpose of ANN training is the achievement of the smallest possible classification error, which is defined as the difference between the desired and real outputs of the ANN. Two methods are commonly used for calculating the performance function describing the ANN error. The first one evaluates the performance of the ANN by calculating the mean of squared errors (MSE), the second one by calculating the sum of squared errors (SSE). These approaches are not different in principle and can both be successfully used for classification.

The training process starts with the initialization of the biases and weights of the ANN. The biases take the initial value 1 and the weights are generally initialized randomly.


Two basic kinds of training (learning) approaches can beimplemented: supervised and unsupervised training.

If it is possible to have some part of the data analysed by an expert, or if there is enough prior information about the input data, then supervised algorithms can be used for ANN training. In this case, there is a 'teacher' that introduces the set of beforehand analyzed features (input vector) to the input neurons and the results of the expert classification (target vector) as the output. By means of special mathematical operations, the ANN correlates input with output, corrects the ANN parameters (weights and biases) and calculates the classification error; if certain conditions are met (see below), the training process stops and the ANN is ready to be used for scoring new data. The ANN iteratively corrects the weights and biases of the neurons to minimize the performance function. Thus, the ANN solves an optimization problem. The target vectors can be filled with values representing each of the resulting classes. The binary form of the target vector, in which the numbers take the values 1 or 0, is very popular. The output with the value nearest to 1 then determines the resulting class. It is important to remember that the ANN classification ability depends in the first place on the chosen training set: only if the used training set fully describes the solved task (i.e., contains data representing all the classes) can it be expected that the trained ANN will be able to successfully classify new data. Supervised training with the backpropagation (BP) algorithm74 is the most often employed in sleep classifiers based on feedforward MLNNs and MLPs. The BP algorithm calculates the gradient of the performance function to perform the optimization. The computation of the new corrected weights is realized in the direction of the negative gradient of the performance function, backward through the ANN (therefore, the algorithm is called 'backpropagation'). The step of the weight correction depends on a learning rate (a constant parameter multiplied by the negative gradient for computing the step): the smaller the learning rate, the smaller the step and consequently the slower the convergence of the gradient to its minimum value. The described approach, gradient descent, is too slow to solve complex classification problems and can be improved by adding momentum, which increases the speed of training.69 There are faster techniques for ANN training, and the Levenberg–Marquardt algorithm (LM) is the most commonly used among them. This approach is based on numerical optimization processes and allows changing the correction step depending on the actual error (the error computed in the current iteration) in such a way that the performance function (i.e., the error) is reduced during each iteration. This leads to an extremely high speed of optimization and, as a result, to a significant shortening of the duration of ANN training. More detailed information and the mathematical background of this and other methods of optimization may be found in the monograph of Kelley.75
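
For illustration, a supervised feedforward network of the kind discussed here can be trained in a few lines with scikit-learn's MLPClassifier. This is only a generic sketch: the hidden-layer size, solver and feature matrix are placeholders, scikit-learn's gradient-based solvers stand in for the classical BP/LM variants named in the text, and the example does not reproduce any of the reviewed systems.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Placeholder data: X holds per-epoch feature vectors, y the expert labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
y = rng.integers(0, 5, size=500)        # five stages, e.g., W, S1, S2, SWS, REM

mlp = MLPClassifier(hidden_layer_sizes=(10,),   # one hidden layer of 10 neurons
                    activation="logistic",      # log-sigmoid transfer function
                    solver="adam",              # gradient-based optimizer
                    learning_rate_init=0.01,
                    max_iter=500,
                    random_state=0)
mlp.fit(X, y)                                   # supervised training
hypnogram = mlp.predict(X)                      # stage label for each epoch
```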

During unsupervised training (self-organization), the ANN discovers significant patterns in the input data and separates them into classes. These algorithms do not require external intervention or the presence of prior information about the input–output relation (i.e., a target vector). The frequently reported unsupervised ANN is the Kohonen self-organizing map (SOM),76 which belongs to the competitive learning ANNs. The Kohonen network generally produces a two-dimensional distribution of the input data using a special neighborhood function, selecting the winning neuron by comparing the weighted outputs of the neurons. Hence, the output neurons compete among themselves for the privilege of becoming the neuron whose output will be the response of the network. Each characteristic group of input data is then represented as a separate region of the output map. Elman-type SOMs can also be used for sleep data analysis.38
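
A from-scratch, heavily simplified illustration of the competitive-learning idea behind the SOM is given below: each training step finds the best-matching unit on a small 2-D grid and pulls it and its neighbours towards the input vector. The grid size, learning rate and neighbourhood radius are arbitrary example values, and a practical application would rather rely on an established SOM implementation.

```python
import numpy as np

def train_som(data, grid=(6, 6), epochs=20, lr=0.5, radius=2.0, seed=0):
    """Very small self-organizing map trained by competitive learning."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    n_features = data.shape[1]
    weights = rng.normal(size=(rows, cols, n_features))
    # grid coordinates of each map unit, used for the neighbourhood function
    coords = np.array([[i, j] for i in range(rows) for j in range(cols)])
    coords = coords.reshape(rows, cols, 2)

    for _ in range(epochs):
        for x in rng.permutation(data):
            # best-matching unit: smallest Euclidean distance to the input
            dist = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)
            # Gaussian neighbourhood around the winner on the grid
            grid_dist = np.linalg.norm(coords - coords[bmu], axis=2)
            h = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))
            # move the winner and its neighbours towards the input vector
            weights += lr * h[..., None] * (x - weights)
    return weights

# Example: map 200 random 5-dimensional feature vectors onto a 6x6 grid
features = np.random.default_rng(1).normal(size=(200, 5))
som_weights = train_som(features)
```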

The conditions of training stopping

The training process for supervised training is generally stopped when one of the following conditions is fulfilled: the maximum number of epochs (iterations), the maximum training duration, the performance goal or the minimum performance gradient is reached. For unsupervised learning, the maximum number of epochs is used. These parameters are defined before training depending on the complexity of the solved problem and the desired performance of classification. This is a very important phase of ANN model creation. An overfitted supervised ANN cannot correctly generalize to new input data despite the fact that the training error reaches a very small value. This results in a very large classification error. Overfitting is typical for large ANNs. There are improved methods allowing prevention of this problem: early stopping and regularization, used in machine learning.77–79 In the first case, the training set is divided into a new training set and a validation set. At first, the learning algorithm is applied to the new training set, then to the validation set, and when the performance for the validation set stops improving (i.e., the error does not decrease or starts to increase), the process stops. The ANN with the best performance is then chosen for further use. The second approach modifies the performance function by adding a value calculated from the weights and biases of the ANN. This results in smaller weights and biases, a 'smoother' response of the ANN and its lower susceptibility to overfitting.
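
Early stopping as described above is available off the shelf; for example, scikit-learn's MLPClassifier can hold out a validation fraction and stop when the validation score no longer improves, while the alpha parameter adds an L2 penalty that plays the role of the regularization term. The concrete values below are placeholders.

```python
from sklearn.neural_network import MLPClassifier

# Early stopping: 10% of the training data is set aside as a validation set
# and training stops after 10 iterations without validation improvement.
mlp = MLPClassifier(hidden_layer_sizes=(10,),
                    early_stopping=True,
                    validation_fraction=0.1,
                    n_iter_no_change=10,
                    alpha=1e-3,          # L2 regularization strength
                    max_iter=1000,
                    random_state=0)
# mlp.fit(X_train, y_train)   # X_train, y_train as in the previous sketch
```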

Estimation of classifier performance

As described above, many types of ANN models with different architectures and learning parameters can be used for sleep scoring. It is difficult to compare the performance of these models correctly. It must also be determined which features better describe the PSG signals and allow more efficient sleep stage classification. For this reason, some special methods are used.

Cross-validation

The cross-validation (CV) method has been proposed to solve the task of correct performance estimation. There are three commonly used CV algorithms80:

• repeated random sub-sampling CV: during each iteration, the input data (feature vectors) are randomly divided into training and validation sets; the model is trained and tested on the training and validation set, respectively. After a few iterations, the final performance of the model is calculated by averaging the performances of each CV cycle. This approach has one serious disadvantage: some data may never be chosen to test the model, whereas other data may be used for testing several times.

• k-fold CV: all the data are divided into k sets, k-1 of them are used for training and one set for testing. The CV algorithm is performed k times. Each of the k sets is used for validation only once. Finally, the k performances are averaged. In contrast to the previous approach, there is no problem with overlapping validation sets. 10-fold CV is often used (see the sketch after this list).

• leave-one-out CV: only one feature vector is used for validation and the remaining vectors are used for training. The process repeats until each vector has been used for validation exactly once. This approach is time consuming but it is effective in the case of a small amount of data.
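
The k-fold scheme from the list above can be run with scikit-learn as follows; the 10 folds, the classifier and the synthetic data are example choices only.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 10))          # per-epoch feature vectors (placeholder)
y = rng.integers(0, 5, size=400)        # expert stage labels (placeholder)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(MLPClassifier(hidden_layer_sizes=(10,), max_iter=300,
                                        random_state=0),
                         X, y, cv=cv)
print(f"10-fold accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```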

Statistical approaches

The results of classification are usually represented as the percentage agreement between a pair of observations, that is, between two experts (for visual analysis) or between the results obtained by an expert and by the machine classifier (for visual and automatic machine analysis). In the second case, the visual analysis is generally taken as the 'gold standard'. Results obtained by the human expert are compared with those of the automatic system.36,44


The percentage agreement representation has one serious disadvantage: it does not take into account the agreement which can be reached in the case that the two observations are not related (in other words, expected by chance). This problem can be solved by using Cohen's kappa,81 which has a value of 1 for perfect agreement and 0 if there is no agreement at all between the observations. In practice, Cohen's kappa is calculated using observation frequencies found from the concordance matrix containing the intra- and inter-class agreements.5,81 Results of such analysis are not easily interpretable; there are only guides on how they might be assessed.82

The described approach may be misleading in the case of ordered classes, when the agreement between some pairs of observations is larger than that of other ones. The so called weighted Cohen's kappa83 should be used in such cases. The weight is usually 1 for perfect agreement and near zero for low agreement. If the weights are defined for the degree of disagreement, their values are inverse.

In some cases, there is a need to measure the agreement between more than two observations. This requires using special coefficients such as Fleiss' kappa, the improved weighted Cohen's kappa and others.84
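
Cohen's kappa and its weighted variant are available in scikit-learn; in the sketch below the two label sequences stand for, e.g., an expert hypnogram and an automatic one, and the linear weighting is just one of the options.

```python
from sklearn.metrics import cohen_kappa_score

expert = [0, 1, 2, 2, 3, 4, 2, 1, 0, 2]     # e.g., W, S1, S2, SWS, REM coded 0-4
machine = [0, 1, 2, 3, 3, 4, 2, 2, 0, 2]

kappa = cohen_kappa_score(expert, machine)
# weighted kappa penalizes disagreements between "distant" ordered classes more
kappa_weighted = cohen_kappa_score(expert, machine, weights="linear")
print(kappa, kappa_weighted)
```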

Practice points

1) Input (feature vectors) and output (target vectors) of the ANN should be carefully prepared. The features representing each desired output class must be present in the input data in sufficient quantity.
2) The architecture of the ANN should be chosen taking into account the complexity of the solved problem and the available time, hardware, and software resources.
3) Supervised ANNs may be a powerful tool for automatic sleep scoring on condition that suitable data (fully and correctly describing the solved task) are used as input and output. For unsupervised methods, it is possible that the results of scoring will not be easily interpreted.
4) Use of fast learning algorithms leads to a significant decrease of time and memory requirements.
5) Special algorithms should be applied to prevent ANN overfitting. It is also necessary to take into account that large networks are prone to overfitting.
6) The performance of ANN models should be calculated using a special technique which allows comparing different ANNs and evaluating the influence of the used features on the sleep scoring results.

Research agenda

1) Creation and evaluation of scoring systems with improved properties, which can be achieved by combining ANNs with some non-ANN techniques (e.g., a fuzzy inference system or a set of context rules) permitting smooth reasoning.
2) Adaptation of current systems and creation of new ones for automatic sleep scoring according to the new AASM standard; validation of these systems on a large amount of patient/non-patient data.

The ANN based classifiers

Sleep stage ANN classifiers published in the last two decades are summarized in Table 1. The architecture of the ANNs is presented in the form *-**-*** (for example, 17-10-6 for Schaltenbrand's system), where *, ** and *** are the numbers of input elements, hidden neurons and output neurons, respectively. These systems differ from one another in some characteristics, which allows analysing their influence on the classification results.

Each system can be characterized by the distinguishable sleep stages. There are classifiers recognizing only some sleep stages (for instance REM,33 or W, SS and REM36) and those recognizing all stages.5,13,34,63

Some systems use combined stages (S1/REM,9,35,37,71 S3/S4 (SWS),14,35,37,71 and W/MT7), which allows avoiding the problem of poor distinguishability of the corresponding stages.

Systems shown in Table 1 also differ in the type of signals and extracted features used for classification. As referred to above, some authors take into account EEG only.8,9,13,32–35,37,38 More complex approaches also use EOG and EMG. There are classifiers with hybrid feature sets.

Some of the presented systems have the same set of sleep stages and a similar ANN (number of hidden neurons and layers and learning algorithm) but different feature sets.35,37,71 This is useful for comparing how feature selection can affect the classification.

The presented ANN based classifiers are also characterized by the properties of the used ANN. As may be seen from the table, the supervised BP MLP and BP MLNN are the most popular and, based on the agreement between automatic and visual scoring results, may be successfully used for sleep scoring.

Commercially available scoring systems

Some software systems for automatic sleep scoring are available today. Some of them use fuzzy logic (e.g., ASEEGA, Physip, France, with spectral features from single-channel sleep EEG; the percentage agreements and Cohen's kappa for scoring wake/sleep, wake/REM/non-REM, wake/REM/stage1-stage2/SWS and wake/REM/stage1/stage2/SWS are 96%, 92.1%, 84.9%, 82.9% and 0.82, 0.81, 0.75, 0.72, respectively85). Other examples are the expert system Somnolyzer 24×7, by Siesta Group, Austria, with agreement 80% and Cohen's kappa 0.72 for analysis of central EEG, two EOG and one chin EMG channels for MT, wake, stage1, stage2, SWS and REM,6 and the Svarog system, by Warsaw University, Poland, using matching pursuit EEG features and hierarchical rule-based decisions86 with total agreement 73% for 7 stages. ANN based sleep scoring systems are also available. The SASCIA (Sleep Analysis System to Challenge Innovative Artificial Networks) system developed by Baumgart-Schmitt et al.34 allows sleep scoring based on analysis of 31 features using a method combining ANN, genetic algorithm and context rules. Systems by Roberts and Tarassenko,32 Grözinger et al.33 and Schaltenbrand et al.63 have been used in clinical practice. ANNs and a neuro-fuzzy method are taken as the basis of ARTISANA, a part of the SOMNOlab software, which is implemented in the ambulatory polysomnography system SOMNOchek2 R&K (Weinmann, Germany).87

All these systems contain decision algorithms which are based on the R&K scoring rules. In some systems, AASM versions are also available (Somnolyzer 24×7 with agreement 81% and Cohen's kappa 0.75 for semi-automated sleep scoring,88 SOMNOlab software with scoring according to AASM when utilizing the SOMNOlab2 polygraphic system). Some modifications of the scoring system (using the example of Somnolyzer 24×7) were performed to allow scoring according to the new standard: 1) including occipital and central leads in feature extraction for detection of alpha activity, 2) including frontal and central leads in feature extraction for detection of slow wave activity, 3) modification of the arousals detector, 4) transformation of the smoothing rules corresponding to detection of the N2 end, MT and to the '3-Min Rule', 5) adding a rule merging S3 and S4 into the single stage N3.88 This system was tested on adults' signals only. However, studies have shown that physiological singularities of children should be taken into account when creating new scoring systems.41

At present, no ANN-based systems scoring the sleep stages according to the AASM are available.

Table 1. Summary of artificial neural network (ANN) based systems for sleep scoring. BP: backpropagation, EEG: electroencephalogram, EMG: electromyogram, EOG: electrooculogram (LEOG, REOG: left, right EOG, respectively), FC: fully connected, FT: Fourier transform, MLNN: multilayer neural network, MLP: multilayer perceptron, MT: movement time, RatP: ratio power, REM: rapid eye movement, RMS: root mean square, RP: relative power, RUM, LM: Rumelhart (gradient descent without momentum) and Levenberg–Marquardt learning algorithm, respectively, S1, S2, S3 and S4: see section "Polygraphic data and visual sleep scoring" for definitions, SD: standard deviation, SOM: self-organizing map, SWS: slow wave sleep, TP: total power, W: wakefulness, WT: wavelet transform. Description of sleep stages is according to R&K and AASM.

Authors | Sleep stages | Extracted features | ANN model | Agreement

Principe and Tome 1989 [71] | W, S1/REM, S2, S3/S4 | 24 features (6 in 4 2-bit levels): 3 EEG channels: alpha, beta, sigma, delta, REM; EMG: level | FC single perceptron, FC MLP: 24-3-4, 24-5-3-4, RUM | 78.8–96.9%

Roberts and Tarassenko 1992 [32] | W, REM, S1, S2, S3, S4 | 10 C4–A1 EEG features: 10 first coefficients of Kalman filter | Kohonen map | –

Schaltenbrand et al. 1993 [63] | MT, W, REM, S1, S2, S3, S4 | 11 recordings, 17 features: EEG: RP delta, theta, alpha, beta1 (13–22 Hz), beta2 (22–35 Hz), TP (0–35 Hz), RatP delta/theta and alpha/theta, central frequency, dispersion of power; EOG: RP (0–4 Hz), TP, central frequency, dispersion of power; EMG: TP, mean frequency, dispersion | FC MLP: 17-10-6, RUM | 80.6%

Grözinger et al. 1995 [33] | REM | 13 recordings, 6 EEG features: RMS delta, theta, alpha, beta1 (15–35 Hz), beta2 (35–45 Hz) and TP (0.5–45 Hz) | FC BP MLNN: 6-4-1, RUM | 89%

Baumgart-Schmitt et al. 1997 [34] | MT, W, REM, S1, S2, S3, S4 | 16 recordings, 31 EEG features: TP (1–64 Hz), max and 2nd max power (0–63 Hz), power in the near region of max and 2nd max power (0–63 Hz), frequency at max power (<4, 5–7, 3–8, 7–11, 8–12, 12–14, 14–20, 21–30, 40–60 Hz), RP (<4, 1–2, 5–7, 4–6, 7–9, 7–11, 8–13, 12–14, 14–20, 15–30 Hz), frequency at 25%, 50% and 75% of TP, maxTP/meanTP, SD of SS power, overall power related to 1 s interval of delta power | 10 ANNs with genetic algorithm and context rules, RUM with momentum | Total 70–80% (S1: 40.8–45%, S2: 75–77.5%, S3: 50.2–68%, S4: 57–79.5%, REM: 82.8–97%, W: 23–64.5%, MT: 62.3–74.5%)

Pacheco and Vaz 1998 [9] | W, S1/REM, S2, S3, S4 | 7 EEG features: delta, alpha, sigma, K complex activity, combined Hjorth activity and mobility, RP theta and alpha | MLP: 7-10-4, combined with context rules | 90% (for MLP only: 57%)

Oropesa et al. 1999 [8] | W, REM, S1, S2, S3, S4 | 2 recordings, 13 C3 EEG features: mean quadratic values of WT coefficients on the bands K-complexes + delta (0.4–1.55 Hz), delta (1.55–3.2 Hz), theta, alpha, SS (11.0–15.6 Hz), beta1 (15.6–22.0 Hz), beta2 (22.0–37.5 Hz), TP, RatP delta (0.4–3.2)/TP, alpha/TP, K-complexes + spindles/TP, alpha/theta, delta (0.4–3.2)/theta | FC BP MLNN: 13-10-6, LM | 77.6%

Becq et al. 2005 [7] | W/MT, REM, S1, S2, S3, S4 | 11 recordings, 8 features: C3–A2 EEG: SD, RP delta, theta, alpha, sigma, beta, gamma; EMG SD | BP MLP: 8-6-6 | 71%

Tian and Liu 2005 [13] | MT, W, REM, S1, S2, S3, S4 | 9 EEG features: power at 0.5–2 Hz, 2–4 Hz, 4–6 Hz, 6–8 Hz, 8–10 Hz, 10–12 Hz, 12–14 Hz, 14–18 Hz, 18–30 Hz | SOM of size [10 × 10] combined with 26 fuzzy rules | 85.3%

Zoubek et al. 2007 [14] | W, S1, S2, SWS, REM | 47 recordings, the set of 10 features with best classification results: EEG: RP delta, theta, alpha, sigma, beta (FT coefficients), 75th percentile; EMG: entropy; EOG: entropy, kurtosis and SD | BP MLNN: 5-6-5 and 10-6-5 | 71% (EEG only), 80% (EEG, EOG and EMG; W: 84.57%, S1: 64.56%, S2: 85.55%, SWS: 92.90%, REM: 72.81%)

Ebrahimi et al. 2008 [35] | W, S1/REM, S2, S3/S4 | 7 recordings, 12 EEG features: delta, theta, alpha, beta1 (15.63–21.88 Hz), beta2 (21.8–37.50 Hz) power, mean quadratic values, TP, RatP (alpha/(delta + theta), delta/(alpha + theta), theta/(alpha + delta)), mean of the absolute values, SD | BP MLP: 12-8-4, RUM with momentum | 93.0%

Sinha 2008 [36] | W, SS, REM | 5 recordings, 64 EEG features: 64 WT coefficients | BP MLNN: 64-14-3, RUM with momentum, combined with context rules | 95.35%

Chapotot and Becq 2010 [5] | W, transitional sleep N1, shallow sleep N2, deep sleep N3, REM, MT | 48 recordings, 16 features: C4–A1 EEG: 2 Shannon entropy, Hjorth activity, mobility and complexity, Hurst exponent, spectral edge frequency 95%, RP: delta, theta, alpha, sigma, beta and gamma; EMG: Shannon entropy, spectral edge frequency 95%, gamma RP | BP MLP: 16-20-6, LM | W: 34%, N1: 43%, N2: 51%, N3: 82%, REM: 82%, MT: 13%

Liu et al. 2010 [37] | W, S1/REM, S2, SWS | 10 features: EEG: Hilbert transform: TP, RatP alpha/theta and delta/theta, RP (delta + K-complexes (0.5–2 Hz), delta (2–4 Hz), theta, alpha, spindle (12–16 Hz), beta1 (16–22 Hz), beta2 (20–35 Hz)) | BP MLNN: 10-8-4, RUM | W: 95.2%, S1/REM: 87.1%, S2: 82.0%, SWS: 92.9%


Shimada et al. 2010 [38] | W, S1, S2, S3, S4 | 5 recordings, 16 EEG features: delta0 (0–1.17 Hz), delta1 (1.17–2.34 Hz), delta2–3 (2.34–4.30 Hz), theta1 (4.30–6.25 Hz), theta2 (6.25–8.20 Hz), alpha s (8.20–9.38 Hz), alpha m (9.38–11.33 Hz), alpha f = omega1 (11.33–13.28 Hz), omega2 (13.28–15.23 Hz), beta11 (15.23–17.19 Hz), beta12 (17.19–19.14 Hz), beta2 (19.14–23.05 Hz), beta3 (23.05–26.95 Hz), gamma1 (26.95–39.45 Hz), gamma2 (39.45–60.16 Hz), gamma3 (60.16–100.0 Hz) | Elman-type feedback SOM | 72.20%

Tagluk et al. 2010 [44] | REM, S1, S2, S3, S4 | 21 recordings, 5 features: EEG: 5-s segments of 0.3–50 Hz, EMG: 40–4000 Hz, LEOG and REOG: 0.5–100 Hz | BP MLP: 4-10-10-5, RUM with momentum | 74.7%


Example of sleep scoring using one EEG channel recording and the BP MLNN

Polygraphic data

In this work, the polygraphic data from the Sleep-EDF Database89 are used. This dataset contains 8 polygraphic recordings saved in the European data format (EDF).90 The subjects are healthy males and females (21–35 years) without any medication. The data are divided into two sets (each of four subjects' signals); each contains, in addition to other signals, EEGs measured from two channels (Fpz-Cz and Pz-Oz, sampled at 100 Hz) and hypnograms (sampled at 1 Hz). The record duration is about 24 h and 8 h for these two sets, respectively.

The hypnograms are manually scored based on 30-s epochs of the Fpz-Cz and Pz-Oz EEG according to R&K.91 Complete information on the subjects, datasets and hypnogram scoring can be found in the literature.92,93

Eight Pz-Oz EEGs are used for sleep scoring in the present study. All epochs except MT and unscored epochs are selected from the EEG records for analysis. Moreover, some W epochs of the 24-h EEGs are excluded from the study in order to obtain more balanced proportions of epochs representing the different sleep stages.
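For orientation, a minimal sketch of loading one Sleep-EDF recording and cutting the single Pz-Oz channel into 30-s epochs is given below; the file name and the channel label "EEG Pz-Oz" are assumptions about the public dataset, and the sketch is not the authors' implementation.

```python
# Sketch: read one Sleep-EDF recording and segment the Pz-Oz EEG into
# 30-s scoring epochs (file name and channel label are placeholders).
import numpy as np
import mne

raw = mne.io.read_raw_edf("SC4001E0-PSG.edf", preload=True, verbose=False)
fs = int(raw.info["sfreq"])                     # 100 Hz in Sleep-EDF
eeg = raw.get_data(picks=["EEG Pz-Oz"])[0]      # single-channel signal

epoch_len = 30 * fs                             # 30-s epochs (3000 samples)
n_epochs = len(eeg) // epoch_len
epochs = eeg[: n_epochs * epoch_len].reshape(n_epochs, epoch_len)
print(epochs.shape)
```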

The EEG features extraction

Two different sets of features are extracted from the one-lead EEG signals: relative power values for four spectral bands, delta (0.5 to <4 Hz), theta (4 to <8 Hz), alpha (8 to <13 Hz) and beta (13–30 Hz), and relative power values for 30 spectral bands (1-Hz bands from 0.5 Hz to 30 Hz). Thus, each epoch is described by means of some of these parameters, which are used to form the input vectors for the ANN classifiers. The PSDs are estimated from the EEGs using the Welch method. No pre-processing has been applied to the signals before the PSD computation. The process of extracting the 4-element input (feature) vector is shown in Fig. 3, where δ, θ, α and β denote the PSD in the corresponding spectral bands and δrel, θrel, αrel and βrel the relative powers of these bands.

[Fig. 3. Extraction of the 4-element feature vector from an EEG epoch. PSD: power spectral density; δ, θ, α, β: delta, theta, alpha and beta bands; δrel, θrel, αrel, βrel: relative power values for the delta, theta, alpha and beta bands, respectively.]
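A minimal sketch of this feature extraction is shown below; the Welch segment length is an assumption, not a parameter reported in the text.

```python
# Sketch: relative delta/theta/alpha/beta power of one 30-s EEG epoch,
# PSD estimated by the Welch method (segment length is assumed).
import numpy as np
from scipy.signal import welch

def relative_band_powers(epoch, fs=100):
    f, psd = welch(epoch, fs=fs, nperseg=2 * fs)        # 2-s Welch segments
    bands = {"delta": (0.5, 4), "theta": (4, 8),
             "alpha": (8, 13), "beta": (13, 30)}
    total_mask = (f >= 0.5) & (f <= 30)
    total = np.trapz(psd[total_mask], f[total_mask])     # total power 0.5-30 Hz
    feats = []
    for lo, hi in bands.values():
        mask = (f >= lo) & (f < hi)
        feats.append(np.trapz(psd[mask], f[mask]) / total)  # relative power
    return np.array(feats)                               # [delta, theta, alpha, beta]
```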

The BP MLNN models

Five BP MLNN models are proposed to study the influence of the used spectral features and the ANN architecture on the sleep scoring performance. The LM learning algorithm, log-sigmoid hidden and linear output transfer functions, and the MSE performance function are chosen as parameters of the ANNs. Early stopping is used to prevent overfitting of the ANNs. The number of training epochs (iterations) does not exceed 100. The values of the output neurons of each model are presented in Table 2.
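The following sketch sets up a network of the described shape in Python; it is illustrative only, not the authors' MATLAB implementation, and since scikit-learn provides no Levenberg–Marquardt optimizer, 'adam' with early stopping is used purely as a stand-in.

```python
# Sketch: feed-forward network with a log-sigmoid hidden layer, linear
# outputs and MSE loss; LM training is not available here (assumption:
# 'adam' used instead), early stopping and <=100 iterations as in the text.
from sklearn.neural_network import MLPRegressor

mlnn = MLPRegressor(hidden_layer_sizes=(10,),   # e.g. a 30-10-4 architecture
                    activation="logistic",       # log-sigmoid hidden units
                    solver="adam",               # stand-in for LM
                    max_iter=100,                # at most 100 training epochs
                    early_stopping=True,         # hold out part of the data
                    validation_fraction=0.1)
# mlnn.fit(X_train, Y_train)  # Y_train: binary output codes from Table 2
```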

The first model provides classification of only two stages: awake 'W' and sleep 'S', which contains the four phases of non-REM sleep and REM sleep. A binary representation is chosen for the output neurons. The value 1 in the output of the first model means that the ANN classifies the given epoch as the W stage; if the value 0 occurs, the epoch is referred to as S. The other models form their outputs on the same principle. The second model allows scoring of three stages: awake 'W', sleep 'S*' (involving the four stages of non-REM sleep) and 'REM'. The third model allows classification of four stages: awake 'W', combined S1 and REM 'S1/REM', S2 of non-REM 'S2' and combined S3 and S4 stages of non-REM 'S3/S4'. In the next model, the non-REM sleep stages are combined as 'S1/S2' and 'S3/S4', while W and REM sleep are classified separately. The last model has six outputs for the purpose of scoring all sleep stages.
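A small sketch of this binary output coding, using model 3 of Table 2 as an example (stage names as in the text), follows.

```python
# Sketch: one-hot target vectors for model 3 (W, S1/REM, S2, S3/S4);
# each visually scored epoch label becomes a binary code for the
# four output neurons, as listed in Table 2.
import numpy as np

STAGES_MODEL3 = ["W", "S1/REM", "S2", "S3/S4"]

def encode(stage_labels, stages=STAGES_MODEL3):
    return np.array([[1.0 if s == c else 0.0 for c in stages]
                     for s in stage_labels])

print(encode(["W", "S2"]))   # [[1. 0. 0. 0.], [0. 0. 1. 0.]]
```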

10-fold CV is applied to choose the most informative features and to evaluate the performance of the classifiers. All feature vectors are divided into 10 sets of the same length (773 column vectors, with the number of features depending on the ANN architecture) so that the percentage ratio of vectors representing the different sleep stages is equal.
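A stratified 10-fold split of this kind can be sketched as follows; the feature matrix and labels below are random placeholders (7730 vectors, i.e., 10 folds of 773, following the figure in the text), not the study data.

```python
# Sketch: stratified 10-fold cross-validation with equal stage
# proportions in every fold (placeholder data, not the study data).
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.standard_normal((7730, 30))             # e.g. 30 relative-power features
y = rng.choice(["W", "S1/REM", "S2", "S3/S4"], size=7730)

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):
    # train the MLNN on the training folds, then compare its output with
    # the visual scoring on the held-out fold to obtain the agreement
    pass
```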


Table 2. Output neurons of the proposed artificial neural network models. REM: rapid eye movement, S*: stage involving the four stages of non-REM sleep (other sleep stages are according to R&K), S1, S2, S3 and S4: see section "Polygraphic data and visual sleep scoring" for definitions, W: wakefulness.

Model No. | Sleep stage | Output neurons (1 2 3 4 5 6)
1 | W | 1 – – – – –
1 | S | 0 – – – – –
2 | W | 1 0 0 – – –
2 | S* | 0 1 0 – – –
2 | REM | 0 0 1 – – –
3 | W | 1 0 0 0 – –
3 | S1/REM | 0 1 0 0 – –
3 | S2 | 0 0 1 0 – –
3 | S3/S4 | 0 0 0 1 – –
4 | W | 1 0 0 0 – –
4 | S1/S2 | 0 1 0 0 – –
4 | S3/S4 | 0 0 1 0 – –
4 | REM | 0 0 0 1 – –
5 | W | 1 0 0 0 0 0
5 | S1 | 0 1 0 0 0 0
5 | S2 | 0 0 1 0 0 0
5 | S3 | 0 0 0 1 0 0
5 | S4 | 0 0 0 0 1 0
5 | REM | 0 0 0 0 0 1

Table 3. Results of sleep scoring obtained by the proposed artificial neural network (ANN) models. EEG: electroencephalogram, REM: rapid eye movement, RP: relative power, S: stage involving the four stages of non-REM and REM sleep, S*: stage involving the four stages of non-REM sleep (other sleep stages are according to R&K), S1, S2, S3 and S4: see section "Polygraphic data and visual sleep scoring" for definitions, W: wakefulness. Agreement (%) between automatic and visual scoring results is given per stage and in total.

Model 1 (W, S):
EEG features (RP of certain bands) | ANN architecture | W | S | Total
Delta | 1-3-2 | 28.23 | 98.93 | 93.10
Theta | 1-7-2 | 6.29 | 99.64 | 92.12
Alpha | 1-3-2 | 6.77 | 99.76 | 92.27
Beta | 1-3-2 | 48.71 | 98.44 | 94.12
Delta, theta, alpha, beta | 4-3-2 | 67.58 | 98.68 | 95.72
Delta, theta, alpha, beta | 4-7-2 | 98.06 | 99.41 | 98.62
Delta, theta, alpha, beta | 4-10-2 | 91.13 | 99.12 | 97.86
30 subbands | 30-4-2 | 74.52 | 98.14 | 96.25
30 subbands | 30-10-2 | 75.97 | 98.45 | 96.65
30 subbands | 30-20-2 | 79.35 | 98.42 | 96.90

Model 2 (W, S*, REM):
EEG features | ANN architecture | W | S* | REM | Total
Delta, theta, alpha, beta | 4-5-3 | 69.19 | 92.93 | 45.44 | 80.71
Delta, theta, alpha, beta | 4-7-3 | 61.61 | 93.13 | 49.75 | 81.21
Delta, theta, alpha, beta | 4-10-3 | 73.06 | 92.08 | 49.44 | 81.23
Delta, theta, alpha, beta | 4-5-5-3 | 81.45 | 91.07 | 52.94 | 81.85
30 subbands | 30-5-3 | 84.68 | 92.60 | 78.44 | 84.86
30 subbands | 30-20-3 | 86.94 | 94.27 | 78.00 | 90.31
30 subbands | 30-5-5-3 | 80.32 | 93.36 | 77.19 | 88.97

Model 3 (W, S1/REM, S2, S3/S4):
EEG features | ANN architecture | W | S1/REM | S2 | S3/S4 | Total
Delta, theta, alpha, beta | 4-5-4 | 70.48 | 53.59 | 79.89 | 69.46 | 69.39
Delta, theta, alpha, beta | 4-8-4 | 75.48 | 55.18 | 79.31 | 71.01 | 70.18
Delta, theta, alpha, beta | 4-12-4 | 76.61 | 56.73 | 78.56 | 71.55 | 70.49
30 subbands | 30-5-4 | 80.48 | 82.91 | 85.86 | 74.34 | 82.66
30 subbands | 30-20-4 | 81.94 | 85.05 | 86.85 | 75.19 | 84.05
30 subbands | 30-10-5-4 | 83.39 | 83.73 | 87.04 | 75.43 | 83.87
30 subbands | 30-10-8-4 | 82.58 | 89.91 | 86.13 | 76.05 | 83.57

Model 4 (W, S1/S2, S3/S4, REM):
EEG features | ANN architecture | W | S1/S2 | S3/S4 | REM | Total
Delta, theta, alpha, beta | 4-6-4 | 54.35 | 83.55 | 70.23 | 51.69 | 72.35
Delta, theta, alpha, beta | 4-10-4 | 63.71 | 83.75 | 69.55 | 51.19 | 72.54
Delta, theta, alpha, beta | 4-11-4 | 65.00 | 83.96 | 69.69 | 48.63 | 72.74
30 subbands | 30-6-4 | 83.39 | 83.34 | 73.57 | 77.50 | 80.50
30 subbands | 30-10-4 | 82.10 | 85.55 | 71.94 | 76.81 | 81.19
30 subbands | 30-15-4 | 82.26 | 85.12 | 74.73 | 76.75 | 81.42
30 subbands | 30-6-6-4 | 83.71 | 84.64 | 74.19 | 77.81 | 81.41
30 subbands | 30-7-7-4 | 82.58 | 84.72 | 75.04 | 78.06 | 81.55

Model 5 (W, S1, S2, S3, S4, REM):
EEG features | ANN architecture | W | S1 | S2 | S3 | S4 | REM | Total
Delta, theta, alpha, beta | 30-7-6 | 73.39 | 0.00 | 86.08 | 0.00 | 79.52 | 56.06 | 64.18
Delta, theta, alpha, beta | 30-11-6 | 75.32 | 0.33 | 85.94 | 0.90 | 77.41 | 57.06 | 64.70
Delta, theta, alpha, beta | 30-15-6 | 77.26 | 0.17 | 85.64 | 0.15 | 82.10 | 56.75 | 64.66
Delta, theta, alpha, beta | 30-20-6 | 79.19 | 1.17 | 85.66 | 2.84 | 80.00 | 57.94 | 65.21
Delta, theta, alpha, beta | 30-8-8-6 | 79.84 | 1.17 | 85.41 | 5.67 | 76.77 | 57.63 | 65.07
Delta, theta, alpha, beta | 30-10-8-6 | 81.13 | 1.00 | 84.53 | 8.36 | 76.77 | 58.06 | 65.07
30 subbands | 30-7-6 | 83.87 | 33.67 | 90.14 | 3.73 | 83.06 | 80.56 | 75.21
30 subbands | 30-9-6 | 84.35 | 25.50 | 89.75 | 26.12 | 76.77 | 81.81 | 76.13
30 subbands | 30-11-6 | 83.55 | 30.67 | 89.50 | 28.66 | 77.26 | 82.25 | 76.70
30 subbands | 30-6-6-6 | 84.52 | 34.17 | 88.84 | 21.64 | 79.35 | 81.63 | 76.17
30 subbands | 30-8-7-6 | 83.87 | 36.17 | 88.76 | 27.76 | 76.13 | 81.31 | 76.44
30 subbands | 30-9-8-6 | 83.71 | 35.83 | 89.03 | 28.66 | 75.97 | 80.38 | 76.40


The proposed method of sleep scoring is realized in MATLAB 7.5 (The MathWorks, Inc.).

Results

Many different ANN architectures are tested to study the capabilities of the proposed five models. Some classification results are summarized in Table 3. The proposed ANNs differ not only in the output layers (i.e., the sleep stages that can be recognized) but also in the input vectors (features).

It is interesting that even if the relative power of only one spectral band (delta, theta, alpha or beta) is given to the input of the ANN, the agreement for the S stage is very high (see model 1 in Table 3). Moreover, the ANN with the beta relative power input reaches an agreement for the W stage of approx. 49%.




Recognition of the three stages W, S* and REM with the 4-element input vector yields good results for W and S* and worse results for REM (model 2 in Table 3). The agreement of REM identification dramatically increases with the use of the 30-element input vector for classification.

The S1 and S3 stages cannot be successfully recognized by the proposed ANNs (model 5 in Table 3) because of the low number of epochs representing these stages in the EEG (about 10% of epochs) and also because of the similarity of the characteristics of S1 with S2 and REM, and of S3 with S4. For staging these epochs, more features (not only EEG relative power) should be taken into account. However, models 3 and 4 distinguish the combined stages S1/REM, S1/S2 and S3/S4 with good agreement.

In general, the highest agreement occurs for 30-element input vectors, one hidden layer and a relatively low number of hidden neurons. Increasing the number of hidden layers and hidden neurons improves classifier performance only up to a certain level; further complexity of the ANN architecture does not significantly improve the results. The obtained results are in accordance with those reported for ANNs classifying similar sleep stages (see Table 1 for comparison).
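For illustration, total and per-stage agreement of the kind reported in Table 3 can be computed from paired labels as sketched below; the label arrays are placeholders, and "per-stage agreement" is assumed here to mean the fraction of visually scored epochs of a stage that the ANN labels identically.

```python
# Sketch: total and per-stage agreement between ANN output and visual
# scoring (placeholder labels, not the study data).
import numpy as np

visual = np.array(["W", "S2", "S2", "S3/S4", "REM", "W", "S1/REM", "S2"])
ann    = np.array(["W", "S2", "S1/REM", "S3/S4", "REM", "W", "S2", "S2"])

print(f"Total: {100 * np.mean(visual == ann):.1f}%")
for stage in np.unique(visual):
    mask = visual == stage                      # epochs visually scored as this stage
    print(f"{stage}: {100 * np.mean(ann[mask] == stage):.1f}%")
```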

Conclusion

In this review, the possibilities of using ANNs for automatic sleep scoring are described and demonstrated on a practical example. Such scoring is generally performed using PSG signals, in which the various sleep stages manifest as qualitative and quantitative changes of different signal characteristics.

The features extraction is a very important phase of automatic sleep scoring and significantly affects the overall accuracy of classification. Ideally, only the most informative features, fully describing the analyzed PSG signals and clustering them into separate groups corresponding to the individual sleep stages, should be chosen for scoring. Perhaps one of the most interesting and important aspects is the creation of systems able to recognize sleep stages using single-channel data only. Various approaches deal with this task and propose different feature sets for this purpose. It is, however, still an open question which of them is optimal and whether single-channel systems could replace multi-channel ones. This especially concerns the distinction of the poorly separable R (REM) and N1 (S1) stages.

Special attention must also be paid to the selection of the ANN model, in particular to the determination of the architecture and training algorithm. Parameters of the scoring model must be chosen with regard to the complexity of the solved problem (i.e., type of analyzed signal, number of extracted features and discriminated sleep stages) and the hardware and/or software resources used for classification system design. The ANN-based scoring systems reported by many authors differ in the attained total performance, which varies in the range of 70–97% depending on the recognized stages (see Table 1). There are some commercially available ANN scoring systems; however, they perform sleep stage classification according to the R&K rules, which were replaced by the AASM standard in 2007. There is thus a need to modify current R&K systems and/or develop new ones, considering the reported results of comparing scoring data obtained according to R&K and AASM. Adoption of the new AASM standard in practice should lead to the emergence of large databases including different groups of data (patients/non-patients, adults/children) suitable for studying the influence of changes in the scoring rules on the inter-rater agreement and for validating automatic scoring approaches.

To summarize, close co-operation between sleep and biomedical engineering experts should be taken as the basis for creating robust and reliable automatic scoring systems able to substitute for time-consuming human scoring.

Acknowledgement

This work was supported by the grant projects of the Grant Agency GACR GD 102/09/H083, GA 102/09/1897-BAD, MSM 0021630513, and MSM 0021622402.

Conflict of interest
There is no conflict of interest.

References

* The most important references are denoted by an asterisk.

1. Fava C, Montagnana M, Favaloro EJ, Guidi GC, Lippi G. Obstructive sleep apnea syndrome and cardiovascular diseases. Semin Thromb Hemost 2011;37:280–97.

2. Reishtein JL. Obstructive sleep apnea a risk factor for cardiovascular disease. J Cardiovasc Nurs 2011;26:106–16.

3. Pandey A, Demede M, Zizi F, Al Haija'a OA, Nwamaghinna F, Jean-Louis G, et al. Sleep apnea and diabetes: insights into the emerging epidemic. Curr Diabetes Rep 2011;11:35–40.

4. Melej R. The sleep disturbances and obesity: OSAS and OHS (pneumologist experience). Progr Nutr 2010;12:327–30.

*5. Chapotot F, Becq G. Automated sleep-wake staging combining robust feature extraction, artificial neural network classification, and flexible decision rules. Int J Adapt Control Signal Process 2010;24:409–23.

6. Anderer P, Gruber G, Parapatics S, Woertz M, Miazhynskaia T, Klosch G, et al. An E-health solution for automatic sleep classification according to Rechtschaffen and Kales: validation study of the Somnolyzer 24×7 utilizing the Siesta database. Neuropsychobiology 2005;51:115–33.

7. Becq G, Charbonnier S, Chapotot F, Buguet A, Bourdon L, Baconnier P. Comparison between five classifiers for automatic scoring of human sleep recordings. Stud Comput Intel 2005;4:113–27.

8. Oropesa E, Cycon HL, Jobert M. Sleep stage classification using wavelet transform and neural network. ISCI Technical Report TR-99-008; 1999.

9. Pacheco OR, Vaz F. Integrated system for analysis and automatic classification of sleep EEG. In: Proceedings of IEEE EMBC 1998; vol. 20. p. 2062–2065.

*10. Penzel T, Conradt R. Computer based sleep recording and analysis. Sleep Med Rev 2000;4:131–48.

*11. Robert C, Guilpin C, Limoge A. Review of neural network applications in sleep research. J Neurosci Methods 1998;79:187–93.

12. Robert C, Gaudy JF, Limoge A. Electroencephalogram processing using neural networks. Clin Neurophysiol 2002;113:694–701.

13. Tian JY, Liu JQ. Automated sleep staging by a hybrid system comprising neural network and fuzzy rule-based reasoning. In: Proceedings of IEEE EMBC 2005; vol. 4. p. 4115–4118.

14. Zoubek L, Charbonnier S, Lesecq S, Buguet A, Chapotot F. Feature selection for sleep/wake stages classification using data driven methods. Biomed Sig Proc Control 2007;2:171–9.

15. Callan R. Essence of neural networks. USA: Prentice Hall; 1998.

16. Haykin S. Neural networks: a comprehensive foundation. 2nd ed. USA: Prentice Hall; 1998.

17. Nateri AS, Dadvar S, Oroumei A, Ekrami E. Prediction of silver nanoparticles diameter synthesized through the Tollens process by using artificial neural networks. J Comput Theor Nanos 2011;8:713–6.

18. Eslamloueyan R, Khademi MH, Mazinani S. Using a multi layer perceptron network for thermal conductivity prediction of aqueous electrolyte solutions. Ind Eng Chem Res 2011;50:4050–6.

19. Alharbi A. An artificial neural networks method for solving partial differential equations. AIP Conf Proc 2010;1281:1425–8.

20. Chuang CL, Huang ST. A hybrid neural network approach for credit scoring. Expert Syst 2011;28:185–96.

21. Ribeiro CO, Oliveira SM. A hybrid commodity price-forecasting model applied to the sugar-alcohol sector. Aust J Agri Resour Econom 2011;55:180–98.

22. Pan DG, Lei SS, Wu SC. Two-stage damage detection method using the artificial neural networks and genetic algorithms. In: Proceedings of ICICA'10 2010; vol. 6377. p. 325–332.

23. Mohammad AH, Abu Zitar R. Application of genetic optimized artificial immune system and neural networks in spam detection. Appl Soft Comput 2011;11:3827–45.

24. Al-Qawasmi KE, Al-Smadi AM, Al-Hamami A. Artificial neural network-based algorithm for ARMA model order estimation. In: Proceedings of NDT 2010; vol. 88. p. 184–192.

25. Buddhiraju KM, Rizvi IA. Comparison of CBF, ANN and SVM classifiers for object based classification of high resolution satellite images. Proceedings of IGARSS; 2010. p. 40–43.

26. Amaral JLM, Faria ACD, Lopes AJ, Jansen JM, Melo PL. Automatic identification of chronic obstructive pulmonary disease based on forced oscillation measurements and artificial neural networks. Proceedings of IEEE EMBC; 2010. p. 1394–1397.


27. Nawi NM, Ghazali R, Salleh MNM. The development of improved back-propagation neural network algorithm for predicting patients with heart disease. In: Proceedings of ICICA'10 2010; vol. 6377. p. 317–324.

28. Rechtschaffen A, Kales A. A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects. Los Angeles, CA: BIS/BRI, University of California; 1968.

29. Iber C, Ancoli-Israel S, Chesson A, Quan SF. The AASM manual for scoring of sleep and associated events: rules, terminology and technical specifications. 1st ed. Westchester, IL: American Academy of Sleep Medicine; 2007.

30. Vaughn BV, Giallanza P. Technical review of polysomnography. Chest 2008;134:1310–9.

31. Zifkin BG, Avanzini G. Clinical neurophysiology with special reference to the electroencephalogram. Epilepsia 2009;50:30–8.

32. Roberts S, Tarassenko L. New method of automated sleep quantification. Med Biol Eng Comput 1992;30:509–17.

33. Grözinger M, Röschke J, Klöppel B. Automatic recognition of rapid eye movement (REM) sleep by artificial neural networks. J Sleep Res 1995;4:86–91.

34. Baumgart-Schmitt R, Eilers R, Herrman WM. On the use of neural network techniques to analyze sleep EEG data. Second communication: training of evolutionary optimized neural networks on the basis of multiple subjects data and the application of context rules according to Rechtschaffen and Kales. Somnologie 1997;1:171–83.

*35. Ebrahimi F, Mikaeili M, Estrada E, Nazeran H. Automatic sleep stage classification based on EEG signals using neural networks and wavelet packet coefficients. Proceedings of IEEE EMBC; 2008. p. 1151–1154.

*36. Sinha RK. Artificial neural network and wavelet based automated detection of sleep spindles, REM sleep and wake states. J Med Syst 2008;32:291–9.

37. Liu Y, Yan L, Zeng B, Wang W. Automatic sleep stage scoring using Hilbert-Huang transform with BP neural network. Proceedings of ICBEE; 2010. p. 1–4.

38. Shimada T, Tamura K, Fukami T, Saito Y. The effect of using Elman-type SOM for sleep stages diagnosis. Proceedings of IEEE/ICME; 2010. p. 165–170.

39. Himanen SL, Hasan J. Limitations of Rechtschaffen and Kales. Sleep Med Rev 2000;4:149–67.

*40. Moser D, Anderer P, Gruber G, Parapatics S, Loretz E, Boeck M, et al. Sleep classification according to AASM and Rechtschaffen & Kales: effects on sleep scoring parameters. Sleep 2009;32:139–49.

41. Novelli L, Ferri R, Bruni O. Sleep classification according to AASM and Rechtschaffen and Kales: effects on sleep scoring parameters of children and adolescents. J Sleep Res 2010;19:238–47.

42. Danker-Hopfe H, Anderer P, Zaitlhofer J, Boeck M, Dorn H, Gruber G, et al. Interrater reliability for sleep scoring according to the Rechtschaffen & Kales and the new AASM standard. J Sleep Res 2009;18:74–84.

43. Sörnmo L, Laguna P. Bioelectrical signal processing in cardiac and neurological applications. USA: Elsevier Academic Press; 2005.

*44. Tagluk ME, Sezgin N, Akin M. Estimation of sleep stages by an artificial neural network employing EEG, EMG and EOG. J Med Syst 2010;34:717–25.

*45. Vural C, Yildiz M. Determination of sleep stage separation ability of features extracted from EEG signals using principal component analysis. J Med Syst 2010;34:83–9.

46. Correa AG, Laciar E, Patino HD, Valentinuzzi ME. Artifact removal from EEG signals using adaptive filters in cascade. J Phys Conf Ser 2007;90:1–10.

47. Kumar PS, Arumuganathan R, Sivakumar K, Vimal C. Removal of ocular artifacts in the EEG through wavelet transform without using an EOG reference channel. Int J Open Problems Comput Math 2008;1:188–200.

48. Araghi LF. A new method for artefact removing in EEG signals. In: Proceedings of IMECS 2010; vol. 1. p. 17–20.

49. Shao S-Y, Shen K-Q, Ong CJ. Automatic EEG artifact removal: a weighted support vector machine approach with error correction. IEEE Trans Biomed Eng 2009;56:336–44.

50. Crespo-Garcia M, Atienza M, Cantero JL. Muscle artefact removal from human sleep EEG by using independent component analysis. Ann Biomed Eng 2008;36:467–75.

51. Delorme A, Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 2004;134:9–21.

52. Lagerlund TD, Sharbrough FW, Busacker NE. Spatial filtering of multichannel electroencephalographic recordings through principal component analysis by singular value decomposition. J Clin Neurophysiol 1997;14:73–82.

53. Moretti DV, Babiloni F, Carducci F, Cincotti F, Remondini E, Rossini PM, et al. Computerized processing of EEG-EOG-EMG artifacts for multicentric studies in EEG oscillations and event-related potentials. Int J Psychophysiol 2003;47:199–216.

54. Yong X, Ward RK, Birch GE. Artifact removal in EEG using morphological component analysis. Proceedings of ICASSP'09; 2009. p. 345–348.

55. Estrada E, Nazeran H. EEG and HRV signal features for automatic sleep staging and apnea detection. Proceedings of CONIELECOMP; 2010. p. 142–147.

56. Hjorth B. EEG analysis based on time domain properties. Electroencephalogr Clin Neurophysiol 1970;29:306–10.

57. Welch PD. The use of Fast Fourier Transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Trans Audio Electroacoustics 1967;15:70–3.

58. Djuric PM, Kay SM. Spectrum estimation and modeling. In: Madisetti VK, Williams DB, editors. Digital signal processing handbook. USA: CRC Press; 1997. p. 14–9.

59. Gabor D. Theory of communication. Proc IEE 1946;93:429–57.

60. Daubechies I. Orthonormal bases of compactly supported wavelets. Comm Pure Appl Math 1988;41:909–96.

61. Mallat S. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Machine Intell 1989;11:674–93.

62. Akin M. Comparison of wavelet transform and FFT methods in the analysis of EEG signals. J Med Syst 2002;26:241–7.

*63. Schaltenbrand N, Lengelle R, Macher JP. Neural network model: application to automatic analysis of human sleep. Comput Biomed Res 1993;26:157–71.

64. Acharya RU, Faust O, Kannathal N, Chua T, Laxminarayan S. Non-linear analysis of EEG signals at various sleep stages. Comput Meth Programs Biomed 2005;80:37–45.

65. Fell J, Röschke J, Mann K, Schäffner C. Discrimination of sleep stages: a comparison between spectral and nonlinear EEG measures. Electroencephalogr Clin Neurophysiol 1996;98:401–10.

66. Pradhan N, Narayana Dutt D, Sadasivan PK, Satish M. Analysis of the chaotic characteristics of sleep EEG patterns from dominant Lyapunov exponents. In: Proceedings of RC IEEE-EMBS and 14th BMESI; 1995. p. 3/79–3/80.

67. Röschke J, Fell J, Beckmann P. The calculation of the first positive Lyapunov exponent in sleep EEG data. Electroencephalogr Clin Neurophysiol 1993;86:348–52.

68. Shen Y, Olbrich E, Achermann P, Meier PF. Dimensional complexity and spectral properties of the human sleep EEG. Clin Neurophysiol 2003;114:199–209.

*69. Hagan MT, Demuth HB, Beale MH. Neural network design. Boston: PWS Pub. Co.; 1995.

70. Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 1958;65:386–408.

71. Principe JC, Tome AMP. Performance and training strategies in feed forward neural networks: an application to sleep scoring. In: Proceedings of IJCNN 1989; vol. 1. p. 341–346.

72. Elman JL. Finding structure in time. Cognitive Sci 1990;14:179–211.

73. Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci USA 1982;79:2554–8.

74. Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL, editors. Parallel distributed processing: explorations in the microstructure of cognition. Cambridge: MIT Press; 1986. p. 318–62.

75. Kelley CT. Iterative methods for optimization. Philadelphia: Society for Industrial and Applied Mathematics; 1987.

76. Kohonen T. The self-organizing map. Proc IEEE 1990;78:1464–80.

77. Foresee FD, Hagan MT. Gauss–Newton approximation to Bayesian regularization. Proceedings of IJCNN; 1997. p. 1930–1935.

78. Hagiwara K. Regularization learning, early stopping and biased estimator. Neurocomputing 2002;48:937–55.

79. Mackay DJC. Bayesian interpolation. J Neural Computation 1991;4:1–14.

80. Duda RO, Hart PE, Stork G. Pattern classification. 2nd ed. New York: Wiley-Interscience; 2000.

81. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas 1960;20:37–46.

82. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.

83. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 1968;70:213–20.

84. Mielke PW, Berry KJ, Johnston JE. Unweighted and weighted kappa as measures of agreement for multiple judges. Int J Manage 2009;26.

85. Berthomier C, Drouot X, Herman-Stoïca M, Berthomier P, Prado J, Bokar-Thire D. Automatic analysis of single-channel sleep EEG: validation in healthy individuals. Sleep 2007;30:1587–95.

86. Malinowska U, Klekowicz H, Wakarow A, Niemcewicz S, Durka PJ. Fully parametric sleep staging compatible with the classical criteria. Neuroinform 2009;7:245–53.

87. Schwaibold M, Schöller B, Penzel T, Bolz A. Artificial intelligence in sleep analysis (ARTISANA): modelling of the visual sleep stage identification process. Biomed Tech 2001;46:129–32.

88. Anderer P, Moreau A, Woertz M, Ross M, Gruber G, Parapatics S, et al. Computer-assisted sleep classification according to the standard of the American Academy of Sleep Medicine: validation study of the AASM version of the Somnolyzer 24×7. Neuropsychobiology 2010;62:250–64.

89. Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 2000;101:e215–20.

90. Kemp B, Värry A, Rosa AC, Nielsen KD, Gade J. A simple format for exchange of digitized polygraphic recordings. Electroencephalogr Clin Neurophysiol 1992;82:391–3.

91. Sweden B, Kemp B, Kamphuisen HAC, Velde EA. Alternative electrode placement in (automatic) sleep scoring (Fpz-Cz/Pz-Oz versus C4–A1/C3–A2). Sleep 1990;13:279–83.

92. Mourtazaev MS, Kemp B, Zwinderman AH, Kamphuisen HAC. Age and gender affect different characteristics of slow waves in the sleep EEG. Sleep 1995;18:557–64.

93. Kemp B, Zwinderman AH, Tuk B, Kamphuisen HAC, Oberyé JJL. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE Trans Biomed Eng 2000;47:1185–94.