Tam et al.
REVIEW
Human motor decoding from neural signals: a review
Wing-kin Tam1†, Tong Wu1†, Qi Zhao2†, Edward Keefer3† and Zhi Yang1*
Abstract
Many people suffer from movement disability due to amputation or neurological diseases. Fortunately, with modern neurotechnology it is now possible to intercept motor control signals at various points along the neural transduction pathway and use them to drive external devices for communication or control. Here we review the latest developments in human motor decoding, including the various strategies for decoding motor intention from human subjects and their respective advantages and challenges.
Neural control signals can be intercepted at various points in the neural signal transduction pathway, including the brain (electroencephalography, electrocorticography, intracortical recordings), the nerves (peripheral nerve recordings) and the muscles (electromyography). We systematically discuss the sites of signal acquisition, available neural features, signal processing techniques and decoding algorithms at each of these potential interception points.
Examples of applications and the current state-of-the-art performance are also reviewed. Although great strides have been made in human motor decoding, we are still far from achieving naturalistic and dexterous control comparable to that of our native limbs. Concerted efforts from material scientists, electrical engineers, and healthcare professionals are needed to further advance the field and make the technology widely available for clinical use.
Keywords: motor decoding; brain-machine interfaces; neuroprosthesis; neural signal processing
Background
Every year, it is estimated that more than 180,000 people undergo some form of limb amputation in the United States alone [1]. In 1996, a national survey revealed that 1.2 million people were living with limb loss [2]. This figure is expected to more than triple to 3.6 million by the year 2050 [1]. Besides amputation, various neurological disorders or injuries also affect one’s movement ability. Examples include spinal cord injury, stroke, and amyotrophic lateral sclerosis. Patients suffering from these conditions lose volitional movement control even though their limbs are still intact. Whether the cause is amputation or a neurological disorder, affected patients have their everyday life and work significantly disrupted. Some may be forced to give up their original jobs, while others may even lose the ability to take care of themselves entirely.
*Correspondence: [email protected]
1 Department of Biomedical Engineering, University of Minnesota Twin Cities, 7-105 Hasselmo Hall, 312 Church St. SE, 55455 Minnesota, USA
Full list of author information is available at the end of the article
†Email contacts: WKT: [email protected], TW: [email protected], QZ: [email protected], EK: [email protected]
Fortunately, although part of the signal transduction pathway from higher cortical centers to muscles has been severed in the aforementioned conditions, in most cases we can still exploit the remaining parts to capture the movement intention of the subject. For amputation, the neurological pathway above the nerve stump is mostly intact. For neurological disorders and injuries, depending on the site of the lesion, upstream structures are usually still intact and functioning. With modern neural interfacing technology, signal processing and machine learning algorithms, it is now possible to decode those motor intentions and use them either to replace the lost function (e.g. through a prosthesis) or to aid rehabilitation (e.g. in stroke [3, 4]).
The signal for movement control can be intercepted at various points along the neural transduction pathway. Each of these points exhibits different features and poses unique advantages and challenges. Some methods are more invasive (e.g. intracortical recording) but also more versatile because they intercept neural signals at the most upstream point, so they are less reliant on the presence of residual function. Others (e.g. surface electromyogram), while less invasive, rely heavily on the presence of functional downstream structures, and thus any upstream damage undermines their performance. Ultimately, the choice of signal modality to decode from depends on the location, type, and severity of the lesion. In this review, we discuss the various opportunities available to decode motor intention from human subjects at different locations along the motor control pathway. It is our hope that this comprehensive information can help clinicians make the most effective decisions on how to help patients.
In this review, we mainly focus on the decoding of motor intention in human subjects. Although animal studies are a very important and indispensable part of motor decoding research, application to human subjects is the ultimate goal. Clinical trials on patients may introduce additional and non-negligible challenges to the system and experimental design. For example, in amputees or paralyzed subjects the ground truth for limb movement is usually unavailable. Special considerations must be incorporated into the experimental design to work around this limitation. Furthermore, although some methods may work very well in animal studies, their translation into human use may not be straightforward due to safety concerns or surgical difficulties. Therefore, a focus on human studies allows us to have a more realistic expectation of the current state-of-the-art performance in the field. This knowledge can in turn better inform the weighing of risk against benefit when choosing a decoding strategy.
Main text
Neurophysiology of motor control
To decode the motor intention of a human subject, it is useful to first understand the natural neurophysiology of motor control, so that we know where to intercept the control signal and what kinds of signal features we may encounter.
Motor control in the human body begins at the frontal and posterior parietal cortex (PPC) [5, 6]. These areas carry out high-level, abstract thinking to determine what actions to take in a given situation [7]. For example, when confronted with a player from the opposing team, a soccer player may need to decide whether to dribble, shoot or pass the ball to a teammate. The choice of the best action depends on the location of the player, the opponent and the ball. It also depends on the current joint angles of the knees and ankles in relation to the ball. The PPC receives input from the somatosensory cortex to get information on the current state of the body. It also has extensive interconnections with the prefrontal cortex, which is responsible for abstract strategic thought. The prefrontal cortex may need to consider other factors besides the sensory information about the current environment. For example, how skillful is the opponent compared to myself? What is the existing team strategy at the current state of the game: should I play more aggressively or defensively? The combination of sensory information, past experience, and strategic decisions in the frontal and posterior parietal cortex determines what sequence of actions to take.
The planning of the action sequence is then carried out by the premotor area (PMA) and the supplementary motor area (SMA), both located in Brodmann area 6 of the cortex. Stimulation in area 6 is known to elicit complex action sequences, and intracortical recording in the PMA shows that it is activated around 1 second before movement and stops shortly after the movement is initiated [8]. Some neurons in the PMA also appear to be tuned to the direction of movement, with some of them activated only when the hand moves in one direction but not in the other.
After a sequence of actions is planned in the PMA or SMA, input from the basal ganglia is required to actually initiate the movement. The basal ganglia contain the direct and indirect pathways [9–11]. The direct pathway helps select a particular action to initiate, while the indirect pathway filters out other inappropriate motor programs. In the direct pathway, the striatum (putamen and caudate) receives input from the cerebral cortex and inhibits the internal globus pallidus (GPi). In the resting state, GPi is spontaneously active and inhibits the oral part of the ventral lateral nucleus (VLo) of the thalamus. Thus, inhibition of GPi enhances the activity of VLo, which in turn excites the SMA. In the indirect pathway, the striatum excites GPi through the subthalamic nucleus (STN), which then suppresses VLo activity and in turn inhibits the SMA. In some neurological disorders such as Parkinson’s disease, a deficit in the ability to activate the direct pathway leads to difficulty in initiating a movement (i.e. bradykinesia), while a deficit in the indirect pathway leads to uncontrolled movement in the resting state (i.e. resting tremor).
After the basal ganglia help filter out unwanted motor programs and focus on the selected programs, the primary motor cortex (M1) is responsible for their low-level execution [12]. In layer V of M1, there is a population of large pyramidal neurons that project their axons down the spinal cord through the corticospinal tract. These axons connect monosynaptically with motor neurons in the spinal cord to activate muscle fibers. They also connect with inhibitory interneurons in the spinal cord to inhibit antagonistic muscles. This structure allows a single pyramidal cell to generate coordinated movement in multiple muscle groups.
Motor neurons in the spinal cord receive inputs from the M1 pyramidal cells through the corticospinal tract [13]. They also receive input indirectly from the motor cortex and cerebellum through the rubrospinal tract, routed via the red nucleus in the midbrain. Although its function is well established in lower mammals, the function of the rubrospinal tract in humans appears to be rudimentary. The axons of motor neurons in the ventral horn of the spinal cord bundle together to form the ventral root, which exits the spinal cord and joins with the dorsal root to form a mixed spinal nerve. The spinal nerve further branches out into smaller nerve fibers that innervate various muscles of the body. One motor neuron may supply multiple muscle fibers, collectively known as one motor unit. A muscle consists of multiple muscle fibers, grouped into motor units of various sizes, each of which may be supplied by a different motor neuron. In large muscles such as those in the leg, one motor neuron may supply hundreds of muscle fibers. In smaller muscles, such as those in the fingers, one motor neuron may supply only 2 or 3 muscle fibers, enabling fine movement control.
The motor control pathway of the human body runs from the high-level associative areas of the brain, mediated by the motor cortex, through the spinal cord to the individual muscle fibers. Each stage plays a different role and uses different mechanisms to ensure that a movement is carried out in a coordinated and smooth manner. Each stage also offers different signal modalities and features that can be exploited for motor decoding. We now discuss these features and the strategies to utilize them in detail below. An overview of the motor control pathway and the various ways to intercept the control signal is shown in (Fig. 1).
Cortical decoding of limb movements
All volitional motor control originates from the brain. The motor cortex plays an especially important role in planning and executing motor commands. For some patients, the brain is the only site where motor intention can be captured because they have lost motor function in all their extremities (e.g. tetraplegic patients). Therefore, much effort has been invested in cortical decoding.
Electroencephalography (EEG)
EEG is the measurement of weak electrical signals from the brain on the surface of the scalp. Its origin is believed to be the summation of postsynaptic potentials of excitable neural tissues in the brain [14]. The skull, dura and cerebrospinal fluid between the brain and the EEG electrodes attenuate the electrical signal significantly, so the EEG signal is very weak, typically under 150 µV. Those structures also act like temporal low-pass filters, limiting the useful bandwidth of the EEG signal to below 100 Hz [15]. Furthermore, due to the volume conduction of current sources in the head, the effect of a single current source spreads to several electrodes. The result is a spatial low-pass filtering of the original signal, leading to a “smearing” of the signal source and a reduction in spatial resolution. Thus most EEG setups for motor decoding involve only 64 or 128 electrodes. Setups with more than 128 electrodes are uncommon.
The EEG signal is traditionally separated into several frequency bands (delta: 0–4 Hz, theta: 4–7.5 Hz, alpha: 8–13 Hz, beta: 13–30 Hz, gamma: 30–100 Hz). Of particular importance to motor decoding is the brain oscillation in the alpha band over the motor and somatosensory cortex, also known as the µ-rhythm [16, 17]. It has been observed that the signal power in the 8–13 Hz band decreases when a subject is carrying out an actual or even an imagined movement [18, 19]. Similar observations can also be made in the lower beta band (12–22 Hz). Although some components of the beta band oscillation may be harmonics of the alpha band signals, the consensus now is that they are independent signal features because they have different topographic and timing characteristics [18, 20]. The µ-rhythm tends to focus on the bilateral sensorimotor area while the beta rhythm concentrates mainly on the vertex. Collectively, the modulation of the signal band power over the sensorimotor area is called the sensorimotor rhythm (SMR).
This decrease of band power coinciding with an event is called event-related desynchronization (ERD). The opposite, an increase of band power coinciding with an event, is called event-related synchronization (ERS). ERD/ERS is typically calculated with respect to a reference period, usually when the subject is wakefully relaxed and not performing any task [21]:
$$\mathrm{ERD} = \frac{R - A}{R} \times 100\%$$
where R is the band power during the reference period and A is that during the time period of interest. An example of ERD topography during motor imagery is shown in (Fig. 2).
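The ERD formula above can be sketched in a few lines. This is a minimal illustration, assuming the segments have already been band-pass filtered to the band of interest and that band power is taken as the mean squared amplitude; the function names and the sample values are hypothetical and not taken from the cited studies.

```python
def band_power(samples):
    """Mean squared amplitude of an already band-limited segment (assumed power estimate)."""
    return sum(x * x for x in samples) / len(samples)


def erd_percent(reference, active):
    """ERD = (R - A) / R * 100%, following the formula above.

    Positive values mean the band power dropped (desynchronization, ERD);
    negative values mean it rose (synchronization, ERS).
    """
    r = band_power(reference)
    a = band_power(active)
    return (r - a) / r * 100.0


# Hypothetical band-limited segments: the amplitude halves during motor imagery.
reference_segment = [2.0, -2.0, 2.0, -2.0]
imagery_segment = [1.0, -1.0, 1.0, -1.0]
erd = erd_percent(reference_segment, imagery_segment)
```

With this sign convention, halving the amplitude quarters the power, so the sketch reports an ERD of 75%; swapping the two segments yields a negative value, which corresponds to ERS.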
The ERD topography during movement displays an evolving pattern over time [21]. ERD usually starts around 2 seconds before actual movement, concentrated on the contralateral sensorimotor area, then spreads to the ipsilateral side and becomes bilaterally symmetrical just before the start of movement. After the movement terminates, there is an increase of beta band power (i.e. ERS) around the contralateral sensorimotor area [21, 23, 24], also known as the “beta rebound”. The occurrence of the beta rebound coincides with a reduction in corticospinal excitability [25], suggesting that the rebound may be related to the deactivation of the motor cortex after a movement terminates. Beta rebound occurs in actual as well as in imagined movements. An example of the beta rebound can be observed in (Fig. 2a).
Figure 1 Overview of various ways to intercept motor control signals. The motor control signal is relayed from the primary motor cortex of the brain, via the spinal cord and peripheral nerve, to the muscle fibers. The control signal can be intercepted at various points using different techniques. Electroencephalography (EEG) captures the superimposed electrical fields generated by neural activity on the surface of the scalp. Electrocorticography (ECoG) measures activity underneath the scalp on the surface of the brain. Intracortical recordings penetrate into the brain tissue to acquire multi- and single-unit activities. Electrodes can also be placed on the peripheral nerve to monitor the low-level signal used to drive muscle contraction. Finally, electromyography (EMG) can also be used to monitor the activity of the muscle directly. (Figure contains elements of images adapted from Patrick J. Lynch and Carl Fredrik under Creative Commons Attribution License)
Different kinds of motor imagery (MI) produce different topographies of ERD and hence are useful for decoding the motor intention of the subject. For example, imagining moving one’s hand will elicit ERD near the hand area of the motor cortex, which lies in a more lateral position. On the other hand, imagining a foot movement will elicit ERD near the foot area in some subjects, which is closer to the sagittal line [26], as can be observed in (Fig. 2c). The beta rebound after MI also displays a similar somatotopic pattern [23]. Simultaneous ERD and ERS in different parts of the brain are also evident in some subjects. For example, some subjects showed ERD in the hand area and ERS in the foot area during a voluntary hand movement, and vice versa during a foot movement [23]. ERD may represent an activation of the cortical area controlling the motion, while ERS may represent an inhibition of other, unintended movements. As we recall from the neurophysiology of motor control, the indirect pathway of the basal ganglia contains mechanisms that suppress the thalamic activation of the SMA to filter out unintended movements. There are characteristic patterns of ERD/ERS during different actual and imagined movements, so by examining those patterns we can detect and distinguish the motor intention of different body parts.
The most reactive frequency band at which ERD/ERS occurs may be specific to each subject and even to the type of motor imagery, and its topography may vary slightly across different EEG preparations. Therefore, signal processing and machine learning techniques are usually employed to adapt to the signal features of the subjects automatically.
Figure 2 Examples of EEG features in motor decoding. EEG features from one of the subjects in the BCI Competition IV 2a dataset [22]. (a) The time course of the change in band power of the EEG signal filtered between 8–12 Hz, in left hand and right hand motor imagery, compared to a reference period (0–3 s). The shaded regions show the standard deviation of the changes across different trials. The experimental paradigm is also shown below (fixation, cue, motor imagery, rest; time axis ticks at 0, 2, 3 and 6 s). (b) The frequency spectrum of the EEG signal during fixation and motor imagery. (c) The topography of the ERD/ERS distribution in different types of motor imagery.
One of the most important signal processing steps in SMR-based motor decoding is the estimation of signal power in the frequency range of choice, typically the alpha (8–12 Hz) and beta (12–30 Hz) bands. There are many methods to achieve this. One of the simplest and most computationally efficient is band-pass filtering [3, 27]. The EEG signal is first band-pass filtered in the frequency band of interest, and the sum of squares of the filtered signal is then taken as the signal power in that band. For a zero-mean signal, the sum of squares is proportional to the variance, so the variance of the signal is usually used instead. After taking the variance, a log-transform is commonly applied. The log-transform serves two purposes. First, it transforms skewed data to make them conform better to the normal distribution [28], which may help improve the performance of some classification algorithms. Second, the log-transform emphasizes the relative change of the signal rather than the absolute difference (e.g. log(110) − log(100) = log(1100) − log(1000)), so it performs an implicit normalization of the signal and can improve the performance of the classifier.
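The band-pass, variance and log-transform steps described above can be sketched as follows. This is only an illustration under stated assumptions: the filter is a single second-order (biquad) band-pass section in the widely used audio-EQ-cookbook form, chosen here for brevity and not taken from the cited studies; the sampling rate, the test tones and the function names are likewise hypothetical.

```python
import math
import statistics


def bandpass_biquad(signal, fs, low, high):
    """Second-order band-pass filter (cookbook form, unity gain at the center)."""
    f0 = math.sqrt(low * high)            # geometric center frequency
    q = f0 / (high - low)                 # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0      # b1 is zero for this filter type
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def log_band_power(signal, fs, low, high):
    """Band-pass filter, take the variance as band power, then log-transform."""
    filtered = bandpass_biquad(signal, fs, low, high)
    return math.log(statistics.pvariance(filtered))


fs = 250  # assumed sampling rate in Hz
mu_tone = [math.sin(2 * math.pi * 10 * n / fs) for n in range(4 * fs)]
gamma_tone = [math.sin(2 * math.pi * 40 * n / fs) for n in range(4 * fs)]
mu_feature = log_band_power(mu_tone, fs, 8.0, 12.0)
gamma_feature = log_band_power(gamma_tone, fs, 8.0, 12.0)

# The log-transform depends only on the ratio, as in the example in the text.
small_step = math.log(110) - math.log(100)
large_step = math.log(1100) - math.log(1000)
```

In this sketch the 10 Hz tone passes through the 8–12 Hz filter almost unchanged while the 40 Hz tone is strongly attenuated, so `mu_feature` is much larger than `gamma_feature`. A practical pipeline would typically use a higher-order filter with a flatter passband.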
One of the major drawbacks of the simple band-pass filtering approach is that it may be difficult to choose the best frequency band for the filter, as each patient has their own specific reactive band. To overcome this limitation, the adaptive auto-regressive (AAR) model is another commonly employed technique [29–32]. It models the signal at the current time point as a linear combination of the previous p points:
$$Y_t = a_{1,t} Y_{t-1} + a_{2,t} Y_{t-2} + \ldots + a_{p,t} Y_{t-p} + X_t$$
where Y_t is the signal, X_t is the residual white noise and a_{1,t}, ..., a_{p,t} are the autoregressive coefficients. The core difference from the traditional AR model is that in the AAR model, the coefficients a_{i,t} depend on time and are calculated for each signal time point using recursive least squares [33]. AAR coefficients from multiple electrodes are then concatenated to form the feature vector used by a classification system. AAR coefficients can be seen as the impulse response of a system, so they contain information about the frequency spectrum of the modeled signal. Compared to traditional band-pass filtering, spectrum estimation using AAR can be more robust against noise. One can also specify the number of spectral peaks based on domain knowledge (each peak requires two coefficients). Another advantage is that there is no need to choose a subject-specific frequency band beforehand, as all model coefficients are used for classification. Another way to choose the subject-specific frequency band automatically is to use a filter bank that consists of multiple band-pass filters at different frequencies. After filtering, the most informative frequency bands and channels are selected using some performance metric, e.g. whether deleting those features leads to a reversal of the classification label [34, 35].
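The recursive least squares update used to track the time-varying AAR coefficients can be sketched as below. This is a minimal single-channel illustration, not the implementation of the cited works: the forgetting factor, the initial value `delta` of the inverse correlation matrix, the model order and the simulated signal are all assumptions made for the example.

```python
import random


def aar_rls(y, p=2, forgetting=0.998, delta=100.0):
    """Track time-varying AR coefficients with recursive least squares.

    Returns one coefficient list [a_1, ..., a_p] per time point t >= p.
    `forgetting` and `delta` are illustrative tuning values.
    """
    a = [0.0] * p
    # P approximates the inverse correlation matrix of the regressors.
    P = [[delta if i == j else 0.0 for j in range(p)] for i in range(p)]
    history = []
    for t in range(p, len(y)):
        x = [y[t - k] for k in range(1, p + 1)]           # previous p samples
        Px = [sum(P[i][j] * x[j] for j in range(p)) for i in range(p)]
        denom = forgetting + sum(x[i] * Px[i] for i in range(p))
        gain = [v / denom for v in Px]
        err = y[t] - sum(a[i] * x[i] for i in range(p))   # one-step prediction error
        a = [a[i] + gain[i] * err for i in range(p)]
        # P <- (P - gain * x^T P) / forgetting; P is symmetric, so x^T P equals Px.
        P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(p)]
             for i in range(p)]
        history.append(list(a))
    return history


# Simulated AR(2) signal with known, fixed coefficients (hypothetical data).
rng = random.Random(0)
true_a = [1.5, -0.8]
y = [0.0, 0.0]
for _ in range(3000):
    y.append(true_a[0] * y[-1] + true_a[1] * y[-2] + rng.gauss(0.0, 1.0))

coeffs = aar_rls(y, p=2)
final_a = coeffs[-1]
```

On this stationary test signal the tracked coefficients settle near the true values. In a decoding pipeline, the coefficient lists at the time points of interest would be concatenated across electrodes to form the feature vector, as described above.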
Due to volume conduction in the human head, a single current source often appears to be “smeared” across several EEG electrodes. Spatial filtering is usually employed to improve the spatial resolution of the EEG signal. Popular spatial filters include the common average reference (CAR) and the surface Laplacian [36]. These methods re-reference the signals by subtracting from the voltage at each electrode either the average over all electrodes (as in CAR) or the average over its neighbors (as in the surface Laplacian).
$$V_j^{\mathrm{CAR}} = V_j - \frac{1}{N} \sum_{k=1}^{N} V_k$$
$$V_j^{\mathrm{LAP}} = V_j - \frac{1}{n} \sum_{k \in S_j} V_k$$
where V is the signal voltage, N is the total number of electrodes, n is the number of neighboring electrodes, and S_j is the set of electrodes neighboring electrode j in the surface Laplacian (LAP).
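The two re-referencing formulas can be written directly as code. The sketch below operates on one time sample across electrodes; the five-electrode strip, its voltages and the neighbor map are hypothetical and serve only to show how both filters emphasize focal activity.

```python
def common_average_reference(v):
    """V_j^CAR = V_j - (1/N) * sum_k V_k for one time sample across N electrodes."""
    mean_v = sum(v) / len(v)
    return [vj - mean_v for vj in v]


def surface_laplacian(v, neighbors):
    """V_j^LAP = V_j - (1/n) * sum of V_k over the neighbor set S_j.

    `neighbors` maps each electrode index j to the list of indices in S_j.
    """
    out = []
    for j, vj in enumerate(v):
        s_j = neighbors[j]
        out.append(vj - sum(v[k] for k in s_j) / len(s_j))
    return out


# Hypothetical 1-D strip of five electrodes (values in µV): a focal source at
# index 2 riding on a common offset of about 10 µV.
sample = [10.0, 11.0, 14.0, 11.0, 10.0]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
car = common_average_reference(sample)
lap = surface_laplacian(sample, chain)
```

Both outputs remove the common offset and leave the largest value at the focal electrode. On a real montage, the neighbor sets would follow the electrode layout, for example the four nearest electrodes on a standard cap.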
These filters enhance focal activity by acting as high-pass spatial filters. More advanced spatial filters have also been proposed. For example, the popular common spatial pattern (CSP) [37, 38] works by finding a projection of the electrode voltages such that the difference in variance between two classes is maximized. A further variation of the method adds frequency information by filtering the signal with a bank of band-pass filters, calculating the CSP for each band, and finally selecting the most informative features through a mutual information criterion [39].
The performance of EEG-based motor decoding has been improving steadily over the years. While earlier studies could only distinguish between discrete types of motor imagery [40], recent studies have already achieved 2D [41] and 3D control [42–44]. Some of the latest studies even demonstrate that it is possible to decode different movements of the same limb [45, 46] or even individual finger movements [47].
Besides being used to replace lost functions, EEG-based motor decoding can also be used as a tool for rehabilitation. For example, it can be used to control a robotic hand to assist in active hand training in post-stroke rehabilitation [4, 48, 49]. This application of motor decoding as a training tool is a very promising area, as it can potentially extend its use to a wider population.
Electrocorticogram (ECoG)
ECoG is the measurement of the electrical signals from the brain on top of the dura, but underneath the skull. ECoG measurement is commonly performed before epilepsy surgery to delineate the epileptogenic area and identify important cortical regions to avoid during a resection [50]. The ECoG signal is not attenuated by the skull and thus tends to have a higher temporal and spatial resolution than EEG. It also has a larger bandwidth (0 to 500 Hz) [51, 52] and a higher amplitude (maximum ∼500 µV [53]). Therefore, ECoG generally has a higher signal-to-noise ratio than EEG, although it is also more invasive.
ECoG and EEG likely arise from the same underlying neural mechanisms and therefore share many similarities. However, there are two major signal features in motor decoding that are unique to ECoG and are specifically exploited. The first is the change of signal band power in the high gamma band (≥ 75 Hz). Many studies have suggested that the high gamma band contains more informative features for motor decoding than the alpha and beta bands, which are typically used in EEG decoding [54–58]. Interestingly, high gamma power tends to increase during movement, unlike the alpha and beta bands, which typically show desynchronization (i.e. a decrease in power). Therefore, high gamma power may be produced by a different neural mechanism from the one that produces the alpha and beta desynchronization.
Another unique feature is the low-frequency amplitude modulation of the raw ECoG signal, termed the Local Motor Potential (LMP) by Schalk et al. [31, 52]. It was found that the envelope of the raw ECoG shows a striking correlation with the movement trajectory of the human hand, as measured by a joystick. The amplitude also shows cosine or sine tuning in relation to the movement direction, similar to what has been observed in intracortical recordings. Since this discovery, many groups have incorporated the LMP into ECoG motor decoding in addition to other high-frequency features (e.g. [54, 57, 59, 60]). The LMP is a very low-frequency component (2–3 Hz) of the raw ECoG signal. It is usually extracted by a Gaussian low-pass filter, a running average [31, 54, 60], or a Savitzky-Golay filter [59, 61, 62].
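Of the extraction methods listed above, the running average is the simplest to sketch. The example below is an illustration only: the 1 kHz sampling rate, the 100 ms window and the synthetic signal (a 2 Hz component standing in for the LMP plus an 80 Hz component standing in for high-frequency activity) are assumptions, not parameters from the cited studies.

```python
import math


def running_average(signal, window):
    """Causal moving average over the last `window` samples (shorter at the start)."""
    out = []
    acc = 0.0
    for i, x in enumerate(signal):
        acc += x
        if i >= window:
            acc -= signal[i - window]   # drop the sample that left the window
        out.append(acc / min(i + 1, window))
    return out


fs = 1000                      # assumed sampling rate in Hz
window = fs // 10              # assumed 100 ms averaging window
raw = [math.sin(2 * math.pi * 2 * n / fs) + 0.5 * math.sin(2 * math.pi * 80 * n / fs)
       for n in range(2 * fs)]
lmp = running_average(raw, window)
```

In this sketch the window spans a whole number of 80 Hz cycles, so the fast component averages out and the output follows the slow 2 Hz component with a small attenuation and delay. A Gaussian or Savitzky-Golay filter would trade this simplicity for a smoother frequency response.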
Due to the robustness of the LMP signal, a simple linear regression was sufficient to decode the motor intention in many previous studies (e.g. [52, 63, 64]), although a feature selection or regularization step may be needed first to remove uninformative features. A recent study using deep neural networks also shows promise [65]; however, the improvement over classical techniques is not always significant.
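A linear decoder of the kind mentioned above can be sketched with ordinary least squares. The example is deliberately reduced to a single feature and uses made-up numbers; real decoders regress on many features across channels and are evaluated on held-out data, so the names, the data and the use of the correlation coefficient as a score are assumptions for illustration.

```python
def fit_linear(x, y):
    """Ordinary least squares for y ≈ slope * x + intercept (single feature)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5


# Hypothetical data: one LMP feature value and the measured position per time bin.
lmp_feature = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
position = [1.0, 3.1, 4.9, 7.2, 9.0, 10.8]

slope, intercept = fit_linear(lmp_feature, position)
predicted = [slope * f + intercept for f in lmp_feature]
r = pearson_r(predicted, position)
```

Here the fit and the score use the same samples only to keep the sketch short; in practice the correlation between predicted and actual movement would be computed on data not used for fitting.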
Because ECoG has a better resolution and a higher signal-to-noise ratio, it tends to produce better and finer results than EEG in motor decoding. Besides decoding the movement of different body parts as in EEG [66, 67], different hand gestures can also be distinguished [57, 68]. Using the LMP in addition to frequency features, the position and velocity of 2D arm movement can also be decoded from ECoG signals [31, 52, 59]. Subsequent studies even demonstrate that continuous finger positions can be decoded [55, 60, 62, 64, 65, 69]. The correlation coefficient between the predicted and actual finger movement ranges from 0.4 to 0.7 in some of the recent studies [62, 65].
The large majority of studies in ECoG motor decoding are performed on epilepsy patients without a specific movement disorder or limb injury. However, one of the strongest motivations for motor decoding is that it can compensate for the lost motor function of a patient. Given that the brain may reorganize due to disease or injury, it is vitally important that the decoding experiments be repeated in those patient populations as well, to see whether similar decoding performance can be achieved. Only a few studies have attempted ECoG motor decoding in stroke patients [58, 70] and paralyzed subjects [71], but the results are encouraging.
Intra-cortical recordings
Penetration into the cortical tissue offers the closest proximity to the neurons and produces the most precise signal. Since the discovery of the directional tuning property of neurons in the motor cortex [72], many studies have tried to decode motor intention from intracortical recordings, first in non-human primates (NHP), then in human subjects in recent years. Our review will focus on intracortical decoding in humans, as it presents some unique challenges compared to NHP and is also where the technology will ultimately be applied.
Penetrating electrodes for motor decoding are usually implanted into the primary motor area of the brain. There is a structure in the precentral gyrus resembling a “knob” that houses a majority of the neurons responsible for motor hand function [73]. This “motor hand knob” is typically used as the target for electrode implantation (e.g. in [74–78]). Another potential target for implantation is the posterior parietal cortex (PPC). Although the PPC has long been proposed to play an important role in associative functions, in recent years more and more evidence suggests that it also encodes the high-level motor intention of the subject [79]. A recent study suggests that the goal and trajectory of the movement can be decoded from neural activities in human PPC [80].
One important property exhibited by the neurons in M1 is directional tuning. Some of the neurons there are broadly tuned to a particular direction. They discharge most strongly when the movement is in their preferred direction, but they also discharge, less vigorously, when the movement is in other directions. Their firing rates represent the length of their preferred direction vectors. When the vectors of those neurons are summed together, the result indicates the final direction of the movement. This population encoding of movement is a striking property of the nervous system. A similar form of population encoding can also be found in the superior colliculus, representing the direction of eye movement [81]. An example of the directional tuning property of M1 in a non-human primate is shown in Fig. 3.
Currently, the only FDA-approved, commercially available microelectrode array for temporary (< 30 days) intracortical recordings is the Neuroport System (Blackrock Microsystems, Inc, USA). As a result, the majority of the work on human intracortical decoding is performed on that platform. Other intracortical electrodes do exist, but they are either mainly for acute intraoperative monitoring (e.g. Spencer Depth Electrode, Ad-Tech; NeuroProbes, Alpha Omega Engineering Ltd; microTargeting electrodes, FHC), or for EEG applications (e.g. DIXI Medical Microdeep Depth Electrodes).
The activities of the neurons at the implanted site are represented by their action potentials, which manifest as spikes in extracellular recordings. Therefore, detecting the occurrence of a spike is often the first step in intracortical signal processing. There are many methods for spike detection [82, 83]. The signal is typically first band-pass filtered in the spike frequency band (e.g. 300-5000 Hz), then various methods are used to transform the filtered signal to improve its signal-to-noise ratio (SNR). A detection threshold is then calculated to distinguish spikes from background noise.
One of the most common spike detection methods is to use the root-mean-square (RMS) of the signal

$$\mathrm{Thres} = C \sqrt{\frac{1}{N}\sum_{n=1}^{N} x[n]^{2}}$$

where C is a scaling constant and Thres represents the detection threshold above which a signal time point is considered to belong to a spike. However, the RMS value may be easily contaminated by artifacts, so another way is to use the median to set the detection threshold [84]

$$\sigma = \mathrm{median}\left(\frac{|x|}{0.6745}\right), \qquad \mathrm{Thres} = 4\sigma$$

The non-linear energy operator is another popular method [84]. It first transforms the signal such that the high-frequency component is amplified to improve the SNR.

$$\psi(x[n]) = x[n]^{2} - x[n+1]\,x[n-1]$$

$$\mathrm{Thres} = C\,\frac{1}{N}\sum_{n=1}^{N}\psi(x[n])$$
Other more advanced techniques like the continuous wavelet transform [85] and EC-PC spike detection [83] can offer better accuracy but at a higher computational cost. Although there are many ways to detect spikes accurately offline, not all of them are fast enough to be used in real time. Therefore, in online decoding the choices are usually limited to the simpler algorithms. Manual threshold setting by an operator remains one of the most commonly used methods. Another popular method in online decoding is the RMS method, due to its high efficiency.
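The three simple thresholds discussed in this section (RMS, median and non-linear energy operator) can be sketched in a few lines of Python. This is a minimal illustration on an already band-pass filtered signal; the function names, the default scaling constants and the toy signal are assumptions made for this example only.

```python
import math
import statistics

def rms_threshold(x, c=4.0):
    """Thres = C * sqrt(mean(x[n]^2)); c = 4 is an assumed default."""
    return c * math.sqrt(sum(v * v for v in x) / len(x))

def median_threshold(x, c=4.0):
    """Thres = C * sigma with sigma = median(|x| / 0.6745); the text uses C = 4."""
    sigma = statistics.median(abs(v) / 0.6745 for v in x)
    return c * sigma

def neo(x):
    """psi(x[n]) = x[n]^2 - x[n+1] * x[n-1], evaluated at interior samples only."""
    return [x[n] ** 2 - x[n + 1] * x[n - 1] for n in range(1, len(x) - 1)]

def neo_threshold(x, c=8.0):
    """Thres = C * mean(psi); c = 8 is an arbitrary placeholder value."""
    psi = neo(x)
    return c * sum(psi) / len(psi)

def upward_crossings(values, threshold):
    """Indices at which `values` rises above `threshold`."""
    return [n for n in range(1, len(values))
            if values[n] > threshold >= values[n - 1]]

# Toy signal (assumed): low-amplitude alternating noise with one large spike.
x = [0.1 if n % 2 == 0 else -0.1 for n in range(100)]
x[40] = 5.0
rectified = [abs(v) for v in x]
spikes_rms = upward_crossings(rectified, rms_threshold(x))
spikes_median = upward_crossings(rectified, median_threshold(x))
# neo() drops the first sample, so shift its indices back by one.
spikes_neo = [n + 1 for n in upward_crossings(neo(x), neo_threshold(x))]
```

In this toy case the single large spike inflates the RMS threshold far more than the median threshold, which illustrates why the median estimate is considered more robust to artifacts.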
An electrode may record signals from multiple nearby neurons. Isolating the activity of a single neuron (i.e. single-unit activity) from this multi-unit activity usually leads to better results in motor decoding. This process is called spike sorting. There is a large body of literature on spike sorting that cannot be covered exhaustively here. Interested readers are encouraged to consult other excellent reviews [86–88]. In practice, the most popular way to do online, real-time spike sorting is via template matching. A set of spike templates is collected during a period of initial recording, then subsequent spikes are classified by comparing their similarity with the templates. However, online spike sorting may not be really necessary, and may even degrade the decoding result. The spike clusters obtained from recordings may not be stable across different experimental sessions. The total number of single units sorted from a recording may change from session to session [80]. Thus a decoder trained on some sorted spikes may not work well in future sessions. Spike sorting may also introduce additional latency in online decoding, as accurate spike sorting is a computationally expensive process. In fact, many recent decoding studies do not use spike sorting at all, e.g. [80, 89–95].
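The template-matching step described above can be illustrated with a short sketch. Squared Euclidean distance is assumed here as the similarity measure, and the template waveforms and unit labels are invented for the example; real systems may use other measures and a rejection criterion for waveforms that match no template.

```python
def classify_spike(waveform, templates):
    """Return the label of the template closest to `waveform`.

    `templates` maps a unit label to a waveform of the same length.
    Closeness is measured by squared Euclidean distance (an assumption).
    """
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(templates, key=lambda label: squared_distance(waveform, templates[label]))

# Hypothetical templates collected during an initial recording period.
templates = {
    "unit_a": [0.0, -1.0, -3.0, -1.0, 0.0],
    "unit_b": [0.0, 1.0, 3.0, 1.0, 0.0],
}
label = classify_spike([0.0, -0.8, -2.7, -1.2, 0.1], templates)
```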
A decoding algorithm reconstructs motor kinematics from neural activity. Following the discovery of the directional tuning property of motor neurons, one of the earliest decoding algorithms for intracortical spike signals was the population vector algorithm [96, 97]. In its simplest form, the firing rate of a neuron can be related to its preferred direction by

$$f = f_0 + f_{max}\cos(\theta - \theta_p)$$

where f is the neural firing rate, f0 and fmax are regression constants, and θ and θp are the current and preferred directions respectively. However, for the cosine function the width of the modulation is fixed. A more flexible tuning function that allows an adjustable modulation width is the von Mises tuning function [98]:

$$f = b + k\exp(\kappa\cos(\theta - \mu))$$

where b, k, κ, µ are the regression constants, and θ is the current movement direction. When µ = θ, the function is at its maximum, so µ can also be interpreted as the preferred direction of the neuron. Examples of von Mises tuning curves are shown in Fig. 3b.
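To make the difference between the two tuning functions concrete, the sketch below evaluates both and compares a broad and a narrow von Mises curve. The parameter values are arbitrary assumptions chosen for illustration, with k and κ taken as positive.

```python
import math

def cosine_tuning(theta, f0, fmax, theta_p):
    """f = f0 + fmax * cos(theta - theta_p)."""
    return f0 + fmax * math.cos(theta - theta_p)

def von_mises_tuning(theta, b, k, kappa, mu):
    """f = b + k * exp(kappa * cos(theta - mu)); larger kappa gives narrower tuning."""
    return b + k * math.exp(kappa * math.cos(theta - mu))

# Sample both curves at 1-degree steps with the preferred direction at 180 degrees.
thetas = [2 * math.pi * i / 360 for i in range(360)]
broad = [von_mises_tuning(t, b=5.0, k=2.0, kappa=0.5, mu=math.pi) for t in thetas]
narrow = [von_mises_tuning(t, b=5.0, k=2.0, kappa=4.0, mu=math.pi) for t in thetas]

def relative_response(curve, index, peak_index=180, baseline=5.0):
    """Response above baseline at `index`, as a fraction of the peak response."""
    return (curve[index] - baseline) / (curve[peak_index] - baseline)
```

Comparing `relative_response(broad, 225)` with `relative_response(narrow, 225)` shows how quickly each curve falls off 45 degrees away from the preferred direction, which is the adjustable width that the cosine function lacks.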
The preferred directions of the neurons can then be summed together to predict the target direction [97].

$$P(M) = \sum_{i=1}^{N} w_i(M)\,C_i$$

where Ci is the preferred direction of the i-th neuron, and wi(M) is the weighting function that scales the contribution of each neuron for movement direction M to the final population vector. However, this method requires a large number of neurons to be accurate and may lead to errors if the distribution of the preferred directions is not uniform [99]. For example, in Fig. 3c we can see that the preferred directions are not distributed evenly. For this reason, a simple linear regression scheme is usually employed instead in recent studies [74],
$$u = Rf = R\,(R^{T}R)^{-1}R^{T}k$$
where R is the neural response matrix (e.g. firing rates), f is the linear filter (or the regression constants), k is the motor kinematic values (e.g. joint angles or cursor positions) and u is the reconstructed kinematics. It has been suggested that this regression scheme can provide more accurate predictions than the summation of preferred direction vectors, especially when those vectors are not uniformly distributed [99].
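The following NumPy sketch contrasts the population vector with the least-squares linear filter on synthetic, noise-free cosine-tuned neurons whose preferred directions are deliberately non-uniform. All numerical values are assumptions, the weighting w_i is taken to be the normalized rate modulation, and `numpy.linalg.lstsq` is used as a numerically safer stand-in for the explicit (R^T R)^{-1} R^T k formula, which would fail here because the synthetic response matrix is rank-deficient.

```python
import numpy as np

def cosine_rates(theta, preferred_dirs, baseline, depth):
    """Firing rates of cosine-tuned neurons for movement direction theta."""
    return baseline + depth * np.cos(theta - preferred_dirs)

def population_vector(rates, preferred_dirs, baseline, depth):
    """P = sum_i w_i * C_i with w_i = (rate_i - baseline) / depth (assumed weighting)."""
    weights = (rates - baseline) / depth
    unit_vectors = np.column_stack([np.cos(preferred_dirs), np.sin(preferred_dirs)])
    return weights @ unit_vectors

def fit_linear_filter(R, k):
    """Least-squares filter f such that R @ f approximates k."""
    filt, _, _, _ = np.linalg.lstsq(R, k, rcond=1e-10)
    return filt

# Six hypothetical neurons with unevenly distributed preferred directions (rad).
preferred = np.array([0.0, 0.3, 0.6, 0.9, 2.0, 4.0])
baseline, depth = 10.0, 5.0

# Training set: 16 movement directions, responses plus a constant bias column.
train_dirs = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
R = np.array([np.append(cosine_rates(t, preferred, baseline, depth), 1.0)
              for t in train_dirs])
k = np.column_stack([np.cos(train_dirs), np.sin(train_dirs)])
filt = fit_linear_filter(R, k)

# Decode an unseen direction with both methods.
test_dir = 2.5
test_rates = cosine_rates(test_dir, preferred, baseline, depth)
u = np.append(test_rates, 1.0) @ filt
pv = population_vector(test_rates, preferred, baseline, depth)
angle_regression = np.arctan2(u[1], u[0])
angle_population = np.arctan2(pv[1], pv[0])
```

In this idealized setting the regression recovers the test direction almost exactly, while the population vector is pulled towards the over-represented preferred directions.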
In recent years, the Kalman filter has often been employed instead of simple linear regression (e.g. in [76–78, 102, 103]). The Kalman filter combines information from an internal process model and actual measurements to estimate the states of a system [104]. A Kalman gain is used to determine the “mixing weight” of the model and measurements: the more accurate the model is relative to the measurements, the more the filter trusts the model, and vice versa. The Kalman filter is especially useful if the states are not directly observable or if the measurements are very noisy, both of which are often true in motor decoding. In motor decoding, the subjects have usually lost their limb or ability to move, therefore the internal state (e.g. motor intention) of the system is not directly observable. The observable variables (e.g. neural activity) are also very noisy. A typical Kalman filter for motor decoding assumes no control variable, and the system can be formulated as two linear equations [105, 106]:
$$\vec{x}_t = A\vec{x}_{t-1} + \vec{w}_{t-1}$$

$$\vec{y}_t = C\vec{x}_t + \vec{v}_t$$

where x is the state of the system one wants to decode, e.g. joint kinematics or cursor position, and y is the observed variable, e.g. neural firing rates. wt and vt are the process and measurement noises, drawn from wt ∼ N(0, Q) and vt ∼ N(0, R) respectively. A, C, Q and R are the Kalman constants that need to be defined according to the decoding model. If the internal state x is a cursor position, it can be expressed as

$$x_t = [\mathrm{pos}_t,\ \mathrm{vel}_t,\ 1]^{T}$$

With the model defined, the Kalman gain K and the estimation error covariance P can then be updated with the typical two-step update equations:

Predict:

$$x_t^{-} = A x_{t-1} + B u_t$$

$$P_t^{-} = A P_{t-1} A^{T} + Q$$

Update:

$$K_t = P_t^{-} C^{T}\left(C P_t^{-} C^{T} + R\right)^{-1}$$

$$x_t = x_t^{-} + K_t\,(y_t - C x_t^{-})$$

$$P_t = (I - K_t C)\,P_t^{-}$$

where x− and x are the a priori and a posteriori state estimates respectively, and u is the control variable. It is typically set to 0 in motor decoding; we have included it here for completeness.
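The predict and update equations can be written as a single step function. The sketch below is a minimal NumPy version with the control term set to zero, followed by a small simulation; the two-dimensional state [pos, vel] (without the constant 1 used in the text), the observation matrix and all noise levels are assumptions chosen for illustration and do not come from any cited decoder.

```python
import numpy as np

def kalman_step(x_prev, P_prev, y, A, C, Q, R):
    """One predict/update cycle of the Kalman filter (no control input)."""
    # Predict: propagate the state and its error covariance through the model.
    x_prior = A @ x_prev
    P_prior = A @ P_prev @ A.T + Q
    # Update: weigh the prediction against the new measurement.
    S = C @ P_prior @ C.T + R
    K = P_prior @ C.T @ np.linalg.inv(S)
    x_post = x_prior + K @ (y - C @ x_prior)
    P_post = (np.eye(len(x_prev)) - K @ C) @ P_prior
    return x_post, P_post

# Simulated example: 8 hypothetical units linearly tuned to [position, velocity].
rng = np.random.default_rng(0)
dt = 0.05
A = np.array([[1.0, dt], [0.0, 1.0]])   # position integrates velocity
C = rng.normal(size=(8, 2))             # assumed tuning matrix
Q = 1e-4 * np.eye(2)                    # process noise covariance
R = 0.5 * np.eye(8)                     # measurement noise covariance

x_true = np.array([0.0, 1.0])
x_est, P_est = np.zeros(2), np.eye(2)
for _ in range(100):
    x_true = A @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    y = C @ x_true + rng.multivariate_normal(np.zeros(8), R)
    x_est, P_est = kalman_step(x_est, P_est, y, A, C, Q, R)
```

In an actual decoder, A, C, Q and R would be fitted from training data rather than chosen by hand as they are here.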
One crucial aspect of performing online motor decoding is the training and re-calibration of the decoding model. Although the neural features for similar movements are relatively stable within a few days [107], the neural tuning curve may start to change when the subject is learning to perform a new task [108]. It is also very difficult to track the same neuron for an extended period of time [109, 110], due to the micro-movement of electrodes and fluctuations of other noise sources. Furthermore, training data are often acquired in an open-loop fashion, meaning that no feedback is provided by the decoder during training. However, in an actual decoding session, feedback is provided and the subject may attempt to change his or her motor imagery in order to “learn” the decoder. This may lead to changes in the underlying neural features [111]. Therefore, re-calibration of the trained model is often necessary, and ideally it should be performed online. A successful re-calibration method is the ReFIT-KF algorithm proposed by Gilja et al. [112]. ReFIT-KF assumes that the subject's true intention is to move towards the target, so it can generate a pseudo-ground truth from the decoded result automatically even though the prediction of the current model may be wrong. It can then re-calibrate the model using the estimated ground truth to adapt to the instability of the neural signals. It is able to produce better results than the Kalman filter alone [93, 94, 112].
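The pseudo-ground-truth idea can be sketched as a small helper that re-aims a decoded 2D velocity at the target. This is only an illustration of the assumption that the subject intends to move towards the target; keeping the decoded speed and setting the velocity to zero once the cursor is within the target are modelling choices assumed here, and the function name and arguments are hypothetical.

```python
import math

def intended_velocity(cursor, target, decoded_velocity, target_radius=0.0):
    """Rotate a decoded 2D velocity so that it points from the cursor to the target.

    The speed of the decoded velocity is preserved. Inside the target the
    intended velocity is taken to be zero (assumed modelling choice).
    """
    dx, dy = target[0] - cursor[0], target[1] - cursor[1]
    distance = math.hypot(dx, dy)
    if distance <= target_radius:
        return (0.0, 0.0)
    speed = math.hypot(decoded_velocity[0], decoded_velocity[1])
    return (speed * dx / distance, speed * dy / distance)

# The decoder moved the cursor straight down, but the target is up and to the right.
pseudo_truth = intended_velocity(cursor=(0.0, 0.0), target=(3.0, 4.0),
                                 decoded_velocity=(0.0, -10.0))
```

Pairs of recorded neural activity and such re-aimed velocities could then serve as the training set for re-fitting the decoder parameters.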
Due to the more robust signals obtained by intracortical recordings, they have been utilized successfully to help tetraplegic patients control their environment in various ways, including 2D cursor control [74, 77, 95], virtual and real prosthetic hands [78, 80, 93, 113, 114] and functional electrical stimulation of the patients’ own paralyzed hands [91, 92, 94].
Peripheral decoding of limb movements
Signals from the central nervous system (CNS) eventually arrive at the peripheral nervous system (PNS) and drive the contraction of different muscle fibers. Compared to the CNS, signals in the peripheral structures are usually more specific. They contain detailed instructions on the contractions of individual muscle fibers, and can therefore potentially enable dexterous prosthetic control. The surgery involved in peripheral interfaces is usually less complicated than that involving intracortical structures. Therefore, many studies are also devoted to motor decoding in the peripheral structures.

Figure 3 Examples of directional tuning in intra-cortical signals. Diagrams showing the directional tuning properties of the neurons in non-human primate M1 from the data in [100, 101]. (a) Spike raster plots from one of the neurons (Neuron 31). Each plot shows the spike timing of the neuron aligned to the time point (t=0) at which the movement speed of the hand exceeds a pre-defined threshold. Each dot in the plot represents an action potential. Different plots indicate the neuronal activity when the hand is moving in different directions. (b) The von Mises tuning curves of some representative neurons (x-axis: direction in rad; y-axis: firing rate in Hz). (c) The preferred directions of all the neurons. The length of the vector represents the modulation depth of the neuron, here defined as the magnitude of the tuning curve divided by the angle between the maximum and minimum point on the curve.
Peripheral nerve recordings
Peripheral nerves contain the low-level neural signals sent to activate the contraction of specific muscles. Previous studies on peripheral neural recording mainly focused on afferent sensory information because it is not easy to obtain efferent signals in anesthetized animals [115]. However, in recent years, more studies have explored the possibility of decoding efferent peripheral nerve signals for prosthetic control. Because the peripheral nerves contain low-level information targeting each muscle, it may be possible to regain high-dexterity and naturalistic control by exploiting this rich information.
One of the major challenges in peripheral nerve recordings is accessing the axons in the nerves. Axons in spinal nerves are bundled in fascicles, and multiple fascicles are grouped together to form a peripheral nerve. Those axons are enclosed in three sheaths of connective tissue: the epineurium that covers the entire nerve, the perineurium that encloses a fascicle, and the endoneurium that holds the neurons and blood vessels together within a fascicle. Due to these multiple layers of lamination around an axon, the amplitude of a peripheral nerve signal is usually very small, typically around 5–20 µV [115].
There are multiple electrode configurations designed to get a better signal from the peripheral nerves [116]. The cuff electrode [117], as its name suggests, works like a cuff that wraps around a nerve. Its main advantage is that it causes minimal damage to the neural tissue, as it does not require any incision in the nerve itself. However, since it only measures the electrical potential at the surface of a nerve, it can only obtain a grand summation of the neural activity in different fascicles. A variation of the cuff electrode is the flat interface nerve electrode (FINE) [118]. It works like a clip that applies pressure on the nerve and flattens it into an oval shape, thus increasing its surface area and reducing the distance from the electrode to the fascicles. There are also other types of electrodes that are implanted into the nerves. They offer higher selectivity due to their direct contact with the fascicles. However, they are also more invasive and may cause more damage to the nerve. The longitudinal intrafascicular electrodes (LIFE) are long, thin wires implanted longitudinally into the nerve fascicles [119]. On the other hand, the transverse intrafascicular multichannel electrodes (TIME) are implanted transversely into the nerves, accessing multiple fascicles at the same time. There is also the Utah Slanted Electrode Array [120], which consists of an array of electrodes of different lengths, such that when the array is inserted into the nerve, the electrode tips come into contact with different fascicles. Recently, the regenerative peripheral nerve interface (RPNI) [121] has also been developed, which uses a muscle graft wrapped around severed fascicle endings. The nerve endings grow into and innervate the graft, creating a new interface for acquiring neural signals. Of the different types of electrodes introduced, only the cuff electrode is currently used in commercial FDA-approved systems, for vagus nerve stimulation (e.g. VNS Therapy, Cyberonics, USA). Most of the others are still in research or undergoing clinical trials [122].
Human studies on the decoding of peripheral signals are still very limited, partly due to the challenge of acquiring nerve signals with sufficient SNR, and possibly also due to the cross-talk between neural signals and EMG, as the peripheral nerves are usually located in close proximity to the limb musculature. The majority of existing studies focus on upper-limb decoding, as upper-limb amputation tends to have a bigger impact on the everyday life of the patients. Neural recordings are performed on the ulnar, median and/or radial nerves. Different types of electrodes are used, but the more common ones in human decoding are the Utah Slanted Electrode Array (e.g. in [123, 124]) and the LIFE (e.g. [125–127]).
The analysis of peripheral signals commonly involves the detection of action potentials in the nerve. The detection procedures are similar to those used in intracortical studies, but the step of clustering spikes is not usually performed. Due to the low SNR of the peripheral signals, sometimes they need to be de-noised first (e.g. by wavelets [127]) before detection. The firing rate of the action potentials can then be fed into a regressor (e.g. in [106, 123–125]) or a classifier (e.g. in [126, 127]) for decoding. The choice between a regressor and a classifier depends on whether a continuous joint trajectory or a discrete gesture is to be decoded.
The support-vector machine (SVM) is the most commonly used classifier for peripheral decoding (e.g. in [126, 127]). For regression, simple linear regression or a Kalman filter has been used [106, 123–125]. The Kalman filter allows online recursive update of the model in real time, and is especially helpful when the measurement of the target variable is noisy (as is often the case in motor decoding, since it is not possible to measure the actual movement of the missing limb).
The issue of obtaining ground truth for training the decoder is also very important. While for discrete grasp-type classification it may be sufficient to ask the subject to imagine holding a particular grasp, for position decoding a more precise approach has to be used. One common solution is to show a shadow hand on a screen and ask the subject to try to follow the movement of the hand, either through a manipulandum controlled by the mirrored movement of the intact hand [124] or through imagined phantom limb movements only.
Currently, the performance of human peripheral nerve decoding is still not very satisfactory, partly due to the difficulty in obtaining clear signals and to EMG cross-talk. In discrete grasp classification, a 4-class classification task with 3 grasps (power grip, pinch grip, flexion of the little finger) and rest has obtained 85% accuracy [127], but state-of-the-art surface electromyogram (EMG) can already distinguish between 7 gestures [128]. Regression-based decoding enables proportional control of a prosthetic hand, and hence can be more intuitive. Decoding based on the Kalman filter is able to decode 13 different movements offline, but only 2 movements can be decoded online successfully due to the cross-talk between different degrees of freedom (DoFs) [124].
The peripheral nerves offer a promising target for motor decoding. They are further downstream in the motor control pathway and contain more specific information about muscle activities. This property can potentially be exploited to enable high-dexterity control. Access to peripheral nerves is also relatively easier than access to intracortical structures. However, peripheral recordings are plagued by low SNR due to the multiple levels of lamination around an axon. This may be improved by better electrode designs and by ultra-low-noise neural amplifiers that can resolve the small amplitude of the nerve signals (e.g. [129]).
Electromyogram (EMG)
EMG signals are the sum of the electrical activities of the muscle fibers, which are triggered by spike trains, i.e. impulses of activation from the innervating motor neurons. EMG signals can be measured in two ways, either on the surface of the skin above a muscle (surface EMG), or directly inside a muscle using a needle electrode (intramuscular EMG). An example of EMG data for different hand gestures is shown in Fig. 4.
Myoelectric signals have been used as the control source in prostheses for decades: muscle signals are recorded and translated into control commands to induce prosthesis motions. Intramuscular EMG signals are believed to have a higher resolution and to be less susceptible to cross-talk than surface EMG because of the more invasive electrode deployment and direct targeting of specific muscles.
Despite decades of research and development, amputees still do not use state-of-the-art myoelectric prostheses more frequently than basic, body-powered hooks [131], and an estimated 40% of upper-limb amputees actually reject using a prosthesis [132]. One primary limitation of clinically available hand prostheses is the number of simultaneously and proportionally controllable degrees of freedom (DoFs), which is rarely greater than 2 [133, 134]. Control has also focused mostly on wrist DoFs without the hand [135], although hand-movement functions are more essential for daily living.
Myoelectric control can be categorized into direct control and pattern recognition control. Direct control refers to methods that use the amplitude of two surface EMG inputs from an antagonistic muscle pair to control the two directions (ON and OFF) of a prosthetic DoF. Due to the inadequate remaining musculature, signal cross-talk contamination, and attenuation of deep muscle signals at the skin level, the number of independent myosites in the residual forearm is typically limited to two, only allowing the control of one DoF at a time. As a result of this constraint, patients need to toggle between modes using quick co-contraction at the myosites to control multiple DoFs sequentially. Pattern recognition control relies on machine learning algorithms to train a separate classifier for each DoF. Multiple classifiers have been proposed and evaluated, including quadratic discriminant analysis [136], support vector machines [137], artificial neural networks [138], hidden Markov models [139], Gaussian mixture models [140], and more. However, as the training of the computational models involves the movement of only one DoF, the trained classifiers do not support simultaneous control of multiple DoFs. A more promising machine learning approach is to adopt a regression-based control scheme (instead of classification) that inherently facilitates continuous control (as opposed to ON and OFF), in which a linear or nonlinear mapping from EMG signal features to the changes of prosthesis DoFs is learned. Commonly used methods for this purpose include artificial neural networks [141], support vector machines [142], and kernel ridge regression [135]. A major shortcoming of regression-based control is the requirement for a large amount of training data that includes an exhaustive combination of movements of all prosthesis DoFs, which is impractical to implement clinically.
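The sequential nature of direct control can be illustrated with a toy state machine. In this sketch two EMG amplitude envelopes from an antagonistic pair drive one DoF at a time, and a co-contraction switches the active DoF. The class name, the amplitude threshold, the DoF labels and the use of +1/-1 for the two directions are all assumptions for illustration and do not describe any particular commercial controller.

```python
class DirectController:
    """Toy two-site direct controller with co-contraction mode switching."""

    def __init__(self, dofs, threshold=0.2):
        self.dofs = list(dofs)          # e.g. ["hand", "wrist"]
        self.threshold = threshold      # assumed activation threshold
        self.active = 0                 # index of the DoF currently controlled
        self._was_cocontracting = False

    def step(self, flexor_amp, extensor_amp):
        """Return (active DoF, command) with command in {-1, 0, +1}."""
        flexor_on = flexor_amp > self.threshold
        extensor_on = extensor_amp > self.threshold
        if flexor_on and extensor_on:
            # Switch mode once per co-contraction, not on every sample.
            if not self._was_cocontracting:
                self.active = (self.active + 1) % len(self.dofs)
            self._was_cocontracting = True
            return self.dofs[self.active], 0
        self._was_cocontracting = False
        if flexor_on:
            return self.dofs[self.active], 1
        if extensor_on:
            return self.dofs[self.active], -1
        return self.dofs[self.active], 0

controller = DirectController(["hand", "wrist"])
commands = [controller.step(0.5, 0.0),   # drive the first DoF
            controller.step(0.5, 0.5),   # co-contraction: switch DoF
            controller.step(0.6, 0.6),   # still co-contracting: no second switch
            controller.step(0.0, 0.5)]   # drive the second DoF the other way
```

Only one DoF ever receives a non-zero command at a time, which is the limitation that regression-based schemes aim to remove.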
One of the fundamental issues with EMG-based prosthesis control is the scarcity of independent signals with which to control prosthesis DoFs. EMG signals are inherently heavily correlated and lack the resolution and the information capacity needed for simultaneous and proportional control of multiple DoFs. A potential solution to this problem is to record motor commands directly from the peripheral nerves, such as the ulnar and median nerves that directly innervate all five fingers. However, this comes at the cost of invasive surgical implantation of electrodes and the risks of tissue infection and nerve damage.
There have been efforts to extract more invariant and independent information from EMG signals without invasive recordings. One major group of efforts focuses on extracting muscle synergy features from EMG recordings, i.e., the complex muscle activation patterns that are executed by users as high-level control inputs regardless of any neurological origin [143]. Muscle synergies are believed to be capable of describing complex force and motion patterns in reduced dimensions and can be used as a robust representation for decoding outputs consistent with the user’s intent. Non-negative matrix factorization (NMF) [144] has been commonly used to extract muscle synergies from multichannel EMG signals for simultaneous and proportional control of multiple DoFs [141, 145–147]. Another group of works focuses on directly extracting the neural codes of motor neuron activities that govern the muscle movements through the nerve pathway. This normally requires advanced recording setups such as high-density EMG with a sufficient number of closely spaced recording sites. A number of algorithms have been proposed to extract the underlying neural information [148, 149]. Among them, convolution kernel compensation (CKC), a type of multichannel blind source separation method, has been most extensively used [150–153]. Despite the promise of extracting neural content from high-density EMG signals, demonstrating such a scheme in online experiments remains difficult. More in-depth investigation and significant efforts are needed to build a neural interface and achieve direct neural-based control based on this framework.
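As an illustration of synergy extraction, the sketch below factorizes a non-negative matrix of EMG envelopes V (channels by time samples) into synergy weights W and activations H using the standard multiplicative update rules for the squared-error NMF objective. The synthetic data, the number of synergies and the iteration count are assumptions chosen for the example; the cited studies may differ in preprocessing and in the NMF variant used.

```python
import numpy as np

def nmf(V, n_synergies, n_iter=1000, seed=0, eps=1e-9):
    """Factorize V (channels x samples) as W @ H with non-negative W and H."""
    rng = np.random.default_rng(seed)
    n_channels, n_samples = V.shape
    W = rng.random((n_channels, n_synergies)) + eps
    H = rng.random((n_synergies, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic envelopes (assumed): 6 channels driven by 2 underlying synergies.
true_W = np.array([[1.0, 0.0], [0.8, 0.1], [0.6, 0.3],
                   [0.3, 0.6], [0.1, 0.8], [0.0, 1.0]])
t = np.linspace(0.0, 2.0 * np.pi, 200)
true_H = np.vstack([np.maximum(np.sin(t), 0.0), np.maximum(np.cos(t), 0.0)])
V = true_W @ true_H

W, H = nmf(V, n_synergies=2)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The rows of H can then be used as low-dimensional control signals, with each column of W describing how strongly a synergy recruits each channel.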
Decoding of speech motor activities
Although this review mainly focuses on the decoding of movement in the extremities, recently there has also been another line of research on decoding motor speech activities [154, 155]. Speech production is a complex process involving multiple areas of the brain and dozens of muscles. The muscle activities need to be highly coordinated to produce different speech sounds (i.e. phonemes), which are concatenated to form intelligible words and sentences.
Multiple brain regions are associated with language production [156], but there are two major areas that have received more attention in speech decoding. The left ventral premotor cortex has been suggested to represent high-level phonemes in speech [157, 158], while the ventral sensorimotor cortex contains rich representations of different speech articulators (e.g. lips, tongue, larynx etc.) [159, 160]. Therefore most of the decoding efforts concentrate on these two brain regions.

Figure 4 Examples of EMG signal in different hand gestures. Diagram showing EMG signals from 12 surface electrodes in 3 different hand gestures. The original data are from [130]. (a) EMG signals from both able-bodied and amputee subjects. The last row shows the hand gestures performed for their respective EMG segments. (b) Locations of the 12 EMG electrodes (back and front views).
Historically, various neural signals have been exploited to decode speech. EEG is non-invasive, but its low signal-to-noise ratio and EMG contamination from facial muscles make it very difficult to use for decoding speech [155]. There has been some success in using multielectrode arrays to decode phonemes from multi-unit activities [161]. However, the cortical representation of speech articulators covers a large area that may not be suitable for the very localized recording region of a multielectrode array [160, 162]. Furthermore, speech decoding often requires overt speech to serve as the ground truth, and that requires the subjects to be capable of speaking clearly. It is difficult to justify implanting penetrating electrodes in the otherwise healthy eloquent cortex to conduct experiments. Currently, ECoG has achieved greater success in speech decoding due to its high signal quality and less invasive nature. ECoG recordings are also commonly employed during brain resection to avoid damage to the eloquent cortex, so it is well integrated into existing surgical procedures. Studies using ECoG for speech decoding mainly focus on the high gamma band (70-170 Hz), as it has been shown that high gamma activity correlates strongly with the ensemble firing rate [163].
Earlier speech decoding efforts focused on the direct decoding of simple words or phonemes [154, 161, 162, 164–166], but their performance was not very satisfactory. Decoding from a limited dictionary or phoneme set may produce a higher accuracy (e.g. >80% for 10 words [164] or 9 phonemes [161]), but it can only cover a very narrow range of human spoken expressions. Studies trying to decode the full range of English phonemes resulted in a lower classification accuracy (10-50% [154, 159, 166]). The low classification accuracy can be partly mitigated by incorporating a pronunciation dictionary and language model (e.g. in [154]), which can limit the output of the decoder to more probable words.
On the other hand, attention has recently shifted towards decoding the intermediate representation of speech (e.g. articulator movements) rather than decoding phonemes directly. Part of the shift may be motivated by the growing body of evidence suggesting that the speech motor cortex is able to generate differential activation patterns encoding the kinematics of speech articulators [160, 167–169]. Advances in deep learning have made the prediction of articulator trajectories from acoustic signals (i.e. acoustic-articulatory inversion) accurate enough to act as the ground truth for decoding, as the traditional approach of placing coils or magnets in the mouth for articulography is invasive and not compatible with neural recordings [170]. In one very recent study [171], a deep neural network is used to decode ECoG features into articulator trajectories. The trajectories are then decoded by another neural network into acoustic features (e.g. pitch, mel-frequency cepstral coefficients etc.), which are then converted to an audible voice using a voice synthesizer. Even mimed speech can be decoded, although with a lower accuracy. In another study [172], ECoG features are decoded into mel-scaled spectrograms directly using a neural network, then a neural network vocoder is used to convert the spectrograms into audible waveforms. These recent results show great promise for decoding human speech from ECoG signals.
Challenges and future direction
Although great strides have been made in decoding human motor intention, there are still some significant challenges that remain to be solved. One of the biggest challenges preventing the adoption of motor decoding outside the laboratory is the limited longevity of the decoding model. Typically, a calibration session is needed to collect data to train the decoding model, then the model is tested in subsequent sessions on the same day or the next few days. While this is acceptable in a scientific study due to the limited time and clinical resources available, in actual daily use the trained model must be able to maintain its performance for an extended period of time.
The limited longevity can be due to several reasons. The first is the instability of the electrode interface. Micro-movement of the electrodes may cause a shift in the feature space. If the decoder is not robust enough, this shift may result in a deterioration of the decoding performance. The second reason is the varying environmental noise injected into the acquired signals. Neural signals used for decoding usually have a very small amplitude and thus are susceptible to interference from environmental noise. A cell phone, a fluorescent lamp or other electrical appliances all inject various types of noise into the acquired signal. As the subjects perform various tasks in their daily lives, they may come under the influence of noise sources not covered in the training data set, which results in performance degradation. The third reason is the slow build-up of immune response at the electrode interface. Glial scars may encapsulate the electrode and increase its impedance [178]. Neurodegeneration as a result of the immune response will also lead to a weaker signal [179].
The model longevity problem is multifaceted and must be carefully addressed. First, a better electrode design can help secure the electrode onto its anchoring structure and reduce their relative movement. An implantable solution will also produce more stable features than one that requires dismantling and re-installation before every use (e.g. EEG and EMG). Second, the model should be trained with more robust features and tested in an environment typical of its everyday use. A shielded chamber may help acquire very clean signals that are good for the demonstration of a prototype. However, it is unlikely that the same quality of signals can be acquired in an everyday environment. Thus it is also important to consider how a decoder is tested rather than just looking at offline numerical metrics. Third, advances in electrode materials or special organic coatings can potentially reduce the immune response to the electrode [180]. A flexible instead of rigid electrode may also cause less neuronal damage and inflammation [181, 182].
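The effect of a feature-space shift on a fixed decoder, the first cause of limited longevity discussed above, can be illustrated with a minimal Python simulation. Everything in it is an assumption made for illustration: the two-dimensional synthetic features, the nearest-centroid decoder, the size of the shift and all function names are hypothetical and are not taken from any of the reviewed studies.

```python
import random

def make_trials(n, shift=0.0, seed=0):
    """Simulate 2-D features for two imagined movements (labels 0 and 1).

    `shift` is a hypothetical offset added to the first feature, standing in
    for a change at the electrode interface between sessions.
    """
    rng = random.Random(seed)
    trials = []
    for i in range(n):
        label = i % 2
        centre = -1.0 if label == 0 else 1.0
        x = (rng.gauss(centre, 0.5) + shift, rng.gauss(0.0, 0.5))
        trials.append((x, label))
    return trials

def fit_centroids(trials):
    """Train a nearest-centroid decoder: one mean feature vector per class."""
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}
    counts = {0: 0, 1: 0}
    for (x0, x1), label in trials:
        sums[label][0] += x0
        sums[label][1] += x1
        counts[label] += 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}

def predict(centroids, x):
    """Return the label whose centroid is closest to feature vector x."""
    def sqdist(c):
        return (x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2
    return min(centroids, key=lambda label: sqdist(centroids[label]))

def accuracy(centroids, trials):
    """Fraction of trials decoded correctly."""
    hits = sum(predict(centroids, x) == label for x, label in trials)
    return hits / len(trials)

# Train once (calibration session), then test with and without a shift.
decoder = fit_centroids(make_trials(400, seed=1))
acc_same_session = accuracy(decoder, make_trials(400, seed=2))
acc_after_shift = accuracy(decoder, make_trials(400, shift=2.0, seed=3))
print(f"same-session accuracy:        {acc_same_session:.2f}")
print(f"accuracy after feature shift: {acc_after_shift:.2f}")
```

In this toy setting the decoder itself is unchanged between the two tests; only the feature distribution moves, which is enough to push one class across the decision boundary.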
Table 1 Comparison of different methods for motor decoding.

| | EEG (cortical) | ECoG (cortical) | Intra-cortical (cortical) | Peripheral nerves (peripheral) | EMG (peripheral) |
|---|---|---|---|---|---|
| Decoding site | Scalp | On the surface of the brain | Penetrated into cortical tissues (e.g. PPC, M1) | Peripheral nerves (e.g. ulnar, median, radial nerves) | Muscles |
| Types of electrode | Disk electrodes | Flexible electrode array | | | |

The second challenge is how to account for the difference in features between open-loop training and closed-loop control. The training dataset is typically obtained in an open-loop fashion, meaning that the subjects are instructed to carry out a particular motor imagery without any feedback. However, in actual use the system will provide feedback to the subject based on the decoder outputs. When the decoder output is wrong, the subject may try to correct it deliberately, and that may lead to a discrepancy between the offline and online performance [183]. One of the solutions is to introduce a short calibration session with feedback at the beginning of the testing session, as in many EEG-based motor decoding studies. The original model is trained with an open-loop paradigm, then the model is further fine-tuned with feedback in the calibration session. However, this is only possible if a clear ground truth is available. When the ground truth is not available, e.g. in the case of a tetraplegic patient where it is very difficult to know the true intention of the subject, the ReFIT algorithm is another approach to address this problem [112]. The basic idea of the ReFIT algorithm is to construct a pseudo ground truth by assuming that the subject is constantly trying to correct the wrong output of the decoder. Thus the directional vector of the motor intention is taken to be always pointing towards the target from the current cursor position. Using this method, it is possible to train a decoder from scratch with as few as 3 minutes of data [95]. Online calibration with feedback can offer a more realistic prediction of how the decoder will perform in real life. This approach can also let the decoder quickly adapt to any shift in the feature space due to changes in the electrode interface or environmental noise. However, online calibration demands that the model can be updated quickly, which puts a constraint on the complexity of the decoding model. More research is needed to study how to update the decoder efficiently in real time.
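The pseudo ground truth used by ReFIT, as described above, can be sketched in a few lines of Python. This is a minimal illustration rather than the published implementation: the function names are hypothetical, the example is restricted to a 2-D cursor, and keeping the decoded speed while replacing only the direction is an assumption of this sketch, since the description above specifies only that the intended direction points from the cursor to the target.

```python
import math

def intended_velocity(cursor, target, decoded_velocity):
    """Estimate the intended velocity for one time step.

    The direction is replaced by the unit vector from the cursor to the
    target; the speed (length of the decoded velocity) is kept unchanged.
    Returns (0.0, 0.0) when the cursor is already on the target.
    """
    dx, dy = target[0] - cursor[0], target[1] - cursor[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    speed = math.hypot(decoded_velocity[0], decoded_velocity[1])
    return (speed * dx / dist, speed * dy / dist)

def build_pseudo_labels(cursor_path, decoded_velocities, target):
    """Build one pseudo ground-truth velocity per recorded time step."""
    return [
        intended_velocity(cursor, target, velocity)
        for cursor, velocity in zip(cursor_path, decoded_velocities)
    ]

# Example: the decoder moved the cursor straight up, but the target lies
# up and to the right, so the pseudo label is rotated towards the target.
labels = build_pseudo_labels(
    cursor_path=[(0.0, 0.0), (0.0, 5.0)],
    decoded_velocities=[(0.0, 5.0), (0.0, 5.0)],
    target=(3.0, 4.0),
)
print(labels)
```

The resulting velocity labels can then be paired with the neural features recorded at the same time steps to retrain the decoder, without requiring any externally measured movement.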
Besides advances in decoding algorithms, the development of new electrodes and neural amplifiers also plays a very important part in advancing motor decoding. Recent trends in electrode development mainly focus on improving four areas of electrode design: density, flexibility, biocompatibility and connectivity. Denser electrodes can improve the spatial resolution of neural recordings. High-density electrodes have been created from silicon wafers and carbon fiber monofilaments [184, 185]. Electrode materials with a flexibility closer to that of brain tissue can reduce neural damage and inflammatory response. Many flexible polymers have been used to make neural electrodes, including polyimide [186, 187], parylene [188], PDMS [189] etc. Biocompatibility is always an important issue in electrode design because inflammatory response and encapsulation deteriorate signal quality over time and undermine chronic neural recordings. Strategies to improve biocompatibility include using inert metals like gold or platinum, using flexible materials to reduce tissue damage, or coating the electrode with biocompatible materials like conducting polymers [190] and carbon nanotubes [191]. Read-out connections from the electrodes will also quickly become a problem as the density and number of electrodes continue to increase. Incorporating transistors into the electrodes directly to enable connection multiplexing is one of the ways to mitigate this problem [192, 193]. Readers interested in neural electrode designs are encouraged to consult other more in-depth reviews in this area [122, 176, 180, 181, 194].
The development of neural amplifiers also plays a very important role in advancing the science of motor decoding, as we first need to acquire a clear neural signal before any processing and decoding can be done. There are multiple lines of research trying to improve different aspects of amplifier design. Firstly, the power consumption of an amplifier can be reduced by resource sharing (e.g. one amplifier shared by multiple electrodes [195] or multiple amplifiers sharing one analog-to-digital converter [196]), power scheduling (e.g. switching off unused components [197], dynamically adjusting the amplifier parameters [198]), or reduction of the supply voltage [199]. Secondly, the channel count can be increased by multiplexing or by integrating amplifiers directly with the electrodes [195, 200]. Thirdly, the circuit noise can be reduced by trimming [201], chopping [202, 203], auto-zeroing [204] or frequency-shaping [205] etc. Fourthly, wireless transmission of power or data can be achieved by an inductive link [197, 206, 207], short-distance power harvesting [197, 208] or even ultrasound [209]. Finally, the functionality of the amplifier can also be expanded by integrating more signal processing on-chip, e.g. spike detection [207], spike sorting [210, 211] and data compression [212, 213]. Interested readers are encouraged to consult other more focused reviews in this area [214–217].
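As an illustration of the kind of on-chip processing mentioned above, the following Python sketch shows a simple amplitude-threshold spike detector. It is not the method of any cited design: the threshold rule (a multiple of a median-based noise estimate), the parameter values `k` and `refractory`, the function name and the synthetic test signal are all assumptions chosen for illustration.

```python
import random
import statistics

def detect_spikes(samples, k=4.0, refractory=30):
    """Return the sample indices of detected negative-going spikes.

    The noise level is estimated as median(|x|) / 0.6745, which approximates
    the standard deviation for Gaussian noise. A spike is reported when a
    sample falls below -k times this estimate, and further detections are
    suppressed for `refractory` samples afterwards.
    """
    noise_sigma = statistics.median(abs(s) for s in samples) / 0.6745
    threshold = -k * noise_sigma
    spike_indices = []
    last_spike = -refractory
    for i, s in enumerate(samples):
        if s < threshold and i - last_spike >= refractory:
            spike_indices.append(i)
            last_spike = i
    return spike_indices

# Synthetic signal: bounded background noise (uniform, so the demo is
# deterministic) plus three large negative deflections standing in for spikes.
rng = random.Random(0)
signal = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
for t in (500, 1500, 2500):
    signal[t] -= 12.0

detected = detect_spikes(signal)
print(detected)
```

In this example only three indices need to be transmitted instead of 3000 raw samples, which is the motivation for moving such processing onto the amplifier chip.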
Conclusions
Every year, a large number of patients suffer from various degrees of movement disability due to amputation or neurological disorders. Their everyday lives and work are severely affected. With modern neurotechnology, it is now possible to intercept and decode motor intention at different points along the neuromuscular control pathway and use that information to drive a prosthetic device to restore movement. In this paper, we have reviewed the various signal features and techniques used to decode motor intention in humans. Although motor decoding performance is improving steadily with advances in electrode configurations, neural amplifier designs and decoding algorithms, we are still very far from the goal of achieving naturalistic and dexterous control like that of our native limbs. The eventual successful clinical application of motor decoding will depend on the concerted efforts of both healthcare and engineering professionals, and will likely also need to be tailor-made according to the conditions and abilities of each patient. We hope our review provides a useful overview of the current state of the art in motor decoding, so that researchers interested in the field can be aware of the neural features they can exploit, the potential problems they may encounter and the available solutions they can adopt.
List of abbreviations
PPC: posterior parietal cortex; PMA: premotor area; SMA: supplementary
motor area; GPi: internal globus pallidus; VLo: oral part of ventral lateral
nucleus; STN: subthalamic nucleus; M1: primary motor cortex; EEG: