Miyanaga Lecture

Apr 05, 2018

  • 7/31/2019 Miyanaga Lecture

    1/65

    All rights reserved. Copyright 2009- Yoshikazu Miyanaga

    Robust Speech Recognition and its Robot Implementation

    Yoshikazu Miyanaga

    Hokkaido University

    Conditions for Speech Recognition

    Short Isolated Speech: words, phrase (2 sec)

    Attached Mic (several cm to 10 cm)

    Remote Mic (10 cm to 5 m)

    Silent Room (>20 dB)

    Living Room (20 to 10 dB)

    Noisy Room: exhibition

    Conventional ASR

    Continuous Speech (>2 sec)

    Attached Mic(20dB)

    Attached Mic(

    Hokkaido University Speech Communication System (HU-SCS)

    Short Isolated Speech: words, phrase (5m)

    Silent Room (>20 dB)

    Living Room (20 to 10 dB)

    Noisy Room: exhibition

    HU-SCS

    Automatic Speech Detection

    97% by current technology (SNR 10 dB): wavelet, non-linear processing

    "Robust voice activity detection using perceptual wavelet-packet transform and Teager energy operator," S.-H. Chen, H.-T. Wu, Y. Chang, and T. K. Truong, Pattern Recognition Letters (2007)

    HU-SCS v4: 99% over SNR 10 dB

    BP and Threshold Operation

    F0 Detection

    HU-SCS

    Automatic Speech Recognition

    Candidates of Recognition Results

    (1) Good Morning

    (2) See you

    (3) How are you?

    Current technology: 71% (SNR 10 dB), 97.4% (SNR 20 dB)

    Spectral Subtraction, RASTA, CMS; A Priori Information

    HU-SCS v4

    95.3% (SNR 10 dB), 98.3% (SNR 20 dB)

    No A Priori Information

    RSF/DRA

    HU-SCS

    Automatic Speech Rejection

    Recognition Result: Good Morning

    Candidates of Recognition Results

    (1) Good Morning

    (2) See you

    (3) How are you?

    90% by current technology: confidence scoring technique

    "Recognition confidence scoring and its use in speech understanding systems," T. J. Hazen, S. Seneff, and J. Polifroni, Computer Speech and Language (2002)

    HU-SCS v4: Dependent GMM by Weighted HMM (90% accuracy)

    AI (Artificial Intelligence)

    HU-SCS

    Automatic Speech Detection

    Automatic Speech Recognition

    Automatic Speech Rejection

    HW with Low Power

    Super Low-Power Consumption Design

    Real-Time SCS: 180 µs/word (10 MHz) Recognition Time

    Small-Scale Design with Specially Designed LSI

    Noise Reduction by Array Microphone

    First SCS HW, LSI IP: Mobile, Intelligent Consumer Electronics, etc.

    Advantages

    (1) Mobile applications: small, low power

    (2) PC free

    Running Spectrum Domain

    [Figure: waveform and its mel-spectra along the frame axis (frames 1, 2, 3, ..., t)]

    BP and Threshold OP

    Start Point

    End Point

    Speech Analysis and Robust Processing

    Speech Analysis

    LPC Cepstrum

    Mel-Frequency Cepstrum

    Robust Processing

    Various types of techniques have been proposed.

    Spectral Subtraction

    Wiener Filtering

    Microphone Arrays

    RSF/DRA (Running Spectrum Filtering/Dynamic Range Adjustment) uses filtering and normalization of cepstral vectors.

    Procedure of Mel-Frequency Cepstrum

    Speech Signals: x(t)

    Cut into Short-Time Frames: xf(n,ts)

    Discrete Fourier Transform (DFT): |X(n,f)|

    Filterbanks with Mel-Frequency Scale: Xs(n,fm)

    Logarithm: log(Xs(n,fm))

    Discrete Cosine Transform (DCT): C(n,k)

    Cepstral Coefficients C(n,k)

    n : frame index

    k : cepstral order
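The procedure above can be sketched in plain Python. This is only a minimal illustration, not the HU-SCS implementation: the frame length, hop size, number of filters, Hamming window, and the widely used mel formula 2595*log10(1 + f/700) are all assumptions, and every function name is hypothetical.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def frames(x, length, hop):
    """Cut x(t) into short-time frames xf(n, ts), Hamming-windowed (assumed)."""
    win = [0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
           for i in range(length)]
    return [[x[s + i] * win[i] for i in range(length)]
            for s in range(0, len(x) - length + 1, hop)]


def magnitude_spectrum(frame):
    """|X(n, f)| for bins 0..N/2 using a direct DFT (O(N^2), demo only)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bins = [e * n_fft / sample_rate for e in edges]  # fractional bin positions
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(x, sample_rate=8000, length=64, hop=32, num_filters=12, num_ceps=8):
    """Return C(n, k): one list of cepstral coefficients per frame n."""
    bank = mel_filterbank(num_filters, length, sample_rate)
    out = []
    for frame in frames(x, length, hop):
        spec = magnitude_spectrum(frame)                       # |X(n, f)|
        xs = [sum(w * s for w, s in zip(row, spec)) for row in bank]  # Xs(n, fm)
        logs = [math.log(v + 1e-10) for v in xs]               # log(Xs(n, fm))
        out.append([sum(logs[m] * math.cos(math.pi * k * (m + 0.5) / num_filters)
                        for m in range(num_filters))           # DCT-II
                    for k in range(num_ceps)])
    return out
```

Calling `mfcc(samples)` on a list of float samples returns one coefficient vector per frame; the small constant inside the logarithm only guards against log(0) on silent frames.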

    Noise Corruption in Power Spectrum

    E(n,)+A

    E(n,)

    Noise corruption changes the gains and DC components of the power spectrum.

    [Figure: power spectra of clean speech and noisy speech (white noise at 10 dB SNR)]

    Spectral Subtraction

    Estimate the spectrum of noise from short-time spectra in the first several frames.

    Subtract the estimated spectrum from each short-time spectrum.

    [Figure: running spectrum of a noisy speech (white noise at 5 dB SNR), before and after subtraction]
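The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the first `noise_frames` frames are taken to contain noise only, and negative results are clipped at `floor` (a common choice that the slide does not specify). The function name and parameters are hypothetical.

```python
def spectral_subtraction(power_frames, noise_frames=5, floor=0.0):
    """power_frames: list of short-time power spectra (one list per frame)."""
    head = power_frames[:noise_frames]
    # Step 1: average the leading frames to estimate the noise spectrum.
    noise = [sum(col) / len(head) for col in zip(*head)]
    # Step 2: subtract the estimate from every frame, clipping at the floor.
    return [[max(p - n, floor) for p, n in zip(frame, noise)]
            for frame in power_frames]
```

A fixed noise estimate like this is exactly where over-subtraction can arise when the noise changes after the first frames.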

    Noise Reduction Techniques

    Conventional method: spectral subtraction

    Parameters are not optimized for speech from various environments.

    Excessive subtraction may cause musical noise.

    Robust speech feature extraction

    Advanced speech analysis using RSF (running spectrum filtering) and DRA (dynamic range adjustment).

    Modulation Spectrum

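    RSF focuses on the modulation spectrum.

    Modulation spectrum: the spectrum of the time trajectory at each frequency, obtained by applying a DFT along the frame axis of the running spectrum.

    [Figure: running spectrum (frame number vs. frequency), transformed by a DFT on each frequency into the modulation spectrum (frequency vs. modulation frequency)]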
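As a rough sketch of this definition, the modulation spectrum can be computed by taking a DFT over the frame index separately for each frequency bin. The function name and the `frame_rate` parameter (frames per second) are assumptions introduced for illustration, and the direct DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def modulation_spectrum(running_spectrum, frame_rate):
    """running_spectrum[n][f]: value at frame n and frequency bin f.

    Returns (mod_freqs_hz, mod) where mod[f][k] is the magnitude at
    modulation frequency mod_freqs_hz[k] for frequency bin f.
    """
    n_frames = len(running_spectrum)
    n_bins = len(running_spectrum[0])
    half = n_frames // 2 + 1
    mod_freqs = [k * frame_rate / n_frames for k in range(half)]
    mod = []
    for f in range(n_bins):
        # Time trajectory of one frequency bin across frames.
        traj = [running_spectrum[n][f] for n in range(n_frames)]
        mod.append([abs(sum(traj[n] * cmath.exp(-2j * math.pi * k * n / n_frames)
                            for n in range(n_frames)))
                    for k in range(half)])
    return mod_freqs, mod
```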

    Mod-F of Clean and Noisy Speech

    [Figure: modulation spectra of clean speech and noisy speech (white noise at 5 dB SNR)]

    Speech components are dominant around 4 Hz in the modulation spectrum.

    Lower modulation frequency components can be regarded as noise, because noise components change little over time.

    RSF (Running Spectrum Filtering)

    Speech components are dominant around 4 Hz in the modulation spectrum.

    [Figure: modulation spectrum against modulation frequency (Hz) for each frequency (Hz), with regions labeled Noise Components, Speech Components, and Unnecessary Part]
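One way to picture RSF is as a band-pass filter applied to each time trajectory, keeping the modulation band around 4 Hz and discarding the slowly varying part and the high part. The sketch below does this by zeroing DFT bins, which is an idealized stand-in and not the filter design used in HU-SCS; the cutoffs `low_hz` and `high_hz` and the function name are assumptions.

```python
import cmath
import math


def rsf_bandpass(trajectory, frame_rate, low_hz=1.0, high_hz=10.0):
    """Keep only modulation frequencies in [low_hz, high_hz] (assumed band)."""
    n = len(trajectory)
    spec = [sum(trajectory[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]
    for k in range(n):
        # Bins k and n-k carry the same modulation frequency.
        freq = min(k, n - k) * frame_rate / n
        if not (low_hz <= freq <= high_hz):
            spec[k] = 0
    # Inverse DFT; the result is real because the mask is symmetric.
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]
```

With a trajectory made of a constant offset, a 4 Hz component, and a 30 Hz component, only the 4 Hz component survives the default band.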

    RSF / DRA

    Comparison of cepstral time trajectories at the 4th order (clean vs. noisy)

    [Figure: three panels (Baseline, RSF processing, RSF/DRA processing), each plotting the clean and noisy trajectories over frames 10 to 100]
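A minimal sketch of the DRA step, assuming it scales each cepstral time trajectory so that its largest absolute value becomes 1. That normalization rule is an assumption made for illustration (the slides only say DRA normalizes cepstral vectors), and the function name is hypothetical.

```python
def dra(trajectory):
    """Scale one cepstral time trajectory to a peak absolute value of 1."""
    peak = max(abs(v) for v in trajectory)
    if peak == 0:
        return list(trajectory)  # all-zero trajectory: nothing to scale
    return [v / peak for v in trajectory]


# A gain difference between clean and noisy trajectories disappears:
clean = [0.2, -0.8, 0.4]
noisy = [0.5 * v for v in clean]   # same shape, smaller dynamic range
normalized_clean = dra(clean)
normalized_noisy = dra(noisy)
```

Under this assumption, two trajectories that differ only by a gain factor map to the same normalized trajectory.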

    Likelihoods of HMM

    [Figure: an HMM in which each state has its own GMM]

    Each GMM approximates the output distribution with many multi-dimensional Gaussian distributions, each described by its average and variance.
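For illustration, the likelihood of one feature vector under a GMM can be computed as below. This sketch assumes diagonal covariances (one variance per dimension), which the slide does not state, and works in the log domain for numerical stability; the function names are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def log_gmm(x, weights, means, variances):
    """Log likelihood of x under a mixture of diagonal Gaussians."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)
    # log-sum-exp: avoids underflow when the component densities are tiny.
    return top + math.log(sum(math.exp(l - top) for l in logs))
```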

    Evaluation on Likelihoods

    Likelihood of the MFCC under each HMM: p1, p2, ..., p11

    The maximum likelihood is selected and its label is recognized as the result.

    The result is correct, isn't it?

    Evaluation of Reliability

    The result of the top score is trusted.

    The result of the top score is NOT trusted.

    [Figure: likelihood plots for the two cases]
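One simple way to decide whether the top score can be trusted is to require a margin between the best and second-best likelihoods. The sketch below uses that rule purely as an illustration; it is not the rejection method of HU-SCS, and the function name and `min_margin` threshold are assumptions.

```python
def recognize_with_rejection(scores, min_margin):
    """scores: dict mapping a label to its log likelihood.

    Returns the best label, or None when the top score is too close to
    the runner-up to be trusted (hypothetical margin rule).
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best_score = ranked[0]
    if len(ranked) > 1 and best_score - ranked[1][1] < min_margin:
        return None  # rejected: result of the top score is not trusted
    return best_label
```

For example, with candidates "Good Morning", "See you", and "How are you?", a clear winner is returned, while near-equal scores lead to rejection.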

    Overview of ASR System

    Current ASR systems adopt robust processing that removes the influence of noise distortions.

    Speech Data

    Speech Analysis: Convert to Spectrum or Cepstrum

    Robust Processing: Decrease Noise Distortions

    Speech Feature Vectors

    Speech Recognition: Calculate Probability (likelihood scores)

    Reference Models: Prepare Reference Patterns by Speech Training

    Results

    Block Diagram

    Interfaces: Microprocessor, External RAM, and Master/Slave

    [Figure: block diagram. Blocks: MPU Interface, HMM, RSF/DRA, MFCC, Bus Control, System Control, SRAM interface, Data Control, and three SRAMs (filter coefficients for RSF; working memory for MFCC and RSF; feature parameters before speech detection). Buses and signals: Master Bus, Slave Bus, Address, Interrupt Signal, Chip Select, SW, CLK, RESET. Bus widths shown include 16 and 24 bits.]

    New Scalable Architectures

    Two types of scalable techniques are applied to the system.

    (1) Multiple Processing Elements (PEs) in the HMM Circuit

    The PEs enable high-speed processing and improve recognition performance.

    (2) Master/Slave Operation in the Complete System

    The operation enables high-speed processing and increases the vocabulary size.

    HMM (Hidden Markov Models)

    Statistical modeling approach using a Markov chain.

    Powerful for expressing time-varying data sequences and robust to speaker differences.

    [Figure: left-to-right HMM with states $q_1, q_2, q_3, q_4$, self-transitions $a_{11}, a_{22}, a_{33}, a_{44}$, and forward transitions $a_{12}, a_{23}, a_{34}, a_{45}$]

    $a_{ij}$ : state transition probability

    $q_n \ (1 \le n \le N)$ : set of states

    $b_1(k), b_2(k), \ldots, b_N(k)$ : output probability

    Full-Parallel Computations in HMM

    The output probabilities and temporal scores can be computed concurrently for all HMM states.

    [Figure: the observation o_t feeds one output probability calculation per state; each result feeds a score calculation together with the path from the upper state; Select Max, Max(), picks the maximum]
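The per-state score update can be sketched with a Viterbi recursion for a left-to-right HMM, written in the log domain. This is an illustrative sketch, not the circuit's arithmetic: the function name and argument layout are assumptions. Inside one frame, each state's new score depends only on the previous frame's scores, which is why the updates can run concurrently.

```python
import math


def viterbi_left_to_right(log_b, log_self, log_next):
    """Best log score of a left-to-right HMM.

    log_b[t][j] : log output probability of state j at frame t
    log_self[j] : log self-transition probability of state j
    log_next[j] : log probability of moving from state j to the next
                  state (the last entry is the exit transition)
    """
    neg_inf = float("-inf")
    n_states = len(log_self)
    score = [neg_inf] * n_states
    score[0] = log_b[0][0]            # must start in the first state
    for t in range(1, len(log_b)):
        new = []
        for j in range(n_states):     # independent per state: parallelizable
            stay = score[j] + log_self[j]
            move = score[j - 1] + log_next[j - 1] if j > 0 else neg_inf
            new.append(max(stay, move) + log_b[t][j])
        score = new
    return score[-1] + log_next[-1]   # leave through the final state


# Tiny usage example: two states, two frames, all transitions 0.5.
half = math.log(0.5)
example_score = viterbi_left_to_right([[0.0, 0.0], [0.0, 0.0]],
                                      [half, half], [half, half])
```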

    Master/Slave Operation

    [Figure: a microprocessor and RAM connected to a Master and three slaves (Slave1, Slave2, Slave3)]

    (1) Set Reference Data

    (2) Speech Analysis and Robust Processing

    (3) Broadcast

    (4) Speech Recognition

    (5) Gather Results
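The five steps can be mimicked in software as follows. This sketch assumes that the reference vocabulary is split across the master and slave chips, that every chip receives the same broadcast features, and that the master keeps the best partial result; the slides do not spell out these details, and all names here (`split_vocabulary`, `recognize_master_slave`, `score_fn`) are hypothetical.

```python
def split_vocabulary(words, num_chips):
    """(1) Set reference data: give each chip its share of the vocabulary."""
    return [words[i::num_chips] for i in range(num_chips)]


def recognize_master_slave(features, vocabulary, score_fn, num_chips=4):
    """features: output of step (2), speech analysis and robust processing.

    score_fn(features, word) returns a likelihood score for one word.
    """
    parts = split_vocabulary(vocabulary, num_chips)
    partial = []
    for part in parts:
        # (3) Broadcast: every chip sees the same features.
        # (4) Speech recognition: each chip scores only its own words.
        if part:
            partial.append(max((score_fn(features, w), w) for w in part))
    # (5) Gather results: keep the best of the per-chip winners.
    return max(partial)[1]
```

In this picture, adding slaves enlarges the vocabulary that can be scored in the same time, which matches the stated goal of increasing the number of words.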

    Circuit Design (Analysis & HMM TEG)

    Technology: Rohm CMOS 0.35 µm

    Univ. of Tokyo EXD Standard Cell Library

    Voltage Supply: 3.3 V

    RTL-Level Design: Verilog-HDL

    Evaluation

    Clock Freq. (MHz)   Proc. Time (ms/word)   Power Cons. (mW)
    60                  0.029                  567.7
    30                  0.059                  285.2
    10                  0.180                  93.2

    [Figure: V2 layout view]
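As a small worked calculation on the evaluation figures, multiplying processing time by power gives the energy spent per word (ms times mW equals microjoules). The numbers are the three rows of the evaluation table; the names `MEASUREMENTS` and `energy_per_word_uj` are introduced here for illustration only.

```python
# (clock MHz, processing time in ms/word, power in mW), from the evaluation
MEASUREMENTS = [(60, 0.029, 567.7), (30, 0.059, 285.2), (10, 0.180, 93.2)]


def energy_per_word_uj(time_ms, power_mw):
    """Energy per recognized word in microjoules (ms * mW = uJ)."""
    return time_ms * power_mw


for mhz, time_ms, power_mw in MEASUREMENTS:
    print(f"{mhz} MHz: {energy_per_word_uj(time_ms, power_mw):.2f} uJ/word")
```

The energy per word comes out nearly the same at all three clock rates, so lowering the clock mainly trades response time for lower power draw.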

    Comparison on Power Consumption

    Proposed HW (10 MHz) and DSP Design (80 MIPS)

                                        DSP-based System       Proposed System
    Processor Structure                 TMS320C549, 80 MIPS    Dedicated Processor, 10 MHz
    Memory Access Time (ns)             15                     80
    Processor (mW) (Core: 3.3 V)        158.4                  93.2
    Memory (mW) (SRAM, Core: 3.3 V)     627                    100
    Total (mW)                          785.4                  193.2

    Processing Time of HU-SCS

    Comparison with Software Design

    54 times faster

    No high-speed clock: useful for low-power design

                                        Proposed System (Hardware)   Pentium 4 (Software)
    No. arithmetic units                160                          -
    No. cycles                          455,200                      -
    Frequency (MHz)                     80                           2200
    Recognition processing time (ms)    5.7                          310
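The figures in this comparison can be cross-checked with a few lines of arithmetic: cycles divided by clock frequency gives the hardware processing time, and the ratio of the two processing times gives the speedup. The helper name `processing_time_ms` is hypothetical; the numbers are those listed in the comparison.

```python
def processing_time_ms(cycles, freq_mhz):
    """Time in milliseconds to run `cycles` clock cycles at `freq_mhz`."""
    return cycles / (freq_mhz * 1e6) * 1e3


hardware_ms = processing_time_ms(455_200, 80)   # proposed system
software_ms = 310.0                             # Pentium 4, as reported
speedup = software_ms / 5.7                     # using the reported 5.7 ms

print(f"hardware time: {hardware_ms:.2f} ms, speedup: {speedup:.1f}x")
```

The computed hardware time agrees with the reported 5.7 ms after rounding, and the ratio agrees with the stated factor of 54.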

    Design by Standard Cells

    TSMC 0.25 µm CMOS Standard Cell, Voltage 2.5 V

    Highest Clock Rate: 80.6 MHz (12.4 ns, Temperature Cond. Typical)

    No. Parallel Processing   32         8
    HMM                       491,600    116,980
    RSF/DRA                   11,910
    MFCC                      39,670
    System Control            18,310
    Bus Control               1,310
    SRAM                      63,400
    Total                     626,200    251,580

    Current HU-SCS

    HU-SCS Board

    PC Interface with

    HU-SCS Board

    55 mm x 44 mm

    Overview of Current HU-SCS

    Improvement of Noise Robustness

    Accurate ASR under SNR 0 - 10 dB

    Robustness against Echo

    Improvement of Speech Recognition

    Higher Accuracy on MFCC Calculation

    Low-Power Design and Higher-Speed Processing

    Improvement of Total HW System

    Higher Speed Response Time

    Comparison on Performance

    Comparisons between HU-SCS v4 and v3

    50B 96.4% 90.0%
    50B 95.0% 84.4%
    45B 85.1% 50.5%
    50B 99.4% 95.6%
    75B 93.3% 85.0%
    75B 88.9% 65.6%
    80B 82.7% -

    [Figure: bar chart on a 0% to 100% axis; legend: Previous, Current]

    Results on Some Distances

    [Figure: six panels (Car A, Car B, Car C, Elevator, Stair, Meeting Room), each plotting a percentage against microphone distance (30 cm, 60 cm, 90 cm)]

    Robot Implementation

    Speech Recognition & Synthesis

    Quick Response

    Control of Consumer Electronics and Machines

    http://chapit.mpg/

    Communications and Controls

    http://cw_news_080107.mpg/

    Summary

    Hokkaido University Speech Communication System

    Integrated Architecture of Speech Detection, Robust Speech Analysis, Speech Recognition, Speech Rejection

    Higher Speed Processing than DSP and Software

    Superior in Energy Saving to DSP Solutions

    Improving Noise Robustness by RSF/DRA Technique

    Small, Fast and Low Power

    Who?

    Yoshikazu Miyanaga

    He received the B.S., M.S., and Dr. Eng. degrees from Hokkaido University, Sapporo, Japan, in 1979, 1981, and 1986, respectively. He is currently a Professor at the Graduate School of Information Science and Technology, Hokkaido University.

    His research interests are in the areas of signal processing for wireless communications, nonlinear signal processing, and low-power LSI systems.

    He was a chair of the Technical Group on Smart Info-Media System, IEICE, and is an advisory member of this technical group. He is currently an IEICE fellow.

    He served as a member of the board of directors of the IEEE Japan Council, as chair of the student activity committee, from 2002 to 2004. He has been chair of the student activity committee in the IEEE Sapporo Section since 2001, and chair of the IEEE Circuits and Systems Society Digital Signal Processing Technical Committee since 2006.

    He has been serving as international steering committee chair/member of IEEE ISPACS, IEEE ISCIT, and IEEE/EURASIP NSIP, and as honorary/general chair/co-chair of their international symposiums/workshops, i.e., ISPACS 2003, ISCIT 2004, ISCIT 2005, NSIP 2005, ISPACS 2008, ISMAC 2009, and APSIPA ASC 2009. He also served as international organizing committee chair of IEICE ITC-CSCC 2002 - 2003, IEEE MSCAS 2004, and IEEE ISCAS 2005 - 2008.
