1 From CHILDES to TalkBank An International Database of Communicative Interaction

Dec 21, 2015

From CHILDES to TalkBank

An International Database of Communicative Interaction

TalkBank

• Brian MacWhinney

– Carnegie Mellon University, Psychology

– Child Language Data Exchange System (CHILDES)

• Steven Bird, Mark Liberman

– University of Pennsylvania, Linguistics

– Linguistic Data Consortium (LDC)

• Howard Wactlar

– Carnegie Mellon University, Computer Science

– Informedia Project

Basic Premise of TalkBank

• Human communication is a unified fact,

• but it is studied by 8 disciplines and up to 40 subdisciplines.

• Analysis is important, but so is synthesis.

• We can put the puzzle back together by focusing all the disciplines on the data.

Some Examples

• “My Theory”

• Bettino Craxi

• Nixon’s Watergate Tapes

• MacWhinney’s Lectures

• Ross and Mark

• Graphics lesson

• Bilingual Classroom

My Theory: An Example

Special Issue of Discourse Processes edited by Tim Koschmann with articles from

• Rogers Hall

• Jay Lemke

• Annemarie Palincsar

• Carl Frederiksen

• Commentary by

– Judith Green & Marleen McClelland

– Jeremy Roschelle

TalkBank Areas

• Classroom Discourse - CMU Dec 99

• Conversation Analysis - Odense Oct

• Text and Discourse - Santa Barbara July

• Child Language Disorders - Madison 2002

• Language and Gesture - CMU October

• Child Language Learning - Madison Aug 2002

• Animal Communication - Penn May 2000

More areas ...

• Field Linguistics - LSA Dec 99, Penn Dec 2000

• Aphasia

• Corpus Linguistics

• Signed Language

• Second Language Learning

• Anthropological Linguistics

• Cross-cultural studies

More areas ...

• Multilingualism, code-switching - LIDES

• Mother-infant interaction

• Psychiatry

• Conflict Resolution

• Management Styles

• Small-group Interaction - soon

• Human-computer Interaction

More areas ...

• Speech Technology - ongoing

• Virtual Reality

• Guided Robots, Social Robots

Why data-sharing is important

• Increasing the size and reliability of the empirical basis

• Opening science to the community, practitioners, and students

• Opening science to collaborative commentary

• Creating transparency across disciplines

Key Features of TalkBank

• Multimodal digitized data

• Internet access

• Defense of confidentiality

• Codon: transcription, coding, viewing, and analysis

• XML standard for underlying representation

• Alliance of databases from many fields

Why TalkBank can be built now

• The Internet

• Fast computers, big disks, cheap storage

• Good audio and video digitization

• Advances in web-based database design

• Emergence of annotation standards

• Maturation of the social sciences

CHILDES: A Prototype

• Brian MacWhinney - CMU

• Leonid Spektor - CMU

• Catherine Snow - Harvard

• 2000 Members

• 400 Active contributors

1850-1950 Darwin and Diaries

• Darwin, Stern, Ament

• Emotion, gesture, language, the soul

• Card files and shoe boxes

1950-1984 Tapes

• Nagras and TEAC, VHS and Beta

• Dittos, mimeo, notes in the margins

• Good “raw” data, unclear transcription

1984 - 1994 PCs

CHILDES Concord Massachusetts 1984

1994 - 2001 childes.psy.cmu.edu

2000 - ? TalkBank

Universals

• Are there basic patterns to babbling?

• Are early word orders universal?

• Does UG give children a universal set of functional categories?

• Is the vocabulary spurt universal?

The answer requires LOTS of data

Particulars

• Do children have individual styles?

– Gestalt vs. Analytic

– Enactive (1S) vs. Depictive (3S)

• Do children respond differentially to parental recasts?

• Do children vary in their match to cue validity?

Again, we need LOTS of data

Comparisons

• How should we match SLI children to normal controls -- MLU? Morphology? TTR?

• How should we compare language socialization processes across social classes? Between cultures?

• How should we compare the course of development across languages? The case of Romance.

Three Components

• CHAT -- Transcription System

• CLAN -- Programs

• Database

CHAT Format

@Begin

@Participants: CHI Target_Child Sid, MOT Mother

*MOT: you want them to go in there?

*CHI: yeah. [+ Q]

*CHI: yeah. [+ SR]

*MOT: okay.

*CHI: okay. [+ I]

*CHI: look at this.

%act: CHI picks up piece of paper

@End
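
The sample transcript uses a line-oriented layout: @ lines carry headers, * lines carry a speaker's utterance, and % lines carry dependent tiers such as %act. As a rough illustration only, the Python sketch below reads that layout into records. The names Utterance and parse_chat are hypothetical, the bracketed [+ ...] codes are collected as plain strings without interpreting them, and the sketch ignores much of real CHAT (continuation lines, the full header set). It is not CLAN's parser.

```python
import re
from dataclasses import dataclass, field

# The transcript from the slide, reproduced without the blank lines.
SAMPLE = """\
@Begin
@Participants: CHI Target_Child Sid, MOT Mother
*MOT: you want them to go in there?
*CHI: yeah. [+ Q]
*CHI: yeah. [+ SR]
*MOT: okay.
*CHI: okay. [+ I]
*CHI: look at this.
%act: CHI picks up piece of paper
@End
"""

# Matches bracketed codes of the form [+ XYZ] at any position in a line.
POSTCODE = re.compile(r"\[\+\s*([^\]]+)\]")


@dataclass
class Utterance:  # hypothetical record type, for illustration only
    speaker: str
    text: str
    postcodes: list = field(default_factory=list)
    tiers: dict = field(default_factory=dict)


def parse_chat(source):
    """Split a CHAT-like transcript into headers and utterances."""
    headers = {}
    utterances = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        marker, _, rest = line.partition(":")
        rest = rest.strip()
        if line.startswith("@"):
            # Header line such as @Participants; @Begin and @End have no value.
            headers[marker[1:]] = rest
        elif line.startswith("*"):
            # Main line: speaker code, utterance text, optional [+ ...] codes.
            codes = [c.strip() for c in POSTCODE.findall(rest)]
            text = POSTCODE.sub("", rest).strip()
            utterances.append(Utterance(marker[1:], text, codes))
        elif line.startswith("%") and utterances:
            # Dependent tier: attach it to the utterance it follows.
            utterances[-1].tiers[marker[1:]] = rest
    return headers, utterances


if __name__ == "__main__":
    headers, utterances = parse_chat(SAMPLE)
    for u in utterances:
        print(u.speaker, repr(u.text), u.postcodes, u.tiers)
```

Attaching each % tier to the preceding * line mirrors how the sample places %act directly under the utterance it describes.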

CLAN Programs

String Search

• Freq

• KWAL

• Combo

• Gem

• GemFreq, GemList
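
To give a feel for what the first two programs in this list do, here is a small Python sketch of a frequency count (in the spirit of Freq) and a keyword-and-line search (in the spirit of KWAL). It is only an approximation under stated assumptions: the function names word_freq and keyword_lines are hypothetical, the tokenizer is a plain regular expression, and none of CLAN's actual options or output formats are reproduced. The utterances are taken from the CHAT example in this deck.

```python
import re
from collections import Counter

# (speaker, utterance) pairs taken from the CHAT sample in this deck.
UTTERANCES = [
    ("MOT", "you want them to go in there?"),
    ("CHI", "yeah."),
    ("CHI", "yeah."),
    ("MOT", "okay."),
    ("CHI", "okay."),
    ("CHI", "look at this."),
]


def words(text):
    """Lowercase the text and keep alphabetic word forms only."""
    return re.findall(r"[a-z']+", text.lower())


def word_freq(utterances, speaker=None):
    """Count word forms, optionally restricted to one speaker."""
    counts = Counter()
    for who, text in utterances:
        if speaker is None or who == speaker:
            counts.update(words(text))
    return counts


def keyword_lines(utterances, keyword):
    """Return (line number, speaker, text) for utterances containing keyword."""
    keyword = keyword.lower()
    return [
        (number, who, text)
        for number, (who, text) in enumerate(utterances, start=1)
        if keyword in words(text)
    ]


if __name__ == "__main__":
    print(word_freq(UTTERANCES, speaker="CHI").most_common())
    print(keyword_lines(UTTERANCES, "okay"))
```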

Indexes

• MLU

• MLT

• WdLen, MaxWd

• VOCD

• DSS

• IPSyn (in progress)
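
MLU (mean length of utterance) and MLT (mean length of turn) are simple ratios, and a short Python sketch may make them concrete. The version below makes two simplifying assumptions that should be read as such: length is counted in words rather than morphemes, and a turn is taken to be a run of consecutive utterances by the same speaker. The function names mlu_words and mlt_words are hypothetical and do not reproduce CLAN's programs or their options.

```python
import re
from itertools import groupby

# (speaker, utterance) pairs taken from the CHAT sample in this deck.
UTTERANCES = [
    ("MOT", "you want them to go in there?"),
    ("CHI", "yeah."),
    ("CHI", "yeah."),
    ("MOT", "okay."),
    ("CHI", "okay."),
    ("CHI", "look at this."),
]


def words(text):
    """Lowercase the text and keep alphabetic word forms only."""
    return re.findall(r"[a-z']+", text.lower())


def mlu_words(utterances, speaker):
    """Mean number of words per utterance for one speaker."""
    lengths = [len(words(text)) for who, text in utterances if who == speaker]
    return sum(lengths) / len(lengths) if lengths else 0.0


def mlt_words(utterances, speaker):
    """Mean number of words per turn for one speaker.

    Assumption: a turn is a maximal run of consecutive utterances
    produced by the same speaker.
    """
    turn_lengths = []
    for who, run in groupby(utterances, key=lambda pair: pair[0]):
        if who == speaker:
            turn_lengths.append(sum(len(words(text)) for _, text in run))
    return sum(turn_lengths) / len(turn_lengths) if turn_lengths else 0.0


if __name__ == "__main__":
    for code in ("CHI", "MOT"):
        print(code, mlu_words(UTTERANCES, code), mlt_words(UTTERANCES, code))
```

In this sample the child produces four utterances in two turns, so the two measures differ for CHI but coincide for MOT, whose turns are each a single utterance.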

Profiles

• Chains

• Cooccur

• Dist

• CHIP

• KeyMap

• TimeDur

Phonology

• MakeMod

• ModRep

• PhonFreq

• UniCode

• Inventory (in progress, LIPP, CompProf)

• Process Analysis (in progress)

Utilities

• Dates

• Rely

• Lines

• SaltIn

• Check

The Database

• English - 25 corpora

• Non-English - 18 languages

• Clinical - 14 corpora, aphasia, SLI, Down, autism, Williams, and other groups

• Narrative - Frog stories, Red Balloon

• Childhood Bilingualism

• Adult Second Language Learning

Morphology

• MOR

• Post, PostTrain -- Christophe Parisse

• Parse -- Kenji Sagae

• --> revised DSS, LARSP, IPSyn

• MinMor for 14 languages

• MaxMor for English, Spanish, Italian, Hungarian, Dutch, German

New Technologies

• Sonic CHAT

• Bullets

• QuickTime Movies

• Sound editor by wave

• Movie editor by dragging

• Fast mode editing

• Web streaming of audio and video

Sample Topics

• Past tense debate

• Functional categories, tenseless verbs

• Verb frame generalization

• Fine-tuning of the input

• Theory of mind

• Lexical range and communicative context

• MLU and vocabulary growth in disorders

Research based on CHILDES

• Over 1200 published studies

• Syntax

• Morphology

• Discourse

• Lexicon

• Narrative, Literacy

• Language Impairments

• Phonology

Allied Efforts

• JCHAT, Chinese, Korean

• Dutch, Nordic, Celtic

• Romance (Italian, Spanish, Portuguese)

• Slavic (Krakow, Vienna)

• Bilingualism -- Catalan, Basque

• Frogs, Disorders, Code-switching

• Classroom discourse

CHILDES/BIB On-Line

Format Babel

Alembic Annotator Archivage CA CHAT

COCOSDA CSAE CSLU DAISY DAMSL

Delta DRI EAGLES Emu Festival

FSA’s GATE HIAT Hyperlex Intex

ISIP LDC MATE MICASE MPEG

MPI Multitext Observer Partitur Praat

SABLE SAMPA SGREP SignStream SIL

SLAM SMDL SNACK StandOff SUSANN

TalkBank TEI Tipster Transcriber TreeBank

TSNLP Unicode UTF

Video Tools

Media Tagger, CLAN, Digital Lava, Informedia ...

The Script

syncWRITER

SignStream

Audio on the Web

Anthropology on the Web

Chagnon’s Yanomamö

Touch and Click for Audio

Pawnee Lexicon

Lexicon -> Cultural Encyclopedia

Cornell Bioacoustics Laboratory

Confidentiality Levels

1 - fully public

2 - copying block

3 - transcripts public, audio/video protected

4 - non-disclosure

5 - non-disclosure, no copying

6 - data-viewing with approval

7 - data-viewing under direct supervision

8 - archived only
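
One way a tool could carry these eight levels around is as an ordered enumeration. The Python sketch below is only an illustration: the class name Confidentiality, the member names, and the helper most_restrictive are hypothetical, and the helper assumes that a higher number means a tighter restriction, which the ordering of the list suggests but does not state.

```python
from enum import IntEnum


class Confidentiality(IntEnum):
    """The eight levels from the slide; member names are illustrative."""

    FULLY_PUBLIC = 1
    COPYING_BLOCK = 2
    TRANSCRIPTS_PUBLIC_MEDIA_PROTECTED = 3
    NON_DISCLOSURE = 4
    NON_DISCLOSURE_NO_COPYING = 5
    VIEWING_WITH_APPROVAL = 6
    VIEWING_UNDER_SUPERVISION = 7
    ARCHIVED_ONLY = 8


def most_restrictive(levels):
    """Pick the tightest level in a collection.

    Assumption: levels are ordered from least (1) to most (8) restrictive.
    """
    return max(levels)


if __name__ == "__main__":
    # A corpus that mixes files at two levels would be governed by the tighter one.
    mixed = [Confidentiality.FULLY_PUBLIC, Confidentiality.NON_DISCLOSURE]
    print(most_restrictive(mixed).name)
```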

Conclusions

• Child Language has guided other fields, but now we need to link to these other fields.

• CLAN must give way to more international tools and distributed databases.

• Number counting will give way to reality-linked number counting.

• Lab-based research will have to open up to collaborative annotation.