CHILDES SYSTEM OVERVIEW
- BASIC -
1.
What is
CHILDES?
Photo source: http://www.sxc.hu/photo/740583 ; royalty-free under usage option
CHIld Language Data Exchange System
2.
Why do
we need
CHILDES?
Because…
You want to study what I say
You want to investigate languages
(Photo source: http://www.flickr.com/photos/klapow/203398273/)
CHILDES provides
Tools for studying
conversational interactions
3.
Who
started
CHILDES?
Founded in 1984
Concord, MA
Department of Psychology,
Carnegie Mellon University
The team
Director: Brian MacWhinney
Contact: macw@cmu.edu
Programmers: Leonid Spektor, Franklin Chen
4,500 members
130 corpora
1,500 published articles
4.
Why do we
need a lot of
data with
CHILDES?
We need
LOTS of DATA.
WHY?
Universals
and
Differences
Photo source: http://www.flickr.com/photos/alvy/69385239/
5.
Where is
CHILDES?
Visit the CHILDES website at
http://childes.psy.cmu.edu
6.
How can I get
the latest info
about CHILDES?
Subscribe to the CHILDES Mailing Lists now!
7.
What are the
tools provided
by CHILDES?
The CHILDES system provides tools for studying
conversational interactions, including
Transcript database
Methods for linguistic coding
Programs for transcript analysis
Systems for audio and video linking
8.
What related
software do I
need?
BEFORE installing, you should have
QuickTime Player: to play the media files
Adobe Reader: to view the Manual
WinZip: to unzip the corpus
Unicode fonts (Arial FixedSys): to display the characters
Download Unicode fonts - STEP ONE
Download Unicode fonts - STEP TWO
Download Unicode fonts - STEP THREE
9.
Where is
the CHILDES
program?
The program available at CHILDES
is called
CLAN
Download CLAN
4 versions are available
ClanX + ClanXu
ClanWin

UnixClan
versions  No longer supported
9.
I want to
install CLAN on
my Windows computer.
How?
Getting Started @ Windows
updated
frequently
download
new version
Download CLAN from the
"Program and Database" section.
(Photo Source : http://www.flickr.com/photos/tanaka/49602421)
After Download
Double-click the downloaded
*.exe file and follow the
instructions given by
InstallShield
10.
I have CLAN
now. What
should I do?
Download the Manual for details
CHAT Transcript System:
How to record conversations
in the standard CHILDES
format.
CLAN Program Manual:
How to use the CLAN program
11.
I just want to study
the available
language database.
How?
Click Database to
download corpora
from around the world
TWO ways to view the data
WebData
View the corpus using WebData.
Local Transcripts
You can download the corpus from this page and open the
transcripts on your local machine.
1. Unzip the corpus into folders.
2. Use the CLAN program to open the *.cha files.
Download the audio and video
files here and place them in the
same folder as the transcripts.
Download the
bilingual
corpus here.
e.g. Download the YipMatthews bilingual corpus
On Windows, right-click
the link >> Save Target
As >> choose the directory for
the zip file.
On Mac, click the link and it will
save automatically.
Unzip the corpus files
Unzip the downloaded
corpus by right-clicking
it >> Extract Here
Unzip the corpus files
1. After extraction, a single
folder holds one subfolder
per child being investigated,
each named after that
child.
2. Each subfolder contains
*.cha files, which are
transcripts of that
bilingual child.
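The folder layout described above can be checked with a short script. This is only a sketch: the function name `list_transcripts` is made up for illustration, and it assumes nothing about the corpus beyond "one subfolder per child, *.cha files inside".

```python
from pathlib import Path


def list_transcripts(corpus_dir):
    """Map each child's subfolder name to the sorted names of its *.cha files.

    Assumes the layout described on the slide: one subfolder per child,
    with the transcripts directly inside it. Subfolders without *.cha
    files are skipped.
    """
    root = Path(corpus_dir)
    result = {}
    for child_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = sorted(f.name for f in child_dir.glob("*.cha"))
        if files:
            result[child_dir.name] = files
    return result
```

Calling `list_transcripts` on the folder you extracted returns a dictionary you can print to see which children have transcripts before opening them in CLAN.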
This is a transcript in CHAT format
(*.cha file)
Zoom inside a transcript
FAT = Father; he is saying “what’s bear doing?”
CHI = the child, saying “writing a letter, letter”
%mor = morphological tier; it lists parts of speech
“n” is NOUN and “PL” is plural, so “friends” is a plural noun
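To make the tier structure concrete, here is a minimal Python sketch that groups speaker lines with the dependent tiers that follow them. The `SAMPLE` text is a hand-made reconstruction of the slide's example, not a line copied from the corpus, and the `n|friend-PL` token is written on the assumption that %mor entries have the form part-of-speech|stem-SUFFIX; see the CHAT manual for the real format.

```python
# Hand-made sample modelled on the slide; NOT taken from a real corpus file.
SAMPLE = (
    "*FAT:\twhat's bear doing ?\n"
    "*CHI:\twriting a letter , letter .\n"
    "*CHI:\tfriends .\n"
    "%mor:\tn|friend-PL .\n"
)


def parse_chat(text):
    """Group each speaker line (*XXX:) with the %tiers that follow it."""
    utterances = []
    for line in text.splitlines():
        if line.startswith("*"):
            speaker, _, words = line[1:].partition(":")
            utterances.append(
                {"speaker": speaker, "text": words.strip(), "tiers": {}}
            )
        elif line.startswith("%") and utterances:
            name, _, content = line[1:].partition(":")
            utterances[-1]["tiers"][name] = content.strip()
    return utterances


def describe_mor(token):
    """Split an assumed 'pos|stem-SUFFIX' token into (pos, stem, suffixes)."""
    pos, _, rest = token.partition("|")
    stem, *suffixes = rest.split("-")
    return pos, stem, suffixes


utterances = parse_chat(SAMPLE)
```

With this sample, `describe_mor("n|friend-PL")` splits the token into the noun tag, the stem "friend", and the plural marker, which is the reading given on the slide.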
Download the Manual for more details
12.
How can I read
transcripts together
with audio & video
files?
Playback with audio file
1. Put the corresponding audio and *.cha files in the same folder.
2. Open the CHA file. Click Mode >> Sonic Mode >> locate the audio file.
3. The audio waveform of the sound file will pop up inside the CLAN window.
4. Either use Esc+8 OR click Mode >> Continuous playback.
Playback with video file
1. Put the corresponding video files and *.cha files in the same folder.
2. Open the CHA file. Click Mode >> Sonic Mode >> locate the video file.
3. The video player will pop up inside the CLAN window.
4. Either use Esc+8 OR click Mode >> Continuous playback.
13.
I want
to search for words and
language structures
in various corpora
for research.
How?
Command window
1. Click Window >> Commands,
or press Ctrl+D
2. Type the command here.
A command is basically composed of
3 subparts.
Basic structure of the commands
1. Command name: freq, mlu
2. Tier(s) (start with +t): +t*CHI, +t*MOT
3. Target file name (ends with .cha or .cex): 0042.cha
Examples:
freq +t*CHI 0042.cha
mlu +t*MOT 0042.cha
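The three-part structure can be sketched as a small Python helper that assembles a command string to paste into the Commands window. The helper name `build_clan_command` is hypothetical and not part of CLAN; it only mirrors the name / tier / file pattern described above.

```python
def build_clan_command(program, tiers, target):
    """Assemble 'program +t<tier> ... target' following the slide's pattern.

    program: command name such as "freq" or "mlu"
    tiers:   tier names such as "*CHI" or "%mor" (the +t prefix is added here)
    target:  file name ending in .cha or .cex (wildcards like *.cha are fine)
    """
    if not target.endswith((".cha", ".cex")):
        raise ValueError("target file should end with .cha or .cex")
    tier_options = ["+t" + tier for tier in tiers]
    return " ".join([program, *tier_options, target])


command = build_clan_command("freq", ["*CHI"], "0042.cha")
```

Here `command` holds the same string as the first example on the slide, `freq +t*CHI 0042.cha`.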
14.
Can you
introduce me to
some useful
COMMANDS?
A. MLU
stands for
Mean Length of Utterance
The ratio of morphemes to utterances
1. Click “WORKING”. Locate
the files/folder here by
clicking SELECT DIRECTORY
2. TYPE
mlu +t*CHI *.cha
3. Click “RUN”
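To show what the ratio means, here is a rough Python sketch of the arithmetic. It is not how CLAN's mlu program is implemented; CLAN applies its own rules about what counts as a morpheme and which utterances to exclude. The sketch assumes %mor-style tokens of the form pos|stem-SUFFIX and counts the stem plus each "-" suffix as one morpheme; the sample tiers are made up.

```python
def count_morphemes(mor_tier):
    """Count morphemes in one %mor-style line (simplified assumption:
    each 'pos|stem' token is one morpheme, plus one per '-SUFFIX')."""
    total = 0
    for token in mor_tier.split():
        if "|" not in token:  # skip punctuation such as "." or "?"
            continue
        _, _, rest = token.partition("|")
        total += 1 + rest.count("-")
    return total


def mean_length_of_utterance(mor_tiers):
    """Morphemes divided by utterances, ignoring empty utterances."""
    counts = [count_morphemes(tier) for tier in mor_tiers]
    counts = [c for c in counts if c > 0]
    if not counts:
        return 0.0
    return sum(counts) / len(counts)


# Made-up tiers for illustration only.
example_tiers = ["n|friend-PL .", "v|write-PROG det|a n|letter ."]
example_mlu = mean_length_of_utterance(example_tiers)
```

The first made-up utterance has 2 morphemes and the second has 4, so the sketch gives 6 morphemes over 2 utterances.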
B. FREQ
stands for
Frequency
Counts the number of times each word is used in the selected files
+
Calculates the type-token ratio (a measure of lexical diversity)
1. Click “WORKING”. Locate
the files/folder here by
clicking SELECT DIRECTORY
2. TYPE
freq +t*CHI (filename).cha
3. Click “RUN”
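The two numbers FREQ reports can be illustrated with a few lines of Python. This is a simplified sketch, not CLAN's implementation: it lowercases words, drops bare punctuation, and divides the number of distinct words (types) by the total number of words (tokens). The sample utterance is made up.

```python
from collections import Counter


def word_frequencies(utterances):
    """Count how often each word occurs, ignoring bare punctuation."""
    counts = Counter()
    for utterance in utterances:
        for word in utterance.lower().split():
            if any(ch.isalnum() for ch in word):
                counts[word] += 1
    return counts


def type_token_ratio(counts):
    """Distinct words (types) divided by total words (tokens)."""
    tokens = sum(counts.values())
    return len(counts) / tokens if tokens else 0.0


# Made-up utterance modelled on the slide's example.
counts = word_frequencies(["writing a letter , letter ."])
ratio = type_token_ratio(counts)
```

For this utterance there are 3 types ("writing", "a", "letter") and 4 tokens, so the ratio is 0.75.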
C. KWAL
is for
Keyword and Line
searching
Search data for user-specified words
+
Output those keywords in context.
1. Click “WORKING”. Locate
the files/folder here by
clicking SELECT DIRECTORY
2. TYPE
kwal +t*CHI +t%mor +s"but" (filename).cha
This searches the
file for the word "but".
3. Click “RUN”
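The idea of a keyword-and-line search can be sketched in Python as follows. This is an illustration of the concept only, not CLAN's code; the `before` and `after` parameters are hypothetical names for the number of context lines to return around each hit, and the transcript lines are made up.

```python
def keyword_in_lines(lines, keyword, before=0, after=0):
    """Return (line_index, context_lines) for every line containing keyword.

    A line matches when the keyword equals one of its whitespace-separated
    words, compared case-insensitively.
    """
    hits = []
    for i, line in enumerate(lines):
        words = line.lower().split()
        if keyword.lower() in words:
            start = max(0, i - before)
            end = min(len(lines), i + after + 1)
            hits.append((i, lines[start:end]))
    return hits


# Made-up transcript lines for illustration only.
lines = [
    "*MOT:\tlook at the bunny .",
    "*CHI:\tbunny .",
    "*CHI:\tbut I want it .",
    "*MOT:\tokay .",
]
hits = keyword_in_lines(lines, "but", before=1, after=1)
```

With these lines, the only hit is the third line, returned together with one line of context on each side.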
D. COMBO
is used for
Combination search
A powerful program that searches the data for
specified combinations of words or character strings.
1. Click “WORKING”. Locate
the files/folder here by
clicking SELECT DIRECTORY
2. TYPE
combo +t*CHI +s"what^is" (filename).cha
This searches the file for
the word "what" immediately followed by "is".
3. Click “RUN”
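A combination search for one word immediately followed by another can be sketched like this in Python. It only illustrates the "what^is" pattern from the slide; the real COMBO program supports a much richer search syntax, described in the CLAN manual. The function name and the sample lines are made up.

```python
def find_adjacent_words(lines, first, second):
    """Return the lines in which `first` is immediately followed by `second`."""
    first, second = first.lower(), second.lower()
    matches = []
    for line in lines:
        words = [w.lower() for w in line.split()]
        # Compare each word with the one right after it.
        if any(a == first and b == second for a, b in zip(words, words[1:])):
            matches.append(line)
    return matches


# Made-up transcript lines for illustration only.
lines = [
    "*CHI:\twhat is that ?",
    "*CHI:\twhat a big bear is .",
    "*MOT:\tthat is a letter .",
]
matches = find_adjacent_words(lines, "what", "is")
```

Only the first line matches: the second line contains both words, but not next to each other.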
More details
on command options,
with examples
combo +t*MOT +s"kitty^kitty" 0042.cha
+s"xx^xx" - search for specific
combinations of words OR
character strings
kwal +sbunny -w2 +w2 0042.cha
-w* and +w* - the number of text
lines included before and after the
search words.
15.
WOW! Can I
create a language
corpus for my
own kids with
CHILDES?
YES!!!
http://childes.psy.cmu.edu
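This is the Work Flow
1. Record the CHILD/INFORMANT to get sound / video data
2. Digitize the recordings into audio/video files on the computer
3. Transcribe the recordings into TEXT
4. RUN CHECK @ CLAN
5. Link the transcript to the media with “Esc-L” in CLAN
Result: sound + video + transcript corpus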
16.
What about
the details? e.g.
how to record
data, digitize the
sounds & video?
Visit
http://childes.psy.cmu.edu
CHILDES SYSTEM OVERVIEW
- ADVANCED -
Coming soon!
This introduction was produced by
Uta Lam using materials derived from
the CHILDES website
AND
the Bilingual Child Language Corpus
contributed to CHILDES by
Virginia Yip (Chinese University of Hong Kong)
and Stephen Matthews (University of Hong Kong).
Special Thanks to
Brian MacWhinney, Virginia Yip, Stephen Matthews
Contact me at utalam@hotmail.com
April 2007
I disclaim any responsibility with regard to the photos, content displayed, and links provided in these slides. At the time of
review, they were deemed valuable for these slides or their content. By the time of your visit, these slides or their content may have
changed or become unavailable.