Symbolic Dynamics of the Human Heartbeat Analysis Using Rank

Who Wrote Shakespeare?
Application of Multi-Disciplinary
Research to Medicine and Humanity
Albert C.-C. Yang, M.D., PhD
Attending Physician, Department of Psychiatry,
Taipei Veterans General Hospital, Taiwan
Assistant Professor, School of Medicine,
National Yang-Ming University, Taiwan
accyang@gmail.com
Information Created by
Biological Systems
Neuronal Impulse
Genetic Codes
Information Created by
Biological Systems
Human Heartbeats
Human Creations
Earliest known paintings by humans
Lascaux Cave France 20000 BC
Human Creations
Symbols
Jiahu
China 6600 BC
Vinča signs
Europe 4500 BC
Indus script
India 3500 BC
Human Creations
Writing Systems
Cuneiform script Sumerians Iraq 2600 BC
Challenge
How to effectively categorize information of
different origins?
A general principle to analyze
information-embedded signals
Repetitive Patterns
Human Genome vs. Chimpanzee Genome
Information Categorization Method
Comparison of human literary texts
Repetitive patterns: words
Frequency and Rank Order Statistics
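A minimal sketch of the frequency and rank order step, assuming a simple tokenizer (runs of letters and apostrophes, upper-cased) and alphabetical tie-breaking. The function name `rank_words` and the sample sentence are hypothetical and chosen only for illustration; the slides do not specify how the texts were tokenized.

```python
import re
from collections import Counter


def rank_words(text, top_n=50):
    """Return {word: rank} for the top_n most frequent words (rank 1 = most frequent)."""
    # Assumed tokenizer: letters and apostrophes only, case-insensitive.
    words = re.findall(r"[A-Za-z']+", text.upper())
    counts = Counter(words)
    # Sort by descending count; break ties alphabetically so the result is reproducible.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: rank for rank, (word, _) in enumerate(ordered[:top_n], start=1)}


# Toy input, not taken from any of the plays discussed here.
sample = "the cat and the dog and the bird"
ranks = rank_words(sample)
print(ranks)
```

Running the same function on two texts gives two rank tables that can be compared word by word.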
Rank Comparison Map
[Figure: the 50 most frequent words, plotted by rank in The Winter's Tale (vertical axis) against rank in Bonduca (horizontal axis).]
Rank Comparison Maps
[Figure, Shakespeare vs. Shakespeare: word rank in The Winter's Tale against word rank in Cymbeline.]
[Figure, Shakespeare vs. Fletcher: word rank in The Winter's Tale against word rank in Bonduca.]
Rank Comparison Maps
[Figure, Jin Yong (金庸) vs. Jin Yong: word rank in 射雕英雄傳 (The Legend of the Condor Heroes) against word rank in 倚天屠龍記 (The Heaven Sword and Dragon Saber).]
[Figure, Jin Yong (金庸) vs. Gu Long (古龍): word rank in 射雕英雄傳 against word rank in 楚留香傳奇 (The Legend of Chu Liuxiang).]
Information-Based Similarity Index
$$D(T_1, T_2) = \frac{1}{N_{12}} \sum_{k=1}^{N_{12}} \left| R_1(w_k) - R_2(w_k) \right| F(w_k)$$
[Figure: rank comparison map, The Winter's Tale (vertical axis) vs. Bonduca (horizontal axis).]
Physical Review Letters 90:108103 (2003); Physica A 329:473-483 (2003); Journal of Computational Biology 12(8):1103-16 (2005).
Cluster Analysis
Known Authorship Classification
Chinese Authorship Debate
Dream of the Red Chamber (紅樓夢)

One of China's four great classical novels.

Written by Cao Xueqin in the middle of the
18th century during the early Qing
Dynasty.

80 Chapters in original manuscript copies.

Gao E and Cheng Weiyuan added 40
additional chapters to complete the novel.
Authorship Debate (紅樓夢)
Ten most frequent words in each 40-chapter block:

Rank | Chapters 1-40   | Chapters 41-80  | Chapters 81-120
     | Word  Frequency | Word  Frequency | Word  Frequency
1    | 了    6250      | 了    8301      | 了    6946
2    | 不    4505      | 不    5676      | 的    5499
3    | 的    4010      | 的    5539      | 不    5009
4    | 一    3891      | 一    4942      | 來    3944
5    | 道    3683      | 來    4097      | 道    3756
6    | 來    3563      | 人    3892      | 是    3741
7    | 人    3139      | 我    3769      | 人    3644
8    | 我    2843      | 是    3720      | 一    3461
9    | 是    2833      | 道    3683      | 說    3391
10   | 說    2805      | 說    3637      | 我    2743
Authorship Debate (紅樓夢)
Who Wrote Shakespeare’s Plays?

Marlowe and Shakespeare were both born in 1564.

Before Shakespeare’s name became widely
known, Marlowe had already produced several
major works in various genres, including
Tamburlaine the Great and Dr. Faustus.

Marlowe’s career tragically ended on 30 May,
1593 when he was apparently murdered in a
dispute.
The Murder of The Man Who Was
Shakespeare – Calvin Hoffman

Shakespeare never visited some of the places that
appear vividly in scenes of his plays.

Shakespeare seems to suddenly appear after
Marlowe’s death.

Marlowe had not died as claimed in 1593, but
instead escaped to a secret refuge in Italy where
he spent the rest of his life writing the body of
plays generally attributed to Shakespeare.
Who Wrote Shakespeare’s Plays?
Shakespeare
?
Marlowe
Henry VIII
The Two Noble Kinsmen
Edward III
Yang AC et al. Physica A 329:473-483 (2003)
Shakespeare versus Fletcher
Critique
"It is like taking all the words and throwing them
in the blender."
~ leading Shakespeare scholar
Support
The Calvin & Rose G. Hoffman Marlowe
Memorial Trust 2003 Prize
Cook G. "Much Ado About Data." Boston Globe, Aug 5, 2003: D1-D4.
Imitations of Ni Kuang's works (仿倪匡作品)
Ni Kuang's original works (倪匡原作)
Application to Human Heartbeat
Heart rate dynamics
Parasympathetic stimulation
Sympathetic stimulation
Which Heart Rate Pattern is Healthy?
[Figure: four heart rate time series, labeled Heart Failure, Heart Failure, Normal, and Atrial Fibrillation.]
Technical Challenges
• How to map a heart rate time series to a symbolic
sequence?
KJLFNHACUARAFVTH
TYAERFVVAEVACVAZ
CFVDFVZDSDSFVSDF
VNTEWOSIXWRXDPOI
JRROIRFUFNVIMVMF
• How to define words in heart rate symbolic sequences?
KJ LFN HACUA RAFVT HTY AER FV
VA EVAC VAZ CF VDF VZ D SDSFV
SDFV NTEWOSI XW RXDP OIJR RO
IRFU FNV IMV MF
Symbolic Mapping
[Figure: interbeat interval (sec) plotted against beat number for beats 1 to 14, with a binary symbol assigned to each pair of successive beats.]
Binary sequence: 1 1 0 0 0 1 1 0 0 1 1 0 0
8-bit words: 11000110, 10001100, 00011001
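A minimal sketch of the symbolic mapping, assuming that a symbol is 1 when the interbeat interval increases relative to the previous beat and 0 otherwise (the slide does not state which direction maps to 1), and that words are formed by sliding an 8-symbol window one beat at a time. The sliding window is consistent with the three example words on the slide. The function names `to_binary` and `to_words` are hypothetical.

```python
def to_binary(intervals):
    """Map successive interbeat intervals to 0/1 symbols.

    Assumed convention: 1 if the interval increased, 0 if it decreased or stayed equal.
    """
    return [1 if b > a else 0 for a, b in zip(intervals, intervals[1:])]


def to_words(symbols, m=8):
    """Return all overlapping m-symbol words, shifting one symbol at a time."""
    return ["".join(map(str, symbols[i:i + m])) for i in range(len(symbols) - m + 1)]


# The binary sequence shown on the slide (13 symbols from 14 beats).
symbols = [1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0]
words = to_words(symbols)
print(words[:3])

# Hypothetical interbeat intervals in seconds, for illustration only.
print(to_binary([0.8, 0.9, 1.0, 0.9]))
```

The resulting words can then be counted and ranked in the same way as words in a literary text.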
Comparison of Human Heartbeat
Healthy vs. Healthy: D = 0.10
Healthy vs. Disease: D = 0.25
Yang AC et al. Physical Review Letters 90:108103 (2003)
Phylogenetic Tree of Human Heartbeat
Yang AC et al. Physical Review Letters 90:108103 (2003)
Clustering of Human Heartbeat Is Associated
with β2-AR Gene Polymorphisms
Yang AC et al. PLoS ONE 6(5): e19232 (2011)
Application to Genetic Sequences
Picture obtained from www.genetic-programming.org
Analogy to Natural Languages
ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACC
AACCTCGATCTCTTGTAGATCTGTTCTCTAAACGA
ACTTTAAAATCTGTGTAGCTGTCGCTCGGCTGCATG
CCTAGTGCACCTACGCAGTATAAACAATAATAAA
TTTTACTGTCGTTGACAAGAAACGAGTAACTCGTCC
CTCTTCTGCAGACTGCTTACGGTTTCGTCCGTGT
TGCAGTCGATCATCAGCATACCTAGGTTTCGTCCGG
GTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTC
TTGGTGTCAACGAGAAAACACACGTCCAACTCAGT
TTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCG
TGGCTTCGGGGACTCTGTGGAAGAGGCCCTATCGG
AGGCACGTGAACACCTCAAAAATGGCACTTGTGGT
Similarity
ACTTAAGTACCTTATCTATCTACAGATAGAAAAGT
TGCTTTTTAGACTTTGTGTCTACTTTTCTCAACTA
AACGAAATTTTTGCTATGGCCGGCATCTTTGATGCT
GGAGTCGTAGTGTAATTGAAATTTCATTTGGGTT
GCAACAGTTTGGAAGCAAGTGCTGTGTGTCCTAGT
CTAAGGGTTTCGTGTTCCGTCACGAGATTCCATTC
TACAAACGCCTTACTCGAGGTTCCGTCTCGTGTTTG
TGTGGAAGCAAAGTTCTGTCTTTGTGGAAACCAG
TAACTGTTCCTAATGGCCTGCAACCGTGTGACACT
TGCCGTAGCAAGTGATTCTGAAATTTCTGCAAATG
GCTGTTCTACTATTGCGCAAGCCGTCCGCCGTTATA
GCGAGGCCGCTAGCAATGGTTTTAGGGCATGCCG
DNA “Words”
5’ TACCCCCACTGTCAACCCAACACAGGCATG…… 3’

Word  Frequency  Rank
CCC   633        1
CCT   543        2
CTA   526        3
AAA   524        4
ACC   515        5
…     …          …
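A minimal sketch of how a word, frequency, and rank table like the one above could be built, assuming that DNA words are overlapping 3-letter tuples read one base at a time (the slide does not say whether the words overlap). The function name `dna_word_ranks` is hypothetical, and the input is only the short fragment visible on the slide, so its counts will not match the table.

```python
from collections import Counter


def dna_word_ranks(sequence, n=3):
    """Return a list of (word, frequency, rank), most frequent first."""
    seq = sequence.upper()
    # Assumed convention: overlapping n-letter words, shifting one base at a time.
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    # Sort by descending count; break ties alphabetically so the result is reproducible.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, count, rank) for rank, (word, count) in enumerate(ordered, start=1)]


# Only the fragment shown on the slide; the full sequence is elided there.
table = dna_word_ranks("TACCCCCACTGTCAACCCAACACAGGCATG")
for word, count, rank in table[:5]:
    print(word, count, rank)
```

Ranking the words of two sequences this way gives the inputs for a rank comparison map between genomes.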
Rank Comparison Maps
[Figure, Same Species: rank of each 3-letter DNA word in human mitochondrial DNA against its rank in human mitochondrial DNA.]
[Figure, Different Species: rank of each 3-letter DNA word in human mitochondrial DNA against its rank in gorilla mitochondrial DNA.]
Human Influenza Virus
Our result is consistent with a previous finding based on
sequence alignment (Science 1986; 232: 980).
Genome-wide Sequence
Comparison (SARS Coronavirus)
Yang AC et al. Journal of Computational Biology 12(8):1103-16 (2005).
Mathematics compares the most
diverse phenomena and discovers the
secret analogies that unite them
- Joseph Fourier
Selected References and Tutorial
• 1. Yang AC, Hseu SS, Yien HW*, Goldberger AL, Peng CK. Linguistic analysis of human heartbeats using frequency and rank order statistics. Physical Review Letters 90:108103 (2003).
• 2. Yang AC, Peng CK, Yien HW, Goldberger AL. Information categorization approach to literary authorship disputes. Physica A 329:473-483 (2003).
• 3. Yang AC, Goldberger AL, Peng CK*. Genomic classification using an information-based similarity index: application to the SARS coronavirus. Journal of Computational Biology 12(8):1103-16 (2005).
• 4. Peng CK, Yang AC, Goldberger AL. Statistical physics approach to categorize biologic signals: from heart rate dynamics to DNA sequences. Chaos 17:015115 (2007).
• 5. Yang AC, Tsai SJ, Hong CJ, Wang C, Chen TJ, Liou YJ, Peng CK. Clustering heart rate dynamics is associated with β-adrenergic receptor polymorphisms: analysis by information-based similarity index. PLoS ONE 6(5): e19232 (2011).
Online Tutorial: http://www.physionet.org/physiotools/ibs/
Physionet: NIH Research Resource for Complex Physiologic Signals