Physics of information

'Communication in the presence of noise', C. E. Shannon, Proc. Inst. Radio Eng. (1949)
'Some informational aspects of visual perception', F. Attneave, Psych. Rev. (1954)

Ori Katz, ori.katz@weizmann.ac.il

Talk overview
• Information capacity of a physical channel
• Redundancy, entropy and compression
• Connection to biological systems
Emphasis: concepts, intuitions, and examples.

A little background
• An extension of "A Mathematical Theory of Communication" (1948).
• The basis for the field of information theory (and the first use in print of 'bit').
• Shannon worked for Bell Labs at the time.
• His Ph.D. thesis, "An Algebra for Theoretical Genetics", was never published.
• He built the first juggling machine ('W. C. Fields') and a mechanical mouse with learning capabilities ('Theseus').

A general communication system
[Block diagram: information source → encoder → transmitter; the 'message' becomes a continuous function s(t), e.g. a pressure amplitude; it passes through a physical channel of bandwidth W where noise is added; the receiver gets s(t)+n(t) → decoder → information destination.]

Shannon's route through this abstract problem:
1) Encoder: each message is coded as a continuous waveform s(t).
2) Sampling theorem: s(t) is represented by a finite number of samples.
3) Geometric representation: the samples become a point in a Euclidean space.
4) Analyzing the addition of noise (the physical channel) yields a limit on the reliable transmission rate.

The (Nyquist/Shannon) sampling theorem
• The transmitted waveform is a continuous function of time s(t) whose bandwidth is limited by the physical channel: S(f > W) = 0, where S(f) = ∫ s(t) e^(−i2πft) dt is the Fourier (frequency-domain) transform.
• Sample its values at discrete times Δt = 1/fs (fs = sampling frequency): Vn = [s(Δt), s(2Δt), …].
• s(t) can be represented exactly by the discrete samples Vn as long as fs ≥ 2W (the Nyquist sampling rate).
• Result: a waveform of duration T is represented by 2WT numbers, i.e. a vector in a 2WT-dimensional space:
  V = [s(1/2W), s(2/2W), …, s(2WT/2W)]

An example of the Nyquist rate: the music CD
• Audible human-ear frequency range: 20 Hz - 20 kHz.
• The Nyquist rate is therefore 2 × 20 kHz = 40 kHz.
• CD sampling rate = 44.1 kHz, fulfilling the Nyquist rate.
Anecdotes:
• The exact rate was inherited from late-1970s magnetic-tape storage conversion devices.
• There was a long debate between Philips (44,056 samples/sec) and Sony (44,100 samples/sec)...

The geometric representation
• Each continuous signal s(t) of duration T and bandwidth W is mapped to a point in a 2WT-dimensional space whose coordinates are the sampled amplitudes:
  V = [x1, x2, …, x2WT] = [s(1/2W), …, s(2WT/2W)]
• In our example: a 1-hour CD recording is a single point in a space with 44,100 × 60 sec × 60 min = 158.8×10^6 dimensions (!!)
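Not part of the original slides: a minimal Python sketch of the sampling theorem at work. The bandwidth W, the duration T and the two test tones are illustrative assumptions; the point is only that the 2WT Nyquist-rate samples determine the whole band-limited waveform.

```python
import numpy as np

# Sampling-theorem sketch: a band-limited signal of duration T and bandwidth W
# is fully determined by 2WT samples taken at the Nyquist rate fs = 2W, and can
# be rebuilt from them by Whittaker-Shannon (sinc) interpolation.
W = 100.0              # bandwidth limit in Hz, i.e. S(f > W) = 0  (assumed value)
T = 1.0                # record duration in seconds                (assumed value)
fs = 2 * W             # Nyquist sampling rate
n = int(fs * T)        # 2WT numbers describe the whole waveform

def s(t):
    """Band-limited test signal: two tones well below W (assumed frequencies)."""
    return np.sin(2 * np.pi * 30.0 * t) + 0.5 * np.cos(2 * np.pi * 70.0 * t)

t_n = np.arange(n) / fs        # sampling instants, spaced Delta-t = 1/fs
V = s(t_n)                     # the 2WT-dimensional vector of samples

# Reconstruct s(t) on a fine grid; stay away from the record edges, where
# truncating the (formally infinite) sinc sum causes small errors.
t_fine = np.linspace(0.25 * T, 0.75 * T, 1000)
s_rec = np.array([np.sum(V * np.sinc(fs * (t - t_n))) for t in t_fine])

print(n, "samples; max interpolation error:", np.max(np.abs(s_rec - s(t_fine))))
```

Lowering fs below 2W aliases the 70 Hz tone and the reconstruction fails, which is the quickest way to see why the Nyquist threshold matters.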
• The norm (squared distance) in this Euclidean space measures the signal's total energy / average power:
  d² = Σ xn²  (n = 1 … 2TW)  = 2W ∫ s²(t) dt = 2W·E = 2WT·P
  (E = total signal energy, P = average signal power).

Addition of noise in the channel
• Example in a 3-dimensional space (the first 3 samples of the CD):
  V = [x1, x2, …, x2WT] = [s(Δt), s(2Δt), …, s(T)]
[Figure: the signal point in the (x1, x2, x3) space, smeared by noise into a spherical cloud of uncertainty.]
• Adding white Gaussian (thermal) noise of average power N smears each point into a spherical cloud of radius √(2WTN):
  V(S+N) = [s(Δt)+n(Δt), s(2Δt)+n(2Δt), …, s(T)+n(T)]
• For large T the noise power converges to its statistical average N, so the received point lies on the shell of that sphere, at distance √(2WTN) from the transmitted point: the "clouded" sphere of uncertainty becomes rigid.

The number of distinguishable messages
• Reliable transmission: the receiver must distinguish between any two different messages under the given noise conditions.
• The maximum number of distinguishable messages M is set by the 'sphere-packing' problem in 2TW dimensions:
  M ≤ accessible volume / noise-sphere volume
    = Volume{sphere of radius √(2WT(P+N))} / Volume{sphere of radius √(2WTN)}
    = [(P+N)/N]^(TW)
• The longer the mapped message, the more 'rigid' the spheres, and the error probability can be made as small as one wants (reliable transmission).

The channel capacity
• Number of distinguishable messages (coded as signals of length T):
  M ≤ [(P+N)/N]^(TW)
• Number of distinguishable bits:
  #bits = log2 M ≤ TW·log2((P+N)/N)
• The reliably transmittable bit-rate (bits per unit time):
  C = #bits / T = W·log2((P+N)/N) = W·log2(1 + P/N)   (in bits/second)
  with W the channel bandwidth and P/N the signal-to-noise ratio (SNR).
• This is the celebrated 'channel capacity theorem' by Shannon, who also proved that C can actually be reached.

Gaussian white noise = thermal noise?
• With no signal, the receiver measures a fluctuating noise. In our example: pressure fluctuations of the air molecules impinging on the microphone (thermal energy).
• The statistics of thermal noise are Gaussian: P{s(t) = v} ∝ exp(−m·v²/2kT).
• Its power spectral density is constant, |S(f)|² = const ("white" noise), in contrast to "pink/brown" noise whose spectrum falls with frequency.
[Figure: a noisy pressure trace and its Gaussian amplitude histogram; flat ("white") versus decaying ("pink/brown") power spectra.]

Some examples of physical channels
Channel capacity limit: C = W·log2(1 + P/N) (in bits/second)
1) Speech (e.g. this lecture): W = 20 kHz, P/N ≈ 1 - 100, so C ≈ 20,000 - 130,000 bps.
   Actual bit-rate ≈ (2 words/sec) × (5 letters/word) × (5 bits/letter) = 50 bps.
2) Visual sensory channel: bandwidth W ≈ (images/sec) × (receptors/image) × (two eyes) ≈ 25 × 50×10^6 × 2 ≈ 2.5×10^9 Hz, and P/N > 256, so
   C ≈ 2.5×10^9 × log2(256) ≈ 20×10^9 bps.
   A two-hour movie: 2 hours × 60 min × 60 sec × 20 Gbps ≈ 1.4×10^14 bits ≈ 18,000 Gbyte (a DVD holds 4.7 Gbyte).
• We are not using the channel capacity: the information is redundant.
• Simplify processing by compressing the signal, extracting only the essential information (what is essential…?!).

Redundant-information demonstration (using Matlab; a rough Python equivalent is sketched after this list)
• Original sample: 44.1 kSamples/s × 16 bits/sample = 705.6 kbps (CD quality).
• With only 4 bits per sample: 44.1 kS/s × 4 bits/sample = 176.4 kbps.
• With only 3 bits per sample: 44.1 kS/s × 3 bits/sample = 132.3 kbps.
• With only 2 bits per sample: 44.1 kS/s × 2 bits/sample = 88.2 kbps.
• With only 1 bit per sample (!): 44.1 kS/s × 1 bit/sample = 44.1 kbps.
[Plots: the same waveform, amplitude vs. sample number, re-quantized at each bit depth.]
• The 1-bit version sounds not too good, but the essence is there…
• Main reason: not all of 'phase space' is accessible to the mouth/ear.
• Another example: the (smart) high-compression mp3 algorithm at 16 kbps.
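The bit-depth demonstration above was done in Matlab; below is a rough Python sketch of the same re-quantization. The input file name, the use of SciPy's WAV reader and the assumption of a mono 16-bit recording are illustrative choices, not part of the slides.

```python
import numpy as np
from scipy.io import wavfile

def requantize(x, bits):
    """Re-quantize samples x (floats in [-1, 1]) to 2**bits uniform levels."""
    levels = 2 ** bits
    q = np.round((x + 1.0) / 2.0 * (levels - 1))   # map [-1, 1] onto the levels
    return q / (levels - 1) * 2.0 - 1.0            # and back to [-1, 1]

# "speech_sample.wav" is a placeholder: any mono 16-bit, 44.1 kS/s file will do.
rate, raw = wavfile.read("speech_sample.wav")
x = raw.astype(np.float64) / np.iinfo(raw.dtype).max

for bits in (16, 4, 3, 2, 1):
    y = requantize(x, bits)
    wavfile.write(f"speech_{bits}bit.wav", rate, (y * 32767).astype(np.int16))
    print(f"{bits} bit/sample -> {rate * bits / 1000:.1f} kbps")
```

Listening to the 1-bit output makes the slide's point: intelligibility survives because speech occupies only a small part of the available 'phase space'.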
Visual redundancy / compression
• Images: the redundancies catalogued in Attneave's paper (1954) are essentially the ones exploited by image-compression formats (2008): edges, short-range similarities, patterns, repetitions, symmetries, etc.
• Attneave's example of a highly redundant scene: "a bottle" on "a table".
[Figure: the same 400×600 image saved as a 704 Kbyte .bmp and as .jpg files of 30.6, 10.9, 8, 6.3, 5 and 4 Kbyte, plus an 80×50-pixel version.]
• What information is essential?? (evolution…?)
• Movies: the same redundancies, plus consecutive images are similar…
• Text: a future 'language' lesson (Lilach & David).

How much can we compress?
How many bits are needed to code a message?
• Intuitively: #bits = log2 M (M = number of possible messages).
• Regularities/lawfulness mean a smaller M.
• If some messages are more probable than others, we can do better than log2 M.
• A message can be coded (without loss of information) with
  H = −Σ p(Mi)·log2 p(Mi)   bits/message
  summed over the messages Mi; this is the source 'entropy'.
• Intuition: shorter bit-strings can be used for the more probable messages.

Lossless-compression example (entropy code)
Example: M = 4 possible messages (e.g. tones): 'A' (94%), 'B' (2%), 'C' (2%), 'D' (2%).
1) Without compression, 2 bits/message: 'A'→00, 'B'→01, 'C'→10, 'D'→11.
2) A better code: 'A'→0, 'B'→10, 'C'→110, 'D'→111:
   <bits/message> = 0.94×1 + 0.02×2 + 2×(0.02×3) = 1.1 bits/message.
The source entropy is −Σ p(Mi)·log2 p(Mi) = −0.94·log2(0.94) − 3×0.02·log2(0.02) ≈ 0.42 bits/message.
(A short numerical check of this example is given after the Summary.)

Why entropy?
H = −Σ p(Mi)·log2 p(Mi)   bits/message
• It is the only measure that fulfills four 'physical' requirements:
  1. H = 0 if P(Mi) = 1.
  2. A message with P(Mi) = 0 does not contribute.
  3. The entropy is maximal for equally distributed messages.
  4. For the addition of two independent message spaces: Hx+y = Hx + Hy.
• Any regularity implies probable patterns and therefore a lower entropy (redundant information).

The speech vocoder (VOice-CODer)
• Models the vocal tract with a small number of parameters.
• The lawfulness of speech confines it to a small subspace; the approach only fails for musical input.
• Used by Skype / Google Talk / GSM (~8-15 kbps).
• The ancestor of modern speech CODECs (COder-DECoders): the 'human organ'.

Link to biological systems
• Information is conveyed via a physical channel: cell to cell, DNA to cell, a cell to its descendant, neurons / the nervous system.
• The physical channel: concentrations of molecules (mRNA, ions, …) as a function of space and time.
• Bandwidth limit: the parameters cannot change at an infinite rate (diffusion and chemical-reaction timescales, …).
• Signal to noise: thermal fluctuations, the environment.
• Major difference: transmission is not 100% reliable. Model: an overlap of non-rigid uncertainty clouds.
• Use the channel-capacity theorem at your own risk...

Summary
• Physical channel → capacity theorem
• SNR, bandwidth
• Geometric representation
• Entropy as a measure of redundancy
• Link to biological systems
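Closing worked example (not on the slides): a few lines of Python that reproduce the numbers of the lossless-compression example, i.e. the source entropy (~0.42 bits/message) and the average length of the variable-length prefix code (1.1 bits/message) versus the 2-bit fixed-length code.

```python
import math

# Four-tone source from the entropy-coding example: 'A' 94%, 'B'/'C'/'D' 2% each.
p = {"A": 0.94, "B": 0.02, "C": 0.02, "D": 0.02}
code = {"A": "0", "B": "10", "C": "110", "D": "111"}   # the prefix code from the slide

entropy = -sum(pi * math.log2(pi) for pi in p.values())   # ~0.42 bits/message
avg_len = sum(p[m] * len(code[m]) for m in p)             # 1.10 bits/message
fixed_len = math.log2(len(p))                             # 2 bits/message

print(f"source entropy     : {entropy:.2f} bits/message")
print(f"prefix-code average: {avg_len:.2f} bits/message")
print(f"fixed-length code  : {fixed_len:.0f} bits/message")
```

An entropy code that groups several messages into one code word (e.g. Huffman or arithmetic coding over blocks) can approach the 0.42-bit entropy bound, which is why entropy is the natural measure of how much redundancy is left to remove.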