The Beginner's Guide to analyzing GSM data in MatLab

2007-03-02, The GSM Scanner Project (GSMSP)
http://scratchpad.wikia.com/wiki/Gsm
Abstract: The GSMSP uses the USRP hardware device to receive data
from the GSM band. This data is raw and pretty useless unless filtered
correctly. The GSMSP created a challenge to extract as much
information as possible from 3 example USRP data dumps.
Tore/Norway won the challenge. He used MatLab and routines from the
GSMsim toolkit to extract meaningful data. His results are available on
our webpage. Not everyone is familiar with MatLab. This document
gives a step-by-step approach of how to analyze the example data dumps
with MatLab.
What Is MatLab?
MatLab is a high-performance language for technical computing. It integrates
computation, visualization, and programming in an easy-to-use environment where
problems and solutions are expressed in familiar mathematical notation. It comes with a
powerful scripting language (the famous .m files) that can be used to process/filter and
visualize/display data.
At the time of writing there is no public software GSM implementation. Tore decided to
use MatLab because he was familiar with it and it gives results quickly.
MatLab is good for prototyping and for processing small amounts of data. The GSMSP is
hoping to develop its own software GSM implementation to process GSM data in
real time once the theory is understood.
Some Basics
The USRP is a generic hardware receiver/transmitter (transceiver) for receiving data from
any frequency band (http://www.ettus.com). It is cheap ($750) and can be used to receive
raw data from the GSM frequency band.
The USRP runs at 64 MHz. This means that 64 million times per second it converts the
signal (amplitude) on the frequency band from analog to digital. This is called sampling
and is the first step of Digital Signal Processing (there is more to it. Read more!).
The GSM band is 25 MHz wide. It is divided into 125 carrier frequencies. This makes
each carrier 200 kHz wide (25,000,000 / 125 = 200,000). Our sample rate has to be at
least 400 kHz (following Nyquist’s theorem). A GSM burst period lasts for 15/26 msec (0.577
msec). Each burst lasts 156.25 bits. Only 148 bits carry data. The other 8 ¼ bits are used as
a ‘Guard Period’ at the end of each burst. This is to prevent overlapping of bursts. There
is no useful information in them. This means that every 15/26/156.25 msec we have a
new bit of information. That makes 270,833 bits per second (1 / (15/26/156.25) bits per msec).
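The arithmetic above can be checked with a few lines of code. The following is a minimal sketch in Python (it is not part of the GSMSP or GSMsim files, and all variable names are chosen here for illustration). It uses exact fractions so that no rounding creeps in:

```python
from fractions import Fraction

# Figures quoted in the text above; the names are illustrative only.
BAND_HZ = 25_000_000
CARRIERS = 125
BURST_MS = Fraction(15, 26)           # duration of one burst period in msec
BITS_PER_BURST = Fraction(625, 4)     # 156.25 bit periods per burst
DATA_BITS = 148

carrier_hz = BAND_HZ // CARRIERS              # width of one carrier
guard_bits = BITS_PER_BURST - DATA_BITS       # guard period in bit periods
bit_period_ms = BURST_MS / BITS_PER_BURST     # duration of one bit in msec
bits_per_second = 1000 / bit_period_ms        # exact bit rate as a fraction

print("carrier width:", carrier_hz, "Hz")
print("guard period :", float(guard_bits), "bits")
print("bit rate     :", float(bits_per_second), "bits per second")
```

The exact bit rate is 812,500 / 3 bits per second, which the text rounds down to 270,833.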
The USRP can give us up to 64,000,000 samples per second. That would be very precise
but would be far too much data. The GSM data dumps that Robert created with GNU Radio
and the USRP end in *_128.cfile or *_64.cfile.
Example: GSMSP_2007_robert_dbsrx_941.0Mhz_128.cfile
This means that he used a decimation of 128, i.e. instead of receiving 64,000,000
samples he configured his USRP to receive 64,000,000 / 128 samples per second. This
means 128 samples are merged into 1 sample.
Let’s calculate if this is fast enough: We need at least 400,000 samples per second.
Robert's example file above has 64,000,000 / 128 = 500,000 samples per second. Just
enough!
The GNU Radio M&M clock recovery block requires a minimum sample rate that is twice
the symbol rate (270,833 / sec). That would have required at least 541,666 samples per
second. This requirement does not apply to MatLab.
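The two sample-rate requirements can be compared for both decimation factors used in the dump files. This is a small Python sketch under the figures given in the text; the constant and function names are assumptions made for this example:

```python
ADC_RATE = 64_000_000            # USRP samples per second before decimation
SYMBOL_RATE = 812_500 / 3        # about 270,833 symbols per second
NYQUIST_MIN = 400_000            # 2 x 200 kHz carrier width, as in the text
MM_MIN = 2 * SYMBOL_RATE         # requirement quoted for M&M clock recovery


def sample_rate(decimation):
    """Samples per second delivered for a given decimation factor."""
    return ADC_RATE / decimation


for decimation in (128, 64):
    rate = sample_rate(decimation)
    print(decimation, rate, rate >= NYQUIST_MIN, rate >= MM_MIN)
```

With a decimation of 128 the first check passes and the second fails; with a decimation of 64 both pass.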
The next step is to oversample the signal so that we get an N * 270833 oversampled signal.
This is because we want to feed the next stage with samples as close to the centre of the
symbol interval as possible and we need some time resolution to do this. A large N gives
good accuracy, but requires more processing power. We use N = 4 and get good results.
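For the 500,000 samples per second dump and N = 4, the resampling ratio works out to exactly 13/6. The Python sketch below computes that ratio and shows a deliberately simple linear-interpolation resampler. It is an illustration only: it applies no filtering, the function name is made up for this example, and it is not the method used by the toolkit scripts.

```python
from fractions import Fraction

INPUT_RATE = 500_000                      # 64 MHz / 128
SYMBOL_RATE = Fraction(812_500, 3)        # about 270,833 symbols per second
N = 4                                     # oversampling factor used in the text

ratio = N * SYMBOL_RATE / INPUT_RATE      # output samples per input sample


def resample_linear(samples, ratio):
    """Resample by linear interpolation (illustration only, no filtering)."""
    n_out = int((len(samples) - 1) * ratio) + 1
    out = []
    for k in range(n_out):
        pos = Fraction(k) / ratio         # position in the input sequence
        i = int(pos)
        frac = pos - i
        if i + 1 < len(samples):
            out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        else:
            out.append(samples[i])
    return out


print(ratio)                              # 13/6
ramp = [float(v) for v in range(7)]
print(len(resample_linear(ramp, ratio)))  # 14
```

Because the ratio is the rational number 13/6, a real implementation would more likely interpolate by 13 and decimate by 6 with a proper low-pass filter.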
8 bursts are grouped into 1 TDMA frame (Burst 0 - Burst 7. GSM people like to call
them TS0 - TS7). 1 burst lasts 15/26 msec and 1 TDMA frame lasts 8 * 15/26 = 120/26
msec. If we found one TS0 burst we get another TS0 burst after waiting 7 more bursts
(TS1 – TS7). If we received 51 of these TS0 bursts we call it a 51-multiframe.
51-multiframes are used on the Control Channel (TS0). The other channels are either control
channels or traffic channels (TCH) depending on the configuration. 26-multiframes are
used on the TCH. If we are only interested in the 51-multiframe we have to read 408
bursts to get a full TS0 51-multiframe (51 * 8 = 408; 51 * 7 of the bursts read are
bursts from TS1 – TS7).
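The frame and multiframe figures follow directly from the burst duration. A short Python sketch of the calculation (names are illustrative, exact fractions are used to keep the results exact):

```python
from fractions import Fraction

BURST_MS = Fraction(15, 26)           # one burst (timeslot) in msec
SLOTS_PER_FRAME = 8                   # TS0 - TS7

frame_ms = SLOTS_PER_FRAME * BURST_MS         # one TDMA frame
multiframe_51_ms = 51 * frame_ms              # one 51-multiframe
multiframe_26_ms = 26 * frame_ms              # one 26-multiframe

bursts_per_51 = 51 * SLOTS_PER_FRAME          # bursts read per 51-multiframe
other_slots = 51 * (SLOTS_PER_FRAME - 1)      # of which belong to TS1 - TS7

print("TDMA frame    :", float(frame_ms), "msec")
print("51-multiframe :", float(multiframe_51_ms), "msec")
print("26-multiframe :", float(multiframe_26_ms), "msec")
print("bursts to read:", bursts_per_51, "of which", other_slots, "are not TS0")
```

A 51-multiframe therefore lasts about 235.4 msec and a 26-multiframe exactly 120 msec.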
Our quest should be how to find the start of a 51-multiframe. This is what a 51-multiframe looks like:
FSBBBBCCCCFSCCCCCCCCFSCCCCCCCCFSCCCCCCCCFSCCCCCCCCI
F: FCCH (Frequency Correction Channel)
S: SCH (Synchronisation Channel)
B: BCCH (Broadcast Control Channel)
C: CCCH (Common Control Channel)
I: Idle (nothing)
We get an FCCH burst every 10 bursts on TS0. This is always followed by an SCH burst.
Remember that these are bursts only from TS0. In fact if we find an FCCH burst we have
to wait 7 more bursts (TS1 – TS7) before we get another burst from TS0 which then has
to be an SCH.
The information from the SCH allows us to calculate the Frame Number (FN). For each
TDMA frame the FN is incremented by 1 until it reaches 26 * 51 * 2048 - 1 = 2,715,647.
It then starts at 0 again. If we take the FN modulo 51 we know exactly if the current
TS0 burst is at the beginning of a 51-multiframe or somewhere else. The SCH also tells
us about the Base Station Color Code (BCC) and the Network Color Code (NCC).
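The modulo-51 lookup can be sketched in a few lines of Python. The layout string is the example configuration shown above; the dictionary and the function name are assumptions made for this example, and other base station configurations would need a different string:

```python
# Example 51-multiframe layout from the text, one letter per TS0 burst.
LAYOUT = "FSBBBBCCCC" + "FSCCCCCCCC" * 4 + "I"

NAMES = {"F": "FCCH", "S": "SCH", "B": "BCCH", "C": "CCCH", "I": "Idle"}


def channel_type(fn):
    """Channel type of the TS0 burst with frame number fn."""
    return NAMES[LAYOUT[fn % 51]]


assert len(LAYOUT) == 51

# Positions of the FCCH bursts inside the multiframe (counting from 0).
fcch_positions = [i for i, c in enumerate(LAYOUT) if c == "F"]
print(fcch_positions)

for fn in (0, 1, 2, 6, 50, 51, 52):
    print(fn, channel_type(fn))
```

The FCCH sits at positions 0, 10, 20, 30 and 40, which is the "every 10 bursts" rule from the text, and the SCH always follows one position later.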
Note: The above 51-multiframe is one out of many possible configurations.
It’s up to the base station. They all start with FSBBBBCCCC and end with an I.
The base station tells the MS in the BCCH message (4 BCCH bursts = 1 message) how
the rest of the 51-multiframe is structured. Oh, and there can be 51-multiframes on TCH
(TS0 – TS7) as well – more on that later.
MatLab
Extract the example files into c:\gsmsp. Start MatLab. You should see a ‘Command
Window’ on the right. Change to the directory that contains all the files by typing:
>> cd c:\gsmsp
>> dir
Output:
.
..
DeMUX.m
GSMSP_20070204_robert_dbsrx_941.0MHz_128.cfile
GSMSP_20070204_robert_dbsrx_953.6MHz_128.cfile
GSMSP_20070204_robert_dbsrx_953.6MHz_64.cfile
T_SEQ_gen.m
calc_freq_offset.m
[…more data here…]
Open step1.m to get a feel for what MatLab scripts look like. Take a look at step1.m and
you will see that it loads GSMSP_20070204_robert_dbsrx_953.6MHz_128.cfile and sets
the sample rate to 500,000 (64 MHz / 128). 953.6 MHz is within the European GSM
band for the downlink (Base Station (BS) to Mobile Station (MS)). Uncomment
lines 6-10 in step1.m if you want to load a different dump file.
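For readers who want to peek into a dump file outside MatLab: a .cfile written by GNU Radio holds complex samples as interleaved pairs of 32-bit floats (I, Q, I, Q, ...). The Python sketch below reads such a file. It assumes the file was written in the byte order of the machine reading it, the function name `read_cfile` is made up for this example, and it does not reproduce what step1.m does internally. To keep the example runnable without a real dump it creates a tiny synthetic file first.

```python
import array
import os
import tempfile


def read_cfile(path, max_samples=None):
    """Read interleaved 32-bit float I/Q pairs into a list of complex numbers."""
    with open(path, "rb") as fh:
        data = fh.read() if max_samples is None else fh.read(8 * max_samples)
    data = data[: len(data) - len(data) % 8]    # drop a trailing partial sample
    floats = array.array("f")                   # native byte order, 4 bytes each
    floats.frombytes(data)
    return [complex(floats[i], floats[i + 1]) for i in range(0, len(floats), 2)]


# Demonstration with a tiny synthetic file instead of a real dump.
with tempfile.NamedTemporaryFile(suffix=".cfile", delete=False) as tmp:
    tmp.write(array.array("f", [1.0, -1.0, 0.5, 0.25]).tobytes())
try:
    samples = read_cfile(tmp.name)
finally:
    os.remove(tmp.name)

print(samples)
```

For a real dump, pass the path of one of the files listed above and a `max_samples` limit, since the full files are large.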
>> step1
Output:
fcch_start =12962
frequency_offset_before_Hz = 8.6956e+003
frequency_offset_after_Hz =7.3344
The script also plots some graphics:
Wow, a Frequency Correction Channel burst (FCCH) was found at position
12962 in the example dump file. This is before oversampling by 4. Again, take a look
at step1.m to see how the FCCH is found.
It also calculates that the frequency offset is 8695 Hz. This is a hardware issue with the
USRP. The frequency offset tells us that the received signal is shifted by 8695 Hz. We can
compensate for this in software (see step1.m). After the correction it’s only around 7 Hz
(7.3344 Hz). This is good enough for the GSMsim demodulator to work with.
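One common way to measure such an offset relies on the fact that an FCCH burst is a pure tone one quarter of the symbol rate (about 67.7 kHz) above the carrier. The Python sketch below estimates the tone frequency from the average phase step between neighbouring samples and then shifts the signal back. It runs on a synthetic tone, the function names are invented for this example, and it is not claimed to be the algorithm used in step1.m or calc_freq_offset.m.

```python
import cmath
import math

FS = 500_000.0                       # sample rate of the *_128.cfile dumps
FCCH_TONE_HZ = 812_500 / 3 / 4       # symbol rate / 4, about 67,708 Hz


def estimate_frequency(samples, fs):
    """Mean frequency from the average phase step between neighbouring samples."""
    acc = sum(b * a.conjugate() for a, b in zip(samples, samples[1:]))
    return cmath.phase(acc) * fs / (2 * math.pi)


def shift_frequency(samples, shift_hz, fs):
    """Shift the whole signal down by shift_hz."""
    return [s * cmath.exp(-2j * math.pi * shift_hz * n / fs)
            for n, s in enumerate(samples)]


# Synthetic FCCH-like tone with an offset similar to the one reported above.
true_offset = 8695.6
tone = [cmath.exp(2j * math.pi * (FCCH_TONE_HZ + true_offset) * n / FS)
        for n in range(280)]

offset_before = estimate_frequency(tone, FS) - FCCH_TONE_HZ
corrected = shift_frequency(tone, offset_before, FS)
offset_after = estimate_frequency(corrected, FS) - FCCH_TONE_HZ

print(round(offset_before, 1), round(offset_after, 3))
```

On real, noisy data the residual offset will not be zero, which matches the few Hz left over in the output above.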
An FCCH is always followed by an SCH burst. Finding the SCH is our goal in
step2.m.
We oversample by 4 so the fcch_start after oversampling is at 4 * 12962 = 51848. We
expect the SCH burst 8 bursts (or 156.25 * 8 bits) later (skip TS1 – TS7). Because we
oversampled by 4 we have to skip 156.25 * 8 * 4 = 5000 positions. We expect the SCH
burst at position 51848 + 5000 = 56848. Nothing is precise and we might find the actual
SCH a couple of positions before or after.
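The position estimate can be written out as a short calculation. This Python sketch only repeats the arithmetic from the text; the variable names are chosen for this example:

```python
OSR = 4                      # oversampling factor
BITS_PER_BURST = 156.25
SLOTS_PER_FRAME = 8

fcch_start = 12962                                   # reported by step1
fcch_start_os = OSR * fcch_start                     # position after oversampling
samples_per_burst = BITS_PER_BURST * OSR             # samples in one burst
skip = int(SLOTS_PER_FRAME * samples_per_burst)      # one full TDMA frame
expected_sch = fcch_start_os + skip

print(fcch_start_os, samples_per_burst, skip, expected_sch)
```

Since one burst is exactly 625 oversampled positions long, every TS0 burst is 5000 positions after the previous one.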
Take a look at step2.m and you will see that we start looking for the SCH at around
position fcch_start + 5000.
Step2.m helps us find the exact location of the SCH burst.
>> step2
Output:
sync_burst_start =56850
BCC =0
PLM =7
FN =857107
The first synchronization burst is found at 56850. This is only 2 positions later than what
we calculated above! The frame number can be calculated and is 857107 for the first
SCH burst. The frame number is important and will later on help us to figure out what
type of burst we have (CCCH, SCH, BCCH, …). The SCH burst also contains the Base
Station Identity Code (BSIC), which is 56 (PLMN color = 7, BS color = 0 makes BSIC = 7 * 8 + 0 = 56).
See 3GPP standard GSM_44.018:9.1.30a for more information on how to decode the BSIC.
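The BSIC is a 6-bit value: the upper 3 bits are the PLMN (network) color code and the lower 3 bits are the BS color code. A minimal Python sketch of packing and unpacking it (the function names are made up for this example):

```python
def make_bsic(plmn_color, bs_color):
    """Combine the two 3-bit color codes into the 6-bit BSIC."""
    return (plmn_color << 3) | bs_color


def split_bsic(bsic):
    """Return (PLMN color, BS color) for a 6-bit BSIC."""
    return bsic >> 3, bsic & 0x07


bsic = make_bsic(7, 0)          # values reported by step2
print(bsic, split_bsic(bsic))
```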
Press space to continue the script. The script will look for the FCCH 9 TS0-bursts later and
then again find the SCH burst 1 TS0-burst later. A second SCH burst is found:
sync_burst_start=106850
BCC =0
PLM =7
FN =857117
It contains the same BCC and PLM values as the first SCH burst; only the FN has advanced by 10.
Press space again to find a third SCH burst:
sync_burst_start =156849
BCC =0
PLM =7
FN =857127
Press space once more to finish the script. No further SCH bursts are found.
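We found the FCCH and the SCH. To calculate if the next burst is a BCCH or a CCCH we
have to take the FN modulo 51. For the first SCH burst this gives 857107 mod 51 = 1, so it
is burst 1 of a 51-multiframe. The four TS0 bursts that follow it (bursts 2 to 5)
therefore carry a BCCH message.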
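The three SCH results can be cross-checked against the 51-multiframe layout, in which the SCH occupies positions 1, 11, 21, 31 and 41 (counting from 0). This Python sketch uses only the numbers printed by step2; the variable names are illustrative:

```python
sch_positions = [56850, 106850, 156849]     # sync_burst_start values from step2
frame_numbers = [857107, 857117, 857127]    # FN values from step2

SAMPLES_PER_FRAME = 5000                    # 8 bursts * 156.25 bits * 4

fn_mod_51 = [fn % 51 for fn in frame_numbers]
fn_steps = [b - a for a, b in zip(frame_numbers, frame_numbers[1:])]
sample_steps = [b - a for a, b in zip(sch_positions, sch_positions[1:])]
expected_step = 10 * SAMPLES_PER_FRAME      # SCH repeats every 10 TDMA frames

print(fn_mod_51)        # position of each SCH inside its 51-multiframe
print(fn_steps)         # frames between consecutive SCH bursts
print(sample_steps, expected_step)
```

The SCH bursts sit at positions 1, 11 and 21 of the multiframe, 10 frames apart, and the measured sample distances agree with the expected 50000 to within one position.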
Let’s execute step3.m!
>> step3
Output:
rx_burst =
Columns 1 through 13
0 0 0 1 0 1 0 1 1 0 0 1 1
Columns 14 through 26
0 1 1 1 1 0 1 0 0 1 0 0 1
Columns 27 through 39
0 0 0 0 0 1 1 0 1 0 0 0 1
Columns 40 through 52
1 0 0 0 0 1 1 0 1 1 0 0 1
Columns 53 through 65
1 1 0 0 1 0 0 0 1 0 0 1 0
Columns 66 through 78
0 1 0 1 1 1 0 0 0 0 1 0 0
Columns 79 through 91
0 1 0 0 1 0 1 1 1 1 1 1 0
Columns 92 through 104
0 1 1 0 0 1 0 1 1 0 1 0 1
Columns 105 through 117
1 0 0 0 1 1 1 0 0 1 0 0 0
Columns 118 through 130
0 0 1 0 0 1 0 1 0 1 1 1 0
Columns 131 through 143
1 1 0 1 0 0 1 1 1 0 0 0 1
Columns 144 through 148
0 1 0 0 0
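The 148 bits of rx_burst follow the GSM normal burst layout: 3 tail bits, 57 data bits, 1 stealing flag, 26 training sequence bits, 1 stealing flag, 57 data bits and 3 tail bits. The Python sketch below splits the burst into these fields. The bit string is the step3 output written in column order 1 to 148, the field table is the standard layout from GSM 05.02 rather than something taken from step3.m, and all names are chosen for this example.

```python
# Bits of rx_burst in column order 1..148, grouped as in the MatLab display.
RX_BURST = (
    "0001010110011" "0111101001001" "0000011010001" "1000011011001"
    "1100100010010" "0101110000100" "0100101111110" "0110010110101"
    "1000111001000" "0010010101110" "1101001110001" "01000"
)

# Training sequence code 0, which matches BCC = 0 reported by step2.
TSC0 = "00100101110000100010010111"

FIELDS = [("tail", 3), ("data", 57), ("flag", 1), ("training", 26),
          ("flag", 1), ("data", 57), ("tail", 3)]


def split_burst(bits):
    """Cut a 148-bit normal burst into its named fields."""
    parts, pos = [], 0
    for name, length in FIELDS:
        parts.append((name, bits[pos:pos + length]))
        pos += length
    return parts


parts = split_burst(RX_BURST)
training = parts[3][1]

for name, bits in parts:
    print(f"{name:9s}{bits}")
print("training sequence matches TSC 0:", training == TSC0)
```

Finding the expected training sequence in the middle of the burst is a quick sanity check that the burst was located and demodulated correctly.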
This is the output of 1 burst. A BCCH or a CCCH message spans 4 bursts. Press space 3
times to process the other bursts. We can calculate from the frame number (FN) if
it is a BCCH or a CCCH. The next output shows the final message:
Checksum correct!
frame_number_mod_51 =5
Channel type = BCCH
message =
1 0 0 1 0 0 1 0
0 1 1 0 0 0 0 0
1 1 0 1 1 0 0 0
1 1 0 1 1 1 0 0
1 0 1 1 0 1 0 0
0 1 0 0 1 1 1 0
0 1 0 0 1 1 1 1
0 0 0 0 0 1 0 0
0 1 0 1 1 1 0 0
0 0 0 1 0 1 0 1
0 0 0 0 1 0 1 1
1 1 0 0 0 0 0 0
0 0 1 1 1 1 0 0
0 0 1 0 1 0 1 0
1 0 1 0 0 1 1 0
0 1 1 0 0 0 0 0
1 0 0 1 1 1 0 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
0 0 0 0 0 1 0 0
0 1 0 0 0 0 1 0
1 1 0 1 0 0 0 0
Each row shows bit 0 to bit 7 from left to right. We see 23 rows of 8 bits.
This is a System Information Type 3 message. See 3GPP standard 44.018. It tells us that
the country is 272 (Ireland) and the network operator is 02 (“O2 / Digifone mmO2”) and
many more details.
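To see where the country and operator codes come from, the rows can be turned into octets (lowest bit first) and the first fields decoded. The Python sketch below does this for the first eight rows of the message, written out as full 8-bit rows. It assumes the standard System Information Type 3 layout, in which octets 6 to 8 hold the MCC and MNC as BCD digits; the helper names are invented for this example.

```python
# First eight rows of the message, bit 0 on the left.
ROWS = [
    "10010010", "01100000", "11011000", "11011100",
    "10110100", "01001110", "01001111", "00000100",
]


def octet(row):
    """Convert a row of 8 bits (lowest bit first) into an integer."""
    return sum(int(bit) << i for i, bit in enumerate(row))


def bcd_digits(value):
    """Return (low nibble, high nibble) of an octet."""
    return value & 0x0F, value >> 4


octets = [octet(r) for r in ROWS]
l2_pseudo_length = octets[0] >> 2
message_type = octets[2]

mcc1, mcc2 = bcd_digits(octets[5])
mcc3, mnc3 = bcd_digits(octets[6])
mnc1, mnc2 = bcd_digits(octets[7])
mcc = f"{mcc1}{mcc2}{mcc3}"
mnc = f"{mnc1}{mnc2}" + ("" if mnc3 == 0xF else f"{mnc3}")

print([f"{o:02x}" for o in octets])
print("L2 pseudo length:", l2_pseudo_length)
print("message type    :", hex(message_type))
print("MCC / MNC       :", mcc, "/", mnc)
```

The message type 0x1b is System Information Type 3, and the decoded MCC and MNC are the 272 and 02 mentioned above.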
Frame number 5 means step3 analyzed frames 2, 3, 4 and 5 of the 51-multiframe.
Press space 5 more times to see 4 CCCH bursts and the full CCCH message.
Checksum correct!
frame_number_mod_51 =9
Channel type = CCCH
message =
1 0 0 0 0 0
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
Frames 6-9 in the 51-multiframe carry a PAGCH (Paging and Access Grant Channel)
message. This is an empty paging fill message (see 3GPP 04.06:5.4.2.3).
Step3.m skips over the FCCH and SCH and shows us another PAGCH message after
pressing space 5 more times:
Checksum correct!
frame_number_mod_51 =15
Channel type = CCCH
message =
1 0 0 0 0 0 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
1 1 0 1 0 1 0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
This message is from TS0 bursts 12 – 15 of the 51-multiframe. It’s again an empty paging
message.
Press space 5 more times for the message in bursts 16 – 19:
Checksum correct!
frame_number_mod_51 =19
Channel type = CCCH
message =
1 0 1 0 1 0 0
0 1 1 0 0 0 0
1 0 0 0 0 1 0
0 0 0 0 0 0 0
1 0 0 0 0 0 0
0 0 0 0 0 0 0
1 1 0 1 0 1 0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
(see 3GPP 04.07:11.2.3.1.1 for more information.)
1st octet: L2 Pseudo Length octet: the first two bits are reserved (10). 101000 = 5 bytes of
layer 3 data follow. (Remember that the bits are in reverse order, the lowest bit first).
2nd octet: 0 1 1 0 means it’s a radio resource message. The next 4 bits are the skip indicator (0 0 0 0).
3rd octet: Message type (0x21) = Paging Request Type 1
4th octet: Page mode = 0 (normal), channel needed = 0
5th octet: Length of the mobile identity = 1
6th octet: Mobile identity = 0 (no identity)
The other rows are filled with ‘fill bits’.
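The header fields of this paging message can be pulled out programmatically. The Python sketch below converts the 23 rows (bit 0 on the left) into octets, reads the L2 pseudo length, protocol discriminator and message type, and checks that everything after the announced layer 3 data is the fill octet 0x2b. The rows are written out as full 8-bit rows and the variable names are chosen for this example.

```python
# The 23 rows of the message; rows 7 to 23 are identical fill rows.
ROWS = ["10101000", "01100000", "10000100", "00000000",
        "10000000", "00000000"] + ["11010100"] * 17

FILL = 0x2B


def octet(row):
    """Convert a row of 8 bits (lowest bit first) into an integer."""
    return sum(int(bit) << i for i, bit in enumerate(row))


octets = [octet(r) for r in ROWS]

l2_pseudo_length = octets[0] >> 2           # number of layer 3 octets that follow
protocol = octets[1] & 0x0F                 # 6 = radio resource management
skip_indicator = octets[1] >> 4
message_type = octets[2]                    # 0x21 = Paging Request Type 1

payload = octets[1:1 + l2_pseudo_length]
rest = octets[1 + l2_pseudo_length:]
only_fill = all(o == FILL for o in rest)

print("L2 pseudo length:", l2_pseudo_length)
print("protocol        :", protocol, "skip indicator:", skip_indicator)
print("message type    :", hex(message_type))
print("payload         :", [f"{o:02x}" for o in payload])
print("fill octets     :", len(rest), "all 0x2b:", only_fill)
```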
Press space 3 more times and the script hits the end of Robert's dump file.
That’s it. Edit the start of step1.m to process the other two dump files!
Regards,
The GSMSP Team
http://scratchpad.wikia.com/wiki/Gsm