Exploring Community Simulation Methods

advertisement
1
Exploring Community Simulation Methods
John Stevens
Abstract
Analysing a complete social network of a community provides a large amount of information
about that community. As collecting complete network information on all bar micro
communities is virtually imposable computer modelling techniques are used to model
communities. The community features in simulations I chose to investigate included the
physical geography of a community, the passing of time in the model, the degree of links
when compared with the real world, the representation of the strength and type of those
links in the community and the enabling of different characteristics of layers within a
community. The modelling methods I investigated where probability modelling, several types
of random modelling, multi level modelling and agent based modelling. With this
investigation I confirm that agent based modelling is the best at modelling communities.
1. Introduction
This paper explores several computer based simulation methods. Mapping a complete
social network would facilitate a range of complex analyses, but in practice, the disinterest
of some people, and the challenges of contacting all members, even from a small network,
to capture the entire set of relations between members of that network, poses challenges
making such a complete mapping almost impossible. This paper explores the potential
simulation of complete networks by using several simulations generated in C+. The methods
I explore in the rest of the paper are probability modelling, random modelling, multi level
modelling and agent based modelling
2. Literature Review
There are many problems to be faced when collecting complete social network data. Mainly
I discovered the lack of interest from individuals willing to participate in completing
questionnaires for various reasons. In spite of my range of efforts (described in more detail
in Stevens 2008), the level of non-response created problems for my analysis. For practical
and cost reasons, completed network studies had to be focused on small populations. But
one solution that helped to alleviate this problem was to simulate the network data instead
of collecting real data. This then gave the obvious problem that the network data is not truly
representative of the social situation that real data would present. However, this approach
did have the advantage that as long as network data was created using a simulation program
that accounts for parameters from the social situation and that it is modelling. There are
several methods for simulating and modelling social networks. These range from historic
methods such as Marcov probability modelling (1931) and differing types of random
modelling described in Holland and Leinhardt (1970) and Holland and Leinhardt (1979). The
more recent methods such as multi level modelling detailed by Snijders (2001) and agent
base modelling that was used in Doran and Gilbert (1995) are discussed further in the
sections below.
2
2.1 Probability modelling
One of the first methods proposed for modelling networks was to use the mathematical
technique of probability. Andrei Markov proposed this in his seminal paper on Markov
chains (1931). This is a discrete time stochastic process with the Markov property. In such a
process, the past is irrelevant for predicting the future, given knowledge of the present. This
technique was then later modified by Markov, to provide a continuous feedback loop by
taking the current result of the Markov equation to be an input into the equation that is to
be known as the continuous time of Markov chains. Markov random graphs and the
evolution of statistical models for social networks are discussed in the papers of Frank and
Strauss (1986).
2.2 Random modelling
There are three main forms of random network models. The first is described by Holland
and Leinhardt (1970) as P*, P1, P2, and later found described in Holland and Leinhardt
(1979), as types of exponential random graphs. This is further developed to use the Logit
regression by Wasserman and Pattison (1996). The random P* model was implemented
unwittingly by Erdos-Renye, (1895), Watts-Strogatz, (1998) and Kleinberg (1999). In each of
these models, a point represents a random individual and then connections are distributed
at random. Random network models serve as a base from which we can build more
sophisticated models of existing social networks. In these models, points correspond to
persons, and edges then simulate social links. None of the random models adequately
account for the dynamic changes that occur in social networks. For these models, a given
number of points result in a fixed structure. Social links can begin to fade or even strengthen
over time, and may be reinforced through repeated social contact, but I found that only the
Kleinberg model can account for the geography. Nevertheless, the Kleinberg approach
embeds individuals within a two-dimensional space. While assuming that any individual is
more likely to have more friends living nearby, the uniform lattice Kleinberg structure does
not reflect the less than uniform distributions of people into houses, streets, villages or city
neighbourhoods, and countries.
The random network models make for the assumption that all links are as equally strong,
while the links in social networks can vary in intensity. For example, family ties often differ
markedly from ties to co-workers. None of the models has link-strength or type. Similarly
to all links being equal, all points are equal. There is no provision for age, gender, job type,
and so forth, in these models. But a real social network has many various "layers", each with
a different distance function and the points having links of varying degrees of closeness
within each. While the Kleinberg model might adequately simulate neighbourhood level
links, it cannot simultaneously account for links at the workplace or based on other
organizational affiliations. For simplicity, let us consider only housing and work-based links. If
we think of each individual as embedded within two distinct planes (that is, where you work
and where you live) and each individual as a person making connections to other points in
each plane, we will have a two-dimensional Kleinberg-type model (without the far links). But
what we lack is a means to explain the connections between work and home locations. P*
networks have been used by Robins, Pattison and Woolcock (2004) to investigate missing
data and also by Albert, R, and Barabási, (2002) to investigate the mechanics of complex
networks.
None of the random models I looked into could account for any of these fundamental
properties of social networks without significant modification. Of the three, the Kleinberg
3
model, offers the most relevant starting point. Yet, while it is possible that we can model
the small-world properties of a social network accurately without explicitly modelling all the
points, we will not know whether this is the case without more complex socially aware
models. P1 modelling deals with Statistical models for triadic structure, transitivity/closure,
intransitivity, openness and triadic local structures in the networks, by using the triad census
different forces, such as transitivity, closure, and openness are considered. I look at how
these different triadic forces can be tested as proposed by Holland and Leinhardt (1981).
With P2 modelling Fienberg and Wasserman explore log linear models for analyzing dyadic
independence in a network. They explore how these models also are able to incorporate
characteristics of the actor and dyadic covariates (such as other ties). This is described in
Fienberg and Wasserman (1985).
2.3 Multi level modelling
An alternate way of modelling networks is by representing several differing networks in
layers. Multi level modelling is also called stochastic in the literature. Multi level modelling
is implemented in a computer program called Siena, and detailed in Snijders (2001). Siena is
very good at dealing with a network’s nodes’ facets, but one limitation of Siena is it has no
representation of the networks environment. More recently Snijders & Baerveldt. C (2003)
investigated a multilevel network study of the effects of delinquent behaviour on friendship
evolution. However, Nosh, in his book, Theories of Communication Networks (2003),
details of his research interests including applications of systems theories of complexity to
communication, the role of emergent networks within and between organizations, and
collaboration technologies in the workplace multilevel modelling.
2.4 Agent based modelling
Others have developed agent-based modelling packages, such as “Eos”, “Life” and “Swarm”.
Swarm simulates the behaviour of insect colonies by representing each insect in its own
right and with its own set of rules for how it should function in the environment or in this
situation, a colony. One advantage of this method of simulation over multi level modelling is
that it adds features of natural surroundings or built environment. Gilbert and Doran
pioneered this simulation method with the EOS project. Their initial work is summarized in
Gilbert and Doran (1995). This area is further described in Gilbert and Troitzsch (1999),
which describes computer simulation of societies in general, its uses and its methods.
Methods described include queuing models, multilevel models and multi agent models.
Examples of multi agent models are the EOS project from the University of Essex (Gilbert
and Doran, 1995) and Swarm. A more detailed description on the “tools and techniques” of
simulation can be found in Suleiman R, Troitzsch and Gilbert, (2000). This method of multi
agent modelling has the advantage of accounting for individual agency in the formation and
continuation of links to other individuals in the community, but it does not represent a
community as a network. My own implementation reported in the later part of chapter 7,
deals with the protocols for agent interaction, collaboration, communication and languages.
3. Research Question and Simulation Judging Criteria
The specific research question I wish to answer in this chapter is:

Which is the best method of computer modelling to use when modelling
communities such as random modelling, multi-level or agent based modelling,
4
The criteria used to judge the performance of computer simulation program at producing a
model of a community is how ell thee simulation program fulfils the following requirements:

Representing the physical geography of a community.

Representing the passing of time

Representing the degree of links in the real world

Representing the strength of those links in the community

Representing the types of link within a community

Enabling the different characteristics of layers within a community
4. Methods
The six methods I describe in this section include four methods which all are can be loosely
described as random network modelling methods including Erdos-Renyi graphs, BarabásiAlbert graphs, Watts-Strogatz graphs and Kleinberg graphs. The fifth method I will
investigate is a multi-layered network modelling method which I developed and the final
method an agent based network modelling method.
4.1 Erdos-Renyi graphs
The first method of social network simulation I investigate is Erdos-Renyi graphs. A
description of which has been translated for the paper Erdos, P. & Renyi, A. (1959), although
this technique dates back several centuries. This method creates a graph with n points and
independently fills each of the possible n(n-1)/2 undirected edges with probability p = m / (n1). m is therefore the expected mean degree.
4.2 Barabási-Albert graphs
The next modelling method I examine is one developed by Barabási, Albert and is described
in Barabási, Albert-lászló, (2002). This method is also known as “Scale Free Networking”. I
take a totally disconnected graph of m points 1,...,m. We inductively build up the simple
graph by supposing we already have points 1,...,r-1. We then add point r and simultaneously
create m simple edges between r and points sampled (with repetitions re-sampled) from
1,...,r-1 such that the probability of choosing a point s is proportional to the degree of s (this
is the degree in the graph before adding any of the m links from r, i.e. we only update the
sampling distribution after finishing adding all m links from each point r). (In particular for
the point m+1 we create edges to all other existing points 1,...,m). We stop when we have
added the point r = n. The number of edges after adding the point r is m(r-m) so the mean
degree of the final graph is 2m(1 - m/n).
4.3 Watts-Strogatz graphs
The next modelling method I consider is one developed by Watts and Strogatz and is
described in Watts, D. & Strogatz S. (1998). I take points {0,1,...,n-1} arranged in a
clockwise fashion around a ring. Initially each point is connected with an undirected edge to
k points, the k/2 nearest to our point clockwise around the ring and k/2 nearest
5
anticlockwise (in particular k has to be even). I.e. move a fraction of these edges using the
following algorithm:

We start with the point 0 (the "base point") and the edge between 0 and the next
point, if around in a clockwise fashion, i.e. the edge (0, 1). With probability p we
replace ("rewire") this edge with the still undirected edge (0, q), where q is chosen
uniformly from the ring elements such that q!=0 and (0, q) was not an edge in the
graph before the rewiring; otherwise we leave the edge in place. We then consider
base point 1 and the edge (1,2), which we rewire with probability p as before. We
continue until all edges between points and nearest neighbours clockwise have been
considered. Then we return to base point 0 and the point clockwise from 0 (i.e. test
the edge (0, 2)) and proceed round the ring as before. This continues until all edges
have been considered, i.e. after k/2 passes around the ring. The final graph still
contains nk/2 edges, so the mean degree is k. The expected "mean close degree" is
k (1-p) (i.e. the expected number of not rewired edges from each point) and the
expected "mean far degree" is kp.
4.4 Kleinberg 2D graphs
The next modelling method I chose to look into introduces the concept of physical
geography into the model. This model was developed by Kleinberg 2D graph and is
described in Kleinberg, J. (1999). This method positions all nodes individually in an X by X
lattice not over lapping with each other. We take an nxn square lattice of points (so each
point is represented as (x, y) for 1 <= x,y <= n). Define the lattice distance d ((x, y), (u, v))
= |x-u| + |y-v|. From each (x, y) we create:

Directed vectors to all (u, v) with (u, v) !=(x, y) and d(((x,y),(u,v)) <= p, the "close"
vectors.

q directed vectors to (u,v) with d((x,y),(u,v)) > p. These are sampled independently
(with repetitions resample) from a distribution such that each (u,v) has probability
proportional to d((x,y),(u,v))^(-r), the "far" links.
For a point in the "centre" of the lattice (i.e. with x, y such that p < x, y < n-p) we get
2p(p+1) close vectors that are in fact reciprocated. The reason for the somewhat odd
degree distribution is that the points not in the centre have less close links as, unlike the
Watts-Strogatz model; we do not have a periodic boundary condition. The mean out
degree is approximately 2p(p+1) + q, the more points the better the approximation. The
"close degree" of each point (i.e. the number of close vectors) is approximately 2p(p+1) and
each "far degree" is q (i.e. the number of vectors per point which were chosen at random,
whether or not they actually happen to be far away in a lattice-distance sense).
4.5 Multi-Layered Network Modelling Method
This work represents the simulation of multiple tiers of lattices representing
acquaintance/friendship links. Each individual is represented by a point that has a home
location (position in the tier 1 lattice), a work location (tier 2), followed by social and family
location, so that the non-home locations are derived from the home location based on a
distribution.
6
4.6 Agent Based Network Modelling Method
I implement agent based computer simulation. Of the many problems to be faced when
attempting to collect complete social network data, the most difficult was a lack of interest
from individuals in the population being studied in completing the questionnaires. For
various reasons, in spite of my range of efforts (described in more detail in Stevens,2008),
the level of non-response created problems for this analysis. For practical and cost reasons,
my complete network studies had to be focused on small populations. I chose to simulate
the network data instead of collecting real data. This then showed the obvious problem that
the network data was not truly representative of the social situation that real data would
present. The method I implemented was a new agent-based simulation using parameters
taken from census reports detailing changes in population and in physical and social
parameters over the past 100 years. The simulation ran for a community with a population
of approximately 10,000 individuals, for a simulated period of 100 years. This approach
allowed all individuals within the network to behave independently of each other. This
method was chosen because it is agent based and represented the individual within the
model, rather than looking at the behaviour of a community as a whole. The simulation
program was written in C++ for processing speed. Dr. Mark Boddington wrote the
program to my specification. In this simulation, times are in days, distances in meters.
The Agent Based model represents overlapping community and social networks, made up of
links on behalf of family, relationships, work, and local housing, and cantered in differing
physical locations. The model has a bitmap (i.e., a 2d array, such as Housing Density [x][y])
determining how many people can live in each small "square". Initially I only made this as a
fixed "n" for squares within 2 miles. This was still flexible and allowed the simulation of a
cluster of villages, or a town surrounded by a cluster of villages, etc., without changing the
code. A second bitmap, which represented "workplace density", enabled people's jobs to
be distributed in a similar fashion within this map. A third bitmap represented family
locations. Originally I ignored the case of an agent moving to another location within the
7
area, but I did not exclude the possibility of taking this type of case into account later.
When a new person was to be created, I simply placed them randomly on the bitmap, each
square at XY having a probability of being picked proportional to housing density [x][y]
(with a check that I didn’t overcrowd any square). Over time, I expected that the population
density would match the housing density map. This bitmap was then expected to change
over time too, which should give lots of potential change.
All agents within the model had XY location co-ordinates that represented their housing
location. Links between agents are predicted to be made and also broken within the model
at different variable rates for males and females and for different types of association
between individuals. All links between individuals had variable strength from 0 to 10; each
link was one of the 5 types of links described in Table 1. I anticipated that the number and
types of links would be extremely unpredictable; I assumed that some individuals would
have retired and therefore not have any job links, and that a small number of individuals
would not have a family social network. I also assumed that all children and a large number
of adults do not have a relationship (e.g.: single parent). These social networks can change
noticeably and also very quickly over time, as events such as moving location, relationship
break-ups and changing jobs have been modelled. The link-strength is a variable whose
value stops increasing over time. However, new links are continuously made over time as
people’s lives take different courses. Table 1 shows how agents make or break links in the
simulation.
Type of Link
Making of Links
Breaking of Links

Housing social
network
Gradual making links. The chance of Virtually instant break
making an acquaintance is relative to when moving house
distance

Education and Job
social network
Fast making links

Friendship social
network
Gradual making friends.
Possibly Gradual breaking
other types of links become friendship friendships
links with time.

Family social
network
Make at birth permanent link but of Break at death
varying strength

Relationship
Gradual make over 2 years then
becoming the strongest type of link
with one strong link at a time!
Relationship links are age dependant
with most marriages in late 20s and
early 30s.
Virtually instant break
when moving job but
job links are over
longer distance than
other links
of
Instant break
Marriages
last
on
average 8 years with a
30% never splitting.
Table 1: Making and breaking of links within the social network
(Source: My own interpretation of computational agent’s types and roles)
8
Each model looped 100 years one step per day. Agents joined and left the model at set
average rates to represent individuals’ births and the deaths. These rates were varied over
the ‘years’ to simulate the effect of wars, increasing life expectancy and birth rate variation
such as occurred in the 1960's and 1990's. Agents representing individuals also joined and
left the model at set average rates to represent individuals who move into and out of an
area. Initially the model started with nobody having any friends; the model then allowed
the acquaintance-making code to build the network. Per day, the simulation model does the
following:

We create the required number of houses and jobs for a simulated year.

Each unfilled job is filled with a fixed probability. If we fill a job, a link-less newcomer
to the town moves in, moving into a house centred on the job location.

Each friendship is broken with a fixed probability as that person moves with fixed
probability.

Similarly, the person switches jobs with fixed probability.

The person makes new neighbours within a time, and random meetings occur.
5 Probability Modelling
As was previously discussed in the section on probability modelling, the most commonly
referenced type are Marcov chains (1971), which has been rejected by most, if not all
subjects of academia and apart from including this reference I shall move onto the next
section on random modelling.
6 Erdos-Renyi graphs
The first method of social network simulation I investigate is Erdos-Renyi graphs. The table
bellow shows the numerical output of the simulation computer program for simulations of
four different sizes ranging for 10000 to 150000 simulated individuals.
Graph size
10000
50000
100000
150000
Mean degree
49.9907
50.01
49.9873
49.9942
S.d of degrees
7.02816
7.06434
7.05878
7.07504
Min degree
25.6667
24.3333
22.3333
21
Max degree
77.3333
83.3333
83.6667
82.3333
Small-world
number
2.77005
3.03656
3.26262
3.41715
Mean
clustering c.
0.00500806
0.000989636
0.000508889
0.000329489
9
Table 2 - Erdos-Renyi simulation output
(Source: output of my own implementation of Erdos-Renyi Social network simulation
computer program averaged over 3 trials)
I next present in the following figure the resultant network representation of a community
from this simulation program.
Figure 1 - Erdos-Renyi simulation resultant network
(Source: http://en.wikipedia.org/wiki/ErdosRenyi_model)
As can be seen the resultant graph is unlike a geographic map. Next is a cumulatively graph
of the degree of a node (y axis) plotted against the number of nodes within that count (x
axis) in the resultant simulated network.
10
Figure 2 – Erdos-Renyi simulation output
(Source: output of my own implementation of Erdos-Renyi Social network simulation
computer program using the following parameters m=50.00, mean of 3 trial(s), 2m00 per
trial)
The following list summaries the Erdos-Renyi modelling technique in relation to using this
technique to model communities:

Geography
The Erdos-Renyi model dose not
represent the geography of a community

Time
Fixed structure.

Degree of links
Bell curve is ok

Link strength
No link strength

Link type
There is no provision for age, gender, job
type etc in the models

Layers
This model has no representation layers
Table 3 – Erdos-Renyi simulation summary
(Source: My own interpretation of Erdos-Renyi simulation models)
The Erdos-Renyi model was no good for modelling all the areas required to model a
community. For this reason I explore other types of random network modelling
7 Barabási-Albert graphs
11
The next modelling method I investigate was one developed by Barabási, Albert also none as
“Scale Free Networking”. The statistical output of the modelling program for the network
is given bellow for networks of four different sizes ranging from 10000 to 150000 simulated
individuals.
Graph size
10000
50000
100000
150000
Mean degree
49.875
49.975
49.9875
49.9917
S.d of degrees
51.4088
60.4992
63.9836
66.2624
Min degree
22
25
21.6667
25
Max degree
819.333
1949
2676
3621.33
Small-world
number
2.66601
2.89743
2.97433
3.02736
0.0211512
0.00600621
0.00356949
0.00276578
Mean clustering
c.
Table 4 – Barabási-Albert simulation output
(Source: output of my own implementation of Barabási-Albert social network simulation
computer program averaged over 3 trials)
I next present in the following figure the resultant network representation of a community
from this simulation program.
12
Figure 3 - Barabási-Albert simulation resultant network
(Source: http://en.wikipedia.org/wiki/Scale-free_network)
As can be seen the resultant graph is unlike a geographic map. Next is a cumulatively graph
of the degree of a node (y axis) plotted against the number of nodes within that count (x
axis) in the resultant simulated network.
Figure 4 – Barabási -Albert simulation output
(Source: output of my own implementation of Barabási-Albert social network simulation
computer program averaged over 3 trials)
The following list summarizes the Barabási-Albert modelling technique in relation to using
this technique to model communities:

Geography
The Barabási–Albert model dose not
represent the geography of a community

Time
Each random model results in a fixed
structure.

Degree of links
Unnatural

Link strength
None of the models has link strength.

Link type
There is no provision for age, gender, job
type etc in the random models

Layers
This model has no representation layers
Table 5 – Barabási-Albert simulation summary
(Source: My own interpretation of Barabási-Albert simulation models)
13
The Barabási-Albert model was bad for modelling all areas required to model a community.
For this reason chose to explore other types of random network modelling
8 Watts-Strogatz graphs
The next modelling method I investigate was one developed by Watts and Strogatz. The
statistical output of the modelling program for the network is given bellow for networks of
four different sizes ranging from 10000 to 150000 simulated individuals.
Graph size
10000
50000
100000
150000
Mean degree
50
50
50
50
S.d of degrees
4.82667
4.85487
4.84881
4.86355
Min degree
34
33
32
31.3333
Max degree
70
74
74
76.3333
Small-world
number
2.77781
3.05266
3.2898
3.44409
0.0146803
0.0110893
0.0104561
0.0107154
Mean clustering
c.
Table 6 – Watts-Strogatz simulation output
(Source: output of my own implementation of Watts-Strogatz Social network simulation
computer program averaged over 3 trials)
I next present in the following figure the resultant network representation of a community
from this simulation program.
14
Figure 5 - Watts-Strogatz simulation resultant network
(Source http://en.wikipedia.org/wiki/File:Network_Community_Structure.png)
As can be seen the resultant graph is unlike a geographic map but dose at least have a
physical structure but it can be said to be unnatural. Next is a cumulatively graph of the
degree of a node (y axis) plotted against the number of nodes within that count (x axis) in
the resultant simulated network.
15
Figure 6 – Watts-Strogatz simulation output
(Source: output of my own implementation of Watts-Strogatz Social network simulation
computer program averaged over 3 trials)
The following list summarizes the Watts-Strogatz modelling technique in relation to using
this technique to model communities:

Geography
The Watts-Strogatz model is ok but
unnatural.

Time
The entire random model results in a fixed
static structure.

Degree of links
Bell curve ok

Link strength
None of the random models has link
strength.

Link type
There is no provision for age, gender, job
type etc in the random models

Layers
This model has no representation layers
Table 7 – Watts-Strogatz simulation summary
(Source: My own interpretation of Watts-Strogatz simulation models)
The Watts-Strogatz model was an unnatural improvement for modelling community
geography but it was unsuitable for modelling all other areas required to model a
community. For this reason I choose to explore other types of random network modelling
16
9 Kleinberg 2D graphs
The next modelling method I chose to investigate, introduced the concept of physical
geography into the model. This model was developed by Kleinberg. The statistical output of
the modelling program for the network is given bellow for networks of four different sizes
ranging from 10000 to 150000 simulated individuals.
Graph size*
10000
49729
99856
149769
Mean
degree
(out)
49.8004
49.9104
49.9367
49.9483
S.d of
degrees
(out)
0.808801
0.547466
0.461076
0.417106
Min
degree
(out)
43
43
43
43
Max
degree
(out)
50
50
50
50
(Out)
smallworld number
2.89756
3.39194
3.59781
3.70002
Mean
(out)
clustering c.
0.0933704
0.0710919
0.0651955
0.0623456
Table 8 – Kleinberg simulation output
(Source: output of my own implementation of Kleinberg Social network simulation
computer program averaged over 3 trials)
* - Note the graph sizes differ slightly from those in the other trials. This is because the
Kleinberg model is based on a square lattice, so the number of points has to be the square
of an integer.
I next present in the following figure the resultant network representation of a community
from this simulation program.
17
Figure 7 - Kleinberg simulation resultant network
(Source: http://en.wikipedia.org)
As can be seen the resultant graph is unlike a geographic map but each cluster in the model
can represent a com unity or village. Next is a cumulatively graph of the degree of a node (y
axis) plotted against the number of nodes within that count (x axis) in the resultant
simulated network.
18
Figure 8 – Kleinberg simulation output
(Source: output of my own implementation of Kleinberg Social network simulation
computer program averaged over 3 trials)
The following list summarizes the Kleinberg modelling technique in relation to using this
technique to model communities:

Geography
Kleinberg is the best random model

Time
Results are static with a fixed structure.

Degree of links
Unnatural

Link strength
No link strength

Link type
There is no provision for age, gender, job
type etc in the random models

Layers
This model has no representation layers
Table 9 – Kleinberg simulation summary
(Source: My own interpretation of Kleinberg simulation models)
The Kleinberg model was the best for modelling geography in a community with a matrix
for a map but it was bad for modelling all other areas required to model a community.
10 Multi Layered Network Modelling
The next modelling method I investigate was the multi layered network model developed by
myself. The statistical output of the modelling program for the network is given bellow for
networks of four different sizes ranging from 10000 to 150000 simulated individuals.
19
Graph size
10000
50000
100000
150000
Mean degree
48.8404
50.0464
50.3359
50.5092
S.d of degrees
3.36403
2.96345
2.86358
2.80794
Min degree
35
35
35
35
Max degree
60
62
63
62
Small-world
number
2.97411
3.48605
3.66708
3.76461
Mean clustering
c.
0.171855
0.150188
0.142396
0.141109
Table 10 – Multi Layered simulation output
(Source: output of my own implementation of Multi Layered Social network simulation
computer program averaged over 3 trials)
Next, I present in the following figure the resultant network representation of a community
from this simulation program.
Figure 9 – Multi Layered simulation resultant network
(Source: Output of my own implementation of Multi Layered Social network simulation
computer program)
As can be seen the resultant graph is unlike a geographic map but dose at least have a
physical structure but it can be said to be unnatural. Next is a cumulatively graph of the
degree of a node (y axis) plotted against the number of nodes within that count (x axis) in
the resultant simulated network. The multi-layered model, in contrast, gives a more natural
distribution for each node (individual) with a average of 21.93 friends per person (standard
deviation= 3.31).
20
Figure 10 – Out degree distribution of Multi-Tired social network simulation
(Source: output of my own implementation of Multi-Tiered social network simulation
computer program averaged over 3 trials)
The following list summarizes the Multi-layered modelling technique in relation to using this
technique to model communities:

Geography
Multi-Tired is ok

Time
The model results in a fixed structure.

Degree of links
Bell Curve is ok

Link strength
None of the models has link strength

Link type
There is no provision for age, gender, job
type etc in the model

Layers
The Multi-Tired model is ok
Table 11 – Multi-layered simulation summary
(Source: My own interpretation of Multi Layered simulation models)
The multi-layered model has a better representation of a community than random models
of community with a limited representation of geography, a bell curve distribution of the
degree of links and a method of representing layers within the model. On the negative side
the multi-layered model has no method of representing time, link strength or link type in
the model.
11 Agent Based Network Modelling
21
The next modelling method I consider is the agent based model developed by myself. The
statistical output of the modelling program for the network is given bellow for networks of
four different sizes ranging from 10000 to 150000 simulated individuals.
Graph size
10000
50000
100000
150000
Mean degree
54.1893
54.3514
54.3872
54.4092
S.d of degrees
9.73575
9.51811
9.42792
9.41752
Min degree
10
9.66667
9.33333
9.33333
Max degree
91
93.6667
104.333
101.667
Small-world
number
2.74614
3.01872
3.22805
3.37854
0.0179433
0.0132505
0.012583
0.0123686
Mean clustering
c.
Table 12 – Agent Based simulation output
(Source: output of my own implementation of Agent Based Social network simulation
computer program averaged over 3 trials)
Next I present in the following figure the resultant network representation of a community
from this simulation program.
22
Figure 11 – Agent Based simulation resultant network
(Source: Output of my own implementation of Agent Based Social network simulation
computer program)
As can be seen the resultant graph is a geographic map but dose at least have a physical
structure but it can be said to be unnatural. Next is a cumulatively graph of the degree of a
node (y axis) plotted against the number of nodes within that count (x axis) in the resultant
simulated network.
23
Figure 11 – Out degree distribution of Agent Based social network simulation
(Source: output of my own implementation of Agent Based social network simulation
computer program averaged over 3 trials)
The following list summarizes the Agent-Based modelling technique in relation to using this
technique to model communities:

Geography
This model has a community map

Time
This model results in an incremental
structure.

Link strength
This model represents link strength

Link type
This model represents link type

Layers
This model represents multiple layers
Table 13 – Agent based simulation summary
(Source: My own interpretation of Agent Based simulation models)
Agent based simulation fulfils all of my requirements for the simulation of a community.
These are representing in the model as map of the community, the passing of time in the
model, the representation of link strength, link type and multiple layers in the model.
12 Summary
The applicability of all 6 models to social network analysis as applied to modelling
community is discussed below. We equate points within a graph with individuals within our
population and edges with social links. What follows is a brief discussion of the applicability
of the six models described above to modelling a social network. First I summarise the how
each of the models fulfil the com unit y features identified above
24

Geography -- Erdos-Renyi and Barabási-Albert are bad; Watts-Strogatz is OK but
unnatural, Kleinberg is better. The Kleinberg and Multi-Layered models would have
the individuals embedded within a 2D map and having a set of friends living close,
which is a good start. However Kleinberg has a uniform lattice structure, which is
unlike the natural aggregations of people into houses, streets, hamlets, villages,
towns, cities, and countries. The Agent based model dose represent a map of a
community.

Time -- A social network is a constantly evolving dynamic object, whereas for a given
number of points, each model results in a fixed structure. Social links wax and wane
e.g. fade with time or can be strengthened by repeated social contact. BarabásiAlbert is the only model which has any kind of evolution, but links are permanent
once made and the preferential attachment results in a scale-free degree distribution,
which is undesirable, for example it means that the degree distribution is highly
skewed to the right, with individuals of very high degree. The Agent based model
dose represent the passing of time.

Link strength -- Not all links are equal within a social network. Family ties are very
different to ties with people you are linked to by virtue of sharing an office. None of
the models has link-strength or type except the Agent based model which dose
represent link strength.

Link type - Individual properties -- similarly to all links being equal, all points are
equal. There is no provision for age, gender, job type etc in the models except the
Agent based model which dose represent link type.

Layers -- all models are bad. A real social network has various "layers", each with a
different distance function and points having close links within each. While Kleinberg
might model the housing links. The Agent based model which dose represent layers.

Table 14: Computer modelling summary by judging criteria
(Source: My own interpretation of simulation models fulfilling judging criteria)
Alternately these models can be summarized by type of model. The suitability of each of the
models of community I investigated is summarized in the following list

Marcov chains
No geography, No Time, No link type, No
link strength, No layers

Random Erdos-Renyi models
No geography, No Time, No link type, No
link strength, No layers

Barabasi-Albet random models
No geography, No Time, No link type, No
link strength, No layers

Watts-Strogatz random models
Very Limited geography, No Time, No link
type, No link strength, No layers

Kleinburrg random models
Grid geography, No Time, No link type,
No link strength, No layers
25

Multi layered models
Grid geography, No Time, No link type,
No link strength, Representation of layers

Agent based models
Stylized
map
of
the
geography,
Representation of time, Representation of
link type, Representation of link strength,
Representation of layers
Table 15 – Computer modelling summary by technique
(Source: My own interpretation of simulation models fulfilling judging criteria)
As can be seen from this table I surmise that agent based simulation although not perfect is
by far the best way of simulating a community.
Bibliography
Albert, R. & Barabási, (2002). A-l. “Statistical mechanics of complex networks”, Reviews of
Modern Physics 7J. P 47-97.
Doran, J. Gilbert. (1995). Simulating Societies, UCL press.
Erdos, P. & Renyi, A. (1959). On Random Graphs. Publications Mathematicae, 6, 290-297.
Fienberg and Wasserman (1985). Journal of the American Statistical Association, Vol. 80, No.
389 (Mar., 1985), pp. 51-67
Frank and Strauss (1986), Journal of the American Statistical Association, Vol. 81, No. 395 (Sep.,
1986), pp. 832-842
Gilbert, N. & Troitzsch K. (1999). Simulation for the Social Scientist, Open University Press.
Holland, P. Leinhardt S. (1970) a method for Detecting Structure in Sociometric data, The
American Journal of Sociology. Vol.3. (Nov 1970). pp 492-513.
Holland, P & Leinhardt, S. (1979). Structural Sociometry. Academic press.
Kleinberg, J. (1999). “The Small-World Phenomenon; An Algorithmic Perspective”, Cornell
Computer Science Technical Report 99-1776.
Nosh, Contractor. (2003). Theories of Communication Networks (co-authored with Peter
Monge) Oxford University Press.
Markov, A.A. (1931) Theory of algorithms/A.A. Markov; translated from Russian by Jacques
J. Schorr-Kon and IPST staff, Israel Program for scientific translations.
Robins, G., Pattison, P., & Woolcock, J. (2004). Missing data in networks: exponential
random graph (p*) models for networks with non-respondents. Social Networks, 26, 257283.
Snijders, T.A.B.(2001). The statistical evaluation of social network dynamics. M.E. Sobel &
M.P. Becker (eds.), Sociological Methodology-2001, 361-395. Boston and London: Basil.
26
Snijders, T. & Baerveldt. C (2003). A Multilevel Network Study of the Effects of Delinquent
Behaviour on Friendship Evolution. Taylor & Francis.
Stevens, J.K. (2008), Analysing the experience of the 1996 intake of postgraduates
acquaintances are made, annual (non residential) Graduate Conference, sociology
department, University of Essex.
Suleiman, R, Troitzsch R & Gilbert N, (2000). Tools and Techniques for Social Science
Simulation, Physica-Varlag.
Wasserman, S & Pattison, P (1996). Logit models and logistic regressions for social
networks. Psychometrika.
Watts, D. & S. Strogatz, (1998). "Collective dynamics of small-world networks", Nature
393V.
Download