
International Journal of Engineering Trends and Technology (IJETT) – Volume 13 Number 6 – Jul 2014
Secrecy Conserving Sociable Network Communication against Mutual Friend Attacks and Structural Attacks
Gowrish K.G 1, M. Ashok Kumar 2, Dr. M. Giri 3
1 M.Tech Scholar, 2 Assistant Professor, 3 Professor & Head
1,2,3 Dept of CSE, SITAMS, Chittoor, A.P, India
Abstract:
Privacy protection in published social networks has become a critical concern for individual privacy in recent years. Many privacy-preserving works exist that deal with different attack models. This paper introduces a new attack model, referred to as the mutual friend attack, together with KnobInfo and ChainInfo attacks. Adversaries can easily re-identify a target in the network even if identifiers are replaced by randomized integers. In a mutual friend attack, the adversary can re-identify a pair of friends by using their number of mutual friends. To address this issue, a new anonymity notion called k-NMF anonymity is proposed. The algorithm that achieves k-NMF anonymity conserves the original vertex set, in that it allows the addition but no deletion of vertices. For KnobInfo and ChainInfo attacks, k-isomorphism anonymity is necessary for protection. The anonymization is also required to retain the utility of the data. For defence against re-identification under any potential structural attack, the k-symmetry model modifies a naively anonymized network so that for any vertex in the network there exist at least k−1 structurally equivalent counterparts. Sampling methods are also used to extract approximate versions of the original network from the anonymized network, so that statistical properties of the original network can be evaluated.
Index Terms: Secrecy Conserving, Sociable Network Communication, Mutual Friend, Isomorphism, Structural Attack, Algorithms, Security.
I. INTRODUCTION
With the growth of mobile and Internet technology, more and more information is recorded by social network applications such as Facebook and Twitter. Published social network datasets should protect the privacy of individuals and groups. With increasing concerns about privacy, many works have been proposed for privacy-preserving social network publication. However, if an attacker can obtain the number of mutual friends between two connected vertices, he can still distinguish the pair (D, F) from other friend pairs, as only (D, F) has 2 mutual friends.
In most social networking sites, such as Facebook, Twitter, and LinkedIn, the adversary can easily get the number of mutual friends of two individuals linked by a relationship. As shown in Figure 1, one can directly see the mutual friend list shared with one of his friends on Facebook. The adversary can also get the friend lists of two individuals from Facebook, such as the friend list in Figure 1, and then get the number of mutual friends by intersecting their friend lists.
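The friend-list intersection that the introduction describes can be illustrated with a minimal Python sketch. The function name and the two sample friend lists are assumptions made for illustration only; they are not data from the paper.

```python
def mutual_friend_count(friends_a, friends_b):
    """Number of mutual friends, obtained by intersecting two friend lists."""
    return len(set(friends_a) & set(friends_b))


# Hypothetical friend lists, as an adversary might read them from two profiles.
alice_friends = ["Bob", "Carl", "Dell", "Ed", "Frank"]
bob_friends = ["Alice", "Carl", "Dell", "Ed", "Frank"]

print(mutual_friend_count(alice_friends, bob_friends))  # 4
```

The point of the sketch is only that no privileged access is needed: two publicly visible friend lists are enough to obtain the number of mutual friends of an edge.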
ISSN: 2231-5381
Figure 1 : Friend List on Facebook
http://www.ijettjournal.org
Page 247
A new relationship attack model, based on the number of mutual friends of two connected individuals, is introduced and referred to as the mutual friend attack. Figure 2 shows an example of the mutual friend attack. The original social network G with vertex identities is shown in Figure 2(b), and can be naively anonymized as the network G' shown in Figure 2(c) by removing all individuals' names. The number on each edge in G' represents the number of mutual friends of its two end vertices. Alice and Bob are friends, and their mutual friends are Carl, Dell, Ed and Frank, so the number of mutual friends of Alice and Bob is 4. After obtaining this information, the adversary can uniquely identify that the edge (D, E) is (Alice, Bob). Likewise, (Alice, Carl) can be uniquely re-identified in G'. By combining (Alice, Bob) and (Alice, Carl), the adversary can uniquely identify the individuals Alice, Bob and Carl. This simple example illustrates that the adversary can identify an edge between two individuals, and possibly the individuals themselves, once he knows the number of mutual friends of individuals. The mutual friend number of two nodes is not considered if they are not connected. The number of mutual friends of two nodes connected by an edge e is referred to as the number of mutual friends (NMF) of e.
To protect the privacy of relationships from the mutual friend attack, a new privacy-preserving model, k-anonymity on the number of mutual friends (k-NMF anonymity), is introduced. For each edge e, there are at least k−1 other edges with the same number of mutual friends as e. This guarantees that the probability of an edge being identified is not greater than 1/k. The original vertex set is preserved in the sense that the addition but no deletion of vertices is allowed. The results on real datasets show that the approaches preserve much of the utility of social networks while protecting against mutual friend attacks.
Figure 2: Mutual Friend in a Sociable Network
A. Protection From Attacks In K-Isomorphism
Many real-world social networks contain sensitive information, which raises critical privacy concerns about graph data. To understand the kinds of attack, it is necessary to study application data from social networks: how a social network is translated into a data graph, what kind of sensitive information may be at risk, and how an adversary may attack individual privacy.
Privacy protection is about the protection of sensitive information. An examination of real datasets identifies two main types of sensitive information that a user may want to keep private and that may be under attack in a social network.
1. KnobInfo:
The first type, KnobInfo, is information attached to a vertex. For example, the emails sent by an individual in the Enron dataset can be highly sensitive, since some of the emails were written only for private recipients and should not be linked to any individual. It is assumed that any identifying information such as names is first removed from KnobInfo, so that the content of KnobInfo does not help the identification of its owner.
2. ChainInfo:
The second type, ChainInfo, is the information about the relationships among individuals, which may also be considered sensitive. An adversary may target two different individuals in the network and try to find out whether they are connected by some path.
It is desirable to provide sufficient protection for both KnobInfo and ChainInfo. Note that the linkage of an individual to a node in the published graph does not by itself disclose any sensitive information for the KnobInfo target, because if the publication of the KnobInfo is separated from that node, attacks of the first type are not possible.
Example: In Figure 2(a), assume that the identity of the center of the 7-star in G is X. Then X has 7 1-neighbors in G. The 1-neighborhood subgraph of X is shown in Figure 2(b). Since the identities of all vertices in G are hidden, an adversary does not know which vertex in G is X. If the adversary knows the 1-neighborhood of X, then the vertex of X in G will be identified. In general, an adversary may have partial information about the neighborhood of a vertex, such as a neighborhood of X as shown in Figure 2(c) or Figure 2(d), because these subgraphs may represent some small groups whose information can be gathered by adversaries. Such information also leads to the re-identification of X.
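The 7-star example can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the paper's attack procedure: the helper names and the toy graph are hypothetical, and the comparison uses a coarse invariant of the 1-neighborhood (degree and number of edges among the neighbors) instead of a full subgraph isomorphism test.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def neighborhood_signature(adj, v):
    """(degree of v, number of edges among the neighbors of v)."""
    nbrs = adj[v]
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return len(nbrs), links


def candidates(adj, signature):
    """Vertices of the anonymized graph whose 1-neighborhood matches."""
    return sorted(v for v in adj if neighborhood_signature(adj, v) == signature)


# Toy anonymized graph: vertex 0 is the center of a 7-star, plus a short tail.
edges = [(0, i) for i in range(1, 8)] + [(1, 8), (8, 9)]
adj = build_adjacency(edges)

print(candidates(adj, (7, 0)))  # [0]
print(candidates(adj, (2, 0)))  # [1, 8]
```

With the knowledge "7 neighbors, none of them connected to each other" only one candidate remains, so the target is re-identified; with weaker knowledge two candidates remain and the target is hidden among them.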
Figure 3: Neighborhood Subgraphs as NAGs
The subgraph information of an adversary is referred to as an NAG (Neighborhood Attack Graph). No limitation is placed on the NAG, so it can be any subgraph of G up to the entire given graph G. Note that in an NAG, one vertex is always marked (shaded in Figures 3(b), (c), (d)) as the vertex under attack.
Definition (NAG). The information possessed by the adversary relevant to a target individual A is a pair (Ga, v), where Ga is a connected graph and v is a vertex in Ga that belongs to A. (Ga, v) is called the NAG targeting A. Ga is also referred to as the NAG.
B. K-Symmetric Model
One of the fundamental issues when releasing social network data is avoiding disclosure of individuals' sensitive information while still permitting certain analysis on the network. A straightforward approach to achieve this objective is naïve anonymization, which replaces all identifiers of individuals with randomized integers so that adversaries cannot directly locate each individual just according to his identifier.
Adversaries having certain structural knowledge about an individual can still identify that individual from the naively anonymized network, provided that the candidate vertex matching the knowledge is unique. For instance, as shown in Figure 4, if Bob has 2 neighbors with degree 1, then even when all identifiers are removed, Bob can still be identified. Such identity disclosure based on structural knowledge of vertices is formalized as Structural Re-identification (SR).
Figure 4: Illustration of Sociable network G and its naively anonymized version Ga.
The k-symmetry model is proposed to achieve this requirement. The general idea is to modify the network so that for each vertex v, there exist at least k−1 other vertices each of which serves as the image of v under some automorphism of the modified network. The network remains invariant under the action of an automorphism. For instance, in Figure 4(b), if vertices 1 and 3 are exchanged while all other vertices are fixed, the vertex adjacency relationships of the network are conserved and therefore this permutation is an automorphism. Any structural knowledge characterizing vertex 1 also characterizes vertex 3, and therefore they cannot be distinguished from each other by any structural knowledge.
II. K-NMF ANONYMIZATION ON THE NUMBER OF MUTUAL FRIENDS
Property (scale-free distribution of NMFs). The NMFs of edges in a large social network often have a scale-free distribution, which means that the distribution follows a power law, at least asymptotically.
Hence, only a small number of edges have a high NMF. These edges are anonymized first, and many edges with low NMF do not need to be processed.
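The k-NMF condition can be checked directly from an adjacency structure. The following Python sketch is a minimal illustration, assuming an undirected graph stored as a dictionary of neighbor sets; the function names and the toy graph are hypothetical and do not come from the paper.

```python
from collections import Counter


def edge_nmf(adj):
    """Map each undirected edge (u, v), u < v, to its number of mutual friends."""
    return {(u, v): len(adj[u] & adj[v]) for u in adj for v in adj[u] if u < v}


def violating_edges(adj, k):
    """Edges whose NMF value is shared by fewer than k edges in total."""
    nmf = edge_nmf(adj)
    group_size = Counter(nmf.values())
    return sorted(e for e, f in nmf.items() if group_size[f] < k)


# Toy graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

print(edge_nmf(adj))            # {(1, 2): 1, (1, 3): 1, (2, 3): 1, (3, 4): 0}
print(violating_edges(adj, 2))  # [(3, 4)]
```

An empty result from `violating_edges` means every edge shares its NMF with at least k−1 other edges, which is the k-NMF condition stated above. In a graph with a scale-free NMF distribution, the violating edges are expected to be concentrated among the few edges with high NMF.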
A. Algorithm ADD
The original graph is G(V, E) and the gradually anonymized graph is G'(V', E'). The NMF sequence f is sorted in descending order and the corresponding edge list l is constructed as described. All edges are marked as "unanonymized", and the edges are then anonymized one by one. Iteratively, a new group GP is started with the group NMF, gf, equal to the NMF of the first unanonymized edge in l. The edges with NMF equal to gf are selected and marked as "anonymized". Then the first unanonymized edge in l is selected and anonymized by adding edges to increase its NMF to gf. After anonymizing this edge, it is marked as "anonymized" and put into GP. Adding new edges affects the NMF of some other edges, and these new edges are added into f and l. The sequences f and l are re-sorted after each edge is anonymized. Algorithm 1 shows the detailed description of the ADD algorithm.
Algorithm 1 The ADD Algorithm (GreedyGroup)
Input: Original graph G(V,E), k
Output: k-NMF anonymized graph G'(V',E')
Initialization: G' = G, and mark all edges as "unanonymized". Compute and sort the sequences f and l. f(u) = f, l(u) = l, Gf = Ø;
1: while l(u) ≠ Ø do
2: if |l(u)| < 2k then do cleanup-operation and break.
3: GP = {e | e ∈ l(u) and fe(u) = f1(u)}; gf = f1(u); Gf = Gf ∪ gf.
4: Mark any e ∈ GP as "anonymized"; update f(u) and l(u).
5: while |GP| < k or (|GP| ≥ k and Cmerge ≤ Cnew) do
6: Anonymize l1(u) by BFSEA. GP = GP ∪ l1(u), update l(u) and f(u).
7: end while
8: end while
9: return G'(V',E').
B. BFS-Based Edge Anonymization
There are three challenges in increasing the NMF of an edge via adding edges. First, the added edge should not affect the NMF of already anonymized edges. Second, the added edge should minimize the effect on the utility of the graph. Third, the NMF of the newly added edges should not disrupt the current anonymization process, which progresses in descending order of the NMF value.
Before anonymizing an edge (u, v), the ADD algorithm has created some anonymized groups and obtained a set Gf containing the group NMFs of these groups. Let gf be the NMF of the current group GP, which is then put into Gf. Anonymizing the edge (u, v) means increasing the NMF of (u, v) to the current group NMF gf, i.e. creating some new triangles containing this edge. To do so, some candidate vertices are found and new edges are added to create new triangles. Considering the utility of the graph, the candidate vertices are found based on Breadth First Search (BFS).
From the nodes u and v, BFS-based Edge Anonymization (BFSEA) traverses the graph in a breadth-first manner. For the i-hop neighbors of u and v, represented by neigi(u) and neigi(v), edge anonymization finds the candidate vertices from neigi(u) ∪ neigi(v) and iteratively links the best one with u or v to create a new triangle. The NMF of the edge (u, v) is denoted nmf(u, v).
C. Algorithm ADD and DEL
This algorithm, shown in Algorithm 2, anonymizes the graph by edge addition and deletion. Similar to the ADD algorithm, ADD&DEL checks the number of unanonymized edges with NMF equal to the NMF of the first unanonymized edge in the sorted sequence l(u). If there are at least k such edges, they are put into this group and another group is started. Otherwise, edges are anonymized to form this group. To gradually anonymize edges and create this group, the group NMF, gf, is initially set to the mean value of the NMFs of the first k unanonymized edges. All initial information is recorded before anonymizing this group. An unanonymized edge with NMF greater than gf is anonymized by edge deletion. If this edge cannot be anonymized successfully, gf is set to gf + 1 and all initial information is rolled back. An unanonymized edge with NMF less than gf is anonymized by the ADD algorithm. Unanonymized edges in the sorted sequence l(u) are anonymized gradually until this group has k edges, and then another group is started.
Algorithm 2 The ADD&DEL Algorithm
Input: Original graph G(V,E), k
Output: k-NMF anonymized graph G'(V',E')
Initialization: G' = G, and mark all edges as "unanonymized". Compute and sort the sequences f and l. f(u) = f, l(u) = l, Gf = Ø;
1: while l(u) ≠ Ø do
2: if |l(u)| < 2k then do cleanup-operation and break.
3: EE = {e | e ∈ l(u) and fe(u) = f1(u)};
4: if |EE| ≥ k, then new group GP = EE, and mark any e ∈ GP as anonymized, Gf = Gf ∪ f1(u), update l(u) and f(u) and continue.
5: GP = Ø, gf = round(mean(f1(u), …, fk(u))). Record all initial info.
6: while f1(u) ≥ gf do
7: Anonymize l1(u) by edge-deletion.
8: if anonymize failed, then roll back to initial info, and gf = gf + 1; else mark l1(u) as anonymized and GP = GP ∪ l1(u); update f(u) and l(u).
9: end while
10: Gf = Gf ∪ gf.
11: while |GP| < k do
12: Anonymize l1(u) by BFSEA. GP = GP ∪ l1(u), update l(u) and f(u).
13: end while
14: end while
15: return G'(V',E').
In the ADD&DEL algorithm, an edge is anonymized by either edge deletion or the methods of the ADD algorithm. Therefore, the time complexity of anonymizing an edge is O(|V|^2), and the time complexity of the ADD&DEL algorithm is O(|E||V|^2).
D. K1-Degree Anonymization Based On K2-NMF Anonymization
The KDA algorithm anonymizes the k2-NMF anonymized graph G' to satisfy k1-degree anonymity. To maintain the k2-NMF anonymity of G', the KDA algorithm does not change the NMF of edges in G' when performing anonymization.
E. Evaluating The KDA Algorithm
Since no new triangles are formed after the KDA algorithm adds new edges, the clustering coefficient decreases slightly as k increases, as shown in Figures 5(a), 6(a) and 7(a). The algorithm performs better than the classic k-degree anonymization on this measure. Since new edges are added into the graph, the APL (Average Path Length) value decreases slightly as k increases, as shown in Figures 5(b), 6(b) and 7(b). The classic k-degree anonymization performs a little better than the algorithm on the APL measure. However, when the APL of the graph is large, the algorithm can perform better than the classic k-degree anonymization, as shown in Figure 5(b). The results show that the algorithm performs well at preserving utility while protecting privacy by carefully exploring the graph property. The classic k-degree anonymization makes less effort on this beyond minimizing the number of edges added. Figures 5(c), 6(c) and 7(c) show the distributions of betweenness centrality of graphs anonymized by the KDA algorithm when kdeg is set to 10, 20 and 30. The distributions of the anonymized graphs are very similar to the distributions of the original graphs, especially for the ACM and Brightkite datasets. This shows that the KDA algorithm can preserve much of the utility of the graph anonymized by the k-NMF algorithms.
Figure 5: K-degree anonymization on 20-NMF anonymized graph of ACM
Figure 6: K-degree anonymization on 25-NMF anonymized graph of Cora
Figure 7: K-degree anonymization on 25-NMF anonymized graph of Brightkite
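Two of the utility measures used in this evaluation, the clustering coefficient and the APL, can be computed as in the following Python sketch. It is an illustrative implementation under stated assumptions, not the evaluation code behind the figures: the graph is assumed to be undirected, connected, with at least two vertices, and stored as a dictionary of neighbor sets; the function names and the toy graph are hypothetical.

```python
from collections import deque


def average_clustering(adj):
    """Mean local clustering coefficient; vertices of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs, via BFS."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Toy graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

print(round(average_clustering(adj), 4))   # 0.5833
print(round(average_path_length(adj), 4))  # 1.3333
```

Comparing these two values before and after anonymization gives a simple numeric view of the trade-off discussed above: added edges that close no triangle lower the clustering coefficient, and any added edge can only shorten paths.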
III. Robust K-Isomorphism and Its Security in K
Definition (K-SECURITY). Let G = (V, E) be a given graph with distinct KnobInfo I(v) for each vertex v ∈ V. Each vertex v ∈ V is linked to a unique individual U(v). Let Gk be the anonymized graph of G. Gk satisfies k-security, or Gk is k-secure, with respect to G if for any two target individuals A and B with corresponding NAGs GA and GB that are known by the adversary, the following two conditions hold:
1. (KnobInfo Security) the adversary cannot determine from Gk and GA (GB) that A (B) is linked to I(v) for any vertex v with a probability of more than 1/k.
2. (ChainInfo Security) the adversary cannot determine from Gk, GA and GB that A and B are linked by a path of a certain length with a probability of more than 1/k.
k-security is the main objective here, but there is another important objective, which is data utility. The published graph should keep the main characteristics of the original graph so that it remains useful for data analysis. Therefore the anonymization cost, a measure of the information loss due to the anonymization, must also be considered. In the proposed method, anonymization may involve edge additions and deletions. One possible measure of the anonymization cost is the edit distance between G and Gk, that is, the total number of edge additions and deletions.
Definition (PROBLEM DEFINITION). The problem of privacy protection in graph publication by k-security is defined as follows: given a network graph G = (V, E) with distinct KnobInfo I(v) for each v ∈ V, and a positive integer k, derive an anonymized graph Gk = (Vk, Ek) to be published, such that (1) Vk = V; (2) Gk is k-secure with respect to G; and (3) the anonymization from G to Gk has minimal anonymization cost. This problem is called k-Secure-PPNP (k-Secure Privacy Preserving Network Publication).
Theorem 1 (NP-HARDNESS). The problem of k-Secure Privacy Preserving Network Publication is NP-hard.
PROOF. The proof is by reduction from the NP-complete problem PARTITION INTO TRIANGLES.
The NP-hardness of k-Secure-PPNP continues to hold if the minimal anonymization cost requirement is replaced by minimum edit distance ED(G, Gk) in the problem definition.
PROOF. This follows by simply removing the condition on ||E(Gk)| − |E(G)|| in the proof of Theorem 1.
Although the problem is NP-hard, it is typically not possible to relax the privacy requirement. A new solution for the problem of privacy protection in a graph for k-security is therefore proposed here.
First assume that for the given graph G = {V, E}, |V| is a multiple of k.
Definition (ROBUST K-ISOMORPHISM). A graph G is k-isomorphic if G consists of k disjoint subgraphs g1, ..., gk, i.e. G = {g1, ..., gk}, where gi and gj are isomorphic for i ≠ j.
The solution is as follows. Given a graph G = {V, E}, derive a graph Gk = {Vk, Ek} such that Vk = V and Gk is k-isomorphic, that is, Gk = {g1, ..., gk} with pairwise isomorphic gi and gj, i ≠ j. Gk is the published graph. For each v ∈ V, KnobInfo I(v) is attached to v in the published graph.
Theorem 2 (SOUNDNESS). A k-isomorphic graph Gk = {g1, ..., gk} is k-secure.
PROOF. Since the graphs g1, ..., gk are pairwise isomorphic, for any NAG of an adversary for a target individual Alice, whenever the NAG is contained in any gi, there are at least k different vertices v1, ..., vk that can be mapped to Alice and they are not distinguishable. Hence KnobInfo security is guaranteed.
Suppose the adversary desires to attack the linkage of 2 individuals Alice and Bob. In the worst case, the adversary can find matching vertices for both Alice and Bob in one of the subgraphs gi. However, by robust k-isomorphism, the same is true for each subgraph. There are k different vertices a1, ..., ak that can be mapped to Alice, and k different vertices b1, ..., bk that can be mapped to Bob, where ai ∈ gi and bi ∈ gi, for 1 ≤ i ≤ k.
If a1 and b1 are linked by a path of length p in g1, then ai and bi are linked by a similar path in gi, for all i. The probability that Alice is the owner of a1 and Bob is the owner of b1 is 1/k × 1/k. The probability that Alice is linked to Bob by a path of length p is hence the probability that their vertices are in the same gi, which is k × 1/k × 1/k = 1/k. Therefore the condition for ChainInfo security holds and Gk is k-secure.
Figure 8: Robust K-Isomorphism And K-Security
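The structure required by robust k-isomorphism can be tested mechanically when a vertex correspondence between the k subgraphs is given. The Python sketch below is an illustration only: the function name, the row-based layout of the vertex mapping, and the toy data are assumptions, and the test checks isomorphism under the supplied mapping, not the existence of some isomorphism.

```python
def is_k_isomorphic(edges, vm):
    """Check that `edges` form k disjoint subgraphs that coincide under `vm`.

    vm is a list of rows; each row lists k corresponding vertices,
    one per subgraph (column c belongs to subgraph c).
    """
    k = len(vm[0])
    position = {v: (c, r) for r, row in enumerate(vm) for c, v in enumerate(row)}
    patterns = [set() for _ in range(k)]
    for u, v in edges:
        (cu, ru), (cv, rv) = position[u], position[v]
        if cu != cv:
            return False  # an edge crossing two subgraphs breaks disjointness
        patterns[cu].add(frozenset((ru, rv)))
    # every subgraph must contain exactly the same row-pair edges
    return all(p == patterns[0] for p in patterns)


# Hypothetical example with k = 2: subgraph 0 is {1, 3, 5}, subgraph 1 is {2, 4, 6}.
vm = [[1, 2], [3, 4], [5, 6]]
good = [(1, 3), (3, 5), (2, 4), (4, 6)]
bad = good + [(1, 5)]

print(is_k_isomorphic(good, vm))  # True
print(is_k_isomorphic(bad, vm))   # False
```

When the check succeeds, every vertex has k−1 structurally identical counterparts, one in each other subgraph, which is the situation the proof of Theorem 2 relies on when it derives the 1/k bound.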
Figure 8 shows a slightly more complex example of Chain
isomorphic function from gi to gj be hij . In this example,
Info attack on a 4-anonymous graph G. In fact G is also 4-
h12(1) = 2, and h21(2) = 1; h34(7) = 8, h24(22) = 24.
automorphic.
Algorithm 3 Baseline Graph Synthesis
The structural attacks of the opponent can be based on the
Input: A graph G and an integer k.
NAG’s Gb and Gc for two individuals Bob and Carol,
Output: An economized graph, Gk = {g1,….., gk}, of G.
respectively, Gband Gc happen to be identical (the shaded
VM: Vertex Mapping for g1….., gk.
vertex in Gb (Gc) corresponds to Bob(Carol)). In G, 4
1. ∀i, 1 ≤ i ≤ k:gi← Ο•; VM ← Ο•;
vertices {1; 2; 3; 4} have Matching community subgraphs
2. while Gis not empty
and any of these can be mapped to Bob or Carol. Although
3. select a graph g with kVD-embeddings b1, ...bkin G;
the opponent cannot pin-point the vertex for either target, it
4. for each embedding bi due to Line 3
obvious that Bob and Carol must be chained by a single
5. remove bifrom G;
edge. Similarly, if the opponent has an NAG Ga for Alice,
6. insert biinto gi;
and Gc for Carol, the opponent can confirm that there must
7. append the new vertex mappings to VM;
be a path of length 2 chaining Alice and Carol.
8. add/delete edges in each gifor pair wise Robust KIsomorphism;
A.Algorithm
The
previous
9. returnGk;
section
involves
the
generation
and
Algorithm 3 also creates a Vertex Mapping VM, which will
communicate of a graph Gk that consists of k isomorphic
be used in the final step of edge addition and deletion. VM
subgraphs, let us call these subgraphs i-graphs. Then
is a table with k columns, c1,….., ck, with ci for sub graph
consider how to arrive at the i-graphs from the given graph
gi, where VM[c, r] is the table entry at column c and row r.
G. Would preserve the set of vertices by partitioning the
Each tuple in the table corresponds to one possible vertex
graph of G into k subgraphs with the same number of
mapping so that the value for hij (VM[i, r]) = VM[j, r] for
vertices. Figure 9 shows an example where k is 4, so that
all1 ≤ i, j ≤ k, and i = j. The vertex mapping VM for the
given graph G is partitioned into 4 subgraphs g1; g2; g3; g4.
example in Figures 4 and 8 as shown in Figure 9. Here
vertex 5 = VM[1,2],vertex 7 = VM[3,2], h13(5) = 7.
Figure 10: Vertex Mapping VM For Gk= {g1,g2,g3,gk}
Algorithm 4 i-Graph Formation
Input: G = (V, E) (E = {e1,….., e |E|}), VM.
Figure 9: Given graph G and partitioning
Definition (SUBGRAPH ISOMORPHISM). Let G = (V,
Output: Gk = {g1,…., gk}.
E)and G′= (V′, E′) be two graphs. There exists a subgraph
CE stores the number of edges in E crossing 2 i-graphs in
isomorphism from G to G′ if G contains a subgraph that is
Gk.
isomorphic to G.
1. V (Gk) ← V ; E(Gk) ← Ø; CE ← 0;
After the partitioning, the subgraphs are augmented by edge
∀i, 1 ≤ i ≤ |E|: Add[i].cnt ← k; Add[i].VM ← Ø;
addition
Processed[i] ← False, Marked [i] ← False;
and
deletion
to
ensure
pair
wise
graph
isomorphism. In Figure 6, edges are added or deleted so that
2. for each edge ej = {vA, vB} ∈ E
to obtain the graph Gk as shown in Figure 4. Let the
3. if not Processed[j]
4. if vA and vB appear in different columns in VM
ISSN: 2231-5381
http://www.ijettjournal.org
Page 253
International Journal of Engineering Trends and Technology (IJETT) – Volume 13 Number 6 – Jul 2014
5. CE ←CE + 1; /* increment number of cross edges */
economized graph Gk =(Vk,Ek) from the algorithm to the
6. else if vA = VM[c, a] and vB = VM[c, b]
original data graph G = (V,E) for K= 10 is compared. It
7. Marked[j] ← True; Processed[j] ← True;
considered a random graph as a baseline case. The random
8. Add[j].VM ← {a, b};
graph has been generated by fixing the number of vertices to
9. for each e′ = {VM[i, a],VM[i, b]} /* isomorphic edges */
the same number in the dataset at hand, and also setting the
10. if e′ = er ∈ E
average degree to be the same as the original graph for the
11. Add[j].cnt ← Add[j].cnt − 1;
dataset. Overall, the economized graphs are able to preserve
12. Processed[r] ← True;
the essential graph declaration. In most cases the curves for
13. retain only entries Add[i] where Marked[i] = True; /*Let
Gk and Gare aligned, and the random graph behaves very
there be n retained entries in Add[] */
different. The experiments with
14. sort the retained entries Add[] by Add[].cnt in increasing
utility qualities are very similar.
K= 5,10,15,20 and the
order;
15. determine cut point x in the sorted Add[] to minimize
|Σ1≤ i ≤ x Add[i].cnt − (Σx< i ≤n(k − Add[i].cnt) + CE)|
16. for each 1 ≤ i ≤ x
17. add all isomorphic edges determined by Add[i].VM to
Figure 11: Distrubutions of degrees(k=10)
Gk;
18. return Gk;
In Algorithm 4, for each pair of rows C and D in VM, if
(VM[i; a], VM[i; b] ) corresponds to some edge e in E, the
entry of Add[j] for either e or exactly one of the edges in E
Figure 12: Distributions of Cluster coefficients (Transitivity)
isomorphic to e will be filled so that Add[j]:vm = {a; b} and
(k=10)
Add[j]:cntis k minus the number of edges in E isomorphic to
IV. K-SYMMETRIC MODEL FOR ANONYMIZATION
e and also Marked[j]is set to True. The Processes helps to
Definition (π‘˜-Symmetry Invisibility). Given a graph 𝐺and
avoid processing an edge which has been considered during
an integer π‘˜, if ∀Δ ∈ (𝐺), |Δ| ≥ π‘˜, then 𝐺is π‘˜-symmetric, or,
the processing of some other edge. Then after the sorting at
𝐺satisfies the requirement of π‘˜-symmetry invisibility.
Line 14, the Add[e] entries are in increasing order of
𝐾-symmetry invisibility is a generalization of any other π‘˜-
Add[e]:cnt, which is the number of edges to be added if all
anonymities of graphs based on different structural
edges isomorphic to e should exist in Gk. It cut the point x
constraints on vertices. In graph is π‘˜-symmetric, it also
and determines a point in the sorted list where all entries
satisfies any other π‘˜-invisibility requirements defined in
above the point correspond to edge addition, and those some
terms of other structural constraints on vertices, such as
will be involve edge deletion.
degree, community’s and so on. The problem becomes:
B. Evaluating Robust K-Isomorphism
Given a graph 𝐺and an integer π‘˜, how to modify 𝐺so that
The HEP-Th database presence declaration in theoretical
high energy physics. The EUemail is communication
the resulting graph 𝐺is π‘˜-symmetric? It only consider
vertex/edge insertion as the graph modification operations.
network data set generated using data from a large europian
Consequently ,the original graph 𝐺must be a subgraph of the
research institute, and live general is an online generalizing
economized graph 𝐺.
community.
Knobs are users and edges represents
relationship of friend list between users.
Since
Figures 11to 12 show the results of the experiments with
respect to the three measurements. The properties of the
ISSN: 2231-5381
A. Orbit Copying Operation
the
vertices
in
each
orbit
are
already
automorphically equivalent to each other, the basic idea to
modify agraph 𝐺to be k-symmetric is then to make duplicate
http://www.ijettjournal.org
Page 254
International Journal of Engineering Trends and Technology (IJETT) – Volume 13 Number 6 – Jul 2014
copies of each orbit in (𝐺), until the total size of each orbit
≤ π‘˜) until the size of the union of𝑉iand its copies are equal
combined with its copies is at least π‘˜. the concept of
to or larger than π‘˜.
automorphism partition to sub-automorphism partition,
Algorithm : Anonymization
which underlies thedefinition of orbit copying operation as
Input: a graph 𝐺and its automorphism partition(𝐺) = {𝑉1,
well as the following theoretic analysis.
𝑉2, ..., π‘‰π‘š}; the specified threshold π‘˜
Definition (Sub-automorphism partition). Let 𝐺be a graph
Output: a π‘˜-symmetric graph 𝐺′ with respect to 𝐺and(𝐺)
and 𝒱be a vertex partition on (𝐺). 𝒱is a sub-automorphism
1 for 1 ≤ 𝑖≤ π‘šdo
partition of 𝐺if ∀𝑂∈ 𝒱, ∀𝑒, 𝑣∈ 𝑂,∃𝑔∈ 𝐴𝑒𝑑(𝐺) such that
2if |𝑉i| ≥ π‘˜then
𝑒 = 𝑣and 𝒱 = 𝒱.Clearly, if 𝒱is a sub-automorphism
3 Continue;
partition of 𝐺, then𝒱is finer than π‘‚π‘Ÿπ‘(𝐺), which means that
4 end
for each 𝑉𝑖∈ 𝒱,there must exist some Δ𝑗∈ π‘‚π‘Ÿπ‘(𝐺) such that
5 else
𝑉𝑖⊆ Δ𝑗. In particular, (𝐺) is also a sub-automorphism
6Let 𝑉′i= 𝑉i;
partition of𝐺. Hence, sub-automorphism partition can be
7while |𝑉′|<π‘˜do
considered asa generalization of automorphism partition.
8(𝐺,(𝐺), 𝑉i);
g
g
9 𝑉′i= 𝑉′i∪ 𝑉i;
10 end
11 end
12 end
Figure 13: Illustration of the orbit copying operation
Definition (Orbit Copying). Given a graph G and a sub-automorphism
partition 𝒱 of G, suppose V ∈ 𝒱. An orbit copying operation on
(G, 𝒱, V) is defined as follows:
For each v ∈ V, introduce a new vertex v′ into graph G and:
1. if (u, v) ∈ E(G), u ∈ U, U ∈ 𝒱 and U ≠ V, then add an edge (u, v′) into G;
2. if (u, v) ∈ E(G), u ∈ V, then add an edge (u′, v′) into G.
Theorem 3. Let G be a graph and 𝒱 = {V_1, V_2, ..., V_m} be a
sub-automorphism partition of G. Suppose O is any orbit copying operation
sequence of length N performed on G. Let the resulting vertex partition and
the corresponding graph be 𝒱(N) and G(N), where each cell in 𝒱(N) is the
union of the original orbit and all of its copies. Then 𝒱(N) is a
sub-automorphism partition of G(N).
B. Anonymization Procedure for the K-Symmetric Model
Based on the orbit copying operations, an anonymization procedure modifies
a graph to be k-symmetric, as shown in Algorithm 5. The basic idea of the
anonymization is to repeat the orbit copying operation for each
V_i ∈ Orb(G) (|V_i|
C. Experimental Results of Excluding Hubs in the K-Symmetry Model
Hub vertices are the vertices in the network with high degree; they
dominate the anonymization cost of the k-symmetry model.
The benefits of excluding hub vertices from anonymization are shown by
experimental results on the network Net trace, whose degree distribution is
extremely heterogeneous. First, the relationship between the anonymization
cost (quantified by the total number of new vertices and edges inserted)
and the percentage of vertices not protected is investigated. In Figure 14,
as the fraction of vertices excluded (in descending order of degree)
increases slightly, the anonymization cost decreases dramatically. For
instance, when k = 10, if the 5% of vertices with the largest degrees are
excluded from protection, the number of inserted edges decreases from
201,913 to 13,444, saving about 93% of the overhead. When only the top 1%
of hub vertices are excluded from protection, the number of inserted edges
decreases from 201,913 to 77,749, saving 61.5% of the overhead. Figure 14
also shows that the number of edges inserted usually dominates the overall
cost.
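The overhead savings quoted for k = 10 follow directly from the inserted-edge counts. The short Python check below recomputes them; the helper name `overhead_saving` is hypothetical, and the only inputs are the three edge counts reported in the text.

```python
def overhead_saving(before, after):
    """Percentage of inserted edges avoided when hubs are excluded."""
    return 100.0 * (before - after) / before


baseline = 201_913  # inserted edges when no vertex is excluded
print(round(overhead_saving(baseline, 13_444), 1))  # top 5% excluded: 93.3
print(round(overhead_saving(baseline, 77_749), 1))  # top 1% excluded: 61.5
```

Excluding the top 5% of hub vertices therefore saves roughly 93.3% of the inserted edges, and excluding the top 1% saves roughly 61.5%.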
Figure 14: Anonymization cost when some hub vertices are excluded from
protection.
V CONCLUSION
This paper identifies a new problem in sociable network communication, the
mutual friend attack, and proposes k-NMF invisibility, together with an
algorithm to ensure it, alongside k-degree invisibility. To defend
sensitive information about knobs and chains in a network data set,
k-security is proposed, and robust k-isomorphism protects the target
against the opponent. Since secrecy protection in sociable networks must
guard against any possible SR, the k-symmetric model is proposed to resist
structural knowledge with efficiency and effectiveness.
AUTHORS:
1. Gowrish K.G is a P.G. Scholar in the Department of Computer Science &
Engineering, in the specialization of Software Engineering, Sreenivasa
Institute of Technology and Management Studies, Chittoor, Andhra Pradesh,
India.
2. M. Ashok Kumar is an Assistant Professor in the Department of Computer
Science & Engineering, in the specialization of Software Engineering,
Sreenivasa Institute of Technology and Management Studies, Chittoor,
Andhra Pradesh, India.
3. Dr. M. Giri is Professor & Head in the Department of Computer Science &
Engineering, in the specialization of Software Engineering, Sreenivasa
Institute of Technology and Management Studies, Chittoor, Andhra Pradesh,
India.