International Journal of Engineering Trends and Technology (IJETT) – Volume 22 Number 10 April 2015
Computer Department, Marathwada Mitra Mandal’s Institute of Technology, Pune
Pune University
Abstract
Data leakage, put simply, is the unauthorized transmission of private or sensitive data or information from within an organization to a third party, i.e., an unauthorized recipient. In organizations, detecting the leaker and determining who caused the leak is difficult. Previous approaches offer many techniques that use fake objects to find the leaker and to estimate how much data was leaked. In this paper we present an s-random algorithm, a probability function, fake record generation and an algorithm for data distribution. We also implemented a Data Watcher to detect the guilty agents and calculate the probability of guilt.
Keywords
Data privacy, Data Leakage, Detection, Data watcher.
I. INTRODUCTION
In organizations, data leakage problems are rising rapidly. They affect sensitive data such as database information, login and task details, secret data related to the organization, data transferred from one party to another, distributed data and business data, which is the main property of an organization.
Sensitive data in companies and organizations includes intellectual property (IP), financial information, personal information (such as credit card data) and other information depending on the business and the industry. In a real-world scenario, a distributor needs to share sensitive data among various stakeholders such as employees, business partners and customers. This increases the risk that confidential information will fall into unauthorized hands [1].
The problem of data leakage is even more relevant and crucial nowadays, as much of our information is available online through social networking sites and third-party aggregators.
In this paper we present a method to identify the guilty agents. Section 2 presents a study of related work in this area. Section 3 presents the proposed work in detail, including the system architecture and an explanation of all the modules. Section 4 presents the results of the system along with a graphical representation, followed by the conclusion in Section 5 and the references.
II. EXISTING SYSTEM
Early work in the area of data leakage detection resulted in the watermarking technique. Here, a uniquely identifying text or image is embedded within each copy that is distributed to authorized agents. When a leak occurs, this unique code helps identify the party that was responsible for it.
For example, a hospital may give patient records to researchers who will devise new treatments. Similarly, a company may have partnerships with other companies that require sharing customer data. Often, however, an agent comes to know that the data is watermarked and erases the watermark, so the distributor never learns who the leaker is.
A. Drawbacks of Existing System
1. Watermarks can be very useful in some cases, but they involve some modification of the original data.
2. Furthermore, watermarks can sometimes be destroyed if the data recipient is malicious, i.e., an agent can easily remove the watermark from the data using various software.
3. The existing system has a few further problems: the set of agents is fixed, and it works only with agents whose requests are known in advance.
III. PROPOSED SYSTEM
The proposed system uses a strategy that can both detect a leak and identify the guilty agents. It contains three major parts: Administrator, Mail Sender and Data Watcher [3].
The Administrator module handles database maintenance, agent registration, fake record generation [5] and data distribution. The Mail Sender software is provided to agents for sending mails to genuine clients. The third module, Data Watcher, is used as a guilt model. It fetches all the emails in the client's inbox, and if it finds an email from any other address or source, it reports the agent as guilty. The Data Watcher also reports the probability that the agent is guilty.
A. System Architecture
The system architecture contains three modules: the Administrator module, the Mail Sender module and the Data Watcher module. Each of the three modules performs different functions to produce the results. Detecting an insider is difficult, and making the method more efficient and trustworthy is challenging. Data leakage causes huge losses to organizations, individuals and multinational companies, and there is also a chance that clients' confidential or personal data will be misused. For this reason, the Data Watcher is very helpful for detecting data leakage and identifying guilty agents. The system architecture is shown in Figure 1.
In the Administrator module, fake records are generated and the data is distributed.
Algorithm for fake record generation
1. Delete the previous Stock data.
2. Create an array al to store the clients' contacts.
3. Create another array al1 to store fake records from the Stock data.
4. for (int i = 0; i < al.size(); i++)
5. Get all contact details.
6. Insert the original and fake records.
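The steps above can be sketched in Java roughly as follows. This is a minimal illustration and not the authors' implementation: the `Contact` class, the `rebuildStock` method and the example addresses are hypothetical, and the sketch assumes that "Stock data" is simply the list handed out for distribution. It appends the fake records after the genuine ones instead of placing them randomly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: class, field and method names are assumptions.
final class FakeRecordMixer {

    /** One contact row; `fake` marks records planted by the administrator. */
    static final class Contact {
        final String name;
        final String email;
        final boolean fake;

        Contact(String name, String email, boolean fake) {
            this.name = name;
            this.email = email;
            this.fake = fake;
        }
    }

    /**
     * Rebuilds the stock list from scratch: genuine client contacts (al)
     * first, then the fake records (al1).
     */
    static List<Contact> rebuildStock(List<Contact> al, List<Contact> al1) {
        List<Contact> stock = new ArrayList<>();   // step 1: previous stock is discarded
        for (int i = 0; i < al.size(); i++) {      // step 4: loop over the client contacts
            Contact c = al.get(i);                 // step 5: get the contact details
            stock.add(c);                          // step 6: insert the original record
        }
        stock.addAll(al1);                         // step 6: insert the fake records
        return stock;
    }

    public static void main(String[] args) {
        List<Contact> al = Arrays.asList(
                new Contact("Client One", "one@client.example", false),
                new Contact("Client Two", "two@client.example", false));
        List<Contact> al1 = Arrays.asList(
                new Contact("Decoy", "decoy@trap.example", true));
        for (Contact c : rebuildStock(al, al1)) {
            System.out.println(c.email + (c.fake ? " (fake)" : ""));
        }
    }
}
```

Keeping the `fake` flag on each record is one possible design choice: it lets the administrator later recognise which planted address received unexpected mail.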
Figure 1: System Architecture
B. Modules
This system contains three modules.
1) Administrator
2) Mail Sender
3) Data Watcher
1) Administrator:
The Administrator module is one of the most important modules in this system. It holds all the important information about the agents and clients, and it has the following three functions.
a. Agent registration
b. Generation of fake records
c. Distribution of records
Here, the agent registration details are maintained and the sensitive data provided to each agent is specified. The design of the whole database is also done in this module.
2) Mail Sender:
Mail Sender is software that the administrator provides to the agents. It has the following functions.
a. Download CSV file
b. Provide attachments (if any)
c. Send mails
All agents must send their emails to clients using only this Mail Sender software. To send the emails, the agent has to attach a CSV file in the Mail Sender software. If there are any other attachments, there is an option to send them as well.
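As an illustration of the CSV step, the sketch below extracts recipient addresses from CSV text. The paper does not specify the file layout, so the `name,email` column order, the class name `RecipientCsv` and the method `parseEmails` are assumptions made only for this example. Actual mail delivery is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: the "name,email" layout is an assumption.
final class RecipientCsv {

    /** Returns the e-mail addresses found in the second column of each line. */
    static List<String> parseEmails(String csvText) {
        List<String> emails = new ArrayList<>();
        for (String line : csvText.split("\\R")) {   // \R matches any line break
            String[] cols = line.split(",");
            if (cols.length < 2) {
                continue;                            // skip blank or malformed lines
            }
            String email = cols[1].trim();
            if (email.contains("@")) {               // also skips a header row
                emails.add(email);
            }
        }
        return emails;
    }

    public static void main(String[] args) {
        String csv = "name,email\nAsha,asha@client.example\nRavi, ravi@client.example\n";
        System.out.println(parseEmails(csv));
    }
}
```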
3) Data Watcher:
Data Watcher is used to identify the guilty agent. It is also called the "Guilt Model". It has the following functions.
a. Fetch emails
b. Detect guilty agents
c. Calculate probability
This is the third module in the system. It fetches all the emails in the client's inbox. If the inbox contains any email that is not from the agent or the organization, it flags the respective agent as a guilty agent. Probability calculation is also part of the Data Watcher: it compares the computed value with the threshold value, and if the value is greater than the threshold, it reports the agent as guilty along with the probability.
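The detection step described above can be sketched as a filter over the sender addresses found in a client's inbox. This is only an illustration under stated assumptions: the inbox is assumed to be already fetched into a list of sender strings, the agent addresses are assumed to be stored in lower case, and the names `InboxWatcher`, `unknownSenders` and `orgDomain` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only: names and the trust rule are assumptions.
final class InboxWatcher {

    /**
     * Returns the senders that are neither a registered agent address
     * nor an address inside the organization's own domain.
     */
    static List<String> unknownSenders(List<String> inboxSenders,
                                       Set<String> agentAddresses,
                                       String orgDomain) {
        List<String> unknown = new ArrayList<>();
        String suffix = "@" + orgDomain.toLowerCase(Locale.ROOT);
        for (String sender : inboxSenders) {
            String s = sender.trim().toLowerCase(Locale.ROOT);
            boolean trusted = agentAddresses.contains(s) || s.endsWith(suffix);
            if (!trusted) {
                unknown.add(s);                      // mail from an unexpected source
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Set<String> agents = new HashSet<>(Arrays.asList("agent1@partner.example"));
        List<String> senders = Arrays.asList(
                "agent1@partner.example", "Spam@other.example", "hr@corp.example");
        System.out.println(unknownSenders(senders, agents, "corp.example"));
    }
}
```

A non-empty result is what would count as "new mails" for the probability calculation.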
The formula to calculate the probability is:
$$P = \frac{(C_n / T_c) \times 100}{2} \qquad (1)$$
where $C_n$ is the count of new mails and $T_c$ is the total number of clients.
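A minimal Java sketch of this formula and of the threshold comparison is given below. The class and method names are hypothetical, and the threshold used in `main` is an arbitrary example value, since the paper only states that the threshold is set by the Admin.

```java
// Illustrative sketch only: names and the example threshold are assumptions.
final class GuiltProbability {

    /** Computes P = ((Cn / Tc) * 100) / 2 for Cn new mails and Tc total clients. */
    static double compute(int newMails, int totalClients) {
        if (totalClients <= 0) {
            throw new IllegalArgumentException("totalClients must be positive");
        }
        if (newMails < 0) {
            throw new IllegalArgumentException("newMails must not be negative");
        }
        return (100.0 * newMails / totalClients) / 2.0;
    }

    /** An agent is reported as guilty when P is greater than the threshold. */
    static boolean isGuilty(double probability, double threshold) {
        return probability > threshold;
    }

    public static void main(String[] args) {
        double p = compute(3, 10);          // Cn = 3, Tc = 10
        double threshold = 10.0;            // example value only
        System.out.println("P = " + p + ", guilty = " + isGuilty(p, threshold));
    }
}
```

For example, with Cn = 3 new mails and Tc = 10 clients, the formula gives P = ((3 / 10) × 100) / 2 = 15.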
Algorithm for probability calculation:
1. Start
2. Fetch all the mails in the client's inbox.
3. Show all the new mails in the client's inbox.
4. Count all new mails as Cn.
5. Get the total number of selected clients as Tc.
6. Calculate the probability using the formula.
7. Return the probability.
8. End
IV. RESULT
We developed a system that gives effective results with minimal effort and without much complexity. We created email addresses for the agents and the clients; for this work we used the Google Apps ID provided by our college. While the system is running, if an email from any other address is found, the Data Watcher reports the guilty agent along with its probability. We added fake objects and distributed them randomly to guard against data leakage. This system gives a more efficient way to restrict data leakage by agents.
Figure 2: Probability Graph
The graph shows the agents and their probability values. Up to the threshold value an agent is not considered guilty, but once the threshold value is exceeded the agent is considered guilty.
The probability graph shows different values for different agents. With the help of these values the Data Watcher detects the guilty agents. In the graph,
x-axis = no. of agents
y-axis = probability values
The blue bars show probabilities less than the threshold value, which is set by the Admin.
The red bars show probabilities greater than the threshold value.
V. CONCLUSION
We referred to different papers and studied their proposed work. With the help of these works we have developed a new system that is simple to understand and study. The system uses a random algorithm to select clients; in future, for large-scale applications, clients will be selected as per the agent's request. The Data Watcher is very useful for detecting data leakage and identifying guilty agents. The presented system can be used in different types of organizations, including policy-related organizations where the role of an agent is important between two companies. It can also be used wherever confidential or personal data needs to be secured. For this project we present domain filtration [4] and probability calculation, which make guilty agent detection effective.
ACKNOWLEDGMENT
We would like to express our gratitude to all those who helped us complete this work. We want to thank our guide, Prof. Mane G.V., for her continuous help and generous assistance. She helped with a broad range of issues, from giving us direction and helping to find solutions to outlining the requirements, and she always had time to see us. We would like to thank our colleagues, who helped us from time to time and gave good suggestions. We also extend sincere thanks to all the staff members of the Department of Computer Engineering and Information Technology for helping us in various aspects.
REFERENCES
[1] Panagiotis Papadimitriou and Hector Garcia-Molina, "Data Leakage Detection", IEEE Transactions on Knowledge and Data Engineering, Vol. 23, No. 1, January 2011.
[2] P. Papadimitriou and H. Garcia-Molina, "Data Leakage Detection", technical report, Stanford Univ., 2008.
[3] N. P. Jagtap, S. J. Patil, A. K. Bhavsar, "Implementation of Data Watcher in Data Leakage Detection System", International Journal of Computer Technology, Volume 3, No. 1, Aug. 2012.
[4] Ankit Agarwal, Mayur Gaikwad, Kapil Garg, Vahid Inamdar, "Robust Data Leakage and Email Filtering System", International Conference on Computing, Electronics and Electrical Technologies, 2012.
[5] Keerthi P., M. Sheshikala, D. Rajeswara Rao, "Guilty Agent Detection by Using Fake Object Allocation", International Journal of Computer Technology, Volume 1, 2013.
[6] Rudragouda G. Patil, "Development of Data Leakage Detection Using Data Allocation Strategy", International Journal of Computer Applications in Engineering Sciences, Vol. I, Issue II, June 2011.
[7] Ajay Kumar, Ankit Goyal, Ashwani Kumar, Navneet Kumar Chaudhary, Sowmya Kamath S., "Comparative Evaluation of Algorithms for Effective Data Leakage Detection", Proceedings of 2013 IEEE Conference on Information and Communication Technologies (ICT), 2013.
[8] Ahirrao P. P., Rai S. S., Pathania B. R., "Data Leakage Detection", International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume 3, Issue 1, March 2014.
[9] B. Sruthi Patil, M. L. Prasanthi, "Modern Approaches for Detecting Data Leakage Problems", International Journal of Engineering and Computer Science, ISSN: 2319-7242, Volume 2, Issue 2, Feb. 2013.
[10] Ajay Kumar, Ankit Goyal, Ashwani Kumar, Navneet Kumar Chaudhary, Sowmya Kamath S., Dept. of Information Technology, NITK Surathkal, India, "Comparative Evaluation of Algorithms for Effective Data Leakage Detection", Proceedings of 2013 IEEE Conference on Information and Communication Technologies (ICT), 2013.
ISSN: 2231-5381 http://www.ijettjournal.org