International Journal of Applied Engineering Research ISSN 0973-4562 Volume 11, Number 9 (2016) pp 6881-6885 © Research India Publications. http://www.ripublication.com

Improved Steganography using Enhanced K Strange Points Clustering

Terence Johnson
PhD scholar (Information Technology), AMET University, Chennai, India. Assistant Professor, Dept. of Computer Engineering, Agnel Institute of Technology & Design, Goa, India.

Dr. Santosh Kumar Singh
Head, Dept. of Information Technology, Thakur College of Science and Commerce, Kandivali (E), Mumbai, Maharashtra, India.

Valerie Menezes
Assistant Professor, Dept. of Computer Engineering, Agnel Institute of Technology & Design, Goa, India.

Edrich Rocha, Shriyan Walke, Diksha Prabhu Khorjuvekar, Sana Pathan
BE Students, Dept. of Computer Engineering, Agnel Institute of Technology & Design, Goa, India.

Stego-system: A stego-system in image steganography is one in which a secret message can be hidden such that no third party is aware of its existence [2]. The output image of this process is known as the stego-image, and it is almost identical to the input image.

Encoder: The encoder is the part of the process that embeds the secret message into the cover medium.

Decoder: The decoder is the part of the process that receives the image containing the secret data.

Clustering: A group of similar objects is known as a cluster [10]. Clustering is the process that forms these clusters [5]. Clusters are created based on features such as color and size. These clusters contain useful information [6]. We can get different sets of data using clustering [7]. The objects in one cluster are similar to each other and different from the objects in other clusters [8]. This approach is very useful in image steganography. In this paper we perform clustering based on pattern matching using color (RGB). We first divide the image into n clusters based on color.
After clustering, we select the largest cluster and embed the secret message in that cluster using steganography. The stego-image is then sent over some channel to the receiver. At the other end, we apply the inverse procedure, with the stego-image as input. After the clusters are formed, the largest cluster is identified and the secret message is extracted.

Abstract

Steganography is the act of hiding information. It uses a cover medium to hide the secret information within itself. We perform steganography using clustering. In this paper, we implemented the Enhanced K Strange Points Clustering Algorithm to achieve steganography. We then compare the results obtained with the K Means clustering algorithm and find that our methodology of implementing steganography works better with the Enhanced K Strange Points Clustering Algorithm. The LSB technique is used to hide data in the cover medium. Any type of image can act as a cover medium. We propose an improved scheme which provides a better hiding capacity.

Keywords: Steganography, Clustering, LSB technique, Steganography using Clustering, k-Means clustering and Enhanced K Strange Points Clustering Algorithm.

Introduction

In this new era, most communication takes place through the internet. Data transferred this way is not secure: it can be attacked and decrypted by a third party. We can therefore use steganography to hide the secret data such that its existence cannot be detected. Steganography is the process of hiding data in a cover medium. It hides the data in such a way that only the intended receiver knows of the existence of the secret message. In earlier days data was hidden using wax tablets, writing tables, etc. Now, the data is transmitted in the form of text, image or audio with the help of the cover medium [1].
Before going into the core of each algorithm, we define the terms that will be used, to improve readability and make it easier to understand how each algorithm works with regard to the others.

Enhanced K-Strange Clustering Algorithm

The Enhanced K-Strange Points Clustering Algorithm [3] presented in this paper is about discovering strange points which are maximally apart from each other. The algorithm also addresses the effect on running time of the choice of the two farthest points from the dataset. It first finds the minimum point, Kmin, of the dataset. It finds this value by computing the Euclidean distance [9] between all the points from the dataset. The algorithm then finds a second point, the one farthest from Kmin. This point is denoted Kmax. The algorithm improves the running time by first finding the minimum of the dataset and then finding the point which is farthest from that minimum. This procedure avoids the need to select the two strange points by calculating the Euclidean distances between all the points from the dataset, which improves the performance of the algorithm. Finding the Kmin and Kmax points requires O(n) time.

Next the algorithm finds the third strange point in the dataset, the one maximally separated from the two strange points already found. If the third strange point is closer to Kmin, its position is corrected using (1) by selecting the middle point between the third strange point and Kmax, and this becomes the final strange point. However, if the third strange point is closer to Kmax, it is relocated to the middle point between the third strange point and Kmin, and this new point becomes the final third strange point [4].
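The selection of three strange points described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the text does not define the "minimum" point, so Kmin is assumed here to be the point with the smallest Euclidean norm; "maximally separated from the two strange points" is assumed to mean maximizing the smaller of the two distances; and since equation (1) is not reproduced in the text, the correction is assumed to be the component-wise midpoint. The function names are hypothetical.

```python
import math

def midpoint(p, q):
    """Component-wise middle point of p and q (assumed form of the correction step)."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

def strange_points_k3(points):
    """Pick three strange points from a list of equal-length tuples.

    Assumes at least three distinct points are supplied.
    """
    origin = (0,) * len(points[0])
    # Assumption: Kmin is the point closest to the origin (one O(n) pass).
    k_min = min(points, key=lambda p: math.dist(p, origin))
    # Kmax is the point farthest from Kmin (a second O(n) pass).
    k_max = max(points, key=lambda p: math.dist(p, k_min))
    # Assumption: the third point maximizes its smaller distance to Kmin/Kmax.
    candidates = [p for p in points if p not in (k_min, k_max)]
    third = max(candidates,
                key=lambda p: min(math.dist(p, k_min), math.dist(p, k_max)))
    d_min, d_max = math.dist(third, k_min), math.dist(third, k_max)
    if d_min < d_max:
        # Closer to Kmin: relocate to the midpoint between it and Kmax.
        third = midpoint(third, k_max)
    elif d_max < d_min:
        # Closer to Kmax: relocate to the midpoint between it and Kmin.
        third = midpoint(third, k_min)
    # Equidistant points are left unchanged.
    return k_min, k_max, third

result = strange_points_k3([(0, 0), (10, 0), (4, 0), (1, 0)])
```

In this toy run the candidate (4, 0) is closer to Kmin = (0, 0), so it is moved to the midpoint between itself and Kmax = (10, 0).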
3. Perform clustering (Enhanced K Strange) based on color.
4. Select the largest cluster, in which the secret message is to be hidden.
5. Hide the message in the selected cluster using LSB steganography.
6. Transmit the stego-image over the selected channel.

Figure 1: Sender side process

The LSB Technique: The least significant bits, i.e., the 7th and 8th bits of each color byte of an image pixel, are replaced with bits of the secret message. When using a 24-bit image, one can store up to 6 bits in each pixel by changing the 2 least significant bits of each of the red, green and blue components. As an example, suppose that we have three adjacent pixels (9 bytes) with the RGB encoding:

10010101 00001101 11001001
10010110 00001111 11001011
10011111 00010000 11001011

When the number 254, whose binary representation is 11111110, is embedded two bits at a time into the least significant bits of this part of the image, we get the following data after embedding:

10010111 00001111 11001011
10010110 00001111 11001011
10011111 00010000 11001011

Here only 3 bits needed to be changed to embed the message. Only a few bits in an image need to be modified to hide a secret message. Since there are 256 possible intensities of each primary color, changing the least significant bits of a pixel results in only small changes in the intensities of the colors.

At the receiver side: Here we apply the inverse procedure, which is as follows:
1. Take the input image (stego-image).
2. Scan the image according to RGB (pattern matching).
3. Perform clustering (Enhanced K Strange) based on color.
4. Select the largest cluster, which contains the secret message.
5. Perform extraction to get the secret message.

Figure 2: Receiver side process

Experimental Results

First we implement the k-Means algorithm and show the outputs at the sender and receiver sides respectively. We perform the same operations as described earlier (Introduction).
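The 2-bit LSB embedding from the worked example in the LSB Technique paragraph can be checked with a short sketch. The function name `embed_2lsb` is hypothetical, and the bit ordering (most significant pair of the message byte first) is an assumption chosen because it reproduces the byte values given in the example.

```python
def embed_2lsb(cover_bytes, value):
    """Embed one 8-bit value into the two LSBs of the first four bytes."""
    out = list(cover_bytes)
    for i in range(4):
        # Take the message bits two at a time, most significant pair first.
        two_bits = (value >> (6 - 2 * i)) & 0b11
        # Clear the two LSBs of the cover byte, then write the message bits.
        out[i] = (out[i] & 0b11111100) | two_bits
    return out

# The three adjacent pixels (9 bytes) from the example.
cover = [0b10010101, 0b00001101, 0b11001001,
         0b10010110, 0b00001111, 0b11001011,
         0b10011111, 0b00010000, 0b11001011]

stego = embed_2lsb(cover, 254)  # 254 = 0b11111110

# Count how many individual bits differ between cover and stego bytes.
changed = sum(bin(a ^ b).count("1") for a, b in zip(cover, stego))

print([format(b, "08b") for b in stego[:4]])
print("bits changed:", changed)
```

Only the first four bytes carry the message byte; the remaining five bytes are left untouched, and three bits in total differ from the cover, in line with the example.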
The image selected to perform the above technique is shown in the figure below (input image):

Figure 3: An image selected for the above proposed technique.

Motivation

When we studied the k-Means algorithm [4], we found that k-Means doesn't converge for large datasets. During the implementation we noticed that the algorithm is time consuming and takes many iterations to obtain the output. The time complexity is directly proportional to the size of the image. To reduce this cost we use the Enhanced K Strange Points Clustering Algorithm, which reduces the time complexity and the number of iterations.

Proposed Methodology

At the sender side: To hide the data, the following steps are followed:
1. Take a color image which will act as a cover medium.
2. Scan the image according to RGB (pattern matching).

From the obtained output we select the largest cluster and insert the message in the same. This is shown in the figure below.

The Enhanced K-Strange Clustering Algorithm

The Enhanced K-Strange Points Clustering Algorithm runs faster than the K Means clustering method for growing size and higher dimensionality of data. The table below shows the purity of the 2 methods, a measure of the extent to which a cluster contains items belonging to a single class. From the table it can be seen that not only does the Enhanced K-Strange Points Clustering Algorithm run faster than the K Means clustering method, but in the process it also provides good quality of clusters.

Clustering Algorithm         K Means   Enhanced K Strange
Purity (with Iris Dataset)   0.88      0.86

We implement the Enhanced K-Strange Clustering Algorithm and show the outputs at the sender and receiver sides respectively.
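For reference, the purity measure reported in the table above is commonly computed as the fraction of items that belong to the majority class of their cluster. The paper does not state its formula, so the standard definition is assumed in this sketch; the function name and the toy labels are hypothetical and are not the Iris data.

```python
from collections import Counter

def purity(cluster_ids, true_labels):
    """Fraction of items that fall in the majority class of their cluster."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        # Count class labels separately inside each cluster.
        by_cluster.setdefault(cluster, Counter())[label] += 1
    # Sum the size of the majority class of every cluster.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)

# Toy data: cluster 0 holds two 'a' and one 'b'; cluster 1 holds three 'b'.
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_labels = ["a", "a", "b", "b", "b", "b"]
toy_purity = purity(toy_clusters, toy_labels)
```

A purity of 1.0 means every cluster contains items of a single class; lower values indicate more mixing.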
The image selected to perform the above technique is shown in the figure below (input image):

Figure 4: An image selected as input for Enhanced K Strange.

The clusters formed for the above image are shown in the figure below:

Figure 5: Output of Enhanced K Strange Clustering at sender side.

Figure 6: Snapshot of message insertion into largest cluster.

The ASCII value of each character of the message is converted into an 8-bit binary number, and this is inserted into the selected image. The message length is sent along with the message to the receiver over the selected channel. Example: the message length of 'helloooo' is 8, and hence we prepend the message length (8) to the message, giving '8helloooo'. The message length is required so that only that many characters of the message are displayed. A garbage value indicates that there is no message inserted at that position inside the image. Extraction is done using the LSB technique: the last 2 bits of each color byte are read, four such pairs form an 8-bit value, and this value is converted back to its ASCII character, which gives the message. The image received at the other end is shown in Figure 7.

Figure 7: The stego-image.

The clusters formed on the stego-image are shown in the figure below:

The ASCII value of each character of the message is converted into an 8-bit binary number, and this is inserted into the selected image. The message length is sent along with the message to the receiver over the selected channel. Example: the message length of 'be' is 2, and hence we prepend the message length (2) to the message, giving '2be'. The message length is required so that only that many characters of the message are displayed. Figure 7 shows how a message is stored inside an image.
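The length-prefixed framing and the 2-bit extraction described above can be sketched as a round trip over a plain list of bytes. This is a minimal sketch under assumptions: the carrier is a flat list of color bytes standing in for the largest cluster, the length is assumed to fit in a single decimal digit (as in both examples; the text does not say how longer lengths are delimited), and all function names are hypothetical.

```python
def frame_message(message):
    """Prefix the message with its length, e.g. 'helloooo' -> '8helloooo'."""
    return f"{len(message)}{message}"

def embed(carrier, text):
    """Write each ASCII character into the two LSBs of four consecutive bytes."""
    out = list(carrier)
    if 4 * len(text) > len(out):
        raise ValueError("carrier too small for message")
    for i, ch in enumerate(text):
        code = ord(ch)
        for j, shift in enumerate((6, 4, 2, 0)):
            pos = 4 * i + j
            out[pos] = (out[pos] & 0b11111100) | ((code >> shift) & 0b11)
    return out

def extract(carrier, n_chars):
    """Read n_chars characters back from the two LSBs of the carrier bytes."""
    chars = []
    for i in range(n_chars):
        code = 0
        for byte in carrier[4 * i:4 * i + 4]:
            code = (code << 2) | (byte & 0b11)
        chars.append(chr(code))
    return "".join(chars)

def recover(carrier):
    """Assumption: the first character is a single-digit message length."""
    length = int(extract(carrier, 1))
    return extract(carrier, 1 + length)[1:]

carrier = [0b10101010] * 64          # arbitrary stand-in for cluster bytes
framed = frame_message("helloooo")   # '8helloooo'
stego = embed(carrier, framed)
message = recover(stego)
```

Bytes beyond the framed message keep their original low bits, which is why the receiver needs the length to avoid displaying garbage values.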
A garbage value indicates that there is no message inserted at that position inside the image. Extraction is done using the LSB technique: the last 2 bits of each color byte are read, four such pairs form an 8-bit value, and this value is converted back to its ASCII character, which gives the message. The image received at the other end is shown in the figure below:

Figure 8: Output of Enhanced K Strange Clustering at receiver side along with message extracted.

At the receiver end the stego-image is received, clustering is performed, and the same procedure takes place. In Figure 8, we can see that the received message is 'helloooo', which is the message that was inserted at the sender side.

Comparison with K Means

The same was implemented using K Means and is shown below for comparison with the proposed methodology. The clusters formed for the above image are shown in the figure below:

Figure 9: Output of K Means Clustering at sender side.

From the obtained output we select the largest cluster and insert the message in the same. This is shown in the figure below:

Figure 10: Snapshot of message insertion.

Figure 11: The stego-image.

The clusters formed on the stego-image are shown in the figure below:

Figure 12: Output of K Means Clustering at receiver side along with message extracted.

At the receiver end the stego-image is received, clustering is performed, and the same procedure takes place. In Figure 12, we can see that the received message is 'be', which is the message that was inserted at the sender side.

Conclusion

In this paper we implemented the Enhanced K Strange Points Clustering Algorithm to achieve steganography.
We then compared the results obtained with steganography using the K Means clustering algorithm and found that our methodology of implementing steganography works better with the Enhanced K Strange Points Clustering Algorithm.

References

[1] Jasleen Kour, Deepankar Verma, “Steganography Techniques-A Review Paper”, International Journal of Emerging Research in Management and Technology, ISSN: 2278-9359, Vol. 3, Issue 5.
[2] Chamkor Singh, Gaurav Deep, “Cluster Based Image Steganography Using Pattern Matching”, International Journal of Emerging Trends and Technology in Computer Science (IJETTCS), ISSN: 2278-9359, 2013.
[3] Johnson Terence, Dr. Santosh Kumar Singh, “Enhanced K-Strange Points Clustering Algorithm”, International Conference on Emerging Information Technology and Engineering Solutions (EITES 2015), 78-1-4799-1838-6/15, IEEE, DOI: 10.1109/EITES.2015.14, pp. 32-37.
[4] Dr. M.P.S. Bhatia, Deepika Khurana, “Experimental Study Of Data Clustering Using K-Means And Modified Algorithms”, International Journal of Data Mining and Knowledge Management Process (IJDKP), DOI: 10.5121/ijdkp.2013.3302, Vol. 3, No. 3.
[5] Terence Johnson, Santosh Kumar Singh and Valerie Menezes, “The Regress Averaging Clustering Algorithm”, International Journal of Engineering & Technology IJET-IJENS, Vol. 15, No. 05, pp. 46-51, 155105-8484-IJET-IJENS, © October 2015 IJENS.
[6] Terence Johnson and Santosh Kumar Singh, “Improved Collinear Clustering in Lower Dimensions”, Proceedings of “Second International Conference on Emerging Research in Computing, Information, Communication and Applications”, ERCICA 2014, ISBN 9789351072638, Vol. III, pp. 343-348, © Elsevier Publications 2014.
[7] Terence Johnson, “Bisecting collinear clustering algorithm”, International Journal of Computer Science Engineering and Information Technology Research, © TJPRC Pvt. Ltd, ISSN: 2249-6831, Vol. 3, Issue 5, Dec. 2013, pp. 43-46.
[8] Terence Johnson and Jervin Zen Lobo, “Collinear clustering algorithm in lower dimensions”, IOSR Journal of Computer Engineering, ISSN: 2278-0661, ISBN: 2278-8727, Vol. 6, Issue 5, pp. 08-11, Nov-Dec 2012.
[9] A. Alfakih, A. Khandani, and H. Wolkowicz, “Solving Euclidean distance matrix completion problems”, Comput. Optim. Appl., 12 (1999), pp. 13-30.
[10] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Elsevier.