International Journal of Engineering Trends and Technology (IJETT) – Volume 27 Number 1 – September 2015

An Empirical Model of SEO with Bitwise and Evolutionary Technique

Grandhi Tataji (1), P. Rajasekhar (2)
(1) Final M.Tech Student, (2) Assistant Professor
Dept. of CSE, Avanthi Institute of Engineering & Technology, Makavarapallem, A.P.

Abstract: Time relevance and user interest are important factors when answering search queries. Over years of research, various models have been proposed for search engine optimization, yet traditional approaches still carry significant drawbacks. In this paper we introduce an efficient user-search-goal mechanism with low time complexity; a bitwise matrix combined with an evolutionary algorithm produces optimal results without the additional overhead of candidate set generation.

I. INTRODUCTION

Search engine optimization (SEO) is the process of improving the visibility of a website or a web page in a search engine's unpaid results, often referred to as organic results. The earlier (or higher) a site is ranked on the results page, and the more frequently it appears in the results list, the more visitors it will receive from the search engine's users. SEO may target different kinds of search, including image search, local search, video search, academic search, news search, and industry-specific vertical search engines. [1]

As an Internet marketing strategy, SEO considers how search engines work, what people search for, the actual search terms or keywords typed into search engines, and which search engines are preferred by the targeted audience. Optimizing a website may involve editing its content, HTML, and associated coding, both to increase its relevance to specific keywords and to remove barriers to the indexing activities of search engines. Promoting a site to increase the number of backlinks, or inbound links, is another SEO tactic. Search engines use complex mathematical algorithms to estimate which sites a user is looking for. In this illustration, if each bubble represents a website, programs sometimes called spiders examine which sites link to which other sites, with arrows representing these links. Websites receiving more inbound links, or stronger links, are presumed to be more important and more likely to be what the user is searching for. In this example, since site B receives numerous inbound links, it ranks more highly in a web search. Moreover, the links "carry through", so that site C, even though it has only one inbound link, has an inbound link from a highly popular site (B), while site E does not. [2]

SEO techniques can be classified into two broad categories: techniques that search engines recommend as part of good design, and techniques of which search engines do not approve. The search engines attempt to minimize the effect of the latter, among them spamdexing. Industry analysts have classified these methods, and the practitioners who employ them, as either white hat SEO or black hat SEO [1]. White hats tend to produce results that last a long time, whereas black hats anticipate that their sites may eventually be banned, either temporarily or permanently, once the search engines discover what they are doing [2][4]. An SEO technique is considered white hat if it conforms to the search engines' guidelines and involves no deception.
As the search engine guidelines [4][13] are not written as a series of rules or commandments, this is an important distinction to note. White hat SEO is not only about following guidelines, but also about ensuring that the content a search engine indexes and subsequently ranks is the same content a user will see. White hat advice is generally summed up as creating content for users, not for search engines, and then making that content easily accessible to the spiders, rather than attempting to divert the algorithm from its intended purpose. White hat SEO is in many ways similar to web development that promotes accessibility [5][14], although the two are not identical.

Black hat SEO attempts to improve rankings in ways that are disapproved of by the search engines, or that involve deception. One black hat technique uses text that is hidden, either as text colored like the background, in an invisible div, or positioned off screen. Another method serves a different page depending on whether the page is being requested by a human visitor or a search engine, a technique known as cloaking. A further category sometimes used is grey hat SEO. This lies between the black hat and white hat approaches: the techniques employed avoid the site being penalized, but they do not aim at producing the best content for users, being focused entirely on improving search engine rankings. [6]

II. RELATED WORK

In earlier approaches, feedback sessions are collected and clustered, based on the similarity between the visited URLs and the keywords that lead to the same website, using the k-means clustering algorithm. The algorithm randomly generates k centroids from the set of pseudo-documents, computes the maximum similarity of every document against the centroids, and places each document in the corresponding cluster holder; these approaches fail, however, when the density of the data objects varies [7] (a brief code sketch of this grouping step is given below). In this paper we present a pattern-based approach to user search results built on a bitwise matrix. The bitwise matrix is an efficient pattern-mining technique for generating frequent patterns: the matrix, whose cells contain only '0' and '1', avoids the two main drawbacks of traditional algorithms such as Apriori, namely candidate set generation and multiple database scans [4]. Although the FP-tree resolves these shortcomings of the earlier techniques, its tree structure becomes complex when the data grows large. [8]

In this paper we use a dataset to test the proposed work. Each record contains a Session id (the duration from login to logout, or simply a continuous visiting duration), an input query (the keyword sent by the end user), the visited URLs (all URLs visited within the specific session, where the index gives the order in which each website or URL was visited), and a Duration (the amount of time spent on a particular website). We propose an evolutionary algorithm to identify the optimal patterns generated after genetic operations such as crossover and mutation. The proposed approach can be further improved by generating the frequent patterns with a lower-complexity pattern mining algorithm such as the bitwise matrix algorithm.
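For illustration, the feedback-session grouping used by the earlier cluster-based approaches can be sketched as follows. This is a minimal sketch only: the cosine similarity measure, the function names, and the assumption that each session has already been vectorized from its visited URLs and keywords are illustrative choices, not details given in the paper.

import random
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def kmeans_sessions(session_vectors, k, iterations=10):
    """Group session vectors into k clusters by maximum similarity to a centroid."""
    centroids = random.sample(session_vectors, k)            # k random initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for vec in session_vectors:                           # assign to the most similar centroid
            best = max(range(k), key=lambda i: cosine_similarity(vec, centroids[i]))
            clusters[best].append(vec)
        for i, members in enumerate(clusters):                # recompute each centroid as the mean
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return clusters

As noted above, this style of grouping degrades when the densities of the data objects differ, which motivates the pattern-based approach proposed in Section III.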
The optimality of the proposed work can be improved further by introducing advanced crossover and mutation operations over the frequent patterns. Frequent pattern mining is a fairly broad area of research, and it relates to a wide variety of topics, at least from an application-specific perspective. Broadly speaking, research in the area falls into one of four categories:

• Technique-focused: This area concerns the design of more efficient algorithms for frequent pattern mining. A wide variety of algorithms have been proposed in this context, using different enumeration-tree exploration strategies and different data representation methods. In addition, numerous variations, such as the mining of compressed patterns, are of great interest to researchers in data mining. [9]

• Scalability issues: Scalability is very important in frequent pattern mining. When the data arrives as a stream, multi-pass methods can no longer be used. When the data is distributed or very large, parallel or big-data methods must be used. These scenarios require different kinds of algorithms.

• Advanced data types: Numerous variations of frequent pattern mining have been proposed for advanced data types, and these variations have been used in a wide variety of tasks. Likewise, different data domains, such as graph data, tree-structured data, and streaming data, often require specialized algorithms for frequent pattern mining. Questions about the interestingness of the patterns are also quite significant in this setting [6].

• Applications: Frequent pattern mining has numerous applications to other major data mining problems, web applications, software bug analysis, and chemical and biological applications. A great deal of research has been devoted to applications, because these are particularly important in the context of frequent pattern mining.

III. PROPOSED WORK

We propose an empirical model of user search goals to retrieve user-interesting patterns from feedback sessions. Even though traditional cluster-based approaches work efficiently compared with purely session-based approaches, they are not optimal in terms of ranking and of the patterns produced; such cluster-based approaches are efficient only when different queries share a common URL. In this paper we therefore propose a pattern-based approach that generates optimal results through a bitwise matrix and a genetic algorithm. Our experimental analysis uses a synthetic dataset whose records contain a Session id (the duration from login to logout, or simply a continuous visiting duration), an input query (the keyword forwarded by the end user), the visited URLs (all URLs visited within the specific session, where the index gives the order in which each website or URL was visited), and a Duration (the amount of time spent on a specific website). A sketch of one such record is shown below.
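The following is a minimal sketch of one record in the synthetic dataset just described. The field names follow the description in the text; the dataclass layout and the concrete example values are illustrative assumptions, not part of the paper.

from dataclasses import dataclass
from typing import List

@dataclass
class FeedbackSession:
    session_id: int            # session identifier (the paper describes it via the login-to-logout or continuous visiting duration)
    input_query: str           # keyword forwarded by the end user
    visited_urls: List[str]    # URLs visited in the session; list index = visiting order
    durations: List[int]       # time spent on each visited website (assumed here to be seconds)

# Hypothetical example record
example = FeedbackSession(
    session_id=101,
    input_query="laptop reviews",
    visited_urls=["http://site-a.example", "http://site-b.example"],
    durations=[120, 45],
)

Session-wise results that belong to the same input query are later combined into input patterns for the bitwise matrix, as described next.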
Patterns are constructed by combining the session-wise results with respect to the input query, where the results must be greater than or equal to the visited duration of time; these input patterns are then forwarded to the bitwise matrix for generation of frequent patterns.

Bitwise matrix: The bitwise matrix is a technique for the generation of frequent patterns. It reduces the traditional complexity issues, namely candidate set generation and multiple database scans, by constructing a simple matrix between transactions and items (data objects). Frequent items are derived from flag values: if an item exists in a specific transaction, the corresponding cell is set to 1, otherwise to 0.

Algorithm for the bitwise matrix:
1: While patterns are available
2: Load the individual pattern Pi from the transaction table
3: Generate a matrix with l rows and m columns, where l indexes the items of a transaction and m is the id of the transaction
4: If item l is present in transaction m, then set intersection(l, m) = 1, else set it to 0
5: Repeat steps 2 to 4 until all transactions have been processed

Frequent patterns can now be extracted from the matrix. To extract frequent 1-itemsets, count the number of ones in each item's column; if the count meets the minimum threshold value, treat the item as frequent, otherwise ignore it. Continue the same process for 2-itemsets: check whether two items both have a '1' for the same transaction, increment the count when they do, and continue until all transactions have been verified. If the total count is greater than the threshold value, treat the pair as a frequent itemset.

1: Load item_set {I1, I2, …, In} and initialize count := 0 and final_counter := 0
2: for i := 0; i < n; i++
       for j := 0; j < trans_size(); j++
           if intersection(i, j) == 1 then count := count + 1
       next
       if count == Ii.size() then add the item to the list
   next
3: Set the minimum support count value t
4: for k := 0; k < item_list_size; k++
       if item_list[k].count >= t then add it to the list of frequent items
   next
5: return the frequent pattern list

The bitwise matrix is generated from the presence of each item in each transaction. The first transaction is read from the database, for example "a, b, c, d"; in the corresponding positions of the matrix, the item values are set to '1' for that transaction and '0' elsewhere. The second transaction, "a, c, e", then has its item positions set to '1' in the second row, and the process continues until all transactions are placed in the matrix representation. A runnable sketch of this construction follows.
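This is a minimal illustrative sketch of the bitwise-matrix construction and the frequent 1- and 2-itemset extraction described above, not the authors' implementation; the function names and the matrix-of-lists representation are assumptions.

from itertools import combinations

def build_bitwise_matrix(transactions):
    """Return (items, matrix) where matrix[t][i] = 1 if item i occurs in transaction t, else 0."""
    items = sorted({item for tx in transactions for item in tx})
    index = {item: i for i, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in transactions]
    for t, tx in enumerate(transactions):
        for item in tx:
            matrix[t][index[item]] = 1
    return items, matrix

def frequent_itemsets(items, matrix, min_support):
    """Frequent 1-itemsets from column counts, then 2-itemsets from joint column counts."""
    frequent = []
    counts = [sum(row[i] for row in matrix) for i in range(len(items))]
    single = [i for i, c in enumerate(counts) if c >= min_support]     # frequent single items
    frequent.extend(((items[i],), counts[i]) for i in single)
    for i, j in combinations(single, 2):                               # only pair up frequent singles
        joint = sum(1 for row in matrix if row[i] and row[j])
        if joint >= min_support:
            frequent.append(((items[i], items[j]), joint))
    return frequent

# The two transactions used as the example in the text
items, matrix = build_bitwise_matrix([["a", "b", "c", "d"], ["a", "c", "e"]])
print(frequent_itemsets(items, matrix, min_support=2))

With the transactions "a, b, c, d" and "a, c, e" and a minimum support of 2, the sketch reports {a}, {c}, and {a, c} as frequent, since each occurs in both rows of the matrix.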
Optimal Pattern Generation

After an initial population is generated at random, the algorithm evolves it through three operators: selection, which equates to survival of the fittest; crossover, which represents mating between individuals; and mutation, which introduces random modifications.

Crossover Operator
- The prime factor distinguishing a GA from other optimization techniques.
- Two individuals are chosen from the population using the selection operator.
- A crossover site along the bit strings is chosen at random.
- The values of the two strings are exchanged up to this point. For example, if S1 = 000000 and S2 = 111111 and the crossover point is 2, then S1' = 110000 and S2' = 001111.
- The two new offspring created from this mating are put into the next generation of the population.
- By recombining portions of good individuals, this process is likely to create even better individuals.
- Here we use single-point crossover: the part of a chromosome up to a randomly selected position is exchanged with the corresponding part of the other chromosome.

Mutation Operator
- With some low probability, a portion of the new individuals will have some of their bits flipped.
- Its purpose is to maintain diversity within the population and inhibit premature convergence.
- Mutation alone induces a random walk through the search space.
- Mutation and selection (without crossover) create a parallel, noise-tolerant, hill-climbing algorithm.
- Here we use a flipping mechanism: the randomly selected bit is flipped to 0 if it is 1, or to 1 if it is 0.

A detailed example is shown below (bit positions are counted from 0). Consider a chromosome over six unique items a, b, c, d, e, f. The initial chromosome is set to 000000. If the frequent itemsets are acdef and bdef, the chromosomes are represented as 101111 and 010111. We then apply crossover on these two chromosomes. Suppose the random position is 4: the second chromosome's bits 01011 are exchanged with the first chromosome's bits 10111, and the resulting chromosome is 101111, which is the same as the first chromosome, so it is ignored. Next we apply mutation as explained above. Suppose the random position is 5: that bit is flipped to 0 if it is 1, or to 1 if it is 0, and the resulting chromosomes are 101110 and 010110.

From these chromosomes we must determine which one is an optimized chromosome, so we apply the positive and negative rule conditions: true positive, true negative, false positive, and false negative. Comparing the first chromosome with the second, the true positive value is 2 (items present in both chromosomes) and the true negative value is 2 (items present in the first chromosome only); similarly, the false positive value is 1 and the false negative value is 1. For the second chromosome, the true positive value is 2, the true negative value 1, the false positive value 2, and the false negative value 1. Completeness is computed as TP/(TP + FN), the confidence factor as TP/(TP + FP), and the fitness as their product.

Completeness of chromosome 1: 2/(2+1) = 2/3 = 0.66
Completeness of chromosome 2: 2/(2+1) = 2/3 = 0.66
Confidence factor of chromosome 1: 2/(2+1) = 2/3 = 0.66
Confidence factor of chromosome 2: 2/(2+2) = 2/4 = 0.5
Fitness of chromosome 1: 0.66 × 0.66 = 0.4356
Fitness of chromosome 2: 0.66 × 0.5 = 0.33

The threshold value of the fitness function is 0.1 and above, so both are optimized patterns.
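The worked example above can be reproduced with a short script. This is a minimal sketch under the example's conventions: the itemset-to-bitstring encoding, the 0-indexed crossover and mutation positions, and the completeness/confidence formulas follow the example, while the function names and structure are illustrative assumptions.

ITEMS = ["a", "b", "c", "d", "e", "f"]          # fixed item ordering for the bit strings

def encode(itemset):
    """Encode an itemset (e.g. "acdef") as a 0/1 chromosome over ITEMS."""
    return [1 if item in itemset else 0 for item in ITEMS]

def crossover(c1, c2, point):
    """Single-point crossover: exchange bits up to and including the 0-indexed point."""
    return c2[:point + 1] + c1[point + 1:], c1[:point + 1] + c2[point + 1:]

def mutate(chrom, point):
    """Flip the bit at the given 0-indexed position."""
    out = chrom[:]
    out[point] = 1 - out[point]
    return out

def counts(chrom, other):
    """TP/FP/FN counts following the example's convention."""
    tp = sum(1 for a, b in zip(chrom, other) if a and b)          # item present in both chromosomes
    fp = sum(1 for a, b in zip(chrom, other) if not a and b)      # item present only in the other chromosome
    fn = sum(1 for a, b in zip(chrom, other) if not a and not b)  # item present in neither chromosome
    return tp, fp, fn

def fitness(chrom, other):
    """fitness = completeness * confidence, i.e. TP/(TP+FN) * TP/(TP+FP)."""
    tp, fp, fn = counts(chrom, other)
    completeness = tp / (tp + fn) if tp + fn else 0.0
    confidence = tp / (tp + fp) if tp + fp else 0.0
    return completeness * confidence

# Reproduce the example: itemsets acdef and bdef, crossover at position 4, mutation at position 5
c1, c2 = encode("acdef"), encode("bdef")        # 101111 and 010111
o1, o2 = crossover(c1, c2, point=4)             # offspring equal the parents here, so they are ignored
m1, m2 = mutate(c1, 5), mutate(c2, 5)           # 101110 and 010110
print(round(fitness(m1, m2), 2), round(fitness(m2, m1), 2))   # -> 0.44 0.33

Running the script prints approximately 0.44 and 0.33, matching the fitness values computed above (the text rounds 2/3 to 0.66 before multiplying).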
IV. CONCLUSION

We conclude our present research work with an improved data acquisition scheme for feedback session logs, each consisting of a session id, URL, input query or keyword, the sequential order of visited sites, and the duration of time spent on each website. Sets of patterns are formed by grouping sessions with the same details, and these are forwarded to frequent itemset generation, followed by a genetic (evolutionary) approach that yields the optimal results.

REFERENCES
[1] R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules between Sets of Items in Large Databases," Proc. ACM SIGMOD, pp. 207-216, 1993.
[2] U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press, 1996.
[3] A. Silberschatz and A. Tuzhilin, "What Makes Patterns Interesting in Knowledge Discovery Systems," IEEE Trans. Knowledge and Data Eng., vol. 8, no. 6, pp. 970-974, Dec. 1996.
[4] J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate Generation."
[5] M. Mitchell, An Introduction to Genetic Algorithms.
[6] H. Hwang, H.W. Lauw, L. Getoor, and A. Ntoulas, "Organizing User Search Histories."
[7] C.-K. Huang, L.-F. Chien, and Y.-J. Oyang, "Relevant Term Suggestion in Interactive Web Search Based on Contextual Information in Query Session Logs," J. Am. Soc. for Information Science and Technology, vol. 54, no. 7, pp. 638-649, 2003.
[8] W. Dakka, L. Gravano, and P.G. Ipeirotis, "Answering General Time-Sensitive Queries."
[9] U. Lee, Z. Liu, and J. Cho, "Automatic Identification of User Goals in Web Search," Proc. 14th Int'l Conf. World Wide Web (WWW '05), pp. 391-400, 2005.
[9] H. Toivonen, M. Klemettinen, P. Ronkainen, K. Hatonen, and H. Mannila, "Pruning and Grouping of Discovered Association Rules," Proc. ECML-95 Workshop Statistics, Machine Learning, and Knowledge Discovery in Databases, pp. 47-52, 1995.
[10] B. Baesens, S. Viaene, and J. Vanthienen, "Post-Processing of Association Rules," Proc. Workshop Post-Processing in Machine Learning and Data Mining: Interpretation, Visualization, Integration, and Related Topics (with Sixth ACM SIGKDD), pp. 20-23, 2000.
[11] J. Blanchard, F. Guillet, and H. Briand, "A User-Driven and Quality-Oriented Visualization for Mining Association Rules," Proc. Third IEEE Int'l Conf. Data Mining, pp. 493-496, 2003.
[12] B. Liu, W. Hsu, K. Wang, and S. Chen, "Visually Aided Exploration of Interesting Association Rules," Proc. Pacific-Asia Conf. Knowledge Discovery and Data Mining (PAKDD), pp. 380-389, 1999.
[13] G. Birkhoff, Lattice Theory, vol. 25, Am. Math. Soc., 1967.
[14] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, "Discovering Frequent Closed Itemsets for Association Rules," Proc. Seventh Int'l Conf. Database Theory (ICDT '99), pp. 398-416, 1999.