International Journal of Application or Innovation in Engineering & Management (IJAIEM)
Web Site: www.ijaiem.org Email: editor@ijaiem.org
Volume 3, Issue 5, May 2014 ISSN 2319 - 4847

Scope and Limitations of Indian University Clusters

Mr.Srinatha Karur1, Prof.M.V.Ramana Murhty2
1 IT Faculty and DBA, Government Engineering College, Ibra, Sultanate of Oman.
2 HoD, Department of Computer Science & Mathematics, Osmania University, Hyderabad, India.

Abstract
This document examines the scope and limitations of Indian University Clustering. It depends entirely on previous publications by the same authors in different international journals. The authors tested and examined the technical and policy limitations, and the behaviour of the different models available in a University Clustering System without a network system. The document points out the technical requirements for constructing the different models and for finding the deviations between the proposed and the implemented University Clustering System. Finally, the authors propose rules for the University Clustering System with the help of Hybrid Association Rules, which give a flexible solution to different technical and architectural policies.

Keywords: Hybrid Association rules, Linear and Poly Curves, MS-SQL server, Clusters nature

1. INTRODUCTION
This document mainly addresses the scope and limitations of Indian University Clustering with respect to the technical, mathematical, and policy-based procedures that are available at different local levels. Even though these three parameters are independent of each other, the authors attempt to find an equilibrium point between them for an optimum solution. The technical and policy details are available in the authors’ earlier publications in the form of mathematical and data mining models.
The authors observed that different hybrid types of university clusters exist in real-world case studies and applications. Some clusters are hybrid, depending on the policies and requirements of different countries. Local-level, national-level, and continent-level clusters are available in different countries as per the educational needs of the countries on both sides [1]. University clusters around the world are formed on different parameters, such as subject, policy, technology, and common goals. The authors observed that different countries have different needs and that there is a huge gap between developed and developing countries. The authors considered samples from developed countries such as the USA, UK, Australia, New Zealand, and Germany. The Australian clusters have “Research at University level” as their precise objective [1].

The nature of clusters for local and central systems is entirely different, and India has both central and local governments. It is therefore necessary to read the clusters at both the local and the central level. In their publications the authors concentrated mainly on local policy clusters because of the policy structure of the Government of India. Geographically, India consists of nearly 30 local units, and local clusters are estimated as per the needs of each local unit only [2]. Because conditions in a country like India are heterogeneous, a study of all local units is practically very difficult. Given these practical difficulties, the authors took “Andhra Pradesh” as one local node, and estimated and implemented the same procedure for all remaining n-1 local nodes [3]. After studying local cluster formation for the Andhra Pradesh unit, the authors applied the same local policy at full scale, in terms of data modelling and data testing, to all universities in India. Data testing was carried out with statistical and mathematical models.
For outlier estimation the authors mainly used different mathematical and data mining tools [4], pp. 2987-2980. Finally, the authors estimated e-Learning nodes through a survey of the infrastructure and policies of different universities. They observed that only 20% - 25% of universities are capable of forming e-Learning nodes and their clusters. For the accreditation of universities the authors followed the NAAC official web site. The authors observed that Deemed or Private Universities have considerable infrastructure and policies to form e-Learning nodes. These private universities have tie-ups with other organizations such as IIT-M, IIT-B, and IIM-B, or links with government universities [5], pp. 151. Lastly, the authors determined the scope using Hybrid Association Rules, with MS-SQL Server and data mining tools used to observe the outputs. The association rules are used to estimate the scope of clustering of Indian universities. The authors’ five published papers describe the nature of Indian University Clustering, and they are arranged sequentially for easy understanding. The authors used the data mining tool R (Rattle-GUI) to describe the University Database system with different frequencies, as shown in figure-1. The one-to-many relationship between universities and their attributes is shown as charts in figure-2.
Figure 1: Shows University Database with frequencies (Rattle-GUI)
Figure 2: Shows different attributes of University (One-to-Many) with Rattle-GUI

The above attributes are defined for the University Database system, which is described mainly as the proposed model for “Indian University Clusters for estimation of e-Learning nodes”. The database consists of nearly 55 columns, which support the e-Learning node attribute for supervised, unsupervised, or hybrid learning.

2. LITERATURE REVIEW & GROUND LEVEL WORK
An easy way to understand the nature of the “Indian University Clustering system” is to arrange the material by scale level. The authors pointed out the global efforts, local efforts, and different policies of government bodies on university clustering [1], pp. 127. The developed countries have separate government policies and methods for university clustering, and these differing policies lead to different types of clusters [1], pp. 134. The authors applied the Association Rules concept to Indian universities at 30 locations taken as a random sample, with “Chennai” as the root element for linear projections [1], pp. 137-138. The authors defined specifications separately at the cluster level and the university level, which helps to define the scope of “Indian Universities Clustering” [1], pp. 135. The authors gave algorithms of different types for tracing information with respect to data structures [1], pp. 134-135. In their conclusion the authors noted that the Japan and France clusters are more suitable than the other developed-country clusters [1], pp. 143. India has more than 600 universities, spread across different geographical locations.
The authors used the 12th FYP of the UGC as the standard for estimating local clusters for the whole of India. It is very difficult to consider a sample above 50% for estimating “Distance matrix” values, so the authors randomly selected the state capital city as a single entity. At the same time, we observed that some universities are also eligible even though they are at “District level”. In some cases multiple values may satisfy the given conditions. Some geographical virtual clusters are also formed, but we did not consider the logical clusters because they are outside the scope of this work. The details of logical clusters are available in [6]; their topology is given in Tapscott, Ticoll, and Lowy 2000. The proposed system is described in the form of different zones with allotted nodes. We construct the proposed system for the South zone and implement the same procedure for all remaining zones [2], pp. 25. All sampling details are available in table form [2], pp. 33. Kurtosis, skewness, and hierarchical clustering are computed with R software (Rattle GUI) [2], pp. 36. We then consider the sample from the South zone, where the unit is “Andhra Pradesh”; this unit itself is made up of three different fragments, as shown in the table in [3], pp. 6. We estimated all possible clusters as per the nature of the Andhra Pradesh nodes as local nodes. We used regression analysis and curve fitting methods to obtain the relationship between the available and defined variables. Curve Expert and Excel sheets are used for mathematical modelling, and data mining tools are used for cluster formation.
The authors found that the R and Rapid miner tools are not suitable for the available data, whereas Tanagra, Weka, and Orange were applied successfully, as shown in the figures [3], pp. 9-12. The depth of the various university clusters and the cut-off values are given in table-7 [3], pp. 15. The data is prepared with respect to the Andhra Pradesh clusters, and data analysis is carried out on all local clusters, which are available as sub-clusters. After the data is finalized with the local clusters, the next step is to model the entire data set on the basis of the local clusters. After modelling, the data is tested once again with data mining tools for outliers, and the results are recorded in the form of tables or figures. The aim is to estimate which type of clusters is useful for Indian universities using modelling and data mining tools. The authors detected outliers in the university clusters with different statistical, mathematical, and data mining tools, and finalized the results as second- and third-degree polynomial curves, shown in table-9 [4], pp. 2994-2995. All the different types of curves are tabulated in that table. The authors repeated the experiment with 50% of the data and with the full data and compared the results. They found that 50% of the data is enough for testing, since the results are almost the same for both sizes of data (i.e., 50% and 100%). The authors then applied the same local cluster policy to the entire country to estimate e-Learning nodes. For this purpose the authors took NAAC as the model for defining the standards of universities. Every year, universities in India receive grades from the National Assessment and Accreditation Council (NAAC), New Delhi, based on the overall performance of the individual universities. Around 612 universities are present in the country, of which 172 are accredited by NAAC.
Of the 172 accredited universities, 67 have been placed in Grade A, 99 in Grade B, and only 6 in Grade C, based on the scores awarded during the accreditation process. More details are available at http://www.careerindia.com/news/2012/12/05/172-universities-out-of-612-are-accredited-by-naac-003445.html. Only 28% of universities are accredited by NAAC, and the authors tested the data for e-Learning nodes from this 28% only. The authors used a Linear Programming Problem and the Gini Index to implement the e-Learning nodes at university level [5], pp. All the results for e-Learning nodes in terms of zones and clusters are available in table-10 and table-11 respectively [5], pp. 157. The proposed database is described in the authors’ publication [1], pp. 135, as part of the specifications. The milestones of “Indian University Clusters” consist of the following stages, which indicate the scope and limitations of the system. Table-1 lists all the implementations published by the authors.

Table 1: Deals about the all implementation phases of “Indian University Clusters”
SNo | Aim | Tools/Resources used
1 | Survey and Analysis for worldwide clusters | Tanagra, Weka, Ggobi, Excel, and Gephi
2 | Local clusters formation for Indian Universities (Zone wise) | Weka, Orange, R and Excel
3 | Data preparation and Analysis for Andhra Pradesh local clusters formation as local clusters | Tanagra, Weka, Orange and Excel
4 | Modelling and Data testing for Indian Universities clusters | Easy fit curve, Stat fit, Online tools for CI, Tanagra, Weka, Orange, Rapid miner, R(Rattle-GUI), and Excel
5 | E-Learning nodes estimation for NAAC Indian Universities | NAAC official website, Tanagra, LPP tools, R(Rattle-GUI), Gini online tool, Orange, and Excel

The aims defined in the above table represent the scope of the “Indian University Clusters” with respect to local and geographical conditions.
It is necessary to emphasize logical geographical clusters for “University Clusters” with respect to India’s geographical conditions; the various zones are described in “Local clusters formation for Indian Universities”, published by the authors [2], pp. 25-26. The authors considered the policy of “Expansion of Higher Education in India” and the maximum enrolment for the next FYP, from the UGC official website [7], pp. 73-80. In this policy document the UGC stated the role of “University Clusters” in its 12th FYP report [7], pp. 79. The UGC plans to introduce cluster colleges as an alternative to deemed universities, as no new deemed universities have been created in the last seven years. The court case challenging the 2010 University Grants Commission regulations has effectively blocked the route to forming new Deemed Universities [8]. Arrangements in which industry and educational institutes are tied up or brought into a common frame as a cluster policy are defined as “Industry clusters for Educational Research” [9], pp. 8-10. Given the scope of the different policies and the geographic nature of the country, it is assumed that local clusters are best suited to “Indian University Clusters”. The case studies of “Educational Clusters” in developed countries suggest that local clusters give a better solution than centralized clusters. To understand the depth of the “Scope of Indian University Clusters”, the authors applied the “Association and Rapid Association Rules policy” [10], pp. 01. The authors observed that Dynamic Association Mining is suitable for technical entities only. The policy-based rules are arranged with static or general Association Rules mining to estimate the “Scope of University Clusters”.
The authors also observed that the Naive Bayes theorem is very helpful for estimating different entities in the database modelling. The authors used Tanagra and MS-SQL Server 2008 R2 for the Association Rules implementation, followed by report generation. Other tools such as Tanagra, Weka, Orange, Rapid miner, and R (Rattle-GUI) can also be used. We focus on MS-SQL Server 2008 R2, in which Import and Export, Business Intelligence, and Management Studio are available. Instead of the IBM SPSS tool, general data mining tools or any statistics tool can be used. For effective implementation of node generation, Microsoft SharePoint can be used. In the near future the authors plan to use Oracle server and MS-SQL Server together for a more comfortable and flexible setup, and hence to form hybrid technologies (MS-SQL, Oracle, IBM-SPSS, data mining tools).

3. PROPOSED AND CONCERNED WORK
The concerned work consists of estimating the scope and limitations of the “Indian University Clusters” from both a policy and a technical orientation. Real-world case studies show different types of policies and technical approaches; in particular, policies are defined and implemented as per the needs of each individual country. Broadly, we can focus on developed countries such as the USA, UK, Australia, Germany, France, and New Zealand. The cluster policies of these countries give us different real-world problems or scenarios. Depending on context, either supervised or unsupervised learning can be used to estimate the scope and limitations of “Indian University Clusters”. The authors used the tools listed in Table-1 for the technical implementation of the given objective.

3.1 Use of Supervised, Unsupervised and Hybrid methods
The scope and limitations of “Indian University Clusters” involve different strategies and mathematical models.
All these models generally depend on Linear Regression, polynomial curves, the Naive Bayes theorem, decision trees, and cluster methods. However, these methods alone are not guaranteed to estimate the scope exactly. The authors therefore also used Association Rules, generating the maximum number of rules to gain a deeper understanding of the scope of the proposed University Database system and of the Australian University Research Clusters [1], pp. 135, 131. Outliers are sometimes treated as meaning the same thing as limitations, or as the minimum and maximum values of an entity. Technically the two are different, and a comparative study of outliers and limitations of the proposed system is out of scope for this paper. In data mining, outliers have a more technical meaning than the limitations of a system. The authors used different data mining tools for estimating outliers [1][4], pp. 132, 2993.

Algorithm for Scope and Limitations for Indian University Clusters
The authors used the Association Rules generation method and then found the item sets at different intervals. They then used curve fitting methods or the Logistic Regression method to find the relation between two variables at different intervals.

Algorithm:
1. Load the database into any data mining tool or MS-SQL Server 2008R2. Trim the values if necessary.
2. Find the item sets, frequent item sets and minimal item sets.
3. Repeat step 2 for different intervals of time. Consider only minimum support values and values >0. Consider only the rules with values >0 as per minimum support and probability. Negative values are sometimes produced for the given conditions; in that case, trim the values of the given conditions for the total number of item sets and rules generated.
4. Repeat step 3 the required number of times and draw the graph between importance and rules generated.
5. Apply the Hierarchical Clustering method, or any cluster method, to the rules available at each importance value, find the different heights, and fix the nearest value with respect to the rules generated.
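Steps 2 to 4 of the algorithm can be illustrated with a minimal sketch in plain Python. It is not the MS-SQL Server implementation used by the authors: the Apriori-style search, the function names `frequent_itemsets` and `association_rules`, and the four sample rows are all assumptions made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    found = {}
    size = 1
    while candidates and size <= max_size:
        level = {}
        for candidate in candidates:
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                level[candidate] = support
        found.update(level)
        size += 1
        # Join step: merge surviving itemsets into candidates one item larger.
        candidates = list(
            {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        )
    return found


def association_rules(itemsets, min_confidence):
    """Return (lhs, rhs, support, confidence) tuples meeting min_confidence."""
    rules = []
    for itemset, support in itemsets.items():
        for k in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), k)):
                confidence = support / itemsets[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical rows: each set holds the attribute values of one university.
rows = [
    {"NAAC_A", "eLearning", "SHQ"},
    {"NAAC_A", "eLearning"},
    {"NAAC_B", "SHQ"},
    {"NAAC_A", "eLearning", "DHQ"},
]
itemsets = frequent_itemsets(rows, min_support=0.5)
# Steps 3 and 4: repeat with different thresholds and record the rule count.
rule_counts = {c: len(association_rules(itemsets, c)) for c in (0.5, 0.9)}
```

The `max_size=3` default mirrors the maximum item set size of three reported later in the paper; the thresholds passed in the last line are placeholders, and the resulting counts per threshold are what step 4 would plot.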
Let ‘x’ be the obtained value; then x min and x max are the two values available from the set of experimental results. The values nearest to these give an idea of the scope of the given problem. The above algorithm mainly describes a method for constructing the hybrid system for estimating the given objective. The authors mainly rely on association mining, Linear Regression, and cluster formation, on the basis of the linear relationship between importance, at a given probability, and the number of rules generated as output. The result depends entirely on the number of input values and the minimum frequency value of the items. The authors neglected negative importance values.

4. IMPLEMENTATION DETAILS
The authors used different tools and methods to implement the algorithm given in the previous section: MS-SQL Server 2008 R2, MS-Excel, curve fitting tools, and IBM Statistics tools. Tanagra and Orange were used for closer observation of the results. The Orange data mining tool is used to estimate the Confidence Intervals of different universities, as shown in the figure.

Figure 3: Show CI of different Universities as local clusters (Target value is South Zone)

From the above figure, the Confidence Interval of the universities depends on the zone, and the CIs of all zones are distinct. Because of space constraints, the authors provide only the Confidence Interval (CI) of the South zone universities as part of “Indian University Clusters”. Two locations are available: District Head Quarters (DHQ) and State Head Quarters (SHQ).
The authors used the Orange data mining tool for this purpose; the modelling for rules generation is as follows.

Figure 4: Shows modelling for rules mining with Orange data mining tool

4.1 Understanding of MS-SQL 2008 R2 & Other tools
The scope of the given problem mainly requires mathematical models and suitable data mining tools, since not all tools suit the same data. Data mining tools can generally also be used for statistical analysis. The authors used Tanagra, Weka, Orange, R (Rattle-GUI), and Rapid miner in their previously published papers, with different tools for the various phases of implementation. This is the first time the authors have used MS-SQL 2008 R2 for data testing and implementation. They observed that the server version has more facilities than client version tools. With SQL Server it is possible to find the total number of links and strong links, from which the scope and limitations of “Indian University Clusters” can be found, as shown in the different figures as per context. The IBM SPSS tool is used both for the required statistics and for the quality of the model, which is shown in figure 2. The following real-world cases were generated during the implementation process:
- Understanding of University clusters
- Understanding of world case studies
- Understanding of local clusters with respect to local policies and strategies
- Implement some local clusters
- Repeat the same local clusters procedure with respect to centralized database systems.
- Understanding the relation between different variables (entities).
- Understanding mathematical and data mining models
- Understanding of MS-SQL 2008R2 Analytical services, IBM SPSS tool and other Data mining tools (Tanagra, Weka, Orange, R(Rattle-GUI), and Rapid miner studio)
- Understanding of Association Rule Mining and its implementation with MS-SQL 2008 R2 and other Data mining tools.
- Focus on mathematical models for linear curve fitting methods and Excel implementation.

4.2 Implementation of mining rules with MS-SQL 2008R2
The association rules give the number of items and feasible items with respect to a given superset or set. The authors applied rules mining to the University Database. The number of rules generated varies from one interval to another according to the minimum frequency and confidence. We neglected the negative values and overflow values. The authors checked one hundred values for the available database. The figure below shows the MS-SQL 2008R2 server implementation of Association rule mining at different confidences.

Figure 5: Shows rules mining process is successfully completed

The above figure shows that the mining process executed successfully. The database is imported from an Excel sheet into MS-SQL 2008R2 Server. The process start time, end time, and duration are given in the figure and are self-explanatory. The server can also be used directly, without importing the data from other data sources such as text files, Access, and Excel sheets. The authors repeated the experiment for different confidences and obtained different rules. The output is as follows.

Figure 6: Shows server output for rules mining with negative and positive values.
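The paper does not define the importance score that takes negative and positive values. A common definition for association rules is a base-10 log ratio of conditional probabilities, and the sketch below assumes that definition purely for illustration; the function name `rule_metrics` and the counts are hypothetical. Under this assumption, a negative importance simply means the right-hand side is less likely when the left-hand side is present than when it is absent.

```python
import math


def rule_metrics(n_total, n_lhs, n_rhs, n_both):
    """Confidence, lift, and log-ratio importance of a rule LHS -> RHS from counts."""
    if not 0 < n_lhs < n_total:
        raise ValueError("LHS must occur in some but not all rows")
    if n_rhs <= 0:
        raise ValueError("RHS must occur in at least one row")
    confidence = n_both / n_lhs                      # P(RHS | LHS)
    lift = confidence / (n_rhs / n_total)            # P(RHS | LHS) / P(RHS)
    p_rhs_without_lhs = (n_rhs - n_both) / (n_total - n_lhs)
    if confidence == 0 or p_rhs_without_lhs == 0:
        raise ValueError("importance is undefined when a conditional probability is zero")
    # Assumed definition: importance = log10(P(RHS | LHS) / P(RHS | not LHS)).
    importance = math.log10(confidence / p_rhs_without_lhs)
    return confidence, lift, importance


# Hypothetical counts out of 100 rows.
positive = rule_metrics(n_total=100, n_lhs=40, n_rhs=50, n_both=30)
negative = rule_metrics(n_total=100, n_lhs=40, n_rhs=50, n_both=10)
```

With these counts the first rule has a positive importance and the second a negative one, which matches the kind of rule the authors discard.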
From Figure 6, some negative values are present; these are simply neglected, and only the total number of rules generated at the defined or required interval is considered. Negative values are marked in red and feasible values in blue. There are 2000 rules available at probability 0.5, with both negative and positive importance values.

Figure 7: Shows total numbers of item sets are generated for a given data base

From the above figure, 829 item sets are available at a probability value of 0.5 with 2000 rules. The number of attributes, or item set size, is three, as shown in the column named “Size”. The authors considered item sets of at most 3 items. As the item set size increases, the number of rules decreases.

Figure 8: Shows network of all links and strong links of different entities

The above figure shows the network with all links and strong links for the proposed database for “Indian University Clusters”. The left-side network shows the strong links between different entities, whereas the right-side network shows all links between all available entities in the proposed database.

Figure 9: Shows number of rules is 14361 with maximum rule length is 4(Tanagra).

From the figure, a total of 14361 rules are generated, with a maximum rule length of 4. All values are >0 and there are only three outliers. The minimum frequency is 0.5, the confidence is 90%, and the lift value is 1.10. The authors used IBM SPSS tools to estimate the quality of the proposed model; the accuracy of the model for rules generation is approximately 80%. The model summary consists of different entities.
They are target, automatic data preparation, model selection method, and information criteria. Here the information criteria value is nearly 90%, which shows that all entities are good enough to form the model. This method is useful for local clusters only. For centralized databases it is practically very difficult to fit a small information criteria value, since centralized, logical, and hybrid clusters are available. The model is selected by the “Forward stepwise method” only. This is a strictly irreversible system in which only one-way communication is available.

Figure 10: Shows the model is 80% accurate.

a. Study of linear relationship between importance and probability
The authors determined the linear relation between probability and importance for rules generation. They arranged the data from the negative scale to the positive scale, including zero. The data is available in an Excel sheet, and a sample is arranged in table form below for understanding; its output is available in the “Results and Discussion” section. We recorded the various interval values at the given probability in the sample table below.

Table-1: Shows linear relation between two variables (Sample table)
Importance | 0.02 | 0.04 | 0.06 | 0.08 | 0.1 | 0 | -0.02 | -0.04 | -0.06 | -0.08
Rules | 1815 | 1673 | 1610 | 1568 | 1476 | 1856 | 1920 | 2000 | 2000 | 2000

The authors used the same data with other tools such as Expert curve fitting and SQL-SERVER 2008R2. All the outputs are available in the “Results” section. The figure below shows the data loaded into the “Expert curve fitting” tool, which is useful for estimating the linear relationship between entities. There are two intersection points, at 1860 and 1310 rules.
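As a rough check of the linear relation, the ten sample points from the sample table above can be fitted by ordinary least squares. This is a minimal sketch in plain Python, not the curve fitting tool used by the authors; the function name `fit_line` is an assumption, and the fit covers only the sample values, not the full data set.

```python
# Sample values copied from the sample table (importance vs. rules generated).
importance = [0.02, 0.04, 0.06, 0.08, 0.1, 0, -0.02, -0.04, -0.06, -0.08]
rules = [1815, 1673, 1610, 1568, 1476, 1856, 1920, 2000, 2000, 2000]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(importance, rules)
# Residuals of the fitted line, one per sample point.
residuals = [y - (intercept + slope * x) for x, y in zip(importance, rules)]
```

On these sample points the fitted slope is negative, consistent with the observation that the number of rules falls as the importance threshold rises. The flat run of 2000 rules at negative importance shows up as a pattern in the residuals, which is one reason a single straight line is only an approximation here.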
Of the two intersection points, only the one at 1310 rules is considered feasible, since it lies in the positive interval, whereas the one at 1860 rules lies in the negative interval, as shown in the figure below. The negative interval shows a more saturated state, while the positive interval shows a regular decrease. Three kinds of values are available in the positive interval: the minimum, the maximum, and the intersection point (1300 rules). The minimum is 725 rules and the maximum is 1860 rules.

Figure 11: Shows minimum, maximum and intersection points of linear curve

The above figure shows the relationship between the interval values on the x-axis and the number of rules generated on the y-axis. The feasible value is approximately 1300 rules at an interval value of 0.1. The standardized residuals of the linear variables for the Normal distribution are as follows.

Figure 12: Shows Normal distribution of linear variables

The authors used IBM SPSS tools to estimate the Normal Distribution of the residuals, where the mean is -0.01, the standard deviation is 1.007, and N=82. The curve is strictly a bell curve, as shown in the above figure.

b. Clustering of Linear Variables
The authors used the Excel 2007 Data mining add-in to estimate the cluster density at different intervals, as shown in the figure. The figure shows that one cluster is completely isolated from the remaining clusters. The authors observed the nature of the clusters in terms of total links and strong links. In both cases they found that one cluster is completely isolated and only two clusters are connected, as shown in figure-13 below. The authors observed that 77% of the data is available for the below clusters. The intensity of the data is not shown in the figure.
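To illustrate what grouping the rule counts into three clusters involves, the sketch below runs a one-dimensional k-means on the ten sample rule counts from the earlier sample table. This is a stand-in technique: the authors used the Excel 2007 Data mining add-in and Weka, whose clustering algorithms and settings are not given, so the function `kmeans_1d`, its starting centers, and the resulting groups are assumptions and need not match the clusters in the figures.

```python
def kmeans_1d(values, k=3, iterations=50):
    """Cluster scalar values with Lloyd's k-means (k >= 2); returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assign every value to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, labels


# Rule counts from the sample table (ten points only, for illustration).
sample_rules = [1815, 1673, 1610, 1568, 1476, 1856, 1920, 2000, 2000, 2000]
centers, labels = kmeans_1d(sample_rules, k=3)
```

Each entry of `labels` gives the cluster index of the corresponding rule count, and `centers` holds the mean rule count of each cluster, which is the kind of per-cluster summary a cluster profile reports.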
The attributes of the three clusters, and the average over the three clusters, were also examined; they are given in the “Results and Discussion” section.

Figure 13: Shows three clusters are formed for linear curves

The above figure shows that Cluster 2 holds 77% of the data. Cluster 1 and Cluster 2 are linked to each other. The details of these clusters are given in the “Results and Discussion” section.

Figure 14: Shows three clusters are formed for linear curves with Weka

There are three clusters in total, marked with different colours. The class colour is also shown in three different colours in Figure 14 and ranges from 602 to 2000. Figures 13 and 14 therefore confirm that there are only three clusters for the linear curve entities. We also used the Weka data mining tool, which is available in workflow form and end-user form. The implementation details are not given here. In the output screen, Instance_number is plotted along the x-axis and rules along the y-axis. Jitter can be used to make the diagram clearer. On the right-hand side of the diagram, sample clusters for the x and y axes are shown in different colours to aid the end user. The rules are likewise shown in three colours, starting at 602 and ending at 2000 (another colour). The authors used Weka for the following reasons.

Comparative study with SQL Server
Workflow and end-user forms are available
More user-friendly and client-based only
Special configuration skills are not necessary
Different tools are available for visualization and .arff file conversion
A command mode is also available for troubleshooting and command-based execution

5. RESULTS AND DISCUSSION

All results were obtained and recorded with different tools and methods, each chosen according to context. The tools used for each purpose are listed in the table below.

Table 2: Shows different tasks and resources

Sno  Task                                Tool/Resources
1    Background preparation              Authors' published journals
2    Association rules                   MS-SQL 2008 R2
3    Linear variables                    Data Excel sheet
4    Linear relation between variables   MS-SQL 2008 R2, Excel sheet and curve fitting tool
5    Clustering                          MS-Excel 2007 Data mining add-in, Weka
6    Scope and limitation of model       IBM SPSS

The authors applied Association rules using MS-SQL 2008 R2 server; the details are given in figure-2. The data was tested at 0.5 probability and -0.1 importance, and 2000 rules were generated, as shown in the figure. Negative values are marked in red and are out of scope. A total of 289 item sets are available under these constraints, as shown in figure-5. The network dependency system describes the total number of links and strong links between the entities in the defined database. The proposed database has more than 50 entities, so the network is complicated. Only 15 entities have a strong relation with each other, as shown in figure-6. The scope and limitation of the system is implemented with the IBM SPSS tool, which shows nearly 80% accuracy, as shown in figure-5. From figure-6 it is clear that 1300 rules are feasible for the given dataset. The given dataset forms three clusters for the linear curve, as shown in figure-7 and figure-8, implemented with the MS-Excel 2007 data mining add-in and Weka respectively. The profiles of these three clusters were examined using the data mining add-in and are shown in the diagram below.
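The rule selection described above (a probability threshold of 0.5, an importance threshold of -0.1, and negative-importance rules treated as out of scope) can be sketched as follows. This is an illustrative sketch only: the `Rule` class, the attribute names and the sample rules are hypothetical, and treating an importance of exactly zero as feasible is an assumption, since the paper does not state how the boundary is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedent: str
    consequent: str
    probability: float
    importance: float


def select_rules(rules, min_probability=0.5, min_importance=-0.1):
    """Keep the rules that meet both thresholds used in the paper's test."""
    return [
        r for r in rules
        if r.probability >= min_probability and r.importance >= min_importance
    ]


def split_by_sign(rules):
    """Separate feasible rules from negative-importance (out-of-scope) rules.

    Assumption: importance >= 0 counts as feasible.
    """
    feasible = [r for r in rules if r.importance >= 0]
    out_of_scope = [r for r in rules if r.importance < 0]
    return feasible, out_of_scope


# Hypothetical rules for illustration; not taken from the authors' database.
sample_rules = [
    Rule("Type=State", "Accreditation=A", 0.82, 0.35),
    Rule("Type=Private", "Cluster=Local", 0.61, 0.10),
    Rule("Region=Urban", "ELearning=Yes", 0.55, -0.05),
    Rule("Type=Deemed", "Cluster=Central", 0.47, 0.20),   # fails probability
    Rule("Region=Rural", "ELearning=No", 0.70, -0.30),    # fails importance
]

selected = select_rules(sample_rules)
feasible, out_of_scope = split_by_sign(selected)
print(len(selected), "rules selected;", len(feasible), "feasible,",
      len(out_of_scope), "out of scope")
```

Keeping the threshold filter separate from the sign split mirrors the text: the thresholds determine how many rules are generated, while the sign of the importance decides which of those rules are counted as feasible.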
Figure 15: Cluster profiles for three clusters

The above figure shows that three cluster profiles are formed. The profiles are generated on the basis of the linear variables Importance and Rules. The total reported is 58, with Cluster 1 having the maximum, 17; Cluster 2 and Cluster 3 have 17 and 13. The size of a cluster alone does not give a solution, but its nature can be determined from its attributes. The above diagram is self-explanatory.

Figure 16: Cluster 1 characteristics with feasible and infeasible values

The above figure shows that the rules of Cluster 1 lie in the negative interval, i.e., -0.4 to -0.1 and -0.1 to 0.0. Only three rules lie in the interval -0.4 to -0.1. The Cluster 2 and Cluster 3 values lie in the positive interval, as shown in the figures below. The probabilities are shown in the last column of the above figure.

Figure 17: Cluster 2 characteristics with feasible values only

Figure 18: Cluster 3 characteristics with feasible and infeasible values

The above figure shows that the rules lie in the intervals 0.4 to 0.1 and 0.1 to 0.0, with different rules in each interval. The negative values lie in the interval 0.0 to -0.1. The rules in the interval 0 to 0.1 fall into two ranges: 1553-1926 and 1179-1553. In the next row, Importance appears in two stages and Rules also appear in two stages.
There is a one-to-one relation between Importance and Rules. By default the system searches for a mapping between importance and rules. The probabilities are also shown in the last column of the above figure. The average population of all these clusters is as follows.

Figure 19: Average population of all 3 clusters

Figure 20: Graph generated by MS-SQL 2008R2 Server for rules mining

The authors also observed the linear and poly curves for the above rules with respect to different intervals, as shown in the following diagram.

Figure 21: Shows linear and poly curves for rules mining

The above figure is self-explanatory; higher degrees are neglected because of their complexity. Only curves up to degree two are examined, namely the linear and quadratic curves. Finally, the authors used IBM SPSS to estimate outliers for the linear curves, which gives the scope and limitations of the defined database of the University System. We used Cook’s distance to estimate outliers for “Indian University Clusters”, as shown in the figure below.

Figure 22: Cook’s distance for outliers estimation for records

Records with a large Cook’s distance are highly influential on the model, and such records distort the model's accuracy.

6. CONCLUSIONS

This paper mainly focused on the scope and limitations of “Indian University Clusters” with respect to e-Learning. We made every effort to balance the mathematical and data mining models. The tools used for each purpose are listed in Table 2. The clusters are formed and analyzed mainly on the basis of intervals and probabilities. The network diagram is analyzed on the basis of all links and strong links, and a Bi-cluster is observed to be suitable (Figure 13).
In the same way, the association rules and item sets are generated using different intervals and probabilities. The authors neglect negative intervals for rules generation and consider only positive intervals (Figures 15, 16, 17, 18). The accuracy of the system is approximately 80% (Figure 10). More emphasis on the mathematical model is necessary to improve the accuracy of the system. The network topology governs communication between the various entities; this is mainly a network-oriented phase and is out of scope from the database point of view.

References

[1] Srinatha Karur, Prof. M.V. Ramana Murthy, “Survey and Analysis of University Clustering”, IJAIA, Vol (4), No (4), pp. 127-143, July 2013.
[2] Srinatha Karur, Prof. M.V. Ramana Murthy, “Local Clusters formation for Indian Universities”, IJAIA, Vol (4), No (5), pp. 19-38, September 2013.
[3] Srinatha Karur, Prof. M.V. Ramana Murthy, “Data Preparation and Analysis for Andhra Pradesh Clusters”, IJSBAR, Vol (7), No (1), pp. 4-16, 2013, ISSN 2307-4531.
[4] Srinatha Karur, Prof. M.V. Ramana Murthy, “Modelling and Data testing for Indian University Clusters”, IJES, Vol (2), Issue (10), pp. 2985-2996, October 2013, ISSN 2319-7242.
[5] M. Srinatha Karur, Prof. M.V. Ramana Murthy, “E-Learning nodes estimation for NAAC Indian Universities”, IJETTCS, Vol (2), Issue (6), pp. 143-159, November 2013, ISSN 2278-6856.
[6] Prof. Giuseppina Passiante, Dr. Giustina Secundo, “From geographical innovation clusters towards virtual innovation clusters the Innovation Virtual System”, [Online] Available: http://www-sre.wu-wien.ac.at/ersa/ersaconfs/ersa02/cdrom/papers/270.pdf
[7] [Online] Available: http://www.ugc.ac.in/ugcpdf/740315_12FYP.pdf
[8] [Online] Available: http://www.edu-leaders.com/edu/news/39501/cluster-colleges-instead-deemed-university-ugc
[9] [Online] Available: http://www.eadi.org/fileadmin/WG_Documents/Reg_WG/vandijk1.pdf
[10] [Online] Available: http://www.davidwoon.com/researcher/files/wise01.pdf

AUTHORS

Mr. Srinatha Karur received his M.C.A. and M.Tech (IT) during 1994-97 and 2002-2004 respectively, and is awaiting his Ph.D. certificate. He completed his M.Phil. from “Global Open University”, Dimapur, Nagaland, India, in December 2009. He has 16+ years of continuous service after his P.G. qualification and has four international publications on data mining techniques and applications. At present he works at Government Engineering College, Post box no: 327, Zip code: 400, Ibra, Sultanate of Oman, as IT faculty and Oracle & MS-SQL DBA. He has experience on both technical and academic lines, and he plans to develop e-Learning tools with MS-SQL and Oracle DBA tools as part of the Engineering stream curriculum. His areas of interest are Operations Research, Numerical Methods (Theory), Operating Systems (Windows, UNIX flavour) and DBA (MS-SQL, Oracle). His special interest is the modeling of required data with server-based and client-based tools, and the comparative study of client and server versions of tools or software.

Prof. M. V. Ramana Murthy is a senior Professor and HoD of the Department of Mathematics & Computer Science, Osmania University, Hyderabad, India. He is the author of works on core programming languages and of international publications on different applications. His profile shows his command of academic, administrative and technical fields. At present the professor is on foreign assignment. He has the same command of multiple subjects across Mathematics and Computer Science Engineering.
He has a special interest in core Mathematical Modelling for engineering applications, which is at the heart of his research studies. The professor has a strong command of both Mathematical and Networks modeling. He teaches Neural Networks, Artificial Intelligence and Data Mining at postgraduate and research level, and guides research and postgraduate students in Mathematical and Engineering modeling.