International Journal of Application or Innovation in Engineering & Management... Web Site: www.ijaiem.org Email: Volume 3, Issue 5, May 2014

International Journal of Application or Innovation in Engineering & Management (IJAIEM)
Web Site: www.ijaiem.org Email: editor@ijaiem.org
Volume 3, Issue 5, May 2014
ISSN 2319 - 4847
Scope and Limitations of Indian
University Clusters
Mr. Srinatha Karur1, Prof. M.V. Ramana Murhty2
1 IT Faculty and DBA, Government Engineering College, Ibra, Sultanate of Oman.
2 HoD, Department of Computer Science & Mathematics, Osmania University, Hyderabad, India.
Abstract
This document emphasizes the scope and limitations of Indian University clustering, and depends entirely on previous publications by the same authors in different international journals. The authors tested and emphasize the technical and policy limitations, and the behavior of the different models available in a University Clustering System without a network system. In this document the authors point out the technical requirements for constructing the different models, in order to find the deviations between the proposed and the implemented University Clustering System. The authors finally propose different rules for the University Clustering System with the help of Hybrid Association Rules, which give a flexible solution to different technical and architectural policies.
Keywords: Hybrid Association rules, Linear and Poly Curves, MS-SQL server, Clusters nature
1. INTRODUCTION
This document mainly emphasizes the scope and limitations of Indian University clustering with respect to the technical, mathematical, and policy-based procedures that are available at different local levels. Even though these three parameters are independent of each other, the authors attempt to create an equilibrium point between them for an optimum solution. The technical and policy details are available in various publications of the authors in the form of mathematical and data mining models. The authors observed that different hybrid types of University clusters are available in real-world case studies and applications. Some clusters are hybrid and also depend on the policies and requirements of different countries. Local-level, national-level, and continent-level clusters are available in different countries as per the educational needs of the countries on both sides [1]. University clusters around the world are formed on different parameters: subject-wise, policy-wise, technology-wise, common-goal-wise, and so on. The authors observed that different countries have different needs and that there is a huge gap between developed and developing countries. The authors considered samples from developed countries such as the USA, UK, Australia, New Zealand, and Germany. The Australian clusters have “Research at University level” as their precise objective [1]. The nature of clusters for local and central systems is entirely different, and India has both central and local Governments, so it is necessary to read the clusters as local as well as central. In their publications the authors mainly concentrated on local policy clusters because of the policy structure of the Government of India. Geographically, India consists of nearly 30 local units. The local clusters are estimated as per the needs of the local units only [2]. Since heterogeneous conditions exist in a country like India, a study of all local units is practically very difficult. Because of these practical difficulties the authors considered “Andhra Pradesh” as one local node, and estimated and implemented the same for all remaining n-1 local nodes [3]. After studying the formation of local clusters for the Andhra Pradesh unit, the authors applied the same local policy at full scale, in terms of data modelling and data testing policy, to all Universities in India. The authors carried out data testing with the help of statistical and mathematical models, mainly using different types of mathematical and data mining tools for outlier estimation [4], pp. 2987-2980. Finally, the authors estimated e-Learning nodes with the help of a survey of the infrastructure and policies of different Universities. The authors observed that only 20% - 25% of Universities are capable of forming e-Learning nodes and their clusters. For accreditation of Universities the authors followed the official NAAC accreditation web site. The authors observed that Deemed or Private Universities have considerable infrastructure and policies to form e-Learning nodes. These Private Universities have tie-ups with other organizations such as IIT-M, IIT-B, and IIM-B, or links with other Government Universities [5], pp. 151. Lastly, the authors found the scope using Hybrid Association rules, and used MS-SQL server and data mining tools to observe the different outputs. The association rules are used to estimate the scope of clustering of Indian Universities. The authors' five published papers give the exact nature of Indian University Clustering. All five published papers are arranged sequentially for easy understanding by readers. The authors used the data mining tool R (Rattle-GUI) to describe the University Database system with different frequencies, as shown in figure-1. The One-to-Many relationship between Universities and their different attributes is shown in the form of charts in figure-2.
Figure 1: Shows University Database with frequencies (Rattle-GUI)
Figure 2: Shows different attributes of University (One-to-Many) with Rattle-GUI
The above attributes are available for the defined University Database system, whose description is available mainly as the proposed model for “Indian University Clusters for estimation of e-Learning nodes”. The database consists of nearly 55 columns, which support the e-Learning node attribute for supervised, unsupervised, or hybrid learning.
2. LITERATURE REVIEW & GROUND LEVEL WORK
An easy way to estimate the exact nature of the “Indian University Clustering system” is to arrange things on the basis of different scale levels. The authors pointed out the global efforts, local efforts, and different policies of Government bodies on University clustering [1], pp. 127. The developed countries have separate Government policies and methods for University clustering. Because of these different policies, different types of clusters are formed in these countries [1], pp. 134. The authors applied the Association Rules concept to Indian Universities at 30 locations as a random sample. In this case the authors considered “Chennai” as the root element for linear projections [1], pp. 137-138. The authors defined specifications at cluster level and University level separately, which gives the exact nature needed to define the scope of “Indian Universities Clustering” [1], pp. 135. The authors gave algorithms of different types for tracing the information with respect to data structures [1], pp. 134-135. In their conclusion the authors noted that the clusters of Japan and France are more suitable than the developed countries' clusters [1], pp. 143. India has more than 600 Universities, located at different geographical locations. The authors used the 12th FYP of the UGC as the standard for estimating local clusters on an all-India basis. It is very difficult to consider a sample above 50% for estimating “Distance matrix” values, so the authors randomly selected the state capital city as a single entity. At the same time we observed that some Universities are also eligible even though they are at “District level”. In some cases we may also have multiple values that satisfy the given conditions. Some geographical virtual clusters are also formed, but we did not consider the logical clusters, since they are outside the scope of our aim. The details of logical clusters are available in [6]. The topology of logical clusters is available at source: Tapscott, Ticoll, and Lowy 2000. The proposed system description is available in the form of different zones with allotted nodes. We
construct the proposed system for the South zone and implement the same procedure for all remaining zones [2], pp. 25. All sampling details are available in table form [2], pp. 33. Kurtosis, skewness, and hierarchical clustering are computed with R software (Rattle GUI) [2], pp. 36. Now we consider the sample from the sample zone, where the unit name is “Andhra Pradesh”; this unit itself is made up of three different fragments, as shown in the table [3], pp. 6. We estimated all possible clusters according to the nature of the Andhra Pradesh nodes as local nodes. We used regression analysis and curve fitting methods to obtain the relationship between the available and the defined variables. Curve Expert and Excel sheets are used for mathematical modelling, and data mining tools are used for cluster formation. The authors found that the R and Rapid miner tools are not suitable for the available data, whereas the Tanagra, Weka, and Orange tools were applied successfully, as shown in the figures [3], pp. 9-12. The depth of the various University clusters and the cut-off values are available in table-7 [3], pp. 15. The data is prepared with respect to the Andhra Pradesh clusters, and data analysis is carried out in all local clusters, which are available as sub-clusters. After finalizing the data with local clusters, the next step is modelling the entire data on the basis of the local clusters. After modelling, the data is tested once again with data mining tools for outliers. After testing the University Clustering with data mining tools, the results are generated and recorded in the form of tables or figures. We want to estimate what type of clusters is useful for Indian Universities using modelling and data mining tools. The authors detected outliers in University Clusters with different statistical, mathematical, and data mining tools, and finalized the results in the form of poly curves of second and third degree, as shown in table-9 [4], pp. 2994-2995. All the different types of curves are tabulated in that table. The authors repeated the experiment with 50% of the data and with the full data and examined the results. It is found that 50% of the data is enough for testing, and the results are almost the same for the two data sizes (i.e., 50% and 100% of the data). The authors applied the same local cluster policy to the entire country for estimation of e-Learning nodes, and then applied the cluster policy. For this purpose the authors considered NAAC as a model for defining the standards of Universities. Every year, Universities in India receive grades based on the overall performance of the individual universities from the National Assessment and Accreditation Council (NAAC), New Delhi. Around 612 universities are present in the country, out of which 172 universities are accredited by NAAC. Out of the 172 universities accredited, 67 have been placed in Grade A, 99 universities in Grade B, and only 6 in Grade C, based on the scores awarded during the process of accreditation. More details are available at
http://www.careerindia.com/news/2012/12/05/172-universities-out-of-612-are-accredited-by-naac-003445.html.
Only 28% of Universities are accredited by NAAC, and the authors tested the data for e-Learning nodes from this 28% only. The authors used a Linear Programming problem and the Gini Index to implement the e-Learning nodes at University level [5], pp. All the results for e-Learning nodes in terms of zones and clusters are available in table-10 and table-11 respectively [5], pp. 157. The proposed database is available in the authors' publication [1], pp. 135 as part of the specifications. The various milestones of “Indian University Clusters” consist of the following stages, which indicate the scope and limitations of the system. Table-1 lists all available implementations published by the authors.
Table 1: Deals about the all implementation phases of “Indian University Clusters”
SNo | Aim | Tools/Resources used
1 | Survey and Analysis for worldwide clusters | Tanagra, Weka, Ggobi, Excel, and Gephi
2 | Local clusters formation for Indian Universities (Zone wise) | Weka, Orange, R and Excel
3 | Data preparation and Analysis for Andhra Pradesh local clusters formation as local clusters | Tanagra, Weka, Orange and Excel
4 | Modelling and Data testing for Indian Universities clusters | Easy fit curve, Stat fit, Online tools for CI, Tanagra, Weka, Orange, Rapid miner, R(Rattle-GUI), and Excel
5 | E-Learning nodes estimation for NAAC Indian Universities | NAAC official website, Tanagra, LPP tools, R(Rattle-GUI), Gini online tool, Orange, and Excel.
The various aims defined in the above table represent the scope of the “Indian University Clusters” with respect to local and geographical conditions. It is necessary to emphasize logical geographical clusters for “University Clusters” with respect to India's geographical conditions; the various zones are described in “Local clusters formation for Indian Universities”, published by the authors [2], pp. 25-26. The authors considered the policy of “Expansion of Higher Education in India” and maximum enrolment for the next FYP, from the UGC official website [7], pp. 73-80. In this policy document, its 12th FYP report, the UGC stated the role of “University Clusters” [7], pp. 79. The UGC plans to introduce cluster colleges as an alternative to deemed universities, as there have been no new deemed universities created in the last seven years. The court case challenging the 2010 University Grants Commission regulations has effectively
blocked the route to forming new Deemed Universities [8]. Arrangements in which industry and educational institutes are tied up, or brought into a common frame as a cluster policy, are defined as “Industry clusters for Educational Research” [9], pp. 8-10. Given the scope of the different policies and the geographic nature of the country, it is assumed that local clusters are the most suitable for “Indian University Clusters”. The case studies of “Educational Clusters” in developed countries suggest that local clusters give a better solution than centralized clusters. To understand the depth of the “Scope of Indian University Clusters” the authors applied the “Association and Rapid Association Rules policy” [10], pp. 01. The authors observed that Dynamic Association Mining is suitable for technical entities only. The policy-based rules are arranged with static or general Association Rules mining to estimate the “Scope of University Clusters”. The authors also observed that the “Naive Bayes theorem” is very helpful for estimating different entities in the database modelling. The authors used Tanagra and MS-SQL Server 2008 R2 for the Association Rules implementation, followed by report generation. Other tools such as Weka, Orange, Rapid miner, and R (Rattle-GUI) can also be used. We emphasize the MS-SQL Server 2008 R2 version, in which Import and Export, Business Intelligence, and Management Studio are available. Instead of the IBM SPSS tool we can use general data mining tools or any statistics tool. For effective implementation of node generation we can use Microsoft SharePoint. In the near future the authors plan to use Oracle server and MS-SQL server together for greater convenience and flexibility, and hence to form hybrid technologies (MS-SQL, Oracle, IBM-SPSS, data mining tools).
3. PROPOSED AND CONCERNED WORK
The concerned work consists of estimating the scope and limitations of the “Indian University Clusters” from policy-oriented and technical points of view. In real-world case studies different types of policies and technical approaches are available. In particular, different types of policies are defined and implemented as per the needs of the individual country. Broadly, we emphasize developed countries such as the USA, UK, Australia, Germany, France, and New Zealand. The cluster policies of these different countries give us different real-world problems or scenarios. Depending on the context we can use either supervised or unsupervised learning to estimate the scope and limitations of “Indian University Clusters”. The authors used the different tools listed in Table-1 for the technical implementation of the given objective.
3.1 Use of Supervised, Unsupervised and Hybrid methods
The scope and limitations of “Indian University Clusters” involve different strategies and mathematical models. These models generally depend on Linear Regression, polynomial curves, the Naive Bayes theorem, decision trees, and cluster methods. However, these methods are not guaranteed to estimate the scope of the aim exactly. The authors therefore also used Association Rules, generating as many rules as possible, for a deeper understanding of the scope of the proposed University system database and of the Australian University Research Clusters [1], pp. 135,131. Sometimes outliers are given a meaning similar to limitations, or to the minimum and maximum values of an entity. Technically, however, the two are different, and a comparative study of the outliers and the limitations of the proposed system is out of scope for this paper. In data mining, outliers have a more technical meaning than the limitations of a system. The authors used different data mining tools for the estimation of outliers [1][4], pp. 132,2993.
Algorithm for Scope and Limitations for Indian University Clusters
The authors used the Association Rules generation method and then found the item sets at different intervals. The authors then used curve fitting methods or the Logistic Regression method to find the relation between two variables at different intervals.
Algorithm:
1. Load the database into any data mining tool or MS-SQL Server 2008 R2. Trim the values if necessary.
2. Find the item sets, frequent item sets, and minimal item sets.
3. Repeat step 2 for different intervals. Consider only the rules whose values are >0 and that meet the minimum support and probability. Sometimes negative values occur for the given conditions; in that case trim the values of the given conditions before counting the total number of item sets and generating rules.
4. Repeat step 3 the required number of times and draw the graph of importance against the number of rules generated.
5. Apply Hierarchical Clustering or any other cluster method to the rules available at each importance value, find the different heights, and fix the nearest value with respect to the rules generated. Let ‘x’ be the obtained value; then x_min and x_max are the two values available from the set of experimental results. The values nearest to these give an idea of the scope of the given problem.
The above algorithm mainly describes the method for constructing the hybrid system for estimating the given objective. The authors mainly emphasize Association mining, Linear Regression, and cluster formation on the basis of the linear relationship between importance, at a given probability, and the number of rules generated as output. The result depends entirely on the number of input values and the minimum frequency value of the items. The authors neglected negative importance values.
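Steps 2 to 4 of the algorithm can be illustrated with a minimal sketch. The transactions and attribute names below are hypothetical toy data, not the authors' University database. The importance score is assumed to be the base-10 logarithm of P(B|A) / P(B|not A) with add-one smoothing, which is how the importance of a rule in the Microsoft Association algorithm is commonly described; the authors' exact server settings are not given in the text.

```python
from itertools import combinations
from math import log10

# Hypothetical toy transactions: attribute values of six universities.
TRANSACTIONS = [
    frozenset({"naac_a", "private", "elearning"}),
    frozenset({"naac_a", "private", "elearning"}),
    frozenset({"naac_a", "elearning"}),
    frozenset({"naac_b", "government"}),
    frozenset({"naac_b", "government", "elearning"}),
    frozenset({"naac_a", "private"}),
]


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force search: return {itemset: support} for frequent item sets."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    found = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(candidate <= t for t in transactions) / n
            if support >= min_support:
                found[candidate] = support
    return found


def mine_rules(transactions, itemsets, min_probability):
    """Build rules A -> b (single-item consequent) from frequent item sets."""
    n = len(transactions)
    mined = []
    for itemset in itemsets:
        if len(itemset) < 2:
            continue
        for consequent in sorted(itemset):
            antecedent = itemset - {consequent}
            n_a = sum(antecedent <= t for t in transactions)
            n_b = sum(consequent in t for t in transactions)
            n_ab = sum(itemset <= t for t in transactions)
            probability = n_ab / n_a  # n_a > 0 because min_support > 0
            if probability < min_probability:
                continue
            # Assumed importance: log10(P(b|A) / P(b|not A)), add-one smoothed.
            p_b_given_a = (n_ab + 1) / (n_a + 2)
            p_b_given_not_a = (n_b - n_ab + 1) / (n - n_a + 2)
            importance = log10(p_b_given_a / p_b_given_not_a)
            mined.append((antecedent, consequent, probability, importance))
    return mined


def rules_per_threshold(mined, thresholds):
    """Count the rules whose importance reaches each threshold (step 4)."""
    return [sum(1 for rule in mined if rule[3] >= th) for th in thresholds]


ITEMSETS = frequent_itemsets(TRANSACTIONS, min_support=0.3)
RULES = mine_rules(TRANSACTIONS, ITEMSETS, min_probability=0.5)
THRESHOLDS = [-0.08, -0.04, 0.0, 0.04, 0.08]
COUNTS = rules_per_threshold(RULES, THRESHOLDS)
print(list(zip(THRESHOLDS, COUNTS)))
```

Raising the importance threshold can only keep or reduce the number of rules, which matches the decreasing trend the paper reports for positive importance values.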
4. IMPLEMENTATION DETAILS
The authors used different tools and methods to implement the algorithm given in the previous section. The authors used MS-SQL Server 2008 R2, MS-Excel, curve fitting tools, and IBM statistics tools. The authors used the Tanagra and Orange tools for closer observation of the results. The Orange data mining tool is used to estimate the Confidence Intervals of different Universities, as shown in the figure.
Figure 3: Show CI of different Universities as local clusters (Target value is South Zone)
From the above figure it is clear that the Confidence Interval of the Universities depends entirely on the zone. The CIs of all zones are distinct. Because of space constraints, the authors provide only the Confidence Interval (CI) of the South zone Universities as a part of “Indian University Clusters”. Two locations are available: one is the District Head Quarters (DHQ) and the other is the State Head Quarters (SHQ) location. The authors used the Orange data mining tool for this purpose, and the modelling for rules generation is as follows.
Figure 4: Shows modelling for rules mining with Orange data mining tool
4.1 Understanding of MS-SQL 2008 R2 & Other tools
Estimating the scope of the given problem mainly requires mathematical models and suitable data mining tools, since not all tools suit the same data. In general, data mining tools can also be used for statistical analysis. The authors used Tanagra, Weka, Orange, R (Rattle-GUI), and Rapid miner in their previously published papers, with different tools for the various phases of implementation. This is the first time the authors have used MS-SQL 2008 R2 for data testing and implementation. They observed that the server version has more facilities than client version tools. Using SQL server it is possible to find the total number of links and strong links, which helps to find the scope and limitations of
“Indian University Clusters”, as shown in the different figures as per context. The IBM SPSS tool is used not only for the required statistics but also for the quality of the model, which is shown in figure 2. The following real-world cases arose during the implementation process.

• Understanding of University clusters
• Understanding of world case studies
• Understanding of local clusters with respect to local policies and strategies
• Implement some local clusters
• Repeat the same local clusters procedure with respect to centralized database systems.
• Understanding the relation between different variables (entities).
• Understanding mathematical and data mining models
• Understanding of MS-SQL 2008R2 Analytical services, IBM SPSS tool and other Data mining tools (Tanagra, Weka, Orange, R(Rattle-GUI), and Rapid miner studio)
• Understanding of Association Rule Mining and its implementation with MS-SQL 2008 R2 and other Data mining tools.
• Focus on mathematical models for linear curve fitting methods and Excel implementation.
4.2 Implementation of mining rules with MS-SQL 2008R2
The association rules give the number of items and feasible items with respect to a given superset or set. The authors applied rules mining to the University Database. Depending on the minimum frequency and confidence, the number of rules generated varies from one interval to another. We neglected the negative values and overflow values. The authors checked one hundred values for the available database. The figure below shows the MS-SQL 2008 R2 server implementation of Association rule mining at different confidences.
Figure 5: Shows rules mining process is successfully completed
From the above figure it is clear that the mining process executed successfully. The database is imported from an Excel sheet into MS-SQL 2008 R2 Server. The process start time, end time, and duration are shown in the above figure and are self-explanatory. We can also use the server directly, without importing the data from other data sources such as text files, Access, and Excel sheets. The authors repeated the experiment for different confidences and obtained different rules. The output is as follows.
Figure 6: Shows server output for rules mining with negative and positive values.
From the above figure it is clear that some negative values occur; those values are simply neglected, and only the total number of rules generated at the defined or required interval is considered. Negative values are marked in red and feasible values are shown in blue. There are 2000 rules available at probability 0.5, with both negative and positive importance values.
Figure 7: Shows total numbers of item sets are generated for a given data base
From the above figure it is seen that 829 item sets are available at a probability value of 0.5 with 2000 rules. The number of attributes, or item set size, is three, as shown in the column named “Size”. The authors considered item sets of at most 3 items. When the item set size increases, the number of rules decreases.
Figure 8: Shows network of all links and strong links of different entities
The above figure shows the network with all links and strong links for the proposed database for “Indian University Clusters”. The left-side network shows the strong links between different entities, whereas the right-side network shows all links between all available entities in the proposed Indian University Clusters database.
Figure 9: Shows number of rules is 14361 with maximum rule length is 4(Tanagra).
From the figure it is clear that a total of 14361 rules are generated and that the maximum rule length is only 4. All values are >0 and there are only three outliers. The minimum frequency is 0.5 and the confidence is 90%. The lift value is only 1.10. The authors used IBM SPSS tools to estimate the quality of the proposed model, and the accuracy of the model for rules generation is approximately 80%. The model summary consists of different entities: target, automatic data preparation, model selection method, and information criteria. Here the information criteria value is nearly 90%, which shows that all entities
are good enough to form the model. This method is useful for local clusters only. For centralized databases it is practically very difficult to fit a small information criteria value, since centralized, logical, and hybrid clusters are all present. The model is selected with the “Forward stepwise method” only. This is a strictly irreversible system in which only one-way communication is available.
Figure 10: Shows the model is 80% accurate.
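The frequency, confidence, and lift values reported for the Tanagra run can be read as filters on each candidate rule. The sketch below assumes that reading (the text does not say whether 0.5, 90%, and 1.10 are thresholds or observed values) and uses the standard definitions: support = n(A and B)/n, confidence = n(A and B)/n(A), lift = confidence / (n(B)/n). The counts in the example are hypothetical.

```python
def rule_metrics(n, n_a, n_b, n_ab):
    """Return (support, confidence, lift) for a rule A -> B from raw counts."""
    support = n_ab / n
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)
    return support, confidence, lift


def passes(metrics, min_support=0.5, min_confidence=0.9, min_lift=1.10):
    """Assumed filter: keep a rule only if all three thresholds are met."""
    support, confidence, lift = metrics
    return (support >= min_support
            and confidence >= min_confidence
            and lift >= min_lift)


# Hypothetical counts out of 100 records.
KEPT = passes(rule_metrics(n=100, n_a=60, n_b=70, n_ab=57))
DROPPED = passes(rule_metrics(n=100, n_a=60, n_b=90, n_ab=57))
print(KEPT, DROPPED)
```

In the second example the consequent is so common (90 of 100 records) that the rule adds little beyond chance, so its lift falls below 1.10 even though support and confidence are unchanged.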
a. Study of linear relationship between importance and probability
The authors found the linear relation between probability and importance for rules generation. They arranged the data from the negative scale to the positive scale, including zero. A sample of the database is given below in table form. The data is available in an Excel sheet, and the sample is arranged for ease of understanding. Its output is available in the “Results and Discussion” section. We recorded the various interval values at the given probability in the sample table below.
Table-1: Shows linear relation between two variables(Sample table)
Importance | Rules
0.02 | 1815
0.04 | 1673
0.06 | 1610
0.08 | 1568
0.1 | 1476
0 | 1856
-0.02 | 1920
-0.04 | 2000
-0.06 | 2000
-0.08 | 2000
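As an illustration, the sample values in the table above can be fitted with an ordinary least-squares straight line. This is only a sketch: it uses the ten sample rows rather than the authors' full data set, and it assumes a plain straight-line fit, which may differ from the model chosen in the curve fitting tool.

```python
# Sample rows from the table above, sorted by importance.
IMPORTANCE = [-0.08, -0.06, -0.04, -0.02, 0.0, 0.02, 0.04, 0.06, 0.08, 0.1]
RULES_COUNT = [2000, 2000, 2000, 1920, 1856, 1815, 1673, 1610, 1568, 1476]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


SLOPE, INTERCEPT = fit_line(IMPORTANCE, RULES_COUNT)

# Residual = observed rule count minus the fitted value.
RESIDUALS = [y - (INTERCEPT + SLOPE * x)
             for x, y in zip(IMPORTANCE, RULES_COUNT)]

print(f"rules ~ {INTERCEPT:.1f} + ({SLOPE:.1f}) * importance")
```

The fitted slope is negative, reflecting the fall in the number of rules as importance increases; the flat run of 2000 rules at negative importance is the saturated region that a single straight line cannot capture.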
The authors used the same data with other tools such as Expert curve fitting and SQL-SERVER 2008 R2. All the outputs are available in the “Results” section. The figure below shows the data loaded into the “Expert curve fitting” tool, which is useful for estimating the linear relationship between entities. There are two intersection points, at 1860 and 1310 rules. We considered only the 1310 rules as feasible, since they lie in the positive interval, whereas the 1860 rules lie in the negative interval, as shown in the figure below. The negative interval shows a more saturated state and the positive interval shows a regular decrease, as shown in the figure below. Three types of values are available in the positive interval: the minimum, the maximum, and the intersection point (1300 rules). The minimum is 725 rules and the maximum is 1860 rules.
Figure 11: Shows minimum, maximum and intersection points of linear curve
The above figure shows the relationship between the interval values on the x-axis and the number of rules generated on the y-axis. The feasible value is approximately 1300 rules at an interval value of 0.1. The Normal distribution of the standardized residuals of the linear variables is as follows.
Figure 12: Shows Normal distribution of linear variables
The authors used IBM SPSS tools to estimate the Normal distribution of the residuals, where the mean is -0.01, the standard deviation is 1.007, and N=82. The curve is strictly a bell curve, as shown in the above figure.
b. Clustering of Linear Variables
The authors used the Excel 2007 Data mining add-in to estimate the cluster density at different intervals, as shown in the figure. From the figure it is clear that one cluster is completely isolated from the remaining clusters. The authors observed the nature of the clusters in terms of total links and strong links. In both cases the authors found that one cluster is completely isolated and only two clusters are connected, as shown in figure-13 below. The authors observed that 77% of the data falls in the clusters below. The intensity of the data is not shown in the figure. The nature of the attributes of the three clusters, and the average over the three clusters, were also examined and are available in the “Results and Discussion” section.
Figure 13: Shows three clusters are formed for linear curves
The above figure shows that Cluster 2 holds 77% of the data by intensity. Cluster 1 and Cluster 2 are linked with each other. The details of these clusters are available in the “Results and Discussion” section.
Figure 14: Shows three clusters are formed for linear curves with Weka
In total, three clusters are available, marked with different colours. The class colour is also shown in three different colours, as shown in figure 14 above. The class colour starts at 602 and ends at 2000. So from figure 13 and figure 14 it is confirmed that only three clusters are available for the linear curve entities. We also used the Weka data mining tool, which is available in work flow form and end user form. The implementation details are not given here. On the output screen, Instance_number is shown along the x-axis and rules along the y-axis. Jitter can be used to make the diagram clearer. On the right-hand side of the diagram, sample clusters along the x and y axes are shown in different colours for the end user's understanding. For rules, three colours are also used, starting at 602 and ending at 2000 (another colour). The authors used Weka for the following reasons.

• Comparative study with SQL-Server
• Workflow and end-user forms are available
• More user friendly and purely client based
• Special configuration skills are not necessary
• Different tools are available for visualization, along with .arff file converters
• A command mode is also available for troubleshooting and command-based execution
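Since Weka reads .arff files, exported rule data must be converted before it can be loaded. The sketch below writes a numeric-only ARFF text by hand. It is an illustration under assumptions: the relation name, the attribute names and the sample rows are hypothetical and are not taken from the authors' dataset.

```python
def to_arff(relation, attributes, rows):
    """Return ARFF text for a table whose attributes are all numeric."""
    lines = [f"@relation {relation}", ""]
    for name in attributes:
        lines.append(f"@attribute {name} numeric")
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match attribute count")
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"

# Hypothetical rows: (rule number, importance, probability)
sample_rows = [(602, 0.12, 0.55), (1300, 0.05, 0.61), (2000, -0.08, 0.50)]
arff_text = to_arff("university_rules",
                    ["rule", "importance", "probability"],
                    sample_rows)
print(arff_text)
```

The resulting text can be saved with an .arff extension and opened in Weka. Nominal or string attributes would need their own `@attribute` declarations, which this sketch does not cover.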
5. RESULTS AND DISCUSSION
All results were recorded using different tools and methods, chosen according to context. The tools used for each
purpose are listed in the table below.
Table 2: Shows different tasks and resources
Sno | Task                              | Tool/Resources
1   | Background preparation            | Authors' published journals
2   | Association rules                 | MS-SQL 2008 R2
3   | Linear variables data             | Excel sheet
4   | Linear relation between variables | MS-SQL 2008 R2, Excel sheet and curve fitting tool
5   | Clustering                        | MS-2007 Data mining add-in, Weka
6   | Scope and limitation of model     | IBM SPSS
The authors applied association rules with the help of MS-SQL 2008 R2 Server; the details are available in figure-2.
The authors tested the data at 0.5 probability and -0.1 importance and observed that 2000 rules are generated, as shown in
the figure. Negative values are marked in red and are out of scope. In total, 289 item sets are available for these
constraints, as shown in figure-5. The network dependency system describes the total number of links and strong links
between the available entities in the defined database. The proposed database consists of more than 50 entities, so the
network is complicated. Only 15 entities have a strong relation with each other, as shown in figure-6. The scope and
limitation of the system is implemented with the IBM SPSS tool, which shows nearly 80% accuracy, as shown in
figure-5. From figure-6 it is seen that 1300 rules are feasible for the given dataset. The clusters of the given dataset
for the linear curve are three in number, as shown in figure-7 and figure-8, implemented with the MS-Excel 2007 data mining
add-in and Weka respectively. The cluster profiles of these three clusters are examined using the data mining add-in.
The diagram below gives the details of the three cluster profiles.
Figure 15: Cluster profiles for three clusters
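The rule screening described in this section (0.5 probability, -0.1 importance, negative values out of scope) can be sketched as a simple filter. This is an illustration, not the MS-SQL procedure itself: it assumes both settings act as lower bounds, and the `Rule` class, the `split_rules` function and the sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    rule_id: int
    probability: float
    importance: float

def split_rules(rules, min_probability=0.5, min_importance=-0.1):
    """Keep rules meeting both thresholds, then separate negative importance."""
    kept = [r for r in rules
            if r.probability >= min_probability
            and r.importance >= min_importance]
    in_scope = [r for r in kept if r.importance >= 0.0]
    out_of_scope = [r for r in kept if r.importance < 0.0]  # marked red in the paper
    return in_scope, out_of_scope

# Hypothetical sample rules, not taken from the authors' database.
rules = [
    Rule(1, 0.80, 0.35),
    Rule(2, 0.55, -0.05),
    Rule(3, 0.40, 0.20),   # dropped: probability below 0.5
    Rule(4, 0.90, -0.30),  # dropped: importance below -0.1
]
in_scope, out_of_scope = split_rules(rules)
print("In scope:", [r.rule_id for r in in_scope])
print("Out of scope:", [r.rule_id for r in out_of_scope])
```

Keeping the out-of-scope rules in a separate list, instead of discarding them, mirrors the paper's approach of marking negative values rather than hiding them.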
The above figure shows that three cluster profiles are formed. The profiles are generated on
the basis of the linear variables Importance and Rules. The total number of clusters is given as 58, and Cluster 1 has the
maximum number, 17. Cluster 2 and Cluster 3 have 17 and 13, and so on. The size of a cluster alone cannot give a solution,
but its nature can be found from its attributes. The above diagram is self-explanatory.
Figure 16: Cluster 1 characteristics with feasible and infeasible values
The above figure shows that the rules fall in negative intervals, i.e., -0.4 to -0.1 and -0.1 to 0.0. Only
three rules fall in the interval -0.4 to -0.1. The Cluster 2 and Cluster 3 values fall in
positive intervals, as shown in the figures below. The probabilities are shown in the last column of the above figure.
Figure 17: Cluster 2 characteristics with feasible values only
Figure 18: Cluster 3 characteristics with feasible and infeasible values
The above figure shows that the rules fall in the intervals 0.4 to 0.1 and 0.1 to 0.0. Different
rules fall within these intervals. The negative values fall in the interval 0.0 to -0.1. The rules
in the interval 0 to 0.1 have two ranges: 1553-1926 and 1179-1553. In the next row,
Importance is available in two stages and Rules are also in two stages. There is a one-to-one relation between
Importance and Rules. By default the system searches for a mapping between Importance and Rules. The probabilities are
shown in the last column of the above figure. The average population of all these clusters is as follows.
Figure 19: Average population of all 3 clusters
Figure 20: Graph generated by MS-SQL 2008R2 Server for rules mining
The authors also observed the linear and poly curves for the above rules with respect to different intervals, as shown in
the following diagrams.
Figure 21: Shows linear and poly curves for rules mining
The above figure is self-explanatory; higher degrees are neglected due to their complex nature. Only curves up to degree two
are examined, i.e., linear and quadratic curves. Finally, the authors used IBM SPSS for outlier estimation for the linear
curves, which gives the scope and limitations of the defined database of the University System. We used Cook's distance for
outlier estimation for “Indian University Clusters”, as shown in Figure 22.
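The comparison of linear and quadratic curves mentioned above can be sketched with an ordinary least-squares fit. This is a minimal illustration, not the authors' curve fitting tool: it solves the normal equations in plain Python, the helper names `polyfit` and `r_squared` are hypothetical, and the sample points are invented for the example.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns [c0, c1, ..., c_degree] for c0 + c1*x + ... + c_degree*x**degree.
    Assumes enough distinct x values that the system is not singular.
    """
    n = degree + 1
    # Augmented matrix [A | b] with A[i][j] = sum(x^(i+j)), b[i] = sum(y*x^i).
    matrix = []
    for i in range(n):
        row = [sum(x ** (i + j) for x in xs) for j in range(n)]
        row.append(sum(y * x ** i for x, y in zip(xs, ys)))
        matrix.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(matrix[r][col]))
        matrix[col], matrix[pivot] = matrix[pivot], matrix[col]
        pivot_value = matrix[col][col]
        matrix[col] = [v / pivot_value for v in matrix[col]]
        for r in range(n):
            if r != col:
                factor = matrix[r][col]
                matrix[r] = [a - factor * b
                             for a, b in zip(matrix[r], matrix[col])]
    return [matrix[i][n] for i in range(n)]

def r_squared(xs, ys, coeffs):
    """Coefficient of determination for a fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    predicted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical sample points, not taken from the paper's data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8, 11.2]
linear = polyfit(xs, ys, 1)
quadratic = polyfit(xs, ys, 2)
linear_r2 = r_squared(xs, ys, linear)
quad_r2 = r_squared(xs, ys, quadratic)
print("linear:", linear, "R^2 =", linear_r2)
print("quadratic:", quadratic, "R^2 =", quad_r2)
```

A quadratic fit can never have a lower R² than the linear fit on the same points, so a small gain in R² is a reasonable argument for stopping at degree two, as the paper does.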
Figure 22: Cook’s distance for outliers estimation for records
Records with a large Cook's distance are highly influential in the model computations, and such records can
distort the model accuracy.
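For a simple linear fit, Cook's distance for record i can be computed as D_i = (e_i^2 / (p * MSE)) * h_i / (1 - h_i)^2, where e_i is the residual, h_i the leverage, p the number of fitted parameters and MSE the mean squared error. The sketch below applies this formula in plain Python. It is an illustration only: the data are hypothetical, the function name `cooks_distances` is invented, and the 4/n cut-off is a common rule of thumb assumed here, not a threshold stated by the authors.

```python
def cooks_distances(xs, ys):
    """Cook's distance for each record of a simple linear fit y = a + b*x."""
    n = len(xs)
    p = 2  # fitted parameters: intercept and slope
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for x, e in zip(xs, residuals):
        leverage = 1.0 / n + (x - mean_x) ** 2 / sxx
        distances.append((e ** 2 / (p * mse)) * leverage / (1.0 - leverage) ** 2)
    return distances

# Hypothetical records; the last one is deliberately far from the trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.9, 20.0]
distances = cooks_distances(xs, ys)
threshold = 4.0 / len(xs)  # assumed rule-of-thumb cut-off
flagged = [i for i, d in enumerate(distances) if d > threshold]
print("Cook's distances:", [round(d, 3) for d in distances])
print("Influential record indices:", flagged)
```

Flagged records are candidates for review rather than automatic removal, since a record can be influential and still be valid.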
6. CONCLUSIONS
This paper mainly focused on the scope and limitations of “Indian University Clusters” with respect to e-Learning. We made
every effort to balance the mathematical and data mining models. The different tools used for specific
purposes are stated in Table 2. The clusters are formed and analyzed mainly on the basis of intervals and probabilities. The
network diagram is analyzed on the basis of all links and strong links, and it is observed that a Bi-cluster is suitable (Figure 13).
In the same way, the association rules and item sets are generated using different intervals and probabilities. The authors
neglected negative intervals for rules generation and considered only positive intervals (Figures 15, 16, 17, 18). The
accuracy of the system is approximately 80% (Figure 10). More emphasis on the mathematical model is necessary to improve
the accuracy of the system. The network topology governs communication between the various entities; this is mainly a
network-oriented phase and is out of scope from the database point of view.
References
[1] Srinatha Karur, Prof. M.V. Ramana Murthy, “Survey and Analysis of University Clustering”, IJAIA, Vol (4), No (4),
pp. 127-143, July 2013.
[2] Srinatha Karur, Prof. M.V. Ramana Murthy, “Local Clusters formation for Indian Universities”, IJAIA, Vol (4), No
(5), pp. 19-38, September 2013.
[3] Srinatha Karur, Prof. M.V. Ramana Murthy, “Data Preparation and Analysis for Andhra Pradesh Clusters”, IJSBAR,
Vol (7), No (1), pp. 4-16, 2013, ISSN 2307-4531.
[4] Srinatha Karur, Prof. M.V. Ramana Murthy, “Modelling and Data testing for Indian University Clusters”, IJES, Vol
(2), Issue (10), pp. 2985-2996, October 2013, ISSN 2319-7242.
[5] M. Srinatha Karur, Prof. M.V. Ramana Murthy, “E-Learning nodes estimation for NAAC Indian Universities”,
IJETTCS, Vol (2), Issue (6), pp. 143-159, November 2013, ISSN 2278-6856.
[6] Prof. Giuseppina Passiante, Dr. Giustina Secundo, “From geographical innovation clusters towards virtual innovation
clusters the Innovation Virtual System”, [Online] Available: http://www-sre.wu-wien.ac.at/ersa/ersaconfs/ersa02/cdrom/papers/270.pdf
[7] [Online] Available: http://www.ugc.ac.in/ugcpdf/740315_12FYP.pdf
[8] [Online] Available: http://www.edu-leaders.com/edu/news/39501/cluster-colleges-instead-deemed-university-ugc
[9] [Online] Available: http://www.eadi.org/fileadmin/WG_Documents/Reg_WG/vandijk1.pdf
[10] [Online] Available: http://www.davidwoon.com/researcher/files/wise01.pdf
AUTHORS
Mr. Srinatha Karur received his M.C.A. and M.Tech (IT) during 1994-97 and 2002-2004 respectively, and
is awaiting his Ph.D. certificate. He completed his M.Phil. at “Global Open
University”, Dimapur, Nagaland, India, in December 2009. He has 16+ years of continuous service after his
P.G. qualification. He has four international publications on data mining techniques and applications.
At present he is at Government Engineering College, Ibra, Sultanate of Oman, as IT faculty and Oracle
DBA. The author has experience in both technical and academic lines, and at present plans to develop e-Learning tools
with MS-SQL and Oracle DBA tools as a part of the Engineering stream curriculum. The author's areas of interest are
Operations Research, Numerical Methods (Theory), Operating Systems (Windows, UNIX flavours) and DBA (MS-SQL,
Oracle). At present he is working for Government Engineering College, Post box no: 327, Zip code: 400, Ibra,
Sultanate of Oman, as IT faculty and Oracle & MS-SQL DBA. The author's special interest is the modeling of required data in
terms of server and client based tools, and the comparative study of client and server versions of tools or software.
Prof. M. V. Ramana Murthy is a senior Professor and HoD of the Department of Mathematics & Computer
Science, Osmania University, Hyderabad, India. He is the author of works on core programming languages and of
international publications on different applications. His profile shows a firm grip on academic,
administrative and technical fields. At present the professor is on a foreign assignment. The professor has the
same grip on multiple subjects in Mathematics and Computer Science Engineering. He has a special interest in
core Mathematical Modelling for engineering applications, which is the heart of his research studies. The professor
has a strong grip on both Mathematical and Networks
modeling. The professor handles Neural Networks, Artificial Intelligence and Data Mining subjects at postgraduate
and research level. The professor guides research and postgraduate students in Mathematical and
Engineering modeling.