- Krest Technology

Effective Pattern Discovery for Text Mining

Aim: Text mining is the discovery of interesting knowledge in text documents.

Abstract: Text data has gradually become a new line of investigation in text mining. Text clustering can greatly simplify browsing large collections of documents by reorganizing them into a smaller number of manageable clusters. Here, text clustering is used in a document clustering system that clusters a set of documents based on the key terms typed by the user. First, the system preprocesses the set of documents and the user-given terms. Feature evaluation is used to reduce the dimensionality of the high-dimensional text vectors. The system then computes the term frequencies and weights them using the inverse document frequency method. The resulting document weights are then used for clustering.

Feature clustering is a powerful method to reduce the dimensionality of feature vectors for text classification. This work presents an innovative and effective pattern discovery technique that includes the processes of pattern deploying and pattern evolving. In this paper, we propose a fuzzy similarity-based self-constructing algorithm for feature clustering. The words in the feature vector of a document set are grouped into clusters based on a similarity test, so that words that are similar to each other fall into the same cluster. Each cluster is characterized by a membership function with a statistical mean and deviation. When all the words have been fed in, a desired number of clusters is formed automatically. We then have one extracted feature for each cluster; the extracted feature is a weighted combination of the words contained in that cluster. Experimental results show that, when applied to text clustering, our method makes the clustering results more efficient, accurate, and stable than those of the existing algorithm.
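The weighting step and the cluster membership function described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it is written in Python for brevity (the project itself specifies Java), the function names `tf_idf` and `membership` are hypothetical, the weight formula tf × log(N/df) is one common TF-IDF variant, and the Gaussian form of the membership function is an assumption, since the abstract only says that each cluster has a mean and a deviation.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Weight each term of each document by term frequency times
    inverse document frequency (assumed variant: tf * log(N / df)).

    docs: list of token lists. Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def membership(x, mean, dev):
    """Assumed Gaussian membership of vector x in a cluster that is
    described by a per-dimension mean and deviation."""
    return math.prod(
        math.exp(-(((xi - mi) / di) ** 2))
        for xi, mi, di in zip(x, mean, dev)
    )


docs = [["text", "mining", "text"], ["text", "clustering"], ["pattern", "mining"]]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

With this variant, a term that occurs in every document gets weight 0, so it contributes nothing to the clustering. In a self-constructing scheme, a word would typically join the cluster with the highest membership value if that value passes a threshold, and start a new cluster otherwise; that threshold is not specified in the abstract.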
Existing System:
1) Existing text clustering uses frequent word sets to cluster the documents.
2) Many well-known clustering algorithms treat documents as bags of words and ignore important relationships between words, such as synonymy.
3) The existing algorithm has a higher probability of grouping unrelated documents into the same cluster.

Proposed System:
1) The proposed text clustering uses frequent concepts to cluster the text documents.
2) The proposed technique uses two processes, pattern deploying and pattern evolving, to refine the discovered patterns in text documents.
3) The proposed algorithm uses the semantic relationships between words to create concepts.
4) Relationships between words such as synonymy and hypernymy are also identified; hypernymy is the most effective for text clustering.
5) Associating a meaningful label with each final cluster is essential. The high dimensionality of the text documents should then be reduced.
6) The clustering algorithm works with frequent concepts rather than the frequent items used in traditional text mining techniques.
7) FCDC was found to be more accurate, scalable, and effective than existing text clustering algorithms.

Modules:-
1) Registration
2) User
3) Authentication

SYSTEM REQUIREMENT:

Hardware Requirements
• System : Pentium IV 2.4 GHz
• Hard disk : 40 GB
• Monitor : 15 VGA colour
• Mouse : Logitech
• RAM : 256 MB
• Keyboard : 110 keys enhanced

Software Requirements
• Operating System : Windows
• Programming language : Java
• Front-End : Swing, AWT
• Back-End : MySQL Server
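The idea in the Proposed System of clustering on frequent concepts rather than frequent words can be illustrated with a small sketch. It rests on several assumptions: the code is Python for brevity (the project specifies Java), the synonym table `CONCEPT_OF` is a hypothetical hand-written stand-in for a real lexical resource, and `to_concepts`, `frequent_concepts`, and the `min_support` threshold are illustrative names that do not come from the original.

```python
from collections import Counter

# Hypothetical synonym table mapping words to a shared concept label.
# A real system would derive this from a lexical resource.
CONCEPT_OF = {
    "car": "vehicle",
    "automobile": "vehicle",
    "auto": "vehicle",
    "film": "movie",
    "picture": "movie",
}


def to_concepts(tokens):
    """Replace each word by its concept label, if one is known."""
    return [CONCEPT_OF.get(token, token) for token in tokens]


def frequent_concepts(docs, min_support):
    """Return the concepts that occur in at least min_support
    (a fraction between 0 and 1) of the documents."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        # Count each concept once per document.
        doc_freq.update(set(to_concepts(tokens)))
    return {c for c, k in doc_freq.items() if k / n_docs >= min_support}


docs = [["car", "film"], ["automobile", "review"], ["auto", "picture"]]
print(frequent_concepts(docs, 0.6))
```

In this toy collection no single word appears in more than one document, so a frequent-word approach finds nothing in common, while the concept mapping shows that all three documents mention a vehicle.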
