Konkani Wordnet Development

Punjabi WordNet Development
Thapar University & Punjabi University
Patiala
Presentation Outline
• Status of number of synsets completed
• Database Development Process
• Demonstration of Punjabi WordNet Site
Status of work completed (Dec 2011)
S. No.  Record File                        Total Synsets             Completed by Thapar University  Completed by Punjabi University
1.      Pan Indian Record                  1347 (All Completed)      674                             673
2.      Universal Synsets (7168 records)   7168 (All Completed)      3084                            4084
3.      Adverb Synsets                     209 (All Completed)       104                             105
4.      Verb Synsets                       1798 (Completed 1679)     976/991                         703/807
5.      Adjective Synsets                  3605 (Completed 3574)     1803                            1771/1802
6.      Remaining Noun Record File         22050 (Completed 2988)    1190/11025                      1798/11025
Values written as a/b are Completed/Assigned.
Pan Indian Synsets
Total Number of Synsets : 1347
Total Synsets Completed by Thapar University : 674
Total Synsets Completed by Punjabi University : 673
File Status : Completed
Universal Synsets
Total Number of Synsets : 7168
Total Synsets Completed by Thapar University : 3084
Total Synsets Completed by Punjabi University : 4084
File Status : Completed
Adverb Synsets File
Total Number of Synsets : 209
Total Synsets Completed by Thapar University : 104
Total Synsets Completed by Punjabi University : 105
File Status : Completed
Verb Synsets
Total Number of Synsets : 1798
Completed : 1679
Total Synsets Completed by Thapar University : 976/991
(Completed/Assigned)
Total Synsets Completed by Punjabi University : 703/807
(Completed/Assigned)
File Status : Ongoing
Adjective Synsets
Total Number of Synsets : 3605
Completed : 3574
Total Synsets Completed by Thapar University : 1803
Total Synsets Completed by Punjabi University : 1771/1802
(Completed/Assigned)
File Status : Ongoing
Remaining Noun Synsets
Total Number of Synsets : 22050
Completed : 2988
Total Synsets Completed by Thapar University : 1190/11025
(Completed/Assigned)
Total Synsets Completed by Punjabi University : 1798/11025
(Completed/Assigned)
File Status : Ongoing
Total Synsets Completed: 16965 (by Thapar University and Punjabi University combined).
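The combined total can be cross-checked against the per-file figures on the preceding slides. The short sketch below only re-adds the numbers reported above; the dictionary layout and variable names are illustrative, not part of the project.

```python
# Completed synsets per file as (Thapar University, Punjabi University),
# copied from the status slides above.
completed = {
    "Pan Indian": (674, 673),
    "Universal": (3084, 4084),
    "Adverb": (104, 105),
    "Verb": (976, 703),
    "Adjective": (1803, 1771),
    "Remaining Noun": (1190, 1798),
}

thapar_total = sum(t for t, _ in completed.values())
punjabi_total = sum(p for _, p in completed.values())
total_completed = thapar_total + punjabi_total

print(thapar_total, punjabi_total, total_completed)  # 7831 9134 16965
```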
Synset IDs that were transliterated
during the creation process
9629, 10024, 11063, 12113, 13954,
14168, 14256, 14384, 14602
Database Creation Process
Issues in the tool sent by Goa University.
Sample Synset file uploaded on
the tool
Output Snapshot of the tool
Data in different tables
wn_word
wn_synset_words
Data in different tables
Table: wn_synset
Table: wn_synset_example
Approach followed in creation
of database
Port the whole synset file into a table with the
following structure:
*Synset_id
*Synset
*Gloss: stores both concept and examples
*Category
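As a rough illustration, the staging table could look like the sketch below. The table name tbl_all_punjabi_synset_data and the four column names come from these slides; the column types, the use of SQLite through Python's sqlite3 module, and the sample row are assumptions made only to keep the example self-contained.

```python
import sqlite3

# In-memory SQLite stands in for the project's real database (assumption).
conn = sqlite3.connect(":memory:")

# Staging table: one row per synset in the synset file.
# Column names follow the slide; the types are assumptions.
conn.execute(
    """
    CREATE TABLE tbl_all_punjabi_synset_data (
        synset_id INTEGER PRIMARY KEY,
        synset    TEXT NOT NULL,  -- comma-separated member words
        gloss     TEXT,           -- concept and examples together
        category  TEXT            -- part of speech of the synset
    )
    """
)

# Hypothetical sample row, for illustration only.
conn.execute(
    "INSERT INTO tbl_all_punjabi_synset_data VALUES (?, ?, ?, ?)",
    (1, "word_a, word_b", "sample concept", "noun"),
)
conn.commit()

staged_rows = conn.execute(
    "SELECT synset_id, synset, gloss, category FROM tbl_all_punjabi_synset_data"
).fetchall()
print(staged_rows)
```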
Table Snapshot
Logic for Insertion of Data into different tables
To insert data into wn_word and wn_synset_words
We select the synset_id and synset fields from the tbl_all_punjabi_synset_data table.
For each row, a tokenizer splits the synset value on commas (,) into its individual
words. Before inserting a word, we check whether it already exists in the database;
if it does not, the query inserts it. During insertion the query also stores the
priority of each word, found by counting the words under the same synset_id: the
first word of the synset_id gets priority one, the next word gets priority two,
and so on.
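The insertion logic described above can be sketched as follows. The table names wn_word and wn_synset_words are from the slides; the column names (word_id, word, priority), the SQLite backend, and the helper insert_synset_words are hypothetical choices for illustration, not the project's actual code.

```python
import sqlite3

def insert_synset_words(conn, synset_id, synset_text):
    """Split a comma-separated synset and store its words with priorities."""
    words = [w.strip() for w in synset_text.split(",") if w.strip()]
    for priority, word in enumerate(words, start=1):
        # Insert the word only if it is not already in wn_word.
        row = conn.execute(
            "SELECT word_id FROM wn_word WHERE word = ?", (word,)
        ).fetchone()
        if row is None:
            cur = conn.execute("INSERT INTO wn_word (word) VALUES (?)", (word,))
            word_id = cur.lastrowid
        else:
            word_id = row[0]
        # Priority is the word's position within this synset, starting at 1.
        conn.execute(
            "INSERT INTO wn_synset_words (synset_id, word_id, priority) "
            "VALUES (?, ?, ?)",
            (synset_id, word_id, priority),
        )

# Assumed minimal schemas; the real tables may have more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wn_word (word_id INTEGER PRIMARY KEY, word TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE wn_synset_words (synset_id INTEGER, word_id INTEGER, priority INTEGER)"
)

# Placeholder words; "beta" appears in both synsets and is stored once.
insert_synset_words(conn, 101, "alpha, beta, gamma")
insert_synset_words(conn, 102, "beta, delta")
conn.commit()

word_count = conn.execute("SELECT COUNT(*) FROM wn_word").fetchone()[0]
links = conn.execute(
    "SELECT synset_id, word_id, priority FROM wn_synset_words "
    "ORDER BY synset_id, priority"
).fetchall()
print(word_count, links)
```

Checking for the word before inserting keeps wn_word free of duplicates, while wn_synset_words still records every synset a shared word belongs to.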
To insert data in wn_synset and wn_synset_example
We select the synset_id and gloss fields from the tbl_all_punjabi_synset_data table
and insert the examples into wn_synset_example and the gloss into wn_synset. If a
synset_id has two or more examples or concepts, a tokenizer splits them on the
separator (/) and each value is inserted into its table under the same synset_id.
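A minimal sketch of this step is given below. The slides do not say how the concept part and the example part are told apart inside the Gloss field, so the sketch assumes they have already been separated and only shows the "/" split and the inserts. The column names (gloss, example), the SQLite backend, and the helpers split_values and insert_gloss are hypothetical.

```python
import sqlite3

def split_values(text):
    """Split a '/'-separated string into trimmed, non-empty values."""
    return [part.strip() for part in text.split("/") if part.strip()]

def insert_gloss(conn, synset_id, concept_text, example_text):
    """Store every concept and example under the same synset_id."""
    for concept in split_values(concept_text):
        conn.execute(
            "INSERT INTO wn_synset (synset_id, gloss) VALUES (?, ?)",
            (synset_id, concept),
        )
    for example in split_values(example_text):
        conn.execute(
            "INSERT INTO wn_synset_example (synset_id, example) VALUES (?, ?)",
            (synset_id, example),
        )

# Assumed minimal schemas; the real tables may have more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wn_synset (synset_id INTEGER, gloss TEXT)")
conn.execute("CREATE TABLE wn_synset_example (synset_id INTEGER, example TEXT)")

# Placeholder text: one concept and two examples for the same synset.
insert_gloss(conn, 101, "sample concept", "first example / second example")
conn.commit()

concepts = conn.execute(
    "SELECT synset_id, gloss FROM wn_synset ORDER BY rowid"
).fetchall()
examples = conn.execute(
    "SELECT synset_id, example FROM wn_synset_example ORDER BY rowid"
).fetchall()
print(concepts, examples)
```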
Snapshot of wn_word after
insertion of data
Snapshot of wn_synset_words after
insertion of data
Snapshot of wn_synset after insertion
of data
Snapshot of wn_synset_example after
insertion of data
Punjabi WordNet Demonstration
Site Address:
http://125.19.69.26:8080/PunjabiWordNet/
Available over Internet
Snapshot of Punjabi WordNet Website
Thanks