Group Members:
Vũ Nhật Linh 00557
Lê Quang Hoàn 00479
Nguyễn Duy Quyền 00485
Hoàng Nam
Nguyễn Thế Anh
Supervisor: Phan Trường Lâm
Ext Supervisor
Capstone Project code
Hanoi, 11th May 2011
Record of Changes
Date: 11/05/2011
Change Item: All
Description: Create the document
Author: LinhVN
Version: 0.1
Project name: Capstone Project Documents Management System
Project code: CProDM
Product type: Website Application
Timeline: from May 2011 to August 2011
This project is registered and implemented as the capstone project for the team members. Its first purpose is to fulfill the requirements of the FPT University study program. Its second purpose is to create a complete product that can go live.
Supervisor 1
Full name: Phan Trường Lâm
E-Mail: lampt@fpt.edu.vn
Team members:
Student 1: Vũ Nhật Linh, e-mail linhvn00557@fpt.edu.vn
Student 2: Lê Quang Hoàn
Student 3: Nguyễn Duy Quyền, phone 0947547789, e-mail quyennd00485@fpt.edu.vn
Student 4: Hoàng Nam, phone 01656082600, e-mail namh00194@fpt.edu.vn
Student 5: Nguyễn Thế Anh, phone 0984819575, e-mail anhnt00361@fpt.edu.vn
Today, with the development of the internet, storing and managing huge volumes of data is an urgent problem that is difficult to solve, especially in education. Many leading universities around the world have spent a lot of money researching and deploying data management systems and have achieved good results. In Vietnam, however, universities have not really focused on solving this problem. This has become a barrier to education quality at universities in Vietnam.
For example, each year every department of every university has hundreds of graduating students with dozens of projects, and lecturers cannot fully know the number or the content of all those documents. It is therefore hard to detect copying by students in later courses. Even when checks are made, looking the documents up takes a lot of time and effort. So: “Is there a system that helps lecturers save time and manage those documents?”
From this thought, we want to build a website on which lecturers and students can look up existing project documents and which helps detect cheating. At universities in general, and at FPT University in particular, students have to complete a capstone project at the end of their course if they want to get their certificate. They do these projects to show their skills and what they learned during the course, so the projects must be unique and done by the students themselves. The number of capstone projects keeps growing, but even with hundreds or thousands of projects, lecturers just need to log in to our system and, with “one click”, know whether there are similar projects.
Finally, last but not least, with this website we have the ambition to create a standard system that serves as a template and can be deployed at universities all over the country. We wish to bring practical benefits to education quality as well as to the knowledge economy in Vietnam, contributing to the development of the country.
Overview of similar existing solutions & existing methods
In the course of our study and research, we found that in Vietnam today systems of this type are rare, are not implemented in a formal way, or exist only as small, offline, ad hoc software. Around the world, there are already many plagiarism detection websites, such as turnitin.com.
These websites offer a relatively large amount of information, and they are widely accepted and frequently visited by their users.
Below are the existing methods that these websites use to build their systems:
Turnitin’s solution - OriginalityCheck™ Plagiarism Prevention
The OriginalityCheck plagiarism prevention service is recognized as the worldwide standard for preventing Internet plagiarism. It helps protect students’ original work from being used without citation by another person, and serves as a learning tool to help instructors and students better identify and correct unintentional plagiarism. This comprehensive plagiarism prevention system lets instructors quickly and effectively check students’ work in a fraction of the time necessary to scan a few suspect papers using a search engine, and it delivers more comprehensive results.
Big database
A plagiarism detection tool is only as effective as the database it searches, and Turnitin's enormous database provides the most extensive comparisons of all detection software. Turnitin currently compares papers to an Internet database including over 13.5 billion live and archived web pages, and to a publications database including articles from over 10,000 major newspapers, publications, and journals, as well as thousands of books. With this database and its digital pattern matching technology, Turnitin is the leading plagiarism detection tool in the world.
There are currently four types of repository:
• Internet repository - billions of active and archived web pages from the internet. Internet sources indicate a date of download on the Turnitin Originality Report if the match is not found on the most recent download of content from this site.
• Periodicals - a repository of frequently updated content from professional journals, periodicals, and publications.
• Student paper repository - a repository of papers previously submitted by Turnitin users.
• Institution paper repository - a collection of papers submitted to the institution’s repository.
Almost all of these websites have a search engine, which is either self-developed or provided by a third-party search service. These systems use optimized algorithms and modern techniques to search millions of documents.
Turnitin is constantly crawling web pages and adding new web pages and sites to their databases. All the web pages they crawl are indexed and added to the databases unless 1. They already have the data indexed or 2.
Urkund’s manual search: Urkund is equipped with a manual search engine to be used when digital documents cannot be sent through the system as usual. The search engine can also be used by those who prefer to upload documents rather than having them sent via e-mail.
Free text: This is the default setting when entering the page. Type in the text you wish to analyse in the search box (a minimum of 400 characters). The following day the result is sent to your inbox just like when submitting text attached in an e-mail.
File: Instead of sending a document via e-mail, this function can be used to upload a single file. The result will be processed and sent to your inbox the following day just like after a free text search.
Ranking and presentation of return results
These sites owe much of their success to a reasonable arrangement of results: items are sorted and displayed coherently and clearly, in a way that is easy to use and that focuses on the behavior and habits of users.
turnitin.com: The list of returned results is on the right side of the page, sorted by similarity percentage, with larger percentages placed in higher rows. The system uses colors to present similarity percentage values:
Figure 1.1: Displaying results of turnitin.com
Blue (no matching words)
Green (one matching word - 24% similarity index)
Yellow (25-49% similarity index)
Orange (50-74% similarity index)
Red (75-100% similarity index)
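The colour bands listed above, together with the descending sort described for turnitin.com, can be sketched in a few lines of Python. This is only an illustration, not Turnitin's implementation: the names `band_colour`, `rank_results`, and `Result` are hypothetical, and the sketch assumes whole-number percentages and treats "no matching words" as a separate flag.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One returned match (hypothetical structure, for illustration only)."""
    source: str
    similarity: int  # similarity index as a whole-number percentage, 0-100


def band_colour(similarity: int, has_matching_words: bool = True) -> str:
    """Map a similarity index to the colour bands listed above."""
    if not 0 <= similarity <= 100:
        raise ValueError("similarity must be between 0 and 100")
    if not has_matching_words:
        return "blue"    # no matching words
    if similarity <= 24:
        return "green"   # one matching word up to 24%
    if similarity <= 49:
        return "yellow"  # 25-49%
    if similarity <= 74:
        return "orange"  # 50-74%
    return "red"         # 75-100%


def rank_results(results: list[Result]) -> list[Result]:
    """Sort results so that larger percentages appear in higher rows."""
    return sorted(results, key=lambda r: r.similarity, reverse=True)


if __name__ == "__main__":
    sample = [Result("doc A", 12), Result("doc B", 80), Result("doc C", 50)]
    for r in rank_results(sample):
        print(r.source, r.similarity, band_colour(r.similarity))
```

A system like ours could reuse the same idea with its own thresholds; the band limits here are taken directly from the list above.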
With urkund.com: The colour indicates what percentage of the submitted document contains similarities to other sources. The colour scheme ranges from green to black, where green indicates “no matches” and black indicates “everything”.
Figure 1.2: Displaying results of urkund.com
Achievements of the existing systems
Easy to Read
Once Turnitin scans its databases for matching content, any matching text is highlighted in your paper.
Speed & Reliability
Turnitin scans its immense databases containing billions of pages of written work within seconds.
WriteCheck is based on Turnitin, which became the leader in plagiarism detection and prevention by comparing submitted papers against the largest and most reliable content bases, protecting users' privacy, providing quality customer support, and being completely web-based, flexible, and easy to use.
Turnitin serves over 10,000 educational institutions in 126 countries, including leading colleges and universities, high schools, distance learning and middle schools.
Quality Results
With the most extensive databases and advanced plagiarism checker technology, Turnitin supplies its users with trustworthy results of the highest quality.
Ensuring Privacy
Protecting the privacy and identity of end users should be the top priority of any information system. Turnitin is serious about protecting the security of its users' information. Turnitin achieves extremely high levels of security through the use of SSL encryption, redundant servers, sophisticated firewalls, offsite secure backups and much more. Turnitin complies with both the Better Business Bureau and Safe Harbor's privacy and security standards.
Students are made aware of the problem of plagiarism from a moral perspective, as this, together with discussion of citation techniques and correct source management, follows naturally from the implementation of URKUND. Such discussions are invaluable for making plagiarism a thing of the past.
Limitations of the existing systems
As far as we know, only universities or other institutions can subscribe to the Turnitin system. We do not think individuals can check their own papers for plagiarism, and we think it would be very expensive for a regular student.
If the Turnitin system is deployed across a whole university, the costs are not small. Currently, not many universities abroad use this system; most are world-famous schools, such as leading universities or private schools backed by large amounts of capital. In Vietnam, deploying it is currently very hard, although not out of reach.
The U.S. Family Educational Rights and Privacy Act (FERPA) prohibits disclosing confidential information about students to third parties without their or their families' permission.
Anything you write for school is automatically copyrighted; you do not need to apply for one. When you submit your work to Turnitin, Turnitin saves a copy to its database to compare against future submissions in order to stop students from plagiarizing other students’ work. Since you don’t know that Turnitin is saving a copy of your paper, and it doesn’t ask for or acquire your agreement, it’s a potential violation of your rights. Saving a copy of your paper is copyright infringement because Turnitin is using your paper for its own economic gain without compensating you. It is probably the reason there are already multiple lawsuits filed against Turnitin.
The copyright problem forces the company to answer many questions and leads to disputes between students and the company, and between universities and the company. Many cases have become large lawsuits that stir up public opinion. This has also caused confusion over legal issues, created barriers to deploying Turnitin widely in universities, and provoked reactions from students.
Relationship between Students – Teachers
These systems create an atmosphere of distrust between students and teachers.
When teachers use such a system, they send their students the message “We don’t trust you.”
Their verdicts are of questionable reliability. They also hurt students who incorporate direct quotes into their papers, since they have no capacity to recognize citations. At the end of a check, these systems return the result as a percentage plagiarized, just a number.
If a teacher just looks at the percentage and does not examine where it came from (the system marks each section), students who used direct quotes are accused of cheating when they are actually innocent. Using these detection tools does not teach students that plagiarism is wrong; it just encourages cheaters to find another way to beat the system.
Because no website like this has been used at FPT University before, and because managing Capstone Project documents is important, we consider this an opportunity to develop a solution for the University.
The idea is simple but highly feasible. In particular, FPT University currently has no tool to manage these documents, so we think our program will be welcome.
By implementing a Capstone Project Documents Management website for FPT University, our product will have the following advantages:
Support lecturers and students of FPT University in looking up capstone project documents.
Help students avoid repeating and copying ideas for their capstone projects.
Compare project documents to detect cheating.
Our product will be a website. In the first version, we intend to deploy it at FPT University with real data taken from the capstone projects of graduated students. Our users are FU lecturers who are directly involved in teaching specialized subjects and FU students who are preparing to do their capstone projects.
In the future, we hope that our system can be deployed at universities all over the country and that its data will include not only capstone project documents but also assignments, workshop documents, and so on.
The product should include two main functions:
- Search documents by keyword (for lecturers and students)
- Compare the content of documents to estimate similarities (for lecturers).
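The two functions above could be prototyped roughly as follows. This is a minimal sketch under stated assumptions, not the design of the final system: it assumes documents are already available as plain text, the function names `search_by_keyword` and `estimate_similarity` are hypothetical, and the Jaccard index over three-word shingles is just one common way to estimate similarity, chosen here for illustration.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def search_by_keyword(documents: dict[str, str], keyword: str) -> list[str]:
    """Return the titles of documents whose text contains the keyword."""
    wanted = keyword.lower()
    return sorted(
        title for title, text in documents.items() if wanted in tokenize(text)
    )


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Build the set of overlapping word sequences of the given size."""
    words = tokenize(text)
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def estimate_similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Estimate similarity as a percentage using the Jaccard index
    (shared shingles divided by all distinct shingles)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return 100.0 * len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    docs = {
        "Project A": "A library management system for universities",
        "Project B": "An online shop for handmade goods",
    }
    print(search_by_keyword(docs, "library"))
    print(estimate_similarity("the quick brown fox jumps",
                              "the quick brown fox sleeps"))
```

Word shingles make the comparison sensitive to copied phrases rather than to shared vocabulary alone; a real deployment would also need to extract text from the uploaded document formats before comparing them.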
Turnitin’s website: http://turnitin.com
Writecheck’s website: http://www.writecheck.com
Wikipedia about Turnitin: http://en.wikipedia.org/wiki/Turnitin
Urkund.com’s website: http://www.urkund.com