CA SBA REPORT
Wong Hei Wai
7A (26)
SCHOOL-BASED
ASSESSMENT REPORT
[2011-2012]
Yan Chai Hospital Lim Por Yen Secondary School
7A Wong Hei Wai (26)
HONG KONG EXAMINATION AND ASSESSMENT AUTHORITY
HONG KONG ADVANCED LEVEL EXAMINATION
AS COMPUTER APPLICATIONS
[ DISCUSSION FORUM SYSTEM ]
~CONTENT~
1. Objective and Analysis
2. Design
3. Implementation
4. Testing and Evaluation
5. Conclusion and Discussion
6. Documentation
1. Objective and Analysis
1.1 Background
Database systems are widely used nowadays. A database is an organized
collection of data for one or more purposes, usually in digital form. The data
are typically organized to model relevant aspects of reality in a way that
supports processes requiring this information. A general-purpose database
management system (DBMS) is typically a complex software system that meets
many usage requirements, and the databases it maintains are often large and
complex. Databases are now so widespread that virtually every technology and
product relies on databases and DBMSs for its development and
commercialization. For example, companies use databases to hold large amounts
of data, such as customer information, staff information and various
statistics. Databases are equally important on the Internet, since many
websites, such as public discussion forums, need them to store their data.
Public discussion forums are popular on the Internet, where there are
thousands of them. Some of the most famous are Uwants.com, Little Soldier
Forum and Hong Kong Fail Forum.
In a public discussion forum, the database is an essential structure: it
stores user information and forum posts. Each post carries essential
information such as the user identifier, IP address and time of the post.
This information should be stored for security and user behavior analysis
purposes.
In this SBA project, I was asked to design the database structure of a public
discussion forum.
The system will generate the following statistics:
• posting statistics
• user statistics
• online traffic analysis
• posting habits analysis
1.2 Analysis
All public discussion forums exist to let internet users express their views
and discuss with one another. They also share similar functions built on the
database system, such as online status statistics, user information search
and news search. These functions help administrators and users look up
different information about the forum. During the summer vacation, I
researched two discussion forums, the Uwants forum and the TVB forum, and
compared their functions. Here are some examples.
1.2.1 Online and posting statistics
According to fig.1, a public discussion forum can record the number of users
who are currently online, as well as the maximum number of users who have
been online at the same time.
Besides, it records the total number of posts in the forum and the number of
posts made on that day.
fig. 1
1.2.2 User information
According to fig.2, users can look up their account information in the public
discussion forum, such as their user ID, their identity group, their
registration date and their last posting record.
fig.2
1.2.3 Posting record
The forum keeps records of the posts made by a specific member. To attract
more users to register, some administrators creatively use new names such as
“accumulating marks” or “internet cash” instead of the traditional ones.
fig.3
1.2.4 Searching function
Users can use a search function in a discussion forum. A forum holds
thousands and thousands of posts, so if users want to read a specific news
item, it is extremely difficult to look through the posts one by one. The
main purpose of the search engine is therefore to let users find particular
posts. Users can search by keywords in the post, the user name, the posting
time, the theme of the post and the related region. Fig.4 is an example.
fig.4
1.2.5 Forum layout
The database system can divide all the posts into different categories. The
following screenshot is from the Uwants forum. In it, the database system
creates an index of the different categories, such as News, Food, Travel,
Comics, Sports, Fashion, etc. The index lets users find the information they
are interested in conveniently and makes the layout of the forum more orderly.
fig.5
The screenshot below is from another forum, the TVB forum. It adopts a simpler
layout structure built on its database system. TVB is a company that offers
different TV programmes, so it divides the information in the forum into
Dramas, News, Life, Entertainment and so on.
fig.6
1.2.6 Voting function
Besides posting news, discussion forums also allow users to vote in posts. A
bar chart is created to show the number of votes for each option. The
following screenshot shows a voting post in the TVB forum. The statistics
show 3919 votes in total, which means that many users have read the post and
voted.
Fig.7
Database software
There are different database software packages on the market nowadays, for
example FoxPro, Microsoft Access, MySQL and Oracle. Although they are all
well known and have many users, there are still differences between them. The
following table compares the data size limits and capabilities of some
database software.
Database software          FoxPro      Access      MySQL       Oracle
Max DB size                Unlimited   2 GB        Unlimited   Unlimited
Max table size             2 GB        2 GB        256 TB      4 GB
Max char size              16 MB       255 B       64 KB       4000 B
Max number size            32 bits     32 bits     64 bits     126 bits
Max column name size       64          /           64          30
Merge join                 No          No          No          Yes
Windowing functions        No          No          No          Yes
Common table expressions   No          No          No          Yes
According to this comparison, Oracle offers the most advanced features: it is
the only one of the four that supports merge join, windowing functions and
common table expressions. However, it is also much more expensive. The Oracle
Database is a product of Oracle Corporation and is often chosen as the
database software in big companies. This SBA project does not need such
expensive software, and the school already has Microsoft Access, so I decided
to use Microsoft Access to design the database.
2. Design
The first thing we have to do is design the structure of the database
system. We can use an Entity Relationship Diagram (ERD) to show the
structure clearly. The ERD is shown below:
Fig.8
In the above ERD, the relations between entities are clearly shown.
Rectangles represent entities, also called records. An entity is a
representation of any composite information about a real or an abstract
object. Ovals represent attributes, also called fields. Rhombuses represent
the relationship between two entities; each relationship is unique and cannot
be null. There may be more than one relationship between two (or more)
entities. Cardinality information can be divided into two types: minimum
cardinality and maximum cardinality. In the ERD, 0, 1 and M show the
cardinality and existence of a relationship. 0 means the existence of the
entity in the relationship is optional. 1 means that the entity must take
part in the relationship at least once and at most once. M means that the
entity can take part in the relationship many times. There are three
entities in total (members, news and category).
In the above ERD, users can read news, post news, reply to news or delete
news. The following describes the five relationships.
• One news item can be read by no users or by many users, and one user can
read no news or many news items, so the relation between users and the news
they read is Many to Many.
• One user can post no news or many news items, but one news item can be
posted by only one user, so the relation between users and the news they
post is One to Many.
• One news item can be replied to by no users or by many users, and one user
can reply to no news or many news items, so the relation between users and
the news they reply to is Many to Many.
• One user can delete no news or many news items, but one news item can be
deleted by only one user, so the relation between users and the news they
delete is One to Many.
• One category can contain no news or many news items, but one news item can
belong to only one category, so the relation between category and news is
One to Many.
After finishing the ERD, we need to convert it into a database schema. The
ERD is the result of data analysis, but it cannot directly form a table
structure. A database schema is the structure of the database described in a
formal language supported by the database management system (DBMS); it refers
to the organization of data and serves as a blueprint of how the database
will be constructed.
When converting the ERD to a schema, we need to follow some rules.
• For a 1:1 cardinality relationship, all the attributes of the related
entities are grouped into a single table.
• For a 1:M cardinality relationship, model each of the related entities in
a separate table and post the primary key of the “one” side entity as a
foreign key attribute in the table that represents the “many” side entity.
• For an M:M cardinality relationship, model each of the related entities
in a separate table and create a new table (referred to as the intersection
table), posting the primary key of each entity type as an attribute in the
new table. If the relationship has its own attributes, those attributes are
stored in the intersection table too. The
primary key of the intersection table is a composite key which
includes the primary key of each concerned entity type.
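The 1:M and M:M rules above can be sketched in code. The following is only an illustration, assuming Python's built-in sqlite3 module in place of Microsoft Access; the cut-down tables, the column types and the helper try_insert are assumptions made for the example, not part of the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
-- 1:M rule: the primary key of the "one" side (category) is posted
-- into the table of the "many" side (news) as a foreign key.
CREATE TABLE category (category_name TEXT PRIMARY KEY NOT NULL);
CREATE TABLE news (
    news_id       TEXT PRIMARY KEY NOT NULL,
    category_name TEXT NOT NULL REFERENCES category(category_name)
);
-- M:M rule: an intersection table holds the primary keys of both sides,
-- and its own primary key is the composite of the two.
CREATE TABLE member (user_id TEXT PRIMARY KEY NOT NULL);
CREATE TABLE reply (
    user_id   TEXT NOT NULL REFERENCES member(user_id),
    news_id   TEXT NOT NULL REFERENCES news(news_id),
    date_time TEXT,  -- an attribute of the relationship itself
    PRIMARY KEY (user_id, news_id)
);
""")

conn.execute("INSERT INTO category VALUES ('sport')")
conn.execute("INSERT INTO news VALUES ('n0001', 'sport')")
conn.execute("INSERT INTO member VALUES ('u0001')")
conn.execute("INSERT INTO reply VALUES ('u0001', 'n0001', '2012-03-21 10:00')")


def try_insert(sql):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same (user_id, news_id) pair again: rejected by the composite primary key.
duplicate_ok = try_insert("INSERT INTO reply VALUES ('u0001', 'n0001', '2012-03-22 09:00')")
# News in a category that does not exist: rejected by the foreign key.
orphan_ok = try_insert("INSERT INTO news VALUES ('n0002', 'no_such_category')")
print(duplicate_ok, orphan_ok)  # False False
```

One consequence of the composite key in this sketch is that the same user can reply to the same news item only once; a forum that allows repeated replies would need a different key.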
Besides those rules, we also need to normalize the tables in the schema.
Normalization is a database design technique based on analyzing the
relations among key and non-key attributes of database tables. Its main
purpose is to minimize data redundancy and anomalies. This project uses three
normal forms: the First Normal Form, the Second Normal Form and the Third
Normal Form. A table in First Normal Form has no repeating fields. A table in
Second Normal Form is in First Normal Form and exhibits no partial
dependencies, i.e. every non-key attribute is fully functionally dependent on
the primary key of the table. A table in Third Normal Form is in Second
Normal Form and exhibits no transitive dependencies. The resulting schema is
shown below:
Member ( user_ID, user_name, email, sex, birthday, password, start_date,
online)
News ( news_ID, IP, user_name, date_time, category_name, user_ID)
Category ( category_name, description, administrator, start_date)
Posting ( user_ID, news_ID, date_time)
Reply ( user_ID, news_ID, date_time)
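To make the schema concrete, here is a rough sketch of how it might be written as table definitions. It assumes Python's sqlite3 module rather than Access. The column types, the choice of primary keys (the first field of each table, and both ID fields in Posting and Reply) and the UNIQUE constraint on user_name are assumptions, since the schema above lists only the field names.

```python
import sqlite3

# Field names follow the schema in the report; types and keys are assumed.
SCHEMA = """
CREATE TABLE Member (
    user_ID    TEXT PRIMARY KEY NOT NULL,
    user_name  TEXT NOT NULL UNIQUE,
    email      TEXT,
    sex        TEXT,
    birthday   TEXT,
    password   TEXT,
    start_date TEXT,
    online     INTEGER NOT NULL DEFAULT 0  -- 1 = online, 0 = offline
);
CREATE TABLE Category (
    category_name TEXT PRIMARY KEY NOT NULL,
    description   TEXT,
    administrator TEXT,
    start_date    TEXT
);
CREATE TABLE News (
    news_ID       TEXT PRIMARY KEY NOT NULL,
    IP            TEXT,
    user_name     TEXT,
    date_time     TEXT,
    category_name TEXT REFERENCES Category(category_name),
    user_ID       TEXT REFERENCES Member(user_ID)
);
CREATE TABLE Posting (
    user_ID   TEXT NOT NULL REFERENCES Member(user_ID),
    news_ID   TEXT NOT NULL REFERENCES News(news_ID),
    date_time TEXT,
    PRIMARY KEY (user_ID, news_ID)
);
CREATE TABLE Reply (
    user_ID   TEXT NOT NULL REFERENCES Member(user_ID),
    news_ID   TEXT NOT NULL REFERENCES News(news_ID),
    date_time TEXT,
    PRIMARY KEY (user_ID, news_ID)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that now exist in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['Category', 'Member', 'News', 'Posting', 'Reply']
```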
3. Implementation
After designing the database, we need to create the tables in Microsoft
Access.
Click the function highlighted with the red circle above to create a new
table.
After clicking the button, we see the following table, in which we type each
field's name, type and description.
In the following table, first type the field name into the first cell. Then,
in the second cell, choose the type of the field, for example character,
memo, integer or date. We need to choose a suitable field type for further
processing. In the last cell, we can type a description of the field, but
this is optional.
After creating all the fields in the table, we need to set the key field,
which is the primary key of the table in the schema.
To set the key field, right-click in the side bar of the setting page and
choose the key field option, which is the first button. That field then
becomes the key field of the table. The other tables in the schema follow the
same steps to
set up all the tables in the schema. A table has only one primary key, but
the key can be made up of more than one field (a composite key). After doing
that, all the tables are shown on the main page of the database as follows.
They are used for storing the data of the discussion forum.
fig.9
Two methods can be used to insert records into the tables. The first is
typing all the records into the table with SQL or the Access insert function.
That is quite inefficient, so I chose the other method: importing the
information from another source, such as a spreadsheet.
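The idea of the second method, loading many rows at once from a spreadsheet-like source instead of typing them one by one, can be illustrated outside Access as well. The sketch below assumes Python's csv and sqlite3 modules; the three sample rows and the reduced Member table are made up for the example, and an in-memory string stands in for an exported spreadsheet file.

```python
import csv
import io
import sqlite3

# Stand-in for a spreadsheet exported as CSV; the rows are made-up sample data.
SHEET = io.StringIO(
    "user_ID,user_name,sex\n"
    "u0001,a,F\n"
    "u0002,b,M\n"
    "u0003,c,F\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Member (user_ID TEXT PRIMARY KEY NOT NULL, user_name TEXT, sex TEXT)"
)

# The column headings of the sheet must match the fields of the table.
reader = csv.DictReader(SHEET)
rows = [(r["user_ID"], r["user_name"], r["sex"]) for r in reader]

# One call inserts every row read from the sheet.
conn.executemany("INSERT INTO Member VALUES (?, ?, ?)", rows)
conn.commit()

imported = conn.execute("SELECT COUNT(*) FROM Member").fetchone()[0]
print(imported)  # 3
```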
The steps for inserting records from another source are shown as follows.
First, you need a file that stores all your records in the same structure as
the table. Let me use the table Member as an example.
Then you can use the import function in Access to insert all the data into
the table. The steps are shown below.
First click the File button at the top of Access and choose the option
marked with the red circle. Another box will appear, in which we choose
Insert. A window then appears for the following steps.
Next, we choose the file that stores the records matching the table. In this
case, we choose student.xls. After choosing the right file, click the
Insert(M) button to continue the process. Another window then appears, as
shown below:
In these two steps, we just need to click the button for the next step,
because there are no settings to change in this part.
After clicking next in the following box, we need to choose the table into
which the information will be inserted.
In this part, we choose the table into which the records will be inserted;
in this example we choose the table Member. After choosing the table, click
next again. Lastly, click finish, and the copying of the records into the
table is complete. Here is the table with the copied records.
fig.10
This is the Member table, which contains all the records of the members: user
ID, user name, email, sex, birthday, password, the date they started using the
forum and their online status. I inserted some random data; there are 15
members in the forum, each with different personal information.
fig.11
This is the News table, which contains all the records of news: news ID, user
name, posting date and time, category name and user ID. There are 10 news
items posted by 10 different users in the forum, and they belong to 5
categories.
fig.12
This is the Category table, which contains all the records of the existing
categories: category name, description of the category, the administrator
and the date the category was created.
fig.13
fig.14
These are the Posting table and the Reply table, which contain the user ID,
the news ID and the date of posting or replying. I assume that in the
discussion forum there are 10 news items posted by 10 different users, but
only 4 replies.
After copying all the records into the tables, I need to write some SQL to
generate statistics and carry out analysis afterwards.
Posting Statistics
fig.15
The above is the SQL that shows the number of posts made by each member of
the discussion forum. It uses the GROUP BY clause, which groups rows having
common values into a smaller set of rows.
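The SQL itself appears only in the figure, so the following is a guess at what such a GROUP BY query could look like, not the exact statement used. It assumes Python's sqlite3 module, a reduced News table and three made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE News (news_ID TEXT PRIMARY KEY, user_ID TEXT, category_name TEXT)")
conn.executemany("INSERT INTO News VALUES (?, ?, ?)", [
    ("n0001", "u0001", "sport"),   # made-up sample data
    ("n0002", "u0002", "food"),
    ("n0003", "u0001", "travel"),
])

# GROUP BY collapses all rows with the same user_ID into one row,
# and COUNT(*) counts how many rows went into each group.
posts_per_member = conn.execute("""
    SELECT user_ID, COUNT(*) AS number_of_posts
    FROM News
    GROUP BY user_ID
    ORDER BY user_ID
""").fetchall()
print(posts_per_member)  # [('u0001', 2), ('u0002', 1)]
```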
fig.16
The above is another SQL statement showing the posting record: the date of
posting, the category the news belongs to and the total number of news items
for that day and category.
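A query of this kind groups on two values at once. The sketch below is an assumption about its general shape, written with Python's sqlite3 module; the date() function used to cut the time off date_time is SQLite's, and the four sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE News (news_ID TEXT PRIMARY KEY, date_time TEXT, category_name TEXT)")
conn.executemany("INSERT INTO News VALUES (?, ?, ?)", [
    ("n0001", "2012-03-21 09:00", "sport"),  # made-up sample data
    ("n0002", "2012-03-21 15:30", "sport"),
    ("n0003", "2012-03-21 18:00", "food"),
    ("n0004", "2012-03-22 08:00", "food"),
])

# Grouping by both the day and the category gives one row per
# (day, category) pair, with the number of news items in that pair.
posts_by_day_and_category = conn.execute("""
    SELECT date(date_time) AS post_date, category_name, COUNT(*) AS total_news
    FROM News
    GROUP BY date(date_time), category_name
    ORDER BY post_date, category_name
""").fetchall()

for row in posts_by_day_and_category:
    print(row)
# ('2012-03-21', 'food', 1)
# ('2012-03-21', 'sport', 2)
# ('2012-03-22', 'food', 1)
```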
User Statistics
fig.16
The above is the SQL that shows the information of each member of the
discussion forum. It does not show everything in the table Member: ordinary
users do not have full authority, so they can only look up other users' ID,
name, email, sex, birthday and registration date.
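Hiding a field from ordinary users can be done simply by leaving it out of the SELECT list. The sketch below shows the idea with Python's sqlite3 module; the two sample members are made up, and the actual statement in the figure may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Member (
        user_ID TEXT PRIMARY KEY, user_name TEXT, email TEXT, sex TEXT,
        birthday TEXT, password TEXT, start_date TEXT, online INTEGER
    )
""")
conn.executemany("INSERT INTO Member VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    ("u0001", "a", "a@example.com", "F", "1994-01-01", "secret1", "2012-03-01", 1),
    ("u0002", "b", "b@example.com", "M", "1994-02-02", "secret2", "2012-03-02", 0),
])

# Only the fields an ordinary user may see are listed; password and
# online status are never selected, so they cannot appear in the result.
cursor = conn.execute("""
    SELECT user_ID, user_name, email, sex, birthday, start_date
    FROM Member
    ORDER BY user_ID
""")
visible_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(visible_columns)
print(rows[0])
```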
Online traffic analysis
fig.17
The above is the SQL that shows how many members are online at a specific
moment. The "Online" field in table Member is a Logical field, so the online
status can only be True or False. True means that the member is online.
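As a small illustration of filtering on a true/false field, the sketch below counts and lists the online members. It assumes Python's sqlite3 module, where the logical value is stored as 1 (true) or 0 (false); the three members are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Member (user_ID TEXT PRIMARY KEY, user_name TEXT, online INTEGER)")
conn.executemany("INSERT INTO Member VALUES (?, ?, ?)", [
    ("u0001", "a", 1),  # 1 = online (made-up sample data)
    ("u0002", "b", 0),  # 0 = offline
    ("u0003", "c", 1),
])

# How many members are online right now.
online_count = conn.execute(
    "SELECT COUNT(*) FROM Member WHERE online = 1").fetchone()[0]

# Which members they are.
online_members = conn.execute(
    "SELECT user_ID, user_name FROM Member WHERE online = 1 ORDER BY user_ID"
).fetchall()

print(online_count)    # 2
print(online_members)  # [('u0001', 'a'), ('u0003', 'c')]
```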
4. Testing and Evaluation
To test the database system, I run the SQL statements and check whether they
work.
Testing for posting statistics:
After executing the posting statistics SQL, the above result came out. In the
assumed data, 10 members, from u0001 to u0010, have posted news. Each of them
has posted 1 news item in the discussion forum.
The above is the second SQL statement, which searches the posting record: the
date of posting, the category of the news and the total number of news items
posted on the same day in the same category. In the assumed data, members
post news on 10 days between 21st March and 30th March, 1 news item per day.
The 10 news items fall into 5 different categories: sport, food, travel,
movie and music. Each category has 2 news items, and all 10 are posted on
different days.
Testing for user statistics:
Since ordinary users of a discussion forum have limited authority, they can
only read some of the information about other users, such as user_ID,
user_name, email, sex, birthday and start_date. The password is hidden from
the searchable area; it can only be checked by the administrator.
In the assumed data, there are 15 users in total in this discussion forum.
Their names are all different, since the user name must be unique, just like
the user ID. For the convenience of testing the SQL, I simply set their names
as a, b, c, etc. They have 15 different emails, and some are female and some
are male. The 15 users have different birthdays; birthdays could repeat in
some cases, but not in this data. They also registered on 15 different days.
Testing for online traffic analysis:
The above SQL is used for testing which members are online at a specific
moment. If a member is online, the “online” field in table Member becomes
true; otherwise it stays false. In the assumed data, there are 8 members
online: u0001, u0003, u0005, u0007, u0009, u0011, u0013 and u0015. The SQL
shows the user ID and user name of the online members.
Testing for not null function:
In the table Member, the field user_ID is set as the primary key and is also
given the not null setting, which is highlighted with the red circle above.
The not null setting requires every record to include this field: the user_ID
must not be empty. If users enter data without a user ID, the database system
asks them to enter it until it is not null.
To test whether the not null setting works, I entered a fake user record
named P without a user ID. Access then showed the box above, stating that the
field “user_ID” cannot be null, and required me to enter a user ID.
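The same kind of test can be repeated in code. The sketch below is an analogue only, assuming Python's sqlite3 module instead of Access: it tries to insert the fake user P with no user ID and checks that the database refuses the row. Note that NOT NULL is written explicitly, because SQLite does not imply it for a TEXT primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Member (user_ID TEXT PRIMARY KEY NOT NULL, user_name TEXT)")

try:
    # The fake user P is entered without a user ID.
    conn.execute("INSERT INTO Member (user_ID, user_name) VALUES (NULL, 'P')")
    rejected = False
except sqlite3.IntegrityError as error:
    # The NOT NULL constraint stops the row from being stored.
    rejected = True
    print("Insert refused:", error)

remaining = conn.execute("SELECT COUNT(*) FROM Member").fetchone()[0]
print(rejected, remaining)  # True 0
```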
Testing for deleting data:
Use the table Member as an example. To delete excess data, highlight all the
excess records, then right-click and choose the second option, as marked in
the above picture. Note that deleted records cannot be recovered, so we
should do this very carefully.
Special cases:
However, there are some unusual cases to think about too. In reality, users
of a discussion forum may forget their password and be unable to log in. The
database system can offer a “Forget Password” function: users who forget
their password can get it back by answering a safety question, whose question
and answer were given at registration.
5. Conclusion and Discussion
This system meets the requirements set out in the objective. The project
shows that a database is an essential way to store and handle users’ data in
a discussion forum. Reports can also be produced from the tables afterwards.
A database is therefore a useful tool for handling data and making reports.
In this project, I learned to use another database program, Access, besides
the FoxPro taught in class, and I learned how to use the Access import
function to insert records from other sources. I also learned how to generate
statistics and carry out analysis by writing SQL. I learned how a database
differs from a spreadsheet, and I can compare and contrast different database
tools and programs. I know more about the database systems of public
discussion forums. By creating reports that summarize statistics about
different data, I learned the process of making a report from tables and
database functions. In this project, I also learned how to write a complete
report to present my ideas to others.
However, I also met some problems while building this system.
First, I did not know much about the databases of public discussion forums.
I therefore researched several popular Hong Kong discussion forums and found
out what characteristics they have. After preparing a PowerPoint
presentation, I had a clearer direction on how to create the database system.
Before creating the database in Access, I needed to design an ER diagram.
Mr Law has taught us how to draw ER diagrams, but I found it quite difficult
to apply to the situation of a discussion forum. Facing a large number of
entities, fields and relations, I could hardly manage them properly. My
classmates gave me a program called SmartDraw; it is a very good tool for
drawing diagrams, and I could easily edit the diagram whenever an adjustment
was needed.
Besides, it is difficult to input data directly into the database, so I
suggest inputting the data into a spreadsheet first and then importing it
into the database table, which is more efficient.
To improve this database, we can offer a “Forget Password” function to forum
users, since some of them may forget their password and be unable to log in.
We can also create more statistics for users, so that they can learn more
details about the discussion forum.
6. Documentation
Wikipedia: http://en.wikipedia.org/wiki/Main_Page