XiYiRen:International Student College Experience Enhancement

advertisement

International Student College Experience

Enhancement Program

Team Members

Alice Zhang

Florence Liao

Huan Guo

Jake Magner

Li Shubin

Viraj Mohan

Zahin Ali

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Project Background

Project Objective

To design a database for a website that helps international students with various aspects of “settling in”, by providing a platform for interaction between students, local communities, cultural organizations and employers

Client

XiYiRen, a start up social utility website will be using a small part of our expansive project, focusing on Chinese students.

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

DP I Summary

Progress

 Project Background: Objective and Client description

 Summary of entities involved

 Database capabilities

 Simplified EER diagram with 10 entities, 3 Weak entities/relationships, and superclass/subclass division

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

DP II Summary

Progress

 Revised simplified EER diagram

Including more entities and 30 relationships

 Implementation of queries in relational algebra

 Realized need for more complex queries utilizing

IEOR methods: forecasting, optimal event locating, etc.

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

DP III Summary

Progress

 Revised simplified EER diagram

 Relational schema

 Five queries implemented in SQL and Access

 Focused on client-centric queries

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

EER

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Relational Schema

1. Person(Pid, Fname, Lname, MI, Birth_date, Profile 5 )

2. Student(Pid 1 , Housing 7 , University 14 , Pickup_Person 3 , Flight, Country 11 , price_preference, year, sleep, wakeup, study, friends, outgoing)

3. Community_Member(Pid 1 , occupation)

4. Alumni(Pid 1 , Class, Occupation, Donation_Amount)

5. Profile(Profile_id, Pic, Email, Phone)

6. Location(Street, City, State, Apt_Suite, Zip, x, y)

7. Housing(Hid, offered_by_person 1 , Street 6 , Apt_Suite 6 , Zip 6 , offered_by_org 8 , org_profile 5 , price, availability_date, furnished, number_rooms, number_bathrooms, water, electice, garbage, gas, internet, move-in special)

8. Organization(OrgName, Profile_id 5 , Street 6 , Apt_Suite 6 , Zip 6 , type, description)

9. Department(DepName, University 14 )

10. Event(EventName, Profile_id 5 , Street 6 , Apt_Suite 6 , Zip 6 , description, attendance, date, time)

11. Country(Name, Capital, Population)

12. Language(Name, Countries_spoken_in)

13. Resource(Rid, Owner 1 , Price, Quantity)

14. University(Name, student_population, ranking)

15. Donation(Did, Amount, Time, Date, Pid 1 )

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Relational Schema (contd)

16. Mentors(Mentor 1 , Mentee 2 )

17. Student_University(Student 2 , University 14 )

18. Person_in_Org(Person 1 , OrgName 8 , OrgProfile 5 )

19. RSVP(Person 1 , EventName 10 , EventProfile 5 , SurveyScore)

20. Student_in_Department(Student 2 , DepName 9 , UniName 14 )

21. Person_speaks_language(Person 1 , Language 12 )

22. Housing_near_Uni(Housing 7 , UniName 14 )

23. Organization_University(OrgName 8 , OrgProfile 5 , UniName 14 )

24. Org_holds_event(OrgName 8 , OrgProfile 5 , EventName 10 , EventProfile 5 )

25. Org_speaks_Language(OrgName 8 , OrgProfile 5 , Language 12 )

26. Org_Country(OrgName 8 , OrgProfile 5 , Country 11 )

27. Dep_sponsors_event(DepName 9 , UniName 14 , EventName 10 , EventProfile 5 )

28. Event_speaks_language(EventName 10 , EventProfile 5 , Language 12 )

29. Event_country(EventName 10 , EventProfile 5 , Country 11 )

30. Country_Language(Country 11 , Language 12 )

31. Alumni_Uni(Pid 4 , UniName 14 , class_of)

32. Alumni_Dept(Pid 4 , DepName 9 )

33. Person_gives_donation(Pid 1 , Did 15 )

34. Rommates(Pid1 1 , Pid2 1 )

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Relational Design

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 1: Roommate Matching

Description

• Shows all possible roommate combinations ordered by MatchRating.

• A dorm/off-campus housing facility can use it to pair up students interested in their housing

Description of Attributes

Sleep

Wakeup

Outgoing

Study

Friends

Early to late sleep time (Scale of 1-5)

Early to late (1-5)

Outgoingness Level (1-5)

In room(1) - Library(5)

Having friends in room never(1) – always(5)

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 1: Roommate Matching

SQL Code

SELECT P.Fname, P.Lname, Q.Fname, Q.Lname, Min(0.2*(Abs(S.sleep-

R.sleep))+0.2*(Abs(S.wakeup-R.wakeup))+0.2*(Abs(S.outgoing-

R.outgoing))+0.2*(Abs(S.study-R.study))+0.2*(Abs(S.friends-

R.friends))) AS Matchrating

FROM Student AS S, Student AS R, Person AS P, Person AS Q

WHERE (((S.pid)=[P].[pid]) AND ((Q.pid)=[R].[pid] And (Q.pid)<[P].[pid]))

GROUP BY P.Fname, P.Lname, Q.Fname, Q.Lname

HAVING (((([P].[Fname]=[Q].[Fname]) And ([P].[Lname]=[Q].[Lname]))=False))

ORDER BY Min(0.2*(20-Abs(S.sleep-R.sleep))+0.2*(20-Abs(S.wakeup-

R.wakeup))+0.2*(20-Abs(S.outgoing-R.outgoing))+0.2*(20-

Abs(S.study-R.study))+0.2*(20-Abs(S.friends-R.friends)));

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 1: Roommate Matching

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 2: New Student Forecasting

Description

• Extracts the data of how many new students come each year which can then be used to forecast the future number of students

• The year table is a one attribute table containing a list of years

• Uses regression equation y=ax+b with slope b = (N∑XY - (∑X)(∑Y))/(N∑X2 -

(∑X)2), and intercept a = (∑Y - b(∑X))/N. Where N = number of tuples, X =year, and Y = number of students

SQL Code

SELECT y.year AS [Year], count(s.pid) AS Number_Of_Students, u.name AS

University

FROM [year] AS y, student AS s, university AS u

WHERE s.year=y.year AND s.university=u.name

GROUP BY y.year, u.name

ORDER BY y.year;

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 3: Event Interest

Description

• Outputs a list of all events along with their computed attendance rate, the average level of student interest, and a metric combining surveyed interest with actual attendance

• Organizations throwing events with low attendance but high survey scores may need to look into changing venues or increasing advertising.

SQL Code

SELECT e.EventName, e.Attendance/(Count(r.person)) AS Attendance_Rate,

Avg(r.SurveyScore) AS Surveyed_Interest,

Avg(r.SurveyScore)*e.Attendance/(Count(r.person)) AS Interest_Metric

FROM Event AS e, RSVP AS r

WHERE (((r.EventProfile)=[e].[Profile_id]))

GROUP BY e.EventName, e.Profile_id, e.Attendance

ORDER BY Avg(r.SurveyScore)*e.Attendance/(Count(r.person)) DESC;

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 3: Event Interest

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 4: Optimal Event Location

Description

• Selects optimal potential event location on UC Berkeley campus in relation to attendee housing locations.

• By utilizing P-Median approach for event location that minimizes total demand weighted distances

• Assume P = 1 and calculate D ij by utilizing Euclidean distance formula:

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 4: Optimal Event Location

SQL Code

SELECT e.EventName, l2.street AS Potential_Location, sum(((l.x-l2.x)^2)+((l.yl2.y)^2)^0.5) AS distance, AVG(s.EventInterest) AS Demand

FROM Student AS s, RSVP AS p, Housing AS h, location AS l, location AS l2,

Event AS e

WHERE s.PID=p.person And p.EventName=e.EventName And s.housing=h.hid

And h.street=l.street And h.state=l.state And h.city=l.city And h.apt_suite=l.apt_suite And h.zip=l.zip

GROUP BY e.EventName, l2.street

ORDER BY e.EventName, sum(((l.x-l2.x)^2)+((l.y-l2.y)^2)^0.5);

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 4: Optimal Event Location

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 5: Min Airport Pick-up Cost

Description

Assumptions:

(1)Only take students who arrive at the airport between 8am to 7:59 pm into account

(2)Buses leave the airport on the hour.

(3)The opportunity cost of each student waiting for a bus for an hour is $10. (4) Each type I bus has a total of 5 seats and each type II bus has a total of 10 seats.

(5) We only deal with the arrival hour of each student, (student arriving at 1:01pm is treated the same as a student arriving at 1:59pm in this query implementation. and a ten-seat-vehicle to the airport and back cost $50 and $100, respectively.

• For date, airport extract # of students arriving in each time interval C i

• A≤i≤L; C i is interpreted as the number of students arriving at the airport no earlier than

(i-1) o’clock but prior to i o’clock

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 5: Min Airport Pick-up Cost

Formulation

Decision variables: tij= 1 if a type j bus is arranged to pick up students at i o’clock.

tij = 0 otherwise; (For A≤i≤L, 1≤j≤2)

Objective Function (Cost Min.):

Subject to.

People_constrain {Z in

A,B,C,D,E,F,G,H,I,J,K,L}:

SQL Code

SELECT s.airport AS Airport, s.arr_date AS Arr_Date, s.flight_arr_hour AS

Arr_Time, COUNT(*) AS Number_of_Students

FROM student AS s

GROUP BY s.flight_arr_hour, s.arr_date, s.airport;

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Query 5: Min Airport Pick-up Cost

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Normalization Analysis: 1NF

1NF

R is in 1NF if the domain of an attribute must include only atomic

(simple, indivisible) values and that the value of any attribute in a tuple must be a single value from the domain of that attribute.

Profile (Profile_id, Pic, Emails, Phones)

Pic (Profile_id, Pic)

Email (Profile_id, Email)

Phone (Profile_id, Phone)

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Normalization Analysis: 2NF

2NF

R is in 2NF if R is in 1NF and every nonprime attribute A in R is fully functionally dependent on the primary key of R.

Location (Street, City, State, Apt_Suite, Zip, x, y)

Assumption: ZIP_CODE determines CITY and STATE.

Location1 (Street, Apt_Suite, Zip, x, y)

Zip (Zip, City, State)

Organization (OrgName, Profile_id 5 , Street 6 , Apt_Suite 6 , Zip 6 , type, description)

Assumption: The name of an organization determines its type.

OrgName (OrgName, Type)

Organization1 (OrgName, Profile_id 5 , Street 6 , Apt_Suite 6 , Zip 6 , description)

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Normalization Analysis: 3NF

3NF

R is in 3NF if R is in 2NF and no nonprime attribute of R is transitively dependent on the primary key.

Housing (Hid, offered_by_person 1 , Street 6 , Apt_Suite 6 , Zip 6 , offered_by_org 8 , org_profile 5 , price, availability_date, furnished, number_rooms, number_bathrooms, water, electricity, garbage, gas, internet, move_in_special, ready_to_move_in)

Assumption: For a housing place to be “ready to move in”, it has to have

Internet, water, electricity, gas and garbage.

Housing1 (Hid, offered_by_person 1 , Street 6 , Apt_Suite 6 , Zip 6 , offered_by_org 8 , org_profile 5 , price, availability_date, furnished, number_rooms, number_bathrooms, move_in_special, Water, Electricity,

Garbage, Gas, Internet)

Ready_to_move_in (ready_to_move_in, Water, Electricity, Garbage, Gas,

Internet)

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Normalization Analysis: BCNF

BCNF

R is in BCNF if whenever a nontrivial functional dependency X  A holds in R, then X is a superkey of R.

Student (Pid 1 , Housing 7 , University 14 , Pickup_Person 3 , Flight,

Country 11 , price_preference, year, sleep, wakeup, study, friends, outgoing)

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Organization Form

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Person Form

Student Report

INTRODUCTION DP SUMMARIES QUERIES NORMALIZATION FORMS

Questions?

Download