IEOR 115 Final Presentation (2)

Post on 15-Apr-2017


DP Final Presentation

Silicon Valley Youth Bridge

Team 6: Abbey Chaver, Catherine Darmawan, Desmond Chan, Isha Thapa, Jason Mao, Jessica Wijaya


Meet the Client

Non-profit

Teaches grade-school students how to play bridge

Goal: to inspire the next generation of bridge players

Fully run by ~100 volunteers

~400 registered youth members

SiVY Bridge Background

Events and Programs

Youth Tournaments (Pizza Party, Casual Friday)

Summer Camp

Parent-Child Games

External Tournament

About Bridge

Partnership Game

Complex game involving strategy and logic

Two parts - bidding and playing

Duplicate bridge at tournaments

Win Masterpoints

EER Diagram + Relational Schema

Previous Approach DP 1 DP 2

Final EER Diagram

Relational Schema

1. Person (PersonID, Fname, Lname, gender, Start_date, Branch36, DOB, email, points)
2. Volunteer (Volunteer_ID, PersonID1, certification)
3. Teacher (Teacher_ID, Volunteer_ID2)
4. Board_Member (BM_ID, Volunteer_ID2, Position, Year)
5. Mentor (Mentor_ID, Volunteer_ID2)
6. Alumnus (Alumnus_ID, PersonID1, End_date_as_student)
7. Student (StudentID, PersonID1, waiver, School, Parent19, Parent29, Parent39, Phone_number, Emergency_contact_phone, how_did_you_hear, Prior_experience, ACBL_member, ACBL_number, Date_joined_ACBL, Training_program_participant)
8. Donor (Donor_ID, PersonID1)
9. Parent (Parent_ID, PersonID1)
10. Event (Event_ID, start_date, end_date)
11. External Tournament (Tournament_ID, Event_ID10, Type, City)
12. Internal Event (InternalE_ID, Event_ID10, RoomID22, BranchID36, Date, Time)
13. Pizza_Party_Tournament (PP_ID, InternalE_ID10, Food_Ordered, Food_Consumed)
14. Parent_Child_Tournament (PC_ID, InternalE_ID10)
15. Casual_Friday (CF_ID, InternalE_ID10)
16. Class (Class_ID, InternalE_ID10, Class_Name, Term, Teacher_ID3, School_hosting24, Weekly_hour, Weekly_day)
17. Class_Session (Session_ID, Date, Class_ID16, No_Attendees)
18. Summer_Camp (Camp_ID, InternalE_ID10, Year)
19. Fundraiser (Fundraiser_ID, InternalE_ID10)
20. IntEvent_Performance (InternalEvent_ID12, Person_ID1, Partner_ID1, point_type, points_achieved)
21. Time_interval (Time_ID, RoomID22, Start_time, End_time, Date)
22. Room (RID, BID23, capacity, projector)
23. Building (BID, Street_address, City, ZIP_code)
24. School (School_name, level)
25. Donation (Donation_ID, Donor_ID8, Amount, Associated_Fundraiser19)
26. Camp_Tuition (Tuition_ID, Camp_ID18, Student_ID7, Expense_ID)
27. Transaction (Transaction_ID, Amount, date, type)
28. Revenue (Rev_ID, Transaction_ID27)
29. Expense (Exp_ID, Transaction_ID27)
30. Supply Order (Order_ID, InternalE_ID10, items_description, VolunteerID2, date, Exp_ID29)
31. Skill (Skill_Name)
32. Skill_Teaching (Skill_Name31, Teacher_ID3)
33. Skill_Student (Skill_Name31, Student_ID7, level, Test_ID35)
34. Sponsorship (Sponsorship_ID, Student_ID7, Exp_ID29, UsedOn)
35. Test (Test_ID, Time_ID21, Skill_Name31, date)
36. Branch (Branch_ID, City, Country)
37. ExtTournament_Performance (StudentID7, TournamentID11, point_type, points_achieved)
38. IntEvent_RSVP_and_Attendance (PersonID1, EventID12, Attended, RSVP, Partner1)
39. StudentParent (StudentID7, ParentID9)

Queries/Analysis

1. Optimizing Food Purchases for Events
2. Assessing Skill Levels
3. Partner Matching
4. Donation Trend Analytics
5. Forecasting Event Participation Level

1. Optimizing Food Purchases for Events

● Optimizing amount of food purchased
○ We can forecast the number of students likely to attend an event from previous events' attendance data
○ Based on that forecast, we can buy the right amount of food and reduce leftovers
● Benefit:
○ Helps the organization reduce internal event expenses
○ Indirectly improves the quality of the organization, since the money saved can be allocated to other areas (e.g. sponsoring students for tournaments, holding extra sessions for underperforming students, marketing)

Step 1: Use SQL query to extract data

Creating the Query

Step 1: Retrieve the number of attendees, the amount of food ordered, and the amount left over

SQL > SELECT S.InternalE_ID, Count(IA.Attended) AS Attendees, Count(IA.RSVP) AS RSVPs, Sum(S.quantity) AS Pizza_Ordered, PPT.Pizza_Remaining
FROM Pizza_Party_Tournament AS PPT, Supply_Order AS S, IntEvent_RSVP_and_Attendance AS IA, Internal_Event AS IE
WHERE PPT.order_id = S.order_id
AND S.product_type = "pizza"
AND IA.event_ID = IE.Event_ID
AND IE.InternalE_ID = PPT.InternalE_ID
AND IA.Attended = 1
AND IA.RSVP = 1
GROUP BY S.InternalE_ID, PPT.Pizza_Remaining;


Step 2: Use linear regression to predict the number of attendees

Predict Number of Attendees from RSVPs
- Regress number of attendees against number of RSVPs
- Verify linear model
- Use linear model function in R
- Check significance level

Example: Attendance = 0.27 + 0.8*RSVP
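The regression step can be sketched in a few lines of Python instead of R. This is only an illustration under stated assumptions: the function names `fit_line` and `r_squared` and the RSVP/attendance numbers are hypothetical and do not come from the SiVY database.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    if ss_tot == 0:
        raise ValueError("y is constant, R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical history: RSVPs and actual attendance for five past events.
rsvps = [10, 15, 20, 25, 30]
attended = [8, 13, 16, 20, 25]

intercept, slope = fit_line(rsvps, attended)
print(f"Attendance = {intercept:.2f} + {slope:.2f} * RSVP")
print(f"R^2 = {r_squared(rsvps, attended, intercept, slope):.3f}")
```

A low R^2 or a visibly curved scatter plot would suggest that the linear model is not a good fit. The significance check mentioned above would still need a statistics package.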


Step 3: Use linear regression to predict amount of pizza consumed

Predict Number of Pizzas Consumed
- Children may eat less than standard serving size
- Regress pizza consumption against number of attendees
- Assume distribution of ages is the same
- Verify linear model
- Set intercept to zero
- ex: Pizza = 0.26 * Attendees

Use Both Models to Predict Consumption
- Obtain predicted number of attendees from first model
- Plug value into second model to estimate amount of pizza
- Don't extrapolate data!
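Chaining the two models can be sketched as follows. The coefficients 0.27, 0.8 and 0.26 are the example values from the slides. The function names and the observed RSVP range of 5 to 40 are assumptions made for illustration.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope for y = slope * x (intercept forced to zero)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


def predict_pizzas(rsvps, attendance_model, pizza_slope, rsvp_range):
    """Predict attendance from RSVPs, then pizzas from attendance.

    rsvp_range is the (min, max) RSVP count seen in the historical data.
    Values outside it are rejected so the model is not extrapolated.
    """
    low, high = rsvp_range
    if not low <= rsvps <= high:
        raise ValueError("RSVP count is outside the observed range")
    intercept, slope = attendance_model
    attendees = intercept + slope * rsvps
    return pizza_slope * attendees


# Example coefficients from the slides; the RSVP range is hypothetical.
pizzas = predict_pizzas(20, attendance_model=(0.27, 0.8),
                        pizza_slope=0.26, rsvp_range=(5, 40))
print(f"Estimated pizzas for 20 RSVPs: {pizzas:.2f}")
```

A fuller version would also check that the predicted attendance lies inside the range used to fit the pizza model.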

2. Assessing Skill Levels

● Identify underperforming students so that mentors/teachers can provide extra support and attention
● Identify best-performing or "most improved" students to reward with recognition and prizes such as sponsorships for external tournaments
● Evaluation is based on points, years playing bridge, participation in classes, attendance at events other than classes, and test scores
● Benefit:
○ When more attention is given to underperforming students, they are more likely to improve. This in turn improves the quality of the organization and makes students and parents prouder of the progress achieved.
○ If the right student is picked for a sponsorship to an external tournament, the organization has a greater chance of one of its members winning. This also improves SiVY Bridge's reputation.

Step 1: Use SQL to display all names, points, attendance and skill level


Step 2: Normalize data and graph in MS Excel


Step 3: Analyze data with reservations

Step 1: Use SQL to Gather Relevant Info

SQL > SELECT P.PersonID, P.points, Count(EA.Attended) AS Total_attendance, Avg(SS.level) AS Average_skill
FROM Person AS P, IntEvent_RSVP_and_Attendance AS EA, Skill_Student AS SS, Student AS S
WHERE P.PersonID = S.PersonID AND SS.StudentID = S.StudentID AND EA.PersonID = S.PersonID AND EA.Attended = 1
GROUP BY P.PersonID, P.points;

Step 2: Graph Data with MS Excel
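The slides do not say which normalization was applied in Excel. The sketch below assumes min-max scaling, which puts points, attendance and skill on a common 0 to 1 scale so they can be plotted together. The function names and the student records are hypothetical.

```python
def min_max(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        # No spread in this column, so every value gets the same score.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def normalize_records(records, fields):
    """Return copies of the records with each listed field min-max scaled."""
    scaled = {f: min_max([r[f] for r in records]) for f in fields}
    result = []
    for i, record in enumerate(records):
        row = dict(record)
        for f in fields:
            row[f] = scaled[f][i]
        result.append(row)
    return result


# Hypothetical query output: one row per student.
students = [
    {"name": "Student A", "points": 12, "attendance": 9, "skill": 2.0},
    {"name": "Student B", "points": 30, "attendance": 3, "skill": 4.0},
    {"name": "Student C", "points": 21, "attendance": 6, "skill": 3.0},
]

for row in normalize_records(students, ["points", "attendance", "skill"]):
    print(row)
```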

Step 3: Analyze Data with Reservations
- Do not take data for granted
- When looking at data, picking out outliers may be easy, but deciding which students teachers should focus on may be a completely different matter
- Jack Ma and Isabel Wong seem to be the most underperforming students, but Frank Liu actually is
- Upon closer inspection one can see that Frank Liu attends events but doesn't perform on par with his skill level
- Jack and Isabel have a high skill level but a lower overall score because they didn't attend events. Why? Perhaps they cannot learn any more from events aimed at the majority of the organization, which is at a lower skill level than theirs

3. Partner Matching

● Maximize the total value of partnerships for students at a tournament
● Partnership "value" is weighted by skill level, point accumulation, age, and prior partnership
○ Using linear programming, we minimize differences in skill level, age, and personal points, and maximize games played together and points achieved together.
● Benefit:
○ Better compatibility improves teamwork at the tournament and therefore increases the chances of winning.
○ Creates strong relationships between students, improving the experience of playing and their commitment to the game.

Step 1: Extract Data

Query: Relevant Data for Each Possible Match

SQL > CREATE VIEW Skill_rank AS
SELECT StudentID, Avg(level) AS Average_level
FROM Skill_Student
GROUP BY StudentID;

SQL > SELECT P1.PersonID, P2.PersonID, P1.points - P2.points AS PointDiff, SR1.Average_level - SR2.Average_level AS SkillDiff, P1.DOB - P2.DOB AS BDiff, Count(IEP.InternalEvent_ID) AS Games_together, Sum(IEP.points_achieved) AS Joint_points
FROM Person AS P1, Person AS P2, Skill_rank AS SR1, Skill_rank AS SR2, Student AS S1, Student AS S2, IntEvent_Performance AS IEP
WHERE P1.PersonID < P2.PersonID
AND IEP.PersonID = P1.PersonID AND IEP.PartnerID = P2.PersonID
AND SR1.StudentID = S1.StudentID AND SR2.StudentID = S2.StudentID
AND S1.PersonID = P1.PersonID AND S2.PersonID = P2.PersonID
AND P1.PersonID IN (SELECT PersonID FROM IntEvent_Performance WHERE InternalEvent_ID = 15)
AND P2.PersonID IN (SELECT PersonID FROM IntEvent_Performance WHERE InternalEvent_ID = 15)
GROUP BY P1.PersonID, P2.PersonID, P1.points - P2.points, SR1.Average_level - SR2.Average_level, P1.DOB - P2.DOB;


Step 2: Optimize matches with AMPL

Optimize matches with AMPL formulation

param n; # number of students attending event
param m = n*(n-1)/2; # possible matches
set attributes;
param match {1..m, attributes};
var y {1..m} binary; # indicates match i is selected
var x {1..n, 1..n} binary; # x[j,k] = 1 if student j is paired with student k

minimize MatchValue:
sum {i in 1..m} (abs(match[i, "PointDiff"]) + abs(match[i, "SkillDiff"]) + abs(match[i, "BDiff"]) - match[i, "IEP"] - match[i, "JointP"]) * y[i];

subject to
# Condition 1: A person can only have one partner, whether listed first or second in the pair
C1 {i in 1..n}: sum {j in 1..n} (x[i,j] + x[j,i]) <= 1;
# Condition 2: All students should be matched unless n is odd, in which case exactly one is unmatched
C2: sum {j in 1..n, k in 1..n} x[j,k] = floor(n/2);
# Condition 3: Student cannot be paired with him/herself
C3 {j in 1..n}: x[j,j] = 0;
# Condition 4: Eliminate identical pairings with different order
C4 {j in 1..n, k in 1..j}: x[j,k] = 0;
# Condition 5: Relate x[j,k] (j < k) to y[i] through a numerical transformation based on the row order of the match matrix
C5 {j in 1..n, k in j+1..n}: x[j,k] = y[(j-1)*n - (j-1)*j/2 + (k - j)];

data; #####################
param n := 5;
set attributes := PID1 PID2 PointDiff SkillDiff BDiff IEP JointP;
# Students are indexed 1..n in the order PID 1, 9, 16, 4, 11
param match: PID1 PID2 PointDiff SkillDiff BDiff IEP JointP :=
1 1 9 4 0.5 1 3 3
2 1 16 3 1 -3 0 0
3 1 4 -2 -0.25 2 2 5
4 1 11 6 0.5 -4 1 4
5 9 16 -1 0 4 1 1
6 9 4 -6 0 1 4 8
7 9 11 2 0.2 -5 1 0
8 16 4 -5 0 5 1 2
9 16 11 3 -1 -1 3 1
10 4 11 8 0.5 -6 0 0;

Output:

y1 y2 y3 y4 y5 y6 y7 y8 y9 y10

0 0 0 0 0 1 0 0 1 0

In this case, the optimal matching is to select match 6 and 9, resulting in the pairs (9, 4) and (16, 11). Person 1 is unmatched and will play with a volunteer.
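For a handful of students the result can be cross-checked without a solver. The sketch below is not the AMPL model: it simply enumerates every valid set of pairs and keeps the cheapest one, which is only practical for small n. The attribute values are copied from the match table. Row 4 is assumed to be the pair (1, 11), since 1, 9, 16, 4 and 11 are the five students present, and the function names are hypothetical.

```python
from itertools import combinations

# (PID1, PID2): (PointDiff, SkillDiff, BDiff, IEP, JointP), in match-table order.
MATCHES = {
    (1, 9): (4, 0.5, 1, 3, 3),
    (1, 16): (3, 1, -3, 0, 0),
    (1, 4): (-2, -0.25, 2, 2, 5),
    (1, 11): (6, 0.5, -4, 1, 4),  # assumption: second student is 11
    (9, 16): (-1, 0, 4, 1, 1),
    (9, 4): (-6, 0, 1, 4, 8),
    (9, 11): (2, 0.2, -5, 1, 0),
    (16, 4): (-5, 0, 5, 1, 2),
    (16, 11): (3, -1, -1, 3, 1),
    (4, 11): (8, 0.5, -6, 0, 0),
}


def match_cost(attrs, weights=(1, 1, 1, 1, 1)):
    """Objective term for one pair: weighted differences minus shared history."""
    point_diff, skill_diff, b_diff, games, joint_points = attrs
    z1, z2, z3, z4, z5 = weights
    return (z1 * abs(point_diff) + z2 * abs(skill_diff) + z3 * abs(b_diff)
            - z4 * games - z5 * joint_points)


def best_matching(matches, weights=(1, 1, 1, 1, 1)):
    """Try every set of floor(n/2) disjoint pairs and return the cheapest."""
    students = {pid for pair in matches for pid in pair}
    n_pairs = len(students) // 2
    best_pairs, best_cost = None, float("inf")
    for combo in combinations(matches, n_pairs):
        used = [pid for pair in combo for pid in pair]
        if len(set(used)) != len(used):
            continue  # a student appears in two pairs
        cost = sum(match_cost(matches[pair], weights) for pair in combo)
        if cost < best_cost:
            best_pairs, best_cost = list(combo), cost
    return best_pairs, best_cost


pairs, cost = best_matching(MATCHES)
print("Selected pairs:", pairs, "objective:", cost)
```

With unit weights this selects the pairs (9, 4) and (16, 11), the same matches 6 and 9 reported above. Passing a different `weights` tuple shows how the choice changes when some attributes count for more.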


Step 1: Extract Data with SQL

Step 2: Optimize matches with AMPL

Step 3: Tune Objective Weights

Step 3: Tune Objective Weights

minimize MatchValue:
sum {i in 1..m} (Z1*abs(match[i, "PointDiff"]) + Z2*abs(match[i, "SkillDiff"]) + Z3*abs(match[i, "BDiff"]) - Z4*match[i, "IEP"] - Z5*match[i, "JointP"]) * y[i];

Using IntEvent_Performance.points_achieved as the outcome, we can evaluate the success of our matching. By adding weights Z1 to Z5 to the components of the objective function, we can tune the coefficients so that the most accurate predictors of partnership success carry the most weight.

4. Donation Trend Analytics

Is the amount of money received from donations consistent over months and years?

Business Justification:

● Find trends in monetary donations
● Analyze whether time affects the amount of donations
● Predict financials to foresee the future of the organization and to note whether fundraising efforts will be needed

Step 1: Microsoft Access: Create a query using SQL

Step 2: Microsoft Excel: ANOVA test

Step 3: Make analysis from existing data

Step 1: Creating a Query

Find the total amount per month using SQL in MS Access

SQL Code

SELECT DISTINCTROW Format$([Donation].[Date],'yyyy/mm') AS [Year and Month], Sum(Donation.Amount) AS [Sum Of Amount]
FROM Donation
GROUP BY Format$([Donation].[Date],'yyyy/mm'), Year([Donation].[Date])*12+DatePart('m',[Donation].[Date])-1
ORDER BY Format$([Donation].[Date],'yyyy/mm'), Year([Donation].[Date])*12+DatePart('m',[Donation].[Date])-1;

Output


Step 2: Consistency of donation amount over years

Export the data to MS Excel and use ANOVA: Single Factor (Data Analysis).

Since F < F critical, we fail to reject H0: µ2012 = µ2013 = µ2014

Step 2: Consistency of donation amount over months

Find the average of donations per month and run ANOVA.

Since F > F critical, we reject H0: µ1 = µ2 = ... = µ12
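The F statistic that Excel's single-factor ANOVA reports can be reproduced by hand. The sketch below computes F and its degrees of freedom. The critical value still has to come from an F table or a statistics package. The function name and the donation figures are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n <= k:
        raise ValueError("need at least two non-empty groups and n > k")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mu - grand_mean) ** 2
                     for g, mu in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mu) ** 2
                    for g, mu in zip(groups, means) for v in g)
    if ss_within == 0:
        raise ValueError("no within-group variation, F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical monthly donation totals, one list per year.
donations_by_year = [
    [500, 620, 480, 700],   # 2012
    [530, 600, 510, 690],   # 2013
    [490, 640, 470, 720],   # 2014
]
f_stat, df1, df2 = one_way_anova_f(donations_by_year)
print(f"F = {f_stat:.3f} with ({df1}, {df2}) degrees of freedom")
```

The same function covers the monthly test by passing twelve groups, one per calendar month.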


Step 3: Summary

Since the total amount of donations is consistent over years, SiVY can use this data to plan long-term goals and expansions, and to determine whether fundraising is necessary to fund future missions and cover expenditures.

Since donations are not consistent over months, SiVY needs to plan ahead of time how donations will be spent on events and competitions (e.g. create financial plans for 2016 activities and expenditures and make sure the money is available before the start of 2016).

5. Forecasting Event Participation

Summary:

● We find the seasonal trend in students' participation at internal events and forecast future events' participation levels based on previous attendance data
○ There may be periods when students are more or less interested in attending events (e.g. during the holiday season, the beginning of the school year, etc.)
● Using this forecast, we can plan more events during the high season, when they will be more effective, and fewer events during the low season (low number of attendees).

Step 1: Filter & retrieve data from Access database to create a query using SQL

Step 2: Export the data to Excel and apply Holt-Winters method to do forecasting

Step 3: Generate analysis from the result to create recommendation for future events

Step 1: Retrieve & filter data from Access using SQL Query

For each event, we can filter the data using SQL to find the total number of people who attended the event held on a particular date.

SELECT IntEvent_RSVP_and_Attendance.EventID, Internal_Event.Date, Count(IntEvent_RSVP_and_Attendance.Attended) AS TotalAttendance
FROM IntEvent_RSVP_and_Attendance, Internal_Event
WHERE Internal_Event.InternalE_ID = IntEvent_RSVP_and_Attendance.EventID
AND IntEvent_RSVP_and_Attendance.Attended = 1
GROUP BY IntEvent_RSVP_and_Attendance.EventID, Internal_Event.Date;

Output
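The attendance query can be tried outside Access with an in-memory SQLite database. This is a sketch under assumptions: only the columns the query needs are created, the rows are hypothetical, and SQLite's dialect differs slightly from Access SQL.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical attendance rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE Internal_Event (InternalE_ID INTEGER PRIMARY KEY, "Date" TEXT)'
    )
    conn.execute(
        "CREATE TABLE IntEvent_RSVP_and_Attendance "
        "(PersonID INTEGER, EventID INTEGER, Attended INTEGER, RSVP INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Internal_Event VALUES (?, ?)",
        [(1, "2015-02-07"), (2, "2015-03-07")],
    )
    conn.executemany(
        "INSERT INTO IntEvent_RSVP_and_Attendance VALUES (?, ?, ?, ?)",
        [
            (1, 1, 1, 1), (2, 1, 1, 1), (3, 1, 1, 0),
            (4, 1, 0, 1),  # RSVP'd to event 1 but did not attend
            (1, 2, 1, 1),
            (2, 2, 0, 1),  # RSVP'd to event 2 but did not attend
        ],
    )
    return conn


def attendance_by_event(conn):
    """Count attendees per event, ignoring RSVP rows without attendance."""
    query = """
        SELECT a.EventID, e."Date", COUNT(*) AS TotalAttendance
        FROM IntEvent_RSVP_and_Attendance AS a
        JOIN Internal_Event AS e ON e.InternalE_ID = a.EventID
        WHERE a.Attended = 1
        GROUP BY a.EventID, e."Date"
        ORDER BY e."Date"
    """
    return conn.execute(query).fetchall()


for event_id, date, total in attendance_by_event(build_demo_db()):
    print(event_id, date, total)
```

Filtering on `Attended = 1` matters because a plain count over the attendance table would also include students who RSVP'd and then stayed away.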


Step 2: Export data & forecast with Holt-Winters method

● Using the Holt-Winters method, we forecast future attendance levels for events held at different times.
○ We use the Holt-Winters model because a seasonal factor might affect the attendance level.
● Therefore, we set the seasonal period to 12 (monthly seasons).
● We also use the multiplicative seasonal method.

Holt-Winters Formula

yt = forecast at time t
lt = level at time t
bt = trend at time t
st = seasonal factor at time t
α = smoothing parameter for the level
β = smoothing parameter for the trend
γ = smoothing parameter for the seasonal factor
m = number of periods in a season

The first year's data are used only to get the calculation started. We averaged the attendance level and used it as the initial value for the Holt-Winters formula.

1st Year (actual data)

2nd and 3rd Year (actual data)

4th and 5th year (forecast)
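The recursion behind the spreadsheet can be sketched in Python. This follows the standard multiplicative Holt-Winters updates, which may differ in detail from the formulas on the original slide. The level is initialized as the average of the first season, as described above. The starting trend of zero, the smoothing values and the attendance figures are assumptions.

```python
def holt_winters_multiplicative(y, m, alpha, beta, gamma, horizon):
    """Forecast `horizon` periods ahead with multiplicative seasonality.

    y: past observations (all positive); m: periods per season.
    """
    if len(y) < m:
        raise ValueError("need at least one full season of data")
    first_season = y[:m]
    level = sum(first_season) / m          # initial level: first-season mean
    trend = 0.0                            # assumption: no initial trend
    seasonals = [v / level for v in first_season]  # latest factor per phase

    for t in range(m, len(y)):
        phase = t % m
        s_old = seasonals[phase]
        prev_level, prev_trend = level, trend
        # l_t = alpha * (y_t / s_{t-m}) + (1 - alpha) * (l_{t-1} + b_{t-1})
        level = alpha * (y[t] / s_old) + (1 - alpha) * (prev_level + prev_trend)
        # b_t = beta * (l_t - l_{t-1}) + (1 - beta) * b_{t-1}
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # s_t = gamma * y_t / (l_{t-1} + b_{t-1}) + (1 - gamma) * s_{t-m}
        seasonals[phase] = (gamma * y[t] / (prev_level + prev_trend)
                            + (1 - gamma) * s_old)

    n = len(y)
    # Forecast: (l + h * b) times the latest seasonal factor for that phase.
    return [(level + h * trend) * seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Hypothetical attendance per quarter for three years (m = 4 keeps it short;
# the slides use m = 12 for monthly data).
attendance = [20, 35, 15, 25, 22, 38, 16, 27, 24, 41, 18, 29]
print(holt_winters_multiplicative(attendance, m=4, alpha=0.3,
                                  beta=0.1, gamma=0.2, horizon=4))
```

In practice the three smoothing parameters are chosen to minimize the forecast error on the historical data, for example with Excel's Solver.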


Step 3: Analysis and recommendation for future events

Notice that the seasonal trend in attendance is maintained when future periods are forecast with the Holt-Winters method.
● In this case, we assume that children are more likely to go to events in the middle of the semester, and less likely to go during summer break and winter break, as they might already have plans of their own with family and friends.

Attendance would be higher for events in the middle of the semester (Feb-May) and slightly lower for events during school breaks (June-August & Dec-Jan).
● Focus on creating more events during the peak period, since they will be more effective when more members (students) participate
● Hold fewer events during the low-season period (school break), or modify event planning to fit the lower number of attendees (e.g. ordering less food, booking smaller rooms, etc.) to reduce cost.

Normalization

1NF

2NF

3NF

BCNF

First Normal Form (1NF)

Before:

1. Person (PersonID, Fname, Lname, gender, Start_date, Branch36, DOB, email, points)

A person can be part of more than one branch of the organization, therefore "Branch" is a multivalued attribute => not in 1NF

After (to normalize it, we break it into 2 tables):

1.1. Person (PersonID, Fname, Lname, gender, Start_date, DOB, email, points)
1.2. Person_of_Branch (PersonID, Branch36)

Second Normal Form (2NF)

Before:

16. Class (Class_ID, InternalE_ID10, Class_Name, Term, Teacher_ID3, School_hosting24, Weekly_hour, Weekly_day)

Class_ID alone can determine Class_Name => not fully FD on every CK

After (to normalize it, we break it into 2 tables):

16.1. Class (Class_ID, InternalE_ID10, Term, Teacher_ID3, School_hosting24, Weekly_hour, Weekly_day)

16.2. Class_Name (Class_ID, Class_Name)

Third Normal Form (3NF)

Before:

23. Building (BID, Street_address, City, ZIP_code)

{Street_address, City} alone can determine ZIP_code => not in 3NF

After (to normalize it, we break it into 2 tables):

23.1. Building (BID, Street_address, City)
23.2. Address_ZIP (Street_address, City, ZIP_code)

Assumption: Same street address can exist in multiple cities, so it has to be combined with city to be unique!

Boyce-Codd Normal Form (BCNF)

Before:

33. Skill_Student (Skill_Name31, Student_ID7, level, Test_ID35)

Test_ID → Skill_Name because tests are administered on a single skill (dependency captured in the relation Test).

To normalize this into BCNF:

33. Skill_Student (Test_ID35, Student_ID7, level)

However, this defeats the purpose of easily identifying which skills a student possesses, so it’s not very sensible.

Fully normalized: Boyce-Codd Normal Form (BCNF)

10. Event (Event_ID, start_date, end_date)

is in 3NF because the two non-prime attributes are fully dependent on the primary key, and because there is no functional dependency between the two non-prime attributes. It is further in BCNF because the determinant of every nontrivial functional dependency is a superkey:

Event_ID → start_date
Event_ID → end_date
start_date ↛ Event_ID
end_date ↛ Event_ID
start_date ↛ end_date
end_date ↛ start_date

Assumption: some events span more than one day (otherwise we would not track this as two separate attributes).
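The BCNF test used in the last two examples can be checked mechanically with an attribute-closure computation. In this sketch the function names are hypothetical, and the key of Skill_Student is assumed to be (Skill_Name, Student_ID), which the slides do not state.

```python
def closure(attrs, fds):
    """All attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Nontrivial dependencies whose left-hand side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        is_trivial = set(rhs) <= set(lhs)
        is_superkey = set(relation) <= closure(lhs, fds)
        if not is_trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


event = {"Event_ID", "start_date", "end_date"}
event_fds = [(("Event_ID",), ("start_date", "end_date"))]

skill_student = {"Skill_Name", "Student_ID", "level", "Test_ID"}
skill_student_fds = [
    # Assumed key of the relation (not stated in the slides).
    (("Skill_Name", "Student_ID"), ("level", "Test_ID")),
    # A test covers a single skill.
    (("Test_ID",), ("Skill_Name",)),
]

print("Event violations:", bcnf_violations(event, event_fds))
print("Skill_Student violations:",
      bcnf_violations(skill_student, skill_student_fds))
```

An empty list means every determinant is a superkey, so the relation is in BCNF with respect to the listed dependencies. Skill_Student reports Test_ID → Skill_Name, the dependency discussed above.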

Questions?
