Team Design Project

Fall 2018

Project Description

To gain real world experience in database design, teams of students will work with a local organization (small company, university department, campus organization, etc.) on a semester-length Database Design Project (DP). The DP starts with analysis of the organization's current methodology and needs, and proceeds through data modelling, design, and implementation of a prototype Relational Database, including queries, forms, and reports. Your design should not be limited by the current needs of the organization (you should free to add features/relations that your client may feel are unnecessary). Teams should emphasize the structure of their design and richness of their queries rather than the user interface. Extra credit will be given for mathematical analysis involving IEOR methods. The organization must understand that your prototype is not a fully operational system (and that you will not provice maintainance and customer support after the class is over!).

Project Deliverables


To gauge participation, as part of the Final Exam, we will ask everyone to name and rate their teammembers on a scale of 0-5 (0 = very little contribution, 5 = active contribution through semester).

Client Contact Report (1pg individually completed by every student)

To get a sense of what you and your team will produce for the Team Design Project, review 2-4 Sample Projects from past years (see link to left). To find potential a client for your team project, consider past jobs, friends, family members, and/or approach local retailers to find one client/organization who is interested in participating in the DP (has enthusiasm at least 5 on a scale of 0-10). Identify yourself as a Berkeley student and describe this course and project, reassuring the client that participating will require no more than 3 meetings over the semester and that at no time will you ask for access to real data (you will create fictional but realistic data to protect proprietary and private data). Please convey to potential clients that your team project will produce a prototype, and not a fully functional and maintainable database system. Type up a one page description of the client, the kinds of data they record, number of employees, your contact person and how you found him or her, existing database support (if any), current or future availability of a computer to run MS Access, and your estimate of their level of enthusiasm (on a scale of 0-10, 10 is very enthusiastic).

Please note that you should not post advertisements on Craigslist or any other internet source. You should also make sure that the company is aware that this project is solely for educational purposes and that support will not be provided beyond the end of the semester.

Team Client Proposal: (2pg per team, typed and turned in as hardcopy)

A much more detailed version of above for the client selected and confirmed by the team. Details on kinds of data they might model beyond the examples from Odessa Electronics, such as: customers, transactions, locations, appointments, equipment, inspections, evaluations, routing, events, meetings, evaluations, shippers, specials, sales, Client contact person, email, and phone number. Current DB system if any. Estimates of Data size (number of records). Proposed benefits. Availability of MS Access. Team member responsibilities, including Team CEO, CCO (Chief Communications Officer who will send an email to Prof. Goldberg and GSI cc'ing all team members with this precise subject line: "Subject: IEOR 115: From Team 3: All Members cc'd". Schedule. NOTE: All Team reports must include at the top of each report: Team number, Design Project Title, and date, as well as team member names.

DP Review I: (oral, <8 mins using Powerpoint (.ppt or .pptx or .pdf)

** Note: To avoid delays, For all Project Presentations, be sure to load Powerpoint slides onto the GSIs presentation computer before Review Class begins. **

All team members should participate if possible to present a summary of your client, their business (products, services etc), ideally illustrated with photos either from their website or if needed from related companies you find online. Textual summary of database requirements. Simplified EER diagram (omitting most attributes, but including relationships and cardinality constraints) with at least 20 entities and 20 relationships. Your team's project schedule. Also turn in a handout with: 1 page Simplified EER Diagram. Each project should have at least 1, preferably 2 or 3, superclass/subclass divisions, and at least one weak entity. Please check your EER carefully, make sure all your entity names are singular: “Building”, not “Buildings”, check that there are no relationships connected to attributes, that each subclass/superclass is properly labelled with the o/d, and p/t constraints, that weak entities have a single clearly defined owner, and carefully check all your cardinality constraints! As you know, errors in the EER can lead to big errors in your relationship schema. Be sure that your design is sufficiently different than the Odessa Electronics example, please don't repeat too many elements from that example; focus instead on new entities and relationships as suggested in the Client Proposal above. The revised EER diagram should contain your title and highlight the updates. Please also make sure to choose an appropriate font size for readability and try to fit everything into one page.<\p>

DP Review II: (oral, <10 mins presentation .ppt or .pptx or .pdf)

Revised version of Review I plus: Simplified EER diagram, Relational Design (schema) using notation from lectures, 5 "interesting’’ queries described in plain English. Note: do not implement these queries in SQL yet! Describe in detail what each query will accomplish (include mathematical notation if relevant) and justify each query: how will it be useful for your client? See more on Queries under DP Review III.

Your Queries should demonstrate the the power of your database to answer questions of value to your client. Each query must be justified in terms of organizational needs and described in plain text (not SQL yet!).

DP Review III: (<=4pg, written) Executive Summary.

This must include (in this order): project title, team number, team member names, client description, Simplified EER, relational design (schema), screencapture of at least 20 tables of your relational design implemented in Access Relationship View, your 5 queries in plain english, and in algebra or SQL (it's OK to use Views). I realize it's tricky but do not exceed the 4 page limit (two if double-sided). No cover pages or appendices.

Regarding Your 5 Queries:

Now that you've organized all this great data, show us how it can be used to produce valuable insights! For the Queries, put yourself in the position of your client. What analyses, results, and plots could help your client grow, optimize profits, minimize time, attract and retain valuable employees and customers?

Have fun with this:

It would be ideal if at least 3 of your queries used SQL to extract data that you then input into external software such as R, Excel, Matlab, Sigma, AMPL, or custom code that you use to analyze and display the results graphically. Please use some of the features of SQL we covered in class such as GROUP BY, HAVING, NOT EXISTS, etc.

Apply methods you've learned in other IEOR courses and/or in Statistics, EECS, and Machine Learning courses. For example, if you have an inventory control problem, you might want to use the EOQ model. If you want to forecast future demand based on historical data, you may want to build a (regularized) linear regression model. if you have time series data, you might want to use time series models such as ARIMA. If you want to choose optimal locations, you might want to use P-medians.

Other examples: correlation between attributes, regression, LP, ILP, k-means clustering, Economic Order Quantity, Retailer Discount Values, Winter's seasonal method for forecasting LeadTime, Profit, or Inventory, using Exponential Smoothing, Ranking employees by a specific custom productivity metric, fitting stochastic models to forecast expected demand, scheduling employees, planning transportation using travelling salesman approximations, facility location (to minimize travel time), ranking advertisements based on return on investment, displaying seasonal trends, setting warranty periods based on Mean Time Between Failures, etc.

See prior projects on the course website for ideas (but don't copy directly).

In each case explain what data you'll extract with SQL, the mathematial models you'll use, give details and justify, and the specific tools/code you'll use to compute and display example results.

"Example Data." Very few clients can or will share their confidential data, so you can enter realistic data or use tools to generate realistic example data (eg, see "Generate Example Datasets" on Google). Consider also looking for example data online using UCR Time Series datasets, Kaggle, or published research papers.

Final Presentations (oral, <10 mins using Powerpoint):

Extra credit if you can arrange for your client to attend! Overall summary of PR I, II, III, Normalization analysis, and screencaptures showing Access implementation of your database design and queries. (Demonstration of Access implementation will be done separately with GSI). Screenshots of MS Access demonstration of at least two queries should be based on realistic examples of 5-10 tuples for the relations involved (please don't use joke names or data). For each Query, please clearly state:

  1. the business justification for the query, how it will benefit their organization

  2. the model you are using, give the mathematical formula explain variables, and provide a reference.

  3. the SQL you use to extract the relevant data.

  4. a sample of the output (this can be based on a small amount of sample data in each table). For the Normalization Analysis, there is no need to modify your database. Just point out several relations that are not in BCNM and show how you could normalize them. (but you do not need to do this).

Final Report: (<=20pg, written: submit 2 copies!).

Revised 4 page Executive Summary. The other 16 pages is an expanded version including: Introduction describing client, previous approach and goals. Access screen shots of at least 2 forms and 2 reports. Normalization analysis for your design: Indicate Functional Dependencies for 5 relations, at least one in 2NF, one in 3NF, and one in BCNF, explaining why they are not fully normalized why and how they might be normalized. If client did not attend Final Presentation, provide one page signed letter with client feedback: This can be faxed or emailed directly to me by the deadline at (510) 642-1403. Team members contributions. Discussion and Future work.

Project Grading

  • 30% DP Reviews I - III

  • 40% Final Presentation

  • 30% Final Report