General Description

Design (Phase 1) and implement (Phase 2) a database system, GalaxyNews, to maintain information about newspapers owned and operated by the GalaxyNews corporation. The information represented in the database will be used to track events through time. For example, with the 2008 presidential elections in the USA just around the corner (!), reporters will be able to use the GalaxyNews database system to search for stories relevant to the nominees vying for the presidency or to the political issues being debated.

As mentioned, there are two phases in this project. The goals of the design phase are to research the problem and to build a model of this enterprise. Your job in this phase is to model the database needs as completely and correctly as possible. You must create a UML class diagram of the model, applying the appropriate database design patterns we have learned and building a model that can adapt to future changes. Do not worry about the size of the model: you will not implement all of it, so in this phase be certain to build a complete model. Your interaction with the client will also determine the scope of the database model to build.

The goals of the implementation phase are to demonstrate your relational database skills in creating tables in the relational model, maintaining integrity constraints, inserting/updating/deleting information, and creating queries on the data that you have supplied. NOTE: you will not implement the complete model from the first phase since there will not be time. Instead, Dr. Monge will meet with each group after the design phase and determine what parts of the model each group will implement. So do not start on the implementation phase until after you have been assigned the part of the enterprise you will implement. Different groups might end up implementing comparable but different parts of the enterprise. There is also the possibility that a different database model will be implemented rather than the one that each group designs; in this case, the database model will be provided as a UML class diagram and each team will then do the implementation.

Background

Newspapers are evolving now that the Internet can provide up-to-the-minute reporting of events. People can get their news from the Internet and do not have to wait until the following day to read about an event that has taken place. Newspapers and the corporations that own them are redefining themselves to include their newspaper content on the Internet. As this transition occurs, the corporation must determine where its revenue will come from, since traditional newspaper subscriptions are decreasing dramatically.

The database you design must model all of the newspapers owned by the GalaxyNews corporation. The central aspects of a newspaper are the stories (articles) written by the reporters. In addition to modeling the reporting aspect of a newspaper, you must also model the business aspect in which people subscribe for newspaper delivery.

Required Functionality

Here is a sample of the functionality that should be possible with the data in the GalaxyNews database system.

  1. List the newspaper articles that mention a particular personality (say Anna Nicole Smith or Barack Obama).
  2. For each of the last 5 years, list the number of subscribers to each newspaper by year's end.
  3. List the names and areas of expertise of reporters who have collaborated on at least one story about the most recently completed USA Presidential Elections.
  4. For each reporter, display the reporter's name, the number of stories on which the reporter has been the lead reporter, and the number of different news categories the reporter has written for.
  5. Given a date, list all the newspaper articles published on that date. The list should be organized by newspaper and category.
  6. For each city, list the number of households in that city, its population, and the number of current subscribers to some newspaper owned by GalaxyNews.
  7. Additional functionality to be added later. To incorporate this functionality, you need to design the database with flexibility in mind. Is your database prepared to handle a new category of news? If the corporation purchases (or sells) a newspaper, can the database support such changes without a change to the database structure? Etcetera.
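As a rough illustration of items 5 and 7 only, the sketch below uses Python's standard sqlite3 module with an in-memory database. Every table name, column name, newspaper name, and sample row is a hypothetical assumption made up for this example; it is not the model the client expects, and your own design should come from your research and the client rounds. The point it tries to show is that when news categories and newspapers are stored as rows, adding one is an INSERT rather than a change to the database structure.

```python
import sqlite3

# Hypothetical schema: all names below are illustrative assumptions,
# not the required GalaxyNews model.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE newspaper (
        newspaper_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL UNIQUE
    );
    CREATE TABLE category (
        category_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE
    );
    CREATE TABLE article (
        article_id   INTEGER PRIMARY KEY,
        headline     TEXT NOT NULL,
        published_on TEXT NOT NULL,  -- ISO date string, e.g. '2007-04-12'
        newspaper_id INTEGER NOT NULL REFERENCES newspaper(newspaper_id),
        category_id  INTEGER NOT NULL REFERENCES category(category_id)
    );
""")

# Invented sample data.
conn.executemany("INSERT INTO newspaper (name) VALUES (?)",
                 [("Daily Star",), ("Morning Post",)])
conn.executemany("INSERT INTO category (name) VALUES (?)",
                 [("Politics",), ("Sports",)])

# A new category of news is just another row; no table definition changes.
conn.execute("INSERT INTO category (name) VALUES (?)", ("Technology",))

conn.executemany(
    "INSERT INTO article (headline, published_on, newspaper_id, category_id) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Debate recap", "2007-04-12", 1, 1),
        ("Playoff preview", "2007-04-12", 1, 2),
        ("New gadget review", "2007-04-12", 2, 3),
        ("Older story", "2007-04-11", 2, 1),
    ],
)


def articles_on(connection, date):
    """Return (newspaper, category, headline) rows for one publication date,
    organized by newspaper and then by category."""
    return connection.execute(
        """
        SELECT n.name, c.name, a.headline
        FROM article AS a
        JOIN newspaper AS n ON n.newspaper_id = a.newspaper_id
        JOIN category  AS c ON c.category_id  = a.category_id
        WHERE a.published_on = ?
        ORDER BY n.name, c.name, a.headline
        """,
        (date,),
    ).fetchall()


rows = articles_on(conn, "2007-04-12")
for row in rows:
    print(row)
```

A real model would likely need more than this (for example, reporters, subscriptions, and the people mentioned in stories), and whether a story belongs to exactly one newspaper or one category is a question for the client rather than something this sketch settles.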

Getting started

Any database project starts with examining the actual data involved (assuming that there is some sample data). Often, the raw data for your database exists before you build the relational structures for it. It may be filed on paper forms in filing cabinets, or it may be on web sites, sales forms, weekly/monthly reports, etc. Before you do any actual modeling, it helps to know what data will eventually be stored in the database, so the first step is to research the problem and to analyze the available data to determine some constraints of your model. One of the most important activities in this first step is communicating with the client about their needs, since the client is expected to know how their business runs and what is needed from a database. In software engineering, this is the requirements elicitation (also called requirements gathering) phase.

For this project, there are many resources from which you can gather information about the enterprise. Many, if not all, of you keep current by reading your local newspaper. Perhaps you or your family subscribes to a newspaper -- you may even have delivered newspapers when you were younger. If you are not confident about how newspaper subscriptions work, you may want to ask your local newspaper. These are good sources of information. Still, it is wise to get information from as many sources as possible. So, start gathering information about different newspapers and the corporations that own them.

You are the database experts...

To simulate a certain amount of reality in this project, Dr. Monge will play the role of the client, and each group will independently act as a database design and development team assigned to the project. In this setting, the client has requested your services because you are the ones with the necessary database skills. There will be two official times for interaction with the client, each referred to as a round. In each round, the database teams will send their questions to the client via E-mail. These questions must be about the enterprise and not about how to model or implement some aspect of the enterprise. The client will answer only each team's own questions, and the answers will be sent in a reply E-mail message. After the first round, the client will hold a private meeting with each database team to clarify answers to any questions from the first round. In all rounds, each team will receive answers only to the questions it submitted. Team members may communicate only with other members of the same group. Inter-team communication is prohibited.

Deadlines of project activities
Activity | How/Where? | When?
Round #1: List of first round questions for the client | Send the list via E-mail with subject: CECS-323: GalaxyNews 1st Round, group # | Thursday Apr. 12, 4:15pm
Client answers the first round of questions | Answers will be sent via E-mail | Friday Apr. 13
Round #2: List of second round questions for the client | Send the list via E-mail with subject: CECS-323: GalaxyNews 2nd Round, group # | Tuesday Apr. 17, 4:15pm
Client answers the second round of questions | Answers will be sent via E-mail reply to each team | Wednesday Apr. 18
Final database model and documentation. Follow instructions here -- available soon. | Printed (hard copy) to be turned in by end of lab. Copy must also be sent via E-mail message with subject: CECS-323: GalaxyNews DB Model, group # | Thursday Apr. 26 by end of lab
Implementation Phase: implement a relational database | Print the relational database scheme, SQL DDL script, SQL DML, and a final report. | Thursday May 17 by end of lab

NOTE: All of the above deadlines are hard. They will not be adjusted and any team missing a deadline will be penalized in their grade as specified in the syllabus.

Project administration

This is a group project, and as such each student will work on it with at least one other student from the class. Each member of the group has an equal share of the responsibilities in completing the project. Every group must follow these guidelines while working on the project:

  1. Every group must keep track of the contributions made by each of the members. Every week each group must send a report via e-mail stating the contributions made by each person during that week. All weekly reports are due by the end of lab time every Thursday.
  2. All of these reports must be included with your final project at the end of the semester. In each report, clearly state the contributions and work accomplished by each individual.
  3. Every group member (including the sender) must be a recipient of all group-wide e-mail messages -- including the weekly reports; do this in the CC line of the E-mail. As a recipient, each member of the group agrees to and accepts the contents of the E-mail message.
  4. The grade for the project will have two components: a group grade based on the completeness and quality of the project submitted, and an individual grade based on the contributions the person has made to the success of the team.
  5. When a team presents their project (more details to follow later), I will ask each person questions to evaluate their understanding of the work accomplished during the project. This may affect a person's individual grade as well as the group grade.

E-mail communication

All E-mail messages must meet the following format requirements:

  1. For group-initiated messages, each member of the group must be copied in the CC line.
  2. The E-mail subject line must be as specified in the project handout. All other E-mail messages whose subject line is not specified must have an appropriate subject line that starts with CECS-323: GalaxyNews
  3. The E-mail body must be in HTML or in plain text.
  4. The first and second round questions must each be submitted as a numbered list, organized by topic as far as possible.
  5. The E-mail must be received no later than the set deadline.

References

  1. Tribune Company: Wikipedia entry and home page
  2. Los Angeles Times: Wikipedia entry and home page

Database Resources