Supplement 3
Advanced Topics in Conceptual Data Modeling

 L E A R N I N G    O B J E C T I V E S

After studying this supplement, you should be able to:

INTRODUCTION

A strong theme of Modern Systems Analysis and Design is that a thorough description of data is an essential activity in the requirements structuring step of the analysis phase of the SDLC. Such data structuring should include as rich a meaning of data as can be captured and represented. Various experts have identified numerous semantics and ways to represent these in the E-R notation. In this supplement, you will see four extensions to the basic E-R modeling capabilities explained in Chapter 10: modeling time-dependent data, multiple relationships between the same entity types, and the data abstractions of generalization and aggregation. These semantics are now commonly represented in E-R data models. Other semantics exist, and the interested reader is referred to Batini, Ceri, and Navathe (1992) and Flavin (1981) for detailed coverage.

MODELING TIME-DEPENDENT DATA

Data values vary over time (see the fifth and eighth questions in Table 10-1). For example, the unit price for each product may change as material and labor costs and market conditions change. If only the current price is required, then only that value needs to be represented in a data model. For accounting, billing, and other purposes, however, we are likely to need a history of the prices and the time period over which each was in effect. As Figure 1a shows, we can conceptualize this requirement as a series of prices and the effective date for each price. This results in repeating data that include the attributes Price and Effective_Date. In Figure 1b, these repeating data have been replaced by a new (weak) entity named PRICE HISTORY, similar to how we modeled multivalued attributes (see section on Multivalued Attributes in Chapter 10). The relationship between PRODUCT and PRICE HISTORY is named Has.

In Figure 1b, each value of the attribute Price is time stamped with its effective date. The use of simple time stamping (as in the above example) is often adequate for modeling time-dependent data. Time may introduce, however, more subtle complexities in data modeling. For example, Figure 2a represents a portion of an E-R diagram for Pine Valley Furniture Company. Each product is assigned to one product line (a related group of products). Customer orders are processed throughout the year, and monthly summaries are reported by product line and by product within product line.

Suppose that in the middle of the year, due to a reorganization of the sales function, some products are reassigned to different product lines. The model shown in Figure 2a is not designed to accommodate the reassignment of a product to a new product line. Thus, all sales reports will show cumulative sales for a product based on its current product line, rather than the one at the time of the sale. For example, a product may have total year-to-date sales of $50,000 and be associated with product line B, yet $40,000 of those sales may have occurred while the product was assigned to product line A. This fact will be lost using the model of Figure 2a.

The simple design change shown in Figure 2b will correctly recognize product reassignments. A new relationship (called Sales for product line) has been added between ORDER and PRODUCT LINE. As customer orders are processed, they are credited to both the correct product and product line as of the time of the sale using the Sales for product and Sales for product line relationships. When a product changes product lines, a change is made only in the Assigned relationship. In this way, orders implicitly change to the new product line through the Assigned relationship but stay associated to the product line at the time of sale through the Sales for product line relationship.

MULTIPLE RELATIONSHIPS BETWEEN ENTITIES

Suppose employees in a contract engineering firm are not all eligible to work on every project. Further assume that, as a new contract project is acquired, the chief project manager reviews the personnel files of all engineers to see who has the required skills, experience, and availability to be assigned to the project. As the project progresses, certain engineers are actually assigned to the project. This brief scenario describes a situation in which there are two relationships between the entity types of EMPLOYEE and PROJECT: Eligible for and Assigned to. Figure 3 depicts these two many-to-many relationships between these entity types.

Such instances of two or more relationships between the same entities are common in organizations. Here are a few other example situations:

In this last example, it is possible that the same artist may have composed, arranged, played, and conducted the same score. However, in other cases when there are multiple relationships between the same pair of entity types, there may be restrictions on entity instances participating in more than one of the relationships at the same time. For example, consider the situation depicted in Figure 4. This example shows that a DOCTOR and a NURSE may be Married to each other or may Work for each other, but not both at the same time. The restriction of participating in only one of several relationships is called exclusive relationships.

An exclusive relationship can apply not only to multiple relationships between the same pair of entity types but also to multiple relationships between an entity type and several other entity types. Consider the situation depicted in Figure 5. This example shows that a PATIENT can be Assigned to either a BED or a WARD, but not both, whereas a PATIENT simultaneously is Cared for by a PHYSICIAN.

GENERALIZATION

One of the unique aspects of human intelligence is its ability to classify objects and experiences and to generalize their properties. If we see a robin or an eagle, for example, we immediately classify each as a bird. Even if the bird is searching for worms in the grass, we know that it can fly since most birds can fly (we memorize the exceptions at an early age). Similarly, if we see a Porsche, we assume that it is both fast and expensive since these are properties of luxury sports cars.

Business entities are often best modeled using the concepts of generalization and categorization. Generalization is the concept that some things (entities) are subtypes of other, more general, things (see the sixth question in Table 1-1). For example, to an airline a business passenger is one subtype of the more general type called passenger.

Subtypes and Supertypes

One of the major challenges in data modeling is to recognize and clearly represent entities that are almost the same; that is, entity types that share common properties but also have one or more distinct properties. For example, suppose that an organization has three basic types of employees: hourly employees, salaried employees, and contract consultants. Some of the important attributes for these types of employees are the following:

  1. Hourly employees: Employee_No., Name, Address, Date_Hired, Hourly_Rate
  2. Salaried employees: Employee_No., Name, Address, Date_Hired, Annual_Salary, Stock_Option
  3. Contract consultants: Employee_No., Name, Address, Date_Hired, Contract_Number, Daily_ Rate

Notice that all of the employee types have several attributes in common (Employee_No., Name, Address, Date_Hired). In addition, each type has one or more unique attributes that distinguish it from the other types (for example, Hourly_Rate is unique to hourly employees).

Figure 6 shows a representation of the EMPLOYEE supertype with its subtypes, using E-R notation. A supertype is a generic entity type (such as EMPLOYEE) that is subdivided into subtypes. A subtype is a subset of a supertype (that is, the instances of entities in the subtype are a subset of the instances of its associated supertype). Supertype and subtype are equivalent to the terms superclass and subclass, defined in Chapter 4. The members of a subtype have some attributes or relationships distinct from members of the other subsets. Entity subtypes behave in exactly the same way as any entity type. For example, entity subtypes have attributes and may have relationships with other entity types. In Figure 6, common attributes for all employees are included with the EMPLOYEE entity type. The primary key for EMPLOYEE, as well as for each of the subtypes, is Employee_No. Attributes that are peculiar to each subtype are included with that subtype only.

The relationship between each subtype and supertype is called an IS-A relationship. A rectangle with rounded ends is used to designate IS-A relationships. This relationship is read from the subtype to the supertype; for example, "HOURLY EMPLOYEE IS-A EMPLOYEE" (correct grammar is not enforced). As shown in Figure 6, the cardinality of the relationship from a subtype to the supertype is mandatorily one. The cardinality is mandatorily one because an instance of a subtype is always an instance of the supertype (a salaried employee is always an employee). On the other hand, the relationship from the supertype to a subtype is optionally zero or one (an employee may or may not be a salaried employee). Since these cardinality relationships are always the same in IS-A relationships, you will omit the cardinality notation in such diagrams.

The IS-A relationships in Figure 6 are exclusive, but this is not always true for generalizations. Consider the situation depicted in Figure 7. Here VEHICLE has subtypes LAND, SEA, and AIR. Since some vehicles can travel in multiple mediums, the same VEHICLE instance may participate in several of these IS-A relationships. Consider an amphibious vehicle. This vehicle would have instances of the VEHICLE supertype and the LAND and SEA subtypes, but not the AIR subtype.

Inheritance

Inheritance is the property by which all attributes of a supertype become attributes of its subtypes. The term inheritance is also more generally used in object-oriented programming languages and covers not only the adoption of attributes by subtypes but also methods, or behaviors, of the supertype (see Chapter 4). Thus in Figure 6, the attributes Name, Address, and Date_Hired are inherited by all three employee subtypes (which is why these attributes are not explicitly attached to the subtypes). Except for the primary key, only attributes that are unique to a subtype are associated with that subtype.

AGGREGATION

An aggregation (see Chapter 5) is a collection of entities that together form a higher-order concept. An instance of a WORK ORDER entity in Pine Valley Furniture, for example, would be the collection of related RAW MATERIAL, TOOL, WORK CENTER, and FACTORY WORKER entity instances needed to produce a certain piece of furniture. In addition to inheriting the attributes about component entities, an aggregate entity can have attributes not found in its component entities. For example, a work order might have a promised completion date as well as material descriptions, tool codes, and so on from the component entities. An aggregate may also have attributes which summarize characteristics of the component entities. For example, we might want to know the number of tools used on a work order. Finally, an aggregate may participate in other relationships. For example, a work order is likely for the purpose of building a particular product. This complete situation involving a work order is depicted in Figure 8.

Note that the aggregate entity type, WORK ORDER, is shown as a gerund in Figure 8. This is perfectly consistent with the application of gerunds as explained in Chapter 10. The computed attribute of WORK ORDER, NUMBER OF TOOLS, is shown in a dashed oval. The component entities are composed into a quaternary relationship, the WORK ORDER gerund. There is a separate binary relationship between WORK ORDER and PRODUCT, since a product is sometimes built via a work order, and a work order must describe the building of some product.

SUMMARY

The purpose of conceptual data modeling is to capture into a structured form as much meaning about data as you can understand. Any semantic, or meaning, of data would ideally be represented in a data model diagram, such as an E-R diagram. A thorough coverage of all semantics that have been identified is beyond the scope of this book, but this supplement has extended the discussion from Chapter 11 to include four frequently represented, advanced data modeling concepts: time in data modeling, multiple relationships between the same entities, and the data abstractions of generalization and aggregation.

Time can enter a data model in a wide variety of ways, two of which were discussed. First, a time series or history of the same attribute(s) may need to be maintained for an entity type. Second, the original association between an entity when it was created and another entity as well as the current association may need to be kept.

There may exist multiple associations between the same pair of entity types, and these relationships may or may not be exclusive. If several relationships (between an entity and one or more other entities) are exclusive, then instances may participate in only one relationship at a time (for example, an employee could be taking a course or teaching a course, but not both at the same time). Alternatively, in a nonexclusive set of relationships, instances may participate in multiple relationships simultaneously. For example, an artist could play, conduct, and arrange the same musical score.

Generalization occurs when one or more entity types are special cases of another more inclusive entity type. For example, an employee supertype may have clerical, technical, and professional employee subtypes. A supertype-subtype relationship is called an IS-A relationship, and as with other types of relationships, IS-A relationships may be exclusive or nonexclusive.

An aggregate entity can be defined as the composition of several more detailed entities. For example, an airline reservation is composed of one or more passengers on an itinerary of flights, possibly with assigned seats on each flight. An aggregate entity can have its own attributes and participate in relationships besides those associated with its component entities.

 A P P E N D I X    R E V I E W


KEY TERMS

Aggregation (aggregate entity)
Exclusive relationships
Generalization
IS-A relationships
Subtype
Supertype


REVIEW QUESTIONS

  1. Define each of the following terms:
    1. gerund
    2. quaternary relationship
    3. subtype
    4. time stamp
  2. Describe two general situations in which time plays a role in modeling data.
  3. What is the difference between exclusive and nonexclusive relationships? Give an example of each.
  4. What data modeling construct defined in Chapter 10 is used to represent an aggregate entity type?
  5. Describe the property of inheritance as it applies to entity subtypes and supertypes.


PROBLEMS AND EXERCISES

  1. Consider an aggregate entity of an airline reservation with component entities passenger, flight, and seat. Pick a few attributes for each component entity and for the aggregate and draw this situation with an E-R diagram.
  2. Suppose employees in a company are all described by their employee ID, name, and phone number. An employee is either a clerical, technical, or professional employee. For clerical employees, we want to know their typing speed and which word processors they can use. For technical employees, we need to know their highest degree title. For professional employees, we need to know what other employees they supervise. Depict this situation with an E-R diagram.
  3. In a club of computer information systems students, members can chair activities and participate in activities. A club member can also be assigned to evaluate club activities, but not those he or she chairs. Show this situation with an E-R diagram.
  4. Employees are assigned to departments. We want to know to which department an employee was first assigned and to which department an employee is currently assigned. Show this situation with an E-R diagram using two separate relationships, one for first assignment and one for current assignment.
  5. Employees are assigned to departments, and employees may be reassigned to different departments over their career. We must keep track of this history of assignments, and for each assignment we need to know the start and end date of the assignment. Depict this employment situation with an E-R diagram.
  6. Refer to Problem and Exercise 9 in Chapter 10. Since both "customers" and "case workers" are employees of the company, redraw your answer to this exercise using one entity type for employees and subtypes for customers and case workers. Indicate on this diagram the business rule that a case worker may not be assigned to a purchase request if that case worker is also the customer of that request.


FIELD EXERCISES

  1. Obtain access to a CASE tool and investigate how you could model IS-A relationships with this CASE tool. Is there a direct way to model IS-A relationships or must you use other constructs to simulate an IS-A relationship? Can you model whether IS-A relationships are exclusive or not?
  2. Talk to a systems or data analyst in an organization in which you have contacts. Show this person Figure 2 and explain the situation depicted in this figure. Ask them if there are any similar situations in their organization and how the data models for their systems deal with dynamic relationships. If their answer is that such dynamics are ignored, have them explain what the potential problems might be with ignoring this issue. Write a summary of what they said and discuss how you would suggest they change the data model to handle the dynamic data entity relationships in their organization.


REFERENCES

Batini, C., Ceri, S., and Navathe, S. B. 1992. Conceptual Database Design. Redwood City, CA: Benjamin/Cummings Publishing.

Flavin, M. 1981. Fundamental Concepts of Information Modeling. New York, NY: Yourdon Press.


Back to the suppliment index

1999 Prentice-Hall, Inc., A division of Pearson Education, Upper Saddle River, New Jersey 07458 Legal Statement
Comments should be directed to webmaster@prenhall.com