Ever have one of those long, back-and-forth email exchanges where, by the time you’re done, you feel as if you’ve written a few chapters of an upcoming book? I’m going to take a few of these from my archive and share them with others who may be getting the same types of questions from non-modelers.

Today, we will be discussing this question: why does resolving a many-to-many relationship require a new relationship? Why do we have to create a new entity?


This is a fairly classical data modeling concept. It might be referred to as an intersectional entity, a resolution entity, an associative entity, or in short form, an M:N entity. Silverston calls this concept an Intersection or Association entity. Simsion and Witt call these Intersection entities, Associative entities, Resolution entities, or Relationship entities. Riordan calls these Junction tables, even in the data model. I’d say that the most common terms are Intersection or Associative entity, but I think it depends on what tool one uses and what types of data modeling books one reads.

A typical example might start out with a many-to-many relationship, say between cars and people:

PERSON >-|---owns----o-< CAR

This is a many-to-many relationship: a person may own more than one car, and a car may be owned by more than one person. In my made-up example, my business rule is that a car has to be owned by at least one person (which is sort of bending the real world rules, but bear with me). We’ll also ignore the fact that there are other relationships between cars and people. We’ll focus just on ownership.

We can’t leave many-to-many relationships that way in a relational database, so we need to resolve them.  There are also normally real business reasons why they need to be resolved, but I’ll leave that for another discussion, too.

To resolve a many-to-many, you create a new entity, in my example, OWNERSHIP:

PERSON -||---registers----o-< OWNERSHIP >-|-----is registered on-----||- CAR

OWNERSHIP keeps track of the relationship between a specific person and a specific car. Each instance of it records just two things: a person and a car.

Karen owns car 1234

Kirstin owns car 2345

Rob owns car 1234

Rob owns car 3456

In this list, notice that Karen and Rob jointly own car 1234.  Car 2345 is owned only by Kirstin and car 3456 is owned only by Rob.  Karen owns only one car, Rob owns two, and Kirstin owns one.  We can also assume that Richard owns no cars (according to the data) because he has no entry in OWNERSHIP. 
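Here is a rough sketch of that list as three tables, using Python's built-in sqlite3 module. The table and column names (person, car, ownership, person_id, car_id) and the choice to key people by first name are my own assumptions for illustration, not part of any real model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id TEXT PRIMARY KEY);
    CREATE TABLE car    (car_id    INTEGER PRIMARY KEY);
    -- The associative entity: one row per (person, car) pairing.
    CREATE TABLE ownership (
        person_id TEXT    NOT NULL REFERENCES person (person_id),
        car_id    INTEGER NOT NULL REFERENCES car (car_id),
        PRIMARY KEY (person_id, car_id)
    );
""")
conn.executemany("INSERT INTO person VALUES (?)",
                 [("Karen",), ("Kirstin",), ("Rob",), ("Richard",)])
conn.executemany("INSERT INTO car VALUES (?)", [(1234,), (2345,), (3456,)])
conn.executemany("INSERT INTO ownership VALUES (?, ?)",
                 [("Karen", 1234), ("Kirstin", 2345),
                  ("Rob", 1234), ("Rob", 3456)])

def owners_of(car_id):
    """Everyone listed in OWNERSHIP for one car."""
    rows = conn.execute(
        "SELECT person_id FROM ownership WHERE car_id = ? ORDER BY person_id",
        (car_id,))
    return [r[0] for r in rows]

def cars_of(person_id):
    """Every car listed in OWNERSHIP for one person."""
    rows = conn.execute(
        "SELECT car_id FROM ownership WHERE person_id = ? ORDER BY car_id",
        (person_id,))
    return [r[0] for r in rows]

def people_without_cars():
    """People with no entry in OWNERSHIP at all."""
    rows = conn.execute("""
        SELECT p.person_id
        FROM person p
        LEFT JOIN ownership o ON o.person_id = p.person_id
        WHERE o.person_id IS NULL
        ORDER BY p.person_id""")
    return [r[0] for r in rows]

print(owners_of(1234))        # ['Karen', 'Rob']
print(cars_of("Rob"))         # [1234, 3456]
print(people_without_cars())  # ['Richard']
```

Both directions of the many-to-many can be answered from the one OWNERSHIP table, which is the whole point of putting it in the middle.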

In the attributed model, the entity OWNERSHIP would look like this:

OWNERSHIP

=======================

Person.PersonID (fk)

Car.CarID (fk)

…in a very simple world where we don’t worry about time. The real-world reason we need these associative entities is that they almost always involve an aspect of time and other attributes, but we’ll ignore that for now.

Each foreign key in this associative entity came from one of the two relationships: one from PERSON to OWNERSHIP and one from CAR to OWNERSHIP. That’s why we need two relationships. We could not drop one of them, because each plays its part in associating the two concepts to each other, one pair at a time.
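To see what each of the two foreign keys contributes, here is a small sketch, again using Python's sqlite3 module. The names, the integer person IDs, and the helper try_insert are hypothetical, made up for this example; I'm also assuming the pair of foreign keys serves as the primary key, which the simple model above implies but does not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE car    (car_id    INTEGER PRIMARY KEY);
    CREATE TABLE ownership (
        person_id INTEGER NOT NULL REFERENCES person (person_id),
        car_id    INTEGER NOT NULL REFERENCES car (car_id),
        PRIMARY KEY (person_id, car_id)  -- one row per pair
    );
    INSERT INTO person VALUES (1), (2);
    INSERT INTO car VALUES (1234);
""")

def try_insert(person_id, car_id):
    """Return True if the pairing is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO ownership VALUES (?, ?)",
                         (person_id, car_id))
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert(1, 1234))  # True:  person 1 owns car 1234
print(try_insert(2, 1234))  # True:  a joint owner of the same car
print(try_insert(1, 1234))  # False: the same pair twice
print(try_insert(1, 9999))  # False: no such car
print(try_insert(7, 1234))  # False: no such person
```

Drop either REFERENCES clause and one of the last two inserts would slip through, leaving a pairing that points at nothing.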

The PARTY AFFILIATION entity is a special case of the associative entity above because it started out (at least conceptually) as a recursive relationship (a relationship from an entity to itself). These are more difficult to draw in ASCII data modeling, so I’ll just duplicate the entity:

PARTY >-o-----is affiliated with-----o-< PARTY

So just imagine the relationship being “dog-eared” back to the same entity.

We created PARTYAFFILIATION to do the same job as OWNERSHIP:

PARTY -||-------is affiliated via-----o-< PARTYAFFILIATION >-o----------is affiliated via-----||- PARTY

It would result in an associative entity that looked like this:

PARTYAFFILIATION

=====================

PARTY.PARTYID (fk)

PARTY.PARTYID (fk)

…which we can’t have, since the foreign key would have migrated twice with the same name. So we rolenamed one of the relationships, which gives us:

PARTYAFFILIATION

=====================

PARTY.PARTYID (fk)

PARTY.SUBPARTYID (fk)

Personally, I prefer to rolename both in these cases so it is very clear which role the foreign key is taking, but it works either way.
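The naming clash is easy to reproduce. In this sketch (Python's sqlite3 module again), the snake_case names party, party_affiliation, party_id, and sub_party_id are my own stand-ins for the entity and attribute names above, and the three party IDs are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE party (party_id INTEGER PRIMARY KEY)")

# Migrating the same key twice without a rolename: two columns, one name.
try:
    conn.execute("""
        CREATE TABLE party_affiliation (
            party_id INTEGER NOT NULL REFERENCES party (party_id),
            party_id INTEGER NOT NULL REFERENCES party (party_id)
        )""")
    unnamed_ok = True
except sqlite3.OperationalError:
    # SQLite refuses a table with a duplicate column name.
    unnamed_ok = False

# Rolenaming one of the two foreign keys removes the clash.
conn.execute("""
    CREATE TABLE party_affiliation (
        party_id     INTEGER NOT NULL REFERENCES party (party_id),
        sub_party_id INTEGER NOT NULL REFERENCES party (party_id),
        PRIMARY KEY (party_id, sub_party_id)
    )""")
conn.executemany("INSERT INTO party VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO party_affiliation VALUES (?, ?)",
                 [(1, 2), (1, 3)])

def sub_parties_of(party_id):
    """Parties affiliated under the given party."""
    rows = conn.execute(
        "SELECT sub_party_id FROM party_affiliation "
        "WHERE party_id = ? ORDER BY sub_party_id", (party_id,))
    return [r[0] for r in rows]

print(unnamed_ok)         # False
print(sub_parties_of(1))  # [2, 3]
```

Both columns point back at the same table, so the rolename is the only thing that tells you which side of the affiliation each one represents.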

So if you check out your books on data modeling for the terms I mentioned in the beginning, you might come up with some more examples of why many-to-many relationships resolve to two relationships with an associative entity in the middle.