At 09:22 2019-01-26, Ted Roche tedroche@gmail.com wrote:
[snip]
> To avoid this kind of refactoring later in the process, my rule has always been that every table has a unique, non-data-bearing PK which uniquely identifies the record from birth to death. You will never have to deal with all the RI code involved in changing primary keys because the data values (category or country codes) change, and avoid intricate and bothersome code.
Why not? What if someone creates a second restriction row with the same category, country, and any other factors? Couldn't this create an integrity nightmare?
In my client billing app, I have a few tables that have a date range for when each row is valid. This does mean that I have to handle the lookup into these tables.
[snip]
Sincerely,
Gene Wirchenko
On Mon, Jan 28, 2019 at 8:34 PM Gene Wirchenko genew@telus.net wrote:
> At 09:22 2019-01-26, Ted Roche tedroche@gmail.com wrote:
> [snip]
>> To avoid this kind of refactoring later in the process, my rule has always been that every table has a unique, non-data-bearing PK which uniquely identifies the record from birth to death. You will never have to deal with all the RI code involved in changing primary keys because the data values (category or country codes) change, and avoid intricate and bothersome code.
> Why not?
Why not what? When I said "RI code" I was referring to the Referential Integrity code that has to be generated to handle the situations where a Primary Key changes. If a PK value changes or its row gets deleted, you have to decide whether the change needs to be Cascaded to other tables where the PK is exported as an FK, or Nullified to remove the FK reference, or Prevented if a key change should not be allowed to propagate through the database. Needless to say, in a complex ERD, these RI rules add another layer of complexity to the application-to-business-object-to-data-manager-to-database path. Instead, an unchangeable PK removes this concern: if the PK never changes, no RI code, and Bob's your uncle.
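For illustration, here's a rough sketch of the kind of DELETE trigger the RI Builder generates into the DBC's stored procedures. Table and field names here are invented; VFP has no declarative ON DELETE CASCADE, so it's all trigger code like this:

CREATE TRIGGER ON country FOR DELETE AS ri_del_country()

PROCEDURE ri_del_country
    * The row being deleted is current, so country.country_id is in scope.
    * Cascade: take the child rows out with the parent...
    DELETE FROM restriction WHERE country_fk = country.country_id
    * ...or Nullify: UPDATE restriction SET country_fk = .NULL. WHERE ...
    * ...or Prevent: RETURN .F. if any child rows exist.
    RETURN .T.  && .T. lets the parent's delete proceed
ENDPROC

Multiply that by every FK in the ERD and you can see where the intricate and bothersome code comes from.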
> What if someone creates a second restriction row with the same category, country, and any other factors? Couldn't this create an integrity nightmare?
This isn't an RI issue, strictly speaking. Referential Integrity concerns the consistency of primary keys and their foreign key representations in associated tables. This is a candidate key problem.
If there is a candidate key (one or more fields which SHOULD uniquely identify a record), then there is a requirement that the candidate key be defined with a UNIQUE constraint in SQL. A candidate key can be the primary key: all primary keys are candidate keys, but only one candidate key can be the primary key. (Sadly, this is one of the few places where FoxPro clashes with standard SQL by re-using a term with a different meaning: UNIQUE is a feature of old, old Xbase indexes and SHOULD NOT BE USED. Instead, define candidate keys with the aptly-named CANDIDATE option, which enforces uniqueness at the table level.) Your table-handling logic has to catch candidate-violation errors. Enforcing this at the table level keeps evil-doers from messing with your data via Excel. Of course, in your application, either at the front end or in the business objects (or both!), you should also be checking for duplicate records where there should be none and informing the operator that a record already exists (and, typically, having them update the existing record instead, or overwriting, depending on your app's behavior and business rules).
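A minimal sketch of both pieces, with hypothetical table, field, and tag names:

* Surrogate PK plus a candidate key on the natural values. In
* CREATE TABLE - SQL, UNIQUE builds a candidate index, NOT the old
* Xbase-style unique index.
CREATE TABLE restriction ( ;
    restriction_id I PRIMARY KEY, ;
    category C(10), ;
    country C(3), ;
    UNIQUE category + country TAG natkey )

* Or, on an existing table opened exclusively:
INDEX ON category + country TAG natkey CANDIDATE

Any INSERT or stray spreadsheet edit that duplicates category + country now fails with a uniqueness violation your error handler must trap.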
> In my client billing app, I have a few tables that have a date range for when each row is valid. This does mean that I have to handle the lookup into these tables.
A unique, non-data-bearing primary key solves the RI problem, but candidate keys get pretty difficult to manage when uniqueness also has to guarantee that the valid-date ranges of any two rows (a from field and a to field) never overlap. I'm not aware of a general data pattern that solves all the permutations of this.
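Since an index can't express "no overlapping ranges," the check ends up procedural, something like this sketch (names invented):

* Which rate applies on dAsOf?
SELECT rate FROM rates ;
    WHERE client_id = lnClient ;
    AND BETWEEN(dAsOf, valid_from, valid_to) ;
    INTO CURSOR curRate

* Before inserting a new range, test for a collision: two ranges
* overlap exactly when each one starts before the other ends.
SELECT COUNT(*) FROM rates ;
    WHERE client_id = lnClient ;
    AND dNewFrom <= valid_to ;
    AND dNewTo >= valid_from ;
    INTO ARRAY laHits
IF laHits[1] > 0
    * Reject the insert, or ask the operator to fix the dates.
ENDIF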
Modeling data over time introduces some tricky issues.
I've run into this in pricelist tables where there are current values, upcoming price changes and so forth. How you structure this seems to depend on how you need to access historical or future values; in most cases, I migrate old values into write-only audit tables for historical reference. Most of my data models are designed to reflect the current conditions of the data (prices, statuses, etc.), but sometimes I (or the operators) have to get "clever" to close out old orders or schedule future orders when the prices or schedules shift under our feet.
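The migration step itself is simple enough, something like this sketch (hypothetical names; assumes VFP 8 or later for INSERT ... SELECT):

* File the superseded row away in the write-only history table...
INSERT INTO price_audit ( product_id, price, valid_from, retired ) ;
    SELECT product_id, price, valid_from, DATE() ;
    FROM pricelist WHERE product_id = lnProduct

* ...so the live table carries only the current condition of the data.
UPDATE pricelist ;
    SET price = lnNewPrice, valid_from = DATE() ;
    WHERE product_id = lnProduct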
This is one of the reasons that data is sometimes duplicated in a database design without it being a violation of normalization: a company record has a shipping address, kept up to date. The shipping address is also copied onto each order, because the two addresses are not the same thing: the first is the company's shipping address TODAY, while the order's copy is the address the goods were shipped to when the order was placed.
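In code, the copy happens once, at order entry, along these lines (names invented; again assuming VFP 8+ for INSERT ... SELECT):

* Freeze the company's current shipping address onto the new order;
* later edits to the company record won't rewrite order history.
INSERT INTO orders ( company_fk, ship_addr, ship_city, ship_postal ) ;
    SELECT company_id, ship_addr, ship_city, ship_postal ;
    FROM company WHERE company_id = lnCompany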