On 6/15/2018 12:30 PM, mbsoftwaresolutions@mbsoftwaresolutions.com wrote:
Some years ago I had created a solution where I had split the main data table into 5 separate tables, each with a 1:1 relationship. I kept the commonly held fields in the table, and moved others off like so:
- Main Job table
- Job Address table (job site address)
- Job Dates table (key event dates for this job)
... [snip]
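For anyone following along, that kind of 1:1 split might look roughly like the sketch below (table and column names are my own guesses for illustration, not the actual schema):

    -- Commonly needed fields stay in the main table
    CREATE TABLE job (
        job_id   INTEGER PRIMARY KEY,
        job_name VARCHAR(60),
        status   CHAR(10)
    );

    -- 1:1 side tables share the same key as the main table
    CREATE TABLE job_address (
        job_id INTEGER PRIMARY KEY REFERENCES job(job_id),
        street VARCHAR(60),
        city   VARCHAR(40),
        zip    CHAR(10)
    );

    CREATE TABLE job_dates (
        job_id     INTEGER PRIMARY KEY REFERENCES job(job_id),
        start_date DATE,
        due_date   DATE
    );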
Based on my experience, the fewer fields in a record (row), the better your performance will be IF you do not have to ALWAYS join the other tables. I think Ted touched on this already: keep the field count small, but keep the most frequently needed fields together in the same table. In general, JOINs are 'slow' (NOTE: the VFP ability to SET RELATION TO RECNO() pretty much eliminated the JOIN penalty - it really was like just having one huge table at your disposal - of course there were other things to worry about <g>, oh well....).
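To put that in query terms (reusing the hypothetical tables sketched above), the win only shows up when the everyday queries can skip the joins entirely:

    -- Everyday query: everything it needs lives in the main table,
    -- so no join is paid at all
    SELECT job_id, job_name, status
    FROM job
    WHERE status = 'OPEN';

    -- Occasional report: only here do you pay for the 1:1 joins
    SELECT j.job_id, a.city, d.due_date
    FROM job j
    JOIN job_address a ON a.job_id = j.job_id
    JOIN job_dates   d ON d.job_id = j.job_id
    WHERE j.status = 'OPEN';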
Anyway, to illustrate: I had a field called "PTN_factor" that was filled in only about 30% of the time. However, because of its business value it was included in almost every report and calculation (yes, they simply wanted the confirmation that no PTN_factor was contributing....). So, in that case, I kept PTN_factor in my "main" table. I, too, moved the Notes to a separate table - which turned out well, because they eventually wanted a "date log" note table (a kind of "baseline, then revisions" thing...).
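Just to illustrate how that Notes split paid off once it went 1:many - again, names are hypothetical, not the real schema:

    -- One "baseline" note plus any number of dated revisions per job
    CREATE TABLE job_note (
        note_id     INTEGER PRIMARY KEY,
        job_id      INTEGER NOT NULL REFERENCES job(job_id),
        note_date   DATE    NOT NULL,
        is_baseline CHAR(1) DEFAULT 'N',  -- 'Y' on the original baseline entry
        note_text   VARCHAR(4000)
    );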
I have seen Oracle queries speed up when the "row width" was significantly reduced, but I have not tested the observation systematically in other RDBMSs. I think there may well be some factor of variable-length fields, and of needing to parse for the field separator 1,000 times instead of 50. Also, if you frequently do "SELECT *...", I believe a 100-column table takes significantly more server resources (not just temporary space) than a 20-column table (what the server has to do to find and assemble all the column names, etc.).
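A habit that sidesteps most of that, whatever the table width ends up being (table/column names hypothetical):

    -- Name only the columns the report actually needs...
    SELECT invoice_id, amount, status
    FROM invoice              -- imagine a ~100-column table
    WHERE status = 'OPEN';

    -- ...instead of making the server resolve, assemble, and ship
    -- every one of those columns
    SELECT *
    FROM invoice
    WHERE status = 'OPEN';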
Of course there are all kinds of other variations: 3 columns containing 4,000 characters each will give the "feeling" of being slower than 25 integer fields (in terms of user experience, not server resources). So it is sometimes hard to prove/demonstrate.
Thus my rule: keep normalization in mind for data that is not a top priority to users and/or has a good probability of becoming a 1-to-many need. It's still guesswork (or scientific analysis, if you prefer) <g>
Last but not least, if you have it working now, it is hard to justify a redesign just to "feel" better about having fewer tables. In the end, whatever is causing the most maintenance headaches is probably the best target for change. So if those multi-joins are constantly driving you crazy, then yeah, maybe some consolidation is in order. But I recommend against a single, wide (many-column) record - it does impact performance.
HTH, -Charlie