Hi folks,
Having never attempted this before, I am seeking any guidance you can offer.
I'm using VFP tables across a LAN, with VFP 9 SP 1. The tables are on a virtualized Windows 2012 file server that has gargantuan memory and HDD resources available. The workstations are Windows 7 Ultimate 32-bit with processors in the 3.5 GHz range and 4 GB of RAM.
My framework does not use the RAD features of VFP; no data environments, no control sources, no buffering (I actually wrote my own "buffering"/conflict resolution system), no auto-incrementing columns. The only time I manually lock anything is when my integer surrogate key-creation program is generating a new key--and I only lock the table that stores the last-used key value.
I do use a database container so that the default value methods of the PK fields of my tables can call the key generation program. When users create new records, I use APPEND, not INSERT. The system was deliberately designed to divorce database access from business rules and GUI (because initially, long ago, I thought I might actually embark on writing my own database server).
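In case it helps to see the shape of it, the key generator amounts to something like this (a rough sketch only; the table name "syskeys" and field name "lastkey" are made-up placeholders, not the real ones):

```foxpro
* Sketch of a surrogate-key generator where the only explicit lock
* is on the one-row table holding the last-used key value.
FUNCTION NewKey
    LOCAL lnKey
    lnKey = -1
    IF !USED("syskeys")
        USE syskeys IN 0 SHARED    && one-row table: lastkey I
    ENDIF
    SELECT syskeys
    IF FLOCK()                     && SET REPROCESS governs how long this waits
        REPLACE lastkey WITH lastkey + 1
        lnKey = syskeys.lastkey
        UNLOCK
    ENDIF
    RETURN lnKey                   && -1 means the lock could not be obtained
ENDFUNC
```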
I understand, however, that there is brief automatic locking whenever VFP writes to a table (or rather, whenever the server OS, in its infinite wisdom, deigns to carry out VFP's request to write to the table).
I've been given the task of importing data from a spreadsheet into my multi-user system. I have not done any testing on this but it is possible that the import process, run every week or every two weeks, will create thousands of records that will be "children" of a few hundred "parent" records.
The system is in near constant use by somewhere around a half-dozen people, and much larger numbers use it occasionally.
Some of the half-dozen intensive users are quite likely to be accessing the parent records at the time the bulk import process is creating the child records.
I am concerned about creating massive access conflicts leading to race conditions, freezing of the system, and/or data corruption.
Should I just block everybody from saving any data while the import process runs? (I don't know how long it will take, but I do envision a lot of error-checking and logging on each row of the spreadsheet to prevent bad data from getting into the system.) Or are there simple things I can do to prevent issues ("simple" means, no major changes to the framework or the database)? Or is this something I don't really need to worry about?
Thanks very much for any suggestions.
Ken Dibble www.stic-cil.org
Is the proposed import doing any updates of the parent records, or just adding/updating child records?
Normally (i.e. in vanilla VFP), unless the logged-in users have exclusively opened or otherwise locked the parent records (which you say they won't), I would, in broad terms, be:
* Pre-validating the import data to report on exceptions and arrive at a clean set of import records.
* Upserting the child records, wrapped in a transaction.
* Repeating, if applicable, for the parent records.
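The upsert step could look something like this sketch. It assumes a pre-validated cursor named curImport, a child table CHILD that belongs to the database container (transactions require a DBC), a shared field layout, and an index tag "pk_tag" on the child PK; all of those names are illustrative:

```foxpro
* Upsert validated rows into the child table inside one transaction,
* so other users see either all of the batch or none of it.
USE child IN 0 SHARED
TRY
    BEGIN TRANSACTION
    SELECT curImport
    SCAN
        SCATTER MEMVAR MEMO
        IF SEEK(m.child_pk, "child", "pk_tag")
            SELECT child
            GATHER MEMVAR MEMO          && update the existing child row
        ELSE
            INSERT INTO child FROM MEMVAR  && add a new child row
        ENDIF
        SELECT curImport
    ENDSCAN
    END TRANSACTION                     && commit; locks held only for this span
CATCH TO loErr
    ROLLBACK                            && any failure undoes the whole batch
    * log loErr.Message for the exception report
ENDTRY
```

Keeping all the validation outside the TRY block is what keeps the transaction (and therefore the locks) short.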
You say your application doesn't explicitly lock any tables (RLOCK(), FLOCK(), or automatic locks via BEGIN TRANSACTION), and if the OS does do any sort of ultra-brief locking on file updates, it has never caused me a problem.
So I would say the worst that will happen is that other users might see a 'record not available - please wait' message if they try to save a parent record while you're still inside a transaction on whatever machine the import is running on. But assuming there aren't millions of records and you keep the transaction as quick as possible, I doubt they'd ever even see that.
My first thought when I read this was that I would create temporary cursors (CREATE CURSOR) to import the data initially and then add the new records to the production tables from the temporary cursors. I love using temporary cursors for importing data!
This would give you plenty of time to perform error checking during the import process and then it will allow you to minimize the time it takes to post the imported data to the production tables. You could even control the post process to push one parent-child data set at a time to the production tables as opposed to posting all parents first and then posting all the child records.
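A minimal sketch of what I mean, assuming the source is an XLS file and the column layout shown (the cursor name, field names, and validation rule are all placeholders):

```foxpro
* Stage the spreadsheet in a throwaway cursor; no production table
* is touched until validation is done.
CREATE CURSOR curStage (parent_id I, item_dt D, amount Y, row_ok L)

SELECT curStage
APPEND FROM (lcXlsFile) TYPE XL8   && XL8 reads .xls; .xlsx needs another route

* Validate at leisure - flag the rows that are safe to post
UPDATE curStage SET row_ok = .T. ;
    WHERE parent_id > 0 AND NOT EMPTY(item_dt)
```

From there the posting pass only has to walk the rows where row_ok is true, which is what keeps the time spent in the production tables to a minimum.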
Paul H. Tarver Tarver Program Consultants, Inc. Email: paul@tpcqpc.com
-----Original Message----- From: ProfoxTech [mailto:profoxtech-bounces@leafe.com] On Behalf Of Ken Dibble Sent: Thursday, June 22, 2017 9:49 AM To: profoxtech@leafe.com Subject: Bulk Imports While Multi-User System is in Use
[excessive quoting removed by server]
On 6/22/2017 3:40 PM, Paul H. Tarver wrote:
My first thought when I read this was that I would create temporary cursors (CREATE CURSOR) to import the data initially and then add the new records to
[snip] Agreed with temp cursors. It gives you incredible flexibility if you need it. No worries about table locks etc.
As another option, you could use Excel automation. But I would not recommend that if the import has a performance requirement (e.g. get it loaded within 30 sec). Another downside is the computer performing the data load would have to have Excel installed.
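If you did go the automation route, it's only a few lines; this is a sketch that assumes a staging cursor curStage already exists and that row 1 of the sheet holds column headings:

```foxpro
* Read the sheet cell-by-cell via Excel COM automation.
* Requires Excel on the importing workstation.
LOCAL loExcel, loSheet, lnRow, lnLast
loExcel = CREATEOBJECT("Excel.Application")
loExcel.Workbooks.Open(lcXlsxFile)
loSheet = loExcel.ActiveWorkbook.Sheets(1)
lnLast  = loSheet.UsedRange.Rows.Count
FOR lnRow = 2 TO lnLast            && skip the heading row
    INSERT INTO curStage (parent_id, amount) ;
        VALUES (loSheet.Cells(lnRow, 1).Value, ;
                loSheet.Cells(lnRow, 2).Value)
ENDFOR
loExcel.ActiveWorkbook.Close(.F.)  && close without saving
loExcel.Quit()
```

It's slower than APPEND FROM, but it handles .xlsx and gives you per-cell control during the read.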
In both the above cases, you can do the record inserts one at a time, removing the worry about long "table locks", etc. And you could even give a progress bar as you do the inserts.
HTH, -Charlie
-----Original Message----- From: ProfoxTech [mailto:profoxtech-bounces@leafe.com] On Behalf Of Ken Dibble
[snip]
Actually, there's a VFPx project (https://github.com/ggreen86/XLXS-Workbook-Class) that can read and write native XLSX without Excel. Having said that, if you're using the XLS format then you will need Excel, as Charlie says.
--
rk
-----Original Message----- From: ProfoxTech [mailto:profoxtech-bounces@leafe.com] On Behalf Of Charlie-gm Sent: Thursday, June 22, 2017 9:16 PM To: profoxtech@leafe.com Subject: Re: Bulk Imports While Multi-User System is in Use
[excessive quoting removed by server]
I do a lot of importing of data from hospitals. If your indexes are pretty simple, importing data into the tables while other people are using the system shouldn't be much of a performance hit. All I do is convert to CSV and append it to the table. You may need an interim table. If you'd like to be more efficient, or if performance is really bad, do a SQL INSERT instead. If your indexes are complex, then I would wait until close of business to append.
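The convert-to-CSV-and-append route in miniature (table and file names here are placeholders; TYPE CSV treats the first row of the file as field names):

```foxpro
* Append a weekly CSV extract straight into the shared child table.
* Index updates happen row by row as VFP appends.
USE child IN 0 SHARED
SELECT child
APPEND FROM ("c:\imports\weekly.csv") TYPE CSV
USE IN child
```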
From: Ken Dibble krdibble@stny.rr.com To: profox@leafe.com Sent: Thursday, June 22, 2017 10:48 AM Subject: Bulk Imports While Multi-User System is in Use
[excessive quoting removed by server]