I have an accounting system data dump that I export each year at tax time. I think I have always had a few records that got mangled, but this year I would like to solve the problem.
My preferred technique is to create a cursor with the fields I need to import, along with any required placeholders. I then APPEND FROM the CSV file with TYPE DELIMITED. This is very fast and seems to be perfect except for about 24 records out of 10,000. What happens is that fields that belong in later positions show up in the first field position of some records. It seems some records are handled as though they have a carriage return/line feed too early, and what follows ends up at the beginning of the following record.
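For reference, the cursor-and-append technique described above looks roughly like this (the cursor fields and file name are made up for illustration; the real import has its own field list):

```foxpro
* Build a cursor matching the CSV column order, placeholders included
CREATE CURSOR curImport (acctno C(10), descrip C(40), notes C(100), amount N(12,2))

* Append from the comma-delimited export
APPEND FROM "myexport.csv" TYPE DELIMITED
```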
I have investigated low-level file input using the FGETS() function to pull records out one at a time. This is obviously slower and still seems to have the same problem.
Another possible approach would be to pull the entire file into memory and parse everything out from there. This will take much more time and be a lot of work to set up, but might make sense to try.
Finally, I have opened the file in Excel with no problem. Some of the fields that want to show up in the leftmost field in VFP stay where they belong in Excel. I don't like to add a step in another package, but it may make sense to do so.
Any wisdom will be much appreciated. Thanks in advance,
Joe
Joe, you may consider using the CSVProcessor class at https://github.com/atlopes/csv.
'Append from' has always been bombproof in my experience, as long as what you're giving it is correct. Do those problem rows have any weird non-ASCII characters or anything?
This challenge may arise if the delimiter also appears as a value inside the field content, as a comma often does.
Ideally, it would be good to use a delimiter which is never used in the content, such as the pipe symbol.
Trust this should solve your challenge.
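If the accounting package can export with a pipe separator, the append might look like this (file name hypothetical):

```foxpro
* Use "|" as the field separator instead of the default comma
APPEND FROM "myexport.txt" TYPE DELIMITED WITH CHARACTER "|"
```

Note this only helps when the comma is the culprit; it won't fix stray line breaks inside a field.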
-- Cheers!!!
Srikanth Bhandari
Email: srikanth.bhandaari@gmail.com
Alan and all, I have researched the problem records and found that they have line feed characters without a leading carriage return. This results from the operator hitting Return in a notes field of the accounting package I am using.
Apparently Excel's import insists on a carriage return/line feed combination, while VFP's APPEND FROM ... DELIMITED accepts a line feed by itself. It seems to me I am not the first person to figure this out, and surely there is a parameter to alter this behavior.
I thought I could simply preprocess the file and replace each occurrence of a lone line feed with a comma, but when I look at the functions available I suspect it will be painfully slow.
Any ideas appreciated,
Joe
You can use Notepad++ find-and-replace to preprocess the file (in Regular expression search mode, a pattern like (?<!\r)\n matches only the orphan line feeds). Rick Q quilhotr@gmail.com
On Mon, Apr 17, 2023 at 12:04 PM Joe Yoder joe@wheypower.com wrote:
I thought I could simply preprocess the file and change each occurrence of a single line feed with a comma but when I look at functions available I suspect it will be painfully slow.
Any ideas appreciated,
You ought to try it before rejecting it. VFP may surprise you. Nothing runs like the Fox.
If I was trying to replace orphan single linefeeds with commas, I'd do something like this:
#DEFINE CRLF CHR(13)+CHR(10)
#DEFINE LF CHR(10)
badstr = FILETOSTR("mybadfile.csv")
goodstr = STRTRAN(badstr, CRLF, "~~")
goodstr = STRTRAN(goodstr, LF, ",")
goodstr = STRTRAN(goodstr, "~~", CRLF)
STRTOFILE(goodstr, "mygoodfile.csv")
Check to ensure ~~ doesn't already exist in the original file, or substitute your own placeholder.
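A quick way to run that check, and to count how many orphan line feeds you actually have (assuming the same file name as above):

```foxpro
badstr = FILETOSTR("mybadfile.csv")

* The placeholder must not already occur in the data
? OCCURS("~~", badstr)    && want 0 here

* Orphan LFs = all LFs minus those that are part of a CRLF pair
? OCCURS(CHR(10), badstr) - OCCURS(CHR(13)+CHR(10), badstr)
```

If the second number is around 24, that matches the count of mangled records and confirms the diagnosis.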