I have an accounting system data dump that I export each year at tax time. I think I have always had a few records that got mangled, but this year I would like to solve the problem.
My preferred technique is to create a cursor with the fields I need to import, along with any required placeholders. I then APPEND FROM the CSV file with TYPE DELIMITED. This is very fast and seems to be perfect except for about 24 records out of 10,000. What happens is that fields that belong in later positions show up in the first field position of some records. It seems some records are handled as though they have a carriage return/line feed too early, and what follows ends up at the beginning of the following record.
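For reference, the cursor-and-append technique described above looks roughly like this (the cursor fields and file name are made up for illustration; the real import has its own field list):

```foxpro
* Build a cursor matching the CSV column order, placeholders included
CREATE CURSOR curImport (acctno C(10), descrip C(40), notes C(100), amount N(12,2))

* Append from the comma-delimited export
APPEND FROM "myexport.csv" TYPE DELIMITED
```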
I have investigated low-level file input using the FGETS() function to pull records out one at a time. This is obviously slower and still seems to have the same problem.
Another possible approach would be to pull the entire file into memory and parse everything out from there. This will take much more time and be a lot of work to set up, but might make sense to try.
Finally, I have opened the file in Excel with no problem. Some of the fields that want to show up in the leftmost field in VFP stay where they belong in Excel. I don't like to add a step in another package, but it may make sense to do so.
Any wisdom will be much appreciated. Thanks in advance,
Joe
Joe, you may consider using the CSVProcessor class at https://github.com/atlopes/csv.
'Append from' has always been bombproof in my experience, as long as what you're giving it is correct. Do those problem rows have any weird non-ASCII characters or anything?
This challenge may arise if the delimiter also appears as a value inside the field content, as a comma often does.
Ideally, it would be good to use a delimiter which is never used in the content, such as the pipe symbol.
Trust this should solve your challenge.
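If the accounting package can export with a pipe separator, the append might look like this (file name hypothetical):

```foxpro
* Use "|" as the field separator instead of the default comma
APPEND FROM "myexport.txt" TYPE DELIMITED WITH CHARACTER "|"
```

Note this only helps when the comma is the culprit; it won't fix stray line breaks inside a field.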
-- Cheers!!!
Srikanth Bhandari
Email: srikanth.bhandaari@gmail.com
Alan and all, I have researched the problem records and found that they have line feed characters without a leading carriage return. This results from the operator hitting Return in a notes field of the accounting package I am using.
Apparently Excel's import insists on a carriage return/line feed combination, while VFP's APPEND FROM ... DELIMITED accepts a line feed by itself. It seems to me I am not the first person to figure this out, and surely there is a parameter to alter this behavior.
I thought I could simply preprocess the file and replace each occurrence of a lone line feed with a comma, but when I look at the functions available I suspect it will be painfully slow.
Any ideas appreciated,
Joe
You can use Notepad++ find-and-replace to preprocess the file (in Regular expression search mode, a pattern like (?<!\r)\n matches only the orphan line feeds). Rick Q quilhotr@gmail.com
On Mon, Apr 17, 2023 at 12:04 PM Joe Yoder joe@wheypower.com wrote:
I thought I could simply preprocess the file and change each occurrence of a single line feed with a comma but when I look at functions available I suspect it will be painfully slow.
Any ideas appreciated,
You ought to try it before rejecting it. VFP may surprise you. Nothing runs like the Fox.
If I was trying to replace orphan single linefeeds with commas, I'd do something like this:
#DEFINE CRLF CHR(13)+CHR(10)
#DEFINE LF CHR(10)
badstr = FILETOSTR("mybadfile.csv")
goodstr = STRTRAN(badstr, CRLF, "~~")
goodstr = STRTRAN(goodstr, LF, ",")
goodstr = STRTRAN(goodstr, "~~", CRLF)
STRTOFILE(goodstr, "mygoodfile.csv")
Check to ensure ~~ doesn't already exist in the original file, or substitute your own placeholder.
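A quick way to run that check, and to count how many orphan line feeds you actually have (assuming the same file name as above):

```foxpro
badstr = FILETOSTR("mybadfile.csv")

* The placeholder must not already occur in the data
? OCCURS("~~", badstr)    && want 0 here

* Orphan LFs = all LFs minus those that are part of a CRLF pair
? OCCURS(CHR(10), badstr) - OCCURS(CHR(13)+CHR(10), badstr)
```

If the second number is around 24, that matches the count of mangled records and confirms the diagnosis.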