Hello everyone,
I need to set up a job to import a pipe-delimited text file. This is easy enough, but one of the fields is longer than 254 characters. If I use a memo field, that field does not import. I started to write a routine that steps through each character and stores the fields manually, but I would rather not do it that way.
Does anyone have a function or tip they can share to resolve this situation?
Here's a quick & dirty idea to resolve your problem.
Read that file in via the low-level file functions. When you get to the field that's too long, chop it into two parts - with another pipe between them - and write it all out to a new text file. Then you can simply import that new text file.
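A minimal sketch of that idea, assuming (hypothetically) that the fourth field is the over-long one and the files are named source.txt and chopped.txt:

LOCAL lnIn, lnOut, lcLine, lcOut, lnI, lnCount, laFld[1]
lnIn  = FOPEN("source.txt")                && hypothetical input file
lnOut = FCREATE("chopped.txt")             && rewritten copy with the long field split
DO WHILE !FEOF(lnIn)
   lcLine  = FGETS(lnIn, 8192)
   lnCount = ALINES(laFld, lcLine, 0, "|")      && split on pipes, keep empty fields
   IF lnCount >= 4 AND LEN(laFld[4]) > 254      && hypothetical: field 4 is the long one
      laFld[4] = STUFF(laFld[4], 255, 0, "|")   && drop an extra pipe in at byte 254
   ENDIF
   lcOut = laFld[1]
   FOR lnI = 2 TO lnCount
      lcOut = lcOut + "|" + laFld[lnI]
   ENDFOR
   = FPUTS(lnOut, lcOut)                   && FPUTS writes the line plus a CR/LF
ENDDO
= FCLOSE(lnIn)
= FCLOSE(lnOut)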
-K-
A parsing routine is usually the fastest. LLFFs (don't forget to use _MLINE), FileToStr(), or APPEND MEMO and then parsing each line all come out about even in terms of processing time and code, unless the file is really extreme (10 GB, million-byte lines, funny characters...).
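For instance, the FileToStr() route can be this short (a sketch only; import.txt and what you do with each field are placeholders):

LOCAL lcFile, lnLine, lnFlds, lnFld, laLines[1], laFields[1]
lcFile = FILETOSTR("import.txt")                       && hypothetical file name
FOR lnLine = 1 TO ALINES(laLines, lcFile)              && default: split on CR and/or LF
   lnFlds = ALINES(laFields, laLines[lnLine], 0, "|")  && split the line on pipes
   FOR lnFld = 1 TO lnFlds
      * store laFields[lnFld] wherever it belongs; a memo field takes any length
   ENDFOR
ENDFOR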
Whil ran into a situation where long fields and an impossible number of columns made that technique impractical, and he came up with an innovative solution: import the data into SQLite, then use ODBC from VFP to manipulate the data from there. He published his whitepaper here: http://hentzenwerke.com/catalog/sqlite2gb.htm
[Disclaimer: I tech-edited this, but make no money from promoting it.]
On 2017-04-20 11:13, Ted Roche wrote:
<snipped>
I do something similar--I use MariaDB/MySQL to import the text file to a table there, having all fields be TEXT fields. Works great. Why reinvent the wheel?
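A rough sketch of that round trip from the VFP side; the connection string, file path, and column list here are all hypothetical:

LOCAL lnConn
* hypothetical connection string; adjust driver name, server, database, and credentials
lnConn = SQLSTRINGCONNECT("Driver={MariaDB ODBC 3.1 Driver};Server=localhost;" + ;
                          "Database=staging;UID=user;PWD=pass")
IF lnConn > 0
   = SQLEXEC(lnConn, "CREATE TABLE IF NOT EXISTS raw_import (f1 TEXT, f2 TEXT, f3 TEXT)")
   = SQLEXEC(lnConn, "LOAD DATA LOCAL INFILE 'c:/temp/import.txt' INTO TABLE raw_import " + ;
                     "FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'")
   = SQLEXEC(lnConn, "SELECT * FROM raw_import", "crsImport")  && pull it back as a VFP cursor
   = SQLDISCONNECT(lnConn)
ENDIF

Note that LOAD DATA LOCAL INFILE has to be enabled on both the server and the client connection, or the second SQLEXEC() will fail.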
We had a file format coming from a mainframe with impossibly long lines, but the first two characters of each line told us the line type. We knew a line wasn't more than 2000 characters long, including the delimiters. We came up with this approach, and it works blazingly fast in VFP.
Step 1: Create a temporary cursor for the entire contents of the file. Basically, get it into VFP as quickly as we can:

CREATE CURSOR MyTempFile ( ;
   cType C(2), ;
   Field01 C(200), Field02 C(200), Field03 C(200), Field04 C(200), Field05 C(200), ;
   Field06 C(200), Field07 C(200), Field08 C(200), Field09 C(200), Field10 C(200))
Step 2: Append into MyTempFile from the file:

SELECT MyTempFile
APPEND FROM DATFile.txt TYPE SDF   && fixed-width: 2 bytes of type, then ten 200-byte chunks
Step 3: Select the records out of MyTempFile based on cType, which tells us what the structure is expected to be. This might fit your scenario, where a certain field in the delimited line "means" something. You only need to handle the case where a field's contents span from one 200-character chunk into the next, which the concatenation below takes care of. Example solution:

SCAN
   SCATTER NAME oRec
   cFullField = oRec.Field01 + oRec.Field02 + oRec.Field03 + oRec.Field04 + oRec.Field05 + ;
                oRec.Field06 + oRec.Field07 + oRec.Field08 + oRec.Field09 + oRec.Field10
   DO CASE
      CASE oRec.cType = "X2"
         SELECT TableX2
         SCATTER NAME oRec BLANK
         oRec.Fld01 = STREXTRACT(cFullField, "|", "|", 1)
         oRec.Fld02 = STREXTRACT(cFullField, "|", "|", 2)
         oRec.Fld03 = STREXTRACT(cFullField, "|", "|", 3)   && Might be a memo field
         INSERT INTO TableX2 FROM NAME oRec
      CASE oRec.cType = "TY"
         * do something
      OTHERWISE
         * Unhandled situation
   ENDCASE
ENDSCAN
Hope this helps!
Or you could use ALINES.
m.ctext = [field 1|field 2||field 4|field 5|field 6]
? ALINES(anarray, m.ctext, 2, [|])
DISPLAY MEMORY LIKE anarray*
--
rk
On 2017-04-20 11:56, Kevin J Cully wrote:
<snipped>
oRec.Fld01 = STREXTRACT(cFullField, "|", "|", 1)
oRec.Fld02 = STREXTRACT(cFullField, "|", "|", 2)
oRec.Fld03 = STREXTRACT(cFullField, "|", "|", 3)   && Might be a memo field
Wow! Never heard of this function before, so I looked it up in VFP and there it is! Thanks!!!!!!!!!!
STREXTRACT() came along in 7, maybe? It really simplifies a lot of that old OCCURS()/AT()/RAT()-type code.
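A quick before-and-after sketch, using a hypothetical pipe-delimited string lcLine:

* the old way: AT() arithmetic to grab the text between the 2nd and 3rd pipes
lcFld = SUBSTR(lcLine, AT("|", lcLine, 2) + 1, AT("|", lcLine, 3) - AT("|", lcLine, 2) - 1)
* the STREXTRACT() way: same result in one readable call
lcFld = STREXTRACT(lcLine, "|", "|", 2)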
--
rk
On Fri, Apr 21, 2017 at 11:21 AM, Richard Kaye rkaye@invaluable.com wrote:
STREXTRACT() came along in 7, maybe? It really simplifies a lot of that old OCCURS()/AT()/RAT()-type code.
7 it was.
There's an obscure feature of the Hacker's Guide CHM file: open the Contents tab on the left, select "Customize..." from the right-click context menu, and there's a set of dialogs that let you filter what you see. Select "Custom" on the first page, click Next, and check off (or tick, in the UK) which versions you want to see.
(And, yes, we documented this too, in "How to Use the Help File," under the subhead "Just the Way You Want It.")
In this case, I would probably get each line of data into memory or a cursor as quickly as possible to speed up the parsing (i.e., FILETOSTR() if you can, or FGETS()/FREAD() into a cursor if you can't). However, I believe the low-level file functions would be easier to use here.
FGETS() returns a maximum of 8,192 bytes. (If you need more characters per line, you can FREAD() up to 65,535 bytes, but using FREAD() requires more programming to set the read length correctly for each line and to position the file pointer manually.)
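A minimal sketch of that extra bookkeeping, assuming CRLF line endings (the 65000-byte read is just an arbitrary chunk size):

lcChunk = FREAD(lnFileHandle, 65000)                 && grab more than any one line needs
lnEol = AT(CHR(13) + CHR(10), lcChunk)
IF lnEol > 0
   lcLineData = LEFT(lcChunk, lnEol - 1)             && the line, without its CRLF
   = FSEEK(lnFileHandle, lnEol + 1 - LEN(lcChunk), 1)   && rewind to just past the CRLF
ELSE
   lcLineData = lcChunk                              && last line, or no terminator found
ENDIF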
I haven't tested this code but I think it would look something like this:
LOCAL lnFileHandle, lnFileSize, lcLineData, lnFldLoop, lcDestFld, laDataArray[1]
USE DestTable IN 0 && I'm assuming that the number and order of fields match the text file you are importing
lnFileHandle = FOPEN("test.txt")
lnFileSize = FSEEK(lnFileHandle, 0, 2) && file size = offset of end of file
= FSEEK(lnFileHandle, 0, 0) && back to the top of the file before reading
IF lnFileSize <= 0
WAIT WINDOW "Error! Empty file!" NOWAIT
ELSE
DO WHILE !FEOF(lnFileHandle)
lcLineData = FGETS(lnFileHandle, 8192)
APPEND BLANK IN DestTable && add a record for this line
FOR lnFldLoop = 1 TO ALINES(laDataArray, lcLineData, 0, "|")
lcDestFld = FIELD(lnFldLoop, 'DestTable')
REPLACE (lcDestFld) WITH laDataArray[lnFldLoop] IN DestTable && name expression, no macro needed
ENDFOR
ENDDO
ENDIF
= FCLOSE(lnFileHandle) && Close the file
You can get field-specific if you want to convert imported character data into numeric, date, binary, or memo fields by doing direct replacements instead of the field-loop process shown above.
The bigger challenge that I see: when a field is bigger than 254 characters, it is typically reserved for user input/notes/memos/etc., and invariably a user will enter delimiter values inside that field, which can break your parse routine.
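One cheap defense, sketched below, is to count the pipes in each line and shunt suspicious lines to an exception file for review. It assumes a hypothetical 10-field layout (so 9 delimiters) and would sit right after the FGETS() call in the loop above:

IF OCCURS("|", lcLineData) > 9   && more pipes than a 10-field line should have
   = STRTOFILE(lcLineData + CHR(13) + CHR(10), "suspect.txt", 1)   && 1 = append
   LOOP   && skip this line; repair it by hand later
ENDIF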
Paul H. Tarver
I'll take a quick second to remind folks of VFP2C32.FLL for accessing files greater than 2GB. Functions are FGetSex, FCloseEx, FReadEx, FOpenEx, etc.
--Mike
On 21/04/2017 16:13, mbsoftwaresolutions@mbsoftwaresolutions.com wrote:
I'll take a quick second to remind folks of VFP2C32.FLL for accessing files greater than 2GB. Functions are FGetSex
Hope that doesn't take you to a dodgy site ;-)
Peter
On 2017-04-21 11:28, Peter Cushing wrote:
On 21/04/2017 16:13, mbsoftwaresolutions@mbsoftwaresolutions.com wrote:
I'll take a quick second to remind folks of VFP2C32.FLL for accessing files greater than 2GB. Functions are FGetSex
Hope that doesn't take you to a dodgy site ;-)
Peter
LOL!! You saw my pun-like reference. When you put the command in code, it upper-cases all of it and, well, it just looks like that instead of FGetsEx. :D