Hello everyone,
I need to set up a job to import a pipe-delimited text file. This is easy enough, but one of the fields is longer than 254 characters. If I use a memo field, that field does not import. I started to write a routine that steps through each character and stores the fields manually, but I would rather not do it that way.
Does anyone have a function or tip they can share to resolve this situation?
Here's a quick & dirty idea to resolve your problem.
Read that file in via the low-level file functions. When you get to the field that's too long, chop it into two parts - with another pipe between them - and write it all out to a new text file. Then you can simply import that new text file.
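A minimal sketch of that idea, assuming (hypothetically) that the fourth field is the over-long one and the files are named source.txt and chopped.txt:

LOCAL lnIn, lnOut, lcLine, lcOut, lnI, lnCount, laFld[1]
lnIn  = FOPEN("source.txt")                && hypothetical input file
lnOut = FCREATE("chopped.txt")             && rewritten copy with the long field split
DO WHILE !FEOF(lnIn)
   lcLine  = FGETS(lnIn, 8192)
   lnCount = ALINES(laFld, lcLine, 0, "|")      && split on pipes, keep empty fields
   IF lnCount >= 4 AND LEN(laFld[4]) > 254      && hypothetical: field 4 is the long one
      laFld[4] = STUFF(laFld[4], 255, 0, "|")   && drop an extra pipe in at byte 254
   ENDIF
   lcOut = laFld[1]
   FOR lnI = 2 TO lnCount
      lcOut = lcOut + "|" + laFld[lnI]
   ENDFOR
   = FPUTS(lnOut, lcOut)                   && FPUTS writes the line plus a CR/LF
ENDDO
= FCLOSE(lnIn)
= FCLOSE(lnOut)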
-K-
A parsing routine is usually the fastest. LLFFs (don't forget to use _MLINE), FileToStr(), or APPEND MEMO and then parsing each line all come out about even in terms of processing time and code, unless the file is really extreme (10 GB, million-byte lines, funny characters...).
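For instance, the FileToStr() route can be this short (a sketch only; import.txt and what you do with each field are placeholders):

LOCAL lcFile, lnLine, lnFlds, lnFld, laLines[1], laFields[1]
lcFile = FILETOSTR("import.txt")                       && hypothetical file name
FOR lnLine = 1 TO ALINES(laLines, lcFile)              && default: split on CR and/or LF
   lnFlds = ALINES(laFields, laLines[lnLine], 0, "|")  && split the line on pipes
   FOR lnFld = 1 TO lnFlds
      * store laFields[lnFld] wherever it belongs; a memo field takes any length
   ENDFOR
ENDFOR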
Whil ran into a situation where long fields and an impossible number of columns made that technique impractical, and he came up with an innovative solution: import the data into SQLite, then use ODBC from VFP to manipulate the data from there. He published his whitepaper here: http://hentzenwerke.com/catalog/sqlite2gb.htm
[Disclaimer: I tech-edited this, but make no money from promoting it.]
On 2017-04-20 11:13, Ted Roche wrote:
<snipped>
I do something similar--I use MariaDB/MySQL to import the text file to a table there, having all fields be TEXT fields. Works great. Why reinvent the wheel?
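A rough sketch of that round trip from the VFP side; the connection string, file path, and column list here are all hypothetical:

LOCAL lnConn
* hypothetical connection string; adjust driver name, server, database, and credentials
lnConn = SQLSTRINGCONNECT("Driver={MariaDB ODBC 3.1 Driver};Server=localhost;" + ;
                          "Database=staging;UID=user;PWD=pass")
IF lnConn > 0
   = SQLEXEC(lnConn, "CREATE TABLE IF NOT EXISTS raw_import (f1 TEXT, f2 TEXT, f3 TEXT)")
   = SQLEXEC(lnConn, "LOAD DATA LOCAL INFILE 'c:/temp/import.txt' INTO TABLE raw_import " + ;
                     "FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'")
   = SQLEXEC(lnConn, "SELECT * FROM raw_import", "crsImport")  && pull it back as a VFP cursor
   = SQLDISCONNECT(lnConn)
ENDIF

Note that LOAD DATA LOCAL INFILE has to be enabled on both the server and the client connection, or the second SQLEXEC() will fail.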
We had a file format coming from a mainframe with impossibly long lines, but the first two characters of each line told us the line type. We knew a line wasn't more than 2000 characters long, including the delimiters. We came up with this approach, and it works blazingly fast in VFP.
Step 1: Create a temporary cursor for the entire contents of the file. Basically, get it into VFP as quickly as we can:

CREATE CURSOR MyTempFile ( ;
   cType C(2), ;
   Field01 C(200), Field02 C(200), Field03 C(200), Field04 C(200), Field05 C(200), ;
   Field06 C(200), Field07 C(200), Field08 C(200), Field09 C(200), Field10 C(200))
Step 2: Append into MyTempFile from the file:

SELECT MyTempFile
APPEND FROM DATFile.txt TYPE SDF   && fixed-width: 2 bytes of type, then ten 200-byte chunks
Step 3: Select the records out of MyTempFile based on cType, which tells us what the structure is expected to be. This might fit your scenario, where a certain field in the delimited line "means" something. You only need to handle the case where a field's contents span from one 200-character chunk into the next, which the concatenation below takes care of. Example solution:

SCAN
   SCATTER NAME oRec
   cFullField = oRec.Field01 + oRec.Field02 + oRec.Field03 + oRec.Field04 + oRec.Field05 + ;
                oRec.Field06 + oRec.Field07 + oRec.Field08 + oRec.Field09 + oRec.Field10
   DO CASE
      CASE oRec.cType = "X2"
         SELECT TableX2
         SCATTER NAME oRec BLANK
         oRec.Fld01 = STREXTRACT(cFullField, "|", "|", 1)
         oRec.Fld02 = STREXTRACT(cFullField, "|", "|", 2)
         oRec.Fld03 = STREXTRACT(cFullField, "|", "|", 3)   && Might be a memo field
         INSERT INTO TableX2 FROM NAME oRec
      CASE oRec.cType = "TY"
         * do something
      OTHERWISE
         * Unhandled situation
   ENDCASE
ENDSCAN
Hope this helps!
Or you could use ALINES.
m.ctext = [field 1|field 2||field 4|field 5|field 6]
? ALINES(anarray, m.ctext, 2, [|])
DISPLAY MEMORY LIKE anarray*
--
rk
On 2017-04-20 11:56, Kevin J Cully wrote:
<snipped>
oRec.Fld01 = STREXTRACT(cFullField, "|", "|", 1)
oRec.Fld02 = STREXTRACT(cFullField, "|", "|", 2)
oRec.Fld03 = STREXTRACT(cFullField, "|", "|", 3)   && Might be a memo field
Wow! Never heard of this function before, so I looked it up in VFP and there it is! Thanks!!!!!!!!!!
STREXTRACT() came along in 7, maybe? It really simplifies a lot of that old OCCURS()/AT()/RAT()-type code.
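A quick before-and-after sketch, using a hypothetical pipe-delimited string lcLine:

* the old way: AT() arithmetic to grab the text between the 2nd and 3rd pipes
lcFld = SUBSTR(lcLine, AT("|", lcLine, 2) + 1, AT("|", lcLine, 3) - AT("|", lcLine, 2) - 1)
* the STREXTRACT() way: same result in one readable call
lcFld = STREXTRACT(lcLine, "|", "|", 2)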
--
rk
On Fri, Apr 21, 2017 at 11:21 AM, Richard Kaye rkaye@invaluable.com wrote:
STREXTRACT() came along in 7, maybe? It really simplifies a lot of that old OCCURS()/AT()/RAT()-type code.
7 it was.
There's an obscure feature of the Hacker's Guide CHM file: open the Contents tab on the left, select "Customize..." from the right-click context menu, and there's a set of dialogs that let you filter what you see. Select "Custom" on the first page, click Next, and check off (or tick, in the UK) which versions you want to see.
(And, yes, we documented this too, in "How to Use the Help File," under the subhead "Just the Way You Want It.")
In this case, I would probably get each line of data into memory or a cursor as quickly as possible to speed up the parsing (i.e., FILETOSTR() if you can, or FGETS()/FREAD() into a cursor if you can't). However, I believe the low-level file functions would be easier to use here.
FGETS() returns a maximum of 8,192 bytes. (If you need more characters per line, you can FREAD() up to 65,535 bytes, but using FREAD() requires more programming to set the read length correctly for each line and to position the file pointer manually.)
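A minimal sketch of that extra bookkeeping, assuming CRLF line endings (the 65000-byte read is just an arbitrary chunk size):

lcChunk = FREAD(lnFileHandle, 65000)                 && grab more than any one line needs
lnEol = AT(CHR(13) + CHR(10), lcChunk)
IF lnEol > 0
   lcLineData = LEFT(lcChunk, lnEol - 1)             && the line, without its CRLF
   = FSEEK(lnFileHandle, lnEol + 1 - LEN(lcChunk), 1)   && rewind to just past the CRLF
ELSE
   lcLineData = lcChunk                              && last line, or no terminator found
ENDIF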
I haven't tested this code but I think it would look something like this:
LOCAL lnFileHandle, lnFileSize, lcLineData, lnFldLoop, lcDestFld, laDataArray[1]
USE DestTable IN 0 && I'm assuming that the number and order of fields match the text file you are importing
lnFileHandle = FOPEN("test.txt")
lnFileSize = FSEEK(lnFileHandle, 0, 2) && file size = offset of end of file
= FSEEK(lnFileHandle, 0, 0) && back to the top of the file before reading
IF lnFileSize <= 0
WAIT WINDOW "Error! Empty file!" NOWAIT
ELSE
DO WHILE !FEOF(lnFileHandle)
lcLineData = FGETS(lnFileHandle, 8192)
APPEND BLANK IN DestTable && add a record for this line
FOR lnFldLoop = 1 TO ALINES(laDataArray, lcLineData, 0, "|")
lcDestFld = FIELD(lnFldLoop, 'DestTable')
REPLACE (lcDestFld) WITH laDataArray[lnFldLoop] IN DestTable && name expression, no macro needed
ENDFOR
ENDDO
ENDIF
= FCLOSE(lnFileHandle) && Close the file
You can get field-specific if you want to convert imported character data into numeric, date, binary, or memo fields by doing direct replacements instead of the field-loop process shown above.
The bigger challenge that I see: when a field is bigger than 254 characters, it is typically reserved for user input/notes/memos/etc., and invariably a user will enter delimiter values inside that field, which can break your parse routine.
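One cheap defense, sketched below, is to count the pipes in each line and shunt suspicious lines to an exception file for review. It assumes a hypothetical 10-field layout (so 9 delimiters) and would sit right after the FGETS() call in the loop above:

IF OCCURS("|", lcLineData) > 9   && more pipes than a 10-field line should have
   = STRTOFILE(lcLineData + CHR(13) + CHR(10), "suspect.txt", 1)   && 1 = append
   LOOP   && skip this line; repair it by hand later
ENDIF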
Paul H. Tarver
I'll take a quick second to remind folks of VFP2C32.FLL for accessing files greater than 2GB. Functions are FGetSex, FCloseEx, FReadEx, FOpenEx, etc.
--Mike
On 21/04/2017 16:13, mbsoftwaresolutions@mbsoftwaresolutions.com wrote:
I'll take a quick second to remind folks of VFP2C32.FLL for accessing files greater than 2GB. Functions are FGetSex
Hope that doesn't take you to a dodgy site ;-)
Peter
On 2017-04-21 11:28, Peter Cushing wrote:
On 21/04/2017 16:13, mbsoftwaresolutions@mbsoftwaresolutions.com wrote:
I'll take a quick second to remind folks of VFP2C32.FLL for accessing files greater than 2GB. Functions are FGetSex
Hope that doesn't take you to a dodgy site ;-)
Peter
LOL!! You saw my pun-like reference. When you put the command in code, it upper-cases all of it and, well, it just looks like that instead of FGetsEx. :D