I'm coming from a Java/.NET background and trying to learn ABL, but the difference in structure and the limited information on the internet are making it hard. What I want to do is import data from a text file which is in the following format:
john smith 52 ceo
...
line by line, and take the different parts based on character position. For example, positions 1-10 are for the first name, 11-20 the second name, and so on... Do I have to use ENTRY for that? If so, can someone more experienced give an example of how to do it, because I'm quite confused. Then I need to add a record for each line to a temp-table I have created called tt-employee. How do I go about doing that?
I apologise if my question is a bit vague, but as I said, I am new to this so I'm still figuring things out.
If space is a delimiter, you can use the IMPORT statement.
DEFINE TEMP-TABLE tt-employee NO-UNDO
    FIELD firstname AS CHARACTER
    FIELD lastname  AS CHARACTER
    FIELD age       AS INTEGER
    FIELD empTitle  AS CHARACTER.

INPUT FROM c:\temp\indata.dat.

REPEAT:
    CREATE tt-employee.
    IMPORT DELIMITER " " tt-employee.
END.

INPUT CLOSE.
However, if there isn't a delimiter but rather a fixed record width (as you mention), you can do something like this (error checking and correct record lengths need to be applied):
/* Skipping temp-table definition - copy-paste from above */
DEFINE VARIABLE cRow AS CHARACTER NO-UNDO.

INPUT FROM c:\temp\indata.dat.

REPEAT:
    IMPORT UNFORMATTED cRow.
    /* You could replace 0 with a higher number that qualifies a record so
       SUBSTRING doesn't return an error if reading past end of line */
    IF LENGTH(cRow) > 0 THEN DO:
        CREATE tt-employee.
        ASSIGN
            tt-employee.firstname = SUBSTRING(cRow, 1, 10)
            tt-employee.lastname  = SUBSTRING(cRow, 11, 10)
            tt-employee.age       = INTEGER(SUBSTRING(cRow, 21, 2))
            tt-employee.empTitle  = SUBSTRING(cRow, 23, 10) NO-ERROR.
    END.
END.

INPUT CLOSE.
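One detail worth noting: SUBSTRING keeps any padding spaces inside the fixed-width columns, so you will probably want to TRIM the character fields as you assign them, for example:

tt-employee.firstname = TRIM(SUBSTRING(cRow, 1, 10)).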
There are several places on the web to look for OpenEdge information:
Official knowledgebase - http://knowledgebase.progress.com/
Official community - https://community.progress.com/?Redirected=true
More communities - http://www.progresstalk.com/ and http://oehive.org/
Related
I have read data from a CSV file containing many duplicate email addresses into a temp-table. The format is essentially id, emailtype-description, email.
Here is an example of some data:
id  emailtype-description  email
1   E-Mail                 john@gmail.com
1   preferred E-mail       john@gmail.com
2   2nd E-mail             stacey@yahoo.com
2   preferred-Email        sth@yahoo.com
2   family E-Mail          sth@yahoo.com
cInputFile = SUBSTITUTE(cDataDirectory, "Emails").

INPUT STREAM csv FROM VALUE(cInputFile).

IMPORT STREAM csv DELIMITER "," ^ NO-ERROR.   /* skip the header line */

REPEAT TRANSACTION:
    CREATE ttEmail.
    IMPORT STREAM csv DELIMITER ","
        ttEmail.uniqueid
        ttEmail.emailTypeDescription
        ttEmail.emailAddr
        .
END.

INPUT STREAM csv CLOSE.
I want to dedupe the rows, but I don't want to do this randomly. I want to make sure that certain types take priority over others. For instance, some are marked with the type "preferred E-mail" and those should always remain if they exist; additionally, some types take precedence over others, so "E-mail" will take precedence over "2nd-Email" or "family E-Mail".
I'd like to do in Progress code the equivalent of a custom sort of emailtype-description, then a de-dupe. That way I could define the sort order and then dedupe to retain the emails and the types by priority.
Is there a way to do this to my table in Progress? I want to sort first by uniqueid, then by emailtype-description, but I want a custom sort, not an alphabetical sort. What is the best approach?
When you say that you want a custom sort, not alphabetical, do you mean that you want to sort by the emailtype in a non-alphabetical way? If so, then I think you would need to translate the email type into a field that sorts the way you wish. Something along these lines:
/* first add a field to your ttEmail called emailTypeSortOrder */
define variable emailTypeSortOrderList as character no-undo.

emailTypeSortOrderList = "preferred E-mail,E-mail,2nd-Email,family E-mail".

cInputFile = SUBSTITUTE(cDataDirectory, "Emails").

INPUT STREAM csv FROM VALUE(cInputFile).

IMPORT STREAM csv DELIMITER "," ^ NO-ERROR.   /* skip the header line */

REPEAT TRANSACTION:
    CREATE ttEmail.
    IMPORT STREAM csv DELIMITER ","
        ttEmail.uniqueid
        ttEmail.emailTypeDescription
        ttEmail.emailAddr
        .
    /* classify the email type sort order */
    ttEmail.emailTypeSortOrder = lookup( ttEmail.emailTypeDescription, emailTypeSortOrderList ).
    if ttEmail.emailTypeSortOrder <= 0 then ttEmail.emailTypeSortOrder = 9999999.
END.

INPUT STREAM csv CLOSE.
INPUT STREAM csv CLOSE.
And now you can sort and de-duplicate using the newly ordered field:
for each ttEmail break by ttEmail.emailAddr by ttEmail.emailTypeSortOrder:
    if first-of( ttEmail.emailAddr ) then
        next.               /* always keep the first one */
    else
        delete ttEmail.     /* remove unwanted duplicates... */
end.
I am trying to import a .csv file to match the records in the database. However, the database records have leading zeros. This is a character field, and the amount of data is fairly large.
Here the length of the field in the database is x(15).
The problem I am facing is that the .csv file contains data like, for example, AB123456789, whereas the database field has "00000AB123456789".
I am importing the .csv into a character variable.
Could someone please let me know how I can add the leading zeros so a Progress query will match?
Thank you.
You need to FILL() the input string with "0" in order to pad it to a specific length. You can do that with code similar to this:
define variable inputText as character no-undo format "x(15)".
define variable n         as integer   no-undo.

input from "input.csv".

repeat:
    import inputText.
    n = 15 - length( inputText ).
    if n > 0 then
        inputText = fill( "0", n ) + inputText.
    display inputText.
end.

input close.
Substitute your actual field name for inputText and use whatever mechanism you are actually using for importing the CSV data.
FYI - the "length of the field in the database" is NOT "x(15)". That is a display formatting string. The data dictionary has a default format string that was created when the schema was defined, but it has absolutely no impact on what is actually stored in the database. ALL Progress data is stored as variable length. It is not padded to fit the display format and, in fact, it can be "overstuffed" - it is very, very common for applications to do so. This is a source of great frustration to SQL reporting tools that think the display format is some sort of length limit. It is not.
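A minimal sketch illustrating the point (using a variable for brevity; the same applies to database fields):

define variable c as character no-undo format "x(5)".

c = "ABCDEFGHIJ".           /* ten characters "overstuffed" into an x(5) format */

display length(c).          /* 10 - the full value is stored                    */
display c.                  /* shows only "ABCDE" - the format truncates the    */
                            /* display, not the stored data                     */
display c format "x(10)".   /* the whole value is still there                   */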
Hi Progress OpenEdge dev,
I am using the following syntax to generate an XML file from temp table. All is good but for one item.
dataset dsCust:write-xml("FILE", "c:/Test/Customer.xml", true).
This is my temp table declaration
def temp-table ttCustomer no-undo
    namespace-uri    "http://WMS.URI"
    namespace-prefix "ns0"
    field PurchaseOrderNumber as char
    field Plant               as char.
This is my output
<ns0:GoodsReceipt xmlns:ns0="http://WMS.URI">
<ns0:PurchaseOrderNumber/>
<ns0:Plant>Rose</ns0:Plant>
</ns0:GoodsReceipt>
But this is my desired output
<ns0:GoodsReceipt xmlns:ns0="http://WMS.URI">
<PurchaseOrderNumber/>
<Plant>Rose</Plant>
</ns0:GoodsReceipt>
Notice the elements inside the GoodsReceipt node do not have the ns0 prefix.
Can this be achieved using WRITE-XML? I want to avoid using DOM or SAX if possible.
Thank you
You can always manually set attributes and tag-names using XML-NODE-TYPE and SERIALIZE-NAME.
However: I've worked with lots of XML and APIs together with Progress OpenEdge and have yet to fail because of namespace problems, but I guess it might depend on what you want to do with the data.
Since you don't include the entire dataset, this is something of a guess. It produces more or less what you want for this specific case. I don't know how multiple "receipts" should be rendered, though, so you might need to change this.
DEFINE TEMP-TABLE ttCustomer NO-UNDO SERIALIZE-NAME "ns0:GoodsReceipt"
    FIELD xmlns               AS CHARACTER SERIALIZE-NAME "xmlns:ns0" INITIAL "http://WMS.URI" XML-NODE-TYPE "ATTRIBUTE"
    FIELD PurchaseOrderNumber AS CHARACTER
    FIELD Plant               AS CHARACTER.

DEFINE DATASET dsCust SERIALIZE-HIDDEN
    FOR ttCustomer.

CREATE ttCustomer.
ASSIGN Plant = "Rose".

DATASET dsCust:WRITE-XML("FILE", "c:/temp/Customer.xml", TRUE).
From a quick Google on the subject, it seems the W3C suggests that the namespace prefix should be presented the way OpenEdge does it: https://www.w3schools.com/xml/xml_namespaces.asp
And I'm pretty certain you can't change the behaviour with write-xml like you want to either. The documentation doesn't mention any way of overriding the behaviour. https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/dvxml/namespace-uri-and-namespace-prefix.html
Good day.
I have a previous question in this link. On the exported CSV, I put the TABLE NAME on the first line. I wish to import this CSV into my system.
My current code is this:
DEF VAR ic AS INT.
DEF VAR cTable AS CHAR.

INPUT FROM VALUE(SESSION:TEMP-DIRECTORY + "temp.csv").

ic = 0.

REPEAT:
    ic = ic + 1.
    IF ic > 1 THEN DO:
        CREATE cTable.
        IMPORT DELIMITER "," cTable.
    END.
    IMPORT cTable.
END.

INPUT CLOSE.
I know that the code is wrong in the CREATE part. How do I do this?
Also, when I EXPORT, there is an additional BLANK line after the last record. How do I remove this without opening the CSV file?
Removing the empty line at the end of your file doesn't fix your problem; worse, you will not be able to read the last valid CSV line with the IMPORT statement if you do that (it needs an end-of-line after the last record to work properly).
The actual problem is that you get an empty row in your table because the IMPORT DELIMITER "," cTable. fails when the REPEAT block reaches the end of the file: to leave the loop, it raises the ENDKEY condition. But since you call CREATE cTable. before the loop is left, you end up with an empty record. I hope this explanation helps you understand how the REPEAT loop works; if you don't know this, it looks just like an endless loop without any break condition.
Anyway, to fix that problem you can either delete the empty row afterwards (like you did before), which is perfectly valid, or you can omit the NO-UNDO from the temp-table definition, because then the REPEAT block will UNDO the CREATE by default.
To your other question about the CSV header line: you have to read the line somehow; I don't think there is a statement to just skip it and start reading at the 2nd line of the file.
If you need the header names, you can simply define a char variable for every column and import it like:
IMPORT DELIMITER "," cColumn1 cColumn2. /* for every column */
or, if you just want to read and ignore it, you can use
IMPORT UNFORMATTED cTemp.
with a temp variable that reads the whole line.
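Putting both points together, a minimal sketch (tt-data and its field are made-up names for illustration):

/* deliberately *not* NO-UNDO: when IMPORT hits end of file it raises
   ENDKEY, and the REPEAT block then undoes the final, empty CREATE */
DEFINE TEMP-TABLE tt-data
    FIELD cValue AS CHARACTER.

DEFINE VARIABLE cHeader AS CHARACTER NO-UNDO.

INPUT FROM VALUE(SESSION:TEMP-DIRECTORY + "temp.csv").

IMPORT UNFORMATTED cHeader.   /* read and discard the table-name line */

REPEAT:
    CREATE tt-data.
    IMPORT DELIMITER "," tt-data.
END.

INPUT CLOSE.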
If you're trying to import an unknown file format, you might try reading in the entire line of data and then picking through the individual fields. As for question 2, it's better to handle the incorrect data programmatically rather than trying to change the file itself.
This code reads in each line from the file, skipping any that are blank or null. Then it goes through each comma-separated field in the line and displays them.
DEFINE VARIABLE cTable AS CHARACTER NO-UNDO.
DEFINE VARIABLE cField AS CHARACTER NO-UNDO.
DEFINE VARIABLE iLoop  AS INTEGER   NO-UNDO.

INPUT FROM VALUE(SESSION:TEMP-DIRECTORY + "temp.csv").

REPEAT ON ERROR UNDO, NEXT:
    IMPORT UNFORMATTED cTable.                    /* Read an entire line from the file. */
    IF cTable = "" OR cTable = ? THEN NEXT.       /* Skip blank lines. */
    DO iLoop = 1 TO NUM-ENTRIES(cTable, ","):     /* Break up the line by the comma delimiter. */
        cField = ENTRY(iLoop, cTable).
        MESSAGE "Field " + STRING(iLoop) + ": " + cField VIEW-AS ALERT-BOX.
    END.
END.

INPUT CLOSE.
Since the file layout is unknown, all of the fields are read as character. You'll need to add some logic to determine if the values are integers, decimals, dates, etc.
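For instance, a rough sketch of such logic using NO-ERROR and ERROR-STATUS (the classification order here is an assumption; adjust it to your data):

DEFINE VARIABLE iValue  AS INTEGER NO-UNDO.
DEFINE VARIABLE dtValue AS DATE    NO-UNDO.

iValue = INTEGER(cField) NO-ERROR.
IF NOT ERROR-STATUS:ERROR THEN
    MESSAGE "Integer:" iValue VIEW-AS ALERT-BOX.
ELSE DO:
    dtValue = DATE(cField) NO-ERROR.
    IF NOT ERROR-STATUS:ERROR THEN
        MESSAGE "Date:" dtValue VIEW-AS ALERT-BOX.
    ELSE
        MESSAGE "Character:" cField VIEW-AS ALERT-BOX.
END.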
I need to create data files from our OpenEdge database for import into another system. The files are required to have no formatting or delimiters but to use fixed-length fields. For example, the importing system is looking for something like what is below, where each field always starts at the same place in the file, regardless of the length of the data, or even whether there is data in that field at all.
For example, if I export a temp-table called "contact" that contains first-name, last-name, city, state, spouse-name, and favorite-color, I need the first name to always start at position 1, the last name to always start at position 16, the city to always start at position 31, and so on. It should look like:
Jane           Doe            Acme        NY John           Blue
Joe            Sixpack        Springfield IL                Grey
I can use PUT UNFORMATTED to get rid of the double quotes around CHAR fields and export without delimiters, but I have not been able to force each field to start at the exact same position each time. It comes out looking like:
JaneDoeAcmeNYJohnBlue
JoeSixpackSpringfieldILGrey
Is there a way to do this?
What I've been doing is:
DEF TEMP-TABLE contact
    FIELD first-name AS CHAR FORMAT "x(15)"
    FIELD last-name  AS CHAR FORMAT "x(15)"
    FIELD city       AS CHAR FORMAT "x(12)"
    ......

FOR EACH contact:
    PUT UNFORMATTED first-name last-name city....
END.
You are so close. Just remove UNFORMATTED and you're there...
UNFORMATTED does just that - it tells the PUT statement to ignore the format.
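A minimal sketch of the corrected loop - plain PUT honours each field's FORMAT, padding character values to their full width (the SKIP is added here so each record ends with a newline):

FOR EACH contact:
    PUT first-name last-name city SKIP.   /* ...remaining fields as needed */
END.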