UPDATE variable and OUTPUT TO filepath - openedge

I'm trying to ask a user to enter a file path and then output the result to that file path:
DEFINE VARIABLE outputPath AS CHARACTER FORMAT "x(50)".
UPDATE outputPath.
OUTPUT TO outputPath.
This doesn't seem to be working. But when I hard-code the path, for example:
OUTPUT TO "C:\temp\test.txt".
It seems to work.

To use the value of a variable in an OUTPUT statement:
OUTPUT TO VALUE( outputPath ).
VALUE is also used with INPUT FROM, INPUT THROUGH and INPUT-OUTPUT THROUGH.
(A "naked" variable name will be treated as a file name, no quotes needed -- a result of one of those "makes a good demo" decisions 30 years ago...)


Add leading zeros to a character variable in progress 4gl

I am trying to import a .csv file to match the records in the database. However, the database records have leading zeros. This is a character field, and the amount of data is fairly large.
Here the length of the field in the database is x(15).
The problem I am facing is that the .csv file contains data like, for example, AB123456789, whereas the database field has "00000AB123456789".
I am importing the .csv to a character variable.
Could someone please let me know what I should do to get the leading zeros using a Progress query?
Thank you.
You need to FILL() the input string with "0" in order to pad it to a specific length. You can do that with code similar to this:
define variable inputText as character no-undo format "x(15)".
define variable n as integer no-undo.

input from "input.csv".
repeat:
    import inputText.
    /* how many zeros are needed to reach 15 characters? */
    n = 15 - length( inputText ).
    if n > 0 then
        inputText = fill( "0", n ) + inputText.
    display inputText.
end.
input close.
Substitute your actual field name for inputText and use whatever mechanism you are actually using for importing the CSV data.
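If the padded value is then used to match records against the database, a sketch of the lookup might look like this (the item table and itemCode field are hypothetical; substitute your own):
define variable paddedText as character no-undo.

/* pad to 15 characters with leading zeros; maximum() guards against
   input that is already 15 characters or longer */
paddedText = fill( "0", maximum( 0, 15 - length( inputText ) ) ) + inputText.

find first item no-lock where item.itemCode = paddedText no-error.
if available item then
    display item.itemCode.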
FYI - the "length of the field in the database" is NOT "x(15)". That is a display formatting string. The data dictionary has a default format string that was created when the schema was defined but it has absolutely no impact on what is actually stored in the database. ALL Progress data is stored as variable length length. It is not padded to fit the display format and, in fact, it can be "overstuffed" and it is very, very common for applications to do so. This is a source of great frustration to SQL reporting tools that think the display format is some sort of length limit. It is not.

Comparing the MD5 sum of a string to the contents of a file

I am trying to compare a string (in memory) to the contents of a file to see if they are the same. Boring details on motivation are below the question if anyone cares.
My confusion is that when I hash file contents, I get a different result than when I hash the string.
library(readr)
library(digest)
# write the string to the file
the_string <- "here is some stuff"
the_file <- "fake.txt"
readr::write_lines(the_string, the_file)
# both of these functions (predictably) give the same hash
tools::md5sum(the_file)
# "44b0350ee9f822d10f2f9ca7dbe54398"
digest(file = the_file)
# "44b0350ee9f822d10f2f9ca7dbe54398"
# now read it back to a string and get something different
back_to_a_string <- readr::read_file(the_file)
# "here is some stuff\n"
digest(back_to_a_string)
# "03ed1c8a2b997277100399bef6f88939"
# add a newline because that's what write_lines did
orig_with_newline <- paste0(the_string, "\n")
# "here is some stuff\n"
digest(orig_with_newline)
# "03ed1c8a2b997277100399bef6f88939"
What I want to do is just digest(orig_with_newline) == digest(file = the_file) to see if they're the same (they are) but that returns FALSE because, as shown, the hashes are different.
Obviously I could either read the file back to a string with read_file or write the string to a temp file, but both of those seem a bit silly and hacky. I guess both of those are actually fine solutions, I really just want to understand why this is happening so that I can better understand how the hashing works.
Boring details on motivation
The situation is that I have a function that will write a string to a file, but if the file already exists it will error unless the user has explicitly passed .overwrite = TRUE. However, if the file exists, I would like to check whether the string about to be written is in fact the same thing that's already in the file. If so, I will skip the error (and the write). This code could be called in a loop, and it would be obnoxious for the user to continually see an error that they are about to overwrite a file with the same thing that's already in it.
Short answer: I think you need to set serialize=FALSE. By default, digest() hashes the result of serialize()-ing the R object rather than the raw characters of the string, which is why the two hashes differ. Supposing that the file doesn't contain the extra newline (see below),
digest(the_string,serialize=FALSE) == digest(file=the_file) ## TRUE
(serialize has no effect on the file= version of the command)
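Combined with the newline that write_lines appends (see below), the comparison from the question then works as intended:
digest(orig_with_newline, serialize = FALSE) == digest(file = the_file)
## TRUE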
dealing with newlines
If you read ?write_lines, it only says
sep: The line separator ... [information about defaults for different OSes]
To me, this seems ambiguous as to whether the separator will be added after the last line or not. (You don't expect a "comma-separated list" to end with a comma ...)
On the other hand, ?base::writeLines is a little more explicit,
sep: character string. A string to be written to the connection
after each line of text.
If you dig down into the source code of readr you can see that it uses
output << na << sep;
for each line, i.e. it's behaving the same way as writeLines: the separator is written after every line, including the last.
If you really just want to write the string to the file with no added nonsense, I suggest cat():
identical(the_string, { cat(the_string,file=the_file); readr::read_file(the_file) }) ## TRUE
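Applied to the motivating use case, a minimal sketch of the overwrite check might look like this (the function name write_if_changed and its arguments are made up for illustration):
library(digest)

write_if_changed <- function(string, path, .overwrite = FALSE) {
  if (file.exists(path) && !.overwrite) {
    # silently skip the write when the file already holds the same bytes
    if (digest(string, serialize = FALSE) == digest(file = path)) {
      return(invisible(FALSE))
    }
    stop("file exists; pass .overwrite = TRUE to replace it")
  }
  cat(string, file = path)  # no added newline, unlike write_lines()
  invisible(TRUE)
}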

U-SQL get filename of input and use for output

I have a filename of test.csv and I want the output to be test.txt.
I can extract the filename of the input but don't know how to use it for the output.
OUTPUT @result TO "/output/{filename}.txt"
USING Outputters.Text(outputHeader:false, quoting:false);
The filename is in @result.
This feature isn't supported as of yet.
Does anyone have a work around?
U-SQL How can I get the current filename being processed to add to my extract output?
Ideally I would like dd-mm-yy-test.txt.
How do I append the day month and year?
I am using USQL for this.
Thanks
Let me address both issues you're laying out in this question:
To use the same output name as the input, there would have to be a way to read rowset values into U-SQL variables, which I'm pretty sure cannot be done, given that the language is built around processing many files at once.
To append a date to the output file name, you only need to declare the current datetime at some point and then use it when writing the output file name, like this:
DECLARE @now DateTime = DateTime.Now;

OUTPUT @output
TO "/tests/output/" + @now.ToString("dd-MM-yyyy") + "-output.csv"
USING Outputters.Csv();
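Equivalently, the path can be assembled in a declared string variable first (the name @outfile is mine):
DECLARE @now DateTime = DateTime.Now;
DECLARE @outfile string = "/tests/output/" + @now.ToString("dd-MM-yyyy") + "-output.csv";

OUTPUT @output
TO @outfile
USING Outputters.Csv();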

U-SQL How can I get the current filename being processed to add to my extract output?

I need to add meta data about the Row being processed. I need the filename to be added as a column. I looked at the ambulance demos in the Git repo, but can't figure out how to implement this.
You use two U-SQL features called 'file sets' and 'virtual columns'. In my simple example, I have two files in my input directory; I use a file set and refer to the virtual columns in the EXTRACT statement, e.g.
// File sets, with virtual columns {filename} and {extension}
@q =
    EXTRACT rowId int,
            filename string,
            extension string
    FROM "/input/filesets example/{filename}.{extension}"
    USING Extractors.Tsv();

@output =
    SELECT filename,
           extension,
           COUNT( * ) AS records
    FROM @q
    GROUP BY filename,
             extension;

OUTPUT @output
TO "/output/output.csv"
USING Outputters.Csv();
My results: one row per input file, with its filename, extension and record count.
Read more about both features here:
https://msdn.microsoft.com/en-us/library/azure/mt621320.aspx
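If the goal is to carry the file name on every row rather than aggregate, the virtual columns can simply be selected through; a sketch based on the same example (the output path is mine):
@rows =
    SELECT rowId,
           filename,
           extension
    FROM @q;

OUTPUT @rows
TO "/output/rows_with_filename.csv"
USING Outputters.Csv();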

How to convert IBM file to hexadecimal using DFSORT?

I'm trying to convert an IBM file to hex values.
With this input:
H800
Would save this output in a file:
48383030
I tried by this way:
//R45ORF80V JOB (EFAS,2SGJ000),'LLAMI',NOTIFY=R45ORF80,
// MSGLEVEL=(1,1),MSGCLASS=X,CLASS=A,
// REGION=0M,TIME=5
//*---------------------------------------------------
//SORTEST EXEC PGM=ICEMAN
//SORTIN DD DSN=LF58.DFE.V1408001,DISP=SHR
//SORTOUT DD DSN=LF58.DFE.V1408001.OUT,
// DISP=(NEW,CATLG,DELETE),
// LRECL=4,DATACLAS=CDMULTI
//SYSOUT DD SYSOUT=X
//SYSPRINT DD SYSOUT=X
//SYSUDUMP DD SYSOUT=X
//SYSIN DD *
SORT FIELDS=COPY
OUTREC FIELDS=(1,4,HEX)
END
/*
But it outputs the following:
C8F1F0F0
What am I doing wrong?
Is it possible to also convert to hexadecimal a file with an LRECL of 500 that contains COMP-3 fields?
Just by the way, I can use the "HEX" command while I browse a file using File Manager.
Your control cards are giving you the output you have asked for: the hexadecimal values of those characters in EBCDIC, not the ASCII hexadecimal values you are expecting.
If you actually want to see the ASCII equivalent, use TRAN=ETOA, then TRAN=HEX.
You are using OUTREC FIELDS. FIELDS has a newer synonym, BUILD (introduced exactly 10 years ago); FIELDS is supported for backwards compatibility.
INREC and OUTREC are similar, INREC operates before a SORT or MERGE, OUTREC afterwards.
What I recommend, unless you need to be doing it after a SORT/MERGE, is to use INREC.
So:
INREC BUILD=(1,4,TRAN=ETOA)
But, there is no need to use BUILD. BUILD always creates a new version of the record. Many times this is what you want when you are rearranging fields. Here, you are not.
INREC OVERLAY=(1,4,TRAN=ETOA)
If you replace your OUTREC with that, your output file will be encoded in ASCII.
If you want to see the ASCII as well:
INREC OVERLAY=(1,4,TRAN=ETOA,1,4,TRAN=HEX)
If you want to see the ASCII instead:
INREC OVERLAY=(1,4,TRAN=ETOA,1:1,4,TRAN=HEX)
Note the 1: in the last example. This says "the results are going to be at position 1", so overwriting your previous converted data. OVERLAY can do that, BUILD cannot in one statement.
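Putting it together for the "ASCII instead" case, the SYSIN becomes the sketch below. Note that TRAN=HEX doubles the length of the field, so the four input bytes become eight output bytes and the SORTOUT DD would need LRECL=8 rather than LRECL=4:
//SYSIN DD *
  SORT FIELDS=COPY
  INREC OVERLAY=(1,4,TRAN=ETOA,1:1,4,TRAN=HEX)
/*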
