Whitespace on the right side doesn't matter when comparing - OpenEdge

DEFINE VARIABLE a AS CHARACTER NO-UNDO.
DEFINE VARIABLE b AS CHARACTER NO-UNDO.
a = "123".
b = "123 ".
MESSAGE a = b
VIEW-AS ALERT-BOX.
MESSAGE LENGTH(a) = LENGTH(b)
VIEW-AS ALERT-BOX.
Does anyone know why the first comparison returns true?
Is whitespace on the right side ignored? Whitespace on the left causes the comparison to be false, and it doesn't matter how many whitespace characters there are on the right side.
Thank you all

https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/dvref/eq-or-=-operator.html
The equal comparison ignores trailing blanks. Thus, "abc" is equal to "abc ". However, leading and embedded blanks are treated as characters and " abc" is not equal to "abc".

Well, that's just the way ABL is implemented.
If trailing blanks should count, you can use the COMPARE function:
MESSAGE a = b SKIP
COMPARE (a, "EQ", b, "RAW")
VIEW-AS ALERT-BOX.

Related

What is the meaning of quote = "" in count.fields function in R?

I don't understand the meaning of quote="" or quote="\"'" in the count.fields function. Can someone please explain the use of the quote argument and the difference between these two values?
Consider the text file
one two
'three four'
"file six"
seven "eight nine"
which we can create with
lines <- c(
"one two",
"'three four'",
"\"file six\"",
"seven \"eight nine\"")
writeLines(lines, "test.txt")
The quote= parameter tells R which characters can start and end quoted values within the file. We can ignore quotes altogether by setting quote="". Doing that, we see
count.fields("test.txt", quote="")
# [1] 2 2 2 3
so it interprets every space as starting a new field, and each word is its own field. This can be useful if your fields contain quote characters used for something other than delimiting strings, such as last names like o'Brian and measurements like 5'6". If we instead say that only double quotes delimit string values, we get
count.fields("test.txt", quote="\"")
# [1] 2 2 1 2
So here the first two lines are the same, but line 3 is considered to have just one value: the space between the quotes does not start a new field.
The default is to treat both double quotes and single quotes as quote characters, which gives
count.fields("test.txt")
# [1] 2 1 1 2
So now the second line, like the third, is treated as having just one value.
cat is often a good way to show what you are dealing with when you have quotes inside quotes.
> cat("Nothing:", "", "\n")
Nothing:
> cat("Something:", "\"'", "\n")
Something: "'
The first example, quote="", specifies that no characters in the file are treated as quotes.
The second example, quote="\"'", specifies " or ' as potential quote characters.
The \ backslash is used to 'escape' the following character so \" is treated literally as " instead of closing off the argument to quote= prematurely.
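As a minimal sketch of the o'Brian case mentioned above: with quote="", apostrophes and inch marks are treated as ordinary characters, so each whitespace-separated word counts as one field. The file contents and the temporary file are made up for illustration.

```r
# Hypothetical data: an apostrophe in a name and a 5'6" measurement.
lines <- c("Pat o'Brian", "height 5'6\"")
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# quote = "" disables quote handling, so only whitespace separates fields.
n_fields <- count.fields(path, quote = "")
print(n_fields)

unlink(path)
```

Both lines should come back with two fields each, because nothing in them is read as an opening quote.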

Extracting string between two "-" in r

I am trying to understand how to extract the string between two hyphens.
For example,
node->testtransport-fasttrack-direct
I want to extract the string fasttrack, and the extraction shouldn't depend on the position of the strings, as they might change.
I want code that extracts the string between two hyphens.
Thank you in advance.
Here are some approaches. No packages are used.
1) Here we assume that the part between the two minus signs must be all upper case letters so >DHLPAKET is excluded because even though it is between two minus signs it has a character which is not an upper case letter. Match the start (^) and then anything (.*) followed by minus (-) followed by an upper case string which is captured ([A-Z]+) and another minus (-) and everything else and finally the end of string ($). Replace all that with the captured portion (\1)
x <- "WRO2->DHLPAKET-ASCHHEIM-DI"
sub("^.*-([A-Z]+)-.*$", "\\1", x)
## [1] "ASCHHEIM"
2) If the two minus signs surrounding the string of interest are always the second and third minus signs then this would work. It uses read.table picking off the third minus-separated field.
read.table(text = x, sep = "-", as.is = TRUE)$V3
## [1] "ASCHHEIM"
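Applied to the string from the question, the same two ideas can be sketched as follows. This assumes the wanted part consists only of lower-case letters. Note that the hyphen inside -> also counts as a separator, which is why fasttrack is the third hyphen-separated piece rather than the second.

```r
x <- "node->testtransport-fasttrack-direct"

# Regex approach: capture a run of lower-case letters enclosed by hyphens.
by_regex <- sub("^.*-([a-z]+)-.*$", "\\1", x)

# Split approach: break on literal hyphens and take the third piece.
pieces <- strsplit(x, "-", fixed = TRUE)[[1]]
by_split <- pieces[3]

print(by_regex)  # "fasttrack"
print(by_split)  # "fasttrack"
```

The regex version does not depend on how many hyphens precede the match, while the split version relies on the field position staying fixed.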

Replacing the second occurrence of a string with a different string

I have a very simple issue: replacing the second occurrence of a string with a new string.
Let's say we have this string
string <- c("A12A32")
and we want to replace the second A with B. A12B32 is the expected output.
Following this relevant post,
How to replace second or more occurrences of a dot from a column name
I tried
replace_second_A <- sub("(\\A)\\A","\\1B", string)
print(replace_second_A)
[1] "A12A32"
The second A does not seem to change. Why?
Note that .*? matches the shortest string until the next A:
string <- "A12A32"
sub("(A.*?)A", "\\1B", string)
## [1] "A12B32"
First, there is no need to escape the letter A using backslashes. They are only required to escape special characters that have other meanings e.g. "." means "any character", "\\." means "period".
Second, your regular expression "(\\A)\\A" reads "match A followed by another A, keeping the first A for reuse." You don't have two consecutive "A", they are separated by digits.
So this works ("\\d+" means "match 1 or more digits"):
sub("(A\\d+)A","\\1B", "A12A32")
[1] "A12B32"
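If the text between the occurrences is not known in advance, one possible generalization is to locate every occurrence and splice the replacement in at the n-th one. The helper name replace_nth below is hypothetical (it is not a base R function), and the sketch assumes a literal, non-regex search string.

```r
# Hypothetical helper: replace the n-th literal occurrence of `pattern` in `x`.
replace_nth <- function(x, pattern, replacement, n) {
  m <- gregexpr(pattern, x, fixed = TRUE)[[1]]
  # gregexpr returns -1 when there is no match at all.
  if (m[1] == -1 || length(m) < n) return(x)
  start <- m[n]
  len <- attr(m, "match.length")[n]
  paste0(substr(x, 1, start - 1),
         replacement,
         substr(x, start + len, nchar(x)))
}

result <- replace_nth("A12A32", "A", "B", 2)
print(result)  # "A12B32"
```

When the string has fewer than n occurrences, the input is returned unchanged.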

Removing control characters in Progress 4GL

I want to remove all control characters from a given string. I don't want to use the REPLACE function because it takes multiple iterations.
Please help.
Thanks in advance.
You may not like it, but REPLACE is the simplest way to do it. I've used this code to strip non-printable characters from a string. This will replace the control characters with a space:
DEFINE VARIABLE str AS CHARACTER NO-UNDO.
DEFINE VARIABLE iLoop AS INTEGER NO-UNDO.
DO iLoop = 1 TO 31:
  str = REPLACE(str, CHR(iLoop), " ").
END.
Since there are multiple control characters that have to be removed, it seems that any solution will involve multiple iterations.
Depending on what you define as a control character and what character set you are using this might do the trick. Or at least point you in a helpful direction:
define variable i as integer no-undo.
define variable n as integer no-undo.
define variable c as character no-undo.
define variable s as character no-undo.
define variable x as character no-undo.
s = "something with control characters in it".
x = "".
n = length( s ).
/* keep only printable ASCII characters (codes 32 through 126) */
do i = 1 to n:
  c = substring( s, i, 1 ).
  if asc( c ) >= 32 and asc( c ) < 127 then
    x = x + c.
end.

R grepl variable comparison

Just need some help with grepl, it's doing my head in!
I have two variables:
str1<-"AAV.L"
str2<-"AAV2.L"
What I want to do is check whether str2 is an extension of str1 (which it is in this case). Basically, here str2 has an extra "2" in its name.
Ideally the solution is something like:
grepl(str1,paste0(str2,...))
But I have no idea how to account for the . in str1. The lengths of the variables aren't the same either, so I can't just check whether the first 3 characters of str1 are present in str2.
Anyone have any ideas?
Thanks!
EDIT - Clarification..
Basically, by "extension of" I mean that one variable contains exactly the same letters as the other, and more, in the same order. So in the above example, AAV.L and AAV2.L would match because it contains AAV..L. It doesn't have to be like this, however: it should match REWR with REWRLE as well, meaning REWR...
So c("AAV.LE", "BAAV.L","AABV.L","AAV..L","ABCAV.L"), none would match. If I were to put a rule for the match into plain English it would be:
Does str2 start with str1 OR does str2 start with any subset of str1 and end with the other subset?
I've taken a look at agrep, but it matches too loosely. For example, AAV.L and AAV2.L match, which is good, but then ADD and APUAD match as well, which is incorrect! I know I can specify max.distance, but some strings could be ADD and ADDDDDDDDD, which would make setting this value impractical.
Let me know if this helps.
You could drop the dot extension before passing the strings to grepl.
str1 <- sub("\\.[[:alnum:]]+$", "", str1)
## AAV
str2 <- sub("\\.[[:alnum:]]+$", "", str2)
## AAV2
Note: This is a method for removing file extensions. It won't remove any other occurrences of the dot character. It works by matching a period followed by nothing but alphanumeric characters up to the end of the string, and replacing that match with an empty ("") string.
str3 <- "A.A.V.L"
str3 <- sub("\\.[[:alnum:]]+$", "", str3)
## A.A.V
Then, using grepl
grepl(str1, str2)
## TRUE
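Two details of the grepl call may be worth checking, sketched here under the assumption that "extension of" means "starts with": a dot in the pattern is a regex wildcard unless fixed = TRUE is given, and grepl matches anywhere in the string, whereas startsWith (available since R 3.3.0) only checks the beginning. The extra test strings are made up for illustration.

```r
str1 <- "AAV.L"
str2 <- "AAV2.L"

# As a regex, "." matches any single character; fixed = TRUE makes it literal.
dot_as_regex   <- grepl(str1, "AAVXL")                # TRUE
dot_as_literal <- grepl(str1, "AAVXL", fixed = TRUE)  # FALSE

# Strip the extension as in the answer, then compare the stems.
base1 <- sub("\\.[[:alnum:]]+$", "", str1)  # "AAV"
base2 <- sub("\\.[[:alnum:]]+$", "", str2)  # "AAV2"

prefix_match  <- startsWith(base2, base1)    # TRUE
anywhere      <- grepl(base1, "BAAV2")       # TRUE: matches in the middle
prefix_strict <- startsWith("BAAV2", base1)  # FALSE: not at the start
```

Whether a match in the middle of the string should count depends on the rule you settle on; startsWith is the stricter of the two checks.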
