How can I manipulate a string and eliminate the character * or #? - r

In R, how can I manipulate a string and eliminate the character * or #? For example, for "ALL8606#057R0" I tried RFC_corr[5] = str_split(RFC[5],split= "#",fixed=true)

As tospig suggested:
> sub("#", "", "ALL8606#057R0")
[1] "ALL8606057R0"
Edit for your comment below: to apply this to a vector you don't need a loop; you can just use the vector of interest when calling the function:
> x <- c("vect#or", "th-at#", "ha%s", "weir*d", "stu+ff")
> gsub("[-+%*#]", "", x)
[1] "vector" "that" "has" "weird" "stuff"

The simplest way to do this is to loop over every character of the string and copy it into a new array, skipping "#" and "*" as you go. The copy then contains the string without those characters.
int i = 0;
string userstr = "ALL8606#057R0";
char[] copystr = new char[userstr.Length];
foreach (char s in userstr)
{
    // Copy the character only if it is neither '#' nor '*'
    if (s != '#' && s != '*')
    {
        copystr[i] = s;
        i++;
    }
}
// Build the result from the i characters that were kept
string result = new string(copystr, 0, i);
I hope this code helps you solve the problem. If you get an error on userstr.Length, try a hard-coded size instead.
Bye and Happy Coding.

Related

Extract all substrings in string

I want to extract all substrings that begin with M and are terminated by a *
The string below as an example;
vec<-c("SHVANSGYMGMTPRLGLESLLE*A*MIRVASQ")
Would ideally return;
MGMTPRLGLESLLE
MTPRLGLESLLE
I have tried the code below;
regmatches(vec, gregexpr('(?<=M).*?(?=\\*)', vec, perl=T))[[1]]
but this drops the first M and only returns the first string rather than all substrings within.
"GMTPRLGLESLLE"
You can use
(?=(M[^*]*)\*)
See the regex demo. Details:
(?= - start of a positive lookahead that matches a location that is immediately followed with:
(M[^*]*) - Group 1: M, zero or more chars other than a * char
\* - a * char
) - end of the lookahead.
See the R demo:
library(stringr)
vec <- c("SHVANSGYMGMTPRLGLESLLE*A*MIRVASQ")
matches <- stringr::str_match_all(vec, "(?=(M[^*]*)\\*)")
unlist(lapply(matches, function(z) z[,2]))
## => [1] "MGMTPRLGLESLLE" "MTPRLGLESLLE"
If you prefer a base R solution:
vec <- c("SHVANSGYMGMTPRLGLESLLE*A*MIRVASQ")
matches <- regmatches(vec, gregexec("(?=(M[^*]*)\\*)", vec, perl=TRUE))
unlist(lapply(matches, tail, -1))
## => [1] "MGMTPRLGLESLLE" "MTPRLGLESLLE"
This could be done instead with a for loop over a char array converted from your string.
If you encounter a M you start concatenating chars to a new string until you encounter a *, when you do encounter a * you push the new string to an array of strings and start over from the first step until you reach the end of your loop.
It's not quite as interesting as using REGEX to do it, but it's failsafe.
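The loop approach described above can be sketched in Python as follows. This is a minimal sketch, not the answerer's code: the function name `substrings_from_m_to_star` and its parameters are made up for illustration, and it restarts the scan at every M (rather than only after a completed match) so that the nested substring from the question is also returned.

```python
def substrings_from_m_to_star(text, start="M", stop="*"):
    """Collect every substring that begins at a `start` char and ends just before the next `stop` char."""
    results = []
    for i, ch in enumerate(text):
        if ch != start:
            continue
        end = text.find(stop, i + 1)  # position of the next terminator
        if end == -1:
            break  # no terminator left, so no later start can match either
        results.append(text[i:end])
    return results


vec = "SHVANSGYMGMTPRLGLESLLE*A*MIRVASQ"
print(substrings_from_m_to_star(vec))  # ['MGMTPRLGLESLLE', 'MTPRLGLESLLE']
```

The trailing "MIRVASQ" is not returned because no * follows it.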
A plain regular-expression match cannot return these nested substrings directly, because each match consumes the characters it covers, so a later match cannot start inside an earlier one.
For example, stringr::str_extract_all("abaca", "a[^a]*a") only gives you aba, not the overlapping aca or the surrounding abaca.
The first M was dropped because (?<=M) is a positive lookbehind, which by definition is not part of the match but sits just before it.
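To compare the lookbehind and lookahead behaviour discussed in this thread outside R, here is a minimal sketch using Python's standard `re` module. The two patterns are the ones quoted in the thread; the variable names are only illustrative.

```python
import re

vec = "SHVANSGYMGMTPRLGLESLLE*A*MIRVASQ"

# Lookbehind version from the question: the M sits behind the match, so it is
# dropped, and the consumed text cannot be matched again.
lookbehind = re.findall(r"(?<=M).*?(?=\*)", vec)

# Capturing group inside a lookahead: the match itself is empty, so the regex
# engine can try again at the next position and report overlapping captures.
lookahead = re.findall(r"(?=(M[^*]*)\*)", vec)

print(lookbehind)  # ['GMTPRLGLESLLE']
print(lookahead)   # ['MGMTPRLGLESLLE', 'MTPRLGLESLLE']
```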

Edit distance leetcode

I am working on the Edit Distance problem. Before moving to the DP approach I am trying to solve it recursively, but I am hitting a logical error. Please help.
Here is my code -
class Solution {
    public int minDistance(String word1, String word2) {
        int n=word1.length();
        int m=word2.length();
        if(m<n)
            return Solve(word1,word2,n,m);
        else
            return Solve(word2,word1,m,n);
    }
    private int Solve(String word1,String word2,int n,int m){
        if(n==0||m==0)
            return Math.abs(n-m);
        if(word1.charAt(n-1)==word2.charAt(m-1))
            return 0+Solve(word1,word2,n-1,m-1);
        else{
            //insert
            int insert = 1+Solve(word1,word2,n-1,m);
            //replace
            int replace = 1+Solve(word1,word2,n-1,m-1);
            //delete
            int delete = 1+Solve(word1,word2,n-1,m);
            int max1 = Math.min(insert,replace);
            return Math.min(max1,delete);
        }
    }
}
Here I check the last character of both strings; if the characters are equal, I simply move both strings to n-1 and m-1 respectively.
Otherwise I have three cases (insertion, deletion and replacement), and I have to take the minimum of the three.
If I replace the character, I simply move to n-1 and m-1.
If I insert a character, my logic says I should insert it at the end of the shorter string and move the pointers to n-1 and m.
To delete, I think I should delete the character from the longer string, which is why I move the pointers to n-1 and m, but I think I am making a mistake here. Please help.
Leetcode is giving me wrong answer for word1 = "plasma" and word2 = "altruism".
The problem is that the recursive expression for the insert-case is the same as for the delete-case.
Reasoning further, it turns out the one for the insert-case is wrong. In that case we choose to resolve the letter in word2 (at index m-1) through insertion, so it should not be considered any more during the recursive process. On the other hand the considered letter in word1 could still be matched with another letter in word2, so that letter should still be considered during the recursive process.
That means that m should be decremented, not n.
So change:
int insert = 1+Solve(word1,word2,n-1,m);
to:
int insert = 1+Solve(word1,word2,n,m-1);
...and it will work. What remains is to add memoization to make it efficient.
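For reference, the corrected recursion with memoization could look roughly like this in Python. It is a sketch that mirrors the index-based Solve(word1, word2, n, m) from the question; the names `min_distance` and `solve` are illustrative, and `functools.lru_cache` stands in for whatever memoization you prefer.

```python
from functools import lru_cache


def min_distance(word1: str, word2: str) -> int:
    @lru_cache(maxsize=None)
    def solve(n: int, m: int) -> int:
        # n and m are the lengths of the prefixes still to be matched
        if n == 0 or m == 0:
            return abs(n - m)
        if word1[n - 1] == word2[m - 1]:
            return solve(n - 1, m - 1)
        insert = 1 + solve(n, m - 1)       # letter of word2 handled, word1 letter kept
        replace = 1 + solve(n - 1, m - 1)  # both letters handled
        delete = 1 + solve(n - 1, m)       # letter of word1 dropped
        return min(insert, replace, delete)

    return solve(len(word1), len(word2))


print(min_distance("plasma", "altruism"))  # 6
```

Caching on the pair (n, m) means each of the n*m subproblems is solved once.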
A clean Python solution based on top-down DP (memoized recursion):
from functools import cache

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        return self.edit_distance(word1, word2)

    @cache  # memoize so each pair of suffixes (s, t) is solved only once
    def edit_distance(self, s, t):
        # Edge conditions
        if len(s) == 0:
            return len(t)
        if len(t) == 0:
            return len(s)
        # If 1st char matches
        if s[0] == t[0]:
            return self.edit_distance(s[1:], t[1:])
        else:
            return min(
                1 + self.edit_distance(s[1:], t),     # delete
                1 + self.edit_distance(s, t[1:]),     # insert
                1 + self.edit_distance(s[1:], t[1:])  # replace
            )

How do you access name of a ProtoField after declaration?

How can I access the name property of a ProtoField after I declare it?
For example, something along the lines of:
myproto = Proto("myproto", "My Proto")
myproto.fields.foo = ProtoField.int8("myproto.foo", "Foo", base.DEC)
print(myproto.fields.foo.name)
Where I get the output:
Foo
An alternate method that's a bit more terse:
local fieldString = tostring(field)
local i, j = string.find(fieldString, ": .* myproto")
print(string.sub(fieldString, i + 2, j - (1 + string.len("myproto"))))
EDIT: Or an even simpler solution that works for any protocol:
local fieldString = tostring(field)
local i, j = string.find(fieldString, ": .* ")
print(string.sub(fieldString, i + 2, j - 1))
Of course the 2nd method only works as long as there are no spaces in the field name. Since that's not necessarily always going to be the case, the 1st method is more robust. Here is the 1st method wrapped up in a function that ought to be usable by any dissector:
-- The field is the field whose name you want to print.
-- The proto is the name of the relevant protocol
function printFieldName(field, protoStr)
    local fieldString = tostring(field)
    local i, j = string.find(fieldString, ": .* " .. protoStr)
    print(string.sub(fieldString, i + 2, j - (1 + string.len(protoStr))))
end
... and here it is in use:
printFieldName(myproto.fields.foo, "myproto")
printFieldName(someproto.fields.bar, "someproto")
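The string-parsing idea can be tried outside Wireshark on a sample tostring() result. The Python sketch below assumes the layout "ProtoField(<id>): <name> <proto>.<field> ..." seen in the sample output quoted in this thread, and that layout may differ between Wireshark versions. The helper name `field_name_from_string` is hypothetical and is not part of the Wireshark API.

```python
import re


def field_name_from_string(field_string, proto):
    """Return the text between ': ' and ' <proto>.' or None if the layout does not match."""
    # re.escape guards against protocol names containing regex metacharacters
    match = re.search(r": (.*) " + re.escape(proto) + r"\.", field_string)
    return match.group(1) if match else None


# Sample string as printed by tostring(myproto.fields.foo) in this thread
sample = "ProtoField(188403): Foo myproto.foo base.DEC 0000000000000000 00000000 (null)"
print(field_name_from_string(sample, "myproto"))  # Foo
```

Anchoring on the protocol name is what lets a field name containing spaces come through intact.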
Ok, this is janky, and certainly not the 'right' way to do it, but it seems to work.
I discovered this after looking at the output of
print(tostring(myproto.fields.foo))
This seems to spit out the value of each of the members of ProtoField, but I couldn't figure out the correct way to access them. So, instead, I decided to parse the string. This function will return 'Foo', but could be adapted to return the other fields as well.
function getname(field)
    -- First, convert the field into a string.
    -- This is going to result in a long string with
    -- a bunch of info we don't need
    local fieldString = tostring(field)
    -- fieldString looks like:
    -- ProtoField(188403): Foo myproto.foo base.DEC 0000000000000000 00000000 (null)
    -- Split the string at the first '.' character
    local a, b = fieldString:match"([^.]*).(.*)"
    -- Split the first half of the previous result (a) on the ':' character
    a, b = a:match"([^.]*):(.*)"
    -- At this point, b will equal " Foo myproto"
    -- and we want to strip out the abbreviation ("abbr") part
    -- Count the number of times spaces occur in the string
    local spaceCount = select(2, string.gsub(b, " ", ""))
    -- Declare a counter
    local counter = 0
    -- Declare the name we are going to return
    local constructedName = ''
    -- Step through each word in (b) separated by spaces
    for word in b:gmatch("%w+") do
        -- If we have reached the last space, go ahead and return
        if counter == spaceCount-1 then
            return constructedName
        end
        -- Add the current word to our name
        constructedName = constructedName .. word .. " "
        -- Increment counter
        counter = counter+1
    end
end

convert `NULL` to string (empty or literally `NULL`)

I receive a list test that may contain or miss a certain name variable.
When I retrieve items by name, e.g. temp = test[[name]], temp is NULL if name is missing. In other cases temp has an invalid value, so I want to throw a warning, something like name value XXX is invalid, where XXX is temp (I use sprintf for that purpose), and assign the default value.
However, I have a hard time converting it to string. Is there one-liner in R to do this?
as.character produces character(0) which turns the whole sprintf argument to character(0).
Workflow typically looks like:
for (name in name_list) {
  temp = test[[name]]
  if (is.null(temp) || is_invalid(temp)) {
    warning(sprintf('%s is invalid parameter value for %s', as.character(temp), name))
    result = assign_default(name)
  } else {
    result = temp
    print(sprintf('parameter %s is OK', name))
  }
}
PS.
is_invalid is a function defined elsewhere. I need a substitute for as.character that would return '' or 'NULL'.
test = list(t1 = "a", t2 = NULL, t3 = "b")
foo = function(x){
  ifelse(is.null(test[[x]]), paste(x, "is not valid"), test[[x]])
}
foo("t1")
#[1] "a"
foo("t2")
#[1] "t2 is not valid"
foo("r")
#[1] "r is not valid"
You can use format() to convert NULL to "NULL".
In your example it would be:
warning(sprintf('%s is invalid parameter value for %s', format(temp), name))
Well, as ultimately my goal was to join two strings, one of which might be empty (null), I realized, I just can use paste(temp, "name is empty or invalid") as my warning string. It doesn't exactly convert NULL to the string, but it's a solution.

Python 3.4 help - using slicing to replace characters in a string

Say I have a string.
"poop"
I want to change "poop" to "peep".
In fact, I also want all of the o's in poop to change to e's for any word I put in.
Here's my attempt to do the above.
def getword():
    x = (input("Please enter a word."))
    return x
def main():
    y = getword()
    for i in range (len(y)):
        if y[i] == "o":
            y = y[:i] + "e"
    print (y)
main()
As you can see, when you run it, it doesn't amount to what I want. Here is my expected output.
Enter a word.
>>> brother
brether
Something like this. I need to do it using slicing. I just don't know how.
Please keep your answer simple, since I'm somewhat new to Python. Thanks!
This uses slicing (but keep in mind that slicing is not the best way to do it):
def f(s):
    for x in range(len(s)):
        if s[x] == 'o':
            s = s[:x]+'e'+s[x+1:]
    return s
Strings in Python are immutable, which means that you can't just swap out letters in an existing string; you need to build a whole new string, for example by concatenating the letters one by one:
def getword():
    x = (input("Please enter a word."))
    return x
def main():
    y = getword()
    output = ''
    for i in range(len(y)):
        if y[i] == "o":
            output = output + 'e'
        else:
            output = output + y[i]
    print(output)
main()
I'll help you this once, but you should know that stack overflow is not a homework help site. You should be figuring these things out on your own to get the full educational experience.
EDIT
Using slicing, I suppose you could do:
def getword():
    x = (input("Please enter a word."))
    return x
def main():
    y = getword()
    output = '' # String variable to hold the output string. Starts empty
    slice_start = 0 # Keeps track of what we have already added to the output. Starts at 0
    for i in range(len(y) - 1): # Scan through all but the last character
        if y[i] == "o": # If character is 'o'
            output = output + y[slice_start:i] + 'e' # then add all the previous characters to the output string, and an e character to replace the o
            slice_start = i + 1 # Increment the index to start the slice at to be the letter immediately after the 'o'
    output = output + y[slice_start:-1] # Add the rest of the characters to output string from the last occurrence of an 'o' to the end of the string
    if y[-1] == 'o': # We still haven't checked the last character, so check if its an 'o'
        output = output + 'e' # If it is, add an 'e' instead to output
    else:
        output = output + y[-1] # Otherwise just add the character as-is
    print(output)
main()
Comments should explain what is going on. I'm not sure if this is the most efficient or best way to do it (which really shouldn't matter, since slicing is a terribly inefficient way to do this anyways), just the first thing I hacked together that uses slicing.
EDIT Yeah... Ourous's solution is much more elegant
Can slicing even be used in this situation??
The only probable solution I think would work, as MirekE stated, is y.replace("o","e").
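For comparison, here is a small sketch that puts the slicing approach and str.replace side by side. The helper name `replace_with_slicing` is made up for illustration, and it assumes that both `old` and `new` are single characters, so the string length never changes while looping.

```python
def replace_with_slicing(word, old="o", new="e"):
    """Return word with every `old` character swapped for `new`, built from slices."""
    for i in range(len(word)):
        if word[i] == old:
            # everything before i + the new character + everything after i
            word = word[:i] + new + word[i + 1:]
    return word


print(replace_with_slicing("brother"))  # brether
print("poop".replace("o", "e"))         # peep
```

Both give the same result; str.replace is the idiomatic choice when slicing is not a requirement.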

Resources