I have a string in a data frame as: "(1)+(2)"
I want to split with delimiter "+" such that I get one element as (1) and other as (2), hence preserving the parentheses. I used strsplit but it does not preserve the parenthesis.
Use
strsplit("(1)+(2)", "\\+")
or
strsplit("(1)+(2)", "+", fixed = TRUE)
The idea of using strsplit("(1)+(2)", "+") doesn't work since unless specified otherwise, the split argument is a regular expression, and the + character is special in regex. Other characters that also need extra care are
?
*
.
^
$
\
|
{ }
[ ]
( )
Below Worked for me:
import re
re.split('\\+', 'ABC+CDE')
Output:
['ABC', 'CDE']
Related
I am new to R programming and was trying out the gsub function for text replacement in pandas dataframe series(i.e new_text).
It a vast series so will not be able to print all here.
It is just a series with strings containing postal address.
I came across this gsub code : gsub(pattern = "( \n|\n )", replacement = " ", x = new_text) -> new_text
can you please let me know the meaning of this regex expression as well as the python alternative using regex expression.
Your pattern, slightly rewritten, is [ ]\n|\n[ ], which says to match:
[ ]\n a space followed by a newline
| OR
\n[ ] a newline followed by a space
Note that you might be able to use [ ]?\n[ ]? to the same effect, depending on the actual text you are using with gsub.
PurchPackingSlipJournalCreate class -> initHeader method have a line;
vendPackingSlipJour.PackingSlipId = purchParmTable.Num;
but i want when i copy and paste ' FDG 2020 ' (all blanks are tab character) in Num area and click okey, write this value as 'FDG2020' in the PackagingSlipId field of the vendPackingSlipJour table.
I tried -> vendPackingSlipJour.PackingSlipId = strRem(purchParmTable.Num, " ");
but doesn't work for tab character.
How can i remove all whitespace characters from string?
Version 1
Try the strAlpha() function.
From the documentation:
Copies only the alphanumeric characters from a string.
Version 2
Because version 1 also deletes allowed hyphens (-), you could use strKeep().
From the documentation:
Builds a string by using only the characters from the first input string that the second input string specifies should be kept.
This will require you to specify all desired characters, a rather long list...
Version 3
Use regular expressions to replace any unwanted characters (defined as "not a wanted character"). This is similar to version 2, but the list of allowed characters can be expressed a lot shorter.
The example below allows alphanumeric characters(a-z,A-Z,0-9), underscores (_) and hyphens (-). The final value for newText is ABC-12_3.
str badCharacters = #"[^a-zA-Z0-9_-]"; // so NOT an allowed character
str newText = System.Text.RegularExpressions.Regex::Replace(' ABC-12_3 ', badCharacters, '');
Version 4
If you know the only unwanted characters are tabs ('\t'), then you can go hunting for those specifically as well.
vendPackingSlipJour.PackingSlipId = strRem(purchParmTable.Num, '\t');
I'm trying to combine some stings to one. In the end this string should be generated:
//*[#id="coll276"]
So my inner part of the string is an vector: tag <- 'coll276'
I already used the paste() method like this:
paste('//*[#id="',tag,'"]', sep = "")
But my result looks like following: //*[#id=\"coll276\"]
I don't why R is putting some \ into my string, but how can I fix this problem?
Thanks a lot!
tldr: Don't worry about them, they're not really there. It's just something added by print
Those \ are escape characters that tell R to ignore the special properties of the characters that follow them. Look at the output of your paste function:
paste('//*[#id="',tag,'"]', sep = "")
[1] "//*[#id=\"coll276\"]"
You'll see that the output, since it is a string, is enclosed in double quotes "". Normally, the double quotes inside your string would break the string up into two strings with bare code in the middle:
"//*[#id\" coll276 "]"
To prevent this, R "escapes" the quotes in your string so they don't do this. This is just a visual effect. If you write your string to a file, you'll see that those escaping \ aren't actually there:
write(paste('//*[#id="',tag,'"]', sep = ""), 'out.txt')
This is what is in the file:
//*[#id="coll276"]
You can use cat to print the exact value of the string to the console (Thanks #LukeC):
cat(paste('//*[#id="',tag,'"]', sep = ""))
//*[#id="coll276"]
Or use single quotes (if possible):
paste('//*[#id=\'',tag,'\']', sep = "")
[1] "//*[#id='coll276']"
I'm using a linq query where i do something liike this:
viewModel.REGISTRATIONGRPS = (From a In db.TABLEA
Select New SubViewModel With {
.SOMEVALUE1 = a.SOMEVALUE1,
...
...
.SOMEVALUE2 = If(commaseparatedstring.Contains(a.SOMEVALUE1), True, False)
}).ToList()
Now my Problem is that this does'n search for words but for substrings so for example:
commaseparatedstring = "EWM,KI,KP"
SOMEVALUE1 = "EW"
It returns true because it's contained in EWM?
What i would need is to find words (not containing substrings) in the comma separated string!
Option 1: Regular Expressions
Regex.IsMatch(commaseparatedstring, #"\b" + Regex.Escape(a.SOMEVALUE1) + #"\b")
The \b parts are called "word boundaries" and tell the regex engine that you are looking for a "full word". The Regex.Escape(...) ensures that the regex engine will not try to interpret "special characters" in the text you are trying to match. For example, if you are trying to match "one+two", the Regex.Escape method will return "one\+two".
Also, be sure to include the System.Text.RegularExpressions at the top of your code file.
See Regex.IsMatch Method (String, String) on MSDN for more information.
Option 2: Split the String
You could also try splitting the string which would be a bit simpler, though probably less efficient.
commaseparatedstring.Split(new Char[] { ',' }).Contains( a.SOMEVALUE1 )
what about:
- separating the commaseparatedstring by comma
- calling equals() on each substring instead of contains() on whole thing?
.SOMEVALUE2 = If(commaseparatedstring.Split(',').Contains(a.SOMEVALUE1), True, False)
I am trying to write a regular expression that doesn't allow single or double quotes in a string (could be single line or multiline string). Based on my last question, I wrote like this ^(?:(?!"|').)*$, but it is not working. Really appreciate if anybody could help me out here.
Just use a character class that excludes quotes:
^[^'"]*$
(Within the [] character class specifier, the ^ prefix inverts the specification, so [^'"] means any character that isn't a ' or ".)
Just use a regex that matches for quotes, and then negate the match result:
var regex = new Regex("\"|'");
bool noQuotes = !regex.IsMatch("My string without quotes");
Try this:
string myStr = "foo'baa";
bool HasQuotes = myStr.Contains("'") || myStr.Contains("\""); //faster solution , I think.
bool HasQuotes2 = Regex.IsMatch(myStr, "['\"]");
if (!HasQuotes)
{
//not has quotes..
}
This regular expression below, allows alphanumeric and all special characters except quotes(' and "")
#"^[a-zA-Z-0-9~+:;,/#&_#*%$!()\[\] ]*$"
You can use it like
[RegularExpression(#"^[a-zA-Z-0-9~+:;,/#&_#*%$!()**\[\]** ]*$", ErrorMessage = "Should not allow quotes")]
here use escape sequence() for []. Since its not showing in this post