Let's say I have the following phrase:
Tiger Woods plays golf
I'm trying to use jq to replace all of the spaces with + signs so the final result would be:
Tiger+Woods+plays+golf
Is there any way to do this?
Given the string "Tiger Woods plays golf",
using jq's expression gsub("\\s";"+") you should be able to replace the spaces with "+" characters. Note that \s matches any whitespace character, not only a plain space.
result[1]: "Tiger+Woods+plays+golf"
[1] https://jqplay.org/s/SxiKIClW13
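For comparison outside jq, the same substitution can be sketched in Python. This is only an illustration and not part of the jq answer; the variable names are made up here. As in the jq expression, \s matches any whitespace character.

```python
import re

phrase = "Tiger Woods plays golf"

# Replace every whitespace character with "+", mirroring gsub("\\s"; "+") in jq.
result = re.sub(r"\s", "+", phrase)
print(result)
```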
I'm cleaning some text data and I've come across a problem associated with removing newline text. For this data, there are not merely \n strings in the text, but \n\n strings, as well as numbered newlines such as: \n2 and \n\n2. The latter are my problem. How does one remove this using regex?
I'm working in R. Here is some sample text and what I've used, so far:
#string
string <- "There is a square in the apartment. \n\n4Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten.\n2"
#code attempt
gsub("[\r\\n0-9]", '', string)
The problem with this regex is that it removes every digit and also matches the letter n.
I would like to have the following output:
"There is a square in the apartment. Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten."
I'm using regexr for reference.
Written like this, the character class [\r\\n0-9] matches a carriage return, either of the characters \ or n, or a digit 0-9.
You could instead write a pattern that matches one or more carriage returns or newlines, followed by optional digits:
[\r\n]+[0-9]*
Example:
string <- "There is a square in the apartment. \n\n4Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten.\n2"
gsub("[\r\n]+[0-9]*", '', string)
Output
[1] "There is a square in the apartment. Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten."
See an R demo.
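The same pattern can be tried outside R. Below is a minimal Python sketch, assuming Python's re module stands in for R's gsub; the variable names are illustrative only.

```python
import re

text = ("There is a square in the apartment. \n\n4Great laughs, which I hear "
        "from the other room. 4 laughs. Several. 9 times ten.\n2")

# One or more CR/LF characters, followed by optional digits.
# Digits that do not directly follow a line break are left alone.
cleaned = re.sub(r"[\r\n]+[0-9]*", "", text)
print(repr(cleaned))
```

Because the digits are only consumed right after a line break, "4 laughs" and "9 times ten" survive.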
I have a document, and I need to find all the words (no spaces) enclosed in single quotes (e.g. 'apple', 'hello'). What would the regular expression be?
I've tried ^''$ but it didn't work.
If matching any word isn't possible, it would also be fine to match only words from a fixed list (e.g. apple, banana, lemon), but they must still be enclosed in single quotes.
Thank you so much
Andrew
If you want to capture single-quoted strings, i.e. any run of characters other than single quotes enclosed between two single quotes, use
/'[^']+'/
If you need single words, i.e. alphabetic characters but no spaces, try
/'[a-zA-Z]+'/
I'm assuming a couple of things here:
You're using a language that delimits regexes with slashes. This includes JavaScript and Perl to my knowledge, and probably a bunch of others. In some other languages, like C#, you should use double quotes to delimit, e.g. "'[a-zA-Z]+'"
You're using a flavor of regex that does not need to escape the plus sign.
You're trying to capture all such words within a long string. I.e., if the input string is "Here is a 'long' string with 'some' 'words' single-quoted" then you will capture three words: 'long','some', and 'words'.
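To check the third assumption, here is a small Python sketch that runs both patterns over the example sentence. Python is chosen only for illustration; it passes the pattern as a string rather than between slashes, and the variable names are hypothetical.

```python
import re

text = "Here is a 'long' string with 'some' 'words' single-quoted"

# Single-quoted runs of ASCII letters only (no spaces, digits, or punctuation).
words = re.findall(r"'[a-zA-Z]+'", text)

# Any run of non-quote characters between single quotes.
quoted = re.findall(r"'[^']+'", text)

print(words)
print(quoted)
```

On this input both patterns return the same three matches. They would differ on text such as 'two words', which only the second pattern accepts.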
I have a dataframe with a bunch of titles for classroom courses that include special characters. I'm trying to find and replace them, but it's not working.
Example
Tank Walk Around – Round Portable Restroom Tanks
db$objectName[db$objectName == "Tank Walk Around – Round Portable Restroom Tanks"] <- "Tank Walk Around - Round Portable Restroom Tanks"
I also have other course titles with these special characters that have been problematic as well
` ’ “ „ ¢ € ®
Assuming you want to keep all alphanumeric characters, the following code can be used. It uses a regular expression to remove every character that is neither alphanumeric nor a space.
str = "Tank Walk Around – Round Portable Restroom Tanks"
print(strsplit(gsub("[^[:alnum:] ]", "", str), " +")[[1]])
Result:
[1] "Tank" "Walk" "Around" "â" "Round" "Portable" "Restroom" "Tanks"
Source: R remove non-alphanumeric symbols from a string
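The same idea can be sketched in Python. This assumes that keeping only ASCII letters, digits, and spaces is acceptable; R's [:alnum:] class is locale-dependent, so the explicit ranges here are a deliberate simplification. The variable names are illustrative.

```python
import re

title = "Tank Walk Around \u2013 Round Portable Restroom Tanks"  # \u2013 is the en dash

# Drop everything except ASCII letters, digits, and spaces.
# Assumption: ASCII-only output is acceptable for these course titles.
cleaned = re.sub(r"[^A-Za-z0-9 ]", "", title)

# Split on runs of spaces, like strsplit(..., " +") in the R answer.
tokens = re.split(r" +", cleaned)
print(tokens)
```

When the input is decoded correctly, the en dash is removed entirely and no stray character is left between "Around" and "Round".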
I'd like to split the string into the following
S <- "No. Ok (whatever). If you must. Please try to be careful (shakes head)."
[1] No.
[2] Ok (whatever). If you must.
[3] Please try to be careful (shakes head).
The pattern is the first . before each (...).
I'm familiar with (?<=...) (i.e. positive lookbehind) but this doesn't seem to work with non-fixed length patterns. I'd like to know if I'm wrong about positive lookbehind or if there's some regex magic to do this. Thanks!
Note that I don't know much about Ruby, but there should be something like a split method that takes a regex pattern as a delimiter and splits the string accordingly.
Use this regex:
(?<=\.) (?=[^.]+?\(.+?\))
This looks for a space character. Behind the space there must be a dot: (?<=\.). After the space, the lookahead (?=...) requires a run of characters that are not dots, [^.]+?, followed by a pair of parentheses with something inside, \(.+?\).
Try it online: https://regex101.com/r/8PcbFJ/1
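As a quick check of the pattern, here is a minimal sketch in Python rather than Ruby (an assumption made purely for illustration; Ruby's String#split accepts a regex in a similar way). The lookbehind is a single character wide, so the fixed-length restriction does not get in the way.

```python
import re

s = "No. Ok (whatever). If you must. Please try to be careful (shakes head)."

# Split on a space that follows a dot and precedes a dot-free run ending in "(...)".
parts = re.split(r"(?<=\.) (?=[^.]+?\(.+?\))", s)
for part in parts:
    print(part)
```

The space after "(whatever)." is not a split point, because the next parenthesis can only be reached by crossing the dot after "must".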
I'm using the ASP.NET RegularExpressionValidator
I need a regular expression to keep users who fill out a form from using all caps.
For example, if they write their name:
Bob JONES or BOB JONES or BOB JOnes or whatever, it will not match.
I am able to match all caps with this regular expression:
[A-Z]{2,10}
But the RegularExpressionValidator requires me to match valid text, not invalid text.
If your goal is for no word to contain two capital letters in a row, and assuming it's okay to restrict to ASCII letters, try something like this:
^(?:[a-z]|[A-Z](?![A-Z])|['-])+$
In other words, the string must be entirely composed of either lowercase letters, or uppercase letters not followed by another uppercase letter.
This works for single words. For multiple words (like a full name, first and last), simply add a space to the alternation:
^(?:[a-z]|[A-Z](?![A-Z])|[\s'-])+$
(Edited to allow apostrophe and hyphen punctuation)
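The multi-word pattern can be sanity-checked outside ASP.NET. The sketch below uses Python's re module as a stand-in for the .NET regex engine (the negative lookahead syntax is the same in both); is_valid and the sample names are hypothetical helpers for this illustration.

```python
import re

# The multi-word pattern from the answer above, unchanged.
NO_CAPS_RUN = re.compile(r"^(?:[a-z]|[A-Z](?![A-Z])|[\s'-])+$")

def is_valid(name: str) -> bool:
    """True if name uses only letters, whitespace, ' and -,
    and no uppercase letter is directly followed by another."""
    return NO_CAPS_RUN.match(name) is not None

for candidate in ["Bob Jones", "O'Brien", "Bob JONES", "BOB JONES", "BOB JOnes"]:
    print(candidate, is_valid(candidate))
```

The first two names are accepted and the last three are rejected, which is the "match valid text" behaviour the RegularExpressionValidator needs.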
Use this regex: ^[^A-Z]*$ It will match any text that contains no uppercase characters.
Use this regular expression: ^[a-z ]+$
If you want to catch names like Bob Jones, use this one: ^([A-Z][a-z ]+)+$
Maybe I'm just stating the obvious, but couldn't you just do myVar.string.toLower before doing the Compare?