The f_num function from the numform package will remove leading zeros from a number:
f_num(0.1)
Output:
.1
I need this very same thing, but with a comma instead of the period. It would also be great to keep the f_num feature that lets you round the number to a given number of decimals.
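Here is a custom alternative (see note below):
detrail <- function(num, round_dec = NULL) {
  # Optionally round before formatting
  if (!is.null(round_dec)) {
    num <- round(num, round_dec)
  }
  # Replace a single leading digit plus the decimal point with a comma
  gsub("^\\d\\.", ",", num)
}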
detrail(0.1)
[1] ",1"
detrail(1.1)
[1] ",1"
detrail(0.276,2)
[1] ",28"
NOTE:
The result is a character string. To make R itself print numbers with , instead of . as the decimal mark, you would need to set options(OutDec = ","). I have not done this as I do not like changing global options.
This also removes any single leading digit, not just zero (see detrail(1.1) above). To strip only a leading zero, use 0 instead of \\d in the pattern.
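As a minimal sketch of the variant described in the note, the function below strips only a leading zero, keeps the optional rounding, and swaps the decimal point for a comma. The name drop_zero_comma is hypothetical and not part of numform; the handling of negative numbers is an assumption about what is wanted.

```r
# Hypothetical helper (not from numform): strip only a leading zero,
# use a comma as the decimal mark, and optionally round first.
drop_zero_comma <- function(num, round_dec = NULL) {
  if (!is.null(round_dec)) {
    num <- round(num, round_dec)
  }
  # Swap the decimal point for a comma in the character representation
  out <- sub(".", ",", as.character(num), fixed = TRUE)
  # Drop a zero that sits directly before the comma, keeping any minus sign
  sub("^(-?)0,", "\\1,", out)
}

drop_zero_comma(0.1)
drop_zero_comma(1.1)
drop_zero_comma(0.276, 2)
```

Unlike detrail, this leaves 1.1 as "1,1" instead of dropping the 1.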
I have a variable named full.path, and I am checking whether the string it contains includes certain special characters.
In the code below I use grepl to look for them. Even though none of these characters are present, the output I get is TRUE.
Could someone explain what is going on? Thanks in advance.
full.path <- "/home/xyz"
#This returns TRUE :(
grepl("[?.,;:'-_+=()!##$%^&*|~`{}]", full.path)
By plugging this regex into https://regexr.com/ I was able to spot the issue: an unescaped - between two characters in a character class creates a range. The range from ' to _ covers digits, uppercase letters, and several punctuation characters, including /, so the slashes in the path produce a spurious match.
To avoid this behaviour, you can put - first in the character class, which is how you signal you want to actually match - and not a range:
> grepl("[-?.,;:'_+=()!##$%^&*|~`{}]", full.path)
[1] FALSE
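To see why the slash is caught, it can help to look at the code points involved. The following is a small sketch, assuming an ASCII-compatible locale, that checks where / falls relative to ' and _ and then compares the two character classes in isolation:

```r
# Code points of the range endpoints and of the slash
lo    <- utf8ToInt("'")   # 39
slash <- utf8ToInt("/")   # 47
hi    <- utf8ToInt("_")   # 95

# The slash lies inside the range '-_
in_range <- slash >= lo && slash <= hi

# '-_ read as a range: matches the slashes in the path
range_match <- grepl("['-_]", "/home/xyz")

# - placed first: three literal characters, none present in the path
literal_match <- grepl("[-'_]", "/home/xyz")
```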
Using R, I have a long list of keywords that I'm searching for in a dataset. One of the keywords needs to have parentheses around it in order to be included.
I've been attempting to replace the parenthesis in the keywords list with \\ then the parentheses, but have not been successful. If there is a way to modify the grepl() function to recognize them, that would also be helpful. Here is an example of what I'm trying to accomplish:
patterns<-c("dog","cat","(fish)")
data<-c("brown dog","black bear","salmon (fish)","red fish")
patterns2<- paste(patterns,collapse="|")
grepl(patterns2,data)
[1] TRUE FALSE TRUE TRUE
I would like salmon (fish) to give TRUE, and red fish to give FALSE.
Thank you!
As noted by @joran in the comments, the pattern should look like this:
patterns<-c("dog","cat","\\(fish\\)")
The double backslashes (\\) tell R to read the parentheses literally when searching for the pattern.
Easiest way to achieve this if you don't want to make the change manually:
patterns <- gsub("([()])","\\\\\\1", patterns)
Which will result in:
[1] "dog" "cat" "\\(fish\\)"
If you're not very familiar with regular expressions, what happens here is that it looks for any one character within the square brackets. The round brackets around that tell it to save whatever it finds that matches the contents. Then, the first four backslashes in the second argument tell it to insert one literal backslash: R's string parser turns each pair of backslashes into one, and the regex replacement turns the remaining pair into one, which print() then displays as \\. Finally, the \\1 tells it to add whatever it saved from the first argument, i.e., either ( or ).
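The backslash counting is easier to follow when the strings are inspected with nchar() and writeLines(), which show the actual characters instead of the escaped printed form. A short sketch, with illustrative variable names:

```r
# Six backslashes in source code are three real backslashes, plus the digit 1
replacement <- "\\\\\\1"
nchar(replacement)       # 4
writeLines(replacement)  # the raw string handed to the regex engine

# Each parenthesis gains exactly one real backslash
escaped <- gsub("([()])", replacement, "(fish)")
nchar(escaped)           # 8
writeLines(escaped)      # the raw pattern, without print()'s escaping

# The escaped pattern now matches literal parentheses only
grepl(escaped, c("salmon (fish)", "red fish"))
```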
Another option is to forget regex and use grepl with fixed = TRUE on the original, unescaped patterns:
rowSums(sapply(patterns, grepl, data, fixed = TRUE)) > 0
# [1] TRUE FALSE TRUE FALSE
I use knitr to have LaTeX pull numbers directly from R output, e.g., using \Sexpr{res$a[1]} to \Sexpr{res$a[5]}. Is there a way to automatically precede positive numbers with a plus sign? Sure, I could add plus signs to relevant numbers manually, but this seems to defeat the purpose of knitr.
Sorry it took me a while to get back to this. It turned out to be easier than I thought. knitr doesn't appear to have its own settings for controlling how numbers are printed. Instead, it relies on the options from your R session.
Now consider the following:
x <- 5.1234567899876543
x
[1] 5.123457
options()$digits
[1] 7
So the way the number is printed to the console is (partially) determined by options("digits"). Now, watch what happens when we apply the format function to x with all of the default arguments:
format(x)
[1] "5.123457"
We get back a character string that matches the representation when we simply printed x. Let's leverage this to our benefit:
with_plus <- function(x, ...)
{
  if (x > 0)
  {
    sprintf(
      fmt = "+ %s",
      format(x, ...)
    )
  }
  else
  {
    x
  }
}
with_plus(x)
[1] "+ 5.123457"
Now you have a function that, under the default settings, will print numbers the same way they appear in the console but with a "+" prepended to positive numbers. Using format, you also get the flexibility to adjust individual values as needed.
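If a fixed number of decimals is acceptable, sprintf's + flag offers a vectorised alternative. This is a sketch of a different technique from the format-based function above, and the name with_plus_fixed and its digits argument are hypothetical. Note that the + flag also puts a sign on zero.

```r
# Hypothetical helper: always show a sign, with a fixed number of decimals
with_plus_fixed <- function(x, digits = 2) {
  fmt <- paste0("%+.", digits, "f")  # e.g. "%+.2f"
  sprintf(fmt, x)
}

with_plus_fixed(c(5.1234567, -2.5, 0))
```

In a knitr document this could be called inline, e.g. \Sexpr{with_plus_fixed(res$a[1])}.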
I can change the decimal character from output using:
> 1/2
[1] 0.5
> options(OutDec = ',')
> 1/2
[1] 0,5
But, this change does not affect sprintf() function.
> sprintf('%.1f', 1/2)
[1] "0.5"
So, my question is: is there an easy way to change the decimal character? I don't think I can use a 'simple' regular expression, because not every . needs to be replaced by ,.
I don't have any idea of how to do it, so I can't say what I've already done.
I think you can do this by setting your locale appropriately, making sure that the LC_NUMERIC component is set to a locale that uses a comma as the decimal separator (http://docs.oracle.com/cd/E19455-01/806-0169/overview-9/index.html).
Sys.setlocale("LC_NUMERIC","es_ES.utf8")
sprintf("%f",1.5)
## "1,500000"
This gives a warning that R may behave strangely; you probably want to switch LC_NUMERIC back to C as soon as you're done generating output.
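One way to make sure the locale is switched back is to wrap the change in a small function that restores the previous setting on exit. This is only a sketch: the helper name with_numeric_locale is hypothetical, and whether a locale such as "es_ES.utf8" is available depends on the operating system, so the result of the call is not guaranteed.

```r
# Hypothetical helper: evaluate an expression under a temporary LC_NUMERIC
with_numeric_locale <- function(locale, expr) {
  old <- Sys.getlocale("LC_NUMERIC")
  # Restore the previous setting even if expr fails
  on.exit(suppressWarnings(Sys.setlocale("LC_NUMERIC", old)), add = TRUE)
  # Silences the "may cause R to function strangely" warning, and the
  # warning issued when the requested locale is not installed
  suppressWarnings(Sys.setlocale("LC_NUMERIC", locale))
  force(expr)  # lazy argument, so it is evaluated here, under the new locale
}

res <- with_numeric_locale("es_ES.utf8", sprintf("%f", 1.5))
```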
Try this
sprintf("%s",format(1.5,decimal.mark=","))
Or try this in other cases (e.g. I wanted "%+3.1f %%" in sprintf) :
gsub("\\.",",", sprintf("%+3.1f %%",1.99))
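Since the question notes that not every . should become a comma, a slightly narrower sketch replaces only dots that sit between two digits, using lookarounds with perl = TRUE. The helper name swap_decimal is hypothetical, and the approach assumes that a dot between digits is always a decimal point, which would not hold for things like version numbers.

```r
# Hypothetical helper: replace a dot only when it has a digit on both sides
swap_decimal <- function(s) {
  gsub("(?<=[0-9])\\.(?=[0-9])", ",", s, perl = TRUE)
}

# The sentence-ending dot is left alone
swap_decimal(sprintf("Growth: %+3.1f %%.", 1.99))
```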
I have
str=c("00005.profit", "00005.profit-in","00006.profit","00006.profit-in")
and I want to get
"00005.profit" "00006.profit"
How can I achieve this using grep in R?
Here is one way:
R> s <- c("00005.profit", "00005.profit-in","00006.profit","00006.profit-in")
R> unique(gsub("([0-9]+\\.profit).*", "\\1", s))
[1] "00005.profit" "00006.profit"
We define a regular expression as digits followed by .profit, which we capture by enclosing it in parentheses. The \\1 then recalls the first such capture group, and since the replacement contains nothing else, that is all we get. The unique() then reduces the four items to two unique ones.
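The same extraction can also be written without a capture group, by pulling out the matched text directly with regexpr() and regmatches(). This is a sketch of an alternative technique, not the approach used in the answer above:

```r
s <- c("00005.profit", "00005.profit-in", "00006.profit", "00006.profit-in")

# regexpr() locates the first match in each string;
# regmatches() returns the matched substrings
hits <- regmatches(s, regexpr("[0-9]+\\.profit", s))

unique(hits)
```

Strings with no match are dropped by regmatches(), whereas gsub() would return them unchanged.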
Dirk's answer is pretty much the ideal generalisable answer, but here are a couple of other options based on the fact that your example always has a - character starting the part you wish to chop off:
1: gsub to return everything prior to the -
gsub("(.+)-.+","\\1",str)
2: strsplit on - and keep only the first part.
sapply(strsplit(str,"-"),head,1)
Both return:
[1] "00005.profit" "00005.profit" "00006.profit" "00006.profit"
which you can then wrap in unique to not return duplicates like:
unique(gsub("(.+)-.+","\\1",str))
unique(sapply(strsplit(str,"-"),head,1))
These will then return:
[1] "00005.profit" "00006.profit"
Another non-generalisable solution would be to just take the first 12 characters (assuming string length for the part you want to keep doesn't change):
unique(substr(str,1,12))
[1] "00005.profit" "00006.profit"
I'm actually interpreting your question differently. I think you might want
grep("[0-9]+\\.profit$",str,value=TRUE)
That is, if you only want the strings that end with profit. The $ special character stands for "end of string", so it excludes cases that have additional characters at the end ... The \\. means "I really want to match a dot, not any character at all" (a . by itself would match any character). You weren't entirely clear about your target pattern -- you might prefer "0+[1-9]\\.profit$" (any number of zeros followed by a single non-zero digit), or even "0{4}[1-9]\\.profit$" (4 zeros followed by a single non-zero digit).
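To make the effect of the anchor and the escaped dot concrete, here is a short sketch that applies the patterns to the example vector. The string "00005xprofit" is made up purely to show what an unescaped dot lets through.

```r
s <- c("00005.profit", "00005.profit-in", "00006.profit", "00006.profit-in")

# Filtering: keep only the elements that end in .profit
kept <- grep("[0-9]+\\.profit$", s, value = TRUE)

# The stricter variant gives the same result on this data
kept_strict <- grep("0{4}[1-9]\\.profit$", s, value = TRUE)

# An unescaped dot matches any character, so this made-up string slips through
loose <- grepl("[0-9]+.profit$", "00005xprofit")
tight <- grepl("[0-9]+\\.profit$", "00005xprofit")
```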