How do I use regular expressions in RStudio's "Find in files"?
Searching for literal numbers works just fine:
But when trying to use a regular expression to find a number I can't:
The documentation does not mention which type of regex is needed:
https://support.rstudio.com/hc/en-us/articles/200710523-Navigating-Code
So maybe I am using a wrong flavor of regex?
The RStudio documentation is (to be kind) sorely lacking a reference describing the regex syntax that is supported within RStudio find and replace dialogs.
However, in answer to your question about how to find numeric digits, either of the following works in the "Find in Files" dialog with the "Regular expression" option ticked:
[0-9]
[[:digit:]]
Unfortunately, as you found, \d does not work. In fact, on my current version* \d simply finds the letter 'd' or 'D'.
\s works as expected, so perhaps \d not working is a bug in RStudio?
*RStudio version I'm using:
Version 1.1.463 – © 2009-2018 RStudio, Inc.
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) RStudio Safari/538.1 Qt/5.4.0
RStudio uses POSIX Basic Regular Expressions for its "Find in files" functionality (because it is using grep under the hood).
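If grep really is what runs underneath (that is the claim above, I have not checked RStudio's source), the two patterns that work can be tried against grep directly. A minimal Python sketch, assuming a grep binary is on the PATH; the sample text and the helper name grep_lines are made up for illustration, and this is not how RStudio itself invokes grep:

```python
import shutil
import subprocess

# Hypothetical sample input: two of the four lines contain a digit.
SAMPLE = "alpha\nbeta 42\ngamma\nx7\n"

def grep_lines(pattern, text=SAMPLE):
    """Return the lines of `text` that grep matches in its default (BRE) mode."""
    result = subprocess.run(
        ["grep", "-e", pattern],
        input=text,
        capture_output=True,
        text=True,
    )
    # grep exits with 0 on a match, 1 on no match, and >1 on a real error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    return result.stdout.splitlines()

if shutil.which("grep"):  # assumption: grep is installed
    print(grep_lines("[0-9]"))        # bracket expression with a range
    print(grep_lines("[[:digit:]]"))  # POSIX character class
```

Both calls should list the same two lines, which matches what the dialog does with these patterns.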
I was teaching an online course and a student asked me why R only uses / and not \ in file paths when using read.csv and related functions. I looked at the documentation, but it didn’t really mention anything about it. I had never really thought about it because I use a Mac, where the default is /, unlike on Windows machines.
I’m not trained in computer science, so I’m afraid I was a bit stumped by the question. Students always ask the darnedest things!
Interesting question.
First off, the "forward slash" / is actually more common, as it is used by Unix, Linux, and macOS.
Second, the "backward slash" \ is actually somewhat painful, as it is also an escape character. So whenever you want one, you need to type two in a string: "C:\\TEMP".
Third, R on Windows knows this and helps! So you can use a forward slash wherever you would use a backward slash: "C:/TEMP" works the same!
Fourth, you can have R compute the path for you, and it will insert the separator for you: file.path("some", "dir").
So the short answer: R uses both on Windows and lets you pick whichever you find easier. But remember to use two backward slashes (unless you use the very new R 4.0.0 feature on raw strings which I'll skip for now).
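The same escaping rule applies in most languages with C-style string literals, not just R. As a rough illustration in Python (not R; PureWindowsPath is only a stand-in here for what R does with Windows paths, and the variable names are made up):

```python
from pathlib import PureWindowsPath

escaped = "C:\\TEMP"   # two backslashes in the source, one in the actual string
raw = r"C:\TEMP"       # raw string: the backslash is not treated as an escape
forward = "C:/TEMP"    # forward slash, no escaping needed

print(len(escaped))    # 7 characters: C : \ T E M P
print(escaped == raw)  # True: both spell the same string

# On Windows-style paths the two separators are interchangeable.
print(PureWindowsPath(forward) == PureWindowsPath(escaped))  # True

# Building a path from components avoids typing a separator at all.
print(PureWindowsPath("some", "dir"))  # some\dir
```

The raw-string line is the Python counterpart of the R 4.0.0 raw-string feature mentioned above.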
(Note: forward slashes as folder separators on Macs are a relatively recent innovation. See History of Mac folder separators.)
I think if you review the history (or look it up if you were not there when it occurred, as I was) you will find that Unix (which Linux copied completely) got there first. It preceded MS-DOS, Macs and, last of all, Windows. R is a work-alike clone of S, which, like Unix, was developed at Bell Labs.
Mac originally used colons (:) as folder separators (and still won't accept them in file names) and converted to slashes sometime during its long transition to BSD Unix, which it licensed from AT&T.
Shouldn't the question be: why Microsoft chose to use a backslash?
Here's a simple problem but I can't solve it alone since I'm not really familiar with SQL.
As most of you may already know, German has umlaut letters, e.g. "Ä,Ö,Ü", whose lower-case forms are "ä,ö,ü".
I'm using an SQLite database, accessing it with the Firefox plugin "SQLiteManager".
My select statement looks like this:
SELECT * FROM Projects WHERE Token LIKE '%ä%'
The Firefox plugin and also a SQLite library for .NET both return the wrong output. They return not only the entries with the lower case "ä", but also the entries with the upper case "Ä".
Do you guys know a simple solution to this?
The documentation says:
SQLite only understands upper/lower case for ASCII characters by default. The LIKE operator is case sensitive by default for unicode characters that are beyond the ASCII range.
But:
The ICU extension to SQLite includes an enhanced version of the LIKE operator that does case folding across all unicode characters.
This is a very inconvenient workaround which does not make queries faster, but it does the trick. I replace all uppercase German umlauts after lowercasing my test string, like this:
SELECT replace(replace(replace(lower('ÄAÄBÖOÖGDDÜUÜ'), 'Ä', 'ä'), 'Ü', 'ü'), 'Ö', 'ö') AS lowered
lowered
---------
äaäböoögddüuü
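For anyone who wants to try this outside SQLiteManager, here is a small sketch using Python's built-in sqlite3 module. The table and column names come from the question; the three sample rows are made up. The GLOB query at the end is my own addition rather than something suggested in this thread: GLOB is always case sensitive in SQLite, so it may be an option when only the lower-case "ä" should match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Projects (Token TEXT)")
# Hypothetical sample rows for illustration only.
conn.executemany(
    "INSERT INTO Projects (Token) VALUES (?)",
    [("Bär",), ("BÄR",), ("Bar",)],
)

# The workaround from above: lower() followed by one replace() per umlaut.
lowered = conn.execute(
    "SELECT replace(replace(replace(lower('ÄAÄBÖOÖGDDÜUÜ'), "
    "'Ä', 'ä'), 'Ü', 'ü'), 'Ö', 'ö')"
).fetchone()[0]
print(lowered)

# Alternative (an assumption about what the asker needs): GLOB uses * and ?
# as wildcards instead of % and _, and never folds case.
glob_hits = [
    row[0]
    for row in conn.execute(
        "SELECT Token FROM Projects WHERE Token GLOB '*ä*' ORDER BY Token"
    )
]
print(glob_hits)
conn.close()
```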
The characters * and ? are used as wildcards in pathnames. How does one refer to a filename that has ? as one of its actual characters? For example:
[18]> (wild-pathname-p #p"foo")
NIL
[19]> (wild-pathname-p #p"foo?")
T
So referring to the filename "foo?" cannot be done this way. I tried to escape the ? with a backslash, but that didn't work. I tried going unicode by using \u3f or \u003f, but that didn't work.
How do I refer to a file that contains a wildcard as part of its name: How to probe it, open it, etc.?
It depends on the implementation, but for some, a backslash does in fact work. But because namestrings are strings, to get a string with a backslash in it, you have to escape the backslash with another backslash. So, for example, "foo?" is escaped as "foo\\?", not "foo\?".
Last time I checked, in CLISP, there is no way to refer to files with wildcards in the names. My solution to that is to avoid CLISP.
On my Mac running Mac OS X 10.10.3: Clozure CL, SBCL and LispWorks write a pathname with * like this:
#P"/private/tmp/test.\\*"
They might differ in some other details, though.
SBCL supports (make-pathname :directory "" :name file) to escape a string to a proper pathname.
A similar question was asked a year ago, but the requirements were different (querent wanted RStudio), and the solution package is not compatible with R 3.0.
I am using the R interpreter directly from the bash command line. I would like my scripts to output color text, ideally in a manner similar to how using a particular sequence of characters in C causes the color to be different.
More specifically, in C, we can output colors using printf as described in the answer to this question. I wonder if R 3.0.2 has a facility to do the same.
The ANSI sequences in the question you mentioned are processed by the terminal emulator so they will work fine in R:
cat("\033[32;1m OK \033[0m\n")
Note that \033 is the (octal) code for the escape character. It is a single (non-printable) character which tells the terminal to start interpreting the control sequence. print, when given \033, will output the four characters \, 0, 3, 3 literally, which, of course, tells the terminal nothing. See Wikipedia for the full list of ANSI escape sequences.
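Since the terminal does the interpreting, the same bytes work from any language. A short Python sketch of the idea (the helper name colorize is made up; "32;1" is the green/bold code from the cat example above):

```python
ESC = "\033"  # octal escape: a single ESC character, code 27

def colorize(text, sgr="32;1"):
    """Wrap `text` in an ANSI SGR sequence and reset attributes afterwards."""
    return f"{ESC}[{sgr}m{text}{ESC}[0m"

print(len(ESC), ord(ESC))      # one character, code point 27
print(colorize(" OK "))        # the terminal renders this in bold green
print(repr(colorize(" OK ")))  # repr shows the escape instead of sending it
```

The last line is the Python analogue of what R's print does: the escape is displayed in escaped form rather than written raw, so the terminal never sees a control sequence.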
I have a txt file that I want to sort of 'grep' through and get rid of anything between and including '<' and '>'. I am using OS X, so feel free to recommend a good, free IDE, or maybe I can just do it with Emacs; I probably already have a compiler of some sort on the OS. So I am mainly looking for the script, in whatever language. ReplaceAll("<*>",""), something like that? Is that Java or what?
You can use sed for this. Try sed "s/<[^>]*>//g" file.html
That will replace all occurrences of <[^>]*> (a basic regular expression matching all text between < and >, inclusive) with nothing.
For more details, see:
http://www.gnu.org/software/sed/manual/html_node/Regular-Expressions.html
http://en.wikipedia.org/wiki/Sed
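Since the question asked for a script "in whatever language", the same pattern carries over unchanged to Python's re module. A minimal sketch (the function name strip_tags and the sample string are made up):

```python
import re

# Same pattern as in the sed command: "<", any run of non-">" characters, ">".
TAG = re.compile(r"<[^>]*>")

def strip_tags(text):
    """Remove everything between '<' and '>', including the brackets."""
    return TAG.sub("", text)

print(strip_tags("<p>Hello <b>world</b>!</p>"))  # Hello world!
```

Like the sed version, this is plain pattern matching and not an HTML parser, so a literal > inside an attribute value will end the match early.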