How to handle special characters - r

I need some help with R programming.
Basically I need to get input from the user and use it as a variable in my R script.
When getting the user input, the following checks need to be made:
Check whether the input is missing; if so, prompt the user to re-enter it.
Check that only alphanumeric characters are entered; otherwise prompt the user to re-enter.
Allow some special characters: $, #, &, etc.
White space is allowed, as in first name, " ", last name.

It is unclear what you are trying to do with the else if part of your code. The nature of readline() is that it will return a string of the user's input. Are there any specific characters you don't want included in the input? You could use grepl() to identify them and prevent the user from entering them as an input.
If you are trying to ensure that the user inputs something then you should use a while loop as suggested in the comments. If you are going to use your variable in R after the function runs then you need to return() the value of v1 - the user input. If you are trying to replace the space in between the first and last name with %20 then you may want to use gsub(). See the code below.
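fun1 <- function() {
  v1 <- readline(prompt = "Enter your First & Last Name: ")
  # Keep prompting until the user enters something
  while (v1 == "") {
    v1 <- readline("You forgot to enter your Name. Please try again: ")
  }
  # Replace each space with %20 and return the result
  return(gsub(" ", "%20", v1))
}
# Entering David Smith at the prompt returns "David%20Smith"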
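The character check asked for in the question could be combined with the same loop. The following is a minimal sketch, not part of the original answer: is_valid_input and read_name are hypothetical helpers, and the allowed set (letters, digits, spaces, $, # and &) is an assumption based on the examples in the question.

```r
# Hypothetical helper: TRUE when x is not blank and contains only letters,
# digits, spaces and the special characters $, # and &.
is_valid_input <- function(x) {
  nzchar(trimws(x)) & grepl("^[A-Za-z0-9 $#&]+$", x)
}

# Hypothetical wrapper: keep prompting until the input passes the check.
read_name <- function() {
  v1 <- readline("Enter your First & Last Name: ")
  while (!is_valid_input(v1)) {
    v1 <- readline("Invalid or missing input. Please try again: ")
  }
  v1
}
```

Extending the bracket expression in the pattern is enough to allow further characters.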

Related

Why do input functions like readline() and scan() include code when scanning for inputs? How to fix?

If I run,
input <- as.integer(scan(what = "integer"))
I get no problems, but
input <- scan(what = "integer")
#Anything here, including whitespace
Now input includes the "#Anything here" as part of the input, and trying to as.integer() it gives me an NA. (If nmax=1, it automatically reads that as the input and doesn't allow any more.)
How is this preventable? What am I doing wrong?
It only works when I run the input line by itself, give the input, and then run the rest separately. What I want is for the code to run until the input function is called, show an input prompt, and THEN continue with the rest of the code. I am using RStudio.
In RStudio, use Source to read the input the way you want. Using Ctrl-Enter or Run simulates pasting it into the console (which means text following the scan() is assumed to be input).
So this works with Ctrl-Enter:
input <- scan(what = integer())
3
#Anything here, including whitespace
and the input ends up containing 3. (what should be an example of the type you want, not the name of it. So use integer() if you want an integer result.)
This works with Source:
input <- scan(what = integer())
#Anything here, including whitespace
and will prompt you to enter the number.
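The point about what can be checked without typing anything at the console. This is a small sketch that feeds scan() a fixed string through its text argument; the input "3 4" and the variable names are made up for illustration.

```r
# what = "integer" is a character string, so scan() returns character data
as_text <- scan(text = "3 4", what = "integer", quiet = TRUE)

# what = integer() is an example of the wanted type, so scan() returns integers
as_int <- scan(text = "3 4", what = integer(), quiet = TRUE)

class(as_text)  # "character"
class(as_int)   # "integer"
```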

How to store a special character in a string in R

I am trying to store the special escape character \ in R as part of a string.
x = "abcd\efg"
# Error: '\e' is an unrecognized escape in character string starting ""abcd\e"
x = "abcd\\efg"
# Works
The problem is that x is actually a password that I am passing as part of an API web call so I need to find a way for the string to store a literal single slash in order for this to work.
Example using a raw string literal:
> x = r"(abcd\efg)"
> y = "https://url?password="
> paste(x, y, sep = "")
[1] "abcd\\efg123"
> # What I need is for it to return
[1] "abcd\efg123"
> z = paste(y, x, sep = "")
> z
[1] "https://url?password=abcd\\efg"
> cat(z)
https://url?password=abcd\efg
When I pass z as part of the API command I get "bad credentials" message because it's sending \\ as part of the string. cat returns it to the console correctly, but it appears to not be sending it the same way it's printing it back.
Like everyone else in the comments said, the string appears to be handled correctly by R. I would assume the trouble arises somewhere further down the line. Either the API or the password-matching software itself might be doing something with it.
As one example, the backslash will not work in most browsers if written verbatim; you would need to use %5C instead.
So in your case, try replacing the backslash with something else, either %5C or, as @Roland mentioned in the comments, some extra backslash symbols, like \\\\ or \\\ or \\. Then see if any of them work.
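To confirm that R stores only one backslash, and to build the %5C form without editing the string by hand, something like the following could be used. It is a sketch only: the URL is the placeholder from the question, and whether the API expects a percent-encoded password is an assumption.

```r
x <- "abcd\\efg"              # escaped form of abcd\efg

nchar(x)                      # 8: only one backslash is stored
identical(x, r"(abcd\efg)")   # TRUE (raw strings need R >= 4.0.0)
cat(x, "\n")                  # shows the actual characters

# Percent-encode the value before putting it in a query string
encoded <- utils::URLencode(x, reserved = TRUE)
url <- paste0("https://url?password=", encoded)
```

print() shows the escaped form of a string, while cat() and nchar() reflect what is actually stored.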
The answers and comments contain the solution, but for completeness I'll post an answer that matches my specific situation.
My package expects the user to supply a password. If there is a \ in the password the command will fail as stated above. The solution seems to be to take care of it on the input rather than trying to alter the value once the user submits the password.
library(RobinHood)
# If the true password is: abc\def
# The following inputs will all work
rh <- RobinHood("username", pwd = r"(abc\def)")
rh <- RobinHood("username", pwd = "abc\\def")
rh <- RobinHood("username", pwd = "abc%5Cdef")

R: deparse(substitute(+))

If I use
d <- function(x){deparse(substitute(x))}
it works fine for letters and numbers. d(a1) gives "a1", for example. But using special characters results in an error. I want to use d(+) and get "+" as the result.
From comments:
I want "+" == d(+) to give a TRUE. In other words, I do not want to use d(`+`). Is this possible? The function is part of a code that will await input from non-R-users and that is why I want to avoid using `` for special characters (I do not want explain to every user what a special character is).
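One note on why this fails: d(+) is rejected by the parser before d is ever called, so no definition of d can make it work. If the symbols come from non-R users, a possible workaround is to read them as text, for example with readline(), so they never have to be valid R syntax. This is an assumption about the use case, not something from the thread, and parses is a hypothetical helper.

```r
# d(+) cannot be parsed, so the error happens before d() runs
parses <- function(code) {
  tryCatch({ parse(text = code); TRUE }, error = function(e) FALSE)
}
parses("d(a1)")  # TRUE
parses("d(+)")   # FALSE

# Hypothetical alternative: take the symbol as text
# op <- readline("Operator: ")   # user types +
op <- "+"                        # stand-in for the line above
op == "+"                        # TRUE
match.fun(op)(1, 2)              # 3
```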

Python - code works, but I don't know why

Basically I want someone to give me a simple rundown of how this bit of python code works. Much appreciated
# Variables
kw1 = ['keyword1', 'keyword2']
problem = input("Detect keywords from list\n")

# Main
if set(kw1).intersection(problem.split()):
    print(" Kw found. ")
else:
    print(" Keywords not found. ")
A lot of things there.
First, when you call input you're asking for the user to give you an input string.
When you use split() on it you transform it into a list of strings, by separating the input string based on the empty spaces, so that "bla bli blo".split() gives you ["bla","bli","blo"].
Then, when you call set(my_list), it will transform my_list into a set, which is a mathematical construct without any duplicates and which responds to operators like union, intersection and so on.
Finally, when you intersect the set of keywords with the list of words from the user input, if there are no matches (so none of the keywords appeared as a whole word in the user input), it gives you an empty set, which the if treats as false. So if set(["bla","bli","blo"]).intersection(["blu"]) will not activate, but if set(["bla","bli","blo"]).intersection(["blu","blo"]) will, as it is not an empty set.
Note that if you want to recognize keywords inside words, this method will NOT work. For instance, if you're looking for keywords kw1=['car','truck','bike'] and the user inputs cars trucks bikes, none of the keywords will be recognized, because the split() will split along empty spaces, giving you ['cars','trucks','bikes'] and 'cars'!='car'...

readline is considering every record in the spreadsheet as a new line [R]

I am trying to create a function that will calculate the frequency count of keywords using TM package. The function works fine if the text pasted from readline is on free form text without a new line. The problem is, when I paste a bunch of text copied from a spreadsheet, readline considers it as a new line.
keyword <- function() {
  x <- readline(as.character('Input text here: '))
  x <- Corpus(VectorSource(x))
  ...
  tdm <- TermDocumentMatrix(x)
  ...
  tdm
}
Here's the full code: https://github.com/CSCDataAnalytics/PM-Analysis/blob/master/Keyword.R
How can I prevent this from happening or at least consider a bunch of text of every row from the spreadsheet as one vector only?
If I'm understanding you correctly, the problem is when the user pastes the text from another application: the newline is causing R to stop accepting the subsequent lines.
One technique (fragile as it may be) is to look for a specific line, such as an empty line "" or a period ".". It's a little fragile because now you need (1) assurance that the data will "never" include that as a whole line, and (2) it is easily appended by the user.
Try:
endofinput <- ""
totalstr <- ""
while (endofinput != (x <- readline('prompt (empty string when done): ')))
  totalstr <- paste(totalstr, x)
In this case, the empty string is the catch, and when the while loop is done, totalstr contains all input separated by a space (this can be changed in the paste function).
NB: one problem with this technique is that it is "growing" the vector totalstr, which will eventually cause performance penalties (depending on the size of the input data): on every loop iteration, more memory is allocated and the entire string is copied plus the new line of text. There are more verbose ways to side-step this problem (e.g., pre-allocate a vector larger than your anticipated input data), but if you aren't anticipating 1000s of lines then you may be able to accept this naive programming for simplicity.
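A variant that keeps each line as a separate element and joins them once at the end could look like the sketch below. collect_input is a hypothetical helper; its read_line argument defaults to readline() and exists only so the function can be tried without a console.

```r
# Hypothetical helper: gather lines until an empty line, then join them once.
collect_input <- function(
    read_line = function() readline("prompt (empty line when done): ")) {
  lines <- character(0)
  repeat {
    x <- read_line()
    if (x == "") break
    lines[length(lines) + 1] <- x   # append the line; no string re-copying
  }
  paste(lines, collapse = " ")
}

# Non-interactive usage with made-up rows standing in for pasted text
fake_reader <- local({
  rows <- c("row one", "row two", "")
  i <- 0
  function() {
    i <<- i + 1
    rows[i]
  }
})
result <- collect_input(fake_reader)
```

This still relies on an empty line as the terminator, so the same fragility applies.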
Another option would be to have the user save the data to a text file and use file.choose() and readLines() to get your data.
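A sketch of the file-based route, assuming the spreadsheet rows have been saved one per line in a plain text file. A temporary file with made-up rows stands in for the user's choice so the snippet runs on its own.

```r
# Stand-in for: path <- file.choose()
path <- tempfile(fileext = ".txt")
writeLines(c("first row", "second row"), path)

# readLines() returns one element per line; collapse them into one string
rows <- readLines(path)
text <- paste(rows, collapse = " ")
```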
Try collapsing the data into a single string after using readline
x <- paste(readline(as.character('Input text here: ')), collapse=' ')
