"How to skip a line between strings in R?"

I'm doing a swirl lesson.
This is the problem:
Edit the string inside writeLines() so that it correctly displays
(with the line breaks in these positions)
This is a really
really really
long string
I tried typing
writeLines("This is a really\n\nreally really\n\nlong string")
but the swirl lesson keeps telling me that it is incorrect. Is there a different way to write the same thing?

Swirl is generally very strict about the answer, as it would be time consuming and difficult to put in ways to check for all the potentially correct answers.
As a matter of fact, the answer is writeLines("This is a really \nreally really \nlong string"). You doubled the newline \n, which prints a blank line between each line of text, so swirl won't accept that as an answer.
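To see why the doubled \n matters, here is a small sketch that compares the two strings. The variable names are only for illustration; the first string is the one from the answer above and the second is the one from the question.

```r
single  <- "This is a really \nreally really \nlong string"
doubled <- "This is a really\n\nreally really\n\nlong string"

writeLines(single)   # three lines of text
writeLines(doubled)  # the same text with an empty line after each break

# Splitting on the newline shows the extra empty lines explicitly.
n_single  <- length(strsplit(single,  "\n", fixed = TRUE)[[1]])
n_doubled <- length(strsplit(doubled, "\n", fixed = TRUE)[[1]])
```

Each \n ends the current line, so two in a row produce an empty line in between.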

Related

Cleaning forum post with multiple quotations in rvest + stringr

I am scraping a very long forum thread, and I want to come up with a database that has columns containing the following info: date / full post text / quoted user / quoted text / clean text
The clean text should be each user's post, without the quotations if they are replying to anyone. If the post is not a reply, I would leave the quoted fields as NA. The following is an invented post, with an invented user, to illustrate what I have managed to do so far:
post<-"Meow1 wrote: »\noday is gonna be the day that they're gonna throw it back to you?\nBy now you should've somehow Realized what you gotta do\n\n\nI don't believe that anybody Feels the way I do, about you now\nMeow1 wrote: »\nI'm sure you've heard it all before But you never really had a doubt\n\n\nBecause maybe, you're gonna be the one that saves me\nMeow1 wrote: »\nAnd after all, you're my wonderwall\n\n\nAnd all the lights that lead us there are blinding"
Then I try to pull out the quoted user (Meow1) and it works:
QuotedUser_1<-ifelse(grepl('wrote:', post), gsub('\\s*wrote.*$', '', post), NA)
QuotedUser_1
[1] "Meow1"
Then I wrote this code to pull out the quoted text and the clean text:
Quotedtext_1<- ifelse(grepl('wrote:', post), gsub('^.*wrote\\s*|\\s*\\n\\n\\n.*$', '', post), NA)
It works when there is only one quoted text, but otherwise it only gives the last quoted bit (in the example, 'And after all, you're my wonderwall').
And same for the clean text, it only returns the last reply:
Clean_text<- sub('^.*\\n\\n\\n\\s*|\\s*wrote.*', '', post)
If anyone has a suggestion to improve the code, so that I can have a vector with all the quotations, and a vector with all the replies, I would be very grateful...
Cheers

Are you sure you cannot scrape the author and text information separately? Without a source it's difficult to know, but I guess they can be obtained with different CSS selectors, which would make it much easier to split the data.
If not, it might be helpful to look into str_locate_all, which allows you to locate all occurrences of e.g. "wrote:" and split the string accordingly.
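As a rough sketch of the splitting idea, the following uses base R (strsplit and gregexpr) rather than stringr's str_locate_all, so it is a swapped-in technique and not the exact call named above. The post is a shortened, invented example in the same shape as the one in the question, and it assumes that every quote starts with a "<user> wrote: »" header line and that three newlines separate the quoted text from the reply.

```r
# Shortened, invented post; \u00bb is the » character used in the headers.
post <- paste0(
  "Meow1 wrote: \u00bb\nfirst quote\n\n\nfirst reply\n",
  "Meow1 wrote: \u00bb\nsecond quote\n\n\nsecond reply"
)

# Split at every header line; the match at the very start leaves an empty
# first element, which is dropped.
chunks <- strsplit(post, "\\S+ wrote: [^\n]*\n")[[1]]
chunks <- chunks[nzchar(chunks)]

# Within each chunk, three newlines separate the quote from the reply.
parts <- strsplit(chunks, "\n\n\n", fixed = TRUE)
quoted_text <- trimws(vapply(parts, `[`, character(1), 1))
clean_text  <- trimws(vapply(parts, `[`, character(1), 2))

# All quoted user names: the non-space run right before " wrote:".
quoted_user <- regmatches(post, gregexpr("\\S+(?= wrote:)", post, perl = TRUE))[[1]]
```

This gives one element per quotation in each vector. A real thread will likely need extra handling, for example for posts without any quote or for user names that contain spaces.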

Removing a part of a string with double quote [R]

I have the following string:
x<-"\"stream;\"\" Well done\"\t\" fans !!\"\";\"\"Boy\""
and I would like to change it to
x= "\"stream;\"\" Well done fans !!\"\";\"\"Boy\""
It would be great if anyone could help me remove \"\t\" from this string.

(from comment) You can use
sub("\"\t\"","",x)
That removes exactly what you're asking to be removed (though there is still an extra space compared to your desired output)
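A minimal sketch of the same idea with fixed = TRUE, which treats the pattern as a literal string rather than a regular expression. gsub is used here in case the sequence occurs more than once; the name cleaned is only for illustration.

```r
x <- "\"stream;\"\" Well done\"\t\" fans !!\"\";\"\"Boy\""

# Remove every literal quote-tab-quote sequence; fixed = TRUE disables regex parsing.
cleaned <- gsub("\"\t\"", "", x, fixed = TRUE)

# cat() shows the raw characters instead of the escaped form that print() uses.
cat(cleaned, "\n")
```

Printing with cat makes it easier to compare the result with the desired output character by character.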

String continuation across multiple lines, no newline characters

I am using the RODBC library to bring data into R. I have a long query that I want to pass a variable to, much like this SO user.
Problem is that R interprets the whitespace/carriage returns in my query as a newline '\n'.
The accepted solution for this question suggests to simply break up the text into chunks and then paste() together - which works, but ideally I'd like to keep the whitespace intact - makes it easier to test/verify the behavior of the query over in the database before pasting into R.
In other languages I'm familiar with there's a simple line continuation character - indeed, several of the comments on the accepted answer are looking for an approach similar to python's \.
I found an aside about a workaround using strwrap deep in the bowels of an R discussion list, so in the interest of making the internet better I will post it here. However, if someone can point me toward a more elegant or straightforward solution, I will happily accept your answer.

I don't know if you will find this helpful or not, but I have eventually gravitated towards keeping my SQL separate from my R scripts. Keeping the query in my R script, except for very very short ones, I find gets unreadable very quickly.
These days, I tend to keep queries that are more than a single line in their own separate .sql file. Then I can keep them nice and formatted and readable in a nice text editor, and read them into R as needed via something like this:
read_sql <- function(path){
  # fail early if the file is missing
  stopifnot(file.exists(path))
  # read the whole file into a single string
  sql <- readChar(path, nchars = file.info(path)$size)
  sql
}
For binding parameters into the queries, I just keep a %s where the parameter will go in the .sql file, and then add in the parameters in R using sprintf.
I've been much happier this way, as I was finding that cluttering up my R scripts with really long paste statements and multi-line character objects was making my code really hard to read.
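A small sketch of that workflow, assuming a hypothetical query file with a single %s placeholder. The file is written to a temporary path here only so the example is self-contained; the table, column, and term value are invented.

```r
read_sql <- function(path) {
  stopifnot(file.exists(path))
  readChar(path, nchars = file.info(path)$size)
}

# Stand-in for a hand-formatted .sql file kept next to the R script.
sql_path <- tempfile(fileext = ".sql")
writeLines(c(
  "SELECT map.ps_studentid",
  "  FROM map",
  " WHERE map.termname = '%s'"
), sql_path)

# Fill the %s placeholder with the parameter value.
query <- sprintf(read_sql(sql_path), "Spring 2012")
cat(query)
```

Two things to keep in mind with this approach: a literal % in the SQL (for example in a LIKE pattern) has to be written as %% so that sprintf leaves it alone, and sprintf does no escaping, so it is only suitable for values you trust.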

R's strwrap will destroy whitespace, including newline characters, per the documentation.
Essentially, you can get the desired behavior by initially letting R introduce line breaks/newline \ns, and then immediately stripping them out.
#make query using PASTE
query_1 <- paste("SELECT map.ps_studentid
,students.first_name || ' ' || students.last_name AS full_name
,map.testritscore
,map.termname
,map.measurementscale
FROM map$comprehensive_with_growth map
JOIN students
ON map.ps_studentid = students.id
WHERE map.termname = '",map_term,"'", sep='')
#remove newline characters introduced above.
#width is an arbitrary big number-
#it just needs to be longer than your string.
query_1 <- strwrap(query_1, width=10000, simplify=TRUE)
#execute the query
map_njask <- sqlQuery(XE, query_1)

query <- gsub(pattern='\\s+',replacement=" ",x=query)

Try using sprintf to get variable substitution, and then replacing all newlines and whitespace.
See my answer to a similar question for details.
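A short sketch of that combination, with an invented query template and term value. It collapses each run of whitespace into a single space so that the SQL keywords stay separated.

```r
# Hypothetical multi-line query template; %s marks where the parameter goes.
template <- "SELECT map.ps_studentid
       ,map.testritscore
  FROM map
 WHERE map.termname = '%s'"

map_term <- "Spring 2012"  # illustrative value

query <- sprintf(template, map_term)

# Collapse each run of whitespace (spaces, tabs, newlines) into one space.
query <- gsub("\\s+", " ", query)
```

Note that this also collapses whitespace inside quoted SQL literals, so a value that contains two consecutive spaces would be altered.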

This regex is not right

I am trying to use regex generators to create an expression, but I can't seem to get it right.
What I need to do is find the following type of string in a string:
community_n
For example, within the string which may be
community community_1 community_new_1 community_1_new
from that, I just want to extract community_1
I have tried /(community_\\d+)/, but that is clearly not right.

Try adding word boundaries, so
/(\\bcommunity_\\d+\\b)/
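The question does not say which language is in use, so the following sketch assumes R purely for illustration. It shows why the word boundaries matter: the underscore counts as a word character, so \b does not match between community_1 and _new.

```r
x <- "community community_1 community_new_1 community_1_new"

# Without boundaries, the prefix of "community_1_new" also matches.
loose <- regmatches(x, gregexpr("community_\\d+", x, perl = TRUE))[[1]]

# With boundaries, only the standalone token matches.
strict <- regmatches(x, gregexpr("\\bcommunity_\\d+\\b", x, perl = TRUE))[[1]]
```

Here loose holds two matches and strict holds only the standalone community_1.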

Try using the regex (community_\d+).
Though I could be incorrect since I don't know which language you are using.
(For some reason I cannot add comments, I can only answer questions).

Decode <e8> <e9> etc. characters in R

How can I decode Base 64 characters like <e9> (é) or <b0> (°) in a table that was saved with write.table without the UTF-8 option?
Apologies if the answer is obvious, but the R documentation pages mention nothing else than enc2utf8() (which does not work here).
Note: if there is no solution, I know I can either gsub() the whole thing, but that would be long and messy, or I can generate the data again, but that would take some real time (crawler data).
