Splitting a character into separate words in R [closed] - r

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 5 years ago.
I am working on a project in R (on the TED Talks data set). I have a data frame with a column called "tags" that contains character strings like
"gaming,gender,sex,feminism,education,culture".
The problem is that each row is read as a single string.
I want the output to be a vector of separate words, e.g.:
"gaming","gender","sex","feminism","education","culture"
so I can do further analysis on tags.

You can simply do the following.
Say your entry is in object a and you want to assign the final result to object b:
a <- "gaming,gender,sex,feminism,education,culture"
b <- unlist(strsplit(a, "[,]"))
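Since the question is about a whole "tags" column rather than a single string, the same idea can be applied to every row at once. This is only a sketch: the data frame ted and its contents are made up for illustration, and fixed = TRUE is used so the comma is treated as a literal separator rather than a regular expression.

```r
# Hypothetical stand-in for the TED Talks data frame
ted <- data.frame(
  title = c("Talk A", "Talk B"),
  tags  = c("gaming,gender,education", "culture,education"),
  stringsAsFactors = FALSE
)

# strsplit() is vectorised: it returns a list with one character vector per row
tag_list <- strsplit(ted$tags, ",", fixed = TRUE)

# Flatten the list and count how often each tag occurs across all talks
tag_counts <- table(unlist(tag_list))
```

Keeping tag_list as a list preserves which tags belong to which talk; unlist() is only needed when the tags should be pooled.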

Related

Use of Ifelse and Or in R [closed]

Closed 4 years ago.
I have a dataset named SPT with one column that contains 4 unique values: A, B, C, D. I need to create another column that contains 1 if the first column is A or B, and 0 otherwise. How can I do this using the ifelse command?
Using ifelse:
SPT$new <- ifelse(SPT$col %in% c('A','B'),1,0)
Replace new with the name of your new variable, and col with the name of the column that stores the 4 letters.
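A small self-contained run of the line above, assuming a toy SPT whose column is literally named col (both are placeholders). It also shows the explicit "or" form from the question title and a variant without ifelse, since %in% already returns a logical vector.

```r
# Toy data; the column name `col` is a placeholder for the real one
SPT <- data.frame(col = c("A", "B", "C", "D", "A"), stringsAsFactors = FALSE)

# 1 when the value is A or B, 0 otherwise
SPT$new <- ifelse(SPT$col %in% c("A", "B"), 1, 0)

# Same test written with an explicit "or"
SPT$new_or <- ifelse(SPT$col == "A" | SPT$col == "B", 1, 0)

# Without ifelse: convert the logical result straight to 0/1 integers
SPT$new_int <- as.integer(SPT$col %in% c("A", "B"))
```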

Find duplicate registers in R [closed]

Closed 5 years ago.
I have an Excel file with a list of emails and the channels that collected them. How can I find out how many emails per channel are duplicated using R, and automate it (so that every time I import a different file I just have to run it and get the results)?
Thank you!!
Assuming the "df" dataframe has the relevant variables under the names "channel" and "email", then:
To get the number of unique channel-email pairs:
dim(unique(df[c("channel", "email")]))[1]
To get the sum of all channel-email observations:
sum(table(df$channel, df$email))
To get the number of duplicates, simply subtract the former from the latter:
sum(table(df$channel, df$email)) - dim(unique(df[c("channel", "email")]))[1]
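The subtraction above gives one overall total. If the count is wanted per channel, as the question asks, one possible sketch uses duplicated() and tapply(). The data frame below is invented for illustration, and reading the Excel file into df is assumed to have happened already.

```r
# Made-up data standing in for the imported Excel sheet
df <- data.frame(
  channel = c("web", "web", "web", "app", "app", "store"),
  email   = c("a@x.com", "a@x.com", "b@x.com", "c@x.com", "c@x.com", "d@x.com"),
  stringsAsFactors = FALSE
)

# TRUE for every row that repeats an earlier channel-email pair
is_dup <- duplicated(df[c("channel", "email")])

# Number of duplicated rows within each channel
dups_per_channel <- tapply(is_dup, df$channel, sum)

# Overall total, which matches the subtraction shown above
total_dups <- sum(is_dup)
```

Wrapping these lines in a function that takes df as its argument would make it easy to rerun on each newly imported file.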

Dataframe convert factors to numerical error [closed]

Closed 5 years ago.
I am trying to convert factors from a data-frame to numeric using the commands
data[] <- lapply (data, function(x) as.numeric(as.character(x))
But R keeps prompting me for more input. What am I doing wrong?
The data frame is named data and has 50 rows and 2 columns. Will this command convert every variable to numeric, or should I do something else?
screenshot after using 'dput' at http://imgur.com/Sde9QSk.png
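Shouldn't you add ) at the end of your code? The call to lapply( is never closed, so R treats the expression as incomplete and waits for more input.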
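With the closing parenthesis added, the command does convert every column. A minimal sketch on a made-up two-column data frame of factors (the real data is only available as a screenshot, so the values here are assumptions):

```r
# Invented example: two factor columns whose labels look like numbers
data <- data.frame(
  a = factor(c("10", "20", "30")),
  b = factor(c("1.5", "2.5", "3.5"))
)

# Go through as.character() first so the labels are converted,
# not the underlying integer codes of the factor
data[] <- lapply(data, function(x) as.numeric(as.character(x)))
```

Assigning to data[] rather than data keeps the object a data frame instead of replacing it with a plain list.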

Find strings that start and end with certain characters [closed]

Closed 5 years ago.
I'm working on a text-mining project with data from twitter. In my data frame, many words are converted into Unicode characters, e.g.
<U+0E2B><U+0E25><U+0E07><U+0E1E>
I want to collect all converted words like the one above and put them into one large string so I can deal with them separately.
Is there any way I can find all the strings that start with <U+ and end with > using R?
Your request is a bit imprecise, so I'm taking the liberty to make a few assumptions on how you want the output.
text <- "Words <Q+0E2B><U+0E2B2>, 1 < 2, <p>
<U+0E2B><U+0E25><U+0E07><U+0E1E> </p> some more words"
regmatches(text, gregexpr("<U\\+[0-9A-Z]{4}>", text))
# "<U+0E2B>" "<U+0E25>" "<U+0E07>" "<U+0E1E>"
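regmatches() returns a list with one element per input string, so getting the "1 large string" the question asks for takes two more steps: flatten the list, then paste the pieces together. The sketch below reuses the example text; narrowing the character class to [0-9A-F] is an assumption that the codes are always four uppercase hex digits.

```r
text <- "Words <Q+0E2B><U+0E2B2>, 1 < 2, <p>\n<U+0E2B><U+0E25><U+0E07><U+0E1E> </p> some more words"

# Flatten the list returned by regmatches() into a plain character vector
matches <- unlist(regmatches(text, gregexpr("<U\\+[0-9A-F]{4}>", text)))

# Join all matches into a single string
combined <- paste(matches, collapse = "")
```

Because unlist() pools the matches from every element, the same two lines also work when text is a vector of tweets.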

apply function doesn't work [closed]

Closed 8 years ago.
I have a data frame that has some empty entries. I set the
options(stringsAsFactors = FALSE)
so that I can change the empty cells. I then wrote the following code:
apply(my_data[,6:65],2, function(x) x[which(x=='')]<-0)
hoping that it would replace all the empty cells with zeros, but it isn't working!
Note that my_data has 65 columns and columns 1:5 contain strings.
Thanks in advance
No need to use apply; just use [<- with logical indexing:
my_data[my_data==""] <- 0
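As for why the original attempt does nothing: the anonymous function only changes its own local copy of x and returns the assigned value (0), and apply() builds a new object rather than modifying my_data. If the replacement should touch only some columns (6:65 in the question), one possible sketch is below. The small data frame and the column range 2:3 are stand-ins, and the final conversion to numeric assumes those columns hold nothing but numbers and blanks.

```r
# Small stand-in: column 1 holds strings, columns 2:3 hold numbers with blanks
my_data <- data.frame(
  id = c("a", "b", "c"),
  x  = c("1", "", "3"),
  y  = c("", "5", ""),
  stringsAsFactors = FALSE
)

cols <- 2:3                 # would be 6:65 for the data in the question
sub <- my_data[cols]
sub[sub == ""] <- 0         # logical-matrix indexing on the selected columns only
my_data[cols] <- sub

# The columns are still character ("0", not 0); convert them if numbers are needed
my_data[cols] <- lapply(my_data[cols], as.numeric)
```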
