Practice Exercise on tidyr Functions [closed] - r

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
Using the first.df data frame, separate the DoB column into 3 new columns (date, month, year) using the separate() function. I tried the last line below, but it is not giving the desired result.
library(tidyr)  # provides separate()
fname <- c("Martina", "Monica", "Stan", "Oscar")
lname <- c("Welch", "Sobers", "Griffith", "Williams")
DoB <- c("1-Oct-1980", "2-Nov-1982", "13-Dec-1979", "27-Jan-1988")
first.df <- data.frame(fname, lname, DoB)
print(first.df)
separate(first.df, DoB, c('date', 'month', 'year'), sep = '-')

Moved my comment to an actual answer.
To retain the original DoB column you need to add the remove = FALSE parameter, and to discard one of the separated columns simply put NA in place of its column name. The command is then:
separate(first.df, DoB, c(NA, 'month', 'year'), sep = '-', remove = FALSE)
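For comparison, here is a minimal sketch that runs both variants side by side, assuming tidyr is installed. The names split.df and kept.df are introduced here for illustration only. Note that separate() returns a new data frame, so the result has to be assigned to be kept.

```r
library(tidyr)

first.df <- data.frame(
  fname = c("Martina", "Monica", "Stan", "Oscar"),
  lname = c("Welch", "Sobers", "Griffith", "Williams"),
  DoB   = c("1-Oct-1980", "2-Nov-1982", "13-Dec-1979", "27-Jan-1988")
)

# All three pieces; DoB itself is dropped because remove defaults to TRUE
split.df <- separate(first.df, DoB, into = c("date", "month", "year"), sep = "-")

# Keep DoB and discard the day by passing NA in place of its name
kept.df <- separate(first.df, DoB, into = c(NA, "month", "year"),
                    sep = "-", remove = FALSE)

names(split.df)
names(kept.df)
```

The new columns are character vectors by default; separate() also accepts convert = TRUE if numeric day and year columns are wanted.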

Related

How do you group data with the same name in a vector? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 10 months ago.
I have a data frame with three columns: region, year, grdp.
How do I group the rows that have the same value in the 'region' column?
Here's the code to create a sample dataset:
Here's the desired result:
I want to store the rows that share the same value in the 'region' column.
For example, the 'region' column contains "서울특별시" three times. I want to group those three "서울특별시" rows, with all three columns, and assign them to a variable.
I don't completely understand the question, but I think one of these two options might be what you're looking for:
library(dplyr)

# Sample data: 100 rows of random regions, years, and GRDP values
df <- data.frame(region = sample(c('x', 'y', 'z'), 100, replace = TRUE),
                 year = sample(c(2017, 2018, 2019), 100, replace = TRUE),
                 GRDP = sample(200000000:400000000, 100))
# Sorted vector of the distinct region names
regions <- sort(unique(df$region))

# OPTION 1: one data frame per region, assigned to the variables a, b, c, ...
for (i in seq_along(regions)) {
  assign(letters[i], df %>% filter(region == regions[i]))
}
a
b
c

# OPTION 2: add a column holding each row's group letter
ltrs <- letters[seq_along(regions)]
df['ex)'] <- sapply(df$region, FUN = function(x) ltrs[which(regions == x)])
head(df)
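As a possible alternative to creating one variable per region, base R's split() returns a named list with one data frame per region. This is a sketch on made-up values, not part of the original answer; the name by_region is chosen here for illustration.

```r
# Small illustrative data set (values are invented for this sketch)
df <- data.frame(
  region = c("x", "y", "x", "z", "y", "x"),
  year   = c(2017, 2017, 2018, 2018, 2019, 2019),
  GRDP   = c(210, 250, 220, 300, 260, 230)
)

# One list element per distinct region, named after the region
by_region <- split(df, df$region)

names(by_region)   # the region names
by_region[["x"]]   # all rows whose region is "x"
```

Keeping the groups in a list avoids assign() and makes it easy to loop over them with lapply().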

If/else statement using column names in R [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I want to use ifelse on an existing data frame "Tokens", reading from its column "Vowel" to create a new column.
If the "Vowel" column contains "o" or "u", I want the new column, called "High.Ambiguity", to hold the value "1"; otherwise "0".
What would the syntax for this look like?
I believe this should do the trick for you. mutate creates a new column, in this case called High.Ambiguity, which takes the value 1 when Vowel (a column in Tokens) is either 'o' or 'u', and 0 otherwise.
library(dplyr)
Tokens <- Tokens %>%
  mutate(High.Ambiguity = ifelse(Vowel %in% c("o", "u"), 1, 0))
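The same logic can also be sketched in base R, without dplyr. The Tokens data frame below is a made-up stand-in, since the original data was not posted.

```r
# Hypothetical stand-in for the Tokens data frame
Tokens <- data.frame(Vowel = c("a", "o", "u", "e", "i"))

# 1 where Vowel is "o" or "u", 0 everywhere else
Tokens$High.Ambiguity <- ifelse(Tokens$Vowel %in% c("o", "u"), 1, 0)

Tokens
```

Using %in% rather than == means a missing Vowel yields 0 instead of NA, which may or may not be what is wanted.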

Retrieve a set of string with unique substring [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I have a set of strings in R of the form "X-Y-Z.3000.F.PP0016-C.A-SL-0433.P-N.fC-G.txt". I want to retrieve only the first occurrence of each string, where uniqueness depends on the 4th field (the id). In this set, for example, I have multiple strings starting with "X-Y-Z.3000....."; I want only the first one with id = 3000, and the same for the other ids.
For reproducibility:
X-Y-Z.3000.F.PP0016-C.A-SL-0433.P-N.fC-G.txt
X-Y-Z.3000.F.PP0016-C.A-SL-0433.F-N.fC-G.txt
X-Y-Z.3008.F.PP0016-C.A-SL-0433.P-N.fC-G.txt
X-Y-Z.3008.F.PP0016-C.B-SX-0433.P-N.fC-G.txt
So in the end I would keep only the first and third strings:
X-Y-Z.3000.F.PP0016-C.A-SL-0433.P-N.fC-G.txt
X-Y-Z.3008.F.PP0016-C.A-SL-0433.P-N.fC-G.txt
Extract the "4th field", which is the 2nd field if we split on ".", then drop the strings whose field value has already been seen:
# data
x <- c("X-Y-Z.3000.F.PP0016-C.A-SL-0433.P-N.fC-G.txt",
       "X-Y-Z.3000.F.PP0016-C.A-SL-0433.F-N.fC-G.txt",
       "X-Y-Z.3008.F.PP0016-C.A-SL-0433.P-N.fC-G.txt",
       "X-Y-Z.3008.F.PP0016-C.B-SX-0433.P-N.fC-G.txt")
# split on a literal ".", take the 2nd piece, keep the first string per id
x[ !duplicated(sapply(strsplit(x, ".", fixed = TRUE), "[", 2)) ]
# [1] "X-Y-Z.3000.F.PP0016-C.A-SL-0433.P-N.fC-G.txt"
# [2] "X-Y-Z.3008.F.PP0016-C.A-SL-0433.P-N.fC-G.txt"
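The one-liner can be unpacked into separate steps, which may make it easier to check what each part does. This is only a sketch of the same approach; the intermediate names parts, ids and first_per_id are introduced here for illustration.

```r
x <- c("X-Y-Z.3000.F.PP0016-C.A-SL-0433.P-N.fC-G.txt",
       "X-Y-Z.3000.F.PP0016-C.A-SL-0433.F-N.fC-G.txt",
       "X-Y-Z.3008.F.PP0016-C.A-SL-0433.P-N.fC-G.txt",
       "X-Y-Z.3008.F.PP0016-C.B-SX-0433.P-N.fC-G.txt")

# 1. Split each string on a literal "." (fixed = TRUE disables regex matching)
parts <- strsplit(x, ".", fixed = TRUE)

# 2. Take the second piece of each string, i.e. the id
ids <- vapply(parts, function(p) p[2], character(1))

# 3. duplicated() flags every repeat of an id, so negating it keeps the first
first_per_id <- x[!duplicated(ids)]

ids
first_per_id
```

vapply() is used instead of sapply() so that the result is guaranteed to be a character vector.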

How to extract any given number from a dataframe and assign it a name [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
How do I extract a number in any given location of a dataframe? Let's say I have a 4x4 matrix, how would I take the number value in (2,4) and assign that value a name?
You can use the setNames function like so: setNames(value, name1)
This works for vectors and columns too, for instance: setNames(df[c(col1, col2)], c(name1, name2)) and setNames(c(val1, val2, val3), c(name1, name2, name3))
Edit-
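# Data frame with one row and two columns
df <- data.frame('a', 'b')
# You can access a value by row and column index:
val <- as.character(df[1, 2])  # value at first row, second column
                               # (as.character also handles factor columns)
name <- "my_value"             # hypothetical name, for illustration
# To assign it a name, you can either use:
setNames(val, name)            # returns a named copy
# or
names(val) <- name             # modifies val in place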
Hope this helps!
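For the 4x4 matrix case described in the question, a minimal sketch might look like this. The matrix contents and the name "picked" are made up for illustration.

```r
# 4x4 matrix holding 1..16, filled column by column (R's default)
m <- matrix(1:16, nrow = 4)

# Element in row 2, column 4
val <- m[2, 4]

# Attach a name to the extracted value
names(val) <- "picked"

val
val[["picked"]]  # look the value up again by its name
```

If the goal is simply to keep the number for later use, storing it in an ordinary variable, as val does here, is usually enough; names() is only needed when the value should carry a label.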

Want to add characters to every element of a data frame [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I have a data frame of strings as below and would like to add the string "Market" to each of the elements of the data frame. Is there a function that would allow me to do this easily without having to use a for loop?
V1
1 PUBLIC_DISPATCHSCADA_20141221.zip
2 PUBLIC_DISPATCHSCADA_20141222.zip
3 PUBLIC_DISPATCHSCADA_20141223.zip
4 PUBLIC_DISPATCHSCADA_20141224.zip
5 PUBLIC_DISPATCHSCADA_20141225.zip
6 PUBLIC_DISPATCHSCADA_20141226.zip
We can use paste and specify the delimiter. In this case, I am using _ and pasting "Market" at the beginning of each string.
df1$V1 <- paste("Market", df1$V1, sep="_")
If we need to do this for every column:
df1[] <- lapply(df1, function(x) paste("Market", x, sep="_"))
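A small runnable sketch of both variants, using two of the file names from the question. The second column V2 is invented here purely to show the every-column case.

```r
df1 <- data.frame(
  V1 = c("PUBLIC_DISPATCHSCADA_20141221.zip",
         "PUBLIC_DISPATCHSCADA_20141222.zip"),
  V2 = c("a.zip", "b.zip"),  # hypothetical extra column
  stringsAsFactors = FALSE
)

# Single column: paste() is vectorised, so no loop is needed
one_col <- paste("Market", df1$V1, sep = "_")

# Every column: assigning into df1[] keeps the data frame structure
df1[] <- lapply(df1, function(x) paste("Market", x, sep = "_"))

one_col
df1
```

If no delimiter is wanted, paste0("Market", x) does the same thing without a separator.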
