I have a function similar to this:
testfun = function(jID,kID,d){
g=paste0(jID,kID)
date = d
bb=data.frame(g,date)
return(bb)
}
Data frame:
x=data.frame(jID = c("a","b"),kID=c("c","d"),date="20170206",stringsAsFactors = FALSE)
I want to pass each row as input to the function. The solutions provided in "Passing multiple arguments to a function taken from dataframe" are great, but in that case the number of columns was known. How would a solution like this:
vtestfun <- (Vectorize(testfun, SIMPLIFY=FALSE))
vtestfun(x[,1],x[,2],x[,3])
be applied if the number of columns in the dataframe is not known or keeps changing?
If you can match the argument names to the column names like so:
testfun <- function(jID, kID, date){ # 'date', not 'd'
g <- paste0(jID, kID)
bb <- data.frame(g, date)
return(bb)
}
You could do:
purrr::pmap(x, testfun)
Returning:
[[1]]
g date
1 ac 20170206
[[2]]
g date
1 bd 20170206
# Data used:
x <- structure(list(jID = c("a", "b"), kID = c("c", "d"), date = c("20170206", "20170206")), class = "data.frame", row.names = c(NA, -2L))
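If adding purrr is not an option, a base R sketch of the same idea is possible with Map() and do.call(). It relies on the same assumption as above, namely that the function's argument names match the column names; the name res is only illustrative.

```r
# Same function as in the answer: argument names match the column names.
testfun <- function(jID, kID, date) {
  g <- paste0(jID, kID)
  data.frame(g, date)
}

x <- data.frame(jID = c("a", "b"), kID = c("c", "d"),
                date = "20170206", stringsAsFactors = FALSE)

# A data frame is a list of columns, so c(f = testfun, x) builds the
# argument list for Map(); each column is matched to an argument by name,
# however many columns there are.
res <- do.call(Map, c(f = testfun, x))

# One small data frame per row of x
length(res)
```

Because the columns are matched by name, this breaks if x has a column that testfun does not accept, or a column literally named f.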
I have 1 row of data and 50 columns in the row from a csv which I've put into a dataframe. The data is arranged across the spreadsheet like this:
"FSEG-DFGS-THDG", "SGDG-SGRE-JJDF", "DIDC-DFGS-LEMS"...
How would I select only the middle part of each element (e.g., "DFGS" in the first one, "SGRE" in the second, etc.), count their occurrences and display the results?
I have tried using the strsplit function but I couldn't get it to work for the entire row of data. I'm thinking a loop of some kind might be what I need.
You can do unlist(strsplit(x, '-'))[seq(2, length(x)*3, 3)] (assuming your data is consistently of the form A-B-C).
# E.g.
fun <- function(x) unlist(strsplit(x, '-'))[seq(2, length(x)*3, 3)]
fun(c("FSEG-DFGS-THDG", "SGDG-SGRE-JJDF", "DIDC-DFGS-LEMS"))
# [1] "DFGS" "SGRE" "DFGS"
Edit
# Data frame
df <- structure(list(a = "FSEG-DFGS-THDG", b = "SGDG-SGRE-JJDF", c = "DIDC-DFGS-LEMS"),
class = "data.frame", row.names = c(NA, -1L))
fun(t(df[1,]))
# [1] "DFGS" "SGRE" "DFGS"
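The question also asks for counts. A minimal sketch, assuming every element has the form A-B-C, is to pull out the second piece of each split and pass the result to table(); the names middle and counts are only illustrative.

```r
x <- c("FSEG-DFGS-THDG", "SGDG-SGRE-JJDF", "DIDC-DFGS-LEMS")

# strsplit() returns one character vector per element; take the second piece.
# fixed = TRUE treats "-" as a literal string rather than a regular expression.
middle <- vapply(strsplit(x, "-", fixed = TRUE),
                 function(parts) parts[2], character(1))

# table() counts how often each middle part occurs
counts <- table(middle)
counts
```

Here vapply() is used instead of indexing into one long unlisted vector, so an element with fewer than two parts yields NA rather than shifting every later position.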
First we create a function strng() and then we apply() it on every column of df. strsplit() splits a string by "-" and strng() returns the second part.
df = data.frame(a = "ab-bc-ca", b = "gn-bc-ca", c = "kj-ll-mn")
strng = function(x) {
strsplit(x,"-")[[1]][2]
}
# table() outputs frequency of elements in the input
table(apply(df, MARGIN = 2, FUN = strng))
# output:
# bc ll
#  2  1
I have read data from a sav (spss) file. Using the following code:
library(foreign)
test <- read.spss(path_to_file, to.data.frame = TRUE)
the resultant data frame is in the following format:
structure(list(srl = c(4096, 15024, 4094), mem_id = c(278812,
2341700, 251337), q1 = c(2, 2, 1)), row.names = c(NA, 3L), class = "data.frame")
While the object test is a data frame, each of the columns is rendered as a list. I tried the following to convert:
dd <- data.frame(srl = unlist(test$srl), mem_id = unlist(test$mem_id), q1 = unlist(test$q1))
Still, the resulting data frame has the same structure as shown in the dput output.
We cannot reproduce this and run it to check that it works, but why don't you try:
lst <- lst[-c(4,5)]
and then
new_lst <- as.data.frame(lst)
where lst is the name of your list. I suggest removing the 4th and 5th elements because you probably won't need them in a data frame.
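To make the suggestion concrete, here is a rough sketch on made-up data. The first three elements come from the dput in the question; the elements note and codes are purely hypothetical stand-ins for whatever the 4th and 5th elements of the real list are.

```r
# Hypothetical list: three data columns plus two extra elements
lst <- list(srl    = c(4096, 15024, 4094),
            mem_id = c(278812, 2341700, 251337),
            q1     = c(2, 2, 1),
            note   = "imported",   # hypothetical 4th element
            codes  = c(1, 2))      # hypothetical 5th element

# Drop the 4th and 5th elements, then convert what is left
lst <- lst[-c(4, 5)]
new_lst <- as.data.frame(lst)

# Each column should now be a plain vector
sapply(new_lst, class)
```

With the extra elements left in, as.data.frame() would fail in this sketch because the element lengths differ.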
I'm looking to manipulate a set of strings in R.
The data I have:
Data Field
Mark Twain 5
I want it to instead be:
Data Field
Twain Mark 5
My idea was to first split the string into two columns and then concatenate. But I'm wondering if there is an easier way.
You can try this approach:
> df <- data.frame(Data=c("Mark Twain"), Field=5)
> df$Data <- sapply(strsplit(as.character(df$Data), " "), function(x) paste(rev(x), collapse=" "))
> df
Data Field
1 Twain Mark 5
This will work even if your data frame has more than one row.
We can use sub to do this:
df1$Data <- sub("(\\S+)\\s+(\\S+)", "\\2 \\1", df1$Data)
df1
# Data Field
#1 Twain Mark 5
data
df1 <- structure(list(Data = "Mark Twain", Field = 5L),
.Names = c("Data", "Field"), class = "data.frame",
row.names = c(NA, -1L))
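Since sub() is vectorised, the same call handles several rows at once. The sketch below uses made-up rows and a slightly stricter pattern, anchored with ^ and $, which is a variation on the answer rather than the answer itself: it leaves any value that is not exactly two words untouched.

```r
# Made-up example rows; the third has three words
df1 <- data.frame(Data  = c("Mark Twain", "Jane Austen", "Samuel L Clemens"),
                  Field = c(5L, 7L, 9L),
                  stringsAsFactors = FALSE)

# Swap the two words; ^ and $ restrict the match to exactly two words,
# so the three-word value is returned unchanged
df1$Data <- sub("^(\\S+)\\s+(\\S+)$", "\\2 \\1", df1$Data)
df1
```

Without the anchors, the unanchored pattern from the answer would swap only the first two words of a longer string.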
I have the following data frame that I want to order by the fifth column ("Distance").
When I try
df.order <- df[order(df[, 5]), ]
I always get the following error message.
Error in order(df[, 5]) : unimplemented type 'list' in 'orderVector1'
I don't know why R considers my data frame a list. Running is.data.frame(df) returns TRUE. I have to admit that is.list(df) also returns TRUE. Is it possible to force my data frame to be only a data frame and not a list?
Thanks for your help.
structure(list(ID = list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
Latitude = list(50.7368, 50.7368, 50.7368, 50.7369, 50.7369, 50.737, 50.737, 50.7371, 50.7371, 50.7371),
Longitude = list(6.0873, 6.0873, 6.0873, 6.0872, 6.0872, 6.0872, 6.0872, 6.0872, 6.0872, 6.0872),
Elevation = list(269.26, 268.99, 268.73, 268.69, 268.14, 267.87, 267.61, 267.31, 267.21, 267.02),
Distance = list(119.4396, 119.4396, 119.4396, 121.199, 121.199, 117.5658, 117.5658, 114.9003, 114.9003, 114.9003),
RxPower = list(-52.6695443922406, -52.269130891243, -52.9735258244422, -52.2116571930007, -51.7784534281727, -52.7703448813654, -51.6558862949081, -52.2892907635308, -51.8322993596551, -52.4971436682333)),
.Names = c("ID", "Latitude", "Longitude", "Elevation", "Distance", "RxPower"),
row.names = c(NA, 10L), class = "data.frame")
Your data frame contains lists, not vectors. You can convert this data frame to the "classical" format using as.data.frame and unlist:
df2 <- as.data.frame(lapply(df, unlist))
Now, the new data frame could be sorted in the intended way:
df2[order(df2[, 5]), ]
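As a small self-contained sketch of this fix, the following builds a two-column data frame of lists (made-up values, with Distance as the second column), checks that flattening is safe, and then sorts. The names list_cols and sorted are only illustrative.

```r
# A data frame whose columns are lists, as in the question (made-up values)
df <- structure(list(ID = list(1, 2, 3),
                     Distance = list(3.5, 1.2, 2.8)),
                row.names = c(NA, 3L), class = "data.frame")

# Which columns are lists?
list_cols <- sapply(df, is.list)

# unlist() only keeps one value per row if every cell has length 1
stopifnot(all(sapply(df[list_cols], function(col) all(lengths(col) == 1))))

# Flatten every column to a plain vector and rebuild the data frame
df2 <- as.data.frame(lapply(df, unlist))

# order() now works on the atomic column
sorted <- df2[order(df2$Distance), ]
sorted
```

The length check matters because a cell holding more than one value would make unlist() return a longer vector and misalign the rows.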
I've illustrated the problem with a small example:
df <- structure(list(ID = c(1, 2, 3, 4),
Latitude = c(50.7368, 50.7368, 50.7368, 50.7369),
Longitude = c(6.0873, 6.0873, 6.0873, 6.0872),
Elevation = c(269.26, 268.99, 268.73, 268.69),
Distance = c(119.4396, 119.4396, 119.4396, 121.199),
RxPower = c(-52.6695443922406, -52.269130891243, -52.9735258244422,
-52.2116571930007)),
.Names = c("ID", "Latitude", "Longitude", "Elevation", "Distance", "RxPower"),
row.names = c(NA, 4L), class = "data.frame")
Notice that list occurs only once here, and all the values are wrapped in c(.) rather than list(.). In your data every column is wrapped in list(.), which is why sapply(df, class) on your data reports class list for all columns.
Now,
> sapply(df, class)
# ID Latitude Longitude Elevation Distance RxPower
# "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
Now order works:
> df[order(df[,4]), ]
# ID Latitude Longitude Elevation Distance RxPower
# 4 4 50.7369 6.0872 268.69 121.1990 -52.21166
# 3 3 50.7368 6.0873 268.73 119.4396 -52.97353
# 2 2 50.7368 6.0873 268.99 119.4396 -52.26913
# 1 1 50.7368 6.0873 269.26 119.4396 -52.66954
This turns your data.frame of lists into a matrix:
mat <- sapply(df,unlist)
Now you can order it.
mat[order(mat[,5]),]
If all columns are of one type, e.g., numeric, a matrix often is preferable, because operations on matrices are faster than on data.frames. However, you can transform to a data.frame using as.data.frame(mat).
Btw, a data.frame is a special kind of list and thus is.list returns TRUE for every data.frame.
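Two of the points above can be checked quickly: every data.frame passes is.list(), and the matrix route only keeps numbers numeric when all columns share a type. The sketch below uses a made-up two-column data frame with mixed types.

```r
# Made-up data frame with one integer and one character column
df <- data.frame(id = 1:2, label = c("a", "b"), stringsAsFactors = FALSE)

# A data.frame is a list of columns, so both checks are TRUE
is.data.frame(df)
is.list(df)

# A matrix holds a single type: with mixed columns everything
# is coerced to character
mat <- sapply(df, unlist)
typeof(mat)
```

So the matrix approach is a reasonable choice for all-numeric data like the question's, but not for data frames with mixed column types.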
Ran across this same problem. This worked for me (maybe it might help someone else who is having the same problem and stumbled on this page).
I had a structure like:
lst <- list(row1 = list(col1="A",col2=1,col3="!"), row2 = list(col1="B",col2=2,col3="#"))
> lst
$row1
$row1$col1
[1] "A"
$row1$col2
[1] 1
$row1$col3
[1] "!"
$row2
$row2$col1
[1] "B"
$row2$col2
[1] 2
$row2$col3
[1] "#"
I was doing:
df <- as.data.frame(do.call(rbind, lst))
And I kept getting the same error you were getting when I tried to run df[order(df$col1),]. It turns out I had to do:
df <- do.call(rbind.data.frame, lst)
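A short sketch of the difference, using the same lst as above: rbind() on lists produces a matrix of list cells, so the resulting data frame has list columns, whereas rbind.data.frame() builds ordinary atomic columns. The names bad, good and sorted are only illustrative.

```r
lst <- list(row1 = list(col1 = "A", col2 = 1, col3 = "!"),
            row2 = list(col1 = "B", col2 = 2, col3 = "#"))

# rbind() on lists gives a list-mode matrix; its columns stay lists
bad <- as.data.frame(do.call(rbind, lst))
sapply(bad, is.list)

# rbind.data.frame() treats each inner list as one row of atomic values
good <- do.call(rbind.data.frame, lst)
sapply(good, is.list)

# Ordering now works, here by col2 in decreasing order
sorted <- good[order(good$col2, decreasing = TRUE), ]
sorted
```

Checking sapply(df, is.list) right after building a data frame is a cheap way to catch this before order() fails.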