In R, when I run two functions inside lapply, the first function seems to run over the entire list before the second one does. Is it possible to force both functions to run on the first element of the list before moving on to the second element?
I am using print and nchar for illustration only; my real functions are more complex and generate data.frames.
lapply(c("a","bb","cdd"), function(x) {
print(x)
nchar(x)
})
The output is:
[1] "a"
[1] "bb"
[1] "cdd"
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
I would like to have something like this:
[[1]]
[1] "a"
[1] 1
[[2]]
[1] "bb"
[1] 2
[[3]]
[1] "cdd"
[1] 3
Is this possible?
Juan Antonio Roladan Diaz and cash2 both suggested using list, which kind of works:
lapply(c("a","bb","cdd"), function(x) {
list(x, nchar(x))
})
[[1]]
[[1]][[1]]
[1] "a"
[[1]][[2]]
[1] 1
[[2]]
[[2]][[1]]
[1] "bb"
[[2]][[2]]
[1] 2
[[3]]
[[3]][[1]]
[1] "cdd"
[[3]][[2]]
[1] 3
But it is a bit too messy.
Using print gives a better result:
lapply(c("a","bb","cdd"), function(x) {
print(x)
print(nchar(x))
})
[1] "a"
[1] 1
[1] "bb"
[1] 2
[1] "cdd"
[1] 3
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
But is there a way to stop the nchar results from being printed a second time?
invisible(lapply(c("a","bb","cdd"), function(x) { print(x); print(nchar(x)) }))
This happens because the function prints x, then returns nchar(x); the returned elements are put into a list by lapply and returned, and printed out on the REPL.
Replace nchar(x) with print(nchar(x)). Or, if you want the list returned, just return list(x, nchar(x)) from the inner function.
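To make the mechanism above concrete, here is a minimal sketch. The names words, res and named are introduced here for illustration and are not part of the original question. It relies on two facts: assigning a result to a variable suppresses auto-printing, and print() returns its argument invisibly. The last part is one possible alternative if a readable list is the real goal.

```r
words <- c("a", "bb", "cdd")

# Assigning the result to a variable suppresses auto-printing of the
# returned list, so only the explicit print() calls show up.
res <- lapply(words, function(x) {
  print(x)
  print(nchar(x))   # print() returns its argument invisibly
})

# If the goal is a readable list rather than console output, a named
# list keeps each input next to its result.
named <- sapply(words, nchar, simplify = FALSE)
```

Printing named shows each count under the string it belongs to ($a, $bb, $cdd), which may be close enough to the layout asked for.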
Related
I have a list of vectors of length 2. The first element of each vector holds one kind of value, and the second element holds another:
[[1]]
[1] "51224.99" "0.879"
[[2]]
[1] "51224.50" "0.038"
[[3]]
[1] "51224.00" "0.038"
[[4]]
[1] "51223.50" "0.038"
[[5]]
[1] "51223.00" "0.062"
[[6]]
[1] "51222.50" "0.038"
[[7]]
[1] "51222.00" "0.038"
[[8]]
[1] "51221.86" "0.370"
[[9]]
[1] "51221.82" "0.015"
[[10]]
[1] "51221.50" "0.038"
[[11]]
[1] "51221.44" "2.100"
[[12]]
[1] "51221.39" "0.196"
[[13]]
[1] "51221.00" "0.038"
[[14]]
[1] "51220.50" "0.038"
[[15]]
[1] "51220.19" "0.292"
[[16]]
[1] "51220.00" "0.038"
[[17]]
[1] "51219.97" "0.012"
[[18]]
[1] "51219.62" "0.684"
[[19]]
[1] "51219.50" "0.038"
[[20]]
[1] "51219.02" "2.311"
I need to find the vector whose second element is the largest. That is, the end result should be:
[1] "51219.02" "2.311"
since the largest second element among the vectors is 2.311.
Assuming your list is called yourList, you can do the following:
secondValuesNumeric <- as.numeric(sapply(yourList,"[[",2))
maxIndex <- which.max(secondValuesNumeric)
result <- yourList[[maxIndex]]
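As a quick check of the approach above, here is a small runnable sketch using three of the vectors from the question as sample data. It swaps sapply for vapply, which is an assumption about what you might prefer: vapply fails loudly if any extracted element is not a single string.

```r
# Three of the vectors from the question, used as sample data.
yourList <- list(
  c("51224.99", "0.879"),
  c("51221.44", "2.100"),
  c("51219.02", "2.311")
)

# vapply() checks that every extracted element is a single string.
secondValues <- as.numeric(vapply(yourList, `[[`, character(1), 2))
result <- yourList[[which.max(secondValues)]]
result
```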
You could just turn it into a dataframe! Here's how I would do it:
(first I make an example list to mimic yours):
ex <- purrr::map(seq(12), function(i) c(rnorm(1), rnorm(1)))
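Then you can use purrr to turn it into a dataframe, and filter to keep the row where the second value is the max of that column:
library(dplyr)  # provides %>% and filter()
purrr::map_df(ex, function(x) data.frame(val1 = x[1], val2 = x[2])) %>%
dplyr::filter(val2 == max(val2))
You should be able to use this example by replacing ex with the name of your list. Since your values are stored as strings, wrap x[1] and x[2] in as.numeric() so that max compares numbers rather than text.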
Here is an option with rbind: bind the vectors into a matrix, extract the second column, convert it to numeric, and use the index of the maximum to subset the list.
lst1[which.max(as.numeric(do.call(rbind, lst1)[,2]))]
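One detail of the one-liner above is worth spelling out: subsetting with single brackets returns a list of length 1, not the vector itself. The sketch below uses two sample vectors from the question; the names m, idx, as_list and as_vector are introduced here for illustration.

```r
lst1 <- list(c("51224.99", "0.879"), c("51219.02", "2.311"))

# rbind() stacks the vectors into a 2-column character matrix.
m <- do.call(rbind, lst1)
idx <- which.max(as.numeric(m[, 2]))

as_list   <- lst1[idx]    # list of length 1 holding the vector
as_vector <- lst1[[idx]]  # the character vector itself
```

Use the double-bracket form if you want the bare vector, as shown in the question's expected result.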
I have a vector of character strings (v1) like so:
> head(v1)
[1] "do_i_need_to_even_say_it_do_i_well_here_i_go_anyways_chris_cornell_in_chicago_tonight"
[2] "going_to_see_harry_sunday_happiness"
[3] "this_motha_fucka_stay_solid_foh_with_your_naieve_ass_mentality_your_synapsis_are_lacking_read_a_fucking_book_for_christ_sake"
[4] "why_twitter_will_soon_become_obsolete_http_www.imediaconnection.com_content_23465_asp"
[5] "like_i_said_my_back_still_fucking_hurts_and_im_going_to_complain_about_it_like_no_ones_business_http_tumblr.com_x6n25amd5"
[6] "my_picture_with_kris_karmada_is_gone_forever_its_not_in_my_comments_on_my_mysapce_or_on_my_http_tumblr.com_xzg1wy4jj"
And another vector of character strings (v2) like so:
> head(v2)
[1] "here_i_go" "going" "naieve_ass" "your_synapsis" "my_picture_with" "roll"
What is the quickest way to return a list with one element per item of v1, where each element is a vector of the v2 items that match (as regular expressions) somewhere in that v1 item, like so:
[[1]]
[1] "here_i_go"
[[2]]
[1] "going"
[[3]]
[1] "naieve_ass" "your_synapsis"
[[4]]
[[5]]
[1] "going"
[[6]]
[1] "my_picture_with"
I'd like to leave another option with stri_extract_all_regex() in the stringi package. You can create your regular expression directly from v2 and use it in pattern.
library(stringi)
stri_extract_all_regex(str = v1, pattern = paste(v2, collapse = "|"))
[[1]]
[1] "here_i_go"
[[2]]
[1] "going"
[[3]]
[1] "naieve_ass" "your_synapsis"
[[4]]
[1] NA
[[5]]
[1] "going"
[[6]]
[1] "my_picture_with"
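If stringi is not available, the same idea (one alternation pattern built from v2) can be sketched in base R with gregexpr and regmatches. This is a swapped-in technique, not the answer's own method, and the v1 strings below are shortened stand-ins for the real data. It assumes, as in the question, that the v2 entries contain no regex metacharacters.

```r
# Shortened stand-ins for the question's v1; v2 is taken as given.
v1 <- c(
  "well_here_i_go_anyways",
  "going_to_see_harry",
  "your_naieve_ass_mentality_your_synapsis_are_lacking",
  "why_twitter_will_soon_become_obsolete"
)
v2 <- c("here_i_go", "going", "naieve_ass", "your_synapsis",
        "my_picture_with", "roll")

# One alternation pattern; assumes v2 holds no regex metacharacters.
pattern <- paste(v2, collapse = "|")

# regmatches() returns every match per string, in order of appearance.
matches <- regmatches(v1, gregexpr(pattern, v1))
matches
```

Unlike the stringi output above, a string with no match yields character(0) here rather than NA.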
If you want speed, I'd use stringi. You don't seem to have any regex, just fixed patterns, so we can use a fixed stri_extract, and (since you don't mention what to do with multiple matches) I'll assume only extracting the first match is fine, giving us a little more speed with stri_extract_first_fixed.
It's probably not worth benchmarking on such a small example, but this should be quite fast.
library(stringi)
matches = lapply(v1, stri_extract_first_fixed, v2)
lapply(matches, function(x) x[!is.na(x)])
# [[1]]
# [1] "here_i_go"
#
# [[2]]
# [1] "going"
#
# [[3]]
# [1] "naieve_ass" "your_synapsis"
#
# [[4]]
# character(0)
#
# [[5]]
# [1] "going"
#
# [[6]]
# [1] "my_picture_with"
Thanks for sharing data, but next time please share it copy/pasteably. dput is nice for that. Here's a copy/pasteable input:
v1 = c(
"do_i_need_to_even_say_it_do_i_well_here_i_go_anyways_chris_cornell_in_chicago_tonight" ,
"going_to_see_harry_sunday_happiness" ,
"this_motha_fucka_stay_solid_foh_with_your_naieve_ass_mentality_your_synapsis_are_lacking_read_a_fucking_book_for_christ_sake",
"why_twitter_will_soon_become_obsolete_http_www.imediaconnection.com_content_23465_asp" ,
"like_i_said_my_back_still_fucking_hurts_and_im_going_to_complain_about_it_like_no_ones_business_http_tumblr.com_x6n25amd5" ,
"my_picture_with_kris_karmada_is_gone_forever_its_not_in_my_comments_on_my_mysapce_or_on_my_http_tumblr.com_xzg1wy4jj")
v2 = c("here_i_go", "going", "naieve_ass", "your_synapsis", "my_picture_with", "roll" )
I have two lists. I want to check whether each element of the first list appears in the second list and, if it does, append the letters "a", "b", ... to that element of the first list.
list1 <- list("Year","Age","Enrollment","SES","BOE")
list2 <- list("Year","Enrollment","SES")
I tried to use lapply:
text <- letters[1:length(list2)]
listText<- lapply(list1,function(i) ifelse(i %in% list2,paste(i,text[i],sep="^"),i))
but I got the wrong output:
> listText
[[1]]
[1] "Year^NA"
[[2]]
[1] "Age"
[[3]]
[1] "Enrollment^NA"
[[4]]
[1] "SES^NA"
[[5]]
[1] "BOE"
This is the output I want
[[1]]
[1] "Year^a"
[[2]]
[1] "Age"
[[3]]
[1] "Enrollment^b"
[[4]]
[1] "SES^c"
[[5]]
[1] "BOE"
We can use match to find the index and then use it to subset the first list and paste the letters
i1 <- match(unlist(list2), unlist(list1))
list1[i1] <- paste(list1[i1], letters[seq_along(i1)], sep="^")
You just need to turn text into a named vector, so that it can be indexed by the list element:
text <- as.character(letters[1:length(list2)])
names(text) <- unlist(list2)
The result is:
> listText
[[1]]
[1] "Year^a"
[[2]]
[1] "Age"
[[3]]
[1] "Enrollment^b"
[[4]]
[1] "SES^c"
[[5]]
[1] "BOE"
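For reference, a self-contained sketch of why the original attempt produced NA and how naming the vector fixes it. Indexing an unnamed vector by a string finds nothing, so text[i] was NA. The if/else and the [[ lookup are choices made here for illustration, in place of the original ifelse.

```r
list1 <- list("Year", "Age", "Enrollment", "SES", "BOE")
list2 <- list("Year", "Enrollment", "SES")

text <- letters[seq_along(list2)]
before <- text["Year"]        # NA: an unnamed vector has no "Year" entry

# Naming the letters after list2 makes lookup by element name work.
names(text) <- unlist(list2)

listText <- lapply(list1, function(i) {
  if (i %in% names(text)) paste(i, text[[i]], sep = "^") else i
})
listText
```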
Consider the following array assignments:
temp=array(list(),2)
temp[[2]][[2]]=c("a","b")
temp[[1]][[2]]="c"
This produces the following result:
temp
[[1]]
[1] NA "c"
[[2]]
[[2]][[1]]
NULL
[[2]][[2]]
[1] "a" "b"
Instead, I want the result to be:
temp
[[1]]
[[1]][[1]]
NULL
[[1]][[2]]
[1] "c"
[[2]]
[[2]][[1]]
NULL
[[2]][[2]]
[1] "a" "b"
How do I make the assignment so that the latter is produced rather than the former?
You can initialize the list(s) with replicate instead of array. Lists and arrays behave differently:
x <- replicate(2, list())
x[[1]][[2]] <- "c"
x[[2]][[2]] <- c("a", "b")
x
Note:
is.array(x)
# [1] FALSE
sapply(x, is.array)
# [1] FALSE FALSE
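The point in both cases is that each slot must already be a list before its second element is assigned; the question's output suggests the slots created by array(list(), 2) start out as NULL instead. As a sketch of another way to get that starting point, rep can copy an empty list into each slot. This is an alternative introduced here, not part of the answer above.

```r
# rep() copies the empty list, so each slot is a list from the start.
x <- rep(list(list()), 2)
x[[1]][[2]] <- "c"
x[[2]][[2]] <- c("a", "b")
str(x)
```

Because x[[1]] is a list when "c" is assigned, its first position is padded with NULL and the value is stored as a list element rather than coerced into a character vector.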
After several operations on an igraph object (g), I have ended up with the "id" attribute becoming full of nested lists.
It looks like this:
head(V(g)$id)
[[1]]
[[1]][[1]]
[[1]][[1]][[1]]
[1] "http://www.parliament.uk/"
[[2]]
[[2]][[1]]
[[2]][[1]][[1]]
[1] "http://www.businesslink.gov.uk/"
[[3]]
[[3]][[1]]
[[3]][[1]][[1]]
[1] "http://www.number10.gov.uk/"
... and so forth.
I need to 'unnest' this list so it becomes:
head(V(g)$id)
[1] "http://www.parliament.uk/" "http://www.businesslink.gov.uk/"
[3] "http://www.number10.gov.uk/" "http://www.ombudsman.org.uk/"
[5] "http://www.hm-treasury.gov.uk/" "http://data.gov.uk/"
The nested list is causing problems when igraph exports the object to a graphml file. It results in the "id" being assigned default labels (e.g. n0, n1, n2...).
I have tried the answers to several other questions, particularly this one. However, I cannot get them to work. It is really frustrating!
Are you just looking for unlist, perhaps?
L <- list(list(list("A")), list(list("B")))
L
# [[1]]
# [[1]][[1]]
# [[1]][[1]][[1]]
# [1] "A"
#
#
#
# [[2]]
# [[2]][[1]]
# [[2]][[1]][[1]]
# [1] "B"
#
#
#
unlist(L)
# [1] "A" "B"
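Applied to data shaped like the question's id attribute, this might look as follows. The ids list is sample data built from the URLs in the question, and the length check is a precaution added here: unlist drops NULL elements and expands longer vectors, so the result is only safe to assign back if it still has one entry per vertex.

```r
# Sample data shaped like the nested id attribute in the question.
ids <- list(
  list(list("http://www.parliament.uk/")),
  list(list("http://www.businesslink.gov.uk/")),
  list(list("http://www.number10.gov.uk/"))
)

flat <- unlist(ids)

# unlist() silently drops NULLs and expands longer vectors, so confirm
# that there is still exactly one id per vertex before assigning back.
stopifnot(length(flat) == length(ids))

# With igraph loaded, the assignment would presumably be:
# V(g)$id <- unlist(V(g)$id)
```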