How to read data - r

I am quite new to the R language. I want to read the following input, but I have no idea how to proceed:
m n
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
I wanted to read the following input into 6 categories
m
n
c(1,5,9,13)
c(2,6,10,14)
c(3,7,11,15)
c(4,8,12,16)
I tried the following code but it doesn't seem to work
f <- file("stdin")
r <- file("stdin")
data1 = scan(file = r,skip = 1)
data1 <- split(data1, " ")
data2 = scan(file = f ,nlines =1)
data2 <- split(data2, " ")
o1 = data2[1]
o2 = data2[2]
It always seems to give
"Read 0 items"
for data2.

Use read.table twice, once for the first line and once for the rest, where Lines is defined in the Note at the end.
mn <- read.table(text = Lines, nrows = 1, as.is = TRUE)
DF <- read.table(text = Lines, skip = 1)
giving:
mn
## V1 V2
## 1 m n
mn[[1]]
## [1] "m"
mn$V1 # same
## [1] "m"
DF
## V1 V2 V3 V4
## 1 1 2 3 4
## 2 5 6 7 8
## 3 9 10 11 12
## 4 13 14 15 16
DF[[1]]
## [1] 1 5 9 13
DF$V1 # same
## [1] 1 5 9 13
A list made up of the 6 components is:
unname( c(mn, DF) )
## [[1]]
## [1] "m"
##
## [[2]]
## [1] "n"
##
## [[3]]
## [1] 1 5 9 13
##
## [[4]]
## [1] 2 6 10 14
##
## [[5]]
## [1] 3 7 11 15
##
## [[6]]
## [1] 4 8 12 16
scan
If you prefer to use scan, as in the question, and all lines after the first have the same number of fields, use count.fields to get the number of fields on each line into counts, then pass those numbers to scan:
counts <- count.fields(textConnection(Lines))
c( scan(text = Lines, what = "", nmax = counts[1], quiet = TRUE),
scan(text = Lines, what = as.list(numeric(counts[2])), skip = 1, quiet = TRUE) )
## [[1]]
## [1] "m"
##
## [[2]]
## [1] "n"
##
## [[3]]
## [1] 1 5 9 13
##
## [[4]]
## [1] 2 6 10 14
##
## [[5]]
## [1] 3 7 11 15
##
## [[6]]
## [1] 4 8 12 16
Note
Assume the input is:
Lines <- "m n
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16"
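When the input really arrives on standard input rather than in a string, one option is to read all of it once and then parse the resulting lines twice. One plausible reason the attempt in the question reports "Read 0 items" is that the first scan has already consumed stdin by the time the second one runs. The sketch below is only an illustration: parse_input is a hypothetical helper name, and the sample vector stands in for what readLines(file("stdin")) would return in a script.

```r
# Hypothetical helper: takes the input as a character vector, one element per line.
parse_input <- function(lines) {
  # First line: the two labels (kept as character)
  header <- read.table(text = lines, nrows = 1, as.is = TRUE)
  # Remaining lines: the numeric table, one column per field
  body <- read.table(text = lines, skip = 1)
  # A data frame is a list of columns, so c() yields one list of 6 components
  unname(c(header, body))
}

# In a script run with Rscript, the lines could come from stdin instead:
#   lines <- readLines(file("stdin"))
lines <- c("m n", "1 2 3 4", "5 6 7 8", "9 10 11 12", "13 14 15 16")
parsed <- parse_input(lines)
parsed[[1]]  # first label
parsed[[3]]  # first column of the table
```

Reading everything once means the input is never consumed by one call before another gets to it.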

Related

How to group list elements according to groups defined in a grouping vector?

Consider a list and a grouping vector
l = list(3:4, 8:10, 7:8)
# [[1]]
# [1] 3 4
#
# [[2]]
# [1] 8 9 10
#
# [[3]]
# [1] 7 8
g = c("A", "B", "A")
What is the easiest way to group l elements according to groups defined in g, such that we get:
l_expected = list(c(3:4, 7:8), 8:10)
# [[1]]
# [1] 3 4 7 8
#
# [[2]]
# [1] 8 9 10
One way is to use tapply + unlist, and unname if necessary.
tapply(l, g, unlist)
output
$A
[1] 3 4 7 8
$B
[1] 8 9 10
Using split
lapply(split(l, g), unlist)
$A
[1] 3 4 7 8
$B
[1] 8 9 10
Or get the lengths and split after unlisting
split(unlist(l), rep(g, lengths(l)))
$A
[1] 3 4 7 8
$B
[1] 8 9 10
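All of the results above carry the group names A and B, while the expected list in the question is unnamed. A small sketch of the "unname if necessary" step, checked against l_expected (the variable name grouped is only for illustration):

```r
l <- list(3:4, 8:10, 7:8)
g <- c("A", "B", "A")
l_expected <- list(c(3:4, 7:8), 8:10)

# Group the list elements by g, flatten each group, then drop the names
grouped <- unname(lapply(split(l, g), unlist))
identical(grouped, l_expected)
```

Note that split orders the groups by the sorted levels of g, so the output order follows "A", "B" rather than the order of first appearance.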

split list into lists each of length x

Simple problem, given a list:
main_list <- list(1:3,
4:6,
7:9,
10:12,
13:15)
main_list
# [[1]]
# [1] 1 2 3
# [[2]]
# [1] 4 5 6
# [[3]]
# [1] 7 8 9
# [[4]]
# [1] 10 11 12
# [[5]]
# [1] 13 14 15
I want to split the list into multiple lists where I break up the original one into lists each of length x. So if I said x = 2, I would get 3 lists of length 2, 2 and the leftover 1:
target <- list(list(1:3,
4:6),
list(7:9,
10:12),
list(13:15))
target
# [[1]]
# [[1]][[1]]
# [1] 1 2 3
# [[1]][[2]]
# [1] 4 5 6
# [[2]]
# [[2]][[1]]
# [1] 7 8 9
# [[2]][[2]]
# [1] 10 11 12
# [[3]]
# [[3]][[1]]
# [1] 13 14 15
Something like:
my_split <- function(listtest, x) {
split(listtest, c(1:x))
}
target <- my_split(main_list, 2)
Thanks
Here is an option with gl:
split(main_list, as.integer(gl(length(main_list), 2, length(main_list))))
It can be converted to a custom function
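f1 <- function(lstA, n) {
  # number of elements in the input list
  l1 <- length(lstA)
  # gl() builds group labels 1,1,...,2,2,... in runs of n, truncated to length l1
  split(lstA, as.integer(gl(l1, n, l1)))
}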
EDIT: no conditional logic needed. Just use split() with c() and rep():
my_split <- function(l, x){
l_length <- length(l)
l_div <- l_length / x
split(l, c(rep(seq_len(l_div), each = x), rep(ceiling(l_div), l_length %% x)))
}
my_split(main_list, 2)
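Another way to build the grouping vector, offered here only as a sketch and not taken from the answers above, is to divide each element's position by x and round up. The name chunk_list is hypothetical.

```r
# Hypothetical helper: split a list into consecutive chunks of at most x elements
chunk_list <- function(lst, x) {
  # positions 1..n map to chunk ids 1,1,...,2,2,... via ceiling(position / x)
  unname(split(lst, ceiling(seq_along(lst) / x)))
}

main_list <- list(1:3, 4:6, 7:9, 10:12, 13:15)
chunks <- chunk_list(main_list, 2)
lengths(chunks)  # sizes of the chunks, the last one holding the leftover
```

Because the chunk id is computed per position, the leftover elements need no separate handling.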

How to name the element of the list in r?

Suppose I have this list:
my_variable <- list()
x <- c(1,2,3,4)
y <- c(4,5,7,3)
for ( i in 1:4){
my_variable[[i]] <- x[i]*y[i]+2
}
Then I will get this:
[[1]]
[1] 6
[[2]]
[1] 12
[[3]]
[1] 23
[[4]]
[1] 14
How to name the element of the output, like this:
> my_variable
First_result
[1] 6
Second_result
[1] 12
and so on.
You could do it with paste0 and names:
# First, define a vector of name prefixes:
names1 <- c("First","Second","Third","Fourth")
# Then append the suffix to each prefix and assign the result as the list's names
names(my_variable) <- paste0(names1, "_result")
# And the output
my_variable
$First_result
[1] 6
$Second_result
[1] 12
$Third_result
[1] 23
$Fourth_result
[1] 14
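Since x[i]*y[i]+2 is an elementwise calculation, the loop and the naming step can also be combined. The following is a sketch of that idea using setNames, not part of the original answer:

```r
x <- c(1, 2, 3, 4)
y <- c(4, 5, 7, 3)
names1 <- c("First", "Second", "Third", "Fourth")

# Compute all results at once, turn them into a list, and name the elements
my_variable <- setNames(as.list(x * y + 2), paste0(names1, "_result"))
my_variable$Second_result
```

Each element holds a single number, so my_variable$First_result is 6, my_variable$Second_result is 12, and so on.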

Splitting numeric vectors in R

If I have a vector, c(1,2,3,5,7,9,10,12)...and another vector c(3,7,10), how would I produce the following:
[[1]]
1,2,3
[[2]]
5,7
[[3]]
9,10
[[4]]
12
Notice how 3, 7 and 10 become the last number of each list element (except the last one), so in a sense they act as "breakpoints". I am sure there is a simple R function for this that I either don't know or have forgotten.
Here's one way using cut and split (x is the first vector and y is the vector of breakpoints):
split(x, cut(x, c(-Inf, y, Inf)))
#$`(-Inf,3]`
#[1] 1 2 3
#
#$`(3,7]`
#[1] 5 7
#
#$`(7,10]`
#[1] 9 10
#
#$`(10, Inf]`
#[1] 12
You could also do
split(x, cut(x, unique(c(y, range(x))), include.lowest = TRUE))
## $`[1,3]`
## [1] 1 2 3
## $`(3,7]`
## [1] 5 7
## $`(7,10]`
## [1] 9 10
## $`(10,12]`
## [1] 12
Similar to @beginneR's answer, but using findInterval instead of cut
split(x, findInterval(x, y + 1))
# $`0`
# [1] 1 2 3
#
# $`1`
# [1] 5 7
#
# $`2`
# [1] 9 10
#
# $`3`
# [1] 12
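The y + 1 shift works because the values here are whole numbers; a value such as 3.5 would still land in the first group. A sketch of a variant that should not depend on that, assuming an R version where findInterval has the left.open argument, treats each breakpoint as the closed right end of its interval:

```r
x <- c(1, 2, 3, 5, 7, 9, 10, 12)
y <- c(3, 7, 10)

# left.open = TRUE makes the intervals (y[i], y[i+1]], so each breakpoint
# is the last value of its group
groups <- split(x, findInterval(x, y, left.open = TRUE))
unname(groups)
```

The group names are again "0" to "3"; unname drops them if a plain list is wanted.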

How to combine all sublist elements into one list

I have a list (of length 3) which is made up of sublists (each of differing length - 2, 2, 3). I would like to store all of this as one big list (e.g., no sublists - just one list of length 7). I understand how to do it manually, but is there a function or command I can use?
I would like to be able to do this for lists and sublists of any length.
Here's an example of the list:
[[1]]
[[1]][[1]]
name n l_1 t t_3 t_4 t_5 cluster
12 563035 19 9.263158 0.2017045 0.06379453 0.075876830 0.095852895 1
14 563037 19 8.026316 0.2076503 0.05634675 0.098684211 -0.104566563 1
[[1]][[2]]
name n l_1 t t_3 t_4 t_5 cluster
13 563036 20 7.200000 0.1838450 -0.06428098 0.085681987 -0.011070830 2
17 563042 20 7.725000 0.2168285 0.15161037 0.117570045 -0.067102568 2
[[2]]
[[2]][[1]]
name n l_1 t t_3 t_4 t_5 cluster
1 561101 11 6.772727 0.19731544 0.029478458 -0.128117914 6.235828e-02 1
44 563080 11 7.545455 0.18554217 0.103896104 0.285714286 -2.164502e-02 1
[[2]][[2]]
name n l_1 t t_3 t_4 t_5 cluster
48 566017 33 10.400000 0.2037624 0.16432326 0.1166006937 -0.012830017 2
49 566018 22 9.218182 0.2113271 0.30646667 0.2502280702 0.189838207 2
50 566020 19 11.736842 0.3111609 0.51217445 0.5147883012 0.462723120 2
[[3]]
[[3]][[1]]
name n l_1 t t_3 t_4 t_5 cluster
158 568004 18 8.722222 0.1787186 -0.05083857 0.06498952 0.06918239 1
161 568046 19 11.794737 0.3646190 0.54582540 0.49747236 0.32255755 1
162 568047 18 12.916667 0.3366224 0.53523112 0.40464111 0.29960541 1
163 568048 20 11.590000 0.3918986 0.50007725 0.43039556 0.34299752 1
[[3]][[2]]
name n l_1 t t_3 t_4 t_5 cluster
165 568050 20 9.125000 0.2034607 0.29789747 0.31073776 0.09157738 2
167 568054 20 8.850000 0.1332144 0.09895833 0.18636204 0.04641544 2
[[3]][[3]]
name n l_1 t t_3 t_4 t_5 cluster
168 568058 20 8.675000 0.2012741 0.18161266 0.200319163 -0.009375416 3
170 568061 18 24.861111 0.7394676 0.91836281 0.928317483 0.905563950 3
Many thanks,
Sylvia
For your specific question, the answer is simple:
unlist(mylist, recursive = FALSE)
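To see what recursive = FALSE does, here is a small sketch with made-up data frames standing in for the ones in the question (make_df and its columns are hypothetical). Only one level of nesting is removed, so the data frames themselves stay intact:

```r
# Hypothetical stand-in for the question's data:
# 3 sublists holding 2, 2 and 3 data frames
make_df <- function(id) data.frame(name = id, cluster = 1L)
mylist <- list(
  list(make_df(1), make_df(2)),
  list(make_df(3), make_df(4)),
  list(make_df(5), make_df(6), make_df(7))
)

flat <- unlist(mylist, recursive = FALSE)
length(flat)                                  # one entry per data frame
all(vapply(flat, is.data.frame, logical(1)))  # the elements are still data frames
```

With the default recursive = TRUE, the data frames would be broken up as well, which is usually not what is wanted here.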
However, you asked how to do this for a list with an arbitrary number of sublists. That is a bit trickier. Fortunately, Akhil S Bhel has tackled that problem and created a function called LinearizeNestedList. His site is down at the moment, but I have put his function up as a GitHub Gist.
First, we'll create some sample data with nested lists within nested lists.
NList <- list(a = "a", # Atom
b = 1:5, # Vector
c = data.frame(x = runif(5), y = runif(5)),
d = matrix(runif(4), nrow = 2),
e = list(l = list("a", "b"),
m = list(1:5, 5:10),
n = list(list(1), list(2))))
The source list looks like this. Notice the nesting that happens with the nested list item "e".
NList
# $a
# [1] "a"
#
# $b
# [1] 1 2 3 4 5
#
# $c
# x y
# 1 0.7893562 0.47761962
# 2 0.0233312 0.86120948
# 3 0.4772301 0.43809711
# 4 0.7323137 0.24479728
# 5 0.6927316 0.07067905
#
# $d
# [,1] [,2]
# [1,] 0.09946616 0.5186343
# [2,] 0.31627171 0.6620051
#
# $e
# $e$l
# $e$l[[1]]
# [1] "a"
#
# $e$l[[2]]
# [1] "b"
#
#
# $e$m
# $e$m[[1]]
# [1] 1 2 3 4 5
#
# $e$m[[2]]
# [1] 5 6 7 8 9 10
#
#
# $e$n
# $e$n[[1]]
# $e$n[[1]][[1]]
# [1] 1
#
#
# $e$n[[2]]
# $e$n[[2]][[1]]
# [1] 2
You can see how LinearizeNestedList "flattens" all sublists so you end up with a single list.
LinearizeNestedList(NList)
# $a
# [1] "a"
#
# $b
# [1] 1 2 3 4 5
#
# $c
# x y
# 1 0.7893562 0.47761962
# 2 0.0233312 0.86120948
# 3 0.4772301 0.43809711
# 4 0.7323137 0.24479728
# 5 0.6927316 0.07067905
#
# $d
# [,1] [,2]
# [1,] 0.09946616 0.5186343
# [2,] 0.31627171 0.6620051
#
# $`e/l/1`
# [1] "a"
#
# $`e/l/2`
# [1] "b"
#
# $`e/m/1`
# [1] 1 2 3 4 5
#
# $`e/m/2`
# [1] 5 6 7 8 9 10
#
# $`e/n/1/1`
# [1] 1
#
# $`e/n/2/1`
# [1] 2
By the way, I forgot to mention that you can flatten data.frames in lists too (since a data.frame is a special type of list in R).
If you really want to flatten everything out (well, except arrays, since they are just vectors with dims), add LinearizeDataFrames = TRUE to your LinearizeNestedList call:
LinearizeNestedList(NList, LinearizeDataFrames=TRUE)
How about this:
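dissolve <- function(x){
  # recursive helper: walk into lists, collect everything else
  operator <- function(x){
    if(is.list(x)){
      for(i in seq(x)){
        operator(x[[i]])
      }
    }else{
      # append the non-list element to combi in the enclosing function
      combi[[length(combi)+1]] <<- x
    }
  }
  combi=list()
  operator(x)
  return(combi)
}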
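One thing to keep in mind with a recursive walk like this: is.list() is TRUE for data frames, so the data frames in the question would be taken apart into their columns. A sketch of a variant that treats data frames as leaves is shown below; flatten_list is a hypothetical name and the test data is made up.

```r
# Hypothetical helper: flatten arbitrarily nested lists, keeping data frames whole
flatten_list <- function(x) {
  # Anything that is not a plain list (including a data frame) is a leaf
  if (!is.list(x) || is.data.frame(x)) return(list(x))
  if (length(x) == 0) return(list())
  # Flatten each element, then concatenate the resulting lists
  do.call(c, lapply(x, flatten_list))
}

nested <- list(
  list(data.frame(a = 1:2), data.frame(a = 3:4)),
  list(data.frame(a = 5:6), list(data.frame(a = 7:8)))
)
flat_nested <- flatten_list(nested)
length(flat_nested)
```

Unlike unlist(mylist, recursive = FALSE), this handles any depth of nesting, at the cost of a few more lines.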
Does unlist(mylist) work?
