Warning numerical expression has >1 elements: only the first used

I have a dataset as follows:
Apr May Jun Jul Aug Sep Oct Nov b
1.0 9.0 4.0 5.3 6.4 3.4 2.5 4.3 2
5.0 6.0 9.0 2.3 5.8 2.3 6.5 5.2 3
8.0 4.0 6.0 0.7 5.2 1.2 2.2 6.1 4
2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 7
3.2 3.2 3.2 3.2 3.2 3.2 3.2 3.2 8
4.4 4.1 5.1 6.1 7.1 8.1 9.1 6.8 6
5.6 5.0 3.2 4.2 5.2 1.2 2.2 3.2 5
6.8 5.9 8.9 2.3 3.3 5.7 4.7 3.7 5
8.0 6.8 9.8 4.8 5.8 6.8 7.8 8.8 5
9.2 7.7 7.7 2.8 3.8 4.8 5.8 6.8 6
I want to add a column sum with data$sum = rowSums(data[data$b:8]), so that each row is summed from the column given in b through column 8. But I get the warning "numerical expression has 2124 elements: only the first used". Please let me know a better method.

Here's a solution based on your comments:
data$sum <- NA  # important to create the column before the for loop
for (rowIdx in 1:nrow(data)) {
  startCol <- data[rowIdx, "b"]
  data[rowIdx, "sum"] <- sum(data[rowIdx, startCol:8])
}
You need a for loop or an apply-style call here because the [ subset operator cannot take a different starting column for each row. In data$b:8 the left operand of : is the whole b column, and : only uses its first element, which is what the warning is telling you.
What [] does without a comma depends on your data structure:
If data is a matrix, it treats the entire matrix as a single vector in which the columns follow one after another. For example, data[1:15] returns the 10 values in the "Apr" column and then the first 5 values in the "May" column.
If data is a data.frame, it uses the indices to look up columns. That is, data[1:5] is the same as data[,1:5]. The reason is that a data.frame is really a list() under the hood, where each column is an element of the list().
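As a vectorised alternative to the loop, here is a minimal sketch, assuming the same layout as in the question (eight month columns followed by b, with b holding a column index between 1 and 8). It builds a logical mask with col() and lets rowSums() add up only the masked cells. The names m and keep are made up for illustration, and only the first five rows of the sample are used.

```r
data <- read.table(text = "
Apr May Jun Jul Aug Sep Oct Nov b
1.0 9.0 4.0 5.3 6.4 3.4 2.5 4.3 2
5.0 6.0 9.0 2.3 5.8 2.3 6.5 5.2 3
8.0 4.0 6.0 0.7 5.2 1.2 2.2 6.1 4
2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 7
3.2 3.2 3.2 3.2 3.2 3.2 3.2 3.2 8
", header = TRUE)

# Month columns as a numeric matrix
m <- as.matrix(data[, 1:8])

# col(m) holds each cell's column index; data$b is recycled down the rows,
# so keep[i, j] is TRUE when column j is at or after row i's start column
keep <- col(m) >= data$b

# Masked-out cells are multiplied by FALSE (0) and drop out of the sum
data$sum <- rowSums(m * keep)
```

This sketch assumes the month columns contain no NA values; a masked-out NA would still turn the row sum into NA.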

Related

How can I copy specific parts of different files in different directories and save them to open in R?

I have several files with the same name, data.out, in different folders.
I need to extract some data from each of these files and save it to a .csv file to open in R later. The data I need from each file are 7x7 matrices like these:
***** Estimates of covariance matrix ***************************************
Matrix
1 0.8
2 5.6 0.2
3 3.6 5.1 1.3
4 1.2 6.6 1.2 5.6
5 2.7 -3.2 -8.6 3.1 7.2
6 5.1 9.3 5.8 2.4 4.2 6.2
7 1.5 -2.6 -3.1 9.2 8.1 8.7
1.1
**** Estimates of residual matrix ***********************************
Covariance matrix
1 2.1
2 4.1 3.1
3 1.3 5.6 1.4
4 4.5 2.1 8.5 1.1
5 5.1 -5.1 -6.6 5.2 4.1
6 2.4 4.7 4.2 3.1 -1.2 1.7
7 1.2 9.2 3.1 4.5 8.1 1.3
3.9
**** Estimates of correlation matrix ***********************************
Correlation matrix
1 1
2 4.1 1
3 1.3 5.6 1
4 4.5 2.1 8.5 1
5 5.1 -5.1 -6.6 5.2 1
6 2.4 4.7 4.2 3.1 -1.2 1
7 1.2 9.2 3.1 4.5 8.1 1.3
3.9
I was wondering whether Vim can help me do this, but I can't figure it out. I first thought of using line numbers as an index, but they are not the same in each file.
In short, I want to get these matrices from each data.out file in each directory (the directories have different names), then save them as .csv files to open in R.
Is there a way to do this without copying and pasting manually? Can Vim, another text editor, or even R help me do this faster?
Thanks

I'm not sure this is what you are looking for, but in Vim you can append to a register by using its uppercase letter.
Example:
In the first file, visually select the desired text and copy it to register 'a': "ay
In the following files, visually select the desired text and append it to the same register: "Ay
In the destination file, paste from that register: "ap
You can also copy or append with the other motions you know from Vim. E.g., append the current line and the 4 lines below it (5 lines in total) to register 'a': 4"Ayj, or append the text inside braces to register 'a': "Ayi}
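Since the question also asks whether R itself could do the extraction, here is a rough sketch of that route. It assumes every block looks like the sample above: a banner line containing the block name, one title line, then rows that start with a row label followed by the lower-triangle values, with the last value allowed to wrap onto its own line. The function name read_lower_tri and the output file name covariance.csv are made up for illustration.

```r
# Parse one lower-triangular block from the lines of a data.out file.
# `header` is fixed text that identifies the banner line of the block.
read_lower_tri <- function(lines, header, n = 7) {
  start <- grep(header, lines, fixed = TRUE)[1]
  # Skip the banner line and the title line below it
  body <- lines[(start + 2):length(lines)]
  # Stop at the next banner line (a line starting with "*"), if any
  stop_at <- grep("^\\*", body)
  if (length(stop_at) > 0) body <- body[seq_len(stop_at[1] - 1)]
  # Flatten to one numeric stream so wrapped values are handled
  tokens <- scan(text = paste(body, collapse = " "), quiet = TRUE)
  m <- matrix(NA_real_, n, n)
  pos <- 1
  for (i in seq_len(n)) {
    pos <- pos + 1                              # skip the row label
    m[i, seq_len(i)] <- tokens[pos:(pos + i - 1)]
    pos <- pos + i
  }
  # Mirror the lower triangle into the upper triangle
  m[upper.tri(m)] <- t(m)[upper.tri(m)]
  m
}

# Small check against the first block of the sample
sample_lines <- c(
  "***** Estimates of covariance matrix *****",
  "Matrix",
  "1 0.8",
  "2 5.6 0.2",
  "3 3.6 5.1 1.3",
  "4 1.2 6.6 1.2 5.6",
  "5 2.7 -3.2 -8.6 3.1 7.2",
  "6 5.1 9.3 5.8 2.4 4.2 6.2",
  "7 1.5 -2.6 -3.1 9.2 8.1 8.7",
  "1.1",
  "**** Estimates of residual matrix *****"
)
cov_mat <- read_lower_tri(sample_lines, "Estimates of covariance matrix")

# Apply it to every data.out below the current directory
files <- list.files(".", pattern = "^data\\.out$",
                    recursive = TRUE, full.names = TRUE)
for (f in files) {
  m <- read_lower_tri(readLines(f), "Estimates of covariance matrix")
  write.csv(m, file.path(dirname(f), "covariance.csv"), row.names = FALSE)
}
```

The same call with "Estimates of residual matrix" or "Estimates of correlation matrix" as the header would pick up the other two blocks, provided nothing else follows the last block in the file.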

Matching elements of two lists of different sizes by their names

I have two lists of different sizes. One list (named trees) is composed of phylogenetic trees (class phylo), and the second list (named data_values) is composed of numeric values.
The tip names of each phylogenetic tree in trees match the names of elements in data_values. But data_values has more elements than any single tree has tips.
library(phytools)
library(ape)
library(plyr)  # for llply() and ldply() used below
# original tree:
tree_original = rtree(12, tip.label = paste0("species", LETTERS[1:12]))
# list of trees:
nodes = 14:23
trees = lapply(nodes, extract.clade, phy = tree_original)
names(trees) <- paste0("", 14:23)
data_values <- list()
for (i in 1:17) { data_values[[paste0('species', LETTERS[i])]] <- round(rnorm(10, 5, 4), 1) }
I would like to match both lists (trees and data_values) using the species names as an index, to get a data frame for each tree (see the example below). I can do this for each tree in trees individually, but my real list of species is much bigger than this example, so I would like to know whether I can run the operation below over the whole list of trees instead of tree by tree, like this:
tree14 = data_values[match(trees$`14`$tip.label, names(data_values))]
tree14 = llply(tree14, function(x) sapply(x, as.numeric))
tree14_df = ldply(tree14, .fun=identity)  # I will need each result as a data.frame
.id 1 2 3 4 5 6 7 8 9 10
1 speciesE -0.5 3.4 2.0 5.3 3.7 8.2 3.5 -2.0 3.1 10.2
2 speciesL 6.8 4.3 7.1 5.5 4.9 2.5 0.3 -3.8 4.1 6.4
3 speciesA 2.5 2.5 9.6 10.6 2.2 7.1 4.1 4.4 6.0 6.7
4 speciesI -3.5 7.2 6.8 2.8 7.5 8.9 13.4 13.1 1.8 5.5
5 speciesC 4.3 2.2 10.0 7.4 4.4 8.3 -0.7 3.6 9.2 6.3
6 speciesH 6.3 6.1 2.2 4.6 7.4 7.3 2.9 0.6 3.0 5.2
7 speciesB 8.3 1.7 -0.1 4.5 9.4 -0.2 7.5 1.4 -0.3 4.6
8 speciesD 6.2 5.8 6.6 1.1 5.4 11.1 -1.1 0.0 7.9 0.4
9 speciesG 3.5 2.8 1.4 11.6 -2.8 11.0 3.5 2.8 3.1 4.8
10 speciesK 0.9 4.9 5.4 2.7 -0.7 5.1 18.3 4.9 2.5 -0.7
tree15 = data_values[match(trees$`15`$tip.label, names(data_values))]
tree15 = llply(tree15, function(x) sapply(x, as.numeric))
tree15_df = ldply(tree15, .fun=identity)
.id 1 2 3 4 5 6 7 8 9 10
1 speciesE -0.5 3.4 2.0 5.3 3.7 8.2 3.5 -2.0 3.1 10.2
2 speciesL 6.8 4.3 7.1 5.5 4.9 2.5 0.3 -3.8 4.1 6.4
3 speciesA 2.5 2.5 9.6 10.6 2.2 7.1 4.1 4.4 6.0 6.7
4 speciesI -3.5 7.2 6.8 2.8 7.5 8.9 13.4 13.1 1.8 5.5
5 speciesC 4.3 2.2 10.0 7.4 4.4 8.3 -0.7 3.6 9.2 6.3
6 speciesH 6.3 6.1 2.2 4.6 7.4 7.3 2.9 0.6 3.0 5.2
7 speciesB 8.3 1.7 -0.1 4.5 9.4 -0.2 7.5 1.4 -0.3 4.6
... this operation goes until tree23
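One possible way to run the same matching over the whole list is lapply(), sketched below in base R only. To keep the sketch runnable without ape, trees is replaced by stand-in lists that carry just a tip.label element (the real phylo objects are accessed through the same field in the question's code); the tip labels shown and the name tree_dfs are made up for illustration.

```r
set.seed(1)
data_values <- list()
for (i in 1:17) {
  data_values[[paste0("species", LETTERS[i])]] <- round(rnorm(10, 5, 4), 1)
}

# Stand-ins for the list of phylo objects: only tip.label is needed here
trees <- list(
  `14` = list(tip.label = c("speciesE", "speciesL", "speciesA")),
  `15` = list(tip.label = c("speciesC", "speciesH"))
)

# One data frame per tree, in the same order and with the same names as trees
tree_dfs <- lapply(trees, function(tr) {
  vals <- data_values[match(tr$tip.label, names(data_values))]
  # Stack the numeric vectors row-wise, one row per species
  mat <- do.call(rbind, unname(vals))
  data.frame(.id = names(vals), mat)
})
```

Each result is then available by tree name, for example tree_dfs[["14"]].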

Read csv file with selected rows using data.table's fread

I was going through an earlier post, "Quickest way to read a subset of rows of a CSV".
One way to select a subset of the data is:
write.csv(iris,"iris.csv")
fread("shuf -n 5 iris.csv")
However, I was wondering whether I can pass something like an SQL query instead of taking 5 randomly chosen rows, e.g. import only those rows that have V6 = versicolor.
Is there any way to do this using the fread function?

This worked for me on Windows (the Unix alternative is grep):
write.csv(iris,"iris.csv")
fread(cmd = paste('findstr', 'versicolor', 'iris.csv'))
V1 V2 V3 V4 V5 V6
1: 51 7.0 3.2 4.7 1.4 versicolor
2: 52 6.4 3.2 4.5 1.5 versicolor
3: 53 6.9 3.1 4.9 1.5 versicolor
4: 54 5.5 2.3 4.0 1.3 versicolor
5: 55 6.5 2.8 4.6 1.5 versicolor
6: 56 5.7 2.8 4.5 1.3 versicolor
7: 57 6.3 3.3 4.7 1.6 versicolor
8: 58 4.9 2.4 3.3 1.0 versicolor
9: 59 6.6 2.9 4.6 1.3 versicolor
10: 60 5.2 2.7 3.9 1.4 versicolor
11: 61 5.0 2.0 3.5 1.0 versicolor
It outputs only those rows that contain "versicolor" in any field.
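For the Unix side mentioned above, a minimal sketch of the same idea with grep, assuming data.table is installed and grep is on the PATH. The temporary file path and the name dt are made up for illustration.

```r
library(data.table)

# Write iris to a temporary CSV (row names become the first column)
path <- tempfile(fileext = ".csv")
write.csv(iris, path)

# grep keeps only the lines that contain "versicolor"; the header line is
# dropped as well, so fread() names the columns V1 to V6
dt <- fread(cmd = paste("grep versicolor", shQuote(path)))
```

As with findstr, this matches the text anywhere on the line, not only in the species column.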

An error when executing sapply(iris, unique) [R]

The iris data set has 5 columns: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width and Species. I made a few attempts:
Calling unique() on each column individually worked.
sapply() also worked when I used mean as FUN. However, I got an error when I used unique as FUN:
sapply(iris,unique)
$Sepal.Length
[1] 5.1 4.9 4.7 4.6 5.0 5.4 4.4 4.8 4.3 5.8 5.7 5.2 5.5 4.5 5.3 7.0 6.4 6.9 6.5 6.3 6.6 5.9 6.0 6.1 5.6 6.7 6.2
[28] 6.8 7.1 7.6 7.3 7.2 7.7 7.4 7.9
$Sepal.Width
[1] 3.5 3.0 3.2 3.1 3.6 3.9 3.4 2.9 3.7 4.0 4.4 3.8 3.3 4.1 4.2 2.3 2.8 2.4 2.7 2.0 2.2 2.5 2.6
$Petal.Length
[1] 1.4 1.3 1.5 1.7 1.6 1.1 1.2 1.0 1.9 4.7 4.5 4.9 4.0 4.6 3.3 3.9 3.5 4.2 3.6 4.4 4.1 4.8 4.3 5.0 3.8 3.7 5.1
[28] 3.0 6.0 5.9 5.6 5.8 6.6 6.3 6.1 5.3 5.5 6.7 6.9 5.7 6.4 5.4 5.2
$Petal.Width
[1] 0.2 0.4 0.3 0.1 0.5 0.6 1.4 1.5 1.3 1.6 1.0 1.1 1.8 1.2 1.7 2.5 1.9 2.1 2.2 2.0 2.4 2.3
$Species
[1] setosa versicolor virginica
Error in if (n <= 1L || lenl[n] <= width) n else max(1L, which.max(lenl > :
missing value where TRUE/FALSE needed
It looks as though sapply() and unique() have already done their work, so why is the error shown on the console? I tried options(error = recover) but couldn't figure it out. Is it because Species is a factor? How can I make it work?
I actually ran into the same problem while taking a swirl lesson, and I have been stuck on it for a few days. Could anyone help me solve it? I would appreciate your help. Thanks.
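One way to check the impression that the computation itself finished, offered as a sketch rather than a diagnosis: store the result instead of letting the console auto-print it, then inspect it with str(), which summarises each element without going through the usual print method for factors. The names res and n_unique are made up for illustration.

```r
# Store the result; nothing is printed at this point
res <- sapply(iris, unique)

# sapply() cannot simplify here because the columns have different numbers
# of unique values, so res is a list with one element per column
str(res)

# Number of unique values per column
n_unique <- sapply(res, length)
```

If this runs cleanly while printing res still fails, the problem would seem to lie in displaying the Species factor rather than in sapply() or unique().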

R: add column with slope coefficient for values over time

I have a dataframe which has values over time. The colnames reflect the time in milliseconds. I would like to add an additional column with the slope coefficient of a line of best fit for each token.
Token 0ms 20ms 40ms 60ms 80ms
1 2.5 3.7 4.8 5.2 6.3
2 3.6 4.9 5.2 6.1 7.8
3 1.1 3.2 4.6 7.8 9.1
4 4.5 3.3 2.1 1.9 NA
5 2.1 3.5 3.7 NA NA
Some rows have NAs, as not all tokens are active for the same amount of time.

d <- read.table(text=
"Token 0ms 20ms 40ms 60ms 80ms
1 2.5 3.7 4.8 5.2 6.3
2 3.6 4.9 5.2 6.1 7.8
3 1.1 3.2 4.6 7.8 9.1
4 4.5 3.3 2.1 1.9 NA
5 2.1 3.5 3.7 NA NA",
    header=TRUE,check.names=FALSE)
slopes <- apply(as.matrix(d[,-1]),1,
    function(y) {
        fit <- lm(y~t,
                  data=data.frame(y,
                                  t=seq(0,length=length(y),by=20)))
        coef(fit)[2]
    })
data.frame(d,slopes,check.names=FALSE)
## Token 0ms 20ms 40ms 60ms 80ms slopes
## 1 1 2.5 3.7 4.8 5.2 6.3 0.0455
## 2 2 3.6 4.9 5.2 6.1 7.8 0.0480
## 3 3 1.1 3.2 4.6 7.8 9.1 0.1030
## 4 4 4.5 3.3 2.1 1.9 NA -0.0450
## 5 5 2.1 3.5 3.7 NA NA 0.0400
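A variant of the same calculation, sketched under the assumption that every time column is named as a number followed by "ms": it reads the time points from the column names instead of hard-coding the 20 ms step, and uses the identity that the least-squares slope of a simple linear fit equals cov(t, y) / var(t). The names times and slope_of are made up for illustration.

```r
d <- read.table(text =
"Token 0ms 20ms 40ms 60ms 80ms
1 2.5 3.7 4.8 5.2 6.3
2 3.6 4.9 5.2 6.1 7.8
3 1.1 3.2 4.6 7.8 9.1
4 4.5 3.3 2.1 1.9 NA
5 2.1 3.5 3.7 NA NA",
    header = TRUE, check.names = FALSE)

# Time points taken from the column names: "20ms" -> 20
times <- as.numeric(sub("ms$", "", names(d)[-1]))

# Least-squares slope for one row, using only the non-missing values
slope_of <- function(y) {
  ok <- !is.na(y)
  cov(times[ok], y[ok]) / var(times[ok])
}

d$slope <- apply(as.matrix(d[, -1]), 1, slope_of)
```

Dropping the NA entries per row mirrors what lm() does by default, so the values should agree with the slopes column shown above.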
