This should be very simple, but I cannot resolve it. I get what I think is a matrix from the str_match_all function, but despite its appearance this does not seem to be the case. I can extract the timestamp from the first row of the 'matrix' by hard-coding the indices [1,2] in sapply. I want to do the same for the last row, and thought I could simply use the number of rows in the matrix, e.g. [nrow(sm),2], but I cannot! See below:
sm <- str_match_all(regex_text, regex_list[row, "regex_pattern"] )
print(sm)
#This gives me this (which is good):
# [[1]]
# [,1] [,2] [,3]
# [1,] "09/08/2014 13:01CONTENT_ACCESS.preparing" "09/08/2014 13:01" "CONTENT_ACCESS.preparing"
# [2,] "09/08/2014 13:06CONTENT_ACCESS.preparing" "09/08/2014 13:06" "CONTENT_ACCESS.preparing"
# [3,] "09/08/2014 13:08CONTENT_ACCESS.preparing" "09/08/2014 13:08" "CONTENT_ACCESS.preparing"
#Get the first timestamp
start_t_stamp <- sapply(sm, function(x) x[1,2])
print(start_t_stamp)
# Also good, I get [1] "09/08/2014 13:01"
#Get the last timestamp. How do I extract the 'number of rows' in sm?
#This returns NULL
print(nrow(sm))
#transform to matrix???
t_sm <- t(sm)
#This then prints "[1,] Character,9"
print(t_sm)
#Therefore this prints 1
print(nrow(t_sm))
Thanks in advance...
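A likely explanation, given the `[[1]]` in the printed output, is that str_match_all returns a list holding one matrix per input string, so nrow() has to be applied to each list element rather than to sm itself. The sketch below assumes that structure; the list is built by hand to mimic the output above and is not produced by stringr.

```r
# Hand-built stand-in for the str_match_all result: a list holding one matrix
sm <- list(matrix(c(
  "09/08/2014 13:01CONTENT_ACCESS.preparing", "09/08/2014 13:01", "CONTENT_ACCESS.preparing",
  "09/08/2014 13:06CONTENT_ACCESS.preparing", "09/08/2014 13:06", "CONTENT_ACCESS.preparing",
  "09/08/2014 13:08CONTENT_ACCESS.preparing", "09/08/2014 13:08", "CONTENT_ACCESS.preparing"
), ncol = 3, byrow = TRUE))

nrow(sm)        # NULL, because sm is a list, not a matrix
nrow(sm[[1]])   # 3, the row count of the matrix inside the list

# Get the last timestamp: take the last row of each matrix in the list
end_t_stamp <- sapply(sm, function(x) x[nrow(x), 2])
print(end_t_stamp)
# [1] "09/08/2014 13:08"
```

The same sapply pattern as for the first timestamp applies; only the row index changes from 1 to nrow(x).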
I have a 4x4 matrix and I want to identify elements of this matrix that are equal to a specific value such as 1. I want to save the indices of these elements as well as column and row names to two separate vectors. Finally, I want to write all of this information to a txt file.
I managed to write the indices to a txt file, but I have no idea how to retrieve the column and row names from the matrix. To test, I am using the following example:
mat <- matrix(c(1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6), ncol=4, nrow=4)
colnames(mat) <- c("C1","C2","C3","C4")
rownames(mat) <- c("R1", "R2","R3","R4")
cutoff <- 1  # value to search for
r.indices <- c()
c.indices <- c()
for (row in 1:nrow(mat)){
  for (col in 1:(ncol(mat)-row+1)){
    if (mat[row,col] == cutoff){
      #print("this is one!")
      r.indices <- c(r.indices,row)
      c.indices <- c(c.indices,col)
    }
  }
}
write.csv(cbind(r.indices, c.indices), file="data.txt")
The which function already provides a nice interface to getting all the row and column indices of a matrix that meet a certain criterion, using the arr.ind=TRUE argument. This is both much less typing and much more efficient than looping through each matrix element. For instance, if you wanted to get all the indices where your matrix equaled 5, you could use:
(idx <- which(mat == 5, arr.ind=TRUE))
# row col
# R1 1 2
# R3 3 4
Now all that remains is a simple lookup using the row and column names of your matrix:
cbind(rownames(mat)[idx[,"row"]], colnames(mat)[idx[,"col"]])
# [,1] [,2]
# [1,] "R1" "C2"
# [2,] "R3" "C4"
You could write this result out to a file using write.csv.
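As a rough sketch of that last step, the indices and names can be collected into one data frame and written out in a single call. The column names of `hits` and the temporary output path are illustrative assumptions, not part of the question.

```r
mat <- matrix(c(1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6), ncol = 4, nrow = 4)
colnames(mat) <- c("C1", "C2", "C3", "C4")
rownames(mat) <- c("R1", "R2", "R3", "R4")

# Row/column positions of every element equal to 5
idx <- which(mat == 5, arr.ind = TRUE)

# Combine the numeric indices with the matching row and column names
hits <- data.frame(
  row      = idx[, "row"],
  col      = idx[, "col"],
  row.name = rownames(mat)[idx[, "row"]],
  col.name = colnames(mat)[idx[, "col"]],
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".txt")  # hypothetical output path
write.csv(hits, file = out_file, row.names = FALSE)
```

Replacing `tempfile(...)` with a fixed name such as "data.txt" would write next to the script instead.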
I am trying to merge two subsets of a data frame together, but neither merge nor cbind seems to do exactly what I want. So far I have this:
library(psych)
df1<-NULL
df1$a<-c(1,2,3,4,5)
df1$b<-c(4,5,2,6,1)
df1$c<-c(0,9,0,6,3)
df1$gender<-c(0,0,0,1,1)
df1<-as.data.frame(df1)
male<-subset(df1,gender<1)
male<-male[,-c(4)]
female<-subset(df1,gender>=1)
female<-female[,-c(4)]
merge(corr.test(male)$r,corr.test(female)$r)
My end goal is something like this in every cell:
a b c
a 1/1 -0.6546537/-1 0/-1
....
You can concatenate the entries in both matrices, then just fix the dimensions of the new vector to be the same as the corr.test output using dim<-, aka dim(...) <-.
## Concatenate the entries
strs <- sprintf("%s/%s", round(corr.test(male)$r,2),
round(corr.test(female)$r, 2))
## Set the dimensions
dim(strs) <- c(3,3)
## Or (to have the value returned at the same time)
`dim<-`(strs, c(3, 3))
# [,1] [,2] [,3]
# [1,] "1/1" "-0.65/-1" "0/-1"
# [2,] "-0.65/-1" "1/1" "0.76/1"
# [3,] "0/-1" "0.76/1" "1/1"
Another trick, if you want to keep the row names and column names from the output of corr.test and not have to worry about dimensions, is to overwrite the entries of one result matrix in place:
## Get one result
ctest <- corr.test(male)$r
## Concatenate
strs <- sprintf("%s/%s", round(ctest,2),
round(corr.test(female)$r, 2))
## Overwrite the matrix with the strings
ctest[] <- strs
ctest
# a b c
# a "1/1" "-0.65/-1" "0/-1"
# b "-0.65/-1" "1/1" "0.76/1"
# c "0/-1" "0.76/1" "1/1"
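For readers without the psych package, the same idea can be sketched with base R. This version swaps corr.test(...)$r for stats::cor(), on the assumption that both give the same Pearson coefficients for this data; the variable names are illustrative.

```r
df1 <- data.frame(a = c(1, 2, 3, 4, 5),
                  b = c(4, 5, 2, 6, 1),
                  c = c(0, 9, 0, 6, 3),
                  gender = c(0, 0, 0, 1, 1))

# Split by gender and drop the grouping column
male   <- subset(df1, gender < 1,  select = -gender)
female <- subset(df1, gender >= 1, select = -gender)

r_male   <- cor(male)    # base-R stand-in for corr.test(male)$r
r_female <- cor(female)  # base-R stand-in for corr.test(female)$r

# Assigning into combined[] keeps the dim and dimnames of r_male
combined <- r_male
combined[] <- sprintf("%s/%s", round(r_male, 2), round(r_female, 2))
combined
```

Because the assignment targets `combined[]` rather than `combined`, only the cell contents change and the a/b/c labels survive.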
I am trying to understand the usage of the drop() function. I read in the documentation that a matrix or array can be the input object for the function; however, the size of the matrix or array does not change. Can someone explain its actual usage and how it works?
I am using R version 3.2.1. Code snippet:
data1 <- matrix(data=(1:10),nrow=1,ncol=1)
drop(data1)
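As a small illustration, drop() deletes dimensions whose extent is 1, so the number of elements stays the same and only the dim attribute changes. The objects below are made up for the example.

```r
m <- matrix(1:10, nrow = 1)        # a 1 x 10 matrix
dim(m)                             # 1 10
v <- drop(m)                       # the length-1 row dimension is removed
dim(v)                             # NULL: v is now a plain vector
length(v)                          # 10: same number of elements as before

a <- array(1:6, dim = c(1, 2, 3))  # 1 x 2 x 3 array
dim(drop(a))                       # 2 3

sq <- matrix(1:4, nrow = 2)        # no dimension of extent 1
identical(drop(sq), sq)            # TRUE: nothing to drop
```

This would explain why the element count appears unchanged: drop() never removes data, only redundant dimensions.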
R has factors, which are very cool (and somewhat analogous to labeled levels in Stata). Unfortunately, the factor list sticks around even if you remove some data such that no examples of a particular level still exist.
# Create some fake data
x <- as.factor(sample(head(colors()),100,replace=TRUE))
levels(x)
x <- x[x!="aliceblue"]
levels(x) # still the same levels
table(x) # even though one level has 0 entries!
The solution is simple: run factor() again:
x <- factor(x)
levels(x)
If you need to do this on many factors at once (as is the case with a data.frame containing several columns of factors), use drop.levels() from the gdata package:
library(gdata)  # provides drop.levels()
x <- x[x!="antiquewhite1"]
df <- data.frame(a=x,b=x,c=x)
df <- drop.levels(df)
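If installing gdata is not an option, base R's droplevels() appears to cover the same case for both single factors and data frames. A minimal sketch with a small made-up factor rather than the random sample above:

```r
x <- factor(c("white", "aliceblue", "antiquewhite", "white", "aliceblue"))
df <- data.frame(a = x, b = x)

# Remove the rows for one level; the level itself is still registered
df <- df[df$a != "aliceblue", ]
levels(df$a)   # still lists "aliceblue"

# droplevels() removes unused levels from every factor column
df <- droplevels(df)
levels(df$a)
```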
An R matrix is a two-dimensional array. R has many operators and functions that make matrix handling very convenient.
Matrix assignment:
>A <- matrix(c(3,5,7,1,9,4),nrow=3,ncol=2,byrow=TRUE)
>A
[,1] [,2]
[1,] 3 5
[2,] 7 1
[3,] 9 4
Matrix row and column count:
>rA <- nrow(A)
>rA
[1] 3
>cA <- ncol(A)
>cA
[1] 2
The t(A) function returns the transpose of A:
>B <- t(A)
>B
[,1] [,2] [,3]
[1,] 3 7 9
[2,] 5 1 4
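Element-wise multiplication (the * operator multiplies corresponding entries; it is not the matrix product, which R writes as %*%):
>C <- A * A
>C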
[,1] [,2]
[1,] 9 25
[2,] 49 1
[3,] 81 16
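Note that * works element by element. For the matrix product in the linear-algebra sense, R uses the %*% operator, and the inner dimensions must agree. A short sketch using the same A, where P is an illustrative name:

```r
A <- matrix(c(3, 5, 7, 1, 9, 4), nrow = 3, ncol = 2, byrow = TRUE)

# t(A) is 2 x 3 and A is 3 x 2, so the product is 2 x 2
P <- t(A) %*% A
P
#      [,1] [,2]
# [1,]  139   58
# [2,]   58   42
```

A %*% A would fail here, because a 3 x 2 matrix cannot be multiplied by another 3 x 2 matrix.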
Matrix addition:
>C <- A + A
>C
[,1] [,2]
[1,] 6 10
[2,] 14 2
[3,] 18 8
Matrix subtraction (-) and division (/) operations ... ...
Sometimes a matrix needs to be sorted by a specific column, which can be done using the order() function.
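As a quick in-memory sketch of the idea, order() returns the row positions in sorted order, and those positions are then used to index the rows. The matrix and variable names here are made up for illustration.

```r
m <- matrix(c(3, 5, 7, 1, 9, 4), nrow = 3, ncol = 2, byrow = TRUE)

# Sort rows by the second column, ascending
sorted <- m[order(m[, 2]), ]

# Same column, descending
desc <- m[order(m[, 2], decreasing = TRUE), ]
```

Here `sorted` holds the rows (7,1), (9,4), (3,5) in that order.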
The following is an example CSV file:
,t1,t2,t3,t4,t5,t6,t7,t8
r1,1,0,1,0,0,1,0,2
r2,1,2,5,1,2,1,2,1
r3,0,0,9,2,1,1,0,1
r4,0,0,2,1,2,0,0,0
r5,0,2,15,1,1,0,0,0
r6,2,2,3,1,1,1,0,0
r7,2,2,3,1,1,1,0,1
The following R code reads the file above into a data frame, sorts it by its fourth column (t3, since the first column holds the row labels), and then writes the result to an output file:
x <- read.csv("sortmatrix.csv", header=TRUE, sep=",")
x <- x[order(x[,4]),]
write.table(x, file="tp.txt", sep=",")
The result is:
"X","t1","t2","t3","t4","t5","t6","t7","t8"
"1","r1",1,0,1,0,0,1,0,2
"4","r4",0,0,2,1,2,0,0,0
"6","r6",2,2,3,1,1,1,0,0
"7","r7",2,2,3,1,1,1,0,1
"2","r2",1,2,5,1,2,1,2,1
"3","r3",0,0,9,2,1,1,0,1
"5","r5",0,2,15,1,1,0,0,0
DROP FUNCTION supports natively compiled, scalar user-defined functions.
Removes one or more user-defined functions from the current database.
To execute DROP FUNCTION, at a minimum, a user must have ALTER permission on the schema to which the function belongs, or CONTROL permission on the function.
DROP FUNCTION will fail if there are Transact-SQL functions or views in the database that reference this function and were created by using SCHEMABINDING, or if there are computed columns, CHECK constraints, or DEFAULT constraints that reference the function.
DROP FUNCTION will fail if there are computed columns that reference this function and have been indexed.
DROP FUNCTION { [ schema_name. ] function_name } [ ,...n ]
In R, when I select only one column from a data frame/matrix, the result becomes a vector and loses the column names. How can I keep the column names?
For example, if I run the following code,
x <- matrix(1,3,3)
colnames(x) <- c("test1","test2","test3")
x[,1]
I will get
[1] 1 1 1
Actually, I want to get
test1
[1,] 1
[2,] 1
[3,] 1
The following code gives me exactly what I want; however, is there an easier way to do this?
x <- matrix(1,3,3)
colnames(x) <- c("test1","test2","test3")
y <- as.matrix(x[,1])
colnames(y) <- colnames(x)[1]
y
Use the drop argument:
> x <- matrix(1,3,3)
> colnames(x) <- c("test1","test2","test3")
> x[,1, drop = FALSE]
test1
[1,] 1
[2,] 1
[3,] 1
Another possibility is to use subset:
> subset(x, select = 1)
test1
[1,] 1
[2,] 1
[3,] 1
The question mentions 'matrix or dataframe' as an input. If x is a data frame, use list-subsetting notation, which keeps the column name and does NOT simplify by default:
x <- matrix(1,3,3)
colnames(x) <- c("test1","test2","test3")
x <- as.data.frame(x)
x[,1]   # matrix-style subsetting: simplifies to a plain vector
x[1]    # list-style subsetting: stays a one-column data frame named test1
Data frames possess the characteristics of both lists and matrices: if you subset with a single vector, they behave like lists; if you subset with two vectors, they behave like matrices.
There's an important difference if you select a single column: matrix subsetting simplifies by default, list subsetting does not.
Source: see http://adv-r.had.co.nz/Subsetting.html#subsetting-operators for details.
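To make the quoted difference concrete, here is a small sketch comparing the subsetting styles on a data frame. The data frame mirrors the test1/test2/test3 example, and the result names v, d1, d2, d3 are illustrative.

```r
df <- data.frame(test1 = c(1, 1, 1), test2 = c(1, 1, 1), test3 = c(1, 1, 1))

v  <- df[, 1]                # matrix-style: simplifies to a plain vector
d1 <- df[1]                  # list-style: one-column data frame
d2 <- df[, 1, drop = FALSE]  # matrix-style with simplification switched off
d3 <- df["test1"]            # list-style by name

is.data.frame(v)   # FALSE
is.data.frame(d1)  # TRUE
names(d1)          # "test1"
names(d2)          # "test1"
```

Either `df[1]` or `df[, 1, drop = FALSE]` should keep the column name; the list-style form is simply shorter to type.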