Text to data frame in R

I am having trouble creating a data frame from a text string in R.
My text looks like this:
t1 = "[[1,5,3,4],[3,2,2,1],[19,11,1,1]]"
and I want to turn it into this data frame:
   V1 V2 V3 V4
1   1  5  3  4
2   3  2  2  1
3  19 11  1  1

Combining the suggestions from the comments, you can parse the string as JSON and convert the result to a data frame:
yourDf <- as.data.frame(jsonlite::fromJSON(t1))
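If jsonlite is not available, the same result can be approximated in base R. This is only a sketch: it assumes the string always has the exact shape shown in the question (a flat list of equally long numeric rows, no spaces or nesting), and the names rows, mat and df are made up for the illustration.

```r
t1 <- "[[1,5,3,4],[3,2,2,1],[19,11,1,1]]"

# Drop the outer "[[" and "]]", then split on the "],[" row separators
rows <- strsplit(gsub("^\\[\\[|\\]\\]$", "", t1), "\\],\\[")[[1]]

# Split each row on commas, convert to numeric, and bind the rows into a matrix
mat <- do.call(rbind, lapply(strsplit(rows, ","), as.numeric))

# An unnamed matrix becomes a data frame with columns V1, V2, ...
df <- as.data.frame(mat)
print(df)
```

For anything beyond this simple shape, a real JSON parser such as jsonlite::fromJSON is the safer choice.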

Related

How to append the column in R?

Consider the following data named mydata. My intention is to put v1 and v2 in the same column by adding an identifier variable v4.
id v1 v2
1 2 3
2 4 5
3 7 8
OUTPUT required:
id v3 v4
1 2 1
2 4 1
3 7 1
1 3 2
2 5 2
3 8 2
Any help is much appreciated!
I think you are looking for something like dplyr::mutate() for adding columns, and rbind() for stacking two data frames on top of each other.
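library(dplyr)
mydata <- data.frame(id = c(1, 2, 3),
                     v1 = c(2, 4, 7),
                     v2 = c(3, 5, 8))
# Keep id and v1, tag the rows with v4 = 1, and rename to the target names
a <- data.frame(mydata$id, mydata$v1) %>%
  mutate(v4 = 1) %>%
  rename(v3 = mydata.v1, id = mydata.id)
# Same for v2, tagged with v4 = 2
b <- data.frame(mydata$id, mydata$v2) %>%
  mutate(v4 = 2) %>%
  rename(v3 = mydata.v2, id = mydata.id)
# Stack the two pieces on top of each other
rbind(a, b)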
id v3 v4
1 1 2 1
2 2 4 1
3 3 7 1
4 1 3 2
5 2 5 2
6 3 8 2
What about this:
mydata <- data.frame(c(1,2,3),c(2,4,7),c(3,5,8))
colnames(mydata) <- c("id","v1","v2")
mydata_2 <- rbind(mydata[,c(1,2)], setNames(mydata[,c(1,3)], names(mydata[,c(1,2)])))
mydata_2$v4 <- c(rep(1,length(mydata$v1)),rep(2,length(mydata$v2)))
colnames(mydata_2) <- c("id","v3","v4")
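A data.table option, with df holding the data from the question:
library(data.table)
df <- data.frame(id = c(1, 2, 3), v1 = c(2, 4, 7), v2 = c(3, 5, 8))
setcolorder(
  transform(
    setnames(melt(setDT(df), id.vars = "id", variable.name = "v4"), "value", "v3"),
    v4 = as.numeric(factor(v4))
  ), c("id", "v3", "v4")
)[]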
gives
id v3 v4
1: 1 2 1
2: 2 4 1
3: 3 7 1
4: 1 3 2
5: 2 5 2
6: 3 8 2
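For comparison, the same stacking can be written in base R without any packages. This is a sketch rather than a replacement for the answers above; value_cols and long are names invented for the example, and it assumes every column listed in value_cols has the same type.

```r
mydata <- data.frame(id = c(1, 2, 3), v1 = c(2, 4, 7), v2 = c(3, 5, 8))

# Columns to stack into a single column (assumed to share one type)
value_cols <- c("v1", "v2")

long <- data.frame(
  # Repeat the ids once per stacked column
  id = rep(mydata$id, times = length(value_cols)),
  # Concatenate the columns in order: all of v1, then all of v2
  v3 = unlist(mydata[value_cols], use.names = FALSE),
  # Identifier: 1 for rows that came from v1, 2 for rows from v2
  v4 = rep(seq_along(value_cols), each = nrow(mydata))
)
print(long)
```

Adding more names to value_cols extends the identifier to 3, 4, and so on without further changes.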

How to convert "heading" rows into new columns

I have data (imported imperfectly from a PDF) that has everything in a single column, with certain rows as descriptive headers. For example:
dfx <- data.frame(V1 = c("Box 1", "abcd10", "bcde15", "Box 2", "cdefg35", "jklm40", "nopq50", "rstu52"))
V1
1 Box 1
2 abcd10
3 bcde15
4 Box 2
5 cdefg35
6 jklm40
7 nopq50
8 rstu52
I want to create a separate column where each observation takes on the value of the nearest heading above it. Like this:
V1 v2
1 abcd10 Box 1
2 bcde15 Box 1
3 cdefg35 Box 2
4 jklm40 Box 2
5 nopq50 Box 2
6 rstu52 Box 2
Nothing I've tried has gotten me close. Any help would be appreciated. Thanks!
One idea using base R:
i1 <- grepl('Box', dfx$V1)
dfx$new <- with(dfx, ave(V1, cumsum(i1), FUN = function(i) i[1]))
subset(dfx, !i1)
# V1 new
#2 abcd10 Box 1
#3 bcde15 Box 1
#5 cdefg35 Box 2
#6 jklm40 Box 2
#7 nopq50 Box 2
#8 rstu52 Box 2
You could also do:
indx <- grepl("^Box \\d+$",dfx$V1)
transform(dfx,v2=V1[indx][cumsum(indx)])[!indx,]
V1 v2
2 abcd10 Box 1
3 bcde15 Box 1
5 cdefg35 Box 2
6 jklm40 Box 2
7 nopq50 Box 2
8 rstu52 Box 2
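Both answers above rely on the same trick: cumsum() over a logical "is this a heading?" vector gives every row the running count of headings seen so far, which works as a group index. The sketch below spells out the intermediate steps; is_header, group, headers and result are names chosen for the illustration, and it assumes the first row is always a heading.

```r
dfx <- data.frame(
  V1 = c("Box 1", "abcd10", "bcde15", "Box 2", "cdefg35", "jklm40", "nopq50", "rstu52"),
  stringsAsFactors = FALSE
)

# TRUE for heading rows, FALSE for ordinary observations
is_header <- grepl("^Box", dfx$V1)

# Running count of headings: 1 1 1 2 2 2 2 2
group <- cumsum(is_header)

# The heading text for each group, in order of appearance
headers <- dfx$V1[is_header]

# Keep only the observation rows and look up each one's heading by group index
result <- data.frame(
  V1 = dfx$V1[!is_header],
  v2 = headers[group[!is_header]],
  stringsAsFactors = FALSE
)
print(result)
```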
Create a V2 column which equals V1 for the Box rows and NA for other rows and then use na.locf0 to fill in the NAs. Finally remove the V1 Box rows.
library(zoo)
isBox <- grepl("Box", dfx$V1)
transform(dfx, V2 = na.locf0(replace(V1, !isBox, NA)))[ !isBox, ]
giving:
V1 V2
2 abcd10 Box 1
3 bcde15 Box 1
5 cdefg35 Box 2
6 jklm40 Box 2
7 nopq50 Box 2
8 rstu52 Box 2

Removing character from dataframe

I have this simple code, which generates a data frame. I want to remove the V character from the middle column. Is there any simple way to do that?
Here is some test code that is very similar to the actual code (which is very long):
mat1=matrix(c(1,2,3,4,5,"V1","V2","V3","V4","V5",1,2,3,4,5), ncol=3)
mat=as.data.frame(mat1)
colnames(mat)=c("x","row","y")
mat
This is the data frame:
x row y
1 1 V1 1
2 2 V2 2
3 3 V3 3
4 4 V4 4
5 5 V5 5
I just want to remove the V's like this:
x row y
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
We can use str_replace from stringr
library(stringr)
mat$row <- str_replace(mat$row, "V", "")
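A base R version of the same idea, in case stringr is not installed. This is a sketch: it assumes every value in the column is a single leading "V" followed by digits, and the conversion to integer is an optional extra that the question did not ask for.

```r
mat1 <- matrix(c(1, 2, 3, 4, 5, "V1", "V2", "V3", "V4", "V5", 1, 2, 3, 4, 5), ncol = 3)
mat <- as.data.frame(mat1)
colnames(mat) <- c("x", "row", "y")

# sub() removes the first match only; "^V" anchors it to a leading V
mat$row <- sub("^V", "", mat$row)

# Optional: the column is still text at this point, so convert it if numbers are needed
mat$row <- as.integer(mat$row)
print(mat)
```

Note that x and y are also text here, because the original matrix mixed numbers and strings.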

How to BiCluster with constant values in columns - in R

My problem in general:
I have a data frame in which I would like to find all bi-clusters with constant values in columns.
For example, the initial data frame:
> df
v1 v2 v3
1 0 2 1
2 1 3 2
3 2 4 3
4 3 3 4
5 4 2 3
6 5 2 4
7 2 2 3
8 3 1 2
And, for example, I would like to find a cluster like this:
> cluster1
v1 v3
1 2 3
2 2 3
I tried the biclust package and tested several of its functions, but the result was never what I want to achieve.
I figured out that I might be able to use the BCPlaid function with fit.model = y ~ m, but that also seems to produce different results.
Is there a way to achieve this task efficiently?
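To make the target concrete, here is a brute-force sketch that does not use the biclust package at all. It only looks at pairs of columns and reports every set of at least two rows that share identical values in both columns. The function name find_constant_pairs and its min_rows argument are invented for this illustration, and the approach is not efficient for wide data, so treat it as a statement of the problem rather than a solution to it.

```r
df <- data.frame(
  v1 = c(0, 1, 2, 3, 4, 5, 2, 3),
  v2 = c(2, 3, 4, 3, 2, 2, 2, 1),
  v3 = c(1, 2, 3, 4, 3, 4, 3, 2)
)

# Hypothetical helper: for each pair of columns, group the rows by their
# combined values and keep the groups that contain at least min_rows rows
find_constant_pairs <- function(data, min_rows = 2) {
  out <- list()
  for (cols in combn(names(data), 2, simplify = FALSE)) {
    # One key per row, built from the values in the two columns
    key <- do.call(paste, c(data[cols], sep = "\r"))
    for (rows in split(seq_len(nrow(data)), key)) {
      if (length(rows) >= min_rows) {
        out[[length(out) + 1]] <- data[rows, cols]
      }
    }
  }
  out
}

clusters <- find_constant_pairs(df)
print(clusters)
```

On the example data this finds the v1/v3 cluster from the question (rows 3 and 7) and also a v2/v3 cluster (rows 5 and 7).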

The which command is returning an error, what is an alternative?

I have 2 data frames
D1 =  V1 V2 V3 V4
       1  2  3  4
       2  3  4  5
       3  5  4  2
D2 =  V1 V2 V3
       1  2  3
       3  5  4
I am trying to match the two data frames and extract the index of the row in D2 that matches a given row of D1 using which, but I get this error:
which(D2[,1:3]==D1[3,1:3])
Error in Ops.data.frame : ‘==’ only defined for equally-sized data frames
(If I write the comparison out column by column,
which(D2[,1]==D1[3,1] & D2[,2]==D1[3,2] & D2[,3]==D1[3,3])
there is no problem, but I want to generalise it.)
Please suggest an alternative.
This does the trick:
which(apply(D2, 1, function(x) all(D1[3,1:3] == x)))
[1] 2
Data:
D1 <- read.table(text="V1 V2 V3 V4
1 2 3 4
2 3 4 5
3 5 4 2", header=T)
D2 <- read.table(text="V1 V2 V3
1 2 3
3 5 4", header=T)
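To generalise the one-liner to every row of D1, it can be wrapped in a small helper. This is a sketch: match_row and idx are names made up for the example, the columns are matched by name, and only the first matching row of D2 is reported (NA when there is none).

```r
D1 <- data.frame(V1 = c(1, 2, 3), V2 = c(2, 3, 5), V3 = c(3, 4, 4), V4 = c(4, 5, 2))
D2 <- data.frame(V1 = c(1, 3), V2 = c(2, 5), V3 = c(3, 4))

# Hypothetical helper: indices of the rows in candidates that equal target
# on all of the columns that candidates has
match_row <- function(target, candidates) {
  target <- unlist(target[names(candidates)])
  which(apply(candidates, 1, function(r) all(r == target)))
}

# For each row of D1, the index of the first matching row in D2, or NA
idx <- sapply(seq_len(nrow(D1)), function(i) {
  m <- match_row(D1[i, ], D2)
  if (length(m) > 0) m[1] else NA_integer_
})
print(idx)
```

Here rows 1 and 3 of D1 match rows 1 and 2 of D2, and row 2 of D1 has no match.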
