Copy a row but with some modifications - r

I have a large data set like this:
SUB SMOKE AMT MDV ADDL II EVID
1 0 0 0 0 0 0
1 0 20 0 16 24 1
1 0 0 0 0 0 0
1 0 0 0 0 0 0
2 1 0 0 0 0 0
2 1 50 0 24 12 1
2 1 0 0 0 0 0
2 1 0 0 0 0 0
...
I want to copy each row where EVID=1 and insert the copy directly below it. In the copied row, AMT, ADDL, II and EVID should all equal 0, while SMOKE and MDV stay the same. The expected output should look like this:
SUB SMOKE AMT MDV ADDL II EVID
1 0 0 0 0 0 0
1 0 20 0 16 24 1
1 0 0 0 0 0 0
1 0 0 0 0 0 0
1 0 0 0 0 0 0
2 1 0 0 0 0 0
2 1 50 0 24 12 1
2 1 0 0 0 0 0
2 1 0 0 0 0 0
2 1 0 0 0 0 0
...
Does anyone have an idea how to do this?

# repeat EVID=0 rows 1 time and EVID=1 rows 2 times
r <- rep(1:nrow(DF), DF$EVID + 1)
DF2 <- DF[r, ]
# insert zeros
DF2[duplicated(r), c("AMT", "ADDL", "II", "EVID")] <- 0
giving:
> DF2
SUB SMOKE AMT MDV ADDL II EVID
1 1 0 0 0 0 0 0
2 1 0 20 0 16 24 1
2.1 1 0 0 0 0 0 0
3 1 0 0 0 0 0 0
4 1 0 0 0 0 0 0
5 2 1 0 0 0 0 0
6 2 1 50 0 24 12 1
6.1 2 1 0 0 0 0 0
7 2 1 0 0 0 0 0
8 2 1 0 0 0 0 0

Maybe this (note that the new rows are appended at the end rather than inserted below each EVID=1 row):
> t2 <- t[t$EVID==1,] # t is your data.frame
> t2[c("AMT","ADDL","II","EVID")] <- 0
> t2
SUB SMOKE AMT MDV ADDL II EVID
2 1 0 0 0 0 0 0
6 2 1 0 0 0 0 0
> rbind(t,t2)
SUB SMOKE AMT MDV ADDL II EVID
1 1 0 0 0 0 0 0
2 1 0 20 0 16 24 1
3 1 0 0 0 0 0 0
4 1 0 0 0 0 0 0
5 2 1 0 0 0 0 0
6 2 1 50 0 24 12 1
7 2 1 0 0 0 0 0
8 2 1 0 0 0 0 0
21 1 0 0 0 0 0 0 # this row
61 2 1 0 0 0 0 0 # and this one are new
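The rbind() approach can be made to place each copy directly below its original by reordering the rows afterwards. The following is a minimal sketch, assuming the data frame is called DF as in the first answer; the fractional sort key (idx + 0.5) and the names copies, combined and result are illustrative choices, not part of either answer.

```r
# Rebuild the example data from the question
DF <- data.frame(
  SUB   = rep(1:2, each = 4),
  SMOKE = rep(0:1, each = 4),
  AMT   = c(0, 20, 0, 0, 0, 50, 0, 0),
  MDV   = 0,
  ADDL  = c(0, 16, 0, 0, 0, 24, 0, 0),
  II    = c(0, 24, 0, 0, 0, 12, 0, 0),
  EVID  = c(0, 1, 0, 0, 0, 1, 0, 0)
)

idx <- which(DF$EVID == 1)                    # positions of the rows to copy
copies <- DF[idx, ]
copies[c("AMT", "ADDL", "II", "EVID")] <- 0   # zero out the dosing columns

combined <- rbind(DF, copies)
# Give each copy a sort key just after its original row (hypothetical trick)
ord <- order(c(seq_len(nrow(DF)), idx + 0.5))
result <- combined[ord, ]
rownames(result) <- NULL
result
```

Compared with the rep()/duplicated() answer this takes an extra step, but it keeps the "copy, modify, bind" logic explicit.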


Add a new column generated from predict() to a list of dataframes

I have a logistic regression model. I would like to predict the morphology of items in multiple dataframes that have been put into a list.
I have lots of dataframes (most say working with a list of dataframes is better).
I need help with:
1. Applying the predict function to a list of dataframes.
2. Adding these predictions to their corresponding dataframe inside the list.
I am not sure whether it is better to have the 1000 dataframes separately and predict using loops etc, or to continue having them inside a list.
Prior to this code I have split my data into train and test sets. I then trained the model using:
library(nnet)
#Training the multinomial model
multinom_model <- multinom(Morphology ~ ., data=morph, maxit=500)
#Checking the model
summary(multinom_model)
This was then followed by validation etc.
My new dataset, consisting of multiple dataframes stored in a list, called rose.list was formatted by the following:
filesrose <- list.files(pattern = "_rose.csv")
#Rename all files of rose dataset 'rose.i'
for (i in seq_along(filesrose)) {
assign(paste("rose", i, sep = "."), read.csv(filesrose[i]))
}
#Make a list of the dataframes
rose.list <- lapply(ls(pattern="rose."), function(x) get(x))
I have been using this call to predict on a single new dataframe:
# Predicting the classification for individual datasets
rose.1$Morph <- predict(multinom_model, newdata=rose.1, "class")
Which gives me the dataframe, with the new prediction column 'Morph'
But how would I do this for multiple dataframes in my rose.list? I have tried:
lapply(rose.list, predict(multinom_model, "class"))
Error in eval(predvars, data, env) : object 'Area' not found
I have also tried this, which fails with a different error:
lapply(rose.list, predict(multinom_model, newdata = rose.list, "class"))
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows:
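lapply() expects a function as its second argument, but predict(multinom_model, "class") is a call that R evaluates immediately, with "class" taken as newdata, hence the errors. Wrap the call in an anonymous function instead (written function(x) or, abbreviated, \(x)), so that each data frame in the list is passed as newdata in turn.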
library(nnet)
multinom_model <- multinom(low ~ ., birthwt)
lapply(df_list, \(x) predict(multinom_model, newdata=x, type='class'))
# $rose_1
# [1] 1 0 1 1 0 0 0 1 0 1 1 1 0 0 1 1 0 0 1 0 0 1 0 0 0 1 0 0 0 0 1 1 1 0 0 1 0 1 0
# [40] 1 0 0 0 0 0 1 1 1 0 1 1 0 1 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 1 1 1 1 1 0 0 1
# [79] 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0
# [118] 1 0 0 1 1 0 1 0 0 0 1 1 0 1 1 1 0 1 0 1 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 1
# [157] 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 1 0 1 1 1 1 0 0 1
# Levels: 0 1
#
# $rose_2
# [1] 0 1 0 1 1 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 0 1 1 0 1 1 1 1 0 0 1 0 0 1 0 1 1 0 1
# [40] 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 1 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1
# [79] 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0
# [118] 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0 1 1 0 0 0 1 0 0 1 0 0 0 1 0
# [157] 0 0 0 1 1 1 1 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0
# Levels: 0 1
#
# $rose_3
# [1] 0 0 0 0 1 1 0 1 1 0 0 1 0 0 0 0 1 1 1 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 1 1 0 0 1
# [40] 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 1 1 1 1 0 0 0 1 0 1 0 1 1 0 1 0 0 1 0 0 0 0 1 1
# [79] 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 1
# [118] 0 0 0 0 1 0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 1 1 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0
# [157] 0 1 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 1 0 0 0 0
# Levels: 0 1
Update:
To add the predictions as a new column to each data frame in the list, modify the code like so:
res <- lapply(df_list, \(x) cbind(x, pred=predict(multinom_model, newdata=x, type="class")))
lapply(res, head)
# $rose_1
# low age lwt race smoke ptl ht ui ftv bwt pred
# 136 0 24 115 1 0 0 0 0 2 3090 0
# 154 0 26 133 3 1 2 0 0 0 3260 0
# 34 1 19 112 1 1 0 0 1 0 2084 1
# 166 0 16 112 2 0 0 0 0 0 3374 0
# 27 1 20 150 1 1 0 0 0 2 1928 1
# 218 0 26 160 3 0 0 0 0 0 4054 0
#
# $rose_2
# low age lwt race smoke ptl ht ui ftv bwt pred
# 167 0 16 135 1 1 0 0 0 0 3374 0
# 26 1 25 92 1 1 0 0 0 0 1928 1
# 149 0 23 119 3 0 0 0 0 2 3232 0
# 98 0 22 95 3 0 0 1 0 0 2751 0
# 222 0 31 120 1 0 0 0 0 2 4167 0
# 220 0 22 129 1 0 0 0 0 0 4111 0
#
# $rose_3
# low age lwt race smoke ptl ht ui ftv bwt pred
# 183 0 36 175 1 0 0 0 0 0 3600 0
# 86 0 33 155 3 0 0 0 0 3 2551 0
# 51 1 20 121 1 1 1 0 1 0 2296 1
# 17 1 23 97 3 0 0 0 1 1 1588 1
# 78 1 14 101 3 1 1 0 0 0 2466 1
# 167 0 16 135 1 1 0 0 0 0 3374 0
Data:
data('birthwt', package='MASS')
set.seed(42)
df_list <- replicate(3, birthwt[sample(nrow(birthwt), replace=TRUE), ], simplify=FALSE) |>
setNames(paste0('rose_', 1:3))
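On the question of separate data frames versus a list: the files can be read straight into a named list, which avoids assign()/get() altogether. Note also that in ls(pattern="rose.") the dot is a regular-expression wildcard, so on a second run it would match rose.list itself. The sketch below is only an illustration: the temporary directory, the file names and the Area/Perimeter/Ratio columns are made up, and the added Ratio column is a stand-in for the predict() call (no model is fitted here).

```r
# Hypothetical setup: write three small CSV files to a temporary directory
tmp <- file.path(tempdir(), "rose_demo")
dir.create(tmp, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(Area = i * 1:4, Perimeter = i + 1:4),
            file.path(tmp, paste0("sample", i, "_rose.csv")),
            row.names = FALSE)
}

# Read every matching file directly into a list (dot escaped, end anchored)
files <- list.files(tmp, pattern = "_rose\\.csv$", full.names = TRUE)
rose.list <- lapply(files, read.csv)
names(rose.list) <- sub("\\.csv$", "", basename(files))

# Add a column to every data frame; with a fitted model this would be
# d$Morph <- predict(multinom_model, newdata = d, type = "class")
rose.list <- lapply(rose.list, function(d) {
  d$Ratio <- d$Area / d$Perimeter
  d
})
lapply(rose.list, head, 2)
```

Keeping everything in one list means a single lapply() call handles 3 or 1000 files alike.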

comparison.wordcloud error of strwidth(words[i], cex = size[i], ...) : invalid 'cex' value (SOLVED)

I have a term-document matrix tdm1 and I pass it to comparison.cloud() like this:
comparison.cloud(tdm1, random.order=FALSE,
colors = c("#00B2FF", "red", "#FF0099", "#6600CC", "green", "orange", "blue", "brown"),
title.size=1, max.words=50, scale=c(4, 0.5),rot.per=0.4)
However, I get the error "Error in strwidth(words[i], cex = size[i], ...) : invalid 'cex' value".
I am not sure which cex value I am missing.
The tdm1 is as follows:
Docs
Terms anger anticipation disgust fear joy sadness surprise trust
bag 1 0 0 0 0 1 0 0
choices 1 1 0 0 1 2 1 1
limited 1 0 0 0 0 1 0 0
plastic 1 0 0 0 0 1 0 0
provided 1 0 0 0 0 1 0 0
abit 0 1 0 0 1 1 1 1
ai 0 2 0 0 2 1 2 2
always 0 1 0 1 0 0 0 1
amazed 0 1 0 0 1 1 1 1
amount 0 1 0 0 1 0 1 1
app 0 2 0 1 2 1 2 2
area 0 1 0 0 1 0 1 1
areas 0 1 0 0 0 0 0 0
around 0 1 0 0 1 0 1 1
atmosphere 0 1 0 0 1 0 1 1
attended 0 1 0 0 1 1 1 1
back 0 1 0 0 1 1 1 0
basah 0 1 0 1 1 0 1 1
bought 0 1 0 0 0 0 0 0
brands 0 1 0 0 1 0 1 1
bras 0 1 0 1 1 0 1 1
breeze 0 1 0 0 1 1 1 1
buy 0 1 0 1 1 1 1 1
can 0 2 0 0 1 1 1 0
cant 0 1 0 0 0 0 0 0
cashiers 0 1 0 0 1 0 1 1
cbd 0 1 0 0 0 0 0 0
charged 0 1 0 0 1 1 1 1
choose 0 2 0 0 1 0 0 0
chopstick 0 1 0 0 0 0 0 0
classes 0 1 0 0 1 0 0 0
come 0 2 0 0 2 2 2 1
concept 0 4 0 0 3 0 3 3
confused 0 1 0 0 1 1 1 1
contains 0 1 0 0 1 1 1 1
convenient 0 8 0 0 5 1 4 4
cool 0 4 0 0 4 1 4 4
correct 0 1 0 0 1 1 1 1
cream 0 1 0 0 1 0 1 1
cup 0 1 0 0 0 0 0 0
curious 0 1 0 0 0 0 0 0
current 0 1 0 0 1 0 1 1
customer 0 1 0 0 0 0 0 0
cutlery 0 1 0 0 0 0 0 0
doesnt 0 1 0 0 1 0 1 1
dont 0 1 0 0 1 0 1 1
don’t 0 1 0 1 1 1 1 1
download 0 1 0 0 1 1 1 1
drinks 0 1 0 0 1 0 0 0
easy 0 3 0 0 2 0 2 2
eat 0 1 0 0 1 1 1 1
electronic 0 1 0 0 1 1 1 1
eleven 0 1 0 0 1 0 1 1
entering 0 1 0 0 1 1 1 1
ereciept 0 1 0 0 1 1 1 1
especially 0 1 0 0 1 0 1 1
even 0 2 0 2 1 1 1 1
exit 0 1 0 0 1 1 1 1
experience 0 3 0 1 2 1 2 2
explained 0 1 0 0 1 1 1 1
feel 0 1 0 0 1 0 1 1
first 0 1 0 0 1 1 1 1
found 0 1 0 0 1 0 1 1
free 0 1 0 0 1 0 1 1
friends 0 1 0 0 1 1 1 1
fussfree 0 1 0 0 1 0 1 1
gantry 0 1 0 0 1 0 1 1
get 0 2 0 1 2 1 1 1
go 0 3 0 1 3 1 3 3
good 0 2 0 0 2 0 2 2
goods 0 1 0 1 1 1 1 1
goto 0 1 0 0 0 0 0 0
great 0 4 0 0 3 0 3 3
greatly 0 1 0 0 1 0 1 1
hasslefree 0 1 0 0 0 0 0 1
history 0 1 0 0 1 1 1 1
hope 0 1 0 0 1 0 1 1
hour 0 1 0 0 1 0 1 1
hours 0 1 0 0 1 0 1 1
ice 0 1 0 0 1 0 1 1
im 0 1 0 0 1 1 1 0
inside 0 1 0 0 1 1 1 1
items 0 3 0 0 3 2 3 2
jiffy 0 1 0 0 1 1 1 1
just 0 4 0 0 3 2 3 2
large 0 1 0 0 1 0 1 1
leave 0 1 0 0 1 1 1 0
less 0 1 0 0 0 0 0 1
link 0 2 0 0 2 1 2 2
linked 0 1 0 0 1 0 1 1
lots 0 1 0 0 0 0 0 1
love 0 3 0 0 3 1 3 2
lovely 0 1 0 0 1 1 1 1
makes 0 1 0 0 1 0 1 1
making 0 1 0 0 0 0 0 1
many 0 1 0 0 0 0 0 0
method 0 1 0 0 1 0 1 1
methods 0 1 0 0 1 1 1 1
minute 0 2 0 0 1 1 1 2
mrt 0 1 0 1 1 0 1 1
muchneeded 0 1 0 0 1 0 1 1
near 0 1 0 0 1 0 1 1
nearby 0 1 0 0 1 0 1 1
new 0 1 0 0 1 0 1 1
newly 0 1 0 0 0 0 0 0
nice 0 1 0 0 0 0 0 0
noodles 0 1 0 0 0 0 0 0
number 0 1 0 0 1 1 1 1
offers 0 1 0 0 1 0 1 1
often 0 2 0 0 2 2 2 1
opened 0 1 0 0 0 0 0 0
operate 0 1 0 0 1 0 1 1
order 0 1 0 0 1 0 1 1
outlets 0 1 0 0 1 1 1 0
patiently 0 1 0 0 1 1 1 1
pay 0 1 0 0 1 0 1 1
payment 0 3 0 0 3 1 3 3
people 0 1 0 0 0 0 0 0
perfect 0 1 0 0 1 1 1 1
pick 0 4 0 0 3 0 3 3
picking 0 1 0 0 1 1 1 1
prepare 0 1 0 0 0 0 0 0
prices 0 1 0 0 1 0 1 1
product 0 2 0 0 2 2 2 2
products 0 7 0 0 5 1 5 4
promotion 0 1 0 1 1 0 1 1
promotions 0 1 0 0 0 0 0 1
quench 0 1 0 0 1 1 1 1
queue 0 2 0 0 2 1 2 2
quite 0 1 0 0 1 1 1 1
range 0 2 0 0 1 1 1 0
ready 0 1 0 0 1 1 1 1
reasonable 0 1 0 0 1 0 1 1
recieved 0 1 0 0 1 1 1 1
recommend 0 1 0 0 0 0 0 1
reduced 0 1 0 0 1 0 1 1
rush 0 1 0 0 1 1 1 0
salut 0 1 0 0 0 0 0 1
sandwiches 0 1 0 0 1 1 1 1
scan 0 1 0 0 1 0 1 1
see 0 3 0 0 2 0 2 2
seems 0 1 0 0 0 0 0 0
setup 0 1 0 0 0 0 0 1
shop 0 3 0 0 2 1 2 2
shopping 0 4 0 1 4 2 4 4
show 0 1 0 0 1 1 1 1
small 0 1 0 0 0 0 0 0
smu 0 2 0 0 2 0 2 2
snacks 0 3 0 0 3 1 1 1
spent 0 1 0 0 1 0 1 1
staff 0 3 0 1 3 3 3 3
stared 0 1 0 1 1 1 1 1
stop 0 1 0 0 1 1 1 1
store 0 14 0 0 10 5 9 9
stores 0 1 0 0 1 0 1 1
students 0 1 0 0 1 0 0 0
stuff 0 1 0 0 0 0 0 0
super 0 2 0 0 2 0 2 2
sure 0 1 0 0 1 1 1 1
sweets 0 1 0 0 1 0 0 0
take 0 1 0 0 1 1 1 0
technology 0 4 0 0 4 2 4 4
thankfully 0 1 0 0 1 1 1 1
theres 0 1 0 0 0 0 0 0
thirst 0 1 0 0 1 1 1 1
thought 0 1 0 0 0 0 0 0
time 0 2 0 0 2 1 2 2
took 0 1 0 0 0 0 0 1
truly 0 1 0 0 0 0 0 1
unmanned 0 1 0 0 1 1 1 1
use 0 1 0 0 1 0 1 1
used 0 1 0 0 1 0 1 1
useful 0 1 0 0 1 0 0 0
users 0 1 0 0 1 0 1 1
variety 0 4 0 0 4 0 3 3
wait 0 2 0 0 1 0 1 1
waited 0 1 0 0 1 1 1 1
walk 0 2 0 0 1 0 1 1
want 0 1 0 0 0 0 0 0
wanted 0 1 0 0 1 0 1 1
wasnt 0 1 0 0 1 1 1 1
whatever 0 1 0 0 1 0 1 1
wide 0 4 0 0 3 1 2 1
won’t 0 1 0 1 1 1 1 1
worry 0 1 0 1 1 1 1 1
avoid 0 0 0 1 0 0 0 0
away 0 0 0 1 0 0 0 0
better 0 0 0 1 0 0 0 0
choice 0 0 0 1 0 0 0 0
customers 0 0 0 1 0 0 0 0
deceptive 0 0 0 1 0 0 0 0
expired 0 0 0 1 0 0 0 0
listed 0 0 0 1 0 0 0 0
make 0 0 0 1 0 0 0 0
marketing 0 0 0 1 0 0 0 0
minutes 0 0 0 1 0 0 0 0
much 0 0 0 1 0 0 0 0
purchases 0 0 0 1 0 0 0 0
qr 0 0 0 1 0 0 0 0
resulting 0 0 0 1 0 0 0 0
scanner 0 0 0 1 0 0 0 0
screen 0 0 0 1 0 0 0 0
showing 0 0 0 1 0 0 0 0
still 0 0 0 1 0 0 0 0
takes 0 0 0 1 0 0 0 0
tries 0 0 0 1 0 0 0 0
trusted 0 0 0 1 0 0 0 0
trusting 0 0 0 1 0 0 0 0
works 0 0 0 1 0 0 0 0
So I am not sure what the issue is, since there are no NA values.
I hope you can help. Thank you!
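One possible cause, offered as an assumption rather than a confirmed diagnosis: the disgust column above contains only zeros. If comparison.cloud() rescales each column by its column sum (which is my understanding of its internals, not something stated in this thread), a zero-sum column yields NaN sizes even though the input has no NA. The sketch below uses a small made-up matrix m in place of tdm1 and base R only; it shows the NaN appearing and one way to drop empty columns before plotting.

```r
# Small stand-in for tdm1: the "disgust" document has no counts at all
m <- matrix(c(1, 1, 0,    # anger
              0, 0, 0,    # disgust
              0, 1, 2),   # joy
            nrow = 3,
            dimnames = list(Terms = c("bag", "choices", "store"),
                            Docs  = c("anger", "disgust", "joy")))

# Column-wise proportions: dividing by a zero column sum produces NaN
prop <- sweep(m, 2, colSums(m), "/")
any(is.nan(prop))

# Keep only the columns that contain at least one count
m_ok <- m[, colSums(m) > 0, drop = FALSE]
colnames(m_ok)
```

If this is indeed the cause, passing the filtered matrix (with a colors vector of matching length) to comparison.cloud() would be the thing to try.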

How to keep ID in dummyVars()

I would like to transform Gender and Country using one-hot encoding.
With the code below I cannot create a new dataset that still includes the ID:
library(caret)
ID<-1:10
Gender<-c("F","F","F","M","M","F","M","M","F","M")
Country<-c("Mali","France","France","Guinea","Senegal",
"Mali","France","Mali","Senegal","France")
data<-data.frame(ID,Gender,Country)
#One hot encoding
dmy <- dummyVars(" ~Gender+Country", data = data, fullRank = T)
dat_transformed <- data.frame(predict(dmy, newdata = data))
dat_transformed
Gender.M Country.Guinea Country.Mali Country.Senegal
1 0 0 1 0
2 0 0 0 0
3 0 0 0 0
4 1 1 0 0
5 1 0 0 1
6 0 0 1 0
7 1 0 0 0
8 1 0 1 0
9 0 0 0 1
10 1 0 0 0
I want to get a dataset that includes the ID without encoding it:
ID Gender.M Country.Guinea Country.Mali Country.Senegal
1 1 0 0 1 0
2 2 0 0 0 0
3 3 0 0 0 0
4 4 1 1 0 0
5 5 1 0 0 1
6 6 0 0 1 0
7 7 1 0 0 0
8 8 1 0 1 0
9 9 0 0 0 1
10 10 1 0 0 0
You can bind the ID column back onto the encoded data with cbind():
dat_transformed <- cbind(ID,dat_transformed)
dat_transformed
ID Gender.M Country.Guinea Country.Mali Country.Senegal
1 1 0 0 1 0
2 2 0 0 0 0
3 3 0 0 0 0
4 4 1 1 0 0
5 5 1 0 0 1
6 6 0 0 1 0
7 7 1 0 0 0
8 8 1 0 1 0
9 9 0 0 0 1
10 10 1 0 0 0
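The same idea works without caret. As a rough sketch, base R's model.matrix() is swapped in here for dummyVars() (so the column names come out as GenderM, CountryGuinea and so on, not Gender.M), and the ID is taken from the data frame itself rather than from a free-standing vector; the names mm and encoded are illustrative.

```r
data <- data.frame(
  ID      = 1:10,
  Gender  = c("F", "F", "F", "M", "M", "F", "M", "M", "F", "M"),
  Country = c("Mali", "France", "France", "Guinea", "Senegal",
              "Mali", "France", "Mali", "Senegal", "France")
)

# Full-rank dummy coding; drop the intercept column that model.matrix adds
mm <- model.matrix(~ Gender + Country, data = data)[, -1]

# Put the untouched ID column in front of the encoded columns
encoded <- data.frame(ID = data$ID, mm)
encoded
```

Pulling ID from data$ID keeps the rows aligned even if the data frame was filtered or reordered after the stand-alone ID vector was created.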

r row names selection using columns

Suppose I have this matrix:
0 1 2 3 4 5 6 98 183 385 419 420 422 423 469 470 35698 35709 35729 37415
0 0 1 1 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1
1 1 0 1 0 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0
2 1 1 0 1 1 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0
3 1 0 1 0 1 1 0 1 1 0 1 1 1 1 0 0 1 0 0 1
4 0 0 1 1 0 1 1 1 0 0 1 1 1 0 0 1 0 1 1 0
5 0 1 0 1 1 0 1 1 0 0 0 1 0 0 0 1 0 0 1 0
6 1 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0
98 0 0 0 1 1 1 1 0 0 0 0 1 0 0 0 1 0 0 1 0
183 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1
385 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
419 0 0 1 1 1 0 0 0 0 0 0 1 1 0 0 0 1 1 0 0
420 0 0 0 1 1 1 0 1 0 0 1 0 0 0 0 0 1 1 0 0
422 0 0 1 1 1 0 0 0 1 0 1 0 0 1 1 0 0 0 0 1
423 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1
469 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
470 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0
35698 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
35709 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
35729 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0
37415 1 0 0 1 0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0
I get a value from another program, say
x=3.
I want the names of the rows where the column named by x (here, column 3) equals 1.
The output would be: 0,2,4,5,98,183,419,420,422,423,35698,37415.
I don't want to hard-code "3" in the command. I want to pass the variable x, so that the output changes when the value changes.
Can anyone help me, please? Thanks in advance.
x=matrix(c(1,1,2,5,6,6,5,7,7,8,3,3,1,9,20,20,4,7,9,5),4,5,dimnames = list(c(letters[1:4]),c(LETTERS[1:5])))
Since you want the row names:
rownames(x)[x[,"D"]==20]
Here 20 is your input value and D is the column being searched.
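To pass the column through a variable, as the question asks, the value has to be converted to a character string first; a bare numeric 3 would select the third column by position, which in the question's matrix is the column named "2". A small sketch, using a made-up 5x5 matrix m (not the one from the question) with the same kind of numeric row and column labels:

```r
ids <- as.character(c(0, 1, 2, 3, 98))
m <- matrix(c(0, 1, 1, 1, 0,
              1, 0, 1, 0, 0,
              1, 1, 0, 1, 0,
              1, 0, 1, 0, 1,
              0, 0, 0, 1, 0),
            nrow = 5, byrow = TRUE,
            dimnames = list(ids, ids))

x <- 3                                  # value received from elsewhere

# Look the column up by NAME, not by position
hits <- rownames(m)[m[, as.character(x)] == 1]
hits

# For contrast: m[, x] picks the 3rd column, which is the one named "2"
by_position <- rownames(m)[m[, x] == 1]
by_position
```

If the numeric result is needed, as.numeric(hits) converts the row names back.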

how to convert a matrix of values into a binary matrix

I'd like to convert a matrix of values into a matrix of 'bits'.
I have been looking for solutions and found this, which seems to be part of a solution.
I'll try to explain what I am looking for.
I have a matrix like
> x<-matrix(1:20,5,4)
> x
[,1] [,2] [,3] [,4]
[1,] 1 6 11 16
[2,] 2 7 12 17
[3,] 3 8 13 18
[4,] 4 9 14 19
[5,] 5 10 15 20
which I would like to convert into
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
2 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
3 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
4 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
5 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
so for each value in the row a "1" in the corresponding column.
If I use
> table(sequence(length(x)),t(x))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
5 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
7 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
9 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
11 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
13 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
14 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
15 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
17 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
18 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
this is close to what I am looking for, but returns a line for each value.
I would only need to consolidate all values from one row into one row.
A plain call to table()
> table(x)
x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
gives the counts for the whole matrix, so what do I need to do to get the values per row?
Here is another option using table() function:
table(row(x), x)
# x
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
# 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
# 2 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
# 3 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
# 4 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
# 5 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
bit_x = matrix(0, nrow = nrow(x), ncol = max(x))
for (i in 1:nrow(x)) {bit_x[i,x[i,]] = 1}
Let
(x <- matrix(c(1, 3), 2, 2))
[,1] [,2]
[1,] 1 1
[2,] 3 3
One approach would be
M <- matrix(0, nrow(x), max(x))
M[cbind(c(row(x)), c(x))] <- 1
M
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 0 0 1
In one line:
replace(matrix(0, nrow(x), max(x)), cbind(c(row(x)), c(x)), 1)
Following your approach, and similarly to @Psidom's suggestion:
table(rep(1:nrow(x), ncol(x)), x)
# x
# 1 3
# 1 2 0
# 2 0 2
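As the 2s above show, table() returns counts, so a value that occurs twice in a row is not reduced to a single bit. If strict 0/1 output is wanted, the counts can be thresholded. A short sketch on the same 2x2 example (the names counts and bits are illustrative):

```r
x <- matrix(c(1, 3), 2, 2)

counts <- table(row(x), x)            # per-row counts of each value
# Turn counts into bits: TRUE/FALSE from the comparison, then back to integers
bits <- (unclass(counts) > 0) + 0L
bits
```

The matrix-indexing answers (M[cbind(...)] <- 1) give 0/1 directly, so this step is only needed with the table()-based variants.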
We can use the reshape2 package.
library(reshape2)
# At first we make the matrix you provided
x <- matrix(1:20, 5, 4)
# then melt it based on first column
da <- melt(x, id.var = 1)
# then cast it
dat <- dcast(da, Var1 ~ value, fill = 0, fun.aggregate = length)
which gives us this
Var1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
2 2 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
3 3 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
4 4 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
5 5 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
