I'm trying to replace a chunk of a dataset A (say, columns 7 to 25 for the rows listed in a vector "rows") with a dataset B of the same size. Every row of B is identical and holds the values contained in the vector "new_values". I tried the following code:
A[rows, 7:25] <- sapply(A[rows, 7:25], replace, values=new_values, list= 1:ncol(A[rows, 7:25]))
It's not working, however. What happens is that the replaced columns in A all end up the same, and each row gets a different value from "new_values".
Any idea how to fix that?
Thank you!
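One possible fix, sketched on made-up data (the 6 x 25 data frame and the contents of "rows" and "new_values" below are assumptions for illustration): build a matrix that repeats new_values once per selected row, filling it row by row, and assign that matrix to the block.

```r
# Toy stand-ins for the real objects (assumed for illustration only)
A <- as.data.frame(matrix(0, nrow = 6, ncol = 25))
rows <- c(2, 4, 5)
new_values <- seq_len(19)   # one value per column 7:25

# byrow = TRUE puts new_values along each row instead of down each column
A[rows, 7:25] <- matrix(new_values,
                        nrow = length(rows),
                        ncol = length(new_values),
                        byrow = TRUE)

A[rows, 7:25]   # every selected row now holds new_values
```

The original attempt goes wrong because sapply works column by column, so new_values ends up running down each column rather than across each row.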
Related
I have a .csv file of 39 variables and 713 rows, each containing a count of plastic items. I have another column which is the survey length, and I want to standardise each count of items by a survey length of 100. I am unsure how to create a loop to run through each row and cell individually to do this. Many also have NA values.
Any ideas would be great.
Thank you.
Consider applying the formula directly to the columns, without any need for a loop:
# RETRIEVE ALL COLUMN NAMES (MINUS SURVEY LENGTH)
vars <- names(df)[!grepl("survey_length", names(df))]
# EXPAND SINGLE COLUMN TO EQUAL DIMENSION OF DATA FRAME
survey_length_mat <- matrix(df$survey_length, ncol=length(vars), nrow=nrow(df))
# APPLY FORMULA
df[vars] <- (df[vars] / survey_length_mat) * 100
df
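As a small worked example of the same idea (the data frame "counts" and its column names are hypothetical), dividing the count columns by the survey_length vector recycles it down every column, and NA counts simply stay NA:

```r
# Hypothetical survey data: two item counts plus the survey length
counts <- data.frame(bottles = c(10, NA, 4),
                     bags = c(2, 6, NA),
                     survey_length = c(50, 200, 100))

# All columns except the survey length
vars <- setdiff(names(counts), "survey_length")

# Standardise every count to a survey length of 100
counts[vars] <- counts[vars] / counts$survey_length * 100
counts
```

This relies on survey_length having exactly one value per row; the explicit matrix above makes the same alignment visible.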
I have a dataframe, for all columns of which I want to calculate paired ratios of rows (for example, row1/row2, row3/row4, row5/row6, etc.) and write the result of calculation to a new dataframe. I decided to wrap it in a function with 3 arguments:
paired_row_rat=function(dataframe,rows,columns){
  # creates new dataframe where number of columns is the same as in dataframe
  # used for calculation, number of rows for paired ratios will be 2 times lower
  ratio_df=data.frame(matrix(nrow=rows/2,ncol=columns))
  # names of columns should be equal in both dataframes
  cln=colnames(dataframe)
  colnames(ratio_df)=cln
  i=seq(1,rows,by=2) # sequence for choosing the first row of calculation
  j=i+1              # for choosing second row of calculation
  # the error appears here, as I am trying to fill new dataframe with ratios
  for (k in 1:nrow(ratio_df)){
    ratio_df[k,]=dataframe[i,]/dataframe[j,]
  }
  return(ratio_df)
}
pmap(list(tula3,24,98),paired_row_rat)
#runs the function for my dataframe with 24 rows and 98 columns
In the resulting dataframe each column has the same values for all rows and I have warnings from R:
warnings()
Warning messages:
1: In [<-.data.frame(*tmp*, k, , value = structure(list( ... :
replacement element 1 has 12 rows to replace 1 rows
I've searched a lot for possible solutions but still can't fix this problem. Something is wrong with the for loop, but I don't understand where the problem is.
Dataframe used for calculation (the result of head(df)):
Assuming the requirement is to calculate the ratio for the pairs row1/row2, row3/row4 and so on,
try this:
as.data.frame(t(sapply(seq(1,(nrow(df)-1),2),function(x,df){df[x,]/df[x+1,]},df)))
where df is your data.frame
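For context, the warning in the question arises because i and j are whole vectors, so dataframe[i,]/dataframe[j,] yields all 12 ratio rows on every pass of the loop; indexing with i[k] and j[k] would select one pair at a time. A vectorised alternative, sketched here on a small made-up data frame (the name "toy" and its values are assumptions), divides the odd rows by the even rows in one step and avoids the loop entirely:

```r
# Made-up data with an even number of rows
toy <- data.frame(a = c(2, 1, 9, 3),
                  b = c(8, 4, 10, 5))

odd  <- seq(1, nrow(toy), by = 2)   # rows 1, 3, ...
even <- odd + 1                     # rows 2, 4, ...

# Element-wise division of two data frames with identical dimensions
ratio_df <- toy[odd, ] / toy[even, ]
rownames(ratio_df) <- NULL
ratio_df
```

Unlike the sapply version, this keeps ordinary numeric columns rather than list columns.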
I looked everywhere but did not find an answer to my question. I am having trouble making a contingency table. I have data with many columns, let's say 1, 2 and 3. In the first column there are, let's say, 100 different values, in the second 20, and the third column has 2 possible values: 0 and 1. First I take just the data with value 1 in column 3 (data<-data[Column3==1,]). Now I have only around 20 different values in column 1 and 5 in column 2. However, when I make a contingency table its size is 100x20, not 20x5, and it contains a lot of zeros (they correspond to combinations of column1 and column2 which have value 0 in column3). I would be grateful for any kind of help, thanks.
I guess all three of your variables are factors, and a factor keeps its unused levels after subsetting. So convert all three variables to character using as.character(), then apply table() to them.
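A minimal sketch of the effect on invented data (the values below are assumptions; only the column names follow the question). It uses droplevels(), which is a swapped-in alternative to as.character() that removes unused factor levels directly:

```r
# Invented data: two factor columns and a 0/1 indicator
data <- data.frame(Column1 = factor(c("a", "b", "c", "a")),
                   Column2 = factor(c("x", "y", "z", "x")),
                   Column3 = c(1, 0, 1, 1))

sub <- data[data$Column3 == 1, ]

# Unused levels "b" and "y" are still present, so the table stays 3 x 3
full_tab <- table(sub$Column1, sub$Column2)

# Dropping unused levels shrinks the table to the combinations that remain
sub <- droplevels(sub)
tab <- table(sub$Column1, sub$Column2)
dim(tab)
```

Converting with as.character() before calling table() should give a table of the same reduced size.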
I have a dataset that has 5 columns and 50 rows. I want to randomly divide it into two parts, one with 35 rows and the other with 15. Then I would like to add another column to this dataset which contains the value TRUE/FALSE: TRUE if the row belongs to the 35 randomly selected rows and FALSE if it belongs to the 15. How do I achieve this in R?
All help is greatly appreciated..
Thanks
We create a vector of 'TRUE/FALSE' elements using rep by specifying the times to replicate the 'TRUE/FALSE' values, sample it, and create a new column ('ind') by assigning the output. Then, split the dataset into a list of 2 data.frames by 'ind' column.
df1$ind <- sample(rep(c(TRUE, FALSE), times = c(35, 15)))
split(df1, df1$ind)
data
set.seed(24)
df1 <- as.data.frame(matrix(sample(9, 50*5, replace=TRUE), ncol=5))
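An equivalent variant, shown only as a sketch (the helper name "picked" is an assumption), draws 35 row numbers directly and flags them, which makes it easy to check the sizes of the two parts:

```r
set.seed(24)
df1 <- as.data.frame(matrix(sample(9, 50 * 5, replace = TRUE), ncol = 5))

# Draw 35 distinct row numbers, then flag those rows as TRUE
picked <- sample(nrow(df1), 35)
df1$ind <- seq_len(nrow(df1)) %in% picked

# List of two data frames, named "FALSE" and "TRUE"
parts <- split(df1, df1$ind)
sapply(parts, nrow)
```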
I am new to R and have a fairly simple question, but I just can't figure out the answer. For my example I will use a data frame with 3 columns, but my actual data set has 139 columns and 10000 rows.
I want to replace all of the values in a given row with NA if the value in column C of that row is < 10.
Assume that all of my columns are either number or integer values.
so I want to take the data frame:
x=data.frame(c(5,9,2),c(3,4,6),c(12,9,11))
names(x)=c("A","B","C")
and replace row 2 with NA to create
y=data.frame(c(5,"NA",2),c(3,"NA",6),c(12,"NA",11))
names(y)=c("A","B","C")
Thanks!
How about:
x[x$C <10 ,] <- NA
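Applied to the example data from the question, this looks as follows. The which() wrapper is an addition not in the answer above: if column C itself contains NA, the logical index contains NA too, which can make the assignment fail, whereas which() keeps only the positions that are TRUE.

```r
x <- data.frame(A = c(5, 9, 2),
                B = c(3, 4, 6),
                C = c(12, 9, 11))

# Blank out every row whose value in column C is below 10
x[which(x$C < 10), ] <- NA
x
```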