Identify and replace data of a column by +1 and -1 - r

I have a one-column xts object:
library(xts)
a <- c(1,1,1,2,3,2,2,2,2,1,0,0,0,0,2,3,4,4,1,1)
date <- Sys.Date()-20:1
data <- xts(a,date)
colnames(data) <- "a"
data
I want every non-zero number in column a to be replaced by +1 and -1 alternately, leaving the zeros unchanged. The a column should look like:
1,-1,1,-1,1,-1,1,-1,1,-1,0,0,0,0,1,-1,1,-1,1,-1
I've asked similar questions, but this is not an exact duplicate.

Assuming your data frame is named df, this recycles the values 1 and -1 across all rows where a is not 0. Note that it relies on the number of non-zero rows being a multiple of 2.
df[df$a != 0, ] <- c(1, -1)
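If the count of non-zero values is odd, plain recycling may fail or warn. A minimal sketch of a more defensive variant, assuming the values sit in a plain numeric vector (the names nz and result are only for illustration; the same index vector should presumably carry over to the xts column, though that is not tested here):

```r
# Same values as in the question, kept as a plain numeric vector.
a <- c(1,1,1,2,3,2,2,2,2,1,0,0,0,0,2,3,4,4,1,1)

# Positions of the non-zero entries.
nz <- which(a != 0)

# Alternate +1, -1 across those positions; length.out copes with an odd count.
result <- a
result[nz] <- rep(c(1, -1), length.out = length(nz))

print(result)
```

The alternation here continues across the run of zeros rather than restarting after it. With this data both readings give the same result, because the first run has an even length.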

Related

How to create a dummy variable based on row numbers in R

I have a 10 (question items) by 500 (respondents) vector in R.
Upper 250 are male while lower 250 are female.
Can you tell me how to create a gender variable, and assign 0 and 1 to this variable based on row numbers in R?
Thank you very much! Stay safe.
This solution assumes your dataset is in a data frame, not a vector, that the dataset is named "dat" (change it to whatever you are calling your data), and that the variable "gender" does not already exist in "dat".
dat$gender <- NA # Creates a new, empty column in the dataset (NA stands for missing data, or not available)
dat[1:250, "gender"] <- "0" # assigns the category 0 to rows 1-250
dat[251:500, "gender"] <- "1" # assigns the category 1 to rows 251-500
Hope this helps! As the comments suggest, providing a sample of your data will help us help you.
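If you would rather have a numeric 0/1 variable than the character values "0" and "1", a small sketch along the same lines, assuming a hypothetical 500-row data frame dat with a placeholder column item1:

```r
# Hypothetical stand-in for the real data: 500 respondents, one item column.
dat <- data.frame(item1 = seq_len(500))

# 0 for the first 250 rows, 1 for the remaining rows, stored as numbers.
dat$gender <- ifelse(seq_len(nrow(dat)) <= 250, 0, 1)

# Quick check of the group sizes.
print(table(dat$gender))
```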

How can I add 1 to all integers in a dataset consisting of integers and NAs (in R)?

I have a dataset consisting of integers and NAs. I want to add 1 (+1) to all of the integers but not to the NAs. How can I do this? Thanks for your help!
Assuming your df is named df and var1 is the column you want to add 1 to:
df$var2 = df$var1 +1
where var2 will be the column having the final result.
P.S. Adding 1 to NA still returns NA.
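A quick way to convince yourself, using a made-up three-row data frame (the values are only for illustration):

```r
# Toy data: an integer column with one missing value.
df <- data.frame(var1 = c(1L, NA, 3L))

# Arithmetic is vectorised, and NA + 1 is still NA.
df$var2 <- df$var1 + 1L

print(df)
```

Adding 1L rather than 1 keeps the result an integer column; adding 1 would give a numeric (double) column with the same values.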

How to subset a data frame by taking only the Non NA values of 2 columns in this data frame

I am trying to subset a data frame by taking the integer values of 2 columns of my data frame
Subs1<-subset(DATA,DATA[,2][!is.na(DATA[,2])] & DATA[,3][!is.na(DATA[,3])])
but it gives me an error : longer object length is not a multiple of shorter object length.
How can I construct a subset which is composed of NON NA values of column 2 AND column 3?
Thanks a lot!
Try this:
Subs1<-subset(DATA, (!is.na(DATA[,2])) & (!is.na(DATA[,3])))
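The second parameter of subset is a logical vector with the same length as nrow(DATA), indicating whether to keep the corresponding row. Your version dropped the NAs from each column first, so the two pieces no longer had matching lengths.
The na.omit function can also answer your question. Note that the call below returns only columns 2 and 3, not the whole data frame.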
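To make the row mask concrete, here is a small sketch with invented data (the column names id, x and y are assumptions, chosen only so that columns 2 and 3 contain NAs):

```r
# Toy data frame: columns 2 and 3 each contain one NA.
DATA <- data.frame(id = 1:4, x = c(1, NA, 3, 4), y = c(NA, 2, 3, 4))

# One TRUE/FALSE per row: TRUE only when both columns are non-missing.
keep <- !is.na(DATA[, 2]) & !is.na(DATA[, 3])

# Keep all columns, but only the rows flagged TRUE.
Subs1 <- DATA[keep, ]

print(keep)
print(Subs1)
```

The same mask can be built with complete.cases(DATA[, 2:3]), which checks only the two columns of interest.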
Subs1 <- na.omit(DATA[2:3])
[https://stat.ethz.ch/R-manual/R-patched/library/stats/html/na.fail.html]
Here is an example.
a, b and c are 3 vectors, of which a and b each have a missing value.
Once they are created, I use cbind to bind them into one matrix, which is then converted to a data frame.
The result is a data frame where 2 out of 3 columns have a missing value.
So we need to keep only the rows with complete cases. DATA[complete.cases(DATA), ] keeps only the rows that have no missing value in any column. The complete_rows object holds those rows.
a <- c(1,NA,2)
b <- c(NA,1,2)
c <- c(1,2,3)
DATA <- as.data.frame(cbind(a,b,c))
complete_rows <- DATA[complete.cases(DATA), ] # named complete_rows to avoid masking the subset() function

remove rows from data frame whose column values don't match another data frame's column values - R

So I have two data frames of different dimensions.
The first one, x, is about 10,000 rows long and looks like:
Year ID Number
2008.1 38573 1
2008.2 24395 3
(a lot of data in between)
2008.4 532 4
The second one, x2, is about 80,000 rows long and looks like:
Year ID Number
2008.1 38573 2
2008.2 24395 3
(a lot of data in between)
2008.4 532 4
Basically, I want to remove the rows in the second data frame that satisfy the following condition: the combination of Year, ID and Number values in the row doesn't match any row of the first data frame. So in the above example, I'd remove row 1 from the second data frame, because the Number doesn't match.
I've tried:
x2new <- x2[(x2$ID == x$ID && x2$Year==x$Year && x2$Number == x$Number),]
But it doesn't work because the lengths of the two data frames are different.
I've tried doing a double for loop to remove rows that don't have all 3 conditions, but R simply can't do that many iterations.
Please help! Thanks.
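A simple merge works here. With no by argument, merge joins on all the columns the two data frames have in common, so only rows that match on Year, ID and Number are kept: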
merge(dat1,dat2)
Using your data for example:
dat1 <- read.table(text='Year,ID,Number
2008.1,38573,1
2008.4,532,4
2008.2,24395,3',header=TRUE,sep=',')
dat2 <- read.table(text='Year,ID,Number
2008.1,38573,2
2008.4,532,4
2008.2,24395,3',header=TRUE,sep=',')
Then you get:
merge(dat1,dat2)
    Year    ID Number
1 2008.2 24395      3
2 2008.4   532      4
As I understand it, you want to remove all the rows where none of the three columns has a match in the first data frame, and keep all the rows where at least one column has a match, right? If so, just do this:
newX2 <- x2[ x2$ID %in% x$ID | x2$Year %in% x$Year | x2$Number %in% x$Number,]
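If instead a row should be kept only when all three values match the same row of x (which is how the question reads), one possible sketch is to build a combined key per row and filter on it. The key_x and key_x2 names are just for illustration, and the three-row data frames below stand in for the real data:

```r
# Small stand-ins for the two data frames in the question.
x <- data.frame(Year = c(2008.1, 2008.2, 2008.4),
                ID = c(38573, 24395, 532),
                Number = c(1, 3, 4))
x2 <- data.frame(Year = c(2008.1, 2008.2, 2008.4),
                 ID = c(38573, 24395, 532),
                 Number = c(2, 3, 4))

# One key per row, built from all three columns.
key_x  <- paste(x$Year,  x$ID,  x$Number,  sep = "_")
key_x2 <- paste(x2$Year, x2$ID, x2$Number, sep = "_")

# Keep only rows of x2 whose full key also appears in x.
x2new <- x2[key_x2 %in% key_x, ]

print(x2new)
```

Unlike merge, this keeps the original row order and row names of x2 and needs no loop.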

Change values in row based on a column value r

I am new to R and have a fairly simple question that I just can't figure out. For my example I will use a data frame with 3 columns, but my actual data set is 139 columns with 10000 rows.
I want to replace all of the values in a given row with NA if the value in column C of that row is less than 10.
Assume that all of my columns are either numeric or integer values.
so I want to take the data frame:
x=data.frame(c(5,9,2),c(3,4,6),c(12,9,11))
names(x)=c("A","B","C")
and replace row 2 with NA to create
y=data.frame(c(5,NA,2),c(3,NA,6),c(12,NA,11))
names(y)=c("A","B","C")
Thanks!
How about:
x[x$C <10 ,] <- NA
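A runnable version on the example data. Wrapping the condition in which() is an addition of mine: as far as I know, data frame assignment rejects an index that itself contains NA, and which() drops those positions, so this variant should also cope with missing values in C.

```r
# Example data from the question.
x <- data.frame(A = c(5, 9, 2), B = c(3, 4, 6), C = c(12, 9, 11))

# Blank out every row whose C value is below 10.
# which() returns row positions and ignores any NA in the comparison.
x[which(x$C < 10), ] <- NA

print(x)
```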
