I have a simple question. I have a vector of years, spanning 1945:2000, with many repeated years. I want to make this an ordinal vector, so that 1945 is changed to 1, 1946 to 2, etc...
Obviously in this case the easiest way is just to subtract 1944 from the vector. But I have to do this with other numeric vectors that are not evenly spaced.
Is there an R function that does this?
You can do:
as.numeric(factor(x))
For example:
x <- sample(1945:2010, 40)
ordinal_x <- as.numeric(as.factor(x))
plot(x, ordinal_x)
Notice that ordinal_x skips the gaps in x.
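As a minimal sketch of why this works on unevenly spaced data: the factor levels are the sorted unique values, so the integer codes are their ranks. The input vector below is made up for illustration, and `match()` against the sorted unique values is shown only as an equivalent base R route.

```r
# Hypothetical input: unevenly spaced values with repeats
x <- c(1950, 1945, 1950, 1973, 1945, 2000)

# Route 1: factor levels are the sorted unique values; the codes are their ranks
ord_factor <- as.integer(factor(x))

# Route 2: look up each value's position among the sorted unique values
ord_match <- match(x, sort(unique(x)))

ord_factor
# [1] 2 1 2 3 1 4
identical(ord_factor, ord_match)
# [1] TRUE
```

Repeated values share the same code, and gaps in the original values do not produce gaps in the codes.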
I need to calculate how many times the first column is greater than or equal to
the second column of a matrix using R.
I have done the following:
set.seed(123)
x = matrix(rnorm(4*4,mean=10,sd=2),nrow=4)
x
x[,1]>x[,2]
But I can't figure out how to count the times that column 1 is greater than column 2. I tried the length function, but it didn't work.
Thank you!
Logicals can be converted to numbers, TRUE to one and FALSE to zero, therefore:
sum(x[,1] > x[,2])
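Since the question asks for "greater than or equal to", a variant with `>=` may be closer to what is wanted. The sketch below reuses the matrix from the question; the names `n_ge` and `share_ge` are made up here.

```r
set.seed(123)
x <- matrix(rnorm(4 * 4, mean = 10, sd = 2), nrow = 4)

# TRUE counts as 1 and FALSE as 0, so sum() counts the rows where the condition holds
n_ge <- sum(x[, 1] >= x[, 2])

# mean() of a logical vector gives the proportion of TRUE values
share_ge <- mean(x[, 1] >= x[, 2])

n_ge
share_ge
```

This also explains why `length()` did not help: it returns the number of comparisons (4 here), not the number that are TRUE.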
I have a column with anomaly values and I want to weight it with a specific number representing the number of years (32).
How can I do this?
data(mtcars)
mtcars$weight<-apply(mtcars[5], 1, ??, 32)
mtcars$weighted <- mtcars$drat * 32
Or, if your weights are different for each observation:
mtcars$weighted <- mtcars$drat * mtcars$cyl
No need for apply, multiplication is already vectorized for your convenience ;)
I would like to ask how I can order the observations in one variable; I need this for my plot. Right now the observations are sorted from 1 to 5, and I need them in the order 5, 3, 1, 2, 4.
For context: this is the x-axis of my plot. I am making a discrete geom_bar and need this ordering to visualize the data better (the y-axis is only a count).
Thanks for any help!
{ggplot2} will reorder numeric and character data. In order to impose an order on your data, you need to
1. convert it to an ordered factor, and
2. impose your desired order.
Luckily this is very easy in a single step using the reorder function:
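# order() turns the desired sequence 5, 3, 1, 2, 4 into per-value scores (5 gets 1, 3 gets 2, ...)
observations = reorder(1 : 5, order(c(5, 3, 1, 2, 4)))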
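An alternative sketch, assuming the desired order is known up front, is to set the levels directly with `factor()`. The data vector here is hypothetical; `table()` is used only to show that counts follow the level order, which is the order a discrete x-axis in ggplot2 would be expected to use.

```r
# Hypothetical data containing the values 1 to 5, some repeated
observations <- c(1, 2, 3, 4, 5, 5, 3, 1)

# The order of `levels` defines the display order
obs_f <- factor(observations, levels = c(5, 3, 1, 2, 4))

levels(obs_f)
# [1] "5" "3" "1" "2" "4"

# Counts are reported in level order, not in numeric order
table(obs_f)
```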
I understand that you have a vector of observations - this should do the trick when "observations" is your vector of values:
observations <- 1:5 # example data
new_order <- observations[c(5,3,1,2,4)]
new_order
[1] 5 3 1 2 4
I have a data frame with 253 rows (locations on a chromosome, in Mbps) and 1 column (the allele score at each location). I need to produce a data frame that contains the mean allele score for every 0.5 Mbps along the chromosome. Please help with R code that can do this. Thanks.
The picture in this case is adequate to construct an answer but not adequate to support testing. You should learn to post data in a form that doesn't require re-entry by hand. (That's why you are accumulating negative votes.)
The basic R strategy would be to use cut to create a grouping variable and then use a loop construct to accumulate and apply the mean function. Presumably this is in a dataframe which I will assume is named something specific like my_alleles:
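tapply(my_alleles$Allele_score,   # act on this vector
       # in groups defined by this factor (0.5-wide bins);
       # the upper break is rounded up so the largest location is not dropped
       cut(my_alleles$Location,
           breaks = seq(0, ceiling(max(my_alleles$Location) * 2) / 2, by = 0.5),
           include.lowest = TRUE),
       # with this function
       FUN = mean)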
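Because the question's data is not available, here is a self-contained sketch on simulated values. The column names `Location` and `Allele_score`, the 0 to 60 range, and the names `upper`, `bins`, and `result` are all assumptions made for illustration.

```r
set.seed(1)

# Simulated stand-in for the real data: 253 locations and one score per location
my_alleles <- data.frame(
  Location = sort(runif(253, min = 0, max = 60)),
  Allele_score = runif(253)
)

# Round the upper limit up to the next multiple of 0.5 so every location falls in a bin
upper <- ceiling(max(my_alleles$Location) * 2) / 2
bins <- cut(my_alleles$Location,
            breaks = seq(0, upper, by = 0.5),
            include.lowest = TRUE)

# Mean score per bin; bins with no observations come back as NA
bin_means <- tapply(my_alleles$Allele_score, bins, mean)

# Collect the result in a data frame, one row per 0.5-wide bin
result <- data.frame(bin = names(bin_means), mean_score = as.vector(bin_means))
head(result)
```

Empty bins can be dropped afterwards with `result[!is.na(result$mean_score), ]` if they are not wanted.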
I'm running through an old piece of R code I had and making it run more efficiently as a learning exercise.
I have a matrix which has 366 rows, representing each day of the year (probMatrix).
I have another which has 7 rows, representing each day of the week (from Monday). Both of these matrices have 10 columns.
The second matrix contains booleans for each day of the week that I want to multiply through the first matrix at the relevant elements (by row).
Finally, because the first Monday of 2016 fell on the fourth day of the year, the second matrix needs to be offset by three rows so that each row is multiplied into the correct days.
I originally had a for loop that iterates through each day of the week and sweeps the probability matrix, using a vector of the row indices in the first matrix that correspond to that day of the week:
probMatrix <- matrix(rep((rep(1,366)), 10), ncol=10,byrow=TRUE)
booleanMatrix <- matrix(rep(c(0,0,0,0,1), 14), ncol=10, byrow=FALSE)
for (day in 1:7) {
  actualDay <- day + 3
  dayIndex <- seq(actualDay, 366, 7)
  probMatrix[dayIndex, ] <- sweep(probMatrix[dayIndex, ], 2,
                                  as.numeric(booleanMatrix[day, ]), "*")
}
However, as I mentioned above, this is quite an inefficient method. I would like something that runs a bit faster, as this sort of code runs through the script a lot.
If you set the row names to
rownames(booleanMatrix) <- 1:7
rownames(probMatrix) <- c(rep(c(5,6,7,1,2,3,4), 52), 5, 6)
then you can do
probMatrix <- probMatrix * booleanMatrix[rownames(probMatrix), ]
which should be much faster. Or keep the indices in a variable and use that variable as index.
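The last suggestion, keeping the indices in a variable, might look like the sketch below. The name `weekday` and the modular arithmetic are assumptions added here; they encode that day 4 of 2016 was a Monday (row 1 of booleanMatrix).

```r
probMatrix <- matrix(1, nrow = 366, ncol = 10)
booleanMatrix <- matrix(rep(c(0, 0, 0, 0, 1), 14), ncol = 10)

# Weekday (1 = Monday ... 7 = Sunday) for each day of the year; day 4 maps to 1
weekday <- (seq_len(366) - 4) %% 7 + 1

# Expand booleanMatrix to 366 rows by row index, then multiply element-wise
result <- probMatrix * booleanMatrix[weekday, ]

weekday[1:7]
# [1] 5 6 7 1 2 3 4
dim(result)
# [1] 366  10
```

Note that this indexing also covers days 1 to 3 of the year, which the loop starting at `day + 3` does not appear to reach, so the two approaches can differ on those rows.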