I want to calculate the slope and distance between consecutive points given by two vectors. I am using the following code:
df = structure(list(x = c(92.2, 88.1, 95.8, 83.8, 76.7, 83.3, 101.1,
111.8, 84.3, 81.5, 76.2, 87.1), y = c(84.8, 78.5, 103.1, 90.4,
85.1, 78.2, 98.3, 109.2, 85.6, 86.9, 85.6, 94)), class = "data.frame", row.names = c(NA,
-12L))
x <- df$x
y <- df$y
#Slope
diff(y)/diff(x)
#Distance
dist(df, method = "euclidean")
You can see in the slope output that only 11 values appear. I also want the slope of the 12-1 pair; how can I get that? And from the distance output I only want the values for the 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 10-11, 11-12 and 12-1 combinations. How can I achieve that?
The expected output is
Length 7.5 25.8 17.5 8.9 9.5 26.8 15.3 36.2 3.1 5.5 13.8 10.5
Slope 1.54 3.19 1.06 0.75 -1.05 1.13 1.02 0.86 -0.46 0.25 0.77 1.08
I think the diff approach by @Gregor Thomas is concise enough. Here is another option in case you are interested in dist for computing distances.
> d <- rbind(df, df[1, ])
> with(d, diff(y) / diff(x))
[1] 1.5365854 3.1948052 1.0583333 0.7464789 -1.0454545 1.1292135
[7] 1.0186916 0.8581818 -0.4642857 0.2452830 0.7706422 -1.8039216
> (m <- as.matrix(dist(d)))[col(m) - row(m) == 1]
[1] 7.516648 25.776928 17.472550 8.860023 9.548298 26.848650 15.274161
[8] 36.238239 3.087070 5.457105 13.761177 10.519030
There's no nice diff shortcut for the difference between the last and first vector elements. You can use (y[12] - y[1]) / (x[12] - x[1]) directly, or, to be more general, tail(x, 1) for the last element and head(x, 1) for the first. Calculate it directly and append it to your slope vector.
For the Euclidean distance between successive points, it's most direct to calculate it yourself: distance = sqrt(diff(x)^2 + diff(y)^2).
(slope = c(diff(y)/diff(x), (head(y, 1) - tail(y, 1)) / (head(x, 1) - tail(x, 1))))
# [1] 1.5365854 3.1948052 1.0583333 0.7464789 -1.0454545 1.1292135 1.0186916
# [8]  0.8581818 -0.4642857  0.2452830  0.7706422 -1.8039216
(distance = sqrt(diff(x)^2 + diff(y)^2))
# [1] 7.516648 25.776928 17.472550 8.860023 9.548298 26.848650 15.274161 36.238239 3.087070 5.457105 13.761177
I'll leave it as an exercise for the reader to add the last distance between the first and last points.
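In the same spirit, a minimal sketch of that last step: append the first point so diff() wraps around from point 12 back to point 1 (the variable names xc/yc are my own):

```r
x <- c(92.2, 88.1, 95.8, 83.8, 76.7, 83.3, 101.1, 111.8, 84.3, 81.5, 76.2, 87.1)
y <- c(84.8, 78.5, 103.1, 90.4, 85.1, 78.2, 98.3, 109.2, 85.6, 86.9, 85.6, 94)

# Append the first point so the diffs include the 12-1 leg
xc <- c(x, x[1])
yc <- c(y, y[1])
distance <- sqrt(diff(xc)^2 + diff(yc)^2)
tail(distance, 1)  # the 12-1 distance, about 10.52
```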
Related
I have a vector of numbers that I need to round according to a set of rules, originally given as an image. Consider the following examples:
0.5 -> 0.5 (no rounding)
1.2 -> 1.0
3.7 -> 4.0
18.9 -> 20.0
28.1 -> 30.0
110 -> 120
I could in theory write a series of conditional statements to achieve this task; however, that would be tedious and inefficient. Is there a way to achieve the desired outcome in an efficient manner?
Thank you
You could use the floor of base 10 logarithm to calculate powers of 10. Then divide the vector by that, round it and multiply with the powers of 10 again.
tens <- 10^floor(log10(abs(x)))
round(x/tens)*tens
# [1] NaN 0.5 1.0 4.0 -4.0 20.0 30.0 100.0
Note that this won't work for zero, so you should handle that case separately.
(However, 110 -> 120 is not obvious to me.)
Data:
x <- c(0, .5, 1.2, 3.7, -3.7, 18.9, 28.1, 110)
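A minimal sketch of that case-handling (the function name round_to_tens is my own):

```r
round_to_tens <- function(x) {
  tens <- 10^floor(log10(abs(x)))   # power of 10 below each |x|
  out <- round(x / tens) * tens
  out[x == 0] <- 0                  # log10(0) is -Inf, so fix zero explicitly
  out
}

round_to_tens(c(0, .5, 1.2, 3.7, -3.7, 18.9, 28.1, 110))
# [1]   0.0   0.5   1.0   4.0  -4.0  20.0  30.0 100.0
```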
This solution uses findInterval to get which of the rounding functions is to be applied to the vector's elements.
roundSpecial <- function(x) {
  round_funs <- list(
    no_round    = function(x) x,
    round_by_1  = function(x) round(x),
    round_to_20 = function(x) 20,
    round_by_10 = function(x) 10 * round(x / 10),
    round_by_15 = function(x) 15 * round(x / 15),
    round_by_30 = function(x) 30 * round(x / 30)
  )
  lims <- c(0, 1, 17, 20, 30, 90, Inf)
  which_fun <- findInterval(x, lims)
  sapply(seq_along(which_fun), function(i) {
    round_funs[[which_fun[i]]](x[i])
  })
}
roundSpecial(x)
#[1] 0.5 1.0 4.0 20.0 30.0 120.0
Data
x <- c(0.5, 1.2, 3.7, 18.9, 28.1, 110)
The title may be confusing, but I guess it has a simple solution. I have my own function and I want to apply the same function to multiple lists that consist of two columns, but I need to do different calculations on each column separately.
As an example mydata is:
x1 x2 y1 y2 z1 z2
1 0.0 0.0 0.0 7.8 0.0 8.6
2 8.6 0.0 0.0 7.6 1.6 1.4
3 11.2 7.8 3.4 1.2 7.6 0.0
4 8.4 7.6 21.4 10.2 23.6 0.0
5 0.0 1.2 1.8 7.0 3.2 0.0
6 0.0 10.2 1.4 0.0 0.0 0.0
mydata<-structure(list(x1 = c(0, 8.6, 11.2, 8.4, 0, 0), x2 = c(0, 0,
7.8, 7.6, 1.2, 10.2), y1 = c(0, 0, 3.4, 21.4, 1.8, 1.4), y2 = c(7.8,
7.6, 1.2, 10.2, 7, 0), z1 = c(0, 1.6, 7.6, 23.6, 3.2, 0), z2 = c(8.6,
1.4, 0, 0, 0, 0)), .Names = c("x1", "x2", "y1", "y2", "z1", "z2"
), class = "data.frame", row.names = c(NA, -6L))
And myfun function is:
myfun <- function(x) {
  means <- sapply(list(x), function(ss) mean(ss, na.rm = TRUE))
  #my point: vars <- sapply(list(y), function(ss) var(ss, na.rm = TRUE))
  mean <- means[[1]]
  #var <- vars[[1]]
  #lists <- list(mean, var)
  #names(lists) <- c("mean", "var")
  #return(lists)
  lists <- list(mean)
  names(lists) <- c("mean")
  return(lists)
}
I used # to mark the parts that will be added to myfun later.
When I tried
results<-lapply(mydata, myfun)
I can apply the same function and the same calculation to each column.
As you can see, there are 2 columns (x1-x2, y1-y2, z1-z2) for each dataset (x, y, z).
What I want is:
1) Obtaining means of first columns (x1, y1, z1)
2) Obtaining variances of second columns (x2, y2, z2)
3) And as output, I want to see the results of mean1 and var1 for each dataset under the x, y and z lists, like:
x-> mean1 (mean of x1)
var1 (var of x2)
y-> mean1 (mean of y1)
var1 (var of y2)
4) Do all these in a loop with lapply or sapply or with any useful function.
Notes:
1) I did not group x1 and x2 under x, or y1 and y2 under y, because a solution that works on mydata in its current form would be more useful for me. But if necessary, I can group them separately.
2) The myfun function currently computes the means of all 6 columns. I have marked with # the additional parts that will be used to calculate the variances of the second columns.
Consider assigning your groups first, then iterate over them with lapply. In fact, use sapply with simplify=FALSE to get a named list.
grps <- unique(gsub("[0-9]", "", colnames(mydata)))
# [1] "x" "y" "z"
myfun <- function(grp)
  list(mean = mean(mydata[, paste0(grp, 1)]),
       variance = var(mydata[, paste0(grp, 2)]))
mean_var_list <- sapply(grps, myfun, simplify = FALSE)
mean_var_list
# $x
# $x$mean
# [1] 4.7
#
# $x$variance
# [1] 20.87467
#
# $y
# $y$mean
# [1] 4.666667
#
# $y$variance
# [1] 16.53467
#
# $z
# $z$mean
# [1] 6
#
# $z$variance
# [1] 11.85067
Or use the default, simplify=TRUE and return a matrix.
mean_var_mat <- sapply(grps, myfun)
mean_var_mat
# x y z
# mean 4.7 4.666667 6
# variance 20.87467 16.53467 11.85067
I would start by splitting the dataframe to create a list of dataframes with 2 columns each.
At that point you can use lapply or map_dfr to apply the function mean_var to each element of the list.
The advantage of map_dfr is that it returns a dataframe, binding the rows of the function output.
library(purrr)
my_data_l <- split.default(mydata, rep(1:3, each = 2))
mean_var <- function(x) {
  list(mean = mean(x[, 1]), var = var(x[, 2]))
}
map_dfr(my_data_l, mean_var)
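If you'd rather avoid purrr, the same split works with base R alone; a sketch, using the mydata from the question:

```r
mydata <- structure(list(x1 = c(0, 8.6, 11.2, 8.4, 0, 0), x2 = c(0, 0,
7.8, 7.6, 1.2, 10.2), y1 = c(0, 0, 3.4, 21.4, 1.8, 1.4), y2 = c(7.8,
7.6, 1.2, 10.2, 7, 0), z1 = c(0, 1.6, 7.6, 23.6, 3.2, 0), z2 = c(8.6,
1.4, 0, 0, 0, 0)), class = "data.frame", row.names = c(NA, -6L))

# Split into three 2-column data frames, then row-bind one result per group
my_data_l <- split.default(mydata, rep(1:3, each = 2))
res <- do.call(rbind, lapply(my_data_l, function(x)
  data.frame(mean = mean(x[[1]]), var = var(x[[2]]))))
res
```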
I am trying to create or find a function that calculates the distance and angle between two points. The idea is that I can have two data.frames with x, y coordinates as follows:
Example dataset
From <- data.frame(x = c(0.5,1, 4, 0), y = c(1.5,1, 1, 0))
To <- data.frame(x =c(3, 0, 5, 1), y =c(3, 0, 6, 1))
Current function
For now, I've managed to develop the distance part using Pythagoras:
distance <- function(from, to){
  D <- sqrt((abs(from[,1] - to[,1])^2) + (abs(from[,2] - to[,2])^2))
  return(D)
}
Which works fine:
distance(from = From, to = To)
[1] 2.915476 1.414214 5.099020 1.414214
but I can't figure out how to get the angle part.
What I tried so far:
I tried adapting the second solution of this question
angle <- function(x, y){
  dot.prod <- x %*% y
  norm.x <- norm(x, type = "2")
  norm.y <- norm(y, type = "2")
  theta <- acos(dot.prod / (norm.x * norm.y))
  as.numeric(theta)
}
x <- as.matrix(c(From[,1],To[,1]))
y <- as.matrix(c(From[,2],To[,2]))
angle(t(x),y)
But I am clearly making a mess of it.
Desired output
I would like to have the angle part added to my first function, so that I get both the distance and the angle between the from and to dataframes.
By the angle between two points, I am assuming you mean the angle between the two vectors defined by the endpoints (with the origin as the start).
The example you adapted was designed around a single pair of points, which is why it uses the transpose; it is, however, robust enough to work in more than 2 dimensions.
Your function should be vectorised, as your distance function is, since it receives a number of pairs of points (and here we are only considering 2-dimensional points).
angle <- function(from, to){
  dot.prods <- from$x * to$x + from$y * to$y
  # `[<-`(from,,,0) replaces every value with 0, i.e. the origin,
  # so these are the norms of the position vectors
  norms.x <- distance(from = `[<-`(from,,,0), to = from)
  norms.y <- distance(from = `[<-`(to,,,0), to = to)
  thetas <- acos(dot.prods / (norms.x * norms.y))
  as.numeric(thetas)
}
angle(from=From,to=To)
[1] 0.4636476 NaN 0.6310794 NaN
The NaNs arise because some of your vectors have zero magnitude, so the norms in the denominator are 0.
How about:
library(useful)
df=To-From
cart2pol(df$x, df$y, degrees = F)
which returns:
# A tibble: 4 x 4
r theta x y
<dbl> <dbl> <dbl> <dbl>
1 2.92 0.540 2.50 1.50
2 1.41 3.93 -1.00 -1.00
3 5.10 1.37 1.00 5.00
4 1.41 0.785 1.00 1.00
where r is the distance and theta is the angle.
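If you'd rather not add a dependency, here is a base-R sketch of the same polar idea using atan2 (the function name dist_angle is my own):

```r
From <- data.frame(x = c(0.5, 1, 4, 0), y = c(1.5, 1, 1, 0))
To   <- data.frame(x = c(3, 0, 5, 1),   y = c(3, 0, 6, 1))

dist_angle <- function(from, to) {
  dx <- to$x - from$x
  dy <- to$y - from$y
  data.frame(distance = sqrt(dx^2 + dy^2),
             angle    = atan2(dy, dx))  # radians in (-pi, pi]
}

dist_angle(From, To)
```

Note that atan2 returns angles in (-pi, pi], so the second row comes out as -2.36 rather than cart2pol's 3.93; the two differ by 2*pi.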
I've transformed some values from my dataset with the logit transformation from the car package. The variable "var" represents these values and consists of percentages.
However, if I transform them back via inv.logit from the boot package, the values don't match the original ones.
data$var
46.4, 69.5, 82.7, 61.7, 76.4, 84.8, 69.1
data["var_logit"] <- logit(data$var, percents=TRUE)
data$var_logit
-0.137013943, 0.778005062, 1.454239241, 0.452148763, 1.102883518, 1.589885549, 0.760443432
data$var_logback <- inv.logit(data$var_logit)
0.46580 0.68525 0.81065 0.61115 0.75080 0.83060 0.68145
It looks like I have to multiply the result with 100 to get the previous values (or at least some very similar values), but I feel like I'm missing something.
Thanks for the help!
The other thing that's going on here is that car::logit automatically adjusts the data if there are 0 or 1 values:
adjust: adjustment factor to avoid proportions of 0 or 1; defaults to ‘0’ if there are no such proportions in the data, and to ‘.025’ if there are.
library(car)
dat <- c(46.4, 69.5, 82.7, 61.7, 76.4, 84.8, 69.1)
(L1 <- logit(dat, percents=TRUE))
## [1] -0.1442496  0.8236001  1.5645131  0.4768340  1.1747360  1.7190001
## [7]  0.8047985
(L2 <- logit(c(dat,0),percents=TRUE))
## [1] -0.1370139  0.7780051  1.4542392  0.4521488  1.1028835  1.5898855
## [7]  0.7604434 -3.6635616
## Warning message:
## In logit(c(dat, 0), percents = TRUE) : proportions remapped to (0.025, 0.975)
This means you can't invert the results as easily.
Here's a function (using the guts of car::inv.logit with a little help from Wolfram Alpha because I was too lazy to do the algebra) that inverts the result:
inv.logit <- function(f, a) {
  a <- (1 - 2 * a)
  (a * (1 + exp(f)) + (exp(f) - 1)) / (2 * a * (1 + exp(f)))
}
zapsmall(inv.logit(L2,a=0.025)*100)
## [1] 46.4 69.5 82.7 61.7 76.4 84.8 69.1 0.0
You set the percents=TRUE flag, which divides your values by 100, and the inverse command does not know about it.
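When no 0/100 values trigger the adjustment, the round trip is the plain logistic, which you can sketch in base R with qlogis/plogis (no extra package needed):

```r
dat <- c(46.4, 69.5, 82.7, 61.7, 76.4, 84.8, 69.1)

f <- qlogis(dat / 100)  # same as car::logit(dat, percents = TRUE) with adjust = 0
plogis(f) * 100         # recovers dat exactly
```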
Suppose I have two vectors representing the height and weight of the 97 participants in a study. I want to remove all observations with height below 2 m, and at the same time remove the corresponding observations from the weight vector. What functions should I use in R?
You can get a logical vector by comparing the height vector to the threshold, and use it to filter both the height and weight vectors.
height.check <- height < 200 # taken in cm scale
height <- height[!height.check]
weight <- weight[!height.check]
You want a data frame (use ?data.frame for info)
x <- data.frame(Participant = paste("Participant", 1:97, sep = "_"),
                Height = height_vector,
                Weight = weight_vector)
where height_vector and weight_vector are your data
x2 <- x[x$Height >= 2,]
Since you gave us no data, I produced some fake data.
> height <- c(2.0, 1.75, 2.15, 1.98, 1.45) ## in meters
> weight <- c(200, 178, 180, 198, 205) ## in pounds
We can keep only the wanted values (heights of at least 2 m) using vector operations:
> height[height >= 2.0]
[1] 2.00 2.15
> weight[height >= 2.0]
[1] 200 180
But it's best to put the two vectors together into a data.frame and then subset on the condition that height is at least 2. This will automatically keep only the corresponding weights.
> d <- data.frame(height = c(2.0, 1.75, 2.15, 1.98, 1.45),
                  weight = c(200, 178, 180, 198, 205))
> d[d$height >= 2, ]
  height weight
1   2.00    200
3   2.15    180