Calculating magnitude of longitudinal and latitudinal error - r

Let's say I have these coordinates:
Long.GPS Lat.GPS Long.TEST Lat.TEST
22.355951 44.699745 24.00092 44.37806
22.355951 44.699745 24.08816 44.36839
22.355951 44.699745 23.73256 44.42112
22.355951 44.699745 22.35929 44.6953
Now, what I want to know is how to calculate the magnitude of the longitudinal and latitudinal error as follows: Long.error = |Long.GPS - Long.TEST|
and the same for latitude.
Then, I would like to know how to calculate the magnitude of the total horizontal error as the Euclidean distance from the true location (GPS) to the tested location (TEST): Total.error = sqrt(Long.error^2 + Lat.error^2)
After all this, I would like to plot them using ggplot2 geom_point, but I think I can handle that on my own.
I am quite new to this field and need help to finish my project.
Thanks in advance!
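A minimal sketch of these calculations in R, assuming the four columns above sit in a data frame named coords (the name is illustrative):
coords <- data.frame(
  Long.GPS  = rep(22.355951, 4),
  Lat.GPS   = rep(44.699745, 4),
  Long.TEST = c(24.00092, 24.08816, 23.73256, 22.35929),
  Lat.TEST  = c(44.37806, 44.36839, 44.42112, 44.6953)
)
coords$Long.error  <- abs(coords$Long.GPS - coords$Long.TEST)  # longitudinal error
coords$Lat.error   <- abs(coords$Lat.GPS - coords$Lat.TEST)    # latitudinal error
coords$Total.error <- sqrt(coords$Long.error^2 + coords$Lat.error^2)  # Euclidean, in degrees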

Related

Function to calculate shift between two time series based on data points

I am trying to find a function that matches two time series such that the datetime corresponds to reality.
So I need a function that minimizes the distance between the two curves shown above and outputs a new data frame that has TAIR time-shifted towards the values of tre200h0.
By eye, it looks like this shift is about 22 h.
Best,
Fabio
I don't know of a function that does this job.
Solved by Ric Villalba in the comments to the original question.
Two base R functions for analyzing time series lags are acf and pacf. For example, given x and y you can use acf(y - x) and look for the zeroes in the plot (if your series have adequate seasonal behaviour), or, if you prefer, acf(y - x, plot = FALSE) to get the data. Try which.min(acf(y - x)$acf^2).
Of course, this is a simplification of an otherwise complex matter.
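A minimal sketch of that suggestion, where x and y stand for the two hourly series (the names are illustrative):
a <- acf(y - x, plot = FALSE)          # autocorrelation of the difference
shift_hours <- which.min(a$acf^2) - 1  # acf lags start at 0, so index - 1 = lag in hours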

Why the need for a mask when performing Fast Fourier Transform?

I'm trying to find the peak frequencies hidden in my data using the fft() method in R. While preparing the data, a more experienced user recommended creating a "mask" (more on that below), which does give me exactly the diagram I'm looking for. The problem is, I don't understand what it does or why it's needed.
To give some context, I'm working with .txt files with around 12000 entries each. It's voltage vs. time information, and the expected result is just a sinusoidal wave with a clear peak frequency that should be close to 1-2 Hz. This is an example of what one of those files looks like:
I've been trying to use the Fast Fourier Transform method fft() implemented in R to find the peak frequencies and get a diagram that reflects them clearly. First, I calculate some things that I understand will be useful, like the Nyquist frequency and the range of frequencies I'll show in the final graph:
n = length(variable)                       # number of samples
dt = time[5] - time[4]                     # sampling interval
df = 1/(max(time))                         # find out the "unit" (resolution) frequency
fnyquist = 1/(2*dt)                        # the Nyquist frequency
f = seq(-fnyquist, fnyquist - df, by = df) # these are the frequencies I'll plot
But when I plot the absolute value of what fft(data) calculates vs. the range of frequencies, I get this:
The peak frequency seems to be close to 50 Hz, but I know that's not the case; it should be close to 1 Hz. I'm a complete newbie in R and in Fourier analysis, so after researching a little, I found on a Swiss page that this can be solved by creating a "mask", which is actually just a vector with a repeating pattern (1, -1, 1, -1...) of the same length as my data vector:
mask = rep(c(1, -1), length.out = n)
Then if I multiply my data vector by this mask and plot the results:
results = mask * data
plot(f, abs(fft(results)), type = "h")
I get what I was looking for. (This is the graph after limiting the x-axis to a reasonable scale).
So, what's the mask actually doing? I understand it's changing my data points' signs in an alternating manner, but I don't get why it would take the inferred peak frequency from ~50 Hz to the correct result of ~1 Hz.
Thanks in advance!
Your "mask" is one of two methods of performing an fftshift, which is commonly done to center the 0 Hz output of an FFT in the middle of a graph or plot (instead of at the left edge, with the negative frequencies wrapping around to the right edge).
To perform an fftshift, you can heterodyne or modulate your data (by Fs/2) before the FFT, or simply do a circular shift by 50% after the FFT. Both produce the same result; they are equivalent due to the shift property of the DFT.
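For illustration, a hedged sketch of the second method (a circular shift by 50% after the FFT), reusing n, f and data from the question and assuming n is even:
spec <- fft(data)
shifted <- c(spec[(n/2 + 1):n], spec[1:(n/2)])  # swap halves so 0 Hz is centered
plot(f, abs(shifted), type = "h")               # same magnitude plot as the mask approach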

How do I produce a probability histogram?

I've just started learning R and was wondering: say I have the dataset quakes, and I want to generate the probability histogram of quakes near Fiji. Would the code simply be hist(quakes$lat, freq=F)?
A histogram shows the frequency or proportion of a given value out of all the values in a data set. You need a numeric vector as the x argument for hist(), and quakes$lat is one. hist(quakes$lat, freq = F) would show the following:
This shows the north/south geographical distribution of the earthquakes, centering around latitude -20, and, since it is approximately normal (with a left skew), suggests that there is a mechanism for earthquake generation that centers around a specific latitude.
The best way to learn is to try. If you wonder if that would be the way to do it, try it.
You might also want to look at this tutorial on creating kernel density plots with ggplot.
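For example, a minimal kernel density sketch with ggplot2 (assuming the package is installed) could look like this:
library(ggplot2)
ggplot(quakes, aes(x = lat)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 2) +  # probability scale
  geom_density(colour = "red")                                  # kernel density overlay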

How to cleanly use interpolation between points to generate a mean in R

I am having trouble generating code that will cleanly produce a mean (specifically a weighted average) based on a simple plot of points, using interpolation.
For example:
ex=c(1,2,3,4,5)
why=c(2,5,9,15,24)
This shows the kind of information I am working with.
plot(ex, why, type="o")
At this point, I want each point "binned" so the lines between them are straight. To do this, I have been adding points to the x values manually in Excel as (x + 0.01).
This is the new output:
why=c(2,2,5,5,9,9,15,15,24,24)
ex=c(1,2,2.01,3,3.01,4,4.01,5,5.01,6)
plot(ex, why, type="o")
So this is where my question comes into play. I have to do this many times and do not want to generate a ton of new vectors and objects. To get a weighted average, I have been interpolating y values at x increments of 0.01 into a new object. I am then able to go into this new object and get a mean when a point falls between the actual ex values, i.e.
mean(newy[1:245])
Because I made new y values at 100 increments per unit of x that (basically) follow a straight line, I am getting a weighted average here for x = 1 to 2.45.
Is there an easier and more elegant way to embed the interpolation code into the mean code, so I could just ask for the "average of interpolated y from one non-data x to another"?
It doesn't do exactly what you want, but you should consider the stepfun function -- this creates a step function out of two series.
plot(stepfun(ex[-1], why))
stepfun is handy because it gives you a function defined over that interval, so you can easily interpolate just by evaluating anywhere. The downside to it is that it is not strictly defined on the range given (which is why we have to cut off the first value in ex).
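If the goal is the mean of the step-interpolated curve over an interval without building new vectors, one hedged option is approxfun(method = "constant") together with integrate(); the endpoints below are the illustrative 1 to 2.45 from the question:
g <- approxfun(ex, why, method = "constant")  # step interpolator over ex/why
a <- 1; b <- 2.45                             # illustrative interval endpoints
avg <- integrate(g, a, b, subdivisions = 1000)$value / (b - a)  # weighted average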
Based on your second plotting example, I think you are probably looking for this:
library(ggplot2)
qplot(ex, why, geom="step")
this gives:
Or if you want the line to go vertical first, you can use:
qplot(ex, why, geom="step", direction = "vh")
which gives:
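Note that qplot() has since been deprecated in ggplot2; an equivalent with ggplot() would be:
library(ggplot2)
ggplot(data.frame(ex, why), aes(ex, why)) +
  geom_step(direction = "vh")  # vertical-first steps, as above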

Plotting fluctuation in R

I will try to be as specific as possible. The data set below consists of a device's power measurements, and I have to plot a graph showing the average fluctuation of the power (the Watt column) over the Time column. I have to accomplish this in R, but I really don't know which function to use or how to do it, as I'm a newbie to R. Any help will be highly appreciated!
Store No.,Date,Time,Watt
33,2011/09/26,09:11:01,0.0599E+03
34,2011/09/26,09:11:02,0.0597E+03
35,2011/09/26,09:11:03,0.0598E+03
36,2011/09/26,09:11:04,0.0596E+03
37,2011/09/26,09:11:05,0.0593E+03
38,2011/09/26,09:11:06,0.0595E+03
39,2011/09/26,09:11:07,0.0595E+03
40,2011/09/26,09:11:08,0.0595E+03
41,2011/09/26,09:11:09,0.0591E+03
rollapply in package zoo will return a moving average (or a moving version of any function). You can plot the points and then add a moving-average line:
require(zoo)
dat$D.time <- as.POSIXct(paste(dat$Date, dat$Time))  # combine Date and Time
plot(dat$D.time, dat$Watt)
# rollapply(dat$Watt, 3, mean) is a centered 3-point moving average with
# length(dat$Watt) - 2 values, so it lines up with rows 2:8 of the 9 rows
lines(dat$D.time[2:8], rollapply(dat$Watt, 3, mean))
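For completeness, a hedged sketch of reading the sample above into dat (the file name is illustrative):
dat <- read.csv("power.csv")  # the scientific-notation Watt column parses as numeric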
