I'm trying to enter the below data into a data frame, to make a ggplot line graph.
#functions for the hh budget and utility functions
pqxf <- function(y)(1*y) # replace p with price of y
pqyf <- function(x)(-1.25*x)+20 # -1.25 is the wage rate
utilityf <- function(x)80*(1/(x)) # 80 is the utility provided
hours <- c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,20)
#functions are turned into data frames
pqy <- data.frame("consumption" =
pqxf(c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,20)))
pqx <- data.frame("leisure" =
pqxf(c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,20)))
utility <- data.frame("utility" =
utilityf(c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,20)))
#each data frame is combined into a single data frame, that will be used for tables and charts
hh <- data.frame(pqx, pqy, utility, hours)
print(hh)
#this shows the utility, and the cost of x and y, one data frame
library(ggplot2)
ggplot(hh, aes(x=pqx, y=hours))+
xlim(0,20)+ylim(0,20)+ # limits set for the assignment
labs(x = "leisure(hours)",y="consumption(units)")+
geom_line(aes(x = pqx, y = pqy))+
geom_line(aes(x = pqx, y = utility))+
geom_point(aes(x=8,y=10))+ #values of x and y of tangent point
geom_hline(yintercept = 10,linetype="dotted")+ # y of tangent point
geom_vline(xintercept = 8,linetype = "dotted")+ #x of tangent point
geom_text(label="E", x=8,y=10,hjust=-1,size=2)+
geom_text(label="-1.25(units/hour)= -w = MRS", x=9,y=2,hjust=.02,size=2)+
geom_text(label="U=80", x=4,y=19,hjust=1,size=2)
When I run this, I get the following error:
Error in is.finite(x) : default method not implemented for type 'list'
Should I store the data in a different format than a data frame, format my data frame differently, or set up ggplot differently so that it can handle lists?
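Replace pqx with leisure and pqy with consumption in the aes() calls. pqx and pqy are separate data frames (that is, lists), not columns of hh, and ggplot cannot map a whole data frame to an aesthetic; that is what triggers the is.finite() error. The columns of hh are named leisure, consumption, utility and hours.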
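A minimal sketch of that fix, assuming the budget line was meant to come from pqyf (the question calls pqxf for both columns) and that one data frame built from plain vectors is acceptable. The names budget and utility_curve are illustrative, not from the question, and the plotting part only runs if ggplot2 is installed.

```r
# Build one data frame whose columns are plain numeric vectors.
hours <- c(0:16, 20)
budget <- function(x) -1.25 * x + 20   # assumed: consumption along the budget line
utility_curve <- function(x) 80 / x    # indifference curve for U = 80

hh <- data.frame(
  leisure     = hours,
  consumption = budget(hours),
  utility     = utility_curve(hours)   # Inf at leisure = 0, dropped by ylim below
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # aes() refers to column names of hh, never to other data frames.
  p <- ggplot(hh, aes(x = leisure)) +
    geom_line(aes(y = consumption)) +
    geom_line(aes(y = utility)) +
    geom_point(aes(x = 8, y = 10)) +
    xlim(0, 20) + ylim(0, 20) +
    labs(x = "leisure(hours)", y = "consumption(units)")
  print(p)
}
```

Both curves pass through the tangent point (8, 10), which is a quick way to confirm the columns hold what the plot expects.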
I am writing a function in order to filter out datapoints out of my plot based on a linear equation.
I currently have the following function (with a different function within it):
MD_filter<- function(dataframe, mz_col){
#In-function MD calculation
MZ<- mz_col
MZR<- trunc(mz_col, digits = 0)#Either floor() or trunc() can be used for this part.
MD<- as.numeric(MZ-MZR)
dataframe<- dataframe%>%
dplyr::mutate(MD)%>%
dplyr::select(MD, everything())
#fit data to m/z defect maxima equation
f<- function(x){#This could be problem `1`, maybe resolved by leaving x....
y<-0.00112*x + 0.01953
return(y)}
fit<-as.data.frame(t(apply(dataframe,1,f)))# t() transforms df to matrix...?
filtered<-dataframe[which((dataframe$MD<= fit$MZ)),]
#keep rows in dataframe if MD is less than or equal to fitted value (mz after equation)
#As "fit" calculated the maximum MD value for each MZ value in the MZ column, we subset fit$MZ, as this contains the dataframe MZ values.
#The MD calculated at the very start, needs to be lower than the equivalent MZ value of the fitted dataframe.
filtered<-write.table(filtered,"feature_list_mz_defect_filtered.txt",sep="\t",col.names=NA)
#Now we have pre filter dataframe (dataframe) and post filter df (filtered)
#2 Different plots: (highlight to be removed as well, so we need a 3rd eventually)
MD_plot<- ggplot(dataframe, aes(x= MZ, y = MD)) +
geom_point() +#THE FOLLOWING PART DOES NOT WORK YET
ggtitle(paste("Unfiltered MD data - ", dataframe))
#stat_smooth(method="lm", se=FALSE)-> For linear line through the plot, but may not be necessary to show
return(MD_plot)#While I do get a plot, I have not yet gotten the equation. I could use my earlier approach maybe.
MD_plot_2<- ggplot(filtered, aes(x= MZ, y = MD)) +#Filtered is basically the second dataframe,
#which subsets datapoints with an Y value (which is the MD), below the linear equation MD...
geom_point() +#THE FOLLOWING PART DOES NOT WORK YET
ggtitle(paste("Filtered MD data - ", dataframe))
#stat_smooth(method="lm", se=FALSE) -> For linear line through the plot, but may not be necessary to show
return(MD_plot_2)
}
The function works as follows:
The argument inputs are a dataframe and a specific column inside that dataframe which I call the mz_col.
From this column a second column, the MD column, is generated.
From here on out I want to make two plots:
ggplot 1: A plot with the mz_col (MZ) values on the X axis and the MD values on the Y axis
ggplot 2: EXACTLY the same as ggplot 1, but I want to filter out the datapoints if MD exceeds the linear equation y<-0.00112*x + 0.01953 (as is visible in the code). This linear line is basically my maximum filter limit in the plot, everything above this I want gone.
I've tried many different solutions. I swapped the "x" argument with mz_col among many other solutions such as trying to use plot() instead of ggplot. Currently I'm getting no plot, but I do get this:
Basically my question is: how do I fix my function so that I get my two plots? The first plot is no real issue, this already works, but the second plot just won't filter out datapoints based on my linear equation.
Thanks in advance! I'm quite new to SO and R, so I apologize if anything is unclear. Please let me know if any clarification is needed.
Reproducible sample data:
structure(list(mz = c(446.0394, 346.043, 199.0446, 199.0464, 97.057, 657.0595, 879.0606, 978.0631, 199.0684, 199.0707, 199.0724, 86.0748, 199.0761, 196.0789, 199.0819, 199.0852, 199.0878, 199.089, 908.0935, 147.0958, 199.0999,199.1299, 199.1322, 199.1384, 199.1398, 199.1434, 124.1475, 199.1513, 187.156, 199.1686, 199.1766, 199.1797, 199.1879, 199.1924, 187.1959, 479.1981, 169.1999, 109.2049, 399.2092, 299.2125, 159.2146, 199.2242, 356.2405, 69.2423, 956.4337, 978.5537, 199.5695, 676.5769, 199.5851, 500.6021, 260.6039, 270.6081, 200.6114, 200.6131, 200.6172, 200.6221,
200.6315, 200.6402, 200.6476, 200.766, 200.8591, 200.8732, 200.8768,
200.89, 200.8937, 200.8972, 200.9067, 200.9127, 200.9147, 200.9231,
200.9253, 200.9288, 200.9324, 200.935, 200.9468, 200.9515, 200.9536,
200.9557, 200.9568, 200.9594, 200.9661, 200.968, 200.9729, 200.9745,
200.9819, 200.9837, 200.9858, 200.9937)), row.names = c(NA, -88L), class = c("tbl_df", "tbl", "data.frame"))
I got a bit lost trying to follow your code, but based on your description, does the following work for you?
library(dplyr)
library(ggplot2)
MD_filter <- function(dataframe, mz_col, a = 0.01953, b = 0.00112){
# rename column so that rest of function doesn't depend on inputted column name
dataframe[["MZ"]] <- dataframe[[mz_col]]
# process dataframe
dataframe <- dataframe %>%
select(MZ) %>%
mutate(MD = MZ - trunc(MZ, digits = 0),
MD.limit = a + b*MZ)
p1 <- ggplot(dataframe,
aes(x = MZ, y = MD)) +
geom_point() +
geom_smooth(method = "lm", se = F) +
ggtitle("Unfiltered MD data")
p2 <- p1 %+% filter(dataframe, MD <= MD.limit) +
expand_limits(y = range(dataframe[["MD"]])) + # optional (if you want same
# y-axis range for both plots)
ggtitle("Filtered MD data")
cowplot::plot_grid(p1, p2, nrow = 1)
}
Data & usage
dd <- structure(list(mz = c(
446.0394, 346.043, 199.0446, 199.0464, 97.057, 657.0595, 879.0606,
978.0631, 199.0684, 199.0707, 199.0724, 86.0748, 199.0761, 196.0789,
199.0819, 199.0852, 199.0878, 199.089, 908.0935, 147.0958, 199.0999,
199.1299, 199.1322, 199.1384, 199.1398, 199.1434, 124.1475, 199.1513,
187.156, 199.1686, 199.1766, 199.1797, 199.1879, 199.1924, 187.1959,
479.1981, 169.1999, 109.2049, 399.2092, 299.2125, 159.2146, 199.2242,
356.2405, 69.2423, 956.4337, 978.5537, 199.5695, 676.5769, 199.5851,
500.6021, 260.6039, 270.6081, 200.6114, 200.6131, 200.6172, 200.6221,
200.6315, 200.6402, 200.6476, 200.766, 200.8591, 200.8732, 200.8768,
200.89, 200.8937, 200.8972, 200.9067, 200.9127, 200.9147, 200.9231,
200.9253, 200.9288, 200.9324, 200.935, 200.9468, 200.9515, 200.9536,
200.9557, 200.9568, 200.9594, 200.9661, 200.968, 200.9729, 200.9745,
200.9819, 200.9837, 200.9858, 200.9937)),
row.names = c(NA, -88L),
class = c("tbl_df", "tbl", "data.frame"))
MD_filter(dd, "mz")
# MD_filter(dd, "mz", a = 0.02, b = 0.001) # if you want to change the limit
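To see which points the filter keeps without drawing anything, the same rule can be applied in base R. This is only a sketch: the four m/z values are picked from the sample data, and the names md, limit and keep are illustrative.

```r
# Four m/z values taken from the sample data above.
mz <- c(446.0394, 97.057, 200.9937, 69.2423)

a <- 0.01953   # intercept of the maximum-MD line
b <- 0.00112   # slope of the maximum-MD line

md    <- mz - trunc(mz)   # mass defect: fractional part of m/z
limit <- a + b * mz       # largest MD allowed at each m/z
keep  <- md <= limit      # TRUE for rows that survive the filter

print(data.frame(mz, md, limit, keep))
```

Rows with keep equal to FALSE are the ones that disappear from the second plot.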
There are two things I need to do. Firstly I would like to be able to create new variables in a coda mcmc object that have been calculated from existing variables so that I can run chain diagnostics on the new variable. Secondly I would like to be able to index single variables in some of the coda plot functions while still viewing all chains.
Toy data. Bayesian t-test on the sleep data using JAGS and rjags.
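library(rjags) # provides jags.model() and coda.samples(); attaches coda
library(ggplot2) # used for the histogram further down
data(sleep)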
# read in data
y <- sleep$extra
x <- as.numeric(as.factor(sleep$group))
nTotal <- length(y)
nGroup <- length(unique(x))
mY <- mean(y)
sdY <- sd(y)
# make dataList
dataList <- list(y = y, x = x, nTotal = nTotal, nGroup = nGroup, mY = mY, sdY = sdY)
# model string
modelString <- "
model{
for (oIdx in 1:nTotal) {
y[oIdx] ~ dnorm(mu[x[oIdx]], 1/sigma[x[oIdx]]^2)
}
for (gIdx in 1:nGroup) {
mu[gIdx] ~ dnorm(mY, 1/sdY)
sigma[gIdx] ~ dunif(sdY/10, sdY*10)
}
}
"
writeLines(modelString, con = "tempModel.txt")
# chains
# 1. adapt
jagsModel <- jags.model(file = "tempModel.txt",
data = dataList,
n.chains = 3,
n.adapt = 1000)
# 2. burn-in
update(jagsModel, n.iter = 1000)
# 3. generate
codaSamples <- coda.samples(model = jagsModel,
variable.names = c("mu", "sigma"),
thin = 15,
n.iter = 10000*15/3)
Problem one
If I convert the coda object to a dataframe I can calculate the difference between the estimates for the two groups and plot this new variable, like so...
df <- as.data.frame(as.matrix(codaSamples))
names(df) <- gsub("\\[|\\]", "", names(df), perl = T) # remove brackets
df$diff <- df$mu1 - df$mu2
ggplot(df, aes(x = diff)) +
geom_histogram(bins = 100, fill = "skyblue") +
geom_vline(xintercept = mean(df$diff), colour = "red", size = 1, linetype = "dashed")
...but how do I get a traceplot? I can get one for existing variables within the coda object like so...
traceplot(codaSamples[[1]][,1])
...but I would like to be able to get them for the new diff variable.
Problem Two
Which brings me to the second problem. I would like to be able to get a traceplot (among other things) for individual variables. As I have shown above I can get them for a single variable if I only want to see one chain but I'd like to see all chains. I can see all chains for all variables in the model with the simple
plot(codaSamples)
...but what if I don't want or need to see all variables? What if I just want to see the trace and/or density plots for one, or even two, variables (but not all variables) but with all chains in the plot?
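One possible approach to both problems, sketched with simulated chains so that it runs without JAGS. It assumes the coda package is installed; make_chain and sim_samples are hypothetical stand-ins for codaSamples, using the same bracketed column names that JAGS produces.

```r
library(coda)

set.seed(1)
# Stand-in for codaSamples: three chains with JAGS-style column names.
make_chain <- function() {
  mcmc(cbind("mu[1]" = rnorm(200, 0.75, 0.5),
             "mu[2]" = rnorm(200, 2.30, 0.5)))
}
sim_samples <- mcmc.list(make_chain(), make_chain(), make_chain())

# Problem one: append a derived column to every chain and rebuild the
# mcmc.list, carrying over each chain's start, end and thinning interval.
with_diff <- do.call(mcmc.list, lapply(sim_samples, function(chain) {
  m <- as.matrix(chain)
  mcmc(cbind(m, diff = m[, "mu[1]"] - m[, "mu[2]"]),
       start = start(chain), end = end(chain), thin = thin(chain))
}))

# Problem two: column indexing on an mcmc.list keeps all chains.
traceplot(with_diff[, "diff"])          # one variable, three chains
plot(with_diff[, c("mu[1]", "diff")])   # trace and density for two variables
```

With the real object, sim_samples would be replaced by codaSamples; the indexing form codaSamples[, "mu[1]"] selects a variable across chains, whereas codaSamples[[1]] selects a single chain.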
Background and Summary of Objective
I am trying to find the y-coordinate at the intersection of two plotted curves using R. I will provide complete details and sample data below, but in the hopes that this is a simple problem, I'll be more concise up front.
The cumulative frequencies of two curves(c1 and c2 for simplicity) are defined by the following function, where a and b are known coefficients:
f(x)=1/(1+exp(-(a+bx)))
Using the uniroot() function, I found "x" at the intersection of c1 and c2.
I had assumed that if x is known then determining y should be simple substitution: for example, if x = 10, y=1/(1+exp(-(a+b*10))) (again, a and b are known values); however, as will be shown below, this is not the case.
The objective of this post is to determine how to find y-coordinate.
Details
This data replicates respondents' stated price at which they find the product's price to be too.cheap (i.e., they question its quality) and the price at which they feel the product is a bargain.
The data will be cleaned before use to ensure that too.cheap is always less than the bargain price.
The cumulative frequency for the bargain price will be inverted to become not.bargain.
The intersection of bargain and too.cheap will represent the point at which an equal share of respondents feel the price is not a bargain and too.cheap: the point of marginal cheapness ("pmc").
Getting to the point where I'm having a challenge will take a number of steps.
Step 1: Generate some data
# load libraries for all steps
library(car)
library(ggplot2)
# function that generates the data
so.create.test.dataset <- function(n, mean){
step.to.bargain <- round(rnorm(n = n, 3, sd = 0.75), 2)
price.too.cheap <- round(rnorm(n = n, mean = mean, sd = floor(mean * 100 / 4) / 100), 2)
price.bargain <- price.too.cheap + step.to.bargain
df.temp <- cbind(price.too.cheap,
price.bargain)
df.temp <- as.data.frame(df.temp)
return(df.temp)
}
# create 389 "observations" where the too.cheap has a mean value of 10.50
# the function will also create a "bargain" price by
#adding random values with a mean of 3.00 to the too.cheap price
so.test.df <- so.create.test.dataset(n = 389, mean = 10.50)
Step 2: Create a data frame of cumulative frequencies
so.get.count <- function(p.points, p.vector){
cc.temp <- as.data.frame(table(p.vector))
cc.merged <- merge(p.points, cc.temp, by.x = "price.point", by.y = "p.vector", all.x = T)
cc.extracted <- cc.merged[,"Freq"]
cc.extracted[is.na(cc.extracted)] <- 0
return(cc.extracted)
}
so.get.df.price<-function(df){
# creates cumulative frequencies for three variables
# using the price points provided by respondents
# extract and sort all unique price points
# Thanks to akrun for their help with this step
price.point <- sort(unique(unlist(round(df, 2))))
#create a new data frame to work with having a row for each price point
dfp <- as.data.frame(price.point)
# Create cumulative frequencies (as percentages) for each variable
dfp$too.cheap.share <- 1 - (cumsum(so.get.count(dfp, df$price.too.cheap)) / nrow(df))
dfp$bargain.share <- 1 - cumsum(so.get.count(dfp, df$price.bargain)) / nrow(df)
dfp$not.bargain.share <- 1 - dfp$bargain.share# bargain inverted so curves will intersect
return(dfp)
}
so.df.price <- so.get.df.price(so.test.df)
Step 3: Estimate the curves for the cumulative frequencies
# Too Cheap
so.l <- lm(logit(so.df.price$too.cheap.share, percents = TRUE)~so.df.price$price.point)
so.cof.TCh <- coef(so.l)
so.temp.nls <- nls(too.cheap.share ~ 1 / (1 + exp(-(a + b * price.point))), start = list(a = so.cof.TCh[1], b = so.cof.TCh[2]), data = so.df.price, trace = TRUE)
so.df.price$Pr.TCh <- predict(so.temp.nls, so.df.price$price.point, lwd=2)
#Not Bargain
so.l <- lm(logit(not.bargain.share, percents = TRUE) ~ price.point, so.df.price)
so.cof.NBr <- coef(so.l)
so.temp.nls <- nls(not.bargain.share ~ 1 / (1 + exp(-(a + b * price.point))), start = list(a = so.cof.NBr[1], b = so.cof.NBr[2]), data= so.df.price, trace=TRUE)
so.df.price$Pr.NBr <- predict(so.temp.nls, so.df.price$price.point, lwd=2)
# Thanks to John Fox & Sanford Weisberg - "An R Companion to Applied Regression, second edition"
At this point, we can plot and compare the "observed" cumulative frequencies against the estimated frequencies
ggplot(data = so.df.price, aes(x = price.point))+
geom_line(aes(y = so.df.price$Pr.TCh, colour = "Too Cheap"))+
geom_line(aes(y = so.df.price$Pr.NBr, colour = "Not Bargain"))+
geom_line(aes(y = so.df.price$too.cheap.share, colour = "too.cheap.share"))+
geom_line(aes(y = so.df.price$not.bargain.share, colour = "not.bargain.share"))+
scale_y_continuous(name = "Cumulative Frequency")
The estimate appears to fit the observations reasonably well.
Step 4: Find the intersection point for the two estimate functions
so.f <- function(x, a, b){
# model for the curves
1 / (1 + exp(-(a + b * x)))
}
# note, this function may also be used in step 3
#I was building as I went and I don't want to risk a transpositional error that breaks the example
so.pmc.x <- uniroot(function(x) so.f(x, so.cof.TCh[1], so.cof.TCh[2]) - so.f(x, so.cof.NBr[1], so.cof.NBr[2]), c(0, 50), tol = 0.01)$root
We may visually test the so.pmc.x by plotting it with the two estimates. If it is correct, a vertical line for so.pmc.x should pass through the intersection of too.cheap and not.bargain.
ggplot(data = so.df.price, aes(x = price.point)) +
geom_line(aes(y = so.df.price$Pr.TCh, colour = "Too Cheap")) +
geom_line(aes(y = so.df.price$Pr.NBr, colour = "Not Bargain")) +
scale_y_continuous(name = "Cumulative Frequency") +
geom_vline(aes(xintercept = so.pmc.x))
...which it does.
Step 5: Find y
Here is where I get stumped, and I'm sure I'm overlooking something very basic.
If a curve is defined by f(x) = 1/(1+exp(-(a+bx))), and a, b and x are all known, then shouldn't y be the result of 1/(1+exp(-(a+bx))) for either estimate?
In this instance, it is not.
# We attempt to use the too.cheap estimate to find y
so.pmc.y <- so.f(so.pmc.x, so.cof.TCh[1], so.cof.TCh[2])
# In theory, y for not.bargain at price.point so.pmc.x should be the same
so.pmc.y2 <- so.f(so.pmc.x, so.cof.NBr[1], so.cof.NBr[2])
EDIT: This is where the error occurs (see solution below).
a != so.cof.NBr[1] and b != so.cof.NBr[2]; instead, a and b should be defined as the coefficients from so.temp.nls (not so.l)
# Which they are
#> so.pmc.y
#(Intercept)
# 0.02830516
#> so.pmc.y2
#(Intercept)
# 0.0283046
If we calculate the correct value for y, a horizontal line at yintercept = so.pmc.y, should pass through the intersection of too.cheap and not.bargain.
...which it obviously does not.
So how does one estimate y?
I've solved this, and as I suspected, it was a simple error.
My assumption that y = 1/(1+exp(-(a+bx))) is correct.
The issue is that I was using the wrong a, b coefficients.
My curve was defined using the coefficients in so.cof.NBr as defined by so.l.
#Not Bargain
so.l <- lm(logit(not.bargain.share, percents = TRUE) ~ price.point, so.df.price)
so.cof.NBr <- coef(so.l)
so.temp.nls <- nls(not.bargain.share ~ 1 / (1 + exp(-(a + b * price.point))), start = list(a = so.cof.NBr[1], b = so.cof.NBr[2]), data= so.df.price, trace=TRUE)
so.df.price$Pr.NBr <- predict(so.temp.nls, so.df.price$price.point, lwd=2)
But the resulting curve is so.temp.nls, NOT so.l.
Therefore, once I find so.pmc.x I need to extract the correct coefficients from so.temp.nls and use those to find y.
# extract coefficients from so.temp.nls
so.co <- coef(so.temp.nls)
# find y
so.pmc.y <- 1 / (1 + exp(-(so.co[1] + so.co[2] * so.pmc.x)))
ggplot(data = so.df.price, aes(x = price.point))+
geom_line(aes(y = so.df.price$Pr.TCh, colour = "Too Cheap"))+
geom_line(aes(y = so.df.price$Pr.NBr, colour = "Not Bargain"))+
scale_y_continuous(name = "Cumulative Frequency")+
geom_hline(aes(yintercept = so.pmc.y))
Yielding the following...
which graphically depicts the correct answer.
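The point can be checked in isolation with a self-contained sketch. The coefficients below are made up for illustration (they are not the fitted values from this post); what matters is that the same (a, b) pairs are used both to find x and to evaluate y. Two logistic curves cross where a1 + b1*x = a2 + b2*x, so the root also has a closed form.

```r
# Logistic model used for both curves.
logistic <- function(x, a, b) 1 / (1 + exp(-(a + b * x)))

# Hypothetical coefficients: one falling curve, one rising curve.
too_cheap   <- c(a = 8,   b = -0.8)
not_bargain <- c(a = -10, b = 0.7)

# Numerical intersection, as in step 4.
x_root <- uniroot(
  function(x) logistic(x, too_cheap[["a"]], too_cheap[["b"]]) -
              logistic(x, not_bargain[["a"]], not_bargain[["b"]]),
  interval = c(0, 50), tol = 1e-9
)$root

# Closed form: a1 + b1 * x = a2 + b2 * x.
x_closed <- (not_bargain[["a"]] - too_cheap[["a"]]) /
            (too_cheap[["b"]] - not_bargain[["b"]])

# y from either curve agrees when the coefficients match the plotted curves.
y_too_cheap   <- logistic(x_root, too_cheap[["a"]], too_cheap[["b"]])
y_not_bargain <- logistic(x_root, not_bargain[["a"]], not_bargain[["b"]])

print(c(x_root = x_root, x_closed = x_closed,
        y_too_cheap = y_too_cheap, y_not_bargain = y_not_bargain))
```

If the two y values disagree with the plotted intersection, the coefficients passed to the function are not the ones that generated the plotted curves, which is exactly what happened with so.l versus so.temp.nls.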
So I'm working on a project that requires me to create multiple plots using air quality data given for 30+ separate days/files.
I've learned from and copied the code of someone who has already done this, but I'm having problems integrating it with my own data. Any help is appreciated.
Here is the code:
#---START OF THE FOR LOOP---
all_dates = seq(begin_date, end_date, 1)
for (j in 1:length(all_dates)) {
selected_date = all_dates[j]
#read data from the file with (id, lat, lon, elev, value... enter your attributes here) columns
datapointfile = paste(variable,"_",aggregate,"_",format.Date(selected_date,"%Y-%m-%d"),".txt",sep="")
datapoints_wgs84 = read.table(datapointfile, header=TRUE)
coordinates(datapoints_wgs84) = ~lon + lat
proj4string(datapoints_wgs84) <- CRS("+init=epsg:4326")
datapoints <- spTransform(datapoints_wgs84, CRS(projection))
#INTERPOLATION STARTS HERE!
#calculate linear regression intercept and slope(observ=B+A*elev)
observlm <- lm(value ~ elev, datapoints)
datapoints$res = observlm$residuals
#calculate observ value raster using linear model (elevation / observ value)
intercept = observlm$coefficients[1]
slope = observlm$coefficients[2]
regression_grid <- intercept + srtm * slope
#interpolate the residuals using idw
idw_test = idw(res~1, datapoints, defaultgrid)
residual_grid = raster(idw_test, "var1.pred")
#add regression grid and residuals
finalgrid <- regression_grid + residual_grid
#INTERPOLATION ENDS HERE!
#START OF THE PLOT!
layout = list(vrstvastudyarea)
outfile = paste("Output", selected_date, ".png", sep="")
png(filename = outfile, width = 1500, height = 1000, pointsize = 25, bg = "white", res = 150)
nadpis = paste("Air Quality", selected_date)
print(spplot(finalgrid, at=intervaly, col.regions = grid_colors,
sp.layout=layout,
main=list(nadpis, cex=2, col="black", font=2),
colorkey=list(at=intervaly2, labels = list( at=intervaly2, cex = 1.5, labels = intervaly, lab = intervaly2), space="bottom")))
dev.off()
#END OF THE PLOT!
} #---END OF THE FOR LOOP---
#---OUTPUT---
[inverse distance weighted interpolation]
Error in `[[<-.data.frame`(`*tmp*`, name, value = c(8.66783923397599, :
replacement has 14 rows, data has 18
So instead of getting 30+ outputs, I get one output and then the error shown above. If I got no output at all I would understand, but the first and second data files have almost no difference in format, so I would not expect a formatting error.
A few more details that might help solve the issue:
My data files look like this:
"id" "elev" "lon" "lat" "value"
Afyon 1027 30.54277778 38.75166667 108.2903226
Aydin 54 27.83666667 37.84027778 122.7096774
.
.
.
And the person I copied the code from has data like the following:
"id" "lat" "lon" "elev" "value"
2 50.69205 15.72876 816 37
3 49.735 16.0336 737 19
.
.
.
Thank you for your time.
It turned out the error was caused by some "NA" entries in my data. I fixed it by giving them appropriate values, although I still don't know how to make the code work with those NA values included.
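The row mismatch in the error message is what one would expect if lm() silently dropped the rows with NA, leaving fewer residuals than rows in the data. A minimal sketch of two ways to handle that, using a small made-up data frame (pts is hypothetical, not the real station file):

```r
# Hypothetical stand-in for one day's file, with two missing values.
pts <- data.frame(
  elev  = c(1027, 54, 850, 120, 400, 990),
  value = c(108.3, 122.7, NA, 95.1, 110.4, NA)
)

# Default behaviour: rows with NA are dropped, so the residual vector is shorter.
fit_default <- lm(value ~ elev, data = pts)
n_default <- length(residuals(fit_default))

# Option 1: na.exclude pads the residuals with NA so they line up with the rows.
# Note that residuals(fit) pads, while fit$residuals does not.
fit_padded <- lm(value ~ elev, data = pts, na.action = na.exclude)
pts$res <- residuals(fit_padded)

# Option 2: drop incomplete rows before modelling, so every later step
# (including the interpolation of residuals) sees only complete observations.
pts_complete <- pts[complete.cases(pts[, c("elev", "value")]), ]

print(n_default)
print(pts)
print(nrow(pts_complete))
```

For the loop in the question, option 2 is probably the simpler fit, since interpolating residuals that are NA would need the same rows removed anyway; this is a suggestion rather than something tested against the original files.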
I have a data frame that contains daily point precipitation for 4 stations over 2 years. I want to interpolate to 50 m resolution and write the results to 2 raster images. I used the following code to achieve this...
library(ggplot2)
library(gstat)
library(maptools)
library(raster)
library(rgdal)
xcord<-c(100,200,300,400)
ycord<-c(100,200,300,400)
value1<-c(1,2,3,1)
value2<-c(2,5,7,3)
datas<-data.frame(xcord,ycord,value1,value2)
coordinates(datas) = ~xcord+ycord
mesh <- expand.grid(x=seq(0,500,50),y=seq(0,500,50))
coordinates(mesh)=~x+y
gridded(mesh) <- TRUE
oneidw = idw(value1~1,datas,mesh)
spplot(oneidw["var1.pred"], main = " inverse distance weighted interpolations")
It worked, but I want to use a loop to do the same for another variable, value2 (and so on), without doing it manually.
I tried this:
sym<-paste("value", 1:2,sep="")
variable=as.vector(print(sym,quote=FALSE))
for (i in 3:ncol(datas)){
one<-idw((print(variable[i],quote=FALSE))~1,datas,mesh)
}
but I got the error "too many spatial dimensions".
Can anybody help me with this?
I'm not too familiar with spplot, but this worked for me using ggplot.
library(ggplot2)
library(gstat)
library(sp)
library(maptools)
library(maps)
library(dplyr)
library(rgdal)
xcord<-c(100,200,300,400)
ycord<-c(100,200,300,400)
value1<-c(1,2,3,1)
value2<-c(2,5,7,3)
datas<-data.frame(xcord,ycord,value1,value2)
new_datas <- select(datas, xcord, ycord)
parse_by <- colnames(datas)[3:4] #change according to designated value columns
for ( i in parse_by ) {
variable <- datas[i]
new_datas2 <- cbind(new_datas, variable) #combine single variable col w/ coordinates
colnames(new_datas2)[3] = "variable" #rename so that you can call to in idw formula
coordinates(new_datas2) = ~xcord+ycord
mesh <- expand.grid(x=seq(0,500,50),y=seq(0,500,50))
coordinates(mesh)=~x+y
gridded(mesh) = TRUE
plot(mesh) #plot background so ggplot can use later
points(new_datas2) #points for ggplot to use later
one<-idw(formula = variable~1, locations = new_datas2, newdata = mesh) #idw formula
one.output <- as.data.frame(one)
names(one.output)[1:3] <- c("xcord", "ycord", "var1.pred") #rename for ggplot geom_tile
ggplot() + geom_tile(data = one.output, alpha = 1, aes(x = xcord, y = ycord, fill = var1.pred)) +
geom_point(data = new_datas, aes(x = xcord, y = ycord)) +
labs(fill = "inverse distance weighted interpolations")
ggsave(paste(i,".png",sep="")) #save as .png according to variable name
}
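An alternative to renaming the column on each pass is to build the formula from the column name. The sketch below uses base R's reformulate() for that; the interpolation part is guarded so it only runs if gstat and sp are installed, and the names value_cols, formulas and results are illustrative.

```r
datas <- data.frame(
  xcord  = c(100, 200, 300, 400),
  ycord  = c(100, 200, 300, 400),
  value1 = c(1, 2, 3, 1),
  value2 = c(2, 5, 7, 3)
)

# One formula per value column: value1 ~ 1, value2 ~ 1, ...
value_cols <- c("value1", "value2")
formulas <- lapply(value_cols, function(v) reformulate("1", response = v))
names(formulas) <- value_cols
print(formulas)

if (requireNamespace("gstat", quietly = TRUE) &&
    requireNamespace("sp", quietly = TRUE)) {
  pts <- datas
  sp::coordinates(pts) <- ~ xcord + ycord

  mesh <- expand.grid(x = seq(0, 500, 50), y = seq(0, 500, 50))
  sp::coordinates(mesh) <- ~ x + y
  sp::gridded(mesh) <- TRUE

  # One interpolated surface per value column, stored by column name.
  results <- lapply(formulas, function(f) gstat::idw(f, pts, mesh))
}
```

Passing a real formula object avoids the problem in the original loop, where the result of print() (a character string) was used on the left-hand side of ~.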