Show specific values in barplot - r

I am trying to plot a graph that displays the cost of apps, but only for apps that cost more than 0 (that is, apps that aren't free). I have tried the filter function but got nowhere.
if (df$Price > 0){
barplot(table(df$Price))
}
I have also tried an if statement but it gives me a warning message saying:
In if (df$Price == test) { :
the condition has length > 1 and only the first element will be used
I am new to RStudio, so any help would be appreciated.
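One way to approach this, sketched under the assumption that df$Price is numeric: if() expects a single TRUE or FALSE, while df$Price > 0 produces one value per row, which is what the "condition has length > 1" message is complaining about. Subsetting the column with that logical result before tabulating avoids if() altogether. The data frame below is made up for illustration.

```r
# Hypothetical data: the real df and its Price column come from the question.
df <- data.frame(Price = c(0, 0, 0.99, 2.99, 0.99, 0, 4.99))

# Row positions where the price is above zero; which() also drops any NA prices.
paid_rows <- which(df$Price > 0)
paid_prices <- df$Price[paid_rows]

# Count how many apps share each non-zero price, then plot the counts.
price_counts <- table(paid_prices)
barplot(price_counts, xlab = "Price", ylab = "Number of apps")
```

The same subsetting idea should work with filter() from dplyr, but the base R version above needs no extra packages.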

Related

selecting small vif variables in r

I am trying to write an R function to select covariates with a small VIF.
Here is my code:
ea=read.csv("ea.csv")
library(car)
fullm<-lm(appEuse~.,data=ea)
cov<-names(ea)
ncov<-length(cov)
vifs<-rep(NA,ncov)
include<-rep(NA,ncov)
for (i in 1:ncov){
vifs[i]<-vif(fullm)[i]
if (vifs[i]<10){
include[i]<-cov[i+1]
}
}
Error in if (vifs[i] < 10) { : missing value where TRUE/FALSE needed
I also tried running the for loop from 1 to ncov-1, but then got the error "argument is of length zero".
Is there a way around it?
I could be wrong, but it looks like you're looping an if statement over a vector of NAs:
vifs<-rep(NA,ncov)
is later referenced at
if (vifs[i]<10){
Could this be your issue?
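Another possible source of the NA, offered as an assumption rather than a diagnosis: for a model whose predictors are all numeric, vif() returns one named value per predictor, so it has one entry fewer than names(ea), which also contains the response. Indexing past the end of that vector gives NA. A minimal sketch of a vectorized selection that avoids the loop, using a hypothetical named vector vif_values in place of vif(fullm):

```r
# Hypothetical VIF values standing in for vif(fullm); the names are made up.
vif_values <- c(x1 = 2.3, x2 = 15.1, x3 = 7.8)

# Indexing past the end of a vector returns NA, which `if` cannot handle.
past_end <- vif_values[4]

# Vectorized selection: keep the names of predictors with VIF below 10.
include <- names(vif_values)[vif_values < 10]
print(include)
```

Taking the names from the VIF vector itself also avoids the cov[i+1] offset between the covariate names and the VIF positions.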

Accessing API with for-loop randomly has encoding error, which breaks loop in R

I'm trying to access an API from iNaturalist to download some citizen science data. I'm using the package rinat to get this done (see vignette). The loop below is, essentially, pulling all observations for one species, in one state, in one year iteratively on a per-month basis, then summing the number of observations for that year (input parameters subset from my actual script for convenience).
require(rinat)
state_ids <- c(18, 14)
bird_ids <- c(14886,1409)
months <- c(1:12)
final_nums <- vector()
for(i in 1:length(state_ids)){
total_count <- vector()
for(j in 1:length(months)){
monthly <- get_inat_obs(place_id=state_ids[i],
taxon_id=bird_ids[i],
year=2019,
month = months[j])
total_count <- append(total_count, length(monthly$scientific_name))
print(paste("done with month", months[j], "in state", state_ids[i]))
}
final_nums <- append(final_nums, sum(total_count))
print(paste("done with state", state_ids[i]))
}
Occasionally, and seemingly randomly, I get the following error:
No encoding supplied: defaulting to UTF-8.
Error in if (!x$headers$`content-type` == "text/csv; charset=utf-8") { :
argument is of length zero
This ends up breaking the loop or makes the loop run without actually pulling any real data. Is this an issue with my script, or the API, or something else? I've tried manually supplying encoding information to the get_inat_obs() function, but it doesn't accept that as an argument. Thank you in advance!
I don't believe this is an error in your script; the issue is most likely with the API.
The error "argument is of length zero" commonly occurs when the condition of an if statement is a comparison involving a zero-length value. For example:
if(logical(0) == "TEST") print("WORKED!!")
#Error in if (logical(0) == "TEST") print("WORKED!!") :
# argument is of length zero
I ran a few greps on their source code to see where this if statement is, and it seems to be within inat_handle, line 211 in get_inat_obs.R.
This would suggest that the authors did not expect
!x$headers$`content-type` == 'text/csv; charset=utf-8'
to evaluate to logical(0), but more specifically
x$headers$`content-type`
to be NULL.
I would suggest making a bug report on their GitHub and recommend they change the specified line to:
if(is.null(x$headers$`content-type`) || !x$headers$`content-type` == 'text/csv; charset=utf-8'){
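To illustrate why the guard helps, here is a small sketch with a hypothetical response object x whose headers have no content-type entry. It is not taken from the rinat source; it only mimics the shape of the condition.

```r
# Hypothetical response object whose headers lack a content-type entry.
x <- list(headers = list())

content_type <- x$headers$`content-type`   # NULL: the element does not exist

# Comparing NULL with a string gives a zero-length logical, which `if` rejects.
unguarded <- !content_type == "text/csv; charset=utf-8"

# The guarded condition always yields a single TRUE or FALSE.
guarded <- is.null(content_type) || !content_type == "text/csv; charset=utf-8"
print(length(unguarded))
print(guarded)
```

Because || short-circuits, the comparison on the right is never evaluated when the header is NULL.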
A bug report is usually better received if it includes a reproducible example.
Also, you could make this change yourself locally by cloning the git repository, editing the file, rebuilding the package, and then confirming that your code no longer gives the error.
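Until a fix lands, one workaround on the script side is to catch the error and retry, so a single bad response does not break the loop. The sketch below is an assumption about how that could look: with_retry and flaky_fetch are hypothetical names, and flaky_fetch only simulates an API call that fails intermittently.

```r
# Hypothetical helper: call fetch() up to max_tries times, returning NULL
# if every attempt fails instead of stopping the surrounding loop.
with_retry <- function(fetch, max_tries = 3, wait_seconds = 0) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch(), error = function(e) NULL)
    if (!is.null(result)) return(result)
    Sys.sleep(wait_seconds)
  }
  NULL
}

# Stand-in for a flaky API call: fails twice, then succeeds.
calls <- 0
flaky_fetch <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("argument is of length zero")
  data.frame(scientific_name = c("a", "b"))
}

monthly <- with_retry(flaky_fetch)
print(nrow(monthly))
```

In the loop from the question, fetch could be function() get_inat_obs(place_id = state_ids[i], taxon_id = bird_ids[i], year = 2019, month = months[j]), and a NULL result would need to be handled before counting observations.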

Resolving "Error: subscript out of bounds" in a code loop

I am trying to run an R loop on an individual based model. This includes two lists referring to grid cells, which I originally ran into difficulties with because they returned the error: Error: (list) object cannot be coerced to type 'double'. I think I have resolved this error by typing "as.numeric(unlist(x))."
Example from code:
List 1:
dredg<-list(c(943,944,945,946,947,948,949...1744,1745))
dredging<-as.numeric(unlist(dredg))
I refer to 'dredging' in my code, not 'dredg'.
List 2:
nodredg<-list(c(612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631))
dcells<-as.numeric(unlist(nodredg))
I refer to 'dcells' in my code, not 'nodredg'.
However, now when I use these two number arrays (if that's what they are) my code returns the error
Error in dcells[[dredging[whichD]]] : subscript out of bounds
I believe this error is referring to the following lines of code:
if(julday>=dstart[whichD] & julday<=dend[whichD] & dredging[whichD]!=0){
whichcells<-which(gridd$inds[gridd$julday==julday] %in% dcells[[ dredging[whichD] ]]) #identify cells where dredging is occurring
}
where:
whichD<-1
julday<-1+day
dstart=c(1,25,75,100)
dend=c(20,60,80,117)
Here is the full block of code:
for (i in 1:time.steps){
qday <- qday + 1
for (p in 1:pop){
if (is.na(dolphin.q[p,1,i]!=dolphin.q[p,2,i])) {
dolphin.q[p,6,i]<-which.max(c(dolphin.q[p,1,i],dolphin.q[p,2,i]))
}else{
dolphin.q[p,6,i]<-rmulti(c(0.5,0.5))
}
usP<-usage[p,]
if(julday>=dstart[whichD] & julday<=dend[whichD] & dredging[whichD]!=0){
whichcells<-which(gridd$inds[gridd$julday==julday] %in% dcells[[ dredging[whichD] ]])
usP[whichcells]<-0
usP<-usP/sum(usP) #rescale the home range surface to 1
}
I was wondering if anyone could show me what is going wrong? I apologize if this is a very simple mistake; I am a novice who has been scouring the internet, manuals, and Stack for days with no luck!
Thanks in advance!
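A likely explanation, going only by the values shown in the question: dcells has 20 elements, while dredging[whichD] with whichD<-1 is 943, so dcells[[ dredging[whichD] ]] asks for element 943 of a 20-element vector. The sketch below reproduces that. The last part is a guess at the intent (testing membership in dcells as a whole), and cell_ids is a made-up stand-in for gridd$inds[gridd$julday==julday].

```r
# Values copied from the question; dredging is shortened to its first few
# cell numbers because the question elides the middle of that vector.
dcells <- as.numeric(612:631)
dredging <- c(943, 944, 945)
whichD <- 1

idx <- dredging[whichD]   # 943, far beyond length(dcells), which is 20

# `[[` on an atomic vector throws an error for an index past the end...
double_bracket_fails <- inherits(
  tryCatch(dcells[[idx]], error = function(e) e),
  "error"
)

# ...whereas `[` quietly returns NA.
single_bracket <- dcells[idx]

# If the intent was "which cells are in dcells", %in% needs the whole vector.
cell_ids <- c(610, 612, 620, 700)
whichcells <- which(cell_ids %in% dcells)
print(whichcells)
```

Whether dcells should be indexed at all depends on what dredging is meant to hold, which the question does not say.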

R / Rstudio : is there a watch function to continuously monitor some values?

I have a lot of different operations running on quite a big dataframe. It is becoming a pain to maintain, especially with some data being improperly formatted, and I'm looking at options to make my life easier.
The problem is that at one point in the flow of operations NAs are introduced in several rows, including in the id column (almost certainly due to some bad subsetting). I cannot find the culprit easily because each time I have to str() the dataframe or View() it in RStudio. This takes time, and I have already done it once without finding the bad operation.
So I'm curious whether there is a package that addresses this problem, or a way to program something "daemon-like" that pops up a warning message when a specific value appears.
A while loop doesn't help, because it evaluates all the statements in its body before checking the condition again, so it doesn't tell me which statement made the condition false:
while(nrow(df[is.na(df$id),]) > 0){
statements OK
breaking statement
other OK statements
}
I'll look for other options but I wanted to ask before...
EDIT: thanks for the useful comments, I will definitely look more into those functions. However, I also tried to build a watch function myself (see my answer).
OK, I think I have finally built something quite like it.
This is a function that sources a file line by line until a given condition is met:
watchIt <- function(file,watchexpression,startwatchline){
  line <- 1
  # Read the script as a character vector, one element per line
  sourceList <- scan(file = file, what="character", sep="\n", blank.lines.skip = FALSE)
  maxLines <- length(sourceList)
  # Run the lines before startwatchline without checking the condition
  while(startwatchline > line && maxLines >= line){
    cat("l")
    eval(parse(text=sourceList[line]))
    line <- line+1
    cat(line)
    cat(" ")
  }
  # From startwatchline on, check the condition before running each line
  while(eval(parse(text=watchexpression)) == FALSE && maxLines >= line){
    cat(" L")
    eval(parse(text=sourceList[line]))
    line <- line+1
    cat(line)
    cat(" ")
  }
  if(maxLines <= line) {
    cat("End of file reached without condition getting TRUE")
  } else {
    cat("Condition evaluated to TRUE on line :")
    cat(line)
    cat("\n")
    cat(sourceList[line])
  }
}
So this is how I use it:
watchIt("source_test.R","nrow(df[is.na(df$id),]) > 0",10)
This reads "source_test.R" into a character vector, one element per line, and, starting from line 10, tests whether the resulting dataframe has NAs in the id field. Execution stops either when the condition evaluates to TRUE or when the end of the file is reached.
I'm still waiting for other or better answers. Also, this is only about the fourth function I've managed to write in R, so I expect there are improvements to be made.
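A lighter-weight alternative, sketched here as an assumption rather than a tested replacement for watchIt: call a small assertion after each suspicious step so the script stops at the first step that introduces NA ids. check_ids and the data are hypothetical.

```r
# Hypothetical helper: stop with a clear message as soon as `id` contains NA.
check_ids <- function(df, step) {
  n_bad <- sum(is.na(df$id))
  if (n_bad > 0) {
    stop(sprintf("%d NA id(s) after step '%s'", n_bad, step))
  }
  invisible(df)
}

# Made-up data and steps standing in for the real pipeline.
df <- data.frame(id = 1:4, value = c(10, 20, 30, 40))
df <- check_ids(df[df$value > 5, ], "filter values")

# Indexing rows with an NA introduces an all-NA row, as bad subsetting can.
bad <- df[c(1, NA, 3), ]
outcome <- tryCatch(check_ids(bad, "subset rows"),
                    error = function(e) conditionMessage(e))
print(outcome)
```

Unlike sourcing a file line by line, this also works when a statement spans several lines, at the cost of having to add the checks by hand.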

Constructing a return in R

I have a function in R that I run on lists of files. After the following code I apply the function to the files and have R put the values in a new file. An equation in this function outputs a value based on water density and depth. I found that the top row of the depth column is not always 1 in these files, but I need it to be for the equation. I would like to output NA into the new sheet when it is not.
stratindex=function(file){
ctd=read.csv(file,header=T)
x=ctd$Density..sigma.t..kg.m.3..
y=ctd$Depth..salt.water..m...lat...60
if(y[1]!=1){return(NA)} else
((x[30]-x[1]) / 29)
}
This all reads in fine. Then I get to the next part, where it takes the individual files and applies the function to them.
the.files <- choose.files()
index <- sapply(the.files, stratindex)
After this, R prompts:
"Error in if (y[1] != 1) { : missing value where TRUE/FALSE needed"
Where did I go wrong? What can I do to nest other stipulations if I find them? For instance, if the depth at row 30 is not 30.
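The message "missing value where TRUE/FALSE needed" means the condition itself was NA, which happens here when y[1] is NA in one of the files. A minimal sketch of an NA-safe version with a second, nested check follows. It works on a data frame rather than a file so that it is self-contained, and the short column names density and depth are hypothetical stand-ins for the long names in the question.

```r
# Sketch of the check on an already-loaded data frame; the short column
# names `density` and `depth` are hypothetical stand-ins.
strat_index <- function(ctd) {
  x <- ctd$density
  y <- ctd$depth
  # isTRUE() turns NA or zero-length comparisons into FALSE instead of an error.
  if (!isTRUE(y[1] == 1)) return(NA_real_)
  if (!isTRUE(y[30] == 30)) return(NA_real_)
  (x[30] - x[1]) / 29
}

# Made-up profiles: one complete, one with a missing top depth.
good <- data.frame(depth = 1:30, density = seq(20, 23, length.out = 30))
missing_top <- good
missing_top$depth[1] <- NA

results <- sapply(list(good = good, missing_top = missing_top), strat_index)
print(results)
```

Each further stipulation can be added as another early return in the same style, before the final calculation.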
