Adding rows to different data frames programmatically - r

I'm creating data frames dynamically and using custom names to refer to them. I can successfully create the data frames and add information to each one manually, but when I try to add a record programmatically, the code runs without error and nothing happens. When I open the data frame, it is still empty.
#Extract unique machines on the system
machines <- unique(wo_raw$MACHINE)
for(machine in machines){
  #Check if the machine is present on current data frames or has a record
  if(exists(machine) && is.data.frame(get(machine))){
    #Machine already exists on the system
    cat(machine," is a dataframe","\n")
    netlbs <- subset(wo_raw,((wo_raw$TYPE =="T" & wo_raw$TYPE2=="E") | (wo_raw$TYPE == "T" & is.na(wo_raw$TYPE2))) & wo_raw$WEEK<=curWeek & wo_raw$MACHINE == machine & wo_raw$YEAR == curYear,select = NET_LBS)
    scraplbs<- subset(wo_raw,((wo_raw$TYPE =="T" & wo_raw$TYPE2=="E") | (wo_raw$TYPE == "T" & is.na(wo_raw$TYPE2))) & wo_raw$WEEK<=curWeek & wo_raw$MACHINE == machine & wo_raw$YEAR == curYear,select = SCRAP_LBS)
    if(is.data.frame(netlbs) && nrow(netlbs)!=0){
      totalNet<- sum(netlbs)
      totalScrap<- sum(scraplbs)
      scrapRate <- percent(totalScrap/(sum(totalNet,totalScrap)),accuracy = 2)
      tempDf<-data.frame(curYear,curMonth,curDay,curWeek,totalNet,totalScrap,scrapRate)
      names(tempDf)<-c("year","month","day","week","net_lbs","scrap_lbs","scrap_rate")
      cat("Total Net lbs for ",machine,": ",totalNet,"\n")
      cat("Total Scrap lbs for ",machine,": ",totalScrap,"\n")
      cat("Total Scrap Rate for ",machine,": ",scrapRate,"\n")
      #machine<-rbind(get(machine),tempDf)
      #assign(machine,rbind(machine,tempDf))
      add_row(get(machine),year=curYear,
              month=curMonth,
              day=curDay,
              week=curWeek,
              net_lbs=totalNet,
              scrap_lbs=totalScrap,
              scrap_rate=scrapRate)
      cat("added row \n")
    }
    #info<-c(curYear,curMonth,curDay,curWeek,netlbs)
    #cat("Total Net lbs: ",netlbs,"\n")
    #netlbs <-NULL
  }else{
    cat("Creating machine dataframe: ",machine,"\n")
    #Create a dataframe labeled with machine name containing
    #date information, net lbs,scrap lbs and scrap rate
    assign(paste0(machine,""),data.frame(year=integer(),
                                         month=integer(),
                                         day=integer(),
                                         week=integer(),
                                         net_lbs=double(),
                                         scrap_lbs=double(),
                                         scrap_rate=integer()
                                         )
    )
    #machine$year<-curYear
  }
  #machine<-NULL
}
All the functions I've tried are in the commented-out lines and come from previous answers found on Stack Overflow. I did get it working with a for loop, but I don't think that is really feasible, since it would consume a lot of resources and it doesn't work well when handling various data types. Does anybody have an idea of what's going on? I don't have an error message to go by.

I think your code needs quite a lot of cleanup. Make sure you know for yourself at each step what exactly you are handling.
Some hints:
Try to make your code self-contained. If I run your code, I get an error right away, because wo_raw is not defined. I understand it's some kind of data.frame, but what exactly is in it? What do I need to do to run your code? The same goes for variables like curYear. I get that it needs to be 2019, but I have to type an awful lot just to get to the problem; I can't simply copy and paste.
If you use any libraries, please also include a line for them. I don't know what add_row does or is supposed to do, so I can't tell whether that is where your expectations go wrong.
Try to make your code minimal before posting it here. I like the comments and cats sprinkled throughout, but why a line such as netlbs <- subset(wo_raw,((wo_raw$TYPE =="T" & wo_raw$TYPE2=="E") | (wo_raw$TYPE == "T" & is.na(wo_raw$TYPE2))) & wo_raw$WEEK<=curWeek & wo_raw$MACHINE == machine & wo_raw$YEAR == curYear,select = NET_LBS)? For this problem, just something like subset(wo_raw, wo_raw$mach==machine, net) would suffice
I get that the code works, but try to work out where you are using what kind of objects. if (is.data.frame(netlbs)) {total=sum(netlbs)} may work, but summing a data.frame while you actually just need a column leads to confusion.
When using variables to store the names of other variables, as you are doing, be very aware of what you are actually referring to. For that reason, it's generally advisable to steer clear of these constructs; it's almost always easier to store your results in a list or something similar.
Come to that: the variable machine is not a data.frame, it's a character. That character is the name of another variable, which is a data.frame. So (commented out) I see some instances of machine <- NULL and machine$year, those are wrong. As is rbind(machine, ...), as machine is not a data.frame
That being said, I think you got close with the assign-statement.
Does assign(machine,rbind(get(machine),tempDf)) work?
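To illustrate the suggestion above, here is a minimal sketch. The machine name and values are made up for illustration. It relies on rbind() returning a new data frame rather than modifying its argument, so the result has to be stored back under the same name. If the add_row in the question is tibble::add_row (an assumption, since the question does not show its library() calls), the same applies to it. The second half sketches the list-based alternative mentioned in the hints.

```r
# Hypothetical machine name and values, for illustration only.
machine <- "press_01"
assign(machine, data.frame(year = integer(), week = integer(), net_lbs = double()))

new_row <- data.frame(year = 2019L, week = 12L, net_lbs = 1530.5)

# rbind() returns a new data frame; it does not change the original in place,
# so the result is assigned back under the name stored in `machine`.
assign(machine, rbind(get(machine), new_row))
print(nrow(get(machine)))

# Alternative: keep one data frame per machine in a named list,
# which avoids assign()/get() altogether.
machine_data <- list()
machine_data[[machine]] <- rbind(machine_data[[machine]], new_row)
print(names(machine_data))
```

With the list approach, machine_data[["press_01"]] is always an ordinary data frame, so there is no confusion between the character name and the object it refers to.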

Related

Accessing API with for-loop randomly has encoding error, which breaks loop in R

I'm trying to access an API from iNaturalist to download some citizen science data. I'm using the package rinat to get this done (see vignette). The loop below is, essentially, pulling all observations for one species, in one state, in one year iteratively on a per-month basis, then summing the number of observations for that year (input parameters subset from my actual script for convenience).
require(rinat)
state_ids <- c(18, 14)
bird_ids <- c(14886,1409)
months <- c(1:12)
final_nums <- vector()
for(i in 1:length(state_ids)){
  total_count <- vector()
  for(j in 1:length(months)){
    monthly <- get_inat_obs(place_id=state_ids[i],
                            taxon_id=bird_ids[i],
                            year=2019,
                            month = months[j])
    # Record the number of observations returned for this month
    total_count <- append(total_count, length(monthly$scientific_name))
    print(paste("done with month", months[j], "in state", state_ids[i]))
  }
  # Sum the monthly counts to get the yearly total for this state
  final_nums <- append(final_nums, sum(total_count))
  print(paste("done with state", state_ids[i]))
}
Occasionally, and seemingly randomly, I get the following error:
No encoding supplied: defaulting to UTF-8.
Error in if (!x$headers$`content-type` == "text/csv; charset=utf-8") { :
argument is of length zero
This ends up breaking the loop or makes the loop run without actually pulling any real data. Is this an issue with my script, or the API, or something else? I've tried manually supplying encoding information to the get_inat_obs() function, but it doesn't accept that as an argument. Thank you in advance!
I don't believe this is an error in your script. The issue is most likely with the API.
The error argument is of length zero occurs when the condition passed to if() has length zero, typically because one side of a comparison is empty or NULL. For example:
if(logical(0) == "TEST") print("WORKED!!")
#Error in if (logical(0) == "TEST") print("WORKED!!") :
# argument is of length zero
I grepped their source code to see where this if statement is, and it seems to be in inat_handle, at line 211 of get_inat_obs.R.
This would suggest that the authors did not expect
!x$headers$`content-type` == 'text/csv; charset=utf-8'
to evaluate to logical(0) or, more specifically, did not expect
x$headers$`content-type`
to be NULL.
I would suggest making a bug report on their GitHub and recommend they change the specified line to:
if(is.null(x$headers$`content-type`) || !x$headers$`content-type` == 'text/csv; charset=utf-8'){
A bug report is usually better received if it includes a reproducible example.
You could also make this change yourself locally by cloning the git repository, editing the file, rebuilding the package, and then checking whether the error still occurs in your code.
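The effect of the proposed guard can be sketched without touching the API. The two response objects below are hand-made stand-ins for what the HTTP client returns (an assumption; they are not real rinat objects), and is_unexpected_type is a hypothetical helper name used only for this illustration.

```r
# Mock responses: one with the expected header, one with no content-type at all.
resp_ok      <- list(headers = list(`content-type` = "text/csv; charset=utf-8"))
resp_missing <- list(headers = list())

# Comparing NULL with a string yields logical(0), which if() cannot handle.
print(length(resp_missing$headers$`content-type` == "text/csv; charset=utf-8"))

# Hypothetical helper mirroring the suggested condition: check for NULL first,
# so that || short-circuits before the comparison is evaluated.
is_unexpected_type <- function(x) {
  ct <- x$headers$`content-type`
  is.null(ct) || !ct == "text/csv; charset=utf-8"
}

print(is_unexpected_type(resp_ok))
print(is_unexpected_type(resp_missing))
```

Because || evaluates its right-hand side only when the left-hand side is FALSE, the comparison is never reached for a missing header, and if() always receives a single TRUE or FALSE.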

How do I display with summaryCodings() only numbers of codings of a certain codecategory or several codes?

I'm using RQDA right now (for the first time) and I want to have an overview/comparison of the numbers of codings (to see which codes were used comparatively often, which were barely used).
I've tried summaryCodings() but it only gives me an overview of all codes.
How do I specify to make sure only certain codes (or, for example, one code category, and not all codes) are displayed?
I've tried variations like summaryCodings(codename == "xy") or summaryCodings(codecategory == "xx"), getCodingTable(codename == "xy" | codename == "zy").
I'm a beginner so still learning how to manage RQDA (obviously). Thank you for your help in advance!
The following query counts the codings per code within one code category:
setwd("<working path>")
library("DBI")
# An .rqda project file is an SQLite database, so it is opened with the RSQLite driver
con <- dbConnect(RSQLite::SQLite(), dbname="<database_name.rqda>")
answer <- dbGetQuery(con, "SELECT freecode.name, count(coding.cid)
                           FROM treeCode, freecode, codecat, coding
                           WHERE coding.cid = freecode.id
                             AND treeCode.catid = codecat.catid
                             AND treeCode.cid = freecode.id
                             AND codecat.name = '<your option>'
                           GROUP BY freecode.name")
dbDisconnect(con)
The result is now in the variable answer.

Undefined columns selected error in R

I apologize in advance because I'm extremely new to coding and was thrust into it just a few days ago by my boss for a project.
My data set is called s1. S1 has 123 variables and 4 of them have some form of "QISSUE" in their name. I want to take these four variables and duplicate them all, adding "Rec" to the end of each one (That way I can freely play with the new variables, while still maintaining the actual ones).
Running this line of code keeps giving me an error:
b<- llply(s1[,
str_c(names(s1)
[str_detect(names(s1), fixed("QISSUE"))],
"Rec")],table)
The error is as such:
Error in `[.data.frame`(s1, , str_c(names(s1)[str_detect(names(s1), fixed("QISSUE")) & :
undefined columns selected
Thank you!
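The error occurs because str_c(..., "Rec") produces column names that do not exist in s1 yet, so s1[, ...] cannot select them. Use this to get the subset first and rename afterwards. Of course, there are other ways to do this with simpler code.
library(plyr)
library(stringr)
# Select the existing QISSUE columns
b <- llply(s1[, names(s1)[str_detect(names(s1), fixed("QISSUE"))]], c)
# Build the new names by appending "Rec"
nwnam <- str_c(names(s1)[str_detect(names(s1), fixed("QISSUE"))], "Rec")
ndf <- data.frame(do.call(cbind, b))
colnames(ndf) <- nwnam
ndf
# of course you can do
cbind(s1,ndf)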
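As one possible simpler variant, the same duplication can be sketched in base R. The small s1 below is a made-up stand-in for the real 123-variable data set, and issue_cols is a name introduced only for this example.

```r
# Made-up stand-in for the real data set.
s1 <- data.frame(ID = 1:3,
                 QISSUE1 = c(1, 2, 1),
                 QISSUE2 = c(0, 1, 1))

# Names of the existing columns that contain "QISSUE".
issue_cols <- grep("QISSUE", names(s1), fixed = TRUE, value = TRUE)

# Copy them into new columns with "Rec" appended to each name.
s1[paste0(issue_cols, "Rec")] <- s1[issue_cols]

print(names(s1))

# Frequency tables of the copies, as in the original llply(..., table) call.
rec_tables <- lapply(s1[paste0(issue_cols, "Rec")], table)
```

The key point is the order of operations: select with the names that already exist, then assign under the new names.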

R: Name the output of a function the same as its input

I have edited the question to make my problem more specific.
This is my dataset (for example):
library(quantmod)
getSymbols("AAPL",from="2013-01-01")
data<-AAPL
p1<-4
dO<-data[,1]
dC<-data[,4]
emaO<-EMA(dO,n=p1)
emaC<-EMA(dC,n=p1)
fee<-0.1
cross<-ifelse((emaC<emaO & lag(emaC,1)>lag(emaO,1))|emaC>emaO & lag(emaC,1)<lag(emaO,1),"A","N")
type<-ifelse(emaC>emaO,"S",
             ifelse(emaC<emaO,"L","Equal"))
Pos_emaO_dO_UP<-emaO>dO
Pos_emaO_dO_D<-emaO<dO
Pos_emaC_dC_UP<-emaC>dC
Pos_emaC_dC_D<-emaC<dC
Pos_emaC_emaO_UP<-emaC>emaO
Pos_emaC_emaO_D<-emaC<emaO
Profit_L<-((((lag(dC,-1))-(lag(dO,-1)))/(lag(dO,-1)))*100)-fee
The data should end up in a data.frame like this:
df1<-data.frame(cross,type,Pos_emaO_dO_UP,Pos_emaO_dO_D,Pos_emaC_dC_UP,Pos_emaC_dC_D,Pos_emaC_emaO_UP,Pos_emaC_emaO_D,Profit_L)
colnames(df1)<-c("cross","type","Pos_emaO_dO_UP","Pos_emaO_dO_D","Pos_emaC_dC_UP","Pos_emaC_dC_D","Pos_emaC_emaO_UP","Pos_emaC_emaO_D","Profit_L")
conditions<-c(Pos_emaO_dO_UP,Pos_emaO_dO_D,Pos_emaC_dC_UP,Pos_emaC_dC_D,Pos_emaC_emaO_UP,Pos_emaC_emaO_D)
Perhaps I was wrong to ask you about this function:
savefun<-function(x){
  Condition<-deparse(substitute(x))
  f<-head(subset(table_1,prekrizeni=="A" & TYP1=="L" & x),-1)
  Success<-nrow(f[f$Zisk_L>0,])/nrow(f)
  d<-data.frame(Condition,Success)
  d
}
So I will describe everything I need, to avoid being misunderstood.
I want to make a function (or loop) that works as a two-step process.
1. I want to run savefun() on all of the conditions (the first, the second, and so on) and get a data.frame with all the results in the form data.frame(Condition,Success), as in savefun(), with number of rows = length(conditions).
2. Finally, I want a loop that repeats this until no condition gives a higher Success than the best one from the previous round. That is: run savefun() for all conditions, choose the condition with the highest Success, and add that condition to the subset() call that defines f inside savefun(), like this:
savefun<-function(x){
  Condition<-deparse(substitute(x))
  f<-head(subset(table_1,prekrizeni=="A" & TYP1=="L" & NEW_ADDED_CONDITION & x),-1)
  Success<-nrow(f[f$Zisk_L>0,])/nrow(f)
  d<-data.frame(Condition,Success)
  d
}
Then run savefun() again on all conditions (except the newly added one) and repeat this process until no combination gives a higher best Success than the previous round. Then stop the loop and return the result as a data.frame, or just the names of the conditions used in the last step before stopping.
I hope this is clear. I would really appreciate your help; I have to finish my school work and I am pressed for time. Thanks a lot again.
@Richard Scriven @Osssan
With assign and get:
savefun<-function(x){
  f<-subset(table_1,prekrizeni=="A" & TYP1=="L" & x)
  b<-nrow(f[f$Zisk_L>0,])/nrow(f)
  assign('x',list(x=x,b=b),envir=.GlobalEnv)
  return(get('x',envir=.GlobalEnv))
}
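If the goal is for the stored result to carry the name of the input, one possible sketch combines deparse(substitute(x)) from the question with assign. The helper save_result, the "_result" suffix, and the condition vector are all made up for this illustration, and Success is simplified here to the share of TRUE values rather than the profit-based ratio used in savefun().

```r
# Hypothetical helper: stores its result under a name derived from the
# name of the variable that was passed in.
save_result <- function(x, env = parent.frame()) {
  input_name <- deparse(substitute(x))   # name of the argument as typed by the caller
  result <- data.frame(Condition = input_name,
                       Success = mean(x, na.rm = TRUE))  # simplified stand-in metric
  # Create e.g. `cond_up_result` in the caller's environment.
  assign(paste0(input_name, "_result"), result, envir = env)
  invisible(result)
}

cond_up <- c(TRUE, FALSE, TRUE, TRUE)    # made-up condition vector
save_result(cond_up)
print(cond_up_result)
```

Returning the data.frame as well as assigning it means the results for several conditions can still be collected with do.call(rbind, ...), which tends to be easier to work with than many separately named objects.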

Lua: Doing arithmetic in for k,v in pairs(tbl) loops

I have a table such as the following:
mafiadb:{"Etzli":{"alive":50,"mafia":60,"vigilante":3,"doctor":4,"citizen":78,"police":40},"Charneus":{"alive":29,"mafia":42,"vigilante":6,"doctor":14,"citizen":53,"police":33}}
There are more nested tables, but I'm just trying to keep it simple for now.
I run the following code to extract certain values (I'm making an ordered list based on those values):
sortmaf={}
for k,v in pairs(mafiadb) do
  sortmaf[k]=v["mafia"]
end
That's one of the snippets I run. The problem I'm running into is that it doesn't appear you can do arithmetic in a table loop. I tried:
sortpct={}
for k,v in pairs(mafiadb) do
  sortpct[k]=(v["alive"]*100)/(v["mafia"]+v["vigilante"]+v["doctor"]+v["citizen"]+v["police"])
end
It returns that I'm attempting to do arithmetic on field "alive." What am I missing here? As usual, I appreciate any consideration in answering this question!
Edit:
Instead of replying to the comment, I'm going to add additional information here.
The mafiadb database I've posted IS the real database. It's just stripped down to two players instead of the current 150+ players I have listed in it. It's simply structured as such:
mafiadb = {
  Playername = {
    alive = 0,
    mafia = 0,
    vigilante = 0,
    doctor = 0,
    police = 0,
    citizen = 0
  }
}
Add a few hundred more playernames, and there you have it.
As for the error message, the exact message is:
attempt to perform arithmetic on field 'alive' (nil value)
So... I'm not sure what the problem is. In my first code, the one with sortmaf, it works perfectly, but suddenly, it can't find v["alive"] as a value when I'm trying to do arithmetic? If I just put v["alive"] by itself, it's suddenly found and isn't nil any longer. I hope this clarifies a bit more.
This looks like a simple typo to me.
Some of your 150+ player entries are probably malformed: they are missing the "alive" property, its name is misspelled, or its value is not a number. Try this:
for k,v in pairs(mafiadb) do
  if type(v.alive) ~= 'number' then
    print(k, "doesn't have a correct alive property")
  end
end
This should print the names of the "bad" players.

Resources