R error while using OR - r

In a data frame poll, statement 1 works but statement 2 does not. Any advice on why this is happening? The error I get is shown below.
Statement 1:
test = subset(poll, Internet.Use=="1" | Smartphone == "1")
Statement 2 :
limited = subset(po11, Internet.Use=="1" | Smartphone == "1")
Error in subset(po11, Internet.Use == "1" | Smartphone == "1") :
object 'po11' not found

In your second statement you wrote po11 (with two digits "1") instead of poll (with two letters "l").
Therefore R can't find this object, because it doesn't exist.
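A quick way to confirm this kind of typo is to ask R whether the object exists before using it. The sketch below uses a small made-up poll data frame (the values are hypothetical, only the column names come from the question):

```r
# Hypothetical stand-in for the poll data frame from the question
poll <- data.frame(
  Internet.Use = c("1", "0", "0"),
  Smartphone   = c("0", "1", "0"),
  stringsAsFactors = FALSE
)

# exists() takes the object name as a string
has_poll <- exists("poll")   # the correctly spelled name
has_po11 <- exists("po11")   # the misspelled name with two digits

# With the correct name the subset works as in statement 1
limited <- subset(poll, Internet.Use == "1" | Smartphone == "1")
```

Here has_po11 is FALSE as long as nothing called po11 has been created in the session, which is exactly what the "object 'po11' not found" error reports.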

Related

updating one data.table from another data.table in R

I'm trying to update one table (bigData, field smiles) using data from another table, but it produces an error:
bigData[
(bigData$smiles == '' | is.null(bigData$smiles) | is.na(bigData$smiles))
& bigData$compound_id %in% tmpCompounds$compound_id
,
`:=` (
smiles=dtChembl[dtChembl$chembl_id == compound_id ,]$canonical_smiles
, comment = paste(comment,'smiles added from chemblDB by chemblID;')
, filteringStep=12
)
]
The error I get is:
Error in `[.data.table`(dtChembl, dtChembl$chembl_id == compound_id, ) :
i evaluates to a logical vector length 5832210 but there are 1088555 rows. Recycling of logical i is no longer allowed as it hides more bugs than is worth the rare convenience. Explicitly use rep(...,length=.N) if you really need to recycle.
In addition: Warning message:
In dtChembl$chembl_id == compound_id :
longer object length is not a multiple of shorter object length
I have solved the problem. In case anyone needs it, this is the solution:
bigData[
dtChembl
, on=.(compound_id = chembl_id)
,
`:=` (
smiles= canonical_smiles
, comment = paste(comment,'smiles added from chemblDB by chemblID;')
, filteringStep=12
)
]
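For readers who want to try the update join on something small, here is a minimal sketch. It assumes the data.table package is installed; the toy rows are invented, and only the table and column names are taken from the question. The i. prefix is one explicit way to refer to a column of the table being joined in.

```r
library(data.table)

# Hypothetical toy tables standing in for bigData and dtChembl
bigData <- data.table(
  compound_id = c("C1", "C2", "C3"),
  smiles      = c("", "CCO", NA),
  comment     = c("", "", "")
)
dtChembl <- data.table(
  chembl_id        = c("C1", "C3"),
  canonical_smiles = c("CCN", "CCC")
)

# Update join: rows of bigData whose compound_id matches a chembl_id
# are modified by reference; unmatched rows are left untouched
bigData[
  dtChembl,
  on = .(compound_id = chembl_id),
  `:=`(
    smiles  = i.canonical_smiles,
    comment = paste(comment, "smiles added from chemblDB by chemblID;")
  )
]
```

Note that this form overwrites smiles for every matched row, not only for rows where smiles was empty or NA, so it may need an extra filter if existing values must be preserved.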

If condition is not showing the result

I am running the code below. It runs, but it does not show me any output:
for (name in tita$name){
  if (tita$sex == 'female' && tita$embarked == 'S' && tita$age > 33.00)
  {
    print (name)
  }
}
It's just showing me ****** in R studio. When I check the dataset, it does contain females older than 33 who embarked from S, yet this statement shows no result. When I change the value from 33 to 28, the same code does show results. Why is that?
I am using the following dataset:
https://biostat.app.vumc.org/wiki/pub/Main/DataSets/titanic3.csv
I think you're mixing loops and vectorization where you shouldn't. As I mentioned in the comments your conditions are vectorized, but it looks like you're trying to evaluate each element in a loop.
You should do either:
# loop through elements
for (i in seq_along(tita$name)){
  # isTRUE() treats an NA result (e.g. a missing age) as FALSE instead of raising an error
  if (isTRUE(tita$sex[i] == 'female' & tita$embarked[i] == 'S' & tita$age[i] > 33.00)){
    print(tita$name[i])
  }
}
OR use vectorization (this will be faster and is recommended):
conditions <- tita$sex == 'female' & tita$embarked == 'S' & tita$age > 33.00
# which() drops positions where the condition is NA (e.g. a missing age)
names <- tita$name[which(conditions)]
Here conditions is a logical vector of TRUE and FALSE values, with TRUE where all the conditions are met. We can use this vector to subset in R. For more information on what I mean by vectorization please see this link.
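The following sketch may help explain why 28 appeared to work while 33 did not. In older versions of R, && applied to whole columns silently looked only at the first element (newer versions, 4.3.0 onward as far as I know, raise an error instead), so the if() was effectively testing the first row on every iteration. The data frame below is a made-up stand-in for the Titanic data, with a first row chosen to reproduce that behaviour.

```r
# Hypothetical toy data standing in for the titanic3 data set
tita <- data.frame(
  name     = c("A", "B", "C", "D"),
  sex      = c("female", "female", "male", "female"),
  embarked = c("S", "S", "S", "C"),
  age      = c(29, 40, 50, NA),
  stringsAsFactors = FALSE
)

# What the original if() effectively tested in older R: the first row only
first_row_33 <- tita$sex[1] == "female" && tita$embarked[1] == "S" && tita$age[1] > 33
first_row_28 <- tita$sex[1] == "female" && tita$embarked[1] == "S" && tita$age[1] > 28

# Element-wise test over all rows; which() drops any NA results
conditions <- tita$sex == "female" & tita$embarked == "S" & tita$age > 33
matched <- tita$name[which(conditions)]
```

With this toy data the first-row test is FALSE for 33 (so nothing is printed) and TRUE for 28 (so every name is printed), while the vectorized version picks out only the rows that really satisfy all three conditions.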

MS Project formula calculation returns inconsistent results

In MS Project Professional I have a custom field that returns the correct value...sometimes, no value at other times, and an #ERROR at still other times with no apparent rhyme or reason.
The goal: I need to capture the [Resource Names] field for use in an external application, which is easy enough, but when I have a fixed units task with limited resource units I need to exclude the "[##%]" portion of the name. Example: the field contains Sam[25%], but I need just "Sam".
The formula: IIf(IsNumeric(InStr(1,[Resource Names],"[")),LEFT([Resource Names],Len([Resource Names])-5),[Resource Names])
The results are in summary:
Marian == M
Sam == #ERROR
Sam[25%] == Sam
IDNR == #ERROR
Core Dev == Cor
Bindu == Bindu
Bindu[50%] == Bindu
Michele == Mi
Michele[25%] == Michele
Disha == empty
Disha[33%] == Disha
Stuart[50%] == Stuart
Stuart == S
Strangely enough, Summary Tasks show no value which is correct.
The need: can someone help me fix the formula? Or, should I just suck it up and manually delete the offending brackets and numbers?
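The original formula fails because InStr always returns a number (0 when "[" is not found), so IsNumeric(...) is always true and the last five characters are removed from every name, bracketed or not. If you only ever have one resource assigned to a task, this formula will work: IIf(0=InStr(1,[Resource Names],"["),[Resource Names],Left([Resource Names],InStr(1,[Resource Names],"[")-1)).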
However, building a formula to handle more than one resource would be extremely tedious with the limited functions available. In that case a macro to update the field would work much better:
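Sub GetResourceNames()
    Dim t As Task
    For Each t In ActiveProject.Tasks
        ' Build a comma-separated list of resource names for this task
        Dim resList As String
        resList = vbNullString
        Dim a As Assignment
        For Each a In t.Assignments
            resList = resList & "," & a.Resource.Name
        Next a
        ' Mid$ drops the leading comma; Text2 is the custom field being filled
        t.Text2 = Mid$(resList, 2)
    Next t
End Sub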

R function error when using a second & in a subset vector

Hello, I am new to Stack Overflow and have an R query.
I am editing some existing code to analyse some data for a report.
The existing code is:
bar_lib <- make_table(col_type = 'multi_yn', multi_cols = Bar_lib_cols,inclNA=TRUE, title = 'Barriers to using public library services', subsetvec = (!is.na(DATA$sclibrary) & DATA$sclibrary=='No'))
The above code works and produces a table.
Below is the edited code, with which I am trying to analyse the variable barlib07:
subsetvec = (!is.na(DATA$sclibrary) & DATA$sclibrary=='No'& DATA$barlib07='Yes'))
With this code I am getting an error:
Error in !is.na(DATA$sclibrary) & DATA$sclibrary == "No" &
DATA$barlib07 = "Yes" : could not find function "&<-"
I am not sure how to resolve this.
Please help.
Thanks,
Analyst001
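A single = is assignment, so R tries to assign "Yes" to the whole expression on its left and looks for a replacement function named "&<-". Comparison needs ==. Try changing to subsetvec = (!is.na(DATA$sclibrary) & DATA$sclibrary == 'No' & DATA$barlib07 == 'Yes'))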
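To see what the corrected condition produces, here is a small sketch with a made-up DATA data frame (the rows are hypothetical, only the column names come from the question). It builds just the logical vector, since make_table is the asker's own function and is not reproduced here.

```r
# Hypothetical stand-in for the DATA data frame
DATA <- data.frame(
  sclibrary = c("No", "No", "Yes", NA),
  barlib07  = c("Yes", "No", "Yes", "Yes"),
  stringsAsFactors = FALSE
)

# Every comparison uses ==; the pieces are combined element-wise with &
subsetvec <- !is.na(DATA$sclibrary) &
  DATA$sclibrary == "No" &
  DATA$barlib07 == "Yes"
```

Only the first row is TRUE here: it is the only one with a non-missing sclibrary of "No" and a barlib07 of "Yes".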

Using condition in columns of data frame to generate a vector in R

I have the following array:
  Year Month Day Hour
1    1     1   1    0
2    1     1   1    3
...
etc
I wrote a function which I then tried to vectorize by using apply in order to run the calculations on a row-by-row basis, but it doesn't work due to the booleans:
day_in_season<-function(tarr){
  #first month in season
  if((tarr$month==12) || (tarr$month==3) ||(tarr$month==6) || (tarr$month==9)){
    d=tarr$day
  #second month in season
  }else if ((tarr$month==1) || (tarr$month==4)){
    d=31+tarr$day
  }else if((tarr$month==7) || (tarr$month==10)){
    d=30+tarr$day
  #third month in season
  }else if((tarr$month==2)){
    d=62+tarr$day
  }else{
    d=61+tarr$day
  }
  h=tarr$hour/24
  d=d+h
  return(d)
}
I tried
apply(tdjf,1,day_in_season)
but it raised this exception:
Error in tarr$month : $ operator is invalid for atomic vectors
(I already knew about this potential pitfall, but that's why I wanted to use apply in the first place!)
The only way I can currently get it to work is if I do this:
days<-c()
for (x in 1:nrow(tdjf)){
  d<-day_in_season(tdjf[x,])
  days=append(days,d)
}
If there were only a few values, I'd throw up my hands and just use the for loop, efficiency be damned, but I have over 15,000 rows and that's just one dataset. I know that there has to be a way to make it work.
To vectorize your code, use ifelse() and | instead of ||:
ifelse(
  (tarr$month==12) | (tarr$month==3) |(tarr$month==6) | (tarr$month==9),
  tarr$day,
  ifelse((tarr$month==1) | (tarr$month==4),
    31+tarr$day,
    ifelse((tarr$month==7) | (tarr$month==10),
      30+tarr$day,
      ifelse(tarr$month==2,
        62+tarr$day,
        61+tarr$day)
    )
  )
)+tarr$hour/24
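As a runnable illustration of the same idea, here is a variant that uses %in% to shorten the month tests and applies the result to a whole data frame at once. The toy rows and the offset helper variable are made up for this sketch; the column names follow the lower-case names used in the function.

```r
# Hypothetical toy data with the lower-case column names the function expects
tdjf <- data.frame(
  month = c(12, 1, 7, 2, 5),
  day   = c(1, 1, 15, 10, 20),
  hour  = c(0, 3, 12, 6, 18)
)

# Days contributed by the months of the season that have already passed,
# computed for every row in one pass
offset <- ifelse(tdjf$month %in% c(12, 3, 6, 9), 0,
          ifelse(tdjf$month %in% c(1, 4), 31,
          ifelse(tdjf$month %in% c(7, 10), 30,
          ifelse(tdjf$month == 2, 62, 61))))

# Day within the season plus the fractional day from the hour
tdjf$days <- offset + tdjf$day + tdjf$hour / 24
```

No loop or apply call is needed, because ifelse(), %in% and the arithmetic operators all work element-wise on entire columns.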
You might be surprised at how quickly a well constructed for loop can run. If designed well, it has about the same efficiency as an apply statement.
The proper for loop in your case is
tdjf$days <- vector ("numeric", nrow (tdjf))
for (x in seq_along (tdjf$days)){
  tdjf$days [x] <- day_in_season(tdjf[x,])
}
If you really want to go the apply route, I would recommend rewriting your function to take three arguments -- month, day, and hour -- and pass those three columns into mapply.
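A minimal sketch of that mapply approach is below. The name day_in_season3 and the toy data are hypothetical. It also sidesteps the original error: apply() converts the data frame to a matrix and hands each row to the function as an atomic vector, on which $ does not work, whereas mapply passes plain scalar values.

```r
# Scalar version of the function taking three plain arguments (hypothetical name)
day_in_season3 <- function(month, day, hour) {
  if (month %in% c(12, 3, 6, 9)) {
    d <- day            # first month in season
  } else if (month %in% c(1, 4)) {
    d <- 31 + day       # second month in season
  } else if (month %in% c(7, 10)) {
    d <- 30 + day
  } else if (month == 2) {
    d <- 62 + day       # third month in season
  } else {
    d <- 61 + day
  }
  d + hour / 24
}

# Hypothetical toy data
tdjf <- data.frame(month = c(12, 1, 2), day = c(1, 1, 10), hour = c(0, 3, 6))

# mapply calls the function once per row, pairing up the three columns
days <- mapply(day_in_season3, tdjf$month, tdjf$day, tdjf$hour)
```

Because each call receives single values, the scalar if/else chain works unchanged.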
