I have a dataframe
soDf <- structure(list(State = c("Exception", "Exception", "Exception", "Exception", "Approval", "Processing"), User = c("1","2", "1", "3", "1", "4"), Voucher.Number = c(10304685L, 10304685L, 10304685L,10304685L, 10304685L, 10304685L), Queue.Exit.Date = c("8/24/2016 14:59", "8/26/2016 13:25", "8/26/2016 15:56", "8/26/2016 16:13", "8/26/2016 16:25", "8/26/2016 17:34")),.Names = c("State", "User", "Voucher.Number","Queue.Exit.Date"), row.names = 114:119, class = "data.frame")
I have a list of rules that I want to filter rows by:
One of the rules being
(Voucher.Number == lag(Voucher.Number)) & (State == 'Exception' & lag(State) == 'Exception' )
If the current and lagged voucher numbers are equal, and both rows have an Exception state, then mark that row as TRUE.
When I apply this rule together with a couple of others, it returns the 4th row as TRUE when it should be FALSE:
  State      User Voucher.Number Queue.Exit.Date toFilt
1 Exception  1    10304685       8/24/2016 14:59 NA
2 Exception  2    10304685       8/26/2016 13:25 TRUE
3 Exception  1    10304685       8/26/2016 15:56 TRUE
4 Exception  3    10304685       8/26/2016 16:13 TRUE
5 Approval   1    10304685       8/26/2016 16:25 FALSE
6 Processing 4    10304685       8/26/2016 17:34 FALSE
Here is the code I used with all of the filtering rules
soDf <- soDf %>%
  arrange(Voucher.Number, Queue.Exit.Date) %>%
  mutate(toFilt = ((User == lag(User) & Voucher.Number == lag(Voucher.Number))) |
           ((Voucher.Number != lag(Voucher.Number)) & State == "Exception") |
           ((Voucher.Number == lag(Voucher.Number)) & (State == 'Exception' & lag(State) == 'Exception')) |
           ((Voucher.Number == lag(Voucher.Number)) & (User == lag(User))))
Row 5 does not meet any of the conditions in your mutate() call. Its State is "Approval" rather than "Exception", and its User does not match the lagged User.
For this reason it returns FALSE, as none of the 4 statements is TRUE. This does not appear to be a coding error; the conditional statement just needs altering to match your needs. Hope this helps!
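To make the row-by-row evaluation easier to follow, here is a minimal Python sketch of the same rules applied to the six rows from the question. It is only an illustration of the logic, not the dplyr code itself: the function name to_filt and the use of None in place of R's NA are assumptions made for this example, and the duplicated first and fourth conditions are written once.

```python
# Rows from the question as (State, User, Voucher.Number),
# already sorted by voucher number and queue exit date.
rows = [
    ("Exception", "1", 10304685),
    ("Exception", "2", 10304685),
    ("Exception", "1", 10304685),
    ("Exception", "3", 10304685),
    ("Approval", "1", 10304685),
    ("Processing", "4", 10304685),
]


def to_filt(cur, prev):
    """Evaluate the question's rules for one row; prev is the lagged row."""
    if prev is None:
        return None  # stands in for R's NA: the first row has no lag
    state, user, voucher = cur
    p_state, p_user, p_voucher = prev
    same_user = user == p_user and voucher == p_voucher
    new_voucher_exception = voucher != p_voucher and state == "Exception"
    consecutive_exceptions = (
        voucher == p_voucher and state == "Exception" and p_state == "Exception"
    )
    return same_user or new_voucher_exception or consecutive_exceptions


flags = [
    to_filt(row, rows[i - 1] if i > 0 else None) for i, row in enumerate(rows)
]
print(flags)  # [None, True, True, True, False, False]
```

Row 4 comes out TRUE because it and row 3 are both "Exception" on the same voucher, which satisfies the consecutive-exceptions rule regardless of the user.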
I have the following code:
Telemetry
| where DataMetadata["category"] == "Warning"
| summarize
Duration = sum(case(Name == "Event", totimespan(Value), totimespan(0))),
Text = min(case(Name == "Information", tostring(Value), "N/A")),
DeviceID = min(case(Name == "Ident", tostring(Value), "N/A"))
by Timestamp
| summarize TotalDuration = sum(Duration) by Text,DeviceID
| top 2 by TotalDuration
| summarize Duration = max(case(isnotnull(TotalDuration) or isnotempty(TotalDuration), strcat("Duration: ",format_timespan(TotalDuration, 'dd:hh:mm:ss'), "[sec] ",DeviceID," - ",Text), tostring(timespan(0))))
Checking the last hour of data, the condition DataMetadata["category"] == "Warning" is not met and in this case I want to display as a result 00:00:00:00 as shown in the summarize at the end of the code.
However, what I get as a result is the following:
What is the issue here and how can I solve it ?
I assume that you do want the top 2 records by TotalDuration, in case there are any.
let Telemetry = datatable(DataMetadata:dynamic, Name:string, Timestamp:datetime, Value:string)[];
Telemetry
| where DataMetadata["category"] == "Warning"
| summarize
Duration = sum(case(Name == "Event", totimespan(Value), totimespan(0))),
Text = min(case(Name == "Information", tostring(Value), "N/A")),
DeviceID = min(case(Name == "Ident", tostring(Value), "N/A"))
by Timestamp
| summarize TotalDuration = sum(Duration) by Text,DeviceID
| union (print TotalDuration = 0s, Text = "NA", DeviceID = "NA")
| top 2 by TotalDuration
| project Duration = strcat("Duration: ",format_timespan(TotalDuration, 'dd:hh:mm:ss'), "[sec] ",DeviceID," - ",Text)
Duration
Duration: 00:00:00:00[sec] NA - NA
I'm using this code:
ovabonnement <- ovabonnement %>%
  mutate(c12_ovabonnement_type_con_voor = case_when(
    s2_ovabonnement_type_voor_anders == 1 ~ NA,
    s2_ovabonnement_type_voor_1 == 1 |
      s2_ovabonnement_type_voor_13 == 1 ~ "Basis",
    s2_ovabonnement_type_voor_2 == 1 |
      s2_ovabonnement_type_voor_3 == 1 |
      s2_ovabonnement_type_voor_4 == 1 |
      s2_ovabonnement_type_voor_9 == 1 |
      s2_ovabonnement_type_voor_11 == 1 ~ "Voordeel",
    s2_ovabonnement_type_voor_5 == 1 |
      s2_ovabonnement_type_voor_6 == 1 |
      s2_ovabonnement_type_voor_7 == 1 |
      s2_ovabonnement_type_voor_8 == 1 |
      s2_ovabonnement_type_voor_10 == 1 |
      s2_ovabonnement_type_voor_12 == 1 |
      s2_ovabonnement_type_voor_14 == 1 ~ "Vrij"))
So I have these 15 variables that represent whether a person has that subscription added onto their public transport membership. Because it was a multiple choice questionnaire people could select multiple choices, which is why they are different variables.
I want to make these into one variable that takes NA if people answered "other", "Basis" if people answered 1 or 13, "Voordeel" if people answered 2,3,4,9 or 11 and "Vrij" if people answered 5,6,7,8,10,12 or 14.
If people answered 2, there will be a 1 in s2_ovabonnement_type_voor_2. People can have answered multiple of these, which makes it a bit tricky. However, I want it to go through these chronologically. For example, if a person answered 2 AND 10, it should choose the 10, because the code is later, but I'm not sure if that is how case_when works.
I get this error:
Error in `mutate()`:
! Problem while computing `c12_ovabonnement_type_con_voor = case_when(...)`.
Caused by error in `names(message) <- `*vtmp*``:
! 'names' attribute [1] must be the same length as the vector [0]
Run `rlang::last_error()` to see where the error occurred.
case_when() and if_else() are type-sensitive, i.e. all the right-hand-side expressions should return the same type. In the OP's expression, the first branch returns NA, which is logical by default, while all the others return character. We need NA_character_ so that it matches the type of the others:
ovabonnement <- ovabonnement %>%
  mutate(c12_ovabonnement_type_con_voor = case_when(
    s2_ovabonnement_type_voor_anders == 1 ~ NA_character_,
    s2_ovabonnement_type_voor_1 == 1 |
      s2_ovabonnement_type_voor_13 == 1 ~ "Basis",
    s2_ovabonnement_type_voor_2 == 1 |
      s2_ovabonnement_type_voor_3 == 1 |
      s2_ovabonnement_type_voor_4 == 1 |
      s2_ovabonnement_type_voor_9 == 1 |
      s2_ovabonnement_type_voor_11 == 1 ~ "Voordeel",
    s2_ovabonnement_type_voor_5 == 1 |
      s2_ovabonnement_type_voor_6 == 1 |
      s2_ovabonnement_type_voor_7 == 1 |
      s2_ovabonnement_type_voor_8 == 1 |
      s2_ovabonnement_type_voor_10 == 1 |
      s2_ovabonnement_type_voor_12 == 1 |
      s2_ovabonnement_type_voor_14 == 1 ~ "Vrij"))
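On the ordering question: case_when() evaluates its conditions in the order they are written and uses the first one that is TRUE, so a person who answered both 2 and 10 gets "Voordeel" with the ordering above, not "Vrij". The following Python sketch only mimics that first-match behaviour; the names BRANCHES, VRIJ_FIRST and classify, and the representation of a respondent's answers as a set, are assumptions for illustration.

```python
# Branches in the same order as the case_when() call: the first match wins.
BRANCHES = [
    ({"anders"}, None),
    ({1, 13}, "Basis"),
    ({2, 3, 4, 9, 11}, "Voordeel"),
    ({5, 6, 7, 8, 10, 12, 14}, "Vrij"),
]

# Alternative ordering that checks "Vrij" before "Voordeel" and "Basis".
VRIJ_FIRST = [BRANCHES[0], BRANCHES[3], BRANCHES[2], BRANCHES[1]]


def classify(answers, branches=BRANCHES):
    """Return the label of the first branch sharing a code with answers."""
    for codes, label in branches:
        if answers & codes:  # non-empty intersection means the branch matches
            return label
    return None  # no branch matched (case_when() would give NA)


print(classify({2, 10}))              # Voordeel
print(classify({2, 10}, VRIJ_FIRST))  # Vrij
```

If the later codes should take priority, one option is to list the "Vrij" conditions before the "Voordeel" and "Basis" ones in the case_when() call.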
I am struggling with this loop. I want to get "6" in the second row of column "newcolumn", but I get the following error:
Error in if (mydata$type_name[i] == "a" && mydata$type_name[i - :
missing value where TRUE/FALSE needed.
The code that I created:
id type_name name score newcolumn
1  a         Car  2     2
1  a         van  2     6
1  b         Car  2     2
1  b         Car  2     2
mydata$newcolumn <- c(0)
for (i in 1:length(mydata$id)) {
  if ((mydata$type_name[i] == "a") && (mydata$type_name[i-1] == "a") && ((mydata$name[i]) != (mydata$name[i-1]))) {
    mydata$newcolumn[i] = mydata$score[i] * 3
  } else {
    mydata$newcolumn[i] = mydata$score[i] * 1
  }
}
Thank you very much in advance
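Vectors in R are indexed from 1. Because your loop starts at i = 1, the first iteration looks up mydata$type_name[i-1] at index 0, which returns a zero-length vector, so the condition cannot evaluate to TRUE or FALSE. Start the loop at 2 and handle the first row separately.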
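As a rough sketch of the corrected logic, here it is in Python using the four rows from the question. Note that Python lists are zero-based and a negative index wraps around instead of failing, so this only illustrates the idea of handling the first row separately and starting the comparison at the second row; the variable names mirror the question's columns.

```python
type_name = ["a", "a", "b", "b"]
name = ["Car", "van", "Car", "Car"]
score = [2, 2, 2, 2]

# The first row has no previous row to compare against,
# so it takes the default multiplier of 1.
newcolumn = [score[0] * 1]

# Start at the second row so that i - 1 always refers to a real row.
for i in range(1, len(score)):
    triple = (
        type_name[i] == "a"
        and type_name[i - 1] == "a"
        and name[i] != name[i - 1]
    )
    newcolumn.append(score[i] * (3 if triple else 1))

print(newcolumn)  # [2, 6, 2, 2]
```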
I've read in my SPSS file in R and want to recode a new variable if such and such assumptions are made. To be specific:
I want to turn my spssdata_sub$gest variable into a new variable if the following conditions are met:
spssdata_sub$indusert != 2 & spssdata_sub$ivf != 1 & spssdata_sub$leie != 3 & spssdata_sub$svkompl_II != 7 & spssdata_sub$svkompl_II != 2 & spssdata_sub$svkompl_II != 1
Anyone here who can help me with a code?
Does one of the following codes work for you?
Either this adapted version of Renu's solution
spssdata_sub$gest <- ifelse(spssdata_sub$indusert != 2 & spssdata_sub$ivf != 1 & spssdata_sub$leie != 3 & spssdata_sub$svkompl_II != 7 & spssdata_sub$svkompl_II != 2 & spssdata_sub$svkompl_II != 1, spssdata_sub$gest, NA)
or this code for filtering observations:
library(dplyr)
spssdata_sub_new <- spssdata_sub %>%
  filter(indusert != 2 & ivf != 1 & leie != 3 & svkompl_II != 7 & svkompl_II != 2 & svkompl_II != 1)
One way is the following, if you really mean that any one of the conditions is enough to exclude a row:
Mynewdata <- dplyr::filter(spssdata, indusert != 2, ivf != 1, leie != 3,
svkompl_II != 7 & svkompl_II != 2 & svkompl_II != 1)
This only keeps entries that match none of the values; put the other way, it excludes entries that have indusert = 2 or ivf = 1, etc. Any one of the conditions is enough to exclude a row.
Add-on: or something like this:
Mynewdata <- dplyr::filter(spssdata, indusert != 2, ivf != 1, leie != 3,
!(svkompl_II %in% c(7, 2, 1)))
I have a database table which contains a field named Province bits. It holds the sum of the values of the selected entries, following a pattern like:
AB-1
BC-2
CD-4
DE-8
EF-16.... and so on.
Now in the table entry I have the value 13 (Province bit), which implies that the checkboxes for entries AB, CD and DE (which add up to 13) should be checked.
I am not able to work out the logic for this. How can I check only those checkboxes whose values add up to the entry in the table?
You need to check to see if the value is in the bitwise total.
if ((interestedInValue & totalValue) == interestedInValue)
{
// this value is in the total, check the box
}
Documentation on & http://msdn.microsoft.com/en-us/library/sbf85k1c(v=vs.71).aspx
e.g. 13 = 1 + 4 + 8
(13 & 1) == 1 // true
(13 & 2) == 2 // false
(13 & 4) == 4 // true
(13 & 8) == 8 // true
(13 & 16) == 16 // false
EDIT: for more clarification
ab.Checked = (ProvinceBit & 1) == 1; // checkbox AB
bc.Checked = (ProvinceBit & 2) == 2; // checkbox BC
...
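The same check can be run over every entry at once. Below is a minimal sketch in Python rather than C#, assuming the bit values listed in the question; the names PROVINCE_BITS and checked_provinces are made up for this example.

```python
# Bit value assigned to each province, as listed in the question.
PROVINCE_BITS = {"AB": 1, "BC": 2, "CD": 4, "DE": 8, "EF": 16}


def checked_provinces(province_bit):
    """Return the provinces whose bit is set in the stored total."""
    return [
        name
        for name, bit in PROVINCE_BITS.items()
        if (province_bit & bit) == bit
    ]


print(checked_provinces(13))  # ['AB', 'CD', 'DE']
```

Each returned name corresponds to a checkbox that should be checked; a stored value of 0 yields an empty list.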
The field is using bit flags.
13 is 1101 binary.
So convert the value to bits and assign one bit to each checkbox.
By converting your number to a string, you can convert to an array or just iterate through the string. A bit brute force, but will give you what you need.
var value = 13;
string binary = Convert.ToString(value, 2);
//binary = "1101"