Regex in if else statement in R - r
I have a rather simple question. I am trying to get the if else statement below to work.
It is supposed to assign '1' if the if statement is met, 0 otherwise.
My problem is that I cannot get the regex in the if statement to work ('\\w*|\\W*'). It is supposed to express the condition that the string is either exactly "Registration Required" or "Registration Required" followed by any characters. I cannot list the exact cases, because whatever follows "Registration Required" (in the cases where something follows) is usually a date (varying for each observation) and a few words.
Registration_cleaned <- c()
for (i in 1:length(Registration)) {
  if (Registration[i] == ' Registration Required\\w*|\\W*') {
    Meta_Registration_cleaned <- 1
  } else {
    Meta_Registration_cleaned <- 0
  }
  Registration_cleaned <- c(Registration_cleaned, Meta_Registration_cleaned)
}
You may use transform together with the ifelse function to set Meta_Registration_cleaned.
To match the regular expression, the grepl function can be used with the pattern "Registration Required\w*".
Registration <- data.frame(
  reg = c("Registration Required", "Registration Required ddfdqf",
          "some str", "Regixxstration Required ddfdqf"),
  stringsAsFactors = FALSE
)
transform(Registration,
          Meta_Registration_cleaned = ifelse(grepl("Registration Required\\w*", Registration[, "reg"]), 1, 0))
Gives result:
                             reg Meta_Registration_cleaned
1          Registration Required                         1
2   Registration Required ddfdqf                         1
3                       some str                         0
4 Regixxstration Required ddfdqf                         0
I might have misunderstood the OP completely, because I have understood the question entirely differently than anyone else here.
My comment earlier suggested looking for the regex at the end of the string.
Registration <- data.frame(
  reg = c("Registration Required", "Registration Required ddfdqf",
          "Registration Required 10/12/2000"),
  stringsAsFactors = FALSE
)
# thanks @user1653941 for drafting the sample vector
Registration$Meta_Registration_cleaned <- grepl('Registration required$', Registration$reg, ignore.case = TRUE)
Registration
                               reg Meta_Registration_cleaned
1            Registration Required                      TRUE
2     Registration Required ddfdqf                     FALSE
3 Registration Required 10/12/2000                     FALSE
As I understand the OP, the condition is: either the string "Registration required" with no characters following it, or anything else. Looking forward to the OP's comment.
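To make the difference between the two readings concrete, here is a minimal sketch that runs both patterns over one sample vector. The vector reg and the names starts_with and exact_only are made up for illustration and are not from the OP's data. Note that == compares strings literally in R, which is why the pattern has to go through grepl.

```r
# Hypothetical sample values; not taken from the OP's data.
reg <- c("Registration Required",
         "Registration Required 10/12/2000 at noon",
         "some str",
         " Registration required")

# trimws() drops surrounding blanks, such as the leading space in the OP's literal.
clean <- trimws(reg)

# Reading 1: the phrase at the start, optionally followed by anything.
starts_with <- as.integer(grepl("^Registration required", clean, ignore.case = TRUE))

# Reading 2: the phrase and nothing else.
exact_only <- as.integer(grepl("^Registration required$", clean, ignore.case = TRUE))

print(data.frame(reg, starts_with, exact_only))
```

Because grepl is vectorised, no for loop or if/else is needed; as.integer turns TRUE/FALSE into 1/0.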
Related
Conditional Handling in R
I've been trying to show an error message when the input entered is wrong; for example, in this code a character is entered instead of a four-digit number. I keep receiving an error. Any tips?
get_age <- function() {
  yob <- readline("Please enter your year of birth: ")
  age <- 2022 - as.numeric(yob)
  return(age)
}
if (get_age != as.numeric(yob)) {
  withCallingHandlers(
    warning = function(cnd) {
      readline("This is not a number. Please, try again.")
    },
    print("please, enter a numerical value"),
    return(get_age())
  )
}
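One possible direction, sketched under the assumption that the goal is simply to re-prompt until the reply is a four-digit year: validate the raw string before converting it, so that as.numeric never sees bad input. The helper name parse_year is hypothetical, and the interactive loop is left commented out so that the snippet also runs non-interactively.

```r
# Hypothetical helper: returns the year as a number, or NA if the text
# is not exactly four digits.
parse_year <- function(txt) {
  txt <- trimws(txt)
  if (grepl("^[0-9]{4}$", txt)) as.numeric(txt) else NA_real_
}

# Interactive use (readline() only waits for input in an interactive session):
# repeat {
#   yob <- parse_year(readline("Please enter your year of birth: "))
#   if (!is.na(yob)) break
#   message("This is not a four-digit number. Please try again.")
# }
# age <- 2022 - yob

print(parse_year("1990"))
print(parse_year("abcd"))
```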
MS Project formula calculation returns inconsistent results
In MS Project Professional I have a custom field that returns the correct value sometimes, no value at other times, and an #ERROR at still other times, with no apparent rhyme or reason.
The goal: I need to capture the [Resource Names] field for use in an external application (easy enough), but when I have a fixed units task with limited resource units I need to exclude the "[##%]" portion of the name. Example: Sam[25%], but I need just "Sam".
The formula:
IIf(IsNumeric(InStr(1,[Resource Names],"[")),LEFT([Resource Names],Len([Resource Names])-5),[Resource Names])
The results, in summary:
Marian == M
Sam == #ERROR
Sam[25%] == Sam
IDNR == #ERROR
Core Dev == Cor
Bindu == Bindu
Bindu[50%] == Bindu
Michele == Mi
Michele[25%] == Michele
Disha == empty
Disha[33%] == Disha
Stuart[50%] == Stuart
Stuart == S
Strangely enough, Summary Tasks show no value, which is correct.
The need: can someone help me fix the formula? Or should I just suck it up and manually delete the offending brackets and numbers?
If you only ever have one resource assigned to a task, this formula will work:
IIf(0=InStr(1,[Resource Names],"["),[Resource Names],Left([Resource Names],InStr(1,[Resource Names],"[")-1))
However, building a formula to handle more than one resource would be extremely tedious with the limited functions available. In that case a macro to update the field would work much better:
Sub GetResourceNames()
    Dim t As Task
    For Each t In ActiveProject.Tasks
        Dim resList As String
        resList = vbNullString
        Dim a As Assignment
        For Each a In t.Assignments
            resList = resList & "," & a.Resource.Name
        Next a
        t.Text2 = Mid$(resList, 2)
    Next t
End Sub
How to prompt a user for input until the input is valid in Julia
I am trying to make a program to prompt a user for input until they enter a number within a specific range. What is the best approach to make sure the code does not error out when I enter a letter, a symbol, or a number outside of the specified range?
As an alternative to parse, you can use tryparse:
tryparse(type, str; base)
Like parse, but returns either a value of the requested type, or nothing if the string does not contain a valid number.
The advantage over parse is that you get cleaner error handling without resorting to try/catch, which would hide all exceptions raised within the block.
For example, you can do:
while true
    print("Please enter a whole number between 1 and 5: ")
    input = readline(stdin)
    value = tryparse(Int, input)
    if value !== nothing && 1 <= value <= 5
        println("You entered $(input)")
        break
    else
        @warn "Enter a whole number between 1 and 5"
    end
end
Sample run:
Please enter a whole number between 1 and 5: 42
┌ Warning: Enter a whole number between 1 and 5
└ @ Main myscript.jl:9
Please enter a whole number between 1 and 5: abcde
┌ Warning: Enter a whole number between 1 and 5
└ @ Main myscript.jl:9
Please enter a whole number between 1 and 5: 3
You entered 3
This is one possible way to achieve this sort of thing:
while true
    print("Please enter a whole number between 1 and 5: ")
    input = readline(stdin)
    try
        value = parse(Int, input)
        # both bounds must hold, so they are combined with && rather than ||
        if value <= 5 && value >= 1
            print("You entered $(input)")
            break
        end
    catch
        @warn "Enter a whole number between 1 and 5"
    end
end
Sample Run:
Please enter a whole number between 1 and 5: 2
You entered 2
See this link for how to parse the user input into an int.
How to check if subset is empty in R
I have a set of data with weight over time (t). I need to identify outliers of weight for every time (t), after which I need to send a notification email. I'm using boxplot(...)$out to identify the outliers. It seems to work, but I'm not sure whether: (1) this is the correct way to use boxplot; (2) I can detect that the boxplot has no outlier or is empty (or maybe I'm using the wrong technique); (3) the subset itself is possibly empty (which could be the root cause). For now, I just need to trap the empty subset and check whether the out variable is empty or not. Below is my R script code:
# I am a comment, and the compiler doesn't care about me
# load our libraries
library(ggplot2)
library(mailR)
# some variables to be used later
from <- ""
to <- ""
getwd()
setwd("C:\\Temp\\rwork")
# read the data file into a data (d) variable
d <- read.csv("testdata.csv", header = TRUE) # file
# get the current time (t)
t <- format(Sys.time(), "%H")
# create a subset of d based on t
sbset <- subset(d, Time == t)
# identify if outlier exists then send an email report
out <- boxplot(sbset$weight)$out
if (length(out) != 0) {
  # create a boxplot of the subset
  boxplot(sbset$weight)
  subject = paste("Attention: An Outlier is detected for Scheduled Job Run on Hour ", t)
  message = toString(out) # sort(out)
} else {
  subject = paste("No Outlier Identified")
  message = ""
}
email <- send.mail(from = from, to = to, subject = subject, body = message, html = T,
                   smtp = list(host.name = "smtp.gmail.com", port = 465,
                               user.name = from,
                               passwd = "", # password of sender email
                               ssl = TRUE),
                   authenticate = TRUE, send = TRUE)
DATA
weight,Time,Chick,x 42,0,1,1 51,2,1,1 59,4,1,1 64,6,1,1 76,8,1,1 93,10,1,1 106,12,1,1 125,14,1,1 149,16,1,1 171,18,1,1 199,20,1,1 205,21,1,1 40,0,2,1 49,2,2,1 58,4,2,1 72,6,2,1 84,8,2,1 103,10,2,1 122,12,2,1 138,14,2,1 162,16,2,1 187,18,2,1 209,20,2,1 215,21,2,1 43,0,3,1 39,2,3,1 55,4,3,1 67,6,3,1 84,8,3,1 99,10,3,1 115,12,3,1 138,14,3,1 163,16,3,1 187,18,3,1 198,20,3,1 202,21,3,1 42,0,4,1 49,2,4,1 56,4,4,1 67,6,4,1 74,8,4,1 87,10,4,1 102,12,4,1 108,14,4,1 
136,16,4,1 154,18,4,1 160,20,4,1 157,21,4,1 41,0,5,1 42,2,5,1 48,4,5,1 60,6,5,1 79,8,5,1 106,10,5,1 141,12,5,1 164,14,5,1 197,16,5,1 199,18,5,1 220,20,5,1 223,21,5,1 41,0,6,1 49,2,6,1 59,4,6,1 74,6,6,1 97,8,6,1 124,10,6,1 141,12,6,1 148,14,6,1 155,16,6,1 160,18,6,1 160,20,6,1 157,21,6,1 41,0,7,1 49,2,7,1 57,4,7,1 71,6,7,1 89,8,7,1 112,10,7,1 146,12,7,1 174,14,7,1 218,16,7,1 250,18,7,1 288,20,7,1 305,21,7,1 42,0,8,1 50,2,8,1 61,4,8,1 71,6,8,1 84,8,8,1 93,10,8,1 110,12,8,1 116,14,8,1 126,16,8,1 134,18,8,1 125,20,8,1 42,0,9,1 51,2,9,1 59,4,9,1 68,6,9,1 85,8,9,1 96,10,9,1 90,12,9,1 92,14,9,1 93,16,9,1 100,18,9,1 100,20,9,1 98,21,9,1 41,0,10,1 44,2,10,1 52,4,10,1 63,6,10,1 74,8,10,1 81,10,10,1 89,12,10,1 96,14,10,1 101,16,10,1 112,18,10,1 120,20,10,1 124,21,10,1 43,0,11,1 51,2,11,1 63,4,11,1 84,6,11,1 112,8,11,1 139,10,11,1 168,12,11,1 177,14,11,1 182,16,11,1 184,18,11,1 181,20,11,1 175,21,11,1 41,0,12,1 49,2,12,1 56,4,12,1 62,6,12,1 72,8,12,1 88,10,12,1 119,12,12,1 135,14,12,1 162,16,12,1 185,18,12,1 195,20,12,1 205,21,12,1 41,0,13,1 48,2,13,1 53,4,13,1 60,6,13,1 65,8,13,1 67,10,13,1 71,12,13,1 70,14,13,1 71,16,13,1 81,18,13,1 91,20,13,1 96,21,13,1 41,0,14,1 49,2,14,1 62,4,14,1 79,6,14,1 101,8,14,1 128,10,14,1 164,12,14,1 192,14,14,1 227,16,14,1 248,18,14,1 259,20,14,1 266,21,14,1 41,0,15,1 49,2,15,1 56,4,15,1 64,6,15,1 68,8,15,1 68,10,15,1 67,12,15,1 68,14,15,1 41,0,16,1 45,2,16,1 49,4,16,1 51,6,16,1 57,8,16,1 51,10,16,1 54,12,16,1 42,0,17,1 51,2,17,1 61,4,17,1 72,6,17,1 83,8,17,1 89,10,17,1 98,12,17,1 103,14,17,1 113,16,17,1 123,18,17,1 133,20,17,1 142,21,17,1 39,0,18,1 35,2,18,1 43,0,19,1 48,2,19,1 55,4,19,1 62,6,19,1 65,8,19,1 71,10,19,1 82,12,19,1 88,14,19,1 106,16,19,1 120,18,19,1 144,20,19,1 157,21,19,1 41,0,20,1 47,2,20,1 54,4,20,1 58,6,20,1 65,8,20,1 73,10,20,1 77,12,20,1 89,14,20,1 98,16,20,1 107,18,20,1 115,20,20,1 117,21,20,1 40,0,21,2 50,2,21,2 62,4,21,2 86,6,21,2 125,8,21,2 163,10,21,2 217,12,21,2 240,14,21,2 275,16,21,2 307,18,21,2 318,20,21,2 
331,21,21,2 41,0,22,2 55,2,22,2 64,4,22,2 77,6,22,2 90,8,22,2 95,10,22,2 108,12,22,2 111,14,22,2 131,16,22,2 148,18,22,2 164,20,22,2 167,21,22,2 43,0,23,2 52,2,23,2 61,4,23,2 73,6,23,2 90,8,23,2
Your first use of boxplot unnecessarily creates a plot; you can use
out <- boxplot.stats(sbset$weight)$out
for a little efficiency.
You are interested in the presence of rows, but length(sbset) will return the number of columns. I suggest instead nrow or NROW.
if (NROW(out) > 0) {
  boxplot(sbset$weight)
  # ...
} else {
  # ...
}
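A minimal sketch of the guard, assuming the same column names as in the question (weight, Time). The small data frame and the helper find_outliers are made up for illustration. The sketch also converts the hour to an integer before comparing: format(Sys.time(), "%H") returns a zero-padded string such as "08", and comparing that with a numeric Time column is a string comparison ("8" against "08"), which may be one reason the subset comes back empty.

```r
# Made-up data with the same column names as in the question.
d <- data.frame(weight = c(42, 51, 49, 50, 48, 120, 47),
                Time   = c(0, 2, 2, 2, 2, 2, 4))

# Hypothetical helper: outliers of weight at a given hour,
# or an empty vector when no rows exist for that hour.
find_outliers <- function(d, hour) {
  sbset <- subset(d, Time == as.integer(hour))
  if (nrow(sbset) == 0) {
    return(numeric(0))  # nothing recorded for this hour
  }
  boxplot.stats(sbset$weight)$out  # computes the statistics without drawing a plot
}

out_now  <- find_outliers(d, "02")  # "02" mimics format(Sys.time(), "%H")
out_none <- find_outliers(d, "13")  # no rows for hour 13

print(out_now)
print(length(out_none) == 0)
```

With this shape, the email branch only needs to test length(out) > 0, since an empty subset and a subset without outliers both yield a zero-length vector.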
Pyparsing - name not starting with a character
I am trying to use Pyparsing to identify a keyword that does not begin with $. So for the following input:
$abc = 5 # is not a valid one
abc123 = 10 # is valid one
abc$ = 23 # is a valid one
I tried the following:
var = Word(printables, excludeChars='$')
var.parseString('$abc')
But this doesn't allow any $ in var. How can I specify all printable characters other than $ in the first character position? Any help will be appreciated.
Thanks,
Abhijit
You can use the method I used to define "all characters except X" before I added the excludeChars parameter to the Word class:
NOT_DOLLAR_SIGN = ''.join(c for c in printables if c != '$')
keyword_not_starting_with_dollar = Word(NOT_DOLLAR_SIGN, printables)
This should be a bit more efficient than building up with a Combine and a NotAny. But this will match almost anything: integers, words, valid identifiers, invalid identifiers. So I'm skeptical of the value of this kind of expression in your parser.