I'm stumped. I want to grab specific names from a given column. However, when I try to filter on them, I get most of the names but not all, even though I can clearly see the missing names in the original Excel file. I think it has to do with some sort of special character or spacing in the name column, but I'm not sure how to fix it.
I have tried applying Excel's CLEAN() function to the column, and I have tried building an Alteryx flow to clean the data. Neither has helped. I am starting to wonder if this is an R issue.
surveyData %>% filter(`Completed By` == "Spencer,(redbox with whitedot in middle)Amy")
surveyData %>% filter(`Completed By` == "Spencer, Amy")
In R, the first line shows a red box with a white dot between the comma and the first name. I got it by copying the name from the data frame into Notepad and then pasting it into R. This version works and returns what I want. The second line uses a standard space and doesn't return what I want. How can I fix this without having to copy each name from the data frame into Notepad and then paste it into R?
The expected result is that I get the rows attached to whatever name I filter by.
I was able to find the answer. It turns out the space is actually a non-breaking space (U+00A0) rather than the normal space (U+0020). The non-breaking space is not part of the American Standard Code for Information Interchange (ASCII). Because of that, filter() couldn't match the names that contained non-breaking spaces when I typed them with a normal space. I fixed this by substituting the non-breaking space with a normal space and applying that to my column. Example below:
space_fix = gsub("\u00A0", " ", surveyData$`Completed By`, fixed = TRUE) # replace non-breaking spaces (U+00A0) with normal spaces in the column of interest
surveyData$`Completed By Clean` = space_fix
Once I applied this, I could easily filter on any name!
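For anyone hitting the same thing, here is a minimal base R sketch of how the non-breaking space can be spotted and replaced. The names_raw vector is made-up sample data standing in for the `Completed By` column, not the real survey file, and the variable names are only illustrative.

```r
# Hypothetical sample data: the first name has a non-breaking space (U+00A0)
# after the comma, the second has an ordinary space (U+0020).
names_raw <- c("Spencer,\u00A0Amy", "Jones, Bob")

# An exact comparison against a normal space misses the first entry.
match_before <- names_raw == "Spencer, Amy"

# Find which entries contain a non-breaking space.
has_nbsp <- grepl("\u00A0", names_raw, fixed = TRUE)

# Inspect the code points of a suspicious value:
# 160 is U+00A0 (non-breaking space), 32 is U+0020 (normal space).
code_points <- utf8ToInt(names_raw[1])

# Replace every non-breaking space with an ordinary space.
names_clean <- gsub("\u00A0", " ", names_raw, fixed = TRUE)

# The same comparison now matches the first entry.
match_after <- names_clean == "Spencer, Amy"
```

Checking has_nbsp on the real column first shows how many rows are affected before anything is overwritten.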
Thanks everyone!
R thinks that the columns "nonrow" (which I added as a tester) and "sample" don't exist (it says they're NULL) in the following CSV file. (It sees every other column fine.)
nonrow,,Fill Weight,Fill Weight,Fill Weight,Fill Weight,xbar,r,sample
0,,352,348,350,351,350.25,3,1
0,,351,352,351,350,351,2,2
0,,351,346,342,350,347.25,8,3
0,,349,353,352,352,351.5,1.5,4
0,,351,350,351,351,350.75,1,5
0,,353,351,346,346,349,5,6
0,,348,344,350,347,347.25,6,7
0,,350,349,351,346,349,5,8
0,,344,345,346,349,346,4,9
0,,349,350,352,352,350.75,2,10
0,,353,352,354,356,353.75,4,11
0,,348,353,346,351,349.5,7,12
0,,352,350,351,348,350.25,3,13
0,,356,351,349,352,352,3,14
0,,353,348,351,350,350.5,3,15
0,,353,354,350,352,352.25,4,16
0,,351,348,347,348,348.5,1.5,17
0,,353,352,346,352,350.75,6,18
0,,346,348,347,349,347.5,2,19
0,,351,348,347,346,348,2,20
0,,348,352,351,352,350.75,1.25,21
0,,356,351,350,350,351.75,1.75,22
0,,352,348,347,349,349,2,23
0,,348,353,351,352,351,2,24
I have tried moving the column around, renaming it, and trying the tester column (which it also mysteriously can't see). Does anyone have any suggestions? Thank you!
I have a dataset like the one below, in which I am trying to convert the column "Installs" to numeric. My code follows:
Original Dataset
My code:
Data$Installs<-substr(Data$Installs,1,nchar(Data$Installs)-1)
Data$Installs<-gsub(",","",gsub("\\s+","",Data$Installs))
Data$Installs<-as.numeric(Data$Installs)
After running the code, I get the following:
This is the result I get
Any help?
From what I can see, you need only to remove commas and a possible trailing plus sign. So, the following should work:
Data$Installs <- as.numeric(gsub("[+,]", "", Data$Installs))
You might want to create a new column though and keep the original one.
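As a rough sketch of that suggestion, the snippet below writes the converted values to a new column and leaves the original untouched. The sample values and the column name InstallsNum are assumptions for illustration, since the real contents of Data$Installs are not shown here.

```r
# Hypothetical sample data; the real values in Data$Installs may differ.
Data <- data.frame(
  Installs = c("10,000+", "500+", "1,000,000+", "0"),
  stringsAsFactors = FALSE
)

# Strip commas and plus signs, then convert.
# The result goes into a new column so the original text is kept.
Data$InstallsNum <- as.numeric(gsub("[+,]", "", Data$Installs))

# Anything that still isn't a number becomes NA, so list the leftovers.
leftovers <- Data$Installs[is.na(Data$InstallsNum)]
```

If leftovers is not empty, those raw values show what else needs to be cleaned before the conversion.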
I have created a variable "batch" with data type DateTime. My OLE DB source has a column "addDate" with values such as 2012-05-18 11:11:17.470, and so does the empty destination that is to be populated.
The addDate column has many dates, and I want to copy all rows whose date is "2012-05-18 11:11:17.470".
When I set the variable's value to this date, it automatically changes to mm/dd/yyyy hh:mm AM format, so my Conditional Split transformation can't match the date against the variable, and no records are copied to the destination!
Where exactly is the problem?
Thanks!
I had this issue and the best solution I found is not “pretty”.
Basically you need to set the variable's "Expression" and set "EvaluateAsExpression" to true (otherwise the expression is ignored).
The secret (and the reason I said it is not a pretty solution) is to create a second variable that evaluates an expression over the first variable, because you can't change the value of a variable based on an expression.
So let's say your variable is called "DateVariable" and holds 23/05/2012. Create a variable called "DateVar2", for example, and set its expression to
(DT_WSTR,4)YEAR(@[User::DateVariable]) + "/" + RIGHT("0" +
(DT_WSTR,2)MONTH(@[User::DateVariable]),2) + "/" + RIGHT("0" +
(DT_WSTR,2)DAY(@[User::DateVariable]),2)
That will give you 2012/05/23.
Keep extending the expression in the same way to get the date in the format you want.
I found an easier solution. Set the variable's data type to String and enter the desired value.
Before the Conditional Split, you need a Data Conversion transformation.
Convert it to DT_DBTIMESTAMP, then run the package.
It works!