Removing space at the end of string - trim

I want to remove whitespace at the end of a string. I used the code below, but it doesn't work, and it doesn't throw any error either.
df[ABC_Col].str.strip()
ABC_Col is the column header
Could you please help me on this?
Thank you

You need to assign the result of strip to the data frame column on the left hand side:
df[ABC_Col] = df[ABC_Col].str.strip()
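To make the difference visible, here is a minimal sketch with made-up data (the column name and values are hypothetical, not from the question). It assumes pandas is installed. Since the question only asks about trailing whitespace, it uses `str.rstrip()`; `str.strip()` as in the answer would also remove leading whitespace.

```python
import pandas as pd

# Hypothetical data; ABC_Col is a variable holding the column name, as in the question.
ABC_Col = "name"
df = pd.DataFrame({ABC_Col: ["alice  ", " bob ", "carol"]})

# Calling the method on its own builds a new Series and throws it away,
# so the data frame is left untouched (and no error is raised).
df[ABC_Col].str.rstrip()
unchanged = df[ABC_Col].tolist()

# Assigning the result back is what actually updates the column.
df[ABC_Col] = df[ABC_Col].str.rstrip()
trimmed = df[ABC_Col].tolist()

print(unchanged)
print(trimmed)
```

The first list still shows the trailing spaces; the second does not, while the leading space in " bob" is kept.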

Related

Is there an R function for removing a specific snippet of the data in a column?

So I have a data frame that includes a column like this:
[image of the column omitted; the first entry reads 51.81 - 11.19]
And I would like to remove the operator as well as the numbers to the right of it, i.e. so the first entry would just say 51.81 rather than 51.81 - 11.19. How would I go about this? I feel like using a for loop might work but I'm unsure of the syntax required.
Thanks
We can use sub to match zero or more spaces (\\s*) followed by a - or + and any other characters, and replace the match with an empty string (""):
df1$xG <- as.numeric(sub("\\s*[-+]+.*", "", df1$xG))
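For readers coming from the pandas question at the top, the same pattern can be tried in Python with `re.sub`. This is only a rough counterpart of the R answer, with sample values made up for illustration (only the first one comes from the question).

```python
import re

# Sample values; only "51.81 - 11.19" appears in the question, the rest are invented.
values = ["51.81 - 11.19", "40.2 + 3.5", "12.0"]

# Optional spaces, then one or more - or + signs, then everything after them.
pattern = re.compile(r"\s*[-+]+.*")

# Drop the operator and everything to its right, then convert to a number.
cleaned = [float(pattern.sub("", v)) for v in values]

print(cleaned)
```

One caveat with this pattern: a value that starts with a minus sign would be wiped out entirely, so it only suits columns where the first number is never negative.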

Cleaning a column of "last, first" names that contains non-breaking spaces so I can filter my data frame

I'm stumped. My issue is that I want to grab specific names from a given column. However, when I try to filter them, I get most of the names except for a few, even though I can clearly see those names in the original Excel file. I think it has to do with some sort of special characters or spacing in the name column. I am confused about how to fix this.
I have tried applying Excel's CLEAN() function to the column. I have also tried building an Alteryx flow to clean the data. None of these steps has helped. I am starting to wonder if this is an R issue.
surveyData %>% filter(`Completed By` == "Spencer,(redbox with whitedot in middle)Amy")
surveyData %>% filter(`Completed By` == "Spencer, Amy")
In R, the first line had a red box with a white dot between the comma and the first name. I got this character by copying the name from the data frame into Notepad and then pasting it into R. This version actually works and returns what I want. The second line uses a standard space and doesn't return what I want. So how can I fix this without having to copy each name from the data frame into Notepad and then paste the result into R?
The expected result is that I get the rows attached to whatever name I filter by.
I was able to find the answer. It turns out the space is actually a non-breaking space (U+00A0) rather than a normal space (U+0020). The non-breaking space is not part of the American Standard Code for Information Interchange (ASCII). So filter() couldn't match some names, because they contained non-breaking spaces. I fixed this by substituting the non-breaking space with a normal space across the column. Example below:
space_fix = gsub("\u00A0", " ", surveyData$`Completed By`, fixed = TRUE) # replace non-breaking spaces (U+00A0) with normal spaces in the column of interest
surveyData$`Completed By Clean` = space_fix
Once I applied this, I could easily filter by any name!
Thanks everyone!
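The same trap exists outside R. As a small Python sketch (the name is the one from the question; everything else is illustrative), the two strings look identical on screen but do not compare equal until the non-breaking space is replaced:

```python
NBSP = "\u00a0"  # non-breaking space, U+00A0

# First entry mimics the spreadsheet value; second uses a normal space (U+0020).
names = ["Spencer," + NBSP + "Amy", "Spencer, Amy"]

# Comparing against the normally typed name misses the first entry.
before = [n == "Spencer, Amy" for n in names]

# ascii() escapes non-ASCII characters, which makes the hidden character visible.
revealed = ascii(names[0])

# Replace every non-breaking space with a normal space, then compare again.
cleaned = [n.replace(NBSP, " ") for n in names]
after = [n == "Spencer, Amy" for n in cleaned]

print(before, revealed, after)
```

Printing the escaped form is a quicker way to spot the odd character than the Notepad round trip described in the question.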

Can't get toupper to work

I want to transform the contents of a factor column in a dataframe from lowercase to uppercase. The function toupper(dataframe$columnname) prints the contents in uppercase, but nothing actually seems to happen to the contents. When I check using levels(dataframe$columnname) or just visually inspect the dataframe, the contents are still in lowercase. What am I doing wrong?
To change your data.frame's content, you must assign the result back to the column:
dataframe$columnname <- toupper(dataframe$columnname)
Although, if you want to work with character values, convert the factor explicitly first:
dataframe$columnname <- toupper(as.character(dataframe$columnname))

Problems pattern matching using R regular expressions

I'm trying to extract a string using str_extract. Here is a small example of the type of string:
library(stringr)
gs<-"{\"type\":\"Polygon\",\"coordinates\":[[[1,2],[3,4],[5,6],[7,8]]]}"
s='\\{\\\"type\\\"*\\}'
str_extract(gs,s)
I'd like to get a print-out of the entire string (the real string will have more characters of this type and should only return the piece I specified here). Instead I get NA. I'd be grateful for any ideas as to what I'm doing wrong. Thank you!
Does this do what you want?
gs<-"{\"type\":\"Polygon\",\"coordinates\":[[[1,2],[3,4],[5,6],[7,8]]]} I DO NOT WANT THIS {\"type\":\"Not a Polygon\",\"coordinates\":[[[1,2],[3,4],[5,6],[7,8]]]}"
s="\\{\"type\"(.*?)\\}"
result = str_match_all(gs,s)[[1]][,1]
To test, I added the string 'I DO NOT WANT THIS', which should not be returned, and a second object whose type is 'Not a Polygon'.
It returns:
"{\"type\":\"Polygon\",\"coordinates\":[[[1,2],[3,4],[5,6],[7,8]]]}"
"{\"type\":\"Not a Polygon\",\"coordinates\":[[[1,2],[3,4],[5,6],[7,8]]]}"
So only the elements requested. Hope this helps!
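What makes this work is the lazy quantifier `.*?`, which stops at the first closing brace instead of the last one. As a rough illustration in Python's `re` module (a sketch of the same idea, not the R code from the answer), compare the lazy and greedy versions on the same test string:

```python
import re

# Same test string as in the answer: two objects with unwanted text in between.
gs = ('{"type":"Polygon","coordinates":[[[1,2],[3,4],[5,6],[7,8]]]}'
      ' I DO NOT WANT THIS '
      '{"type":"Not a Polygon","coordinates":[[[1,2],[3,4],[5,6],[7,8]]]}')

# Lazy: each match ends at the first closing brace, giving one match per object.
lazy = re.findall(r'\{"type".*?\}', gs)

# Greedy: the match runs to the last closing brace and swallows the unwanted text.
greedy = re.findall(r'\{"type".*\}', gs)

print(len(lazy), len(greedy))
```

This only holds while the objects contain no nested braces; with nested objects, a proper JSON parser would be the safer route.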

Shiny: when I build a query from reactive input variables I get an extra blank before the string I'm comparing to, so my query returns nothing

I have this query
query<-paste("select disease_status,expvalue from taylor_21036,taylor_exp_21036 where geo_accession=id_geoacc and id_gpl like '",input$gen,"' order by'",input$gen,"'")
For some strange reason when I view the query I get:
select disease_status,expvalue from taylor_21036,taylor_exp_21036 where geo_accession=id_geoacc and id_gpl like ' hsa-let-7a ' order by ' hsa-let-7a '
An extra blank is added to the left and to the right of my string.
How can I fix this? Any idea?
I was going mad because I didn't know why I was getting the error "need finite 'ylim' values" when I tried to boxplot the data frame returned by dbGetQuery with the query above. I finally found the problem: the data frame was empty, because the query returned no rows due to the extra blanks.
I would appreciate any advice. Thank you!
Thanks in advance.
The default separator between terms in paste is " " (space). If you don't want a separator then use paste0 or add the argument sep="" to your paste.
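The effect of the separator is easy to reproduce in any language that joins strings. As a loose Python analogy (`str.join` stands in for paste here; this is not the Shiny code, and the fragment is shortened from the question's query):

```python
# Value taken from the query output shown in the question.
gen = "hsa-let-7a"

# Pieces of the WHERE clause, split the same way the paste() call splits them.
parts = ["id_gpl like '", gen, "'"]

# Joining with a space mimics paste()'s default sep = " ".
with_spaces = " ".join(parts)

# Joining with an empty string mimics paste0() or sep = "".
without_spaces = "".join(parts)

print(with_spaces)
print(without_spaces)
```

The first result carries the stray blanks inside the quotes, which is why the comparison in the query matched no rows.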
