Trimming the first 4 characters from a column in Teradata

I have a table A in Teradata which has a column plancode.
Possible values of this plancode are:
GNSC11Q
BNSC12Q
HNSC13Q
12345
A1234
I want to remove the first 4 characters from strings that start with GNSC, BNSC, or HNSC, so the final values will be 11Q, 12Q, 13Q.
I need an UPDATE statement that will remove those first 4 characters from all the data in the Plancode column. Any help would be appreciated.
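A hedged sketch of one way to do this, taking the table name A and the column name plancode literally from the question (verify both against the real schema first): test the leading 4 characters and keep everything from position 5 onward.

UPDATE A
SET plancode = SUBSTR(plancode, 5)
WHERE SUBSTR(plancode, 1, 4) IN ('GNSC', 'BNSC', 'HNSC');

The WHERE clause leaves values such as 12345 and A1234 untouched, since only rows whose first 4 characters match one of the three prefixes are rewritten.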

Working on a loop and wanting some feedback; re-adding this to update code and list .csv

Access to
https://www.opendataphilly.org/dataset/shooting-victims/resource/a6240077-cbc7-46fb-b554-39417be606ee
I have gotten close and got my loop to run, but have not gotten the output I want.
I want to split the street column at any '&' locations into a column called 'street2'.
Main objective explained: Let's deal with the streets with & separating their names. Create a new column named street2 and set it equal to NA.
Then, iterate over the data frame using a for loop, testing if the street variable you created earlier contains an NA value.
In cases where this occurs, separate the names in block according to the & delimiter into the fields street and street2 accordingly.
Output the first 5 lines of the data frame to the screen.
Hint: mutate(); for; if; :; nrow(); is.na(); strsplit(); unlist().
library('readr')
NewLocation$street2 <- 'NA'
#head(NewLocation)
Task7 <- unlist(NewLocation$street2)
for (row in seq(from = 1, to = nrow(NewLocation))) {
  if (is.na(Task7[NewLocation$street])) {
    NewLocation$street2 <- strsplit(NewLocation$street, "&", (NewLocation[row]))
  }
}
This is changing all of my street2 values to equal street and getting rid of my "NA"s.
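A minimal corrected sketch, assuming NewLocation has the block and street columns the objective describes (the column names come from the task text, not verified against the dataset). The key fixes: assign a real NA rather than the string 'NA', index by row inside the loop, and write the two halves of the split into street and street2.

NewLocation$street2 <- NA  # a real NA, not the string 'NA'
for (row in 1:nrow(NewLocation)) {
  if (is.na(NewLocation$street[row])) {
    # block holds "name1 & name2"; split it at the '&' delimiter
    parts <- trimws(unlist(strsplit(NewLocation$block[row], "&")))
    NewLocation$street[row]  <- parts[1]
    NewLocation$street2[row] <- parts[2]
  }
}
head(NewLocation, 5)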

Error converting character strings to numeric

Trying to convert a series of 29 columns in a data frame 'df' to numeric variables. Each column currently contains strings that all look like this:
two three
"-2.5346346" "-4.2342342"
"-3.645735" "-2.23434542"
"-4.235234" "-1.23422"
as.character(two)
works fine.
as.numeric(as.character(two))
does not: as.numeric() returns NA for every value, not just NAs for certain observations.
In any case, there are not any extraneous commas, letters, etc. I cannot think what could be causing the problem and have run out of ideas. If it's at all relevant, I constructed the columns from vector strings (e.g. c("-3.23423", "-2.34532")), where each string became a new column, and now I'm wondering if there's something in the str_extract_all function I used to do that that I'm not aware of. Thank you.
Edited to include sample data.
head(df)
one two three four five six
1 c("-3.19474987" "-3.9386188" "-5.3585024" "-7.3370402" "-4.65656894" "-5.37296894"
2 c("-3.86805776" "-2.57038981" "-4.88910112" "-3.82336021" "-1.51641245" "-4.19533412"
3 c("-4.64324462" "-3.51131105" "-5.81064472" "-6.63382723" "-4.47048461" "-7.08557932"
4 c("-4.88484732" "-3.48084998" "-4.97011221" "-5.36993391" "-3.14765309" "-4.60799153"
5 c("-4.99299683" "-3.26320573" "-4.5861881" "-5.3340004" "-2.14507341" "-3.30230272"
6 c("-5.15376815" "-4.08624463" "-6.50014523" "-5.49561174" "-4.14988788" "-6.57583067"

Using order in R data frames fails after column names have been changed. How can I recover this?

Setup dataframe
mta<-c("ldall","nold","ldall","nold","ldall","nold","ldall","nold")
mtb<-c(491, 28581,241,5882,365,7398,512,10887)
df1<-data.frame(mta,mtb)
I can order my dataframe in the normal way. This works fine.
df1[order(mtb),]
But if I change the names of the columns
names(df1)<-c("mta1","mtb1")
df1[order(mtb1),]
This gives the error
Error in order(mtb1) : object 'mtb1' not found.
If I use the old column name in the instruction it works, although the output shows the new column name.
df1[order(mtb),]
If I change the name back to the original, the command appears to work normally. Can anyone explain? Is order using a hidden version of the column name?
This should work. The explanation: order(mtb) never looked inside the data frame at all; it found the free-standing vector mtb in your workspace, created when you set up the data frame, which is why renaming the columns had no effect and why the old name kept working. Refer to the column explicitly, as below, so order() sees the renamed column. Let me know if this helps.
mta<-c("ldall","nold","ldall","nold","ldall","nold","ldall","nold")
mtb<-c(491, 28581,241,5882,365,7398,512,10887)
df1<-data.frame(mta,mtb)
# Change column names
colnames(df1) <- c("mta1","mtb1")
# Sort column mtb1 from the data frame
df1[order(df1$mtb1), ]
mta1 mtb1
3 ldall 241
5 ldall 365
1 ldall 491
7 ldall 512
4 nold 5882
6 nold 7398
8 nold 10887
2 nold 28581
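A quick check of what happened, run in the same workspace: the vector and the column are separate objects, so renaming the column created no new vector.

exists("mtb")             # TRUE: the workspace vector order() silently found
exists("mtb1")            # FALSE: renaming columns creates no such vector
identical(df1$mtb1, mtb)  # TRUE: same values, two independent bindings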

How can I sort, find duplicates in one column and delete them together with their match in another column?

I have a really big file with this information
7-92383888 rs10
7-6013153 rs10000
12-126890980 rs1000000
4-57561647 rs10000003
4-85161558 rs10000005
4-172776204 rs10000008
4-71048953 rs10000009
2-50711642 rs1000001
The first column is the chromosome number and the base pair position, and the second column is the SNP which can be found in this specific area. In the first column there are some duplicates, but not in the second one. How can I sort column 1, find duplicates, and then delete the entire row? So, delete the duplicate of column 1 and at the same time its matching value in column 2. Also, while sorting I don't want the match between column 1 and column 2 to change.
I read previous posts and I know that I have to use the sort and uniq commands, but I don't know how.
Thank you.
sort has a -u flag for that. Restrict the key to the first field with -k1,1 so uniqueness is judged on column 1 alone:
sort -u -k1,1
Here's an example:
$ echo -e 'b 1\na 2\nb 1\na 2' | sort -u -k1,1
a 2
b 1
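Applied to the question's data, with a hypothetical filename snps.txt standing in for the real file:

sort -u -k1,1 snps.txt > deduped.txt

This keeps the first row seen for each chromosome-position value and drops the rest; each surviving row keeps its own SNP, so the column 1 to column 2 pairing never changes.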

How to remove rows based on a particular condition

The data set contains a column with over 10,000 cell phone numbers; it also contains some garbage values with no particular format.
How do I retain only the rows with the correct cell phone numbers?
cell number   comment
9674544444    a
9453453455    c
asd..as23     d
as sas E2     d
232dsasd      23,,,,,231
required table
cell number   comment
9674544444    a
9453453455    c
Like this:
df<-read.table(header=T,sep="|",text="cell number|comment
9674544444|a
9453453455|c
asd..as23|d
as sas E2|d
232dsasd|23,,,,,231")
df[grep("[0-9]{10}",df$cell.number),]
# cell.number comment
#1 9674544444 a
#2 9453453455 c
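One design note: the pattern [0-9]{10} keeps any row whose cell number contains ten consecutive digits anywhere in the string. To demand that the whole value be exactly ten digits and nothing else, anchor it:

df[grep("^[0-9]{10}$", df$cell.number), ]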
