Assume there's a list in R with a variable which contains character fields. I want to output all values which begin with certain letters like "ab". How can I do this? Thanks for help.
The OP didn't provide a reproducible example, but from the description the data seems to be a data.frame. We can use grep to subset the elements of the column that begin with 'ab'.
grep('^ab',yourdata$yourcol, value=TRUE)
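To make that concrete, here is a minimal sketch on made-up data. The data frame `yourdata` and its contents are assumptions for illustration only; `startsWith()` is a base R alternative (R 3.3.0 or later) that avoids regular expressions.

```r
# Hypothetical data standing in for the OP's data frame.
yourdata <- data.frame(
  yourcol = c("abacus", "cab", "about", "Abel", "xyz"),
  stringsAsFactors = FALSE
)

# '^' anchors the pattern to the start of the string; value = TRUE
# returns the matching strings rather than their positions.
starts_ab <- grep("^ab", yourdata$yourcol, value = TRUE)

# grepl() gives a logical vector, convenient for keeping whole rows.
matching_rows <- yourdata[grepl("^ab", yourdata$yourcol), , drop = FALSE]

# Same selection without a regular expression.
starts_ab_alt <- yourdata$yourcol[startsWith(yourdata$yourcol, "ab")]

print(starts_ab)
```

Note that the match is case-sensitive, so "Abel" is not returned; passing ignore.case = TRUE to grep would include it.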
I have a huge data frame df, with many columns. One of the columns named id_nm happens to be a character with values such as: aksh123dn.Ins
class(df$id_nm)
returns character
I need to look up all the rows whose id_nm is, say, aksh123dn.Ins
I used:
new_df<-df[df$id_nm=='aksh123dn.Ins',]
This returns the entire df, which can't be right, because not every row has that id_nm.
also tried:
new_df <- df %>% filter(id_nm == 'aksh123dn.Ins')
still getting the same answer
I think it's possibly because it is a character string. Please help me with this. TIA
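For reference, a minimal sketch of how exact matching on a character column behaves, using made-up data (the values below are assumptions, not the OP's data). It does not diagnose the problem above; it only shows two things that commonly change the result of `==`: NA values and stray whitespace.

```r
# Hypothetical data: one exact match, one NA, one value with a trailing space.
df <- data.frame(
  id_nm = c("aksh123dn.Ins", "other.Ins", NA, "aksh123dn.Ins "),
  value = 1:4,
  stringsAsFactors = FALSE
)

# Plain logical indexing: an NA in the comparison yields an all-NA row.
exact <- df[df$id_nm == "aksh123dn.Ins", ]

# which() drops the NA comparisons, so only true matches remain.
safe <- df[which(df$id_nm == "aksh123dn.Ins"), ]

# trimws() removes leading/trailing whitespace before comparing.
trimmed <- df[which(trimws(df$id_nm) == "aksh123dn.Ins"), ]

print(nrow(exact))
print(nrow(safe))
print(trimmed$value)
```

On a character column, `==` compares whole strings, so a filter like this should never return every row unless every row really matches.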
I need to use R to print a data frame so that its columns are in alphabetical order. It sounds like sorting the column names is required. I tried sort (data.frame$) but it didn't work. Can anybody help me?
You can use the code below
df[order(names(df))]
or
df[sort(names(df))]
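Both forms work because a data frame can be indexed by column position or by column name. A minimal sketch on a made-up data frame (the column names are assumptions for illustration):

```r
# Hypothetical data frame with columns out of alphabetical order.
df <- data.frame(
  zeta  = 1:2,
  alpha = c("a", "b"),
  mid   = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# order() returns the positions that would sort the names.
by_order <- df[order(names(df))]

# sort() returns the sorted names themselves.
by_sort <- df[sort(names(df))]

print(names(by_order))
print(names(by_sort))
```

Sorting is case- and locale-sensitive, so mixed-case column names may not come out in the order you expect.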
The text column can hold up to 100 letters for each entry. How can I write a script that recognizes the word "Approved" or "Rejected"? Sometimes the word will be "-Approved", "Approved","Approved" or "Approve". I want it to account for each scenario with a "LIKE" type of function.
There are two words I am looking for, so "OR" may be more applicable here than a range.
R has a pair of approximate (fuzzy) matching functions, agrep and agrepl, which, like grep and grepl, return a vector when given a vector. agrepl returns a logical vector of the same length as the input, so it works better in cases like this:
agrepl("Approved", df$text_col) | agrepl("Rejected", df$text_col)
That could be used to logically index matching rows of a dataframe. Or you could sum the logical vector to get a count. Suggestion: Edit your question with an example to use for demonstration.
There are additional parameters, such as max.distance, that can be used to adjust the tightness of the approximate matching.
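A minimal sketch of the approach, on made-up entries (the data frame `df` and its values are assumptions for illustration). With the default max.distance of 0.1, an 8-letter pattern tolerates one edit, which is enough to catch "Approve".

```r
# Hypothetical text column.
df <- data.frame(
  text_col = c("-Approved", "Approved", "Approve", "Rejected", "Pending review"),
  stringsAsFactors = FALSE
)

# TRUE where the entry approximately contains either word.
hit <- agrepl("Approved", df$text_col) | agrepl("Rejected", df$text_col)

# Use the logical vector to keep matching rows, or sum it for a count.
matched_rows <- df[hit, , drop = FALSE]
n_matched <- sum(hit)

# max.distance = 0 allows no edits, so only entries containing the
# exact word "Approved" match.
strict <- agrepl("Approved", df$text_col, max.distance = 0)

print(matched_rows)
print(n_matched)
```

Matching is case-sensitive by default; ignore.case = TRUE relaxes that.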
This is probably a basic question, but why does R think my vector, which has a bunch of words in it, is numbers when I try to use it as column names?
I imported a data set and it turns out the first row of data are the column headers that I want. The column headers that came with the data set are wrong ones. So I want to replace the column names. I figured this should be easy.
So what I did was I extracted the first row of data into a new object:
names <- data[1,]
Then I deleted the first row of data:
data <- data[-1,]
Then I tried to rename the column headers with the "names" object:
colnames(data) <- names
However, when I do this, instead of changing my column names to the words within the names object, it turns it into a bunch of numbers. I have no idea where these numbers come from.
Thanks
You need to actually show us the data, and the read.csv()/read.table() command you used to import.
If R thinks your numeric column is a string column, it sounds like that's because it wrongly includes the column name as data, i.e. the file was read with header=FALSE. That is the default for read.table(), whereas read.csv() defaults to header=TRUE.
But show us your actual data and commands used.
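In the meantime, a minimal sketch of the header=TRUE suggestion, using inline text in place of the real file (the CSV content and column names are assumptions for illustration). The last step, converting the first row with unlist() and as.character() before assigning it, is an alternative for when re-importing is not an option; it is not something the question itself tried.

```r
# Hypothetical CSV content standing in for the real file.
csv_text <- "id,price\n1,10.5\n2,11.0"

# header = FALSE: the header line is read as a data row, and every
# column becomes text.
wrong <- read.table(text = csv_text, sep = ",", header = FALSE,
                    stringsAsFactors = FALSE)

# header = TRUE: the first line supplies the column names and the
# remaining rows keep their numeric types.
right <- read.table(text = csv_text, sep = ",", header = TRUE)

# Repairing the already-imported data: data[1, ] is a one-row data
# frame, so flatten it to a plain character vector before assigning.
fixed <- wrong[-1, ]
colnames(fixed) <- as.character(unlist(wrong[1, ]))

print(names(right))
print(colnames(fixed))
```

After the repair the columns of `fixed` are still character, so numeric columns need converting, for example with as.numeric() or type.convert().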
I use the following code in R to read a CSV file of stock prices.
library(quantmod)
#column headings ("open","high","low","close","volume","adj.")
fmt <- '%Y-%m-%d'
SPY <- read.zoo("~/Stocks/csv/SPY.csv",header=TRUE,sep=',',tz='',format=fmt,index=0:1)
plot(SPY['open'])
I can successfully use plot(SPY) to plot all columns.
How would I select just one column by name, for example plot just the "open" column? I've tried a bunch of things such as plot(SPY['open']) but can't figure it out.
Could somebody help? Many thanks!
Try:
plot(SPY[, 'open'])
The square brackets method of selecting a subset requires two expressions: first, one describing the rows, and second, one describing the columns. These two expressions are separated by a comma. When you want to include all the rows, just leave a blank before the comma, and specify the name of the column you want.
Your code, with only one expression, treats 'open' as a row, not a column. The result is probably a strip chart, a one-dimensional graph, instead of the plot you were expecting.
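The row/column convention can be sketched with a plain base R matrix. This is an illustration under assumptions: the matrix `prices` and its values are made up, and it stands in for the zoo object only in the sense that both are indexed as [rows, columns].

```r
# Hypothetical price matrix with named columns.
prices <- matrix(
  c(100, 101, 102,   # open
    105, 106, 107),  # close
  ncol = 2,
  dimnames = list(NULL, c("open", "close"))
)

# Blank before the comma: all rows. Name after the comma: one column.
open_col <- prices[, "open"]

# Index before the comma, blank after: one row, all columns.
first_row <- prices[1, ]

print(open_col)
print(first_row)

# plot(open_col, type = "l")  # uncomment to draw the selected column
```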