Named element-wise operations in R

I am a beginner in R and apologize in advance for asking a basic question, but I couldn't find an answer anywhere on Google (maybe because the question is so basic that I didn't even know how to search for it correctly. :D)
So if I do the following in R:
v = c(50, 25)
names(v) = c("First", "Last")
v["First"]/v["Last"]
I get the output as:
First
2
Why does the name "First" appear in the output, and how do I get rid of it?

From help("Extract"), this is because
Subsetting (except by an empty index) will drop all attributes except names, dim and dimnames.
and
The usual form of indexing is [. [[ can be used to select a single element dropping names, whereas [ keeps them, e.g., in c(abc = 123)[1].
Since we are selecting single elements, you can switch to double-bracket indexing [[ and names will be dropped.
v[["First"]] / v[["Last"]]
# [1] 2
As for which name is preserved when using single-bracket indexing, it is the name of the first operand. If we switch the order, we still get the first name on the result. This is consistent with help("Arithmetic"), which says names are copied from the first argument when it has the same length as the answer.
v["Last"] / v["First"]
# Last
# 0.5
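The three ways of getting the ratio can be compared side by side. This is a minimal sketch that reuses the vector from the question; the variable names ratio_named, ratio_plain and ratio_unnamed are made up for illustration, and unname() is a base R helper that is not mentioned in the answer above.

```r
v <- c(First = 50, Last = 25)

# Single brackets keep names, so the result carries the first operand's name
ratio_named <- v["First"] / v["Last"]

# Double brackets drop names before the division
ratio_plain <- v[["First"]] / v[["Last"]]

# unname() strips the name after the fact
ratio_unnamed <- unname(v["First"] / v["Last"])

print(names(ratio_named))
print(ratio_plain)
print(ratio_unnamed)
```

unname() can be handy when the subsetting happens somewhere you cannot easily change to [[.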

Related

Why R is returning object when names are partially matching? [duplicate]

Let's say I have the list below:
Dat = list('AAA' = 1:4, 'BBB' = 5:9)
Now I run the following:
Dat$AA
## [1] 1 2 3 4
My question is why R is returning a value from Dat$AA given that there is no element with that name. Is R matching names partially?
If this is the case, then I think such behaviour is utterly risky and should not be allowed.
According to ?Extract
Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of [[ can be controlled using the exact argument.
Also, it is written under name
name - A literal character string or a name (possibly backtick quoted). For extraction, this is normally (see under ‘Environments’) partially matched to the names of the object.
where the usage is
x$name
Because $ behaves like [[ with exact = FALSE, it allows partial matching. This is one reason to prefer [[, where exact = TRUE is the default, as its usage shows:
x[[i, exact = TRUE]]
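To make the difference concrete, here is a small sketch using the list from the question. The variable names partial, exact and relaxed are only for illustration.

```r
Dat <- list(AAA = 1:4, BBB = 5:9)

# $ partially matches "AA" to the element named "AAA"
partial <- Dat$AA

# [[ matches exactly by default, so there is no element "AA"
exact <- Dat[["AA"]]

# [[ can be asked to match partially, which mimics $
relaxed <- Dat[["AA", exact = FALSE]]

print(partial)
print(is.null(exact))
print(relaxed)
```

With [[, a mistyped name comes back as NULL instead of silently returning some other element.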
It is also mentioned in ?Extract how to change the options
Thus the default behaviour is to use partial matching only when extracting from recursive objects (except environments) by $. Even in that case, warnings can be switched on by options(warnPartialMatchDollar = TRUE).
akrun already explained that this behaviour is documented, but generally we prefer that this doesn't happen. Consequently, a set of lines that I always put into my .Rprofile is the following:
options(
warnPartialMatchArgs = TRUE,
warnPartialMatchDollar = TRUE,
warnPartialMatchAttr = TRUE
)
This runs when R starts and generates warnings whenever partial matching occurs. As the options show, it happens in three contexts: in function arguments, when using $, and when using attr(). I don't think you can turn partial matching off entirely, but the warnings highlight it whenever it happens and help prevent the ensuing bugs.
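As a rough illustration of what the $ option does, the sketch below switches it on temporarily and captures the warning as text. The list is the one from the question; the names old and msg are hypothetical, and tryCatch() is used only so the warning can be inspected.

```r
# Turn the warning on for this session and remember the previous setting
old <- options(warnPartialMatchDollar = TRUE)

Dat <- list(AAA = 1:4, BBB = 5:9)

# Capture the warning raised by the partial match instead of just printing it
msg <- tryCatch(Dat$AA, warning = function(w) conditionMessage(w))
print(msg)

# Restore the previous setting
options(old)
```

Putting the options() call in .Rprofile, as described above, makes the setting apply to every session rather than just one.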

How can I remove the first x characters from 200+ column names when the prefix is not the same length in every name

How can I remove a specific number of characters from the start of 200+ column names? For example, from "Q1: GOING OUT?" and "Q5: STATE, PROVINCE, COUNTY, ETC" I just want to remove the "Q1: " and the "Q5: ". I have looked around but haven't been able to find a way that doesn't require renaming them manually. Are there any functions for this, or ways to do it through tidyverse? I have only been using R for 2 months.
I don't really have anything to show. I have considered using for loops and possibly using gsub or case_when, but don't really understand how to properly use them.
#probably not correctly written but tried to do it anyways
for ( x in x(0:length) and _:(length(CandyData)-1){
front -> substring(0:3)
back -> substring(4:length(CandyData))
print <- back
}
I don't really have any errors because I haven't been able to make it work properly.
Try this:
# Example column names. If you already have a data frame, use col_all <- names(df) instead.
col_all <- c("Q1:GOING OUT?", "Q2:STATE", "Q100:PROVINCE", "Q200:COUNTRY", "Q299:ID")
for (col in seq_along(col_all)) { # iterate over the vector of column names
  colname <- col_all[col] # current column name
  # position of the first ":" in the name, or -1 if there is none
  index1 <- as.integer(regexpr(":", colname, fixed = TRUE))
  if (index1 > 0) {
    # keep only the text after the colon
    col_all[col] <- substr(colname, index1 + 1, nchar(colname))
  }
}
names(df) <- col_all # assign the cleaned names back to the data frame
The H 1 answer is still the best: sub() or gsub() functions will do the work. And do not fear the regex, it is a powerful tool in data management.
Here is the gsub version:
names(df) <- gsub("^.*:","",names(df))
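It works this way: for each name, the pattern matches everything from the start of the name up to and including ":" and replaces it with an empty string. Note that .* is greedy, so if a name contains more than one ":", everything up to the last one is removed.
Remember to upvote H 1's solution in the comments.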
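A variation on the same idea, as a sketch: the pattern below stops at the first colon and also swallows the space after it, which matches the "Q1: " style prefixes in the question. The third name is made up to show a name with two colons, and old_names and new_names are hypothetical variable names.

```r
old_names <- c("Q1: GOING OUT?",
               "Q5: STATE, PROVINCE, COUNTY, ETC",
               "Q7: TIME: AM OR PM")  # made-up name with a second colon

# ^[^:]*  everything from the start that is not a colon
# :       the first colon
# \\s*    any whitespace right after it
new_names <- sub("^[^:]*:\\s*", "", old_names)

print(new_names)
```

For a data frame, the same call would be applied to names(df) and assigned back, as in the gsub version above.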

R - Why does frameex[ind, ] need a "," to display rows

I am new to R and I have troubles understanding how displaying an index works.
# Find indices of NAs in Max.Gust.SpeedMPH
ind <- which(is.na(weather6$Max.Gust.SpeedMPH))
# Look at the full rows for records missing Max.Gust.SpeedMPH
weather6[ind, ]
My code here works with no problem, but I don't understand why weather6[ind] won't display the same thing as weather6[ind, ]. I got very lucky and mistyped the first time.
I apologize in advance that the question might have been posted somewhere else, I searched and couldn't find a proper answer.
So [ is a function just like any other function in R, but we call it strangely. Another way to write it in this case would be:
'[.data.frame'(weather6,ind,)
or the other way:
'[.data.frame'(weather6,ind)
The first three arguments to the function are named x, i and j. If you look at the code, early on it branches with the line:
if (Narg < 3L)
Putting the extra comma tells R that you've called the function with 3 arguments, but that the j argument is "missing". Otherwise, without the comma, you have only 2 arguments, and the function code moves on to the next [ method, the one for lists, which treats the data frame as a list of columns and selects the columns at the positions in ind instead of the rows.
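The difference is easy to see on a small data frame. The sketch below uses a made-up data frame called weather (not the weather6 data from the question) with hypothetical columns day, gust and temp.

```r
# Made-up data: 4 rows, 3 columns, with missing gust values in rows 2 and 3
weather <- data.frame(day  = 1:4,
                      gust = c(10, NA, NA, 25),
                      temp = c(50, 60, 70, 80))

ind <- which(is.na(weather$gust))  # positions of the missing values

# With the comma: ind is the row index (i), the column index (j) is missing
rows <- weather[ind, ]

# Without the comma: ind is treated as a column index, list-style
cols <- weather[ind]

print(rows)
print(cols)
```

Here rows holds the two rows with a missing gust, while cols holds all four rows of the second and third columns.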

R: Extract list element but without index number in output

This seems to be a beginner question but I couldn't find the answer anywhere. I know how to extract an element from a list using listname[[1]] but output will always include the index number like
[1] First element of the list
Same is true for using the name like listname$name or unlist(listname$name). All I want is
First element of the list
I could of course remove the index number using regex but I doubt that this is the way it should be :-)
The reason that the [1] appears is because all atomic types in R are vectors (of type character, numeric, etc), i.e. in your case a vector of length one.
If you want to see the output without the [1], the simplest way is to cat the object:
> listname <- list("This is the first list element",
"This is the second list element")
> cat(listname[[1]], "\n")
This is the first list element
>
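For comparison, here is a short sketch of the printing options on the same list. writeLines() is a base R alternative that the answer does not mention; the variable name first is only for illustration.

```r
listname <- list("This is the first list element",
                 "This is the second list element")

first <- listname[[1]]

print(first)       # default printing: index prefix and quotes
cat(first, "\n")   # no prefix or quotes; the newline is added by hand
writeLines(first)  # no prefix or quotes; the newline is added for you
```

In all three cases the value itself is unchanged; only the way it is displayed differs.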

Using lists in R

Sorry for possibly a complete noob question but I have just started programming with R today and I am stuck already.
I am reading some data from a file which is in this format:
3.482373 8.0093238198371388 47.393873
0.32 20.3131 31.313
What I want to do is split each line then deal with each of the individual numbers.
I have imported the stringr package and using
x = str_split(line, " ")
This produces a list which I would like to index but don't know how.
I have learnt that x[[1:2]] gets the second element but that is about it. Ideally I would like something like
x1 = x[1]
x2 = x[2]
x3 = x[3]
But I can't find any way of doing this.
Thanks in advance
By using unlist you will get a vector instead of a list of vectors, and you will then be able to index it directly :
R> unlist(str_split("foo bar baz", " "))
[1] "foo" "bar" "baz"
But maybe you should read your file directly with read.table or one of its variants?
And if you are beginning with R, you really should read one of the available introductions if you want to understand subsetting, indexing, etc.
You can wrap your call to str_split with unlist to get the behavior you're looking for.
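Putting the unlist suggestion together with the x1, x2, x3 variables the question asks for might look like the sketch below. It assumes base R's strsplit() in place of stringr's str_split() so that it runs without extra packages, and it adds an as.numeric() step, which the answers above do not mention, because the split pieces are character strings.

```r
line <- "3.482373 8.0093238198371388 47.393873"

# strsplit() returns a list with one character vector per input string;
# unlist() flattens it to a plain character vector
parts <- unlist(strsplit(line, " ", fixed = TRUE))

# Convert the pieces from text to numbers
nums <- as.numeric(parts)

x1 <- nums[1]
x2 <- nums[2]
x3 <- nums[3]

print(nums)
```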
The usual way to get this in would be to import it into a dataframe (a special sort of list). If the file name is "fil.dat" and it is in "C:/dir/":
dfrm <- read.table("C:/dir/fil.dat") # resist the temptation to use backslashes
dfrm[2,2] # would give you the second item on the second row.
By default the field separator in read.table is "white-space", and that seems to be what you have, so you do not need to supply a sep= argument, and read.table will attempt to import the columns as numeric. To be on the safe side, you might consider forcing that with colClasses=rep("numeric", 3), because if it encounters a strange item (such as those often produced by Excel dumps), you will get a factor variable (or a character variable since R 4.0.0) and will probably not understand how to recover gracefully.
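The same import can be tried without a file on disk. This sketch feeds the two sample lines from the question to read.table through its text argument, which stands in for the file path; the name raw is hypothetical.

```r
# The two sample lines from the question, standing in for the file contents
raw <- "3.482373 8.0093238198371388 47.393873
0.32 20.3131 31.313"

# Whitespace is the default separator; colClasses forces numeric columns
dfrm <- read.table(text = raw, colClasses = rep("numeric", 3))

print(dfrm[2, 2])   # second item on the second row
print(names(dfrm))  # default column names
```

With a real file, the text argument would be replaced by the file path as in the answer above.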
