Assigning a list to column properties

I have a list of values that I want to assign to the column properties of a table in Spotfire. I am currently using a for loop to do it. Is there a better approach, such as assigning the entire list in one go?
This is the for loop I am using at the moment:
high=c(5,2,10)
low=c(3,1,0)
for(col in 1:ncol(temp)){
  # set the upper and lower what-if limits on each column
  attributes(temp[,col])$SpotfireColumnMetaData$limits.whatif.upper=high[col]
  attributes(temp[,col])$SpotfireColumnMetaData$limits.whatif.lower=low[col]
}
I have also tried simply doing
attributes(temp2)$SpotfireColumnData$limits.whatif.upper=high
but that didn't seem to work.
So I want limits.whatif.upper to be 5 for the first row, 2 for the second, and 10 for the third. As I said, this code works, but I want to see if there is a faster way of doing it, since accessing the column property and changing it every time seems to slow the code down a lot. The column properties already exist, so I am not creating new ones with this code.

It seems that Python works faster than R with column properties. So if you need more speed, it may be better to transfer the data over to Python and do it from there. I don't have as much experience in R, so it may also just be poorly written R code.
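On the R side, one thing that can be tried is writing both limits in a single attribute update per column instead of two. The sketch below is plain R and has not been timed inside Spotfire, so any speed-up is an assumption. It also assumes the properties live in a list attribute named SpotfireColumnMetaData, as the loop in the question implies. The helper name set_limits and the stand-in table are made up for illustration.

```r
# Hypothetical stand-in for the Spotfire table; in Spotfire, temp already exists
temp <- data.frame(x = 1:3, y = 4:6, z = 7:9)
high <- c(5, 2, 10)
low  <- c(3, 1, 0)

# Hypothetical helper: update both limits on one column with one attribute write
set_limits <- function(column, upper, lower) {
  meta <- attr(column, "SpotfireColumnMetaData")
  if (is.null(meta)) meta <- list()   # plain R columns start without metadata
  meta$limits.whatif.upper <- upper
  meta$limits.whatif.lower <- lower
  attr(column, "SpotfireColumnMetaData") <- meta
  column
}

# Map() pairs each column with its limits; temp[] keeps temp a data frame
temp[] <- Map(set_limits, temp, high, low)
```

This still visits every column once, so it is closer to a tidier loop than to a true one-shot assignment.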

Related

R - looping through the values of a column that has a dot in its name

I'm learning the very basics of the R language.
I would like to loop (with either a for or a while loop, since that's what I'm learning) through all the values of a specific dataset column called "Kid.Height". Let's say the dataset is called test.
I can target a standard column like this: "test$KidHeight". But not one with a dot in its name ("test$Kid.Height").
I would like to do something like this:
for(i in test$Kid.Height) {
print(Kid.Height.value);
}
so that I can read all the row values of that column.
I can't find anything on the web that tells me how to deal with dots in column names.
I know how to target a column by its index, but not by name in a way that always works, however fancy the name is.
PS: since I'm learning the basics, I would be grateful if you could show me the cleanest and most recommended way to achieve this, so that I can learn it properly from scratch.
Thank you.

Columns with difficult names can also be accessed using square brackets. Try:
test['Kid.Height']
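For looping over the values, note that single brackets return a one-column data frame, while double brackets return the column itself as a vector. A minimal sketch, assuming a made-up test data frame (the heights are illustrative, not from the question):

```r
# Hypothetical data standing in for the asker's dataset
test <- data.frame(Kid.Height = c(110, 125, 132))

# [[ ]] extracts the column as a plain vector; test$Kid.Height gives the same
heights <- test[["Kid.Height"]]

# Loop over the values one at a time
for (h in heights) {
  print(h)
}
```

A dot is a valid character in an R name, so test$Kid.Height should also work here as written.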

Removing columns from an index request in a data frame

Hi, I currently have this code:
matrixed.data <- data.matrix(df[1:row.dim,7:col.dim])
Here row.dim and col.dim are variables holding the size of the whole frame. I would like to remove the column "df$WEATHER", which falls inside the 7:col.dim selection, but I don't know how to word it. I have tried adding - df$WEATHER and !df$WEATHER inside the brackets, but I fear I'm misinterpreting the scope of these commands.
Is it possible to do this without creating a new col.dim variable? I'm trying to keep the code as general as possible, as the data frame may grow in the future.

Thank you digEmAll! I thought it would be reasonably simple; I'm just a bit too green at R to think of something like that. For others, what worked for me was:
(df[1:row.dim, setdiff(7:col.dim,which(names(df) == "WEATHER"))])
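The same idea on a small made-up data frame, as a sketch. Only the WEATHER column comes from the question; the other column names, the values, and the starting column 2 (instead of 7) are assumptions to keep the example short.

```r
# Hypothetical data frame; only the WEATHER column name is from the question
df <- data.frame(a = 1:3, b = 4:6, WEATHER = c("sun", "rain", "fog"), d = 7:9)
row.dim <- nrow(df)
col.dim <- ncol(df)

# Column positions 2..col.dim, minus the position of WEATHER
keep <- setdiff(2:col.dim, which(names(df) == "WEATHER"))
matrixed.data <- data.matrix(df[1:row.dim, keep])
```

If WEATHER is not present, which() returns an empty vector and setdiff() leaves the range unchanged, so the line keeps working as the frame grows.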

Not aggregating correctly

The goal of this code is to create a loop that aggregates each company's word frequency by a principle vector I created and adds the result to a list. The problem is that after I run it, it only prints the 7 principles rather than the word frequencies alongside them. The word frequencies are the individual columns of the FREQBYPRINC.AG data frame. Running the code without the loop on a single column works with no problem, but the loop does not give me the correct data frames in the list. Any suggestions?
list.agg<-vector("list",ncol(FREQBYPRINC.AG)-2)
for (i in 1:14){
attach(FREQBYPRINC.AG)
list.agg[i]<-aggregate(FREQBYPRINC.AG[,i+1],by=list(Type=principle),FUN=sum,na.rm=TRUE)
}

The code looks like it should work, but the assignment into the list is the problem.
Since list.agg is a list, each data frame has to be stored with double square brackets. With single brackets, R treats the data frame as a list of columns and keeps only the first one, which is the principles column. Try this one out:
list.agg<-vector("list",ncol(FREQBYPRINC.AG)-2)
for (i in 1:14){
  list.agg[[i]]<-aggregate(FREQBYPRINC.AG[,i+1],by=list(Type=principle),FUN=sum,na.rm=TRUE)
}
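The difference between the two bracket forms can be seen on a toy example. This is a sketch with a made-up two-column data frame standing in for one aggregate() result; the names res, single and double are illustrative only.

```r
# Hypothetical stand-in for one aggregate() result
res <- data.frame(Type = c("p1", "p2"), x = c(10, 20))

single <- vector("list", 1)
double <- vector("list", 1)

# [ ] replaces one list element with the first column of res only
# (R warns that the replacement has more items than there are slots)
suppressWarnings(single[1] <- res)

# [[ ]] stores the whole data frame as one list element
double[[1]] <- res
```

After running this, single[[1]] holds just the Type column, while double[[1]] is the full data frame.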

How to get R to use a certain dataset for multiple commands without using attach() or appending data="" to every command

So I'm trying to manipulate a simple Qualtrics CSV, and I want to use colSums on certain columns of data, given a certain filter.
For example: within the .csv file called data, I want to get the sum of a few columns, and print them with certain labels (say choice1, choice2 etc). That is easy enough by itself:
firstqn<-data.frame(choice1=data$Q7_2,choice2=data$Q7_3,choice3=data$Q7_4);
secondqn<-data.frame(choice1=data$Q8_6,choice2=data$Q8_7,choice3=data$Q8_8)
print(colSums(firstqn)); print(colSums(secondqn))
The problem comes when I want to repeat the above steps with different filters, say, only the rows where gender==2.
The only way I know how is to create a new dataset data2 and replace data$ with data2$ in every line of the above code, such as:
data2<-(data[data$Q2==2,])
firstqn<-data.frame(choice1=data2$Q7_2,choice2=data2$Q7_3,choice3=data2$Q7_4);
However, I have 6 choices for each of 5 questions and am planning to apply about 5-10 different filters, and I don't relish the thought of copy/pasting data2, data3 etc. hundreds of times.
So my question is: is there any way of getting R to reference data by default without using data$ in front of every variable name?
I can probably use attach() to achieve this, but I really don't want to:
data2<-(data[data$Q2==2,])
attach(data2)
firstqn<-data.frame(choice1=Q7_2,choice2=Q7_3,choice3=Q7_4);
detach(data2)
Is there a command like attach() that would allow me to avoid using data$ in front of every variable for a specified stretch of code? Then whenever I wanted to create a new filter, I could just copy/paste the same code and change the first command (defining a new dataset).
I guess I'm looking for some command like with(data2, *insert multiple commands here*)
Alternatively, if anyone has a better way to do the above in an entirely different way, please enlighten me. I'm not very proficient at R (yet).
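The with() call guessed at above does accept several commands when they are wrapped in braces, and wrapping the repeated steps in a function avoids the copy/pasting. A minimal sketch, assuming made-up survey values; the function name summarise_q7 is hypothetical and only the column names come from the question.

```r
# Hypothetical survey data; column names follow the question
data <- data.frame(
  Q2   = c(1, 2, 2, 1),
  Q7_2 = c(1, 0, 1, 1),
  Q7_3 = c(0, 1, 1, 0),
  Q7_4 = c(1, 1, 0, 0)
)

# Wrap the repeated steps in a function that takes the (filtered) data frame
summarise_q7 <- function(d) {
  # with() evaluates the braced block using d's columns as variables
  with(d, {
    firstqn <- data.frame(choice1 = Q7_2, choice2 = Q7_3, choice3 = Q7_4)
    colSums(firstqn)
  })
}

all_rows   <- summarise_q7(data)                   # no filter
gender_two <- summarise_q7(data[data$Q2 == 2, ])   # only rows where Q2 == 2
```

Each new filter then only needs a different subset passed to the function, with no attach() or detach() involved.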

using value of a function & nested function in R

I wrote a function in R called "filtre": it takes a data frame and, for each line, decides whether it should go in, say, bin 1 or bin 2. At the end, we have two data frames that together make up the original input, holding all lines thrown into bin 1 and bin 2 respectively. These two sets are referred to as filtre1 and filtre2. For convenience, filtre1 and filtre2 are calculated but not returned, because they are an intermediate step in a bigger process (plus they are quite big data frames). I have the following issue:
(i) When I later want to use filtre1 (or filtre2), they simply don't show up, as if their values were stuck inside the function and not recognised elsewhere. That would oblige me to copy the whole function every time I want to use them, which is quite painful and heavy.
I suspect this is a rather simple thing, but I searched the web and did not really find the answer (I was not sure of the best keywords). Sorry for any inconvenience.
Thxs / g.

It's pretty hard to know the best way of achieving what you want, as you do not provide a proper example, but I'll give it a try. If your variables filtre1 and filtre2 are defined inside your function and you do not return them, of course they do not show up in your environment. But you could just return the classification and make filtre1 and filtre2 afterwards:
#example data
df<-data.frame(id=1:20,x=sample(1:20,20,replace=TRUE))
filtre<-function(df){
  #example function, this could of course be done by bins<-df$x<10
  bins<-numeric(nrow(df))
  for(i in 1:nrow(df))
    if(df$x[i]<10) #test row i only, not the whole column
      bins[i]<-1
  return(bins)
}
bins<-filtre(df)
filtre1<-df[bins==1,]
filtre2<-df[bins==0,]
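If the function should hand back both subsets directly, another option is to return them together in a named list. This is a sketch of that variant, not the asker's actual function; the name filtre_split and the x < 10 rule are assumptions carried over from the example above.

```r
# Example data, same shape as in the answer above
df <- data.frame(id = 1:20, x = sample(1:20, 20, replace = TRUE))

# Hypothetical variant: return both subsets in one named list
filtre_split <- function(df) {
  in_bin1 <- df$x < 10   # TRUE for rows that belong in bin 1
  list(filtre1 = df[in_bin1, ], filtre2 = df[!in_bin1, ])
}

parts <- filtre_split(df)
filtre1 <- parts$filtre1
filtre2 <- parts$filtre2
```

A function can only return one object, so a list is the usual way to get several results out without resorting to global variables.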