I am trying to do the following: I have two columns (let's say codeA and codeB) in a data frame A and want to compare their values to a column (codeC) of another data frame B. codeA and codeB are the same in most cases; when they differ, whichever of the two matches codeC should be written to a new column.
I tried to do this by combining if statements and for loops in R, but did not get the result I needed. Can someone help me?
Greatly appreciated!
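One possible approach, sketched on invented data: a vectorised ifelse() with %in% avoids the explicit loop. This assumes a "match" means the value appears anywhere in B$codeC (not a row-by-row comparison), and the new column name matched is made up for illustration.

```r
# Toy data; column names follow the question, the values are invented.
A <- data.frame(codeA = c("x1", "x2", "x3", "x4"),
                codeB = c("x1", "y2", "x3", "y4"),
                stringsAsFactors = FALSE)
B <- data.frame(codeC = c("x1", "y2", "x3"),
                stringsAsFactors = FALSE)

# Take codeA if it occurs in codeC, otherwise codeB if it does, otherwise NA.
A$matched <- ifelse(A$codeA %in% B$codeC, A$codeA,
                    ifelse(A$codeB %in% B$codeC, A$codeB, NA_character_))
```

If the two data frames should instead be compared row by row, the %in% tests would need to be replaced by a comparison on a shared key, which the question does not describe.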
I am a beginner in R and I have a question about dataframes.
In my case, I have this data.frame as an example:
I just want one ENSEMBL_ID per SYMBOL, so I'd like to get rid of one of these. Can someone tell me how I could do this? In this case unique(data.frame) does not work, as the rows differ in the third column.
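A minimal sketch of one way to do this, assuming it is fine to keep the first row seen for each SYMBOL. The example data frame is invented, since the original one is not shown; only the column names SYMBOL and ENSEMBL_ID come from the question.

```r
# Invented example: GENE1 maps to two different ENSEMBL IDs.
df <- data.frame(SYMBOL = c("GENE1", "GENE1", "GENE2"),
                 ENSEMBL_ID = c("ENSG01", "ENSG02", "ENSG03"),
                 value = c(1.5, 2.5, 3.0),
                 stringsAsFactors = FALSE)

# duplicated() flags every repeat of a SYMBOL after its first occurrence,
# so negating it keeps exactly one row per SYMBOL.
dedup <- df[!duplicated(df$SYMBOL), ]
```

Unlike unique(), this looks only at the SYMBOL column, so differences in the other columns do not matter.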
I am a Stata user and I have to use R for a project. I want to repeat a line of code for different values. In other words, I want to avoid repeating these lines 15 times:
M20<-as.data.frame(cbind(c(1:20),exclusivity(model20), semanticCoherence(model=model20, docs), "Mod20"))
M21<-as.data.frame(cbind(c(1:21),exclusivity(model21), semanticCoherence(model=model21, docs), "Mod21"))
If I follow the logic of Stata, I would write something like this:
forvalues i=20(1)35 {
M`i'<- as.data.frame(cbind(c(1:`i'),exclusivity(model`i'), semanticCoherence(model=model`i', docs), "Mod`i'"))
}
and Stata would understand that `i' should change each time, whether it's a number or part of the name of an object.
I can't figure out how I can replicate the same loop in R.
I appreciate your help
p.s. someone said my question is the same as this one. I don't see the similarity!
How to apply a function to a certain column for all the data frames in environment in R
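A rough R equivalent of the forvalues loop, as a sketch only. The models here are dummy lists, and summarise_model() is a hypothetical stand-in for the exclusivity() and semanticCoherence() calls from the question, so the snippet runs on its own. The part that carries over is get(paste0(...)) for building an object name from i, and storing the results in a named list rather than in separate M20, M21, ... objects.

```r
# Dummy stand-ins for the fitted models (model20 ... model35 in the question).
model20 <- list(k = 20)
model21 <- list(k = 21)

# Hypothetical helper replacing the exclusivity()/semanticCoherence() calls.
summarise_model <- function(model, i) {
  data.frame(topic = seq_len(i), k = model$k, label = paste0("Mod", i))
}

results <- list()
for (i in 20:21) {
  model <- get(paste0("model", i))                    # fetch the object named "model<i>"
  results[[paste0("M", i)]] <- summarise_model(model, i)
}
```

Afterwards results$M20 plays the role of M20. If separate objects are really needed, assign(paste0("M", i), ...) would create them, though a list is usually easier to work with.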
I'm learning the very basics of the R language.
I would like to loop (with either a for or a while loop, since that's what I'm learning) through all the values of a specific dataset column called "Kid.Height". Let's say the dataset is called test.
I can target a standard column like this : "test$KidHeight". But not with a dot in its name ("test$Kid.Height").
I would like to do something like this:
for(i in test$Kid.Height) {
print(Kid.Height.value);
}
So that I can read all the row values of that column
I can't find any instance on the web that tells me how to deal with dots in columns name.
I know how to target a column by its index, but not by name in a way that always works, however unusual the name is.
PS: since I'm learning the basics, I would be grateful if you could show me the cleanest, recommended way to do this, so that I can learn it properly from scratch.
Thank you.
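Columns with difficult names can also be accessed using square brackets. Single brackets return a one-column data frame, while double brackets return the column itself as a vector:
test['Kid.Height']
test[['Kid.Height']]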
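For the loop itself, a small sketch on made-up heights. The loop variable h holds the current value, so that is what gets printed. A dot is a valid character in an R name, so test$Kid.Height refers to the same column as the double-bracket form.

```r
# Made-up data using the column name from the question.
test <- data.frame(Kid.Height = c(110, 125, 98))

# Double brackets give the column as a plain vector to iterate over.
for (h in test[["Kid.Height"]]) {
  print(h)
}
```

For names that are not valid R identifiers (spaces, for example), test[["some name"]] or backticks as in test$`some name` should still work.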
I have a small problem, which I don't think is too hard, but I couldn't find an answer here (maybe I phrased my search badly, so please excuse me if the question has already been asked!)
I am importing data from an excel sheet which is split in two columns as in the following picture:
Now, I am trying to import all the data in the second column to my R script, but by splitting it into different vectors: one vector for category A, one for category B, etc... by keeping the data points in the order they are in the file (because as it happens, they are in chronological order).
Now, the categories each have a different number of elements, however, they are ordered alphabetically (ie you'll never find an A in the B's, for example). So I guess that makes it easier, but I'm still a novice with R and I don't really know how to proceed without getting really messy with the code and I know there's probably a simple way of doing it.
Does anyone have an idea on how to treat this nicely please? :)
We can use split in base R to return a list of vectors of 'Data', one for each unique value in 'Category':
lst1 <- split(df1$Data, df1$Category)
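To see what this returns, here is a small self-contained example. The values in df1 are invented; only the column names Category and Data follow the answer.

```r
# Invented data with the two columns described in the question.
df1 <- data.frame(Category = c("A", "A", "B", "B", "B", "C"),
                  Data = c(3, 1, 7, 5, 9, 2))

# One vector per category; the original row order is kept within each group.
lst1 <- split(df1$Data, df1$Category)

lst1$A   # the vector for category A
```

Each element can then be used as its own vector, e.g. lst1$B or lst1[["C"]].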
The goal of this code is a loop that aggregates each company's word frequency by a principle vector I created and adds the result to a list. The problem is that after I run it, the list only contains the 7 principles rather than the word frequencies alongside them. The word frequencies are the individual columns of the FREQBYPRINC.AG data frame. Running the aggregate call on its own, without the loop, on a single column works with no problem. For some reason, the loop does not put the correct data frames into the list. Any suggestions?
list.agg<-vector("list",ncol(FREQBYPRINC.AG)-2)
for (i in 1:14){
attach(FREQBYPRINC.AG)
list.agg[i]<-aggregate(FREQBYPRINC.AG[,i+1],by=list(Type=principle),FUN=sum,na.rm=TRUE)
}
The aggregate call itself is fine; the problem is how its result is stored. aggregate() returns a data frame, which is itself a list of columns. Assigning it with single brackets, list.agg[i] <- ..., replaces a one-element sub-list, so only the first column (Type, the principles) is kept and R warns about the length mismatch.
Since list.agg is a list, you need to assign each element with double square brackets. Try this:
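list.agg <- vector("list", ncol(FREQBYPRINC.AG) - 2)
# principle must be visible here, either as a standalone vector or via FREQBYPRINC.AG$principle
for (i in 1:14) {
  list.agg[[i]] <- aggregate(FREQBYPRINC.AG[, i + 1], by = list(Type = principle), FUN = sum, na.rm = TRUE)
}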
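A self-contained sketch of the same idea on toy data, so the result can be checked. The data frame contents and the column names companyA and companyB are invented, and it assumes principle is a column of FREQBYPRINC.AG, which the question does not state outright. It loops over column names instead of 1:14 so no attach() or hard-coded count is needed.

```r
# Toy stand-in for the real data: one grouping column plus frequency columns.
FREQBYPRINC.AG <- data.frame(
  principle = c("P1", "P2", "P1", "P2"),
  companyA  = c(1, 2, 3, 4),
  companyB  = c(10, 20, 30, 40)
)

freq_cols <- setdiff(names(FREQBYPRINC.AG), "principle")
list.agg <- vector("list", length(freq_cols))
names(list.agg) <- freq_cols

for (col in freq_cols) {
  # [[ ]] stores the whole data frame returned by aggregate() as one element.
  list.agg[[col]] <- aggregate(FREQBYPRINC.AG[[col]],
                               by = list(Type = FREQBYPRINC.AG$principle),
                               FUN = sum, na.rm = TRUE)
}
```

Each element, for example list.agg$companyA, is then a two-column data frame with the principle in Type and the summed frequency in x.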