Populating a Data Frame with Characters in a For Loop in R

Currently I have a loop that is adding rows from one data frame into another master data frame. Unfortunately, it converts the characters into numbers, but I don't want that. How can I get the following for loop to add the rows from one data frame into the master data frame while keeping the characters?
AnnotationsD <- data.frame(x = vector(mode = "numeric",
length = length(x)), type = 0, label = 0, lesion = 0)
x = c(1,2)
for(i in length(x)){
D = data.frame(x = i, type = c("Distance"),
label = c("*"), lesion = c("Wild"))
AnnotationsD[[i,]] <- D[[i]]
}
So what I would like to come out of this is:
x type label lesion
1 1 Distance * Wild
2 2 Distance * Wild

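This should work. Every column is initialised as character, and the loop runs over every index of x (the original for(i in length(x)) only visits the last index):
x <- c(1, 2)
# One placeholder row of character NAs; note that x is stored as character too
AnnotationsD <- data.frame(x = as.character(NA), type = as.character(NA),
label = as.character(NA), lesion = as.character(NA),
stringsAsFactors = FALSE)
for (i in seq_along(x)) {
# A named character vector fills row i column by column
D <- c(x = as.character(i), type = "Distance",
label = "*", lesion = "Wild")
AnnotationsD[i, ] <- D
}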
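If the loop is not strictly needed, the same table can be built in one call. This is only a minimal sketch, assuming every row shares the same type, label and lesion values as in the question. Here x stays numeric, and stringsAsFactors = FALSE keeps the text columns as characters (this is already the default from R 4.0.0 on).

```r
x <- c(1, 2)

# Scalars are recycled to the length of x, so no loop is required
AnnotationsD <- data.frame(
  x = x,
  type = "Distance",
  label = "*",
  lesion = "Wild",
  stringsAsFactors = FALSE
)

str(AnnotationsD)
```

If the rows really do differ per iteration, collecting them in a list and calling do.call(rbind, ...) once at the end avoids growing the data frame inside the loop.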

Related

Trying to create an R function which finds the input value in column 1 of a dataframe and returns column 2 value of the same row

New to R functions, I have a dataframe which looks like this except about 10,000 rows long:
Gene.name  Ortho.name
abc        DEF
qrs        TUV
wx         YZ
I'm trying to create a really simple function in R which, when I input qrs, returns TUV. If someone could help I would really appreciate it.
fun <- function(vec, data) data$Ortho.name[ match(vec, data$Gene.name) ]
Z <- structure(list(Gene.name = c("abc", "qrs", "wx"), Ortho.name = c("DEF", "TUV", "YZ")), class = "data.frame", row.names = c(NA, -3L))
fun("qrs", data = Z)
# [1] "TUV"
fun("nothing", data = Z)
# [1] NA
fun(c("qrs", "abc", "not found"), data = Z)
# [1] "TUV" "DEF" NA
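The same lookup can also be sketched with a named vector, which reads a little like a dictionary. This is an illustrative alternative, not part of the answer above; ortho_lookup is a hypothetical name, and the sketch assumes gene names are unique (with duplicates, the first match is returned, as with match()).

```r
Z <- data.frame(
  Gene.name = c("abc", "qrs", "wx"),
  Ortho.name = c("DEF", "TUV", "YZ"),
  stringsAsFactors = FALSE
)

# Values are the ortholog names, names are the gene names
ortho_lookup <- setNames(Z$Ortho.name, Z$Gene.name)

# Indexing by name returns NA for genes that are not present
result <- unname(ortho_lookup[c("qrs", "abc", "not found")])
print(result)
```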
In case anyone is using Seurat to plot orthologs in a cross-species comparison, this is how I implemented the match()-based lookup above using orthologs from BioMart:
library(Seurat)    # FeaturePlot()
library(patchwork) # assumed: lets `+` combine the plots side by side
chick_fish_ortho <- read.csv("chick_orthos.csv")
mac_fish_ortho <- read.csv("mac_orthos.csv")
# Define the lookup helpers before calling them
chickfun <- function(vec, data) data$Chicken.gene.name[ match(vec, data$Gene.name) ]
macfun <- function(vec, data) data$Macaque.gene.name[ match(vec, data$Gene.name) ]
macfun('glula', mac_fish_ortho)
chickfun('glula', chick_fish_ortho)
# fish_MG, chick_MG and mac_MG are Seurat objects that already exist in the session
fish_chick_mac <- function(gene, chickdata, macdata) {
p1 = FeaturePlot(object = fish_MG, reduction = "umap", label = TRUE, min.cutoff = 0, features = gene)
p2 = FeaturePlot(object = chick_MG, reduction = "umap", label = TRUE, min.cutoff = 0, features = chickfun(gene, chickdata))
p3 = FeaturePlot(object = mac_MG, reduction = "umap", label = TRUE, min.cutoff = 0, features = macfun(gene, macdata))
p1 + p2 + p3
}
fish_chick_mac('glula', chick_fish_ortho, mac_fish_ortho)

Conditional statement: change one variable in a data list based on certain input

Can I use a conditional statement to change one variable in a data list based on a certain input?
For instance, take the data list below. I need d to be perd or phyd depending on the input: with dlist[x], d = perd; with dlist[y], d = phyd. x and y can be anything; all I need is a way to pass a selector that sets d to perd or phyd.
# perd and phyd must exist before the list refers to them
perd <- c(1, 2, 3)
phyd <- c(4, 5, 6)
dlist <- list(
Nsubjects = 1,
Ntrials = 2,
d = perd
)
Can you create another named list to store perd and phyd?
plist <- list(x = c(1, 2, 3), y = c(4, 5, 6))
You can then extract the data from it by its name.
val <- 'x'
dlist <- list(
Nsubjects = 1,
Ntrials = 2,
d = plist[[val]]
)
Without creating plist, you can do:
list(
Nsubjects = 1,
Ntrials = 2,
d = if(val == 1) c(1,2,3) else c(4,5,6)
)
Or:
list(
Nsubjects = 1,
Ntrials = 2,
d = list(c(1,2,3),c(4,5,6))[[val]]
)
where val is 1 or 2 (set with val <- 1 or val <- 2).
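Another way to express the same selection is switch(), which maps a character key to a value and can fail loudly on an unexpected key. This is a minimal sketch; make_dlist is a hypothetical helper name, and the keys "x" and "y" are simply the ones used in the question.

```r
perd <- c(1, 2, 3)
phyd <- c(4, 5, 6)

# Build the list with d chosen by a character key
make_dlist <- function(key) {
  d <- switch(key,
    x = perd,
    y = phyd,
    stop("unknown key: ", key)  # default branch, evaluated only if nothing matches
  )
  list(Nsubjects = 1, Ntrials = 2, d = d)
}

dlist_x <- make_dlist("x")
dlist_y <- make_dlist("y")
```

Compared with indexing an unnamed list by position, the named branches make it harder to mix up which input selects which vector.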

R: Get index names while looping through df elements

Say, I have a data frame and I need to do something with its cells and remember what cells I have changed. One way is to loop through indices with two for-loops. But is there a way to do this with one loop?
Ideally, I need something like this:
changes = data.frame(Row = character(), Col = character())
for (cell in df){
if (!(is.na(cell))){
cell = do.smt(cell)
temp = list(Row = get.row(cell), Col = get.col(cell))
changes = rbind(changes,temp)
}
}
Example of what I need:
df = data.frame(A = c(1,2,3), B = c(4,5,6), C = c(7,8,9))
rownames(df) = c('a','b','c')
changes = data.frame(Row = NA, Col = NA)
for (i in rownames(df)){
for (j in colnames(df)) {
if (df[i,j] > 5) {
df[i,j] = 0
temp = list(Row = i, Col = j)
changes = rbind(changes, temp)
}
}
}
This gets rid of both loops:
df = data.frame(A = c(1,2,3), B = c(4,5,6), C = c(7,8,9))
rownames(df) = c('a','b','c')
changes <- which(df > 5, arr.ind=TRUE)
df[changes] <- 0
If you want the format exactly as specified you can sort that out with
changes <- data.frame(changes,row.names=NULL)
changes$row <- rownames(df)[changes$row]
changes$col <- colnames(df)[changes$col]
and it's a simple matter of sorting if you need the order of the rows to match your example output.
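Putting the pieces together, a self-contained sketch might look like the following. The Row and Col column names and the row-then-column ordering are taken from the question's nested loop; idx is just an illustrative variable name.

```r
df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6), C = c(7, 8, 9))
rownames(df) <- c("a", "b", "c")

# Two-column matrix of (row, col) positions where the condition holds
idx <- which(df > 5, arr.ind = TRUE)

# Matrix indexing changes all matching cells at once
df[idx] <- 0

# Translate positions into names, using the column names from the question
changes <- data.frame(
  Row = rownames(df)[idx[, "row"]],
  Col = colnames(df)[idx[, "col"]],
  stringsAsFactors = FALSE
)

# which() reports cells column by column; reorder to row-by-row like the nested loop
changes <- changes[order(idx[, "row"], idx[, "col"]), ]
rownames(changes) <- NULL
print(changes)
```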

Faster alternative to nested loops

I have written the function below, which contains a nested loop. In short, it calculates differences in emissions between i (28) pairs of alternative technologies for j (48) countries. For a single combination and a single country it takes 0.32 sec, which should give a total time of 0.32*28*48 sec, or around 7 min. The function actually takes about 50 min, which makes me think there may be some unnecessary computation going on. Is a nested loop the most efficient approach here?
Any help is greatly appreciated!
alt.comb.p <- function(Fmat){
y.empty = matrix(data = 0,ncol = 2,nrow = nrow(FD)-1)
row.names(y.empty) <- paste(FD$V1[2:nrow(FD)],FD$V2[2:nrow(FD)],sep = " ")
country.list = unique(FD$V1)
for (j in 1:length(country.list)){ # for every country
for (i in 1:ncol(alt.comb)){ # for every possible combination
# the final demand of the first item of the combination is calculated
first = alt.comb[,i][1]
first.name = row.names(Eprice.Exio)[first]
loc1 = grep(pattern = first.name,x = row.names(y.empty))
country.first = substr(x = row.names(y.empty)[loc1[j]],start = 0,stop = 2)
y.empty[,1][loc1[j]] <- Eprice.Exio[first.name,country.first]
# the final demand of the second item of the combination is calculated
second = alt.comb[,i][2]
second.name = row.names(Eprice.Exio)[second]
loc2 = grep(pattern = second.name,x = row.names(y.empty))
country.second = substr(x = row.names(y.empty)[loc2[j]],start = 0,stop = 2)
y.empty[,2][loc2[j]] <- Eprice.Exio[second.name,country.second]
# calculates the difference between the total pressures from item 1 and item 2
r.1 = sum(Fmat%*%as.vector(y.empty[,1]))
r.2 = sum(Fmat%*%as.vector(y.empty[,2]))
r.dif = r.1-r.2 # negative means alternative 1 is better
alt.comb[2+j,i] <- r.dif
row.names(alt.comb)[2+j] <- country.first
y.empty = matrix(data = 0,ncol = 2,nrow = nrow(FD)-1)
row.names(y.empty) <- paste(FD$V1[2:nrow(FD)],FD$V2[2:nrow(FD)],sep = " ")
}
}
return(alt.comb)
}
Edit:
A simplified example would be:
Fmat = matrix(data = runif(1:9600), ncol=9600, nrow=9600)
alt.comb.p <- function(Fmat){
y.empty = matrix(data = 0,ncol = 2,nrow = 9600)
country.list = runif(n = 10)
alt.comb = matrix(data=0,ncol=5,nrow=10)
for (j in 1:10){ # for every country
for (i in 1:5){ # for every possible combination
y.empty[50,1] <- runif(1)
y.empty[60,2] <- runif(1)
# calculates the difference between the total pressures from item 1 and item 2
r.1 = sum(Fmat%*%as.vector(y.empty[,1]))
r.2 = sum(Fmat%*%as.vector(y.empty[,2]))
r.dif = r.1-r.2 # negative means alternative 1 is better
alt.comb[j,i] <- r.dif
y.empty = matrix(data = 0,ncol = 2,nrow = 9600)
}
}
return(alt.comb)
}
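One observation that may help, offered as a sketch rather than a tested fix for the full function: sum(Fmat %*% y) equals sum(colSums(Fmat) * y), because summing the rows of a matrix-vector product is the same as weighting y by the column totals. The column totals depend only on Fmat, so they can be computed once outside both loops, and each iteration then costs a vector product instead of a full matrix product. The matrix below is deliberately small (200 x 200 instead of 9600 x 9600), and col_totals is an illustrative name.

```r
set.seed(1)
n <- 200
Fmat <- matrix(runif(n * n), nrow = n, ncol = n)

# Computed once, outside any loop
col_totals <- colSums(Fmat)

# A final-demand vector with a single non-zero entry, as in the simplified example
y <- numeric(n)
y[50] <- runif(1)

slow <- sum(Fmat %*% y)      # full matrix-vector product on every iteration
fast <- sum(col_totals * y)  # same value from the precomputed column totals

print(all.equal(slow, fast))
```

Whether this removes the whole gap between 7 and 50 minutes is not something the example can show; the repeated grep() calls and the re-creation of y.empty on every iteration may also contribute.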

Cycle for plotting multiple graphs according to the number of factors

I have a factor vector containing 25 unique levels that categorizes two numeric variables (x, y).
I want to plot a scatterplot for each level of the factor:
for (factor in Coordinates$matrixID) {
dev.new()
plot(grid, type = "n")
vectorField(Coordinates$Angle,Coordinates$Length,Coordinates$x,Coordinates$y,scale=0.15, headspan=0, vecspec="deg")
}
This loop results in 63 identical graphs of the overall data. I want 25 different graphs, one for each factor level.
Could you please help me,
Thanks
EDIT: Example given
library(VecStatGraphs2D)
Data <- data.frame(
x = sample(1:100),
y = sample(1:100),
angle = sample(1:100),
lenght = sample(1:100),
matrixID = sample(letters[1:25], 20, replace = TRUE))
for (factor in matrixID) {
dev.new()
plot(grid, type = "n")
VectorField(Data$angle,Data$lenght,Data$x,Data$y,scale=0.15,headspan=0, vecspec="deg")
}
Not so tidy, but you may try something like:
library(plotrix)
library(VecStatGraphs2D)
Data <- data.frame(
x = sample(1:100),
y = sample(1:100), angle = sample(1:100), lenght = sample(1:100),
matrixID = sample(letters[1:4], 20, replace = TRUE))
for (i in unique(Data$matrixID))
{
dev.new()
Data1 <- subset(Data, matrixID == i)
plot(0:100, type = "n")
vectorField(Data1$angle,Data1$lenght,Data1$x,Data1$y,scale=0.15, headspan=0, vecspec="deg")
}
for your example, and
for (i in unique(Coordinates$matrixID))
{
dev.new()
Coordinates1 <- subset(Coordinates, matrixID == i)
plot(grid, type = "n")
vectorField(Coordinates1$Angle,Coordinates1$Length,Coordinates1$x,Coordinates1$y,scale=0.15, headspan=0, vecspec="deg")
}
in your code.
Is this what you're trying to achieve?
# Dummy dataset
library(plotrix)
Data <- data.frame(
x = sample(1:100),
y = sample(1:100), angle = sample(1:100), lenght = sample(1:100), matrixID = sample(letters[1:4], 20, replace = TRUE))
# Get the levels of matrixID (factor() is needed because data.frame() keeps strings as character from R 4.0.0 on)
lev <- levels(factor(Data$matrixID))
# Plot each graph
for (i in lev) {
temp <- subset(Data,matrixID==i)
plot(temp$x,temp$y,type="n", main=i)
with(temp, vectorField(u=angle,v=lenght,x=x,y=y,scale=0.15,headspan=0, vecspec="deg"))
}
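The underlying issue in the question is that the loop runs once per row of matrixID rather than once per distinct value, and never subsets the data. A base-R sketch of the grouping step, using split() and plain scatterplots as a stand-in for vectorField(), could look like this. The data are random and purely illustrative, and pdf(NULL) is used only so the example draws without opening windows or writing a file.

```r
set.seed(42)
Data <- data.frame(
  x = sample(1:100, 20),
  y = sample(1:100, 20),
  matrixID = sample(letters[1:4], 20, replace = TRUE),
  stringsAsFactors = FALSE
)

# One data frame per distinct matrixID value
groups <- split(Data, Data$matrixID)

pdf(NULL)  # throwaway graphics device; replace with dev.new() for interactive use
for (id in names(groups)) {
  g <- groups[[id]]
  # Each plot only sees the rows belonging to this matrixID
  plot(g$x, g$y, xlim = c(0, 100), ylim = c(0, 100),
       main = id, xlab = "x", ylab = "y")
}
invisible(dev.off())

n_plots <- length(groups)
```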

Resources