I am generating dynamic variable names like P_1_Onsets_PRH, P_2_Onsets_PRH, etc. in a for loop. In the same loop, I'd like to read these variables and generate corresponding matrices P_1_Durations_PRH, etc. with the same number of elements as the respective Onsets matrix.
for (i in 1:nrow(LabviewFiles)){
assign(x = paste("P",i , "Onsets_PRH", sep = "_"), value = t(subset.data.frame(All_Phase, All_Phase$Phase==i) %>%
filter(CONDITIONS == "NULL_TRIAL",
MISC_REWARD == 1,
MISC_PASSIVE_FAILED == 1) %>%
select(Feedback_onset)))
assign(x = paste("P",i , "Durations_PRH", sep = "_"), value = t(rep(0.5, times = length(noquote(paste("P",i , "Onsets_PRH", sep = "_"))))))
}
How do I read the length of matrix 'P_i_Onsets_PRH'?
I'm a newbie to R. Any help is appreciated.
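paste() only returns the name as a character string, so length() of it is always 1. You may use get() to fetch the object that has that name -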
library(dplyr)
for (i in 1:nrow(LabviewFiles)){
assign(x = paste("P",i , "Onsets_PRH", sep = "_"), value = t(subset.data.frame(All_Phase, All_Phase$Phase==i) %>%
filter(CONDITIONS == "NULL_TRIAL",
MISC_REWARD == 1,
MISC_PASSIVE_FAILED == 1) %>%
select(Feedback_onset)))
assign(x = paste("P",i , "Durations_PRH", sep = "_"), value = t(rep(0.5, times = length(get(paste("P",i , "Onsets_PRH", sep = "_"))))))
}
Note that using assign to create variables in the global environment is discouraged. See also "Why is using assign bad?"
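One way to avoid assign altogether is to keep the per-phase results in named lists. The following is only a minimal sketch in base R: the toy All_Phase data frame and its values are invented for illustration, and the phases are taken from the Phase column rather than from nrow(LabviewFiles).

```r
# Toy stand-in for All_Phase; the values are invented for illustration only
All_Phase <- data.frame(
  Phase               = c(1, 1, 1, 2, 2),
  CONDITIONS          = "NULL_TRIAL",
  MISC_REWARD         = 1,
  MISC_PASSIVE_FAILED = 1,
  Feedback_onset      = c(0.8, 2.4, 5.1, 1.2, 3.3),
  stringsAsFactors    = FALSE
)

phases <- sort(unique(All_Phase$Phase))

# One list element per phase instead of one global variable per phase
Onsets_PRH <- lapply(phases, function(i) {
  keep <- All_Phase$Phase == i &
    All_Phase$CONDITIONS == "NULL_TRIAL" &
    All_Phase$MISC_REWARD == 1 &
    All_Phase$MISC_PASSIVE_FAILED == 1
  t(All_Phase$Feedback_onset[keep])  # 1 x n matrix, as in the question
})
names(Onsets_PRH) <- paste("P", phases, "Onsets_PRH", sep = "_")

# Durations: a 1 x n matrix of 0.5 matching each onset matrix
Durations_PRH <- lapply(Onsets_PRH, function(m) t(rep(0.5, times = length(m))))
names(Durations_PRH) <- paste("P", phases, "Durations_PRH", sep = "_")

# Elements are looked up by name, no get() needed
length(Onsets_PRH[["P_1_Onsets_PRH"]])
```

With this layout the length of any onset matrix is available directly from the list, and both lists can be processed with lapply or Map.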
Code:
GeoSeparate <- function(Dataset, GeoColumn) {
GeoColumn <- enquo(GeoColumn)
Dataset %>%
separate(GeoColumn, into = c("Section1", "Section2"), sep = "\\(")%>%
separate(Section1, into = c("Section3", "Section4"), sep = ",")%>%
separate(Section2, into = c("GeoColumn", "Section5"), sep = "\\)")%>%
separate(GeoColumn, into = c("GeoColumnLat", "GeoColumnLon"), sep = ",")%>%
select(-Section3, -Section4, -Section5) #remove sections we don't need
}
Test:
GeoSeparate(df3, DeathCityGeo)
Error:
Must extract column with a single subscript.
x The subscript `var` has the wrong type `quosure/formula`.
ℹ It must be numeric or character.
My function separates a column that has the format: "Norwalk, CT\n(41.11805, -73.412906)" so that the latitude and longitude are all that remain and they are in two separate columns. It worked for a while, but now I get the error message described above. It may be because I updated my libraries but I'm not sure. Any help would be amazing! Thank you.
enquo() captures the column as a quosure, so we need to unquote it (!!) where it is used
GeoSeparate <- function(Dataset, GeoColumn) {
GeoColumn <- enquo(GeoColumn)
Dataset %>%
separate(!!GeoColumn, into = c("Section1", "Section2"), sep = "\\(")%>%
separate(Section1, into = c("Section3", "Section4"), sep = ",")%>%
separate(Section2, into = c("GeoColumn", "Section5"), sep = "\\)")%>%
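# The input column was consumed by the first separate(); "GeoColumn" here is the literal column created on the previous line
separate(GeoColumn, into = c("GeoColumnLat", "GeoColumnLon"), sep = ",")%>%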
select(-Section3, -Section4, -Section5) #remove sections we don't need
}
Another option is the curly-curly operator ({{}}), which makes the enquo() call unnecessary
GeoSeparate <- function(Dataset, GeoColumn) {
Dataset %>%
separate({{GeoColumn}}, into = c("Section1", "Section2"), sep = "\\(")%>%
separate(Section1, into = c("Section3", "Section4"), sep = ",")%>%
separate(Section2, into = c("GeoColumn", "Section5"), sep = "\\)")%>%
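# The input column was consumed by the first separate(); "GeoColumn" here is the literal column created on the previous line
separate(GeoColumn, into = c("GeoColumnLat", "GeoColumnLon"), sep = ",")%>%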
select(-Section3, -Section4, -Section5) #remove sections we don't need
}
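To check the idea on data, here is a condensed sketch of the same function with a one-row example. It assumes the dplyr and tidyr packages are installed; the one-row df3 and the intermediate column names Place and Coords are made up for illustration and are not the asker's real data.

```r
library(dplyr)
library(tidyr)

# Hypothetical one-row stand-in for df3, in the format given in the question
df3 <- tibble(DeathCityGeo = "Norwalk, CT\n(41.11805, -73.412906)")

GeoSeparate <- function(Dataset, GeoColumn) {
  Dataset %>%
    # Split "City, ST\n" from "lat, lon)" at the opening parenthesis
    separate({{ GeoColumn }}, into = c("Place", "Coords"), sep = "\\(") %>%
    # Drop the closing parenthesis before splitting the coordinates
    mutate(Coords = sub("\\)", "", Coords)) %>%
    # Split latitude and longitude and convert them to numbers
    separate(Coords, into = c("GeoColumnLat", "GeoColumnLon"),
             sep = ",\\s*", convert = TRUE) %>%
    select(-Place)  # remove the part we don't need
}

out <- GeoSeparate(df3, DeathCityGeo)
out
```

Only the first separate() needs the embraced argument, because that is the only step that refers to the column supplied by the caller. Every later step works on columns whose names are fixed inside the function.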
I want to perform a set of operations (in R) on a number of data frames located within a list. In particular, for each one I create a "library" column, which is then used to determine which kind of filtering operation to perform. This is the actual code:
sampleList <- list(RNA1 = "data/not_processed/dedup.Bp1R4T2_S2.txt",
RNA2 = "data/not_processed/dedup.Bp1R4T3_S4.txt",
RNA3 = "data/not_processed/dedup.Bp1R5T2_S1.txt",
RNA4 = "data/not_processed/dedup.Bp1R5T3_S2.txt",
RNA5 = "data/not_processed/dedup.Bp1R14T5_S1.txt",
RNA6 = "data/not_processed/dedup.Bp1R14T6_S1.txt",
RNA7 = "data/not_processed/dedup.Bp1R14T6_S2.txt",
RNA8 = "data/not_processed/dedup.Bp1R14T7_S2.txt",
RNA9 = "data/not_processed/dedup.Bp1R14T8_S3.txt",
RNA10 = "data/not_processed/dedup.Bp1R14T9_S3.txt",
RNA11 = "data/not_processed/dedup.Bp1R14T9_S4.txt",
DNA1 = "data/not_processed/dedup.dna10_1_S4.txt",
DNA2 = "data/not_processed/dedup.dna10_2_S5.txt",
DNA3 = "data/not_processed/dedup.dna10_3_S6.txt",
DNA4 = "data/not_processed/dedup.dna50_1_S1.txt",
DNA5 = "data/not_processed/dedup.dna50_2_S2.txt",
DNA6 = "data/not_processed/dedup.dna50_3_S3.txt",
DNA7 = "data/not_processed/dedup.dna50_pcrcocktail_S7.txt")
batch <- lapply(names(sampleList),function(mysample){
aux <- read.table(sampleList[[mysample]], col.names=c(column1, column2, ..., ID, library, column4, etc...))
aux %>% mutate(library = mysample, R = Fw_ref + Rv_ref, A = Fw_alt + Rv_alt) %>% distinct(ID, .keep_all=T)
if (grepl("DNA", aux$library)){
aux %>% filter(aux$R>1 & aux$A>1)
} else {
aux %>% filter((aux$R+aux$A)>7 & aux$Fw_ref>=1 & aux$Rv_ref>=1 & aux$Fw_alt>=1 & aux$Rv_alt>=1)
}
aux
})
batch_file <- do.call(rbind, batch)
write.table(batch_file, "data/batch_file.txt", col.names = T, sep = "\t")
The possible values of the library column are DNA1 to DNA7 and RNA1 to RNA11. I also tried "char" %in%, but it gives the same problem:
Error in if (grepl("DNA", aux$library)) { : argument is of length zero
Seems like the if condition is not able to identify the value in library. However, when I tried to apply the if/else condition on the batch_file (not filtered, basically obtained with this code without the if/else part) it worked perfectly.
Many thanks in advance.
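A likely cause, offered as a sketch rather than a tested fix: the result of mutate() is never assigned back to aux, so aux$library may not exist when if() runs, and the results of filter() are discarded as well. The example below assumes dplyr is installed and replaces the read.table() calls with small invented data frames so that it runs on its own. It tests the sample name directly, which is a single string.

```r
library(dplyr)

# Hypothetical in-memory stand-ins for the files read with read.table()
sampleList <- list(
  DNA1 = data.frame(ID = c("a", "b"), Fw_ref = c(1, 0), Rv_ref = c(1, 0),
                    Fw_alt = c(2, 1), Rv_alt = c(1, 0), stringsAsFactors = FALSE),
  RNA1 = data.frame(ID = c("c", "d"), Fw_ref = c(3, 1), Rv_ref = c(2, 0),
                    Fw_alt = c(2, 1), Rv_alt = c(1, 1), stringsAsFactors = FALSE)
)

batch <- lapply(names(sampleList), function(mysample) {
  # In the real code this would start from read.table(...)
  aux <- sampleList[[mysample]] %>%
    mutate(library = mysample, R = Fw_ref + Rv_ref, A = Fw_alt + Rv_alt) %>%
    distinct(ID, .keep_all = TRUE)  # the result is assigned back to aux

  # mysample is one string, so if() receives a length-one condition
  if (grepl("DNA", mysample)) {
    aux %>% filter(R > 1, A > 1)
  } else {
    aux %>% filter(R + A > 7, Fw_ref >= 1, Rv_ref >= 1, Fw_alt >= 1, Rv_alt >= 1)
  }  # the filtered data frame is the last value, so it is what lapply collects
})

batch_file <- bind_rows(batch)
batch_file
```

Inside filter() the columns can be referred to by bare name, so the aux$ prefix is not needed.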
I'm working in R & using a for loop to iterate through my data:
pos = c(1256:1301,6542:6598)
sd_all = NULL
for (i in pos){
nameA = paste("A", i, sep = "_")
nameC = paste("C", i, sep = "_")
resA = assign(nameA, unlist(lapply(files, function(x) x$percentageA[x$position==i])))
resC = assign(nameC, unlist(lapply(files, function(x) x$percentageC[x$position==i])))
sd_A = sd(resA)
sd_C = sd(resC)
sd_all = ?
}
Now I want to generate a vector called 'sd_all' that contains the standard deviations of resA & resC. I cannot just do 'sd_all = c(sd(resA), sd(resC))', because then I only keep the values for the last position in 'pos'. I want to do it for all values in 'pos', of course.
It looks like you'd be best served with sd_all as a list object. That way you can insert each of your 2 values ( sd(resA) and sd(resC) ).
Initialising a list is simple (this would replace the second line of your code):
sd_all <- list()
Then you can insert both the values you want to into a single list element like so (this would replace the last line in your for loop):
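# Index by name: a numeric index such as 1256 would pad the list with empty elements
sd_all[[ as.character( i ) ]] <- c( sd( resA ), sd( resC ) )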
After your loop, you can then insert this list as a column in a data.frame if that's what you'd like to do.
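As an alternative to growing a list in a loop, the two standard deviations per position can be collected with vapply. This is only a sketch: the files list below is an invented stand-in with the column names used in the question, and pos is shortened to two positions.

```r
# Hypothetical stand-in for `files`: a list of data frames
files <- list(
  data.frame(position = c(1256, 1257), percentageA = c(10, 20), percentageC = c(5, 7)),
  data.frame(position = c(1256, 1257), percentageA = c(14, 26), percentageC = c(9, 7)),
  data.frame(position = c(1256, 1257), percentageA = c(12, 23), percentageC = c(7, 7))
)
pos <- c(1256, 1257)

# One column per position, one row per statistic
sd_all <- vapply(pos, function(i) {
  resA <- unlist(lapply(files, function(x) x$percentageA[x$position == i]))
  resC <- unlist(lapply(files, function(x) x$percentageC[x$position == i]))
  c(sd_A = sd(resA), sd_C = sd(resC))
}, FUN.VALUE = numeric(2))
colnames(sd_all) <- pos

sd_all
```

The result is a 2 x length(pos) matrix, so t(sd_all) gives one row per position, which can be turned into a data frame with as.data.frame().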
I am having some trouble getting R to recognize items in my Values list (in RStudio) in a function call (just referring to it as a generic function here). Here's an example...the following works just fine if I type it in directly:
result <- function(cnv.chr1.S1, cnv.chr1.S2, cnv.chr1.S3)
because cnv.chr1.S1, cnv.chr1.S2, and cnv.chr1.S3 are objects (specifically GRanges objects) that I've created previously.
But as I'm looping over different chromosomes and there are really many more than 3 samples (S1, S2, S3), I've tried the following (simplified here)
chrom <- paste("chr", 1:1, sep = "")
sample.names <- paste("S", 1:3, sep = "")
for (thischrom in chrom)
{
for (sample in sample.names)
{
a <- function(list(paste(paste("cnv", thischrom, sep = "."), sample.names, sep = ".")))
}
}
However, it doesn't work because
paste(paste("cnv", thischrom, sep = "."), sample.names, sep = ".")
just creates a character list of items that have the same names as the items in my Values list. How do I get R to access the appropriate objects in my Values list?
Thanks for any thoughts you might have!
Steve
Are you looking for something like this?
library(dplyr)
chrom <- paste("chr", 1:1, sep = "")
sample.names <- paste("S", 1:2, sep = "")
cnv.chr1.S1 = c(1, 2)
cnv.chr1.S2 = c(2, 3)
result =
tibble(chrom = chrom) %>%
merge(tibble(sample.names = sample.names) ) %>%
rowwise %>%
mutate(object =
paste("cnv", chrom, sample.names, sep = ".") %>%
parse(text = .) %>%
eval %>%
list)
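A base R alternative that avoids parse and eval is mget, which takes a character vector of names and returns the matching objects in a list. The sketch below uses plain numeric vectors in place of the GRanges objects, and combine_samples is a hypothetical placeholder for the real function.

```r
# Toy objects standing in for the GRanges objects in the question
cnv.chr1.S1 <- c(1, 2)
cnv.chr1.S2 <- c(2, 3)
cnv.chr1.S3 <- c(3, 4)

chrom <- paste0("chr", 1)
sample.names <- paste0("S", 1:3)

# Hypothetical placeholder for the real function that takes the objects
combine_samples <- function(...) rbind(...)

results <- list()
for (thischrom in chrom) {
  # Build all object names for this chromosome in one vectorised call
  object.names <- paste("cnv", thischrom, sample.names, sep = ".")
  # mget() looks the names up and returns the objects themselves in a list
  objects <- mget(object.names, envir = environment())
  # do.call() passes the list elements as separate arguments
  results[[thischrom]] <- do.call(combine_samples, unname(objects))
}

results[["chr1"]]
```

Because paste is vectorised over sample.names, the inner loop over samples is no longer needed. Storing the objects in a named list from the start would remove the need for mget as well.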
I need help determining how I can use the input to the function below as an input for another R file.
Hotel <- function(hotel) {
require(data.table)
dat <- read.csv("demo.csv", header = TRUE)
dat$Date <- as.Date(paste0(format(strptime(as.character(dat$Date),
"%m/%d/%y"),
"%Y/%m"),"/1"))
library(data.table)
table <- setDT(dat)[, list(Revenue = sum(Revenues),
Hours = sum(Hours),
Index = mean(Index)),
by = list(Hotel, Date)]
answer <- na.omit(table[table$Hotel == hotel, ])
if (nrow(answer) == 0) {
stop("invalid hotel")
}
return(answer)
}
I would input Hotel("Hotel Name")
Here's the other R file using the Hotel name I inputted above.
#Reads the dataframe from the Hotel Function
star <- (Hotel("Hotel Name"))
#Calculates the Revpolu and Index
Revpolu <- star$Revenue / star$Hours
Index <- star$Index
png(filename = "~/Desktop/result.png", width = 480, height= 480)
plot(Index, Revpolu, main = "Hotel Name", col = "green", pch = 20)
testing <- cor.test(Index, Revpolu)
write.table(testing[["p.value"]], file = "output.csv", sep = ";", row.names = FALSE, col.names = FALSE)
dev.off()
I would like this part to be automated, instead of having to copy and paste an input from the first file and then store it as a variable. Or, if it's easier, all of this could become just one function.
Also, instead of having to input one hotel name for the function, is it possible to make the first file read all the hotel names, if they are identified as row names in the .csv file, and have the second file read that input?
Since your example is not reproducible and your code has some bugs (using the column "Rooms" which is not produced by your function), I can't give you a tested answer, but here's how you can structure your code to produce the statistics you want for all hotels without having to copy and paste hotel names:
library(data.table)
# Use fread instead of read.csv, it's faster
dat <- fread("demo.csv", header = TRUE)
dat[, Date := as.Date(paste0(format(strptime(as.character(Date), "%m/%d/%y"), "%Y/%m"),"/1"))]
table <- dat[, list(
Revenue = sum(Revenues),
Hours = sum(Hours),
Index = mean(Index)
), by = list(Hotel, Date)]
# cor.test() already drops incomplete pairs, so na.omit() may not be needed,
# but I kept it here to keep the result similar.
table <- na.omit(table)
# Calculate Revpolu inside the data.table
table[, Revpolu := Revenue / Hours]
# You can compute a p-value for all hotels using a group by
testing <- table[, list(p.value = cor.test(Index, Revpolu)[["p.value"]]), by=Hotel]
write.table(testing, file = "output.csv", sep = ";", row.names = FALSE, col.names = FALSE)
# You can get individual plots for each hotel with a for loop
hotels <- unique(table$Hotel)
for (h in hotels) {
# One file per hotel; a fixed file name would be overwritten on every iteration
png(filename = paste0("~/Desktop/result_", h, ".png"), width = 480, height= 480)
plot(table[Hotel == h, Index], table[Hotel == h, Revpolu], main = h, col = "green", pch = 20)
dev.off()
}