I have a list of file names and an array of integers that I want to use each time I open a file. When I open file 1 I want to use numbers[1]; when I open file 14 I want to use numbers[14].
I tried creating n inside the function I pass to lapply, but since I had no index telling me which file I was reading, I discarded that approach. Then I tried mapply, but it creates twice as many elements as I want.
I want to run my function so that each call uses the nth element of fnames together with the nth element of numbers, and I want to save the results in a list.
My function opens a file and processes its data based on the value of n corresponding to that file (in a while loop). That is why I need to use the same index for fnames[] as for numbers[].
The function returns a data frame; with lapply I intended to collect the data frame for each file, computed with its corresponding number, in a list.
This is how I create the list of names:
x<-list.files(pattern=".txt")
This is the array of numbers:
n<-c(4,4,12,6,3,6,8,32,4,4,9,5,5,6,8,3,6,7,3,6,5,3,5)
I do not know how to call the function with those two parameters to get a list of all the results, as I would with lapply.
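One possible approach, sketched under assumptions: Map (a wrapper around mapply with SIMPLIFY = FALSE) walks two vectors in parallel and always returns a list, so each file name is paired with the number at the same position. The function process_file below is a hypothetical stand-in for the asker's function, and the temporary files exist only to make the sketch self-contained.

```r
# Hypothetical stand-in for the real function: reads a file and uses n.
process_file <- function(path, n) {
  lines <- readLines(path)
  data.frame(file = basename(path), n = n, rows_used = length(head(lines, n)))
}

# Create three small temporary files so the example runs on its own.
tmp_dir <- file.path(tempdir(), "map_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (k in 1:3) {
  writeLines(as.character(1:10), file.path(tmp_dir, sprintf("file%d.txt", k)))
}

fnames  <- list.files(tmp_dir, pattern = "\\.txt$", full.names = TRUE)
numbers <- c(4, 4, 12)

# Map pairs fnames[i] with numbers[i] and returns one data frame per file.
results <- Map(process_file, fnames, numbers)
```

The same result can be obtained with mapply(process_file, fnames, numbers, SIMPLIFY = FALSE); without SIMPLIFY = FALSE, mapply may try to collapse the data frames into a matrix, which could explain an unexpected number of elements.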
I'm parsing a JSON using the RJSONIO package.
The parsed item contains nested lists.
Each item in the list can be extracted using something like this:
dat_raw$`12`[[31]]
which correctly returns the string stored at this location (in this example, the '12' refers to the month and [[31]] to day).
"31-12-2021"
I now want to run a for loop to sequentially extract the date for every month. Something like this:
for (m in 1:12) {
print(dat_raw$m[[31]])
}
This, naturally, returns a NULL because there is no $m[[31]] in the list.
Instead, I'd like to extract the objects stored at $`1`[[31]], $`2`[[31]], ... $`12`[[31]].
There must be a relatively easy solution here but I haven't managed to crack it. I'd value some help. Thanks.
EDIT: I've added a screenshot of the list structure I'm trying to extract. The actual JSON object is quite large for a dput() output. Hope this helps
So, to get the date in this list, I'd use something like dat_raw$data$`1`[[1]]$date$gregorian$date.
What I'm trying to do is run a loop to extract multiple items of the list by cycling through $data$`1`[[1]]$..., $data$`2`[[1]]$... ... $data$`12`[[1]]$... using $data$m[[1]]$... in a for loop where m is the month.
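Instead of dat_raw$`12`[[31]], you can use dat_raw[[12]][[31]], provided "12" is the 12th element of the parsed list. If the position might differ from the name, dat_raw[[as.character(m)]] indexes by name instead. So your for loop would be:
for (m in 1:12) {
  print(dat_raw[[m]][[31]])
}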
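A small sketch of indexing by name inside the loop, assuming the month keys are the strings "1" to "12". The dat_raw object built here is a made-up stand-in for the parsed JSON, with 28 days per month to keep it short.

```r
# Hypothetical stand-in for the parsed JSON: months "1".."12", 28 day strings each.
dat_raw <- lapply(1:12, function(m) as.list(sprintf("%02d-%02d-2021", 1:28, m)))
names(dat_raw) <- as.character(1:12)

# [[ ]] evaluates its argument, so the loop variable can be used as a name.
dates <- character(12)
for (m in 1:12) {
  dates[m] <- dat_raw[[as.character(m)]][[28]]
}
```

The $ operator takes its argument literally, which is why dat_raw$m looks for an element named "m"; [[ ]] evaluates the expression first.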
I am working with a set of .csv files that I have read in, keeping only the variables I need for my study. During this process I built multiple data sets with names of the form xxx_101 (PY_101, vB_101_FG_101, etc.), which are stored in the global environment. Now I want to put every data set whose name ends in _101 into a list. Is there a clever way to build that list other than typing the names in one by one? Once they are in the list, I would like to name each list element after its original data set. Is there an easy way to do that?
I could do it one by one, but I feel there should be a better way. Thanks.
We can use mget with ls, specifying the pattern "_101$" so that _101 must come at the end ($) of the object name. This collects the values of all those objects into a list, and the list elements are named after the objects:
lst1 <- mget(ls(pattern = "_101$"))
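To illustrate, here is a minimal sketch with made-up objects. It uses a separate environment (demo_env, an assumption for this example) rather than the global environment so that it does not depend on what is already in the workspace.

```r
# Hypothetical objects placed in their own environment for a clean demo.
demo_env <- new.env()
assign("PY_101", data.frame(a = 1:3), envir = demo_env)
assign("FG_101", data.frame(b = 4:6), envir = demo_env)
assign("other",  1,                   envir = demo_env)  # does not end in _101

# ls() finds the matching names; mget() fetches the objects as a named list.
lst1 <- mget(ls(demo_env, pattern = "_101$"), envir = demo_env)
names(lst1)
```

Because mget returns a named list, no separate renaming step should be needed.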
attach.files = c(paste("/users/joesmith/nosection_", currentDate,".csv",sep=""),
paste("/users/joesmith/withsection_", currentDate,".csv",sep=""))
Basically, if I wrote it out manually like
c("nosection_051418.csv", "withsection_051418.csv")
it would work fine, but since I'm automating this to run every day I can't do that.
I'm trying to attach files to an automated email, but when I build the vector as above it doesn't work. How can I construct it so that the result is accepted as a character vector?
I thought your example implied the need for "parallel" inputs: the path stem, the first portion of the file name, and the date portion of each full path. Consider this illustration, which uses a two-item vector and a one-item vector (produced by Sys.Date(), replacing your currentDate) to populate the %s positions in the sprintf format string (suggested by @Gregor):
sprintf("/users/joesmith/%s_%s.csv", c("nosection", "withsection"), Sys.Date() )
[1] "/users/joesmith/nosection_2018-05-14.csv" "/users/joesmith/withsection_2018-05-14.csv"
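If the file names need the mmddyy form shown in the question (for example 051418), the date can be formatted before it is passed to sprintf. This is a sketch that assumes that naming convention; a fixed date is used so the result is reproducible, and in daily use run_date would be Sys.Date().

```r
# Fixed date for reproducibility; replace with Sys.Date() for the daily run.
run_date <- as.Date("2018-05-14")

# %m%d%y gives two-digit month, day and year with no separators.
current_date <- format(run_date, "%m%d%y")

# sprintf recycles the single date across both file-name stems.
attach_files <- sprintf("/users/joesmith/%s_%s.csv",
                        c("nosection", "withsection"),
                        current_date)
```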
I have the following vector:
USTickers=c("BAC","C","JPM","HBS","WFC","GS","MS","USB","BK","PNC")
The real vector is much longer; I have cut it short here. It holds the ticker names of stocks.
I use quantmod to download the stock data from Yahoo.
Since I do not want to write a function for every ticker, I want to use a loop.
First I call getSymbols, which is not a problem: an object for the specific stock is downloaded.
However, I then want to make some adjustments to it and save it, and that is where I have a problem (the second line inside the for loop). The name of the object in which the result is saved has to change on each iteration, but I am unable to do that.
for (i in 1:(length(USTickers))) {
getSymbols.yahoo(paste(USTickers[i]),.GlobalEnv,from=StrtDt,to=EndDt)
as.symbol(USTickers[i]=data.frame(time(get(USTickers[1])),get(USTickers[1])[,4],row.names=NULL)
}
In addition:
In every stock object that I download, a column name has the form "AAL.Open" and I want to change it to "AAL". How can I change the column name?
I know it can be done with the colnames function, but I don't know how to automate the operation.
Because the first part, "AAL", changes with each ticker, I just want to get rid of the ".Open" part.
I could simply overwrite it with the ticker name, but I do not know how to do that when the column name keeps changing; I plan to use my USTickers vector as the reference.
It is a better idea to turn off auto-assignment in the getSymbols function and store the results in a list. The elements can be easily accessed later. See below for some ideas.
library(quantmod)
# Not going to loop through all
USTickers <- c("BAC", "C") #,"JPM","HBS","WFC","GS","MS","USB","BK","PNC")
# Initialise empty list
mysymbols <- vector("list", length(USTickers))
# Loop through symbols
for (i in seq_along(USTickers)) {
  # Store in list (auto.assign = FALSE returns the data instead of assigning it)
  mysymbols[[i]] <- getSymbols.yahoo(USTickers[i], auto.assign = FALSE)
  # Isolate column of interest and date
  mysymbols[[i]] <- data.frame(time(mysymbols[[i]]),
                               mysymbols[[i]][, 4],
                               row.names = NULL)
  # Change list element's name to the symbol
  names(mysymbols)[i] <- USTickers[i]
}
Regarding substituting names, this can be done easily with gsub, which can be applied to the colnames. Note that the pattern is a regular expression, so the dot should be escaped to match a literal ".". For example:
gsub("\\.Open$", "", "AAL.Open")
However, if you just want to make that column name the ticker, you can do that directly in the loop as well: colnames(mysymbols[[i]])[2] <- USTickers[i]
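As a small illustration of applying the substitution to column names, the sketch below uses a made-up data frame (prices, with invented values) in place of downloaded data. The pattern "\\.Open$" matches only a literal ".Open" at the end of a name, so other columns are left untouched.

```r
# Hypothetical data standing in for one downloaded stock object.
prices <- data.frame(AAL.Open = c(1, 2), AAL.Close = c(3, 4))

# Strip a trailing ".Open" from every column name that has one.
colnames(prices) <- sub("\\.Open$", "", colnames(prices))
colnames(prices)
```

Since the pattern does not mention the ticker, the same line works unchanged for every element of a list of stocks, for instance inside the loop or via lapply.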
I'd like to loop through a list of files and record detailed info about them (size, no. of rows, means of columns).
I just started with storing the info in a data frame:
df<-data.frame()
all <-list.files(pattern=".csv")
for (i in all){
file<-read.csv(i)
filas<-nrow(file)
cols<-ncol(file)
info<-c(i,filas,cols)
df<-rbind(df,i,filas,cols)
}
but it triggers an error caused by the 'i' variable, which is just a file name. What am I doing wrong?
Thanks in advance, p.
Don't use for loops. Rather, use lapply in combination with do.call to obtain your desired result. Try:
do.call(rbind,lapply(all,function(x) {y<-read.csv(x); c(file=x, filas=nrow(y), cols=ncol(y))}))
Your approach was failing because, for rbind to work, you need two data.frames with the same number of columns. You initially created an empty data.frame (with 0 columns), and this couldn't be row-bound to a vector of length 3 (assuming that you want a row for each file showing the file name, number of rows and number of columns). If you really want to use a for loop, you should do something like:
for (i in seq_along(all)) {
  file <- read.csv(all[i])
  info <- data.frame(file = all[i], filas = nrow(file), cols = ncol(file))
  if (i == 1) df <- info else df <- rbind(df, info)
}
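A variant of the lapply approach, sketched under assumptions: building a one-row data.frame per file (rather than a vector with c(), which coerces every value to character) keeps the counts numeric after rbind. The temporary CSV files and the extra size column (via file.size) are made up for this example.

```r
# Create two small CSV files in a temporary folder so the sketch is self-contained.
tmp_dir <- file.path(tempdir(), "csv_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write.csv(data.frame(a = 1:3, b = 4:6), file.path(tmp_dir, "one.csv"), row.names = FALSE)
write.csv(data.frame(a = 1:5),          file.path(tmp_dir, "two.csv"), row.names = FALSE)

csv_files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# One single-row data frame per file, then stack them all at once.
info_list <- lapply(csv_files, function(path) {
  dat <- read.csv(path)
  data.frame(file  = basename(path),
             size  = file.size(path),  # size in bytes
             filas = nrow(dat),
             cols  = ncol(dat))
})
df <- do.call(rbind, info_list)
```

Column means could be added in the same function, for example with colMeans on the numeric columns, as long as every per-file data frame ends up with the same set of columns.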