Looping through names in a csv file


I am trying to loop over all the names in a CSV file and retrieve Twitter data for each one:
require(twitteR)
require(data.table)
consumer_key <- 'KEY'
consumer_secret <- 'CON_SECRET'
access_token <- 'TOKEN'
access_secret <- 'ACC_SECRET'
setup_twitter_oauth(consumer_key,consumer_secret,access_token,access_secret)
options(httr_oauth_cache=T)
accounts <- read.csv(file="FILE.CSV", header=FALSE, sep="")
Sample data in the CSV file (one name per row, in the first column):
timberghmans
alyssabereznak
JoshuaLenon
names <- lookupUsers(c(accounts))
for(name in names){
a <- getUser(name)
print(a)
b <- a$getFollowers()
print(b)
b_df <- rbindlist(lapply(b, as.data.frame))
print(b_df)
c <- subset(b_df, location!="")
d <- c$location
print(d)
}
However, it does not work. Each row of the file contains one Twitter screen name. When I type the names in directly, like this:
names <- lookupUsers(c("USER1","USER2","USER3"))
it works perfectly. I also tried to loop through the accounts directly, but to no avail. Does anyone have a general example, or could anyone give me a hint, please?
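One likely cause, offered as a sketch rather than a confirmed diagnosis: read.csv returns a data frame, so c(accounts) is a list holding one column, not a character vector like c("USER1","USER2","USER3"). The example below writes the three sample names to a temporary file (a stand-in for FILE.CSV, which is an assumption here) and extracts the first column as a plain character vector. It makes no Twitter calls.

```r
# Write the sample names to a temporary file standing in for FILE.CSV
csv_path <- tempfile(fileext = ".csv")
writeLines(c("timberghmans", "alyssabereznak", "JoshuaLenon"), csv_path)

# Read the file; with header = FALSE the single column is named V1
accounts <- read.csv(csv_path, header = FALSE, stringsAsFactors = FALSE)

# c() on a data frame gives a list, not a character vector
print(is.character(c(accounts)))

# Pull out the first column as a character vector of screen names
screen_names <- accounts$V1
print(is.character(screen_names))

# Looping over the vector now yields one screen name per iteration
for (name in screen_names) {
  print(name)
}
```

If this matches the situation in the question, screen_names could be passed to lookupUsers() in place of c(accounts).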

Related

Extract rows from csv files in R

I want to extract the latitude/longitude values and the file name from each CSV file.
I have done the following:
#libraries-----
library(readr)
library("dplyr")
library("tidyverse")
# set wd-----EXAMPLE
setwd("F:/mydata/myfiles/allcsv")
# have R read files as list -----
list <- list.files("F:/mydata/myfiles/allcsv", pattern=NULL, all.files=FALSE,
full.names=FALSE)
list
#lapply function
row.names<- c("Date=0", "Time=3", "Type=2", "Model=1", "Coordinates=nextrow", "Latitude = 38.3356", "Longitude = 51.3323")
AllData <- lapply(list, read.table,
skip=5, header=FALSE, sep=";", row.names=row.names, col.names=NULL)
PulledRows <-
lapply(AllData, function(DF)
DF[fileone$Latitude==38.3356, fileone$Longitude==51.3323]
)
# maybe i need to specify a for loop?
Thank you.

This should work for you. If the .csv files are not in your working directory, change the path used to read them, and set the location where the final results should be saved.
library(readr) # read_csv, write_csv
library(stringr) # str_remove
results <- data.frame(Latitude=NA,Longitude=NA,FileName=NA) # placeholder row that fixes the column names
for(i in seq_along(list)){ # loop through each file name in list (created above)
dat <- read_csv(list[i],col_names = FALSE) # read in the ith dataset
df <- data.frame(dat[6,1],dat[7,1],list[i]) # create new dataframe with values from dat
df[,1] <- as.numeric(str_remove(df[,1],'Latitude=')) # remove text and make numeric
df[,2] <- as.numeric(str_remove(df[,2],'Longitude='))
names(df) <- names(results) # having the same column names allows next line
results <- rbind(results,df) # 'stacks' the results dataframe and df dataframe
}
results <- na.omit(results) # remove rows with missing values (including the placeholder first row)
write_csv(results,'desired/path')
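As a variation on the loop above, here is a minimal base R sketch that builds one row per file and stacks the rows once at the end, rather than growing results inside the loop. The helper extract_coords and the in-memory fake_files list are hypothetical stand-ins for the two lines read from each real file; the regular expression assumes each line has the form "Name = value", with or without spaces around the equals sign.

```r
# Hypothetical helper: turn two "Name = value" lines into a one-row data frame
extract_coords <- function(lines, file_name) {
  # Drop everything up to and including "=" (plus any spaces), then convert
  value_of <- function(x) as.numeric(sub("^[^=]*=[[:space:]]*", "", x))
  data.frame(
    Latitude = value_of(lines[1]),
    Longitude = value_of(lines[2]),
    FileName = file_name,
    stringsAsFactors = FALSE
  )
}

# Stand-in for the latitude/longitude lines read from each file (made-up data)
fake_files <- list(
  "a.csv" = c("Latitude = 38.3356", "Longitude = 51.3323"),
  "b.csv" = c("Latitude=40.5", "Longitude=-3.25")
)

# Build one row per file, then stack all rows in a single rbind call
rows <- Map(extract_coords, fake_files, names(fake_files))
results <- do.call(rbind, rows)
rownames(results) <- NULL
print(results)
```

Stacking once avoids the placeholder NA row and the na.omit call, at the cost of holding all the per-file rows in a list first.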

R Function with for Loops creates several dataframes, how to have each one have a different name

I have a for loop that loops through a list of urls,
url_list <- c('http://www.irs.gov/pub/irs-soi/04in21id.xls',
'http://www.irs.gov/pub/irs-soi/05in21id.xls',
'http://www.irs.gov/pub/irs-soi/06in21id.xls',
'http://www.irs.gov/pub/irs-soi/07in21id.xls',
'http://www.irs.gov/pub/irs-soi/08in21id.xls',
'http://www.irs.gov/pub/irs-soi/09in21id.xls',
'http://www.irs.gov/pub/irs-soi/10in21id.xls',
'http://www.irs.gov/pub/irs-soi/11in21id.xls',
'http://www.irs.gov/pub/irs-soi/12in21id.xls',
'http://www.irs.gov/pub/irs-soi/13in21id.xls',
'http://www.irs.gov/pub/irs-soi/14in21id.xls',
'http://www.irs.gov/pub/irs-soi/15in21id.xls')
downloads an Excel file from each one, assigns it to a data frame, and performs a set of data-cleaning operations on it.
library(gdata)
for (url in url_list){
test <- read.xls(url)
cols <- c(1,4:5,97:98)
test <- test[-(1:8),cols]
test <- test[1:22,]
test <- test[-4,]
test$Income <-test$Table.2.1...Returns.with.Itemized.Deductions..Sources.of.Income..Adjustments..Itemized.Deductions.by.Type..Exemptions..and.Tax..Items..by.Size.of.Adjusted.Gross.Income..Tax.Year.2015..Filing.Year.2016.
test$Total_returns <- test$X.2
test$return_dollars <- test$X.3
test$charitable_deductions <- test$X.95
test$charitable_deduction_dollars <- test$X.96
test[1:5] <- NULL
}
My problem is that the loop simply overwrites the same data frame on every iteration. How can I assign the result of each iteration to a data frame with a different name?

Use assign. This question is a duplicate of this post: Change variable name in for loop using R
For your particular case, you can do something like the following:
for (i in 1:length(url_list)){
url = url_list[i]
test <- read.xls(url)
cols <- c(1,4:5,97:98)
test <- test[-(1:8),cols]
test <- test[1:22,]
test <- test[-4,]
test$Income <-test$Table.2.1...Returns.with.Itemized.Deductions..Sources.of.Income..Adjustments..Itemized.Deductions.by.Type..Exemptions..and.Tax..Items..by.Size.of.Adjusted.Gross.Income..Tax.Year.2015..Filing.Year.2016.
test$Total_returns <- test$X.2
test$return_dollars <- test$X.3
test$charitable_deductions <- test$X.95
test$charitable_deduction_dollars <- test$X.96
test[1:5] <- NULL
assign(paste("test", i, sep=""), test)
}
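To see what assign does in isolation, here is a small sketch with toy data frames in place of the downloaded spreadsheets. Writing into a dedicated environment (results_env, a name chosen for this example) is an assumption made to keep the workspace tidy; the answer above assigns into the current environment instead.

```r
# Environment that will hold the generated variables (illustrative choice)
results_env <- new.env()

for (i in 1:3) {
  # Toy data frame standing in for the cleaned spreadsheet
  test <- data.frame(iteration = i)
  # Creates test1, test2, test3 inside results_env
  assign(paste0("test", i), test, envir = results_env)
}

# List the variables that were created
print(sort(ls(results_env)))

# Fetch one of them by its constructed name
test2 <- get("test2", envir = results_env)
print(test2)

# Fetch all of them at once as a named list
all_tests <- mget(paste0("test", 1:3), envir = results_env)
```

Because the names are built from strings, every later access also has to go through get or mget, which is one reason the list-based answers below are often easier to work with.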

You could write to a list:
result_list <- list()
for (i_url in 1:length(url_list)){
url <- url_list[i_url]
...
result_list[[i_url]] <- test
}
You can also name the list elements:
names(result_list) <- c("df1","df2","df3",...)
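A runnable sketch of the same idea, with toy data frames standing in for the cleaned spreadsheets (n_files and the columns id and value are made up for illustration):

```r
n_files <- 3

# Pre-allocate the list so each iteration fills one slot
result_list <- vector("list", n_files)

for (i in seq_len(n_files)) {
  # Toy data frame in place of the downloaded and cleaned data
  test <- data.frame(id = i, value = i * 10)
  result_list[[i]] <- test
}

# Generate the names instead of typing them out
names(result_list) <- paste0("df", seq_along(result_list))

# Access one data frame by name
print(result_list$df2)

# Stack all of them into a single data frame when the columns match
combined <- do.call(rbind, result_list)
print(combined)
```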

Here's another approach that uses lapply instead of a for loop. It returns all the resulting data frames as separate list items, which can then be renamed if needed.
url_list <- c('http://www.irs.gov/pub/irs-soi/04in21id.xls',
...
'http://www.irs.gov/pub/irs-soi/15in21id.xls')
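readURLFunc <- function(z){
tmp <- tempfile(fileext = ".xls") # readxl reads local files only, so download first
download.file(z, tmp, mode = "wb")
test <- readxl::read_xls(tmp)
...
test[1:5] <- NULL
return(test)}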
data_list <- lapply(url_list, readURLFunc)
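For the renaming step, one option is to derive the names from the URLs themselves. The sketch below assumes that the first two characters of each file name are the last two digits of the tax year, which the file names suggest but the source does not state. The placeholder function only records the URL, so nothing is downloaded.

```r
# Two of the URLs from the question, enough to show the naming step
url_list <- c('http://www.irs.gov/pub/irs-soi/04in21id.xls',
              'http://www.irs.gov/pub/irs-soi/15in21id.xls')

# basename() keeps "04in21id.xls"; substr() keeps the leading "04"
year_labels <- paste0("tax_year_20", substr(basename(url_list), 1, 2))

# Placeholder for readURLFunc: records the URL instead of downloading it
data_list <- lapply(url_list, function(z) data.frame(source = z))

# Attach the derived names so each element can be accessed by year
names(data_list) <- year_labels
print(names(data_list))
```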

Iterate over multiple columns in a csv file and convert them into a list of lists in R

Suppose I have a CSV file with two columns, username and tweet. For each user, how can I collect all the tweets they made into a list? For example, the result should look something like list(c(user1,tweet1,t2,t3),c(u2,t7,t8,t9),....)
usernameslist <- alldata$V5
usernameslist <- usernameslist[-1]
tweetslist <- alldata$V2
tweetslist <- tweetslist[-1]
user_and_his_tweets <- split(tweetslist,usernameslist, drop = FALSE )
mylist <- list()
for(i in 1:length(user_and_his_tweets)){
mylist <- list(mylist,c(names(user_and_his_tweets[i]),as.character(user_and_his_tweets[[i]])))
}
This is what I tried, but "mylist" is not in the format I wanted.
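In the loop above, list(mylist, c(...)) wraps the previous list inside a new one on every pass, so the result is nested ever deeper rather than extended. A minimal sketch of one way to get the flat format, using made-up data with columns named username and tweet (the question's actual columns are V5 and V2):

```r
# Toy data standing in for alldata
alldata <- data.frame(
  username = c("u1", "u2", "u1", "u2", "u1"),
  tweet = c("t1", "t7", "t2", "t8", "t3"),
  stringsAsFactors = FALSE
)

# Group the tweets by user: a named list of character vectors
tweets_by_user <- split(alldata$tweet, alldata$username)

# Prepend each user name to that user's tweets; unname() drops the list names
user_and_tweets <- unname(Map(c, names(tweets_by_user), tweets_by_user))
print(user_and_tweets)
```

Inside a for loop, the equivalent fix would be to assign into a slot with mylist[[i]] <- c(...) instead of rebuilding the list.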

Deleting rows in a sequence for MULTIPLE lists in R

I know how to delete rows in a sequence for a SINGLE list:
data <- data.table('A' = c(1,2,3,4), 'B' = c(900,6,'NA',2))
row.remove <- data[!(data$A = seq(from=1,to=4,by=2) )]
However, I would like to know how to do so with MULTIPLE lists.
Code I've tried:
file.number <- c(1:5)
data <- setNames(lapply(paste(file.number,".csv"), read.csv, paste(file.number)) # this line imports the lists from csv files - works
data.2 <- lapply(data, data.table) # seems to work
row.remove <- lapply(data.2, function(x) x[!(data.2$A = seq(from=1,to=4,by=2)) # no error message, but deletes all the rows
I feel like I'm missing something obvious, any help will be greatly appreciated.

Solution:
for (i in 1:5){
data <- as.data.table(read.csv(paste0(i, ".csv"))) # paste0 avoids the space that paste() would insert before ".csv"
row.remove <- data[!(A %in% seq(from=1,to=4,by=2))] # keep only the rows whose A is not 1 or 3
}
Instead of processing all the lists at once, this processes them one at a time. Note that row.remove is overwritten on each pass, so use or save it inside the loop. It's not a full solution, more of a workaround.
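For the lapply version in the question, two details stand out: the anonymous function refers to data.2$A (the whole list) rather than x$A (the current element), and it uses = where a membership test such as %in% seems intended. The sketch below shows the pattern on toy data. It uses base data frames instead of data.table so that it runs without extra packages, which is a substitution made for this example; the same logical index should also work on a data.table.

```r
# Toy stand-ins for the imported files, named like the question's file numbers
data_list <- list(
  "1" = data.frame(A = 1:4, B = c(900, 6, NA, 2)),
  "2" = data.frame(A = 1:4, B = c(5, 7, 9, 11))
)

# Values of A whose rows should be removed: 1 and 3
rows_to_drop <- seq(from = 1, to = 4, by = 2)

# Apply the same filter to every element; x is the current data frame
row_removed <- lapply(data_list, function(x) {
  x[!(x$A %in% rows_to_drop), , drop = FALSE]
})

print(row_removed)
```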

I'm trying to skip over errors and warnings in this for loop in r but its not working?

I am trying to read CSV files whose names are dates inside a for loop, and then print a few columns from each file that actually exists. I need to skip the dates for which I have no data and the dates that don't exist. When I run my code there is no output at all. Why doesn't my code work?
options(width=10000)
options(warn=2)
for(a in 3:5){
for(b in 0:1){
for(c in 0:9){
for(d in 0:3){
for(e in 0:9){
mydata=try(read.csv(paste("201",a,b,c,d,e,".csv",sep="")), silent=TRUE)
if(class(mydata)=="try-error"){next}
else{
mydata$Data <- do.call(paste, c(mydata[c("LAST_UPDATE_DT","px_last")], sep=""))
print(t(mydata[,c('X','Data')]))
}
}}}}}

That's a really terrible way to read in all your files. Try this:
f <- list.files(pattern="\\.csv$") # pattern is a regular expression, not a shell glob
mydata <- sapply(f, read.csv, simplify=FALSE)
This will return a list mydata of data frames, each of which is the contents of the corresponding file.
Or, if there are other csv files that you don't want to read in, you can restrict the specification:
f <- list.files(pattern="201\\d{5}\\.csv")
And to combine everything into one big data frame:
do.call(rbind, mydata)
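If the date-by-date structure needs to be kept, a sketch of an alternative to the five nested loops is to generate only real calendar dates and read just the files that exist. The temporary directory, the single sample file, and the 2013 to 2015 date range are assumptions made so that the example is self-contained.

```r
# Temporary directory with one sample file, standing in for the real data folder
data_dir <- tempfile("csv_demo")
dir.create(data_dir)
write.csv(data.frame(LAST_UPDATE_DT = "2013-01-02", px_last = 101.5),
          file.path(data_dir, "20130102.csv"))

# Only valid calendar dates are generated, so no impossible dates are tried
dates <- seq(as.Date("2013-01-01"), as.Date("2015-12-31"), by = "day")
candidates <- paste0(format(dates, "%Y%m%d"), ".csv")

# Keep only the files that are actually present
existing <- candidates[file.exists(file.path(data_dir, candidates))]

# Read each file; a file that fails to parse yields NULL instead of stopping
mydata <- lapply(existing, function(f) {
  tryCatch(read.csv(file.path(data_dir, f)), error = function(e) NULL)
})
names(mydata) <- existing

# Drop any files that could not be read
mydata <- Filter(Negate(is.null), mydata)
print(names(mydata))
```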
