I am trying to create a list of all possible IP addresses in the UK, so I can plot them on a map for a data science project I am working on.
I have tried to simplify the code as much as possible, basically I have a created a blank tibble to use in the for loop, but it does not appear to get any data appended to it.
Guessing I have done the append part of the FOR LOOP incorrectly.
#install libraries
library(iptools)
library(dplyr)
library(tidyverse)
#creating a list of ip ranges for processing
ip_ranges <- list("2.96.0.0/13", "2.120.0.0/13", "2.216.0.0/13")
#converting list into dataframe
ip_ranges_to_process <- do.call(rbind.data.frame, ip_ranges)
colnames(ip_ranges_to_process) <- c("ip.ranges")
#creating a blank tibble for use in for loop
UK_ip_addresses_df <- tibble(
var_name_1 = character()
)
new_df <- tibble(
var_name_1 = character()
)
#for loop to generate all possible ip addresses in ranges
for(i in 1:nrow(ip_ranges_to_process)){
ip_numbers <- range_generate(ip_ranges_to_process$ip.ranges[i])
new_df <- ip_numbers %>%
map_df(as_tibble)
UK_ip_addresses_df[i,] <- new_df
}
Any suggestions would be welcome. Sorry I am pretty new to programming.
Related
I'm trying to loop through an API, to get data from specific sitecodes and merge it into one dataframe, and for some reason the following code is only getting the original dataframe (RoyalLondon_List) and the last sensor (CDP0004)
SiteCodes_all <- c('CLDP0002', 'CLDP0003', 'CLDP0004')
for(i in 1:length(SiteCodes_all)) {
allsites <- paste0(Base,Node,SiteCodes_all[i],'/',Pollutant,StartTime,EndTime,Averaging,Key)
temp_raw <- GET(allsites)
temp_list <- fromJSON(rawToChar(temp_raw$content))
df <- rbind(RoyalLondon_List, temp_list)
}
Any help appreaciated!
The above code combines the previous data and not the looped API url
Try use this
df <- RoyalLondon_List
for(i in 1:length(SiteCodes_all)) {
allsites <- paste0(Base,Node,SiteCodes_all[i],'/',Pollutant,StartTime,EndTime,Averaging,Key)
temp_raw <- GET(allsites)
temp_list <- fromJSON(rawToChar(temp_raw$content))
df <- dplyr::bind_rows(df, temp_list)
}
dplyr::bind_rows() is a function in the dplyr package that allows you to combine multiple dataframes by appending the rows of one dataframe to the bottom of another.see here to more info about it.
I have an object that contains list of lab tests and based on the length of the object, I have created a FOR loop that processes scripts. During each loop, R should create a data frame using list in that object. Please see below.
adlb <- data.frame(subjid = c(1:20), aval = c(100:119))
adlb$paramcd <- ifelse(adlb$subjid <= 10, "ALT", "AST")
lab_list <- unique(filter(adlb, !is.na(aval))$paramcd)
for (i in 1:length(lab_list))
{
lab_name <- unlist(lab_list)[[i]]
print(lab_name)`
**???** <- adlb %>%
dplyr::filter(paramcd == lab_name) %>%
drop_na(aval)
}
When I run it, it should first create data frame named ALT followed by AST. What should I replace ??? with?
Only reason why I would prefer it this way is because it helps me to review data in question and debug scripts when needed.
Thank you in advance.
I tried lab_name[[i]] and few other options but it resulted in either error or incorrect data frame name.
I think this might help:
# example dataframes
df1 <- iris
df2 <- mtcars
df3 <- iris
#put them into list
mylist <- list(df1,df2,df3)
#give names to list
names(mylist) <- c("df_name1","df_name2","df_name3")
#put dataframes into global env
list2env(mylist ,.GlobalEnv)
I have a list of dataframes with varying dimensions filled with data and row/col names of Countries. I also have a "master" dataframe outside of this list that is blank with square dimensions of 189x189.
I wish to merge each dataframe inside the list individually on top of the "master" sheet perserving the square matrix dimensions. I have been able to achieve this individually using this code:
rownames(Trade) <- Trade$X
Trade <- Trade[, 2:length(Trade)]
Full[row.names(Trade), colnames(Trade)] <- Trade
With "Full" being my master sheet and "Trade" being an individual df.
I have attempted to create a function to apply this process to a list of dataframes but am unable to properly do this.
Function and code in question:
DataMerge <- function(df) {
rownames(df) <- df$Country
Trade <- Trade[, 2:length(Trade)]
Country[row.names(df), colnames(df)] <- df
}
Applied using :
DataMergeDF <- lapply(TradeMatrixDF, DataMerge)
filenames <- paste0("Merged",names(DataMergeDF), ".csv")
mapply(write.csv, DataMergeDF, filenames)
Country <- read.csv("FullCountry.csv")
However what ends up happening is that the data does not end up merging properly / the dimensions are not preserved.
I asked a question pertaining to this issue a few days ago (CSV generated from not matching to what I have in R) , but I have a suspicion that I am running into this issue due to my use of "lapply". However, I am not 100% sure.
If we return the 'Country' at the end it should work. Also, better to pass the other data as an argument
DataMerge <- function(Country, df) {
rownames(df) <- df$Country
df <- df[, 2:length(df)]
Country[row.names(df), colnames(df)] <- df
Country
}
then, we call the function as
DataMergeDF <- lapply(TradeMatrixDF, DataMerge, Country = Country)
I have gotten instructions to do an analysis in R with the vegan package (concerning DCA's).
The instructions on a single dataframe are pretty straightforward, but I would like to apply the analysis on a set of dataframes.
I know this can be done with a for-loop or lapply or sapply, but I have trouble dealing with the fact that each step of the analysis a new extension is added to the name of the dataframe.
An example below
Say I have a dataframe DF, then it goes as follows:
DF.t1 <- decostand(DF, "total")
DF.t2 <- decostand(DF.t1, "max")
DF.t2.dca <- decorana(DF.t2)
DF.t2.dca.DW <- decorana(DF.t2, iweigh=1)
names(DF.t2.dca)
summary(DF.t2.dca)
DF.t2.dca.taxonscores <- scores(DF.t2.dca, display=c("species"), choices=c(1,2))
DF.t2.dca.taxonscores <- DF.t2.dca$cproj[ ,1:2]
DF.t2.dca.samplescores <- scores(DF.t2.dca, display=c("sites"), choices=1)
What I want to achieve is to run several dataframes through this analysis without writing it all out separately.
Let's say I have a set of dataframes called "DF_1", "DF_2" & "DF_3" which I want to do this analysis on.
I probably need to put the dataframes in a list, and get all the steps in a for-loop or one of the apply methods.
But how do I approach the problem with the extensions added (.ra, .t1, .t2, .t2.dca, .t2.dca.DW etc.) to the dataframe names?
Edit: I need to retain the original dataframes after the analysis, in order to do follow-up analysis on them.
Unless you have a very limited amount of data frames, I would not advise to define ca. 8 new objects for each data frame in the global environment as this can become very messy.
One approach you might consider is creating a nested list where the first level is the data frame and the second level are the modified data frames.
# some example data sets
DF1 <- mtcars
DF2 <- mtcars*2
DF3 <- mtcars*3
all_dfs <- list(DF1 = DF1, DF2 = DF2, DF3 =DF3)
some_stuff <- function(df) {
DF.t1 <- decostand(df, "total")
DF.t2 <- decostand(DF.t1, "max")
DF.t2.dca <- decorana(DF.t2)
DF.t2.dca.DW <- decorana(DF.t2, iweigh=1)
names(DF.t2.dca)
summary(DF.t2.dca)
DF.t2.dca.taxonscores <- scores(DF.t2.dca, display=c("species"), choices=c(1,2))
DF.t2.dca.taxonscores <- DF.t2.dca$cproj[ ,1:2]
DF.t2.dca.samplescores <- scores(DF.t2.dca, display=c("sites"), choices=1)
return(list(DF.t1 = DF.t1, DF.t2 = DF.t2,
DF.t2.dca = DF.t2.dca,
DF.t2.dca.DW = DF.t2.dca.DW,
DF.t2.dca.taxonscores = DF.t2.dca.taxonscores,
DF.t2.dca.taxonscores = DF.t2.dca.taxonscores
))
}
nested_list <- lapply(all_dfs, some_stuff)
# To obtain any of the objects for a specific data.frame you could, for example, run
nested_list$DF1$DF.t2.dca.DW
I'm trying to pull html data tables into a single data frame, and I'm looking for an elegant solution. There are 255 tables, and the urls vary by two variable: Year and Aldermanic District. I know there must be a way to use for loops or something, but I'm stumped.
I have successfully imported the data by reading each table in with a separate line of code, but this results in a line for each table, and again, there are 255 tables.
library(XML)
data <- bind_rows(readHTMLTable("http://assessments.milwaukee.gov/SalesData/2018_RVS_Dist14.htm", skip.rows=1),
readHTMLTable("http://assessments.milwaukee.gov/SalesData/2017_RVS_Dist14.htm", skip.rows=1),
readHTMLTable("http://assessments.milwaukee.gov/SalesData/2016_RVS_Dist14.htm", skip.rows=1),
readHTMLTable("http://assessments.milwaukee.gov/SalesData/2015_RVS_Dist14.htm", skip.rows=1),
Ideally, I could use a for loop or something so I wouldn't have to hand code the readHTMLTable function for each table.
You could try creating a vector containing all the URLs which you want to scrape, and then iterate over those inputs using a for loop:
url1 <- "http://assessments.milwaukee.gov/SalesData/"
url2 <- "_RVS_Dist"
years <- c(2015:2018)
dist <- c(1:15)
urls <- apply(expand.grid(paste0(url1, years), paste0(url2, dist)), 1, paste, collapse="")
data <- NULL
for (url in urls) {
df <- readHTMLTable(url)
data <- rbind(data, df)
}
We can use map_dfr from the purrr package (part of the tidyverse) package to apply the readHTMLTable function across the URL. The key is to identify the part that is different from each URL. In this case 2015:2018 is the only thing changed, so we can construct the URL with paste0. map_dfr would automatically combine all data frame to return one combined data frame. dat is the final output.
library(tidyverse)
library(XML)
dat <- map_dfr(2015:2018,
~readHTMLTable(paste0("http://assessments.milwaukee.gov/SalesData/",
.x,
"_RVS_Dist14.htm"), skip.rows = 1)[[1]])
Update
Here is the way to expand the combination between year and numbers, and then download the data with map2_dfr.
url <- expand.grid(Year = 2002:2018, Number = 1:15)
dat <- map2_dfr(url$Year, url$Number,
~readHTMLTable(paste0("http://assessments.milwaukee.gov/SalesData/",
.x,
"_RVS_Dist",
.y,
".htm"), skip.rows = 1)[[1]])