I am using the acs.R package and I am having trouble collecting data from the DP tables and S tables. The tables beginning with B are fine though. Here is an example of my code and the error I receive:
national = geo.make(us="*")
Race_US <- acs.fetch(endyear = 2015, span = 1, geography = national,
table.number = "DP04", col.names = "pretty")
Warning message:
In (function (endyear, span = 5, dataset = "acs", keyword, table.name, :
Sorry, no tables/keyword meets your search.
Suggestions:
try with 'case.sensitive=F',
remove search terms,
change 'keyword' to 'table.name' in search (or vice-versa)
For some reason it is unable to find the table. I have tried acs.lookup with various keywords that should work and still nothing.
Thanks for using the acs.R package.
The problem here is with the "DP" tables: although they are available through the census api, they are not fetched via the acs.R package, since they are in a different format -- not really "raw data" as much as pre-formatted tables made from data found in other places. That said, you should be able to find the underlying data in other tables that are available with acs.fetch.
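If it helps, here is a minimal, hedged sketch of that approach (assuming the acs package is loaded with an API key installed, and reusing the national geography from above): use acs.lookup() to search the detailed "B" tables by keyword, then fetch the matching table with acs.fetch(). B25024 ("Units in Structure") is one example of a detailed table underlying part of DP04.
library(acs)
# Search the detailed tables for variables matching a keyword
housing_vars <- acs.lookup(endyear = 2015, span = 1,
                           keyword = "units in structure", case.sensitive = FALSE)
housing_vars  # inspect the matching variables and their table numbers
# Fetch the underlying detailed table instead of the DP profile table
units_us <- acs.fetch(endyear = 2015, span = 1, geography = national,
                      table.number = "B25024", col.names = "pretty")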
I am working within Databricks, trying to use the sparklyr function spark_write_jdbc to write a dataframe to a SQL Server table. The server name/driver etc are correct and work, as I successfully used sparklyr::spark_read_jdbc() earlier in the code.
Per the documentation (here), spark_write_jdbc should accept a Spark Dataframe.
I used SparkR::createDataFrame() to convert the dataframe I was working with to a Spark dataframe.
Here is the relevant code:
events_long_test <- SparkR::createDataFrame(events_long, schema = NULL, samplingRatio = 1, numPartitions = NULL)
sparklyr::spark_write_jdbc(events_long_test,
name ="who_status_long_test" ,
options = list(url = url,
user = user,
driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
password = pw,
dbtable = "who_status_long_test"))
However, when I run this, it gives me the following error:
Error in UseMethod("spark_write_jdbc") : Error in UseMethod("spark_write_jdbc") :
no applicable method for 'spark_write_jdbc' applied to an object of class "SparkDataFrame"
I have searched around and cannot find other people asking about this error. Why would it say this function cannot work with a Spark Dataframe, when the documentation says it does?
Any help is appreciated.
What is in events_long? The syntax is correct; make sure your connection properties in options are correct, and make sure that events_long_test is a Spark dataframe, not a table.
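One hedged possibility (an assumption, not something confirmed in this thread): sparklyr's spark_write_jdbc() dispatches on sparklyr's own tbl_spark objects, whereas SparkR::createDataFrame() returns a SparkR "SparkDataFrame", which sparklyr's generics do not recognize. A minimal sketch, assuming a sparklyr connection sc already exists, would copy the R data frame into Spark with sparklyr instead and write that:
library(sparklyr)
# Copy the plain R data frame into Spark via sparklyr, producing a tbl_spark
events_long_tbl <- sdf_copy_to(sc, events_long, name = "events_long_tmp", overwrite = TRUE)
spark_write_jdbc(events_long_tbl,
                 name = "who_status_long_test",
                 options = list(url = url,
                                user = user,
                                driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
                                password = pw,
                                dbtable = "who_status_long_test"))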
I created an empty table from the BigQuery GUI with the schema for table_name. Later I'm trying to append data to the existing empty table from R using the bigrquery package.
I have tried the code below:
upload_job <- insert_upload_job(project = "project_id",
dataset = "dataset_id",
table = "table_name",
values = values_table,
write_disposition = "WRITE_APPEND")
wait_for(upload_job)
But it throws the following error:
Provided Schema does not match Table. Field alpha has changed mode from REQUIRED to NULLABLE [invalid]
My table doesn't have any NULL or NA values in the mentioned column, and the data types in the schema match the data types of values_table exactly.
I tried uploading directly from R without creating the schema first. When I do that, it automatically converts the mode to NULLABLE, which is not what I'm looking for.
I also tried changing write_disposition = "WRITE_TRUNCATE", which also converts the mode to NULLABLE.
I also looked at this and this, which didn't really help me.
Can someone explain what is happening behind the scenes, and what is the best way to upload data without recreating the schema?
Note: there was an obvious typo earlier; wirte_disposition has been edited to write_disposition.
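As a hedged diagnostic sketch (assuming a bigrquery version where the bq_* API is available), the modes BigQuery has recorded for the existing table can be inspected from R and compared with what the upload sends; the error suggests the upload is describing the alpha field as NULLABLE even though the table declares it REQUIRED.
library(bigrquery)
tbl <- bq_table("project_id", "dataset_id", "table_name")
bq_table_fields(tbl)  # shows each field's name, type, and mode (REQUIRED vs NULLABLE)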
So I am currently working on connecting to an Access database. I am able to connect to the Access DB, which is located on my local system and is actually connected to a SharePoint list. I would love to automate the process of handling this SharePoint list with an R and Access combo! What I want to do is actually pretty basic: introduce new data via a .csv, which is processed for the relevant content, compared to the current Access DB, and finally uploaded from R to Access.
I've learned that you need to pair the bit version of your Windows OS, Office version, and R version. So I am x64 on all of the above. This allowed me to connect to the Access DB. You also need the 'Microsoft Access Database Engine 2016 Redistributable' which is essentially the driver for the connection.
So what I have so far is:
library(odbc)
library(DBI)
file_path <- "C:/user/Documents/R Projects/...pathtofile.../filename.accdb"
accdb_con <- dbConnect(drv = odbc(), .connection_string = paste0("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=",file_path,";"))
access.db <- dbReadTable(accdb_con, "sNPS Deep Dives")
That now connects!
I then read in a .csv of new information
new.df <- read.csv("C:/user/Documents/R projects/...pathtofile.csv", header=T, stringsAsFactors=FALSE, na.strings=c("","NA"))
an example of the data set might just look something like this:
date <- c("15/10/2018","15/10/2018", "16/10/2018", "12/11/2018", "07/09/2018")
score <- c("6", "10", "7", "10", "9")
group <- c("a","b", "b", "a", "b")
CaseID <- c("301", "302", "303", "304", "305")
new.df <- data.frame(date,score,group,CaseID)
new.df$date <- as.character(new.df$date)
new.df$score <- as.numeric(new.df$score)
new.df$group <- as.character(new.df$group)
new.df$CaseID <- as.numeric(new.df$CaseID)
Notably there are more columns in the Access DB that people will fill in by hand with further information.
and I process it so it's ready to go into the Access DB (the processing itself probably isn't that interesting...).
Then I compare the new data against the Access DB like so:
library(dplyr)
new <- anti_join(new.df, access.db, by= "Case.ID")
Now I've tried:
dbWriteTable(access.db.copy, new, append = TRUE)
dbAppendTable(access.db.copy, new)
I don't seem to be able to get either of these to go anywhere. I am getting an error:
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘dbWriteTable’ for signature ‘"ACCESS", "data.frame", "missing"’
I've seen plenty of posts in which people are having trouble connecting to an Access DB but I haven't seen anything about writing new data into that database.
I know this isn't quite a reproducible example, but it seems like a difficult problem to recreate since it's a connection issue between different tools. I would be happy to provide example data sets that might make this easier.
I would appreciate any direction you all can provide.
Thanks!
Edit:
It appears that Bing Sun was right: I was missing an argument. We need something more like:
dbWriteTable(access.db.copy, "Name of table",new, append = TRUE)
Which produces the error:
Error in result_insert_dataframe(rs@ptr, values) :
nanodbc/nanodbc.cpp:1944: HY104: [Microsoft][ODBC Microsoft Access Driver]Invalid precision value
I wonder if this may be an error from Access about a file type?
Now if I use dbAppendTable I don't get an error, but I get 0 as output:
dbAppendTable(access.db.copy, "Name of table", new, append= TRUE)
With output:
[1] 0
But I don't see any of the new values when I check the Access file.
I know it's years later, but hopefully this will help someone else, since you're right, CrayCrayTown: there aren't very many posts covering this issue.
I've run into this problem repeatedly when dealing with R and MS Access. The solution I've come up with is pretty "hacky", but it accomplishes what needs to be done... just not very elegantly.
The way I do this is with a combo of RODBC and DBI packages.
First, I open a connection to the DB with RODBC, and use that connection to write my data to the DB as an intermediary table:
chan <- RODBC::odbcDriverConnect(connection = "Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=/path/to/database.accdb")
RODBC::sqlSave(channel = chan,
dat = df,
tablename = "tbl_intermediary",
rownames = FALSE,
append = FALSE)
RODBC::odbcClose(chan)
rm(chan)
Make sure to close the RODBC connection; I also destroy it for good measure, because why not? I use RODBC for the intermediary table because it supports batch insert statements. I know that the same thing can, in theory, be done with DBI via DBI::dbAppendTable() (but we wouldn't be on this post if that worked how we had hoped). I tried this in a previous SO question here, but it didn't solve my problem. I also don't know how big my intermediary tables could get in the future. Hopefully by the time they get too big we'll be in a different DBMS.
Next, I reopen the connection, this time with DBI, and send a statement to the DB to write those data from the intermediary table to the final resting place for those data, and then drop the intermediary table.
con <- DBI::dbConnect(odbc::odbc(), .connection_string = "Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=/path/to/database.accdb")
DBI::dbSendStatement(
conn = con,
statement = 'UPDATE
tbl_intermediary INNER JOIN final_tbl ON tbl_intermediary.SampleID = final_tbl.sampleNumber
SET
final_tbl.field1 = [tbl_intermediary].[field1],
final_tbl.notes = IIf(Nz([tbl_intermediary].[Notes],"")="",[final_tbl].[notes],[final_tbl].[notes] & "; Newest Notes: " & [tbl_intermediary].[Notes]);'
)
DBI::dbSendStatement(
  conn = con,
  statement = 'DROP TABLE tbl_intermediary;'
)
DBI::dbDisconnect(con)
rm(con)
The main reason why I chose this method is because some of the SQL I use with Access also has some VBA in it. When I send the SQL-VBA hybrid string with RODBC, I get assorted errors in the IIF() and Nz() functions (see example above). From the RODBC CRAN docs the query argument for the sqlQuery() function is strictly assumed to be a valid SQL statement. So, RODBC has no clue how to interpret the IIf() and Nz() MS Access functions. I think this also has to do with how the ODBC driver handles communication as well (please, someone correct me if I'm wrong about this).
As I understand it, however, DBI::dbSendStatement() lets the database engine you're working with interpret how to use the statement argument you provide. In the situation above, the VBA is executed exactly how I would expect if it were run in Access directly. As per the DBI docs, for interactive use you'll generally want to use dbGetQuery or dbExecute.
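For example, the DROP statement above could equally be issued with dbExecute() (shown here only as an alternative, before the connection is closed); it runs immediately and returns the number of affected rows, leaving no pending result to clean up:
# dbExecute() executes the statement and returns the number of rows affected
DBI::dbExecute(con, "DROP TABLE tbl_intermediary;")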
I am struggling to parse contents from HTML using htmlTreeParse and XPath.
Below is the web link from where I need to extract information of "most valuable brands" and create a data frame out of it.
http://www.forbes.com/powerful-brands/list/#tab:rank
As a first step towards building the table, I am trying to extract the list of brands (Apple, Google, Microsoft etc. ). I am trying through below code:
library(RCurl)  # getURL() comes from RCurl
library(XML)
htmlContent <- getURL("http://www.forbes.com/powerful-brands/list/#tab:rank", ssl.verifypeer=FALSE)
htmlParsed <- htmlTreeParse(htmlContent, useInternal = TRUE)
output <- xpathSApply(htmlParsed, "/html/body/div/div/div/table[@id='the_list']/tbody/tr/td[@class='name']", xmlValue)
But it's returning NULL, and I am not able to find my mistake. "/html/body/div/div/div/table[@id='the_list']/thead/tr/th" works correctly, returning ("", "Rank", "Brand", etc.).
This means the path up to the table is correct, but I am not able to understand what's wrong thereafter.
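A hedged diagnostic, not a confirmed fix: it is worth checking whether the rows are present at all in the static HTML that getURL() returns, since many "list" pages fill the table body in client-side with JavaScript, and it may also help to relax the path rather than requiring every intermediate element:
library(RCurl)
library(XML)
htmlContent <- getURL("http://www.forbes.com/powerful-brands/list/#tab:rank", ssl.verifypeer = FALSE)
grepl("Apple", htmlContent)  # FALSE would suggest the rows are rendered by JavaScript
htmlParsed <- htmlTreeParse(htmlContent, useInternalNodes = TRUE)
# Anchor on the table id and skip the optional <tbody> level
xpathSApply(htmlParsed, "//table[@id='the_list']//td[@class='name']", xmlValue)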
I'm working with some large transactions data. I've been using read.transactions and apriori (parts of the arules package) to mine for frequent item pairings.
My problem is this: when rules are generated (using "inspect()") I can easily view them in the R console. Right now I'm manually copying the results into a text file, then saving and opening in excel. I'd like to just save the generated rules using write.csv, or something similar, but when I try, I receive an error that the data cannot be coerced into data.frame.
Does anyone have experience doing this successfully in R?
I know I'm answering my own question, but I found out that the solution is to use as() to convert the rules into a data frame. [I'm new to R, so I missed this my first time searching for a solution.] From there, it can easily be manipulated in any way you'd like (subsetting, sorting, exporting, etc.).
> mba = read.transactions(file="Book2.csv",rm.duplicates=FALSE, format="single", sep=",",cols=c(1,2));
> rules_1 <- apriori(mba,parameter = list(sup = 0.001, conf = 0.01, target="rules"));
> as(rules_1, "data.frame");
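For example (a small sketch reusing rules_1 from above), the converted data frame can then be written straight to a CSV file:
rules_df <- as(rules_1, "data.frame")
write.csv(rules_df, file = "rules_1.csv", row.names = FALSE)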
Another way to achieve that would be:
write(rules_1,
file = "association_rules.csv",
sep = ",",
quote = TRUE,
row.names = FALSE)
I found this post when struggling with writing my rules to Excel. My solution is:
library(writexl)
write_xlsx(as(rules_1, "data.frame"), "rules_1.xlsx")
It is much easier to read and report on when it is in Excel.