R: row to column error while writing to DB

I'm using the statement below to convert row names to a column:
library(tidyverse)
names(res) <- names(dt)[]
final <- imap(res, ~ .x %>%
    as.data.frame %>%
    select(!! .y := `Point Forecast`) %>%
    rownames_to_column("Month_year")) %>%
  reduce(inner_join, by = "Month_year")
When I try to write the output to a database with
dbWriteTable(mycon, value = final , Database= 'mydb' ,name = "Rpredict", append = TRUE )
I receive the following error:
Error in result_insert_dataframe(rs@ptr, values) :
  nanodbc/nanodbc.cpp:1587: 42S22: [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid column name 'Month_year'
How do I fix this?
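One likely reading of the error: with append = TRUE, the rows go into an existing Rpredict table, and that table has no Month_year column. The sketch below shows one way to check for such a mismatch before writing. It assumes the DBI and RSQLite packages are installed; the in-memory SQLite database, the columns a and b, and the sample values are all hypothetical stand-ins for the SQL Server setup in the question.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for the SQL Server connection
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical existing target table that has no Month_year column
dbWriteTable(con, "Rpredict", data.frame(a = 1.5, b = 2.5))

# Hypothetical data frame shaped like `final` after rownames_to_column()
final <- data.frame(Month_year = "Jan 2020", a = 3.5, b = 4.5)

# Columns present in the data frame but missing from the target table
missing_cols <- setdiff(names(final), dbListFields(con, "Rpredict"))
print(missing_cols)

dbDisconnect(con)
```

If the check reports missing columns, the options are to add them to the table on the database side, drop them from the data frame, or write to a new table instead of appending.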

Related

Problem with mutate when trying to create a line_id column

I need to create a line ID column within a dataframe for further pre-processing steps. The code worked fine up until yesterday. Today, however, I am facing the error message:
"Error in mutate():
ℹ In argument: line_id = (function (x, y) ....
Caused by error:
! Can't convert y to match type of x ."
Here is my code - the dataframe consists of two character columns:
split_text <- raw_text %>%
mutate(text = enframe(strsplit(text, split = "\n", ))) %>%
unnest(cols = c(text)) %>%
unnest(cols = c(value)) %>%
rename(text_raw = value) %>%
select(-name) %>%
mutate(doc_id = str_remove(doc_id, ".txt")) %>%
# removing empty rows + add line_id
mutate(line_id = row_number())
Besides row_number(), I also tried rowid_to_column() and even c(1:1000), where 1000 is the number of rows in the dataframe. The error message stays the same.

Try explicitly specifying the data type of the "line_id" column as an integer using the as.integer() function, like this:
mutate(line_id = as.integer(row_number()))

This code works but is not fully satisfying, since I have to break the pipe:
split_text$line_id <- as.integer(c(1:nrow(split_text)))

Using a for loop to get bulk tweets from the Twitter API v2 endpoint in R

I am trying to collect more tweets than is allowed in a single query, hence I am using a for loop to automate this.
tweets <- data_frame()
for(i in 1:10){
httr::GET(url = url_tweet,
httr::add_headers(.headers = headers),
query = params) %>%
httr::content(response, as = "text") %>%
fromJSON(obj, flatten = TRUE) %>%
json_data <- view(enframe(unlist(json_data))) %>%
mutate(
id2 = name %>% str_extract("[0-9]+$"), # ensure unique rows
name = name %>% str_remove("[0-9]+$") %>% str_remove("^data.")
) %>%
pivot_wider(names_from = name, values_from = value) %>%
select(`tweet_id` = id, text, user_id=includes.users.id, user_name=includes.users.username, likes=public_metrics.like_count, retweets=public_metrics.retweet_count, quotes=public_metrics.quote_count) %>%
type_convert() -> data_sep
tweets <- rbind(tweets, data_sep)
}
I have run each step individually and nothing is wrong with any of them, but when I run them in the loop I get this error:
Error in `select()`:
! Can't subset columns that don't exist.
x Column `id` doesn't exist.

How to use character vector in filter on a database connection in R?

EDIT: I found my error in the example below. I made a typo in stored_group in filter. It works as expected.
I want to use a character value to filter a database table. I use dplyr functions directly on the connection object. See my steps below.
I connected to my MariaDB database:
con <- dbConnect(RMariaDB::MariaDB(),
dbname = mariadb.database,
user = mariadb.username,
password = mariadb.password,
host = mariadb.host,
port = mariadb.port)
Then I want to use a filter on a table in the database, by using dplyr code directly on the connection above:
stored_group <- "some_group"
con %>%
tbl("Table") %>%
select(id, group) %>%
filter(group == stored_group) %>%
collect()
I got an error saying Unknown column 'stored_group' in 'where clause', so I used show_query() like this:
stored_group <- "some_group"
con %>%
tbl("Table") %>%
select(id, group) %>%
filter(group == stored_group) %>%
show_query()
And I got:
<SQL>
SELECT `id`, `group`
FROM `Table`
WHERE (`group` = `stored_group`)
In the translation, stored_group is treated as a column name instead of as a value in R. How do I prevent this?
On a regular data.frame in R this works, for example:
stored_group <- "some_group"
data %>%
select(id, group) %>%
filter(group == stored_group)
I just tested the solution below, and it works. But my database table will grow. I want to filter directly on the database before collecting.
stored_group <- "some_group"
con %>%
tbl("Table") %>%
select(id, group) %>%
collect() %>%
filter(group == stored_group)
Any suggestions?
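As the edit at the top of this question says, the original problem was a typo. For anyone who wants to make the intent explicit anyway, one option is the !! operator, which makes dbplyr evaluate the variable in R before translating to SQL. This is a minimal sketch, assuming dplyr and dbplyr are installed. It uses lazy_frame() with simulate_mysql() as a stand-in for tbl(con, "Table"), so it needs no real database, and the sample columns are hypothetical.

```r
library(dplyr)
library(dbplyr)

# Assumption: a simulated table stands in for tbl(con, "Table")
tbl_sim <- lazy_frame(id = 1L, group = "a", con = simulate_mysql())

stored_group <- "some_group"

# !! forces stored_group to be evaluated in R, so its value is inlined into the SQL
query <- tbl_sim %>%
  select(id, group) %>%
  filter(group == !!stored_group)

# Render the SQL as text so the WHERE clause can be inspected
sql_text <- as.character(sql_render(query))
cat(sql_text, "\n")
```

The rendered WHERE clause should compare the column against the quoted value rather than against a column called stored_group, and the filtering still happens in the database before collect().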

Pass user input vector to a function that takes a string

I have a shiny app that takes in user input:
input$libraries is a reactive character vector produced by user input from
output$libraries <- renderUI({
checkboxGroupInput(inputId = "libraries",
label = strong("Select the libraries for which you would like to see part counts"),
choiceValues = LibraryIDs$libraryid,
choiceNames = LibraryNames$name,
selected = LibraryIDs$libraryid[1],
inline = TRUE)
})
I would like to select from my PostgreSQL database. I have a function set up as follows:
get_query <- function(querystring){
# create a connection
# loads the PostgreSQL driver
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname = "RosettaRelational",
host = "localhost", port = 5432,
user = "postgres", password = rstudioapi::askForPassword("Database password"))
on.exit(dbDisconnect(con))
# check for the existence of tables, must be created in pgAdmin4
#dbExistsTable(con, "libraries")
query <- eval(parse(text = querystring))
return(query)
}
It takes a string, parses it, and evaluates it as the query.
Now, when I try to query the database like this:
Names <- get_query(paste0("con %>% tbl('libraries') %>%
filter(libraryid %in% input$libraries) %>% select(name) %>% collect()"))
I get the error: object 'input' not found. I know it's not parsing the reactive character vector correctly. How should I change this to get it to work?
I tried:
Names <- get_query(paste0("con %>% tbl('libraries') %>%
filter(libraryid %in% '",input$libraries,"') %>% select(name) %>% collect()"))
but that only selects the first library in the vector, even when the user selects multiple libraries. It works when the input is a single character value, for example when the input is an action button instead of checkboxes.
Basically, what I need is for input$libraries to look like c('111a','111b','211','311a') when it is passed into the string if the user selects 111a, 111b, 211 and 311a, instead of just '111a', which is what it currently passes.

It seems from some testing on my side that your code
Names <- get_query(paste0("con %>% tbl('libraries') %>% filter(libraryid %in% '",input$libraries,"') %>% select(name) %>% collect()"))
will "vectorise" in its current form for multiple libraries in input$libraries. This will create a separate string for each library in input$libraries instead of one string containing all libraries. e.g.
> Names
[1] "con %>% tbl('libraries') %>% filter(libraryid %in% '111a') %>% select(name) %>% collect()"
[2] "con %>% tbl('libraries') %>% filter(libraryid %in% '111b') %>% select(name) %>% collect()"
Using my own data and your suggestion "for input$libraries to look like c('111a','111b','211','311a')" I adapted your code to
Names <- get_query(paste0("con %>% tbl('libraries') %>% filter(libraryid %in% c(", paste0("'", input$libraries, "'", collapse = ", "), ")) %>% select(name) %>% collect()"))
This should give you your required c('111a', '111b', '211', '311a').
It's not the most elegant, but it should work. You could also do the inner paste0() beforehand if it looks messy, as below:
libraries_comma_separated <- paste0("'", input$libraries, "'", collapse = ", ")
This will give you '111a', '111b', '211', '311a' and then do
Names <- get_query(paste0("con %>% tbl('libraries') %>% filter(libraryid %in% c(", libraries_comma_separated, ")) %>% select(name) %>% collect()"))
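To see the difference between the two paste0() calls without a database or a shiny session, here is a minimal base R sketch. The vector selected is a hypothetical stand-in for input$libraries, and the strings are shortened to just the filter condition.

```r
# Hypothetical stand-in for input$libraries (not taken from the original app)
selected <- c("111a", "111b", "211", "311a")

# Without collapse, paste0() is vectorised: one string per element of `selected`
per_element <- paste0("libraryid %in% '", selected, "'")
print(length(per_element))

# With collapse, the quoted elements are first joined into a single string
libraries_comma_separated <- paste0("'", selected, "'", collapse = ", ")
one_string <- paste0("libraryid %in% c(", libraries_comma_separated, ")")
print(length(one_string))
print(one_string)
```

The first call yields as many strings as there are selected libraries, while the second yields a single string that contains the whole c(...) vector.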

Error: Invalid First Argument

I'm getting an "invalid first argument" error for the following. However, con is an actual connection and is set up properly. So what does this error actually refer to?
library(dplyr)
con <- RSQLServer::src_sqlserver("***", database = "***")
myData <- con %>%
tbl("table") %>%
group_by( work_dt, campaign, ad_group, matchtype, keyword ) %>%
select( work_dt, campaign, ad_group, matchtype, keyword, impressions, clicks, cost ) %>%
filter(site_id %in% c(6932,6946,6948,6949,6951,6952,6953,6954,
6955,6964,6978,6979,7061,7260,7272,7329,
7791,7794,7850,7858,7983)) %>%
filter(work_dt >= as.Date("2014-10-01 00:00:00") & work_dt < as.Date("2014-10-02 00:00:00")) %>%
summarise(
sum_impressions = sum(impressions),
sum_clicks = sum(clicks),
sum_cost = sum(cost),
) %>%
collect()
This code produces:
Error in exists(name, env) : invalid first argument
exists("con")
> exists(con)
Error in exists(con) : invalid first argument
> exists("con")
[1] TRUE
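The console output above can be reproduced in base R: exists() looks a variable up by name, so it expects a character string and rejects the object itself. This is only a sketch of what the message means, with a hypothetical list standing in for the connection. It does not show where inside the package the bad exists(name, env) call originates.

```r
# Hypothetical object standing in for the connection (any non-character value behaves the same)
con <- list(host = "example")

# exists() takes the NAME of a variable as a character string
found <- exists("con")

# Passing the object itself is an error; capture the message instead of stopping
msg <- tryCatch(exists(con), error = function(e) conditionMessage(e))

print(found)
print(msg)
```

So the error suggests that somewhere in the call chain a non-character object is being passed where a name is expected, rather than that con is missing.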