I am trying to convert an Oracle database table into an R data frame.
I am using the dplyr::tbl function together with dbplyr::in_schema to connect to the specific schema and table within the Oracle database.
Table <- dplyr::tbl(my_oracle, dbplyr::in_schema('SCHEMA_NAME', 'TABLE_NAME'))
This is the part that confuses me: the result is an object called "Table" that is a "List of 2". The two items within the list are themselves lists of two.
I am able to convert this to a data frame by wrapping it with as.data.frame like this:
Table2 <- as.data.frame(dplyr::tbl(my_oracle, dbplyr::in_schema('SCHEMA_NAME', 'TABLE_NAME')))
However, when I do this it takes a very long time (hours for some tables) to convert to a data frame. Is there a more efficient way to convert the Oracle table into a usable data frame?
Also any insight into why dplyr::tbl results in a list of lists would also be very appreciated.
Thanks in advance.
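A possible explanation and workaround, hedged: dplyr::tbl() does not download the table at all; it returns a lazy reference to it, which is why inspecting the object shows a list of two (roughly, the connection and the query definition). The hours are spent pulling every row over the network when you coerce it to a data frame. A common pattern is to push filtering and column selection to the database first and only then collect(); the column names and filter below are hypothetical:

```r
library(dplyr)
library(dbplyr)

# `my_oracle` is the existing DBI connection from the question.
# tbl() returns a lazy reference (a list holding the connection and
# the query plan), not the data itself -- hence the "List of 2".
tbl_ref <- tbl(my_oracle, in_schema("SCHEMA_NAME", "TABLE_NAME"))

# Reduce the data on the database side, then pull only the result:
df <- tbl_ref %>%
  select(COL_A, COL_B) %>%   # hypothetical column names
  filter(COL_A > 100) %>%    # hypothetical filter
  collect()                  # the query only actually runs here
```

If you genuinely need the whole table, collect() on the bare tbl_ref is the idiomatic equivalent of the as.data.frame call, but the transfer time will be similar; the real savings come from reducing the data server-side.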
I am uploading an R data frame to Big Query with this:
bq_table_upload("project.dataset.table_name", df_name)
It works, but the order of the columns is different in BQ than in R.
Is it possible to get the table to inherit the order of the columns from the data frame? I have looked through the documentation and have not found this option, though it is certainly possible I missed it.
As pointed out in the comments by Oluwafemi Sule, it's only a matter of passing the data frame to the "fields" argument, like below:
bq_table_upload("project.dataset.table_name", values = df_name, fields = df_name)
I have 24 tables in a SQL work folder, named in the tablenameyearMONTH format and running from 201304 to 201503 (i.e. tablename201304, tablename201305, tablename201306, and so on up to tablename201503). I need to pull all of these tables from SQL into R and combine them into one master table. All the table names are identical apart from the dates (the date increases by one month each time), and I was wondering what the best way to do this is.
I know how to get the data from SQL using ODBC; I'm just struggling with the dates in R. How should I write the loop in R so that all the tables end up in one single table?
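One way to sketch this, assuming an existing odbc/DBI connection called con and that every monthly table has identical columns: generate the 24 yyyymm suffixes from a date sequence, then query each table in a loop and stack the results.

```r
library(DBI)

# Assumed connection, e.g.:
# con <- DBI::dbConnect(odbc::odbc(), dsn = "my_dsn")

# Build the 24 month suffixes 201304 ... 201503:
months <- format(seq(as.Date("2013-04-01"), as.Date("2015-03-01"),
                     by = "month"), "%Y%m")
table_names <- paste0("tablename", months)

# Query each monthly table and stack the results into one master table:
master <- do.call(rbind, lapply(table_names, function(tb) {
  dbGetQuery(con, paste0("SELECT * FROM ", tb))
}))
```

Generating the names from a date sequence avoids hand-rolling the year rollover from 201312 to 201401.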
I am new to R and coding in general, so please bear with me.
I have a spreadsheet with 7 sheets; 6 of them are formatted in the same way, and I am skipping the one that is not.
The code I have is thus:
library(readxl)

lst <- lapply(2:7,
              function(i) read_excel("CONFIDENTIAL Ratio 062018.xlsx", sheet = i)
)
This code was taken from this post: How to import multiple xlsx sheets in R
So far so good: the code works, and I have a large list with 6 sub-lists that appears to represent all of my data.
It is at this point that I get stuck. Being so new, I do not understand lists yet, and I really need them merged into one single data frame that looks and feels like the source data (columns and rows).
I cannot work out how to get from a list to a single data frame. I've tried rbind and other suggestions from here, but they all seem to either fail or only partially work, and I end up with a data frame that still looks like a list.
If each sheet has the same number of columns (ncol) and the same column names (colnames), then this will work. It needs the dplyr package.
require(dplyr)
my_dataframe <- bind_rows(my_list)
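To see what bind_rows() does with such a list, here is a tiny self-contained illustration with made-up data frames standing in for the sheets; the optional .id argument additionally records which sheet each row came from:

```r
library(dplyr)

# Made-up stand-ins for sheets read with read_excel():
my_list <- list(data.frame(x = 1:2, y = c("a", "b")),
                data.frame(x = 3:4, y = c("c", "d")))

# Stack them; .id adds a column recording the source list element:
my_dataframe <- bind_rows(my_list, .id = "sheet")
```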
I have a huge data frame df in R. If I invoke View(df), RStudio does not respond, since the data frame is too big. So I am wondering if there is any way to view, say, the first 500 lines of a data frame as a spreadsheet.
(I know it's possible to view using head, but I want to see it as a spreadsheet since it has too many columns, and using head with that many columns is not really user friendly.)
If you want to see first 100 lines of the data frame df as spreadsheet, use
View(head(df,100))
You can also subset the data frame like a matrix with df[rowfrom:rowto, columnfrom:columnto] and pass the result to View(), for example in your case:
View(df[1:500, ])
I'm trying to get some data from a large Baseball database in a nice format. It's a MySQL database so I use RMySQL to access it.
Problem is, the easiest way to retrieve the data I need is using sapply, as I need to vary an index:
myf <- function(ab){
  search <- paste('select pitch_type, des from pitches where ab_id=', ab)
  query <- dbSendQuery(con2, search)
  return(fetch(query, n = -1))
}
pitches <- sapply(players$ab_id,myf,simplify="array")
But it is very hard to access this data, as it returns a list of lists:
> mode(pitches[,1])
[1] "list"
Since each list element has two columns of differing length, is there an easy way to just stack all of these into a single matrix/data frame? I have tried many things with no success.
Thanks!
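A sketch of one alternative, with con2, the pitches table, and players$ab_id taken from the question: have each call return a plain data frame (dbGetQuery sends the query, fetches, and clears the result in one step), collect them with lapply instead of sapply so nothing gets simplified into an awkward array, and stack them with rbind:

```r
library(DBI)

# One data frame of pitches per at-bat:
myf <- function(ab) {
  dbGetQuery(con2,
             paste('select pitch_type, des from pitches where ab_id=', ab))
}

# lapply keeps the results as a plain list of data frames:
results <- lapply(players$ab_id, myf)

# Stack the per-at-bat data frames into one:
pitches <- do.call(rbind, results)
```

Because every per-at-bat result has the same two columns (pitch_type, des), rbind can stack them even though the number of rows varies from at-bat to at-bat.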