I have a very large dataset that I can only access through a pivot table in Excel.
I would like to access the raw data behind it so I can work with it in R.
I have tried several things:
Copying and pasting the pivot table, both into text and Excel files, and then importing into R: it does not work. It shows only the filters I have selected, not the whole database.
Clicking a cell of the pivot table to see the underlying data: it only shows the first 1000 entries. This is by far the best result I have got; the only problem is that I get the first 1000 entries and I need all of them.
What I have been doing until now is select the variables that interest me, copy them to a new sheet, and finally export that to R. But I always forget a variable, and it is very time-consuming to redo the whole procedure every day.
Does anyone know how I can access the whole database behind a pivot table in Excel?
I hope I have been clear; do not hesitate to ask for more information if needed.
Thank you in advance.
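If the pivot table is fed by an external data connection (which the 1000-row drill-through cap suggests), one workaround is to skip Excel entirely and query that source directly from R. A minimal sketch, assuming an ODBC source; the DSN, table, and column names below are placeholders, not anything from the original workbook:

library(DBI)
library(odbc)

# hypothetical DSN pointing at the same source the pivot table uses
con <- dbConnect(odbc::odbc(), dsn = "my_dsn")

# pull the full underlying table rather than the pivot's summary view
raw_data <- dbGetQuery(con, "SELECT * FROM source_table")

dbDisconnect(con)

Alternatively, for pivot tables built on an external connection, the drill-through row limit can usually be raised in Excel under the connection's properties (a setting along the lines of "Maximum number of records to retrieve").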
I'm very new to using R for anything database related, much less with AWS.
I'm currently trying to work with this set of code here. Specifically the '### TEST SPECIFIC TABLES' section.
I'm able to get the code to run, but now I'm not sure how to pull data from the tables. I assume I have to do something with 'groups', but I'm not sure what I need to do next to pull the data out.
So, even more specifically: how would I pull out specific data, like revenue for all organizations within the year 2018, for example? I've tried readRDS to pull a table as a data frame, but I get no observations or variables for any table. So I'm somewhat lost as to what I need to do here to pull the data out of the tables.
Thanks in advance!
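On the concrete example: note that readRDS() reads .rds files from disk, not database tables, which is why it returned empty objects. If the tables live in a database reachable over DBI, a query like the following would do it. A rough sketch; the driver, the connection details, and the financials table with organization, revenue, and year columns are all assumptions for illustration:

library(DBI)
library(RPostgres)  # assuming an AWS RDS Postgres instance; swap the driver for your engine

con <- dbConnect(RPostgres::Postgres(),
                 host     = "your-instance.rds.amazonaws.com",  # placeholder
                 dbname   = "mydb",
                 user     = "user",
                 password = "pass")

# revenue for all organizations in 2018 (table and column names are guesses)
revenue_2018 <- dbGetQuery(con, "
  SELECT organization, revenue
  FROM   financials
  WHERE  year = 2018
")
head(revenue_2018)

dbDisconnect(con)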
I am trying to put a large data frame into a new table of a database. It could be done simply via:
dbWriteTable(conn = db, name = "sometablename", value = my.data)
However, I want to specify the primary keys, foreign keys, and the column types, such as NUMERIC, TEXT, and so on.
Is there anything I can do? Should I create a table with my columns first and then append the data frame to it?
RSQLite assumes your data.frame is already all set before you write it to disk; there is not much you can specify in the write call itself. So I see two options: polish the table either before firing the query that writes it, or after. I usually write the table from R to disk, then polish it using dbGetQuery to alter the table's attributes. The only problem with this workflow is that SQLite has very limited support for altering tables.
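The question's own suggestion (create the table first, then append) also works and sidesteps SQLite's ALTER limitations. A minimal sketch with RSQLite; the column names and the referenced organizations table are illustrative, not from the original post:

library(DBI)
library(RSQLite)

db <- dbConnect(RSQLite::SQLite(), "my.db")
dbExecute(db, "PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys with this pragma

# define keys and column types up front
dbExecute(db, "
  CREATE TABLE sometablename (
    id     INTEGER PRIMARY KEY,
    org_id INTEGER REFERENCES organizations(id),
    name   TEXT,
    amount NUMERIC
  )
")

# then append the data frame into the pre-defined schema
# (the data frame's column names must match the table's)
dbWriteTable(conn = db, name = "sometablename", value = my.data, append = TRUE)

dbWriteTable() also accepts a field.types argument for plain column types, but constraints such as primary and foreign keys still need an explicit CREATE TABLE.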
I have a database named Team which has 40 tables. How can I connect to that database and refer to a particular table without using an SQL query, just through R data structures?
I am not sure what you mean by "How can I connect to that database and refer to a particular table without using an SQL query".
I am not aware of a way to "see" DB tables as R data frames or arrays without first importing the tuples through some sort of query (in SQL) - this seems to be the most practical way to use R with DB data (without going through the hassle of exporting the tables as .csv files first and re-reading them in R).
There are a couple of ways to import data from a DB into R so that the result of a query becomes an R data structure (including proper type conversion, ideally).
Here is a short guide on how to do that with SQL-R
A similar brief introduction to the DBI family
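As a minimal sketch of that import route, assuming an SQLite file (other backends differ only in the dbConnect() call; the file and table names are illustrative):

library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), "team.db")

dbListTables(con)                       # list the 40 table names
df  <- dbReadTable(con, "some_table")   # whole table as a data frame
sub <- dbGetQuery(con, "SELECT * FROM some_table WHERE x > 10")

dbDisconnect(con)

If the goal really is to avoid writing SQL, the dbplyr package lets you refer to a table as a lazy data frame with tbl(con, "some_table") and query it with ordinary dplyr verbs; the SQL is generated for you behind the scenes.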
As the title says, I created a data frame and imported it into an .mdb file using RODBC in R. Things went OK, but I realized that the table shown in the .mdb file does not have the same row order as my data frame. I tried using row.names(temp.df) <- NULL after ordering, but the order is still somewhat random.
It is not a very big issue as long as the two datasets are the same, but I wonder why this happens.
Thanks.
Databases generally have no concept of row ordering, by design. If you want a particular sort order when reading from a database, you have to put an ORDER BY clause in your SQL and sort on a column.
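One way to apply this in R, shown here with DBI (the same idea works with RODBC's sqlQuery); the connection db, the table name, and the row_order column are illustrative:

# store an explicit ordering column, since the database won't preserve row order
my.data$row_order <- seq_len(nrow(my.data))
dbWriteTable(db, "mytable", my.data)

# ask for the order you want at query time
df <- dbGetQuery(db, "SELECT * FROM mytable ORDER BY row_order")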
Hullo to all! This question is more about a shortcut than anything:
Is there a simple, yet efficient way to associate column names to csv data?
Problem:
I need to associate column names (and bind them) to import the CSV file correctly into my SQL Server database. I don't know, before I see the CSV, which of its columns will contain which data.
Example:
File 1 column order: Name, Address, Phone.
File 2 column order: Name, Phone, Address.
Hence, I need to be able to display the CSV and, with well-placed dropdown lists, show the remaining columns that need to be selected.
I need to create an interface that will allow for manual association of the CSV columns to the datatable columns.
Solution: ?
I am toying with the idea of coding this myself, but I wondered whether an easier existing solution was already out there; Google wasn't much help on this one.
Any input from you guys would be infinitely appreciated, as it would save me some precious time.
Other than doing the import via SSIS or some other direct SQL tool, I'm not aware of a better way of doing it.
All I would do in this case is show the list of columns for each field and let the user select the mapping.
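The mapping step itself is small once the user's dropdown choices are captured. The original setup is probably a web front end, but here is a sketch of the idea in R, assuming the choices arrive as a named vector and an odbc connection to SQL Server; all names and connection details are placeholders:

library(DBI)
library(odbc)

con <- dbConnect(odbc::odbc(),
                 Driver   = "SQL Server",   # placeholder connection details
                 Server   = "myserver",
                 Database = "mydb")

csv <- read.csv("incoming.csv", stringsAsFactors = FALSE)

# user-chosen mapping: names = database columns, values = CSV columns
mapping <- c(Name = "Name", Phone = "Phone", Address = "Address")

# reorder and rename the CSV columns to match the target table
to_load <- csv[, unname(mapping)]
names(to_load) <- names(mapping)

dbWriteTable(con, "contacts", to_load, append = TRUE)
dbDisconnect(con)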