I tried to use the R package RPostgreSQL to import csv files directly into PostgreSQL (which I browse with pgAdmin 4). My code is as follows:
dbWriteTable(localdb$con,'test1',choose.files(), row.names=FALSE)
I got the error message:
Warning message:
In postgresqlImportFile(conn, name, value, ...) :
could not load data into table
I checked in pgAdmin 4: an imported table named test1 does exist, but it has no rows. Then I imported the csv into R first and used dbWriteTable to write the data frame to PostgreSQL, and that worked well. I am not sure which part is wrong.
The reason I am not using psql or pgAdmin 4 to import the csv file directly is that I kept getting the error "relation does not exist" every time I used a COPY FROM command. I am now using the R package RPostgreSQL to bypass this issue, but sometimes my data file is too big to load into R. I need to find a way to use the dbWriteTable function to import a file directly into PostgreSQL without consuming R's memory.
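One workaround (not from the original post) if the file-path form of dbWriteTable keeps failing: stream the csv in chunks with readr and append each chunk to the table, so the whole file never has to sit in R's memory at once. A minimal sketch, assuming the csv has a header row; the file path below is a placeholder.

library(DBI)
library(readr)

con <- localdb$con                     # connection from the question
csv_path <- "C:/data/test1.csv"        # hypothetical path; point this at your csv

read_csv_chunked(
  csv_path,
  callback = SideEffectChunkCallback$new(function(chunk, pos) {
    chunk <- as.data.frame(chunk)
    if (pos == 1 && !dbExistsTable(con, "test1")) {
      # create the table from the first chunk
      dbWriteTable(con, "test1", chunk, row.names = FALSE)
    } else {
      # append the remaining chunks
      dbWriteTable(con, "test1", chunk, append = TRUE, row.names = FALSE)
    }
  }),
  chunk_size = 10000
)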
I am attempting to import a SQL Server geospatial table using sf package's st_read function, as follows:
dsn <- "driver={SQL Server};server=gdb;database=mydata;trusted_connection=true"
myPolygons <- st_read(dsn=dsn, query=paste0("SELECT * FROM mytable"))
This throws the following:
Error: Cannot open "driver={SQL Server};server=gdb;database=mydata;trusted_connection=true";
The file doesn't seem to exist.
This worked fine a few weeks ago and now it has suddenly stopped working. It fails with all of the spatial tables in my SQL Server database.
I'm pretty sure the dsn string is fine, because it works with the RODBC package to import non-spatial data. The st_read function works fine when importing an ESRI shapefile. The error occurs only when trying to import a SQL Server geospatial table. I'm certain that the tables I'm trying to import have appropriate geometry columns.
I tried removing any packages I thought might conflict with sf, then reinstalled sf, but I'm still getting the same error.
I'm totally baffled why this is suddenly failing. Any help would be most appreciated.
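One diagnostic worth trying (a sketch, not from the original post): check whether the GDAL build that sf now links against still includes the MSSQLSpatial driver. If it doesn't, st_read() cannot treat the dsn string as a database connection and falls back to interpreting it as a file path, which would produce exactly this "file doesn't seem to exist" message.

library(sf)
drivers <- st_drivers()            # GDAL drivers available to this sf build
"MSSQLSpatial" %in% drivers$name   # FALSE would explain the sudden failure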
I have recently started working with Databricks and Azure.
I have Microsoft Azure Storage Explorer. I ran a jar program on Databricks
which outputs many csv files to Azure Storage Explorer under the path
..../myfolder/subfolder/output/old/p/
What I usually do is go to the folder p, download all the csv files to my
local drive by right-clicking the p folder and clicking download,
and then read these csv files into R to do my analysis.
My issue is that sometimes my runs can generate more than 10000 csv files,
and downloading them to the local drive takes a lot of time.
I wondered if there is a tutorial or R package that would help me read in
the csv files from the path above without downloading them. For example,
is there any way I can set
..../myfolder/subfolder/output/old/p/
as my working directory and process all the files the same way I do now?
EDIT:
the full URL to the path looks something like this:
https://temp.blob.core.windows.net/myfolder/subfolder/output/old/p/
According to the official Azure Databricks document CSV Files, you can read a csv file directly in R from an Azure Databricks notebook, as shown in the R example of its Read CSV files notebook example section.
Alternatively, I used the R package reticulate and the Python package azure-storage-blob to read a csv file directly from a blob URL with a SAS token for Azure Blob Storage.
Here are my steps.
1. I created an R notebook in my Azure Databricks workspace.
2. I installed the R package reticulate via install.packages("reticulate").
3. I installed the Python package azure-storage-blob with the cell below (the script in the next step uses the pre-12.0 API of this package).
%sh
pip install azure-storage-blob
4. I ran a Python script to generate a container-level SAS token and used it to build a list of blob URLs with the SAS token; see the code below.
library(reticulate)
py_run_string("
from azure.storage.blob.baseblobservice import BaseBlobService
from azure.storage.blob import BlobPermissions
from datetime import datetime, timedelta

account_name = '<your storage account name>'
account_key = '<your storage account key>'
container_name = '<your container name>'

blob_service = BaseBlobService(
    account_name=account_name,
    account_key=account_key
)

# Generate a read-only SAS token at the container level, valid for one hour
sas_token = blob_service.generate_container_shared_access_signature(container_name, permission=BlobPermissions.READ, expiry=datetime.utcnow() + timedelta(hours=1))

# List the blob names under the prefix and build a URL with the SAS token for each one
blob_names = blob_service.list_blob_names(container_name, prefix = 'myfolder/')
blob_urls_with_sas = ['https://'+account_name+'.blob.core.windows.net/'+container_name+'/'+blob_name+'?'+sas_token for blob_name in blob_names]
")
blob_urls_with_sas <- py$blob_urls_with_sas
5. Now I can use different ways in R to read a csv file from a blob URL with the SAS token, such as the following.
5.1. df <- read.csv(blob_urls_with_sas[[1]])
5.2. Using the R package data.table
install.packages("data.table")
library(data.table)
df <- fread(blob_urls_with_sas[[1]])
5.3. Using the R package readr
install.packages("readr")
library(readr)
df <- read_csv(blob_urls_with_sas[[1]])
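Since the question mentions runs that generate more than 10000 csv files, here is a small follow-up sketch (not part of the original answer) that reads every blob URL in the list and combines the results, assuming all of the files share the same columns:

library(data.table)
chunks <- lapply(blob_urls_with_sas, fread)   # read each blob over HTTPS
all_data <- rbindlist(chunks)                 # stack them into one table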
Note: for the reticulate library, please refer to the RStudio article Calling Python from R.
Hope it helps.
Brand new to SQLite, running on a Mac. I'm trying to import a csv file from the SQLite tutorial:
http://www.sqlitetutorial.net/sqlite-import-csv/
The 'cities' data I'm trying to import for the tutorial is here:
http://www.sqlitetutorial.net/wp-content/uploads/2016/05/city.csv
I try and run the following code from Terminal to import the data into a database named 'data' and get the following error:
sqlite3
.mode csv
.import cities.csv data;
CREATE TABLE data;(...) failed: near ";": syntax error
A possible explanation may be the way I'm downloading the data - I copied the data from the webpage into TextWrangler and saved it as a .txt file. I then manually changed the extension to .csv. This doesn't seem very elegant, but that was the advice I found online for creating the .csv file: https://discussions.apple.com/thread/7857007
If this is the issue then how can I resolve it? If not then where am I going wrong?
Another potentially useful point - when I executed the code yesterday there was no problem, it created a database with the data. However, running the same code today produces the error.
sqlite3 dot commands such as .import are not SQL and don't need a semicolon at the end. Replace
.import cities.csv data;
with
.import cities.csv data
So basically I successfully exported my SQL view data into a csv file, but now when I load it into the R GUI, I get the following error:
> load("C:\\Users\\dachen\\Documents\\vTargetBuyers.csv")
Error: bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file ‘vTargetBuyers.csv’ has magic number 'Marit'
Use of save versions prior to 2 is deprecated
What should I do? Is something wrong with my R installation, or with my CSV file?
Try using read.csv instead of load. load is for reading files created by save.
Type ?read.csv to access the documentation.
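For example, a minimal sketch using the path from the question:

# read.csv parses the csv into a data frame; load() only reads .RData files created by save()
vTargetBuyers <- read.csv("C:\\Users\\dachen\\Documents\\vTargetBuyers.csv",
                          header = TRUE, stringsAsFactors = FALSE)
str(vTargetBuyers)   # inspect the imported columns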
I am trying to import an Oracle dump into Oracle 11g XE by using the command below:
imp system/manager@localhost file=/home/madhu/test_data/oracle/schema_only.sql full=y
I get the following:
IMP-00037: Character set marker unknown
IMP-00000: Import terminated unsuccessfully
Can anyone please help me?
You received the IMP-00037 error because of a problem with the export file. I'd suspect either your dump file is corrupted or the dump file was not created by the exp utility.
If the issue occurred because of a corrupted dump file, then there is no choice other than obtaining an uncorrupted dump file. Use the impdp utility to import if you used the expdp utility to prepare the dump file.
The following links may be helpful for trying other options:
https://community.oracle.com/thread/870104?start=0&tstart=0
https://community.oracle.com/message/734478
If you are not sure which command (exp or expdp) was used, you could check the log file that was created during the export; it contains the exact command that was executed to prepare the dump file.