It would be great if anyone can help me out with the COPY function syntax.
I am trying to copy data from a Google Sheet to Vertica with the help of R (using the googlesheets and RJDBC packages).
I have exported the data from the Google Sheet but am unable to import it into the Vertica server.
Please help me out.
googleSheet <- gs_title("GS", verbose = FALSE)
auditSheet <- gs_read(ss = googleSheet, ws = 'GS1')
copy verticaDB.tableName from local auditSheet
Thanks
Related
I have written some scripts in R which I have to run in an Azure ML notebook, but I have not found much documentation on how to create a dataset by running code in a notebook with the R kernel. I have written the following Python code, which works with the Python kernel:
from azureml.core import Dataset, Datastore,Workspace
subscription_id = 'abc'
resource_group = 'pqr'
workspace_name = 'xyz'
workspace = Workspace(subscription_id, resource_group, workspace_name)
datastore = Datastore.get(workspace, 'workspaceblobstore')
# create tabular dataset from all parquet files in the directory
tabular_dataset_3 = Dataset.Tabular.from_parquet_files(path=(datastore,'/UI/09-17-2022_125003_UTC/userdata1.parquet'))
df=tabular_dataset_3.to_pandas_dataframe()
It works fine with the Python kernel, but I want to execute the equivalent R code in a notebook with the R kernel.
Can anyone please tell me what the equivalent R code is? Any help would be appreciated.
To use the dataset from an R script, first register the dataset in the portal. Once the dataset is registered, get the dataset URL, open the notebook, and select the R kernel.
Upload the dataset and get the data source URL.
Go to Machine Learning studio and create a new notebook.
Use the R script below to get the dataset and convert it to a data frame.
azureml_main <- function(dataframe1, dataframe2){
  print("R script run.")
  run = get_current_run()
  ws = workspacename
  dataset = azureml$core$dataset$Dataset$get_by_name(ws, "./path/insurance.csv")
  dataframe2 <- dataset$to_pandas_dataframe()
  # Return datasets as a Named List
  return(list(dataset1=dataframe1, dataset2=dataframe2))
}
I am trying to read an Excel file (xlsx) into a data frame in IBM Watson Studio. The Excel file is saved in my list of assets. I'm a bit new to Python.
I have tried creating a project token with some help I got here. I would appreciate it if someone could help with the complete code.
I tried this:
from project_lib import Project
project = Project(project_id='',
project_access_token='')
pc = project.project_context
file = project.get_file("xx.xlsx")
file.sheet_names
df = pd.ExcelFile(file)
df = file.parse (0)
df.head ()
I need to load the Excel file into a pandas data frame (pd, for example).
All you need to do is:
First, insert the project token as you already did.
Then fetch the file and call .seek(0) on it.
Then read it with pandas' read_excel() and you should be able to read it.
# Fetch the file
my_file = project.get_file("tests-example.xls")
# Rewind to the start of the file object before reading
my_file.seek(0)
# Read the Excel file from the object storage into a pandas DataFrame
import pandas as pd
pd.read_excel(my_file, nrows=10)
For more information: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/project-lib-python.html
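To illustrate why the .seek(0) step matters, here is a minimal sketch that uses an in-memory io.BytesIO as a stand-in for the object returned by project.get_file(). The stand-in, its contents, and the helper name read_all are assumptions made for illustration only. A file-like object is read from its current position, so if the cursor sits at the end, a read returns nothing until the object is rewound.

```python
import io


def read_all(buffer):
    """Rewind a file-like object to the start and return its full contents."""
    buffer.seek(0)
    return buffer.read()


# Hypothetical stand-in for the object returned by project.get_file();
# the payload below is made up for the demonstration.
fake_file = io.BytesIO()
fake_file.write(b"col_a,col_b\n1,2\n")  # writing leaves the cursor at the end

at_end = fake_file.read()      # read from the current position (the end)
rewound = read_all(fake_file)  # read again after rewinding to position 0

print(len(at_end), len(rewound))
```

The same rewind applies before handing the object to pd.read_excel(), which also reads from the current position.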
I want to import data in JSON format from MongoDB into R. I am using the mongolite package to connect MongoDB to R, but when I use mongo$find('{}') the data is stored as a data frame. Please check my R code:
library(mongolite)
mongo <- mongolite::mongo(collection = "Attributes", db = "Test", url =
"mongodb://IP:PORT",verbose = TRUE)
df1 <- mongo$find('{}')
df1 is stored as a data frame, but I want the data in JSON format only. Please give your suggestions.
Edit -
Actual JSON structure converted into a list
But when I load data from MongoDB into R using the mongolite package, the data is stored as a data frame, and if I then convert it to a list, the structure changes and a few extra columns are inserted into the list.
Please let me know how to solve this issue.
Thanks
SJB
I have been mainly working with .xlsb files (the binary file type of xlsx), which I would like to read and write using R. Could you please let me know if there is a package available for this, or do I need to create a package on my own?
RODBC did not work either.
Try the excel.link package. The xl.read.file function allows rectangular data sets to be read in, though there are other options available.
You also need to install and load the RDCOMClient package before running the first excel.link function.
e.g.,
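# x: path to the .xlsb file; Worksheet_name: name of the sheet to read
read_xlsb <- function(x, Worksheet_name){
  require("RDCOMClient")
  message(paste0("Reading ", x, "..."))
  df <- excel.link::xl.read.file(filename = x, header = TRUE,
                                 xl.sheet = Worksheet_name)
  # Record the source file alongside the data
  df$filename <- x
  df <- as.data.frame(df)
  return(df)
}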
The only annoyance I've found is that I can't override Excel's "save on close" functionality, so these pop-ups need to be closed by hand.
BTW I think excel.link only works on Windows machines.
I have the basic setup done following the link below:
http://htmlpreview.github.io/?https://github.com/Microsoft/AzureSMR/blob/master/inst/doc/tutorial.html
There is a method 'azureGetBlob' which allows you to retrieve objects from the containers. However, it seems to only allow the "raw" and "text" formats, which is not very useful for Excel. I've tested the connections etc., and I can retrieve .txt / .csv files but not .xlsx files.
Does anyone know any workaround for this?
Thanks
There is no file type in Azure Blob Storage; a blob just has a name. The extension only has meaning to the OS. If we want to open the Excel file in R, we can use a third-party library such as readxl.
Workaround:
You could use the Get Blob API to download the blob to a local path, then use readxl to read the file. More demo code is available from this link.
# install
install.packages("readxl")
# Loading
library("readxl")
# xls files
my_data <- read_excel("my_file.xls")
# xlsx files
my_data <- read_excel("my_file.xlsx")
Solved with the following code. Basically, read the blob as raw bytes, write the bytes to a file on disk, then read that file into R.
excel_bytes <- azureGetBlob(sc, storageAccount = "accountname", container = "containername", blob=blob_name, type="raw")
q <- tempfile()
f <- file(q, 'wb')
writeBin(excel_bytes, f)
close(f)
result <- read.xlsx(q, sheetIndex = sheetIndex)
unlink(q)
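For comparison, the same bytes-to-temporary-file pattern can be sketched in Python. This is an illustration under stated assumptions, not part of the original solution: the helper name with_temp_file and the reader callback are hypothetical, and the payload stands in for the raw bytes that a blob download would return.

```python
import os
import tempfile


def with_temp_file(payload, reader, suffix=".xlsx"):
    """Write raw bytes to a temporary file, pass its path to reader, then delete it.

    with_temp_file and reader are hypothetical names used for illustration.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Write the bytes and close the handle before anything else opens the file
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        return reader(path)
    finally:
        # Mirrors unlink(q) in the R code: always remove the temporary file
        os.unlink(path)


def read_back(path):
    """Stand-in reader that returns the file's bytes and remembers the path."""
    read_back.last_path = path
    with open(path, "rb") as handle:
        return handle.read()


result = with_temp_file(b"raw blob bytes", read_back)
print(result)
```

In practice the reader would be a function that parses the file format, such as an Excel reader that accepts a file path.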