Load a text file into an Apache Kudu table?

How do you load a text file to an Apache Kudu table?
Does the source file need to be in HDFS space first?
If it doesn't share the same HDFS space as other Hadoop ecosystem programs (e.g. Hive, Impala), is there an Apache Kudu equivalent of:
hdfs dfs -put /path/to/file
before I try to load the file?

The file does not need to be in HDFS first. It can be taken from an edge node or local machine. Kudu is similar to HBase: it is a real-time store that supports key-indexed record lookup and mutation, but it can't store a text file directly the way HDFS does. For Kudu to store the contents of a text file, the file needs to be parsed and tokenised. For that, you need Spark execution or the Java API, along with NiFi (or Apache Gobblin), to perform the processing and then store the result in a Kudu table.
Alternatively, you can integrate Kudu with Impala, which lets you use Impala's SQL syntax to insert, query, update, and delete data in Kudu tablets, as an alternative to building a custom Kudu application against the Kudu APIs. The steps (sketched below) are:
1. Import the file into HDFS.
2. Create an external Impala table over the file.
3. Insert the data into the table.
4. Create a Kudu table using the keywords STORED AS KUDU and AS SELECT to copy the contents from Impala to Kudu.
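
A minimal sketch of those steps, assuming a comma-delimited file /tmp/users.csv with id and name columns (the paths, table names, and columns are illustrative, not from the original question):

# 1. Stage the local file in HDFS so Impala can read it
hdfs dfs -mkdir -p /data/users
hdfs dfs -put /tmp/users.csv /data/users/

# 2. Create an external Impala table over the staged file
impala-shell -q "
  CREATE EXTERNAL TABLE users_staging (id BIGINT, name STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  LOCATION '/data/users';"

# 3-4. Copy the contents into a new Kudu table via CTAS
impala-shell -q "
  CREATE TABLE users
  PRIMARY KEY (id)
  PARTITION BY HASH (id) PARTITIONS 4
  STORED AS KUDU
  AS SELECT id, name FROM users_staging;"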
You can refer to this link for more info: https://kudu.apache.org/docs/quickstart.html

Related

Bulk upload using csv file to remote MonetDB server

I have a remote MonetDB server running and I want to bulk upload a csv file as it is much faster.
Based on the params in MonetDB.R, there is a csvdump=TRUE option but I don't think it works when you are trying to do this against a remote server. The server has to be local.
https://rdrr.io/github/MonetDB/monetdb-r/man/dbWriteTable.html
First, am I correct that I can't do this, and if so, is there a workaround? I have a dataframe with over 5 million rows, so it takes a long time with insert statements rather than using COPY INTO.
When I try using csvdump=TRUE against the remote server, it can't find the csv file, because the file is local to the computer that called the dbWriteTable command.
I think you are right. As a workaround, either use explicit COPY INTO ... ON CLIENT SQL statements (sketched after the quoted docs below) or first use a file-transfer tool to copy the file to the remote server before calling dbWriteTable.
MonetDB's documentation on COPY INTO reads:
FROM files ON SERVER
With ON SERVER, which is the default, the file name must be an
absolute path on the system on which the database server (mserver5) is
running. ...
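
Based on that, a minimal sketch of the first workaround using MonetDB's mclient command-line tool, run from the client machine (host, database, table, and file path are illustrative; whether the R driver itself can use ON CLIENT is exactly the open question):

# mclient streams the local CSV to the remote server itself
mclient -h remote-host -d mydb -s \
  "COPY INTO my_table FROM '/local/path/data.csv' ON CLIENT
   USING DELIMITERS ',', E'\n', '\"';"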
Interestingly enough, pymonetdb, the Python driver for MonetDB, uses ON CLIENT for bulk loads. From pymonetdb's docs:
File Uploads and Downloads
Classes related to file transfer requests as used by COPY INTO ON
CLIENT.
You might want to file an issue with the MonetDB R-driver project to get behavior similar to pymonetdb's.

Is there any way to read a properties file on the client side using SQL*Plus in a .ksh file?

I have a .ksh file from which I connect to Oracle DB in various environments using SQL*Plus, as below:
sqlplus -s $O_USER/$O_PASS@$O_DATABASE <<-EOF
Now I need to read a properties (.txt) file from SQL*Plus to dynamically create a URL parameter, and this file is located on the client side. Is there any way to do this? I'm OK with reading it via the shell script and passing it to SQL*Plus. I'm able to access some string variables from the shell script in SQL*Plus, but is there a way to pass a hash-map-like object from the shell script to SQL*Plus?
A few notes (one possible shell-side approach is sketched after them):
I can't use UTL_FILE, because the properties file should be located on the client side only. I'm developing a monitoring tool for an application that connects to the application in various environments, and I don't want to, and don't have enough permissions to, put this properties file in each of those environments. So I want to store the properties file in a single place (the client side).
I can't use TEXT_IO, because I'm not using Oracle Forms.
I don't want to hard-code all these properties in the .ksh file (which would actually work), because there are more than 150 key-value pairs.
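
One possible shell-side approach (a sketch, not a confirmed solution; props.txt and the key names are hypothetical): since SQL*Plus has no map type, let the .ksh script read the key=value pairs and turn each one into a SQL*Plus DEFINE substitution variable, which the SQL can then reference as &key:

#!/bin/ksh
# Turn each key=value line of the client-side properties file into a
# SQL*Plus DEFINE statement. Assumes keys are valid SQL*Plus identifier
# names and values contain no single quotes (naive quoting).
DEFINES=""
while IFS='=' read -r key value; do
    [[ -z "$key" || "$key" = \#* ]] && continue   # skip blanks and comments
    DEFINES="$DEFINES
DEFINE $key = '$value'"
done < /path/to/props.txt

sqlplus -s "$O_USER/$O_PASS@$O_DATABASE" <<EOF
$DEFINES
-- each property is now usable as a substitution variable, e.g. &db_url
SELECT '&db_url' AS url FROM dual;
EXIT
EOF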

FTP csv file from Unix server to Oracle Database through ODI

We have CSV files being loaded automatically onto a Unix machine.
Requirement: we need to load the CSV file from that remote server into my Oracle DB. We have ODI as our ETL tool. Can someone advise on how to proceed? What is the way to load the CSV from the Unix server into Oracle DB? Please point us to some documentation if this is possible.
Thanks,
Gowtham Raja S
Oracle provides some tutorials (Oracle By Example); one of them explains how to load a flat file into an Oracle table with ODI 12c: https://apexapps.oracle.com/pls/apex/f?p=44785:112:::::P112_CONTENT_ID:7947
You will just need to change the field delimiter to a comma instead of a tab.
The other tutorials can be found on the product page : http://www.oracle.com/technetwork/middleware/data-integrator/learnmore/index.html
If you know how to create a data store for a file, ODI has an LKM called LKM File to Oracle (SQLLDR), as well as LKM File to Oracle (external table); both can be used to load data quickly. If that feels a bit difficult: since you already have sqlldr to load data manually from a file into the DB, whatever command you use to start sqlldr can be placed in an ODI procedure with the technology set to OS Command, which then loads the data automatically (sketched below).
Let me know if any other suggestions are required.
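
For the OS Command route, the procedure would just run your existing SQL*Loader command; a sketch (control-file contents, paths, and names are illustrative):

# users.ctl -- SQL*Loader control file (illustrative)
#   LOAD DATA
#   INFILE '/data/incoming/users.csv'
#   INTO TABLE users
#   FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
#   (id, name, email)

# The command the ODI OS Command step would execute:
sqlldr userid=$DB_USER/$DB_PASS@$DB_TNS control=/data/ctl/users.ctl log=/data/log/users.log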

Import database MySQL File-Per-Table Tablespaces to Same Server

It is necessary to copy a database within a single server. The method chosen was "File-Per-Table Tablespaces to Another Server", as it is the fastest for large databases.
The official documentation states that the database name must be the same on the source server and the destination server.
What if the source server and the destination server are one and the same server?
Is there any way to quickly copy the database files from one database to another within a server?
Or is there some way to get "File-Per-Table Tablespaces to Another Server" to ignore the name of the database?
Server info: OS: MS Windows Server 2008
MySQL Server: MySQL 5.5 or MariaDB
Tables Type: InnoDB (if MariaDB - InnoDB plugin)
Portability Considerations for .ibd Files
When you move or copy .ibd files, the database directory name must be the same on the source and destination systems. The table definition stored in the InnoDB shared tablespace includes the database name. The transaction IDs and log sequence numbers stored in the tablespace files also differ between databases.
EDIT:
I would create the backup files as suggested in the method, but would also export the schema as CREATE TABLE statements. After the backup I would use the RENAME TABLE command to move the existing tables to another database. Then I would recreate the schema in my current database using the CREATE TABLE statements, and then would import the tablespaces back as described.
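
A sketch of that flow for a single table (mydb, archive_db, and the users table are illustrative; every table in the schema needs the same treatment, and innodb_file_per_table must be enabled):

mysql -u root -p <<'EOF'
-- 1. Move the existing table aside into another database (just renames files)
CREATE DATABASE IF NOT EXISTS archive_db;
RENAME TABLE mydb.users TO archive_db.users;

-- 2. Recreate the empty table from the exported CREATE TABLE statement
CREATE TABLE mydb.users (id INT PRIMARY KEY, name VARCHAR(100)) ENGINE=InnoDB;

-- 3. Detach the new empty .ibd so the backed-up one can take its place
ALTER TABLE mydb.users DISCARD TABLESPACE;
EOF

# Copy the backed-up users.ibd into the mydb data directory here, then:
mysql -u root -p -e "ALTER TABLE mydb.users IMPORT TABLESPACE;"

Because the import happens back into a database directory with the original name (mydb), the same-name requirement from the documentation is satisfied.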

How can I convert my Access database (.accdb) to SQLite?

How can I convert my Access database (.accdb) to an SQLite database (.sqlite)?
Maybe you can use a multi-step algorithm (step 3 is also sketched below):
1. Export (convert) the Access table or query to an Excel file.
2. Save the Excel file as a CSV file.
3. Use any SQLite manager (for example, phpLiteAdmin) to import the data from the CSV file into an existing SQLite table.
Besides Android and iOS, which use SQLite, there are still web hosts that use no database engine other than SQLite.
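
If you prefer the command line over a GUI manager, step 3 can also be done with the sqlite3 shell (file and table names are illustrative; if the CSV has a header row and the table already exists, remove the header first, since .import would load it as data):

sqlite3 mydb.sqlite <<'EOF'
.mode csv
.import users.csv users
EOF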
1) If you want to convert the structure of the DB, you should use a DB-modeling tool:
create a new model from the existing Access database
generate a SQL script for creating the SQLite database
use this script in your SQL helper
2) If you want to import data from the Access database into your Android app, I think you can follow case #1: migrate all data from the Access database to a temporary SQLite database, save it to the assets folder, and copy it from assets to the internal SQLite database during the first app start.
