We were previously using Oracle DB in our data warehouse setup. We used the SQL*Loader utility for bulk loading, which was invoked through Informatica. We are now shifting our DB to SAP HANA and are very new to HANA. We were looking for a similar command line utility in SAP HANA for efficient bulk data loads, and I came across a utility that uses a CTL file in SAP HANA.
But the problem we are facing is that we need to specify the CTL file path, DATA file path, and BAD file path on the command line only. Is there a way to achieve this? Or is there a better mechanism in SAP HANA for scheduled bulk loading?
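For reference, the CTL-style load I came across is driven by a SQL statement rather than a standalone loader binary; a rough sketch of what it looks like (schema, table, paths, and delimiters are placeholders, and the files must be readable by the HANA server process):

IMPORT FROM CSV FILE '/hana/staging/orders.csv' INTO "MYSCHEMA"."ORDERS"
WITH RECORD DELIMITED BY '\n'
     FIELD DELIMITED BY ','
     ERROR LOG '/hana/staging/orders.bad';

Something like this can be put in a script file and run on a schedule through the hdbsql command line client, e.g. hdbsql -n hanahost:30015 -u LOADER -p secret -I /hana/staging/orders_import.sql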
The EXPORT/IMPORT commands of the SAP HANA server are not as versatile as Oracle's command line SQL*Loader.
They are mainly aimed at transports between HANA systems.
For proper ETL you would rather use either "Smart Data Integration" (https://help.sap.com/hana_options_eim) and/or "Smart Data Access" (https://help.sap.com/saphelp_hanaplatform/helpdata/en/71/0235120ce744c381176121d4c60b28/content.htm).
Specifically for typical EDW scenarios, there's also the option of "Data Warehousing Foundation" (https://help.sap.com/hana_options_dwf), which provides a lot of functionality for handling mass data: partitioning, data distribution, etc.
Knowing many former Oracle users just want a 1:1 swap of tools, I want to give fair warning: data loading & transformation in HANA is a lot less command line based.
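To give a feel for the Smart Data Access route, a minimal sketch (the remote source name, adapter, DSN, credentials, and schema/table names below are all made up, and the exact CONFIGURATION clause depends on the source system and HANA revision; check the SDA documentation for your adapter):

CREATE REMOTE SOURCE "ORA_SRC" ADAPTER "odbc"
CONFIGURATION FILE 'property_orcl.ini'
CONFIGURATION 'DSN=ORA_DSN'
WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=stage_user;password=secret';

CREATE VIRTUAL TABLE "STAGING"."V_ORDERS" AT "ORA_SRC"."<NULL>"."STAGE_USER"."ORDERS";

INSERT INTO "STAGING"."ORDERS" SELECT * FROM "STAGING"."V_ORDERS";

Once the virtual table exists, the actual load is just an INSERT ... SELECT that can be scheduled like any other SQL statement.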
I have looked at others who have been trying to get data out of an OpenEdge Progress database.
I have the same problem, but there is a backup routine on the Windows file server that dumps the data every night. I have the *.pbk and a 1K *.st file. How can I get the data out of the dump file in a form I can use?
Or is it not possible?
Thanks.
A *.pbk file is probably a backup (ProBacKup). You can restore it on another system with compatible characteristics (same byte order, same release of Progress OpenEdge). Sometimes that is helpful if the other system has better connectivity or licensing.
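If the restore route is viable, the usual sequence looks roughly like this (the database name and paths are placeholders, and the OpenEdge release on the target must match the one that took the backup):

prostrct create mydb /backups/mydb.st
prorest mydb /backups/mydb.pbk

After that you can start the restored database and use one of the extraction options below. (If the extent paths inside the .st file don't exist on the target machine, edit them first.)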
To extract the data from a database, either the original or a restored backup, you have some possibilities:
1) A pre-written extract program. Possibly provided by whoever created the application. Such a program might create simple text files.
2) A development license that permits you to write your own extract program. The output of the "showcfg" command will reveal whether or not you have a development license.
3) Regardless of license type you can use "proutil dbName -C dump tableName" to export the data, but this will result in binary output that you probably will not be able to read or convert. (It is usually used in conjunction with "proutil load"; see the sketch after this list.)
4) Depending again on the license that you have you might be able to dump data with the data administration tool. If you have a runtime only license you may need to specify the -rx startup parameter.
5) If your database has been configured to allow SQL access via ODBC or JDBC you could connect with a SQL tool and extract data that way.
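As a rough illustration of option 3 (the database names, table name, and dump directory are placeholders; the .bd file name is derived from the table name):

proutil srcdb -C dump customer /tmp/dumps
proutil tgtdb -C load /tmp/dumps/customer.bd

The binary dump/load pair is fast, but as noted above it is only useful when the target is another OpenEdge database.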
I am working on migrating Teradata DBs to an open source DB (which DB is still under discussion). I came across the Apache Drill engine. My question is: can we use Drill to load data from Teradata? If yes, can we use it as a schema conversion tool?
In theory, yes, it can load data from Teradata: since Teradata has a JDBC driver, you can configure Teradata as a source. For an example of how to configure a JDBC data source in Drill, see the docs here.
Drill has a CTAS statement. I know it can be used to write Parquet, CSV, and JSON files, but I'm not sure what other output formats it supports.
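As a sketch, a JDBC storage plugin for Teradata registered through Drill's web UI might look something like this (the driver class, URL, and credentials here are assumptions, and the Teradata JDBC jar has to be on Drill's classpath):

{
  "type": "jdbc",
  "driver": "com.teradata.jdbc.TeraDriver",
  "url": "jdbc:teradata://td-host/DATABASE=sales",
  "username": "td_user",
  "password": "td_password",
  "enabled": true
}

With a plugin named, say, teradata enabled, a CTAS into Parquet could then look like:

ALTER SESSION SET `store.format` = 'parquet';
CREATE TABLE dfs.tmp.`customers` AS SELECT * FROM teradata.sales.customers;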
To get more information about what Drill can do, and to request features, please get in touch with the Drill team on the user list.
In which way can I access another database (not OpenEdge) via ODBC from OpenEdge without using DataDirect?
The use case is data migration from one system to another, so performance cannot be neglected completely but it's a one time thing that is allowed to take a little longer.
Why without DataDirect? Extra cost. Our client doesn't have the license.
Why not dump and load (via CSV, for example)? The client doesn't want to do the mapping between the systems this way but with database views.
As far as I know there is no way to directly access another database if you're not using DataDirect or something like the DataServer for Oracle, etc.
However, you could call a third-party ODBC library's functions as external functions and route your queries to the foreign database through those calls. This wouldn't allow you to use OpenEdge constructs like FOR EACH, buffers, etc., but it would allow you to retrieve the data, process it using custom functions, and then insert it into the OpenEdge tables.
See the following KB for accessing external library functions:
https://knowledgebase.progress.com/articles/Article/P183546
Another approach you could use, assuming your tables are in OpenEdge already, is to use the OpenEdge SQL92 ODBC driver from another language (C/VB/Java/whatever works for you), and read the data from the source database and insert into OpenEdge via SQL92 ODBC.
Looking at the website there are downloadable ODBC drivers for most platforms:
https://www.progress.com/odbc/openedge
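A bare-bones sketch of the second approach in Python with pyodbc (the DSN names, table, and columns are invented; any language with an ODBC binding works the same way):

import pyodbc

# Source database (any ODBC-accessible system) and target OpenEdge database.
# Both DSNs are assumed to already be configured in the ODBC driver manager.
src = pyodbc.connect('DSN=SourceDB;UID=src_user;PWD=src_pass')
tgt = pyodbc.connect('DSN=OpenEdgeDB;UID=oe_user;PWD=oe_pass')

# Read from the source and insert into OpenEdge via its SQL92 engine.
rows = src.cursor().execute('SELECT cust_id, cust_name FROM customer').fetchall()
cur = tgt.cursor()
cur.executemany('INSERT INTO PUB.customer (cust_id, cust_name) VALUES (?, ?)', rows)
tgt.commit()

Tables created from the 4GL side are exposed to SQL92 under the PUB schema, which is why the INSERT is qualified with PUB.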
I have installed Oracle XE 11g R2 on my machine. I ran a few scripts which do the setup by creating schemas and procedures for our application. Now I want to clone this database so that other people, using the cloned dbf files, can see the base schema on their respective machines and work on their individual requirements on top of that.
It now has 6 dbf files:
CONTROL.DBF
SYSAUX.DBF
SYSTEM.DBF
TEMP.DBF
UNDO.DBF
USER.DBF
Can I just give them the files, or do I need to create a server parameter file (SPFILE) or control file? What about the REDO logs?
I have very little knowledge of database administration. Please suggest. I understand that it is not Enterprise Edition, so not everything might be supported, but I assume the cloning process is similar for XE.
While it is possible to restore a database using the data files, I strongly suspect that is not what you're really after. If you're not an experienced DBA, the number of possible issues you'll encounter trying to restore a backup on a different machine and then creating an appropriate database instance is rather large.
More likely, what you really want to do is generate a full export of your database. The other people that need your application would then install Oracle and import the export that you generated.
The simplest possible approach would be to run, at a command line:
exp "'/ as sysdba'" full=y file=myDump.dmp
You would then send myDump.dmp to the other users, who would import it into their own database:
imp "'/ as sysdba'" full=y file=myDump.dmp
This will only be a logical backup of your database. It will not include things like the parameters that the database has been set to use, so other users may be configured to use more (or less) memory, have a different file layout, or even run a slightly different version of Oracle. But it does not sound like you need that degree of cloning. If you have a large amount of data, using the Data Pump version of the export and import utilities would be more efficient. My guess, from the fact that you haven't even created a new tablespace, is that you don't have enough data for this to be a concern.
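For completeness, the Data Pump equivalents look roughly like this (the directory object and file names are placeholders; DATA_PUMP_DIR exists by default):

expdp system/password full=y directory=DATA_PUMP_DIR dumpfile=myDump.dmp logfile=exp_myDump.log
impdp system/password full=y directory=DATA_PUMP_DIR dumpfile=myDump.dmp logfile=imp_myDump.log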
For more information, consult the Oracle documentation on the export and import utilities.
I have been developing locally for some time and am now pushing everything to production. Of course, I was also adding data to the development server without thinking about the fact that I hadn't reconfigured it to use Postgres.
Now I have a SQLite DB whose data I need to get into a Postgres DB on a remote VPS.
I have tried dumping to a .sql file but am getting a lot of syntax complaints from Postgres. What's the best way to do this?
For pretty much any conversion between two databases the options are:
1) Do a schema-only dump from the source database. Hand-convert it and load it into the target database. Then do a data-only dump from the source DB in the most compatible form of SQL dump it offers. Try loading that into the target DB. When you hit problems, script transformations to the dump using sed/awk/perl/whatever and try again. Repeat until it loads and the results match.
2) Like (1), hand-convert the schema. Then write a script in your preferred language that connects to both databases, SELECTs from one, and INSERTs into the other, possibly with some transformations of data types and representations (see the sketch after this list).
3) Use an ETL tool like Talend or Pentaho to connect to both databases and convert between them. ETL tools are like a "somebody else already wrote it" version of (2), but they can take some learning.
4) Hope that you can find a pre-written conversion tool. Heroku has one called sequel that will work for SQLite -> PostgreSQL, though it's not clear whether it is available without Heroku and able to function without all the other Heroku infrastructure and code.
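A bare-bones sketch of option 2 in Python (the table and column names are invented, and a real migration would need type mapping, batching, and error handling):

import sqlite3
import psycopg2  # assumes the psycopg2 driver is installed

# Connect to the local SQLite file and the target Postgres database.
src = sqlite3.connect('tweets.db')
tgt = psycopg2.connect('dbname=pgdatabasename user=pgusername password=pgpassword host=localhost')

# Copy one table; repeat (or loop over the catalog) for the rest.
rows = src.execute('SELECT id, body, created_at FROM tweets').fetchall()
cur = tgt.cursor()
cur.executemany('INSERT INTO tweets (id, body, created_at) VALUES (%s, %s, %s)', rows)
tgt.commit()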
After any of those, some post-transfer steps, like using setval() to initialize sequences, are typically required.
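For example, something along these lines for each serial column (table and column names are placeholders):

SELECT setval(pg_get_serial_sequence('tweets', 'id'), coalesce(max(id), 1)) FROM tweets;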
Heroku's database conversion tool is called sequel. Here are the ruby gems you need:
gem install sequel
gem install sqlite3
gem install pg
Then this worked for me for a sqlite database file named 'tweets.db' in the current working directory:
sequel -C sqlite://tweets.db postgres://pgusername:pgpassword@localhost/pgdatabasename
PostgreSQL supports "foreign data wrappers", which allow you to directly access any data source through the DB, including sqlite. Even up to automatically importing the schema. You can then use create table localtbl as (select * from remotetbl) to get your data into the actual PG storage.
https://wiki.postgresql.org/wiki/Foreign_data_wrappers
https://github.com/pgspider/sqlite_fdw
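A rough sketch with sqlite_fdw (the extension has to be built and installed on the Postgres server first; the file path, schema, and table names are placeholders, and the exact options are described in the sqlite_fdw README):

CREATE EXTENSION sqlite_fdw;
CREATE SERVER sqlite_server FOREIGN DATA WRAPPER sqlite_fdw
    OPTIONS (database '/path/to/tweets.db');
CREATE SCHEMA staging;
IMPORT FOREIGN SCHEMA public FROM SERVER sqlite_server INTO staging;
CREATE TABLE tweets AS (SELECT * FROM staging.tweets);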