How do I join 3 tables in dBase?

I don't know anything about dBase, but I am trying to pull data out of a customer's old dBase database. It's a set of .DBF files. I am using ODBC and DBeaver to pull out the data, but when I join more than two tables I get an error.
Select *
from tableA
Left Join tableB on tableA.Key = tableB.Key
Left Join tableC on tableA.LINK = tableC.LINK
The error is:
SQL Error [37000]: [Microsoft][ODBC dBASE Driver] Syntax error
(missing operator) in query expression 'tableA.key = tableB.key LEFT
JOIN tableC ON tableA.LINK = tableC.LIN'.
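The Microsoft ODBC dBASE driver runs queries through the Jet engine, whose SQL dialect requires every join after the first to be wrapped in parentheses. A sketch of the query rewritten that way (untested against your .DBF files, but this is the syntax Jet expects):

SELECT *
FROM (tableA
LEFT JOIN tableB ON tableA.Key = tableB.Key)
LEFT JOIN tableC ON tableA.LINK = tableC.LINK

Each additional join nests one more pair of parentheses around the joins that precede it.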

Putting the first table column (ID) last, without specifying the other table columns

Background
I am using RStudio to connect R to Microsoft SQL Server Management Studio. I am reading tables into R as follows:
library(sqldf)
library(DBI)
library(odbc)
library(data.table)
TableX <- dbGetQuery(con, statement = "SELECT * FROM [dim1].[dimA].[TableX]")
This works fine for some tables. However, for most tables, which have a binary ID variable, the following happens:
TableA <- dbGetQuery(con, statement = "SELECT * FROM [dim1].[dimA].[TableA]")
Error in result_fetch(res@ptr, n) :
nanodbc/nanodbc.cpp:xxx: xxxxx: [Microsoft][ODBC SQL Server Driver]Invalid Descriptor Index
Warning message:
In dbClearResult(rs) : Result already cleared
I figured out that the problem is caused by the first column, which I can select like this:
TableA <- dbGetQuery(con, statement = "SELECT ID FROM [dim1].[dimA].[TableA]")
AlwaysLearning mentioned in the comments that this is a recurring problem (1, 2, 3). The query only works when ID is selected last:
TableA <- dbGetQuery(con, statement = "SELECT AEE, ID FROM [dim1].[dimA].[TableA]")
Updated Question
The question is essentially how I can read in the table with the ID variable last, without listing every column by name each time (because that would be unworkable).
Possible Workaround
I thought a workaround could be to select ID as an integer:
TableA <- dbGetQuery(con, statement = "SELECT CAST(ID AS int), COL2 FROM [dim1].[dimA].[TableA]")
However, how do I select the whole table in this case?
I am an SQL beginner, but I thought I could solve it by using something like this (from this link):
TableA <- dbGetQuery(con, statement = "SELECT * EXCEPT(ID), SELECT CAST(ID AS int) FROM [[dim1].[dimA].[TableA]")
Where I select everything but the ID column, and then the ID column last. However, SELECT * EXCEPT() is BigQuery syntax, not T-SQL, so SQL Server does not accept it.
Other links
A similar problem for Java can be found here.
I believe I have found a workaround that meets your requirements using a table alias.
By assigning the alias T to the table I want to query, I can select both a specific column (T.[ID]) and all columns of the aliased table, without having to name each column explicitly.
This returns all columns of the table (including the ID column) as well as a copy of the ID column at the end of the table.
I then remove the ID column from the resulting table.
This leaves you with the desired result: all columns of a table in the order that they appear with the exception of the ID column that is placed at the end.
PS: For the sake of completeness, I have provided a template of my own DBIConnection object. You can substitute this with the specifics of your own DBIConnection object.
library(sqldf)
library(DBI)
library(odbc)
library(data.table)
con <- dbConnect(odbc::odbc(),
.connection_string = 'driver={YourDriver};
server=YourServer;
database=YourDatabase;
Trusted_Connection=yes'
)
# Alias the table as T so that T.* and T.[ID] can be combined in one SELECT:
# every column in its original order, plus a copy of ID at the end.
dataframe <- dbGetQuery(con, statement = 'SELECT T.*, T.[ID] FROM [SCHEMA_NAME].[TABLE_NAME] AS T')
# Drop the first column, i.e. the original (problematic) position of ID.
dataframe_scoped <- dataframe[, -1]
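Since this has to happen for many tables, the pattern is easy to wrap in a small helper. A sketch, assuming every affected table's binary key is literally named ID and sits in the first position (both are assumptions; adjust if your schemas differ):

read_id_last <- function(con, qualified_name) {
  # Select every column plus a trailing copy of ID, then drop the leading ID.
  sql <- paste0('SELECT T.*, T.[ID] FROM ', qualified_name, ' AS T')
  dbGetQuery(con, sql)[, -1]
}

TableA <- read_id_last(con, '[dim1].[dimA].[TableA]')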

R with PostgreSQL database

I've been trying to query data from a PostgreSQL database (via pgAdmin) into R to analyse it. Most of the queries work, except when I write a condition specifically to filter out most of the rows. Please find the code below:
dbGetQuery(con, 'select * from "db_name"."User" where "db_name"."User"."FirstName" = "Mani" ')
Error in result_create(conn@ptr, statement) :
Failed to prepare query: ERROR: column "Mani" does not exist
LINE 1: ...from "db_name"."User" where "db_name"."User"."FirstName" = "Mani"
^
This is the error I get. Why is it treating Mani as a column when it is just a value? Can someone please assist me?
String literals in Postgres (and most flavors of SQL) take single quotes. The mixed-case table and column names still need their double quotes, which you escape inside the R string. That leaves us with this:
sql <- "select * from \"db_name\".\"User\" u where u.\"FirstName\" = 'Mani'"
dbGetQuery(con, sql)
Note that I introduced a table alias, u, for the User table, so that we don't have to repeat the schema-qualified name in the WHERE clause.
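As a side note, you can sidestep literal-quoting problems entirely by binding the value as a query parameter. A sketch, assuming a DBI backend that supports $1 placeholders (RPostgres does):

sql <- 'select * from "db_name"."User" u where u."FirstName" = $1'
dbGetQuery(con, sql, params = list("Mani"))

This also keeps the query safe if the name ever comes from user input.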

Querying mixed case columns in SQL with R

I have a mixed case column in my_table that can only be queried using double quotes in psql. For example:
select "mixedCase" from my_table limit 5; would be the correct way to write the query in psql, and this returns records successfully
However, I am unable to replicate this query in R:
I have tried the following:
dbGetQuery(con, "SELECT '\"mixedCase\"' from my_table limit 5;")
which throws: RS-DBI driver warning: (unrecognized PostgreSQL field type unknown (id:705) in column 0)
dbGetQuery(con, "SELECT 'mixedCase' from my_table limit 5;")
which throws: RS-DBI driver warning: (unrecognized PostgreSQL field type unknown (id:705) in column 0)
dbGetQuery(con, "SELECT "mixedCase" from my_table limit 5;")
which throws Error: unexpected symbol in "dbGetQuery(con, "SELECT "mixedCase"
What is the solution for mixed case columns with the RPostgreSQL package?
You seem to understand the problem, yet you never actually tried just using the literal correct query in R. Just escape the double quotes in the query string and it should work:
dbGetQuery(con, "SELECT \"mixedCase\" from my_table limit 5;")
Your first two attempts fail because you are passing in mixedCase as a string literal, not as a column name. The third attempt fails on the R side because the unescaped double quotes terminate the string early, leaving broken code.
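Alternatively, you can let DBI build the quoting for you. A sketch using dbQuoteIdentifier, assuming a DBI-compliant connection object:

library(DBI)
col <- dbQuoteIdentifier(con, "mixedCase")  # the column name, quoted for this backend
dbGetQuery(con, paste0("SELECT ", col, " FROM my_table LIMIT 5;"))

dbQuoteIdentifier applies the identifier-quoting rules of the connected database, so you never have to hand-escape the quotes.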

Using SQL commands in Python

I used the following code in IPython to get some information from a database table in the form of a pandas dataframe:
import sqlite3
import pandas as pd

con = sqlite3.connect('-----.db')
a = pd.read_sql('SELECT * FROM table1', con)  # table1 as a dataframe
c = con.cursor()  # cursor for executing raw SQL statements
I now have table1 as a dataframe named a. However, I need to carry out a number of inner joins between different tables from the database. How can I use SQL commands on these dataframes within IPython? I tried c.execute('''sql command for inner join''') but the error says that the dataframes mentioned are not tables.
Any help?
Just write the full SQL command directly in read_sql. The join runs inside the database, so it references the database tables themselves; SQLite cannot see the dataframes you created in Python.
sql = """
select col1 from
tablea inner join tableb
on tablea.col2 = tableb.col2
where tablea.col3 < 10
limit 10
"""
a = pd.read_sql(sql, con)

Teradata - duplicate column error

I want to create a volatile table in Teradata.
In the SELECT statement I am using multiple columns from different tables.
However, some of the columns in the different tables have the same names,
so I am getting a duplicate column error.
The question is: is there any workaround to bypass this error?
Is it possible to add, for example, the table name to the column name?
This is how my code looks:
CREATE MULTISET VOLATILE TABLE test
AS (
SEL *
FROM Table_A Left JOIN Table_B
...
)
WITH DATA
ON COMMIT PRESERVE ROWS
Instead of doing a SELECT *, select individual column names and put aliases next to them. This will bypass the error.
A bare SELECT * only works when you're reading from a single table. If you're retrieving all of the data from multiple tables, you have to spell that out in your SELECT statement:
CREATE MULTISET VOLATILE TABLE test AS
(
SELECT Table_A.*
, Table_B.*
FROM Table_A
LEFT JOIN Table_B ON ...
...
)
WITH DATA PRIMARY INDEX(«PI»)
ON COMMIT PRESERVE ROWS
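If both tables' clashing columns need to survive into the volatile table, alias them explicitly rather than using the qualified wildcards. A sketch with hypothetical column names (id, name, amount):

CREATE MULTISET VOLATILE TABLE test AS
(
SELECT a.id AS a_id
, a.name
, b.id AS b_id
, b.amount
FROM Table_A a
LEFT JOIN Table_B b ON a.id = b.id
)
WITH DATA PRIMARY INDEX(a_id)
ON COMMIT PRESERVE ROWS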
