Print SQL statement created by DBI::dbBind - r

I want to print the SQL syntax created with DBI::dbBind while building a safe parametrized query:
conn <- #create connection
stmt <- "select * from dbo.mytable where mycolumn = ?"
params <- list("myvalue")
query <- DBI::dbSendQuery(conn, stmt)
DBI::dbBind(query, params) # how to print the created SQL syntax?
In the last line the SQL syntax is created. How can I view it?

I'll formalize my comment into an answer.
"Binding" does not change the query, it merely "augments" a query with objects that are treated solely as data, vice intermingled with code.
The latter intermingling can be problematic for a few reasons:
Malevolent code deploying "SQL Injection" (https://xkcd.com/327/, and a wiki explanation). In short, it takes advantage of closing out a quoted string or literal and executing SQL code directly, perhaps to delete or modify data, perhaps to extract data.
Innocently enough, if the "data" you add to the query contains quotes that are not escaped properly, you could inadvertently perform your own SQL injection, though it is perhaps less likely to do the "Bad Things" that #1 would do.
Optimization of SQL code. Most DBMSes will analyze and optimize a SQL query so that it performs better, takes advantage of keys, etc. They often remember queries, so that a repeated query does not need to be re-analyzed, saving time. If you intermingle data/parameters in with the raw SQL query text, then when you change one thing about it (even just one digit in a parameter), the DBMS will likely need to re-analyze the query. Inefficient.
There are functions that facilitate escaping or quoting literals and strings, and if you feel you must put literals in your SQL query, then I urge you to use them. These include (but are not limited to) DBI::dbQuoteString, DBI::dbQuoteLiteral, and DBI::dbQuoteIdentifier.
Another such function is glue::glue_sql, which handles correct quoting/escaping of literals and identifiers and "make[s] constructing SQL statements safe and easy" (quoted from the GitHub repo). This is "just" string interpolation, so while it should protect you just fine against #1 and #2 above, it does not necessarily give you the benefit of #3.
(It only takes one mis-quoting to remind you which is used where for your particular DBMS.)
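For illustration (not part of the original answer), here is a minimal sketch of what those quoting helpers return; the connection is an in-memory RSQLite one chosen purely as an example, and other drivers may quote slightly differently:
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")  # example backend only
DBI::dbQuoteString(con, "it's a value")      # doubles the inner quote: 'it''s a value'
DBI::dbQuoteLiteral(con, 42L)                # renders the literal: 42
DBI::dbQuoteIdentifier(con, "Petal.Width")   # quotes the identifier, e.g. `Petal.Width`
DBI::dbDisconnect(con)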
For the record, binding is rather simple, as provided in its documentation:
iris_result <- dbSendQuery(con, "SELECT * FROM iris WHERE [Petal.Width] > ?")
dbBind(iris_result, list(2.3))
results <- dbFetch(iris_result)
If you want, you can re-use the same iris_result (at least with some DBMSes; not tested on all), as in
# same iris_result as above
dbBind(iris_result, list(2.5))
dbFetch(iris_result)
dbBind(iris_result, list(3))
dbFetch(iris_result)
dbBind(iris_result, list(3.2))
dbFetch(iris_result)
As many times as you need, ultimately finishing with
DBI::dbClearResult(iris_result)
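A hedged addition to the answer above: because binding never rewrites the statement, there is no fully interpolated SQL string to print; the closest you can get is the statement text attached to the result, which DBI exposes via dbGetStatement() and which still shows the ? placeholder before and after binding.
# Sketch: inspect the statement attached to a pending result (before dbClearResult)
res <- DBI::dbSendQuery(con, "SELECT * FROM iris WHERE [Petal.Width] > ?")
DBI::dbGetStatement(res)   # "SELECT * FROM iris WHERE [Petal.Width] > ?"
DBI::dbBind(res, list(2.3))
DBI::dbGetStatement(res)   # unchanged: the parameters are sent separately
DBI::dbClearResult(res)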

Related

Do you need parameterized SQL searches if you check the inputs?

I'm writing an R Shiny/SQLite app. In the app, I have a function that returns a column from one of the tables in my SQLite database, with the function taking the table name as an input.
Before sending the query to SQLite, the function checks that the table name equals one of the table names that the user is allowed to access. However, I am not using a parameterized query, because the term I'm changing is not a variable used for comparison but the name of the table to extract information from. (There might be a way to make this work anyway with a parameterized search, I suppose.)
My question is whether this is safe from an SQL injection? Can the query be altered on its way from the server to the database, or only from an alteration in the ui input to the server?
(Bear with me, I am new to SQLite.)
Assuming your query is being concatenated as follows:
tbl <- "yourTable"
sql <- paste0("select * from ", tbl, " where some_col = 1")
Then there should be no chance of SQL injection, provided you check the incoming table name and verify that it matches a table name in your whitelist. Note that this step is critical to keeping things safe. Let's say you didn't sanitize the incoming table name. Then consider this:
tbl <- "yourTable; delete from yourTable"
This would result in the following query being submitted for execution:
select * from yourTable; delete from yourTable where some_col = 1;
Assuming your SQLite driver allows multiple SQL statements to execute, the above hack/trick might end up deleting data from a large portion of one of your tables.
So, your approach should be safe provided that you check the table name. Note that strictly speaking the table name itself is not a parameter in a parameterized query. Rather, only the literal values in the query are parameters.
SQL query parameters cannot be used in place of a table name anyway, so comparing the table name to a list of known authorized tables is your only option.
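A minimal R sketch of that whitelist check (the table names, connection, and function are hypothetical, not from the question):
allowed_tables <- c("patients", "visits")        # hypothetical whitelist
get_rows <- function(con, tbl) {
  if (!tbl %in% allowed_tables) stop("table not allowed: ", tbl)
  sql <- paste0("SELECT * FROM ", DBI::dbQuoteIdentifier(con, tbl),
                " WHERE some_col = 1")
  DBI::dbGetQuery(con, sql)
}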
Yes, it is safe. If you're in control of the set of values that can be interpolated into the SQL query, then you can prevent SQL injection.
Note that some other elements of SQL queries cannot be parameters:
Any identifier, e.g. a table name, column name, or schema name.
Expressions.
Lists of values in an IN ( ... ) predicate; use one parameter per value in the list (see the sketch below).
SQL keywords.
A query parameter can be used only in place of a single scalar value. That is, where you would use a quoted string literal, quoted date literal, or numeric literal.
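A hedged DBI sketch of the one-parameter-per-value approach for an IN ( ... ) list; the connection, table, and column are placeholders:
vals <- c("a", "b", "c")
placeholders <- paste(rep("?", length(vals)), collapse = ", ")
sql <- paste0("SELECT * FROM mytable WHERE mycolumn IN (", placeholders, ")")
res <- DBI::dbSendQuery(con, sql)   # one ? per value, values bound as data
DBI::dbBind(res, as.list(vals))
out <- DBI::dbFetch(res)
DBI::dbClearResult(res)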
SQL injection is only a problem with user input. Nothing happens to the query on its way from the server to the database (well, malware could in theory alter it, but then even a parametrized query wouldn't help).
I.e., if you create a SQL string like this (C#):
sql = "SELECT * FROM " + tableName;
Then a user might enter a tableName like
MyTable; DROP TABLE MyTable
Guess what happens.
So, if you check the table name, you are on the safe side.

Do I need to worry about SQL injections when using dplyr in R to import csv into SQLite database?

I have a database in R that uses RSQLite. There is one table in this database, and I want the user to be able to import .csv files to add data to that table. If I'm using dplyr to do this, do I need to worry about 'cleaning' the data in the spreadsheet to make sure none of it is going to screw with my database? Like, for example, making sure apostrophes don't interfere with the SQL query. Does dplyr take care of this? I am not very familiar with SQL so please bear with me.
You should be safe as long as you use parametrized statements in RSQLite (dplyr has nothing to do with this), i.e. the potentially dangerous data are passed as separate arguments and not put directly into the query text. RSQLite will make sure the values are properly escaped to avoid SQL injection.
The documentation provides examples of such queries, here with a delete action, but the same applies to insertions.
rs <- dbSendStatement(mydb, 'DELETE FROM iris WHERE "Sepal.Length" < :x')
## added: bind the actual data here:
dbBind(rs, param = list(x = 4.5))
dbGetRowsAffected(rs)
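For the insertion side of the question, the least error-prone route is usually to let DBI write the data frame for you, so no SQL text is built by hand at all; the file name below is a placeholder:
df <- read.csv("upload.csv", stringsAsFactors = FALSE)  # hypothetical CSV
DBI::dbAppendTable(mydb, "iris", df)                    # values are bound, not pasted
# equivalently: DBI::dbWriteTable(mydb, "iris", df, append = TRUE)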

How to use dynamic values while executing SQL scripts in R

My R workflow now involves dealing with a lot of queries (RPostgreSQL library). I really want to make code easy to maintain and manage in the future.
I started loading large queries from separate .SQL files (this helped) and it worked great.
Then I started using interpolated values (that helped) which means that I can write
SELECT * FROM table WHERE value = ?my_value;
and (after loading it into R) interpolate it using sqlInterpolate(ANSI(), query, my_value = "stackoverflow").
What happens now is I want to use something like this
SELECT count(*) FROM ?my_table;
but how can I make it work? sqlInterpolate() only interpolates safely by default. Is there a workaround?
Thanks
In ?DBI::SQL, you can read:
By default, any user supplied input to a query should be escaped using either dbQuoteIdentifier() or dbQuoteString() depending on whether it refers to a table or variable name, or is a literal string.
Also, on this page:
You may also need dbQuoteIdentifier() if you are creating tables or relying on user input to choose which column to filter on.
So you can use:
sqlInterpolate(ANSI(),
               "SELECT count(*) FROM ?my_table",
               my_table = dbQuoteIdentifier(ANSI(), "table_name"))
# <SQL> SELECT count(*) FROM "table_name"
sqlInterpolate() is for substituting values only, not other components like table names. You could use other templating frameworks such as brew or whisker.
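As noted near the top of this compilation, glue::glue_sql is another option; if I recall its interface correctly, wrapping a variable in backticks quotes it as an identifier rather than a literal (the output shown is indicative only and depends on the connection):
library(glue)
my_table <- "table_name"
glue_sql("SELECT count(*) FROM {`my_table`}", .con = con)
# e.g. <SQL> SELECT count(*) FROM "table_name"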

How to get around the ' (quote) problem when saving text using SQL

I've a textbox that I save to a SQL Database using ASP.NET C#.
The problem is using quotes, for instance with words like he's and it's. When I save it, I get an SQL error.
From experience, to get around the issue I used to use a replace command to find all occurrences of quotes and replace them with another character. Then, if I read back the previously saved text from the database, I'd replace again.
Is there a better way of doing this? Or do you still have to use this 'old' way?
Use parametrized queries.
Using replace is dangerous and can lead to unintentional SQL injection vulnerabilities.
Always use parameterized queries. Never concatenate strings together (with un-sanitised input information) to create SQL on the fly. You run the risk of being vulnerable to SQL-injection attacks.
Read the following guide carefully:
http://www.csharp-station.com/Tutorials/AdoDotNet/Lesson06.aspx
And when you understand the following comic strip, you are ready to code:
http://xkcd.com/327/
As the other answers point out, the correct answer here is to use parameterized queries.
However, this is sometimes not possible. In these cases, you're better off doubling the quotes:
String str = "he's";
str = str.Replace("'", "''");
There are two ways to get around this issue:
Use a parameterised query, e.g. in http://www.csharp-station.com/Tutorials/AdoDotNet/Lesson06.aspx - this will also help avoid basic SQL injection if you're worried about that
Create 2 new functions, one to double up every apostrophe, so that it's saved into the database correctly, and another to remove extra apostrophes when retrieving data from the database.
I would recommend the first method as it's a lot cleaner (and safer) in my opinion. If you're using C# for your ASP.NET app, you can do things like in the example below:
string clientName = "Test client";
SqlCommand sqlCommand = new SqlCommand("SELECT COUNT(*) FROM Client WHERE UPPER(ClientName) = UPPER(@ClientName)", connection);
sqlCommand.Parameters.AddWithValue("@ClientName", clientName);
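Since most of this compilation is R-centred, here is the equivalent parameterized pattern with DBI (the connection, table, and column are placeholders):
sql <- "INSERT INTO Notes (NoteText) VALUES (?)"
rs  <- DBI::dbSendStatement(con, sql)
DBI::dbBind(rs, list("he's and it's are handled by the driver"))  # no manual replace needed
DBI::dbClearResult(rs)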

Are Parameters really enough to prevent Sql injections?

I've been preaching both to my colleagues and here on SO about the goodness of using parameters in SQL queries, especially in .NET applications. I've even gone so far as to promise that they provide immunity against SQL injection attacks.
But I'm starting to wonder if this really is true. Are there any known SQL injection attacks that will be successful against a parameterized query? Can you, for example, send a string that causes a buffer overflow on the server?
There are of course other considerations to make to ensure that a web application is safe (like sanitizing user input and all that stuff) but now I am thinking of SQL injections. I'm especially interested in attacks against MsSQL 2005 and 2008 since they are my primary databases, but all databases are interesting.
Edit: To clarify what I mean by parameters and parameterized queries. By using parameters I mean using "variables" instead of building the sql query in a string.
So instead of doing this:
SELECT * FROM Table WHERE Name = 'a name'
We do this:
SELECT * FROM Table WHERE Name = @Name
and then set the value of the @Name parameter on the query / command object.
Placeholders are enough to prevent injections. You might still be open to buffer overflows, but that is a completely different flavor of attack from an SQL injection (the attack vector would not be SQL syntax but binary). Since the parameters passed will all be escaped properly, there isn't any way for an attacker to pass data that will be treated like "live" SQL.
You can't use functions inside placeholders, and you can't use placeholders as column or table names, because they are escaped and quoted as string literals.
However, if you use parameters as part of a string concatenation inside your dynamic query, you are still vulnerable to injection, because your strings will not be escaped but will be literal. Using other types for parameters (such as integer) is safe.
That said, if you're using user input to set the value of something like security_level, then someone could just make themselves an administrator in your system and have a free-for-all. But that's basic input validation, and has nothing to do with SQL injection.
No, there is still risk of SQL injection any time you interpolate unvalidated data into an SQL query.
Query parameters help to avoid this risk by separating literal values from the SQL syntax.
'SELECT * FROM mytable WHERE colname = ?'
That's fine, but there are other purposes of interpolating data into a dynamic SQL query that cannot use query parameters, because it's not an SQL value but instead a table name, column name, expression, or some other syntax.
'SELECT * FROM ' + @tablename + ' WHERE colname IN (' + @comma_list + ')'
    + ' ORDER BY ' + @colname
It doesn't matter whether you're using stored procedures or executing dynamic SQL queries directly from application code. The risk is still there.
The remedy in these cases is to employ FIEO as needed:
Filter Input: validate that the data look like legitimate integers, table names, column names, etc. before you interpolate them.
Escape Output: in this case "output" means putting data into a SQL query. We use functions to transform variables used as string literals in an SQL expression, so that quote marks and other special characters inside the string are escaped. We should also use functions to transform variables that would be used as table names, column names, etc. As for other syntax, like writing whole SQL expressions dynamically, that's a more complex problem.
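A small R sketch of that filter-input / escape-output idea, here for a user-chosen sort column plus a numeric filter (all names are hypothetical):
stopifnot(user_col %in% c("Sepal.Length", "Petal.Width"))          # filter input
min_width <- as.numeric(user_width); stopifnot(!is.na(min_width))  # filter input
sql <- paste0('SELECT * FROM iris WHERE "Petal.Width" > ? ORDER BY ',
              DBI::dbQuoteIdentifier(con, user_col))               # escape output
res <- DBI::dbSendQuery(con, sql)
DBI::dbBind(res, list(min_width))
out <- DBI::dbFetch(res)
DBI::dbClearResult(res)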
There seems to be some confusion in this thread about the definition of a "parameterised query".
SQL such as a stored proc that accepts parameters.
SQL that is called using the DBMS Parameters collection.
Given the former definition, many of the links show working attacks.
But the "normal" definition is the latter one. Given that definition, I don't know of any SQL injection attack that will work. That doesn't mean that there isn't one, but I have yet to see it.
From the comments, I'm not expressing myself clearly enough, so here's an example that will hopefully be clearer:
This approach is open to SQL injection
exec dbo.MyStoredProc 'DodgyText'
This approach isn't open to SQL injection
using (SqlCommand cmd = new SqlCommand("dbo.MyStoredProc", testConnection))
{
    cmd.CommandType = CommandType.StoredProcedure;
    SqlParameter newParam = new SqlParameter(paramName, SqlDbType.VarChar);
    newParam.Value = "DodgyText";
    .....
    cmd.Parameters.Add(newParam);
    .....
    cmd.ExecuteNonQuery();
}
Any SQL parameter of string type (varchar, nvarchar, etc.) that is used to construct a dynamic query is still vulnerable;
otherwise, the parameter type conversion (e.g. to int, decimal, date, etc.) should eliminate any attempt to inject SQL via the parameter.
EDIT: an example, where parameter @p1 is intended to be a table name
create procedure dbo.uspBeAfraidBeVeryAfraid ( @p1 varchar(64) )
AS
SET NOCOUNT ON
declare @sql varchar(512)
set @sql = 'select * from ' + @p1
exec(@sql)
GO
If @p1 is selected from a drop-down list, it is a potential SQL-injection attack vector;
if @p1 is formulated programmatically without the ability of the user to intervene, then it is not a potential SQL-injection attack vector.
A buffer overflow is not SQL injection.
Parametrized queries guarantee you are safe against SQL injection. They don't guarantee there aren't possible exploits in the form of bugs in your SQL server, but nothing will guarantee that.
Your data is not safe if you use dynamic SQL in any way, shape, or form, because the permissions must then be granted at the table level. Yes, you have limited the type and amount of injection attack possible from that particular query, but you have not limited the access a user can get if he or she finds a way into the system, and you are completely vulnerable to internal users accessing what they shouldn't in order to commit fraud or steal personal information to sell. Dynamic SQL of any type is a dangerous practice. If you use non-dynamic stored procs, you can set permissions at the procedure level, and no user can do anything except what is defined by the procs (except system admins, of course).
It is possible for a stored proc to be vulnerable to special types of SQL injection via overflow/truncation, see: Injection Enabled by Data Truncation here:
http://msdn.microsoft.com/en-us/library/ms161953.aspx
Just remember that with parameters you can easily store a string such as "); drop table users; --" as, say, a username if you don't have any validation policies. This in itself won't cause any harm, but you had better know where and how that data is used further on in your application (e.g. stored in a cookie and retrieved later on to do other things).
You can run dynamic SQL safely with parameters, for example:
DECLARE @SQL NVARCHAR(4000);
DECLARE @ParameterDefinition NVARCHAR(4000);
SELECT @ParameterDefinition = '@date varchar(10)';
SET @SQL = 'SELECT CAST(@date AS DATETIME) AS [Date]';
EXEC sp_executesql @SQL, @ParameterDefinition, @date = '04/15/2011';
