Golang Database Library Orm Example Sql Injection
We look over how normal and abnormal usage of these libraries and ORM helps protect (or not!) against SQL injection. Remember that we accept various query parameters when listing a resource, ‘users` records in particular. The result is limited to a maximum of 30 records. We are going to try to list all records in one resource by formulating sneaking a malicious query parameter in the request.
http GET http://localhost:3080/api/sqlx/user?favourite_colour=blue;select%20*%20FROM%20users;,--;
We know that favourite_colour
is a valid and accepted query parameter. We are going to see if
appending the url starting with an apostrophe, then a semicolon ;
, followed by a select all
(select * FROM users
), closing off with another semicolon, and an sql comment (--
) will work.
- apostrophe: Common way of closing off a string concatenation
- semicolon: SQL way of closing of the previous sql statement
select%20*%20FROM%20users;
select everything inusers
table.%20
is a space- double dash: SQL way of commenting out the rest of SQL queries.
This malicious URL is not the only way of trying to inject SQL queries. Others include using
AND 1=1
which means the query is always true, and the dreaded ';DROP DATABASE users--
which
mean adding a second SQL query to drop users
table. You can try out with different urls by running
the examples in the provided example/rest.http file.
The post is incomplete without the addition of this comic:
(from https://xkcd.com/327/)
There are two things we want to see how these libraries and ORMs behave. The first being if malicious sql gets injected, and second how do they behave and the output that gets returned.
Table of Contents
This is part of a series of blog posts including:
- Introduction
- Create - Create a single record
- List
- Get - Get a single record
- Update - Update a single field
- Delete - Delete a record
- One-To-Many - Eager load one to many relationship between two tables
- Many-To-Many - Eager load many to many relationships between two tables using a pivot table
- Dynamic List - Return a record with a list of names, sorting, and pagination
- Transaction - Handle transaction, rollback on error
-
SQL Injection - Try to list all users by using malicious query parameters
- and Conclusion
sqlx
Testing against sqlx is the most interesting amongst all libraries and ORMs because we have made some dodgy examples with string concatenations.
At the point of QueryxContext()
, the fullQuery is correctly formed but the arguments get mangled
with the attempted sql injection.
fullQuery:
SELECT * FROM users WHERE favourite_colour = $1 ORDER by id LIMIT 30 OFFSET 0
arguments[0]:
blue;select * from users;,--;
It gives of an error saying invalid input value for enum valid_colours: "blue;select * from users;,--;"
This error comes from the pgx
sql driver that we used.
rows, err := r.db.QueryxContext(ctx, fullQuery, arguments...)
if err != nil {
return nil, fmt.Errorf("error listing users: %w", err)
}
Stepping back at the handler layer in db/filters.go
file, the url.Values.Get("favourite_colour")
that we used when parsing query parameter simply return user’s input which why the first argument
also contains that string.
Fortunately sqlx returns an error although for a different reason,
invalid input value for enum valid_colours: "blue';select * from users;,--;"
Note that we can make a prepared statement like below but, it does nothing to prevent this type of sql injection because the malicious part is in the argument, instead of this sql query.
stmt, err := r.db.PrepareContext(ctx, fullQuery)
if err != nil {
return nil, err
}
// no error
sqlc
Curiously, sqlc returns neither a result nor an error at all. The first_name
filter is set to be
Bruce';select * FROM users;,--;
and sqlc returns no result and no error at all. We get an empty
array as a response.
http GET http://localhost:3080/api/sqlc/user?first_name=Bruce;select%20*%20FROM%20users;,--;
squirrel
GET http://localhost:3080/api/squirrel/user?favourite_colour=blue';select%20*%20FROM%20users;,--;
Just like sqlx, it returns an error saying invalid input. It simply does not accept such parameter and returns both an error and empty array result.
invalid input value for enum valid_colours: "blue';select * FROM users;,--;"
gorm
Like sqlx and squirrel, gorm also returns the same error coming from pgx
sql driver.
invalid input value for enum valid_colours: "blue;select * FROM users;,--;"
sqlboiler
Sqlboiler gives off a different error because it uses http://github.com/lib/pq library instead of http://github.com/jackc/pgx.
models: failed to assign all query results to User slice
It appears that the query did run, but failed when scanning the results to a struct. To find out
what query that has been generated, we switch on the debug mode with boil.DebugMode = true
boil.DebugMode = true
all, err := models.Users(mods...).All(ctx, r.db)
if err != nil {
return nil, err
}
sqlboiler prints out the following
SELECT * FROM "users" WHERE ("users"."favourite_colour" = $1) ORDER BY id;
[{blue;select * from users;,--; true}]
But this is not clear if the argument is executed in the database. The next thing we can do is to enable logging for our database, and we can check out its standard output result.
-- enable logging for database db_test
ALTER DATABASE db_test SET log_statement = 'all';
Watch the logs at standard output with docker logs -f db_container
and re-run the HTTP request.
2022-09-14 12:08:33.858 UTC [173403] ERROR: invalid input value for enum valid_colours: "blue;select * from users;,--;"
2022-09-14 12:08:33.858 UTC [173403] STATEMENT: SELECT * FROM "users" WHERE ("users"."favourite_colour" = $1) ORDER BY id;
It appears that the select * from users;
statement did not run in the database.
ent
Ent on the other hand silently return an empty array because it cannot find any match for the value
blue;select * FROM users;,--;
from our favourite_colour
column.
Sql injection vulnerability in a codebase should not happen anymore. There are many well known best practices to prevent this security vulnerability, among others is to always sanitise user’s input in the handler layer, where user’s input are validated. And secondly, to parameterize sql arguments instead of joining the strings manually by using placeholders, either with question marks (?) for mysql or dollar signs ($) for postgres.
Among all these behaviors, none executed the malicious injected sql query. In terms of error handling, I would prefer to receive an error when we receive invalid or malicious http request. The reason being that we can log the request in the error handler. So the worst are sqlc and ent where they do not return any error, but sqlx being the worse between the two because it returns the first 10 records nevertheless.
Conclusion
What initially started to have eight posts has become twelve over the course of a year. But I hope I have covered common use cases when interacting with a database.
We have seen how easy an ORM especially ent
and gorm
when it comes to eager-loading child
relationships. Writing code with ent
is much easier because of everything is types, so you benefit
from code hinting from IDE - as well as avoiding typo when dealing with magic strings in gorm.
There is a significant learning curve because they are another systems to learn. In my opinion, it is
worth it compared to writing many-to-many by hand using sqlx
or sqlc
, but not if you do not need
dynamic queries or eager-loading.
If you need dynamic queries, a pure query builder like squirrel
is excellent. Writing with it feels
close to writing a sql query. However, it doesn’t codegen so there are magic strings everywhere.
Also, usage of structs for equality and like are strange. sqlboiler
is better, and you gain
code completion hints because it knows your database schema. Unfortunately, it falls short at
many-to-many relationship.
Finally, we have seen how powerful raw SQL is. The examples I have given are rather simple but if you are required to write complex queries, you may find that it is hard to translate to an ORM, or worse, not supported at all. In many cases, you often test that actual query before converting because you want to make sure that it is working as intended. If you do not need dynamic query or eager-loading, why not just use that working raw query? Everyone already knows sql anyway, and you already know how to protect against sql injection, right?
For my first technical blog series, it took me a year of writing, researching, and re-writing many times before I finally settle and feel like I can publicise. I hope you find these posts useful as to they have to mine.