What is SQLFluff
SQLFluff is an Open Source Software (OSS) that automatically identifies and rectifies issues in your SQL format. It supports various SQL dialects (grammar), making it an effective and flexible tool for diverse SQL environments.
You can access SQLFluff and its latest updates from the official Github repository:
Unlike other languages, SQL does not have many specific writing guidelines. However, the trend of non-IT professionals, such as salespeople and marketers, writing SQL to analyze data is rapidly increasing. As more and more people use SQL to promote data analysis, the SQL queries generated will vary widely in writing style. Using SQLFluff to semi-forcedly unify the way queries are written can be seen as a good attempt to maintain a standard and enhance the efficiency of data analysis.
Installation
To install SQLFluff, you need to have Python3 installed on your system.
You can install SQLFluff using pip, which is a package manager for Python. Simply enter the following command:
$ pip3 install sqlfluff
Using SQLFluff
SQLFluff provides two main functionalities: lint
and fix
.
Lint
The lint
is designed to read SQL queries and highlight any issues found.
Consider the following SQL query:
SELECT
ABC, XYZ
froM
TMP
To use the lint
, input the SQL query as follows:
$ sqlfluff lint example.sql --dialect=bigquery
== [example.sql] FAIL
L: 1 | P: 1 | LT01 | Expected only single space before 'SELECT' keyword.
| Found ' '. [layout.spacing]
L: 1 | P: 1 | LT02 | First line should not be indented.
| [layout.indent]
L: 1 | P: 1 | LT13 | Files must not begin with newlines or whitespace.
| [layout.start_of_file]
L: 1 | P: 8 | LT09 | Select targets should be on a new line unless there is
| only one select target.
| [layout.select_targets]
L: 2 | P: 1 | LT02 | Expected indent of 4 spaces.
| [layout.indent]
L: 3 | P: 3 | CP01 | Keywords must be consistently upper case.
| [capitalisation.keywords]
L: 4 | P: 1 | LT02 | Expected indent of 4 spaces.
| [layout.indent]
All Finished π π!
The lint
will output several lines of text, each line identifying a specific issue in the SQL query.
In this output, L
indicates the line number, and P
indicates the position of the issue in that line. Next to these, you will see an error code like LT01
, LT02
, etc., followed by a message describing the identified issue.
For a more detailed understanding of the error codes returned by SQLFluff, refer to the official documentation:
dialect
You can choose from various dialects when using SQLFluff. Use the following command to list available dialects:
$ sqlfluff dialects
==== sqlfluff - dialects ====
ansi: ansi dialect [inherits from 'nothing']
athena: athena dialect [inherits from 'ansi']
bigquery: bigquery dialect [inherits from 'ansi']
clickhouse: clickhouse dialect [inherits from 'ansi']
databricks: databricks dialect [inherits from 'sparksql']
db2: db2 dialect [inherits from 'ansi']
duckdb: duckdb dialect [inherits from 'postgres']
exasol: exasol dialect [inherits from 'ansi']
greenplum: greenplum dialect [inherits from 'postgres']
hive: hive dialect [inherits from 'ansi']
materialize: materialize dialect [inherits from 'postgres']
mysql: mysql dialect [inherits from 'ansi']
oracle: oracle dialect [inherits from 'ansi']
postgres: postgres dialect [inherits from 'ansi']
redshift: redshift dialect [inherits from 'postgres']
snowflake: snowflake dialect [inherits from 'ansi']
soql: soql dialect [inherits from 'ansi']
sparksql: sparksql dialect [inherits from 'ansi']
sqlite: sqlite dialect [inherits from 'ansi']
teradata: teradata dialect [inherits from 'ansi']
tsql: tsql dialect [inherits from 'ansi']
Customizing SQLFluff Rules
SQLFluff allows detailed customization of its rules. You can configure all rules, individual rules, and even the scope of rule application.
The rule settings can be changed by writing your custom configuration in one of the following files:
setup.cfg
tox.ini
pep8.ini
.sqlfluff
pyproject.toml
For instance, you can configure the .sqlfluff
file as follows:
[sqlfluff]
dialect = bigquery
exclude_rules = L036
In this case, all rules except for rule L036
will be applied.
fix
While the lint
simply points out the issues in your SQL queries according to the set rules, the fix
goes a step further. Not only does it identify the problems, but it also attempts to correct them automatically.
To use the fix
, use the following command:
$ sqlfluff fix example1.sql --dialect=bigquery
==== finding fixable violations ====
== [example.sql] FAIL
L: 1 | P: 1 | LT01 | Expected only single space before 'SELECT' keyword.
| Found ' '. [layout.spacing]
L: 1 | P: 1 | LT02 | First line should not be indented.
| [layout.indent]
L: 1 | P: 8 | LT09 | Select targets should be on a new line unless there is
| only one select target.
| [layout.select_targets]
L: 2 | P: 1 | LT02 | Expected indent of 4 spaces.
| [layout.indent]
L: 3 | P: 3 | CP01 | Keywords must be consistently upper case.
| [capitalisation.keywords]
L: 4 | P: 1 | LT02 | Expected indent of 4 spaces.
| [layout.indent]
==== fixing violations ====
6 fixable linting violations found
Are you sure you wish to attempt to fix these? [Y/n]
When you execute the command, SQLFluff will first find any violations of the rules. If it finds any, it will ask for confirmation before attempting to fix them. To proceed with the fixes, respond with Y
.
Attempting fixes...
Persisting Changes...
== [example.sql] FIXED
Done. Please check your files to confirm.
All Finished π π!
Once you confirm the fixes, SQLFluff will automatically make the necessary changes to your SQL query.
SELECT
ABC,
XYZ
FROM
TMP
Just like the lint
, you can also customize the rules for the fix
according to your needs.
References