Статьи

Как правильно отформатировать код SQL

Запись запросов в базу данных требует знания синтаксиса SQL, но это не все, что вам следует знать. Лучшие практики для написания профессионального кода SQL требуют хороших навыков форматирования. В этой статье я обсуждаю, почему это так важно, и основные правила, которым вы должны следовать.

Почему стоит форматировать код SQL?

Начинающие программисты SQL часто не уделяют много внимания форматированию своего кода. Если вы думаете, что форматирование — это то, что можно безопасно игнорировать, посмотрите на код ниже:

SELECT id, FirstName, LASTNAME,c.nAme FROM people p left JOIN cities AS c on c.id=p.cityid;

Этот SQL-запрос выше был написан без использования каких-либо правил форматирования. Теперь сравните это с форматированным запросом ниже, который представляет собой тот же код:

   SELECT p.PersonId,
          p.FirstName,
          p.LastName,
          c.Name
     FROM Person AS p 
LEFT JOIN City AS c 
          ON p.CityId = c.CityId;

Вы видите разницу? Что является более читабельным? Какой запрос легче понять?

Конечно, очевидно, что первый запрос не очень легко читается. Помимо этого, также является проблемой быстрое внесение изменений в этот код. Если вы хотите сравнить этот запрос с другим подобным запросом, это будет непростой задачей. Второй запрос совершенно другой, даже если это точно такой же код — его легко прочитать, его легко исправить, и его будет легко сравнить с другим хорошо отформатированным кодом. Правильное форматирование кода SQL помогает программистам избежать ошибок.

Хорошо, теперь вы понимаете, почему форматирование кода SQL может быть хорошей идеей. Теперь пришло время научиться делать это.

Как отформатировать код SQL

Есть разные способы подойти к форматированию кода. Некоторые программисты SQL имеют индивидуальные стили и предпочтения для форматирования запросов SQL. Они имеют опыт программирования и следуют удобным для них правилам. Это неплохо, если вы работаете только над своими проектами, но что, если вы работаете вместе с другими сотрудниками?

Working in a team would be problematic if every programmer were to write code using their own individual style. The code would represent a mixture of rules in one project. The solution would be to lay out a set of principles for the entire team. But then what if the code has to be read or corrected by people outside of the company? The best solution, in general, is to follow the SQL Formatting Standard. There is not one official document about it, but there are some generally agreed upon standards and good practices written by experts in SQL. In addition, there are many tools that help to format SQL code that are based on this standard. In this guide, we discuss common and popular rules that are based on this standard.

Naming Objects

First, I discuss general rules about naming database objects. These are the most common rules:

  • Avoid the name of a table/column in the plural. It is better to use employee instead of employees
  • If the name of the table or column must consist of more than one word, use an underscore to connect them, for example employee_city. Some professionals prefer to use what is called CamelCase style instead, for example EmployeeCity. The preferred style is different for different relational database systems
  • Check that the name is not already used as a keyword in SQL
  • If the name is the same as an SQL keyword, enclose the name within quotation marks
  • The name of an object in a database for a table or a column should be unique and not too long. Avoid special characters in the name like $, &, * , etc. (use only letters, numbers, and underscores)
  • Use an underscore in a name only if necessary
  • Don’t start the name with an underscore
  • Use comments only if necessary
  • Avoid abbreviations, but, if you do use them, be sure that they will be understood
  • Avoid giving the same name to both a table and a column
  • Use the same naming rules for aliases for columns or tables
  • Include the AS keyword for creating aliases, because this makes the code more readable
  • For the primary key column avoid the name id. A good idea is to combine id with the name of a table, for example: id_employee

Alignment

Most experts recommend first writing keywords on a new line to the left and then the rest of the code to the right, like this:

SELECT p.PersonId,
       p.FirstName,
       p.LastName,
       c.Name
  FROM Person AS p 
  JOIN City AS c 
    ON p.CityId = c.CityId;

Indentation

The liberal use of newlines can really help the readability of an SQL query. It is a good idea to use a new line for each separate query and to use a new line for each separate column after a comma. Similarly, it is a good idea to use spaces to surround the equals operator, to use spaces before or after apostrophes, and to use a space after a comma.

The sections below present more details about good practices in indentation in different types of SQL queries.

Commenting

Avoid writing too many comments in the code. Of course, there are cases where comments are necessary, but it is usually better to use multiple-line comments that are indicated by /* opening and */ closing characters. It is recommended to write this type of comment at the start of a new line, instead of starting on a line with code that is executed. The comment should be written above the relevant SQL code line, using the same indentation. For example:

SELECT p.PersonId,
       p.FirstName,
       p.LastName,
       /* Name column is the name of the city: */
       p.Name,
  FROM Person AS p 
 WHERE p.Name = 'New York';

In SQL code it is also possible to add one line comments. This type of comment is indicated by a double hyphen (--) at the beginning of the comment text. All text after these characters is treated as a comment.

SELECT -- we have to delete this column p.PersonId,
       p.FirstName,
       p.LastName,
       p.Name
  FROM Person AS p;

SELECT Queries

In this type of query SELECT is the first word in the command. If there are many columns after SELECT, it is better to separate them by placing each on a separate line. Each new line should be indented. Make sure to place commas at the end of the line and not at the beginning of the line.

SELECT p.PersonId,
       p.FirstName,  
       p.LastName,
       c.Name
  FROM Person AS p;

For the keywords FROM, WHERE, ORDER BY, GROUP BY, and HAVING, write each on a new line without indentation.

SELECT p.PersonId,
       p.FirstName,
       p.LastName,
       p.Name,
  FROM Person AS p 
 WHERE p.Name = 'New York';

If the WHERE statement has more than one condition, separate each condition by a new line that is indented, and use a new indented line with the AND or OR conditional operators within the WHERE statement.

SELECT p.PersonId,
       p.FirstName,
       p.LastName,
       p.Name
  FROM Person AS p 
 WHERE p.Name = 'New York'
    OR p.Name = 'Chicago';

JOIN Statements

If you join tables, use new lines for the operators INNER JOIN, LEFT JOIN, etc. For the ON operator, write a new indented line within the JOIN statement. If, however, there is more than one condition, use a new indented line before the AND or OR conditional operator.

SELECT p.PersonId,
       p.FirstName,
       p.LastName,
       c.Name
  FROM Person AS p 
  JOIN City AS c 
    ON p.CityId = c.CityId;

A Long and Nested SQL Query

Long queries sometimes contain subqueries. In this situation the subquery should be on a new indented line.

For the CASE structure place each WHEN and END on a new line.

SELECT p.PersonId,
       p.FirstName,
       p.LastName,
       CASE
         WHEN p.Age < 18 THEN 'below 18'
         WHEN p.Age >= 18 THEN '18 or more'
       END AS Age
  FROM Person AS p;

Other Types of SQL Queries

There are similar rules for queries that modify, insert, or delete data.

Use indent for VALUES in insert queries:

INSERT INTO Car(id_car, name, year) VALUES
  (1, 'Audi', 2010) ; 

In the case where you insert more rows in one query, write every row as a new line with indentation:

INSERT INTO Car(id_car, name, year) VALUES
  (1, 'Audi', 2010) ,
  (2, 'Skoda', 2015) ; 

In a similar way, in an UPDATE query use SET and WHERE as in a SELECT statement, with a new line without indentation:

UPDATE Car
SET year = 2012
WHERE Name = 'Audi';

or in a DELETE query:

DELETE FROM Car
WHERE Name = 'Audi'; 

How Bad Formatting of SQL Code Leads to Problems

One example of how poor formatting leads to problems can be seen in the query below, in which commas are placed at the beginning of each line:

SELECT /* we have to delete this column */ p.PersonId
     , p.FirstName
     , p.LastName
     , p.Name
  FROM Person AS p 
 WHERE p.Name = 'New York';

This might make sense at first, but if you comment out the first column in order to remove it, then the query would return an error.

Another error can occur if you don’t use indentation and new lines. For example:

Select person.id_person, person.name, person.age, person.description, person.city from person  where person.age>20 and person.city = ( select name from city where id_city>20)

In this poorly formatted code, it would be very easy by mistake to delete the WHERE clause in the subquery when you intended to delete the WHERE clause in the main query.

Many problems will be easy to find if the query is properly formatted, especially in a long query with hundreds of lines of code.

Summary

Ignoring the SQL formatting standard can cause significant problems when you are collaborating with other programmers. Proper formatting helps your SQL code be easier to read and helps prevent errors when making changes to your code.

In this article, I presented some of the rules recommended by experts that can help you in writing clearer code. Writing beautiful SQL code is a good working habit valued by employers. Your code indicates your level of professionalism and shows that you take a modern, serious approach to work. Rise to the challenge and become a more professional programmer!