Column Naming Convention

I like lower case with underscores:  company_idfirst_name, telephone_nbr, booking_ref

This is Uncle Joe Celko’s convention as stated in SQL Programming Style (Morgan Kaufmann, 2005), and I accept his justification that Invoices.invoice_nbr  is more readable than Invoices.InvoiceNbr in a dense melee of Sql code.

However, pragmatically, PascalCase is more popular and I find I use it more commonly from choice.

Id/Key column Names

  • Use the singular version of the table followed by Id / Key / Nbr
  • e.g.  table Customers has column CustomerId .

There are a number of reasons for this:  having a bunch of columns all named ‘Id’ (as one popular approach has it) is never helpful.  Another reason is that it makes it practically impossible to search for references.   Searching for ‘CustomerId’ will find definite references; searching for ‘Id’ in a schema full of Ids  is useless.  The same applies to Name, Date, Cost etc used as column names.

Column Name Suffixes

The suffix is a type-and-content indicator.  Probably all columns with the same suffix will be of the similar data type and size and have similar constraints. This is reasonably easy to assert and check using a script or even by eye in a data dictionary.

Column Name Suffixes
Suffix Type Remarks
_abbrv short char or varchar an abbreviation for the item; normally for display where space is limited.
_flag char(1) Constrained to accept only uppercase Y or N.  (if an undefined value is required use _triflag)
_date any date type date or date and time
_id int numeric surrogate key – often the primary key. based on table name: table Companies has company_id
_name varchar proper name e.g. first_name. Use sparingly. name should never be used on its own for a column!
_key normally short varchar or char if natural key is of fixed length (e.g. ISO country codes) natural (normally textual) key. Normally unique except for versioned tables (i.e. tables with effective and obsolete dates).
_ref short char or varchar short textual reference – e.g. initials; company reference code.  Need not be unique.
_perc decimal(5 4) percentage scale 1.  (type specification allows maximum of 999.99%)
_token GUID a row identifier – e.g. case_guid. used for security in HTML links; etc
_triflag char(1) Constrained to accept only uppercase Y or  N or  U  (= Unknown).

 Alternative Conventions

Some other conventions that I have advocated or endured.

The Table-Prefix (c_name) Convention

This was the first convention I adopted (largely in MSAccess) and it has some attractive qualities.

  • All column names are prefixed by (normally) a single letter plus underscore.  Thus all columns in table tblCompany are prefixed with c_ e.g. c_idc_name.
  • Sometimes it isn’t possible to keep the single letter convention (what do you do once you have tblProperty, tblPerson, tblProduct?), so two or even three letter prefixes are acceptable: pn_FirstName etc.
  • Foreign key columns are named differently to the column they relate to.  E.g. tblPerson.pn_CompanyId is foreign key to tblCompany.c_Id.

Advantages

  1. Every column has a unique name.  If you don’t use fully qualified column names, adding a column can break a view if the column name already exists in one of the other tables referenced – or if someone has used Select *.
  2. It’s easy to see which table you’re referring to if you don’t use fully qualified column names:  c_id is the Company ID and p_id is the Product ID.
  3. You rarely have to alias similarly named columns:  pn_address is distinct from c_address.

It works satisfactorily in MS Access and is quick to use.   Working with recordsets in code it is clear which column is being referenced – you never see an anonymous rst!Address.  Because no aliasing is necessary there is no risk of inconsistent aliases: the column name you see in the code is the name it has in the table.

I wouldn’t choose this convention for a new database; however it has caused no problems in any database where I have used it.  With the advent of Sql Data Tools it is now possible to enforce fully qualified column names and prohibit Select * so database-unique column names are less important.  Intellisense in SSDT and SSMS makes full names (with table aliases) easier to write.  Neither of these are available for Sql in Access.

Prefixing keys with PK_ and FK_

I don’t think this helps.  The primary key is indicated in metadata. The naming convention is that it’s the singularised name of the table with a suffix. Customers.customer_id.  The foreign key is again indicated in metadata.  I don’t think confusing them is common.

I think it has disadvantages

  • It burns column properties into the column. Foreign keys might be created or dropped; the table name is unlikely to change.
  • When including the column in a view the output will be FK_customer_id or PK_customer_id – which you might be tempted to alias.  It’s not relevant to the output that the column is a FK or PK.  In fact, PK_customer_id would be confusing if it were not the PK of the rowset being returned.
  • Many to Many tables:  columns can be both PK and FK at the same time.  The conention just confuses here.
  • It obscures the fact that these columns represent the same attribute.

ORM-Generated Tables with Standard Column Names

I have worked with databases where every lookup table had exactly the same column names: twenty plus tables with the columns NameDescriptionKeyEffectiveDate, and so on.  (It was generated using ORM modelling in Visio for Enterprise Architects.  Which of us wouldn’t be flattered by the name of that program?  Unfortunately it put visual effects above good design – another bete noir; thankfully Microsoft dropped Visio for EA after 2005 and put its energies into the grown-up SSDT, which has no visuals at all!)

In addition to the lookup tables, most fact tables in the database also had a Name and Description column.  Almost every Select from more than one table would require the designer to alias columns  (or forget to – Sql Server  is happy to output two columns with the same name; applications are rarely happy to receive them!).

It seems obvious that there will be less confusion if a column name seen by the application is the same as in the table, and if a column always has the same name wherever it appears.    If a column regularly has to be aliased there is a high risk that different aliases will be used.  Why even raise the possibility?

  • Give a unique name to all columns that are likely to appear together in a Select statement.
  • Never use Name as a column name!  (there are a number of other prohibited names).

I’m happy for multiple tables to have effective_date and obsolete_date, for example, because these will normally only be used in where clauses, where full names will be enforced; they won’t often be returned for multiple tables in a view.

Naturally enough it was this database that suffered an ambiguous column reference following the addition of a column to a table.  This could never have happened with the Table-Prefix convention: perhaps that is a good reason to use it, in spite of the aversion to its Access appearance!