Is Enum Evil?

When I work on database designs, either on my own projects or as advisor to others, I often find people very reluctant to use an enum type for any columns. Now, I'm not about to advocate the gratuitous use of an enum column, they definitely have some pitfalls, but I think it is important to understand these rather than just shouting "enum evil!" if anyone should mention them.

There are cases where an enum is the correct choice for a particular type of data, so let's look at what an enum type actually is and does.


Enum Type Intentions

An enum column type is a column which allows only the specified data values. This means that if a particular column can only take certain values, for example 'art', 'music' or 'sport' then you can specify this when you create the table:

CREATE TABLE `teachers` (
  `id` INT(11) PRIMARY KEY AUTO_INCREMENT,
  `name` VARCHAR(50) DEFAULT NULL,
  `specialism` enum('art','music','sport') DEFAULT NULL
)

This means we can only add data into the specialism column which is exactly equal to one of those values. This makes these columns ideal for things that never change. However in my experience, never is relative!

Changing Possible Values of Enum

In order to change the allowed values of an enum column, we need to issue an alter table statement - and this is where it gets messy. Alter table actually creates a new table matching the new structure, copies all the data across, and then renames the new table to the right name. During this process, which takes longer in proportion to the size of the data, the table is locked. Completely. So your whole application is going to have to wait for that to finish before anything else can even read from the table! This is why alter table statements are always a bad idea on production.

So, if your enum column's allowed values could ever change, or if your data set is going to be any kind of large, or you're going to experience traffic on any scale, then the enum isn't the right data type. As you can see, this eliminates most use cases and is the reason why enum is usually avoided.

Alternative to Enum: Lookup Tables

In situations like the teacher table shown above, it is much better to move the data into a simple lookup table; now we need two tables, teachers and specialisms, and they look like this:

CREATE TABLE `teachers` (
     `id` INT(11) PRIMARY KEY AUTO_INCREMENT,
     `name` VARCHAR(50) DEFAULT NULL,
     `specialism_id` INT(11) NOT NULL );
 
CREATE TABLE `specialisms` (
  `specialism_id` INT(11) PRIMARY KEY AUTO_INCREMENT,
  `name` VARCHAR(255) DEFAULT NULL );

With this setup, we are much more able to handle the case where "never" actually turns out to happen. In the event that the data does need to change, we alter the content of the database but not the structure, which is much better practice and doesn't lead to potentially long table-level locks on our production database.

Does anyone have any further approaches for avoiding enums? Or situations where they are a great choice? Leave me a comment, I'm interested to hear the experiences of others on this topic.

5 thoughts on “Is Enum Evil?

  1. Another common misuse of the enum-type is trying to teach it boolean, by giving it the choice of "yes" or "no" (totally ignoring "maybe" btw). A better alternative besides the boolean type is to do it tiny(1)-style. Although there might be traps regarding the "0" vs. NULL situation, but I never figured that out and it never hindered me in any way.

  2. Personally, I use them when I'm very damn certain that it won't have more choices than I know about. Say gender, possible values will be unknown, male, female. This list will not change.

    Of course, there aren't that many times I have lists that will not change.

    • ... oops ...

      We also used in extensively in off-line applications where changing the table definition was unproblematic, one database, one user. It saved us 20+ tables (about 40% of total tables) due to a lot of switching between almost identical datatypes (shared structure, enum stating what type it was)

  3. If you intend to use the lookup table approach, I would recommend creating a foreign key constraint.

    Without a foreign key constraint, there's no validation against bogus specialism_id values in your teachers table.

    This way you don't lose out on the validation which an enum provides.

  4. KMB: Great comment about the tinyint as an alternative, remember though that you can validly put "2" in that column and confuse yourself!

    Emma: that's a great real-world use-case for enum, thanks for commenting :)

Leave a Reply

Please use [code] and [/code] around any source code you wish to share.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>