Separating configuration from data lowers total cost of ownership

All software systems require configuration.  Some require much more configuration than others.  Configuration can be an overloaded term, but I'm speaking about the information necessary for the software to function correctly.

Without this information, the software is unstable and will not work.  For most business systems, a common configuration item is the database connection string.  Without this configuration item, the system cannot function.  Another common configuration item is the list of valid states or provinces for addresses.  Often the local state is defaulted to the top as a shortcut for the users.  Another example is a list of statuses, such as Pending, Active, Complete.  As usual, requirements should drive the approach taken.

Let's talk about the Status list first.  Where should this configuration be managed?  The simplest approach is to maintain the status list in the software itself.  The tests around the software build will verify this configuration, and if the software has logic built around these values, the compiler verifies that the important values are present.  If the Status list changes less than once per week, this is a fine approach, since a system under enhancement might have weekly deployments anyway.  (Note:  Using agile development, systems are easy to deploy, and our current projects get deployed many times per day to multiple environments.  Without an automated deployment, deployments are much more costly.)
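As a minimal sketch of the "in the code" option, here is the Status list as a Python enum.  The post does not name a language, so Python is an assumption, and `Status`, `can_close`, and `status_names` are hypothetical names used only for illustration.  Python has no compile step, so here a unit test or a static type checker would play the role the compiler plays above.

```python
from enum import Enum


class Status(Enum):
    """The status list lives in the codebase (values taken from the post)."""
    PENDING = "Pending"
    ACTIVE = "Active"
    COMPLETE = "Complete"


def can_close(status):
    """Hypothetical business rule built around one specific status value."""
    return status is Status.ACTIVE


def status_names():
    """Display names in declaration order, e.g. for a drop-down list."""
    return [s.value for s in Status]
```

Because the rule refers to `Status.ACTIVE` directly, removing or renaming that member breaks the rule wherever it is exercised, which is the kind of early failure a build with tests is meant to catch.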

Let's consider a list of states.  For a sales system, we might have 5 states if we process orders for only those 5 regions.  Our state list would have those 5 states, with the local state defaulted to the top since it is the most common.  Again, we start by storing this configuration with the codebase, especially since the system is unstable without the local state.  If the list of states changes more frequently than weekly, we might want to change the list without requiring a deployment.  This would lead us to store the configuration separately from the codebase.  Options include putting the configuration in a file or in the database.  The benefit of keeping the configuration in a file is that it can change while the system is running.  The system will have to be able to detect the change and reload the configuration, however.
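One way to detect a change and reload is to compare the file's modification time on each read.  The sketch below assumes a JSON file holding a flat list of state codes; the class name `StateListFile`, the file layout, and the example codes are all hypothetical, and a real system might prefer a file-watching facility instead of polling.

```python
import json
import os


class StateListFile:
    """Loads a state list from a JSON file and reloads it when the file changes."""

    def __init__(self, path, default_state):
        self.path = path
        self.default_state = default_state
        self._mtime_ns = None  # modification time of the version we last loaded
        self._states = []

    def states(self):
        """Return the current list with the local (default) state first."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self.path, encoding="utf-8") as f:
                loaded = json.load(f)
            # The system is unstable without the local state, so fail loudly.
            if self.default_state not in loaded:
                raise ValueError("local state missing from " + self.path)
            others = [s for s in loaded if s != self.default_state]
            self._states = [self.default_state] + others
            self._mtime_ns = mtime_ns
        return list(self._states)
```

Each call to `states()` costs one `os.stat`, which is usually cheap.  Two writes that land within the file system's timestamp resolution could be missed, so this is a sketch of the idea, not a hardened reloader.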

Let's take the requirements a bit further.  Let's say we want a screen in the system to let authorized users add a state to the state list at any time.  They can also remove a state.  We have to analyze whether there is a gold list (i.e., a list of states that cannot be removed because there are business rules around those specific values).  We might be tempted to put the state list in the database, but mixing this configuration with data is risky.  First, we need a mechanism to validate the configuration.  We need to have the application check the database table and ensure the "required" values are there.  Values added in one installation might need to be synced across installations if a new value becomes part of the "gold" list.  Putting the list in data confuses data and configuration, and we have shied away from it, since the system is easier to implement and maintain if a new installation starts out with an empty database and still functions properly.  Then implementation tasks can make the system usable for the target environment.
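To make the validation burden concrete, here is a rough sketch of the check the application would need if the state list lived in a database table.  It uses Python's built-in `sqlite3` module; the table name `states`, its `code` column, the `GOLD_STATES` set, and both function names are assumptions made for illustration.

```python
import sqlite3

# Hypothetical gold list: values that business rules depend on.
GOLD_STATES = frozenset({"TX"})


def missing_gold_states(conn):
    """Return gold-list values that are absent from the states table."""
    rows = conn.execute("SELECT code FROM states").fetchall()
    present = {row[0] for row in rows}
    return sorted(GOLD_STATES - present)


def remove_state(conn, code):
    """Remove a user-managed state, refusing to remove a gold-list value."""
    if code in GOLD_STATES:
        raise ValueError(code + " is required by business rules")
    conn.execute("DELETE FROM states WHERE code = ?", (code,))
```

Note that against a freshly created, empty table `missing_gold_states` reports every gold value as missing.  That is the situation described above: a new installation with an empty database is not yet correctly configured.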

Ultimately, the requirements drive the approach to configuration.  Depending on the frequency of change, the number of unremovable items, and other factors, you will decide where to store the configuration in a manner that minimizes total cost of ownership.

My recommended progression of configuration locations is as follows.  Specific requirements should be the driving factor that causes you to move configuration from one medium to the next, more costly alternative.

  1. In the code
  2. In a file
  3. In a database
  4. Some other system-specific external configuration store

There is no cut-and-dried answer for EVERY project, but the requirements should be what drives the decision.  Along those lines, no one location is appropriate for EVERY piece of configuration.  Depending on the nature of the configuration, it will be appropriate to choose a different storage medium.


Comments

Janco Wolmarans said on 1.12.2009 at 7:04 AM

Chances are that your list of states will be referenced by other tables via foreign key in the database. I reckon this holds true for most lookup list type data. Would you view that as a deal-breaking condition for storing the configuration data anywhere else but in the DB?

Jeffrey Palermo said on 1.12.2009 at 7:28 AM

@Janco,

That's a decision that should be based on the requirements. There are many ways to implement configuration of available states. A database table with other tables using a foreign key is just one implementation. Another implementation is other tables using a code column that has check constraints applied.

Your decision will also be driven by your architecture. Is this a database application, or a domain model application? If your database is shared by multiple systems, it is tempting to put the states in the database. If you don't share the connection string with other systems, then you don't have to enforce data consistency at the database level.

Another item that might push you in one direction or another is how difficult it is to change and deploy your system. If you have automated deployments and smoke tests, then a deployment is no big deal, and you can probably do that even during the working day. If deployments are an all-weekend affair, then you will probably do anything to avoid one.