Importance of developer data – level 200

Most of my experience has been with systems that require a relational database to house most of the data. Some systems have used a combination of relational databases, files, and external sources. Regardless of the data store, there are different ways to set up the data stores in a developer environment that affect the ease and speed of development. For clarity, I am talking about the database that is used by every developer on the development workstation (not a development team staging server) I’ll start with the worst way.

Use last month’s production backup (BAD)
Assume for a moment that there is not a single corrupt field in your production data. Even so, the data set can be so large that it is tough to see clearly what is going on while developing locally. While working out problems in the program’s behavior, it’s necessary to peek into tables to see what data is there, and with a full production set, the developer has to formulate queries merely for narrowing the scope of the data to view. Using a full production set bogs down development with unnecessary overhead. Finally, it will be very hard to create repeatable tests since the developer is constantly changing the data.

Create a development database with fake data and put it on a shared server(BAD)
The “shared server” part of this one is the killer. Imagine running a scenario using this database, and the data changes. With a shared database, this is what happens. Multiple users changes the data in many different ways. A high degree of communication overhead and waiting is required to make this work. The fake data will need to be refreshed periodically, so it becomes necessary to stop the team from using the database during the data reload.

Use a local database with nothing in it(BAD)
This is probably an obvious one since the system probably won’t function without a proper data set. If it will, security or configuration problems might lurk in the shadows (if you have security or configuration tables).

Use a local database created with the local build and populated with representative data that is enough to support every use case in the application(GOOD)
In your local build script, have a target that creates the database locally from scratch. Have a routine that pumps data into this local database. In this data set, include enough data so that every use case in the application can be exercised. As functionality grows, add to this data set proportionally. Structure your automated build in such a way that it is easy to:

Blow away the database and recreate it from scratch.
Blow away the database and refresh it at any time.
Create multiple local databases in the same way with different names.

Your automated database tests are going to be blowing away and reading data in the normal course of execution, so it’s important to be able to refresh the database at any time. Using this technique, you will be able to run the system locally at any time and use it in a representative manner (and debug if necessary).

WHAT ABOUT A TESTING DATABASE?
That’s a different animal. You want the testing database to make bugs obvious. Another tricky thing is that there isn’t just one testing database, so that’s a topic for another time.

Mark 12:47 pm on January 23, 2007

You are exactly right! The one I hate most is the full production copy on a shared server.

“Hey guys, nobody touch the database! It took me 10 minutes to debug to this point, so don’t screw it up for me!!”
Mike 7:39 pm on January 23, 2007

Any chance of some samples on how to do create the database during the build process?
jpalermo 5:54 pm on January 26, 2007

Mike,
I’ll have to put that on my todo post list.

Comments are closed.

Programming with Palermo

Jeffrey Palermo, Microsoft MVP, Author, Speaker, Clear Measure Chief Architect, Azure DevOps Expert

Importance of developer data – level 200

Related

Related

Discover more from Programming with Palermo