# Sunday, November 02, 2008

Generating test data for your application is not an uncommon task, and there are many different ways to seed your application.  Probably one of the most common is to use SQL scripts to import data directly into the database.  While fast, this lacks the control and validation that your middle tier code provides.

Loading data directly using SQL works best when your business logic resides in stored procedures near the database, but in a modern application where business logic generally lives in a middle tier application layer written in C#, this doesn't work quite as well.  Loading data directly could potentially cause data corruption, since nothing checks that the data is actually valid.  Besides the possible side effects of invalid data, you're likely skipping some of the automatic data management your middle tier affords you.  In other words, loading data through your middle tier API is probably more robust and easier to script.
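To make the difference concrete, here is a minimal sketch in plain Python of the kind of check a middle tier might run on save.  The `Customer` class, the `InMemoryRepository`, the `try_save` helper, and the validation rules are all hypothetical stand-ins for illustration, not the actual C# code.  A raw SQL insert would bypass checks like these entirely.

```python
class Customer(object):
    """Hypothetical stand-in for a middle tier domain object."""

    def __init__(self, first_name, postal_code):
        self.first_name = first_name
        self.postal_code = postal_code


class InMemoryRepository(object):
    """Hypothetical repository that validates before it stores anything."""

    def __init__(self):
        self._items = []

    def save(self, customer):
        # Assumed rules, for illustration only: a first name is required
        # and a postal code is exactly five digits.
        if not customer.first_name:
            raise ValueError("first name is required")
        code = customer.postal_code
        if len(code) != 5 or not code.isdigit():
            raise ValueError("postal code must be five digits")
        self._items.append(customer)

    def count(self):
        return len(self._items)


def try_save(repository, customer):
    """Return True if the repository accepted the customer, else False."""
    try:
        repository.save(customer)
        return True
    except ValueError:
        return False
```

With rules like these in place, a malformed record is rejected at load time instead of quietly landing in the database.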

To script the middle tier API to load data we could create data using C# or some other statically typed language, but unfortunately that requires a compile step and is much more rigid than a malleable SQL script.  Making the C# "script" data driven does help, but it still lacks the flexibility that a native scripting language provides.

With a scripting language the only part to manage is the script itself, which gets checked into source control (or even a My Documents folder).  The other nice benefit is that we can have a GUI tool (test suite manager?) that dynamically picks up user scripts from one or more predefined script folders, making them available to run at the click of a button.  No recompile or reloading, just edit-save-and-go.
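The "picks up user scripts from predefined folders" idea could be sketched roughly as follows.  This is only an illustration: `find_scripts` and `run_script` are hypothetical helper names, and it assumes a Python that ships `runpy.run_path` (2.7 or later).

```python
import os
import runpy


def find_scripts(folders, extension=".py"):
    """Return script paths in folder order, sorted by file name within each folder."""
    found = []
    for folder in folders:
        if not os.path.isdir(folder):
            continue  # a missing folder is skipped rather than treated as an error
        for name in sorted(os.listdir(folder)):
            if name.endswith(extension):
                found.append(os.path.join(folder, name))
    return found


def run_script(path):
    """Execute one script in a fresh namespace and return its globals."""
    return runpy.run_path(path)
```

A runner could call `find_scripts` each time its window is shown, so a newly saved script appears without restarting the tool.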

So, given that we have a C# middle tier layer that's perfectly capable of persisting data into a database, how do we go about writing a script against it?  There are a few ways, but one of the easiest ways is to use IronPython.  If you're not familiar with IronPython:

"IronPython is a new implementation of the Python programming language running on .NET. It supports an interactive console with fully dynamic compilation. It is well integrated with the rest of the .NET Framework and makes all .NET libraries easily available to Python programmers, while maintaining full compatibility with the Python language."

That's exactly what we need!  Something lightweight and easy to manage like SQL, but that interacts with our middle tier.  We could even embed the interpreter in our GUI runner if we so desired, which gives us the full power of a real scripting language and .NET from our own tools.

I created a small spike that tests out how IronPython could be used to load data directly through the middle tier API.  First of all I created a Customer domain object, an Address component, and a CustomerRepository in C#.  From here we can write Python directly against these types.  Here's what my little spike project looked like:

[Image: the spike project layout, including the Scripts folder that holds 10Customers.py]

For my purposes I've created a Scripts folder to hold my Python scripts (in a real environment these would probably be located elsewhere).  As you might have inferred, the 10Customers.py script creates 10 customers in my customer repository.  I could potentially add other scripts in this folder, and even chain them together to do more substantial things, but for now this will do.  Now for the interesting part: the Python script that loads ten semi-random customers.

import sys
import clr

# Make the folder holding the compiled middle tier assembly searchable,
# then reference the assembly so its types can be imported.
sys.path.append(r"E:\Source\spikes\IronPythonSpike\IronPythonSpike\bin\debug")
clr.AddReferenceToFile("IronPythonSpike.dll")
from IronPythonSpike import *
import System.Random as Rand

rand = Rand()
names = ["Joe", "Bob", "John", "Smith", "Hank", "Aaron", "Neal", "Pat", "Tim", "Jones", "Bill"]
streets = ["128th St. W", "Main St.", "1st Ave.", "A.B.D. Rd.", "Lonely Lane", "Pacific Ave.", "6th Ave.", "Foobar Ct."]
states = ["WA", "OR", "CA", "NY", "AK", "NV", "AL", "TX", "FL"]

repository = InMemoryCustomerRepository()

for i in xrange(10):
	# Build a customer from randomly chosen names and address parts
	c = Customer()
	c.Id = i
	c.FirstName = names[rand.Next(len(names))]
	c.LastName = names[rand.Next(len(names))]
	c.Address = Address()
	c.Address.Id = i
	c.Address.HouseNumber = rand.Next(32767)
	c.Address.Street = streets[rand.Next(len(streets))]
	c.Address.State = states[rand.Next(len(states))]
	c.Address.PostalCode = rand.Next(10000, 99999).ToString()

	# Save through the middle tier rather than straight to the database
	repository.SaveOrUpdate(c)

The first thing we do is import sys, which we use to add the folder containing our C# DLL to the paths the interpreter searches when we reference the DLL:

sys.path.append(r"E:\Source\spikes\IronPythonSpike\IronPythonSpike\bin\debug") 

From here we can use the clr module to add a reference to our C# middle tier assembly, which provides our domain objects.  From there it's just a matter of randomly generating the customer and address attributes from predefined lists of possibilities: names, streets, states.
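The hard-coded E:\ path only works on one machine.  One possible alternative, sketched below, derives the assembly folder from the script's own location.  It assumes the Scripts folder sits beside bin inside the project, which may not match your layout, and both `assembly_dir` and `reference_middle_tier` are hypothetical helper names.  The `clr` import is kept inside the function because that module is only available under IronPython.

```python
import os
import sys


def assembly_dir(script_path, configuration="debug"):
    """Return <project>/bin/<configuration> for a script in <project>/Scripts.

    Assumes the Scripts folder is a direct child of the project folder.
    """
    scripts_dir = os.path.dirname(os.path.abspath(script_path))
    project_dir = os.path.dirname(scripts_dir)
    return os.path.join(project_dir, "bin", configuration)


def reference_middle_tier(script_path, dll_name="IronPythonSpike.dll"):
    """Add the assembly folder to sys.path and reference the DLL (IronPython only)."""
    import clr  # provided by IronPython, not by standard CPython

    bin_dir = assembly_dir(script_path)
    if bin_dir not in sys.path:
        sys.path.append(bin_dir)
    clr.AddReferenceToFile(dll_name)
```

A script could then start with `reference_middle_tier(__file__)`, assuming `__file__` is set for the way the script is launched.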

Certainly you could create a separate generator script that provides much more varied data.  You could even pull these bits from another database or a web service, or write the script so that the data is not random but always the same.  For my purposes though, this works just fine.  A more realistic domain would have more validation rules and would probably require a more intelligent script.
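As a rough sketch of the "always the same" option, a generator can take a seed so that every run yields identical data.  This version uses Python's own `random` module rather than System.Random, returns plain dictionaries instead of the C# Customer objects, and uses shortened lists.  The name `generate_customers` is hypothetical.

```python
import random

NAMES = ["Joe", "Bob", "John", "Smith", "Hank"]
STREETS = ["Main St.", "1st Ave.", "Lonely Lane"]
STATES = ["WA", "OR", "CA"]


def generate_customers(count, seed):
    """Yield `count` customer dictionaries; the same seed gives the same data."""
    rng = random.Random(seed)  # private generator, unaffected by other code
    for i in range(count):
        yield {
            "Id": i,
            "FirstName": rng.choice(NAMES),
            "LastName": rng.choice(NAMES),
            "HouseNumber": rng.randrange(32767),
            "Street": rng.choice(STREETS),
            "State": rng.choice(STATES),
            "PostalCode": str(rng.randrange(10000, 99999)),
        }
```

Repeatable data like this is handy when a test needs to assert on specific records, while a different seed still gives fresh variety on demand.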

To verify that everything was added to the repository I can just make a quick call at the end of the script:

for c in repository.GetAll():
	print c.ToString()
	print ""

Which on this run prints out:

Tim Tim
959 Lonely Lane
AL, 96094

John Pat
19577 Pacific Ave.
FL, 62699

Pat John
2194 Lonely Lane
OR, 58176

Smith Neal
6430 Pacific Ave.
AL, 88910

Jones John
11059 Pacific Ave.
TX, 67800

Smith Smith
24162 6th Ave.
CA, 81493

Neal Hank
4054 Pacific Ave.
WA, 33392

Smith Jones
32012 Pacific Ave.
FL, 44461

Tim Hank
28812 Foobar Ct.
AL, 98069

Neal Joe
1751 Main St.
NV, 52048

Not bad for a few lines of script!

I think the important part to remember is that we're using the middle tier to load data, complete with validation and any other business logic that normally executes on save, all the while keeping things flexible and lightweight.

Thursday, November 06, 2008 5:22:17 PM (GMT Standard Time, UTC+00:00)
Cool stuff! Scripting languages like Python make automatic code and, in your case, data generation so much easier, IMO. One of my favorite aspects of Python is the list data structure (and its associated functions), which you use as the basis for your test data. It has so much versatility and is very fundamental, particularly in code gen efforts.

I like how you use IronPython to call your C# objects. Perhaps in an MVC framework, you might now have the temptation to use IronPython for the 'Controller' code against your 'Model' that is written in C#. :-)
Monday, November 17, 2008 4:31:15 PM (GMT Standard Time, UTC+00:00)
Indeed, with ASP.NET futures I could use IronPython. No need to stop debugging and recompile.. ;-)
Sneal