Published: 05 Dec 2007
Writing maintainable code that delivers value to your client isn't trivial. That doesn't mean that being a successful enterprise developer has to be hard. The Foundations of Programming series looks at a number of key concepts, techniques and tools specifically designed to help developers meet the growing complexity of enterprise systems. Based on proven principles like unit testing, domain driven design, dependency injection and O/R Mappers, the series is aimed at developers interested in helping themselves. Readers are encouraged to first read Part 1 and Part 2.
In the previous part we managed to have a good discussion about DDD without talking much about databases. If you're used to programming with DataSets, you probably have a lot of questions about how this is actually going to work. DataSets are great in that a lot is taken care of for you. In this part we'll start the discussion around how to deal with persistence using DDD. We'll manually write code to bridge the gap between our C# objects and our SQL tables. In later sections we'll look at more advanced alternatives (two different O/R mapping approaches) which, like DataSets, do much of the heavy lifting for us. This part is meant to bring some closure to the previous discussion while opening the discussion on more advanced persistence patterns.
As you know, your program runs in memory and requires a place to store (or persist) information. These days, the solution of choice is a relational database. Persistence is actually a pretty big topic in the software development field because, without the help of patterns and tools, it isn't the easiest thing to successfully pull off. With respect to object oriented programming the challenge has been given a fancy name: the Object-Relational Impedance Mismatch. That pretty much means that relational data doesn't map perfectly to objects and objects don't map perfectly to relational stores. Microsoft basically tried to ignore this problem and simply made a relational representation within object-oriented code - a clever approach, but not without its flaws such as poor performance, leaky abstractions, poor testability, awkwardness, and poor maintainability. (On the other side are object oriented databases which, to the best of my knowledge, haven't taken off either.)
Rather than try to ignore the problem, we can, and should, face it head on. We should face it so that we can leverage the best of both worlds - complex business rules implemented in OOP and data storage and retrieval via relational databases. Of course, that is provided that we can bridge the gap. But what gap exactly? What is this Impedance Mismatch? You're probably thinking that it can't be that hard to pump relational data into objects and back into tables. If you are, then you're absolutely right (mostly right, anyway; for now, let's assume that it's always a simple process).
For small projects with only a handful of small domain classes and database tables, my preference has generally been to manually write code that maps between the two worlds. Let's look at a simple example. The first thing we'll do is expand on our Upgrade class. We're only focusing on the data portions of our class (the fields), since that's what gets persisted:
Listing 1: Fields and Properties of the Upgrade class
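A minimal sketch of what such a class might look like follows. Only the id, the price and the RequiredUpgrades collection are mentioned in the text, so Name and Description are assumptions added for illustration, and the exact types are guesses.

```csharp
using System.Collections.Generic;

// Sketch of the data portion of Upgrade. Name and Description are
// assumed members; Id, Price and RequiredUpgrades come from the text.
public class Upgrade
{
    private int _id;
    private string _name;
    private string _description;
    private decimal _price;
    private readonly List<Upgrade> _requiredUpgrades = new List<Upgrade>();

    public int Id
    {
        get { return _id; }
        set { _id = value; }
    }

    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }

    public string Description
    {
        get { return _description; }
        set { _description = value; }
    }

    public decimal Price
    {
        get { return _price; }
        set { _price = value; }
    }

    // The other upgrades that must be present before this one can be added.
    public List<Upgrade> RequiredUpgrades
    {
        get { return _requiredUpgrades; }
    }
}
```

The collection is created once and exposed read-only, so callers add to it rather than replace it.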
We've added the basic fields you'd likely expect to see in the class. Next we'll create the table that would hold, or persist, the upgrade information:
Listing 2: CREATE TABLE Upgrades
No surprises there. Now comes the interesting part (well, relatively speaking): we'll start to build up our data access layer, which sits between the domain and relational models (interfaces left out for brevity):
Listing 3: Simple DAL
ExecuteReader is a helper method to slightly reduce the redundant code we have to write.
RetrieveAllUpgrades is more interesting as it selects all the upgrades and loads them into a list via CreateUpgrade. CreateUpgrade, shown below, is the reusable code we use to map upgrade information stored in the database into our domain. It's straightforward because the domain model and data model are so similar.
Listing 4: DataMapper From Data to Object
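Here's a rough sketch of what such a mapper might look like. The column names (Id, Name, Description, Price) are assumptions, as is the CreateUpgrades helper. It reads from the general IDataRecord interface rather than a SQL Server specific reader, so it can be exercised against an in-memory DataTable without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class Upgrade
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public decimal Price { get; set; }
}

public static class DataMapper
{
    // Maps a single row to an Upgrade. Column names are assumed.
    public static Upgrade CreateUpgrade(IDataRecord record)
    {
        Upgrade upgrade = new Upgrade();
        upgrade.Id = Convert.ToInt32(record["Id"]);
        upgrade.Name = Convert.ToString(record["Name"]);
        // "as string" yields null when the column holds DBNull.Value.
        upgrade.Description = record["Description"] as string;
        upgrade.Price = Convert.ToDecimal(record["Price"]);
        return upgrade;
    }

    // Hypothetical helper: maps every remaining row of the reader.
    public static List<Upgrade> CreateUpgrades(IDataReader reader)
    {
        List<Upgrade> upgrades = new List<Upgrade>();
        while (reader.Read())
        {
            upgrades.Add(CreateUpgrade(reader));
        }
        return upgrades;
    }
}
```

Because every query that returns upgrade rows can funnel through CreateUpgrade, a change to the table's columns only has to be handled in one place.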
If we need to, we can re-use CreateUpgrade as much as necessary. For example, we'd likely need the ability to retrieve upgrades by id or price - both of which would be new methods in the data access layer.
Obviously, we can apply the same logic when we want to store Upgrade objects back into the store. Here's one possible solution:
Listing 5: DataMapper From Object to Data
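As a sketch of the reverse direction, the method below flattens an Upgrade into named parameter values. It is only an approximation under stated assumptions: ToParameters and the parameter names are hypothetical, and a plain dictionary stands in for provider-specific parameter objects so that the example has no database dependency.

```csharp
using System;
using System.Collections.Generic;

public class Upgrade
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public decimal Price { get; set; }
}

public static class DataMapper
{
    // Hypothetical helper: turns an Upgrade into parameter name/value
    // pairs that an INSERT or UPDATE command could consume.
    public static Dictionary<string, object> ToParameters(Upgrade upgrade)
    {
        Dictionary<string, object> parameters = new Dictionary<string, object>();
        parameters["@Id"] = upgrade.Id;
        parameters["@Name"] = upgrade.Name;
        // A null reference has to be sent to the database as DBNull.Value.
        parameters["@Description"] = (object)upgrade.Description ?? DBNull.Value;
        parameters["@Price"] = upgrade.Price;
        return parameters;
    }
}
```

With a real provider, each pair would be copied onto the command's parameter collection before executing it.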
Despite the fact that we've taken a very simple and common example, we still ran into the dreaded impedance mismatch. Notice how our data access layer (whether the data access class itself or the DataMapper) doesn't handle the much needed RequiredUpgrades collection. That's because one of the trickiest things to handle is relationships. In the domain world these are references (or collections of references) to other objects, whereas the relational world uses foreign keys. This difference is a constant thorn in the side of developers. The fix isn't too hard. First we'll add a many-to-many join table which associates an upgrade with the other upgrades that are required for it (could be 0, 1 or more).
Listing 6: CREATE TABLE UpgradeDepencies
Next we modify RetrieveAllUpgrades to load in the required upgrades:
Listing 7: Improved DAL
We pull the extra join table information along with our initial query and create a local lookup dictionary to quickly access our upgrades by their id. Next we loop through the join table, get the appropriate upgrades from the lookup dictionary and add them to the collections.
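The two-pass approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the original listing: UpgradeLoader, LoadWithDependencies and the column names (Id, Name, UpgradeId, RequiredUpgradeId) are hypothetical, and the reader is expected to return the upgrades as its first result set and the join rows as its second.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class Upgrade
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Upgrade> RequiredUpgrades { get; } = new List<Upgrade>();
}

public static class UpgradeLoader
{
    public static List<Upgrade> LoadWithDependencies(IDataReader reader)
    {
        List<Upgrade> upgrades = new List<Upgrade>();
        Dictionary<int, Upgrade> lookup = new Dictionary<int, Upgrade>();

        // First result set: one row per upgrade.
        while (reader.Read())
        {
            Upgrade upgrade = new Upgrade();
            upgrade.Id = Convert.ToInt32(reader["Id"]);
            upgrade.Name = Convert.ToString(reader["Name"]);
            upgrades.Add(upgrade);
            lookup.Add(upgrade.Id, upgrade);
        }

        // Second result set: one row per (upgrade, required upgrade) pair.
        if (reader.NextResult())
        {
            while (reader.Read())
            {
                int upgradeId = Convert.ToInt32(reader["UpgradeId"]);
                int requiredId = Convert.ToInt32(reader["RequiredUpgradeId"]);
                Upgrade upgrade;
                Upgrade required;
                // Rows pointing at unknown ids are skipped in this sketch.
                if (lookup.TryGetValue(upgradeId, out upgrade) &&
                    lookup.TryGetValue(requiredId, out required))
                {
                    upgrade.RequiredUpgrades.Add(required);
                }
            }
        }
        return upgrades;
    }
}
```

Because the join rows are resolved through the dictionary, a required upgrade is the very same instance that appears in the main list, not a copy.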
It isn't the most elegant solution, but it works rather well. We may be able to refactor the function a bit to make it a little more readable, but for now and for this simple case, it'll do the job.
Although we're only doing an initial look at mapping, it's worth it to look at the limitations we've placed on ourselves. Once you go down the path of manually writing this kind of code it can quickly get out of hand. If we want to add filtering/sorting methods we either have to write dynamic SQL or have to write a lot of methods. We'll end up writing a bunch of RetrieveUpgradeByX methods that'll be painfully similar to one another.
Oftentimes you'll want to lazy-load relationships. That is, instead of loading all the required upgrades upfront, maybe we want to load them only when necessary. In this case it isn't a big deal since it's just an extra 32-bit reference. A better example would be the Model's relationship to Upgrades. Lazy loading is relatively easy to implement; it's just, yet again, a lot of repetitive code.
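To make the idea concrete, here is one minimal way a lazy-loaded collection could be written by hand. The shape of Model and the loader callback are assumptions made for this sketch; in practice the callback would be whatever data access method fetches the upgrades for a given model id.

```csharp
using System;
using System.Collections.Generic;

public class Upgrade
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Model
{
    // Hypothetical callback supplied by the data access layer.
    private readonly Func<int, List<Upgrade>> _loadUpgrades;
    // Stays null until the collection is first requested.
    private List<Upgrade> _upgrades;

    public Model(int id, Func<int, List<Upgrade>> loadUpgrades)
    {
        Id = id;
        _loadUpgrades = loadUpgrades;
    }

    public int Id { get; private set; }

    public List<Upgrade> Upgrades
    {
        get
        {
            if (_upgrades == null)
            {
                // First access: go to the store once and keep the result.
                _upgrades = _loadUpgrades(Id);
            }
            return _upgrades;
        }
    }
}
```

This version isn't thread-safe, and every lazily loaded relationship needs the same few lines again, which is exactly the repetition mentioned above.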
The most significant issue though has to do with identity. If we call RetrieveAllUpgrades twice, we'll get two distinct instances of every upgrade. This can result in inconsistencies, given:
Listing 8: Oh oh!
The price change to the first upgrade won't be reflected in the instance pointed to by upgrade1a. In some cases that won't be a problem. However, in many situations, you'll want your data access layer to track the identity of instances it creates and enforce some control (you can read more by googling the Identity Map pattern).
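As a rough illustration of the Identity Map idea, the data access layer could keep a dictionary of the instances it has already handed out, keyed by id. UpgradeIdentityMap and GetOrAdd are hypothetical names for this sketch, and a real implementation would also need to decide when tracked instances are released.

```csharp
using System;
using System.Collections.Generic;

public class Upgrade
{
    public int Id { get; set; }
    public decimal Price { get; set; }
}

public class UpgradeIdentityMap
{
    private readonly Dictionary<int, Upgrade> _loaded = new Dictionary<int, Upgrade>();

    // Returns the instance already tracked for this id. If there is none,
    // creates one through the supplied factory and tracks it.
    public Upgrade GetOrAdd(int id, Func<Upgrade> create)
    {
        Upgrade existing;
        if (_loaded.TryGetValue(id, out existing))
        {
            return existing;
        }
        Upgrade created = create();
        _loaded.Add(id, created);
        return created;
    }
}
```

If the mapping code asks the map before building an object from a row, two retrievals of the same upgrade yield the same instance, so a price change made through one reference is visible through the other.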
There are probably more limitations, but the last one we'll talk about has to do with units of work (again, you can read more by googling the Unit of Work pattern). Essentially, when you manually code your data access layer, you need to make sure that when you persist an object, you also persist, if necessary, any updated referenced objects. If you're working on the admin portion of our car sales system, you might very well create a new Model and add a new Upgrade. If you call Save on your Model, you need to make sure your Upgrade is also saved. The simplest solution is to call save often for each individual action - but this is both difficult (relationships can be several levels deep) and inefficient. Similarly, you may change only a few properties and then have to decide between resaving all fields, or somehow tracking changed properties and only updating those. Again, for small systems, this isn't much of a problem. For larger systems, it's a near impossible task to do manually (besides, rather than wasting your time building your own unit of work implementation, maybe you should be writing functionality the client asked for).
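Purely to make the idea concrete (not as a recommendation to build your own), a bare-bones unit of work might collect the objects touched during an operation and persist them together. Everything here is an assumption made for illustration: the UnitOfWork class, its Register and Commit methods, and the save callback standing in for the data access layer. A real implementation would also wrap the commit in a database transaction and distinguish new, changed and deleted objects.

```csharp
using System;
using System.Collections.Generic;

public class UnitOfWork
{
    private readonly List<object> _pending = new List<object>();
    // Hypothetical callback that persists a single object.
    private readonly Action<object> _save;

    public UnitOfWork(Action<object> save)
    {
        _save = save;
    }

    // Marks an object as needing to be saved. Registering the same
    // instance twice has no additional effect.
    public void Register(object entity)
    {
        if (!_pending.Contains(entity))
        {
            _pending.Add(entity);
        }
    }

    // Saves everything registered so far, in registration order,
    // and returns how many objects were saved.
    public int Commit()
    {
        int saved = 0;
        foreach (object entity in _pending)
        {
            _save(entity);
            saved++;
        }
        _pending.Clear();
        return saved;
    }
}
```

In the car sales example, both the new Model and the new Upgrade would be registered as they are created, and a single Commit would save both.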
In the end, we won't rely on manual mapping - it just isn't flexible enough and we end up spending too much time writing code that's useless to our client. Nevertheless, it's important to see mapping in action - and even though we picked a simple example, we still ran into some issues. Since mapping like this is straightforward, the most important thing is that you understand the limitations this approach has. Try thinking what can happen if two distinct instances of the same data are floating around in your code, or just how quickly your data access layer will balloon as new requirements come in. We won't revisit persistence for at least a couple parts - but when we do re-address it we'll examine full-blown solutions that pack quite a punch.
Karl Seguin is a senior application developer at Fuel Industries, located in Ottawa, Ontario. He's an editor here at DotNetSlackers, a blogger for the influential CodeBetter.com and a Microsoft MVP.