Why do applications of any kind need persistent data? You save to a persistent storage medium any data that is not volatile and needs to be retrieved later. The process of retrieving a persisted chunk of data is known as a query. At the end of the day, a query is all about loading data into a container object that is easier and faster to work with than the storage medium. A container may be a relatively simple in-memory list, such as an array or a collection, or a tabular, recordset-like data structure.
How do you work with data once it has been queried from the persistent data store and loaded into the application's memory? In some cases, you just access it using plain indexes or key names. But if and when you need to set up a more sophisticated search, you're in trouble. Like it or not, nearly every container object features its own programming model for in-memory queries. You cope with the IList interface for collections, the DataSet/DataTable query methods, XPath, XMLDOM, and the like.
The basic idea of the query clearly takes us to the database world: you SELECT data items FROM a given source WHERE some criteria are met. However, there is no reason to limit the use of this terminology to the world of databases. A similar syntax can be applied to every data container, including in-memory data containers, as long as a common model for representing data exists. Language Integrated Query, also known as LINQ, is Microsoft's API that goes in this direction.
Integrated in the .NET Framework 3.5, LINQ defines a query framework whose key objects are mapped to new, ad hoc keywords in C# and VB. In this way, a query becomes a first-class construct in managed languages. Developers can therefore use a brand new and homogeneous syntax to query and update data from a variety of supported data sources. As icing on the cake, the data source model is extensible and allows third-party vendors (and arguably teams of developers engaged on vertical projects) to formalize and plug in new queryable data sources.
LINQ-to-SQL at a Glance
To make sense of LINQ, have a quick look at the following code snippet. It represents a typical LINQ query written in C#:
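A minimal sketch of such a query follows. The Customer type, the sample rows, and the filter on Country are assumptions made for illustration; the in-memory list is wrapped with AsQueryable so that dataSource is an IQueryable<Customer>, standing in for whatever queryable container you actually have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Customer
{
    public string CustomerID { get; set; }
    public string CompanyName { get; set; }
    public string Country { get; set; }
}

public static class QuerySample
{
    // Hypothetical sample data standing in for a real data source.
    public static IQueryable<Customer> GetDataSource()
    {
        var customers = new List<Customer>
        {
            new Customer { CustomerID = "ALFKI", CompanyName = "Alfreds Futterkiste", Country = "Germany" },
            new Customer { CustomerID = "ANATR", CompanyName = "Ana Trujillo Emparedados y helados", Country = "Mexico" },
            new Customer { CustomerID = "BLAUS", CompanyName = "Blauer See Delikatessen", Country = "Germany" }
        };
        return customers.AsQueryable();
    }

    // The query itself: from / where / orderby / select over an IQueryable<T>.
    public static List<Customer> GetCustomersByCountry(IQueryable<Customer> dataSource, string country)
    {
        var data = from c in dataSource
                   where c.Country == country
                   orderby c.CompanyName
                   select c;
        return data.ToList();
    }

    public static void Main()
    {
        foreach (var c in GetCustomersByCountry(GetDataSource(), "Germany"))
        {
            Console.WriteLine("{0} - {1}", c.CustomerID, c.CompanyName);
        }
    }
}
```

The same query text would apply unchanged if dataSource were a different IQueryable<Customer> implementation; only the container behind it changes.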
I won't spend much time on the individual keywords found in the snippet. For that, you can check out the C# reference page on MSDN or the analogous page for VB. It is interesting to note that LINQ-specific keywords in C# are considered contextual keywords. In other words, they carry a specific meaning in the context of a query, but are not reserved words of the language. Let's focus on the code snippet now.
The key element in the snippet that I would call your attention to is the dataSource placeholder. In the snippet it represents the container being queried. The magic of LINQ is all in the model used to represent that container. The LINQ syntax can be adapted to any data source object that implements a common interface and, consequently, is queryable. All data types that implement IQueryable or its generic version IQueryable<T> can be employed as data sources in a LINQ query.
The IQueryable interfaces are defined in the System.Linq namespace and live in the System.Core.dll assembly.
LINQ-to-SQL is merely the LINQ-based framework that enables you to query over a database-driven object model that the Visual Studio 2008 designer tool builds for you. By using Visual Studio 2008, or a command line tool (sqlmetal.exe), you generate a helper file in your project whose classes model the tables, views, and stored procedures available over a given connection string. The entry point in this auto-generated object model is the DataContext class.
Note: Most of the time, you won't be using the DataContext class directly. Instead, you'll use a derived class that Visual Studio 2008 or sqlmetal has generated and that contains only your selection of database objects. In Visual Studio 2008, you'll find this class hidden in the Solution Explorer under the node for the added LINQ-to-SQL class. A LINQ-to-SQL class file has a .dbml extension.
The DataContext Class
In a few words, the data context class acts as a poor man's repository for the auto-generated object model. The data context wraps an object model and provides some predefined methods to run some commonly used queries. Because it is extensible through the mechanism of partial classes, the association with the Repository pattern makes some sense in my opinion. However, a more precise description of the data context in terms of design patterns would present it as a Registry. The data context, in fact, is a centralized object that provides access to a number of other common objects. At the same time, the data context implements a number of other common patterns for a domain model: identity map, lazy loading, optimistic offline lock, unit of work, and data mapper.
The identity map pattern is visible through the DataContext's capability of caching copies of all objects that have been retrieved from the database in the course of an operation. As long as the DataContext instance is alive, the same copy of the object will be returned without running a new query. "The same object" here simply means an object of the same type with the same key.
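The idea behind an identity map can be sketched with a toy class. This is not how the DataContext implements it internally (the article does not show that); IdentityMap, Product, and the loader delegate are hypothetical names, and the loader stands in for a database query.

```csharp
using System;
using System.Collections.Generic;

// Toy identity map: at most one instance per key for the lifetime of the map.
public class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> _cache = new Dictionary<TKey, TEntity>();
    private readonly Func<TKey, TEntity> _loader;

    // Counts how many times the (simulated) query actually ran.
    public int LoadCount { get; private set; }

    public IdentityMap(Func<TKey, TEntity> loader)
    {
        _loader = loader;
    }

    public TEntity Get(TKey key)
    {
        TEntity entity;
        if (!_cache.TryGetValue(key, out entity))
        {
            // Cache miss: run the loader once and remember the instance.
            entity = _loader(key);
            LoadCount++;
            _cache[key] = entity;
        }
        return entity;
    }
}

// Hypothetical entity used only for this illustration.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class IdentityMapSample
{
    public static void Main()
    {
        var map = new IdentityMap<int, Product>(
            id => new Product { Id = id, Name = "Product " + id });

        Product first = map.Get(1);
        Product second = map.Get(1);

        Console.WriteLine(ReferenceEquals(first, second)); // True: same instance
        Console.WriteLine(map.LoadCount);                  // 1: loader ran once
    }
}
```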
The lazy loading pattern is visible in the fact that the DataContext, through the DataLoadOptions object assigned to its LoadOptions property, offers methods such as LoadWith and AssociateWith to control which portion of the model's graph is loaded.
The optimistic offline lock pattern shows up when the DataContext ensures that the changes it is about to save don't conflict with any changes that another transaction may have committed already.
The unit of work pattern is visible through the DataContext's capability of tracking changes made to any objects it has loaded. At any time, the DataContext knows exactly what has changed and how (deletions, insertions, updates). This knowledge makes it possible to submit all pending changes in a transactional manner.
The data mapper pattern is in the DataContext object because it holds a reference to an instance of the System.Data.Linq.Mapping.MetaModel class through the Mapping property. The MetaModel type is the abstraction that represents the mapping between the SQL Server database and the auto-generated domain objects.
The following code snippet presents an excerpt from the source code of a data context class created out of a connection string to the Northwind database.
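A sketch of what such a class typically looks like is shown here. It is heavily trimmed: the names NorthwindDataContext and Customer, the two mapped columns, and the single constructor are assumptions for illustration, and real designer output is much longer. Compiling it requires a reference to the System.Data.Linq.dll assembly.

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;

// Trimmed, designer-style data context (illustrative names).
[Database(Name = "Northwind")]
public partial class NorthwindDataContext : DataContext
{
    // Attribute-based mapping is the default mapping source.
    private static readonly MappingSource mappingSource = new AttributeMappingSource();

    public NorthwindDataContext(string connection)
        : base(connection, mappingSource)
    {
    }

    // One Table<T> property per table selected in the designer.
    public Table<Customer> Customers
    {
        get { return this.GetTable<Customer>(); }
    }
}

// Trimmed entity class: attributes map the class to a table
// and each member to a column.
[Table(Name = "dbo.Customers")]
public partial class Customer
{
    private string _customerID;
    private string _companyName;

    [Column(Storage = "_customerID", DbType = "NChar(5) NOT NULL", CanBeNull = false, IsPrimaryKey = true)]
    public string CustomerID
    {
        get { return _customerID; }
        set { _customerID = value; }
    }

    [Column(Storage = "_companyName", DbType = "NVarChar(40) NOT NULL", CanBeNull = false)]
    public string CompanyName
    {
        get { return _companyName; }
        set { _companyName = value; }
    }
}
```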
Mapping between the classes in the domain and the database defaults to an instance of the AttributeMappingSource class. In the end, it means that member-to-column mapping is specified through attributes placed on the entity class (Customer in the example).
An alternate mapping model is supported and is based on an XML file. The mapping class is XmlMappingSource.
The domain model associated with a LINQ-to-SQL data context is naturally and strongly oriented towards the structure of the underlying database tables. At first glance, it may look like an implementation of the Active Record pattern. However, while a LINQ-to-SQL domain object certainly wraps a row in a database table, it doesn't encapsulate database access, nor does it contain domain logic. Domain logic (read, behavior), though, may be added through extensions via the mechanism of partial classes.
Using the Data Context
Because the DataContext class is the entry point to the LINQ-to-SQL data model, you need an instance of it every time you need to query or update data. Aware of this, Microsoft designed the DataContext class to produce lightweight and disposable objects that wrap a database connection. You should treat the data context much as you would a database connection object: get it as late as possible and get rid of it as soon as you can. Persisting a data context doesn't make much sense. First, it would eat up a significant share of memory - raw data plus tracking information and mapping information. Second, it is not designed to be a thread-safe object. In general, an instance of the DataContext class should live no longer than a single business operation.
Applying the definition of the Repository pattern more literally, you might want to define an additional layer on top of the domain model in which to place and reuse some query logic. Here's an example:
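One possible shape for such a layer is sketched below. CustomerRepository, its two methods, and the Customer type are hypothetical names. The repository is written against IQueryable<Customer> so that the sketch runs on in-memory data; in LINQ-to-SQL you could pass it the Table<Customer> exposed by a data context, since Table<T> implements IQueryable<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Customer
{
    public string CustomerID { get; set; }
    public string CompanyName { get; set; }
    public string Country { get; set; }
}

// Hypothetical repository layer: reusable query logic lives in one place.
public class CustomerRepository
{
    private readonly IQueryable<Customer> _customers;

    public CustomerRepository(IQueryable<Customer> customers)
    {
        _customers = customers;
    }

    // Returns the customer with the given ID, or null if there is none.
    public Customer FindById(string id)
    {
        return _customers.SingleOrDefault(c => c.CustomerID == id);
    }

    // Returns all customers in a country, sorted by company name.
    public List<Customer> FindByCountry(string country)
    {
        var query = from c in _customers
                    where c.Country == country
                    orderby c.CompanyName
                    select c;
        return query.ToList();
    }
}

public static class RepositorySample
{
    // In-memory stand-in for a table of customers.
    public static IQueryable<Customer> SampleData()
    {
        return new List<Customer>
        {
            new Customer { CustomerID = "ALFKI", CompanyName = "Alfreds Futterkiste", Country = "Germany" },
            new Customer { CustomerID = "ANATR", CompanyName = "Ana Trujillo Emparedados y helados", Country = "Mexico" },
            new Customer { CustomerID = "BLAUS", CompanyName = "Blauer See Delikatessen", Country = "Germany" }
        }.AsQueryable();
    }

    public static void Main()
    {
        var repository = new CustomerRepository(SampleData());
        Console.WriteLine(repository.FindById("ANATR").CompanyName);
        Console.WriteLine(repository.FindByCountry("Germany").Count);
    }
}
```

With a real data context, the calling code would create the context inside a using block, build the repository from its Customers table, run the query, and let the context be disposed, which keeps its lifetime within one business operation.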
The DataContext class is central to the implementation of a data access layer based on LINQ-to-SQL. In this article, I only hinted at the internal organization of the DataContext class. In addition, I suggested that you treat a DataContext much like a database connection object. This is OK as long as you think of LINQ-to-SQL as a programming model for building your data access layer. From this perspective, which is quite close to reality in my opinion, LINQ-to-SQL is more or less the object-oriented version of ADO.NET. However, in relatively simple application scenarios, I believe that it may be reasonable to consider LINQ-to-SQL as an O/RM tool. If you do so, though, the equation DataContext-equals-connection deserves some more thought. Fodder for the next article.
Dino Esposito is one of the world's authorities on Web technology and software architecture. Dino published an array of books, most of which are considered state-of-the-art in their respective areas. His most recent books are “Microsoft ® .NET: Architecting Applications for the Enterprise” and “...
This author has published 54 articles on DotNetSlackers.