Bashar Kokash' Blog

.Net Framework, windows and web development.

This site

Resources

Friends

Sponsors

  • MaximumASP

How to implement your own aggregate function in SQLCLR, SQL Server 2005

I posted earlier an overview about the SQLCLR component of the SQL Server 2005, and showed how to implement a simple user-defined function using Visual Studio 2008 and C#, this post will show the importance of .NET Framework to SQLCLR by adding more functionality to SQL Server 2005.

Introduction:

Using T-SQL we can create user-defined function, stored procedures and triggers, but we can't do is to add a user-defined aggregate function. An aggregate function operates against a collection of values, but returns only a single value, for example the sum(), avg(), and count() functions. Each of those functions works on all the rows in a specified column, but only returns one value like the sum , the average, or the number of rows.

But what if I needed to create my own aggregate function that does a special job, using T-SQL there is no direct way, but we can use cursors and loop for each row which may result complex code with a bad performance, fortunately SQLCLR provides us the ability to implement user-defined aggregate.

Implementation:

The following will show how to implement a simple aggregate function using C# and Visual Studio 2008.

The aggregate function that was chosen for this example is a simple COUNTNULL aggregate function that returns the number of NULL rows in a specified column, of course you can do it easily in T-SQL but this is just a simple example.

Inside your Database project in Visual Studio add an new item and select Aggregate, and a new class is added to the project, actually it's a structure.

Each aggregate function needs to be in a separate struct, notice this struct is marked as [Serializable] and [SqlUserDefinedAggregate(Format.Native)].

SQL Server 2005 needs to store effective representation of the aggregate on disk so the aggregate should be Serializable and SqlUserDefinedAggregate attribute indicates that this class or struct is a user-defined aggregate, and the property Format.Native tells the .NET Framework to handle the serialization process using a simple algorithm and it works only with value types that's why the aggregate is defined as a struct.

You can also specify the property name in the SqlUserDefinedAggregate attribute to the name of you Aggregate function "The name that is called in SQL Server 2005".
[SqlUserDefinedAggregate (Format.Native,Name = "CountNulls")]

Now, inside the class we have to implement four methods in order the aggregate to work, there are init, accumulate, merge and terminate.

Init() method is used in order to initialize any needed resources or variables and is called only once when the aggregate is called.

Accumulate() method is the actual implementation of the aggregate function and is called once for each row.

Merge() method is used because sometimes if the the column to aggregate contains a large number of data the SQLCLR may split those rows into groups and aggregate each group individually and then merge the results of those group again, then the Merge() method is called.

Terminate() method is used free any used resources and to return the result of this aggregate function.

 

Below is the code for the CountNulls aggregate function with some comments: 

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using Microsoft.SqlServer.Server;
[Serializable]
[SqlUserDefinedAggregate(Format.Native,Name = "CountNulls")]
public struct CountNulls
{
    private int Counter;  // field to count the not null rows
    public void Init()
    {
        Counter = 0;      // initialization
    }
    public void Accumulate(object Value)
    {
        if (Value == DBNull.Value)  // count just the rows that are not equal to NULL
            Counter++;
    }
    public void Merge(CountNulls Group)
    {
        this.Counter += Group.Counter; // when merge is needed the counter of other groups should be added
    }

    public SqlString Terminate()
    {
        return new SqlString(Counter.ToString()); //returning the results
    }
}

After you compile this class you have to deploy it in order to make available to be used inside SQL Server 2005

Finally go to SQL Server Management Studio and use your aggregate function as follows:

dbo.CountNulls(EndDate)from HumanResources.EmployeeDepartmentHistory

 

Summary:

Each class or struct should be public and dedicated for only one aggregate function.

Serialization is required so SQLCLR could store the representation of the aggregate on disk.

Note that this aggregate function can only take one parameter, we can't implement aggregation with more than one parameter in SQL Server 2005, but we can with SQL Server 2008 and I'll post about it later.

Comments

Bashar Kokash' Blog said:

Recently, I've been doing a lot of invoking .NET Assemblies within SQL Server 2005, I bloged an overview

# June 9, 2008 5:52 AM

cerebral ataxia said:

Meta: Login RSS Comments RSS Valid XHTML XFN WP January 31, 2008 Psychology,psychiatric, Autism, Crossing the Cruel Wires of Truth. Filed under: Autism Research spotlight— blog@ 7:

# August 30, 2008 3:18 PM