If the source of a query expression is an IQueryable&lt;T&gt; [ http://msdn.microsoft.com/en-us/library/bb351562.aspx ] instead of the raw IEnumerable&lt;T&gt; [ http://msdn.microsoft.com/en-us/library/9eekhta0.aspx ], then the query is represented by a set of objects that preserves the structure of the actual query. This tree of objects is called an Expression Tree.
An expression tree can be given to a LINQ provider, where it is usually translated to a domain-specific query language, or anything else for that matter. For example, in the case of SQL Server it can be translated to T-SQL, whereas in the case of Active Directory the same expression tree can be converted to an LDAP filter. With this approach, it is possible to develop all sorts of providers, such as the ones shown here [ http://blogs.msdn.com/charlie/archive/2008/02/28/link-to-everything-a-list-of-linq-providers.aspx ].
This article demonstrates how to create a mini IQueryable LINQ provider.
Before moving on, it is imperative to have a deep understanding of expression trees. If you are puzzled
by what they are, let me show you a simple example. Imagine the following simple expression (where
‘i’ is an integer):
i <= 10
The expression tree representing this expression is a tree of objects that is shown below:
Figure 1: Expression tree for "i <= 10"
At its root, this expression is captured by an instance of the BinaryExpression class. This object has a Left and a Right property, both of type Expression. The Left expression is a ParameterExpression called 'i' of type 'Int32', and the Right expression is a ConstantExpression representing the constant integer '10'.

Given this tree of objects, different providers can translate it into different domain-specific languages.
If the source of the LINQ query is of type IQueryable&lt;T&gt;, then the C# compiler automatically converts the query to an expression tree at compile time; reflecting on the generated IL will show the expression tree representing the original expression. For example, the query expression below is converted to its expression tree representation by the compiler:
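As a sketch (the data source and the query here are assumed examples, using an in-memory IQueryable&lt;int&gt;; any IQueryable&lt;T&gt; query behaves the same way):

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

class Demo
{
    static void Main()
    {
        // An in-memory IQueryable<int> source (assumed example data)
        IQueryable<int> source = new[] { 5, 12, 8 }.AsQueryable();

        // Because the source is IQueryable<T>, the compiler captures the
        // lambda below as an expression tree instead of compiling it to IL
        IQueryable<int> query = source.Where(i => i <= 10);

        // The whole query is now available as data
        Console.WriteLine(query.Expression);
    }
}
```

At this point no filtering has happened: the query exists only as data, and it is only when the query is enumerated that a provider gets the chance to interpret the tree.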
Nothing stops you from building an expression tree manually; however, it is simpler to let the compiler manage this for you. The following shows how to manually build an expression tree that represents the 'i <= 10' expression:
BinaryExpression e = Expression.LessThanOrEqual(
Expression.Parameter(typeof(int), "i"),
Expression.Constant(10, typeof(int)));
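As a quick sanity check (this snippet is an addition for illustration, not part of the provider), the hand-built tree can be wrapped in a lambda and compiled into a real delegate:

```csharp
using System;
using System.Linq.Expressions;

class Demo
{
    static void Main()
    {
        // The same parameter instance must be used in the body
        // and in the lambda's parameter list
        ParameterExpression p = Expression.Parameter(typeof(int), "i");
        BinaryExpression e = Expression.LessThanOrEqual(
            p,
            Expression.Constant(10, typeof(int)));

        // Compile the tree into an executable delegate
        Func<int, bool> f = Expression.Lambda<Func<int, bool>>(e, p).Compile();

        Console.WriteLine(f(5));   // True
        Console.WriteLine(f(42));  // False
    }
}
```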
Before moving on, there is a simple Visual Studio debug-time Visualizer (shown in Figure 1 ) that helps
to visualize expression trees. I found this to be extremely useful when developing this custom LINQ provider. Please see here [ http://msdn.microsoft.com/en-us/library/bb397975.aspx ] for how to
install and enable this for Visual Studio 2008.
As Figure 2 shows, our custom provider will translate LINQ queries (expression trees) to T-SQL and the
result-sets returned from the database are mapped to objects using an object-relational mapper (ORM)
that is discussed next. Write operations, on the other hand, are tracked in memory and only executed when the application explicitly requests that the changes be submitted.
The DataMembers property of the DatabaseMetaTable contains metadata for the fields and properties defined in the TEntity mapper class (e.g. the ProductID property in the Product class).
The PersistentDataMembers property of the DatabaseMetaTable is a subset of DataMembers representing the collection of properties or fields that are directly mapped to a database column (remember, not all data members necessarily carry the ColumnAttribute [ http://msdn.microsoft.com/en-us/library/system.data.linq.mapping.columnattribute.aspx ], meaning that it is possible to have data members that are not associated with a database column). The IdentityMembers property contains metadata for the columns that are used as the primary key of the entity.
The DatabaseMetaModel holds an in-memory cache of DatabaseMetaTables to avoid the cost of reflection every time GetTable&lt;TEntity&gt;() is called (trading memory consumption for speed).
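The cache itself can be sketched as follows; MetaTable and MetaModel are simplified stand-ins for the article's DatabaseMetaTable and DatabaseMetaModel, so the member names and the shape of the metadata are assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for DatabaseMetaTable: just the entity's
// public property names, discovered via reflection
class MetaTable
{
    public readonly string[] PropertyNames;

    public MetaTable(Type t)
    {
        PropertyInfo[] props =
            t.GetProperties(BindingFlags.Public | BindingFlags.Instance);
        PropertyNames = Array.ConvertAll(props, p => p.Name);
    }
}

// Hypothetical stand-in for DatabaseMetaModel
class MetaModel
{
    private readonly Dictionary<Type, MetaTable> _cache =
        new Dictionary<Type, MetaTable>();

    public MetaTable GetMetaTable(Type entityType)
    {
        MetaTable table;
        if (!_cache.TryGetValue(entityType, out table))
        {
            // Pay the reflection cost only once per entity type
            table = new MetaTable(entityType);
            _cache.Add(entityType, table);
        }
        return table;
    }
}

class Demo
{
    class Product { public int ProductID { get; set; } }

    static void Main()
    {
        var model = new MetaModel();
        MetaTable first = model.GetMetaTable(typeof(Product));
        MetaTable second = model.GetMetaTable(typeof(Product));
        Console.WriteLine(ReferenceEquals(first, second)); // True: cached
    }
}
```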
DatabaseMetaDataMember has some important properties such as IsDbGenerated that tells the
provider whether the value of the column is automatically generated by the database engine and
therefore should be ignored when performing write operations. These properties are usually mapped directly to properties of the ColumnAttribute.
Here is a description of some of the other interesting properties of the DatabaseMetaDataMember:
CanBeNull: Is Null an acceptable value for this item? For instance, if the data member is mapped to a database column, is that column nullable?

DbType: For a database column, the real SQL data type of that column.

DeclaringType: The actual entity type (class) that contains this field or property.

IsPersistent: Is this member associated with a database column?

MappedName: The actual name of the column in the database. For example, the column name could be "ContactName" whereas the property representing this column might be called "CustomerName".

Member: The MemberInfo [ http://msdn.microsoft.com/en-us/library/system.reflection.memberinfo.aspx ] object referencing the actual property or field of the entity class. This is used to gain access to the value of this member.

StorageMember: Similar to the Member property, except that it represents the underlying member that the mapper should use to set the value of a cell read from the database: should the setter of a property be called when the value of the column is read, or should the mapper find the underlying field and set its value directly?
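To make these properties concrete, here is a sketch of an entity class decorated with the System.Data.Linq.Mapping attributes discussed above; the Customer shape is an assumed example, and the snippet requires a reference to the System.Data.Linq assembly:

```csharp
using System;
using System.Data.Linq.Mapping;

[Table(Name = "Customers")]
class Customer
{
    // Backing field named by Storage: the mapper sets this field
    // directly instead of calling the property setter
    private string _customerName;

    [Column(IsPrimaryKey = true, DbType = "NChar(5) NOT NULL")]
    public string CustomerID { get; set; }

    // MappedName would be "ContactName" while the CLR property
    // is called CustomerName
    [Column(Name = "ContactName", Storage = "_customerName",
            DbType = "NVarChar(30)", CanBeNull = true)]
    public string CustomerName
    {
        get { return _customerName; }
        set { _customerName = value; }
    }

    // An ordinary member with no ColumnAttribute: DataMembers would
    // include it, PersistentDataMembers would not
    public string DisplayLabel
    {
        get { return CustomerID + ": " + CustomerName; }
    }
}

class Demo
{
    static void Main()
    {
        var prop = typeof(Customer).GetProperty("CustomerName");
        var col = (ColumnAttribute)prop.GetCustomAttributes(
            typeof(ColumnAttribute), false)[0];
        Console.WriteLine(col.Name);    // ContactName
        Console.WriteLine(col.Storage); // _customerName
    }
}
```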
Based on the mapping metadata for TEntity, GetTable<TEntity>() creates and returns an instance of
DatabaseTable<TEntity>:
public DatabaseTable<TEntity> GetTable<TEntity>() where TEntity : class
case "First":
    // This custom provider does not support the use of a First operator
    // that takes a predicate. Therefore we check to ensure that no more
    // than one argument is provided.
    if (mc.Arguments.Count != 1) break;
    VisitFirst(mc.Arguments[0], false);
    break;

case "FirstOrDefault":
    // This custom provider does not support the use of a FirstOrDefault
    // operator that takes a predicate. Therefore we check to ensure that
    // no more than one argument is provided.
    if (mc.Arguments.Count != 1) break;
    VisitFirst(mc.Arguments[0], true);
    break;

default:
    return base.VisitMethodCall(mc);
}

Visit(mc.Arguments[0]);
return mc;
}
This article will not expand on the implementation of the supported query operators, except for the more interesting WhereTranslator. When a call to the Queryable.Where operator is discovered, an instance of the WhereTranslator class is created and its Translate method is called, passing in the lambda expression that represents the where condition (the predicate):
private void VisitWhere(Expression queryable, LambdaExpression predicate)
{
    // this custom provider cannot support more
    // than one Where query operator in a LINQ query
    if (_whereTranslator != null)
        throw new NotSupportedException(
            "You cannot have more than one Where operator in this expression");

    _whereTranslator = new WhereTranslator(_model);
    _whereTranslator.Translate(predicate);
}
if (!GetSourceTable(query, out source))
    throw new NotSupportedException(
        "This query expression is not supported!");

StringBuilder sb = new StringBuilder();
bool useDefault = false;

// SELECT
sb.Append("SELECT ");

// TOP
if (_takeTranslator != null && _takeTranslator.Count.HasValue)
{
    // project on all the mapped columns
    _selectTranslator = new ProjectionTranslator(
        _model, source.PersistentDataMembers);
}

if (!_selectTranslator.DataMembers.Any())
    throw new Exception(
        "There are no items for projection in this query!");

sb.Append(_selectTranslator.ProjectionClause);

// FROM
sb.Append(" FROM ");
sb.AppendLine(source.TableName);

// WHERE
if (_whereTranslator != null)
{
    string where = _whereTranslator.WhereClause;
    if (!string.IsNullOrEmpty(where))
    {
        sb.Append("WHERE ");
        sb.AppendLine(where);
    }
}

// ORDER BY
if (_orderByTranslator != null)
{
    string orderby = _orderByTranslator.OrderByClause;
    if (!string.IsNullOrEmpty(orderby))
    {
        sb.Append("ORDER BY ");
        sb.AppendLine(orderby);
    }
}

return new QueryInfo
3. Projecting all customers to instances of a newly created C# anonymous type [ http://msdn.microsoft.com/en-us/library/bb397696.aspx ] (a type that is created by the compiler at compile time)
var customerInfos =
    dc.GetTable<Customer>().Select(c => new
    {
        ID = c.CustomerID,
        Name = c.CustomerName
    });
4. Projecting and changing the return value at the same time by performing a string concatenation

IEnumerable<string> phoneNumbers =
    dc.GetTable<Customer>()
      .Select(c => "+44 " + c.Phone);
One thing that all of the lambda expressions used by the above Select operators have in common is that they take a single parameter of type Customer. Therefore, in order to perform the operations defined by these lambda expressions, we need to execute the lambdas, passing in an instance of Customer and remembering the result of each call. Here is a list of all four of the Select lambda expressions shown above:
c => c.CustomerName
c => new Contact { Name = c.CustomerName, Phone = c.Phone }
c => new { ID = c.CustomerID, Name = c.CustomerName }
c => "+44 " + c.Phone
In order to be able to perform the operations defined in these lambda expressions, we first need to compile their expression tree representations into executable IL. Fortunately, the expression tree API provides an efficient mechanism for compiling a LambdaExpression into IL using Reflection.Emit [ http://msdn.microsoft.com/en-us/library/3y322t50.aspx ]. You can simply call the Compile method on the expression tree:
Delegate f = lambdaExpression.Compile();
Now that the lambda expression is compiled, we can invoke it dynamically passing in an instance of a
Customer:
f.DynamicInvoke(customerEntity);
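Putting the two steps together, here is a small self-contained sketch; the Customer class below is a minimal assumed stand-in for the article's mapped entity:

```csharp
using System;
using System.Linq.Expressions;

// Minimal stand-in for the mapped Customer entity
class Customer
{
    public string CustomerID { get; set; }
    public string Phone { get; set; }
}

class Demo
{
    static void Main()
    {
        // One of the projection lambdas above, captured by the
        // compiler as an expression tree
        Expression<Func<Customer, string>> projection =
            c => "+44 " + c.Phone;

        // Compile the tree into IL...
        Delegate f = projection.Compile();

        // ...and invoke it dynamically against an entity instance
        var customerEntity = new Customer { Phone = "20 7946 0000" };
        object result = f.DynamicInvoke(customerEntity);

        Console.WriteLine(result); // +44 20 7946 0000
    }
}
```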
The lambda expressions above take an instance of Customer. Therefore our result mapper should
always create an instance of Customer and pass that to the compiled code to perform the projection.
The final step is to read each record from the returned result-set, convert it to an instance of Customer (the entity) and, if needed, perform a projection. The code below illustrates how this is done by the DatabaseResultMapper&lt;TEntity&gt; class. Please note the use of the reflection APIs to dynamically create the entity instances:
MemberInfo[] members = null;
BindingFlags bindingFlags =
    BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance;

while (_reader.Read())
{
    if (isFirst)
    {
        // find the order of the columns returned by the database
        members = new MemberInfo[_reader.FieldCount];
        var persistentDataMembers =
            _info.SourceMetadata.PersistentDataMembers;

        for (int i = 0; i < _reader.FieldCount; i++)
        {
            string colName = _reader.GetName(i);
            DatabaseMetaDataMember mem =
                persistentDataMembers.FirstOrDefault(
                    p => string.Compare(p.MappedName, colName,
                        true, CultureInfo.InvariantCulture) == 0);

            if (mem == null)
                throw new Exception(string.Format(
                    "It was not possible to find a mapping column for {0}",
                    colName));

            members[i] = mem.StorageMember ?? mem.Member;
        }

        isFirst = false;
    }

    // create a single instance of the Entity (the mapper object)
    object entity = Activator.CreateInstance(
        _info.SourceMetadata.EntityType,
        bindingFlags, null, null, null);

    // populate its members with values from the result-set
    for (int i = 0; i < members.Length; i++)
    {
        // Do magic conversion from SQL type to CLR type!
        // NOTE: I am using a very simplified conversion technique here.
        // You may want to use a more complex one...
        Type memberType = TypeHelper.GetMemberType(members[i]);

        // is this a Nullable type? if yes, then get
        // its generic type argument for conversion
        if (TypeHelper.IsNullableType(memberType))
        {
            memberType = memberType.GetGenericArguments()[0];
        }

        object value = Convert.ChangeType(_reader.GetValue(i), memberType);

        // set the value of the member on the entity instance to 'value'
        TypeHelper.SetMemberValue(entity, members[i], value);
    }

    // project on the entity by calling the
    // compiled projection function
    if (projectionFunction == null)
    {
        // this entity needs to be tracked
        // some tracking code to go here ...
        yield return (TEntity)trackedEntity;
    }
Therefore it is necessary to find a way of distinguishing between the shapes of the results. For that
reason, you will find an enum that declares the shape of the result of a query. This is determined when the query expression is examined by the QueryTranslator:
internal enum ResultShape
{
None, // The query is not expected to have a return value
Singleton, // It returns a single entity
Sequence // It returns a sequence of entities
}
Here is a simplistic implementation of a function that identifies the shape of the result of an expression.
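A minimal sketch of such a function might look like the following; the ResultShape enum is repeated here for self-containment, and the detection logic (inspecting the name of the root MethodCallExpression) is an assumption based on the behaviour described below:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

enum ResultShape
{
    None,      // The query is not expected to have a return value
    Singleton, // It returns a single entity
    Sequence   // It returns a sequence of entities
}

static class ShapeDetector
{
    // A sketch: treat First/FirstOrDefault/Single/SingleOrDefault at the
    // root of the expression tree as producing a single object
    public static ResultShape GetResultShape(Expression expression)
    {
        var mc = expression as MethodCallExpression;
        if (mc != null)
        {
            switch (mc.Method.Name)
            {
                case "First":
                case "FirstOrDefault":
                case "Single":
                case "SingleOrDefault":
                    return ResultShape.Singleton;
            }
        }
        return ResultShape.Sequence;
    }
}

class Demo
{
    static void Main()
    {
        IQueryable<int> q = new[] { 1, 2, 3 }.AsQueryable();

        // A Where query returns a sequence...
        Expression sequence = q.Where(i => i > 1).Expression;

        // ...while a query rooted in a First call returns a singleton
        Expression singleton = Expression.Call(
            typeof(Queryable), "First", new[] { typeof(int) }, q.Expression);

        Console.WriteLine(ShapeDetector.GetResultShape(sequence));  // Sequence
        Console.WriteLine(ShapeDetector.GetResultShape(singleton)); // Singleton
    }
}
```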
As you can see, if the expression tree includes a First, FirstOrDefault, Single or SingleOrDefault query
operator as its root element, then it is assumed that only a single object should be returned. Based on
this, the implementation of the second Execute method of the DatabaseProvider class must have
different logic for each possible ResultShape:
private object Execute(QueryInfo info)
{
    // Get an open connection to the database
    DbConnection connection = _cnnManager.GetOpenConnection();
    try
    {
        // Build a SQL Command
        DbCommand command = connection.CreateCommand();
        command.CommandText = info.QueryText;

        // Attempt to execute the query if no result
        // was expected from the query
        if (info.ResultShape == ResultShape.None)
            command.ExecuteNonQuery();
        else
        {
            DbDataReader reader =
                command.ExecuteReader(CommandBehavior.SingleResult);
            ...

            // What is the CLR type of the returned result-set rows?
            Type resultEntityType = (info.LambdaExpression == null)
                ? info.SourceMetadata.EntityType
                : info.LambdaExpression.Body.Type;

            // Build a pipeline so that we can
            // read and map the results returned from the DB.
            // Do this by dynamically creating an instance
            // of the DatabaseResultMapper<> class.
            // The following use of reflection is equivalent to:
            //   IEnumerable mappedResults =
            //       new DatabaseResultMapper<T>(
            //           info, reader, _dataContext.ChangeTracker);
            IEnumerable mappedResults = (IEnumerable)Activator.CreateInstance(
                typeof(DatabaseResultMapper<>)
                    .MakeGenericType(new Type[] { resultEntityType }),
                BindingFlags.NonPublic | BindingFlags.Public |
                BindingFlags.Instance,
                null,
                new object[] { info, reader, _dataContext.ChangeTracker },
                null);

            // Are we expecting a single entity or a sequence of entities?
            if (info.ResultShape == ResultShape.Sequence)
            {
                // Read the results by enumerating through all mappedResults.
                // This is not perfect but as I am sharing the database
                // connection between all queries and even write operations,
                // I need to load all entities into memory by fully reading
                // them using the DbDataReader (mappedResults); that is
                // exactly what the constructor of List<> does here.
                // The following use of reflection is equivalent to:
                //   return new List<T>(mappedResults);
                return Activator.CreateInstance(
                    typeof(List<>).MakeGenericType(
                        new Type[] { resultEntityType }),
                    new object[] { mappedResults });
            }
            else if (info.ResultShape == ResultShape.Singleton)
            {
                IEnumerator enumerator = mappedResults.GetEnumerator();
The change tracking is performed by the ChangeTracker class that subscribes to the PropertyChanging
event of all mapped entity instances. When the provider is notified of the first change to an object, it
creates an instance of the TrackedObject and adds it to a collection of objects that require processing.
This collection includes a list of newly created, updated and deleted objects.
Figure 9: Tracked objects and their states
The code listing below demonstrates the logic used by the ChangeTracker’s event handler for all
PropertyChanging events:
private Dictionary<object, TrackedObject> _items;
...
private void OnPropertyChanging(
    object sender, PropertyChangingEventArgs args)
{
    // Is this object already being tracked?
    TrackedObject trackedObj;
    if (!_items.TryGetValue(sender, out trackedObj))
    {
        DatabaseMetaTable metaTable =
            CheckAndGetMetaTable(sender.GetType());

        // This has not been tracked in the past
        // so create a new tracked object
        trackedObj = ...;
        _items.Add(sender, trackedObj);
    }
}
This custom provider is unable to keep track of changes to objects that do not implement
INotifyPropertyChanging.
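For reference, here is a minimal sketch of an entity that does implement INotifyPropertyChanging and can therefore be tracked; the Customer class and its single property are an assumed example:

```csharp
using System;
using System.ComponentModel;

// A sketch of what an entity must do for change tracking to work:
// raise PropertyChanging *before* each mutation
class Customer : INotifyPropertyChanging
{
    public event PropertyChangingEventHandler PropertyChanging;

    private string _customerName;

    public string CustomerName
    {
        get { return _customerName; }
        set
        {
            // notify subscribers before the value changes
            var handler = PropertyChanging;
            if (handler != null)
                handler(this, new PropertyChangingEventArgs("CustomerName"));
            _customerName = value;
        }
    }
}

class Demo
{
    static void Main()
    {
        var c = new Customer();
        c.PropertyChanging += (s, e) =>
            Console.WriteLine("changing: " + e.PropertyName);
        c.CustomerName = "Maria Anders"; // prints "changing: CustomerName"
    }
}
```

A tracker like the ChangeTracker above would subscribe to this event and create a TrackedObject on the first notification.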
It is also worth mentioning that the ChangeTracker should periodically compact its in-memory collection of WeakReferences and remove the ones whose targets have been garbage collected.
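The compaction idea can be sketched as follows; TrackedReferenceList is a hypothetical stand-in, since the ChangeTracker's actual data structures are not shown:

```csharp
using System;
using System.Collections.Generic;

// A sketch of the compaction idea: drop WeakReferences whose
// targets have been garbage collected
class TrackedReferenceList
{
    private readonly List<WeakReference> _refs = new List<WeakReference>();

    public void Track(object entity)
    {
        _refs.Add(new WeakReference(entity));
    }

    // Returns the number of dead references removed
    public int Compact()
    {
        return _refs.RemoveAll(r => !r.IsAlive);
    }

    public int Count { get { return _refs.Count; } }
}

class Demo
{
    static void Main()
    {
        var list = new TrackedReferenceList();
        var keep = new object();
        list.Track(keep);

        // The tracked object is still strongly referenced,
        // so compaction removes nothing
        list.Compact();
        Console.WriteLine(list.Count); // 1
        GC.KeepAlive(keep);
    }
}
```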
At this point, I would like to define the scope in which changes are tracked by the ChangeTracker. All
changes and in fact all write operations such as delete and insert, are cached by the ChangeTracker
until the AcceptChanges method of this class is called. That method in turn can only be called by the
SubmitChanges method of the DatabaseDataContext.
When SubmitChanges is called, all changes, newly created entities and delete requests are converted to
database commands and are executed inside a transaction. The transaction is only committed when all
commands succeed.
The conversion of TrackedObjects to database commands is done by the ChangeProcessor class:
And here is how the ChangeProcessor converts an insert request to a SQL INSERT command:
private DbCommand BuildInsertCommand(TrackedObject obj)
{
    if (obj == null)
        throw new ArgumentNullException("obj");

    DbCommand command = GetNewDbCommand();
    StringBuilder sb = new StringBuilder("INSERT INTO ");
    sb.Append(obj.MetaTable.TableName);
    sb.Append(" (");

    // Find all the insertable columns (excluding
    // all the auto generated data members)
    var columns = obj.MetaTable.PersistentDataMembers
        .Where(c => !c.IsDbGenerated);

    StringBuilder columnValuesSb = new StringBuilder();
    bool isFirst = true;

    // Include all the columns in the INSERT statement
    foreach (var col in columns)
    {
        if (!isFirst)
        {
            sb.Append(", ");
            columnValuesSb.Append(", ");
        }
        else
            isFirst = false;

        // What if the name of the column has a space character in it?
        sb.Append(FormatHelper.WrapInBrackets(col.MappedName));

        // Get the value from the entity for this column
        MemberInfo memberInfo = col.StorageMember ?? col.Member;

        // Format the value of the column to
        // its acceptable SQL representation
        object value = FormatHelper.FormatDbValue(
            TypeHelper.GetMemberValue(obj.Entity, memberInfo));
        columnValuesSb.Append(value);
    }

    sb.Append(") VALUES (");
    sb.Append(columnValuesSb.ToString());
    sb.Append(")");
    command.CommandText = sb.ToString();
DatabaseDataContext.SubmitChanges() now looks similar to the following – it effectively delegates the
actual translation and execution of the database commands to the ChangeProcessor and has the
responsibility of creating a new transaction or reusing an existing one:
public void SubmitChanges()
{
    this.CheckDispose();

    // Get an open database connection
    DbConnection connection = _provider.Connection.GetOpenConnection();
    try
    {
        // Create or use a transaction
        using (TransactionScope ts = new TransactionScope())
        {
            // Enlist in this transaction
            connection.EnlistTransaction(Transaction.Current);

            // Process all changes and apply them to the database
            ChangeProcessor processor =
                new ChangeProcessor(_changeTracker, this);
            processor.SubmitChanges();

            // Vote to commit the transaction
            ts.Complete();

            // If all changes were committed successfully
            // to the DB then accept all the changes
            _changeTracker.AcceptChanges();
        }
    }
    finally
    {
        connection.Close();
    }
}
ChangeTracker.AcceptChanges() is called only when all SQL commands have executed successfully. When it is called, deleted objects are no longer tracked, and the ChangeTracker subscribes to the PropertyChanging event of all newly created objects.
Both delete and update operations rely heavily on the primary keys of the entity. In fact, if you look at
the database commands generated for these operations, you will notice that the generated WHERE
clause only specifies the primary key columns. It is worth pointing out that no delete or update
operations can be performed by this mini provider if the entity does not have a primary key defined (an
exception is thrown).
Also, it is not possible to modify a primary key using this provider, as it does not keep a copy of the original primary key values that the UPDATE statement would need to locate the original record.
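The primary-key-only WHERE clause can be sketched like this; KeyColumn and WhereBuilder are hypothetical stand-ins for the article's DatabaseMetaDataMember metadata and formatting helpers:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the IdentityMembers metadata of an entity
class KeyColumn
{
    public string MappedName;
    public object Value;
}

static class WhereBuilder
{
    // Build a predicate that names only the primary-key columns
    public static string BuildKeyPredicate(IList<KeyColumn> identityMembers)
    {
        if (identityMembers.Count == 0)
            throw new InvalidOperationException(
                "Cannot update or delete an entity without a primary key");

        return string.Join(" AND ",
            identityMembers.Select(k =>
                string.Format("[{0}] = {1}",
                    k.MappedName, FormatDbValue(k.Value))).ToArray());
    }

    private static string FormatDbValue(object value)
    {
        // extremely simplified; real code must also handle NULL,
        // dates and other SQL types
        return value is string
            ? "'" + ((string)value).Replace("'", "''") + "'"
            : Convert.ToString(value);
    }
}

class Demo
{
    static void Main()
    {
        var keys = new[]
        {
            new KeyColumn { MappedName = "CustomerID", Value = "ALFKI" }
        };
        Console.WriteLine("WHERE " + WhereBuilder.BuildKeyPredicate(keys));
        // WHERE [CustomerID] = 'ALFKI'
    }
}
```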
Here are some examples of write operations against the Northwind database, showing the purpose of each operation, its implementation and the generated SQL.

Inserting a new customer:

DatabaseDataContext dc = new DatabaseDataContext(dbConnection);
I would like to thank Stuart Leaks, Carl Nolan and Charlie Calvert for their help in reviewing this article.
Disclaimer: Any code or technique shown here is for illustrative purposes only, and the referenced solution should only be seen as a throw-away prototype. The code has not been fully tested and therefore should not be assumed to be production-ready. You may find issues and bugs, so please understand that the purpose of this exercise was not to write a fully working, bulletproof provider; the aim is to demonstrate the steps required to write a custom IQueryable LINQ provider.