Categories
Uncategorized

An entry for the “Oh right, of course that doesn’t work” file…

I have a class called MultiValueDictionary<TKey, TValue> that implements IReadOnlyDictionary<TKey, ReadOnlyCollection<TValue>>.

So far so good. All of the usual LINQ stuff works as expected.

Then I think, “Wouldn’t it be nice if I could also cast that into an IReadOnlyCollection<KeyValuePair<TKey, TValue>> and get a flattened view of it”.

So I add the interface and wire up the methods as explicit implementations. It compiles cleanly and I open up my tests.

error CS1061: 'MultiValueDictionary<string, int>' does not contain a definition for 'ToList' and no accessible extension method 'ToList' accepting a first argument of type 'MultiValueDictionary<string, int>' could be found (are you missing a using directive or an assembly reference?)

What do you mean it doesn’t have a ToList method? That’s a well-known extension method on IEnumerable<T>, which my class implements.

In fact, I implemented that interface twice. Once as IEnumerable<KeyValuePair<TKey, ReadOnlyCollection<TValue>>> and once as IEnumerable<KeyValuePair<TKey, TValue>>.

Oh…

Well back to the drawing board.
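For the curious, here is a minimal, hypothetical reproduction of the problem. With two implementations of IEnumerable<T>, the compiler cannot infer which T the ToList extension method should use, so the candidate is discarded and you get CS1061:

```csharp
using System.Collections;
using System.Collections.Generic;

// A type that implements IEnumerable<T> for two different T's,
// mirroring the flattened-view idea above.
public class DoubleEnumerable : IEnumerable<int>, IEnumerable<string>
{
    private readonly List<int> m_Numbers = new List<int> { 1, 2, 3 };

    IEnumerator<int> IEnumerable<int>.GetEnumerator() => m_Numbers.GetEnumerator();

    IEnumerator<string> IEnumerable<string>.GetEnumerator()
    {
        foreach (var n in m_Numbers)
            yield return n.ToString();
    }

    IEnumerator IEnumerable.GetEnumerator() => m_Numbers.GetEnumerator();
}

// new DoubleEnumerable().ToList() fails with CS1061. Casting to one
// specific interface restores the extension method:
// var ints = ((IEnumerable<int>)new DoubleEnumerable()).ToList();
```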

Categories
Chain

ORM Idea: Declarative Aggregates

Normally when we need aggregates (min, max, average, sum, etc.), we end up declaring them in multiple places. First you need an object to hold the result of the query. Then you need to put the actual logic of the query into a view, inline SQL, or a LINQ expression.

This alone isn’t too much of a problem. But each time the report needs to be changed, you have to find the class and the query so that you can modify them both. And if the query appears in multiple locations, then you have to duplicate the logic in each place.
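For example, with hand-written LINQ the result type and the aggregate logic live in two separate places (class and member names here are illustrative):

```csharp
using System.Linq;

public class Sale
{
    public int EmployeeKey { get; set; }
    public decimal TotalPrice { get; set; }
}

// The class that holds the result of the query...
public class SalesSummary
{
    public int EmployeeKey { get; set; }
    public decimal SmallestSale { get; set; }
    public decimal LargestSale { get; set; }
    public decimal AverageSale { get; set; }
}

public static class Reports
{
    // ...and the aggregate logic, declared elsewhere. Changing the report
    // means updating both, in every query that repeats this expression.
    public static SalesSummary[] Summarize(Sale[] sales) =>
        sales.GroupBy(s => s.EmployeeKey)
             .Select(g => new SalesSummary
             {
                 EmployeeKey = g.Key,
                 SmallestSale = g.Min(s => s.TotalPrice),
                 LargestSale = g.Max(s => s.TotalPrice),
                 AverageSale = g.Average(s => s.TotalPrice),
             })
             .ToArray();
}
```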

But what if we moved the aggregates directly into the class?

[Table("Sales.EmployeeSalesView")]
public class SalesFigures
{
	[AggregateColumn(AggregateType.Min, "TotalPrice")]
	public decimal SmallestSale { get; set; }

	[AggregateColumn(AggregateType.Max, "TotalPrice")]
	public decimal LargestSale { get; set; }

	[AggregateColumn(AggregateType.Average, "TotalPrice")]
	public decimal AverageSale { get; set; }

	[CustomAggregateColumn("Max(TotalPrice) - Min(TotalPrice)")]
	public decimal Range { get; set; }

	[GroupByColumn]
	public int EmployeeKey { get; set; }

	[GroupByColumn]
	public string EmployeeName { get; set; }
}

Now if you want to get more information, say grouping by month, you can do it directly in the model.

What does this look like in action? Well that depends on the ORM. For Tortuga Chain, it looks like this:

var report = await dataSource.From<SalesFigures>().ToCollection().ExecuteAsync();

A nice effect of this pattern is that the logic is easily reusable. For example, say you wanted two reports: one for the current month and one for the current year.

var monthlyReport = await dataSource.From<SalesFigures>(new {SalesMonth = month, SalesYear = year}).ToCollection().ExecuteAsync();

var yearlyReport = await dataSource.From<SalesFigures>(new {SalesYear = year}).ToCollection().ExecuteAsync();

This feature is part of Tortuga Chain 4.3, but I can see it being incorporated into any ORM that includes SQL generation.

Categories
Anchor

Allocation-free Batching in Anchor 4

Traditionally, converting a large collection into batches requires a lot of array allocation and copying operations. Using an offset and count sounds like a good alternative, but very few APIs support it.

This is where the BatchAsSegments extension method comes into play. Rather than returning a new collection, it gives a window into the underlying list. The return type is IEnumerable<ReadOnlyListSegment<T>>.

ReadOnlyListSegment is a struct with three fields: a source list, an offset, and a count. It can be cast to an IList<T> or IReadOnlyList<T>, but that’s not recommended because it requires a boxing operation, and the whole point of this is to be allocation-free.

Enumerating the ReadOnlyListSegment can be done via the usual IEnumerable/IEnumerator pair, but again that requires an allocation. As an alternative, you can run a foreach loop against it directly using a struct-based enumerator. This is inspired by the enumerator on List<T>.
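A simplified sketch of the idea (the real Anchor types have more members, including the struct-based enumerator mentioned above):

```csharp
using System;
using System.Collections.Generic;

// A lightweight window into an existing list: no copying, no new arrays.
public readonly struct ReadOnlyListSegment<T>
{
    private readonly IReadOnlyList<T> m_Source;
    public int Offset { get; }
    public int Count { get; }

    public ReadOnlyListSegment(IReadOnlyList<T> source, int offset, int count)
    {
        m_Source = source;
        Offset = offset;
        Count = count;
    }

    public T this[int index] => m_Source[Offset + index];
}

public static class BatchingExtensions
{
    // Yields windows over the source rather than allocating sub-lists.
    public static IEnumerable<ReadOnlyListSegment<T>> BatchAsSegments<T>(
        this IReadOnlyList<T> source, int batchSize)
    {
        for (var offset = 0; offset < source.Count; offset += batchSize)
            yield return new ReadOnlyListSegment<T>(
                source, offset, Math.Min(batchSize, source.Count - offset));
    }
}
```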

Categories
Anchor

Supporting Clone with Anchor 4

Implementing a generic Clone method is surprisingly difficult for all but the simplest of classes. Even answering the question, “What should ICloneable.Clone do?” proved so difficult that Microsoft deprecated the interface.

The first question is whether a clone should be shallow or deep. A deep clone is one where all of the child objects are also cloned, while a shallow clone reuses them. But there’s also a third option: if a class is immutable, then objects of that class don’t need to be cloned at all. A generic clone method could take that into consideration.
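To illustrate the difference between shallow and deep with a toy example (this is not Anchor’s implementation):

```csharp
using System.Collections.Generic;

public class Order
{
    public List<string> Lines { get; set; } = new List<string>();

    // Shallow clone: the copy shares the same Lines list as the original.
    public Order ShallowClone() => (Order)MemberwiseClone();

    // Deep clone: child collections are cloned as well.
    public Order DeepClone() => new Order { Lines = new List<string>(Lines) };
}
```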

In .NET we use the Pure attribute to indicate that a class behaves as if it were immutable. That’s not the same as being actually immutable. Performing operations on a pure object should have no visible side effects, but internally it may be doing things like caching values.

Currently our MetadataCache.Clone command only supports shallow and deep copies. We’re looking to include support for immutable/pure objects in Anchor 4.1.

In Anchor we have the concept of a property bag. A property bag is used instead of normal fields to back properties so that we can offer things like change tracking and two-level undo. There is a performance cost to this, but the benefits outweigh it when you need to support IEditableObject and IRevertibleChangeTracking in a WinForms or WPF application.

Inside the property bag is an array where the actual values are stored. Since this is just an array, we can cheaply copy it when performing a clone operation against a subclass of ModelBase or ModelCollection. This is done via the CloneOptions.BypassProperties flag.

Another flag that MetadataCache.Clone honors is CloneOptions.UseIClonable. When set, it will use ICloneable.Clone instead of copying each property one by one.

But again, ICloneable is deprecated. So perhaps an object exposes a Clone method without using that interface. To address this, we’re considering adding a new option in Anchor 4.1 to honor those methods.

The next wrinkle is that not all classes have a default constructor. When that happens, you have to match up parameters in the constructor with properties in the object being cloned. And of course, you may need to deep-clone those values before passing them to the constructor. So that’s another feature to consider for the future.

Speaking of deep clones, they’re just begging for an infinite loop. To prevent problems, a cycle-detection scheme should be used. Under this model, if an object has been seen before, it won’t be cloned a second time. This is fairly complex, so for now Anchor just uses a maximum recursion parameter.
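A sketch of the cycle-detection approach, using a map of already-cloned objects (again, not Anchor’s implementation, which currently relies on the recursion limit):

```csharp
using System.Collections.Generic;

public class Node
{
    public string Name = "";
    public Node Next; // may point back to an earlier node, forming a cycle
}

public static class CloneHelper
{
    // Tracks every object already cloned; when a cycle leads back to one,
    // we return the existing copy instead of recursing forever.
    public static Node DeepClone(Node node, Dictionary<Node, Node> seen = null)
    {
        seen ??= new Dictionary<Node, Node>(); // Node uses reference equality

        if (seen.TryGetValue(node, out var existing))
            return existing;

        var copy = new Node { Name = node.Name };
        seen[node] = copy; // register before recursing into children

        if (node.Next != null)
            copy.Next = DeepClone(node.Next, seen);
        return copy;
    }
}
```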

Adding together MetadataCache.Clone and the planned future work, can we safely say that Anchor will truly have a generic clone function? No, because there are other edge cases, such as non-public fields and events, to consider.

Instead, what Anchor offers is a starting point for creating your own class-specific clone method. Unless your class is very simple, we expect you to expand upon it from time to time.

Categories
Anchor

Anchor 4 Breaking Changes

Tortuga Anchor has traditionally not seen a lot of churn. Other than the big rewrite for Nullable Reference Type support, the name has changed more frequently than the API. With apologies, I have to discuss a couple of breaking changes for Anchor 4.

Public Does Not Mean Public

The first is the way metadata is handled. The Tortuga.Anchor.Metadata namespace was designed as a wrapper around .NET reflection. In addition to convenience methods, it includes a caching layer so that expensive reflection calls only have to be made once.

Something that isn’t always obvious is that .NET reflection isn’t the same thing as “C# reflection”. While similar, .NET reflection uses slightly different terminology. For example, “public” in .NET reflection means “this can be seen outside of the assembly”. That includes not only C#’s public properties, but also properties marked with the protected keyword.

The correct way of handling this is to check both the IsPublic and IsFamily flags on a MethodInfo object, the latter being the marker for protected members.
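For example (the IsPublic and IsFamily properties come from System.Reflection; the helper class is mine):

```csharp
using System.Reflection;

public class Example
{
    public void PublicMethod() { }
    protected void ProtectedMethod() { }
    private void PrivateMethod() { }
}

public static class VisibilityCheck
{
    // "Public or protected" in C# terms means IsPublic or IsFamily
    // in reflection terms.
    public static bool IsPublicOrProtected(MethodInfo method) =>
        method.IsPublic || method.IsFamily;
}
```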

A Bad Design Decision

The second issue is the TableAndView attribute. To understand this, first a little background.

In .NET there is a Table attribute used to indicate which database table a given class relates to. While well known in Entity Framework, it isn’t actually an EF concept and many other ORMs including Tortuga Chain rely on it.

Since reading reflection data is expensive, Anchor caches the Table attribute values, as well as related ones such as Key, Column, and NotMapped.

Since most databases have some sort of namespacing option, the Table attribute also includes an optional Schema property.

So far so good. But when you try to use it, you realize that reading directly from the table is annoying. What you would rather do is read from a view that pre-joins all of the lookup tables you might need. But when you do your writes, you still need the underlying table. (Just the one; you generally don’t want to modify the lookup tables.) For most ORMs this would require having two DTOs and tedious mapping code between them.

Anchor and Chain took a different route. By introducing a TableAndView attribute, you could indicate that reads come from a view while writes go to the table in the same schema.

And that’s where I messed up. I assumed that the view would also be in the same schema as the table, so I just made TableAndView inherit from Table and added a view name property.

I could ‘fix’ this by adding a view schema property to the attribute, but that would be confusing. It would probably still be a breaking change unless I carefully thought through every possible combination of using schema and view schema. And if there is a break, it would be subtle.

So I opted for the easier, and arguably more correct, route of just deprecating TableAndView. It is replaced by a View attribute with its own name and schema properties. The old attribute will hang around for the time being, but if you have both, the new one takes priority.