Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

I am writing a LINQ provider to a hierarchal data source. I find it easiest to design my API by writing examples showing how I want to use it, and then coding to support those use cases.

One thing I am having trouble with is an easy/reusable/elegant way to express "deep query" or recursion in a LINQ statement. In other words, what is the best way to distinguish between:

from item in immediate-descendants-of-current-node where ... select item

versus:

from item in all-descendants-of-current-node where ... select item

(Edit: please note neither of those examples above necessarily reflect the structure of the query I want. I am interested in any good way to express recursion/depth)

Please note I am not asking how to implement such a provider, or how to write my IQueryable or IEnumerable in such a way that allows recursion. I am asking from the standpoint of a person writing the LINQ query and utilizing my provider - what is an intuitive way for them to express whether they want to recurse or not?

The data structure resembles a typical file system: a folder can contain a collection of subfolders, and a folder can also contain a collection of items. So myFolder.Folders represents all the folders who are immediate children of myFolder, and myFolder.Items contains all the items immediately within myFolder. Here's a basic example of a site hierachy, much like a filesystem with folders and pages:

(F)Products
    (F)Light Trucks
        (F)Z150
            (I)Pictures
            (I)Specs
            (I)Reviews
        (F)Z250
            (I)Pictures
            (I)Specs
            (I)Reviews
        (F)Z350
            (I)Pictures
            (I)Specs
            (I)Reviews
        (I)Splash Page
    (F)Heavy Trucks
    (F)Consumer Vehicles
    (I)Overview 

If I write:

from item in lightTrucks.Items where item.Title == "Pictures" select item

What is the most intuitive way to express an intent that the query get all items underneath Light Trucks, or only the immediate ones? The least-intrusive, lowest-friction way to distinguish between the two intents?

My #1 goal is to be able to turn this LINQ provider over to other developers who have an average understanding of LINQ and allow them to write both recursive and list queries without giving them a tutorial on writing recursive lambdas. Given a usage that looks good, I can code the provider against that.

Additional clarification: (I am really sucking at communicating this!) - This LINQ provider is to an external system, it is not simply walking an object graph, nor in this specific case does a recursive expression actually translate into any kind of true recursive activity under the hood. Just need a way to distinguish between a "deep" query and a "shallow" one.

So, what do you think is the best way to express it? Or is there a standard way of expressing it that I've missed out on?

Rex it is difficult to answer the question outside of the context of the two examples you have given without knowing more about the kinds of operation you expect or more about the hierarchal data structure you are exposing. Can you tell us any more? – Mike Minutillo Apr 24, 2009 at 8:19 Can I ask: what is the reluctance with the xml "Descendents" approach? Since this is iterator-block based it isn't as though it buffers everything. I've also added a "SelectDeep" (to mirror the standard "SelectMany") which you might find useful. Maybe. – Marc Gravell Apr 28, 2009 at 15:01 @Marc to me, it is counter-intuitive to have an object which represents a small scope of items (current children) but to have a method on that object which returns a much larger scope (all descendants). From a consumer standpoint, I expect methods on a collection-like object to act on the scope the original object. – Rex M Apr 28, 2009 at 15:19

Linq-toXml handles this fine, there is an XElement.Elements()/.Nodes() operation to get immediate children, and a XElement.Descendents()/DescendentNodes() operations to get all descendents. Would you consider that as an example?

To summarize Linq-to-Xml's behavior... The navigation functions each correspond to an axis type in XPath (http://www.w3schools.com/xpath/xpath_axes.asp). If the navigation function selects Elements, the axis name is used. If the navigation function selects Nodes, the axis name is used with Node appended.

For instance, there are functions Descendants() and DescendantsNode() correspond to XPath's descendants axis, returning either an XElement or an XNode.

The exception case is not surprisingly the most used case, the children axis. In XPath, this is the axis used if no axis is specified. For this, the linq-to-xml navigation functions are not Children() and ChildrenNodes() but rather Elements() and Nodes().

XElement is a subtype of XNode. XNode's include things like HTML tags, but also HTML comments, cdata or text. XElements are a type of XNode, but refer specifically to HTML tags. XElements therefore have a tag name, and support the navigation functions.

Now its not as easy to chain navigations in Linq-to-XML as it is XPath. The problem is that navigation functions return collection objects, while the navigation functions are applied to non-collections. Consider the XPath expression which selects a table tag as an immediate child then any descendant table data tag. I think this would look like "./children::table/descendants::td" or "./table/descendants::td"

Using IEnumerable<>::SelectMany() allows one to call the navigation functions on a collection. The equivalent to the above looks something like .Elements("table").SelectMany(T => T.Descendants("td"))

The point was to give an example of how a linq provider might deal with recursion, the example is linq2xml. – Frank Schwieterman Jul 27, 2011 at 17:26

Well, the first thing to note is that actually, lambda expressions can be recursive. No, honestly! It isn't easy to do, and certainly isn't easy to read - heck, most LINQ providers (except LINQ-to-Objects, which is much simpler) will have a coughing fit just looking at it... but it is possible. See here for the full, gory details (warning - brain-ache is likely).

However!! That probably won't help much... for a practical approach, I'd look at the way XElement etc does it... note you can remove some of the recursion using a Queue<T> or Stack<T>:

using System;
using System.Collections.Generic;
static class Program {
    static void Main() {
        Node a = new Node("a"), b = new Node("b") { Children = {a}},
            c = new Node("c") { Children = {b}};
        foreach (Node node in c.Descendents()) {
            Console.WriteLine(node.Name);
class Node { // very simplified; no sanity checking etc
    public string Name { get; private set; }
    public List<Node> Children { get; private set; }
    public Node(string name) {
        Name = name;
        Children = new List<Node>();
static class NodeExtensions {
    public static IEnumerable<Node> Descendents(this Node node) {
        if (node == null) throw new ArgumentNullException("node");
        if(node.Children.Count > 0) {
            foreach (Node child in node.Children) {
                yield return child;
                foreach (Node desc in Descendents(child)) {
                    yield return desc;

An alternative would be to write something like SelectDeep (to mimic SelectMany for single levels):

public static class EnumerableExtensions
    public static IEnumerable<T> SelectDeep<T>(
        this IEnumerable<T> source, Func<T, IEnumerable<T>> selector)
        foreach (T item in source)
            yield return item;
            foreach (T subItem in SelectDeep(selector(item),selector))
                yield return subItem;
public static class NodeExtensions
    public static IEnumerable<Node> Descendents(this Node node)
        if (node == null) throw new ArgumentNullException("node");
        return node.Children.SelectDeep(n => n.Children);

Again, I haven't optimised this to avoid recursion, but it could be done easily enough.

"lambda expressions can be recursive...It isn't easy to do, and certainly isn't easy to read" indeed, which is the purpose of this question! See my second-to-last paragraph. I know how to do recursion but I'm a fairly senior guy and, as you said, it's not pleasant even for us. So I have no problem writing the provider side of things, just looking for a mid-level-friendly way to express it. I'll code the provider against that. – Rex M Apr 27, 2009 at 14:44 Wow? Really? You could write a recursive (yet pure) lambda expression? It would take me a good deal of time and concentration... Mads just happens to be frighteningly smart... – Marc Gravell Apr 27, 2009 at 15:21 I think we are saying the same thing. Recursive lambdas are indeed possible; ways to do them is well-documented and probably easy for someone. They are doable but not easy for me and I consider myself a fairly senior guy, so I'm pretty sure they're not easy for mid-level people. So again, as stated in my question, I am looking for an easy way to express it. I am clear on how to execute it, don't need help there (and if I do, I'll open another question for that). – Rex M Apr 27, 2009 at 20:02 OK - we're getting side-tracked. That remark was intended as an (hopefully interesting) aside. My main point was, as others suggest, to look at the approaches used in LINQ-to-XML, with an example. – Marc Gravell Apr 27, 2009 at 20:20 I'll probably go the XML-style route if no better suggestions come along, but I'm not totally happy with that which is why I didn't accept it and opened a bounty. I'd like to get some other ideas. I've found a few via the Google, would like to see some more. – Rex M Apr 27, 2009 at 20:27

I'd go with implementing it in such a way as to have control over how deeply I want to query as well.

Something like Descendants() would retrieve Descendants through all levels while Descendants(0) would retrieve immediate children, Descendants(1) would get children and grandchildren and so on...

Rex, you've certainly opened an interesting discussion, but you seem to have eliminated all possibilities - that is, you seem to reject both (1) having the consumer write recursive logic, and (2) having your node class expose relationships of greater than one degree.

Or, perhaps you haven't entirely ruled out (2). I can think of one more approach which is nearly as expressive as the GetDescendents method (or property), but might not be quite so 'ponderous' (depending on the shape of your tree)...

from item in AllItems where item.Parent == currentNode select item
from item in AllItems where item.Ancestors.Contains(currentNode) select item
                This is one possibility I am seriously considering. I like the idea of having two separate properties which represent different scopes.  It is not my intention to "eliminate" any possibility, I am just trying to push people to think outside the usual methods that come to mind for this type of work.
– Rex M
                Apr 30, 2009 at 4:12
                that's how i did it. I made a flattenchildren and a flattentree which correspond to the descendents and descendentsandself
– Steve
                Apr 30, 2009 at 2:37

Are you okay with doing the heavy lifting in your object? (it's not even that heavy)

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
namespace LinqRecursion
    class Program
        static void Main(string[] args)
            Person mom = new Person() { Name = "Karen" };
            Person me = new Person(mom) { Name = "Matt" };
            Person youngerBrother = new Person(mom) { Name = "Robbie" };
            Person olderBrother = new Person(mom) { Name = "Kevin" };
            Person nephew1 = new Person(olderBrother) { Name = "Seth" };
            Person nephew2 = new Person(olderBrother) { Name = "Bradon" };
            Person olderSister = new Person(mom) { Name = "Michelle" };
            Console.WriteLine("\tAll");
            //        All
            //Karen 0
            //Matt 1
            //Robbie 2
            //Kevin 3
            //Seth 4
            //Bradon 5
            //Michelle 6
            foreach (var item in mom)
                Console.WriteLine(item);
            Console.WriteLine("\r\n\tOdds");
            //        Odds
            //Matt 1
            //Kevin 3
            //Bradon 5
            var odds = mom.Where(p => p.ID % 2 == 1);
            foreach (var item in odds)
                Console.WriteLine(item);
            Console.WriteLine("\r\n\tEvens");
            //        Evens
            //Karen 0
            //Robbie 2
            //Seth 4
            //Michelle 6
            var evens = mom.Where(p => p.ID % 2 == 0);
            foreach (var item in evens)
                Console.WriteLine(item);
            Console.ReadLine();
    public class Person : IEnumerable<Person>
        private static int _idRoot;
        public Person() {
            _id = _idRoot++;
        public Person(Person parent) : this()
            Parent = parent;
            parent.Children.Add(this);
        private int _id;
        public int ID { get { return _id; } }
        public string Name { get; set; }
        public Person Parent { get; private set; }
        private List<Person> _children;
        public List<Person> Children
                if (_children == null)
                    _children = new List<Person>();
                return _children;
        public override string ToString()
            return Name + " " + _id.ToString();
        #region IEnumerable<Person> Members
        public IEnumerator<Person> GetEnumerator()
            yield return this;
            foreach (var child in this.Children)
                foreach (var item in child)
                    yield return item;
        #endregion
        #region IEnumerable Members
        IEnumerator IEnumerable.GetEnumerator()
            return this.GetEnumerator();
        #endregion
                I don't understand how this addresses my question. I am not asking how to implement a recursive lambda - in fact, I already have established most of the inner workings on how the recursion works. I'm trying to find a way to express it from the front-end, so that distinguishing between a first-level query and a recursive query is intuitive to the person writing the LINQ query.
– Rex M
                Apr 30, 2009 at 2:20
                Could you provide a simple hierarchy of your data and which nodes you want to see selected?  It might help everyone here understand your goal better.
– Matthew Whited
                Apr 30, 2009 at 2:42
                That is fine for your purposes, but my LINQ provider is to an external system. Simply writing a clever extension method which can walk an object graph is not relevant to the problem here. If it helps you may think of it as LINQ to SQL.
– Rex M
                Apr 30, 2009 at 4:36
        

Thanks for contributing an answer to Stack Overflow!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.