A while back, I wrote a post on recursively iterating a tree with generators. At the time, I wasn’t sure if this was even useful or not. I didn’t have a good use case for it – I was just trying to learn generators and figured out that I could do this.
More recently, I ran into a scenario where I had a hierarchical tree of JSON data and needed to iterate through the attributes at the very bottom, selecting the parent object of the attribute that matched a specific criterion.
After thinking through a myriad of ways to handle this, my old generators post popped into my head, and I wondered if I could use it.
The Data Structure And Criteria
The system in question deals with schedules that have jobs and tasks. Way down at the bottom of the structure, tasks have two separate IDs – the original “template id” from which the task was generated, and the current id – the “task instance id”.
A sample of the data that I pulled from my development environment looks like this:
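As a stand-in, here's a trimmed-down, made-up sample with the same general shape. Only `_id` and `jobStepTemplateId` are names from the real data; the `jobs` and `steps` property names and every ID value are assumptions for illustration.

```javascript
// Hypothetical sample: "_id" and "jobStepTemplateId" come from the post,
// while "jobs", "steps", and all of the ID values are invented.
const schedule = {
  _id: "schedule-1",
  jobs: [
    {
      _id: "job-1",
      steps: [
        { _id: "step-1", jobStepTemplateId: "template-a" },
        { _id: "step-2", jobStepTemplateId: "template-b" }
      ]
    },
    {
      _id: "job-2",
      steps: [
        { _id: "step-3", jobStepTemplateId: "template-c" }
      ]
    }
  ]
};
```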
From this data set, I need to traverse the steps and compare the “_id” to an ID that I have. Once I have that, I need to grab the step object that owns it so I can use the “jobStepTemplateId” to do some other work based on the template from which the step was created.
The First Idea: forEach
My first thought on how to solve this was to use my typical .forEach on the structure. This works, no problem. I do this a lot.
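A minimal sketch of the nested .forEach version, assuming a schedule object with `jobs` and `steps` arrays (those property names, the IDs, and the `findStepWithForEach` helper are all hypothetical). The `visited` counter is only there to show how many steps get looked at.

```javascript
// Hypothetical data: only "_id" and "jobStepTemplateId" are names from the post.
const schedule = {
  jobs: [
    { steps: [
      { _id: "step-1", jobStepTemplateId: "template-a" },
      { _id: "step-2", jobStepTemplateId: "template-b" }
    ] },
    { steps: [
      { _id: "step-3", jobStepTemplateId: "template-c" }
    ] }
  ]
};

function findStepWithForEach(schedule, stepId) {
  let found;
  let visited = 0; // how many steps the loops touched

  schedule.jobs.forEach((job) => {
    job.steps.forEach((step) => {
      visited += 1;
      // .forEach has no way to stop early, so we just remember the first match
      if (!found && step._id === stepId) {
        found = step;
      }
    });
  });

  return { found, visited };
}

const forEachResult = findStepWithForEach(schedule, "step-1");
console.log(forEachResult.found.jobStepTemplateId, forEachResult.visited);
// prints: template-a 3
```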
But when I started thinking about this, I realized that I don’t like this because it requires complete iteration over the entire data set, even if I found the desired object in the first position.
I don’t want that. I want to find it and exit the loops entirely.
The Next Approach: For Loops and Break;
Breaking out of a for-loop is easy enough and suits my purpose.
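Here's a rough sketch of that idea, again assuming hypothetical `jobs` and `steps` arrays and made-up IDs. It uses a labeled `break` to leave both loops at once, which is one of several ways to do it (a flag variable or an early `return` would work too).

```javascript
// Hypothetical data: only "_id" and "jobStepTemplateId" are names from the post.
const schedule = {
  jobs: [
    { steps: [
      { _id: "step-1", jobStepTemplateId: "template-a" },
      { _id: "step-2", jobStepTemplateId: "template-b" }
    ] },
    { steps: [
      { _id: "step-3", jobStepTemplateId: "template-c" }
    ] }
  ]
};

function findStepWithBreak(schedule, stepId) {
  let found;
  let visited = 0; // how many steps the loops touched

  search:
  for (const job of schedule.jobs) {
    for (const step of job.steps) {
      visited += 1;
      if (step._id === stepId) {
        found = step;
        break search; // leaves the inner AND outer loop
      }
    }
  }

  return { found, visited };
}

const breakResult = findStepWithBreak(schedule, "step-1");
console.log(breakResult.found.jobStepTemplateId, breakResult.visited);
// prints: template-a 1
```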
That’s nice. It works the way I want by shortcutting the loop and exiting when I find the desired item.
But I don’t like the way this code looks. Having the logic of what item to select buried down deep makes it very hard to know what’s going on from a high-level perspective. Sure, I could use a callback method specified at the top level and passed through all the layers of iteration… but that doesn’t feel right anymore. I couldn’t quite put my finger on why; I just didn’t like passing this callback through layer after layer when it was only used at the very bottom.
Iterating with Generators
It was at this point that I realized I could use my ES6 generator code from that previous post to do the iteration. I had already discovered the ability to stop iterating with a generator, simply by not calling iterator.next() anymore. That means I don’t need a “break” in a for-loop. I just need to stop calling iterator.next() and exit like any other function.
So I made the change to use a generator at the very bottom of the code. But in order to do that, I need generators all the way down.
This code first iterates through the top-level list of jobs. Rather than yielding any job out to the calling code, though, it hands off control of the iteration to the next level down. Once the code is down into the steps, it yields the individual step back out to the calling code.
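As a sketch of that "generators all the way down" structure: the function names `iterateJobs` and `iterateSteps`, the `jobs` and `steps` property names, and the IDs are assumptions for illustration. The delegation from one level to the next is done with `yield*`.

```javascript
// Hypothetical data: only "_id" and "jobStepTemplateId" are names from the post.
const schedule = {
  jobs: [
    { steps: [
      { _id: "step-1", jobStepTemplateId: "template-a" },
      { _id: "step-2", jobStepTemplateId: "template-b" }
    ] },
    { steps: [
      { _id: "step-3", jobStepTemplateId: "template-c" }
    ] }
  ]
};

// Bottom level: yields each step out to whoever is driving the iteration.
function* iterateSteps(steps) {
  for (const step of steps) {
    yield step;
  }
}

// Top level: never yields a job itself, it delegates to the steps generator.
function* iterateJobs(jobs) {
  for (const job of jobs) {
    yield* iterateSteps(job.steps);
  }
}

const stepIds = [...iterateJobs(schedule.jobs)].map((step) => step._id);
console.log(stepIds.join(", "));
// prints: step-1, step-2, step-3
```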
Now I need to inject the code that checks for the desired item. That should be done down in the bottom where I’m yielding the step, right?
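A sketch of what that might look like, with the ID check pushed down into the bottom generator. The names `iterateJobs` and `iterateSteps`, the `stepId` parameter, and the data are hypothetical.

```javascript
// Hypothetical data: only "_id" and "jobStepTemplateId" are names from the post.
const schedule = {
  jobs: [
    { steps: [
      { _id: "step-1", jobStepTemplateId: "template-a" },
      { _id: "step-2", jobStepTemplateId: "template-b" }
    ] },
    { steps: [
      { _id: "step-3", jobStepTemplateId: "template-c" }
    ] }
  ]
};

// The criteria check lives at the bottom, so stepId has to be threaded
// through every layer just to reach it.
function* iterateSteps(steps, stepId) {
  for (const step of steps) {
    if (step._id === stepId) {
      yield step;
    }
  }
}

function* iterateJobs(jobs, stepId) {
  for (const job of jobs) {
    yield* iterateSteps(job.steps, stepId);
  }
}

const match = iterateJobs(schedule.jobs, "step-2").next().value;
console.log(match.jobStepTemplateId);
// prints: template-b
```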
But that’s no different than what I had before. I don’t like it. It hides the useful knowledge of what I’m looking for, down deep in the code that implements the mechanics of searching.
Inverting Control Of Iteration And Data Processing
Then it hit me – the “iterator.next” call is what tells the generator to move on to the next item. This was the “aha!” moment for me – when I realized that I don’t need the criteria check at the very bottom. I can have it at the very top of the code where it can be easily seen and understood, because the top level code is in control of the iteration, not the bottom of the code like before!
The change, then, is to stop calling “iterator.next” after I find the item I’m looking for.
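Roughly, that could look like the sketch below. The generator names, the `targetId` variable, and the sample data are assumptions; the point is only that the comparison and the decision to keep going both sit in the calling code.

```javascript
// Hypothetical data: only "_id" and "jobStepTemplateId" are names from the post.
const schedule = {
  jobs: [
    { steps: [
      { _id: "step-1", jobStepTemplateId: "template-a" },
      { _id: "step-2", jobStepTemplateId: "template-b" }
    ] },
    { steps: [
      { _id: "step-3", jobStepTemplateId: "template-c" }
    ] }
  ]
};

function* iterateSteps(steps) {
  for (const step of steps) {
    yield step;
  }
}

function* iterateJobs(jobs) {
  for (const job of jobs) {
    yield* iterateSteps(job.steps);
  }
}

const targetId = "step-2";
const iterator = iterateJobs(schedule.jobs);

let found;
let current = iterator.next();
while (!current.done && !found) {
  if (current.value._id === targetId) {
    found = current.value; // match: we simply never call next() again
  } else {
    current = iterator.next(); // no match: ask the generator for the next step
  }
}

console.log(found.jobStepTemplateId);
// prints: template-b
```

Because the generator is paused rather than finished, "step-3" is never produced unless something calls iterator.next() again.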
Now we’re talking! Now the code is looking better… keeping the control of iteration and the code that checks the data against criteria at the top level.
Cleaning It Up
After I got that working, I realized that I don’t like all the extra noise of controlling the generator iteration in that code. It’s needed, I think, but I don’t want to see it. Instead, I want to see something clean like this:
And that’s certainly easy enough with a single callback method passed in.
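One way this could be sketched, assuming a hypothetical `findStep(schedule, isMatch)` helper and made-up data. Note that this sketch swaps the manual iterator.next() calls for a `for...of` loop, which drives the generator the same way and closes it when the function returns early.

```javascript
// Hypothetical data: only "_id" and "jobStepTemplateId" are names from the post.
const schedule = {
  jobs: [
    { steps: [
      { _id: "step-1", jobStepTemplateId: "template-a" },
      { _id: "step-2", jobStepTemplateId: "template-b" }
    ] },
    { steps: [
      { _id: "step-3", jobStepTemplateId: "template-c" }
    ] }
  ]
};

// The mechanics of walking the structure, hidden behind one generator.
function* iterateSteps(schedule) {
  for (const job of schedule.jobs) {
    yield* job.steps; // arrays are iterable, so yield* can delegate to them
  }
}

// The noise of driving the generator lives here, out of sight of the caller.
function findStep(schedule, isMatch) {
  for (const step of iterateSteps(schedule)) {
    if (isMatch(step)) {
      return step; // returning stops the iteration
    }
  }
  return undefined;
}

// The top-level code only states what it is looking for.
const step = findStep(schedule, (s) => s._id === "step-2");
console.log(step.jobStepTemplateId);
// prints: template-b
```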
In this version of the code, I’m using the best of all the worlds that I can think of. I’ve got a simple callback to check the validity of the item in question, but I’m still using ES6 generators behind the scenes. This lets me stop iteration when I am done, while still pushing the knowledge of what item I want, up to the top of the code structure.
A Benefit Of Generators
I’ve been struggling to understand the “why” of generators since I started learning them. I figured out the “what” and “how” pretty quickly, though not very easily. But the value of generators, beyond the possibility of async/await style syntax, eluded me.
It was this exercise in searching through the data structure to find what I wanted, that helped me to see a truth about generators… to see some of the real value that they provide:
Generators invert control of iteration, allowing you to cleanly separate the mechanics of iteration from the processing of items iterated.
This was the real lesson learned for me – seeing the ability to iterate the entire list of steps at the bottom of the data structure and have the top-level code be in complete control over when the iteration happened and whether or not it continued to happen.
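A small sketch that makes this visible: the generator below records every step it reaches in a `visited` array (a hypothetical addition, as are the data and names), and the calling code decides how far it gets.

```javascript
// Hypothetical data: only "_id" and "jobStepTemplateId" are names from the post.
const schedule = {
  jobs: [
    { steps: [
      { _id: "step-1", jobStepTemplateId: "template-a" },
      { _id: "step-2", jobStepTemplateId: "template-b" }
    ] },
    { steps: [
      { _id: "step-3", jobStepTemplateId: "template-c" }
    ] }
  ]
};

const visited = [];

function* iterateSteps(schedule) {
  for (const job of schedule.jobs) {
    for (const step of job.steps) {
      visited.push(step._id); // records how far the generator actually ran
      yield step;
    }
  }
}

const iterator = iterateSteps(schedule);
iterator.next(); // the generator runs until it yields step-1
iterator.next(); // ...and then until it yields step-2

// The caller stops asking here, so the generator never reaches step-3.
console.log(visited.join(", "));
// prints: step-1, step-2
```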
With that, iterators have become more valuable to me than I previously thought.