Efficient deadline management with annotated B-trees
At post-
(Background on annotated trees: given any balanced-
Ian's scenario was that he was trying to schedule a bunch of jobs to be done; each job has a known duration and a known deadline by which it must be completed. The usual trick here is to sort the jobs by their deadline, doing the most urgent ones first. But a literal sort algorithm isn't adequate, because Ian's list of jobs is dynamic: at any time, a new job may become known, and need to be added to the list. Sometimes, adding a new job in the middle of the list might render it impossible to meet all the current deadlines –
This seemed like just the sort of problem which could be solved by an appropriately annotated tree, which was why Ian brought it to me in the first place. And, after a couple of false starts, it turns out that it can.
You store your jobs in a B-
We must show that these are easy to compute for a single job and for the concatenation of two correctly annotated lists of jobs. Clearly both are easy to compute for a single job: the deadline is provided by the user, and the latest safe start time is that minus the job duration.
Now, given sublist A with final deadline DA and latest start time SA, and sublist B with final deadline DB and latest start time SB (and, by the nature of adjacent sublists in a sorted tree, we also know that nothing in B has a deadline before DA), we need to be able to compute the annotations for the concatenated list. Well, the latest deadline for the full list is of course just DB. Now when is the latest start time for the whole lot? Well, the jobs in list A must be finished by time DA (of course), and they must also be finished by time SB, so that we can start doing the jobs in list B. So we compute EDA, the effective deadline for list A, as the earlier of DA and SB; then we adjust our start time SA to take account of the difference between EDA and DA, so that the start time for the whole list is equal to SA + (EDA − DA). That's it.
Given these annotations, we can now do a constant-
So if the latest safe start time for the entire tree is still in the future, we're OK. If it's in the past, we're doomed, and we are inevitably going to miss some deadline or other.
Hence, we have exactly the ‘add job’ operation Ian wanted. Given a new job, we add it to our B-
In fact this was overkill for Ian's actual application, since there's a sufficiently small bound on the number of jobs that it's much easier to do simple stuff with circular buffers and swallow the occasional O(N) reshuffle. But he was curious to know how it would be done efficiently for large job collections, and I think the above approach answers that to at least my satisfaction.