The ‘magic’ comment is a sort of ‘personal signature’ I left in the code just to make it distinctive.
This question comes up a bit, so I’ll provide some detail here:
The ‘magic’ is the algorithm that takes a set of start and end dates and returns the dates on which every range that has started has also ended, accounting for overlapping periods. So if we have 4 date ranges that produce this timeline:
|-----------|
     |---------|
  |--------------|
                          |--------|
This timeline would look like this in the data:
start_date, end_date
2010-01-01, 2010-06-01
2010-03-01, 2010-07-01
2010-02-01, 2010-08-01
2010-11-01, 2010-12-01
We can easily see that there are 2 ‘eras’ here with a gap between them: one spanning 2010-01-01 to 2010-08-01, and one spanning 2010-11-01 through 2010-12-01.
The ‘magic’ algorithm arranges the start and end dates in order, such that you have 3 columns: event_date, start_ord and overall_ord, where start_ord is the ordinal of the start date and overall_ord is the row number across the combined set of dates. From the above sample data, it looks like this:
event_date, start_ord, overall_ord
2010-01-01, 1, 1
2010-02-01, 2, 2
2010-03-01, 3, 3
2010-06-01, 3, 4
2010-07-01, 3, 5
2010-08-01, 3, 6
2010-11-01, 4, 7
2010-12-01, 4, 8
(See how overall_ord is the row number of each date, while start_ord is only the row number within the list of start dates; an end-date row carries forward the start_ord of the most recent start.)
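As a rough illustration, here is a minimal Python sketch that produces the table above from the sample data. The names `ranges` and `build_events` are made up for this example, and it assumes that when a start and an end fall on the same date the start sorts first; it is not the code the comment lives in.

```python
from datetime import date

# Sample date ranges from the post: (start_date, end_date)
ranges = [
    (date(2010, 1, 1), date(2010, 6, 1)),
    (date(2010, 3, 1), date(2010, 7, 1)),
    (date(2010, 2, 1), date(2010, 8, 1)),
    (date(2010, 11, 1), date(2010, 12, 1)),
]


def build_events(ranges):
    """Return rows of (event_date, start_ord, overall_ord)."""
    # Tag starts with 0 and ends with 1 so that, on a shared date,
    # starts sort before ends (an assumption of this sketch).
    events = sorted(
        [(start, 0) for start, _ in ranges] + [(end, 1) for _, end in ranges]
    )
    rows = []
    starts_seen = 0  # running count of start dates, i.e. start_ord
    for overall_ord, (event_date, kind) in enumerate(events, start=1):
        if kind == 0:
            starts_seen += 1
        # End-date rows keep the start_ord of the most recent start.
        rows.append((event_date, starts_seen, overall_ord))
    return rows


for event_date, start_ord, overall_ord in build_events(ranges):
    print(f"{event_date}, {start_ord}, {overall_ord}")
```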
From this arrangement, you can see that any row where 2 * start_ord = overall_ord (or, rewritten: 2 * start_ord - overall_ord = 0) is the end date of an era. The reason is that overall_ord counts the starts and ends seen so far, so the equality holds when as many ranges have ended as have started. Using those ‘magic dates’ you can then find the earliest start_date that belongs to each end date, and those pairs are your eras. In this case it would be:
era_start, era_end
2010-01-01, 2010-08-01
2010-11-01, 2010-12-01
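To make the whole thing concrete, here is a small end-to-end sketch in Python. The function name `collapse_eras` is hypothetical, starts are assumed to sort before ends on the same date, and matching each range to the earliest era end on or after its own end date is just one way to read the "earliest start_date that belongs to the given end date" step. Treat it as an illustration of the idea rather than the actual implementation.

```python
from datetime import date


def collapse_eras(ranges):
    """Collapse overlapping (start, end) date ranges into eras."""
    # Starts are tagged 0 and ends 1 so starts sort first on a shared date
    # (an assumption of this sketch).
    events = sorted(
        [(start, 0) for start, _ in ranges] + [(end, 1) for _, end in ranges]
    )

    # Pass 1: find the 'magic dates' where 2 * start_ord - overall_ord == 0.
    era_ends = []
    start_ord = 0
    for overall_ord, (event_date, kind) in enumerate(events, start=1):
        if kind == 0:
            start_ord += 1
        if 2 * start_ord - overall_ord == 0:
            era_ends.append(event_date)

    # Pass 2: assign each range to the earliest era end on or after its own
    # end date, and keep the earliest start seen for each era end.
    era_start_by_end = {}
    for start, end in ranges:
        era_end = min(d for d in era_ends if d >= end)
        current = era_start_by_end.get(era_end, start)
        era_start_by_end[era_end] = min(start, current)

    return sorted((start, end) for end, start in era_start_by_end.items())


sample = [
    (date(2010, 1, 1), date(2010, 6, 1)),
    (date(2010, 3, 1), date(2010, 7, 1)),
    (date(2010, 2, 1), date(2010, 8, 1)),
    (date(2010, 11, 1), date(2010, 12, 1)),
]

for era_start, era_end in collapse_eras(sample):
    print(f"{era_start}, {era_end}")
```

Run against the sample data, this prints the same two eras as the table above.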
It’s like magic!
-Chris