How to Efficiently Merge Two Lists Unix Timestamps
Say you’re retrieving a comprehensive list of someone’s Github activities, and you now have data of each week’s additions, deletions, and commits. Great! We can just chart it out, nice and easy. Now, say someone comes in and asks to compare engineer A’s activities with engineer B’s activities.. on the same graph! Getting engineer B’s data is just calling Github’s API again, but organizing the data is the tricky part. We’ll have two sets of activities with two different sets of Unix timestamps, how do we resolve it?
If you google how to join two arrays in javascript, the top answers will tell you to concat the arrays then de-duplicate them after. The first Stack Overflow answer is O( n2 ).. that’s a lot of loops. Let’s see if we can get this time complexity down.
Since we are dealing with time here, we can exploit the fact that time progresses linearly and only run through our loop once. This means our time complexity will be O( n ), where n is the length of the longest input. And once we have a linearly unique set of timestamps, we can resolve the rest of the data easily.
var j, i, finalTimestamps, lengthOfA, lengthOfB;
i = j = 0;
lengthOfA = timestampA.length;
lengthOfB = timestampB.length;
finalTimestamps = [];
while(i < lengthOfA && j < lengthOfB){
if(timestampA[i]===timestampB[j]){
finalTimestamps.push(timestampA[i++]);
j++;
} else if(timestampA[i] < timestampB[j]){
finalTimestamps.push(timestampA[i++]);
} else {
finalTimestamps.push(timestampB[j++]);
}
}
if(i === lengthOfA){
for(;j < lengthOfB; j++){
finalTimestamps.push(timestampB[j]);
}
} else {
for(;i < lengthOfA; i++){
finalTimestamps.push(timestampA[i]);
}
}
We keep two pointers, i and j, and iterate it through the two list of timestamps. If the timestamps are the same (which is a common occurrence in Github API data), we push either timestamp into our final set and increment both lists. Else, we just push the set that’s smaller into our final set. We do this until either one runs out, then we push the rest of our timestamps into our final set.