The Scenario: "We are building a Federated Search Aggregator. When a user types a query, we hit five different database shards in parallel. Each shard returns a list of document IDs sorted by their relevance score.
To present the final results to the user, we need to merge these $K$ sorted lists into one master list. Since we are operating at Google-scale, the total number of results ($N$) can be in the millions, but the number of shards ($K$) is small. How do you merge these efficiently without re-sorting the entire dataset