As a preface, I’d like to say that this article isn’t intended to help people cheat at the problem, but is posted for the intellectual exercise contained within. Hopefully the folks in and around the HackerRank community won’t mind.
Lately I’ve been having some fun with the challenges on HackerRank. By all accounts, the general (and consensual) solution/s to the Team Formation problem involve some combination of data structures, sorting algorithms and such.
That’s all well and good, but having hit my head many times as a child, something in the back of my mind was nagging away at me, insisting that there existed a more specialized solution. As is a recurring theme of this blog – and indeed my life – I could not resist its call. Fortunately, this one actually paid off.
The algorithm I subsequently developed does not (in theory) depend on any particular data structure nor require sorting; it satisfies the condition of being “greedy“; (potentially) has a time complexity of , requiring only 1-2 loops; and for added lolz, can be made stable and even online if necessary.*
* While this is all possible, in some cases it’s not practical; in one particular implementation I did have to use std::sort. I’ll elaborate more on this at the end of the post.
The problem (briefly)
The objective of the Team Formation challenge is to accept a list of integers representing skill values for members of a team – and to organise those members such that a) for any one member with skill , either that member is the lowest skilled member of the team or there exists another member of the team with the skill – and b) the members are organised into as few teams as possible, with members being spread as evenly as possible across every team. To pass the test, one must output the size (number of members) of the smallest team.
For example, given the set , the two teams that can be formed are and . The smallest team is and so the correct output for this test case is . (Note that it would be erroneous to organise this input into the teams , and , for example.)
Imagine a grid. If you’re struggling, below is a visual to help you along:
For an example input of , the correct team formation can be visualised as:
If another value were to be added, it would travel down the x axis towards 0 to the next free slot, resulting in:
This works because there is a “free” at coordinates . If this were not the case, the new value would be placed at the end of the row – at . Below is another example, using an additional value of .
Following a set of rules for how to handle various local conditions when a new member is added, it’s possible to organise and – if implemented properly – effortlessly reorganise the members into their most appropriate structure. You can even track the size of each team on the fly.
For the same reason that I’ve not gone into the algorithm in much depth (i.e. not wanting to help out the dirty cheaters too much), I won’t post any code here. However, I did write an implementation which passed the HackerRank test cases.
Following that, I decided to grab a few of the other top scoring solutions along with some of the larger test cases and benchmark them locally against my algorithm using Very Sleepy. The numbers are listed below (lower is better):
As you can see, I outperformed the competition – which was a reassuring conclusion to this little experiment, having spent far more time and energy than was warranted.
Some notes about implementation
This algorithm doesn’t require use of any particular data structure or sorting algorithm; you can just use the integer value of the skill level and perform local operations on and and such.
Under certain conditions (e.g. access to large memory or some assurance that will be within a certain range) this means you can perform all necessary operations using a regular unsorted integer array. However, as the conditions of the challenge state that may be as large as – which is far too big for the test scenario – that leaves two options: use another lookup method (e.g. hash tables) or sort the array.
In my first attempt I did use an unordered_map, but lookups created a bottleneck so I tried out the sorting approach, which significantly improved performance.
Even after all of this, there’s still a voice in my head telling me that an even better algorithm exists, but… I must move on. I’ve satisfied my curiosity (mostly). And I have things to draw.