Sunday, June 30, 2013

Generalized Tower of Hanoi - K Pegs, N Disks

I recently found online that Facebook is giving a fast track job opportunity for those who would solve a specific problem which may be a little more complex than a simple interview question. The process happens online - you login with your Facebook id and have a specific amount of time to write a solution in the language you desire. Here's the url in case you want to try it. They also provide a sample problem and an allotted 45 minutes to solve it. The problem is the well known Tower of Hanoi puzzle but generalized to k pegs and n disks. Given an initial and final configuration of the pegs you have to output the minimal number of steps needed to reach the final configuration. To make it easier they bound the k and n to 1 <= n <= 8, 3 <= k <= 5 and let you assume that there's a solution that takes at least 7 steps.

One basic approach here is brute forcing all possible configurations which results in an undirected graph. And how do you find something in a graph ? Using a search strategy - in my case I chose Breath-First Search.

Before going into the details of this approach let's see what other approaches we can find. Brute forcing will generate in the worst case 5^8 (390625) different configurations and then we'll have to search through them. Is there any other way where we wouldn't do a brute force, but go through the minimal path the first time? It seems like we could use different strategies of navigating through the pegs of a Hanoi Tower. There's basically 3 configuration cases that will take different approaches in navigation. Those are n < k, n=k and n > k. The point is there's a specific move pattern for each of these cases, that you could do to achieve the desired configuration. This article goes into much detail on that. There's also the way of multi-threading - we could parallelize this in some way - for example we could start from both ends.

I chose the brute force this time. While coding this approach a few problems may arise; mainly the problem of how would you represent the pegs. I chose to use a byte array, where each cell represents a peg and each bit of a peg represents a disk (if set).

Code looks like this:

The findLowestBitSet method basically returns the number of trailing zeros. In Java there's a util method for that:

The method could be optimized to look like this:

Monday, June 24, 2013

Bitmap Sort

I was looking through John Bentley's "Programming Pearls" and I found a problem that I encountered before and decided to make note of it here. So you have a large amount of numbers that you're supposed to sort, but there's not enough memory to do that in. Obvious answer is to look for methods of sorting on disk aka  external sorting. There's a few methods known that include taking multiple passes through the input or parallelism. But this article is not about any of those - it should be faster than any of those. Consider that you have a set of 48000 unique numbers in the range of 1..64000. Does this make a difference? Also consider that you have 10K available memory, so you can't load all the numbers in memory. 

The idea here is that we can represent the input as a bit sequence (bit map) - the only thing left is to think of a data type. We could use a number of integers, and map our input to the integers bits. In Java an integer takes 4 bytes or 32 bits, this means we could map 32 input numbers to each integer. So, for example an input number 16 will map to bit 16 of integer 1, where an input number 37 will map to bit 5 of integer 2. 

To represent all possible input numbers we'll need 64000 / 32 integers - 2000 which will take 2000 * 4 = 8K, which fits into our limitations. Second step would be to output the sorted array. Since mapping the integers to a specific bit, makes the sequence sorted, we just need to traverse the bits and output every bit that has been turned on. In the end we get a linear time algorithm that takes O(n) or precisely O(64000) and it looks like this:

- The IO operations are there just to simulate input and output, instead of loading array into memory. Those are trivial, and implementation depends upon language.
- BUCKET_SIZE = 32 in our case

I will go through the code a little to make sure it is clear. So first step is the createBitMap method which creates a bitmap out of the input. On line 11 we decide which bit to set, whereas in line 12 we decide which bucket (integer) this bit should go to. It also does get set on line 12, through an OR operation.

Second step is going through the created bitmap and showing the actual integer values. On line 24 we go through each bit of an integer and check if it's set or not (line 25). If set we convert to the actual integer values on line 27.

Sunday, June 23, 2013

Unsorted Array Converted To "E1 < E2 > E3 < E4..." Format

I had a question once where given an unsorted array of integers, you have to produce an array in the "E1 < E2 > E3 < E4..." format. For example {3, 2, 6, 4, 9, 7, 1, 10, 8, 5} would produce {2, 6, 3, 9, 4, 7, 1, 10, 5, 8} and this is because 2 < 6 > 3 < 9 > 4 < 7 > 1 < 10 > 5 < 8. 

This is a tricky question where if asked by an interviewer you have to ask follow-up questions! I'm saying this out of my experience, my first solutions that I had in mind when I first had this question asked were involving sorting or priority queues, thus at least O(n log n) running time and involving additional space. But when I asked about the desired running time I was told that it should be faster, so I revisited my thoughts. It is in fact a very simple problem where you have to traverse the array just once resulting in O(n) running time. Note that you also have to ask about how to handle duplicates, in this case I considered an array of unique numbers.

The idea is that while traversing the array, you maintain a flag that indicates whether a "less than" or a "greater than" comparison should succeed. If it does not succeed you swap the previous index with the current index. 

This will be easier to understand through an example. Let's say we have the above array {3, 2, 6, 4, 9, 7, 1, 10, 8, 5}. We traverse the array starting at the second element (2), since we'll be comparing the current with the previous element. We also have to maintain the above mentioned flag, say a flag named lessThan which will be defaulted to true since our format definition indicates that we have to start from a less than sign. So we have our element "2" and previous element "3". Is element "3" less than element "2" ? No, then we swap the elements which results in the array {2, 3, 6, 4, 9, 7, 1, 10, 8, 5}. We also have to switch the lessThan flag and then go to the next index. Now the flag tells us that we have to use the "greater than" sign. Is element "3" greater than element "6" ? No, then swap the numbers. But wait, what happens to the previous comparison ? The second index has to stay greater than the first one, but note that this did get verified by the condition that we just imposed by comparing "6" with "3". "6" is greater than "3", which is greater than "2" - this means "6" is greater than "2", thus it's safe swapping "6" with "3". This results in {2, 6, 3, 4, 9, 7, 1, 10, 8, 5}. This goes on until the end of the array. Here's a table that may add more sense to how the algorithm works:

Iteration Array LessThan flag Swapped
0 2, 3, 6, 4, 9, 7, 1, 10, 8, 5 true true
1 2, 6, 3, 4, 9, 7, 1, 10, 8, 5 false true
2 2, 6, 3, 4, 9, 7, 1, 10, 8, 5 true false
3 2, 6, 3, 9, 4, 7, 1, 10, 8, 5 false true
4 2, 6, 3, 9, 4, 7, 1, 10, 8, 5 true false
5 2, 6, 3, 9, 4, 7, 1, 10, 8, 5 false false
6 2, 6, 3, 9, 4, 7, 1, 10, 8, 5 true false
6 2, 6, 3, 9, 4, 7, 1, 10, 8, 5 false false
7 2, 6, 3, 9, 4, 7, 1, 10, 5, 8 true true

The code is pretty simple. See it below:
- The swap function is trivial and it may look like in the code below.

Thursday, June 20, 2013

Validating a Binary Search Tree

This is one of those problems that deceives you into thinking that you got it and if you're not careful enough you'll make a mistake. The problem is that when deciding if a tree is a binary search tree or not, you not only have to make sure that the left child is lesser than the root and right child larger; you also have to make sure that the left child is lesser than all its ancestors and right child is larger than all its ancestors. To do this we'll at  least need to go through all the nodes. Thus we need a traversal algorithm either recursive or iterative. Both the recursive and the iterative will take up to O(n) space (the call stack for recursive and a stack data structure for iterative).

I'll start with the iterative algorithm, and I'll take as an example the tree below.

Simple Binary Search Tree
Fig. 1 - Simple Binary Search Tree

The idea of the iterative one is that a binary search tree traversed in-order will always output a ordered sequence. So the above tree traversed in-order will look like this: 16, 25, 41, 42, 46, 53, 55, 60, 62, 63, 64, 65, 70, 74. And, as a reminder, this happens because an in-order traversal visits the left branch first, then the root, and only in the end the right child - which, according to the binary search tree property where the left branch is less than the root, and the right one is larger than the root; leads us into a sorted sequence. Having this said, it would be enough for us to maintain the last visited node's value and keep traversing while the current node's value is larger than the node that we traversed last. If not - this is not a sorted sequence, thus definitely not a binary search tree.

The recursive version may be slightly difficult to understand, but it may be clearer in a way of showing how the ancestors are checked against the binary search property. This means that the children of a specific node are always limited by an upper and lower limit, so that a binary search tree can happen. For example, the children on the left branch of node "65" in the picture above have to be less than 65 and greater than 60 in order to conform to the binary search property. But what are the children of the left/right branches of root node? Well, left children of the root node, have to be less than 60 and greater than -infinity, and children on the right, have to be greater than 60 and less than infinity. 

I still prefer the iterative version over the recursive one - it's clean enough and besides in a production environment it may be a better idea to use the iterative version. But anyways here's the code below. 

Wednesday, June 19, 2013


Given a function that outputs a coin toss (a random number from 1 to 2) describe an algorithm that outputs a die roll (a random number from 1 to 6). Each outcome should be equally likely.

One solution is tossing the coin 3 times and using the outcome as bits of a three-bit number. So our outcomes would be 000, 001, 010, 011, 100, 101, 110, 111 (0 to 7). If our outcome is 110 or 111 we would have to discard it and toss again.

I found this solution widely accepted on the web, but it seems that discarding 3 tosses is a waste. It seems like we could reuse the lastly tossed number (in the case of 110 and 111 - 0 and 1), and we would need only 2 additional tosses instead of 3. This would change the algorithm to something like this:

The toss function could reuse the random number method mentioned earlier.

Minimum Stack

Another common interview question is implementing a stack like data structure that besides supporting the traditional push() and pop() operations would provide a O(1) operation able to find the minimum element in the stack. The question is easy but I still decided to write it down because it seemed elegant to me.

So the solution consists in maintaining 2 stacks: one for performing the traditional operations as before, and another one that will always have the minimum on top for fast O(1) access. Imagine you have a set of numbers - 5, 2, 8, 5, 7, 13, 3, 11, 1, 6 and you start pushing them on the stack. Every time you push one number on the stack you compare it with the number on top of the other stack, and if it's smaller, you push it on the second stack too. This way after pushing all the numbers on both stacks you should have the second stack looking like this: 5, 2, 1 and the first stack looking as mentioned before like this 5, 2, 8, 5, 7, 13, 3, 11, 1, 6.

We also need to maintain the minimum when pop-ing elements from our new data structure. We pop from the first stack unconditionally, but from the second stack we check first if the element we just popped from the first stack is actually the element on top of the second stack; if so - we pop it from the second too.

This is how it would look if we pop all elements from the stack:

Iteration Stack 1 Stack 2 Current minimum
0 5, 2, 8, 5, 7, 13, 3, 11, 1, 6 5, 2, 1 1
1 5, 2, 8, 5, 7, 13, 3, 11, 1 5, 2, 1 1
2 5, 2, 8, 5, 7, 13, 3, 11 5, 2 2
3 5, 2, 8, 5, 7, 13, 3 5, 2 2
8 5, 2 5, 2 2
9 5 5 5

In code it looks very simple and short:

- It is not necessary to use 2 stacks here. Another way was making MinimumStack a child of Stack. I just like symmetry and encapsulation. Besides it seems easier to show the idea this way.

Monday, June 17, 2013

Shortest Path in a Binary Search Tree

The shortest path in a binary search tree is not much more different than the shortest path in a binary tree that I handled here. The difference consists in reusing the binary search tree property where the left branch nodes are less than the root, and the right branch nodes are larger than the root. As in the binary tree shortest path, the procedure consists of finding the lowest common ancestor which I described here and then "drawing" the path from the ancestor to the nodes. Let's use the below tree as an example.

Simple Binary Search Tree
Fig. 1 - Simple Binary Search Tree

Let's say we want to find the smallest path from "25" to "46". It is obvious that it has to go through the lowest common ancestor that we identify as "41". Now we have 2 basic cases: either one of the nodes we're looking for is the LCA and the other is in either of the branches; or each of the nodes is in a separate branch. Now if we don't have the first case, we can look for the minimum of the two nodes in the left branch and for the maximum in the right branch. And again, this is because of the binary search tree property - that the lower element goes on the left and higher on the right. So we go down on the left binary searching for our node "25" and same with "46". I say binary searching because at each step we compare the current node with the node we're looking for and decide whether to go to the right or to the left. Managing the path is tricky, since the path for the second element will come inverted with recursion. In my implementation I used a stack to manage that as I did when looking for the shortest path in a simple binary tree:

- First stack is not really needed, a simple list would do. I just like symmetry.
- For the lowestCommonAncestor method see here.

Shortest Path in a Binary Tree

This one may be from the tougher category, but it becomes easy when you understand the idea. So the question is "what is the shortest route between 2 unique nodes in a binary tree". I'll handle the same problem but in a binary search tree in a separate article. I knew the answer before, since I reviewed the lowest common ancestor in a binary tree first. Yes, the idea is to find first the lowest common ancestor of the two nodes and then "draw the path" from the LCA to the 2 nodes. I explained finding the LCA in a binary tree in this article, so I'll go straight to the second part: identifying the path. Let's use the below tree as an example.

Simple Binary Tree
Fig. 1 - A Simple Binary Tree

Let's say we want to find the shortest path between nodes "2" (last one) and "11". Having identified the LCA as "7", we can now assume that the nodes are either located each in its separate branch or one node is the LCA and the second is either in the left or right branch. Second option is not the case, so we traverse the left and right branch (recursion possible) until we find any of the two nodes by comparing the branch node with the nodes. First one is easy we go onto the left branch and we find the "2" node right away. Then we go onto the right branch - "6", not our node, recurse on the left and right branch of "6". "5" is not our node, "11" is, so we add it to our path. The final path is "2, 7, 6, 11". How would we manage to maintain the path in code though? If we use recursion we'll get a path like "2, 7, 11, 6", so we need some little adjustments. 

You may notice that the first path is ok up to the root, then we have an inverted path. How do we handle this? There's probably multiple ways, I chose having a stack to invert the list as you can see in the code below.

- First stack is not really needed, a simple list would do - I just like symmetry.
- For the lowestCommonAncestor method see here.

Sunday, June 16, 2013

Deck Shuffling

How would you shuffle a deck of cards or generally an array of distinct integers? The shuffle should happen in such a way that every possible reordering is equally likely. Of course it sounds like we should use a random number, but how to ensure the equal probability of each reordering?

There's probably multiple solutions, but the one I liked is a solution that takes O(n) time and no additional space. It is the Fisher-Yates shuffle. What we have to do is traverse the array once and swap each element with a random element that doesn't appear before the current index.

The code is very simple and looks like this:

- The random method is a method that generates a random number within the specified range. This is trivial and could look like this:
- The swap method is a method that swaps 2 elements in an array with the specified index, the implementation is also trivial. It may look like this:

Order Statistics. Kth Smallest or Largest Element in an Un-ordered Array

Finding the largest or smallest element in an un-ordered array in O(n) is a trivial and obvious task, but what about finding the third largest or sixth or abstractly speaking kth? In statistics this is called Order Statistic. One way of solving this is using the QuickSort partitioning technique. To remind you how this works: you choose a pivot element, and then move all elements less than the pivot element to the left and all elements that are larger than the pivot element to the right. After this operation the pivot element is on the right spot and if the k you're looking for equals to the pivot's index, you don't have to sort anymore and just return the pivot.

As an example, let's say I have the following array:
[42, 92, 31, 73, 46, 11, 29, 53, 64, 4]

I chose the first element as the pivot (or we could use a random element) - 42.
Now we iterate from the start and end of the array and swap the elements in-place as they don't conform to our property (less than the pivot on the left and larger on the right). Is 92 less than 42 ? No, whom do we switch it with ? Let's look at the elements at the end of the array, since in the end we want to have an ascending order sorted array (if we need to). Is 4 larger than 42 ? No! Swap this with 92. Is 31 less than 42? Good. 73? No, then swap, and so on.
After the first operation the array should look like this:
[11, 4, 31, 29, 42, 46, 73, 53, 64, 92]

Now, 42 is in the right spot and it's the 5th number (index 4). Let's say we're looking for the 2nd smallest (index 1) so 5 is no good to us. What we do now, is doing the same thing just over the smaller array that we have: 
[11, 4, 31, 29]

Again, chose a pivot - 11, and move all lesser values to the left and larger to the right:
[4, 11, 31, 29]
11 is the element with index number 1 thus it is the second smallest element so we can stop the algorithm here.

Computing the kth largest element is the same task, with the only difference that we have to translate the k value, since the largest elements are located at the end of the array. So we'll have to translate k using this formula: k = array.length - k + 1.
Here's the algorithm in code:

Thursday, June 13, 2013

Lowest Common Ancestor in a Binary Search Tree

For the LCA in a simple binary tree go here.

Finding the lowest common ancestor (LCA) in a binary search tree may be easy, but coming up with an elegant and quick solution may take some thought.

A binary search tree is a ordered binary tree where the nodes on the left are always less than the root node and the nodes on the right are always larger than the root node.

Let's use the tree below as an example.

Simple Binary Search Tree
Fig.1 - Simple Binary Search Tree

Now keeping in mind that property that makes the binary search tree different that a binary tree we can come up with something that is faster than finding the LCA for a simple binary tree, precisely we'll make it O(h) - where h is the height of the tree. This means that in the tree above it will take maximum 5 iterations to find the LCA.

Let's say we're looking for the nodes 62 and 70.
We take the maximum of these two and compare it to the root, if the maximum is less than the root then our values are in the left branch, and we can forget about the right branch at all. Again, we can do this because of the binary search tree property that all the nodes on the left of the root are less than the root. Since the maximum of our two numbers is not less than the root we can move to the second case.
Now, we take the minimum of those two and compare it with the root. If the minimum is greater than the root, then we go on with the right branch. You may ask: why minimum (or maximum) ? The answer is that the nodes we're looking for might be each other's parent. So if our comparison would not be greater than the root, we would go to our third case which i'll take note later. Since this is the case the minimum of these two is 62 and it is greater the the root - we go further to 74. 
We keep doing our previous conditions, until both of them don't succeed:
- max is less than 74, we go on the left to 65
- max is not less than 65, then is min greater than 65 - not even that!
Here we go to our third case: if both previous cases failed, this could not mean anything but that the root is the LCA of nodes we're looking for, which is 65.

Again in code this is nothing but a binary search:

- Math.max and Math.min = determines the maximum and minimum of two numbers
- In a balanced binary search tree h = log n, so then we have O(log n)
For the node representation we'll use the following class:

Lowest Common Ancestor in a Binary Tree

Another popular interview problem and the start of my tree review is finding the LCA in a binary tree. Now, pay attention - not a binary search tree, that is a different case and will be handled in a separate article; now will use a simple binary tree. For those who don't remember a binary search tree is a sorted binary tree where the left child is always smaller than the root and the right child is always larger. A simple binary tree, as in our case, doesn't have that property.

Let's use the tree below as an example.
Simple Binary Tree
Fig.1 - Simple Binary Tree

Now let's say we want to find the LCA for nodes 4 and 9, we will need to traverse the whole tree to compare each node so that we can locate the nodes. Now considering that we start the traversal with the left branch (pre-order traversal) - we have the worst case here with O(n) running time.
Traversing the tree we compare the current node with both of the nodes and if one of them match, it means that one is the LCA on the respective branch. Let's say after traversing the above tree in pre-order  the first node that matches our nodes is 9 (2, 7, 2, 6, 5, 11, 5, 9). So the first obvious thought is that the 4 must be a child of 9, since we're already on the right child of node 5 and the pre-order traversal looks at the node first, then the left child and lastly the right child. Then we note node 9 as the LCA and we don't have to look further anymore.

Let's use another case, say we're looking for the LCA of 7 and 9. The first node in our pre-order traversal (2, 7, 2, 6, 5, 11, 5, 9, 4) is 7. Now here we can say that the LCA for the left branch is 7 because again, if the second node is in the same branch, independently of where and how deep it will be in this branch, the LCA will still be 7; thus we don't have to look in this branch anymore. But we still did not look at the right branch, so we keep traversing in a pre-order manner, but now omitting the other nodes: 2, 7, 5, 9. Now we can say that the LCA for that branch is 9. We can also affirm that the LCA for the branch with the root in node 5 is also 9. And in the end we have our nodes both in separate branches, which means that the LCA is the root of those branches - node 2.

The algorithm looks as a modified version of a pre-order tree traversal :
For the node representation we'll use the following class:

Wednesday, June 12, 2013

Finding If N Is Palindrome

Today I've decided that I want to review my basic computer science knowledge. And since I have a feeling that I may want to review this in the future again - I thought of making out of it a blogging experience. This way I kill two birds with one stone: blogging and review. Also even though there are lots of solutions and code sites on the web, this still may be of help to other colleagues of mine that plan to review.

The way I'm going to do the review is by solving popular programming problems which may be also a good preparation for a job interview. For my review I'm going to be using the Java programming language, but it should not matter for experienced programmers, since I'm going to use basic language structures and no external libraries.

Today I'll start with a classic popular problem: the palindrome
For those who do not know what a palindrome is - it is a sequence of characters that can be interpreted in the same way independently of what direction is used to read them. For example "abba" is a palindrome because you can interpret this sequence in the same way independently of the direction you read the sequence.

One known problem is determining if a number is a palindrome. It is usually limited to no extra memory. Let's say we have two numbers: 1781 and 42124. The latter one is the palindrome, and the way we do it programmatically would be by comparing the first and last digit and moving on to the second and n-1 digit and so on. Now the main problem is determining the first and last digit which consists of numerical operations. Last digit is easy - just get the reminder of the division by 10 (modulus) but first digit requires a little bit of thought.  So in our number 1781, for the first digit we basically need to divide it by 1000, and we get 1 (781 reminder). How do you find this 1000 number ? - just go through the digits and multiply 1 with 10 n-1 times. (Line 7-9).

Now as we have the first and last number we can compare them and fail right away if they're not equal. Truncating the number from the last and first digit is fairly easy : 1781 % 1000 reminder is 781, and then just divide by 10, and we have the 78 number. (Line 17). Lastly decrease div by 2 digits (Line 18). All in all you get an O(n) solution.