Questions about string matching are pretty often in interviews and this is one of them. I've seen two variations online: one to split the string in meaningful words and another one to determine all possible meaningful words that can be produced out of sub-strings of the input.

First one seems easy - you just go once through the string and output what you've matched, thus O(n) where n is the number of the input string's characters. Second one would involve trying all possible sub-words, which would be O(nk) where k is the length of the longest word.

Now another problem is how would you represent a dictionary. A hash map won't work here because there will be too many collisions and the hash map would loose its efficiency. The first solution here would be a trie and that's what I used in both mentioned problems. An example of an implementation of a trie can be found here.

Now can it be done faster? Yes, we could use the Aho-Corasick string matching algorithm which uses a finite state machine that resembles to a trie. That will improve our time in the second problem from O(nk) to O(n), but don't forget that we have the pre-processing time. This may not be asked to be implemented in an interview, but it's good to mention.

I am using this variant for PHP5. It's pretty fast and simple.

ReplyDeleteWisconsin Football

ReplyDeleteWisconsin Football 2018

Wisconsin Football Schedule

Wisconsin Football 2018 Schedule

Wisconsin Football Live

Wisconsin Football Live Stream

Wisconsin Football Live Streaming

https://wisconsinfootball.de/