Markov chain name algorithm + generated grammar

This addition to the Markov chain algorithm constrains every generated word to a consonant/vowel pattern that actually occurs among the dictionary words.

Explanation:

For example, suppose you have the following names:

  • John
  • William
  • Hendrik

Their associated patterns are:

  • John ==> CVCC
  • William ==> CVCCVVC
  • Hendrik ==> CVCCCVC

As you can see, these patterns are built from the consonants (C) and vowels (V) in the names. This gives us the following pattern generation algorithm:

Pattern generation algorithm
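
To make the pattern step concrete, here is a minimal Python sketch. It assumes that only a, e, i, o and u count as vowels and that every other character (including "y") counts as a consonant; the names `to_pattern` and `collect_patterns` are made up for this illustration and are not taken from the original implementation.

```python
VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant


def to_pattern(name: str) -> str:
    """Map every letter to 'V' (vowel) or 'C' (anything else)."""
    return "".join("V" if ch in VOWELS else "C" for ch in name.lower())


def collect_patterns(names):
    """Build the pattern list, one pattern per dictionary name."""
    return [to_pattern(name) for name in names]


print(collect_patterns(["John", "William", "Hendrik"]))
# ['CVCC', 'CVCCVVC', 'CVCCCVC']
```

Duplicate patterns are kept on purpose in this sketch, so a pattern that occurs often in the dictionary is also picked more often later on.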

This algorithm can then be combined with the rule-set algorithm of the previous post, giving us the following, more elaborate algorithm:

Rule-set + pattern generation algorithm
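
The rule-set algorithm itself is described in the previous post, so the sketch below has to guess at its shape. It assumes that a rule is a chunk of step-size consecutive letters, that names are cut into non-overlapping chunks, and that for every rule we remember which rules were seen directly after it. The function name `build_rules_and_patterns` and the returned data structures are hypothetical.

```python
from collections import defaultdict

VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant


def to_pattern(text):
    return "".join("V" if ch in VOWELS else "C" for ch in text.lower())


def build_rules_and_patterns(names, step_size):
    """Return (rules, followers, patterns) for a list of names.

    rules     -- every chunk of up to step_size letters found in the names
    followers -- for each rule, the set of rules seen directly after it
    patterns  -- the consonant/vowel pattern of every name
    """
    rules = set()
    followers = defaultdict(set)
    patterns = []
    for name in names:
        word = name.lower()
        patterns.append(to_pattern(word))
        # Cut the word into non-overlapping chunks; the last one may be shorter.
        chunks = [word[i:i + step_size] for i in range(0, len(word), step_size)]
        rules.update(chunks)
        for current, following in zip(chunks, chunks[1:]):
            followers[current].add(following)
    return rules, dict(followers), patterns


rules, followers, patterns = build_rules_and_patterns(["John", "William"], 2)
```

With a step-size of 2, "William" is cut into "wi", "ll", "ia" and "m", so "wi" can be followed by "ll", "ll" by "ia", and "ia" by "m".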

Once the rule-set and the pattern list have been built, we have to modify our generation algorithm a bit to look like this:

Extended generation algorithm

The DetermineRulesWithCorrectPattern function simply filters out the rules whose pattern differs from the expected pattern. If the expected pattern is shorter than the step-size, it trims the rules to the length of the pattern.
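
A possible reading of this filter in Python is sketched below. It assumes that rules are plain strings and that a rule shorter than the expected pattern can never match; the snake_case name and the sorted list it returns are choices made for this example only.

```python
VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant


def to_pattern(text):
    return "".join("V" if ch in VOWELS else "C" for ch in text.lower())


def determine_rules_with_correct_pattern(rules, expected_pattern):
    """Keep only the rules that fit expected_pattern.

    Rules longer than the pattern are trimmed to the pattern size first,
    which only happens near the end of a name.
    """
    size = len(expected_pattern)
    matches = set()
    for rule in rules:
        trimmed = rule[:size]
        if len(trimmed) == size and to_pattern(trimmed) == expected_pattern:
            matches.add(trimmed)
    return sorted(matches)  # sorted so the result is reproducible


print(determine_rules_with_correct_pattern(["jo", "hn", "wi", "m"], "CV"))
# ['jo', 'wi']
print(determine_rules_with_correct_pattern(["jo", "hn"], "C"))
# ['h', 'j']
```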

The generation algorithm itself works as follows:

  1. Choose a pattern randomly
  2. While the name is not as long as the pattern:
  3.    Determine the expected pattern for this rule, based on the step-size
  4.    If we are determining the starting rule:
  5.       From all rules, find the rules with a matching pattern
  6.       Pick one of these rules randomly
  7.    Else
  8.       From all rules that can follow the current rule, find the rules with a matching pattern.
  9.       Pick one of these rules randomly
  10.    If no rule was found
  11.       Stop the name generation
  12.    Add the current rule to the generated name
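
The steps above can be sketched in Python as follows. This is an illustration under assumptions, not the original code: rules are taken to be strings of up to step-size letters, `followers` is a hypothetical dictionary that maps a rule to the rules that may follow it, and `matching_rules` stands in for DetermineRulesWithCorrectPattern.

```python
import random

VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant


def to_pattern(text):
    return "".join("V" if ch in VOWELS else "C" for ch in text.lower())


def matching_rules(rules, expected_pattern):
    """Stand-in for DetermineRulesWithCorrectPattern."""
    size = len(expected_pattern)
    return sorted({r[:size] for r in rules
                   if len(r) >= size and to_pattern(r[:size]) == expected_pattern})


def generate_name(rules, followers, patterns, step_size, rng=random):
    pattern = rng.choice(patterns)                      # step 1
    name = ""
    current = None
    while len(name) < len(pattern):                     # step 2
        # Step 3: the slice of the pattern the next rule has to fill.
        expected = pattern[len(name):len(name) + step_size]
        if current is None:                             # steps 4-6
            pool = rules
        else:                                           # steps 7-9
            pool = followers.get(current, ())
        candidates = matching_rules(pool, expected)
        if not candidates:                              # steps 10-11
            break
        current = rng.choice(candidates)
        name += current                                 # step 12
    return name.capitalize()


# Tiny hand-written rule-set, equivalent to a dictionary holding only "John".
rules = {"jo", "hn"}
followers = {"jo": {"hn"}}
print(generate_name(rules, followers, ["CVCC"], 2))
# John
```

Note that the loop stops early as soon as no rule fits, so a generated name can be shorter than the chosen pattern.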

Advantages:

  • Fewer bad names are generated.
  • From an existing list of names you can generate a lot of new names.

Disadvantages:

  • Even more difficult to implement.
  • Names are still sometimes very weird.
  • You cannot generate names of a pre-determined size.

Remarks:

Even with the addition of grammars, I still get too many names that just aren’t good enough. I’ll continue with the next algorithms; maybe a combination of one of them with the extended Markov chain algorithm will do the job.

Output:

Extended Markov chain generation (output)

As you can see, this algorithm performed best when I used it with a step-size of 2. With a step-size of 3, it seems that I didn’t have enough names in my dictionary, since it mostly gave me 3-letter words. I’ll have to look into that.

References: