This week is PROCJAM. There are many resources and tutorials available on how to generate levels (like heightmaps or connected rooms). But when it comes to filling these levels with life and personality, not much information is available. Nevertheless, you need to generate names for persons, cities, locations, enemies or anything else in your game that should be uniquely identifiable. This task becomes even more interesting when you want distinct names for different cultural or social domains or races. Of course your elvish cities should be named differently from your orc warriors.
In this post I will present three distinct ways to generate names. I will start with the most rigid way and end with the most flexible one. I implemented the second and third method in a C# program. The source code is available on Bitbucket.
Method 1: Lookup lists
The simplest (and not very procedural) way of creating names is just to have a list of names. When you need a name, randomly pick one from the list. If you want to avoid duplicates, pop the name from the list when you pick it.
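A minimal Python sketch of this lookup approach is shown here. It is not the C# implementation mentioned above; the function name `draw_unique_name` and the sample list are made up for illustration.

```python
import random


def draw_unique_name(names, rng=random):
    """Remove and return a random entry so it cannot be drawn twice."""
    if not names:
        raise ValueError("the name list is exhausted")
    index = rng.randrange(len(names))
    return names.pop(index)


# Hypothetical sample list; in practice this would be loaded from a file.
human_names = ["Helen", "James", "Maria", "Robert"]
first = draw_unique_name(human_names)
second = draw_unique_name(human_names)  # never equal to `first`
```

Popping mutates the list, so keep a copy of the original if the pool has to be refilled later.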
For different cultural domains you can keep different lists. On the internet there are various lists of names that can be used free of charge and will yield a large amount of input.
The big drawback of this approach is that you are limited to the names on the lists, especially if you have some special requirements.
For example, for the game Typo I needed names that are more or less difficult to type, so I had to maintain different lists and could not just copy and paste some reference lists.
One large list (most common US citizen names) that I used for my name generation can be viewed here: Name List (bitbucket)
Method 2: Grammar rules
If you want to go one step further and really get a large amount of unique, procedurally generated names, you can use what are called grammar rules. For this you manually specify some rules on how names should be built out of basic blocks.
The most basic building blocks of names are single characters. But when you start to roll your name by appending randomly selected characters, you will get strange names like "grtnsa" or "mhowz". So we need a little more attention to detail here. What works very well is to divide all characters into groups, like consonants, vowels and special characters (apostrophes, e.g. for orc city names).
Then think about an arrangement of these building blocks. What works nicely is to alternate consonants and vowels. This yields reasonable names that can mostly be pronounced. The following list presents some five-letter names created by alternating consonants and vowels.
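Names of this kind can be produced with a few lines of Python. This is only a sketch: the function name `alternating_name` is hypothetical, and treating "y" as a consonant is an assumption.

```python
import random

VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"  # assumption: "y" counts as a consonant


def alternating_name(length, rng=random):
    """Build a name by strictly alternating consonants and vowels."""
    # Randomly decide whether the name starts with a consonant or a vowel.
    consonant_first = rng.random() < 0.5
    letters = []
    for i in range(length):
        use_consonant = (i % 2 == 0) == consonant_first
        letters.append(rng.choice(CONSONANTS if use_consonant else VOWELS))
    return "".join(letters).capitalize()


five_letter_names = [alternating_name(5) for _ in range(10)]
```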
Of course you can also create names of other lengths. If you want to do even better, think about the following: not all letters of the alphabet occur with the same frequency. It is therefore useful to nudge the generator in the right direction. On Wikipedia you can find a letter frequency table that can easily be used to mimic this behaviour.
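One way to use such a table is a weighted random choice. In the sketch below, the numbers are rounded, approximate English letter frequencies in percent; treat them as placeholders and check them against the table you actually use. The function name `weighted_letter` is hypothetical.

```python
import random

# Approximate English letter frequencies in percent (rounded placeholders).
LETTER_WEIGHTS = {
    "a": 8.2, "b": 1.5, "c": 2.8, "d": 4.3, "e": 12.7, "f": 2.2, "g": 2.0,
    "h": 6.1, "i": 7.0, "j": 0.15, "k": 0.77, "l": 4.0, "m": 2.4, "n": 6.7,
    "o": 7.5, "p": 1.9, "q": 0.095, "r": 6.0, "s": 6.3, "t": 9.1, "u": 2.8,
    "v": 0.98, "w": 2.4, "x": 0.15, "y": 2.0, "z": 0.074,
}


def weighted_letter(pool, rng=random):
    """Pick one letter from `pool`, favouring frequent letters."""
    letters = list(pool)
    weights = [LETTER_WEIGHTS[letter] for letter in letters]
    return rng.choices(letters, weights=weights, k=1)[0]


vowel = weighted_letter("aeiou")                      # "e" is most likely
consonant = weighted_letter("bcdfghjklmnpqrstvwxyz")  # "t" is most likely
```

Swapping this function in for a uniform choice keeps the alternating rule unchanged and only shifts which letters show up most often.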
This method works even better if you use some larger building blocks, like fixed suffixes that can be concatenated to the names above. Some of them work better with three- or four-letter names.
What you can also do is duplicate a consonant in the middle of the name. For this you need extra information, since some consonants can easily be doubled (like "n" or "t") while some cannot (like "h" or "x"). Doubling the middle consonant of the names in the list above (with the exclusions just mentioned) and appending a suffix will yield names
which are also perfectly fine. Note how no suffix has been appended to Catten, since its last two characters already match one suffix. I encourage you to create your own set of rules. The possibilities are endless, though not all rules will generate sensible names.
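The two rules can be sketched as small Python functions. Everything specific here is an assumption: the post only names "n" and "t" as doublable and "h" and "x" as not, so the rest of `DOUBLEABLE` is a guess, and the `SUFFIXES` list and the base name "Caten" are made up for illustration.

```python
import random

DOUBLEABLE = set("lmnprst")     # assumption, apart from "n" and "t"
SUFFIXES = ["en", "ia", "ton"]  # hypothetical suffix list


def double_middle_consonant(name):
    """Double the middle letter if it is a consonant that can be doubled."""
    middle = len(name) // 2
    letter = name[middle]
    if letter.lower() in DOUBLEABLE:
        return name[:middle + 1] + letter + name[middle + 1:]
    return name


def add_suffix(name, rng=random):
    """Append a random suffix unless the name already ends with one."""
    if any(name.lower().endswith(suffix) for suffix in SUFFIXES):
        return name
    return name + rng.choice(SUFFIXES)


name = add_suffix(double_middle_consonant("Caten"))
```

With these lists, "Caten" becomes "Catten" and keeps that form, because it already ends in the suffix "en".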
The big advantage is that this method lets you generate random names from a few simple rules. The disadvantage is that it is hard to create distinct cultural domains. A lot can be done with the suffixes, but again this is some kind of predefined list of words.
Method 3: Monte Carlo names
Monte Carlo methods describe a class of computational algorithms that are based on random decisions. The basic idea here is to start with a name composed of completely random characters, which is most likely a bad name. Over the iterations of the Monte Carlo process the name becomes better and better, until it is good enough to serve as a name for a person, city or anything else.
To define bad or good names, we need some criteria, which can be trained with the lists mentioned earlier. These criteria can be anything you like (e.g. the number of consonants in a name or how often the same letter shows up twice or more). Here I will use two simple concepts: Pairs and Correlations. A Pair of length N is just the sequential occurrence of N characters. As an example, the name "helen" contains the Pairs of length 2 "he", "el", "le" and "en".
A Correlation just looks N characters ahead in the name and reports which character follows at that distance. For "helen" and a distance of 2 these are "h?l", "e?e" and "l?n",
where a ? denotes any arbitrary single character.
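Both concepts can be extracted with two short functions. This is a sketch under the definitions given here; the function names `pairs` and `correlations` are hypothetical, and reading "distance" as the offset between the two reported characters is an assumption.

```python
def pairs(name, n):
    """All runs of n consecutive characters in the name."""
    return [name[i:i + n] for i in range(len(name) - n + 1)]


def correlations(name, distance):
    """Character couples `distance` apart; skipped characters become '?'."""
    gap = "?" * (distance - 1)
    return [name[i] + gap + name[i + distance]
            for i in range(len(name) - distance)]


helen_pairs = pairs("helen", 2)                # he, el, le, en
helen_correlations = correlations("helen", 2)  # h?l, e?e, l?n
```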
Armed with Pairs and Correlations and a list of names, you can now create the rating function. By analysing all names in the list you get a frequency table for Pairs and Correlations: for every occurrence (like in the "helen" example above) you add 1 to the respective Pair or Correlation count. As the input data are existing names, you mimic the behaviour of those names and thus obtain criteria for good names. The rating function just checks the name it is given and looks for any Pairs and Correlations it has been trained with. For each hit it adds the count stored for the Pair or Correlation that has been found.
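Training and rating can be sketched with a single counter. This is one possible reading of the description, not the C# implementation; `features`, `train` and `rate` are hypothetical names, and using only Pairs of length 2 and Correlations of distance 2 by default is an assumption made to keep the example small.

```python
from collections import Counter


def features(name, pair_lengths=(2,), distances=(2,)):
    """All Pairs and Correlations of a name, e.g. 'he' and 'h?l'."""
    name = name.lower()
    found = []
    for n in pair_lengths:
        found += [name[i:i + n] for i in range(len(name) - n + 1)]
    for d in distances:
        gap = "?" * (d - 1)
        found += [name[i] + gap + name[i + d] for i in range(len(name) - d)]
    return found


def train(names):
    """Count every Pair and Correlation occurrence in the training list."""
    table = Counter()
    for name in names:
        table.update(features(name))
    return table


def rate(name, table):
    """Sum the trained counts of all features found in the name."""
    return sum(table[feature] for feature in features(name))


table = train(["helen", "ellen"])  # tiny hypothetical training list
good = rate("helen", table)
bad = rate("xqzvk", table)         # contains nothing seen in training
```

A `Counter` returns 0 for unseen features, so names made of unfamiliar letter combinations simply score low.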
Now we have the rating function and can start creating names. This is like rolling consonants and vowels according to the letter frequency table in the grammar method. You start out with a fully random name and calculate its current rating. Then exchange a random character in the name with another random character drawn from the letter frequency table, and calculate the new rating. If this value is better than the old rating, accept the change; otherwise reject it and undo it. This should be done in a loop (you should make sure that each character has been touched at least a couple of times). After about 100 iterations you will have a very nice name. As Monte Carlo algorithms are iterative, the longer it runs the better the results will get.
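The loop described above can be sketched as follows. Several simplifications are assumptions: the rating uses only Pairs of length 2, replacement characters are drawn uniformly instead of from a frequency table, and the training list is a tiny made-up sample. All function names are hypothetical.

```python
import random
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def train_pairs(names, n=2):
    """Frequency table of all Pairs of length n in the training names."""
    table = Counter()
    for name in names:
        name = name.lower()
        table.update(name[i:i + n] for i in range(len(name) - n + 1))
    return table


def rate(name, table, n=2):
    """Sum of the trained counts of every Pair found in the name."""
    return sum(table[name[i:i + n]] for i in range(len(name) - n + 1))


def monte_carlo_name(table, length=6, iterations=100, rng=random):
    """Start from random letters and keep only changes that raise the rating."""
    name = [rng.choice(ALPHABET) for _ in range(length)]
    rating = rate("".join(name), table)
    for _ in range(iterations):
        position = rng.randrange(length)
        old_letter = name[position]
        name[position] = rng.choice(ALPHABET)
        new_rating = rate("".join(name), table)
        if new_rating > rating:
            rating = new_rating          # accept the change
        else:
            name[position] = old_letter  # reject and undo
    return "".join(name).capitalize(), rating


training = ["helen", "ellen", "elena", "helena", "lena"]  # hypothetical
name, rating = monte_carlo_name(train_pairs(training))
```

Because only improvements are accepted, the rating never decreases. The price is that the search can get stuck in a local optimum, so running it several times and keeping the best result is a cheap remedy.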
Some names that have been created with this method are
An interesting fact is that you can feed the rating function lists from different cultures and will get names that follow those cultural rules (notice how the training with Chinese city names also added ‘ to the name list).
The drawbacks of this method are that you need a list of names to feed into the rating function, and that all those iterations can be quite expensive. With 24 characters, the number of possible Pairs with N=2 is 24*24 = 576 (the number of characters for the first letter multiplied by the number for the second). For Pairs with N=3 it is already 13,824, and for Pairs with N=4 it is 331,776. Let's assume that only 5 percent of the combinations occur. Even then there is a massive amount of substring checking to be done. If you do many iterations (remember, that is what you need to do to get good names) this can take some time.
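The growth of the table is easy to check: with an alphabet of 24 characters there are 24 to the power of N possible Pairs of length N. The helper name `possible_pairs` is hypothetical, and the 5 percent share is the assumption made in the text.

```python
def possible_pairs(n, alphabet_size=24):
    """Number of distinct Pairs of length n over the given alphabet."""
    return alphabet_size ** n


for n in (2, 3, 4):
    total = possible_pairs(n)
    occurring = round(total * 0.05)  # assumed 5 percent actually occur
    print(f"N={n}: {total} possible Pairs, about {occurring} occurring")
```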
Nevertheless, the Monte Carlo name generator is my favorite, as it can produce a large variety of very nice names that follow a certain cultural trend, and it is truly procedural.
As you very likely noticed, I did not investigate the questions posed at the beginning. So far I am still a physicist and not a linguist. But at least I try to answer those questions in the way I have learned (namely statistical methods, numerics, data analysis and correlation functions). The proposed simple criteria for the rating function work remarkably well, at least for me. It is also astonishing that such names can be created with a truly elementary alternating grammar.
I hope you enjoyed reading this tutorial on how to generate names procedurally.