Markov Star Name Generator

BlitzMax Forums/BlitzMax Programming/Markov Star Name Generator

Krischan(Posted 2016) [#1]
Hi there,

I need a good star system name generator for my space game and I've learned that Markov Chains are the way to go (I'm still using my old Elite name generator but I'm tired of these names ;-). I found a good example what I'm looking for, even with source but the source is in javascript and unfortunately I'm not very good in this language.

Perhaps somebody could help me translate this to Blitzmax or give me some hints how to convert it? There are some parts in it I don't understand. Thank you in advance!

In tests I found out that sometimes duplicate names are generated, I don't know why and how to fix it. It would be perfect if it could create a unique name according to a unique star system id, for example: Local name:String=CreateStarName(12345)

However, I've managed to alter the source a little bit for my needs - just save this as "filename.html" and open it in a browser:


Example output:
Menida
Dhalasa Prime
Omicron Aphinid
Delta Alain
Zelasc
Rho Atauchat
Musala
Schalt
Meir
Alsal
Tau Alerker
Dacall
Masetro Prior
Aralge
Stalelge
Delta Alledip
Mibainira Prior
Adielgei
Cheni
My Sishaipha
Aubalo Prime
Ararara
Anteth
Delta Saseral
Shad
Atemir Prior
Pi Zater
Alar
Sasat
Arard Major
Arahela Posterior
Meimer
Upsilon Astal
Alasalr
Karinth
Atala
Merasc
Alhara Prime
Beta Mutal



Floyd(Posted 2016) [#2]
Ignoring the JSON stuff and just generating names here is what it seems to do.

Every star has a main name randomly selected from src_names.

25% of the time the full name is a randomly selected prefix plus the main name.

If that does not happen then in 25% of the remaining cases we get main name plus random suffix.

In case neither of those happens there is only the main name.

Scroll all the way to the bottom of the codebox to see the real code. Everything else is just setting up the name lists.
It's kind of amusing that the actual work is only three lines of code.

As to the duplicates, there are 10,592 possible names so that should be a rare event. I don't see any reasonable way to avoid it other than keeping track of results and rejecting duplicates. If you use modestly sized lists, such as in this example with only 10,592 total names, you could just generate them all and then shuffle them like a giant deck of cards. After that you select names by simply going through the list from the beginning.






Krischan(Posted 2016) [#3]
Thanks Floyd but Markov Name Generation works much different. It takes a list of names and generates completely new names that sound like the names in the list. For example you have only Roman, Aztec, Celtic, Chinese names in the list then the new names sound Roman, Aztec, Celtic or Chinese, too! There exists even more complex Markov Chain name generation algorithms but I only need a simple one like the source I've posted (in Javascript).

I could do that myself what your code is doing but I want a Markov name generator :-/

Here is an example using ~80 Blitzbasic.com User Names as input (the output quality depends on the amount of data and of course the pronounceableness of the input words, which is a problem here):
Staneduposth
Hanust
Rderier
Danonan
Shero
Sonatt
Donaroschof
Mangu
Jusener
Fitir
Davas
Xiang
Jonalor
Flaronre
Herus
Krethe
Dasani
Josthamstz
Neride
Berirani
Herrisi
Haneric
Mscani
Hthele
Tzkynon
Berrone
Wotang
Dangerertys
Tin
Pllonarit
Dantarin
Ragus
Staranier
Pstaconto
Shorangeto
Deriays
Hra
Sobel
Sttsis

Example generated using this HTML file:



Floyd(Posted 2016) [#4]
Okay, after some further reading I at least understand the basic problem.

A Markov chain is intuitive enough. It's a collection of states, and probabilities of transitioning between them. In this case they would be used to build up words one letter at a time. The next letter is chosen probabilistically based on previous letters. A list of words is analyzed to estimate transition probabilities which are likely to produce nice sounding results.

Unfortunately there are no details about this in the description so we have to figure it out from the code. I'm afraid that's beyond me. Maybe someone fluent in Javascript could figure it out.

I briefly fantasized about trying the Haxe version here:

https://www.samcodes.co.uk/project/markov-namegen/

which links to source code

https://github.com/Tw1ddle/MarkovNameGenerator

Haxe can be compiled to various targets, including C++. That could then be used with BlitzMax. After rummaging around in the source I soon abandoned that idea and gave up.

Searching for a detailed description of the algorithm produced nothing so I have run out of ideas.


Krischan(Posted 2016) [#5]
The Haxe version is very sophisticated and hard stuff, I want to keep it simple - it is sufficient if the names sound like the list and do not look too random and are pronouncable. And I'm glad that I'm not the only one who didn't understand the code ;-) I'm sure it's easy for somebody else but we don't see the solution yet. I just need a hint how to create this in Blitzmax as I'm not familar with these Javascript Array thingy and don't understand the algorithm there.

BTW: Markov Chains are funny, try this one: PHP Markov chain text generator. It takes words instead of letters and creates (nonsense) text out of other texts.


xlsior(Posted 2016) [#6]
Here's an interesting visualization, which may make it easier to implement your own: http://setosa.io/ev/markov-chains/

or this one:
http://www.soliantconsulting.com/blog/2013/02/title-generator-using-markov-chains


Krischan(Posted 2016) [#7]
Well I've played around a little bit. It is not really the Markov algorithm I want but it outputs pronouncable random star names. First, the names are loaded and parsed in the BuildChain() Method using a TMap. Then the name is built starting with a consonant/vocal combination and additional "syllables" derived from the TMap key/token couples. Then a prefix/suffix is added and the final name is stored in another TMap to check if it is unique.

However, there are some limitations: if you create too many names they will become longer or the program will run into an endless loop. And there is bug that text is added even if there is already a predefined suffix. But I don't know how to do it better right now.

Perhaps somebody can have a look at it?

Output example
Sigma Hilinyaed
Beta Usakain
Onius Priormu
Omega Tereb
Ny Indemiju
Epsilon Izoreatus
Akanu Prime
Orrimawe
Eta Otanevve
Psi Rotanetese
Eneschpe
Valer Prime
Ahasimhe
Zeta Nistr Minoror
Phi Rindemice
Saiph Primehu
My Pilliu
Xi Kellawa
Somtutba
Elgenep Prime
Tereb Prime
Zeta Domiti
Porrim Prior
Pi Nocesak
Gulus Minor
Sigma Ceptr Priorus
Lafat Prior
Fertum Prime
Kalhata
Rukba Prior
Eta Usaka
Rakara
Epsilon Balal
Gasalaeb Major
Tau Umium
Udiusvoic
Umium Minorna
Ulafa Minorru
Luxus

Blitzmax Code


master.txt (or use your own text file with a large number of names, >1000):



EdzUp MkII(Posted 2016) [#8]
I think in one game I done ages ago you had segments of names that got stuck together rather like the Elite method but was a little more pronouncable :)


gpete(Posted 2016) [#9]
how did you make the Saturn like planet.. what is the b3d code


Krischan(Posted 2016) [#10]
Which Saturn? The one from my signature?