How do I create an in-game language parser?

  • I'd like to create something where certain text is changed by a language parser system. For those of you who have played World of Warcraft, that's exactly what I'm trying to create (when opposing factions try to speak to each other, a 'lol' may come out as 'bur' for someone not in the same faction).

    I don't think anybody has cracked exactly how Blizzard does it, but from what I've been able to gather, the word that comes out depends on which vowels are in the word, how the vowels are arranged, and how many letters the word contains.

    There's also a predetermined list of all possible translations, grouped by the number of letters in the word. For example:

    two letter words: An, Ko, Lo, Lu, Me, Ne, Re, Ru, Se, Ti, Va, Ve

    three letter words: Ash, Bor, Bur, Far, Gol, Hir, Lon, Mod, Nud, Ras, Ver, Vil, Vos

    etc.

    You can see the full list here http://www.wowwiki.com/Common_%28language%29
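    A list grouped by letter count maps naturally onto a lookup table. The JavaScript sketch below is only an illustration: the table holds just the two lists quoted above, `wordsByLength` and `pickTranslation` are made-up names, and the character-code sum used to pick an entry is an assumed rule chosen because it is deterministic, not something known about the game.

```javascript
// Hypothetical table: output words grouped by letter count.
// Only the two lists quoted in the post are filled in.
const wordsByLength = {
  2: ["An", "Ko", "Lo", "Lu", "Me", "Ne", "Re", "Ru", "Se", "Ti", "Va", "Ve"],
  3: ["Ash", "Bor", "Bur", "Far", "Gol", "Hir", "Lon", "Mod", "Nud", "Ras", "Ver", "Vil", "Vos"],
};

// Assumed selection rule: sum the character codes of the lowercased word
// and use the remainder as an index, so the same input always maps to
// the same output. This is NOT known to be the game's real rule.
function pickTranslation(word) {
  const candidates = wordsByLength[word.length];
  if (!candidates) {
    return word; // no list for this length: leave the word unchanged
  }
  let sum = 0;
  for (const ch of word.toLowerCase()) {
    sum += ch.charCodeAt(0);
  }
  return candidates[sum % candidates.length];
}
```

    Any rule that always turns the same input into the same index would do here; the point is only that the word length selects the list.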

    Anyway, I believe I can figure it out as long as I know whether Construct 2 can do any of the following with the text of a Text object: compare the letters in each word, compare the vowels of each word, compare the letter count of each word, and possibly compare the placement of vowels in a word. If any of that is possible, I may be able to figure it out. However, I have no idea which expressions correspond to this.

    I don't think the idea itself is very complicated. The steps to my idea when it translates a word would be:

    1. Determine how many letters are in the word

    2. Determine the placement of vowels in the word (so if an "a" is the first letter, or the second, or the third, etc)

    3. Determine how many vowels there are

    4. Translate the word

    So, 'lol' would translate differently from 'all' because the placement of the vowels is different. See what I'm saying? :)
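    Steps 1 to 3 can be sketched in a few lines. This is JavaScript rather than Construct 2 events, `analyzeWord` is a made-up name, and treating only a, e, i, o, u as vowels is an assumption.

```javascript
// Vowel set is an assumption: "y" is treated as a consonant here.
const VOWELS = "aeiou";

function analyzeWord(word) {
  const vowelPositions = [];
  const lower = word.toLowerCase();
  for (let i = 0; i < lower.length; i++) {
    if (VOWELS.includes(lower[i])) {
      vowelPositions.push(i + 1); // 1-based: first letter, second letter, ...
    }
  }
  return {
    letterCount: word.length,          // step 1
    vowelPositions: vowelPositions,    // step 2
    vowelCount: vowelPositions.length, // step 3
  };
}
```

    With this sketch, 'lol' and 'all' both have three letters and one vowel, but the vowel position is 2 for 'lol' and 1 for 'all', which is the difference step 4 would key on.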

    Thanks for reading, I hope you'll be able to help out!

  • Yes, this is possible! Here is an example of the functions you could use.

    If someone could explain why I had to use -1 and /2 for the vowel count, that would be great. It doesn't bother me too much since I have no need for this, which is probably why I didn't look into it myself.

  • > Yes, this is possible! Here is an example of the functions you could use.
    >
    > If someone could explain why I had to use -1 and /2 for the vowel count, that would be great. It doesn't bother me too much since I have no need for this, which is probably why I didn't look into it myself.

    Hey man, hope you don't mind, I tweaked some stuff in your capx. Really minor stuff. The '-1' is necessary because you had a ',' at the end of 'vowelPositions', so tokencount counts one extra token (I was going to remove the last ',' just to get rid of the -1, but it's not worth it).

    The '/ 2' was necessary because you were filling the array twice. I just added a 'set VowelPositions to ""' at the start of getVowelPositions and it's fine. Also something minor but cool: you can use 'Add to' for text as well, so I changed 'System | Set VowelPositions to VowelPositions & ...' to 'System | Add ... to VowelPositions'.
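    The extra token from a trailing separator can be reproduced outside Construct 2. In this JavaScript sketch, `split` stands in for tokencount as an analogy (it is not the C2 implementation), and the sample positions are made up.

```javascript
// Vowel positions of "hello" (1-based), built by appending each
// position followed by a comma, which leaves a trailing comma.
let vowelPositions = "";
for (const pos of [2, 5]) {
  vowelPositions += pos + ",";
}

// vowelPositions is now "2,5," and the trailing comma adds an empty last token.
const rawTokenCount = vowelPositions.split(",").length; // one more than the vowel count
const vowelCount = rawTokenCount - 1;

// Joining the positions instead avoids the trailing separator entirely.
const joined = [2, 5].join(",");
const joinedCount = joined.split(",").length;
```

    Resetting the string before each run matters for the same reason the '/ 2' did: if the text is built twice without being cleared, every position is counted twice.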

    BTW you don't need tokenat() just to go through a piece of text character by character, you can just use mid(). Much the same thing in the end though.

    edit: you could also reduce all the ORs down to

    [attachment: code screenshot]
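    As a rough idea of what collapsing a chain of ORs into one regex test looks like, here is a JavaScript sketch. It is not taken from the capx, and the function names are made up for illustration.

```javascript
// Long form: one comparison per vowel, joined with ORs.
function isVowelWithOrs(ch) {
  const c = ch.toLowerCase();
  return c === "a" || c === "e" || c === "i" || c === "o" || c === "u";
}

// Short form: a single character class does the same job.
function isVowelWithRegex(ch) {
  return /^[aeiou]$/i.test(ch);
}

// Walk a word character by character and collect 1-based vowel positions.
function findVowelPositions(word) {
  const positions = [];
  for (let i = 0; i < word.length; i++) {
    if (isVowelWithRegex(word.charAt(i))) {
      positions.push(i + 1);
    }
  }
  return positions;
}
```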

  • This is amazing. Would this be able to translate each word within entire sentences too?

  • > This is amazing. Would this be able to translate each word within entire sentences too?

    That would be up to you. With C2 you can manipulate text any way you like; it all comes down to your ability to use it.

    > I don't think the idea itself is very complicated. The steps to my idea when it translates a word would be:
    >
    > 1. Determine how many letters are in the word
    > 2. Determine the placement of vowels in the word (so if an "a" is the first letter, or the second, or the third, etc)
    > 3. Determine how many vowels there are
    > 4. Translate the word

    1-3 are easy and you've seen how to. 4 is up to you because the meaning of 'translate' is what you need it to be.

    Just one tip, you could reduce the amount of code significantly if you learn regular expressions (regex). Look it up in the C2 manual (where you won't learn anything unfortunately) or better, Google 'javascript regex'. My usage of it above is very basic. You can do a lot more sophisticated things with it.

    So learn the basic text manipulation stuff (left, mid, right, tokenat, etc) and then after that Regex if you feel you want/need to.

  • > Just one tip, you could reduce the amount of code significantly if you learn regular expressions (regex).

    That would take too long. It's all too complicated for me to understand; it took me an entire month to understand arrays, and arrays are relatively simple. I'm not too sure how to make this system work with entire sentences.

  • I would second the suggestion that you look into regular expressions. They really are made for exactly what you seem to be trying to do here. Using regexes, you can do more versatile translations and do them more efficiently. For example, I might try something like

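    var input = "lol language similar"; // sample text to translate
    var inp = input.split(" "); // array of words ("in" is a reserved word)
    var t; // holds the match result for the current word
    var output = "";
    for (var i = 0; i < inp.length; i++) {
      // non-vowel, vowel, non-vowel, then the rest of the word
      t = inp[i].match(/([^aeiou])([aeiou])([^aeiou])([a-z]*)/i);
      if (t !== null) { // words that do not match are skipped
        output += " " + t[4] + t[2] + t[3] + t[3] + t[1];
      }
    }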
    Pardon any mistakes. I am new to JavaScript, and I have not tested this at all. It's just to give you a general idea of what you might do.

    What this does is split an input sentence into words based on spaces. Then it tries to match each word against a regular expression, which is the funky bit inside the call to match(). The regular expression looks first for a non-vowel, then a vowel, then another non-vowel; at that point it considers the word matched and includes any letters that come after that. The regex stores the things it finds in the array t. For example, if you fed it the word "language", it would first match the whole thing and store it in t[0] = "language". The subsequent parts of the array contain the groups in the regex, so t = ["language", "l", "a", "n", "guage"].

    The line that starts with output += adds a space, which was removed in the split, then stitches the word back together from the array t in a suitably mangled fashion: guageannl. Similarly, you could feed it "lol" and get "olll", or give it "similar" and get "ilarimms". You can also make it do letter substitutions and other crazy shenanigans. It becomes a roll-your-own, extra-complicated form of Pig Latin!
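    The same idea can be wrapped in small functions so it works on one word or a whole sentence. A minimal sketch, with `[^aeiou]` as the negated character class; `mangleWord` and `mangleSentence` are made-up names, and passing unmatched words through unchanged is an assumption rather than what the snippet above does.

```javascript
// Same pattern as described: non-vowel, vowel, non-vowel, then the rest.
// Note that [^aeiou] matches any non-vowel character, not only letters.
const CVC = /([^aeiou])([aeiou])([^aeiou])([a-z]*)/i;

function mangleWord(word) {
  const t = word.match(CVC);
  if (t === null) {
    return word; // assumption: words that do not match pass through unchanged
  }
  // rest + vowel + second consonant twice + first consonant
  return t[4] + t[2] + t[3] + t[3] + t[1];
}

function mangleSentence(sentence) {
  // Split on spaces, mangle each word, and put the spaces back.
  return sentence.split(" ").map(mangleWord).join(" ");
}
```

    For example, `mangleSentence("lol is similar")` gives "olll is ilarimms": "is" has no non-vowel, vowel, non-vowel run, so it is left alone.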
    
    Was this at all what you were looking for?
  • > I would second the suggestion that you look into regular expressions.

    You guys seem to be going into intense detail about the code and programming behind it all. I just want to know how this can be used for entire sentences and not only words.

  • There's no way around it, you're gonna have to learn some stuff. At a minimum, C2 has the same basic text manipulation functions as other languages. Read up on https://www.scirra.com/manual/126/system-expressions, under 'Text'. These will be more than capable of doing what you need.

  • I wonder if there is a regex plugin?

    In any case, look at my code where I split the input with input.split(" "). (I originally named the array in, but in is a reserved word and not a valid identifier, so use inp instead.) This takes apart the sentence at the spaces, so the variable inp is now an array that contains each word in the sentence. I took it a step deeper, taking each word apart and putting it back together wrong, but you could just as easily stop at this point and put the whole sentence together wrong.

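    var input = "put the sentence back together"; // sample text
    var inp = input.split(" "); // array of words
    var output = "";
    for (var i = 0; i < inp.length; i++) {
      // prepend each word, which reverses the word order
      output = inp[i] + " " + output;
    }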
    This is fairly simple: it just puts the sentence back together backwards, but you could go into some crazy Tower of Hanoi, recursive, fractal type of word reordering. You can use regular expressions to pick out specific words, which, using the above regex, would let you do things like pick all consonant-vowel-consonant-anything words and put them at the front of the sentence, with or without mangling the word itself. You can even combine the two, so it's possible to mangle sentence order AND mangle individual words AND substitute both words and letters.
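    Moving the matching words to the front could look something like this. It is a sketch only: `frontMatchingWords` and `CVC_WORD` are made-up names, and keeping the original order inside each group is an assumption.

```javascript
// Non-vowel, vowel, non-vowel, then anything. No "g" flag, so test()
// keeps no state between calls.
const CVC_WORD = /[^aeiou][aeiou][^aeiou][a-z]*/i;

// Hypothetical helper: move every word matching the pattern to the front,
// keeping the original relative order within each group.
function frontMatchingWords(sentence, pattern) {
  const words = sentence.split(" ");
  const matching = words.filter((w) => pattern.test(w));
  const rest = words.filter((w) => !pattern.test(w));
  return matching.concat(rest).join(" ");
}
```

    With this sketch, `frontMatchingWords("I can not see it", CVC_WORD)` returns "can not I see it".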
    
    The only thing I can't imagine how to do easily is picking out and reordering based on actual syntactic structure, such as nouns, verbs, adjectives, direct object etc. Understanding human language is hard.
    
    I hope that covered what you were hoping to do. If not, feel free to keep asking. If you can be more clear about what in particular you are looking for, I may be able to help you further.
  • Why are you trying to teach him JS? I doubt he's going to do that if he doesn't even want to use RegEx. And anyway, there are built-in RegEx expressions; did you see my code screenshot above?

  • Oops. Listen to Codah. He knows the simplest way to do it

  • > Oops. Listen to Codah. He knows the simplest way to do it

    I just meant that he seems to be after a C2 solution

  • Haha, maybe I should? What exactly might such a thing do? I suppose it could have a range of, like, a dozen premade stackable manglers, which do everything from making characters speak with an accent to changing sentence structure to making completely deterministic but unintelligible gobbledygook.

    I guess I just got caught up in it because I just did some JavaScript regex for my own game. I made a parser for my RPG character behavior, which takes an attack string and factors in randomness and the character's stats and skills to make a damage string. The damage string describes things like degree of armor piercing, ice damage temperature, etc., which the defender processes against their own armor and resistances to take damage. Regexes are fun!

  • Sorry to revive this thread after a while, but I need your help.

    I didn't even understand what you did to count vowels etc. (I'm guessing that's why I'm using a simple program like Construct 2, haha), but I wanted to ask whether it's actually possible to count the words in a text box.
