strange Null index in string to int conversion using .AsInt; my bad logic?

Started by JoelCBarker, Tue 24/06/2014 07:21:40


JoelCBarker

Hey gang,

    Here's something I can't solve that you've probably seen a million times before. First the code:

Code: ags

    iteration = 0;
    Verb_Data_Unit = 1;
    while(bKeep_Loading_Verbs)
    {
      if(Verb_Data_Unit == 4)
      {
        Verb_Data_Unit = 1;
      }
      if(iteration > nSegments)
      {
        bKeep_Loading_Verbs = false;
      }
      if(Verb_Data_Unit == 1)
      {
        conversion_buffer_1 = Segments[iteration].Substring(0, 1);
        Verb[iteration].tense = conversion_buffer_1.AsInt;
        conversion_buffer_1 = Segments[iteration].Substring(1, 1); //  This line is highlighted when I get my error
        Verb[iteration].known_Subject = conversion_buffer_1.AsInt;
        conversion_buffer_1 = Segments[iteration].Substring(2, 1);
        Verb[iteration].abbey_capable = conversion_buffer_1.AsInt;
        conversion_buffer_1 = Segments[iteration].Substring(3, 1);
        Verb[iteration].user_capable = conversion_buffer_1.AsInt;
      }
      if(Verb_Data_Unit == 2)
      {
        Verb[iteration].Word_Text = Segments[iteration];
      }
      if(Verb_Data_Unit == 3)
      {
        Verb[iteration].Sound_File = Segments[iteration].AsInt;
      }
      iteration++;
      Verb_Data_Unit++;
    }
  }
}


    The code works perfectly fine until a specific iteration, maybe the 19th, when I get the null index at that line. I can't figure out why it would be inside that if statement at that iteration. I'm wondering if my math is such that since 21 (counting iteration 0) is a multiple of three, my little Verb_Data_Unit ticker with three modes is getting out of sync? The strange part to me is that it gets each variable correct until finally Verb[iteration].Sound_File = 6, then it throws the error. Is the answer obvious?

This is the same number and arrangement of characters in each segment it could view, I already proved to myself that they each are intact:

0011 <--segment 0
accept <--segment 1
0    <--segment 2

0001 <--segment 3
allow <--segment 4
1    <--segment 5        ...and so on...

    I have all of these segments, read from a .dat via ReadRawLineBack(), segmented with great thanks going to JSH, but I want the sorting process to put three different types of segments in their homes, where the first section is four numbers with four different homes. Blah. Is it something to do with .AsInt?

    the null index is probably referring to one of the single-character segments, but I can't wrap my brain around how my system ends up trying to index one of those with substring, when it should be messing with the Sound_File variable?

Sorry to keep bothering everyone... I know this isn't really an adventure game

-Joel 

JSH

The first if condition will not instantly break the loop since while conditions are only checked at the beginning of each iteration. The result is it doing a single iteration out of bounds. To solve it you'd want to enclose the rest of the code within a corresponding else:

Code: ags

if(iteration > nSegments)
{
    bKeep_Loading_Verbs = false;
}
else
{
    // stuff
}


Since you only seem to use the bKeep_Loading_Verbs boolean to determine whether or not to continue the while loop, you can also simplify the code a bit. I took the liberty to do some optimizations below, like using the modulo operator for the verb_data_unit since it's deterministic given the size of the iteration counter:

Code: ags

iteration = 0;
int Verb_Data_Unit;
while(iteration < nSegments)
{
  Verb_Data_Unit = iteration % 3;
  if(Verb_Data_Unit == 0)
  {
    conversion_buffer_1 = Segments[iteration].Substring(0, 1);
    Verb[iteration].tense = conversion_buffer_1.AsInt;
    conversion_buffer_1 = Segments[iteration].Substring(1, 1);
    Verb[iteration].known_Subject = conversion_buffer_1.AsInt;
    conversion_buffer_1 = Segments[iteration].Substring(2, 1);
    Verb[iteration].abbey_capable = conversion_buffer_1.AsInt;
    conversion_buffer_1 = Segments[iteration].Substring(3, 1);
    Verb[iteration].user_capable = conversion_buffer_1.AsInt;
  }
  else if(Verb_Data_Unit == 1)
  {
    Verb[iteration].Word_Text = Segments[iteration];
  }
  else if(Verb_Data_Unit == 2)
  {
    Verb[iteration].Sound_File = Segments[iteration].AsInt;
  }
  iteration++;
}

JoelCBarker

Awesome! I knew a gauntlet of just if statements was bad form but I never was much good with code in general. I can usually find what I need in posts but I sometimes have a hard time following the action in my own script :tongue: Ty ty.
   
    I'm just going to link these in here in case anyone else is trying to read text files with ReadRawLineBack() if that's alright, they were really helpful for me:

Scotch:
http://www.adventuregamestudio.co.uk/forums/index.php?topic=29871.msg381385#msg381385

and more recently Billbis:
http://www.adventuregamestudio.co.uk/forums/index.php?topic=49860.msg636478748#msg636478748

JoelCBarker

Well crud, it still throws the invalid index error at the same iteration. I'm not sure what other information would help. I'm thinking it has to do with the fact that I'm having it read string contents that AGS didn't create? Just using Notepad to make .rtf files to read, but I was really hoping I could use formatting to make it work. I saw the warning in the manual, but it did work to correctly store the segments as Strings... I'm stumped.

If it makes any difference, this is a nested while loop, but the outer just checks for spaces

JSH

You can't read .rtfs as if they were text files, they contain a lot of formatting data mixed in with the actual strings. Use a simple .txt instead. Alternatively, if you need organized data it might be worth looking into .xml. It's a common format typically used for arranging data in a simple tree like structure, there's probably a module or plugin for using it in AGS.

Anyway, when you debug problems like this it's a good idea to use Display() commands to show on the screen what the data looks like in each iteration to see where things go wrong.
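
A minimal sketch of that kind of check, assuming the Segments[] array and the nSegments counter from the script above; the helper name Dump_Segments is made up for illustration. Wrapping each value in brackets and printing its length makes empty strings and stray spaces visible:

Code: ags

// Hypothetical debugging helper: shows every stored segment with its index and length.
function Dump_Segments()
{
  int i = 0;
  while (i < nSegments)
  {
    if (Segments[i] == null)
    {
      Display("Segments[%d] is null", i);
    }
    else
    {
      // The brackets expose leading/trailing whitespace; Length exposes empty or over-long strings.
      Display("Segments[%d] = [%s], length %d", i, Segments[i], Segments[i].Length);
    }
    i++;
  }
}

Calling it once, right after the segmenting step and before the sorting loop, should be enough to see whether a segment is shorter than expected.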

JoelCBarker

Roger that. I'll try .txt and check out .xml. I used Display commands after receiving each error to determine that the problem occurs immediately after Sound_File = 6, but I'll see what I can do. I think I'll take a shower and go get something to eat first though... been a while (8

Yep still not working with .txt files, may just need to try other things

JoelCBarker

Doh! So my slow brain, here goes... permit me to explain my logic...

What I have done is read my text file using ReadRawLineBack(), then I looked for spaces and stored the characters between the spaces into their own little string array instances (Segments[nSegments]). Easy peasy. I want to store the contents of every three now-existing strings which belong to my Segments[] String array into different instances within a different struct array called Verb[]. I think of the first Segments[] String as being of type A, the next of type B, the next of type C, then back to A in the order they appear in my Segments[] String array.

Next I want to separate the contents of my Segments[] String array according to their respective type (A, B, or C) and store those different types as different variables in my Verb[] struct array, where Segments[] Strings from type B remain strings, Segments[] Strings from type C become a single int, and Segments[] Strings from type A get broken down into four separate int values that need to be stored as four separate int variables in my Verb[] struct array. Okay.
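
The mapping described above could be sketched roughly as follows. This is only an illustration, assuming Segments[], nSegments and Verb[] exist as in the earlier script; the function name Sort_Segments and the choice to index Verb[] by i / 3 (so that the three segments of one group land in the same Verb[] entry) are assumptions, not something confirmed in the thread:

Code: ags

// Sketch: distribute every group of three segments into one Verb[] entry.
function Sort_Segments()
{
  int i = 0;
  while (i < nSegments)
  {
    int verb_index = i / 3;    // integer division: segments 0-2 -> verb 0, 3-5 -> verb 1, ...
    int segment_type = i % 3;  // 0 = type A, 1 = type B, 2 = type C
    String segment = Segments[i];
    if (segment_type == 0)
    {
      // Type A: four single digits, each stored in its own int.
      String digit = segment.Substring(0, 1);
      Verb[verb_index].tense = digit.AsInt;
      digit = segment.Substring(1, 1);
      Verb[verb_index].known_Subject = digit.AsInt;
      digit = segment.Substring(2, 1);
      Verb[verb_index].abbey_capable = digit.AsInt;
      digit = segment.Substring(3, 1);
      Verb[verb_index].user_capable = digit.AsInt;
    }
    else if (segment_type == 1)
    {
      Verb[verb_index].Word_Text = segment;      // type B stays a String
    }
    else
    {
      Verb[verb_index].Sound_File = segment.AsInt; // type C becomes a single int
    }
    i++;
  }
}

Indexing Verb[] by the raw segment counter instead would spread one verb's data over three different Verb[] entries, which may or may not be what is wanted.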

The trouble I'm having is that last bit (which needs to happen first)--the process of breaking down a Segments[] String of type A into four separate int values and storing them as the separate variables they represent within my Verb[] struct array.   

I think I might be able to do that if I write (eFileWrite) the contents of only my Segments[] Strings of type A, one type A Segments[] String at a time, into a temporary text file (made by AGS). Then I can use ReadRawChar() (mental note: it operates from right to left) to pull each character from that string in the temporary text file and store it where it belongs (after converting it using .AsInt, much like above).

This is me trying something different because I think I'm having trouble with baby's first math problem in my original code. I'll update this post if it works.

Inspired by Monkey_05_06: http://www.adventuregamestudio.co.uk/forums/index.php?topic=31879.msg411221#msg411221 and in light of the problems I'm having using Substring().

I still think Substring() should work as coded above, but I might also have a 'space' character at the end of most of my Segments[] Strings, which is a problem since the last of those certainly doesn't have a space at the end. At least I'll figure out which of those three possibilities is my original problem. Blah

I know this is riveting stuff

Update: Oh for the love of crap

Code: AGS

        File *temp_string_to_sort = File.Open("Temp_String_Storage.txt", eFileWrite); //  open or create a temporary file to store the string I want to sort
        temp_string_to_sort.WriteString(Segments[iteration]); //  write the 'type A' string to that temp file
        Display("%c", temp_string_to_sort.ReadRawChar());


just Displays "?"

So this must be a formatting problem? I pulled a string from a .txt file made in Notepad (using ReadRawLineBack()), then had AGS write it to a new .txt file (using WriteString()), then read back a character from the file it wrote (ReadRawChar()). The original Notepad file contains no '?', so there must be a problem interpreting raw strings from Notepad files? Is that definitively the problem I'm running into? Because in my previous code, I had no trouble recording at least six perfect strings AND converting at least six strings to int variables using ReadRawLineBack() and .AsInt.

What's the deal--I'm a plumber and a hippie and I'm sorry if there has been any breach of "Beginner's Technical Questions" forum etiquette.

Khris

You can't write to and read from a file at the same time. You can either open it in read mode, read stuff, then close it, or open it in write mode (write or append), write stuff to it, then close it.
However, you're making things much more complicated than they are; getting single characters from a String is trivial and by no means requires juggling Files:
Code: ags
  String a = "hello";
  Display("Second character of %s is %c", a, a.Chars[1]);


AGS will read text files just fine though.
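
For completeness, the open/write/close, then open/read/close sequence described above might look like the sketch below. The function name Test_Temp_File is hypothetical, and WriteRawLine() is used on the assumption that plain text is wanted in the file (WriteString() is meant to be paired with ReadStringBack() instead):

Code: ags

// Sketch: write first, close, then reopen the same file in read mode.
function Test_Temp_File()
{
  File *writer = File.Open("Temp_String_Storage.txt", eFileWrite);
  if (writer != null)
  {
    writer.WriteRawLine("0011"); // plain text line
    writer.Close();              // must be closed before it can be read
  }

  File *reader = File.Open("Temp_String_Storage.txt", eFileRead);
  if (reader != null)
  {
    Display("%c", reader.ReadRawChar()); // reads one character from the start of the file
    reader.Close();
  }
}

The null checks are there because File.Open returns null when the file cannot be opened.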

JoelCBarker

Thanks!    So is a character stored as a string or an int?

Code: AGS

{
String a;
String b;

String b = "0011";

a = b.Chars[0];
}


doesn't seem to be working. "Cannot convert char to String"

I want to convert a value from the read only characters array to an integer then.
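
One way around the "Cannot convert char to String" message, sketched under the assumption that a one-character String is really what is needed, is to build the String with String.Format and then use .AsInt on it. The function name Char_To_Int_Demo is made up for the example:

Code: ags

// Sketch: turn a single char from Chars[] into a String, then into an int.
function Char_To_Int_Demo()
{
  String b = "0011";
  String a = String.Format("%c", b.Chars[0]); // a one-character String
  int value = a.AsInt;                        // numeric value of that digit
  Display("First digit as String: %s, as int: %d", a, value);
}
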

Quote from: Khris on Thu 26/06/2014 08:03:48
...you're making things much more complicated than they are; getting single characters from a String is trivial..
Seems like it should be (8 but I don't want to display it, I want to store it. I wouldn't make it that complicated, except Substring is returning an "invalid index" error, even after I corrected my above script by including a tracker int instead of iteration, which was the wrong int. Maybe I should post a more revealing script.

JoelCBarker

Code: AGS


int iWS;
int iteration;
int Current_Segment_Tracker;
int Lines_Read;
int nSegments;
int Verb_Data_Unit;

String File_Lines[100];
String Segments[1000];

struct Lexicon_Verb {   //                        *VERBS*
  int tense;  //  0=directive, 2=past, 3=present
  int known_Subject;  //  0=no, 1=yes
  int abbey_capable;  //  0=no, 1=yes
  int user_capable; //  0=no, 1=yes
  String Word_Text;
  int Sound_File;  
};

Lexicon_Verb Verb[1000];

String conversion_buffer_1;    //    since I can't just say Segments[0].Substring(0, 1).AsInt
String conversion_buffer_2;    //    probably don't need 4 of them, but peace of mind
String conversion_buffer_3;
String conversion_buffer_4;

function Load_Lexicon_Verbs()               //===============----------------      Below is all tested and works great until @@@
{
  bool bKeep_Loading_Verbs = false;
  bool bKeep_Segmenting_Line_Verbs = false;
  iteration = 0;  //  reset iteration to zero before segmenting
  nSegments = 0;  //  reset nSegments to zero before segmenting a new string
  iWS = 0;  //  reset white space tracker
  Lines_Read = 0;
  Current_Segment_Tracker = 0;
  gLoading_Box.Visible = true;   // turn on the loading box, it just shows me what steps are working
  
  if(Verbs_Loaded == 0)
  {
    File *load_verbs = File.Open("Lexicon_Verbs.txt", eFileRead);
    while(!load_verbs.EOF && Lines_Read < 100)
    {
      File_Lines[Lines_Read]= load_verbs.ReadRawLineBack();
      Lines_Read++;
    }
    load_verbs.Close();
    bKeep_Segmenting_Line_Verbs = true;
    while(bKeep_Segmenting_Line_Verbs)
    {
      iWS = File_Lines[iteration].IndexOf(" ");
      if(iWS == NOT_FOUND)
      {
        Segments[nSegments] = File_Lines[iteration];
        iteration++;
        if((iteration + 1) > Lines_Read)
        {
          bKeep_Loading_Verbs = true;
          bKeep_Segmenting_Line_Verbs = false;
        }
      }
      else
      {
        Segments[nSegments] = File_Lines[iteration].Truncate(iWS);
        File_Lines[iteration] = File_Lines[iteration].Substring(iWS + 1,  File_Lines[iteration].Length);
      }
      nSegments++;
    }
    //    @@@=================================-----------------                   @@@  Above is all tested and works great @@@
    iteration = 0;
    while(iteration < nSegments)
    {
      Verb_Data_Unit = iteration % 3;
      if(Verb_Data_Unit == 0) //  now we're dealing with a Segments[] String of 'type A'
      {
        conversion_buffer_1 = Segments[Current_Segment_Tracker].Substring(0, 1);
        Verb[iteration].tense = conversion_buffer_1.AsInt;
        conversion_buffer_2 = Segments[Current_Segment_Tracker].Substring(1, 1);
        Verb[iteration].known_Subject = conversion_buffer_2.AsInt;
        conversion_buffer_3 = Segments[Current_Segment_Tracker].Substring(2, 1);
        Verb[iteration].abbey_capable = conversion_buffer_3.AsInt;
        conversion_buffer_4 = Segments[Current_Segment_Tracker].Substring(3, 1);
        Verb[iteration].user_capable = conversion_buffer_4.AsInt;
        Current_Segment_Tracker++;
      }
      else if(Verb_Data_Unit == 1)  //  'type B'
      {
        Verb[iteration].Word_Text = Segments[Current_Segment_Tracker];
        Current_Segment_Tracker++;
      }
      else if(Verb_Data_Unit == 2)  //  'type C'
      {
        Verb[iteration].Sound_File = Segments[Current_Segment_Tracker].AsInt;
        Current_Segment_Tracker++;
      }
      iteration++;
    }
  }
}


The abridged contents of Lexicon_Verbs.txt:

0011 accept 0 0011 add 1 0011 admire 2 0011 admit 3 0011 advise 4 0011 afford 5 0011 agree 6
0011 alert 7 0011 allow 8 0011 amuse 9 0011 analyze 10 0011 announce 11 0011 annoy 12
0011 answer 13

...and so on. I already tested the ability to correctly read two consecutive lines, no problem there. No typos or deviations from that format in the file.

So I'm not trying to make it as complicated as possible, I just don't know why substring gives an invalid index when I use it instead of the .Chars[] reading.

Snarky

Well, given your code, it's clear that Substring() is crashing because the string you're giving it is not 4 characters long. By checking which line crashes, you should be able to tell how long the string in fact is. By printing the value of iteration every loop, you should be able to see which word it crashes on, and examine that one more closely.
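
A small guard along these lines, placed before the Substring() calls, would report the offending segment instead of letting the script abort. It is only a sketch: the function name Is_Type_A_Segment is hypothetical, and it assumes the Segments[] array from the script above:

Code: ags

// Sketch: returns true only if the segment can safely be read as four digits.
bool Is_Type_A_Segment(int index)
{
  String segment = Segments[index];
  if (String.IsNullOrEmpty(segment))
  {
    Display("Segments[%d] is null or empty", index);
    return false;
  }
  if (segment.Length < 4)
  {
    Display("Segments[%d] = [%s] has only %d characters", index, segment, segment.Length);
    return false;
  }
  return true;
}

In the parsing loop this could be used as if (Is_Type_A_Segment(Current_Segment_Tracker)) { ... } around the four Substring() lines.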

When you say "Above is all tested and works great", have you actually looked at the output it produces? What the segmentation loop puts in Segments[]? That code looks a little messier than necessary, and since the categorization/parsing looks correct, I would bet the bug is actually in the segmentation, assuming the data file is in fact correct.

A couple of other comments...

First, don't declare variables outside of a function if you're only using them inside the function. That leads to all that nonsense with having to reset a bunch of values at the top of the function, and it's easy to forget one, or reset it wrong. If you declare them inside the function, you can be sure they never hold the value from the last function call.

Second, use the proper data types. Don't use ints to store true/false, use bools. You might even consider using an enum for the tense, though that is a little more work.

Following these rules helps you write better, more logical code which is easier to debug. For example, in this line:

Code: AGS
  int tense;  //  0=directive, 2=past, 3=present


Are you sure those values shouldn't be 0,1,2 instead of 0,2,3? If you use an enum, you're much less likely to put the wrong value by mistake somewhere in your code.
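
As an illustration of the enum idea (the names below are invented, and the 0/1/2 numbering is only the guess made above, not a confirmed part of the data format):

Code: ags

// Hypothetical enum for the tense field; values are assigned explicitly
// so they match the digits stored in the data file.
enum Verb_Tense {
  eTenseDirective = 0,
  eTensePast = 1,
  eTensePresent = 2
};

// Example use: compare against a name instead of a bare number.
function Describe_Tense(int index)
{
  if (Verb[index].tense == eTensePast)
  {
    Display("Verb %d is in the past tense", index);
  }
}
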

Quote from: JoelCBarker on Thu 26/06/2014 18:57:03
Thanks!    So is a character stored as a string or an int?

Neither. Characters are stored as char. But they're pretty much like ints (storing the ASCII value of the character). In most languages (I haven't tested in AGS, but would expect it to work) you can convert a char digit to its integer value like so:

Code: AGS

char digit = '5';
int number = digit - '0'; // number == 5


However, if you're converting to bool (as I would recommend for the true/false values), you can simplify it even further:

Code: AGS

String currentSegment = Segments[Current_Segment_Tracker]; // You don't need the conversion buffers, just a convenient handle to the current segment, declared at the point you need it
Verb[iteration].tense = currentSegment.Chars[0] - '0';
Verb[iteration].known_Subject = (currentSegment.Chars[1] == '1'); // Is the character '1'? True/false
Verb[iteration].abbey_capable = (currentSegment.Chars[2] == '1');
Verb[iteration].user_capable = (currentSegment.Chars[3] == '1');

JoelCBarker

Thank you thank you all three. I really appreciate the assist on my learning the code, which none of you are personally invested in. Thanks thanks thanks!

Edit: Still get the invalid index. Using display, I checked every Segments[] as it was stored (several times) and there were no problems. I added Truncate(4) to the type "A" chunks just to be sure they had four characters only. The code works like a charm until the second phase, at exactly Segments[20], where it tries to...

conversion_buffer_2 = Segments[Current_Segment_Tracker].Substring(1, 1);

...at exactly the second digit (index 1, the character in the segment is 0). That's where I'm confused. Is it possible that the code is running through that block of statements too fast or something? Like I said, no flaw in the text file. It must just be math somehow? All segments definitely stored correctly.

Trying the Chars[] digit conversion now

JoelCBarker

Okay. I converted to boolean, and all of the (currentSegment.Chars[] == 1) values are correctly translating at first.

Halfway through each 'line' (as read back by ReadRawLineBack() and then stored in Segments[]) I start getting incorrect values.

Here is the display of each of the four variables as they roll in:

0,0,1,1    0,0,1,1    0,0,1,1    0,0,1,1    0,0,1,1    0,0,1,1    -48,0,0,0    7,0,0,0    8,0,0,0    9,0,0,0    1,0,0,0    1,1,0,0    1,0,0,0    49,0,0,0    49,0,0,0    49,0,0,0    49,0,0,0    49,0,0,0    //And then back to working//    0,0,1,1    0,0,1,1

...and so on. It messes up again later (I think the second half of the second 'line').

Now I'm thinking it's back to my separation by spaces process. Any thoughts? The Display of each Segment as it comes in is perfectly correct, and the text file is meticulously correct, though I have no idea if what I see in Notepad is what AGS sees in computer language, ASCII?

I just want to load several values and a string or two into a struct array from an external text file, but at this rate each text file could only contain up to six chunks of variables.

I know about the max character limit to ReadRawLineBack()... does it recognize the carriage return in Notepad to indicate a new line? Is a tab or carriage return -48 or 49 in ASCII? bizarre

Afaik nobody has tried converting ReadRawLineBack() characters from an external text file to integers past 60 or so characters?

Oh duh. The booleans are working, it just still can't handle an int. I guess I'll figure out a way to work around that, enum for the tense and so on. Let's just call this solved unless someone finds it interesting.

Snarky

I think what you're showing here is the result of the parser, and only for Type A, right? It would be more useful to have a look at the content of Segments[] directly.

To me it looks like the parser is working fine, but getting the wrong data. -48 looks like the result of 0 - '0' (the ASCII value of the '0' char is 48), so this indicates that the character being read is a null (not a normal character, usually used as an end-of-string marker). The two most likely explanations for this: (1) there is some weird character at this point in the txt file (maybe even an invisible, non-printing character) that breaks the program, or (2) there's a bug in your segmentation loop.
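
The same arithmetic can be applied to the other odd values by adding 48 back: -48 + 48 = 0 (a null character), 7 + 48 = 55 (the character '7'), and 49 + 48 = 97 (the character 'a'). A sketch for checking this directly, assuming the Segments[] array from the script above (the function name Show_First_Char is made up), prints the raw character code before any subtraction:

Code: ags

// Sketch: show the raw code of the first character of a segment.
function Show_First_Char(int index)
{
  String segment = Segments[index];
  if (segment == null)
  {
    Display("Segments[%d] is null", index);
  }
  else
  {
    int code = segment.Chars[0]; // raw character code, before subtracting '0'
    Display("Segments[%d] = [%s], Chars[0] has code %d", index, segment, code);
  }
}

A code of 48 or 49 here means a real '0' or '1' digit; anything else suggests the parser is looking at a different kind of segment than it expects.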

I am pretty confident that the bug is in your segmentation loop. Most likely, it's writing empty strings into Segments[], or splitting the lines at the wrong index. However, from a quick glance at it I'm actually not sure exactly how it's meant to work, so I can't figure out exactly where the error is. Like, why are you incrementing iteration, for example?

I would recommend refactoring this part of the code to use a helper function, String.Split(), to simply provide you with the space-separated substrings directly. That way, the code should just be a very simple loop that goes through each line of the input, splits it using String.Split(), and puts each split part into Segments[].
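
Assuming no ready-made String.Split() is at hand, a helper in that spirit might look like the sketch below. The name Add_Tokens is hypothetical; it assumes the global Segments[1000] array from the script above, and it deliberately skips empty pieces so that a trailing or doubled space cannot produce an empty segment:

Code: ags

// Hypothetical helper: appends the space-separated pieces of 'line' to Segments[],
// starting at index 'count', and returns the new number of stored segments.
int Add_Tokens(String line, int count)
{
  String rest = line;
  int n = count;
  while (!String.IsNullOrEmpty(rest) && n < 1000)
  {
    int space = rest.IndexOf(" ");
    if (space == -1)
    {
      // No more spaces: whatever is left is the last piece.
      Segments[n] = rest;
      n++;
      rest = "";
    }
    else
    {
      if (space > 0)
      {
        // Store the text before the space, unless it is empty.
        Segments[n] = rest.Truncate(space);
        n++;
      }
      // Keep only the text after the space, if there is any.
      if (space + 1 < rest.Length) rest = rest.Substring(space + 1, rest.Length - space - 1);
      else rest = "";
    }
  }
  return n;
}

With that in place, the segmenting step reduces to one call per line read from the file, for example nSegments = Add_Tokens(File_Lines[i], nSegments); inside a loop over i from 0 to Lines_Read - 1.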

As it happens, there are already code examples on the forum of String.Split(), including a module by monkey. (Although his version returns a dynamic array, and I don't see how you're supposed to know its size...)

Edit: Oh, and just in case the terminology isn't clear, the "parser" is the bit at the end that decodes the values (while(iteration < nSegments){...}). The "segmentation loop" (usually called a tokenizer) is that bit that splits up the file into the list of values (while(bKeep_Segmenting_Line_Verbs){...}).

JoelCBarker

Okay I found something. Going through with all my displays looking for the unconverted ASCII number and looking at the ASCII table, I found that the type A actions are, on the seventh pass, attempting to run on the wrong Segments[] for that block. Later, it gets further off, moving to the Segments[] containing the letters representing the name of the word. When that happens, sometimes the first character, (Chars[0]) is either a or b, which is the first character of my alphabetized word names. So I've checked the actual segments after storage about 10 times now, those are fine. There must be a problem with the parser getting off base? the Verb_Data_Unit = iteration % 3; and if 0, if 1, if 2 business?

--Edit: The script wasn't getting off on its own, special characters not visible in the text editor were throwing it off--

Although eventually it ends up showing me ASCII codes for 52, 55, 57 and 59 which don't appear anywhere in the data file, especially not semicolon, but that only happens once it moves to the second batch of ReadRawLineBack() line segments. So this is probably two problems at once. 

Well, saving the original text file as ANSI gets me closest so far, but I still get phantom semicolons. Unicode and UTF-8 produce similar errors but with some slightly different characters, but it still doesn't tell me whether or not I'm trying to perform actions on the wrong segment in the parser. Trying XML... 

JoelCBarker

Jeezum Crow... I'm still getting gibberish. I get furthest with XML but I still get gibberish. Just going to try and write the files I want with AGS and the text prompt
