Feature Request: Inspect variable while debugging

Started by JanetC, Tue 10/01/2017 22:11:21


JanetC

Most debuggers allow you to hover over a variable while stepping through the code in order to inspect the contents of the variable. This would make my life so much easier!

Crimson Wizard

#1
Quote from: JanetC on Tue 10/01/2017 22:11:21
Most debuggers allow you to hover over a variable while stepping through the code in order to inspect the contents of the variable. This would make my life so much easier!

The HUGE problem with AGS is that compiled code has very limited reflection. In plain language, this means that when the engine runs a script, it does not always know what exactly it is dealing with; it does many things blindly, according to instructions. For example, variables are stored not as variables but as one big array of bytes. The engine is not explicitly told what those variables are or where in this array they are located. For the engine there is no "int a"; there is, say, address "50", where it must read 4 bytes. It is possible to deduce some things from instructions as the script runs, but there is still no way to know variable names.
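
To make the "address 50, read 4 bytes" point concrete, here is a minimal Python sketch. It is not AGS engine code: the buffer size, the offset 50 and the little-endian 32-bit layout are assumptions chosen only for illustration.

```python
import struct

# Hypothetical script data: a flat block of bytes with no names attached.
script_memory = bytearray(64)

# The compiler decided that "int a" lives at offset 50;
# at runtime only the offset survives, not the name.
struct.pack_into("<i", script_memory, 50, 1234)


def read_int32(memory, offset):
    """Read a 4-byte little-endian signed integer at the given offset."""
    return struct.unpack_from("<i", memory, offset)[0]


value = read_int32(script_memory, 50)
```

Nothing in `script_memory` says that offset 50 belongs to a variable called "a", which is why a separate name table is needed for a watch feature.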

UPDATE: I just remembered that compiled script stores names of imported and exported variables for some reason. That's a good start, but not enough for general case.

So this is not only a task of making the engine report information about the script it runs, but of making the engine able to gather such information in the first place.
On the other hand, if I am right, it should not be too hard to add such information about global variables to the compiled script. I am not so certain about local ones (the way they are dealt with is somewhat different).

Crimson Wizard

There is yet another way this may be solved.
If the Editor writes down a table of variables when compiling scripts, recording variable names and their addresses for debugging, then it can ask the engine not about names but about memory addresses, which the engine does know.
I cannot tell whether that would be easier or harder to do. The pro is that you do not need to change the compiled script format; the con is that this probably introduces a new kind of temporary output for the project (because the Editor needs to keep these tables somewhere). Also, the engine still won't know what it is working with.

RickJ

I would think having a symbol table containing name, type and address would be the way to go (at least for variables with statically assigned memory addresses).  Why not have an option to generate the symbol table and include it in the game file(s)? It would be a small step further to have script commands that access the table so that debug utilities could be implemented in script.
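
A symbol table of the kind described here could be sketched roughly as follows. This is a Python illustration only; the `Symbol` record, the `TYPE_FORMATS` map and the variable names are hypothetical and do not correspond to any existing AGS data structure.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    type_name: str  # e.g. "int" or "float"
    address: int    # byte offset in (statically allocated) script data


# Assumed encodings: 4-byte little-endian values for both types.
TYPE_FORMATS = {"int": "<i", "float": "<f"}


def build_table(symbols):
    """Index symbols by name for quick lookup."""
    return {s.name: s for s in symbols}


def read_variable(table, memory, name):
    """Look up a symbol by name and decode its value from raw memory."""
    sym = table[name]
    return struct.unpack_from(TYPE_FORMATS[sym.type_name], memory, sym.address)[0]


memory = bytearray(16)
struct.pack_into("<i", memory, 0, 7)
struct.pack_into("<f", memory, 4, 1.5)
table = build_table([Symbol("score", "int", 0), Symbol("speed", "float", 4)])
```

With such a table, `read_variable(table, memory, "score")` is all a debug utility would need to call, whether it lives in the Editor, the engine or a script.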

JanetC

Quote from: Crimson Wizard on Tue 10/01/2017 23:06:48
UPDATE: I just remembered that compiled script stores names of imported and exported variables for some reason. That's a good start, but not enough for general case.

Just that would be a big help, because usually the problem variables for me are imported/exported variables (I use them a lot.)

Crimson Wizard

This task includes number of things to consider and research.

From user interface side:
- How should the watched variables be displayed? Should there be a pane with a list of them, or floating hints?

From script side:
- The compiled script must have a table of variables with offsets and names, or at least only offsets (but in latter case Editor must keep the lookup table to find offset by variable name).

Data transfer:
- There is already a pipe between engine and debugger for sending commands (breakpoints are sent to the engine; line numbers and the callstack are sent to the debugger). What protocol would be optimal for this, and when should the information be sent? Should each variable value be sent on request from the debugger, or should the engine send variable values itself, e.g. when they are modified?

RickJ

I don't know if Janet agrees but I would think that if there were a symbol table from which a script command could return a reference to a named variable then people could write their own debug functions.  It could be useful for other things as well.

JanetC

Quote from: Crimson Wizard on Sat 11/02/2017 19:26:45
From user interface side:
- How should the watched variables be displayed? Should there be a pane with a list of them, or floating hints?

Whichever is easiest to code :) XCode includes both.

Crimson Wizard

This is old... things always progress slowly in AGS.

For reference, after the RTTI feature was merged into AGS, it might be possible to similarly generate a table of variables with names and write it along with the compiled script, as an optional block of data (which may be enabled or disabled).

If this data is accessible to the engine, then the engine can return information about a variable and its current state.



An alternate approach is still viable too (as was years ago):
Quote
There is yet another way this may be solved.
If the Editor writes down a table of variables when compiling scripts, recording variable names and their addresses for debugging, then it can ask the engine not about names but about memory addresses, which the engine does know.

This means asking the engine to return N bytes of data at offset X in script memory.

Crimson Wizard

#9
The minimal requirements for the variables watch are:

1. Script compiler generates a table that maps name of a variable to offset in script data. Such table must be done per each script, as variables depend on visibility scope, and each script has its own (just like types).
2. This table is saved either in a separate file, or as an extra data in a compiled script file (similar to rtti).
3. This table is read either by the Editor or Engine.
4. If this table is read by the Editor, then on user request it converts the variable name to an offset and sends a command to the running engine through the existing communication mechanism. This command asks the engine to return a value from the given data offset (there's more to it than that, but this is the general idea).
5. If this table is read by the Engine, then the Editor sends a command with the variable's name instead, and the Engine is responsible for converting this name to an offset. From this point on, steps 4 and 5 proceed identically.
6. After retrieving variable's value, engine passes it back to the Editor, and Editor displays it.
7. The rest here is mostly an issue of GUI.

Details.
   
Variables are identified within certain context (scope of visibility), because there may be multiple variables of same name inside different scripts, and even same script (think local function variables).
This means that the mapping is done in 2 steps:
   
    Context-dependent name -> Global unique name -> Memory offset
   
For global variables the unique name may be formed as "modulename.varname", similar to how global type ids are formed in RTTI.
   
Local variables are more tricky, because they have a limited lifescope, which may be a function, but also a section of a function (anything surrounded by brackets). This means that for them the table of variables should also mention first and last script line of their life scope. If a variable is requested, but there's no such variable found in the given context (script + current line), then such request must be denied.
   
With local variables in mind, the "global unique name" of a variable should also include their scope. Now, I don't remember if AGS compiler supports overriding variable names in the nested scopes, but I think we should assume that eventually it does (IIRC this was discussed on github once). So, I guess the global unique name should be something like "modulename.varname.scope", where scope could be a pair of numbers meaning the first and last line of code of their scope of visibility.
   
A conversion between a "Context-dependent name" and "Global unique name" possibly can be done like this.
We have a table of variable names for the given context (a script module), and for each key in this table we have not one variable but a list, sorted by the first line on which the variable is visible. For a variable request, we find its name in the table and traverse this list until we find an entry whose pair of first and last lines encloses the location of the current breakpoint. This is how we learn the "global unique name". This "global unique name" is passed further to find a memory offset.

Example of a list (only to demonstrate a potential case):
    module.myvar -> module.myvar.10.20 -> module.myvar.30.120 -> module.myvar.50.60
Here we have a global variable "myvar", some local one between lines 10-20, another local between 30-120, and a nested one which overrides previous for the duration of its life scope - between lines 50-60.
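
The lookup over such a list can be sketched like this. It is a Python illustration only, reusing the example entries above; the tuple layout and the rule "the last matching entry wins" are assumptions that rely on the list being sorted by first line and on scopes being properly nested.

```python
from typing import Optional

# Each entry: (first_line, last_line, unique_name).
# (None, None, ...) marks the global variable, which has no line range.
ENTRIES = {
    "myvar": [
        (None, None, "module.myvar"),
        (10, 20, "module.myvar.10.20"),
        (30, 120, "module.myvar.30.120"),
        (50, 60, "module.myvar.50.60"),
    ]
}


def resolve(entries, name, line) -> Optional[str]:
    """Map a context-dependent name to a global unique name at a given line."""
    candidates = entries.get(name)
    if not candidates:
        return None  # no such variable in this context: request denied
    best = None
    for first, last, unique in candidates:
        if first is None:
            if best is None:
                best = unique  # global variable acts as the fallback
        elif first <= line <= last:
            best = unique  # later (more deeply nested) matches override earlier ones
    return best
```

For a breakpoint on line 55 this picks "module.myvar.50.60" rather than the enclosing "module.myvar.30.120", and on line 25 it falls back to the global "module.myvar".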
   
Notably, the memory offset should be paired with a memory type, either global memory or local memory (stack), so that the engine knows where to look for it.
   
After receiving the memory offset we need to interpret the value stored there and convert it to a string. For that we need RTTI, which tells us the variable's type.
   
    Editor                                                                                                              Engine
        Context-dependent name -> Global unique name -> Memory offset
        Displayed value <- Data Value <- Memory offset

Handling structs.
   
Reading a struct's member can be done by passing another, nested offset. Getting that offset requires RTTI, which tells us the relative offset of a member with a given name. As with the variable itself, there are two alternatives here: one where the list of offsets is resolved on the Editor's side, and a second where it is resolved on the Engine's side (and the Editor passes just a sequence of names: "variable name :: member name :: member name ...").
   
Regardless, engine must know how to access each nested member. There are two variants here: plain struct and managed struct. A member of a plain struct is accessed simply by adding a relative offset to the struct instance's address. A member of managed struct is accessed by resolving the pointer first.
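
The difference between the two access paths can be sketched as follows. This is a Python illustration only; the `managed_pool` dictionary stands in for whatever the engine uses to map handles to objects, and all offsets and values are made up.

```python
import struct


def read_int32(buf, offset):
    """Read a 4-byte little-endian signed integer."""
    return struct.unpack_from("<i", buf, offset)[0]


# Hypothetical managed pool: handle -> separately allocated object data.
managed_pool = {7: bytearray(8)}
struct.pack_into("<i", managed_pool[7], 4, 99)

# Global memory holding a plain struct at offset 8:
# an int field (42) followed by a managed handle (7).
global_memory = bytearray(16)
struct.pack_into("<ii", global_memory, 8, 42, 7)


def read_plain_member(memory, struct_offset, member_offset):
    # Plain struct: the member lives inside the same memory block,
    # so the two offsets are simply added.
    return read_int32(memory, struct_offset + member_offset)


def read_managed_member(memory, handle_offset, member_offset, pool):
    # Managed struct: first read the handle, resolve it to the object,
    # then apply the member offset inside that object.
    handle = read_int32(memory, handle_offset)
    return read_int32(pool[handle], member_offset)
```

A chain such as "variable :: member :: member" is then just a sequence of these two steps, chosen per member according to RTTI.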

Handling arrays.

I suppose that arrays are handled like plain structs.

Reading values of attributes (aka properties).
   
Attributes are pairs of get/set functions in AGS. Reading a property would require to call a registered function. In theory it must be possible to do this even outside of a script vm, but the biggest issue is potential side-effects that such call may involve. I'd rather leave this out at least until the variable reading mechanism is developed and proved working.
   
   

eri0o

#10
About the issue of GUI, I think I imagine it would use a new panel for this with a TreeView and it would show each variable in scope as a node in the TreeView. Then it would be possible to expand a struct and view its entries. Not sure about the arrays if they need expansion or not. My first guess would be to not support this for arrays.

When the node is shown, some elements could have an alternative pretty print so you can read them more easily: an array of ints could pretty print its values as "{1, 5, 0, 3838}", for instance, a string should pretty print its contents, and a point would pretty print as "(160, 120)".

The node text in my mind is "varname: value".
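
As a rough sketch of that node text, in Python rather than the Editor's actual GUI code: the mapping of a Python list to an array, a tuple to a point and the quoting of strings are all assumptions made for the example.

```python
def pretty(value):
    """Format a value the way a watch node might display it."""
    if isinstance(value, str):
        return '"' + value + '"'
    if isinstance(value, tuple):  # stands in for a point-like struct
        return "(" + ", ".join(pretty(v) for v in value) + ")"
    if isinstance(value, list):   # stands in for an array
        return "{" + ", ".join(pretty(v) for v in value) + "}"
    return str(value)


def node_text(name, value):
    """Build the 'varname: value' label for a tree node."""
    return f"{name}: {pretty(value)}"
```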

I don't know if necessary, but it's possible to group the nodes under either Local or Global node.

The TreeView only populates when the script breaks or when it advances only one step.

There's a right click menu for each variable where it's possible to copy its values.

When the game stops running the panel clears.

I don't know if any configuration is necessary to store for this panel. My guess is none now - except layout stuff that AGS already stores somewhere else.

Crimson Wizard

#11
Quote from: eri0o on Wed 27/03/2024 15:00:26
About the issue of GUI, I think I imagine it would use a new panel for this with a TreeView and it would show each variable in scope as a node in the TreeView.

I must point out that having this for all variables at once will require editor to ask engine for all of their values each step, as editor does not know which ones of them change.
Otherwise there would have to be a mechanism that detects changes to the script memory, which we do not have at the moment.
This would also raise a question of convenience, as user will have to search for wanted variables in this tree view.
For a first iteration, I'd suggest to have a list where user inputs wanted variables by name, similar to how Visual Studio does this.

Crimson Wizard

#12
Hmm, the list of all variables may be presented as a selection of what to add to the "watch" panel.

In other words, there's a watch panel which lists currently watched variables, and a separate panel or a dialog window, where user may choose from a full list of variables. Although this will work reliably for global variables. It may list known local variables too, except not all may be active in particular scope.

EDIT:
well, in any case, I think it should be as simple as possible at start, because the biggest problem is making the variable value extraction, and GUI may always be adjusted later.

Crimson Wizard

#13
I got a very primitive memory watcher running:


In this draft state it requires user to type in script module tag and a literal byte offset. But my intention was to test the request mechanism.

The source code is here: https://github.com/ivan-mogilko/ags-refactoring/tree/ags4--draft-memwatch

There's a long way to go before this becomes convenient and supports variable types. Right now it reads strictly integers at the given location.

eri0o

Oh, NOW I got the screenshot... var1 is global in position 0 of memory so we get g:0, and it has the value it's reading from it, 4 bytes later it's g:4, and we get its contents, and an additional 4 bytes later it's g:8. And the values that are read match the expected ones set. Nice! :)

Great work CW! That's a really nice start.

Crimson Wizard

#15


Supports resolving structs and pointers of any complexity now.

But typing these instructions becomes pretty tedious pretty quickly. This cannot get easier without compiling a table of variables from script.

Quick explanation of what these weird lines mean:

Code: ags
    // Format for DRAFT testing only:
    // x[N]:offset
    // x[N]:offset[,type[:offset,type[:...]]]
    // where x can be -
    //  - g    - globalscript
    //  - m[N] - module
    //  - r    - room state
    // offset is in bytes
    // type can be:
    //  - c    - char (1 byte integer)
    //  - iN   - integer of given size in bytes, e.g. i1, i2, i4
    //  - fN   - float of given size in bytes, e.g. f4
    //  - dN   - plain data (struct, plain array), optionally of given size in bytes
    //  - s    - plain string (null-terminated sequences of chars)
    //  - p    - pointer, reserved
    //  - h    - handle (managed), an int32 that may be resolved to a pointer

So, "g:17,d:92,h:0,s" means:
- go to address 17 of global script memory, and find some data array there;
- from there move 92 bytes forward (the size of a single struct is 16, so that's the struct at index 5 in the array + 12 bytes = 80 + 12 = 92), and find a managed handle;
- resolve the managed pointer, and treat the address as if it contains a string.

Of course in the end these instructions will be generated by the Editor, instead of making users compose them.
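
For illustration, here is a small Python sketch that splits such an instruction into its steps, using the example written in the comma-separated form that the format comment describes. It is not the parser from the branch; the function names and the assumption that a module is written as "m" followed directly by its number (e.g. "m2") are mine.

```python
import re


def parse_watch(expr):
    """Split 'x[N]:offset[,type[:offset,type...]]' into (memory, steps)."""
    head, *steps = expr.split(":")
    match = re.fullmatch(r"([gmr])(\d*)", head)
    if not match:
        raise ValueError(f"unknown memory selector: {head!r}")
    # Memory selector: letter plus optional index (only meaningful for modules).
    memory = (match.group(1), int(match.group(2)) if match.group(2) else None)
    parsed = []
    for step in steps:
        offset, _, type_tag = step.partition(",")
        parsed.append((int(offset), type_tag or None))
    return memory, parsed


def element_member_offset(index, element_size, member_offset):
    """Byte offset of a member inside the index-th element of a plain array."""
    return index * element_size + member_offset
```

Parsing "g:17,d:92,h:0,s" yields global memory and the steps (17, d), (92, h), (0, s); the 92 is what `element_member_offset(5, 16, 12)` computes from the struct size and member offset mentioned above.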

Crimson Wizard

There's something I forgot about, in regards to building a table of variables. Unlike structs and their members in RTTI, the declared variable is not a defined variable. If a variable is defined within a script, then we know its address in memory, but if a variable is declared as "import", we do not, until the linking stage when the export matches the import. The linking is performed by the engine, when loading scripts and resolving imports.

This means that the Editor won't be able to tell addresses of imported variables, but only ones that are located in the given script.
The lookup from variable name to address will be:

   Context-dependent ("local") variable name ->
   Globally unique variable name ->
     * use script's own tables of variables to get the address of the variable, or
     * use script's import table to get the address of the variable in another script.


In regards to who does what, there will be 2 alternatives... no actually 3.
1. Editor passes a context-dependent name of a variable to the Engine, and Engine will have to resolve it in the given context, related to the given (current?) script.
2. Editor resolves the context-dependent name of a variable using a local->global var name table, and passes a globally unique variable name to Engine. Engine will still have to resolve its address using table of variables generated for the given script AND table of imports for the given script.
3. Editor resolves the context-dependent name of a variable using a local->global var name table, and then:
   a) if this variable is defined within current script, passes its actual address to the engine, using table of variables of this script.
   b) if this variable is imported, then passes its name, and Engine will have to resolve to real address.
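
The two lookup paths (own table versus import table) can be sketched like this. It is a Python illustration only; the `scripts` dictionary, the module names and the `link_exports` step are hypothetical stand-ins for the engine's real linking stage.

```python
# Hypothetical per-script data: defined variables (name -> offset),
# exported names, and imported names.
scripts = {
    "GlobalScript": {"vars": {"score": 0, "lives": 4}, "exports": {"score"}, "imports": set()},
    "Room1": {"vars": {"door_open": 0}, "exports": set(), "imports": {"score"}},
}


def link_exports(all_scripts):
    """Engine-side linking: record where every exported variable really lives."""
    exports = {}
    for module, info in all_scripts.items():
        for name in info["exports"]:
            exports[name] = (module, info["vars"][name])
    return exports


def resolve(all_scripts, exports, module, name):
    """Return (owning module, offset) for a name seen from the given module."""
    info = all_scripts[module]
    if name in info["vars"]:
        return (module, info["vars"][name])  # defined here: address known at compile time
    if name in info["imports"] and name in exports:
        return exports[name]                 # imported: address known only after linking
    return None                              # not visible from this script


exports = link_exports(scripts)
```

In variant 3, the first branch of `resolve` could run in the Editor, while the second branch has to run in the Engine, because only the Engine has the linked export table.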

eri0o

If I understood this, what you mean is if you export a variable from one script and import it in other, it would not show up in the global context when debugging that other script.

But should it appear in this case? Because of how our headers are in all scripts, it would show up in the global context for all scripts. Wouldn't this be a lot of variables showing there?

Crimson Wizard

Quote from: eri0o on Tue 09/04/2024 23:34:18
If I understood this, what you mean is if you export a variable from one script and import it in other, it would not show up in the global context when debugging that other script.

No, I did not mean that. Why wouldn't it appear if its declaration is visible?

What I was saying is that we cannot know the actual address of a variable at the time of compiling the script, so this part has to be resolved by the engine, not the editor.

eri0o

Right, I worded it terribly. I was thinking that if none of the things you mentioned are implemented, then it wouldn't show up.

I am more trying to think from the perspective that the import is in the header, so it gets into all scripts. So I think it would show everywhere. Which could crowd the variable watcher.

What I'm really trying to work out is what clue it would have to know whether or not it should show such a global variable.

Crimson Wizard

#20
Quote from: eri0o on Wed 10/04/2024 00:10:04
I am more trying to think from the perspective that the import is in the header, so it gets into all scripts. So I think it would show everywhere. Which could crowd the variable watcher.

I suppose you imply the list that mentions all known variables?
If that's a list that has all variables from the whole game, then it will have to include each unique variable only once. This may be achieved by creating a unique key, something like module.variable. If local variables are also included there, then the key would be more complex, including a function and likely a range of lines (I've been speculating on this in one of the previous comments).
If that's a list that only shows variables visible from the current place (a break point), then the variables may be filtered according to their visibility scope. I suppose that may work similar to how autocomplete works, although I haven't thought this through. In my opinion this is a secondary problem, and may be done as an extra task after the general mechanism is implemented.

My primary goal right now is to make possible for user to manually type variable's name into the list, and let Editor & engine resolve that into a value.

eri0o

#21
Quote from: Crimson Wizard on Tue 09/04/2024 22:42:19
2. Editor resolves the context-dependent name of a variable using a local->global var name table, and passes a globally unique variable name to Engine. Engine will still have to resolve its address using table of variables generated for the given script AND table of imports for the given script.

Uhm, just thinking back at this, I know that the Editor loads a GUID for each module pair when editing the script. Is that GUID present in the script object? Could this be used as a prefix somehow to help identify? I don't remember if we can have a case where a variable in a local context can match the name of a variable in a global context or if this is blocked by the compiler.

Btw, about the three options I would suggest going for whatever you feel is the easiest to implement.

Crimson Wizard

Quote from: eri0o on Wed 10/04/2024 01:39:49
Uhm, just thinking back at this, I know that the Editor loads a GUID for each module pair when editing the script. Is that GUID present in the script object?

No, GUID is purely editor thing and is not a part of the compiled script. The scripts may be identified with their "sectionname" which contains either name of header or script file. This is used when resolving types for RTTI.

Crimson Wizard

#23
Some interesting progress. Now it understands variable names, including member names:



Branch is still here:
https://github.com/ivan-mogilko/ags-refactoring/tree/ags4--draft-memwatch

This requires script compiler to generate a table of contents of script memory.
And naturally, we need RTTI to be built as well, because RTTI tells us about struct fields.

For the time being I put resolving request inside Engine, it is just faster to write dirty code there.

Currently not supported:
- Accessing array elements (but it's a matter of parsing `[ i ]`);
- Imported variables (from other scripts). These need a mechanism that resolves local declaration to their actual memory location.
- Local variables (those that are allocated on stack). These require a somewhat different approach, because they do not exist in a single instance, but are generated as functions are called (and same local variable may be generated multiple times in case of recursive call), so there's no "fixed" address.
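
The point about local variables having no fixed address can be illustrated with a toy stack in Python. This is not how the AGS script VM lays out its stack; the frame size, the `LOCAL_OFFSETS` table and the `countdown` function are hypothetical.

```python
import struct

stack = bytearray(64)
frames = []                # base offsets of active frames, innermost last
LOCAL_OFFSETS = {"n": 0}   # local "n" lives at frame base + 0
FRAME_SIZE = 4             # one int32 per frame in this toy example


def push_frame(n):
    """Enter a function: allocate a frame and store the local variable."""
    base = len(frames) * FRAME_SIZE
    frames.append(base)
    struct.pack_into("<i", stack, base + LOCAL_OFFSETS["n"], n)


def pop_frame():
    frames.pop()


def read_local(name, depth=0):
    """Read a local by name; depth 0 is the innermost active frame."""
    base = frames[-1 - depth]
    return struct.unpack_from("<i", stack, base + LOCAL_OFFSETS[name])[0]


snapshots = []


def countdown(n):
    push_frame(n)
    if n == 0:
        # "Breakpoint": the same local exists once per active call.
        snapshots.extend(read_local("n", d) for d in range(len(frames)))
    else:
        countdown(n - 1)
    pop_frame()


countdown(2)
```

At the innermost call there are three live copies of "n", each at frame base plus the same relative offset, so a watch request for a local has to name a frame as well as an offset.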


UPDATE: arrays work now (both regular and managed).

UPDATE2: imported variables from other scripts work now.
Imports from the engine don't work yet.

eri0o

Hey @Crimson Wizard , I just thought about it now, I imagine it's not possible to look into things that are from plugins, but how would it show, like say an object that is from a plugin. Or even perhaps like a "character*" to an engine character? Was just thinking about this.

Crimson Wizard

Quote from: eri0o on Tue 16/04/2024 17:32:28
Hey @Crimson Wizard , I just thought about it now, I imagine it's not possible to look into things that are from plugins, but how would it show, like say an object that is from a plugin. Or even perhaps like a "character*" to an engine character? Was just thinking about this.

I am planning to make this work at the moment, need to revert and redo some recent changes for this. But only pointer value itself and plain fields will work, not properties.

For properties to be read, the corresponding "get_" function has to be called, which may lead to unwanted side effects. Also, properties which are handled in script cannot be run without having a dedicated script instance allocated exclusively for "watching". I'd rather not touch that until the general watch feature is completed.

eri0o

#26
Oh, yeah, didn't expect those to be accessible, I was more thinking if there is a place in the list where the typename could appear maybe these could have like "(native)" or something to indicate they aren't script objects. But it's cool you have thought about this too. A lot of my System.Log usage will disappear with this new functionality (edit: the variable inspection things). :)

Crimson Wizard

Oh right, having a "type" column would be a nice addition too.

Crimson Wizard

#28
Alright, new update:

1. Now supports local variables (from stack memory), including function parameters.
Supposedly, the engine correctly calculates their lifescope, and overriding names in nested sections should also work (but I cannot remember whether the old and new compilers allow that yet). It does work correctly if you have multiple variables with the same name inside multiple functions.
The code for local variables turned out to be uglier than the rest of the draft, so some things may still not be completely reliable.

2. Added "Type" column.

Branch is still here:
https://github.com/ivan-mogilko/ags-refactoring/tree/ags4--draft-memwatch

I might open a Draft PR to let this build on CI for easier testing.



Crimson Wizard


eri0o

Very interesting stuff, haven't managed to try it out yet but will probably be playing with it tomorrow.

Quote
Save TOC in separate files, packed along with the scripts, sort of "debug symbols" for the game.

My view is this is best. It's not useful for actually running the thing and it will make the size of things bigger. I guess it will work like pdb files or dwarf or any of the many debug parts that are as separate files.

Crimson Wizard

#31
Here's an updated, cleanly written version.
New PR: https://github.com/adventuregamestudio/ags/pull/2430
Download:
https://cirrus-ci.com/task/4820306265636864
updated 3rd June 2024

Functionally this should act the same as before.
I would appreciate any testing of this feature.

Crimson Wizard

Another updated version, now with some improvements and fixes.

* variables list now may be edited by pressing F2, and there's always one empty line added in the end.
* better error reporting (bad syntax, unknown names etc).
* correct reading of member fields of builtin structs, such as GameState ("game").

Download:
https://cirrus-ci.com/task/4728318132486144
updated 12th June 2024

Crimson Wizard

Another update, as of 13th June:
https://cirrus-ci.com/task/5488189920509952

I think this is the last update, unless mistakes are found.
There's a list of known issues, they are not critical, and I decided to postpone addressing these:
https://github.com/adventuregamestudio/ags/pull/2430#issuecomment-2166800056

...because I simply got tired of working on this feature.

As of now, I know only of 2 people who tested or at least mentioned they are going to test.


eri0o

Made a modification that would automatically watch local variables, here is the PR for testing.

