Why isn't AGS script much, much faster?

Started by Monsieur OUXX, Tue 21/09/2021 12:50:10


Crimson Wizard

#20
Quote from: eri0o on Sat 16/04/2022 17:16:13
I would not trust the compiler as much. I don't think the checks are that burdensome - they only happen at the frontier in the API, which is not what I meant.

What fernewelten is referring to, probably, are the checks done by the script interpreter during certain bytecode operations, such as stack operations. They happen all the time, for every little move on the stack (I know that some were disabled in the past, but some may remain).

As for trusting the compiler or not: frankly, in all these years these checks have found a minimal number of errors, maybe one in several years. fernewelten's proposal to have a separate "production" interpreter and a "debugger" one makes sense, in my opinion. (There has been a ticket regarding this, although it's more like a note, not a full plan: https://github.com/adventuregamestudio/ags/issues/843)




Then, there are things introduced during the earlier rewrite, when trying to handle the 64-bit problem and backward compatibility, that may be bad, like virtual method calls in dynamic object wrappers. Where the old interpreter would simply access a memory pointer and read and write in a buffer, the new interpreter will dereference 2-4 pointers and maybe make a virtual method call.
In other operations, where the old engine would do simple pointer math (+/- a single value), the new one does more computation (e.g. see https://github.com/adventuregamestudio/ags/issues/869)
This all objectively slows things down. It may not be much for a single operation, but it will accumulate if there are a lot of them, especially when working with dynamic arrays, I think. What makes things worse, some of these were probably necessary only in certain cases, but not all the time, yet they affect many actions.

In any case, whether something slows things down or not, and how much, should not be guessed, it has to be measured.

Personally, I believe that it may be worth going back to the old script interpreter and looking at the system as a whole, writing down notes about the big picture, and searching for a better solution to the problems which JJS and I were trying to solve with this rewrite. Maybe some of these problems have much better and simpler workarounds.

For reference, there is also the earlier ScummVM AGS port done by fuzzie, from which I took some inspiration, and strangely we came to similar ideas about the interpreter. My memories of it are vague now, but maybe it's worth checking out how she solved the same issues in her rewrite: https://github.com/adventuregamestudio/scummvm/tree/ags




As for merging instructions, this may also be looked into in parallel. There's an old thread about this here, but there has been no work on that. There's also a code optimizer written by rofl0r called "agsoptimize" here; from what I understand, it does not introduce new instructions, but squashes multiple instructions into existing ones.
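
A minimal sketch of what "squashing multiple instructions into existing ones" can look like, as a peephole pass. This is not agsoptimize's actual rule set and not real AGS bytecode: `PeepOp`, `PeepInstr` and `SquashAdds` are hypothetical names for a toy instruction stream.

```cpp
#include <vector>

// Toy instruction set (hypothetical, not AGS bytecode).
enum class PeepOp { AddImm, Other };

struct PeepInstr {
    PeepOp op;
    int reg;   // target register index
    int imm;   // immediate operand (used by AddImm)
};

// Merges runs of "AddImm r, a; AddImm r, b" into a single "AddImm r, a+b".
// No new opcode is introduced; the result uses only existing instructions.
inline std::vector<PeepInstr> SquashAdds(const std::vector<PeepInstr> &in) {
    std::vector<PeepInstr> out;
    for (const PeepInstr &ins : in) {
        if (ins.op == PeepOp::AddImm && !out.empty() &&
            out.back().op == PeepOp::AddImm && out.back().reg == ins.reg) {
            out.back().imm += ins.imm;   // fold into the previous instruction
        } else {
            out.push_back(ins);
        }
    }
    return out;
}
```

A real pass would also have to make sure that no jump targets the instruction being folded away, and would have to fix up jump offsets after the stream shrinks; the sketch ignores both.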

BTW rofl0r also has a tool that runs pure bytecode without the engine, if that's what you meant by "disconnect the AGS Script runner from AGS Engine"? It's called agssim:
https://github.com/rofl0r/agsutils/blob/master/agssim.c

fernewelten

#21
Quote from: Crimson Wizard on Sat 16/04/2022 19:25:38
What fernewelten is referring to, probably, are checks that are done by the script interpreter during certain bytecode operations. Like - stack operations. They probably happen all the time, for every little move on the stack (I may be mistaken here, because some were disabled in the past).

Yes, that's the gist of what I was trying to say. When the script interpreter is told to float-add two things, for example, it doesn't just do that: it checks beforehand whether the respective register etc. has been loaded, whether a value of type 'float' has been written into it, and so on. That's good for checking whether the Compiler has done its thing correctly. That mechanism is bound to have discovered lots of bugs by this time.

But the trouble is, if the Compiler has properly done the job that it was programmed for, then those checks might still be done thousands of times at runtime, over and over again, even for the same source code statement and the same block of Bytecode, when they needn't have been done at all.

There are certain things that the Compiler can't check, for instance whether a pointer location that needs to be dereferenced contains a null pointer at runtime. It's sensible to make the script interpreter check these things. The Compiler may even tell the script interpreter to check something specific, by issuing a suitable Bytecode instruction.

On the other hand, there are a lot of things that the Compiler could make sure of, given our AGS language. For instance, it should be able to make sure that a memory location that is supposed to contain a float hasn't been loaded with an int instead. In these cases, a lot of time can be saved by letting the Compiler do its thing, and telling the interpreter: “Look, Buster, Master has told you that these two things are floats and commanded you to add them, so just obey right now without wasting Master's time!”

It would mean requiring that every Bytecode that is given to the Engine has been produced by the Compiler, or at least, that if you do give "hand-written assembly Bytecode" to the Engine, side-stepping the Compiler, you do it at your own risk.
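
A minimal sketch of the "production" versus "debugger" interpreter idea discussed above, assuming a toy register model. None of these names (`ValueType`, `Register`, `MakeFloat`, `AddFloat`) come from the AGS engine; they are hypothetical. The check is selected by a template parameter, so the unchecked variant reduces to a bare addition.

```cpp
#include <stdexcept>

// Hypothetical register model; not the actual AGS engine types.
enum class ValueType { Undefined, Int, Float };

struct Register {
    ValueType type = ValueType::Undefined;
    union { int i; float f; };
    Register() : i(0) {}
};

inline Register MakeFloat(float v) {
    Register r;
    r.type = ValueType::Float;
    r.f = v;
    return r;
}

// Checked = true  : "debugger" interpreter, validates operand types first.
// Checked = false : "production" interpreter, trusts the compiler's output.
template <bool Checked>
inline void AddFloat(Register &dst, const Register &src) {
    if (Checked) {   // compile-time constant, so the branch is folded away
        if (dst.type != ValueType::Float || src.type != ValueType::Float)
            throw std::runtime_error("float add on a non-float operand");
    }
    dst.f += src.f;
}
```

An interpreter loop could then be instantiated twice, once per mode, and the engine could pick one at startup. Checks that the Compiler cannot make statically, such as null pointer dereferences, would stay in both variants.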

eri0o

#22
Usually, modern interpreters don't just run the bytecode; they first do a step that is like compiling the bytecode, which performs some of these checks. AGS Script is not manipulated in real time, no eval is being run, and it doesn't have reflection.

It could pass the bytecode through a second tool before actually executing it, which would interpret the bytecode and make sure things will run smoothly. This is what I mean by not trusting the compiler. There's no need to forego the checking if we can just run it beforehand, at runtime.
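
A minimal sketch of such a "check once before executing" step, assuming a toy stack machine. The instruction set (`ToyOp`) and the function `VerifyProgram` are hypothetical and much simpler than real AGS bytecode: the verifier walks the program a single time and tracks only the types on the stack, so the execution loop itself would no longer need per-instruction type checks.

```cpp
#include <vector>

// Toy instruction set (hypothetical; real AGS bytecode is different).
enum class ToyOp { PushInt, PushFloat, AddInt, AddFloat };
enum class ToyType { Int, Float };

// Walks the program once at load time, tracking only the types on the stack.
// Returns false if any instruction would underflow the stack or see an
// operand of the wrong type.
inline bool VerifyProgram(const std::vector<ToyOp> &code) {
    std::vector<ToyType> types;
    for (ToyOp op : code) {
        switch (op) {
        case ToyOp::PushInt:   types.push_back(ToyType::Int);   break;
        case ToyOp::PushFloat: types.push_back(ToyType::Float); break;
        case ToyOp::AddInt:
        case ToyOp::AddFloat: {
            const ToyType want =
                (op == ToyOp::AddInt) ? ToyType::Int : ToyType::Float;
            if (types.size() < 2) return false;
            const ToyType rhs = types.back();
            types.pop_back();
            const ToyType lhs = types.back();
            if (lhs != want || rhs != want) return false;
            // The result has the same type as lhs, which stays on the stack.
            break;
        }
        }
    }
    return true;
}
```

The sketch handles straight-line code only. With jumps, a verifier has to check that every path reaching an instruction agrees on the stack types, which is the harder part of the job.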

About disconnecting the Script Runner: if it were less entrenched in the engine, this would achieve two things:


  • make it easier to improve
  • lower the barrier to connecting other script runners (e.g. LuaJIT)

Also just to further explain what I meant before about the YoYo Games Compiler, here's Game Maker Studio:

https://help.yoyogames.com/hc/en-us/articles/235186048-Setting-Up-For-Windows

You can see it can build and run games either with the VM setting, which is similar to AGS, or with YYC, which essentially translates the Game Maker scripting language to C++ and then runs the resulting code through a native C++ compiler. This results in a really fast executable. (This would be like building a plugin in AGS, if everything that can be done through scripting could be done through the plugin interface.)

Crimson Wizard

#23
Quote from: eri0o on Sat 16/04/2022 21:13:00
About disconnecting the Script Runner: if it were less entrenched in the engine, this would achieve two things:


  • make it easier to improve
  • lower the barrier to connecting other script runners (e.g. LuaJIT)

As a note, this task will probably have to be completed in order to make the engine less tightly coupled to the current interpreter:
https://github.com/adventuregamestudio/ags/issues/1223

as explained in the ticket itself, where it says:
Quote
The way these "translator" functions are registered, and because they are using internal engine's types not meant for anything else, prevents from sharing same registration with other potential users.

eri0o

#24
Quote
https://github.com/adventuregamestudio/ags/issues/1223

I still don't understand which type of solution for that you would find acceptable. Like the template approach you mention, would it be accepted?

Or are you looking into the reverse, like the Lua FFI interface (https://luajit.org/ext_ffi.html), which would register C functions to the AGS Script interface, and then it's mostly about exporting the AGS Engine API as C functions. Problem is this second option is not something we could then repurpose for other languages - it would be a new functionality to the script runner.

Looking through the AGS code, there are three things there:


  • RuntimeScriptValue / macros like API_OBJCALL_VOID, these apparently are the things that would make the C++ function symbol compatible with being imported in the Script Runtime.
  • ccAddExternalObjectFunction like functions, makes it possible to access the function symbol from the script runner
  • a header for runtime linkage (agsdefns.sh for the internal AGS API)


Crimson Wizard

#25
Quote from: eri0o on Sat 16/04/2022 22:51:11
Quote
https://github.com/adventuregamestudio/ags/issues/1223

I still don't understand which type of solution for that you would find acceptable. Like the template approach you mention, would it be accepted?

I can't tell which "template approach" you are referring to.
Personally I wanted to try the switch method described in the ticket, similar to what fuzzie did in her scummvm port. It may not be hard to make a minimal version with only a few registered functions and test it for correctness and any performance changes.
From the function-registration side, this approach requires functions to be registered along with a "type description" that notes the number and types of parameters and the type of the return value. This description may then even be made available through the plugin interface.
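
A minimal sketch of registering functions together with a type description and dispatching through a switch, under stated assumptions: `FunctionRegistry`, `FunctionEntry`, `AnyFn` and the one-letter signature format are hypothetical and do not reflect the engine's current registration code or the exact design in the ticket. Only `int32_t` parameters and up to two arguments are handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

using AnyFn = void (*)();   // type-erased function pointer

struct FunctionEntry {
    AnyFn fn;
    std::string signature;  // hypothetical format: one 'i' per int parameter
};

class FunctionRegistry {
public:
    void Register(const std::string &name, AnyFn fn, const std::string &sig) {
        entries_[name] = FunctionEntry{fn, sig};
    }

    // Calls a registered function. The prototype to cast back to is chosen
    // from the stored description, not guessed by the caller.
    int32_t Call(const std::string &name, const int32_t *args,
                 std::size_t count) const {
        const FunctionEntry &e = entries_.at(name);  // throws if unknown
        if (e.signature.size() != count)
            throw std::invalid_argument("wrong number of arguments");
        switch (count) {
        case 0: return reinterpret_cast<int32_t (*)()>(e.fn)();
        case 1: return reinterpret_cast<int32_t (*)(int32_t)>(e.fn)(args[0]);
        case 2: return reinterpret_cast<int32_t (*)(int32_t, int32_t)>(e.fn)(
                    args[0], args[1]);
        default: throw std::invalid_argument("arity not covered by this sketch");
        }
    }

private:
    std::map<std::string, FunctionEntry> entries_;
};

// Example native function, shared by any caller that knows its name.
inline int32_t AddTwo(int32_t a, int32_t b) { return a + b; }
```

Usage would look like `reg.Register("AddTwo", reinterpret_cast<AnyFn>(&AddTwo), "ii")` followed by `reg.Call("AddTwo", args, 2)`. Because the description travels with the function, the same table could in principle serve both the script interpreter and plugins.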

Quote from: eri0o on Sat 16/04/2022 22:51:11
Or are you looking into the reverse, like the Lua FFI interface (https://luajit.org/ext_ffi.html), which would register C functions to the AGS Script interface, and then it's mostly about exporting the AGS Engine API as C functions.

I don't know what "FFI" is, I would have to research that first to have any opinion. But the short term goal is to be able to easily share registered functions at least between script interpreter and plugins, because a plugin may implement any scripting language and use engine api to call the functions. Similar to how lua plugin was done in the past.

eri0o

Sorry, FFI means Foreign Function Interface, if I am not mistaken. Other languages usually say "interop" when they are talking about this, like here, D about C++: https://dlang.org/spec/cpp_interface.html

About template, I mean like evolving from the issue you describe, it's the last proposition you present in your list, on the top post.

About fuzzie port, you meant the interface here: https://github.com/adventuregamestudio/scummvm/blob/ags/engines/ags/scripting/character.cpp#L2130-L2142 ?
(and that fork, the ags branch could be made the default one in GitHub to make it easier to browse)

Crimson Wizard

#27
Quote from: eri0o on Sat 16/04/2022 23:28:20
About template, I mean like evolving from the issue you describe, it's the last proposition you present in your list, on the top post.

Hmm, if you are referring to the paragraphs starting with "An example of a very straightforward solution for type safety could be helper function template, where implementations would deduce a type and pass it further to actual registration."
That was merely a suggestion for registration helpers. It was supposed to be on top of the actual system, and its only purpose is type safety, not changing anything in how it works.

Quote from: eri0o on Sat 16/04/2022 23:28:20
About fuzzie port, you meant the interface here: https://github.com/adventuregamestudio/scummvm/blob/ags/engines/ags/scripting/character.cpp#L2130-L2142 ?

We have a similar interface, but the difference is that she also passes the function type along. In her variant this type is defined as a string: "i", "iii" and so on.
But looking again at how this solution works, she does not have a switch, which I thought she had for some reason. In fact, she arrived at the opposite variant, where the API functions themselves are like our "wrapper" functions. E.g. like this.

This is an alternate approach, which I forgot to mention in the ticket for some reason. It's to instead have this kind of function type exposed to plugin API, and make plugins work with it instead of calling functions of potentially unknown prototype.

The consequences of such an approach are:
* any previously existing plugins which use script functions will no longer work;
* plugins will likely have to pack parameters in an array in order to pass them into this function (extra work for them);
* a big "cleanup" job, which would replace calls to "real functions" in wrappers with the working code itself, similar to how it's done in fuzzie's port.
I might add this information to the ticket later.

eri0o

It looks like there are three things going on:


  • we want the tying of a C or C++ function to a script to be convenient, possibly including being type safe
  • we don't care much about performance for the act of making these available to script, since this will happen only once, before the game starts
  • script calls to C++ function should have a low overhead, since we know the types of things beforehand, this should not need to figure things out at runtime

Quote
We have a similar interface, but the difference is that she also passes the function type along. In her variant this type is defined as a string "i", "iii" and so on.

I feel there has to be some way to leverage compile-time introspection through templates that we could use to tie the primitive C++ types to AGS types. Unfortunately, whenever I look into these things (like https://en.wikipedia.org/wiki/Substitution_failure_is_not_an_error) I kinda fail to grasp how to actually code this stuff...
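
A minimal sketch of one way templates can tie C++ types to script type codes at compile time, without SFINAE. Everything here is hypothetical (`TypeCode`, `DescribeSignature`, the one-letter codes, the example functions); it only shows that a signature string like "iii" can be deduced from a function pointer instead of being typed by hand.

```cpp
#include <string>

// Hypothetical mapping from C++ types to one-letter script type codes.
// A type without a specialization fails to compile, which is the safety net.
template <typename T> struct TypeCode;
template <> struct TypeCode<int>          { static constexpr char value = 'i'; };
template <> struct TypeCode<float>        { static constexpr char value = 'f'; };
template <> struct TypeCode<const char *> { static constexpr char value = 's'; };
template <> struct TypeCode<void>         { static constexpr char value = 'v'; };

// Deduces "<return>(<params>)" from a plain function pointer type.
template <typename R, typename... Args>
std::string DescribeSignature(R (*)(Args...)) {
    // Expand the parameter pack into a zero-terminated array of codes.
    const char codes[] = { TypeCode<Args>::value..., '\0' };
    std::string s;
    s += TypeCode<R>::value;
    s += '(';
    s += codes;
    s += ')';
    return s;
}

// Example functions standing in for engine API functions.
inline int Sum3(int a, int b, int c) { return a + b + c; }
inline void SayText(int /*characterId*/, const char * /*text*/) {}
```

With this, `DescribeSignature(&Sum3)` yields "i(iii)" and `DescribeSignature(&SayText)` yields "v(is)". A registration helper could call it internally, so the description can never drift from the real prototype.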

eri0o

I was looking at this: https://github.com/adventuregamestudio/ags/blob/master/Engine/script/runtimescriptvalue.cpp

I had an idea for a refactor here: instead of the type being a field, have different versions of this class implementing the same interface, each with its own behavior, instead of all those IFs per type. I don't know if this helps yet, but I was looking at it, and in theory this would reduce the branches.

Crimson Wizard

#30
Quote from: eri0o on Thu 13/10/2022 21:13:04
I had an idea for refactor here, instead of the type being a thing, having different versions of this class from the same interface that each implemented their own behavior instead of all those IFs per type.

Are you speaking of virtual inheritance? In that case these objects would have to be allocated dynamically too, one at a time, and accessed through a pointer to a base class.

Alternatively you could have a pointer to a vtable in each object; then the object itself could be the same struct, allocated regularly, but have a pointer to a table of functions that may be different (this is the C style of override, seen, for example, in the Allegro 4 BITMAP struct).
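
A minimal sketch of the C-style override described above, where every value is the same plain struct but carries a pointer to a table of functions. The names (`ScriptValue`, `ValueOps`, `kPlainOps`, `kBufferOps`) are hypothetical and far simpler than the engine's RuntimeScriptValue.

```cpp
#include <cstdint>
#include <cstring>

struct ScriptValue;

// Table of operations shared by all values of one kind (hypothetical).
struct ValueOps {
    int32_t (*ReadInt32)(const ScriptValue &);
};

// Same struct for every kind of value; it can live in a plain array
// or on the stack, with no per-object dynamic allocation.
struct ScriptValue {
    const ValueOps *ops;  // C-style "vtable" pointer, set on creation
    int32_t ival;         // used by plain values
    const void *ptr;      // used by values that refer to a buffer
};

inline int32_t ReadPlain(const ScriptValue &v) { return v.ival; }

inline int32_t ReadFromBuffer(const ScriptValue &v) {
    int32_t out;
    std::memcpy(&out, v.ptr, sizeof(out));  // alignment-safe read
    return out;
}

static const ValueOps kPlainOps  = { &ReadPlain };
static const ValueOps kBufferOps = { &ReadFromBuffer };
```

A read then becomes `v.ops->ReadInt32(v)`: one indirect call instead of a chain of if/else on the type. Whether that is actually faster than a switch has to be measured, as noted earlier in the thread.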

Quote from: eri0o on Thu 13/10/2022 21:13:04
I don't know if this helps yet, but was looking at it and in theory this would reduce the branches.

In theory, the majority, if not all, of these if branches may be removed by replacing if/else with a switch, similarly to how it was done with ReadValue:
https://github.com/adventuregamestudio/ags/blob/master/Engine/script/runtimescriptvalue.h#L316

I tried that recently, but found that it actually reduced the fps a little in a project I've been testing with, so I decided to leave for later.
Maybe I did something wrong, or the test was wrong. Or this particular branching was not the main problem for that particular project.

EDIT:
My belief is that ideally there should not be any branching or behavior switch at all, and all memory access should be implemented uniformly, somehow.
One major reason this was written in the first place was that AGS compiled script assumes a 32-bit pointer size, so it won't work on 64-bit systems. A couple of people have suggested implementing a virtual memory instead, and using virtual 32-bit addresses instead of the real ones, which might fix this issue.
I suppose this is what Nick Sonneveld started to write in one of his experimental branches a few years ago.

eri0o

Quote from: Crimson Wizard on Thu 13/10/2022 22:05:15
I suppose this is what Nick Sonneveld started to write in one of his experimental branches few years ago.

Uhm, not sure, you mean this branch I think: https://github.com/sonneveld/ags/commits/ags--script

Quote from: Crimson Wizard on Thu 13/10/2022 22:05:15
Are you speaking of a virtual inheritance?

Yeahp, that was my thought!

I think the new compiler is a good step in the right direction, and perhaps in ags4 realm there's something that could improve breaking bytecode compatibility, but unfortunately I don't understand that well enough to be able to participate in such discussion. Maybe in the future.

Crimson Wizard

#32
Quote from: Crimson Wizard on Thu 13/10/2022 22:05:15
One major reason this was written in the first place was because AGS compiled script assumes 32-bit pointer size, so it won't work with 64-bit systems. Couple of people have suggested to implement a virtual memory instead, and use virtual 32-bit addresses instead of the real ones, which might fix this issue.
I suppose this is what Nick Sonneveld started to write in one of his experimental branches few years ago.

I've been testing a couple of script-heavy games with Nick Sonneveld's rewrite of the script interpreter
https://github.com/sonneveld/ags/commits/ags--script
using "infinite fps mode" (where the game runs as fast as possible without frame delays). Depending on the game and situation, it gives about a 20-25% improvement in fps compared to the 3.5.0 engine it was based on. In one game fps rose from 70 to around 84, in another from 330 to 400+.

Code-wise it's a bit dirty in places, and I don't know if it's fully feature complete.
There are a few things that it probably does not do which the current engine does. For example, the current engine can address explicit variables in the engine structs exposed to script, instead of letting the interpreter read/write raw memory without knowing where it reads from or writes to (which may be dangerous). But maybe it uses an alternate safety mechanism which I have not understood yet.

It also does very few safety checks, which is a very good thing for performance, but may make debugging mistakes harder. If it had these checks under some compilation flag, that could improve debugging too.

Implementation-wise, it solves the memory issue by having a joint virtual/real memory mapper. Whenever possible, the script data is allocated on the virtual memory "heap", whose size is limited to the 32-bit range, which lets it be referenced by 32-bit offsets instead of real addresses. When that is not possible (or not wanted for some reason), it uses the virtual-to-real memory map, which translates 32-bit handles to addresses of whatever width the platform has. The latter is like the classic managed object handles, except that it seems to work for anything. I haven't looked too deep into this, but I imagine this memory map could be used for plugins too, since they can allocate on their own and thus cannot be restricted to a virtual heap.
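
A minimal sketch of the virtual-to-real map part of this idea: 32-bit handles that the script can store, translated to real pointers of any width on access. `HandleMap` is a hypothetical class, not code from the branch discussed above, and it leaves out freeing and reusing slots.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical handle table; not taken from the branch discussed above.
class HandleMap {
public:
    // Registers a real pointer and returns a 32-bit handle for script use.
    // Handle 0 is reserved as the script-side null.
    uint32_t Add(void *real) {
        slots_.push_back(real);
        return static_cast<uint32_t>(slots_.size());  // index + 1
    }

    // Translates a handle back to the real address; nullptr if invalid.
    void *Resolve(uint32_t handle) const {
        if (handle == 0 || handle > slots_.size())
            return nullptr;
        return slots_[handle - 1];
    }

private:
    std::vector<void *> slots_;
};
```

The script only ever sees the `uint32_t`, so compiled bytecode that assumes 32-bit "pointers" keeps working on a 64-bit host. The cost is one table lookup per access, which is why allocating on a contiguous virtual heap and using plain offsets is preferred where possible.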

I actually wonder why we didn't try at least this virtual-to-real map mechanism back in 2012/13; it alone might have been more performant than the solution that I did. It seems quite a logical thing to try.




Separately, I'd like to revisit two of my past comments in this thread:

Quote from: Crimson Wizard on Fri 24/09/2021 17:49:06
Personally I would speculate that most games created with AGS so far may fit into 32-bit memory, and those which don't likely are overusing memory due to low optimization.

So, in the past year it's been found that Dave Gilbert's new full-HD game actually goes above the 32-bit RAM limit, though this was mostly due to the graphics. We did a number of memory optimizations, which reduced the RAM usage by a few hundred MB, but apparently reaching the limit is realistic. If this becomes a problem again, we might use the 64-bit engine, which can address much more RAM.


Quote from: Crimson Wizard on Fri 24/09/2021 17:49:06
most of the mem is likely to be taken by resources (sprites, sounds), and these are not exposed into script VM, so not part of this address issue. What is left for VM addresses is: script variables and managed objects. Most managed objects are merely "wrappers" which contain ID of an actual object in the engine. So probably most managed script memory goes to: dynamic containers (arrays, etc), and dynamic sprites.

I must correct the last statement here: dynamic sprites do not store the image data in the script memory. It is stored inside the sprite storage (aka the sprite cache), and therefore this data does not have to be restricted by size or address. The script's memory only stores minimal reference info.

eri0o

#33
Hey, when I played with that Nick branch I kinda didn't push it to GitHub after properly recovering it and lost my work. If you had a somewhat working version of it, I advise to push to GitHub - or somewhere, just to have a backup.

Crimson Wizard

#34
Quote from: Crimson Wizard on Sat 22/04/2023 16:44:03
I've been testing a couple of script-heavy games with a Nick Sonneveld's script interpreter's rewrite
https://github.com/sonneveld/ags/commits/ags--script
using "infinite fps mode" (where the game runs as fast as possible without frame delays) and depending on a game and situation it gives about 20-25% improvement in fps, compared to the 3.5.0 engine it was based on. In one game it raised from 70 fps to around 84 fps, in another - from 330 to 400+ fps.

For more experiments, I downported the one "script heavy" game mentioned above to AGS 3.2.1. The results combining several versions of AGS are these:

1. AGS 3.2.1: 95-98 fps
2. AGS 3.4.3: 75 fps
3. AGS 3.5.0: 70-73 fps
4. AGS 3.5.1: 69-70 fps
5. AGS 3.6.0: 65-66 fps (with certain fixes was able to increase to around 68.5 fps so far; UPD: 72 fps now)
6. Nick Sonneveld's interpreter rewrite: 85-89 fps (85 when running a game made in 3.4.3, and 89 when running a game made in 3.2.1 for some reason).

This is just to illustrate the results of my editing the script interpreter in 2012-13, when trying to make it work on 64-bit systems and to give the engine better compatibility with old games. Basically, starting with AGS 3.3, script execution lost about 1/3 of its potential speed.
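
As a quick check of the figures quoted in this post, here is the arithmetic written out as a small helper. `RelativeChange` is a hypothetical name, and using the midpoints of the reported fps ranges (96.5 for 3.2.1, 65.5 for 3.6.0 before fixes) is an assumption made only for illustration.

```cpp
// Relative change between two measurements; 0.20 means "20% higher".
inline double RelativeChange(double before, double after) {
    return (after - before) / before;
}

// Figures from the posts above:
//   RelativeChange(70.0, 84.0)   ->  0.20    (3.5.0 vs. the rewrite)
//   RelativeChange(330.0, 400.0) -> ~0.21    (second test game)
//   RelativeChange(96.5, 65.5)   -> ~-0.32   (3.2.1 vs. 3.6.0, midpoints)
```

The last value is where "about 1/3" comes from: roughly a third of the 3.2.1 frame rate is gone in 3.6.0 before the later fixes.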

Guess this also answers @Monsieur OUXX 's original question to some degree.

Of course, the above "only" matters when you're doing a huge amount of calculation and data manipulation in your game, like 3D matrix math and physics. This may also be related to the way data is stored, and how often you create and delete dynamic objects (dynamic arrays, managed structs), although I do not have an assessment of what relative impact managed objects have on this. It might be interesting to reimplement the test game I've been using here, and maybe another game of mine (car racing), relying strictly on non-managed variables, and record the difference, for statistics.

In all the other cases the performance issues would likely be caused by unoptimized graphics, etc.

LimpingFish

I'm way out of my depth here, but I'd just like to point out that Kweepa's Panorama modules run considerably slower in post 3.21 versions of AGS.
Steam: LimpingFish
PSN: LFishRoller
XB: TheActualLimpingFish
Spotify: LimpingFish

Crimson Wizard

#36
Quote from: LimpingFish on Thu 27/04/2023 00:33:28
I'm way out of my depth here, but I'd just like to point out that Kweepa's Panorama modules run considerably slower in post 3.21 versions of AGS.

I could imagine it's slower, but do you have any data on this?

For the reference, I tried a demo game from this thread:
https://www.adventuregamestudio.co.uk/forums/modules-plugins-tools/module-panorama3d-v1-3/msg636644446/#msg636644446

Results on my system were:
- Original exe (made in AGS 3.12): 31-33 fps with Software renderer (was called Allegro/DX5); 27-28 fps with Direct3D.
- 3.6.0 engine: ~30 fps when using Software renderer; 25-26 fps with Direct3D/OpenGL.

(I have a mid-range, 8-year-old PC, if that matters)

My first assumption was that the main culprit is the inefficient Get/SetPixel command in AGS. EDIT: Ah, no, instead it seems to do this by repeatedly creating dynamic sprites and rotating them each time in the "redraw" function.
Of course, if AGS script had some kind of a 3D polygon API, and engine did the main calculation/drawing internally, then things could be done much more efficiently.


UPDATE
I've been experimenting with speed fixes for script lately, and surprisingly my last attempt runs the Panorama Demo faster than the original by about a couple of fps:
https://www.dropbox.com/s/q3ga78b09vo09m6/acwin-361-perffixes.zip?dl=0

Of course I don't assume I made script run faster than 3.1, so my explanation is that this module or older engine had more performance problems elsewhere than the script itself, and newer engines improved that.

eri0o

Quote from: Crimson Wizard on Thu 27/04/2023 01:42:01
if AGS script had some kind of a 3D polygon API, and engine did the main calculation/drawing internally, then things could be done much more efficiently.

SDL2 has a polygon API, but it depends on the SDL2 renderer. In SDL3 it should be rewritten to use its new SDL GPU backend, a new API in development for generic, shader-first 3D rendering, in a Metal/Vulkan/Direct3D 12-first spirit. Just mentioning it: in case we ever want to walk that road, it should be easier in the not-so-far future.

eri0o

@Crimson Wizard I tried some of my games. Apparently most of my stuff is bound by the drawing operations, but my ImGi module benefited enormously from your changes.

This module has its own software renderer, where each drawing command is hashed and only the rectangles whose hash has changed are redrawn. This math is still reasonably expensive on regular AGS, but with your changes the processing time is more than cut in half.
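
A minimal sketch of the "hash each drawing command, redraw only what changed" technique mentioned above. This is not the ImGi module's actual code (which is written in AGS script); `DrawCommand`, `HashCommand` and `DirtyTracker` are hypothetical, and FNV-1a is just one convenient choice of hash.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct DrawCommand {
    int x, y, w, h;
    uint32_t color;
};

// FNV-1a over the command's fields, byte by byte.
inline uint32_t HashCommand(const DrawCommand &c) {
    const uint32_t fields[] = {
        static_cast<uint32_t>(c.x), static_cast<uint32_t>(c.y),
        static_cast<uint32_t>(c.w), static_cast<uint32_t>(c.h), c.color
    };
    uint32_t h = 2166136261u;               // FNV offset basis
    for (uint32_t f : fields) {
        for (int shift = 0; shift < 32; shift += 8) {
            h ^= (f >> shift) & 0xFFu;
            h *= 16777619u;                 // FNV prime
        }
    }
    return h;
}

class DirtyTracker {
public:
    // Returns the indices of commands whose hash differs from the
    // previous frame; only those rectangles need to be redrawn.
    std::vector<std::size_t> Update(const std::vector<DrawCommand> &cmds) {
        std::vector<std::size_t> dirty;
        std::vector<uint32_t> hashes(cmds.size());
        for (std::size_t i = 0; i < cmds.size(); ++i) {
            hashes[i] = HashCommand(cmds[i]);
            if (i >= prev_.size() || prev_[i] != hashes[i])
                dirty.push_back(i);
        }
        prev_ = hashes;
        return dirty;
    }

private:
    std::vector<uint32_t> prev_;
};
```

The hashing loop is exactly the kind of tight integer and array work that runs entirely inside the script interpreter, which would explain why it gains the most from interpreter speedups.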

Crimson Wizard

Final version of a performance fix (for now, probably):
https://cirrus-ci.com/task/6713476341563392

Bumps game fps by 15-20% in the test games with lots of array manipulations.
