Lua Scripting (lua.mo)

Started by dmilligan, March 29, 2015, 04:44:07 AM

JohanJ

Just tried lua_fix Build #398 (04-Apr-2017 23:29:58). It has similar issues as soon as file_man.mo is active! I did not get a total freeze, but file_man behaves oddly after a while: when toggling through the folders, the entire file structure suddenly disappeared, and from that moment the trash/info/menu/q buttons were dead, and so was the back screen. It was still possible to take pictures, but you could not review them at all. To get out of there I had to pull the battery; a reboot was not enough.

It all comes down to file_man.mo, for the builds from both 04/04 and 05/01. As long as the module is not activated, both builds seem to work properly (as far as I could test right now, which was not that deep!!).
60D.111 / 100D.101 / M2.103

garry23

@A1ex

May have found a bug: if not it's ML-LUA strangeness  ;)

Here's the test code (running on latest Lua, but checked against previous Lua versions as well):

display.rect(300, 300, 100, 100, COLOR.RED, COLOR.WHITE)
display.print("TEST", 300, 320, FONT.LARGE, COLOR.GREEN1, COLOR.TRANSPARENT)


The issue is that the text is drawn with a black background, i.e. not a transparent one.

Cheers

Garry

a1ex

In this case, "transparent" means the underlying image VRAM is visible. There's also TRANSPARENT_BLACK, similar to the one used for Canon's info overlays in LiveView (semi-transparent), or COLOR_TRANSPARENT_GRAY (only defined in bmp.h, and not exposed because not all camera models can use it).

For the bitmap overlay, there's NO_BG_ERASE defined in bmp.h, but currently it's only used by internal ML code that required this behavior before this change; it's not exposed to Lua.

There's also the SHADOW_FONT flag, not yet exposed to Lua, but could be helpful (as it only draws the background pixels that are adjacent to a foreground pixel).

garry23

@A1ex

Thanks for the info.

BTW you are just about to see another posting from me, introducing a new 'feature', ie a focus bar.

It will be interesting to see what you think.

Cheers

Garry

a1ex

Suspecting some stack corruption from lua_getfield (which appears to require a lua_pop, but that doesn't happen everywhere, see e.g. lua_dryos.c), I've started to dig inside the Lua C API (which I still don't really understand) and got some questions:

1) Is there some easy way to check the stack usage of our C functions? (for example, to make sure each lua_getfield call is paired with a lua_pop, but not only)

I can imagine some API test that calls every such function 1000 times or so to check whether it will overflow the stack, but that doesn't look very nice.

The LUA_FIELD_* macros appear to address this purpose, but they are not used everywhere; as a first step, I'd refactor all the calls to lua_getfield to use similar macros. But I suspect there are more instances of similar behavior, not obvious from the function names.

2) I've also stumbled upon the LUA_PARAM* macros. The Lua manual recommends luaL_checkinteger/number/string/whatever for the same purpose. Any reason we use those long macros, instead of the standard functions?

3) Most of the C tables allow setting arbitrary fields besides the predefined ones (but not all - font doesn't). What is the rationale behind this? Does any of the existing scripts make use of this feature?

I'm tempted to make them read-only, so it would catch typos when accessing these fields, rather than creating new fields that have no effect. Example: you could write in a script: lv.emabled = true (typo) and with the current behavior, you'll probably spend some time figuring out why this doesn't enter LiveView.

4) Most of the fields (from various objects) appear in 3 places (and it's easy for them to become inconsistent, as that doesn't give a compiler error, and I couldn't come up with a way to check them in api_test.lua either). For example, in lua_battery, all the fields (level, id, performance etc) appear in luaCB_battery_index, in luaCB_battery_newindex (on the long line with strcmp) and in lua_battery_fields. Any suggestions for reducing these repetitions?
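
One common C idiom for this kind of repetition is an "X-macro" list: the field names are written once, and every other place is generated from that list. Below is a minimal sketch of the idea, not the actual lua_battery code. The field names are taken from the example above; BATTERY_FIELDS, battery_field_names and battery_field_lookup are hypothetical names made up for illustration. An unknown key (such as a typo) yields -1, which a read-only table could turn into a script error.

```c
#include <assert.h>
#include <string.h>

/* Single source of truth for the field names.
 * Illustrative list only; not the real contents of lua_battery. */
#define BATTERY_FIELDS(X) \
    X(level)              \
    X(id)                 \
    X(performance)

/* Expansion 1: one enum constant per field, plus a count. */
#define AS_ENUM(name) BATTERY_FIELD_##name,
enum battery_field { BATTERY_FIELDS(AS_ENUM) BATTERY_FIELD_COUNT };

/* Expansion 2: the matching name table, always in the same order. */
#define AS_STRING(name) #name,
static const char *const battery_field_names[] = { BATTERY_FIELDS(AS_STRING) };

/* Hypothetical helper: map a key to its field index, or -1 if unknown.
 * Both an index and a newindex handler could call this instead of
 * each carrying its own chain of strcmp calls. */
static int battery_field_lookup(const char *key)
{
    assert(key != NULL);
    for (int i = 0; i < BATTERY_FIELD_COUNT; i++) {
        if (strcmp(key, battery_field_names[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

With this layout, adding a field means adding one X(...) line; the enum and the name table cannot drift apart because both come from the same list.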

dmilligan

1. You might not always want to pop a getfield:
  - if it's going to be the value returned from a function
  - if some other lua API call is going to consume/pop it
  - if it's going to be an argument to a function call or the function being called (which will result in it and the other arguments getting popped)

Keeping track of what's going on on the stack is probably the single hardest part of writing Lua API stuff in C. I don't really know of a way to automatically verify stack operations, because what happens to the stack simply depends on what you are trying to accomplish. You'll notice that in the Lua API docs, in gray on the right side, there are three values for each function; these values describe what that particular function does to the stack.

Sometimes you have to write something so complicated that the only way to do it is to add code comments describing what is going on with the stack. Something relatively simple in Lua can be very challenging to write as the equivalent C. For example, to simply do a nested function call like: foo(1, bar(2,t[3]), 4) in C you would need to:
push foo on the stack
push 1 on the stack
push bar on the stack
push 2 on the stack
push t on the stack
push 3 on the stack
getfield
call
push 4 on the stack
call
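
The bookkeeping for the steps above can be sketched as a small table of stack effects, in the spirit of the gray values in the Lua manual. This is a toy model in plain C, not Lua API code: struct stack_effect, nested_call and replay_stack are hypothetical names, both calls are assumed to return exactly one result, and the indexing step is assumed to consume both t and the key. (With the real API, lua_gettable pops only the key and leaves the table on the stack, so the real sequence would need one more step to remove t.)

```c
#include <assert.h>
#include <stddef.h>

/* Stack effect of one step: how many values it pops and pushes. */
struct stack_effect {
    const char *step;
    int pops;
    int pushes;
};

/* foo(1, bar(2, t[3]), 4), one entry per step of the list above. */
static const struct stack_effect nested_call[] = {
    { "push foo",          0, 1 },
    { "push 1",            0, 1 },
    { "push bar",          0, 1 },
    { "push 2",            0, 1 },
    { "push t",            0, 1 },
    { "push 3",            0, 1 },
    { "index t[3]",        2, 1 },  /* assumption: consumes t and the key */
    { "call bar (2 args)", 3, 1 },  /* pops bar + 2 args, pushes 1 result */
    { "push 4",            0, 1 },
    { "call foo (3 args)", 4, 1 },  /* pops foo + 3 args, pushes 1 result */
};

#define NESTED_CALL_STEPS (sizeof(nested_call) / sizeof(nested_call[0]))

/* Replays the steps and returns the final stack depth, or -1 if a step
 * tries to pop more values than are on the stack. The deepest point
 * reached is stored in *peak. */
static int replay_stack(const struct stack_effect *steps, size_t count, int *peak)
{
    assert(steps != NULL && peak != NULL);
    int depth = 0;
    *peak = 0;
    for (size_t i = 0; i < count; i++) {
        if (steps[i].pops > depth) {
            return -1;
        }
        depth += steps[i].pushes - steps[i].pops;
        if (depth > *peak) {
            *peak = depth;
        }
    }
    return depth;
}
```

Replaying nested_call ends with a single value on the stack (the result of foo), which is the kind of invariant a stack comment in the C code would record after each line.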

2. They do a lot more than those built in functions, for example providing a way to specify the default value for optional parameters, and also inserting the name of the parameter into the error message (that way the user maybe has some idea of what they did wrong).

3. Probably a good idea to make those tables readonly.

4. to be continued...
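
To illustrate point 2, here is a rough sketch of what "default value plus parameter name in the error message" amounts to. It is a toy model in plain C and not the real LUA_PARAM* macros: struct arg, param_int_optional and the error text are all hypothetical, and the argument list is modelled as a simple array instead of the Lua stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a script argument. */
enum arg_type { ARG_ABSENT, ARG_INT, ARG_STRING };

struct arg {
    enum arg_type type;
    int i;
    const char *s;
};

/* Hypothetical helper: read an optional integer parameter.
 * - missing argument: *out gets default_value, returns 0
 * - wrong type: err gets a message naming the parameter, returns -1
 * - otherwise: *out gets the argument value, returns 0 */
static int param_int_optional(const struct arg *args, int nargs, int index,
                              const char *name, int default_value,
                              int *out, char *err, size_t err_size)
{
    assert(out != NULL && err != NULL && err_size > 0);
    err[0] = '\0';
    if (index >= nargs || args[index].type == ARG_ABSENT) {
        *out = default_value;
        return 0;
    }
    if (args[index].type != ARG_INT) {
        snprintf(err, err_size, "expected integer for param '%s'", name);
        return -1;
    }
    *out = args[index].i;
    return 0;
}
```

The point of naming the parameter is that the script author sees which argument was wrong, not just its position.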


a1ex

I think I've found a fix for file_man crashing. Maybe not the cleanest way, but at least I no longer get warnings in QEMU. Ended up doing some major changes on the menu backend, such as enforcing valid names on every single menu entry (to avoid null pointer issues).

Also renamed dryos.prefix to dryos.image_prefix and made sure it's actually working (with an API test that also shows how to get the file name of the current image).

Didn't look into stack issues yet.

garry23

@a1ex

Just tried the Latest Build (2017-06-23 22:42): my birthday build as it happens :)

I've only got access to my EOSM at the moment and all looks OK apart from one minor observation.

When I double tap the screen to go into the ML menu, the screen flashes Orange.

I'll keep testing. So far my EOSM Toggler script works as 'normal', as does my focus bar script.

Cheers

Garry

JohanJ

Did some testing with the latest lua_fix.2017Jun24 both with 60D and 100D.
From a still photography point of view, it is a stable solution on both cameras, which is great! Did not dig into mlv though.

And the good news is that file manager is solid now. No hangs, whatever I tried to provoke them.

Still, there is one artifact worth mentioning. When activating modules and booting the camera, I always get a console window with an assert message at the end, see log file below (same for 60D and 100D)

ML ASSERT:
a
at ../../src/stdio.c:44 (streq), task module_task
lv:0 mode:2


Magic Lantern version : lua_fix.2017Jun24.60D111
Mercurial changeset   : fe6b0207f229+744f5868a308+ (lua_fix)
Built on 2017-06-23 22:49:10 UTC by jenkins@nightly.
Free Memory  : 389K + 1546K


I had chosen dual_iso, ettr and file_man modules. The console remains active unless I turn it off in Debug/Module Debug (Off -->ON --> Off).

It is also strange that additionally activating lua.mo provokes the assert again on (re)boot and writes an additional assert.log, but the console window disappears after a few seconds and remains inactive. Is there some change in the console window handling that lua.mo takes advantage of, but the other modules do not (yet)?

Also not clear whether this assert has any consequence besides the console behavior and lots of log files written to the memory card.
60D.111 / 100D.101 / M2.103

a1ex

@JohanJ: that's yet another null pointer issue (likely in one of these modules, during initialization). Before, it went unnoticed; that assert won't change the old behavior (whether it was good or bad), other than printing a message. Will look into it.

Yes, Lua.mo hides the console after startup.

@garry23: is the orange screen a regression over the previous build, or was it there from the beginning?

garry23

@a1ex

The orange flash only appeared in the latest build.

Cheers


Garry

a1ex

What does it look like?

It would be very helpful if you can compile the code and run "hg bisect" to narrow down the change that caused it, as I have no idea what it might be.

garry23

@a1ex

Also get this assert file, every time I switch on:

ML ASSERT:
a
at ../../src/stdio.c:44 (streq), task module_task
lv:1 mode:3
Magic Lantern version : lua_fix.2017Jun24.EOSM202
Mercurial changeset   : fe6b0207f229+744f5868a308+ (lua_fix)
Built on 2017-06-23 22:43:05 UTC by jenkins@nightly.
Free Memory  : 187K + 3209K


Also just tested with all my scripts removed, to prove it was not me. Orange flash still occurs.

Cheers

Garry

a1ex

Still no idea what the "orange flash" looks like - a video would be best.

garry23


a1ex

May I see a similar video with the previous build, to know how you expect it to behave? If possible, with the same settings and test scene.

garry23

@a1ex

Build 2017-01-25 23:34:20

No orange flash.

BTW this is not a high priority matter for me ;-)




a1ex

Well, it's an unexpected change that I have no idea where it comes from. Looks like the exposure increases when you touch the screen (maybe for autofocus), and since the menu does not appear instantly, you can see the image getting brighter.

That's why I've asked for the two tests to be done at the same settings and test scene.

The puzzling part is - why is there a difference between builds?

There are some differences between the screenshots (for example, the most recent one has some dots on it). Not sure where they come from (cropmarks? AF?)

Also, you can find older builds here: https://builds.magiclantern.fm/jenkins/view/Experiments/job/lua_fix/
Can you narrow it down to a more recent build?

garry23

@a1ex

Just reloaded the latest experimental Lua build and the orange flash occurs with no modules loaded and cropmarks off.

garry23

@a1ex

Just tried all the Lua fix builds in Jenkins down to #381: all appear to have an orange flash.

The build that doesn't have the flash is this one:





a1ex

Ah, it's a build from the focus_pos branch; that one is based on 4708e20 (somewhere between #381 and #371). Can you check whether #371 has the flash?

Assuming #371 doesn't have the flash, I'm compiling intermediate builds here: https://builds.magiclantern.fm/jenkins/job/lua_fix_dbg

garry23

@a1ex

#371 does (sic) have the flash.

a1ex

Can you check 4708e (first from https://builds.magiclantern.fm/jenkins/job/lua_fix_dbg/4 ) and 43c863 (last from https://builds.magiclantern.fm/jenkins/job/lua_fix_dbg/8 ) ?

If one of them has the flash, you can proceed narrowing down within the other builds from the same list.

Otherwise, can you check some older builds from lua_fix, and also the latest regular (non-experimental) build?

garry23

@a1ex

4708e and 43c863 both flash.

Have looked back as far as I could in Lua fix, i.e. #369, which also flashes.