Kodi Community Forum
Segfault on script invocation - Printable Version


Segfault on script invocation - micantox - 2009-04-16

Hi all,

I've very recently started studying the XBMC framework for a personal project.

I'm trying to attach some python scripts to the default skin, PM3.HD, and see what happens.

In the file PM3.HD/720p/MyWeather.xml, at the point

<control type="button" id="4">
<description>Settings button</description>
<posx>0</posx>
<posy>135</posy>
<label>5</label>

I inserted the following line

<onclick>RunScript(/path/to/file.py, chat)</onclick>

where /path/to/file.py is an EMPTY file (touch /path/to/file.py).

Then, after starting an xbmc.bin instance and navigating to the settings button on the weather page, I press the settings button (with the Enter key) repeatedly and quite quickly. After a variable number of presses, the application crashes.

The problem seems to be related to a deallocation phase (while iterating over the entries of a dict structure), when an xbmcgui.ControlGroup is being cleared.

The backtrace follows, together with the output of p *type (please take both into account).

I hope someone can make more of this than I could.

Thank you all,
Antonio


#0 0xb0a4651c in type_dealloc (type=0x8bd9e40) at Objects/typeobject.c:2108
#1 0xb0a2a180 in insertdict (mp=0xa07c194, key=0xb38adca8, hash=-157863416, value=0xb0adaec8) at Objects/dictobject.c:405
#2 0xb0a2a699 in PyDict_SetItem (op=0xa07c194, key=0xb38adca8, value=0xb0adaec8) at Objects/dictobject.c:589
#3 0xb0a2ec76 in _PyModule_Clear (m=0xb0bdc3b4) at Objects/moduleobject.c:136
#4 0xb0a8ea6d in PyImport_Cleanup () at Python/import.c:436
#5 0xb0a997f4 in Py_EndInterpreter (tstate=0xa0125f8) at Python/pythonrun.c:560
#6 0x086f486f in XBPyThread::Process (this=0xb38acef8) at XBPyThread.cpp:257
#7 0x0886ded8 in CThread::staticThread (data=0xb38acef8) at Thread.cpp:206
#8 0x087fd7b6 in InternalThreadFunc (data=0xb0bc3f20) at XThreadUtils.cpp:126
#9 0xb7c34f7b in ?? () from /usr/lib/libSDL-1.2.so.0
#10 0xb0bc3f20 in ?? ()
#11 0x087fd730 in ?? ()
#12 0xb0bdd868 in ?? ()
#13 0xb7c8d840 in ?? () from /usr/lib/libSDL-1.2.so.0
#14 0x00000000 in ?? ()

(gdb) p *type
$1 = {ob_refcnt = 0, ob_type = 0xb0adecc0, ob_size = 0, tp_name = 0x8a42e2c "xbmcgui.ControlGroup", tp_basicsize = 52, tp_itemsize = 0, tp_dealloc = 0x8712650 <ControlGroup_Dealloc>,
tp_print = 0, tp_getattr = 0, tp_setattr = 0, tp_compare = 0x86f7640 <Control_Compare>, tp_repr = 0xb0a46ac9 <object_repr>, tp_as_number = 0x0, tp_as_sequence = 0x0,
tp_as_mapping = 0x0, tp_hash = 0, tp_call = 0, tp_str = 0xb0a46c5e <object_str>, tp_getattro = 0xb0a311fc <PyObject_GenericGetAttr>,
tp_setattro = 0xb0a315b9 <PyObject_GenericSetAttr>, tp_as_buffer = 0x0, tp_flags = 5611,
tp_doc = 0x8ba8460 "ControlGroup class.\n\nControlGroup(x, y, width, height\n\nx", ' ' <repeats 14 times>, ": integer - x coordinate of control.\ny", ' ' <repeats 14 times>, ": integer - y coordinate of control.\nwidth : integer - width of contr"..., tp_traverse = 0, tp_clear = 0, tp_richcompare = 0, tp_weaklistoffset = 0, tp_iter = 0, tp_iternext = 0,
tp_methods = 0x8bd9e20, tp_members = 0x0, tp_getset = 0x0, tp_base = 0x8bd8800, tp_dict = 0xb0bcf534, tp_descr_get = 0, tp_descr_set = 0, tp_dictoffset = 0,
tp_init = 0xb0a469eb <object_init>, tp_alloc = 0xb0a432b9 <PyType_GenericAlloc>, tp_new = 0x8712750 <ControlGroup_New>, tp_free = 0xb0a327eb <PyObject_Free>, tp_is_gc = 0,
tp_bases = 0xb0bcf514, tp_mro = 0xb0bcf474, tp_cache = 0x0, tp_subclasses = 0x0, tp_weaklist = 0xb0bcf744, tp_del = 0}


- micantox - 2009-04-22

There is news about this.

After a little bit of code scouting, I've noticed that there must be some resource allocation/deallocation inconsistency at GUI level.

An ugly hack that prevents this bug is to block a second Python thread from being started for a source file that is already running: if you keep clicking a button (invoking the corresponding RunScript), only the first instance is launched and all the others are discarded.
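
The hack described above could be sketched roughly as follows. This is only an illustration in standalone C++11, not the actual change: RunningScripts, TryAcquire and Release are hypothetical names that do not exist in XBMC, and where they would be called from XBPyThread is left open.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Hypothetical helper (not part of XBMC): remembers which script files
// currently have a running interpreter thread.
class RunningScripts
{
public:
  // Returns true if the caller may start the script, false if an
  // instance started from the same source file is still running.
  bool TryAcquire(const std::string& path)
  {
    assert(!path.empty());
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_running.insert(path).second;
  }

  // To be called once the script's thread has finished its cleanup.
  void Release(const std::string& path)
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_running.erase(path);
  }

private:
  std::mutex m_mutex;
  std::set<std::string> m_running;
};
```

A caller would drop the RunScript request whenever TryAcquire returns false, and call Release only after the interpreter for that script has been fully torn down. The obvious drawback is that a script can never be run twice in parallel on purpose.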

I wonder whether this is too strict a limitation: if a more experienced user or developer can think of possible drawbacks of this approach and give me some hints, I would be really glad :)

Thanks in advance


- micantox - 2009-04-23

micantox Wrote:After a little bit of code scouting, I've noticed that there must be some resource allocation/deallocation inconsistency at GUI level.

To confirm this, I've commented out the DeInitializeInterpreter part of the Process() function in the XBPython Thread implementation.

This is the code which does not get executed any more:

Code:
void XBPython::DeInitializeInterpreter()
{
  // Each call below releases (Py_DECREF) objects owned by that module.
  DeinitXBMCModule();
  DeinitPluginModule();
  DeinitGUIModule();
}

By doing so, an unlimited number of Python script invocations can be performed without triggering the dealloc bug (which is related to the Py_DECREF calls within those functions). However, I am stuck trying to find which part of the code decrements the reference count of the very same resources a second time.
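
To illustrate why an unbalanced decrement eventually ends in a dealloc, here is a minimal sketch in plain C++. It is not XBMC or CPython code: FakeType, IncRef, DecRef and RunsUntilDealloc are hypothetical names, the starting count is an arbitrary assumption, and overlapping script threads are not modelled.

```cpp
#include <cassert>

// Simplified stand-in for a statically allocated Python type object.
// It models reference counting only; it is not the CPython API.
struct FakeType
{
  int refcnt;
  bool deallocated;
};

void IncRef(FakeType& t)
{
  assert(!t.deallocated);  // taking a reference to a freed object is a bug
  ++t.refcnt;
}

void DecRef(FakeType& t)
{
  if (--t.refcnt == 0)
    t.deallocated = true;  // CPython would call tp_dealloc at this point
}

// Each script run takes `taken` references at init and drops `dropped`
// references at deinit. Returns the run during which the object is
// deallocated, or -1 if it survives maxRuns runs.
int RunsUntilDealloc(int initial, int taken, int dropped, int maxRuns)
{
  FakeType t = {initial, false};
  for (int run = 1; run <= maxRuns; ++run)
  {
    for (int i = 0; i < taken; ++i)
      IncRef(t);
    for (int i = 0; i < dropped; ++i)
    {
      DecRef(t);
      if (t.deallocated)
        return run;
    }
  }
  return -1;
}
```

With a starting count of 3, one reference taken and two dropped per run, RunsUntilDealloc(3, 1, 2, 100) returns 3, while the balanced case RunsUntilDealloc(3, 1, 1, 100) returns -1. A net loss of one reference per run is enough to reach zero sooner or later.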

Sorry for the bother; if you think this is not of interest here, I will try asking somewhere else.

As usual, any help would be appreciated.
Thanks in advance.


- jmarshall - 2009-04-23

The script is empty, right, but the file is present? So the script will fail to run.

Does it happen with scripts that are present but simple enough that they can be run quickly enough to also cause this problem?

Cheers,
Jonathan


- micantox - 2009-04-23

jmarshall Wrote:The script is empty, right, but the file is present? So the script will fail to run.

Actually, it does not look like a proper failure: I would rather say that it goes through the initialization phase and, finding no code to execute, is immediately deinitialized.
I can trace the execution in a gdb session: PyRun_SimpleFile (and the call chain below it) is invoked correctly, but when it reaches the Python code it immediately returns up the chain and the XBMC deinitialization code is invoked.

jmarshall Wrote:Does it happen with scripts that are present but simple enough that they can be run quickly enough to also cause this problem?

Yes. I started from the observation that a very simple Python script fails on multiple invocations (a new invocation is only possible until the Python code enters the doModal part, so you have to press Enter quite fast).

I think this is important: I found that xbmcgui.ControlGroup behaves differently from xbmcgui.ControlSpin as far as the allocation/deallocation sequence is concerned. You can log this in DeinitGUIModule by doing something like this:

Code:
CLog::Log(LOGFATAL, "In DeInitGUIModule: before ctrlgrp->refcnt = %d", ControlGroup_Type.ob_refcnt);
Py_DECREF(&ControlGroup_Type);
CLog::Log(LOGFATAL, "In DeInitGUIModule: after ctrlgrp->refcnt = %d", ControlGroup_Type.ob_refcnt);

You will have to press Esc repeatedly to exit from the multiple invocations of the same script. Every time that part of the code is invoked, the value of the reference count is two lower than at the previous invocation, rather than one lower as one would expect at first glance. That must be due to some other Py_DECREF called somewhere else on the very same object (ControlGroup_Type), but I cannot find where this happens.
If you do the very same check on ControlSpin_Type, you will notice that it is correctly decremented by just one at every deallocation.

Other GUI elements and at least one XBMC module type (Language_Type) suffer from the very same issue.
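
Since several types are affected, the same check can be done for all of them at once. The following is a rough sketch in standalone C++11, under the assumption that each count can be read through a small callback; RefcountProbe, RefcountDelta, MeasureDeinit and OverReleased are hypothetical names, not XBMC functions. In XBMC the callback would presumably return something like ControlGroup_Type.ob_refcnt.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical probe: a label plus a callback returning the current count.
struct RefcountProbe
{
  std::string name;
  std::function<long()> read;
};

struct RefcountDelta
{
  std::string name;
  long before;
  long after;
};

// Reads every probe, runs the deinitialization step, reads every probe again.
std::vector<RefcountDelta> MeasureDeinit(const std::vector<RefcountProbe>& probes,
                                         const std::function<void()>& deinit)
{
  assert(static_cast<bool>(deinit));
  std::vector<RefcountDelta> deltas;
  for (const RefcountProbe& p : probes)
    deltas.push_back({p.name, p.read(), 0});
  deinit();
  for (std::size_t i = 0; i < probes.size(); ++i)
    deltas[i].after = probes[i].read();
  return deltas;
}

// Names of the probes whose count dropped by more than one.
std::vector<std::string> OverReleased(const std::vector<RefcountDelta>& deltas)
{
  std::vector<std::string> names;
  for (const RefcountDelta& d : deltas)
    if (d.before - d.after > 1)
      names.push_back(d.name);
  return names;
}
```

Passing one probe per type object and the deinitialization call as the second argument would list, in one run, every type that loses more than one reference.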

jmarshall Wrote:Cheers,
Jonathan

Thanks for your answer (I hope my multiple posts are not driving you to despair ;-) )


- micantox - 2009-04-23

I've noticed that the controls that are not affected by the double reference-count decrement during the deinitialization phase are those that assign 0 to the tp_new field of their type object.

They are Control_Type and ControlSpin_Type only: all the others define their own tp_new function and behave in the described (and likely unexpected) manner.

And this concludes my code survey (and noise) for today ;)

Cheers,
Antonio


- jmarshall - 2009-04-23

Thanks for your investigation. If you could add a ticket to trac with a link here that would be most useful to ensure it doesn't get lost in the rest of the forum noise :)

Cheers,
Jonathan


- micantox - 2009-04-24

jmarshall Wrote:Thanks for your investigation. If you could add a ticket to trac with a link here that would be most useful to ensure it doesn't get lost in the rest of the forum noise :)

Cheers,
Jonathan

Done. The ticket is at:

http://xbmc.org/ticket/6430

and a similar problem (as far as the conditions under which it shows up are concerned) is described in another ticket I've just opened:
http://xbmc.org/ticket/6431

Thanks a lot for your assistance. :)

Bye,
Antonio