#tde-devs < 2026/02/02 > | |
|---|---|
| [00:09] | SlavekB has joined |
| [00:46] | micheleC has joined |
| [01:57] | micheleC has quit (Read error: Connection reset by peer) |
| [12:00] | micheleC has joined |
| [12:01] | denk: micheleC: I found libkjs problem, but I don't know how to solve it |
| [12:01] | denk: shortly, it is a clang optimization |
| [12:02] | denk: it removed an if-condition |
| [12:02] | denk: I compared assembler code after g++14 and clang++ for a function |
| [12:03] | micheleC: it must be some sort of c++ bug, because an optimization from compiler can't break code, unless the compiler is buggy itself |
| [12:03] | micheleC: can you point at the line of code? |
| [12:03] | denk: just a moment... |
| [12:03] | micheleC: you can also build without optimization and compare |
| [12:05] | denk: https://mirror.git.trinitydesktop.org/gitea/TDE/tdelibs/src/branch/master/kjs/value.cpp#L77 |
| [12:05] | tde-bot: Page title: tdelibs/value.cpp at master - tdelibs - TDE Gitea Workspace |
| [12:05] | denk: https://mirror.git.trinitydesktop.org/gitea/TDE/tdelibs/src/branch/master/kjs/simple_number.h |
| [12:05] | tde-bot: Page title: tdelibs/simple_number.h at master - tdelibs - TDE Gitea Workspace |
| [12:05] | denk: in that case this = 0x1 |
| [12:06] | denk: return ((long)imp & mask) == tag; |
| [12:06] | denk: 1 & 3 == 1 |
| [12:06] | micheleC: let me have a look |
| [12:07] | denk: btw, in the assembler code I see the pointer being cut |
| [12:07] | denk: it places a value in %rax, but only %eax is used |
| [12:08] | micheleC: if "this == 0x1" something is very wrong. "this" points to some objects and I am sure they are not allocated at address 0x1 |
| [12:08] | denk: no, it's correct, ::zero() -> ::make(0) |
| [12:08] | micheleC: so you need to find out why "this" gets such odd value |
| [12:08] | denk: I don't remember the exact classes |
| [12:09] | micheleC: this is a pointer to an object. So it is either pointing to the stack or to the heap. I don't know the details of your architecture, but to point at 0x1 is a symptom of a corrupted pointer |
| [12:10] | micheleC: what is your OS/arch? |
| [12:10] | denk: Program terminated with signal SIGSEGV, Segmentation fault. |
| [12:10] | denk: Address not mapped to object. |
| [12:10] | denk: #0 KJS::ValueImp::setGcAllowed (this=0x1) at ./kjs/value.cpp:78 |
| [12:10] | denk: 78 _flags |= VI_GCALLOWED; |
| [12:10] | denk: fbsd, clang, amd64 |
| [12:11] | denk: #1 0x0000000856b91734 in KJS::ObjectImp::putDirect (this=0x27c0bdfafbc0, |
| [12:11] | denk: propertyName=..., value=0x1, attr=14) at ./kjs/object.cpp:479 |
| [12:11] | denk: #2 0x0000000856b74e09 in KJS::StringPrototypeImp::StringPrototypeImp ( |
| [12:11] | denk: this=0x27c0bdfafbc0, objProto=<optimized out>) at ./kjs/string_object.cpp:172 |
| [12:11] | denk: 172 putDirect(lengthPropertyName, NumberImp::zero(), DontDelete|ReadOnly|DontEnum); |
| [12:11] | denk: NumberImp::zero() -> NumberImp::make(0) |
| [12:11] | micheleC: see, you have a SEGV because you are trying to access an address that is not even mapped |
| [12:12] | denk: I saw similar behaviour when I ran my netflow collector on sunos compiled by suncc |
| [12:12] | denk: it also removed some code |
| [12:13] | denk: it was the same optimization |
| [12:13] | denk: worked only without any optimization |
| [12:13] | micheleC: something goes wrong somewhere and you end up with a corrupted "this" pointer. Obviously there is a bug, but it is not a clang optimization issue |
| [12:13] | denk: there is the same here, -O0 does not produce it |
| [12:13] | micheleC: it may be a wrong cast in the code that the optimization exposes |
| [12:14] | denk: because clang leaves the if-condition |
| [12:14] | micheleC: or something else. But I am fairly confident the problem is somewhere in our code |
| [12:17] | micheleC: whenever you have a "this" pointer like above, be 100% sure it is a corrupted value caused by some bugs in the code |
| [12:17] | micheleC: and it is not surprising you end up with SEGV |
| [12:17] | denk: https://pastebin.com/SMLvYxu1 |
| [12:17] | tde-bot: Page title: clang++ -O0 .text .globl _ZN3KJS8ValueImp12setGcAllowedEv # - - Pastebin.com |
| [12:18] | denk: huh, I marked wrong, second variant was g++14 |
| [12:19] | denk: line 37, not clang++ -O2 |
| [12:19] | denk: btw, %rdi contains value (this) |
| [12:19] | denk: copy it to %rax to use it |
| [12:20] | denk: but andl with %eax (truncated pointer) |
| [12:20] | micheleC: page 0 of the VA space is usually not allocated on modern OSes, to make sure any invalid null pointer triggers a SEGV. So if "this" points to 0x1, something got corrupted. |
| [12:20] | denk: and line 48 again with %rax |
| [12:21] | micheleC: yes, maybe the cutting down from %rax to %eax may be the cause, so again it may point to a wrong cast in our code for example |
| [12:21] | micheleC: but it may as well be legit code, depending on what C++ code was being executed |
| [12:21] | denk: I played with casting there, nothing helped |
| [12:21] | micheleC: but regardless of that, you need to focus on investigating where "this" goes bad |
| [12:22] | denk: I told you, 0x1 goes from zero() |
| [12:22] | micheleC: step back a few frames till you get a valid "this" pointer, then dig one frame at a time and look at where "this" gets corrupted |
| [12:24] | denk: https://pastebin.com/EqKARsBb |
| [12:24] | tde-bot: Page title: (gdb) bt full#0 KJS::ValueImp::setGcAllowed (this=0x1) at ./kjs/value.cpp:78 - Pastebin.com |
| [12:25] | denk: NumberImp::zero() returns 0x1 as pointer |
| [12:26] | micheleC: "I told you, 0x1 goes from zero()": yes, I see the code, but I question the correctness of such code |
| [12:26] | micheleC: I have not dug into KJS internal, but a function that returns 0x1 as a pointer value smells 10000% wrong |
| [12:27] | micheleC: and you are sure to end up pointing to an invalid memory page --> SEGV |
| [12:27] | denk: https://mirror.git.trinitydesktop.org/gitea/TDE/tdelibs/src/branch/master/kjs/internal.h#L132 |
| [12:27] | tde-bot: Page title: tdelibs/internal.h at master - tdelibs - TDE Gitea Workspace |
| [12:28] | denk: https://mirror.git.trinitydesktop.org/gitea/TDE/tdelibs/src/branch/master/kjs/simple_number.h#L45 |
| [12:28] | tde-bot: Page title: tdelibs/simple_number.h at master - tdelibs - TDE Gitea Workspace |
| [12:28] | denk: static inline ValueImp *make(long i) { return (ValueImp *)((i << shift) | tag); } |
| [12:28] | micheleC: yes, as I said I see the code, but it smells badly wrong |
| [12:28] | denk: (0 << 2 ) | tag |
| [12:29] | micheleC: this has nothing to do with optimization. |
| [12:29] | denk: 0x1 in ::is() is just the tag |
| [12:29] | micheleC: we do (0 << 2) | 1 and we return this as a pointer. No wonder if you try to use it you end up with SEGV |
| [12:29] | denk: if (!SimpleNumber::is(this)) |
| [12:29] | denk: _flags |= VI_GCALLOWED; |
| [12:30] | denk: so, why -O, -O1, -O2 remove if-condition? |
| [12:30] | denk: and we see only "or"? |
| [12:31] | micheleC: probably the compiler optimization is good enough to calculate the result at compile time and do some optimization |
| [12:31] | denk: this->_flags produces segv because this = 0x1 (tag) |
| [12:31] | micheleC: but the problem is the call to "SimpleNumber::is(this)" with "this=0x1" |
| [12:32] | denk: well, anyway, I don't know how to debug clang itself in this case |
| [12:32] | micheleC: actually not. "this" in this case is not the "c++ this" pointer. I was misled |
| [12:32] | denk: probably it should have some flags to print a plan (of optimization) |
| [12:33] | denk: also, at this moment I can't try it under dilos (and sparc too), we are not ready for it |
| [12:33] | micheleC: let me look a bit deeper |
| [12:35] | micheleC: where is "zero()" called from? |
| [12:36] | denk: https://pastebin.com/URq4jmgH more problems in the future :) |
| [12:37] | micheleC: https://mirror.git.trinitydesktop.org/gitea/TDE/tdelibs/src/branch/master/kjs/value.cpp#L77 |
| [12:37] | micheleC: from here? |
| [12:37] | tde-bot: Page title: tdelibs/value.cpp at master - tdelibs - TDE Gitea Workspace |
| [12:37] | denk: https://pastebin.com/RCatkA8X |
| [12:37] | tde-bot: Page title: asus% export TDE_DEBUG=1 asus% gdb konqueror GNU gdb (GDB) - Pastebin.com |
| [12:39] | denk: (gdb) disassemble KJS::ValueImp::setGcAllowed |
| [12:39] | denk: Dump of assembler code for function _ZN3KJS8ValueImp12setGcAllowedEv: |
| [12:39] | denk: 0x0000000804bd8f80 <+0>: push %rbp |
| [12:39] | denk: 0x0000000804bd8f81 <+1>: mov %rsp,%rbp |
| [12:39] | denk: => 0x0000000804bd8f84 <+4>: orb $0x2,0xa(%rdi) |
| [12:39] | denk: 0x0000000804bd8f88 <+8>: pop %rbp |
| [12:39] | denk: 0x0000000804bd8f89 <+9>: ret |
| [12:39] | denk: End of assembler dump. |
| [12:39] | denk: (gdb) info reg rdi |
| [12:39] | denk: rdi 0x1 1 |
| [12:39] | denk: boom :) |
| [12:39] | micheleC: " value->setGcAllowed();": here is the bug |
| [12:40] | micheleC: super clear! |
| [12:40] | denk: don't smile on me |
| [12:40] | denk: I found it in two days! |
| [12:40] | denk: two days for a small bug... |
| [12:41] | denk: but I still don't know why |
| [12:41] | micheleC: "value" is 0x1. It's a number used as a pointer, produced by "zero()". The code tries to use it as a real pointer when calling "setGcAllowed()". So in the call, "this" gets the value 0x1, resulting in SEGV |
| [12:41] | micheleC: it's wrong logic in the c++ code |
| [12:42] | denk: I saw how java and javascript (nodejs) use pointers |
| [12:42] | micheleC: probably that line " value->setGcAllowed();" should be removed |
| [12:42] | denk: some unused bits they used for object attributes |
| [12:43] | micheleC: https://mirror.git.trinitydesktop.org/gitea/TDE/tdelibs/src/branch/master/kjs/object.cpp#L474 |
| [12:43] | denk: but before any real use of the pointers they must recover the real pointers |
| [12:43] | tde-bot: Page title: tdelibs/object.cpp at master - tdelibs - TDE Gitea Workspace |
| [12:43] | denk: thanks cap! I know that it is a start point of the issue |
| [12:43] | micheleC: using bits for attributes is perfectly fine. the problem is when they try to use those attributes as real pointer :-) |
| [12:44] | micheleC: try removing that line and you will not see that crash anymore |
| [12:44] | micheleC: then let me know and I can create a proper PR for it |
| [12:47] | micheleC: but I see other places where similar crashes could happen. For example line 437 |
| [12:47] | denk: Program terminated with signal SIGSEGV, Segmentation fault. |
| [12:47] | denk: Address not mapped to object. |
| [12:47] | denk: #0 KJS::ValueImp::setGcAllowed (this=0x1) at ./kjs/value.cpp:78 |
| [12:47] | denk: 78 _flags |= VI_GCALLOWED; |
| [12:47] | denk: wrong! |
| [12:47] | denk: #1 0x0000000856802eb5 in KJS::ObjectImp::setInternalValue (this=0x3954957afd40, v=0x1) |
| [12:47] | denk: at ./kjs/object.cpp:437 |
| [12:47] | denk: 437 v->setGcAllowed(); |
| [12:48] | micheleC: tak! see, crashed exactly on that line I just mentioned. Some logic error in the code |
| [12:48] | micheleC: remove also that line and try again. I don't know what the original developer was trying to do, but those 2 lines are definitely wrong |
| [12:49] | denk: I already did it, does not work |
| [12:49] | denk: Program terminated with signal SIGSEGV, Segmentation fault. |
| [12:49] | denk: Address not mapped to object. |
| [12:49] | denk: #0 KJS::ValueImp::ref (this=<optimized out>) at ./kjs/value.h:86 |
| [12:49] | denk: 86 ValueImp* ref() { if (!SimpleNumber::is(this)) refcount++; return this; } |
| [12:50] | denk: 0x00000008538cd6de <+30>: lea 0x1(,%rdx,4),%rax |
| [12:50] | denk: => 0x00000008538cd6e6 <+38>: mov 0x9(,%rdx,4),%ecx |
| [12:50] | denk: 0x00000008538cd6ed <+45>: movzwl 0xb(,%rdx,4),%edx |
| [12:50] | denk: (gdb) info reg rdx |
| [12:50] | denk: rdx 0x1 1 |
| [12:50] | denk: rdx has 0x1 (this) |
| [12:52] | denk: micheleC: it's time to sleep, go to the bed |
| [12:53] | micheleC: we would probably need to look at what the devs were trying to do to have a proper fix |
| [12:55] | micheleC: because it is clear they are doing something quite wrong with ValueImp |
| [13:04] | denk: so, I found some solution, but it works partially |
| [13:05] | denk: !SimpleNumber::is(this) was replaced with SimpleNumber::is(this) == 0 |
| [13:06] | denk: in the end I still have a segv |
| [13:06] | denk: Program terminated with signal SIGSEGV, Segmentation fault. |
| [13:06] | denk: Address not mapped to object. |
| [13:06] | denk: #0 KJS::ValueImp::dispatchToBoolean (this=0x1, exec=0x82052dcd0) at ./kjs/value.cpp:187 |
| [13:06] | denk: 187 return toBoolean(exec); |
| [13:12] | micheleC: there are some major logic flaws in that part of kjs. so a proper fix will require understanding what the devs wanted to do first, then coding it correctly without messing around with wrong pointers |
| [13:22] | denk: btw, clang if the default compiler in my system |
| [13:22] | denk: asus% c++ -v |
| [13:22] | denk: FreeBSD clang version 19.1.7 (https://github.com/llvm/llvm-project.git llvmorg-19.1.7-0-gcd708029e0b2) |
| [13:22] | denk: Target: x86_64-unknown-freebsd15.0 |
| [13:23] | denk: s/if/is/ |
| [13:23] | micheleC: (y) but the problem is not clang |
| [13:24] | micheleC: it's wrong logic in kjs :-) |
| [13:24] | denk: I know it :) |
| [13:24] | denk: we are just looking for a way to fix it "right now" |
| [13:29] | micheleC: well, as long as it is not the wrong fix. |
| [13:33] | denk: also, since I rebuilt everything directly from the git repo I don't see crashes of kmail |
| [13:41] | micheleC: (y) |
| [14:23] | micheleC has quit (Quit: Kopete 0.12.7 : http://trinitydesktop.org) |
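
Note on the crash discussed above. `SimpleNumber::make` packs a small integer into the pointer bits, so `NumberImp::zero()` yields 0x1. Calling a member function such as `setGcAllowed()` through that value is undefined behaviour in C++, which would plausibly explain why clang at -O1 and above drops the `SimpleNumber::is(this)` test while -O0 keeps it; the log does not confirm that explanation. What follows is a minimal sketch, not the KJS code and not a proposed patch: it keeps the tagged word as an integer and checks the tag before it is ever turned into a pointer. `Value`, `Handle`, `makeImmediate`, `fromObject`, `isImmediate` and the `k...` constants are hypothetical names; only the tag (1), mask (3), shift (2) and flag bit (0x2) are taken from the snippets quoted in the chat.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for KJS::ValueImp. alignas(4) keeps the two low
// address bits of every real object at zero, which the tagging relies on.
struct alignas(4) Value {
    unsigned char flags = 0;
};

// A Handle is either the address of a Value or a tagged small integer.
using Handle = std::uintptr_t;

constexpr Handle kTag = 1;                 // low bits 01 mark an immediate
constexpr Handle kMask = 3;                // two low bits carry the tag
constexpr int kShift = 2;
constexpr unsigned char kGcAllowed = 0x2;  // assumed from "orb $0x2" in the dump

// Same arithmetic as the quoted make(): (i << shift) | tag.
inline Handle makeImmediate(long i) {
    return (static_cast<Handle>(i) << kShift) | kTag;
}

// Wrap the address of a real, aligned object.
inline Handle fromObject(Value *v) {
    return reinterpret_cast<Handle>(v);
}

inline bool isImmediate(Handle h) {
    return (h & kMask) == kTag;
}

// Free function instead of a member: the tag test runs on a plain integer,
// so the compiler has no "this is a valid object" assumption to exploit.
inline void setGcAllowed(Handle h) {
    if (isImmediate(h))
        return;  // an immediate number has no flags to set
    reinterpret_cast<Value *>(h)->flags |= kGcAllowed;
}
```

With this shape, `setGcAllowed(makeImmediate(0))` is a no-op and `setGcAllowed(fromObject(&obj))` sets the flag on a real object. Whether KJS should move its checks out of the member functions in this way is a design decision for the maintainers.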