[oi-dev] [OpenIndiana-discuss] Sun/Oracle China's DRM/KMS Sol11.2 port backported to function on old-style gfxp_private from pre-2010 era but still immediately PANICS

Мартин Бохниг opensxce at mail.ru
Thu Dec 10 17:30:59 UTC 2015


 Hello,


Here it is at last, though I have no time to instruct newcomers.
I am only posting this _early_ to prove that I'm not talking hot air / vaporware.

At the moment, when building with gcc, I get an assertion error and a resulting panic _also_ on Sol11.1++ (unlike 2 weeks ago, when I was mostly building with the redistributable osol Studio12.1).

However, I attach the old module bins from two weeks ago.
I said I won't instruct end-users. This means I won't mention that you first need a working agpgart, nor (beyond this short note) that on Illumos kernels it is necessary to manually correct a symlink in /dev/dri.


Even after the user finally has a working agpgart, Illumos' DDI/DKI for some reason creates a wrong symlink during initialization.


Example:

GOOD:

ls -al /dev/dri
total 11
drwxr-xr-x   5 root root   5 Nov 24 10:41 .
drwxr-xr-x 282 root root 282 Dec 10 17:32 ..
lrwxrwxrwx   1 root root  36 Nov 24 09:55 card0 -> ../../devices/pci@0,0/display@2:drm0
lrwxrwxrwx   1 root root  42 Nov 24 10:02 controlD64 -> ../../devices/pci@0,0/display@2:controlD64
root@opensxce:~#



BAD (as on Illumos or any older Solaris kernel before S11 snv_175):

ls -al /dev/dri
total 11
drwxr-xr-x   5 root root   5 Nov 24 10:41 .
drwxr-xr-x 282 root root 282 Dec 10 17:32 ..
lrwxrwxrwx   1 root root  42 Nov 24 10:41 card0 -> ../../devices/pci@0,0/display@2:controlD64
lrwxrwxrwx   1 root root  36 Nov 24 09:55 card1 -> ../../devices/pci@0,0/display@2:drm0
root@opensxce:~#

(This can be worked around by renaming /dev/dri/card0 to /dev/dri/controlD64 in the boot scripts on every boot, until the real solution is found in Illumos' DDI/DKI bindings. That has low priority until the rest works.)
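
A minimal sketch of how that rename could look in a boot script. The function names and the directory argument are illustrative. The post only mentions renaming card0, so the second step (moving card1 into the freed card0 slot to match the GOOD listing) is an assumption and can be dropped.

```shell
#!/bin/sh
# Sketch of a boot-time workaround for the swapped /dev/dri links.
# Function names and the optional directory argument are illustrative.

# Print the target of symlink $1 (ls -l is used because readlink
# may not be available on older Solaris releases).
link_target() {
    ls -l "$1" | sed 's/.* -> //'
}

fix_dri_links() {
    dir=${1:-/dev/dri}

    # Step from the post: card0 wrongly points at the control node.
    if [ -h "$dir/card0" ] && [ ! -h "$dir/controlD64" ] && [ ! -e "$dir/controlD64" ]; then
        case "$(link_target "$dir/card0")" in
        *:controlD64) mv "$dir/card0" "$dir/controlD64" || return 1 ;;
        esac
    fi

    # Assumption (not stated in the post): move card1 into the freed
    # card0 slot so the result matches the GOOD listing.
    if [ -h "$dir/card1" ] && [ ! -h "$dir/card0" ] && [ ! -e "$dir/card0" ]; then
        case "$(link_target "$dir/card1")" in
        *:drm0) mv "$dir/card1" "$dir/card0" || return 1 ;;
        esac
    fi
    return 0
}
```

From a boot script this would be called as `fix_dri_links /dev/dri` before X11 starts. On a system that already has the GOOD layout, neither branch matches and nothing is touched.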


On old CPUs such as my 2700k (finally no longer only a Celeron G530, thanks to your donations!), getting a working agpgart only requires you to wget, uncompress and install  http://svr4.opensxce.org/201405/i386/5.11/sunw_agp-OpenSXCE__Illumos20140505%2cREV%3d2014.05.07.03.16-SunOS5.11-i386-SUNW.pkg.gz   (or to use pkgutil if you are on OpenSXCE2014.05).

Until the real problems are solved, there is no need to upload an updated version with newer pciids.
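
For reference, the wget / uncompress / install steps could be sketched roughly as follows. The local file name, the function name and the RUN hook are illustrative and not from the post. The sketch assumes the download is a gzip-compressed package datastream that pkgadd can read directly.

```shell
#!/bin/sh
# Sketch: fetch, unpack and install the sunw_agp package named above.
# RUN is an illustrative hook: set RUN=echo to print the commands
# instead of executing them.
PKG_URL='http://svr4.opensxce.org/201405/i386/5.11/sunw_agp-OpenSXCE__Illumos20140505%2cREV%3d2014.05.07.03.16-SunOS5.11-i386-SUNW.pkg.gz'

install_agp() {
    run=${RUN:-}
    # -O picks a simple local name so the %-escapes in the URL do not matter.
    $run wget -O sunw_agp.pkg.gz "$PKG_URL" || return 1
    $run gunzip sunw_agp.pkg.gz || return 1
    # pkgadd -d reads the package datastream file; run this step as root.
    $run pkgadd -d ./sunw_agp.pkg
}
```

Running it once with RUN=echo shows the three commands before anything is changed on the system.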



Now, here is my proposed diff in 2 versions, plus the binary i915 and drm modules from 2 weeks ago.
Again: I won't provide howto instructions, because the audience of this post is very small, and those who can use anything from this letter won't need such instructions.


The only real question was and is: why does this work (bypassing all gfxp 11.1++ interfaces) only on 11.1++, while it panics the host early during X11 initialization on 11.0 and Illumos?

Meanwhile I have the suspicion (judging from binutils output) that in the more modern gfxp driver the frame buffer gets mmap'ed.
However, unlike on Illumos or old OpenSolaris, in S11.0 this code seems to be already inside gfxp:

$ nm gfx_private|grep setup
BFD: gfx_private: warning: sh_link not set for section `.eh_frame'
                 U ddi_regs_map_setup
                 U devmap_devmem_setup
                 U devmap_umem_setup
00000000000006bc T gfxp_ddi_segmap_setup
0000000000000a00 T gfxp_devmap_umem_setup
00000000000026f4 t gfxp_setup_fbcons
                 U pci_config_setup


BUT DESPITE THIS, it panics on 11.0 due to an attempted dereference of a NULL pointer, which triggers a TRAP.


Normally I would never publish such unstable, non-functional stuff.
PS: On 11.1++ this stuff works if built with Studio.
But forget about getting back to the text console once X11 has been up. It is possible to enter commands afterwards, but only blindly, without seeing anything (the console is completely scrapped).
However, restarting X11 or rebooting will function (on S11.1++). You just won't see what you type.

In this case it happens not because of incomplete cleanup, but because structs are not properly initialized and not assigned to the correct objects (especially fb_info).

Nevertheless, X11 and compiz (even with gnome) do work fine on S11.1++, yet not on 11.0 or Illumos.
The return2console issue can be addressed after we finally have it working on 11.0 and Illumos.


As for "gfxp is not a moving target": nm suggests otherwise.
And swapping in newer or older versions of gfxp (as is possible with i915/drm and agpgart [all related submodules]) CANNOT be done, neither upwards nor downwards: symbols go missing (gfxp depends directly on /platform/i86pc/kernel/amd64/unix itself).
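
One way to check the "moving target" claim is to diff the symbols that two builds of the module define. The sketch below assumes GNU nm output in the three-column form shown earlier (address, type, name). The function names and the module paths in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: list the symbols a module defines, from GNU nm output on stdin.
# Lines for undefined symbols ("U name") have only two fields and are skipped.
defined_syms() {
    awk 'NF == 3 && $2 != "U" { print $3 }' | LC_ALL=C sort -u
}

# Compare two builds of a module.
# Column 1: defined only in the first, column 2: defined only in the second.
compare_modules() {
    old_list=${TMPDIR:-/tmp}/syms_old.$$
    new_list=${TMPDIR:-/tmp}/syms_new.$$
    nm "$1" 2>/dev/null | defined_syms > "$old_list"
    nm "$2" 2>/dev/null | defined_syms > "$new_list"
    LC_ALL=C comm -3 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

For example, `compare_modules s11.0/gfx_private s11.1/gfx_private` (hypothetical copies of the two modules) prints only the symbols that differ between the builds. Empty output would mean both define the same set.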

Even replacing Grub1 with Grub2 or vice versa (such as Solaris 11.1 booted from Grub1) makes no difference.


As for the panic backtrace that I usually got 2 weeks ago: today, with gcc, I get a different one, and it never works, no matter whether on 11.1++ or earlier. I'm a bit confused whether that's really a gcc vs. Studio issue or related to some other change made while creating today's diff. Below is how the earlier one looks; I will upload the kernel dumps to opensxce.org/kms_dump in 10 minutes.

Here you see how it appeared in /var/adm/messages:


Nov 23 05:08:37 opensxce pcplusmp: [ID 805372 kern.info] pcplusmp: pci8086,122 (i915) instance 0 irq 0x10 vector 0x83 ioapic 0x2 intin 0x10 is bound to cpu 2
Nov 23 05:08:37 opensxce drm: [ID 120748 kern.warning] WARNING: [drm:drm_gem_object_alloc_internal_normal:153] ddi_dma_mem_alloc failed
Nov 23 05:08:37 opensxce last message repeated 1 time
Nov 23 05:08:37 opensxce drm: [ID 350376 kern.warning] WARNING: [drm:i915_gem_alloc_object:3315] failed to init gem object
Nov 23 05:08:37 opensxce drm: [ID 278599 kern.warning] WARNING: [drm:init_status_page:1233] Failed to allocate status page
Nov 23 05:08:37 opensxce drm: [ID 651038 kern.warning] WARNING: [drm:i915_driver_firstopen:1647] failed to init modeset
Nov 23 05:09:51 opensxce reboot: [ID 330035 auth.crit] initiated by martin on /dev/console
Nov 23 05:09:57 opensxce rpcbind: [ID 240694 daemon.error] rpcbind terminating on signal 15.
Nov 23 05:10:30 opensxce genunix: [ID 672855 kern.notice] syncing file systems...
Nov 23 05:10:31 opensxce genunix: [ID 904073 kern.notice]  done
Nov 23 05:10:32 opensxce unix: [ID 836849 kern.notice]
Nov 23 05:10:32 opensxce ^Mpanic[cpu6]/thread=ffffff04e79790a0:
Nov 23 05:10:32 opensxce genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ffffff001f610ab0 addr=20 occurred in module "i915" due to a NULL pointer dereference
Nov 23 05:10:32 opensxce unix: [ID 100000 kern.notice]
Nov 23 05:10:32 opensxce unix: [ID 839527 kern.notice] reboot:
Nov 23 05:10:32 opensxce unix: [ID 753105 kern.notice] #pf Page fault
Nov 23 05:10:32 opensxce unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x20
Nov 23 05:10:32 opensxce unix: [ID 243837 kern.notice] pid=1994, pc=0xfffffffff7cbe7ee, sp=0xffffff001f610ba0, eflags=0x10246
Nov 23 05:10:32 opensxce unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 406f8<osxsav,xmme,fxsr,pge,mce,pae,pse,de>
Nov 23 05:10:32 opensxce unix: [ID 624947 kern.notice] cr2: 20
Nov 23 05:10:32 opensxce unix: [ID 625075 kern.notice] cr3: 4400000
Nov 23 05:10:32 opensxce unix: [ID 625715 kern.notice] cr8: c
Nov 23 05:10:32 opensxce unix: [ID 100000 kern.notice]
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]  rdi: ffffff04df197038 rsi:          24a7c85 rdx: ffffff04e79790a0
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]  rcx:                6  r8:                0  r9:               32
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]  rax:                0 rbx: ffffff04e12fe2b0 rbp: ffffff001f610bc0
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]  r10: ffffff04df197528 r11:                1 r12: ffffff001f610d6c
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]  r13: fffffffffbd4fb88 r14:                1 r15: ffffff04fdedfa40
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]  fsb: fffffd7fff172a40 gsb: ffffff04e604e580  ds:               4b
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]   es:               4b  fs:                0  gs:                0
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]  trp:                e err:                0 rip: fffffffff7cbe7ee
Nov 23 05:10:32 opensxce unix: [ID 592667 kern.notice]   cs:               30 rfl:            10246 rsp: ffffff001f610ba0
Nov 23 05:10:32 opensxce unix: [ID 266532 kern.notice]   ss:               38
Nov 23 05:10:32 opensxce unix: [ID 100000 kern.notice]
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610990 unix:die+df ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610aa0 unix:trap+dc0 ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610ab0 unix:cmntrap+e6 ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610bc0 i915:i915_gem_context_fini+4e ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610c50 i915:i915_quiesce+292 ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610c80 genunix:devi_quiesce+3b ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610cc0 genunix:quiesce_one_device+81 ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610cf0 genunix:quiesce_devices+46 ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610d20 genunix:quiesce_devices+3b ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610d50 genunix:quiesce_devices+3b ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610db0 unix:mdboot+165 ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610e40 genunix:kadmin+416 ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610ec0 genunix:uadmin+16d ()
Nov 23 05:10:32 opensxce genunix: [ID 655072 kern.notice] ffffff001f610f10 unix:brand_sys_sysenter+1c9 ()
Nov 23 05:10:32 opensxce unix: [ID 100000 kern.notice]
Nov 23 05:10:32 opensxce genunix: [ID 672855 kern.notice] syncing file systems...
Nov 23 05:10:32 opensxce genunix: [ID 904073 kern.notice]  done
Nov 23 05:10:33 opensxce genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool1/dump, offset 65536, content: kernel
Nov 23 05:10:47 opensxce genunix: [ID 100000 kern.notice]
Nov 23 05:10:47 opensxce genunix: [ID 665016 kern.notice] ^M100% done: 285228 pages dumped,
Nov 23 05:10:47 opensxce genunix: [ID 851671 kern.notice] dump succeeded
Nov 23 05:11:33 opensxce genunix: [ID 540533 kern.notice] ^MSunOS Release 5.11 Version master-0-ga443cc8 64-bit
Nov 23 05:11:33 opensxce genunix: [ID 877030 kern.notice] Copyright (c) 1983, 2010, Oracle and/or its affiliates. All rights reserved.
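
In the log above, the faulting frame is i915:i915_gem_context_fini+4e, reached via i915:i915_quiesce during the reboot. To pull only the stack frames out of such a messages file, something like the following sketch could be used. It assumes the frame lines carry the "[ID 655072 kern.notice]" tag, as they do in this log. The function name is illustrative.

```shell
#!/bin/sh
# Sketch: print only the stack frames (module:function+offset) from
# syslog-style panic output read on stdin.
# Assumption: frame lines are tagged "[ID 655072 kern.notice]" as in the
# log above; the second-to-last field is then the frame itself.
panic_frames() {
    awk '/\[ID 655072 kern\.notice\]/ { print $(NF - 1) }'
}
```

Typical use would be `panic_frames < /var/adm/messages`, which lists the frames in the order they were logged.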




Regards,
and hopefully we can get this Saga solved  ;)


%martin
(the bins are too large and got blocked by oi-dev, so I will upload them in 10 minutes to opensxce.org/kms/unstable_bins/20151210thu)
<<the src is attached>>










-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/octet-stream
Size: 21881 bytes
Desc: not available
URL: <http://openindiana.org/pipermail/oi-dev/attachments/20151210/257f2862/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/octet-stream
Size: 21898 bytes
Desc: not available
URL: <http://openindiana.org/pipermail/oi-dev/attachments/20151210/257f2862/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/octet-stream
Size: 1565 bytes
Desc: not available
URL: <http://openindiana.org/pipermail/oi-dev/attachments/20151210/257f2862/attachment-0005.obj>

