Take your pick: standard or custom displaylink X drivers both work now

[update May 3, 2012: Support for using fb_defio and the standard, generic X fbdev driver is enabled by default, working, and stable with kernel 3.3 (at least). You’ll get lower CPU consumption and latency with DAMAGE notifications, but on many systems (especially higher end), fbdev is great. And because not everything supports DAMAGE, generic fbdev is more reliable.]

[update April 6, 2010: support for fbdev had been merged into the main udlfb codeline, including for 2.6.34, but has since been removed because of kernel faults that stand unsolved. If/when these problems can be found and fixed, fbdev support can get back into the mainline. Until then, the branch mentioned below is an ok way to try and test]

You can now get a version of udlfb with improved performance and support for any fbdev client.

Of the three DisplayLink Linux framebuffer driver lines, udlfb and displaylink-mod (written by Roberto De Ioris) have had the best performance by a significant margin. They’ve relied on Roberto’s custom X driver, with some custom IOCTLs, to make use of precise X damage information. All the directions on the http://displaylink.org/ wiki have pointed to these drivers so far.

Unfortunately, these drivers won’t work with standard framebuffer clients that use a mmap’d framebuffer, because they never refresh any area of the screen without a damage notification. So drivers like the existing xf86-video-fbdev won’t work.

By contrast, Jaya Kumar’s defio-based DisplayLink codeline does work with xf86-video-fbdev or any standard fbdev client, but hasn’t been competitive performance-wise.

So the goal has been to merge the best aspects of both codelines and get the result into the kernel. That work isn’t completely done, but it’s at a working milestone. We now have a single kernel framebuffer driver that can support either Roberto’s custom X driver (“displaylink”) or the standard fbdev X driver (“fbdev”), just by switching the “Driver” line in the “Device” section of xorg.conf. This makes performance testing easier, and helps when working around X-server-specific problems.
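
For illustration, the switch might look like the fragment below. This is a sketch, not a tested configuration: the Identifier is a made-up name, /dev/fb0 is assumed from the fb0 used in the benchmark runs in this post, and the “fbdev” option is the one documented for the standard fbdev driver (whether the custom displaylink driver accepts the same option is an assumption).

```
Section "Device"
    Identifier "DisplayLinkUSB"        # hypothetical name
    Driver     "fbdev"                 # change to "displaylink" for the custom X driver
    Option     "fbdev" "/dev/fb0"      # assumed framebuffer device node
EndSection
```

Only the Driver line needs to change between test runs; the kernel driver stays the same.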

Where the fbdev driver was previously much slower than the custom displaylink driver (90% slower on some tests), it’s now within a few percentage points, often not enough to notice.
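
As a rough check on that claim, the two GtkPerf totals from the benchmark runs reported in this post (18.37 with the custom driver, 18.73 with fbdev) can be compared directly. A minimal sketch; the function name is made up for illustration.

```shell
#!/bin/sh
# Relative slowdown of one GtkPerf total versus a baseline, in percent.
# Arguments: baseline total, comparison total.
pct_slower() {
    LC_ALL=C awk -v base="$1" -v other="$2" \
        'BEGIN { printf "%.2f\n", (other - base) * 100 / base }'
}

# Totals taken from the two runs in this post:
# custom displaylink driver (baseline) vs. standard fbdev driver.
pct_slower 18.37 18.73
```

For these two totals the difference comes out just under 2%.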

There’s still lots of room to optimize, but this opens the possibility of not needing a custom X driver at all for DisplayLink hardware, which would simplify the Linux distribution rollout strategy.

You can grab this code at:

git clone http://git.plugable.com/webdav/udlfb
cd udlfb
git checkout origin/defio

Then “make; sudo make install; sudo depmod -a” as usual. If you’re switching from displaylink-mod, remove it from the kernel modules directory first, or both it and udlfb may try to load.
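
Building and loading this branch also depends on a few kernel options, as the comments below work out on a custom Gentoo kernel: CONFIG_FB_DEFERRED_IO for defio, plus the options that provide the sys_fillrect, sys_copyarea and sys_imageblit helpers. A minimal sketch of a pre-build check follows. The function name is made up, the CONFIG_FB_SYS_* names are assumed to be the options behind those three symbols, and the config file location varies by distribution.

```shell
#!/bin/sh
# Report whether each kernel option the defio path relies on is enabled
# (built in "=y" or modular "=m") in a given kernel config file.
# Usage: check_udlfb_kconfig /path/to/config
check_udlfb_kconfig() {
    config=$1
    missing=0
    for opt in CONFIG_FB_DEFERRED_IO CONFIG_FB_SYS_FILLRECT \
               CONFIG_FB_SYS_COPYAREA CONFIG_FB_SYS_IMAGEBLIT; do
        if grep -Eq "^${opt}=(y|m)$" "$config"; then
            echo "ok       $opt"
        else
            echo "MISSING  $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed location: many distributions install the config as /boot/config-<release>.
# On a source-based setup it is more likely /usr/src/linux/.config.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    check_udlfb_kconfig "$cfg" || echo "Some options are missing; a kernel rebuild may be needed."
fi
```

If anything is reported missing, the older udlfb drivers without defio are the fallback until the kernel is rebuilt.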

Please post experience reports here or on the libdlo list. As patches have been developed for udlfb, there’s not been enough validation from the user community that the patches work and are valuable — and that would help the Linux kernel maintainers make their decisions about whether to accept patches.

Perf data is imperfect, but here’s a benchmark run of this code with the custom displaylink X driver (making use of damage information):

bernie@bernie-aspireone:~/git/misc-udlfb$ ./udlfb-perf.sh fb0 gtkperf -a
GtkPerf 0.40 - Starting testing: Mon Dec 21 14:24:13 2009

GtkEntry - time:  0.00
GtkComboBox - time:  3.00
GtkComboBoxEntry - time:  1.89
GtkSpinButton - time:  0.42
GtkProgressBar - time:  0.60
GtkToggleButton - time:  0.44
GtkCheckButton - time:  0.42
GtkRadioButton - time:  0.75
GtkTextView - Add text - time:  2.09
GtkTextView - Scroll - time:  0.83
GtkDrawingArea - Lines - time:  1.66
GtkDrawingArea - Circles - time:  3.09
GtkDrawingArea - Text - time:  2.89
GtkDrawingArea - Pixbufs - time:  0.27
 ---
Total time: 18.37

Quitting..

model name      : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
model name      : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
cpu MHz         : 1600.000
cpu MHz         : 1333.000
MemTotal:        2052144 kB
Framebuffer Mode: 1920,1080

Rendered bytes:  155896744 (total pixels * Bpp)
Identical bytes: 96231480 (skipped via shadow buffer check)
sent bytes:      29614624 (compressed usb data, including overhead)
K CPU cycles:    1251295 (transpired, may include context switches)

% pixels found to be unchanged: 61.00 %
Compression of changed pixels : 50.00 %
Total CPU cycles spent per input pixel: 8
Total CPU cycles spent per output pixel: 42
USB Mbps: 11.29 (theoretical USB 2.0 peak 480)

And here’s a benchmark run of the same code, just switched to the standard fbdev X driver (making use of only page faults):

bernie@bernie-aspireone:~/git/misc-udlfb$ ./udlfb-perf.sh fb0 gtkperf -a
GtkPerf 0.40 - Starting testing: Mon Dec 21 12:45:20 2009

GtkEntry - time:  0.00
GtkComboBox - time:  3.16
GtkComboBoxEntry - time:  1.80
GtkSpinButton - time:  0.41
GtkProgressBar - time:  0.60
GtkToggleButton - time:  0.44
GtkCheckButton - time:  0.41
GtkRadioButton - time:  0.76
GtkTextView - Add text - time:  2.03
GtkTextView - Scroll - time:  0.82
GtkDrawingArea - Lines - time:  1.87
GtkDrawingArea - Circles - time:  3.36
GtkDrawingArea - Text - time:  2.84
GtkDrawingArea - Pixbufs - time:  0.22
 ---
Total time: 18.73

Quitting..

model name      : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
model name      : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
cpu MHz         : 1333.000
cpu MHz         : 1600.000
MemTotal:        2052144 kB
Framebuffer Mode: 1920,1080

Rendered bytes:  288165888 (total pixels * Bpp)
Identical bytes: 227263860 (skipped via shadow buffer check)
sent bytes:      39281496 (compressed usb data, including overhead)
K CPU cycles:    1685041 (transpired, may include context switches)

% pixels found to be unchanged: 78.00 %
Compression of changed pixels : 35.00 %
Total CPU cycles spent per input pixel: 5
Total CPU cycles spent per output pixel: 42
USB Mbps: 14.98 (theoretical USB 2.0 peak 480)

The “Compression of changed pixels” figure is lower on fbdev because unchanged-pixel detection is less accurate with the page fault method (for now, but that will get fixed), so a lot of desktop pixels get re-rendered, and my desktop background is a complex, gradient-heavy image that doesn’t compress well.
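
The derived figures in both reports can be recomputed from the four raw counters. The sketch below is inferred from the numbers in this post, not taken from the udlfb-perf.sh source: it assumes truncating integer division, and it divides the cycle count by the byte counters, since that is what reproduces the reported “per pixel” values. The function name is made up for illustration.

```shell
#!/bin/sh
# Recompute the derived figures from the raw counters in the reports above.
# Arguments: rendered bytes, identical bytes, sent bytes, K CPU cycles.
derive_metrics() {
    rendered=$1 identical=$2 sent=$3 kcycles=$4
    # Bytes that survived the shadow buffer check and had to be encoded.
    changed=$(( rendered - identical ))
    echo "unchanged_pct=$(( identical * 100 / rendered ))"
    echo "compression_pct=$(( (changed - sent) * 100 / changed ))"
    echo "cycles_per_input=$(( kcycles * 1000 / rendered ))"
    echo "cycles_per_output=$(( kcycles * 1000 / sent ))"
}

echo "damage path (custom displaylink X driver):"
derive_metrics 155896744 96231480 29614624 1251295
echo "defio path (standard fbdev X driver):"
derive_metrics 288165888 227263860 39281496 1685041
```

Both runs come out at the reported values (61, 50, 8, 42 and 78, 35, 5, 42). The defio run renders nearly twice as many bytes and discards a larger share of them as identical, which fits the explanation that page-fault tracking marks more of the screen as possibly changed.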

The main performance gains vs. the original defio implementation are:

  • Added RLE compression on the defio path
  • Added shadow/backbuffer to scan for unchanged pixels on the defio path
  • Moved to asynchronous urb dispatch for defio and damage codepaths (4 pre-alloc’d 64K urbs). This also has a significant performance benefit for the custom/damage path
  • Changed defio path to no longer send an urb per line, but rather fill every urb completely, across lines and dirty pages. Big gain for defio.

13 comments on “Take your pick: standard or custom displaylink X drivers both work now”

  1. B Douglas Hilton

    I can’t get this code to compile on my x86_64 system. Here is an error log:

    http://dpaste.com/hold/138688/

    Since upgrading to Xorg 1.6.5, I can’t get the xf-driver-displaylink to work anymore with xf86-driver-radeon or ati’s fglrx. Attempting to load displaylink on Screen 1 causes Xorg to segfault and abort. I can load displaylink in conjunction with the vesa driver still in 1.6.5, however this causes extreme graphical corruption.

    So my device is presently unusable with my new system; however, I do have an older system which runs Xorg 1.5.x and kernel 2.6.32 and displaylink and radeon do play nice together there.

    I’m looking at trying to fix the file, but my c programming skills might be a bit light considering the gravity of the errors that I’m seeing here.

  2. bernie

    Thanks for this report! I’ll get on a 64-bit machine and get these fixed in the defio branch, reporting back here.

  3. bernie

    Hi Douglas – This 64-bit issue should now be fixed. Just grab the latest from the “defio” branch (git pull; git checkout origin/defio). It’s the “Enhance tracing” checkin, 9c089…

  4. B Douglas Hilton

    I’m still not able to build this against kernel 2.6.32

    http://dpaste.com/hold/139085/

    When I run “git pull” within the udlfb directory I get this:

    You are not currently on a branch, so I cannot use any
    ‘branch.<branchname>.merge’ in your configuration file.
    Please specify which branch you want to merge on the command
    line and try again (e.g. ‘git pull <repository> <refspec>’).
    See git-pull(1) for details.

    Sorry, but I’m not very used to git at all, perhaps I’ve done something wrong which is causing the problems? I did sign up for the mailing list.

    At the moment I can use fbgetty with the displaylink device; however, that annoyingly suspends X when I switch over to that display. I’d be pretty happy with just a console on it if I could keep it running while I use X, for example to monitor a compile on the mini-monitor or have syslog display messages on it. The fact that it’s “either-or” really limits its usefulness.

  5. B Douglas Hilton

    Ok, well I’m still getting compile errors. What kernel is your code based on? I tried 2.6.31 and 2.6.32 and both seem to have issues with “struct fb_deferred_io *fbdefio = info->fbdefio;”

    From what I can see, and I could be wrong, info->fbdefio does not exist at the time that this assignment is made because the framebuffer apparently hasn’t yet been initialized into deferred io mode. Thus this element is not present and the compile aborts with an error.

    Here’s the output from make:

    http://dpaste.com/hold/139365/

    I’ll keep sending bug reports, no big deal here. I’m trying to understand this code and see if I can fix it but kernel driver hacking is a bit over my head at the moment. Sorry I can’t provide patches yet.

  6. bernie

    Ah, your kernel was compiled without CONFIG_FB_DEFERRED_IO enabled — so no defio drivers will work (or compile) with it.
    You’ll need to re-compile your kernel with that enabled to test defio support. If you can’t recompile the kernel, you’re better off using the existing (older) udlfb drivers, because the main new feature of this codeline is use of defio (when present in the kernel). By the way, which distro is this that has defio off by default (Ubuntu appears to have it on by default)? Thanks!

  7. B Douglas Hilton

    I’m using Gentoo, this is a custom made kernel. I don’t normally enable things that I don’t understand or recognize. I’ll switch it on and give it a shot!

  8. B Douglas Hilton

    Ok, after some hackery I got this to finally build! It seems as if the CONFIG_FB_DEFERRED_IO is not default with my kernel; rather it is selected automatically if needed. Apparently nothing which I had selected needed that. So I modified drivers/video/Kconfig and put “default=y” there. This allowed the udlfb driver to build, but then it had the following missing symbols upon trying to load it:

    udlfb: Unknown symbol sys_imageblit
    udlfb: Unknown symbol sys_fillrect
    udlfb: Unknown symbol sys_copyarea

    These, I found after some studying, can be enabled by selecting “Virtual Framebuffer Support (ONLY FOR TESTING!)”. So I selected this, rebuilt the kernel, and *BINGO* it loaded.

    However, I found that the only configuration which works for me is to use the “fbdev” driver for both monitors. Nevertheless, that’s a great step in the right direction!

    When I try and use fbdev with vesa, radeon, or fglrx, then Xorg complains that I must supply a BusID for all framebuffer devices. I tried various things like “USB:4:1:0” “PCI:0:0:0” but none of these seem to be correct and lacking any documentation I decided not to pursue this any further at the moment.

    One final thing. For some reason Xinerama requires that the primary device be set to 16bpp. Loading it at 32bpp causes Xorg to grumble that there is a depth mismatch between the framebuffer devices. Consequently I’m running at 16bpp on both screens apparently.

    Anyways, in summation: I have the displaylink working with my radeon card in Xinerama mode by selecting the fbdev driver for both cards and utilizing the vesafb driver for the primary display. Thanks for the help Bernie!

  9. bernie

    Excellent! This points out we need a clearer way to handle these kernel dependencies on sys and defio.

    And another vote up on adding 24/32bpp support to make Xinerama across displays easier.

    Thanks for working through this!

  10. Jeff Widman

    Hey Bernie,
    I saw your update that support for this got dropped from the mainline… has it been stuck back in as of kernel 2.6.37 or later?

  11. Bernie Thompson
