fbi – linux framebuffer imageviewer

Posted on 24. Dec, 2009 by in Tips

I was embarrassed to discover that the ever-popular dlfb “green screen” is broken in subtle ways by the new defio support. So for now, with this new driver, DL devices may end up showing what appears to be gibberish. I didn’t notice, because my test config launches X on the DisplayLink device immediately.

In the absence of a working X setup, and until the draw-during-probe issue is resolved, another great way to confirm your DL devices are working properly is with the fbi (“linux framebuffer imageviewer”) utility.

sudo apt-get install fbi

Run the above to install fbi (on Ubuntu/Debian).

fbi -d /dev/fb0 -a *.jpg

Then run this to show a full-screen slideshow with fbi, replacing “/dev/fb0” with one of your DisplayLink devices. You can “ls /sys/class/graphics/” to see all your framebuffer devices, and look within those directories for details on them. Of course, run the fbi utility from a directory with some jpgs (or other images) to display. Use pg-up/pg-down to move between photos.
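If you have several framebuffers and aren't sure which one is the DisplayLink device, a small helper along these lines can print each device with its driver-reported name and size. This is only a sketch: the function name list_framebuffers is made up for illustration, and it assumes the standard "name" and "virtual_size" attributes are present under each fbN directory.

```shell
#!/bin/sh
# Print one line per framebuffer: device, reported name, virtual size.
# The directory argument is optional and defaults to the real sysfs path.
list_framebuffers() {
    base="${1:-/sys/class/graphics}"
    for d in "$base"/fb[0-9]*; do
        # Skip the unexpanded glob when no framebuffer exists.
        [ -d "$d" ] || continue
        name=$(cat "$d/name" 2>/dev/null || echo unknown)
        size=$(cat "$d/virtual_size" 2>/dev/null || echo unknown)
        printf '%s\t%s\t%s\n' "${d##*/}" "$name" "$size"
    done
}

list_framebuffers
```

Whichever fbN turns out to be the DisplayLink device is the one to pass to fbi as /dev/fbN.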

You must run the fbi program from a console (fbi limitation). So hit Ctrl-Alt-F1 or something first, log in from that text console, and run from there.
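A quick way to check whether the current shell is on a text console is to look at the terminal device name. The sketch below is just a heuristic based on the usual Linux naming (/dev/tty1, /dev/tty2, ... for virtual consoles, /dev/pts/N for terminal emulators); the function name is_linux_vt is hypothetical.

```shell
#!/bin/sh
# Return success if the given terminal path looks like a Linux virtual console.
is_linux_vt() {
    case "$1" in
        /dev/tty[0-9]*) return 0 ;;   # /dev/tty1, /dev/tty2, ...
        *)              return 1 ;;   # /dev/pts/N, serial ports, "not a tty"
    esac
}

if is_linux_vt "$(tty)"; then
    echo "text console: ok to run fbi"
else
    echo "not a text console: switch with Ctrl-Alt-F1 first"
fi
```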

This wouldn’t work with older versions of udlfb or displaylink-mod (without defio support), and it makes a nice new test case. Now, short instructions for getting mplayer working for fbdev video would be welcome …

Take your pick: standard or custom displaylink X drivers both work now

Posted on 21. Dec, 2009 by in Programming

[update April 6, 2010: support for fbdev had been merged into the main udlfb codeline, including for 2.6.34, but has since been removed because of kernel faults that stand unsolved. If/when these problems can be found and fixed, fbdev support can get back into the mainline. Until then, the branch mentioned below is an ok way to try and test]

You can now get a version of udlfb with improved performance and support for any fbdev client.

Of the three DisplayLink Linux framebuffer driver lines, udlfb and displaylink-mod (written by Roberto DeIoris) have had the best performance by a significant margin. They’ve relied on Roberto’s custom X server, with some custom IOCTLs, to make use of precise X damage information. All the directions on the http://displaylink.org/ wiki have pointed to these drivers so far.

Unfortunately, these drivers won’t work with standard framebuffer clients that use a mmap’d framebuffer, because they’ll simply never refresh any area of the screen without damage notification. So drivers like the existing xf86-video-fbdev won’t work.

By contrast, Jaya Kumar’s defio-based DisplayLink codeline does work with xf86-video-fbdev or any standard fbdev client, but hasn’t been competitive performance-wise.

So the goal has been to merge the best aspects of both codelines, and to get the result merged into the kernel. That work isn’t completely done, but it’s at a working milestone. We now have a single kernel framebuffer driver that can support either Roberto’s custom X driver (“displaylink”) or the standard fbdev X driver (“fbdev”), just by switching the “Driver” line in the “Device” section of xorg.conf. This makes for easier performance testing and workaround testing for X server specific problems.
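For quick A/B testing, the driver switch can be scripted. The following is a minimal sketch, not part of udlfb: set_x_driver is a hypothetical helper that rewrites the Driver line only inside Section "Device" blocks of an xorg.conf and prints the result to stdout, leaving input device sections alone. It assumes the usual layout with one keyword per line.

```shell
#!/bin/sh
# Usage: set_x_driver /path/to/xorg.conf fbdev > xorg.conf.new
set_x_driver() {
    awk -v drv="$2" '
        # Track whether we are inside a Section "Device" block.
        tolower($1) == "section"    { in_dev = (tolower($2) == "\"device\"") }
        tolower($1) == "endsection" { in_dev = 0 }
        # Replace the quoted driver name, only within a Device section.
        in_dev && tolower($1) == "driver" { sub(/"[^"]*"/, "\"" drv "\"") }
        { print }
    ' "$1"
}
```

Writing to stdout rather than editing in place makes it easy to diff the result against the original before copying it over, which seems prudent for a file that can keep X from starting.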

And where previously the fbdev driver was much slower than the custom displaylink driver (90% slower on some tests), it’s now within a few percentage points, often not enough to notice.

There’s lots of room to optimize further yet, but this opens the possibility of not needing a custom X server at all for displaylink hardware, which would simplify the linux distribution rollout strategy.

You can grab this code at:

git clone http://git.plugable.com/webdav/udlfb
cd udlfb
git checkout origin/defio

Then “make; sudo make install; sudo depmod -a” as usual. If you’re switching from displaylink-mod, remove it from the kernel modules directory first; otherwise both it and udlfb may try to load.

Please post experience reports here or on the libdlo list. As patches have been developed for udlfb, there’s not been enough validation from the user community that the patches work and are valuable — and that would help the Linux kernel maintainers make their decisions about whether to accept patches.

Perf data is imperfect, but here’s a benchmark run of this code running the custom displaylink X server (making use of damage information)

bernie@bernie-aspireone:~/git/misc-udlfb$ ./udlfb-perf.sh fb0 gtkperf -a
GtkPerf 0.40 - Starting testing: Mon Dec 21 14:24:13 2009
 
GtkEntry - time:  0.00
GtkComboBox - time:  3.00
GtkComboBoxEntry - time:  1.89
GtkSpinButton - time:  0.42
GtkProgressBar - time:  0.60
GtkToggleButton - time:  0.44
GtkCheckButton - time:  0.42
GtkRadioButton - time:  0.75
GtkTextView - Add text - time:  2.09
GtkTextView - Scroll - time:  0.83
GtkDrawingArea - Lines - time:  1.66
GtkDrawingArea - Circles - time:  3.09
GtkDrawingArea - Text - time:  2.89
GtkDrawingArea - Pixbufs - time:  0.27
 ---
Total time: 18.37
 
Quitting..
 
model name      : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
model name      : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
cpu MHz         : 1600.000
cpu MHz         : 1333.000
MemTotal:        2052144 kB
Framebuffer Mode: 1920,1080
 
Rendered bytes:  155896744 (total pixels * Bpp)
Identical bytes: 96231480 (skipped via shadow buffer check)
sent bytes:      29614624 (compressed usb data, including overhead)
K CPU cycles:    1251295 (transpired, may include context switches)
 
% pixels found to be unchanged: 61.00 %
Compression of changed pixels : 50.00 %
Total CPU cycles spent per input pixel: 8
Total CPU cycles spent per output pixel: 42
USB Mbps: 11.29 (theoretical USB 2.0 peak 480)

And here’s a benchmark run of the same, just switched to run the standard fbdev X server (making use of only page faults)

bernie@bernie-aspireone:~/git/misc-udlfb$ ./udlfb-perf.sh fb0 gtkperf -a
GtkPerf 0.40 - Starting testing: Mon Dec 21 12:45:20 2009
 
GtkEntry - time:  0.00
GtkComboBox - time:  3.16
GtkComboBoxEntry - time:  1.80
GtkSpinButton - time:  0.41
GtkProgressBar - time:  0.60
GtkToggleButton - time:  0.44
GtkCheckButton - time:  0.41
GtkRadioButton - time:  0.76
GtkTextView - Add text - time:  2.03
GtkTextView - Scroll - time:  0.82
GtkDrawingArea - Lines - time:  1.87
GtkDrawingArea - Circles - time:  3.36
GtkDrawingArea - Text - time:  2.84
GtkDrawingArea - Pixbufs - time:  0.22
 ---
Total time: 18.73
 
Quitting..
 
model name      : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
model name      : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
cpu MHz         : 1333.000
cpu MHz         : 1600.000
MemTotal:        2052144 kB
Framebuffer Mode: 1920,1080
 
Rendered bytes:  288165888 (total pixels * Bpp)
Identical bytes: 227263860 (skipped via shadow buffer check)
sent bytes:      39281496 (compressed usb data, including overhead)
K CPU cycles:    1685041 (transpired, may include context switches)
 
% pixels found to be unchanged: 78.00 %
Compression of changed pixels : 35.00 %
Total CPU cycles spent per input pixel: 5
Total CPU cycles spent per output pixel: 42
USB Mbps: 14.98 (theoretical USB 2.0 peak 480)

The “Compression of changed pixels” is lower on fbdev, because the unchanged pixel detection is less accurate for the page fault method (for now, but that will get fixed ..), and so there’s a lot of re-rendering of desktop pixels — and my desktop background is a complex, gradient heavy image that doesn’t compress well.
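The two percentages in each report come straight from the raw byte counters, using the same formulas as the udlfb-perf.sh script in the metrics post further down. As a sanity check, here is a small sketch that recomputes them with shell integer arithmetic for both runs above. The function names are made up for illustration, and the results are truncated to whole percents, where the script prints two decimals.

```shell
#!/bin/sh
# Percent of rendered bytes that the shadow buffer check skipped.
unchanged_pct() {   # args: rendered identical
    echo $(( $2 * 100 / $1 ))
}

# Percent saved by compression on the bytes that actually changed.
compress_pct() {    # args: rendered identical sent
    changed=$(( $1 - $2 ))
    echo $(( (changed - $3) * 100 / changed ))
}

echo "damage run: $(unchanged_pct 155896744 96231480)% unchanged," \
     "$(compress_pct 155896744 96231480 29614624)% compression"
echo "fbdev run:  $(unchanged_pct 288165888 227263860)% unchanged," \
     "$(compress_pct 288165888 227263860 39281496)% compression"
```

This prints 61% and 50% for the damage run and 78% and 35% for the fbdev run, matching the reports.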

The main performance gains vs. the original defio implementation are:

  • Added RLE compression on the defio path
  • Added shadow/backbuffer to scan for unchanged pixels on the defio path
  • Moved to asynchronous urb dispatch for defio and damage codepaths (4 pre-alloc’d 64K urbs). This also has a significant performance benefit for the custom/damage path
  • Changed defio path to no longer send an urb per line, but rather fill every urb completely, across lines and dirty pages. Big gain for defio.
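To get a feel for why the last item matters, here is some back-of-the-envelope arithmetic. It is only a rough sketch under stated assumptions: 2 bytes per pixel (the post doesn't state the pixel depth), a full-screen uncompressed update at 1920x1080, and no protocol overhead. The 64K urb size is the one from the list above; the urbs_needed helper is hypothetical.

```shell
#!/bin/sh
# Number of urbs needed to carry total_bytes when each urb is filled completely.
urbs_needed() {     # args: total_bytes urb_size
    echo $(( ($1 + $2 - 1) / $2 ))   # integer ceiling division
}

width=1920
height=1080
bytes_per_pixel=2   # assumption, not stated in the post
frame_bytes=$(( width * height * bytes_per_pixel ))

echo "one urb per line: $height urbs"
echo "packed 64K urbs:  $(urbs_needed "$frame_bytes" 65536) urbs"
```

Under those assumptions a full repaint drops from 1080 urb submissions to 64, before compression shrinks the payload further.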

DisplayLink Linux performance metrics

Posted on 03. Dec, 2009 by in udlfb

For a USB virtual graphics solution, performance is critical. For the Linux drivers for DisplayLink devices, there are several patches and alternative implementations that are tough to evaluate, compare (and merge) without hard data. It would be great to know what compression ratio we’re getting, how much data is being sent over USB, etc.

So, to enable people to generate that better data, a few lightweight and very low-level benchmarks have been added to udlfb in this patch, which can be grabbed with:

git clone http://git.plugable.com/webdav/udlfb
cd udlfb
git checkout origin/sysfs-metrics

What has been added is a set of metrics which are exposed through sysfs. After building this branch of the driver, installing it with “sudo make install; sudo depmod -a”, and rebooting, you’ll find these new files on your system:

ls /sys/class/graphics/fb0/metrics_*
/sys/class/graphics/fb0/metrics_apis_used
/sys/class/graphics/fb0/metrics_bytes_identical
/sys/class/graphics/fb0/metrics_bytes_rendered
/sys/class/graphics/fb0/metrics_bytes_sent
/sys/class/graphics/fb0/metrics_cpu_kcycles_used
/sys/class/graphics/fb0/metrics_reset

If you read any of these files, a request goes down to the udlfb driver, which returns the matching metric. One file is write-only: metrics_reset. Writing anything to it sets all the others back to zero.
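Reading and resetting the counters by hand can be wrapped in two small helpers. This is a sketch only: the names dump_metrics and reset_metrics are made up, and the default path assumes the DisplayLink device is fb0. Writing to metrics_reset will normally need root.

```shell
#!/bin/sh
# Print every readable metric as name=value, skipping the write-only reset file.
dump_metrics() {
    dir="${1:-/sys/class/graphics/fb0}"
    for f in "$dir"/metrics_*; do
        [ -e "$f" ] || continue
        case "${f##*/}" in
            metrics_reset) continue ;;   # write-only, nothing to read
        esac
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Zero all counters; any written value triggers the reset.
reset_metrics() {
    echo 1 > "${1:-/sys/class/graphics/fb0}/metrics_reset"
}
```

A typical session would be reset_metrics, run whatever draws to the framebuffer, then dump_metrics to see what that workload cost.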

Sysfs is a really nice mechanism — we can now create some more elaborate test scenarios easily from user mode, and get fairly precise data back from kernel mode.

If you have a working setup of X on top of udlfb (I use this method) — or anything that renders to the framebuffer device — you can now get some much better data about what’s happening in udlfb.

As an example, a script called udlfb-perf.sh has been created to run tests and pretty print a simple report. Here’s the output from a test run on my Acer Aspire laptop running Ubuntu 9.04, using gtkperf as the scenario to benchmark.

./udlfb-perf.sh fb0 gtkperf -a
GtkPerf 0.40 - Starting testing: Thu Dec  3 15:16:12 2009
 
GtkEntry - time:  0.00
GtkComboBox - time:  3.17
GtkComboBoxEntry - time:  1.93
GtkSpinButton - time:  0.40
GtkProgressBar - time:  0.60
GtkToggleButton - time:  0.45
GtkCheckButton - time:  0.41
GtkRadioButton - time:  0.74
GtkTextView - Add text - time:  1.99
GtkTextView - Scroll - time:  0.86
GtkDrawingArea - Lines - time:  1.89
GtkDrawingArea - Circles - time:  3.13
GtkDrawingArea - Text - time:  3.05
GtkDrawingArea - Pixbufs - time:  0.56
 --- 
Total time: 19.18
 
Quitting..
 
model name	: Intel(R) Atom(TM) CPU N270   @ 1.60GHz
model name	: Intel(R) Atom(TM) CPU N270   @ 1.60GHz
cpu MHz		: 800.000
cpu MHz		: 800.000
MemTotal:        2052144 kB
Framebuffer Mode: 1920,1080
 
Rendered bytes:  187064790 (total pixels * Bpp)
Identical bytes: 112816894 (skipped via shadow buffer check)
sent bytes:      35556944 (compressed usb data, including overhead)
K CPU cycles:    4316313 (transpired, may include context switches)
 
% pixels found to be unchanged: 60.00 %
Compression of changed pixels : 52.00 %
CPU cycles spent per pixel: 23
USB Mbps: 13.56 (theoretical USB 2.0 peak 480)

To run this, you’ll need the udlfb-perf.sh script (shown below, and also in git here)

And you’ll need gtkperf:

sudo apt-get install gtkperf

Run udlfb-perf on your DisplayLink display, passing it the appropriate framebuffer device (e.g. ./udlfb-perf.sh fb0 gtkperf -a)

#!/bin/bash
# (C) 2009 Bernie Thompson http://plugable.com/
# License http://www.opensource.org/licenses/mit-license.html
 
if [ $# -lt 2 ]
then
    echo
    echo "Usage: $0 fbX [test to benchmark] [test parameters ....]"
    echo "[fbX] must be device visible in /sys/class/graphics directory"
    echo "and should be the DisplayLink device X is currently using"
    echo
    echo "Example: ./udlfb-perf.sh fb0 gtkperf -a"
    echo
    exit 1
fi
 
dev=$1
prog=$2
shift 2
 
echo 1 > /sys/class/graphics/$dev/metrics_reset
 
start=$(date +%s)
"$prog" "$@"
end=$(date +%s)
 
rendered=`cat /sys/class/graphics/$dev/metrics_bytes_rendered`
sent=`cat /sys/class/graphics/$dev/metrics_bytes_sent`
identical=`cat /sys/class/graphics/$dev/metrics_bytes_identical`
cycles=`cat /sys/class/graphics/$dev/metrics_cpu_kcycles_used`
mode=`cat /sys/class/graphics/$dev/virtual_size`
 
bus_compress=`/usr/bin/bc <<EOF
scale=2; (($rendered - $identical - $sent) / ($rendered - $identical)) * 100
EOF`
unchanged_pct=`/usr/bin/bc <<EOF
scale=2; (($identical) / $rendered) * 100
EOF`
mbps=`/usr/bin/bc <<EOF
scale=2; ($sent) / ($end - $start) * 8 / 1048576
EOF`
cycles_per_pix=`/usr/bin/bc <<EOF
scale=0; $cycles * 1000 / $rendered
EOF`
 
echo
/bin/grep "model name" /proc/cpuinfo
/bin/grep "MHz" /proc/cpuinfo
/bin/grep "MemTotal" /proc/meminfo
echo "Framebuffer Mode: $mode"
echo
echo "Rendered bytes:  $rendered (total pixels * Bpp)"
echo "Identical bytes: $identical (skipped via shadow buffer check)"
echo "sent bytes:      $sent (compressed usb data, including overhead)"
echo "K CPU cycles:    $cycles (transpired, may include context switches)"
echo
echo "% pixels found to be unchanged: $unchanged_pct %"
echo "Compression of changed pixels : $bus_compress %"
echo "CPU cycles spent per pixel: $cycles_per_pix"
echo "USB Mbps: $mbps (theoretical USB 2.0 peak 480)"
echo

It’d be interesting to see results from other systems and/or some suggested benchmarks other than gtkperf (especially a repeatable video playback test). Please feel free to comment with any of that.
