DisplayLink Linux performance metrics

Posted on 03. Dec, 2009 by in udlfb

For a USB virtual graphics solution, performance is critical. For the Linux drivers for DisplayLink devices, there are several patches and alternative implementations that are tough to evaluate, compare (and merge) without hard data. It would be great to know what compression ratio we’re getting, how much data is being sent over USB, etc.

So, to enable people to generate better data, a few lightweight and very low-level benchmarks have been added to udlfb in this patch, which can be grabbed with:

git clone http://git.plugable.com/webdav/udlfb
cd udlfb
git checkout origin/sysfs-metrics

What has been added is a set of metrics which are exposed through sysfs. After building this branch of the driver with “sudo make install; sudo depmod -a” and rebooting, you’ll find these new files on your system:

ls /sys/class/graphics/fb0/metrics_*
/sys/class/graphics/fb0/metrics_apis_used
/sys/class/graphics/fb0/metrics_bytes_identical
/sys/class/graphics/fb0/metrics_bytes_rendered
/sys/class/graphics/fb0/metrics_bytes_sent
/sys/class/graphics/fb0/metrics_cpu_kcycles_used
/sys/class/graphics/fb0/metrics_reset

If you read any of these files, a request goes down to the udlfb driver, which returns the matching metric. One file is write-only: metrics_reset. Writing anything to it sets all the others back to zero.
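
As a rough sketch of how these attributes can be read from user mode, the helper below prints every metrics_* file in a directory as a name/value pair. The function name dump_metrics and its directory argument are made up for illustration; on a real system the directory would be something like /sys/class/graphics/fb0.

```shell
#!/bin/bash
# Print every readable metrics_* attribute in a directory as "name value".
# dump_metrics is a hypothetical helper, not part of udlfb.
dump_metrics() {
    local dir=$1 f
    for f in "$dir"/metrics_*; do
        # Skip unreadable entries (and an unmatched glob pattern).
        [ -r "$f" ] || continue
        # metrics_reset is write-only, so there is nothing to print for it.
        [ "${f##*/}" = metrics_reset ] && continue
        printf '%s %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Example: dump_metrics /sys/class/graphics/fb0
```

Taking the directory as a parameter keeps the helper usable for any fbX device, and lets it be tried against a scratch directory of sample files.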

Sysfs is a really nice mechanism — we can now create some more elaborate test scenarios easily from user mode, and get fairly precise data back from kernel mode.

If you have a working setup of X on top of udlfb (I use this method) — or anything that renders to the framebuffer device — you can now get some much better data about what’s happening in udlfb.

As an example, a script called udlfb-perf.sh has been created to run tests and pretty-print a simple report. Here’s the output from a test run on my Acer Aspire laptop running Ubuntu 9.04, using gtkperf as the scenario to benchmark.

./udlfb-perf.sh fb0 gtkperf -a
GtkPerf 0.40 - Starting testing: Thu Dec  3 15:16:12 2009
 
GtkEntry - time:  0.00
GtkComboBox - time:  3.17
GtkComboBoxEntry - time:  1.93
GtkSpinButton - time:  0.40
GtkProgressBar - time:  0.60
GtkToggleButton - time:  0.45
GtkCheckButton - time:  0.41
GtkRadioButton - time:  0.74
GtkTextView - Add text - time:  1.99
GtkTextView - Scroll - time:  0.86
GtkDrawingArea - Lines - time:  1.89
GtkDrawingArea - Circles - time:  3.13
GtkDrawingArea - Text - time:  3.05
GtkDrawingArea - Pixbufs - time:  0.56
 --- 
Total time: 19.18
 
Quitting..
 
model name	: Intel(R) Atom(TM) CPU N270   @ 1.60GHz
model name	: Intel(R) Atom(TM) CPU N270   @ 1.60GHz
cpu MHz		: 800.000
cpu MHz		: 800.000
MemTotal:        2052144 kB
Framebuffer Mode: 1920,1080
 
Rendered bytes:  187064790 (total pixels * Bpp)
Identical bytes: 112816894 (skipped via shadow buffer check)
sent bytes:      35556944 (compressed usb data, including overhead)
K CPU cycles:    4316313 (transpired, may include context switches)
 
% pixels found to be unchanged: 60.00 %
Compression of changed pixels : 52.00 %
CPU cycles spent per pixel: 23
USB Mbps: 13.56 (theoretical USB 2.0 peak 480)
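
The whole-number percentages in the sample report look like a truncation artifact: a bc expression that divides at scale=2 and only then multiplies by 100 keeps just two decimal digits of the ratio. As a cross-check, here is a minimal sketch that recomputes both figures from the raw counters using integer arithmetic, multiplying before dividing. The helper name pct is made up for illustration and assumes non-negative inputs.

```shell
#!/bin/bash
# Percentage with two decimals using only integer arithmetic:
# scale the numerator by 10000 first, then divide once.
# pct is a hypothetical helper; it assumes non-negative inputs.
pct() {
    local num=$1 den=$2 p
    p=$(( num * 10000 / den ))
    printf '%d.%02d\n' $(( p / 100 )) $(( p % 100 ))
}

# Raw counters copied from the sample report above.
rendered=187064790
identical=112816894
sent=35556944
changed=$(( rendered - identical ))

echo "unchanged:   $(pct "$identical" "$rendered") %"
echo "compression: $(pct $(( changed - sent )) "$changed") %"
```

With these counters the sketch reports 60.30 % unchanged and 52.11 % compression, so the report's figures are right to the nearest whole percent but lose the fractional part.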

To run this, you’ll need the udlfb-perf.sh script (shown below, and also in git here).

And you’ll need gtkperf:

sudo apt-get install gtkperf

Run udlfb-perf.sh on your DisplayLink display, passing it the appropriate framebuffer device (e.g. ./udlfb-perf.sh fb0 gtkperf -a).

#!/bin/bash
# (C) 2009 Bernie Thompson http://plugable.com/
# License http://www.opensource.org/licenses/mit-license.html
 
if [ $# -eq 0 ]
then
    echo
    echo "Usage: $0 fbX [test to benchmark] [test parameters ....]"
    echo "[fbX] must be device visible in /sys/class/graphics directory"
    echo "and should be the DisplayLink device X is currently using"
    echo
    echo "Example: ./udlfb-perf.sh fb0 gtkperf -a"
    echo
    exit 1
fi
 
dev=$1
prog=$2
shift 2
 
echo 1 > /sys/class/graphics/$dev/metrics_reset
 
start=$(date +%s)
"$prog" "$@"
end=$(date +%s)
 
rendered=`cat /sys/class/graphics/$dev/metrics_bytes_rendered`
sent=`cat /sys/class/graphics/$dev/metrics_bytes_sent`
identical=`cat /sys/class/graphics/$dev/metrics_bytes_identical`
cycles=`cat /sys/class/graphics/$dev/metrics_cpu_kcycles_used`
mode=`cat /sys/class/graphics/$dev/virtual_size`
 
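# Multiply by 100 before dividing: with scale=2, dividing first truncates
# the ratio to two decimals, so the percentage always ends in .00
bus_compress=`/usr/bin/bc <<EOF
scale=2; (($rendered - $identical - $sent) * 100) / ($rendered - $identical)
EOF`
unchanged_pct=`/usr/bin/bc <<EOF
scale=2; ($identical * 100) / $rendered
EOF`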
mbps=`/usr/bin/bc <<EOF
scale=2; ($sent) / ($end - $start) * 8 / 1048576
EOF`
cycles_per_pix=`/usr/bin/bc <<EOF
scale=0; $cycles * 1000 / $rendered
EOF`
 
echo
/bin/grep "model name" /proc/cpuinfo
/bin/grep "MHz" /proc/cpuinfo
/bin/grep "MemTotal" /proc/meminfo
echo "Framebuffer Mode: $mode"
echo
echo "Rendered bytes:  $rendered (total pixels * Bpp)"
echo "Identical bytes: $identical (skipped via shadow buffer check)"
echo "sent bytes:      $sent (compressed usb data, including overhead)"
echo "K CPU cycles:    $cycles (transpired, may include context switches)"
echo
echo "% pixels found to be unchanged: $unchanged_pct %"
echo "Compression of changed pixels : $bus_compress %"
echo "CPU cycles spent per pixel: $cycles_per_pix"
echo "USB Mbps: $mbps (theoretical USB 2.0 peak 480)"
echo

It’d be interesting to see results from other systems and/or some suggested benchmarks other than gtkperf (especially a repeatable video playback test). Please feel free to comment with any of that.

udlfb 0.4.0

Posted on 11. Nov, 2009 by in USB-VGA-165

[Update Dec 29, 2011: udlfb was promoted from the staging to the mainline kernel tree in 2.6.38. And in kernel 3.3 pagefault detection and console are enabled by default. See all our udlfb posts for the latest news.]

[Update March 14, 2010: udlfb versions have moved to being released with the Linux kernel. Update on udlfb support in Linux kernel 2.6.34]

[Update Feb 6, 2010: additional features and fixes post-0.4 are available at the plugable git page. Background is in later posts. One major udlfb patch with these changes has made it into linux-next (slated for 2.6.34), and additional patches will be coming as they're ready.]

This is a new release of the DisplayLink kernel framebuffer driver, udlfb.

udlfb was accepted into the Linux kernel staging tree of 2.6 a few months ago. It needs some work to add key features and get it moving from the staging tree into mainline. Roberto De Ioris, the author of udlfb and displaylink-mod, is focusing on X server work, and is happy with this work happening in parallel to move udlfb forward.

This first release intentionally adds no fundamentally new features. It only brings udlfb in sync with the displaylink-mod branch (up to Roberto’s last 0.3 release in July 2009) that has been in use the past few months. With this update, displaylink-mod users should be able to switch to this version of udlfb transparently.

Bug reports are very welcome, especially regressions or problems that would stand in the way of moving this driver forward out of staging (comments here are fine for bug reports).

New in 0.4.0 (since 0.2.3 currently in the Linux kernel staging tree)

  • Add dynamic modeset support (from displaylink-mod 0.3 and libdlo)
    • udlfb uses EDID to find the monitor’s preferred mode
    • udlfb no longer has fixed mode tables – it’s able to set any mode (within the capabilities of the chip) dynamically, from the standard VESA timing characteristics of the mode
  • Fix teardown synchronization issues (from displaylink-mod 0.3)
  • Other minor changes related to probe/modeset (from displaylink-mod 0.3)
  • Functionally identical to displaylink-mod 0.3
  • Retains basic layout of udlfb to make diffs more transparent and understandable

Download

See the git project summary page at http://git.plugable.com/gitphp/index.php?p=udlfb&a=summary for information.

Switching from displaylink-mod to udlfb.

Both of these drivers match against all DisplayLink devices, so you don’t want both loading on your system. To clear out displaylink-mod:

sudo rmmod displaylink
sudo rm /lib/modules/`uname -r`/extra/displaylink.ko
sudo depmod -a

Then download, compile, and install udlfb 0.4:

./configure
make
sudo make install
sudo depmod -a
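
To confirm that only one of the two drivers ends up loaded, a check along these lines may help. This is a minimal sketch: the function names are made up for illustration, and it assumes the modules appear as "displaylink" and "udlfb" in /proc/modules (the first matches the rmmod command above).

```shell
#!/bin/bash
# List which DisplayLink framebuffer modules are loaded.
# Reads a file in /proc/modules format (module name is the first field);
# the optional file argument exists so the check can be tried on sample data.
loaded_displaylink_modules() {
    local modules_file=${1:-/proc/modules}
    awk '$1 == "displaylink" || $1 == "udlfb" { print $1 }' "$modules_file" | sort
}

# Warn (and return non-zero) if both drivers are loaded at once.
check_conflict() {
    local count
    count=$(( $(loaded_displaylink_modules "$@" | wc -l) ))
    if [ "$count" -gt 1 ]; then
        echo "warning: both displaylink and udlfb are loaded"
        return 1
    fi
    echo "ok"
}

# Example: check_conflict
```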

Todo

  • Merge in enhancements from Jaya Kumar’s displaylinkfb branch (defio support)
  • Merge in enhancements from Bernie’s displaylink-mod branch (performance)
  • Clear up remaining endian issues, to make sure it works on ARM and others
  • Add performance metrics, and sysfs attributes to read/reset them
  • Make allocation/use of backbuffer a runtime option, with param and sysfs switch
  • Figure out what KMS (Kernel Mode Setting) means to framebuffer drivers
  • Move from single URB with synchronous dispatch to ring of USB URBs, with asynchronous dispatch
  • Enhance probe() to better handle chip type detection
  • Enhance mode selection to better handle limits of DisplayLink chip
  • Add simulated hardware cursor support, to prioritize mouse movement

Any feedback or ideas on these todos are very welcome. And, as always, patches are very welcome and will be incorporated as quickly as possible.

DisplayLink kernel framebuffer performance

Posted on 02. Nov, 2009 by in Programming

There are three codelines of Linux DisplayLink kernel framebuffer drivers currently in use, plus one for FreeBSD:

  • udlfb (Roberto De Ioris), which is in the Linux kernel staging tree of 2.6.31 and later, and is enabled by default in some recent distros (Ubuntu 9.10). Capable of working with all DisplayLink devices.
  • displaylink-mod (Roberto De Ioris), which adds dynamic mode support and a few other minor changes
  • displaylinkfb (Jaya Kumar), which tries out some innovative approaches (defio page-fault change detection) and uses existing fbdev X servers without modification, but is much slower (80% slower on an example test) than udlfb and displaylink-mod, which both use X damage information and RL/RAW compression
  • udl (Marcus Glocker) for FreeBSD (not Linux) text/graphics console driver interface, which makes use of damage and ports Huffman-style compression from Florian Echtler’s libtubecable library. This support is currently ahead of what’s on Linux.

You can get more information about these drivers at http://libdlo.freedesktop.org/wiki/HowTo

In general, most of the demos and videos posted here use displaylink-mod. Some performance improvement patches are available at http://git.plugable.com/, but they are a relatively small improvement.

So, any conclusions from the performance work so far?

  • Graphics benchmarks on Linux are in rough shape. Most practical approach so far has been simplistically using a combination of glxgears and a few select tests from x11perf. Good video playback tests needed.
  • Jaya’s displaylinkfb driver tries some interesting concepts, but the lack of damage information from X means it runs much slower (up to 80% slower) than the alternatives for now.
  • The original RL compression in udlfb by Henrik Bjerregaard Pedersen is surprisingly effective, even though it only uses one of RLE or RAW for each 255 pixel segment. It’s relatively CPU-efficient with simple inner loops, and decently USB-efficient in practice.
  • The alternating RL/RAW algorithm in the master branch at http://git.plugable.com/ does slightly better, but it varies per test and is not dramatic.
  • The shadow/back buffer that udlfb and displaylink-mod keep does provide a significant gain on roughly one out of every four tests. But X’s damage information is quite accurate, so saving that allocation and the ongoing reads/writes to that extra memory by forgoing the back buffer is definitely viable and should at least be a module option. The ‘noback’ branch at http://git.plugable.com/ has the back buffer removed (despite what one check-in comment says), for anyone who wants to try it.
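
To make the "one of RLE or RAW per segment" idea concrete, here is a toy sketch that picks the cheaper mode for a single segment. It is not the DisplayLink wire format or the udlfb code: the cost model (RAW costs one unit per pixel, RLE costs two units per run) and the function name choose_encoding are assumptions for illustration, and the example segments are 8 pixels long rather than 255.

```shell
#!/bin/bash
# Toy cost model, NOT the DisplayLink wire format: RAW costs one unit per
# pixel, RLE costs two units per run (a count plus a pixel value).
# choose_encoding is a hypothetical helper; pixels are passed as arguments.
choose_encoding() {
    local n=$# runs=0 prev="" px
    for px in "$@"; do
        # A new run starts whenever the pixel differs from the previous one.
        [ "$px" = "$prev" ] || runs=$(( runs + 1 ))
        prev=$px
    done
    local raw_cost=$n rle_cost=$(( runs * 2 ))
    if [ "$rle_cost" -lt "$raw_cost" ]; then
        echo "RLE $rle_cost"
    else
        echo "RAW $raw_cost"
    fi
}

choose_encoding 7 7 7 7 7 7 1 1   # two runs   -> prints "RLE 4"
choose_encoding 1 2 3 4 5 6 7 8   # no repeats -> prints "RAW 8"
```

Under this model a segment with long flat runs compresses well, while a noisy segment falls back to RAW, which is the trade-off a per-segment choice is meant to capture.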

Perhaps the best and quickest path forward to getting support more widely distributed is

  • Bringing udlfb up to snuff, since there are just a few functional changes in the displaylink-mod branch, and udlfb already is going through the staging->mainline confidence building process
  • For the matching X server, it would be great to have a kernel driver that works either with the standard http://cgit.freedesktop.org/xorg/driver/xf86-video-fbdev/ at some level of performance, or with the custom DisplayLink X server at some (better) level of performance. Then move things (like damage support, which is key to performance) from the DisplayLink server to the fbdev server in a standardized way over time.
  • Then there’s a bunch of other more involved work to come on configuration and coexistence with multiple graphics controllers. Rough plans are visible in the fog here, but they involve other projects and people.