VFIO VGA test branches

Re: VFIO VGA test branches

Alex Williamson-3
On Tue, 2013-05-28 at 07:33 +0200, Knut Omang wrote:

>
> I noticed this warning in the host log - I suppose it is unrelated but
> thought I'd mention it just in case there is some side effect I do not
> understand here:
>
> [    0.538124] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
> [    0.538619] PCI-DMA: Intel(R) Virtualization Technology for Directed I/O
> [    0.538676] ------------[ cut here ]------------
> [    0.538681] WARNING: at drivers/pci/search.c:46 pci_find_upstream_pcie_bridge+0x58/0x80()
> [    0.538683] Hardware name: To be filled by O.E.M.
> [    0.538685] Modules linked in:
> [    0.538687] Pid: 1, comm: swapper/0 Not tainted 3.9.0+ #1
> [    0.538689] Call Trace:
> [    0.538694]  [<ffffffff8105ed2f>] warn_slowpath_common+0x7f/0xc0
> [    0.538697]  [<ffffffff8105ed8a>] warn_slowpath_null+0x1a/0x20
> [    0.538699]  [<ffffffff8132dc28>] pci_find_upstream_pcie_bridge+0x58/0x80
> [    0.538703]  [<ffffffff8152e26b>] intel_iommu_add_device+0x4b/0x1f0
> [    0.538706]  [<ffffffff81525b30>] ? bus_set_iommu+0x60/0x60
> [    0.538708]  [<ffffffff81525b63>] add_iommu_group+0x33/0x60
> [    0.538712]  [<ffffffff813f38fd>] bus_for_each_dev+0x5d/0xa0
> [    0.538714]  [<ffffffff81525b1b>] bus_set_iommu+0x4b/0x60
> [    0.538718]  [<ffffffff81d47d61>] intel_iommu_init+0xa72/0xb9a
> [    0.538722]  [<ffffffff81d0db94>] ? memblock_find_dma_reserve+0x13d/0x13d
> [    0.538724]  [<ffffffff81d0dba7>] pci_iommu_init+0x13/0x3e
> [    0.538727]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
> [    0.538730]  [<ffffffff81d0603b>] kernel_init_freeable+0x150/0x1df
> [    0.538732]  [<ffffffff81d0588d>] ? do_early_param+0x8c/0x8c
> [    0.538736]  [<ffffffff81646580>] ? rest_init+0x80/0x80
> [    0.538738]  [<ffffffff8164658e>] kernel_init+0xe/0xf0
> [    0.538742]  [<ffffffff8166af6c>] ret_from_fork+0x7c/0xb0
> [    0.538744]  [<ffffffff81646580>] ? rest_init+0x80/0x80
> [    0.538749] ---[ end trace f4e8b5168095f9c1 ]---


There's a bug for this:

https://bugzilla.kernel.org/show_bug.cgi?id=44881


Chances are your system includes one of the non-compliant PCIe-to-PCI
bridges that doesn't include a PCIe capability.  So long as you're not
assigning anything behind that bridge, it shouldn't matter, but I think
we'll set up the wrong grouping and use the wrong source ID for devices
behind it.  Thanks,

Alex
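
For readers unfamiliar with the "source ID" mentioned above, here is a minimal sketch, assuming the usual PCI requester ID layout (8-bit bus, 5-bit device, 3-bit function). The helper name `source_id` is hypothetical and is not taken from the kernel. Devices behind a PCIe-to-PCI bridge typically do not appear on the fabric under their own ID, which is why a bridge that the kernel cannot identify may lead to the wrong ID being programmed.

```python
import re

# Matches "0000:04:00.1" or "04:00.1" (the PCI domain prefix is optional).
_BDF_RE = re.compile(
    r"^(?:[0-9a-fA-F]{4}:)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def source_id(bdf):
    """Return the 16-bit requester (source) ID for a bus:device.function string.

    Layout assumed here: bits 15..8 = bus, bits 7..3 = device, bits 2..0 = function.
    """
    match = _BDF_RE.match(bdf)
    if match is None:
        raise ValueError("not a PCI address: %r" % (bdf,))
    bus = int(match.group(1), 16)
    dev = int(match.group(2), 16)
    fn = int(match.group(3), 16)
    if dev > 0x1F:
        raise ValueError("device number out of range: %r" % (bdf,))
    return (bus << 8) | (dev << 3) | fn


if __name__ == "__main__":
    # Addresses taken from the logs in this thread.
    for addr in ("0000:00:1f.0", "0000:04:00.0", "0000:04:00.1"):
        print("%s -> 0x%04x" % (addr, source_id(addr)))
```

The two functions of the assigned card differ only in the low three bits, which is consistent with both landing in the same IOMMU group in the DEBUG_VFIO output later in the thread.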


Re: VFIO VGA test branches

Maik Broemme
Hi,

Maik Broemme <[hidden email]> wrote:

> Hi Alex,
>
> Maik Broemme <[hidden email]> wrote:
> > Hi Alex,
> >
> > Alex Williamson <[hidden email]> wrote:
> > >
> > > Good to hear.  It looks like you have the same motherboard as my AMD
> > > test system.  An HD7850 in that system runs quite reliably with the
> > > branches above although I do occasionally get VGA palette corruption.
> > >
> >
> > Good to know. I'm using a Radeon HD7870 which works fine now. I have the
> > same VGA palette corruption occasionally but only until Catalyst driver
> > is loaded. So it happens sometimes during VGA init if Windows 7 boot
> > logo is shown with very strange colors and went away if Catalyst driver
> > is loaded.
> >
> > > Are you still require -vga cirrus or do the -vga none, x-vga=on cases
> > > work now too?  Thanks,
> > >
> >
> > No longer required, -vga none with x-vga=on work on your branches fine
> > now. Not sure if there was something more changed because with original
> > Fedora 3.9.2 kernel it still doesn't work.
> >
>
> Alex, I have a strange issue now with either the 'vfio-vga-reset'
> branches or with the stable 3.9.4 kernel. This is my 'lspci' output:
>
> 00:14.2 Audio device: Advanced Micro Devices [AMD] nee ATI SBx00 Azalia (Intel HDA) (rev 40)
> 01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 520] (rev a1)
> 01:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
> 02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Pitcairn [Radeon HD 7800]
> 02:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
>
> The '01:00.0' is my primary device used for Linux and '02:00.0' my
> secondary for QEMU. Two new different problems:
>
> 1) If the 'nvidia.ko' binary driver is loaded for the first card, QEMU
> immediately gets stuck after startup and hangs with:
>
> 1140  futex(0x7f0ad9b21300, FUTEX_WAIT_PRIVATE, 2, NULL
>
> I have the complete strace output if needed. After that I can only
> terminate qemu with 'kill -9' and if I start it again the following
> Oops occurs:
>
> [  655.684121] ------------[ cut here ]------------
> [  655.684134] WARNING: at lib/list_debug.c:29 __list_add+0x77/0xd0()
> [  655.684151] Hardware name: GA-990FXA-UD3
> [  655.684271] list_add corruption. next->prev should be prev (ffffffff81ca3d98), but was           (null). (next=ffff88041bc3fe08).
> [  655.684477] Modules linked in: vhost_net macvtap macvlan tun arc4 md4 nls_utf8 cifs dns_resolver fscache vfio_pci vfio_iommu_type1 vfio bridge stp llc ip6table_filter ip6_tables it87 hwmon_vid snd_hda_codec_hdmi nvidia(POF) acpi_cpufreq mperf kvm_amd snd_hda_codec_realtek kvm crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_intel snd_hda_codec microcode edac_core snd_hwdep fam15h_power snd_seq edac_mce_amd snd_seq_device k10temp r8169 sp5100_tco snd_pcm mii i2c_piix4 snd_page_alloc snd_timer i2c_core snd soundcore mxm_wmi firewire_ohci firewire_core crc_itu_t wmi
> [  655.685451] Pid: 2097, comm: qemu-system-x86 Tainted: PF          O 3.9.4-200.fc18.x86_64 #1
> [  655.685642] Call Trace:
> [  655.685738]  [<ffffffff8105f125>] warn_slowpath_common+0x75/0xa0
> [  655.685851]  [<ffffffff8105f206>] warn_slowpath_fmt+0x46/0x50
> [  655.685955]  [<ffffffff81316ef7>] __list_add+0x77/0xd0
> [  655.686058]  [<ffffffff8108392c>] add_wait_queue+0x3c/0x60
> [  655.686162]  [<ffffffff813f241d>] vga_get+0xdd/0x190
> [  655.686266]  [<ffffffff81093e40>] ? try_to_wake_up+0x2d0/0x2d0
> [  655.686373]  [<ffffffffa01ac625>] vfio_pci_vga_rw+0xb5/0x230 [vfio_pci]
> [  655.686481]  [<ffffffffa01aa279>] vfio_pci_rw+0x39/0x80 [vfio_pci]
> [  655.686587]  [<ffffffffa01aa30c>] vfio_pci_read+0x1c/0x20 [vfio_pci]
> [  655.686701]  [<ffffffffa01a40e3>] vfio_device_fops_read+0x23/0x30 [vfio]
> [  655.686814]  [<ffffffff811a01b9>] vfs_read+0xa9/0x180
> [  655.686915]  [<ffffffff811a05ba>] sys_pread64+0x9a/0xb0
> [  655.687018]  [<ffffffff81669f59>] system_call_fastpath+0x16/0x1b
> [  655.687123] ---[ end trace a68eabc3660237b1 ]---
>
> This is always reproducible. I know it is the binary driver and maybe
> nobody cares but it is widely used. :)

Here is the DEBUG_VFIO output:

vfio: vfio_initfn(0000:04:00.0) group 14
vfio: region_add 0 - afffffff [0x7f8698000000]
vfio: SKIPPING region_add fec00000 - fec00fff
vfio: SKIPPING region_add fed00000 - fed003ff
vfio: SKIPPING region_add fee00000 - feefffff
vfio: region_add fffe0000 - ffffffff [0x7f88aa400000]
vfio: region_add 100000000 - 24fffffff [0x7f8748000000]
vfio: Device 0000:04:00.0 flags: 3, regions: 9, irgs: 4
vfio: Device 0000:04:00.0 region 0:
vfio:   size: 0x10000000, offset: 0x0, flags: 0x7
vfio: Device 0000:04:00.0 region 1:
vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
vfio: Device 0000:04:00.0 region 2:
vfio:   size: 0x40000, offset: 0x20000000000, flags: 0x7
vfio: Device 0000:04:00.0 region 3:
vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
vfio: Device 0000:04:00.0 region 4:
vfio:   size: 0x100, offset: 0x40000000000, flags: 0x3
vfio: Device 0000:04:00.0 region 5:
vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
vfio: Device 0000:04:00.0 ROM:
vfio:   size: 0x20000, offset: 0x60000000000, flags: 0x1
vfio: Device 0000:04:00.0 config:
vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
vfio: vfio_load_rom(0000:04:00.0)
vfio: Enabled ATI/AMD BAR2 0x4000 quirk for device 0000:04:00.0
vfio: Enabled ATI/AMD BAR4 window quirk for device 0000:04:00.0
vfio: Enabled ATI/AMD quirk 0x3c3 BAR4 for device 0000:04:00.0
vfio: 0000:04:00.0 PCI MSI CAP @0xa0
vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
vfio: vfio_enable_intx(0000:04:00.0)
vfio: vfio_initfn(0000:04:00.1) group 14
vfio: Device 0000:04:00.1 flags: 3, regions: 9, irgs: 4
vfio: Device 0000:04:00.1 region 0:
vfio:   size: 0x4000, offset: 0x0, flags: 0x7
vfio: Device 0000:04:00.1 region 1:
vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
vfio: Device 0000:04:00.1 region 2:
vfio:   size: 0x0, offset: 0x20000000000, flags: 0x0
vfio: Device 0000:04:00.1 region 3:
vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
vfio: Device 0000:04:00.1 region 4:
vfio:   size: 0x0, offset: 0x40000000000, flags: 0x0
vfio: Device 0000:04:00.1 region 5:
vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
vfio: Device 0000:04:00.1 ROM:
vfio:   size: 0x0, offset: 0x60000000000, flags: 0x0
vfio: Device 0000:04:00.1 config:
vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
vfio: 0000:04:00.1 PCI MSI CAP @0xa0
vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
vfio: vfio_enable_intx(0000:04:00.1)
vfio: region_del 0 - afffffff
vfio: region_add 0 - bffff [0x7f8698000000]
vfio: region_add c0000 - dffff [0x7f88aa200000]
vfio: region_add e0000 - fffff [0x7f88aa400000]
vfio: region_add 100000 - afffffff [0x7f8698100000]
vfio: vfio_pci_reset(0000:04:00.0)
vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
vfio: vfio_disable_intx(0000:04:00.0)
vfio: vfio_pci_read_config(0000:04:00.0, @0x54, len=0x2) 0
vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 3
vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x0, len=0x2)
vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
vfio: vfio_enable_intx(0000:04:00.0)
vfio: vfio_pci_reset(0000:04:00.1)
vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
vfio: vfio_disable_intx(0000:04:00.1)
vfio: vfio_pci_read_config(0000:04:00.1, @0x54, len=0x2) 0
vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 6
vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x0, len=0x2)
vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
vfio: vfio_enable_intx(0000:04:00.1)
vfio: region_del 0 - bffff
vfio: region_del c0000 - dffff
vfio: region_add 0 - c7fff [0x7f8698000000]
vfio: region_add c8000 - dffff [0x7f88aa208000]
vfio: region_del 0 - c7fff
vfio: region_del c8000 - dffff
vfio: region_add 0 - cffff [0x7f8698000000]
vfio: region_add d0000 - dffff [0x7f88aa210000]
vfio: region_del 0 - cffff
vfio: region_del d0000 - dffff
vfio: region_add 0 - d7fff [0x7f8698000000]
vfio: region_add d8000 - dffff [0x7f88aa218000]
vfio: region_del 0 - d7fff
vfio: region_del d8000 - dffff
vfio: region_add 0 - dffff [0x7f8698000000]
vfio: region_del 0 - dffff
vfio: region_del e0000 - fffff
vfio: region_add 0 - e7fff [0x7f8698000000]
vfio: region_add e8000 - fffff [0x7f88aa408000]
vfio: region_del 0 - e7fff
vfio: region_del e8000 - fffff
vfio: region_add 0 - effff [0x7f8698000000]
vfio: region_add f0000 - fffff [0x7f88aa410000]
vfio: region_del 0 - effff
vfio: region_del f0000 - fffff
vfio: region_del 100000 - afffffff
vfio: region_add 0 - afffffff [0x7f8698000000]
vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x4) 68181002
vfio: vfio_pci_read_config(0000:04:00.0, @0x8, len=0x4) 3000000
vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x4) aab01002
vfio: vfio_pci_read_config(0000:04:00.1, @0x8, len=0x4) 4030000
vfio: vfio_pci_read_config(0000:04:00.1, @0xe, len=0x1) 80
vfio: SKIPPING region_add b0000000 - bfffffff
vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) c000000c
vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) f000000c
vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xc000000c, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) ffffffff
vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fde80004
vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fffc0004
vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfde80004, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) ffffffff
vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ce01
vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ffffff01
vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xce01, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffff800, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) fdefc004
vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) ffffc004
vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfdefc004, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) ffffffff
vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0xffffffff, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0x0, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0xfffff800, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0x0, len=0x4)
vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xc000, len=0x4)
vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfea00000, len=0x4)
vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfea60000, len=0x4)
vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xe0000000, len=0x4)
vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
vfio: SKIPPING region_add feb40000 - feb4002f
vfio: SKIPPING region_add feb40800 - feb40807
vfio: SKIPPING region_add feb41000 - feb4101f
vfio: SKIPPING region_add feb41800 - feb41807
vfio: vfio_update_irq(0000:04:00.1) IRQ moved 20 -> 10
vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
vfio: vfio_update_irq(0000:04:00.0) IRQ moved 23 -> 11
vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
vfio: SKIPPING region_add feb42000 - feb42fff
vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
vfio: vfio_pci_write_config(0000:04:00.0, @0x3c, 0xb, len=0x1)
vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 0
vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
vfio: region_add e0000000 - efffffff [0x7f8688000000]
vfio: region_add fea00000 - fea03fff [0x7f88aa7b8000]
vfio: SKIPPING region_add fea04000 - fea04fff
vfio: region_add fea05000 - fea3ffff [0x7f88aa7bd000]
vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
vfio: vfio_pci_write_config(0000:04:00.1, @0x3c, 0xa, len=0x1)
vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 0
vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x103, len=0x2)
vfio: region_add fea60000 - fea63fff [0x7f88bc710000]
vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
vfio: region_del 0 - afffffff
vfio: region_add 0 - 9ffff [0x7f8698000000]
vfio: SKIPPING region_add a0000 - bffff
vfio: region_add c0000 - afffffff [0x7f86980c0000]
vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fea40000
vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffffffe, len=0x4)
vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40001, len=0x4)
vfio: region_add fea40000 - fea5ffff [0x7f88a9e00000]
vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
vfio: region_del fea40000 - fea5ffff
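
The write-0xffffffff / read-back / restore triplets on config offsets 0x10 to 0x24 and 0x30 in the output above look like the standard PCI BAR sizing probe. Below is a minimal sketch of that size calculation, assuming the usual BAR encoding (bit 0 selects I/O, bits 2:1 equal to 10b mark a 64-bit memory BAR, and the expansion ROM register keeps its low 11 bits for flags). The function names `bar_size` and `rom_size` are hypothetical; the sample values are copied from the log.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def bar_size(lo, hi=None):
    """Size in bytes from the value read back after writing all ones.

    lo is the readback of the BAR itself, hi the readback of the next
    dword (only needed for a 64-bit memory BAR). Returns 0 if the BAR
    is unimplemented (reads back as 0).
    """
    if lo & 0x1:
        # I/O BAR: the low 2 bits are flags.
        mask = lo & ~0x3 & MASK32
        return (~mask + 1) & MASK32
    is_64bit = ((lo >> 1) & 0x3) == 0x2
    mask = lo & ~0xF & MASK32  # memory BAR: the low 4 bits are flags
    if is_64bit:
        if hi is None:
            raise ValueError("64-bit BAR needs the upper dword readback")
        mask |= (hi & MASK32) << 32
        return (~mask + 1) & MASK64
    return (~mask + 1) & MASK32


def rom_size(readback):
    """Size of the expansion ROM BAR (config offset 0x30)."""
    mask = readback & ~0x7FF & MASK32
    return (~mask + 1) & MASK32


if __name__ == "__main__":
    print(hex(bar_size(0xF000000C, 0xFFFFFFFF)))  # 04:00.0 BAR0
    print(hex(bar_size(0xFFFC0004, 0xFFFFFFFF)))  # 04:00.0 BAR2
    print(hex(bar_size(0xFFFFFF01)))              # 04:00.0 BAR4 (I/O)
    print(hex(rom_size(0xFFFE0000)))              # 04:00.0 ROM
```

The results agree with the region sizes reported earlier in the same output for 0000:04:00.0 (0x10000000, 0x40000, 0x100 and 0x20000), so the probe itself appears to behave normally.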

Here is the strace output from this failure:

1110  ioctl(14, KVM_RUN, 0)             = 0
1110  pread(20,  <unfinished ...>
1099  <... poll resumed> )              = 1 ([{fd=0, revents=POLLIN}])
1099  futex(0x7ff73ca62fa0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
1109  <... futex resumed> )             = -1 ETIMEDOUT (Connection timed out)
1109  madvise(0x7ff72fe17000, 8368128, MADV_DONTNEED) = 0
1109  _exit(0)                          = ?
1109  +++ exited with 0 +++

From reading the source 'hw/misc/vfio.c', it looks like the following
pread() call in 'vfio_vga_read' never returns:

    if (pread(vga->fd, &buf, size, offset) != size) {
        error_report("%s(,0x%"HWADDR_PRIx", %d) failed: %m",
                     __func__, region->offset + addr, size);
        return (uint64_t)-1;
    }
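
As a reading aid for the offset passed to pread() here: the DEBUG_VFIO output lists each device region at an offset equal to its index shifted left by 40 bits (region 1 at 0x10000000000, config at 0x70000000000). The sketch below encodes and decodes such offsets. The constant name, the helper names and the assumption that the VGA region is index 8 (the ninth of the "regions: 9" reported) are not shown in the log itself, so treat them as assumptions.

```python
# Assumed shift; it matches "offset: 0x10000000000" for region 1 in the log.
VFIO_PCI_OFFSET_SHIFT = 40
OFFSET_MASK = (1 << VFIO_PCI_OFFSET_SHIFT) - 1

# Indexes 0..7 follow the order printed in the log; index 8 (VGA) is an
# assumption, as noted above.
REGION_NAMES = ["BAR0", "BAR1", "BAR2", "BAR3", "BAR4", "BAR5",
                "ROM", "config", "VGA"]


def region_offset(index, pos=0):
    """File offset for byte 'pos' inside region 'index' of the device fd."""
    if not 0 <= pos <= OFFSET_MASK:
        raise ValueError("position does not fit below the index bits")
    return (index << VFIO_PCI_OFFSET_SHIFT) | pos


def split_offset(offset):
    """Inverse of region_offset(): return (index, pos)."""
    return offset >> VFIO_PCI_OFFSET_SHIFT, offset & OFFSET_MASK


def describe(offset):
    """Human-readable form such as 'config+0x3d'."""
    index, pos = split_offset(offset)
    if index < len(REGION_NAMES):
        name = REGION_NAMES[index]
    else:
        name = "region %d" % index
    return "%s+0x%x" % (name, pos)


if __name__ == "__main__":
    print(describe(0x70000000000 + 0x3D))       # a config space read
    print(describe(region_offset(8, 0x3C3)))    # a legacy VGA port access
```

Decoding the offset of a stuck pread() this way would show whether the access targets the VGA region, which is the path that ends in vga_get() in the kernel trace quoted above.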

>
> 2) If the 'nouveau.ko' driver is loaded, it is even stranger. As soon
> as I start qemu, all my SATA links get a hard reset and the kernel
> freezes. SysRq no longer works and only a reboot helps. If needed, I
> can try to get some dumps from this freeze, since nothing more is
> written to the disks.
>
> But it gets even stranger. I moved the secondary card to another PCI
> slot, and then passing the ATI card through to QEMU started to work
> with the nouveau module loaded. But this worked only until I started
> the X server with the nouveau X driver. As soon as X was running and I
> started QEMU, it hung again in FUTEX_WAIT_PRIVATE.
>
> 3) Without loading 'nvidia.ko' or 'nouveau.ko' modules it works out of
> the box with several start/stop cycles. However I have no X in this
> case. ;)
>
> Any ideas? :)
>
> > > Alex
> > >
> >
> > --Maik
> >
>
> --Maik
>

--Maik

Re: VFIO VGA test branches

Alex Williamson-3
On Tue, 2013-05-28 at 20:45 +0200, Maik Broemme wrote:

> Hi,
>
> Maik Broemme <[hidden email]> wrote:
> > Hi Alex,
> >
> > Maik Broemme <[hidden email]> wrote:
> > > Hi Alex,
> > >
> > > Alex Williamson <[hidden email]> wrote:
> > > >
> > > > Good to hear.  It looks like you have the same motherboard as my AMD
> > > > test system.  An HD7850 in that system runs quite reliably with the
> > > > branches above although I do occasionally get VGA palette corruption.
> > > >
> > >
> > > Good to know. I'm using a Radeon HD7870 which works fine now. I have the
> > > same VGA palette corruption occasionally but only until Catalyst driver
> > > is loaded. So it happens sometimes during VGA init if Windows 7 boot
> > > logo is shown with very strange colors and went away if Catalyst driver
> > > is loaded.
> > >
> > > > Are you still require -vga cirrus or do the -vga none, x-vga=on cases
> > > > work now too?  Thanks,
> > > >
> > >
> > > No longer required, -vga none with x-vga=on work on your branches fine
> > > now. Not sure if there was something more changed because with original
> > > Fedora 3.9.2 kernel it still doesn't work.
> > >
> >
> > Alex, I have a strange issue now with either the 'vfio-vga-reset'
> > branches or with the stable 3.9.4 kernel. This is my 'lspci' output:
> >
> > 00:14.2 Audio device: Advanced Micro Devices [AMD] nee ATI SBx00 Azalia (Intel HDA) (rev 40)
> > 01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 520] (rev a1)
> > 01:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
> > 02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Pitcairn [Radeon HD 7800]
> > 02:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> >
> > The '01:00.0' is my primary device used for Linux and '02:00.0' my
> > secondary for QEMU. Two new different problems:
> >
> > 1) If the 'nvidia.ko' binary driver is loaded for the first card, QEMU
> > immediately gets stuck after startup and hangs with:
> >
> > 1140  futex(0x7f0ad9b21300, FUTEX_WAIT_PRIVATE, 2, NULL
> >
> > I have the complete strace output if needed. After that I can only
> > terminate qemu with 'kill -9' and if I start it again the following
> > Oops occurs:
> >
> > [  655.684121] ------------[ cut here ]------------
> > [  655.684134] WARNING: at lib/list_debug.c:29 __list_add+0x77/0xd0()
> > [  655.684151] Hardware name: GA-990FXA-UD3
> > [  655.684271] list_add corruption. next->prev should be prev (ffffffff81ca3d98), but was           (null). (next=ffff88041bc3fe08).
> > [  655.684477] Modules linked in: vhost_net macvtap macvlan tun arc4 md4 nls_utf8 cifs dns_resolver fscache vfio_pci vfio_iommu_type1 vfio bridge stp llc ip6table_filter ip6_tables it87 hwmon_vid snd_hda_codec_hdmi nvidia(POF) acpi_cpufreq mperf kvm_amd snd_hda_codec_realtek kvm crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_intel snd_hda_codec microcode edac_core snd_hwdep fam15h_power snd_seq edac_mce_amd snd_seq_device k10temp r8169 sp5100_tco snd_pcm mii i2c_piix4 snd_page_alloc snd_timer i2c_core snd soundcore mxm_wmi firewire_ohci firewire_core crc_itu_t wmi
> > [  655.685451] Pid: 2097, comm: qemu-system-x86 Tainted: PF          O 3.9.4-200.fc18.x86_64 #1
> > [  655.685642] Call Trace:
> > [  655.685738]  [<ffffffff8105f125>] warn_slowpath_common+0x75/0xa0
> > [  655.685851]  [<ffffffff8105f206>] warn_slowpath_fmt+0x46/0x50
> > [  655.685955]  [<ffffffff81316ef7>] __list_add+0x77/0xd0
> > [  655.686058]  [<ffffffff8108392c>] add_wait_queue+0x3c/0x60
> > [  655.686162]  [<ffffffff813f241d>] vga_get+0xdd/0x190
> > [  655.686266]  [<ffffffff81093e40>] ? try_to_wake_up+0x2d0/0x2d0
> > [  655.686373]  [<ffffffffa01ac625>] vfio_pci_vga_rw+0xb5/0x230 [vfio_pci]
> > [  655.686481]  [<ffffffffa01aa279>] vfio_pci_rw+0x39/0x80 [vfio_pci]
> > [  655.686587]  [<ffffffffa01aa30c>] vfio_pci_read+0x1c/0x20 [vfio_pci]
> > [  655.686701]  [<ffffffffa01a40e3>] vfio_device_fops_read+0x23/0x30 [vfio]
> > [  655.686814]  [<ffffffff811a01b9>] vfs_read+0xa9/0x180
> > [  655.686915]  [<ffffffff811a05ba>] sys_pread64+0x9a/0xb0
> > [  655.687018]  [<ffffffff81669f59>] system_call_fastpath+0x16/0x1b
> > [  655.687123] ---[ end trace a68eabc3660237b1 ]---
> >
> > This is always reproducible. I know it is the binary driver and maybe
> > nobody cares but it is widely used. :)
>
> Here is the DEBUG_VFIO output:
>
> vfio: vfio_initfn(0000:04:00.0) group 14
> vfio: region_add 0 - afffffff [0x7f8698000000]
> vfio: SKIPPING region_add fec00000 - fec00fff
> vfio: SKIPPING region_add fed00000 - fed003ff
> vfio: SKIPPING region_add fee00000 - feefffff
> vfio: region_add fffe0000 - ffffffff [0x7f88aa400000]
> vfio: region_add 100000000 - 24fffffff [0x7f8748000000]
> vfio: Device 0000:04:00.0 flags: 3, regions: 9, irgs: 4
> vfio: Device 0000:04:00.0 region 0:
> vfio:   size: 0x10000000, offset: 0x0, flags: 0x7
> vfio: Device 0000:04:00.0 region 1:
> vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
> vfio: Device 0000:04:00.0 region 2:
> vfio:   size: 0x40000, offset: 0x20000000000, flags: 0x7
> vfio: Device 0000:04:00.0 region 3:
> vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
> vfio: Device 0000:04:00.0 region 4:
> vfio:   size: 0x100, offset: 0x40000000000, flags: 0x3
> vfio: Device 0000:04:00.0 region 5:
> vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
> vfio: Device 0000:04:00.0 ROM:
> vfio:   size: 0x20000, offset: 0x60000000000, flags: 0x1
> vfio: Device 0000:04:00.0 config:
> vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
> vfio: vfio_load_rom(0000:04:00.0)
> vfio: Enabled ATI/AMD BAR2 0x4000 quirk for device 0000:04:00.0
> vfio: Enabled ATI/AMD BAR4 window quirk for device 0000:04:00.0
> vfio: Enabled ATI/AMD quirk 0x3c3 BAR4 for device 0000:04:00.0
> vfio: 0000:04:00.0 PCI MSI CAP @0xa0
> vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> vfio: vfio_enable_intx(0000:04:00.0)
> vfio: vfio_initfn(0000:04:00.1) group 14
> vfio: Device 0000:04:00.1 flags: 3, regions: 9, irgs: 4
> vfio: Device 0000:04:00.1 region 0:
> vfio:   size: 0x4000, offset: 0x0, flags: 0x7
> vfio: Device 0000:04:00.1 region 1:
> vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
> vfio: Device 0000:04:00.1 region 2:
> vfio:   size: 0x0, offset: 0x20000000000, flags: 0x0
> vfio: Device 0000:04:00.1 region 3:
> vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
> vfio: Device 0000:04:00.1 region 4:
> vfio:   size: 0x0, offset: 0x40000000000, flags: 0x0
> vfio: Device 0000:04:00.1 region 5:
> vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
> vfio: Device 0000:04:00.1 ROM:
> vfio:   size: 0x0, offset: 0x60000000000, flags: 0x0
> vfio: Device 0000:04:00.1 config:
> vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
> vfio: 0000:04:00.1 PCI MSI CAP @0xa0
> vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> vfio: vfio_enable_intx(0000:04:00.1)
> vfio: region_del 0 - afffffff
> vfio: region_add 0 - bffff [0x7f8698000000]
> vfio: region_add c0000 - dffff [0x7f88aa200000]
> vfio: region_add e0000 - fffff [0x7f88aa400000]
> vfio: region_add 100000 - afffffff [0x7f8698100000]
> vfio: vfio_pci_reset(0000:04:00.0)
> vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
> vfio: vfio_disable_intx(0000:04:00.0)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x54, len=0x2) 0
> vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 3
> vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x0, len=0x2)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> vfio: vfio_enable_intx(0000:04:00.0)
> vfio: vfio_pci_reset(0000:04:00.1)
> vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
> vfio: vfio_disable_intx(0000:04:00.1)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x54, len=0x2) 0
> vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 6
> vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x0, len=0x2)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> vfio: vfio_enable_intx(0000:04:00.1)
> vfio: region_del 0 - bffff
> vfio: region_del c0000 - dffff
> vfio: region_add 0 - c7fff [0x7f8698000000]
> vfio: region_add c8000 - dffff [0x7f88aa208000]
> vfio: region_del 0 - c7fff
> vfio: region_del c8000 - dffff
> vfio: region_add 0 - cffff [0x7f8698000000]
> vfio: region_add d0000 - dffff [0x7f88aa210000]
> vfio: region_del 0 - cffff
> vfio: region_del d0000 - dffff
> vfio: region_add 0 - d7fff [0x7f8698000000]
> vfio: region_add d8000 - dffff [0x7f88aa218000]
> vfio: region_del 0 - d7fff
> vfio: region_del d8000 - dffff
> vfio: region_add 0 - dffff [0x7f8698000000]
> vfio: region_del 0 - dffff
> vfio: region_del e0000 - fffff
> vfio: region_add 0 - e7fff [0x7f8698000000]
> vfio: region_add e8000 - fffff [0x7f88aa408000]
> vfio: region_del 0 - e7fff
> vfio: region_del e8000 - fffff
> vfio: region_add 0 - effff [0x7f8698000000]
> vfio: region_add f0000 - fffff [0x7f88aa410000]
> vfio: region_del 0 - effff
> vfio: region_del f0000 - fffff
> vfio: region_del 100000 - afffffff
> vfio: region_add 0 - afffffff [0x7f8698000000]
> vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
> vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
> vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
> vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
> vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x4) 68181002
> vfio: vfio_pci_read_config(0000:04:00.0, @0x8, len=0x4) 3000000
> vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x4) aab01002
> vfio: vfio_pci_read_config(0000:04:00.1, @0x8, len=0x4) 4030000
> vfio: vfio_pci_read_config(0000:04:00.1, @0xe, len=0x1) 80
> vfio: SKIPPING region_add b0000000 - bfffffff
> vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) c000000c
> vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) f000000c
> vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xc000000c, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) ffffffff
> vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fde80004
> vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fffc0004
> vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfde80004, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) ffffffff
> vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ce01
> vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ffffff01
> vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xce01, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffff800, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
> vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) fdefc004
> vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) ffffc004
> vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfdefc004, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) ffffffff
> vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0xffffffff, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0x0, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0xfffff800, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0x0, len=0x4)
> vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xc000, len=0x4)
> vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfea00000, len=0x4)
> vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
> vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
> vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfea60000, len=0x4)
> vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
> vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xe0000000, len=0x4)
> vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
> vfio: SKIPPING region_add feb40000 - feb4002f
> vfio: SKIPPING region_add feb40800 - feb40807
> vfio: SKIPPING region_add feb41000 - feb4101f
> vfio: SKIPPING region_add feb41800 - feb41807
> vfio: vfio_update_irq(0000:04:00.1) IRQ moved 20 -> 10
> vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
> vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> vfio: vfio_update_irq(0000:04:00.0) IRQ moved 23 -> 11
> vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
> vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> vfio: SKIPPING region_add feb42000 - feb42fff
> vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> vfio: vfio_pci_write_config(0000:04:00.0, @0x3c, 0xb, len=0x1)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 0
> vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
> vfio: region_add e0000000 - efffffff [0x7f8688000000]
> vfio: region_add fea00000 - fea03fff [0x7f88aa7b8000]
> vfio: SKIPPING region_add fea04000 - fea04fff
> vfio: region_add fea05000 - fea3ffff [0x7f88aa7bd000]
> vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> vfio: vfio_pci_write_config(0000:04:00.1, @0x3c, 0xa, len=0x1)
> vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 0
> vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x103, len=0x2)
> vfio: region_add fea60000 - fea63fff [0x7f88bc710000]
> vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
> vfio: region_del 0 - afffffff
> vfio: region_add 0 - 9ffff [0x7f8698000000]
> vfio: SKIPPING region_add a0000 - bffff
> vfio: region_add c0000 - afffffff [0x7f86980c0000]
> vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fea40000
> vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffffffe, len=0x4)
> vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000

Here the option ROM was sized

> vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40001, len=0x4)

Then enabled

> vfio: region_add fea40000 - fea5ffff [0x7f88a9e00000]

Adding this memory region

> vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
> vfio: region_del fea40000 - fea5ffff

Then disabled, removing the memory region.  Presumably between the
enable and disable the contents were read and copied to 0xc0000, which
is where the VGA BIOS is shadowed.
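
The sizing step can be checked by hand. The following is only a sketch, assuming the standard PCI expansion ROM BAR layout (address in bits 31:11, enable in bit 0); the helper names rom_size and rom_window are made up for illustration and are not taken from QEMU or the kernel. It decodes the values seen in the log above:

```python
# Sketch only: decode the expansion ROM BAR values from the log above.
# Assumes the standard PCI layout: bits 31:11 = base address, bit 0 = enable.
ROM_ADDR_MASK = 0xFFFFF800
ROM_ENABLE = 0x1


def rom_size(readback: int) -> int:
    """Size in bytes implied by the value read back after writing all ones."""
    mask = readback & ROM_ADDR_MASK
    if mask == 0:
        return 0  # register reads back zero: no ROM BAR implemented
    # Two's complement of the writable address bits, truncated to 32 bits.
    return (~mask + 1) & 0xFFFFFFFF


def rom_window(bar_value: int, size: int):
    """Return (first address, last address, enabled) for a programmed ROM BAR."""
    base = bar_value & ROM_ADDR_MASK
    return base, base + size - 1, bool(bar_value & ROM_ENABLE)


size = rom_size(0xFFFE0000)                   # readback after the sizing write
first, last, enabled = rom_window(0xFEA40001, size)
print(hex(size), hex(first), hex(last), enabled)
```

With the readback of fffe0000 this gives a 0x20000 byte ROM, which agrees with the "ROM: size: 0x20000" line in the region dump and with the fea40000 - fea5ffff region that appears while the ROM is enabled.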

> Here is the strace output from this failure:
>
> 1110  ioctl(14, KVM_RUN, 0)             = 0
> 1110  pread(20,  <unfinished ...>
> 1099  <... poll resumed> )              = 1 ([{fd=0, revents=POLLIN}])
> 1099  futex(0x7ff73ca62fa0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> 1109  <... futex resumed> )             = -1 ETIMEDOUT (Connection timed out)
> 1109  madvise(0x7ff72fe17000, 8368128, MADV_DONTNEED) = 0
> 1109  _exit(0)                          = ?
> 1109  +++ exited with 0 +++
>
> From reading the source 'hw/misc/vfio.c' it looks like the following
> in 'vfio_vga_read' never finished:
>
>     if (pread(vga->fd, &buf, size, offset) != size) {
>         error_report("%s(,0x%"HWADDR_PRIx", %d) failed: %m",
>                      __func__, region->offset + addr, size);
>         return (uint64_t)-1;
>     }

I agree.  Every VGA access requires us to lock the VGA resources on the
device, so if we can't get the lock, we stop making progress.  I took a
look at Xorg last night and it seems like it should be taking and
releasing the VGA arbiter lock in a way that would be compatible with
our use.  That's in the xserver, not the actual display hardware driver,
and it wraps access functions in the arbiter support, so should be
transparent to the drivers.  So for nouveau, it seems like it should
work.  For nvidia, we don't really know; it could be locking the device
from the kernel module.

You could instrument vga_get, vga_tryget, and vga_put to figure out
what's happening.  It might be enough to look at /dev/vga_arbiter at
each step of the sequence that reproduces the problem
(sudo head --lines=1 /dev/vga_arbiter).  Thanks,

Alex

> >
> > 2) If the 'nouveau.ko' driver is loaded it is even stranger. As soon
> > as I start qemu, all my SATA links get a hard reset and the kernel
> > freezes. No SysRqs work anymore and only a reboot helps. If needed I
> > can try to get some dumps from this freeze, because it writes nothing
> > more to the disks.
> >
> > But it gets stranger still. I moved the secondary card to another PCI
> > slot, and then it started to work with the nouveau module loaded and
> > the ATI card passed through to QEMU. But this worked only until I
> > started the X server with the nouveau X driver. As soon as X was
> > running and I started QEMU, it hung again in FUTEX_WAIT_PRIVATE.
> >
> > 3) Without loading 'nvidia.ko' or 'nouveau.ko' modules it works out of
> > the box with several start/stop cycles. However I have no X in this
> > case. ;)
> >
> > Any ideas? :)
> >
> > > > Alex
> > > >
> > >
> > > --Maik
> > >
> >
> > --Maik
> >
>
> --Maik





Re: VFIO VGA test branches

Maik Broemme
Hi,

Alex Williamson <[hidden email]> wrote:

> On Tue, 2013-05-28 at 20:45 +0200, Maik Broemme wrote:
> > Hi,
> >
> > Maik Broemme <[hidden email]> wrote:
> > > Hi Alex,
> > >
> > > Maik Broemme <[hidden email]> wrote:
> > > > Hi Alex,
> > > >
> > > > Alex Williamson <[hidden email]> wrote:
> > > > >
> > > > > Good to hear.  It looks like you have the same motherboard as my AMD
> > > > > test system.  An HD7850 in that system runs quite reliably with the
> > > > > branches above although I do occasionally get VGA palette corruption.
> > > > >
> > > >
> > > > Good to know. I'm using a Radeon HD7870 which works fine now. I
> > > > occasionally see the same VGA palette corruption, but only until the
> > > > Catalyst driver is loaded. It sometimes happens during VGA init, when
> > > > the Windows 7 boot logo is shown with very strange colors, and goes
> > > > away once the Catalyst driver is loaded.
> > > >
> > > > > Do you still require -vga cirrus or do the -vga none, x-vga=on cases
> > > > > work now too?  Thanks,
> > > > >
> > > >
> > > > No longer required; -vga none with x-vga=on works fine on your
> > > > branches now. Not sure whether something else changed, because with
> > > > the original Fedora 3.9.2 kernel it still doesn't work.
> > > >
> > >
> > > Alex, I have a strange issue now with either the 'vfio-vga-reset'
> > > branches or with the stable 3.9.4 kernel. This is my 'lspci' output:
> > >
> > > 00:14.2 Audio device: Advanced Micro Devices [AMD] nee ATI SBx00 Azalia (Intel HDA) (rev 40)
> > > 01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 520] (rev a1)
> > > 01:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
> > > 02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Pitcairn [Radeon HD 7800]
> > > 02:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> > >
> > > The '01:00.0' is my primary device used for Linux and '02:00.0' my
> > > secondary for QEMU. Two new, different problems:
> > >
> > > 1) If the 'nvidia.ko' binary driver is loaded for the first card, QEMU
> > > immediately gets stuck after startup and hangs with:
> > >
> > > 1140  futex(0x7f0ad9b21300, FUTEX_WAIT_PRIVATE, 2, NULL
> > >
> > > I have the complete strace output if needed. After that I can only
> > > terminate qemu with 'kill -9' and if I start it again the following
> > > Oops occurs:
> > >
> > > [  655.684121] ------------[ cut here ]------------
> > > [  655.684134] WARNING: at lib/list_debug.c:29 __list_add+0x77/0xd0()
> > > [  655.684151] Hardware name: GA-990FXA-UD3
> > > [  655.684271] list_add corruption. next->prev should be prev (ffffffff81ca3d98), but was           (null). (next=ffff88041bc3fe08).
> > > [  655.684477] Modules linked in: vhost_net macvtap macvlan tun arc4 md4 nls_utf8 cifs dns_resolver fscache vfio_pci vfio_iommu_type1 vfio bridge stp llc ip6table_filter ip6_tables it87 hwmon_vid snd_hda_codec_hdmi nvidia(POF) acpi_cpufreq mperf kvm_amd snd_hda_codec_realtek kvm crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_intel snd_hda_codec microcode edac_core snd_hwdep fam15h_power snd_seq edac_mce_amd snd_seq_device k10temp r8169 sp5100_tco snd_pcm mii i2c_piix4 snd_page_alloc snd_timer i2c_core snd soundcore mxm_wmi firewire_ohci firewire_core crc_itu_t wmi
> > > [  655.685451] Pid: 2097, comm: qemu-system-x86 Tainted: PF          O 3.9.4-200.fc18.x86_64 #1
> > > [  655.685642] Call Trace:
> > > [  655.685738]  [<ffffffff8105f125>] warn_slowpath_common+0x75/0xa0
> > > [  655.685851]  [<ffffffff8105f206>] warn_slowpath_fmt+0x46/0x50
> > > [  655.685955]  [<ffffffff81316ef7>] __list_add+0x77/0xd0
> > > [  655.686058]  [<ffffffff8108392c>] add_wait_queue+0x3c/0x60
> > > [  655.686162]  [<ffffffff813f241d>] vga_get+0xdd/0x190
> > > [  655.686266]  [<ffffffff81093e40>] ? try_to_wake_up+0x2d0/0x2d0
> > > [  655.686373]  [<ffffffffa01ac625>] vfio_pci_vga_rw+0xb5/0x230 [vfio_pci]
> > > [  655.686481]  [<ffffffffa01aa279>] vfio_pci_rw+0x39/0x80 [vfio_pci]
> > > [  655.686587]  [<ffffffffa01aa30c>] vfio_pci_read+0x1c/0x20 [vfio_pci]
> > > [  655.686701]  [<ffffffffa01a40e3>] vfio_device_fops_read+0x23/0x30 [vfio]
> > > [  655.686814]  [<ffffffff811a01b9>] vfs_read+0xa9/0x180
> > > [  655.686915]  [<ffffffff811a05ba>] sys_pread64+0x9a/0xb0
> > > [  655.687018]  [<ffffffff81669f59>] system_call_fastpath+0x16/0x1b
> > > [  655.687123] ---[ end trace a68eabc3660237b1 ]---
> > >
> > > This is always reproducible. I know it is the binary driver and maybe
> > > nobody cares but it is widely used. :)
> >
> > Here is the DEBUG_VFIO output:
> >
> > vfio: vfio_initfn(0000:04:00.0) group 14
> > vfio: region_add 0 - afffffff [0x7f8698000000]
> > vfio: SKIPPING region_add fec00000 - fec00fff
> > vfio: SKIPPING region_add fed00000 - fed003ff
> > vfio: SKIPPING region_add fee00000 - feefffff
> > vfio: region_add fffe0000 - ffffffff [0x7f88aa400000]
> > vfio: region_add 100000000 - 24fffffff [0x7f8748000000]
> > vfio: Device 0000:04:00.0 flags: 3, regions: 9, irgs: 4
> > vfio: Device 0000:04:00.0 region 0:
> > vfio:   size: 0x10000000, offset: 0x0, flags: 0x7
> > vfio: Device 0000:04:00.0 region 1:
> > vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
> > vfio: Device 0000:04:00.0 region 2:
> > vfio:   size: 0x40000, offset: 0x20000000000, flags: 0x7
> > vfio: Device 0000:04:00.0 region 3:
> > vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
> > vfio: Device 0000:04:00.0 region 4:
> > vfio:   size: 0x100, offset: 0x40000000000, flags: 0x3
> > vfio: Device 0000:04:00.0 region 5:
> > vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
> > vfio: Device 0000:04:00.0 ROM:
> > vfio:   size: 0x20000, offset: 0x60000000000, flags: 0x1
> > vfio: Device 0000:04:00.0 config:
> > vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
> > vfio: vfio_load_rom(0000:04:00.0)
> > vfio: Enabled ATI/AMD BAR2 0x4000 quirk for device 0000:04:00.0
> > vfio: Enabled ATI/AMD BAR4 window quirk for device 0000:04:00.0
> > vfio: Enabled ATI/AMD quirk 0x3c3 BAR4 for device 0000:04:00.0
> > vfio: 0000:04:00.0 PCI MSI CAP @0xa0
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > vfio: vfio_enable_intx(0000:04:00.0)
> > vfio: vfio_initfn(0000:04:00.1) group 14
> > vfio: Device 0000:04:00.1 flags: 3, regions: 9, irgs: 4
> > vfio: Device 0000:04:00.1 region 0:
> > vfio:   size: 0x4000, offset: 0x0, flags: 0x7
> > vfio: Device 0000:04:00.1 region 1:
> > vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
> > vfio: Device 0000:04:00.1 region 2:
> > vfio:   size: 0x0, offset: 0x20000000000, flags: 0x0
> > vfio: Device 0000:04:00.1 region 3:
> > vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
> > vfio: Device 0000:04:00.1 region 4:
> > vfio:   size: 0x0, offset: 0x40000000000, flags: 0x0
> > vfio: Device 0000:04:00.1 region 5:
> > vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
> > vfio: Device 0000:04:00.1 ROM:
> > vfio:   size: 0x0, offset: 0x60000000000, flags: 0x0
> > vfio: Device 0000:04:00.1 config:
> > vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
> > vfio: 0000:04:00.1 PCI MSI CAP @0xa0
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > vfio: vfio_enable_intx(0000:04:00.1)
> > vfio: region_del 0 - afffffff
> > vfio: region_add 0 - bffff [0x7f8698000000]
> > vfio: region_add c0000 - dffff [0x7f88aa200000]
> > vfio: region_add e0000 - fffff [0x7f88aa400000]
> > vfio: region_add 100000 - afffffff [0x7f8698100000]
> > vfio: vfio_pci_reset(0000:04:00.0)
> > vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
> > vfio: vfio_disable_intx(0000:04:00.0)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x54, len=0x2) 0
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 3
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x0, len=0x2)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > vfio: vfio_enable_intx(0000:04:00.0)
> > vfio: vfio_pci_reset(0000:04:00.1)
> > vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
> > vfio: vfio_disable_intx(0000:04:00.1)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x54, len=0x2) 0
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 6
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x0, len=0x2)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > vfio: vfio_enable_intx(0000:04:00.1)
> > vfio: region_del 0 - bffff
> > vfio: region_del c0000 - dffff
> > vfio: region_add 0 - c7fff [0x7f8698000000]
> > vfio: region_add c8000 - dffff [0x7f88aa208000]
> > vfio: region_del 0 - c7fff
> > vfio: region_del c8000 - dffff
> > vfio: region_add 0 - cffff [0x7f8698000000]
> > vfio: region_add d0000 - dffff [0x7f88aa210000]
> > vfio: region_del 0 - cffff
> > vfio: region_del d0000 - dffff
> > vfio: region_add 0 - d7fff [0x7f8698000000]
> > vfio: region_add d8000 - dffff [0x7f88aa218000]
> > vfio: region_del 0 - d7fff
> > vfio: region_del d8000 - dffff
> > vfio: region_add 0 - dffff [0x7f8698000000]
> > vfio: region_del 0 - dffff
> > vfio: region_del e0000 - fffff
> > vfio: region_add 0 - e7fff [0x7f8698000000]
> > vfio: region_add e8000 - fffff [0x7f88aa408000]
> > vfio: region_del 0 - e7fff
> > vfio: region_del e8000 - fffff
> > vfio: region_add 0 - effff [0x7f8698000000]
> > vfio: region_add f0000 - fffff [0x7f88aa410000]
> > vfio: region_del 0 - effff
> > vfio: region_del f0000 - fffff
> > vfio: region_del 100000 - afffffff
> > vfio: region_add 0 - afffffff [0x7f8698000000]
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
> > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
> > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x4) 68181002
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x8, len=0x4) 3000000
> > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x4) aab01002
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x8, len=0x4) 4030000
> > vfio: vfio_pci_read_config(0000:04:00.1, @0xe, len=0x1) 80
> > vfio: SKIPPING region_add b0000000 - bfffffff
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) c000000c
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) f000000c
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xc000000c, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) ffffffff
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fde80004
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fffc0004
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfde80004, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) ffffffff
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ce01
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ffffff01
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xce01, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffff800, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) fdefc004
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) ffffc004
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfdefc004, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) ffffffff
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0xffffffff, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0x0, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0xfffff800, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0x0, len=0x4)
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xc000, len=0x4)
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfea00000, len=0x4)
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfea60000, len=0x4)
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xe0000000, len=0x4)
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
> > vfio: SKIPPING region_add feb40000 - feb4002f
> > vfio: SKIPPING region_add feb40800 - feb40807
> > vfio: SKIPPING region_add feb41000 - feb4101f
> > vfio: SKIPPING region_add feb41800 - feb41807
> > vfio: vfio_update_irq(0000:04:00.1) IRQ moved 20 -> 10
> > vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
> > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > vfio: vfio_update_irq(0000:04:00.0) IRQ moved 23 -> 11
> > vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
> > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > vfio: SKIPPING region_add feb42000 - feb42fff
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x3c, 0xb, len=0x1)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 0
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
> > vfio: region_add e0000000 - efffffff [0x7f8688000000]
> > vfio: region_add fea00000 - fea03fff [0x7f88aa7b8000]
> > vfio: SKIPPING region_add fea04000 - fea04fff
> > vfio: region_add fea05000 - fea3ffff [0x7f88aa7bd000]
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x3c, 0xa, len=0x1)
> > vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 0
> > vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x103, len=0x2)
> > vfio: region_add fea60000 - fea63fff [0x7f88bc710000]
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
> > vfio: region_del 0 - afffffff
> > vfio: region_add 0 - 9ffff [0x7f8698000000]
> > vfio: SKIPPING region_add a0000 - bffff
> > vfio: region_add c0000 - afffffff [0x7f86980c0000]
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fea40000
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffffffe, len=0x4)
> > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
>
> Here the option ROM was sized
>
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40001, len=0x4)
>
> Then enabled
>
> > vfio: region_add fea40000 - fea5ffff [0x7f88a9e00000]
>
> Adding this memory region
>
> > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
> > vfio: region_del fea40000 - fea5ffff
>
> Then disabled, removing the memory region.  Presumably between the
> enable and disable the contents were read and copied to 0xc0000, which
> is where the VGA BIOS is shadowed.
>
> > Here is the strace output from this failure:
> >
> > 1110  ioctl(14, KVM_RUN, 0)             = 0
> > 1110  pread(20,  <unfinished ...>
> > 1099  <... poll resumed> )              = 1 ([{fd=0, revents=POLLIN}])
> > 1099  futex(0x7ff73ca62fa0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> > 1109  <... futex resumed> )             = -1 ETIMEDOUT (Connection timed out)
> > 1109  madvise(0x7ff72fe17000, 8368128, MADV_DONTNEED) = 0
> > 1109  _exit(0)                          = ?
> > 1109  +++ exited with 0 +++
> >
> > From reading the source 'hw/misc/vfio.c' it looks like the following
> > in 'vfio_vga_read' never finished:
> >
> >     if (pread(vga->fd, &buf, size, offset) != size) {
> >         error_report("%s(,0x%"HWADDR_PRIx", %d) failed: %m",
> >                      __func__, region->offset + addr, size);
> >         return (uint64_t)-1;
> >     }
>
> I agree.  Every VGA access requires us to lock the VGA resources on the
> device, so if we can't get the lock, we stop making progress.  I took a
> look at Xorg last night and it seems like it should be taking and
> releasing the VGA arbiter lock in a way that would be compatible with
> our use.  That's in the xserver, not the actual display hardware driver,
> and it wraps access functions in the arbiter support, so should be
> transparent to the drivers.  So for nouveau, it seems like it should
> work.  For nvidia, we don't really know, it could be locking the device
> from the kernel module.
>
> You could instrument vga_get, vga_tryget, and vga_put to figure out
> what's happening.  It might be enough to look at /dev/vga_arbiter at
> each step in the sequence to reproduce (sudo head
> --lines=1 /dev/vga_arbiter).  Thanks,
>

I've played with it a bit more and there are some differences in
behavior between the 'nouveau' and 'nvidia' drivers. As soon as I load
the binary driver I see the following:

[   18.628676] [drm] Initialized drm 1.1.0 20060810
[   18.668038] nvidia: module license 'NVIDIA' taints kernel.
[   18.668107] Disabling lock debugging due to kernel taint
[   18.676638] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
[   18.676722] vgaarb: transferring owner from PCI:0000:01:00.0 to PCI:0000:04:00.0
[   18.677007] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 0
[   18.677090] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  319.23  Thu May 16 19:36:02 PDT 2013

root@homer:~# head --lines=1 /dev/vga_arbiter
count:2,PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none(0:0)
root@homer:~# modprobe nvidia
root@homer:~# head --lines=1 /dev/vga_arbiter
count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=io+mem(1:1)
root@homer:~# /usr/local/bin/qemu-system-x86_64 \
        -L /usr/local/share/qemu \
        -L /usr/local/share/qemu \
        -M q35 -enable-kvm -cpu host -smp cores=4,threads=1,sockets=1 \
        -m 8192 -rtc base=localtime -k de -nodefaults -vga none \
        -drive file=/home/mbroemme/.kvm/maggie.img,id=drive0,if=none,cache=none,aio=threads \
        -device virtio-blk-pci,drive=drive0,ioeventfd=on \
        -device ioh3420,id=pcie0,multifunction=on \
        -device vfio-pci,host=04:00.0,addr=0.0,bus=pcie0,multifunction=on,x-vga=on \
        -device vfio-pci,host=04:00.1,addr=0.1,bus=pcie0 -monitor stdio -nographic
root@homer:~# head --lines=1 /dev/vga_arbiter
count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=io+mem(1:1)

It looks like the nvidia binary driver locks some resources. :( With
the nouveau driver, on the other hand, VGA arbitration only starts once
I start the VM with VGA passthrough:

[  178.187706] vfio-pci 0000:04:00.0: enabling device (0000 -> 0003)
[  178.209599] vfio_ecap_init: 0000:04:00.0 hiding ecap 0x19@0x270
[  178.209631] vfio_ecap_init: 0000:04:00.0 hiding ecap 0x1b@0x2d0
[  181.198191] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
[  181.198208] vgaarb: transferring owner from PCI:0000:01:00.0 to PCI:0000:04:00.0

root@homer:~# head --lines=1 /dev/vga_arbiter
count:2,PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none(0:0)
root@homer:~# /usr/local/bin/qemu-system-x86_64 \
        -L /usr/local/share/qemu \
        -L /usr/local/share/qemu \
        -M q35 -enable-kvm -cpu host -smp cores=4,threads=1,sockets=1 \
        -m 8192 -rtc base=localtime -k de -nodefaults -vga none \
        -drive file=/home/mbroemme/.kvm/maggie.img,id=drive0,if=none,cache=none,aio=threads \
        -device virtio-blk-pci,drive=drive0,ioeventfd=on \
        -device ioh3420,id=pcie0,multifunction=on \
        -device vfio-pci,host=04:00.0,addr=0.0,bus=pcie0,multifunction=on,x-vga=on \
        -device vfio-pci,host=04:00.1,addr=0.1,bus=pcie0 -monitor stdio -nographic
root@homer:~# head --lines=1 /dev/vga_arbiter
count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=none(0:0)
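
For comparing these status lines across steps, here is a minimal parsing sketch. It assumes the first line of /dev/vga_arbiter always has the shape shown above, and that the two numbers in parentheses are the IO and memory lock counts (my reading of the vgaarb output, not something confirmed in this thread). The names `ArbiterStatus`, `parse_status` and `is_locked` are made up for illustration.

```python
import re
from typing import NamedTuple

# Shape of the first line of /dev/vga_arbiter as seen in this thread, e.g.
# count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=io+mem(1:1)
STATUS_RE = re.compile(
    r"count:(?P<count>\d+),PCI:(?P<pci>[0-9a-fA-F:.]+),"
    r"decodes=(?P<decodes>[a-z+]+),owns=(?P<owns>[a-z+]+),"
    r"locks=(?P<locks>[a-z+]+)\((?P<io>\d+):(?P<mem>\d+)\)"
)


class ArbiterStatus(NamedTuple):
    count: int            # number of devices taking part in arbitration
    pci: str              # PCI address of the current target device
    decodes: str          # legacy resources the device decodes
    owns: str             # legacy resources the device currently owns
    locks: str            # legacy resources currently locked
    io_lock_count: int    # assumption: first number is the IO lock count
    mem_lock_count: int   # assumption: second number is the memory lock count


def parse_status(line: str) -> ArbiterStatus:
    """Parse one status line; raise ValueError if the shape differs."""
    m = STATUS_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unexpected vga_arbiter line: {line!r}")
    return ArbiterStatus(
        int(m["count"]), m["pci"], m["decodes"], m["owns"],
        m["locks"], int(m["io"]), int(m["mem"]),
    )


def is_locked(status: ArbiterStatus) -> bool:
    """True if any legacy VGA resource is reported as locked."""
    return status.locks != "none"


# The two kinds of lines quoted in this thread.
samples = [
    "count:2,PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none(0:0)",
    "count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=io+mem(1:1)",
]
for sample in samples:
    status = parse_status(sample)
    print(status.pci, "locked" if is_locked(status) else "not locked")
```

Running it on the output of `head --lines=1 /dev/vga_arbiter` before and after each step makes it easy to spot the point at which `locks` stops being `none`.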

> Alex
>
> > >
> > > 2) If the 'nouveau.ko' driver is loaded it is even stranger. As soon
> > > as I start qemu all my SATA links get a hard reset and the kernel
> > > freezes. SysRq no longer works and only a reboot helps. If needed I
> > > can see whether I can get some dumps from this freeze, because
> > > nothing more is written to the disks.
> > >
> > > But it gets even stranger. I put the secondary card in another PCI
> > > slot, and then it started to work with the nouveau module loaded and
> > > the ATI card passed through to QEMU. But this worked only until I
> > > started the X server with the nouveau X driver. As soon as X was
> > > running and I started QEMU, it hung again in FUTEX_WAIT_PRIVATE.
> > >
> > > 3) Without loading 'nvidia.ko' or 'nouveau.ko' modules it works out of
> > > the box with several start/stop cycles. However I have no X in this
> > > case. ;)
> > >
> > > Any ideas? :)
> > >
> > > > > Alex
> > > > >
> > > >
> > > > --Maik
> > > >
> > >
> > > --Maik
> > >
> >
> > --Maik
>
>
>

Kind regards
Maik

Parallels, Inc.
http://www.parallels.com/

Re: VFIO VGA test branches

Maik Broemme
Hi,

Maik Broemme <[hidden email]> wrote:

> > >
> > > Here is the DEBUG_VFIO output:
> > >
> > > vfio: vfio_initfn(0000:04:00.0) group 14
> > > vfio: region_add 0 - afffffff [0x7f8698000000]
> > > vfio: SKIPPING region_add fec00000 - fec00fff
> > > vfio: SKIPPING region_add fed00000 - fed003ff
> > > vfio: SKIPPING region_add fee00000 - feefffff
> > > vfio: region_add fffe0000 - ffffffff [0x7f88aa400000]
> > > vfio: region_add 100000000 - 24fffffff [0x7f8748000000]
> > > vfio: Device 0000:04:00.0 flags: 3, regions: 9, irgs: 4
> > > vfio: Device 0000:04:00.0 region 0:
> > > vfio:   size: 0x10000000, offset: 0x0, flags: 0x7
> > > vfio: Device 0000:04:00.0 region 1:
> > > vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
> > > vfio: Device 0000:04:00.0 region 2:
> > > vfio:   size: 0x40000, offset: 0x20000000000, flags: 0x7
> > > vfio: Device 0000:04:00.0 region 3:
> > > vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
> > > vfio: Device 0000:04:00.0 region 4:
> > > vfio:   size: 0x100, offset: 0x40000000000, flags: 0x3
> > > vfio: Device 0000:04:00.0 region 5:
> > > vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
> > > vfio: Device 0000:04:00.0 ROM:
> > > vfio:   size: 0x20000, offset: 0x60000000000, flags: 0x1
> > > vfio: Device 0000:04:00.0 config:
> > > vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
> > > vfio: vfio_load_rom(0000:04:00.0)
> > > vfio: Enabled ATI/AMD BAR2 0x4000 quirk for device 0000:04:00.0
> > > vfio: Enabled ATI/AMD BAR4 window quirk for device 0000:04:00.0
> > > vfio: Enabled ATI/AMD quirk 0x3c3 BAR4 for device 0000:04:00.0
> > > vfio: 0000:04:00.0 PCI MSI CAP @0xa0
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > > vfio: vfio_enable_intx(0000:04:00.0)
> > > vfio: vfio_initfn(0000:04:00.1) group 14
> > > vfio: Device 0000:04:00.1 flags: 3, regions: 9, irgs: 4
> > > vfio: Device 0000:04:00.1 region 0:
> > > vfio:   size: 0x4000, offset: 0x0, flags: 0x7
> > > vfio: Device 0000:04:00.1 region 1:
> > > vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
> > > vfio: Device 0000:04:00.1 region 2:
> > > vfio:   size: 0x0, offset: 0x20000000000, flags: 0x0
> > > vfio: Device 0000:04:00.1 region 3:
> > > vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
> > > vfio: Device 0000:04:00.1 region 4:
> > > vfio:   size: 0x0, offset: 0x40000000000, flags: 0x0
> > > vfio: Device 0000:04:00.1 region 5:
> > > vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
> > > vfio: Device 0000:04:00.1 ROM:
> > > vfio:   size: 0x0, offset: 0x60000000000, flags: 0x0
> > > vfio: Device 0000:04:00.1 config:
> > > vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
> > > vfio: 0000:04:00.1 PCI MSI CAP @0xa0
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > > vfio: vfio_enable_intx(0000:04:00.1)
> > > vfio: region_del 0 - afffffff
> > > vfio: region_add 0 - bffff [0x7f8698000000]
> > > vfio: region_add c0000 - dffff [0x7f88aa200000]
> > > vfio: region_add e0000 - fffff [0x7f88aa400000]
> > > vfio: region_add 100000 - afffffff [0x7f8698100000]
> > > vfio: vfio_pci_reset(0000:04:00.0)
> > > vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
> > > vfio: vfio_disable_intx(0000:04:00.0)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x54, len=0x2) 0
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 3
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x0, len=0x2)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > > vfio: vfio_enable_intx(0000:04:00.0)
> > > vfio: vfio_pci_reset(0000:04:00.1)
> > > vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
> > > vfio: vfio_disable_intx(0000:04:00.1)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x54, len=0x2) 0
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 6
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x0, len=0x2)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > > vfio: vfio_enable_intx(0000:04:00.1)
> > > vfio: region_del 0 - bffff
> > > vfio: region_del c0000 - dffff
> > > vfio: region_add 0 - c7fff [0x7f8698000000]
> > > vfio: region_add c8000 - dffff [0x7f88aa208000]
> > > vfio: region_del 0 - c7fff
> > > vfio: region_del c8000 - dffff
> > > vfio: region_add 0 - cffff [0x7f8698000000]
> > > vfio: region_add d0000 - dffff [0x7f88aa210000]
> > > vfio: region_del 0 - cffff
> > > vfio: region_del d0000 - dffff
> > > vfio: region_add 0 - d7fff [0x7f8698000000]
> > > vfio: region_add d8000 - dffff [0x7f88aa218000]
> > > vfio: region_del 0 - d7fff
> > > vfio: region_del d8000 - dffff
> > > vfio: region_add 0 - dffff [0x7f8698000000]
> > > vfio: region_del 0 - dffff
> > > vfio: region_del e0000 - fffff
> > > vfio: region_add 0 - e7fff [0x7f8698000000]
> > > vfio: region_add e8000 - fffff [0x7f88aa408000]
> > > vfio: region_del 0 - e7fff
> > > vfio: region_del e8000 - fffff
> > > vfio: region_add 0 - effff [0x7f8698000000]
> > > vfio: region_add f0000 - fffff [0x7f88aa410000]
> > > vfio: region_del 0 - effff
> > > vfio: region_del f0000 - fffff
> > > vfio: region_del 100000 - afffffff
> > > vfio: region_add 0 - afffffff [0x7f8698000000]
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x4) 68181002
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x8, len=0x4) 3000000
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x4) aab01002
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x8, len=0x4) 4030000
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0xe, len=0x1) 80
> > > vfio: SKIPPING region_add b0000000 - bfffffff
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) c000000c
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) f000000c
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xc000000c, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) ffffffff
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fde80004
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fffc0004
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfde80004, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) ffffffff
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ce01
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ffffff01
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xce01, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffff800, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) fdefc004
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) ffffc004
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfdefc004, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) ffffffff
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0xffffffff, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0x0, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0xfffff800, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0x0, len=0x4)
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xc000, len=0x4)
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfea00000, len=0x4)
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfea60000, len=0x4)
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xe0000000, len=0x4)
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
> > > vfio: SKIPPING region_add feb40000 - feb4002f
> > > vfio: SKIPPING region_add feb40800 - feb40807
> > > vfio: SKIPPING region_add feb41000 - feb4101f
> > > vfio: SKIPPING region_add feb41800 - feb41807
> > > vfio: vfio_update_irq(0000:04:00.1) IRQ moved 20 -> 10
> > > vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
> > > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > > vfio: vfio_update_irq(0000:04:00.0) IRQ moved 23 -> 11
> > > vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
> > > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > > vfio: SKIPPING region_add feb42000 - feb42fff
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x3c, 0xb, len=0x1)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 0
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
> > > vfio: region_add e0000000 - efffffff [0x7f8688000000]
> > > vfio: region_add fea00000 - fea03fff [0x7f88aa7b8000]
> > > vfio: SKIPPING region_add fea04000 - fea04fff
> > > vfio: region_add fea05000 - fea3ffff [0x7f88aa7bd000]
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x3c, 0xa, len=0x1)
> > > vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 0
> > > vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x103, len=0x2)
> > > vfio: region_add fea60000 - fea63fff [0x7f88bc710000]
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
> > > vfio: region_del 0 - afffffff
> > > vfio: region_add 0 - 9ffff [0x7f8698000000]
> > > vfio: SKIPPING region_add a0000 - bffff
> > > vfio: region_add c0000 - afffffff [0x7f86980c0000]
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fea40000
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffffffe, len=0x4)
> > > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
> >
> > Here the option ROM was sized
> >
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40001, len=0x4)
> >
> > Then enabled
> >
> > > vfio: region_add fea40000 - fea5ffff [0x7f88a9e00000]
> >
> > Adding this memory region
> >
> > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
> > > vfio: region_del fea40000 - fea5ffff
> >
> > Then disabled, removing the memory region.  Presumably between the
> > enable and disable the contents were read and copied to 0xc0000, which
> > is where the VGA BIOS is shadowed.
> >
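
The ROM sizing and enable/disable steps described above can be checked with a little arithmetic. This is a minimal sketch, assuming the standard PCI expansion ROM BAR layout (bit 0 is the enable bit, bits 31:11 hold the address); the helper names are hypothetical and not taken from QEMU. The values are the ones from the log.

```python
ROM_ADDR_MASK = 0xFFFFF800  # bits 31:11 of the expansion ROM BAR hold the address
ROM_ENABLE_BIT = 0x1        # bit 0 enables decoding of the ROM


def rom_bar_size(readback: int) -> int:
    """Size implied by the value read back after writing all ones."""
    mask = readback & ROM_ADDR_MASK
    if mask == 0:
        return 0  # BAR not implemented
    return ((~mask) & 0xFFFFFFFF) + 1


def rom_base(value: int) -> int:
    """Base address programmed into the ROM BAR."""
    return value & ROM_ADDR_MASK


def rom_enabled(value: int) -> bool:
    """True if the enable bit is set in the ROM BAR value."""
    return bool(value & ROM_ENABLE_BIT)


def rom_window(value: int, readback: int) -> tuple:
    """(first, last) address covered by the ROM for a programmed value."""
    base = rom_base(value)
    return base, base + rom_bar_size(readback) - 1


# Values taken from the DEBUG_VFIO log for 0000:04:00.0, register 0x30.
size = rom_bar_size(0xFFFE0000)                # read back after the sizing write
first, last = rom_window(0xFEA40001, 0xFFFE0000)
print(hex(size), hex(first), hex(last), rom_enabled(0xFEA40001))
```

The computed size agrees with the 0x20000 reported for the ROM region earlier in the log, and the window agrees with the region_add of fea40000 - fea5ffff that follows the enabling write.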
> > > Here is the strace output from this failure:
> > >
> > > 1110  ioctl(14, KVM_RUN, 0)             = 0
> > > 1110  pread(20,  <unfinished ...>
> > > 1099  <... poll resumed> )              = 1 ([{fd=0, revents=POLLIN}])
> > > 1099  futex(0x7ff73ca62fa0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> > > 1109  <... futex resumed> )             = -1 ETIMEDOUT (Connection timed out)
> > > 1109  madvise(0x7ff72fe17000, 8368128, MADV_DONTNEED) = 0
> > > 1109  _exit(0)                          = ?
> > > 1109  +++ exited with 0 +++
> > >
> > > From reading the source 'hw/misc/vfio.c' it looks like the following
> > > in 'vfio_vga_read' never finished:
> > >
> > >     if (pread(vga->fd, &buf, size, offset) != size) {
> > >         error_report("%s(,0x%"HWADDR_PRIx", %d) failed: %m",
> > >                      __func__, region->offset + addr, size);
> > >         return (uint64_t)-1;
> > >     }
> >
> > I agree.  Every VGA access requires us to lock the VGA resources on the
> > device, so if we can't get the lock, we stop making progress.  I took a
> > look at Xorg last night and it seems like it should be taking and
> > releasing the VGA arbiter lock in a way that would be compatible with
> > our use.  That's in the xserver, not the actual display hardware driver,
> > and it wraps access functions in the arbiter support, so should be
> > transparent to the drivers.  So for nouveau, it seems like it should
> > work.  For nvidia, we don't really know, it could be locking the device
> > from the kernel module.
> >
> > You could instrument vga_get, vga_tryget, and vga_put to figure out
> > what's happening.  It might be enough to look at /dev/vga_arbiter at
> > each step in the sequence to reproduce
> > (sudo head --lines=1 /dev/vga_arbiter).  Thanks,
> >
>
> I've played with it a bit more, and the 'nouveau' and 'nvidia' drivers
> behave differently. As soon as I load the binary driver I see the
> following:
>
> [   18.628676] [drm] Initialized drm 1.1.0 20060810
> [   18.668038] nvidia: module license 'NVIDIA' taints kernel.
> [   18.668107] Disabling lock debugging due to kernel taint
> [   18.676638] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
> [   18.676722] vgaarb: transferring owner from PCI:0000:01:00.0 to PCI:0000:04:00.0
> [   18.677007] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 0
> [   18.677090] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  319.23  Thu May 16 19:36:02 PDT 2013
>
> root@homer:~# head --lines=1 /dev/vga_arbiter
> count:2,PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none(0:0)
> root@homer:~# modprobe nvidia
> root@homer:~# head --lines=1 /dev/vga_arbiter
> count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=io+mem(1:1)
> root@homer:~# /usr/local/bin/qemu-system-x86_64 \
>         -L /usr/local/share/qemu \
>         -L /usr/local/share/qemu \
>         -M q35 -enable-kvm -cpu host -smp cores=4,threads=1,sockets=1 \
>         -m 8192 -rtc base=localtime -k de -nodefaults -vga none \
>         -drive file=/home/mbroemme/.kvm/maggie.img,id=drive0,if=none,cache=none,aio=threads \
>         -device virtio-blk-pci,drive=drive0,ioeventfd=on \
>         -device ioh3420,id=pcie0,multifunction=on \
>         -device vfio-pci,host=04:00.0,addr=0.0,bus=pcie0,multifunction=on,x-vga=on \
>         -device vfio-pci,host=04:00.1,addr=0.1,bus=pcie0 -monitor stdio -nographic
> root@homer:~# head --lines=1 /dev/vga_arbiter
> count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=io+mem(1:1)
>
> It looks like the nvidia binary driver locks some resources. :( With the
> nouveau driver, by contrast, the VGA arbitration only starts once I start
> the VM with VGA passthrough:
>
> [  178.187706] vfio-pci 0000:04:00.0: enabling device (0000 -> 0003)
> [  178.209599] vfio_ecap_init: 0000:04:00.0 hiding ecap 0x19@0x270
> [  178.209631] vfio_ecap_init: 0000:04:00.0 hiding ecap 0x1b@0x2d0
> [  181.198191] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
> [  181.198208] vgaarb: transferring owner from PCI:0000:01:00.0 to PCI:0000:04:00.0
>
> root@homer:~# head --lines=1 /dev/vga_arbiter
> count:2,PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none(0:0)
> root@homer:~# /usr/local/bin/qemu-system-x86_64 \
>         -L /usr/local/share/qemu \
>         -L /usr/local/share/qemu \
>         -M q35 -enable-kvm -cpu host -smp cores=4,threads=1,sockets=1 \
>         -m 8192 -rtc base=localtime -k de -nodefaults -vga none \
>         -drive file=/home/mbroemme/.kvm/maggie.img,id=drive0,if=none,cache=none,aio=threads \
>         -device virtio-blk-pci,drive=drive0,ioeventfd=on \
>         -device ioh3420,id=pcie0,multifunction=on \
>         -device vfio-pci,host=04:00.0,addr=0.0,bus=pcie0,multifunction=on,x-vga=on \
>         -device vfio-pci,host=04:00.1,addr=0.1,bus=pcie0 -monitor stdio -nographic
> root@homer:~# head --lines=1 /dev/vga_arbiter
> count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=none(0:0)
>
Alex, you pointed me in the right direction, many thanks! I got it
working with the 'nvidia' binary driver now, but I had to patch it. I've
attached the patch here just for reference in case others want to try the
same; I don't know if it is the proper way. After some testing (Xorg,
DRI, VDPAU, switching between X and the text console) it looks like a
stable workaround for me until NVIDIA fixes their driver.

--Maik

Attachment: NVIDIA-Linux-x86_64-319.23-vfio-vgaarb-fix.patch (1K)
Re: VFIO VGA test branches

Alex Williamson-3
On Wed, 2013-05-29 at 18:16 +0200, Maik Broemme wrote:

> Hi,
>
> Maik Broemme <[hidden email]> wrote:
> > > >
> > > > Here is the DEBUG_VFIO output:
> > > >
> > > > vfio: vfio_initfn(0000:04:00.0) group 14
> > > > vfio: region_add 0 - afffffff [0x7f8698000000]
> > > > vfio: SKIPPING region_add fec00000 - fec00fff
> > > > vfio: SKIPPING region_add fed00000 - fed003ff
> > > > vfio: SKIPPING region_add fee00000 - feefffff
> > > > vfio: region_add fffe0000 - ffffffff [0x7f88aa400000]
> > > > vfio: region_add 100000000 - 24fffffff [0x7f8748000000]
> > > > vfio: Device 0000:04:00.0 flags: 3, regions: 9, irgs: 4
> > > > vfio: Device 0000:04:00.0 region 0:
> > > > vfio:   size: 0x10000000, offset: 0x0, flags: 0x7
> > > > vfio: Device 0000:04:00.0 region 1:
> > > > vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.0 region 2:
> > > > vfio:   size: 0x40000, offset: 0x20000000000, flags: 0x7
> > > > vfio: Device 0000:04:00.0 region 3:
> > > > vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.0 region 4:
> > > > vfio:   size: 0x100, offset: 0x40000000000, flags: 0x3
> > > > vfio: Device 0000:04:00.0 region 5:
> > > > vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.0 ROM:
> > > > vfio:   size: 0x20000, offset: 0x60000000000, flags: 0x1
> > > > vfio: Device 0000:04:00.0 config:
> > > > vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
> > > > vfio: vfio_load_rom(0000:04:00.0)
> > > > vfio: Enabled ATI/AMD BAR2 0x4000 quirk for device 0000:04:00.0
> > > > vfio: Enabled ATI/AMD BAR4 window quirk for device 0000:04:00.0
> > > > vfio: Enabled ATI/AMD quirk 0x3c3 BAR4 for device 0000:04:00.0
> > > > vfio: 0000:04:00.0 PCI MSI CAP @0xa0
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > > > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > > > vfio: vfio_enable_intx(0000:04:00.0)
> > > > vfio: vfio_initfn(0000:04:00.1) group 14
> > > > vfio: Device 0000:04:00.1 flags: 3, regions: 9, irgs: 4
> > > > vfio: Device 0000:04:00.1 region 0:
> > > > vfio:   size: 0x4000, offset: 0x0, flags: 0x7
> > > > vfio: Device 0000:04:00.1 region 1:
> > > > vfio:   size: 0x0, offset: 0x10000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.1 region 2:
> > > > vfio:   size: 0x0, offset: 0x20000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.1 region 3:
> > > > vfio:   size: 0x0, offset: 0x30000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.1 region 4:
> > > > vfio:   size: 0x0, offset: 0x40000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.1 region 5:
> > > > vfio:   size: 0x0, offset: 0x50000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.1 ROM:
> > > > vfio:   size: 0x0, offset: 0x60000000000, flags: 0x0
> > > > vfio: Device 0000:04:00.1 config:
> > > > vfio:   size: 0x1000, offset: 0x70000000000, flags: 0x3
> > > > vfio: 0000:04:00.1 PCI MSI CAP @0xa0
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > > > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > > > vfio: vfio_enable_intx(0000:04:00.1)
> > > > vfio: region_del 0 - afffffff
> > > > vfio: region_add 0 - bffff [0x7f8698000000]
> > > > vfio: region_add c0000 - dffff [0x7f88aa200000]
> > > > vfio: region_add e0000 - fffff [0x7f88aa400000]
> > > > vfio: region_add 100000 - afffffff [0x7f8698100000]
> > > > vfio: vfio_pci_reset(0000:04:00.0)
> > > > vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
> > > > vfio: vfio_disable_intx(0000:04:00.0)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x54, len=0x2) 0
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 3
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x0, len=0x2)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > > > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > > > vfio: vfio_enable_intx(0000:04:00.0)
> > > > vfio: vfio_pci_reset(0000:04:00.1)
> > > > vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
> > > > vfio: vfio_disable_intx(0000:04:00.1)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x54, len=0x2) 0
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 6
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x0, len=0x2)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > > > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > > > vfio: vfio_enable_intx(0000:04:00.1)
> > > > vfio: region_del 0 - bffff
> > > > vfio: region_del c0000 - dffff
> > > > vfio: region_add 0 - c7fff [0x7f8698000000]
> > > > vfio: region_add c8000 - dffff [0x7f88aa208000]
> > > > vfio: region_del 0 - c7fff
> > > > vfio: region_del c8000 - dffff
> > > > vfio: region_add 0 - cffff [0x7f8698000000]
> > > > vfio: region_add d0000 - dffff [0x7f88aa210000]
> > > > vfio: region_del 0 - cffff
> > > > vfio: region_del d0000 - dffff
> > > > vfio: region_add 0 - d7fff [0x7f8698000000]
> > > > vfio: region_add d8000 - dffff [0x7f88aa218000]
> > > > vfio: region_del 0 - d7fff
> > > > vfio: region_del d8000 - dffff
> > > > vfio: region_add 0 - dffff [0x7f8698000000]
> > > > vfio: region_del 0 - dffff
> > > > vfio: region_del e0000 - fffff
> > > > vfio: region_add 0 - e7fff [0x7f8698000000]
> > > > vfio: region_add e8000 - fffff [0x7f88aa408000]
> > > > vfio: region_del 0 - e7fff
> > > > vfio: region_del e8000 - fffff
> > > > vfio: region_add 0 - effff [0x7f8698000000]
> > > > vfio: region_add f0000 - fffff [0x7f88aa410000]
> > > > vfio: region_del 0 - effff
> > > > vfio: region_del f0000 - fffff
> > > > vfio: region_del 100000 - afffffff
> > > > vfio: region_add 0 - afffffff [0x7f8698000000]
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0xa, len=0x2) 300
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0xa, len=0x2) 403
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x2) 1002
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x0, len=0x4) 68181002
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x8, len=0x4) 3000000
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0xe, len=0x1) 80
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x2) 1002
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x0, len=0x4) aab01002
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x8, len=0x4) 4030000
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0xe, len=0x1) 80
> > > > vfio: SKIPPING region_add b0000000 - bfffffff
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) c000000c
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x10, len=0x4) f000000c
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xc000000c, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x14, len=0x4) ffffffff
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fde80004
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x18, len=0x4) fffc0004
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfde80004, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x1c, len=0x4) ffffffff
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ce01
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x20, len=0x4) ffffff01
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xce01, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x24, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x24, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffff800, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) fdefc004
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x10, len=0x4) ffffc004
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfdefc004, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x14, len=0x4) ffffffff
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x18, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x18, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x1c, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x1c, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x20, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x20, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0xffffffff, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x24, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x24, 0x0, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0xfffff800, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x30, len=0x4) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x30, 0x0, len=0x4)
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x20, 0xc000, len=0x4)
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x18, 0xfea00000, len=0x4)
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x1c, 0x0, len=0x4)
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x10, 0xfea60000, len=0x4)
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x14, 0x0, len=0x4)
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x10, 0xe0000000, len=0x4)
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x14, 0x0, len=0x4)
> > > > vfio: SKIPPING region_add feb40000 - feb4002f
> > > > vfio: SKIPPING region_add feb40800 - feb40807
> > > > vfio: SKIPPING region_add feb41000 - feb4101f
> > > > vfio: SKIPPING region_add feb41800 - feb41807
> > > > vfio: vfio_update_irq(0000:04:00.1) IRQ moved 20 -> 10
> > > > vfio: vfio_disable_intx_kvm(0000:04:00.1) KVM INTx accel disabled
> > > > vfio: vfio_enable_intx_kvm(0000:04:00.1) KVM INTx accel enabled
> > > > vfio: vfio_update_irq(0000:04:00.0) IRQ moved 23 -> 11
> > > > vfio: vfio_disable_intx_kvm(0000:04:00.0) KVM INTx accel disabled
> > > > vfio: vfio_enable_intx_kvm(0000:04:00.0) KVM INTx accel enabled
> > > > vfio: SKIPPING region_add feb42000 - feb42fff
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x3d, len=0x1) 1
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x3c, 0xb, len=0x1)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
> > > > vfio: region_add e0000000 - efffffff [0x7f8688000000]
> > > > vfio: region_add fea00000 - fea03fff [0x7f88aa7b8000]
> > > > vfio: SKIPPING region_add fea04000 - fea04fff
> > > > vfio: region_add fea05000 - fea3ffff [0x7f88aa7bd000]
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x3d, len=0x1) 2
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x3c, 0xa, len=0x1)
> > > > vfio: vfio_pci_read_config(0000:04:00.1, @0x4, len=0x2) 0
> > > > vfio: vfio_pci_write_config(0000:04:00.1, @0x4, 0x103, len=0x2)
> > > > vfio: region_add fea60000 - fea63fff [0x7f88bc710000]
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x4, 0x103, len=0x2)
> > > > vfio: region_del 0 - afffffff
> > > > vfio: region_add 0 - 9ffff [0x7f8698000000]
> > > > vfio: SKIPPING region_add a0000 - bffff
> > > > vfio: region_add c0000 - afffffff [0x7f86980c0000]
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x4, len=0x2) 103
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fea40000
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfffffffe, len=0x4)
> > > > vfio: vfio_pci_read_config(0000:04:00.0, @0x30, len=0x4) fffe0000
> > >
> > > Here the option ROM was sized
> > >
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40001, len=0x4)
> > >
> > > Then enabled
> > >
> > > > vfio: region_add fea40000 - fea5ffff [0x7f88a9e00000]
> > >
> > > Adding this memory region
> > >
> > > > vfio: vfio_pci_write_config(0000:04:00.0, @0x30, 0xfea40000, len=0x4)
> > > > vfio: region_del fea40000 - fea5ffff
> > >
> > > Then disabled, removing the memory region.  Presumably between the
> > > enable and disable the contents were read and copied to 0xc0000, which
> > > is where the VGA BIOS is shadowed.
> > >
> > > > Here is the strace output from this failure:
> > > >
> > > > 1110  ioctl(14, KVM_RUN, 0)             = 0
> > > > 1110  pread(20,  <unfinished ...>
> > > > 1099  <... poll resumed> )              = 1 ([{fd=0, revents=POLLIN}])
> > > > 1099  futex(0x7ff73ca62fa0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> > > > 1109  <... futex resumed> )             = -1 ETIMEDOUT (Connection timed out)
> > > > 1109  madvise(0x7ff72fe17000, 8368128, MADV_DONTNEED) = 0
> > > > 1109  _exit(0)                          = ?
> > > > 1109  +++ exited with 0 +++
> > > >
> > > > From reading the source 'hw/misc/vfio.c' it looks like the following
> > > > in 'vfio_vga_read' never finished:
> > > >
> > > >     if (pread(vga->fd, &buf, size, offset) != size) {
> > > >         error_report("%s(,0x%"HWADDR_PRIx", %d) failed: %m",
> > > >                      __func__, region->offset + addr, size);
> > > >         return (uint64_t)-1;
> > > >     }
> > >
> > > I agree.  Every VGA access requires us to lock the VGA resources on the
> > > device, so if we can't get the lock, we stop making progress.  I took a
> > > look at Xorg last night and it seems like it should be taking and
> > > releasing the VGA arbiter lock in a way that would be compatible with
> > > our use.  That's in the xserver, not the actual display hardware driver,
> > > and it wraps access functions in the arbiter support, so should be
> > > transparent to the drivers.  So for nouveau, it seems like it should
> > > work.  For nvidia, we don't really know, it could be locking the device
> > > from the kernel module.
> > >
> > > You could instrument vga_get, vga_tryget, and vga_put to figure out
> > > what's happening.  It might be enough to look at /dev/vga_arbiter at
> > > each step in the sequence to reproduce (sudo head
> > > --lines=1 /dev/vga_arbiter).  Thanks,
> > >
> >
> > I've played with it a bit more, and there are some differences in
> > behavior between the 'nouveau' and 'nvidia' drivers. As soon as I load
> > the binary driver I see the following:
> >
> > [   18.628676] [drm] Initialized drm 1.1.0 20060810
> > [   18.668038] nvidia: module license 'NVIDIA' taints kernel.
> > [   18.668107] Disabling lock debugging due to kernel taint
> > [   18.676638] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
> > [   18.676722] vgaarb: transferring owner from PCI:0000:01:00.0 to PCI:0000:04:00.0
> > [   18.677007] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 0
> > [   18.677090] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  319.23  Thu May 16 19:36:02 PDT 2013
> >
> > root@homer:~# head --lines=1 /dev/vga_arbiter
> > count:2,PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none(0:0)
> > root@homer:~# modprobe nvidia
> > root@homer:~# head --lines=1 /dev/vga_arbiter
> > count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=io+mem(1:1)
> > root@homer:~# /usr/local/bin/qemu-system-x86_64 \
> >         -L /usr/local/share/qemu \
> >         -L /usr/local/share/qemu \
> >         -M q35 -enable-kvm -cpu host -smp cores=4,threads=1,sockets=1 \
> >         -m 8192 -rtc base=localtime -k de -nodefaults -vga none \
> >         -drive file=/home/mbroemme/.kvm/maggie.img,id=drive0,if=none,cache=none,aio=threads \
> >         -device virtio-blk-pci,drive=drive0,ioeventfd=on \
> >         -device ioh3420,id=pcie0,multifunction=on \
> >         -device vfio-pci,host=04:00.0,addr=0.0,bus=pcie0,multifunction=on,x-vga=on \
> >         -device vfio-pci,host=04:00.1,addr=0.1,bus=pcie0 -monitor stdio -nographic
> > root@homer:~# head --lines=1 /dev/vga_arbiter
> > count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=io+mem(1:1)
> >
> > It looks like the nvidia binary driver locks some resources. :( Whereas
> > with the nouveau driver the VGA arbitration only starts once I start
> > the VM with VGA passthrough:
> >
> > [  178.187706] vfio-pci 0000:04:00.0: enabling device (0000 -> 0003)
> > [  178.209599] vfio_ecap_init: 0000:04:00.0 hiding ecap 0x19@0x270
> > [  178.209631] vfio_ecap_init: 0000:04:00.0 hiding ecap 0x1b@0x2d0
> > [  181.198191] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
> > [  181.198208] vgaarb: transferring owner from PCI:0000:01:00.0 to PCI:0000:04:00.0
> >
> > root@homer:~# head --lines=1 /dev/vga_arbiter
> > count:2,PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none(0:0)
> > root@homer:~# /usr/local/bin/qemu-system-x86_64 \
> >         -L /usr/local/share/qemu \
> >         -L /usr/local/share/qemu \
> >         -M q35 -enable-kvm -cpu host -smp cores=4,threads=1,sockets=1 \
> >         -m 8192 -rtc base=localtime -k de -nodefaults -vga none \
> >         -drive file=/home/mbroemme/.kvm/maggie.img,id=drive0,if=none,cache=none,aio=threads \
> >         -device virtio-blk-pci,drive=drive0,ioeventfd=on \
> >         -device ioh3420,id=pcie0,multifunction=on \
> >         -device vfio-pci,host=04:00.0,addr=0.0,bus=pcie0,multifunction=on,x-vga=on \
> >         -device vfio-pci,host=04:00.1,addr=0.1,bus=pcie0 -monitor stdio -nographic
> > root@homer:~# head --lines=1 /dev/vga_arbiter
> > count:1,PCI:0000:01:00.0,decodes=none,owns=none,locks=none(0:0)
> >
>
> Alex, you pointed me in the right direction, many thanks! I got it
> working with the 'nvidia' binary driver now, but I had to patch it. I've
> attached the patch here for reference in case others want to try the
> same, though I don't know if it is the proper way. After some testing
> (Xorg, DRI, VDPAU, switching between X and text console) it looks like a
> stable workaround for me until NVIDIA fixes their driver.

Hmm, the code doesn't make much sense to me.  They never do a vga_put,
so they must not realize that the vga_tryget is actually locking vga
arbitration.  It looks like they just want to point VGA routing to the
default device and indicate they don't use VGA I/O.  I don't really see
the point of the vga_tryget though since by declaring that they don't
use VGA I/O they give up all locks.  A simpler fix would be to skip the
vga_tryget call, or to check the return value (0 means the lock was
taken) and release it right away:

if (!vga_tryget(...))
    vga_put(...);

The VGA arbitration code is lazy, so VGA routing will still point to the
default device, but it won't be locked there.  Thanks,

Alex
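
The difference between the two patterns discussed above can be illustrated with a toy userspace model. This is only a sketch: `struct model_arbiter`, `model_vga_tryget()` and `model_vga_put()` are hypothetical stand-ins, not the kernel's arbiter, and they only mirror the convention that a try-lock returns 0 on success. The point it shows is that a tryget with no matching put leaves the lock count raised, while the checked tryget/put pair leaves it at zero.

```c
/* Toy model of an arbiter lock count. NOT kernel code: every name here
 * is hypothetical and exists only to illustrate the locking pattern. */
struct model_arbiter {
    int lock_count;   /* outstanding locks held by this client */
    int busy;         /* nonzero if another client holds the resource */
};

/* Returns 0 on success, -1 if the resource is busy (stand-in for -EBUSY). */
static int model_vga_tryget(struct model_arbiter *a)
{
    if (a->busy)
        return -1;
    a->lock_count++;
    return 0;
}

/* Drops one lock if any is held. */
static void model_vga_put(struct model_arbiter *a)
{
    if (a->lock_count > 0)
        a->lock_count--;
}

/* Pattern described as problematic: acquire and never release. */
static void probe_without_put(struct model_arbiter *a)
{
    model_vga_tryget(a);
}

/* Suggested pattern: release immediately, but only if the lock was taken. */
static void probe_with_put(struct model_arbiter *a)
{
    if (!model_vga_tryget(a))
        model_vga_put(a);
}
```

In this model, `probe_without_put()` on a free arbiter leaves `lock_count` at 1, whereas `probe_with_put()` leaves it at 0 whether or not the resource was busy.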

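For anyone repeating the `head --lines=1 /dev/vga_arbiter` check at each step, a small parser can make the lock state easier to compare between runs. The sketch below assumes the status line always has the shape seen in the quoted output (`count:N,PCI:<dev>,decodes=...,owns=...,locks=...(a:b)`); the struct and function names are made up for illustration, and the two counters are assumed to be the I/O and memory lock counts in that order.

```c
#include <stdio.h>
#include <string.h>

/* Fields of one /dev/vga_arbiter status line (names are illustrative). */
struct vga_status {
    int  count;          /* value after "count:" */
    char dev[32];        /* PCI address, e.g. "0000:01:00.0" */
    char decodes[16];    /* "io+mem", "io", "mem" or "none" */
    char owns[16];
    char locks[16];
    int  io_lock_cnt;    /* assumed: first number in parentheses */
    int  mem_lock_cnt;   /* assumed: second number in parentheses */
};

/* Parse a status line; returns 0 on success, -1 if the line does not match. */
static int parse_vga_status(const char *line, struct vga_status *st)
{
    int n = sscanf(line,
        "count:%d,PCI:%31[^,],decodes=%15[^,],owns=%15[^,],locks=%15[^(](%d:%d)",
        &st->count, st->dev, st->decodes, st->owns, st->locks,
        &st->io_lock_cnt, &st->mem_lock_cnt);
    return n == 7 ? 0 : -1;
}

/* True if any resource is reported as locked. */
static int vga_is_locked(const struct vga_status *st)
{
    return strcmp(st->locks, "none") != 0;
}
```

Fed the line reported after `modprobe nvidia` (`locks=io+mem(1:1)`), `vga_is_locked()` returns nonzero; fed the nouveau case (`locks=none(0:0)`), it returns 0.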

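The option ROM sizing step in the quoted trace (write 0xfffffffe to config offset 0x30, read back fffe0000) can be checked by hand with the usual PCI BAR sizing arithmetic. The sketch below is only a worked example of that arithmetic, not code from QEMU or the kernel; `bar_size()` and the mask macro names are illustrative, and for a 64-bit memory BAR it looks at the low dword only.

```c
#include <stdint.h>

/* Address bits of a PCI expansion ROM BAR (bits 31:11) and of a
 * 32-bit memory BAR (bits 31:4). The macro names are illustrative. */
#define ROM_ADDR_MASK 0xfffff800u
#define MEM_ADDR_MASK 0xfffffff0u

/* Given the value read back after writing all ones to a BAR, return the
 * size in bytes of the region it decodes, or 0 if it is unimplemented. */
static uint32_t bar_size(uint32_t readback, uint32_t addr_mask)
{
    uint32_t bits = readback & addr_mask;   /* drop the flag bits */
    if (bits == 0)
        return 0;                           /* BAR not implemented */
    return (uint32_t)(~bits + 1u);          /* two's complement gives the size */
}
```

With the readback fffe0000 this gives 0x20000 bytes, which matches the region_add of fea40000 - fea5ffff in the trace. The same arithmetic on the ffffc004 readback for 04:00.1 BAR0 gives 0x4000, matching fea60000 - fea63fff.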

Re: VFIO VGA test branches

Michael Nelson
In reply to this post by Alex Williamson-3
Here is some more data for vfio-vga. 

I have tried NVIDIA (MSI) GT610 and ATI (MSI) 7850 cards separately in primary and secondary (to Cirrus) modes. The ATI card can boot as primary in VGA mode, but installing Catalyst causes the VM to reboot repeatedly during boot. The NVIDIA card doesn't get past POST as primary. Neither card works as a secondary with its driver installed; for each, Windows Device Manager reports that there aren't enough resources available (Code 12).

I have tried a few different KVM device settings, but feel like I am poking around in the dark at this point. What I am currently using is at the bottom of the e-mail. I have URLs pointing to the various compressed logs since they are fairly large. If there is any other data I can provide, please let me know. I would really like to get this working :).

Machine:

Motherboard: Intel DQ67SW (BIOS: WQ6710H.86A.0052.2011.0520.1802 05/20/2011)
CPU: i5-2400
Kernel: 3.9 - current vfio-vga-reset branch 
QEMU: 1.4.50 - current vfio-vga-reset branch
Guest: Windows Server 2008 R2 64-bit

QEMU has this patch added to vfio_ati_3c3_quirk_read(), which makes the address-match condition always true:
+    if (1 || data == quirk->data.address_match) {

NVIDIA:

When the NVIDIA card is primary, QEMU hits a fatal error within a couple of seconds of startup (the EIP seems to vary slightly). I have seen this with 3 different NVIDIA cards (Quadro FX 580, GeForce 8800 GTS, and a brand new GT610):

qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000000a0000

EAX=00009e48 EBX=00000000 ECX=0000b79d EDX=000003d4
ESI=000000e2 EDI=0000823a EBP=00004918 ESP=00008234
EIP=0009ffca EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
[…]

QEMU output when card is primary is at:

QEMU output when card is secondary is at:

ATI:

The ATI card is able to boot Windows in VGA mode once I add "if (1)" to the 3C3 quirk in QEMU. Once I install the Catalyst drivers and reboot, the machine reboots itself 10-15 seconds after the Windows boot loader finishes. I am not sure if it's blue screening and I am not seeing it -- there is no output.

QEMU output when card is primary is at:

QEMU output when card is secondary is at:

KVM command line:

~/qemu/usr/local/bin/qemu-system-x86_64 \
-nodefconfig -readconfig /root/q35-chipset.cfg \
-enable-kvm \
-M q35 \
-cpu host \
-L ~/seabios-out -L ~/qemu/usr/local/share/qemu \
-m 1024 \
-drive file=/dev/vgsys/vm.delete,if=virtio -boot order=cad,menu=on \
-vga cirrus \
-device vfio-pci,host=00:1d.0,addr=3.0,bus=pcie.0 \
-netdev tap,ifname=vm_rest_lan,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=54:52:00:00:59:03,addr=05.00 \
-vnc 0.0.0.0:1 \
-device vfio-pci,host=01:00.0,addr=0.0,bus=ich9-pcie-port-1,multifunction=on,x-vga=on \
-device vfio-pci,host=01:00.1,addr=0.1,bus=ich9-pcie-port-1

Thanks,
-mike
