<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/virt/kvm, branch v3.9</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/virt/kvm?h=v3.9</id>
<link rel='self' href='https://git.amat.us/linux/atom/virt/kvm?h=v3.9'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-04-07T10:05:35Z</updated>
<entry>
<title>KVM: Allow cross page reads and writes from cached translations.</title>
<updated>2013-04-07T10:05:35Z</updated>
<author>
<name>Andrew Honig</name>
<email>ahonig@google.com</email>
</author>
<published>2013-03-29T16:35:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=8f964525a121f2ff2df948dac908dcc65be21b5b'/>
<id>urn:sha1:8f964525a121f2ff2df948dac908dcc65be21b5b</id>
<content type='text'>
This patch adds support to the kvm_gfn_to_hva_cache_init functions for
reads and writes that cross a page boundary.  If the range falls within
a single memslot, the operation is fast.  If the range is split between
two memslots, the slower kvm_read_guest and kvm_write_guest are used.
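
As a rough illustration of the fast/slow split, here is a minimal Python
sketch.  It is not the kernel code: Memslot, find_slot and cache_init are
hypothetical names, and a 4096-byte page size is assumed.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class Memslot:
    """Hypothetical memslot: a contiguous run of guest frame numbers."""

    def __init__(self, base_gfn, npages):
        self.base_gfn = base_gfn
        self.npages = npages

    def contains(self, gfn):
        return gfn in range(self.base_gfn, self.base_gfn + self.npages)


def find_slot(slots, gfn):
    """Return the slot holding gfn, or None if no slot covers it."""
    for slot in slots:
        if slot.contains(gfn):
            return slot
    return None


def cache_init(slots, gpa, length):
    """Pick the access path for the byte range starting at gpa.

    Returns "fast" when the first and last page sit in the same memslot,
    otherwise "slow".  Handling of unmapped pages is out of scope here;
    they are simply treated as not eligible for the fast path.
    """
    start_gfn = gpa // PAGE_SIZE
    end_gfn = (gpa + length - 1) // PAGE_SIZE
    first = find_slot(slots, start_gfn)
    last = find_slot(slots, end_gfn)
    if first is not None and first is last:
        return "fast"
    return "slow"
```

With two adjacent four-page slots, a range that crosses a page inside the
first slot takes the fast path, while a range that straddles the boundary
between the two slots falls back to the slow path.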

Tested: Tested against kvm_clock unit tests.

Signed-off-by: Andrew Honig &lt;ahonig@google.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)</title>
<updated>2013-03-19T17:20:21Z</updated>
<author>
<name>Andy Honig</name>
<email>ahonig@google.com</email>
</author>
<published>2013-02-20T22:49:16Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=a2c118bfab8bc6b8bb213abfc35201e441693d55'/>
<id>urn:sha1:a2c118bfab8bc6b8bb213abfc35201e441693d55</id>
<content type='text'>
If the guest specifies an IOAPIC_REG_SELECT with an invalid value and follows
that with a read of the IOAPIC_REG_WINDOW, KVM does not properly validate
that request.  ioapic_read_indirect contains an
ASSERT(redir_index &lt; IOAPIC_NUM_PINS), but the ASSERT has no effect in
non-debug builds.  In recent kernels this allows a guest to cause a kernel
oops by reading invalid memory.  In older kernels (pre-3.3) this allows a
guest to read from large ranges of host memory.
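
The text above does not spell out the fix itself, so the following Python
sketch only illustrates the kind of runtime check that an ASSERT cannot
provide in non-debug builds.  The table size of 24 pins, the 0x10 selector
offset, the all-ones fallback value and the function name are assumptions
made for the example.

```python
IOAPIC_NUM_PINS = 24        # assumed redirection table size
FIRST_REDIR_SELECTOR = 0x10  # assumed selector of the first table register
ALL_ONES_64 = 2 ** 64 - 1
WORD = 2 ** 32


def read_indirect(ioregsel, redirtbl):
    """Return the 32-bit value visible through the window register.

    Each 64-bit table entry is exposed as two 32-bit registers: an even
    selector reads the low half, an odd selector reads the high half.
    Selectors below the table are not modeled and count as out of range.
    """
    redir_index = (ioregsel - FIRST_REDIR_SELECTOR) // 2
    # Explicit bounds check that stays active in every build.
    if redir_index in range(IOAPIC_NUM_PINS):
        content = redirtbl[redir_index]
    else:
        content = ALL_ONES_64  # never index outside the table
    if ioregsel % 2 == 1:
        return content // WORD
    return content % WORD
```

An out-of-range selector then yields a fixed value instead of whatever
happens to sit in memory past the end of the table.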

Tested: tested against apic unit tests.

Signed-off-by: Andrew Honig &lt;ahonig@google.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
</entry>
<entry>
<title>hlist: drop the node parameter from iterators</title>
<updated>2013-02-28T03:10:24Z</updated>
<author>
<name>Sasha Levin</name>
<email>sasha.levin@oracle.com</email>
</author>
<published>2013-02-28T01:06:00Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=b67bfe0d42cac56c512dd5da4b1b347a23f4b70a'/>
<id>urn:sha1:b67bfe0d42cac56c512dd5da4b1b347a23f4b70a</id>
<content type='text'>
I'm not sure why, but the hlist for-each-entry iterators were conceived
differently from the list ones.  The list iterator takes three parameters:

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
do they not really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small number of places were using the 'node' parameter; these
 were modified to use 'obj-&gt;member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch, which is mostly the work of Peter Senna Tschudin, is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    &lt;+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+&gt;

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foundation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin &lt;peter.senna@gmail.com&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: Gleb Natapov &lt;gleb@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: Remove user_alloc from struct kvm_memory_slot</title>
<updated>2013-02-11T09:52:00Z</updated>
<author>
<name>Takuya Yoshikawa</name>
<email>yoshikawa_takuya_b1@lab.ntt.co.jp</email>
</author>
<published>2013-02-07T09:55:57Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7a905b1485adf863607b5fc9e32a3fa3838bcc23'/>
<id>urn:sha1:7a905b1485adf863607b5fc9e32a3fa3838bcc23</id>
<content type='text'>
This field was needed to differentiate memory slots created by the new
API, KVM_SET_USER_MEMORY_REGION, from those by the old equivalent,
KVM_SET_MEMORY_REGION, whose support was dropped long before:

  commit b74a07beed0e64bfba413dcb70dd6749c57f43dc
  KVM: Remove kernel-allocated memory regions

Although we also have private memory slots for which KVM allocates
memory with vm_mmap() (in other words, !user_alloc slots), the slot id
should be enough to differentiate them.

Note: corresponding function parameters will be removed later.

Reviewed-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Takuya Yoshikawa &lt;yoshikawa_takuya_b1@lab.ntt.co.jp&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: set_memory_region: Disallow changing read-only attribute later</title>
<updated>2013-02-05T00:56:47Z</updated>
<author>
<name>Takuya Yoshikawa</name>
<email>yoshikawa_takuya_b1@lab.ntt.co.jp</email>
</author>
<published>2013-01-30T10:40:41Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=75d61fbcf563373696578570e914f555e12c8d97'/>
<id>urn:sha1:75d61fbcf563373696578570e914f555e12c8d97</id>
<content type='text'>
As Xiao pointed out, there are a few problems with changing the read-only
attribute of an existing slot:
 - kvm_arch_commit_memory_region() write protects the memory slot only
   for GET_DIRTY_LOG when modifying the flags.
 - FNAME(sync_page) uses the old spte value to set a new one without
   checking KVM_MEM_READONLY flag.

Since we flush all shadow pages when creating a new slot, the simplest
fix is to disallow such problematic flag changes: this is safe because
no one is doing such things.

Reviewed-by: Gleb Natapov &lt;gleb@redhat.com&gt;
Signed-off-by: Takuya Yoshikawa &lt;yoshikawa_takuya_b1@lab.ntt.co.jp&gt;
Cc: Xiao Guangrong &lt;xiaoguangrong@linux.vnet.ibm.com&gt;
Cc: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: set_memory_region: Identify the requested change explicitly</title>
<updated>2013-02-05T00:00:53Z</updated>
<author>
<name>Takuya Yoshikawa</name>
<email>yoshikawa_takuya_b1@lab.ntt.co.jp</email>
</author>
<published>2013-01-29T02:00:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f64c0398939483eb1d8951f24fbc21e94ed54457'/>
<id>urn:sha1:f64c0398939483eb1d8951f24fbc21e94ed54457</id>
<content type='text'>
KVM_SET_USER_MEMORY_REGION forces __kvm_set_memory_region() to identify
what kind of change is being requested by checking the arguments.  The
current code does this checking at various points in the code, and the
conditions used there are not easy to understand at first glance.

This patch consolidates these checks and introduces an enum to name the
possible changes to clean up the code.

Although this does not introduce any functional changes, there is one
change which optimizes the code a bit: if we have nothing to change, the
new code returns 0 immediately.

Note that the return value for this case cannot be changed since QEMU
relies on it: we noticed this when we changed it to -EINVAL and got a
section mismatch error at the final stage of live migration.
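
To make the idea of naming the requested change concrete, here is a small
Python sketch.  It is only an illustration under stated assumptions: the
Change enum, its member names, the tuple layout and the classify helper are
hypothetical, and requests the sketch does not model raise ValueError.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical names for the kinds of request a caller can make."""
    NONE = "nothing to change"
    CREATE = "create"
    DELETE = "delete"
    MOVE = "move"
    FLAGS_ONLY = "flags only"


def classify(old, new):
    """Classify a request from (base_gfn, npages, flags) tuples.

    npages == 0 stands for "no slot".  All checks live in one place so
    that later code can switch on the returned name.
    """
    old_base, old_npages, old_flags = old
    new_base, new_npages, new_flags = new
    if new_npages == 0:
        if old_npages == 0:
            raise ValueError("deleting a slot that does not exist")
        return Change.DELETE
    if old_npages == 0:
        return Change.CREATE
    if new_npages != old_npages:
        raise ValueError("resizing is not modeled in this sketch")
    if new_base != old_base:
        return Change.MOVE
    if new_flags != old_flags:
        return Change.FLAGS_ONLY
    return Change.NONE


def set_memory_region(old, new):
    """Return 0 on success; a no-op request succeeds immediately."""
    change = classify(old, new)
    if change is Change.NONE:
        return 0  # nothing to do, and callers rely on success here
    # A real implementation would act on `change` at this point.
    return 0
```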

Signed-off-by: Takuya Yoshikawa &lt;yoshikawa_takuya_b1@lab.ntt.co.jp&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
</entry>
<entry>
<title>kvm: Handle yield_to failure return code for potential undercommit case</title>
<updated>2013-01-29T13:38:45Z</updated>
<author>
<name>Raghavendra K T</name>
<email>raghavendra.kt@linux.vnet.ibm.com</email>
</author>
<published>2013-01-22T07:39:24Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c45c528e899094b9049b3c900e2cf1f00aa0490c'/>
<id>urn:sha1:c45c528e899094b9049b3c900e2cf1f00aa0490c</id>
<content type='text'>
yield_to returns -ESRCH when the run queue length of both the source and
the target of yield_to is one. When we see three successive failures of
yield_to, we assume we are in a potential undercommit case and abort
from the PLE handler.
The assumption is backed by the low probability of a wrong decision
even in worst-case scenarios such as an average runqueue length
between 1 and 2.

More detail on the rationale behind using three tries:
if p is the probability of finding rq length one on a particular cpu,
and if we do n tries, then the probability of exiting the PLE handler is:

 p^(n+1) [ because we would have come across one source with rq length
 1 and n target cpu rqs with length 1 ]

so
num tries:         probability of aborting ple handler (1.5x overcommit)
 1                 1/4
 2                 1/8
 3                 1/16
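
The table can be reproduced with a few lines of Python.  This is a small
check, assuming p = 1/2 for the 1.5x overcommit case (the value implied by
the 1/4 entry for one try); abort_probability is a hypothetical helper name.

```python
from fractions import Fraction


def abort_probability(p, tries):
    """Chance that the source and all `tries` targets have rq length one."""
    return p ** (tries + 1)


p = Fraction(1, 2)  # assumed value for the 1.5x overcommit case
for n in (1, 2, 3):
    print(n, abort_probability(p, n))
```

Running it prints 1/4, 1/8 and 1/16 for one, two and three tries.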

We can lower this probability further with more tries, but the problem is
the overhead.
Also, if we have tried three times, we have already iterated over three
eligible vcpus along with many non-eligible candidates. In the worst case,
if we iterate over all the vcpus, 1x performance drops and overcommit
performance takes a hit.

Note that we do not update the last boosted vcpu in failure cases.
Thanks to Avi for raising the question of aborting after the first
yield_to failure.
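
The retry policy can be sketched in Python as follows.  This is a simplified
model, not the kernel loop: ple_handler and its parameters are hypothetical,
and yield_to is assumed to return 1 on success, 0 when no yield happened,
and -ESRCH in the potential undercommit case.

```python
import errno


def ple_handler(candidates, yield_to, max_failures=3):
    """Try to boost one candidate vcpu.

    Returns the vcpu that was boosted, or None.  None also covers the
    abort case, so the caller leaves the last boosted vcpu untouched.
    """
    failures = 0
    for vcpu in candidates:
        result = yield_to(vcpu)
        if result == -errno.ESRCH:
            failures += 1
            if failures == max_failures:
                return None  # likely undercommit: stop scanning
        elif result == 1:
            return vcpu
    return None
```

With a yield_to that always reports -ESRCH, the loop stops after the third
candidate instead of walking the whole list.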

Reviewed-by: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Signed-off-by: Raghavendra K T &lt;raghavendra.kt@linux.vnet.ibm.com&gt;
Tested-by: Chegu Vinod &lt;chegu_vinod@hp.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
</content>
</entry>
<entry>
<title>x86, apicv: add virtual interrupt delivery support</title>
<updated>2013-01-29T08:48:19Z</updated>
<author>
<name>Yang Zhang</name>
<email>yang.z.zhang@Intel.com</email>
</author>
<published>2013-01-25T02:18:51Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c7c9c56ca26f7b9458711b2d78b60b60e0d38ba7'/>
<id>urn:sha1:c7c9c56ca26f7b9458711b2d78b60b60e0d38ba7</id>
<content type='text'>
Virtual interrupt delivery removes the need for KVM to inject vAPIC
interrupts manually; injection is fully taken care of by the hardware.
This requires some special handling in the existing interrupt injection
path:

- For a pending interrupt, instead of direct injection, we may need to
  update architecture-specific indicators before resuming the guest.

- A pending interrupt that is masked by the ISR should also be
  considered in the above update action, since the hardware will decide
  when to inject it at the right time. The current has_interrupt and
  get_interrupt only return a valid vector from the injection point of
  view.

Reviewed-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Signed-off-by: Yang Zhang &lt;yang.z.zhang@Intel.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
</content>
</entry>
<entry>
<title>kvm: Obey read-only mappings in iommu</title>
<updated>2013-01-27T10:41:41Z</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2013-01-24T22:04:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d47510e295c0f82699192a61d715351cf00f65de'/>
<id>urn:sha1:d47510e295c0f82699192a61d715351cf00f65de</id>
<content type='text'>
We've been ignoring read-only mappings and programming everything
into the iommu as read-write.  Fix this to only include the write
access flag when read-only is not set.
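
A minimal Python sketch of the intended flag selection follows.  The
IommuProt type, its values and the iommu_prot helper are hypothetical and
only model the rule stated above.

```python
from enum import Flag


class IommuProt(Flag):
    """Hypothetical access flags passed to the IOMMU mapping call."""
    READ = 1
    WRITE = 2


def iommu_prot(slot_read_only):
    """Always allow reads; allow writes only for writable slots."""
    prot = IommuProt.READ
    if not slot_read_only:
        prot = prot | IommuProt.WRITE
    return prot
```

A read-only slot is then mapped with READ alone, and every other slot with
both READ and WRITE.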

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
</content>
</entry>
<entry>
<title>kvm: Force IOMMU remapping on memory slot read-only flag changes</title>
<updated>2013-01-27T10:41:30Z</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2013-01-24T22:04:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=261874b0d5ebe2a5ccc544df7170d6559635e79a'/>
<id>urn:sha1:261874b0d5ebe2a5ccc544df7170d6559635e79a</id>
<content type='text'>
Memory slot flags can be altered without changing other parameters of
the slot.  The read-only attribute is the only one the IOMMU cares
about, so generate an un-map, re-map when it changes.  This also
avoids unnecessarily re-mapping the slot when no IOMMU-visible changes
are made.
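
The decision can be sketched in Python as below.  MemFlag and
needs_iommu_remap are hypothetical names for this example; the two flag
values mirror KVM_MEM_LOG_DIRTY_PAGES and KVM_MEM_READONLY.

```python
from enum import Flag


class MemFlag(Flag):
    """Memory slot flags modeled for this sketch."""
    LOG_DIRTY_PAGES = 1
    READONLY = 2


def needs_iommu_remap(old_flags, new_flags):
    """Remap only when the read-only attribute differs between the two."""
    was_read_only = MemFlag.READONLY in old_flags
    is_read_only = MemFlag.READONLY in new_flags
    return was_read_only != is_read_only
```

Toggling dirty logging alone leaves the IOMMU mapping untouched, while
flipping the read-only attribute triggers the un-map and re-map.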

Reviewed-by: Xiao Guangrong &lt;xiaoguangrong@linux.vnet.ibm.com&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
</content>
</entry>
</feed>
