<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/security/keys/compat.c, branch v3.13.2</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/security/keys/compat.c?h=v3.13.2</id>
<link rel='self' href='https://git.amat.us/linux/atom/security/keys/compat.c?h=v3.13.2'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-09-24T09:35:19Z</updated>
<entry>
<title>KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches</title>
<updated>2013-09-24T09:35:19Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2013-09-24T09:35:19Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f36f8c75ae2e7d4da34f4c908cebdb4aa42c977e'/>
<id>urn:sha1:f36f8c75ae2e7d4da34f4c908cebdb4aa42c977e</id>
<content type='text'>
Add support for per-user_namespace registers of persistent per-UID kerberos
caches held within the kernel.

This allows the kerberos cache to be retained beyond the life of all a user's
processes so that the user's cron jobs can work.

The kerberos cache is envisioned as a keyring/key tree looking something like:

	struct user_namespace
	  \___ .krb_cache keyring		- The register
		\___ _krb.0 keyring		- Root's Kerberos cache
		\___ _krb.5000 keyring		- User 5000's Kerberos cache
		\___ _krb.5001 keyring		- User 5001's Kerberos cache
			\___ tkt785 big_key	- A ccache blob
			\___ tkt12345 big_key	- Another ccache blob

Or possibly:

	struct user_namespace
	  \___ .krb_cache keyring		- The register
		\___ _krb.0 keyring		- Root's Kerberos cache
		\___ _krb.5000 keyring		- User 5000's Kerberos cache
		\___ _krb.5001 keyring		- User 5001's Kerberos cache
			\___ tkt785 keyring	- A ccache
				\___ krbtgt/REDHAT.COM@REDHAT.COM big_key
				\___ http/REDHAT.COM@REDHAT.COM user
				\___ afs/REDHAT.COM@REDHAT.COM user
				\___ nfs/REDHAT.COM@REDHAT.COM user
				\___ krbtgt/KERNEL.ORG@KERNEL.ORG big_key
				\___ http/KERNEL.ORG@KERNEL.ORG big_key

What goes into a particular Kerberos cache is entirely up to userspace.  Kernel
support is limited to giving you the Kerberos cache keyring that you want.

The user asks for their Kerberos cache by:

	krb_cache = keyctl_get_krbcache(uid, dest_keyring);

The uid is -1 or the user's own UID for the user's own cache or the uid of some
other user's cache (requires CAP_SETUID).  This permits rpc.gssd or whatever to
mess with the cache.

The cache returned is a keyring named "_krb.&lt;uid&gt;" that the possessor can read,
search, clear, invalidate, unlink from and add links to.  Active LSMs get a
chance to rule on whether the caller is permitted to make a link.

Each uid's cache keyring is created when it first accessed and is given a
timeout that is extended each time this function is called so that the keyring
goes away after a while.  The timeout is configurable by sysctl but defaults to
three days.

Each user_namespace struct gets a lazily-created keyring that serves as the
register.  The cache keyrings are added to it.  This means that standard key
search and garbage collection facilities are available.

The user_namespace struct's register goes away when it does and anything left
in it is then automatically gc'd.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Tested-by: Simo Sorce &lt;simo@redhat.com&gt;
cc: Serge E. Hallyn &lt;serge.hallyn@ubuntu.com&gt;
cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys</title>
<updated>2013-03-12T18:05:45Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2013-02-25T15:20:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=8aec0f5d4137532de14e6554fd5dd201ff3a3c49'/>
<id>urn:sha1:8aec0f5d4137532de14e6554fd5dd201ff3a3c49</id>
<content type='text'>
Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to
compat_process_vm_rw() shows that the compatibility code requires an
explicit "access_ok()" check before calling
compat_rw_copy_check_uvector(). The same difference seems to appear when
we compare fs/read_write.c:do_readv_writev() to
fs/compat.c:compat_do_readv_writev().

This subtle difference between the compat and non-compat requirements
should probably be debated, as it seems to be error-prone. In fact,
there are two others sites that use this function in the Linux kernel,
and they both seem to get it wrong:

Now shifting our attention to fs/aio.c, we see that aio_setup_iocb()
also ends up calling compat_rw_copy_check_uvector() through
aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to
be missing. Same situation for
security/keys/compat.c:compat_keyctl_instantiate_key_iov().

I propose that we add the access_ok() check directly into
compat_rw_copy_check_uvector(), so callers don't have to worry about it,
and it therefore makes the compat call code similar to its non-compat
counterpart. Place the access_ok() check in the same location where
copy_from_user() can trigger a -EFAULT error in the non-compat code, so
the ABI behaviors are alike on both compat and non-compat.

While we are here, fix compat_do_readv_writev() so it checks for
compat_rw_copy_check_uvector() negative return values.

And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error
handling.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge commit 'v3.5-rc2' into next</title>
<updated>2012-06-10T12:52:10Z</updated>
<author>
<name>James Morris</name>
<email>james.l.morris@oracle.com</email>
</author>
<published>2012-06-10T12:52:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=66dd07b88a1c9d446f32253da606b87324fa620e'/>
<id>urn:sha1:66dd07b88a1c9d446f32253da606b87324fa620e</id>
<content type='text'>
</content>
</entry>
<entry>
<title>aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()</title>
<updated>2012-06-01T00:49:32Z</updated>
<author>
<name>Christopher Yeoh</name>
<email>cyeoh@au1.ibm.com</email>
</author>
<published>2012-05-31T23:26:42Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ac34ebb3a67e699edcb5ac72f19d31679369dfaa'/>
<id>urn:sha1:ac34ebb3a67e699edcb5ac72f19d31679369dfaa</id>
<content type='text'>
A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after
changes made to support CMA in an earlier patch.

Rather than having an additional check_access parameter to these
functions, the first paramater type is overloaded to allow the caller to
specify CHECK_IOVEC_ONLY which means check that the contents of the iovec
are valid, but do not check the memory that they point to.  This is used
by process_vm_readv/writev where we need to validate that a iovec passed
to the syscall is valid but do not want to check the memory that it points
to at this point because it refers to an address space in another process.

Signed-off-by: Chris Yeoh &lt;yeohc@au1.ibm.com&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>KEYS: Fix some sparse warnings</title>
<updated>2012-05-25T10:51:42Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2012-05-21T11:32:13Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=423b9788023263364ea5de04189f02bd9b6a12db'/>
<id>urn:sha1:423b9788023263364ea5de04189f02bd9b6a12db</id>
<content type='text'>
Fix some sparse warnings in the keyrings code:

 (1) compat_keyctl_instantiate_key_iov() should be static.

 (2) There were a couple of places where a pointer was being compared against
     integer 0 rather than NULL.

 (3) keyctl_instantiate_key_common() should not take a __user-labelled iovec
     pointer as the caller must have copied the iovec to kernel space.

 (4) __key_link_begin() takes and __key_link_end() releases
     keyring_serialise_link_sem under some circumstances and so this should be
     declared.

     Note that adding __acquires() and __releases() for this doesn't help cure
     the warnings messages - something only commenting out both helps.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: James Morris &lt;james.l.morris@oracle.com&gt;
</content>
</entry>
<entry>
<title>KEYS: Add invalidation support</title>
<updated>2012-05-11T09:56:56Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2012-05-11T09:56:56Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fd75815f727f157a05f4c96b5294a4617c0557da'/>
<id>urn:sha1:fd75815f727f157a05f4c96b5294a4617c0557da</id>
<content type='text'>
Add support for invalidating a key - which renders it immediately invisible to
further searches and causes the garbage collector to immediately wake up,
remove it from keyrings and then destroy it when it's no longer referenced.

It's better not to do this with keyctl_revoke() as that marks the key to start
returning -EKEYREVOKED to searches when what is actually desired is to have the
key refetched.

To invalidate a key the caller must be granted SEARCH permission by the key.
This may be too strict.  It may be better to also permit invalidation if the
caller has any of READ, WRITE or SETATTR permission.

The primary use for this is to evict keys that are cached in special keyrings,
such as the DNS resolver or an ID mapper.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>Cross Memory Attach</title>
<updated>2011-11-01T00:30:44Z</updated>
<author>
<name>Christopher Yeoh</name>
<email>cyeoh@au1.ibm.com</email>
</author>
<published>2011-11-01T00:06:39Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fcf634098c00dd9cd247447368495f0b79be12d1'/>
<id>urn:sha1:fcf634098c00dd9cd247447368495f0b79be12d1</id>
<content type='text'>
The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call.  There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.

- Use of /proc/pid/mem has been considered, but there are issues with
  using it:
  - Does not allow for specifying iovecs for both src and dest, assuming
    preadv or pwritev was implemented either the area read from or
  written to would need to be contiguous.
  - Currently mem_read allows only processes who are currently
  ptrace'ing the target and are still able to ptrace the target to read
  from the target. This check could possibly be moved to the open call,
  but its not clear exactly what race this restriction is stopping
  (reason  appears to have been lost)
  - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
  domain socket is a bit ugly from a userspace point of view,
  especially when you may have hundreds if not (eventually) thousands
  of processes  that all need to do this with each other
  - Doesn't allow for some future use of the interface we would like to
  consider adding in the future (see below)
  - Interestingly reading from /proc/pid/mem currently actually
  involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems.  Since you need the reader and writer working co-operatively if
the pipe is not drained then you block.  Which requires some wrapping to
do non blocking on the send side or polling on the receive.  In all to all
communication it requires ordering otherwise you can deadlock.  And in the
example of many MPI tasks writing to one MPI task vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could.  For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a mathops (say the reduce is doing a sum) as
this would save us doing a copy.  We don't need to keep a copy of the data
from the source.  I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI).  This interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication.  And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&amp;m=130105930902915&amp;w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc, other architecture should
mainly (I think) just need to add syscall numbers for the process_vm_readv
and process_vm_writev. There are 32 bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

Signed-off-by: Chris Yeoh &lt;yeohc@au1.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: &lt;linux-man@vger.kernel.org&gt;
Cc: &lt;linux-arch@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>KEYS: Add an iovec version of KEYCTL_INSTANTIATE</title>
<updated>2011-03-08T00:17:22Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2011-03-07T15:06:20Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ee009e4a0d4555ed522a631bae9896399674f064'/>
<id>urn:sha1:ee009e4a0d4555ed522a631bae9896399674f064</id>
<content type='text'>
Add a keyctl op (KEYCTL_INSTANTIATE_IOV) that is like KEYCTL_INSTANTIATE, but
takes an iovec array and concatenates the data in-kernel into one buffer.
Since the KEYCTL_INSTANTIATE copies the data anyway, this isn't too much of a
problem.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</content>
</entry>
<entry>
<title>KEYS: Add a new keyctl op to reject a key with a specified error code</title>
<updated>2011-03-08T00:17:18Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2011-03-07T15:06:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fdd1b94581782a2ddf9124414e5b7a5f48ce2f9c'/>
<id>urn:sha1:fdd1b94581782a2ddf9124414e5b7a5f48ce2f9c</id>
<content type='text'>
Add a new keyctl op to reject a key with a specified error code.  This works
much the same as negating a key, and so keyctl_negate_key() is made a special
case of keyctl_reject_key().  The difference is that keyctl_negate_key()
selects ENOKEY as the error to be reported.

Typically the key would be rejected with EKEYEXPIRED, EKEYREVOKED or
EKEYREJECTED, but this is not mandatory.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</content>
</entry>
<entry>
<title>KEYS: Fix up comments in key management code</title>
<updated>2011-01-21T22:59:30Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2011-01-20T16:38:33Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=973c9f4f49ca96a53bcf6384c4c59ccd26c33906'/>
<id>urn:sha1:973c9f4f49ca96a53bcf6384c4c59ccd26c33906</id>
<content type='text'>
Fix up comments in the key management code.  No functional changes.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
