diff options
Diffstat (limited to 'Documentation/filesystems/nfs')
| -rw-r--r-- | Documentation/filesystems/nfs/00-INDEX | 6 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/Exporting | 9 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/fault_injection.txt | 69 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/idmapper.txt | 24 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/nfs.txt | 44 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/nfs41-server.txt | 85 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/nfsd-admin-interfaces.txt | 41 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/nfsroot.txt | 12 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/pnfs.txt | 63 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/rpc-server-gss.txt | 91 |
10 files changed, 366 insertions, 78 deletions
diff --git a/Documentation/filesystems/nfs/00-INDEX b/Documentation/filesystems/nfs/00-INDEX index a57e12411d2..53f3b596ac0 100644 --- a/Documentation/filesystems/nfs/00-INDEX +++ b/Documentation/filesystems/nfs/00-INDEX @@ -2,6 +2,8 @@ - this file (nfs-related documentation). Exporting - explanation of how to make filesystems exportable. +fault_injection.txt + - information for using fault injection on the server knfsd-stats.txt - statistics which the NFS server makes available to user space. nfs.txt @@ -10,6 +12,8 @@ nfs41-server.txt - info on the Linux server implementation of NFSv4 minor version 1. nfs-rdma.txt - how to install and setup the Linux NFS/RDMA client and server software +nfsd-admin-interfaces.txt + - Administrative interfaces for nfsd. nfsroot.txt - short guide on setting up a diskless box with NFS root filesystem. pnfs.txt @@ -18,3 +22,5 @@ rpc-cache.txt - introduction to the caching mechanisms in the sunrpc layer. idmapper.txt - information for configuring request-keys to be used by idmapper +rpc-server-gss.txt + - Information on GSS authentication support in the NFS Server diff --git a/Documentation/filesystems/nfs/Exporting b/Documentation/filesystems/nfs/Exporting index 87019d2b598..e543b1a619c 100644 --- a/Documentation/filesystems/nfs/Exporting +++ b/Documentation/filesystems/nfs/Exporting @@ -92,7 +92,14 @@ For a filesystem to be exportable it must: 1/ provide the filehandle fragment routines described below. 2/ make sure that d_splice_alias is used rather than d_add when ->lookup finds an inode for a given parent and name. - Typically the ->lookup routine will end with a: + + If inode is NULL, d_splice_alias(inode, dentry) is equivalent to + + d_add(dentry, inode), NULL + + Similarly, d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err) + + Typically the ->lookup routine will simply end with a: return d_splice_alias(inode, dentry); } diff --git a/Documentation/filesystems/nfs/fault_injection.txt b/Documentation/filesystems/nfs/fault_injection.txt new file mode 100644 index 00000000000..426d166089a --- /dev/null +++ b/Documentation/filesystems/nfs/fault_injection.txt @@ -0,0 +1,69 @@ + +Fault Injection +=============== +Fault injection is a method for forcing errors that may not normally occur, or +may be difficult to reproduce. Forcing these errors in a controlled environment +can help the developer find and fix bugs before their code is shipped in a +production system. Injecting an error on the Linux NFS server will allow us to +observe how the client reacts and if it manages to recover its state correctly. + +NFSD_FAULT_INJECTION must be selected when configuring the kernel to use this +feature. + + +Using Fault Injection +===================== +On the client, mount the fault injection server through NFS v4.0+ and do some +work over NFS (open files, take locks, ...). + +On the server, mount the debugfs filesystem to <debug_dir> and ls +<debug_dir>/nfsd. This will show a list of files that will be used for +injecting faults on the NFS server. As root, write a number n to the file +corresponding to the action you want the server to take. The server will then +process the first n items it finds. So if you want to forget 5 locks, echo '5' +to <debug_dir>/nfsd/forget_locks. A value of 0 will tell the server to forget +all corresponding items. A log message will be created containing the number +of items forgotten (check dmesg). + +Go back to work on the client and check if the client recovered from the error +correctly. + + +Available Faults +================ +forget_clients: + The NFS server keeps a list of clients that have placed a mount call. If + this list is cleared, the server will have no knowledge of who the client + is, forcing the client to reauthenticate with the server. + +forget_openowners: + The NFS server keeps a list of what files are currently opened and who + they were opened by. Clearing this list will force the client to reopen + its files. + +forget_locks: + The NFS server keeps a list of what files are currently locked in the VFS. + Clearing this list will force the client to reclaim its locks (files are + unlocked through the VFS as they are cleared from this list). + +forget_delegations: + A delegation is used to assure the client that a file, or part of a file, + has not changed since the delegation was awarded. Clearing this list will + force the client to reaquire its delegation before accessing the file + again. + +recall_delegations: + Delegations can be recalled by the server when another client attempts to + access a file. This test will notify the client that its delegation has + been revoked, forcing the client to reaquire the delegation before using + the file again. + + +tools/nfs/inject_faults.sh script +================================= +This script has been created to ease the fault injection process. This script +will detect the mounted debugfs directory and write to the files located there +based on the arguments passed by the user. For example, running +`inject_faults.sh forget_locks 1` as root will instruct the server to forget +one lock. Running `inject_faults forget_locks` will instruct the server to +forgetall locks. diff --git a/Documentation/filesystems/nfs/idmapper.txt b/Documentation/filesystems/nfs/idmapper.txt index b9b4192ea8b..fe03d10bb79 100644 --- a/Documentation/filesystems/nfs/idmapper.txt +++ b/Documentation/filesystems/nfs/idmapper.txt @@ -4,13 +4,21 @@ ID Mapper ========= Id mapper is used by NFS to translate user and group ids into names, and to translate user and group names into ids. Part of this translation involves -performing an upcall to userspace to request the information. Id mapper will -user request-key to perform this upcall and cache the result. The program -/usr/sbin/nfs.idmap should be called by request-key, and will perform the -translation and initialize a key with the resulting information. +performing an upcall to userspace to request the information. There are two +ways NFS could obtain this information: placing a call to /sbin/request-key +or by placing a call to the rpc.idmap daemon. + +NFS will attempt to call /sbin/request-key first. If this succeeds, the +result will be cached using the generic request-key cache. This call should +only fail if /etc/request-key.conf is not configured for the id_resolver key +type, see the "Configuring" section below if you wish to use the request-key +method. + +If the call to /sbin/request-key fails (if /etc/request-key.conf is not +configured with the id_resolver key type), then the idmapper will ask the +legacy rpc.idmap daemon for the id mapping. This result will be stored +in a custom NFS idmap cache. - NFS_USE_NEW_IDMAPPER must be selected when configuring the kernel to use this - feature. =========== Configuring @@ -47,8 +55,8 @@ request-key will find the first matching line and corresponding program. In this case, /some/other/program will handle all uid lookups and /usr/sbin/nfs.idmap will handle gid, user, and group lookups. -See <file:Documentation/keys-request-keys.txt> for more information about the -request-key function. +See <file:Documentation/security/keys-request-key.txt> for more information +about the request-key function. ========= diff --git a/Documentation/filesystems/nfs/nfs.txt b/Documentation/filesystems/nfs/nfs.txt index f50f26ce6cd..f2571c8bef7 100644 --- a/Documentation/filesystems/nfs/nfs.txt +++ b/Documentation/filesystems/nfs/nfs.txt @@ -12,9 +12,47 @@ and work is in progress on adding support for minor version 1 of the NFSv4 protocol. The purpose of this document is to provide information on some of the -upcall interfaces that are used in order to provide the NFS client with -some of the information that it requires in order to fully comply with -the NFS spec. +special features of the NFS client that can be configured by system +administrators. + + +The nfs4_unique_id parameter +============================ + +NFSv4 requires clients to identify themselves to servers with a unique +string. File open and lock state shared between one client and one server +is associated with this identity. To support robust NFSv4 state recovery +and transparent state migration, this identity string must not change +across client reboots. + +Without any other intervention, the Linux client uses a string that contains +the local system's node name. System administrators, however, often do not +take care to ensure that node names are fully qualified and do not change +over the lifetime of a client system. Node names can have other +administrative requirements that require particular behavior that does not +work well as part of an nfs_client_id4 string. + +The nfs.nfs4_unique_id boot parameter specifies a unique string that can be +used instead of a system's node name when an NFS client identifies itself to +a server. Thus, if the system's node name is not unique, or it changes, its +nfs.nfs4_unique_id stays the same, preventing collision with other clients +or loss of state during NFS reboot recovery or transparent state migration. + +The nfs.nfs4_unique_id string is typically a UUID, though it can contain +anything that is believed to be unique across all NFS clients. An +nfs4_unique_id string should be chosen when a client system is installed, +just as a system's root file system gets a fresh UUID in its label at +install time. + +The string should remain fixed for the lifetime of the client. It can be +changed safely if care is taken that the client shuts down cleanly and all +outstanding NFSv4 state has expired, to prevent loss of NFSv4 state. + +This string can be stored in an NFS client's grub.conf, or it can be provided +via a net boot facility such as PXE. It may also be specified as an nfs.ko +module parameter. Specifying a uniquifier string is not support for NFS +clients running in containers. + The DNS resolver ================ diff --git a/Documentation/filesystems/nfs/nfs41-server.txt b/Documentation/filesystems/nfs/nfs41-server.txt index 04884914a1c..c49cd7e796e 100644 --- a/Documentation/filesystems/nfs/nfs41-server.txt +++ b/Documentation/filesystems/nfs/nfs41-server.txt @@ -5,11 +5,11 @@ Server support for minorversion 1 can be controlled using the by reading this file will contain either "+4.1" or "-4.1" correspondingly. -Currently, server support for minorversion 1 is disabled by default. -It can be enabled at run time by writing the string "+4.1" to +Currently, server support for minorversion 1 is enabled by default. +It can be disabled at run time by writing the string "-4.1" to the /proc/fs/nfsd/versions control file. Note that to write this -control file, the nfsd service must be taken down. Use your user-mode -nfs-utils to set this up; see rpc.nfsd(8) +control file, the nfsd service must be taken down. You can use rpc.nfsd +for this; see rpc.nfsd(8). (Warning: older servers will interpret "+4.1" and "-4.1" as "+4" and "-4", respectively. Therefore, code meant to work on both new and old @@ -29,49 +29,6 @@ are still under development out of tree. See http://wiki.linux-nfs.org/wiki/index.php/PNFS_prototype_design for more information. -The current implementation is intended for developers only: while it -does support ordinary file operations on clients we have tested against -(including the linux client), it is incomplete in ways which may limit -features unexpectedly, cause known bugs in rare cases, or cause -interoperability problems with future clients. Known issues: - - - gss support is questionable: currently mounts with kerberos - from a linux client are possible, but we aren't really - conformant with the spec (for example, we don't use kerberos - on the backchannel correctly). - - no trunking support: no clients currently take advantage of - trunking, but this is a mandatory feature, and its use is - recommended to clients in a number of places. (E.g. to ensure - timely renewal in case an existing connection's retry timeouts - have gotten too long; see section 8.3 of the RFC.) - Therefore, lack of this feature may cause future clients to - fail. - - Incomplete backchannel support: incomplete backchannel gss - support and no support for BACKCHANNEL_CTL mean that - callbacks (hence delegations and layouts) may not be - available and clients confused by the incomplete - implementation may fail. - - Server reboot recovery is unsupported; if the server reboots, - clients may fail. - - We do not support SSV, which provides security for shared - client-server state (thus preventing unauthorized tampering - with locks and opens, for example). It is mandatory for - servers to support this, though no clients use it yet. - - Mandatory operations which we do not support, such as - DESTROY_CLIENTID, FREE_STATEID, SECINFO_NO_NAME, and - TEST_STATEID, are not currently used by clients, but will be - (and the spec recommends their uses in common cases), and - clients should not be expected to know how to recover from the - case where they are not supported. This will eventually cause - interoperability failures. - -In addition, some limitations are inherited from the current NFSv4 -implementation: - - - Incomplete delegation enforcement: if a file is renamed or - unlinked, a client holding a delegation may continue to - indefinitely allow opens of the file under the old name. - The table below, taken from the NFSv4.1 document, lists the operations that are mandatory to implement (REQ), optional (OPT), and NFSv4.0 operations that are required not to implement (MNI) @@ -98,8 +55,8 @@ Operations | | MNI | or OPT) | | +----------------------+------------+--------------+----------------+ | ACCESS | REQ | | Section 18.1 | -NS | BACKCHANNEL_CTL | REQ | | Section 18.33 | -NS | BIND_CONN_TO_SESSION | REQ | | Section 18.34 | +I | BACKCHANNEL_CTL | REQ | | Section 18.33 | +I | BIND_CONN_TO_SESSION | REQ | | Section 18.34 | | CLOSE | REQ | | Section 18.2 | | COMMIT | REQ | | Section 18.3 | | CREATE | REQ | | Section 18.4 | @@ -108,10 +65,10 @@ NS*| DELEGPURGE | OPT | FDELG (REQ) | Section 18.5 | | DELEGRETURN | OPT | FDELG, | Section 18.6 | | | | DDELG, pNFS | | | | | (REQ) | | -NS | DESTROY_CLIENTID | REQ | | Section 18.50 | +I | DESTROY_CLIENTID | REQ | | Section 18.50 | I | DESTROY_SESSION | REQ | | Section 18.37 | I | EXCHANGE_ID | REQ | | Section 18.35 | -NS | FREE_STATEID | REQ | | Section 18.38 | +I | FREE_STATEID | REQ | | Section 18.38 | | GETATTR | REQ | | Section 18.7 | P | GETDEVICEINFO | OPT | pNFS (REQ) | Section 18.40 | P | GETDEVICELIST | OPT | pNFS (OPT) | Section 18.41 | @@ -145,14 +102,14 @@ NS*| OPENATTR | OPT | | Section 18.17 | | RESTOREFH | REQ | | Section 18.27 | | SAVEFH | REQ | | Section 18.28 | | SECINFO | REQ | | Section 18.29 | -NS | SECINFO_NO_NAME | REC | pNFS files | Section 18.45, | +I | SECINFO_NO_NAME | REC | pNFS files | Section 18.45, | | | | layout (REQ) | Section 13.12 | I | SEQUENCE | REQ | | Section 18.46 | | SETATTR | REQ | | Section 18.30 | | SETCLIENTID | MNI | | N/A | | SETCLIENTID_CONFIRM | MNI | | N/A | NS | SET_SSV | REQ | | Section 18.47 | -NS | TEST_STATEID | REQ | | Section 18.48 | +I | TEST_STATEID | REQ | | Section 18.48 | | VERIFY | REQ | | Section 18.31 | NS*| WANT_DELEGATION | OPT | FDELG (OPT) | Section 18.49 | | WRITE | REQ | | Section 18.32 | @@ -189,6 +146,16 @@ NS*| CB_WANTS_CANCELLED | OPT | FDELG, | Section 20.10 | Implementation notes: +SSV: +* The spec claims this is mandatory, but we don't actually know of any + implementations, so we're ignoring it for now. The server returns + NFS4ERR_ENCR_ALG_UNSUPP on EXCHANGE_ID, which should be future-proof. + +GSS on the backchannel: +* Again, theoretically required but not widely implemented (in + particular, the current Linux client doesn't request it). We return + NFS4ERR_ENCR_ALG_UNSUPP on CREATE_SESSION. + DELEGPURGE: * mandatory only for servers that support CLAIM_DELEGATE_PREV and/or CLAIM_DELEG_PREV_FH (which allows clients to keep delegations that @@ -196,26 +163,18 @@ DELEGPURGE: now. EXCHANGE_ID: -* only SP4_NONE state protection supported * implementation ids are ignored CREATE_SESSION: * backchannel attributes are ignored -* backchannel security parameters are ignored SEQUENCE: * no support for dynamic slot table renegotiation (optional) -nfsv4.1 COMPOUND rules: -The following cases aren't supported yet: -* Enforcing of NFS4ERR_NOT_ONLY_OP for: BIND_CONN_TO_SESSION, CREATE_SESSION, - DESTROY_CLIENTID, DESTROY_SESSION, EXCHANGE_ID. -* DESTROY_SESSION MUST be the final operation in the COMPOUND request. - Nonstandard compound limitations: * No support for a sessions fore channel RPC compound that requires both a ca_maxrequestsize request and a ca_maxresponsesize reply, so we may fail to live up to the promise we made in CREATE_SESSION fore channel negotiation. -* No more than one IO operation (read, write, readdir) allowed per - compound. + +See also http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues. diff --git a/Documentation/filesystems/nfs/nfsd-admin-interfaces.txt b/Documentation/filesystems/nfs/nfsd-admin-interfaces.txt new file mode 100644 index 00000000000..56a96fb08a7 --- /dev/null +++ b/Documentation/filesystems/nfs/nfsd-admin-interfaces.txt @@ -0,0 +1,41 @@ +Administrative interfaces for nfsd +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Note that normally these interfaces are used only by the utilities in +nfs-utils. + +nfsd is controlled mainly by pseudofiles under the "nfsd" filesystem, +which is normally mounted at /proc/fs/nfsd/. + +The server is always started by the first write of a nonzero value to +nfsd/threads. + +Before doing that, NFSD can be told which sockets to listen on by +writing to nfsd/portlist; that write may be: + + - an ascii-encoded file descriptor, which should refer to a + bound (and listening, for tcp) socket, or + - "transportname port", where transportname is currently either + "udp", "tcp", or "rdma". + +If nfsd is started without doing any of these, then it will create one +udp and one tcp listener at port 2049 (see nfsd_init_socks). + +On startup, nfsd and lockd grace periods start. + +nfsd is shut down by a write of 0 to nfsd/threads. All locks and state +are thrown away at that point. + +Between startup and shutdown, the number of threads may be adjusted up +or down by additional writes to nfsd/threads or by writes to +nfsd/pool_threads. + +For more detail about files under nfsd/ and what they control, see +fs/nfsd/nfsctl.c; most of them have detailed comments. + +Implementation notes +^^^^^^^^^^^^^^^^^^^^ + +Note that the rpc server requires the caller to serialize addition and +removal of listening sockets, and startup and shutdown of the server. +For nfsd this is done using nfsd_mutex. diff --git a/Documentation/filesystems/nfs/nfsroot.txt b/Documentation/filesystems/nfs/nfsroot.txt index 90c71c6f0d0..2d66ed68812 100644 --- a/Documentation/filesystems/nfs/nfsroot.txt +++ b/Documentation/filesystems/nfs/nfsroot.txt @@ -78,7 +78,8 @@ nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>] flags = hard, nointr, noposix, cto, ac -ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf> +ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>: + <dns0-ip>:<dns1-ip> This parameter tells the kernel how to configure IP addresses of devices and also how to set up the IP routing table. It was originally called @@ -158,6 +159,13 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf> Default: any + <dns0-ip> IP address of first nameserver. + Value gets exported by /proc/net/pnp which is often linked + on embedded systems by /etc/resolv.conf. + + <dns1-ip> IP address of secound nameserver. + Same as above. + nfsrootdebug @@ -226,7 +234,7 @@ They depend on various facilities being available: cdrecord. e.g. - cdrecord dev=ATAPI:1,0,0 arch/i386/boot/image.iso + cdrecord dev=ATAPI:1,0,0 arch/x86/boot/image.iso For more information on isolinux, including how to create bootdisks for prebuilt kernels, see http://syslinux.zytor.com/ diff --git a/Documentation/filesystems/nfs/pnfs.txt b/Documentation/filesystems/nfs/pnfs.txt index bc0b9cfe095..adc81a35fe2 100644 --- a/Documentation/filesystems/nfs/pnfs.txt +++ b/Documentation/filesystems/nfs/pnfs.txt @@ -12,7 +12,7 @@ struct pnfs_layout_hdr ---------------------- The on-the-wire command LAYOUTGET corresponds to struct pnfs_layout_segment, usually referred to by the variable name lseg. -Each nfs_inode may hold a pointer to a cache of of these layout +Each nfs_inode may hold a pointer to a cache of these layout segments in nfsi->layout, of type struct pnfs_layout_hdr. We reference the header for the inode pointing to it, across each @@ -46,3 +46,64 @@ data server cache file driver devices refer to data servers, which are kept in a module level cache. Its reference is held over the lifetime of the deviceid pointing to it. + +lseg +---- +lseg maintains an extra reference corresponding to the NFS_LSEG_VALID +bit which holds it in the pnfs_layout_hdr's list. When the final lseg +is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED +bit is set, preventing any new lsegs from being added. + +layout drivers +-------------- + +PNFS utilizes what is called layout drivers. The STD defines 3 basic +layout types: "files" "objects" and "blocks". For each of these types +there is a layout-driver with a common function-vectors table which +are called by the nfs-client pnfs-core to implement the different layout +types. + +Files-layout-driver code is in: fs/nfs/nfs4filelayout.c && nfs4filelayoutdev.c +Objects-layout-deriver code is in: fs/nfs/objlayout/.. directory +Blocks-layout-deriver code is in: fs/nfs/blocklayout/.. directory + +objects-layout setup +-------------------- + +As part of the full STD implementation the objlayoutdriver.ko needs, at times, +to automatically login to yet undiscovered iscsi/osd devices. For this the +driver makes up-calles to a user-mode script called *osd_login* + +The path_name of the script to use is by default: + /sbin/osd_login. +This name can be overridden by the Kernel module parameter: + objlayoutdriver.osd_login_prog + +If Kernel does not find the osd_login_prog path it will zero it out +and will not attempt farther logins. An admin can then write new value +to the objlayoutdriver.osd_login_prog Kernel parameter to re-enable it. + +The /sbin/osd_login is part of the nfs-utils package, and should usually +be installed on distributions that support this Kernel version. + +The API to the login script is as follows: + Usage: $0 -u <URI> -o <OSDNAME> -s <SYSTEMID> + Options: + -u target uri e.g. iscsi://<ip>:<port> + (allways exists) + (More protocols can be defined in the future. + The client does not interpret this string it is + passed unchanged as received from the Server) + -o osdname of the requested target OSD + (Might be empty) + (A string which denotes the OSD name, there is a + limit of 64 chars on this string) + -s systemid of the requested target OSD + (Might be empty) + (This string, if not empty is always an hex + representation of the 20 bytes osd_system_id) + +blocks-layout setup +------------------- + +TODO: Document the setup needs of the blocks layout driver diff --git a/Documentation/filesystems/nfs/rpc-server-gss.txt b/Documentation/filesystems/nfs/rpc-server-gss.txt new file mode 100644 index 00000000000..716f4be8e8b --- /dev/null +++ b/Documentation/filesystems/nfs/rpc-server-gss.txt @@ -0,0 +1,91 @@ + +rpcsec_gss support for kernel RPC servers +========================================= + +This document gives references to the standards and protocols used to +implement RPCGSS authentication in kernel RPC servers such as the NFS +server and the NFS client's NFSv4.0 callback server. (But note that +NFSv4.1 and higher don't require the client to act as a server for the +purposes of authentication.) + +RPCGSS is specified in a few IETF documents: + - RFC2203 v1: http://tools.ietf.org/rfc/rfc2203.txt + - RFC5403 v2: http://tools.ietf.org/rfc/rfc5403.txt +and there is a 3rd version being proposed: + - http://tools.ietf.org/id/draft-williams-rpcsecgssv3.txt + (At draft n. 02 at the time of writing) + +Background +---------- + +The RPCGSS Authentication method describes a way to perform GSSAPI +Authentication for NFS. Although GSSAPI is itself completely mechanism +agnostic, in many cases only the KRB5 mechanism is supported by NFS +implementations. + +The Linux kernel, at the moment, supports only the KRB5 mechanism, and +depends on GSSAPI extensions that are KRB5 specific. + +GSSAPI is a complex library, and implementing it completely in kernel is +unwarranted. However GSSAPI operations are fundementally separable in 2 +parts: +- initial context establishment +- integrity/privacy protection (signing and encrypting of individual + packets) + +The former is more complex and policy-independent, but less +performance-sensitive. The latter is simpler and needs to be very fast. + +Therefore, we perform per-packet integrity and privacy protection in the +kernel, but leave the initial context establishment to userspace. We +need upcalls to request userspace to perform context establishment. + +NFS Server Legacy Upcall Mechanism +---------------------------------- + +The classic upcall mechanism uses a custom text based upcall mechanism +to talk to a custom daemon called rpc.svcgssd that is provide by the +nfs-utils package. + +This upcall mechanism has 2 limitations: + +A) It can handle tokens that are no bigger than 2KiB + +In some Kerberos deployment GSSAPI tokens can be quite big, up and +beyond 64KiB in size due to various authorization extensions attacked to +the Kerberos tickets, that needs to be sent through the GSS layer in +order to perform context establishment. + +B) It does not properly handle creds where the user is member of more +than a few housand groups (the current hard limit in the kernel is 65K +groups) due to limitation on the size of the buffer that can be send +back to the kernel (4KiB). + +NFS Server New RPC Upcall Mechanism +----------------------------------- + +The newer upcall mechanism uses RPC over a unix socket to a daemon +called gss-proxy, implemented by a userspace program called Gssproxy. + +The gss_proxy RPC protocol is currently documented here: + + https://fedorahosted.org/gss-proxy/wiki/ProtocolDocumentation + +This upcall mechanism uses the kernel rpc client and connects to the gssproxy +userspace program over a regular unix socket. The gssproxy protocol does not +suffer from the size limitations of the legacy protocol. + +Negotiating Upcall Mechanisms +----------------------------- + +To provide backward compatibility, the kernel defaults to using the +legacy mechanism. To switch to the new mechanism, gss-proxy must bind +to /var/run/gssproxy.sock and then write "1" to +/proc/net/rpc/use-gss-proxy. If gss-proxy dies, it must repeat both +steps. + +Once the upcall mechanism is chosen, it cannot be changed. To prevent +locking into the legacy mechanisms, the above steps must be performed +before starting nfsd. Whoever starts nfsd can guarantee this by reading +from /proc/net/rpc/use-gss-proxy and checking that it contains a +"1"--the read will block until gss-proxy has done its write to the file. |
