diff --git a/doc/documentation/chapters/developer.texi b/doc/documentation/chapters/developer.texi
new file mode 100644
index 0000000000..70fd7c7ebe
--- /dev/null
+++ b/doc/documentation/chapters/developer.texi
@@ -0,0 +1,8341 @@
+@c ***********************************************************************
+@node GNUnet Developer Handbook
+@chapter GNUnet Developer Handbook
+
+This book is intended as an introduction for programmers who want to
+extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
+application.
+
+For developers, GNUnet is:
+
+@itemize @bullet
+@item developed by a community that believes in the GNU philosophy
+@item Free Software (Free as in Freedom), licensed under the
+GNU General Public License
+@item A set of standards, including coding conventions and
+architectural rules
+@item A set of layered protocols, specifying both the communication
+between peers and the communication between components
+of a single peer
+@item A set of libraries with well-defined APIs suitable for
+writing extensions
+@end itemize
+
+In particular, the architecture specifies that a peer consists of many
+processes communicating via protocols. Processes can be written in almost
+any language.
+C and Java@footnote{As well as Guile} APIs exist for accessing existing
+services and for writing extensions.
+It is possible to write extensions in other languages by
+implementing the necessary IPC protocols.
+
+GNUnet can be extended and improved along many possible dimensions, and
+anyone interested in Free Software and Freedom-enhancing Networking is
+welcome to join the effort. This Developer Handbook attempts to provide
+an initial introduction to some of the key design choices and central
+components of the system.
+This part of the GNUnet documentation is far from complete,
+and we welcome informed contributions, be it in the form of
+new chapters, sections or insightful comments.
+
+@menu
+* Developer Introduction::
+* Code overview::
+* System Architecture::
+* Subsystem stability::
+* Naming conventions and coding style guide::
+* Build-system::
+* Developing extensions for GNUnet using the gnunet-ext template::
+* Writing testcases::
+* TESTING library::
+* Performance regression analysis with Gauger::
+* TESTBED Subsystem::
+* libgnunetutil::
+* Automatic Restart Manager (ARM)::
+* TRANSPORT Subsystem::
+* NAT library::
+* Distance-Vector plugin::
+* SMTP plugin::
+* Bluetooth plugin::
+* WLAN plugin::
+* ATS Subsystem::
+* CORE Subsystem::
+* CADET Subsystem::
+* NSE Subsystem::
+* HOSTLIST Subsystem::
+* IDENTITY Subsystem::
+* NAMESTORE Subsystem::
+* PEERINFO Subsystem::
+* PEERSTORE Subsystem::
+* SET Subsystem::
+* STATISTICS Subsystem::
+* Distributed Hash Table (DHT)::
+* GNU Name System (GNS)::
+* GNS Namecache::
+* REVOCATION Subsystem::
+* File-sharing (FS) Subsystem::
+* REGEX Subsystem::
+@end menu
+
+@node Developer Introduction
+@section Developer Introduction
+
+This Developer Handbook is intended as a first introduction to GNUnet for
+new developers who want to extend the GNUnet framework. After the
+introduction, each of the GNUnet subsystems (directories in the
+@file{src/} tree) is (supposed to be) covered in its own chapter. In
+addition to this documentation, GNUnet developers should be aware of the
+services available to them on the GNUnet server.
+
+New developers can have a look at the GNUnet tutorials for C and Java
+available in the @file{src/} directory of the repository or under the
+following links:
+
+@c ** FIXME: Link to files in source, not online.
+@c ** FIXME: Where is the Java tutorial?
+@itemize @bullet
+@item @uref{https://gnunet.org/git/gnunet.git/plain/doc/gnunet-c-tutorial.pdf, GNUnet C tutorial}
+@item GNUnet Java tutorial
+@end itemize
+
+In addition to the GNUnet Reference Documentation you are reading,
+the GNUnet server at @uref{https://gnunet.org} contains
+various resources for GNUnet developers and those
+who aspire to become regular contributors.
+They are all conveniently reachable via the "Developer"
+entry in the navigation menu. Some additional tools (such as static
+analysis reports) require special developer access to perform certain
+operations. If you want (or require) access, you should contact
+@uref{http://grothoff.org/christian/, Christian Grothoff},
+GNUnet's maintainer.
+
+The public subsystems on the GNUnet server that help developers are:
+
+@itemize @bullet
+
+@item The version control system (git) keeps our code and enables
+distributed development.
+It is publicly accessible at @uref{https://gnunet.org/git/}.
+Only developers with write access can commit code; everyone else is
+encouraged to submit patches to the
+@uref{https://lists.gnu.org/mailman/listinfo/gnunet-developers, GNUnet-developers mailinglist}.
+
+@item The bugtracking system (Mantis).
+We use it to track feature requests, open bug reports and their
+resolutions.
+It can be accessed at @uref{https://gnunet.org/bugs/}.
+Anyone can report bugs, but only developers can claim to have fixed them.
+
+@item Our site installation of the
+CI@footnote{Continuous Integration} system @code{Buildbot} is used
+to check GNUnet builds automatically on a range of platforms.
+The web interface of this CI is exposed at
+@uref{https://gnunet.org/buildbot/}.
+Builds are triggered automatically 30 minutes after the last commit to
+our repository was made.
+
+@item The current quality of our automated test suite is assessed using
+code coverage analysis. This analysis is run daily; however, the webpage
+is only updated if all automated tests pass at that time.
Testcases that
+improve our code coverage are always welcome.
+
+@item We try to automatically find bugs using a static analysis scan.
+This scan is run daily; however, the webpage is only updated if all
+automated tests pass at that time. Note that not everything that is
+flagged by the analysis is a bug; sometimes even good code can be marked
+as possibly problematic. Nevertheless, developers are encouraged to at
+least be aware of all issues in their code that are listed.
+
+@item We use Gauger for automatic performance regression visualization.
+Details on how to use Gauger are here.
+
+@item We use @uref{http://junit.org/, junit} to automatically test
+@command{gnunet-java}.
+Automatically generated, current reports on the test suite are here.
+
+@item We use Cobertura to generate test coverage reports for gnunet-java.
+Current reports on test coverage are here.
+
+@end itemize
+
+
+
+@c ***********************************************************************
+@menu
+* Project overview::
+@end menu
+
+@node Project overview
+@subsection Project overview
+
+The GNUnet project consists at this point of several sub-projects. This
+section is supposed to give an initial overview of the various
+sub-projects. Note that this description also lists projects that are far
+from complete, including even those that have literally not a single line
+of code in them yet.
+
+GNUnet sub-projects in order of likely relevance are currently:
+
+@table @asis
+
+@item @command{gnunet}
+Core of the P2P framework, including file-sharing, VPN and
+chat applications; this is mostly what the Developer Handbook covers
+@item @command{gnunet-gtk}
+Gtk+-based user interfaces, including:
+
+@itemize @bullet
+@item @command{gnunet-fs-gtk} (file-sharing),
+@item @command{gnunet-statistics-gtk} (statistics over time),
+@item @command{gnunet-peerinfo-gtk}
+(information about current connections and known peers),
+@item @command{gnunet-chat-gtk} (chat GUI) and
+@item @command{gnunet-setup} (setup tool for "everything")
+@end itemize
+
+@item @command{gnunet-fuse}
+Mounting directories shared via GNUnet's file-sharing
+on GNU/Linux distributions
+@item @command{gnunet-update}
+Installation and update tool
+@item @command{gnunet-ext}
+Template for starting 'external' GNUnet projects
+@item @command{gnunet-java}
+Java APIs for writing GNUnet services and applications
+@c ** FIXME: Point to new website repository once we have it:
+@c ** @item svn/gnunet-www/ Code and media helping drive the GNUnet
+@c website
+@item @command{eclectic}
+Code to run GNUnet nodes on testbeds for research, development,
+testing and evaluation
+@c ** FIXME: Solve the status and location of gnunet-qt
+@item @command{gnunet-qt}
+Qt-based GNUnet GUI (is it deprecated?)
+@item @command{gnunet-cocoa}
+cocoa-based GNUnet GUI (is it deprecated?)
+@item @command{gnunet-guile}
+
+@end table
+
+We are also working on various supporting libraries and tools:
+@c ** FIXME: What about gauger, and what about libmwmodem?
+
+@table @asis
+@item @command{libextractor}
+GNU libextractor (meta data extraction)
+@item @command{libmicrohttpd}
+GNU libmicrohttpd (embedded HTTP(S) server library)
+@item @command{gauger}
+Tool for performance regression analysis
+@item @command{monkey}
+Tool for automated debugging of distributed systems
+@item @command{libmwmodem}
+Library for accessing satellite connection quality
+reports
+@item @command{libgnurl}
+gnURL (feature-restricted variant of cURL/libcurl)
+@end table
+
+Finally, there are various external projects (see links for a list of
+those that have a public website) which build on top of the GNUnet
+framework.
+
+@c ***********************************************************************
+@node Code overview
+@section Code overview
+
+This section gives a brief overview of the GNUnet source code.
+Specifically, we sketch the function of each of the subdirectories in
+the @file{gnunet/src/} directory. The order given is roughly bottom-up
+(in terms of the layers of the system).
+
+@table @asis
+@item @file{util/} --- libgnunetutil
+Library with general utility functions; all
+GNUnet binaries link against this library. Anything from memory
+allocation and data structures to cryptography and inter-process
+communication. The goal is to provide an OS-independent interface and
+more 'secure' or convenient implementations of commonly used primitives.
+The API is spread over more than a dozen headers; developers should study
+those closely to avoid duplicating existing functions.
+@pxref{libgnunetutil}.
+@item @file{hello/} --- libgnunethello
+HELLO messages are used to
+describe under which addresses a peer can be reached (for example,
+protocol, IP, port). This library manages parsing and generating of HELLO
+messages.
+@item @file{block/} --- libgnunetblock
+The DHT and other components of GNUnet
+store information in units called 'blocks'.
Each block has a type and the
+type defines a particular format and how that binary format is to be
+linked to a hash code (the key for the DHT and for databases). The block
+library is a wrapper around block plugins which provide the necessary
+functions for each block type.
+@item @file{statistics/} --- statistics service
+The statistics service enables associating
+values (of type uint64_t) with a component name and a string. The main
+uses are debugging (counting events), performance tracking and user
+entertainment (what did my peer do today?).
+@item @file{arm/} --- Automatic Restart Manager (ARM)
+The automatic-restart-manager (ARM) service
+is the GNUnet master service. Its role is to start gnunet-services, to
+restart them when they crash and finally to shut down the system when
+requested.
+@item @file{peerinfo/} --- peerinfo service
+The peerinfo service keeps track of which peers are known
+to the local peer and also tracks the validated addresses (in the
+form of a HELLO message) for each of those peers. The peer is not
+necessarily connected to all peers known to the peerinfo service.
+Peerinfo provides persistent storage for peer identities --- peers are
+not forgotten just because of a system restart.
+@item @file{datacache/} --- libgnunetdatacache
+The datacache library provides (temporary) block storage for the DHT.
+Existing plugins can store blocks in Sqlite, Postgres or MySQL databases.
+All data stored in the cache is lost when the peer is stopped or
+restarted (datacache uses temporary tables).
+@item @file{datastore/} --- datastore service
+The datastore service stores file-sharing blocks in
+databases for extended periods of time. In contrast to the datacache, data
+is not lost when peers restart. However, quota restrictions may still
+cause old, expired or low-priority data to be eventually discarded.
+Existing plugins can store blocks in Sqlite, Postgres or MySQL databases.
+@item @file{template/} --- service template
+Template for writing a new service. Does nothing.
+@item @file{ats/} --- Automatic Transport Selection
+The automatic transport selection (ATS) service
+is responsible for deciding which address (i.e.
+which transport plugin) should be used for communication with other peers,
+and at what bandwidth.
+@item @file{nat/} --- libgnunetnat
+Library that provides basic functions for NAT traversal.
+The library supports NAT traversal with
+manual hole-punching by the user, UPnP and ICMP-based autonomous NAT
+traversal. The library also includes an API for testing if the current
+configuration works and the @code{gnunet-nat-server} which provides an
+external service to test the local configuration.
+@item @file{fragmentation/} --- libgnunetfragmentation
+Some transports (UDP and WLAN, mostly) have restrictions on the maximum
+transfer unit (MTU) for packets. The fragmentation library can be used to
+break larger packets into chunks of at most 1k and transmit the resulting
+fragments reliably (with acknowledgement, retransmission, timeouts,
+etc.).
+@item @file{transport/} --- transport service
+The transport service is responsible for managing the
+basic P2P communication. It uses plugins to support P2P communication
+over TCP, UDP, HTTP, HTTPS and other protocols. The transport service
+validates peer addresses, enforces bandwidth restrictions, limits the
+total number of connections and enforces connectivity restrictions (i.e.
+friends-only).
+@item @file{peerinfo-tool/} --- gnunet-peerinfo
+This directory contains the gnunet-peerinfo binary which can be used to
+inspect the peers and HELLOs known to the peerinfo service.
+@item @file{core/}
+The core service is responsible for establishing encrypted, authenticated
+connections with other peers, encrypting and decrypting messages and
+forwarding messages to higher-level services that are interested in them.
+@item @file{testing/} --- libgnunettesting
+The testing library allows starting (and stopping) peers
+for writing testcases.
+It also supports automatic generation of configurations for peers
+ensuring that the ports and paths are disjoint. libgnunettesting is also
+the foundation for the testbed service.
+@item @file{testbed/} --- testbed service
+The testbed service is used for creating small or large scale deployments
+of GNUnet peers for evaluation of protocols.
+It facilitates peer deployments on multiple
+hosts (for example, in a cluster) and establishing various network
+topologies (both underlay and overlay).
+@item @file{nse/} --- Network Size Estimation
+The network size estimation (NSE) service
+implements a protocol for (securely) estimating the current size of the
+P2P network.
+@item @file{dht/} --- distributed hash table
+The distributed hash table (DHT) service provides a
+distributed implementation of a hash table to store blocks under hash
+keys in the P2P network.
+@item @file{hostlist/} --- hostlist service
+The hostlist service allows learning about
+other peers in the network by downloading HELLO messages from an HTTP
+server, can be configured to run such an HTTP server and also implements
+a P2P protocol to advertise and automatically learn about other peers
+that offer a public hostlist server.
+@item @file{topology/} --- topology service
+The topology service is responsible for
+maintaining the mesh topology. It tries to maintain connections to friends
+(depending on the configuration) and also tries to ensure that the peer
+has a decent number of active connections at all times. If necessary, new
+connections are added. All peers should run the topology service,
+otherwise they may end up not being connected to any other peer (unless
+some other service ensures that core establishes the required
+connections).
The topology service also tells the transport service which
+connections are permitted (for friend-to-friend networking).
+@item @file{fs/} --- file-sharing
+The file-sharing (FS) service implements GNUnet's
+file-sharing application. Both anonymous file-sharing (using gap) and
+non-anonymous file-sharing (using dht) are supported.
+@item @file{cadet/} --- cadet service
+The CADET service provides a general-purpose routing abstraction to create
+end-to-end encrypted tunnels in mesh networks. We wrote a paper
+documenting key aspects of the design.
+@item @file{tun/} --- libgnunettun
+Library for building IPv4 and IPv6 packets and creating
+checksums for UDP, TCP and ICMP packets. The header
+defines C structs for common Internet packet formats and in particular
+structs for interacting with TUN (virtual network) interfaces.
+@item @file{mysql/} --- libgnunetmysql
+Library for creating and executing prepared MySQL
+statements and for managing the connection to the MySQL database.
+Essentially a lightweight wrapper for the interaction between GNUnet
+components and libmysqlclient.
+@item @file{dns/}
+Service that allows intercepting and modifying DNS requests of
+the local machine. Currently used for IPv4-IPv6 protocol translation
+(DNS-ALG) as implemented by "pt/" and for the GNUnet naming system. The
+service can also be configured to offer an exit service for DNS traffic.
+@item @file{vpn/} --- VPN service
+The virtual private network (VPN) service provides a virtual
+tunnel interface (VTUN) for IP routing over GNUnet.
+Needs some other peers to run an "exit" service to work.
+Can be activated using the "gnunet-vpn" tool or integrated with DNS using
+the "pt" daemon.
+@item @file{exit/}
+Daemon to allow traffic from the VPN to exit this
+peer to the Internet or to specific IP-based services of the local peer.
+Currently, an exit service can only be restricted to IPv4 or IPv6, not to
+specific ports and/or IP address ranges.
If this is not acceptable,
+additional firewall rules must be added manually. exit currently only
+works for normal UDP, TCP and ICMP traffic; DNS queries need to leave the
+system via a DNS service.
+@item @file{pt/}
+Protocol translation daemon. This daemon enables 4-to-6,
+6-to-4, 4-over-6 or 6-over-4 transitions for the local system. It
+essentially uses the "dns" service to intercept DNS replies and then maps
+results to those offered by the VPN, which then sends them using mesh to
+some daemon offering an appropriate exit service.
+@item @file{identity/}
+Management of egos (alter egos) of a user; identities are
+essentially named ECC private keys and are used for zones in the GNU name
+system and for namespaces in file-sharing, but might find other uses later
+@item @file{revocation/}
+Key revocation service, can be used to revoke the
+private key of an identity if it has been compromised
+@item @file{namecache/}
+Cache for resolution results for the GNU name system;
+data is encrypted and can be shared among users;
+loss of the data should ideally only result in a
+performance degradation (persistence not required)
+@item @file{namestore/}
+Database for the GNU name system with per-user private information;
+persistence required
+@item @file{gns/}
+GNU name system, a GNU approach to DNS and PKI.
+@item @file{dv/}
+A plugin for distance-vector (DV)-based routing.
+DV consists of a service and a transport plugin to provide peers
+with the illusion of a direct P2P connection for connections
+that use multiple (typically up to 3) hops in the actual underlay network.
+@item @file{regex/}
+Service for the (distributed) evaluation of regular expressions.
+@item @file{scalarproduct/}
+The scalar product service offers an API to perform a secure multiparty
+computation which calculates a scalar product between two peers
+without exposing the private input vectors of the peers to each other.
+@item @file{consensus/}
+The consensus service will allow a set of peers to agree
+on a set of values via a distributed set union computation.
+@item @file{rest/}
+The REST API allows access to GNUnet services using RESTful interaction.
+The services provide plugins that can be exposed by the REST server.
+@item @file{experimentation/}
+The experimentation daemon coordinates distributed
+experimentation to evaluate transport and ATS properties.
+@end table
+
+@c ***********************************************************************
+@node System Architecture
+@section System Architecture
+
+GNUnet developers like LEGOs. The blocks are indestructible, can be
+stacked together to construct complex buildings and it is generally easy
+to swap one block for a different one that has the same shape. GNUnet's
+architecture is based on LEGOs:
+
+@c images here
+
+This chapter documents the GNUnet LEGO system, also known as GNUnet's
+system architecture.
+
+The most common GNUnet component is a service. Services offer an API (or
+several, depending on what you count as "an API") which is implemented as
+a library. The library communicates with the main process of the service
+using a service-specific network protocol. The main process of the service
+typically doesn't fully provide everything that is needed --- it has holes
+to be filled by APIs to other services.
+
+A special kind of component in GNUnet are user interfaces and daemons.
+Like services, they have holes to be filled by APIs of other services.
+Unlike services, daemons do not implement their own network protocol and
+they have no API.
+
+The GNUnet system provides a range of services, daemons and user
+interfaces, which are then combined into a layered GNUnet instance (also
+known as a peer).
+
+Note that while it is generally possible to swap one service for another
+compatible service, there is often only one implementation.
However,
+during development we often have a "new" version of a service in parallel
+with an "old" version. While the "new" version is not yet working,
+developers working on other parts of the service can continue their
+development by simply using the "old" service. Alternative design ideas
+can also be easily investigated by swapping out individual components.
+This is typically achieved by simply changing the name of the "BINARY" in
+the respective configuration section.
+
+Key properties of GNUnet services are that they must be separate
+processes and that they must protect themselves by applying tight error
+checking against the network protocol they implement (thereby achieving a
+certain degree of robustness).
+
+On the other hand, the APIs are implemented to tolerate failures of the
+service, isolating their host process from errors by the service. If the
+service process crashes, other services and daemons around it should not
+also fail, but instead wait for the service process to be restarted by
+ARM.
+
+
+@c ***********************************************************************
+@node Subsystem stability
+@section Subsystem stability
+
+This section documents the current stability of the various GNUnet
+subsystems. Stability here describes the expected degree of compatibility
+with future versions of GNUnet. For each subsystem we distinguish between
+compatibility on the P2P network level (communication protocol between
+peers), the IPC level (communication between the service and the service
+library) and the API level (stability of the API). P2P compatibility is
+relevant in terms of which applications are likely going to be able to
+communicate with future versions of the network. IPC communication is
+relevant for the implementation of language bindings that re-implement the
+IPC messages. Finally, API compatibility is relevant to developers who
+hope to be able to avoid changes to applications built on top of the APIs
+of the framework.
+ +The following table summarizes our current view of the stability of the +respective protocols or APIs: + +@multitable @columnfractions .20 .20 .20 .20 +@headitem Subsystem @tab P2P @tab IPC @tab C API +@item util @tab n/a @tab n/a @tab stable +@item arm @tab n/a @tab stable @tab stable +@item ats @tab n/a @tab unstable @tab testing +@item block @tab n/a @tab n/a @tab stable +@item cadet @tab testing @tab testing @tab testing +@item consensus @tab experimental @tab experimental @tab experimental +@item core @tab stable @tab stable @tab stable +@item datacache @tab n/a @tab n/a @tab stable +@item datastore @tab n/a @tab stable @tab stable +@item dht @tab stable @tab stable @tab stable +@item dns @tab stable @tab stable @tab stable +@item dv @tab testing @tab testing @tab n/a +@item exit @tab testing @tab n/a @tab n/a +@item fragmentation @tab stable @tab n/a @tab stable +@item fs @tab stable @tab stable @tab stable +@item gns @tab stable @tab stable @tab stable +@item hello @tab n/a @tab n/a @tab testing +@item hostlist @tab stable @tab stable @tab n/a +@item identity @tab stable @tab stable @tab n/a +@item multicast @tab experimental @tab experimental @tab experimental +@item mysql @tab stable @tab n/a @tab stable +@item namestore @tab n/a @tab stable @tab stable +@item nat @tab n/a @tab n/a @tab stable +@item nse @tab stable @tab stable @tab stable +@item peerinfo @tab n/a @tab stable @tab stable +@item psyc @tab experimental @tab experimental @tab experimental +@item pt @tab n/a @tab n/a @tab n/a +@item regex @tab stable @tab stable @tab stable +@item revocation @tab stable @tab stable @tab stable +@item social @tab experimental @tab experimental @tab experimental +@item statistics @tab n/a @tab stable @tab stable +@item testbed @tab n/a @tab testing @tab testing +@item testing @tab n/a @tab n/a @tab testing +@item topology @tab n/a @tab n/a @tab n/a +@item transport @tab stable @tab stable @tab stable +@item tun @tab n/a @tab n/a @tab stable +@item vpn @tab 
testing @tab n/a @tab n/a
+@end multitable
+
+Here is a rough explanation of the values:
+
+@table @samp
+@item stable
+No incompatible changes are planned at this time; for IPC/APIs, if
+there are incompatible changes, they will be minor and might only require
+minimal changes to existing code; for P2P, changes will be avoided if at
+all possible for the 0.10.x-series
+
+@item testing
+No incompatible changes are
+planned at this time, but the code is still known to be in flux; so while
+we have no concrete plans, our expectation is that there will still be
+minor modifications; for P2P, changes will likely be extensions that
+should not break existing code
+
+@item unstable
+Changes are planned and will happen; however, they
+will not be totally radical and the result should still resemble what is
+there now; nevertheless, anticipated changes will break protocol/API
+compatibility
+
+@item experimental
+Changes are planned and the result may look nothing like
+what the API/protocol looks like today
+
+@item unknown
+Someone should think about where this subsystem is headed
+
+@item n/a
+This subsystem does not have an API/IPC-protocol/P2P-protocol
+@end table
+
+@c ***********************************************************************
+@node Naming conventions and coding style guide
+@section Naming conventions and coding style guide
+
+Here you can find some rules to help you write code for GNUnet.
+ +@c *********************************************************************** +@menu +* Naming conventions:: +* Coding style:: +@end menu + +@node Naming conventions +@subsection Naming conventions + + +@c *********************************************************************** +@menu +* include files:: +* binaries:: +* logging:: +* configuration:: +* exported symbols:: +* private (library-internal) symbols (including structs and macros):: +* testcases:: +* performance tests:: +* src/ directories:: +@end menu + +@node include files +@subsubsection include files + +@itemize @bullet +@item _lib: library without need for a process +@item _service: library that needs a service process +@item _plugin: plugin definition +@item _protocol: structs used in network protocol +@item exceptions: +@itemize @bullet +@item gnunet_config.h --- generated +@item platform.h --- first included +@item plibc.h --- external library +@item gnunet_common.h --- fundamental routines +@item gnunet_directories.h --- generated +@item gettext.h --- external library +@end itemize +@end itemize + +@c *********************************************************************** +@node binaries +@subsubsection binaries + +@itemize @bullet +@item gnunet-service-xxx: service process (has listen socket) +@item gnunet-daemon-xxx: daemon process (no listen socket) +@item gnunet-helper-xxx[-yyy]: SUID helper for module xxx +@item gnunet-yyy: command-line tool for end-users +@item libgnunet_plugin_xxx_yyy.so: plugin for API xxx +@item libgnunetxxx.so: library for API xxx +@end itemize + +@c *********************************************************************** +@node logging +@subsubsection logging + +@itemize @bullet +@item services and daemons use their directory name in +@code{GNUNET_log_setup} (i.e. 'core') and log using +plain 'GNUNET_log'. +@item command-line tools use their full name in +@code{GNUNET_log_setup} (i.e. 'gnunet-publish') and log using +plain 'GNUNET_log'. 
+@item service access libraries log using
+'@code{GNUNET_log_from}' and use '@code{DIRNAME-api}' for the
+component (i.e. 'core-api')
+@item pure libraries (without associated service) use
+'@code{GNUNET_log_from}' with the component set to their
+library name (without lib or '@file{.so}'),
+which should also be their directory name (i.e. '@file{nat}')
+@item plugins should use '@code{GNUNET_log_from}'
+with the directory name and the plugin name combined to produce
+the component name (i.e. 'transport-tcp').
+@item logging should be unified per-file by defining a
+@code{LOG} macro with the appropriate arguments,
+along these lines:
+
+@example
+#define LOG(kind,...) \
+  GNUNET_log_from (kind, "example-api", __VA_ARGS__)
+@end example
+
+@end itemize
+
+@c ***********************************************************************
+@node configuration
+@subsubsection configuration
+
+@itemize @bullet
+@item paths (that are substituted in all filenames) are in PATHS
+(have as few as possible)
+@item all options for a particular module (@file{src/MODULE})
+are under @code{[MODULE]}
+@item options for a plugin of a module
+are under @code{[MODULE-PLUGINNAME]}
+@end itemize
+
+@c ***********************************************************************
+@node exported symbols
+@subsubsection exported symbols
+
+@itemize @bullet
+@item must start with @code{GNUNET_modulename_} and be defined in
+@file{modulename.c}
+@item exceptions: those defined in @file{gnunet_common.h}
+@end itemize
+
+@c ***********************************************************************
+@node private (library-internal) symbols (including structs and macros)
+@subsubsection private (library-internal) symbols (including structs and macros)
+
+@itemize @bullet
+@item must NOT start with any prefix
+@item must not be exported in a way that linkers could use them or other
+libraries might see them via headers; they must be either
+declared/defined in C source files or in headers that are in the
+respective
directory under @file{src/modulename/} and NEVER be declared
+in @file{src/include/}.
+@end itemize
+
+@node testcases
+@subsubsection testcases
+
+@itemize @bullet
+@item must be called @file{test_module-under-test_case-description.c}
+@item "case-description" may be omitted if there is only one test
+@end itemize
+
+@c ***********************************************************************
+@node performance tests
+@subsubsection performance tests
+
+@itemize @bullet
+@item must be called @file{perf_module-under-test_case-description.c}
+@item "case-description" may be omitted if there is only one performance
+test
+@item Must only be run if @code{HAVE_BENCHMARKS} is satisfied
+@end itemize
+
+@c ***********************************************************************
+@node src/ directories
+@subsubsection src/ directories
+
+@itemize @bullet
+@item gnunet-NAME: end-user applications (i.e., gnunet-search, gnunet-arm)
+@item gnunet-service-NAME: service processes with accessor library (i.e.,
+gnunet-service-arm)
+@item libgnunetNAME: accessor library (_service.h-header) or standalone
+library (_lib.h-header)
+@item gnunet-daemon-NAME: daemon process without accessor library (i.e.,
+gnunet-daemon-hostlist) and no GNUnet management port
+@item libgnunet_plugin_DIR_NAME: loadable plugins (i.e.,
+libgnunet_plugin_transport_tcp)
+@end itemize
+
+@cindex Coding style
+@node Coding style
+@subsection Coding style
+
+@c XXX: Adjust examples to GNU Standards!
+@itemize @bullet
+@item We follow the GNU Coding Standards (@pxref{Top, The GNU Coding Standards,, standards, The GNU Coding Standards});
+@item Indentation is done with spaces, two per level, no tabs;
+@item C99 struct initialization is fine;
+@item declare only one variable per line, for example:
+
+@noindent
+instead of
+
+@example
+int i,j;
+@end example
+
+@noindent
+write:
+
+@example
+int i;
+int j;
+@end example
+
+@c TODO: include actual example from a file in source
+
+@noindent
+This helps keep diffs small and forces developers to think precisely about
+the type of every variable.
+Note that @code{char *} is different from @code{const char *} and
+@code{int} is different from @code{unsigned int} or @code{uint32_t}.
+Each variable type should be chosen with care.
+
+@item While @code{goto} should generally be avoided, having a
+@code{goto} to the end of a function to a block of clean up
+statements (free, close, etc.) can be acceptable.
+
+@item Conditions should be written with constants on the left (to avoid
+accidental assignment) and with the 'true' target being either the
+'error' case or the significantly simpler continuation. For example:
+
+@example
+if (0 != stat ("filename", &sbuf)) @{
+  error();
+@}
+else @{
+  /* handle normal case here */
+@}
+@end example
+
+@noindent
+instead of
+
+@example
+if (stat ("filename", &sbuf) == 0) @{
+  /* handle normal case here */
+@} else @{
+  error();
+@}
+@end example
+
+@noindent
+If possible, the error clause should be terminated with a 'return' (or
+'goto' to some cleanup routine) and in this case, the 'else' clause
+should be omitted:
+
+@example
+if (0 != stat ("filename", &sbuf)) @{
+  error();
+  return;
+@}
+/* handle normal case here */
+@end example
+
+This serves to avoid deep nesting. The 'constants on the left' rule
+applies to all constants (including @code{GNUNET_SCHEDULER_NO_TASK},
+NULL, and enums).
With the two above rules (constants on left, errors in
+'true' branch), there is only one way to write most branches correctly.
+
+@item Combined assignments and tests are allowed if they do not hinder
+code clarity. For example, one can write:
+
+@example
+if (NULL == (value = lookup_function())) @{
+  error();
+  return;
+@}
+@end example
+
+@item Use @code{break} and @code{continue} wherever possible to avoid
+deep(er) nesting. Thus, we would write:
+
+@example
+next = head;
+while (NULL != (pos = next)) @{
+  next = pos->next;
+  if (! should_free (pos))
+    continue;
+  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
+  GNUNET_free (pos);
+@}
+@end example
+
+instead of
+
+@example
+next = head;
+while (NULL != (pos = next)) @{
+  next = pos->next;
+  if (should_free (pos)) @{
+    /* unnecessary nesting! */
+    GNUNET_CONTAINER_DLL_remove (head, tail, pos);
+    GNUNET_free (pos);
+  @}
+@}
+@end example
+
+@item We primarily use @code{for} and @code{while} loops.
+A @code{while} loop is used if the method for advancing in the loop is
+not a straightforward increment operation. In particular, we use:
+
+@example
+next = head;
+while (NULL != (pos = next))
+@{
+  next = pos->next;
+  if (! should_free (pos))
+    continue;
+  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
+  GNUNET_free (pos);
+@}
+@end example
+
+to free entries in a list (as the iteration changes the structure of the
+list due to the free; the equivalent @code{for} loop no longer
+follows the simple @code{for} paradigm of @code{for(INIT;TEST;INC)}).
+However, for loops that do follow the simple @code{for} paradigm we do
+use @code{for}, even if it involves linked lists:
+
+@example
+/* simple iteration over a linked list */
+for (pos = head;
+     NULL != pos;
+     pos = pos->next)
+@{
+  use (pos);
+@}
+@end example
+
+
+@item The first argument to all higher-order functions in GNUnet must be
+declared to be of type @code{void *} and is reserved for a closure.
We do
+not use inner functions, as trampolines would conflict with setups that
+use non-executable stacks.
+The first statement in a higher-order function, which usually should
+be part of the variable declarations, should assign the
+@code{cls} argument to the precise expected type. For example:
+
+@example
+int callback (void *cls, char *args) @{
+  struct Foo *foo = cls;
+  int other_variables;
+
+  /* rest of function */
+@}
+@end example
+
+
+@item It is good practice to write complex @code{if} expressions instead
+of using deeply nested @code{if} statements. However, except for addition
+and multiplication, all operators should use parens. This is fine:
+
+@example
+if ( (1 == foo) || ((0 == bar) && (x != y)) )
+  return x;
+@end example
+
+
+However, this is not:
+
+@example
+if (1 == foo)
+  return x;
+if (0 == bar && x != y)
+  return x;
+@end example
+
+@noindent
+Note that splitting the @code{if} statement above is debatable as the
+@code{return x} is a very trivial statement. However, once the logic after
+the branch becomes more complicated (and is still identical), the "or"
+formulation should certainly be used.
+
+@item There should be two empty lines between the end of the function and
+the comments describing the following function. There should be a single
+empty line after the initial variable declarations of a function. If a
+function has no local variables, there should be no initial empty line. If
+a long function consists of several complex steps, those steps might be
+separated by an empty line (possibly followed by a comment describing the
+following step). The code should not contain empty lines in arbitrary
+places; if in doubt, it is likely better to NOT have an empty line (this
+way, more code will fit on the screen).
+@end itemize
+
+@c ***********************************************************************
+@node Build-system
+@section Build-system
+
+If you have code that is unlikely to compile, or build rules you might
+not want to trigger for most developers, use @code{if HAVE_EXPERIMENTAL}
+in your @file{Makefile.am}.
+Then it is OK to (temporarily) add non-compiling (or known-to-not-port)
+code.
+
+If you want to compile all testcases but NOT run them, run configure with
+the @code{--enable-test-suppression} option.
+
+If you want to run all testcases, including those that take a while, run
+configure with the @code{--enable-expensive-testcases} option.
+
+If you want to compile and run benchmarks, run configure with the
+@code{--enable-benchmarks} option.
+
+If you want to obtain code coverage results, run configure with the
+@code{--enable-coverage} option and run the @file{coverage.sh} script in
+the @file{contrib/} directory.
+
+@cindex gnunet-ext
+@node Developing extensions for GNUnet using the gnunet-ext template
+@section Developing extensions for GNUnet using the gnunet-ext template
+
+For developers who want to write extensions for GNUnet, we provide the
+gnunet-ext template as an easy-to-use skeleton.
+
+gnunet-ext contains the build environment and template files for the
+development of GNUnet services, command line tools, APIs and tests.
+
+First of all you have to obtain gnunet-ext from git:
+
+@example
+git clone https://gnunet.org/git/gnunet-ext.git
+@end example
+
+The next step is to bootstrap and configure it.
For configure you have to
+provide the path containing GNUnet with
+@code{--with-gnunet=/path/to/gnunet} and the prefix where you want to
+install the extension using @code{--prefix=/path/to/install}:
+
+@example
+./bootstrap
+./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
+@end example
+
+When your GNUnet installation is not included in the default linker search
+path, you have to add @code{/path/to/gnunet} to the file
+@file{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the
+environment variable @code{LD_LIBRARY_PATH} by using
+
+@example
+export LD_LIBRARY_PATH=/path/to/gnunet/lib
+@end example
+
+@cindex writing testcases
+@node Writing testcases
+@section Writing testcases
+
+Ideally, any non-trivial GNUnet code should be covered by automated
+testcases. Testcases should reside in the same place as the code that is
+being tested. The name of source files implementing tests should begin
+with @code{test_} followed by the name of the file that contains
+the code that is being tested.
+
+Testcases in GNUnet should be integrated with the autotools build system.
+This way, developers and anyone building binary packages will be able to
+run all testcases simply by running @code{make check}. The final
+testcases shipped with the distribution should output at most some brief
+progress information and not display debug messages by default. The
+success or failure of a testcase must be indicated by returning zero
+(success) or non-zero (failure) from the main method of the testcase.
+The integration with the autotools is relatively straightforward and only
+requires modifications to the @file{Makefile.am} in the directory
+containing the testcase.
For a testcase testing the code in @file{foo.c}
+the @file{Makefile.am} would contain the following lines:
+
+@example
+check_PROGRAMS = test_foo
+TESTS = $(check_PROGRAMS)
+test_foo_SOURCES = test_foo.c
+test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
+@end example
+
+Naturally, other libraries used by the testcase may be specified in the
+@code{LDADD} directive as necessary.
+
+Often testcases depend on additional input files, such as a configuration
+file. These support files have to be listed using the @code{EXTRA_DIST}
+directive in order to ensure that they are included in the distribution.
+
+Example:
+
+@example
+EXTRA_DIST = test_foo_data.conf
+@end example
+
+Executing @code{make check} will run all testcases in the current
+directory and all subdirectories. Testcases can be compiled individually
+by running @code{make test_foo} and then invoked directly using
+@code{./test_foo}. Note that due to the use of plugins in GNUnet, it is
+typically necessary to run @code{make install} before running any
+testcases. Thus the canonical command @code{make check install} has to be
+changed to @code{make install check} for GNUnet.
+
+@cindex TESTING library
+@node TESTING library
+@section TESTING library
+
+The TESTING library is used for writing testcases which involve starting a
+single or multiple peers. While peers can also be started by testcases
+using the ARM subsystem, using the TESTING library provides an elegant way
+to do this. The configurations of the peers are auto-generated from a
+given template to have non-conflicting port numbers, ensuring that peers'
+services do not run into bind errors. This is achieved by testing ports'
+availability by binding a listening socket to them before allocating them
+to services in the generated configurations.
+
+Another advantage of using TESTING is that it shortens the testcase
+startup time, as the hostkeys for peers are copied from a pre-computed set
+of hostkeys instead of generating them at peer startup, which may take a
+considerable amount of time when starting multiple peers or on an embedded
+processor.
+
+TESTING also allows for certain services to be shared among peers. This
+feature is invaluable when testing with multiple peers as it helps to
+reduce the number of services run per peer and hence the total
+number of processes run per testcase.
+
+The TESTING library only handles creating, starting and stopping peers.
+Features useful for testcases such as connecting peers in a topology are
+not available in TESTING but are available in the TESTBED subsystem.
+Furthermore, TESTING only creates peers on the localhost; however, by
+using TESTBED, testcases can benefit from creating peers across multiple
+hosts.
+
+@menu
+* API::
+* Finer control over peer stop::
+* Helper functions::
+* Testing with multiple processes::
+@end menu
+
+@cindex TESTING API
+@node API
+@subsection API
+
+TESTING abstracts a group of peers as a TESTING system. All peers in a
+system have a common hostname and no two services of these peers have the
+same port or UNIX domain socket path.
+
+A TESTING system can be created with the function
+@code{GNUNET_TESTING_system_create()} which returns a handle to the
+system. This function takes a directory path which is used for generating
+the configurations of peers, an IP address from which connections to the
+peers' services should be allowed, the hostname to be used in peers'
+configuration, and an array of shared service specifications of type
+@code{struct GNUNET_TESTING_SharedService}.
+
+The shared service specification must specify the name of the service to
+share, the configuration pertaining to that shared service and the
+maximum number of peers that are allowed to share a single instance of
+the shared service.
+
+A TESTING system created with @code{GNUNET_TESTING_system_create()}
+chooses ports from the default range @code{12000} - @code{56000} while
+auto-generating configurations for peers.
+This range can be customised with the function
+@code{GNUNET_TESTING_system_create_with_portrange()}. This function is
+similar to @code{GNUNET_TESTING_system_create()} except that it takes two
+additional parameters --- the start and end of the port range to use.
+
+A TESTING system is destroyed with the function
+@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of
+the system and a flag indicating whether to remove the files created in
+the directory used to generate configurations.
+
+A peer is created with the function
+@code{GNUNET_TESTING_peer_configure()}. This function takes the system
+handle, a configuration template from which the configuration for the peer
+is auto-generated, and the index from which the hostkey for the peer is
+to be copied. When successful, this function returns a handle to the
+peer which can be used to start and stop it and to obtain the identity of
+the peer. If unsuccessful, a NULL pointer is returned with an error
+message. This function adjusts the generated configuration to have
+non-conflicting ports and paths.
+
+Peers can be started and stopped by calling the functions
+@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
+respectively. A peer can be destroyed by calling the function
+@code{GNUNET_TESTING_peer_destroy}. When a peer is destroyed, the ports
+and paths allocated in its configuration are reclaimed for use by new
+peers.
+
+@c ***********************************************************************
+@node Finer control over peer stop
+@subsection Finer control over peer stop
+
+Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
+
+However, calling this function for each peer is inefficient when trying to
+shutdown multiple peers as this function sends the termination signal to
+the given peer process and waits for it to terminate. It would be faster
+in this case to send the termination signals to the peers first and then
+wait on them. This is accomplished by the function
+@code{GNUNET_TESTING_peer_kill()}, which sends a termination signal to the
+peer, and the function @code{GNUNET_TESTING_peer_wait()}, which waits on
+the peer.
+
+Further finer control can be achieved by choosing to stop a peer
+asynchronously with the function @code{GNUNET_TESTING_peer_stop_async()}.
+This function takes a callback parameter and a closure for it in addition
+to the handle to the peer to stop. The callback function is called with
+the given closure when the peer is stopped. Using this function
+eliminates blocking while waiting for the peer to terminate.
+
+An asynchronous peer stop can be cancelled by calling the function
+@code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this
+function does not prevent the peer from terminating if the termination
+signal has already been sent to it. It does, however, cancel the
+callback to be called when the peer is stopped.
+
+@c ***********************************************************************
+@node Helper functions
+@subsection Helper functions
+
+Most of the testcases can benefit from an abstraction which configures a
+peer and starts it. This is provided by the function
+@code{GNUNET_TESTING_peer_run()}. This function takes the testing
+directory pathname, a configuration template, a callback and its closure.
+This function creates a peer in the given testing directory by using the
+configuration template, starts the peer and calls the given callback with
+the given closure.
+
+The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of
+the peer which starts the rest of the configured services.
A similar
+function @code{GNUNET_TESTING_service_run} can be used to just start a
+single service of a peer. In this case, the peer's ARM service is not
+started; instead, only the given service is run.
+
+@c ***********************************************************************
+@node Testing with multiple processes
+@subsection Testing with multiple processes
+
+When testing GNUnet, the splitting of the code into services and clients
+often complicates testing. The solution to this is to have the testcase
+fork @code{gnunet-service-arm}, ask it to start the required server and
+daemon processes and then execute appropriate client actions (to test the
+client APIs or the core module or both). If necessary, multiple ARM
+services can be forked using different ports (!) to simulate a network.
+However, most of the time only one ARM process is needed. Note that on
+exit, the testcase should shutdown ARM with a @code{TERM} signal (to give
+it the chance to cleanly stop its child processes).
+
+The following code illustrates spawning and killing an ARM process from a
+testcase:
+
+@example
+static void run (void *cls,
+                 char *const *args,
+                 const char *cfgfile,
+                 const struct GNUNET_CONFIGURATION_Handle *cfg) @{
+  struct GNUNET_OS_Process *arm_pid;
+  arm_pid = GNUNET_OS_start_process (NULL,
+                                     NULL,
+                                     "gnunet-service-arm",
+                                     "gnunet-service-arm",
+                                     "-c",
+                                     cfgname,
+                                     NULL);
+  /* do real test work here */
+  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
+    GNUNET_log_strerror
+      (GNUNET_ERROR_TYPE_WARNING, "kill");
+  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
+  GNUNET_OS_process_close (arm_pid); @}
+
+GNUNET_PROGRAM_run (argc, argv,
+                    "NAME-OF-TEST",
+                    "nohelp",
+                    options,
+                    &run,
+                    cls);
+@end example
+
+
+An alternative way that works well to test plugins is to implement a
+mock-version of the environment that the plugin expects and then to
+simply load the plugin directly.
+
+@c ***********************************************************************
+@node Performance regression analysis with Gauger
+@section Performance regression analysis with Gauger
+
+To help avoid performance regressions, GNUnet uses Gauger. Gauger is a
+simple logging tool that allows remote hosts to send performance data to
+a central server, where this data can be analyzed and visualized. Gauger
+shows graphs of the repository revisions and the performance data recorded
+for each revision, so sudden performance peaks or drops can be identified
+and linked to a specific revision number.
+
+In the case of GNUnet, the buildbots log the performance data obtained
+during the tests after each build. The data can be accessed on GNUnet's
+Gauger page.
+
+The menu on the left allows you to select either the results of just one
+build bot (under "Hosts") or review the data from all hosts for a given
+test result (under "Metrics"). In case of very different absolute values
+of the results, for instance arm vs. amd64 machines, the option
+"Normalize" on a metric view can help to get an idea about the
+performance evolution across all hosts.
+
+Using Gauger in GNUnet and having the performance of a module tracked over
+time is very easy. First of course, the testcase must generate some
+consistent metric, which makes sense to have logged. Highly volatile or
+randomness-dependent metrics are probably not ideal candidates for
+meaningful regression detection.
+
+To start logging any value, just include @code{gauger.h} in your testcase
+code. Then, use the macro @code{GAUGER()} to make the Buildbots log
+whatever value is of interest for you to @code{gnunet.org}'s Gauger
+server. No setup is necessary as most Buildbots already have everything
+in place and new metrics are created on demand. To delete a metric, you
+need to contact a member of the GNUnet development team (a file will need
+to be removed manually from the respective directory).
+
+The code in the test should look like this:
+
+@example
+[other includes]
+#include <gauger.h>
+
+int main (int argc, char *argv[]) @{
+
+  [run test, generate data]
+  GAUGER("YOUR_MODULE",
+         "METRIC_NAME",
+         (float)value,
+         "UNIT"); @}
+@end example
+
+Where:
+
+@table @asis
+
+@item @strong{YOUR_MODULE} is a category in the gauger page and should be
+the name of the module or subsystem like "Core" or "DHT"
+@item @strong{METRIC_NAME} is
+the name of the metric being collected and should be concise and
+descriptive, like "PUT operations in sqlite-datastore".
+@item @strong{value} is the value
+of the metric that is logged for this run.
+@item @strong{UNIT} is the unit in
+which the value is measured, for instance "kb/s" or "kb of RAM/node".
+@end table
+
+If you wish to use Gauger for your own project, you can grab a copy of the
+latest stable release or check out Gauger's Subversion repository.
+
+@cindex TESTBED Subsystem
+@node TESTBED Subsystem
+@section TESTBED Subsystem
+
+The TESTBED subsystem facilitates testing and measuring of multi-peer
+deployments on a single host or over multiple hosts.
+
+The architecture of the testbed module is divided into the following:
+@itemize @bullet
+
+@item Testbed API: An API which is used by the testing driver programs. It
+provides functions for creating, destroying, starting and stopping
+peers, etc.
+
+@item Testbed service (controller): A service which is started through the
+Testbed API. This service handles operations to create, destroy, start and
+stop peers, connect them, and modify their configurations.
+
+@item Testbed helper: When a controller has to be started on a host, the
+testbed API starts the testbed helper on that host which in turn starts
+the controller. The testbed helper receives a configuration for the
+controller through its stdin and changes it to ensure the controller
+doesn't run into any port conflict on that host.
+@end itemize
+
+
+The testbed service (controller) is different from the other GNUnet
+services in that it is not started by ARM and is not supposed to be run
+as a daemon. It is started by the testbed API through a testbed helper.
+In a typical scenario involving multiple hosts, a controller is started
+on each host. Controllers take up the actual task of creating peers,
+starting and stopping them on the hosts they run on.
+
+While running deployments on a single localhost the testbed API starts the
+testbed helper directly as a child process. When running deployments on
+remote hosts the testbed API starts testbed helpers on each remote host
+through a remote shell. By default the testbed API uses SSH as the remote
+shell. This can be changed by setting the environment variable
+GNUNET_TESTBED_RSH_CMD to the required remote shell program. This
+variable can also contain parameters which are to be passed to the remote
+shell program. For example:
+
+@example
+export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes \
+-o NoHostAuthenticationForLocalhost=yes %h"
+@end example
+
+Substitutions are allowed in the command string above; they are specified
+through placeholders which begin with a `%'.
+At present the following substitutions are supported:
+
+@itemize @bullet
+@item %h: hostname
+@item %u: username
+@item %p: port
+@end itemize
+
+Note that a substitution placeholder is replaced only when the
+corresponding field is available and only once. Specifying
+
+@example
+%u@atchar{}%h
+@end example
+
+does not work. If you want to use username substitution for
+@command{SSH}, use the argument @code{-l} before the
+username substitution.
+
+For example:
+@example
+ssh -l %u -p %p %h
+@end example
+
+The testbed API and the helper communicate through the helper's stdin and
+stdout.
As the helper is started through a remote shell on remote hosts,
+any output messages from the remote shell interfere with the communication
+and result in a failure while starting the helper. For this reason, it is
+suggested to use flags to make the remote shells produce no output
+messages and to have password-less logins. For the default remote shell,
+SSH, the default options are:
+
+@example
+-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes
+@end example
+
+Password-less logins should be ensured by using SSH keys.
+
+Since the testbed API executes the remote shell as a non-interactive
+shell, certain scripts like @file{.bashrc} or @file{.profile} may not be
+executed. If this is the case, the testbed API can be forced to execute
+an interactive shell by setting the environment variable
+@code{GNUNET_TESTBED_RSH_CMD_SUFFIX} to a shell program.
+
+An example could be:
+
+@example
+export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
+@end example
+
+The testbed API will then execute the remote shell program as:
+
+@example
+$GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX \
+gnunet-helper-testbed
+@end example
+
+On some systems, problems may arise while starting testbed helpers if
+GNUnet is installed into a custom location since the helper may not be
+found in the standard path. This can be addressed by setting the variable
+`@code{HELPER_BINARY_PATH}' to the path of the testbed helper.
+The testbed API will then use this path to start helper binaries both
+locally and remotely.
+
+The testbed API can be accessed by including the
+@file{gnunet_testbed_service.h} file and linking with
+@code{-lgnunettestbed}.
+
+@c ***********************************************************************
+@menu
+* Supported Topologies::
+* Hosts file format::
+* Topology file format::
+* Testbed Barriers::
+* Automatic large-scale deployment in the PlanetLab testbed::
+* TESTBED Caveats::
+@end menu
+
+@node Supported Topologies
+@subsection Supported Topologies
+
+While testing multi-peer deployments, it is often needed that the peers
+are connected in some topology. This requirement is addressed by the
+function @code{GNUNET_TESTBED_overlay_connect()} which connects any given
+two peers in the testbed.
+
+The API also provides a helper function
+@code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set
+of peers in any of the following supported topologies:
+
+@itemize @bullet
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with
+each other
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a
+line
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a
+ring topology
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to
+form a 2 dimensional torus topology. If the number of peers is not a
+perfect square, the resulting torus may not have uniform
+poloidal and toroidal lengths
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated
+to form a random graph. The number of links to be present should be given
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to
+form a 2D Torus with some random links among them. The number of random
+links is to be given
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are
+connected to form a ring with some random links among them. The number of
+random links is to be given
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a
+topology where peer connectivity follows a power law - new peers are
+connected with high probability to well connected peers.
+@footnote{See Emergence of Scaling in Random Networks. Science 286,
+509-512, 1999
+(@uref{https://gnunet.org/git/bibliography.git/plain/docs/emergence_of_scaling_in_random_networks__barabasi_albert_science_286__1999.pdf, pdf})}
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information
+is loaded from a file. The path to the file has to be given.
+@xref{Topology file format}, for the format of this file.
+
+@item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
+@end itemize
+
+
+The above supported topologies can be specified respectively by setting
+the variable @code{OVERLAY_TOPOLOGY} to the following values in the
+configuration passed to the Testbed API functions
+@code{GNUNET_TESTBED_test_run()} and
+@code{GNUNET_TESTBED_run()}:
+
+@itemize @bullet
+@item @code{CLIQUE}
+@item @code{RING}
+@item @code{LINE}
+@item @code{2D_TORUS}
+@item @code{RANDOM}
+@item @code{SMALL_WORLD}
+@item @code{SMALL_WORLD_RING}
+@item @code{SCALE_FREE}
+@item @code{FROM_FILE}
+@item @code{NONE}
+@end itemize
+
+
+Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
+require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
+random links to be generated in the configuration. The option will be
+ignored for the rest of the topologies.
+
+Topology @code{SCALE_FREE} requires the option
+@code{SCALE_FREE_TOPOLOGY_CAP} to be set to the maximum number of peers
+which can connect to a peer and @code{SCALE_FREE_TOPOLOGY_M} to be set to
+the minimum number of peers a peer should be connected to.
+
+Similarly, the topology @code{FROM_FILE} requires the option
+@code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing
+the topology information. This option is ignored for the rest of the
+topologies. @xref{Topology file format}, for the format of this file.
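+
+Putting these options together, a small-world topology with 25 random
+links might be requested with a configuration fragment along these lines
+(the section name follows the @code{[MODULE]} convention described
+earlier; treat the exact section name and values as illustrative):

```
[testbed]
OVERLAY_TOPOLOGY = SMALL_WORLD
OVERLAY_RANDOM_LINKS = 25
```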
+
+@c ***********************************************************************
+@node Hosts file format
+@subsection Hosts file format
+
+The testbed API offers the function
+@code{GNUNET_TESTBED_hosts_load_from_file()} to load from a given file
+details about the hosts which testbed can use for deploying peers.
+This function is useful to keep the data about hosts
+separate instead of hard-coding it in the testcase.
+
+Another helper function from the testbed API,
+@code{GNUNET_TESTBED_run()}, also takes a hosts file name as its
+parameter. It uses the above function to populate the hosts data
+structures and start controllers to deploy peers.
+
+These functions require the hosts file to be of the following format:
+@itemize @bullet
+@item Each line is interpreted to have details about a host
+@item Host details should include the username to use for logging into the
+host, the hostname of the host and the port number to use for the remote
+shell program. All three values should be given.
+@item These details should be given in the following format:
+@example
+<username>@@<hostname>:<port>
+@end example
+@end itemize
+
+Note that having canonical hostnames may cause problems while resolving
+the IP addresses (See this bug). Hence it is advised to provide the hosts'
+IP numerical addresses as hostnames whenever possible.
+
+@c ***********************************************************************
+@node Topology file format
+@subsection Topology file format
+
+A topology file describes how peers are to be connected. It should adhere
+to the following format for testbed to parse it correctly.
+
+Each line should begin with the target peer id. This should be followed by
+a colon (`:') and origin peer ids separated by `|'. All spaces except for
+newline characters are ignored. The API will then try to connect each
+origin peer to the target peer.
+
+For example, the following file will result in 5 overlay connections:
+[2->1], [3->1], [4->3], [0->3], [2->0]
+
+@example
+1:2|3
+3:4| 0
+0: 2
+@end example
+
+@c ***********************************************************************
+@node Testbed Barriers
+@subsection Testbed Barriers
+
+The testbed subsystem's barriers API facilitates coordination among the
+peers run by the testbed and the experiment driver. The concept is
+similar to the barrier synchronisation mechanism found in parallel
+programming or multi-threading paradigms - a peer waits at a barrier upon
+reaching it until the barrier is reached by a predefined number of peers.
+This predefined number of peers required to cross a barrier is also called
+quorum. We say a peer has reached a barrier if the peer is waiting for the
+barrier to be crossed. Similarly a barrier is said to be reached if the
+required quorum of peers reach the barrier. A barrier which is reached is
+deemed as crossed after all the peers waiting on it are notified.
+
+The barriers API provides the following functions:
+@itemize @bullet
+@item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to
+initialise a barrier in the experiment
+@item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel
+a barrier which has been initialised before
+@item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal
+the barrier service that the caller has reached a barrier and is waiting
+for it to be crossed
+@item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to
+stop waiting for a barrier to be crossed
+@end itemize
+
+
+Among the above functions, the first two, namely
+@code{GNUNET_TESTBED_barrier_init()} and
+@code{GNUNET_TESTBED_barrier_cancel()} are used by experiment drivers. All
+barriers should be initialised by the experiment driver by calling
+@code{GNUNET_TESTBED_barrier_init()}.
This function takes a name to
+identify the barrier, the quorum required for the barrier to be crossed
+and a notification callback for notifying the experiment driver when the
+barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()} cancels an
+initialised barrier and frees the resources allocated for it. This
+function can be called upon an initialised barrier before it is crossed.
+
+The remaining two functions, @code{GNUNET_TESTBED_barrier_wait()} and
+@code{GNUNET_TESTBED_barrier_wait_cancel()}, are used in the peers'
+processes. @code{GNUNET_TESTBED_barrier_wait()} connects to the local
+barrier service running on the same host the peer is running on and
+registers that the caller has reached the barrier and is waiting for the
+barrier to be crossed. Note that this function can only be used by peers
+which are started by the testbed, as this function tries to access the
+local barrier service which is part of the testbed controller service.
+Calling @code{GNUNET_TESTBED_barrier_wait()} on an uninitialised barrier
+results in failure. @code{GNUNET_TESTBED_barrier_wait_cancel()} cancels
+the notification registered by @code{GNUNET_TESTBED_barrier_wait()}.
+
+
+@c ***********************************************************************
+@menu
+* Implementation::
+@end menu
+
+@node Implementation
+@subsubsection Implementation
+
+Since barriers involve coordination between the experiment driver and
+peers, the barrier service in the testbed controller is split into two
+components. The first component responds to the messages generated by the
+barrier API used by the experiment driver (functions
+@code{GNUNET_TESTBED_barrier_init()} and
+@code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
+messages generated by the barrier API used by peers (functions
+@code{GNUNET_TESTBED_barrier_wait()} and
+@code{GNUNET_TESTBED_barrier_wait_cancel()}). 
+
+Calling @code{GNUNET_TESTBED_barrier_init()} sends a
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
+controller. The master controller then registers a barrier and calls
+@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In
+this way barrier initialisation is propagated down the controller
+hierarchy. While propagating initialisation, any errors at a
+subcontroller, such as a timeout during further propagation, are reported
+up the hierarchy back to the experiment driver.
+
+Similar to @code{GNUNET_TESTBED_barrier_init()},
+@code{GNUNET_TESTBED_barrier_cancel()} propagates a
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes
+controllers to remove an initialised barrier.
+
+The second component is implemented as a separate service in the binary
+`gnunet-service-testbed' which already has the testbed controller service.
+Although this deviates from the GNUnet process architecture of having one
+service per binary, it is needed in this case as this component needs
+access to barrier data created by the first component. This component
+responds to @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages from
+local peers when they call @code{GNUNET_TESTBED_barrier_wait()}. Upon
+receiving a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the
+service checks if the requested barrier has been initialised. If it has
+not, an error status is sent through a
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to the local
+peer and the connection from the peer is terminated. If the barrier has
+been initialised, the barrier's counter for reached peers is incremented
+and a notification is registered to notify the peer when the barrier is
+reached. The connection from the peer is left open.
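The per-barrier bookkeeping described above (a counter of reached peers compared against the quorum) can be modelled with a minimal stand-alone sketch. The names here are hypothetical and the real service additionally tracks client connections and registered notifications:

```c
#include <assert.h>

/* Minimal model of a controller-side barrier: peers that report in are
 * counted as "reached"; the barrier crosses once the quorum is met.
 * Hypothetical names, for illustration only. */
struct Barrier
{
  unsigned int quorum;   /* number of peers required to cross */
  unsigned int reached;  /* number of peers currently waiting */
};

/* A peer reports that it has reached the barrier.
 * Returns 1 if this made the barrier cross, 0 otherwise. */
static int
barrier_reach (struct Barrier *b)
{
  b->reached++;
  return (b->reached >= b->quorum) ? 1 : 0;
}
```

With a quorum of 3, the first two calls to @code{barrier_reach()} leave the barrier uncrossed, and the third crosses it, at which point the real service would notify all waiting peers.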
+
+When enough peers to attain the quorum have sent
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller
+sends a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its
+parent, informing it that the barrier is crossed. If the controller has
+started further subcontrollers, it delays this message until it receives
+a similar notification from each of those subcontrollers. Finally, the
+barriers API at the experiment driver receives the
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message when the
+barrier is reached at all the controllers.
+
+The barriers API at the experiment driver responds to the
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it
+back to the master controller and notifying the experiment controller
+through the notification callback that a barrier has been crossed. The
+echoed @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is
+propagated by the master controller to the controller hierarchy. This
+propagation triggers the notifications registered by peers at each of the
+controllers in the hierarchy. Note the difference between this downward
+propagation of the @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS}
+message and its upward propagation --- the upward propagation is needed
+for ensuring that the barrier is reached by all the controllers, while the
+downward propagation is for signalling that the barrier has been crossed.
+
+@cindex PlanetLab testbed
+@node Automatic large-scale deployment in the PlanetLab testbed
+@subsection Automatic large-scale deployment in the PlanetLab testbed
+
+PlanetLab is a testbed for computer networking and distributed systems
+research. It was established in 2002 and as of June 2010 was composed of
+1090 nodes at 507 sites worldwide.
+
+To automate the deployment of GNUnet, we created a set of automation
+tools to simplify the large-scale deployment. We provide a set of scripts
+you can use to deploy GNUnet on a set of nodes and manage your
+installation.
+
+Please also check @uref{https://gnunet.org/installation-fedora8-svn} and
+@uref{https://gnunet.org/installation-fedora12-svn} to find detailed
+instructions on how to install GNUnet on a PlanetLab node.
+
+
+@c ***********************************************************************
+@menu
+* PlanetLab Automation for Fedora8 nodes::
+* Install buildslave on PlanetLab nodes running fedora core 8::
+* Setup a new PlanetLab testbed using GPLMT::
+* Why do i get an ssh error when using the regex profiler?::
+@end menu
+
+@node PlanetLab Automation for Fedora8 nodes
+@subsubsection PlanetLab Automation for Fedora8 nodes
+
+@c ***********************************************************************
+@node Install buildslave on PlanetLab nodes running fedora core 8
+@subsubsection Install buildslave on PlanetLab nodes running fedora core 8
+@c ** Actually this is a subsubsubsection, but must be fixed differently
+@c ** as subsubsection is the lowest.
+
+Since most of the PlanetLab nodes are running the very old Fedora core 8
+image, installing the buildslave software is quite painful. For our
+PlanetLab testbed we worked out the best way to install the buildslave
+software.
+
+@c This is a very terrible way to suggest installing software.
+@c FIXME: Is there an official, safer way instead of blind-piping a
+@c script?
+@c FIXME: Use newer pypi URLs below.
+
+Install Distribute for Python:
+
+@example
+curl http://python-distribute.org/distribute_setup.py | sudo python
+@end example
+
+Install zope.interface <= 3.8.0 (4.0 and 4.0.1 will not
+work):
+
+@example
+export PYPI=@value{PYPI-URL}
+wget $PYPI/z/zope.interface/zope.interface-3.8.0.tar.gz
+tar xvfz zope.interface-3.8.0.tar.gz
+cd zope.interface-3.8.0
+sudo python setup.py install
+@end example
+
+Install the buildslave software (0.8.6 was the latest version):
+
+@example
+export GCODE="http://buildbot.googlecode.com/files"
+wget $GCODE/buildbot-slave-0.8.6p1.tar.gz
+tar xvfz buildbot-slave-0.8.6p1.tar.gz
+cd buildbot-slave-0.8.6p1
+sudo python setup.py install
+@end example
+
+The setup will download the matching twisted package and install it.
+It will also try to install the latest version of zope.interface, which
+will fail to install. Buildslave will work anyway, since version 3.8.0
+was installed before!
+
+@c ***********************************************************************
+@node Setup a new PlanetLab testbed using GPLMT
+@subsubsection Setup a new PlanetLab testbed using GPLMT
+
+@itemize @bullet
+@item Get a new slice and assign nodes
+Ask your PlanetLab PI to give you a new slice and assign the nodes you
+need
+@item Install a buildmaster
+You can stick to the buildbot documentation:@
+@uref{http://buildbot.net/buildbot/docs/current/manual/installation.html}
+@item Install the buildslave software on all nodes
+To install the buildslave on all nodes assigned to your slice you can use
+the tasklist @code{install_buildslave_fc8.xml} provided with GPLMT:
+
+@example
+./gplmt.py -c contrib/tumple_gnunet.conf -t \
+contrib/tasklists/install_buildslave_fc8.xml -a -p <planetlab password>
+@end example
+
+@item Create the buildmaster configuration and the slave setup commands
+
+The master and the slaves need to have credentials and the
+master has to have all nodes configured. 
This can be done with the
+@file{create_buildbot_configuration.py} script in the @file{scripts}
+directory.
+
+This script takes a list of nodes retrieved directly from PlanetLab or
+read from a file and a configuration template and creates:
+
+@itemize @bullet
+@item a tasklist which can be executed with gplmt to set up the slaves
+@item a @file{master.cfg} file containing the PlanetLab nodes
+@end itemize
+
+A configuration template is included in the @file{contrib} directory;
+most importantly, the script replaces the following tags in the template:
+
+@example
+%GPLMT_BUILDER_DEFINITION
+%GPLMT_BUILDER_SUMMARY
+%GPLMT_SLAVES
+%GPLMT_SCHEDULER_BUILDERS
+@end example
+
+Create configuration for all nodes assigned to a slice:
+
+@example
+./create_buildbot_configuration.py -u <planetlab username> \
+-p <planetlab password> -s <slice> -m <buildmaster+port> \
+-t <template>
+@end example
+
+Create configuration for some nodes in a file:
+
+@example
+./create_buildbot_configuration.py -f <node_file> \
+-m <buildmaster+port> -t <template>
+@end example
+
+@item Copy the @file{master.cfg} to the buildmaster and start it
+Use @code{buildbot start <basedir>} to start the server
+@item Setup the buildslaves
+@end itemize
+
+@c ***********************************************************************
+@node Why do i get an ssh error when using the regex profiler?
+@subsubsection Why do i get an ssh error when using the regex profiler?
+
+Why do I get an ssh error "Permission denied (publickey,password)." when
+using the regex profiler although passwordless ssh to localhost works
+using publickey and ssh-agent?
+
+You have to generate a public/private-key pair with no password:@
+@code{ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_localhost}@
+and then add the following to your ~/.ssh/config file:
+
+@example
+Host 127.0.0.1
+IdentityFile ~/.ssh/id_localhost
+@end example
+
+Now make sure your hosts file looks like this:
+
+@example
+[USERNAME]@@127.0.0.1:22@
+[USERNAME]@@127.0.0.1:22
+@end example
+
+You can test your setup by running @code{ssh 127.0.0.1} in a
+terminal and then in the opened session run it again.
+If you were not asked for a password on either login,
+then you should be good to go.
+
+@cindex TESTBED Caveats
+@node TESTBED Caveats
+@subsection TESTBED Caveats
+
+This section documents a few caveats when using the GNUnet testbed
+subsystem.
+
+@c ***********************************************************************
+@menu
+* CORE must be started::
+* ATS must want the connections::
+@end menu
+
+@node CORE must be started
+@subsubsection CORE must be started
+
+A simple issue is #3993: Your configuration MUST somehow ensure that for
+each peer the CORE service is started when the peer is set up, otherwise
+TESTBED may fail to connect peers when the topology is initialized, as
+TESTBED will start some CORE services but not necessarily all (but it
+relies on all of them running). The easiest way is to set
+'FORCESTART = YES' in the '[core]' section of the configuration file.
+Alternatively, having any service that directly or indirectly depends on
+CORE being started with FORCESTART will also do. This issue largely arises
+if users try to over-optimize by not starting any services with
+FORCESTART.
+
+@c ***********************************************************************
+@node ATS must want the connections
+@subsubsection ATS must want the connections
+
+When TESTBED sets up connections, it only offers the respective HELLO
+information to the TRANSPORT service. It is then up to the ATS service to
+@strong{decide} to use the connection. 
The ATS service will typically
+eagerly establish any connection if the number of total connections is
+low (relative to bandwidth). Details may further depend on the
+specific ATS backend that was configured. If ATS decides to NOT establish
+a connection (even though TESTBED provided the required information), then
+that connection will count as failed for TESTBED. Note that you can
+configure TESTBED to tolerate a certain number of connection failures
+(see the '-e' option of gnunet-testbed-profiler). This issue largely
+arises for dense overlay topologies, especially if you try to create
+cliques with more than 20 peers.
+
+@cindex libgnunetutil
+@node libgnunetutil
+@section libgnunetutil
+
+libgnunetutil is the fundamental library that all GNUnet code builds upon.
+Ideally, this library should contain most of the platform-dependent code
+(except for user interfaces and really special needs that only a few
+applications have). It is also supposed to offer basic services that most
+if not all GNUnet binaries require. The code of libgnunetutil is in the
+@file{src/util/} directory. The public interface to the library is in the
+@file{gnunet_util.h} header. 
The functions provided by libgnunetutil fall
+roughly into the following categories (in approximate order of importance
+for new developers):
+
+@itemize @bullet
+@item logging (common_logging.c)
+@item memory allocation (common_allocation.c)
+@item endianness conversion (common_endian.c)
+@item internationalization (common_gettext.c)
+@item String manipulation (string.c)
+@item file access (disk.c)
+@item buffered disk IO (bio.c)
+@item time manipulation (time.c)
+@item configuration parsing (configuration.c)
+@item command-line handling (getopt*.c)
+@item cryptography (crypto_*.c)
+@item data structures (container_*.c)
+@item CPS-style scheduling (scheduler.c)
+@item Program initialization (program.c)
+@item Networking (network.c, client.c, server*.c, service.c)
+@item message queueing (mq.c)
+@item bandwidth calculations (bandwidth.c)
+@item Other OS-related (os*.c, plugin.c, signal.c)
+@item Pseudonym management (pseudonym.c)
+@end itemize
+
+It should be noted that only developers that fully understand this entire
+API will be able to write good GNUnet code.
+
+Ideally, porting GNUnet should only require porting the gnunetutil
+library. More testcases for the gnunetutil APIs are therefore a great
+way to make porting of GNUnet easier.
+
+@menu
+* Logging::
+* Interprocess communication API (IPC)::
+* Cryptography API::
+* Message Queue API::
+* Service API::
+* Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
+* CONTAINER_MDLL API::
+@end menu
+
+@cindex Logging
+@cindex log levels
+@node Logging
+@subsection Logging
+
+GNUnet is able to log its activity, mostly for the purposes of debugging
+the program at various levels.
+
+@file{gnunet_common.h} defines several @strong{log levels}:
+@table @asis
+
+@item ERROR for errors (really problematic situations, often leading to
+crashes)
+@item WARNING for warnings (troubling situations that might have
+negative consequences, although not fatal)
+@item INFO for various information. 
+
+Used somewhat rarely, as GNUnet statistics is used to hold and display
+most of the information that users might find interesting.
+@item DEBUG for debugging.
+Does not produce much output on normal builds, but when extra logging is
+enabled at compile time, a staggering amount of data is output under
+this log level.
+@end table
+
+
+Normal builds of GNUnet (configured with @code{--enable-logging[=yes]})
+are supposed to log nothing under DEBUG level. The
+@code{--enable-logging=verbose} configure option can be used to create a
+build with all logging enabled. However, such a build will produce large
+amounts of log data, which is inconvenient when one tries to hunt down a
+specific problem.
+
+To mitigate this problem, GNUnet provides facilities to apply a filter to
+reduce the logs:
+@table @asis
+
+@item Logging by default When no log levels are configured in any other
+way (see below), GNUnet will default to the WARNING log level. This
+mostly applies to GNUnet command line utilities, services and daemons;
+tests will always set log level to WARNING or, if
+@code{--enable-logging=verbose} was passed to configure, to DEBUG. The
+default level is suggested for normal operation.
+@item The -L option Most GNUnet executables accept an "-L loglevel" or
+"--log=loglevel" option. If used, it makes the process set a global log
+level to "loglevel". Thus it is possible to run some processes
+with -L DEBUG, for example, and others with -L ERROR to enable specific
+settings to diagnose problems with a particular process.
+@item Configuration files. Because GNUnet
+service and daemon processes are usually launched by gnunet-arm, it is not
+possible to pass different custom command line options directly to every
+one of them. The options passed to @code{gnunet-arm} only affect
+gnunet-arm and not the rest of GNUnet. However, one can specify a
+configuration key "OPTIONS" in the section that corresponds to a service
+or a daemon, and put a value of "-L loglevel" there. 
This will make the +respective service or daemon set its log level to "loglevel" (as the +value of OPTIONS will be passed as a command-line argument). + +To specify the same log level for all services without creating separate +"OPTIONS" entries in the configuration for each one, the user can specify +a config key "GLOBAL_POSTFIX" in the [arm] section of the configuration +file. The value of GLOBAL_POSTFIX will be appended to all command lines +used by the ARM service to run other services. It can contain any option +valid for all GNUnet commands, thus in particular the "-L loglevel" +option. The ARM service itself is, however, unaffected by GLOBAL_POSTFIX; +to set log level for it, one has to specify "OPTIONS" key in the [arm] +section. +@item Environment variables. +Setting global per-process log levels with "-L loglevel" does not offer +sufficient log filtering granularity, as one service will call interface +libraries and supporting libraries of other GNUnet services, potentially +producing lots of debug log messages from these libraries. Also, changing +the config file is not always convenient (especially when running the +GNUnet test suite).@ To fix that, and to allow GNUnet to use different +log filtering at runtime without re-compiling the whole source tree, the +log calls were changed to be configurable at run time. To configure them +one has to define environment variables "GNUNET_FORCE_LOGFILE", +"GNUNET_LOG" and/or "GNUNET_FORCE_LOG": +@itemize @bullet + +@item "GNUNET_LOG" only affects the logging when no global log level is +configured by any other means (that is, the process does not explicitly +set its own log level, there are no "-L loglevel" options on command line +or in configuration files), and can be used to override the default +WARNING log level. + +@item "GNUNET_FORCE_LOG" will completely override any other log +configuration options given. 
+ +@item "GNUNET_FORCE_LOGFILE" will completely override the location of the +file to log messages to. It should contain a relative or absolute file +name. Setting GNUNET_FORCE_LOGFILE is equivalent to passing +"--log-file=logfile" or "-l logfile" option (see below). It supports "[]" +format in file names, but not "@{@}" (see below). +@end itemize + + +Because environment variables are inherited by child processes when they +are launched, starting or re-starting the ARM service with these +variables will propagate them to all other services. + +"GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially +formatted @strong{logging definition} string, which looks like this:@ + +@c FIXME: Can we close this with [/component] instead? +@example +[component];[file];[function];[from_line[-to_line]];loglevel[/component...] +@end example + +That is, a logging definition consists of definition entries, separated by +slashes ('/'). If only one entry is present, there is no need to add a +slash to its end (although it is not forbidden either).@ All definition +fields (component, file, function, lines and loglevel) are mandatory, but +(except for the loglevel) they can be empty. An empty field means +"match anything". 
Note that even if fields are empty, the semicolon (';')
+separators must be present.@ The loglevel field is mandatory, and must
+contain one of the log level names (ERROR, WARNING, INFO or DEBUG).@
+The lines field might contain one non-negative number, in which case it
+matches only one line, or a range "from_line-to_line", in which case it
+matches any line in the interval [from_line;to_line] (that is, including
+both start and end line).@ GNUnet mostly defaults the component name to
+the name of the service that is implemented in a process ('transport',
+'core', 'peerinfo', etc), but logging calls can specify custom component
+names using @code{GNUNET_log_from}.@ File name and function name are
+provided by the compiler (__FILE__ and __FUNCTION__ built-ins).
+
+Component, file and function fields are interpreted as non-extended
+regular expressions (GNU libc regex functions are used). Matching is
+case-sensitive, and "^" and "$" will match the beginning and the end of
+the text. If a field is empty, its contents are automatically replaced
+with a ".*" regular expression, which matches anything. Matching is done
+in the default way, which means that the expression matches as long as it
+is contained anywhere in the string. Thus "GNUNET_" will match both
+"GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$' to make sure that
+the expression matches at the start and/or at the end of the string.
+The semicolon (';') can't be escaped, and GNUnet will not use it in
+component names (it can't be used in function names and file names
+anyway).
+
+@end table
+
+
+Every logging call in GNUnet code will be (at run time) matched against
+the log definitions passed to the process. If a log definition's fields
+match the call arguments, then the call log level is compared to the
+log level of that definition. If the call log level is less than or equal
+to the definition log level, the call is allowed to proceed. Otherwise the
+logging call is forbidden, and nothing is logged. 
If no definitions
+matched at all, GNUnet will use the global log level or (if a global log
+level is not specified) will default to WARNING (that is, it will allow
+the call to proceed, if its level is less than or equal to the global log
+level or to WARNING).
+
+That is, definitions are evaluated from left to right, and the first
+matching definition is used to allow or deny the logging call. Thus it is
+advised to place narrow definitions at the beginning of the logdef
+string, and generic definitions at the end.
+
+Whether a call is allowed or not is only decided the first time this
+particular call is made. The evaluation result is then cached, so that
+any attempts to make the same call later will be allowed or disallowed
+right away. Because of that, runtime log level evaluation should not
+significantly affect the process performance.
+Log definition parsing is only done once, at the first call to
+GNUNET_log_setup () made by the process (which is usually done soon after
+it starts).
+
+At the moment of writing there is no way to specify logging definitions
+from configuration files, only via environment variables.
+
+At the moment GNUnet will stop processing a log definition when it
+encounters an error in definition formatting or an error in regular
+expression syntax, and will not report the failure in any way.
+
+
+@c ***********************************************************************
+@menu
+* Examples::
+* Log files::
+* Updated behavior of GNUNET_log::
+@end menu
+
+@node Examples
+@subsubsection Examples
+
+@table @asis
+
+@item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s} Start GNUnet
+process tree, running all processes with DEBUG level (one should be
+careful with it, as log files will grow at an alarming rate!)
+@item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s} Start GNUnet
+process tree, running the core service under DEBUG level (everything else
+will use configured or default level).
+
+@item Start GNUnet process tree, allowing any logging calls from
+gnunet-service-transport_validation.c (everything else will use
+configured or default level).
+
+@example
+GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;; DEBUG" \
+gnunet-arm -s
+@end example
+
+@item Start GNUnet process tree, allowing any logging calls from
+gnunet-service-fs_push.c (everything else will use configured or
+default level).
+
+@example
+GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s
+@end example
+
+@item Start GNUnet process tree, allowing any logging calls from the
+GNUNET_NETWORK_socket_select function (everything else will use
+configured or default level).
+
+@example
+GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s
+@end example
+
+@item Start GNUnet process tree, allowing any logging calls from the
+components that have "transport" in their names, and are made from
+functions that have "send" in their names. Everything else will be allowed
+to be logged only if it has WARNING level.
+
+@example
+GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s
+@end example
+
+@end table
+
+
+On Windows, one can use batch files to run GNUnet processes with special
+environment variables, without affecting the whole system. Such a batch
+file will look like this:
+
+@example
+set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG
+gnunet-arm -s
+@end example
+
+(note the absence of double quotes in the environment variable definition,
+as opposed to earlier examples, which use the shell).
+Another limitation: on Windows, GNUNET_FORCE_LOGFILE @strong{MUST} be set
+in order for GNUNET_FORCE_LOG to work.
+
+
+@cindex Log files
+@node Log files
+@subsubsection Log files
+
+GNUnet can be told to log everything into a file instead of stderr (which
+is the default) using the "--log-file=logfile" or "-l logfile" option.
+
+This option can also be passed via command line, or from the "OPTIONS" and
+"GLOBAL_POSTFIX" configuration keys (see above). The file name passed
+with this option is subject to GNUnet filename expansion. If specified in
+"GLOBAL_POSTFIX", it is also subject to ARM service filename expansion,
+in particular, it may contain the "@{@}" (left and right curly brace)
+sequence, which will be replaced by ARM with the name of the service.
+This is used to keep logs from more than one service separate, while only
+specifying one template containing "@{@}" in GLOBAL_POSTFIX.
+
+As part of a secondary file name expansion, the first occurrence of the
+"[]" sequence ("left square bracket" followed by "right square bracket")
+in the file name will be replaced with the process identifier of the
+process when it initializes its logging subsystem. As a result, all
+processes will log into different files. This is convenient for isolating
+messages of a particular process, and prevents I/O races when multiple
+processes try to write into the file at the same time. This expansion is
+done independently of the "@{@}" expansion that the ARM service does (see
+above).
+
+The log file name that is specified via "-l" can contain format characters
+from the 'strftime' function family. For example, "%Y" will be replaced
+with the current year. Using "basename-%Y-%m-%d.log" would include the
+current year, month and day in the log file. If a GNUnet process runs for
+long enough to need more than one log file, it will eventually clean up
+old log files. Currently, only the last three log files (plus the current
+log file) are preserved. So once the fifth log file goes into use (so
+after 4 days if you use "%Y-%m-%d" as above), the first log file will be
+automatically deleted. Note that if your log file name only contains "%Y",
+then log files would be kept for 4 years and the logs from the first year
+would be deleted once year 5 begins. 
If you do not use any date-related
+string format codes, logs will never be automatically deleted by GNUnet.
+
+
+@c ***********************************************************************
+
+@node Updated behavior of GNUNET_log
+@subsubsection Updated behavior of GNUNET_log
+
+It's currently quite common to see constructions like this all over the
+code:
+
+@example
+#if MESH_DEBUG
+GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
+#endif
+@end example
+
+The reason for the #if is not to avoid displaying the message when
+disabled (GNUNET_ERROR_TYPE takes care of that), but to avoid the
+compiler including it in the binary at all, when compiling GNUnet for
+platforms with restricted storage space / memory (MIPS routers,
+ARM plug computers / dev boards, etc).
+
+This presents several problems: the code gets ugly, hard to write and it
+is very easy to forget to include the #if guards, creating inconsistent
+code. A new change in GNUNET_log aims to solve these problems.
+
+@strong{This change requires running @file{./configure} with at least
+@code{--enable-logging=verbose} to see debug messages.}
+
+Here is an example of code with dense debug statements:
+
+@example
+switch (restrict_topology) @{
+case GNUNET_TESTING_TOPOLOGY_CLIQUE:#if VERBOSE_TESTING
+GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, _("Blacklisting all but clique
+topology\n")); #endif unblacklisted_connections = create_clique (pg,
+&remove_connections, BLACKLIST, GNUNET_NO); break; case
+GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING: #if VERBOSE_TESTING GNUNET_log
+(GNUNET_ERROR_TYPE_DEBUG, _("Blacklisting all but small world (ring)
+topology\n")); #endif unblacklisted_connections = create_small_world_ring
+(pg,&remove_connections, BLACKLIST); break;
+@end example
+
+
+Pretty hard to follow, huh?
+
+From now on, it is not necessary to include the #if / #endif statements to
+achieve the same behavior. 
The GNUNET_log and GNUNET_log_from macros take
+care of it for you, depending on the configure option:
+
+@itemize @bullet
+@item If @code{--enable-logging} is set to @code{no}, the binary will
+contain no log messages at all.
+@item If @code{--enable-logging} is set to @code{yes}, the binary will
+contain no DEBUG messages, and therefore running with -L DEBUG will have
+no effect. Other messages (ERROR, WARNING, INFO, etc) will be included.
+@item If @code{--enable-logging} is set to @code{verbose} or
+@code{veryverbose}, the binary will contain DEBUG messages (still, it will
+be necessary to run with -L DEBUG or set the DEBUG config option to show
+them).
+@end itemize
+
+
+If you are a developer:
+@itemize @bullet
+@item please make sure that you @code{./configure
+--enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
+@item please remove the @code{#if} statements around @code{GNUNET_log
+(GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your
+code.
+@end itemize
+
+Since activating DEBUG now automatically makes it VERBOSE and activates
+@strong{all} debug messages by default, you probably want to use the
+https://gnunet.org/logging functionality to filter only relevant messages.
+A suitable configuration could be:
+
+@example
+$ export GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"
+@end example
+
+This will behave almost like enabling DEBUG in that subsystem before the
+change. Of course you can adapt it to your particular needs; this is only
+a quick example.

@cindex Interprocess communication API
@cindex IPC
@node Interprocess communication API (IPC)
@subsection Interprocess communication API (IPC)

In GNUnet, a variety of new message types might be defined and used in
interprocess communication. In this tutorial, we use the
@code{struct AddressLookupMessage} as an example to introduce how to
construct our own message type in GNUnet and how to implement message
communication between service and client.
(Here, a client uses the @code{struct AddressLookupMessage} as a request
to ask the server to return the address of any other peer connected to
the service.)


@c ***********************************************************************
@menu
* Define new message types::
* Define message struct::
* Client - Establish connection::
* Client - Initialize request message::
* Client - Send request and receive response::
* Server - Startup service::
* Server - Add new handles for specified messages::
* Server - Process request message::
* Server - Response to client::
* Server - Notification of clients::
* Conversion between Network Byte Order (Big Endian) and Host Byte Order::
@end menu

@node Define new message types
@subsubsection Define new message types

First of all, you should define the new message type in
@file{gnunet_protocols.h}:

@example
 // Request to look up addresses of peers in server.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
 // Response to the address lookup request.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
@end example

@c ***********************************************************************
@node Define message struct
@subsubsection Define message struct

After the type definition, the specified message structure should also
be described in the header file, e.g. @file{transport.h} in our case.

@example
GNUNET_NETWORK_STRUCT_BEGIN
struct AddressLookupMessage
@{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};
GNUNET_NETWORK_STRUCT_END
@end example


Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED},
which both ensure correct alignment when sending structs over the
network.

@c ***********************************************************************
@node Client - Establish connection
@subsubsection Client - Establish connection
@c %**end of header


First, on the client side, the underlying API is employed to create a
new connection to a service; in our example, the transport service would
be connected.

@example
struct GNUNET_CLIENT_Connection *client;
client = GNUNET_CLIENT_connect ("transport", cfg);
@end example

@c ***********************************************************************
@node Client - Initialize request message
@subsubsection Client - Initialize request message
@c %**end of header

When the connection is ready, we initialize the message. In this step,
all the fields of the message should be properly initialized, namely the
size, type, and some extra user-defined data, such as the timeout, the
address and the name of the transport.

@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans) + 1;
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
char *addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
char *tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
@end example

Note that the functions @code{htonl}, @code{htons} and
@code{GNUNET_TIME_absolute_hton} are applied here to convert host byte
order into network byte order; for the usage of the two byte orders and
the corresponding conversion functions, please refer to the section on
Network Byte Order (Big Endian) and Host Byte Order.

@c ***********************************************************************
@node Client - Send request and receive response
@subsubsection Client - Send request and receive response
@c %**end of header

@b{FIXME: This is very outdated, see the tutorial for the current API!}

Next, the client sends the constructed message as a request to the
service and waits for the response from the service. To accomplish this
goal, there are a number of API calls that can be used. In this example,
@code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
appropriate function to use.

@example
GNUNET_CLIENT_transmit_and_get_response
(client, &msg->header, timeout, GNUNET_YES, &address_response_processor,
arp_ctx);
@end example

The argument @code{address_response_processor} is a function of type
@code{GNUNET_CLIENT_MessageHandler}, which is used to process the
reply message from the service.
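The allocate-and-copy pattern above (fixed header, raw address bytes, then a 0-terminated transport name) can be tried outside GNUnet in plain C. The following sketch uses a simplified stand-in for @code{struct GNUNET_MessageHeader}; all @code{Demo} names are illustrative, not GNUnet API:

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct GNUNET_MessageHeader. */
struct DemoHeader
{
  uint16_t size;  /* total message size, network byte order */
  uint16_t type;  /* message type, network byte order */
};

/* Allocate a message consisting of a header, 'addrlen' raw address
   bytes and a 0-terminated transport name; returns NULL on allocation
   failure.  The caller frees the result; *out_len receives the total
   message size. */
void *
demo_pack_lookup (uint16_t type,
                  const void *addr, size_t addrlen,
                  const char *name, size_t *out_len)
{
  size_t len = sizeof (struct DemoHeader) + addrlen + strlen (name) + 1;
  struct DemoHeader *h = malloc (len);
  char *p;

  if (NULL == h)
    return NULL;
  h->size = htons ((uint16_t) len);
  h->type = htons (type);
  p = (char *) &h[1];               /* variable part starts after header */
  memcpy (p, addr, addrlen);        /* 'addrlen' address bytes */
  memcpy (p + addrlen, name, strlen (name) + 1); /* name incl. '\0' */
  *out_len = len;
  return h;
}
```

The key point mirrored from the text: the variable-length tail lives directly behind the fixed struct, addressed via `&msg[1]`.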

@node Server - Startup service
@subsubsection Server - Startup service

On the server side, we run a standard GNUnet service startup sequence
using @code{GNUNET_SERVICE_run}, as follows:

@example
int
main (int argc, char **argv)
@{
  return GNUNET_SERVICE_run (argc, argv, "transport",
                             GNUNET_SERVICE_OPTION_NONE, &run, NULL);
@}
@end example

@c ***********************************************************************
@node Server - Add new handles for specified messages
@subsubsection Server - Add new handles for specified messages
@c %**end of header

In the function above, the argument @code{run} is used to initialize the
transport service, and is defined like this:

@example
static void
run (void *cls,
     struct GNUNET_SERVER_Handle *serv,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
@end example


Here, @code{GNUNET_SERVER_add_handlers} must be called in the run
function to add new handlers to the service. The parameter
@code{handlers} is a list of @code{struct GNUNET_SERVER_MessageHandler}
entries that tell the service which function should be called when a
particular type of message is received; it should be defined in this
way:

@example
static struct GNUNET_SERVER_MessageHandler handlers[] = @{
  @{&handle_start, NULL,
   GNUNET_MESSAGE_TYPE_TRANSPORT_START, 0@},
  @{&handle_send, NULL,
   GNUNET_MESSAGE_TYPE_TRANSPORT_SEND, 0@},
  @{&handle_try_connect, NULL,
   GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
   sizeof (struct TryConnectMessage)@},
  @{&handle_address_lookup, NULL,
   GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP, 0@},
  @{NULL, NULL, 0, 0@}
@};
@end example


As shown, the first member of each entry is a callback function, which
is called to process the message type given as the third member.
The second member is the closure for the callback
function, which is set to @code{NULL} in most cases, and the last member
is the expected size of messages of this type; usually we set it to 0 to
accept variable sizes, but in special cases the exact size of the
specified message can also be set. In addition, the terminator entry
@code{@{NULL, NULL, 0, 0@}} is set in the last area.

@c ***********************************************************************
@node Server - Process request message
@subsubsection Server - Process request message
@c %**end of header

After the initialization of the transport service, the request message
will be processed. Before handling the main message data, the validity
of this message should be checked, e.g., whether the size of the message
is correct:

@example
size = ntohs (message->size);
if (size < sizeof (struct AddressLookupMessage))
@{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
@end example


Note that, in contrast to the construction of the request message in the
client, on the server the functions @code{ntohl} and @code{ntohs} should
be employed when extracting data from the message, so that the data in
network byte order is converted back into host byte order. For more
details, please refer to the section on Network Byte Order (Big Endian)
and Host Byte Order.
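The size check shown above can be exercised standalone in plain C. This sketch reuses a simplified stand-in for the GNUnet message header (the @code{demo_} names are illustrative) and validates a received buffer, including the 0-termination of the trailing transport name:

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for struct GNUNET_MessageHeader. */
struct DemoHeader
{
  uint16_t size;
  uint16_t type;
};

/* Return 1 if 'buf' holds a plausible lookup message: the declared
   size must fit in the buffer, cover the fixed part plus 'addrlen'
   address bytes, and the trailing transport name must be
   0-terminated.  Return 0 otherwise. */
int
demo_validate_lookup (const void *buf, size_t buflen, size_t addrlen)
{
  const struct DemoHeader *h = buf;
  uint16_t size;
  const char *name;

  if (buflen < sizeof (*h))
    return 0;                       /* too short for even a header */
  size = ntohs (h->size);           /* wire size -> host byte order */
  if ((size > buflen) || (size < sizeof (*h) + addrlen + 1))
    return 0;                       /* declared size implausible */
  name = (const char *) &h[1] + addrlen;
  return '\0' == name[size - sizeof (*h) - addrlen - 1];
}
```

Note how `ntohs` is applied to the size field before any arithmetic, mirroring the server-side extraction rule described above.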

Moreover, in this example the name of the transport stored in the
message is a 0-terminated string, so we should also check whether the
name of the transport in the received message is 0-terminated:

@example
nameTransport = (const char *) &address[addressLen];
if (nameTransport[size - sizeof (struct AddressLookupMessage)
                  - addressLen - 1] != '\0')
@{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
@end example

Here, @code{GNUNET_SERVER_receive_done} should be called to tell the
service that the request has been handled and the next message can be
received. The argument @code{GNUNET_SYSERR} here indicates that the
service did not understand the request message, and the processing of
this request will be terminated.

In contrast to the situation above, when the argument is equal to
@code{GNUNET_OK}, the service will continue to process the request
message.

@c ***********************************************************************
@node Server - Response to client
@subsubsection Server - Response to client
@c %**end of header

Once the processing of the current request is done, the server should
send a response to the client. A new @code{struct AddressLookupMessage}
is produced by the server in a similar way as on the client and sent to
the client, but here the type should be
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} as in the client.
@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans) + 1;
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);

// ...

struct GNUNET_SERVER_TransmitContext *tc;
tc = GNUNET_SERVER_transmit_context_create (client);
GNUNET_SERVER_transmit_context_append_data
  (tc, NULL, 0, GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
GNUNET_SERVER_transmit_context_run (tc, rtimeout);
@end example


Note that there are also a number of other APIs the service can use to
send the message.

@c ***********************************************************************
@node Server - Notification of clients
@subsubsection Server - Notification of clients
@c %**end of header

Often a service needs to (repeatedly) transmit notifications to a client
or a group of clients. In these cases, the client has typically
registered once for a set of events and then needs to receive a message
whenever such an event happens (until the client disconnects). The use
of a notification context can help manage message queues to clients and
handle disconnects. Notification contexts can be used to send
individualized messages to a particular client or to broadcast messages
to a group of clients. An individualized notification might look like
this:

@example
GNUNET_SERVER_notification_context_unicast (nc,
                                            client,
                                            msg,
                                            GNUNET_YES);
@end example


Note that after processing the original registration message for
notifications, the server code still typically needs to call
@code{GNUNET_SERVER_receive_done} so that the client can transmit
further messages to the server.

@c ***********************************************************************
@node Conversion between Network Byte Order (Big Endian) and Host Byte Order
@subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
@c %** subsub? it's a referenced page on the ipc document.
@c %**end of header

For our purposes, big endian corresponds to Network Byte Order, while
Host Byte Order is whatever byte order the local machine uses (little
endian on most common hardware). What is the difference between the two?

Our host computer usually stores data in Host Byte Order. For example,
an integer stored in RAM might occupy 4 bytes; on a little-endian host,
the least significant byte is stored at the lowest address and the most
significant byte at the highest address. Network Byte Order takes the
opposite approach: it is big endian, so the most significant byte is
stored at the lowest address and the least significant byte at the
highest address.

In network communication, information is exchanged in the form of data
packets, and any two hosts that want to communicate must send and
receive such packets through the network. Since the two hosts may use
different native byte orders, the byte order must be converted to
Network Byte Order before sending and back to Host Byte Order after
receiving, so that the data keeps its meaning in transit.

There are ten convenient functions for byte order conversion in GNUnet:

@table @asis

@item @code{uint16_t htons (uint16_t hostshort)}
Convert host byte order to network byte order for a 16-bit integer.
@item @code{uint32_t htonl (uint32_t hostlong)}
Convert host byte order to network byte order for a 32-bit integer.
@item @code{uint16_t ntohs (uint16_t netshort)}
Convert network byte order to host byte order for a 16-bit integer.
@item @code{uint32_t ntohl (uint32_t netlong)}
Convert network byte order to host byte order for a 32-bit integer.
@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
Convert network byte order to host byte order for a 64-bit integer.
@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
Convert host byte order to network byte order for a 64-bit integer.
@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
Convert relative time to network byte order.
@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
Convert relative time from network byte order.

@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
Convert absolute time to network byte order.
@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
Convert absolute time from network byte order.
@end table

@cindex Cryptography API
@node Cryptography API
@subsection Cryptography API
@c %**end of header

The gnunetutil APIs provide the cryptographic primitives used in GNUnet.
GNUnet uses 2048-bit RSA keys for the session key exchange, for signing
messages by peers, and for most other public-key operations. Most
researchers in cryptography consider 2048-bit RSA keys secure and
practically unbreakable for a long time. The API provides functions to
create a fresh key pair, read a private key from a file (or create a new
file if the file does not exist), encrypt, decrypt, sign, verify and
extract the public key into a format suitable for network transmission.

For the encryption of files and the actual data exchanged between peers,
GNUnet uses 256-bit AES encryption. Fresh session keys are negotiated
for every new connection. Again, there is no published technique to
break this cipher in any realistic amount of time. The API provides
functions for generation of keys, validation of keys (important for
checking that decryptions using RSA succeeded), encryption and
decryption.

GNUnet uses SHA-512 for computing one-way hash codes. The API provides
functions to compute a hash over a block in memory or over a file on
disk.

The crypto API also provides functions for randomizing a block of
memory, obtaining a single random number and generating a permutation of
the numbers 0 to n-1. Random number generation distinguishes between
WEAK and STRONG random number quality; WEAK random numbers are
pseudo-random, whereas STRONG random numbers use entropy gathered from
the operating system.
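The WEAK/STRONG distinction can be illustrated outside of GNUnet. The following sketch (not the GNUnet implementation; it assumes a Unix-like system with @file{/dev/urandom}) contrasts a deterministic C library PRNG with entropy drawn from the operating system:

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* "WEAK" quality: deterministic pseudo-random bytes from the C
   library PRNG -- the same seed reproduces the same sequence. */
void
demo_weak_random (unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    buf[i] = (unsigned char) (rand () & 0xFF);
}

/* "STRONG" quality: entropy gathered from the operating system.
   Returns 0 on success, -1 on failure. */
int
demo_strong_random (unsigned char *buf, size_t len)
{
  FILE *f = fopen ("/dev/urandom", "rb");
  size_t got;

  if (NULL == f)
    return -1;
  got = fread (buf, 1, len, f);
  fclose (f);
  return (got == len) ? 0 : -1;
}
```

The practical consequence matches the text: WEAK output is fine for jitter and sampling, while anything security-relevant must use the STRONG path.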

Finally, the crypto API provides a means to deterministically generate
a 1024-bit RSA key from a hash code. These functions should most likely
not be used by most applications; most importantly,
@code{GNUNET_CRYPTO_rsa_key_create_from_hash} does not create an RSA key
that should be considered secure for traditional applications of RSA.

@cindex Message Queue API
@node Message Queue API
@subsection Message Queue API
@c %**end of header

@strong{Introduction}@*
Often, applications need to queue messages that are to be sent to other
GNUnet peers, clients or services. Since all of GNUnet's message-based
communication APIs, by design, do not allow messages to be queued, it is
common to implement custom message queues manually when they are needed.
However, writing very similar code in multiple places is tedious and
leads to code duplication.

MQ (for Message Queue) is an API that provides the functionality to
implement and use message queues. We intend to eventually replace all of
the custom message queue implementations in GNUnet with MQ.

@strong{Basic Concepts}@*
The two most important entities in MQ are queues and envelopes.

Every queue is backed by a specific implementation (e.g. for mesh,
stream, connection, server client, etc.) that will actually deliver the
queued messages. For convenience, some queues also allow specifying a
list of message handlers. The message queue will then also wait for
incoming messages and dispatch them appropriately.

An envelope holds the memory for a message, as well as metadata (Where
is the envelope queued? What should happen after it has been sent?). Any
envelope can only be queued in one message queue.

@strong{Creating Queues}@*
The following is a list of currently available message queues. Note that
to avoid layering issues, message queues for higher-level APIs are not
part of @code{libgnunetutil}; instead, the respective API itself
provides the queue implementation.

@table @asis

@item @code{GNUNET_MQ_queue_for_connection_client}
Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle.
Also supports receiving with message handlers.

@item @code{GNUNET_MQ_queue_for_server_client}
Transmits queued messages over a @code{GNUNET_SERVER_Client} handle.
Does not support incoming message handlers.

@item @code{GNUNET_MESH_mq_create}
Transmits queued messages over a @code{GNUNET_MESH_Tunnel} handle. Does
not support incoming message handlers.

@item @code{GNUNET_MQ_queue_for_callbacks}
This is the most general implementation. Instead of delivering and
receiving messages with one of GNUnet's communication APIs,
implementation callbacks are called. Refer to "Implementing Queues" for
a more detailed explanation.
@end table


@strong{Allocating Envelopes}@*
A GNUnet message (as defined by the GNUNET_MessageHeader) has three
parts: the size, the type, and the body.

MQ provides macros to conveniently allocate an envelope containing a
message, automatically setting the size and type fields of the message.

Consider the following simple message, with the body consisting of a
single number value.

@example
struct NumberMessage @{
  /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
  struct GNUNET_MessageHeader header;
  uint32_t number GNUNET_PACKED;
@};
@end example

An envelope containing an instance of the NumberMessage can be
constructed like this:

@example
struct GNUNET_MQ_Envelope *ev;
struct NumberMessage *msg;
ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
msg->number = htonl (42);
@end example

In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is
the newly allocated envelope. The first argument must be a pointer to
some @code{struct} containing a @code{struct GNUNET_MessageHeader
header} field, while the second argument is the desired message type, in
host byte order.

The @code{msg} pointer now points to an allocated message, where the
message type and the message size are already set. The message's size is
inferred from the type of the @code{msg} pointer: it will be set to
@code{sizeof (*msg)}, properly converted to network byte order.

If the message body's size is dynamic, the macro
@code{GNUNET_MQ_msg_extra} can be used to allocate an envelope whose
message has additional space allocated after the @code{msg} structure.

If no structure has been defined for the message,
@code{GNUNET_MQ_msg_header_extra} can be used to allocate additional
space after the message header. The first argument then must be a
pointer to a @code{GNUNET_MessageHeader}.

@strong{Envelope Properties}@*
A few functions in MQ allow setting additional properties on envelopes:

@table @asis

@item @code{GNUNET_MQ_notify_sent}
Allows specifying a function that will be called once the envelope's
message has been sent irrevocably. An envelope can be canceled precisely
up to the point where the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue
supports this flag; by default, envelopes are sent with corking.

@end table


@strong{Sending Envelopes}@*
Once an envelope has been constructed, it can be queued for sending with
@code{GNUNET_MQ_send}.

Note that in order to avoid memory leaks, an envelope must either be
sent (the queue will free it) or destroyed explicitly with
@code{GNUNET_MQ_discard}.

@strong{Canceling Envelopes}@*
An envelope queued with @code{GNUNET_MQ_send} can be canceled with
@code{GNUNET_MQ_cancel}. Note that after the notify sent callback has
been called, canceling a message results in undefined behavior. Thus it
is unsafe to cancel an envelope that does not have a notify sent
callback. When canceling an envelope, it is not necessary to call
@code{GNUNET_MQ_discard}, and the envelope can't be sent again.
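To make the size/type bookkeeping concrete, here is a stand-alone miniature in the spirit of @code{GNUNET_MQ_msg} (the @code{DEMO_} names and the header layout are simplified stand-ins, not the real MQ API): the macro infers the message size from the pointee type of its first argument, so the caller never writes @code{sizeof} by hand.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for struct GNUNET_MessageHeader. */
struct DemoHeader
{
  uint16_t size;
  uint16_t type;
};

struct DemoNumberMessage
{
  struct DemoHeader header;
  uint32_t number;
};

/* Allocate a zeroed message and fill in the header automatically:
   the size comes from sizeof (*(mvar)), as the real GNUNET_MQ_msg
   macro does for envelopes.  (No error handling, for brevity:
   a failed calloc would crash on dereference.) */
#define DEMO_MSG(mvar, mtype)                                    \
  ((mvar) = calloc (1, sizeof (*(mvar))),                        \
   (mvar)->header.size = htons ((uint16_t) sizeof (*(mvar))),    \
   (mvar)->header.type = htons (mtype),                          \
   (mvar))
```

After `DEMO_MSG (msg, 4)`, the header already carries the converted size and type, and only the body fields remain to be filled in; this mirrors how `GNUNET_MQ_msg` leaves just `msg->number = htonl (42);` for the caller.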

@strong{Implementing Queues}@*
@code{TODO}

@cindex Service API
@node Service API
@subsection Service API
@c %**end of header

Most GNUnet code lives in the form of services. Services are processes
that offer an API for other components of the system to build on. Those
other components can be command-line tools for users, graphical user
interfaces or other services. Services provide their API using an IPC
protocol; each service must listen on either a TCP port or a UNIX domain
socket, and the service implementation uses the server API for this.
This use of the server is exposed directly to the users of the service
API. Thus, when using the service API, one is usually also using large
parts of the server API. The service API provides various convenience
functions, such as parsing command-line arguments and the configuration
file, which are not found in the server API. The dual to the
service/server API is the client API, which can be used to access
services.

The most common way to start a service is to use the
@code{GNUNET_SERVICE_run} function from the program's main function.
@code{GNUNET_SERVICE_run} will then parse the command line and
configuration files and, based on the options found there, start the
server. It will then give back control to the main program, passing the
server and the configuration to the @code{GNUNET_SERVICE_Main} callback.
@code{GNUNET_SERVICE_run} will also take care of starting the scheduler
loop. If this is inappropriate (for example, because the scheduler loop
is already running), @code{GNUNET_SERVICE_start} and related functions
provide an alternative to @code{GNUNET_SERVICE_run}.

When starting a service, the service_name option is used to determine
which sections in the configuration file should be used to configure the
service. A typical value here is the name of the @file{src/}
sub-directory, for example @file{statistics}.
The same string would also be given to
@code{GNUNET_CLIENT_connect} to access the service.

Once a service has been initialized, the program should use the
@code{GNUNET_SERVICE_Main} callback to register message handlers
using @code{GNUNET_SERVER_add_handlers}.
The service will already have registered a handler for the
"TEST" message.

@fnindex GNUNET_SERVICE_Options
The option bitfield (@code{enum GNUNET_SERVICE_Options})
determines how a service should behave during shutdown.
There are three key strategies:

@table @asis

@item instant (@code{GNUNET_SERVICE_OPTION_NONE})
Upon receiving the shutdown signal from the scheduler, the service
immediately terminates the server, closing all existing connections with
clients.
@item manual (@code{GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN})
The service does nothing by itself during shutdown. The main program
will need to take the appropriate action by calling
@code{GNUNET_SERVER_destroy} or @code{GNUNET_SERVICE_stop} (depending on
how the service was initialized) to terminate the service. This method
is used by gnunet-service-arm and is rather uncommon.
@item soft (@code{GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN})
Upon receiving the shutdown signal from the scheduler, the service
immediately tells the server to stop listening for incoming clients.
Requests from normal existing clients are still processed and the
server/service terminates once all normal clients have disconnected.
Clients that are not expected to ever disconnect (such as clients that
monitor performance values) can be marked as 'monitor' clients using
@code{GNUNET_SERVER_client_mark_monitor}. Those clients will continue to
be processed until all 'normal' clients have disconnected. Then, the
server will terminate, closing the monitor connections. This mode is for
example used by 'statistics', allowing existing 'normal' clients to set
(possibly persistent) statistic values before terminating.

@end table

@c ***********************************************************************
@node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
@subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
@c %**end of header

A commonly used data structure in GNUnet is a (multi-)hash map. It is
most often used to map a peer identity to some data structure, but also
to map arbitrary keys to values (for example to track requests in the
distributed hash table or in file-sharing). As it is so commonly used,
the hash map is actually sometimes responsible for a large share of
GNUnet's overall memory consumption (for some processes, 30% is not
uncommon). The following text documents some API quirks (and their
implications for applications) that were recently introduced to minimize
the footprint of the hash map.


@c ***********************************************************************
@menu
* Analysis::
* Solution::
* Migration::
* Conclusion::
* Availability::
@end menu

@node Analysis
@subsubsection Analysis
@c %**end of header

The main reason for the "excessive" memory consumption by the hash map
is that GNUnet uses 512-bit cryptographic hash codes --- and the
(multi-)hash map also uses the same 512-bit
@code{struct GNUNET_HashCode}. As a result, storing just the keys
requires 64 bytes of memory for each key. As some applications like to
keep a large number of entries in the hash map (after all, that's what
maps are good for), 64 bytes per hash is significant: keeping a pointer
to the value and having a linked list for collisions consume between 8
and 16 bytes, and @code{malloc} may add about the same overhead per
allocation, putting us in the 16 to 32 byte per entry ballpark. Adding a
64-byte key then triples the overall memory requirement for the hash
map.

To make things "worse", most of the time storing the key in the hash map
is not required: it is typically already in memory elsewhere!
In most
cases, the values stored in the hash map are some application-specific
struct that @emph{also} contains the hash. Here is a simplified example:

@example
struct MyValue
@{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
@end example

This is a common pattern, as later the entries might need to be removed,
and at that time it is convenient to have the key immediately at hand:

@example
GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
@end example


Note that here we end up with two times 64 bytes for the key, plus maybe
64 bytes total for the rest of the @code{struct MyValue} and the map
entry in the hash map. The resulting redundant storage of the key
increases overall memory consumption per entry from the "optimal" 128
bytes to 192 bytes. This is not just an extreme example: overheads in
practice are actually sometimes close to those highlighted in this
example. This is especially true for maps with a significant number of
entries, as there we tend to really try to keep the entries small.

@c ***********************************************************************
@node Solution
@subsubsection Solution
@c %**end of header

The solution that has now been implemented is to @strong{optionally}
allow the hash map to not make a (deep) copy of the hash but instead
keep a pointer to the hash/key in the entry. This reduces the memory
consumption for the key from 64 bytes to 4 to 8 bytes. However, it can
only work if the key is actually stored in the entry (which is the case
most of the time) and if the entry does not modify the key (which in all
of the code I'm aware of has always been the case if the key is stored
in the entry).
Finally, when the client stores an entry in the
hash map, it @strong{must} provide a pointer to the key within the
entry, not just a pointer to a transient location of the key. If the
client code does not meet these requirements, the result is a dangling
pointer and undefined behavior of the (multi-)hash map API.

@c ***********************************************************************
@node Migration
@subsubsection Migration
@c %**end of header

To use the new feature, first check that the values contain the
respective key (and never modify it). Then, all calls to
@code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be
audited and most likely changed to pass a pointer into the value's
struct. For the initial example, the new code would look like this:

@example
struct MyValue
@{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
@end example


Note that @code{&key} was changed to @code{&val->key} in the argument to
the @code{put} call. This is critical, as often @code{key} is on the
stack or in some other transient data structure and thus having the hash
map keep a pointer to @code{key} would not work. Only the key inside of
@code{val} has the same lifetime as the entry in the map (this must of
course be checked as well). Naturally, @code{val->key} must be
initialized before the @code{put} call. Once all @code{put} calls have
been converted and double-checked, you can change the call to create the
hash map from

@example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
@end example

to

@example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
@end example

If everything was done correctly, you now use about 60 bytes less memory
per entry in @code{map}.
However, if any call to @code{put} now (or in
the future) does not ensure that the given key is valid until the entry
is removed from the map, undefined behavior is likely to be observed.

@c ***********************************************************************
@node Conclusion
@subsubsection Conclusion
@c %**end of header

The new optimization is often applicable and can result in a reduction
in memory consumption of up to 30% in practice. However, it makes the
code less robust, as additional invariants are imposed on the multi hash
map client. Thus applications should refrain from enabling the new mode
unless the resulting performance increase is deemed significant enough.
In particular, it should generally not be used in new code (wait at
least until benchmarks exist).

@c ***********************************************************************
@node Availability
@subsubsection Availability
@c %**end of header

The new multi hash map code was committed in SVN 24319 (will be in
GNUnet 0.9.4). Various subsystems (transport, core, dht, file-sharing)
were previously audited and modified to take advantage of the new
capability. In particular, memory consumption of the file-sharing
service is expected to drop by 20-30% due to this change.


@cindex CONTAINER_MDLL API
@node CONTAINER_MDLL API
@subsection CONTAINER_MDLL API
@c %**end of header

This text documents the GNUNET_CONTAINER_MDLL API. The
GNUNET_CONTAINER_MDLL API is similar to the GNUNET_CONTAINER_DLL API in
that it provides operations for the construction and manipulation of
doubly-linked lists. The key difference to the (simpler) DLL API is that
the MDLL version allows a single element (instance of a "struct") to be
in multiple linked lists at the same time.

Like the DLL API, the MDLL API stores (most of) the data structures for
the doubly-linked list with the respective elements; only the 'head' and
'tail' pointers are stored "elsewhere" --- and the application needs to
provide the locations of head and tail to each of the calls in the
MDLL API. The key difference for the MDLL API is that the "next" and
"previous" pointers in the struct can no longer be simply called "next"
and "prev" --- after all, the element may be in multiple doubly-linked
lists, so we cannot just have one "next" and one "prev" pointer!

The solution is to have multiple fields that must have a name of the
format "next_XX" and "prev_XX" where "XX" is the name of one of the
doubly-linked lists. Here is a simple example:

@example
struct MyMultiListElement
@{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};
@end example


Note that by convention, we use all-uppercase letters for the list names.
In addition, the program needs to have a location for the head and tail
pointers for both lists, for example:

@example
static struct MyMultiListElement *head_ALIST;
static struct MyMultiListElement *tail_ALIST;
static struct MyMultiListElement *head_BLIST;
static struct MyMultiListElement *tail_BLIST;
@end example


Using the MDLL-macros, we can now insert an element into the ALIST:

@example
GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
@end example


Passing "ALIST" as the first argument to the macro specifies which of the
next/prev fields in the 'struct MyMultiListElement' should be used. The
extra "ALIST" argument and the "_ALIST" in the names of the
next/prev-members are the only differences between the MDLL and DLL-API.

Like the DLL-API, the MDLL-API offers functions for inserting (at head,
at tail, after a given element) and removing elements from the list.
Iterating over the list should be done by directly accessing the
"next_XX" and/or "prev_XX" members.

@cindex Automatic Restart Manager
@cindex ARM
@node Automatic Restart Manager (ARM)
@section Automatic Restart Manager (ARM)
@c %**end of header

GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible
for system initialization and service babysitting. ARM starts and halts
services, detects configuration changes and restarts the services affected
by the changes as needed. It is also responsible for restarting services
in case of crashes, and it is planned to incorporate automatic debugging
facilities that would give developers insights into the reasons for
service crashes. The purpose of this document is to give GNUnet developers
an idea of how ARM works and how to interact with it.

@menu
* Basic functionality::
* Key configuration options::
* ARM - Availability::
* Reliability::
@end menu

@c ***********************************************************************
@node Basic functionality
@subsection Basic functionality
@c %**end of header

@itemize @bullet
@item The ARM source code can be found under "src/arm".@ Service processes
are managed by the functions in "gnunet-service-arm.c", which is
controlled by "gnunet-arm.c" (the main function in that file is ARM's
entry point).

@item The functions responsible for communicating with ARM, and for
starting and stopping services --- including the ARM service itself ---
are provided by the ARM API "arm_api.c".@ The function
GNUNET_ARM_connect() returns an ARM handle bound to the caller's context
(the configuration and scheduler in use). This handle can be used
afterwards by the caller to communicate with ARM. The functions
GNUNET_ARM_start_service() and GNUNET_ARM_stop_service() are used for
starting and stopping services respectively.

@item A typical example of using these basic ARM services can be found in
the file test_arm_api.c. The test case connects to ARM, starts it, then
uses it to start the service "resolver", stops the "resolver" and finally
stops ARM itself.
@end itemize

@c ***********************************************************************
@node Key configuration options
@subsection Key configuration options
@c %**end of header

Configurations for ARM and services should be available in a .conf file
(as an example, see test_arm_api_data.conf). When running ARM, the
configuration file to use should be passed to the command:

@example
$ gnunet-arm -s -c configuration_to_use.conf
@end example

If no configuration is passed, the default configuration file will be used
(see GNUNET_PREFIX/share/gnunet/defaults.conf, which is created from
contrib/defaults.conf).@ Each service has a section named after the
service name in square brackets, for example: "[arm]".
The following options configure how ARM configures or interacts with the
various services:

@table @asis

@item PORT Port number on which the service is listening for incoming TCP
connections. ARM will start the service should it notice a request at
this port.

@item HOSTNAME Specifies on which host the service is deployed. Note
that ARM can only start services that are running on the local system
(but will not check that the hostname matches the local machine name).
This option is used by the @code{gnunet_client_lib.h} implementation to
determine which system to connect to. The default is "localhost".

@item BINARY The name of the service binary file.

@item OPTIONS Options to be passed to the service.

@item PREFIX A command to prepend to the actual command, for example,
to run a service with "valgrind" or "gdb".

@item DEBUG Run in debug mode (very verbose).

@item AUTOSTART ARM will listen on the UNIX domain socket and/or TCP port
of the service and start the service on demand.

@item FORCESTART ARM will always start this service when the peer
is started.

@item ACCEPT_FROM IPv4 addresses the service accepts connections from.

@item ACCEPT_FROM6 IPv6 addresses the service accepts connections from.

@end table


Options that impact the operation of ARM overall are in the "[arm]"
section. ARM is a normal service and has (except for AUTOSTART) all of the
options that other services do. In addition, ARM has the
following options:

@table @asis

@item GLOBAL_PREFIX Command to be prepended to all services that are
going to run.

@item GLOBAL_POSTFIX Global option that will be appended to the command
lines of all the services that are going to run.

@end table

@c ***********************************************************************
@node ARM - Availability
@subsection ARM - Availability
@c %**end of header

As mentioned before, one of the features provided by ARM is starting
services on demand. Consider the example of one service, a "client", that
wants to connect to another service, a "server". The "client" will ask ARM
to run the "server". ARM starts the "server". The "server" starts
listening for incoming connections. The "client" establishes a
connection with the "server", and then they start to communicate.@
One problem with this scheme is that it is slow!@
The "client" service wants to communicate with the "server" service at
once and is not willing to wait for it to be started and listening for
incoming connections before serving its request.@ One solution for that
problem would be for ARM to start all services as default services.
That solution would solve the problem, yet it is not quite practical, as
some of the services started this way might never be used, or might only
be used after a relatively long time.@
The approach followed by ARM to solve this problem is as follows:

@itemize @bullet

@item For each service that has a PORT field in the configuration file and
that is not one of the default services (a service that accepts incoming
connections from clients), ARM creates listening sockets for all addresses
associated with that service.

@item The "client" will immediately establish a connection with
the "server".

@item ARM --- pretending to be the "server" --- listens on the
respective port and notices the incoming connection from the "client"
(but does not accept it).

@item Instead, once there is an incoming connection, ARM starts the
"server", passing on the listen sockets (now the service is started and
can do its work).

@item Other client services can now connect directly to the "server".

@end itemize

@c ***********************************************************************
@node Reliability
@subsection Reliability

One of the features provided by ARM is the automatic restart of crashed
services.@ ARM needs to know which of the running services died. The
function "gnunet-service-arm.c/maint_child_death()" is responsible for
that. The function is scheduled to run upon receiving a SIGCHLD signal.
The function then iterates over ARM's list of running services to
determine which service has died (crashed). ARM restarts every crashed
service.@
Now consider the case of a service with a serious problem that causes it
to crash each time it is started by ARM. If ARM kept blindly restarting
such a service, we would get the pattern:
start-crash-restart-crash-restart-crash and so forth!!
This is of course not practical.@
For that reason, ARM schedules the service to be restarted after waiting
for some delay that grows exponentially with each crash/restart of that
service.@ To clarify the idea, consider the following example:

@itemize @bullet

@item Service S crashed.

@item ARM receives the SIGCHLD and inspects its list of services to find
the dead one(s).

@item ARM finds S dead and schedules it for restarting after its "backoff"
time, which is initially set to 1ms. ARM then doubles the backoff time
corresponding to S (now backoff(S) = 2ms).

@item Because there is a severe problem with S, it crashes again.

@item Again ARM receives the SIGCHLD and detects that it is S again that
crashed. ARM schedules it for restarting, but after its new backoff time
(which became 2ms), and doubles its backoff time (now backoff(S) = 4ms).

@item And so on, until backoff(S) reaches a certain threshold
(@code{EXPONENTIAL_BACKOFF_THRESHOLD} is set to half an hour).
After reaching it, backoff(S) will remain half an hour, so ARM will not
keep itself busy for a long time trying to restart a
problematic service.
@end itemize

@cindex TRANSPORT Subsystem
@node TRANSPORT Subsystem
@section TRANSPORT Subsystem
@c %**end of header

This chapter documents how the GNUnet transport subsystem works. The
GNUnet transport subsystem consists of three main components: the
transport API (the interface used by the rest of the system to access the
transport service), the transport service itself (most of the interesting
functions, such as choosing transports, happen here) and the transport
plugins. A transport plugin is a concrete implementation for how two
GNUnet peers communicate; many plugins exist, for example for
communication via TCP, UDP, HTTP, HTTPS and others. Finally, the
transport subsystem uses supporting code, especially the NAT/UPnP
library, to help with tasks such as NAT traversal.

Key tasks of the transport service include:

@itemize @bullet

@item Create our HELLO message, notify clients and neighbours if our HELLO
changes (using the NAT library as necessary)

@item Validate HELLOs from other peers (send PING), allow other peers to
validate our HELLO's addresses (send PONG)

@item Upon request, establish connections to other peers (using address
selection from the ATS subsystem) and maintain them (again using PINGs and
PONGs) as long as desired

@item Accept incoming connections, give the ATS service the opportunity to
switch communication channels

@item Notify clients about peers that have connected to us or that have
been disconnected from us

@item If a (stateful) connection goes down unexpectedly (without an
explicit DISCONNECT), quickly attempt to recover (without notifying
clients), but do notify clients quickly if reconnecting fails

@item Send (payload) messages arriving from clients to other peers via
transport plugins and receive messages from other peers, forwarding
those to clients

@item Enforce inbound traffic limits (using flow control if it is
applicable); outbound traffic limits are enforced by CORE, not by us (!)

@item Enforce restrictions on P2P connections as specified by the
blacklist configuration and blacklisting clients
@end itemize

Note that the term "clients" in the list above really refers to the
GNUnet-CORE service, as CORE is typically the only client of the
transport service.

@menu
* Address validation protocol::
@end menu

@node Address validation protocol
@subsection Address validation protocol
@c %**end of header

This section documents how the GNUnet transport service validates
connections with other peers. It is a high-level description of the
protocol necessary to understand the details of the implementation.
It
should be noted that when we talk about PING and PONG messages in this
section, we refer to transport-level PING and PONG messages, which are
different from core-level PING and PONG messages (both in implementation
and function).

The goal of transport-level address validation is to minimize the chances
of a successful man-in-the-middle attack against GNUnet peers on the
transport level. Such an attack would not allow the adversary to decrypt
the P2P transmissions, but a successful attacker could at least measure
traffic volumes and latencies (raising the adversary's capabilities to
those of a global passive adversary in the worst case). The scenario we
are concerned about is an attacker, Mallory, giving a @code{HELLO} to
Alice that claims to be for Bob, but contains Mallory's IP address
instead of Bob's (for some transport).
Mallory would then forward the traffic to Bob (by initiating a
connection to Bob and claiming to be Alice). As a further
complication, the scheme has to work even if, say, Alice is behind a NAT
without traversal support and hence has no address of her own (and thus
Alice must always initiate the connection to Bob).

An additional constraint is that @code{HELLO} messages do not contain a
cryptographic signature, since other peers must be able to edit
(i.e. remove) addresses from the @code{HELLO} at any time (this was
not true in GNUnet 0.8.x). A basic @strong{assumption} is that each peer
knows the set of possible network addresses that it @strong{might}
be reachable under (so for example, the external IP address of the
NAT plus the LAN address(es) with the respective ports).

The solution is the following. If Alice wants to validate that a given
address for Bob is valid (i.e. is actually established @strong{directly}
with the intended target), she sends a PING message over that connection
to Bob.
Note that in this case, Alice initiated the connection, so only
Alice knows which address was used for sure (Alice may be behind NAT, so
whatever address Bob sees may not be an address Alice knows she has).
Bob checks that the address given in the @code{PING} is actually one
of Bob's addresses (i.e. does not belong to Mallory), and if it is,
sends back a @code{PONG} (with a signature that says that Bob
owns/uses the address from the @code{PING}).
Alice checks the signature and is happy if it is valid and the address
in the @code{PONG} is the address Alice used.
This is similar to the 0.8.x protocol where the @code{HELLO} contained a
signature from Bob for each address used by Bob.
Here, the purpose code for the signature is
@code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
remember Bob's address and consider the address valid for a while (12h in
the current implementation). Note that after this exchange, Alice only
considers Bob's address to be valid; the connection itself is not
considered 'established'. In particular, Alice may have many addresses
for Bob that Alice considers valid.

@c TODO: reference Footnotes so that I don't have to duplicate the
@c footnotes or add them to an index at the end. Is this possible at
@c all in Texinfo?
The @code{PONG} message is protected with a nonce/challenge against replay
attacks@footnote{@uref{http://en.wikipedia.org/wiki/Replay_attack, replay}}
and uses an expiration time for the signature (but those are almost
implementation details).

@cindex NAT library
@node NAT library
@section NAT library
@c %**end of header

The goal of the GNUnet NAT library is to provide a general-purpose API for
NAT traversal @strong{without} third-party support. So protocols that
involve contacting a third peer to help establish a connection between
two peers are outside of the scope of this API.
That does not mean that +GNUnet doesn't support involving a third peer (we can do this with the +distance-vector transport or using application-level protocols), it just +means that the NAT API is not concerned with this possibility. The API is +written so that it will work for IPv6-NAT in the future as well as +current IPv4-NAT. Furthermore, the NAT API is always used, even for peers +that are not behind NAT --- in that case, the mapping provided is simply +the identity. + +NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a +set of addresses that the peer has locally bound to (TCP or UDP), the NAT +library will return (via callback) a (possibly longer) list of addresses +the peer @strong{might} be reachable under. Internally, depending on the +configuration, the NAT library will try to punch a hole (using UPnP) or +just "know" that the NAT was manually punched and generate the respective +external IP address (the one that should be globally visible) based on +the given information. + +The NAT library also supports ICMP-based NAT traversal. Here, the other +peer can request connection-reversal by this peer (in this special case, +the peer is even allowed to configure a port number of zero). If the NAT +library detects a connection-reversal request, it returns the respective +target address to the client as well. It should be noted that +connection-reversal is currently only intended for TCP, so other plugins +@strong{must} pass @code{NULL} for the reversal callback. Naturally, the +NAT library also supports requesting connection reversal from a remote +peer (@code{GNUNET_NAT_run_client}). + +Once initialized, the NAT handle can be used to test if a given address is +possibly a valid address for this peer (@code{GNUNET_NAT_test_address}). +This is used for validating our addresses when generating PONGs. + +Finally, the NAT library contains an API to test if our NAT configuration +is correct. 
Using @code{GNUNET_NAT_test_start} @strong{before} binding to
the respective port, the NAT library can be used to test if the
configuration works. The test function acts as a local client,
initializes the NAT traversal and then contacts a @code{gnunet-nat-server}
(running by default on @code{gnunet.org}) and asks for a connection to be
established. This way, it is easy to test if the current NAT
configuration is valid.

@node Distance-Vector plugin
@section Distance-Vector plugin
@c %**end of header

The Distance Vector (DV) transport is a transport mechanism that allows
peers to act as relays for each other, thereby connecting peers that would
otherwise be unable to connect. This gives a larger connection set to
applications that may work better with more peers to choose from (for
example, File Sharing and/or the DHT).

The Distance Vector transport essentially has two functions. The first is
"gossiping" connection information about more distant peers to directly
connected peers. The second is taking messages intended for non-directly
connected peers and encapsulating them in a DV wrapper that contains the
required information for routing the message through forwarding peers. Via
gossiping, optimal routes through the known DV neighborhood are discovered
and utilized, and the message encapsulation provides some benefits in
addition to simply getting the message from the correct source to the
proper destination.

The gossiping function of DV provides an up-to-date routing table of
peers that are available up to some number of hops. We call this a
fisheye view of the network (like a fish, nearby objects are known while
more distant ones are unknown). Gossip messages are sent only to directly
connected peers, but they are sent about other known peers within the
"fisheye distance". Whenever two peers connect, they immediately gossip
to each other about their appropriate other neighbors.
They also gossip
about the newly connected peer to previously
connected neighbors. In order to keep the routing tables up to date,
disconnect notifications are propagated as gossip as well (because
disconnects may not be sent/received, timeouts are also used to remove
stagnant routing table entries).

Routing of messages via DV is straightforward. When the DV transport is
notified of a message destined for a non-direct neighbor, the appropriate
forwarding peer is selected, and the base message is encapsulated in a DV
message which contains information about the initial peer and the intended
recipient. At each forwarding hop, the initial peer is validated (the
forwarding peer ensures that it has the initial peer in its neighborhood,
otherwise the message is dropped). Next the base message is
re-encapsulated in a new DV message for the next hop in the forwarding
chain (or delivered to the current peer, if it has arrived at the
destination).

Assume a three peer network with peers Alice, Bob and Carol. Assume that

@example
Alice <-> Bob and Bob <-> Carol
@end example

@noindent
are direct (e.g. over TCP or UDP transports) connections, but that
Alice cannot directly connect to Carol.
This may be the case due to NAT or firewall restrictions, or perhaps
based on one of the peers' respective configurations. If the Distance
Vector transport is enabled on all three peers, it will automatically
discover (from the gossip protocol) that Alice and Carol can connect via
Bob and provide a "virtual" Alice <-> Carol connection. Routing between
Alice and Carol happens as follows: Alice creates a message destined for
Carol and notifies the DV transport about it. The DV transport at Alice
looks up Carol in the routing table and finds that the message must be
sent through Bob to reach Carol. The message is encapsulated with Alice
set as the initiator and Carol as the destination and sent to Bob.
Bob receives
the message, verifies that both Alice and Carol are known to Bob, and
re-wraps the message in a new DV message for Carol.
The DV transport at Carol receives this message, unwraps the original
message, and delivers it to Carol as though it came directly from Alice.

@cindex SMTP plugin
@node SMTP plugin
@section SMTP plugin
@c %**end of header

This section describes the SMTP transport plugin for GNUnet as it
exists in the 0.7.x and 0.8.x branch. SMTP support is currently not
available in GNUnet 0.9.x. This page also describes the transport layer
abstraction (as it existed in 0.7.x and 0.8.x) in more detail and gives
some benchmarking results. The performance results presented are quite
old and may be outdated at this point.

@itemize @bullet
@item Why use SMTP for a peer-to-peer transport?
@item How does it work?
@item How do I configure my peer?
@item How do I test if it works?
@item How fast is it?
@item Is there any additional documentation?
@end itemize


@menu
* Why use SMTP for a peer-to-peer transport?::
* How does it work?::
* How do I configure my peer?::
* How do I test if it works?::
* How fast is it?::
@end menu

@node Why use SMTP for a peer-to-peer transport?
@subsection Why use SMTP for a peer-to-peer transport?
@c %**end of header

There are many reasons why one would not want to use SMTP:

@itemize @bullet
@item SMTP uses more bandwidth than TCP, UDP or HTTP.
@item SMTP has a much higher latency.
@item SMTP requires significantly more computation (encoding and decoding
time) for the peers.
@item SMTP is significantly more complicated to configure.
@item SMTP may be abused by tricking GNUnet into sending mail to@
non-participating third parties.
@end itemize

So why would anybody want to use SMTP?
@itemize @bullet
@item SMTP can be used to contact peers behind NAT boxes (in virtual
private networks).

@item SMTP can be used to circumvent policies that limit or prohibit
peer-to-peer traffic by masking as "legitimate" traffic.
@item SMTP uses E-mail addresses which are independent of a specific IP,
which can be useful to address peers that use dynamic IP addresses.
@item SMTP can be used to initiate a connection (e.g. initial address
exchange) and peers can then negotiate the use of a more efficient
protocol (e.g. TCP) for the actual communication.
@end itemize

In summary, SMTP can for example be used to send a message to a peer
behind a NAT box that has a dynamic IP to tell the peer to establish a
TCP connection to a peer outside of the private network. Even an
extraordinary overhead for this first message would be irrelevant in this
type of situation.

@node How does it work?
@subsection How does it work?
@c %**end of header

When a GNUnet peer needs to send a message to another GNUnet peer that has
advertised (only) an SMTP transport address, GNUnet base64-encodes the
message and sends it in an E-mail to the advertised address. The
advertisement contains a filter which is placed in the E-mail header,
such that the receiving host can filter the tagged E-mails and forward
them to the GNUnet peer process. The filter can be specified individually
by each peer and be changed over time. This makes it impossible to censor
GNUnet E-mail messages by searching for a generic filter.

@node How do I configure my peer?
@subsection How do I configure my peer?
@c %**end of header

First, you need to configure @code{procmail} to filter your inbound E-mail
for GNUnet traffic. The GNUnet messages must be delivered into a pipe, for
example @code{/tmp/gnunet.smtp}. You also need to define a filter that is
used by @command{procmail} to detect GNUnet messages. You are free to
choose whichever filter you like, but you should make sure that it does
not occur in your other E-mail. In our example, we will use
@code{X-mailer: GNUnet}.
The @code{~/.procmailrc} configuration file then
looks like this:

@example
:0:
* ^X-mailer: GNUnet
/tmp/gnunet.smtp
# where do you want your other e-mail delivered to
# (default: /var/spool/mail/)
:0:
/var/spool/mail/
@end example

After adding this file, first make sure that your regular E-mail still
works (e.g. by sending an E-mail to yourself). Then edit the GNUnet
configuration. In the section @code{SMTP} you need to specify your E-mail
address under @code{EMAIL}, your mail server (for outgoing mail) under
@code{SERVER}, the filter (X-mailer: GNUnet in the example) under
@code{FILTER} and the name of the pipe under @code{PIPE}.@ The completed
section could then look like this:

@example
EMAIL = me@@mail.gnu.org
MTU = 65000
SERVER = mail.gnu.org:25
FILTER = "X-mailer: GNUnet"
PIPE = /tmp/gnunet.smtp
@end example

Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in
the @code{GNUNETD} section. GNUnet peers will use the E-mail address that
you specified to contact your peer until the advertisement times out.
Thus, if you are not sure if everything works properly or if you are not
planning to be online for a long time, you may want to configure this
timeout to be short, e.g. just one hour. For this, set
@code{HELLOEXPIRES} to @code{1} in the @code{GNUNETD} section.

This should be it, but you will probably want to test it first.

@node How do I test if it works?
@subsection How do I test if it works?
@c %**end of header

Any transport can be subjected to some rudimentary tests using the
@code{gnunet-transport-check} tool. The tool sends a message to the local
node via the transport and checks that a valid message is received. While
this test does not involve other peers and cannot check if firewalls or
other network obstacles prohibit proper operation, it is a great
testcase for the SMTP transport since it tests nearly all of
the functionality.

@code{gnunet-transport-check} should only be used without running
@code{gnunetd} at the same time. By default, @code{gnunet-transport-check}
tests all transports that are specified in the configuration file. But
you can specifically test SMTP by giving the option
@code{--transport=smtp}.

Note that this test always checks if a transport can receive and send.
While you can configure most transports to only receive or only send
messages, this test will only work if you have configured the transport
to send and receive messages.

@node How fast is it?
@subsection How fast is it?
@c %**end of header

We have measured the performance of the UDP, TCP and SMTP transport layers
directly and when used from an application using the GNUnet core.
Measuring just the transport layer gives a better view of the actual
overhead of the protocol, whereas evaluating the transport from the
application puts the overhead into perspective from a practical point of
view.

The loopback measurements of the SMTP transport were performed on three
different machines spanning a range of modern SMTP configurations. We
used a PIII-800 running RedHat 7.3 with the Purdue Computer Science
configuration, which includes filters for spam. We also used a Xeon 2 GHz
machine with a vanilla RedHat 8.0 sendmail configuration. Furthermore, we
used qmail on a PIII-1000 running Sorcerer GNU Linux (SGL). The numbers
for UDP and TCP are provided using the SGL configuration. The qmail
benchmark uses qmail's internal filtering whereas the sendmail benchmark
relies on procmail to filter and deliver the mail. We used the transport
layer to send a message of b bytes (excluding transport protocol headers)
directly to the local machine. This way, network latency and packet loss
on the wire have no impact on the timings. n messages were sent
sequentially over the transport layer, sending message i+1 after the i-th
message was received.
All messages were sent over the same connection, and the time to
establish the connection was not taken into account since this overhead is
minuscule in practice --- as long as a connection is used for a
significant number of messages.

@multitable @columnfractions .20 .15 .15 .15 .15 .15
@headitem Transport @tab UDP @tab TCP @tab SMTP (Purdue sendmail)
@tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
@item 11 bytes @tab 31 ms @tab 55 ms @tab 781 s @tab 77 s @tab 24 s
@item 407 bytes @tab 37 ms @tab 62 ms @tab 789 s @tab 78 s @tab 25 s
@item 1,221 bytes @tab 46 ms @tab 73 ms @tab 804 s @tab 78 s @tab 25 s
@end multitable

The benchmarks show that UDP and TCP are, as expected, both significantly
faster compared with any of the SMTP services. Among the SMTP
implementations, there can be significant differences depending on the
SMTP configuration. Filtering with an external tool like procmail that
needs to re-parse its configuration for each mail can be very expensive.
Applying spam filters can also significantly impact the performance of
the underlying SMTP implementation. The microbenchmark shows that SMTP
can be a viable solution for initiating peer-to-peer sessions: a couple of
seconds to connect to a peer are probably not even going to be noticed by
users. The next benchmark measures the possible throughput for a
transport. Throughput can be measured by sending multiple messages in
parallel and measuring packet loss. Note that not only UDP but also the
TCP transport can actually lose messages, since the TCP implementation
drops messages if the @code{write} to the socket would block. While the
SMTP protocol never drops messages itself, it is often so
slow that only a fraction of the messages can be sent and received in the
given time-bounds. For this benchmark we report the message loss after
allowing t time for sending m messages. If messages were not sent (or
received) after an overall timeout of t, they were considered lost.
The
benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0
with sendmail. The machines were connected with a direct 100 MBit Ethernet
connection. Figures udp1200, tcp1200 and smtp-MTUs show that the
throughput for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps
and 6 kbps for UDP, TCP and SMTP respectively. The high per-message
overhead of SMTP can be improved by increasing the MTU; for example, an
MTU of 12,000 octets improves the throughput to 13 kbps as figure
smtp-MTUs shows. Our research paper has some more details on the
benchmarking results.

@cindex Bluetooth plugin
@node Bluetooth plugin
@section Bluetooth plugin
@c %**end of header

This page describes the new Bluetooth transport plugin for GNUnet. The
plugin is still in the testing stage, so don't expect it to work
perfectly. If you have any questions or problems, just post them here or
ask on the IRC channel.

@itemize @bullet
@item What do I need to use the Bluetooth plugin transport?
@item How does it work?
@item What possible errors should I be aware of?
@item How do I configure my peer?
@item How can I test it?
@end itemize

@menu
* What do I need to use the Bluetooth plugin transport?::
* How does it work2?::
* What possible errors should I be aware of?::
* How do I configure my peer2?::
* How can I test it?::
* The implementation of the Bluetooth transport plugin::
@end menu

@node What do I need to use the Bluetooth plugin transport?
@subsection What do I need to use the Bluetooth plugin transport?
@c %**end of header

If you are a GNU/Linux user and you want to use the Bluetooth
transport plugin you should install the
@command{BlueZ development libraries} (if they aren't already
installed).
For instructions about how to install the libraries you should
check out the BlueZ site
(@uref{http://www.bluez.org/, http://www.bluez.org}).
If you don't know whether
you have the necessary libraries, don't worry: just run the GNUnet
configure script and you will see a notification at the end
which will warn you if you don't have the necessary libraries.

If you are a Windows user you should have installed
@emph{MinGW}/@emph{MSys2} with the latest updates (especially the
@emph{ws2bth} header). If this is your first build of GNUnet on Windows
you should check out the SBuild repository. It will semi-automatically
assemble a @emph{MinGW}/@emph{MSys2} installation with a lot of extra
packages which are needed for the GNUnet build. So this will ease your
work! Finally, you just have to be sure that you have the correct drivers
for your Bluetooth device installed and that your device is on and in
discoverable mode. The Windows Bluetooth stack supports only the RFCOMM
protocol, so we cannot turn on your device programmatically!

@c FIXME: Change to unique title
@node How does it work2?
@subsection How does it work2?
@c %**end of header

The Bluetooth transport plugin uses virtually the same code as the WLAN
plugin and only the helper binary is different. The helper takes a single
argument, which represents the interface name and is specified in the
configuration file.
Here are the basic steps that are followed by the
helper binary used on GNU/Linux:

@itemize @bullet
@item it verifies if the name corresponds to a Bluetooth interface name
@item it verifies if the interface is up (if it is not, it tries to bring
it up)
@item it tries to enable the page and inquiry scan in order to make the
device discoverable and to accept incoming connection requests
@emph{The above operations require root access so you should start the
transport plugin with root privileges.}
@item it finds an available port number, registers an SDP service (which
will be used to find out which port number the server is listening on)
and switches the socket into listening mode
@item it sends a HELLO message with its address
@item finally it forwards traffic from the reading sockets to STDOUT
and from STDIN to the writing socket
@end itemize

Once in a while the device will make an inquiry scan to discover the
nearby devices and it will randomly send them HELLO messages for peer
discovery.

@node What possible errors should I be aware of?
@subsection What possible errors should I be aware of?
@c %**end of header

@emph{This section is dedicated to GNU/Linux users.}

Well, there are many ways in which things could go wrong, but I will try
to present some tools that you could use to debug and some scenarios.

@itemize @bullet

@item @code{bluetoothd -n -d} : use this command to enable logging in the
foreground and to print the logging messages

@item @code{hciconfig}: can be used to configure the Bluetooth devices.
If you run it without any arguments it will print information about the
state of the interfaces. So if you receive an error that the device
couldn't be brought up you should try to bring it up manually and see if
it works (use @code{hciconfig -a hciX up}).
If you can't, and the
Bluetooth address has the form 00:00:00:00:00:00, it means that there is
something wrong with the D-Bus daemon or with the Bluetooth daemon. Use
the @code{bluetoothd} tool to see the logs.

@item @code{sdptool} can be used to control and interrogate SDP servers.
If you encounter problems regarding the SDP server (like the SDP server is
down) you should check whether the D-Bus daemon is running correctly and
whether the Bluetooth daemon started correctly (use the @code{bluetoothd}
tool).
Also, sometimes the SDP service could work but somehow the device couldn't
register its service. Use @code{sdptool browse [dev-address]} to see if
the service is registered. There should be a service with the name of the
interface and GNUnet as provider.

@item @code{hcitool} : another useful tool which can be used to configure
the device and to send some particular commands to it.

@item @code{hcidump} : could be used for low level debugging
@end itemize

@c FIXME: A more unique name
@node How do I configure my peer2?
@subsection How do I configure my peer2?
@c %**end of header

On GNU/Linux, you just have to be sure that the interface name
corresponds to the one that you want to use.
Use the @code{hciconfig} tool to check that.
By default it is set to hci0 but you can change it.

A basic configuration looks like this:

@example
[transport-bluetooth]
# Name of the interface (typically hciX)
INTERFACE = hci0
# Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
@end example

In order to use the Bluetooth transport plugin when the transport service
is started, you must add the plugin name to the default transport service
plugins list. For example:

@example
[transport]
...
PLUGINS = dns bluetooth
...
@end example

If you want to use only the Bluetooth plugin, set
@emph{PLUGINS = bluetooth}.

On Windows, you cannot specify which device to use.
The only thing that
you should do is to add @emph{bluetooth} to the plugins list of the
transport service.

@node How can I test it?
@subsection How can I test it?
@c %**end of header

If you have two Bluetooth devices on the same machine and you are using
GNU/Linux you must:

@itemize @bullet

@item create two different configuration files (one which will use the
first interface (@emph{hci0}) and the other which will use the second
interface (@emph{hci1})). Let's name them @emph{peer1.conf} and
@emph{peer2.conf}.

@item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the
peers' private keys. The @strong{X} must be replaced with 1 or 2.

@item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to
start the transport service. (Make sure that you have "bluetooth" on the
transport plugins list if the Bluetooth transport service doesn't start.)

@item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's
ID. If you already know your peer ID (you saved it from the first
command), this can be skipped.

@item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start
sending data for benchmarking to the other peer.

@end itemize


This scenario will try to connect the second peer to the first one and
then start sending data for benchmarking.

On Windows you cannot test the plugin functionality using two Bluetooth
devices on the same machine because after you install the drivers there
will be some conflicts between the Bluetooth stacks. (At least that is
what happened on my machine: I wasn't able to use the Bluesoleil stack and
the WIDCOMM one at the same time.)

If you have two different machines and your configuration files are good
you can use the same scenario presented at the beginning of this section.

Another way to test the plugin functionality is to create your own
application which will use the GNUnet framework with the Bluetooth
transport service.

@node The implementation of the Bluetooth transport plugin
@subsection The implementation of the Bluetooth transport plugin
@c %**end of header

This page describes the implementation of the Bluetooth transport plugin.

First I want to remind you that the Bluetooth transport plugin uses
virtually the same code as the WLAN plugin and only the helper binary is
different. Also, the scope of the helper binary from the Bluetooth
transport plugin is the same as the one used for the WLAN transport
plugin: it accesses the interface and then forwards traffic in both
directions between the Bluetooth interface and stdin/stdout of the
process involved.

The Bluetooth transport plugin can be used on both GNU/Linux and Windows
platforms.

@itemize @bullet
@item Linux functionality
@item Windows functionality
@item Pending Features
@end itemize



@menu
* Linux functionality::
* THE INITIALIZATION::
* THE LOOP::
* Details about the broadcast implementation::
* Windows functionality::
* Pending features::
@end menu

@node Linux functionality
@subsubsection Linux functionality
@c %**end of header

In order to implement the plugin functionality on GNU/Linux I
used the BlueZ stack.
For the communication with the other devices I used the RFCOMM
protocol. I also used the HCI protocol to gain some control over the
device. The helper binary takes a single argument (the name of the
Bluetooth interface) and is separated into two stages:

@c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
@c %** starting a new section?
@node THE INITIALIZATION
@subsubsection THE INITIALIZATION

@itemize @bullet
@item first, it checks if we have root privileges
(@emph{Remember that we need root privileges in order to be able
to bring the interface up if it is down or to change its state.})

@item second, it verifies if the interface with the given name exists.
@strong{If the interface with that name exists and it is a Bluetooth
interface:}

@item it creates an RFCOMM socket which will be used for listening and
calls the @emph{open_device} method

In the @emph{open_device} method, it:
@itemize @bullet
@item creates an HCI socket used to send control events to the device
@item searches for the device ID using the interface name
@item saves the device MAC address
@item checks if the interface is down and tries to bring it up
@item checks if the interface is in discoverable mode and tries to make it
discoverable
@item closes the HCI socket and binds the RFCOMM one
@item switches the RFCOMM socket into listening mode
@item registers the SDP service (the service will be used by the other
devices to get the port on which this device is listening)
@end itemize

@item drops the root privileges

@strong{If the interface is not a Bluetooth interface, the helper exits
with a suitable error.}
@end itemize

@c %** Same as for @node entry above
@node THE LOOP
@subsubsection THE LOOP

The helper binary uses a list where it saves all the connected neighbour
devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and
@emph{write_std}). The first message which is sent is a control message
with the device's MAC address in order to announce the peer's presence to
the neighbours. Here is a short description of what happens in the main
loop:

@itemize @bullet
@item Whenever it receives something on STDIN, it processes
the data and saves the message in the first buffer (@emph{write_pout}).
When it has something in the buffer, it gets the destination address from
the buffer, searches for the destination address in the list (if there is
no connection with that device, it creates a new one and saves it to the
list) and sends the message.
@item Whenever it receives something on the listening socket, it
accepts the connection and saves the socket in a list of reading
sockets.
@item Whenever it receives something from a reading
socket, it parses the message, verifies the CRC and saves it in the
@emph{write_std} buffer in order to be sent later to STDOUT.
@end itemize

So in the main loop we use the select function to wait until one of the
file descriptors saved in one of the two file descriptor sets used is
ready for use. The first set (@emph{rfds}) represents the reading set and
it can contain the list of reading sockets, the STDIN file
descriptor or the listening socket. The second set (@emph{wfds}) is the
writing set and it can contain the sending socket or the STDOUT file
descriptor. After the select function returns, we check which file
descriptor is ready for use and handle the corresponding event.
@emph{For example:} if it is the listening socket, then we
accept a new connection and save the socket in the reading list; if it is
the STDOUT file descriptor, then we write to STDOUT the message from the
@emph{write_std} buffer.

To find out on which port a device is listening, we connect to the local
SDP server and search the registered services for that device.

@emph{You should be aware of the fact that if the device fails to connect
to another one when trying to send a message, it will attempt one more
time. If it fails again, then it skips the message.}
@emph{Also, you should know that the Bluetooth transport plugin has
support for @strong{broadcast messages}.}

@node Details about the broadcast implementation
@subsubsection Details about the broadcast implementation
@c %**end of header

First I want to point out that the broadcast functionality for the CONTROL
messages is not implemented in a conventional way.
Since the inquiry scan
time is long and it would take some time to send a message to all the
discoverable devices, I decided to tackle the problem in a different way.
Here is how I did it:

@itemize @bullet
@item The first time I have to broadcast a message, I make an
inquiry scan and save all the devices' addresses to a vector.
@item After the inquiry scan ends, I take the first address from the list
and try to connect to it. If it fails, I try to connect to the next one.
If it succeeds, I save the socket to a list and send the message to the
device.
@item When I have to broadcast another message, I first search the list
for a new device which I'm not connected to. If there is no new device on
the list, I go to the beginning of the list and send the message to the
old devices. After 5 cycles I make a new inquiry scan to check whether
there are new discoverable devices and save them to the list. If there
are no new discoverable devices, I reset the cycling counter and go again
through the old list and send messages to the devices saved in it.
@end itemize

@strong{Therefore}:

@itemize @bullet
@item every time I have a broadcast message, I search the list
for a new device and send the message to it
@item if I have reached the end of the list 5 times and I'm connected to
all the devices from the list, I make a new inquiry scan.
@emph{The number of cycles through the list after an inquiry scan can be
increased by redefining the @code{MAX_LOOPS} variable}
@item when there are no new devices, I send messages to the old ones.
@end itemize

This way, the broadcast control messages will reach the devices, but with
a delay.

@emph{NOTICE:} When I have to send a message to a certain device, I first
check the broadcast list to see if we are connected to that device. If
not, we try to connect to it, and in case of success we save the address
and the socket in the list.
If we are already connected to that device, we
simply use the socket.

@node Windows functionality
@subsubsection Windows functionality
@c %**end of header

For Windows I decided to use the Microsoft Bluetooth stack, which has the
advantage of coming standard with Windows XP SP2. The main disadvantage is
that it only supports the RFCOMM protocol, so we will not be able to have
low-level control over the Bluetooth device. Therefore it is the user's
responsibility to check if the device is up and in discoverable mode.
Also, there are no tools which could be used for debugging in order to
read the data coming from and going to a Bluetooth device, which obviously
hindered my work. Another thing that slowed down the implementation of the
plugin (besides the fact that I wasn't very familiar with the Win32 API)
was that there were some Bluetooth-related bugs in MinGW. They are solved
now, but you should keep in mind that you should have the latest updates
(especially the @emph{ws2bth} header).

Besides the fact that it uses the Windows Sockets, the Windows
implementation follows the same principles as the GNU/Linux one:

@itemize @bullet
@item It has an initialization part where it initializes the
Windows Sockets, creates an RFCOMM socket which will be bound and switched
to listening mode, and registers an SDP service. In the Microsoft
Bluetooth API there are two ways to work with the SDP:
@itemize @bullet
@item an easy way which works with very simple service records
@item a hard way which is useful when you need to update or to delete the
record
@end itemize
@end itemize

Since I only needed the SDP service to find out on which port the device
is listening, and that did not change, I decided to use the easy way.
In order to register the service I used the @emph{WSASetService} function
and I generated the @emph{Universally Unique Identifier} with the
@emph{guidgen.exe} Windows tool.

In the loop section the only difference from the GNU/Linux implementation
is that I used the @code{GNUNET_NETWORK} library for
functions like @emph{accept}, @emph{bind}, @emph{connect} or
@emph{select}. I decided to use the
@code{GNUNET_NETWORK} library because I also needed to interact
with the STDIN and STDOUT handles, and on Windows
the select function is only defined for sockets
and will not work for arbitrary file handles.

Another difference between the GNU/Linux and Windows implementations is
that on GNU/Linux the Bluetooth address is represented in 48 bits,
while on Windows it is represented in 64 bits.
Therefore I had to make some changes to the @emph{plugin_transport_wlan}
header.

Also, currently on Windows the Bluetooth plugin doesn't have support for
broadcast messages. When it receives a broadcast message, it will skip it.

@node Pending features
@subsubsection Pending features
@c %**end of header

@itemize @bullet
@item Implement the broadcast functionality on Windows @emph{(currently
working on it)}
@item Implement a testcase for the helper: @emph{The testcase
consists of a program which emulates the plugin and uses the helper. It
will simulate connections, disconnections and data transfers.}
@end itemize

If you have a new idea about a feature of the plugin or suggestions about
how I could improve the implementation, you are welcome to comment or to
contact me.

@node WLAN plugin
@section WLAN plugin
@c %**end of header

This section documents how the WLAN transport plugin works. Parts which
are not implemented yet or could be better implemented are described at
the end.

@cindex ATS Subsystem
@node ATS Subsystem
@section ATS Subsystem
@c %**end of header

ATS stands for "automatic transport selection", and the function of ATS in
GNUnet is to decide on which address (and thus transport plugin) should
be used for two peers to communicate, and what bandwidth limits should be
imposed on such an individual connection. To help ATS make an informed
decision, higher-level services inform the ATS service about their
requirements and the quality of the service rendered. The ATS service
also interacts with the transport service to be apprised of working
addresses and to communicate its resource allocation decisions. Finally,
the ATS service's operation can be observed using a monitoring API.

The main logic of the ATS service only collects the available addresses,
their performance characteristics and the applications' requirements, but
does not make the actual allocation decision. This last critical step is
left to an ATS plugin, as we have implemented (currently) three different
allocation strategies which differ significantly in their performance and
maturity, and it is still unclear if any particular plugin is generally
superior.

@cindex CORE Subsystem
@node CORE Subsystem
@section CORE Subsystem
@c %**end of header

The CORE subsystem in GNUnet is responsible for securing link-layer
communications between nodes in the GNUnet overlay network.
CORE builds
on the TRANSPORT subsystem which provides for the actual, insecure,
unreliable link-layer communication (for example, via UDP or WLAN), and
then adds fundamental security to the connections:

@itemize @bullet
@item confidentiality with so-called perfect forward secrecy; we use
ECDHE@footnote{@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie---Hellman}}
powered by Curve25519
@footnote{@uref{http://cr.yp.to/ecdh.html, Curve25519}} for the key
exchange and then use symmetric encryption, encrypting with both AES-256
@footnote{@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256}} and
Twofish @footnote{@uref{http://en.wikipedia.org/wiki/Twofish, Twofish}}
@item @uref{http://en.wikipedia.org/wiki/Authentication, authentication}
is achieved by signing the ephemeral keys using Ed25519
@footnote{@uref{http://ed25519.cr.yp.to/, Ed25519}}, a deterministic
variant of ECDSA
@footnote{@uref{http://en.wikipedia.org/wiki/ECDSA, ECDSA}}
@item integrity protection (using SHA-512
@footnote{@uref{http://en.wikipedia.org/wiki/SHA-2, SHA-512}} to do
encrypt-then-MAC
@footnote{@uref{http://en.wikipedia.org/wiki/Authenticated_encryption, encrypt-then-MAC}})
@item replay
@footnote{@uref{http://en.wikipedia.org/wiki/Replay_attack, replay}}
protection (using nonces, timestamps, challenge-response,
message counters and ephemeral keys)
@item liveness (keep-alive messages, timeout)
@end itemize

@menu
* Limitations::
* When is a peer "connected"?::
* libgnunetcore::
* The CORE Client-Service Protocol::
* The CORE Peer-to-Peer Protocol::
@end menu

@cindex core subsystem limitations
@node Limitations
@subsection Limitations
@c %**end of header

CORE does not perform
@uref{http://en.wikipedia.org/wiki/Routing, routing}; using CORE it is
only possible to communicate with peers that happen to already be
"directly" connected with each other.
CORE also does not have an +API to allow applications to establish such "direct" connections --- for +this, applications can ask TRANSPORT, but TRANSPORT might not be able to +establish a "direct" connection. The TOPOLOGY subsystem is responsible for +trying to keep a few "direct" connections open at all times. Applications +that need to talk to particular peers should use the CADET subsystem, as +it can establish arbitrary "indirect" connections. + +Because CORE does not perform routing, CORE must only be used directly by +applications that either perform their own routing logic (such as +anonymous file-sharing) or that do not require routing, for example +because they are based on flooding the network. CORE communication is +unreliable and delivery is possibly out-of-order. Applications that +require reliable communication should use the CADET service. Each +application can only queue one message per target peer with the CORE +service at any time; messages cannot be larger than approximately +63 kilobytes. If messages are small, CORE may group multiple messages +(possibly from different applications) prior to encryption. If permitted +by the application (using the @uref{http://baus.net/on-tcp_cork/, cork} +option), CORE may delay transmissions to facilitate grouping of multiple +small messages. If cork is not enabled, CORE will transmit the message as +soon as TRANSPORT allows it (TRANSPORT is responsible for limiting +bandwidth and congestion control). CORE does not allow flow control; +applications are expected to process messages at line-speed. If flow +control is needed, applications should use the CADET service. + +@cindex when is a peer connected +@node When is a peer "connected"? +@subsection When is a peer "connected"? +@c %**end of header + +In addition to the security features mentioned above, CORE also provides +one additional key feature to applications using it, and that is a +limited form of protocol-compatibility checking. 
CORE distinguishes
between TRANSPORT-level connections (which enable communication with other
peers) and application-level connections. Applications using the CORE API
will (typically) learn about application-level connections from CORE, and
not about TRANSPORT-level connections. When a typical application uses
CORE, it will specify a set of message types
(from @code{gnunet_protocols.h}) that it understands. CORE will then
notify the application about connections it has with other peers if and
only if those applications registered an intersecting set of message
types with their CORE service. Thus, it is quite possible that CORE only
exposes a subset of the established direct connections to a particular
application --- and different applications running above CORE might see
different sets of connections at the same time.

A special case is that of applications which do not register a handler
for any message type.
CORE assumes that these applications merely want to monitor connections
(or "all" messages via other callbacks) and will notify those applications
about all connections. This is used, for example, by the
@code{gnunet-core} command-line tool to display the active connections.
Note that it is also possible that the TRANSPORT service has more active
connections than the CORE service, as the CORE service first has to
perform a key exchange with connecting peers before exchanging information
about supported message types and notifying applications about the new
connection.

@cindex libgnunetcore
@node libgnunetcore
@subsection libgnunetcore
@c %**end of header

The CORE API (defined in @file{gnunet_core_service.h}) is the basic
messaging API used by P2P applications built using GNUnet. It provides
applications the ability to send and receive encrypted messages to the
peer's "directly" connected neighbours.

As CORE connections are generally "direct" connections, applications must
not assume that they can connect to arbitrary peers this way, as "direct"
connections may not always be possible. Applications using CORE are
notified about which peers are connected. Creating new "direct"
connections must be done using the TRANSPORT API.

The CORE API provides unreliable, out-of-order delivery. While the
implementation tries to ensure timely, in-order delivery, both message
losses and reordering are not detected and must be tolerated by the
application. Most importantly, CORE will NOT perform retransmission if
messages could not be delivered.

Note that CORE allows applications to queue one message per connected
peer. The rate at which each connection operates is influenced by the
preferences expressed by the local application as well as restrictions
imposed by the other peer. Local applications can express their
preferences for particular connections using the "performance" API of the
ATS service.

Applications that require more sophisticated transmission capabilities,
such as TCP-like behavior, or that intend to send messages to arbitrary
remote peers, should use the CADET API.

The typical use of the CORE API is to connect to the CORE service using
@code{GNUNET_CORE_connect}, process events from the CORE service (such as
peers connecting, peers disconnecting and incoming messages) and send
messages to connected peers using
@code{GNUNET_CORE_notify_transmit_ready}. Note that applications must
cancel pending transmission requests if they receive a disconnect event
for a peer that had a transmission pending; furthermore, queueing more
than one transmission request per peer per application using the
service is not permitted.

The CORE API also allows applications to monitor all communications of the
peer prior to encryption (for outgoing messages) or after decryption (for
incoming messages).
This can be useful for debugging, diagnostics or to
establish the presence of cover traffic (for anonymity). As monitoring
applications are often not interested in the payload, the monitoring
callbacks can be configured to only provide the message headers (including
the message type and size) instead of copying the full data stream to the
monitoring client.

The init callback of the @code{GNUNET_CORE_connect} function is called
with the hash of the public key of the peer. This public key is used to
identify the peer globally in the GNUnet network. Applications are
encouraged to check that the provided hash matches the hash that they are
using (as theoretically the application may be using a different
configuration file with a different private key, which would result in
hard-to-find bugs).

As with most service APIs, the CORE API isolates applications from crashes
of the CORE service. If the CORE service crashes, the application will see
disconnect events for all existing connections. Once the connections are
re-established, the applications will receive matching connect events.

@cindex core client-service protocol
@node The CORE Client-Service Protocol
@subsection The CORE Client-Service Protocol
@c %**end of header

This section describes the protocol between an application using the CORE
service (the client) and the CORE service process itself.


@menu
* Setup2::
* Notifications::
* Sending::
@end menu

@node Setup2
@subsubsection Setup2
@c %**end of header

When a client connects to the CORE service, it first sends an
@code{InitMessage} which specifies options for the connection and a set of
message type values which are supported by the application. The options
bitmask specifies which events the client would like to be notified about.

The options include:

@table @asis
@item GNUNET_CORE_OPTION_NOTHING
No notifications
@item GNUNET_CORE_OPTION_STATUS_CHANGE
Peers connecting and disconnecting
@item GNUNET_CORE_OPTION_FULL_INBOUND
All inbound messages (after decryption) with full payload
@item GNUNET_CORE_OPTION_HDR_INBOUND
Just the @code{MessageHeader} of all inbound messages
@item GNUNET_CORE_OPTION_FULL_OUTBOUND
All outbound messages (prior to encryption) with full payload
@item GNUNET_CORE_OPTION_HDR_OUTBOUND
Just the @code{MessageHeader} of all outbound messages
@end table

Typical applications will only monitor for connection status changes.

The CORE service responds to the @code{InitMessage} with an
@code{InitReplyMessage} which contains the peer's identity. Afterwards,
both CORE and the client can send messages.

@node Notifications
@subsubsection Notifications
@c %**end of header

The CORE service will send @code{ConnectNotifyMessage}s and
@code{DisconnectNotifyMessage}s whenever peers connect or disconnect from
the CORE (assuming their type maps overlap with the message types
registered by the client). When the CORE service receives a message that
matches the set of message types specified during the @code{InitMessage}
(or if monitoring of inbound messages is enabled in the options), it sends
a @code{NotifyTrafficMessage} with the peer identity of the sender and the
decrypted payload. The same message format (except with
@code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND} for the message type) is
used to notify clients monitoring outbound messages; here, the peer
identity given is that of the receiver.

@node Sending
@subsubsection Sending
@c %**end of header

When a client wants to transmit a message, it first requests a
transmission slot by sending a @code{SendMessageRequest} which specifies
the priority, deadline and size of the message. Note that these values
may be ignored by CORE.
When CORE is ready for the message, it answers
with a @code{SendMessageReady} response. The client can then transmit the
payload with a @code{SendMessage} message. Note that the actual message
size in the @code{SendMessage} is allowed to be smaller than the size in
the original request. A client may at any time send a fresh
@code{SendMessageRequest}, which then supersedes the previous
@code{SendMessageRequest}, which is then no longer valid. The client can
tell which @code{SendMessageRequest} the CORE service's
@code{SendMessageReady} message is for as all of these messages contain a
"unique" request ID (based on a counter incremented by the client
for each request).

@cindex CORE Peer-to-Peer Protocol
@node The CORE Peer-to-Peer Protocol
@subsection The CORE Peer-to-Peer Protocol
@c %**end of header


@menu
* Creating the EphemeralKeyMessage::
* Establishing a connection::
* Encryption and Decryption::
* Type maps::
@end menu

@cindex EphemeralKeyMessage creation
@node Creating the EphemeralKeyMessage
@subsubsection Creating the EphemeralKeyMessage
@c %**end of header

When the CORE service starts, each peer creates a fresh ephemeral (ECC)
public-private key pair and signs the corresponding
@code{EphemeralKeyMessage} with its long-term key (which we usually call
the peer's identity); the hash of the public long-term key is what results
in a @code{struct GNUNET_PeerIdentity} in all GNUnet APIs. The ephemeral
key is ONLY used for an ECDHE@footnote{@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie--Hellman}}
exchange by the CORE service to establish symmetric session keys. A peer
will use the same @code{EphemeralKeyMessage} for all peers for
@code{REKEY_FREQUENCY}, which is usually 12 hours.
After that time, it
will create a fresh ephemeral key (forgetting the old one) and broadcast
the new @code{EphemeralKeyMessage} to all connected peers, resulting in
fresh symmetric session keys. Note that peers independently decide on
when to discard ephemeral keys; it is not a protocol violation to discard
keys more often. Ephemeral keys are also never stored to disk; restarting
a peer will thus always create a fresh ephemeral key. The use of ephemeral
keys is what provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}.

Just before transmission, the @code{EphemeralKeyMessage} is patched to
reflect the current sender_status, which specifies the current state of
the connection from the point of view of the sender. The possible values
are:

@itemize @bullet
@item @code{KX_STATE_DOWN} Initial value, never used on the network
@item @code{KX_STATE_KEY_SENT} We sent our ephemeral key, do not know the
key of the other peer
@item @code{KX_STATE_KEY_RECEIVED} This peer has received a valid
ephemeral key of the other peer, but we are waiting for the other peer to
confirm its authenticity (ability to decode) via challenge-response.
@item @code{KX_STATE_UP} The connection is fully up from the point of
view of the sender (now performing keep-alives)
@item @code{KX_STATE_REKEY_SENT} The sender has initiated a rekeying
operation; the other peer has so far failed to confirm a working
connection using the new ephemeral key
@end itemize

@node Establishing a connection
@subsubsection Establishing a connection
@c %**end of header

Peers begin their interaction by sending an @code{EphemeralKeyMessage} to
the other peer once the TRANSPORT service notifies the CORE service about
the connection.
If a peer receives an @code{EphemeralKeyMessage} with a status indicating
that the sender does not have the receiver's ephemeral key, the receiver
responds with its own @code{EphemeralKeyMessage}.
Additionally, if the receiver has not yet confirmed the authenticity of
the sender, it also sends an (encrypted) @code{PingMessage} with a
challenge (and the identity of the target) to the other peer. Peers
receiving a @code{PingMessage} respond with an (encrypted)
@code{PongMessage} which includes the challenge. Peers receiving a
@code{PongMessage} check the challenge, and if it matches set the
connection to @code{KX_STATE_UP}.

@node Encryption and Decryption
@subsubsection Encryption and Decryption
@c %**end of header

All functions related to the key exchange and encryption/decryption of
messages can be found in @file{gnunet-service-core_kx.c} (except for the
cryptographic primitives, which are in @file{util/crypto*.c}).
Given the key material from ECDHE, a key derivation function
@footnote{@uref{https://en.wikipedia.org/wiki/Key_derivation_function, Key derivation function}}
is used to derive two pairs of encryption and decryption keys for AES-256
and Twofish, as well as initialization vectors and authentication keys
(for HMAC@footnote{@uref{https://en.wikipedia.org/wiki/HMAC, HMAC}}).
The HMAC is computed over the encrypted payload.
Encrypted messages include an iv_seed and the HMAC in the header.

Each encrypted message in the CORE service includes a sequence number and
a timestamp in the encrypted payload. The CORE service remembers the
largest observed sequence number and a bit-mask which represents which of
the previous 32 sequence numbers were already used.
Messages with sequence numbers lower than the largest observed sequence
number minus 32 are discarded. Messages with a timestamp that is more
than @code{REKEY_TOLERANCE} (5 minutes) off are also discarded. This of
course means that system clocks need to be reasonably synchronized for
peers to be able to communicate. Additionally, as the ephemeral key
changes every 12 hours, a peer would not even be able to decrypt messages
older than 12 hours.
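The sliding-window replay check described above can be illustrated with a
small, self-contained model (this is a sketch of the technique only, not
the actual code from @file{gnunet-service-core_kx.c}):

```c
#include <stdint.h>

/* Simplified model of CORE's replay protection: remember the largest
 * observed sequence number plus a 32-bit mask recording which of the
 * previous 32 sequence numbers were already used. */
struct ReplayWindow
{
  uint32_t last_seq;  /* largest sequence number observed so far */
  uint32_t seen_mask; /* bit i set => (last_seq - 1 - i) was seen */
};

/* Return 1 if a message with sequence number 'seq' should be accepted
 * (recording it in the window), 0 if it is a duplicate or too old. */
static int
check_and_update (struct ReplayWindow *w,
                  uint32_t seq)
{
  if (seq > w->last_seq)
  {
    uint32_t shift = seq - w->last_seq;

    /* Slide the window forward, marking the old 'last_seq' as seen. */
    if (shift >= 33)
      w->seen_mask = 0;
    else if (shift == 32)
      w->seen_mask = 1u << 31;
    else
      w->seen_mask = (w->seen_mask << shift) | (1u << (shift - 1));
    w->last_seq = seq;
    return 1;
  }
  if (seq == w->last_seq)
    return 0; /* duplicate of the newest message */
  {
    uint32_t age = w->last_seq - seq;

    if (age > 32)
      return 0; /* below largest-minus-32: discard */
    if (w->seen_mask & (1u << (age - 1)))
      return 0; /* already seen: replay */
    w->seen_mask |= 1u << (age - 1);
    return 1;
  }
}
```

The actual service additionally checks the timestamp and the HMAC before
the sequence number is even considered.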

@node Type maps
@subsubsection Type maps
@c %**end of header

Once an encrypted connection has been established, peers begin to exchange
type maps. Type maps are used to allow the CORE service to determine which
(encrypted) connections should be shown to which applications. A type map
is an array of 65536 bits representing the different types of messages
understood by applications using the CORE service. Each CORE service
maintains this map, simply by setting the respective bit for each message
type supported by any of the applications using the CORE service. Note
that bits for message types embedded in higher-level protocols (such as
MESH) will not be included in these type maps.

Typically, the type map of a peer will be sparse. Thus, the CORE service
attempts to compress its type map using @code{gzip}-style compression
("deflate") prior to transmission. However, if the compression fails to
compact the map, the map may also be transmitted without compression
(resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
@code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively).
Upon receiving a type map, the respective CORE service notifies
applications about the connection to the other peer if they support any
message type indicated in the type map (or no message type at all).
If the CORE service experiences a connect or disconnect event from an
application, it updates its type map (setting or unsetting the respective
bits) and notifies its neighbours about the change.
The CORE services of the neighbours then in turn generate connect and
disconnect events for the peer that sent the type map for their respective
applications. As CORE messages may be lost, the CORE service confirms
receiving a type map by sending back a
@code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}.
If such a confirmation
(with the correct hash of the type map) is not received, the sender will
retransmit the type map (with exponential back-off).

@cindex CADET Subsystem
@node CADET Subsystem
@section CADET Subsystem

The CADET subsystem in GNUnet is responsible for secure end-to-end
communications between nodes in the GNUnet overlay network. CADET builds
on the CORE subsystem which provides for the link-layer communication and
then adds routing, forwarding and additional security to the connections.
CADET offers the same cryptographic services as CORE, but on an
end-to-end level. This is done so peers retransmitting traffic on behalf
of other peers cannot access the payload data.

@itemize @bullet
@item CADET provides confidentiality with so-called perfect forward
secrecy; we use ECDHE powered by Curve25519 for the key exchange and then
use symmetric encryption, encrypting with both AES-256 and Twofish
@item authentication is achieved by signing the ephemeral keys using
Ed25519, a deterministic variant of ECDSA
@item integrity protection (using SHA-512 to do encrypt-then-MAC, although
only 256 bits are sent to reduce overhead)
@item replay protection (using nonces, timestamps, challenge-response,
message counters and ephemeral keys)
@item liveness (keep-alive messages, timeout)
@end itemize

In addition to the CORE-like security benefits, CADET offers other
properties that make it a more universal service than CORE.

@itemize @bullet
@item CADET can establish channels to arbitrary peers in GNUnet. If a
peer is not immediately reachable, CADET will find a path through the
network and ask other peers to retransmit the traffic on its behalf.
@item CADET offers (optional) reliability mechanisms. In a reliable
channel traffic is guaranteed to arrive complete, unchanged and in order.
@item CADET takes care of flow and congestion control mechanisms, not
allowing the sender to send more traffic than the receiver or the network
are able to process.
@end itemize

@menu
* libgnunetcadet::
@end menu

@cindex libgnunetcadet
@node libgnunetcadet
@subsection libgnunetcadet


The CADET API (defined in @file{gnunet_cadet_service.h}) is the
messaging API used by P2P applications built using GNUnet.
It provides applications the ability to send and receive encrypted
messages to any peer participating in GNUnet.
The API is heavily based on the CORE API.

CADET delivers messages to other peers in "channels".
A channel is a permanent connection defined by a destination peer
(identified by its public key) and a port number.
Internally, CADET tunnels all channels towards a destination peer
using one session key and relays the data on multiple "connections",
independent from the channels.

Each channel has optional parameters, the most important being the
reliability flag.
If a channel is created as reliable and a message gets lost on the
TRANSPORT/CORE level, CADET will retransmit the lost message and
deliver it in order to the destination application.

To communicate with other peers using CADET, it is necessary to first
connect to the service using @code{GNUNET_CADET_connect}.
This function takes several parameters in the form of callbacks, to allow
the client to react to various events, like incoming channels or channels
that terminate, as well as specify a list of ports the client wishes to
listen to (at the moment it is not possible to start listening on further
ports once connected, but nothing prevents a client from connecting
several times to CADET, even with one connection per listening port).
The function returns a handle which has to be used for any further
interaction with the service.

To connect to a remote peer a client has to call the
@code{GNUNET_CADET_channel_create} function.
The most important parameters
given are the remote peer's identity (its public key) and a port, which
specifies which application on the remote peer to connect to, similar to
TCP/UDP ports. CADET will then find the peer in the GNUnet network and
establish the proper low-level connections and do the necessary key
exchanges to ensure an authenticated and secure communication.
Similar to @code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create}
returns a handle to interact with the created channel.

For every message the client wants to send to the remote application,
@code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
channel on which the message should be sent and the size of the message
(but not the message itself!). Once CADET is ready to send the message,
the provided callback will fire, and the message contents are provided to
this callback.

Please note that CADET does not provide an explicit notification of when a
channel is connected. In loosely connected networks, like big wireless
mesh networks, this can take several seconds, even minutes in the worst
case. To be alerted when a channel is online, a client can call
@code{GNUNET_CADET_notify_transmit_ready} immediately after
@code{GNUNET_CADET_channel_create}. When the callback is activated, it
means that the channel is online. The callback can give 0 bytes to CADET
if no message is to be sent; this is OK.

If a requested transmission is no longer needed before the callback
fires, it can be cancelled with
@code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle
given back by @code{GNUNET_CADET_notify_transmit_ready}.
As in the case of CORE, only one message can be requested at a time: a
client must not call @code{GNUNET_CADET_notify_transmit_ready} again until
the callback is called or the request is cancelled.
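In outline, the call sequence described above looks as follows (C-style
pseudocode only: the handle variables, the elided arguments written as
@code{...} and the callback name @code{fill_message_cb} are placeholders,
not part of the API):

```c
/* Outline of a CADET client; not compilable as-is. */
cadet = GNUNET_CADET_connect (cfg, ...);      /* register listen ports and
                                                 channel event callbacks */
channel = GNUNET_CADET_channel_create (cadet, ...,
                                       &peer_identity, port, ...);
/* Request a transmission slot; the payload is produced later, inside
 * the callback, once CADET is ready (and thus the channel is online). */
th = GNUNET_CADET_notify_transmit_ready (channel, ..., message_size,
                                         &fill_message_cb, NULL);
/* If the message is no longer needed before the callback fires: */
GNUNET_CADET_notify_transmit_ready_cancel (th);
```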

When a channel is no longer needed, a client can call
@code{GNUNET_CADET_channel_destroy} to get rid of it.
Note that CADET will try to transmit all pending traffic before notifying
the remote peer of the destruction of the channel, including
retransmitting lost messages if the channel was reliable.

Incoming channels, channels being closed by the remote peer, and traffic
on any incoming or outgoing channels are given to the client when CADET
executes the callbacks given to it at the time of
@code{GNUNET_CADET_connect}.

Finally, when an application no longer wants to use CADET, it should call
@code{GNUNET_CADET_disconnect}, but first all channels and pending
transmissions must be closed (otherwise CADET will complain).

@cindex NSE Subsystem
@node NSE Subsystem
@section NSE Subsystem


NSE stands for @dfn{Network Size Estimation}. The NSE subsystem provides
other subsystems and users with a rough estimate of the number of peers
currently participating in the GNUnet overlay.
The computed value is not a precise number as producing a precise number
in a decentralized, efficient and secure way is impossible.
While NSE's estimate is inherently imprecise, NSE also gives the expected
range. For a peer that has been running in a stable network for a
while, the real network size will typically (99.7% of the time) be in the
range of [2/3 estimate, 3/2 estimate]. We will now give an overview of the
algorithm used to calculate the estimate;
all of the details can be found in this technical report.

@c FIXME: link to the report.

@menu
* Motivation::
* Principle::
* libgnunetnse::
* The NSE Client-Service Protocol::
* The NSE Peer-to-Peer Protocol::
@end menu

@node Motivation
@subsection Motivation


Some subsystems, like DHT, need to know the size of the GNUnet network to
optimize some parameters of their own protocol.
The decentralized nature
of GNUnet makes efficiently and securely counting the exact number of
peers infeasible. Although there are several decentralized algorithms to
count the number of peers in a system, so far there is none to do so
securely.
Other protocols may allow any malicious peer to manipulate the final
result or to take advantage of the system to perform
@dfn{Denial of Service} (DoS) attacks against the network.
GNUnet's NSE protocol avoids these drawbacks.



@menu
* Security::
@end menu

@cindex NSE security
@cindex nse security
@node Security
@subsubsection Security


The NSE subsystem is designed to be resilient against these attacks.
It uses @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work}
to prevent one peer from impersonating a large number of participants,
which would otherwise allow an adversary to artificially inflate the
estimate.
The DoS protection comes from the time-based nature of the protocol:
the estimates are calculated periodically and out-of-time traffic is
either ignored or stored for later retransmission by benign peers.
In particular, peers cannot trigger global network communication at will.

@cindex NSE principle
@cindex nse principle
@node Principle
@subsection Principle


The algorithm calculates the estimate by finding the globally closest
peer ID to a random, time-based value.

The idea is that the closer the ID is to the random value, the more
"densely packed" the ID space is, and therefore, the more peers are in
the network.



@menu
* Example::
* Algorithm::
* Target value::
* Timing::
* Controlled Flooding::
* Calculating the estimate::
@end menu

@node Example
@subsubsection Example


Suppose all peers have IDs between 0 and 100 (our ID space), and the
random value is 42.
If the closest peer has the ID 70 we can imagine that the average
"distance" between peers is around 30 and therefore there are around 3
peers in the whole ID space.
On the other hand, if the closest peer has
the ID 44, we can imagine that the space is rather packed with peers,
maybe as many as 50 of them.
Naturally, we could have been rather unlucky, and there is only one peer,
which happens to have the ID 44. Thus, the current estimate is calculated
as the average over multiple rounds, and not just a single sample.

@node Algorithm
@subsubsection Algorithm


Given that example, one can imagine that the job of the subsystem is to
efficiently communicate the ID of the peer closest to the target value
to all the other peers, who will calculate the estimate from it.

@node Target value
@subsubsection Target value

@c %**end of header

The target value itself is generated by hashing the current time, rounded
down to an agreed value. If the rounding amount is 1h (default) and the
time is 12:34:56, the time to hash would be 12:00:00. The process is
repeated each rounding amount (in this example, every hour).
Every repetition is called a round.

@node Timing
@subsubsection Timing
@c %**end of header

The NSE subsystem has some timing control to avoid everybody broadcasting
its ID all at once. Once each peer has the target random value, it
compares its own ID to the target and calculates the hypothetical size of
the network if that peer were to be the closest.
Then it compares the hypothetical size with the estimate from the previous
rounds. For each value there is an associated point in the period;
let's call it the "broadcast time". If its own hypothetical estimate
is the same as the previous global estimate, its "broadcast time" will be
in the middle of the round. If it is bigger it will be earlier and if it
is smaller (the most likely case) it will be later. This ensures that the
peers closest to the target value start broadcasting their ID first.
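The round mechanics above rely on every peer deriving the same round start
time without communication; that rounding step can be sketched in plain C
(a toy model: the real implementation hashes a GNUnet timestamp to obtain
the target, and the one-hour @code{ROUND_LENGTH} here merely mirrors the
default rounding amount):

```c
#include <time.h>

/* Agreed rounding amount; one hour mirrors the NSE default. */
#define ROUND_LENGTH 3600

/* Round a UNIX timestamp down to the start of the current round.
 * Every peer hashes this same rounded value to obtain the round's
 * target, so no communication is needed to agree on it. */
static time_t
round_start (time_t now)
{
  return now - (now % ROUND_LENGTH);
}
```

With this rounding, 12:34:56 and 12:59:59 both map to 12:00:00, so all
peers in the same round hash the same input.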

@node Controlled Flooding
@subsubsection Controlled Flooding

@c %**end of header

When a peer receives a value, it first verifies that it is closer than the
closest value it had so far; otherwise it answers the incoming message
with a message containing the better value. Then it checks the proof of
work that must be included in the incoming message, to ensure that the
other peer's ID is not made up (otherwise a malicious peer could claim to
have an ID of exactly the target value every round). Once validated, it
compares the broadcast time of the received value with the current time,
and if it's not too early, sends the received value to its neighbors.
Otherwise it stores the value until the correct broadcast time comes.
This prevents unnecessary traffic of sub-optimal values, since a better
value can come before the broadcast time, rendering the previous one
obsolete and saving the traffic that would have been used to broadcast it
to the neighbors.

@node Calculating the estimate
@subsubsection Calculating the estimate

@c %**end of header

Once the closest ID has been spread across the network each peer gets the
exact distance between this ID and the target value of the round and
calculates the estimate with a mathematical formula described in the tech
report. The estimate generated with this method for a single round is not
very precise. Remember the case of the example, where the only peer has
the ID 44 and we happen to generate the target value 42, thinking there
are 50 peers in the network. Therefore, the NSE subsystem remembers the
last 64 estimates and calculates an average over them, giving a result
which usually has one bit of uncertainty (the real size could be half of
the estimate or twice as much). Note that the actual network size is
calculated in powers of two of the raw input; thus one bit of uncertainty
means a factor of two in the size estimate.
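The distance measure can be illustrated by counting matching leading bits
between an ID and the target. The self-contained sketch below works on a
single 32-bit word; the real service applies the same idea to full-size
hashes (via @code{GNUNET_CRYPTO_hash_matching_bits}), and the exact
estimate formula is the one in the technical report:

```c
#include <stdint.h>

/* Count how many leading bits of 'id' match 'target'.  Roughly, each
 * additional matching bit halves the distance to the target and thus
 * suggests a network twice as large; the matching-bit count is the
 * kind of logarithmic raw input that enters the 64-round average. */
static unsigned int
matching_bits (uint32_t id,
               uint32_t target)
{
  uint32_t diff = id ^ target;
  unsigned int bits = 0;

  while ((bits < 32) && (0 == (diff & (0x80000000u >> bits))))
    bits++;
  return bits;
}
```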

@cindex libgnunetnse
@node libgnunetnse
@subsection libgnunetnse

@c %**end of header

The NSE subsystem has the simplest API of all services, with only two
calls: @code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.

The connect call gets a callback function as a parameter and this function
is called each time the network agrees on an estimate. This usually is
once per round, with some exceptions: if the closest peer has a late
local clock and starts spreading its ID after everyone else agreed on a
value, the callback might be activated twice in a round, the second value
always being bigger than the first. The default round time is set to
1 hour.

The disconnect call disconnects from the NSE subsystem and the callback
is no longer called with new estimates.



@menu
* Results::
* libgnunetnse - Examples::
@end menu

@node Results
@subsubsection Results

@c %**end of header

The callback provides two values: the average and the
@uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation}
of the last 64 rounds. The values provided by the callback function are
logarithmic; this means that the real estimate numbers can be obtained by
calculating 2 to the power of the given value (2^average). From a
statistics point of view this means that:

@itemize @bullet
@item 68% of the time the real size is included in the interval
[2^(average-stddev), 2^(average+stddev)]
@item 95% of the time the real size is included in the interval
[2^(average-2*stddev), 2^(average+2*stddev)]
@item 99.7% of the time the real size is included in the interval
[2^(average-3*stddev), 2^(average+3*stddev)]
@end itemize

The expected standard deviation for 64 rounds in a network of stable size
is 0.2.
Thus, we can say that normally:

@itemize @bullet
@item 68% of the time the real size is in the range [-13%, +15%]
@item 95% of the time the real size is in the range [-24%, +32%]
@item 99.7% of the time the real size is in the range [-34%, +52%]
@end itemize

As said in the introduction, we can be quite sure that usually the real
size is between one third and three times the estimate. This can of
course vary with network conditions.
Thus, applications may want to also consider the provided standard
deviation value, not only the average (in particular, if the standard
deviation is very high, the average may be meaningless: the network size
is changing rapidly).

@node libgnunetnse - Examples
@subsubsection libgnunetnse - Examples

@c %**end of header

Let's close with a couple of examples.

@table @asis

@item Average: 10, std dev: 1
Here the estimate would be
2^10 = 1024 peers. @footnote{The range in which we can be 95% sure is:
[2^8, 2^12] = [256, 4096]. We can be very (>99.7%) sure that the network
is not a hundred peers and absolutely sure that it is not a million peers,
but somewhere around a thousand.}

@item Average: 22, std dev: 0.2
Here the estimate would be
2^22 = 4 million peers. @footnote{The range in which we can be 99.7% sure
is: [2^21.4, 2^22.6] = [2.8M, 6.3M]. We can be sure that the network size
is around four million, with absolutely no way of it being 1 million.}

@end table

To put this in perspective, remember that the LHC Higgs boson results
were announced with "5 sigma" and "6 sigma" certainties. In this
case a 5 sigma minimum would be 2 million and a 6 sigma minimum,
1.8 million.

@node The NSE Client-Service Protocol
@subsection The NSE Client-Service Protocol

@c %**end of header

As with the API, the client-service protocol is very simple; it has only
2 different messages, defined in @code{src/nse/nse.h}:

@itemize @bullet
@item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters
and is sent from the client to the service upon connection.
@item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from
the service to the client for every new estimate and upon connection.
It contains a timestamp for the estimate, the average and the standard
deviation for the respective round.
@end itemize

When the @code{GNUNET_NSE_disconnect} API call is executed, the client
simply disconnects from the service, with no message involved.

@cindex NSE Peer-to-Peer Protocol
@node The NSE Peer-to-Peer Protocol
@subsection The NSE Peer-to-Peer Protocol

@c %**end of header

The NSE subsystem only has one message in the P2P protocol, the
@code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.

The key contents of this message are the timestamp to identify the round
(differences in system clocks may cause some peers to send messages way
too early or way too late, so the timestamp allows other peers to
identify such messages easily), the
@uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
used to make it difficult to mount a
@uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the
public key, which is used to verify the signature on the message.

Every peer stores a message for the previous, current and next round. The
messages for the previous and current round are given to peers that
connect to us. The message for the next round is simply stored until our
system clock advances to the next round. The message for the current round
is what we are flooding the network with right now.
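The three-slot bookkeeping just described can be modeled as follows (a toy
sketch, with the flood message reduced to a plain integer):

```c
/* Toy model of NSE's per-round message slots.  'previous' and
 * 'current' are served to newly connecting peers; 'next' holds an
 * early-arrived message until the clock reaches that round. */
struct RoundSlots
{
  int previous;
  int current;
  int next;
};

/* Called when the local system clock advances to a new round. */
static void
advance_round (struct RoundSlots *s,
               int fresh)
{
  s->previous = s->current; /* the round that just ended */
  s->current = s->next;     /* an early message (if any) takes over */
  s->next = fresh;          /* empty slot for early arrivals */
}
```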
At the beginning of each round the peer does the following:

@itemize @bullet
@item calculates its own distance to the target value
@item creates, signs and stores the message for the current round (unless
it has a better message in the "next round" slot which came early in the
previous round)
@item calculates, based on the stored round message (own or received),
when to start flooding it to its neighbors
@end itemize

Upon receiving a message the peer checks the validity of the message
(round, proof of work, signature). The next action depends on the
contents of the incoming message:

@itemize @bullet
@item if the message is worse than the current stored message, the peer
sends the current message back immediately, to stop the other peer from
spreading suboptimal results
@item if the message is better than the current stored message, the peer
stores the new message and calculates the new target time to start
spreading it to its neighbors (excluding the one the message came from)
@item if the message is for the previous round, it is compared to the
message stored in the "previous round slot", which may then be updated
@item if the message is for the next round, it is compared to the message
stored in the "next round slot", which again may then be updated
@end itemize

Finally, when the time comes to send the stored message for the current
round to the neighbors, a random delay is added for each neighbor, to
avoid traffic spikes and minimize cross-messages.

@cindex HOSTLIST Subsystem
@node HOSTLIST Subsystem
@section HOSTLIST Subsystem

@c %**end of header

Peers in the GNUnet overlay network need address information so that they
can connect with other peers. GNUnet uses so-called HELLO messages to
store and exchange peer addresses.
GNUnet provides several methods for peers to obtain this information:

@itemize @bullet
@item out-of-band exchange of HELLO messages (manually, using for example
gnunet-peerinfo)
@item HELLO messages shipped with GNUnet (automatic with distribution)
@item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
@item topology gossiping (learning from other peers we already connected
to), and
@item the HOSTLIST daemon covered in this section, which is particularly
relevant for bootstrapping new peers.
@end itemize

New peers have no existing connections (and thus cannot learn from gossip
among peers), may not have other peers in their LAN and might be started
with an outdated set of HELLO messages from the distribution.
In this case, getting new peers to connect to the network requires either
manual effort or the use of a HOSTLIST to obtain HELLOs.

@menu
* HELLOs::
* Overview for the HOSTLIST subsystem::
* Interacting with the HOSTLIST daemon::
* Hostlist security address validation::
* The HOSTLIST daemon::
* The HOSTLIST server::
* The HOSTLIST client::
* Usage::
@end menu

@node HELLOs
@subsection HELLOs

@c %**end of header

The basic information peers require to connect to other peers is
contained in so-called HELLO messages, which you can think of as business
cards.
Besides the identity of the peer (based on the cryptographic public key) a
HELLO message may contain address information that specifies ways to
contact a peer. By obtaining HELLO messages, a peer can learn how to
contact other peers.

@node Overview for the HOSTLIST subsystem
@subsection Overview for the HOSTLIST subsystem

@c %**end of header

The HOSTLIST subsystem provides a way to distribute and obtain the contact
information needed to connect to other peers using a simple HTTP GET
request.
Its implementation is split in three parts: the main file for the daemon
itself (@file{gnunet-daemon-hostlist.c}), the HTTP client used to download
peer information (@file{hostlist-client.c}) and the server component used
to provide this information to other peers (@file{hostlist-server.c}).
The server is basically a small HTTP web server (based on GNU
libmicrohttpd) which provides a list of HELLOs known to the local peer for
download. The client component is basically an HTTP client
(based on libcurl) which can download hostlists from one or more websites.
The hostlist format is a binary blob containing a sequence of HELLO
messages. Note that any HTTP server can theoretically serve a hostlist;
the built-in hostlist server simply makes it convenient to offer this
service.


@menu
* Features::
* HOSTLIST - Limitations::
@end menu

@node Features
@subsubsection Features

@c %**end of header

The HOSTLIST daemon can:

@itemize @bullet
@item provide HELLO messages with validated addresses obtained from
PEERINFO for download by other peers
@item download HELLO messages and forward these messages to the TRANSPORT
subsystem for validation
@item advertise the URL of this peer's hostlist address to other peers
via gossip
@item automatically learn about hostlist servers from the gossip of other
peers
@end itemize

@node HOSTLIST - Limitations
@subsubsection HOSTLIST - Limitations

@c %**end of header

The HOSTLIST daemon does not:

@itemize @bullet
@item verify the cryptographic information in the HELLO messages
@item verify the address information in the HELLO messages
@end itemize

@node Interacting with the HOSTLIST daemon
@subsection Interacting with the HOSTLIST daemon

@c %**end of header

The HOSTLIST subsystem is currently implemented as a daemon, so there is
no need for the user to interact with it and therefore there is no
command-line tool and no API to communicate with the daemon.
In the
future, we can envision changing this to allow users to manually trigger
the download of a hostlist.

Since there is no command-line interface to interact with HOSTLIST, the
only way to interact with the hostlist is to use STATISTICS to obtain or
modify information about the status of HOSTLIST:

@example
$ gnunet-statistics -s hostlist
@end example

@noindent
In particular, HOSTLIST includes a @strong{persistent} value in statistics
that specifies when the hostlist server might be queried next. As this
value is exponentially increasing during runtime, developers may want to
reset or manually adjust it. Note that HOSTLIST (but not STATISTICS) needs
to be shut down if changes to this value are to have any effect on the
daemon (as HOSTLIST does not monitor STATISTICS for changes to the
download frequency).

@node Hostlist security address validation
@subsection Hostlist security address validation

@c %**end of header

Since information obtained from other parties cannot be trusted without
validation, we have to distinguish between @emph{validated} and
@emph{not validated} addresses. Before using (and so trusting)
information from other parties, this information has to be double-checked
(validated). Address validation is not done by HOSTLIST but by the
TRANSPORT service.

The HOSTLIST component is functionally located between the PEERINFO and
the TRANSPORT subsystems. When acting as a server, the daemon obtains
valid (@emph{validated}) peer information (HELLO messages) from the
PEERINFO service and provides it to other peers. When acting as a client,
it contacts the HOSTLIST servers specified in the configuration, downloads
the (unvalidated) list of HELLO messages and forwards this information
to the TRANSPORT service to validate the addresses.

@cindex HOSTLIST daemon
@node The HOSTLIST daemon
@subsection The HOSTLIST daemon

@c %**end of header

The hostlist daemon is the main component of the HOSTLIST subsystem.
It is
started by the ARM service and (if configured) starts the HOSTLIST client
and server components.

If the daemon provides a hostlist itself, it can advertise its own
hostlist to other peers. To do so, it sends a
@code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to other peers
when they connect to this peer on the CORE level. This hostlist
advertisement message contains the URL to access the HOSTLIST HTTP
server of the sender. The daemon may also subscribe to this type of
message from the CORE service and forward these messages to the
HOSTLIST client. The client then uses all available URLs to download peer
information when necessary.

When starting, the HOSTLIST daemon first connects to the CORE subsystem
and, if hostlist learning is enabled, registers a CORE handler to receive
these messages. Next it starts (if configured) the client and
server. It passes pointers to CORE connect, disconnect and receive
handlers where the client and server store their functions, so the daemon
can notify them about CORE events.

To clean up on shutdown, the daemon has a cleanup task that shuts down
all subsystems and disconnects from CORE.

@cindex HOSTLIST server
@node The HOSTLIST server
@subsection The HOSTLIST server

@c %**end of header

The server provides a way for other peers to obtain HELLOs. Basically, it
is a small web server other peers can connect to in order to download a
list of HELLOs using standard HTTP; it may also advertise the URL of the
hostlist to other peers connecting on the CORE level.


@menu
* The HTTP Server::
* Advertising the URL::
@end menu

@node The HTTP Server
@subsubsection The HTTP Server

@c %**end of header

During startup, the server starts a web server listening on the port
specified with the HTTPPORT value (default 8080). In addition, it
connects to the PEERINFO service to obtain peer information.
The HOSTLIST server
uses the @code{GNUNET_PEERINFO_iterate} function to request HELLO
information for all peers and adds their information to a new hostlist if
they are suitable (expired addresses and HELLOs without addresses are
both unsuitable) and the maximum size for a hostlist is not exceeded
(MAX_BYTES_PER_HOSTLISTS = 500000).
When PEERINFO finishes (signaled by a final NULL callback), the server
destroys the previous hostlist response available for download on the web
server and replaces it with the updated hostlist. The hostlist format is
basically a sequence of HELLO messages (as obtained from PEERINFO) without
any special tokenization. Since each HELLO message contains a size field,
the response can easily be split into separate HELLO messages by the
client.

A HOSTLIST client connecting to the HOSTLIST server will receive the
hostlist as an HTTP response and the server will terminate the
connection with the result code @code{HTTP 200 OK}.
The connection will be closed immediately if no hostlist is available.

@node Advertising the URL
@subsubsection Advertising the URL

@c %**end of header

The server also advertises the URL to download the hostlist to other peers
if hostlist advertisement is enabled.
When a new peer connects and has hostlist learning enabled, the server
sends a @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to this
peer using the CORE service.

@cindex HOSTLIST client
@node The HOSTLIST client
@subsection The HOSTLIST client

@c %**end of header

The client provides the functionality to download the list of HELLOs from
a set of URLs.
It performs a standard HTTP request to the URLs configured and learned
from advertisement messages received from other peers. When a HELLO is
downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT
service for validation.
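Since each HELLO message carries its own size field, splitting the
downloaded blob is a matter of walking those size fields, as described
above. The following sketch illustrates this under simplifying
assumptions (a 4-byte header whose first 16 bits are the total message
size in network byte order, similar to @code{struct GNUNET_MessageHeader};
the function name is invented for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified message header: as with struct GNUNET_MessageHeader, the
 * first field is the total message size (header included) in network
 * byte order, followed by the message type. */
struct MsgHeader
{
  uint16_t size;  /* big-endian, includes the 4-byte header */
  uint16_t type;
};

/* Count the complete messages in a downloaded hostlist buffer by
 * walking the size fields; returns -1 on a truncated or malformed
 * buffer.  (Illustrative helper, not part of the real API.) */
int
count_hostlist_messages (const uint8_t *buf, size_t len)
{
  size_t off = 0;
  int count = 0;

  while (off < len)
  {
    if (len - off < sizeof (struct MsgHeader))
      return -1;                                 /* truncated header */
    uint16_t msize = (uint16_t) ((buf[off] << 8) | buf[off + 1]);
    if ((msize < sizeof (struct MsgHeader)) || (msize > len - off))
      return -1;                                 /* malformed or truncated */
    off += msize;
    count++;
  }
  return count;
}
```

The same walk that counts messages can, of course, also hand each
message to a per-HELLO callback; the real client forwards each complete
HELLO to TRANSPORT as described above.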

The client supports two modes of operation:

@itemize @bullet
@item download of HELLOs (bootstrapping)
@item learning of URLs
@end itemize

@menu
* Bootstrapping::
* Learning::
@end menu

@node Bootstrapping
@subsubsection Bootstrapping

@c %**end of header

For bootstrapping, it schedules a task to download the hostlist from the
set of known URLs.
The downloads are only performed if the number of current
connections is smaller than a minimum number of connections
(at the moment 4).
The interval between downloads increases exponentially; however, once the
interval exceeds one hour, the growth is capped at
(number of connections * 1h).

Once the decision has been taken to download HELLOs, the daemon chooses a
random URL from the list of known URLs. URLs can be specified in the
configuration or learned from advertisement messages.
The client uses an HTTP client library (libcurl) to initiate the download
using the libcurl multi interface.
Libcurl passes the data to the @code{callback_download} function, which
stores the data in a buffer if space is available and the maximum size for
a hostlist download is not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000).
When a full HELLO has been downloaded, the HOSTLIST client offers this
HELLO message to the TRANSPORT service for validation.
When the download finishes or fails, statistical information about the
quality of this URL is updated.

@cindex HOSTLIST learning
@node Learning
@subsubsection Learning

@c %**end of header

The client also manages hostlist advertisements from other peers. The
HOSTLIST daemon forwards @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT}
messages to the client subsystem, which extracts the URL from the message.
Next, a test of the newly obtained URL is performed by triggering a
download from the new URL. If the URL works correctly, it is added to the
list of working URLs.
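The interval growth described under Bootstrapping can be sketched as
follows. This is a simplified model (the function name and the exact
capping rule are assumptions for illustration; the real client tracks
additional state):

```c
#include <assert.h>
#include <stdint.h>

#define HOUR_IN_S 3600

/* Sketch of the download-interval growth described above: the interval
 * doubles each round, but once it exceeds one hour the growth is capped
 * at (number of current connections * 1h).  Illustrative only. */
uint64_t
next_download_interval (uint64_t current_s, unsigned int connections)
{
  uint64_t next = current_s * 2;                  /* exponential growth */
  uint64_t cap = (uint64_t) connections * HOUR_IN_S;

  if ((next > HOUR_IN_S) && (next > cap))
    return cap;                                   /* growth capped */
  return next;
}
```

With four connections, the interval doubles freely up to one hour and is
then held at four hours at most; this keeps a well-connected peer from
hammering hostlist servers while still retrying occasionally.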

The size of the list of URLs is restricted, so if an additional server is
added and the list is full, the URL with the worst quality ranking
(determined, for example, by the number of successful downloads and the
number of HELLOs obtained) is discarded. During shutdown the list of URLs
is saved to a file for persistence and loaded on startup. URLs from the
configuration file are never discarded.

@node Usage
@subsection Usage

@c %**end of header

To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES
section for the ARM services. This is done in the default configuration.

For more information on how to configure the HOSTLIST subsystem see the
installation handbook:@
Configuring the hostlist to bootstrap@
Configuring your peer to provide a hostlist

@cindex IDENTITY Subsystem
@node IDENTITY Subsystem
@section IDENTITY Subsystem

@c %**end of header

Identities of "users" in GNUnet are called egos.
Egos can be used as pseudonyms ("fake names") or be tied to an
organization (for example, "GNU") or even the actual identity of a human.
GNUnet users are expected to have many egos. They might have one tied to
their real identity, some for organizations they manage, and more for
different domains where they want to operate under a pseudonym.

The IDENTITY service allows users to manage their egos. The identity
service manages the private keys of the egos of the local user; it does
not manage identities of other users (public keys). Public keys for other
users need names to become manageable. GNUnet uses the
@dfn{GNU Name System} (GNS) to give names to other users and manage their
public keys securely. This chapter is about the IDENTITY service,
which is about the management of private keys.

On the network, an ego corresponds to an ECDSA key (over Curve25519,
using RFC 6979, as required by GNS). Thus, users can perform actions
under a particular ego by using (signing with) a particular private key.
+Other users can then confirm that the action was really performed by that +ego by checking the signature against the respective public key. + +The IDENTITY service allows users to associate a human-readable name with +each ego. This way, users can use names that will remind them of the +purpose of a particular ego. +The IDENTITY service will store the respective private keys and +allows applications to access key information by name. +Users can change the name that is locally (!) associated with an ego. +Egos can also be deleted, which means that the private key will be removed +and it thus will not be possible to perform actions with that ego in the +future. + +Additionally, the IDENTITY subsystem can associate service functions with +egos. +For example, GNS requires the ego that should be used for the shorten +zone. GNS will ask IDENTITY for an ego for the "gns-short" service. +The IDENTITY service has a mapping of such service strings to the name of +the ego that the user wants to use for this service, for example +"my-short-zone-ego". + +Finally, the IDENTITY API provides access to a special ego, the +anonymous ego. The anonymous ego is special in that its private key is not +really private, but fixed and known to everyone. +Thus, anyone can perform actions as anonymous. This can be useful as with +this trick, code does not have to contain a special case to distinguish +between anonymous and pseudonymous egos. 
+ +@menu +* libgnunetidentity:: +* The IDENTITY Client-Service Protocol:: +@end menu + +@cindex libgnunetidentity +@node libgnunetidentity +@subsection libgnunetidentity +@c %**end of header + + +@menu +* Connecting to the service:: +* Operations on Egos:: +* The anonymous Ego:: +* Convenience API to lookup a single ego:: +* Associating egos with service functions:: +@end menu + +@node Connecting to the service +@subsubsection Connecting to the service + +@c %**end of header + +First, typical clients connect to the identity service using +@code{GNUNET_IDENTITY_connect}. This function takes a callback as a +parameter. +If the given callback parameter is non-null, it will be invoked to notify +the application about the current state of the identities in the system. + +@itemize @bullet +@item First, it will be invoked on all known egos at the time of the +connection. For each ego, a handle to the ego and the user's name for the +ego will be passed to the callback. Furthermore, a @code{void **} context +argument will be provided which gives the client the opportunity to +associate some state with the ego. +@item Second, the callback will be invoked with NULL for the ego, the name +and the context. This signals that the (initial) iteration over all egos +has completed. +@item Then, the callback will be invoked whenever something changes about +an ego. +If an ego is renamed, the callback is invoked with the ego handle of the +ego that was renamed, and the new name. If an ego is deleted, the callback +is invoked with the ego handle and a name of NULL. In the deletion case, +the application should also release resources stored in the context. +@item When the application destroys the connection to the identity service +using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked +with the ego and a name of NULL (equivalent to deletion of the egos). +This should again be used to clean up the per-ego context. 
@end itemize

The ego handle passed to the callback remains valid until the callback is
invoked with a name of NULL, so it is safe to store a reference to the
ego's handle.

@node Operations on Egos
@subsubsection Operations on Egos

@c %**end of header

Given an ego handle, the main operations are to get its associated private
key using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated
public key using @code{GNUNET_IDENTITY_ego_get_public_key}.

The other operations on egos are pretty straightforward.
Using @code{GNUNET_IDENTITY_create}, an application can request the
creation of an ego by specifying the desired name.
The operation will fail if that name is
already in use. Using @code{GNUNET_IDENTITY_rename} the name of an
existing ego can be changed. Finally, egos can be deleted using
@code{GNUNET_IDENTITY_delete}. All of these operations will trigger
updates to the callback given to the @code{GNUNET_IDENTITY_connect}
function of all applications that are connected with the identity service
at the time. @code{GNUNET_IDENTITY_cancel} can be used to cancel the
operations before the respective continuations would be called.
Note that cancelling does not guarantee that the operation will not be
completed anyway; it only guarantees that the continuation will no longer
be called.

@node The anonymous Ego
@subsubsection The anonymous Ego

@c %**end of header

A special way to obtain an ego handle is to call
@code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
"anonymous" user --- anyone knows and can get the private key for this
user, so it is suitable for operations that are supposed to be anonymous
but require signatures (for example, to avoid a special path in the code).
The anonymous ego is always valid and accessing it does not require a
connection to the identity service.
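The naming rules behind @code{GNUNET_IDENTITY_create},
@code{GNUNET_IDENTITY_rename} and @code{GNUNET_IDENTITY_delete} described
under Operations on Egos can be modeled with a toy in-memory registry.
All names and types below are invented; the real service manages private
keys and asynchronous continuations, not just names:

```c
#include <string.h>

#define MAX_EGOS 8

/* Toy registry: a slot is free when its first byte is '\0'. */
static char egos[MAX_EGOS][32];

static int
find_ego (const char *name)
{
  for (int i = 0; i < MAX_EGOS; i++)
    if (0 == strcmp (egos[i], name))
      return i;
  return -1;
}

/* Like GNUNET_IDENTITY_create: fails if the name is already in use. */
int
ego_create (const char *name)
{
  if (-1 != find_ego (name))
    return -1;                    /* name already in use */
  for (int i = 0; i < MAX_EGOS; i++)
    if ('\0' == egos[i][0])
    {
      strncpy (egos[i], name, sizeof egos[i] - 1);
      return 0;
    }
  return -1;                      /* registry full */
}

/* Like GNUNET_IDENTITY_rename: the new name must be free. */
int
ego_rename (const char *old_name, const char *new_name)
{
  int i = find_ego (old_name);

  if ((-1 == i) || (-1 != find_ego (new_name)))
    return -1;
  strncpy (egos[i], new_name, sizeof egos[i] - 1);
  return 0;
}

/* Like GNUNET_IDENTITY_delete: frees the name for reuse. */
int
ego_delete (const char *name)
{
  int i = find_ego (name);

  if (-1 == i)
    return -1;
  egos[i][0] = '\0';
  return 0;
}
```

In the real API each of these calls additionally triggers the
@code{GNUNET_IDENTITY_connect} callbacks of all connected applications,
as described above.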

@node Convenience API to lookup a single ego
@subsubsection Convenience API to lookup a single ego


As applications commonly only have to look up a single ego, there is a
convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
look up a single ego by name. Note that this is the user's name for the
ego, not the service function. The resulting ego will be returned via a
callback and will only be valid during that callback. The operation can
be cancelled via @code{GNUNET_IDENTITY_ego_lookup_cancel}
(cancellation is only legal before the callback is invoked).

@node Associating egos with service functions
@subsubsection Associating egos with service functions


The @code{GNUNET_IDENTITY_set} function is used to associate a particular
ego with a service function. The name used by the service and the ego are
given as arguments.
Afterwards, the service can use its name to look up the associated ego
using @code{GNUNET_IDENTITY_get}.

@node The IDENTITY Client-Service Protocol
@subsection The IDENTITY Client-Service Protocol

@c %**end of header

A client connecting to the identity service first sends a message with
type
@code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the
client will receive information about changes to the egos by receiving
messages of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}.
Those messages contain the private key of the ego and the user's name of
the ego (or zero bytes for the name to indicate that the ego was deleted).
A special bit @code{end_of_list} is used to indicate the end of the
initial iteration over the identity service's egos.

The client can trigger changes to the egos by sending @code{CREATE},
@code{RENAME} or @code{DELETE} messages.
The CREATE message contains the private key and the desired name.@
The RENAME message contains the old name and the new name.@
The DELETE message only needs to include the name of the ego to delete.@
The service responds to each of these messages with a @code{RESULT_CODE}
message which indicates success or error of the operation, and possibly
a human-readable error message.

Finally, the client can bind the name of a service function to an ego by
sending a @code{SET_DEFAULT} message with the name of the service function
and the private key of the ego.
Such bindings can then be resolved using a @code{GET_DEFAULT} message,
which includes the name of the service function. The identity service
will respond to a GET_DEFAULT request with a SET_DEFAULT message
containing the respective information, or with a RESULT_CODE to
indicate an error.

@cindex NAMESTORE Subsystem
@node NAMESTORE Subsystem
@section NAMESTORE Subsystem

The NAMESTORE subsystem provides persistent storage for local GNS zone
information. All local GNS zone information is managed by NAMESTORE. It
provides both the functionality to administer local GNS information (e.g.
delete and add records) and the functionality to retrieve GNS information
(e.g. to list name information in a client).
NAMESTORE only manages the persistent storage of zone information
belonging to the user running the service: GNS information from other
users obtained from the DHT is stored by the NAMECACHE subsystem.

NAMESTORE uses a plugin-based database backend to store GNS information
with good performance. SQLite, MySQL and PostgreSQL are supported as
database backends.
NAMESTORE clients interact with the IDENTITY subsystem to obtain
cryptographic information about zones based on egos (as described for the
IDENTITY subsystem), but internally NAMESTORE refers to zones using the
ECDSA private key.
In addition, it collaborates with the NAMECACHE subsystem and
stores zone information in the GNS cache when local information is
modified, to increase look-up performance for local information.

NAMESTORE provides functionality to look up and store records, to iterate
over a specific zone or all zones and to monitor zones for changes.
NAMESTORE functionality can be accessed using the NAMESTORE API or the
NAMESTORE command line tool.

@menu
* libgnunetnamestore::
@end menu

@cindex libgnunetnamestore
@node libgnunetnamestore
@subsection libgnunetnamestore

To interact with NAMESTORE, clients first connect to the NAMESTORE service
using @code{GNUNET_NAMESTORE_connect}, passing a configuration handle.
As a result they obtain a NAMESTORE handle they can use for operations;
NULL is returned if the connection failed.

To disconnect from NAMESTORE, clients use
@code{GNUNET_NAMESTORE_disconnect} and specify the handle to disconnect.

NAMESTORE internally uses the ECDSA private key to refer to zones. These
private keys can be obtained from the IDENTITY subsystem.
Here @emph{egos} can be used to refer to zones, or the default ego
assigned to the GNS subsystem can be used to obtain the master zone's
private key.


@menu
* Editing Zone Information::
* Iterating Zone Information::
* Monitoring Zone Information::
@end menu

@node Editing Zone Information
@subsubsection Editing Zone Information

@c %**end of header

NAMESTORE provides functions to look up records stored under a label in a
zone and to store records under a label in a zone.

To store (and delete) records, the client uses the
@code{GNUNET_NAMESTORE_records_store} function and has to provide the
namestore handle to use, the private key of the zone, the label to store
the records under, the records and the number of records, plus a callback
function.
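A property of @code{GNUNET_NAMESTORE_records_store} worth keeping in mind
is that a store always replaces the full set of records under the label;
nothing is merged. Adding a single record therefore means: look up, merge,
store. A toy model of this pattern (invented names, records reduced to a
type number):

```c
#include <stddef.h>
#include <string.h>

#define MAX_RECORDS 8

/* All records stored under one label; records reduced to a type number. */
struct LabelRecords
{
  size_t count;
  int types[MAX_RECORDS];
};

/* Replace-style store, as GNUNET_NAMESTORE_records_store behaves:
 * the new set replaces the old one; storing 0 records deletes. */
void
records_store (struct LabelRecords *label, const int *types, size_t n)
{
  label->count = n;
  if (n > 0)
    memcpy (label->types, types, n * sizeof (int));
}

/* Client-side "add one record": look up, merge, store the result. */
int
records_add (struct LabelRecords *label, int type)
{
  int merged[MAX_RECORDS];

  if (label->count >= MAX_RECORDS)
    return -1;
  memcpy (merged, label->types, label->count * sizeof (int)); /* lookup */
  merged[label->count] = type;                                /* merge */
  records_store (label, merged, label->count + 1);            /* store */
  return 0;
}
```

In the real API the lookup and store steps are separate asynchronous
operations, so the merge happens inside the lookup's result callback.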
After the operation is performed, NAMESTORE will call the provided
callback function with the result: @code{GNUNET_SYSERR} on failure
(including timeout, queue drop, or failure to validate), @code{GNUNET_NO}
if the content was already there or was not found, and @code{GNUNET_YES}
(or another positive value) on success, plus an additional error message.

Records are deleted by using the store command with 0 records to store.
It is important to note that records are not merged when records already
exist under the label.
So a client first has to retrieve the existing records, merge them with
the new records and then store the result.

To perform a lookup operation, the client uses the
@code{GNUNET_NAMESTORE_records_lookup} function. Here the client has to
pass the namestore handle, the private key of the zone and the label. The
client also has to provide a callback function which will be called with
the result of the lookup operation:
the zone for the records, the label, and the records including the
number of records included.

A special operation is used to set the preferred nickname for a zone.
This nickname is stored with the zone and is automatically merged with
all labels and records stored in a zone. Here the client uses the
@code{GNUNET_NAMESTORE_set_nick} function and passes the private key of
the zone and the nickname as a string, plus the callback with the result
of the operation.

@node Iterating Zone Information
@subsubsection Iterating Zone Information

@c %**end of header

A client can iterate over all information in a zone or over all zones
managed by NAMESTORE.
Here a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
function and passes the namestore handle, the zone to iterate over and a
callback function to call with the result.
If the client wants to iterate over all zones, it passes NULL for the
zone.
A @code{GNUNET_NAMESTORE_ZoneIterator} handle is returned to be used to
continue the iteration.

NAMESTORE calls the callback for every result and expects the client to
call @code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
@code{GNUNET_NAMESTORE_zone_iterator_stop} to interrupt the iteration.
When NAMESTORE has reached the last item, it will call the callback with
a NULL value to indicate the end of the iteration.

@node Monitoring Zone Information
@subsubsection Monitoring Zone Information

@c %**end of header

Clients can also monitor zones to be notified about changes. Here the
client uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and
passes the private key of the zone and a callback function to call
with updates for the zone.
The client can request to first obtain the existing zone information by
iterating over the zone, and can specify a synchronization callback to be
called when the client and the namestore are synchronized.

On an update, NAMESTORE will call the callback with the private key of the
zone, the label and the records and their number.

To stop monitoring, the client calls
@code{GNUNET_NAMESTORE_zone_monitor_stop} and passes the handle obtained
from the function that started the monitoring.

@cindex PEERINFO Subsystem
@node PEERINFO Subsystem
@section PEERINFO Subsystem

@c %**end of header

The PEERINFO subsystem is used to store verified (validated) information
about known peers in a persistent way. It obtains these addresses, for
example, from the TRANSPORT service, which is in charge of address
validation. Validation means that the information in the HELLO message is
checked by connecting to the addresses and performing a cryptographic
handshake to authenticate the peer instance claiming to be reachable with
these addresses.
PEERINFO does not validate the HELLO messages itself but only stores them
and gives them to interested clients.

As future work, we think about moving from storing just HELLO messages to
providing a generic persistent per-peer information store.
More and more subsystems tend to need to store per-peer information in a
persistent way.
To avoid duplicating this functionality, we plan to provide a PEERSTORE
service offering it.

@menu
* PEERINFO - Features::
* PEERINFO - Limitations::
* DeveloperPeer Information::
* Startup::
* Managing Information::
* Obtaining Information::
* The PEERINFO Client-Service Protocol::
* libgnunetpeerinfo::
@end menu

@node PEERINFO - Features
@subsection PEERINFO - Features

@c %**end of header

@itemize @bullet
@item Persistent storage
@item Client notification mechanism on update
@item Periodic clean up for expired information
@item Differentiation between public and friend-only HELLOs
@end itemize

@node PEERINFO - Limitations
@subsection PEERINFO - Limitations


@itemize @bullet
@item Does not perform HELLO validation
@end itemize

@node DeveloperPeer Information
@subsection DeveloperPeer Information

@c %**end of header

The PEERINFO subsystem stores this information in the form of HELLO
messages, which you can think of as business cards.
These HELLO messages contain the public key of a peer and the addresses
a peer can be reached under.
The addresses include an expiration date describing how long they are
valid. This information is updated regularly by the TRANSPORT service by
revalidating the address.
If an address is expired and not renewed, it can be removed from the
HELLO message.

Some peers do not want to have their HELLO messages distributed to other
peers, especially when GNUnet's friend-to-friend mode is enabled.
To prevent this undesired distribution, PEERINFO distinguishes between
@emph{public} and @emph{friend-only} HELLO messages.
Public HELLO messages can be freely distributed to other (possibly
unknown) peers (for example using the hostlist, gossiping, broadcasting),
whereas friend-only HELLO messages may not be distributed to other peers.
Friend-only HELLO messages have an additional flag @code{friend_only} set
internally. For public HELLO messages this flag is not set.
PEERINFO does not and cannot check whether a client is allowed to obtain
a specific HELLO type.

The HELLO messages can be managed using the GNUnet HELLO library.
Other GNUnet subsystems can obtain this information from PEERINFO and use
it for their purposes.
Clients are, for example, the HOSTLIST component, which provides this
information to other peers in the form of a hostlist, and the TRANSPORT
subsystem, which uses this information to maintain connections to other
peers.

@node Startup
@subsection Startup

@c %**end of header

During startup the PEERINFO service loads persistent HELLOs from disk.
First PEERINFO parses the directory configured in the HOSTS value of the
@code{PEERINFO} configuration section, which is used to store PEERINFO
information. Valid HELLO messages are extracted from all files found in
this directory.
In addition, it loads HELLO messages shipped with the GNUnet distribution.
These HELLOs are used to simplify network bootstrapping by providing
valid peer information with the distribution.
The use of these HELLOs can be prevented by setting the
@code{USE_INCLUDED_HELLOS} option in the @code{PEERINFO} configuration
section to @code{NO}. Files containing invalid information are removed.

@node Managing Information
@subsection Managing Information

@c %**end of header

The PEERINFO service stores information about known peers and a single
HELLO message for every peer.
A peer does not need to have a HELLO if no information is available.
HELLO information from different sources, for example a HELLO obtained
from a remote HOSTLIST and a second HELLO stored on disk, is combined
and merged into one single HELLO message per peer which will be given to
clients. During this merge process the HELLO is immediately written to
disk to ensure persistence.
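As noted above, every address in a HELLO carries an expiration date, and
expired addresses that were not renewed are removed. A sketch of such a
cleanup step (simplified types and an invented helper name; zero kept
addresses means the HELLO itself can be discarded):

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a HELLO address with its expiration time. */
struct Address
{
  uint64_t expires_at;   /* absolute expiration timestamp */
};

/* Drop expired addresses in place and return how many were kept; a
 * result of 0 means the whole HELLO should be discarded. */
size_t
drop_expired (struct Address *addrs, size_t n, uint64_t now)
{
  size_t kept = 0;

  for (size_t i = 0; i < n; i++)
    if (addrs[i].expires_at > now)
      addrs[kept++] = addrs[i];   /* keep still-valid address */
  return kept;
}
```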

In addition, PEERINFO periodically scans the directory where information
is stored for HELLO messages with expired TRANSPORT addresses.
This periodic task scans all files in the directory and recreates the
HELLO messages it finds.
Expired TRANSPORT addresses are removed from the HELLO, and if the
HELLO does not contain any valid addresses, it is discarded and removed
from the disk.

@node Obtaining Information
@subsection Obtaining Information

@c %**end of header

When a client requests information from PEERINFO, PEERINFO performs a
lookup for the respective peer (or all peers if desired) and transmits
this information to the client.
The client can specify whether friend-only HELLOs are to be included, and
PEERINFO filters the respective HELLO messages before transmitting the
information.

To notify clients about changes to PEERINFO information, PEERINFO
maintains a list of clients interested in such notifications.
A notification occurs if a HELLO for a peer was updated (due to a
merge, for example) or a new peer was added.

@node The PEERINFO Client-Service Protocol
@subsection The PEERINFO Client-Service Protocol

@c %**end of header

To connect to and disconnect from the PEERINFO service, PEERINFO
utilizes the util client/server infrastructure, so no special message
types are used here.

To add information for a peer, the plain HELLO message is transmitted to
the service without any wrapping. All pieces of information required are
stored within the HELLO message.
The PEERINFO service provides a message handler accepting and processing
these HELLO messages.

When obtaining PEERINFO information using the iterate functionality,
specific messages are used. To obtain information for all peers, a
@code{struct ListAllPeersMessage} with message type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag
@code{include_friend_only} indicating whether friend-only HELLO messages
should be included is transmitted.
If information for a specific peer is required,
a @code{struct ListPeerMessage} with
@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity is
used.

For both variants the PEERINFO service replies, for each HELLO message it
wants to transmit, with a message of type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO.
The final message is a @code{struct GNUNET_MessageHeader} with type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. If the client receives this
message, it can proceed with the next request if any is pending.

@node libgnunetpeerinfo
@subsection libgnunetpeerinfo

@c %**end of header

The PEERINFO API consists mainly of three different functionalities:

@itemize @bullet
@item maintaining a connection to the service
@item adding new information to the PEERINFO service
@item retrieving information from the PEERINFO service
@end itemize

@menu
* Connecting to the PEERINFO Service::
* Adding Information to the PEERINFO Service::
* Obtaining Information from the PEERINFO Service::
@end menu

@node Connecting to the PEERINFO Service
@subsubsection Connecting to the PEERINFO Service

@c %**end of header

To connect to the PEERINFO service the function
@code{GNUNET_PEERINFO_connect} is used, taking a configuration handle as
an argument. To disconnect from PEERINFO, the function
@code{GNUNET_PEERINFO_disconnect} has to be called, taking the PEERINFO
handle returned from the connect function.

@node Adding Information to the PEERINFO Service
@subsubsection Adding Information to the PEERINFO Service

@c %**end of header

@code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
storage. This function takes the PEERINFO handle as an argument, the HELLO
message to store and a continuation with a closure to be called with the
result of the operation.
@code{GNUNET_PEERINFO_add_peer} returns a handle to this operation
allowing the caller to cancel the operation with the respective cancel
function @code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information
from PEERINFO you can iterate over all information stored with PEERINFO,
or you can tell PEERINFO to notify you when new peer information is
available.

@node Obtaining Information from the PEERINFO Service
@subsubsection Obtaining Information from the PEERINFO Service

@c %**end of header

To iterate over information in PEERINFO you use
@code{GNUNET_PEERINFO_iterate}.
This function expects the PEERINFO handle, a flag indicating whether
HELLO messages intended for friend-only mode should be included, a
timeout specifying how long the operation may take, and a callback with a
callback closure to be called with the results.
If you want to obtain information for a specific peer, you can specify
the peer identity; if this identity is NULL, information for all peers is
returned. The function returns a handle that allows cancelling the
operation using @code{GNUNET_PEERINFO_iterate_cancel}.

To get notified when peer information changes, you can use
@code{GNUNET_PEERINFO_notify}.
This function expects a configuration handle and a flag indicating
whether friend-only HELLO messages should be included. The callback
function will be called on every change. The function returns a handle
to cancel notifications with @code{GNUNET_PEERINFO_notify_cancel}.

@cindex PEERSTORE Subsystem
@node PEERSTORE Subsystem
@section PEERSTORE Subsystem

@c %**end of header

GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently
store and retrieve arbitrary data.
Each data record stored with PEERSTORE contains the following fields:

@itemize @bullet
@item subsystem: Name of the subsystem responsible for the record.
+@item peerid: Identity of the peer this record is related to.
+@item key: a key string identifying the record.
+@item value: binary record value.
+@item expiry: record expiry date.
+@end itemize
+
+@menu
+* Functionality::
+* Architecture::
+* libgnunetpeerstore::
+@end menu
+
+@node Functionality
+@subsection Functionality
+
+@c %**end of header
+
+Subsystems can store any type of value under a (subsystem, peerid, key)
+combination. A "replace" flag set during store operations forces
+PEERSTORE to replace any old values stored under the same
+(subsystem, peerid, key) combination with the new value.
+Additionally, an expiry date is set after which the record is
+@emph{possibly} deleted by PEERSTORE.
+
+Subsystems can iterate over all values stored under any of the
+following combinations of fields:
+
+@itemize @bullet
+@item (subsystem)
+@item (subsystem, peerid)
+@item (subsystem, key)
+@item (subsystem, peerid, key)
+@end itemize
+
+Subsystems can also request to be notified about any new values stored
+under a (subsystem, peerid, key) combination by sending a "watch"
+request to PEERSTORE.
+
+@node Architecture
+@subsection Architecture
+
+@c %**end of header
+
+PEERSTORE implements the following components:
+
+@itemize @bullet
+@item PEERSTORE service: Handles store, iterate and watch operations.
+@item PEERSTORE API: API to be used by other subsystems to communicate
+and issue commands to the PEERSTORE service.
+@item PEERSTORE plugins: Handle the persistent storage. At the moment,
+only an "sqlite" plugin is implemented.
+@end itemize
+
+@cindex libgnunetpeerstore
+@node libgnunetpeerstore
+@subsection libgnunetpeerstore
+
+@c %**end of header
+
+libgnunetpeerstore is the library containing the PEERSTORE API.
+Subsystems wishing to communicate with the PEERSTORE service use this
+API to open a connection to PEERSTORE. This is done by calling
+@code{GNUNET_PEERSTORE_connect}, which returns a handle to the newly
+created connection.
+This handle has to be used with any further calls to the API.
+
+To store a new record, the function @code{GNUNET_PEERSTORE_store} is
+used; it requires the record fields and a continuation function that
+will be called by the API after the STORE request is sent to the
+PEERSTORE service.
+Note that calling the continuation function does not mean that the
+record was successfully stored, only that the STORE request has been
+successfully sent to the PEERSTORE service.
+@code{GNUNET_PEERSTORE_store_cancel} can be called to cancel the STORE
+request, but only before the continuation function has been called.
+
+To iterate over stored records, the function
+@code{GNUNET_PEERSTORE_iterate} is used. @emph{peerid} and @emph{key}
+can be set to NULL. An iterator callback function will be called with
+each matching record found, and with a NULL record at the end to signal
+the end of the result set.
+@code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the
+ITERATE request before the iterator callback is called with a NULL
+record.
+
+To be notified about new values stored under a (subsystem, peerid, key)
+combination, the function @code{GNUNET_PEERSTORE_watch} is used.
+This registers the watcher with the PEERSTORE service; any new records
+matching the given combination will trigger the callback function
+passed to @code{GNUNET_PEERSTORE_watch}. This continues until
+@code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the
+service is destroyed.
+
+After the connection is no longer needed, the function
+@code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
+PEERSTORE service.
+Any pending ITERATE or WATCH requests will be destroyed.
+If the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will
+delay the disconnection until all pending STORE requests are sent to
+the PEERSTORE service; otherwise, the pending STORE requests will be
+destroyed as well.
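The record semantics above (values accumulate under one (subsystem, peerid, key) combination unless the "replace" flag is set) can be sketched with a toy in-memory model. This is illustrative only: the structure, the fixed-size table, and the @code{peerstore_put}/@code{peerstore_count} helpers are hypothetical stand-ins for the real service, which persists records through a storage plugin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy in-memory model of a PEERSTORE record (names are hypothetical). */
struct Record {
  char subsystem[32], peerid[32], key[32], value[64];
  uint64_t expiry; /* records past this time may be purged */
  int used;
};

static struct Record store_tbl[16];

static int record_matches (const struct Record *r, const char *sys,
                           const char *peer, const char *key)
{
  return r->used &&
         0 == strcmp (r->subsystem, sys) &&
         0 == strcmp (r->peerid, peer) &&
         0 == strcmp (r->key, key);
}

/* Store a value under (subsystem, peerid, key).  With 'replace' set,
 * all old values under the same combination are dropped first. */
void peerstore_put (const char *sys, const char *peer, const char *key,
                    const char *val, uint64_t expiry, int replace)
{
  size_t n = sizeof store_tbl / sizeof store_tbl[0];
  if (replace)
    for (size_t i = 0; i < n; i++)
      if (record_matches (&store_tbl[i], sys, peer, key))
        store_tbl[i].used = 0;
  for (size_t i = 0; i < n; i++)
    if (! store_tbl[i].used) {
      snprintf (store_tbl[i].subsystem, sizeof store_tbl[i].subsystem, "%s", sys);
      snprintf (store_tbl[i].peerid, sizeof store_tbl[i].peerid, "%s", peer);
      snprintf (store_tbl[i].key, sizeof store_tbl[i].key, "%s", key);
      snprintf (store_tbl[i].value, sizeof store_tbl[i].value, "%s", val);
      store_tbl[i].expiry = expiry;
      store_tbl[i].used = 1;
      return;
    }
}

/* Count values stored under one (subsystem, peerid, key) combination. */
size_t peerstore_count (const char *sys, const char *peer, const char *key)
{
  size_t cnt = 0;
  for (size_t i = 0; i < sizeof store_tbl / sizeof store_tbl[0]; i++)
    if (record_matches (&store_tbl[i], sys, peer, key))
      cnt++;
  return cnt;
}

/* Scenario: two plain stores accumulate two values... */
size_t demo_after_appends (void)
{
  memset (store_tbl, 0, sizeof store_tbl);
  peerstore_put ("dht", "P1", "hello", "v1", 1000, 0);
  peerstore_put ("dht", "P1", "hello", "v2", 1000, 0);
  return peerstore_count ("dht", "P1", "hello");
}

/* ...while a store with 'replace' set collapses them to one value. */
size_t demo_after_replace (void)
{
  (void) demo_after_appends ();
  peerstore_put ("dht", "P1", "hello", "v3", 1000, 1);
  return peerstore_count ("dht", "P1", "hello");
}
```

The point of the sketch is only the "replace" semantics; expiry-based purging, which the real service performs, is represented by the field but not simulated.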
+
+@cindex SET Subsystem
+@node SET Subsystem
+@section SET Subsystem
+
+@c %**end of header
+
+The SET service implements efficient set operations between two peers
+over a CADET tunnel.
+Currently, set union and set intersection are the only supported
+operations. Elements of a set consist of an @emph{element type} and
+arbitrary binary @emph{data}.
+The size of an element's data is limited to around 62 KB.
+
+@menu
+* Local Sets::
+* Set Modifications::
+* Set Operations::
+* Result Elements::
+* libgnunetset::
+* The SET Client-Service Protocol::
+* The SET Intersection Peer-to-Peer Protocol::
+* The SET Union Peer-to-Peer Protocol::
+@end menu
+
+@node Local Sets
+@subsection Local Sets
+
+@c %**end of header
+
+Sets created by a local client can be modified and reused for multiple
+operations. As each set operation requires potentially expensive
+auxiliary data to be computed for each element of a set, a set can only
+participate in one type of set operation (i.e. union or intersection).
+The type of a set is determined upon its creation.
+If the elements of a set are needed for an operation of a different
+type, all of the set's elements must be copied to a new set of the
+appropriate type.
+
+@node Set Modifications
+@subsection Set Modifications
+
+@c %**end of header
+
+Even when set operations are active, one can add elements to and remove
+elements from a set.
+However, these changes will only be visible to operations that have
+been created after the changes have taken place. That is, every set
+operation only sees a snapshot of the set from the time the operation
+was started.
+This mechanism is @emph{not} implemented by copying the whole set, but
+by attaching @emph{generation information} to each element and
+operation.
+
+@node Set Operations
+@subsection Set Operations
+
+@c %**end of header
+
+Set operations can be started in two ways: either by accepting an
+operation request from a remote peer, or by requesting a set operation
+from a remote peer.
+Set operations are uniquely identified by the involved @emph{peers}, an
+@emph{application id} and the @emph{operation type}.
+
+The client is notified of incoming set operations by
+@emph{set listeners}.
+A set listener listens for incoming operations of a specific operation
+type and application id.
+Once notified of an incoming set request, the client can accept the set
+request (providing a local set for the operation) or reject it.
+
+@node Result Elements
+@subsection Result Elements
+
+@c %**end of header
+
+The SET service has three @emph{result modes} that determine how an
+operation's result set is delivered to the client:
+
+@itemize @bullet
+@item @strong{Full Result Set.} All elements of the set resulting from
+the set operation are returned to the client.
+@item @strong{Added Elements.} Only elements that result from the
+operation and are not already in the local peer's set are returned.
+Note that for some operations (like set intersection) this result mode
+will never return any elements.
+This can be useful if only the remote peer is actually interested in
+the result of the set operation.
+@item @strong{Removed Elements.} Only elements that are in the local
+peer's initial set but not in the operation's result set are returned.
+Note that for some operations (like set union) this result mode will
+never return any elements. This can be useful if only the remote peer
+is actually interested in the result of the set operation.
+@end itemize
+
+@cindex libgnunetset
+@node libgnunetset
+@subsection libgnunetset
+
+@c %**end of header
+
+@menu
+* Sets::
+* Listeners::
+* Operations::
+* Supplying a Set::
+* The Result Callback::
+@end menu
+
+@node Sets
+@subsubsection Sets
+
+@c %**end of header
+
+New sets are created with @code{GNUNET_SET_create}. Both the local
+peer's configuration (as each set has its own client connection) and
+the operation type must be specified.
+The set exists until either the client calls @code{GNUNET_SET_destroy}
+or the client's connection to the service is disrupted.
+In the latter case, the client is notified by the return value of
+functions dealing with sets. This return value must always be checked.
+
+Elements are added and removed with @code{GNUNET_SET_add_element} and
+@code{GNUNET_SET_remove_element}.
+
+@node Listeners
+@subsubsection Listeners
+
+@c %**end of header
+
+Listeners are created with @code{GNUNET_SET_listen}. Each time a
+remote peer suggests a set operation with an application id and
+operation type matching a listener, the listener's callback is invoked.
+The client then must synchronously call either @code{GNUNET_SET_accept}
+or @code{GNUNET_SET_reject}. Note that the operation will not be
+started until the client calls @code{GNUNET_SET_commit}
+(see Section "Supplying a Set").
+
+@node Operations
+@subsubsection Operations
+
+@c %**end of header
+
+Operations to be initiated by the local peer are created with
+@code{GNUNET_SET_prepare}. Note that the operation will not be started
+until the client calls @code{GNUNET_SET_commit}
+(see Section "Supplying a Set").
+
+@node Supplying a Set
+@subsubsection Supplying a Set
+
+@c %**end of header
+
+To create symmetry between the two ways of starting a set operation
+(accepting and initiating it), the operation handles returned by
+@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare} do not yet have
+a set to operate on, thus they cannot do any work yet.
+
+The client must call @code{GNUNET_SET_commit} to specify a set to use
+for an operation. @code{GNUNET_SET_commit} may only be called once per
+set operation.
+
+@node The Result Callback
+@subsubsection The Result Callback
+
+@c %**end of header
+
+Clients must specify both a result mode and a result callback with
+@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}.
The result callback is invoked with a status indicating either that an
+element was received, or that the operation failed or succeeded.
+The interpretation of the received element depends on the result mode.
+The callback needs to know which result mode it is used in, as the
+arguments do not indicate if an element is part of the full result set,
+or if it is in the difference between the original set and the final
+set.
+
+@node The SET Client-Service Protocol
+@subsection The SET Client-Service Protocol
+
+@c %**end of header
+
+@menu
+* Creating Sets::
+* Listeners2::
+* Initiating Operations::
+* Modifying Sets::
+* Results and Operation Status::
+* Iterating Sets::
+@end menu
+
+@node Creating Sets
+@subsubsection Creating Sets
+
+@c %**end of header
+
+For each set of a client, there exists a client connection to the
+service.
+Sets are created by sending the @code{GNUNET_SERVICE_SET_CREATE}
+message over a new client connection. Multiple operations for one set
+are multiplexed over one client connection, using a request id supplied
+by the client.
+
+@node Listeners2
+@subsubsection Listeners2
+
+@c %**end of header
+
+Each listener also requires a separate client connection. By sending
+the @code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the
+service of the application id and operation type it is interested in.
+A client rejects an incoming request by sending
+@code{GNUNET_SERVICE_SET_REJECT} on the listener's client connection.
+In contrast, when accepting an incoming request, a
+@code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the set that
+is supplied for the set operation.
+
+@node Initiating Operations
+@subsubsection Initiating Operations
+
+@c %**end of header
+
+Operations with remote peers are initiated by sending a
+@code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
+connection over which this message is sent determines the set to use.
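The three result modes described under "Result Elements" can be illustrated with a toy computation on small integer sets. The helper names and the use of plain @code{int} elements are this sketch's own assumptions; real SET elements carry a type and arbitrary binary data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: does 'x' occur in set[0..n)? */
static int set_contains (const int *set, size_t n, int x)
{
  for (size_t i = 0; i < n; i++)
    if (set[i] == x)
      return 1;
  return 0;
}

/* "Added Elements" mode: elements of the operation's result that were
 * not in the local peer's initial set. */
size_t count_added (const int *local, size_t ln,
                    const int *result, size_t rn)
{
  size_t cnt = 0;
  for (size_t i = 0; i < rn; i++)
    if (! set_contains (local, ln, result[i]))
      cnt++;
  return cnt;
}

/* "Removed Elements" mode: elements of the local peer's initial set
 * that are absent from the operation's result. */
size_t count_removed (const int *local, size_t ln,
                      const int *result, size_t rn)
{
  return count_added (result, rn, local, ln);
}

/* Example data: local set {1,2,3}; remote peer holds {2,3,4}, so the
 * union result is {1,2,3,4} and the intersection result is {2,3}. */
static const int local_set[]    = { 1, 2, 3 };
static const int union_result[] = { 1, 2, 3, 4 };
static const int inter_result[] = { 2, 3 };
```

This mirrors the text's observation: an intersection never yields "added" elements for the local peer, and a union never yields "removed" ones.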
+
+@node Modifying Sets
+@subsubsection Modifying Sets
+
+@c %**end of header
+
+Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
+@code{GNUNET_SERVICE_SET_REMOVE} messages.
+
+
+@c %@menu
+@c %* Results and Operation Status::
+@c %* Iterating Sets::
+@c %@end menu
+
+@node Results and Operation Status
+@subsubsection Results and Operation Status
+@c %**end of header
+
+The service notifies the client of result elements and success/failure
+of a set operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
+
+@node Iterating Sets
+@subsubsection Iterating Sets
+
+@c %**end of header
+
+All elements of a set can be requested by sending
+@code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with
+@code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the
+iteration with @code{GNUNET_SERVICE_SET_ITER_DONE}.
+After each received element, the client
+must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set
+iteration may be active for a set at any given time.
+
+@node The SET Intersection Peer-to-Peer Protocol
+@subsection The SET Intersection Peer-to-Peer Protocol
+
+@c %**end of header
+
+The intersection protocol operates over CADET and starts with a
+@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the
+peer initiating the operation to the peer listening for inbound
+requests.
+It includes the number of elements of the initiating peer, which is
+used to decide which side will send a Bloom filter first.
+
+The listening peer checks if the operation type and application
+identifier are acceptable for its current state.
+If not, it responds with a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a
+status of @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET
+channel).
+
+If the application accepts the request, the listener sends back a
+@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} if it has
+more elements in the set than the client.
+Otherwise, it immediately starts with the Bloom filter exchange.
+If the initiator receives a
+@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} response,
+it begins the Bloom filter exchange, unless the set size is indicated
+to be zero, in which case the intersection is considered finished after
+just the initial handshake.
+
+
+@menu
+* The Bloom filter exchange::
+* Salt::
+@end menu
+
+@node The Bloom filter exchange
+@subsubsection The Bloom filter exchange
+
+@c %**end of header
+
+In this phase, each peer transmits a Bloom filter over the remaining
+keys of the local set to the other peer using a
+@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF} message. This
+message additionally includes the number of elements left in the
+sender's set, as well as the XOR over all of the keys in that set.
+
+The number of bits 'k' set per element in the Bloom filter is
+calculated based on the relative size of the two sets.
+Furthermore, the size of the Bloom filter is calculated based on 'k'
+and the number of elements in the set to maximize the amount of data
+filtered per byte transmitted on the wire (while avoiding an
+excessively high number of iterations).
+
+The receiver of the message removes all elements from its local set
+that do not pass the Bloom filter test.
+It then checks if the set size of the sender and the XOR over the keys
+match what is left of its own set. If they do, it sends a
+@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE} back to indicate
+that the latest set is the final result.
+Otherwise, the receiver starts another Bloom filter exchange, except
+this time as the sender.
+
+@node Salt
+@subsubsection Salt
+
+@c %**end of header
+
+Bloom filter operations are probabilistic: with some non-zero
+probability the test may incorrectly say an element is in the set, even
+though it is not.
+
+To mitigate this problem, the intersection protocol iterates,
+exchanging Bloom filters using a different random 32-bit salt in each
+iteration (the salt is also included in the message).
+With different salts, set operations may fail for different elements.
+By merging the results of the successive executions, the probability of
+failure drops towards zero.
+
+The iterations terminate once both peers have established that they
+have sets of the same size, and where the XOR over all keys computes
+the same 512-bit value (leaving a failure probability of
+@math{2^{-511}}).
+
+@node The SET Union Peer-to-Peer Protocol
+@subsection The SET Union Peer-to-Peer Protocol
+
+@c %**end of header
+
+The SET union protocol is based on Eppstein's efficient set
+reconciliation without prior context. You should read this paper first
+if you want to understand the protocol.
+
+The union protocol operates over CADET and starts with a
+@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the
+peer initiating the operation to the peer listening for inbound
+requests.
+It includes the number of elements of the initiating peer, which is
+currently not used.
+
+The listening peer checks if the operation type and application
+identifier are acceptable for its current state. If not, it responds
+with a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
+@code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
+
+If the application accepts the request, it sends back a strata
+estimator using a message of type
+@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE}. The initiator evaluates
+the strata estimator and initiates the exchange of invertible Bloom
+filters, sending a @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
+
+During the IBF exchange, if the receiver cannot invert the Bloom filter
+or detects a cycle, it sends a larger IBF in response (up to a defined
+maximum limit; if that limit is reached, the operation fails).
+Elements decoded while processing the IBF are transmitted to the other
+peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}, or requested
+from the other peer using
+@code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS} messages, depending
+on the sign observed during decoding of the IBF.
+Peers respond to a @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
+message with the respective element in a
+@code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS} message. If the IBF fully
+decodes, the peer responds with a
+@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE} message instead of
+another @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
+
+All Bloom filter operations use a salt to mingle keys before hashing
+them into buckets, such that future iterations have a fresh chance of
+succeeding if they failed due to collisions before.
+
+@cindex STATISTICS Subsystem
+@node STATISTICS Subsystem
+@section STATISTICS Subsystem
+
+@c %**end of header
+
+In GNUnet, the STATISTICS subsystem offers a central place for all
+subsystems to publish unsigned 64-bit integer run-time statistics.
+Keeping this information centrally means that there is a unified way
+for the user to obtain data on all subsystems, and individual
+subsystems do not have to always include a custom data export method
+for performance metrics and other statistics. For example, the
+TRANSPORT subsystem uses STATISTICS to update information about the
+number of directly connected peers and the bandwidth that has been
+consumed by the various plugins.
+This information is valuable for diagnosing connectivity and
+performance issues.
+
+Following the GNUnet service architecture, the STATISTICS subsystem is
+divided into an API which is exposed through the header
+@strong{gnunet_statistics_service.h} and the STATISTICS service
+@strong{gnunet-service-statistics}. The @strong{gnunet-statistics}
+command-line tool can be used to obtain (and change) information about
+the values stored by the STATISTICS service. The STATISTICS service
+does not communicate with other peers.
+
+Data is stored in the STATISTICS service in the form of tuples
+@strong{(subsystem, name, value, persistence)}. The subsystem
+determines to which GNUnet subsystem the data belongs. The name is the
+name with which the value is associated.
It uniquely identifies the record
+among other records belonging to the same subsystem.
+In some parts of the code, the pair @strong{(subsystem, name)} is
+called a @strong{statistic} as it identifies the values stored in the
+STATISTICS service. The persistence flag determines whether the record
+has to be preserved across service restarts. A record is said to be
+persistent if this flag is set for it; if not, the record is treated as
+a non-persistent record and it is lost after a service restart.
+Persistent records are written to and read from the file
+@strong{statistics.data} before shutdown and upon startup. The file is
+located in the HOME directory of the peer.
+
+An anomaly of the STATISTICS service is that it does not terminate
+immediately upon receiving a shutdown signal if it has any clients
+connected to it. It waits for all the clients that are not monitors to
+close their connections before terminating itself.
+This is to prevent the loss of data during peer shutdown --- delaying
+the STATISTICS service shutdown helps other services to store important
+data to STATISTICS during shutdown.
+
+@menu
+* libgnunetstatistics::
+* The STATISTICS Client-Service Protocol::
+@end menu
+
+@cindex libgnunetstatistics
+@node libgnunetstatistics
+@subsection libgnunetstatistics
+
+@c %**end of header
+
+@strong{libgnunetstatistics} is the library containing the API for the
+STATISTICS subsystem. Any process requiring to use STATISTICS should
+use this API to open a connection to the STATISTICS service.
+This is done by calling the function @code{GNUNET_STATISTICS_create()}.
+This function takes the name of the subsystem that wants to use
+STATISTICS, and a configuration handle.
+All values written to STATISTICS with this connection will be placed in
+the section corresponding to the given subsystem's name.
+The connection to STATISTICS can be destroyed with the function
+@code{GNUNET_STATISTICS_destroy()}.
This function allows the
+connection to be destroyed immediately or upon transferring all
+pending write requests to the service.
+
+Note: the STATISTICS subsystem can be disabled by setting
+@code{DISABLE = YES} under the @code{[STATISTICS]} section in the
+configuration. With such a configuration, all calls to
+@code{GNUNET_STATISTICS_create()} return @code{NULL} as the STATISTICS
+subsystem is unavailable and no other functions from the API can be
+used.
+
+
+@menu
+* Statistics retrieval::
+* Setting statistics and updating them::
+* Watches::
+@end menu
+
+@node Statistics retrieval
+@subsubsection Statistics retrieval
+
+@c %**end of header
+
+Once a connection to the STATISTICS service is obtained, information
+about any other subsystem which uses STATISTICS can be retrieved with
+the function @code{GNUNET_STATISTICS_get()}.
+This function takes the connection handle, the name of the subsystem
+whose information we are interested in (a @code{NULL} value will
+retrieve information of all available subsystems using STATISTICS), the
+name of the statistic we are interested in (a @code{NULL} value will
+retrieve all available statistics), a continuation callback which is
+called when all of the requested information has been retrieved, an
+iterator callback which is called for each parameter in the retrieved
+information, and a closure for the aforementioned callbacks. The
+library then invokes the iterator callback for each value matching the
+request.
+
+A call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be
+canceled with the function @code{GNUNET_STATISTICS_get_cancel()}.
+This is helpful when retrieving statistics takes too long and
+especially when we want to shut down and clean up everything.
+
+@node Setting statistics and updating them
+@subsubsection Setting statistics and updating them
+
+@c %**end of header
+
+So far we have seen how to retrieve statistics; here we will learn how
+we can set statistics and update them so that other subsystems can
+retrieve them.
+
+A new statistic can be set using the function
+@code{GNUNET_STATISTICS_set()}.
+This function takes the name of the statistic, its value, and a flag to
+make the statistic persistent.
+The value of the statistic should be of the type @code{uint64_t}.
+The function does not take the name of the subsystem; it is determined
+from the previous @code{GNUNET_STATISTICS_create()} invocation. If
+the given statistic is already present, its value is overwritten.
+
+An existing statistic can be updated, i.e.@: its value can be increased
+or decreased by an amount, with the function
+@code{GNUNET_STATISTICS_update()}.
+The parameters to this function are similar to
+@code{GNUNET_STATISTICS_set()}, except that it takes the amount to be
+changed as a type @code{int64_t} instead of the value.
+
+The library will combine multiple set or update operations into one
+message if the client performs requests at a rate that is faster than
+the available IPC with the STATISTICS service. Thus, the client does
+not have to worry about sending requests too quickly.
+
+@node Watches
+@subsubsection Watches
+
+@c %**end of header
+
+An interesting feature of STATISTICS is the serving of notifications
+whenever a statistic of our interest is modified.
+This is achieved by registering a watch through the function
+@code{GNUNET_STATISTICS_watch()}.
+The parameters of this function are similar to those of
+@code{GNUNET_STATISTICS_get()}.
+Changes to the respective statistic's value will then cause the given
+iterator callback to be called.
+Note: a watch can only be registered for a specific statistic. Hence
+the subsystem name and the parameter name cannot be @code{NULL} in a
+call to @code{GNUNET_STATISTICS_watch()}.
+
+A registered watch will keep delivering notifications about value
+changes until @code{GNUNET_STATISTICS_watch_cancel()} is called with
+the same parameters that were used for registering the watch.
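The set/update semantics above can be sketched with a minimal model. This is not the real client-service machinery: the @code{struct Stat} type and helper names are hypothetical, and the assumption that a negative update saturates at zero (rather than wrapping the unsigned value) is this sketch's own.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one statistic; the real tuples also carry a subsystem
 * name, which the service derives from the client connection. */
struct Stat {
  char name[64];
  uint64_t value;   /* statistics are unsigned 64-bit integers */
  int persistent;
};

/* GNUNET_STATISTICS_set()-like: overwrite with an absolute value. */
void stat_set (struct Stat *s, uint64_t value)
{
  s->value = value;
}

/* GNUNET_STATISTICS_update()-like: apply a signed relative change.
 * Sketch assumption: underflow clamps to zero; a delta of INT64_MIN
 * is not handled here. */
void stat_update (struct Stat *s, int64_t delta)
{
  if (delta >= 0)
    s->value += (uint64_t) delta;
  else if ((uint64_t) (-delta) > s->value)
    s->value = 0;
  else
    s->value -= (uint64_t) (-delta);
}

/* Scenario helper: absolute set followed by two relative updates. */
uint64_t demo_value (uint64_t start, int64_t d1, int64_t d2)
{
  struct Stat s = { "demo", 0, 0 };
  stat_set (&s, start);
  stat_update (&s, d1);
  stat_update (&s, d2);
  return s.value;
}
```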
+
+@node The STATISTICS Client-Service Protocol
+@subsection The STATISTICS Client-Service Protocol
+@c %**end of header
+
+
+@menu
+* Statistics retrieval2::
+* Setting and updating statistics::
+* Watching for updates::
+@end menu
+
+@node Statistics retrieval2
+@subsubsection Statistics retrieval2
+
+@c %**end of header
+
+To retrieve statistics, the client transmits a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given
+subsystem name and statistic parameter to the STATISTICS service.
+The service responds with a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each of the statistics
+parameters that match the client's request. The end of the retrieved
+information is signaled by the service by sending a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
+
+@node Setting and updating statistics
+@subsubsection Setting and updating statistics
+
+@c %**end of header
+
+The subsystem name, parameter name, its value and the persistence flag
+are communicated to the service through the message
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
+
+When the service receives a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem
+name and checks for a statistic parameter whose name matches the one
+given in the message.
+If a statistic parameter is found, its value is overwritten by the new
+value from the message; if none is found, a new statistic parameter is
+created with the given name and value.
+
+In addition to just setting an absolute value, it is possible to
+perform a relative update by sending a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
+(@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value
+in the message should be treated as an update value.
+
+@node Watching for updates
+@subsubsection Watching for updates
+
+@c %**end of header
+
+@code{GNUNET_STATISTICS_watch()} registers the watch with the service
+by sending a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}. The service then sends
+notifications through messages of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the
+statistic parameter's value is changed.
+
+@cindex DHT
+@cindex Distributed Hash Table
+@node Distributed Hash Table (DHT)
+@section Distributed Hash Table (DHT)
+
+@c %**end of header
+
+GNUnet includes a generic distributed hash table that can be used by
+developers building P2P applications in the framework.
+This section documents high-level features and how developers are
+expected to use the DHT.
+We have a research paper detailing how the DHT works.
+Also, Nate's thesis includes a detailed description and performance
+analysis (in chapter 6).
+
+Key features of GNUnet's DHT include:
+
+@itemize @bullet
+@item stores key-value pairs with values up to (approximately) 63k in
+size
+@item works with many underlay network topologies (small-world, random
+graph); the underlay does not need to be a full mesh / clique
+@item support for extended queries (more than just a simple 'key'),
+filtering duplicate replies within the network (Bloom filter) and
+content validation (for details, please read the subsection on the
+block library)
+@item can (optionally) return paths taken by the PUT and GET operations
+to the application
+@item provides content replication to handle churn
+@end itemize
+
+GNUnet's DHT is randomized and unreliable. Unreliable means that there
+is no strict guarantee that a value stored in the DHT is always
+found --- values are only found with high probability.
+While this is somewhat true in all P2P DHTs, GNUnet developers should
+be particularly wary of this fact (this will help you write secure,
+fault-tolerant code).
Thus, when writing any application using the DHT,
+you should always consider the possibility that a value stored in the
+DHT by you or some other peer might simply not be returned, or be
+returned with a significant delay.
+Your application logic must be written to tolerate this (naturally,
+some loss of performance or quality of service is expected in this
+case).
+
+@menu
+* Block library and plugins::
+* libgnunetdht::
+* The DHT Client-Service Protocol::
+* The DHT Peer-to-Peer Protocol::
+@end menu
+
+@node Block library and plugins
+@subsection Block library and plugins
+
+@c %**end of header
+
+@menu
+* What is a Block?::
+* The API of libgnunetblock::
+* Queries::
+* Sample Code::
+* Conclusion2::
+@end menu
+
+@node What is a Block?
+@subsubsection What is a Block?
+
+@c %**end of header
+
+Blocks are small (< 63k) pieces of data stored under a key
+(@code{struct GNUNET_HashCode}). Blocks have a type
+(@code{enum GNUNET_BlockType}) which defines their data format. Blocks
+are used in GNUnet as units of static data exchanged between peers and
+stored (or cached) locally.
+Uses of blocks include file-sharing (the files are broken up into
+blocks), the VPN (DNS information is stored in blocks) and the DHT (all
+information in the DHT and meta-information for the maintenance of the
+DHT are both stored using blocks).
+The block subsystem provides a few common functions that must be
+available for any type of block.
+
+@cindex libgnunetblock API
+@node The API of libgnunetblock
+@subsubsection The API of libgnunetblock
+
+@c %**end of header
+
+The block library requires for each (family of) block type(s) a block
+plugin (implementing @file{gnunet_block_plugin.h}) that provides basic
+functions that are needed by the DHT (and possibly other subsystems)
+to manage the block.
+These block plugins are typically implemented within their respective
+subsystems.
+The main block library is then used to locate, load and query the
+appropriate block plugin.
+Which plugin is appropriate is determined by the block type (which is
+just a 32-bit integer). Block plugins contain code that specifies which
+block types are supported by a given plugin. The block library loads
+all block plugins that are installed at the local peer and forwards the
+application request to the respective plugin.
+
+The central functions of the block APIs (plugin and main library) are
+to allow the mapping of blocks to their respective key (if possible)
+and the ability to check that a block is well-formed and matches a
+given request (again, if possible).
+This way, GNUnet can avoid storing invalid blocks, storing blocks under
+the wrong key and forwarding blocks in response to a query that they do
+not answer.
+
+One key function of block plugins is to allow GNUnet to detect
+duplicate replies (via the Bloom filter). All plugins MUST support
+detecting duplicate replies (by adding the current response to the
+Bloom filter and rejecting it if it is encountered again).
+If a plugin fails to do this, responses may loop in the network.
+
+@node Queries
+@subsubsection Queries
+@c %**end of header
+
+The query format for any block in GNUnet consists of four main
+components.
+First, the type of the desired block must be specified. Second, the
+query must contain a hash code. The hash code is used for lookups in
+hash tables and databases and need not be unique for the block
+(however, if possible a unique hash should be used as this would be
+best for performance).
+Third, an optional Bloom filter can be specified to exclude known
+results; replies that hash to the bits set in the Bloom filter are
+considered invalid. False-positives can be eliminated by sending the
+same query again with a different Bloom filter mutator value, which
+parameterizes the hash function that is used.
+Finally, an optional application-specific "eXtended query" (xquery)
+can be specified to further constrain the results.
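The four query components can be sketched as a plain structure, together with a toy salted hash showing how a mutator value re-parameterizes the Bloom filter hashing. Both the structure layout and @code{mutated_hash} are illustrative assumptions, not GNUnet's actual wire format or hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: the four components of a block query. */
struct BlockQuery {
  uint32_t type;          /* desired block type (a 32-bit integer) */
  uint8_t key[64];        /* hash code used for lookups */
  const uint8_t *bf;      /* optional Bloom filter over known replies */
  size_t bf_len;          /* 0 if no Bloom filter is present */
  uint32_t bf_mutator;    /* salts the hash used with the Bloom filter */
  const uint8_t *xquery;  /* optional type-specific extended query */
  size_t xquery_len;
};

/* Toy salted hash: seeding with the mutator makes the same key map to
 * different filter bits on a retry, eliminating false positives. */
uint64_t mutated_hash (const uint8_t *key, size_t len, uint32_t mutator)
{
  uint64_t h = mutator;
  for (size_t i = 0; i < len; i++)
    h = h * 31 + key[i];
  return h;
}

/* Example key bytes for demonstration. */
static const uint8_t demo_key[4] = { 0xde, 0xad, 0xbe, 0xef };
```

With this construction, repeating a query with a different mutator deterministically changes the hash of every key, which is the property the retry mechanism relies on.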
It is entirely up to +the type-specific plugin to determine whether or not a given block +matches a query (type, hash, Bloom filter, and xquery). +Naturally, not all xqueries are valid and some types of blocks may not +support Bloom filters either, so the plugin also needs to check if the +query is valid in the first place. + +Depending on the results from the plugin, the DHT will then discard the +(invalid) query, forward the query, discard the (invalid) reply, cache the +(valid) reply, and/or forward the (valid and non-duplicate) reply. + +@node Sample Code +@subsubsection Sample Code + +@c %**end of header + +The source code in @strong{plugin_block_test.c} is a good starting point +for new block plugins --- it does the minimal work by implementing a +plugin that performs no validation at all. +The respective @strong{Makefile.am} shows how to build and install a +block plugin. + +@node Conclusion2 +@subsubsection Conclusion2 + +@c %**end of header + +In conclusion, GNUnet subsystems that want to use the DHT need to define a +block format and write a plugin to match queries and replies. For testing, +the @code{GNUNET_BLOCK_TYPE_TEST} block type can be used; it accepts +any query as valid and any reply as matching any query. +This type is also used for the DHT command line tools. +However, it should NOT be used for normal applications due to the lack +of error checking that results from this primitive implementation. + +@cindex libgnunetdht +@node libgnunetdht +@subsection libgnunetdht + +@c %**end of header + +The DHT API itself is pretty simple and offers the usual GET and PUT +functions that work as expected. The specified block type refers to the +block library which allows the DHT to run application-specific logic for +data stored in the network.
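The type-based plugin dispatch performed by the block library can be modeled in a few lines of C. This is a simplified sketch, not the actual @code{gnunet_block_plugin.h} interface: the struct layout, the result codes and the type number used here are invented for illustration. The "test" plugin mirrors the spirit of @code{GNUNET_BLOCK_TYPE_TEST} by accepting every block without validation.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical, simplified model of a block plugin: each plugin
   announces the block types it supports and provides an evaluation
   function for those types. */
enum EvalResult { EVAL_OK, EVAL_INVALID, EVAL_UNSUPPORTED };

struct BlockPlugin {
  const uint32_t *types;   /* zero-terminated list of supported types */
  enum EvalResult (*evaluate) (uint32_t type,
                               const void *block, size_t size);
};

/* A "test" plugin: it accepts any block without validation. */
static enum EvalResult
test_evaluate (uint32_t type, const void *block, size_t size)
{
  (void) type; (void) block; (void) size;
  return EVAL_OK;
}

static const uint32_t test_types[] = { 8 /* invented type number */, 0 };
static const struct BlockPlugin test_plugin = { test_types, test_evaluate };

/* The library's job: find the plugin for a type, forward the call,
   or report that no plugin supports the type. */
static enum EvalResult
block_evaluate (const struct BlockPlugin **plugins, size_t n,
                uint32_t type, const void *block, size_t size)
{
  for (size_t i = 0; i < n; i++)
    for (const uint32_t *t = plugins[i]->types; 0 != *t; t++)
      if (*t == type)
        return plugins[i]->evaluate (type, block, size);
  return EVAL_UNSUPPORTED;
}
```

A real plugin's evaluation function additionally receives the key, the Bloom filter, the mutator and the xquery described in the Queries subsection; the dispatch-by-type structure is the part this sketch illustrates.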
+ + +@menu +* GET:: +* PUT:: +* MONITOR:: +* DHT Routing Options:: +@end menu + +@node GET +@subsubsection GET + +@c %**end of header + +When using GET, the main consideration for developers (other than the +block library) should be that after issuing a GET, the DHT will +continuously cause (small amounts of) network traffic until the operation +is explicitly canceled. +So GET does not simply send out a single network request once; instead, +the DHT will continue to search for data. This is needed to achieve good +success rates and also handles the case where the respective PUT +operation happens after the GET operation was started. +Developers should not cancel an existing GET operation and then +explicitly re-start it to trigger a new round of network requests; +this is simply inefficient, especially as the internal automated version +can be more efficient, for example by filtering results in the network +that have already been returned. + +If an application that performs a GET request has a set of replies that it +already knows and would like to filter, it can call@ +@code{GNUNET_DHT_get_filter_known_results} with an array of hashes over +the respective blocks to tell the DHT that these results are not +desired (any more). +This way, the DHT will filter the respective blocks using the block +library in the network, which may result in a significant reduction in +bandwidth consumption. + +@node PUT +@subsubsection PUT + +@c %**end of header + +In contrast to GET operations, developers @strong{must} manually re-run +PUT operations periodically (if they intend the content to continue to be +available). Content stored in the DHT expires or might be lost due to +churn. +Furthermore, GNUnet's DHT typically requires multiple rounds of PUT +operations before a key-value pair is consistently available to all +peers (the DHT randomizes paths and thus storage locations, and only +after multiple rounds of PUTs will there be a sufficient number of +replicas in large DHTs).
An explicit PUT operation using the DHT API will +only cause network traffic once, so in order to ensure basic availability +and resistance to churn (and adversaries), PUTs must be repeated. +While the exact frequency depends on the application, a rule of thumb is +that there should be at least a dozen PUT operations within the content +lifetime. Content in the DHT typically expires after one day, so +DHT PUT operations should be repeated at least every 1-2 hours. + +@node MONITOR +@subsubsection MONITOR + +@c %**end of header + +The DHT API also allows applications to monitor messages crossing the +local DHT service. +The types of messages used by the DHT are GET, PUT and RESULT messages. +Using the monitoring API, applications can choose to monitor these +requests, possibly limiting themselves to requests for a particular block +type. + +The monitoring API is useful not only for diagnostics; it can also be +used to trigger application operations based on PUT operations. +For example, an application may use PUTs to distribute work requests to +other peers. +The workers would then monitor for PUTs that give them work, instead of +looking for work using GET operations. +This can be beneficial, especially if the workers have no good way to +guess the keys under which work would be stored. +Naturally, additional protocols might be needed to ensure that the desired +number of workers will process the distributed workload. + +@node DHT Routing Options +@subsubsection DHT Routing Options + +@c %**end of header + +There are two important options for GET and PUT requests: + +@table @asis +@item GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE This option means that all +peers should process the request, even if their peer ID is not closest to +the key. For a PUT request, this means that all peers that a request +traverses may make a copy of the data. +Similarly for a GET request, all peers will check their local database +for a result.
Setting this option can thus significantly improve caching +and reduce bandwidth consumption --- at the expense of a larger DHT +database. If in doubt, we recommend using this option. +@item GNUNET_DHT_RO_RECORD_ROUTE This option instructs the DHT to record +the path that a GET or a PUT request is taking through the overlay +network. The resulting paths are then returned to the application with +the respective result. This allows the receiver of a result to construct +a path to the originator of the data, which might then be used for +routing. Naturally, setting this option requires additional bandwidth +and disk space, so applications should only set this if the paths are +needed by the application logic. +@item GNUNET_DHT_RO_FIND_PEER This option is an internal option used by +the DHT's peer discovery mechanism and should not be used by applications. +@item GNUNET_DHT_RO_BART This option is currently not implemented. It may +in the future offer performance improvements for clique topologies. +@end table + +@node The DHT Client-Service Protocol +@subsection The DHT Client-Service Protocol + +@c %**end of header + +@menu +* PUTting data into the DHT:: +* GETting data from the DHT:: +* Monitoring the DHT:: +@end menu + +@node PUTting data into the DHT +@subsubsection PUTting data into the DHT + +@c %**end of header + +To store (PUT) data into the DHT, the client sends a +@code{struct GNUNET_DHT_ClientPutMessage} to the service. +This message specifies the block type, routing options, the desired +replication level, the expiration time, key, +value and a 64-bit unique ID for the operation. The service responds with +a @code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same +64-bit unique ID. Note that the service sends the confirmation as soon as +it has locally processed the PUT request. The PUT may still be +propagating through the network at this time.
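The way a client can match confirmations to outstanding PUTs via the 64-bit unique ID can be sketched as follows. This is not GNUnet code; the table, function names and fixed capacity are invented for illustration of the bookkeeping idea.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical client-side bookkeeping: each outstanding PUT is
   remembered under its unique 64-bit ID until the confirmation
   carrying the same ID arrives from the service. */
#define MAX_PENDING 16

struct PendingPut { uint64_t unique_id; int in_use; };
static struct PendingPut pending[MAX_PENDING];

/* Remember a PUT we just sent; returns 0 on success. */
static int
put_register (uint64_t unique_id)
{
  for (size_t i = 0; i < MAX_PENDING; i++)
    if (!pending[i].in_use) {
      pending[i].unique_id = unique_id;
      pending[i].in_use = 1;
      return 0;
    }
  return -1; /* too many outstanding PUTs */
}

/* Called when a confirmation with the given ID arrives;
   returns 1 if it matched a pending operation, 0 otherwise. */
static int
put_confirm (uint64_t unique_id)
{
  for (size_t i = 0; i < MAX_PENDING; i++)
    if (pending[i].in_use && pending[i].unique_id == unique_id) {
      pending[i].in_use = 0;
      return 1;
    }
  return 0; /* unknown ID; ignore */
}
```

Because the ID travels in both the request and the confirmation, the client can keep many PUTs in flight over a single connection and retire each one individually when its confirmation arrives.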
+ +In the future, we may want +to change this to provide (limited) feedback +to the client, for example if we detect that the PUT operation had no +effect because the same key-value pair was already stored in the DHT. +However, changing this would also require additional state and messages +in the P2P interaction. + +@node GETting data from the DHT +@subsubsection GETting data from the DHT + +@c %**end of header + +To retrieve (GET) data from the DHT, the client sends a +@code{struct GNUNET_DHT_ClientGetMessage} to the service. The message +specifies routing options, a replication level (for replicating the GET, +not the content), the desired block type, the key, the (optional) +extended query and a unique 64-bit request ID. + +Additionally, the client may send any number of +@code{struct GNUNET_DHT_ClientGetResultSeenMessage}s to notify the +service about results that the client is already aware of. +These messages consist of the key, the unique 64-bit ID of the request, +and an arbitrary number of hash codes over the blocks that the client is +already aware of. As messages are restricted to 64k, a client that +already knows more than about a thousand blocks may need to send +several of these messages. Naturally, the client should transmit these +messages as quickly as possible after the original GET request such that +the DHT can filter those results in the network early on. Of course, as +these messages are sent after the original request, it is conceivable +that the DHT service may return blocks that match those already known +to the client anyway. + +In response to a GET request, the service will send @code{struct +GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the +block type, expiration, key, unique ID of the request and of course the +value (a block). Depending on the options set for the respective +operations, the replies may also contain the path the GET and/or the PUT +took through the network.
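The "about a thousand" figure follows directly from the message size cap: a hash code (@code{struct GNUNET_HashCode}) is 64 bytes, and messages are limited to 64k. The following back-of-the-envelope sketch makes the arithmetic explicit; the header size is a rough assumption, not the exact wire layout.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Rough capacity estimate for a ClientGetResultSeenMessage-style
   message: 64k total, minus an ASSUMED header (message header, key
   and 64-bit request ID -- not the exact wire format), divided by
   the 64-byte size of a hash code. */
#define MAX_MESSAGE_SIZE (64 * 1024)
#define HASH_SIZE 64            /* 512-bit GNUNET_HashCode */
#define ASSUMED_HEADER_SIZE 80  /* invented for this estimate */

static size_t
hashes_per_message (void)
{
  return (MAX_MESSAGE_SIZE - ASSUMED_HEADER_SIZE) / HASH_SIZE;
}

/* Number of messages needed to report n already-known blocks,
   rounding up to cover the remainder. */
static size_t
messages_needed (size_t n)
{
  size_t per = hashes_per_message ();
  return (n + per - 1) / per;
}
```

With these assumptions a single message carries just over a thousand hashes, so a client knowing a few thousand results splits them across a handful of messages.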
+ +A client can stop receiving replies either by disconnecting or by sending +a @code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the +key and the 64-bit unique ID of the original request. Using an +explicit "stop" message is more common as this allows a client to run +many concurrent GET operations over the same connection with the DHT +service --- and to stop them individually. + +@node Monitoring the DHT +@subsubsection Monitoring the DHT + +@c %**end of header + +To begin monitoring, the client sends a +@code{struct GNUNET_DHT_MonitorStartStop} message to the DHT service. +In this message, flags can be set to enable (or disable) monitoring of +GET, PUT and RESULT messages that pass through a peer. The message can +also restrict monitoring to a particular block type or a particular key. +Once monitoring is enabled, the DHT service will notify the client about +any matching event using @code{struct GNUNET_DHT_MonitorGetMessage}s for +GET events, @code{struct GNUNET_DHT_MonitorPutMessage} for PUT events +and @code{struct GNUNET_DHT_MonitorGetRespMessage} for RESULTs. Each of +these messages contains all of the information about the event. + +@node The DHT Peer-to-Peer Protocol +@subsection The DHT Peer-to-Peer Protocol +@c %**end of header + + +@menu +* Routing GETs or PUTs:: +* PUTting data into the DHT2:: +* GETting data from the DHT2:: +@end menu + +@node Routing GETs or PUTs +@subsubsection Routing GETs or PUTs + +@c %**end of header + +When routing GETs or PUTs, the DHT service selects a suitable subset of +neighbours for forwarding. The exact number of neighbours can be zero or +more and depends on the hop counter of the query (initially zero) in +relation to the (log of the) network size estimate, the desired +replication level and the peer's connectivity. +Depending on the hop counter and our network size estimate, the selection +of the peers may be randomized or based on proximity to the key.
+Furthermore, requests include a set of peers that a request has already +traversed; those peers are also excluded from the selection. + +@node PUTting data into the DHT2 +@subsubsection PUTting data into the DHT2 + +@c %**end of header + +To PUT data into the DHT, the service sends a @code{struct PeerPutMessage} +of type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective +neighbour. +In addition to the usual information about the content (type, routing +options, desired replication level for the content, expiration time, key +and value), the message contains a fixed-size Bloom filter with +information about which peers (may) have already seen this request. +This Bloom filter is used to ensure that DHT messages never loop back to +a peer that has already processed the request. +Additionally, the message includes the current hop counter and, depending +on the routing options, the message may include the full path that the +message has taken so far. +The Bloom filter should already contain the identity of the previous hop; +however, the path should not include the identity of the previous hop and +the receiver should append the identity of the sender to the path, not +its own identity (this is done to reduce bandwidth). + +@node GETting data from the DHT2 +@subsubsection GETting data from the DHT2 + +@c %**end of header + +A peer can search the DHT by sending @code{struct PeerGetMessage}s of type +@code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the +usual information about the request (type, routing options, desired +replication level for the request, the key and the extended query), a GET +request also again contains a hop counter, a Bloom filter over the peers +that have processed the request already and depending on the routing +options the full path traversed by the GET. 
+Finally, a GET request includes a variable-size second Bloom filter and a +so-called Bloom filter mutator value which together indicate which +replies the sender has already seen. During the lookup, each block that +matches the block type, key and extended query is additionally subjected +to a test against this Bloom filter. +The block plugin is expected to take the hash of the block and combine it +with the mutator value and check if the result is not yet in the Bloom +filter. The originator of the query will from time to time modify the +mutator to (eventually) allow false-positives filtered by the Bloom filter +to be returned. + +Peers that receive a GET request perform a local lookup (depending on +their proximity to the key and the query options) and forward the request +to other peers. +They then remember the request (including the Bloom filter for blocking +duplicate results) and when they obtain a matching, non-filtered response +a @code{struct PeerResultMessage} of type +@code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous +hop. +Whenever a result is forwarded, the block plugin is used to update the +Bloom filter accordingly, to ensure that the same result is never +forwarded more than once. +The DHT service may also cache forwarded results locally if the +"CACHE_RESULTS" option is set to "YES" in the configuration. + +@cindex GNS +@cindex GNU Name System +@node GNU Name System (GNS) +@section GNU Name System (GNS) + +@c %**end of header + +The GNU Name System (GNS) is a decentralized database that enables users +to securely resolve names to values. +Names can be used to identify other users (for example, in social +networking), or network services (for example, VPN services running at a +peer in GNUnet, or purely IP-based services on the Internet). +Users interact with GNS by typing in a hostname that ends in ".gnu" +or ".zkey". + +Videos giving an overview of GNS and the motivations behind +it are available here and here.
+The remainder of this chapter targets developers who are familiar with +high level concepts of GNS as presented in these talks. +@c TODO: Add links to here and here and to these. + +GNS-aware applications should use the GNS resolver to obtain the +respective records that are stored under that name in GNS. +Each record consists of a type, value, expiration time and flags. + +The type specifies the format of the value. Types below 65536 correspond +to DNS record types; larger values are used for GNS-specific records. +Applications can define new GNS record types by reserving a number and +implementing a plugin (which mostly needs to convert the binary value +representation to a human-readable text format and vice-versa). +The expiration time specifies how long the record is to be valid. +The GNS API ensures that applications are only given non-expired values. +The flags are typically irrelevant for applications, as GNS uses them +internally to control visibility and validity of records. + +Records are stored along with a signature. +The signature is generated using the private key of the authoritative +zone. This allows any GNS resolver to verify the correctness of a +name-value mapping. + +Internally, GNS uses the NAMECACHE to cache information obtained from +other users, the NAMESTORE to store information specific to the local +users, and the DHT to exchange data between users. +A plugin API is used to enable applications to define new GNS +record types. + +@menu +* libgnunetgns:: +* libgnunetgnsrecord:: +* GNS plugins:: +* The GNS Client-Service Protocol:: +* Hijacking the DNS-Traffic using gnunet-service-dns:: +* Serving DNS lookups via GNS on W32:: +@end menu + +@node libgnunetgns +@subsection libgnunetgns + +@c %**end of header + +The GNS API itself is extremely simple. Clients first connect to the +GNS service using @code{GNUNET_GNS_connect}.
+They can then perform lookups using @code{GNUNET_GNS_lookup} or cancel +pending lookups using @code{GNUNET_GNS_lookup_cancel}. +Once finished, clients disconnect using @code{GNUNET_GNS_disconnect}. + +@menu +* Looking up records:: +* Accessing the records:: +* Creating records:: +* Future work:: +@end menu + +@node Looking up records +@subsubsection Looking up records + +@c %**end of header + +@code{GNUNET_GNS_lookup} takes a number of arguments: + +@table @asis +@item handle This is simply the GNS connection handle from +@code{GNUNET_GNS_connect}. +@item name The client needs to specify the name to +be resolved. This can be any valid DNS or GNS hostname. +@item zone The client +needs to specify the public key of the GNS zone against which the +resolution should be done (the ".gnu" zone). +Note that a key must be provided, even if the name ends in ".zkey". +This should typically be the public key of the master-zone of the user. +@item type This is the desired GNS or DNS record type +to look for. While all records for the given name will be returned, this +can be important if the client wants to resolve record types that +themselves delegate resolution, such as CNAME, PKEY or GNS2DNS. +Resolving a record of any of these types will only work if the respective +record type is specified in the request, as the GNS resolver will +otherwise follow the delegation and return the records from the +respective destination, instead of the delegating record. +@item only_cached This argument should typically be set to +@code{GNUNET_NO}. Setting it to @code{GNUNET_YES} disables resolution via +the overlay network. +@item shorten_zone_key If GNS encounters new names during resolution, +their respective zones can automatically be learned and added to the +"shorten zone". If this is desired, clients must pass the private key of +the shorten zone. If NULL is passed, shortening is disabled. +@item proc This argument identifies +the function to call with the result. 
It is given proc_cls, the number of +records found (possibly zero) and the array of the records as arguments. +proc will only be called once. After proc has been called, the lookup +must no longer be cancelled. +@item proc_cls The closure for proc. +@end table + +@node Accessing the records +@subsubsection Accessing the records + +@c %**end of header + +The @code{libgnunetgnsrecord} library provides an API to manipulate the +GNS record array that is given to proc. In particular, it offers +functions such as converting record values to human-readable +strings (and back). However, most @code{libgnunetgnsrecord} functions are +not interesting to GNS client applications. + +For DNS records, the @code{libgnunetdnsparser} library provides +functions for parsing (and serializing) common types of DNS records. + +@node Creating records +@subsubsection Creating records + +@c %**end of header + +Creating GNS records is typically done by building the respective record +information (possibly with the help of @code{libgnunetgnsrecord} and +@code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to +publish the information. The GNS API is not involved in this +operation. + +@node Future work +@subsubsection Future work + +@c %**end of header + +In the future, we want to expand @code{libgnunetgns} to allow +applications to observe shortening operations performed during GNS +resolution, for example so that users can receive visual feedback when +this happens. + +@node libgnunetgnsrecord +@subsection libgnunetgnsrecord + +@c %**end of header + +The @code{libgnunetgnsrecord} library is used to manipulate GNS +records (in plaintext or in their encrypted format). +Applications mostly interact with @code{libgnunetgnsrecord} by using the +functions to convert GNS record values to strings or vice-versa, or to +look up a GNS record type number by name (or vice-versa).
+The library also provides various other functions that are mostly +used internally within GNS, such as converting keys to names, checking for +expiration, encrypting GNS records to GNS blocks, verifying GNS block +signatures and decrypting GNS records from GNS blocks. + +We will now discuss the four commonly used functions of the API.@ +@code{libgnunetgnsrecord} does not perform these operations itself, +but instead uses plugins to perform the operation. +GNUnet includes plugins to support common DNS record types as well as +standard GNS record types. + +@menu +* Value handling:: +* Type handling:: +@end menu + +@node Value handling +@subsubsection Value handling + +@c %**end of header + +@code{GNUNET_GNSRECORD_value_to_string} can be used to convert +the (binary) representation of a GNS record value to a human readable, +0-terminated UTF-8 string. +NULL is returned if the specified record type is not supported by any +available plugin. + +@code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a +human readable string to the respective (binary) representation of +a GNS record value. + +@node Type handling +@subsubsection Type handling + +@c %**end of header + +@code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the +numeric value associated with a given typename. For example, given the +typename "A" (for DNS A records), the function will return the number 1. +A list of common DNS record types is available +@uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}. +Note that not all DNS record types are supported by GNUnet GNSRECORD +plugins at this time. + +@code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the +typename associated with a given numeric value. +For example, given the type number 1, the function will return the +typename "A". + +@node GNS plugins +@subsection GNS plugins + +@c %**end of header + +Adding a new GNS record type typically involves writing (or extending) a +GNSRECORD plugin.
The plugin needs to implement the +@code{gnunet_gnsrecord_plugin.h} API which provides basic functions that +are needed by GNSRECORD to convert typenames and values of the respective +record type to strings (and back). +These gnsrecord plugins are typically implemented within their respective +subsystems. +Examples for such plugins can be found in the GNSRECORD, GNS and +CONVERSATION subsystems. + +The @code{libgnunetgnsrecord} library is then used to locate, load and +query the appropriate gnsrecord plugin. +Which plugin is appropriate is determined by the record type (which is +just a 32-bit integer). The @code{libgnunetgnsrecord} library loads all +gnsrecord plugins that are installed at the local peer and forwards the +application request to the plugins. If the record type is not +supported by the plugin, it should simply return an error code. + +The central functions of the gnsrecord APIs (plugin and main library) are +the same four functions for converting between values and strings, and +typenames and numbers documented in the previous subsection. + +@node The GNS Client-Service Protocol +@subsection The GNS Client-Service Protocol +@c %**end of header + +The GNS client-service protocol consists of two simple messages, the +@code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP} +message contains a unique 32-bit identifier, which will be included in the +corresponding response. Thus, clients can send many lookup requests in +parallel and receive responses out-of-order. +A @code{LOOKUP} request also includes the public key of the GNS zone, +the desired record type and fields specifying whether shortening is +enabled or networking is disabled. Finally, the @code{LOOKUP} message +includes the name to be resolved. + +The response includes the number of records and the records themselves +in the format created by @code{GNUNET_GNSRECORD_records_serialize}. +They can thus be deserialized using +@code{GNUNET_GNSRECORD_records_deserialize}.
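The serialize/deserialize roundtrip at the end of the protocol can be illustrated with a toy encoding. The length-prefixed layout below is invented for the example and is NOT the actual wire format produced by @code{GNUNET_GNSRECORD_records_serialize}; it only shows the shape of the roundtrip a client performs.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Toy record: a 32-bit type, a 32-bit value length, and the value
   (capped at 16 bytes to keep the sketch simple). */
struct Record {
  uint32_t type;
  uint32_t size;
  uint8_t data[16];
};

/* Serialize n records into buf; returns bytes written, 0 on error. */
static size_t
records_serialize (const struct Record *rs, size_t n,
                   uint8_t *buf, size_t len)
{
  size_t off = 0;
  for (size_t i = 0; i < n; i++) {
    size_t need = 8 + rs[i].size;
    if (off + need > len || rs[i].size > sizeof rs[i].data)
      return 0;
    memcpy (buf + off, &rs[i].type, 4);
    memcpy (buf + off + 4, &rs[i].size, 4);
    memcpy (buf + off + 8, rs[i].data, rs[i].size);
    off += need;
  }
  return off;
}

/* Deserialize up to max records from buf; returns the record count,
   0 on malformed input. */
static size_t
records_deserialize (const uint8_t *buf, size_t len,
                     struct Record *rs, size_t max)
{
  size_t off = 0, n = 0;
  while (off + 8 <= len && n < max) {
    memcpy (&rs[n].type, buf + off, 4);
    memcpy (&rs[n].size, buf + off + 4, 4);
    if (rs[n].size > sizeof rs[n].data || off + 8 + rs[n].size > len)
      return 0;
    memcpy (rs[n].data, buf + off + 8, rs[n].size);
    off += 8 + rs[n].size;
    n++;
  }
  return n;
}
```

The real functions additionally handle expiration times and flags per record; the point here is that the response carries an opaque serialized array that the client turns back into records.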
+ +@node Hijacking the DNS-Traffic using gnunet-service-dns +@subsection Hijacking the DNS-Traffic using gnunet-service-dns + +@c %**end of header + +This section documents how the gnunet-service-dns (and the +gnunet-helper-dns) intercepts DNS queries from the local system. +This is merely one method for how we can obtain GNS queries. +It is also possible to change @code{resolv.conf} to point to a machine +running @code{gnunet-dns2gns} or to modify libc's name system switch +(NSS) configuration to include a GNS resolution plugin. +The method described in this chapter is more of a last-ditch catch-all +approach. + +@code{gnunet-service-dns} enables intercepting DNS traffic using +policy-based routing. +We MARK every outgoing DNS-packet if it was not sent by our application. +Using a second routing table in the Linux kernel, these marked packets are +then routed through our virtual network interface and can thus be +captured unchanged. + +Our application then reads the query and decides how to handle it: A +query to an address ending in ".gnu" or ".zkey" is hijacked by +@code{gnunet-service-gns} and resolved internally using GNS. +In the future, a reverse query for an address of the configured virtual +network could be answered with records kept about previous forward +queries. +Queries that are not hijacked by some application using the DNS service +will be sent to the original recipient. +The answer to the query will always be sent back through the virtual +interface with the original nameserver as source address.
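The hijack-or-forward decision boils down to a suffix check on the queried name. The following sketch shows that check; the function names are invented for illustration and are not part of the gnunet-service-dns source.

```c
#include <string.h>
#include <strings.h>
#include <assert.h>

/* Case-insensitive suffix test, as DNS names are case-insensitive. */
static int
has_suffix (const char *name, const char *suffix)
{
  size_t nl = strlen (name);
  size_t sl = strlen (suffix);
  return (nl >= sl) && (0 == strcasecmp (name + nl - sl, suffix));
}

/* Only names under the ".gnu" or ".zkey" pseudo-TLDs are handed to
   GNS; everything else is forwarded to the original DNS recipient. */
static int
is_gns_name (const char *name)
{
  return has_suffix (name, ".gnu") || has_suffix (name, ".zkey");
}
```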
+ + +@menu +* Network Setup Details:: +@end menu + +@node Network Setup Details +@subsubsection Network Setup Details + +@c %**end of header + +The DNS interceptor adds the following rules to the Linux kernel: +@example +iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT \ +  --dport 53 -j ACCEPT +iptables -t mangle -I OUTPUT 2 -p udp --dport 53 \ +  -j MARK --set-mark 3 +ip rule add fwmark 3 table2 +ip route add default via $VIRTUALDNS table2 +@end example + +Command 1 makes sure that all packets coming from a port our application +opened beforehand (@code{$LOCALPORT}) will be routed normally. +Command 2 marks every other packet destined for a DNS server with +mark 3 (chosen arbitrarily). Commands 3 and 4 then add a policy routing +rule and a default route so that packets carrying mark 3 are routed +via the virtual DNS interface (@code{$VIRTUALDNS}) using a second +routing table. + +@node Serving DNS lookups via GNS on W32 +@subsection Serving DNS lookups via GNS on W32 + +@c %**end of header + +This section documents how libw32nsp (and +gnunet-gns-helper-service-w32) resolve DNS queries on the +local system. This only applies to GNUnet running on W32. + +W32 has a concept of "Namespaces" and "Namespace providers". +These are used to present various name systems to applications in a +generic way. +Namespaces include DNS, mDNS, NLA and others. For each namespace any +number of providers could be registered, and they are queried in an order +of priority (which is adjustable). + +Applications can resolve names by using the WSALookupService*() family of +functions. + +However, these are WSA-only facilities. Common BSD socket functions for +namespace resolutions are gethostbyname and getaddrinfo (among others). +These functions are implemented internally (by default by mswsock, +which also implements the default DNS provider) as wrappers around +WSALookupService*() functions (see "Sample Code for a Service Provider" +on MSDN).
+ +On W32 GNUnet builds a libw32nsp - a namespace provider, which can then be +installed into the system by using w32nsp-install (and uninstalled by +w32nsp-uninstall), as described in "Installation Handbook". + +libw32nsp is very simple and has almost no dependencies. As a response to +NSPLookupServiceBegin(), it only checks that the provider GUID passed to +it by the caller matches the GNUnet DNS Provider GUID, checks that the +name being resolved ends in ".gnu" or ".zkey", then connects to +gnunet-gns-helper-service-w32 at 127.0.0.1:5353 (hardcoded) and sends the +name resolution request there, returning the connected socket to the +caller. + +When the caller invokes NSPLookupServiceNext(), libw32nsp reads a +completely formed reply from that socket, unmarshalls it, then gives +it back to the caller. + +At the moment gnunet-gns-helper-service-w32 is implemented to only ever +give one reply, and subsequent calls to NSPLookupServiceNext() will fail +with WSA_NODATA (the first call to NSPLookupServiceNext() might also fail +if GNS failed to find the name, or there was an error connecting to it). + +gnunet-gns-helper-service-w32 does most of the processing: + +@itemize @bullet +@item Maintains a connection to GNS. +@item Reads GNS config and loads appropriate keys. +@item Checks the service GUID and decides on the type of record to look +up, refusing to make a lookup outright when an unsupported service GUID +is passed. +@item Launches the lookup. +@end itemize + +When the lookup result arrives, gnunet-gns-helper-service-w32 forms a +complete +reply (including filling a WSAQUERYSETW structure and, possibly, a binary +blob with a hostent structure for a gethostbyname() client), marshalls it, +and sends it back to libw32nsp. If no records were found, it sends an +empty header.
+ +This works for most normal applications that use gethostbyname() or +getaddrinfo() to resolve names, but fails to do anything with +applications that use alternative means of resolving names (such as +sending queries to a DNS server directly by themselves). +This includes some well-known utilities, like "ping" and "nslookup". + +@cindex GNS Namecache +@node GNS Namecache +@section GNS Namecache + +@c %**end of header + +The NAMECACHE subsystem is responsible for caching (encrypted) resolution +results of the GNU Name System (GNS). GNS makes zone information available +to other users via the DHT. However, as accessing the DHT for every +lookup is expensive (and as the DHT's local cache is lost whenever the +peer is restarted), GNS uses the NAMECACHE as a more persistent cache for +DHT lookups. +Thus, instead of always looking up every name in the DHT, GNS first +checks if the result is already available locally in the NAMECACHE. +Only if there is no result in the NAMECACHE does GNS query the DHT. +The NAMECACHE stores data in the same (encrypted) format as the DHT. +It thus makes no sense to iterate over all items in the +NAMECACHE --- the NAMECACHE does not have a way to provide the keys +required to decrypt the entries. + +Blocks in the NAMECACHE share the same expiration mechanism as blocks in +the DHT --- the block expires whenever any of the records in +the (encrypted) block expires. +The expiration time of the block is the only information stored in +plaintext. The NAMECACHE service internally performs all of the required +work to expire blocks; clients do not have to worry about this. +Also, given that NAMECACHE stores only GNS blocks that local users +requested, there is no configuration option to limit the size of the +NAMECACHE. It is assumed to be always small enough (a few MB) to fit on +the drive.
 +The NAMECACHE supports the use of different database backends via a +plugin API.
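The expiration rule just described (a block expires as soon as ANY record inside it expires) means the block's plaintext expiration time is the minimum over the record expiration times. A minimal sketch, with plain integers standing in for GNUnet's absolute-time type:

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Compute the expiration time of a (NAMECACHE or DHT) block from the
   expiration times of the records it contains: the earliest record
   expiration wins, since the block is useless once any record in it
   has expired.  UINT64_MAX plays the role of "never". */
static uint64_t
block_expiration (const uint64_t *record_expirations, size_t n)
{
  uint64_t min = UINT64_MAX;   /* "never", if there are no records */
  for (size_t i = 0; i < n; i++)
    if (record_expirations[i] < min)
      min = record_expirations[i];
  return min;
}
```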
+

@menu
* libgnunetnamecache::
* The NAMECACHE Client-Service Protocol::
* The NAMECACHE Plugin API::
@end menu

@node libgnunetnamecache
@subsection libgnunetnamecache

@c %**end of header

The NAMECACHE API consists of five simple functions. First, there is
@code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service.
This returns the handle required for all other operations on the
NAMECACHE. Using @code{GNUNET_NAMECACHE_block_cache} clients can insert a
block into the cache.
@code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that
were stored in the NAMECACHE. Both operations can be cancelled using
@code{GNUNET_NAMECACHE_cancel}. Note that cancelling a
@code{GNUNET_NAMECACHE_block_cache} operation can result in the block
being stored in the NAMECACHE --- or not. Cancellation primarily ensures
that the continuation function with the result of the operation will no
longer be invoked.
Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to the
NAMECACHE.

The maximum size of a block that can be stored in the NAMECACHE is
@code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.

@node The NAMECACHE Client-Service Protocol
@subsection The NAMECACHE Client-Service Protocol

@c %**end of header

All messages in the NAMECACHE IPC protocol start with the
@code{struct GNUNET_NAMECACHE_Header}, which adds a request
ID (32-bit integer) to the standard message header.
The request ID is used to match requests with the
respective responses from the NAMECACHE, as they are allowed to happen
out-of-order.


@menu
* Lookup::
* Store::
@end menu

@node Lookup
@subsubsection Lookup

@c %**end of header

The @code{struct LookupBlockMessage} is used to look up a block stored in
the cache.
It contains the query hash. The NAMECACHE always responds with a
@code{struct LookupBlockResponseMessage}. If the NAMECACHE has no
response, it sets the expiration time in the response to zero.
+Otherwise, the response is expected to contain the expiration time, the
ECDSA signature, the derived key and the (variable-size) encrypted data
of the block.

@node Store
@subsubsection Store

@c %**end of header

The @code{struct BlockCacheMessage} is used to cache a block in the
NAMECACHE.
It has the same structure as the @code{struct LookupBlockResponseMessage}.
The service responds with a @code{struct BlockCacheResponseMessage} which
contains the result of the operation (success or failure).
In the future, we might want to make it possible to provide an error
message as well.

@node The NAMECACHE Plugin API
@subsection The NAMECACHE Plugin API
@c %**end of header

The NAMECACHE plugin API consists of two functions: @code{cache_block} to
store a block in the database, and @code{lookup_block} to look up a block
in the database.


@menu
* Lookup2::
* Store2::
@end menu

@node Lookup2
@subsubsection Lookup2

@c %**end of header

The @code{lookup_block} function is expected to return at most one block
to the iterator, and to return @code{GNUNET_NO} if there were no
non-expired results.
If there are multiple non-expired results in the cache, the lookup is
supposed to return the result with the largest expiration time.

@node Store2
@subsubsection Store2

@c %**end of header

The @code{cache_block} function is expected to try to store the block in
the database, and to return @code{GNUNET_SYSERR} if this was not possible
for any reason.
Furthermore, @code{cache_block} is expected to implicitly perform cache
maintenance and purge blocks from the cache that have expired. Note that
@code{cache_block} might encounter the case where the database already has
another block stored under the same key. In this case, the plugin must
ensure that the block with the larger expiration time is preserved.
+Obviously, this can be done either by simply adding new blocks and
selecting for the most recent expiration time during lookup, or by
checking which block is more recent during the store operation.

@cindex REVOCATION Subsystem
@node REVOCATION Subsystem
@section REVOCATION Subsystem
@c %**end of header

The REVOCATION subsystem is responsible for key revocation of Egos.
If users learn that their private key has been compromised, or if they
have lost it, they can use the REVOCATION subsystem to inform all of the
other users that the key is no longer valid.
The subsystem thus includes ways to query for the validity of keys and to
propagate revocation messages.

@menu
* Dissemination::
* Revocation Message Design Requirements::
* libgnunetrevocation::
* The REVOCATION Client-Service Protocol::
* The REVOCATION Peer-to-Peer Protocol::
@end menu

@node Dissemination
@subsection Dissemination

@c %**end of header

When a revocation is performed, the revocation is first of all
disseminated by flooding the overlay network.
The goal is to reach every peer, so that when a peer needs to check if a
key has been revoked, this will be purely a local operation where the
peer looks at its local revocation list. Flooding the network is also the
most robust form of key revocation --- an adversary would have to control
a separator of the overlay graph to restrict the propagation of the
revocation message. Flooding is also very easy to implement --- peers that
receive a revocation message for a key that they have never seen before
simply pass the message to all of their neighbours.

Flooding can only distribute the revocation message to peers that are
online.
In order to notify peers that join the network later, the revocation
service performs efficient set reconciliation over the sets of known
revocation messages whenever two peers (that both support REVOCATION
dissemination) connect.
+The SET service is used to perform this operation efficiently.

@node Revocation Message Design Requirements
@subsection Revocation Message Design Requirements

@c %**end of header

However, flooding is also quite costly, creating O(|E|) messages on a
network with |E| edges.
Thus, revocation messages are required to contain a proof-of-work, the
result of an expensive computation (which, however, is cheap to verify).
Only peers that have expended the CPU time necessary to provide
this proof will be able to flood the network with the revocation message.
This ensures that an attacker cannot simply flood the network with
millions of revocation messages. The proof-of-work required by GNUnet is
set to take days on a typical PC to compute; if the ability to quickly
revoke a key is needed, users have the option to pre-compute revocation
messages, store them off-line and use them instantly after their key has
been compromised.

Revocation messages must also be signed by the private key that is being
revoked. Thus, they can only be created while the private key is in the
possession of the respective user. This is another reason to create a
revocation message ahead of time and store it in a secure location.

@node libgnunetrevocation
@subsection libgnunetrevocation

@c %**end of header

The REVOCATION API consists of two parts, to query and to issue
revocations.


@menu
* Querying for revoked keys::
* Preparing revocations::
* Issuing revocations::
@end menu

@node Querying for revoked keys
@subsubsection Querying for revoked keys

@c %**end of header

@code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public
key has been revoked.
The given callback will be invoked with the result of the check.
The query can be cancelled using @code{GNUNET_REVOCATION_query_cancel} on
the return value.
+

@node Preparing revocations
@subsubsection Preparing revocations

@c %**end of header

It is often desirable to create a revocation record ahead of time and
store it in an off-line location to be used later in an emergency.
This is particularly true for GNUnet revocations, where performing the
revocation operation itself is computationally expensive and thus is
likely to take some time.
Thus, if users want the ability to perform revocations quickly in an
emergency, they must pre-compute the revocation message.
The revocation API enables this with two functions that are used to
compute the revocation message, but do not trigger the actual revocation
operation.

@code{GNUNET_REVOCATION_check_pow} should be used to calculate the
proof-of-work required in the revocation message. This function takes the
public key, the required number of bits for the proof of work (which in
GNUnet is a network-wide constant) and finally a proof-of-work number as
arguments.
The function then checks if the given proof-of-work number is a valid
proof of work for the given public key. Clients preparing a revocation
are expected to call this function repeatedly (typically with a
monotonically increasing sequence of candidate numbers) until a given
number satisfies the check.
That number should then be saved for later use in the revocation
operation.

@code{GNUNET_REVOCATION_sign_revocation} is used to generate the
signature that is required in a revocation message.
It takes the private key that (possibly in the future) is to be revoked
and returns the signature.
The signature can again be saved to disk for later use, which will then
allow performing a revocation even without access to the private key.
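The candidate-search loop described above has the classic proof-of-work shape: hash the key together with a counter and keep counting until the hash has enough leading zero bits. The following self-contained sketch illustrates that loop; the 64-bit mixer is only a stand-in so the example compiles on its own (the real client uses a cryptographic hash via @code{GNUNET_REVOCATION_check_pow}, and the required number of bits is far higher than in this toy setting):

```c
#include <stdint.h>

/* Toy stand-in for the cryptographic hash of (public key, candidate);
 * only here to keep the sketch self-contained. */
static uint64_t
mix (uint64_t pub_key, uint64_t pow)
{
  uint64_t h = pub_key ^ (pow * 0x9E3779B97F4A7C15ULL);

  h ^= h >> 33;
  h *= 0xFF51AFD7ED558CCDULL;
  h ^= h >> 33;
  return h;
}

/* Number of leading zero bits: the "quality" of the proof. */
static unsigned int
leading_zeros (uint64_t h)
{
  unsigned int n = 0;

  for (uint64_t mask = 1ULL << 63; (0 != mask) && (0 == (h & mask));
       mask >>= 1)
    n++;
  return n;
}

/* Try monotonically increasing candidates until one satisfies the
 * required number of bits; the result is what a client would save to
 * disk for later use in the actual revocation. */
static uint64_t
find_pow (uint64_t pub_key, unsigned int required_bits)
{
  uint64_t pow = 0;

  while (leading_zeros (mix (pub_key, pow)) < required_bits)
    pow++;
  return pow;
}
```

Because each additional required bit doubles the expected search time while verification stays a single hash, the network can afford a difficulty that takes days to meet but microseconds to check.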
+

@node Issuing revocations
@subsubsection Issuing revocations


Given an ECDSA public key, the signature from
@code{GNUNET_REVOCATION_sign_revocation} and the proof-of-work,
@code{GNUNET_REVOCATION_revoke} can be used to perform the
actual revocation. The given callback is called upon completion of the
operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the
library from calling the continuation; however, in that case it is
undefined whether or not the revocation operation will be executed.

@node The REVOCATION Client-Service Protocol
@subsection The REVOCATION Client-Service Protocol


The REVOCATION protocol consists of four simple messages.

A @code{QueryMessage} containing a public ECDSA key is used to check if a
particular key has been revoked. The service responds with a
@code{QueryResponseMessage} which simply contains a bit that says if the
given public key is still valid, or if it has been revoked.

The second possible interaction is for a client to revoke a key by
passing a @code{RevokeMessage} to the service. The @code{RevokeMessage}
contains the ECDSA public key to be revoked, a signature by the
corresponding private key and the proof-of-work. The service responds
with a @code{RevocationResponseMessage} which can be used to indicate
that the @code{RevokeMessage} was invalid (i.e. the proof of work was
incorrect), or otherwise indicates that the revocation has been processed
successfully.

@node The REVOCATION Peer-to-Peer Protocol
@subsection The REVOCATION Peer-to-Peer Protocol

@c %**end of header

Revocation uses two disjoint ways to spread revocation information among
peers.
First of all, P2P gossip exchanged via CORE-level neighbours is used to
quickly spread revocations to all connected peers.
Second, whenever two peers (that both support revocations) connect,
the SET service is used to compute the union of the respective revocation
sets.
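The gossip half of this scheme is a textbook flooding rule: forward a revocation to all neighbours exactly once, the first time it is seen. A minimal self-contained model in C, with keys as plain integers and "forwarding" merely counted (the real service additionally verifies the signature and proof-of-work and persists the message):

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SEEN 128

/* Minimal model of a flooding peer: it remembers which revoked keys it
 * has seen and forwards only first-time messages.  Illustrative only;
 * not the actual service code. */
struct Peer
{
  uint64_t seen[MAX_SEEN];
  unsigned int n_seen;
  unsigned int floods; /* number of "send to all neighbours" events */
};

static bool
already_seen (const struct Peer *p, uint64_t key)
{
  for (unsigned int i = 0; i < p->n_seen; i++)
    if (p->seen[i] == key)
      return true;
  return false;
}

/* Returns true if the message was new and was flooded onward. */
static bool
handle_revocation (struct Peer *p, uint64_t revoked_key)
{
  if (already_seen (p, revoked_key))
    return false; /* duplicate: drop, do not re-flood */
  if (p->n_seen < MAX_SEEN)
    p->seen[p->n_seen++] = revoked_key; /* stand-in for storing to disk */
  p->floods++; /* pass the message on to all neighbours */
  return true;
}
```

Duplicate suppression is what keeps the flood at O(|E|) messages: every edge carries each revocation at most once in each direction.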
+

In both cases, the exchanged messages are @code{RevokeMessage}s which
contain the public key that is being revoked, a matching ECDSA signature,
and a proof-of-work.
Whenever a peer learns about a new revocation this way, it first
validates the signature and the proof-of-work, then stores it to disk
(typically to the file @file{$GNUNET_DATA_HOME/revocation.dat}) and
finally spreads the information to all directly connected neighbours.

For computing the union using the SET service, the peer with the smaller
hashed peer identity will connect (as a "client" in the two-party set
protocol) to the other peer after one second (to reduce traffic spikes
on connect) and initiate the computation of the set union.
All revocation services use a common hash to identify the SET operation
over revocation sets.

The current implementation accepts revocation set union operations from
all peers at any time; however, well-behaved peers should only initiate
this operation once after establishing a connection to a peer with a
larger hashed peer identity.

@cindex FS
@cindex FS Subsystem
@node File-sharing (FS) Subsystem
@section File-sharing (FS) Subsystem

@c %**end of header

This chapter describes the details of how the file-sharing service works.
As with all services, it is split into an API (libgnunetfs), the service
process (gnunet-service-fs) and user interface(s).
The file-sharing service uses the datastore service to store blocks and
the DHT (and indirectly the datacache) for lookups for non-anonymous
file-sharing.
Furthermore, the file-sharing service uses the block library (and the
block FS plugin) for validation of DHT operations.

In contrast to many other services, libgnunetfs is rather complex since
the client library includes a large number of high-level abstractions;
this is necessary since the FS service itself largely only operates on
the block level.
+The FS library is responsible for providing a file-based abstraction to
applications, including directories, meta data, keyword search,
verification, and so on.

The method used by GNUnet to break large files into blocks and to use
keyword search is called the
"Encoding for Censorship Resistant Sharing" (ECRS).
ECRS is largely implemented in the FS library; block validation is also
reflected in the block FS plugin and the FS service.
ECRS on-demand encoding is implemented in the FS service.

NOTE: The documentation in this chapter is quite incomplete.

@menu
* Encoding for Censorship-Resistant Sharing (ECRS)::
* File-sharing persistence directory structure::
@end menu

@cindex ECRS
@cindex Encoding for Censorship-Resistant Sharing
@node Encoding for Censorship-Resistant Sharing (ECRS)
@subsection Encoding for Censorship-Resistant Sharing (ECRS)

@c %**end of header

When GNUnet shares files, it uses a content encoding that is called ECRS,
the Encoding for Censorship-Resistant Sharing.
Most of ECRS is described in the (so far unpublished) research paper
referenced below. ECRS obsoletes the previous ESED and ESED II
encodings which were used in GNUnet before version 0.7.0.
The rest of this section assumes that the reader is familiar with that
paper. What follows is a description of some minor extensions
that GNUnet makes over what is described in the paper.
The reason why these extensions are not in the paper is that we felt
that they were obvious or trivial extensions to the original scheme and
thus did not warrant space in the research report.

@menu
* Namespace Advertisements::
* KSBlocks::
@end menu

@node Namespace Advertisements
@subsubsection Namespace Advertisements

@c %**end of header
@c %**FIXME: all zeroses -> ?

An @code{SBlock} with identifier all zeros is a signed
advertisement for a namespace. This special @code{SBlock} contains
metadata describing the content of the namespace.
+Instead of the identifier for a potential update, it contains
the identifier for the root of the namespace.
The URI should always be empty. The @code{SBlock} is signed with the
content provider's RSA private key (just like any other SBlock). Peers
can search for @code{SBlock}s in order to find out more about a namespace.

@node KSBlocks
@subsubsection KSBlocks

@c %**end of header

GNUnet implements @code{KSBlocks} which are @code{KBlocks} that, instead
of encrypting a CHK and metadata, encrypt an @code{SBlock} instead.
In other words, @code{KSBlocks} enable GNUnet to find @code{SBlocks}
using the global keyword search.
Usually the encrypted @code{SBlock} is a namespace advertisement.
The rationale behind @code{KSBlock}s and @code{SBlock}s is to enable
peers to discover namespaces via keyword searches, and to associate
useful information with namespaces. When GNUnet finds @code{KSBlocks}
during a normal keyword search, it adds the information to an internal
list of discovered namespaces. Users looking for interesting namespaces
can then inspect this list, reducing the need for out-of-band discovery
of namespaces.
Naturally, namespaces (or more specifically, namespace advertisements) can
also be referenced from directories, but @code{KSBlock}s should make it
easier to advertise namespaces for the owner of the pseudonym since they
eliminate the need to first create a directory.

Collections are also advertised using @code{KSBlock}s.

The research paper on ECRS is available at
@uref{https://gnunet.org/sites/default/files/ecrs.pdf, ecrs.pdf}
(270.68 KB).

@node File-sharing persistence directory structure
@subsection File-sharing persistence directory structure

@c %**end of header

This section documents how the file-sharing library implements
persistence of file-sharing operations and specifically the resulting
directory structure.
+This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag
was set when calling @code{GNUNET_FS_start}.
In this case, the file-sharing library will try hard to ensure that all
major operations (searching, downloading, publishing, unindexing) are
persistent, that is, can live longer than the process itself.
More specifically, an operation is supposed to live until it is
explicitly stopped.

If @code{GNUNET_FS_stop} is called before an operation has been stopped,
a @code{SUSPEND} event is generated; the next time the process calls
@code{GNUNET_FS_start}, a @code{RESUME} event is generated.
Additionally, even if an application crashes (segfault, SIGKILL, system
crash) and hence @code{GNUNET_FS_stop} is never called and no
@code{SUSPEND} events are generated, operations are still resumed (with
@code{RESUME} events).
This is implemented by constantly writing the current state of the
file-sharing operations to disk.
Specifically, the current state is always written to disk whenever
anything significant changes (the exception is block-wise progress in
publishing and unindexing, since those operations would be slowed down
significantly and can be resumed cheaply even without detailed
accounting).
Note that if the process crashes (or is killed) during a serialization
operation, FS does not guarantee that this specific operation is
recoverable (no strict transactional semantics, again for performance
reasons). However, all other unrelated operations should resume nicely.

Since we need to serialize the state continuously and want to recover as
much as possible even after crashing during a serialization operation,
we do not use one large file for serialization.
Instead, several directories are used for the various operations.
When @code{GNUNET_FS_start} executes, the master directories are scanned
for files describing operations to resume.
+Sometimes, these operations can refer to related operations in child +directories which may also be resumed at this point. +Note that corrupted files are cleaned up automatically. +However, dangling files in child directories (those that are not +referenced by files from the master directories) are not automatically +removed. + +Persistence data is kept in a directory that begins with the "STATE_DIR" +prefix from the configuration file +(by default, "$SERVICEHOME/persistence/") followed by the name of the +client as given to @code{GNUNET_FS_start} (for example, "gnunet-gtk") +followed by the actual name of the master or child directory. + +The names for the master directories follow the names of the operations: + +@itemize @bullet +@item "search" +@item "download" +@item "publish" +@item "unindex" +@end itemize + +Each of the master directories contains names (chosen at random) for each +active top-level (master) operation. +Note that a download that is associated with a search result is not a +top-level operation. + +In contrast to the master directories, the child directories are only +consulted when another operation refers to them. +For each search, a subdirectory (named after the master search +synchronization file) contains the search results. +Search results can have an associated download, which is then stored in +the general "download-child" directory. +Downloads can be recursive, in which case children are stored in +subdirectories mirroring the structure of the recursive download +(either starting in the master "download" directory or in the +"download-child" directory depending on how the download was initiated). +For publishing operations, the "publish-file" directory contains +information about the individual files and directories that are part of +the publication. +However, this directory structure is flat and does not mirror the +structure of the publishing operation. +Note that unindex operations cannot have associated child operations. 
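The composition of a persistence directory name from its three parts can be sketched in a few lines of C. This is only an illustration of the naming scheme described above, not the library's actual path-building code; the parameter values shown are examples (the real library additionally appends a randomly chosen file name per top-level operation):

```c
#include <stdio.h>
#include <string.h>

/* Illustrative sketch: a persistence directory is the "STATE_DIR"
 * prefix, then the client name given to GNUNET_FS_start, then the
 * master (or child) directory for the operation.  Hypothetical helper,
 * not actual libgnunetfs code. */
static int
make_persistence_dir (char *buf, size_t len,
                      const char *state_dir, /* e.g. "$SERVICEHOME/persistence" */
                      const char *client,    /* e.g. "gnunet-gtk" */
                      const char *operation) /* "search", "download", ... */
{
  return snprintf (buf, len, "%s/%s/%s", state_dir, client, operation);
}
```

For example, the search state of gnunet-gtk would live under "$SERVICEHOME/persistence/gnunet-gtk/search".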
+

@cindex REGEX subsystem
@node REGEX Subsystem
@section REGEX Subsystem

@c %**end of header

Using the REGEX subsystem, you can discover peers that offer a particular
service using regular expressions.
Peers that offer a service specify it using a regular expression.
Peers that want to patronize a service search using a string.
The REGEX subsystem will then use the DHT to return a set of matching
offerers to the patrons.

For the technical details, we have Max's defense talk and Max's Master's
thesis.

@c An additional publication is under preparation and available to
@c team members (in Git).
@c FIXME: Where is the file? Point to it. Assuming that it's szengel2012ms

@menu
* How to run the regex profiler::
@end menu

@node How to run the regex profiler
@subsection How to run the regex profiler

@c %**end of header

The gnunet-regex-profiler can be used to profile the usage of mesh/regex
for a given set of regular expressions and strings.
Mesh/regex allows you to announce your peer ID under a certain regex and
search for peers matching a particular regex using a string.
See @uref{https://gnunet.org/szengel2012ms, szengel2012ms} for a full
introduction.

First of all, the regex profiler uses the GNUnet testbed, thus all the
implications for the testbed also apply to the regex profiler
(for example, you need password-less ssh login to the machines listed in
your hosts file).

@strong{Configuration}

Moreover, an appropriate configuration file is needed.
You can refer to the
@file{contrib/regex_profiler_infiniband.conf} file in the source code
of GNUnet for an example configuration.
The important details are highlighted in the following paragraphs.
+

The regular expressions are announced by the gnunet-daemon-regexprofiler;
you therefore have to make sure that it is started by adding it to the
AUTOSTART set of ARM:

@example
[regexprofiler]
AUTOSTART = YES
@end example

@noindent
Furthermore, you have to specify the location of the binary:

@example
[regexprofiler]
# Location of the gnunet-daemon-regexprofiler binary.
BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
# Regex prefix that will be applied to all regular expressions and
# search strings.
REGEX_PREFIX = "GNVPN-0001-PAD"
@end example

@noindent
When running the profiler with a large-scale deployment, you probably
want to reduce the workload of each peer.
Use the following options to do this.

@example
[dht]
# Force network size estimation
FORCE_NSE = 1

[dhtcache]
DATABASE = heap
# Disable RC-file for Bloom filter? (for benchmarking with limited IO
# availability)
DISABLE_BF_RC = YES
# Disable Bloom filter entirely
DISABLE_BF = YES

[nse]
# Minimize proof-of-work CPU consumption by NSE
WORKBITS = 1
@end example

@noindent
@strong{Options}

To finally run the profiler, some options and the input data need to be
specified on the command line.

@example
gnunet-regex-profiler -c config-file -d log-file -n num-links \
-p path-compression-length -s search-delay -t matching-timeout \
-a num-search-strings hosts-file policy-dir search-strings-file
@end example

@noindent
Where...

@itemize @bullet
@item ... @code{config-file} means the configuration file created earlier.
@item ... @code{log-file} is the file to which statistics output is
written.
@item ... @code{num-links} indicates the number of random links between
started peers.
@item ... @code{path-compression-length} is the maximum path compression
length in the DFA.
@item ... @code{search-delay} is the time to wait after the peers have
finished linking before starting to match strings.
@item ...
@code{matching-timeout} is the timeout after which to cancel the
search.
@item ... @code{num-search-strings} is the number of strings in the
search-strings-file.
@item ... the @code{hosts-file} should contain a list of hosts for the
testbed, one per line in the following format:

@itemize @bullet
@item @code{user@@host_ip:port}
@end itemize
@item ... the @code{policy-dir} is a folder containing text files
containing one or more regular expressions. A peer is started for each
file in that folder and the regular expressions in the corresponding file
are announced by this peer.
@item ... the @code{search-strings-file} is a text file containing search
strings, one per line.
@end itemize

@noindent
You can create regular expressions and search strings for every AS in the
Internet using the @file{create_regex.py} and @file{create_strings.py}
scripts. You need one of the
@uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA routeviews prefix2as}
data files for this. Run

@example
create_regex.py <filename> <output path>
@end example

@noindent
to create the regular expressions and

@example
create_strings.py <input path> <outfile>
@end example

@noindent
to create a search strings file from the previously created
regular expressions.