Persistent identifiers

Handle System

Background

The Handle System was developed in America by the Corporation for National Research Initiatives (CNRI) as part of the Computer Science Technical Reports (CSTR) project; this project, which was funded by the Defense Advanced Research Projects Agency (DARPA), ran from 1992 to 1996. It involved developing and providing network access to a corpus of digitised material from the collections of computer science technical reports held by five major universities; part of the project’s work involved developing an architecture for an open distributed digital library (as described in a paper by Robert Kahn and Robert Wilensky in 1995). One of the key concepts which emerged from the project was the idea of handles to provide unique, location-independent persistent identifiers for digital objects. The Handle System was first implemented in autumn 1994.

The system is a general-purpose naming service. It provides a mechanism both for assigning persistent identifiers to digital objects and for resolving those identifiers, giving users either the information necessary to locate, access or otherwise use the digital object identified by the Handle or (where appropriate) access to the resource itself. Information about a digital object’s current location is stored in the Handle record, so when the location changes only the Handle record needs to be updated; the Handle itself stays the same.

The Handle System was designed to work independently of the DNS, although it can also work successfully within it.

See the Handle System website for more information.

How does the Handle System work?

The Handle System comprises three elements: the Global Handle Registry, Naming Authorities and Local Handle Services.

In order to maintain its independence, the Handle System is not based on DNS root servers; it has its own root server, the Global Handle Registry (GHR). The GHR provides the service used to manage lower-level Naming Authorities (NAs). Each NA is an organisation with administrative responsibility for creating and managing Handles within a specified namespace (which may have any number of sub-namespaces) and each local namespace is managed by a Local Handle Service; all of these namespaces must be registered with the GHR. Local Handle Services are reliant on a local Handle server, which can be downloaded and installed by system administrators in a similar way to installing a web server. The Local Handle Service can then establish its own local infrastructure, e.g. it might scale up by adding more servers at local level; there are no limits on the number of sites or servers which make up a local service.

It is recommended that a Handle server should be installed on a machine with an internet presence because the GHR needs to be able to contact the local server. It may, however, be possible to configure the server so that two different IP addresses are used to distinguish between internal and external access.

Handle syntax

A Handle identifier has two parts, a prefix and a suffix, separated by a forward slash. It takes the following form:

[Handle Naming Authority]/[Handle Local Name]

Hypothetical example:
19123.11/object200

Handle Naming Authority: Each NA is assigned a number by the GHR which is globally unique within the Handle System. These numbers are decimal and are assigned sequentially. Each NA can authorise any number of sub-NAs, and a dot (.) is used to express this hierarchy, which should be read from left to right; e.g. in the fictional example above, 19123 would represent the higher-level NA, which might be a national or university library; 11 might be a particular project, programme or department within that institution, although sub-NAs do not necessarily have to be administratively dependent on their parent NA in any way.

Handle Local Name: The Local Name is assigned by each individual NA in accordance with its own policies. The Handle System sets no limitation on the syntax of the Local Name, although it must be expressed using characters from Unicode’s UCS-2 character set. It should also be unique within the NA, making it globally unique within the Handle System.
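
The syntax rules above can be illustrated with a short sketch. This is not part of the Handle software: the function and class names are hypothetical, and the only assumptions are those stated in the text (a prefix and suffix separated by a forward slash, with dots expressing the NA hierarchy).

```python
from typing import NamedTuple


class ParsedHandle(NamedTuple):
    naming_authority: str        # prefix, e.g. "19123.11"
    local_name: str              # suffix, e.g. "object200"
    authority_path: list[str]    # NA hierarchy, read from left to right


def parse_handle(handle: str) -> ParsedHandle:
    """Split a Handle at the first forward slash into prefix and suffix."""
    # The prefix cannot contain a slash, so anything after the first slash
    # (including further slashes) belongs to the Local Name.
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a valid Handle: {handle!r}")
    return ParsedHandle(prefix, suffix, prefix.split("."))


parsed = parse_handle("19123.11/object200")
print(parsed.naming_authority)  # 19123.11
print(parsed.local_name)        # object200
print(parsed.authority_path)    # ['19123', '11']
```

Splitting at the first slash only means that an NA remains free to use slashes inside its own Local Names.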

Resolving Handles

The Handle System is not limited to naming; it also enables users to resolve Handles into the information necessary to locate, access and use an identified resource (e.g. metadata about a resource, a request form to apply for access, or information about the location of the resource), or to go directly to the resource itself. The GHR maintains a record of all NA prefixes. If a user wishes to resolve a particular Handle, they send a request to the GHR, which identifies (by its prefix) the NA which assigned the Handle and returns this information to the user, who can then access the relevant Local Handle Service directly. For example, to find out which local service is responsible for the Handle 34567/890, a user sends a request to the Global Handle Registry to resolve the NA Handle for the prefix 34567.

Caching servers can be associated with local servers: these allow frequently-accessed Handles to be stored and resolved without contacting the GHR.

Each Handle may have a set of values assigned to it in the form of a standardised metadata record, which contains information about how the Handle and the resource it identifies are accessed and administered, e.g. a detailed set of read/write/execute permissions applicable both to the Handle administrator and the user; location information; and a description of the resource identified by the Handle. A Handle value may be ‘privatised’ so that only the administrator has read access, thus keeping some of the data (e.g. location information) inaccessible to the public.
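
As a rough illustration of how a ‘privatised’ value might behave, the sketch below models a value set as a list of simple records with a public-read flag. The field names, type labels and sample data are all hypothetical and do not reproduce the actual Handle record format.

```python
from dataclasses import dataclass


@dataclass
class HandleValue:
    value_type: str      # hypothetical labels, e.g. "URL" or "DESCRIPTION"
    data: str
    public_read: bool    # False means the value has been 'privatised'


def visible_values(values: list[HandleValue], is_admin: bool) -> list[HandleValue]:
    """Return the values a requester may read: everything for the
    administrator, only public values for anyone else."""
    return [v for v in values if is_admin or v.public_read]


# Invented sample record for the fictional Handle 19123.11/object200.
record = [
    HandleValue("DESCRIPTION", "Sample description of the object", public_read=True),
    HandleValue("URL", "http://example.org/store/object200", public_read=False),
]

print([v.value_type for v in visible_values(record, is_admin=False)])  # ['DESCRIPTION']
print([v.value_type for v in visible_values(record, is_admin=True)])   # ['DESCRIPTION', 'URL']
```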

One unique feature of the Handle System is that Handles can refer to copies of the same document held at different locations, helping to assure access when servers are busy or there is high demand, although this is not a high priority for digital archivists, who are generally dealing with unique objects. However, it is possible to include in a Handle’s value set references to other Handles which ‘add credentials’ to the Handle; this might be used to link various different metadata records to a digital object.

Using HTTP, Handles can be resolved either through the resolver service at <http://hdl.handle.net/> or simply by appending the Handle to that URL (the fictional Handle used earlier would become <http://hdl.handle.net/19123.11/object200>). The user may be redirected to a URL associated with the relevant resource, or be able to view a list of the Handle’s values.
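
Building such a URL is a matter of string concatenation, with one caveat: characters which are not legal in a URL need to be percent-encoded first. A minimal sketch using Python's standard library follows; the function name is hypothetical and the Handles are the fictional ones used in this section, so no request is actually sent.

```python
from urllib.parse import quote

RESOLVER = "http://hdl.handle.net/"


def resolver_url(handle: str) -> str:
    """Append a Handle to the resolver URL, percent-encoding unsafe characters."""
    # "/" is left unencoded because it separates the prefix from the suffix.
    return RESOLVER + quote(handle, safe="/")


print(resolver_url("19123.11/object200"))
# http://hdl.handle.net/19123.11/object200
print(resolver_url("19123.11/my object"))
# http://hdl.handle.net/19123.11/my%20object
```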

Maintenance and adoption

A number of projects and institutions are currently making use of the Handle System, including: the Defense Virtual Library, a digital library established in America by the Defense Technical Information Center (DTIC), DARPA and CNRI; the Digital Object Identifier System (DOI), another persistent identification scheme which uses Handles as its naming component; the digital repository software DSpace, which uses Handles to name and provide access to document containers; and the Library of Congress in its National Digital Library Program.

In order to use Handles as persistent identifiers, an institution has to register and establish a Naming Authority. This involves signing a licence agreement with the CNRI, which maintains the system. Although the software is made freely available, in June 2006 a fee was introduced for those wishing to participate formally in the Handle scheme in order to cover operational costs for running the GHR. Registering for a Handle NA number costs $50 and there is an annual service charge of $50. While the registration fee only applies once no matter how many derived prefixes are registered, the annual charge is applied per prefix (e.g. three prefixes would incur a $50 registration fee and a $150 annual charge).
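
The charging model can be expressed as a small calculation. The sketch below uses only the figures quoted above (a one-off $50 registration fee and $50 per prefix per year); the function name and the `years` parameter are added for illustration.

```python
REGISTRATION_FEE = 50   # US dollars, charged once regardless of prefix count
ANNUAL_FEE = 50         # US dollars, charged per prefix per year


def handle_fees(prefixes: int, years: int = 1) -> int:
    """Total cost in dollars of registering `prefixes` prefixes and
    holding them for `years` years."""
    if prefixes < 1 or years < 0:
        raise ValueError("need at least one prefix and a non-negative number of years")
    return REGISTRATION_FEE + ANNUAL_FEE * prefixes * years


print(handle_fees(3))       # 200  ($50 registration + $150 annual charge)
print(handle_fees(3, 5))    # 800  ($50 registration + 5 x $150)
```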

The CNRI hosts the Global Handle System root server. This is also overseen by the Handle System Advisory Committee, which has members drawn from both public and private sectors. The software for client and server can be found at <http://www.handle.net>.

Advantages and disadvantages of the Handle System

Advantages

  • The Handle System was one of the earliest PID schemes to be introduced (contemporary with URNs), and is being used by a number of digital libraries and national institutions. It is maintained by a national organisation in the USA, so is stable and well-established.
  • It conforms to the functional requirements of the URI and URN concepts, and is independent from, yet interoperable with, current protocols like HTTP.
  • Handle syntax is straightforward and is also capable of incorporating existing local identifier systems.
  • The system may be adaptable to the different levels of access required for managing personal digital archives: operations on the Handle database are controlled by a detailed authorisation mechanism for security of data; and Local Handle Servers can be configured to allow either internal or external access, which might enable use of Handles as identifiers in a closed environment.
  • The distributed model means that local Handle services and NAs have autonomy to manage their own Handles.
  • The system is scalable and might allow smaller institutions to share a local service under the same NA.

Disadvantages

  • Whilst not prohibitive, there is nevertheless an initial fee and an annual charge for those participating in the system, whereas ideally a PID system should be free and the software openly available.
  • While there are authorisation mechanisms, the system still has a strong emphasis on identifying resources which are openly available via the Web, rather than held in the more restricted context of a digital archive.
  • The system includes some optional metadata elements which are superfluous to the needs of a digital archive: extensive metadata is already produced for each digital object (using METS and PREMIS in the case of Paradigm), so the production of a value set for each Handle would therefore be unnecessary.
  • The character set for Handles is much broader than is permissible for URIs, so institutional naming policies would have to place restrictions on the characters used in order to comply with URI requirements.
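
The last point could be enforced by a simple check in an institution's naming policy. The sketch below is one possible rule, not a requirement of the Handle System: it accepts only the characters that RFC 3986 classes as "unreserved" in URIs, so that a Local Name never needs percent-encoding. The function name is hypothetical.

```python
import re

# RFC 3986 "unreserved" characters: letters, digits, hyphen, dot,
# underscore and tilde. A policy limited to these avoids any encoding.
URI_SAFE_LOCAL_NAME = re.compile(r"[A-Za-z0-9._~-]+")


def is_uri_safe(local_name: str) -> bool:
    """Return True if the whole Local Name uses only unreserved URI characters."""
    return URI_SAFE_LOCAL_NAME.fullmatch(local_name) is not None


print(is_uri_safe("object200"))     # True
print(is_uri_safe("object 200"))    # False (contains a space)
```

A less restrictive policy could allow other characters and percent-encode them whenever a Handle is embedded in a URL.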