External anchoring for wide-area network support: the RHYTHM project
Cesare Maioli (+), Stefano Sola (*) and Fabio Vitali (*)
(+): Dept. of Mathematics, University of Bologna
(*): CIRFID, University of Bologna
Many hypertext systems exist allowing different users to access a
common database concurrently in a local area network. The
implementation of such systems requires the careful consideration of
many aspects, such as concurrency, access rights, authorship, etc.
Unfortunately, it is not easy to scale up hypertext systems for
international networks. The creation of many independent hypertext
databases, access by thousands of concurrent users and the
range of data types and formats that can be made available throughout
the network are important issues that cannot be approached lightly.
WWW has admirably faced wide area network access and support, and has
been the first hypertext system to do so. Yet, there are some
problems that have not been addressed by WWW.
It is clear, for instance, that WWW is mostly a tool for displaying
and browsing data. Little provision is made for interactive editing
and linking. Yet,
literature, both classic and recent (e.g. Ted Nelson, cf. "Literary
Machines" Sausalito Press 1983, and Poltrock, cf. Malcolm, Poltrock,
Schuler "Industrial Strength Hypermedia: Requirements for a Large
Engineering Enterprise" Hypertext '91 Conference Proceedings ACM
Press 1991) strongly stresses the importance of creating an
environment for the integrated coexistence of public and private
data, of public and private links, i.e. those created by official
authors for browsing and exploring ideas and those created by readers
for their own purposes, such as notetaking, brainstorming, creation
of guided tours, creation of private collections. Poltrock indeed
suggests the creation of "asynchronous, distributed virtual meetings"
between authors and readers, where the difference in their roles
somewhat blurs and disappears.
This is a difficult thing to do with WWW, since links are stored
within the document containing the starting end-point. Unless WRITE
access rights are obtained (which is, in general, out of the question),
the best one can do in order to create a private link on a public
document is to create a new, personal document containing a link to
the entirety of the public document. This restricts the choice of
links to node-to-node or span-to-node links (no span-to-span ones)
and, furthermore, to private-to-public links, i.e. no
public-to-private or public-to-public ones.
It is our opinion that there is only one implementation decision that
allows the creation of private span-to-span links between public
documents in an arbitrarily large distributed system: external,
remote anchoring. This is an implementation and syntactic issue
that concerns anchoring policies. A discussion of many anchoring
policies as implemented in various hypertext systems can be found in
our paper: C. Maioli, S. Sola, F. Vitali, "Wide-Area Distribution
Issues In Hypertext Systems", ACM SIGDOC '93 Conference, ACM Press,
Kitchener (Canada), 5-8 October 1993, which can be downloaded by
anonymous FTP from ftp.cs.unibo.it.
External remote anchoring stores the exact specification of both
end-point spans outside both documents, for instance in a dedicated
link table. This table is completely independent of all documents,
and can be stored, for instance, on the home server of each user.
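As an illustration only, such a link table might be sketched as
follows. The names Span, Link and LinkTable, the field layout and the
document identifiers are our assumptions for this example; they are
not taken from the RHYTHM implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Span:
    """A contiguous character range in one version of a document."""
    doc: str      # hypothetical document identifier
    version: int  # version of the document the offsets refer to
    start: int    # offset of the first character
    end: int      # offset one past the last character


@dataclass(frozen=True)
class Link:
    """A span-to-span link; both end-points are kept outside the documents."""
    source: Span
    target: Span
    owner: str


@dataclass
class LinkTable:
    """A private link table kept on a server the user can write to."""
    links: List[Link] = field(default_factory=list)

    def add(self, link: Link) -> None:
        self.links.append(link)

    def links_from(self, doc: str) -> List[Link]:
        """All links whose starting end-point lies in `doc`."""
        return [link for link in self.links if link.source.doc == doc]


# A reader links two public documents without touching either of them.
table = LinkTable()
table.add(Link(Span("pub/a.txt", 1, 10, 25),
               Span("pub/b.txt", 3, 0, 40),
               owner="reader1"))
```

Note that each end-point records the version it was created against,
which is what later makes it possible to bring the offsets up to date.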
Therefore no WRITE access is ever necessary to create a link on
public documents, since the link is stored outside the documents on a
computer where, presumably, the user has WRITE permission. On the
other hand, an advantage of the internal anchoring policy is that
whenever the document is modified, the anchors are immediately
updated, since they move with the data. This cannot happen when
anchors are stored outside the document they refer to: an explicit
update operation needs to be implemented. Updating can be synchronous
(i.e. performed as soon as the modified node is saved on disk) if the
anchor table is under the control of the local server, but this is
impossible to guarantee when the server storing the node and the one
storing the link are different and autonomous: indeed, a modification
notice might need to be sent to thousands of link servers throughout
the net, which is clearly impractical. A different approach is
asynchronous update of external anchors. This scheme would work on
demand: only those links that are actually activated are updated,
while the others remain outdated until someone needs to use them.
As soon as a link is selected and activated, the system would ask the
node server storing the requested node for the current position of
the other end-point; the node server would reply with the current
version of the node and a specification of the current position of
the requested anchor, which may have moved since the link was
created. It is therefore necessary that all node servers employ a
procedure for computing the current position of arbitrary spans as
they were saved in previous versions of each node. One such mechanism
is pattern matching (i.e. the link server stores a pattern of the
end-points, and the node server looks for the current position of the
pattern), but this is a very fragile solution, since modifications to
the matched text may void the link, and there would also be problems
of uniqueness, size and data types. Another solution is implementing
a versioning mechanism for nodes, which would make it possible to
compute the current position of an arbitrary span by examining the
development over time of the node.
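The on-demand computation could be sketched as below. This is a
minimal illustration under assumptions of ours: the node server is
supposed to keep one Edit record per version step (a hypothetical,
simplified change log), and an anchor whose text is touched by a
change is conservatively reported as lost rather than repaired.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Edit:
    """One change turning version n into version n+1 (hypothetical record)."""
    pos: int       # offset in the old version where the change starts
    deleted: int   # number of characters removed at pos
    inserted: int  # number of characters inserted at pos


def shift_span(span: Tuple[int, int], edit: Edit) -> Optional[Tuple[int, int]]:
    """Map a (start, end) span across one edit; None if the edit touches it."""
    start, end = span
    edit_end = edit.pos + edit.deleted
    delta = edit.inserted - edit.deleted
    if edit_end <= start:   # change lies entirely before the span
        return (start + delta, end + delta)
    if edit.pos >= end:     # change lies entirely after the span
        return span
    return None             # change overlaps the span


def current_position(span: Tuple[int, int], saved_version: int,
                     history: List[Edit]) -> Optional[Tuple[int, int]]:
    """Replay the edits made since `saved_version` (versions start at 1).

    history[i] is assumed to turn version i+1 into version i+2.
    """
    for edit in history[saved_version - 1:]:
        moved = shift_span(span, edit)
        if moved is None:
            return None
        span = moved
    return span


# "hello world" -> "hello big world" -> "big world"
history = [Edit(pos=6, deleted=0, inserted=4), Edit(pos=0, deleted=6, inserted=0)]
anchor_now = current_position((6, 11), 1, history)  # anchor on "world"
```

The work is done only when a link is activated, and only for the one
anchor requested, so no notice ever has to be sent to link servers.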
Therefore it is our opinion that external remote anchoring with
on-demand anchor updating based on versioning is the only practical
solution to allow the interactive and integrated access to both
public and private data and the creation of private links on public
documents.
The RHYTHM hypertext system (Research on HYpertext THeory and
Management) is explicitly addressing interactive and integrated
access to public and private hypertext data on an arbitrarily large
computer network. RHYTHM is being developed at the University of
Bologna, within the Progetto Finalizzato CNR "Sistemi Informatici e
Calcolo Parallelo", subproject #7 "Sistemi di Supporto al Lavoro
Intellettuale".
RHYTHM is based on two basic ideas: external anchoring and inclusion.
External anchoring has been explained already. In RHYTHM, links can be
stored either in the document containing the starting end-point, or
in the document containing the destination end-point, or in any other
document of the net. It is necessary, though, that the document
containing the link be open for the link to be shown. Therefore, the
set of links starting from or leading to a document may vary
depending on the set of open documents and the set of links contained
therein. It is therefore possible to create multiple different tours
and link sets on the same set of data documents by activating
different link documents.
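The dependence of the visible links on the open documents can be
illustrated with a small sketch. The function visible_links, the
link_store mapping and the document names are hypothetical, and links
are reduced to pairs of document names for brevity.

```python
from typing import Dict, List, Set, Tuple

# A link is reduced here to (source document, target document).
Link = Tuple[str, str]


def visible_links(doc: str, open_docs: Set[str],
                  link_store: Dict[str, List[Link]]) -> List[Link]:
    """Links touching `doc` that are stored in a currently open document."""
    result = []
    for holder in sorted(open_docs):  # sorted only to make the order stable
        for src, dst in link_store.get(holder, []):
            if doc in (src, dst):
                result.append((src, dst))
    return result


# Two link documents define two different tours over the same data.
link_store = {
    "intro": [("intro", "glossary")],
    "tour-history": [("intro", "chapter1")],
    "tour-theory": [("intro", "chapter5")],
}
history_tour = visible_links("intro", {"intro", "tour-history"}, link_store)
theory_tour = visible_links("intro", {"intro", "tour-theory"}, link_store)
```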
Inclusion is another important feature of the RHYTHM hypertext
system. In RHYTHM it is possible to select any subpart of an open
document and "include" it in another document. From then on, the
included part belongs to the including document's content, and is
always shown with the document. Inclusion is therefore similar to
copy & paste, with two basic differences: no data is duplicated, and
a live connection with the original document is always kept, so that
it is always possible to "jump back to context", i.e. examine the
citation in the original document from which it was taken. Inclusion
is, in our opinion, another powerful method (alongside navigation
links) for interconnecting documents and creating new forms of
interactive, non-linear data containers.
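A minimal sketch of inclusion follows, assuming that a document is a
list of pieces, each either literal text or a reference to a span of
another document. The names Inclusion and render and the sample store
are ours, and the sketch assumes that inclusions never form a cycle.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Inclusion:
    """A reference to a span of another document; no text is copied."""
    doc: str    # the original document, i.e. where to "jump back to context"
    start: int
    end: int


Piece = Union[str, Inclusion]


def render(doc_id: str, store: Dict[str, List[Piece]]) -> str:
    """Build the displayed text of a document, resolving its inclusions."""
    parts = []
    for piece in store[doc_id]:
        if isinstance(piece, Inclusion):
            parts.append(render(piece.doc, store)[piece.start:piece.end])
        else:
            parts.append(piece)
    return "".join(parts)


store: Dict[str, List[Piece]] = {
    "source": ["Links may be stored outside documents."],
    "notes": ["Quote: ", Inclusion("source", 0, 5), "!"],
}
shown = render("notes", store)
```

Since "notes" holds only a reference, it is resolved against the
original each time it is shown, which is the live connection that copy
& paste lacks. With plain offsets as used here, a change to "source"
could shift the included span, which is why anchors need the updating
mechanism discussed earlier.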
Furthermore, inclusion is exploited for two other features of the
RHYTHM hypertext system: authoring and versioning.
Versioning is used for updating external anchors, and allows the user
to examine and modify previous versions of any document without ever
losing track of development sequences and history. Versioning is
based on inclusion, since a new version of a document is a new
document including references to the parts of the previous version
that are kept unchanged. All references to the document are easily
mapped to the correct position in the new version, all data that
have been removed from the document are still accessible by
examining previous versions of the document, and different versions
of a document can be examined at the same time by showing them side
by side.
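To make the idea concrete, a new version can be modelled as a list of
pieces, each either new text or a range of the previous version that
is included unchanged. The representation and the functions build and
map_span are assumptions made for this sketch, not a description of
RHYTHM's internal format.

```python
from typing import List, Optional, Tuple, Union

# A piece of the new version: either new text, or a (start, end) range
# of the previous version that is included unchanged.
Kept = Tuple[int, int]
Piece = Union[str, Kept]


def build(previous: str, pieces: List[Piece]) -> str:
    """Materialise the new version from its pieces."""
    return "".join(p if isinstance(p, str) else previous[p[0]:p[1]]
                   for p in pieces)


def map_span(span: Tuple[int, int],
             pieces: List[Piece]) -> Optional[Tuple[int, int]]:
    """Locate a span of the previous version inside the new version.

    Returns None when the span is not wholly inside one kept range.
    """
    start, end = span
    offset = 0  # length of the new version assembled so far
    for piece in pieces:
        if isinstance(piece, str):
            offset += len(piece)
            continue
        kept_start, kept_end = piece
        if kept_start <= start and end <= kept_end:
            shift = offset - kept_start
            return (start + shift, end + shift)
        offset += kept_end - kept_start
    return None


v1 = "hello world"
v2_pieces: List[Piece] = [(0, 6), "big ", (6, 11)]
v2 = build(v1, v2_pieces)
moved = map_span((6, 11), v2_pieces)  # anchor on "world" in v1
```

Because the kept ranges name exactly what survived from the previous
version, text that was dropped remains available there, and an anchor
that falls in removed text is detected instead of silently misplaced.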
Authoring, in RHYTHM terminology, means the possibility for all users
to access in WRITE mode all documents they can read, both private
and public, including those belonging to other users, whether stored
locally or remotely. This implies the possibility to add, change,
delete, and move any data within any document. Modifying public
documents for one's own needs is in the same vein as creating private
links on public documents: it is a way of customising the environment
and fully integrating one's own needs within a shared environment,
allowing operations that can be done with books, which we can
underline, annotate, delete, tear apart and rebind in different ways,
but that cannot yet be done with hypertext documents. An important
characteristic that must be kept, though, is authorship: it is
unacceptable to allow any user to do any operation on public data,
thereby irrevocably changing the data for all users. The
customisation of the environment must not interfere with other users'
right to do the same: it must be done on private views of the data.
RHYTHM allows any modification of any document by creating a new
document including the original one, over which the user has complete
control, and where he can add, delete, modify and move data as much
as he likes. This document does not really replace the original one,
which is still accessible to the author and all other users, but
replaces it for the user who made the modifications, who can still
access the original document as a previous version of his
modifications.
As a conclusion, we would like to stress that the path that WWW opened
is very important and decisive for the future of the exploitation of
wide area networks such as the Internet, but new forms of interaction
will need to be provided, which WWW, right now, cannot provide. It is
our opinion that one basic architectural choice will need to be
implemented for these forms of interaction to be possible, and this
choice is external remote anchoring. It is also our opinion that the
RHYTHM hypertext system, based on this and on inclusion, might be an
interesting test system for these features.
-----------------------------
Fabio Vitali
CIRFID, University of Bologna
e-mail: fabio@cirfid.unibo.it