External anchoring for wide-area network support: the RHYTHM project
Cesare Maioli (+), Stefano Sola (*) and Fabio Vitali (*)

(+): Dept. of Mathematics, University of Bologna
(*): CIRFID, University of Bologna

Many hypertext systems allow different users to access a common database concurrently over a local area network. Implementing such systems requires careful consideration of many aspects, such as concurrency, access rights and authorship. Unfortunately, it is not easy to scale hypertext systems up to international networks. The creation of many independent hypertext databases, access by thousands of concurrent users, and the range of data types and formats that can be made available throughout the network are important issues that cannot be approached lightly. WWW has admirably addressed wide-area network access and support, and has been the first hypertext system to do so.
Yet there are some problems that WWW has not addressed. It is clear, for instance, that WWW is mostly a tool for displaying and browsing data: it makes little provision for interactive editing and linking. The literature, however, both classic and recent (e.g. Ted Nelson, cf. "Literary Machines", Sausalito Press, 1983, and Poltrock, cf. Malcolm, Poltrock, Schuler, "Industrial Strength Hypermedia: Requirements for a Large Engineering Enterprise", Hypertext '91 Conference Proceedings, ACM Press, 1991), strongly stresses the importance of an environment for the integrated coexistence of public and private data and of public and private links, i.e. links created by official authors for browsing and exploring ideas and links created by readers for their own purposes, such as note-taking, brainstorming, and the creation of guided tours and private collections. Poltrock indeed suggests the creation of "asynchronous, distributed virtual meetings" between authors and readers, where the difference between their roles blurs and eventually disappears. This is difficult to do with WWW, since links are stored within the document containing the starting end-point.
Unless WRITE access rights are obtained (which is, in general, out of the question), the best one can do to create a private link on a public document is to create a new, personal document containing a link to the whole of the public document. This restricts the choice of links to node-to-node or span-to-node links (no span-to-span ones) and, furthermore, to private-to-public links, i.e. no public-to-private or public-to-public ones.
It is our opinion that only one implementation decision allows the creation of private span-to-span links between public documents in an arbitrarily large distributed system: external, remote anchoring. This is an implementation and syntactic issue that concerns anchoring policies. A discussion of the anchoring policies implemented in various hypertext systems can be found in our paper: C. Maioli, S. Sola, F. Vitali, "Wide-Area Distribution Issues In Hypertext Systems", ACM SIGDOC '93 Conference, ACM Press, Kitchener (Canada), 5-8 October 1993, which can be downloaded by anonymous FTP from ftp.cs.unibo.it.
External remote anchoring stores the exact specification of both end-point spans outside both documents, for instance in a dedicated link table. This table is completely independent of all documents and can be stored, for instance, on the home server of each user. Therefore no WRITE access is ever necessary to create a link on public documents, since the link is stored outside the documents, on a computer where, presumably, the user has WRITE permission.
On the other hand, an advantage of an internal anchoring policy is that whenever the document is modified, the anchors are immediately updated, since they move with the data. This cannot happen when anchors are stored outside the document they refer to: an explicit update operation needs to be implemented. Updating can be synchronous (i.e. performed as soon as the modified node is saved on disk) if the anchor table is under the control of the local server, but this is impossible to guarantee when the server storing the node and the server storing the link are different and autonomous: a modification notice might need to be sent to thousands of link servers throughout the net, which is clearly impractical.
A different approach is asynchronous updating of external anchors. This scheme would work on demand: only those links that are actually activated are updated, while the others remain outdated until someone needs to use them. As soon as a link is selected and activated, the system would ask the node server storing the requested node for the current position of the other end-point; the server would reply with the current version of the node and a specification of the current position of the requested anchor, which may have moved since the link was created. It is therefore necessary that all node servers employ a procedure for computing the current position of arbitrary spans as they were saved in previous versions of each node. One such mechanism is pattern matching (i.e. the link server stores a pattern of the end-points, and the node server looks for the current position of the pattern), but this is a very fragile solution, since modifications to the pattern may void the link, and there would also be problems of uniqueness, size and data types. Another solution is to implement a versioning mechanism for nodes, which would make it possible to compute the current position of an arbitrary span by examining the development of the node over time.
It is therefore our opinion that external remote anchoring, with on-demand anchor updating based on versioning, is the only practical solution for allowing interactive and integrated access to both public and private data and the creation of private links on public documents.
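The on-demand scheme described above can be illustrated with a minimal sketch. It is not RHYTHM's implementation: the names Edit, Anchor, Link, NodeServer, LinkServer and shift are hypothetical, nodes are assumed to be plain character strings, and a version history is assumed to be a simple list of edits. The rule for offsets that fall inside deleted text (they collapse onto the edit position) is only one possible convention.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edit:
    """One change to a node: replace `removed` characters at `pos` with `inserted`."""
    pos: int
    removed: int
    inserted: str


@dataclass(frozen=True)
class Anchor:
    """A span [start, end) of a node, valid for one specific version of that node."""
    node_id: str
    version: int
    start: int
    end: int


@dataclass(frozen=True)
class Link:
    source: Anchor
    target: Anchor


def shift(offset, edit, is_end):
    """Map an offset in the version before `edit` to the version after it."""
    if offset < edit.pos or (is_end and offset == edit.pos):
        return offset                                      # edit lies after the offset
    if offset >= edit.pos + edit.removed:
        return offset + len(edit.inserted) - edit.removed  # edit lies before the offset
    return edit.pos                                        # offset fell inside deleted text


@dataclass
class NodeServer:
    """Holds the nodes and, for each node, the edits that produced each new version."""
    texts: dict = field(default_factory=dict)    # node_id -> current text
    history: dict = field(default_factory=dict)  # node_id -> edits; edit i turns version i into i + 1

    def create(self, node_id, text):
        self.texts[node_id] = text
        self.history[node_id] = []

    def apply(self, node_id, edit):
        text = self.texts[node_id]
        self.texts[node_id] = text[:edit.pos] + edit.inserted + text[edit.pos + edit.removed:]
        self.history[node_id].append(edit)

    def resolve(self, anchor):
        """Return the anchor re-expressed against the current version of its node."""
        edits = self.history[anchor.node_id]
        start, end = anchor.start, anchor.end
        for edit in edits[anchor.version:]:
            start, end = shift(start, edit, False), shift(end, edit, True)
        return Anchor(anchor.node_id, len(edits), start, max(start, end))


@dataclass
class LinkServer:
    """The external link table: it lives apart from every node it points into."""
    links: list = field(default_factory=list)

    def follow(self, index, node_server):
        """Activate a link: refresh its target on demand and return the target text."""
        link = self.links[index]
        target = node_server.resolve(link.target)
        self.links[index] = Link(link.source, target)  # only activated links get refreshed
        return node_server.texts[target.node_id][target.start:target.end]
```

In this sketch the node server never notifies anyone when a node changes; a stale anchor is brought up to date only when follow is called on its link, which is the asynchronous behaviour argued for above. Creating a link needs no WRITE access to the target node, since the link is recorded only in the LinkServer.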
The RHYTHM hypertext system (Research on HYpertext THeory and Management) explicitly addresses interactive and integrated access to public and private hypertext data on an arbitrarily large computer network. RHYTHM is being developed at the University of Bologna, within the Progetto Finalizzato CNR "Sistemi Informatici e Calcolo Parallelo", subproject #7 "Sistemi di Supporto al Lavoro Intellettuale". RHYTHM is based on two basic ideas: external anchoring and inclusion.
External anchoring has already been explained. In RHYTHM, links can be stored in the document containing the starting end-point, in the document containing the destination end-point, or in any other document on the net. It is necessary, though, that the document containing the link be open for the link to be shown. Therefore, the set of links starting from or leading to a document may vary depending on the set of open documents and the links they contain. It is thus possible to create multiple different tours and link sets on the same set of data documents by activating different link documents.
Inclusion is another important feature of the RHYTHM hypertext system. In RHYTHM it is possible to select any subpart of an open document and "include" it in another document. From then on, the included part belongs to the including document's content and is always shown with that document. Inclusion is therefore similar to copy & paste, with two basic differences: no data is duplicated, and a live connection with the original document is always kept, so that it is always possible to "jump back to context", i.e. examine the citation in the original document from which it was taken. Inclusion is, in our opinion, another powerful method (alongside navigation links) for interconnecting documents and creating new forms of interactive, non-linear data containers. Furthermore, inclusion is exploited for two other features of the RHYTHM hypertext system: authoring and versioning.
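As a rough illustration of inclusion, the sketch below models a document as a list of parts, each of which is either literal text or a reference to a span of another document. The names Inclusion, DocumentStore, render and jump_back_to_context are hypothetical and are not taken from RHYTHM. The spans are plain character offsets, which is a simplifying assumption: keeping them valid when the source changes length is exactly the anchor-updating problem discussed earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inclusion:
    """A live reference to the span [start, end) of another document."""
    source_id: str
    start: int
    end: int


class DocumentStore:
    """Documents are lists of parts; a part is either literal text or an Inclusion."""

    def __init__(self):
        self.docs = {}  # doc_id -> list of parts

    def render(self, doc_id):
        """Build the visible text, fetching every included span from its source."""
        pieces = []
        for part in self.docs[doc_id]:
            if isinstance(part, Inclusion):
                pieces.append(self.render(part.source_id)[part.start:part.end])
            else:
                pieces.append(part)
        return "".join(pieces)

    def jump_back_to_context(self, inclusion, margin=20):
        """Return the included span together with `margin` characters on either side."""
        text = self.render(inclusion.source_id)
        return text[max(0, inclusion.start - margin):inclusion.end + margin]
```

Because the including document stores only the reference, nothing is copied, and a change to the source shows up the next time the including document is rendered. Under the same assumptions, a private modified view of a public document could be written as a new document whose parts are inclusions of the unchanged spans of the original, interleaved with the user's own text.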
Versioning is used for updating external anchors, and allows the user to examine and modify previous versions of any document without ever losing track of development sequences and history. Versioning is based on inclusion, since a new version of a document is a new document including references to the parts of the previous versions that are kept unchanged. All references to the document are easily mapped to the correct position in the new version, all data that have been removed from the document are still accessible by examining previous versions of the document, and different versions of a document can be examined at the same time by showing them side by side.
Authoring, in RHYTHM terminology, means the possibility for all users to access in WRITE mode all documents they can read, whether private or public and belonging to other users, whether locally or remotely stored. This implies the possibility to add, change, delete and move any data within any document. Modifying public documents for one's own needs is in the same spirit as creating private links on public documents: it is a way of customising the environment and fully integrating one's own needs within a shared environment, allowing operations that can be done with books, which we can underline, annotate, cross out, take apart and rebind in different ways, but that cannot yet be done with hypertext documents. An important characteristic that must be kept, though, is authorship: it is unacceptable to allow any user to perform any operation on public data, thereby irrevocably changing the data for all users. The customisation of the environment must not interfere with other users' right to do the same: it must be done on private views of the data. RHYTHM allows any modification of any document by creating a new document that includes the original one, over which the user has complete control, and where the user can add, delete, modify and move data at will.
This document does not really replace the original one, which is still accessible to the author and all other users; it replaces it only for the user who made the modifications, who can still access the original document as a previous version of those modifications.
In conclusion, we would like to stress that the path WWW has opened is very important and decisive for the future exploitation of wide area networks such as the Internet, but new forms of interaction will need to be provided, which WWW cannot provide right now. It is our opinion that one basic architectural choice will need to be implemented for these forms of interaction to be possible, and this choice is external remote anchoring. It is also our opinion that the RHYTHM hypertext system, based on this choice and on inclusion, might be an interesting test system for these features.
-----------------------------
Fabio Vitali 
CIRFID, University of Bologna

e-mail: fabio@cirfid.unibo.it