Update: The CritSuite Toolset Project has been completed. This page is now part of an archive of CritSuite web pages. The domain http://crit.org no longer belongs to this project or to Foresight Institute. For current information on CritSuite, please see the site maintained by the author of the software, Ka-Ping Yee: http://zesty.ca/crit
In an attempt to get the ball rolling, I am trying to scope
out the high-level design but also to specify as many details as
possible. So this is really three documents in one. Sections 2 and 3 comprise a
requirements and objectives specification. Section
4 is a design specification. Sections 5
and 6 form an implementation outline. Each of
these three components is written to be considered independently:
any one of a number of designs might meet our objectives (but
here's one way to do it) and any one of a number of
implementations might conform to this design (but here's one way
to do it).
Credit for ideas should go to everybody. Specific
attributions are pointed out where possible.
The required functionality components I propose are
the ability for a reader to see, in any given document on
the Web, the existence and location of links constructed
by people other than the author of that document
the ability to represent links between any two documents,
at least of the type where the link comes from an entire
document and references a fine-grained target
the ability to store such links in a fashion that does
not need to reside within both documents
the ability to assign to such links some amount
of additional metainformation (such as a link type)
Or, in fewer (and familiar) words, I suggest that we want bidirectional,
fine-grained, extrinsic, typed
links in this implementation.
Note that, for now, we are leaving out filtered links and
multi-ended links (the Web need not be just a graph it
might eventually be a hypergraph). We are not implementing
correlation-based article filtering and rating (yet). We are also
explicitly omitting HyperWave(TM) functionality such as
collection management, ownership rights, payment,
searching/indexing, and universal resource naming (not to mention
a 3-D landscape viewer).
The proposed objectives of this project, in decreasing
priority, are:
to implement the requirements in a manner that maximizes
accessibility to current WWW users
to keep the implementation flexible enough to allow rapid
prototyping and extension
to achieve decent performance for a fairly small
number of users (e.g. under 10000 requests per
day)
to have a working implementation ready quickly (e.g.
by the end of June)
Potential goals beyond the essential requirements include:
more convenient posting of annotations
reader registration and preferences
display of what articles or links have been newly added
since the reader last checked in
more generally, visualization of discussion structure!
article rating and filtering
The implementation to be described addresses only the first of
these additional goals, but any implementation should leave doors
open enough to allow them all.
The design to be described specifically addresses requirements
a, c, and d.
Requirement b (link representation) is left to
the implementation.
The general proposal to scan and acquire links from the Web
(rather than keeping a separate database of annotations) is due
to Mark Miller, who named it M1. Augmented with the
annotation-posting facility to be described in Section
4.4, I will refer to this design as M1+. (This design is also
described in a bit more detail than M1 and involves some further
decisions that Mark may or may not agree with...)
Central to this design is a mediator program through which
readers must fetch their documents. My definition of a mediator
is a program which (a) operates as both an HTTP server and a
client, much like a proxy, (b) performs some useful processing on
the document, and (c) modifies hyperlinks within documents so
that further documents in the user's browsing session will
continue to be requested via the mediator. Using my terminology,
in this model the client makes a request to the mediator,
which then contacts the origin server to fetch the raw
document; the mediator produces a processed document
which it serves to the client.
While proxying is a little inconvenient for users and leads to
some performance limitations, it seems to be the only way to
provide universal accessibility. A pseudo-proxy, though it
involves still more processing, is more flexible than a plain
HTTP proxy because it can be reached by users inside firewalls
(and perhaps its use is easier to describe since it avoids any
browser-specific details).
The M1+ mediator is responsible for fetching any document from
the publicly-accessible Web, processing its HTML text so as to
present bidirectional link information, and sending the result on
to the user.
It is proposed that links remain directed, i.e.
asymmetric. Every link consists of two locations, a head and a
tail, where the tail must always lie within a document authored
by the creator of the link. The tail is the location of the
reference and the head is the location referred to, in analogy to
the head and tail of an arrow.
It is proposed that links be allowed to be either
coarse (referring to an entire document) or fine (referring to a
specific passage in a document) on either the head or
the tail.
(For comparison, both previous annotator implementations only
supported fine-grained heads and coarse-grained tails. Usenet
supports visible heads and visible tails, both coarse-grained.
Most current Web browsers implement only fine-grained tails. Some
truly advanced browsers such as Lynx, Traveler, and my
hacked copy of Grail also support coarse-grained tails. The Web
in general allows fine-grained or coarse-grained heads, but link
heads are never visible. To make matters worse, fine-grained
heads are essentially made useless except in only very few
browsers like Grail which actually implement them sensibly.)
It is further suggested that people be encouraged to classify
each link as one of four types:
query (request for clarification)
comment (ancillary information or a related topic)
support (positive argument)
issue (correction, adjustment, or negative argument)
where the type name indicates the relationship of the tail
object to the head object. For example, many of the
existing "go check out this neat site!" kinds of links
on the Web might be classified as "comments".
These basic semantic link types are drawn from suggestions by
Dean Tribble, Eric Drexler, and others.
It is a requirement of this design that all link information
be represented in HTML in the tail documents themselves. This
keeps the link information public, which may later aid
scalability. (The manner of representation of link types and
granularities in HTML is left to the implementation in Section 5.1.)
While processing any requested document, the mediator scans it
for this link information according to the chosen syntax, and
stores it for future use. When a document containing known link
heads is requested, the mediator inserts reverse links in
effect, filling in both ends of a link so that current web
browsers can make use of them. Since the most popular web
browsers only support visibility of fine-grained link tails,
we require the addition of coarse-grained link tails as well as
both fine-grained and coarse-grained link heads.
In the original design, the mediator could also periodically
refetch documents for examination (either from a list of known
documents which users edit, or from the history of previously
requested documents). This is perhaps an implementation issue,
but I believe that scanning only requested documents ought to be
enough for now.
The + in M1+ refers to one extra feature that is not in the
essential requirements, which is the ability for users to rapidly
publish simple documents when they wish to annotate a given page
or contribute to a discussion without having to create an HTML
document on their own site. This helps to make up for the earlier
restriction that all links must reside in public HTML documents
authored by the link creator.
To allow for rapid publishing the mediator needs to add some
user interface components which allow a reader to select a link
head in the document being viewed, and then to enter a document
for the link tail (effectively an annotation).
Documents entered in this fashion are stored on the mediating
server at a publicly-accessible URL. In this design we do not
require that they be editable after being posted. The
implementation of this feature will likely include a mechanism
for adding some simple markup to the plain entered text (see Section 5.2).
The use of a mediator makes it impossible for anyone to
annotate documents that are inside firewalls. It also
may potentially yield many problems with frame documents (a
specific implementation for dealing with frame documents is
possible, but I propose to leave that problem for later).
By encouraging people to write links in their own HTML
documents, we alleviate the need for people to get special
software to do authoring, but we also open ourselves to an
anarchy of link types (or to lazy authors who never bother to
classify their links).
HTML was designed to allow REL and REV
attributes on links (which stand for "relationship" and
"reverse relationship" respectively). We can use these,
even though most browser manufacturers have totally ignored them.
Note that REL is defined to give the relationship of
the head object to the tail object. Since the
link types we described in the design (see Section
4.2) give the relationship of the tail to the head,
we ask authors to use our link types with the REV
attribute.
Links with coarse-grained tails are represented in
HTML using the <LINK> tag within the <HEAD>
element of a document. Again very few browsers bother to
implement this, but we can use it.
Links with fine-grained tails are represented in
HTML, as they are now, as text flows contained in an anchor
element marked by the <A> and </A>
tags.
Links with coarse-grained heads are represented in
HTML, as they are now, as document URLs given in the HREF
attribute of a <LINK> tag or an <A>
tag.
Links with fine-grained heads are represented as
document URLs with a fragment identifier. We use the existing
convention that a hash (or number, or pound) sign separates the
requested URL from the fragment identifier. But we must invent a
new scheme to be used in the identifier itself, since HTML only
supports fine-grained heads where the head anchor is explicitly
marked in the head document. This degrades well, about as
well as you could ever ask, since the unmediated link will still
point at the correct document (coarsely).
The scheme is based on a finding a search string for context
and then identifying the portion of that search string which is
to form the head anchor of the link. The search string is to be
sought in the HTML text data after tags have been removed. Tags
or any amount of whitespace in the head document become word
breaks. Any punctuation is removed, leaving only letters or
digits in each word. This scheme requires that link anchors
contain only entire words, not parts of words.
Since URLs may not contain spaces, we stipulate that the
identifier consists of the words in the search string separated
by hyphens. Searches are case-sensitive. Finally, we enclose the
anchor portion of the string in parentheses. An example of a
reference to the words "may not contain spaces" at the
beginning of this paragraph would be
wep.html#Since-URLs-(may-not-contain-spaces)-we
The two namespaces are completely separate, since current
fragment identifiers are SGML NAME identifiers, which are
restricted to letters, digits, and underscores. Both parentheses
and hyphens are legal in URLs but illegal in SGML NAMEs. The
remaining punctuation characters which happen to be legal in
URLs, according to RFC 1738, are
+ _ = . / * , @ ' $ : ; & ! ?
which are all available for later use to represent more
interesting kinds of search strings.
(Instead of the hyphen, any word delimiter would do
such as the underscore or the period but hyphens are
pretty easy to type.)
When the user enters a comment for rapid publishing, the
comment is entered as plain text. The body of the comment is
automatically marked up according to the following conventions:
Blank lines are treated as paragraph breaks.
Sections of text indented by space are treated as
preformatted.
Words enclosed in asterisks are emphasized.
Strings which appear to be URLs are turned into link
tails.
As well as the comment body, the user is also asked to supply
the title of the new document. These elements are organized into
a complete HTML document posted on the mediator's server
(probably with a filename based on the document title).
The search string identifier scheme requires that all link
head anchors be text strings. It is not possible to make a link
referring to (for example) an image in another document, although
you could make a link to a phrase of text including the image.
This could be alleviated by adding some search syntax for finding
specific tags.
My preference is to implement the system in Python, to allow
easy object-oriented work, facilitate modularity and code reuse,
and provide quick prototyping abilities. Python's extensibility
allows us to later implement more speed-critical parts of the
code in C if necessary.
The code is to be divided into the following main modules
(presented here roughly in order of decreasing depth):
m1conf (configuration variables)
m1http (HTTP client)
m1lex (HTML lexical analyzer)
m1gather (link gatherer)
m1linkdb (link database)
m1url (URL parser and rewriter)
m1ui (user interface)
m1med (document mediator)
m1req (request handler)
Information flows from the HTTP client through the lexical
analyzer to the user interface, which uses the link database and
URL rewriting utilities to add value. The information also flows
from the lexical analyzer to the link gatherer, which saves
detected links in the link database. This information is a series
of tokens, each of which may be either text data or metadata
(such as a representation of an HTML tag or an indicator token
for some other purpose). Metadata tokens also contain state
information (such as the HTML tag stack).
For maximum performance the data is processed with a streaming
model which deals with information piecemeal as it comes
over the wire from the origin server. This streaming model is
supported by a category of stream-processing objects,
all of which have a method to accept a list of incoming data
tokens and contain a reference to the next stream-processing
object in the chain. Each such object maintains its own queue of
data, releasing tokens along the chain only when their state is
completely determined. (For example, if the lexical analyzer sees
a tag-opening <, it should queue data until it
has found the end of the tag.)
The document mediator module is responsible for calling upon
the HTTP client to provide the raw document to the user
interface. The top-level request handler interfaces the document
mediator with the outside world. Although the request handler
module may in future be a complete HTTP server of its own, for
the first implementation it can be just a CGI script.
The rapid publishing utility is independent of the mediator
and may be implemented as a separate CGI script. It is expected
that this program will also use the m1linkdb module to manipulate
the link database, and get its configuration information from the
m1conf module. It will also incorporate a module named m1markup
which is responsible for accepting a plain-text document and
applying rapid-publishing markup.
This module would contain configuration information such as
the host server's name and address, ports to bind to, relevant
URLs, and other string constants.
This module opens and manages an HTTP client connection to a
specified origin server, generating a stream of text data as it
comes over the wire. The interface to this module includes:
class HTTPSession
Constructed with a complete HTTP request, an object of
this class tries to produce the requested document. It
may maintain its own cache to draw on. If it does not get
the document from the cache, it creates a new connection
to an origin server and makes the request. It must parse
the HTTP header. If the content type is not HTML, it
produces binary-data tokens that are passed through the
other stream processors untouched. If the contained
document is HTML, all content data received is sent to
the next stream processor as text tokens. Any errors or
other conditions are indicated by sending metadata tokens
down the stream.
This module is responsible for breaking a stream containing an
HTML document into text and tag tokens, while maintaining a tag
stack. To correctly maintain the tag stack, this module needs to
contain a small database of circumstances in which tags need to
be implicitly closed or inserted.
class HTMLLex
Constructed with a reference to the next
stream-processing object, an object of this class accepts
HTML text data and produces a stream of text and tag
tokens. Tag tokens produced by this object contain the
tag stack state.
This module gathers link information for the database.
class LinkGatherer
Constructed with a LinkDB, an object of this class
provides a stream sink that accepts text and tag tokens.
Incoming tag tokens which indicate links (i.e.<A>
and <LINK> tags) are assembled into
entries in the link database.
This module manages a link database in permanent storage
(likely implemented with a public-domain database library such as
BSDDB or GDBM). The interface to this module includes:
class LinkDB
Allows direct reading and writing of the link database,
for use by the annotation interface.
class LinkFinder
Represents a collection of link anchors which can be
compared to a stream of data. As data is added to the
stream, the method forwards as much data as possible with
tokens inserted to represent anchor limits
(beginning or ending points of anchors). An object of
this class is constructed given a LinkDB and a particular
URL for a target document.
This module provides a number of utilities for manipulating
URLs joining them, resolving them with respect to a given
base URL, and creating extended URLs that reference the mediator.
This module hopefully separates the user interface from the
rest of the mediator's inner workings. The stream processing
object in the user interface accepts the tags and tokens from a LinkFinder
object, and turns the special tokens for link anchors into the
desired user interface components (for example, linked dots,
arrows, or buttons).
This module accepts a request for a document to be mediated,
and sets up the appropriate objects and stream processors from
the above classes to perform the processing. It may maintain its
own cache of pre-processed documents.
This module deals with the details of CGI request parameters
and produces a request for the mediator module to perform. It is
the final receiver of data from the mediator, and dumps the data
to the client.