Conversion Issues
Although training manuals,
encyclopedias, and dictionaries are well
suited for adaptation to hypertext,
converting such very large information
spaces from text to hypertext has been a
classic problem.
The currently published literature on
hypertext contains little work that
deals directly with transforming
large volumes of encyclopedic text into
hypertext form; most of it deals with
creating small hypertext documents, not
with converting large documents to
hypertext.
The following are some of the issues
involved in converting text to hypertext
[Glushko, 1989],
[Riner, 1991]:
-
Identifying documents that would benefit
readers if converted to hypertext form.
-
Determining procedures to convert them
to hypertext format.
-
Preparing documents in an electronic
format from paper or other forms.
-
Identifying nodes and links and
classifying them into various types
(to capture semantics). An important
problem related to this issue is
called the fragmentation problem.
It is still difficult to identify text
units that can be separate modules and
also serve as cross-references for
other entries. Links should follow some
model of the user's need for information
in some particular context. Deciding on
the level of granularity is a difficult
problem: the finer the granularity, the
greater the problem of fragmentation;
the coarser the granularity, the greater
the need for the display of large entries.
Also, fragmentation tends to make an
implicit structure (such as a subtle
treatment of a theme that may communicate
an idea more artistically) explicit,
taking away the expressiveness of the
statement. Therefore, we have to find
means to reduce segmentation of ideas and
loss of structural information due to the
manipulation of the semantic structure of
a linear document.
-
Determining whether the target of a
link is a complete entry, a subentry, or
a derivative form is a challenging task.
This involves determining the right
part of speech, identifying the
etymological root, and applying sense
disambiguation to identify a particular
meaning.
-
With present-day video monitors, the
display of large entries in their
entirety is still a problem. This can
be partly solved by using fisheye
views and abbreviations. Structural
information can be extracted from the
tags and employed in the construction
of a structural view.
-
Performing the conversion and
verifying the results.
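
The granularity trade-off raised in the item on identifying nodes
can be illustrated with a small sketch. It is not taken from the
cited work: the Unit class, the sample entry, and the function
names are all hypothetical, and the sketch assumes the source text
has already been parsed into a tree of structural units. Cutting
the tree deeper yields more, smaller nodes (more fragmentation);
cutting it higher yields fewer, larger nodes (larger entries to
display).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Unit:
    """A structural unit of the source text (hypothetical model)."""
    title: str
    text: str = ""
    children: List["Unit"] = field(default_factory=list)


def flatten(unit: Unit) -> str:
    """Join a unit's own text with the text of all its descendants."""
    parts = [unit.text] if unit.text else []
    parts.extend(flatten(child) for child in unit.children)
    return " ".join(parts)


def split_into_nodes(unit: Unit, depth: int) -> List[str]:
    """Cut the tree `depth` levels below `unit`.

    Every subtree at the cut becomes one hypertext node. A depth of 0
    keeps the whole unit as a single node (coarsest granularity).
    """
    if depth == 0 or not unit.children:
        return [flatten(unit)]
    nodes = [unit.text] if unit.text else []
    for child in unit.children:
        nodes.extend(split_into_nodes(child, depth - 1))
    return nodes


def summarize(unit: Unit, depth: int) -> Tuple[int, int]:
    """Return (number of nodes, word count of the largest node)."""
    nodes = split_into_nodes(unit, depth)
    return len(nodes), max(len(node.split()) for node in nodes)


# Invented sample entry, used only to exercise the functions above.
entry = Unit("hypertext", children=[
    Unit("sense 1", children=[
        Unit("definition", "text of linked nodes"),
        Unit("example", "the web is hypertext"),
    ]),
    Unit("sense 2", children=[
        Unit("definition", "a system for such text"),
    ]),
])

for depth in range(3):
    count, largest = summarize(entry, depth)
    print(f"depth={depth}: {count} node(s), largest has {largest} words")
```

A real conversion would still need a model of the reader's needs
to choose the cut; the sketch only makes the two opposing costs
measurable for a given depth.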
Hypermedia structures and systems assignment by
Mark de Haas (0481832)