Session A: Changes in University Organisation and Structure
Stefan Gradmann: Reducing White Noise

2. White Noise

The American writer Don DeLillo has coined one of the strongest metaphors for the problem that Bush had pointed to and that this paper is also concerned with. One of DeLillo’s most successful novels bears the title “White Noise” [2] and is essentially concerned with the entropy-like situation caused by information overload. If the probability of locating a specific bit of information is equal at any given point on the information map, simply because such bits of information are overwhelmingly omnipresent while means for filtering and aggregating them are absent, the effect is white noise information: non-information caused by an overload of unstructured information.

DeLillo’s novel was written in 1985, and the loss of coherence in world perception and organization caused by information overload that it depicts is therefore mainly associated with the medium of television. Had this apocalyptic vision been written 15 years later, it could well have been about information in the World Wide Web. When describing the information ecology of the WWW in 1997, Jeff Ubois stated that “the rapidly widening gap between the amount of data in the world and the amount of attention available to process it means a growing percentage will never be looked at by a human.” [3] This could have been put even more pessimistically: the risk is not only that substantial information is never actually retrieved; the real threat is that valuable information is submerged by heaps of irrelevant bits of information, by what David Shenk called “Data Smog” in his monograph published in 1997.

This paper is about this data smog, the tendency towards entropy induced by its omnipresence, and the danger of WWW information quality being reduced to the precision and specificity of white noise. But most of all, this paper is about some of the means we have, or may conceive, to prevent this entropic tendency from becoming increasingly dominant and ultimately suffocating in terms of information ecology. More specifically, the contribution is concerned with the role of traditional players in academic institutions - such as computer centers and libraries - and of newly emerging institutions - such as multi-media centers - in the context of what may be termed an academic information garbage policy.

The sad fact is that the logic of building WWW-based information services has so far remained mostly cumulative: for a long time, the sheer amount of accumulated information was considered a quality in itself. This cumulative logic was a reaction to the traditional information paradigm, in which generating and accumulating information was a tedious process that went along with strong mechanisms for information filtering and aggregation, some of them deeply anchored in academic culture. The scarcity of information resources and the difficulty of accessing them physically further reduced the danger of information overload in this traditional information economy. With this past situation in mind, any infrastructure granting easy, fast access to huge amounts of information was of course perceived essentially as eliminating restrictions and thus as a value in itself.

The information explosion within the WWW has thus been acclaimed for its sheer quantity for quite some time, as if merely doubling the bytes available on the Web were equivalent to a proportional growth in informational value. However, this tremendous amount of easily accessible electronic information comes with only poor mechanisms for information filtering and aggregation.

Approaches based on the use of search engines are a good example of such shortcomings. The following example (Figure 1) is meant to illustrate the almost uncontrollable precision rate, combined with only seemingly impressive recall, obtained when searching for the term ‘virtual library’ via the meta-engine SuperSeek:

Figure 1

These results are without doubt questionable: while the highly selective Yahoo service announces 336 matches, InfoSeek already reports 64,118, and Lycos and AltaVista announce truly impressive figures of 1,407,776 web sites and 1,630,740 pages respectively. Even though these discrepancies may partially be due to the different functional paradigms underlying the search engines, it is close to impossible to judge the actual relevance of the ‘information’ thus retrieved. The lack of transparency of the respective ranking algorithms clearly does not contribute to a more focused picture either: when comparing the top ten results reported back by each of these information services, the user ends up with 35 different site indications, none of which is contained in all four top ten sets (only one is reported back by at least three services, and another four figure in at least two top ten result sets).
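A comparison of top ten sets of this kind can be sketched in a few lines of Python. The fragment below is only an illustration under stated assumptions, not the procedure used to obtain Figure 1: the function name overlap_profile and all site names are hypothetical, and the lists are shortened to three entries per service.

```python
from collections import Counter
from typing import Dict, List


def overlap_profile(top_results: Dict[str, List[str]]) -> Dict[int, int]:
    """Map k -> number of distinct sites listed by exactly k services."""
    # Count each site once per service, even if a service lists it twice.
    services_per_site = Counter(
        site for results in top_results.values() for site in set(results)
    )
    # Tally how many sites were listed by 1, 2, 3, ... services.
    return dict(Counter(services_per_site.values()))


# Hypothetical, shortened result lists (placeholder site names, not real data).
example = {
    "Yahoo": ["a.example", "b.example", "c.example"],
    "InfoSeek": ["a.example", "d.example", "e.example"],
    "Lycos": ["a.example", "d.example", "f.example"],
    "AltaVista": ["g.example", "h.example", "i.example"],
}

profile = overlap_profile(example)
distinct_sites = sum(profile.values())
print(profile, distinct_sites)
```

A profile dominated by sites that only a single service lists, as in the comparison described above, indicates that the rankings of the services barely agree with each other.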

The SuperSeek example should make clear that WWW-based information services, and the tools used for searching them, certainly do an impressive job in terms of data accumulation but are rather poor aggregators of information.

Such pure accumulation of information does not serve research as such. Information overload, bearing entropic traits, even tends to harm serious research work, and it certainly does so once individual researchers are submerged by heaps of information without a real chance of determining the actual relevance of any given bit of this information stream.

Even though actually tackling this problem is clearly far beyond the reach of any individual institution, and technical solutions therefore need to be found in a more global context, academic institutions - like all other users of WWW information services - nevertheless need consistent strategies for dealing locally with white noise information. The following sections of this paper provide some examples of elements of such strategies, together with indications of the players needed to implement them.



© This publication and its compilation in form and content is copyrighted. Every realization which is not explicitly allowed by copyright law requires a written agreement. Especially, this holds for reprography and processing / storing by electronic systems.
