Presenters:
Suzette Keith, Interaction Design Centre, Middlesex University.
Bob Fields, Interaction Design Centre, Middlesex University.
Abstract: Digital libraries are notoriously difficult to design well in terms of their eventual usability. In this tutorial we will present an overview of usability issues and techniques affecting digital library interface design. We will develop scenarios that make use of information seeking models to describe the users, their goals and activities. The scenarios describe the user interaction and provide the context for examining the effect of the design on the user. Claims made about the design will be examined with reference to broad usability principles, cognitive models and models of information seeking. Through a graduated series of worked examples participants will get hands-on experience of how to develop scenarios and apply claims analysis. Trade-offs between positive and negative claims and the effect on the design activity will be examined.
Target audience: This is an introductory tutorial. Participants are not expected to have prior experience of usability techniques for digital libraries, or of human-computer interaction more generally. You are, however, expected to have some experience of working with digital libraries, or of delivering digital library services.
Duration: Full-day, 09.00-17.30, Sunday August 17.
Presenters:
Giuseppe Amato, ISTI-CNR, Pisa, Italy.
Claudio Gennaro, ISTI-CNR, Pisa, Italy.
Pasquale Savino, ISTI-CNR, Pisa, Italy.
Abstract: The aim is to provide a theoretical and experimental
background on the techniques and the methodologies for the organization,
creation, and management of an Audio/Video Digital Library (A/V DL). The
frontier of DLs lies in managing multimedia documents rather than purely
textual information. In particular, because of the large amount of A/V material
now available in digital form and the importance of this material for many
aspects of everyday life (economic, environmental, health, cultural, social,
etc.), the management of A/V DLs is becoming crucially important.
The course will illustrate the techniques and the methodologies to design, build
and maintain an A/V DL. Extensive examples will be drawn from existing systems and
approaches. In particular, as a running example we will refer to the ECHO
system, which provides a DL service for historical films. It allows A/V material
to be indexed and retrieved using speech transcripts, video features
automatically extracted from the video, and metadata manually associated by the
user. Metadata are described by using an A/V metadata model based on the
IFLA-FRBR standard.
Target audience: Librarians, archivists, and computer scientists in the audio/video processing field.
Duration: Half-day, 09.00-12.30, Sunday August 17.
Presenter: George Tzanetakis, Computer Science Department, Carnegie Mellon University.
Abstract: The capacity to store and the ability to
distribute large collections of multimedia information is increasing every day.
A large percentage of this data as well as of current internet traffic consists
of music files either in compressed audio format or in symbolic representation.
As the recording industry is gradually moving towards digital music distribution
there is an increasing need for tools that can help analyze and search large
digital libraries of music. Music Information Retrieval (MIR) is an emerging
research area dealing with the problems of analyzing, indexing and searching
large collections of music. Although music information retrieval has
similarities with text, image and video information retrieval it has unique
characteristics that pose new challenges to research in digital libraries.
In the last few years the emerging field of MIR has been gaining momentum and a
large variety of problems, algorithms, tools and ideas have been proposed. As
with every evolving field these developments are scattered in various
publication forums and there is little work that provides a comprehensive
overview of the field especially for researchers that are not directly involved
but are interested in learning about it.
Most existing work in MIR falls into two main categories based on the
underlying representation used: 1) symbolic MIR where the underlying
representation is some form of musical score and the techniques used are more
closely related to text IR and 2) audio MIR where the underlying representation
is an audio file and the techniques used are more closely related to multimedia
IR. This tutorial will only cover audio representation. The main emphasis will
be on defining various problems in MIR and describing the fundamental ideas and
concepts behind solving them rather than providing unnecessary technical
details.
Target audience: The intended audience is people involved with digital libraries from academia and industry who are interested in the emerging area of MIR. Familiarity with basic concepts in text and multimedia information retrieval as well as some math will be useful but not necessary.
Duration: Half-day, 14.00-17.30, Sunday August 17.
Presenter: Dagobert Soergel, College of Information Studies, Univ. of Maryland.
Abstract: This introductory tutorial is intended for anyone concerned with subject access to digital libraries. It provides a bridge by presenting methods of subject access as treated in an information studies program for those coming to digital libraries from other fields. It will elucidate through examples the conceptual and vocabulary problems users face when searching digital libraries. It will then show how a well-structured thesaurus / ontology can be used as the knowledge base for an interface that can assist users with search topic clarification (for example through browsing well-structured hierarchies and guided facet analysis) and with finding good search terms (through query term mapping and query term expansion using synonyms and hierarchic inclusion). It will touch on cross-database and cross-language searching as natural extensions of these functions. The tutorial will cover the thesaurus structure needed to support these functions: concept-term relationships for vocabulary control and synonym expansion, and conceptual structure (semantic analysis, facets, and hierarchy) for topic clarification and hierarchic query term expansion. It will introduce a few sample thesauri and some thesaurus-supported digital libraries and Web sites to illustrate these principles.
Target audience: This introductory tutorial is intended for anyone concerned with subject access to digital libraries.
Duration: Half-day, 09.00-12.30, Sunday August 17.
Presenter: Dagobert Soergel, College of Information Studies, Univ. of Maryland.
Abstract: This tutorial is intended for people who have a basic familiarity with the function and structure of thesauri and ontologies. It will introduce criteria for the design and evaluation of thesauri and ontologies and then deal with methods and tools for their development: locating sources; collecting concepts, terms, and relationships to reuse existing knowledge; developing and refining thesaurus/ontology structure; software and database structure for the development and maintenance of thesauri and ontologies; collaborative development of thesauri and ontologies; developing crosswalks / mappings between thesauri/ontologies. In summing up, the tutorial will address the question of the amount of resources needed to develop and maintain a thesaurus or ontology.
Target audience: This tutorial is intended for people who have a basic familiarity with the function and structure of thesauri and ontologies.
Duration: Half-day, 14.00-17.30, Sunday August 17.
Presenters:
Linda Hill, Alexandria Digital Library Project, Department of Geography, University of California, Santa Barbara.
Michael Freeston, Project Coordinator, Alexandria Digital Earth Prototype Project, Department of Computer Science, University of California, Santa Barbara.
Abstract: Georeferencing is relating information (e.g., documents,
datasets, maps, images, biographical information) to geographic locations
through placenames (i.e., toponyms) and place codes (e.g., postal codes) or
through geospatial referencing (e.g., longitude and latitude coordinates). The
digital library perspective toward georeferencing is a blend of the focus of
Geographic Information Systems (GIS) on geospatial coordinates, data layers, and
mapping; of map librarianship; and of the traditional library focus on textual
representation of location using placenames, administrative unit hierarchies,
and other textual forms of spatial reference.
This tutorial covers the broad scope of georeferencing, including an overview of
types of georeferenced objects and their characteristics; fundamental concepts
of geospatial referencing; georeferencing structures of metadata standards
(MARC, FGDC, Dublin Core, and more); gazetteers and their role in translating
between textual and geospatial location referencing; supporting database
architectures; and geospatial matching in information retrieval. In the process,
the major information management standards for geospatial description,
retrieval, interoperability, and information exchange will be identified.
Target audience: This is an introductory tutorial that is relevant to those interested in the application of map-based (geospatial) indexing for objects in digital library collections, cataloging and metadata design, knowledge organization systems, information retrieval, information visualization, and in subject fields where georeferencing is key to information analysis, including the social sciences, humanities, and environmental sciences.
Duration: Half-day, 09.00-12.30, Sunday August 17.
Presenters:
James Frew, Assistant Professor, Donald Bren School of Environmental Science and Management, University of California, Santa Barbara.
Gregory A. Janée, University of California, Santa Barbara, Department of Computer Science, Alexandria Digital Library.
Rudolf W. Nottrott, University of California, Santa Barbara, Department of Computer Science, Alexandria Digital Library.
Catherine Masi, Davidson Library, University of California, Santa Barbara.
Abstract: This tutorial will be of interest to individuals or
institutions with geospatial digital content which they would like to publish
for structured search and retrieval over the Web. The tutorial is based on
software developed by the Alexandria Digital Library Project (ADL), which
facilitates the creation and management of distributed digital library
collections. ADL collections can operate stand-alone for use by individual
users, or optionally and seamlessly switch into a distributed mode for web-based
information sharing and publication.
Geospatial collections are typically heterogeneous in content and can span items
as diverse as maps, historical photographs, field data, remotely sensed images
or archeological data. The ADL software allows structured search and retrieval
on such heterogeneous data collections, combining the simplicity of Dublin Core
with the specificity of a full Boolean query language. The aim of the tutorial
is to familiarize participants with the overall technology and with the specific
procedures and software involved in setting up a stand-alone or distributed ADL
node. As a case study, we will focus on a collection of USGS Digital Raster
Graphics (DRG) maps. However, the technology we present is much more general: it
can be applied to collections of any georeferenced library objects and, further,
to collections of any objects to which a structured discovery technique can be
applied. Based on Open Source components and open protocol standards (including
Java, Tomcat, XML, JDBC, SQL), the ADL software is freely available and can be
installed on all common software and hardware platforms.
Target audience: This tutorial targets individuals or institutions interested in publishing existing collection content for structured search and retrieval either on their local system or Intranet, or over the Internet to a global user community by participating in federated networks of heterogeneous content providers.
Duration: Half-day, 14.00-17.30, Sunday August 17.
Presenter: Fredric Gey, University of California, Berkeley.
Abstract: The growth of the Internet and the World
Wide Web has made available vast written and spoken resources on a global scale
from almost all countries in the world. The languages represented on the web
are a reflection of this diversity of resources and, to the serious searcher,
documents in languages other than English may provide unique news, cultural
insight and altogether different perspectives on our electronic world. Moreover,
most of the world’s peoples speak a native tongue other than English. This fact
will increasingly be felt on the Internet. According to the Global Internet
Statistics (as of January 2001), the majority of internet users speak a non-English
language (52% versus 47%) as their native tongue (see http://www.glreach.com/globstat
for details). During the past decade rapid progress has been made in developing
techniques for Multilingual Information Access. Use of electronic bilingual
dictionaries and machine translation software has been augmented by lexicons
assembled from aligned bilingual parallel corpora of translated documents, techniques
for query expansion, phrase recognition and translation disambiguation. On the
other hand, most of these resources have been developed and applied to the major
European (English, French, German, Italian and Spanish) and Asian (Chinese,
Japanese, Korean) languages.
This half-day tutorial will cover aspects of Multilingual Information Access
such as cross-language search and retrieval, machine translation and statistical
machine translation, multilingual search of the WWW and electronic digital library
catalogs, evaluation strategies, evaluation campaigns and test collections for
cross-language search effectiveness in the United States (TREC), Japan (NTCIR)
and Europe (CLEF).
Target audience: The intended audience is professionals in information retrieval or digital library research whose positions may expose them to multilingual digital content. The audience will be exposed in some detail to the basic principles of multilingual search, automatic translation, and the evaluation of such search. Examples will be drawn from European languages, including languages with non-Roman alphabets such as Russian and Greek; Asian languages, where discernment of word boundaries (no white space between words) is a significant challenge; and other languages such as those from the Indian subcontinent.
Duration: Half-day, 09.00-12.30, Sunday August 17.
Presenters:
Martin Doerr, Information Systems Lab, Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH), Vassilika Vouton.
Stephen Stead, Vice Chair, CIDOC/ICOM.
Abstract: This tutorial will introduce the audience to the CIDOC
Conceptual Reference Model, a core ontology and proposed ISO standard (ISO/CD
21127) for the semantic integration of cultural information with library,
archive and other information. The CIDOC CRM concentrates on the definition of
relationships, rather than classes, in order to capture the underlying semantics
of multiple data and metadata structures. This led to a compact model of 80
classes and 130 relationships, easy to comprehend and suitable to serve as a
basis for mediation of cultural and other information and thereby provide the
semantic 'glue' needed to transform today's disparate, localised information
sources into a coherent and valuable global resource. It comprises the concepts
characteristic of most museum, archive and library documentation.
The tutorial aims to provide the knowledge needed to understand the
potential of applying the CRM - where it can be useful and what the major
technical issues of an application are. It will present information integration
by employing a core ontology of relationships, in contrast to the prescription
of a common data format, as an approach applicable to other domains. In a real
example, it will demonstrate the solution of typical cases of heterogeneity by
intellectually mapping source data structures to the ontology. Participants with
some background in information modelling should be able to use the CIDOC CRM in
their applications after this course and some further reading.
Target audience: Ontology experts, digital library designers, data warehouse designers, system integrators, and portal designers who work in the wider area of cultural and library information, as well as IT staff of libraries, museums and archives, and vendors of cultural and other information systems. Basic knowledge of object-oriented data models is required.
Duration: Half-day, 14.00-17.30, Sunday August 17.