Efficient Queries over Web Views
MECCA, Giansalvatore
1998-01-01
Abstract
Large web sites are becoming repositories of structured information that can benefit from being viewed and queried as relational databases. However, querying these views efficiently requires new techniques. Data usually resides at a remote site and is organized as a set of related HTML documents, with network access being a primary cost factor in query evaluation. This cost can be reduced by exploiting the redundancy often found in site design. We use a simple data model, a subset of the Araneus data model, to describe the structure of a web site. We augment the model with link and inclusion constraints that capture the redundancies in the site. We map relational views of a site to a navigational algebra and show how to use the constraints to rewrite algebraic expressions, reducing the number of network accesses.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.