Araneus in the Era of XML

MECCA, Giansalvatore
1999-01-01

Abstract

A large body of recent research has been motivated by the attempt to extend database manipulation techniques to data on the Web. Most of these research efforts (which range from the definition of Web query languages and the related optimizations, to systems for Web site development and management, to integration techniques) started before XML was introduced, and have therefore spent a long time trying to handle the highly heterogeneous nature of HTML pages. In the meantime, Web data sources have evolved from small, home-made collections of HTML pages into complex platforms for distributed data access and application development, and XML promises to establish itself as a more appropriate format for this new breed of Web sites. XML brings data on the Web closer to databases because, unlike HTML, it cleanly separates the specification of the data, its logical structure (the DTD), and its chosen presentation (the stylesheet). As a result, most of the early research proposals for data management on the Web are now being reconsidered from this new perspective. In this paper, we discuss the impact of XML on the research work conducted over the last few years by our group in the framework of the Araneus project. Araneus started as an attempt to investigate whether traditional database concepts and abstractions, such as data models and query languages, could be re-applied to data on the Web. In this spirit, we have developed several tools and techniques to handle both structured and semistructured data in the Web style: (i) a data model called ADM for modeling Web documents and hypertexts; (ii) languages for wrapping and querying Web sites; (iii) tools and techniques for Web site design and implementation.

Use this identifier to cite or link to this document: https://hdl.handle.net/11563/1595