Automatic Web Information Extraction in the ROADRUNNER System

Crescenzi, V; Mecca, Giansalvatore; Merialdo, P.

doi:10.1007/3-540-46140-X_21

This paper presents RoadRunner, a research project that aims to develop solutions for automatically extracting data from large HTML data sources. Our research targets data-intensive Web sites, i.e., HTML-based sites with a fairly complex structure that publish large amounts of data. The paper describes the top-level software architecture of the RoadRunner system and the novel research challenges posed by the attempt to automate the information extraction process.

2001-01-01

Year of publication: 2001
ISBN: 3540441220
Publication type: 4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11563/9634
