RoadRunner: Towards Automatic Data Extraction from Large Web Sites

Crescenzi, V; Mecca, Giansalvatore; Merialdo, P.

The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extraction process, the paper develops a novel technique to compare HTML pages and generate a wrapper based on their similarities and differences. Experimental results on real-life data-intensive Web sites confirm the feasibility of the approach.
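To make the idea of comparing pages concrete, the following is a minimal sketch and not the paper's algorithm. It assumes two pages generated from the same template, tokenizes them into tags and text, and uses Python's standard `difflib.SequenceMatcher` as a stand-in alignment method: tokens shared by both pages are treated as template, and mismatching regions are treated as data fields. The names `tokenize`, `infer_wrapper`, and the `#FIELD` placeholder are hypothetical and chosen only for illustration.

```python
import re
from difflib import SequenceMatcher


def tokenize(html):
    """Split an HTML string into tag tokens and non-empty text tokens."""
    tokens = re.findall(r"<[^>]+>|[^<]+", html)
    return [t.strip() for t in tokens if t.strip()]


def infer_wrapper(page_a, page_b):
    """Align two pages; return (template, fields_a, fields_b).

    Illustrative assumption: tokens that are equal in both pages belong
    to the shared template, and every mismatching region is one data field.
    """
    a, b = tokenize(page_a), tokenize(page_b)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    template, fields_a, fields_b = [], [], []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            # Similarity: shared markup or fixed labels.
            template.extend(a[i1:i2])
        else:
            # Difference: record a placeholder and each page's own value.
            template.append("#FIELD")
            fields_a.append(" ".join(a[i1:i2]))
            fields_b.append(" ".join(b[j1:j2]))
    return template, fields_a, fields_b


# Two made-up pages that share a layout but carry different data.
page_1 = "<html><b>Title:</b> Database Systems <i>1998</i></html>"
page_2 = "<html><b>Title:</b> Data on the Web <i>2000</i></html>"

template, fields_1, fields_2 = infer_wrapper(page_1, page_2)
print(template)
print(fields_1)
print(fields_2)
```

A flat alignment like this cannot capture optional or repeated structures, so it should be read only as an illustration of "similarities become template, differences become data".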

2001-01-01

Year of publication: 2001
ISBN: 1558608044
Publication type: 4.1 Conference proceedings contribution



Use this identifier to cite or link to this document: https://hdl.handle.net/11563/9626
