Multi-encoder U-Net for Oral Squamous Cell Carcinoma Image Segmentation

Pennisi A.; Bloisi D. D.; Nardi D.; Varricchio S.; Merolla F.

2022-01-01

Abstract

Oral tumors are responsible for about 170,000 deaths worldwide every year. In this paper, we focus on oral squamous cell carcinoma (OSCC), which represents up to 80-90% of all malignant neoplasms of the oral cavity. We present a novel deep learning-based method for segmenting whole slide image (WSI) samples at the pixel level. The proposed method modifies the well-known U-Net architecture by introducing a multi-encoder structure. In particular, our network, called Multi-encoder U-Net, is a multi-encoder, single-decoder network that takes an image as input and splits it into tiles. Each tile has a dedicated encoder that maps it into the latent space; a convolutional layer then merges the encoded tiles into a single layer. Each decoder layer takes the previous up-sampled layer as input and concatenates it with the layer obtained by merging the corresponding layers of the multiple encoders. Experiments have been carried out on the publicly available ORal Cancer Annotated (ORCA) dataset, which contains annotated data from the TCGA repository. Quantitative experimental results, obtained using three different quality metrics, demonstrate the effectiveness of the proposed approach, which achieves 82% Pixel-wise Accuracy, a Dice similarity score of 0.82, and a Mean Intersection Over Union of 0.72.
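
The tiling step and the three quality metrics named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the abstract does not state the tile grid, the number of classes, or how the scores are averaged, so the 2 x 2 grid, the class indices, and all function names below are assumptions chosen only for illustration.

```python
import numpy as np


def split_into_tiles(image, rows=2, cols=2):
    """Split an H x W (x C) array into rows * cols equal tiles, row-major.

    The 2 x 2 default grid is an assumption; the abstract does not give it.
    """
    h, w = image.shape[:2]
    if h % rows or w % cols:
        raise ValueError("image size must be divisible by the tile grid")
    th, tw = h // rows, w // cols
    return [
        image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
        for r in range(rows)
        for c in range(cols)
    ]


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the ground truth."""
    return float(np.mean(pred == target))


def dice_score(pred, target, cls=1):
    """Dice similarity 2|P & T| / (|P| + |T|) for a single class label."""
    p, t = pred == cls, target == cls
    denom = p.sum() + t.sum()
    if denom == 0:
        return 1.0  # class absent in both masks: treated as perfect agreement
    return float(2.0 * np.logical_and(p, t).sum() / denom)


def mean_iou(pred, target, num_classes):
    """Intersection over union averaged over the classes that occur."""
    ious = []
    for cls in range(num_classes):
        p, t = pred == cls, target == cls
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # skip classes absent from both prediction and target
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


if __name__ == "__main__":
    # Toy label maps (0 = background, 1 = tumor); values are made up.
    pred = np.array([[1, 1], [0, 0]])
    target = np.array([[1, 0], [0, 0]])
    print(pixel_accuracy(pred, target))
    print(dice_score(pred, target, cls=1))
    print(mean_iou(pred, target, num_classes=2))
```

In a multi-encoder setting, each tile returned by `split_into_tiles` would be passed to its own encoder; the metrics are computed on the final full-size label map.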

Year of publication: 2022
ISBN: 978-1-6654-8299-8
Appears in types: 4.1 Contribution in conference proceedings

Files in this product:

File	Size	Format
Multi-encoder_U-Net_for_Oral_Squamous_Cell_Carcinoma_Image_Segmentation.pdf (publisher's PDF, publisher's version license, authorized users only)	3.77 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11563/169198
