Oral tumors are responsible for about 170,000 deaths every year in the World. In this paper, we focus on oral squamous cell carcinoma (OSCC), which represents up to 80-90 % of all malignant neoplasms of the oral cavity. We present a novel deep learning-based method for segmenting whole slide image (WSI) samples at the pixel level. The proposed method is a modification of the well-known U-Net architecture through a multi-encoder structure. In particular, our network, called Multi-encoder U-Net, is a multi-encoder single decoder network that takes as input an image and splits it in tiles. For each tile, there is an encoder responsible for encoding it in the latent space, then a convolutional layer is responsible for merging the tiles into a single layer. Each layer of the decoder takes as input the previous up-sampled layer and concatenate it with the layer made by merging the corresponding layers of the multiple encoders. Experiments have been carried out on the publicly available ORal Cancer Annotated (ORCA) dataset, which contains annotated data from the TCGA repository. Quantitative experimental results, obtained using three different quality metrics, demonstrate the effectiveness of the proposed approach, which achieves 82% Pixel-wise Accuracy, 0.82 Dice similarity score, and 0.72 Mean Intersection Over Union.

Multi-encoder U-Net for Oral Squamous Cell Carcinoma Image Segmentation

Bloisi D. D.
;
Nardi D.;
2022-01-01

Abstract

Oral tumors are responsible for about 170,000 deaths every year in the World. In this paper, we focus on oral squamous cell carcinoma (OSCC), which represents up to 80-90 % of all malignant neoplasms of the oral cavity. We present a novel deep learning-based method for segmenting whole slide image (WSI) samples at the pixel level. The proposed method is a modification of the well-known U-Net architecture through a multi-encoder structure. In particular, our network, called Multi-encoder U-Net, is a multi-encoder single decoder network that takes as input an image and splits it in tiles. For each tile, there is an encoder responsible for encoding it in the latent space, then a convolutional layer is responsible for merging the tiles into a single layer. Each layer of the decoder takes as input the previous up-sampled layer and concatenate it with the layer made by merging the corresponding layers of the multiple encoders. Experiments have been carried out on the publicly available ORal Cancer Annotated (ORCA) dataset, which contains annotated data from the TCGA repository. Quantitative experimental results, obtained using three different quality metrics, demonstrate the effectiveness of the proposed approach, which achieves 82% Pixel-wise Accuracy, 0.82 Dice similarity score, and 0.72 Mean Intersection Over Union.
2022
978-1-6654-8299-8
File in questo prodotto:
File Dimensione Formato  
Multi-encoder_U-Net_for_Oral_Squamous_Cell_Carcinoma_Image_Segmentation.pdf

solo utenti autorizzati

Tipologia: Pdf editoriale
Licenza: Versione editoriale
Dimensione 3.77 MB
Formato Adobe PDF
3.77 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11563/169198
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 1
social impact