With the growing number of dental patients and the ongoing digital transformation of dental hospitals, tooth segmentation is becoming increasingly important in the digital diagnosis, design, treatment, and customized device manufacturing of orthodontics, oral implants, and restorations. This study aims to apply the Segment Anything Model (SAM) to the tooth segmentation task to achieve accurate tooth segmentation.
This study introduces a new tooth segmentation method, Tooth-ASAM, which builds on SAM. An adapter-based image encoder and a mask decoder are designed specifically to apply SAM to tooth images. The proposed method is evaluated on multimodal tooth images, including cone beam computed tomography (CBCT) images, panoramic radiographs, and natural tooth images captured by a micro camera.
The experimental results show that Tooth-ASAM outperforms existing methods on all four tooth datasets on key metrics such as the Dice coefficient, IoU, HD95, and ASSD. In addition, Tooth-ASAM produces visually more accurate segmentation results than state-of-the-art methods on the same four datasets.
This study shows that accurate tooth segmentation can be achieved by applying SAM with adaptive training strategies, making the method well suited to clinical applications in orthodontics, oral implant surgery, and prosthodontics.
With the acceleration of urbanization and changes in living environments, the incidence of major oral diseases continues to rise worldwide. According to the 2022 Global Oral Health Report [1], nearly 3.5 billion people worldwide are estimated to suffer from oral diseases, and oral health has become a problem that cannot be ignored.
The increase in the number of patients with oral diseases, coupled with growing public health awareness, has led to a continuous increase in the number of dental patients visiting hospital outpatient and emergency departments. The digital transformation of dental hospitals has therefore become an inevitable trend, and innovative solutions are urgently needed to meet the growing demand for efficient oral medical care.
Accurate tooth segmentation is crucial in the digital diagnosis, design, treatment, and customized device manufacturing of orthodontics, oral implants, and prosthodontics. This process requires the precise segmentation of individual teeth from digital dental scans.
On the one hand, accurate tooth segmentation can improve the accuracy of oral medical diagnosis and thereby support more informed clinical decisions. On the other hand, it can improve the precision of treatment and the effectiveness of oral structure restoration, ultimately reducing treatment costs.
However, achieving accurate tooth boundary and root segmentation is challenging due to various factors such as missing teeth, complex root morphology, low contrast, and uneven intensity distribution in dental images. These images may include cone beam CT (CBCT) scans, panoramic X-ray images, and natural tooth images captured by micro cameras.
Despite these challenges, researchers continue to draw on advances in digital technology and image processing algorithms to improve the accuracy and efficiency of tooth segmentation. With the remarkable achievements of convolutional neural networks (CNNs) in various computer vision tasks, their application to dental images has become increasingly popular.
Compared with traditional level set methods, CNN methods have performed well in segmenting tooth roots and alveolar bones. However, CNNs are limited by their small receptive fields, which restricts their ability to capture global or contextual information. To address this problem, the Transformer, a model based entirely on the self-attention mechanism, was introduced. Transformer-based tooth segmentation methods and hybrid CNN-Transformer algorithms [4] can extract both local features and long-range dependencies, which supports tasks such as accurate tooth edge detection, analysis of the tooth-jaw relationship, and identification of pathological changes involving surrounding tissues.
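To make the receptive-field limitation concrete, the following minimal sketch computes how many input pixels (along one axis) a single output unit of a plain convolutional stack can see. The function name `receptive_field` and the example layer configurations are illustrative assumptions, not taken from any of the cited methods, and dilation is ignored for simplicity.

```python
def receptive_field(layers):
    """Receptive field, in input pixels along one axis, of a conv/pool stack.

    `layers` is a list of (kernel_size, stride) pairs ordered from input to
    output. Dilation is not modeled in this sketch.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between neighboring units of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Three stacked 3x3 convolutions with stride 1 (hypothetical configuration).
plain_stack = [(3, 1), (3, 1), (3, 1)]
# The same stack with a stride-2 layer in the middle (hypothetical configuration).
strided_stack = [(3, 1), (3, 2), (3, 1)]

print(receptive_field(plain_stack))    # 7
print(receptive_field(strided_stack))  # 9
```

The receptive field grows only linearly with depth unless strides or pooling are used, whereas a self-attention layer relates every token to every other token in a single step.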
Recently, the Segment Anything Model (SAM) has attracted much attention due to its excellent zero-shot generalization ability in image segmentation. The model can accurately segment objects and scenes that it has never been trained on, guided only by user-provided prompts (e.g., points or bounding boxes).
As a prompt-driven framework, SAM achieves state-of-the-art segmentation performance after training on the SA-1B dataset, which contains more than 1 billion masks on 11 million images, enabling it to adapt dynamically to user-provided prompts (e.g., points, boxes, or text) and delineate objects accurately. However, because tooth images differ substantially from the natural images SAM was trained on, directly applying SAM [5] or SAM-Adapter [6] to tooth segmentation does not produce satisfactory results.
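The points and bounding boxes mentioned above (called prompts in SAM) are plain pixel coordinates: a point is an (x, y) pair with a label of 1 for foreground or 0 for background, and a box is given as [x_min, y_min, x_max, y_max]. The sketch below derives such prompts from a binary mask, which is one common way to simulate user input when adapting SAM to a new domain. How prompts are supplied in Tooth-ASAM is not described here, so the helper names `box_prompt_from_mask` and `point_prompt_from_mask` and the toy mask are assumptions for illustration only.

```python
import numpy as np


def box_prompt_from_mask(mask, margin=0):
    """Return an XYXY box [x_min, y_min, x_max, y_max] around the mask foreground."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask has no foreground pixels")
    height, width = mask.shape
    # Optionally enlarge the box by `margin` pixels, clipped to the image bounds.
    x_min = max(int(xs.min()) - margin, 0)
    y_min = max(int(ys.min()) - margin, 0)
    x_max = min(int(xs.max()) + margin, width - 1)
    y_max = min(int(ys.max()) + margin, height - 1)
    return np.array([x_min, y_min, x_max, y_max])


def point_prompt_from_mask(mask, rng):
    """Sample one foreground pixel as an (x, y) point prompt with label 1."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask has no foreground pixels")
    i = rng.integers(ys.size)
    point_coords = np.array([[xs[i], ys[i]]])  # shape (1, 2), (x, y) order
    point_labels = np.array([1])               # 1 marks a foreground point
    return point_coords, point_labels


# Toy 8x10 mask with a rectangular "tooth" region (hypothetical data).
toy_mask = np.zeros((8, 10), dtype=bool)
toy_mask[2:5, 3:7] = True

box = box_prompt_from_mask(toy_mask)
coords, labels = point_prompt_from_mask(toy_mask, np.random.default_rng(0))
print(box)  # [3 2 6 4]
```

Arrays in this layout match what the official `segment_anything` predictor accepts as `box`, `point_coords`, and `point_labels`.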
At the same time, existing tooth segmentation methods mainly deal with single-modal datasets, which cannot meet the needs of practical applications. Recently, Katsaros et al. [7] created a dental video dataset (Vident-lab), which opens the way for further research on dental video processing, and proposed a deep network for multi-task dental video enhancement with near real-time processing.
Therefore, this paper proposes Tooth-ASAM, a novel adaptive SAM model tailored specifically to tooth segmentation, with the aim of addressing the challenges of multimodal dental image segmentation.
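The abstract describes the image encoder as adapter-based but does not give the adapter's structure at this point. A widely used form is a small bottleneck module with a residual connection, inserted into a frozen backbone so that only the adapter weights are trained. The sketch below shows the forward pass of such a module in NumPy. It is an assumption for illustration, not the authors' design: the class name `BottleneckAdapter`, the dimensions, and the zero initialization of the up-projection are all hypothetical choices.

```python
import numpy as np


def gelu(x):
    """Tanh approximation of the GELU activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


class BottleneckAdapter:
    """Down-project, apply a nonlinearity, up-project, and add the input back."""

    def __init__(self, dim, bottleneck, rng):
        self.w_down = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        self.b_down = np.zeros(bottleneck)
        # Zero-initialized up-projection: the adapter starts as an identity map,
        # so inserting it does not disturb the pretrained features at first.
        self.w_up = np.zeros((bottleneck, dim))
        self.b_up = np.zeros(dim)

    def __call__(self, tokens):
        hidden = gelu(tokens @ self.w_down + self.b_down)
        return tokens + hidden @ self.w_up + self.b_up

    def num_params(self):
        return sum(p.size for p in (self.w_down, self.b_down, self.w_up, self.b_up))


rng = np.random.default_rng(0)
adapter = BottleneckAdapter(dim=32, bottleneck=8, rng=rng)  # toy sizes
tokens = rng.normal(size=(16, 32))                          # 16 tokens of width 32
adapted = adapter(tokens)
print(adapted.shape, adapter.num_params())  # (16, 32) 552
```

Because the bottleneck is much narrower than the token width, the number of trainable parameters stays small compared with fine-tuning the full encoder.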
Current challenges in multimodal dental image segmentation include accurately segmenting fine anatomical structures, dealing with low contrast and fuzzy boundaries in dental images, and integrating information from different modalities. SAM has several key advantages over existing tooth segmentation methods. First, its high generalization ability enables it to perform well on different datasets without extensive retraining, which makes it highly adaptable to various imaging modalities and patient populations.
Second, SAM supports zero-shot segmentation, which means that it can segment objects (e.g., teeth) in new or unseen scenarios without labeled examples for those specific cases. Third, its flexibility in handling multimodal and heterogeneous image data supports robust performance across a variety of dental imaging techniques (e.g., intraoral scans, CBCT, or panoramic radiographs). Together, these features make SAM a powerful tool for improving the accuracy and efficiency of segmenting individual teeth from digital dental images.
To verify the effectiveness of the proposed model, we conducted a comprehensive evaluation on four different tooth datasets: two CBCT tooth datasets, a panoramic X-ray tooth dataset, and a dataset of natural tooth images taken by a micro camera. The results show that the proposed Tooth-ASAM outperforms existing methods on all four datasets.
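The abstract reports results with the Dice coefficient, IoU, HD95, and ASSD. As a minimal sketch of the two overlap metrics, the code below computes Dice and IoU for a pair of binary masks with NumPy. The function name `dice_and_iou`, the smoothing term `eps`, and the toy masks are assumptions for illustration. The boundary-distance metrics (HD95 and ASSD) are not covered here.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-8):
    """Dice coefficient and IoU between two binary masks of the same shape.

    `eps` avoids division by zero; two empty masks score 1.0 on both metrics.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    dice = (2.0 * intersection + eps) / (total + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


# Toy 4x4 masks (hypothetical data): each covers 8 pixels and they share 4.
pred_mask = np.zeros((4, 4), dtype=bool)
pred_mask[0:2, :] = True
true_mask = np.zeros((4, 4), dtype=bool)
true_mask[1:3, :] = True

dice, iou = dice_and_iou(pred_mask, true_mask)
print(round(dice, 4), round(iou, 4))  # 0.5 0.3333
```

The two metrics are monotonically related by Dice = 2 · IoU / (1 + IoU), so they rank a pair of predictions on the same image identically, although averages over a dataset can differ.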
The main contributions of this work are summarized as follows: (1) A new tooth segmentation method, Tooth-ASAM, is developed. It is an improved adaptive SAM method designed to account for the challenges of various dental image modalities.
The method addresses the distinct challenges that arise in processing different types of dental images (such as CBCT, panoramic X-rays, and natural tooth images), which, to our knowledge, few existing tooth segmentation methods handle together.
(2) Tooth-ASAM is rigorously evaluated on four different tooth datasets (including CBCT, panoramic X-rays, and natural tooth images). The evaluation demonstrates the adaptability and generalization ability of the proposed method in different image modalities.
(3) Experimental results show that the proposed method outperforms existing methods on all four datasets. This verifies its effectiveness for accurate tooth segmentation and highlights its advantages over traditional methods in tooth image segmentation.