Apr 26 2024

The use of deep learning integrating image recognition in language analysis technology in secondary school education – Scientific Reports

Handloomed fabrics recognition with deep learning – Scientific Reports


Models were trained using the PTB-XL dataset and evaluated on holdout test data from PTB-XL (Table 1). The models were additionally tested on ECG images from other datasets not involved in training, as well as on combined datasets where matching diagnostic labels were present (Table 2).

However, these features often correlate, causing information redundancy that degrades both classification speed and accuracy. Some scholars have proposed dimensionality reduction methods such as principal component analysis and discriminant analysis, which reduce feature dimensionality and accelerate classification. However, these methods merely fuse the original features and fail to adequately represent each feature's contribution to the classification result. Other scholars have used genetic algorithms and particle swarm optimization to select features, preserving their original meaning and yielding classification results with good interpretability.
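As a minimal sketch of how such a dimensionality reduction step might look (the feature matrix and the component count here are illustrative, not from the study):

```python
# Illustrative PCA-based dimensionality reduction of correlated image features.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))          # 200 samples x 64 (possibly correlated) features

pca = PCA(n_components=0.95)            # keep enough components for 95% of the variance
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)   # fewer, decorrelated features for the classifier
```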


The precise delineation of what is considered image (pre)processing is also unclear when considering the full path from initial X-ray exposure through to input to an AI model or even presentation on a viewing workstation for clinician review. While many features may be involved in AI-based racial identity prediction and performance bias7, including other demographic confounders42,45, we focused on image acquisition and processing factors for several reasons. First, it is known that biases related to such factors already exist in several medical imaging domains14,19,20,22,23 and may be more widespread.


In the resampled test set, we observe that the overall underdiagnosis bias is lower at baseline, as recently demonstrated by Glocker et al.42. Nonetheless, we find that the bias can be further reduced when using the per-view thresholds, with similar results also observed when performing training set resampling. For the DICOM-based evaluation, both the baseline disparity magnitude and its decrease with view-specific thresholds are similar to the original results. Thus, we observe variations in the baseline underdiagnosis bias, but the view-specific threshold approach reduces this bias for each confounder strategy, patient race (Asian and Black), and model training set (CXP and MXR).

Including visuals such as images of tunnel face conditions and rock samples can highlight these challenges and underscore the importance of the proposed AI-based methods in improving assessment accuracy and construction safety. The second part reviews the current state of research on IR and classification, both domestically and internationally. The third part proposes an IR algorithm based on DenseNet and improves the SDP acceleration training algorithm based on GQ. This model is expected to achieve accurate IR and alleviate the low efficiency of distributed training, thereby improving the running speed of the model. Because the fully connected layer requires a fixed input size, input images ordinarily have to be resized to uniform dimensions; SPP-Net (He et al., 2015) removes this restriction, so input images of arbitrary size can be used.


To address this, our rock database includes a wide range of rock types to enhance adaptability. Continuous updates and further validation in diverse environments are essential to ensure robust performance. In his 2016 paper, Kaiming He introduced ResNet models of various depths, such as ResNet18, ResNet34, ResNet50, ResNet101, and ResNet152. Unlike UNet-based segmentation algorithms, ResNet networks extract image features in a hierarchical manner. The term “class imbalance” refers to the unequal distribution of data between classes.

Finally, an experiment is designed to validate the proposed framework for analyzing language behavior in secondary school education. The results indicate specific differences among the grouped evaluation scores for each analysis indicator. The correlation coefficients between the comprehensive evaluation score and the online classroom discourse’s speaking rate, speech intelligibility, average sentence length, and content similarity are −0.56, −0.71, −0.71, and −0.74, respectively, and all are statistically significant.
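For illustration, correlation coefficients of this kind can be obtained with a Pearson correlation test; the numbers below are toy values, not the study's data:

```python
# Toy example: Pearson correlation between one discourse indicator and course scores.
from scipy.stats import pearsonr

speech_intelligibility = [0.82, 0.75, 0.91, 0.66, 0.70, 0.88, 0.93, 0.61]
course_evaluation      = [3.1, 3.6, 2.8, 4.2, 4.0, 3.0, 2.7, 4.4]

r, p = pearsonr(speech_intelligibility, course_evaluation)
print(f"r = {r:.2f}, p = {p:.3f}")       # negative r indicates an inverse relationship
```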

Choosing an AI data classification tool is part of the process of training models, and different tools may offer varied algorithms, functionalities, and performance characteristics that can affect the effectiveness of the classification models. Selecting the right tool during this step is necessary to reach your data classification goals. The single-stage target detection network, RetinaNet24,25, has been improved to better suit the detection of electrical equipment, which often has a large aspect ratio, a tilt angle, and is densely arranged. The horizontal rectangular frame of the original RetinaNet has been altered to a rotating rectangular frame to accommodate the prediction of the tilt angle of the electrical equipment. Additionally, the Path Aggregation Network (PAN) module and an Attention module have been incorporated into the feature fusion stage of the original RetinaNet.

EfficientNet is a family of models that delivers competitive results in both performance and computational cost. Higher-numbered models are typically larger and more complex but require more computing power. Using AI, Monument sorts your files by date, location, camera, person, and scenery, making them easily accessible and searchable.

Design of test experiment plan and model parameter setting

In contrast to the fourth convolutional block, the second and third blocks generate lower-level features with smaller local receptive fields, which led to inferior performance when used for the domain classifier. In the third experiment involving CTransPath, we conducted training without employing regular augmentations. Across the Ovarian, Pleural, Bladder, and Breast datasets, AIDA without augmentation yielded classification accuracies of 82.67%, 73.77%, 64.56%, and 77.45%, respectively, surpassing its augmented counterpart.

One important set of parameters centers around X-ray exposure, dictating the energy and quantity of X-rays emitted by the machine28,29. The appropriate level of exposure and the effects of differing exposures on image statistics such as contrast and noise are complex topics that depend on patient and machine-specific characteristics28,29,30,31,32,33. In modern digital radiography, additional image processing takes place that can compensate for some of these effects, such as ‘windowing’ the image to help normalize overall brightness and contrast28,29. While it is not possible to retrospectively alter the X-ray exposure in the images used here, we can still perform windowing modifications to simulate changes in the image processing and, to some extent, exposure.
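A minimal sketch of such a windowing modification, assuming the image is already a floating-point intensity array (the function name and parameters are illustrative, not from the paper):

```python
# Illustrative windowing: clip intensities to a window and rescale to [0, 1].
import numpy as np

def apply_window(image: np.ndarray, window_center: float, window_width: float) -> np.ndarray:
    low = window_center - window_width / 2.0
    high = window_center + window_width / 2.0
    windowed = np.clip(image, low, high)
    return (windowed - low) / max(high - low, 1e-8)

# Example: narrowing the window increases apparent contrast.
xray = np.random.default_rng(0).uniform(0.0, 1.0, size=(224, 224))
narrow = apply_window(xray, window_center=0.5, window_width=0.4)
```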

A model with high recall reliably identifies positive samples when they occur. In sports category classification, it is crucial that as many positive samples as possible are successfully identified. The f1-score is a statistical metric for measuring the accuracy of binary (or multi-task binary) classification models; it considers both the precision and the recall of the classification model.
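For reference, these metrics can be computed directly with scikit-learn (the labels below are toy values); the f1-score is the harmonic mean of precision and recall, f1 = 2PR/(P+R):

```python
# Toy example of precision, recall, and f1-score for a binary classifier.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

p = precision_score(y_true, y_pred)   # of predicted positives, how many were correct
r = recall_score(y_true, y_pred)      # of actual positives, how many were found
f1 = f1_score(y_true, y_pred)         # 2 * p * r / (p + r)
print(p, r, f1)
```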

Anisotropic diffusion filtering is used instead of Gaussian surround filtering, which makes the estimation of the light component at the image boundary more accurate and attenuates the halo at strong edges of the enhanced image. Traditional guided filtering applies a fixed regularization factor ε to every region of the image, which does not take into account the textural differences among regions. To address this limitation, WGF introduces an edge weighting factor ΓG, allowing ε to be adaptively adjusted according to the degree of image smoothing, thereby enhancing the algorithm’s ability to preserve image edges15. The edge weighting factor ΓG and the modified linear factor ak are defined in the following equation.
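The equation itself is omitted in the source. For reference only, a standard weighted guided filtering formulation, which the description above appears to follow (the symbols here are assumptions, not necessarily the authors' notation), defines

\[
\Gamma_G(i') = \frac{1}{N}\sum_{i=1}^{N} \frac{\sigma_{G,1}^{2}(i') + \lambda}{\sigma_{G,1}^{2}(i) + \lambda},
\qquad
a_k = \frac{\frac{1}{|\omega|}\sum_{i\in\omega_k} I_i p_i - \mu_k \bar{p}_k}{\sigma_k^{2} + \varepsilon / \Gamma_G(k)},
\]

where \(\sigma_{G,1}^{2}(i)\) is the local variance of the guidance image at pixel \(i\), \(\lambda\) is a small constant, and \(\mu_k\), \(\bar{p}_k\), \(\sigma_k^{2}\) are means and variance computed over the local window \(\omega_k\).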

Representative original and output images of different organoids imaged on different days were shown (Fig. 4c). To estimate organoid growth in a non-invasive manner, we analyzed three images per organoid sample using OrgaExtractor and extracted data regarding the total projected areas daily. Based on the data plotted on a graph, organoids were cultured until their growth slowed down. The relatively estimated cell numbers were significantly different between the two rapidly grown organoids and the other gradually grown organoid (Fig. 4d). The growths of COL-007-N, COL-035-N, and COL-039-N were also estimated with CTG assay, Hoechst staining, and CellTracker Red staining assay, which can also confirm the actual cell numbers (Fig. 4e).

For patients with multiple slides, to prevent data leakage between training, validation, and test sets, we assigned slides from each patient to only one of these sets. This study utilized two public chest X-ray datasets, CheXpert39 and MIMIC-CXR40, which are de-identified in accordance with the US Health Insurance Portability and Accountability Act of 1996 (HIPAA) Safe Harbor requirements. The study is classified as not-human subjects research as determined by the Dana-Farber/Harvard Cancer Center Institutional Review Board. The Pleural dataset consists of benign pleural tissue and malignant mesothelioma slides from two centers. The source dataset includes 194 WSIs (128 patients) and the target dataset contains 53 WSIs (53 patients). The Bladder dataset is comprised of micropapillary carcinoma (MPC) and conventional urothelial carcinoma (UCC) slides from multiple hospitals across British Columbia.

Typically, image recognition entails building deep neural networks that analyze each image pixel. These networks are fed as many labeled images as possible to train them to recognize related images. OrgaExtractor provides several measurements, such as the projected area, perimeter, axis length, eccentricity, circularity, roundness, and solidity of each organoid, from the contour image. Although these measurements may be useful for understanding the morphological features of a single organoid, they are insufficient for representing the entire culture condition to which the organoid belongs. Therefore, we added up the measurements, such as the projected area, or averaged out the eccentricity of a single organoid as the parameters of the organoid image to analyze the culture conditions (Fig. 1d). The correlation between analysis parameters and the estimated actual cell numbers was extracted.
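As an illustration only (this is not OrgaExtractor's actual code), per-organoid measurements of this kind can be derived from a binary segmentation mask with scikit-image:

```python
# Sketch: morphological measurements per connected component in a binary mask.
import numpy as np
from skimage import measure

mask = np.zeros((128, 128), dtype=np.uint8)
mask[30:70, 40:90] = 1                           # toy "organoid" blob

labeled = measure.label(mask)
for region in measure.regionprops(labeled):
    area = region.area                            # projected area
    perimeter = region.perimeter
    eccentricity = region.eccentricity
    solidity = region.solidity
    circularity = 4 * np.pi * area / max(perimeter ** 2, 1e-8)
    print(area, perimeter, eccentricity, solidity, circularity)
```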

The strengths and weaknesses of this approach are discussed in detail (Table 2). Some famous methods for edge detection include the Sobel operator, Canny edge detector, and Laplacian of Gaussian (LoG) filter. KNN is a simple yet powerful machine learning algorithm used for classification and regression tasks. The key idea behind it is to assign a label or predict a value for a new data point based on the labels or values of its closest neighbors in the training dataset. KNN is often used in scenarios where there is little prior knowledge about the data distribution, such as recommendation systems, anomaly detection, and pattern recognition. AI data classification is a process of organizing data into predefined categories using AI tools and techniques.
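A minimal KNN classification sketch with scikit-learn, using a toy dataset to illustrate the majority-vote idea described above:

```python
# Toy KNN classifier: a new point takes the majority label of its k nearest neighbors.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)   # majority vote over the 5 nearest neighbors
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```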

Key metrics such as precision and recall are typically used to quantify the model’s success in classifying data. Evaluating AI data classification models helps you discover their strengths, weaknesses, and any potential areas for improvement that call for additional training or feature engineering. This step ensures that the classification process meets the desired quality standards and aligns with the defined objectives. A novel infrared image denoising algorithm for electrical equipment based on DeDn-CNN is proposed. This algorithm introduces a deformable convolution module that autonomously learns the noise feature information in infrared images. The RetinaNet is augmented by incorporating a rotating rectangular frame and an attention module, and further enhanced by appending the Path Aggregation Network (PAN) to the Feature Pyramid Network (FPN) for improved bottom-up feature fusion.
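The DeDn-CNN implementation itself is not shown in the text; the following is only a hedged sketch of a deformable-convolution block in PyTorch, in which the sampling offsets are predicted by an ordinary convolution:

```python
# Sketch of a deformable convolution block (illustrative, not the paper's code).
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        # 2 offsets (dx, dy) per kernel sampling location
        self.offset_conv = nn.Conv2d(in_ch, 2 * k * k, kernel_size=k, padding=k // 2)
        self.deform_conv = DeformConv2d(in_ch, out_ch, kernel_size=k, padding=k // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        offsets = self.offset_conv(x)            # learned sampling offsets
        return self.deform_conv(x, offsets)

x = torch.randn(1, 16, 64, 64)
print(DeformBlock(16, 32)(x).shape)              # torch.Size([1, 32, 64, 64])
```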

The proposed model produces a binary segmentation output in which black represents the background, and white indicates the organoids. Organoids are 3D structures composed of numerous cells from pluripotent stem cells or adult stem cells of organs1. Because diverse cell types are derived from stem cells, organoids mimic human native organs better than traditional 2D culture systems2. Organoids have become a precise preclinical model for researching personalized drugs and organ-specific diseases3,4. Optimization of the organoid culture conditions requires periodic monitoring and precise interpretation by researchers2.

This could present an exciting opportunity to utilize the power of AI to inform clinical trials and deep biological interrogation by adding more precision in patient stratification and selection. Of note, our model also identified a subset of p53abn ECs (representing 20%; referred to as NSMP-like p53abn) with a resemblance to NSMP as assessed by H&E staining. These 10 classifiers were then used to label the cases as p53abn or NSMP and their consensus was used to come up with a label for a given case.

  • Figure 3: Performance assessment of single-stage object detection algorithms on different datasets.
  • For a given combination of window width and field of view, the racial identity prediction model was run on each image in the test set to produce three scores per image (corresponding to Asian, Black, and white).
  • Some scholars have introduced the above optimization scheme in the improvement of the network structure of related models to make the detection results more ideal.
  • M.Z.K.: data analysis, experiments and evaluations, manuscript draft preparation. M.S.B.: conceptualization, defining the methodology, evaluation of the results, original draft preparation and reviewing, supervision.

However, because an organoid is a multicellular structure of varying sizes, estimating the growth with precise time points is difficult15,16,30. OrgaExtractor was used to compare the growth between different colon organoid samples based on the total projected areas and to understand the characteristics of a single colon organoid sample. Researchers can observe the growth of organoid samples in real time using the morphological data extracted from OrgaExtractor.

Regression analysis of classroom discourse indicators in secondary school online education on course evaluation

EfficientNet (Tan and Le, 2019) does not pursue an increase in a single dimension (depth, width, or image resolution) to improve the overall precision of the model, but instead explores the best combination of these three dimensions. Based on EfficientNet, Tan et al. (2020) proposed a family of object detection frameworks, EfficientDet, which achieves good performance under different levels of resource constraints. The mAPs for the PASCAL VOC2007 and PASCAL VOC2012 datasets are 71.4% and 73.8%, respectively. Python is used to call the Alibaba Cloud intelligent speech recognition interface and the Baidu AI Cloud general scene text interface, enabling speech recognition for audio resources containing educational language behavior. Text recognition is then performed on image resources containing courseware content. This process extracts the text of classroom discourse from audio files and the text of teaching content from images.
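For intuition, the compound-scaling rule from the EfficientNet paper scales all three dimensions with a single coefficient; the constants below are the published base coefficients, not values from this study:

```python
# Compound scaling arithmetic as described in the EfficientNet paper (illustrative).
alpha, beta, gamma = 1.2, 1.1, 1.15   # base depth / width / resolution coefficients
phi = 3                                # compound coefficient; larger phi -> larger model

depth_mult = alpha ** phi              # scale network depth
width_mult = beta ** phi               # scale channel width
resolution_mult = gamma ** phi         # scale input image resolution
print(round(depth_mult, 2), round(width_mult, 2), round(resolution_mult, 2))
# The search constraint alpha * beta**2 * gamma**2 ~= 2 keeps FLOPs roughly doubling per step of phi.
```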

The study was developed based on the data-parallel accelerated training method to speed up the training of neural network models while reducing communication costs as much as possible31,32. Image recognition technology is an important research field within artificial intelligence. To enhance the practical value of image recognition technology in computer vision and to address its technical limitations, this research improves the feature-reuse method of the dense convolutional network. Traditional parallel algorithms are improved on the basis of gradient quantization, allowing parameters to be updated independently layer by layer and reducing communication time and data volume.
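As a hedged sketch (not the paper's exact scheme), the communication saving from gradient quantization can be illustrated by compressing gradients to 8-bit integers before they are exchanged and restoring them afterwards:

```python
# Illustrative gradient quantization: int8 transport instead of float32.
import numpy as np

def quantize(grad: np.ndarray, bits: int = 8):
    scale = np.max(np.abs(grad)) / (2 ** (bits - 1) - 1) + 1e-12
    q = np.round(grad / scale).astype(np.int8)   # ~4x less data than float32
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

grad = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale = quantize(grad)
restored = dequantize(q, scale)
print("max quantization error:", np.max(np.abs(grad - restored)))
```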

The fact that the data augmentation approach did not help, and actually seemed to slightly increase the underdiagnosis bias, does raise an important question of whether current standard data augmentation techniques have any contribution to AI bias. We also note that it is much more challenging to assess the “true” underlying distribution of the factors represented by the window width and field of view parameters. The field of view parameter is also an imperfect simulation of changing the collimation and relative size of the X-ray field with respect to the patient. Nonetheless, the fact that the race prediction model did show differences in predictions over these parameters does suggest that it may have learned intrinsic patterns in the underlying datasets (Supplementary Fig. 6). We explored two approaches motivated by the results above to reduce the underdiagnosis bias.

These results underscore the importance of integrating the FFT-Enhancer module in the model architecture to enhance knowledge transfer between domains, resulting in more robust and reliable models for real-world applications. CTransPath employs a semantically relevant contrastive learning (SRCL) framework and a hybrid CNN-Transformer backbone to address the limitations of traditional SSL approaches. The SRCL framework selects semantically matched positives in the latent space, providing more visual diversity and informative semantic representations than conventional data augmentation methods.


Thus, it provides a solid technical foundation for extracting characters from teaching video images and obtaining teaching content in this work28. The models demonstrated good performance when tested on unseen holdout test data from the original datasets used in training. Models trained on a combination of different datasets mixed together also performed well on holdout test splits containing the mixed datasets. “One of my biggest takeaways is that we now have another dimension to evaluate models on.

From a “global” picture level to a “local” image level, contextual information has been utilized in object recognition. The global image level takes into account image statistics from the entire image, whereas the local image level considers contextual information from the objects’ surrounding areas. Contextual characteristics can be divided into three categories: local pixel context, semantic context, and spatial context. All data generated or analysed during this study are included in this published article [and its supplementary information files]. As shown in Fig. 5, a significant negative correlation is observed between speech intelligibility and the comprehensive score of online course evaluation, with a correlation coefficient of −0.71.

Analysis of learner’s evaluation comments reveals that learners often focus on educators’ speaking rates when evaluating online courses for secondary education. The speaking rate can be explicitly understood as the number of words or syllables per unit of time. The statistical unit for speaking rate in Chinese is generally expressed as Words Per Minute (WPM)18. This work defines speaking rate as the average speed at which educators talk throughout a class, with pauses between sentences also considered in the calculation19. The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Here, each patch’s label corresponds to the subtype of its corresponding histopathology slide. The process involves passing patches through convolutional layers and feeding the generated feature maps into fully connected layers. The model is trained using the cross-entropy loss function58, similar to standard classification tasks. For score threshold selection, we targeted a ‘balanced’ threshold computed to achieve approximately equal sensitivity and specificity in the validation set. Such a selection strategy is invariant to the empirical prevalence of findings in the dataset used to choose the threshold, allowing more consistent comparisons across datasets and different subgroups.
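A minimal sketch of such a 'balanced' threshold selection on a validation set, assuming per-image scores are available (the data below are synthetic):

```python
# Pick the score threshold where sensitivity and specificity are closest on validation data.
import numpy as np
from sklearn.metrics import roc_curve

def balanced_threshold(y_true, scores) -> float:
    fpr, tpr, thresholds = roc_curve(y_true, scores)
    sensitivity = tpr
    specificity = 1.0 - fpr
    idx = np.argmin(np.abs(sensitivity - specificity))
    return thresholds[idx]

rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, size=500)
val_scores = np.clip(y_val * 0.3 + rng.normal(0.4, 0.2, size=500), 0, 1)
print("balanced threshold:", balanced_threshold(y_val, val_scores))
```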


Liu et al.30 experimented with VGG16 and its variants and demonstrated their effectiveness in detecting fabrics with complicated textures. Considering this analysis in the textile domain, we adopted VGG16 and VGG19 for our classification problem. WSIs are large gigapixel images, with resolutions typically exceeding 100,000 × 100,000 pixels, that present a high degree of morphological variance and contain a variety of artifacts. These conditions make it impossible to apply conventional deep networks directly.

On the COCO dataset, two-stage object detection uses a cascade structure and has been successful in instance segmentation. Although detection accuracy has improved over time, detection speed has remained poor. Figure 2 reviews the backbone networks of two-stage object detection methods, together with their detection accuracy (mAP) and detection speed on the VOC2007 test set, VOC2012 test set, and COCO test set. A performance comparison of two-stage object detection algorithms is shown in Figure 2. Testing a dataset of 500 images from 100 sports categories (5 images per category) with three models, VGG16 and ResNet50 achieved overall accuracy, recall, and f1 scores of 0.92, 0.93, and 0.92, respectively, whereas the SE-RES-CNN model achieved 0.98 on all three metrics.

Deep learning obviates the requirement for independent feature extraction by autonomously learning and discerning relevant features directly from raw data. This inherent capability streamlines the process, enhancing adaptability to diverse datasets and eliminating the need for manual feature engineering. As the view position is a discrete, interpretable parameter, it is straightforward to compare the behavior of the AI model by this parameter to its empirical statistics in the dataset. We indeed find differences in the relative frequencies of views across races in both the CXP and MXR datasets. Overall, the largest discrepancies were observed for Black patients in the MXR dataset, which also corresponds to where the largest AI-based underdiagnosis bias was observed. These differences in view proportions are problematic from an AI development perspective, in part because the AI model may learn shortcut connections between the view type and the presence of pathological findings24,25.
