AI and Machine Learning Glossary for AWS - Knowledge Gained While Studying for AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate
First Published:
Last Updated:
In this article, I have compiled what I learned while studying for the newly added AWS certifications, AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate, into a "Glossary of AI and Machine Learning Terms Related to AWS".
The knowledge in this glossary is also used in the questions and answers of "Learning AWS Functions and History Through Quizzes: Selected 'Machine Learning' Edition" in "Compilation of Thin Books on AWS Vol.01", an independently published book that I co-authored for "Technical Book Fair 17" in Japan.
I hope this will be helpful for those who are preparing to take the AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate exams.
AI/ML AWS Services
Amazon SageMaker
Service Name
Description
Amazon SageMaker
A fully managed service for efficiently building, training, and deploying machine learning models. An integrated platform that supports the entire ML lifecycle from development to production operation.
SageMaker Studio
One of SageMaker's components, a browser-based integrated development environment (IDE). Enables one-stop execution from notebook creation to model development, training, and deployment. Achieves centralized management of ML workflows.
SageMaker Canvas
One of SageMaker's components, a visual interface that allows building ML models through drag & drop without writing code. A no-code ML development environment for business analysts.
SageMaker Ground Truth
One of SageMaker's components, a data labeling service for creating high-quality training datasets. Improves efficiency of human labeling tasks and provides semi-automated labeling workflows.
SageMaker Data Wrangler
One of SageMaker's components, a tool for streamlining data preparation and preprocessing. Provides over 200 built-in transformation functions, enabling data cleansing to feature engineering via GUI.
SageMaker Feature Store
One of SageMaker's components, a repository for centrally managing and sharing features. Ensures consistency of features in online/offline scenarios and promotes reuse across teams.
SageMaker JumpStart
One of SageMaker's components, an ML hub providing pre-trained models and solutions. Deployable with one click and supports transfer learning and fine-tuning.
SageMaker Model Monitor
One of SageMaker's components, continuously monitors model quality in production environments. Detects data drift and bias, enabling early detection of model performance degradation.
SageMaker Clarify
One of SageMaker's components, detects bias in data and models and explains model predictions. Supports fairness and transparency by analyzing the basis for model decisions.
SageMaker Debugger
One of SageMaker's components, for debugging and monitoring the training process. Enables visualization of metrics and setting of alerts. Supports optimization of learning.
SageMaker Pipelines
One of SageMaker's components, for orchestration of ML workflows. Builds reproducible ML pipelines and enables automated experiment management.
SageMaker Model Cards
One of SageMaker's components, for creating and managing model documentation. Centralizes management of detailed model information, ensuring governance and compliance.
SageMaker Role Manager
One of SageMaker's components, manages access permissions for ML activities. Implements security based on the principle of least privilege and provides appropriate access control.
SageMaker Experiments
One of SageMaker's components, a tool for tracking and managing machine learning experiments. Automatically records experiment results such as training runs, parameters, and metrics, enabling comparative analysis.
SageMaker Model Registry
One of SageMaker's components, a repository for cataloging and versioning ML models. Manages model metadata and approval status.
Amazon Bedrock
Service Name
Description
Amazon Bedrock
A secure, fully managed generative AI platform service. An integrated platform that allows access to multiple Foundation Models through a single API. Provides comprehensive features including Foundation Models (FMs) as the underlying models, Knowledge Bases for RAG construction, Agents for automation, Guardrails for harmful content filtering, and Prompt Flows for workflow execution.
Foundation Models (FMs)
Large pre-trained models, including large language models, that serve as the foundation for text generation, image generation, etc. The core AI component providing the foundational functionality in Bedrock.
Knowledge Bases
A Bedrock feature that provides information retrieval from external knowledge bases, enabling the construction of Retrieval Augmented Generation (RAG) architectures.
Agents
A Bedrock feature that orchestrates procedural instructions, custom action execution, and Knowledge base utilization in an integrated manner.
Guardrails
A Bedrock feature that detects and filters harmful content and hallucinations, controlling AI output.
Prompt Flows
A Bedrock feature that chains prompt execution, S3 data input/output, and Lambda function execution into a defined workflow.
Amazon Q
Service Name
Description
Amazon Q
A generative AI-powered assistant service specifically designed for businesses. It comes in Business and Developer editions, specializing in business productivity improvement and development support respectively. It can be integrated with various AWS services, including Amazon Q in Amazon QuickSight, Amazon Q in Amazon Connect, Amazon Q in AWS Chatbot, Amazon Q network troubleshooting, and Amazon Q Data integration in AWS Glue.
Amazon Q Business
A generative AI assistant designed to improve employee productivity. Supports automation and efficiency of general business tasks.
Amazon Q Developer
A generative AI assistant specialized in coding support for developers. Supports development tasks such as code generation, debugging, and optimization.
Natural Language Processing Services
Service Name
Description
Amazon Comprehend
A natural language processing service that performs sentiment analysis, personal information detection, key phrase extraction, etc. from text. Custom model creation is also possible.
Amazon Kendra
An advanced search service for enterprises. Provides context-aware search results for natural language queries. Easy integration with RAG.
Amazon Lex
A service for building interactive interfaces (chatbots). Provides natural language understanding and dialogue management functions. Supports both voice and text.
Amazon Textract
A service for extracting text and structured data from documents. Capable of handwriting recognition, form processing, and table analysis. Provides high-accuracy OCR functionality.
Amazon Translate
An automatic translation service between multiple languages. Provides real-time translation between 74 languages. Supports custom terminology dictionaries.
Amazon Transcribe
A speech-to-text (speech recognition) service. Capable of multiple speaker identification and customization of specialized terminology. Supports real-time transcription.
Amazon Polly
A text-to-speech (speech synthesis) service. Provides natural pronunciation and Neural text-to-speech. Supports multiple languages and voice types.
Amazon CodeWhisperer
An AI coding companion for programming assistance. Provides code completion and suggestions. Its functionality is now part of Amazon Q Developer.
Image and Video Processing Services
Service Name
Description
Amazon Rekognition
An image and video analysis service. Provides face recognition, object detection, text extraction, content moderation, celebrity recognition, etc. Supports both real-time analysis and batch processing.
Amazon Lookout for Vision
An anomaly detection service using industrial image analysis. Used for product defect detection in manufacturing lines, etc.
Other AI-Related Services
Service Name
Description
Amazon Personalize
A service that provides personalized recommendations. Enables product recommendations and related content suggestions based on user behavior data. Supports real-time recommendations.
Amazon Pinpoint
A customer engagement service. Provides ML-powered segmentation, user behavior analysis, and optimal delivery time prediction functions. Enables multi-channel communication through email, SMS, push notifications, etc.
Amazon Fraud Detector
A machine learning-based fraud detection service. Detects online fraudulent transactions, account takeovers, fake account creation, etc. Can be used in combination with custom rules and ML models.
Amazon Augmented AI (A2I)
A service that manages human review task execution. Enables building workflows for human review of machine learning prediction results.
Amazon Mechanical Turk (MTurk)
A crowdsourcing marketplace. Enables execution of tasks such as data labeling and content moderation by humans. Can be integrated with Amazon SageMaker Ground Truth and Amazon Augmented AI (A2I).
Amazon QuickSight
A BI (Business Intelligence) tool. Equipped with ML predictive analysis capabilities, enabling data visualization and analysis. Supports data analysis in natural language through Amazon Q in QuickSight.
Data Storage and Database Solutions
Service
Description
Amazon S3
Scalable object storage. Optimal for building data lakes. High durability and availability.
Amazon FSx for Lustre
A high-performance file system capable of directly processing large-scale datasets. It seamlessly integrates with Amazon S3, automating data loading from and writing back to S3, and accelerates workloads with hundreds of GBps of parallel processing.
Amazon DynamoDB
Fully managed NoSQL database. Enables fast read and write operations. Automatic scaling feature.
Amazon Redshift
Petabyte-scale data warehouse. Enables fast query processing. Columnar storage.
Amazon OpenSearch Service
An Elasticsearch-compatible search and analytics engine service. In addition to full-text search and real-time analytics, it provides vector database functionality supporting neural search and k-Nearest Neighbors (k-NN) vector search. Compatible with log analysis, application search, and security analytics, as well as AI applications such as recommendations and semantic search. Advanced search capabilities are enabled through integration with large language models using OpenSearch Neural Search functionality.
Artificial Intelligence (AI)
Computer systems that exhibit human-like intelligent behavior. Possess capabilities such as learning, reasoning, and problem-solving. Includes specialized AI for specific tasks and general AI for broad intelligence.
Machine Learning (ML)
Algorithms or systems that learn patterns from data and perform tasks without explicit programming. Realized through a combination of statistical methods and algorithms.
Deep Learning
A machine learning technique using multi-layer neural networks. Demonstrates high performance in image recognition, natural language processing, etc. Requires large amounts of data and computational resources.
Feature
Individual variables or attributes used as input to a model. Meaningful information extracted from data.
Label
Correct answer data in supervised learning. Target values or classification categories that the model should predict.
Instance
Individual data points. Composed of a combination of features and labels.
Batch
A set of data processed simultaneously during model training. Affects memory efficiency and training speed.
Epoch
A unit representing one complete processing of all training data. Model is gradually improved through multiple epochs of learning.
Iteration
One update of model parameters. Often refers to processing per batch.
Parameters
Values optimized by the model during the learning process. Includes weights and biases.
Hyperparameters
Control parameters set before model learning. Includes learning rate and batch size.
Inductive Bias
Assumptions or hypotheses inherent in the model. Characterizes the nature of the learning algorithm.
Generalization Performance
The model's predictive ability on unseen data. Balancing overfitting and underfitting is important.
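The relationship between batch, epoch, and iteration described above can be sketched with a little arithmetic. This is a minimal illustration, assuming one parameter update per batch; the function names and the sample counts are hypothetical and not part of any AWS API.

```python
import math


def iterations_per_epoch(num_samples: int, batch_size: int) -> int:
    """Parameter updates needed to process every training sample once."""
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(num_samples / batch_size)


def total_iterations(num_samples: int, batch_size: int, epochs: int) -> int:
    """Parameter updates over the whole training run."""
    return iterations_per_epoch(num_samples, batch_size) * epochs


# Hypothetical numbers: 10,000 samples, batches of 32, 5 epochs.
print(iterations_per_epoch(10_000, 32))  # 313
print(total_iterations(10_000, 32, 5))   # 1565
```

Batch size and the number of epochs are hyperparameters here, while the values updated in each iteration are the model's parameters.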
Generative AI Related Concepts
Term
Description
Foundation Model (FM)
A general-purpose AI model pre-trained on large-scale data. Adaptable to various tasks. Forms the basis for transfer learning and fine-tuning.
Large Language Model (LLM)
A large-scale foundation model specialized in natural language processing. Examples include GPT and BERT. Demonstrates high performance in text generation and comprehension tasks.
RAG (Retrieval-Augmented Generation)
A method to improve the output quality of generative AI by searching and referencing external knowledge. Effective in preventing hallucination and improving accuracy.
Prompt
Input text to generative AI models. Instructions or context to control the model's output.
Token
The smallest unit into which text is split before a model processes it. Composed of words or substrings. The basis for input/output limitations of models.
Temperature
An inference parameter that controls the randomness of generation. Higher values lead to more diverse outputs, lower values to more deterministic outputs.
Top-p sampling
A sampling method that selects the next token from the smallest set of candidates whose cumulative probability reaches p. Controls the balance between output diversity and quality.
Top-k sampling
A sampling method that selects the next token from the k most probable candidates. Used to control output diversity.
Context window
The maximum number of tokens that the model can process at once, including the prompt. Affects the understanding of long contexts.
In-context learning
The ability to learn tasks through examples within the prompt. Adaptation without additional training.
Fine-tuning
The process of adapting a foundation model to specific tasks or domains. Specialization through additional learning.
Prompt engineering
The technique of designing effective prompts. Improves the quality and consistency of outputs.
Hallucination
The phenomenon where a model generates information not based on facts. A challenge for reliability.
Style transfer
A generative technique to change the style of existing content. Used for images and text.
Latent space
A compressed representation space of data learned by generative models. Controls the diversity of generation.
Attention mechanism
A mechanism to focus on important parts of the input. Core technology of Transformer models.
Self-attention mechanism
A mechanism to learn relationships between elements within a sequence. Effective for capturing long-range dependencies.
Decoder
The part that generates the desired output from latent representations. An important component of generative models.
Encoder
The part that converts input into latent representations. Responsible for information compression and feature extraction.
Transformer
An architecture based on self-attention mechanism. The foundation of modern generative AI.
Multimodal
The ability to handle multiple data formats such as text, images, and audio.
Zero-shot capability
The ability to perform new tasks from instructions alone, relying only on knowledge gained in pre-training. Adaptation without examples in the prompt.
Few-shot capability
The ability to perform new tasks with a few examples in prompts. Efficient adaptive learning.
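Temperature, top-k sampling, and top-p sampling from the table above can be sketched as operations on the model's raw next-token scores. This is a minimal sketch under simplifying assumptions: the four-token vocabulary and its logits are made up, the function names are hypothetical, and real model runtimes may combine or order these steps differently.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def sample(filtered, rng=random):
    """Draw one token index from a {token_index: probability} mapping."""
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-token vocabulary
probs = softmax_with_temperature(logits, temperature=0.7)
next_token = sample(top_p_filter(probs, p=0.9))
```

Lowering `temperature` concentrates probability on the highest-scoring token, while smaller `k` or `p` values shrink the candidate set before sampling.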
Machine Learning Approaches
Term
Description
Parametric learning
An approach where the model shape is fixed and the number of parameters is constant. Examples include linear regression and logistic regression.
Non-parametric learning
An approach where the complexity of the model changes according to the data. Examples include k-NN and kernel methods.
Ensemble learning
An approach that combines multiple learners to improve performance. Examples include Random Forest and Boosting.
Foundational Learning Theories
Term
Description
Maximum Likelihood Estimation
A method to estimate parameters by maximizing the probability of obtaining the data.
Bayesian Estimation
A method to estimate parameters by calculating posterior probability from prior probability and data likelihood.
Empirical Risk Minimization
A principle to minimize prediction errors on training data.
Structural Risk Minimization
A principle to minimize prediction errors while considering model complexity.
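As a small worked example of maximum likelihood estimation and Bayesian estimation, consider estimating the probability of heads from coin flips. The sketch below assumes a Bernoulli model and, for the Bayesian case, a Beta prior; both choices and the function names are illustrative assumptions rather than anything prescribed by the glossary.

```python
def bernoulli_mle(outcomes):
    """Maximum likelihood estimate of P(heads): the observed fraction of heads."""
    return sum(outcomes) / len(outcomes)


def beta_posterior_mean(outcomes, alpha=1.0, beta=1.0):
    """Posterior mean of P(heads) under a Beta(alpha, beta) prior.

    The prior acts like alpha imaginary heads and beta imaginary tails
    observed before seeing the data.
    """
    heads = sum(outcomes)
    return (alpha + heads) / (alpha + beta + len(outcomes))


flips = [1, 1, 1, 0]  # hypothetical data: 1 = heads, 0 = tails
print(bernoulli_mle(flips))        # 0.75
print(beta_posterior_mean(flips))  # pulled toward 0.5 by the uniform prior
```

With little data the prior noticeably moves the Bayesian estimate; as more flips are observed, the two estimates converge.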
Loss Functions
Term
Description
Squared Loss
The square of the difference between predicted and actual values. Commonly used in regression problems.
Cross-Entropy Loss
A loss function used in classification problems. Measures the dissimilarity between the predicted probability distribution and the true distribution.
Hinge Loss
A loss function used in SVMs. Achieves margin maximization.
Huber Loss
A loss function robust to outliers. A combination of squared loss and absolute loss.
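The four loss functions above can be written out for a single sample as follows. This is a minimal sketch: the function names, the `delta` default for Huber loss, and the small `eps` used to avoid `log(0)` are assumptions made for illustration.

```python
import math


def squared_loss(y_true, y_pred):
    """Square of the difference between actual and predicted values."""
    return (y_true - y_pred) ** 2


def cross_entropy_loss(true_index, predicted_probs, eps=1e-12):
    """Cross-entropy for one sample whose true class is predicted_probs[true_index]."""
    return -math.log(max(predicted_probs[true_index], eps))


def hinge_loss(y_true, score):
    """Hinge loss; y_true is -1 or +1 and score is the raw model output."""
    return max(0.0, 1.0 - y_true * score)


def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    error = abs(y_true - y_pred)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)


# An error of 3 costs 9 under squared loss but only 2.5 under Huber loss.
print(squared_loss(0.0, 3.0), huber_loss(0.0, 3.0))
```

The comparison in the last line shows why Huber loss is described as robust to outliers: large errors grow linearly instead of quadratically.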
Optimization Theory
Term
Description
Convex Optimization
A special optimization problem where local optima are global optima.
Stochastic Optimization
A method that uses randomness to search for optimal solutions.
Constrained Optimization
An optimization problem under constraints. Solved using methods like Lagrange multipliers.
Terms Related to Learning Process
Term
Description
Vanishing Gradient Problem
A phenomenon in deep neural networks where gradients vanish during backpropagation. Makes learning difficult in deep layers.
Exploding Gradient
A phenomenon in deep neural networks where gradients grow exponentially. Causes instability in learning.
Sparsity
A property where most values in the data or most model parameters are zero. Affects computational efficiency and generalization performance.
Curse of Dimensionality
A problem where the required amount of data increases exponentially as the number of feature dimensions increases. A challenge in high-dimensional data analysis.
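The vanishing gradient problem can be illustrated numerically. The derivative of the sigmoid function never exceeds 0.25, so a gradient passed back through many sigmoid layers shrinks geometrically. The sketch below is a deliberately simplified assumption: it treats every layer's weight as 1 and every pre-activation as the same value, which real networks do not do.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def gradient_scale(depth, pre_activation=0.0):
    """Product of sigmoid derivatives across `depth` layers (weights assumed to be 1)."""
    return sigmoid_grad(pre_activation) ** depth


for depth in (1, 5, 10, 20):
    print(depth, gradient_scale(depth))
```

Even in this best case for sigmoid, the scale factor after 10 layers is below one millionth, which is why early layers of deep sigmoid networks learn slowly.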
Data Quality Related Terms
Term
Description
Data Imbalance
A state where there is a large difference in the number of samples between classes. Makes learning difficult for minority classes.
Noise
Unwanted variations or errors in data. Can hinder model learning.
Outliers
Values that deviate significantly from the general distribution of data. Can negatively affect model learning.
Missing Values
Unrecorded or unmeasured values in a dataset. Requires appropriate handling.
Model Evaluation Related Terms
Term
Description
Baseline
A simple model or performance metric used as a comparison standard. Used to evaluate improvement.
Significance Testing
A method to evaluate whether performance differences between models are statistically meaningful.
Cross-Entropy
A metric that measures the difference between predicted probabilities and true distribution in classification problems.
Confusion Matrix
A table that aggregates prediction results by classification. Used for performance evaluation.
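A confusion matrix for binary classification can be tallied in a few lines. This is a minimal sketch with hypothetical function names; precision and recall are not defined in the table above and are included only as commonly derived metrics.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
        else:
            counts["fn" if actual == positive else "tn"] += 1
    return counts


def precision_recall(counts):
    """Derive precision and recall from the confusion counts."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [1, 1, 0, 0, 1]  # hypothetical labels
y_pred = [1, 0, 0, 1, 1]  # hypothetical predictions
counts = confusion_counts(y_true, y_pred)
print(dict(counts), precision_recall(counts))
```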
Activation Functions
Term
Description
ReLU
One of the most commonly used activation functions in deep neural networks. A simple non-linear function that sets negative inputs to zero.
Sigmoid
Converts output to a range of 0-1. Often used in the output layer for binary classification.
tanh
Converts output to a range of -1 to 1. Mitigates the vanishing gradient problem better than sigmoid.
Softmax
Outputs probability distribution for multiple classes. Used in the output layer for multi-class classification.
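The four activation functions above are short enough to write out directly. This is a plain-Python sketch for scalar inputs (and a list for softmax); deep learning frameworks provide vectorized versions, and the function names here are illustrative.

```python
import math


def relu(x):
    """Set negative inputs to zero, pass positive inputs through."""
    return max(0.0, x)


def sigmoid(x):
    """Map any real number into the range (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # avoids overflow for large negative x
    return z / (1.0 + z)


def tanh(x):
    """Map any real number into the range (-1, 1)."""
    return math.tanh(x)


def softmax(scores):
    """Convert a list of scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-2.0), sigmoid(0.0), tanh(0.0))
print(softmax([1.0, 2.0, 3.0]))
```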
Types of Learning Algorithms
Term
Description
Perceptron
The most basic neural network. Suitable for linearly separable problems.
SVM (Support Vector Machine)
A method that determines the classification boundary by maximizing margin. Non-linear classification is possible with kernel trick.
Decision Tree
A method that makes predictions by hierarchically dividing data. High interpretability and easy evaluation of feature importance.
k-Nearest Neighbors
A method that makes predictions based on the majority of the k nearest training data. Simple but computationally expensive.
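The k-Nearest Neighbors description above can be sketched as a majority vote over the closest training points. This is a minimal illustration using Euclidean distance; the function name and the toy data are assumptions. It also shows why the method is computationally expensive: every prediction measures the distance to every training point.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Distance from the query to every training point (the costly step).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-cluster data set.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
labels = ["a", "a", "a", "b", "b"]
print(knn_predict(points, labels, (0.5, 0.5)))  # a
print(knn_predict(points, labels, (5.5, 5.0)))  # b
```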
Data Quality Indicators
Term
Description
Data Completeness
An indicator of the degree of missing values, duplicates, and inconsistencies in a dataset.
Data Consistency
An indicator of whether data formats and value ranges are as expected.
Data Freshness
An indicator of the update time and expiration date of data.
Data Representativeness
An indicator of whether the sample appropriately represents the population.
Model Quality Indicators
Term
Description
Prediction Stability
An indicator of prediction consistency for similar inputs.
Model Confidence
An indicator of the model's confidence in each prediction.
Explainability
An indicator of the ease of interpreting the reasons for model predictions.
Robustness
An indicator of the model's resistance to noise and outliers.
Statistical Concepts
Term
Description
Analysis of Variance
A statistical method for analyzing sources of variation in data.
Hypothesis Testing
A method for verifying statistical hypotheses.
Confidence Interval
A range that quantifies the uncertainty of an estimate.
Effect Size
An indicator of the practical magnitude of statistical differences.
Model Development Process
[Amazon SageMaker components useful in this category]
* Amazon SageMaker Studio provides an integrated development environment (IDE) that enables one-stop execution from notebook creation to model development, training, and deployment, realizing centralized management of ML workflows.
* Amazon SageMaker Canvas provides a no-code ML development environment that allows data preparation to model deployment through drag & drop without writing code, enabling development for business analysts.
Model Development Process
Phase
Description
Data Collection
Collection and integration of data necessary for learning. Includes identification of data sources and quality checks. Consider data representativeness and balance.
Data Preprocessing
Implement data splitting, cleaning, data labeling, feature engineering, scaling (normalization, standardization). Improve data quality and convert to a format suitable for learning. Include handling of missing values and outliers.
Model Selection
Select appropriate algorithms and architectures based on problem type (classification/regression, etc.), data characteristics, requirements (accuracy/speed/explainability). Consider computational resource constraints and deployment environment. Also consider the possibility of using pre-trained models.
Model Training
Train the model using the selected algorithm. Include hyperparameter optimization. Conduct performance evaluation through cross-validation.
Model Evaluation
Verify model performance. Analyze from multiple angles using various evaluation metrics. Confirm generalization performance on test data.
Deployment
Deploy the model to the production environment. Include scaling and monitoring settings. Also conduct A/B testing to verify effectiveness.
Inference
Execute predictions on new data using the deployed model. Perform predictions in real-time or batch processing.
Monitoring
Continuously monitor model performance. Detect drift and make decisions on retraining. Track quality metrics and set alerts.
Data Collection
Types of Data
Type
Description
Structured Data
Data organized in tabular form. Such as data managed in RDBMS. Has a clear schema.
Unstructured Data
Data without a fixed structure. Such as text, images, audio, video. Requires special techniques for processing.
Semi-structured Data
Partially structured data. Such as JSON, XML, HTML. Has a flexible schema.
Vector Data
Data represented as numerical vectors. Such as word embeddings, feature vectors. Suitable for similarity calculations.
ETL (Extract, Transform, Load)
Phase
Description
Extract
Extract data from various sources. Check data format and quality. Perform consistency checks.
Transform
Transform and process data. Execute cleansing, normalization, aggregation, etc. Transform according to business rules.
Load
Save and load processed data. Store in data warehouses or data lakes. Ensure consistency.
Data Preprocessing
[Amazon SageMaker components useful in this category]
* Amazon SageMaker Data Wrangler provides tools to streamline data preparation and preprocessing, enabling data cleansing to feature engineering via GUI.
* Amazon SageMaker Canvas provides functionality to perform data preprocessing, feature engineering, data transformation, etc. via GUI without writing code, enabling data preparation by business analysts.
Data Splitting
Term
Description
Training Data
Dataset used for model learning. Typically accounts for about 60-80% of all data.
Validation Data
Dataset used for model hyperparameter tuning and performance evaluation. Typically accounts for about 10-20% of all data.
Test Data
Independent dataset used for final model evaluation. Typically accounts for about 10-20% of all data.
Holdout Method
Basic method of splitting data into training and evaluation sets. Used when data quantity is sufficient.
Stratified Sampling
Method of splitting data while maintaining class ratios. Important for imbalanced datasets.
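Stratified sampling, as described above, splits each class separately so that the class ratios carry over to every subset. The sketch below is one simple way to do this with the standard library; the function name, the 20% test ratio, and the fixed seed are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.2, seed=0):
    """Return (train_indices, test_indices), preserving class ratios approximately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class
        n_test = round(len(indices) * test_ratio)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical imbalanced data set: 10 positive and 90 negative samples.
labels = ["pos"] * 10 + ["neg"] * 90
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With a purely random split, a small test set could end up with very few minority samples or none at all; splitting per class avoids that.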
Cleansing (Cleaning)
Task
Description
Noise Removal
Detection and removal of outliers and noise. Improves data quality. Utilizes statistical methods and domain knowledge.
Missing Value Handling
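Imputation or removal of missing data. Impute with mean, median, predicted values, etc. Consider whether the data are Missing At Random (MAR) or Missing Completely At Random (MCAR), since the appropriate handling depends on that assumption.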
Outlier Detection
Identification of outliers using statistical methods or ML techniques. Important to check consistency with domain knowledge.
Duplicate Data Elimination
Detection and removal of duplicate records. Ensures data consistency. Normalization of key items.
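Three of the cleansing tasks above can be sketched with the standard library: median imputation of missing values, outlier detection, and duplicate elimination. The interquartile range (IQR) rule with a factor of 1.5 is one common statistical convention, not the only option, and all function names here are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, factor=1.5):
    """Return values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - factor * spread, q3 + factor * spread
    return [v for v in values if v < low or v > high]


def drop_duplicates(records):
    """Remove repeated records while keeping the first occurrence and the order."""
    seen, unique = set(), []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


print(impute_median([1, None, 3]))
print(iqr_outliers([10, 11, 12, 13, 14, 15, 100]))
print(drop_duplicates([("a", 1), ("b", 2), ("a", 1)]))
```

Whether a flagged value is a genuine error or a rare but valid observation still has to be checked against domain knowledge.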
Data Labeling
[Amazon SageMaker components useful in this category]
* Amazon SageMaker Ground Truth provides a data labeling service for creating high-quality training datasets.
Method
Description
Manual Labeling
Direct labeling by humans. High quality but time and cost intensive. Effective when specialized knowledge is required.
Semi-Automatic Labeling
Combination of AI prediction and human verification. Enables efficient labeling. Achieves balance between quality and efficiency.
Active Learning
Selection of target data for efficient labeling. Prioritizes data with high uncertainty. Optimizes labeling costs.
Label Quality Management
Checks for consistency and errors. Includes consensus building among multiple annotators. Setting and monitoring of quality metrics.
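The active learning row above mentions prioritizing data with high uncertainty. One simple way to do that is least-confidence sampling: send the samples whose top predicted probability is lowest to human annotators first. This is a sketch of that one strategy under the assumption that a model already produces class probabilities; the function name and the probabilities are made up.

```python
def least_confident(probabilities, budget):
    """Indices of the `budget` samples whose top predicted probability is lowest."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: max(probabilities[i]),
    )
    return ranked[:budget]


# Hypothetical class probabilities for four unlabeled samples.
predicted = [
    [0.90, 0.10],
    [0.55, 0.45],
    [0.60, 0.40],
    [0.99, 0.01],
]
print(least_confident(predicted, budget=2))  # [1, 2]
```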
Feature Engineering
[Amazon SageMaker components useful in this category]
* Amazon SageMaker Feature Store provides a centralized repository for managing and sharing features, ensuring consistency of features both online and offline.
Technique
Description
Feature Selection
Selection of useful input variables for the model. Selection based on correlation analysis and importance evaluation. Contributes to dimensionality reduction and model performance improvement. Used when there are many features or when you want to remove unnecessary features to prevent model overfitting. Particularly effective when analyzing datasets with many variables, such as medical or financial data.
Feature Extraction
The process of extracting meaningful features from raw data. Examples include Fourier transform in signal processing and edge detection from images. Used when there is a need to extract useful information from complex raw data, especially important in image processing, speech processing, and sensor data analysis. In time series data analysis, it is utilized for extracting statistical measures and frequency characteristics.
Feature Scaling
The process of adjusting the value range of features. Includes standardization and normalization. Essential when dealing with datasets that have features on different scales, particularly important for machine learning algorithms using gradient descent methods and clustering algorithms that perform distance calculations.
Feature Interaction
Creation of new features by combining multiple features. Used when wanting to capture non-linear relationships or model phenomena that cannot be explained by individual features alone. Particularly utilized in regression analysis and predictive models to improve prediction accuracy.
Dimensionality Reduction
Techniques for reducing the feature dimensions of data. PCA and t-SNE are representative methods. Contributes to improved computational efficiency and performance. Used when visualization of high-dimensional data is necessary or when wanting to reduce computational costs. Particularly useful when dealing with high-dimensional data in image recognition and document classification.
Encoding
Numerical conversion of categorical values. Uses methods such as One-Hot, Label, and Target encoding. Select appropriate methods based on data characteristics. Essential when building machine learning models that handle categorical data, especially important when there are many categories or when relationships between categories need to be considered.
Embedding
Conversion of high-dimensional data into low-dimensional vector representations. Examples include Word2Vec and BERT. Preserves semantic similarity. Used when dealing with text data or large-scale categorical data, playing a particularly important role in natural language processing and recommender system construction.
Data Augmentation
Enhancement of training data by transforming existing data. Includes rotation, scaling, etc. Improves model generalization performance. Used when training data is limited or when wanting to prevent model overfitting. Particularly effective in tasks using deep learning, such as image recognition and speech recognition.
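The two feature scaling methods named in the table above, standardization and normalization, can be sketched as follows. Here "normalization" is taken to mean min-max scaling to the range 0 to 1, which is a common but not universal usage; the function names are illustrative.

```python
import statistics


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mean) / std for v in values]


def min_max_normalize(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot normalize a constant feature")
    return [(v - low) / (high - low) for v in values]


ages = [20, 35, 50]                  # hypothetical feature in years
incomes = [30_000, 60_000, 120_000]  # hypothetical feature in a currency unit
print(min_max_normalize(ages), min_max_normalize(incomes))
print(standardize(ages))
```

After scaling, both features occupy comparable ranges, so distance calculations and gradient steps are no longer dominated by the feature with the larger raw values.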
Encoding Techniques
Encoding Technique
Description and Use Cases
Label Encoding
A technique that converts categorical values into consecutive integer values. Suitable when there is an ordinal relationship between categories (e.g., education level, age group). Memory-efficient and often used in decision tree-based algorithms. However, caution is needed for non-ordinal categorical data, because the integers imply an order and magnitude between categories that does not actually exist.
One-Hot Encoding
A technique that converts categorical values into binary vectors. Optimal for cases where there is no ordinal relationship between categories (e.g., color, gender, occupation). Treats each category equally, but can lead to high memory consumption and increased computational cost due to dimensionality increase when there are many categories. Particularly important in linear models and neural networks.
Target Encoding
A technique that replaces categorical values with the mean of the target variable. Effective when there are a very large number of categories or when there is a strong association between categories and the target variable. However, there is a risk of overfitting, so appropriate regularization and cross-validation are necessary. Particularly useful for improving performance in predictive models.
Frequency Encoding
A technique that converts categories to numerical values based on their frequency of occurrence. Suitable when the frequency of categories holds significant meaning (e.g., product popularity, usage frequency). Easy to implement and interpret, but has the limitation of not being able to distinguish between categories with the same frequency.
Binary Encoding
A technique that converts categories into binary representations. More memory-efficient than One-Hot Encoding as it can represent with fewer dimensions, useful when there are many categories. However, the generated features can be difficult to interpret, and relationships between categories may be lost.
Hash Encoding
A technique that uses a hash function to convert categories into fixed-dimensional features. Suitable for cases with an extremely large number of categories or when new categories are continuously added. Memory-efficient and can handle online learning, but there is a possibility of information loss due to hash collisions.
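Four of the encoding techniques above can be compared on the same small column. This is a minimal sketch with hypothetical function names. In particular, the target encoding shown is the naive per-category mean with no smoothing or cross-validation, so it illustrates the idea but is exposed to the overfitting risk noted in the table.

```python
from collections import Counter, defaultdict


def label_encode(values):
    """Map each category to an integer (alphabetical order is an arbitrary choice)."""
    mapping = {category: i for i, category in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Map each category to a binary vector with a single 1."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def frequency_encode(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    return [counts[v] / len(values) for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target value seen for it (no smoothing)."""
    sums, counts = defaultdict(float), Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    return [sums[v] / counts[v] for v in values]


colors = ["red", "blue", "red", "green"]  # hypothetical categorical column
bought = [1, 0, 0, 1]                     # hypothetical binary target
print(label_encode(colors)[0])
print(one_hot_encode(colors)[0])
print(frequency_encode(colors))
print(target_encode(colors, bought))
```

Note that frequency encoding gives "blue" and "green" the same value here, which is the limitation mentioned in the table.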
Feature Extraction Techniques for Text Data
Technique
Description
TF-IDF
A technique that calculates word importance based on term frequency and inverse document frequency. A fundamental feature in text analysis. Widely used in document classification, information retrieval, keyword extraction, and other tasks where word importance needs to be considered.
Word2Vec
A technique that converts words into fixed-length dense vectors. Captures semantic similarity between words. Used in natural language processing tasks that require consideration of word meaning relationships and context, such as sentiment analysis, document classification, question-answering systems, and machine translation.
Doc2Vec
An extension of Word2Vec that learns vector representations for entire documents. Used for document similarity calculations. Suitable for tasks requiring semantic comparison at the document level, such as document classification, clustering, recommender systems, and similar document search.
FastText
A word embedding technique that considers substrings. Capable of handling unknown words. Particularly effective in processing languages with rich morphology, handling text with spelling errors, and analyzing social media posts where new or modified words frequently appear.
BERT Tokenization
A tokenization technique for BERT models. Uses the WordPiece algorithm. Used as essential preprocessing when using BERT models for advanced natural language processing tasks that consider context, such as sentiment analysis, named entity recognition, and question answering.
BPE (Byte Pair Encoding)
A technique that learns subword units by repeatedly merging the most frequent pair of adjacent symbols (characters or previously merged units). The number of merges controls the vocabulary size. Particularly effective in machine translation, multilingual processing, and processing languages with complex morphology where efficient handling of large vocabularies is necessary.
Bag of Words (BoW)
The most basic technique that vectorizes word frequency in documents. Does not consider word order. Used in basic text analysis tasks where word frequency alone is sufficient for performance, such as spam email detection, document classification, and topic classification.
n-gram
A technique that uses combinations of n consecutive words or characters as features. Captures local context. Used in language modeling, spell checking, author identification, predictive input for programming languages, and other cases where local context or word order is important.
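The simplest entries in the table above (Bag of Words, n-gram, TF-IDF) can be sketched in a few lines. This is a minimal sketch assuming pre-tokenized input and the unsmoothed definition idf = log(N / df); libraries commonly use smoothed variants, so exact values will differ. The function names are illustrative.

```python
import math
from collections import Counter

def bag_of_words(tokens):
    """Count how often each token occurs; word order is discarded."""
    return Counter(tokens)

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tf_idf(docs):
    """docs is a list of token lists; returns one {term: weight} dict per document.

    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()})
    return weights

docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
print(ngrams(docs[0], 2))
print(tf_idf(docs)[2])
```

A term that appears in every document (here "the") gets weight 0, while a term unique to one document gets the highest weight.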
Text Data Preprocessing
Method
Description
Tokenization
Splitting text into words or substrings. Selection of appropriate splitting method based on language characteristics.
Normalization
Unification of uppercase and lowercase, accent removal, character type standardization, etc.
Stop Word Removal
Removal of common words with little information (articles, prepositions, etc.).
Lemmatization/Stemming
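Conversion of words to their base form. Lemmatization uses a dictionary or morphological analysis to return a valid word (e.g., "better" to "good"); stemming strips suffixes by rule and may produce non-words.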
Noise Removal
Removal of special characters, HTML/XML tags, unnecessary spaces, etc.
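The preprocessing steps in the table above are usually chained. The following is a minimal sketch for English text; the stop word list is a tiny illustrative sample and `naive_stem` is a deliberately crude stand-in for a real stemmer or lemmatizer, both assumptions made for this example.

```python
import re
import unicodedata

STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "is", "are", "to"}  # illustrative only

def strip_noise(text):
    """Noise removal: drop HTML/XML tags, special characters, and extra spaces."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def normalize(text):
    """Normalization: remove accents and unify case."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.lower()

def naive_stem(token):
    """Very rough suffix stripping; real stemmers use far more rules."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Noise removal, normalization, whitespace tokenization, stop word removal, stemming."""
    tokens = normalize(strip_noise(text)).split()
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]

print(preprocess("<b>The cats</b> are sitting on the mats!"))
```

Whitespace tokenization works for this English example only; languages without word delimiters need a morphological analyzer, as the Tokenization row notes.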
Image Data Preprocessing
Method
Description
Resize
Standardization of image size. Convert to a size suitable for model input.
Normalization
Standardization of pixel values. Generally converted to a range of 0 to 1 or -1 to 1.
Color Space Conversion
Conversion to color spaces such as RGB, HSV, grayscale according to purpose.
Noise Removal
Noise reduction using median filters or Gaussian filters.
Data Augmentation
Enhancement of training data through rotation, flipping, scaling, etc.
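A few of the image operations above can be shown on a tiny image stored as nested lists. This is a minimal sketch assuming 8-bit pixel values (0 to 255); the function names are illustrative, and real pipelines would normally use an image library.

```python
def normalize_pixels(image, signed=False):
    """Scale 8-bit pixel values to 0..1, or to -1..1 when signed=True."""
    if signed:
        return [[p / 127.5 - 1.0 for p in row] for row in image]
    return [[p / 255.0 for p in row] for row in image]

def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to luminance using ITU-R BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb_image]

def flip_horizontal(image):
    """Mirror each row left to right, a common data augmentation."""
    return [row[::-1] for row in image]

image = [[0, 128, 255], [64, 32, 16]]
print(normalize_pixels(image))
print(flip_horizontal(image))
```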
Scaling Methods
Method
Description
Normalization
A transformation that fits data into a specific range (usually 0-1). Unifies the scale between features, making them comparable.
Standardization
A transformation that converts data to a distribution with mean 0 and standard deviation 1. Susceptible to outliers but suitable for normally distributed data.
Standard Scaler
A scaler that implements standardization. Converts to mean 0 and standard deviation 1. Suitable for data following normal distribution.
Robust Scaler
Robust scaling using median and interquartile range. Less affected by outliers.
Min Max Scaler
A scaler that implements normalization. Converts data to a specified range such as 0-1. Suitable for neural network inputs.
Max Absolute Scaler
Scales each feature by its maximum absolute value into the range -1 to 1. Suitable for sparse data because it does not shift or center the values, so zero entries stay zero.
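The four scalers in the table above differ only in which statistics they subtract and divide by. The following minimal sketch works on a single feature column; the function names are illustrative, and the use of population standard deviation and inclusive quartiles is an assumption made here for simplicity.

```python
import statistics

def min_max_scale(xs, lo=0.0, hi=1.0):
    """Normalization: map the minimum to lo and the maximum to hi."""
    x_min, x_max = min(xs), max(xs)
    return [lo + (x - x_min) * (hi - lo) / (x_max - x_min) for x in xs]

def standard_scale(xs):
    """Standardization: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]

def robust_scale(xs):
    """Subtract the median, divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in xs]

def max_abs_scale(xs):
    """Divide by the largest absolute value; zeros stay zero."""
    m = max(abs(x) for x in xs)
    return [x / m for x in xs]

data = [1, 2, 3, 4, 100]  # 100 is an outlier
print(min_max_scale(data))
print(robust_scale(data))
```

With the outlier present, min-max scaling squeezes the first four values close to 0, while robust scaling keeps them spread between -1 and 0.5.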
Model Selection
[Amazon SageMaker components useful in this category] * Amazon SageMaker JumpStart functions as an ML hub providing pre-trained models and solutions, deployable with one click.
Model Types
Classification
Description
Supervised Learning Models
Predictive models for classification or regression. Uses labeled data. Examples include Random Forest, SVM, Neural Networks.
Unsupervised Learning Models
Models for clustering and pattern discovery. Uses unlabeled data. Examples include K-means, PCA, autoencoders.
Semi-Supervised Learning Models
Models that learn using a small amount of labeled data and a large amount of unlabeled data. Examples include pseudo-labeling, co-training.
Generative Models
Models that generate data or learn distributions. Examples include GANs, VAE, Diffusion Models.
Transfer Learning Models
Models that utilize pre-trained knowledge. Based on foundation models such as BERT, GPT, ResNet.
Selection Criteria
Criteria
Description
Data Characteristics
Model selection based on data quantity, dimensionality, presence of noise, class balance, sparsity, etc.
Computational Resources
Selection based on available memory, CPU/GPU, and training time constraints.
Prediction Performance
Selection based on target performance metrics such as accuracy, recall, F1 score.
Inference Speed
Selection based on real-time requirements, batch processing requirements.
Explainability
Selection based on requirements for model transparency and interpretability.
Scalability
Selection based on ability to handle increases in data volume and system expansion.
Cost
Selection based on total cost of ownership including development, training, and operation.
Common Algorithms
Algorithm
Application Scenarios
Linear Regression
Simple regression problems, when interpretation of relationships is important.
Logistic Regression
Binary classification problems, when probability prediction is needed.
Random Forest
Classification and regression of structured data, analysis of feature importance.
XGBoost/LightGBM
High-performance prediction problems with structured data.
Neural Networks
Complex pattern recognition, image and speech processing.
BERT/Transformer
Text processing, natural language understanding tasks.
CNN
Image recognition, pattern detection.
RNN/LSTM
Time series data analysis, sequence data processing.
Reinforcement Learning Models
Decision making, game strategies, robot control, etc.
Architecture Considerations
Element
Description
Model Size
Consideration of number of parameters, memory requirements, storage requirements.
Layer Configuration
Selection of number of layers, number of units, activation functions for neural networks.
Ensemble Methods
Methods of combining multiple models, voting or averaging strategies.
Quantization and Compression
Model lightweighting, adaptation to edge deployment.
Batch Size
Balance between memory usage and speed during training and inference.
Distributed Learning Support
Possibility of learning on multiple GPUs or multi-nodes.
Model Selection Strategies
Strategy
Description
Baseline Construction
Strategy to start with simple models and gradually increase complexity.
AutoML
Utilization of tools for automatic model selection and optimization.
Algorithm Comparison
Evaluate multiple models in parallel and select the optimal one.
Experiment Management
Tracking and documentation of the model selection process.
A/B Testing
Comparison of different models' performance in real environments.
Gradual Optimization
Strategy to optimize while gradually increasing model complexity.
Model Training
[Amazon SageMaker components useful in this category] * Amazon SageMaker Debugger performs debugging and monitoring of the training process, enabling visualization of metrics and setting of alerts.
Basic Concepts of Model Learning
Concept
Description
Inductive Bias
Assumptions or hypotheses inherent in the model. Determines the nature of the learning algorithm. Appropriate bias improves generalization performance.
Bias-Variance Tradeoff
The tradeoff relationship between model complexity and generalization performance. Balance between overfitting and underfitting.
Cross-Entropy Loss
A loss function commonly used in classification problems. Measures the difference between predicted probabilities and true distribution.
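For a single example with one true class, cross-entropy loss reduces to the negative log of the probability assigned to that class. The sketch below assumes the model outputs raw scores (logits) that are turned into probabilities with softmax; the function names are illustrative.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index, eps=1e-12):
    """Negative log-probability of the true class; eps guards against log(0)."""
    return -math.log(max(probs[true_index], eps))

probs = softmax([2.0, 0.5, 0.1])
print(cross_entropy(probs, 0))  # true class has the highest score: small loss
print(cross_entropy(probs, 2))  # true class has the lowest score: large loss
```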
Training Methods
Method
Description
Supervised Learning
A method of learning using data with correct labels. Used for classification and regression tasks. The quality and quantity of data determine performance.
Unsupervised Learning
A method to discover patterns from unlabeled data. Used for clustering and anomaly detection. Reveals latent structures in data.
Semi-Supervised Learning
A method of learning using a small amount of labeled data and a large amount of unlabeled data. Achieves high performance while suppressing labeling costs.
Reinforcement Learning
A method to learn actions that maximize rewards through interaction with the environment. Acquires optimal action policies through trial and error.
Transfer Learning
A method to apply knowledge learned from one task to another task. Improves learning efficiency by utilizing pre-trained models.
Batch Learning
A learning method that processes all training data at once. Enables stable learning with good computational efficiency. Requires retraining when data is updated.
Online Learning
A learning method that processes data sequentially and continuously updates the model. Quick adaptation to new patterns. Risk of instability.
Incremental Learning
A method to perform additional learning on existing models with new data. Enables model updates without complete retraining.
Pre-training
The basic process of training a model from scratch with large-scale data. Acquires general knowledge and patterns. Forms the foundation for subsequent tasks.
Fine-tuning
A method to adapt pre-trained models to specific tasks. Achieves high performance even with small amounts of data. A type of transfer learning.
Continuous Pre-training
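Additional pre-training of an already pre-trained model on new, typically unlabeled, domain data (often called continued pre-training). Effective for building and maintaining domain-specific models. Repeating it periodically with fresh data also serves as a drift countermeasure.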
RLHF
Reinforcement Learning from Human Feedback. A reward signal derived from human preference judgments is used to fine-tune the model. Improves quality and safety of generative AI models. Addresses alignment problems.
Custom Vocabulary Learning
A method to train models on specialized terminology in specific fields. Important for building domain-specific models. Contributes to improving expertise.
Meta-learning
A method to learn the learning algorithm itself. Realizes high flexibility and efficiency for new tasks. The foundation of automatic ML systems. Enables quick adaptation to new tasks.
Few-shot Learning
A method that enables learning from a small number of samples. A form of meta-learning.
Zero-shot Learning
A method that can infer even classes not seen during learning. An advanced form of transfer learning.
Self-Supervised Learning
A method to learn by automatically generating teaching signals from unlabeled data. Effective for pre-training.
Multi-task Learning
A method to learn multiple tasks simultaneously. Enables efficient learning through knowledge sharing between tasks.
Federated Learning
A method to perform cooperative learning on multiple clients while keeping data distributed. Effective for privacy protection.
Knowledge Distillation
A method to transfer knowledge from a large model to a small model. Used for model lightweighting.
Model Optimization Methods
Method
Description
Hyperparameter Tuning
Optimization of model configuration parameters. Uses grid search, random search, Bayesian optimization, etc. Consider the balance between computational cost and performance.
Regularization
Addition of penalty terms to prevent overfitting. L1 (Lasso), L2 (Ridge) regularization, etc. Controls model complexity.
Early Stopping
Ends learning when improvement in validation performance is no longer observed, preventing overfitting. Also contributes to efficient use of computational resources.
Cross-validation
Conducts evaluation by dividing data into multiple parts. Accurately estimates model's generalization performance. Particularly effective when data quantity is limited.
Ensemble Learning
Improves performance by combining multiple models. Random Forest, Gradient Boosting, etc. Compensates for weaknesses of individual models.
Gradient Descent
A method to search for optimal solutions by updating parameters based on the gradient of the loss function. There are variations such as Stochastic Gradient Descent (SGD) and Mini-batch Gradient Descent.
Learning Rate Adjustment
Adjustment of parameters that control the update amount in gradient descent. There are adaptive methods such as AdaGrad, Adam, RMSprop.
Momentum
A method to accelerate optimization by using past gradient information. Effective for avoiding local optima and accelerating convergence.
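Gradient descent and momentum from the table above fit in one short loop. This is a minimal one-dimensional sketch; the function name and the example loss f(x) = (x - 3)^2 are assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, lr=0.1, momentum=0.0, steps=100):
    """Minimize a 1-D function given its gradient.

    With momentum > 0, the update reuses a decaying sum of past gradients (the velocity).
    """
    x, velocity = x0, 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(x)
        x += velocity
    return x

# Loss f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
grad = lambda x: 2 * (x - 3)
print(gradient_descent(grad, x0=0.0))
print(gradient_descent(grad, x0=0.0, momentum=0.9, steps=500))
```

Setting `momentum=0.0` gives plain gradient descent; the learning rate `lr` is the update-size parameter that the Learning Rate Adjustment row refers to.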
Optimizers (Optimization Algorithms)
Term
Description
SGD (Stochastic Gradient Descent)
A basic optimization algorithm that calculates gradients and updates parameters on a mini-batch basis.
Adam
A popular optimization algorithm that combines momentum and adaptive learning rates. Shows good convergence in many cases.
RMSprop
An adaptive optimization algorithm that considers past gradients using exponential moving average.
AdaGrad
An adaptive optimization algorithm that applies different learning rates for each parameter.
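To make "combines momentum and adaptive learning rates" concrete, here is a minimal one-dimensional sketch of the Adam update with its usual default coefficients. The function name and the test loss are illustrative assumptions; deep learning frameworks apply the same update per parameter.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a 1-D function with Adam."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # momentum: moving average of gradients
        v = beta2 * v + (1 - beta2) * g * g    # moving average of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)  # step scaled by gradient magnitude
    return x

print(adam_minimize(lambda x: 2 * (x - 3), x0=0.0))
```

Because the step is divided by the root of the squared-gradient average, the first step has size close to `lr` regardless of how large the raw gradient is, which is the adaptive part.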
Regularization Methods
Method
Description
Dropout
A method to prevent overfitting by randomly disabling neurons during training.
Batch Normalization
A method to normalize inputs on a mini-batch basis, stabilizing and accelerating learning.
Layer Normalization
A method to perform normalization at the layer level. Effective for RNNs and Transformers.
Weight Decay
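A regularization method that penalizes the magnitude of weights. Equivalent to L2 regularization under plain SGD; with adaptive optimizers such as Adam the two differ, which is why decoupled variants such as AdamW exist.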
Label Smoothing
A method to prevent model overconfidence by softening teacher labels.
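Two of the methods above, Dropout and Label Smoothing, are simple enough to sketch directly. The sketch assumes the common "inverted dropout" formulation, in which surviving activations are scaled up during training so that nothing needs to change at inference time; function names are illustrative.

```python
import random

def dropout(activations, p=0.5, rng=random):
    """Training-time inverted dropout: zero each unit with probability p,
    and scale the survivors by 1 / (1 - p) to keep the expected value unchanged."""
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def smooth_labels(one_hot, epsilon=0.1):
    """Move epsilon of the probability mass from the true class to a uniform distribution."""
    k = len(one_hot)
    return [(1 - epsilon) * y + epsilon / k for y in one_hot]

print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5, rng=random.Random(0)))
print(smooth_labels([0, 0, 1]))
```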
Learning Rate Scheduling
Method
Description
Step Decay
A method to decrease the learning rate in stages every certain number of epochs.
Exponential Decay
A method to decay the learning rate exponentially.
Cosine Annealing
A method to decrease the learning rate along a cosine curve from its initial value toward a minimum. Can be restarted periodically (warm restarts).
Warm-up
A method to gradually increase the learning rate at the beginning of learning, then transition to normal learning rate scheduling.
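Each schedule in the table above is a function from the epoch number to a learning rate. The following is a minimal sketch; the parameter names (`drop`, `every`, `k`, `warmup_epochs`) and the linear warm-up shape are assumptions for illustration.

```python
import math

def step_decay(base_lr, epoch, drop=0.5, every=10):
    """Multiply the learning rate by `drop` once every `every` epochs."""
    return base_lr * drop ** (epoch // every)

def exponential_decay(base_lr, epoch, k=0.05):
    """Decay the learning rate smoothly by a factor of exp(-k) per epoch."""
    return base_lr * math.exp(-k * epoch)

def cosine_annealing(base_lr, epoch, total_epochs, min_lr=0.0):
    """Follow half a cosine wave from base_lr (epoch 0) down to min_lr (last epoch)."""
    cos = 0.5 * (1 + math.cos(math.pi * epoch / total_epochs))
    return min_lr + (base_lr - min_lr) * cos

def with_warmup(schedule, base_lr, epoch, warmup_epochs=5):
    """Ramp up linearly for warmup_epochs, then hand over to `schedule`."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    return schedule(base_lr, epoch - warmup_epochs)

for epoch in (0, 4, 5, 15, 25):
    print(epoch, with_warmup(step_decay, 0.1, epoch))
```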
Model Evaluation
[Amazon SageMaker components useful in this category] * Amazon SageMaker Clarify can detect bias and provide explainability (interpretation of predictions) during model evaluation. * Amazon SageMaker Experiments provides tools for tracking and managing machine learning experiments, automatically recording experiment results such as training runs, parameters, and metrics, and enabling comparative analysis.
Baseline Evaluation
Metric
Description
Rule-based Baseline
Performance of a prediction model based on simple rules. Used as a baseline for improvement.
Random Baseline
Performance when making random predictions. Used as a minimum performance standard.
Industry Standard Baseline
Performance standards generally accepted in the industry. Used as a benchmark for competitive comparison.
Classification Tasks
Metric
Description
Accuracy
The proportion of correct predictions among all predictions. Effective when classes are balanced. Caution is needed with imbalanced data. Often used in cases where the number of data points is similar across classes, such as image classification and document classification. Examples: handwritten character recognition, general object recognition tasks, etc.
Precision
The proportion of true positives among positive predictions. Important when the cost of false positives is high, as in spam filters. Examples: spam email detection, fraudulent transaction detection, quality control inspection, etc., where incorrectly classifying normal items as abnormal can cause significant problems.
Recall
The proportion of actual positives that are correctly predicted as positive. Important when the cost of false negatives is high, as in disease diagnosis. Examples: cancer screening, security systems, earthquake prediction, etc., where missed detections can lead to serious consequences.
F1 Score
The harmonic mean of precision and recall. A balanced metric used as a single summary when both precision and recall are important. Examples: information retrieval systems, product recommendation, document classification, etc., where both accuracy and comprehensiveness are required.
ROC Curve
A graph plotting true positive rate vs false positive rate for each classification threshold. Performance is evaluated by AUC (Area Under the Curve). Used when a comprehensive evaluation of model performance is desired or when determining the optimal classification threshold is necessary. Examples: credit scoring, medical diagnostic systems, risk assessment models, etc., where threshold adjustment is important.
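All of the classification metrics above derive from the four confusion-matrix counts, and AUC can be read as the probability that a random positive is scored above a random negative. The sketch below assumes binary labels encoded as 0 and 1; the function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}

def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(classification_metrics(y_true, y_pred))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```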
Regression Tasks
Metric
Description
MSE (Mean Squared Error)
The average of the squared differences between predicted and actual values. As it squares the errors, it is strongly affected by outliers. Commonly used in general regression problems, especially when emphasizing outliers or when larger errors need to be penalized more severely.
RMSE (Root Mean Squared Error)
The square root of the mean squared error. Evaluates the magnitude of prediction errors in the original unit. Strongly affected by outliers. Suitable for tasks like housing price prediction or sales forecasting where interpreting the predicted values in the original scale is desired. Taking the square root of MSE allows for more intuitive interpretation.
MAE (Mean Absolute Error)
The average of the absolute differences between predicted and actual values. Less sensitive to outliers than MSE or RMSE. Suitable for demand forecasting or inventory management where you want to assess the average magnitude of errors while suppressing the impact of outliers. Each error contributes in proportion to its size, so large errors are not over-weighted, and the metric is easy to understand intuitively.
MAPE (Mean Absolute Percentage Error)
Evaluates relative errors as the average of absolute errors divided by the actual values. Allows comparison between data of different scales. Suitable for sales forecasting or stock price prediction where actual values are greater than 0 and you want to assess the relative magnitude of errors; it is undefined when an actual value is 0. Useful for comparing prediction accuracy across companies or products of different sizes.
R² (Coefficient of Determination)
Expresses the goodness of fit of a model, typically on a scale from 0 to 1, with values closer to 1 indicating a better fit. Can be negative when the model fits worse than always predicting the mean. Used to evaluate the overall explanatory power of a model. Particularly used as an indicator for variable selection in multiple regression analysis and is useful for model comparison and selection.
Adjusted R²
A modified version of R². It corrects for the influence of the number of explanatory variables, allowing for more accurate model evaluation. Suitable for variable selection and model comparison, especially when comparing models with different numbers of explanatory variables. Used to prevent overfitting.
RMSLE (Root Mean Squared Logarithmic Error)
The RMSE after taking the logarithm of predicted and actual values. Evaluates relative errors and mitigates the impact of large values. Suitable for sales forecasting or population prediction where the range of data values is wide and relative errors are emphasized. Particularly useful when predicted values have vastly different scales.
MSLE (Mean Squared Logarithmic Error)
The MSE after taking the logarithm of predicted and actual values. Evaluates relative errors and mitigates the impact of large values. Used for similar purposes as RMSLE, particularly suitable for cases where predicted values increase exponentially or for predicting ratios.
MedAE (Median Absolute Error)
The median of absolute errors. One of the evaluation metrics least affected by outliers. Particularly useful for prediction tasks with noisy datasets or when outliers are present. Suitable for analyzing sensor data or evaluating actual measurement data that may contain anomalies.
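The regression metrics above can all be computed from the list of errors. This is a minimal sketch assuming strictly positive actual values (needed for MAPE) and non-negative values (needed for the logarithmic errors); the function names are illustrative.

```python
import math
import statistics

def regression_metrics(y_true, y_pred):
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    msle = sum((math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "medae": statistics.median(abs(e) for e in errors),
        "mape": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "msle": msle,
        "rmsle": math.sqrt(msle),
        "r2": 1 - ss_res / ss_tot,
    }

def adjusted_r2(r2, n_samples, n_features):
    """Penalize R² for the number of explanatory variables."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

m = regression_metrics([2, 4, 6, 8], [3, 4, 5, 10])
print(m)
print(adjusted_r2(m["r2"], n_samples=4, n_features=1))
```

Predicting `[3, 2, 1]` for actual values `[1, 2, 3]` gives an R² of -3, an example of a model that does worse than always predicting the mean.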
Text Generation
Metric
Description
ROUGE
Evaluates the similarity between generated text and reference text. Calculates n-gram matching. Commonly used for summarization tasks. Particularly effective for evaluating the performance of news article summarization and document summarization systems.
Human Evaluation
Subjectively assesses quality, coherence, relevance, etc. Sets qualitative evaluation criteria and judges with multiple evaluators. Used when subtle nuances and contextual understanding that cannot be fully captured by automatic evaluation metrics are required, or when evaluating creative text generation.
Evaluation Metrics for Generative AI Models
Metric
Description
BLEU
A metric used for evaluating machine translation. Calculates n-gram matching between generated and reference sentences. Particularly effective for quality assessment of multilingual translation systems and comparison of different translation models.
METEOR
An evaluation metric for translation and generated text. Allows for flexible evaluation considering synonyms and morphological variations. Used in translation tasks where there are significant differences in grammatical structures between languages or when considering diversity of expressions.
BERTScore
A metric that evaluates semantic similarity of sentences using BERT's contextual word embeddings. Used when semantic similarity assessment is important beyond surface-level matching, or when evaluating paraphrasing.
Perplexity
A metric for evaluating the predictive performance of language models. Lower values indicate better models. Used for evaluating the learning process of language models and comparing language models with different architectures.
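Two ideas behind the metrics above are easy to show in code: perplexity is the exponential of the average negative log-probability the model assigned to each token, and BLEU is built on clipped n-gram precision. The sketch below covers only these building blocks (full BLEU also combines several n-gram orders and a brevity penalty); function names are illustrative.

```python
import math
from collections import Counter

def perplexity(token_probs):
    """exp of the mean negative log-probability assigned to the observed tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

def clipped_ngram_precision(candidate, reference, n=1):
    """Share of candidate n-grams found in the reference, with each n-gram
    credited at most as often as it occurs in the reference."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    total = sum(cand.values())
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total if total else 0.0

print(perplexity([0.25, 0.25, 0.25, 0.25]))
print(clipped_ngram_precision("the the the the".split(), "the cat is on the mat".split()))
```

A model that is always choosing uniformly among 4 tokens has perplexity 4, which is why lower values indicate better models.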
Explainability Evaluation
Method
Description
LIME
A method providing local explainability. Generates interpretable explanations for individual predictions. Used in cases where explanation of individual decision bases is important, such as medical diagnostics or financial credit assessments.
SHAP
A feature importance calculation method based on game theory. Evaluates the contribution of each feature to predictions. Used when there is a need to understand the decision-making process of complex models or when ranking feature importance.
Attention Visualization
Visualization of the attention mechanism in Transformer models. Visually represents the basis of model decisions. Used to confirm the areas of focus in natural language processing tasks or when analyzing model behavior.
Feature Attribution
A method to quantify the contribution of each feature to prediction results. Analyzes the decision process of models. Used for evaluating model fairness or detecting bias when necessary.
Correlation Analysis
Method
Description
Pearson Correlation
Measures the strength of linear relationships on a scale from -1 to 1. Used for evaluating relationships between continuous variables. Applied when analyzing variables expected to have a linear relationship, such as height and weight.
Spearman Correlation
Evaluates the strength of ordinal relationships. Applicable to non-linear relationships. Suitable for ordinal variables. Used when analyzing monotonic but not necessarily linear relationships, such as between customer satisfaction and purchase amount.
Chi-square Test
Statistically tests the association between categorical variables. Used for verifying independence. Applied when analyzing relationships between categorical data, such as the association between gender and product selection.
Phi Coefficient
Measures correlation between binary variables. Applied to 2x2 contingency table data. Used when measuring the strength of relationships between binary data, such as the association between pass/fail and male/female.
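The difference between Pearson and Spearman correlation shows up on data that is monotonic but not linear. In this minimal sketch, Spearman correlation is computed as the Pearson correlation of the ranks; the function names are illustrative.

```python
import math

def pearson(xs, ys):
    """Linear correlation coefficient between -1 and 1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(xs, ys):
    """Rank correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear
print(pearson(xs, ys), spearman(xs, ys))
```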
Model Challenges and Phenomena
Term
Description
Overfitting
A state where the model fits the training data excessively, reducing generalization performance on new data. Also called overlearning. Balancing model complexity against the amount of training data is important.
Underfitting
A state where the model fails to capture patterns in the training data sufficiently. Caused by lack of model expressiveness or insufficient learning. Requires more complex models or additional learning.
Bias
Systematic error between model predictions and true values. Increases when model expressiveness is insufficient, causing underfitting.
Variance
Variability in model predictions. The degree to which predictions fluctuate in response to small changes in the training data. If too high, it can cause overfitting.
Hallucination
When AI models generate incorrect information not based on facts. Particularly problematic in generative AI; can be mitigated by techniques such as Retrieval-Augmented Generation (RAG).
Drift
Changes in data distribution or model performance over time. Includes concept drift (changes in target variable relationships) and data drift (changes in input distribution).
Dealing with Data Imbalance
Method
Description
Oversampling
A method to increase data of minority classes. Includes synthetic data generation like SMOTE. Improves class balance. Effective when overall data quantity is small and you want to maximize use of information, or when there are ample computational resources.
Undersampling
A method to balance by reducing data from majority classes. Risk of information loss but computationally efficient. Effective when data quantity is sufficient and there are strict constraints on computation time or memory.
SMOTE
A method to generate synthetic data for minority classes. Uses k-nearest neighbors to generate new samples. Ensures data diversity. Effective when simple duplication risks overfitting or when you want to learn more diverse features of minority classes.
Class Weighting
Adjusts balance by weighting minority classes during learning. Modifies the model's loss function to balance. Effective when you want to maintain the original data distribution or avoid data modification.
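Class weighting and oversampling from the table above can be sketched as follows. The weight formula n / (k * count) is one common "balanced" heuristic and is an assumption of this example, as are the function names. The oversampler duplicates existing minority rows at random, which is the simple duplication that SMOTE is designed to improve on.

```python
import random
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}

def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

labels = [0] * 8 + [1] * 2
rows = list(range(10))
print(balanced_class_weights(labels))
print(Counter(random_oversample(rows, labels)[1]))
```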
Bias and Fairness Evaluation
Metric
Description
Demographic Parity
Evaluates the uniformity of prediction result distribution across different demographic groups.
Equal Opportunity
An indicator that confirms true positive rates are equal across protected attributes.
Predictive Parity
Evaluates whether precision (the proportion of positive predictions that are correct) is consistent between different groups.
Individual Fairness
Evaluates the consistency of predictions for individuals with similar characteristics.
Bias Amplification
Measures the degree to which the model amplifies existing biases in the data.
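Demographic Parity and Equal Opportunity both compare a rate between groups: the first compares how often each group receives a positive prediction, the second compares true positive rates. The sketch below reports the absolute difference between two groups, where 0 means parity; the function names and toy data are illustrative assumptions.

```python
def _rate(values):
    return sum(values) / len(values) if values else 0.0

def selection_rate(y_pred, groups, group):
    """Share of positive predictions within one group."""
    return _rate([p for p, g in zip(y_pred, groups) if g == group])

def true_positive_rate(y_true, y_pred, groups, group):
    """Share of actual positives in one group that were predicted positive."""
    return _rate([p for t, p, g in zip(y_true, y_pred, groups) if g == group and t == 1])

def demographic_parity_difference(y_pred, groups, a, b):
    return abs(selection_rate(y_pred, groups, a) - selection_rate(y_pred, groups, b))

def equal_opportunity_difference(y_true, y_pred, groups, a, b):
    return abs(true_positive_rate(y_true, y_pred, groups, a)
               - true_positive_rate(y_true, y_pred, groups, b))

groups = ["a"] * 4 + ["b"] * 4
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(demographic_parity_difference(y_pred, groups, "a", "b"))
print(equal_opportunity_difference(y_true, y_pred, groups, "a", "b"))
```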
Performance Stability Evaluation
Metric
Description
Prediction Variance
Evaluates the variability of model predictions. Used as an indicator of stability.
Threshold Stability
Evaluates the robustness of performance to changes in classification thresholds.
Cross-validation Standard Deviation
Evaluates the variability of performance across different data splits.
Noise Resistance
Evaluates the stability of predictions against input noise.
Temporal Stability
Evaluates the consistency of prediction performance in time series data.
Reliability and Robustness Evaluation
Metric
Description
Adversarial Attack Resistance
Evaluates the model's robustness against adversarial samples. Identifies security vulnerabilities.
Model Uncertainty
Quantification of prediction confidence and uncertainty. Evaluation using Bayesian methods or ensemble methods.
Stress Test
Evaluation of model behavior in extreme cases or boundary conditions. Understanding system limitations.
Data Quality Sensitivity
Evaluation of model sensitivity to deterioration in input data quality. Used as an indicator of robustness.
Fail-safe Property
Evaluation of safety in case of model abnormal operation. Confirmation of fallback mechanism effectiveness.
Cost Efficiency Evaluation
Metric
Description
Computational Cost
Evaluation of computational resources required for model training and inference. GPU time, memory usage, etc.
Infrastructure Cost
Evaluation of infrastructure costs required for model operation. Storage, network, etc.
Maintenance Cost
Evaluation of human resources and time required for model maintenance and updates.
ROI Analysis
Evaluation of return on investment from model deployment. Quantification of cost reduction or revenue increase.
Model Deployment
[Amazon SageMaker components useful in this category] * Amazon SageMaker Model Registry catalogs and version-manages ML models, managing model metadata.
Deployment Strategies
Term
Description
Canary Deployment
A technique that applies a new version of the model to only a portion of the traffic and gradually expands. Allows validation of new models while minimizing risk.
Blue/Green Deployment
A deployment method that prepares production (blue) and new (green) environments in parallel and switches between them. Enables immediate rollback.
Shadow Deployment
A method that mirrors production traffic to a new model for parallel evaluation. Enables performance verification under actual workloads.
A/B Testing
A method to operate multiple versions of models simultaneously and compare their performance. Enables data-driven decision making.
Rolling Update
A gradual deployment method that updates instances sequentially. Minimizes service interruption.
Rollback Plan
Setting of recovery procedures and trigger conditions in case of problems. Ensures consistency of data and models.
Inference Options
Term
Description
Real-time Inference
A method that executes inference in real-time for requests. Suitable for use cases requiring low latency. In Amazon SageMaker, it's provided as persistent, fully managed endpoints that can handle payloads up to 6MB and processing times up to 60 seconds. A scalable solution capable of handling continuous traffic.
Batch Inference
A method that executes inference in bulk for large amounts of data. Suitable for periodic prediction processing. In Amazon SageMaker, it's provided as batch transform, capable of processing large-scale datasets of several GB. Optimal for offline processing or preprocessing that doesn't require persistent endpoints.
Asynchronous Inference
A method that queues and processes inference requests requiring large payloads or long processing times. In Amazon SageMaker, it supports payloads up to 1GB and processing times up to 1 hour. Can scale down to 0 when there's no traffic.
Serverless Inference
An event-driven inference execution method. Automatically scales according to demand. In Amazon SageMaker, it provides a model that requires no infrastructure management and charges only for usage, for intermittent or unpredictable traffic. Supports payloads up to 4MB and processing times up to 60 seconds.
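The payload and processing-time limits listed above suggest a simple decision order. The helper below is a hypothetical sketch, not an AWS API: its name, parameters, and ordering are assumptions, and the limits are copied from the descriptions above (with 1 GB treated as 1024 MB), so they should be checked against current AWS documentation before use.

```python
def suggest_inference_option(payload_mb, processing_seconds,
                             intermittent_traffic=False, offline_bulk=False):
    """Suggest a SageMaker inference option from the limits in the table above."""
    if offline_bulk:
        return "batch"          # whole datasets, no persistent endpoint needed
    fits_sync_time = processing_seconds <= 60
    if intermittent_traffic and payload_mb <= 4 and fits_sync_time:
        return "serverless"     # up to 4 MB and 60 seconds, pay per use
    if payload_mb <= 6 and fits_sync_time:
        return "real-time"      # up to 6 MB and 60 seconds, persistent endpoint
    if payload_mb <= 1024 and processing_seconds <= 3600:
        return "asynchronous"   # up to 1 GB and 1 hour, queued
    return "batch"

print(suggest_inference_option(1, 0.2))
print(suggest_inference_option(500, 600))
```

Cost, latency targets, and cold-start tolerance also matter in practice; this sketch only encodes the size and time limits.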
Endpoint Options
Term
Description
Single Model Endpoint
A basic endpoint configuration for deploying a single model. Simple and easy to manage.
Multi-Model Endpoint
An endpoint configuration that serves multiple models of the same framework in a single container. In Amazon SageMaker, it improves endpoint utilization and reduces deployment overhead, realizing cost optimization.
Multi-Container Endpoint
An endpoint configuration that serves multiple models of different frameworks in separate containers. In Amazon SageMaker, it allows flexible deployment of various frameworks and models.
Serial Inference Pipeline
An endpoint configuration that executes preprocessing, inference, and post-processing as a series of pipelines. In Amazon SageMaker, all containers are hosted on the same EC2 instance and fully managed, achieving low latency.
Scalable Endpoint
An endpoint configuration that automatically scales according to load. Flexibly responds to traffic fluctuations.
High Availability Endpoint
An endpoint configuration deployed across multiple availability zones, ensuring redundancy.
Infrastructure
Term
Description
Model Containerization
Packaging of models using container technologies like Docker. Ensures consistency and portability of environments.
Scaling Strategy
Configuration of automatic scaling according to load, including the choice between horizontal and vertical scaling and the policies that govern it.
Service Mesh
Management of traffic control and inter-service communication in microservice architectures.
Deployment Pipeline
Automation and standardization of deployments. Construction of CI/CD pipelines and setting of quality gates.
Optimization and Security
Term
Description
Model Optimization
Optimization of a model before deployment. Reduces model size and speeds up inference through quantization, pruning, distillation, etc.
Security Settings
Setting of access control, encryption, authentication and authorization. Configuration of secure endpoints.
API Versioning
Version management of model APIs and ensuring compatibility between versions.
Monitoring Settings
Configuration of metric collection, logging, alert settings. Establishment of performance and quality monitoring system.
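To make the quantization mentioned under Model Optimization concrete, here is a minimal sketch of symmetric 8-bit quantization of a list of weights in plain Python. It is a toy illustration under simple assumptions (one scale per tensor, no calibration data), not what any particular AWS service or framework does internally. The function names are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the quantized integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored weight differs from the original by at most about half the scale factor, which is the accuracy traded for storing 8-bit integers instead of 32-bit floats.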
Inference
Prompting Techniques
Technique
Description
Prompt Engineering
A technique to obtain desired outputs by crafting inputs (prompts) to AI models. Optimizes methods of setting context and presenting constraints.
Zero-shot Prompting
One of the prompt engineering techniques. Executes tasks directly without examples. Utilizes the model's generalization ability to handle new tasks.
Few-shot Prompting
One of the prompt engineering techniques. Teaches how to execute tasks by showing a few examples. Controls model behavior through concrete examples.
Chain-of-Thought Prompting
One of the prompt engineering techniques. Guides the model to solve complex problems step by step (Chain of Thought). Encourages the model to spell out its reasoning before giving the final answer.
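The difference between the three prompting styles above is easiest to see in the prompt text itself. The sketch below builds each kind of prompt as a plain string. The function names and the "Input:/Output:" layout are assumptions made for this example, not a format required by any model.

```python
def zero_shot(task: str, text: str) -> str:
    """Task instruction only, no examples."""
    return f"{task}\n\nInput: {text}\nOutput:"


def few_shot(task: str, examples: list, text: str) -> str:
    """Task instruction plus a few worked (input, output) examples."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{shots}\n\nInput: {text}\nOutput:"


def chain_of_thought(task: str, text: str) -> str:
    """Ask the model to reason step by step before answering."""
    return (f"{task}\n\nInput: {text}\n"
            "Think step by step, then state the final answer.\nReasoning:")


task = "Classify the sentiment of the review as positive or negative."
examples = [("Great battery life.", "positive"), ("Broke after a day.", "negative")]
prompt = few_shot(task, examples, "Fast delivery and works well.")
```

The resulting string can be sent to any text generation model. Only the prompt changes between the three techniques, not the model.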
Prompt Optimization Techniques
Technique
Description
Prompt Templates
Reusable prompts with a fixed structure and placeholders for the parts that vary. Helps keep outputs consistent.
Hallucination Countermeasures
Techniques to reduce output that is not grounded in facts, such as referencing a knowledge base and fact-checking responses.
Context Management
Effective setting and control of context information in prompts. Leads to more accurate responses.
Prompt Variation
Experimentation and optimization of different expression methods for the same intent. Improves robustness.
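A prompt template and a simple hallucination countermeasure can be combined in a few lines. The sketch below uses Python's standard `string.Template` to build a prompt that restricts the model to supplied context. The template wording and the names `GROUNDED_QA` and `render_prompt` are hypothetical.

```python
from string import Template

# Reusable template: the instruction stays fixed, the placeholders vary.
GROUNDED_QA = Template(
    "Answer using only the context below. "
    "If the context does not contain the answer, reply \"I don't know\".\n\n"
    "Context:\n$context\n\n"
    "Question: $question\nAnswer:"
)


def render_prompt(context_passages: list, question: str) -> str:
    """Fill the template with retrieved passages and the user question."""
    context = "\n".join(f"- {passage}" for passage in context_passages)
    return GROUNDED_QA.substitute(context=context, question=question)


prompt = render_prompt(
    ["Serverless Inference supports payloads up to 4 MB."],
    "What is the payload limit for Serverless Inference?",
)
```

Keeping the instruction in one template means every request carries the same constraint, and the passages would typically come from a knowledge base lookup.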
Monitoring
[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Monitor continuously monitors model quality in production environments and detects data drift and bias.
* Amazon SageMaker Clarify can also be used in model quality monitoring, continuously evaluating model bias and explainability.
Model Quality Monitoring
Term
Description
Data Drift Detection
Monitors changes in input data distribution. Used as an indicator to determine timing for model retraining.
Concept Drift Detection
Monitors changes in the relationship between inputs and outputs. Used as an indicator to determine the need for model updates.
Prediction Quality Monitoring
Continuously evaluates the quality of model prediction results. Includes monitoring of bias and fairness.
Explainability Monitoring
Monitors metrics related to model explainability. Ensures transparency and reliability of predictions.
Input Data Validation
Continuously checks that input data conforms to the expected schema, types, ranges, etc.
Data Completeness Monitoring
Monitors data quality indicators such as missing values, outliers, duplicates.
Feature Stability Monitoring
Tracks changes in statistical properties of features. Detects distribution shifts.
Data Source Monitoring
Monitors availability, freshness, and consistency of data sources.
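One common way to quantify the distribution shift described under Data Drift Detection and Feature Stability Monitoring is the Population Stability Index (PSI). The sketch below is a plain-Python illustration of that statistic. It is swapped in here as an example and is not claimed to be the method used by SageMaker Model Monitor. The bin count and the small floor value for empty bins are assumptions.

```python
import math


def population_stability_index(baseline, current, bins=10):
    """PSI between a baseline sample and a current sample of one feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clip out-of-range values
        floor = 1e-6  # keeps empty bins from producing log(0)
        return [max(c / len(values), floor) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [i / 100 for i in range(100)]        # training-time feature values
shifted = [0.5 + i / 200 for i in range(100)]   # production values, shifted up
drift_score = population_stability_index(baseline, shifted)
```

A score of zero means the two binned distributions are identical, and larger values indicate a larger shift. The threshold at which to trigger retraining is a project-specific choice.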
Performance Monitoring
Term
Description
Latency Monitoring
Monitors inference processing time. Used to check SLA compliance and to guide performance optimization.
Throughput Monitoring
Monitors the number of requests processed per unit time. Used for capacity planning.
Resource Utilization Monitoring
Monitors infrastructure metrics such as CPU, memory, disk usage. Used for scaling decisions.
Error Rate Monitoring
Monitors the occurrence rate of inference errors and system errors. Used for maintaining service quality.
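The latency, throughput, and error-rate metrics above can all be derived from a window of request records. The sketch below assumes a hypothetical record shape (a dict with `latency_ms` and `status`) and uses the nearest-rank method for percentiles. In production these numbers would normally come from a monitoring service rather than hand-rolled code.

```python
import math


def percentile(values, pct: int):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests, window_seconds: float) -> dict:
    """Summarize one monitoring window of request records."""
    latencies = [r["latency_ms"] for r in requests]
    errors = sum(1 for r in requests if r["status"] >= 500)
    return {
        "p50_ms": percentile(latencies, 50),
        "p99_ms": percentile(latencies, 99),
        "throughput_rps": len(requests) / window_seconds,
        "error_rate": errors / len(requests),
    }


# 99 fast successful requests plus one slow server error in a 10-second window.
window = [{"latency_ms": ms, "status": 200} for ms in range(1, 100)]
window.append({"latency_ms": 1000, "status": 500})
summary = summarize(window, window_seconds=10)
```

Tail percentiles such as p99 are usually more useful than the mean for SLA checks, because a few slow requests can hide behind a good average.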
Security Monitoring
Term
Description
Access Monitoring
Monitors access patterns and authentication status for API endpoints. Used to detect unauthorized access.
Data Security Monitoring
Monitors data encryption status, access control, and privacy protection status.
Compliance Monitoring
Continuously monitors compliance with regulatory requirements. Maintenance of audit trails.
Operational Monitoring
Term
Description
Alert Settings
A mechanism to notify when important metrics exceed thresholds. Enables early response.
Log Analysis
Analysis of system logs and application logs. Used for identifying causes of failures and trend analysis.
Incident Tracking
Tracking of occurrence history and response status of failures and abnormalities. Used for formulating recurrence prevention measures.
Capacity Management
Prediction and planning of resource usage. Formulation of appropriate scaling strategies.
Business Impact Monitoring
Term
Description
ROI Analysis
Continuous evaluation of costs and effects of model operation. Measurement of return on investment.
Business Metrics
Monitoring of indicators showing the business contribution of the model. Measurement of effects such as sales and cost reduction.
User Satisfaction
Tracking of feedback and service evaluations from end users.
System Health Monitoring
Term
Description
Infrastructure Availability Monitoring
Operational status and health checks of system components.
Network Monitoring
Monitoring of network connectivity, latency, bandwidth.
Cache Efficiency Monitoring
Tracking of cache hit rates, memory usage efficiency.
Batch Processing Monitoring
Monitoring of batch job execution status, success rates, processing times.
Fairness Monitoring
Term
Description
Bias Metrics
Monitoring of demographic biases in model predictions. Tracking of fairness indicators.
Attribute-based Monitoring
Monitoring of prediction biases based on protected attributes. Detection of discriminatory results.
Fairness Score
Evaluation of prediction accuracy uniformity across different groups. Quantitative measurement of fairness.
Impact Analysis
Analysis of the impact of model predictions on different populations. Evaluation of social impact.
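As one concrete example of a bias metric, the sketch below computes the positive-prediction rate per group and the gap between the highest and lowest rate, often called the demographic parity difference. It is only one of many fairness metrics and is shown as an illustration. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def selection_rates(predictions, groups) -> dict:
    """Share of positive (1) predictions for each group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups) -> float:
    """Gap between the most and least favored group; 0 means equal rates."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_difference(preds, groups)  # 0.75 - 0.25 = 0.5
```

Tracking this gap over time for each protected attribute is one simple way to implement the attribute-based monitoring described above.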
Governance Monitoring
Term
Description
Policy Compliance
Monitoring of compliance with the organization's AI governance policies. Confirmation of guideline adherence.
Accountability Tracking
Monitoring of transparency and explainability in the model's decision-making process. Ensuring accountability.
Ethical Risk Monitoring
Continuous assessment of AI's ethical impacts and potential risks. Fulfillment of social responsibility.
Regulatory Compliance Tracking
Monitoring of compliance status with new regulatory requirements. Maintenance of compliance.
MLOps Management Process
[Amazon SageMaker components useful in this category]
* Amazon SageMaker Studio provides an integrated development environment (IDE) that enables one-stop execution from notebook creation to model development, training, and deployment, centralizing management of ML workflows.
* Amazon SageMaker Canvas provides a no-code ML development environment that covers data preparation through model deployment via drag & drop, so business analysts can build models without writing code.
* Amazon SageMaker Pipelines orchestrates ML workflows and builds reproducible ML pipelines.
Experiment Management
[Amazon SageMaker components useful in this category] * Amazon SageMaker Experiments provides tools for tracking and managing machine learning experiments, automatically recording experiment results such as training runs, parameters, metrics, and enabling comparative analysis.
Term
Description
Experiment Tracking
Activity of recording and managing settings, parameters, and results of each experiment in model development. Uses tools like MLflow, SageMaker Experiments.
Metadata Management
Management of associated information such as settings, environment, datasets, results related to experiments. Ensures reproducibility and traceability of experiments.
Hyperparameter Logging
History management of model hyperparameter settings. Used for tracking and comparative analysis of optimization processes.
Evaluation Metrics Tracking
Time-series recording and analysis of model performance indicators. Used for understanding improvement trends and comparative evaluation.
Artifact Management
Storage and management of artifacts such as models, checkpoints, plots. Streamlines storage and sharing of experiment results.
A/B Test Management
Design and result management of comparative experiments of multiple models. Supports statistical significance evaluation and decision making.
Experiment Environment Management
Management of development environment configurations, dependencies, resource settings, etc. Maintains reproducibility and consistency of environments.
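The core idea of experiment tracking and hyperparameter logging is simply to record parameters and metrics for every run so that runs can be compared later. The sketch below is a minimal in-memory illustration with hypothetical class names. It does not use the SageMaker Experiments or MLflow APIs, which provide persistent and far richer versions of the same idea.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    name: str
    params: dict
    metrics: dict = field(default_factory=dict)


class ExperimentLog:
    """Hypothetical in-memory log of training runs."""

    def __init__(self):
        self.runs = []

    def log_run(self, name: str, params: dict, metrics: dict) -> Run:
        # Copy the dicts so later mutation by the caller cannot alter history.
        run = Run(name, dict(params), dict(metrics))
        self.runs.append(run)
        return run

    def best_run(self, metric: str, higher_is_better: bool = True) -> Run:
        candidates = [r for r in self.runs if metric in r.metrics]
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r.metrics[metric])


log = ExperimentLog()
log.log_run("run-1", {"learning_rate": 0.1}, {"accuracy": 0.88, "loss": 0.41})
log.log_run("run-2", {"learning_rate": 0.01}, {"accuracy": 0.92, "loss": 0.30})
best = log.best_run("accuracy")  # run-2
```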
Version Control
[Amazon SageMaker components useful in this category] * Amazon SageMaker Model Registry catalogs and version-manages ML models, managing model metadata.
Term
Description
Data Versioning
Change history management of training datasets. Uses tools like DVC (Data Version Control). Enables tracking of data lineage.
Model Versioning
Management of different versions of trained models. Implemented with tools like SageMaker Model Registry. Controls switching and rollback in production environments.
Code Version Control
Version management of model development code. Uses Git, etc. Setting of branch strategies and merge policies.
Configuration File Management
Version management of configuration files such as environment settings, parameter settings. Ensures consistency across environments.
Dependency Management
Version management of libraries and frameworks. Versions are pinned explicitly in files such as requirements.txt or a Dockerfile.
Tagging
Assigning meaningful tags to versions of models, data, code. Facilitates release management and tracking.
Baseline Management
Management of model versions that serve as benchmarks for performance comparison. Used as indicators for quality assurance.
Documentation
[Amazon SageMaker components useful in this category] * Amazon SageMaker Model Cards creates and manages model documentation, centralizing management of detailed model information.
Term
Description
Model Card
A standardized document recording detailed information about the model. Includes usage, performance, limitations, ethical considerations, etc. Can be managed with SageMaker Model Cards.
Data Sheet
A document recording characteristics of datasets, collection methods, preprocessing procedures, license information, etc. Ensures transparency and reusability of data.
API Specification
Describes model interface specifications, input/output formats, endpoint information, etc. Managed in standard formats like OpenAPI/Swagger.
Experiment Report
A document summarizing the purpose, method, results, and discussion of experiments. Records important findings and decisions.
Operation Manual
A manual describing procedures for model deployment, monitoring, and maintenance. Includes incident response procedures.
Training Record
Detailed record of model training. Includes data preparation, parameter settings, the training process, and a summary of results.
Change History
Records important changes to models, data, code. Documents reasons for updates and scope of impact.
Risk Assessment Document
A document evaluating potential risks, biases, ethical considerations of the model. Also used as evidence for regulatory compliance.
Quality Assurance Document
Records test results, performance evaluations, validation procedures. Demonstrates compliance with quality standards.
Compliance Document
A document demonstrating compliance with regulatory requirements. Records how requirements such as GDPR and AI governance policies are addressed.
Architecture Diagram
Visual representation of system configuration, data flow, relationships between components.
Troubleshooting Guide
A guide listing common problems and their solution procedures. Contributes to improving operational efficiency.
Model Lineage Diagram
Visual representation of model development process, derivative relationships, important changes. Clarifies relationships between versions.
Performance Benchmark
Performance comparison results between different model versions. Used as evidence of improvement.
Deployment Plan
A plan describing model deployment strategy, schedule, risk countermeasures.
Orchestration Management
Term
Description
Pipeline Management
Automation and control of ML workflows. Manages the series of flows from data processing to inference.
Workflow Definition
Definition of each step in the ML process and its dependencies. DAG-based control flow design.
Automation Triggers
Setting of conditions and schedules for pipeline execution. Control of event-driven processing.
Error Handling
Implementation of anomaly detection and recovery mechanisms. Definition of fallback strategies.
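A DAG-based workflow definition with basic error handling can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The step names and the `run_pipeline` helper are hypothetical. A managed service such as SageMaker Pipelines would add retries, caching, and distributed execution on top of the same idea.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
workflow = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "register": {"evaluate"},
}


def run_pipeline(dag: dict, actions: dict):
    """Run steps in dependency order; stop at the first failing step."""
    executed = []
    for step in TopologicalSorter(dag).static_order():
        try:
            actions[step]()
            executed.append(step)
        except Exception as exc:
            # Fallback strategy for this sketch: halt and report the failure.
            return executed, f"{step} failed: {exc}"
    return executed, None


actions = {step: (lambda: None) for step in workflow}
executed, error = run_pipeline(workflow, actions)
```

With the no-op actions above, all four steps run in order and `error` is `None`. If the action for "evaluate" raised an exception, only "preprocess" and "train" would be reported as executed.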
Quality Management
Term
Description
Quality Gates
Quality checkpoints before deployment. Verification of performance, security, compliance.
Test Automation
Automated execution system for unit tests, integration tests, performance tests. Continuous quality assurance.
Quality Metrics
Definition and monitoring of quality indicators for models and systems. Tracking of SLO/SLA compliance status.
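A quality gate is, at its simplest, a comparison of measured metrics against agreed thresholds before a deployment proceeds. The sketch below illustrates this. The metric names and threshold values are hypothetical examples.

```python
def check_quality_gate(metrics: dict, minimums: dict, maximums: dict):
    """Return (passed, failures) for a candidate model's metrics."""
    failures = []
    for name, floor in minimums.items():
        value = metrics.get(name)
        if value is None or value < floor:
            failures.append(f"{name}={value} is below minimum {floor}")
    for name, ceiling in maximums.items():
        value = metrics.get(name)
        if value is None or value > ceiling:
            failures.append(f"{name}={value} is above maximum {ceiling}")
    return (not failures), failures


passed, failures = check_quality_gate(
    metrics={"accuracy": 0.91, "p99_latency_ms": 180},
    minimums={"accuracy": 0.90},
    maximums={"p99_latency_ms": 200},
)
```

A missing metric is treated as a failure, so a model cannot pass the gate simply because an evaluation step was skipped.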
Infrastructure Management
Term
Description
Resource Optimization
Efficient allocation and management of computational resources and storage. Cost optimization.
Scaling Management
Setting and monitoring of auto-scaling policies. Flexible resource adjustment according to demand.
Availability Management
Ensuring system redundancy and fault tolerance. Management of backup and disaster recovery plans.
Security Management
[Amazon SageMaker components useful in this category] * Amazon SageMaker Role Manager manages access permissions for ML activities, implementing security based on the principle of least privilege and providing appropriate access control.
Term
Description
Access Control
Role-based access management. Permission settings based on the principle of least privilege.
Data Protection
Encryption and anonymization of sensitive data. Implementation of privacy protection mechanisms.
Vulnerability Management
Detection and countermeasures for security vulnerabilities. Conducting regular security assessments.
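The principle of least privilege mentioned under Access Control can be illustrated with an IAM policy document that allows a caller to invoke one specific endpoint and nothing else. The sketch below builds such a policy as a Python dictionary. The region, account ID, and endpoint name are placeholder values, and the helper name `invoke_only_policy` is hypothetical.

```python
import json


def invoke_only_policy(region: str, account_id: str, endpoint_name: str) -> dict:
    """IAM policy document that permits invoking a single SageMaker endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:InvokeEndpoint",
                "Resource": (
                    f"arn:aws:sagemaker:{region}:{account_id}"
                    f":endpoint/{endpoint_name}"
                ),
            }
        ],
    }


# Placeholder values for illustration only.
policy = invoke_only_policy("us-east-1", "123456789012", "demo-endpoint")
policy_json = json.dumps(policy, indent=2)
```

Scoping the resource to one endpoint ARN, rather than using a wildcard, means an application role cannot call other endpoints or modify any SageMaker resources.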
Governance Management
Term
Description
Policy Management
Formulation and compliance management of AI governance policies. Setting of ethical guidelines.
Audit Response
Maintenance of audit trails and management of audit response processes. Preparation of compliance evidence.
Risk Management
Identification, assessment, and implementation of mitigation measures for risks related to AI use. Continuous risk monitoring.
In this article, I compiled a "Glossary of AI and Machine Learning Terms Related to AWS" based on the knowledge I gained while studying for the newly added AWS certifications, AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate. It also draws on what I learned while co-authoring "Learning AWS Functions and History Through Quizzes: Selected 'Machine Learning' Edition" in "Compilation of Thin Books on AWS Vol.01", an individual publication for Japan's "Technical Book Fair 17".
I will continue to share ideas that are useful for learning and using AWS.
I also plan to update this article periodically to reflect changes in AI and machine learning on AWS.