hidekazu-konishi.com

AI and Machine Learning Glossary for AWS - Knowledge Gained While Studying for AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate

First Published: 2024-11-24
Last Updated: 2024-11-25

In this article, I have compiled what I learned while studying for the newly added AWS certifications, AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate, into a "Glossary of AI and Machine Learning Terms Related to AWS".

The knowledge in this "Glossary of AI and Machine Learning Terms Related to AWS" is also used in the questions and answers of "Learning AWS Functions and History Through Quizzes: Selected 'Machine Learning' Edition" in "Compilation of Thin Books on AWS Vol.01", which I co-authored as an individual publication for Japan's "Technical Book Fair 17".

I hope this will be helpful for those who are preparing to take the AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate exams.

AI/ML AWS Services

Amazon SageMaker

Service Name	Description
Amazon SageMaker	A fully managed service for efficiently building, training, and deploying machine learning models. An integrated platform that supports the entire ML lifecycle from development to production operation.
SageMaker Studio	One of SageMaker's components, a browser-based integrated development environment (IDE). Enables one-stop execution from notebook creation to model development, training, and deployment. Achieves centralized management of ML workflows.
SageMaker Canvas	One of SageMaker's components, a visual interface that allows building ML models through drag & drop without writing code. A no-code ML development environment for business analysts.
SageMaker Ground Truth	One of SageMaker's components, a data labeling service for creating high-quality training datasets. Improves efficiency of human labeling tasks and provides semi-automated labeling workflows.
SageMaker Data Wrangler	One of SageMaker's components, a tool for streamlining data preparation and preprocessing. Provides over 200 built-in transformation functions, enabling data cleansing to feature engineering via GUI.
SageMaker Feature Store	One of SageMaker's components, a repository for centrally managing and sharing features. Ensures consistency of features in online/offline scenarios and promotes reuse across teams.
SageMaker JumpStart	One of SageMaker's components, an ML hub providing pre-trained models and solutions. Deployable with one click and supports transfer learning and fine-tuning.
SageMaker Model Monitor	One of SageMaker's components, continuously monitors model quality in production environments. Detects data drift and bias, enabling early detection of model performance degradation.
SageMaker Clarify	One of SageMaker's components, evaluates bias detection and explainability of models. Ensures fairness and transparency, and analyzes the basis for model decisions.
SageMaker Debugger	One of SageMaker's components, for debugging and monitoring the training process. Enables visualization of metrics and setting of alerts. Supports optimization of learning.
SageMaker Pipelines	One of SageMaker's components, for orchestration of ML workflows. Builds reproducible ML pipelines and enables automated experiment management.
SageMaker Model Cards	One of SageMaker's components, for creating and managing model documentation. Centralizes management of detailed model information, ensuring governance and compliance.
SageMaker Role Manager	One of SageMaker's components, manages access permissions for ML activities. Implements security based on the principle of least privilege and provides appropriate access control.
SageMaker Experiments	One of SageMaker's components, a tool for tracking and managing machine learning experiments. Automatically records experiment results such as training runs, parameters, and metrics, enabling comparative analysis.
SageMaker Model Registry	One of SageMaker's components, a repository for cataloging and versioning ML models. Manages model metadata and approval status.

Amazon Bedrock

Service Name	Description
Amazon Bedrock	A secure, fully managed generative AI platform service. An integrated platform that allows access to multiple Foundation Models through a single API. Provides comprehensive features including Foundation Models (FMs) as the underlying large language models, Knowledge Bases for RAG construction, Agents for automation, Guardrails for harmful content filtering, and Prompt Flows for workflow execution.
Foundation Models (FMs)	Large pre-trained models, including large language models, that serve as the foundation for text generation, image generation, etc. The core AI component providing the foundational functionality in Bedrock.
Knowledge Bases	A Bedrock feature that provides information retrieval from external knowledge bases, enabling the construction of Retrieval Augmented Generation (RAG) architectures.
Agents	A Bedrock feature that orchestrates procedural instructions, custom action execution, and Knowledge base utilization in an integrated manner.
Guardrails	A Bedrock feature that detects and filters harmful content and hallucinations, controlling AI output.
Prompt Flows	A Bedrock feature that builds workflows linking prompt execution, S3 data input/output, and Lambda function execution.

Amazon Q

Service Name	Description
Amazon Q	A generative AI-powered assistant service specifically designed for businesses. It comes in Business and Developer editions, specializing in business productivity improvement and development support respectively. It can be integrated with various AWS services, including Amazon Q in Amazon QuickSight, Amazon Q in Amazon Connect, Amazon Q in AWS Chatbot, Amazon Q network troubleshooting, and Amazon Q Data integration in AWS Glue.
Amazon Q Business	A generative AI assistant designed to improve employee productivity. Supports automation and efficiency of general business tasks.
Amazon Q Developer	A generative AI assistant specialized in coding support for developers. Supports development tasks such as code generation, debugging, and optimization.

Natural Language Processing Services

Service Name	Description
Amazon Comprehend	A natural language processing service that performs sentiment analysis, personal information detection, key phrase extraction, etc. from text. Custom model creation is also possible.
Amazon Kendra	An advanced search service for enterprises. Provides context-aware search results for natural language queries. Easy integration with RAG.
Amazon Lex	A service for building interactive interfaces (chatbots). Provides natural language understanding and dialogue management functions. Supports both voice and text.
Amazon Textract	A service for extracting text and structured data from documents. Capable of handwriting recognition, form processing, and table analysis. Provides high-accuracy OCR functionality.
Amazon Translate	An automatic translation service between multiple languages. Provides real-time translation between 74 languages. Supports custom terminology dictionaries.
Amazon Transcribe	A speech-to-text (speech recognition) service. Capable of multiple speaker identification and customization of specialized terminology. Supports real-time transcription.
Amazon Polly	A text-to-speech (speech synthesis) service. Provides natural pronunciation and Neural text-to-speech. Supports multiple languages and voice types.
Amazon CodeWhisperer	An AI coding companion for programming assistance. Provides code completion and suggestions. Its functionality is now part of Amazon Q Developer.

Image and Video Processing Services

Service Name	Description
Amazon Rekognition	An image and video analysis service. Provides face recognition, object detection, text extraction, content moderation, celebrity recognition, etc. Supports both real-time analysis and batch processing.
Amazon Lookout for Vision	An anomaly detection service using industrial image analysis. Used for product defect detection in manufacturing lines, etc.

Other AI-Related Services

Service Name	Description
Amazon Personalize	A service that provides personalized recommendations. Enables product recommendations and related content suggestions based on user behavior data. Supports real-time recommendations.
Amazon Pinpoint	A customer engagement service. Provides ML-powered segmentation, user behavior analysis, and optimal delivery time prediction functions. Enables multi-channel communication through email, SMS, push notifications, etc.
Amazon Fraud Detector	A machine learning-based fraud detection service. Detects online fraudulent transactions, account takeovers, fake account creation, etc. Can be used in combination with custom rules and ML models.
Amazon Augmented AI (A2I)	A service that manages human review task execution. Enables building workflows for human review of machine learning prediction results.
Amazon Mechanical Turk (MTurk)	A crowdsourcing marketplace. Enables execution of tasks such as data labeling and content moderation by humans. Can be integrated with Amazon SageMaker Ground Truth and Amazon Augmented AI (A2I).
Amazon QuickSight	A BI (Business Intelligence) tool. Equipped with ML predictive analysis capabilities, enabling data visualization and analysis. Supports data analysis in natural language through Q function.

Data Storage and Database Solutions

Service	Description
Amazon S3	Scalable object storage. Optimal for building data lakes. High durability and availability.
Amazon EFS	Fully managed scalable file storage. Shareable across multiple instances. Supports NFS protocol.
Amazon FSx for Lustre	A high-performance file system capable of directly processing large-scale datasets. Integrates seamlessly with Amazon S3, automating data loading from and writing back to S3, and accelerates workloads with hundreds of GBps of parallel throughput.
Amazon DynamoDB	Fully managed NoSQL database. Enables fast read and write operations. Automatic scaling feature.
Amazon Redshift	Petabyte-scale data warehouse. Enables fast query processing. Columnar storage.
Amazon OpenSearch Service	An Elasticsearch-compatible search and analytics engine service. In addition to full-text search and real-time analytics, it provides vector database functionality supporting neural search and k-Nearest Neighbors (k-NN) vector search. Compatible with log analysis, application search, and security analytics, as well as AI applications such as recommendations and semantic search. Advanced search capabilities are enabled through integration with large language models using OpenSearch Neural Search functionality.
Amazon DocumentDB	MongoDB-compatible document database. Equipped with vector search functionality. Scalable document management.

Basic Concepts of Machine Learning

AI/ML Fundamental Concepts

Term	Description
Artificial Intelligence (AI)	Computer systems that exhibit human-like intelligent behavior. Possess capabilities such as learning, reasoning, and problem-solving. Includes specialized AI for specific tasks and general AI for broad intelligence.
Machine Learning (ML)	Algorithms or systems that learn patterns from data and perform tasks without explicit programming. Realized through a combination of statistical methods and algorithms.
Deep Learning	A machine learning technique using multi-layer neural networks. Demonstrates high performance in image recognition, natural language processing, etc. Requires large amounts of data and computational resources.
Feature	Individual variables or attributes used as input to a model. Meaningful information extracted from data.
Label	Correct answer data in supervised learning. Target values or classification categories that the model should predict.
Instance	Individual data points. Composed of a combination of features and labels.
Batch	A set of data processed simultaneously during model training. Affects memory efficiency and training speed.
Epoch	A unit representing one complete processing of all training data. Model is gradually improved through multiple epochs of learning.
Iteration	One update of model parameters. Often refers to processing per batch.
Parameters	Values optimized by the model during the learning process. Includes weights and biases.
Hyperparameters	Control parameters set before model learning. Includes learning rate and batch size.
Inductive Bias	Assumptions or hypotheses inherent in the model. Characterizes the nature of the learning algorithm.
Generalization Performance	The model's predictive ability on unseen data. Balancing overfitting and underfitting is important.
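
The relationship between batch, epoch, and iteration in the table above can be sketched with simple arithmetic. The following is a minimal Python illustration that is not tied to any framework; the helper names `iterations_per_epoch` and `batches`, and the toy dataset of 10 samples, are assumptions made for this example.

```python
import math

def iterations_per_epoch(n_samples, batch_size):
    """Number of parameter updates needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

def batches(data, batch_size):
    """Yield consecutive slices of the data, one batch at a time."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

data = list(range(10))   # hypothetical dataset of 10 samples
batch_size = 4
n_epochs = 3

total_updates = 0
for epoch in range(n_epochs):              # one epoch = one full pass over the data
    for batch in batches(data, batch_size):
        total_updates += 1                 # one iteration = one update per batch
```

With 10 samples and a batch size of 4, each epoch takes 3 iterations (the last batch holds only 2 samples), so 3 epochs perform 9 updates.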

Generative AI Related Concepts

Term	Description
Foundation Model (FM)	A general-purpose AI model pre-trained on large-scale data. Adaptable to various tasks. Forms the basis for transfer learning and fine-tuning.
Large Language Model (LLM)	A large-scale foundation model specialized in natural language processing. Examples include GPT and BERT. Demonstrates high performance in text generation and comprehension tasks.
RAG (Retrieval-Augmented Generation)	A method to improve the output quality of generative AI by searching and referencing external knowledge. Effective in preventing hallucination and improving accuracy.
Prompt	Input text to generative AI models. Instructions or context to control the model's output.
Token	The smallest unit for dividing text in prompts. Composed of words or substrings. The basis for input/output limitations of models.
Temperature	An inference parameter that controls the randomness of generation. Higher values lead to more diverse outputs, lower values to more deterministic outputs.
Top-p sampling	A sampling method that selects the next token from the smallest set of tokens whose cumulative probability reaches p. Controls the balance between output diversity and quality.
Top-k sampling	A sampling method that selects the next token from the k most probable tokens. Used to control output.
Context window	The maximum length of input, measured in tokens, that the model can process at once. Affects the understanding of long contexts.
In-context learning	The ability to learn tasks through examples within the prompt. Adaptation without additional training.
Fine-tuning	The process of adapting a foundation model to specific tasks or domains. Specialization through additional learning.
Prompt engineering	The technique of designing effective prompts. Improves the quality and consistency of outputs.
Hallucination	The phenomenon where a model generates information not based on facts. A challenge for reliability.
Style transfer	A generative technique to change the style of existing content. Used for images and text.
Latent space	A compressed representation space of data learned by generative models. Controls the diversity of generation.
Attention mechanism	A mechanism to focus on important parts of the input. Core technology of Transformer models.
Self-attention mechanism	A mechanism to learn relationships between elements within a sequence. Effective for capturing long-range dependencies.
Decoder	The part that generates the desired output from latent representations. An important component of generative models.
Encoder	The part that converts input into latent representations. Responsible for information compression and feature extraction.
Transformer	An architecture based on self-attention mechanism. The foundation of modern generative AI.
Multimodal	The ability to handle multiple data formats such as text, images, and audio.
Zero-shot capability	The ability to perform new tasks from the instructions in the prompt alone, relying only on knowledge from pre-training. Adaptation without examples.
Few-shot capability	The ability to perform new tasks with a few examples in prompts. Efficient adaptive learning.
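
To make Temperature, Top-k sampling, and Top-p sampling more concrete, here is a minimal Python sketch of how they could be applied to a model's raw scores (logits) for the next token. The function names and the example logits are assumptions for illustration, and real model implementations may differ in detail. The sketch assumes `temperature` is greater than zero.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable token ids and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def sample(filtered, rng=random):
    """Draw one token id according to the filtered probabilities."""
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]           # hypothetical scores for 4 tokens
probs = softmax_with_temperature(logits, temperature=1.0)
next_token = sample(top_p_filter(probs, p=0.9))
```

Lowering `temperature` concentrates probability on the highest-scoring token, while `top_k_filter` and `top_p_filter` cut off the low-probability tail before sampling.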

Machine Learning Approaches

Term	Description
Parametric learning	An approach where the model shape is fixed and the number of parameters is constant. Examples include linear regression and logistic regression.
Non-parametric learning	An approach where the complexity of the model changes according to the data. Examples include k-NN and kernel methods.
Ensemble learning	An approach that combines multiple learners to improve performance. Examples include Random Forest and Boosting.

Foundational Learning Theories

Term	Description
Maximum Likelihood Estimation	A method to estimate parameters by maximizing the probability of obtaining the data.
Bayesian Estimation	A method to estimate parameters by calculating posterior probability from prior probability and data likelihood.
Empirical Risk Minimization	A principle to minimize prediction errors on training data.
Structural Risk Minimization	A principle to minimize prediction errors while considering model complexity.
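
As a small worked example of the difference between Maximum Likelihood Estimation and Bayesian Estimation, consider estimating the success probability of a coin from observed flips. The sketch below is only an illustration: the choice of a Beta prior, the use of the posterior mean as the Bayesian point estimate, and the function names are all assumptions for this example.

```python
def bernoulli_mle(successes, trials):
    """Maximum likelihood estimate: the observed success frequency."""
    return successes / trials

def beta_posterior_mean(successes, trials, alpha=1.0, beta=1.0):
    """Posterior mean under a Beta(alpha, beta) prior (assumed for illustration)."""
    return (successes + alpha) / (trials + alpha + beta)

mle = bernoulli_mle(7, 10)             # relies on the data alone
bayes = beta_posterior_mean(7, 10)     # pulled toward the prior mean of 0.5
```

With 7 successes in 10 trials, the maximum likelihood estimate is 0.7, while the Bayesian estimate with a uniform Beta(1, 1) prior is 8/12, slightly closer to 0.5. As the number of trials grows, the two estimates converge.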

Loss Functions

Term	Description
Squared Loss	The square of the difference between predicted and actual values. Commonly used in regression problems.
Cross-Entropy Loss	A loss function used in classification problems. Measures the difference between the predicted probability distribution and the true distribution.
Hinge Loss	A loss function used in SVMs. Achieves margin maximization.
Huber Loss	A loss function robust to outliers. A combination of squared loss and absolute loss.
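
The four loss functions above can be written in a few lines each. This is a minimal per-sample sketch in plain Python; the function signatures, the `delta` default of 1.0 for Huber loss, and the -1/+1 label convention for hinge loss are assumptions chosen for illustration.

```python
import math

def squared_loss(y_true, y_pred):
    """Square of the difference between the actual and predicted value."""
    return (y_true - y_pred) ** 2

def cross_entropy_loss(true_class, predicted_probs, eps=1e-12):
    """Negative log of the probability assigned to the true class."""
    return -math.log(max(predicted_probs[true_class], eps))

def hinge_loss(y_true, score):
    """y_true is -1 or +1; score is the raw classifier output."""
    return max(0.0, 1.0 - y_true * score)

def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for small errors, linear for large ones (robust to outliers)."""
    error = abs(y_true - y_pred)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)
```

For an error of 3.0, squared loss gives 9.0 while Huber loss with `delta=1.0` gives 2.5, which shows why Huber loss is less sensitive to outliers.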

Optimization Theory

Term	Description
Convex Optimization	A special optimization problem where local optima are global optima.
Stochastic Optimization	A method that uses randomness to search for optimal solutions.
Constrained Optimization	An optimization problem under constraints. Solved using methods like Lagrange multipliers.

Terms Related to Learning Process

Term	Description
Vanishing Gradient Problem	A phenomenon in deep neural networks where gradients vanish during backpropagation. Makes learning difficult in deep layers.
Exploding Gradient	A phenomenon in deep neural networks where gradients grow exponentially. Causes instability in learning.
Sparsity	A property where many of the data or model parameters are zero. Affects computational efficiency and generalization performance.
Curse of Dimensionality	A problem where the required amount of data increases exponentially as the number of feature dimensions increases. A challenge in high-dimensional data analysis.

Data Quality Related Terms

Term	Description
Data Imbalance	A state where there is a large difference in the number of samples between classes. Makes learning difficult for minority classes.
Noise	Unwanted variations or errors in data. Can hinder model learning.
Outliers	Values that deviate significantly from the general distribution of data. Can negatively affect model learning.
Missing Values	Unrecorded or unmeasured values in a dataset. Requires appropriate handling.

Model Evaluation Related Terms

Term	Description
Baseline	A simple model or performance metric used as a comparison standard. Used to evaluate improvement.
Significance Testing	A method to evaluate whether performance differences between models are statistically meaningful.
Cross-Entropy	A metric that measures the difference between predicted probabilities and true distribution in classification problems.
Confusion Matrix	A table that aggregates prediction results by classification. Used for performance evaluation.
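
As an illustration of the Confusion Matrix entry, the following minimal Python sketch counts how often each true class was predicted as each class. The labels and predictions are made-up example data, and the row/column convention (rows are true classes, columns are predicted classes) is an assumption of this sketch.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

y_true = ["cat", "cat", "dog", "dog", "dog"]   # hypothetical ground truth
y_pred = ["cat", "dog", "dog", "dog", "cat"]   # hypothetical model output
matrix = confusion_matrix(y_true, y_pred, labels=["cat", "dog"])

# Correct predictions lie on the diagonal.
accuracy = sum(matrix[i][i] for i in range(len(matrix))) / len(y_true)
```

Here the matrix is `[[1, 1], [1, 2]]`: one cat was correctly identified, one cat was mistaken for a dog, one dog was mistaken for a cat, and two dogs were correctly identified.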

Activation Functions

Term	Description
ReLU	The most commonly used activation function. A simple non-linear function that sets negative inputs to zero.
Sigmoid	Converts output to a range of 0-1. Often used in the output layer for binary classification.
tanh	Converts output to a range of -1 to 1. Mitigates the vanishing gradient problem better than sigmoid.
Softmax	Outputs probability distribution for multiple classes. Used in the output layer for multi-class classification.
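
The activation functions above are simple enough to write directly. The following is a minimal sketch in plain Python operating on scalars and lists; deep learning frameworks provide their own vectorized versions, so these definitions are for illustration only.

```python
import math

def relu(x):
    """Set negative inputs to zero, pass positive inputs through."""
    return max(0.0, x)

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)                  # avoids overflow for large negative x
    return z / (1.0 + z)

def tanh(x):
    """Squash any real number into the range (-1, 1)."""
    return math.tanh(x)

def softmax(scores):
    """Convert a list of scores into a probability distribution."""
    m = max(scores)                  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class_probs = softmax([1.0, 2.0, 3.0])
```

The values in `class_probs` sum to 1, and the largest input score receives the largest probability.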

Types of Learning Algorithms

Term	Description
Perceptron	The most basic neural network. Suitable for linearly separable problems.
SVM (Support Vector Machine)	A method that determines the classification boundary by maximizing margin. Non-linear classification is possible with kernel trick.
Decision Tree	A method that makes predictions by hierarchically dividing data. High interpretability and easy evaluation of feature importance.
k-Nearest Neighbors	A method that makes predictions by majority vote among the k nearest training data points. Simple but computationally expensive at prediction time.
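
To show why k-Nearest Neighbors is simple but computationally expensive, here is a minimal classification sketch in plain Python. The function name `knn_predict`, the use of Euclidean distance, and the toy points are assumptions for this example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Every prediction computes the distance to every training point,
    # which is why k-NN becomes expensive on large datasets.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]   # toy data
labels = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(points, labels, query=(0.2, 0.1), k=3)
```

There is no training step: the "model" is the stored training data itself, and all the work happens at prediction time.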

Data Quality Indicators

Term	Description
Data Completeness	An indicator of the degree of missing values, duplicates, and inconsistencies in a dataset.
Data Consistency	An indicator of whether data formats and value ranges are as expected.
Data Freshness	An indicator of the update time and expiration date of data.
Data Representativeness	An indicator of whether the sample appropriately represents the population.

Model Quality Indicators

Term	Description
Prediction Stability	An indicator of prediction consistency for similar inputs.
Model Confidence	An indicator of the model's confidence in each prediction.
Explainability	An indicator of the ease of interpreting the reasons for model predictions.
Robustness	An indicator of the model's resistance to noise and outliers.

Statistical Concepts

Term	Description
Analysis of Variance	A statistical method for analyzing sources of variation in data.
Hypothesis Testing	A method for verifying statistical hypotheses.
Confidence Interval	A range that quantifies the uncertainty of an estimate.
Effect Size	An indicator of the practical magnitude of statistical differences.

Model Development Process

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Studio provides an integrated development environment (IDE) that enables one-stop execution from notebook creation to model development, training, and deployment, realizing centralized management of ML workflows.
* Amazon SageMaker Canvas provides a no-code ML development environment that allows data preparation to model deployment through drag & drop without writing code, enabling development for business analysts.

Model Development Process

Phase	Description
Data Collection	Collection and integration of data necessary for learning. Includes identification of data sources and quality checks. Consider data representativeness and balance.
Data Preprocessing	Implement data splitting, cleaning, data labeling, feature engineering, scaling (normalization, standardization). Improve data quality and convert to a format suitable for learning. Include handling of missing values and outliers.
Model Selection	Select appropriate algorithms and architectures based on problem type (classification/regression, etc.), data characteristics, requirements (accuracy/speed/explainability). Consider computational resource constraints and deployment environment. Also consider the possibility of using pre-trained models.
Model Training	Train the model using the selected algorithm. Include hyperparameter optimization. Conduct performance evaluation through cross-validation.
Model Evaluation	Verify model performance. Analyze from multiple angles using various evaluation metrics. Confirm generalization performance on test data.
Deployment	Deploy the model to the production environment. Include scaling and monitoring settings. Also conduct A/B testing to verify effectiveness.
Inference	Execute predictions on new data using the deployed model. Perform predictions in real-time or batch processing.
Monitoring	Continuously monitor model performance. Detect drift and make decisions on retraining. Track quality metrics and set alerts.

Data Collection

Types of Data

Type	Description
Structured Data	Data organized in tabular form. Such as data managed in RDBMS. Has a clear schema.
Unstructured Data	Data without a fixed structure. Such as text, images, audio, video. Requires special techniques for processing.
Semi-structured Data	Partially structured data. Such as JSON, XML, HTML. Has a flexible schema.
Vector Data	Data represented as numerical vectors. Such as word embeddings, feature vectors. Suitable for similarity calculations.
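
Since vector data is described above as suitable for similarity calculations, here is a minimal sketch of cosine similarity, one commonly used measure. The function name and the example vectors are assumptions for illustration, and the sketch assumes neither vector is all zeros.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # 0
```

Because the result depends only on direction and not on length, two embeddings that point the same way are treated as similar even if their magnitudes differ.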

ETL (Extract, Transform, Load)

Phase	Description
Extract	Extract data from various sources. Check data format and quality. Perform consistency checks.
Transform	Transform and process data. Execute cleansing, normalization, aggregation, etc. Transform according to business rules.
Load	Save and load processed data. Store in data warehouses or data lakes. Ensure consistency.

Data Preprocessing

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Data Wrangler provides tools to streamline data preparation and preprocessing, enabling data cleansing to feature engineering via GUI.
* Amazon SageMaker Canvas provides functionality to perform data preprocessing, feature engineering, data transformation, etc. via GUI without writing code, enabling data preparation by business analysts.

Data Splitting

Term	Description
Training Data	Dataset used for model learning. Typically accounts for about 60-80% of all data.
Validation Data	Dataset used for model hyperparameter tuning and performance evaluation. Typically accounts for about 10-20% of all data.
Test Data	Independent dataset used for final model evaluation. Typically accounts for about 10-20% of all data.
Holdout Method	Basic method of splitting data into training and evaluation sets. Used when data quantity is sufficient.
Stratified Sampling	Method of splitting data while maintaining class ratios. Important for imbalanced datasets.
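
The splitting terms above can be combined into one small routine. The following is a minimal sketch of a stratified train/validation/test split in plain Python; the function name `stratified_split`, the 70/15/15 ratios, and the imbalanced toy labels are assumptions for this example.

```python
import random
from collections import defaultdict

def stratified_split(labels, train=0.7, validation=0.15, seed=0):
    """Return index lists (train, validation, test) that keep class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    splits = ([], [], [])
    for indices in by_class.values():
        rng.shuffle(indices)                     # shuffle within each class
        n_train = round(len(indices) * train)
        n_val = round(len(indices) * validation)
        splits[0].extend(indices[:n_train])
        splits[1].extend(indices[n_train:n_train + n_val])
        splits[2].extend(indices[n_train + n_val:])   # remainder is test data
    return splits

labels = ["pos"] * 20 + ["neg"] * 80             # imbalanced toy dataset
train_idx, val_idx, test_idx = stratified_split(labels)
```

Each of the three subsets keeps the original 20:80 class ratio, which a purely random split of a small imbalanced dataset would not guarantee.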

Cleansing (Cleaning)

Task	Description
Noise Removal	Detection and removal of outliers and noise. Improves data quality. Utilizes statistical methods and domain knowledge.
Missing Value Handling	Completion or removal of missing data. Impute with mean, median, predicted values, etc. Consider whether the data is missing at random (MAR) or missing completely at random (MCAR).
Outlier Detection	Identification of outliers using statistical methods or ML techniques. Important to check consistency with domain knowledge.
Duplicate Data Elimination	Detection and removal of duplicate records. Ensures data consistency. Normalization of key items.
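
The cleansing tasks above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: missing values are represented as `None`, imputation uses the median, outliers are flagged with the common 1.5 x IQR rule, and the function names are hypothetical.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def iqr_outliers(values, factor=1.5):
    """Return values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < low or v > high]

def drop_duplicates(records):
    """Remove repeated records while keeping the first occurrence and order."""
    seen, unique = set(), []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique

filled = impute_median([1, None, 3])
outliers = iqr_outliers([10, 12, 11, 13, 12, 11, 100])
deduplicated = drop_duplicates([("a", 1), ("a", 1), ("b", 2)])
```

Whether a flagged value such as 100 is a genuine error or a meaningful extreme still has to be judged with domain knowledge, as noted in the table.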

Data Labeling

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Ground Truth provides a data labeling service for creating high-quality training datasets.

Method	Description
Manual Labeling	Direct labeling by humans. High quality but time and cost intensive. Effective when specialized knowledge is required.
Semi-Automatic Labeling	Combination of AI prediction and human verification. Enables efficient labeling. Achieves balance between quality and efficiency.
Active Learning	Selection of target data for efficient labeling. Prioritizes data with high uncertainty. Optimizes labeling costs.
Label Quality Management	Checks for consistency and errors. Includes consensus building among multiple annotators. Setting and monitoring of quality metrics.
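
The idea behind Active Learning, prioritizing data with high uncertainty, can be sketched as follows. This minimal example ranks unlabeled items by the entropy of the model's predicted class probabilities; the item IDs, the probabilities, and the function names are made up for illustration, and entropy is only one of several possible uncertainty measures.

```python
import math

def entropy(probs):
    """Higher entropy means the model is less certain about the class."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labeling(predictions, budget):
    """Pick the `budget` items whose predictions are most uncertain."""
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:budget]]

# Hypothetical model outputs: (item id, predicted class probabilities)
predictions = [
    ("img-1", [0.98, 0.02]),
    ("img-2", [0.55, 0.45]),
    ("img-3", [0.70, 0.30]),
    ("img-4", [0.51, 0.49]),
]
to_label = select_for_labeling(predictions, budget=2)
```

The two items closest to a 50/50 prediction, `img-4` and `img-2`, are sent to human annotators first, while the confidently predicted `img-1` can wait.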

Feature Engineering

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Feature Store provides a centralized repository for managing and sharing features, ensuring consistency of features both online and offline.

Technique	Description
Feature Selection	Selection of useful input variables for the model. Selection based on correlation analysis and importance evaluation. Contributes to dimensionality reduction and model performance improvement. Used when there are many features or when you want to remove unnecessary features to prevent model overfitting. Particularly effective when analyzing datasets with many variables, such as medical or financial data.
Feature Extraction	The process of extracting meaningful features from raw data. Examples include Fourier transform in signal processing and edge detection from images. Used when there is a need to extract useful information from complex raw data, especially important in image processing, speech processing, and sensor data analysis. In time series data analysis, it is utilized for extracting statistical measures and frequency characteristics.
Feature Scaling	The process of adjusting the value range of features. Includes standardization and normalization. Essential when dealing with datasets that have features on different scales, particularly important for machine learning algorithms using gradient descent methods and clustering algorithms that perform distance calculations.
Feature Interaction	Creation of new features by combining multiple features. Used when wanting to capture non-linear relationships or model phenomena that cannot be explained by individual features alone. Particularly utilized in regression analysis and predictive models to improve prediction accuracy.
Dimensionality Reduction	Techniques for reducing the feature dimensions of data. PCA and t-SNE are representative methods. Contributes to improved computational efficiency and performance. Used when visualization of high-dimensional data is necessary or when wanting to reduce computational costs. Particularly useful when dealing with high-dimensional data in image recognition and document classification.
Encoding	Numerical conversion of categorical values. Uses methods such as One-Hot, Label, and Target encoding. Select appropriate methods based on data characteristics. Essential when building machine learning models that handle categorical data, especially important when there are many categories or when relationships between categories need to be considered.
Embedding	Conversion of high-dimensional data into low-dimensional vector representations. Examples include Word2Vec and BERT. Preserves semantic similarity. Used when dealing with text data or large-scale categorical data, playing a particularly important role in natural language processing and recommender system construction.
Data Augmentation	Enhancement of training data by transforming existing data. Includes rotation, scaling, etc. Improves model generalization performance. Used when training data is limited or when wanting to prevent model overfitting. Particularly effective in tasks using deep learning, such as image recognition and speech recognition.
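
For Feature Scaling, the two methods named in the table, normalization and standardization, can be sketched in plain Python. The function names and sample values are assumptions for illustration, and the sketch assumes the feature is not constant (otherwise the denominators would be zero).

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly into the range 0 to 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

incomes = [10, 20, 30]               # hypothetical feature values
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)
```

In practice the minimum, maximum, mean, and standard deviation should be computed on the training data only and then reused for validation and test data, so that information from those sets does not leak into training.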

Encoding Techniques

Encoding Technique	Description and Use Cases
Label Encoding	A technique that converts categorical values into consecutive integer values. Suitable when there is an ordinal relationship between categories (e.g., education level, age group). Memory-efficient and often used in decision tree-based algorithms. However, caution is needed for non-ordinal categorical data as it introduces a numerical relationship between categories.
One-Hot Encoding	A technique that converts categorical values into binary vectors. Optimal for cases where there is no ordinal relationship between categories (e.g., color, gender, occupation). Treats each category equally, but can lead to high memory consumption and increased computational cost due to dimensionality increase when there are many categories. Particularly important in linear models and neural networks.
Target Encoding	A technique that replaces categorical values with the mean of the target variable. Effective when there are a very large number of categories or when there is a strong association between categories and the target variable. However, there is a risk of overfitting, so appropriate regularization and cross-validation are necessary. Particularly useful for improving performance in predictive models.
Frequency Encoding	A technique that converts categories to numerical values based on their frequency of occurrence. Suitable when the frequency of categories holds significant meaning (e.g., product popularity, usage frequency). Easy to implement and interpret, but has the limitation of not being able to distinguish between categories with the same frequency.
Binary Encoding	A technique that converts categories into binary representations. More memory-efficient than One-Hot Encoding as it can represent with fewer dimensions, useful when there are many categories. However, the generated features can be difficult to interpret, and relationships between categories may be lost.
Hash Encoding	A technique that uses a hash function to convert categories into fixed-dimensional features. Suitable for cases with an extremely large number of categories or when new categories are continuously added. Memory-efficient and can handle online learning, but there is a possibility of information loss due to hash collisions.
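
Several of the encoding techniques above can be compared side by side on a tiny example. The following is a minimal sketch in plain Python; the `colors` and `target` data are made up, the alphabetical category order and the use of MD5 with 8 buckets for hash encoding are assumptions of this example, and the target encoding shown here omits the regularization that the table recommends.

```python
import hashlib
from collections import Counter, defaultdict

colors = ["red", "green", "blue", "green", "red", "green"]   # toy feature
target = [1, 0, 1, 0, 0, 1]                                  # toy labels
categories = sorted(set(colors))          # ['blue', 'green', 'red']

# Label encoding: one integer per category.
label_map = {c: i for i, c in enumerate(categories)}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one binary column per category.
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

# Frequency encoding: relative frequency of each category.
counts = Counter(colors)
frequency_encoded = [counts[c] / len(colors) for c in colors]

# Target encoding: mean of the target within each category.
sums = defaultdict(float)
for c, y in zip(colors, target):
    sums[c] += y
target_encoded = [sums[c] / counts[c] for c in colors]

# Hash encoding: map any category, even an unseen one, to a fixed bucket.
def hash_bucket(category, n_buckets=8):
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets
```

One-hot encoding needs three columns here, whereas the other techniques keep a single column, which illustrates the dimensionality trade-off described in the table.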

Feature Extraction Techniques for Text Data

Technique	Description
TF-IDF	A technique that calculates word importance based on term frequency and inverse document frequency. A fundamental feature in text analysis. Widely used in document classification, information retrieval, keyword extraction, and other tasks where word importance needs to be considered.
Word2Vec	A technique that converts words into fixed-length dense vectors. Captures semantic similarity between words. Used in natural language processing tasks that require consideration of word meaning relationships and context, such as sentiment analysis, document classification, question-answering systems, and machine translation.
Doc2Vec	An extension of Word2Vec that learns vector representations for entire documents. Used for document similarity calculations. Suitable for tasks requiring semantic comparison at the document level, such as document classification, clustering, recommender systems, and similar document search.
FastText	A word embedding technique that considers substrings. Capable of handling unknown words. Particularly effective in processing languages with rich morphology, handling text with spelling errors, and analyzing social media posts where new or modified words frequently appear.
BERT Tokenization	A tokenization technique for BERT models. Uses the WordPiece algorithm. Used as essential preprocessing when using BERT models for advanced natural language processing tasks that consider context, such as sentiment analysis, named entity recognition, and question answering.
BPE (Byte Pair Encoding)	A technique that learns subword units by merging frequent character strings. Allows control of vocabulary size. Particularly effective in machine translation, multilingual processing, and processing languages with complex morphology where efficient handling of large vocabularies is necessary.
Bag of Words (BoW)	The most basic technique that vectorizes word frequency in documents. Does not consider word order. Used in basic text analysis tasks where word frequency alone is sufficient for performance, such as spam email detection, document classification, and topic classification.
n-gram	A technique that uses combinations of n consecutive words or characters as features. Captures local context. Used in language modeling, spell checking, author identification, predictive input for programming languages, and other cases where local context or word order is important.
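
The three count-based techniques in this table (Bag of Words, n-gram, TF-IDF) can be sketched without any library. This is an illustrative version only: it assumes whitespace tokenization and the plain formula tf = count / document length, idf = log(N / df); real libraries usually add smoothing terms, so their numbers will differ.

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
tokenized = [d.split() for d in docs]

def bag_of_words(tokens):
    """Count how often each word occurs; word order is discarded."""
    return Counter(tokens)

def ngrams(tokens, n):
    """Return every run of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tf_idf(tokens, corpus):
    """Unsmoothed TF-IDF: (count / len(doc)) * log(N / document frequency)."""
    n_docs = len(corpus)
    scores = {}
    for word, count in Counter(tokens).items():
        df = sum(1 for doc in corpus if word in doc)
        scores[word] = (count / len(tokens)) * math.log(n_docs / df)
    return scores

bow = bag_of_words(tokenized[0])
bigrams = ngrams(tokenized[0], 2)
scores = tf_idf(tokenized[0], tokenized)
```

In the first document, "the" occurs twice but also appears in another document, so its TF-IDF score ends up lower than that of "cat", which occurs once but only in this document.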

Text Data Preprocessing

Method	Description
Tokenization	Splitting text into words or substrings. Selection of appropriate splitting method based on language characteristics.
Normalization	Unification of uppercase and lowercase, accent removal, character type standardization, etc.
Stop Word Removal	Removal of common words with little information (articles, prepositions, etc.).
Lemmatization/Stemming	Conversion of words to their base form. Lemmatization uses morphological analysis to obtain the dictionary form, while stemming strips suffixes by rule.
Noise Removal	Removal of special characters, HTML/XML tags, unnecessary spaces, etc.
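
The steps in this table are typically chained into one pipeline. The following is a minimal sketch using only the standard library; the stop word list and the `naive_stem` helper are deliberately tiny, hypothetical stand-ins for what a real NLP library would provide, and the tokenizer only handles ASCII words.

```python
import re
import unicodedata

STOP_WORDS = {"the", "a", "an", "of", "in", "on", "is", "are"}  # tiny illustrative list

def strip_accents(text):
    """Decompose characters and drop combining marks (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def naive_stem(word):
    """Very rough suffix stripping; real stemmers apply many more rules."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    text = re.sub(r"<[^>]+>", " ", text)                  # noise removal: HTML/XML tags
    text = strip_accents(text).lower()                    # normalization
    tokens = re.findall(r"[a-z0-9]+", text)               # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop word removal
    return [naive_stem(t) for t in tokens]                # stemming

result = preprocess("<p>The Cafés are  serving drinks in the garden!</p>")
# result == ['cafe', 'serv', 'drink', 'garden']
```

The stem "serv" shows why stemming output is not always a dictionary word, which is the practical difference from lemmatization.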

Image Data Preprocessing

Method	Description
Resize	Standardization of image size. Convert to a size suitable for model input.
Normalization	Standardization of pixel values. Generally converted to a range of 0 to 1 or -1 to 1.
Color Space Conversion	Conversion to color spaces such as RGB, HSV, grayscale according to purpose.
Noise Removal	Noise reduction using median filters or Gaussian filters.
Data Augmentation	Enhancement of training data through rotation, flipping, scaling, etc.
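
A few of these operations are simple enough to sketch on nested lists. This is only an illustration assuming 8-bit pixel values (0 to 255); in practice an image library or array package would be used. The grayscale weights are the common ITU-R BT.601 luma coefficients, chosen here as an assumption.

```python
def normalize_pixels(image, low=0.0, high=1.0):
    """Map 8-bit values (0-255) linearly into the range [low, high]."""
    return [[low + (high - low) * px / 255.0 for px in row] for row in image]

def to_grayscale(rgb_image):
    """Color space conversion: weighted sum of R, G, B per pixel (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def horizontal_flip(image):
    """Data augmentation: mirror each row left to right."""
    return [list(reversed(row)) for row in image]

gray = [[0, 128, 255], [64, 192, 32]]      # hypothetical 2x3 grayscale image
unit_range = normalize_pixels(gray)        # values in 0 to 1
signed_range = normalize_pixels(gray, -1.0, 1.0)  # values in -1 to 1
flipped = horizontal_flip(gray)
```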

Scaling Methods

Method	Description
Normalization	A transformation that fits data into a specific range (usually 0-1). Unifies the scale between features, making them comparable.
Standardization	A transformation that converts data to a distribution with mean 0 and standard deviation 1. Susceptible to outliers but suitable for normally distributed data.
Standard Scaler	A scaler that implements standardization. Converts to mean 0 and standard deviation 1. Suitable for data following normal distribution.
Robust Scaler	Robust scaling using median and interquartile range. Less affected by outliers.
Min Max Scaler	A scaler that implements normalization. Converts data to a specified range such as 0-1. Suitable for neural network inputs.
Max Absolute Scaler	Scales each feature by its maximum absolute value so that values fall within -1 to 1. Does not shift or center the data, so zero entries stay zero, which makes it suitable for sparse data.
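
To make the differences between these scalers concrete, here is a minimal single-feature sketch using the standard library. It assumes the population standard deviation for standardization and the "inclusive" quartile method for the robust scaler; library implementations may use slightly different conventions.

```python
import statistics

def min_max_scale(xs):
    """Normalization: map the minimum to 0 and the maximum to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def standard_scale(xs):
    """Standardization: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)  # population standard deviation
    return [(x - mean) / std for x in xs]

def robust_scale(xs):
    """Subtract the median, divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in xs]

def max_abs_scale(xs):
    """Divide by the largest absolute value; zeros stay zero."""
    m = max(abs(x) for x in xs)
    return [x / m for x in xs]

data = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier
```

With this data, `min_max_scale` squeezes the four ordinary values into the range 0 to about 0.03 because of the outlier, while `robust_scale` keeps them spread over -1 to 0.5 and pushes only the outlier far away.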

Model Selection

[Amazon SageMaker components useful in this category]
* Amazon SageMaker JumpStart functions as an ML hub providing pre-trained models and solutions, deployable with one click.

Model Types

Model Type	Description
Supervised Learning Models	Predictive models for classification or regression. Uses labeled data. Examples include RandomForest, SVM, Neural Networks.
Unsupervised Learning Models	Models for clustering and pattern discovery. Uses unlabeled data. Examples include K-means, PCA, Auto-encoders.
Semi-Supervised Learning Models	Models that learn using a small amount of labeled data and a large amount of unlabeled data. Examples include pseudo-labeling, co-training.
Generative Models	Models that generate data or learn distributions. Examples include GANs, VAE, Diffusion Models.
Transfer Learning Models	Models that utilize pre-trained knowledge. Based on foundation models such as BERT, GPT, ResNet.

Selection Criteria

Criteria	Description
Data Characteristics	Model selection based on data quantity, dimensionality, presence of noise, class balance, sparsity, etc.
Computational Resources	Selection based on available memory, CPU/GPU, and training time constraints.
Prediction Performance	Selection based on target performance metrics such as accuracy, recall, F1 score.
Inference Speed	Selection based on real-time requirements, batch processing requirements.
Explainability	Selection based on requirements for model transparency and interpretability.
Scalability	Selection based on ability to handle increases in data volume and system expansion.
Cost	Selection based on total cost of ownership including development, training, and operation.

Common Algorithms

Algorithm	Application Scenarios
Linear Regression	Simple regression problems, when interpretation of relationships is important.
Logistic Regression	Binary classification problems, when probability prediction is needed.
Random Forest	Classification and regression of structured data, analysis of feature importance.
XGBoost/LightGBM	High-performance prediction problems with structured data.
Neural Networks	Complex pattern recognition, image and speech processing.
BERT/Transformer	Text processing, natural language understanding tasks.
CNN	Image recognition, pattern detection.
RNN/LSTM	Time series data analysis, sequence data processing.
Reinforcement Learning Models	Decision making, game strategies, robot control, etc.

Architecture Considerations

Element	Description
Model Size	Consideration of number of parameters, memory requirements, storage requirements.
Layer Configuration	Selection of number of layers, number of units, activation functions for neural networks.
Ensemble Methods	Methods of combining multiple models, voting or averaging strategies.
Quantization and Compression	Model lightweighting, adaptation to edge deployment.
Batch Size	Balance between memory usage and speed during training and inference.
Distributed Learning Support	Possibility of learning on multiple GPUs or multi-nodes.

Model Selection Strategies

Strategy	Description
Baseline Construction	Strategy to start with simple models and gradually increase complexity.
AutoML	Utilization of tools for automatic model selection and optimization.
Algorithm Comparison	Evaluate multiple models in parallel and select the optimal one.
Experiment Management	Tracking and documentation of the model selection process.
A/B Testing	Comparison of different models' performance in real environments.
Gradual Optimization	Strategy to optimize while gradually increasing model complexity.

Model Training

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Debugger performs debugging and monitoring of the training process, enabling visualization of metrics and setting of alerts.

Basic Concepts of Model Learning

Concept	Description
Inductive Bias	Assumptions or hypotheses inherent in the model. Determines the nature of the learning algorithm. Appropriate bias improves generalization performance.
Bias-Variance Tradeoff	The tradeoff relationship between model complexity and generalization performance. Balance between overfitting and underfitting.
Cross-Entropy Loss	A loss function commonly used in classification problems. Measures the difference between predicted probabilities and true distribution.
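
As a small worked example of cross-entropy loss, the sketch below computes H(p, q) = -Σ p_i log(q_i) for a one-hot label. The `eps` clamp is an added assumption to avoid taking the logarithm of zero.

```python
import math

def cross_entropy(true_dist, predicted_probs, eps=1e-12):
    """H(p, q) = -sum_i p_i * log(q_i); eps guards against log(0)."""
    return -sum(p * math.log(max(q, eps))
                for p, q in zip(true_dist, predicted_probs))

one_hot_label = [0.0, 1.0, 0.0]  # the true class is index 1
confident_right = cross_entropy(one_hot_label, [0.05, 0.90, 0.05])
confident_wrong = cross_entropy(one_hot_label, [0.90, 0.05, 0.05])
```

With a one-hot label the loss reduces to the negative log of the probability assigned to the true class, so a confident wrong prediction is penalized far more heavily than a confident correct one.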

Training Methods

Method	Description
Supervised Learning	A method of learning using data with correct labels. Used for classification and regression tasks. The quality and quantity of data determine performance.
Unsupervised Learning	A method to discover patterns from unlabeled data. Used for clustering and anomaly detection. Reveals latent structures in data.
Semi-Supervised Learning	A method of learning using a small amount of labeled data and a large amount of unlabeled data. Achieves high performance while suppressing labeling costs.
Reinforcement Learning	A method to learn actions that maximize rewards through interaction with the environment. Acquires optimal action policies through trial and error.
Transfer Learning	A method to apply knowledge learned from one task to another task. Improves learning efficiency by utilizing pre-trained models.
Batch Learning	A learning method that processes all training data at once. Enables stable learning with good computational efficiency. Requires retraining when data is updated.
Online Learning	A learning method that processes data sequentially and continuously updates the model. Quick adaptation to new patterns. Risk of instability.
Incremental Learning	A method to perform additional learning on existing models with new data. Enables model updates without complete retraining.
Pre-training	The basic process of training a model from scratch with large-scale data. Acquires general knowledge and patterns. Forms the foundation for subsequent tasks.
Fine-tuning	A method to adapt pre-trained models to specific tasks. Achieves high performance even with small amounts of data. A type of transfer learning.
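Continuous Pre-training	Additional pre-training of an already pre-trained model on new, typically unlabeled, data. Effective for building and maintaining domain-specific models. When repeated periodically with fresh data, it also serves as a drift countermeasure.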
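RLHF	Reinforcement Learning from Human Feedback. Trains a model with reinforcement learning using a reward signal derived from human preference feedback. Improves quality and safety of generative AI models. Addresses alignment problems.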
Custom Vocabulary Learning	A method to train models on specialized terminology in specific fields. Important for building domain-specific models. Contributes to improving expertise.
Meta-learning	A method to learn the learning algorithm itself. Realizes high flexibility and efficiency for new tasks. The foundation of automatic ML systems. Enables quick adaptation to new tasks.
Few-shot Learning	A method that enables learning from a small number of samples. A form of meta-learning.
Zero-shot Learning	A method that can make predictions for classes not seen during training. An advanced form of transfer learning.
Self-Supervised Learning	A method to learn by automatically generating teaching signals from unlabeled data. Effective for pre-training.
Multi-task Learning	A method to learn multiple tasks simultaneously. Enables efficient learning through knowledge sharing between tasks.
Federated Learning	A method to perform cooperative learning on multiple clients while keeping data distributed. Effective for privacy protection.
Knowledge Distillation	A method to transfer knowledge from a large model to a small model. Used for model lightweighting.

Model Optimization Methods

Method	Description
Hyperparameter Tuning	Optimization of model configuration parameters. Uses grid search, random search, Bayesian optimization, etc. Consider the balance between computational cost and performance.
Regularization	Addition of penalty terms to prevent overfitting. L1 (Lasso), L2 (Ridge) regularization, etc. Controls model complexity.
Early Stopping	Ends learning when improvement in validation performance is no longer observed, preventing overfitting. Also contributes to efficient use of computational resources.
Cross-validation	Conducts evaluation by dividing data into multiple parts. Accurately estimates model's generalization performance. Particularly effective when data quantity is limited.
Ensemble Learning	Improves performance by combining multiple models. Random Forest, Gradient Boosting, etc. Compensates for weaknesses of individual models.
Gradient Descent	A method to search for optimal solutions by updating parameters based on the gradient of the loss function. There are variations such as Stochastic Gradient Descent (SGD) and Mini-batch Gradient Descent.
Learning Rate Adjustment	Adjustment of parameters that control the update amount in gradient descent. There are adaptive methods such as AdaGrad, Adam, RMSprop.
Momentum	A method to accelerate optimization by using past gradient information. Effective for avoiding local optima and accelerating convergence.
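
Gradient descent and momentum can be illustrated on a one-dimensional toy problem. The sketch below minimizes the made-up loss f(w) = (w - 3)^2; the learning rate, momentum coefficient, and step count are arbitrary illustrative values, not recommendations.

```python
def gradient(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def gradient_descent(w0, lr=0.1, momentum=0.0, steps=200):
    """Plain gradient descent when momentum=0, classical momentum otherwise."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # The velocity accumulates a decaying sum of past gradients.
        velocity = momentum * velocity - lr * gradient(w)
        w += velocity
    return w

w_plain = gradient_descent(0.0)
w_momentum = gradient_descent(0.0, momentum=0.9)
```

Both runs approach the minimum at w = 3. On such a simple convex problem momentum mainly adds oscillation; its benefit shows up on ill-conditioned or noisy loss surfaces.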

Optimizers (Optimization Algorithms)

Term	Description
SGD (Stochastic Gradient Descent)	A basic optimization algorithm that calculates gradients and updates parameters on a mini-batch basis.
Adam	A popular optimization algorithm that combines momentum and adaptive learning rates. Shows good convergence in many cases.
RMSprop	An adaptive optimization algorithm that considers past gradients using exponential moving average.
AdaGrad	An adaptive optimization algorithm that applies different learning rates for each parameter.
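
To show how Adam combines momentum with an adaptive learning rate, here is a minimal scalar sketch of its update rule. The default hyperparameters are the commonly quoted ones, and the quadratic toy loss is an assumption made purely for illustration.

```python
import math

def adam_minimize(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a scalar function with the Adam update rule."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g       # first moment (momentum)
        v = beta2 * v + (1 - beta2) * g * g   # second moment (adaptive scale)
        m_hat = m / (1 - beta1 ** t)          # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Toy loss f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Because the gradient is divided by the square root of its own running second moment, the very first step has a size of roughly `lr` regardless of how large the gradient is.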

Regularization Methods

Method	Description
Dropout	A method to prevent overfitting by randomly disabling neurons.
Batch Normalization	A method to normalize inputs on a mini-batch basis, stabilizing and accelerating learning.
Layer Normalization	A method to perform normalization at the layer level. Effective for RNNs and Transformers.
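Weight Decay	A regularization method that penalizes the magnitude of weights. Equivalent to L2 regularization under plain SGD, and the two terms are often used interchangeably, although they differ for adaptive optimizers such as Adam.
Label Smoothing	A method to prevent model overconfidence by softening the target labels (e.g., replacing a hard 1 with 0.9 and spreading the remainder over the other classes).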
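
Label smoothing is easy to show numerically. The sketch below uses the common formulation that blends the one-hot target with a uniform distribution over all classes; the value `epsilon=0.1` is just an illustrative choice.

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Blend a one-hot target with the uniform distribution over classes."""
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]

smoothed = smooth_labels([0.0, 1.0, 0.0, 0.0], epsilon=0.1)
# Roughly [0.025, 0.925, 0.025, 0.025]; the values still sum to 1.
```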

Learning Rate Scheduling

Method	Description
Step Decay	A method to decrease the learning rate in stages every certain number of epochs.
Exponential Decay	A method to decay the learning rate exponentially.
Cosine Annealing	A method to periodically change the learning rate based on a cosine function.
Warm-up	A method to gradually increase the learning rate at the beginning of learning, then transition to normal learning rate scheduling.
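
Each schedule in this table can be written as a small function of the epoch number. The following is a sketch under assumed parameter names and defaults (`drop`, `epochs_per_drop`, `k`, `warmup_epochs`), not the API of any particular framework; the cosine version covers a single half-cycle without restarts.

```python
import math

def step_decay(base_lr, epoch, drop=0.5, epochs_per_drop=10):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs."""
    return base_lr * drop ** (epoch // epochs_per_drop)

def exponential_decay(base_lr, epoch, k=0.1):
    """Decay the learning rate continuously as exp(-k * epoch)."""
    return base_lr * math.exp(-k * epoch)

def cosine_annealing(base_lr, epoch, total_epochs, min_lr=0.0):
    """Follow half a cosine wave from base_lr down to min_lr."""
    cos = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + cos)

def with_warmup(schedule, base_lr, epoch, warmup_epochs=5, **kwargs):
    """Ramp up linearly to base_lr, then hand over to the given schedule."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    return schedule(base_lr, epoch - warmup_epochs, **kwargs)
```

For example, `with_warmup(cosine_annealing, 0.1, epoch, total_epochs=95)` ramps up over the first 5 epochs and then anneals over the remaining 95.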

Model Evaluation

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Clarify can detect bias and assess explainability (interpretation of predictions) as part of model evaluation.
* Amazon SageMaker Experiments provides tools for tracking and managing machine learning experiments, automatically recording experiment results such as training runs, parameters, metrics, and enabling comparative analysis.

Baseline Evaluation

Metric	Description
Rule-based Baseline	Performance of a prediction model based on simple rules. Used as a baseline for improvement.
Random Baseline	Performance when making random predictions. Used as a minimum performance standard.
Industry Standard Baseline	Performance standards generally accepted in the industry. Used as a benchmark for competitive comparison.

Classification Tasks

Metric	Description
Accuracy	The proportion of correct predictions among all predictions. Effective when classes are balanced. Caution is needed with imbalanced data. Often used in cases where the number of data points is similar across classes, such as image classification and document classification. Examples: handwritten character recognition, general object recognition tasks, etc.
Precision	The proportion of true positives among positive predictions. Important when the cost of false positives is high. Examples: spam email detection, fraudulent transaction detection, quality control inspection, etc., where incorrectly classifying normal items as abnormal can cause significant problems.
Recall	The proportion of actual positives that are correctly predicted as positive. Important when the cost of false negatives is high. Examples: cancer screening, security systems, earthquake prediction, etc., where missed detections can lead to serious consequences.
F1 Score	The harmonic mean of precision and recall. A single, balanced evaluation metric used when both precision and recall are important. Examples: information retrieval systems, product recommendation, document classification, etc., where both accuracy and comprehensiveness are required.
ROC Curve	A graph plotting true positive rate vs false positive rate for each classification threshold. Performance is evaluated by AUC (Area Under the Curve). Used when a comprehensive evaluation of model performance is desired or when determining the optimal classification threshold is necessary. Examples: credit scoring, medical diagnostic systems, risk assessment models, etc., where threshold adjustment is important.
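
The first four metrics follow directly from the confusion-matrix counts. Here is a minimal sketch for binary labels (1 = positive, 0 = negative); the sample predictions are made up, and returning 0.0 when a denominator is zero is a convention chosen here for simplicity.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
# TP=3, FP=2, FN=1, TN=4: accuracy 0.7, precision 0.6, recall 0.75
```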

Regression Tasks

Metric	Description
MSE (Mean Squared Error)	The average of the squared differences between predicted and actual values. As it squares the errors, it is strongly affected by outliers. Commonly used in general regression problems, especially when emphasizing outliers or when larger errors need to be penalized more severely.
RMSE (Root Mean Squared Error)	The square root of the mean squared error. Evaluates the magnitude of prediction errors in the original unit. Strongly affected by outliers. Suitable for tasks like housing price prediction or sales forecasting where interpreting the predicted values in the original scale is desired. Taking the square root of MSE allows for more intuitive interpretation.
MAE (Mean Absolute Error)	The average of the absolute differences between predicted and actual values. An evaluation metric less sensitive to outliers than MSE or RMSE. Suitable for demand forecasting or inventory management where you want to assess the average magnitude of errors while suppressing the impact of outliers. Because errors are not squared, large errors are not over-weighted, and the value is easy to understand intuitively.
MAPE (Mean Absolute Percentage Error)	Evaluates relative errors. Allows comparison between data of different scales. Suitable for sales forecasting or stock price prediction where actual values are greater than 0 and you want to assess the relative magnitude of errors. Useful for comparing prediction accuracy across companies or products of different sizes.
R² (Coefficient of Determination)	Expresses the goodness of fit of a model. Values typically fall between 0 and 1, with values closer to 1 indicating higher prediction accuracy, but R² becomes negative when the model fits worse than always predicting the mean of the actual values. Used to evaluate the overall explanatory power of a model. Particularly used as an indicator for variable selection in multiple regression analysis and is useful for model comparison and selection.
Adjusted R²	A modified version of R². It corrects for the influence of the number of explanatory variables, allowing for more accurate model evaluation. Suitable for variable selection and model comparison, especially when comparing models with different numbers of explanatory variables. Used to prevent overfitting.
RMSLE (Root Mean Squared Logarithmic Error)	The RMSE computed after taking the logarithm of predicted and actual values (typically log(1 + value), so that zeros are allowed). Evaluates relative errors and mitigates the impact of large values. Suitable for sales forecasting or population prediction where the range of data values is wide and relative errors are emphasized. Particularly useful when predicted values have vastly different scales.
MSLE (Mean Squared Logarithmic Error)	The MSE computed after taking the logarithm of predicted and actual values (typically log(1 + value)). Evaluates relative errors and mitigates the impact of large values. Used for similar purposes as RMSLE, particularly suitable for cases where predicted values increase exponentially or for predicting ratios.
MedAE (Median Absolute Error)	The median of absolute errors. Among the metrics listed here, the one least affected by outliers. Particularly useful for prediction tasks with noisy datasets or when outliers are present. Suitable for analyzing sensor data or evaluating actual measurement data that may contain anomalies.
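
Most of these regression metrics are one-line formulas over the prediction errors. The sketch below computes several of them on made-up values; it assumes all actual values are non-zero (needed for MAPE) and uses log(1 + value) for the logarithmic metrics.

```python
import math
import statistics

def regression_metrics(y_true, y_pred):
    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    msle = sum((math.log1p(p) - math.log1p(t)) ** 2
               for t, p in zip(y_true, y_pred)) / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE in percent; requires every actual value to be non-zero.
        "mape": 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "medae": statistics.median(abs(e) for e in errors),
        "r2": 1 - (mse * n) / ss_tot,
        "msle": msle,
        "rmsle": math.sqrt(msle),
    }

y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 300.0, 440.0]
m = regression_metrics(y_true, y_pred)
# mse 450, mae 15, mape 6.25 (%), medae 10, r2 0.964
```

The single error of 40 dominates MSE (450) but barely moves MedAE (10), which illustrates the different outlier sensitivity described in the table.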

Text Generation

Metric	Description
ROUGE	Evaluates the similarity between generated text and reference text. Calculates n-gram matching. Commonly used for summarization tasks. Particularly effective for evaluating the performance of news article summarization and document summarization systems.
Human Evaluation	Subjectively assesses quality, coherence, relevance, etc. Sets qualitative evaluation criteria and judges with multiple evaluators. Used when subtle nuances and contextual understanding that cannot be fully captured by automatic evaluation metrics are required, or when evaluating creative text generation.
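
The n-gram matching behind ROUGE can be sketched as follows. This simplified version computes only ROUGE-N recall against a single reference with whitespace tokenization; full implementations also report precision and F-measure and support variants such as ROUGE-L.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate."""
    cand = ngram_counts(candidate.split(), n)
    ref = ngram_counts(reference.split(), n)
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
rouge_1 = rouge_n_recall(candidate, reference, n=1)  # 5 of 6 unigrams match
rouge_2 = rouge_n_recall(candidate, reference, n=2)  # 3 of 5 bigrams match
```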

Evaluation Metrics for Generative AI Models

Metric	Description
BLEU	A metric used for evaluating machine translation. Calculates n-gram matching between generated and reference sentences. Particularly effective for quality assessment of multilingual translation systems and comparison of different translation models.
METEOR	An evaluation metric for translation and generated text. Allows for flexible evaluation considering synonyms and morphological variations. Used in translation tasks where there are significant differences in grammatical structures between languages or when considering diversity of expressions.
BERTScore	A metric that evaluates semantic similarity of sentences using BERT's contextual word embeddings. Used when semantic similarity assessment is important beyond surface-level matching, or when evaluating paraphrasing.
Perplexity	A metric for evaluating the predictive performance of language models. Lower values indicate better models. Used for evaluating the learning process of language models and comparing language models with different architectures.
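
Perplexity has a compact definition: the exponential of the average negative log-probability the model assigned to each actual token. A minimal sketch, assuming the per-token probabilities are already available as a list:

```python
import math

def perplexity(token_probs):
    """exp(mean negative log-likelihood) over the observed tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

uniform_over_four = perplexity([0.25, 0.25, 0.25, 0.25])
more_confident = perplexity([0.9, 0.8, 0.95, 0.85])
```

A model that always spreads its probability evenly over four choices has a perplexity of 4, which is why perplexity is often read as the effective number of choices the model is hesitating between.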

Explainability Evaluation

Method	Description
LIME	A method providing local explainability. Generates interpretable explanations for individual predictions. Used in cases where explanation of individual decision bases is important, such as medical diagnostics or financial credit assessments.
SHAP	A feature importance calculation method based on game theory. Evaluates the contribution of each feature to predictions. Used when there is a need to understand the decision-making process of complex models or when ranking feature importance.
Attention Visualization	Visualization of the attention mechanism in Transformer models. Visually represents the basis of model decisions. Used to confirm the areas of focus in natural language processing tasks or when analyzing model behavior.
Feature Attribution	A method to quantify the contribution of each feature to prediction results. Analyzes the decision process of models. Also used when evaluating model fairness or detecting bias.

Correlation Analysis

Method	Description
Pearson Correlation	Measures the strength of linear relationships on a scale from -1 to 1. Used for evaluating relationships between continuous variables. Applied when analyzing variables expected to have a linear relationship, such as height and weight.
Spearman Correlation	Evaluates the strength of monotonic relationships using the ranks of the values. Applicable to non-linear relationships and suitable for ordinal variables. Used when analyzing monotonic but not necessarily linear relationships, such as between customer satisfaction and purchase amount.
Chi-square Test	Statistically tests the association between categorical variables. Used for verifying independence. Applied when analyzing relationships between categorical data, such as the association between gender and product selection.
Phi Coefficient	Measures correlation between binary variables. Applied to 2x2 contingency table data. Used when measuring the strength of relationships between binary data, such as the association between pass/fail and male/female.
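
The difference between Pearson and Spearman correlation shows up clearly on data that is monotonic but not linear. The sketch below implements both by hand; the `ranks` helper assumes there are no tied values, which a full implementation would handle with average ranks.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Rank values from 1 to n (assumes no ties, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = float(rank)
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]   # y = x^3: monotonic but not linear
r_pearson = pearson(xs, ys)    # about 0.94
r_spearman = spearman(xs, ys)  # 1.0, since the ordering is identical
```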

Model Challenges and Phenomena

Term	Description
Overfitting	A state where the model excessively fits to the training data, reducing generalization performance on new data. Also called overlearning. Balancing model complexity against the amount of training data is important.
Underfitting	A state where the model fails to capture patterns in the training data sufficiently. Caused by lack of model expressiveness or insufficient learning. Requires more complex models or additional learning.
Bias	Systematic error between model predictions and true values. Increases when model expressiveness is insufficient, causing underfitting.
Variance	Variability in model predictions, measured as how much predictions fluctuate in response to small changes in the training data. If too high, it can cause overfitting.
Hallucination	When AI models generate incorrect information not based on facts. Particularly problematic in generative AI, can be mitigated by techniques like RAG.
Drift	Changes in data distribution or model performance over time. Includes concept drift (changes in target variable relationships) and data drift (changes in input distribution).

Dealing with Data Imbalance

Method	Description
Oversampling	A method to increase data of minority classes. Includes synthetic data generation like SMOTE. Improves class balance. Effective when overall data quantity is small and you want to maximize use of information, or when there are ample computational resources.
Undersampling	A method to balance by reducing data from majority classes. Risk of information loss but computationally efficient. Effective when data quantity is sufficient and there are strict constraints on computation time or memory.
SMOTE	A method to generate synthetic data for minority classes. Uses k-nearest neighbors to generate new samples. Ensures data diversity. Effective when simple duplication risks overfitting or when you want to learn more diverse features of minority classes.
Class Weighting	Adjusts balance by weighting minority classes during learning. Modifies the model's loss function to balance. Effective when you want to maintain the original data distribution or avoid data modification.
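
Two of these approaches can be sketched in a few lines. The class weight formula below (n_samples / (n_classes × class count)) is one common "balanced" heuristic, used here as an assumption, and the oversampler simply duplicates minority samples at random. It is not SMOTE, which would instead interpolate between nearest neighbors.

```python
import random
from collections import Counter

def balanced_class_weights(labels):
    """weight_c = n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def random_oversample(samples, labels, seed=0):
    """Duplicate minority samples at random until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, cnt in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - cnt):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y

samples = list(range(10))        # hypothetical feature values
labels = [0] * 8 + [1] * 2       # 8 majority, 2 minority
weights = balanced_class_weights(labels)   # {0: 0.625, 1: 2.5}
new_x, new_y = random_oversample(samples, labels)
```

Class weighting leaves the data untouched and changes only how much each sample contributes to the loss, whereas oversampling changes the dataset itself.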

Bias and Fairness Evaluation

Metric	Description
Demographic Parity	Evaluates the uniformity of prediction result distribution across different demographic groups.
Equal Opportunity	An indicator that confirms true positive rates are equal across protected attributes.
Predictive Parity	Evaluates the consistency of prediction accuracy between different groups.
Individual Fairness	Evaluates the consistency of predictions for individuals with similar characteristics.
Bias Amplification	Measures the degree to which the model amplifies existing biases in the data.
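
Demographic parity and equal opportunity are often summarized as the gap between groups. The following is a minimal sketch for binary predictions; the group labels "A" and "B" and the sample data are hypothetical, and reporting max minus min across groups is one simple convention among several.

```python
def rate_by_group(values, groups):
    """Mean of binary values within each group."""
    totals, positives = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0) + 1
        positives[g] = positives.get(g, 0) + v
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_difference(y_pred, groups):
    """Gap in positive prediction rate between groups (0 means parity)."""
    rates = rate_by_group(y_pred, groups)
    return max(rates.values()) - min(rates.values())

def equal_opportunity_difference(y_true, y_pred, groups):
    """Gap in true positive rate, computed over actual positives only."""
    positives = [(p, g) for t, p, g in zip(y_true, y_pred, groups) if t == 1]
    rates = rate_by_group([p for p, _ in positives], [g for _, g in positives])
    return max(rates.values()) - min(rates.values())

groups = ["A"] * 4 + ["B"] * 4
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
dp_gap = demographic_parity_difference(y_pred, groups)          # 0.75 - 0.25
eo_gap = equal_opportunity_difference(y_true, y_pred, groups)   # 1.0 - 0.5
```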

Performance Stability Evaluation

Metric	Description
Prediction Variance	Evaluates the variability of model predictions. Used as an indicator of stability.
Threshold Stability	Evaluates the robustness of performance to changes in classification thresholds.
Cross-validation Standard Deviation	Evaluates the variability of performance across different data splits.
Noise Resistance	Evaluates the stability of predictions against input noise.
Temporal Stability	Evaluates the consistency of prediction performance in time series data.

Reliability and Robustness Evaluation

Metric	Description
Adversarial Attack Resistance	Evaluates the model's robustness against adversarial samples. Identifies security vulnerabilities.
Model Uncertainty	Quantification of prediction confidence and uncertainty. Evaluation using Bayesian methods or ensemble methods.
Stress Test	Evaluation of model behavior in extreme cases or boundary conditions. Understanding system limitations.
Data Quality Sensitivity	Evaluation of model sensitivity to deterioration in input data quality. Used as an indicator of robustness.
Fail-safe Property	Evaluation of safety in case of model abnormal operation. Confirmation of fallback mechanism effectiveness.

Cost Efficiency Evaluation

Metric	Description
Computational Cost	Evaluation of computational resources required for model training and inference. GPU time, memory usage, etc.
Infrastructure Cost	Evaluation of infrastructure costs required for model operation. Storage, network, etc.
Maintenance Cost	Evaluation of human resources and time required for model maintenance and updates.
ROI Analysis	Evaluation of return on investment from model deployment. Quantification of cost reduction or revenue increase.

Model Deployment

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Registry catalogs ML models, manages their versions, and stores associated model metadata.

Deployment Strategies

Term	Description
Canary Deployment	A technique that applies a new version of the model to only a portion of the traffic and gradually expands. Allows validation of new models while minimizing risk.
Blue/Green Deployment	A deployment method that prepares production (blue) and new (green) environments in parallel and switches between them. Enables immediate rollback.
Shadow Deployment	A method that mirrors production traffic to a new model for parallel evaluation. Enables performance verification under actual workloads.
A/B Testing	A method to operate multiple versions of models simultaneously and compare their performance. Enables data-driven decision making.
Rolling Update	A gradual deployment method that updates instances sequentially. Minimizes service interruption.
Rollback Plan	Setting of recovery procedures and trigger conditions in case of problems. Ensures consistency of data and models.

Inference Options

Term	Description
Real-time Inference	A method that executes inference in real-time for requests. Suitable for use cases requiring low latency. In Amazon SageMaker, it's provided as persistent, fully managed endpoints that can handle payloads up to 6MB and processing times up to 60 seconds. A scalable solution capable of handling continuous traffic.
Batch Inference	A method that executes inference in bulk for large amounts of data. Suitable for periodic prediction processing. In Amazon SageMaker, it's provided as batch transform, capable of processing large-scale datasets of several GB. Optimal for offline processing or preprocessing that doesn't require persistent endpoints.
Asynchronous Inference	A method that queues and processes inference requests requiring large payloads or long processing times. In Amazon SageMaker, it supports payloads up to 1GB and processing times up to 1 hour. Can scale down to 0 when there's no traffic.
Serverless Inference	An event-driven inference execution method. Automatically scales according to demand. In Amazon SageMaker, it provides a model that requires no infrastructure management and charges only for usage, for intermittent or unpredictable traffic. Supports payloads up to 4MB and processing times up to 60 seconds.
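
The limits in this table can be turned into a rough decision helper. This is only a rule-of-thumb sketch built from the numbers quoted above; the function name and its parameters are hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and the actual quotas should be checked against the current AWS documentation.

```python
MB = 1024 * 1024
GB = 1024 * MB

def suggest_inference_option(payload_bytes, processing_seconds,
                             intermittent_traffic=False, offline=False):
    """Pick a SageMaker inference option from the limits quoted in the table."""
    if offline:
        # No persistent endpoint needed: process the dataset in bulk.
        return "batch transform"
    if (intermittent_traffic and payload_bytes <= 4 * MB
            and processing_seconds <= 60):
        return "serverless inference"
    if payload_bytes <= 6 * MB and processing_seconds <= 60:
        return "real-time inference"
    if payload_bytes <= 1 * GB and processing_seconds <= 3600:
        return "asynchronous inference"
    return "batch transform"
```

For example, a 100 MB payload that takes ten minutes to process exceeds the real-time limits but fits within the asynchronous ones, so `suggest_inference_option(100 * MB, 600)` returns `"asynchronous inference"`.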

Endpoint Options

Term	Description
Single Model Endpoint	A basic endpoint configuration for deploying a single model. Simple and easy to manage.
Multi-Model Endpoint	An endpoint configuration that serves multiple models of the same framework in a single container. In Amazon SageMaker, it improves endpoint utilization and reduces deployment overhead, realizing cost optimization.
Multi-Container Endpoint	An endpoint configuration that serves multiple models of different frameworks in separate containers. In Amazon SageMaker, it allows flexible deployment of various frameworks and models.
Serial Inference Pipeline	An endpoint configuration that executes preprocessing, inference, and post-processing as a series of pipelines. In Amazon SageMaker, all containers are hosted on the same EC2 instance and fully managed, achieving low latency.
Scalable Endpoint	An endpoint configuration that automatically scales according to load. Flexibly responds to traffic fluctuations.
High Availability Endpoint	An endpoint configuration deployed across multiple availability zones, ensuring redundancy.

Infrastructure

Term	Description
Model Containerization	Packaging of models using container technologies like Docker. Ensures consistency and portability of environments.
Scaling Strategy	Setting of automatic scaling according to load. Selection and policy setting of horizontal/vertical scaling.
Service Mesh	Management of traffic control and inter-service communication in microservice architectures.
Deployment Pipeline	Automation and standardization of deployments. Construction of CI/CD pipelines and setting of quality gates.

Optimization and Security

Term	Description
Model Optimization	Optimization of a model before deployment. Reduces model size and speeds up inference through quantization, pruning, distillation, etc.
Security Settings	Setting of access control, encryption, authentication and authorization. Configuration of secure endpoints.
API Versioning	Version management of model APIs and ensuring compatibility between versions.
Monitoring Settings	Configuration of metric collection, logging, alert settings. Establishment of performance and quality monitoring system.
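
To make the quantization mentioned under Model Optimization more concrete, here is a minimal sketch of symmetric 8-bit quantization on a plain list of weights. It is an illustration under simplifying assumptions (one scale for the whole list, pure Python, hypothetical function names) and not how any particular framework or SageMaker feature implements it.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the quantized integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
# The rounding error per weight is at most about half of one quantization step.
print(max(abs(w - r) for w, r in zip(weights, restored)) <= scale / 2 + 1e-12)
```

Each weight is stored as a small integer plus one shared scale, which is where the size reduction comes from; the price is the small rounding error checked on the last line.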

Inference

Prompting Techniques

Technique	Description
Prompt Engineering	A technique to obtain desired outputs by crafting inputs (prompts) to AI models. Optimizes methods of setting context and presenting constraints.
Zero-shot Prompting	One of the prompt engineering techniques. Executes tasks directly without examples. Utilizes the model's generalization ability to handle new tasks.
Few-shot Prompting	One of the prompt engineering techniques. Teaches how to execute tasks by showing a few examples. Controls model behavior through concrete examples.
Chain-of-Thought Prompting	One of the prompt engineering techniques. Guides the model to solve complex problems step by step (Chain of Thought). Encourages explicit expansion of the reasoning process.
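
The three prompting techniques above differ mainly in what is placed in front of the actual input. The sketch below builds each kind of prompt as a plain string; the function names and the `Input:`/`Output:` layout are assumptions for illustration, not a format required by any model or AWS service.

```python
def zero_shot(task, text):
    """Zero-shot: state the task and give the input, with no examples."""
    return f"{task}\n\nInput: {text}\nOutput:"


def few_shot(task, examples, text):
    """Few-shot: show a few (input, output) pairs before the real input."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{shots}\n\nInput: {text}\nOutput:"


def chain_of_thought(task, text):
    """Chain-of-thought: ask for intermediate reasoning before the final answer."""
    return (
        f"{task}\n\nInput: {text}\n"
        "Let's think step by step, then give the final answer "
        "on a line starting with 'Answer:'."
    )


examples = [("Great product", "positive"), ("Terrible support", "negative")]
print(few_shot("Classify the sentiment.", examples, "It works fine"))
```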

Prompt Optimization Techniques

Technique	Description
Prompt Templates	Design of reusable prompts with a fixed structure and placeholders for variable input. Promotes consistent outputs.
Hallucination Countermeasures	Techniques to prevent generation not based on facts. Incorporation of knowledge base references and fact-checking.
Context Management	Effective setting and control of context information in prompts. Leads to more accurate responses.
Prompt Variation	Experimentation and optimization of different expression methods for the same intent. Improves robustness.
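
As a small illustration of a prompt template that also applies a simple hallucination countermeasure, the sketch below uses Python's standard `string.Template`. The template wording, the name `QA_TEMPLATE`, and the function `render_prompt` are assumptions for illustration; the idea is only that the fixed part stays the same while the context and question are filled in per request.

```python
from string import Template

# Fixed instructions plus placeholders for the variable parts of the prompt.
QA_TEMPLATE = Template(
    "Answer using only the context below. "
    "If the answer is not in the context, say \"I don't know\".\n\n"
    "Context:\n$context\n\n"
    "Question: $question\n"
    "Answer:"
)


def render_prompt(context, question):
    """Fill the template; substitute() raises KeyError if a placeholder is missing."""
    return QA_TEMPLATE.substitute(context=context.strip(), question=question.strip())


print(render_prompt("SageMaker supports batch transform.", "What is batch transform?"))
```

Restricting the answer to the supplied context is one common way to reduce output that is not grounded in facts, although it does not eliminate it.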

Monitoring

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Monitor continuously monitors model quality in production environments and detects data drift and bias.
* Amazon SageMaker Clarify can also be used in model quality monitoring, continuously evaluating model bias and explainability.

Model Quality Monitoring

Term	Description
Data Drift Detection	Monitors changes in input data distribution. Used as an indicator to determine timing for model retraining.
Concept Drift Detection	Monitors changes in the relationship between inputs and outputs. Used as an indicator to determine the need for model updates.
Prediction Quality Monitoring	Continuously evaluates the quality of model prediction results. Includes monitoring of bias and fairness.
Explainability Monitoring	Monitors metrics related to model explainability. Ensures transparency and reliability of predictions.
Input Data Validation	Continuously checks that input data conforms to the expected schema, types, value ranges, etc.
Data Completeness Monitoring	Monitors data quality indicators such as missing values, outliers, duplicates.
Feature Stability Monitoring	Tracks changes in statistical properties of features. Detects distribution shifts.
Data Source Monitoring	Monitors availability, freshness, and consistency of data sources.
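
One simple way to quantify the distribution changes described under Data Drift Detection and Feature Stability Monitoring is the Population Stability Index (PSI), which compares binned fractions of a baseline sample and a current sample. The sketch below is a generic illustration in pure Python; it is not how SageMaker Model Monitor computes drift, and the bin count and the smoothing value `eps` are assumptions.

```python
import math


def population_stability_index(baseline, current, bins=10, eps=1e-6):
    """PSI = sum((a_i - e_i) * ln(a_i / e_i)) over bins built from the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values are equal

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    expected = fractions(baseline)
    actual = fractions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


baseline = list(range(100))
shifted = [v + 50 for v in baseline]
print(population_stability_index(baseline, baseline))
print(population_stability_index(baseline, shifted))
```

Identical samples give a PSI of zero, and the value grows as the current distribution moves away from the baseline. The threshold at which a team triggers retraining is a judgment call and should be tuned per feature.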

Performance Monitoring

Term	Description
Latency Monitoring	Monitors inference processing time. Used to check SLA compliance and to guide performance optimization.
Throughput Monitoring	Monitors the number of requests processed per unit time. Used for capacity planning.
Resource Utilization Monitoring	Monitors infrastructure metrics such as CPU, memory, disk usage. Used for scaling decisions.
Error Rate Monitoring	Monitors the occurrence rate of inference errors and system errors. Used for maintaining service quality.
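
The latency, throughput, and error rate metrics above can all be derived from the same request log. The following sketch assumes a hypothetical log format (a list of dicts with `latency_ms` and `status` keys) and treats HTTP status codes of 500 and above as errors; both are assumptions for illustration.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(requests, window_seconds):
    """Summarize latency, throughput, and error rate for one time window."""
    if not requests:
        raise ValueError("no requests in window")
    latencies = [r["latency_ms"] for r in requests]
    errors = sum(1 for r in requests if r["status"] >= 500)
    return {
        "p50_ms": percentile(latencies, 50),
        "p99_ms": percentile(latencies, 99),
        "throughput_rps": len(requests) / window_seconds,
        "error_rate": errors / len(requests),
    }


# Hypothetical window: 100 requests in 10 seconds, 5 of them failing.
log = [{"latency_ms": i, "status": 500 if i <= 5 else 200} for i in range(1, 101)]
print(summarize(log, window_seconds=10))
```

Percentiles such as p99 are usually more informative for SLA checks than the average, because a small number of slow requests can hide behind a healthy mean.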

Security Monitoring

Term	Description
Access Monitoring	Monitors access patterns and authentication status to API endpoints. Detection of unauthorized access.
Data Security Monitoring	Monitors data encryption status, access control, and privacy protection status.
Compliance Monitoring	Continuously monitors compliance with regulatory requirements. Maintenance of audit trails.

Operational Monitoring

Term	Description
Alert Settings	A mechanism to notify when important metrics exceed thresholds. Enables early response.
Log Analysis	Analysis of system logs and application logs. Used for identifying causes of failures and trend analysis.
Incident Tracking	Tracking of occurrence history and response status of failures and abnormalities. Used for formulating recurrence prevention measures.
Capacity Management	Prediction and planning of resource usage. Formulation of appropriate scaling strategies.

Business Impact Monitoring

Term	Description
ROI Analysis	Continuous evaluation of costs and effects of model operation. Measurement of return on investment.
Business Metrics	Monitoring of indicators showing the business contribution of the model. Measurement of effects such as sales and cost reduction.
User Satisfaction	Tracking of feedback and service evaluations from end users.

System Health Monitoring

Term	Description
Infrastructure Availability Monitoring	Operational status and health checks of system components.
Network Monitoring	Monitoring of network connectivity, latency, bandwidth.
Cache Efficiency Monitoring	Tracking of cache hit rates, memory usage efficiency.
Batch Processing Monitoring	Monitoring of batch job execution status, success rates, processing times.

Fairness Monitoring

Term	Description
Bias Metrics	Monitoring of demographic biases in model predictions. Tracking of fairness indicators.
Attribute-based Monitoring	Monitoring of prediction biases based on protected attributes. Detection of discriminatory results.
Fairness Score	Evaluation of how uniform prediction accuracy is across different groups. Quantitative measurement of fairness.
Impact Analysis	Analysis of the impact of model predictions on different populations. Evaluation of social impact.
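
Two widely used bias metrics can be computed directly from predictions and a protected attribute: the difference in positive prediction rates between groups, and the ratio of the lowest rate to the highest. The sketch below is a generic illustration with hypothetical function names; it is not the implementation used by SageMaker Clarify, which reports its own set of related metrics.

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    """Fraction of positive (== 1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(rates):
    """Gap between the highest and lowest positive rate (0 means parity)."""
    return max(rates.values()) - min(rates.values())


def disparate_impact_ratio(rates):
    """Lowest positive rate divided by the highest (1 means parity)."""
    highest = max(rates.values())
    return 1.0 if highest == 0 else min(rates.values()) / highest


preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = positive_rates(preds, groups)
print(rates, demographic_parity_difference(rates), disparate_impact_ratio(rates))
```

In this toy data, group "a" receives positive predictions three times as often as group "b", which both metrics surface as a large gap.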

Governance Monitoring

Term	Description
Policy Compliance	Monitoring of compliance with the organization's AI governance policies. Confirmation of guideline adherence.
Accountability Tracking	Monitoring of transparency and explainability in the model's decision-making process. Ensuring accountability.
Ethical Risk Monitoring	Continuous assessment of AI's ethical impacts and potential risks. Fulfillment of social responsibility.
Regulatory Compliance Tracking	Monitoring of compliance status against new and changing regulatory requirements. Maintenance of compliance.

MLOps Management Process

Experiment Management

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Experiments provides tools for tracking and managing machine learning experiments, automatically recording experiment results such as training runs, parameters, metrics, and enabling comparative analysis.

Term	Description
Experiment Tracking	Activity of recording and managing settings, parameters, and results of each experiment in model development. Uses tools like MLflow, SageMaker Experiments.
Metadata Management	Management of associated information such as settings, environment, datasets, results related to experiments. Ensures reproducibility and traceability of experiments.
Hyperparameter Logging	History management of model hyperparameter settings. Used for tracking and comparative analysis of optimization processes.
Evaluation Metrics Tracking	Time-series recording and analysis of model performance indicators. Used for understanding improvement trends and comparative evaluation.
Artifact Management	Storage and management of artifacts such as models, checkpoints, plots. Streamlines storage and sharing of experiment results.
A/B Test Management	Design and result management of comparative experiments of multiple models. Supports statistical significance evaluation and decision making.
Experiment Environment Management	Management of development environment configurations, dependencies, resource settings, etc. Maintains reproducibility and consistency of environments.
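
For the statistical significance evaluation mentioned under A/B Test Management, one standard option when the outcome is a success rate is a two-proportion z-test. The sketch below uses only the standard library; the function name and the example counts are assumptions for illustration, and other tests may suit other metrics better.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference between two success rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both variants have all successes or all failures: no measurable difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: model A converts 100 of 1000, model B converts 150 of 1000.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(z, p_value)
```

A small p-value suggests the difference between the two models is unlikely to be due to chance alone, but the significance level and the minimum sample size should be decided before the experiment starts.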

Version Control

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Registry catalogs and version-manages ML models, managing model metadata.

Term	Description
Data Versioning	Change history management of training datasets. Uses tools like DVC (Data Version Control). Enables tracking of data lineage.
Model Versioning	Management of different versions of trained models. Implemented with tools like SageMaker Model Registry. Controls switching and rollback in production environments.
Code Version Control	Version management of model development code. Uses Git, etc. Setting of branch strategies and merge policies.
Configuration File Management	Version management of configuration files such as environment settings, parameter settings. Ensures consistency across environments.
Dependency Management	Version management of libraries and frameworks. Versions are specified explicitly in requirements.txt or a Dockerfile.
Tagging	Assigning meaningful tags to versions of models, data, code. Facilitates release management and tracking.
Baseline Management	Management of model versions that serve as benchmarks for performance comparison. Used as indicators for quality assurance.
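
A lightweight way to see how data versioning and tagging fit together is to fingerprint the dataset contents and store that fingerprint next to the code commit and model version. The sketch below is an illustration only: the function names and record fields are hypothetical, and tools such as DVC or SageMaker Model Registry manage this in their own ways.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """SHA-256 over rows serialized with sorted keys, so key order does not matter."""
    digest = hashlib.sha256()
    for row in rows:
        line = json.dumps(row, sort_keys=True, separators=(",", ":"))
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def version_record(tag, rows, code_commit, model_version):
    """Tie a release tag to the exact data, code, and model it was built from."""
    return {
        "tag": tag,
        "data_sha256": dataset_fingerprint(rows),
        "code_commit": code_commit,
        "model_version": model_version,
    }


rows = [{"id": 1, "label": "cat"}, {"id": 2, "label": "dog"}]
print(version_record("release-candidate", rows, "abc1234", 3))
```

Because the fingerprint changes whenever any row changes, comparing two records shows at a glance whether two model versions were trained on the same data.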

Documentation

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Cards creates and manages model documentation, centralizing management of detailed model information.

Term	Description
Model Card	A standardized document recording detailed information about the model. Includes usage, performance, limitations, ethical considerations, etc. Can be managed with SageMaker Model Cards.
Data Sheet	A document recording characteristics of datasets, collection methods, preprocessing procedures, license information, etc. Ensures transparency and reusability of data.
API Specification	Describes model interface specifications, input/output formats, endpoint information, etc. Managed in standard formats like OpenAPI/Swagger.
Experiment Report	A document summarizing the purpose, method, results, and discussion of experiments. Records important findings and decisions.
Operation Manual	A manual describing procedures for model deployment, monitoring, and maintenance. Includes incident response procedures.
Training Record	Detailed record of model training. Includes data preparation, parameter settings, the training process, and a summary of results.
Change History	Records important changes to models, data, code. Documents reasons for updates and scope of impact.
Risk Assessment Document	A document evaluating potential risks, biases, ethical considerations of the model. Also used as evidence for regulatory compliance.
Quality Assurance Document	Records test results, performance evaluations, validation procedures. Demonstrates compliance with quality standards.
Compliance Document	A document demonstrating compliance with regulatory requirements. Records responses to GDPR, AI governance, etc.
Architecture Diagram	Visual representation of system configuration, data flow, relationships between components.
Troubleshooting Guide	A guide listing common problems and their solution procedures. Contributes to improving operational efficiency.
Model Lineage Diagram	Visual representation of model development process, derivative relationships, important changes. Clarifies relationships between versions.
Performance Benchmark	Performance comparison results between different model versions. Used as evidence of improvement.
Deployment Plan	A plan describing model deployment strategy, schedule, risk countermeasures.

Orchestration Management

Term	Description
Pipeline Management	Automation and control of ML workflows. Manages the series of flows from data processing to inference.
Workflow Definition	Definition of each step in the ML process and its dependencies. DAG-based control flow design.
Automation Triggers	Setting of conditions and schedules for pipeline execution. Control of event-driven processing.
Error Handling	Implementation of anomaly detection and recovery mechanisms. Definition of fallback strategies.
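
The DAG-based workflow definition and error handling described above can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The step names, the `PIPELINE` mapping, and the `run_pipeline` function are assumptions for illustration; managed services such as SageMaker Pipelines provide their own way of defining steps.

```python
from graphlib import CycleError, TopologicalSorter

# Each step maps to the set of steps that must finish before it can start.
PIPELINE = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "register": {"evaluate"},
    "deploy": {"register"},
}


def run_pipeline(steps, dependencies):
    """Run step callables in dependency order; stop at the first failure."""
    try:
        order = list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        return [], f"invalid workflow: {exc}"
    executed = []
    for name in order:
        try:
            steps[name]()
        except Exception as exc:
            # Later steps are skipped so a bad model is never registered or deployed.
            return executed, f"{name} failed: {exc}"
        executed.append(name)
    return executed, None


steps = {name: (lambda: None) for name in PIPELINE}
print(run_pipeline(steps, PIPELINE))
```

Returning the list of completed steps along with the error makes it easier for a fallback strategy to decide whether to retry, roll back, or alert an operator.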

Quality Management

Term	Description
Quality Gates	Quality checkpoints before deployment. Verification of performance, security, compliance.
Test Automation	Automated execution system for unit tests, integration tests, performance tests. Continuous quality assurance.
Quality Metrics	Definition and monitoring of quality indicators for models and systems. Tracking of SLO/SLA compliance status.
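
A quality gate can be as simple as a table of thresholds checked before a deployment step is allowed to run. The sketch below assumes hypothetical metric names and threshold values; real gates would use whatever metrics and limits the team has agreed on.

```python
def check_quality_gates(metrics, gates):
    """Return a list of failure messages; an empty list means every gate passed."""
    failures = []
    for name, (kind, threshold) in gates.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: metric missing")
        elif kind == "min" and value < threshold:
            failures.append(f"{name}: {value} is below minimum {threshold}")
        elif kind == "max" and value > threshold:
            failures.append(f"{name}: {value} is above maximum {threshold}")
    return failures


# Hypothetical gates covering performance, latency, and security.
GATES = {
    "accuracy": ("min", 0.90),
    "p99_latency_ms": ("max", 200),
    "critical_vulnerabilities": ("max", 0),
}

candidate = {"accuracy": 0.93, "p99_latency_ms": 250, "critical_vulnerabilities": 0}
print(check_quality_gates(candidate, GATES))
```

Treating a missing metric as a failure is a deliberate choice here: a gate that silently passes when a measurement is absent gives a false sense of safety.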

Infrastructure Management

Term	Description
Resource Optimization	Efficient allocation and management of computational resources and storage. Cost optimization.
Scaling Management	Setting and monitoring of auto-scaling policies. Flexible resource adjustment according to demand.
Availability Management	Ensuring system redundancy and fault tolerance. Management of backup and disaster recovery plans.

Security Management

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Role Manager manages access permissions for ML activities, implementing security based on the principle of least privilege and providing appropriate access control.

Term	Description
Access Control	Role-based access management. Permission settings based on the principle of least privilege.
Data Protection	Encryption and anonymization of sensitive data. Implementation of privacy protection mechanisms.
Vulnerability Management	Detection and countermeasures for security vulnerabilities. Conducting regular security assessments.
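
As an illustration of least-privilege access control, the sketch below builds an IAM policy document that allows a caller to invoke one specific SageMaker endpoint and nothing else. The region, account ID, and endpoint name are placeholders, and the helper function is hypothetical; an actual policy should be reviewed against the permissions the workload really needs.

```python
import json


def invoke_only_policy(region, account_id, endpoint_name):
    """Build an IAM policy allowing InvokeEndpoint on a single endpoint."""
    resource = f"arn:aws:sagemaker:{region}:{account_id}:endpoint/{endpoint_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sagemaker:InvokeEndpoint"],
                "Resource": resource,
            }
        ],
    }


# Placeholder values for illustration only.
policy = invoke_only_policy("us-east-1", "123456789012", "example-endpoint")
print(json.dumps(policy, indent=2))
```

Scoping both the action and the resource keeps the permission narrow: the caller can request predictions from this endpoint but cannot create, update, or delete it.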

Governance Management

Term	Description
Policy Management	Formulation and compliance management of AI governance policies. Setting of ethical guidelines.
Audit Response	Maintenance of audit trails and management of audit response processes. Preparation of compliance evidence.
Risk Management	Identification, assessment, and implementation of mitigation measures for risks related to AI use. Continuous risk monitoring.

References:
Tech Blog with curated related content

Summary

In this article, I have compiled a "Glossary of AI and Machine Learning Terms Related to AWS" based on the knowledge I gained while studying to pass the newly added AWS certifications: AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate. I also included insights from co-authoring "Learning AWS Functions and History Through Quizzes: Selected 'Machine Learning' Edition" in "Compilation of Thin Books on AWS Vol.01", an individual publication for Japan's "Technical Book Fair 17".

I will continue to shape ideas that can be useful for learning and utilizing AWS.
Additionally, I plan to update this article periodically to reflect changes in AI and machine learning in AWS.

Written by Hidekazu Konishi