AI and Machine Learning Glossary for AWS - Knowledge Gained While Studying for AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate

In this article, I have compiled the knowledge I gained while studying to pass two newly added AWS certifications, the AWS Certified AI Practitioner and the AWS Certified Machine Learning Engineer - Associate, into a "Glossary of AI and Machine Learning Terms Related to AWS".

The knowledge in this glossary also appears in the questions and answers of "Learning AWS Functions and History Through Quizzes: Selected 'Machine Learning' Edition" in "Compilation of Thin Books on AWS Vol.01", a self-published book I co-authored for Japan's "Technical Book Fair 17".

I hope this will be helpful for those who are preparing to take the AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate exams.

AI/ML AWS Services

Amazon SageMaker

Service Name Description
Amazon SageMaker A fully managed service for efficiently building, training, and deploying machine learning models. An integrated platform that supports the entire ML lifecycle from development to production operation.
SageMaker Studio One of SageMaker's components, a browser-based integrated development environment (IDE). Enables one-stop execution from notebook creation to model development, training, and deployment. Achieves centralized management of ML workflows.
SageMaker Canvas One of SageMaker's components, a visual interface that allows building ML models through drag & drop without writing code. A no-code ML development environment for business analysts.
SageMaker Ground Truth One of SageMaker's components, a data labeling service for creating high-quality training datasets. Improves efficiency of human labeling tasks and provides semi-automated labeling workflows.
SageMaker Data Wrangler One of SageMaker's components, a tool for streamlining data preparation and preprocessing. Provides over 200 built-in transformation functions, enabling data cleansing to feature engineering via GUI.
SageMaker Feature Store One of SageMaker's components, a repository for centrally managing and sharing features. Ensures consistency of features in online/offline scenarios and promotes reuse across teams.
SageMaker JumpStart One of SageMaker's components, an ML hub providing pre-trained models and solutions. Deployable with one click and supports transfer learning and fine-tuning.
SageMaker Model Monitor One of SageMaker's components, continuously monitors model quality in production environments. Detects data drift and bias, enabling early detection of model performance degradation.
SageMaker Clarify One of SageMaker's components, evaluates bias detection and explainability of models. Ensures fairness and transparency, and analyzes the basis for model decisions.
SageMaker Debugger One of SageMaker's components, for debugging and monitoring the training process. Enables visualization of metrics and setting of alerts. Supports optimization of learning.
SageMaker Pipelines One of SageMaker's components, for orchestration of ML workflows. Builds reproducible ML pipelines and enables automated experiment management.
SageMaker Model Cards One of SageMaker's components, for creating and managing model documentation. Centralizes management of detailed model information, ensuring governance and compliance.
SageMaker Role Manager One of SageMaker's components, manages access permissions for ML activities. Implements security based on the principle of least privilege and provides appropriate access control.
SageMaker Experiments One of SageMaker's components, a tool for tracking and managing machine learning experiments. Automatically records experiment results such as training runs, parameters, and metrics, enabling comparative analysis.
SageMaker Model Registry One of SageMaker's components, a repository for cataloging and versioning ML models. Manages model metadata and approval status.
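
As a concrete illustration of the SageMaker workflow these components support, the following is a minimal sketch using the SageMaker Python SDK. It assumes an AWS account with an appropriate execution role; the role ARN, S3 paths, and container version are hypothetical placeholders, so treat this as an outline of the train-then-deploy pattern rather than a ready-to-run script.

```python
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # hypothetical execution role

# Resolve a prebuilt training image (built-in XGBoost; the version is an assumption)
image_uri = sagemaker.image_uris.retrieve(
    framework="xgboost", region=session.boto_region_name, version="1.7-1"
)

# Configure a training job: compute, container, and output location
estimator = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/models/",  # hypothetical bucket
    sagemaker_session=session,
)

# Launch training against data in S3, then deploy a real-time inference endpoint
estimator.fit({"train": "s3://my-bucket/train/"})  # hypothetical training data path
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```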

Amazon Bedrock

Service Name Description
Amazon Bedrock A secure, fully managed generative AI platform service. An integrated platform that allows access to multiple Foundation Models through a single API. Provides comprehensive features including Foundation Models (FMs) as the underlying large language models, Knowledge Bases for RAG construction, Agents for automation, Guardrails for harmful content filtering, and Prompt Flows for workflow execution.
Foundation Models (FMs) Large language models that serve as the foundation for text generation, image generation, etc. The core AI component providing the foundational functionality in Bedrock.
Knowledge Bases A Bedrock feature that provides information retrieval from external knowledge bases, enabling the construction of Retrieval Augmented Generation (RAG) architectures.
Agents A Bedrock feature that orchestrates procedural instructions, custom action execution, and use of Knowledge Bases in an integrated manner.
Guardrails A Bedrock feature that detects and filters harmful content and hallucinations, controlling AI output.
Prompt Flows A Bedrock feature that chains prompt execution, S3 data input/output, and Lambda function invocation into systematic workflows.
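
To make the "single API across multiple Foundation Models" point concrete, here is a minimal sketch of calling a Bedrock model with boto3's bedrock-runtime client. The model ID and the Anthropic Messages request format are assumptions that depend on which models are enabled in your account and region.

```python
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Request body in the Anthropic Messages format (model-specific; an assumption here)
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 256,
    "temperature": 0.5,  # lower = more deterministic output
    "messages": [{"role": "user", "content": "Summarize what Amazon Bedrock does."}],
})

response = client.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumed model ID
    body=body,
)
result = json.loads(response["body"].read())
print(result["content"][0]["text"])
```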

Amazon Q

Service Name Description
Amazon Q A generative AI-powered assistant service specifically designed for businesses. It comes in Business and Developer editions, specializing in business productivity improvement and development support respectively. It can be integrated with various AWS services, including Amazon Q in Amazon QuickSight, Amazon Q in Amazon Connect, Amazon Q in AWS Chatbot, Amazon Q network troubleshooting, and Amazon Q Data integration in AWS Glue.
Amazon Q Business A generative AI assistant designed to improve employee productivity. Supports automation and efficiency of general business tasks.
Amazon Q Developer A generative AI assistant specialized in coding support for developers. Supports development tasks such as code generation, debugging, and optimization.

Natural Language Processing Services

Service Name Description
Amazon Comprehend A natural language processing service that performs sentiment analysis, personal information detection, key phrase extraction, etc. from text. Custom model creation is also possible.
Amazon Kendra An advanced search service for enterprises. Provides context-aware search results for natural language queries. Easy integration with RAG.
Amazon Lex A service for building interactive interfaces (chatbots). Provides natural language understanding and dialogue management functions. Supports both voice and text.
Amazon Textract A service for extracting text and structured data from documents. Capable of handwriting recognition, form processing, and table analysis. Provides high-accuracy OCR functionality.
Amazon Translate An automatic translation service between multiple languages. Provides real-time translation between 74 languages. Supports custom terminology dictionaries.
Amazon Transcribe A speech-to-text (speech recognition) service. Capable of multiple speaker identification and customization of specialized terminology. Supports real-time transcription.
Amazon Polly A text-to-speech (speech synthesis) service. Provides natural pronunciation and Neural text-to-speech. Supports multiple languages and voice types.
Amazon CodeWhisperer An AI coding companion for programming assistance. Provides code completion and suggestions. Its functionality has since been folded into Amazon Q Developer.
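
Several of these services are available through simple boto3 calls. As a rough sketch (assuming credentials with Comprehend and Translate permissions; the region is arbitrary), sentiment analysis and translation look like this:

```python
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")
sentiment = comprehend.detect_sentiment(Text="I love this product!", LanguageCode="en")
print(sentiment["Sentiment"], sentiment["SentimentScore"])  # e.g. POSITIVE plus scores

translate = boto3.client("translate", region_name="us-east-1")
translated = translate.translate_text(
    Text="Hello, world", SourceLanguageCode="en", TargetLanguageCode="ja"
)
print(translated["TranslatedText"])
```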

Image and Video Processing Services

Service Name Description
Amazon Rekognition An image and video analysis service. Provides face recognition, object detection, text extraction, content moderation, celebrity recognition, etc. Supports both real-time analysis and batch processing.
Amazon Lookout for Vision An anomaly detection service using industrial image analysis. Used for product defect detection in manufacturing lines, etc.

Other AI-Related Services

Service Name Description
Amazon Personalize A service that provides personalized recommendations. Enables product recommendations and related content suggestions based on user behavior data. Supports real-time recommendations.
Amazon Pinpoint A customer engagement service. Provides ML-powered segmentation, user behavior analysis, and optimal delivery time prediction functions. Enables multi-channel communication through email, SMS, push notifications, etc.
Amazon Fraud Detector A machine learning-based fraud detection service. Detects online fraudulent transactions, account takeovers, fake account creation, etc. Can be used in combination with custom rules and ML models.
Amazon Augmented AI (A2I) A service that manages human review task execution. Enables building workflows for human review of machine learning prediction results.
Amazon Mechanical Turk (MTurk) A crowdsourcing marketplace. Enables execution of tasks such as data labeling and content moderation by humans. Can be integrated with Amazon SageMaker Ground Truth and Amazon Augmented AI (A2I).
Amazon QuickSight A BI (Business Intelligence) tool. Equipped with ML predictive analysis capabilities, enabling data visualization and analysis. Supports data analysis in natural language through Q function.

Data Storage and Database Solutions

Service Description
Amazon S3 Scalable object storage. Optimal for building data lakes. High durability and availability.
Amazon EFS Fully managed scalable file storage. Shareable across multiple instances. Supports NFS protocol.
Amazon FSx for Lustre A high-performance file system capable of directly processing large-scale datasets. Seamlessly integrates with Amazon S3, automating data loading from and writing back to S3, and accelerates workloads with hundreds of GB/s of parallel throughput.
Amazon DynamoDB Fully managed NoSQL database. Enables fast read and write operations. Automatic scaling feature.
Amazon Redshift Petabyte-scale data warehouse. Enables fast query processing. Columnar storage.
Amazon OpenSearch Service An Elasticsearch-compatible search and analytics engine service. In addition to full-text search and real-time analytics, it provides vector database functionality supporting neural search and k-Nearest Neighbors (k-NN) vector search. Suited to log analysis, application search, and security analytics, as well as AI applications such as recommendations and semantic search. Integration with large language models through the OpenSearch Neural Search functionality enables advanced search capabilities.
Amazon DocumentDB MongoDB-compatible document database. Equipped with vector search functionality. Scalable document management.

Basic Concepts of Machine Learning

AI/ML Fundamental Concepts

Term Description
Artificial Intelligence (AI) Computer systems that exhibit human-like intelligent behavior. Possess capabilities such as learning, reasoning, and problem-solving. Includes specialized AI for specific tasks and general AI for broad intelligence.
Machine Learning (ML) Algorithms or systems that learn patterns from data and perform tasks without explicit programming. Realized through a combination of statistical methods and algorithms.
Deep Learning A machine learning technique using multi-layer neural networks. Demonstrates high performance in image recognition, natural language processing, etc. Requires large amounts of data and computational resources.
Feature Individual variables or attributes used as input to a model. Meaningful information extracted from data.
Label Correct answer data in supervised learning. Target values or classification categories that the model should predict.
Instance Individual data points. Composed of a combination of features and labels.
Batch A set of data processed simultaneously during model training. Affects memory efficiency and training speed.
Epoch A unit representing one complete processing of all training data. Model is gradually improved through multiple epochs of learning.
Iteration One update of model parameters. Often refers to processing per batch.
Parameters Values optimized by the model during the learning process. Includes weights and biases.
Hyperparameters Control parameters set before model learning. Includes learning rate and batch size.
Inductive Bias Assumptions or hypotheses inherent in the model. Characterizes the nature of the learning algorithm.
Generalization Performance The model's predictive ability on unseen data. Balancing overfitting and underfitting is important.

Generative AI Related Concepts

Term Description
Foundation Model (FM) A general-purpose AI model pre-trained on large-scale data. Adaptable to various tasks. Forms the basis for transfer learning and fine-tuning.
Large Language Model (LLM) A large-scale foundation model specialized in natural language processing. Examples include GPT and BERT. Demonstrates high performance in text generation and comprehension tasks.
RAG (Retrieval-Augmented Generation) A method to improve the output quality of generative AI by searching and referencing external knowledge. Effective in preventing hallucination and improving accuracy.
Prompt Input text to generative AI models. Instructions or context to control the model's output.
Token The smallest unit into which text is divided for model input and output. Composed of words or subwords. The basis for models' input/output limits.
Temperature An inference parameter that controls the randomness of generation. Higher values lead to more diverse outputs, lower values to more deterministic outputs.
Top-p sampling A decoding method that selects the next token from the smallest set of tokens whose cumulative probability reaches p. Controls the balance between output diversity and quality.
Top-k sampling A decoding method that selects the next token from the top k tokens by probability. Used to control output.
Context window The maximum length of input, measured in tokens, that the model can process at once. Affects the understanding of long contexts.
In-context learning The ability to learn tasks through examples within the prompt. Adaptation without additional training.
Fine-tuning The process of adapting a foundation model to specific tasks or domains. Specialization through additional learning.
Prompt engineering The technique of designing effective prompts. Improves the quality and consistency of outputs.
Hallucination The phenomenon where a model generates information not based on facts. A challenge for reliability.
Style transfer A generative technique to change the style of existing content. Used for images and text.
Latent space A compressed representation space of data learned by generative models. Controls the diversity of generation.
Attention mechanism A mechanism to focus on important parts of the input. Core technology of Transformer models.
Self-attention mechanism A mechanism to learn relationships between elements within a sequence. Effective for capturing long-range dependencies.
Decoder The part that generates the desired output from latent representations. An important component of generative models.
Encoder The part that converts input into latent representations. Responsible for information compression and feature extraction.
Transformer An architecture based on self-attention mechanism. The foundation of modern generative AI.
Multimodal The ability to handle multiple data formats such as text, images, and audio.
Zero-shot capability The ability to perform new tasks from pre-training alone, without any examples in the prompt.
Few-shot capability The ability to perform new tasks given only a few examples in the prompt. Efficient adaptive learning.
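
Temperature, top-k, and top-p are easiest to understand as operations on the model's output distribution. The sketch below (plain NumPy, with an arbitrary toy vocabulary) shows one plausible way these three controls combine when sampling the next token; real inference stacks differ in details such as ordering and masking.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, seed=None):
    """Sample a token id from raw logits using temperature, top-k, and top-p."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    probs = np.exp(logits - logits.max())   # softmax (stabilized)
    probs /= probs.sum()
    if top_k is not None:                   # keep only the k most probable tokens
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)
    if top_p is not None:                   # smallest set with cumulative prob >= top_p
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cumulative, top_p) + 1]
        mask = np.zeros_like(probs)
        mask[keep] = 1.0
        probs *= mask
    probs /= probs.sum()                    # renormalize after filtering
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.5, -1.0]              # toy 4-token vocabulary
print(sample_next_token(logits, temperature=0.7, top_k=3, top_p=0.9, seed=0))
```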

Machine Learning Approaches

Term Description
Parametric learning An approach where the model shape is fixed and the number of parameters is constant. Examples include linear regression and logistic regression.
Non-parametric learning An approach where the complexity of the model changes according to the data. Examples include k-NN and kernel methods.
Ensemble learning An approach that combines multiple learners to improve performance. Examples include Random Forest and Boosting.

Foundational Learning Theories

Term Description
Maximum Likelihood Estimation A method to estimate parameters by maximizing the probability of obtaining the data.
Bayesian Estimation A method to estimate parameters by calculating posterior probability from prior probability and data likelihood.
Empirical Risk Minimization A principle to minimize prediction errors on training data.
Structural Risk Minimization A principle to minimize prediction errors while considering model complexity.

Loss Functions

Term Description
Squared Loss The square of the difference between predicted and actual values. Commonly used in regression problems.
Cross-Entropy Loss A loss function used in classification problems. Measures the distance between probability distributions.
Hinge Loss A loss function used in SVMs. Achieves margin maximization.
Huber Loss A loss function robust to outliers. A combination of squared loss and absolute loss.
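
For reference, these four loss functions can be written directly from their definitions. A minimal NumPy sketch (element-wise, unreduced losses):

```python
import numpy as np

def squared_loss(y_true, y_pred):
    return (y_true - y_pred) ** 2

def cross_entropy_loss(p_true, p_pred, eps=1e-12):
    # distance between a true distribution and predicted probabilities
    return -np.sum(p_true * np.log(p_pred + eps))

def hinge_loss(y_true, score):
    # y_true in {-1, +1}; zero loss once the margin exceeds 1
    return np.maximum(0.0, 1.0 - y_true * score)

def huber_loss(y_true, y_pred, delta=1.0):
    # quadratic near zero, linear for large errors (robust to outliers)
    err = np.abs(y_true - y_pred)
    return np.where(err <= delta, 0.5 * err ** 2, delta * (err - 0.5 * delta))
```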

Optimization Theory

Term Description
Convex Optimization A class of optimization problems in which any local optimum is also a global optimum.
Stochastic Optimization A method that uses randomness to search for optimal solutions.
Constrained Optimization An optimization problem under constraints. Solved using methods like Lagrange multipliers.

Terms Related to Learning Process

Term Description
Vanishing Gradient Problem A phenomenon in deep neural networks where gradients vanish during backpropagation. Makes learning difficult in deep layers.
Exploding Gradient A phenomenon in deep neural networks where gradients grow exponentially. Causes instability in learning.
Sparsity A property where many of the data or model parameters are zero. Affects computational efficiency and generalization performance.
Curse of Dimensionality A problem where the required amount of data increases exponentially as the number of feature dimensions increases. A challenge in high-dimensional data analysis.

Data Quality Related Terms

Term Description
Data Imbalance A state where there is a large difference in the number of samples between classes. Makes learning difficult for minority classes.
Noise Unwanted variations or errors in data. Can hinder model learning.
Outliers Values that deviate significantly from the general distribution of data. Can negatively affect model learning.
Missing Values Unrecorded or unmeasured values in a dataset. Requires appropriate handling.

Model Evaluation Related Terms

Term Description
Baseline A simple model or performance metric used as a comparison standard. Used to evaluate improvement.
Significance Testing A method to evaluate whether performance differences between models are statistically meaningful.
Cross-Entropy A metric that measures the difference between predicted probabilities and true distribution in classification problems.
Confusion Matrix A table that aggregates prediction results by classification. Used for performance evaluation.

Activation Functions

Term Description
ReLU The most commonly used activation function. A simple non-linear function that sets negative inputs to zero.
Sigmoid Converts output to a range of 0-1. Often used in the output layer for binary classification.
tanh Converts output to a range of -1 to 1. Mitigates the vanishing gradient problem better than sigmoid.
Softmax Outputs probability distribution for multiple classes. Used in the output layer for multi-class classification.
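
These four activation functions are short enough to define directly; a NumPy sketch for reference:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)          # zero for negative inputs

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))    # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                  # squashes to (-1, 1)

def softmax(x):
    e = np.exp(x - np.max(x))          # subtract max for numerical stability
    return e / e.sum()                 # probability distribution over classes
```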

Types of Learning Algorithms

Term Description
Perceptron The most basic neural network. Suitable for linearly separable problems.
SVM (Support Vector Machine) A method that determines the classification boundary by maximizing margin. Non-linear classification is possible with kernel trick.
Decision Tree A method that makes predictions by hierarchically dividing data. High interpretability and easy evaluation of feature importance.
k-Nearest Neighbors A method that makes predictions based on the majority of the k nearest training data. Simple but computationally expensive.

Data Quality Indicators

Term Description
Data Completeness An indicator of the degree of missing values, duplicates, and inconsistencies in a dataset.
Data Consistency An indicator of whether data formats and value ranges are as expected.
Data Freshness An indicator of the update time and expiration date of data.
Data Representativeness An indicator of whether the sample appropriately represents the population.

Model Quality Indicators

Term Description
Prediction Stability An indicator of prediction consistency for similar inputs.
Model Confidence An indicator of the model's confidence in each prediction.
Explainability An indicator of the ease of interpreting the reasons for model predictions.
Robustness An indicator of the model's resistance to noise and outliers.

Statistical Concepts

Term Description
Analysis of Variance A statistical method for analyzing sources of variation in data.
Hypothesis Testing A method for verifying statistical hypotheses.
Confidence Interval A range that quantifies the uncertainty of an estimate.
Effect Size An indicator of the practical magnitude of statistical differences.

Model Development Process

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Studio provides an integrated development environment (IDE) that enables one-stop execution from notebook creation to model development, training, and deployment, realizing centralized management of ML workflows.
* Amazon SageMaker Canvas provides a no-code ML development environment that allows data preparation to model deployment through drag & drop without writing code, enabling development for business analysts.

Model Development Process

Phase Description
Data Collection Collection and integration of data necessary for learning. Includes identification of data sources and quality checks. Consider data representativeness and balance.
Data Preprocessing Implement data splitting, cleaning, data labeling, feature engineering, scaling (normalization, standardization). Improve data quality and convert to a format suitable for learning. Include handling of missing values and outliers.
Model Selection Select appropriate algorithms and architectures based on problem type (classification/regression, etc.), data characteristics, requirements (accuracy/speed/explainability). Consider computational resource constraints and deployment environment. Also consider the possibility of using pre-trained models.
Model Training Train the model using the selected algorithm. Include hyperparameter optimization. Conduct performance evaluation through cross-validation.
Model Evaluation Verify model performance. Analyze from multiple angles using various evaluation metrics. Confirm generalization performance on test data.
Deployment Deploy the model to the production environment. Include scaling and monitoring settings. Also conduct A/B testing to verify effectiveness.
Inference Execute predictions on new data using the deployed model. Perform predictions in real-time or batch processing.
Monitoring Continuously monitor model performance. Detect drift and make decisions on retraining. Track quality metrics and set alerts.

Data Collection

Types of Data

Type Description
Structured Data Data organized in tabular form. Such as data managed in RDBMS. Has a clear schema.
Unstructured Data Data without a fixed structure. Such as text, images, audio, video. Requires special techniques for processing.
Semi-structured Data Partially structured data. Such as JSON, XML, HTML. Has a flexible schema.
Vector Data Data represented as numerical vectors. Such as word embeddings, feature vectors. Suitable for similarity calculations.

ETL (Extract, Transform, Load)

Phase Description
Extract Extract data from various sources. Check data format and quality. Perform consistency checks.
Transform Transform and process data. Execute cleansing, normalization, aggregation, etc. Transform according to business rules.
Load Save and load processed data. Store in data warehouses or data lakes. Ensure consistency.

Data Preprocessing

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Data Wrangler provides tools to streamline data preparation and preprocessing, enabling data cleansing to feature engineering via GUI.
* Amazon SageMaker Canvas provides functionality to perform data preprocessing, feature engineering, data transformation, etc. via GUI without writing code, enabling data preparation by business analysts.

Data Splitting

Term Description
Training Data Dataset used for model learning. Typically accounts for about 60-80% of all data.
Validation Data Dataset used for model hyperparameter tuning and performance evaluation. Typically accounts for about 10-20% of all data.
Test Data Independent dataset used for final model evaluation. Typically accounts for about 10-20% of all data.
Holdout Method Basic method of splitting data into training and evaluation sets. Used when data quantity is sufficient.
Stratified Sampling Method of splitting data while maintaining class ratios. Important for imbalanced datasets.
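
As a concrete sketch of the 60/20/20 split with stratified sampling described above (using scikit-learn and its bundled iris dataset purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First hold out the test set, then split the remainder into train/validation.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=42
)
# Result: 60% train, 20% validation, 20% test, with class ratios preserved.
```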

Cleansing (Cleaning)

Task Description
Noise Removal Detection and removal of outliers and noise. Improves data quality. Utilizes statistical methods and domain knowledge.
Missing Value Handling Completion or removal of missing data. Impute with mean, median, predicted values, etc. Consider MAR (Missing At Random) and MCAR (Missing Completely At Random) assumptions.
Outlier Detection Identification of outliers using statistical methods or ML techniques. Important to check consistency with domain knowledge.
Duplicate Data Elimination Detection and removal of duplicate records. Ensures data consistency. Normalization of key items.

Data Labeling

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Ground Truth provides a data labeling service for creating high-quality training datasets.

Method Description
Manual Labeling Direct labeling by humans. High quality but time and cost intensive. Effective when specialized knowledge is required.
Semi-Automatic Labeling Combination of AI prediction and human verification. Enables efficient labeling. Achieves balance between quality and efficiency.
Active Learning Selection of target data for efficient labeling. Prioritizes data with high uncertainty. Optimizes labeling costs.
Label Quality Management Checks for consistency and errors. Includes consensus building among multiple annotators. Setting and monitoring of quality metrics.

Feature Engineering

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Feature Store provides a centralized repository for managing and sharing features, ensuring consistency of features both online and offline.

Technique Description
Feature Selection Selection of useful input variables for the model. Selection based on correlation analysis and importance evaluation. Contributes to dimensionality reduction and model performance improvement. Used when there are many features or when you want to remove unnecessary features to prevent model overfitting. Particularly effective when analyzing datasets with many variables, such as medical or financial data.
Feature Extraction The process of extracting meaningful features from raw data. Examples include Fourier transform in signal processing and edge detection from images. Used when there is a need to extract useful information from complex raw data, especially important in image processing, speech processing, and sensor data analysis. In time series data analysis, it is utilized for extracting statistical measures and frequency characteristics.
Feature Scaling The process of adjusting the value range of features. Includes standardization and normalization. Essential when dealing with datasets that have features on different scales, particularly important for machine learning algorithms using gradient descent methods and clustering algorithms that perform distance calculations.
Feature Interaction Creation of new features by combining multiple features. Used when wanting to capture non-linear relationships or model phenomena that cannot be explained by individual features alone. Particularly utilized in regression analysis and predictive models to improve prediction accuracy.
Dimensionality Reduction Techniques for reducing the feature dimensions of data. PCA and t-SNE are representative methods. Contributes to improved computational efficiency and performance. Used when visualization of high-dimensional data is necessary or when wanting to reduce computational costs. Particularly useful when dealing with high-dimensional data in image recognition and document classification.
Encoding Numerical conversion of categorical values. Uses methods such as One-Hot, Label, and Target encoding. Select appropriate methods based on data characteristics. Essential when building machine learning models that handle categorical data, especially important when there are many categories or when relationships between categories need to be considered.
Embedding Conversion of high-dimensional data into low-dimensional vector representations. Examples include Word2Vec and BERT. Preserves semantic similarity. Used when dealing with text data or large-scale categorical data, playing a particularly important role in natural language processing and recommender system construction.
Data Augmentation Enhancement of training data by transforming existing data. Includes rotation, scaling, etc. Improves model generalization performance. Used when training data is limited or when wanting to prevent model overfitting. Particularly effective in tasks using deep learning, such as image recognition and speech recognition.

Encoding Techniques

Encoding Technique Description and Use Cases
Label Encoding A technique that converts categorical values into continuous integer values. Suitable when there is an ordinal relationship between categories (e.g., education level, age group). Memory-efficient and often used in decision tree-based algorithms. However, caution is needed for non-ordinal categorical data as it introduces a numerical relationship between categories.
One-Hot Encoding A technique that converts categorical values into binary vectors. Optimal for cases where there is no ordinal relationship between categories (e.g., color, gender, occupation). Treats each category equally, but can lead to high memory consumption and increased computational cost due to dimensionality increase when there are many categories. Particularly important in linear models and neural networks.
Target Encoding A technique that replaces categorical values with the mean of the target variable. Effective when there are a very large number of categories or when there is a strong association between categories and the target variable. However, there is a risk of overfitting, so appropriate regularization and cross-validation are necessary. Particularly useful for improving performance in predictive models.
Frequency Encoding A technique that converts categories to numerical values based on their frequency of occurrence. Suitable when the frequency of categories holds significant meaning (e.g., product popularity, usage frequency). Easy to implement and interpret, but has the limitation of not being able to distinguish between categories with the same frequency.
Binary Encoding A technique that converts categories into binary representations. More memory-efficient than One-Hot Encoding as it can represent with fewer dimensions, useful when there are many categories. However, the generated features can be difficult to interpret, and relationships between categories may be lost.
Hash Encoding A technique that uses a hash function to convert categories into fixed-dimensional features. Suitable for cases with an extremely large number of categories or when new categories are continuously added. Memory-efficient and can handle online learning, but there is a possibility of information loss due to hash collisions.
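
A small pandas/scikit-learn sketch of three of these encodings on a toy DataFrame (column names and data are made up for illustration):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"color": ["red", "green", "blue", "green"],
                   "size": ["S", "M", "L", "M"]})

# One-Hot Encoding: one binary column per category (no order implied)
one_hot = pd.get_dummies(df, columns=["color"])

# Label Encoding: each category becomes an integer (implies an order, so use with care)
df["size_encoded"] = LabelEncoder().fit_transform(df["size"])

# Frequency Encoding: replace each category by its relative frequency
df["color_freq"] = df["color"].map(df["color"].value_counts(normalize=True))

print(one_hot)
print(df)
```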

Feature Extraction Techniques for Text Data

Technique Description
TF-IDF A technique that calculates word importance based on term frequency and inverse document frequency. A fundamental feature in text analysis. Widely used in document classification, information retrieval, keyword extraction, and other tasks where word importance needs to be considered.
Word2Vec A technique that converts words into fixed-length dense vectors. Captures semantic similarity between words. Used in natural language processing tasks that require consideration of word meaning relationships and context, such as sentiment analysis, document classification, question-answering systems, and machine translation.
Doc2Vec An extension of Word2Vec that learns vector representations for entire documents. Used for document similarity calculations. Suitable for tasks requiring semantic comparison at the document level, such as document classification, clustering, recommender systems, and similar document search.
FastText A word embedding technique that incorporates subword (character n-gram) information. Capable of handling unknown words. Particularly effective in processing languages with rich morphology, handling text with spelling errors, and analyzing social media posts where new or modified words frequently appear.
BERT Tokenization A tokenization technique for BERT models. Uses the WordPiece algorithm. Used as essential preprocessing when using BERT models for advanced natural language processing tasks that consider context, such as sentiment analysis, named entity recognition, and question answering.
BPE (Byte Pair Encoding) A technique that learns subword units by merging frequent character strings. Allows control of vocabulary size. Particularly effective in machine translation, multilingual processing, and processing languages with complex morphology where efficient handling of large vocabularies is necessary.
Bag of Words (BoW) The most basic technique that vectorizes word frequency in documents. Does not consider word order. Used in basic text analysis tasks where word frequency alone is sufficient for performance, such as spam email detection, document classification, and topic classification.
n-gram A technique that uses combinations of n consecutive words or characters as features. Captures local context. Used in language modeling, spell checking, author identification, predictive input for programming languages, and other cases where local context or word order is important.
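
Bag of Words, TF-IDF, and n-grams are all available in scikit-learn's vectorizers; a minimal sketch on a toy corpus:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs"]

bow = CountVectorizer()                       # Bag of Words: raw term counts
X_bow = bow.fit_transform(docs)

tfidf = TfidfVectorizer(ngram_range=(1, 2))   # unigram + bigram TF-IDF features
X_tfidf = tfidf.fit_transform(docs)

print(X_bow.shape, X_tfidf.shape)
print(tfidf.get_feature_names_out()[:5])      # a few of the learned n-gram features
```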

Text Data Preprocessing

Method Description
Tokenization Splitting text into words or substrings. Selection of appropriate splitting method based on language characteristics.
Normalization Implement unification of uppercase and lowercase, accent removal, character type standardization, etc.
Stop Word Removal Removal of common words with little information (articles, prepositions, etc.).
Lemmatization/Stemming Convert words to their base form. Use morphological analysis or stemming.
Noise Removal Removal of special characters, HTML/XML tags, unnecessary spaces, etc.
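
A plain-Python sketch chaining several of these steps (the stop word list here is a tiny illustrative one, not a standard resource):

```python
import re

STOP_WORDS = {"the", "a", "an", "on", "in", "of", "and"}  # illustrative only

def preprocess(text):
    text = text.lower()                        # normalization: unify case
    text = re.sub(r"<[^>]+>", " ", text)       # noise removal: strip HTML/XML tags
    tokens = re.findall(r"[a-z0-9']+", text)   # tokenization: split into word tokens
    return [t for t in tokens if t not in STOP_WORDS]  # stop word removal

print(preprocess("<p>The cat sat on the mat.</p>"))
# -> ['cat', 'sat', 'mat']
```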

Image Data Preprocessing

Method Description
Resize Standardization of image size. Convert to a size suitable for model input.
Normalization Standardization of pixel values. Generally converted to a range of 0 to 1 or -1 to 1.
Color Space Conversion Conversion to color spaces such as RGB, HSV, grayscale according to purpose.
Noise Removal Noise reduction using median filters or Gaussian filters.
Data Augmentation Enhancement of training data through rotation, flipping, scaling, etc.

Scaling Methods

Method Description
Normalization A transformation that fits data into a specific range (usually 0-1). Unifies the scale between features, making them comparable.
Standardization A transformation that converts data to a distribution with mean 0 and standard deviation 1. Susceptible to outliers but suitable for normally distributed data.
Standard Scaler A scaler that implements standardization. Converts to mean 0 and standard deviation 1. Suitable for data following normal distribution.
Robust Scaler Robust scaling using median and interquartile range. Less affected by outliers.
Min Max Scaler A scaler that implements normalization. Converts data to a specified range such as 0-1. Suitable for neural network inputs.
Max Absolute Scaler Normalizes by the maximum absolute value, scaling data to the range -1 to 1 without shifting it. Suitable for sparse data because zero entries are preserved.
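
The scalers above map directly onto scikit-learn classes. A short sketch showing how an outlier affects each (toy one-column data):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler

X = np.array([[1.0], [2.0], [3.0], [100.0]])      # 100.0 acts as an outlier

print(StandardScaler().fit_transform(X).ravel())  # mean 0, std 1; pulled by the outlier
print(MinMaxScaler().fit_transform(X).ravel())    # squeezed into [0, 1]
print(RobustScaler().fit_transform(X).ravel())    # median/IQR based; outlier matters less
```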

Model Selection

[Amazon SageMaker components useful in this category]
* Amazon SageMaker JumpStart functions as an ML hub providing pre-trained models and solutions, deployable with one click.

Model Types

Classification Description
Supervised Learning Models Predictive models for classification or regression. Uses labeled data. Examples include RandomForest, SVM, Neural Networks.
Unsupervised Learning Models Models for clustering and pattern discovery. Uses unlabeled data. Examples include K-means, PCA, Auto-encoders.
Semi-Supervised Learning Models Models that learn using a small amount of labeled data and a large amount of unlabeled data. Examples include pseudo-labeling, co-training.
Generative Models Models that generate data or learn distributions. Examples include GANs, VAE, Diffusion Models.
Transfer Learning Models Models that utilize pre-trained knowledge. Based on foundation models such as BERT, GPT, ResNet.

Selection Criteria

Criteria Description
Data Characteristics Model selection based on data quantity, dimensionality, presence of noise, class balance, sparsity, etc.
Computational Resources Selection based on available memory, CPU/GPU, and training time constraints.
Prediction Performance Selection based on target performance metrics such as accuracy, recall, F1 score.
Inference Speed Selection based on real-time requirements, batch processing requirements.
Explainability Selection based on requirements for model transparency and interpretability.
Scalability Selection based on ability to handle increases in data volume and system expansion.
Cost Selection based on total cost of ownership including development, training, and operation.

Common Algorithms

Algorithm Application Scenarios
Linear Regression Simple regression problems, when interpretation of relationships is important.
Logistic Regression Binary classification problems, when probability prediction is needed.
Random Forest Classification and regression of structured data, analysis of feature importance.
XGBoost/LightGBM High-performance prediction problems with structured data.
Neural Networks Complex pattern recognition, image and speech processing.
BERT/Transformer Text processing, natural language understanding tasks.
CNN Image recognition, pattern detection.
RNN/LSTM Time series data analysis, sequence data processing.
Reinforcement Learning Models Decision making, game strategies, robot control, etc.

Architecture Considerations

Element Description
Model Size Consideration of number of parameters, memory requirements, storage requirements.
Layer Configuration Selection of number of layers, number of units, activation functions for neural networks.
Ensemble Methods Methods of combining multiple models, voting or averaging strategies.
Quantization and Compression Model lightweighting, adaptation to edge deployment.
Batch Size Balance between memory usage and speed during training and inference.
Distributed Learning Support Possibility of learning on multiple GPUs or multi-nodes.

Model Selection Strategies

Strategy Description
Baseline Construction Strategy to start with simple models and gradually increase complexity.
AutoML Utilization of tools for automatic model selection and optimization.
Algorithm Comparison Evaluate multiple models in parallel and select the optimal one.
Experiment Management Tracking and documentation of the model selection process.
A/B Testing Comparison of different models' performance in real environments.
Gradual Optimization Strategy to optimize while gradually increasing model complexity.
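
Combining several of these strategies (a simple baseline, algorithm comparison, and cross-validated evaluation) might look like the following scikit-learn sketch, using a bundled dataset for illustration:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "baseline_logreg": make_pipeline(StandardScaler(),
                                     LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Evaluate each candidate with 5-fold cross-validation on the same metric
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: F1 = {scores.mean():.3f} +/- {scores.std():.3f}")
```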

Model Training

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Debugger performs debugging and monitoring of the training process, enabling visualization of metrics and setting of alerts.

Basic Concepts of Model Learning

Concept Description
Inductive Bias Assumptions or hypotheses inherent in the model. Determines the nature of the learning algorithm. Appropriate bias improves generalization performance.
Bias-Variance Tradeoff The tradeoff relationship between model complexity and generalization performance. Balance between overfitting and underfitting.
Cross-Entropy Loss A loss function commonly used in classification problems. Measures the difference between predicted probabilities and true distribution.

Training Methods

Method Description
Supervised Learning A method of learning using data with correct labels. Used for classification and regression tasks. The quality and quantity of data determine performance.
Unsupervised Learning A method to discover patterns from unlabeled data. Used for clustering and anomaly detection. Reveals latent structures in data.
Semi-Supervised Learning A method of learning using a small amount of labeled data and a large amount of unlabeled data. Achieves high performance while suppressing labeling costs.
Reinforcement Learning A method to learn actions that maximize rewards through interaction with the environment. Acquires optimal action policies through trial and error.
Transfer Learning A method to apply knowledge learned from one task to another task. Improves learning efficiency by utilizing pre-trained models.
Batch Learning A learning method that processes all training data at once. Enables stable learning with good computational efficiency. Requires retraining when data is updated.
Online Learning A learning method that processes data sequentially and continuously updates the model. Quick adaptation to new patterns. Risk of instability.
Incremental Learning A method to perform additional learning on existing models with new data. Enables model updates without complete retraining.
Pre-training The basic process of training a model from scratch with large-scale data. Acquires general knowledge and patterns. Forms the foundation for subsequent tasks.
Fine-tuning A method to adapt pre-trained models to specific tasks. Achieves high performance even with small amounts of data. A type of transfer learning.
Continuous Pre-training Periodic retraining with new data. Effective for maintaining performance of domain-specific models. Important as a drift countermeasure.
RLHF (Reinforcement Learning from Human Feedback) A method of refining models with reinforcement learning guided by human feedback. Improves quality and safety of generative AI models. Addresses alignment problems.
Custom Vocabulary Learning A method to train models on specialized terminology in specific fields. Important for building domain-specific models. Contributes to improving expertise.
Meta-learning A method to learn the learning algorithm itself. Realizes high flexibility and efficiency for new tasks. The foundation of automatic ML systems. Enables quick adaptation to new tasks.
Few-shot Learning A method that enables learning from a small number of samples. A form of meta-learning.
Zero-shot Learning A method that can infer even classes not seen during learning. An advanced form of transfer learning.
Self-Supervised Learning A method to learn by automatically generating teaching signals from unlabeled data. Effective for pre-training.
Multi-task Learning A method to learn multiple tasks simultaneously. Enables efficient learning through knowledge sharing between tasks.
Federated Learning A method to perform cooperative learning on multiple clients while keeping data distributed. Effective for privacy protection.
Knowledge Distillation A method to transfer knowledge from a large model to a small model. Used for model lightweighting.

Model Optimization Methods

Method Description
Hyperparameter Tuning Optimization of model configuration parameters. Uses grid search, random search, Bayesian optimization, etc. Consider the balance between computational cost and performance.
Regularization Addition of penalty terms to prevent overfitting. L1 (Lasso), L2 (Ridge) regularization, etc. Controls model complexity.
Early Stopping Ends learning when improvement in validation performance is no longer observed, preventing overfitting. Also contributes to efficient use of computational resources.
Cross-validation Conducts evaluation by dividing data into multiple parts. Accurately estimates model's generalization performance. Particularly effective when data quantity is limited.
Ensemble Learning Improves performance by combining multiple models. Random Forest, Gradient Boosting, etc. Compensates for weaknesses of individual models.
Gradient Descent A method to search for optimal solutions by updating parameters based on the gradient of the loss function. There are variations such as Stochastic Gradient Descent (SGD) and Mini-batch Gradient Descent.
Learning Rate Adjustment Adjustment of parameters that control the update amount in gradient descent. There are adaptive methods such as AdaGrad, Adam, RMSprop.
Momentum A method to accelerate optimization by using past gradient information. Effective for avoiding local optima and accelerating convergence.
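
Gradient descent with momentum reduces to a few lines of NumPy. This sketch fits a linear model on synthetic data (all values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)     # synthetic regression data

w = np.zeros(3)
velocity = np.zeros(3)
lr, momentum = 0.1, 0.9
for step in range(100):
    grad = 2 * X.T @ (X @ w - y) / len(y)       # gradient of the mean squared error
    velocity = momentum * velocity - lr * grad  # momentum accumulates past gradients
    w += velocity

print(w)  # converges toward true_w
```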

Optimizers (Optimization Algorithms)

Term Description
SGD (Stochastic Gradient Descent) A basic optimization algorithm that calculates gradients and updates parameters on a mini-batch basis.
Adam A popular optimization algorithm that combines momentum and adaptive learning rates. Shows good convergence in many cases.
RMSprop An adaptive optimization algorithm that considers past gradients using exponential moving average.
AdaGrad An adaptive optimization algorithm that applies different learning rates for each parameter.
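
Adam's update rule, written out explicitly, shows how it combines momentum with an adaptive per-parameter learning rate. A minimal NumPy sketch of a single step:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters w given gradient grad at step t (t >= 1)."""
    m = beta1 * m + (1 - beta1) * grad        # first moment: momentum
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment: adaptive scaling
    m_hat = m / (1 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```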

Regularization Methods

Method Description
Dropout A method to prevent overfitting by randomly disabling neurons.
Batch Normalization A method to normalize inputs on a mini-batch basis, stabilizing and accelerating learning.
Layer Normalization A method to perform normalization at the layer level. Effective for RNNs and Transformers.
Weight Decay A regularization method that penalizes the magnitude of weights. Also called L2 regularization.
Label Smoothing A method to prevent model overconfidence by softening the target (one-hot) labels.

Learning Rate Scheduling

Method Description
Step Decay A method to decrease the learning rate in stages every certain number of epochs.
Exponential Decay A method to decay the learning rate exponentially.
Cosine Annealing A method to periodically change the learning rate based on a cosine function.
Warm-up A method to gradually increase the learning rate at the beginning of learning, then transition to normal learning rate scheduling.
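
Each of these schedules is a simple function of the epoch number; a sketch of all four for reference:

```python
import math

def step_decay(lr0, epoch, drop=0.5, every=10):
    return lr0 * (drop ** (epoch // every))       # e.g. halve every 10 epochs

def exponential_decay(lr0, epoch, k=0.05):
    return lr0 * math.exp(-k * epoch)

def cosine_annealing(lr0, epoch, total_epochs, lr_min=0.0):
    return lr_min + 0.5 * (lr0 - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))

def warmup(lr0, epoch, warmup_epochs=5):
    return lr0 * min(1.0, (epoch + 1) / warmup_epochs)  # ramp up, then hand off
```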

Model Evaluation

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Clarify can evaluate bias detection and explainability (interpretation of predictions) in model evaluation.
* Amazon SageMaker Experiments provides tools for tracking and managing machine learning experiments, automatically recording experiment results such as training runs, parameters, metrics, and enabling comparative analysis.

Baseline Evaluation

Metric Description
Rule-based Baseline Performance of a prediction model based on simple rules. Used as a baseline for improvement.
Random Baseline Performance when making random predictions. Used as a minimum performance standard.
Industry Standard Baseline Performance standards generally accepted in the industry. Used as a benchmark for competitive comparison.

Classification Tasks

Metric Description
Accuracy The proportion of correct predictions among all predictions. Effective when classes are balanced. Caution is needed with imbalanced data. Often used in cases where the number of data points is similar across classes, such as image classification and document classification. Examples: handwritten character recognition, general object recognition tasks, etc.
Precision The proportion of true positives among positive predictions. Important when minimizing false positives is crucial. Emphasized in spam filters, etc. Used when the cost of false positives is high. Examples: spam email detection, fraudulent transaction detection, quality control inspection, etc., where incorrectly classifying normal items as abnormal can cause significant problems.
Recall The proportion of correct predictions among actual positives. Important when minimizing false negatives is crucial. Emphasized in disease diagnosis, etc. Used when the cost of false negatives is high. Examples: cancer screening, security systems, earthquake prediction, etc., where missed detections can lead to serious consequences.
F1 Score The harmonic mean of precision and recall. A balanced evaluation metric. Used as a single comprehensive evaluation. Used when both precision and recall are important. Examples: information retrieval systems, product recommendation, document classification, etc., where both accuracy and comprehensiveness are required.
ROC Curve A graph plotting true positive rate vs false positive rate for each classification threshold. Performance is evaluated by AUC (Area Under the Curve). Used when a comprehensive evaluation of model performance is desired or when determining the optimal classification threshold is necessary. Examples: credit scoring, medical diagnostic systems, risk assessment models, etc., where threshold adjustment is important.
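
All of these classification metrics are one-liners in scikit-learn; a sketch on toy labels:

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

y_true  = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred  = [0, 1, 0, 0, 1, 1, 1, 1]
y_score = [0.2, 0.9, 0.4, 0.1, 0.8, 0.6, 0.7, 0.95]  # predicted P(class = 1)

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
print("roc_auc  :", roc_auc_score(y_true, y_score))   # area under the ROC curve
print(confusion_matrix(y_true, y_pred))
```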

Regression Tasks

Metric Description
MSE (Mean Squared Error) The average of the squared differences between predicted and actual values. As it squares the errors, it is strongly affected by outliers. Commonly used in general regression problems, especially when emphasizing outliers or when larger errors need to be penalized more severely.
RMSE (Root Mean Squared Error) The square root of the mean squared error. Evaluates the magnitude of prediction errors in the original unit. Strongly affected by outliers. Suitable for tasks like housing price prediction or sales forecasting where interpreting the predicted values in the original scale is desired. Taking the square root of MSE allows for more intuitive interpretation.
MAE (Mean Absolute Error) An evaluation metric less sensitive to outliers. It's the average of the absolute differences between predicted and actual values. Suitable for demand forecasting or inventory management where you want to assess the average magnitude of errors while suppressing the impact of outliers. It doesn't overestimate prediction errors and is easy to understand intuitively.
MAPE (Mean Absolute Percentage Error) Evaluates relative errors. Allows comparison between data of different scales. Suitable for sales forecasting or stock price prediction where actual values are greater than 0 and you want to assess the relative magnitude of errors. Useful for comparing prediction accuracy across companies or products of different sizes.
R² (Coefficient of Determination) Expresses the goodness of fit of a model, typically on a scale from 0 to 1, with values closer to 1 indicating higher prediction accuracy; it can be negative when the model fits worse than simply predicting the mean. Used to evaluate the overall explanatory power of a model. Particularly used as an indicator for variable selection in multiple regression analysis and is useful for model comparison and selection.
Adjusted R² A modified version of R². It corrects for the influence of the number of explanatory variables, allowing for more accurate model evaluation. Suitable for variable selection and model comparison, especially when comparing models with different numbers of explanatory variables. Used to prevent overfitting.
RMSLE (Root Mean Squared Logarithmic Error) The RMSE after taking the logarithm of predicted and actual values. Evaluates relative errors and mitigates the impact of large values. Suitable for sales forecasting or population prediction where the range of data values is wide and relative errors are emphasized. Particularly useful when predicted values have vastly different scales.
MSLE (Mean Squared Logarithmic Error) The MSE after taking the logarithm of predicted and actual values. Evaluates relative errors and mitigates the impact of large values. Used for similar purposes as RMSLE, particularly suitable for cases where predicted values increase exponentially or for predicting ratios.
MedAE (Median Absolute Error) The median of absolute errors. The evaluation metric least affected by outliers. Particularly useful for prediction tasks with noisy datasets or when outliers are present. Suitable for analyzing sensor data or evaluating actual measurement data that may contain anomalies.
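
The regression metrics above are likewise available in scikit-learn (MAPE requires a reasonably recent version); a short sketch on toy values:

```python
import numpy as np
from sklearn.metrics import (mean_squared_error, mean_absolute_error,
                             mean_absolute_percentage_error, r2_score,
                             mean_squared_log_error, median_absolute_error)

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.1])

mse = mean_squared_error(y_true, y_pred)
print("MSE  :", mse)
print("RMSE :", np.sqrt(mse))
print("MAE  :", mean_absolute_error(y_true, y_pred))
print("MAPE :", mean_absolute_percentage_error(y_true, y_pred))
print("R2   :", r2_score(y_true, y_pred))
print("MSLE :", mean_squared_log_error(y_true, y_pred))  # values must be non-negative
print("MedAE:", median_absolute_error(y_true, y_pred))
```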

Text Generation

Metric Description
ROUGE Evaluates the similarity between generated text and reference text. Calculates n-gram matching. Commonly used for summarization tasks. Particularly effective for evaluating the performance of news article summarization and document summarization systems.
Human Evaluation Subjectively assesses quality, coherence, relevance, etc. Sets qualitative evaluation criteria and judges with multiple evaluators. Used when subtle nuances and contextual understanding that cannot be fully captured by automatic evaluation metrics are required, or when evaluating creative text generation.

Evaluation Metrics for Generative AI Models

Metric Description
BLEU A metric used for evaluating machine translation. Calculates n-gram matching between generated and reference sentences. Particularly effective for quality assessment of multilingual translation systems and comparison of different translation models.
METEOR An evaluation metric for translation and generated text. Allows for flexible evaluation considering synonyms and morphological variations. Used in translation tasks where there are significant differences in grammatical structures between languages or when considering diversity of expressions.
BERTScore A metric that evaluates semantic similarity of sentences using BERT's contextual word embeddings. Used when semantic similarity assessment is important beyond surface-level matching, or when evaluating paraphrasing.
Perplexity A metric for evaluating the predictive performance of language models. Lower values indicate better models. Used for evaluating the learning process of language models and comparing language models with different architectures.

Explainability Evaluation

Method Description
LIME A method providing local explainability. Generates interpretable explanations for individual predictions. Used in cases where explanation of individual decision bases is important, such as medical diagnostics or financial credit assessments.
SHAP A feature importance calculation method based on game theory. Evaluates the contribution of each feature to predictions. Used when there is a need to understand the decision-making process of complex models or when ranking feature importance.
Attention Visualization Visualization of the attention mechanism in Transformer models. Visually represents the basis of model decisions. Used to confirm the areas of focus in natural language processing tasks or when analyzing model behavior.
Feature Attribution A method to quantify the contribution of each feature to prediction results. Analyzes the decision process of models. Used for evaluating model fairness or detecting bias when necessary.
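
As a rough sketch of SHAP in practice (assuming the third-party shap package is installed; the model and data are purely illustrative):

```python
import shap  # third-party package: pip install shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)        # game-theoretic attributions for tree models
shap_values = explainer.shap_values(X.iloc[:100])
# shap_values quantifies each feature's contribution to each individual prediction,
# which can then be aggregated into global feature-importance rankings.
```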

Correlation Analysis

Method Description
Pearson Correlation Measures the strength of linear relationships on a scale from -1 to 1. Used for evaluating relationships between continuous variables. Applied when analyzing variables expected to have a linear relationship, such as height and weight.
Spearman Correlation Evaluates the strength of ordinal relationships. Applicable to non-linear relationships. Suitable for ordinal variables. Used when analyzing monotonic but not necessarily linear relationships, such as between customer satisfaction and purchase amount.
Chi-square Test Statistically tests the association between categorical variables. Used for verifying independence. Applied when analyzing relationships between categorical data, such as the association between gender and product selection.
Phi Coefficient Measures correlation between binary variables. Applied to 2x2 contingency table data. Used when measuring the strength of relationships between binary data, such as the association between pass/fail and male/female.
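
SciPy covers all four of these analyses; a compact sketch on toy data:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r, p_r = stats.pearsonr(x, y)        # strength of the linear relationship
rho, p_s = stats.spearmanr(x, y)     # strength of the monotonic (rank) relationship

table = np.array([[30, 10],          # 2x2 contingency table of two binary variables
                  [20, 40]])
chi2, p_c, dof, expected = stats.chi2_contingency(table)
phi = np.sqrt(chi2 / table.sum())    # phi coefficient for a 2x2 table

print(r, rho, chi2, phi)
```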

Model Challenges and Phenomena

Term Description
Overfitting A state where the model fits the training data so closely that generalization performance on new data deteriorates. Also called overlearning. Balancing model complexity against the amount of training data is important.
Underfitting A state where the model fails to capture patterns in the training data sufficiently. Caused by lack of model expressiveness or insufficient learning. Requires more complex models or additional learning.
Bias Systematic error between model predictions and true values. Increases when model expressiveness is insufficient, causing underfitting.
Variance Variability in model predictions. The magnitude of prediction value fluctuations to small changes in training data. If too high, it can cause overfitting.
Hallucination When AI models generate incorrect information not based on facts. Particularly problematic in generative AI, can be mitigated by techniques like RAG.
Drift Changes in data distribution or model performance over time. Includes concept drift (changes in target variable relationships) and data drift (changes in input distribution).
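
As a minimal sketch of the overfitting entry above, compare the train/validation score gap of an unconstrained decision tree against a depth-limited one (synthetic data, scikit-learn):

```python
# Spotting overfitting via the train/validation gap (synthetic data).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# An unconstrained tree memorizes the training set: high train score, lower validation score.
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
# Limiting depth trades some training accuracy for better generalization.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

for name, m in [("deep", deep), ("shallow", shallow)]:
    print(name, "train:", m.score(X_tr, y_tr), "validation:", m.score(X_val, y_val))
```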

Dealing with Data Imbalance

Method Description
Oversampling A method to increase data of minority classes. Includes synthetic data generation like SMOTE. Improves class balance. Effective when overall data quantity is small and you want to maximize use of information, or when there are ample computational resources.
Undersampling A method to balance by reducing data from majority classes. Risk of information loss but computationally efficient. Effective when data quantity is sufficient and there are strict constraints on computation time or memory.
SMOTE A method to generate synthetic data for minority classes. Uses k-nearest neighbors to generate new samples, ensuring data diversity. Effective when simple duplication risks overfitting or when you want to learn more diverse features of minority classes (see the sketch after this table).
Class Weighting Adjusts balance by weighting minority classes during training. Modifies the model's loss function to rebalance class contributions. Effective when you want to preserve the original data distribution or avoid modifying the data.
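
A minimal sketch contrasting SMOTE oversampling with class weighting; it assumes the imbalanced-learn package and uses synthetic data:

```python
# SMOTE vs. class weighting on a synthetic imbalanced dataset.
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
print("before:", Counter(y))

# Oversampling: SMOTE synthesizes minority samples from k-nearest neighbors.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("after SMOTE:", Counter(y_res))

# Class weighting: keep the data as-is and reweight the loss function instead.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```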

Bias and Fairness Evaluation

Metric Description
Demographic Parity Evaluates the uniformity of the prediction-result distribution across different demographic groups (see the sketch after this table).
Equal Opportunity An indicator that confirms true positive rates are equal across protected attributes.
Predictive Parity Evaluates the consistency of prediction accuracy between different groups.
Individual Fairness Evaluates the consistency of predictions for individuals with similar characteristics.
Bias Amplification Measures the degree to which the model amplifies existing biases in the data.
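
As a minimal sketch of the Demographic Parity entry above, here is a demographic parity difference computed with NumPy on illustrative predictions:

```python
# Demographic parity difference on illustrative predictions.
import numpy as np

pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])  # model predictions (1 = favorable outcome)
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])  # protected attribute

rate_a = pred[group == "A"].mean()  # selection rate for group A
rate_b = pred[group == "B"].mean()  # selection rate for group B
print(f"demographic parity difference: {abs(rate_a - rate_b):.2f}")  # 0 = perfectly uniform
```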

Performance Stability Evaluation

Metric Description
Prediction Variance Evaluates the variability of model predictions. Used as an indicator of stability.
Threshold Stability Evaluates the robustness of performance to changes in classification thresholds.
Cross-validation Standard Deviation Evaluates the variability of performance across different data splits.
Noise Resistance Evaluates the stability of predictions against input noise.
Temporal Stability Evaluates the consistency of prediction performance in time series data.

Reliability and Robustness Evaluation

Metric Description
Adversarial Attack Resistance Evaluates the model's robustness against adversarial samples. Identifies security vulnerabilities.
Model Uncertainty Quantification of prediction confidence and uncertainty. Evaluation using Bayesian methods or ensemble methods.
Stress Test Evaluation of model behavior in extreme cases or boundary conditions. Understanding system limitations.
Data Quality Sensitivity Evaluation of model sensitivity to deterioration in input data quality. Used as an indicator of robustness.
Fail-safe Property Evaluation of safety in case of model abnormal operation. Confirmation of fallback mechanism effectiveness.

Cost Efficiency Evaluation

Metric Description
Computational Cost Evaluation of computational resources required for model training and inference. GPU time, memory usage, etc.
Infrastructure Cost Evaluation of infrastructure costs required for model operation. Storage, network, etc.
Maintenance Cost Evaluation of human resources and time required for model maintenance and updates.
ROI Analysis Evaluation of return on investment from model deployment. Quantification of cost reduction or revenue increase.

Model Deployment

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Registry catalogs and version-manages ML models, managing model metadata.

Deployment Strategies

Term Description
Canary Deployment A technique that applies a new version of the model to only a portion of the traffic and gradually expands its share. Allows validation of new models while minimizing risk (see the sketch after this table).
Blue/Green Deployment A deployment method that prepares production (blue) and new (green) environments in parallel and switches between them. Enables immediate rollback.
Shadow Deployment A method that mirrors production traffic to a new model for parallel evaluation. Enables performance verification under actual workloads.
A/B Testing A method to operate multiple versions of models simultaneously and compare their performance. Enables data-driven decision making.
Rolling Update A gradual deployment method that updates instances sequentially. Minimizes service interruption.
Rollback Plan Setting of recovery procedures and trigger conditions in case of problems. Ensures consistency of data and models.
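
A minimal sketch of the traffic-shifting step of a canary deployment on a SageMaker endpoint. The endpoint and variant names are hypothetical, and it assumes an endpoint already configured with two production variants:

```python
# Canary traffic shifting between two SageMaker production variants (names hypothetical).
import boto3

sm = boto3.client("sagemaker")
# Route 10% of traffic to the canary variant, 90% to the current model.
sm.update_endpoint_weights_and_capacities(
    EndpointName="my-endpoint",
    DesiredWeightsAndCapacities=[
        {"VariantName": "current", "DesiredWeight": 0.9},
        {"VariantName": "canary", "DesiredWeight": 0.1},
    ],
)
# After validating metrics, repeat with a larger canary weight to expand gradually;
# shifting the weight back provides a quick rollback path.
```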

Inference Options

Term Description
Real-time Inference A method that executes inference in real-time for requests. Suitable for use cases requiring low latency. In Amazon SageMaker, it's provided as persistent, fully managed endpoints that can handle payloads up to 6MB and processing times up to 60 seconds. A scalable solution capable of handling continuous traffic (see the sketch after this table).
Batch Inference A method that executes inference in bulk for large amounts of data. Suitable for periodic prediction processing. In Amazon SageMaker, it's provided as batch transform, capable of processing large-scale datasets of several GB. Optimal for offline processing or preprocessing that doesn't require persistent endpoints.
Asynchronous Inference A method that queues and processes inference requests requiring large payloads or long processing times. In Amazon SageMaker, it supports payloads up to 1GB and processing times up to 1 hour. Can scale down to 0 when there's no traffic.
Serverless Inference An event-driven inference execution method. Automatically scales according to demand. In Amazon SageMaker, it provides a model that requires no infrastructure management and charges only for usage, for intermittent or unpredictable traffic. Supports payloads up to 4MB and processing times up to 60 seconds.
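
A minimal sketch of deploying the same model as a real-time or a serverless endpoint with the SageMaker Python SDK. The image URI, model data path, and IAM role are hypothetical placeholders:

```python
# Real-time vs. serverless deployment with the SageMaker Python SDK (placeholders).
from sagemaker.model import Model
from sagemaker.serverless import ServerlessInferenceConfig

model = Model(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
    model_data="s3://my-bucket/models/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
)

# Real-time inference: a persistent endpoint for continuous, low-latency traffic.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# Serverless inference: scales with demand and bills per use; suits intermittent traffic.
serverless_predictor = model.deploy(
    serverless_inference_config=ServerlessInferenceConfig(
        memory_size_in_mb=2048, max_concurrency=5
    )
)
```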

Endpoint Options

Term Description
Single Model Endpoint A basic endpoint configuration for deploying a single model. Simple and easy to manage.
Multi-Model Endpoint An endpoint configuration that serves multiple models of the same framework in a single container. In Amazon SageMaker, it improves endpoint utilization and reduces deployment overhead, realizing cost optimization.
Multi-Container Endpoint An endpoint configuration that serves multiple models of different frameworks in separate containers. In Amazon SageMaker, it allows flexible deployment of various frameworks and models.
Serial Inference Pipeline An endpoint configuration that executes preprocessing, inference, and post-processing as a series of pipelines. In Amazon SageMaker, all containers are hosted on the same EC2 instance and fully managed, achieving low latency.
Scalable Endpoint An endpoint configuration that automatically scales according to load. Flexibly responds to traffic fluctuations.
High Availability Endpoint An endpoint configuration deployed across multiple availability zones, ensuring redundancy.

Infrastructure

Term Description
Model Containerization Packaging of models using container technologies like Docker. Ensures consistency and portability of environments.
Scaling Strategy Setting of automatic scaling according to load. Selection and policy setting of horizontal/vertical scaling.
Service Mesh Management of traffic control and inter-service communication in microservice architectures.
Deployment Pipeline Automation and standardization of deployments. Construction of CI/CD pipelines and setting of quality gates.

Optimization and Security

Term Description
Model Optimization Optimization of the model before deployment. Reduces model size and accelerates inference through quantization, pruning, distillation, etc. (see the quantization sketch after this table).
Security Settings Setting of access control, encryption, authentication and authorization. Configuration of secure endpoints.
API Versioning Version management of model APIs and ensuring compatibility between versions.
Monitoring Settings Configuration of metric collection, logging, alert settings. Establishment of performance and quality monitoring system.
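
To illustrate the quantization mentioned under Model Optimization, here is a minimal sketch of post-training dynamic quantization in PyTorch:

```python
# Post-training dynamic quantization in PyTorch: Linear weights shrink to int8,
# reducing model size and typically speeding up CPU inference.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
print(quantized)  # Linear layers are replaced with dynamically quantized versions
```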

Inference

Prompting Techniques

Technique Description
Prompt Engineering A technique to obtain desired outputs by crafting inputs (prompts) to AI models. Optimizes methods of setting context and presenting constraints.
Zero-shot Prompting One of the prompt engineering techniques. Executes tasks directly without examples. Utilizes the model's generalization ability to handle new tasks.
Few-shot Prompting One of the prompt engineering techniques. Teaches how to execute tasks by showing a few examples. Controls model behavior through concrete examples (see the sketch after this table).
Chain-of-Thought Prompting One of the prompt engineering techniques. Guides the model to solve complex problems step by step (Chain of Thought). Encourages explicit expansion of the reasoning process.
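
A minimal sketch of few-shot prompting against Amazon Bedrock using boto3. The model ID, request schema details, and review texts are illustrative assumptions; verify the model's request format in the Bedrock documentation before use:

```python
# Few-shot prompt sent to Amazon Bedrock via boto3 (model ID and schema illustrative).
import boto3
import json

prompt = """Classify the sentiment of each review.
Review: "Great battery life" -> positive
Review: "Screen broke in a week" -> negative
Review: "Setup was painless and fast" ->"""  # two examples, then the actual task

client = boto3.client("bedrock-runtime")
response = client.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # hypothetical model choice
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 50,
        "messages": [{"role": "user", "content": prompt}],
    }),
)
print(json.loads(response["body"].read()))
```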

Prompt Optimization Techniques

Technique Description
Prompt Templates Design of reusable fixed prompts. Ensures consistent outputs.
Hallucination Countermeasures Techniques to prevent generation not based on facts. Incorporation of knowledge base references and fact-checking.
Context Management Effective setting and control of context information in prompts. Leads to more accurate responses.
Prompt Variation Experimentation and optimization of different expression methods for the same intent. Improves robustness.

Monitoring

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Monitor continuously monitors model quality in production environments and detects data drift and bias.
* Amazon SageMaker Clarify can also be used in model quality monitoring, continuously evaluating model bias and explainability.

Model Quality Monitoring

Term Description
Data Drift Detection Monitors changes in input data distribution. Used as an indicator to determine timing for model retraining (see the sketch after this table).
Concept Drift Detection Monitors changes in the relationship between inputs and outputs. Used as an indicator to determine the need for model updates.
Prediction Quality Monitoring Continuously evaluates the quality of model prediction results. Includes monitoring of bias and fairness.
Explainability Monitoring Monitors metrics related to model explainability. Ensures transparency and reliability of predictions.
Input Data Validation Continuously validates input data schemas, types, value ranges, etc.
Data Completeness Monitoring Monitors data quality indicators such as missing values, outliers, duplicates.
Feature Stability Monitoring Tracks changes in statistical properties of features. Detects distribution shifts.
Data Source Monitoring Monitors availability, freshness, and consistency of data sources.
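
As a minimal sketch of the data drift detection entry above, here is a two-sample Kolmogorov-Smirnov test on a single feature with synthetic data. SageMaker Model Monitor automates checks of this kind across all features:

```python
# Data drift detection on one feature via a two-sample KS test (synthetic data).
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 1000)  # feature distribution at training time
live = rng.normal(0.5, 1.0, 1000)      # shifted distribution observed in production

stat, p_value = ks_2samp(baseline, live)
if p_value < 0.01:
    print(f"drift detected (KS={stat:.3f}) - consider retraining")
```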

Performance Monitoring

Term Description
Latency Monitoring Monitors inference processing time. Checks SLA compliance status and is used for performance optimization (see the sketch after this table).
Throughput Monitoring Monitors the number of processes per unit time. Used for capacity planning.
Resource Utilization Monitoring Monitors infrastructure metrics such as CPU, memory, disk usage. Used for scaling decisions.
Error Rate Monitoring Monitors the occurrence rate of inference errors and system errors. Used for maintaining service quality.
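
A minimal sketch of retrieving endpoint latency metrics from Amazon CloudWatch with boto3; the endpoint and variant names are hypothetical:

```python
# Pulling SageMaker endpoint latency from CloudWatch (names hypothetical).
import boto3
from datetime import datetime, timedelta

cw = boto3.client("cloudwatch")
stats = cw.get_metric_statistics(
    Namespace="AWS/SageMaker",
    MetricName="ModelLatency",
    Dimensions=[
        {"Name": "EndpointName", "Value": "my-endpoint"},
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,                          # 5-minute aggregation windows
    Statistics=["Average", "Maximum"],
)
for point in stats["Datapoints"]:
    print(point["Timestamp"], point["Average"], point["Maximum"])
```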

Security Monitoring

Term Description
Access Monitoring Monitors access patterns and authentication status to API endpoints. Detection of unauthorized access.
Data Security Monitoring Monitors data encryption status, access control, and privacy protection status.
Compliance Monitoring Continuously monitors compliance with regulatory requirements. Maintenance of audit trails.

Operational Monitoring

Term Description
Alert Settings A mechanism to notify when important metrics exceed thresholds. Enables early response.
Log Analysis Analysis of system logs and application logs. Used for identifying causes of failures and trend analysis.
Incident Tracking Tracking of occurrence history and response status of failures and abnormalities. Used for formulating recurrence prevention measures.
Capacity Management Prediction and planning of resource usage. Formulation of appropriate scaling strategies.

Business Impact Monitoring

Term Description
ROI Analysis Continuous evaluation of costs and effects of model operation. Measurement of return on investment.
Business Metrics Monitoring of indicators showing the business contribution of the model. Measurement of effects such as sales and cost reduction.
User Satisfaction Tracking of feedback and service evaluations from end users.

System Health Monitoring

Term Description
Infrastructure Availability Monitoring Operational status and health checks of system components.
Network Monitoring Monitoring of network connectivity, latency, bandwidth.
Cache Efficiency Monitoring Tracking of cache hit rates, memory usage efficiency.
Batch Processing Monitoring Monitoring of batch job execution status, success rates, processing times.

Fairness Monitoring

Term Description
Bias Metrics Monitoring of demographic biases in model predictions. Tracking of fairness indicators.
Attribute-based Monitoring Monitoring of prediction biases based on protected attributes. Detection of discriminatory results.
Fairness Score Evaluation of prediction accuracy uniformity across different groups. Quantitative measurement of fairness.
Impact Analysis Analysis of the impact of model predictions on different populations. Evaluation of social impact.

Governance Monitoring

Term Description
Policy Compliance Monitoring of compliance with the organization's AI governance policies. Confirmation of guideline adherence.
Accountability Tracking Monitoring of transparency and explainability in the model's decision-making process. Ensuring accountability.
Ethical Risk Monitoring Continuous assessment of AI's ethical impacts and potential risks. Fulfillment of social responsibility.
Regulatory Compliance Tracking Monitoring of compliance status with new regulatory requirements. Maintenance of compliance.

MLOps Management Process

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Studio provides an integrated development environment (IDE) that enables one-stop execution from notebook creation to model development, training, and deployment, realizing centralized management of ML workflows.
* Amazon SageMaker Canvas provides a no-code ML development environment that allows data preparation to model deployment through drag & drop without writing code, enabling development for business analysts.
* Amazon SageMaker Pipelines orchestrates ML workflows, building reproducible ML pipelines.

Experiment Management

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Experiments provides tools for tracking and managing machine learning experiments, automatically recording experiment results such as training runs, parameters, metrics, and enabling comparative analysis.
Term Description
Experiment Tracking Activity of recording and managing the settings, parameters, and results of each experiment in model development. Uses tools like MLflow and SageMaker Experiments (see the sketch after this table).
Metadata Management Management of associated information such as settings, environment, datasets, results related to experiments. Ensures reproducibility and traceability of experiments.
Hyperparameter Logging History management of model hyperparameter settings. Used for tracking and comparative analysis of optimization processes.
Evaluation Metrics Tracking Time-series recording and analysis of model performance indicators. Used for understanding improvement trends and comparative evaluation.
Artifact Management Storage and management of artifacts such as models, checkpoints, plots. Streamlines storage and sharing of experiment results.
A/B Test Management Design and result management of comparative experiments of multiple models. Supports statistical significance evaluation and decision making.
Experiment Environment Management Management of development environment configurations, dependencies, resource settings, etc. Maintains reproducibility and consistency of environments.
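
A minimal sketch of experiment tracking with MLflow, one of the tools named above; the experiment name, parameters, and metric values are illustrative:

```python
# Experiment tracking with MLflow (names and values illustrative).
import mlflow

mlflow.set_experiment("churn-model")
with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("max_depth", 5)       # hyperparameter logging
    mlflow.log_param("learning_rate", 0.1)
    mlflow.log_metric("val_auc", 0.87)     # evaluation metrics tracking
    # Artifact management: the file must exist locally before logging.
    mlflow.log_artifact("confusion_matrix.png")
```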

Version Control

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Registry catalogs and version-manages ML models, managing model metadata.
Term Description
Data Versioning Change history management of training datasets. Uses tools like DVC (Data Version Control). Enables tracking of data lineage.
Model Versioning Management of different versions of trained models. Implemented with tools like SageMaker Model Registry. Controls switching and rollback in production environments (see the sketch after this table).
Code Version Control Version management of model development code. Uses Git, etc. Setting of branch strategies and merge policies.
Configuration File Management Version management of configuration files such as environment settings, parameter settings. Ensures consistency across environments.
Dependency Management Version management of libraries and frameworks. Clarified in requirements.txt or Dockerfile.
Tagging Assigning meaningful tags to versions of models, data, code. Facilitates release management and tracking.
Baseline Management Management of model versions that serve as benchmarks for performance comparison. Used as indicators for quality assurance.
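
A minimal sketch of registering a model version with SageMaker Model Registry via boto3; the group name, image URI, and S3 path are hypothetical placeholders:

```python
# Registering a model version in SageMaker Model Registry (placeholders throughout).
import boto3

sm = boto3.client("sagemaker")
sm.create_model_package(
    ModelPackageGroupName="churn-models",         # versions are grouped under this name
    ModelPackageDescription="XGBoost churn model, retrained on October data",
    ModelApprovalStatus="PendingManualApproval",  # gates promotion to production
    InferenceSpecification={
        "Containers": [{
            "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",
            "ModelDataUrl": "s3://my-bucket/models/churn/model.tar.gz",
        }],
        "SupportedContentTypes": ["text/csv"],
        "SupportedResponseMIMETypes": ["text/csv"],
    },
)
```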

Documentation

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Model Cards creates and manages model documentation, centralizing management of detailed model information.
Term Description
Model Card A standardized document recording detailed information about the model. Includes usage, performance, limitations, ethical considerations, etc. Can be managed with SageMaker Model Cards.
Data Sheet A document recording characteristics of datasets, collection methods, preprocessing procedures, license information, etc. Ensures transparency and reusability of data.
API Specification Describes model interface specifications, input/output formats, endpoint information, etc. Managed in standard formats like OpenAPI/Swagger.
Experiment Report A document summarizing the purpose, method, results, and discussion of experiments. Records important findings and decisions.
Operation Manual A manual describing procedures for model deployment, monitoring, and maintenance. Includes incident response procedures.
Training Record A detailed record of model training. Includes data preparation, parameter settings, the training process, and a summary of results.
Change History Records important changes to models, data, code. Documents reasons for updates and scope of impact.
Risk Assessment Document A document evaluating potential risks, biases, ethical considerations of the model. Also used as evidence for regulatory compliance.
Quality Assurance Document Records test results, performance evaluations, validation procedures. Demonstrates compliance with quality standards.
Compliance Document A document demonstrating compliance with regulatory requirements. Records responses to GDPR, AI governance, etc.
Architecture Diagram Visual representation of system configuration, data flow, relationships between components.
Troubleshooting Guide A guide listing common problems and their solution procedures. Contributes to improving operational efficiency.
Model Lineage Diagram Visual representation of model development process, derivative relationships, important changes. Clarifies relationships between versions.
Performance Benchmark Performance comparison results between different model versions. Used as evidence of improvement.
Deployment Plan A plan describing model deployment strategy, schedule, risk countermeasures.

Orchestration Management

Term Description
Pipeline Management Automation and control of ML workflows. Manages the series of flows from data processing to inference (see the sketch after this table).
Workflow Definition Definition of each step in the ML process and its dependencies. DAG-based control flow design.
Automation Triggers Setting of conditions and schedules for pipeline execution. Control of event-driven processing.
Error Handling Implementation of anomaly detection and recovery mechanisms. Definition of fallback strategies.
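
A minimal sketch of defining and triggering a SageMaker Pipelines workflow. The step definitions (processors, estimators) are elided, the names and ARN are hypothetical placeholders, and a real pipeline needs at least one step:

```python
# Outline of a SageMaker Pipelines workflow definition (steps elided, names hypothetical).
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline

input_data = ParameterString(name="InputData", default_value="s3://my-bucket/raw/")

# step_process = ProcessingStep(...)  # data preparation
# step_train = TrainingStep(...)      # training, consuming step_process outputs (DAG edge)
steps = []  # e.g., [step_process, step_train]; dependencies between steps define the DAG

pipeline = Pipeline(
    name="churn-training-pipeline",
    parameters=[input_data],
    steps=steps,
)
pipeline.upsert(role_arn="arn:aws:iam::123456789012:role/SageMakerRole")  # create or update
execution = pipeline.start()  # trigger one run; can also be schedule- or event-driven
```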

Quality Management

Term Description
Quality Gates Quality checkpoints before deployment. Verification of performance, security, compliance.
Test Automation Automated execution system for unit tests, integration tests, performance tests. Continuous quality assurance.
Quality Metrics Definition and monitoring of quality indicators for models and systems. Tracking of SLO/SLA compliance status.

Infrastructure Management

Term Description
Resource Optimization Efficient allocation and management of computational resources and storage. Cost optimization.
Scaling Management Setting and monitoring of auto-scaling policies. Flexible resource adjustment according to demand.
Availability Management Ensuring system redundancy and fault tolerance. Management of backup and disaster recovery plans.

Security Management

[Amazon SageMaker components useful in this category]
* Amazon SageMaker Role Manager manages access permissions for ML activities, implementing security based on the principle of least privilege and providing appropriate access control.
Term Description
Access Control Role-based access management. Permission settings based on the principle of least privilege.
Data Protection Encryption and anonymization of sensitive data. Implementation of privacy protection mechanisms.
Vulnerability Management Detection and countermeasures for security vulnerabilities. Conducting regular security assessments.

Governance Management

Term Description
Policy Management Formulation and compliance management of AI governance policies. Setting of ethical guidelines.
Audit Response Maintenance of audit trails and management of audit response processes. Preparation of compliance evidence.
Risk Management Identification, assessment, and implementation of mitigation measures for risks related to AI use. Continuous risk monitoring.

Summary

In this article, I have compiled a "Glossary of AI and Machine Learning Terms Related to AWS" based on the knowledge I gained while studying to pass the newly added AWS certifications: AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate. It also incorporates insights from co-authoring "Learning AWS Functions and History Through Quizzes: Selected 'Machine Learning' Edition" in "Compilation of Thin Books on AWS Vol.01", an individual publication for Japan's "Technical Book Fair 17".

I will continue to shape ideas that can be useful for learning and utilizing AWS.
I also plan to update this article periodically to reflect changes in AI and machine learning on AWS.


Written by Hidekazu Konishi


Copyright © Hidekazu Konishi ( hidekazu-konishi.com ) All Rights Reserved.