This section aligns with Content Domain 1: Foundation Model Integration, Data Management, and Compliance, specifically Task 1.2: Select and configure foundation models.
This task evaluates your ability to move beyond simply invoking a model API and instead design flexible, resilient, and governable FM strategies suitable for enterprise-scale GenAI systems.
This domain focuses on the selection, abstraction, hardening, and lifecycle management of foundation models. The exam expects solutions that are model-agnostic, enterprise-ready, and capable of evolving safely over time. Correct designs emphasize flexibility, resilience, governance, and controlled customization rather than one-off or tightly coupled implementations.
The exam assesses whether you can justify why a particular foundation model fits a given use case, design runtime model switching without application redeployment, and maintain system availability during regional or service disruptions. You are also expected to understand how customized models are managed safely through versioning, rollback, and retirement.
Strong solutions align model choice with the use case, balance performance, latency, and cost, and account for operational availability. Equally important is lifecycle governance: ensuring models can be deployed, updated, rolled back, or retired in a controlled and auditable manner.
There is no universally “best” foundation model. Instead, the correct choice is the one that best fits the workload’s functional and operational requirements. Model selection must consider capabilities, limitations, performance characteristics, and real-world constraints.
Capability assessment includes determining whether the use case requires text-only or multimodal inputs, simple generation or deep reasoning, agent compatibility, tool-calling support, and large context windows for long documents.
Performance evaluation focuses on latency under load, token throughput, output consistency, and hallucination risk. Benchmarks may include automated metrics such as BLEU or ROUGE, but human evaluation is often critical for quality-sensitive workloads.
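To make the automated metrics concrete, the sketch below computes a toy ROUGE-1 recall score: the fraction of reference unigrams that also appear in the candidate output. This is a simplified illustration only; production evaluations use maintained libraries (with stemming, multiple n-gram orders, and precision/F-measure variants), and the example strings are invented.

```python
from collections import Counter

def rouge1_recall(candidate: str, reference: str) -> float:
    """Toy ROUGE-1 recall: fraction of reference unigrams found in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    # Clipped overlap: each reference word counts at most as often as it appears
    # in the candidate.
    overlap = sum(min(cand[word], count) for word, count in ref.items())
    return overlap / sum(ref.values())

score = rouge1_recall("the cat sat on the mat", "the cat is on the mat")
print(f"ROUGE-1 recall: {score:.3f}")
```

A high automated score does not guarantee acceptable quality, which is why the text stresses human evaluation for quality-sensitive workloads.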
Limitations must also be evaluated, including regional availability, maximum token limits, pricing per thousand tokens, and unsupported modalities or languages.
AWS recommends using Amazon Bedrock to compare, evaluate, and switch between multiple foundation models without committing to a single provider. On-demand inference is typically preferred initially, allowing teams to validate fit before pursuing any form of customization.
When a question asks for optimal alignment, look for answers that match model capabilities to the specific use case—not the most powerful or largest model. Token cost and latency are just as important as accuracy. Keywords such as benchmarks, trade-offs, limitations, and fit for purpose are strong selection signals.
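One way to internalize "fit for purpose" is to treat selection as constrained scoring: hard limits (latency budget, cost ceiling) disqualify a model outright, and remaining candidates are ranked by weighted trade-offs. The candidate names, numbers, and weights below are entirely hypothetical; the point is the decision structure, not the values.

```python
# Hypothetical model profiles -- every name and number here is invented.
CANDIDATES = {
    "model-large":  {"quality": 0.95, "latency_ms": 1200, "cost_per_1k": 0.015},
    "model-medium": {"quality": 0.88, "latency_ms": 400,  "cost_per_1k": 0.003},
    "model-small":  {"quality": 0.80, "latency_ms": 150,  "cost_per_1k": 0.0005},
}

def fit_score(profile: dict, weights: dict,
              latency_budget_ms: float, cost_budget: float) -> float:
    """Score a model for a workload; hard constraint violations score zero."""
    if profile["latency_ms"] > latency_budget_ms or profile["cost_per_1k"] > cost_budget:
        return 0.0  # disqualified: not fit for purpose regardless of quality
    # Normalize latency and cost so that lower is better (1.0 = far under budget).
    latency_term = 1 - profile["latency_ms"] / latency_budget_ms
    cost_term = 1 - profile["cost_per_1k"] / cost_budget
    return (weights["quality"] * profile["quality"]
            + weights["latency"] * latency_term
            + weights["cost"] * cost_term)

# A latency-sensitive chat workload weights latency and cost heavily.
weights = {"quality": 0.4, "latency": 0.4, "cost": 0.2}
best = max(CANDIDATES, key=lambda m: fit_score(CANDIDATES[m], weights, 500, 0.005))
print(best)
```

Note that under these constraints the largest model scores zero (it blows the latency budget), which mirrors the exam's point: the most powerful model is often not the most appropriate one.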
GenAI applications should never hardcode model identifiers or providers. Model routing must be driven by configuration rather than application logic to support experimentation, cost optimization, and vendor flexibility.
A common abstraction pattern routes requests through API Gateway and AWS Lambda before invoking Amazon Bedrock. The Lambda function selects the appropriate model at runtime based on configuration rather than embedded logic.
Dynamic configuration is typically stored in AWS AppConfig or Parameter Store and may include model identifiers, fallback order, and cost or latency preferences.
This approach enables model switching without redeployment, safe A/B testing, and rapid emergency failover in response to outages or cost spikes.
If a question explicitly mentions no code changes or runtime switching, AWS AppConfig is a strong indicator. API Gateway plus Lambda is the most common abstraction layer. Hardcoded model IDs are almost always incorrect in exam scenarios.
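The abstraction pattern above can be sketched as a Lambda-style handler that resolves models from a configuration document rather than from code. The routing document, its keys, and the model IDs below are hypothetical; in a real deployment the JSON would be fetched from AWS AppConfig (or Parameter Store) and the resolved model IDs passed to the Bedrock runtime's InvokeModel call.

```python
import json

# Hypothetical routing document, as it might be served by AWS AppConfig.
# Keys and model IDs are illustrative, not real identifiers.
ROUTING_CONFIG = json.loads("""
{
  "default_model": "provider.model-a-v1",
  "fallbacks": ["provider.model-b-v1", "provider.model-c-v1"],
  "overrides": {"summarize": "provider.model-c-v1"}
}
""")

def resolve_models(config: dict, task: str) -> list:
    """Return the ordered list of model IDs to try for a given task."""
    primary = config.get("overrides", {}).get(task, config["default_model"])
    # Fallbacks follow the primary; drop duplicates so no model is tried twice.
    return [primary] + [m for m in config["fallbacks"] if m != primary]

def handler(event: dict, context=None) -> dict:
    """Lambda-style entry point: model choice comes from config, not code."""
    order = resolve_models(ROUTING_CONFIG, event.get("task", "chat"))
    # A real function would loop over `order`, calling the Bedrock runtime
    # (invoke_model) until one model succeeds, then return its response.
    return {"model_order": order}

print(handler({"task": "summarize"}))
```

Because the document lives outside the function, switching the default model or reordering fallbacks is a configuration change, not a redeployment, which is exactly the property the exam scenarios reward.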
GenAI systems must be designed to degrade gracefully rather than fail outright. The exam prioritizes availability and continuity of service over perfect responses.
Cross-region inference using Amazon Bedrock allows workloads to continue operating when models are available in limited Regions or when a regional disruption occurs.
Circuit breaker patterns, often implemented using AWS Step Functions, detect repeated failures, temporarily halt calls to unstable models, and redirect traffic to fallback models or alternative responses.
Graceful degradation strategies may include falling back to smaller or cheaper models, non-AI rule-based responses, or cached summaries when full inference is unavailable.
Resilience is not achieved through retries alone. Correct answers demonstrate fallback logic and controlled degradation. A system that returns a reduced response is always preferable to one that fails completely. Phrases such as high availability, service disruption, and resilient are strong design signals.
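The circuit breaker and graceful degradation ideas above can be sketched in a few lines. This is a minimal in-process illustration with invented thresholds; a production system would persist breaker state outside a single invocation (for example, in a Step Functions workflow or a shared store) so all instances see the same open/closed status.

```python
class CircuitBreaker:
    """Minimal circuit breaker: after `threshold` consecutive failures,
    stop calling the unstable model for `cooldown` seconds."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def is_open(self, now: float) -> bool:
        if self.opened_at is None:
            return False
        if now - self.opened_at >= self.cooldown:
            # Cooldown elapsed: close the circuit and allow a trial call.
            self.opened_at = None
            self.failures = 0
            return False
        return True

    def record(self, success: bool, now: float) -> None:
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = now  # trip the breaker

def invoke_with_fallback(call_primary, call_fallback, breaker, now: float):
    """Serve the degraded path while the breaker is open; else try the primary."""
    if breaker.is_open(now):
        return call_fallback()
    try:
        result = call_primary()
        breaker.record(True, now)
        return result
    except Exception:
        breaker.record(False, now)
        return call_fallback()
```

The fallback callable is where graceful degradation lives: it might invoke a smaller model, return a rule-based answer, or serve a cached summary, so the caller always gets a reduced response instead of an error.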
Model customization should be deliberate and incremental. Any customized foundation model must be versioned, auditable, and reversible to support safe long-term operation.
Parameter-efficient fine-tuning techniques, such as LoRA or adapters, are often preferred over full fine-tuning. These approaches reduce cost, accelerate iteration, and minimize operational risk.
Customized models are typically deployed using Amazon SageMaker endpoints. SageMaker Model Registry plays a critical role by managing versioning, approval workflows, and rollback control.
Lifecycle management includes CI/CD pipelines for model updates, automated rollback on failure, and the retirement of outdated or underperforming models.
Fine-tuning is justified only when retrieval-based approaches are insufficient or when domain language is highly specialized. The Model Registry is a governance mechanism, not merely a storage location. Keywords such as versioning, rollback, and model lifecycle are strong indicators.
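The governance role of the registry can be reduced to two decisions: deploy only the newest approved version, and on failure, reject the bad version and fall back to the previous approved one. The sketch below models that decision logic with plain dictionaries; the field names loosely mirror SageMaker Model Registry concepts (approval status per model package version), but the entries and version numbers are hypothetical, and real deployments would drive this through the SageMaker APIs.

```python
# Illustrative registry entries; versions and statuses are hypothetical.
REGISTRY = [
    {"version": 1, "status": "Approved"},
    {"version": 2, "status": "Approved"},
    {"version": 3, "status": "PendingManualApproval"},
]

def deployable_version(registry: list) -> int:
    """Deploy only the newest Approved version; pending versions never ship."""
    approved = [e["version"] for e in registry if e["status"] == "Approved"]
    if not approved:
        raise RuntimeError("no approved model version to deploy")
    return max(approved)

def roll_back(registry: list, bad_version: int) -> int:
    """Mark a failed version Rejected and return the version to fall back to."""
    for entry in registry:
        if entry["version"] == bad_version:
            entry["status"] = "Rejected"
    return deployable_version(registry)

print(deployable_version(REGISTRY))  # the newest approved version ships
print(roll_back(REGISTRY, 2))        # rejecting it falls back to the prior one
```

This is why the registry is a governance mechanism rather than storage: approval status, not recency, determines what is deployable, and rollback is an auditable status change rather than an ad hoc redeploy.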
Common pitfalls reinforce these exam patterns: selecting the most powerful model instead of the most appropriate one, assuming fine-tuning is required for domain data, ignoring regional availability, relying solely on retries for resilience, and lacking rollback strategies for customized models.