SIP Study Group - Domain 3 - 14th August 2025

Thursday August 14, 2025 4:00 pm AWST Duration: 1h

Meeting Summary for SIP Study Group - 14th August 2025

Quick recap

The session focused on applications and foundation models as part of AWS Certified AI Practitioner (CAIP) training, with emphasis on practical AI workflows and prompt engineering. Winton discussed various aspects of AI model selection, implementation, and optimization, including considerations for cybersecurity, performance metrics, and ethical concerns. The presentation covered topics ranging from model architecture and training techniques to performance evaluation and safety measures, concluding with an overview of generative AI app stacks and upcoming session plans.

Next steps

  • Winton to prepare Domain 4 content for next week's AWS CAIP study session with more visual elements.
  • Attendees to book a free 15-minute discovery call with Winton if they want personalized guidance on IT and cybersecurity career paths.
  • Attendees to study AWS services related to foundation models in preparation for the CAIP exam.
  • Winton to develop content on prompt engineering for practical applications after completing the 5 domains of the AWS CAIP.

Summary

AWS CAIP Training Overview

Winton introduced the session on applications and foundation models as part of the AWS Certified AI Practitioner (CAIP) training, explaining it as a foundational certification for those new to AI, complementing the more technical Machine Learning Engineer certification. He outlined the session's focus on practical AI workflows and prompt engineering, mentioning the recent release of GPT-5 and its implications. Winton also shared his background as a cybersecurity professional and program director for Osaka Hawaii, emphasizing the platform's goal to support learners in AI, cybersecurity, and related fields through certifications, networking, and career guidance. He encouraged participants to book a discovery call for personalized support and guidance.

Understanding Foundation Models in Cybersecurity

Winton introduced a module on foundation models, explaining their applications and importance in cybersecurity. He outlined key topics to be covered, including choosing the right engine, driving models safely, tuning for specific domains, and measuring effectiveness. Winton emphasized the importance of balancing model size and complexity with hardware capabilities, similar to sizing security tools. He also discussed cost considerations, scalability, and the need for ongoing maintenance of foundation models, comparing them to long-lived systems that require regular updates and care.

AI Model Selection Strategies

Winton discussed selecting the appropriate AI model for different tasks, emphasizing the importance of multimodal and multilingual capabilities for analyzing various types of data. He highlighted the need for low latency and efficient resource use, advocating for smaller, focused models over larger, more complex ones. Winton also stressed the importance of context length, customization, and the use of prompt templates to improve model performance. He introduced the concept of RAG (Retrieval-Augmented Generation), which retrieves information from reputable sources to enhance the accuracy and relevance of model outputs.
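For illustration, here is a minimal prompt-template sketch in the spirit of the templates Winton described; the wording and field names are invented for this example, not taken from the session.

```python
# Minimal prompt-template sketch (illustrative; the slot names and wording
# are assumptions, not from the session). A template keeps instructions,
# context, and the user question in fixed slots so outputs stay consistent.

TEMPLATE = """You are a security analyst assistant.
Use only the context below to answer. If unsure, say so.

Context:
{context}

Question:
{question}
"""

def build_prompt(context: str, question: str) -> str:
    """Fill the template slots with retrieved context and the user question."""
    return TEMPLATE.format(context=context, question=question)

print(build_prompt("CVE-2025-1234 affects library X before v2.1.",
                   "Is library X v2.0 vulnerable?"))
```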

Vector Embeddings and Search Solutions

Winton discussed AWS services suitable for storing vector embeddings and performing similarity search, identifying Amazon OpenSearch Service and Amazon RDS for PostgreSQL as the correct answers. He emphasized the importance of using appropriate performance metrics like precision, recall, and F1-score, particularly when dealing with imbalanced data, and explained that false negatives are more critical than false positives in cybersecurity. Winton also highlighted the need to address bias in training data and ethical risks by identifying sources, documenting issues, and using techniques such as negative prompts and guardrails to reduce harmful content.
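To make those metrics concrete, here is a small worked example of precision, recall, and F1 from a confusion matrix; the counts are invented for illustration.

```python
# Precision, recall, and F1 from a confusion matrix (toy numbers, invented
# for illustration). In security triage, false negatives (missed threats)
# are usually costlier than false positives.

tp, fp, fn = 80, 10, 40  # true positives, false positives, false negatives

precision = tp / (tp + fp)  # of flagged items, how many were real threats
recall    = tp / (tp + fn)  # of real threats, how many were flagged
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.89 recall=0.67 f1=0.76 -- high precision, but the low recall
# means a third of real threats slipped through, the worse failure mode here.
```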

AI Implementation: Human Oversight and Trust

Winton discussed the importance of keeping humans in the loop when using AI models, particularly in sensitive areas like compliance, HR, legal, and healthcare. He emphasized the need for transparency, trust, and proper change management when implementing AI systems. Winton also explained the difference between interpretability and explainability in AI models, and advised right-sizing models based on specific use cases. He concluded by highlighting the benefits and challenges of complex models, and stressed the importance of starting simple and scaling complexity only when it clearly improves outcomes.

AI Creativity and Temperature Settings

Winton discussed the relationship between temperature settings and creativity in AI models, explaining that higher temperatures produce more creative and diverse outputs, while lower temperatures yield more accurate and consistent responses. He emphasized the importance of balancing creativity with coherence and suggested adjusting stop sequences and maximum token limits to prevent endless rambling. Winton also mentioned that these configurations can be applied across various models in Amazon Bedrock, advising participants to test different settings and monitor outcomes for their specific use cases.
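As a sketch of how these knobs look in practice, the snippet below sets temperature, max tokens, and stop sequences through the Amazon Bedrock Runtime Converse API; the model ID is a placeholder, and credentials and a region are assumed to be configured already.

```python
# Minimal sketch: tuning temperature, max tokens, and stop sequences through
# the Amazon Bedrock Runtime Converse API. Assumes AWS credentials and region
# are already configured; the model ID is a placeholder -- substitute one
# enabled in your account.

import boto3

client = boto3.client("bedrock-runtime")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    messages=[{"role": "user",
               "content": [{"text": "Summarize zero trust in two sentences."}]}],
    inferenceConfig={
        "temperature": 0.2,           # low = more consistent, factual answers
        "maxTokens": 200,             # hard cap to prevent endless rambling
        "stopSequences": ["\n\n\n"],  # cut off runaway output early
    },
)
print(response["output"]["message"]["content"][0]["text"])
```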

RAG Fundamentals and Prompt Engineering

Winton discussed the fundamentals of RAG (Retrieval-Augmented Generation), explaining how it reduces hallucinations and keeps answers aligned with the latest data without retraining. He highlighted the benefits of using AWS services for managing RAG and knowledge bases, including options like OpenSearch for vector search and Amazon RDS for PostgreSQL with pgvector. Winton also covered prompt engineering fundamentals, emphasizing the importance of clear, concise prompts with relevant context and examples, and introduced various prompt types such as zero-shot, chain-of-thought, and prompt templates. He concluded by discussing model limitations, noting that performance is largely dependent on the data used for training.
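The sketch below shows the retrieve-then-generate loop in miniature, using a stub embedding function and an in-memory list in place of a real embedding model and a vector store such as OpenSearch or pgvector; everything in it is illustrative.

```python
# Self-contained RAG sketch: embed, retrieve by cosine similarity, then build
# a grounded prompt. The hash-based "embedding" is a stand-in for a real
# embedding model; the list is a stand-in for OpenSearch or pgvector.

import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stub embedding: deterministic pseudo-random vector seeded by the text."""
    seed = int(hashlib.sha256(text.encode()).hexdigest(), 16) % 2**32
    return np.random.default_rng(seed).standard_normal(64)

docs = [
    "Rotate IAM access keys every 90 days.",
    "S3 buckets should block public access by default.",
    "Use MFA for all privileged accounts.",
]
index = [(d, embed(d)) for d in docs]  # in-memory "vector store"

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents whose vectors are closest to the query vector."""
    q = embed(query)
    scored = sorted(index, key=lambda p: -np.dot(q, p[1]) /
                    (np.linalg.norm(q) * np.linalg.norm(p[1])))
    return [doc for doc, _ in scored[:k]]

question = "How often should access keys be rotated?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this grounded prompt is what gets sent to the model
```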

AI Safety and Prompt Engineering

Winton discussed the limitations and potential hallucinations of AI models when dealing with niche topics or outdated data, emphasizing the importance of providing specific and up-to-date information to improve accuracy. He shared a personal experience with a mobile game's AI assistant that initially provided an incorrect response due to outdated information but corrected itself when prompted with more context. Winton also highlighted the need for guardrails and safety measures in AI systems, such as negative prompts, privacy rules, output checks for toxicity or PII, and human approval processes for high-risk cases. He stressed the importance of treating prompts as a new attack surface, avoiding leakage of system or developer prompts, and watching for poisoning, hijacking, and jailbreaking attempts by malicious actors.
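As one illustration of an output check, the sketch below runs simple regex-based PII screening on a model response before it reaches a user; the patterns are deliberately simplistic, and a production system would layer a managed offering such as Amazon Bedrock Guardrails on top of checks like these.

```python
# Minimal output-guardrail sketch: regex checks for obvious PII before a
# model response is shown to a user. Patterns are illustrative, not a
# complete PII detector.

import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def check_output(text: str) -> list[str]:
    """Return the PII types found; an empty list means the text passes."""
    return [name for name, pat in PII_PATTERNS.items() if pat.search(text)]

reply = "Contact the analyst at jane.doe@example.com or 555-867-5309."
hits = check_output(reply)
if hits:
    print(f"Blocked: response contains {hits}; route to human review.")
else:
    print(reply)
```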

AI Model Training Techniques Overview

Winton discussed various aspects of training and fine-tuning AI models, emphasizing the importance of reliable instruction following and guided responses. He explained different techniques like pre-training, continuous pre-training, fine-tuning, and instruction tuning, highlighting the trade-offs and benefits of each. Winton also covered data preparation for fine-tuning, including the need for quality, balanced, and representative datasets. He introduced AWS tools for data preparation and labeling, and discussed reinforcement learning from human feedback (RLHF) for teaching models preferences based on human ratings. Winton concluded by stressing the importance of evaluating model performance through human reviews for nuance and usefulness, rather than relying solely on automated metrics.
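For a concrete feel for the data-preparation step, here is a small sketch writing an instruction dataset as JSONL in the common prompt/completion layout; the examples are invented, and the exact schema should be checked against the target model's fine-tuning documentation.

```python
# Sketch of preparing an instruction-tuning dataset as JSONL, the common
# prompt/completion layout used by fine-tuning jobs (Amazon Bedrock custom
# models accept a similar format; check the target model's docs). The
# examples are invented for illustration.

import json

examples = [
    {"prompt": "Classify this log line as benign or suspicious: "
               "'Failed password for root from 203.0.113.7'",
     "completion": "suspicious"},
    {"prompt": "Classify this log line as benign or suspicious: "
               "'Accepted publickey for deploy from 10.0.0.5'",
     "completion": "benign"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")  # one JSON object per line

# Quality beats quantity: examples should be balanced across labels and
# representative of the traffic the model will actually see.
```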

AI Model Performance Optimization Strategies

Winton discussed performance optimization strategies for AI models, focusing on metrics like ROUGE for summarization and BLEU for translation, along with broader comparisons using GLUE and MMLU benchmarks. He emphasized the importance of tracking design choices that affect latency, budget, and output quality, such as prompt length, retrieval chunk size, and inference parameters. Winton also highlighted the need for human review to check factuality and suggested using version control for prompts, storing user feedback, and optimizing inference techniques like using smaller models and adjusting temperature settings to achieve fast, good, and repeatable results.
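To ground the metrics discussion, here is a hand-rolled ROUGE-1 calculation (unigram overlap between a candidate summary and a reference); real evaluations would use a maintained library such as rouge-score, and the strings here are invented for illustration.

```python
# Hand-rolled ROUGE-1 sketch (unigram overlap) to make the metric concrete.

from collections import Counter

def rouge1(candidate: str, reference: str) -> dict:
    """Unigram precision, recall, and F1 of candidate against reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # matched unigrams
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

reference = "the firewall blocked the inbound scan"
candidate = "the firewall blocked an inbound port scan"
print(rouge1(candidate, reference))
# Higher overlap with the reference summary -> higher ROUGE; BLEU plays the
# analogous role for translation quality.
```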

Generative AI App Architecture Overview

Winton discussed the architecture of a generative AI app stack, which consists of infrastructure, model, and UI layers, and explained how Agents for Amazon Bedrock function as small coordinators that break down tasks and generate answers. He emphasized the importance of evaluating AI models using a combination of ROUGE and BERTScore on a curated test set, along with human evaluation for factuality and readability. Winton concluded by sharing key takeaways for implementing AI systems, including selecting models based on business needs, thinking in layers, defining SLAs, and considering safety and security measures, and announced that the next session would cover Domain 4.
