Images, Audio & Video
Unleash creative expression with generative AI while understanding how it works and where its limits lie
What You'll Learn
- ✓ The intuition behind diffusion models and how they generate images
- ✓ Why prompts for images differ from text prompts
- ✓ How negative prompts filter out unwanted artifacts
- ✓ Techniques like style referencing, prompt composition, inpainting, and outpainting
- ✓ The strengths and limitations of Midjourney, DALL-E, and Stable Diffusion
- ✓ Combining generative tools with manual editing for polished results
- ✓ Professional applications and workflows for marketing, product visualization, and multimedia
- ✓ Legal and copyright considerations: data provenance, fair use, and licensing
Key Ideas
Diffusion models are generative algorithms that progressively add noise to training data and then learn to reverse the process to reconstruct or synthesize new data. During training, noise is added to the data in a fixed forward diffusion process, and a neural network is taught to reverse that diffusion, recovering the data step by step, so it can later generate new outputs from noise. Examples include DALL-E 2, Midjourney, and Stable Diffusion.
Examples:
- • DALL-E 2: Strong at following complex prompts and generating realistic compositions
- • Midjourney: Known for artistic and stylized output; excels at atmospheric and cinematic scenes
- • Stable Diffusion: Open-source model enabling local control, custom training, and fine-tuning
Text-to-image prompts describe visual elements (objects, composition, lighting, style) rather than specifying roles or tasks. Because diffusion models map text embeddings into visual latent space, the wording of prompts influences composition, color palette, and style. Negative prompts allow you to specify what you don't want in the image (e.g., 'blurry,' 'extra limbs,' 'low resolution'), acting as soft constraints.
Examples:
- • Positive: 'A futuristic city skyline at golden hour, ultra-wide angle, neon lights'
- • Negative: 'people, text, blur, cluttered background' (list the unwanted features themselves; writing 'no X' is unnecessary, since everything in a negative prompt is already treated as something to avoid)
- • Style: 'inspired by Syd Mead, digital art, 8K resolution'
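The positive/negative/style split above can be treated as structured input rather than one long string. The sketch below is a minimal, tool-agnostic helper of our own devising (no image tool exposes exactly this API); most tools ultimately accept the two joined strings it produces.

```python
# Illustrative sketch: assembling a text-to-image prompt from the parts
# discussed above (subject, style tags, negative tags). The function and
# its field names are our own convention, not any tool's official API.

def compose_prompt(subject, style_tags=(), negative_tags=()):
    """Join the subject and style tags into a positive prompt string,
    and the unwanted features into a negative prompt string."""
    positive = ", ".join([subject, *style_tags])
    negative = ", ".join(negative_tags)
    return positive, negative

pos, neg = compose_prompt(
    "A futuristic city skyline at golden hour",
    style_tags=["ultra-wide angle", "neon lights", "digital art"],
    negative_tags=["blurry", "extra limbs", "low resolution"],
)
print(pos)  # A futuristic city skyline at golden hour, ultra-wide angle, neon lights, digital art
print(neg)  # blurry, extra limbs, low resolution
```

Keeping the pieces separate makes it easy to swap the style tags while holding the subject fixed, which is exactly what the style exercises below ask you to do.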
Dive Deeper
Explore the mechanism, mastery techniques, and critical thinking considerations. Click to expand each layer.
How Diffusion Works
During training, random noise is gradually added to data samples, and the model learns to predict and remove this noise step by step in reverse order. This two-phase process (forward diffusion and reverse denoising) enables the model to learn the probability distribution of complex data. At inference time, the model starts from pure noise and iteratively denoises it using the learned weights to produce an image.
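The forward half of that process can be written in closed form. The toy sketch below noises a 1-D "image" under an illustrative linear noise schedule (the schedule values are ours, not from any specific paper): the signal fraction `alpha_bar[t]` shrinks toward zero, so early steps are nearly clean and late steps are nearly pure noise.

```python
# Toy sketch of forward diffusion: x_t is a mix of the clean sample x0
# and Gaussian noise, with the signal share decaying over the schedule.
import numpy as np

rng = np.random.default_rng(0)
x0 = np.linspace(-1.0, 1.0, 8)          # clean 1-D "image"
betas = np.linspace(1e-4, 0.2, 50)      # illustrative noise schedule
alpha_bar = np.cumprod(1.0 - betas)     # cumulative signal retention

def diffuse(x0, t):
    """Sample x_t ~ q(x_t | x_0) directly, without stepping through t steps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

xt = diffuse(x0, t=49)                  # heavily noised sample, almost pure noise
```

A trained denoiser learns to predict `eps` from `xt` and `t`; running that prediction backwards from pure noise is the reverse (generation) phase described above.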
Key Points:
- • Negative prompts and guidance scales: Negative prompts filter out unwanted elements by reducing attention on specified features during generation
- • Inpainting and outpainting: Inpainting fills in missing regions based on surrounding pixels and a prompt. Outpainting extends an existing image beyond its original borders
- • Guidance scale: Parameters that control how strongly the model follows the prompt versus the learned distribution
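One common mechanism behind both of those knobs is classifier-free guidance: the denoiser is run twice per step, once conditioned on the prompt and once on the negative (or empty) prompt, and the final noise estimate is extrapolated away from the unwanted direction. The sketch below shows just the guidance arithmetic; the two noise vectors are stand-ins for real model outputs.

```python
# Sketch of classifier-free guidance: eps = eps_uncond + s * (eps_cond - eps_uncond).
# 'guidance_scale' (s) controls how strongly the prompt is followed.
import numpy as np

def guided_noise(eps_cond, eps_uncond, guidance_scale):
    """Extrapolate from the unconditional toward the conditional prediction."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

eps_cond = np.array([1.0, 0.0])    # stand-in: model output given the prompt
eps_uncond = np.array([0.0, 1.0])  # stand-in: output given the negative/empty prompt

print(guided_noise(eps_cond, eps_uncond, 1.0))  # [1. 0.]  (pure conditional)
print(guided_noise(eps_cond, eps_uncond, 7.5))  # pushed well past the conditional
```

Scale 0 ignores the prompt entirely, scale 1 reproduces the conditional prediction, and larger scales (7-12 is a typical range in practice) trade diversity for prompt adherence.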
Advanced Techniques
Style references and consistency: Provide an image as a style reference along with your prompt to maintain visual consistency across multiple outputs. Control the random seed and guidance scale to reproduce similar compositions.
Techniques:
- • Prompt composition: Combine descriptive phrases, artistic styles, camera settings, and aspect ratios
- • Advanced syntax: Use '+' to join concepts ('forest + cyberpunk city'), specify negative prompts
- • Combining tools: Generate base image → upscale → refine with manual editing → AI enhancer
- • Professional applications: Marketing material creation, product visualization, multimedia storytelling
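The inpainting idea mentioned above — fill a masked region using only its surroundings — can be seen in a deliberately simplified form below. Real diffusion inpainting denoises the masked area under a prompt while keeping the rest of the image fixed; this toy version just iteratively averages neighbouring pixels, which conveys the "fill from context" intuition without any model.

```python
# Toy inpainting sketch: repeatedly replace masked pixels with the mean of
# their 4-neighbours until the fill blends into the surroundings.
import numpy as np

def inpaint(image, mask, iters=200):
    """Fill pixels where mask is True using neighbour averaging (Jacobi style)."""
    img = image.astype(float).copy()
    img[mask] = img[~mask].mean()          # rough initial guess for the hole
    for _ in range(iters):
        padded = np.pad(img, 1, mode="edge")
        neigh = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                 + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        img[mask] = neigh[mask]            # update only the masked region
    return img

scene = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))  # smooth gradient "street"
mask = np.zeros_like(scene, dtype=bool)
mask[3:5, 3:5] = True                              # the object to remove
filled = inpaint(scene, mask)
```

On this smooth gradient the fill converges to values indistinguishable from the original background — which mirrors why diffusion inpainting works best when the surroundings give strong context, as the street-scene exercise below lets you test.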
Legal and Ethical Considerations
Diffusion models are trained on vast datasets, some of which may include copyrighted images without permission. Understand the legal landscape in your jurisdiction: fair use, transformative works, and derivative rights are complex and evolving. When using AI-generated assets commercially, verify licensing terms, credit sources when required, and consider using models trained on curated, rights-cleared datasets.
Considerations:
- • Bias and representation: Image models reflect biases in their training data and may under-represent certain cultures or reinforce stereotypes
- • Environmental impact: Training and running diffusion models require significant computational resources
- • Copyright issues: Verify licensing terms and understand data provenance
Suggested Resources
Microsoft
Midjourney
Try This Now
Put your learning into practice with these hands-on exercises. Copy the prompts and try them in your favorite AI tool.
Generate 'cat in a hat' in three styles: (1) oil painting, (2) photorealistic, (3) pixel art. Use negative prompts: 'blurred background, extra limbs, low quality'
Choose a famous painting style (e.g., Van Gogh's Starry Night) and generate modern scenes in that style
Take a street scene and remove a car. Evaluate how well the model fills in the missing area with natural surroundings
Related Prompts from the Library
Practice what you've learned with these prompts from our library.
ROLE: You are an assistant. GOAL: Generate a 7-minute video script for a YouTube video about our newest <product/service description> and <targeted audience>. CONTEXT: Input details: describe your audience, describe your product. Ask clarifying questions if any input details are missing. TASK: Generate a 7-minute video script for a YouTube video about our newest <product/service description> and <targeted audience>. Product/service description = [describe your product]. Targeted audience = [describe your audience]. OUTPUT FORMAT: Script CONSTRAINTS:
ROLE: You are a video producer. GOAL: Design a referral program that incentivizes current customers to share our <product/service> with their network. CONTEXT: Ask clarifying questions if any input details are missing. TASK: Design a referral program that incentivizes current customers to share our <product/service> with their network, and produce a video that showcases <company/product/service> in… OUTPUT FORMAT: Text CONSTRAINTS:
ROLE: You are an assistant. GOAL: Develop a video series that showcases the features and benefits of our <product/service>, while also addressing… CONTEXT: Ask clarifying questions if any input details are missing. TASK: Develop a video series that showcases the features and benefits of our <product/service>, while also addressing… OUTPUT FORMAT: Text CONSTRAINTS:
ROLE: You are an assistant. GOAL: Design an infographic that visualizes the key benefits and features of <product/service> in a simple and easy-to… CONTEXT: Ask clarifying questions if any input details are missing. TASK: Design an infographic that visualizes the key benefits and features of <product/service> in a simple and easy-to… OUTPUT FORMAT: Text CONSTRAINTS:
ROLE: You are an assistant. GOAL: Write a video script that showcases the features and benefits of our latest <product/service>, and includes cust… CONTEXT: Ask clarifying questions if any input details are missing. TASK: Write a video script that showcases the features and benefits of our latest <product/service>, and includes cust… OUTPUT FORMAT: Script CONSTRAINTS:
ROLE: You are an assistant. GOAL: Create an infographic that visually displays the key findings of our latest consumer survey, and offers insights in… CONTEXT: Ask clarifying questions if any input details are missing. TASK: Create an infographic that visually displays the key findings of our latest consumer survey, and offers insights in… OUTPUT FORMAT: Text CONSTRAINTS:
Reflection Questions
- 1. How do diffusion models differ from GANs and VAEs? What are the trade-offs in image quality and training complexity?
- 2. What ethical considerations should guide the use of AI-generated imagery in marketing and media?
- 3. How could generative audio and video tools transform your work or hobby projects?