The Tech Behind LLMs

Do you often use AI but never really know what it does with your prompt? 🤔 Let’s dive a bit into the tech behind it — the Transformer inside LLMs (Large Language Models).

The video below breaks it down step by step, showing what’s really going on during an AI’s “thinking” process 🧠. This is the core engine behind tools like ChatGPT, Gemini, and other Generative AI.

But here’s the big question: do they actually think... or are they just predicting words? 🤖
Watch the video below to find out! 🎥


Summary

How a Transformer Executes Your Prompt (Illustration)

First, split → “tokens”
Your prompt is broken into small pieces (tokens).
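To make "tokens" concrete, here's a toy sketch. Real LLMs use learned sub-word schemes such as byte-pair encoding (BPE), so the exact splits below are made up purely for illustration:

```python
# Toy tokenizer: real models learn their splits (e.g. BPE); this just
# mimics the idea that long/rare words become several smaller tokens.
def toy_tokenize(text):
    tokens = []
    for word in text.lower().split():
        # Chop anything longer than 4 characters into 4-char pieces.
        while len(word) > 4:
            tokens.append(word[:4])
            word = word[4:]
        tokens.append(word)
    return tokens

print(toy_tokenize("Transformers predict tokens"))
# → ['tran', 'sfor', 'mers', 'pred', 'ict', 'toke', 'ns']
```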

Turn tokens into numbers → “embeddings” (map the meaning)
Each token is mapped to a vector (a list of numbers). Words with similar meanings end up close together in a very high-dimensional “space.” (Example: GPT-3 uses 12,288 dimensions for its embeddings.)
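A tiny sketch of what "close together in space" means. The 4-dimensional vectors below are hand-made toy values (real embeddings are learned and have thousands of dimensions); cosine similarity is the usual way to measure closeness:

```python
import numpy as np

# Hand-made toy "embedding table" — values chosen so related words
# get similar vectors. Real models learn these during training.
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1, 0.0]),
    "queen": np.array([0.9, 0.7, 0.2, 0.1]),
    "apple": np.array([0.0, 0.1, 0.9, 0.8]),
}

def cosine(a, b):
    # 1.0 = same direction, 0.0 = unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["king"], embeddings["queen"]))  # close to 1
print(cosine(embeddings["king"], embeddings["apple"]))  # much smaller
```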

Attention = a context spotlight
Neighboring words inform each other: “mole” means different things in biology (the animal), chemistry (the unit), and dermatology (the skin spot), and only context tells you which. Attention adapts each word’s meaning based on the words around it. In short: the model highlights the most relevant context before updating a word’s representation.
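The "spotlight" is scaled dot-product attention, the formula from the 2017 paper. This toy version skips the learned query/key/value projection matrices that real models use and simply feeds the same vectors in as Q, K, and V:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: each query scores every key,
    # softmax turns scores into weights that sum to 1, and the output
    # is a weighted mix of the values.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# 3 tokens, 4-dimensional vectors (random, just to show the shapes).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out, w = attention(x, x, x)  # self-attention: Q = K = V here
print(w.sum(axis=-1))        # each row of weights sums to 1
```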

Feed-forward = fast parallel checks
After being “spotlit,” each vector goes through parallel “checks” (a multi-layer perceptron) to enrich details. Attention and feed-forward layers are stacked many times—this stacking is the “deep” in deep learning.
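A minimal sketch of one feed-forward block. Real models use trained weights and usually a GELU nonlinearity; the random weights and ReLU here just show the shapes — expand, apply a nonlinearity, project back, applied to every token vector independently:

```python
import numpy as np

def feed_forward(x, W1, b1, W2, b2):
    # Position-wise feed-forward block: widen each token vector,
    # apply a nonlinearity, then project back to the original size.
    hidden = np.maximum(0.0, x @ W1 + b1)  # ReLU (GPT-style models use GELU)
    return hidden @ W2 + b2

rng = np.random.default_rng(1)
d_model, d_ff = 4, 16  # real models expand ~4x (e.g. 12288 -> 49152)
W1 = rng.normal(size=(d_model, d_ff)); b1 = np.zeros(d_ff)
W2 = rng.normal(size=(d_ff, d_model)); b2 = np.zeros(d_model)

x = rng.normal(size=(3, d_model))             # 3 token vectors in
print(feed_forward(x, W1, b1, W2, b2).shape)  # (3, 4): same shape out
```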

Pick the next word → softmax & “temperature”
At the end, the model produces a probability distribution over all candidate tokens. Softmax turns scores into probabilities; temperature can make outputs safer/calm (cool) or more creative (warm).
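Temperature is nothing mysterious: the raw scores (logits) are divided by it before softmax. A minimal sketch with made-up scores:

```python
import numpy as np

def sample_distribution(logits, temperature=1.0):
    # T < 1 sharpens the distribution (safer, predictable picks);
    # T > 1 flattens it (more surprising, "creative" picks).
    z = np.asarray(logits, dtype=float) / temperature
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

logits = [2.0, 1.0, 0.1]
print(sample_distribution(logits, temperature=0.5))  # peaked on token 0
print(sample_distribution(logits, temperature=2.0))  # closer to uniform
```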

Scale is the key
Modern models are huge: e.g., 175B parameters (GPT-3). A lot of parameters actually live in the feed-forward blocks between attention layers. Transformers get their power from parallelism, enabling training on GPUs at massive scale. This architecture comes from the 2017 paper “Attention Is All You Need.”
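A back-of-the-envelope count shows why so many parameters live in the feed-forward blocks. The shapes below assume GPT-3's embedding width of 12,288 and the standard 4× feed-forward expansion; exact architectures vary, so treat this as a rough illustration:

```python
# Rough per-layer parameter count for a GPT-3-sized transformer layer.
d = 12288                      # embedding width (d_model)
attn_params = 4 * d * d        # Q, K, V, and output projections
ffn_params  = 2 * d * (4 * d)  # expand to 4*d, then project back

total = attn_params + ffn_params
print(f"attention: {attn_params/1e6:.0f}M, feed-forward: {ffn_params/1e6:.0f}M")
print(f"feed-forward share of the layer: {ffn_params/total:.0%}")
```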


Visual Analogies

Roundtable Meeting: every word asks, “who’s relevant to me?” (query). Relevant words raise their hands (keys) and share their content (values). Result: each word’s meaning becomes more specific to its context.

Giant 3D Dictionary: words = points in a huge space. “Queen” sits near “king,” but also shifts along a “female vs. male” direction. (Illustration; reality is more complex.)
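The classic "king − man + woman ≈ queen" arithmetic behind this analogy can be faked with hand-picked 2-D toy vectors. In real embeddings the effect emerges from training and only holds roughly:

```python
import numpy as np

# Hand-picked toy vectors: dimension 0 ≈ "royalty", dimension 1 ≈ "male".
# Chosen so the classic analogy works exactly; real embeddings are messier.
vecs = {
    "king":  np.array([1.0, 1.0]),
    "man":   np.array([0.0, 1.0]),
    "woman": np.array([0.0, 0.0]),
    "queen": np.array([1.0, 0.0]),
}

result = vecs["king"] - vecs["man"] + vecs["woman"]
closest = min(vecs, key=lambda w: np.linalg.norm(vecs[w] - result))
print(closest)  # → queen
```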

Creativity Thermometer: higher temperature = more unusual ideas; lower = safer/cleaner answers.


Strengths vs. Limitations

Strengths: summarizing text, explaining concepts, brainstorming ideas, drafting, light translation.
Limitations: can sound confident while being wrong (hallucinations), inherits training-data biases, doesn’t “understand” the world like humans, sensitive to prompt wording.


Do & Don’t

Do

  • State goal & role clearly (format, style, constraints).
  • Verify important numbers/facts before using them.
  • Keep a trail of key prompts & outputs.
  • Start with small use cases: summarize emails, create presentation outlines, seed ideas.

Don’t

  • Paste secret/sensitive data.
  • Assume AI is always correct.
  • Rely on it blindly without reasoning & checks.

Lessons Learned
  1. Don’t idolize it — think of AI as a “language calculator.”
    It’s great at arranging words and patterns, not “understanding” like humans. It can be very convincing even when wrong. You still need human reasoning & verification. (The video focuses on next-token prediction mechanics, not absolute factual truth.)
  2. Context is king.
    Good results come from clear context: define the AI’s role, your goal, constraints, and output format. Clear prompts → attention aims at the right info. (Matches the idea of attention selecting the most relevant signals.)
  3. Bigger ≠ always the answer.
    Larger often helps, but costs more and doesn’t erase bias. Use models proportionally to the task.
  4. Safe & healthy AI habits (for beginners):
    • Protect privacy: don’t paste secrets.
    • Verify: double-check critical facts for serious decisions; get a second opinion.
    • Leave a trace: save prompts & output versions.
    • Red-flag routine: if it looks too smooth, re-check sources & numbers.
  5. Grounded ways to start:
    • Use AI to summarize emails/docs & generate idea lists.
    • Ask for a presentation outline, then fill in details.
    • Ask for template examples, then adapt.
    • Practice fact-checking: ask for sources, compare manually.
    • Make a personal “allow/avoid” list (what’s safe to process with AI).

Moving Forward
  1. Pro-human, pro-tool: use AI to speed up first drafts, brainstorming, and concept explanations—final decisions stay with us.
  2. If you want deeper understanding, learn gradually: grasp the core terms (token, embedding, attention, softmax)—enough to level up your AI literacy.
  3. Follow the architecture, not the hype: know that modern AI’s big leap came from transformers (2017) and their parallel nature—this helps you separate marketing claims from real architectural progress.