Melio Blog

How AI Really Works

From a grain of sand to a thinking machine, explained in plain language, with a picture for every idea.

1

How an AI Chip Is Made

It starts with sand. Really.

Every AI chip begins as sand, the same stuff as glass. We purify it into nearly perfect silicon, melt it, and slowly grow it into one giant, flawless crystal.

Think of it like → growing rock candy on a string. You dip a seed in, pull slowly, and the crystal grows. Except this crystal is ordered almost perfectly, atom by atom.

That crystal log gets sliced into thin discs called wafers, like slicing a salami. Hundreds of chips will be built on each wafer at once.

[Figure: sand → crystal log → wafer (sliced) → finished chip]
The journey: sand → purified crystal → sliced wafers → individual chips.

A chip is just billions of tiny switches

Zoom into a chip and you find transistors: microscopic switches. Each one is either on (1) or off (0). That's it. Stack billions of these switches in clever patterns and you can do any math or logic in the world.

[Figure: a switch shown OFF (0) and ON (1)]
One transistor = one switch = one 1 or 0. A modern AI chip holds tens of billions of them.
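To see how plain on/off switches can add up to real math, here is a minimal sketch in Python. It assumes one simple switch pattern, a NAND gate, and wires copies of it into an adder. The function names (`nand`, `full_adder`, `add_bits`) are made up for this illustration; a real chip does this in hardware, not in code.

```python
def nand(a: int, b: int) -> int:
    """One small switch pattern: the output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


# Every other gate below is built from nand() alone.
def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def and_(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(n, n)


def or_(a: int, b: int) -> int:
    return nand(nand(a, a), nand(b, b))


def full_adder(a: int, b: int, carry_in: int):
    """Add three single bits; return (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out


def add_bits(x: int, y: int, width: int = 8) -> int:
    """Add two numbers one bit at a time, the way a chain of adders would."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result  # anything past `width` bits is dropped, as in fixed-size hardware


print(add_bits(5, 3))  # 8
```

Nothing here is smarter than a switch. The arithmetic comes entirely from how the switches are wired together.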

They're built by "printing" with light

Here's the surprising part. We don't carve chips by hand. We print them with light. Coat the wafer in a light-sensitive coating, shine light through a stencil that holds the circuit pattern, then wash and etch away the parts you don't want. Then do it again. And again: roughly 100 layers, stacked like a tiny city.

Think of it like → screen-printing a T-shirt. A stencil + light transfers the pattern. Now imagine doing that 100 times, each layer perfectly lined up, with lines only tens of atoms wide.
[Figure: Printing a chip with light. Light → stencil → pattern lands on the wafer]
Light passes through the open parts of a stencil and prints the circuit onto the wafer. Repeat ~100 times.

The lines are so small (tens of atoms wide) that ordinary light is too "fat" to draw them. The cutting edge uses a special extreme-ultraviolet (EUV) light, made by zapping tin droplets with a laser tens of thousands of times a second. The machine that does this is so complex that essentially one company on Earth can build it, and each one costs hundreds of millions of dollars.

Why AI needs its own kind of chip

Your laptop's main chip (the CPU) is like one genius worker who does tasks brilliantly, but mostly one at a time. AI doesn't need a genius. It needs a mountain of tiny, easy sums done all at once. So AI uses a GPU: thousands of small workers doing simple math in parallel.

[Figure: CPU: 1 big worker, one task at a time. GPU: 1000s of small workers, all at the same time]
AI is a huge pile of simple math. The GPU's many small cores chew through it in parallel.
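The "one big worker vs. many small workers" idea can be sketched in Python. This is only an illustration of how the work is split, under a few assumptions: the function names are invented here, the "workers" are ordinary Python threads, and threads will not actually make this faster the way GPU cores do. The point is that the pile of multiply-and-add work breaks cleanly into independent pieces.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_add(pairs):
    """One worker's job: multiply each pair, then add everything up."""
    return sum(x * w for x, w in pairs)


def one_big_worker(pairs):
    """CPU-style: a single worker walks through the whole pile."""
    return multiply_add(pairs)


def many_small_workers(pairs, workers=4):
    """GPU-style: cut the pile into chunks and hand each chunk to a worker."""
    size = (len(pairs) + workers - 1) // workers  # chunk size, rounded up
    chunks = [pairs[i:i + size] for i in range(0, len(pairs), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(multiply_add, chunks))
    return sum(partial_sums)  # combine the small answers into the final one


pairs = [(i, 2) for i in range(1000)]  # made-up numbers for the demo
print(one_big_worker(pairs), many_small_workers(pairs))  # same answer both ways
```

No chunk needs to know anything about any other chunk. That independence is what lets thousands of small cores work at the same time.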

One more quiet truth: moving the numbers around often costs more energy than the math itself. So chip designers stack super-fast memory right next to the processor to shorten the trip. Keep this in mind: it's why "light" shows up later (Part 4).


2

How AI Actually Thinks

It isn't programmed with rules. It learns from examples.

Normal software is a list of human-written rules: if this, then that. AI flips that. Instead of writing rules, we show it millions of examples and let it tune itself until it gets good. Nobody ever codes "what a cat looks like"; the system works it out.

The building block: a neuron

The tool for this is a neural network: layers of tiny math units called neurons. Each neuron does something simple: take some numbers in, multiply each by a weight (an "importance dial"), add them up, and pass the result on.

The key idea → the weights are the knowledge. A trained AI is really just a giant pile of these dials (billions of them) set to the right values.
[Figure: three inputs, each × a weight → Σ add up → fire? how strongly? → pass result on]
A single neuron. Thicker line = bigger weight = that input matters more. Billions of these together get powerful.
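A single neuron is small enough to write out in a few lines of Python. This is a sketch, not how any particular AI library does it. Two pieces are assumptions added for the illustration: the `bias` term, and the "how strongly does it fire?" step, which here is a common squashing function called a sigmoid.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Multiply each input by its weight, add them up, then decide how strongly to fire."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Sigmoid (an assumed choice): squashes any total into a value between 0 and 1.
    return 1.0 / (1.0 + math.exp(-total))


# Same inputs, different dials: a bigger weight makes that input matter more.
quiet = neuron([1.0, 2.0], [0.5, -0.25])  # total is 0, so output is 0.5
loud = neuron([1.0, 2.0], [0.5, 1.0])     # second input now pushes the total up
print(quiet, loud)
```

The inputs did not change between the two calls. Only the weights did, which is the whole idea behind "the weights are the knowledge."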

Wire many neurons into layers: numbers go in one side, get reshaped layer by layer, and an answer comes out the other. "Deep learning" just means lots of layers.

[Figure: input → hidden layers (the "deep" part) → answer]
Information flows left to right, getting reshaped at each layer until an answer pops out.
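Stacking neurons into layers looks like this as a rough Python sketch. The weights below are hand-picked for the example, not learned, and the firing rule (`relu`, which simply blocks negative totals) is an assumed choice; real networks have far more neurons per layer.

```python
def relu(x):
    """Assumed firing rule: pass positive totals through, block negative ones."""
    return max(0.0, x)


def layer(inputs, weight_rows):
    """One layer: each row of weights is one neuron looking at all the inputs."""
    return [relu(sum(x * w for x, w in zip(inputs, row))) for row in weight_rows]


def forward(inputs, layers):
    """Feed numbers in one side and reshape them layer by layer."""
    values = inputs
    for weight_rows in layers:
        values = layer(values, weight_rows)
    return values


network = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]],  # hidden layer: 3 neurons, 2 inputs each
    [[1.0, 1.0, 1.0]],                      # answer layer: 1 neuron, 3 inputs
]
print(forward([1.0, 2.0], network))  # [3.0]
```

Adding more entries to `network` is all "deeper" means in this sketch.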

Training: tuning billions of dials

How do the dials get the right values? Through training, which is basically trial and error at massive scale:

1 · Guess: make a prediction
2 · Check: how wrong was it?
3 · Adjust: nudge the dials
Then repeat, billions of times.
Guess → see how wrong it was → nudge every dial slightly toward "less wrong" → repeat. This is the expensive part.
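The guess, check, adjust loop fits in a few lines of Python if the "model" has just one dial. This is a toy sketch under stated assumptions: the examples are invented (the hidden rule is "multiply by 3"), the `learning_rate` is picked by hand, and the dial starts at 0.0 rather than a random value so the run is repeatable.

```python
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # (input, correct answer) pairs


def train(examples, steps=200, learning_rate=0.01):
    weight = 0.0  # the single dial
    for _ in range(steps):
        for x, target in examples:
            guess = weight * x                     # 1. guess
            error = guess - target                 # 2. check: how wrong was it?
            weight -= learning_rate * error * x    # 3. adjust: nudge toward "less wrong"
    return weight


print(train(examples))  # settles very close to 3.0
```

Nobody told the code the rule was "multiply by 3". The dial drifted there because that value makes the guesses least wrong. A real model does the same thing with billions of dials at once.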

Slowly, those random dials organize themselves into real understanding. Training a big model can take thousands of chips running for weeks. Once it's trained, using it (called inference) is much cheaper: each prediction is just one quick pass through the network, with no dial-adjusting.

The breakthrough behind today's AI: "attention"

Modern chatbots (including the one writing this) use a design called the Transformer. Its superpower is attention: when reading, the model looks at all the words at once and figures out which ones matter to each other.

[Figure: "The trophy didn't fit in the suitcase because it was too big." A strong link runs from "it" to "trophy": "it" means the trophy]
Attention is how the model knows "it" refers to the trophy (the big thing), not the suitcase.
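Here is a rough Python sketch of the scoring step inside attention. It is heavily simplified and the numbers are hand-picked for the example: in a real model the vectors for each word are learned during training, and the resulting weights are then used to blend information from the other words. The names `query_it` and `keys` are invented for this illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that add up to 1."""
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Score the query word against every other word, then normalize."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


words = ["trophy", "suitcase"]
keys = [[1.0, 0.0], [0.0, 1.0]]  # made-up vectors standing in for each word
query_it = [2.0, 0.5]            # made-up vector for the word "it"

weights = attention_weights(query_it, keys)
for word, weight in zip(words, weights):
    print(word, round(weight, 2))
```

With these invented numbers, "it" puts most of its weight on "trophy". In a trained model, that preference comes from the dials, not from anyone typing it in.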

A chatbot writes one chunk at a time (each chunk is called a token, roughly a word or part of a word). It predicts the most likely next chunk, adds it, then predicts the next, building the sentence piece by piece.

[Figure: "The cat sat on the" → next? … Each new chunk is predicted from everything before it]
It's a very, very good "next-chunk predictor."
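The "predict a chunk, add it, predict again" loop can be sketched with a tiny toy in Python. Big assumption: this toy only counts which word tends to follow which in one made-up sentence, and it looks at just the previous word. A real chatbot uses a neural network and looks at everything before it. The loop itself has the same shape.

```python
from collections import Counter, defaultdict


def build_table(text):
    """Count, for every word, which words were seen right after it."""
    words = text.split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def generate(table, start, length):
    """Predict the most likely next word, add it, then predict again."""
    output = [start]
    for _ in range(length):
        options = table.get(output[-1])
        if not options:
            break  # never saw anything follow this word
        output.append(options.most_common(1)[0][0])
    return " ".join(output)


table = build_table("the cat sat on the mat . the cat sat on the sofa .")
print(generate(table, "cat", 3))  # cat sat on the
```

The toy never checks whether a sentence is true. It only continues with whatever was most common in its examples.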
An honest catch → the model has no fact database it looks things up in. All its "knowledge" is baked into those dials. That's why it can sound confident and still be wrong: it's predicting what sounds right, not checking a source.

And why does it need so much computing power? Multiply three big numbers together: billions of dials × trillions of example chunks × many repeats. That's an astronomical amount of math, which is exactly why we need the chips and buildings in Parts 1, 3, and 4.
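To get a feel for the size of that multiplication, here is a back-of-the-envelope sketch in Python. Every number in it is a hypothetical placeholder, not a figure for any real model or chip. It also assumes a commonly quoted rule of thumb of roughly 6 math operations per dial per chunk, which stands in for the "repeats" in the sentence above.

```python
def training_operations(parameters, training_tokens, ops_per_parameter_per_token=6):
    """Rough total math operations to train: dials x chunks x a small constant."""
    return parameters * training_tokens * ops_per_parameter_per_token


def training_days(total_ops, chips, ops_per_second_per_chip, utilization):
    """How long the job takes if `chips` share the work at a given efficiency."""
    seconds = total_ops / (chips * ops_per_second_per_chip * utilization)
    return seconds / 86_400  # seconds in a day


# Hypothetical model: 10 billion dials trained on 1 trillion chunks.
total_ops = training_operations(10**10, 10**12)

# Hypothetical cluster: 1,000 chips, each doing 10^14 operations per second,
# kept busy 40% of the time.
days = training_days(total_ops, chips=1_000,
                     ops_per_second_per_chip=10**14, utilization=0.4)

print(f"{total_ops:.1e} operations, about {days:.0f} days")
```

Even with a thousand chips, this made-up job runs for more than two weeks. Scale the dials or the chunks up tenfold and the bill grows tenfold with them.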


3

Data Centers: Where AI Lives

A warehouse full of computers, acting as one giant brain.

One chip isn't nearly enough. To train a big model you wire thousands of GPUs together so they behave like a single enormous machine. They live in data centers: big, humming warehouses built for exactly this.

[Figure: chip → server (several chips) → rack (many servers) → hall (thousands of chips)]
The nesting: chip → server → rack → a hall holding thousands of chips, all working together.

The three things they fight every second

[Figure: Networking: chips must talk constantly. Power: draws like a small city. Heat: all that power becomes heat]
The three constant battles of an AI data center.

1. Networking. The chips must share their results constantly to stay in sync. If the network is slow, hugely expensive chips just sit there waiting. So they're linked with ultra-fast connections.

2. Power. A big AI site can use as much electricity as a small city. Power is now the main limit on how big AI can grow. Operators are building data centers next to power plants just to keep them fed.

3. Heat. Nearly every watt of electricity a chip uses ends up as heat. Pack thousands of chips together and they'll cook themselves. Cooling goes from fans → liquid piped onto the chips → fully dunking servers in special non-conductive fluid.

This is why data centers get built in cold regions, near water, and next to cheap power: geography is part of the engineering.
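The "small city" comparison from the power battle is easy to sanity-check with rough arithmetic. The Python sketch below uses hypothetical placeholder numbers throughout: the chip count, the watts per chip, and the `overhead_factor` (extra power for cooling and everything else in the building) are all invented for illustration.

```python
def site_power_megawatts(chips, watts_per_chip, overhead_factor):
    """Total site power: chips x power per chip, scaled up for cooling and other overhead."""
    return chips * watts_per_chip * overhead_factor / 1_000_000


# Hypothetical site: 50,000 chips at 1,000 watts each, plus 30% overhead.
megawatts = site_power_megawatts(chips=50_000, watts_per_chip=1_000, overhead_factor=1.3)
print(f"about {megawatts:.0f} MW")
```

Since nearly all of that power ends up as heat, the same number is roughly how much heat the cooling system has to carry away, every second the site is running.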


4

The Next Leap: Computing With Light

This is the idea of saving energy by using light, and there are two versions of it.

Remember from Part 1: moving data wastes a lot of energy and heat. Inside computers, data normally travels as electricity through copper wires. Copper heats up, and the signal fades over distance. Light fixes that.

Version 1: using light to move data (happening now)

Instead of pushing electricity through copper, send the data as pulses of light through glass fiber. Light is fast, loses very little energy as heat over distance, and can carry a staggering amount of data. You can even send many streams down one fiber at once, each on a different color.

[Figure: Copper wire: hot & limited, wastes energy as heat. Light through fiber: cool & huge capacity; pulses of light · little heat · much more data]
Swapping copper for light is the near-term win. It's already rolling out in AI data centers.

Version 2: using light to do the math (still emerging)

This is the bold one. Recall the main AI operation is multiply-and-add (Part 2). It turns out light naturally does that math when beams pass through optical parts and combine: at the speed of light, using very little energy, with almost no heat.

The dream → do the heaviest AI math with light instead of electricity. The reward could be far more speed for far less power.
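Why would light "naturally" multiply and add? A very simplified Python sketch: dimming a beam multiplies its brightness by a fraction, and a detector that catches several beams adds their brightness together. This models only that basic idea. Real photonic chips are far more involved, and the function name and numbers here are invented for illustration.

```python
def optical_multiply_add(beam_brightness, transmissions):
    """Dim each beam by its filter (multiply), then let one detector catch them all (add)."""
    for t in transmissions:
        if not 0.0 <= t <= 1.0:
            # A plain filter can only dim light, so each weight is a fraction from 0 to 1.
            raise ValueError("a filter passes between 0% and 100% of the light")
    dimmed = [b * t for b, t in zip(beam_brightness, transmissions)]
    return sum(dimmed)


# Inputs are encoded as brightness; weights are how much light each filter lets through.
print(optical_multiply_add([1.0, 0.5, 2.0], [0.5, 1.0, 0.25]))  # 1.5
```

That is the same multiply-and-add a neuron performs, except the physics does the arithmetic as the light passes through. Note one limit of this simple picture: a filter can't produce a negative weight, which is one of the things real designs have to work around.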

The catch: it's still maturing. The tricky part is converting back and forth between electricity and light without burning up the energy you just saved. So it will likely first appear as a specialist helper chip sitting alongside normal chips, not a full replacement.

Why this matters most of all: AI's single biggest limit isn't ideas anymore; it's power and heat. Light attacks both. That's why "computing with light" is one of the most exciting directions in the whole field.

The Whole Story in 4 Steps

1

Sand → chips. Purified sand becomes wafers, printed with light into chips full of billions of tiny on/off switches.

2

Chips → thinking. Those chips do mountains of simple math, which is exactly what a neural network needs to learn patterns from examples.

3

Thinking → buildings. One chip isn't enough, so thousands are wired together in data centers, where the real fights are network speed, power, and heat.

4

Buildings → light. The next leap beats those limits by moving the data (and maybe doing the math) with light instead of electricity.

Quick Glossary

Transistor: a microscopic on/off switch. The basic piece of every chip.
Wafer: the thin silicon disc chips are built on.
GPU: a chip with thousands of small cores doing math in parallel. The engine of AI.
Neural network: layers of tiny math units that learn patterns from examples.
Weights: the "importance dials." Set correctly, they hold the AI's knowledge.
Training: tuning the dials by guessing, checking, and adjusting billions of times.
Inference: actually using a trained model. Much cheaper than training.
Transformer / attention: the design behind modern AI; lets it weigh which words matter to each other.
Token: a chunk of text (roughly a word or part of a word) that the AI reads and writes one at a time.
Data center: a warehouse of computers wired to work as one giant machine.
Silicon photonics: optical parts built on silicon chips, used to move data with light instead of copper.
Photonic computing: doing the math itself with light.
