Module 3: Vectorization and Semantic Collisions

Module 3 of 9

What This Teaches

Why vague chunks lose embedding battles.

Embedding Reality

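An embedding model collapses the meaning of each chunk into a single dense vector.

If a chunk contains any of the following:

  • multiple ideas
  • mixed intent
  • generic language

then its vector becomes non-dominant: it is pulled toward several meanings at once and ends up closest to none of them.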

Result: Your content is indexed but rarely wins nearest-neighbor retrieval, because a more focused chunk usually sits closer to the query.
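
The effect can be illustrated with a toy example. This is a minimal sketch, not output from a real embedding model: the three-dimensional vectors, the axis meanings, and the chunk names are all hypothetical, chosen by hand to show how a mixed or generic chunk scores lower than a narrow one under cosine similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical axes: [pricing, installation, generic filler].
# Real embeddings have hundreds of dimensions with no such labels.
query = [1.0, 0.0, 0.0]  # a question purely about pricing

chunks = {
    "narrow_pricing": [0.9, 0.0, 0.1],         # one idea
    "mixed_pricing_install": [0.5, 0.5, 0.2],  # two ideas in one chunk
    "generic": [0.2, 0.2, 0.9],                # mostly filler language
}

def rank(query_vec, chunk_vecs):
    """Return chunk names ordered from most to least similar to the query."""
    return sorted(
        chunk_vecs,
        key=lambda name: cosine(query_vec, chunk_vecs[name]),
        reverse=True,
    )

ranking = rank(query, chunks)
print(ranking)
```

With these hand-picked vectors the narrow chunk ranks first, the mixed chunk second, and the generic chunk last. The mixed chunk is still "about" pricing, but half of its vector points elsewhere, so it loses the top slot.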

Prechunking Rule

Each chunk must be:

  • semantically narrow (one concept per chunk)
  • lexically explicit (specific terms, no generic language)
  • intent-pure (one purpose, no mixed intent)
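
One rough way to screen chunks against this rule is a lexical lint pass. The sketch below is a heuristic and not part of the module's method: the filler-phrase list, the pronoun check, the question count, and the function name `lint_chunk` are all illustrative assumptions. It can flag surface symptoms, but it cannot judge semantic narrowness the way an embedding comparison can.

```python
# Hypothetical filler phrases; tune this list to your own content.
GENERIC_PHRASES = (
    "in today's world",
    "best-in-class",
    "a wide range of",
    "and more",
)

# Openers whose meaning depends on text outside the chunk.
DANGLING_OPENERS = ("it ", "this ", "that ", "they ", "these ", "those ")

def lint_chunk(text):
    """Return a list of warnings for one chunk (an empty list means no flags)."""
    warnings = []
    lowered = text.strip().lower()

    # Generic language: phrases that could appear on any page.
    for phrase in GENERIC_PHRASES:
        if phrase in lowered:
            warnings.append(f"generic phrase: {phrase!r}")

    # Lexical explicitness: the chunk should name its own subject.
    if lowered.startswith(DANGLING_OPENERS):
        warnings.append("opens with a pronoun that needs outside context")

    # Mixed intent: several questions in one chunk is a weak signal of it.
    if text.count("?") > 1:
        warnings.append("more than one question in a single chunk")

    return warnings

print(lint_chunk("It also supports a wide range of export formats."))
```

A chunk that passes this lint is not guaranteed to be narrow. Treat the warnings as prompts for a manual rewrite rather than as a pass/fail gate.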

Optional Operator Task

Task: Take three chunks from an existing page. Rewrite each chunk to be semantically narrow and intent-pure. Remove generic language and mixed ideas.

Constraint: Each rewritten chunk must focus on a single concept. If a chunk contains multiple ideas, split it into separate chunks.

What success looks like: Every chunk you produce has a dominant semantic vector. When you read a chunk, you can immediately identify its single purpose without inference.
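
If you want a mechanical check alongside the read-through, you can test whether a rewritten chunk actually wins retrieval for the query it is meant to answer. The sketch below is an assumption-heavy stand-in: `toy_embed` is a bag-of-words counter, not a real embedding model, so it only captures word overlap. The sample chunks, the query, and the function names are hypothetical. To test against real vectors, pass your own embedding function through the `embed` parameter and replace `cosine` with a similarity function for that vector type.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def wins_retrieval(target_id, query, chunks, embed=toy_embed):
    """True if the target chunk is the nearest neighbor of the query."""
    query_vec = embed(query)
    best = max(chunks, key=lambda cid: cosine(query_vec, embed(chunks[cid])))
    return best == target_id

# Hypothetical page content: one original mixed chunk and two rewrites.
chunks = {
    "original_mixed": "Our product has pricing plans, and refunds are possible, and setup is easy.",
    "refund_policy": "Refund policy: refunds are issued within 14 days of purchase.",
    "pricing": "Pricing: the Pro plan costs 20 dollars per user per month.",
}

query = "refund policy within how many days"
print(wins_retrieval("refund_policy", query, chunks))
print(wins_retrieval("original_mixed", query, chunks))
```

In this toy setup the single-purpose refund chunk is the nearest neighbor for the refund query, and the original mixed chunk is not.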

This task is optional. No submission required. No validation. Use it to convert theory into applied thinking.