Module 3: Vectorization and Semantic Collisions
Module 3 of 9
What This Teaches
Why vague chunks lose embedding battles.
Embedding Reality
Embeddings collapse meaning into dense vectors.
If a chunk contains:
- multiple ideas
- mixed intent
- generic language
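Then its vector becomes non-dominant: it blends several meanings and ends up only moderately close to many queries, and closest to none.
Result: Your content is in the index but rarely wins nearest-neighbor retrieval.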
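The effect can be illustrated with a toy calculation. The sketch below is not how a real embedding model works: it assumes three hand-made topic axes (pricing, installation, company history) and models a mixed chunk as the plain average of three single-topic vectors. All names and numbers are hypothetical and chosen only to show the geometry.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mean_vector(vectors):
    """Component-wise average, a crude stand-in for a chunk that mixes topics."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical topic axes: [pricing, installation, company history]
pricing = [1.0, 0.0, 0.0]
installation = [0.0, 1.0, 0.0]
history = [0.0, 0.0, 1.0]

focused_chunk = pricing                                       # one idea
mixed_chunk = mean_vector([pricing, installation, history])   # three ideas blended

# A query that is mostly about pricing (assumed values).
query = [0.9, 0.1, 0.0]

focused_score = cosine(query, focused_chunk)
mixed_score = cosine(query, mixed_chunk)

print(f"focused chunk: {focused_score:.3f}")
print(f"mixed chunk:   {mixed_score:.3f}")
```

Under these assumptions the focused chunk scores close to 1 while the mixed chunk scores noticeably lower, so any competing chunk that covers only pricing would outrank the mixed one for this query.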
Prechunking Rule
Each chunk must be:
- semantically narrow
- lexically explicit
- intent-pure
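As a rough aid for applying the rule, the three properties can be approximated with simple text checks. The sketch below is an illustrative heuristic, not a method defined by this module: the phrase list, the pronoun list, the sentence threshold, and the function name lint_chunk are all assumptions, and passing these checks does not guarantee a strong embedding.

```python
import re

# Hypothetical examples of generic marketing language; extend for your own content.
GENERIC_PHRASES = ("best-in-class", "wide range of", "and more", "solutions")
# Openers that force the reader (and the model) to infer the subject.
VAGUE_OPENERS = ("it", "this", "that", "they", "these", "those")
# Assumed threshold: longer chunks are more likely to carry several ideas.
MAX_SENTENCES = 3

def lint_chunk(text):
    """Return a list of warnings for a chunk; an empty list means no flags."""
    warnings = []
    lowered = text.lower()

    # Lexically explicit: the chunk should name its subject, not point at it.
    words = re.findall(r"[a-z']+", lowered)
    if words and words[0] in VAGUE_OPENERS:
        warnings.append("not lexically explicit: opens with a pronoun")

    # Generic language: flag phrases that could describe almost anything.
    for phrase in GENERIC_PHRASES:
        if phrase in lowered:
            warnings.append(f"generic language: '{phrase}'")

    # Semantically narrow: a crude proxy based on sentence count.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if len(sentences) > MAX_SENTENCES:
        warnings.append("possibly not narrow: consider splitting the chunk")

    return warnings

vague = "It offers a wide range of solutions."
explicit = "Acme Router X200 supports WPA3 encryption."

vague_warnings = lint_chunk(vague)
explicit_warnings = lint_chunk(explicit)
print(vague_warnings)
print(explicit_warnings)
```

Intent purity is the hardest of the three to check mechanically, so this sketch leaves it to human review. The sentence-count check is only a proxy and will misfire on text with abbreviations or decimal numbers.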
Optional Operator Task
Task: Take three chunks from an existing page. Rewrite each chunk to be semantically narrow and intent-pure. Remove generic language and mixed ideas.
Constraint: Each rewritten chunk must focus on a single concept. If a chunk contains multiple ideas, split it into separate chunks.
What success looks like: You produce chunks where each chunk has a dominant semantic vector. When you read a chunk, you can immediately identify its single purpose without inference.
This task is optional. No submission required. No validation. Use it to convert theory into applied thinking.