Smart Duplicate Detector

Stop rewriting the same paragraph.

Smart Duplicate Detector finds duplicate (and near-duplicate) blocks across your vault, shows a side-by-side preview, and lets you jump straight to the match.

It is designed for speed and sanity:

Note

Smart Duplicate Detector is a Smart Connections Pro feature.


What this solves

Use this when you want to:

Practical outcomes:


Quick start

  1. Open any note (or just start from the ribbon).
  2. Run a duplicate command from the command palette.
  3. Choose the scan scope and settings (start with a threshold of 0.90).
  4. Review matches and open the best candidates.


One-click vault scan

Click the ribbon icon to launch a scan; by default it scans the full vault.

This is the fastest way to get a "top duplicates anywhere" list.


Choose scan scope

Current note

Compares blocks in the active note against blocks in the rest of your vault.

Use this when:

Full vault

Finds the top duplicate block pairs across your entire vault.

Use this when:

Info

Same-note pairs are skipped so results focus on duplicates across different notes.


Configure settings (the controls that matter)

The threshold modal controls strictness, speed, and result volume.

Block similarity threshold

This is the main "how duplicate is duplicate" control.
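
To make the threshold concrete, here is a minimal sketch of how a cosine score might be compared against it. The function names and the `>=` comparison are assumptions for illustration, not the plugin's actual code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_duplicate(a, b, threshold=0.90):
    """Hypothetical check: a pair counts as a duplicate when its score reaches the threshold."""
    return cosine_similarity(a, b) >= threshold


# Toy 2-dimensional "embeddings"; real block embeddings have many more dimensions.
print(is_duplicate([1.0, 0.0], [0.9, 0.1]))                  # close in direction
print(is_duplicate([1.0, 0.0], [1.0, 1.0]))                  # 45 degrees apart
print(is_duplicate([1.0, 0.0], [1.0, 1.0], threshold=0.70))  # looser threshold
```

Raising the threshold keeps only closer matches; lowering it admits looser paraphrases.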

Source similarity floor (optional speed control)

This skips block comparisons between notes that look unrelated.

A practical default is 0.35.
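
As a rough illustration of why the floor saves time, the sketch below filters note pairs before any block-level work happens. It assumes each note has a single note-level vector and that the floor is applied with cosine similarity; both are assumptions, and `candidate_note_pairs` is a hypothetical helper.

```python
import math
from itertools import combinations


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def candidate_note_pairs(note_vectors, source_floor=0.35):
    """Yield note-path pairs whose note-level similarity reaches the floor.

    Only these pairs would go on to the (more expensive) block comparisons.
    """
    for (path_a, vec_a), (path_b, vec_b) in combinations(sorted(note_vectors.items()), 2):
        if cosine(vec_a, vec_b) >= source_floor:
            yield path_a, path_b


# Toy note-level vectors (made up for the example).
notes = {
    "recipes/bread.md": [0.9, 0.1, 0.0],
    "recipes/sourdough.md": [0.8, 0.2, 0.1],
    "work/standup.md": [0.0, 0.1, 0.9],
}
print(list(candidate_note_pairs(notes)))
```

In this toy vault only the two recipe notes survive the floor, so one note pair is compared block by block instead of three.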

Max results

Caps the result list at the top N matches. The scan stops early once this limit is reached.


While it scans

Vault scans show a progress modal with:

Closing the modal while scanning minimizes it to a bottom-right indicator you can restore.


Review matches and act

Results show:

Opening a match minimizes the results modal so the list stays accessible while you work.

Similarity cheat sheet

Use this as a starting point:


How it works (high-level)

Pipeline:

  1. Fast pass: exact text matches are detected first using a content hash when available.
  2. Semantic pass: remaining candidates are scored with cosine similarity over block embeddings.
  3. Speed controls:
    • skips same-note pairs
    • optional source similarity floor reduces cross-note comparisons
    • stops early once Max results is reached
  4. Cancel anytime: vault scans can be cancelled and still return partial results (marked "Cancelled").
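
The pipeline above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the plugin's implementation: `Block`, `find_duplicates`, `note_vecs`, and `should_cancel` are hypothetical names, SHA-256 stands in for whatever content hash is used, and "stops early" is modelled as stopping at the first N matches found.

```python
import hashlib
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Block:
    note: str   # path of the note that contains the block
    text: str   # raw block text
    vec: list   # block embedding (assumed to be computed elsewhere)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicates(blocks, threshold=0.90, max_results=50,
                    note_vecs=None, source_floor=0.35,
                    should_cancel=lambda: False):
    """Return (matches, status); matches are (score, note_a, note_b) tuples."""
    hashes = [hashlib.sha256(b.text.encode("utf-8")).hexdigest() for b in blocks]
    matches, status = [], "Complete"
    for i, j in combinations(range(len(blocks)), 2):
        if should_cancel():
            status = "Cancelled"  # keep whatever was found so far
            break
        a, b = blocks[i], blocks[j]
        if a.note == b.note:
            continue  # same-note pairs are skipped
        if hashes[i] == hashes[j]:
            score = 1.0  # fast pass: identical text
        else:
            # Optional floor: skip block comparisons for unrelated notes.
            if note_vecs is not None and cosine(note_vecs[a.note], note_vecs[b.note]) < source_floor:
                continue
            score = cosine(a.vec, b.vec)  # semantic pass
        if score >= threshold:
            matches.append((score, a.note, b.note))
            if len(matches) >= max_results:
                break  # stop early once Max results is reached
    matches.sort(key=lambda m: m[0], reverse=True)
    return matches, status


blocks = [
    Block("a.md", "Buy milk", [1.0, 0.0]),
    Block("b.md", "Buy milk", [0.0, 1.0]),
    Block("c.md", "Purchase milk", [0.95, 0.2]),
]
for score, note_a, note_b in find_duplicates(blocks)[0]:
    print(f"{score:.2f}  {note_a}  {note_b}")
```

Note that the exact-text pair is reported at 1.00 even though its toy embeddings differ, because the hash check runs before any embedding comparison.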

Troubleshooting

No results

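Try:

  • Lowering the block similarity threshold so looser matches are included.
  • Lowering or turning off the source similarity floor so fewer note pairs are skipped.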

Vault scan feels slow

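Try:

  • Setting or raising the source similarity floor so unrelated notes are skipped.
  • Lowering Max results so the scan can stop sooner.
  • Scanning the current note instead of the full vault.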


FAQ

Does 1.00 similarity mean an exact duplicate?

Often yes (especially when hash matching triggers), but always confirm before deleting.
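
One reason to confirm: if scores are shown rounded to two decimals (an assumption about the display, not something this page states), a semantic score slightly below 1 also reads as 1.00. A small sketch, with hypothetical helper names:

```python
def display_score(score):
    """Format a similarity score to two decimals (assumed display format)."""
    return f"{score:.2f}"


def is_exact_duplicate(text_a, text_b):
    """Confirm a match by comparing the text itself, ignoring surrounding whitespace."""
    return text_a.strip() == text_b.strip()


print(display_score(0.996))  # "1.00", although the score is below 1
print(is_exact_duplicate("Buy milk today.", "Buy milk today!"))  # False
```

Comparing the two blocks directly, as in `is_exact_duplicate`, is the safe check before deleting either one.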

Can I cancel a scan?

Yes. Full-vault scans can be cancelled and will still return partial results.

What is a good default threshold?

0.90 is a strong starting point for "likely duplicates".
