Detect Duplicate Nodes

In most cases this algorithm will not detect every duplicate in a dataset, but it is a good first step toward creating a Merge Table, which can then be checked by hand for complete accuracy. It can also produce acceptable results on its own when complete accuracy is not required.

Usage Hints

Run this algorithm once, then inspect the "Nodes that will be merged" and "Noteworthy nodes that will NOT be merged" reports. If you agree with the recommendations in those reports, follow through by running Update Network by Merging Nodes. If too many or too few nodes are being merged, run this algorithm again with adjusted thresholds, and repeat until the reports show the results you want. For fully accurate merging, you may still need to edit the Merge Table by hand.
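
To illustrate how adjusting thresholds changes what lands in each report, here is a minimal Python sketch. It is not the tool's implementation. It assumes two hypothetical thresholds (`merge_threshold` and `notice_threshold`) and uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; the similarity metric the algorithm itself uses may differ. The function name, parameter names, default values, and sample labels are all invented for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Case-insensitive similarity in [0, 1]. Stand-in metric (an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify_pairs(labels, merge_threshold=0.9, notice_threshold=0.75):
    """Sort label pairs into 'will be merged' and 'noteworthy, NOT merged'.

    Both thresholds are hypothetical parameters for this sketch.
    """
    to_merge, noteworthy = [], []
    for a, b in combinations(labels, 2):
        score = similarity(a, b)
        if score >= merge_threshold:
            to_merge.append((a, b, score))
        elif score >= notice_threshold:
            # Similar enough to flag for review, but not merged automatically.
            noteworthy.append((a, b, score))
    return to_merge, noteworthy


# Invented sample node labels.
labels = ["Smith, John", "Smith, Jon", "Smyth, Joan", "Garcia, Maria"]

to_merge, noteworthy = classify_pairs(labels)
for a, b, score in to_merge:
    print(f"merge:      {a!r} <-> {b!r} ({score:.2f})")
for a, b, score in noteworthy:
    print(f"noteworthy: {a!r} <-> {b!r} ({score:.2f})")

# Raising the merge threshold moves borderline pairs out of the merge list.
strict_merge, strict_noteworthy = classify_pairs(labels, merge_threshold=0.96)
```

With the default values in this sketch, only the "Smith, John" / "Smith, Jon" pair is merged, and the two pairs involving "Smyth, Joan" are flagged as noteworthy. Raising `merge_threshold` to 0.96 empties the merge list and moves that pair into the noteworthy list, which mirrors the adjust-and-rerun loop described above.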