This feature is available to subscription customers on projects created on or after June 15, 2021.

What is the Similar Docs tool?

Similar Docs allows you to quickly and easily view near duplicates.

A near duplicate is a document that shares text content with other documents in your workspace. For example, a redlined contract would be a near duplicate of the original, non-redlined contract.

Similar Docs is an efficient way to pull up related documents so you can see how a document evolved and how data developed across your collection.

To access Similar Docs, make sure it's added to the right-side toolbar.
Learn more about customizing document tools here.

How does it work?

During indexing, Logikcull analyzes your document’s text to extract information that can be used to quickly compare it to other documents in your workspace. Matching documents are retrieved and assigned a match strength score to help you assess similarity at a glance.
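To make the idea of a match strength score concrete, here is a minimal sketch of one common near-duplicate technique: splitting each document's text into overlapping word groups ("shingles") and measuring how many the two documents share (Jaccard similarity). This is an illustration only. Logikcull's actual algorithm and scoring scale are not described here, and the names `shingles` and `match_strength` are hypothetical.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word groups found in the text."""
    words = text.lower().split()
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def match_strength(text_a, text_b, k=3):
    """Hypothetical score from 0.0 to 1.0: shared shingles / all shingles."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


original = "The supplier shall deliver the goods within thirty days of the order date"
redlined = "The supplier shall deliver the goods within sixty days of the order date"
unrelated = "Lunch is at noon in the main conference room on Friday"

print(match_strength(original, redlined))   # high: most shingles are shared
print(match_strength(original, unrelated))  # no shingles are shared
```

In this sketch a one-word redline still leaves most shingles intact, so the redlined contract scores far higher against its original than an unrelated document does.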

How do I see the differences between documents?

To view the differences between the document in your search results/review set and a document listed in Similar Docs:

  1. Select a similar document from the right sidebar

  2. Click the "Difference Viewer" toggle

  3. The document text will appear side by side, with the differences highlighted.
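For readers curious about what a difference view computes, the sketch below uses Python's standard `difflib` module to list the word-level changes between two versions of a text. It is only an illustration of the general idea and does not reflect how the Difference Viewer is implemented; the function name `word_diff` is hypothetical.

```python
import difflib


def word_diff(original, revised):
    """List (change type, old words, new words) for each differing span."""
    a, b = original.split(), revised.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replaced, inserted, or deleted spans
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


original = "Payment is due within thirty days"
revised = "Payment is due within sixty days of invoice"

for change in word_diff(original, revised):
    print(change)
```

Each tuple names the kind of change and the affected words on either side, which is the information a side-by-side view would highlight.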

Can I take bulk actions with similar docs (tag, assign, cull)?

Yes. You can select documents manually using the checkboxes, or click the "Select all" button, then use the kebab menu (three vertical dots) to choose a bulk action for your selection. To remove all checkmarks, click "Select none."

What kind of data does Logikcull look at to determine similar docs?

Logikcull looks specifically at the text of your documents to determine Similar Docs. This makes it particularly powerful for documents with a lot of text, even when the file types differ (e.g., content from a PDF compared to content from an email or a Word document).

For example, a redlined contract that has changes throughout but keeps much of the same structure and wording would receive a strong near-duplicate score against its original, indicating how close their content is.

Logikcull will not examine raw image, audio, or video files, but it will review any available transcripts from those files for near-duplicate analysis.
