What are Auto Tags?
Get alerted to important metadata and processing details with Logikcull's Quality Control Auto Tags.
Written by John OHara
Updated over a week ago

Introduction

Auto Tags in Logikcull provide alerts on important metadata and processing details, such as potentially privileged emails, duplicates, and more. These tags help users to better understand their data and streamline the review process.

How does it work?

Auto Tags (QC Tags) tell you things like:

How many potentially privileged emails were detected?
How many documents were from a database upload?
How many documents have a duplicate?
...and more.

Suppose someone sent privileged emails to a law firm, and the PDF attachments themselves contained embedded images or screenshots, such as photos or scanned documents. Would you want the option to review just those files?

With Logikcull, you can. (Hint: filter on the "Has Deep Text" Auto Tag.) This is just one example out of thousands where you can leverage Auto Tags to uncover more information about your data.
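To make the filtering idea concrete outside the product, here is a minimal Python sketch that applies the same kind of filter to a list of document records. The record shape (`id`, `name`, `auto_tags`), the sample data, and the `filter_by_auto_tags` function are all assumptions made for illustration; they are not part of Logikcull or any Logikcull API.

```python
# Hypothetical document records; the field names are illustrative only.
DOCUMENTS = [
    {"id": 1, "name": "engagement-letter.pdf", "auto_tags": {"Has Deep Text", "Ocred"}},
    {"id": 2, "name": "re-contract.eml", "auto_tags": {"Potentially Privileged", "Has Threads"}},
    {"id": 3, "name": "scan-0042.pdf", "auto_tags": {"Has Deep Text"}},
    {"id": 4, "name": "notes.docx", "auto_tags": set()},
]


def filter_by_auto_tags(documents, required_tags):
    """Return the documents that carry every tag in required_tags."""
    required = set(required_tags)
    # A document matches when the required tags are a subset of its tags.
    return [doc for doc in documents if required <= doc["auto_tags"]]


deep_text_docs = filter_by_auto_tags(DOCUMENTS, ["Has Deep Text"])
print([doc["id"] for doc in deep_text_docs])  # [1, 3]
```

Passing more than one tag narrows the result to documents that carry all of them, which mirrors combining filters during review.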

Locating Auto Tags in your Project

Auto Tags are applied automatically during post-processing and appear in three locations in your Logikcull Project: the Filter Carousel, Search Results, and the Document Viewer.

If you're curious what an Auto Tag means, simply hover over its name and the tooltip box will tell you.

Auto Tags in the Filter Carousel

Auto Tags in Search Results

Auto Tags in Document Viewer

Auto Tags with Descriptions

ℹ️ Auto Tags with an * next to their name indicate Subscription Only features.

Auto Tag

Description

Embedded Document

Documents that are not email attachments but came from within another document. E.g., a .PPT within a .DOC is an embedded document.

Failed Extraction

Container files from which no files could be extracted ("exploded")

From Box

Documents that have been imported from Box

From Import

Documents imported from a database (e.g., a production or load file)

From Slack

Documents that are part of a Slack archive

Has BCC

Emails that contain BCC (Blind Carbon Copy) metadata

Has Deep Text

PDFs with additional searchable text that is found after running DTR (Deep Text Recognition). This indicates that the PDF has an embedded image that contains text.

Has Duplicate

Documents that are duplicates of other documents

Has Embedded Files

Documents that are not emails but contain embedded files as attachments. For instance, a .DOC that contains an embedded .PPT file.

Has Hidden Comments

MS Excel documents containing hidden comments

OR

PDFs containing comments or “sticky notes”

ℹ️ Please note, Logikcull does not render hidden comments in the document viewer. Depending on whether the document was uploaded with a text layer that notes the comments, Hidden Comments may be viewable in the Text View.

Has Hidden Worksheets

Documents that contain MS Excel hidden worksheets.

ℹ️ Please note, Logikcull will attempt to render hidden worksheets in the document viewer.

Has MS Office Macros

Documents that contain MS Office embedded macros

Has No Native

Imported documents that have no Native File

Has No Text

Documents without any extracted or OCRed text

Has Revisions

Documents that contain MS Word revisions or document comments

Has Speaker Notes

Documents that contain MS PowerPoint speaker notes

Has Threads

Documents that are part of an email thread

Has Virus

Documents in which a virus was detected. These documents are quarantined during processing and cannot be downloaded.

Is a Copy

This document is a copy of a document from another project

Is Overlaid

Documents to which one or more overlays have been applied

Last Email

Email that is the last message of an email thread or is a message without a thread. When part of a thread, this tag indicates the end of a particular thread and not the inclusiveness of the thread's contents within this email.

ℹ️ More information on this tag can be found in THIS ARTICLE.

Mismatched Extension

Documents with incorrect or missing file extensions. E.g., a file whose name ends in .DOC but is actually a .PPT file.

Nist File

Documents identified as being part of the NIST NSRL (National Software Reference Library) database of known computer files

None

Use this Auto Tag to find documents that contain zero Auto Tags

Not Rendered

Documents that were not rendered to PDF during processing

Ocr Failed

Documents where OCR (Optical Character Recognition) was attempted but failed

Ocred

Documents that were OCRed (Optical Character Recognition) so they can be searched.

Potentially Privileged

Emails that have a law firm email address in the From, To, CC, or BCC fields and are therefore considered potentially privileged. To suggest a new law firm domain name, click the Get Support link in your Account drop-down menu at the top of the screen.

Protected

Documents that are password protected

Rendered Text

The document's text was used to render the document to PDF. This happens if all other means to render the document fail

Transfer Failed

The transfer of this file from Box failed or was corrupted

Truncated Email Metadata

Documents whose To, CC, or BCC fields exceed system capacity for indexing

Truncated Text

Documents whose text length exceeds system capacity for indexing

Was Copied

This document was copied to another project

Zero Bytes

Documents that have a file size of zero bytes. These documents contain no content.

From Google Vault*

Documents imported from Google Vault

From MS365*

Documents imported from MS365

Has Slack Deleted Messages*

Includes messages that were deleted in Slack

Has Slack Edited Messages*

Includes messages that were edited in Slack

Has Splits

PDF documents that have been split into smaller PDF files.

Inclusive Email*

Emails that include all unique content of a thread.

Is Slack 1:1 DM*

Slack 1:1 direct messages between two parties

Is Slack Multi-Party DM*

Slack direct messages among more than two parties

Is Slack Thread*

Messages that are part of a Slack thread

Rendered From Import

Imported documents that have been rendered from native

Split From PDF

Documents that were created by splitting a large PDF into smaller PDF files.

Transcribed

Audio content was transcribed and is text-searchable

Transcription Failed

We were unable to transcribe audio content. This file is not text-searchable.

PII Detected

Documents in which Personally Identifiable Information (PII) was detected with a confidence level of 75% or greater

PII Detection Failed

PII detection was attempted on the document but failed

PII Detection Skipped

PII Detection was skipped
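
Several of the descriptions above read like simple rules over a document's metadata. The Python sketch below shows how a handful of them (Potentially Privileged, Has BCC, Zero Bytes, and PII Detected) could be expressed. It is an illustration under stated assumptions, not Logikcull's implementation: the field names, the `derive_auto_tags` function, and the sample law firm domain list are hypothetical, and only the 75% confidence threshold comes from the table.

```python
# Hypothetical list of law firm domains, for illustration only.
LAW_FIRM_DOMAINS = {"examplelaw.com"}

# Threshold taken from the PII Detected description (75% or greater).
PII_CONFIDENCE_THRESHOLD = 0.75


def _domain(address):
    """Return the lowercased part of an email address after the last '@'."""
    return address.rsplit("@", 1)[-1].strip().lower()


def derive_auto_tags(doc):
    """Derive a few illustrative tags from a document metadata dict."""
    tags = set()

    # Potentially Privileged: a law firm address in From, To, CC, or BCC.
    addresses = []
    for field in ("from", "to", "cc", "bcc"):
        addresses.extend(doc.get(field, []))
    if any(_domain(a) in LAW_FIRM_DOMAINS for a in addresses):
        tags.add("Potentially Privileged")

    # Has BCC: the email carries BCC metadata.
    if doc.get("bcc"):
        tags.add("Has BCC")

    # Zero Bytes: the file has no content.
    if doc.get("size_bytes") == 0:
        tags.add("Zero Bytes")

    # PII Detected: confidence at or above the threshold.
    if doc.get("pii_confidence", 0.0) >= PII_CONFIDENCE_THRESHOLD:
        tags.add("PII Detected")

    return tags


email = {
    "from": ["alice@corp.example"],
    "to": ["counsel@examplelaw.com"],
    "cc": [],
    "bcc": ["bob@corp.example"],
    "size_bytes": 2048,
    "pii_confidence": 0.4,
}
print(sorted(derive_auto_tags(email)))  # ['Has BCC', 'Potentially Privileged']
```

A document for which no rule fires gets an empty set, which corresponds to what the None Auto Tag lets you find.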
