Veriafy - Universal File Classification Platform

What is Veriafy?

Veriafy is a universal file classification system that works on hash representations instead of raw content. Your files are read only by extractors running locally; they are never transmitted to or stored by the classification system, which processes only their mathematical fingerprints (Veriafy Vectors).

Key Insight

Traditional ML classification requires access to your data. Veriafy flips this model: it classifies the representation of your data, not the data itself. This is privacy by mathematics, not policy.

How It Works

1. Extract

Files are processed locally by specialized extractors (PDQ for images, TMK for video, Chromaprint for audio, etc.) to generate perceptual hashes and semantic embeddings.
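
To make the idea of a perceptual hash concrete, here is a minimal sketch in Python. It uses a simple average hash as a stand-in; it is not PDQ, TMK, or Chromaprint, and the function names and the toy 4x4 image are assumptions made for illustration only.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean.

    Illustrative stand-in for a perceptual hash; NOT the PDQ algorithm.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Toy 4x4 grayscale image (hypothetical data).
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
# A slightly brightened copy and a fully inverted copy.
brightened = [[min(255, p + 5) for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

h_original = average_hash(original)
h_brightened = average_hash(brightened)
h_inverted = average_hash(inverted)
```

In this sketch the brightened copy hashes to the same value as the original, while the inverted copy differs in every bit. That is the property perceptual hashes aim for: visually similar inputs land close together, and only the hash needs to leave the local machine.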

2. Transform

The extracted features are compressed into a Veriafy Vector — an irreversible representation that is 500,000x smaller than the original file.
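
The size claim can be checked with simple arithmetic. The sketch below assumes a 32-byte vector (the size of a 256-bit hash such as PDQ) and a 16 MB file; the actual size of a Veriafy Vector is not stated here, so both figures are assumptions. The quantize helper is also hypothetical and only illustrates why a lossy, many-to-one mapping cannot be inverted to recover the exact input.

```python
from typing import Sequence


def compression_ratio(file_size_bytes: int, vector_size_bytes: int) -> float:
    """How many times smaller the vector is than the original file."""
    return file_size_bytes / vector_size_bytes


def quantize(embedding: Sequence[float]) -> bytes:
    """Map each float in [-1, 1] to a single byte (hypothetical helper).

    Many distinct inputs share one output byte, so the exact input
    cannot be recovered from the result.
    """
    out = bytearray()
    for x in embedding:
        clamped = max(-1.0, min(1.0, x))
        out.append(round((clamped + 1.0) / 2.0 * 255))
    return bytes(out)


# Assumed sizes: a 16 MB file and a 32-byte vector.
ratio = compression_ratio(16_000_000, 32)

# Two different embeddings that collapse to the same quantized bytes.
same = quantize([0.5]) == quantize([0.5001])
```

Under these assumed sizes the ratio works out to 500,000; smaller files or larger vectors would give a lower figure.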

3. Classify

ML models classify the Veriafy Vector, returning categories, confidence scores, and recommended actions — all without ever seeing the original content.
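
As a rough illustration of classifying a vector rather than a file, the sketch below uses a nearest-centroid rule with cosine similarity. The centroids, labels, action mapping, and the Classification type are all hypothetical, and cosine similarity stands in for a confidence score; Veriafy's real models and response format are not described here.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Classification:
    category: str
    confidence: float
    action: str


# Hypothetical class centroids and recommended actions.
CENTROIDS: Dict[str, List[float]] = {
    "benign": [1.0, 0.0, 0.0],
    "nsfw": [0.0, 1.0, 0.0],
}
ACTIONS: Dict[str, str] = {"benign": "allow", "nsfw": "block"}


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(vector: List[float]) -> Classification:
    """Pick the centroid most similar to the vector; the file is never read."""
    scores = {label: cosine(vector, c) for label, c in CENTROIDS.items()}
    best = max(scores, key=scores.get)
    return Classification(best, scores[best], ACTIONS[best])


result = classify([0.9, 0.1, 0.0])
```

The classifier receives only a list of numbers, which is the point of the design: nothing in this function could read or reconstruct the original content.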

4. Protect

Based on classification results, take automated actions: allow, flag for review, or block — all configurable per use case.
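
One way to picture per-use-case configuration is a pair of thresholds that map a confidence score to one of the three actions. The Policy type, the threshold values, and the policy names below are assumptions for illustration, not Veriafy's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-use-case thresholds on the 'harmful' confidence."""
    flag_threshold: float
    block_threshold: float


def decide(confidence_harmful: float, policy: Policy) -> str:
    """Map a confidence score to allow, flag_for_review, or block."""
    if confidence_harmful >= policy.block_threshold:
        return "block"
    if confidence_harmful >= policy.flag_threshold:
        return "flag_for_review"
    return "allow"


# Assumed example policies; a stricter use case flags at a lower score.
POLICIES = {
    "content_moderation": Policy(flag_threshold=0.5, block_threshold=0.9),
    "fraud_detection": Policy(flag_threshold=0.3, block_threshold=0.8),
}

moderation = decide(0.4, POLICIES["content_moderation"])
fraud = decide(0.4, POLICIES["fraud_detection"])
```

With these assumed thresholds, the same score of 0.4 is allowed under the moderation policy but flagged for review under the stricter fraud policy.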

Supported File Types

Veriafy includes 12 specialized extractors for different content types, including:

•Images: PDQ + CLIP
•Videos: TMK + PDQF
•Audio: Chromaprint
•PDF: SimHash + SBERT
•Office: Text + Layout
•Header + Body
•Code: AST + Semantic
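
The pairing of content types with extractors can be sketched as a lookup table. The extractor names come from the list above; the file extensions, the dictionary names, and the extractors_for function are assumptions. The "Header + Body" pair is left out because its content type is not labeled.

```python
from pathlib import Path
from typing import Dict, Tuple

# Content type -> extractor names, as listed in this section.
EXTRACTORS: Dict[str, Tuple[str, ...]] = {
    "image": ("PDQ", "CLIP"),
    "video": ("TMK", "PDQF"),
    "audio": ("Chromaprint",),
    "pdf": ("SimHash", "SBERT"),
    "office": ("Text", "Layout"),
    "code": ("AST", "Semantic"),
}

# Hypothetical extension table; a real system would likely sniff content.
EXTENSION_TO_TYPE: Dict[str, str] = {
    ".png": "image",
    ".jpg": "image",
    ".mp4": "video",
    ".mp3": "audio",
    ".pdf": "pdf",
    ".docx": "office",
    ".py": "code",
}


def extractors_for(path: str) -> Tuple[str, ...]:
    """Return the extractor names for a file, based on its extension."""
    content_type = EXTENSION_TO_TYPE.get(Path(path).suffix.lower())
    if content_type is None:
        raise ValueError(f"no extractor registered for {path!r}")
    return EXTRACTORS[content_type]


image_extractors = extractors_for("photo.JPG")
```

Dispatching on the extension keeps the sketch short; unknown extensions raise an error rather than falling back to a default extractor.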

Use Cases

•Content Moderation: Detect NSFW, violent, or illegal content without human reviewers seeing flagged material
•Fraud Detection: Identify fraudulent documents without accessing sensitive financial data
•Healthcare: Classify medical records without exposing protected health information, supporting HIPAA compliance
•Legal: eDiscovery and document classification without exposing privileged attorney-client material
•Security: Malware detection without executing suspicious files

Next Steps

•Installation Guide
•Quick Start Tutorial
