• Product
  • Pricing
  • Docs
  • Using PostHog
  • Community
  • Company
  • Login
  • Docs

  • Overview
    • Quickstart with PostHog Cloud
    • Overview
      • AWS
      • Azure
      • DigitalOcean
      • Google Cloud Platform
      • Hobby
      • EU Hosting Companies
      • Other platforms
      • Instance settings
      • Environment variables
      • Securing PostHog
      • Monitoring with Grafana
      • Running behind a proxy
      • Configuring email
      • Helm chart configuration
      • Deploying ClickHouse using Altinity.Cloud
      • Configuring Slack
      • Overview
        • Overview
        • Upgrade notes
        • Overview
        • 0001-events-sample-by
        • 0002_events_sample_by
        • 0003_fill_person_distinct_id2
        • ClickHouse
          • Backup
          • Debug hanging / freezing process
          • Horizontal scaling (Sharding & replication)
          • Kafka Engine
          • Resize disk
          • Restore
          • Vertical scaling
        • Kafka
          • Resize disk
          • Log retention
        • PostgreSQL
          • Resize disk
          • Troubleshooting long-running migrations
        • Plugin server
        • MinIO
        • Redis
        • Zookeeper
      • Disaster recovery
    • Troubleshooting and FAQs
    • Support for self-hosting (open-source and enterprise)
    • Managing hosting costs
    • Overview
    • Ingest live data
    • Ingest historical data
    • Identify users
    • User properties
    • Deploying a reverse proxy
    • Library comparison
    • Badge
    • Browser Extensions
      • Snippet installation
      • Android
      • iOS
      • JavaScript
      • Flutter
      • React Native
      • Node.js
      • Go
      • Python
      • Rust
      • Java
      • PHP
      • Ruby
      • Elixir
      • Docusaurus v2
      • Gatsby
      • Google Tag Manager
      • Next.js
      • Nuxt.js
      • Retool
      • RudderStack
      • Segment
      • Sentry
      • Slack
      • Shopify
      • WordPress
      • Message formatting
      • Microsoft Teams
      • Slack
      • Discord
    • To another self-hosted instance
    • To PostHog from Amplitude
    • To PostHog Cloud EU
    • Between Cloud and self-hosted
    • Overview
    • Tutorial
    • Troubleshooting
    • Developer reference
    • Using the PostHog API
    • Jobs
    • Testing
    • TypeScript types
    • Overview
    • POST-only public endpoints
    • Actions
    • Annotations
    • Cohorts
    • Dashboards
    • Event definitions
    • Events
    • Experiments
    • Feature flags
    • Funnels
    • Groups
    • Groups types
    • Insights
    • Invites
    • Members
    • Persons
    • Plugin configs
    • Plugins
    • Projects
    • Property definitions
    • Session recordings
    • Trends
    • Users
    • Data model
    • Overview
    • Data model
    • Ingestion pipeline
    • ClickHouse
    • Querying data
    • Overview
    • GDPR guidance
    • HIPAA guidance
    • CCPA guidance
    • Data egress & compliance
    • Data deletion
    • Overview
    • Code of conduct
    • Recognizing contributions
  • Using PostHog

  • Table of contents
      • Dashboards
      • Funnels
      • Group Analytics
      • Insights
      • Lifecycle
      • Path analysis
      • Retention
      • Stickiness
      • Trends
      • Heatmaps
      • Session Recording
      • Correlation Analysis
      • Experimentation
      • Feature Flags
      • Actions
      • Annotations
      • Cohorts
      • Data Management
      • Events
      • Persons
      • Sessions
      • UTM segmentation
      • Team collaboration
      • Organizations & projects
      • Settings
      • SSO & SAML
      • Toolbar
      • Notifications & alerts
    • Overview
      • Amazon Kinesis Import
      • BitBucket Release Tracker
      • Event Replicator
      • GitHub Release Tracker
      • GitHub Star Sync
      • GitLab Release Tracker
      • Heartbeat
      • Ingestion Alert
      • Email Scoring
      • n8n Connector
      • Orbit Connector
      • Redshift Import
      • Segment Connector
      • Shopify Connector
      • Twitter Followers Tracker
      • Zendesk Connector
      • Airbyte Exporter
      • Amazon S3 Export
      • BigQuery Export
      • Customer.io Connector
      • Databricks Export
      • Engage Connector
      • GCP Pub/Sub Connector
      • Google Cloud Storage Export
      • Hubspot Connector
      • Intercom Connector
      • Migrator 3000
      • PagerDuty Connector
      • PostgreSQL Export
      • Redshift Export
      • RudderStack Export
      • Salesforce Connector
      • Sendgrid Connector
      • Sentry Connector
      • Snowflake Export
      • Twilio Connector
      • Variance Connector
      • Zapier Connector
      • Downsampler
      • Event Sequence Timer
      • First Time Event Tracker
      • Property Filter
      • Property Flattener
      • Schema Enforcer
      • Taxonomy Standardizer
      • Unduplicator
      • Automatic Cohort Creator
      • Currency Normalizer
      • GeoIP Enricher
      • Timestamp Parser
      • URL Normalizer
      • User Agent Populator
  • Tutorials
    • All tutorials
    • Actions
    • Apps
    • Cohorts
    • Dashboards
    • Feature flags
    • Funnels
    • Heatmaps
    • Path analysis
    • Retention
    • Session recording
    • Trends
  • Support
  • Glossary
  • Docs

  • Overview
    • Quickstart with PostHog Cloud
    • Overview
      • AWS
      • Azure
      • DigitalOcean
      • Google Cloud Platform
      • Hobby
      • EU Hosting Companies
      • Other platforms
      • Instance settings
      • Environment variables
      • Securing PostHog
      • Monitoring with Grafana
      • Running behind a proxy
      • Configuring email
      • Helm chart configuration
      • Deploying ClickHouse using Altinity.Cloud
      • Configuring Slack
      • Overview
        • Overview
        • Upgrade notes
        • Overview
        • 0001-events-sample-by
        • 0002_events_sample_by
        • 0003_fill_person_distinct_id2
        • ClickHouse
          • Backup
          • Debug hanging / freezing process
          • Horizontal scaling (Sharding & replication)
          • Kafka Engine
          • Resize disk
          • Restore
          • Vertical scaling
        • Kafka
          • Resize disk
          • Log retention
        • PostgreSQL
          • Resize disk
          • Troubleshooting long-running migrations
        • Plugin server
        • MinIO
        • Redis
        • Zookeeper
      • Disaster recovery
    • Troubleshooting and FAQs
    • Support for self-hosting (open-source and enterprise)
    • Managing hosting costs
    • Overview
    • Ingest live data
    • Ingest historical data
    • Identify users
    • User properties
    • Deploying a reverse proxy
    • Library comparison
    • Badge
    • Browser Extensions
      • Snippet installation
      • Android
      • iOS
      • JavaScript
      • Flutter
      • React Native
      • Node.js
      • Go
      • Python
      • Rust
      • Java
      • PHP
      • Ruby
      • Elixir
      • Docusaurus v2
      • Gatsby
      • Google Tag Manager
      • Next.js
      • Nuxt.js
      • Retool
      • RudderStack
      • Segment
      • Sentry
      • Slack
      • Shopify
      • WordPress
      • Message formatting
      • Microsoft Teams
      • Slack
      • Discord
    • To another self-hosted instance
    • To PostHog from Amplitude
    • To PostHog Cloud EU
    • Between Cloud and self-hosted
    • Overview
    • Tutorial
    • Troubleshooting
    • Developer reference
    • Using the PostHog API
    • Jobs
    • Testing
    • TypeScript types
    • Overview
    • POST-only public endpoints
    • Actions
    • Annotations
    • Cohorts
    • Dashboards
    • Event definitions
    • Events
    • Experiments
    • Feature flags
    • Funnels
    • Groups
    • Groups types
    • Insights
    • Invites
    • Members
    • Persons
    • Plugin configs
    • Plugins
    • Projects
    • Property definitions
    • Session recordings
    • Trends
    • Users
    • Data model
    • Overview
    • Data model
    • Ingestion pipeline
    • ClickHouse
    • Querying data
    • Overview
    • GDPR guidance
    • HIPAA guidance
    • CCPA guidance
    • Data egress & compliance
    • Data deletion
    • Overview
    • Code of conduct
    • Recognizing contributions
  • Using PostHog

  • Table of contents
      • Dashboards
      • Funnels
      • Group Analytics
      • Insights
      • Lifecycle
      • Path analysis
      • Retention
      • Stickiness
      • Trends
      • Heatmaps
      • Session Recording
      • Correlation Analysis
      • Experimentation
      • Feature Flags
      • Actions
      • Annotations
      • Cohorts
      • Data Management
      • Events
      • Persons
      • Sessions
      • UTM segmentation
      • Team collaboration
      • Organizations & projects
      • Settings
      • SSO & SAML
      • Toolbar
      • Notifications & alerts
    • Overview
      • Amazon Kinesis Import
      • BitBucket Release Tracker
      • Event Replicator
      • GitHub Release Tracker
      • GitHub Star Sync
      • GitLab Release Tracker
      • Heartbeat
      • Ingestion Alert
      • Email Scoring
      • n8n Connector
      • Orbit Connector
      • Redshift Import
      • Segment Connector
      • Shopify Connector
      • Twitter Followers Tracker
      • Zendesk Connector
      • Airbyte Exporter
      • Amazon S3 Export
      • BigQuery Export
      • Customer.io Connector
      • Databricks Export
      • Engage Connector
      • GCP Pub/Sub Connector
      • Google Cloud Storage Export
      • Hubspot Connector
      • Intercom Connector
      • Migrator 3000
      • PagerDuty Connector
      • PostgreSQL Export
      • Redshift Export
      • RudderStack Export
      • Salesforce Connector
      • Sendgrid Connector
      • Sentry Connector
      • Snowflake Export
      • Twilio Connector
      • Variance Connector
      • Zapier Connector
      • Downsampler
      • Event Sequence Timer
      • First Time Event Tracker
      • Property Filter
      • Property Flattener
      • Schema Enforcer
      • Taxonomy Standardizer
      • Unduplicator
      • Automatic Cohort Creator
      • Currency Normalizer
      • GeoIP Enricher
      • Timestamp Parser
      • URL Normalizer
      • User Agent Populator
  • Tutorials
    • All tutorials
    • Actions
    • Apps
    • Cohorts
    • Dashboards
    • Feature flags
    • Funnels
    • Heatmaps
    • Path analysis
    • Retention
    • Session recording
    • Trends
  • Support
  • Glossary
  • Docs
  • Apps
  • Ingestion filtering
  • Unduplicator

Unduplicator

Last updated: Aug 24, 2022

On this page

  • What does the Unduplicator do?
  • How does the Unduplicator work?
  • What are the requirements for this app?
  • How do I install the Unduplicator app for PostHog?
  • Configuration
  • Is the source code for this app available?
  • Who created this app?
  • What if I have feedback on this app?
  • What if my question isn't answered above?

What does the Unduplicator do?

This app prevents duplicate events from being ingested into PostHog. It's particularly helpful if you're backfilling information while you're already ingesting ongoing events.

How does the Unduplicator work?

The Unduplicator crafts an event UUID based on key properties for the event, so if the event is the same (see below for definition) it'll end with the same UUID.

When events are processed by ClickHouse, the database will automatically dedupe events which have the same toDate(timestamp), event, distinct_id and uuid combo, effectively making sure duplicates are not stored.

The app has two modes that define what's considered a duplicate event. Either mode will prevent duplicates within a project, though duplicates across projects are still permitted.

  • Event and Timestamp. An event will be treated as duplicate if the timestamp, event name and user's distinct ID matches exactly, regardless of what internal properties are included.
  • All Properties. An event will be treated as duplicate only if all properties match exactly, as well as the timestamp, event name and user's distinct ID.

What are the requirements for this app?

The Unduplicator requires either PostHog Cloud, or a self-hosted PostHog instance running version 1.30.0 or later.

Not running 1.30.0? Find out how to update your self-hosted PostHog deployment!

How do I install the Unduplicator app for PostHog?

  1. Log in to your PostHog instance
  2. Click 'Apps' on the left-hand tool bar
  3. Search for 'Unduplicates'
  4. Select the Unduplicator app, press 'Install'.
  5. Once the app is installed, press the blue button to configure the app and select which of the de-duplication methods you want to use (described above).

Configuration

OptionDescription
Dedup Mode
Type: choice
Required: True

Is the source code for this app available?

PostHog is open source and so are all apps on the platform. The source code for the Unduplicator app is available on GitHub.

Who created this app?

We'd like to thank former PostHog team member Paolo D'Amico for creating the Unduplicator. We miss you, Paolo!

What if I have feedback on this app?

We love feature requests and feedback! Please create an issue to tell us what you think.

What if my question isn't answered above?

We love answering questions. Ask us anything via our FAQ page.

You can also join the PostHog Community Slack group to collaborate with others and get advice on developing your own PostHog apps.

Questions?

Was this page useful?

Next article

Automatic Cohort Creator

Warning: We are currently in the process of deprecating this app in favor of group analytics , and do not recommend installing it at this time. What does the Automatic Cohort Creator app do? The Automatic Cohort Creator app enables you to specify a list of user properties which will be used to automatically assign new users to a cohort. What are the requirements for this app? The Automatic Cohort Creator app requires either PostHog Cloud, or a self-hosted PostHog instance running version 1.3…

Read next article

Authors

  • Paul Hultgren
    Paul Hultgren
  • Eli Kinsey
    Eli Kinsey

Share

Jump to:

  • What does the Unduplicator do?
  • How does the Unduplicator work?
  • What are the requirements for this app?
  • How do I install the Unduplicator app for PostHog?
  • Configuration
  • Is the source code for this app available?
  • Who created this app?
  • What if I have feedback on this app?
  • What if my question isn't answered above?
  • Questions?
  • Edit this page
  • Raise an issue
  • Toggle content width
  • Toggle dark mode
  • Product

  • Overview
  • Pricing
  • Product analytics
  • Session recording
  • A/B testing
  • Feature flags
  • Apps
  • Customer stories
  • PostHog vs...
  • Docs

  • Quickstart guide
  • Self-hosting
  • Installing PostHog
  • Building an app
  • API
  • Webhooks
  • How PostHog works
  • Data privacy
  • Using PostHog

  • Product manual
  • Apps manuals
  • Tutorials
  • Community

  • Questions?
  • Product roadmap
  • Contributors
  • Partners
  • Newsletter
  • Merch
  • PostHog FM
  • PostHog on GitHub
  • Handbook

  • Getting started
  • Company
  • Strategy
  • How we work
  • Small teams
  • People & Ops
  • Engineering
  • Product
  • Design
  • Marketing
  • Customer success
  • Company

  • About
  • Team
  • Investors
  • Press
  • Blog
  • FAQ
  • Support
  • Careers
© 2022 PostHog, Inc.
  • Code of conduct
  • Privacy policy
  • Terms