Writing a data shredder

If your plugin adds the ability to store personal data within pretix, you should also implement a “data shredder” to anonymize or pseudonymize the data later.

Shredder registration

The data shredder API makes little use of signals, but it does use one to collect the list of all available data shredders. Your plugin should listen for this signal and return the subclass of pretix.base.shredder.BaseDataShredder that we’ll provide in this plugin:

from django.dispatch import receiver

from pretix.base.signals import register_data_shredders


@receiver(register_data_shredders, dispatch_uid="custom_data_shredders")
def register_shredder(sender, **kwargs):
    return [
        PluginDataShredder,
    ]

The shredder class

class pretix.base.shredder.BaseDataShredder

The central object of each data shredder is its subclass of BaseDataShredder.

BaseDataShredder.event

The default constructor sets this property to the event we are currently working for.

BaseDataShredder.identifier

A short and unique identifier for this shredder. This should only contain lowercase letters and in most cases will be the same as your package name.

This is an abstract attribute; you must override it.

BaseDataShredder.verbose_name

A human-readable name for what this shredder removes. This should be short but self-explanatory. Good examples include ‘E-Mail addresses’ or ‘Invoices’.

This is an abstract attribute; you must override it.

BaseDataShredder.description

A more detailed description of what this shredder does. Can contain HTML.

This is an abstract attribute; you must override it.

BaseDataShredder.generate_files() → List[Tuple[str, str, str]]

This method is called to export the data that is about to be shredded. It returns a list of tuples, each consisting of a filename, a file type, and the file content.

You can also implement this as a generator and yield those tuples instead of returning a list of them.
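The generator form might look roughly like the following sketch. To keep it runnable without a pretix installation, it uses a stand-in base class and a hypothetical `NewsletterShredder` with hard-coded sample data; none of these names come from pretix. In a real plugin you would subclass pretix.base.shredder.BaseDataShredder and query your own models for `self.event`.

```python
import json
from typing import Iterator, Tuple


class StandInShredder:
    """Stand-in for pretix.base.shredder.BaseDataShredder (an assumption, not the real class)."""

    def __init__(self, event):
        self.event = event


class NewsletterShredder(StandInShredder):
    # Hypothetical plugin shredder; names and data are illustrative only.
    identifier = "newsletter"
    verbose_name = "Newsletter subscriptions"
    description = "Removes newsletter subscriptions stored by this plugin."

    def _subscriptions(self):
        # A real plugin would query its own models for self.event here.
        return {"ABC12": "alice@example.org", "DEF34": "bob@example.org"}

    def generate_files(self) -> Iterator[Tuple[str, str, str]]:
        # Yield one (filename, file type, file content) tuple per exported file.
        yield (
            "newsletter-subscriptions.json",
            "application/json",
            json.dumps(self._subscriptions(), indent=4),
        )


# Consuming the generator gives the same tuples a list-returning version would.
files = list(NewsletterShredder(event=None).generate_files())
```

Yielding is mostly a convenience when a shredder exports several files, since each one can be produced as soon as its content is ready.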

BaseDataShredder.shred_data()

This method is called to actually remove the data from the system. You should remove any database objects here.

You should never delete LogEntry objects, but you might modify them to remove personal data. In this case, set the LogEntry.shredded attribute to True to show that this is no longer original log data.
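As a rough illustration of modifying a log entry instead of deleting it, the sketch below masks personal fields in a JSON payload and flags the entry as shredded. `FakeLogEntry` and `mask_fields` are hypothetical helpers written for this example only; they merely mimic the `data` and `shredded` attributes of a log entry and are not part of pretix.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeLogEntry:
    """Stand-in with only the two attributes used here (an assumption, not pretix's LogEntry)."""
    data: str
    shredded: bool = False


def mask_fields(entry: FakeLogEntry, personal_keys: set) -> None:
    """Replace non-empty personal values with a block character and flag the entry."""
    payload = json.loads(entry.data)
    changed = False
    for key in personal_keys:
        if payload.get(key):
            payload[key] = "█"
            changed = True
    if changed:
        # Only flag entries whose content was actually altered.
        entry.data = json.dumps(payload)
        entry.shredded = True


entry = FakeLogEntry(data=json.dumps({"email": "alice@example.org", "item": 5}))
mask_fields(entry, {"email"})
```

With a real model instance you would additionally persist the change, for instance by saving only the modified fields.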

Example

For example, the core data shredder responsible for removing invoice address information, including its history, looks like this:

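import json
from typing import List, Tuple

from django.db import transaction
from django.utils.translation import gettext_lazy as _

# Import paths below are assumed from the pretix code base; BaseDataShredder
# lives in pretix.base.shredder, the module this core shredder is part of.
from pretix.api.serializers.order import InvoiceAddressSerializer
from pretix.base.models import InvoiceAddress


class InvoiceAddressShredder(BaseDataShredder):
    verbose_name = _('Invoice addresses')
    identifier = 'invoice_addresses'
    description = _('This will remove all invoice addresses from orders, '
                    'as well as logged changes to them.')

    def generate_files(self) -> List[Tuple[str, str, str]]:
        # Export all invoice addresses of this event as one JSON file,
        # keyed by order code.
        yield 'invoice-addresses.json', 'application/json', json.dumps({
            ia.order.code: InvoiceAddressSerializer(ia).data
            for ia in InvoiceAddress.objects.filter(order__event=self.event)
        }, indent=4)

    @transaction.atomic
    def shred_data(self):
        # Delete the invoice addresses themselves.
        InvoiceAddress.objects.filter(order__event=self.event).delete()

        # Keep the log entries, but mask any logged invoice data.
        for le in self.event.logentry_set.filter(action_type="pretix.event.order.modified"):
            d = le.parsed_data
            if 'invoice_data' in d and not isinstance(d['invoice_data'], bool):
                for field in d['invoice_data']:
                    if d['invoice_data'][field]:
                        d['invoice_data'][field] = '█'
                le.data = json.dumps(d)
                le.shredded = True
                le.save(update_fields=['data', 'shredded'])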