ProcessWire modules for importing and handling large data sets.

DataSet

DataSet is a set of ProcessWire modules for importing, manipulating, and displaying large data sets (50k+ entries).
The software was developed for the [Mikes-dictionary] and other Digital Humanities projects.

Main features


  • import data from CSV and XML sources
  • user-configurable input <-> field mappings
  • on-the-fly field data composition
  • download external resources (files, images)
  • purge, extend or overwrite existing data (PW pages and their fields)
  • handle page references and option fields
  • fairly low resource requirements (uses Tasker to execute long-running jobs)
  • and more (filtering, limits, default values, etc.)

How to use it


See the wiki.

History


The first version was created in 2017 to import a large XML dataset into ProcessWire pages.
The CSV import sub-module was created in 2018. It was tested by importing a large dataset containing 200k+ entries and many kinds of references between them.
The CSV + PDF import was developed in 2019 to create a complete digital library using a single CSV upload.

License


The "GitHub version" of the software is licensed under MPL 2.0.

Install and use modules at your own risk. Always have a site and database backup before installing new modules.
