ProcessWire modules for importing and handling large data sets.

DataSet

It is a set of ProcessWire modules for importing, manipulating and displaying large (50k+ entries) data sets.
The software was developed for the [Mikes-dictionary] and other Digital Humanities projects.

Main features


  • import data from CSV and XML sources
  • user configurable input <-> field mappings
  • on-the-fly field data composition
  • supports downloading external resources (files, images)
  • purge, extend or overwrite existing data (PW pages and their fields)
  • handle page references and option fields
  • fairly low resource requirements (uses Tasker to execute long-running jobs)
  • and many more (filtering, limits, default values etc.)

How to use it


See the wiki.

History


The first version was created in 2017 to import a large XML dataset into ProcessWire pages.
The CSV import sub-module was created in 2018. It was tested to import large dataset containing 200k+ entries and many kinds of references between them.
The CSV + PDF import was developed in 2019 to create a complete digital library using a single CSV upload.

License


The "github-version" of the software is licensed under MPL 2.0.

Install and use modules at your own risk. Always have a site and database backup before installing new modules.

Latest news

  • ProcessWire Weekly #570
    The 570th issue of ProcessWire Weekly brings in all the latest news from the ProcessWire community. Modules, sites, and more. Read on!
    Weekly.pw / 12 April 2025
  • ProcessWire 3.0.244 new main/master version
    ProcessWire 3.0.244 is our newest main/master/stable version. It's been more than a year in the making and is packed with tons of new features, issue fixes, optimizations and more. This post covers all the details.
    Blog / 18 January 2025
  • Subscribe to weekly ProcessWire news

“To Drupal, or to ProcessWire? The million dollar choice. We decided to make an early switch to PW. And in retrospect, ProcessWire was probably the best decision we made. Thanks are due to ProcessWire and the amazing system and set of modules that are in place.” —Unni Krishnan, Founder of PigtailPundits