This year, for its annual Fall Workshop, the Michigan Archival Association is sponsoring the Mid-Michigan Digital Practitioners (MMDP) workshop! Because of this generous donation, MMDP is giving MAA members the opportunity to register first. Registration will open to everyone on August 24th, so be sure to reserve your spot as early as possible! The workshop is free of charge, and you are also welcome to attend the MMDP meeting held the following day, October 9 (details on the registration page).
The free half-day workshop will be held October 8 at Albion College. The first half will focus on Structured Data Wrangling, and the second half will switch to Leveraging DH Methods and Tools in the Archive (details below). To register, follow this link: https://www.surveymonkey.com/r/JR6QYF6.
Structured Data Wrangling
Instructors: Max Eckard, Dallas Pillen – Bentley Historical Library
Whether you’re a historical society managing Past Perfect catalog records, an academic library doing visualizations on subject relationships in MARC records, or an archive trying to reconcile years of legacy descriptive practices in EAD, chances are that you have some structured data or metadata sitting around somewhere, and it’s messy!
At the completion of this workshop, attendees will be able to:
1. Use Python and lxml to iterate through a directory of XML files and output the contents and location (XPath) of a particular node to a CSV file;
2. Use OpenRefine and the Google Refine Expression Language (GREL) to clean that CSV file; and
3. Use Python and lxml on an export from OpenRefine to update the original XML files with the cleaned nodes.
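For a rough sense of what steps 1 and 3 might look like, here is a minimal sketch. It is not the instructors' workshop code. The function names (export_nodes, apply_cleaned) and the CSV column names (filename, xpath, text) are hypothetical, and it assumes the XML files do not use namespaces and that lxml is installed. Step 2, the cleanup itself, happens in OpenRefine between the two calls.

```python
import csv
from pathlib import Path

from lxml import etree


def export_nodes(xml_dir, tag, csv_path):
    """Step 1: write the file name, XPath, and text of every <tag> element to a CSV."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "xpath", "text"])  # hypothetical column names
        for xml_file in sorted(Path(xml_dir).glob("*.xml")):
            tree = etree.parse(str(xml_file))
            for node in tree.iter(tag):
                # getpath() returns an absolute XPath that locates this exact node
                writer.writerow([xml_file.name, tree.getpath(node), node.text or ""])
                count += 1
    return count


def apply_cleaned(xml_dir, csv_path):
    """Step 3: read the cleaned CSV and write changed text back into the XML files."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    # Group rows by file so each XML file is parsed and saved only once
    by_file = {}
    for row in rows:
        by_file.setdefault(row["filename"], []).append(row)

    changed = 0
    for filename, file_rows in by_file.items():
        path = Path(xml_dir) / filename
        tree = etree.parse(str(path))
        for row in file_rows:
            matches = tree.xpath(row["xpath"])
            if not matches:
                continue  # the XPath no longer matches; leave the file alone
            node = matches[0]
            if (node.text or "") != row["text"]:
                node.text = row["text"]
                changed += 1
        tree.write(str(path), encoding="utf-8", xml_declaration=True)
    return changed
```

A call such as export_nodes("eads", "unitdate", "dates.csv") would produce the CSV to load into OpenRefine. After exporting the cleaned file with the same three columns, apply_cleaned("eads", "dates.csv") would write the changes back. Both the directory and file names here are placeholders.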
We’ll just be covering the basics, but we hope this workshop will help to open your eyes to new and automated possibilities for cleaning data back at your institution!
Leveraging DH Methods and Tools in the Archive
Instructors: Thomas Padilla, Devin Higgins – MSU Libraries
This workshop will be geared toward folks interested in exploring how methods and tools commonly used in the Digital Humanities could be leveraged to (1) enhance collections, (2) provide alternative means for navigating collections, and (3) shorten the time between collection acquisition and user access.
Tools used in this workshop will be the Stanford Named Entity Recognizer and the Topic Modeling Tool.
The workshop will teach attendees about two techniques: Named Entity Recognition (the automatic recognition of people, organizations, and geographic places in unstructured text) and Topic Modeling (an algorithmic method for surfacing themes in unstructured text).
1:00 PM – 1:15 PM: Welcome
1:30 PM – 2:45 PM: Structured Data Wrangling (Max Eckard, Dallas Pillen – Bentley Historical Library)
2:45 PM – 3:00 PM: Break
3:00 PM – 4:15 PM: Leveraging DH Methods and Tools in the Archive (Thomas Padilla, Devin Higgins – MSU Libraries)
4:15 PM – 4:30 PM: Wrap-up
4:30 PM – ?: Networking (offsite)