Challenges and Considerations
Beyond the minor challenges encountered during the digitization workflow, several larger considerations came to our attention.
Integration with Archivematica
The workflow was developed from the best practices found in the available literature, without considering integration with ARCS’ existing digital preservation system, Archivematica. Our workflow proposes that all analysis and appraisal of the disk image and its files be conducted in BitCurator before the SIP is uploaded to Archivematica. However, Archivematica includes an ingest workflow specific to disk images. That workflow begins with uploading the disk image file to Archivematica and then extracts the files, which can be examined and manipulated using some of the same tools available in BitCurator, such as bulk_extractor.
For the purposes of this project, we decided to complete the workflow in BitCurator up to the point of ingest. In the future, ARCS will need to consider which tool is more appropriate for file extraction, analysis, and appraisal. Though some of the capabilities of Archivematica and BitCurator overlap, Archivematica is not equipped to deal with damaged or more unusual disk formats. However, using Archivematica earlier in the process, for example beginning at the extract files stage, could potentially streamline the digitization workflow.
Legacy file formats
Once the disk image is created and the files are extracted, it may be impossible to open the newly extracted files if they are in legacy file formats such as WordPerfect. BitCurator does not have the tools to open and read these types of files. To do so, archivists will need to find external tools, such as the following (Erway, 2013):
- Quick View Plus: https://www.avantstar.com/metro/visit
- TreeSize Professional: https://www.jam-software.com/treesize/
- WinDirStat (Windows): https://windirstat.info/
- Disk Inventory X (Mac OS X 10.3 and later): https://www.derlien.com/
- IrfanView: https://www.irfanview.com/
- Inkscape: https://inkscape.org/
- VLC Media Player: https://www.videolan.org/vlc/index.html
Appraisal decisions need to weigh the difficulty of accessing and migrating files against their potential value. The availability of special software to access files or the obsolescence of the format may impact appraisal decisions (McGuire, 2018).
During ingest, Archivematica uses FIDO and Siegfried, both PRONOM-based programs, to identify file formats. However, best practice is to use a file identification program during the Appraisal stage before uploading files into Archivematica as a SIP. Standard tools such as DROID, JHOVE and FITS may be used for this purpose (McGuire, 2018).
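As a rough illustration of what signature-based identification involves at the Appraisal stage, the sketch below checks the first bytes of each extracted file against a handful of well-known signatures. It is not how DROID, Siegfried, or FIDO are implemented: those tools match against the full PRONOM registry, whereas the `SIGNATURES` table, the function names, and the directory layout here are assumptions made only for the example.

```python
from pathlib import Path

# Illustrative subset of leading-byte signatures (an assumption for this
# sketch; PRONOM-based tools use a far larger and more precise registry).
SIGNATURES = [
    (b"\xffWPC", "WordPerfect document"),
    (b"%PDF-", "PDF document"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. legacy Microsoft Office)"),
    (b"PK\x03\x04", "ZIP container (e.g. OOXML or ODF)"),
]


def identify_bytes(header: bytes) -> str:
    """Return a label for the first matching signature, or 'unidentified'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unidentified"


def survey_formats(root: str) -> dict:
    """Map each file under `root` (path relative to root) to a format label."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            with path.open("rb") as fh:
                header = fh.read(16)  # the signatures above are all shorter than 16 bytes
            results[path.relative_to(root).as_posix()] = identify_bytes(header)
    return results
```

Calling `survey_formats` on a directory of extracted files (a hypothetical path such as `"extracted_files"`) gives a quick picture of how many files fall into legacy or unidentified formats, which can feed into the appraisal decisions described above. For actual appraisal work, a PRONOM-based tool remains the appropriate choice.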
What to do with the disk after digitization
Because little research addresses the question, it is hard to say exactly what should be done with a 3.5-inch floppy disk after it has been digitized. Despite extensive searching, we found no literature describing what was done with the physical diskette once a SIP and an AIP had been created for the disk image and extracted files. We can therefore only propose a solution based on our own experience. Since the disk image is essentially a digital replica of the physical floppy disk, including all parts of the disk that are usually not visible to users, it may be sufficient to retain the disk image for long-term preservation. Preserving the disk image is akin to preserving the physical disk, but in a format that can be accessed and manipulated in the future should the physical disk become unreadable.
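If the disk image is to stand in for the physical disk, it helps to be able to show that the image has not changed since it was created. The following is a minimal sketch of such a fixity check, assuming a SHA-256 digest was recorded when the image was made; the function names and the choice of SHA-256 are assumptions for illustration and are not part of the workflow described in this report.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_digest):
    """Return True if the file's current digest matches the recorded one."""
    return sha256_of(path) == recorded_digest.strip().lower()
```

A digest recorded at imaging time can then be compared against the stored image at intervals; a mismatch would indicate that the image can no longer be treated as a faithful replica of the original disk.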
Ethical Considerations
When disk images of floppy disks are created using BitCurator, documents that were deleted from the floppy disk are still preserved and accessible in the image. Archivists should ask donors about their intentions regarding deleted documents at the beginning of the acquisition process. If it is impossible to contact the donor, the applicable codes of ethics require that the recovered deleted documents be immediately removed, since the author likely did not intend to give access to these documents (International Council on Archives, 1996; Association of Canadian Archivists, 2017). In this context, it is important to inform any potential donor that deleted files can be recovered and accessed.
In the case of the Monique Frize fonds, the decision was made to remove the deleted files in the Initial Analysis stage.