
“Garbage In…”


January 27, 2015 by jeldredge1

Any traditional history or public history scholar has to become familiar with digital scholarship and its tools; there is no way to avoid them in our field any longer. This week's readings raised many issues, including the pros and cons of digital media, and I will discuss some of the aspects that intrigued me.

I have recently been knee-deep in digital and traditional research for my GRA, and I have repeatedly been reminded of the adage, “garbage in equals garbage out.” That is, when we use digital sources and resources, we are at the mercy of whoever created the metadata. I found a digital copy of an Atlanta Journal article that detailed some City Council info I needed, but the page had been copied while creased, and the words closest to the binding were compressed and unreadable. I’ll have to find a copy closer to the original to try to read, but the digitized copy did allow me to realize that this page contains some of the info I was looking for, and that I need to dig into it further. I am excited about the increased availability and new ways of analyzing historic data that digital media can bring, but historians, their readers, and those they teach need to be aware of their limitations.

As for Wikipedia, I am old enough to remember using my home set of World Book Encyclopedia to write reports in elementary school. Even though so much has changed, students today (starting as early as grade school and continuing through college) need to be repeatedly taught the skills to handle research tools, old and new. As Roy Rosenzweig notes in Clio Wired, teachers need to “spend more time teaching about the limitations of all information sources, including Wikipedia, and emphasizing the skills of critical analysis of primary and secondary sources.” If more people were cognizant of these aspects of all historic writing and sources, then open-source media like Wikipedia would pose fewer problems. In an effort to further improve such sites, Roy Rosenzweig, Alex S. Cummings and Jonathan Jarrett encourage professional historians to contribute to and help edit open access history pages on the web. I agree with this in theory, but personally, I am hesitant to do so after viewing so many troll wars on social media. Amanda Seligman’s article in Writing History in the Digital Age mentions that Larry Sanger, one of the pioneers of Wikipedia, grew disillusioned with the project after too many troll attacks and the hostility that the community can show to experts. Perhaps I would feel more secure and have the extra time later in my career to contribute to such a site, but I am still too leery to do it now.

On the positive side, I do agree with authors like Alex S. Cummings that open history sites can offer an important and often lacking community for professional historians who are at a point in their careers where they are without a readily accessible peer group and feedback. He notes in Writing History in the Digital Age that the informal, public space of Internet collaboration can complement the formal, scholarly pursuit of history, where authors usually do not receive wide-scale feedback until after they have published.

The piece by Brian Maidment really impressed upon me how often digital sources are taken out of context. It’s not something that one usually thinks about while researching, but it is an important point to consider when analyzing evidence and considering it in its original contexts of time and rarity. I was also intrigued by the use of ‘abstract’ keywords and tags for search purposes in the Guildhall Archives. Crowdsourcing possible new descriptors of images can open up new avenues of research, or surface new artifacts for researchers who wouldn’t have found them using traditional search terms.




  1. Adina Langer says:

    Jennie, I appreciate your detailed and well-considered commentary on the readings. I am intrigued in particular by your analysis of metadata in digital archives. It is very true that digitization is only as good as the person (or machine) responsible for digitizing an object or document. When it comes to context and metadata, what do you think is the right balance between time and effort spent on detailed metadata and a focus on getting large amounts of content digitized quickly?

  2. jeldredge1 says:

    I think it is a matter of thinking about the future user when one is digitizing. Human error is always going to be a factor, but will a future researcher be able to read and understand the data on the page I just scanned? A simple error like Smithe instead of Smith doesn’t render the data incoherent. Researchers know that surname spellings often change across documents and will adjust their searches accordingly. But if a page is creased or blurred to the point that it obscures words or other data, then time should be taken to go back and fix it. If the data isn’t legible, then the time taken to digitize was time wasted anyway.
