In January, February, and March 2012, a lot of ideas and scrambling began that will eventually culminate in panels at the 2013 MLA Convention in Boston. I helped (using that word generously) with a few panels put forth by MLA discussion groups (to which I’ve been elected) including the panel sponsored by the Discussion Group on Computers in Languages and Literature and another panel sponsored by the Discussion Group on Bibliography and Textual Studies.

Alan Galey, the organizer for the Bibliography & Textual Studies panel, organized me right onto that panel about Digital Archives and Their Margins where I will talk about some of the issues outlined below:

After commenting on Ted Underwood’s tremendous undertaking, reading Miriam Posner’s blog post, “Some things to think about before you exhort everyone to code,” and reading the really interesting (and enormous) set of comments by the DH community on both posts, I was moved to tweet about a recent data set. Romanticism and Victorianism on the Net came out with its latest journal edition which includes an interesting article about big data, aesthetics and the long 18th century in literature. (Yep, as a Romanticist, I too bristle that some fields persist in trying to subsume Romantic-era literature into the long 18th or 19th centuries…but that’s another story about administrative politics in underfunded departments.)

Updated to add: I’ve also been engaged in some fairly exciting conversations with Jacque Wernimont (see her post on Feminism and Digital Humanities) about all of these topics (and, I would argue, her scholarship has pushed me further in my own areas).

All this has got me thinking about that MLA panel for 2013 and returning to a topic that’s close to the Society for Textual Scholarship 2011 conference panels on feminism, textual studies, and Digital Humanities where nothing was resolved except a stark articulation of gender differences in all of these fields.

As is the thing to do, I sent a concerned tweet which Roger Whitson immediately picked up, and that led to an engaging and interesting conversation along with Lauren Klein. Natalia Cecire storified the entire conversation for us, “From Archival Silence to Glorious Data.” And, possibly, an MLA panel has been born about text mining and textual criticism. Lauren is working on a longer project about this very topic, archival silence and topic modeling.

Roger suggested (post-storify) that

my hope is to be able to find a way to express that absence algorithmically. But I’m being utopian.

The conversation left me with an even larger question (post-storify):

Were British 19th C women more prone to publish in single-author writings or as part of a newspaper, magazine, journal, anthology, etc.?

Off the top of my book history hat, I can’t think of a scholarly project that answers this question definitively. We have case studies, but this explosion of accessibility and data mining has proved that some of these case studies may not be all that universal.

Print culture exploded in the early 19th Century. There are so many documents and texts to digitize that it’s become the job of libraries, who have deeper pockets than some, to curate these collections. And, now Google Books, ECCO, and HathiTrust have become the custodians who also perform the labor of digitization and mark-up. (There are issues with corporations taking over cultural materials, but that’s another topic.) The smaller digital projects, the ones that are usually full of that ephemeral stuff by the non-canonical people, typically languish at this stage, that digitizing and mark-up stage, because the individual whose passion fuels the project has lost some institutional support or funding.

…aaannnndddd, now we come full circle to the conversation about professionalization and what counts — hence my post on doing the risky thing with this gothic stuff.

Nevertheless, the big data sets that are in play in this conversation (on Twitter and here in this post) in both projects are ones that were created by other institutions. If the traditionally marginalized authors are marginalized now because it’s no longer sexy or innovative to digitize and mark up those collections, then how far have we really come? Are those recovery projects then marginalized because they bring nothing innovative to Digital Humanities?

[Caveat: This claim, let's be clear, is not based on funding figures from the NEH Office of Digital Humanities -- that would be an interesting set of numbers to crunch, though: those under-represented peoples as the topic of digital projects. And, to be fair, again, other departments in the NEH are now funding digital projects. Can we obtain those numbers, too, to discover if, say, the Scholarly Editions NEH grants are being awarded to projects that are about creating scholarly editions of a marginalized set? And would that funding primarily support the huge labor of digitizing and mark-up? I write this as my Beard-Stair students struggle with the next step of the project, creating a digital representation of their work with an out-of-the-box platform.]

Book historians and print culture scholars seem uniquely positioned to answer some of these questions because of their propensity for doing big projects (I mean, lifelong career projects) that cover wide swaths of literary history and culture. Lisa Maruca threw down the gauntlet today to book historians to take up this cause about silence in the archives. Ok, I’ll do it. Have been for a while. Will be for a very, very long time. Most of my opinions about Digital Humanities come out of work in print culture and book history, especially on the British literary annuals: some 3000 volumes of poetry, prose, translations, travel narratives, landscape engravings, portraits, women authors, Romantic/Victorian authors, editors, publishers 1823-1860. There’s a treasure trove of materials just waiting, waiting, to be digitized and then mined. But, we’ve never gotten enough money to fund the laborious digitizing and mark-up that’s required. [sigh]

So, how about it, other print culture and book historian types? How about you? Ready to take up this cause?

But, we’re not ready to stop talking about Big Data. Before I could even finish this post, Ted responded via Twitter:

Re: representativeness of collections. I think humanists are still implicitly thinking about canonicity, which is a zero-sum game.

Collection-building is not a zero-sum game. I build collection X, you build Y, she builds Z, we all go public. Then other scholars can

select whatever subset seems to them “representative.” We’re used to assuming there must be a conflict here — but I don’t see one.

Every single book you digitize / normalize / mark up is good news for me, even if you have a view of “representativeness” I don’t share.

Fair enough. Ted’s project is to use the corpus that is available. He’s moved his project beyond that labor of creating the digital representatives and is working on the humanistic queries that are so engaging (to me) in Digital Humanities. To be fair, Ted has promised to return to his data set to add more women authors from the Brown Women Writers Project and such. He gets into some of this in a longer response to representation, big data, and the canon.

But, I return to this ethos: do we have a responsibility to acknowledge the lack, this silence in the archive?

…to be continued at MLA 13….or in the comments below.

(Just a note about scholarly engagement: This is why I really enjoy blogging, tweeting, and other forms of more immediate writing: a conversation begun with Ted’s original post a few days ago has become multiple posts for multiple types of scholars who are commenting in real time on these very cogent, and sometimes urgent, questions.)

Update 3/3/12 10:12pm: Both Jacque Wernimont and Roger Whitson respond to these and related questions circling around the DH in this post. Michael Kramer’s post on DH as Process got me thinking about platforms for exposing digital projects to the world. And the conversation among librarians and archivists over at Kate Theimer’s blog resulted in lengthy comments that indicate a division between scholars and archivists: “The problem with the scholar as ‘archivist,’ or is there a problem?”

[This and a whole series of blogs on digital feminism wound up as an Editors' Choice for Digital Humanities Now, March 5, 2012, in addition to sparking some interesting doings with THATCamp, articles, and contributions.]