middlemarch-critical-histories

Middlemarch Critical Histories Project

What can we learn about literary scholarship by analysing thousands of articles at a time? What can we learn about a literary work by examining which parts of it get repeatedly quoted in (different phases of) its critical afterlife? (And a complementary question: which parts never get quoted?) What can we learn about canon formation, contestation and reformation by considering a highly canonical work not as a single unit but as subject to highly uneven critical attention across its length? What can we learn about literary scholarship as an institution and as a set of practices by tracing patterns of citation as they unfold over time?

These are just some of the questions which our project aims to address. Decades of research in linguistics have shown that when a large enough collection of texts (a corpus) is analysed, patterns emerge which aren’t accessible at a smaller scale. In some cases these patterns confirm our intuitions and expectations; in other cases they are entirely unexpected or counter-intuitive. In either case, corpus methods offer a measure of patterns actually present in the data.

In applying these methods to literary scholarship, we’ve chosen to start small - relatively speaking. George Eliot’s novel Middlemarch is an ideal test case for the following reasons:

We’ll start with Middlemarch then, and attempt to construct a corpus of criticism and scholarship which discusses this novel. But this immediately raises questions and problems. On the one hand, we want as large a corpus as possible, since this will give us the best possible data for making claims about “criticism in general”. On the other hand, we don’t want to fall into the trap of quantity over quality: texts which have been poorly digitised or miscategorised will be useless at best, actively distorting at worst. A more technical question concerns the representativeness of our corpus. How can we best select a sample of all Middlemarch criticism to stand for the whole? And, crucially, how can we do this without creating enormous amounts of work for ourselves? (Aside from valuing our own time, this is important because we want this methodology to be readily expandable to other texts and authors.)

There is also a set of concerns specific to the historical (diachronic) axis. Should we aim for equal numbers of texts from each year/decade, or try to have numbers proportionate to overall output (measured how)? Should we just use whatever we can get our hands on? How far back in time can we go before the categories of literary criticism and scholarship are so different that we are no longer comparing like with like?
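To make the first of these choices concrete, here is a minimal sketch in Python of the two sampling strategies: equal numbers per decade versus numbers proportionate to output. It is an illustration only, not the project's method. The function names, the list of publication years and the target sample size are all hypothetical, and "overall output" is assumed here to mean simply the number of texts we happen to have for each decade.

```python
from collections import Counter


def decade_of(year: int) -> int:
    """Map a publication year to the first year of its decade (1873 -> 1870)."""
    return year - year % 10


def equal_allocation(per_decade: dict, total: int) -> dict:
    """Give every decade the same quota, capped by what is actually available."""
    quota = total // len(per_decade)
    return {decade: min(quota, n) for decade, n in per_decade.items()}


def proportional_allocation(per_decade: dict, total: int) -> dict:
    """Give each decade a quota proportional to its share of available texts.

    Because of rounding, the quotas may not sum to exactly `total`.
    """
    available = sum(per_decade.values())
    return {
        decade: min(n, round(total * n / available))
        for decade, n in per_decade.items()
    }


if __name__ == "__main__":
    # Hypothetical publication years, invented purely for illustration.
    years = [1873, 1875, 1948, 1952, 1955, 1987, 1988, 1989,
             1991, 1994, 1995, 1996]
    per_decade = dict(Counter(decade_of(y) for y in years))

    print("available:   ", per_decade)
    print("equal:       ", equal_allocation(per_decade, total=6))
    print("proportional:", proportional_allocation(per_decade, total=6))
```

Equal allocation gives sparse early decades as much weight as prolific recent ones, but it can fall short of the target when a decade has few surviving texts. Proportional allocation mirrors the shape of what is available, so it inherits whatever biases digitisation and archiving have already introduced.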

We don’t yet have solutions to these problems, and part of the interest in the project is to search for the best (which is not to say perfect) solutions, and indeed to ask ourselves how we should even decide what counts as “best”.

Although we hope to learn and adapt our methods as our research progresses, we have decided on the following initial parameters:

Results