We are having a very productive week here in London. It started with an all-hands meeting on Monday and continues with more focused discussions. We went through all the datasets with our prospective end users and identified metadata properties that could potentially overlap, giving us a chance for meaningful data integration and possibly some inferencing. Besides the obvious spatiotemporal properties (findspots, dates), we have object form type (bowl, jar, coin), material, color, text class (contract, dedication) and several types of persons. The precise set of vocabularies is yet to be defined, but we had a quick go and designed a small graph trying to use the Europeana Data Model (see image).
This exercise generated some questions. We will seek answers to these questions on the Europeana WP3 mailing list.
As a basis for our work on linked data in the humanities we have undertaken a survey of a number of data sets. These include those used in our LaQuAT project (Heidelberger Gesamtverzeichnis, Projet Volterra) as well as many others in the areas of the classics, epigraphy and archeology, including Inscriptions of Aphrodisias, Pleiades, Lexicon of Greek Personal Names, Nomisma.org (ancient coins), papyri.info, Arachne, Inscriptions of Roman Tripolitania, Greek, Roman and Byzantine Pottery at Ilion, Khirbat al-Mudayna al-Aliya excavations, Petra Great Temple, Small Finds database, Ure Museum of Greek Archaeology catalog, and Ure Museum’s images catalog. As part of this survey we have documented how these data sets can be accessed and have made notes on which fields in the data may be suitable candidates for linking with other data sets, focusing on places, people and dates. We plan to review our findings and, from this, select a candidate set of data sets to use to populate and evaluate our proposed infrastructure.
We have designed an architecture for SPQR. The architecture consists of a triple-store of humanities data, populated from existing data sets using RDF imports, XSL transformations from XML, web scraping of HTML pages and querying of REST endpoints. The architecture provides services to support browsing of the linked data, running SPARQL queries over it, running full-text searches to return RDF nodes, linking new data and proposing new links within the existing data. We plan to implement this architecture using off-the-shelf, tried-and-tested linked data components where possible, while striving to deliver an integrated front-end for users. We hope to evolve the architecture and discuss it in detail during an “all hands” project meeting in the first week of November.
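To make the idea of a triple-store and pattern queries concrete, here is a minimal sketch in plain Python. It is a toy stand-in only: the real architecture is meant to use off-the-shelf linked data components, and the wildcard matching below is a much simpler cousin of a SPARQL basic graph pattern, not SPARQL itself. The `TripleStore` class and all `ex:` identifiers and property names are hypothetical, invented for illustration.

```python
from typing import Iterator, NamedTuple, Optional, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """Toy in-memory triple store (illustrative only, not the SPQR implementation)."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add(Triple(subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for triple in sorted(self._triples):  # sorted for a stable result order
            if subject is not None and triple.subject != subject:
                continue
            if predicate is not None and triple.predicate != predicate:
                continue
            if obj is not None and triple.obj != obj:
                continue
            yield triple


store = TripleStore()
# Hypothetical identifiers and properties, assumed for this example.
store.add("ex:inscription/1", "ex:findspot", "ex:place/aphrodisias")
store.add("ex:coin/7", "ex:findspot", "ex:place/aphrodisias")
store.add("ex:coin/7", "ex:material", "bronze")

# "Which objects were found at this place?"
found_at = [
    t.subject
    for t in store.match(predicate="ex:findspot", obj="ex:place/aphrodisias")
]
print(found_at)
```

Because objects from different source data sets point at the same place identifier, a single pattern returns both the inscription and the coin, which is the kind of cross-data-set browsing the services above are intended to support.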
SPQR (Supporting Productive Queries for Research) will look at the potential that Semantic Web (SW) and Linked Data (LD) approaches have for helping humanities researchers to formalise resources and the links between them flexibly, and to create, explore and query these linked resources. Closely allied to LD has been work on ontologies, which provide agreed meanings for both links and the resources they connect. Ontologies can thus act as the semantic mediator between heterogeneous databases, enabling researchers to explore, understand and extend these datasets more productively and so improve the contributions that the data can make to their research.
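As a rough illustration of ontologies acting as a semantic mediator, the sketch below maps differently named fields from two databases onto one shared set of properties, so that records can be compared in common terms. Everything here is assumed for the example: the source names, field names, record values and `ex:` properties are hypothetical and do not describe any of the surveyed data sets.

```python
from typing import Dict, List, Tuple

# Hypothetical per-source field names mapped onto hypothetical shared properties.
FIELD_MAPPINGS: Dict[str, Dict[str, str]] = {
    "inscriptions_db": {"ort": "ex:findspot", "datierung": "ex:date"},
    "coins_db": {"find_place": "ex:findspot", "minted": "ex:date"},
}


def to_shared(
    source: str, record_id: str, record: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Translate one source record into (subject, shared property, value) triples.

    Fields without a mapping are skipped rather than guessed at.
    """
    mapping = FIELD_MAPPINGS[source]
    return [
        (record_id, mapping[field], value)
        for field, value in record.items()
        if field in mapping
    ]


# Made-up records using each source's own field names.
triples = to_shared(
    "inscriptions_db",
    "insc:42",
    {"ort": "Rome", "datierung": "2nd century", "zeilen": "5"},
)
triples += to_shared("coins_db", "coin:7", {"find_place": "Rome", "metal": "bronze"})

# Once both sources speak the shared vocabulary, one question covers both.
shared_findspot = sorted(
    subject
    for subject, prop, value in triples
    if prop == "ex:findspot" and value == "Rome"
)
print(shared_findspot)
```

The mapping table plays the role of the agreed vocabulary: neither source database changes, but both become queryable through the same property names.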