05 December 2008

“Names touch everything…” here too!

Derek Whitehead drew my attention to this entry in the hangingtogether blog. The idea of a Cooperative Identities Hub as a more broadly based name authority file suitable for use by a wide range of data custodians (libraries, archives, museums, repositories, aggregators, publishers) certainly fits well with what this project hopes to do.

It has occurred to us that, because fresh new researchers frequently publish for the first time as a co-author of a paper while still a graduate student, university research repositories will often be the first to see the researcher's name and, as a consequence, be the ones to do the original authority work (and will also be in the best position to gather researcher persona attribute data).

So, if there is data which might not be of immediate interest to a repository manager, but is nevertheless easily accessible and likely to be of use to other institutions later, then we probably should gather it and pass it on.

Progress Report December 2008

It has taken a while to make appointments, but the project is finally underway, albeit in a somewhat cart-before-horse fashion – the stakeholder requirements analysis will now be done in parallel with schema design, at the least, and some preliminary investigation of name matching and distinguishing algorithms, all of which will happen through December and January.

We are currently looking at how well EAC-CPF (Encoded Archival Context – Corporate bodies, Persons and Families) might meet our needs after doing a rough comparison of FRAD, RDA, DC, MADS, EAC, FOAF and VCARD against a set of possible attributes and relationships that might be readily available to ARROW repository managers. Rough because most of these are in a state of flux and because our learning time is limited.

EAC-CPF is attractive because it is a rich namespace structured to represent relationships as well as entities and because People Australia is proposing to use it. Once we learn how to code EAC, the next step will be to try to test it by generating some use cases and attempting to render them in EAC.

At this stage, we are not intending to go to the next step of defining an application profile and wrapping our EAC and whatever other vocabulary elements we might need into an RDF structure. It would be a desirable outcome, but we will probably not have time to get that far.

On the application side, we are going to look at how the BibApp application might fit into what we are doing – it appears to have some effective mechanisms for disambiguating and distinguishing names, which overlap with our own aims.
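To give a feel for what a first pass at name matching might involve, here is a minimal sketch in Python. It is only an illustration under our own assumptions, not BibApp's mechanism and not a design decision of this project: the function names (`normalise`, `may_be_same_person`), the 0.85 threshold and the rule that initials must be prefix-compatible are all hypothetical choices.

```python
import unicodedata
from difflib import SequenceMatcher
from typing import Tuple


def normalise(name: str) -> Tuple[str, str]:
    """Reduce a personal name to (surname, initials).

    Accepts 'Surname, Given Names' or 'Given Names Surname'.
    Hypothetical helper: real names need far more careful handling.
    """
    # Strip accents and lower-case so 'José' and 'Jose' compare equal.
    plain = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    plain = plain.replace(".", " ").lower().strip()
    if "," in plain:
        surname, given = [part.strip() for part in plain.split(",", 1)]
    else:
        parts = plain.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    initials = "".join(word[0] for word in given.split())
    return surname, initials


def may_be_same_person(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag two name strings as candidate matches (assumed rule, not a standard)."""
    surname_a, initials_a = normalise(a)
    surname_b, initials_b = normalise(b)
    # Fuzzy surname comparison tolerates small spelling variations.
    surname_score = SequenceMatcher(None, surname_a, surname_b).ratio()
    # Initials are compatible when one set is a prefix of the other.
    initials_ok = initials_a.startswith(initials_b) or initials_b.startswith(initials_a)
    return surname_score >= threshold and initials_ok


if __name__ == "__main__":
    print(may_be_same_person("Smith, John A.", "J. Smith"))
    print(may_be_same_person("Smith, John", "Smith, Mary"))
```

A string comparison like this can only propose candidate matches. Distinguishing two different people who share a name would need the contextual attributes (affiliation, co-authors and so on) that a repository is well placed to collect.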

07 November 2008


We have had a few comments, just not through the blog! They fall into a few categories, as follows:

1. this is a project whose time has come;

This has come from a number of quarters: from the library world and from people interested in learning object repositories, as well as those running research repositories. Authority control has been around for a long time, but it seems the new context of digital repositories has led to the issue bubbling up for a rethink.

2. the timeline is very short;

Yes, indeed. Particularly with Christmas and the New Year in the middle, we recognise that we may have to cut our cloth to fit the timeline (sorry for the mixed metaphor). There may be some flexibility with the March deadline, but we will see.

We are also focusing solely on personal names as an attempt to keep it as simple as possible.

3. will there be an operational relationship with People Australia?

We have already had a discussion about how this project might interact with People Australia and, without wanting to prejudice the outcome of the project, it does seem only sensible to build and use People Australia as the authority file for Australian researchers.

4. identifiers and vocabularies;

We have had mention of URLs/URIs, People Australia persistent identifiers, ISNI, ISADN, OpenID and various commercial researcher numbers as identifiers. There are also many developments to do with schemas, DTDs, vocabularies, etc., and sorting something reasonable out of all that will be a core part of the project.

30 October 2008

Welcome to NicNames

Welcome to the blog of the ARROW NicNames Project. NicNames can be read as 'Names in Context' and refers to the purpose of the project, which is 'to provide a means to more effectively manage author names in institutional repositories'.

A draft project plan has been developed and comments are sought from anyone with an interest in this area, but particularly from members of the ARROW community.

At this stage the project is in the process of starting up and should be fully underway by about the middle of November (2008), so we intend to finalise the plan before then.

The intention is to publish reports through this blog at least monthly, but team members will also blog about anything they find interesting as the project proceeds.