Johannes has forwarded an e-mail exchange to the list that referred to the subject of Gauri’s recent post. I have added it to this blog, and changed the order for readability. I will comment in a later post. Hugo Besemer
From: Joost Lieshout
Sent: 09 February 2007 15:28
Hello. Some input from my side.
Yes, I did talk again to some folks that can be considered
‘users’ of such a service. Again it becomes clear to me that
managing organisational directories is a hassle everyone wants
to get rid of themselves. Basically one would be happy to have
up-to-date central contact information available at all times that
can be used locally to get rid of locally maintained address
lists.
In the spam era it also becomes tricky to have readily available
up-to-date contact info on the web. One may wonder if
organisations would be really happy to have their contact info
in easy to access XML formats just hanging around ready to be
used. There must be clear incentives for orgs to be happy to
participate. Again I think this brings us back to the discussion
‘where/what do we do it for’.
I have been talking to Chris Addison on the ideas we have and
basically he sees its value more as an entry point to
query/collect information coming from multiple resources around
an organisation (cross searching). Unique identification of an
org. may still be tricky. It looks feasible to have a unique
identifier, which includes a part for sub-units, but when
stopping at the institution level, one can try to use unique
official acronyms as well. Then it would be good to have a live
example of what you can do if implementing this, compared to
what you cannot do now.
Other examples we may look at are e.g. vCard, OECD.
– Joost.
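To make the identifier idea above concrete, here is a minimal sketch of an identifier with an institution part and an optional sub-unit part. The `INSTITUTION/sub-unit` format, the `OrgId` type and the `EXORG` acronym are assumptions made up for illustration; nothing in the exchange fixes a format.

```python
from typing import NamedTuple, Optional


class OrgId(NamedTuple):
    """Hypothetical identifier: an institution part plus an optional sub-unit."""
    institution: str          # e.g. a unique official acronym
    sub_unit: Optional[str]   # None when stopping at the institution level


def parse_org_id(raw: str) -> OrgId:
    """Split 'INSTITUTION' or 'INSTITUTION/sub-unit' into its parts."""
    institution, _, sub_unit = raw.strip().partition("/")
    if not institution:
        raise ValueError(f"missing institution part in {raw!r}")
    # Normalise case so that 'exorg' and 'EXORG' identify the same institution.
    return OrgId(institution.upper(), sub_unit.lower() or None)


def same_institution(a: str, b: str) -> bool:
    """Compare two identifiers at the institution level, ignoring sub-units."""
    return parse_org_id(a).institution == parse_org_id(b).institution
```

With something like this, records that carry `EXORG/library` and records that carry plain `exorg` can be matched at the institution level, which is the case where an official acronym alone might be enough.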
From: Pesce, Valeria (SDRD)
Sent: 09 February 2007 16:53
Hi Joost,
yes I agree that managing organisational directories is a
nightmare for most organizations and even for those who provide
information services.
But in my opinion this doesn’t necessarily mean that
centralisation is the solution. I think centralised architecture
/ distributed architecture are not mutually exclusive and we
should try to exploit what is best of both kinds of
architectures: updating is most efficiently done in a
distributed architecture while retrieving and searching of
course is much more efficiently done in a centralised
architecture. Considering this, the “distributed” principle of
letting each organization describe itself and keep its record
updated (wherever: on its server or on a hosting server) would
ensure reliable and updated records, while information systems
could either choose the difficult way of providing services
based directly on the distributed records (thus engaging in
tasks like harvesting, storing/caching, advanced querying etc.)
or decide to “plug” into the value-added services provided by
one or two big organizations that have the capacities to
organise an efficient information service which would somehow
become a centralised source for everyone. And this centralised
information service may or may not be the same as the one that
keeps the reference registry file. For instance, in the case of
Agris keeping the registry file, I don’t think they would engage
in also building an information service, which instead a service
provider like Wisard could do.
In short, I would go for distributed data sources and a (more or
less) centralised information system.
There is another advantage to this: keeping the data sources
distributed and publicly available makes it possible for
everyone to create alternative (very specific and tailored)
information services in cases where the services provided by the
centralised information system are too generic or for some
reason do not completely meet the needs of a certain community.
In other words, we would not be all dependent on one centralised
service that both stores the data and provides information
services.
It seems to me that the architecture we discussed in our last
meeting can implement such a solution.
I admit you raise an important issue when you speak about
contact emails hanging around in XML files and of course we
should find a solution to this.
What do you all think about this?
Hope to meet you all soon,
Valeria
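As an illustration of the split described in this message (records maintained by each organisation, searching done on a central copy), a rough sketch follows. The registry layout, the `harvest` and `search` functions and the sample organisations are all hypothetical; the fetch functions stand in for whatever retrieval of a self-maintained record would really involve.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def harvest(registry: Dict[str, Callable[[], Record]]) -> Dict[str, Record]:
    """Collect each organisation's self-maintained record into a central index."""
    index: Dict[str, Record] = {}
    for org_id, fetch in registry.items():
        try:
            index[org_id] = fetch()
        except Exception:
            # One unreachable source should not stop the whole harvest.
            continue
    return index


def search(index: Dict[str, Record], term: str) -> List[str]:
    """Return the ids of organisations whose record mentions the term."""
    term = term.lower()
    return sorted(
        org_id
        for org_id, record in index.items()
        if any(term in value.lower() for value in record.values())
    )


def unreachable() -> Record:
    """Stand-in for a source that cannot be reached at harvest time."""
    raise IOError("source not reachable")


# Hypothetical registry: identifier -> way to fetch the owner's own record.
registry = {
    "EXORG": lambda: {"name": "Example Organisation", "topic": "soil science"},
    "SAMPLE": lambda: {"name": "Sample Institute", "topic": "fisheries"},
    "DOWN": unreachable,
}

central_index = harvest(registry)
matches = search(central_index, "soil")
```

The registry only says where each record lives; updating stays with the organisations, and anyone can run their own `harvest` to build a more specialised service on the same sources.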
From: “Keizer, Johannes (KCEW)”
Date sent: Fri, 09 Feb 2007 18:37:48 +0100
Dear all,
I agree completely with Valeria.
The principles she is outlining are applicable not only to
information about organizations, but generally. For bibliographic
metadata in AGRIS we have been pursuing the same strategy for some
years now: data should be produced and maintained where they are
originally needed, which normally means with the data owner.
If you start with data collection instead of data production,
you are in sustainability misery right from the beginning.
“Information systems” should use these datasets that have been
produced by the data owners for their own purposes, by
harvesting or by distributed searching. Any other setup means
starting from the requirements of the information system instead
of the requirements of the data owners and users.
This might include scenarios in which data owners use a
centralized system for their data management. I personally
have parts of my information on my own laptop, use a CMS with a
hosting service in the US, manage links with Delicious and
pictures with Flickr and…
This means: The real difference is not between centralized or
decentralized software, but between the requirements of data
owners and data harvesters.
If a data harvester wants to collect data that are not produced
because of the genuine interests of the data owners for their
own purposes, he will need a constant flow of cash to keep the
data collection going.
johannes
From: “chris addison”
To: hugo@bircim.net
Date sent: Tue, 13 Feb 2007 21:06:14 +0200
The conversation I had with Joost went along the following
lines.
I think in many cases a unique identifier for an organisation
was very powerful for retrieving materials across multiple
sources. In reality we all do this now when we search for
acronyms.
Beyond that I am sceptical about central stand-alone systems. I
think the onesite module is interesting for a small community.
Here the key is that the Master list of organisations is shared,
and hence updated by 5 user organisations whilst individual
contact information is held in their own databases.
Onesite directory is featured on the search page of euforic and
on country pages.