Bug #324

perl implementation of harvester api

Added by Matt Jones almost 19 years ago. Updated over 17 years ago.

In Progress

Need to implement registry and harvester.

Related issues

Blocked by Metacat - Bug #162: need harvest/batch load for metacat (Resolved, 10/24/2000)

Blocked by Metacat - Bug #325: create site filters to convert site metadata to eml packages (Resolved, 11/08/2001)


#1 Updated by James Brunt over 18 years ago

Matt Jones and David Blankman can provide the start of the requirements for
this program based on discussions we had at the October meeting and a flow
diagram of communication with metacat that we produced. This development should
be done in close communication with Matt and the other KNB developers. There
are also some packaging-related problems that will need to be addressed, but
alas, that is another bug.

#2 Updated by Owen Eddins over 18 years ago

On April 11, David, Owen, and James had an impromptu Harvester meeting and
decided that, for the purposes of 'harvesting' .xml files, the web service
architecture did not make sense: there is too much overhead in trying to manage
arbitrary files on a file system. What we were trying to do with the current
design was create a lightweight DBMS without providing any controls on how
files got in or out of the 'DBMS'. What we decided was that we need a one-time
Metacat loader tool to upload a .xml file into Metacat. See cvs pubs for the old
design of Harvester. A new design doc with the Metacat loader design will be
placed in the pubs directory for further discussion. It was decided that all the
ideas we had been discussing for the web service were appropriate, but only for
harvesting metadata out of

#3 Updated by Owen Eddins about 18 years ago

Just an update. We are currently looking into the Morpho classes and how to
adapt them to create a generalized client interface to Metacat. From this we're
going to create a command-line tool for one-time uploading of .xml files and
data files into Metacat.
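
For illustration only, a command-line tool of the kind described above might look like the following minimal sketch. It is not the Morpho-based tool under discussion. The endpoint URL, the form field names (`action`, `docid`, `doctext`), and the function names are all assumptions made for this example and should be checked against the actual Metacat interface before use.

```python
import argparse
import urllib.parse
import urllib.request
from pathlib import Path


def build_upload_request(url, docid, xml_path):
    """Build (but do not send) a form-encoded POST that uploads one .xml file."""
    doctext = Path(xml_path).read_text(encoding="utf-8")
    # Field names are assumptions for this sketch, not a confirmed API.
    fields = {"action": "insert", "docid": docid, "doctext": doctext}
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def main(argv=None):
    parser = argparse.ArgumentParser(description="One-time upload of an .xml file")
    parser.add_argument("url", help="server endpoint (hypothetical)")
    parser.add_argument("docid", help="identifier to assign to the document")
    parser.add_argument("xml_file", help="path to the .xml file to upload")
    parser.add_argument("--send", action="store_true",
                        help="perform the POST instead of a dry run")
    args = parser.parse_args(argv)

    request = build_upload_request(args.url, args.docid, args.xml_file)
    if not args.send:
        # Dry run: report what would be sent without touching the network.
        print(f"would POST {len(request.data)} bytes to {request.full_url}")
        return 0
    with urllib.request.urlopen(request) as response:
        print(response.status)
    return 0
```

Without `--send` the tool only reports what it would upload, which makes it safe to try against a list of files before any one-time load. Authentication and data files are left out of the sketch.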

#4 Updated by Matt Jones about 18 years ago

Excellent. You might also consider looking at the perl module that I
wrote. It is a perl client interface to metacat, and I think it is the easiest
way to access the metacat system from the command line. I wrote it to use in
the NRS and OBFS data registries, which use perl as a backend to their HTML form
submission pages. Just a thought. You can find it in the webmdentry module on (I think the path is webmdentry/src/Metacat). There are some
examples of its use in there as well (load-dataset.cgi and

#5 Updated by Owen Eddins about 18 years ago


Thanks for responding. Yes, I looked into using your perl scripts to do this,
and I agree that they are probably a quicker solution for a command-line tool.
It's essentially written already for pre-beta 9 EML. I downloaded it,
installed it, and got it working easily. The problem is that we are also
working on creating an automated data replication engine (essentially the
Harvester with a cooler name) and a web service interface to Metacat that
implements a WSDL I have hacked up to define a very lightweight 'API' for
EML. I spent some time last week looking at the Morpho classes, and they look
like they can easily be turned into a generic set of client-side classes.
Since we are looking into methods of connecting to Metacat other than
Morpho, generalizing these classes seems a worthwhile effort. The
ClientFramework class is the big one, but other than that most of the
others seem pretty general. Most of the methods in MetacatDataStore seem
relevant to the 'API' I'm thinking of. What do you think?

#6 Updated by Redmine Admin over 7 years ago

Original Bugzilla ID was 324
