Bug #2507

Data Manager Library: Create a EML parser lib to digest eml document

Added by Jing Tao over 13 years ago. Updated about 10 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: datamanager
Target version:
Start date: 08/01/2006
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id: 2507

Description

Currently, the EML actor in Kepler can download an EML document and parse it. After parsing, the entity information in the EML document is stored in Java objects, and the data file is downloaded to the local file system and also stored in a relational database.
We want to separate this process from Kepler and make it a library in the eml module, so that it can be used in Kepler, Metacat, and other projects.


Related issues

Is duplicate of EML - Bug #2504: Create a EML parser lib to digest eml document (Resolved, 07/31/2006)

History

#1 Updated by Jing Tao over 13 years ago

Here is our plan:
Create three packages in the eml src dir:
1. The org.ecoinformatics.eml.digestor package, whose main class is EML200Parser. The main class can be copied from the Kepler module.
2. The org.ecoinformatics.eml.download package, whose main class is DataDistributionHandler. This class will implement the Runnable interface, and its API is:
DataDistributionHandler(Entity entity);
run();

3. The org.ecoinformatics.eml.db package, whose main class is TableGenerator. The API of the class is:
TableGenerator(Entity entity, File localFile);
generateTable();
getTableName();
loadDataToTable();
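
A minimal sketch of how the TableGenerator API listed above could look, offered only as an illustration of the plan. The Entity class here is a hypothetical stand-in (a name plus attribute names), and the sketch only builds SQL text instead of talking to a database. The column type, the table-name rule, and the comma-separated file layout are all assumptions, not part of the plan.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed entity: a name plus attribute (column) names.
class Entity {
    final String name;
    final List<String> attributeNames;

    Entity(String name, List<String> attributeNames) {
        this.name = name;
        this.attributeNames = attributeNames;
    }
}

// Sketch of the proposed TableGenerator API. It only builds SQL text;
// executing the statements against a real database is left out.
class TableGenerator {
    private final Entity entity;
    private final File localFile;

    TableGenerator(Entity entity, File localFile) {
        this.entity = entity;
        this.localFile = localFile;
    }

    // Assumption: the table name is the entity name with unsafe characters replaced.
    String getTableName() {
        return entity.name.replaceAll("[^A-Za-z0-9_]", "_");
    }

    // Assumption: every column is stored as text.
    String generateTable() {
        List<String> columns = new ArrayList<>();
        for (String attribute : entity.attributeNames) {
            columns.add(attribute + " VARCHAR(255)");
        }
        return "CREATE TABLE " + getTableName() + " (" + String.join(", ", columns) + ")";
    }

    // Assumption: the local file is comma-separated, with no header row and no quoting.
    // Real code would use a PreparedStatement rather than building INSERT text.
    List<String> loadDataToTable() throws IOException {
        List<String> inserts = new ArrayList<>();
        for (String line : Files.readAllLines(localFile.toPath(), StandardCharsets.UTF_8)) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            List<String> values = new ArrayList<>();
            for (String field : line.split(",", -1)) {
                values.add("'" + field.trim().replace("'", "''") + "'");
            }
            inserts.add("INSERT INTO " + getTableName()
                    + " VALUES (" + String.join(", ", values) + ")");
        }
        return inserts;
    }
}
```

In this sketch generateTable() and loadDataToTable() return the SQL they would run, which keeps the example independent of any particular database driver.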

The function of the download package is very similar to Kepler's cache system. I am thinking about how to reuse that code from Kepler.

#2 Updated by Jing Tao over 13 years ago

In order to make the download package more configurable, I would like to change the constructor to:
DataDistributionHandler(Entity entity, File cacheDir, File fileName);
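
A minimal sketch of what a handler with this constructor might look like. The Entity class is a hypothetical stand-in that carries only a name and a distribution URL, getTarget() is an added helper for illustration, and the way cacheDir and fileName are combined (the name part of fileName resolved inside cacheDir) is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in for a parsed entity: a name and where its data lives.
class Entity {
    final String name;
    final URL distributionUrl;

    Entity(String name, URL distributionUrl) {
        this.name = name;
        this.distributionUrl = distributionUrl;
    }
}

// Sketch of the proposed download handler. It implements Runnable, so it can
// be called directly or handed to a thread.
class DataDistributionHandler implements Runnable {
    private final Entity entity;
    private final File target;

    DataDistributionHandler(Entity entity, File cacheDir, File fileName) {
        this.entity = entity;
        // Assumption: only the name part of fileName is used, placed inside cacheDir.
        this.target = new File(cacheDir, fileName.getName());
    }

    // Hypothetical helper: where the downloaded data ends up.
    File getTarget() {
        return target;
    }

    @Override
    public void run() {
        try {
            // Create the cache directory if it does not exist yet.
            Files.createDirectories(target.getParentFile().toPath());
            try (InputStream in = entity.distributionUrl.openStream()) {
                Files.copy(in, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Could not download data for " + entity.name, e);
        }
    }
}
```

Because the class is a Runnable, a caller could start the download in the background with new Thread(handler).start(), or simply call handler.run() to block until the file is in the cache directory.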

#3 Updated by Jing Tao over 13 years ago

Here is the change in org.ecoinformatics.eml.db package:
Main class is SQLCommandHandler and API is:
SQLCommandHandler(DBConnection conn, String plugInName)
generateTable(Entity entity, File fileName), which returns the generated table name as a String;
dropTable(String tableName);
executeSelectionSQLCommand(String sqlCommand), which returns a ResultSet object;

The org.ecoinformatics.eml.digestor package API is:
EML200Parser(InputStream stream);
EML200Parser(InputSource source);
getEntityList(), which returns a Vector;
parse();
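
A minimal sketch of the parser API above, using the JDK's DOM parser. It is not the Kepler code. The Entity class is a hypothetical stand-in, and the sketch assumes that each entity is a dataTable element with an entityName child and attributeName descendants, which covers only a small part of what an EML document can contain.

```java
import java.io.InputStream;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a parsed entity: a name plus attribute (column) names.
class Entity {
    final String name;
    final Vector<String> attributeNames;

    Entity(String name, Vector<String> attributeNames) {
        this.name = name;
        this.attributeNames = attributeNames;
    }
}

// Sketch of the proposed parser API: construct, call parse(), then read the entities.
class EML200Parser {
    private final InputSource source;
    private final Vector<Entity> entities = new Vector<>();

    EML200Parser(InputStream stream) {
        this(new InputSource(stream));
    }

    EML200Parser(InputSource source) {
        this.source = source;
    }

    void parse() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
        // Assumption: only dataTable entities are collected.
        NodeList tables = doc.getElementsByTagName("dataTable");
        for (int i = 0; i < tables.getLength(); i++) {
            Element table = (Element) tables.item(i);
            NodeList nameNodes = table.getElementsByTagName("entityName");
            String name = nameNodes.getLength() == 0
                    ? ""
                    : nameNodes.item(0).getTextContent().trim();
            Vector<String> attributes = new Vector<>();
            NodeList attributeNodes = table.getElementsByTagName("attributeName");
            for (int j = 0; j < attributeNodes.getLength(); j++) {
                attributes.add(attributeNodes.item(j).getTextContent().trim());
            }
            entities.add(new Entity(name, attributes));
        }
    }

    // Returns the entities found by parse(); empty until parse() has been called.
    Vector<Entity> getEntityList() {
        return entities;
    }
}
```

A caller would create the parser from a stream, call parse(), and then pass each Entity from getEntityList() on to the download and database steps.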

#4 Updated by Jing Tao over 13 years ago

New package names are suggested:
org.ecoinformatics.eml.digestor.parser
org.ecoinformatics.eml.digestor.download
org.ecoinformatics.eml.digestor.db

#5 Updated by Matt Jones over 13 years ago

digestor is a bit of a crude name. How about "loader"?

org.ecoinformatics.eml.loader.parser
org.ecoinformatics.eml.loader.download
org.ecoinformatics.eml.loader.database

This is an improvement but still not totally great. Suggestions welcome.

#6 Updated by James Brunt over 13 years ago

I like loader, but it's not a perfect fit for the way the work is divided, which is more like parse (eml) -> create (table) -> source (data). Correct?

#7 Updated by Duane Costa over 13 years ago

We named the top-level package "org.ecoinformatics.datamanager". The complete set of packages is:

org.ecoinformatics.datamanager
org.ecoinformatics.datamanager.database
org.ecoinformatics.datamanager.download
org.ecoinformatics.datamanager.parser
org.ecoinformatics.datamanager.parser.eml

#8 Updated by Duane Costa over 13 years ago

*** Bug 2504 has been marked as a duplicate of this bug. ***

#9 Updated by ben leinfelder about 10 years ago

This has been completed. Moreover, it has been extended to support any XML schema that makes use of the EML dataSet module.

#10 Updated by Redmine Admin about 7 years ago

Original Bugzilla ID was 2507
