Bug #4160

EML - Dynamic data retrieval

Added by ben leinfelder about 10 years ago. Updated about 10 years ago.

Status:
Resolved
Priority:
Normal
Category:
data access
Target version:
Start date:
06/15/2009
Due date:
% Done:

0%

Estimated time:
Bugzilla-Id:
4160

Description

The existing EML actor can retrieve the latest revision of a datapackage.
We need to extend EML data access so that the actor can emit a set of datatables rather than just one.
A downstream actor can then stack them as needed. Hopefully that's manageable for the 2a1 release.
I've discussed this with Jim, and I believe we have a decent approach.

History

#1 Updated by ben leinfelder about 10 years ago

The new data output format emits information about each datatable (including the pointer to its cache file).
This allows an R actor downstream to read and combine similar datasets into a complete set (annual tables, for example).
Combined with the existing "check for latest revision" feature, this should get us close to the "dynamic" data needs for TPCs.
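
For illustration only, the stacking step described above could look roughly like the sketch below. It is written in Python rather than in the R the downstream actor uses, and the record layout (a "name" plus a "cache_file" path per datatable) is a hypothetical stand-in for the actor's real output format, which this ticket does not spell out.

```python
import csv
from pathlib import Path


def stack_tables(records):
    """Read each cached table and stack the rows of tables that share a header.

    `records` is a hypothetical stand-in for the actor's per-datatable output:
    a list of dicts, each with a "name" and a "cache_file" path.
    """
    header = None
    rows = []
    for rec in records:
        with Path(rec["cache_file"]).open(newline="") as fh:
            reader = csv.reader(fh)
            table_header = next(reader, None)
            if table_header is None:
                continue  # empty cache file, nothing to stack
            if header is None:
                header = table_header
            elif table_header != header:
                raise ValueError(f"{rec['name']}: header differs from first table")
            # Tag each row with its source table so the origin survives stacking.
            rows.extend([rec["name"], *row] for row in reader)
    return ["source", *(header or [])], rows
```

Called with one record per annual table, this returns a single header plus the combined rows, with the first column recording which table each row came from.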

#2 Updated by ben leinfelder about 10 years ago

Testing this new output format with an R actor downstream. It looks promising, but I'm deferring to Jim's R expertise to get the read.table() call corrected. I'm running into "more columns than column names" errors when constructing the dataframe for each table.
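
A "more columns than column names" error from read.table() generally means that some data line has more fields than the header line supplies names for. As a rough diagnostic, and only a sketch, the Python helper below (a hypothetical function, not part of the EML actor or of R) lists the lines of a cached table that are wider than its header. The comma delimiter is an assumption; the real cache files may use a different separator.

```python
import csv


def find_ragged_rows(path, delimiter=","):
    """Return (line_number, field_count) for each row wider than the header."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return []  # empty file: nothing to check
        expected = len(header)
        # reader.line_num is the number of the line just read from the file.
        return [
            (reader.line_num, len(row))
            for row in reader
            if len(row) > expected
        ]
```

An empty result suggests the field counts are consistent under the chosen delimiter, in which case the separator or header arguments of the read.table() call are the next thing to check.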

#3 Updated by ben leinfelder about 10 years ago

Looks like we have this working in terms of retrieving the existing tables.
Next up: check for latest revision!

#4 Updated by ben leinfelder about 10 years ago

Jim has this working in the riverflow TPC.
And I've recently made the "check for latest" feature of EML work as it needs to.
Looking pretty good for dynamic data.

#5 Updated by Redmine Admin over 6 years ago

Original Bugzilla ID was 4160
