Bug #4160
closed
EML - Dynamic data retrieval
Added by ben leinfelder over 15 years ago.
Updated over 15 years ago.
Description
The existing EML actor is able to get the latest revision of a datapackage.
We need to augment EML data access and allow the actor to emit a set of datatables rather than just one.
A downstream actor can stack them as needed - hopefully that's manageable for the 2a1 release.
I've discussed with Jim, and I believe we have a decent approach.
The new data output format emits information about each datatable, including a pointer to its cache file.
This allows a downstream R actor to read and combine similar datasets into a complete set (annual tables, for example).
Combined with the existing "check for latest revision" feature, this should get us close to the "dynamic" data needs for TPCs.
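For illustration only, here is a minimal sketch of the stacking step described above. It is written in Python rather than R, and it is not the actor's actual output format: the record keys `name` and `cache_path`, the comma-delimited layout with a header row, and the demo table names are all assumptions made for the example.

```python
import csv
import os
import tempfile


def stack_tables(table_records):
    """Read each cached table and return one combined list of row dicts.

    table_records: iterable of dicts with the hypothetical keys
    'name' (table name) and 'cache_path' (path to the cached file).
    Each file is assumed to be comma-delimited with a header row.
    """
    combined = []
    for record in table_records:
        with open(record["cache_path"], newline="") as handle:
            for row in csv.DictReader(handle):
                # Remember which table each row came from.
                row["source_table"] = record["name"]
                combined.append(row)
    return combined


def _write_demo_table(directory, filename, rows):
    """Write a small demo table with columns 'year' and 'flow'."""
    path = os.path.join(directory, filename)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["year", "flow"])
        writer.writerows(rows)
    return path


def demo():
    """Build two made-up annual tables and stack them."""
    with tempfile.TemporaryDirectory() as tmp:
        records = [
            {"name": "flow_2008",
             "cache_path": _write_demo_table(tmp, "a.csv", [[2008, 1.5]])},
            {"name": "flow_2009",
             "cache_path": _write_demo_table(tmp, "b.csv", [[2009, 2.5], [2009, 3.0]])},
        ]
        return stack_tables(records)
```

Calling `demo()` returns the rows of both tables in order, each tagged with the table it came from. An R actor would presumably do the equivalent by reading each cache file into a data frame and row-binding the results.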
Testing this new output format with a downstream R actor. It looks promising, but I'm deferring to Jim's R expertise to get the read.table() call corrected. I'm running into "more columns than column names" errors when constructing the data frame for each table.
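R's read.table() reports "more columns than column names" when data rows contain more fields than the header supplies names for. The actual cause in this case is not recorded here. As a rough diagnostic sketch (in Python, with a hypothetical helper name and made-up sample data), one can compare the header's field count against the field counts of the data rows for a given delimiter:

```python
import csv
import io


def field_count_report(text, delimiter=","):
    """Return (header_field_count, sorted distinct data-row field counts).

    Hypothetical helper: a mismatch between the two values is the kind of
    situation that makes read.table() complain about column names.
    """
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    if not rows:
        return 0, []
    header, data = rows[0], rows[1:]
    return len(header), sorted({len(r) for r in data})


# Made-up sample: the header names two columns, but each data row has three fields.
sample = "site,year\nA,2008,1.5\nB,2009,2.5\n"
report = field_count_report(sample)
```

If the header count is smaller than the data-row counts, the usual suspects are a wrong separator, trailing delimiters on data lines, or a header line that is not really the header.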
Looks like we have this working for retrieving the existing tables.
Next up: check for latest revision!
Jim has this working in the riverflow TPC.
And I've recently made the "check for latest" feature of EML work as it needs to.
Looking pretty good for dynamic data.
Original Bugzilla ID was 4160