Ability to easily concatenate identical data structures (EMLDatasource)
a) within an EML package
b) across data packages (or non-EML actors)
Hopefully leverage org.ecoinformatics.datamanager library for some/all of this.
#1 Updated by ben leinfelder over 11 years ago
While this approach may not be what was initially intended when this enhancement was requested, the RExpression actor can now support multiport input of "column based records" from an EML actor. R can easily concatenate the list of dataframes to produce a union of the records.
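The union itself is simple to illustrate. The following is a minimal sketch in Python, not the RExpression actor's code: it assumes each input arrives as a mapping from column name to a list of values, and the names `union_all`, `site_a`, and `site_b` are hypothetical. In R the same stacking is commonly done by calling `rbind` over the list of data frames.

```python
from typing import Dict, List, Sequence

# Assumed representation of a "column based record" table:
# column name -> list of values, one entry per row.
Table = Dict[str, List[object]]


def union_all(tables: Sequence[Table]) -> Table:
    """Stack tables that share the same columns, preserving input order."""
    if not tables:
        return {}
    columns = list(tables[0].keys())
    for index, table in enumerate(tables):
        # Only identical data structures can be concatenated.
        if set(table.keys()) != set(columns):
            raise ValueError(
                f"table {index} has columns {sorted(table)}, "
                f"expected {sorted(columns)}"
            )
    # Append each table's values column by column.
    return {name: [value for table in tables for value in table[name]]
            for name in columns}


# Hypothetical inputs standing in for two EML data tables.
site_a = {"site": ["A", "A"], "temp_c": [11.5, 12.0]}
site_b = {"site": ["B"], "temp_c": [9.8]}

combined = union_all([site_a, site_b])
```

Rejecting mismatched columns up front keeps the result a true union of identical structures rather than a silently padded table.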
See bug #2959 for the formal request, and also the sample workflow in the demos/R directory:
#2 Updated by ben leinfelder about 11 years ago
I've worked with Kevin to implement this in some of his workflows.
I've also packaged a custom R actor called "UnionAll" that performs this basic table stacking (in HEAD only).
While it might seem like a hacky way of doing the concatenation, it is also very workflow-oriented, in that each piece of the concatenation is explicitly laid out (there's an individual EML data canister on the canvas for each and every table that is being concatenated). Sure, there's an upper limit to how many canisters you want to have on any given canvas, but there has to be some mechanism for specifying exactly which data tables are included in the union, and this one happens to use pretty pictures.
I've marked the actor with the semantic type: urn:lsid:localhost:onto:2:1#DataStructureOperation