Bug #3807
closed
reserved symbols in record names passed to the RExpression actor generate a missing R error message
Description
XP Pro x64 SP2, Java 1.6.0_11-b03, Kepler 1.0.0, R 2.8.0
The URL contains a bugged Kepler workflow. The data from KNB contains a column called %CC that causes an error when R is executing Kepler's RExpression initialization code. This causes Kepler to think that R is not found.
A frustrating workaround is to disassemble the record and reassemble it, changing the name of the offending column.
Updated by ben leinfelder over 15 years ago
will try this with the JRI implementation of R to see if that resolves the problem. RExpression2 needs more testing as it is!
Updated by ben leinfelder over 15 years ago
using RExpression2 (the JRI implementation) works!
of course this requires running from kepler-trunk and launching with java pointing to the appropriate native libraries in the "common" module....
Updated by Oliver Soong about 15 years ago
By way of update, Ben partially committed a patch for me that should mostly fix this. If I recall correctly, the remaining code should fix collisions with symbols in record names that are invalid in file names. Specifically, the remaining code addresses the cases where:
1. 2 ports have names that differ by reserved symbols
2. both ports cannot be converted to native tokens and so are saved to disk and passed as file names
The problem arises because the committed code converts almost all non-alphanumeric characters to underscores (_) to create valid file names, creating potential collisions. There is some code to avoid this, but it is implemented before files generated by the firing actor ports are processed. In other words, the existing code can only avoid collisions with ports belonging to different actors and not similarly named ports on a single actor.
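The collision described above can be illustrated with a minimal sketch. This is not Kepler's actual code: the class `PortFileNames`, the `sanitize` rule (every character outside `[A-Za-z0-9]` becomes an underscore) and the numeric-suffix scheme are assumptions made for illustration only. It shows two port names on one actor mapping to the same file name, and one possible way to deduplicate across all ports of the firing actor at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PortFileNames {
    // Hypothetical sanitizer (assumed rule, not Kepler's): replace every
    // character outside [A-Za-z0-9] with an underscore.
    public static String sanitize(String portName) {
        return portName.replaceAll("[^A-Za-z0-9]", "_");
    }

    // Assign one file name per port, appending a numeric suffix whenever the
    // sanitized name has already been taken by an earlier port in the list.
    public static List<String> uniqueFileNames(List<String> portNames) {
        Set<String> used = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String port : portNames) {
            String base = sanitize(port);
            String candidate = base;
            int n = 1;
            // Set.add returns false if the name is already in use.
            while (!used.add(candidate)) {
                candidate = base + "_" + n++;
            }
            result.add(candidate);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two ports that differ only by a reserved symbol sanitize identically.
        System.out.println(sanitize("%CC"));
        System.out.println(sanitize("#CC"));
        // Deduplicating over the whole port list keeps the file names distinct.
        System.out.println(uniqueFileNames(Arrays.asList("%CC", "#CC")));
    }
}
```

The key point of the sketch is that the uniqueness check runs over the ports of a single actor together, which is the case the committed code reportedly does not cover.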
Updated by ben leinfelder about 15 years ago
I believe the main part of this bug is fixed now. Certainly there is still the potential for the temporary filename issues that are mentioned, but those can be avoided when constructing RExpression actors and their ports. For EML data sources that emit columns with reserved symbols (e.g. '%CC'), we can now handle that with back ticks (`) and check.names=FALSE.
Can we close this bug [and open a different one to address the port name/temp file issue] so that the bug tracking doesn't drift?
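As a rough illustration of the back tick and check.names=FALSE approach mentioned above, the sketch below builds R source text from Java. The class `RNameQuoting` and its methods are hypothetical and are not taken from the RExpression actor. It assumes names contain no back tick characters and that the file path needs no escaping inside an R string.

```java
public class RNameQuoting {
    // Wrap an R name in back ticks so R parses it as a single symbol,
    // even when it contains reserved characters such as '%'.
    public static String quote(String name) {
        if (name.indexOf('`') >= 0) {
            // Assumption: names containing back ticks are rejected, not escaped.
            throw new IllegalArgumentException("name contains a back tick: " + name);
        }
        return "`" + name + "`";
    }

    // Build R code that accesses one column of a data frame by its raw name.
    public static String columnAccess(String frame, String column) {
        return frame + "$" + quote(column);
    }

    // Build R code that reads a table while keeping the original column names.
    // check.names = FALSE stops read.table from rewriting names like "%CC".
    public static String readTableCall(String frame, String path) {
        return frame + " <- read.table(\"" + path
                + "\", header = TRUE, check.names = FALSE)";
    }

    public static void main(String[] args) {
        System.out.println(readTableCall("df", "data.txt"));
        System.out.println(columnAccess("df", "%CC"));
    }
}
```

Without check.names = FALSE, R's read.table rewrites syntactically invalid column names, so the column would no longer be reachable under its original name.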
Updated by Oliver Soong about 15 years ago
The original summary of the bug has been addressed, so I'm closing this and will open a new one for the additional file collision issue.