Need resource manager to handle objects in the resources directories
This class should have static methods for accessing items in the modules' resources directories. This would allow resources to be accessed from a centralized place based on module name and path, and it would keep developers from having to hard-code file paths into the Java code. An example method might be:
File f = ResourceManager.getResource("common", "images/kepler-about.png");
Which would read the kepler/modules/common/resources/images/kepler-about.png file.
We could also have resource-type specific calls, like
Image i = ResourceManager.getImage("common", "images/kepler-about.png");
Comments? Other ideas?
#1 Updated by Matt Jones almost 12 years ago
In general I think this is a good idea. But I could also envision having a number of general search paths for resources that the class works through, so that even the relative paths need not necessarily be encoded in the Java code for each calling class. This would make it easy to find resources in common subdirectories like resources/images, resources/configuration, and resources/data. For example, ResourceManager might have two methods like:
(1) ResourceManager.getResource(String module, String name)
Searches the given module for a resource with the given name, where name can be either a bare name or a relative path. If the name is a bare name, then a predefined set of paths is searched in the attempt to locate the resource, with precedence given to the first resource found. Some predetermined paths to be searched might include the root of the resources directory, and then subdirs such as "images", "configuration", and "data". Because 'name' can include arbitrary paths, users can create their own custom subdirectories within 'resources' to facilitate module-specific resolution. Example custom names include "images/mymodule/splash.png" and "mymodule/data".
(2) ResourceManager.getResource(String name)
Same as (1), except searches all modules in classpath order rather than just one specific module. First match has precedence.
Here are some example calls of these functions:
File f = ResourceManager.getResource("about.png");
File f = ResourceManager.getResource("common", "about.png");
File f = ResourceManager.getResource("common", "images/about.png");
File f = ResourceManager.getResource("common", "images/experimental/about.png");
File f = ResourceManager.getResource("mydata.csv");
File f = ResourceManager.getResource("common", "mydata.csv");
File f = ResourceManager.getResource("common", "data/mydata.csv");
File f = ResourceManager.getResource("mymodule", "mymodule/mydata.csv");
We will want to consider how to deal with these relative paths -- whether there is an implied '/' in front of each path, or whether "/" is considered the top of the resource directory structure.
We should also consider whether returning a File is the best thing, or maybe instead an InputStream? Or as Chad indicated, maybe even specific resource types like Image can have special methods that return the instantiated Java object. In this case, we might have:
Image i = ResourceManager.getImageResource("about.png");
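To make the search order in method (1) concrete, here is a minimal sketch. It assumes each module keeps its resources under <modulesRoot>/<module>/resources and that the predefined search order is the resources root, then "images", "configuration", and "data". The class name SearchPathResolver, the modulesRoot parameter, and the choice to return Optional<Path> rather than File are all assumptions made for illustration, not existing Kepler code.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of proposal (1); not existing Kepler code. */
final class SearchPathResolver {
    // Assumed search order: the resources root first, then the common subdirectories.
    private static final List<String> SEARCH_DIRS =
            List.of("", "images", "configuration", "data");

    private final Path modulesRoot; // assumed to be something like kepler/modules

    SearchPathResolver(Path modulesRoot) {
        this.modulesRoot = modulesRoot;
    }

    /** Resolves a bare name or a relative path inside one module's resources directory. */
    Optional<Path> getResource(String module, String name) {
        Path resources = modulesRoot.resolve(module).resolve("resources");
        if (name.contains("/")) {
            // A relative path is taken as given; no search is performed.
            Path candidate = resources.resolve(name);
            return Files.isRegularFile(candidate) ? Optional.of(candidate) : Optional.empty();
        }
        // A bare name is tried in each predefined directory; the first match wins.
        for (String dir : SEARCH_DIRS) {
            Path candidate = resources.resolve(dir).resolve(name);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With this sketch, getResource("common", "about.png") and getResource("common", "images/about.png") would both find common/resources/images/about.png, provided no about.png sits in the resources root. Here names are always interpreted relative to the resources directory, which is one possible answer to the leading '/' question.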
#2 Updated by David Welker almost 12 years ago
I want to point out a couple of things. First of all, we already have most of this functionality. If you have foo.xml in common/resources/configs/foo.xml, then you can reference it from the classpath using the relative path "configs/foo.xml". I think that having the relative part of the path is a feature and not a bug, since otherwise it would be extremely easy for tricky bugs to arise where something is accidentally overridden when it is not intended to be. Such a bug could be extremely challenging to find. I think having and requiring a relative path (keeping in mind that the relative path could be nothing -- that is, common/resources/foo.xml could be referenced just by "foo.xml") actually does not impose very much cost at all, but adds quite a bit of safety by making the code much more robust against accidental changes.
If you aren't convinced of the potential for such bugs, imagine this scenario. Someone drops "foo.xml" in one of the many directories that we specify as being a "root" directory to read from. Perhaps they do not even realize this is one of those canonical directories. The call ResourceManager.getResource("foo.xml") now unintentionally reads in the wrong resource (in the original module, "foo.xml" was stored in and read from "resources/configs", but in the higher priority module there is a different "foo.xml" stored in "resources/data"). Most people are probably not going to expect the addition of "foo.xml" to "resources/data" to break Kepler, especially as there is no "resources/data/foo.xml" being overridden in a lower priority module. So, chances are, this person is going to be looking for problems in the code associated with their own module for quite some time, when the real culprit lies in the way that code from a lower priority module is interacting with an unexpected "foo.xml". And the irony is that the more "readable" foo.xml is to the lower priority module, the harder the bug is to find. The more that the new foo.xml is just nonsense in the context of the lower priority module, the easier the bug is going to be to find. But one can imagine some very subtle bugs arising from reading in the wrong resources, especially when those resources share many common features, as XML files tend to do.
Basically, the principle I want to advocate here is that resource overrides should be obvious and automatically detectable by the build system. But, if there are multiple "roots" and multiple ways of reading in resources, then it will be impossible to detect whether two instances of "foo.xml" that exist in different relative subdirectories of "resources" are overrides or not. Alternatively, if resources are read in from their relative paths, all resource overrides can be detected by the build system and reported to the user and otherwise managed. This alone will save many developer hours of debugging time.
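As a rough illustration of the kind of check the build system could run under this principle, here is a minimal sketch. It assumes each module keeps its resources under <modulesRoot>/<module>/resources, and it treats two files as an override only when their paths relative to "resources" are identical. The class name OverrideDetector and its method are made up for this sketch and are not part of the existing build system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

/** Hypothetical sketch of build-time override detection; not existing Kepler code. */
final class OverrideDetector {
    /**
     * Maps each relative resource path that occurs in more than one module to the
     * modules that contain it, listed in the priority order that was passed in.
     */
    static Map<String, List<String>> findOverrides(Path modulesRoot,
                                                   List<String> modulesInPriorityOrder)
            throws IOException {
        Map<String, List<String>> owners = new TreeMap<>();
        for (String module : modulesInPriorityOrder) {
            Path resources = modulesRoot.resolve(module).resolve("resources");
            if (!Files.isDirectory(resources)) {
                continue; // a module without a resources directory contributes nothing
            }
            try (Stream<Path> files = Files.walk(resources)) {
                files.filter(Files::isRegularFile).forEach(file -> {
                    // Normalize separators so keys look the same on every platform.
                    String relative = resources.relativize(file).toString().replace('\\', '/');
                    owners.computeIfAbsent(relative, key -> new ArrayList<>()).add(module);
                });
            }
        }
        // Keep only the paths that are present in two or more modules.
        owners.values().removeIf(modules -> modules.size() < 2);
        return owners;
    }
}
```

In this sketch, "configs/foo.xml" in one module and "data/foo.xml" in another are never reported, because their relative paths differ. That only matches runtime behavior if resources are always read by their relative paths.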
A second issue. Any references to modules by name in the code will render our code much more fragile. I think it should be avoided. I would even go so far as to say that I would greatly prefer that module names were NEVER referenced in either code or in properties.
Let us say you have a reference to ResourceManager.getResource("common", "images/kepler-about.png"). This code will break as soon as common is published so that common is now named common-2.0, for example. Furthermore, if a developer wants to change the behavior of the system so that instead of "common/resources/images/kepler-about.png" being read, "foo-module/resources/images/kepler-about.png" is read instead, they will not be able to override the resource. Instead, they will have to override the entire base Kepler class that references the resource, all so that the call to ResourceManager.getResource("common", "images/kepler-about.png") can be changed either to ResourceManager.getResource("images/kepler-about.png") or, less robustly, to ResourceManager.getResource("foo-module", "images/kepler-about.png"). Of course, this will lead to the risk of code drift. And it is completely unnecessary, as this override was introduced for a trivial rather than fundamental reason.
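For comparison, here is a minimal sketch of how the same override works when the resource is read off the classpath by relative path only. It assumes each module's resources directory is a classpath root and that roots are listed with the highest priority module first. ClasspathOverrideDemo and loaderFor are hypothetical names, and the sketch uses a plain JDK URLClassLoader rather than whatever class loading Kepler actually sets up.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical sketch of override-by-classpath-order; not existing Kepler code. */
final class ClasspathOverrideDemo {
    /**
     * Builds a class loader whose roots are the given resources directories,
     * highest priority first. The directories must already exist so that their
     * URLs end in '/' and are treated as directories rather than JAR files.
     */
    static URLClassLoader loaderFor(List<Path> resourceDirsInPriorityOrder)
            throws MalformedURLException {
        URL[] roots = new URL[resourceDirsInPriorityOrder.size()];
        for (int i = 0; i < roots.length; i++) {
            roots[i] = resourceDirsInPriorityOrder.get(i).toUri().toURL();
        }
        // A null parent keeps the lookup limited to these roots (plus the bootstrap loader).
        return new URLClassLoader(roots, null);
    }
}
```

Given a loader built from foo-module/resources followed by common/resources, loader.getResource("images/kepler-about.png") would return the foo-module copy, since URLClassLoader searches its roots in the order given. The calling code names no module, so the override needs no change to the class that reads the image.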
What if in the future, we want to refactor the common module? Perhaps we find that there are common patterns in our distributions, such that common would usefully be broken up for different contexts? Well, such refactoring is going to be much more difficult, because now we have to update all the references to the common module in our code. Not only that, since we would be implicitly encouraging the use of references to the "common" module, even if we fixed all of our code so that it could be refactored, we are likely to break a whole lot of code in other modules that has come to depend on the common module.
We have discussed and debated the fact that a module can make references to code in lower priority modules, but not in higher priority modules. This is a feature, in that just by looking at modules.txt, you already know a lot about the dependencies between the modules. This is a bug, in that if we allow cyclic dependencies it may (or may not) make the task of breaking util into more modules somewhat easier. Well, if you are going to reference resources in specific modules from code, you can just forget it. There is nothing stopping a developer of a lower priority module from referencing a higher priority module and thus creating an implicit and hidden dependency between them. Whatever information can be gleaned about code dependencies just from looking at modules.txt would be rendered largely uncertain.
So, the second principle I would like to advocate is this. All resources in the core modules should be read off the classpath or through the build system as an intermediary, and never directly off the file system if such a reference involves coding a reference to a specific module in your code. In general, module names should never be referenced by the code, by any resources, or by system properties. The only exception is the build system, which is specifically designed to handle such references. In this way, our code will never become dependent on the particular module names we have chosen, and we will always have maximum flexibility to refactor modules as we see fit.
There are already cases where this second principle is violated. By me. For example, in the ppod suite (which includes ppod, ppod-actors, ppod-gui, and provenance-apps) system properties reference common and other modules. This causes all sorts of problems when publishing, because the reference to, for example, common, becomes out of date when common is renamed to common-2.0a1 when published. Referencing either the file system or module names directly is not a good idea, especially since files are just as easy to read off the classpath.
Anyway, I am not against the idea of making a resource manager. It could make reading files off the classpath even easier than it already is by, for example, producing a BufferedReader or PrintWriter for any reference to a resource, so that developers do not even have to think about how one goes about reading and writing files that are found on the classpath. (Although this is probably a skill that any Java developer should learn.) But such a resource manager should not allow the users of that manager to reference specific modules.
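A minimal sketch of what such a manager could look like on the reading side is below. It takes only a classpath-relative path and never a module name. The class name ClasspathResourceManager, the method name getReader, the use of the thread context class loader, and the UTF-8 encoding are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a module-agnostic resource manager; not existing Kepler code. */
final class ClasspathResourceManager {
    private ClasspathResourceManager() {
        // Static helpers only.
    }

    /** Opens a reader for a resource such as "configs/foo.xml" using the context class loader. */
    static BufferedReader getReader(String relativePath) throws FileNotFoundException {
        return getReader(Thread.currentThread().getContextClassLoader(), relativePath);
    }

    /** Same as above, with an explicit class loader (useful in tests). */
    static BufferedReader getReader(ClassLoader loader, String relativePath)
            throws FileNotFoundException {
        // ClassLoader lookups are relative to the classpath roots, so no leading '/'.
        InputStream in = loader.getResourceAsStream(relativePath);
        if (in == null) {
            throw new FileNotFoundException("Not found on classpath: " + relativePath);
        }
        // Assumes UTF-8 text; the caller is responsible for closing the reader.
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }
}
```

A caller would write something like try (BufferedReader r = ClasspathResourceManager.getReader("configs/foo.xml")) { ... }, and which module supplies the file is decided entirely by classpath order.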
#3 Updated by Christopher Brooks almost 12 years ago
Please be sure to review how the Java I18N stuff works.
which is an old article from 1998
which is a tutorial