# = Ruby Ecoinformatics Library
# == What is it?
# A tool for accessing ecological datasets and their metadata in Ruby.
# It's a way to do this:
#   squery = '<pathquery version="1.2">...</pathquery>' # looks for all pisco-subtidal data
#   Metacat.new('http://data.piscoweb.org/catalog/metacat') do |metacat|
#     eml_docs = metacat.find(:squery => squery)
#     eml_docs.each do |eml|
#       puts eml.docid
#       eml.data_tables.each do |data_table|
#         puts "\t --#{data_table.entity_name}"
#       end
#     end
#   end
# Spits out:
#   pisco_subtidal.40.1
#       --quad_taxon_lookup_table.csv
#       --pisco_quad_data.csv
#   pisco_subtidal.12.4
#       --fish_taxon_lookup_table.csv
#       --pisco_fish_data.csv
#   pisco_subtidal.21.7
#       --swath_taxon_lookup_table.csv
#       --pisco_swath_data.csv
#   pisco_subtidal.30.2
#       --upc_taxon_lookup_table.csv
#       --pisco_upc_data.csv
# For now, this library is for dealing with EML metadata and data coming out of
# the Metacat data catalog (http://knb.ecoinformatics.org/software/metacat).
# The Metacat client class will generally return Eml objects, each representing its
# corresponding EML document. This object can then be used to access document-wide
# metadata. Furthermore, each Eml object contains DataTable objects representing
# their corresponding DataTable elements. For now, Eml and DataTable are mostly
# wrappers for a REXML::Document DOM representation of the XML data. In the future
# they may contain more functionality.
#
# One very important feature of the Metacat and DataTable classes is the ability to
# read plain-text data tables by passing a block (closure) that handles fragments
# of the file. This way, data can be read without the cost of loading a huge dataset
# directly into RAM.
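#
# As a rough illustration of that block-based pattern (a self-contained sketch, not
# this library's code: +each_fragment+ and its +size+ argument are hypothetical names
# invented for this example), a reader can yield fixed-size fragments so that only one
# fragment is held in memory at a time:
#
#   require 'stringio'
#
#   # Hypothetical helper: yields successive fragments of an IO to the block.
#   def each_fragment(io, size = 1024)
#     # IO#read(size) returns nil at end of input, which ends the loop.
#     while (fragment = io.read(size))
#       yield fragment
#     end
#   end
#
#   # Stand-in for a large plain-text data table.
#   table = StringIO.new("site,count\nA,3\nB,7\n")
#
#   bytes = 0
#   each_fragment(table, 8) do |fragment|
#     bytes += fragment.length # process the fragment, then let it go
#   end
#   puts bytes
#
# The actual block arguments of Metacat.read and DataTable.read may differ; see the
# documentation of those classes.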
# == Status
# The Metacat client library is very usable, and I'm happy with the current documentation.
#
# The Eml and, even worse, DataTable classes are minimally functional. They also still
# contain fragments of the original application that inspired them, which should be removed
# as they are not generally applicable. Documentation is also not up to par.
#
# Where their methods are lacking, I would recommend using the REXML::Document contained
# within both Eml and DataTable to access metadata attributes. DataTable.read() is
# in a usable state and handles blocks the same way as Metacat.read.
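#
# A minimal sketch of that fallback, using only REXML itself. The XML string is a
# made-up, simplified stand-in for an EML document, and how Eml or DataTable expose
# their REXML::Document is not shown here (check the class documentation):
#
#   require 'rexml/document'
#
#   # Invented sample; real EML documents are far larger.
#   xml = '<eml><dataset><title>Subtidal survey</title>' \
#         '<dataTable><entityName>fish.csv</entityName></dataTable>' \
#         '</dataset></eml>'
#   doc = REXML::Document.new(xml)
#
#   # XPath lookups work on any REXML::Document.
#   puts REXML::XPath.first(doc, '//dataset/title').text
#   REXML::XPath.each(doc, '//dataTable/entityName') do |node|
#     puts node.text
#   end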
#
# Where documentation is lacking, check the unit tests under ./tests. These sometimes
# give a clearer picture of intended usage.
#
# == Examples
# See classes Metacat, Eml, and DataTable for more examples.
# == Author
# Chad Burt
#
# Marine Science Institute
#
# University of California, Santa Barbara
#
# chad@underbluewaters.net