metacat / src / ruby / README @ 8561

# = Ruby Ecoinformatics Library
# == What is it?
# A tool for accessing ecological datasets and their metadata in Ruby.
# It's a way to do this:
#    squery = '<pathquery version="1.2">...</pathquery>' # looks for all pisco-subtidal data
#    Metacat.new('http://data.piscoweb.org/catalog/metacat') do |metacat|
#      eml_docs = metacat.find(:squery => squery)
#      eml_docs.each do |eml|
#        puts eml.docid
#        eml.data_tables.each do |data_table|
#          puts "\t --#{data_table.entity_name}"
#        end
#      end
#    end
# Spits out:
#    pisco_subtidal.40.1
#             --quad_taxon_lookup_table.csv
#             --pisco_quad_data.csv
#    pisco_subtidal.12.4
#             --fish_taxon_lookup_table.csv
#             --pisco_fish_data.csv
#    pisco_subtidal.21.7
#             --swath_taxon_lookup_table.csv
#             --pisco_swath_data.csv
#    pisco_subtidal.30.2
#             --upc_taxon_lookup_table.csv
#             --pisco_upc_data.csv
# For now, this library is for dealing with EML metadata and data coming out of
# the Metacat data catalog (http://knb.ecoinformatics.org/software/metacat).
# The Metacat client class will generally return Eml objects, each representing its
# corresponding EML document. This object can then be used to access document-wide
# metadata. Furthermore, each Eml object contains DataTable objects representing
# their corresponding DataTable elements. For now, Eml and DataTable are mostly
# wrappers for a REXML::Document DOM representation of the XML data. In the future
# they may contain more functionality.
#
# One very important feature of classes Metacat and DataTable is the ability to
# read plain-text data tables by passing a block (closure) that handles fragments
# of the file. This way data can be read without the cost of loading a huge dataset
# directly into RAM.
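#
# The block-per-fragment pattern described above can be sketched roughly as follows.
# This is only an illustration, not this library's implementation: each_fragment and
# count_lines are hypothetical helpers, the fragment size is arbitrary, and a StringIO
# stands in for a data table that would normally be streamed from Metacat.

```ruby
require 'stringio'

# Hypothetical helper (not part of this library): hands the contents of an IO
# to the caller's block one fixed-size fragment at a time, so the whole file
# is never held in memory at once.
def each_fragment(io, fragment_size = 1024)
  # IO#read(n) returns nil at end of file, which ends the loop.
  while (fragment = io.read(fragment_size))
    yield fragment
  end
end

# Example consumer: counts newline-terminated rows using only the fragments.
def count_lines(io, fragment_size = 1024)
  lines = 0
  each_fragment(io, fragment_size) { |fragment| lines += fragment.count("\n") }
  lines
end

# A tiny in-memory stand-in for a plain-text data table.
table = StringIO.new("taxon,count\nA,1\nB,2\n")
puts count_lines(table, 8)
```

# The consumer only ever sees one fragment, so memory use is bounded by the
# fragment size rather than by the size of the table.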
# == Status
# The Metacat client library is very usable and I'm happy with the current documentation.
#
# The Eml and, even more so, DataTable classes are minimally functional. They also still
# contain fragments of the original application that inspired them, which should be removed
# as they are not generally applicable. Their documentation is also not up to par.
#
# Where their methods are lacking, I would recommend using the REXML::Document contained
# within both Eml and DataTable to access metadata attributes. DataTable.read() is
# in a usable state and handles blocks the same way as Metacat.read.
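#
# As a rough sketch of reaching into the XML with REXML directly: how Eml and DataTable
# expose their wrapped document is not shown in this README, so the example parses a
# small, made-up EML fragment itself. SAMPLE_EML, package_id and entity_names are
# illustrative names, not part of this library.

```ruby
require 'rexml/document'

# Made-up, heavily trimmed EML fragment used only for illustration;
# a real document returned by Metacat carries far more metadata.
SAMPLE_EML = <<~XML
  <eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.0.1" packageId="pisco_subtidal.40.1">
    <dataset>
      <title>Example dataset</title>
      <dataTable>
        <entityName>quad_taxon_lookup_table.csv</entityName>
      </dataTable>
      <dataTable>
        <entityName>pisco_quad_data.csv</entityName>
      </dataTable>
    </dataset>
  </eml:eml>
XML

# Reads the packageId attribute from the document's root element.
def package_id(doc)
  doc.root.attributes['packageId']
end

# Collects the text of every entityName nested under a dataTable element.
def entity_names(doc)
  REXML::XPath.match(doc, '//dataTable/entityName').map(&:text)
end

doc = REXML::Document.new(SAMPLE_EML)
puts package_id(doc)
entity_names(doc).each { |name| puts "\t --#{name}" }
```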
#
# Where documentation is lacking, check the unit tests under ./tests. These sometimes
# give a clearer picture of intended usage.
#
# == Examples
# See classes Metacat, Eml, and DataTable for more examples.
# == Author
# Chad Burt
#
# Marine Science Institute
#
# University of California, Santa Barbara
#
# chad@underbluewaters.net