# = Ruby Ecoinformatics Library
# == What is it?
# A tool for accessing ecological datasets and their metadata in Ruby.
# It's a way to do this:
#   squery = '<pathquery version="1.2">...</pathquery>' # looks for all pisco-subtidal data
#   Metacat.new('http://data.piscoweb.org/catalog/metacat') do |metacat|
#     eml_docs = metacat.find(:squery => squery)
#     eml_docs.each do |eml|
#       puts eml.docid
#       eml.data_tables.each do |data_table|
#         puts "\t --#{data_table.entity_name}"
#       end
#     end
#   end
# Which prints:
#   pisco_subtidal.40.1
#       --quad_taxon_lookup_table.csv
#       --pisco_quad_data.csv
#   pisco_subtidal.12.4
#       --fish_taxon_lookup_table.csv
#       --pisco_fish_data.csv
#   pisco_subtidal.21.7
#       --swath_taxon_lookup_table.csv
#       --pisco_swath_data.csv
#   pisco_subtidal.30.2
#       --upc_taxon_lookup_table.csv
#       --pisco_upc_data.csv
# For now, this library deals with EML metadata and data coming out of
# the Metacat data catalog (http://knb.ecoinformatics.org/software/metacat).
# The Metacat client class generally returns Eml objects, each representing its
# corresponding EML document. An Eml object can then be used to access document-wide
# metadata. Furthermore, each Eml object contains DataTable objects representing
# the corresponding DataTable elements. For now, Eml and DataTable are mostly
# wrappers around a REXML::Document DOM representation of the XML data. In the future
# they may contain more functionality.
#
# One very important feature of the Metacat and DataTable classes is the ability to
# read plain-text data tables by passing a block (closure) that handles fragments
# of the file. This way data can be read without the cost of loading a huge dataset
# directly into RAM.
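#
# The block-based reading pattern can be sketched in plain Ruby. This is only an
# illustration of the idea, not this library's implementation: each_fragment is a
# hypothetical helper, the 16-byte chunk size is arbitrary, and a StringIO stands
# in for a large remote data table.

```ruby
require 'stringio'

# Hypothetical helper (not part of this library): passes the contents of an IO
# to the block in fragments of at most chunk_size bytes, so the whole file is
# never held in memory at once.
def each_fragment(io, chunk_size = 1024)
  # IO#read(n) returns nil at end of file, which ends the loop.
  while (fragment = io.read(chunk_size))
    yield fragment
  end
end

# Stand-in for a large plain-text data table.
csv = StringIO.new("site,species,count\nA,urchin,12\nB,kelp,7\n")

# Count lines without ever building the full string in the block.
line_count = 0
each_fragment(csv, 16) { |fragment| line_count += fragment.count("\n") }
puts line_count # prints 3
```

# Because the block only ever sees one fragment at a time, memory use is bounded
# by the chunk size rather than by the size of the dataset.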
# == Status
# The Metacat client library is very usable, and I'm happy with its current documentation.
#
# The Eml and, even more so, DataTable classes are minimally functional. They also still
# contain fragments of the original application that inspired them, which should be removed
# because they are not generally applicable. Their documentation is also not up to par.
#
# Where their methods are lacking, I recommend using the REXML::Document contained within
# both Eml and DataTable to access metadata attributes. DataTable.read() is
# in a usable state and handles blocks the same way as Metacat.read.
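#
# As a rough sketch of working with a REXML::Document directly: how Eml and
# DataTable expose their document is not shown here, so the example below parses
# a small, made-up EML-like fragment itself. SAMPLE_EML and summarize_eml are
# hypothetical names, and the XPath expressions assume the simplified structure
# shown rather than a complete EML document.

```ruby
require 'rexml/document'

# Made-up, heavily simplified EML-like fragment for illustration only.
SAMPLE_EML = <<~XML
  <eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.0.1" packageId="pisco_subtidal.40.1">
    <dataset>
      <title>Example dataset title</title>
      <dataTable><entityName>quad_taxon_lookup_table.csv</entityName></dataTable>
      <dataTable><entityName>pisco_quad_data.csv</entityName></dataTable>
    </dataset>
  </eml:eml>
XML

# Hypothetical helper: pulls a few metadata values out of an EML string.
def summarize_eml(xml)
  doc = REXML::Document.new(xml)
  {
    # Attribute on the root element.
    package_id: doc.root.attributes['packageId'],
    # Text of the first matching element.
    title: REXML::XPath.first(doc, '//dataset/title').text,
    # Text of every matching element, in document order.
    table_names: REXML::XPath.match(doc, '//dataTable/entityName').map(&:text)
  }
end

info = summarize_eml(SAMPLE_EML)
puts info[:package_id]
info[:table_names].each { |name| puts "\t --#{name}" }
```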
#
# Where documentation is lacking, check the unit tests under ./tests. These sometimes
# give a clearer picture of intended usage.
#
# == Examples
# See classes Metacat, Eml, and DataTable for more examples.
# == Author
# Chad Burt
#
# Marine Science Institute
#
# University of California, Santa Barbara
#
# chad@underbluewaters.net