Revision 6846

Added by Matt Jones over 12 years ago

Converted Event Logging and Sitemaps chapters to RST.

View differences:

docs/dev/metacat/source/sitemaps.rst
Enabling Web Searches: Sitemaps
===============================

Sitemaps are XML files that tell search engines (such as Google, which is
discussed in this section) which URLs on your website are available for
crawling. Currently, the only way for a search engine to crawl and index
Metacat so that individual metadata entries are available via Web searches
is with a sitemap. Metacat automatically creates sitemaps for all public
documents in the repository. However, you must register the sitemaps with
the search engine before they take effect.

Creating a Sitemap
------------------

Metacat automatically generates a sitemap file for all public documents in
the repository on a daily basis. The sitemap files must be available via
the Web on your server, and must be registered with Google before they take
effect. For information on the sitemap protocol, please refer to the Google
page on using the sitemap protocol. Metacat writes its sitemap files to::

  <webapps_dir>/sitemaps

The directory contains one or more XML files named::

  metacat<X>.xml

where ``<X>`` is a number (e.g., 1 or 2) used to increment each sitemap file.
Because Metacat limits the number of sitemap entries in each sitemap file to
25,000, the servlet creates an additional sitemap file for each group of
25,000 entries.
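
The splitting rule above can be illustrated with a short Python sketch. It assumes the files are numbered consecutively starting at 1, which the text implies but does not state. The function name ``sitemap_file_names`` is hypothetical and is not part of Metacat.

```python
import math

# Limit stated in the text: at most 25,000 entries per sitemap file.
MAX_ENTRIES_PER_FILE = 25_000


def sitemap_file_names(public_doc_count: int) -> list[str]:
    """Return the sitemap file names expected for a number of public documents.

    Assumption: files are numbered consecutively, starting at 1.
    """
    if public_doc_count < 0:
        raise ValueError("public_doc_count must not be negative")
    file_count = math.ceil(public_doc_count / MAX_ENTRIES_PER_FILE)
    return [f"metacat{n}.xml" for n in range(1, file_count + 1)]


# 60,000 public documents need three files (25,000 + 25,000 + 10,000).
print(sitemap_file_names(60_000))
```

Whether Metacat writes an empty sitemap file when there are no public documents is not covered here; the sketch simply returns an empty list in that case.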

Verify that your sitemap files are available to the Web by browsing to::

  <your_web_context>/sitemaps/metacat<X>.xml
  (e.g., your.server.org/knb/sitemaps/metacat1.xml)
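
The same check can be scripted instead of done in a browser. The following Python sketch uses only the standard library; the helper names ``sitemap_url`` and ``is_available`` are hypothetical, and the server name is the placeholder from the example above, so substitute your own web context.

```python
import urllib.request


def sitemap_url(web_context: str, index: int) -> str:
    """Build the URL of one sitemap file under a Metacat web context."""
    return f"{web_context.rstrip('/')}/sitemaps/metacat{index}.xml"


def is_available(url: str, timeout: float = 10.0) -> bool:
    """Return True if an HTTP GET on the URL succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False


# Placeholder host from the example above; replace with your own server.
print(sitemap_url("http://your.server.org/knb", 1))
```

Calling ``is_available(sitemap_url("http://your.server.org/knb", 1))`` against a real server then reports whether the first sitemap file can be fetched.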

Registering a Sitemap
---------------------

Before Google will begin indexing the public files in your Metacat, you must
register the sitemaps. To register your sitemaps and ensure that they are up
to date:

1. Register for a Google Webmaster Tools account, and add your Metacat
   site to the Dashboard.
2. From your Google Webmaster Tools site account, register your sitemaps.
   See the Google help site for more information about how to register sitemaps.
   Note: Register the full URL of each sitemap file, including
   the http:// (or https://) prefix.

Once the sitemaps are registered, Google will begin to index the public
documents in your Metacat repository.

NOTE: As you add more publicly accessible data to Metacat, you will need to
periodically revisit the Google Webmaster Tools utility to refresh your
sitemap registration.

docs/dev/metacat/source/event-logging.rst
Event Logging
=============

Metacat keeps an internal log of events (such as insertions, updates, deletes,
and reads) that can be accessed with the getlog action. Using the getlog action,
event reports can be output from Metacat in XML format and restricted to
certain events: those from a particular IP address, user, document, or event
type, or those that occurred after a specified start date or before an end date.

The following URL returns the basic log, an XML-formatted log of all
events since the log was initiated::

  http://some.metacat.host/context/metacat?action=getlog

Note that you must be logged in to Metacat using the HTTP interface or you
will get an error message. For more information about logging in, please see
Logging In with the HTTP Interface.

::

  <?xml version="1.0"?>
  <!-- Example of XML Log -->
  <log>
    <logEntry>
      <entryid>44</entryid>
      <ipAddress>34.237.20.142</ipAddress>
      <principal>uid=jones,o=NCEAS,dc=ecoinformatics,dc=org</principal>
      <docid>esa.2.1</docid>
      <event>insert</event>
      <dateLogged>2004-09-08 19:08:18.16</dateLogged>
    </logEntry>
    <logEntry>
      <entryid>47</entryid>
      <ipAddress>34.237.20.142</ipAddress>
      <principal>uid=jones,o=NCEAS,dc=ecoinformatics,dc=org</principal>
      <docid>esa.3.1</docid>
      <event>insert</event>
      <dateLogged>2004-09-14 19:50:40.61</dateLogged>
    </logEntry>
  </log>
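
A client that receives this XML can turn it into records with a few lines of Python. The sketch below assumes the response has exactly the structure shown in the example (a ``log`` root containing ``logEntry`` elements); the function name ``parse_log`` is hypothetical, and the sample string is a single entry copied from the example for brevity.

```python
import xml.etree.ElementTree as ET

# One entry copied from the example log above.
SAMPLE_LOG = """<?xml version="1.0"?>
<log>
  <logEntry>
    <entryid>44</entryid>
    <ipAddress>34.237.20.142</ipAddress>
    <principal>uid=jones,o=NCEAS,dc=ecoinformatics,dc=org</principal>
    <docid>esa.2.1</docid>
    <event>insert</event>
    <dateLogged>2004-09-08 19:08:18.16</dateLogged>
  </logEntry>
</log>
"""


def parse_log(xml_text: str) -> list[dict[str, str]]:
    """Turn each <logEntry> into a dict keyed by child element name."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in entry}
        for entry in root.findall("logEntry")
    ]


for entry in parse_log(SAMPLE_LOG):
    print(entry["docid"], entry["event"], entry["dateLogged"])
```

Keeping each entry as a plain dictionary makes it easy to filter or group the events further on the client side.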

The basic log can be quite extensive. To subset the report, restrict the
matching events using parameters. Query parameters can be combined to further
restrict the report.

+-----------+-----------------------------------------------------+
| Parameter | Description and Values                              |
+===========+=====================================================+
| ipAddress | Restrict the report to this IP Address (repeatable) |
+-----------+-----------------------------------------------------+
| principal | Restrict the report to this user (repeatable)       |
+-----------+-----------------------------------------------------+
| docid     | Restrict the report to this docid (repeatable)      |
+-----------+-----------------------------------------------------+
| event     | Restrict the report to this event type (repeatable) |
|           | Values: insert, update, delete, read                |
+-----------+-----------------------------------------------------+
| start     | Restrict the report to events after this date       |
|           | Value: YYYY-MM-DD+hh:mm:ss                          |
+-----------+-----------------------------------------------------+
| end       | Restrict the report to events before this date      |
|           | Value: YYYY-MM-DD+hh:mm:ss                          |
+-----------+-----------------------------------------------------+

To view only the 'read' events, use a URL like::

  http://some.metacat.host/context/metacat?action=getlog&event=read

To view only the events for a particular IP address, use a URL like::

  http://some.metacat.host/context/metacat?action=getlog&ipaddress=107.9.1.31

To view only the events for a given user, use a URL like::

  http://some.metacat.host/context/metacat?action=getlog&principal=uid=johndoe,o=NCEAS,dc=ecoinformatics,dc=org

To view only the events for a particular document, use a URL like::

  http://some.metacat.host/context/metacat?action=getlog&docid=knb.5.1

To view only the events after a given date, use a URL like::

  http://some.metacat.host/context/metacat?action=getlog&start=2004-09-15+12:00:00

To view only the events before a given date, use a URL like::

  http://some.metacat.host/context/metacat?action=getlog&end=2004-09-15+12:00:00

To view the 'insert' events for September 2004 (i.e., to combine parameters), use a URL like::

  http://some.metacat.host/context/metacat?action=getlog&event=insert&start=2004-09-01+00:00:00&end=2004-09-30+23:59:59
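
Query URLs like the ones above can also be assembled programmatically. The Python sketch below is one way to do it with the standard library; the function name ``getlog_url`` is hypothetical, and the host is the placeholder used throughout this chapter. A list value produces a repeated parameter, on the assumption that this is how the "repeatable" parameters in the table are meant to be passed. Spaces in dates are encoded as ``+``, matching the ``YYYY-MM-DD+hh:mm:ss`` form.

```python
from urllib.parse import urlencode


def getlog_url(base_url: str, **filters) -> str:
    """Build a getlog URL; list or tuple values yield repeated parameters."""
    params = [("action", "getlog")]
    for name, value in filters.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        params.extend((name, v) for v in values)
    # safe=":" keeps the colons in times readable; spaces become "+".
    return f"{base_url}?{urlencode(params, safe=':')}"


# All 'insert' events logged during September 2004.
print(getlog_url(
    "http://some.metacat.host/context/metacat",
    event="insert",
    start="2004-09-01 00:00:00",
    end="2004-09-30 23:59:59",
))
```

Characters such as ``=`` and ``,`` in a ``principal`` value are percent-encoded by ``urlencode``, which avoids ambiguity with the ``&`` and ``=`` separators of the query string.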