Bug #3396

Enable event notification feature

Added by Chris Jones almost 11 years ago. Updated almost 11 years ago.

Status: In Progress
Priority: Immediate
Assignee:
Category: metacat
Target version:
Start date: 06/14/2008
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id: 3396

Description

We would like to propose some changes to Metacat's event logging
feature to extend the functionality and provide a notification feature
that alerts data set owners and/or interested parties of downloads and
other events. We plan on prototyping the changes, and would like
input and suggestions from other metacat developers on the features
and implementation.

For an email notification system (or other, such as RSS) to work, it
would require a mechanism for the end user to 'subscribe' to
notifications based on events. In brainstorming this, we thought that
the subscription could be based on, perhaps, a hand chosen
notification list of packageIds by data set or data set group (e.g.
'notify me about events on: PISCO intertidal/subtidal/physical ocean/
data packages' ...). Expressing these groupings might be done via a
pathquery document or a cached query that produces a packageId list.
Suggestions are welcome on the best method to associate a data package
docid list and an email address of a person to be notified.
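As one possible starting point for that association, here is a minimal in-memory sketch. The class name, method names, email addresses, and docids are all hypothetical, and a real implementation would presumably persist the mapping in a database table rather than in memory.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Hypothetical map from a data package docid to subscriber emails."""

    def __init__(self):
        self._by_docid = defaultdict(set)

    def subscribe(self, email, docids):
        # docids would come from a hand-chosen list or a cached query result.
        for docid in docids:
            self._by_docid[docid].add(email)

    def subscribers_for(self, docid):
        # .get() avoids creating empty entries for unknown docids.
        return sorted(self._by_docid.get(docid, ()))


registry = SubscriptionRegistry()
registry.subscribe("rcore@example.org", ["pisco.101.1", "pisco.102.3"])
registry.subscribe("im@example.org", ["pisco.101.1"])
print(registry.subscribers_for("pisco.101.1"))
```

Keying the lookup by docid keeps the check cheap at event time, since each logged event carries a single docid.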

The information that's logged in metacat's access_log table is
sufficient for general reporting:

- registered user LDAP DN
  - user name
  - affiliated organization name
- event date/time stamp
- event type
- docid
(However, in building an email [or an RSS feed], the data package
title would be a more friendly way of displaying which package was
downloaded, etc.)
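
To illustrate the point about titles, the sketch below models one logged event and renders a one-line description that prefers the package title and falls back to the docid. The field names, the title lookup dictionary, and the sample values are assumptions for illustration, not the actual access_log schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessEvent:
    # Hypothetical field names mirroring the information listed above.
    principal: str        # registered user LDAP DN
    logged_at: datetime   # event date/time stamp
    event_type: str       # e.g. "read"
    docid: str


def describe(event, titles):
    """Render a friendly line, using the package title when one is known."""
    label = titles.get(event.docid, event.docid)
    return (f"{event.logged_at:%Y-%m-%d %H:%M} {event.event_type} "
            f"of '{label}' by {event.principal}")


event = AccessEvent(
    principal="uid=rcore,o=PISCO,dc=ecoinformatics,dc=org",
    logged_at=datetime(2008, 6, 14, 9, 30),
    event_type="read",
    docid="pisco.101.1",
)
print(describe(event, {"pisco.101.1": "Intertidal survey"}))
```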

The changes to metacat would also likely include a mechanism to
register an event listener that monitors changes to the model backed
by the access_log table. For instance, a researcher might post the
following to metacat:

action=monitor&\
username=uid=rcore,o=PISCO,dc=ecoinformatics,dc=org&\
qformat=email&\
event=read&\
query=< the pathquery document that produces a package list >
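
One practical detail: the LDAP DN in the username parameter contains '=' and ',' characters, so a client would need to URL-encode the values. A minimal sketch of building the request body follows. The 'monitor' action is only the proposal above, and the pathquery value is a stand-in, since the actual document is not specified here.

```python
from urllib.parse import urlencode, parse_qs

# Stand-in only: the real pathquery document is not specified in this report.
pathquery_document = "<pathquery/>"

params = {
    "action": "monitor",  # proposed action, not an existing one
    "username": "uid=rcore,o=PISCO,dc=ecoinformatics,dc=org",
    "qformat": "email",
    "event": "read",
    "query": pathquery_document,
}

# urlencode escapes '=' and ',' inside values so the DN survives intact.
body = urlencode(params)
decoded = parse_qs(body)
print(body)
```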

By doing so, this action would register the listener, and the listener
would provide a callback used to handle the event notification. At
the moment, only metacat administrators have access to the logging
information via the getlog action.
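
A rough sketch of the listener registration and callback described above is shown here. The class and method names are hypothetical, the docids and DN are made up, and the actual write to the access_log table is omitted.

```python
class AccessLogModel:
    """Hypothetical stand-in for the model backed by the access_log table."""

    def __init__(self):
        self._listeners = []

    def register(self, event_type, docids, callback):
        # Each listener watches one event type over a fixed set of docids.
        self._listeners.append((event_type, frozenset(docids), callback))

    def log_event(self, event_type, docid, principal):
        # Writing the row to access_log is omitted; only notification is shown.
        for wanted_type, wanted_docids, callback in self._listeners:
            if event_type == wanted_type and docid in wanted_docids:
                callback(event_type, docid, principal)


received = []
model = AccessLogModel()
model.register("read", ["pisco.101.1"], lambda *args: received.append(args))

visitor = "uid=visitor,o=PISCO,dc=ecoinformatics,dc=org"
model.log_event("read", "pisco.101.1", visitor)    # matches the listener
model.log_event("update", "pisco.101.1", visitor)  # wrong event type
model.log_event("read", "pisco.999.1", visitor)    # docid not watched
print(received)
```

In this sketch the callback is where an email or RSS entry would be produced, which keeps the notification protocol separate from the logging itself.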

Once someone is registered to monitor events, metacat would have to
then provide notification over specific protocols. The notification
process may be easiest if metacat includes an SMTP send-only server,
such as Aspirin, an embeddable SMTP server.

https://aspirin.dev.java.net/
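
Independent of which SMTP component is chosen, the message itself is simple to assemble. The sketch below uses Python's standard email library purely for illustration. It is not Aspirin's API, and the sender address, subject wording, and sample values are assumptions.

```python
from email.message import EmailMessage


def build_notification(recipient, event_type, package_title, docid,
                       sender="metacat@example.org"):
    """Assemble (but do not send) a notification for one logged event."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Metacat {event_type} event: {package_title}"
    msg.set_content(
        f"The data package '{package_title}' ({docid}) "
        f"had a '{event_type}' event.\n"
    )
    return msg


msg = build_notification("rcore@example.org", "read",
                         "Intertidal survey", "pisco.101.1")
# Sending is left out; smtplib.SMTP(host).send_message(msg) is one option.
print(msg["Subject"])
```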

There are other push mechanisms that could be used (like RSS), but the
researchers we work with specifically asked for email-based
notification.

We'll enter a placeholder bugzilla report to keep track of this
feature, but thought that people would have suggestions on both the
design and implementation before we get started.

Please let us know what you think.

Rex, Chris, Mike, Jordan

History

#1 Updated by Chris Jones almost 11 years ago

This is a great idea and a frequently requested feature. I think people would be more likely to archive their data if this were implemented. IMs from various sites would like to be able to get reports about access to the packages they manage, and individual contributors would like to know who is accessing their metadata and data. A couple of things to consider:

1) privacy: who should be able to see log data for a package? Are there privacy concerns with opening this up beyond administrators and data owners? If so, how does an IM check the access log for their site's packages if the packages are owned by someone else at their site and are set to private ACL rules?

2) Send back raw access events in real time? Reports for individual access events? Reports summarizing access over a certain time period? For many people, just getting a monthly download count for their package might be sufficient.

3) Log reporting from replicas. When objects are in several metacats via replication, people can access them from any of the servers. It would be nice if a logging request were replication-aware and could request and aggregate logs from multiple replication servers so that a single report for all downloads could be provided back to the user.
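
Points 2 and 3 can be sketched together: a summary report that counts downloads per package per month across the logs of several servers. The function name, the tuple layout, and the sample docids are hypothetical, and fetching the logs from each replica is not shown.

```python
from collections import Counter
from datetime import datetime


def monthly_read_counts(*server_logs):
    """Count 'read' events per (docid, 'YYYY-MM') across any number of logs."""
    counts = Counter()
    for log in server_logs:
        for when, event_type, docid in log:
            if event_type == "read":
                counts[(docid, when.strftime("%Y-%m"))] += 1
    return counts


# Hypothetical log extracts from a primary server and one replica.
primary_log = [
    (datetime(2008, 6, 14, 9, 30), "read", "pisco.101.1"),
    (datetime(2008, 6, 20, 16, 5), "update", "pisco.101.1"),
]
replica_log = [
    (datetime(2008, 6, 15, 11, 0), "read", "pisco.101.1"),
    (datetime(2008, 7, 1, 8, 45), "read", "pisco.101.1"),
]

report = monthly_read_counts(primary_log, replica_log)
for (docid, month), total in sorted(report.items()):
    print(docid, month, total)
```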

#2 Updated by Callie Bowdish almost 11 years ago

Here is another request regarding this. It is from Dan Gruener. "Also, we want to upload the datasets for NCEAS but still would like to be notified when data requests come in and people download. Is that possible?" Many ecologists would like this feature.

#3 Updated by Margaret O'Brien almost 11 years ago

You might be interested in a project at the LTER Network Office to build an auditing system for all downloads, not just metacat/eml:
http://intranet.lternet.edu/archives/documents/Newsletters/DataBits/07fall/#fa3

#4 Updated by Redmine Admin about 6 years ago

Original Bugzilla ID was 3396
