Feature #5989

open

Track data download, view and citation statistics

Added by Matt Jones almost 11 years ago. Updated over 8 years ago.

Status: In Progress
Priority: Normal
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time: (Total: 0.00 h)
Bugzilla-Id:

Description

Currently, the only usage statistics available in Metacat are the raw logs. This new service would provide several statistical reports in a machine-readable format that clients can consume efficiently to build user interface displays of those statistics.

The service should include the following response statistics, and be extensible to add other tracked statistics as needed:

  1. Number of views (defined as number of times the metadata has been viewed on the web)
  2. Number of package downloads (needs definition)
  3. Size in bytes of package downloads
  4. Number of citations (implement in a second phase)
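As a rough illustration of "extensible to add other tracked statistics", the sketch below keeps each statistic as a named function in a registry, so a later phase (citations, for example) could register one more entry without changing the summarizing code. The `AccessEvent` record, the `STATISTICS` registry, and the sample PIDs are all hypothetical names chosen for this example; they are not part of Metacat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical access record; field names are assumptions."""
    pid: str
    kind: str            # "view" or "download"
    size_bytes: int = 0  # only meaningful for downloads


# Registry of statistics: name -> contribution of a single event.
# A new statistic is added by registering another function here.
STATISTICS = {
    "views": lambda e: 1 if e.kind == "view" else 0,
    "downloads": lambda e: 1 if e.kind == "download" else 0,
    "bytes": lambda e: e.size_bytes if e.kind == "download" else 0,
}


def summarize(events, names=None):
    """Sum each requested statistic over a list of events."""
    names = list(STATISTICS) if names is None else names
    return {name: sum(STATISTICS[name](e) for e in events) for name in names}


events = [
    AccessEvent("example.1.1", "view"),
    AccessEvent("example.1.1", "download", 2048),
    AccessEvent("example.2.1", "download", 512),
]
print(summarize(events))
```

What counts as a "package download" is still undefined in this ticket, so the `kind == "download"` test above is only a placeholder for that definition.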

For each of these statistics, calling apps should be able to constrain the results to only include records matching:

  1. a PID or list of PIDs
  2. creator or list of creators (DN, or ORCID, or some amalgam -- to be discussed)
  3. a time range of access event (upload, download, etc.)
  4. ? spatial location of access event (upload, download, etc.)
  5. ? IP Address
  6. accessor or list of accessors (DN, or ORCID, or some amalgam, needs ACL -- to be discussed)
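One way to picture these constraints is as an optional filter object in which every unset field means "no restriction". The sketch below covers PIDs, creators, accessors, and a time range; spatial location and IP address are left out because they are still marked as open questions above. `AccessEvent`, `EventFilter`, and `select` are hypothetical names, and identities are plain strings here because the DN versus ORCID question is undecided.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical access record; field names are assumptions."""
    pid: str
    creator: str
    accessor: str
    when: datetime


@dataclass(frozen=True)
class EventFilter:
    """Each field left as None places no restriction on the results."""
    pids: Optional[FrozenSet[str]] = None
    creators: Optional[FrozenSet[str]] = None
    accessors: Optional[FrozenSet[str]] = None
    start: Optional[datetime] = None  # inclusive
    end: Optional[datetime] = None    # exclusive

    def matches(self, e: AccessEvent) -> bool:
        return (
            (self.pids is None or e.pid in self.pids)
            and (self.creators is None or e.creator in self.creators)
            and (self.accessors is None or e.accessor in self.accessors)
            and (self.start is None or e.when >= self.start)
            and (self.end is None or e.when < self.end)
        )


def select(events, flt: EventFilter):
    """Return only the events that satisfy every set constraint."""
    return [e for e in events if flt.matches(e)]


events = [
    AccessEvent("example.1.1", "creator-a", "user-x", datetime(2014, 3, 5)),
    AccessEvent("example.1.1", "creator-a", "user-y", datetime(2014, 4, 2)),
    AccessEvent("example.2.1", "creator-b", "user-x", datetime(2014, 3, 9)),
]
march = EventFilter(
    pids=frozenset({"example.1.1"}),
    start=datetime(2014, 3, 1),
    end=datetime(2014, 4, 1),
)
print(select(events, march))
```

Filtering by accessor would presumably need an access-control check before results are returned, which this sketch does not attempt.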

For each of these statistics, calling apps should be able to request the statistic aggregated by several specific facets, including the following (in order of importance):

  1. User (DN, or ORCID, or some amalgam -- to be discussed)
  2. Time range, aggregated to requested unit (day, week, month, year)
  3. ? Spatial range, aggregated to requested unit (to be discussed)
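Aggregating a time range "to requested unit" amounts to mapping each event timestamp to a bucket label and counting per label. The following is a minimal sketch under the assumption that weeks follow the ISO 8601 calendar; the function names and label formats are illustrative choices, not something this ticket specifies.

```python
from collections import Counter
from datetime import datetime


def bucket(when: datetime, unit: str) -> str:
    """Map a timestamp to a label for the requested aggregation unit."""
    if unit == "day":
        return when.strftime("%Y-%m-%d")
    if unit == "week":
        iso = when.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if unit == "month":
        return when.strftime("%Y-%m")
    if unit == "year":
        return when.strftime("%Y")
    raise ValueError(f"unsupported unit: {unit!r}")


def count_by(timestamps, unit: str) -> Counter:
    """Count events per bucket of the requested unit."""
    return Counter(bucket(t, unit) for t in timestamps)


timestamps = [
    datetime(2014, 3, 5),
    datetime(2014, 3, 28),
    datetime(2014, 4, 1),
]
print(count_by(timestamps, "month"))
```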

Intersections of these aggregated facets should also be possible, but are a lower priority than the facets alone. For example, when finished, one should be able to request the following reports, among others:

  1. {Views,Downloads,Bytes,Citations} for a given pid or list of pids
  2. {Views,Downloads,Bytes,Citations} by user (aggregates across pids)
  3. {Views,Downloads,Bytes,Citations} by month (aggregates across pids)
  4. {Views,Downloads,Bytes,Citations} by spatial location (aggregates across pids)
  5. {Views,Downloads,Bytes,Citations} for a given pid by month for a specific time range
  6. {Views,Downloads,Bytes,Citations} by user by month
  7. etc.
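The reports listed above can be read as one operation: group events by a tuple of facet values and total the statistics per group. The sketch below does that for any combination of the facets registered in `FACETS`, so "by user", "by month", and "by user by month" are the same call with different arguments. The event dictionaries, the `FACETS` table, and the user and PID values are hypothetical; citations are omitted because they are deferred to a second phase.

```python
from collections import defaultdict

# Facet name -> function extracting the grouping value from an event.
# Events are plain dicts with an ISO date string in "when" (an assumption).
FACETS = {
    "pid": lambda e: e["pid"],
    "user": lambda e: e["user"],
    "month": lambda e: e["when"][:7],
}


def report(events, facets):
    """Total views, downloads, and bytes for each combination of facets."""
    totals = defaultdict(lambda: {"views": 0, "downloads": 0, "bytes": 0})
    for e in events:
        key = tuple(FACETS[name](e) for name in facets)
        row = totals[key]
        if e["kind"] == "view":
            row["views"] += 1
        elif e["kind"] == "download":
            row["downloads"] += 1
            row["bytes"] += e["bytes"]
    return dict(totals)


events = [
    {"pid": "example.1.1", "user": "user-x", "when": "2014-03-05", "kind": "view", "bytes": 0},
    {"pid": "example.1.1", "user": "user-x", "when": "2014-03-20", "kind": "download", "bytes": 100},
    {"pid": "example.2.1", "user": "user-y", "when": "2014-03-21", "kind": "download", "bytes": 50},
    {"pid": "example.1.1", "user": "user-x", "when": "2014-04-02", "kind": "view", "bytes": 0},
]

for key, row in report(events, ["user", "month"]).items():
    print(key, row)
```

A report for a given PID over a specific time range would filter the events first and then call `report(filtered, ["pid", "month"])`.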

The download format (JSON? XML?) should allow for an extensible set of response variables and an extensible set of aggregating facets. This needs discussion, but XML is the likely choice because it is DataONE's initial choice for all other services. Consider supporting both if useful.
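To help weigh the two formats, here is a sketch that emits the same report rows as JSON and as XML. Facets and statistics are stored as name/value pairs in both, so new ones can be added without changing the structure. The element names (`usageStatistics`, `record`, `facet`, `statistic`) and the row layout are invented for this example and are not a DataONE schema.

```python
import json
import xml.etree.ElementTree as ET


def to_json(rows):
    """Serialize report rows as a JSON document."""
    return json.dumps({"results": rows}, sort_keys=True)


def to_xml(rows):
    """Serialize the same rows as XML with name/value child elements."""
    root = ET.Element("usageStatistics")  # hypothetical element name
    for row in rows:
        record = ET.SubElement(root, "record")
        for name, value in row["facets"].items():
            ET.SubElement(record, "facet", name=name).text = str(value)
        for name, value in row["stats"].items():
            ET.SubElement(record, "statistic", name=name).text = str(value)
    return ET.tostring(root, encoding="unicode")


rows = [
    {
        "facets": {"user": "user-x", "month": "2014-03"},
        "stats": {"views": 3, "downloads": 1, "bytes": 2048},
    }
]
print(to_json(rows))
print(to_xml(rows))
```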

The REST API for this service should be developed in the DataONE space, with the intention that it be implementable by other MNs as well as by CNs in DataONE.
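No endpoint has been defined for this service yet. Purely to make the discussion concrete, the sketch below shows how a client might assemble a request URL from the filters and facets described above. The `/usage` path and the `pid`, `fromDate`, `toDate`, and `groupBy` parameter names are assumptions made for this example, not an existing DataONE API, and the base URL is a placeholder.

```python
from urllib.parse import urlencode


def build_stats_url(base_url, pids=(), from_date=None, to_date=None, group_by=()):
    """Build a request URL for a hypothetical usage-statistics endpoint."""
    params = [("pid", pid) for pid in pids]  # repeated parameter for a list of PIDs
    if from_date:
        params.append(("fromDate", from_date))
    if to_date:
        params.append(("toDate", to_date))
    params += [("groupBy", facet) for facet in group_by]
    query = urlencode(params)  # percent-encodes characters such as ':' and '/'
    url = f"{base_url}/usage"  # hypothetical path
    return f"{url}?{query}" if query else url


url = build_stats_url(
    "https://mn.example.org/mn/v2",
    pids=["doi:10.0000/example"],
    from_date="2014-01-01",
    to_date="2015-01-01",
    group_by=["user", "month"],
)
print(url)
```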


Subtasks 5 (5 open, 0 closed)

  • Task #5990: Track downloads (New)
  • Task #5991: Track views (New)
  • Task #5992: Track citations (New)
  • Task #5993: Summarize and index statistics for fast access (New)
  • Task #5994: Create REST API for accessing statistics (New)

Related issues

  • Related to Metacat - Feature #6346: Make # READ events available in SOLR index (Resolved, ben leinfelder, 01/03/2014)
  • Has duplicate MetacatUI - Feature #6289: Create usage statistics service (Rejected)

#1

Updated by ben leinfelder over 10 years ago

  • Target version changed from 1.1.0 to 1.2.0
#2

Updated by ben leinfelder over 10 years ago

  • Target version changed from 1.2.0 to 1.3.0
#3

Updated by ben leinfelder over 10 years ago

  • Target version deleted (1.3.0)
#4

Updated by Lauren Walker over 10 years ago

  • Target version set to 1.5.0
#5

Updated by Matt Jones over 10 years ago

  • Description updated (diff)
#6

Updated by Matt Jones over 10 years ago

  • Project changed from MetacatUI to Metacat
  • Target version deleted (1.5.0)
#7

Updated by Matt Jones over 10 years ago

  • Assignee set to Peter Slaughter
  • Target version set to 2.5.0
#8

Updated by Matt Jones over 10 years ago

  • Description updated (diff)

Updated description to clarify a few of the reports and filters.

#9

Updated by Matt Jones over 10 years ago

  • Description updated (diff)
#10

Updated by Matt Jones over 10 years ago

  • Description updated (diff)
#11

Updated by Matt Jones over 10 years ago

  • Target version changed from 2.5.0 to 2.4.0
#12

Updated by Peter Slaughter over 10 years ago

  • Status changed from New to In Progress
#13

Updated by ben leinfelder about 10 years ago

  • Target version changed from 2.4.0 to 2.5.0
#14

Updated by ben leinfelder almost 10 years ago

I believe this is now all targeted for the CNs. We may consider moving tasks into the DataONE tracker as appropriate...

#15

Updated by ben leinfelder over 8 years ago

  • Target version changed from 2.5.0 to 2.5.1
#16

Updated by ben leinfelder over 8 years ago

  • Target version changed from 2.5.1 to 2.x.y