DSpace Documentation : DSpace Statistics
This page last changed on Jan 14, 2011 by benbosman.
DSpace Statistics
DSpace 1.6 and newer versions use the Apache Solr application as the basis for statistics. Solr supports efficient searching of, and additions to, large volumes of usage data.
What exactly is being logged?
Each time a page or file is requested, the request is logged. Logging happens on the server side and, unlike Google Analytics, does not require JavaScript to provide usage data. The fields to be stored are defined in the file dspace/solr/statistics/conf/schema.xml:
<field name="type" type="integer" indexed="true" stored="true" required="true" />
<field name="id" type="integer" indexed="true" stored="true" required="true" />
<field name="ip" type="string" indexed="true" stored="true" required="false" />
<field name="time" type="date" indexed="true" stored="true" required="true" />
<field name="epersonid" type="integer" indexed="true" stored="true" required="false" />
<field name="country" type="string" indexed="true" stored="true" required="false" />
<field name="city" type="string" indexed="true" stored="true" required="false"/>
<field name="owningComm" type="integer" indexed="true" stored="true" required="false" multiValued="true" />
The combination of type and id determines which resource (community, collection, item page or file download) was requested.
Web user interface for DSpace statistics
In the XMLUI, statistics can be accessed from the bottom of the navigation menu. In the JSPUI, a "view statistics" button appears at the bottom of pages for which statistics are available. If you do not see these links or buttons, they are probably enabled only for administrators in your installation. Set the configuration parameter "statistics.item.authorization.admin" to false to make statistics visible to all repository visitors.
Home page
Starting from the repository home page, the statistics page displays the top 10 most popular items of the entire repository.
Community home page
The following statistics are available for the community home pages:
Collection home page
The following statistics are available for the collection home pages:
Item home page
The following statistics are available for the item home pages:
Usage Event Logging and Usage Statistics Gathering
The DSpace statistics implementation is a client/server architecture based on Solr for collecting usage events in the JSPUI and XMLUI user interface applications of DSpace. Solr runs as a separate web application, and an instance of Apache HttpClient is used to allow parallel requests to log statistics events into this Solr instance.
Configuration settings for Statistics
In the dspace.cfg file, review the following fields to make sure they are uncommented:
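Checking whether a property is uncommented can be scripted. The sketch below is only an illustration in Python: it treats dspace.cfg as simple "key = value" lines with "#" comments and ignores other features of the Java properties format, such as line continuations. The function names, the sample file content and the key statistics.example.enabled are hypothetical; the only property name taken from this page is statistics.item.authorization.admin.

```python
import re


def active_properties(cfg_text):
    """Return {key: value} for every uncommented 'key = value' line."""
    props = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines ('#' starts a comment).
        if not line or line.startswith("#"):
            continue
        match = re.match(r"([^=\s]+)\s*=\s*(.*)", line)
        if match:
            props[match.group(1)] = match.group(2).strip()
    return props


def missing_properties(cfg_text, required_keys):
    """List the required keys that are absent or commented out."""
    active = active_properties(cfg_text)
    return [key for key in required_keys if key not in active]


# Made-up sample content, not a real dspace.cfg excerpt.
SAMPLE_CFG = """
# statistics.example.enabled = true
statistics.item.authorization.admin = true
"""

if __name__ == "__main__":
    keys = ["statistics.example.enabled", "statistics.item.authorization.admin"]
    # Prints the keys that still need to be uncommented.
    print(missing_properties(SAMPLE_CFG, keys))
```

In practice the text would be read from [dspace]/config/dspace.cfg and the key list would hold the statistics properties relevant to your installation.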
Upgrade Process for Statistics
Example of rebuilding and redeploying DSpace (only if you have configured your distribution in this manner). First, follow the traditional DSpace build process for updating:
cd [dspace-source]/dspace
mvn package
cd [dspace-source]/dspace/target/dspace-<version>-build.dir
ant -Dconfig=[dspace]/config/dspace.cfg update
cp -R [dspace]/webapps/* [TOMCAT]/webapps
The last step is only needed if you are not mounting [dspace]/webapps directly into your Tomcat, Resin or Jetty host (the recommended practice).
If you only need to build the statistics and have not changed any other web applications, you can replace the copy step above with:
cp -R [dspace]/webapps/solr [TOMCAT]/webapps
Again, this applies only if you are not mounting [dspace]/webapps directly into your Tomcat, Resin or Jetty host (the recommended practice).
Restart your web applications (Tomcat/Jetty/Resin).
Older settings that are not related to the new 1.6 Statistics
The following dspace.cfg fields are only applicable to the older statistics solution.
###### Statistical Report Configuration Settings ######
# should the stats be publicly available? should be set to false if you only
# want administrators to access the stats, or you do not intend to generate
# any
report.public = false
# directory where live reports are stored
report.dir = ${dspace.dir}/reports/
These fields are not used by the new 1.6 Statistics; they relate only to the statistics from previous DSpace releases.
Statistics Administration
Converting older DSpace logs into Solr usage data
If you have upgraded from a previous version of DSpace, converting older log files ensures that you carry over usage statistics from before the upgrade.
Statistics Client Utility
The command line interface (CLI) scripts can be used to clean spider traffic out of the usage database and to perform other maintenance tasks.
Statistics differences between DSpace 1.6.x and 1.7.0
Solr optimization added
If required, the Solr server can be optimized by running {dspace.dir}/bin/stats-util -o. More information on how these Solr server optimizations work can be found here: http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
Solr Autocommit
In DSpace 1.6.x, each Solr event was committed to the Solr server individually.
For high-load DSpace installations, this produced a large number of small commits and therefore a very high load on the Solr server. DSpace 1.7.0 addresses this with Solr autocommit, which is configured in {dspace.dir}/solr/statistics/conf/solrconfig.xml.
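To make the difference in commit load concrete, the following Python sketch compares the number of commits issued under per-event committing with the number issued when pending events are committed at most once per time window. It is a simplified model, not DSpace or Solr code: the function names are hypothetical, and the 900-second window and the event timings are arbitrary illustrative values that are not taken from this page.

```python
def count_commits_per_event(event_times):
    """Per-event committing as described for 1.6.x: one commit per event."""
    return len(event_times)


def count_commits_batched(event_times, max_time):
    """Count commits when pending events are flushed once per time window.

    Simplified model: the first uncommitted event starts a timer, and a single
    commit max_time seconds later covers every event received in the meantime.
    """
    commits = 0
    window_end = None
    for t in sorted(event_times):
        if window_end is None or t >= window_end:
            # This event opens a new batch, which will need one commit.
            commits += 1
            window_end = t + max_time
    return commits


if __name__ == "__main__":
    # Hypothetical load: one usage event per second for 1000 seconds.
    events = list(range(1000))
    print(count_commits_per_event(events))
    print(count_commits_batched(events, max_time=900))
```

Under these assumptions the per-event approach issues one commit for each of the 1000 events, while the batched approach needs only two; the trade-off is that an event may not become visible in the statistics until its window closes.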
Document generated by Confluence on Mar 25, 2011 19:21