Workload analysis for scientific literature digital libraries

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Workload studies of large-scale systems may help locate possible bottlenecks and improve performance. However, previous workload analysis for Web applications has typically focused on generic platforms, neglecting the unique characteristics exhibited in various domains of these applications. It is observed that different application domains have intrinsically heterogeneous characteristics, which have a direct impact on system performance. In this study, we present an extensive analysis of the workload of scientific literature digital libraries, unveiling their temporal and user interest patterns. Logs of a computer science literature digital library, CiteSeer, are collected and analyzed. We intentionally remove service details specific to CiteSeer. We believe our analysis is applicable to other systems with similar characteristics. While many of our findings are consistent with previous Web analysis, we discover several unique characteristics of scientific literature digital library workload. Furthermore, we discuss how to utilize our findings to improve system performance.

Original language: English (US)
Pages (from-to): 139-149
Number of pages: 11
Journal: International Journal on Digital Libraries
Volume: 9
Issue number: 2
DOIs: 10.1007/s00799-008-0043-z
State: Published - Nov 1 2008

Fingerprint

technical literature
workload
performance
computer science

All Science Journal Classification (ASJC) codes

  • Library and Information Sciences

Cite this

@article{9bf7cf5fdb034fc4a3d61722343fb561,
title = "Workload analysis for scientific literature digital libraries",
abstract = "Workload studies of large-scale systems may help locating possible bottlenecks and improving performances. However, previous workload analysis for Web applications is typically focused on generic platforms, neglecting the unique characteristics exhibited in various domains of these applications. It is observed that different application domains have intrinsically heterogeneous characteristics, which have a direct impact on the system performance. In this study, we present an extensive analysis into the workload of scientific literature digital libraries, unveiling their temporal and user interest patterns. Logs of a computer science literature digital library, CiteSeer, are collected and analyzed. We intentionally remove service details specific to CiteSeer. We believe our analysis is applicable to other systems with similar characteristics. While many of our findings are consistent with previous Web analysis, we discover several unique characteristics of scientific literature digital library workload. Furthermore, we discuss how to utilize our findings to improve system performance.",
author = "Huajing Li and Lee, {Wang Chien} and Anand Sivasubramaniam and Giles, {C. Lee}",
year = "2008",
month = "11",
day = "1",
doi = "10.1007/s00799-008-0043-z",
language = "English (US)",
volume = "9",
pages = "139--149",
journal = "International Journal on Digital Libraries",
issn = "1432-5012",
publisher = "Springer Verlag",
number = "2",

}

Workload analysis for scientific literature digital libraries. / Li, Huajing; Lee, Wang Chien; Sivasubramaniam, Anand; Giles, C. Lee.

In: International Journal on Digital Libraries, Vol. 9, No. 2, 01.11.2008, p. 139-149.


TY - JOUR

T1 - Workload analysis for scientific literature digital libraries

AU - Li, Huajing

AU - Lee, Wang Chien

AU - Sivasubramaniam, Anand

AU - Giles, C. Lee

PY - 2008/11/1

Y1 - 2008/11/1

N2 - Workload studies of large-scale systems may help locating possible bottlenecks and improving performances. However, previous workload analysis for Web applications is typically focused on generic platforms, neglecting the unique characteristics exhibited in various domains of these applications. It is observed that different application domains have intrinsically heterogeneous characteristics, which have a direct impact on the system performance. In this study, we present an extensive analysis into the workload of scientific literature digital libraries, unveiling their temporal and user interest patterns. Logs of a computer science literature digital library, CiteSeer, are collected and analyzed. We intentionally remove service details specific to CiteSeer. We believe our analysis is applicable to other systems with similar characteristics. While many of our findings are consistent with previous Web analysis, we discover several unique characteristics of scientific literature digital library workload. Furthermore, we discuss how to utilize our findings to improve system performance.

AB - Workload studies of large-scale systems may help locating possible bottlenecks and improving performances. However, previous workload analysis for Web applications is typically focused on generic platforms, neglecting the unique characteristics exhibited in various domains of these applications. It is observed that different application domains have intrinsically heterogeneous characteristics, which have a direct impact on the system performance. In this study, we present an extensive analysis into the workload of scientific literature digital libraries, unveiling their temporal and user interest patterns. Logs of a computer science literature digital library, CiteSeer, are collected and analyzed. We intentionally remove service details specific to CiteSeer. We believe our analysis is applicable to other systems with similar characteristics. While many of our findings are consistent with previous Web analysis, we discover several unique characteristics of scientific literature digital library workload. Furthermore, we discuss how to utilize our findings to improve system performance.

UR - http://www.scopus.com/inward/record.url?scp=56049091745&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=56049091745&partnerID=8YFLogxK

U2 - 10.1007/s00799-008-0043-z

DO - 10.1007/s00799-008-0043-z

M3 - Article

AN - SCOPUS:56049091745

VL - 9

SP - 139

EP - 149

JO - International Journal on Digital Libraries

JF - International Journal on Digital Libraries

SN - 1432-5012

IS - 2

ER -