TY - GEN
T1 - Performance modeling of a distributed web crawler using stochastic activity networks
AU - Nasri, Mitra
AU - Shariati, Saeed
AU - Abdollahi Azgomi, Mohammad
PY - 2008
Y1 - 2008
N2 - One of the basic requirements of Web mining is a crawler system, which collects information from the Web. To predict the performance, dependability, and other operational measures of a system, a formal model of the system must be constructed and evaluated. We have constructed a formal model of a distributed crawler based on UbiCrawler, using stochastic activity networks (SANs). The constructed SAN model is used to evaluate several performance measures of the crawler. The throughput results are the same as the published statistics of UbiCrawler. In addition, we have been able to evaluate two other measures: communication overhead and coverage. In this paper, we discuss the architecture of the distributed crawler, then present a SAN model of the crawler and the results of its evaluation.
AB - One of the basic requirements of Web mining is a crawler system, which collects information from the Web. To predict the performance, dependability, and other operational measures of a system, a formal model of the system must be constructed and evaluated. We have constructed a formal model of a distributed crawler based on UbiCrawler, using stochastic activity networks (SANs). The constructed SAN model is used to evaluate several performance measures of the crawler. The throughput results are the same as the published statistics of UbiCrawler. In addition, we have been able to evaluate two other measures: communication overhead and coverage. In this paper, we discuss the architecture of the distributed crawler, then present a SAN model of the crawler and the results of its evaluation.
KW - performance modeling
KW - stochastic activity networks
KW - Web crawler
UR - http://www.scopus.com/inward/record.url?scp=78449236991&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-89985-3_66
DO - 10.1007/978-3-540-89985-3_66
M3 - Conference contribution
AN - SCOPUS:78449236991
SN - 3540899847
SN - 9783540899846
T3 - Communications in Computer and Information Science
SP - 535
EP - 542
BT - Advances in Computer Science and Engineering
A2 - Sarbazi-Azad, Hamid
PB - Springer
T2 - 13th International Computer Society of Iran Computer Conference on Advances in Computer Science and Engineering, CSICC 2008
Y2 - 9 March 2008 through 11 March 2008
ER -