Conference Papers Year: 2019

Web Scraping: State-of-the-Art and Areas of Application

Abstract

The main objective of Web Scraping is to extract information from one or more websites and process it into simple structures such as spreadsheets, databases, or CSV files. However, in addition to being a very complicated task, Web Scraping is resource- and time-consuming, especially when it is carried out manually. Previous studies have developed several automated solutions. The purpose of this article is to revisit the existing Web Scraping approaches, categories, and tools, as well as its areas of application.
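
As a rough illustration of the extract-then-structure idea described in the abstract, the following minimal sketch turns an HTML table into CSV text using only the Python standard library. It is not taken from the paper: the names `TableExtractor`, `html_table_to_csv`, and `SAMPLE_HTML` are hypothetical, and the sample rows are made-up placeholders rather than data from the article. A real scraper would also have to fetch pages, respect site policies, and cope with malformed markup.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Hypothetical input: placeholder rows, not data from the paper.
SAMPLE_HTML = """
<table>
  <tr><th>Tool</th><th>Category</th></tr>
  <tr><td>Example A</td><td>Library</td></tr>
  <tr><td>Example B</td><td>Framework</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML))
```

Parsing an inline string keeps the sketch self-contained; in practice the HTML would come from an HTTP response, and the CSV text would be written to a file or loaded into a database.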

Dates and versions

hal-02492481 , version 1 (27-02-2020)

Cite

Rabiyatou Diouf, Edouard Ngor Sarr, Ousmane Sall, B. Birregah, Mamadou Bousso, et al. Web Scraping: State-of-the-Art and Areas of Application. 2019 IEEE International Conference on Big Data (Big Data), Dec 2019, Los Angeles, United States. pp. 6040-6042, ⟨10.1109/BigData47090.2019.9005594⟩. ⟨hal-02492481⟩