DATA SCRAPING AND DATA CRUNCHING

Preparing presentations and project reports is a requirement in every sector, especially when you are doing research, and without proper data it cannot be done effectively. Collecting data from the internet is an art, and it requires knowing the right source. Even when you know the right source, if you don't know how to pick the desired data out of that webpage, a big chunk of your valuable time is wasted converting it to the required format manually. I faced this problem for a long time: I used to scrape the data I needed by hand, repeating a long series of copy-paste actions, only because I was unaware that there is a solution to this problem.
Automated data scrapers are tools that collect the data for you and save you from a lot of donkey work. They should be used when the webpage presents its data in some repetitive format. A good web scraper can help with application development, public administration transparency, big data processes, data analysis, online marketing, digital reputation, information reuse, and content comparers and aggregators, among other typical scenarios.
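To give a rough idea of what such a tool automates under the hood, here is a minimal Python sketch using the requests and BeautifulSoup libraries. The URL, the CSS selectors and the column names are hypothetical placeholders, not the source I actually used; a GUI tool like WebHarvy does the equivalent without writing any code.

import csv

import requests
from bs4 import BeautifulSoup

# Hypothetical source URL; substitute the page you actually want to scrape.
URL = "https://example.com/listings"

response = requests.get(URL, timeout=30)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

rows = []
# Assumes each repeating record sits in a <div class="item"> block.
for item in soup.select("div.item"):
    title = item.select_one("h2")
    price = item.select_one("span.price")
    rows.append({
        "title": title.get_text(strip=True) if title else "",
        "price": price.get_text(strip=True) if price else "",
    })

# Save the records so they can be opened in Excel.
with open("scraped_data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(rows)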
I have attached screenshots of the data which I scraped into Excel format.

[Screenshot: the scraped data worksheet, produced using the WebHarvy software.]
[Screenshot: the source webpage that I referred to for my project work.]
Sometimes the data available at the source is spread over a large number of pages, and you can end up wasting your time scraping each page separately. If the page has a "load more" option at the end, you can save that effort: you just need to know the total number of pages that contain your desired data, and then change the address (URL) of the web page in the scraping software.
is the link for the first page. If you want to load 30 pages all together, just change the 0 at the end of the URL to 30; this will load the data of all 30 pages at once.
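For readers who prefer a scripted route, the same trick can be reproduced in Python. The URL pattern, the position of the page count and the selector below are assumptions for illustration only, not the actual link, which differs only in the number at its end.

import requests
from bs4 import BeautifulSoup

# Hypothetical URL pattern; the real address ends with a page count of 0.
URL_TEMPLATE = "https://example.com/news/page/{pages}"

pages_to_load = 30  # instead of editing the URL by hand, set the count here
response = requests.get(URL_TEMPLATE.format(pages=pages_to_load), timeout=30)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")
# Same repetitive-block assumption as in the earlier sketch.
records = [item.get_text(strip=True) for item in soup.select("div.item")]
print(f"Loaded {len(records)} records in one request")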
Here are some of the software tools which can be used for data scraping:
WebHarvy Web Scraper
Collected data is of little use until it is processed, which usually involves complicated operations. The conversion of large, complicated raw data into small, processed data and information is "data crunching". The substitute for manual processing of data is automated data crunching, and there are various software tools available on the internet which can help with it; a small sketch of the idea follows the tool list below.
Some of the data crunching tools are:
Python
R
Matlab
Knime
Along with these, the Google Analytics application gallery can provide great assistance in data crunching.
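To make the idea of crunching concrete, here is a small Python sketch using pandas, one of the libraries typically used from the tools above. The file name, the column names and the cleaning steps are assumptions standing in for whatever your scraper produced, not a fixed recipe.

import pandas as pd

# Load the raw scraped data (for example, the CSV written by the scraping sketch).
raw = pd.read_csv("scraped_data.csv")

# Clean: drop duplicate rows and turn the price text into numbers.
raw = raw.drop_duplicates()
raw["price"] = pd.to_numeric(
    raw["price"].astype(str).str.replace(r"[^\d.]", "", regex=True),
    errors="coerce",
)

# Crunch: collapse many rows into a few summary figures per title.
summary = (
    raw.groupby("title")["price"]
    .agg(["count", "mean", "min", "max"])
    .sort_values("count", ascending=False)
)

summary.to_csv("crunched_summary.csv")  # small, report-ready output
print(summary.head())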
By,
Mamta Joshi
