The Wayback Machine - https://web.archive.org/web/20200907121910/https://github.com/topics/scrapy-crawler
scrapy-crawler

Here are 273 public repositories matching this topic...

DotnetCrawler is a straightforward, lightweight web crawling/scraping library based on .NET Core, with Entity Framework Core output. It is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to accommodate your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

  • Updated Nov 13, 2019
  • C#

A project that implements the data science pipeline: extracting raw data, data cleaning, feature extraction, entity matching, data matching, data merging, and OLAP-style exploration. The two sources chosen are Yelp and Zomato. Restaurant data from the same localities will be extracted from both sites, and similar restaurants will be merged into one big table. OLAP-style exploration will be done on that table to draw insights from the collected data (e.g., which is the most highly rated restaurant in California).

  • Updated Feb 14, 2018
  • Jupyter Notebook
