Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
Twitter based sentiment analysis using JAVA and Hadoop. In this project we are doing the sentiment analysis on twitter data to analyse whether the tweets posted by people are positive or negative or neutral by checking the tweets with the AFFIN dictionary which has a set of 2500 words along with the value of each word ranging from -5 to +5 denoting whether tweets are positive or negative.
This project uses Lexicon-based approach for sentimental analysis of 1000 recent tweets of 4 countries. A sentiment score for each tweet is computed to ascertain the overall nature of the tweet.
Objective of the project is to build a hybrid-filtering personalized news articles recommendation system which can suggest articles from popular news service providers based on reading history of twitter users who share similar interests (Collaborative filtering) and content similarity of the article and user’s tweets (Content-based filtering).