Text mining is the application of natural language processing techniques and analytical methods to text data in order to derive relevant information. In recent years, text mining has attracted a lot of attention, driven by an exponential increase in digital text data from web pages, Google’s projects and social media services such as Twitter.
Twitter data constitutes a rich source of information about almost any topic imaginable. It supports use cases such as finding trends related to a specific keyword, measuring brand sentiment, and gathering feedback about new products and services.
First, I will explain how to connect to the Twitter Streaming API and how to get the data.
Getting Data from Twitter Streaming API
APIs are tools that make it easy to interact with computer programs and web services. Many web services provide APIs so that developers can interact with their services and access data programmatically. In the next few steps, we will use the Twitter Streaming API to download tweets related to the keyword “India”.
Step 1: Getting Twitter API keys
In order to access the Twitter Streaming API, we need four pieces of information from Twitter: API key, API secret, Access token and Access token secret. Follow the steps below to get all four:
· Create a Twitter account if you do not already have one.
· Go to https://apps.twitter.com/ and log in with your Twitter credentials.
· Click “Create New App”.
· Fill out the form, agree to the terms, and click “Create your Twitter application”.
· On the next page, click the “API keys” tab and copy your “API key” and “API secret”.
· Scroll down, click “Create my access token”, and copy your “Access token” and “Access token secret”.
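Once you have copied the four values, it is safer to keep them out of your script. The following is a minimal sketch, not part of the original walkthrough, that reads them from environment variables. The variable names (TWITTER_API_KEY and so on) and the helper name load_twitter_credentials are hypothetical; pick whatever names suit your setup.

```python
import os

# Hypothetical environment variable names, chosen for illustration only.
CREDENTIAL_VARS = {
    "api_key": "TWITTER_API_KEY",
    "api_secret": "TWITTER_API_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_twitter_credentials(environ=None):
    """Return the four credentials as a dict; raise KeyError if any is missing or empty."""
    if environ is None:
        environ = os.environ
    # Look up each credential, treating an unset variable as an empty string.
    credentials = {name: environ.get(var, "") for name, var in CREDENTIAL_VARS.items()}
    # Collect the names of the variables that were unset or empty.
    missing = [CREDENTIAL_VARS[name] for name, value in credentials.items() if not value]
    if missing:
        raise KeyError("Missing Twitter credentials: " + ", ".join(missing))
    return credentials
```

After exporting the four variables in your shell, calling load_twitter_credentials() returns a dictionary that you can pass to whatever code opens the connection. Failing early on a missing value tends to be easier to debug than an authentication error later on.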
Step 2: Connecting to Twitter Streaming API and downloading data
To connect to Twitter and download data, we will use the Python library “tweepy”.
If you don’t have tweepy installed on your machine, follow the installation instructions in the tweepy documentation (it is usually installed with pip install tweepy).
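If you are not sure whether tweepy is already present, a quick check like the one below can tell you before you go any further. This is only a sketch using the Python standard library; the helper name is_installed is made up for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found for import."""
    return importlib.util.find_spec(module_name) is not None


# Report whether tweepy can be imported in the current Python environment.
if is_installed("tweepy"):
    print("tweepy is available")
else:
    print("tweepy is not installed; install it before continuing")
```

The check only looks at the Python environment that runs it, so run it with the same interpreter you plan to use for the rest of the tutorial.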