Google News

You can create a dataset using the Google News API. Rubiscape fetches news data from websites through the Google News API, based on the search string and time interval that you provide.
To create a Google News dataset, follow the steps given below.

  1. On the home page, click the Create icon.
    The Product Selection page is displayed.
  2. Hover over the Data Connect tile, and click Create Dataset.
    The Dataset Selection page is displayed.
    From the API option, select Google News.
    The Create a Google News Dataset page is displayed.
  3. Enter a Name for the dataset.
  4. Enter a Description for the dataset (optional).
  5. Enter the Search String for which you want to pull the Google News data. You can enter multiple strings, separated by commas.
  6. English is selected as the default language.
  7. Select the From Date and To Date. Use the Calendar icon to select the dates.
    The Google News data is fetched between these dates.
  8. Enter the API Key and click Verify.
  9. From the Features list, select the Features you want to add to your Google News Dataset.
    Refer to Table: Features of Google News Dataset. 
  10. Click Create.



    The Google News dataset is created and is ready to use in the application.
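The form inputs from the steps above (comma-separated search strings and a From Date/To Date interval) can be pictured with a minimal Python sketch. This is an illustration only, not Rubiscape's implementation: the function name `prepare_news_request`, the returned dictionary keys, and the `"en"` language code are all hypothetical.

```python
from datetime import date


def prepare_news_request(search_string: str, from_date: date, to_date: date) -> dict:
    """Hypothetical helper: normalise the inputs entered on the dataset form."""
    # Multiple search strings are separated by commas; trim spaces, drop blanks.
    terms = [term.strip() for term in search_string.split(",") if term.strip()]
    if not terms:
        raise ValueError("At least one search string is required")
    # News is fetched between the two dates, so the interval must be ordered.
    if from_date > to_date:
        raise ValueError("From Date must not be later than To Date")
    return {
        "terms": terms,
        "language": "en",  # assumed code for the default language, English
        "from": from_date.isoformat(),
        "to": to_date.isoformat(),
    }


request = prepare_news_request(
    "electric vehicles, solar power", date(2021, 2, 1), date(2021, 2, 15)
)
print(request["terms"])
```

Validating the interval before any request is sent is a design choice in this sketch; the application may handle invalid dates differently.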

    The table given below explains the Features of Google News Dataset.

    | Field | Description | Remark |
    |---|---|---|
    | author | The name of the entity (person or website) that published the article. | |
    | description | The text of the article. | |
    | publishedAt | The date and time when the article was published. | The time is displayed in the UTC time zone. Example: 2021-02-15T04:00:00Z, 2021-02-15T03:08:00Z, and so on. |
    | source | The search string entered by you. | |
    | title | The title of the article. | |
    | url | The URL of the article. | |
    | urlToImage | The URL of the main image displayed with the article. | |
    | news_source | The name of the portal where the article is published. | |
    | wordcount | The number of words in the Title. | |
    | monthString | The month and year extracted from the publishedAt field. | Example: Feb 2021, Oct 2020, and so on. |
    | month | The numerical form of the month extracted from the publishedAt field. | The values can be 1 to 12. |
    | year | The year of the publishedAt date. | Example: 2021, 2020, and so on. |
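The derived fields in the table (wordcount, monthString, month, and year) can be reproduced from title and publishedAt. The following is a minimal sketch under stated assumptions: the function name `derive_features` is hypothetical, words are assumed to be split on whitespace, and publishedAt is assumed to always use the UTC format shown in the table. Rubiscape's own computation may differ.

```python
from datetime import datetime, timezone

# Fixed English abbreviations, so the result does not depend on the system locale.
MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def derive_features(title: str, published_at: str) -> dict:
    """Hypothetical helper: compute the derived fields for one article."""
    # publishedAt is a UTC timestamp such as 2021-02-15T04:00:00Z.
    ts = datetime.strptime(published_at, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "wordcount": len(title.split()),  # assumed: whitespace-separated words
        "monthString": f"{MONTH_ABBR[ts.month - 1]} {ts.year}",
        "month": ts.month,  # 1 to 12
        "year": ts.year,
    }


features = derive_features("Markets rally on Monday", "2021-02-15T04:00:00Z")
print(features)
# {'wordcount': 4, 'monthString': 'Feb 2021', 'month': 2, 'year': 2021}
```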

    Notes

    • Enabling the "Disable Cache" option allows you to create a dataset without generating a dataset cache.
    • When you select "Disable Cache", the dashboard does not offer the "Enable Direct Query" option. For more information, refer to the "Enable Direct Query" document.
