Top 2020 AI tools for automated journalism

Connexun | news api
9 min read · Jun 9, 2020

A brief overview of some of the most disruptive AI technologies enhancing automation within the field of journalism

It is clear by now that AI is saving the publishing industry considerable time and money by helping journalists monitor and keep up with the ever-increasing volume of news published online. While manual fact-finding and fact-checking tasks are likely to become less central as new AI tools and technologies emerge, it is unlikely that present AI developments will spell the downfall of the journalist or writer. Most jobs will in fact be “augmented” (rather than replaced) with additional capabilities to gather and manage data.

Artificial intelligence is being employed in news media in many distinct ways, from speeding up research to collecting and cross-referencing data and beyond. This post explores which new journalism tasks artificial intelligence makes possible, which AI applications are augmenting the journalistic process, and which are actually replacing journalists. Finally, it looks at how publishers are using these applications to improve the quality of news media, and how they will affect the future of journalism.

· Connexun | news api — Real-time multilingual content monitoring and trending topics discovery

Connexun’s news api enables users to source, in real time, multilingual headlines, articles and dynamic summaries extracted from thousands of trusted online news outlets around the globe. It continues to add intelligent tools that leverage Machine Learning and Natural Language Processing to analyze large datasets of aggregated content. Its crawling and classifying technologies can signal trending topics and content published by media outlets around the globe. Entity extraction can be a means of monitoring content based on a specific parameter (location, name, expression and more). Furthermore, Connexun’s ability to automatically generate summaries from aggregated articles can save journalists a great deal of time and effort in understanding sources and producing original content.

With its news & information API, Connexun sources multilingual headlines, articles and dynamic summaries from over 20,000 trusted information sites in hundreds of countries and in many languages. It takes pride in employing its own artificial intelligence engine, which gives weight in its rankings to sources dealing with international matters. Its technology allows users to browse multilingual headlines, news content and dynamic summaries based on key variables such as origin, country of interest, geolocation and more.

Connexun | News Api

· Narrative Science’s Quill & Lexio — Natural Language Generation (NLG)

Narrative Science is a data storytelling and natural language generation (NLG) company whose technologies turn data into plain-English stories (numbers into stories). Its offering includes Quill, which transforms raw data into stories and embeds them directly into users’ favorite dashboards. Quill can be used as an extension of Qlik, Tableau and Power BI. Lexio, on the other hand, is a language-based augmented analytics product that turns business data into interactive plain-English stories. Lexio can be used together with Salesforce, for example, and can provide insights for those with limited data analytics skills.

· Yahoo! Sports — Natural Language Generation (NLG)

According to Ken Fuchs, head of Yahoo! Sports, most fantasy football users spend up to 29 hours per year reading about their teams (Daniel Faggella: “Yahoo! Uses NLG to Deliver Personal Fantasy Sports Recaps and Updates”, on Emerj.com). Yahoo! used Automated Insights’ Wordsmith platform to present fantasy football data in the form of personalized reports, match previews, and match recaps. The Wordsmith platform uses natural language generation (NLG) to create personalized stories for each user. Creating a fantasy football match recap through an AI platform requires content with high levels of both variability and complexity.

· Heliograf Smart Software, The Washington Post — Natural Language Generation (NLG)

The Washington Post has been experimenting with another natural language generation (NLG) technology. Its Heliograf smart software automates news writing. It was tested during the 2016 Olympics, where it put together news stories by analyzing data about the games and then matching the data to relevant phrases in a story template, producing content that could be published across different platforms. The software can also alert journalists to any anomalies it finds in the data. Automated products such as Heliograf got their start in more data-grounded domains like sports and finance.
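The template-matching idea described above can be illustrated with a minimal Python sketch. It is not Heliograf's actual implementation, which is not public: the template wording, the field names, the sample values and the standard-deviation rule for flagging anomalies are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical story template; the placeholders are filled from structured data.
TEMPLATE = (
    "{winner} won gold in the {event} with a time of {time:.2f} seconds, "
    "ahead of {runner_up}."
)


def render_story(result):
    """Fill the story template with the fields of one structured result."""
    return TEMPLATE.format(**result)


def flag_anomalies(values, threshold=2.0):
    """Return indexes of values more than `threshold` standard deviations
    away from the mean (an assumed, deliberately simple anomaly rule)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up example data, not real results.
result = {
    "winner": "Runner A",
    "event": "100m final",
    "time": 9.81,
    "runner_up": "Runner B",
}
print(render_story(result))

# The last value is far from the others, so its index is flagged for review.
print(flag_anomalies([10.0, 10.1, 9.9, 10.0, 10.1, 9.9, 10.0, 14.0]))
```

In a newsroom setting the flagged indexes would be passed to a journalist for review rather than published automatically.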

· BBC’s Juicer — Semantic Discovery

The Juicer is a news aggregation and content extraction API. It monitors the RSS feeds of about 850 global news outlets, aggregating and extracting news articles. It takes articles from the BBC and other news sites, automatically parses them and tags them with related DBpedia entities. After assigning semantic tags to the stories, it classifies them into one of four categories: organizations, locations, people and things. If a journalist is looking for the latest stories on President Bolsonaro or articles associated with companies in the travel industry, Juicer quickly searches the web and provides a list of related content. BBC News Labs is also experimenting with adding this capability to video content by overlaying facts on different parts of an image or shot.

For more information, visit BBC News Labs: bbcnewslabs.co.uk
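The tagging and lookup steps described for Juicer can be sketched roughly as follows. This is a toy illustration, not the BBC's code: the small `GAZETTEER` dictionary stands in for DBpedia entity lookups, and the function names and sample articles are hypothetical.

```python
import re
from collections import defaultdict

# Toy gazetteer standing in for DBpedia lookups (hypothetical entries),
# mapping each entity to one of the four categories named in the text.
GAZETTEER = {
    "Bolsonaro": "people",
    "Brazil": "locations",
    "BBC": "organizations",
    "Olympics": "things",
}


def tag_article(text):
    """Return {category: [entities]} for every known entity found in the text."""
    tags = defaultdict(list)
    for entity, category in GAZETTEER.items():
        # Whole-word match so that short names do not match inside other words.
        if re.search(r"\b" + re.escape(entity) + r"\b", text):
            tags[category].append(entity)
    return dict(tags)


def search(articles, entity):
    """Return the titles of the articles tagged with the given entity."""
    return [
        title
        for title, text in articles.items()
        if any(entity in found for found in tag_article(text).values())
    ]


# Made-up sample articles.
articles = {
    "Summit report": "President Bolsonaro spoke in Brazil.",
    "Games preview": "The BBC covered the Olympics.",
}
print(tag_article(articles["Summit report"]))
print(search(articles, "Bolsonaro"))
```

A production system would resolve entities against a knowledge base and handle ambiguity; the dictionary lookup here only shows the shape of the tag-then-search workflow.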

· Reuters — Data visualization technology (with Graphiq) and news tracer

Reuters developed another relevant tool to enrich data-driven news stories. In 2016, Reuters partnered with “Graphiq” to provide news publishers with a wide range of free interactive data visualizations across a spectrum of topics including entertainment, sports and news. Publishers can access the data via Reuters Open Media Express. Once embedded on publishers’ websites, data visualizations are updated in real time.

Furthermore, Reuters (as explained by X. Liu, et al. (2017) in: “Reuters Tracer: Toward Automated News Production Using Large Scale Social Media Data”) also developed its own news tracer, automating end-to-end news production using Twitter data. It is capable of detecting, classifying and disseminating news in real time for Reuters journalists without manual intervention. Tracer is topic and domain agnostic. It does not rely on a predefined set of sources or subjects. Instead, it identifies emerging conversations from 12+ million tweets per day and selects those that are news-like. Then, it contextualizes each story by adding a summary and a topic to it. An application such as Reuters’ News Tracer can track down breaking news so that journalists are not tied down to grunt work, and it can be used in parallel with Connexun’s news api.
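The three stages mentioned above (identify conversations, keep the news-like ones, add a summary and a topic) can be pictured with a deliberately simplified sketch. It is not how Tracer works internally: grouping by hashtag, the minimum cluster size used as a stand-in for a news-likeness classifier, and picking the shortest tweet as a summary are all assumptions chosen to keep the example small.

```python
from collections import defaultdict


def cluster_by_hashtag(tweets):
    """Group tweets into 'conversations' keyed by hashtag (toy clustering)."""
    clusters = defaultdict(list)
    for tweet in tweets:
        # Collect each hashtag once per tweet, lowercased, without trailing punctuation.
        tags = {
            word.lower().rstrip(".,!?")
            for word in tweet.split()
            if word.startswith("#") and len(word) > 1
        }
        for tag in tags:
            clusters[tag].append(tweet)
    return dict(clusters)


def select_news_like(clusters, min_size=3):
    """Keep clusters with at least `min_size` tweets (assumed stand-in for a classifier)."""
    return {tag: group for tag, group in clusters.items() if len(group) >= min_size}


def contextualize(tag, group):
    """Attach a topic and a naive summary (the shortest tweet) to a cluster."""
    return {
        "topic": tag.lstrip("#"),
        "summary": min(group, key=len),
        "tweet_count": len(group),
    }


# Made-up sample tweets.
tweets = [
    "Strong tremor felt downtown #quake",
    "#quake buildings shaking",
    "Is this a #quake?",
    "Lunch was great #food",
]
stories = [
    contextualize(tag, group)
    for tag, group in select_news_like(cluster_by_hashtag(tweets)).items()
]
print(stories)
```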

· The New York Times — Editor for Semantic Discovery and Perspective API for Comment Monitoring

BBC’s Juicer technology is not the only semantic discovery tool out there. In 2015 The New York Times implemented its experimental AI project known as Editor. When writing an article, a journalist could use tags to highlight a phrase, headline, or main points of the text. Over time, the computer learns to recognize these semantic tags and learns the most salient parts of an article. By searching through data in real time and extracting information based on requested categories, such as events, people, locations and dates, Editor can make information more accessible, simplifying the research process and providing fast and accurate fact checking.

The New York Times is also using AI in a unique approach to moderate reader comments, encourage constructive discussion and eliminate harassment and abuse. The Perspective API tool developed by Jigsaw (part of Google’s parent company Alphabet) organizes readers’ comments interactively so that viewers can quickly see which ones they may find “toxic” and which may be more illuminating. It uses machine learning to score how likely a comment is to be perceived as toxic. It is a valuable instrument for making sure users read and interact with the comments they are interested in while avoiding more aggressive ones.

For more information, visit www.perspectiveapi.com.
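To give a feel for how comment scoring might be wired into a moderation view, here is a hedged Python sketch. The request and response shapes follow the publicly documented Perspective `comments:analyze` format as I understand it, so check the current documentation before relying on them. The sample response values, the `split_comments` helper and the 0.8 threshold are assumptions for illustration, and no network call is made.

```python
# Endpoint as publicly documented; an API key is required for real requests.
API_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text):
    """Build the JSON body asking for a TOXICITY score for one comment."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def toxicity_score(response):
    """Extract the summary toxicity score (0.0 to 1.0) from a response body."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def split_comments(scored, threshold=0.8):
    """Split (comment, score) pairs into shown and hidden lists.

    The threshold is a hypothetical editorial choice, not an API default.
    """
    shown = [comment for comment, score in scored if score < threshold]
    hidden = [comment for comment, score in scored if score >= threshold]
    return shown, hidden


# Illustrative response with made-up values, shaped like the documented format.
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.92, "type": "PROBABILITY"}}
    }
}
scored = [
    ("Thoughtful point, thanks.", 0.03),
    ("You are an idiot.", toxicity_score(sample_response)),
]
print(build_request("You are an idiot."))
print(split_comments(scored))
```

Keeping the threshold outside the scoring step lets each publication tune how strict its comment view is without changing how scores are fetched.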

· The Guardian — Chatbot Media Interfaces

Recent years have also seen the advent of chatbots. Chatbots were mainly employed to automate the distribution of news content to the audience (rather than to automate content generation). In 2016, The Guardian launched its chatbot via Facebook. To save users time scrolling through or searching for news stories, the chatbot allowed them to pick from the US, UK and Australian versions of Guardian News, choose a 6am, 7am or 8am delivery time and receive selected news stories every day via Facebook Messenger. Much like our Quartz example below, the interface replies to chat messages with content relevant to the user’s query.

The Guardian is one of many players using chatbot technology; you can find more examples at the following link:

https://www.theguardian.com/technology/2016/apr/13/facebook-army-chatbots-messenger-news-sports

· Quartz Digital News — Chatbot Media Interfaces

Quartz is experimenting with a media and news app that resembles “chat” and uses natural language processing to find articles about events, people, or topics that its users request. The aim, once again, is to automate content distribution. News media has moved not just from print to desktop to mobile phones, but also to other internet-connected devices for the home and car. Users are interacting with companies through chat, voice, and other new channels, and Quartz wants to find the cutting edge of how media can be consumed too. Quartz aims to develop bots and AI applications that will interface seamlessly with all media platforms.

More recently Quartz established the Quartz AI Studio to produce articles that use machine learning to assist journalists in the reporting of those articles, such as by separating the signal from the noise in terabytes of data in a fraction of the time that it would take a team of humans to comb through them. The publication plans to use its AI Studio to help others, particularly small- and mid-sized outlets that may not be able to staff a standalone team dedicated to AI-assisted reporting. The Quartz AI Studio team will publish how-to guides and release code examples that other publications can use to start incorporating the technology into their own reporting.

· Associated Press — Semantic Discovery, AI for Analytics, Automated Journalism

Finally, another relevant use case is that of the Associated Press, which first began using AI for the creation of news content in 2013 to draw data and produce sports and earnings reports. These days the AP newsroom uses NewsWhip to keep ahead of trending news stories on social media such as Twitter, Facebook, Pinterest, and LinkedIn. As well as tracking news stories, it can analyze a real-time or historical period on any timescale between 30 minutes and 3 years and provide reporters with real-time alerts or daily digests.

· Bloomberg & Forbes — Cyborg & Bertie

Many of the algorithms converting data into narrative news text in real time focus on financial news, since financial data is calculated and released frequently. It should therefore be no surprise that Bloomberg News is one of the first adopters of this automated content. Its program, Cyborg, churned out thousands of articles last year, taking financial reports and turning them into news stories as a business reporter would.

Forbes, on the other hand, is using “Bertie”, the content management system (“CMS”) that is the engine behind its new site. An artificially intelligent publishing platform, Bertie is designed specifically for Forbes’ in-house newsroom of journalists, its expert contributor network, and its BrandVoice partners. Bertie’s artificial intelligence gives storytellers a “bionic suit”, providing real-time trending topics to cover, recommending ways to make headlines more compelling and suggesting relevant imagery.

The AP estimates that AI helps to free up about 20 percent of reporters’ time spent covering financial earnings for companies and can improve accuracy. This gives reporters more time to concentrate on the content and story-telling behind an article rather than the fact-checking and research. Therefore, all in all, this could truly benefit journalism.

Thank you for your attention!

For more details, or to point out new AI technologies enhancing automation within the field of journalism, reach out to us at: aldo.visibelli@connexun.com

Connexun | news api

Connexun is the ultimate AI news engine — turning unstructured news content into multi-purpose actionable data.