Automatic content generation: a starting guide to OpenAI’s GPT-3 (and how to make it work in a foreign language)

Richard BEN
Published in Webedia I/O
Aug 18, 2022

Illustration: Aditya Rathod from Unsplash

One of the biggest challenges for digital media companies such as Webedia Group is covering as many topics as possible, both to extend their reach and to bring the freshest information to their audiences. As a result, writers sometimes need help producing “simple” content (articles that require less research and analysis than usual) very quickly. The goal is not to replace humans but to give them tools that let them work faster and skip some repetitive tasks. This is where automatic content generation comes in!

Available to all developers for over a year, OpenAI’s GPT-3 (Generative Pre-trained Transformer 3) has become THE tool of choice for content generation. With 175 billion parameters, it was one of the largest language models ever trained when it was released, which makes it well suited to generating online articles.

GPT-3 takes an instruction as input (e.g., “Write a paragraph about planet Mars”) and returns the generated text as output.
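As a minimal sketch of this instruction-in, text-out flow, the snippet below builds the JSON body one might send to OpenAI's completions endpoint. The helper name `build_completion_request`, the model name `text-davinci-002` and the default values are assumptions chosen for illustration, not settings taken from our experiments, and the snippet deliberately stops short of making the HTTP call.

```python
import json


def build_completion_request(instruction, model="text-davinci-002",
                             max_tokens=256, temperature=0.7):
    """Return the body of a text-completion request as a plain dict.

    The model name and default values are illustrative assumptions.
    """
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    return {
        "model": model,                # which GPT-3 model to query (assumed)
        "prompt": instruction,         # the instruction given as input
        "max_tokens": max_tokens,      # upper bound on the generated length
        "temperature": temperature,    # higher values give more varied text
    }


request_body = build_completion_request("Write a paragraph about planet Mars")
print(json.dumps(request_body, indent=2))
```

The playground exposes the same kind of settings through its side panel, so it is a convenient place to try values before writing any code.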

The OpenAI GPT-3 playground

Try it yourself: https://beta.openai.com/playground

At Webedia, we have been experimenting with the use of GPT-3 for several use cases in French.

Although OpenAI’s official GPT-3 documentation states that the tool works best in English, we will go through the tasks we tried out in French (the same approach could be applied to other languages). Let’s see how they performed!

Basic text generation

GPT-3 handles unstructured text generation in languages other than English relatively well (regardless of the parameter settings you use). You only need to specify the language you want the output text to be in, as follows: “Rédige un texte en français sur […]” (Write a text in French about […]). The same way of requesting another language applies to the other NLG tasks with GPT-3.
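The idea can be sketched as a small prompt builder that puts the target language directly in the instruction. The French template comes from the example above; the English template, the language codes and the function name `build_generation_prompt` are assumptions added for illustration.

```python
# Instruction templates keyed by language code (the codes are an assumption).
PROMPT_TEMPLATES = {
    "fr": "Rédige un texte en français sur {topic}",
    "en": "Write a text in English about {topic}",
}


def build_generation_prompt(topic, language="fr"):
    """Return an instruction that names the output language explicitly."""
    try:
        template = PROMPT_TEMPLATES[language]
    except KeyError:
        raise ValueError(f"no template for language {language!r}") from None
    return template.format(topic=topic)


print(build_generation_prompt("la planète Mars"))
```

Supporting another language would then only mean adding one more template to the dictionary.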

Structured text generation

Let’s say that you want to generate an article with a certain outline, 3 paragraphs for instance, and you want each of them to deal with a certain topic and contain certain pieces of information.

Enumerating this information in numbered lists, one list per paragraph, seems to be the most effective way to do so.

Note that for this task you will need to specify “the first contains information from 1, the second from 2 and the third from 3” in the instruction; otherwise, the desired information will not be generated.
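A structured prompt of this kind could be assembled as in the sketch below. Only the mapping sentence (“the first contains information from 1, …”) is taken from the text above; the overall prompt layout, the function name `build_structured_prompt` and the sample facts are assumptions for illustration.

```python
ORDINALS = ["first", "second", "third", "fourth", "fifth"]


def build_structured_prompt(subject, sections):
    """Build an instruction followed by one numbered list of facts per paragraph.

    `sections` is a list of lists of strings: sections[0] holds the facts
    for the first paragraph, sections[1] those for the second, and so on.
    """
    n = len(sections)
    if not 2 <= n <= len(ORDINALS):
        raise ValueError(f"expected between 2 and {len(ORDINALS)} sections")

    # Tie each paragraph to its list: "the first contains information from 1, ..."
    clauses = [f"the {ORDINALS[0]} contains information from 1"]
    clauses += [f"the {ORDINALS[i]} from {i + 1}" for i in range(1, n)]
    mapping = ", ".join(clauses[:-1]) + " and " + clauses[-1]

    # One numbered line per paragraph, facts separated by semicolons.
    numbered = "\n".join(
        f"{i}. " + "; ".join(facts) for i, facts in enumerate(sections, start=1)
    )
    return (
        f"Write an article about {subject} in {n} paragraphs: {mapping}.\n\n"
        f"{numbered}"
    )


prompt = build_structured_prompt(
    "planet Mars",
    [
        ["fourth planet from the Sun"],
        ["two moons, Phobos and Deimos"],
        ["thin atmosphere made mostly of carbon dioxide"],
    ],
)
print(prompt)
```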

This task also works in languages other than English, here in French. To generate text in French, you simply need to say so in the prompt (otherwise, even with instructions written in French, GPT-3 may generate text in English).

Text summarization

GPT-3 works relatively well when it comes to extractive summarization (identifying the sentence that best summarizes the entire text).

To make it work in another language (e.g., French, Spanish or Italian), you simply have to specify that the text is in that language (e.g., “Extrais la phrase la plus informative de ce texte en français”).

The same goes for abstractive summarization (generating new sentences that take the whole input into account).

This works in other languages just as it does in English: append “Tl;dr” to the text to be summarized, e.g., “Text in French to be summarized” + “Tl;dr”.
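Both summarization styles come down to how the prompt is assembled, which the sketch below illustrates. The French instruction and the “Tl;dr” suffix come from the text above; the function names, the blank line between instruction and text, and the sample article are assumptions for illustration.

```python
def build_extractive_prompt(
    text,
    instruction="Extrais la phrase la plus informative de ce texte en français",
):
    """Place the instruction (which names the language) before the text."""
    return f"{instruction}\n\n{text.strip()}"


def build_abstractive_prompt(text):
    """Append "Tl;dr" after the text so the model continues with a summary."""
    return f"{text.strip()}\n\nTl;dr"


article = (
    "Mars est la quatrième planète du Système solaire. "
    "Elle possède deux lunes, Phobos et Déimos."
)
print(build_extractive_prompt(article))
print("---")
print(build_abstractive_prompt(article))
```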

Paraphrasing

As for paraphrasing a text, that’s where GPT-3 struggles. It tends either to repeat the original text verbatim or to change it only insignificantly (even for text in English), so I would not recommend using GPT-3 for this purpose.
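One way to check this behaviour on your own outputs is to measure how much of the original wording survives in the candidate paraphrase. The sketch below uses `difflib.SequenceMatcher` from Python's standard library on word sequences; this is not the method we used to reach the conclusion above, and the function names and the 0.8 threshold are arbitrary assumptions.

```python
from difflib import SequenceMatcher


def paraphrase_similarity(original, candidate):
    """Return a word-level similarity ratio between 0.0 and 1.0."""
    original_words = original.lower().split()
    candidate_words = candidate.lower().split()
    return SequenceMatcher(None, original_words, candidate_words).ratio()


def is_near_copy(original, candidate, threshold=0.8):
    """Flag a candidate whose wording barely differs from the original.

    The threshold is an arbitrary assumption; tune it on your own data.
    """
    return paraphrase_similarity(original, candidate) >= threshold


source = "Mars is the fourth planet from the Sun"
print(is_near_copy(source, "Mars is the 4th planet from the Sun"))
print(is_near_copy(source, "The red planet orbits beyond Earth"))
```

A high ratio only shows that the wording was kept; it says nothing about whether a low-ratio output preserves the meaning, which still needs a human reader.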

You should now be all set to start experimenting with these main use cases and carry out your own Natural Language Generation (NLG) projects 🎉

I would like to end by thanking Valentin Strach for supervising every project I carried out during my time at Webedia, and Mohamed Belmaaza for providing feedback on each of them! 🙇
