Generating a Remède database can take a while… Let’s see what happens, step by step, during generation.

What is a Remède database generation?

When you generate a fresh Remède database, you build one JSON file per letter (for example data/REMEDE_a.json) as well as the SQLite database (data/remede.db).

Generate the database step by step

Learn how to generate the Remède database yourself.

Generation requires running several Python scripts to:

  1. generate multiple useful resources (mots.txt and ipa.json, from IPA.txt); see Dataset
  2. generate a JSON file which contains all the Remède documents for each letter of the alphabet (see
  3. generate the SQLite database from the previously generated JSON files
  4. generate the wordlist table (an index table) and push it to the SQLite database
  5. add the rhymes to the dictionary (see Rimes)
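
Steps 3 and 4 can be pictured with the minimal sketch below. It is not Remède's actual script: it assumes each per-letter JSON file maps a word to its document, and the `build_database` function, the `dictionary` and `wordlist` table names, and their columns are hypothetical choices made for illustration.

```python
import json
import sqlite3
import string
from pathlib import Path


def build_database(data_dir: Path, db_path: Path) -> int:
    """Load per-letter JSON files into SQLite and build a word index.

    Assumed layout: each REMEDE_<letter>.json maps a word to its document.
    Table and column names are illustrative, not Remède's real schema.
    """
    connection = sqlite3.connect(db_path)
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "CREATE TABLE IF NOT EXISTS dictionary (word TEXT PRIMARY KEY, document TEXT)"
        )
        connection.execute(
            "CREATE TABLE IF NOT EXISTS wordlist (word TEXT PRIMARY KEY)"
        )
        for letter in string.ascii_lowercase:
            json_path = data_dir / f"REMEDE_{letter}.json"
            if not json_path.exists():
                continue  # skip letters that have not been generated yet
            documents = json.loads(json_path.read_text(encoding="utf-8"))
            rows = (
                (word, json.dumps(doc, ensure_ascii=False))
                for word, doc in documents.items()
            )
            connection.executemany(
                "INSERT OR REPLACE INTO dictionary (word, document) VALUES (?, ?)",
                rows,
            )
        # Step 4: the index table stores only the words.
        connection.execute(
            "INSERT OR REPLACE INTO wordlist (word) SELECT word FROM dictionary"
        )
        count = connection.execute("SELECT COUNT(*) FROM wordlist").fetchone()[0]
    connection.close()
    return count
```

Calling `build_database(Path("data"), Path("data/remede.db"))` from the project root would return the number of indexed words. Storing each document as a JSON string keeps the sketch short; the real database may split documents into columns.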

All scripts are stored in the scripts folder and must be run from the project root.
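
As an illustration of running scripts from the project root, here is a small sketch of a runner. The `run_scripts` helper is hypothetical and not part of Remède, and the script names passed to it are placeholders to be replaced with the real file names found in the scripts folder.

```python
import subprocess
import sys
from pathlib import Path


def run_scripts(project_root: Path, script_names: list[str]) -> None:
    """Run each script in order, stopping at the first failure."""
    for name in script_names:
        # Relative path, resolved from the project root by the child process.
        script = Path("scripts") / name
        # cwd=project_root lets relative paths such as data/mots.txt resolve.
        # check=True raises CalledProcessError if a script exits with an error.
        subprocess.run([sys.executable, str(script)], cwd=project_root, check=True)
```

For example, `run_scripts(Path("."), ["first_step.py", "second_step.py"])` would run two placeholder scripts one after the other. Stopping at the first failure matters here because each step depends on the files produced by the previous one.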

A script that iterates over the words and builds a Remède document for each one.

How does it work?

  1. It iterates over 250,000 words (from data/mots.txt)
  2. For each word, it retrieves the definition using api-definition and additional information from external services…
  3. It generates the word's Remède document
  4. It saves the document in JSON format
flowchart TB
  words[(Word\ndatabase)] --> Loop
  Loop(Parser loop) --> def[Definition API]
  Loop --> syn[]
  Loop --> ant[]
  Loop .-> conj[]
  def --> doc[[Remède document]]
  syn --> doc
  conj .-> doc
  ant --> doc
  doc --> json[[Remède\nJSON]]
  json --> db[(Remède\nDatabase)]
  drime[(Drime database)] -- Reorganised and added to --> db
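
The parser loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: `build_documents` and its `fetch_definitions` callback are hypothetical names, the document shape is reduced to two fields, and folding accented first letters onto their base letter is an assumption about how words are grouped into files.

```python
import json
import unicodedata
from collections import defaultdict
from pathlib import Path
from typing import Callable, Iterable


def build_documents(
    words: Iterable[str],
    fetch_definitions: Callable[[str], list],
    output_dir: Path,
) -> dict[str, int]:
    """Build one document per word and save them grouped by first letter."""
    by_letter: dict[str, dict] = defaultdict(dict)
    for word in words:
        word = word.strip()
        if not word:
            continue  # ignore blank lines in the word list
        # Hypothetical document shape; the real Remède document has more fields.
        document = {"word": word, "definitions": fetch_definitions(word)}
        # Assumption: fold accents so that "école" is stored under "e".
        letter = unicodedata.normalize("NFD", word[0])[0].lower()
        by_letter[letter][word] = document
    output_dir.mkdir(parents=True, exist_ok=True)
    for letter, documents in by_letter.items():
        path = output_dir / f"REMEDE_{letter}.json"
        path.write_text(
            json.dumps(documents, ensure_ascii=False, indent=2), encoding="utf-8"
        )
    # Number of documents written per letter.
    return {letter: len(documents) for letter, documents in by_letter.items()}
```

In practice, `words` could be the open data/mots.txt file, and `fetch_definitions` would wrap the call to the definition service. Passing the lookup as a callback keeps the loop testable without network access.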

