
© 2025 The Remède Project and its contributors.

The building lifecycle

Generating a Remède database can take a while. Let's see what happens, step by step, during the generation.


Last updated 3 months ago


What is a Remède database generation

When you generate a fresh Remède database, you build an SQLite database that contains all the dictionary's words and their metadata, stored as JSON in the Remède document format.
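To make the idea of "JSON documents stored in SQLite" concrete, here is a minimal sketch of reading one document back. The table name `dictionary`, the column names, and the document content are hypothetical placeholders chosen for illustration; the real layout is described by the database schema and the Remède document format.

```python
import json
import sqlite3


def load_document(connection, word):
    """Return the parsed JSON document stored for `word`, or None if absent."""
    row = connection.execute(
        "SELECT document FROM dictionary WHERE word = ?", (word,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# Tiny in-memory database standing in for a generated one (hypothetical schema).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE dictionary (word TEXT PRIMARY KEY, document TEXT)")
connection.execute(
    "INSERT INTO dictionary VALUES (?, ?)",
    ("remède", json.dumps({"definitions": ["placeholder"]})),
)
```

Calling `load_document(connection, "remède")` returns the stored dictionary, and an unknown word returns `None`.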

Generate the database step by step

Learn how to generate a Remède database yourself.

Generation used to require running many Python scripts. Now, only two steps are required to generate a database.

  1. pre_generate_ressources.py generates several useful resources (mots.txt and ipa.json, from IPA.txt); see Dataset.

  2. generate.py generates the SQLite database, which contains the Remède documents for each letter of the alphabet (see generate.py).

All the scripts are stored in the scripts folder and must be executed from the project root.
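The two steps above could be chained with a small helper like the one below. This is only a sketch: it assumes both scripts run without command-line arguments (this page does not say whether they accept any), and `build_commands` and `run_generation` are hypothetical names, not part of the project.

```python
import subprocess
import sys
from pathlib import Path

# Script names come from the two steps above; they live in the scripts folder.
STEPS = ["pre_generate_ressources.py", "generate.py"]


def build_commands():
    """Return one command per generation step, in execution order."""
    return [[sys.executable, str(Path("scripts") / name)] for name in STEPS]


def run_generation(project_root):
    """Run each step with the project root as working directory.

    check=True stops the chain as soon as one script fails.
    """
    for command in build_commands():
        subprocess.run(command, cwd=project_root, check=True)
```

From the project root, `run_generation(Path("."))` would run the resource pre-generation first and the database generation second.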

generate.py

A script to iterate words and build their Remède document.

How does it work?

  1. It iterates over 1 000 000 words (from data/mots.txt)

  2. For each word, it retrieves its definition using api-definition, and more information from external services...

  3. It generates its Remède document

  4. It inserts into the SQLite database: the word, its sanitized form, its phoneme, and its JSON document (plus more metadata required for powerful and advanced features)

    • Metadata such as whether the last phoneme is feminine, the number of syllables, or whether the word can be elided is taken from the Open Lexicon database (Drime project) or computed by us with less precision...
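The loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual generate.py: the table and column names are hypothetical, `fetch_definition` is a stand-in for the real definition lookup and external services, and "sanitized form" is assumed here to mean lowercased with diacritics removed.

```python
import json
import sqlite3
import unicodedata


def sanitize(word):
    """Lowercase and strip diacritics (an assumed notion of 'sanitized form')."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fetch_definition(word):
    """Hypothetical stand-in for the definition lookup and external services."""
    return {"definitions": [f"placeholder definition of {word}"]}


def build_database(words, ipa, connection):
    """Insert one row per word: word, sanitized form, phoneme, JSON document."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS dictionary "
        "(word TEXT PRIMARY KEY, sanitized_word TEXT, phoneme TEXT, document TEXT)"
    )
    for word in words:
        document = fetch_definition(word)
        connection.execute(
            "INSERT OR REPLACE INTO dictionary VALUES (?, ?, ?, ?)",
            (
                word,
                sanitize(word),
                ipa.get(word, ""),  # phoneme, empty when unknown
                json.dumps(document, ensure_ascii=False),
            ),
        )
    connection.commit()
```

For example, `build_database(["remède"], {"remède": "ʁəmɛd"}, sqlite3.connect(":memory:"))` creates the table and stores a single row. The extra metadata (syllable count, feminine ending, elision) is left out of this sketch.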

A lifecycle schema of parse.py
