© 2025 The Remède Project and its contributors.

The building lifecycle

Generating a Remède database can take a while. Let's look at what happens at each step of the generation.

What is a Remède database generation

When you generate a fresh Remède database, you build an SQLite database that contains every word in the dictionary along with its metadata. Each word's data is stored as JSON, following the Remède document format.
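As a rough illustration, reading a word back from such a database could look like the sketch below. The table name `dictionary`, its `word` and `document` columns, and the fields of the sample document are assumptions made for this example only; the real layout is described in Database schema and Document schema.

```python
import json
import sqlite3

# Hypothetical layout: one row per word, with the Remède document stored as JSON text.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE dictionary (word TEXT PRIMARY KEY, document TEXT)")

# Made-up document content, only here to have something to store and read back.
sample_document = {"definitions": ["Ce qui sert à guérir."]}
connection.execute(
    "INSERT INTO dictionary (word, document) VALUES (?, ?)",
    ("remède", json.dumps(sample_document, ensure_ascii=False)),
)


def get_document(conn, word):
    """Return the decoded JSON document for a word, or None if the word is absent."""
    row = conn.execute(
        "SELECT document FROM dictionary WHERE word = ?", (word,)
    ).fetchone()
    return json.loads(row[0]) if row else None


print(get_document(connection, "remède"))
```

Storing the document as JSON text keeps the SQLite table small and flat, while the document itself can hold nested data.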

Generate the database step by step

Learn how to generate a Remède database yourself.

Generation used to require running many Python scripts. Now only two steps are needed to generate a database.

  1. pre_generate_ressources.py generates several useful resources (mots.txt and ipa.json, from IPA.txt); see Dataset

  2. generate.py generates the SQLite database, which contains the Remède documents for every letter of the alphabet (see generate.py)

All the scripts are stored in the scripts folder and must be executed from the project root.
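The two steps above could be chained from a small helper such as the one below. This is only a sketch: `build_commands` and `run_generation` are hypothetical names, and it assumes both scripts can be run without any required arguments, which this page does not confirm.

```python
import subprocess
import sys
from pathlib import Path

# The two scripts named above, in the order they must run.
SCRIPTS = ["pre_generate_ressources.py", "generate.py"]


def build_commands():
    """Build one command per script, with paths relative to the project root."""
    return [[sys.executable, str(Path("scripts") / name)] for name in SCRIPTS]


def run_generation(project_root="."):
    """Run each script in order, using the project root as the working directory."""
    for command in build_commands():
        # check=True stops the chain if a script exits with an error.
        subprocess.run(command, cwd=project_root, check=True)
```

Calling `run_generation()` from the project root would run the resource step first, then the database step, and stop at the first failure.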

generate.py

A script to iterate words and build their Remède document.

How does it work?

  1. It iterates over 1,000,000 words (from data/mots.txt)

  2. For each word, it retrieves the definition using api-definition, plus additional information from external services

  3. It generates the word's Remède document

  4. It inserts into the SQLite database: the word, its sanitized form, its phoneme, its JSON document, and further metadata required for advanced features

    • Metadata such as whether the last phoneme is feminine, the number of syllables, or whether the word can be elided is taken from the Open Lexicon database (Drime project), or computed by us with less precision
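The steps above can be sketched as a short loop. This is not the actual generate.py: the function names, the table layout, the accent-stripping rule used as "sanitized form", and the canned definition lookup are all assumptions made to keep the example self-contained. The real script reads data/mots.txt and calls api-definition and external services.

```python
import json
import sqlite3
import unicodedata


def sanitize(word):
    """Hypothetical sanitizer: lowercase the word and strip its accents."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def fetch_definitions(word):
    """Stand-in for the api-definition lookup; returns canned data."""
    return [f"(definition of {word})"]


def build_document(word, phoneme):
    """Assemble a simplified document; the real format is the Remède document schema."""
    return {"word": word, "phoneme": phoneme, "definitions": fetch_definitions(word)}


def generate(words, ipa, conn):
    """Build one document per word and insert it with its searchable columns."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dictionary "
        "(word TEXT PRIMARY KEY, sanitized_word TEXT, phoneme TEXT, document TEXT)"
    )
    for word in words:
        phoneme = ipa.get(word, "")  # empty when no phoneme is known for the word
        document = build_document(word, phoneme)
        conn.execute(
            "INSERT OR REPLACE INTO dictionary VALUES (?, ?, ?, ?)",
            (word, sanitize(word), phoneme, json.dumps(document, ensure_ascii=False)),
        )
    conn.commit()


connection = sqlite3.connect(":memory:")
generate(["remède", "mot"], {"remède": "ʁəmɛd"}, connection)
print(connection.execute("SELECT word, sanitized_word, phoneme FROM dictionary").fetchall())
```

Keeping the sanitized form and the phoneme in their own columns, next to the JSON document, lets lookups and rhyme-style queries run in SQL without decoding every document.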

A lifecycle schema of parse.py


Last updated 3 months ago
