Bug/py icu#60
Thank you for the pull request! 💙🩵 The Scribe-Server team will do our best to address your contribution as soon as we can. The following are some important points:
Note: Scribe uses Conventional Comments in reviews to make sure that communication is as clear as possible.
Maintainer Checklist
The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)
First PR Commit Check
- The commit messages for the remote branch should be checked to make sure the contributor's email is set up correctly so that they receive credit for their contribution
- The contributor's name and icon in remote commits should be the same as what appears in the PR
- If there's a mismatch, the contributor needs to make sure that the email they use for GitHub matches what they have for `git config user.email` in their local Scribe-Server repo (can be set with `git config --global user.email "GITHUB_EMAIL"`)
Hi @LJSigersmith, I have done my review. Please check. Also @axif0, can you look at this too?
# MARK: System Dependencies

log "🔧 Installing system dependencies for PyICU..."
sudo apt-get install -y libicu-dev pkg-config g++ python3-dev || {
Toolforge containers do not provide sudo or root privileges, so installing system packages at runtime is not supported.
@axif0 I've updated the PR to move the dependencies needed to build PyICU to the workflow script.
@LJSigersmith Actually, it appears that those changes won't work.
The build system uses
@DeleMike, can you please re-verify whether the commands actually install PyICU?
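One possible way to check this, as a minimal sketch rather than anything taken from the PR: try importing PyICU (module name `icu`) inside the environment that the script builds and report which ICU it is linked against. The helper name `check_pyicu` is hypothetical, and the `VERSION` and `ICU_VERSION` attributes are read defensively in case a given build does not expose them.

```python
import importlib


def check_pyicu() -> tuple[bool, str]:
    """Try to import PyICU (module name `icu`) and describe the outcome.

    `check_pyicu` is a hypothetical helper for illustration; it is not part
    of update_data.sh or of the workflow in this PR.
    """
    try:
        icu = importlib.import_module("icu")
    except ImportError as exc:
        # pip can report a successful install while the compiled extension
        # still fails to load, so the import itself is the real test.
        return False, f"PyICU import failed: {exc}"

    # Read version attributes with a fallback in case a build lacks them.
    pyicu_version = getattr(icu, "VERSION", "unknown")
    icu_version = getattr(icu, "ICU_VERSION", "unknown")
    return True, f"PyICU {pyicu_version} linked against ICU {icu_version}"
```

Running `ok, message = check_pyicu()` with the virtual environment's interpreter and printing `message` would show whether the import succeeds in the same environment the script uses.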


Contributor checklist
`./pre-commit` executable as well as `make lint` and have fixed all reported issues
Description
Fix PyICU import error and add emoji_keywords data type to update script
This PR fixes a runtime `ImportError` that caused the `update_data.sh` script to crash during emoji keyword generation, and adds `emoji_keywords` as a processed data type.
Files Changed
`update_data.sh`:
- `System Dependencies` section that installs `libicu-dev`, `pkg-config`, `g++`, and `python3-dev` before the virtual environment is created. Installing system deps after the venv caused a runtime import failure even when pip reported a successful install.
- `pip install --force-reinstall --no-binary :all: PyICU` after `pip install -e .` to ensure PyICU is always compiled from source against the locally installed ICU library rather than using a prebuilt wheel that may link against a different ICU version.
- `emoji_keywords` to `DATA_TYPES`, ordered after `nouns` and `verbs`, as `emoji_keywords` generation requires the `scribe_data_json_export/<language>/` directory to exist, which is created when nouns are processed first.

`.github/workflows/update_scribe_data.yml`:
- `>=3.12`.
- `emoji_keywords` row after the script runs to confirm the output is valid before packaging.

Testing
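For reference, a check like the one described for the workflow (listing the SQLite tables and a sample `emoji_keywords` row) could be sketched as follows. This is an illustration under assumptions, not the step actually used in the workflow: the helper name `summarize_table` is hypothetical, the database path is left to the caller, and the table is assumed to be named `emoji_keywords` after the data type.

```python
import sqlite3


def summarize_table(db_path: str, table: str) -> dict:
    """Return the table names, the row count, and one sample row for `table`.

    Hypothetical helper for illustration; the database path and table name
    are assumptions supplied by the caller.
    """
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        if table not in tables:
            raise ValueError(f"table {table!r} not found; found {tables}")
        # The name was matched against sqlite_master above, so it is safe
        # to interpolate it as a quoted identifier.
        row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        sample = conn.execute(f'SELECT * FROM "{table}" LIMIT 1').fetchone()
    finally:
        conn.close()
    return {"tables": tables, "row_count": row_count, "sample": sample}
```

Raising an error when the table is missing would make such a step fail before packaging, rather than printing an empty result.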
Tested locally on a Raspberry Pi 4 running Ubuntu using a fork of Scribe-Data (LJSigersmith/Scribe-Data, branch `fix/emoji-keywords-sqlite-generation`), which includes fixes to emoji keyword SQLite generation. The full script ran successfully, generating and migrating `emoji_keywords` for English, with 3,393 emoji keywords written to the database. Also validated via GitHub Actions on the fork. The CI run confirmed the SQLite output step printed correct tables and a sample row.
Related issue