lingua v0.4.0 Release Notes

Release Date: 2019-05-07
  • This release took some time, but here it is.

    Languages

    • Added 18 new languages: Afrikaans, Albanian, Basque, Bokmal, Catalan, Greek, Icelandic, Indonesian, Irish, Malay, Norwegian, Nynorsk, Slovak, Slovene, Somali, Tagalog, Vietnamese, Welsh

    Features

    Language models are now loaded into memory lazily, on first access, rather than when an instance of LanguageDetector is created. If the rule-based engine can filter out some unlikely languages, their language models are not loaded at that point because they are not needed, which further reduces overall memory consumption.
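
    The idea of loading a model only on first access can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not lingua's actual implementation: the class LazyModelCache, its methods, and the representation of a model as a double[] are hypothetical names chosen for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: none of these names are taken from lingua itself.
final class LazyModelCache {
    // Holds only the models that have actually been requested so far.
    private final Map<String, double[]> loadedModels = new ConcurrentHashMap<>();
    // Assumed loader that reads one language's model (e.g. from disk).
    private final Function<String, double[]> loader;

    LazyModelCache(Function<String, double[]> loader) {
        // Constructing the cache loads nothing.
        this.loader = loader;
    }

    // Loads the model on first access and reuses it afterwards.
    double[] modelFor(String language) {
        return loadedModels.computeIfAbsent(language, loader);
    }

    int loadedModelCount() {
        return loadedModels.size();
    }

    public static void main(String[] args) {
        LazyModelCache cache = new LazyModelCache(language -> new double[] {0.1, 0.2});
        System.out.println(cache.loadedModelCount()); // prints 0
        cache.modelFor("ENGLISH");
        cache.modelFor("ENGLISH");
        System.out.println(cache.loadedModelCount()); // prints 1
    }
}
```

    In this sketch, a language that is ruled out before any lookup is never passed to modelFor, so its model is never loaded.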

    The fastutil library is now used to store the probability values of the language models more compactly in memory. They are kept as primitive values (double) instead of objects (Double), which reduces memory consumption by approximately 500 MB when all language models are selected.
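
    To show why primitive storage saves memory, here is a small Java sketch that uses only the standard library. It is a stand-in, not fastutil and not lingua's code: the class PrimitiveProbabilityTable and its methods are hypothetical. fastutil offers primitive-valued maps such as Object2DoubleOpenHashMap; whether lingua uses that exact class is not stated in these notes. The point is that a double[] holds 8 bytes per value, whereas a Map<String, Double> needs a separate heap object for every boxed Double.

```java
import java.util.Arrays;

// Hypothetical stand-in for a primitive-valued map; not fastutil's or lingua's code.
final class PrimitiveProbabilityTable {
    private final String[] ngrams;        // keys, sorted in ascending order
    private final double[] probabilities; // primitive values, no Double boxing

    PrimitiveProbabilityTable(String[] sortedNgrams, double[] probabilities) {
        if (sortedNgrams.length != probabilities.length) {
            throw new IllegalArgumentException("keys and values must have the same length");
        }
        this.ngrams = sortedNgrams.clone();
        this.probabilities = probabilities.clone();
    }

    // Returns the stored probability, or 0.0 when the n-gram is unknown.
    double probabilityOf(String ngram) {
        int index = Arrays.binarySearch(ngrams, ngram);
        return index >= 0 ? probabilities[index] : 0.0;
    }

    public static void main(String[] args) {
        PrimitiveProbabilityTable table = new PrimitiveProbabilityTable(
                new String[] {"an", "en", "th"},
                new double[] {0.02, 0.03, 0.05});
        System.out.println(table.probabilityOf("th")); // prints 0.05
        System.out.println(table.probabilityOf("zz")); // prints 0.0
    }
}
```

    A hash-based primitive map trades the binary search used here for constant-time lookups while still avoiding one object per value.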

    Improvements

    • The overall code quality has been improved significantly. This allows for easier unit testing, configuration and extensibility.

    Bug Fixes

    • Reported bug #3, which prevented certain character classes from being used on Android, has been fixed.

    Build system

    • Starting with this version, Gradle replaces Maven as this library's build system. This allows for more customization, for example in test report generation, and is a first step towards multiplatform support. See this project's README for the available Gradle tasks.

    Test Coverage

    • Test coverage has been extended from 24% to 55%.