lingua v1.0.0 Release Notes
Release Date: 2020-06-24
Languages
- Added 9 new languages, this time with a focus on Africa: Ganda, Shona, Sotho, Swahili, Tsonga, Tswana, Xhosa, Yoruba, Zulu
- Removed the language Norwegian in favor of Bokmal and Nynorsk. (#59)
Features
- LanguageDetector can now provide confidence scores for each evaluated language. (#11)
- The public API for creating language model files (LanguageModelFilesWriter) and test data files (TestDataFilesWriter) has been stabilized. (#37)
- New convenience methods have been added to LanguageDetectorBuilder in order to build a LanguageDetector from languages written in a certain script. (#61)
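
As a rough illustration of the confidence scores and the script-based builder methods, here is a minimal sketch. It assumes the Lingua 1.0.0 artifact is on the classpath and that the relevant calls are computeLanguageConfidenceValues and fromAllLanguagesWithCyrillicScript, names taken from the project's README rather than from these notes. The helper functions and sample texts are made up for the example.

```kotlin
import com.github.pemistahl.lingua.api.Language
import com.github.pemistahl.lingua.api.LanguageDetector
import com.github.pemistahl.lingua.api.LanguageDetectorBuilder

// Hypothetical helper: a detector restricted to a handful of languages.
fun buildDetector(): LanguageDetector =
    LanguageDetectorBuilder
        .fromLanguages(Language.ENGLISH, Language.FRENCH, Language.GERMAN, Language.SPANISH)
        .build()

// Hypothetical helper: a detector covering every supported language written
// in Cyrillic script (assumed name of one of the new convenience methods).
fun buildCyrillicDetector(): LanguageDetector =
    LanguageDetectorBuilder
        .fromAllLanguagesWithCyrillicScript()
        .build()

fun main() {
    val detector = buildDetector()

    // Confidence scores: one entry per evaluated language.
    val confidenceValues = detector.computeLanguageConfidenceValues("languages are awesome")
    for ((language, confidence) in confidenceValues) {
        println("$language: $confidence")
    }

    // Script-based detector: only Cyrillic-script languages are considered.
    val cyrillicDetector = buildCyrillicDetector()
    println(cyrillicDetector.detectLanguageOf("Привет, как дела?"))
}
```

Restricting the detector to the languages or script you actually expect keeps the set of language models that need to be loaded small.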
Improvements
- The rule-based detection algorithm has been made less sensitive so that single words in a different language cannot mislead the algorithm so easily.
- The fastutil library has been added again to reduce memory consumption. (#58)
- The language model-based algorithm has been optimized so that language detection now performs approximately 25% faster. (#58)
- Support for the Kotlin linter ktlint has been added to help with a consistent coding style. (#47)
- Third-party dependencies have been updated to their latest versions. (#36)
Bug Fixes
- Incorrect regex character classes caused the library to not work properly on Android. (#32)
Test Coverage
- Test coverage has been extended from 59% to 72%.
Documentation
- The README contains a new section describing how users can add their own languages to Lingua.
Other changes
There is a breaking change in this release:
- Methods with the prefix fromAllBuiltIn... have been renamed to fromAll... to make them more succinct and clear. (#61)
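
For callers upgrading from an earlier version, the rename might look like the sketch below. It assumes the affected methods include fromAllBuiltInLanguages and fromAllBuiltInLanguagesWithout, as the project's README suggests, since these notes only give the prefix. The helper function names are hypothetical.

```kotlin
import com.github.pemistahl.lingua.api.Language
import com.github.pemistahl.lingua.api.LanguageDetector
import com.github.pemistahl.lingua.api.LanguageDetectorBuilder

// Before 1.0.0 (assumed old name, no longer compiles after the rename):
// val detector = LanguageDetectorBuilder.fromAllBuiltInLanguages().build()

// From 1.0.0 on: same behavior, shorter name.
fun buildDetectorForAllLanguages(): LanguageDetector =
    LanguageDetectorBuilder.fromAllLanguages().build()

// The exclusion variant is assumed to follow the same renaming pattern.
fun buildDetectorWithout(vararg excluded: Language): LanguageDetector =
    LanguageDetectorBuilder.fromAllLanguagesWithout(*excluded).build()

fun main() {
    val detector = buildDetectorWithout(Language.LATIN)
    println(detector.detectLanguageOf("languages are awesome"))
}
```

Because only the method names changed, a project-wide search for fromAllBuiltIn should be enough to locate the call sites that need updating.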