From 9d4628df369b92016b7fc3bfc7fed6d06ff2ca9a Mon Sep 17 00:00:00 2001
From: "Suren A. Chilingaryan"
Date: Sun, 7 Aug 2005 21:41:32 +0000
Subject: - Russian autoengine is renamed to LibRCD

- Fix Learning with Language Autodetection switched on
- Attempt to perform rccFS with Language Autodetection switched off, if failed
  with default behaviour.
- Systematization of translation engine:
    + Rearangement of the translation modes: OFF, TO_ENGLISH, SKIP_RELATED,
      SKIP_PARRENT, FULL.
    + New class types: TRANSLATE_LOCALE, TRANSLATE_CURRENT, TRANSLATE_FROM.
- Detect "Unicode" locales for foreign languages
- "out" class is assumed to be TRANSLATE_LOCALE
- Respect RCC_CLASS_KNOWN
- Check for Latin UTF-8 prior to running any charset detection engine.
---
 ToDo | 8 ++++++++
 1 file changed, 8 insertions(+)

(limited to 'ToDo')

diff --git a/ToDo b/ToDo
index 214495f..db0515f 100644
--- a/ToDo
+++ b/ToDo
@@ -10,6 +10,14 @@
     - Revise locking subsystem
     - Libtranslate can leave translated message partly in old language. This causes problems
     because of recoding from UTF8 to Current language. (With UTF-8 encoding should be Okey).
+    - Lating languages. If in the string all characters < 0x7F then we have one of the Latin
+    languages?
+    - Statistic approach of language detection.
+    - LibRCD autolearning using db4
+	+ Charset detection
+	+ Language detection (same as charsets, but for UTF8...)
+	    * Consider word recognition based on probability
+	+ Autolearning is triggered by large enough dictionary words
 
 1.x:
-- 
cgit v1.2.3
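
Note on the "all characters < 0x7F" ToDo item above: the check it describes
could be sketched roughly as follows. This is only an illustration of the
idea stated in the ToDo entry, not code from LibRCC or LibRCD; the function
name is_plain_ascii is hypothetical, and whether such a string should then be
treated as "one of the Latin languages" remains the open question the ToDo
raises.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the LibRCC/LibRCD API).
 * Returns true when every byte in buf is below 0x7F, which is the
 * condition named in the ToDo entry. An empty buffer also passes.
 */
static bool is_plain_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 0x7F)
            return false;   /* high-bit byte (or DEL): not plain ASCII */
    }
    return true;
}
```

A caller could run such a test before handing the buffer to any charset
detection engine and skip detection when it returns true, since a pure
7-bit string has the same bytes in ASCII-compatible 8-bit charsets and in
UTF-8.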