README.md (611B)
# PolyglotStats

This repository collects lists of the most common words in various foreign
languages, along with the scripts used to calculate them.


## Languages

- Croatian


## Data Sources

- Wikipedia


## Open Issues

Some special characters are not filtered out correctly, and some numbers are
listed as words. This can probably be fixed with some debugging time.

Depending on what you want to achieve with your language skills, different
data sources will probably lead to vastly different lists of important words.
An encyclopedia like Wikipedia contains very different words than, for
example, song lyrics.
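
One possible way to approach the filtering problem described under Open Issues is to count only runs of letters, so that digits and punctuation never become tokens. The sketch below is an illustration under assumptions, not the repository's actual script: the names `WORD_RE` and `most_common_words` and the sample sentence are hypothetical, and it assumes the input is plain text that has already been extracted from the data source.

```python
import re
from collections import Counter

# Hypothetical pattern: a run of Unicode letters only.
# [^\W\d_] means "a word character that is neither a digit nor an underscore",
# so letters such as č, ć, š, ž and đ are kept while numbers are dropped.
WORD_RE = re.compile(r"[^\W\d_]+")


def most_common_words(text, n=10):
    """Return the n most frequent letter-only tokens in text, lowercased."""
    words = WORD_RE.findall(text.lower())
    return Counter(words).most_common(n)


# Made-up sample sentence, used for illustration only.
sample = "Zagreb je glavni grad Hrvatske. Zagreb ima 790000 stanovnika (2011)."
print(most_common_words(sample, 3))
```

With this pattern, a mixed token such as `abc123` is reduced to `abc` rather than discarded. Whether that is the desired behavior depends on the data source, so it may need adjusting.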