awesome-croatian-nlp

collection of Croatian natural language processing resources

commit fb33db9b8418a1aa8aac07f891c16879e9e5d288
parent b8f7571f596fc181653d5fe04a991eaa53f44b2b
Author: Stefan Koch <programming@stefan-koch.name>
Date:   Tue, 21 Dec 2021 22:28:56 +0100

add resources by Ljubesic, via another project

Diffstat:
M README.md | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
@@ -7,27 +7,19 @@
 
 ## Tools and/or Models
 
-### Named Entity Recognition
-
 - [Named entity recognition (ffzg)](http://nlp.ffzg.hr/resources/models/ner/)
-
-### Stemmers
-
 - [Rule-based stemmer for Croatian (ffzg)](http://nlp.ffzg.hr/resources/tools/stemmer-for-croatian/)
 - [Rule-based stemmer for Croatian (nltk-compliant)](https://eliteinformatiker.de/2015/05/15/rewriting-university-of-zagrebs-croatian-stemmer-to-a-nltk-compliant-class)
-
-### Taggers
-
+- [cstlemma](https://github.com/kuhumcst/cstlemma)
 - [Tagging model for hunpos tagger](http://nlp.ffzg.hr/resources/models/tagging/)
-
+- [classla: Tokenization, POS tagging, lemmatization, NER](https://pypi.org/project/classla/)
 
 ## Datasets
 
-### Corpora
-
 - [SETimes: Parallel English and South-East European Corpus](http://nlp.ffzg.hr/resources/corpora/setimes/)
 - [hrWaC: Croatian Web Corpus](http://nlp.ffzg.hr/resources/corpora/hrwac/)
 - [SETimes.HR+ Croatian dependency treebank](https://github.com/ffnlp/sethr)
+- [hrLex Inflectional lexicon](https://www.clarin.si/repository/xmlui/handle/11356/1232)
 
 ## Organisations
 