© Copyright and User License Note
  • Copyright: All data, corpus, annotations, data structures, tags, translations, design, know-how, layout, user-experience design, software, and documentation in this portal are copyrighted by Birzeit University.

  • Usage: people may search and use our data for personal use only; they may not copy, share, remix, or reuse any part of the data elsewhere or for any other purpose.

  • Download a licensed copy: people may download a licensed copy of Curras (for academic or other purposes) through this link.

  • Citation: Users and researchers who draw on our data or ideas must clearly and properly acknowledge and cite this article [1].

  • Use of the data without agreeing to the terms above is illegal.


What you can request to download:
  • Curras raw text: the full corpus (~60K words, ~50 documents) as text files of raw Palestinian Arabic text.

  • Curras Annotations: the full corpus and the annotations of each word, in CSV format.

  • Word Frequencies: a table (CSV file) with every word and its frequency in the whole corpus.

  • MSA Lemmas: a table (CSV file) with every dialect word and its corresponding MSA lemma.

  • MSA Lemma Glosses: a table (CSV file) with every dialect word (i.e., token), its corresponding MSA lemma, and an English gloss describing their meaning.

  • Experimental data that we used in this article.

  • Web Service: a very fast RESTful web service to look up a term and retrieve the response in JSON format.
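
As a rough illustration of how a word-frequency table like the one listed above relates to the raw text, the following minimal Python sketch counts tokens in a few toy strings and writes them as CSV. It is not the procedure used to build the Curras files: the whitespace tokenization, the column names `word` and `frequency`, and the sample sentences are all assumptions made for this example, and the real files may be laid out differently.

```python
import csv
import io
from collections import Counter


def word_frequencies(documents):
    """Count whitespace-separated tokens across an iterable of document strings."""
    counts = Counter()
    for text in documents:
        # Assumption: plain whitespace splitting; a real pipeline would likely
        # normalize punctuation and orthography first.
        counts.update(text.split())
    return counts


def write_frequency_csv(counts, out):
    """Write one (word, frequency) row per word, most frequent first."""
    writer = csv.writer(out)
    writer.writerow(["word", "frequency"])  # hypothetical header names
    # Sort by descending frequency, then alphabetically for a stable order.
    for word, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([word, freq])


# Toy documents standing in for corpus text files (not taken from Curras).
docs = ["كيف حالك", "كيف الحال اليوم"]
counts = word_frequencies(docs)

buffer = io.StringIO()
write_frequency_csv(counts, buffer)
print(buffer.getvalue())
```

With a licensed copy of the raw text, the same functions could be applied to the contents of the downloaded text files in place of the toy `docs` list.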