01_corpus:01_subcorpora

Differences

This shows you the differences between two versions of the page.

01_corpus:01_subcorpora [2020/04/16 11:34] simone
01_corpus:01_subcorpora [2020/05/04 13:43] simone
Line 8:
   * Each chat was to be assigned to only one language-sub-corpus.
   * Additionally, we differentiate between chats where we have demographic information for all participants and those where we do not. In the former case, the sub-corpus gets the extension _DEMOG.
-  * Where additional tasks were performed on individual chats (e.g. normalization or part-of-speech tagging) we created additional sub-corpora exist per language.
+  * Where additional tasks were performed on individual chats (e.g. normalization or part-of-speech tagging) we created additional sub-corpora per language.
  
  
Line 32:
   * WUS_SMALL_DEMOG: A subgroup thereof where we have demographic information from all communication partners.
   * WUSdemographics: Only demographic data per person. This sub-corpus is much faster if you want to look up demographic data only.
-  * WUS_ARGDROP and WUS_ARGDROP_language: Sub-corporafor which argument drop has been manually annotated. For the architecture of the annotations and scientific considerations behind it see [[http://www.unige.ch/lettres/linge/syntaxe/journal/Volume11/11_Stuntebeck_2018.pdf|Stuntebeck, Franziska (2018): "Annotating Argument Drop in the Swiss WhatsApp Corpus". In: Generative Grammar in Geneva (GG@G) XI, 175-187.]]
+  * WUS_ARGDROP and WUS_ARGDROP_language: Sub-corpora for which argument drop has been manually annotated. For the architecture of the annotations and scientific considerations behind it see [[http://www.unige.ch/lettres/linge/syntaxe/journal/Volume11/11_Stuntebeck_2018.pdf|Stuntebeck, Franziska (2018): "Annotating Argument Drop in the Swiss WhatsApp Corpus". In: Generative Grammar in Geneva (GG@G) XI, 175-187.]]
  
  
01_corpus/01_subcorpora.txt · Last modified: 2022/06/27 09:21 by 127.0.0.1
