Documentation

What's up, Switzerland?


01_corpus:01_subcorpora

Differences

This shows you the differences between two versions of the page.

01_corpus:01_subcorpora [2020/04/16 11:34]
simone
01_corpus:01_subcorpora [2020/05/04 13:43]
simone
Line 8: Line 8:
   * Each chat was to be assigned to only one language-sub-corpus.
   * Additionally, we differentiate between chats where we have demographic information for all participants and those where we do not. In the former case, the sub-corpus gets the extension _DEMOG.
-  * Where additional tasks were performed on individual chats (e.g. normalization or part-of-speech tagging) we created additional sub-corpora exist per language.
+  * Where additional tasks were performed on individual chats (e.g. normalization or part-of-speech tagging) we created additional sub-corpora per language.
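
The _DEMOG naming rule from the list above can be sketched in a few lines of Python. This is only an illustration, not the project's actual tooling: the function name `subcorpus_name`, the per-participant flag list, and the treatment of an empty participant list are all assumptions made for the example.

```python
from typing import Sequence

# Suffix named in the documentation for chats with full demographic coverage.
DEMOG_SUFFIX = "_DEMOG"


def subcorpus_name(base_name: str, has_demographics: Sequence[bool]) -> str:
    """Return the sub-corpus name for one chat (hypothetical helper).

    base_name        -- name of the sub-corpus the chat belongs to
    has_demographics -- one flag per participant, True if demographic
                        information is available for that participant
    """
    # The suffix applies only if *all* participants have demographic data.
    # Assumption: a chat with no known participants does not qualify.
    if has_demographics and all(has_demographics):
        return base_name + DEMOG_SUFFIX
    return base_name


if __name__ == "__main__":
    print(subcorpus_name("WUS_SMALL", [True, True]))
    print(subcorpus_name("WUS_SMALL", [True, False]))
```

A single participant without demographic information is enough to keep the chat out of the _DEMOG sub-corpus, which matches the "for all participants" wording above.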
  
  
Line 32: Line 32:
   * WUS_SMALL_DEMOG: A subgroup thereof where we have demographic information from all communication partners.
   * WUSdemographics: Only demographic data per person. This sub-corpus is much faster if you want to look up demographic data only.
-  * WUS_ARGDROP and WUS_ARGDROP_language: Sub-corporafor which argument drop has been manually annotated. For the architecture of the annotations and scientific considerations behind it see [[http://www.unige.ch/lettres/linge/syntaxe/journal/Volume11/11_Stuntebeck_2018.pdf|Stuntebeck, Franziska (2018): "Annotating Argument Drop in the Swiss WhatsApp Corpus". In: Generative Grammar in Geneva (GG@G) XI, 175-187.]]
+  * WUS_ARGDROP and WUS_ARGDROP_language: Sub-corpora for which argument drop has been manually annotated. For the architecture of the annotations and scientific considerations behind it see [[http://www.unige.ch/lettres/linge/syntaxe/journal/Volume11/11_Stuntebeck_2018.pdf|Stuntebeck, Franziska (2018): "Annotating Argument Drop in the Swiss WhatsApp Corpus". In: Generative Grammar in Geneva (GG@G) XI, 175-187.]]
  
  
01_corpus/01_subcorpora.txt · Last modified: 2020/05/11 08:56 (external edit)