start

Differences

This shows you the differences between two versions of the page.

start [2020/04/14 17:37] simone start [2022/09/12 19:19] (current) Stefan Bircher
Line 2: Line 2:
  
 ===== The project ===== ===== The project =====
-The linguistic corpus presented here was collected in 2014 to constitute the data base of the research project "What's up, Switzerland?" under the lead of [[https://www.rose.uzh.ch/de/seminar/wersindwir/mitarbeitende/stark.html|Prof. Elisabeth Stark]] (University of Zurich). The project was funded by the [[http://www.snf.ch/en/Pages/default.aspx|Swiss National Fund]] (Sinergia: CRSII1_160714) with CHF 1'832'647 and ran between 2016 - 2020. [[3_project|More about the project ...]] +The data underlying the corpus was collected in 2014 to constitute the data base of the research project "What's up, Switzerland?" under the lead of [[https://www.rose.uzh.ch/de/seminar/wersindwir/mitarbeitende/stark.html|Prof. Elisabeth Stark]] (University of Zurich). The project was funded by the [[http://www.snf.ch/en/Pages/default.aspx|Swiss National Fund]] (Sinergia: CRSII1_160714) with CHF 1'832'647 and ran between 2016 - 2020. [[3_project|More about the project ...]] 
  
 +===== Using the corpus =====
 +[[https://corpora.linguistik.uzh.ch/annis/|This corpus is freely available]] for academic, non-commercial research. When using the corpus, please make sure to quote correctly.
  
 ===== The corpus ===== ===== The corpus =====
Line 10: Line 12:
   * Number of chats: 617   * Number of chats: 617
   * Number of messages (with permission to be used): 763’644   * Number of messages (with permission to be used): 763’644
-  *  Number of tokens: 5'155'476 (without redactedQ.* (cf. [[01_corpus:03_preprocessing:02_without_permission|Messages without permission]]))+  * Number of informants (who gave their permission): 944 
 +  *  Number of tokens: 5'155'476 (without redactedQ.* (cf. [[01_corpus:02_preprocessing:02_without_permission|Messages without permission]]))
   * Number of emojis: 382'116   * Number of emojis: 382'116
  
Line 18: Line 21:
   * fra: French   * fra: French
   * ita: Italian   * ita: Italian
-  * roh: Any variety of Romansh+  * roh: any variety of Romansh
   * gsw: dialectal German as used in Switzerland   * gsw: dialectal German as used in Switzerland
   * deu: non-dialectal German   * deu: non-dialectal German
   * eng: English   * eng: English
   * spa: Spanish   * spa: Spanish
-  * sla: Any Slavic language+  * sla: any Slavic language
  
 Romansh varieties: Romansh varieties:
  
   * roh-ja: Jauer Romansh   * roh-ja: Jauer Romansh
-  * roh-sr: romontsch sursilvan +  * roh-sr: Romontsch Sursilvan 
-  * roh-st: rumàntsch sutsilvan +  * roh-st: Rumàntsch Sutsilvan 
-  * roh-sm: rumantsch surmiran +  * roh-sm: Rumantsch Surmiran 
-  * roh-pt: rumauntsch puter +  * roh-pt: Rumauntsch Puter 
-  * roh-vl: rumantsch vallader +  * roh-vl: Rumantsch Vallader 
-  * roh-gr: rumantsch grischun +  * roh-gr: Rumantsch Grischun 
  
  
Line 41: Line 44:
 [[https://bop.unibe.ch/linguistik-online/article/view/3849|Ueberwasser, Simone/Stark, Elisabeth (2017). "What’s up, Switzerland? A corpus-based research project in a multilingual country". Linguistik online  84/5, 105-126]] DOI: [[https://doi.org/10.13092/lo.84.3849 |https://doi.org/10.13092/lo.84.3849 ]]. [[https://bop.unibe.ch/linguistik-online/article/view/3849|Ueberwasser, Simone/Stark, Elisabeth (2017). "What’s up, Switzerland? A corpus-based research project in a multilingual country". Linguistik online  84/5, 105-126]] DOI: [[https://doi.org/10.13092/lo.84.3849 |https://doi.org/10.13092/lo.84.3849 ]].
  
-===== Using the corpus ===== +
-This corpus is freely available for academic, non-commercial research. When using the corpus, please make sure to quote correctly.+
  
  
Line 52: Line 54:
  
 ==== This documentation ==== ==== This documentation ====
-Ueberwasser, Simone (2020): The corpus "What's up, Switzerland?". Documentation, facts and figures. www.whatsup-switzerland.ch.+Stark, Elisabeth; Ueberwasser, Simone (2020): The corpus "What's up, Switzerland?". Documentation, facts and figures. www.whatsup-switzerland.ch.
  
 ==== Creation of the corpus ==== ==== Creation of the corpus ====
-Ueberwasser, Simone; Stark, Elisabeth (2017): //What’s up, Switzerland? A corpus-based research project in a multilingual country//”. In: Linguistik online, 84/5, 105-126. https://bop.unibe.ch/linguistik-online/article/view/3849/5834+Ueberwasser, Simone; Stark, Elisabeth (2017): "What’s up, Switzerland? A corpus-based research project in a multilingual country”. In: Linguistik online, 84/5, 105-126. https://bop.unibe.ch/linguistik-online/article/view/3849/5834
  
 ==== The project ==== ==== The project ====
 Stark, Elisabeth (2016-2020). //SNSF project  "What’s up, Switzerland?"// (Sinergia: CRSII1_160714). University of Zurich. www.whatsup-switzerland.ch. Stark, Elisabeth (2016-2020). //SNSF project  "What’s up, Switzerland?"// (Sinergia: CRSII1_160714). University of Zurich. www.whatsup-switzerland.ch.
  
 +===== Raw data =====
 +If you want to use our raw data for computational linguistic projects, please contact [[estark@rom.uzh.ch|Prof. Elisabeth Stark]] to see whether your project complies with our requirements. If we make the data available, a CC BY-NC-ND license is applied.
  
  
start.1586878650.txt.gz · Last modified: 2022/06/27 09:21 (external edit)
