As explained in section 1.1, you can work either with the full corpus WUS or with selected sub-corpora. You will find the list of sub-corpora at the bottom left in ANNIS.
The list of sub-corpora is also a good starting point for finding out which fields are available for your query, and for getting examples and statistics.
Next to the name of each sub-corpus, you see the number of messages (marked as "Texts") and tokens. You can use these figures for statistics.
Please note: If you work with corpora where not all participants gave their permission to use their messages, the token figure is inaccurate, because messages without permission were replaced by placeholder texts like redactedQ12tokens55characters. These placeholder texts count as tokens, too. If you need statistics that depend on the number of tokens in a (sub-)corpus, you are advised to work with the corpora that have the extension _DEMOG.
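As an illustration of how the token figures shown next to each sub-corpus can be used for statistics, here is a minimal Python sketch that normalises a raw hit count to a frequency per one million tokens. The function name per_million and the numbers used are hypothetical and are not taken from the corpus; the sketch assumes you read the hit count and the token count off the ANNIS interface yourself.

```python
def per_million(hits: int, corpus_tokens: int) -> float:
    """Normalise a raw hit count to a frequency per one million tokens."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be a positive number")
    return hits * 1_000_000 / corpus_tokens


# Hypothetical figures for illustration only (not real corpus values):
# a query returned 250 hits in a sub-corpus listed with 500,000 tokens.
example_hits = 250
example_tokens = 500_000
print(per_million(example_hits, example_tokens))
```

A normalised figure like this is only as reliable as the token count behind it, which is why the token figure of a _DEMOG corpus is the safer denominator.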
When you click on the small i (for information) to the right of each (sub-)corpus name, you will find more information about the corpus. More specifically:
On the right-hand side of the information window, you see which annotations are available to be queried for the selected sub-corpus.
node & meta::lang_100_and_more="fra, eng" is entered into the query field. This query would search for messages in chats with more than 100 messages in French and in English. More precisely:
node will also fetch all tokens that are in such chats; if you want to distinguish between messages and tokens, you should explicitly query for one or the other:
tok & … or msg & ….
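The difference between the three node types can be made explicit with a small Python sketch that assembles the query string. The helper build_query is hypothetical, and the sketch assumes that the elided part of the tok and msg queries is the same meta constraint as in the node example; check the resulting string in the ANNIS query field before relying on it.

```python
# Node types named in the text: node (messages and tokens), tok, msg.
ALLOWED_NODE_TYPES = {"node", "tok", "msg"}


def build_query(node_type: str, meta_field: str, meta_value: str) -> str:
    """Combine a node type with one meta constraint (hypothetical helper)."""
    if node_type not in ALLOWED_NODE_TYPES:
        raise ValueError(f"unknown node type: {node_type!r}")
    # Assumption: the meta constraint is joined to the node type with " & ",
    # as in the example query from the text.
    return f'{node_type} & meta::{meta_field}="{meta_value}"'


# Restrict the example query to tokens only, or to messages only.
tokens_only = build_query("tok", "lang_100_and_more", "fra, eng")
messages_only = build_query("msg", "lang_100_and_more", "fra, eng")
print(tokens_only)
print(messages_only)
```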
By clicking on the little piece of paper next to the information i in the list of sub-corpora, you get a list of all chats in the respective sub-corpus.
From here, you can click on complete chat view to view the whole chat (without any annotations). Once in this list of messages, you can always click on an individual message ID to see that message with its annotations.
If you click on the little i at the very right of the list of chats, you see all the meta information about the respective chat.