01_corpus:02_preprocessing:07_normalization
Revision: 2022/06/27 09:21 (current, external edit)
====== 1.2.7 Normalization ======
Normalization is the task of "

For our corpus, we manually normalized part of the Swiss German dialect data, resulting in the corpus WUS_DIALOG_GSW (5 chats, 34,683 tokens).
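
A manually normalized corpus of this kind pairs each original token with a normalized form. As a minimal sketch of how such pairs could be reused, the Python snippet below builds a lookup lexicon from (original, normalized) pairs and applies it token by token. It assumes normalization here means mapping dialect spellings to a standardized form. The function names `build_lexicon` and `normalize` are hypothetical, and the toy pairs are invented for illustration; they are not taken from WUS_DIALOG_GSW.

```python
from collections import Counter, defaultdict


def build_lexicon(pairs):
    """Map each original token (lowercased) to its most frequent normalized form."""
    counts = defaultdict(Counter)
    for original, normalized in pairs:
        counts[original.lower()][normalized] += 1
    return {orig: forms.most_common(1)[0][0] for orig, forms in counts.items()}


def normalize(tokens, lexicon):
    """Replace known tokens with their normalized form; leave unknown tokens unchanged."""
    return [lexicon.get(tok.lower(), tok) for tok in tokens]


# Toy pairs invented for illustration (not from the corpus).
pairs = [("isch", "ist"), ("nöd", "nicht"), ("nid", "nicht"), ("ich", "ich")]
lexicon = build_lexicon(pairs)

print(normalize(["Das", "isch", "nöd", "guet"], lexicon))
```

Tokens missing from the lexicon pass through unchanged, so a lookup like this can only cover spellings already seen in the manually normalized data.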