01_corpus:02_preprocessing:03_emojis
Differences
This shows you the differences between two versions of the page.
01_corpus:02_preprocessing:03_emojis [2020/04/16 13:38] – simone | 01_corpus:02_preprocessing:03_emojis [2022/06/27 09:21] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 2: | Line 2: | ||
Emojis are Unicode characters. The application WhatsApp uses special fonts so that emojis look the same on all operating systems. In our corpus browsers, emojis can be displayed, but they are rendered in the font used on the user's system, so there is no guarantee that an emoji in the original text looked the way it does on your screen. | Emojis are Unicode characters. The application WhatsApp uses special fonts so that emojis look the same on all operating systems. In our corpus browsers, emojis can be displayed, but they are rendered in the font used on the user's system, so there is no guarantee that an emoji in the original text looked the way it does on your screen. | ||
- | Querying emojis is not an easy task. We decided to encode them in the messages, e.g. as | + | Querying emojis is not an easy task. We decided to encode them in the messages, e.g. as |
* '' | * '' | ||
* '' | * '' |
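
The idea of encoding emojis as searchable text inside the messages can be sketched roughly as follows. This is only an illustration and not the corpus's actual scheme: the `<emoji:...>` token format and the function names `is_emoji_like` and `encode_emojis` are hypothetical, and the emoji test is a crude heuristic based on the Unicode general category "So" (Symbol, other), which also matches some non-emoji symbols.

```python
import unicodedata


def is_emoji_like(ch: str) -> bool:
    # Assumption: treat every "Symbol, other" (So) character as an emoji.
    # This is a rough heuristic, not a complete emoji test.
    return unicodedata.category(ch) == "So"


def encode_emojis(text: str) -> str:
    # Replace each emoji-like character with a plain-text token built from
    # its Unicode name, so that it can be queried as ordinary text.
    # The "<emoji:...>" format is hypothetical.
    parts = []
    for ch in text:
        if is_emoji_like(ch):
            name = unicodedata.name(ch, "UNKNOWN")
            parts.append("<emoji:" + name.lower().replace(" ", "_") + ">")
        else:
            parts.append(ch)
    return "".join(parts)


print(encode_emojis("Hi \U0001F600"))  # Hi <emoji:grinning_face>
```

Because the token is built from the Unicode name rather than the glyph, a query does not depend on the font used to display the emoji. Skin-tone modifiers, variation selectors, and joined emoji sequences are not handled by this sketch and would pass through unchanged.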
01_corpus/02_preprocessing/03_emojis.1587037105.txt.gz · Last modified: 2022/06/27 09:21 (external edit)