Documentation

What's up, Switzerland?

Differences

This shows you the differences between two versions of the page.

01_corpus:02_preprocessing:03_emojis [2020/04/22 12:56]
simone ↷ Links adapted because of a move operation
01_corpus:02_preprocessing:03_emojis [2020/05/11 08:56] (current)
Line 2:
 Emojis are Unicode characters. The application WhatsApp uses special fonts so that emojis look the same on all operating systems. In our corpus browsers, emojis can be displayed, but they are rendered in the font used by the viewer, so there is no guarantee that an emoji in the original text looked the way it does on your screen.
  
-Querying emojis is not an easy task. We decided to encode them in the messages, e.g. as //emojiQsmilingCatFaceWithOpenMouth//. This encoding system makes it easy to find individual emojis or groups of emojis using [[02_browsing:04_queries:03_regex|Regular Expressions]], e.g.:
+Querying emojis is not an easy task. We decided to encode them in the messages, e.g. as ''emojiQsmilingCatFaceWithOpenMouth''. This encoding system makes it easy to find individual emojis or groups of emojis using [[02_browsing:04_queries:03_regex|Regular Expressions]], e.g.:
   * ''emojiQ.*'' finds all emojis
   * ''emojiQcat.*'' finds all emojis whose name starts with ''cat''
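
The two queries above can be tried out in a small Python sketch. It rests on assumptions that are not confirmed by this page: that an encoded token is the prefix ''emojiQ'' followed by the camel-cased Unicode character name (the helper ''encode_emoji'' is hypothetical and only reproduces the example token given above), and that the corpus browser matches a regular expression against a whole token. The third pattern is an illustrative extension, not one of the documented queries.

```python
import re
import unicodedata

PREFIX = "emojiQ"  # prefix taken from the example token in the text


def encode_emoji(char):
    """Hypothetical encoder: camel-case the Unicode name and add the prefix.

    This is an assumption for illustration, not the corpus's actual pipeline.
    """
    words = unicodedata.name(char).lower().replace("-", " ").split()
    return PREFIX + words[0] + "".join(w.capitalize() for w in words[1:])


def query(pattern, tokens):
    """Return the tokens matched in full by the pattern (assumed token-level matching)."""
    return [t for t in tokens if re.fullmatch(pattern, t)]


# U+1F63A SMILING CAT FACE WITH OPEN MOUTH, U+1F431 CAT FACE, U+2764 HEAVY BLACK HEART
tokens = [encode_emoji(c) for c in "\U0001F63A\U0001F431\u2764"] + ["hoi"]

all_emojis = query(r"emojiQ.*", tokens)            # every encoded emoji
cat_prefix = query(r"emojiQcat.*", tokens)         # names starting with "cat"
cat_anywhere = query(r"emojiQ.*[Cc]at.*", tokens)  # "cat" or "Cat" anywhere in the name

print(all_emojis)
print(cat_prefix)
print(cat_anywhere)
```

Under these assumptions, ''emojiQcat.*'' does not match ''emojiQsmilingCatFaceWithOpenMouth'', because ''Cat'' is capitalised and does not directly follow the prefix. A pattern such as ''emojiQ.*[Cc]at.*'' also catches those names, though it may pick up unrelated names that happen to contain the same letters.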
01_corpus/02_preprocessing/03_emojis.txt · Last modified: 2020/05/11 08:56 (external edit)