01_corpus:02_preprocessing:03_emojis

Previous revision: 2020/04/16 13:38 (simone). Current revision: 2022/06/27 09:21 (external edit).
Emojis are Unicode characters. WhatsApp uses special fonts so that emojis look the same on all operating systems. Our corpus browsers can display emojis, but they are rendered in the font used on the viewer's system, so there is no guarantee that an emoji in the original text looked the way it does on your screen.
  
Querying emojis is not an easy task. We decided to encode them in the messages, e.g. as ''emojiQsmilingCatFaceWithOpenMouth''. This encoding makes it easy to find individual emojis or groups of emojis using [[02_browsing:04_queries:03_regex|Regular Expressions]], e.g.:
  * ''emojiQ.*'' finds all emojis
  * ''emojiQcat.*'' finds all cats
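
The two patterns above can be tried outside the corpus browser. The following is only a minimal sketch in Python, assuming that a pattern has to match a whole token and that matching is case-sensitive; the regex dialect of the corpus browser may behave differently. The sample message and the names ''emojiQcatFace'' and ''emojiQredHeart'' are made up for illustration; only ''emojiQsmilingCatFaceWithOpenMouth'' is taken from the text above.

```python
import re

# Hypothetical sample message. Only emojiQsmilingCatFaceWithOpenMouth comes
# from the page; emojiQcatFace and emojiQredHeart are invented for this sketch.
message = (
    "Good morning emojiQsmilingCatFaceWithOpenMouth "
    "see you emojiQcatFace emojiQredHeart"
)

# Assumption: tokens are separated by whitespace.
tokens = message.split()


def find_tokens(pattern, tokens):
    """Return the tokens that the pattern matches completely (case-sensitive)."""
    compiled = re.compile(pattern)
    return [token for token in tokens if compiled.fullmatch(token)]


# Every encoded emoji starts with the prefix "emojiQ".
all_emojis = find_tokens(r"emojiQ.*", tokens)

# Emojis whose encoded name starts with lowercase "cat".
cats_by_prefix = find_tokens(r"emojiQcat.*", tokens)

# Emojis with "cat" or "Cat" anywhere in the encoded name.
cats_anywhere = find_tokens(r"emojiQ.*[Cc]at.*", tokens)

print(all_emojis)
print(cats_by_prefix)
print(cats_anywhere)
```

Under these assumptions, ''emojiQcat.*'' only matches names that begin with a lowercase ''cat'' directly after the prefix, so it would not match ''emojiQsmilingCatFaceWithOpenMouth''. A pattern such as ''emojiQ.*[Cc]at.*'' also covers names in which ''Cat'' appears later.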
01_corpus/02_preprocessing/03_emojis.1587037105.txt.gz · Last modified: 2022/06/27 09:21 (external edit)
